Welcome to FACT: Factuality Analysis and Classification Task, a shared task on classifying events in Spanish texts according to their factuality status. The main page of the task is here. This task is part of IberLEF 2020.
The FACT shared task is organized by Grupo PLN-InCo (UdelaR, Uruguay) and GRIAL (UB-UAB-UDL, Spain).
June, 17th:
Final results published.
June, 3rd:
A description of the baseline for each task was published at "Learn the Details > Evaluation".
The guidelines for the working notes are described at the end of this page.
May, 25th:
Results submission deadline has been extended to June 17.
May, 20th:
March, 19th:
The training dataset is available, see the Participate section.
Factuality is understood, following Saurí (2008), as the category that determines the factual status of events, that is, whether or not events are presented as certain. In 2019, the first edition of the FACT task focused on determining the factuality of verbal events. The goal of the second edition is to identify noun events and determine the factuality of all events (verbs and nouns).
In this task, facts are not verified against the real world; they are only assessed with respect to how the source (in this case, the writer) presents them, that is, the source's commitment to the truth-value of the event. In this sense, the task could serve as a core procedure for other tasks such as fact-checking and fake-news detection, making it possible, in future tasks, to compare what is narrated in the text (fact tagging) with what is happening in the world (fact-checking and fake-news detection).
We establish three possible categories for factuality: F (Fact), CF (Counterfact), and U (Undefined).
The systems will have to automatically propose a factuality tag for each event (verbs and nouns). The events are already annotated in the texts. The structure of the tags used in the annotation is the following:
<event factuality="F">verb</event>
For example, for the following paragraph, where events are already annotated:
De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event>ha</event> <event>vuelto</event> a la normalidad, aunque <event>mantiene</event> <event>explosiones</event> moderadas, por lo que no <event>descarta</event> una nueva <event>erupción</event>.
The system's output should be:
De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event factuality="F">ha</event> <event factuality="F">vuelto</event> a la normalidad, aunque <event factuality="F">mantiene</event> <event factuality="F">explosiones</event> moderadas, por lo que no <event factuality="CF">descarta</event> una nueva <event factuality="U">erupción</event>.
The expected target audience is NLP researchers interested in providing understanding and advances in event detection and modeling, temporal text analysis, and Information Extraction in general.
The performance of this task will be measured against the evaluation corpus using Macro-F1, Macro-Precision, Macro-Recall, and Accuracy.
The main score for evaluating the submissions will be Macro-F1.
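As a rough illustration of the main score, the sketch below computes Macro-F1 as the unweighted mean of the per-class F1 values over the three factuality labels. The function name `macro_f1` is hypothetical, and the handling of empty classes (scored as 0) is an assumption; the official scorer may treat such cases differently. The value is returned in the range 0 to 1 rather than as a percentage.

```python
from collections import Counter

LABELS = ("F", "CF", "U")


def macro_f1(gold, predicted, labels=LABELS):
    """Return the unweighted mean of the per-class F1 scores (0.0 to 1.0)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count true positives, false positives and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    scores = []
    for label in labels:
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denominator = precision + recall
        # Assumption: a class with no correct predictions contributes F1 = 0.
        scores.append(2 * precision * recall / denominator if denominator else 0.0)
    return sum(scores) / len(scores)
```

Because every class weighs the same, a system that ignores the rarer CF and U labels is penalized heavily even when its accuracy is high.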
The recognition of noun events presents different challenges (Saurí et al., 2005; Wonsever et al., 2012): on the one hand, identifying the nouns that convey eventive information, such as war or construction, and, on the other hand, disambiguating those nouns that are eventive in certain contexts (conversaremos durante la cena) and not eventive in others (la cena está servida).
The participants will receive text with no annotations:
De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego ha vuelto a la normalidad, aunque mantiene explosiones moderadas, por lo que no descarta una nueva erupción.
and have to identify verbal and noun events:
De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event>ha</event> <event>vuelto</event> a la normalidad, aunque <event>mantiene</event> <event>explosiones</event> moderadas, por lo que no <event>descarta</event> una nueva <event>erupción</event> .
The performance of this task will be measured against the evaluation corpus using Precision, Recall, and F1.
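A minimal sketch of how detection quality could be scored, assuming events are compared as sets of token identifiers and that a predicted token counts as correct only when its identifier appears in the gold set. The function name `detection_scores` is hypothetical and the official evaluation script may differ.

```python
def detection_scores(gold_ids, predicted_ids):
    """Return (precision, recall, f1) for predicted event token ids."""
    gold, predicted = set(gold_ids), set(predicted_ids)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```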
The corpus contains Spanish texts with approximately 6,300 events classified as F (Fact), CF (Counterfact), or U (Undefined). The texts belong to the journalistic register, and most of them come from the political sections of Spanish and Uruguayan newspapers.
All the papers will be part of the official IberLEF Proceedings that will be published at CEUR-WS.org. The proceedings of the workshop will be named: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020)
Please send your working notes to factiberlef@fing.edu.uy by July 15, 2020.
Instructions for working notes:
@inproceedings{rosa2020overview,
  title={Overview of FACT at IberLEF 2020: Events Detection and Classification},
  author={Ros{\'a}, Aiala and Alonso, Laura and Castell{\'o}n, Irene and Chiruzzo, Luis and Curell, Hortensia and Fern{\'a}ndez, Ana and G{\'o}ngora, Santiago and Malcuori, Marisa and V{\'a}zquez, Gloria and Wonsever, Dina},
  booktitle={Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020)},
  year={2020}
}
The following are the results for Subtask 1:
Participant | Macro-F1 | Macro-Precision | Macro-Recall | Accuracy |
---|---|---|---|---|
t.romani | 60.7 | 61.2 | 60.4 | 84.8 |
guster | 59.3 | 62.1 | 57.4 | 83.1 |
accg14 | 55.0 | 55.6 | 54.5 | 79.8 |
trinidadg | 53.6 | 55.8 | 52.0 | 80.6 |
premjithb | 39.3 | 45.5 | 37.6 | 71.6 |
garain | 36.6 | 35.7 | 39.4 | 59.9 |
FACT_baseline | 24.6 | 25.4 | 25.1 | 52.4 |
The following are the results for Subtask 2:
Participant | F1 | Precision | Recall |
---|---|---|---|
trinidadg | 86.5 | 95.1 | 79.3 |
FACT_baseline | 59.7 | 60.3 | 59.1 |
Please join the Google Group factiberlef2020. We will be sharing news and important information about the task in that group.
For the factuality classification task, the test corpus will include unique identifiers for each event, like this:
De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event id="1">ha</event> <event id="2">vuelto</event> a la normalidad, aunque <event id="3">mantiene</event> <event id="4">explosiones</event> moderadas, por lo que no <event id="5">descarta</event> una nueva <event id="6">erupción</event>.
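The identifiers can be pulled out of text in this shape with a regular expression, as in the sketch below. It assumes the tags look exactly like the example (a single `id` attribute with a numeric value in double quotes); the function name `extract_events` is hypothetical, and if the distributed files are well-formed XML, a proper XML parser would be the safer choice.

```python
import re

# Matches <event id="N">text</event> as shown in the example above.
EVENT_RE = re.compile(r'<event\s+id\s*=\s*"(\d+)"\s*>(.*?)</event>', re.DOTALL)


def extract_events(text):
    """Return a list of (event_id, event_text) pairs in document order."""
    return [(int(m.group(1)), m.group(2).strip()) for m in EVENT_RE.finditer(text)]
```

For the paragraph above, this would yield pairs such as `(1, "ha")` and `(6, "erupción")`, which a classifier can then label one by one.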
The results must be uploaded in a CSV file named task1.csv with two columns: id and factuality. For example:
id,factuality
1,F
2,F
3,F
4,F
5,CF
6,U
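One way to produce a file in this format is sketched below. The function name `task1_csv` is hypothetical, and the choices to sort rows by identifier and to end lines with `\n` are assumptions not stated in the task description.

```python
import csv
import io

ALLOWED_LABELS = {"F", "CF", "U"}


def task1_csv(predictions):
    """Render a {event_id: label} mapping as CSV text with an id,factuality header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "factuality"])
    for event_id in sorted(predictions):
        label = predictions[event_id]
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unknown factuality label for event {event_id}: {label!r}")
        writer.writerow([event_id, label])
    return buffer.getvalue()
```

The returned string can then be written to task1.csv, for example with `open("task1.csv", "w", encoding="utf-8", newline="")`.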
For the event detection task, the test corpus will include unique identifiers for each token, like this:
De/1 acuerdo/2 con/3 el/4 Instituto/5 Nacional/6 de/7 Sismología/8 ,/9 Vulcanología/10 ,/11 Meteorología/12 e/13 Hidrología/14 (/15 Insivumeh/16 )/17 ,/18 el/19 volcán/20 de/21 Fuego/22 ha/23 vuelto/24 a/25 la/26 normalidad/27 ,/28 aunque/29 mantiene/30 explosiones/31 moderadas/32 ,/33 por/34 lo/35 que/36 no/37 descarta/38 una/39 nueva/40 erupción/41 ./42
The results must be uploaded in a CSV file named task2.csv with one column (id). For example:
id
23
24
30
31
38
41
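The sketch below shows one possible way to read the `token/id` format and to render the selected identifiers as the one-column file. Both function names are hypothetical. Splitting on the last slash is an assumption made so that a token that is itself a slash would still parse; sorting and de-duplicating the identifiers is also an assumption.

```python
def parse_tokens(line):
    """Split a 'token/id token/id ...' line into (token, id) pairs."""
    pairs = []
    for item in line.split():
        # Split on the last slash so tokens containing '/' are preserved.
        token, _, token_id = item.rpartition("/")
        pairs.append((token, int(token_id)))
    return pairs


def task2_csv(event_ids):
    """Render the ids of tokens predicted as events as CSV text with an id header."""
    rows = ["id"] + [str(i) for i in sorted(set(event_ids))]
    return "\n".join(rows) + "\n"
```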
Participants must upload their submissions as a zip file containing one CSV file for each of the tasks. There must be at least one CSV file in the submission, with the format described above.
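Packaging the files can be done with Python's standard `zipfile` module, as in this sketch. The function name `build_submission`, the archive name `submission.zip`, and the decision to place the CSV files at the root of the archive are assumptions; the page only states that the zip must contain at least one of the CSV files.

```python
import zipfile
from pathlib import Path

EXPECTED_NAMES = {"task1.csv", "task2.csv"}


def build_submission(csv_paths, zip_path="submission.zip"):
    """Bundle task1.csv and/or task2.csv into a zip archive and return its path."""
    paths = [Path(p) for p in csv_paths]
    if not paths:
        raise ValueError("at least one CSV file is required")
    # Validate all names before creating the archive.
    for path in paths:
        if path.name not in EXPECTED_NAMES:
            raise ValueError(f"unexpected file name: {path.name}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # Store each file at the archive root under its own name.
            archive.write(path, arcname=path.name)
    return zip_path
```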
The baselines for each task are as follows:
The data used in this competition was created by Grupo PLN-InCo (Uruguay) and GRIAL (Spain).
The entire corpus will be published at the end of the competition for research and teaching purposes.
If you use the corpus, please cite the overviews of the FACT shared tasks 2019 and 2020.
Start: March 18, 2020, midnight
June 17, 2020, 11:59 p.m.