FACT@IberLEF2020

Organized by pln_udelar

Current

Event Identification and Factuality Detection
March 18, 2020, midnight UTC

End

Competition Ends
July 30, 2020, midnight UTC

Welcome to FACT: Factuality Analysis and Classification Task

Welcome to FACT: Factuality Analysis and Classification Task, a task to classify events in Spanish texts according to their factuality status. The main page of the task is here. This task is part of IberLEF 2020.

The FACT shared task is organized by Grupo PLN-InCo (UdelaR - Uruguay) and GRIAL (UB-UAB-UDL, España).

News

March, 19th: The training dataset is available, see the Participate section.

Task Description

Factuality is understood, following Saurí (2008), as the category that determines the factual status of events, that is, whether or not events are presented as certain. In 2019, the first edition of the FACT task focused on determining the factuality of verbal events. The goal of the second edition is to identify noun events and to determine the factuality of all events (verbs and nouns).

In this task, facts are not verified against the real world. They are only assessed with respect to how they are presented by the source (in this case the writer), that is, the commitment of the source to the truth-value of the event. In this sense, the task can be seen as a core procedure for other tasks such as fact-checking and fake-news detection: it makes it possible, in future tasks, to compare what is narrated in the text (fact tagging) with what is happening in the world (fact-checking and fake-news detection).

Sub task 1: Factuality Determination

We establish three possible categories for factuality:

  • Facts (F): Current and past situations in the world that are presented as real.
  • Counterfacts (CF): Current and past situations that the writer presents as not having happened.
  • Undefined (U): Possibilities, future situations, predictions, hypotheses and other options. These are situations presented as uncertain, since the writer does not commit openly to their truth-value, either because they have not happened yet or because the author does not know.

 

The systems will have to automatically propose a factuality tag for each event (verbs and nouns). The events are already annotated in the texts. The structure of the tags used in the annotation is the following:

<event factuality="F">verb</event>

For example, for the following paragraph, where events are already annotated:

De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event>ha</event> <event>vuelto</event> a la normalidad, aunque <event>mantiene</event> <event>explosiones</event> moderadas, por lo que no <event>descarta</event> una nueva <event>erupción</event> .

The system's output should be:

De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event factuality="F">ha</event> <event factuality="F">vuelto</event> a la normalidad, aunque <event factuality="F">mantiene</event> <event factuality="F">explosiones</event> moderadas, por lo que no <event factuality="CF">descarta</event> una nueva <event factuality="U">erupción</event> .

The task is aimed at NLP researchers interested in advancing event detection and modeling, temporal text analysis, and information extraction in general.

The performance of this task will be measured against the evaluation corpus using these metrics:

  • Precision, Recall and F1 score for each category.
  • Macro-F1.
  • Global accuracy.

The main score for evaluating the submissions will be Macro-F1.
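The metrics listed above can be sketched in a few lines of Python. This is only an illustration, not the official scorer: the function name `evaluate` is hypothetical, and it assumes that Macro-F1 is the unweighted mean of the F1 scores of the three categories, with a score of 0 whenever a denominator is empty.

```python
from collections import Counter

LABELS = ("F", "CF", "U")


def evaluate(gold, pred, labels=LABELS):
    """Return (per-label (P, R, F1), macro-F1, accuracy) for aligned label lists.

    Hypothetical helper: assumes gold and pred list one label per event,
    in the same order.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was expected but missed

    per_label = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        per_label[label] = (precision, recall, f1)

    macro_f1 = sum(f1 for _, _, f1 in per_label.values()) / len(labels)
    accuracy = sum(tp.values()) / len(gold) if gold else 0.0
    return per_label, macro_f1, accuracy


if __name__ == "__main__":
    gold = ["F", "F", "F", "F", "CF", "U"]
    pred = ["F", "F", "F", "U", "CF", "U"]
    scores, macro_f1, accuracy = evaluate(gold, pred)
    print(scores, macro_f1, accuracy)
```

Because Macro-F1 weights every category equally, errors on the rarer categories affect the main score as much as errors on the most frequent one.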

Sub task 2: Event Identification

Recognizing noun events poses two challenges (Saurí et al., 2005; Wonsever et al., 2012): identifying the nouns that convey eventive information, such as war or construction, and disambiguating nouns that are eventive in some contexts (conversaremos durante la cena, "we will talk during dinner") but not in others (la cena está servida, "dinner is served").

The participants will receive text with no annotations:

De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego ha vuelto a la normalidad, aunque mantiene explosiones moderadas, por lo que no descarta una nueva erupción.

and have to identify verbal and noun events:

De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event>ha</event> <event>vuelto</event> a la normalidad, aunque <event>mantiene</event> <event>explosiones</event> moderadas, por lo que no <event>descarta</event> una nueva <event>erupción</event> .

The performance of this task will be measured against the evaluation corpus using these metrics:

  • Precision, Recall and F1 score.
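As a rough sketch of how these scores could be computed, the snippet below compares the set of token identifiers marked as events by a system against the gold set. It assumes exact matching on token identifiers (the identifiers used in the Task 2 submission format described further down this page). The function name `event_prf` is hypothetical, and the official scorer may differ in its details.

```python
def event_prf(gold_ids, pred_ids):
    """Return (precision, recall, F1) for two collections of event token ids."""
    gold, pred = set(gold_ids), set(pred_ids)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = [23, 24, 30, 31, 38, 41]
    pred = [23, 24, 30, 38, 40]  # misses 31 and 41, wrongly adds 40
    print(event_prf(gold, pred))
```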

 

Corpus

The corpus contains Spanish texts with approximately 6,300 events classified as F (Fact), CF (Counterfact) or U (Undefined). The texts belong to the journalistic register, and most of them come from the political sections of Spanish and Uruguayan newspapers.

Important Dates

  • March 11th, 2020: team registration.
  • March 18th, 2020: release of training data.
  • May 20th, 2020: release of test data.
  • May 27th, 2020: publication of results.
  • June 5th, 2020: working notes paper submission.
  • June 12th, 2020: notification of acceptance.
  • June 19th, 2020: camera ready paper submission.
  • September, 2020: IberLEF 2020 Workshop.

Contact

Please join the Google Group factiberlef2020. We will be sharing news and important information about the task in that group.

Task 1

For the factuality classification task, the test corpus will include unique identifiers for each event, like this:

De acuerdo con el Instituto Nacional de Sismología, Vulcanología, Meteorología e Hidrología (Insivumeh), el volcán de Fuego <event id="1">ha</event> <event id="2">vuelto</event> a la normalidad, aunque <event id="3">mantiene</event> <event id="4">explosiones</event> moderadas, por lo que no <event id="5">descarta</event> una nueva <event id="6">erupción</event>.
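A minimal sketch of how the identified events could be read from such a text, assuming every event is written exactly as `<event id="N">word</event>` with a numeric identifier and no nested tags. The name `extract_events` is hypothetical, and the released test files may need a more robust parser.

```python
import re

# Matches <event id="N">word</event>, tolerating spaces around the word.
EVENT_RE = re.compile(r'<event id="(\d+)">\s*(.*?)\s*</event>')


def extract_events(text):
    """Return a list of (event_id, surface_form) pairs in document order."""
    return [(int(event_id), form) for event_id, form in EVENT_RE.findall(text)]


if __name__ == "__main__":
    sample = (
        'el volcán de Fuego <event id="1">ha</event> '
        '<event id="2">vuelto</event> a la normalidad'
    )
    print(extract_events(sample))
```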

The results must be uploaded in a CSV file named task1.csv with two columns: id and factuality. For example:

id,factuality
1,F
2,F
3,F
4,F
5,CF
6,U

Task 2

For the event detection task, the test corpus will include unique identifiers for each token, like this:

De/1 acuerdo/2 con/3 el/4 Instituto/5 Nacional/6 de/7 Sismología/8 ,/9 Vulcanología/10 ,/11 Meteorología/12 e/13 Hidrología/14 (/15 Insivumeh/16 )/17 ,/18 el/19 volcán/20 de/21 Fuego/22 ha/23 vuelto/24 a/25 la/26 normalidad/27 ,/28 aunque/29 mantiene/30 explosiones/31 moderadas/32 ,/33 por/34 lo/35 que/36 no/37 descarta/38 una/39 nueva/40 erupción/41 ./42
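The token format above can be read with a short helper. The sketch below assumes tokens are separated by whitespace and that the identifier follows the last slash of each item. The name `parse_tokens` is hypothetical.

```python
def parse_tokens(line):
    """Return a list of (token_id, form) pairs from a 'form/id form/id ...' line."""
    tokens = []
    for item in line.split():
        # Split on the last slash so that a form containing '/' stays intact.
        form, _, token_id = item.rpartition("/")
        tokens.append((int(token_id), form))
    return tokens


if __name__ == "__main__":
    print(parse_tokens("ha/23 vuelto/24 a/25 la/26 normalidad/27 ,/28"))
```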

The results must be uploaded in a CSV file named task2.csv with one column (id). For example:

id
23
24
30
31
38
41

Submissions

Participants must upload their submissions as a zip file containing one CSV file for each of the tasks. There must be at least one CSV file in the submission, with the format described above.
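A submission archive of this kind could be assembled as sketched below. The function name `build_submission` is hypothetical, and placing the CSV files at the root of the zip file is an assumption. Only the file names and column headers come from the format described above.

```python
import csv
import io
import zipfile


def _to_csv(header, rows):
    """Serialize a header and rows to CSV text with '\n' line endings."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def build_submission(zip_target, task1=None, task2=None):
    """Write task1.csv and/or task2.csv into a zip archive.

    zip_target: a file path or a writable binary file-like object.
    task1: iterable of (event_id, factuality) pairs, or None to skip Task 1.
    task2: iterable of event token ids, or None to skip Task 2.
    """
    if task1 is None and task2 is None:
        raise ValueError("a submission needs at least one CSV file")
    with zipfile.ZipFile(zip_target, "w", zipfile.ZIP_DEFLATED) as archive:
        if task1 is not None:
            archive.writestr("task1.csv", _to_csv(["id", "factuality"], task1))
        if task2 is not None:
            archive.writestr("task2.csv", _to_csv(["id"], ([i] for i in task2)))


if __name__ == "__main__":
    build_submission(
        "submission.zip",
        task1=[(1, "F"), (2, "F"), (3, "F"), (4, "F"), (5, "CF"), (6, "U")],
        task2=[23, 24, 30, 31, 38, 41],
    )
```

Passing only one of `task1` or `task2` produces an archive with a single CSV file, which matches the requirement of at least one file per submission.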

Terms and Conditions

The data used in this competition was created by Grupo PLN-InCo (Uruguay) and GRIAL (España).

The entire corpus will be published at the end of the competition for research and teaching purposes.

If you use the corpus, please cite the overviews of the FACT shared task 2019 and 2020.
