FACT@IberLEF2019

Organized by luischir

Task: Factuality Detection
Start: March 25, 2019, midnight UTC

Competition Ends: June 10, 2019, midnight UTC

Welcome to FACT: Factuality Analysis and Classification Task

FACT (Factuality Analysis and Classification Task) is a task to classify events in Spanish texts according to their factuality status. The main page of the task is here. This task is part of IberLEF 2019.

The FACT shared task is organized by Grupo PLN-InCo (UdelaR - Uruguay), Grupo de Procesamiento del Lenguaje Natural (FaMAF, UNC, Argentina), and GRIAL (UB-UAB-UDL, España).

Introduction

In order to analyze event references in texts, it is crucial to determine whether they are presented as having taken place or as potential or unaccomplished events. This information can be used in applications such as Question Answering, Information Extraction, or Incremental Timeline Construction.

Despite its centrality for Natural Language Understanding, this task has been underresearched; the reference works are Saurí and Pustejovsky (2009) for English and Wonsever et al. (2009) for Spanish. Progress on this task has usually been held back by the lack of annotated resources, together with its inherent difficulty. PLN-InCo and GRIAL both have ongoing research projects on this topic, which are producing and will continue to produce such annotated resources. This makes the proposal of this task especially timely.

Task Description

Factuality is understood, following Saurí (2008), as the category that determines the factual status of events, that is, whether or not events are presented as certain. The goal of this task is to determine the factuality status of verbal events in Spanish texts.

In this task, facts are not verified against the real world; they are only assessed with respect to how the source (in this case the writer) presents them, that is, the source's commitment to the truth-value of the event. In this sense, the task could serve as a core procedure for other tasks such as fact-checking and fake-news detection, making it possible, in future tasks, to compare what is narrated in the text (fact tagging) with what is happening in the world (fact-checking and fake-news detection).

We establish three possible categories:

  • Facts: current and past situations in the world that are presented as real.
  • Counterfacts: current and past situations that the writer presents as not having happened.
  • Possibilities, future situations, predictions, hypotheses and other options: situations presented as uncertain, since the writer does not commit openly to the truth-value, either because they have not happened yet or because the author does not know.

Their respective tags are:

  • F: Fact
  • CF: CounterFact
  • U: Undefined

The systems will have to automatically propose a factual tag for each event in the text. The events are already annotated in the texts. The structure of the tags used in the annotation is the following:

<event factuality="F">verb</event>

For example, in a sentence such as:

El fin de semana <event factuality="">llegó</event> a Uruguay el segundo avión.

The system's output should be:

El fin de semana <event factuality="F">llegó</event> a Uruguay el segundo avión.
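
The tag structure above can be read and filled in with a few lines of Python. This is only a minimal sketch: it assumes the data files use straight double quotes around the attribute value and that events are never nested, and the names EVENT_RE, extract_events and fill_tags are hypothetical helpers, not part of any official tooling.

```python
import re

# Assumption: every event looks like <event factuality="...">verb</event>,
# with straight double quotes and no nested tags.
EVENT_RE = re.compile(r'<event factuality="([^"]*)">(.*?)</event>')


def extract_events(text):
    """Return a list of (tag, verb) pairs for the annotated events, in order."""
    return EVENT_RE.findall(text)


def fill_tags(text, tags):
    """Rewrite each event so that it carries the next tag taken from `tags`."""
    remaining = iter(tags)

    def replace(match):
        # group(2) is the verb; the old attribute value is discarded.
        return '<event factuality="{}">{}</event>'.format(next(remaining), match.group(2))

    return EVENT_RE.sub(replace, text)


sentence = 'El fin de semana <event factuality="">llegó</event> a Uruguay el segundo avión.'
events = extract_events(sentence)
tagged = fill_tags(sentence, ["F"])
```

Here `tags` would come from a classifier; the sketch only shows how predictions map back onto the annotated text.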

The expected target audience is NLP researchers interested in advancing event detection and modeling, temporal text analysis, and Information Extraction in general.

Corpus

The corpus contains Spanish texts with approximately 5,000 verbal events classified as F (Fact), CF (Counterfact), or U (Undefined). There are two subcorpora: the training corpus, with 4,000 events, and the evaluation corpus, with 1,000 events for testing. The texts belong to the journalistic register, and most of them are from the political sections of Spanish and Uruguayan newspapers. An annotation guide is provided to explain the meaning of the tags and the scope of the annotation.

Important Dates

  • March 18th, 2019: team registration page.
  • March 25th, 2019: release of training data.
  • May 20th, 2019: release of test data.
  • June 3rd, 2019: results submission page.
  • June 10th, 2019: publication of results.
  • June 17th, 2019: working notes paper submission.
  • June 24th, 2019: notification of acceptance.
  • July 1st, 2019: camera ready paper submission.
  • September 24th, 2019: IberLEF 2019 Workshop.

Contact

Please join the Google Group factiberlef2019. We will be sharing news and important information about the task in that group.

Bibliography

(Alonso et al., 2018) Alonso, L., I. Castellón, H. Curell, A. Fernández-Montraveta, S. Oliver, G. Vázquez (2018). "Proyecto TAGFACT: Del texto al conocimiento. Factualidad y grados de certeza en español", Procesamiento del Lenguaje Natural, 61, pp. 151-154. ISSN: 1135-5948.

(Saurí 2008) Saurí, Roser. 2008. A Factuality Profiler for Eventualities in Text. Ph.D. Thesis. Brandeis University.

(Saurí and Pustejovsky 2009) Saurí, Roser and James Pustejovsky. 2009. FactBank: A Corpus Annotated with Event Factuality. In: Language Resources and Evaluation.

(Wonsever et al., 2009) Wonsever, D., Malcuori, M., & Rosá Furman, A. (2009). Factividad de los eventos referidos en textos. Reportes Técnicos 09-12, Pedeciba.

(Wonsever et al., 2016) Wonsever, D., Rosá, A., & Malcuori, M. (2016). Factuality Annotation and Learning in Spanish Texts. In LREC.

Task Description

Given a text with its events already identified, assign a factuality category to each one of the events. We consider the following three possible categories for an event:

  • F (facts): current and past situations in the world that are presented as real.
  • CF (counterfacts): current and past situations that the writer presents as not having happened.
  • U (possibilities, future situations, predictions, hypotheses and other options): situations presented as uncertain, since the writer does not commit openly to the truth-value, either because they have not happened yet or because the author does not know.

Evaluation Criteria

The performance of this task will be measured against the evaluation corpus using these metrics:

  • Precision, Recall and F1 score for each category.
  • Macro-F1.
  • Global accuracy.

The main score for evaluating the submissions will be Macro-F1.
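
The metrics listed above can be sketched in plain Python as follows. This is not the official scorer: it assumes that Macro-F1 is the unweighted mean of the per-category F1 scores over F, CF and U, and that precision, recall and F1 are set to 0.0 when their denominator is zero. The function name evaluate is hypothetical.

```python
LABELS = ("F", "CF", "U")


def evaluate(gold, pred, labels=LABELS):
    """Return (per-category scores, macro-F1, accuracy) for two aligned tag lists."""
    pairs = list(zip(gold, pred))
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        # Assumption: undefined ratios (zero denominator) count as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    # Macro-F1: unweighted mean of the per-category F1 scores.
    macro_f1 = sum(s["f1"] for s in scores.values()) / len(labels)
    accuracy = sum(1 for g, p in pairs if g == p) / len(pairs)
    return scores, macro_f1, accuracy


gold = ["F", "F", "CF", "U"]
pred = ["F", "CF", "CF", "U"]
scores, macro_f1, accuracy = evaluate(gold, pred)
```

Because Macro-F1 weights the three categories equally, errors on a rare category affect the main score as much as errors on a frequent one.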

Submissions

The test data will contain a unique identifier for each event.

The upload format is a .zip file containing a text file. The text file must have one line per event, indicating the event id and the category in the following format:

id1 F
id2 CF
id3 F
id4 U
...
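
A submission in this format could be produced along these lines. This is a sketch under stated assumptions: the file names results.txt and submission.zip are hypothetical (the page does not prescribe them), and the separator between id and category is assumed to be a single space, as in the example above.

```python
import zipfile

VALID_TAGS = {"F", "CF", "U"}


def write_submission(predictions, txt_name="results.txt", zip_name="submission.zip"):
    """Write (event_id, tag) pairs as one line per event inside a .zip archive.

    File names are illustrative defaults, not names required by the task.
    Returns the text that was written, for inspection.
    """
    lines = []
    for event_id, tag in predictions:
        if tag not in VALID_TAGS:
            raise ValueError("unknown factuality tag: {!r}".format(tag))
        lines.append("{} {}".format(event_id, tag))
    content = "\n".join(lines) + "\n"
    with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as archive:
        # writestr stores the text file directly inside the archive.
        archive.writestr(txt_name, content)
    return content
```

For example, `write_submission([("id1", "F"), ("id2", "CF")])` would create submission.zip in the working directory containing a two-line results.txt.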

Terms and Conditions

The data used in this competition was created by Grupo PLN-InCo (Uruguay) and GRIAL (España).

The entire corpus will be published at the end of the competition for research and teaching purposes.

If you use the corpus, please cite the overview of the FACT shared task, which will be available in September 2019.
