FinCausal 2020

Organized by YseopLab


FNP 2020

FinCausal 2020 - Causality Identification in Financial documents

Introduction

Causality detection is a well-known topic in the NLP and linguistics communities and has many applications in information retrieval. This shared task provides data for experimenting with causality detection, focusing on determining the causality associated with an event. An event is defined as the arising or emergence of a new object or context with regard to a previous situation. The task therefore emphasises the detection of causality associated with financial or economic analysis and resulting in a quantified output.

Data Processing

The data are extracted from a corpus of 2019 financial news provided by QWAM. The original raw corpus is a set of HTML pages corresponding to daily retrievals from financial news feeds. These articles mostly cover the 2019 financial landscape, but may also contain information related to politics, microeconomics, or other topics considered relevant to financial information. This raw set has been normalised to fit the following format: Index; Text
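For illustration, a minimal Python sketch for loading a data set in this format; the file name 'practice-task1.csv' is hypothetical, and we assume a semicolon delimiter followed by a space, as in the examples below:

    import pandas as pd

    # Hypothetical file name; the actual name depends on the phase
    # (Trial, Practice, Evaluation) and task.
    # dtype keeps indices like '0001.00010' from being parsed as floats.
    df = pd.read_csv("practice-task1.csv", sep=";", skipinitialspace=True,
                     dtype={"Index": str})
    print(df.columns.tolist())  # expected: ['Index', 'Text']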

Task 1

Task 1 is a binary classification task. The dataset consists of a sample of text sections, each labelled 1 if the text section is considered to contain a causal relation, and 0 otherwise. The dataset is by nature imbalanced, reflecting the proportion of causal sentences extracted from the original news and SEC corpus, with a provisional distribution of approximately 5% labelled 1 and 95% labelled 0.

Task 2

Task 2 is a relation extraction task. The text sections correspond to those labelled 1 in the Task 1 dataset, although, for the purpose of results evaluation, they will not be exactly the same in the blind test set. The purpose of this task is to extract, from a causal text section, the sub-string identifying the cause and the sub-string describing the effect.

Submission Details & Evaluation Criteria

We provide separate data sets for xxx-Task1 and xxx-Task2 (where xxx = Trial, Practice, Evaluation).

Please note that you may only use the xxx-Task1 data set to build models for Task 1, and the xxx-Task2 data set to build models for Task 2.

A valid submission zip file for CodaLab contains one of the following files:

  • task1.csv (zip it directly and submit it only to the "xxx-Task1" section)
  • task2.csv (zip it directly and submit it only to the "xxx-Task2" section)

Notes:

  • A .csv file with an incorrect file name will not be accepted (file names are case-sensitive).
  • A zip file containing both files will not be accepted.
  • Neither .csv, .rar, nor .7z files will be accepted; only .zip files are accepted.
  • Please zip your results file (e.g. task1.csv or task2.csv) directly, without putting it into a folder and zipping the folder.

Submission format for xxx-Task1

The expected results should be provided by the participants in a csv file with the headers: Index; Text; Prediction

  • The 'Index' values must be in the same order as in 'xxx-task1.csv' for xxx-Task1.
  • The 'Prediction' must be '1' if the text section is considered to contain a causal relation, '0' otherwise.

Submission example for xxx-Task1

Index; Text; Prediction
0001.00010; Nearly all of the victims had Latino last names.; 0
0001.00011; We have the highest child poverty rate of almost any country on Earth.; 0
...; ...; ...
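As an illustration, the following Python sketch builds a submission archive in this format; the input file name is hypothetical, and the all-zero predictions are a placeholder for real model output:

    import zipfile
    import pandas as pd

    # Load the phase data set (hypothetical file name), keeping row order.
    df = pd.read_csv("practice-task1.csv", sep=";", skipinitialspace=True,
                     dtype={"Index": str})

    # Placeholder predictions: label everything 0. Replace with model output.
    df["Prediction"] = 0

    # Write the required headers, then zip task1.csv directly,
    # without an enclosing folder.
    df.to_csv("task1.csv", sep=";", index=False)
    with zipfile.ZipFile("task1.zip", "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write("task1.csv", arcname="task1.csv")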

Submission format for xxx-Task2

The expected results should be provided by the participants in a csv file with the headers: Index; Text; Cause; Effect, where 'Cause' is the sub-string identifying the causal element and 'Effect' is the sub-string describing the effect in a causal text section.

  • The 'Index' values must be in the same order as in 'xxx-task2.csv' for xxx-Task2.
  • The 'Cause' should be a sub-string of the text section referencing the cause of an event (event or related object included).
  • The 'Effect' should be a sub-string of the text section referencing the effect of that cause.

Submission example for xxx-Task2

Index; Text; Cause; Effect; Offset_Sentence2; Offset_Sentence3
0003.00117; Transat loss more than doubles as it works to complete Air Canada deal ; it works to complete Air Canada deal; Transat loss more than doubles; ;
...; ...; ...; ...; ...; ...

Note: the last two columns (sentence offsets) are not considered by the Task 2 scoring program and can be omitted from the submission.
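Since the 'Cause' and 'Effect' columns are expected to be sub-strings of 'Text', a simple pre-submission check can catch formatting mistakes early. A minimal sketch of such a check (this is not the official scorer):

    import pandas as pd

    # Verify that each predicted span occurs verbatim in its text section.
    df = pd.read_csv("task2.csv", sep=";", skipinitialspace=True,
                     dtype=str).fillna("")
    for _, row in df.iterrows():
        for col in ("Cause", "Effect"):
            if row[col] and row[col] not in row["Text"]:
                print(f"Row {row['Index']}: {col} is not a sub-string of Text")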

Evaluation Method

Participants can take part in either Task 1 or Task 2, or both. The evaluation metrics that will be applied are:

Task1: Precision, Recall, and F1

The evaluation script will verify whether the submitted binary 'Prediction' matches the manually annotated gold 'Prediction', and then calculate precision, recall, and F1 scores.
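For a local sanity check, these metrics can be reproduced with scikit-learn; a sketch with made-up labels, not the official evaluation script:

    from sklearn.metrics import precision_recall_fscore_support

    # Made-up gold labels and predictions, for illustration only.
    gold = [1, 0, 0, 1, 0, 0]
    pred = [1, 0, 1, 1, 0, 0]

    p, r, f1, _ = precision_recall_fscore_support(
        gold, pred, average="binary", pos_label=1)
    print(f"Precision={p:.3f} Recall={r:.3f} F1={f1:.3f}")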

Task2: Exact Match, Precision, Recall, and F1

Exact Match represents the percentage of text sections for which both the predicted cause and the predicted effect exactly match the outcome annotated by human workers. The F1 score is a token-level metric, calculated over the submitted cause and effect parts. Please refer to our baseline model for evaluation details.
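As a rough illustration of these two metrics, assuming simple whitespace tokenisation (the scorer in the baseline repository is authoritative and may differ):

    def token_f1(predicted: str, gold: str) -> float:
        # Token-level F1 between a predicted span and a gold span.
        pred_tokens, gold_tokens = predicted.split(), gold.split()
        common = sum(min(pred_tokens.count(t), gold_tokens.count(t))
                     for t in set(pred_tokens))
        if common == 0:
            return 0.0
        precision = common / len(pred_tokens)
        recall = common / len(gold_tokens)
        return 2 * precision * recall / (precision + recall)

    def exact_match(pred_cause: str, pred_effect: str,
                    gold_cause: str, gold_effect: str) -> bool:
        # Exact Match requires both spans to match the annotation exactly.
        return pred_cause == gold_cause and pred_effect == gold_effect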

Terms & Conditions

By registering for this competition, you commit to writing a paper of 4 to 8 pages explaining the details of this experiment for each subscribed task. These papers will be reviewed by the Scientific Committee of FNP 2020 and, if accepted, published in the COLING 2020 proceedings.

Before registering on CodaLab, please fill in this form with a detailed email address, team member names, and professional or academic affiliation, then register on CodaLab with the same email address. Only registered teams will be granted access.

By submitting results to this competition, you consent to the public release of your scores at the FinCausal 2020 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgments, qualitative judgments, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.

You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgment that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.

You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers.

You agree not to redistribute the test data except in the manner prescribed by its license.

Baseline

Refer to our baseline models for evaluation details: https://github.com/yseop/YseopLab

These baselines are liable to frequent updates while on the develop branch, so we suggest you track changes using git; the repository will only be frozen at the end of the competition. The baselines will remain runnable at every iteration and should run on any hardware. Please get in touch if you experience any trouble.

Trial-Task1

Start: Feb. 1, 2020, midnight

Description: Fincausal 2020 Task 1 with Trial dataset

Trial-Task2

Start: March 1, 2020, midnight

Description: Fincausal 2020 Task 2 with Trial dataset

Practice-Task1

Start: March 1, 2020, midnight

Description: Fincausal 2020 Task 1 with Practice dataset

Practice-Task2

Start: March 1, 2020, midnight

Description: Fincausal 2020 Task 2 with Practice dataset

Evaluation-Task1

Start: May 1, 2020, midnight

Description: Fincausal 2020 Task 1 with Blind dataset

Evaluation-Task2

Start: May 1, 2020, midnight

Description: Fincausal 2020 Task 2 with Blind dataset

Competition Ends

June 8, 2020, midnight
