To contact the competition organizers, please email us at either of the following addresses:
(1) email@example.com (2) firstname.lastname@example.org
Our shared task aims to provide a benchmark for modeling counterfactual semantics and reasoning in natural language, built around two basic problems.
In this task, you are asked to determine whether a given statement is counterfactual or not. Counterfactual statements describe events that did not actually happen or cannot happen, as well as the possible consequences had those events happened. More specifically, counterfactuals describe events counter to facts and hence naturally involve common sense, knowledge, and reasoning. Tackling this problem is the basis for all downstream counterfactual-related causal inference analysis in natural language. For example, counterfactual statements that need to be detected appear in domains ranging from healthcare to finance (see the sample data rows below).
The important dates have been updated as below according to the updated SemEval-2020 schedule. For the details, please refer to the official website of SemEval-2020: http://alt.qcri.org/semeval2020/
We provide separate datasets for task-1 and task-2; each includes train.csv and test.csv.
Please note that, to ensure fairness, you may only use the task-1 dataset to build models for task-1 and the task-2 dataset to build models for task-2.
Here we provide two example zip files showing the submission format. Under 'Participate -> Submit/View Results -> Practise-Subtask1' (or '...-> Practise-Subtask2'), you can also submit your own results to verify the format.
A valid submission zip file for CodaLab contains exactly one results .csv file. Please note:
* A .csv file with an incorrect file name (file names are case-sensitive) will not be accepted.
* A zip file containing the results files for both subtasks will not be accepted.
* Only .zip files are accepted; bare .csv or .rar files will not be.
* Please zip your results file (e.g. subtask1.csv) directly, without putting it into a folder and zipping the folder.
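The packaging rules above can be sketched with Python's standard library; the file names here (e.g. subtask1.csv) follow the example given in this section:

```python
import os
import zipfile

def make_submission(csv_path: str, zip_path: str) -> None:
    """Zip a single results .csv at the archive root (no enclosing folder)."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps only the file name, so the .csv sits at the zip root
        # rather than inside a directory, as the submission rules require.
        zf.write(csv_path, arcname=os.path.basename(csv_path))

# Example: make_submission("subtask1.csv", "submission.zip")
```

Using `arcname` is what prevents the "folder inside the zip" mistake: zipping a directory instead of the file itself would store `folder/subtask1.csv` and be rejected.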
For 'pred_label', '1' denotes counterfactual and '0' denotes non-counterfactual. The 'sentenceID' column must be in the same order as in 'test.csv' for subtask-1 (in the evaluation phase).
If a sentence has no consequent part (a consequent part does not always exist in a counterfactual statement), please put '-1' in both 'consequent_startid' and 'consequent_endid'. The 'sentenceID' column must be in the same order as in 'test.csv' for subtask-2 (in the evaluation phase).
"6000627","1","Had Russia possessed such warships in 2008, boasted its naval chief, Admiral Vladimir Vysotsky, it would have won its war against Georgia in 40 minutes instead of 26 hours."
3S0001,"For someone who's so emotionally complicated, who could have given up many times if he was made of straw - he hasn't.",Health,83,105,48,81
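A subtask-2 prediction file with the columns described above can be written with Python's csv module. This is only a sketch: the column names are taken from this page, but whether the official file includes a header row is an assumption here.

```python
import csv

# Column names follow the subtask-2 format described on this page.
SUBTASK2_FIELDS = ["sentenceID", "antecedent_startid", "antecedent_endid",
                   "consequent_startid", "consequent_endid"]

def write_subtask2_predictions(rows, path):
    """rows: iterable of dicts keyed by SUBTASK2_FIELDS, in test.csv order.
    Put -1 in both consequent ids when the sentence has no consequent part."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=SUBTASK2_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```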
Participants have to take part in both subtasks. The evaluation metrics that will be applied are:
The evaluation script verifies whether each predicted binary "label" matches the gold "label" annotated by human workers, and then calculates precision, recall, and F1 scores.
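The subtask-1 metric can be sketched as standard binary precision/recall/F1, taking '1' (counterfactual) as the positive class; the official script may differ in details:

```python
def precision_recall_f1(gold, pred, positive=1):
    """P/R/F1 for the positive (counterfactual) class, given parallel
    lists of gold and predicted binary labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```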
Exact Match represents the percentage of predicted antecedents and consequents that exactly match the spans annotated by human workers.
The F1 score is a token-level metric calculated from the submitted antecedent_startid, antecedent_endid, consequent_startid, and consequent_endid. Please refer to our baseline model for evaluation details.
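As the text says, the authoritative evaluation code is in the baseline model; the following is only an illustrative sketch of a token-level F1 over offset spans, with the simplifying assumption that every character index inside a span counts as a "token":

```python
def span_f1(gold_span, pred_span):
    """F1 over positions covered by two (startid, endid) inclusive spans.
    (-1, -1) marks an absent part, e.g. a missing consequent."""
    gold = (set() if gold_span == (-1, -1)
            else set(range(gold_span[0], gold_span[1] + 1)))
    pred = (set() if pred_span == (-1, -1)
            else set(range(pred_span[0], pred_span[1] + 1)))
    if not gold and not pred:
        return 1.0  # both sides agree the part is absent
    overlap = len(gold & pred)
    if overlap == 0:
        return 0.0
    p = overlap / len(pred)
    r = overlap / len(gold)
    return 2 * p * r / (p + r)
```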
By submitting results to this competition, you consent to the public release of your scores at the SemEval-2020 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgments, qualitative judgments, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.
You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgment that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.
You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers.
You agree not to redistribute the test data except in the manner prescribed by its license.
Sequence Labeling model (with eval method)
You are free to build a system from scratch using any available software packages and resources, as long as they are not against the spirit of fair competition.
Xiaodan Zhu, Queen's University
Xiaoyu Yang, Queen's University
Huasha Zhao, Alibaba Group
Qiong Zhang, Alibaba Group
Stan Matwin, Dalhousie University
We also kindly thank Jiaqi Li, Qianyu Zhang, Stephen Obadinma, Xiao Chu and Rohan for their help and effort in this project.
Start: Sept. 1, 2019, midnight
Description: Here you submit the results for subtask-1. [Phase: Practice] (Please choose a specific task and a target phase before submitting your answers!)
Start: Sept. 1, 2019, midnight
Description: Here you submit the results for subtask-2. [Phase: Practice] (Please choose a specific task and a target phase before submitting your answers!)
Start: Feb. 19, 2020, midnight
Description: Evaluation subtask-1 (only for competition)
Start: March 1, 2020, midnight
Description: Evaluation subtask-2 (only for competition)
Start: March 18, 2020, 1 a.m.
Description: Please submit your results for Subtask-1 here after Mar 18, 2020. Note that only your latest results are kept, not your best. [Post Evaluation]
Start: March 18, 2020, 1 a.m.
Description: Please submit your results for Subtask-2 here after Mar 18, 2020. Note that only your latest results are kept, not your best. [Post Evaluation]
Sept. 14, 2020, midnight