The task ProtestNews aims at extracting event information from news articles across multiple countries. We particularly focus on events that are in the scope of contentious politics and characterized by riots and social movements, i.e. the “repertoire of contention” (Giugni 1998, Tarrow 1994, Tilly 1984). Our aim is to develop text classification and information extraction tools on one country and test them on data from different countries. The text data is in English and collected from India, China, and South Africa.
We believe our task will set a baseline for evaluating the generalizability of NLP tools. Another challenge of the task is handling the nuanced protest definition used in social science studies, the differences in protest types and their expression across countries, and the target information to be extracted. The clues needed to discriminate between relevant and irrelevant information in this context may be implied without any explicit expression or hinted at by a single word in the whole article. For instance, a news article about a protest threat or an open letter written by a single person does not qualify as relevant. A protest must have happened, and an open letter must be supported by more than one person, to be in scope.
Please check the lab website regularly for updates. The Forums tab can be used to discuss any issues.
If you have not done so, please complete the individual application form (forma link) for each member of your team. The forms should be sent to Ali Hürriyetoglu (firstname.lastname@example.org) in order to access the data and be accepted into the submission system.
Participants are strongly encouraged to read each page and refer to the starting kit in the Participate tab. The README.ipynb provided inside the starting kit includes example code for reading the data and making a valid submission file in .zip format.
Ali Hürriyetoglu: email@example.com
Deniz Yüret: firstname.lastname@example.org
Erdem Yörük: email@example.com
Çağrı Yoltar: firstname.lastname@example.org
Burak Gürel: email@example.com
Fırat Duruşan: firstname.lastname@example.org
Osman Mutlu: email@example.com
Arda Akdemir: firstname.lastname@example.org
Theresa Gessler: Theresa.Gessler@EUI.eu
Peter Makarov: email@example.com
We use English online news archives from India and China as data sources to create the training and test corpora. India and China are the source and the target countries respectively in our setting.
Our datasets are annotated by multiple annotators, and disagreements are resolved by another expert. Furthermore, we used machine learning tools to detect possible misannotations, and annotators rechecked the detected cases in order to achieve gold-standard annotation quality.
The starting kit contains task12_eval.py and task3_eval.py, which are the evaluation scripts for the tasks. Example run:
python task12_eval.py [input file path] [output file path]
The annotation manuals that were used to prepare the distributed data are in the Starting Kit, which can be found under Participate > Files. The description of the data files is on the Data page of the Participate tab of the competition.
The lab aims at evaluating the generalizability of text classification and information extraction tools. Therefore, we designed the evaluation as follows. The training data is obtained from a single country, which is the source country. The evaluation consists of two steps. The first step, which we call Test 1 or the intermediate evaluation, is performed on data from the source country. The second step, which we call Test 2 or the final evaluation, is performed on data from a target country, which is China in our setting. The performance metrics for both Test 1 and Test 2 are described below.
All tasks will be evaluated using news articles from India in this phase. The aim of this phase is to give some feedback to the participants and get some rivalry going on our leaderboard!
We will release the results obtained in this phase on specific dates. This test set will also be used as part of the final test set, so we will provide only a limited number of intermediate evaluation results to make sure participants do not overfit to it.
In addition to the news articles obtained from India, we will use news articles from China for the final evaluation of the participants on all three tasks.
Our main aim in choosing this approach is to favor models which can generalize better and adapt to new domains better.
Below we give the dates related to the data release and deadlines for each phase in the competition.
The competition will end on May 11.
Cycle 1:
Data Release, India: April 12
Submission deadline: April 26
Scores: April 26
Cycle 2:
Data Release, China: April 29
Submission deadline: May 3
Scores: May 4
Cycle 3 (final evaluation):
No new data.
Submission deadline: May 10
Scores: May 11
We will evaluate the performance of the participants by the average of all 3 F1 scores obtained.
Task 1 : The task is to classify news documents as protest (1) or non-protest (0), given the raw document.
Task 2 : The task is to classify sentences as containing an event trigger (1) or not (0), given the sentence and the news article containing that sentence.
Task 3 : The task is to extract various information from a given event sentence such as location, time and participant of an event.
Both Task 1 and Task 2 are binary classification tasks.
The submission will be evaluated using the F1 score for Task 1 and Task 2.
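As a rough illustration of the metric, the sketch below computes F1 for binary labels and then averages several task scores, as the overall ranking does. It assumes F1 is computed on the positive class (label 1); whether the official task12_eval.py script uses this or another averaging variant is not stated here, so treat the function names and that choice as assumptions and rely on the script in the starting kit for the authoritative score.

```python
from typing import Sequence


def binary_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 of the positive class (label 1); assumed variant, see note above."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(task_scores: Sequence[float]) -> float:
    """Plain average of per-task F1 scores."""
    return sum(task_scores) / len(task_scores)
```

Running binary_f1 on your own held-out split before submitting gives a local estimate that does not depend on the limited number of leaderboard releases.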
We will release intermediate results several times during the first phase so that competitors can get feedback on their models.
For the intermediate evaluation of Task 1 and Task 2 we will use news articles from India. For the final evaluation we will use a mixture of news articles from China and India.
Final evaluation will be made on the test set that will be released for the final phase.
For Task 3, the F1 metric will be used. The BIO tagging scheme is used to annotate the corpus for the various information types.
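To make the BIO scheme concrete, here is a minimal sketch that turns a BIO tag sequence into labeled spans and scores predicted spans against gold spans. The tag names (such as B-loc) are hypothetical, and exact-span matching is an assumption: the page does not say whether task3_eval.py scores spans or individual tokens, so use the script from the starting kit for real scores.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (label, start index, end index exclusive)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Group BIO tags into spans. A stray I- tag is treated as a span start."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" closes an open span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((label, start, i))
            start, label = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


def span_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """F1 over exactly matching (label, start, end) spans."""
    gold = set(bio_to_spans(gold_tags))
    pred = set(bio_to_spans(pred_tags))
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Under this exact-match assumption, a prediction that gets the label right but the boundaries wrong counts as both a false positive and a false negative.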
We will provide intermediate results under the Task 3 phase for Task 3 submissions.
The final results will be given on the test set which will be provided later.
The participants of the competition are assumed to have read and agreed to the terms and conditions listed below.
ProtestNews 2019 Organizing Committee
Download the datasets using the Docker image and obtain the public data (in submission format) from Files under the Participate tab for each phase and task.
For each phase, '.solution' files with the same names as the provided '.solution' files in the Public Data must be zipped together and submitted as a single file (the zip file can have any name). Note that the names of the predict files must match the names of the data files provided for evaluation.
Submit results on test sets which will be provided.
Submission format : The submission files must have the same basename as the data provided. The files must have the ending .predict. For example, for the x_dev.data file the predictions must be given in a file named x_dev.predict. The prediction files must be zipped. The submitted files must be in the same format as the .data file provided in the Public Data. The data we provide is in a very straightforward format where each line contains the id of an instance followed by the prediction for that single instance. For Task 1, each line corresponds to the binary prediction for the label of the news document.
Example : If we have three news articles in the x_dev.data file an example .predict file submission would look as follows:
Each line corresponds to the prediction made for the news article on the corresponding line. The .data files contain the ids of the news articles. These news articles (in raw text format) are to be obtained using the Docker image.
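The following sketch shows one way to produce a .predict file whose name and line order follow a given .data file. It assumes that the id is the first whitespace-separated field on each line of the .data file and that id and prediction are separated by a tab; neither detail is confirmed on this page, and the function names are hypothetical, so check README.ipynb in the starting kit for the authoritative format.

```python
from pathlib import Path
from typing import Dict, List


def read_ids(data_path: Path) -> List[str]:
    """Read instance ids, assuming the id is the first field on each line."""
    ids = []
    with data_path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                ids.append(line.split()[0])
    return ids


def write_predictions(data_path: Path, predictions: Dict[str, int]) -> Path:
    """Write x.predict next to x.data, one 'id<TAB>label' line per instance."""
    ids = read_ids(data_path)
    missing = [i for i in ids if i not in predictions]
    if missing:
        # The scorer goes over every instance, so refuse to write a partial file.
        raise ValueError(f"{len(missing)} instances have no prediction")
    out_path = data_path.with_suffix(".predict")
    with out_path.open("w", encoding="utf-8") as f:
        for i in ids:
            f.write(f"{i}\t{predictions[i]}\n")  # tab separator is an assumption
    return out_path
```

Deriving the output name from the .data path keeps the basenames identical, and iterating over the ids from the .data file keeps the line order identical.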
Important Note : The scoring algorithm will go over all the instances given in the .data file. Be sure to include all the predictions in your submission.
Important Note: The predictions for Task 1 and Task 2 must be zipped together into a single zip file for submission. The files must be zipped with no intermediate folder; otherwise the scoring program will not detect them. Do not put the files in a folder before zipping.
The system also accepts separate submissions if you are planning to participate in a single task.
If you plan to participate in both tasks, submit both .predict files in a single zip file; otherwise the previously submitted task will receive a score of 0.
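A minimal sketch of building such an archive with Python's standard zipfile module is shown below. The function name and the file names in the usage note are hypothetical; the point is that passing only the file name as arcname places every file at the root of the archive, with no intermediate folder.

```python
import zipfile
from pathlib import Path
from typing import Iterable


def zip_flat(predict_files: Iterable[Path], zip_path: Path) -> Path:
    """Zip the given files so that each one sits at the archive root."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in predict_files:
            # arcname=path.name drops any parent directories from the entry.
            zf.write(path, arcname=path.name)
    return zip_path
```

For example, zip_flat([Path("out/task1_dev.predict"), Path("out/task2_dev.predict")], Path("submission.zip")) would store the two files without the out/ prefix. Listing the archive with zipfile.ZipFile("submission.zip").namelist() before uploading is a quick way to confirm that no folder names appear.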
Submit results on all test sets for Task 3, which will be provided.
We will provide the data in a token-per-line format. Participants must write their predictions with a tab between the token and the prediction on each line.
Again, the file ending must be .predict and the file name must exactly match the .data file that will be shared.
Important Note: All lines must exactly match the .data file provided. Otherwise the scoring program will not be able to calculate the score.
Be careful with extra empty lines, and ensure that the token on each line and the positions of the empty lines match the .data file exactly.
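Before submitting, it can help to check a Task 3 prediction file against the .data file line by line. The sketch below is one such check; it assumes the token is the first tab-separated field on each non-empty line of the .data file, which is not confirmed on this page, and the function name is hypothetical.

```python
from typing import List


def check_alignment(data_lines: List[str], predict_lines: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the files line up."""
    problems = []
    if len(data_lines) != len(predict_lines):
        problems.append(
            f"line count differs: {len(data_lines)} vs {len(predict_lines)}"
        )
    for n, (d, p) in enumerate(zip(data_lines, predict_lines), start=1):
        d, p = d.rstrip("\n"), p.rstrip("\n")
        if not d.strip():
            # Empty lines in the .data file must stay empty in the .predict file.
            if p.strip():
                problems.append(f"line {n}: expected an empty line")
            continue
        token = d.split("\t")[0]  # assumed: token is the first field
        parts = p.split("\t")
        if len(parts) != 2 or not parts[1]:
            problems.append(f"line {n}: expected 'token<TAB>prediction'")
        elif parts[0] != token:
            problems.append(f"line {n}: token {parts[0]!r} does not match {token!r}")
    return problems
```

Reading both files with readlines() and passing the results to check_alignment reports every mismatched line, which is easier to act on than a failed scoring run.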
Start: April 1, 2019, 9:15 a.m.
Description: We will provide intermediate evaluation results for Task 1 and Task 2 to give everyone feedback.
Start: April 1, 2019, 11:18 a.m.
Description: We will provide intermediate evaluation results to give everyone feedback.
Start: April 1, 2019, 11:18 a.m.
Description: Submissions for both Intermediate and Final Evaluation of Task 3 will be done here.
Start: May 20, 2019, 11:18 a.m.
Description: End of Competition