SMM4H 2021 - Social Media Mining for Health Shared Task

Organized by amagge

About SMM4H 2021

The proposed SMM4H shared tasks involve NLP challenges on social media mining for health monitoring and surveillance. This requires processing imbalanced, noisy, real-world, and substantially creative language expressions from social media. The proposed systems should be able to deal with the many linguistic variations and semantic complexities in the ways people express medication-related concepts and outcomes. Past research has shown that automated systems frequently underperform when exposed to social media text because of novel or creative phrases, misspellings, and the frequent use of idiomatic, ambiguous and sarcastic expressions. The tasks will thus act as a discovery and verification process of what approaches work best for social media data.

Similar to the first five runs of the shared tasks, the data include annotated collections of posts on Twitter. The training data is already prepared and will be available to the teams registering to participate.

The eight shared tasks proposed this year are:

Task 1: Classification, Extraction and Normalization of Adverse Effect mentions in English tweets
Task 2: Classification of Russian tweets for detecting presence of Adverse Effect mentions
Task 3: Classification of change in medications regimen in tweets
Task 4: Classification of tweets self-reporting adverse pregnancy outcomes
Task 5: Classification of tweets self-reporting potential cases of COVID-19
Task 6: Classification of COVID19 tweets containing symptoms
Task 7: Identification of professions and occupations (ProfNER) in Spanish tweets
Task 8: Classification of self-reported breast cancer posts on Twitter

Training Data Release	Dec 15, 2020
Validation Data Release	Feb 1, 2021
Validation set submission due [Required]	Feb 15, 2021
Test data release, evaluation phase starts	See table below
Test set predictions due	See table below
Test set evaluation scores release	Mar 8, 2021
System descriptions due	Mar 15, 2021
Acceptance notification	April 1, 2021
Camera ready system descriptions	April 12, 2021

Practice, Evaluation and Post Evaluation phases:

There are three phases to the SMM4H shared task. During the Practice phase, participants can upload their predictions for Validation sets and practice training their systems on the datasets. The participants are required to post their predictions on the validation set by Feb 15th to avoid formatting issues during the evaluation phase. The Practice phase will continue until the Evaluation phase starts and participants can submit unlimited predictions during the Practice phase.

During the Evaluation phase, participants will be provided with the Evaluation/Test dataset. They will be required to upload their predictions on the Evaluation set similar to the Practice phase but limited to two uploads per task. Results for the shared task will be populated based on the Evaluation stage performance.

Task	Practice on Validation set until (UTC 23:59)	Evaluation stage submissions open (UTC 00:01)	Evaluation stage submissions due (UTC 23:59)
Task 1a, 1b and 1c	Feb 25	Feb 26	Feb 28
Task 2	Feb 25	Feb 26	Feb 28
Task 3a and 3b	Feb 26	Feb 27	Mar 01
Task 4	Feb 26	Feb 27	Mar 01
Task 5	Feb 27	Feb 28	Mar 02
Task 6	Feb 27	Feb 28	Mar 02
Task 7a and 7b	Feb 28	Mar 01	Mar 03
Task 8	Feb 28	Mar 01	Mar 03

After the Evaluation phase is completed, participants and researchers can continue research on the topic during the Post-evaluation phase by making predictions on the evaluation set and improving their systems. Scores from this phase will not be reported in the SMM4H rankings or proceedings.

Organizers

Graciela Gonzalez-Hernandez, University of Pennsylvania, USA
Arjun Magge, University of Pennsylvania, USA
Davy Weissenbacher, University of Pennsylvania, USA
Ari Z. Klein, University of Pennsylvania, USA
Karen O’Connor, University of Pennsylvania, USA
Abeed Sarker, Emory University, USA
Mohammed Ali Al-Garadi, Emory University, USA
Elena Tutubalina, Kazan Federal University, Russia
Zulfat Miftahutdinov, Kazan Federal University, Russia
Ilsear Alimova, Kazan Federal University, Russia
Martin Krallinger, Barcelona Supercomputing Center, Spain
Antonio Miranda, Barcelona Supercomputing Center, Spain
Salvador Lima, Barcelona Supercomputing Center, Spain
Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain
Juan Banda, Georgia State University, USA

Evaluation Metrics

The evaluation metric for each task is as follows:

Task 1a: Submissions will be ranked by Precision, Recall and F1-score for the ADE class.
Task 1b: Submissions will be ranked by Precision, Recall and F1-score for each ADE extracted where the spans overlap either entirely or partially.
Task 1c: Submissions will be ranked by Precision, Recall and F1-score for each ADE extracted where the spans overlap either entirely or partially AND each span is normalized to the correct MedDRA preferred term ID.
Task 2: Submissions will be ranked by Precision, Recall and F1-score for the ADE class.
Task 3a and 3b: Submissions will be ranked by Precision, Recall and F1-score for the Medication Change class.
Task 4: Submissions will be ranked by Precision, Recall and F1-score for the APO class.
Task 5: Submissions will be ranked by Precision, Recall and F1-score for the potential COVID19 class.
Task 6: Submissions will be ranked by micro-averaged Precision, Recall and F1-score for all classes.
Task 7a: Submissions will be ranked by Precision, Recall and F1-score for the Prof class.
Task 7b: Submissions will be ranked by Precision, Recall and F1-score for each PROFESION and SITUACION_LABORAL extracted where the spans overlap entirely.
Task 8: Submissions will be ranked by Precision, Recall and F1-score for the Breast Cancer class.
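
The difference between partial overlap (Tasks 1b and 1c) and entire overlap (Task 7b) can be illustrated with a small sketch. It is not the official evaluation script. The span layout (tweet ID, start offset, exclusive end offset), the function names, and the rule that each gold span is matched at most once are assumptions made for illustration, and entity labels are ignored for brevity.

```python
from typing import List, Tuple

# Assumed layout: (tweet_id, start, end) with an exclusive end offset.
Span = Tuple[str, int, int]


def spans_match(gold: Span, pred: Span, strict: bool) -> bool:
    """Compare two spans from the same tweet, strictly or by any overlap."""
    if gold[0] != pred[0]:
        return False
    if strict:
        # Entire overlap: both boundaries must be identical.
        return gold[1] == pred[1] and gold[2] == pred[2]
    # Partial overlap: the two character ranges share at least one position.
    return pred[1] < gold[2] and gold[1] < pred[2]


def count_matches(gold: List[Span], pred: List[Span], strict: bool) -> Tuple[int, int, int]:
    """Return (TP, FP, FN), letting each gold span be matched at most once."""
    unmatched = list(gold)
    tp = 0
    for p in pred:
        hit = next((g for g in unmatched if spans_match(g, p, strict)), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
    fp = len(pred) - tp
    fn = len(unmatched)
    return tp, fp, fn
```

Under these assumptions, a predicted span that is shifted by a few characters counts as a true positive with strict=False but as one false positive plus one false negative with strict=True.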

F1-score = 2 * (Precision * Recall) / (Precision + Recall)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

Abbreviations

TP	true positives
FP	false positives
FN	false negatives
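
As a minimal sketch of how these scores follow from the counts, the function below computes Precision, Recall and F1-score from TP, FP and FN. The second function shows micro-averaging as used for Task 6: the counts are summed over all classes before scoring. The function names and the choice to return 0.0 when a denominator is zero are assumptions of this sketch, not taken from the official evaluation scripts.

```python
from typing import Dict, Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one class from raw counts."""
    # Returning 0.0 on an empty denominator is an assumption of this sketch.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def micro_average(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Micro-average: sum TP, FP and FN over all classes, then score once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return precision_recall_f1(tp, fp, fn)
```

For example, a system with 8 true positives, 2 false positives and 2 false negatives has a Precision and Recall of 8/10 each, and therefore an F1-score of 0.8.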

Terms and Conditions

By submitting results to this competition, you consent to the public release of your scores at the SMM4H'21 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers. You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science. You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers. You further agree to submit and present a short paper describing your system during the workshop. You agree not to redistribute the training and test data except in the manner prescribed by its licence.

Task Details

What is the submission format?

The submission format for Tasks 2, 3, 4, 5, 6 and 8 will be the same as the validation set format. Task 1 and Task 7 have variations. Please see the task details for submission format instructions. Evaluation scripts will be made available in the same location where the data is available.

FAQ

Q: How will I submit my results?
A: Teams should submit their results to CodaLab as a ZIP file containing a TSV file in the appropriate format. The TSV file should not be in a folder in the ZIP file, and the ZIP file should not contain any folders or files other than the TSV file.

Q: How many submissions can I make?
A: For each subtask, two submissions from each team will be accepted. You can participate in one or multiple tasks.

Q: Are there any restrictions on data and resources that can be used for training the classification system? For example, can we use manually or automatically constructed lexicons? Can we use other data (e.g., tweets, blog posts, medical records) annotated or unlabeled?
A: There are currently no restrictions on data and resources. External resources and data can be used. However, all external resources used will need to be cited and explained in the system description paper.

Q: Is there any information on the test data? Will the test data be collected in the same way as the training data?
A: The test data has been collected the same way.
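
One possible way to build a ZIP file that meets the packaging rule above (a single TSV file at the top level, with no folders or other files) is sketched below using Python's standard zipfile module. The function names and any file names passed to them are hypothetical. The required TSV columns are defined per task and are not covered here.

```python
import zipfile
from pathlib import Path


def package_submission(tsv_path: str, zip_path: str) -> None:
    """Write a ZIP holding only the TSV, stored at the archive root."""
    tsv = Path(tsv_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname drops any directory prefix so the TSV is not nested in a folder.
        archive.write(tsv, arcname=tsv.name)


def check_submission(zip_path: str) -> bool:
    """Return True if the ZIP contains exactly one entry, a top-level .tsv file."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return len(names) == 1 and "/" not in names[0] and names[0].lower().endswith(".tsv")
```

Running check_submission on the archive before uploading may help catch a nested folder or a stray extra file, which some archiving tools add silently.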

Test set scores will be communicated by email on March 8th, along with the median scores of all submissions. Rankings will be announced and released on the day of the workshop.