SMM4H 2021 - Social Media Mining for Health Shared Task

Organized by amagge


About SMM4H 2021

The proposed SMM4H shared tasks involve NLP challenges in social media mining for health monitoring and surveillance. This requires processing imbalanced, noisy, real-world, and often highly creative language from social media. Proposed systems must handle the many linguistic variations and semantic complexities in the ways people express medication-related concepts and outcomes. Past research has shown that automated systems frequently underperform on social media text because of novel and creative phrases, misspellings, and the frequent use of idiomatic, ambiguous, and sarcastic expressions. The tasks will thus act as a discovery and verification process for which approaches work best on social media data.

As in the first five runs of the shared task, the data consist of annotated collections of Twitter posts. The training data are already prepared and will be made available to teams registering to participate.

The eight shared tasks proposed this year are:

  • Task 1: Classification, Extraction and Normalization of Adverse Effect mentions in English tweets
  • Task 2: Classification of Russian tweets for detecting presence of Adverse Effect mentions
  • Task 3: Classification of change in medications regimen in tweets
  • Task 4: Classification of tweets self-reporting adverse pregnancy outcomes
  • Task 5: Classification of tweets self-reporting potential cases of COVID-19
  • Task 6: Classification of COVID-19 tweets containing symptoms
  • Task 7: Identification of professions and occupations (ProfNER) in Spanish tweets
  • Task 8: Classification of self-reported breast cancer posts on Twitter
 Training Data Release                        Dec 15, 2020
 Validation Data Release                      Feb 1, 2021
 Validation set submission due [Required]     Feb 15, 2021
 Test data release, evaluation phase starts   See table below
 Test set predictions due                     See table below
 Test set evaluation scores release           Mar 8, 2021
 System descriptions due                      Mar 15, 2021
 Acceptance notification                      Apr 1, 2021
 Camera-ready system descriptions             Apr 12, 2021

 

Practice, Evaluation and Post-Evaluation phases:

There are three phases to the SMM4H shared task. During the Practice phase, participants can train their systems on the datasets and upload their predictions for the validation sets. Participants are required to submit their predictions on the validation set by Feb 15th to catch any formatting issues before the Evaluation phase. The Practice phase continues until the Evaluation phase starts, and participants can submit unlimited predictions during it.

During the Evaluation phase, participants will be provided with the Evaluation/Test dataset. They will be required to upload their predictions on the Evaluation set as in the Practice phase, but limited to two uploads per task. Official results for the shared task will be based on Evaluation-phase performance.

 Task                 Practice on            Evaluation stage    Evaluation stage
                      Validation set until   submissions open    submissions due
                      (UTC 23:59)            (UTC 00:01)         (UTC 23:59)

 Task 1a, 1b and 1c   Feb 25                 Feb 26              Feb 28
 Task 2               Feb 25                 Feb 26              Feb 28
 Task 3a and 3b       Feb 26                 Feb 27              Mar 01
 Task 4               Feb 26                 Feb 27              Mar 01
 Task 5               Feb 27                 Feb 28              Mar 02
 Task 6               Feb 27                 Feb 28              Mar 02
 Task 7a and 7b       Feb 28                 Mar 01              Mar 03
 Task 8               Feb 28                 Mar 01              Mar 03

After the Evaluation phase is completed, participants and researchers can continue working on the topic during the Post-Evaluation phase, where they can keep improving their systems by making predictions on the evaluation set. Scores from this phase will not be reported in the SMM4H rankings or proceedings.

 

Organizers

  • Graciela Gonzalez-Hernandez, University of Pennsylvania, USA
  • Arjun Magge, University of Pennsylvania, USA
  • Davy Weissenbacher, University of Pennsylvania, USA
  • Ari Z. Klein, University of Pennsylvania, USA
  • Karen O’Connor, University of Pennsylvania, USA
  • Abeed Sarker, Emory University, USA
  • Mohammed Ali Al-Garadi, Emory University, USA
  • Elena Tutubalina, Kazan Federal University, Russia
  • Zulfat Miftahutdinov, Kazan Federal University, Russia
  • Ilseyar Alimova, Kazan Federal University, Russia
  • Martin Krallinger, Barcelona Supercomputing Center, Spain
  • Antonio Miranda, Barcelona Supercomputing Center, Spain
  • Salvador Lima, Barcelona Supercomputing Center, Spain
  • Eulàlia Farré-Maduell, Barcelona Supercomputing Center, Spain
  • Juan Banda, Georgia State University, USA

Evaluation Metrics

The evaluation metric for each task is as follows:

  • Task 1a: Submissions will be ranked by Precision, Recall and F1-score for the ADE class.
  • Task 1b: Submissions will be ranked by Precision, Recall and F1-score for each extracted ADE whose span overlaps a gold span either entirely or partially.
  • Task 1c: Submissions will be ranked by Precision, Recall and F1-score for each extracted ADE whose span overlaps a gold span either entirely or partially AND is normalized to the correct MedDRA preferred term ID.
  • Task 2: Submissions will be ranked by Precision, Recall and F1-score for the ADE class.
  • Task 3a and 3b: Submissions will be ranked by Precision, Recall and F1-score for the Medication Change class.
  • Task 4: Submissions will be ranked by Precision, Recall and F1-score for the APO class.
  • Task 5: Submissions will be ranked by Precision, Recall and F1-score for the potential COVID-19 class.
  • Task 6: Submissions will be ranked by micro-averaged Precision, Recall and F1-score for all classes.
  • Task 7a: Submissions will be ranked by Precision, Recall and F1-score for the Prof class.
  • Task 7b: Submissions will be ranked by Precision, Recall and F1-score for each extracted PROFESION and SITUACION_LABORAL whose span overlaps a gold span entirely.
  • Task 8: Submissions will be ranked by Precision, Recall and F1-score for the Breast Cancer class.
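For the span-extraction tasks (1b and 7b), a minimal sketch of what overlap matching could look like, assuming spans are represented as (start, end) character offsets with exclusive ends; the official evaluation scripts define the exact matching criteria:

```python
# Hedged sketch of span matching for extraction tasks.
# Representation assumed: (start, end) character offsets, end exclusive.

def spans_overlap(pred, gold):
    """Partial-or-entire overlap: the intervals share at least one character.
    This is the looser criterion used for Task 1b-style matching."""
    return pred[0] < gold[1] and gold[0] < pred[1]

def spans_match_exactly(pred, gold):
    """Entire overlap only, as in Task 7b-style matching: offsets must agree."""
    return pred == gold
```

For example, a predicted span (0, 5) partially overlaps a gold span (3, 8) and would count under the Task 1b criterion, but not under the exact-match criterion.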

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1-score = 2 × (Precision × Recall) / (Precision + Recall)

Abbreviations

TP	true positives
FP	false positives
FN	false negatives
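The formulas above can be sketched in code as follows; the positive class label ("ADE") and the list-of-labels input format are illustrative assumptions, not the official scorer:

```python
# Sketch of the Precision/Recall/F1 formulas above for one positive class.
# Inputs: parallel lists of gold and predicted labels (format is assumed).
def precision_recall_f1(gold, pred, positive="ADE"):
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    # Guard against division by zero when a class is never predicted/present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```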

Terms and Conditions

By submitting results to this competition, you consent to the public release of your scores at the SMM4H'21 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers. You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science. You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers. You further agree to submit and present a short paper describing your system during the workshop. You agree not to redistribute the training and test data except in the manner prescribed by its licence.

Task Details

What is the submission format?

The submission format for Tasks 2, 3, 4, 5, 6 and 8 will be the same as the validation set format. Tasks 1 and 7 have variations; please see the task details for submission format instructions. Evaluation scripts will be made available in the same location as the data.

FAQ

Q: How will I submit my results?
A: Teams should submit their results to CodaLab as a ZIP file containing a TSV file in the appropriate format. The TSV file should not be inside a folder in the ZIP file, and the ZIP file should not contain any folders or files other than the TSV file.

Q: How many submissions can I make?
A: For each subtask, two submissions from each team will be accepted. You can participate in one or multiple tasks.

Q: Are there any restrictions on the data and resources that can be used for training the classification system? For example, can we use manually or automatically constructed lexicons? Can we use other data (e.g., tweets, blog posts, medical records), annotated or unlabeled?
A: There are currently no restrictions on data and resources; external resources and data can be used. However, all external resources used must be cited and explained in the system description paper.

Q: Is there any information on the test data? Will the test data be collected in the same way as the training data?
A: The test data has been collected in the same way as the training data.
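As an illustration of the ZIP layout the FAQ describes (TSV at the archive root, no folders), a minimal packaging sketch; the file names `predictions.tsv` and `submission.zip` are assumptions, so check each task's instructions for the required names:

```python
# Sketch: package a predictions TSV at the top level of a ZIP, with no
# folders, as the FAQ requires. File names here are illustrative only.
import os
import zipfile

def package_submission(tsv_path="predictions.tsv", zip_path="submission.zip"):
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # An arcname without any directory component keeps the TSV at the
        # archive root, so the scorer finds it without unpacking folders.
        zf.write(tsv_path, arcname=os.path.basename(tsv_path))
```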

Test set scores will be communicated by email on March 8th, along with the median scores of all submissions. Rankings will be announced and released on the day of the workshop.

Task 1a ADE tweet classification, Practice

Start: Jan. 1, 2021, midnight

Task 1a ADE tweet classification, Evaluation

Start: Feb. 26, 2021, midnight

Task 1a ADE tweet classification, Post-Eval

Start: March 2, 2021, midnight

Task1b ADE span detection, Practice

Start: Jan. 1, 2021, midnight

Task1b ADE span detection, Evaluation

Start: Feb. 26, 2021, midnight

Task1b ADE span detection, Post-Eval

Start: March 2, 2021, midnight

Task1c ADE resolution, Practice

Start: Jan. 1, 2021, midnight

Task1c ADE resolution, Evaluation

Start: Feb. 26, 2021, midnight

Task1c ADE resolution, Post-Eval

Start: March 2, 2021, midnight

Task2 Russian ADE classification, Practice

Start: Jan. 1, 2021, midnight

Task2 Russian ADE classification, Evaluation

Start: Feb. 26, 2021, midnight

Task2 Russian ADE classification, Post-Eval

Start: March 2, 2021, midnight

Task3a Med Change-Twitter, Practice

Start: Jan. 1, 2021, midnight

Task3a Med Change-Twitter, Evaluation

Start: Feb. 27, 2021, midnight

Task3a Med Change-Twitter, Post-Eval

Start: March 3, 2021, midnight

Task3b Med Change-WebMD, Practice

Start: Jan. 1, 2021, midnight

Task3b Med Change-WebMD, Evaluation

Start: Feb. 27, 2021, midnight

Task3b Med Change-WebMD, Post-Eval

Start: March 3, 2021, midnight

Task4 Adverse Pregnancy, Practice

Start: Jan. 1, 2021, midnight

Task4 Adverse Pregnancy, Evaluation

Start: Feb. 27, 2021, midnight

Task4 Adverse Pregnancy, Post-Eval

Start: March 3, 2021, midnight

Task5 Potential COVID19 Cases, Practice

Start: Jan. 1, 2021, midnight

Task5 Potential COVID19 Cases, Evaluation

Start: Feb. 28, 2021, midnight

Task5 Potential COVID19 Cases, Post-Eval

Start: March 4, 2021, midnight

Task6 COVID19 symptoms, Practice

Start: Jan. 1, 2021, midnight

Task6 COVID19 symptoms, Evaluation

Start: Feb. 28, 2021, midnight

Task6 COVID19 symptoms, Post-Eval

Start: March 4, 2021, midnight

Task7a Classify professions, Practice

Start: Jan. 1, 2021, midnight

Task7a Classify professions, Evaluation

Start: March 1, 2021, midnight

Task7a Classify professions, Post-Eval

Start: March 5, 2021, midnight

Task7b Profession span ext, Practice

Start: Jan. 1, 2021, midnight

Task7b Profession span ext, Evaluation

Start: March 1, 2021, midnight

Task7b Profession span ext, Post-Eval

Start: March 5, 2021, midnight

Task8 Breast cancer classif, Practice

Start: Jan. 1, 2021, midnight

Task8 Breast cancer classif, Evaluation

Start: March 1, 2021, midnight

Task8 Breast cancer classif, Post-Eval

Start: March 5, 2021, midnight

Competition Ends

Never
