The proposed SMM4H shared tasks involve NLP challenges on social media mining for health monitoring and surveillance. They require processing imbalanced, noisy, real-world, and highly creative language from social media. Participating systems should be able to handle the many linguistic variations and semantic complexities in the ways people express medication-related concepts and outcomes. Past research has shown that automated systems frequently underperform on social media text because of novel or creative phrases, misspellings, and the frequent use of idiomatic, ambiguous, and sarcastic expressions. The tasks will thus serve to discover and verify which approaches work best for social media data.
Similar to the first five runs of the shared tasks, the data include annotated collections of posts on Twitter. The training data is already prepared and will be available to the teams registering to participate.
The eight shared tasks proposed this year are:
Training Data Release | Dec 15, 2020 |
Validation Data Release | Feb 1, 2021 |
Validation set submission due [Required] | Feb 15, 2021 |
Test data release, evaluation phase starts | See table below |
Test set predictions due | See table below |
Test set evaluation scores release | Mar 8, 2021 |
System descriptions due | Mar 15, 2021 |
Acceptance notification | April 1, 2021 |
Camera ready system descriptions | April 12, 2021 |
Practice, Evaluation and Post Evaluation phases:
There are three phases to the SMM4H shared task. During the Practice phase, participants can upload their predictions for Validation sets and practice training their systems on the datasets. The participants are required to post their predictions on the validation set by Feb 15th to avoid formatting issues during the evaluation phase. The Practice phase will continue until the Evaluation phase starts and participants can submit unlimited predictions during the Practice phase.
During the Evaluation phase, participants will be provided with the Evaluation/Test dataset. They will upload their predictions on the Evaluation set in the same way as in the Practice phase, but are limited to two uploads per task. Results for the shared task will be based on performance in the Evaluation phase.
Task | Practice on | Evaluation stage submissions open (UTC 00:01) | Evaluation stage |
Task 1a, 1b and 1c | Feb 25 | Feb 26 | Feb 28 |
Task 2 | Feb 25 | Feb 26 | Feb 28 |
Task 3a and 3b | Feb 26 | Feb 27 | Mar 01 |
Task 4 | Feb 26 | Feb 27 | Mar 01 |
Task 5 | Feb 27 | Feb 28 | Mar 02 |
Task 6 | Feb 27 | Feb 28 | Mar 02 |
Task 7a and 7b | Feb 28 | Mar 01 | Mar 03 |
Task 8 | Feb 28 | Mar 01 | Mar 03 |
After the Evaluation phase is completed, participants and researchers can continue to improve their systems during the Post-evaluation phase by making predictions on the evaluation set. These scores will not be reported in the SMM4H rankings or proceedings.
The evaluation metric for each task is as follows:
Task 1a: Submissions will be ranked by Precision, Recall and F1-score for the ADE class.
Task 1b: Submissions will be ranked by Precision, Recall and F1-score for each ADE extracted where the spans overlap either entirely or partially.
Task 1c: Submissions will be ranked by Precision, Recall and F1-score for each ADE extracted where the spans overlap either entirely or partially AND each span is normalized to the correct MedDRA preferred term ID.
Task 2: Submissions will be ranked by Precision, Recall and F1-score for the ADE class.
Task 3a and 3b: Submissions will be ranked by Precision, Recall and F1-score for the Medication Change class.
Task 4: Submissions will be ranked by Precision, Recall and F1-score for the APO class.
Task 5: Submissions will be ranked by Precision, Recall and F1-score for the potential COVID19 class.
Task 6: Submissions will be ranked by micro-averaged Precision, Recall and F1-score for all classes.
Task 7a: Submissions will be ranked by Precision, Recall and F1-score for the Prof class.
Task 7b: Submissions will be ranked by Precision, Recall and F1-score for each PROFESION and SITUACION_LABORAL extracted where the spans overlap entirely.
Task 8: Submissions will be ranked by Precision, Recall and F1-score for the Breast Cancer class.
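The difference between the relaxed matching of Tasks 1b and 1c (entire or partial overlap) and the strict matching of Task 7b (entire overlap only) can be illustrated with a minimal Python sketch. This is not the official evaluation script: the `Span` type, the function names, the end-exclusive offsets, and the greedy one-to-one matching are all assumptions made for illustration, and the organizers' scripts may count matches differently.

```python
from typing import List, Tuple

# Hypothetical span representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def spans_overlap(a: Span, b: Span) -> bool:
    """True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def count_matches(gold: List[Span], pred: List[Span], strict: bool = False) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for one post.

    strict=False accepts entire or partial overlap (relaxed matching);
    strict=True requires identical offsets. Each gold span is matched to
    at most one predicted span (greedy matching, an assumption).
    """
    unmatched_gold = list(gold)
    tp = 0
    for p in pred:
        for g in unmatched_gold:
            matched = (p == g) if strict else spans_overlap(p, g)
            if matched:
                unmatched_gold.remove(g)  # safe: we leave the inner loop immediately
                tp += 1
                break
    fp = len(pred) - tp
    fn = len(unmatched_gold)
    return tp, fp, fn
```

Under these assumptions, a predicted span that covers only part of a gold span counts as a true positive with relaxed matching but as both a false positive and a false negative with strict matching.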
F1-score = 2 * (Precision * Recall) / (Precision + Recall)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Abbreviations
TP: true positives
FP: false positives
FN: false negatives
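As a quick sanity check before submitting, the formulas above can be computed from raw counts with a few lines of Python. This is a sketch only, not the official evaluation script: the function names are hypothetical, returning 0.0 when a denominator is zero is an assumption, and the micro-averaging shown (pooling counts over all classes before scoring, as the Task 6 metric suggests) follows the standard definition rather than anything stated on this page.

```python
from typing import Dict, Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from raw counts.

    Ratios with a zero denominator are reported as 0.0 (an assumption).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def micro_average(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Pool per-class (TP, FP, FN) counts, then score once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return precision_recall_f1(tp, fp, fn)


if __name__ == "__main__":
    # 8 true positives, 2 false positives, 8 false negatives.
    print(precision_recall_f1(tp=8, fp=2, fn=8))
```

With 8 true positives, 2 false positives and 8 false negatives, precision is 8/10 and recall is 8/16, so the F1-score lies between the two at roughly 0.62.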
By submitting results to this competition, you consent to the public release of your scores at the SMM4H'21 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers. You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science. You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers. You further agree to submit and present a short paper describing your system during the workshop. You agree not to redistribute the training and test data except in the manner prescribed by its licence.
The submission format for Tasks 2, 3, 4, 5, 6, and 8 will be the same as the validation set format. Task 1 and Task 7 have variations. Please see the task details for submission format instructions. Evaluation scripts will be made available in the same location where the data is available.
Q: How will I submit my results?
A: Teams should submit their results to CodaLab as a ZIP file containing a TSV file in the appropriate format. The TSV file should not be in a folder in the ZIP file, and the ZIP file should not contain any folders or files other than the TSV file.
Q: How many submissions can I make?
A: For each subtask, two submissions from each team will be accepted. You can participate in one or multiple tasks.
Q: Are there any restrictions on data and resources that can be used for training the classification system? For example, can we use manually or automatically constructed lexicons? Can we use other data (e.g., tweets, blog posts, medical records) annotated or unlabeled?
A: There are currently no restrictions on data and resources. External resources and data can be used. However, all external resources used will need to be cited and explained in the system description paper.
Q: Is there any information on the test data? Will the test data be collected in the same way as the training data?
A: The test data has been collected the same way.
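The packaging rule in the first answer (a ZIP whose only entry is the TSV file, with no folders) is easy to get wrong when zipping from a file manager. A minimal Python sketch using only the standard library is shown below; the function names and file paths are hypothetical, and the required TSV columns depend on the task and are not covered here.

```python
import zipfile
from pathlib import Path


def package_submission(tsv_path: str, zip_path: str) -> None:
    """Write a ZIP whose only entry is the TSV file, stored at the archive root."""
    tsv = Path(tsv_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname=tsv.name drops directory components so no folder appears in the ZIP.
        archive.write(tsv, arcname=tsv.name)


def check_submission(zip_path: str) -> bool:
    """Return True if the archive holds exactly one .tsv file and no folders."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return len(names) == 1 and "/" not in names[0] and names[0].lower().endswith(".tsv")
```

Running `check_submission` on the archive before uploading catches the common case where the TSV ends up nested inside a folder.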
Test set scores, along with the median scores of all submissions, will be communicated by email on March 8th. Rankings will be announced and released on the day of the workshop.