SDU@AAAI-21 - Shared Task 2, Acronym Disambiguation

Organized by amirveyseh

First phase

Sept. 1, 2020, midnight UTC


Competition Ends
Dec. 5, 2020, 11:59 a.m. UTC

This competition is the second of the shared tasks introduced in the AAAI-21 Workshop on Scientific Document Understanding. The task is to find the correct meaning of an ambiguous acronym in a given sentence. The input to the system is a sentence containing an ambiguous acronym and a dictionary listing the possible expansions (i.e., long-forms) of that acronym. For instance:

    Input - Sentence: They use CNN in the proposed model.

    Input - Dictionary: CNN: 1. Convolutional Neural Network, 2. Cable News Network

    Output: Convolutional Neural Network

In this example, the ambiguous acronym in the input sentence is "CNN", and the expected prediction for its correct meaning is "Convolutional Neural Network". For this task, participants are provided with training and development datasets consisting of 62,441 sentences and a dictionary of 732 ambiguous acronyms. The dataset and dictionary were created from 6,786 English scientific papers published on arXiv.
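The input/output contract described above can be illustrated with a very simple most-frequent-expansion baseline. This is only a sketch under stated assumptions: the function names, the in-memory data layout, and the toy training pairs are hypothetical and invented for illustration; they are not the official data format or the official baseline of the shared task.

```python
from collections import Counter, defaultdict


def train_most_frequent(examples):
    """Map each acronym to its most frequent expansion.

    `examples` is an iterable of (acronym, expansion) pairs. This layout
    is an assumption for illustration, not the released file format.
    """
    counts = defaultdict(Counter)
    for acronym, expansion in examples:
        counts[acronym][expansion] += 1
    return {a: c.most_common(1)[0][0] for a, c in counts.items()}


def disambiguate(acronym, dictionary, most_frequent):
    """Pick one expansion for `acronym` from its candidate long-forms.

    Falls back to the first dictionary entry when the acronym was not
    seen in training or the learned expansion is not a listed candidate.
    """
    candidates = dictionary[acronym]
    guess = most_frequent.get(acronym)
    return guess if guess in candidates else candidates[0]


# Toy data, invented for illustration only.
dictionary = {"CNN": ["Convolutional Neural Network", "Cable News Network"]}
training_pairs = [
    ("CNN", "Convolutional Neural Network"),
    ("CNN", "Convolutional Neural Network"),
    ("CNN", "Cable News Network"),
]

model = train_most_frequent(training_pairs)
prediction = disambiguate("CNN", dictionary, model)
print(prediction)
```

A baseline like this ignores the sentence entirely, so it mainly serves as a sanity check on the data pipeline; a competitive system would use the context around the acronym to choose among the candidates.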



The Acronym Disambiguation competition has two phases:

  • Development: In this phase, participants use the training/development sets provided on the CodaLab participate page to design and develop their models.
  • Evaluation: Two weeks before the end of the competition, i.e., on November 20, 2020, the test set is released and made accessible on the CodaLab participate page. The test set has the same distribution and format as the development set. Run your model on the provided test set and submit the prediction results on the CodaLab participate page. For more information on the submission format, see the participate page.



To participate, first fill out this form to provide the details of your team. To submit the results of your model runs, use the CodaLab participate page. On the submission page, please make sure to use the same team name you provided in the registration form. For more information on the shared task, check out the shared task GitHub page and the workshop website.



Competition participants are invited to present their work in the poster session of the SDU@AAAI-21 workshop. The winner of the competition will be given an oral presentation slot. In addition, SDU@AAAI-21 strongly encourages participants to submit their system papers to the workshop. The system papers will appear in the workshop proceedings in the shared task track. For more information on the workshop, please see the SDU@AAAI-21 website.


Important Dates

  • Training and development set release: September 1, 2020

  • Test set release: November 20, 2020

  • System runs due date: December 4, 2020

  • System papers due date: December 11, 2020

  • Presentation at SDU@AAAI-21: February 9, 2021



Please send your inquiries to
For more updates, join our Google group and follow us on Twitter.

Evaluation Criteria

The submitted results will be evaluated on the test set using macro-averaged precision, recall, and F1 scores for long-form prediction. A long-form prediction is counted as correct if it matches the ground-truth long-form of the given acronym in the input sentence. The official score is the macro-averaged F1 score.

The evaluation script is provided on the GitHub page of this competition.
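As a rough illustration of macro-averaging, the sketch below computes precision, recall, and F1 per long-form and then averages each metric over the long-forms. It is not the official evaluation script: the function name `macro_prf` and the toy labels are hypothetical, the average here runs over the union of gold and predicted long-forms, and macro F1 is taken as the mean of per-class F1 values. The official script may make different choices (for example, deriving F1 from the macro precision and recall), so use it for any reported numbers.

```python
from collections import Counter


def macro_prf(gold, pred):
    """Return macro-averaged (precision, recall, F1) over long-form labels.

    `gold` and `pred` are equal-length lists of long-form strings, one
    entry per test sentence.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    true_pos, pred_count, gold_count = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        gold_count[g] += 1
        pred_count[p] += 1
        if g == p:
            true_pos[g] += 1

    # Assumption: average over every label seen in gold or predictions.
    labels = set(gold_count) | set(pred_count)
    if not labels:
        return 0.0, 0.0, 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        r = true_pos[label] / gold_count[label] if gold_count[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels, invented for illustration only.
gold = ["Convolutional Neural Network", "Convolutional Neural Network", "Cable News Network"]
pred = ["Convolutional Neural Network", "Cable News Network", "Cable News Network"]
precision, recall, f1 = macro_prf(gold, pred)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

Because every long-form contributes equally to the average, rare expansions weigh as much as frequent ones, which is why a system that only predicts the most common expansion tends to score poorly under this metric.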

The dataset provided for this competition is licensed under the CC BY-NC-SA 4.0 International license, and the evaluation script and the baseline are licensed under the MIT license. By accepting the terms and conditions, you agree that:

  • Organizers have the right to publicly release the team name, the affiliation of the teams, and the scores (including all metrics computed in the evaluation script) in upcoming publications.
  • Organizers have the right to exclude the results of teams that do not comply with the fair competition rules (e.g., by submitting deceptive or erroneous results).
  • You will not redistribute the dataset (including the training/development/test data).
  • You will provide a sufficiently detailed description of the model used to make predictions on the test set (including but not limited to model architecture, training settings, word embedding types, etc.).


Start: Sept. 1, 2020, midnight

Description: Participants will use the training/development sets to design their models.


Start: Nov. 20, 2020, midnight

Description: Participants will submit their model runs on the test set.
