SDU@AAAI-21 - Shared Task 1, Acronym Identification

Organized by amirveyseh

First phase

Sept. 1, 2020, midnight UTC


Competition Ends
Dec. 5, 2020, 11:59 a.m. UTC

SDU@AAAI-21 - Shared Task 1: Acronym Identification

This competition is the first of the shared tasks introduced in the AAAI-21 Workshop on Scientific Document Understanding. This task aims to identify acronyms (i.e., short-forms) and their meanings (i.e., long-forms) in documents. For instance:

            Input: Existing methods for learning with noisy labels (LNL) primarily take a loss correction approach.

            Output: Existing methods for learning with noisy labels (LNL) primarily take a loss correction approach.

In this example, the acronym is "LNL" and its long-form is "learning with noisy labels". This task is modeled as a sentence-level sequence labeling problem. Participants are provided with manually labeled training and development datasets consisting of 17,506 sentences extracted from English scientific papers published on arXiv.
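As a rough illustration of what sentence-level sequence labeling means here, the sketch below tags the example sentence with a BIO scheme and recovers the acronym and long-form spans. The tokenization, the tag names (B-short, I-short, B-long, I-long, O), and the helper extract_spans are assumptions made for this example; the official dataset format may differ, so check the shared task GitHub page.

```python
# Hypothetical tokenization and BIO tag names; the official data may differ.
tokens = ["Existing", "methods", "for", "learning", "with", "noisy", "labels",
          "(", "LNL", ")", "primarily", "take", "a", "loss", "correction",
          "approach", "."]
labels = ["O", "O", "O", "B-long", "I-long", "I-long", "I-long",
          "O", "B-short", "O", "O", "O", "O", "O", "O",
          "O", "O"]


def extract_spans(labels):
    """Return a set of (kind, start, end) spans, end exclusive, from BIO labels."""
    spans = set()
    kind, start = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes a trailing span
        inside = label.startswith("I-") and label[2:] == kind
        if kind is not None and not inside:
            # The current span ends just before position i.
            spans.add((kind, start, i))
            kind, start = None, None
        if label.startswith("B-"):
            kind, start = label[2:], i
    return spans


for kind, start, end in sorted(extract_spans(labels)):
    print(kind, " ".join(tokens[start:end]))
```

Under these assumed tags, a model's job is to predict one label per token, and the spans are read off from the predicted label sequence.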



The Acronym Identification competition has two phases:

  • Development: In this phase, participants use the training/development sets provided on the CodaLab participate page to design and develop their models.
  • Evaluation: Two weeks before the end of the competition, i.e., on November 20, 2020, the test set is released on the CodaLab participate page. The test set has the same distribution and format as the development set. Run your model on the provided test set and submit the prediction results on the CodaLab participate page. For more information on the submission format, see the CodaLab submit result page.



To participate, first fill out this form to provide the details of your team. To submit the results of your model runs, use the CodaLab participate page. On the submit result page, please make sure to use the same team name you provided in the registration form. For more information on the shared task, check out the shared task GitHub page and the workshop website.



Competition participants are invited to present their work in the poster session of the SDU@AAAI-21 workshop. The winner of the competition will be given an oral presentation slot. In addition, SDU@AAAI-21 strongly encourages participants to submit their system papers to the workshop. The system papers will appear in the workshop proceedings in the shared task track. For more information on the workshop, please see the SDU@AAAI-21 website.


Important Dates

  • Training and development set release: September 1, 2020
  • Test set release: November 20, 2020
  • System runs due date: December 4, 2020
  • System papers due date: December 11, 2020
  • Presentation at SDU@AAAI-21: February 8 or 9, 2021



Please send your inquiries to
For more updates, join our Google group and follow us on Twitter.

Evaluation Criteria 

The submitted results will be evaluated based on their macro-averaged precision, recall, and F1 scores on the test set, computed for correct predictions of short-form (i.e., acronym) and long-form (i.e., phrase) boundaries in the sentences. A short-form or long-form boundary prediction is counted as correct if the beginning and the end of the predicted boundary are equal to the ground-truth beginning and end of the corresponding short-form or long-form boundary. The official score is the macro average of the short-form and long-form prediction F1 scores.

The evaluation script is provided on the GitHub page of this competition.
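The following is a minimal sketch of the scoring described above, not the official evaluation script. It assumes each span is represented as a tuple (sentence_id, kind, start, end) with kind being "short" or "long"; the function names prf and macro_scores are hypothetical. It averages the per-class precision, recall, and F1; the official script may aggregate differently, so use it for any reported numbers.

```python
def prf(gold, pred):
    """Precision, recall, and F1 over exact-match spans (sets of tuples)."""
    correct = len(gold & pred)  # a span counts only if both boundaries match
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_scores(gold_spans, pred_spans):
    """Average precision, recall, and F1 over the short and long classes.

    Spans are assumed to be (sentence_id, kind, start, end) tuples.
    """
    per_kind = []
    for kind in ("short", "long"):
        gold = {s for s in gold_spans if s[1] == kind}
        pred = {s for s in pred_spans if s[1] == kind}
        per_kind.append(prf(gold, pred))
    # zip(*per_kind) groups the precisions, the recalls, and the F1 scores.
    return tuple(sum(values) / len(per_kind) for values in zip(*per_kind))


# The acronym is predicted exactly; the long-form start is off by one token.
gold = {(0, "short", 8, 9), (0, "long", 3, 7)}
pred = {(0, "short", 8, 9), (0, "long", 4, 7)}
print(macro_scores(gold, pred))
```

In this toy case the short-form class scores 1.0 and the long-form class scores 0.0 on every metric, because a partially overlapping boundary earns no credit under exact matching.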

The dataset provided for this competition is licensed under the CC BY-NC-SA 4.0 International license, and the evaluation script and the baseline are licensed under the MIT license. By accepting the terms and conditions you agree that:

  • Organizers have the right to publicly release the team name, the affiliation of the teams, and the scores (including all metrics computed in the evaluation script) in upcoming publications.
  • Organizers have the right to exclude the results of teams that do not comply with the fair competition rules (e.g., by submitting deceptive or erroneous results).
  • You will not redistribute the dataset (including the training/development/test data).
  • You will provide a sufficient description of the model used to make predictions on the test set (including but not limited to model architecture, training settings, word embedding types, etc.).


Start: Sept. 1, 2020, midnight

Description: Participants will use the training/development sets to design the models


Start: Nov. 20, 2020, midnight

Description: Participants will submit their model runs on the test set

Competition Ends

Dec. 5, 2020, 11:59 a.m.
