This competition is the second of the shared tasks introduced at the AAAI-21 Workshop on Scientific Document Understanding (SDU@AAAI-21). The task aims to find the correct meaning of an ambiguous acronym in a given sentence. The input to the system is a sentence containing an ambiguous acronym and a dictionary of the acronym's possible expansions (i.e., long-forms). For instance:
Input - Sentence: They use CNN in the proposed model.
Input - Dictionary: CNN: 1. Convolutional Neural Network, 2. Cable News Network
Output: Convolutional Neural Network
In this example, the ambiguous acronym in the input sentence is CNN, and the expected prediction for its correct meaning is "Convolutional Neural Network". For this task, participants are provided with training and development datasets consisting of 62,441 sentences and a dictionary of 732 ambiguous acronyms. The dataset and dictionary are created from 6,786 English scientific papers published on arXiv.
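To make the input/output format concrete, below is a minimal sketch of a naive word-overlap disambiguator in Python. It only illustrates the task interface; it is not the official baseline from the shared task GitHub page, and the disambiguate helper and its overlap scoring are illustrative assumptions rather than part of the task definition.

    # Naive illustration of the task interface: pick the candidate long-form
    # that shares the most words with the input sentence.
    # This is NOT the official baseline; see the shared task GitHub page.
    def disambiguate(sentence, candidates):
        words = set(sentence.lower().replace(".", " ").split())
        # Score each long-form by word overlap; ties keep the first candidate.
        return max(candidates, key=lambda lf: len(set(lf.lower().split()) & words))

    dictionary = {"CNN": ["Convolutional Neural Network", "Cable News Network"]}
    print(disambiguate("A convolutional layer is at the core of a CNN.",
                       dictionary["CNN"]))
    # -> Convolutional Neural Network ("convolutional" appears in the sentence)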
The Acronym Disambiguation competition has two phases:
1. Development phase (starts Sept. 1, 2020, midnight): participants use the training and development sets to design their models.
2. Evaluation phase (starts Nov. 20, 2020, midnight): participants submit their model runs on the test set. The competition ends Dec. 5, 2020, 11:59 a.m.
To participate, first fill out this form to provide your team's details: https://rb.gy/m7frwz. To submit your model runs, use the CodaLab Participate page. On the submission page, please make sure to use the same team name you provided in the registration form. For more information on the shared task, see the shared task GitHub page and the workshop website.
Competition participants are invited to present their work in the poster session of the SDU@AAAI-21 workshop, and the competition winner will be invited to give an oral presentation. In addition, SDU@AAAI-21 strongly encourages participants to submit their system papers to the workshop; accepted system papers will appear in the shared task track of the workshop proceedings. For more information on the workshop, please see the SDU@AAAI-21 website.
Timeline:
Training and development set release: September 1, 2020
Test set release: November 20, 2020
System runs due date: December 4, 2020
System papers due date: December 11, 2020
Presentation at SDU@AAAI-21: February 9, 2021
The submitted results will be evaluated based on macro-averaged precision, recall, and F1 scores computed for long-form prediction on the test set. A long-form prediction is counted as correct if it matches the ground-truth long-form of the given acronym in the input sentence. The official score is the macro-averaged F1.
The evaluation script is provided on the GitHub page of this competition.
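For a rough sense of how such macro-averaged scores are computed, here is a simplified sketch using scikit-learn. Treating the gold and predicted long-forms as aligned lists of strings is an assumed simplification; the official evaluation script on the GitHub page remains the authoritative scorer.

    # Simplified sketch of macro-averaged scoring with scikit-learn,
    # assuming gold and predicted long-forms are aligned lists of strings.
    # Use the official evaluation script for actual scoring.
    from sklearn.metrics import precision_recall_fscore_support

    gold = ["Convolutional Neural Network", "Cable News Network",
            "Convolutional Neural Network"]
    pred = ["Convolutional Neural Network", "Convolutional Neural Network",
            "Convolutional Neural Network"]

    p, r, f1, _ = precision_recall_fscore_support(
        gold, pred, average="macro", zero_division=0)
    print(f"macro P={p:.3f} R={r:.3f} F1={f1:.3f}")
    # -> macro P=0.333 R=0.500 F1=0.400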
The dataset provided for this competition is licensed under the CC BY-NC-SA 4.0 International license, and the evaluation script and baseline are licensed under the MIT license. By accepting the terms and conditions you agree that:
Organizers:
Thien Huu Nguyen, University of Oregon, USA
Walter Chang, Adobe Research, USA
Amir Pouran Ben Veyseh, University of Oregon, USA
Leo Anthony Celi, Harvard University and MIT, USA
Franck Dernoncourt, Adobe Research, USA