ELEXIS Monolingual Word Sense Alignment Task

Organized by jmccrae

Test Phase
Start: Feb. 3, 2020, midnight UTC
Description: Main evaluation of the word sense alignment task

Competition Ends
Never
1st “Monolingual Word Sense Alignment” Shared Task

Call for Participation

The ELEXIS project is organizing a shared task on monolingual word sense alignment across dictionaries as part of the GLOBALEX 2020 – Linked Lexicography workshop at the 12th Language Resources and Evaluation Conference (LREC 2020), taking place on Tuesday, May 12, 2020 in Marseille, France.

Monolingual word sense alignment is the task of finding matching senses between two dictionary entries for the same word, and it will play a crucial role in the development of new lexical resources. The task presents a challenging combination of NLP, semantic textual similarity and reasoning, since the best alignment must be found across a whole group of senses.

Description of Task

The task of monolingual word sense alignment is framed as predicting the relationship between two senses as one of five categories: “exact”, “broader”, “narrower”, “related” or “none”. For each sense pair the following information will be provided:

  • The lemma shared between the two entries
  • The part of speech of the entries*
  • The sense text (including definition) of the sense of the first entry
  • The sense text (including definition) of the sense of the second entry
  • (Training Data) The label of the relation (“exact”, “broader”, “narrower”, “related” or “none”)

For each pair of entries, all mappings between senses will be provided; as such, we expect the best systems to consider the mapping of an entry as a block.

Training data will be available for monolingual dictionaries in the following languages:

  • Basque‡
  • Bulgarian‡
  • Danish
  • Dutch
  • English†
  • Estonian
  • German
  • Hungarian*
  • Irish
  • Italian
  • Portuguese‡
  • Serbian
  • Slovenian
  • Spanish‡
  • Russian

* For Hungarian, part-of-speech information is not provided
† There are two English evaluation tasks with different dictionaries
‡ Data for these languages will be released in February

Participants may take part in any or all of the above languages. The test data will consist of a group of entries with the relation label missing; participants should submit their results in the same format as the training data, i.e., the test data with the predicted label filled in.

Example Data

Each line is tab-separated and contains the lemma, the part of speech, the sense text from the first entry, the sense text from the second entry and, in the training data, the relation label:

squall verb blow in a squall to cry out; to scream or cry violently, as a woman frightened, or a child in anger or distress; . none
squall verb make high-pitched, whiney noises to cry out; to scream or cry violently, as a woman frightened, or a child in anger or distress; . none
squall verb utter a sudden loud cry to cry out; to scream or cry violently, as a woman frightened, or a child in anger or distress; . exact
commentator noun a writer who reports and analyzes events of the day one who writes a commentary or comments; an expositor; an annotator. narrower
commentator noun an expert who observes and comments on something one who writes a commentary or comments; an expositor; an annotator. narrower
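As a minimal sketch, the training data can be loaded as follows. This assumes each line is tab-separated in the field order shown above (lemma, part of speech, sense text of entry 1, sense text of entry 2, label); the exact file layout should be checked against the released data.

```python
# Sketch of loading the shared-task training data, assuming tab-separated
# fields in the order: lemma, POS, sense text 1, sense text 2, label.
import csv

LABELS = {"exact", "broader", "narrower", "related", "none"}

def read_pairs(path):
    """Yield one dict per sense pair from a training TSV file."""
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            lemma, pos, sense1, sense2, label = row
            assert label in LABELS, f"unexpected label: {label}"
            yield {"lemma": lemma, "pos": pos,
                   "sense1": sense1, "sense2": sense2, "label": label}
```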


Publication of Results

Participants will submit a system paper that includes a description of the system, how the data was processed, the algorithms applied, the results obtained, as well as conclusions and ideas for future improvements. The papers will be peer-reviewed prior to publication to confirm that all aspects are well covered.

The workshop will also accept regular papers from authors who are not participating in the shared task but have worked on the topic of word sense alignment and want to publish novel results or ideas, possibly with datasets and experimental setups different from the ones proposed in this shared task. Such papers will be peer-reviewed on the basis of their scientific quality.

All the accepted papers will be published as part of the Globalex workshop proceedings and presented during the workshop.

Important Dates

6/12/2019 – Call for participation
12/12/2019 – Technical description of the evaluation process and data provided by organisers
01/04/2020 – Submission of results by participants / submission of regular papers (extended from 13/03/2020)
03/04/2020 – Evaluation results communicated by organisers / notification of regular papers
24/04/2020 – Submission of system description papers
12/05/2020 – Workshop day

Organizers

John P. McCrae - Data Science Institute, National University of Ireland Galway
Sina Ahmadi - Data Science Institute, National University of Ireland Galway

Review Committee

To be announced


Evaluation Criteria

Evaluation will be performed on a per-language basis. For each language we will provide four evaluation scores.

  • Accuracy: the percentage of sense pairs for which the predicted label matches the reference label, over all five classes (exact, broader, narrower, related and none).
  • Precision, Recall and F-Measure: these measure whether a link was predicted at all, ignoring the type of the link. Thus predicting "related" when the gold standard is "exact" is considered correct; it is only incorrect to predict "none" when a non-"none" label is in the gold standard, or vice versa.

In addition, we provide an average over all languages in which a system participated.
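The two scoring views described above can be sketched as follows. This is an illustrative reimplementation based on the description here, not the organisers' official scorer.

```python
# Illustrative sketch of the two evaluation views: 5-class accuracy and
# binary link detection (precision/recall/F-measure). Not the official
# scorer. gold and pred are parallel lists of label strings.
def evaluate(gold, pred):
    # 5-class accuracy: the predicted label must match exactly.
    acc = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    # Link detection: any non-"none" label counts as predicting a link,
    # so predicting "related" against gold "exact" is a true positive.
    tp = sum(g != "none" and p != "none" for g, p in zip(gold, pred))
    fp = sum(g == "none" and p != "none" for g, p in zip(gold, pred))
    fn = sum(g != "none" and p == "none" for g, p in zip(gold, pred))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}
```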

Submission

Your submission should consist of a single zip file containing the files as in the test folder of the public data. Your system should append to each line of the TSV file a single tab and the predicted relationship: exact, broader, narrower, related or none.

For example a typical submission should look like:

  • reference.zip
    • english_nuig.tsv
    • spanish.tsv
    • slovene.tsv

Terms and Conditions

All resources are provided under a CC-BY-SA license. Individual resources should be credited as follows:

  1. English 1 - Princeton WordNet is provided by Princeton under the WordNet License. Webster's 1913 dictionary is in the public domain.


Leaderboard

  #  Username       Score
  1  RaffaeleManna  0.844
  2  pvf            0.822
  3  jmccrae        0.769