CodaLab - Competition

SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC)

Organized by Federico_Martelli

Multilingual and Cross-lingual Word-in-Context Disambiguation

Introduction

Over recent years, computational lexical semantics has seen a surge of interest in a wide range of approaches, from multi-prototype embeddings to sense-based and contextualized embeddings, all aimed at providing some form of representation and understanding of a word in context. However, evaluating such a variety of approaches in a single framework is not easy. For instance, traditional Word Sense Disambiguation (WSD) cannot test latent representations unless these are linked to explicit sense inventories such as WordNet and BabelNet. To address this limitation, we propose an innovative common evaluation benchmark which makes it possible to measure and compare the performance of the aforementioned context-based approaches. In this task, we follow and extend Pilehvar and Camacho-Collados (2019), who proposed a benchmark consisting of semi-automatically annotated English sentence pairs. Their benchmark requires systems to determine whether a word occurring in two different sentences is used with the same meaning or not, without relying on a pre-defined sense inventory.

Task overview

Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC) is the first SemEval task for Word-in-Context disambiguation which tackles the challenge of capturing the polysemous nature of words without relying on a fixed sense inventory in a multilingual and cross-lingual setting. MCL-WiC provides a single high-quality framework for evaluating how well a wide range of approaches capture word meaning in context. Compared to other datasets, MCL-WiC brings the following novelties:

it addresses multilinguality and cross-linguality,
it provides coverage of all parts of speech, and
it covers a high number of domains and genres.

Participating systems will be asked to perform a binary classification task in which they indicate whether the target word is used with the same meaning (tagged as T for true) or with a different meaning (F for false), either in the same language (multilingual dataset) or across different languages (cross-lingual dataset). Below you can find two examples of sentence pairs, the first one from the multilingual part and the second one from the cross-lingual part:

la souris mange le fromage -- le chat court après la souris
click the right mouse button -- le chat court après la souris

In the first sentence pair, the target word souris will be tagged with T (True), since it is used with the same meaning in both sentences. In the second sentence pair, by contrast, the target word mouse and its French translation souris are used with two distinct meanings, so the expected output is F (False).
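The two example pairs can be written down as data to make the shape of the task concrete. The following is a minimal sketch only: the class name `WicInstance`, its field names, and the language codes "fr" and "en" are illustrative assumptions, not the schema of the official data files, which should be taken from the GitHub repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WicInstance:
    """One sentence pair. Field names are hypothetical, not the official schema."""
    lang1: str      # language of the first sentence (assumed code, e.g. "fr")
    lang2: str      # language of the second sentence
    target1: str    # target word as it appears in the first sentence
    target2: str    # target word (or its translation) in the second sentence
    sentence1: str
    sentence2: str

    @property
    def is_cross_lingual(self) -> bool:
        # Multilingual pairs share a language; cross-lingual pairs do not.
        return self.lang1 != self.lang2


# The two example pairs from the task description.
multilingual = WicInstance(
    "fr", "fr", "souris", "souris",
    "la souris mange le fromage", "le chat court après la souris",
)
cross_lingual = WicInstance(
    "en", "fr", "mouse", "souris",
    "click the right mouse button", "le chat court après la souris",
)

# Expected tags as stated in the description: T = same meaning, F = different.
gold = {multilingual: "T", cross_lingual: "F"}
```

A system then only has to map each such instance to "T" or "F"; no sense identifier is ever required.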

Languages

The following languages will be considered:

Arabic
Chinese
English
French
Russian

Annotation

The manual annotation was performed according to the following criteria. Given a target word w occurring in two sentences in the same language (multilingual task) or a target word w in the first sentence in one language and the corresponding target word w' in the second sentence in a second language, we used the tag:

T if the two words are used with exactly the same meaning.
F if the two words are used with two different meanings (such as race in the meaning of competition vs. that of breed).

Important dates

Trial data: July 31, 2020
Training data ready: October 26, 2020
Test data ready: December 3, 2020
Evaluation starts: January 10, 2021
Evaluation ends: January 31, 2021
Paper submission due: February 23, 2021
Notification to authors: March 29, 2021
Camera ready due: April 5, 2021
SemEval workshop: Summer 2021

Key links

GitHub data repository: https://github.com/SapienzaNLP/mcl-wic
Discussion forum: https://competitions.codalab.org/forums/23750/
Link to the paper: https://raw.githubusercontent.com/SapienzaNLP/mcl-wic/master/SemEval_2021_Task_2__Multilingual_and_Cross_lingual_Word_in_Context_Disambiguation__MCL_WiC___Paper_.pdf

Acknowledgments

The organizers gratefully acknowledge the support of the ELEXIS EU project No. 731015 and the MOUSSE ERC Consolidator Grant No. 726487 under the European Union’s Horizon 2020 research and innovation programme.

References

Pilehvar, Mohammad Taher, and Jose Camacho-Collados. WiC: the word-in-context dataset for evaluating context-sensitive meaning representations. Proceedings of NAACL-HLT 2019, pages 1267-1273.

Evaluation Criteria

Systems will be asked to perform a binary classification on each sentence pair in the dataset, for which they will have to output T or F depending on whether a given target word occurring in two sentences is used with the same meaning or with a different meaning respectively. The goal is to determine to what degree systems can discriminate meanings within and across languages without necessarily relying on an explicit sense inventory.

Systems will be scored by accuracy. A thorough analysis will also be carried out by language pair (cross-lingual dataset), by the type of approach declared by participants (context-specific embeddings, WSD, etc.), by the type and amount of training data used by the system, and by the domain and genre of the sentences (e.g. formal/parliamentary vs. encyclopedic). Furthermore, we will distinguish between systems which exploit the training set provided for the given language(s) and those which do not, e.g. systems based on vector similarities or traditional WSD systems which output T/F based on sense assignment.
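As an illustration of the scoring and of a training-free system of the "vector similarities" kind, the sketch below thresholds a cosine similarity to produce T/F and then computes accuracy overall and per language pair. It is not the official scorer: the function names, the threshold value 0.5, and the `(language_pair, gold, predicted)` row layout are assumptions made for this example, and the vectors are expected to come from whatever embedding model a participant chooses.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_tag(vec1, vec2, threshold=0.5):
    """Training-free decision: 'T' if the target-word vectors are similar enough.

    The threshold is an arbitrary placeholder; it would normally be tuned.
    """
    return "T" if cosine(vec1, vec2) >= threshold else "F"


def accuracy(gold, predicted):
    """Fraction of sentence pairs whose predicted tag equals the gold tag."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def accuracy_by_pair(rows):
    """Accuracy per language pair; rows are (language_pair, gold, predicted)."""
    counts = defaultdict(lambda: [0, 0])  # pair -> [correct, total]
    for pair, g, p in rows:
        counts[pair][0] += int(g == p)
        counts[pair][1] += 1
    return {pair: correct / total for pair, (correct, total) in counts.items()}
```

For example, `accuracy(["T", "F", "T", "F"], ["T", "T", "T", "F"])` gives 0.75, since three of the four tags match.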

Submission instructions

Please follow these steps for the submission:

1. download the test data (.data) from our GitHub page https://github.com/SapienzaNLP/mcl-wic,
2. generate your answers,
3. name each file "test.{language}-{language}" (for example "test.ru-ru" if you wish to participate in the Russian multilingual sub-task),
4. create a submission.zip file containing all the answer files you would like to submit (for example, the submission.zip file could contain the files "test.ru-ru" and "test.en-ru", indicating that you will participate in the Russian multilingual sub-task and the English-Russian cross-lingual sub-task), and
5. submit!
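Steps 3 and 4 can be scripted. The sketch below checks the file names and packs them into submission.zip using Python's standard zipfile module. The helper name `build_submission` is hypothetical, and two points are assumptions rather than stated requirements: that language codes are always two lowercase letters (as in "test.ru-ru" and "test.en-ru"), and that the files sit at the root of the archive. The content format of each answer file is not covered here and should follow the GitHub repository.

```python
import re
import zipfile
from pathlib import Path

# Assumed naming rule, generalized from the examples "test.ru-ru" and "test.en-ru".
NAME_PATTERN = re.compile(r"test\.[a-z]{2}-[a-z]{2}")


def build_submission(answer_files, archive_path="submission.zip"):
    """Pack already-generated answer files into a zip archive.

    Raises ValueError if a file name does not look like test.{language}-{language}.
    """
    files = [Path(f) for f in answer_files]
    for f in files:
        if not NAME_PATTERN.fullmatch(f.name):
            raise ValueError(f"unexpected answer file name: {f.name}")

    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store each file at the archive root (an assumption about the layout).
            zf.write(f, arcname=f.name)
    return archive_path
```

Calling `build_submission(["test.ru-ru", "test.en-ru"])` would then correspond to participating in the Russian multilingual and the English-Russian cross-lingual sub-tasks.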

Baselines

We will compare the performance of participating systems against a baseline neural classifier. Our baseline system will be fed different types of embeddings:

sense embeddings, such as LMMS (Loureiro and Jorge, 2019) and SensEmBERT (Scarlini et al., 2020), which combine contextualized embeddings with the knowledge derived from resources such as WordNet and BabelNet;
context-specific word embeddings, such as Context2vec (Melamud et al., 2016), BERT (Devlin et al., 2019), etc.

Interestingly, this will provide an effective multilingual and cross-lingual benchmark for all types of embeddings and NLU systems.

References

Loureiro, Daniel and Jorge, Alípio. Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pages 5682-5691.
Scarlini, Bianca; Pasini, Tommaso and Navigli, Roberto. SensEmBERT: Context-Enhanced Sense Embeddings for Multilingual Word Sense Disambiguation. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, pages 8758-8765.
Melamud, Oren; Goldberger, Jacob and Dagan, Ido. context2vec: Learning Generic Context Embedding with Bidirectional LSTM. Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, 2016, pages 51-61.
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton and Toutanova, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of NAACL-HLT 2019, pages 4171-4186.

Terms and Conditions

The data of the Multilingual and Cross-lingual Word-in-Context Disambiguation task are released under the CC-BY-NC 4.0 license. Attribution shall be provided by citing:

F. Martelli, N. Kalach, G. Tola, R. Navigli. SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC). Proc. of the 15th Workshop on Semantic Evaluation, 2021.

Training

Start: Oct. 1, 2020, midnight

Description: Please go to the GitHub repository to download the training and dev data and work on your system(s)!

Evaluation

Start: Jan. 10, 2021, midnight

Description: During the evaluation phase, you can submit your runs, which will be evaluated against the test data.

Post Evaluation

Start: Feb. 1, 2021, midnight

Description: Post-evaluation analysis and discussion phase

Competition Ends

Never
