SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC) Forum

> Training data

Hi,

May I know when the full training data will be available and roughly how large it will be? Thanks in advance.

Best regards,
Winst

Posted by: Xingran_Zhu @ Oct. 19, 2020, 6:48 p.m.

Hi!

The training data will be available by the end of this week. It consists of 12,000 sentences (counting the fixed sentence only once per target lemma). The dev data, which will be available by the end of next week, consists of 1,500 sentences. All our data is manually annotated. The training data will be available only in English.

Best regards,
The MCL-WiC team

Posted by: Federico_Martelli @ Oct. 19, 2020, 9:12 p.m.

Hi,

Thanks!

Best regards,
Winst

Posted by: Xingran_Zhu @ Oct. 19, 2020, 9:27 p.m.

Hello,

What about the data for the remaining multilingual and cross-lingual tasks? Will it be available soon?

Posted by: amroa @ Oct. 22, 2020, 12:28 p.m.

Also, can we get any estimates on the sizes of these datasets? Specifically, the Arabic multilingual and English-Arabic cross-lingual tasks?

Thanks!
Amro

Posted by: amroa @ Oct. 22, 2020, 12:30 p.m.

Hello,

We will release the training data in a couple of days and the dev data by the end of next week. The dev data in all languages will include 1,500 sentences (counting the fixed sentence only once).
As far as the cross-lingual part is concerned, we will release only the test data, which is due in December.

All our data sets are 100% manually annotated.

Best regards,
The MCL-WiC team

Posted by: Federico_Martelli @ Oct. 22, 2020, 12:54 p.m.

Hi,
Is there any update on the availability of the training data?
Thanking you,
Rohan

Posted by: rohangpt @ Oct. 27, 2020, 9:46 a.m.

Hello,

The training data is available!

Many thanks,
The MCL-WiC team

Posted by: Federico_Martelli @ Oct. 27, 2020, 2:47 p.m.

Am I right that the training data released so far contains only English? I downloaded the zip file from GitHub. Am I missing something?

Posted by: dstrohmaier @ Oct. 29, 2020, 8:06 a.m.

Hello!

Yes, training data will be available only in English!

Best regards,
The MCL-WiC team

Posted by: Federico_Martelli @ Oct. 29, 2020, 8:08 a.m.