Hi,
May I know when the full training data will be available and roughly how large it is? Thanks in advance.
Best regards,
Winst
Hi!
The training data will be available by the end of this week and consists of 12,000 sentences (counting the fixed sentence only once per target lemma). The dev data, available by the end of next week, is made up of 1500 sentences. All our data is manually annotated. Training data will be available only in English.
Best regards,
The MCL-WiC team
Hi,
Thanks!
Best regards,
Winst
Hello,
What about the data for the remaining multilingual and cross-lingual tasks? Will they be available soon?
Posted by: amroa @ Oct. 22, 2020, 12:28 p.m.
Also, can we get any estimates on the sizes of these datasets? Specifically, the Arabic multilingual and English-Arabic cross-lingual tasks?
Thanks!
Amro
Hello,
We will release the training data in a couple of days and the dev data by the end of next week. The dev data in all languages will include 1500 sentences (counting the fixed sentence only once).
As far as the cross-lingual part is concerned, we will release only the test data, which is due in December.
All our data sets are 100% manually annotated.
Best regards,
The MCL-WiC team
Hi,
Is there any update on the availability of the training data?
Thanking you,
Rohan
Hello,
The training data is available!
Many thanks,
The MCL-WiC team
Am I seeing correctly that only English data is available in the training data so far? I downloaded the zip file from GitHub. Am I missing something?
Posted by: dstrohmaier @ Oct. 29, 2020, 8:06 a.m.
Hello!
Yes, training data will be available only in English!
Best regards,
The MCL-WiC team