Hello, are there five final multilingual test datasets? How many datasets are there in each language? How many cross-lingual test datasets are there, and what is the data size? How many evaluation lists will be published?
Thanks very much
Posted by: Sattiy @ Jan. 4, 2021, 2:56 a.m.
Hello,
As far as the test data is concerned, we provided/are providing 5 multilingual datasets (test.ar-ar.data, test.en-en.data, test.fr-fr.data, test.ru-ru.data and test.zh-zh.data) and 4 cross-lingual datasets (test.en-ar.data, test.en-fr.data, test.en-ru.data and test.en-zh.data). Gold files will not be released. Each file contains 500 unique lemmas and 2000 sentences. What exactly do you mean by evaluation lists?
Please check out our GitHub page to access the data!
Best regards,
The MCL-WiC team
I mean, do the nine test datasets mean there will be nine results (ranking lists) in the evaluation phase?
Thanks very much
Hi, what is the submission format?
Thanks
Hello,
Yes, there will be 9 ranking lists (one for each gold dataset provided) and you can also submit only one or two datasets.
Please follow these steps for the submission:
1. download the test data (.data) from our GitHub page https://github.com/SapienzaNLP/mcl-wic,
2. generate your answers,
3. name each file "test.{language}-{language}" (for example "test.ru-ru" if you wish to participate in the Russian multilingual sub-task),
4. create a submission.zip file containing all the datasets you would like to submit (for example, the submission.zip file could contain the files "test.ru-ru" and "test.en-ru", indicating that you will participate in the Russian multilingual sub-task and the English-Russian cross-lingual sub-task), and
5. submit!
Best regards,
The MCL-WiC team
PS. Your "test.{language}-{language}" files (to be zipped together) must be in the same format as our .gold files. Please see our CodaLab page for a detailed description and download the dev .gold files from our GitHub page: https://github.com/SapienzaNLP/mcl-wic.
PPS. Possible language combinations for the multilingual sub-task: ar-ar, en-en, fr-fr, ru-ru, zh-zh. Possible combinations for the cross-lingual sub-task: en-ar, en-fr, en-ru, en-zh.
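The packaging steps above could be scripted roughly as follows. This is only a sketch, not an official tool: the function name build_submission, the shape of the predictions dictionary, and the choice to place the files at the top level of the archive are assumptions, and the entry layout follows the id/tag examples posted in this thread.

```python
import json
import zipfile

# Language pairs listed in the thread.
MULTILINGUAL = ["ar-ar", "en-en", "fr-fr", "ru-ru", "zh-zh"]
CROSS_LINGUAL = ["en-ar", "en-fr", "en-ru", "en-zh"]
VALID_PAIRS = set(MULTILINGUAL + CROSS_LINGUAL)


def build_submission(predictions, zip_path="submission.zip"):
    """Write one "test.{pair}" file per language pair into a zip archive.

    `predictions` maps a pair such as "ru-ru" to a list of
    {"id": ..., "tag": "T" or "F"} dictionaries (hypothetical input shape).
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for pair, answers in predictions.items():
            if pair not in VALID_PAIRS:
                raise ValueError(f"unknown language pair: {pair}")
            # Assumption: files sit at the top level of the archive,
            # named without an extension as described in step 3.
            archive.writestr(f"test.{pair}", json.dumps(answers, indent=4))
    return zip_path
```

Calling build_submission({"ru-ru": [...], "en-ru": [...]}) would then produce a submission.zip containing "test.ru-ru" and "test.en-ru", matching the example in step 4. Please compare the result with the dev .gold files before uploading.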
Thanks for your reply!
Posted by: Sattiy @ Jan. 11, 2021, 1:14 a.m.
Dear participants,
Please pay attention to the format of the submission files before uploading. IMPORTANT: tags must be either T or F (Y/N will not be processed by the script).
Example 1:
[
    {
        "id": "test.en-en.0",
        "tag": "T"
    }
]
Example 2:
[
    {
        "id": "test.en-en.1",
        "tag": "F"
    }
]
Best regards,
The MCL-WiC team
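A quick local check along these lines might catch Y/N tags before uploading. It is a minimal sketch under the assumption that each answer file is a JSON list of objects with "id" and "tag" keys, as in the two examples above; check_answers is a hypothetical helper, not part of the official scoring script.

```python
import json


def check_answers(text):
    """Return a list of problems found in the JSON text of one answer file."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        return ["top-level JSON value must be a list"]
    problems = []
    for position, entry in enumerate(entries):
        if not isinstance(entry, dict) or "id" not in entry or "tag" not in entry:
            problems.append(f"entry {position}: missing 'id' or 'tag'")
        elif entry["tag"] not in ("T", "F"):
            # Y/N and any other value would not be processed by the script.
            problems.append(
                f"entry {position} ({entry['id']}): tag {entry['tag']!r} is not T or F"
            )
    return problems
```

An empty list means no problems were found by this check; it does not guarantee that the official script will accept the file.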
Hello,
Please clarify this - 'and you can also submit only one or two datasets.'
Thanks.
Posted by: amansinha_ @ Jan. 12, 2021, 3:59 p.m.
Does a submission status of "Finished" mean that the submission was successful?
But I see the line chart shows "-1".
Thanks!
Yes, if you received no errors and see "Finished", the submission was successful!
All the best,
The MCL-WiC team
Hi again,
to answer this question: Please clarify this - 'and you can also submit only one or two datasets': Your submission.zip file can also contain only one file (for example test.ru-ru). In this case you will receive only one score (multilingual sub-task, language combination: Russian-Russian); for all other datasets you will receive -1, meaning that the dataset was not uploaded.
All the best,
The MCL-WiC team
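To see in advance which of the nine datasets would show -1 for a given upload, something like the following could be used. It is an illustrative sketch only: expected_minus_one is a hypothetical helper, and it simply lists the datasets named in this thread that are absent from the submitted file names.

```python
# The nine datasets named in the thread (5 multilingual, 4 cross-lingual).
ALL_DATASETS = [
    "test.ar-ar", "test.en-en", "test.fr-fr", "test.ru-ru", "test.zh-zh",
    "test.en-ar", "test.en-fr", "test.en-ru", "test.en-zh",
]


def expected_minus_one(submitted_files):
    """Return the datasets that would show -1 because no file was uploaded."""
    submitted = set(submitted_files)
    return [name for name in ALL_DATASETS if name not in submitted]
```

For a submission.zip containing only "test.ru-ru", the function returns the other eight dataset names.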
Is the score based on the last submission?
Thanks!
Hello,
You will receive a score for each submission.
Cheers,
The MCL-WiC team
I mean, which submission does the result in the final evaluation list depend on?
Thanks!
Hi,
This is still to be clarified internally.
Best regards,
The MCL-WiC team
Hi,
Sorry, I didn't understand: what should we do if we see a -1 score in the chart? I am sure that I am submitting my results in the right way.
Thank you,
Niloofar_R
Hi,
That's because you did not submit all datasets; -1 indicates that the corresponding dataset was not uploaded.
Cheers,
The MCL-WiC team