BioCreative'21, Task 3 - Automatic extraction of medication names in tweets Forum

> Problems with online submission (on validation set)

Hi Davy,
I just wanted to figure out how to do the submission. If I understand correctly, we can run tests on the validation set and check our results both locally with your script and online on CodaLab. Under "Task 3 - Practice" I can upload my file. Just to be clear, should it be in exactly the same format as the validation file (TSV, same number of tabs, etc.)? If I upload the original validation script, I get an error message ("invalid file type"). However, if I zip the file and upload it, it seems to work. But then I receive another error, which looks as follows:

WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
2021-08-06 09:32:34,963 root INFO Pred file:/tmp/codalab/tmpRreevM/run/input/res/BioCreative_ValTask3.tsv, Gold file:/tmp/codalab/tmpRreevM/run/input/ref/BioCreative_ValTask3.tsv
2021-08-06 09:32:34,963 root INFO Output file:/tmp/codalab/tmpRreevM/run/output/scores.txt
2021-08-06 09:32:34,963 root INFO Start scoring
Traceback (most recent call last):
  File "/tmp/codalab/tmpRreevM/run/program/evaluationTask3.py", line 205, in <module>
    evaluate()
  File "/tmp/codalab/tmpRreevM/run/program/evaluationTask3.py", line 199, in evaluate
    score_task(pred_file, gold_file, out_file)
  File "/tmp/codalab/tmpRreevM/run/program/evaluationTask3.py", line 136, in score_task
    assert gtw.text==ptw.text, "The text of the tweet {}:[{}] in the gold standard is different from the text of the same tweet {}:[{}] in the predictions...".format(gtwID, gtw.text, ptw.twid, ptw.text)
AssertionError: The text of the tweet 862069984670994432:[@xtheyLOVEashxo @tachaa_ I joined a Facebook mom group I hate them lmfao this one lady got mad cause I kept saying one baby on my FB] in the gold standard is different from the text of the same tweet 862069984670994432:[@xtheyLOVEashxo @tachaa_ I joined a Facebook mom group I hate them lmfao this one lady got mad cause I kept saying one baby on my FB] in the predictions...

Any ideas or guidance on how to do the submission?
Best,
Roland

Posted by: rroller @ Aug. 6, 2021, 9:40 a.m.

Dear Roland,

That is correct: you can test your score locally with the evaluation script, and online during the practice period. You do not need to upload the evaluation script to CodaLab; I did that when I created the competition. You just need to upload a zip file containing your predictions, and only that file: any other file in the zip alongside your predictions will cause an error.
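For anyone building the archive from a script, here is a minimal sketch of a zip that contains only the prediction file, stored at the root of the archive. The function name and file names are made up for illustration; the thread does not say which file name the server expects, so keep the name of the file that was distributed to you.

```python
import zipfile
from pathlib import Path


def zip_single_file(pred_path, zip_path):
    """Create a zip archive holding only `pred_path`, at the archive root.

    Returns the list of entry names so the content can be checked
    before uploading.
    """
    pred_path = Path(pred_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname drops any directory prefix, so the archive has no folders.
        zf.write(pred_path, arcname=pred_path.name)
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()
```

Calling `zip_single_file("predictions.tsv", "submission.zip")` (hypothetical names) should return a list with a single entry, which is an easy way to confirm that no other file ended up in the archive.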
From the error message returned after your submission, I can see that there is an extra character at the end of your tweet. The character is not visible, but it is there; it is probably an end-of-line character. Each tweet must be identical to the tweet in the original file: since we are predicting spans, I check that your predictions were made on the same tweets as those in the gold standard, to avoid any issues. Please remove the extra character at the end of your tweet(s) and try to evaluate your predictions locally using the evaluation script. Once you get a score locally, upload your predictions to CodaLab. If you still have an issue, let me know and I will check your submission.
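If deleting the character by hand is awkward, it can also be stripped programmatically. The sketch below is only one possible approach and rests on assumptions not stated in the thread: that the file is tab-separated text, and that the stray character is a control, format, or line/paragraph separator character (Unicode categories Cc, Cf, Zl, Zp). Ordinary trailing spaces are deliberately left alone, since removing them could make a tweet differ from the gold standard. The function names are hypothetical.

```python
import unicodedata

# Unicode categories treated as "invisible" here: control, format,
# line separator, paragraph separator. Plain spaces (Zs) are kept.
INVISIBLE_CATEGORIES = {"Cc", "Cf", "Zl", "Zp"}


def strip_trailing_invisible(text):
    """Remove invisible characters from the end of `text` only."""
    end = len(text)
    while end > 0 and unicodedata.category(text[end - 1]) in INVISIBLE_CATEGORIES:
        end -= 1
    return text[:end]


def clean_tsv_line(line):
    """Clean every tab-separated field of one line.

    Only the final "\n" is treated as the line terminator, so a stray
    "\r" at the end of the last field is removed as well. The returned
    line has no terminator.
    """
    body = line[:-1] if line.endswith("\n") else line
    return "\t".join(strip_trailing_invisible(f) for f in body.split("\t"))
```

When applying this to a file, open it with `newline=""` so that Python does not translate line endings while reading, and run the evaluation script locally on the result before uploading.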

Best regards,
Davy

Posted by: dweissen @ Aug. 6, 2021, 1:05 p.m.

Okay, I will check that. Thank you!

Posted by: rroller @ Aug. 6, 2021, 1:33 p.m.

Good morning!
Sorry to bother you again about this. I can run the evaluation script locally. However, I am not sure why the online submission does not work. I also took the "original" validation file you sent us, zipped it, and uploaded it, and I still get the same error. I would have assumed that this should work, as I did not make any modifications to it.
Best,
Roland

Posted by: rroller @ Aug. 20, 2021, 7:16 a.m.

Dear Roland,

No problem, I am here to answer any questions. My apologies: the character at the end of tweet 862069984670994432 is in the original file that I distributed. I now remember removing it manually before I uploaded the file to CodaLab, as it causes an issue when running the evaluation script, and I forgot to delete it from the file I distributed. If you just delete the character manually from the file and resubmit it to the server, it works. The character is not visible, but when you move the cursor you can see that, at the end of the tweet, you need to press the arrow key twice to move the cursor to the next position; this is where the character is. Please just delete it and save the file; this should fix the issue. Thanks for pointing the problem out. I will post a new message in the forum to let the other participants know. Let me know if you have any other issues when submitting your run.
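As an alternative to hunting for the character with the cursor, a short script can report where such characters sit. This is a rough sketch under the same kind of assumptions as any generic check: the thread does not identify the character, so the code simply flags any tab-separated field whose last character is a control, format, or line/paragraph separator character. The function name is hypothetical.

```python
import unicodedata


def find_trailing_invisible(lines):
    """Return (line_number, column_index, code_point) for each field
    that ends with an invisible character.

    Line numbers start at 1, column indices at 0. Only the final "\n"
    is treated as the line terminator, so in a file with Windows line
    endings the last column of every line is reported with U+000D.
    """
    hits = []
    for line_no, line in enumerate(lines, start=1):
        body = line[:-1] if line.endswith("\n") else line
        for col, field in enumerate(body.split("\t")):
            if field and unicodedata.category(field[-1]) in {"Cc", "Cf", "Zl", "Zp"}:
                hits.append((line_no, col, f"U+{ord(field[-1]):04X}"))
    return hits
```

The function accepts any iterable of lines, for example a file object opened with `encoding="utf-8"` and `newline=""`. An empty result means no field ends with a character from those categories.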

Best regards,
Davy

Posted by: dweissen @ Aug. 20, 2021, 2:20 p.m.

Fantastic, it works now! :-)

Posted by: rroller @ Aug. 20, 2021, 2:42 p.m.

I just did the same test, and it worked for me, too. Thanks!

Posted by: spiccolo @ Aug. 24, 2021, 6:04 p.m.