SemEval-2020 Task5: Modelling Causal Reasoning in Language: Detecting Counterfactuals Forum


> Question about the practice submission in Subtask 2

Hello! I hit an error when making a practice submission for Subtask 2:
####
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
Traceback (most recent call last):
  File "/tmp/codalab/tmpT2YLS6/run/program/evaluation.py", line 306, in <module>
    f1_scores, recall, precision, em = evaluate2(task2_submission_path, task2_solution)
  File "/tmp/codalab/tmpT2YLS6/run/program/evaluation.py", line 275, in evaluate2
    f1_mean, recall_mean, precision_mean = metrics_task2(submission_list, truth_list)
  File "/tmp/codalab/tmpT2YLS6/run/program/evaluation.py", line 226, in metrics_task2
    f1_score = 2 * precision * recall / (precision + recall)
ZeroDivisionError: integer division or modulo by zero
####
The traceback shows a ZeroDivisionError. However, when I test my model offline, I don't hit this problem.
In the baseline script, the metrics_task2 function reads:
# calculate precision, recall, f1-score
if inter_len > 0:
    precision = inter_len / submission_len
    recall = inter_len / truth_len
    f1_score = 2 * precision * recall / (precision + recall)
So this error should not happen, because the division is only performed when inter_len > 0.
Could you please look into it? Thank you!
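(One guess on my side: "integer division or modulo by zero" is Python 2's wording for this error. If the server runs Python 2 and inter_len, submission_len, and truth_len are all ints, then precision and recall both floor to 0 whenever the intersection is smaller than the spans, so precision + recall is 0 even though inter_len > 0. A minimal sketch with made-up counts — this is speculation about the environment, not the actual evaluation.py:)

```python
# Hypothetical counts: the intersection is non-empty but smaller than
# both the predicted span and the gold span.
inter_len, submission_len, truth_len = 3, 10, 8

# Under Python 2, `/` on ints is floor division; both results are 0:
precision = inter_len // submission_len   # 0
recall = inter_len // truth_len           # 0
try:
    f1_score = 2 * precision * recall / (precision + recall)
except ZeroDivisionError:
    print("ZeroDivisionError even though inter_len > 0")

# With true (float) division, as Python 3's `/` gives, it works:
precision = inter_len / submission_len    # 0.3
recall = inter_len / truth_len            # 0.375
f1_score = 2 * precision * recall / (precision + recall)
print(round(f1_score, 4))                 # 0.3333
```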

Posted by: will_go @ Nov. 27, 2019, 8:32 a.m.

Hello, thank you so much for your feedback!

Actually, I just tested and reproduced the same error, while everything works fine offline.
I'm narrowing down the root cause; it may take a while (I suspect it's related to the Codalab system, since the 'if inter_len > 0:' guard doesn't seem to take effect there). I'll get back to you once it's solved.

Posted by: Ariel_yang @ Nov. 27, 2019, 5:31 p.m.

Hello, I've fixed that problem. Could you please submit again and check the results?

Posted by: Ariel_yang @ Nov. 28, 2019, 6:47 p.m.

Hello, I just submitted some sample CSV files to Codalab and found something strange.
1. xxx.csv scored 0.848 offline but 0.729 online.
2. yyy.csv scored 0.562 offline but 0.223 online. (And yyy.csv scored 0.05 yesterday...)
These two files get different scores offline and online; maybe I made some mistakes offline...
But there are two things I can guarantee I did not get wrong:
1. I submitted the training labels themselves (antecedent_startid, antecedent_endid, consequent_startid, consequent_endid), which should score 1. However, it scores 0.8536 online (though it was 1 on 11/25/2019...), while the "Exact_match" score is 1. These results are contradictory.
2. yyy.csv has at least one row that exactly matches the label, but online its "Exact_match" score is 0.0000. It should be at least 1/3551 = 0.0003, not 0.0000.

Please help me check this. Thank you very much!
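(To make point 2 concrete, here is how I understand the exact-match rate: a row counts only if all four span ids agree, and the score is that count over the total number of rows. The rows and values below are made up for illustration; this is my reading, not the actual evaluation code:)

```python
# Hypothetical rows: each tuple is (antecedent_startid, antecedent_endid,
# consequent_startid, consequent_endid), following the submission header.
submission = [(0, 10, 12, 30), (5, 9, 11, 20), (2, 6, 8, 14)]
gold       = [(0, 10, 12, 30), (5, 8, 11, 20), (1, 6, 8, 14)]

# A row is an exact match only if all four ids are identical.
exact = sum(1 for pred, true in zip(submission, gold) if pred == true)
exact_match = exact / len(gold)
print(exact_match)  # one matching row out of three
```

So with one exactly matching row out of 3551, the score should be a small positive number, never 0.0000.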

Posted by: will_go @ Nov. 29, 2019, 12:10 p.m.

Hello, to follow up on what I said:
###
1. xxx.csv scored 0.848 offline but 0.729 online.
2. yyy.csv scored 0.562 offline but 0.223 online. (And yyy.csv scored 0.05 yesterday...)
These two files get different scores offline and online; maybe I made some mistakes offline...
###
I checked again just now, and I believe there is no problem with my offline verification. So is there a bug in Codalab's code, or is the mistake on my end?
Thank you.

Posted by: will_go @ Nov. 29, 2019, 1:23 p.m.

I tested the evaluation part and it should be fine now. Thank you so much for your feedback; my bad for introducing a new error while trying to fix the old problem yesterday. Please test it and feel free to contact us anytime.

Posted by: Ariel_yang @ Nov. 29, 2019, 4:57 p.m.

Yes, the online score is fine now and matches the offline score.
Thank you very much for your work!

Posted by: will_go @ Nov. 30, 2019, 10:29 a.m.

You're welcome!

Posted by: Ariel_yang @ Dec. 1, 2019, 6:08 p.m.