Sentence-level Direct Assessment QE shared task 2021 Forum


> Failed Submission: what is the reason?


I receive a failed submission status for Task 1 si-en, even though I formatted the output as requested on the website: line 1 is the disk footprint, line 2 is the number of parameters, and lines 3 to n+2 are the test set scores.

What might be the reason?

Ergun

Posted by: bicici @ July 27, 2021, 9:19 a.m.


The first and last 4 lines for en-ja are:
61203283968
380
en-ja RTM 0 -2.6975261206662413
en-ja RTM 1 -2.6794057820147135
...
...
...
en-ja RTM 996 -2.7422375484511736
en-ja RTM 997 -2.6648084122300677
en-ja RTM 998 -2.6715050179411737
en-ja RTM 999 -2.8975340403022347
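
A local sanity check of this layout can rule out formatting problems before uploading. The sketch below is not the organizers' scoring program. It assumes the four whitespace-separated fields seen in the sample above (language pair, method name, segment index, score), and `check_submission` and `expected_segments` are hypothetical names introduced here.

```python
import math


def check_submission(lines, expected_segments):
    """Return a list of problems found in a submission given as a list of lines.

    Assumed layout (taken from the sample in this thread, not from an official spec):
    line 1 = disk footprint, line 2 = number of parameters,
    then one line per segment: <lang-pair> <method> <index> <score>.
    """
    problems = []
    if len(lines) < 2:
        return ["file needs at least 2 header lines"]

    # Both header lines should be plain non-negative integers.
    for name, raw in (("disk footprint", lines[0]), ("parameter count", lines[1])):
        if not raw.strip().isdigit():
            problems.append(f"{name} is not a non-negative integer: {raw!r}")

    body = lines[2:]
    if len(body) != expected_segments:
        problems.append(f"expected {expected_segments} score lines, found {len(body)}")

    for i, line in enumerate(body):
        fields = line.split()
        if len(fields) != 4:
            problems.append(f"score line {i}: expected 4 fields, found {len(fields)}")
            continue
        _lang_pair, _method, index, score = fields
        # Segment indices are assumed to start at 0 and increase by 1.
        if index != str(i):
            problems.append(f"score line {i}: unexpected segment index {index!r}")
        try:
            value = float(score)
        except ValueError:
            problems.append(f"score line {i}: score is not a number: {score!r}")
            continue
        if not math.isfinite(value):
            problems.append(f"score line {i}: score is not finite: {score!r}")
    return problems
```

For a file on disk, something like `check_submission(open(path).read().splitlines(), 1000)` would apply it to a 1000-segment test set. An empty list means the file matches the assumed layout, which is consistent with the scoring log reporting no parsing error.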

Posted by: bicici @ July 27, 2021, 9:22 a.m.


The scoring output shows:
###
Loading goldlabels...
done.
Loading your predictions...
done.
Computing scores...
Computed the scores in 0.0109 seconds.
Writing in output file...
----
disk_footprint:61203283968
model_params:380
pearson => better than baseline? no
----
Done!
###

The error log:
###
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
###

The output from scoring step:
###
disk_footprint:61203283968
model_params:380
pearson:0.0610823909621
mae:2.48540940293
rmse:2.82197975289
###
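
For cross-checking these three numbers outside CodaLab, the metrics can be recomputed from the gold labels and the predictions. This is a minimal sketch using the textbook definitions of Pearson correlation, MAE, and RMSE. It is an assumption that the official scoring program computes them the same way, and the function names are illustrative.

```python
import math


def pearson(gold, pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(gold)
    mean_g = sum(gold) / n
    mean_p = sum(pred) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, pred))
    var_g = sum((g - mean_g) ** 2 for g in gold)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    # Undefined (division by zero) if either sequence is constant.
    return cov / math.sqrt(var_g * var_p)


def mae(gold, pred):
    """Mean absolute error."""
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


def rmse(gold, pred):
    """Root mean squared error."""
    return math.sqrt(sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold))
```

Pearson correlation is unaffected by shifting or rescaling the predictions, whereas MAE and RMSE depend on the predictions being on the same scale as the gold labels.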

Can you identify why the submission is failing?

Posted by: bicici @ July 27, 2021, 9:28 a.m.

Dear bicici,
I do not see anything obvious that would explain why your submissions fail.
The format of your submissions looks fine and the scoring program exits normally.
I will keep investigating. Meanwhile, keep submitting before the deadline to the
tasks you want to participate in, and if the issue turns out to be on Codalab's
side, we will include you as part of the official results.

Best,
Fred.

Posted by: fblain @ July 27, 2021, 10:20 a.m.

Dear Fred,

Can you run the evaluation script on the command line and let me know the warnings and errors?

Thank you.
Ergun

Posted by: bicici @ July 27, 2021, 2:31 p.m.

Dear Ergun,
this is what I meant by "the scoring program exits normally". According to the scoring log on Codalab,
and by running the scoring program manually after I downloaded your submissions, there is no error
parsing your predictions.

Best,
Fred.

Posted by: fblain @ July 27, 2021, 3:19 p.m.

Additionally, I only see submissions from the user Anonymous in the results tab. Maybe this is also an error of CodaLab. Normally, we see participants' user names in the corresponding column.

Posted by: bicici @ July 28, 2021, 10:43 a.m.

Dear Ergun,
that is because all the leaderboards have been made anonymous throughout the evaluation campaign.
Now that the deadline has passed, the names of participants will soon be revealed.

Best,
Fred.

Posted by: fblain @ July 28, 2021, 11:59 a.m.


Is the "output from scoring step" of my failed submissions reliable enough to be used in comparisons and ranking?

Posted by: bicici @ July 28, 2021, 1:26 p.m.

Maybe we should ask CodaLab technical support about my failed submissions.

Posted by: bicici @ July 28, 2021, 2:47 p.m.


Some runs that I submitted yesterday at 07/28/2021 17:37:08 for en-zh and 07/28/2021 17:34:06 for en-de post-editing are still shown as Submitting.

One run that I submitted once for en-ja is listed as failed twice with the same date and time: 1) 07/27/2021 09:13:46 Failed 2) 07/27/2021 09:13:46

Posted by: bicici @ July 29, 2021, 6:29 a.m.

The anonymity of the results has been removed, and the submission form now asks for more fields:
*Team name (20 characters max):
*Method name (20 characters max):
*Method description:
Project URL:
Publication URL:
Bibtex:
Organization/affiliation:

But my submissions still fail or get stuck in the Submitting state for some reason.

Posted by: bicici @ July 29, 2021, 9:34 a.m.


Dear Fred,

Can you email me a working submission file so that I can compare it with mine to identify the error? I can also submit the same file to see whether the error is related to my account. Thank you.

Ergun

Posted by: bicici @ July 29, 2021, 10:41 a.m.