Sentence-level Direct Assessment QE shared task 2021 Forum


> Leaderboard Ranking Question

Hi, I submitted two systems with very close Pearson scores, but one clearly has much better RMSE and MAE than the other. However, the platform picks which system appears on the leaderboard according to the Pearson metric only. Even though the selected system's Pearson score appears slightly higher, its poor RMSE and MAE will hurt its result under the <rank> metric evaluation. How should this be handled?
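
For illustration, here is a minimal sketch of how this situation can arise. Pearson correlation is unchanged when predictions are shifted or rescaled by a positive factor, while RMSE and MAE are not. All scores below are synthetic and invented for the example (they are not the actual submissions), and the helper names `pearson`, `rmse`, and `mae` are hypothetical, not the platform's scoring code.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def mae(x, y):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Synthetic gold scores and predictions (assumed values, for illustration only).
gold = [-1.2, -0.5, 0.0, 0.4, 0.9, 1.5]
system_a = [-1.0, -0.6, 0.1, 0.3, 1.0, 1.3]
# System B: the same predictions, rescaled and shifted.
system_b = [0.5 * p + 2.0 for p in system_a]

for name, preds in [("A", system_a), ("B", system_b)]:
    print(
        f"{name}: pearson={pearson(gold, preds):.4f} "
        f"rmse={rmse(gold, preds):.4f} mae={mae(gold, preds):.4f}"
    )
```

In this sketch both systems have the same Pearson score, yet system B has much larger RMSE and MAE, so a leaderboard that selects on Pearson alone cannot tell them apart.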

Posted by: joanne.wjy @ July 26, 2021, 5:16 p.m.