COVID-19 Retweet Prediction Challenge Forum

> To clarify the rules regarding additional data

I would like to clarify the rules regarding additional data.

> Participants are free to use any additional datasets that have been made publicly available *before* the beginning of the Competition.

If the only condition is that the dataset was published before 2020/07/01, its tweets may overlap with the tweets in the test set, which creates a risk of leakage.
Do any rules of this competition guard against that?

Shouldn't participants be required to disclose any additional data in advance, as on Kaggle?
Example: https://www.kaggle.com/c/prostate-cancer-grade-assessment/discussion/145026

Posted by: myaunraitau @ July 15, 2020, 9:22 a.m.

"Participants are free to use any additional datasets that have been made publicly available *before* the beginning of the Competition" - what does "before the beginning of the competition" mean?
Does it mean we can only use external datasets published up to Sep 30, 2019, which is the first day in the training dataset? Admin, could you confirm this?

Posted by: vinayaka @ July 16, 2020, 1:21 a.m.

Hi,

Thanks for your questions! We have updated the Terms and Conditions page of the Challenge.

Best, the organizers.

Posted by: trovdimi @ July 20, 2020, 7:56 a.m.