Wanted to ask if there was a reason the labels were made public: https://www.dropbox.com/sh/6yrzqr1vem5lnwn/AADC8HI-RL4X2zZ10EW0A3Zua?dl=0&preview=pairs_labelled.csv
Are we allowed to train on the test data?
Absolutely not! The only reason the labels were released is that we had already released them last year after the competition, so we wanted to make sure everyone had the same material. That said, we are quite familiar with the data (of course), especially the difficult samples: if any submissions look like they do not match the README, we would open the results and see why (I hope this does not happen, but just FYI). Thanks for asking, but 100% no using any test data to train (just the training and validation sets). Ideally, when you create test results, you will just submit them as is for scoring. For papers and analysis, however, post-processing and analysis are welcome! We are trusting that everyone will work honestly. Typically the test set would be blind, but since some people already had the labels, we figured it would only be fair to make sure everyone had access to the same items. Thanks again!
Posted by: jvision @ Sept. 10, 2021, 3:11 p.m.