Discover the mysteries of the Maya @ ECML PKDD 2021 - Integrated Image Segmentation Challenge Forum

> private set vs public set

If I understand correctly, the public set is a subset of the private set. In theory, we could probe the LB to increase our score. I have already identified a few images in the public set and know whether they contain objects or not.

If the organizer confirms that this is allowed, I will continue to exploit this direction.

The correct way to avoid this is to separate the public set and private set.

Posted by: taka @ June 13, 2021, 5:13 p.m.

If the organizer does not raise their hand, I will continue probing the LB ;-)

Posted by: taka @ June 13, 2021, 5:14 p.m.

Hi,

An empty mask doesn't mean that it has been 'emptied' on purpose for some reason (such as being part of the private test set). It simply means that there are no objects of the particular type in that tile. This is, by the way, also obvious from one of the baselines, and is part of the challenge.

Indeed, the 'private' and the 'public' test sets are separate. In the rulebook, by stating that one is a subset of the other, we only mean that the final leaderboard will reflect the performance of the methods on both sets, rather than only on the 'private' one (hence the particular wording).

Another important thing in the rulebook states that all the 'winning' or top-ranked solutions will also be evaluated in terms of the methods and approaches taken for obtaining the predictions (code, documentation etc.). This means that we will award the teams with the best-performing approaches. This is a machine learning competition after all, and we are interested in solutions rather than guessing. In theory, even a completely random prediction can guess right and have the highest IoU score, but that doesn't mean it is a good solution, right? :)
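To illustrate the IoU point, here is a minimal sketch of intersection over union for binary masks. It is not the challenge's scoring code: the function name `iou`, the nested-list mask format, and the convention that two empty masks score 1.0 are all assumptions made for this example, since the thread does not say how empty tiles are scored.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumption for this sketch: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return inter / union


empty = [[0, 0], [0, 0]]       # tile with no objects of the given type
one_pixel = [[0, 0], [0, 1]]   # prediction that marks a single pixel

print(iou(empty, empty))       # 1.0
print(iou(one_pixel, empty))   # 0.0
```

Under this assumed convention, predicting an empty mask for a tile that really is empty scores perfectly without any model. That is why a lucky guess can score well without being a good solution.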

best,

Posted by: simidjievskin @ June 13, 2021, 6:29 p.m.

Many thanks for your quick reply.

Great to hear that the private set is separate from the public set. LB probing would be useless, so no one needs to spend time on it.

Posted by: taka @ June 13, 2021, 7:27 p.m.

Hi simidjievskin, I do not agree with you regarding "we are interested in solutions rather than guessing". I have been trying different approaches since the beginning of the competition and quickly reached the maximum number of submissions. I asked you to increase this limit so that I could keep trying different architectures and novel augmentation, normalization, and unsupervised pre-training techniques. Considering what you said, "Another important thing in the rulebook states that all the 'winning' or top-ranked solutions will also be evaluated in terms of the methods and approaches taken for obtaining the predictions (code, documentation etc.)", increasing the number of submissions will only lead to more complex and novel solutions, because anyone who wants to cheat will do so whatever the submission limit is.

So please reconsider increasing the number of submissions; there are two weeks left.

Posted by: cayala @ June 14, 2021, 6:42 p.m.

Hi,

without going into deeper debates (which we can have after the competition if you like :) ) -- I'm inclined to agree that these are somewhat competing objectives: good solutions on the one hand, but under such constraints on the other. And indeed, there's no silver bullet against cheating (and/or overfitting), but limiting the submission count is a step forward.

In any case, @cayala, you hinted at/guessed the 'surprise' before we announce it tomorrow (for that you get a drink or chocolate or smth. on me :)) - there will be a relaxation of these constraints for a 2-week photo-finish!

cheers,

Posted by: simidjievskin @ June 14, 2021, 7:49 p.m.