Some of the data distributions are very unbalanced. Will the test sets (current and private) be manually balanced by you? Or do you expect to measure "real-life" performance with test sets that are "real-life" distributed?
Is the distribution of both test sets similar?
The test set includes real-world data that is "real-world distributed". Obviously, we can't disclose the distribution characteristics of the test set. The reason is that we are looking for models that generalize well to the real world, and we are not particularly interested in models that 'hack' the test-set distribution...
As stated on the competition website, the private test set will include a) data gathered from locations that were not represented in the training set; b) segments, not tracks.
We do expect to measure real-world performance with test sets that are real-world distributed.
We published many details regarding the training and aux sets. Those details should be helpful so that participants can create validation sets that are reasonable (i.e., should predict generalization well).
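As an illustration only, here is a minimal sketch of one way a participant might build such a validation set: hold out whole locations, so that validation data comes from locations the model never saw in training, mirroring what is stated above about the private test set. The record layout and the field name `location_id` are assumptions made for this example, not the actual schema of the training or aux sets, and `split_by_location` is a hypothetical helper.

```python
import random
from collections import defaultdict


def split_by_location(records, val_fraction=0.2, seed=0):
    """Split records so that no location appears in both train and validation.

    `records` is a list of dicts, each with a hypothetical "location_id" key.
    Returns (train_records, val_records, val_locations).
    """
    # Group records by location so that each location is kept or held out whole.
    by_location = defaultdict(list)
    for rec in records:
        by_location[rec["location_id"]].append(rec)

    # Shuffle the location ids reproducibly and hold out a fraction of them.
    locations = sorted(by_location)
    random.Random(seed).shuffle(locations)
    n_val = max(1, round(len(locations) * val_fraction))
    val_locations = set(locations[:n_val])

    train = [r for r in records if r["location_id"] not in val_locations]
    val = [r for r in records if r["location_id"] in val_locations]
    return train, val, val_locations
```

Calling `split_by_location(records, val_fraction=0.2)` on a list of segment records returns training and validation lists with disjoint locations. Since the fraction applies to locations rather than to segments, the validation set's size and class balance will vary with the seed; repeating the split over several seeds gives a rough sense of how much a score depends on which locations were held out.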
The public test set was selected so that it represents the private test set in some aspects, but the two are far from having exactly the same distribution.
Please note that this is not a competition issue - this is a real-world issue! When we develop algorithms and train models for real-world problems, we expect them to perform well in the real world. The real-world distribution is hardly ever known in advance (and is hardly ever stationary)!
(MAFAT Challenge Team)