> Why is the objective different in phase A and phase B?

In the development phase, we only need to predict one day's data.
But in the test phase (phase B), why do we need to predict 7 days of results?
These are two different objectives in one competition, because in the development phase we can use the previous day's data to build features, but in the test phase we cannot.
E.g., in the development phase, I used yesterday's data to build a lot of features, such as news CTR, users' click counts, and so on. But now these features are invalid, because yesterday's data is not known for the next 7 days.
Maybe this change doesn't matter for the DL models (NRMS, NAML, ...). But it will wipe out all the effort of participants who use ML methods.

Posted by: YangZhenghong @ Aug. 22, 2020, 5:56 a.m.

I agree with YangZhenghong.

Posted by: huailei @ Aug. 22, 2020, 7:46 a.m.

Hi YangZhenghong, we have described the dataset construction and split in our dataset description paper. We hope the model is capable of mining long-term user interest rather than capturing short-term dynamics only. Thus, we reserve the logs from the last week as the test set. You may consider using features extracted from the training/dev set only, or designing new features.

Posted by: MIND_Organizer @ Aug. 22, 2020, 10:08 a.m.

Thanks for your answer, dear MIND organizer. I agree with you that it's valuable in research to model the user's long-term interest. But in a competition, many click behaviors depend on short-term data (e.g., a user who mostly reads entertainment news may also care about the "Notre Dame de Paris is destroyed by fire" news). If we don't use this short-term data, it's difficult to find breaking news. And if we want to win, it's necessary to design a good method to combine short-term and long-term data/models, because the actual result consists of all situations. Have a nice day, and thanks again.

Posted by: YangZhenghong @ Aug. 25, 2020, 3:55 a.m.
