MIND News Recommendation Competition Forum


> Dataset Question

Hello, can you provide the creation time of each news article?
Perhaps because of regional IP restrictions or some other reason, the provided crawler code seems inefficient (it has taken about one whole day to crawl MIND-small).
I think this information is very important for recalling news. As the MIND paper said, about 85% of news articles have a lifetime of only 4 or 5 days.
So could you provide this information in the original dataset? Have a nice day :>

Posted by: YangZhenghong @ Aug. 5, 2020, 2:34 a.m.

A simple way to approximate the publish time of a candidate news article is to use the timestamp of its first impression. If crawling is too slow, you can run the crawler on cloud servers or Colab to fetch the raw news information.
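
A minimal sketch of the first-impression idea, not code from the organizers. It assumes each row of `behaviors.tsv` has five tab-separated fields (impression ID, user ID, time, click history, impressions), that the time looks like `11/15/2019 10:22:32 AM`, and that each impression entry is a news ID with an optional `-0`/`-1` click label. The names `first_impression_times` and `load_rows` are hypothetical helpers introduced for illustration.

```python
import csv
from datetime import datetime
from typing import Dict, Iterable, Iterator, List

# Assumed timestamp format of the time column, e.g. "11/15/2019 10:22:32 AM".
TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def load_rows(path: str) -> Iterator[List[str]]:
    """Yield the tab-separated rows of a behaviors file (assumed layout)."""
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def first_impression_times(rows: Iterable[List[str]]) -> Dict[str, datetime]:
    """Map each candidate news ID to the earliest time it was shown."""
    first_seen: Dict[str, datetime] = {}
    for row in rows:
        if len(row) < 5:
            continue  # skip malformed or truncated rows
        shown_at = datetime.strptime(row[2], TIME_FORMAT)
        for entry in row[4].split():
            # Drop the click label if present: "N4607-1" -> "N4607".
            news_id = entry.rsplit("-", 1)[0]
            previous = first_seen.get(news_id)
            if previous is None or shown_at < previous:
                first_seen[news_id] = shown_at
    return first_seen


# Small in-memory example; rows need not be sorted by time.
sample_rows = [
    ["1", "U1", "11/15/2019 10:22:32 AM", "N1 N2", "N3-1 N4-0"],
    ["2", "U2", "11/14/2019 8:00:00 PM", "", "N4-0 N5-1"],
]
approx_publish_time = first_impression_times(sample_rows)
```

With a real file, `first_impression_times(load_rows("behaviors.tsv"))` would be the equivalent call. The result is only an upper bound: an article was published at or before its first impression, and articles first shown before the logging window starts will all look as if they appeared at the window's beginning.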

Posted by: MIND_Organizer @ Aug. 5, 2020, 1:20 p.m.