MIND News Recommendation Competition


Overview

Introduction

Online news services such as Microsoft News have gained huge popularity for online news reading. However, since massive numbers of news articles are published every day, users of online news services face heavy information overload. News recommendation is therefore an important technique for personalized news services to improve users' reading experience and alleviate information overload.

However, news recommendation is a challenging task. First, news articles on news websites emerge and update very quickly: many new articles are posted continuously, and existing articles disappear after a short period of time. Thus, there is a severe cold-start problem in news recommendation. Second, news articles usually contain rich textual information such as titles and bodies, and it is very important to understand news content from these texts using NLP techniques. Third, users do not post explicit ratings of news articles on news platforms, so in news recommendation we need to model users' interests from their browsing and click behaviors. However, user interests are usually diverse and dynamic, which poses significant challenges to user modeling algorithms. Further research is therefore highly needed to tackle these challenges in news recommendation.

To promote research and practice in news recommendation, we are holding the MIND News Recommendation Competition based on the MIND dataset, a large-scale English dataset for news recommendation. This challenge provides a good testbed for participants to develop better news recommender systems and improve the future reading experience of millions of users.

How to participate?

  • Read the competition details on this website
  • Join the competition on CodaLab
  • Send an email titled "MIND Competition Registration" to mind[at]microsoft.com with your information (CodaLab account nickname, real name, contact email, and affiliation) and your agreement to the Microsoft MIND News Recommendation Contest Official Rules (please write "I agree to the Microsoft MIND News Recommendation Contest Official Rules" in your email). If all necessary information is included, we will approve your participation within one or two days of receiving the email and send you a confirmation
  • Train and evaluate your news recommendation models on the MIND dataset
  • In the dev phase, you can submit your results on the dev set to the CodaLab system to obtain an official score
  • In the test phase, we will release the test set, and you can submit your predicted results on it to CodaLab before the deadline (see the Timeline tab for more details)

Any questions or suggestions about the competition can be sent to mind[at]microsoft.com.

Task

The task in this competition is defined as follows. Given the news browsing history [n1, n2, ..., nP] of a user u and a set of candidate news articles [c1, c2, ..., cM] in an impression log, the goal is to rank the candidate news articles according to the personal interests of this user. In this process, news articles can be modeled from their content, and a user's interests can be modeled from their news browsing history. The model then predicts a click score for each candidate news article based on the relevance between the candidate and the user's interests. Finally, the candidate news articles in each impression are ranked by their click scores. The ranking results are compared against the real user click labels to measure ranking quality via several metrics, including AUC, MRR, and nDCG@K (see the Evaluation tab).
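
To make this pipeline concrete, here is a minimal sketch of the score-and-rank loop in Python. All names are our own illustrative placeholders, and the averaged-word-vector encoders are toy stand-ins for the trained NLP models a real system would use:

    import numpy as np

    def encode_news(title_tokens, word_vecs, dim=300):
        """Represent a news article as the average of its title word vectors."""
        vecs = [word_vecs[w] for w in title_tokens if w in word_vecs]
        return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

    def encode_user(history_reprs, dim=300):
        """Represent a user as the average of browsed-news representations."""
        return np.mean(history_reprs, axis=0) if len(history_reprs) else np.zeros(dim)

    def rank_candidates(user_repr, candidate_reprs):
        """Score candidates by dot-product relevance; rank 1 = most relevant."""
        scores = np.array([c @ user_repr for c in candidate_reprs])
        order = (-scores).argsort()
        ranks = np.empty(len(scores), dtype=int)
        ranks[order] = np.arange(1, len(scores) + 1)
        return scores, ranks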

Dataset

The dataset used in this competition is the MIcrosoft News Dataset (MIND), a large-scale dataset for news recommendation research. It was collected from anonymized behavior logs of the Microsoft News website. You can visit the official MIND website at https://msnews.github.io to download the training, validation, and (after the test phase begins) test sets. Detailed information about the dataset can also be found on that website.
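
For reference, the snippet below sketches one way to load the two main files with pandas. The tab-separated column layout follows the dataset description on the MIND website; treat the exact schema and file paths here as assumptions and verify them against that documentation:

    import pandas as pd

    # Assumed column layout; see https://msnews.github.io for the schema.
    behaviors = pd.read_csv(
        "MINDlarge_train/behaviors.tsv", sep="\t", header=None,
        names=["impression_id", "user_id", "time", "history", "impressions"])
    news = pd.read_csv(
        "MINDlarge_train/news.tsv", sep="\t", header=None,
        names=["news_id", "category", "subcategory", "title", "abstract",
               "url", "title_entities", "abstract_entities"])

    # 'history' is a space-separated list of previously clicked news IDs;
    # 'impressions' holds NewsID-label pairs (1 = clicked, 0 = not clicked;
    # click labels are withheld in the test set).
    pairs = str(behaviors.loc[0, "impressions"]).split()
    candidates = [p.rsplit("-", 1)[0] for p in pairs]
    labels = [int(p.rsplit("-", 1)[1]) for p in pairs]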


Evaluation Metrics

Systems are evaluated using several standard evaluation metrics from the recommendation field: area under the ROC curve (AUC), mean reciprocal rank (MRR), and normalized discounted cumulative gain at rank K (nDCG@K). The final result is the average of these metrics over all impression logs. The primary metric for submission ranking is AUC.
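
To show how these per-impression metrics are typically computed before averaging, here is a minimal sketch; it is ours, not the official script, and the downloadable evaluation.py is authoritative:

    import numpy as np
    from sklearn.metrics import roc_auc_score

    def mrr_score(labels, scores):
        """Mean reciprocal rank of the clicked items in one impression."""
        order = np.argsort(scores)[::-1]
        y = np.take(labels, order)
        return np.sum(y / (np.arange(len(y)) + 1)) / np.sum(labels)

    def dcg_score(labels, scores, k):
        """Discounted cumulative gain over the top-k ranked items."""
        order = np.argsort(scores)[::-1][:k]
        y = np.take(labels, order)
        return np.sum((2 ** y - 1) / np.log2(np.arange(len(y)) + 2))

    def ndcg_score(labels, scores, k):
        """DCG normalized by the best achievable DCG for these labels."""
        return dcg_score(labels, scores, k) / dcg_score(labels, labels, k)

    # One impression: the clicked item (label 1) received the highest score,
    # so all three metrics equal 1.0 here.
    labels = np.array([0, 1, 0, 0])
    scores = np.array([0.1, 0.9, 0.3, 0.7])
    print(roc_auc_score(labels, scores),
          mrr_score(labels, scores),
          ndcg_score(labels, scores, k=5))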

Scoring script

You can download the official evaluation script here: evaluation.py


Submission Guidelines

Submission Formats

Participants need to submit the ranking results for the news in each impression, as generated by their recommender system. Prediction results submitted to CodaLab should be zip-compressed and contain a file named prediction.txt. In this file, each line contains an impression ID and the rank list of its candidate news. The format of each line is:

ImpressionID [Rank-of-News1,Rank-of-News2,...,Rank-of-NewsN]

For example, given the impression as follows:

ImpressionID Candidate News
24481 N125045 N87192 N73556 N20417

The prediction results of this impression can be:

24481 [4,1,3,2]

which means that the ranking order of the candidate news articles in this impression is N87192, N20417, N73556, and N125045. The evaluation script will evaluate your ranking results against the gold labels. The script, as well as a sample file containing 10 lines of predictions (which cannot be directly submitted to the CodaLab system), can be found on GitHub. A sketch that turns model scores into such a submission appears after the checklist below. Several additional points:

  • A valid zip submission should contain nothing but a single file named prediction.txt. For Mac users, make sure that the archive contains no __MACOSX folder.
  • Do not place prediction.txt inside a folder before compressing it.
  • The row order of the results should be consistent with that of the original files.
  • Ranks are integers starting from 1.
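
Putting these rules together, the following sketch (with hypothetical variable names) converts per-impression model scores into 1-based ranks, writes prediction.txt at the archive root, and zips it. Rank 1 goes to the highest-scoring candidate:

    import zipfile
    import numpy as np

    def scores_to_ranks(scores):
        """Return 1-based integer ranks in original candidate order."""
        order = (-np.asarray(scores)).argsort()
        ranks = np.empty(len(scores), dtype=int)
        ranks[order] = np.arange(1, len(scores) + 1)
        return ranks

    # Hypothetical model output: {impression_id: candidate scores}.
    # These scores reproduce the example above: ranks [4, 1, 3, 2].
    predictions = {24481: [0.1, 0.9, 0.3, 0.7]}

    with open("prediction.txt", "w") as f:
        for imp_id, scores in predictions.items():  # keep original row order
            rank_str = ",".join(map(str, scores_to_ranks(scores)))
            f.write(f"{imp_id} [{rank_str}]\n")

    # prediction.txt must sit at the top level of the zip, not in a folder.
    with zipfile.ZipFile("prediction.zip", "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write("prediction.txt")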

Submission Process

Making a submission takes several steps:

  • Navigate to 'Participate'
  • Write a brief description of your model (optional)
  • Click the 'Submit / View Results' button
  • Upload your zipped submission
  • Wait until the evaluation status changes to 'Finished' or 'Failed'

If the submission status is 'Failed' (*), you can click 'View scoring output log' and 'View scoring error log' to see the debug logs. When the evaluation is finished, you can decide whether to show your scores on the leaderboard. During the development phase, participants can upload their predictions on the validation set and tune their models according to the results. Although submitting in this phase is not obligatory, we highly encourage it: it helps you catch problems with the evaluation early, and it is useful practice for participants who are new to CodaLab.

During the test phase, you can only see the evaluation results on a small subset of the test set. The results on the full test set will be posted a few days after the end of the test phase, since we need to compute the final results, check the validity of each submission, and fix broken submissions.

Important: To avoid overwhelming the system, each user can upload at most 3 submissions per day. In the test phase, only the last submission made during that phase will be regarded as the official submission.

(*) If the error log raises an exception containing "File "/worker/worker.py", line 330, in run", this may be because the CodaLab workers are busy. If you face this problem and have no submission chances left for the day, please contact us to delete your failed submissions so that you can submit again.

Timeline

  • July 20th, 2020: Competition opens. Participant registration begins.
  • July 20th - August 20th, 2020: Dev phase. Participants can submit their results on the dev set to obtain official evaluation scores.
  • August 21st - September 4th, 2020: Test phase. Test data can be downloaded, and participants can submit their results on the test set.
  • September 11th, 2020: Competition results announcement.

*All deadlines are at 11:59 PM UTC on the corresponding day.

Prizes

Within 7 days following the Entry Period, one Grand Prize, two Second Place, and four Third Place winners will be selected from among all eligible entries received.

  • Grand Prize: The winner will receive US $10,000. 
  • Second Place Prizes: Each winner will receive US $3,000.
  • Third Place Prizes: Each winner will receive US $1,000.

In addition, we invite each winner to submit a system description paper and present their work after the end of the competition (submissions from other participants are also highly encouraged).

Terms and Conditions

Before you participate in the competition, please read the rules of this competition and confirm that you agree to them (you need to send your agreement in your registration email). In addition, the MIND dataset is free to download for research purposes under the Microsoft Research License Terms. Please read these terms and confirm that you agree to them before you download the dataset. Feel free to contact us if you have any questions or need clarification regarding the rules of this competition or the licensing of the data.

Organizers

This competition is collaboratively organized by the Microsoft News and Microsoft Research Asia teams:

  • Ying Qiao, Jiun-Hung Chen, Winnie Wu (Microsoft News Team)
  • Fangzhao Wu, Chuhan Wu, Tao Qi, Jingwei Yi, Ling Luo, Xing Xie (Microsoft Research Asia)

Contact: mind[at]microsoft.com
