The Story Cloze Test

Organized by ROCNLP


Validation Set Spring 2016
Oct. 24, 2017, midnight UTC


Test Set Spring 2016
Oct. 30, 2016, midnight UTC


Competition Ends

Story Cloze Test Challenge

Designing systems that are capable of understanding stories is an extremely challenging task that has long been of great interest within the field of natural language understanding. However, a key obstacle in the design of such systems has been the limitations of evaluation frameworks. The 'Story Cloze Test' is a new challenge for evaluating story understanding, story generation, and script learning. The test requires a system to choose the correct ending to a four-sentence story.


Story Cloze Test

Context: Karen was assigned a roommate her first year of college. Her roommate asked her to go to a nearby city for a concert. Karen agreed happily. The show was absolutely exhilarating.
Right Ending: Karen became good friends with her roommate.
Wrong Ending: Karen hated her roommate.

Context: Jim got his first credit card in college. He didn’t have a job so he bought everything on his card. After he graduated he amounted a $10,000 debt. Jim realized that he was foolish to spend so much money.
Right Ending: Jim decided to devise a plan for repayment.
Wrong Ending: Jim decided to open another credit card.

Context: Gina misplaced her phone at her grandparents. It wasn’t anywhere in the living room. She realized she was in the car before. She grabbed her dad’s keys and ran outside.
Right Ending: She found her phone in the car.
Wrong Ending: She didn’t want her phone anymore.
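
As a rough illustration, one Story Cloze Test instance from the examples above could be held in a small record like the one below. This is only a sketch: the class name, the field names, and the story id "example-karen" are hypothetical and are not taken from the dataset files, whose actual column layout should be checked after download.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ClozeInstance:
    """One test instance: a four-sentence context and two candidate endings.

    All field names are hypothetical; they are not the dataset's column names.
    """
    story_id: str
    context: Tuple[str, str, str, str]
    ending_1: str
    ending_2: str
    right_ending: int  # 1 or 2, i.e. which candidate is the right ending

    def right_text(self) -> str:
        """Return the text of the right ending."""
        return self.ending_1 if self.right_ending == 1 else self.ending_2


# The first example story above, with a made-up story id.
karen = ClozeInstance(
    story_id="example-karen",
    context=(
        "Karen was assigned a roommate her first year of college.",
        "Her roommate asked her to go to a nearby city for a concert.",
        "Karen agreed happily.",
        "The show was absolutely exhilarating.",
    ),
    ending_1="Karen became good friends with her roommate.",
    ending_2="Karen hated her roommate.",
    right_ending=1,
)
print(karen.right_text())
```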

To enable the Story Cloze Test, we created a new corpus of five-sentence commonsense stories, 'ROCStories'. This corpus is unique in two ways: (1) it captures a rich set of causal and temporal commonsense relations between daily events, and (2) it is a high-quality collection of everyday life stories that can also be used for story generation. ROCStories can be used as training data for the Story Cloze Test challenge.

Original Spring 2016 Dataset

To find out more about the original Spring 2016 Task, please read this paper. It was later found that some of the examples in the original Story Cloze Test validation and test sets contained stylistic biases that worked against the intent of the task.

New Winter 2018 Dataset

We remove many of these original biases and introduce an enhanced validation and test set, Winter 2018 Validation and Winter 2018 Test. To learn more about how these datasets were constructed, read this paper. Click on "Get Data" to find "ROCStories", "Winter 2018 Validation" and "Winter 2018 Test".


For any questions regarding the Story Cloze Test please contact '' and ''.

Evaluation Criteria

This challenge evaluates models designed for tackling the Story Cloze Test. Each challenge set contains Story Cloze Test instances: given a four-sentence story context and two alternative endings to the story, the system should choose the right ending. The challenge sets can be downloaded via this link. For each Story Cloze Test instance, the system should submit one row of a two-column file containing the 'storyid' and the system's choice of alternative, '1' or '2'. If the chosen answer matches the correct answer, the system is awarded 1 point. Otherwise, it receives 0. We aggregate the challenge set results by computing overall accuracy: #total correct/#test case instances.
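
The scoring rule can be sketched in a few lines of Python. This is a minimal illustration and not the organizers' evaluation script: the function name, the dictionary representation, and the story ids are assumptions. It also assumes that an instance with no submitted answer scores 0, since the denominator is the number of test case instances.

```python
from typing import Dict


def cloze_accuracy(gold: Dict[str, str], predicted: Dict[str, str]) -> float:
    """Return #total correct / #test case instances.

    `gold` and `predicted` map a story id to the chosen ending, '1' or '2'.
    Instances missing from `predicted` are counted as incorrect (assumption).
    """
    if not gold:
        raise ValueError("gold answers must not be empty")
    correct = sum(
        1 for story_id, answer in gold.items()
        if predicted.get(story_id) == answer
    )
    return correct / len(gold)


# Hypothetical story ids; three of the four choices match the gold answers.
gold = {"story-a": "1", "story-b": "2", "story-c": "1", "story-d": "2"}
predicted = {"story-a": "1", "story-b": "1", "story-c": "1", "story-d": "2"}
print(cloze_accuracy(gold, predicted))  # 0.75
```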

Two example submission files can be found here and here. Please make sure to use the exact same format. If you face any difficulties submitting your zip file, please double check the following:

- Your file should end in an empty line, as you can see in the example submission files.
- Your answer file should use the file name 'answer.txt' and should be the only file inside your zip file. The zip file can have any name.
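
The checklist above could be followed with a short script like this one. It is a sketch under stated assumptions: the comma separator, the absence of a header row, and the story ids are guesses, and "end in an empty line" is read here as "end with a trailing newline". The example submission files remain the authoritative reference for the exact format.

```python
import io
import zipfile
from typing import BinaryIO, Dict, Union


def build_answer_text(choices: Dict[str, str], sep: str = ",") -> str:
    """Render one 'storyid<sep>choice' row per instance, ending with a newline.

    The separator is an assumption; match it to the example submission files.
    """
    rows = []
    for story_id, choice in choices.items():
        if choice not in ("1", "2"):
            raise ValueError(f"choice for {story_id!r} must be '1' or '2'")
        rows.append(f"{story_id}{sep}{choice}")
    return "\n".join(rows) + "\n"


def write_submission(choices: Dict[str, str],
                     target: Union[str, BinaryIO],
                     sep: str = ",") -> None:
    """Write a zip archive whose only member is 'answer.txt'."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("answer.txt", build_answer_text(choices, sep))


# In-memory check with hypothetical story ids; pass a file path such as
# "submission.zip" instead of the buffer to write the archive to disk.
buffer = io.BytesIO()
write_submission({"story-a": "1", "story-b": "2"}, buffer)
with zipfile.ZipFile(buffer) as archive:
    member_names = archive.namelist()
    answer_text = archive.read("answer.txt").decode("utf-8")
print(member_names)
```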

Model Description Form

Please use this form to add information about your approach. The results can be found here.

For any problems with the evaluation please contact '' and ''.

Terms and Conditions

For the terms and conditions of the competition, please refer to this link. Currently, there is no restriction on the availability of the datasets or the challenge.

The current challenge datasets are listed below. The Winter 2018 test set is a blind test set, whereas the current sets mainly serve as a record of the state of the art within the community.

ROCStories Spring 2016 Set

ROCStories Winter 2017 Set

The validation and test sets are as follows:

Winter 2018 Validation Set [**we advise using this set]

Winter 2018 Test Set [**we advise using this set]

Spring 2016 Validation Set

Spring 2016 Test Set


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Validation Set Spring 2016

Start: Oct. 24, 2017, midnight

Test Set Winter 2018

Start: Oct. 22, 2018, midnight

Test Set Spring 2016

Start: Oct. 30, 2016, midnight

# Username Score
1 lizhongyang 0.903246
2 jose.fonollosa 0.886696
3 edc 0.807766