Designing systems capable of understanding stories is an extremely challenging task that has long been of great interest within the field of natural language understanding. A key obstacle in designing such systems, however, has been the limitations of existing evaluation frameworks. The 'Story Cloze Test' is a new challenge for evaluating story understanding, story generation, and script learning. This test requires a system to predict the correct ending to a four-sentence story.
| Context | Right Ending | Wrong Ending |
| --- | --- | --- |
| Karen was assigned a roommate her first year of college. Her roommate asked her to go to a nearby city for a concert. Karen agreed happily. The show was absolutely exhilarating. | Karen became good friends with her roommate. | Karen hated her roommate. |
| Jim got his first credit card in college. He didn’t have a job so he bought everything on his card. After he graduated he amounted a $10,000 debt. Jim realized that he was foolish to spend so much money. | Jim decided to devise a plan for repayment. | Jim decided to open another credit card. |
| Gina misplaced her phone at her grandparents. It wasn’t anywhere in the living room. She realized she was in the car before. She grabbed her dad’s keys and ran outside. | She found her phone in the car. | She didn’t want her phone anymore. |
To enable the Story Cloze Test, we created a new corpus of five-sentence commonsense stories, 'ROCStories'. This corpus is unique in two ways: (1) it captures a rich set of causal and temporal commonsense relations between daily events, and (2) it is a high-quality collection of everyday life stories that can also be used for story generation. ROCStories can be used as training data for the Story Cloze Test challenge.
To find out more about the original Spring 2016 task, please read this paper. It was later found that some of the examples in the original Story Cloze Test validation and test sets contained stylistic biases that worked against the intent of the task.
We removed many of these original biases and introduce an enhanced validation and test set as our Winter 2018 Validation and Winter 2018 Test. To learn more about how these datasets were constructed, read this paper. Click on "Get Data" to find "ROCStories", "Winter 2018 Validation", and "Winter 2018 Test".
For any questions regarding the Story Cloze Test please contact 'email@example.com' and 'firstname.lastname@example.org'.
This challenge evaluates models designed for tackling the Story Cloze Test. Each challenge set contains Story Cloze Test instances: given a four-sentence story context and two alternative endings, the system should choose the right ending. The challenge sets can be downloaded via this link. For each Story Cloze instance, systems should submit a two-column file containing the 'storyid' and the system's choice of alternative, '1' or '2'. If the chosen answer matches the correct answer, the system is awarded 1 point; otherwise, it receives 0. We aggregate the challenge set results by computing overall accuracy: #total correct / #test case instances.
- Your file should end in an empty line, as you can see in the example submission files.
- Your answer file should use the file name 'answer.txt' and should be the only file inside your zip file. The zip file can have any name.
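The submission and scoring procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the official evaluation script: the comma delimiter and the helper names (`write_answer_file`, `accuracy`) are assumptions based on the two-column format described above.

```python
# Sketch of the submission workflow: write a two-column answer file
# ('storyid' plus a choice of '1' or '2') and compute overall accuracy.
# The delimiter and function names are illustrative assumptions, not the
# official evaluation script.
import zipfile


def write_answer_file(predictions, path="answer.txt"):
    """predictions: list of (storyid, choice) pairs, choice in {'1', '2'}."""
    with open(path, "w") as f:
        for storyid, choice in predictions:
            assert choice in ("1", "2")
            f.write(f"{storyid},{choice}\n")
        f.write("\n")  # the file must end in an empty line


def zip_submission(answer_path="answer.txt", zip_path="submission.zip"):
    """Package answer.txt as the only file inside a zip archive."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.write(answer_path)


def accuracy(predictions, gold):
    """gold: dict mapping storyid -> correct choice.

    Returns #total correct / #test case instances, as in the description.
    """
    correct = sum(1 for sid, choice in predictions if gold.get(sid) == choice)
    return correct / len(predictions)
```

For example, a system that answers two of three instances correctly scores an accuracy of 2/3 ≈ 0.667.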
For any problems with the evaluation please contact 'email@example.com' and 'firstname.lastname@example.org'.
For terms and conditions of the competition please refer to this link. Currently, there is no restriction on the availability of the datasets or the challenge.
Below you can find the current challenge datasets. Winter 2018 is a blind test set, whereas the other sets mainly serve to record the state of the art within the community.
The validation and test sets are as follows:
Winter 2018 Validation Set [**we advise using this set]
Winter 2018 Test Set [**we advise using this set]
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Phase start dates: Oct. 30, 2016; Oct. 24, 2017; Oct. 22, 2018 (all at midnight).