SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning

Organized by BoyuanZheng

First phase

Practice (Training data ready)
Oct. 1, 2020, midnight UTC


Competition Ends

ReCAM: Reading Comprehension of Abstract Meaning

Boyuan Zheng, Xiaoyu Yang, Yu-Ping Ruan, Quan Liu, Zhen-Hua Ling, Si Wei, Xiaodan Zhu

Computers' ability to understand, represent, and express abstract meaning is a fundamental problem on the path to true natural language understanding. In the past decade, significant advances in representation learning have been achieved or claimed on many NLP problems. How does such success help develop models for understanding and modeling abstract meaning?

The aim of our shared task is to provide a benchmark for studying machines' ability to represent and understand abstract concepts. In the task, computers are given passages to read and understand. If a model can digest the passages as humans do, we expect it to be able to predict the abstract words that humans use when writing summaries after understanding the passages. Note that the popular CNN/DailyMail dataset (Hermann et al., 2015), among others, requires computers to predict concrete concepts, e.g., named entities. In our task, by contrast, we require a model to fill in abstract words removed from human-written summaries.


Our shared task has three subtasks. Subtask-1 and subtask-2 evaluate machine-learning models' performance with regard to two definitions of abstractness: imperceptibility (Spreen and Schulz, 1966) and nonspecificity (Changizi, 2008). Subtask-3 aims to provide more insight into the relationship between the two.

• Subtask-1: ReCAM: Imperceptibility

Concrete words refer to things, events, and properties that we can perceive directly with our senses, such as donut, trees, and red. In contrast, abstract words refer to ideas and concepts that are distant from immediate perception. Examples include objective, culture, and economy. In subtask-1, the participating systems are required to perform reading comprehension of abstract meaning for imperceptible concepts.

In this subtask, you are given a passage and a question related to this passage. You need to choose one word from five candidate words with the highest imperceptibility.
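The task format described above (a passage, a question, and five candidate words) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Instance` class, its field names, the `____` blank marker, and the `toy_score` function are all hypothetical and are not the official data format or a baseline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    """Hypothetical container for one question; field names are assumptions."""
    passage: str
    question: str           # summary sentence with one word removed
    candidates: List[str]   # the five candidate words
    answer_index: int       # index of the correct candidate


def predict(instance: Instance, score: Callable[[str, str, str], float]) -> int:
    """Return the index of the highest-scoring candidate word."""
    scores = [score(instance.passage, instance.question, c)
              for c in instance.candidates]
    return max(range(len(scores)), key=scores.__getitem__)


def toy_score(passage: str, question: str, candidate: str) -> float:
    """Placeholder scorer standing in for a real model: 1.0 if the
    candidate word appears in the passage, otherwise 0.0."""
    return 1.0 if candidate.lower() in passage.lower() else 0.0


example = Instance(
    passage="The council debated how the new budget would affect the local economy.",
    question="The council discussed the ____ .",
    candidates=["economy", "culture", "objective", "policy", "history"],
    answer_index=0,
)
print(predict(example, toy_score))
```

A real system would replace `toy_score` with a trained reading-comprehension model; the selection step (take the best-scoring of the five candidates) stays the same.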


• Subtask-2: ReCAM: Nonspecificity
Subtask-2 focuses on the other typical definition of abstractness: nonspecificity. Under this definition, words such as vertebrate are regarded as more abstract than concrete concepts like groundhog and whale.

• Subtask-3: ReCAM-Interaction

Subtask-3 aims to provide more insight into the relationship between the two views on abstractness. In this subtask, we test the performance of a system that is trained on one definition and evaluated on the other. Furthermore, we will evaluate systems' performance on concepts that are both imperceptible and nonspecific.

Important Dates

Trial data ready: July 31, 2020

Training data ready: October 1, 2020

Test data ready: December 3, 2020

Evaluation start: January 10, 2021

Evaluation end: January 31, 2021

Paper submission due: February 23, 2021

Notification to authors: March 29, 2021

Camera ready due: April 5, 2021

SemEval workshop: Summer 2021



Evaluation Method

Accuracy is used as the evaluation metric for all three subtasks.
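As a rough illustration of the metric, accuracy here is the fraction of questions for which the predicted candidate matches the correct one. The sketch below assumes predictions and gold answers are given as option indices; the function name and the input format are assumptions, not the official scoring script.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], gold_labels: Sequence[int]) -> float:
    """Fraction of questions whose predicted option index equals the gold index."""
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold_labels must have the same length")
    if len(gold_labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Hypothetical example: option indices (0-4) for four questions.
predicted = [0, 3, 2, 4]
gold = [0, 1, 2, 4]
print(accuracy(predicted, gold))  # 3 of 4 correct -> 0.75
```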

Participants need to submit the files of their trained models.

(Details will come soon)

Terms & Conditions

By submitting results to this competition, you consent to the public release of your scores at the SemEval-2021 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include but are not limited to, automatic and manual quantitative judgments, qualitative judgments, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.

You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgment that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.

You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers.

You agree not to redistribute the test data except in the manner prescribed by its license.


Since the three subtasks share the same task format, baselines are applicable to all subtasks.

GA Reader

Instruction for GA Reader

You are free to build a system from scratch using any available software packages and resources, as long as they are not against the spirit of fair competition. In order to assist testing of ideas, we also provide GA Reader that you can build on. The use of this system is completely optional. The system is available.


  • Boyuan Zheng, Northeastern University
  • Xiaoyu Yang, Queen's University
  • Yu-Ping Ruan, University of Science and Technology of China
  • Quan Liu, iFlytek Research
  • Zhen-Hua Ling, University of Science and Technology of China
  • Xiaodan Zhu, Queen's University



