Counts and measurements are an important part of scientific discourse. It is relatively easy to find measurements in text, but a bare measurement like "17 mg" is not informative on its own: a reader needs to know what entity was measured, which of its properties, and under what conditions. Relatively little attention has been given to parsing and extracting these semantic relations. This is challenging because the way scientists write can be ambiguous and inconsistent, and because the location of this information relative to the measurement can vary greatly.
MeasEval is a new entity and semantic relation extraction task focused on finding counts and measurements, attributes of these quantities, and additional information including measured entities, properties, and measurement contexts.
MeasEval is composed of five sub-tasks that cover span extraction, classification, and relation extraction, including cross-sentence relations. Given a paragraph from a scientific text:

1. Identify all Quantity spans (counts or measurements).
2. For each Quantity, identify the Unit of measurement and classify the value with modifiers such as IsApproximate or IsRange.
3. For each Quantity, identify the MeasuredEntity it applies to and, where present, the MeasuredProperty of that entity being measured.
4. For each Quantity, identify any Qualifier span that records additional measurement context.
5. Identify the HasQuantity, HasProperty, and Qualifies relationships that connect these spans.
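To make these sub-tasks concrete, here is a minimal sketch of the annotations they produce for an invented sentence (not drawn from the corpus); the span boundaries, modifier labels, and relation directions shown are illustrative only, and the official annotation format may differ.

```python
# Invented example sentence, for illustration only:
sentence = "The deposited film had a thickness of approximately 120 nm after annealing."

spans = {
    "Quantity": "approximately 120 nm",  # sub-task 1
    "Unit": "nm",                        # sub-task 2
    "Modifiers": ["IsApproximate"],      # sub-task 2
    "MeasuredEntity": "deposited film",  # sub-task 3
    "MeasuredProperty": "thickness",     # sub-task 3
    "Qualifier": "after annealing",      # sub-task 4
}

# Sub-task 5: relations linking the spans above.
relations = [
    ("MeasuredProperty", "HasQuantity", "Quantity"),
    ("MeasuredEntity", "HasProperty", "MeasuredProperty"),
    ("Qualifier", "Qualifies", "Quantity"),
]
```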
Additional resources and data will be available on the MeasEval GitHub repo.
Register your team on the CodaLab Participate page.
Join our listserv at https://groups.google.com/forum/#!forum/measeval-semeval-2021
Corey Harper, Elsevier Labs and INDE lab at the University of Amsterdam
Jessica Cox, Elsevier Labs
Ron Daniel, Elsevier Labs
Paul Groth, INDE lab at the University of Amsterdam
Curt Kohler, Elsevier Labs
Antony Scerri, Elsevier Labs
Evaluation will be based on precision, recall, and F1 metrics for the classification subtasks, and on SQuAD-style Exact Match (EM) and Overlap (“F1”) scores for the span components. We opt for SQuAD-style F1 because we wish to give partial credit for substring matches: exactly matching entities, properties, and contexts is difficult, since these spans may include various modifiers and determiners.
For the classification components of Task 2 and for the relation types in Task 5, we will provide P/R/F1 for each of the evaluated classes, along with micro and macro averages.
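As a rough sketch of this scoring (not the official evaluation script), per-class scores and micro/macro averages can be computed with scikit-learn; the gold and predicted labels below are invented, using the Task 5 relation types as classes.

```python
from sklearn.metrics import precision_recall_fscore_support

# Invented gold and predicted relation labels for a few instances.
y_true = ["HasQuantity", "HasProperty", "Qualifies", "HasQuantity"]
y_pred = ["HasQuantity", "HasQuantity", "Qualifies", "HasQuantity"]
labels = ["HasProperty", "HasQuantity", "Qualifies"]

# Per-class precision/recall/F1 (average=None gives one score per label).
per_class = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, average=None, zero_division=0)

# Micro and macro averages over the same classes.
micro = precision_recall_fscore_support(y_true, y_pred, average="micro", zero_division=0)
macro = precision_recall_fscore_support(y_true, y_pred, average="macro", zero_division=0)
```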
For the span identification components of Tasks 1, 2, 3, and 4, we will provide SQuAD-style Exact Match (EM) and Overlap (“F1”) scores for the submitted spans.
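For the span scores, the following is a minimal sketch of how SQuAD-style EM and token-overlap F1 are typically computed; the official evaluation script (see below) may normalize text differently.

```python
from collections import Counter

def exact_match(pred_span: str, gold_span: str) -> int:
    """1 if the predicted span matches the gold span exactly (case-insensitive)."""
    return int(pred_span.strip().lower() == gold_span.strip().lower())

def overlap_f1(pred_span: str, gold_span: str) -> float:
    """Token-level overlap F1, as in SQuAD: harmonic mean of token precision and recall."""
    pred_tokens = pred_span.lower().split()
    gold_tokens = gold_span.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)  # per-token minimum counts
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Example: partial credit for a near-miss on an entity span.
print(exact_match("the deposited film", "deposited film"))            # 0
print(round(overlap_f1("the deposited film", "deposited film"), 2))   # 0.8
```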
Our evaluation script will be made available prior to the October 1 release of our training data.
Data for this competition is in the form of annotations on CC-BY ScienceDirect Articles available from the Elsevier Labs OA-STM-Corpus. All data, including annotations, is provided under the CC-BY license.
The organizers make no warranties regarding the Dataset, including but not limited to it being up-to-date, correct, or complete.
By submitting results to this competition, you consent to the public release of your scores at SemEval-2021 and in related publications.
Competition timeline:
Training data release / practice phase start: Oct. 1, 2020, midnight
Evaluation phase start: Jan. 10, 2021, midnight
Post-evaluation phase start: Feb. 1, 2021, midnight