SemEval-2018 Task 6: Parsing Time Normalizations

Organized by Egoitz


Jan. 8, 2018, midnight UTC


Jan. 30, 2018, midnight UTC


Competition Ends


This is the CodaLab Competition for SemEval-2018 Task 6: Parsing Time Normalizations.

Please join our Google Group to ask questions and get the most up-to-date information on the task.


Important Dates:

14 Aug 2017: Trial data release
18 Sep 2017: Training data release
8 Jan 2018: Test data release
29 Jan 2018: Evaluation end




The Parsing Time Normalizations shared task is a new approach to time normalization based on recognizing semantically compositional time operators. Such operators are more expressive, being able to represent many more time expressions, and are more machine-learnable, as they can naturally be viewed as a semantic parsing task.


Each operator in the semantic tree can be formally defined in terms of mathematical operations. For example, the operator BETWEEN can be expressed as: 

  Between([t1, t2): Interval, [t3, t4): Interval): Interval = [t2, t3)  
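As an illustration of the definition above, the following is a minimal Python sketch of the Between operator over half-open intervals. The names `Interval` and `between` and the sample dates are hypothetical and chosen only for this example; this is not the task's official interpreter.

```python
from datetime import datetime
from typing import NamedTuple


class Interval(NamedTuple):
    """A half-open time interval [start, end)."""
    start: datetime  # inclusive
    end: datetime    # exclusive


def between(first: Interval, second: Interval) -> Interval:
    """Between([t1, t2), [t3, t4)) = [t2, t3).

    The result starts where the first interval ends and
    ends where the second interval starts.
    """
    return Interval(first.end, second.start)


# Hypothetical input: two single-day intervals.
first = Interval(datetime(2017, 4, 1), datetime(2017, 4, 2))
second = Interval(datetime(2017, 4, 21), datetime(2017, 4, 22))
gap = between(first, second)
```

The sketch does no validation: if the first interval ends after the second one starts, the returned interval has its end before its start, and a real interpreter would need to decide how to handle that case.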


Thus, interpreting the formal operations that compose a time expression produces the corresponding time intervals. For the example in the figure above and assuming that the Doc-Time is April 21, 2017, the resulting intervals would be:



The ultimate goal of the shared task is to interpret time expressions in order to identify appropriate intervals that can be placed on a timeline.


We offer two tracks: parsing text to time operators and producing time intervals. For the latter, we will provide an interpreter that infers time intervals from the time operators extracted by the participants. The interpreter is also able to obtain such intervals from timestamps in TimeML format. Thus, systems participating in Track 1 will automatically take part in Track 2. Furthermore, participants can join Track 2 directly by providing more traditional TimeML annotations.

  • Track 1: Parse text to time operators. Systems must identify time operators in text and link them correctly to signal how they have to be composed.
  • Track 2: Produce time intervals. Systems can participate through Track 1 or by providing TimeML annotations. In both cases, the intervals are inferred by our interpreter.


Egoitz Laparra, Dongfang Xu, Steven Bethard (University of Arizona)

Ahmed S. Elsayed, Martha Palmer (University of Colorado)


Bethard, S. and Parker, J. (2016). A Semantically Compositional Annotation Scheme for Time Normalization. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Paris, France, May 2016.


Our dataset covers two different domains: newswire and notes on colon cancer. The data consists of annotated documents from the TimeBank/AQUAINT corpus of news articles, and documents from the THYME corpus of clinical notes.

Participants interested in the clinical notes portion of the evaluation must sign a data use agreement (DUA) with the Mayo Clinic to obtain the raw text of the clinical notes and pathology reports, because the THYME corpus contains incompletely de-identified clinical data (the time expressions were retained). Participants of Clinical TempEval 2015, 2016, or 2017 have already completed this process and do not need to do anything more to participate in the clinical portion of Parsing Time Normalizations. New participants can follow the instructions for the process.

Please apply for a data use agreement as soon as possible! The process may take some time.

Read the DUA carefully before agreeing to it. Among other things, you will be agreeing:

  • to keep the data secure using restricted passwords and encryption (e.g., on a secure server, not on personal computers)
  • not to attempt to re-identify the data
  • not to redistribute the data to anyone else for any purpose

The annotation is in Anafora XML format. This means that for each file in the corpus, there will be a directory. That directory will contain an XML file. The XML file contains stand-off annotations that follow the guidelines for the proposed time operator annotation scheme.
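To give a rough idea of what reading such stand-off annotations can look like, here is a minimal Python sketch. The XML layout in `SAMPLE` (an `entity` element with `id`, `span`, `type`, and `properties` children) is an assumption about how Anafora files are typically laid out and is not specified on this page; the sample values and the function name `read_entities` are hypothetical. Check the element names against the released data before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the element names are an assumed layout.
SAMPLE = """<data>
  <annotations>
    <entity>
      <id>1@e@doc_001@system</id>
      <span>10,14</span>
      <type>Year</type>
      <properties>
        <Value>2017</Value>
      </properties>
    </entity>
  </annotations>
</data>"""


def read_entities(xml_text):
    """Return one dict per <entity>, with character spans parsed to ints."""
    root = ET.fromstring(xml_text)
    entities = []
    for entity in root.iter("entity"):
        # A span is "start,end"; several spans may be joined with ";".
        span_text = entity.findtext("span", default="")
        spans = [
            tuple(int(offset) for offset in part.split(","))
            for part in span_text.split(";")
            if part
        ]
        props_node = entity.find("properties")
        props = {}
        if props_node is not None:
            props = {child.tag: (child.text or "") for child in props_node}
        entities.append({
            "id": entity.findtext("id"),
            "spans": spans,
            "type": entity.findtext("type"),
            "properties": props,
        })
    return entities


entities = read_entities(SAMPLE)
```

Because the annotations are stand-off, the offsets in `spans` refer to character positions in the separate raw text file rather than to text embedded in the XML.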

Evaluation metrics

For both Track 1 (Parsing) and Track 2 (Intervals), the results will be given in terms of precision, recall and F-measure. For Track 2, our scorer includes an interpreter that can produce time intervals by reading annotations in both Anafora and TimeML formats. The scores for each track are calculated as follows:

  • Track 1: In this track, we follow a traditional information extraction evaluation and measure the precision and recall of finding and linking the various time operators. A predicted annotation is considered equal to a gold standard annotation if it has the same character span (offsets), type, and properties (with the definition applying recursively for properties that point to other annotations).
  • Track 2: For this track, we evaluate the accuracy of systems with respect to the timeline. A predicted interval is evaluated against a gold standard interval if their spans overlap. The scorer finds the intersection between the predicted and the gold standard intervals. Then, the precision is calculated by dividing the length of the intersection by the length of the predicted interval. Similarly, the recall is calculated by dividing the length of the intersection by the length of the gold standard interval. The final precision and recall are obtained by averaging over all the evaluated pairs.
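The Track 2 computation described above can be sketched in Python as follows. This is a minimal illustration, not the official scorer: it assumes that the predicted and gold standard intervals have already been paired, that intervals are half-open `(start, end)` tuples of `datetime` values, and that the F-measure is the harmonic mean of the averaged precision and recall. The function names and the sample dates are hypothetical.

```python
from datetime import datetime
from typing import Sequence, Tuple

Interval = Tuple[datetime, datetime]  # half-open [start, end)


def length(interval: Interval) -> float:
    """Length of an interval in seconds."""
    return (interval[1] - interval[0]).total_seconds()


def intersection_length(a: Interval, b: Interval) -> float:
    """Length in seconds of the overlap of a and b (0.0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max((end - start).total_seconds(), 0.0)


def score_pairs(pairs: Sequence[Tuple[Interval, Interval]]):
    """Average per-pair precision and recall over (predicted, gold) pairs."""
    if not pairs:
        return 0.0, 0.0, 0.0
    precisions, recalls = [], []
    for predicted, gold in pairs:
        inter = intersection_length(predicted, gold)
        p_len, g_len = length(predicted), length(gold)
        # Guard against zero-length intervals (an assumption of this sketch).
        precisions.append(inter / p_len if p_len > 0 else 0.0)
        recalls.append(inter / g_len if g_len > 0 else 0.0)
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Hypothetical pair: a 20-day prediction that fully covers a 10-day gold interval.
predicted = (datetime(2017, 4, 1), datetime(2017, 4, 21))
gold = (datetime(2017, 4, 11), datetime(2017, 4, 21))
precision, recall, f1 = score_pairs([(predicted, gold)])
```

In this example the intersection is 10 days long, so the precision is 10/20 = 0.5 and the recall is 10/10 = 1.0: predicting an interval that is too wide costs precision but not recall.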

System Output Format

For Track 1, your system must produce Anafora XML format files. For Track 2, submissions can also be in TimeML format. In either case, your directory structure should be organized as follows:

  • Domain
    • doc_001
      • doc_001.TimeNorm.system.completed.xml
    • doc_002
      • doc_002.TimeNorm.system.completed.xml
    • doc_003
      • doc_003.TimeNorm.system.completed.xml
    • doc_004
      • doc_004.TimeNorm.system.completed.xml
    • ...
  • ...


Make sure that you comply with the following rules when you create your output directory:

  • The root must contain only the domain directories, namely, "Newswire" and "Cancer".
  • If you do not produce output for a domain, do not include its directory.
  • Each domain directory must contain only the corresponding document directories, and their names must be the same as in the dataset.
  • If you do not produce output for a document, do not include its directory.
  • Each document directory must contain only the corresponding annotation file.
  • The name of each annotation file must match the document name and must have an "xml" extension, for Anafora format files, or a "tml" extension, for TimeML format files.
  • All the annotation files must have the same extension.
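
The rules above can be checked before submitting with a short script. The following Python sketch is only an illustration: the function name `check_submission` is hypothetical, and it assumes that "matching the document name" means the file name starts with the document directory name followed by a dot, as in the layout shown earlier. It cannot verify that document names match the dataset, since that requires the dataset itself.

```python
from pathlib import Path
from typing import List

ALLOWED_DOMAINS = {"Newswire", "Cancer"}
ALLOWED_EXTENSIONS = {".xml", ".tml"}


def check_submission(root: Path) -> List[str]:
    """Return a list of rule violations found under root (empty if none)."""
    problems = []
    extensions = set()
    for domain in sorted(root.iterdir()):
        # The root must contain only the known domain directories.
        if not domain.is_dir() or domain.name not in ALLOWED_DOMAINS:
            problems.append(f"unexpected entry in root: {domain.name}")
            continue
        for doc in sorted(domain.iterdir()):
            if not doc.is_dir():
                problems.append(f"unexpected file in domain: {doc}")
                continue
            files = sorted(doc.iterdir())
            # Each document directory holds exactly one annotation file.
            if len(files) != 1 or not files[0].is_file():
                problems.append(f"expected exactly one file in: {doc}")
                continue
            annotation = files[0]
            # Assumed naming rule: "<document name>.<anything>.<extension>".
            if not annotation.name.startswith(doc.name + "."):
                problems.append(f"file name does not match document: {annotation}")
            if annotation.suffix not in ALLOWED_EXTENSIONS:
                problems.append(f"unsupported extension: {annotation}")
            extensions.add(annotation.suffix)
    # All annotation files must share the same extension.
    if len(extensions) > 1:
        problems.append("mixed extensions: " + ", ".join(sorted(extensions)))
    return problems
```

Calling `check_submission(Path("my_output"))`, where `my_output` is a placeholder for your own output directory, returns an empty list when none of the checked rules is violated.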


Terms and conditions

By submitting results to this competition, you consent to the public release of your scores at the SemEval-2018 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.

You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.

You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers.

You agree not to redistribute the test data except in the manner prescribed by its licence.


Start: Aug. 14, 2017, midnight


Start: Jan. 8, 2018, midnight


Start: Jan. 30, 2018, midnight

Competition Ends

