SemEval 2017 Task 12: Clinical TempEval

Organized by bethard


Clinical TempEval 2017 follows in the footsteps of the i2b2 2012 shared task, Clinical TempEval 2015, and Clinical TempEval 2016 in bringing timeline extraction to the clinical domain. As in past Clinical TempEvals, data will be drawn from clinical notes and pathology reports for cancer patients at the Mayo Clinic.

New in 2017

This year, Clinical TempEval will focus on domain adaptation: systems will be trained on data from colon cancer patients, but will be asked to make predictions on brain cancer patients. Adapting to the many differences between the two domains will be a key challenge for the task.

Subtasks

Clinical TempEval systems will be asked to extract the following temporal information:

  • TS: identifying the spans of time expressions
  • ES: identifying the spans of event expressions
  • TA: identifying the attributes of time expressions
    • type=DATE, TIME, DURATION, QUANTIFIER, PREPOSTEXP or SET
  • EA: identifying the attributes of event expressions
    • type=N/A, ASPECTUAL or EVIDENTIAL
    • polarity=POS or NEG
    • degree=N/A, MOST or LITTLE
    • modality=ACTUAL, HEDGED, HYPOTHETICAL or GENERIC
  • DR: identifying the relation between an event and the document creation time
    • docTimeRel=BEFORE, OVERLAP, BEFORE-OVERLAP or AFTER
  • CR: identifying narrative container relations (CONTAINS a.k.a. INCLUDES)
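
The label inventories above can be written down as a small lookup table, for example to sanity-check system output before submission. This is only a sketch: the names ALLOWED_VALUES and invalid_attributes are hypothetical, grouping docTimeRel with the event attributes is an assumption made for convenience, and none of this is part of any official tooling.

```python
# Closed-class attribute values, copied from the subtask list above.
# The dictionary layout and the function name are illustrative assumptions.
ALLOWED_VALUES = {
    "time": {
        "type": {"DATE", "TIME", "DURATION", "QUANTIFIER", "PREPOSTEXP", "SET"},
    },
    "event": {
        "type": {"N/A", "ASPECTUAL", "EVIDENTIAL"},
        "polarity": {"POS", "NEG"},
        "degree": {"N/A", "MOST", "LITTLE"},
        "modality": {"ACTUAL", "HEDGED", "HYPOTHETICAL", "GENERIC"},
        "docTimeRel": {"BEFORE", "OVERLAP", "BEFORE-OVERLAP", "AFTER"},
    },
}


def invalid_attributes(kind, attributes):
    """Return (name, value) pairs whose value is outside the inventory.

    Attributes without a closed inventory (such as a free-form value)
    are ignored rather than reported.
    """
    allowed = ALLOWED_VALUES[kind]
    return [
        (name, value)
        for name, value in attributes.items()
        if name in allowed and value not in allowed[name]
    ]


sample_event = {"type": "ASPECTUAL", "polarity": "POS",
                "degree": "N/A", "modality": "ACTUAL"}
print(invalid_attributes("event", sample_event))           # []
print(invalid_attributes("event", {"polarity": "MAYBE"}))  # [('polarity', 'MAYBE')]
```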

For example, given the text:

April 23, 2014: The patient did not have any postoperative bleeding so we will resume chemotherapy with a larger bolus on Friday even if there is slight nausea.

Systems should identify:

  • TS:
    • April 23, 2014
    • postoperative
    • Friday
  • ES:
    • bleeding
    • resume
    • chemotherapy
    • bolus
    • nausea
  • TA:
    • April 23, 2014: type=DATE, value=2014-04-23
    • postoperative: type=PREPOSTEXP
    • Friday: type=DATE, value=2014-04-25
  • EA:
    • bleeding: type=N/A, degree=N/A, polarity=NEG, modality=ACTUAL
    • resume: type=ASPECTUAL, degree=N/A, polarity=POS, modality=ACTUAL
    • chemotherapy: type=N/A, degree=N/A, polarity=POS, modality=ACTUAL
    • bolus: type=N/A, degree=N/A, polarity=POS, modality=ACTUAL
    • nausea: type=N/A, degree=LITTLE, polarity=POS, modality=HYPOTHETICAL
  • DR:
    • bleeding BEFORE docTime
    • resume AFTER docTime
    • chemotherapy AFTER docTime
    • bolus AFTER docTime
    • nausea AFTER docTime
  • CR:
    • postoperative CONTAINS bleeding
    • Friday CONTAINS resume
    • Friday CONTAINS bolus
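
The value 2014-04-25 for "Friday" follows from the document date: April 23, 2014 was a Wednesday, and the text refers to a future event. The sketch below reproduces that arithmetic. The rule "first Friday strictly after the document date" is an assumption suggested by this example, not a statement of the annotation guidelines, and next_weekday is a hypothetical helper.

```python
from datetime import date, timedelta

FRIDAY = 4  # date.weekday() numbers Monday as 0, so Friday is 4


def next_weekday(anchor, weekday):
    """Return the first date strictly after `anchor` that falls on `weekday`."""
    days_ahead = (weekday - anchor.weekday() - 1) % 7 + 1
    return anchor + timedelta(days=days_ahead)


doc_date = date(2014, 4, 23)             # "April 23, 2014", a Wednesday
friday = next_weekday(doc_date, FRIDAY)
print(friday.isoformat())                # 2014-04-25
```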

Participants in Clinical TempEval may take part in any or all of the six subtasks (TS, ES, TA, EA, DR, CR). The evaluation metrics that will be applied are:

  • TS, ES: precision, recall and F1
  • TA, EA: precision, recall and F1 for each attribute
  • DR: precision, recall and F1
  • CR: precision, recall and F1, plus closure-based precision, recall and F1. For the closure-based scores, temporal closure is run on both the system and the reference relations to infer additional relations, and scores are calculated on the post-closure relations.
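
The metrics above can be illustrated with a short sketch. It makes several assumptions: annotations are compared by exact match as hashable tuples, closure is limited to transitivity of CONTAINS, and the function names are hypothetical. The official scorer may treat spans, attributes and closure differently.

```python
def precision_recall_f1(system, reference):
    """Score two sets of hashable annotations by exact match."""
    correct = len(system & reference)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def contains_closure(relations):
    """Transitive closure of (container, contained) pairs.

    Only the rule "A CONTAINS B and B CONTAINS C implies A CONTAINS C"
    is applied; full temporal closure involves more relation types.
    """
    closed = set(relations)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


# Abstract placeholders, not taken from any real document.
reference = {("A", "B"), ("B", "C")}
system = {("A", "B"), ("A", "C")}

print(precision_recall_f1(system, reference))  # (0.5, 0.5, 0.5)

closed_scores = precision_recall_f1(contains_closure(system),
                                    contains_closure(reference))
print(tuple(round(x, 3) for x in closed_scores))  # (1.0, 0.667, 0.8)
```

In this toy case the system relation A CONTAINS C is penalised by the plain scores but credited after closure, because it follows from the two reference relations.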

The focus of Clinical TempEval 2017 is on domain adaptation from a source domain (colon cancer) to a target domain (brain cancer). To allow the evaluation of both unsupervised and semi-supervised approaches to domain adaptation, Clinical TempEval 2017 will have multiple evaluation phases:

  1. Trial phase (identical to Clinical TempEval 2016):
    • Training data:
      • raw text from the train and dev sections of the source domain (colon cancer)
      • annotations from the train and dev sections of the source domain (colon cancer)
    • Test data:
      • raw text from the test section of the source domain (colon cancer)
  2. Unsupervised domain adaptation phase:
    • Training data:
      • raw text from the source domain (colon cancer)
      • annotations from the source domain (colon cancer)
      • raw text from the target domain (brain cancer)
    • Test data:
      • raw text from the target domain (brain cancer)
  3. Semi-supervised domain adaptation phase:
    • Training data:
      • raw text from the source domain (colon cancer)
      • annotations from the source domain (colon cancer)
      • raw text from the target domain (brain cancer)
      • annotations from the target domain (brain cancer)
        [only available during the evaluation period]
    • Test data:
      • raw text from the target domain (brain cancer)

System Output Format

The format of submissions is the same for both phase 1 and phase 2. Your system output should take the same format and organization as the Anafora XML files in the training data. Your directory structure should look like:

  • ID004_clinic_010
    • ID004_clinic_010.Temporal-Relation.system.completed.xml
  • ID004_clinic_012
    • ID004_clinic_012.Temporal-Relation.system.completed.xml
  • ID004_path_011
    • ID004_path_011.Temporal-Relation.system.completed.xml
  • ID005_clinic_013
    • ID005_clinic_013.Temporal-Relation.system.completed.xml
  • ...
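
The layout above puts one directory per document, holding a single XML file named after that document. A minimal sketch for building those paths and spotting missing files follows. The helper names and the "out" root directory are hypothetical, and the XML content itself is not covered here.

```python
from pathlib import Path

# File name suffix taken from the directory listing above.
SUFFIX = ".Temporal-Relation.system.completed.xml"


def output_path(root, doc_id):
    """Return <root>/<doc_id>/<doc_id>.Temporal-Relation.system.completed.xml."""
    return Path(root) / doc_id / f"{doc_id}{SUFFIX}"


def missing_outputs(root, doc_ids):
    """List the document IDs that have no output file under `root`."""
    return [doc_id for doc_id in doc_ids
            if not output_path(root, doc_id).is_file()]


print(output_path("out", "ID004_clinic_010").as_posix())
# out/ID004_clinic_010/ID004_clinic_010.Temporal-Relation.system.completed.xml
```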

 

Trial

Start: June 1, 2016, midnight

Unsupervised Domain Adaptation

Start: Jan. 9, 2017, midnight

Supervised Domain Adaptation

Start: Jan. 17, 2017, midnight UTC

Post-competition

Start: Jan. 31, 2017, midnight UTC

Competition Ends

Never
