ICLR 2021 Workshop MLPCP Track 1 Entity-aware medical dialogue generation

Organized by WengeLiu

First phase

Phases A
March 1, 2021, midnight UTC

End

Competition Ends
May 1, 2021, midnight UTC

Attention!

The contest is in its second phase and the current submission site is https://competitions.codalab.org/competitions/30755#learn_the_detail.

New test data: https://drive.google.com/file/d/14UfBGSBaP9RIaDH75yBitCO7O7EXLajT/view

Overview

A medical dialogue system aims to generate context-consistent and medically meaningful responses conditioned on the dialogue history. In this track, we focus on entity-aware medical dialogue generation. Formally, given the dialogue history X={X_1,X_2,...,X_K} between the doctor and the patient, where X_K is the patient's last utterance, the goal of this task is to generate the doctor's next response X_{K+1} with as many correct entities as possible.

Data Description

MedDG is a large-scale entity-centric medical dialogue dataset related to 12 types of common gastrointestinal diseases, with more than 17K conversations and 385K utterances collected from the online health consultation community. Each conversation is annotated with five different categories of entities, including diseases, symptoms, attributes, tests, and medicines. For more details about this dataset, please refer to this preprint.

Dataset Examples

This is a dialogue example with entity annotation from the MedDG dataset. In the test stage, the input of the model is the dialogue history without any annotation, and the output is the doctor's next utterance.

[{'id': 'Patients', 'Sentence': '你好,肚脐周围隐隐作痛,不知道怎么回事(女,29岁)', 'Symptom': ['腹痛'], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []},
{'id': 'Doctor', 'Sentence': '你好,这种情况多长时间了?', 'Symptom': [], 'Medicine': [], 'Test': [], 'Attribute': ['时长'], 'Disease': []},
{'id': 'Patients', 'Sentence': '两三天了。', 'Symptom': [], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []},
{'id': 'Patients', 'Sentence': '隐隐作痛,疼一会就不疼了。', 'Symptom': [], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []},
{'id': 'Doctor', 'Sentence': '还有其他症状吗?恶心想吐吗。', 'Symptom': ['恶心', '呕吐'], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []},
{'id': 'Patients', 'Sentence': '没有。', 'Symptom': [], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []},
{'id': 'Doctor', 'Sentence': '是隐隐约约的疼吗。', 'Symptom': [], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []},
{'id': 'Patients', 'Sentence': '食欲也好的,稍微有点腹胀。', 'Symptom': ['腹胀'], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []},
{'id': 'Doctor', 'Sentence': '可能是胃肠功能紊乱。', 'Symptom': ['胃肠功能紊乱'], 'Medicine': [], 'Test': [], 'Attribute': [], 'Disease': []}]

Sample Input
{'history': ['你好,肚脐周围隐隐作痛,不知道怎么回事(女,29岁)',
'你好,这种情况多长时间了?',
'两三天了。',
'隐隐作痛,疼一会就不疼了。',
'还有其他症状吗?恶心想吐吗。',
'没有。',
'是隐隐约约的疼吗。',
'食欲也好的,稍微有点腹胀。']}

Sample Output
可能是胃肠功能紊乱。
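
As a rough illustration of how the annotated format relates to the test-time input, the following Python sketch drops the entity annotations from a dialogue and splits it into a history and a reference response. The function name `to_test_pair` is hypothetical and not part of the official code; the field names `id` and `Sentence` and the `history` key are taken from the examples above, and the dialogue used here is an abridged copy of the example.

```python
from typing import Dict, List, Tuple


def to_test_pair(dialogue: List[dict]) -> Tuple[Dict[str, List[str]], str]:
    """Split an annotated dialogue into (model input, reference response).

    Hypothetical helper: assumes the last turn is the doctor utterance
    to be generated and every earlier turn belongs to the history.
    """
    if not dialogue or dialogue[-1]["id"] != "Doctor":
        raise ValueError("expected the dialogue to end with a doctor utterance")
    # Keep only the raw text; entity annotations are not visible at test time.
    sentences = [turn["Sentence"] for turn in dialogue]
    return {"history": sentences[:-1]}, sentences[-1]


# Abridged version of the annotated example (some turns and fields omitted).
dialogue = [
    {"id": "Patients", "Sentence": "你好,肚脐周围隐隐作痛,不知道怎么回事(女,29岁)", "Symptom": ["腹痛"]},
    {"id": "Doctor", "Sentence": "你好,这种情况多长时间了?", "Attribute": ["时长"]},
    {"id": "Patients", "Sentence": "两三天了。", "Symptom": []},
    {"id": "Doctor", "Sentence": "可能是胃肠功能紊乱。", "Symptom": ["胃肠功能紊乱"]},
]

model_input, reference = to_test_pair(dialogue)
```

Here `model_input` has the same shape as the sample input, and `reference` is the utterance a submission would be scored against.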

How to start

You can download the dataset at Google Drive.

You can refer to the code at bert-gpt.

Evaluation Metrics

For this task, we use three metrics for evaluation: BLEU-1, BLEU-4, and Entity-F1. The final score is the average of these three metrics.

  • BLEU-1 and BLEU-4 measure response generation quality, as described in (Chen and Cherry, 2014).
  • Entity-F1 measures entity correctness: it is the F1 score between the entities predicted in the generated response and the gold entities.
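
To make the entity metric concrete, here is a minimal Python sketch of an F1 score over entity sets and of the final averaging step. It is not the official scoring script. It assumes that entities have already been extracted from each generated response, that the score is micro-averaged over all responses, and that the three averaged metrics are BLEU-1, BLEU-4, and Entity-F1. The function names `entity_f1` and `final_score` are hypothetical.

```python
from typing import Iterable, Set


def entity_f1(predicted: Iterable[Set[str]], gold: Iterable[Set[str]]) -> float:
    """Micro-averaged F1 between predicted and gold entity sets.

    Assumption: one entity set per response; the official evaluation
    may aggregate differently (for example, per response or per category).
    """
    true_pos = n_pred = n_gold = 0
    for pred_set, gold_set in zip(predicted, gold):
        true_pos += len(pred_set & gold_set)  # entities that match the gold set
        n_pred += len(pred_set)
        n_gold += len(gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


def final_score(bleu_1: float, bleu_4: float, f1: float) -> float:
    """Plain average of the three metrics (assumed to share the same scale)."""
    return (bleu_1 + bleu_4 + f1) / 3


# Toy example with two responses.
predicted = [{"恶心", "呕吐"}, {"腹胀"}]
gold = [{"恶心"}, {"腹胀", "腹痛"}]
score = entity_f1(predicted, gold)  # precision = recall = 2/3
```

In this toy example two of the three predicted entities are correct and two of the three gold entities are recovered, so precision, recall, and F1 all equal 2/3.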

Submission Format

The results should be packed into a single zip file. An example zip file is available in track1_submit.zip.

 

Terms and Conditions

General Rules

  • To ensure fairness, the top 3 winners in each track are required to send supplementary materials to sqrt3tjh@gmail.com, including the source code of their submission and an instruction file for running the code.
  • Each entry must be associated with a team and its affiliation.
  • Using multiple accounts to increase the number of submissions is strictly prohibited.
  • Results should follow the correct format and must be uploaded to the evaluation server through the CodaLab competition site. Detailed information about how results will be evaluated is provided on the evaluation page.
  • The best entry of each team will be publicly visible on the leaderboard at all times.
  • The organizer reserves the absolute right to disqualify entries that are incomplete or illegible, late entries, or entries that violate the rules.
  • Please note that each team has a maximum of three members.

Datasets

The datasets are released for academic research only and are free to researchers from educational or research institutions for non-commercial purposes. By downloading the dataset, you agree not to reproduce, duplicate, copy, sell, trade, resell, or exploit, for any commercial purposes, any portion of the images or any portion of derived data.

Contact Us

For more information, please contact us at liuwg8@mail2.sysu.edu.cn or sqrt3tjh@gmail.com.

In addition, we have formed a WeChat group to facilitate discussion. Please add the WeChat account "sqrt3tjh" or "kzllwg" to join the group.
