Welcome to DSTC 9: Multi-Domain Task-Completion Dialog Challenge II
As part of the Ninth Dialog System Technology Challenge (DSTC9), Microsoft Research and Tsinghua University are hosting Multi-domain Task-oriented Dialog Challenge II, aiming to solve two tasks in the multi-domain task completion setting:
- End-to-end Multi-domain Task Completion Dialog: In recent years there has been increasing interest in building complex task completion bots that span multiple domains. In this task, participants will develop an end-to-end dialog system that receives natural language as input and generates natural language as output in the travel planning setting. There is no restriction on the modeling approaches, and all resources/datasets in the community can be used for model training. The system will be evaluated in the MultiWOZ 2.1 dataset setting with ConvLab-2.
- Cross-lingual Multi-domain Dialog State Tracking: Building a dialog system that handles multiple languages becomes increasingly important with the rapid process of globalization. To advance state-of-the-art technologies in handling cross-lingual multi-domain dialogs, we offer the task of building cross-lingual dialog state trackers with a training set in a resource-rich language and dev/test sets in a resource-poor language. This task consists of two sub-tasks. One uses English as the resource-rich language and Chinese as the resource-poor language on the MultiWOZ dataset, and the other uses Chinese as the resource-rich language and English as the resource-poor language on the CrossWOZ dataset.
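To make the dialog state tracking task more concrete, here is a minimal Python sketch of what a tracker maintains across turns. The state layout, the slot names, and the `joint_accuracy` helper are illustrative assumptions: the challenge text above does not specify the data format or the scoring metric, and exact-match accuracy over the full state is only one commonly used way to score a tracker.

```python
from typing import Dict, List, Tuple

# Assumed representation: a dialog state maps (domain, slot) pairs to values.
DialogState = Dict[Tuple[str, str], str]


def update_state(state: DialogState, turn_updates: DialogState) -> DialogState:
    """Return a new state with this turn's slot values merged in."""
    new_state = dict(state)  # copy so earlier turns stay unchanged
    new_state.update(turn_updates)
    return new_state


def joint_accuracy(predicted: List[DialogState], gold: List[DialogState]) -> float:
    """Fraction of turns whose predicted state matches the gold state exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same number of turns")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical two-turn dialog spanning two domains.
turn1 = update_state({}, {("hotel", "area"): "north"})
turn2 = update_state(turn1, {("restaurant", "food"): "chinese"})

# A tracker that misses the second turn's update gets one of two turns right.
score = joint_accuracy([turn1, turn1], [turn1, turn2])
```

In the cross-lingual setting, the same state structure would be predicted from utterances in the resource-poor language, while the training dialogs are in the resource-rich language.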
- Jun 15, 2020: Competition Starts
- Sep 21, 2020: Test data is released
- Oct 5, 2020: Entry submission deadline
- Oct 19, 2020: Results announced
- Nov 2020: Paper submission deadline
Jinchao Li, Qi Zhu, Baolin Peng, Zheng Zhang, Shahin Shayandeh, Ryuichi Takanobu, Swadheen Shukla, Runze Liang, Lars Liden, Minlie Huang, Jianfeng Gao
You can contact all the contest organizers at email@example.com, or reach Jinchao Li (firstname.lastname@example.org) or Qi Zhu (email@example.com) directly.
We provide ConvLab-2, the next-generation dialog development platform built on ConvLab, to help participants develop their systems more efficiently. ConvLab-2 inherits the framework and models from ConvLab and adds new features, including recent state-of-the-art models, an analysis tool, and an interactive tool. ConvLab-2 provides the following functionality:
- Toolkit for building dialog systems with both conventional pipeline approaches and end-to-end approaches. The platform includes state-of-the-art models for NLU, dialog state tracking, policy, and NLG, as well as end-to-end models. The interfaces between modules, the knowledge base, and backend systems are designed to support multiple datasets so that API calls to the knowledge base can be made easily.
- Toolkit for dialog system evaluation using both automatic and human evaluation. For automatic evaluation, it provides end-to-end user simulators and evaluators for individual modules and for end-to-end dialog systems. For human evaluation, it provides tools for interacting with human judges on Amazon Mechanical Turk.
- Toolkit for system diagnosis. It consists of an interactive interface that not only illustrates the output of each module but also enables users to modify the result and diagnose the end-to-end performance with new outputs. It also consists of an analysis tool that considers statistics extracted from the conversations between the user simulator and the dialog system.
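The pipeline approach mentioned above chains four modules: NLU, dialog state tracking, policy, and NLG. The following self-contained Python sketch illustrates that data flow with toy rule-based modules. It is not the ConvLab-2 API: every function name, the dialog act format, and the single hotel domain are assumptions made for illustration. The returned `trace` shows each module's output, in the spirit of the diagnosis tooling described above.

```python
import re
from typing import Dict, List, Tuple

DialogAct = Tuple[str, str, str, str]  # (intent, domain, slot, value)
State = Dict[str, Dict[str, str]]      # domain -> slot -> value

AREAS = {"north", "south", "east", "west", "centre"}
PRICES = {"cheap", "moderate", "expensive"}
REQUIRED_HOTEL_SLOTS = ("area", "price")


def nlu(utterance: str) -> List[DialogAct]:
    """Toy keyword NLU: map user text to 'inform' dialog acts."""
    acts = []
    for word in re.findall(r"[a-z]+", utterance.lower()):
        if word in AREAS:
            acts.append(("inform", "hotel", "area", word))
        elif word in PRICES:
            acts.append(("inform", "hotel", "price", word))
    return acts


def track(state: State, acts: List[DialogAct]) -> State:
    """Dialog state tracker: merge informed slots into a copy of the state."""
    new_state = {domain: dict(slots) for domain, slots in state.items()}
    for intent, domain, slot, value in acts:
        if intent == "inform":
            new_state.setdefault(domain, {})[slot] = value
    return new_state


def policy(state: State) -> List[DialogAct]:
    """Rule policy: request the first missing slot, otherwise make an offer."""
    hotel = state.get("hotel", {})
    for slot in REQUIRED_HOTEL_SLOTS:
        if slot not in hotel:
            return [("request", "hotel", slot, "?")]
    return [("offer", "hotel", "none", "none")]


def nlg(acts: List[DialogAct]) -> str:
    """Template NLG: turn system dialog acts into text."""
    parts = []
    for intent, domain, slot, _ in acts:
        if intent == "request":
            parts.append(f"What {slot} would you like for the {domain}?")
        else:
            parts.append(f"I found a {domain} that matches your request.")
    return " ".join(parts)


def respond(utterance: str, state: State):
    """Run one turn through the pipeline and record each module's output."""
    user_acts = nlu(utterance)
    new_state = track(state, user_acts)
    system_acts = policy(new_state)
    response = nlg(system_acts)
    trace = {"nlu": user_acts, "state": new_state,
             "policy": system_acts, "nlg": response}
    return response, new_state, trace


reply1, state1, trace1 = respond("I need a hotel in the north", {})
reply2, state2, trace2 = respond("Something cheap please", state1)
```

An end-to-end system would replace the four functions with a single model that maps the user utterance and dialog history directly to a response.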
Both the MultiWOZ 2.1 and CrossWOZ datasets are incorporated into ConvLab-2 in their original languages. The interfaces for interacting with these datasets are fully supported in ConvLab-2, and multiple trained models associated with the datasets will be provided.
In the development stage of the challenge, we will translate parts of the MultiWOZ 2.1 and CrossWOZ datasets into Chinese and English, respectively, and release them to the participants as the development sets. In the test stage, unlabelled test sets will be released to the participants in the same languages as in the development stage.