DBDC5: Dialogue Breakdown Detection Challenge 5



Following the success of DBDC4, we are pleased to organize DBDC5 as part of the WOCHAT workshop held in Madrid, Spain, in conjunction with IWSDS2020.

Shared Task Overview

There are three tracks in this edition of DBDC:

  • Dialogue breakdown detection

The task of dialogue breakdown detection is to detect whether a system utterance causes a dialogue breakdown (a situation in which the user cannot continue the conversation) in a given dialogue context. Participants in this challenge develop a dialogue breakdown detector that outputs a dialogue breakdown label (B: breakdown, PB: possible breakdown, or NB: not a breakdown) together with a distribution over these labels. The challenge includes dialogue breakdown detection for both English and Japanese data.
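Since a detector must output both a single label and a distribution over the three labels, a natural way to obtain the label is to take the argmax of the distribution. The sketch below illustrates this; the function name and dictionary layout are illustrative assumptions, not the official DBDC5 submission format.

```python
def detect_breakdown(probs):
    """Pick the final breakdown label as the argmax of the distribution.

    probs: dict mapping "B", "PB", "NB" to probabilities summing to 1
           (hypothetical layout, not the official format).
    Returns (label, probs) so both required outputs are available.
    """
    label = max(probs, key=probs.get)
    return label, probs

label, dist = detect_breakdown({"B": 0.6, "PB": 0.3, "NB": 0.1})
print(label)  # -> B
```

A real detector would estimate the distribution from the dialogue context (e.g. with a classifier); only the final argmax step is shown here.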

  • Error category classification

The task of error category classification is to classify system utterances that led to dialogue breakdowns into one or more error categories describing the causes of the breakdowns. We defined 16 categories [1], and multiple categories can be annotated for a single utterance. This track has only Japanese data; there is no English data. Five annotators annotated each system utterance (those annotated with a majority of PB/B labels) with error categories. As evaluation metrics, we use exact match (EM) and F1.
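For a multi-label task like this, EM scores 1 only when the predicted category set equals the gold set exactly, while F1 is the harmonic mean of precision and recall over the two sets. A minimal per-utterance sketch of both metrics follows (the official evaluation script may aggregate across utterances and annotators differently):

```python
def exact_match(gold, pred):
    """1.0 if the predicted category set equals the gold set, else 0.0."""
    return 1.0 if set(gold) == set(pred) else 0.0

def f1(gold, pred):
    """Set-based F1 between gold and predicted error categories."""
    gold, pred = set(gold), set(pred)
    if not gold and not pred:
        return 1.0  # both empty: treat as a perfect match
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, with gold categories {A, B} and prediction {B, C}, precision and recall are both 0.5, so F1 is 0.5 while EM is 0.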

  • Recovery response generation

In this track, participants are required to build a response generator or selector that provides new responses aimed at correcting, or recovering from, a dialogue breakdown. This track includes 600 dialogues in English from past Dialogue Breakdown Detection Challenges.


How to register, download data and submit results

You can register at https://my.chateval.org/accounts/login/. Once registered, you will be able to download the datasets and readme documents, as well as submit your results, at https://chateval.org/shared_task


Information about the tracks

Any updates will be posted at the official website:

http://workshop.colips.org/wochat/@iwsds2020/index.html


Contact

If you have further questions regarding the data, please let us know by the following email address: dbdc5-admin@googlegroups.com


Organizers:

  • Ryuichiro Higashinaka (NTT)
  • Yuiko Tsunomori (NTT Docomo)
  • Tetsuro Takahashi (Fujitsu Laboratories Ltd.)
  • Hiroshi Tsukahara (Denso IT Laboratories)
  • Masahiro Araki (Kyoto Institute of Technology)
  • João Sedoc (University of Pennsylvania)
  • Rafael Banchs (NTU)
  • Luis F. D'Haro (Universidad Politécnica de Madrid)

References

[1] Higashinaka R., Araki M., Tsukahara H., Mizukami M. (2019) Improving Taxonomy of Errors in Chat-Oriented Dialogue Systems. In: D'Haro L., Banchs R., Li H. (eds) 9th International Workshop on Spoken Dialogue System Technology. Lecture Notes in Electrical Engineering, vol 579. Springer, Singapore

 


Schedule

  • Feb/15 development data distribution @ chateval.org
  • Mar/31 (Tue) registration deadline
  • Apr/10 test data distribution
  • Apr/15 (Wed) run submission deadline
  • Apr/15-20 (Mon) evaluation
  • Apr/20 notification
  • May/18-20 event (online proceedings)

Human Evaluation Results

Dialogue Breakdown Detection Challenge

The DBDC dataset consists of a series of text-based conversations between a human and a chatbot where the human was aware they were chatting with a computer (Higashinaka et al. 2016).


Automatic Evaluation Results

Dialogue Breakdown Detection Challenge


