28(1) / 2023/6 / pp. 19-34
多領域任務導向一對一用戶對話收集系統
A Dialogue Collection System for One-to-One Multi-Domain Task Oriented Dialogs
Authors
葉丞鴻 Cheng-Hung Yeh (Department of Computer Science & Information Engineering, National Central University)
李聿鎧 Yu-Kai Lee (Department of Computer Science & Information Engineering, National Central University)
張嘉惠 Chia-Hui Chang* (Department of Computer Science & Information Engineering, National Central University)
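Chinese Abstract (English translation)
Dialogue systems such as customer service systems, chatbots, and smart speakers need labeled dialogue corpora for model training. How to collect such corpora quickly and effectively is therefore a problem that anyone building a dialogue system must face. Existing task-oriented systems focus mainly on restaurant, hotel, and flight reservations; no dialogue corpus yet covers the transactional services of a personal virtual assistant, such as sending messages and creating events. Following the dialogue collection method of CrossWOZ, this paper uses task generation and a dialogue website interface to let annotators simulate conversations between a user and a virtual assistant, building a dialogue corpus (called MsgWOZ) that covers three services: handling email, managing calendars, and delivering messages. We hope this corpus lays a foundation for the development of Chinese virtual assistant dialogue systems. The annotation system and dataset are open-sourced at https://github.com/TedYeh/messageWOZ.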
English Abstract
Task-oriented dialogue systems require labeled corpora for model training. When new services are introduced, however, collecting dialogue corpora effectively becomes a problem that must be addressed in building a dialogue system. Existing task-oriented systems focus mainly on reservations for restaurants, hotels, and airline tickets; there is no dialogue corpus for virtual assistants that provide transactional services such as sending messages and creating events. This paper follows the CrossWOZ approach to dialogue data collection, letting annotators simulate user and virtual assistant dialogue scenarios through a dialogue website interface, and produces a dialogue dataset covering three services: email management, calendar management, and message delivery. This corpus is expected to lay the foundation for developing Chinese virtual assistant dialogue systems. The annotation system and dataset have been open-sourced at https://github.com/TedYeh/messageWOZ.
Keywords
Task-Oriented Dialogue Systems (TOD); Dialogue Corpus Construction; Wizard-of-Oz (WOZ); Msg-WOZ