Taiwan Academic Journals Open Access Platform

26(1) / 2021 / 6 / pp. 33-42

2020 福爾摩沙臺語語音辨識比賽之初步實驗

A Preliminary Study of Formosa Speech Recognition Challenge 2020 - Taiwanese ASR

Authors

Fu-Hao Yu

(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)

Ke-Han Lu

(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)

Yi-Wei Wang

(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)

Wei-Zhe Chang

(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)

Wei-Kai Huang

(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)

Kuan-Yu Chen *

(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)

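Chinese Abstract (translated)

To examine how well current deep learning speech recognition models perform on Taiwanese speech recognition with Taiwanese Southern Min Recommended Characters (臺文) and Taiwan Minnanyu Luomazi Pinyin (臺羅拼音) as output, this study uses the Taiwanese speech corpus (TAT-Vol1) provided by the Formosa Speech Recognition Challenge 2020 (FSR-2020), together with the training corpus from the Public Television Service Taiwanese-language channel (公視臺語台). Building on ESPnet and Kaldi, we compare several model architectures, training methods, and parameter settings for Taiwanese speech recognition. In the final FSR-2020 results, our system obtained an error rate of 66.1% in Taiwanese Southern Min Recommended Characters recognition (Track2) and an error rate of 28.6% in Taiwan Minnanyu Luomazi Pinyin recognition (Track3).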

English Abstract

To study the effectiveness of current deep learning-based speech recognition models on the tasks of recognizing Taiwanese Southern Min Recommended Characters and Taiwan Minnanyu Luomazi Pinyin, this study uses the corpora provided by the Formosa Speech Recognition Challenge 2020 (FSR-2020) to evaluate several neural ASR systems built with the ESPnet and Kaldi toolkits. In the challenge, our system achieved an error rate of 66.1% in Taiwanese Southern Min Recommended Characters recognition (Track2) and an error rate of 28.6% in Taiwan Minnanyu Luomazi Pinyin recognition (Track3).

Chinese Keywords

臺文; 臺羅拼音; 臺語語音辨識

English Keywords

Taiwanese Southern Min Recommended Characters; Taiwan Minnanyu Luomazi Pinyin; Taiwanese ASR
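
The abstract reports error rates for Track2 and Track3 without stating how they are computed. ASR error rates of this kind are commonly defined as the token-level edit distance between the reference and the recognized output, divided by the reference length. The sketch below illustrates that common definition only; the tokenization (characters or syllables), the function names, and the toy inputs are assumptions for illustration and are not taken from the paper or the challenge's scoring rules.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences.

    Counts the minimum number of substitutions, deletions, and
    insertions needed to turn `ref` into `hyp`.
    """
    # prev[j] holds the distance between the first i-1 reference
    # tokens and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


# Toy inputs (hypothetical, not from any corpus): the hypothesis
# drops one of four reference tokens.
reference = ["s1", "s2", "s3", "s4"]
hypothesis = ["s1", "s2", "s4"]
rate = error_rate(reference, hypothesis)
print(f"{rate:.2%}")
```

Under this definition the rate can exceed 100% when the hypothesis contains many insertions, which is why it is reported as an error rate rather than as one minus an accuracy.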