26(1) / 2021/6 / pp. 33-42
2020 福爾摩沙臺語語音辨識比賽之初步實驗
A Preliminary Study of Formosa Speech Recognition Challenge 2020 - Taiwanese ASR
Authors
Fu-Hao Yu
(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
Ke-Han Lu
(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
Yi-Wei Wang
(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
Wei-Zhe Chang
(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
Wei-Kai Huang
(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
Kuan-Yu Chen*
(Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
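Chinese Abstract
To study how well current deep-learning speech recognition models perform on recognition of Taiwanese Southern Min Recommended Characters (臺文) and Taiwan Minnanyu Luomazi Pinyin (臺羅拼音), this study uses the Taiwanese speech corpus (TAT-Vol1) provided by the Formosa Speech Recognition Challenge 2020 (FSR-2020), together with training data from the PTS Taiwanese channel (公視臺語台). Building on ESPnet and Kaldi, we compare several model architectures, training methods, and parameter settings for Taiwanese speech recognition. In the FSR-2020 challenge, our system achieved an error rate of 66.1% in Taiwanese Southern Min Recommended Characters recognition (Track2) and an error rate of 28.6% in Taiwan Minnanyu Luomazi Pinyin recognition (Track3).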
English Abstract
To study the effectiveness of current deep learning-based speech recognition models on the recognition of Taiwanese Southern Min Recommended Characters and Taiwan Minnanyu Luomazi Pinyin, this study uses the corpora provided by the Formosa Speech Recognition Challenge 2020 (FSR-2020) to evaluate several neural ASR systems built with the ESPnet and Kaldi toolkits. In the challenge, our system achieved a 66.1% error rate in Taiwanese Southern Min Recommended Characters recognition (Track2) and a 28.6% error rate in Taiwan Minnanyu Luomazi Pinyin recognition (Track3).
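Chinese Keywords
臺文 (Taiwanese Southern Min Recommended Characters); 臺羅拼音 (Taiwan Minnanyu Luomazi Pinyin); 臺語語音辨識 (Taiwanese speech recognition)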
English Keywords
Taiwanese Southern Min Recommended Characters; Taiwan Minnanyu Luomazi Pinyin; Taiwanese ASR