28(1) / 2023 / 6 / pp. 53-68
公平語音情緒辨識的雙階段學習策略
A Two-Stage Learning Strategy for Fair Speech Emotion Recognition
Authors
簡婉軒 Woan-Shiuan Chien *
(國立清華大學電機工程學系 Department of Electrical Engineering, National Tsing Hua University)
李祈均 Chi-Chun Lee
(國立清華大學電機工程學系 Department of Electrical Engineering, National Tsing Hua University)
Abstract
Speech Emotion Recognition (SER) is a key technology in many speech-based solutions. A fairness issue unique to SER stems from the inherent emotional perception bias in the data labels provided by raters. To improve both recognition performance and fairness in SER, rater bias must be addressed first. In this study, we propose a two-stage framework. In the first stage, we generate debiased representations using a fairness-constrained adversarial framework. In the second stage, after gender-wise perceptual learning, users can switch freely between specific gender-wise perceptions as needed. We evaluate our results with two important fairness metrics and show that the distributions and predictions are fair across genders. Further analysis indicates that our model effectively mitigates the influence of gender perspectives in the feature learning space.
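The abstract does not name the two fairness metrics it uses. As an illustration only, the sketch below computes two commonly used group-fairness gaps, statistical parity and equal opportunity, on toy binary predictions. The choice of these two metrics, the function names, and the data are all assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch: two common group-fairness gaps on toy data.
# ASSUMPTION: the paper's actual metrics are not named in the abstract;
# statistical parity and equal opportunity are stand-ins chosen here.

def positive_rate(preds, groups, group):
    """Fraction of samples in `group` predicted as the positive class (1)."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    return sum(selected) / len(selected)


def true_positive_rate(preds, labels, groups, group):
    """Fraction of truly positive samples in `group` predicted as positive."""
    selected = [p for p, y, g in zip(preds, labels, groups)
                if g == group and y == 1]
    return sum(selected) / len(selected)


def statistical_parity_gap(preds, groups, a, b):
    """Absolute difference in positive prediction rates between groups a and b."""
    return abs(positive_rate(preds, groups, a) - positive_rate(preds, groups, b))


def equal_opportunity_gap(preds, labels, groups, a, b):
    """Absolute difference in true positive rates between groups a and b."""
    return abs(true_positive_rate(preds, labels, groups, a)
               - true_positive_rate(preds, labels, groups, b))


# Hypothetical toy data: 1 = target emotion predicted/present, 0 = otherwise.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
labels = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

sp_gap = statistical_parity_gap(preds, groups, "F", "M")
eo_gap = equal_opportunity_gap(preds, labels, groups, "F", "M")
print("statistical parity gap:", sp_gap)
print("equal opportunity gap:", eo_gap)
```

A gap of zero on either measure would mean the two groups are treated identically by that criterion; larger values indicate a larger disparity.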
Keywords
Speech Emotion Recognition; Rater Bias; Fair Representation; Perceptual Fairness