Title | Automatic Pronunciation Assessment for Mandarin Chinese: Approaches and System Overview |
---|---|
Volume:Issue | 12:4 |
Authors | Chen, Jiang-chun; Jang, Jyh-shing Roger; Tsai, Te-lu |
Pages | 443-458 |
Keywords | CAPT, CALL, Speech Recognition, Tone Recognition, Speech Assessment, Phoneme, Downhill Simplex Method, Mandarin Chinese, GMM, Intensity, Rhythm, Forced Alignment, THCI Core |
Publication date | 2007-12 |
This paper presents the algorithms used in a prototype software system for automatic pronunciation assessment of Mandarin Chinese. The system uses HMM-based (hidden Markov model) forced alignment to identify each syllable and its log probability, which is used for phoneme assessment through a ranking-based confidence measure. The pitch vector of each syllable is then sent to a GMM (Gaussian mixture model) for tone recognition and assessment. We also compute similarity scores for intensity and rhythm between the target and test utterances. All four scores (phoneme, tone, intensity, and rhythm) are parametric functions with free parameters. The overall scoring function is then formulated as a linear combination of these four scoring functions. Since the overall scoring function involves both linear and nonlinear parameters, we employ the downhill simplex search to fine-tune these parameters so that the system approximates the scores given by a human expert. Experimental results show that the system gives consistent scores that are close to a human expert's subjective evaluation.
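
The parameter-tuning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sub-score form `100 / (1 + a^2 * d)`, the names `sub_score`, `overall_score`, and `nelder_mead`, the starting point, and all of the data are assumptions made for illustration, and the "expert" scores are synthetic values generated from made-up parameters. The sketch only shows how a downhill simplex (Nelder-Mead) search can fit the linear weights and nonlinear shape parameters together by minimizing the squared error against reference scores.

```python
# Hypothetical sub-score: maps a non-negative distance d to (0, 100].
# The functional form is an assumption, not taken from the paper.
def sub_score(d, a):
    return 100.0 / (1.0 + a * a * d)

# Overall score: linear combination of four parametric sub-scores
# (phoneme, tone, intensity, rhythm). params = 4 weights + 4 shapes.
def overall_score(params, dists):
    weights, shapes = params[:4], params[4:]
    return sum(w * sub_score(d, a) for w, d, a in zip(weights, dists, shapes))

# Made-up per-utterance distances: [phoneme, tone, intensity, rhythm].
DISTANCES = [
    [0.1, 0.2, 0.3, 0.1], [0.5, 0.1, 0.2, 0.4], [1.0, 0.8, 0.5, 0.6],
    [0.2, 0.9, 0.1, 0.3], [0.7, 0.3, 0.9, 0.2], [1.5, 1.2, 1.0, 0.9],
    [0.3, 0.4, 0.6, 1.1], [0.9, 0.6, 0.4, 0.7], [0.05, 0.05, 0.1, 0.05],
    [1.2, 0.2, 0.7, 1.4],
]
# Synthetic stand-in for expert ratings, generated from made-up parameters.
TRUE_PARAMS = [0.4, 0.3, 0.2, 0.1, 1.0, 0.5, 2.0, 1.5]
EXPERT = [overall_score(TRUE_PARAMS, d) for d in DISTANCES]

def loss(params):
    # Mean squared error between system scores and reference scores.
    errs = [(overall_score(params, d) - t) ** 2 for d, t in zip(DISTANCES, EXPERT)]
    return sum(errs) / len(errs)

def nelder_mead(f, x0, step=0.5, max_iter=2000, tol=1e-10):
    # Simplified downhill simplex: reflect, expand, contract inside, shrink.
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        if abs(values[-1] - values[0]) < tol:
            break
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        fr = f(reflected)
        if fr < values[0]:
            expanded = along(2.0)
            fe = f(expanded)
            if fe < fr:
                simplex[-1], values[-1] = expanded, fe
            else:
                simplex[-1], values[-1] = reflected, fr
        elif fr < values[-2]:
            simplex[-1], values[-1] = reflected, fr
        else:
            contracted = along(-0.5)
            fc = f(contracted)
            if fc < values[-1]:
                simplex[-1], values[-1] = contracted, fc
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]

start = [0.25] * 4 + [1.0] * 4  # arbitrary starting guess
initial_loss = loss(start)
best_params, final_loss = nelder_mead(loss, start)
print("loss before:", initial_loss, "after:", final_loss)
```

A derivative-free search of this kind suits the problem because the nonlinear shape parameters make a closed-form least-squares fit unavailable, while the parameter count stays small enough for a simplex search to be practical.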