Article Details

電腦與通訊

Title: 應用於文字轉語音系統的語者調適方法回顧
Issue: 139
Parallel title: The Review of Speaker Adaptation Methods for Text-to-Speech Systems
Authors: 林政源; 黃柏凱
Pages: 052-059
Keywords: 文字轉語音; 語者調適; 語者正規化之語音模型訓練; 隱藏式半馬可夫模型; 最大可能性線性迴歸; Text to speech; TTS; Speaker adaptation; SA; Speaker adaptive training; SAT; Hidden semi-Markov model; HSMM; Maximum likelihood linear regression; MLLR
Publication date: 2011-06

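Chinese abstract (translated)

For text-to-speech applications, speaker adaptation methods can be used to build a speech synthesis system for a specific speaker effectively. These methods cover several aspects, such as the training of voice models, speaker adaptation algorithms, and speaker-normalized training. For example, voice models can be trained as HMMs (hidden Markov models) or HSMMs (hidden semi-Markov models); adaptation algorithms include maximum likelihood linear regression and maximum a posteriori estimation; and, to reduce the influence of spectral and prosodic differences among training speakers, speaker adaptive training can be used for speaker-normalized training. This paper introduces the methods adopted in recent years by speaker-adaptation-based text-to-speech systems and the ways in which the performance of such systems is evaluated.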

English abstract

For text-to-speech (TTS) applications, a TTS system for a specific speaker can be constructed efficiently with speaker adaptation (SA), which involves several aspects: the training of voice models, the SA algorithm, and the normalization of speakers. For example, voice models can be trained as either HMMs (hidden Markov models) or HSMMs (hidden semi-Markov models). SA algorithms include maximum likelihood linear regression (MLLR) and maximum a posteriori (MAP) estimation, among others. To reduce the influence of spectral and prosodic differences among training speakers, speaker adaptive training (SAT) is adopted to normalize the speakers. This study introduces the approaches used in SA-based TTS systems in recent years and the corresponding evaluations of their performance.
