HyRead Journal Taiwan Full-Text Database

Article Details

電腦與通訊 (Computer and Communications)

Natural Sciences / Information / Technology

Title	應用於文字轉語音系統的語者調適方法回顧
Issue	139
Parallel title	The Review of Speaker Adaptation Methods for Text-to-Speech Systems
Authors	林政源、黃柏凱
Pages	052-059
Keywords	Text to speech, TTS, Speaker adaptation, SA, Speaker adaptive training, SAT, Hidden semi-Markov model, HSMM, Maximum likelihood linear regression, MLLR
Publication date	2011-06

Chinese Abstract

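For text-to-speech applications, speaker adaptation makes it possible to build a speech synthesis system for a specific speaker efficiently. The approach spans several aspects, such as voice model training, speaker adaptation algorithms, and speaker-normalized training. For example, voice models can be trained as HMMs (hidden Markov models) or HSMMs (hidden semi-Markov models); adaptation algorithms include maximum likelihood linear regression and maximum a posteriori estimation; and, to reduce the influence of spectral and prosodic differences among training speakers, speaker adaptive training can be used for speaker-normalized training. This paper introduces the methods adopted in recent years by speaker-adaptation-based text-to-speech systems and the ways in which their performance is evaluated.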

English Abstract

For text-to-speech (TTS) applications, a TTS system for a specific speaker can be constructed efficiently by using speaker adaptation (SA), which involves several aspects, such as the training of voice models, the SA algorithms, and the normalization of speakers. For example, voice models can be trained in two ways: as HMMs (hidden Markov models) or as HSMMs (hidden semi-Markov models). For the speaker adaptation algorithm, there are various approaches, e.g., maximum likelihood linear regression and maximum a posteriori estimation. To reduce the influence of the spectral and prosodic differences among training speakers, speaker adaptive training is adopted for the normalization of speakers. This study introduces the approaches used in SA-based TTS systems in recent years and the corresponding evaluations of their performance.
