文章詳目資料

International Journal of Computational Linguistics And Chinese Language Processing THCI

  • 加入收藏
  • 下載文章
篇名 Some Studies on Min-Nan Speech Processing
卷期 12:4
作者 Kuo, Wei-chihHo, Chen-chungZhong, Xiang-ruiLiang, Zhen-fengChen, Sin-horngWang, Yih-ruYu, Hsiu-min
頁次 391-410
關鍵字 Min-Nan Text-to-Speech SystemSpeech RecognitionModel-Based Tone LabelingTHCI Core
出刊日期 200712

中文摘要

英文摘要

In this paper, three studies of Min-Nan speech processing are presented. The first study concerns the implementation of a high-performance Min-Nan TTS system. On the basis of the waveform templates of 877 base-syllables used as basic synthesis units and through the application of the RNN-based prosody generation method and the PSOLA algorithm for prosody modification, this Min-Nan TTS system can convert texts, represented in both Han-Luo (漢羅) and Chinese logographic writing systems, into natural Min-Nan speech. An informal, subjective listening test confirms that the system performs well and the synthetic speech sounds natural for well-tokenized Min-Nan texts and for automatically tokenized Chinese logographic texts. The second investigation concerns the realization of a
Min-Nan speech recognizer. It adopts the initial-final-based HMM approach with a simple base-syllable bigram language model. A base-syllable recognition rate of 65.1% has been achieved. Finally, a model-based tone labeling method is presented. This method adopts a statistical model to eliminate the affections of all factors other than tone on the syllable pitch contour for automatic tone labeling. Experimental results confirm that this method outperforms the conventional VQ-based approach.

相關文獻