Article Details

International Journal of Computational Linguistics And Chinese Language Processing THCI

Title: Modeling Cantonese Pronunciation Variations for Large-Vocabulary Continuous Speech Recognition
Volume/Issue: 11:1
Authors: Lee, Tan; Kam, Patgi; Soong, Frank K.
Pages: 17-35
Keywords: Automatic Speech Recognition; Cantonese; Pronunciation Variation; THCI Core
Publication date: 2006-03

English Abstract

This paper presents different methods of handling pronunciation variations in Cantonese large-vocabulary continuous speech recognition (LVCSR). In an LVCSR system, three knowledge sources are involved: a pronunciation lexicon, acoustic models and language models. In addition, a decoding algorithm is used to search for the most likely word sequence. Pronunciation variation can be handled by explicitly modifying the knowledge sources or by improving the decoding method. Two types of pronunciation variation are defined, namely, phone changes and sound changes. A phone change means that one phoneme is realized as another phoneme. A sound change happens when the acoustic realization is ambiguous between two phonemes. Phone changes are handled either by constructing a pronunciation variation dictionary that includes alternative pronunciations at the lexical level or by dynamically expanding the search space to include those pronunciation variants. Sound changes are handled by adjusting the acoustic models through sharing or adaptation of the Gaussian mixture components. Experimental results show that the use of a pronunciation variation dictionary and the method of dynamic search space expansion can improve speech recognition performance substantially. The methods of acoustic model refinement were found to be relatively less effective in our experiments.
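
As a rough illustration of the lexical-level idea described in the abstract, the following Python sketch expands a baseline lexicon into a pronunciation variation dictionary by applying phone-change rules. It is not the authors' implementation. The function names, the rule table, the phone labels and the sample entry are all hypothetical, chosen only to show how alternative pronunciations could be enumerated.

```python
from itertools import product


def expand_pronunciation(phones, rules):
    """Return the canonical pronunciation followed by every variant.

    `phones` is a list of phone labels; `rules` maps a canonical phone
    to the list of phones it may be realized as (a phone change).
    """
    # For each position, the canonical phone comes first, then its alternatives.
    options = [[p] + rules.get(p, []) for p in phones]
    # The Cartesian product enumerates every combination; the first one
    # is always the all-canonical pronunciation.
    return [list(variant) for variant in product(*options)]


def build_variation_dictionary(lexicon, rules):
    """Map each word to all of its pronunciations under the given rules."""
    return {word: expand_pronunciation(phones, rules)
            for word, phones in lexicon.items()}


# Hypothetical data for illustration only (not taken from the paper):
# a single rule allowing the phone "n" to be realized as "l".
example_rules = {"n": ["l"]}
example_lexicon = {"nei": ["n", "ei"]}

variation_dict = build_variation_dictionary(example_lexicon, example_rules)
for word, variants in variation_dict.items():
    print(word, variants)
```

A practical system would probably restrict or weight the variants, for example by how often each one is observed in training data, since adding every possible alternative tends to increase confusability between words. The abstract does not say how the variants were selected, so that step is left out here.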
