Article Details

International Journal of Computational Linguistics And Chinese Language Processing THCI

Title: Acoustic Model Optimization for Multilingual Speech Recognition
Volume/Issue: 13:3
Authors: Lyu, Dau-cheng; Hsu, Chun-nan; Chiang, Yuang-chin; Lyu, Ren-yuan
Pages: 363-385
Keywords: Cross-lingual Phone Set Optimization; Delta-BIC; Speech Recognition; THCI Core
Publication Date: 2008/09

Chinese Abstract

English Abstract

Because abundant resources are not always available for resource-limited languages, training an acoustic model with unbalanced training data for multilingual speech recognition is an interesting research issue. In this paper, we propose a three-step data-driven phone clustering method to train a multilingual acoustic model. The first step is to obtain a clustering rule for context-independent phone models, derived from a well-trained acoustic model using a similarity measurement. In the second step, we further cluster the sub-phone units using hierarchical agglomerative clustering with the delta Bayesian information criterion (delta-BIC) according to the clustering rules. In the third step, we apply a parametric modeling technique, model complexity selection, to adjust the number of Gaussian components in each Gaussian mixture so that the acoustic model is balanced between the new phoneme set and the available training data. We used an unbalanced trilingual corpus in which the training sets for Mandarin, Taiwanese, and Hakka account for about 60%, 30%, and 10% of the data, respectively. The experimental results show that the proposed sub-phone clustering approach achieved a relative syllable error rate reduction of 4.5% over the best result of the decision-tree-based approach and of 13.5% over the best result of the knowledge-based approach.
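
The abstract does not give the delta-BIC formula, so the following is only a minimal sketch of the commonly used form of the criterion, assuming each cluster is modeled by a single full-covariance Gaussian. The function name `delta_bic`, the `penalty_weight` parameter, and the small regularization constant are illustrative assumptions, not the authors' implementation. With this sign convention, a negative value suggests the two clusters can be merged, which is the test an agglomerative clustering loop would apply to each candidate pair.

```python
import numpy as np


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC between two sets of feature vectors (rows are frames).

    Assumption: each set is modeled by one full-covariance Gaussian.
    Positive result: two separate models fit better (keep apart).
    Negative result: one shared model is preferred (merge).
    """
    n_x, n_y = len(x), len(y)
    n = n_x + n_y
    d = x.shape[1]

    def logdet_cov(data):
        # Maximum-likelihood covariance, lightly regularized for stability.
        cov = np.cov(data, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    merged = np.vstack([x, y])
    # Log-likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (n * logdet_cov(merged)
                  - n_x * logdet_cov(x)
                  - n_y * logdet_cov(y))
    # Penalty for the extra mean vector and covariance matrix.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(500, 3))
    b = rng.normal(size=(500, 3))            # same distribution as a
    c = rng.normal(loc=5.0, size=(500, 3))   # clearly shifted mean
    print("similar pair:", delta_bic(a, b))
    print("distinct pair:", delta_bic(a, c))
```

In a hierarchical agglomerative setting, one would repeatedly merge the pair with the lowest delta-BIC and stop once every remaining pair scores above zero; how the paper restricts candidate pairs via its clustering rules is not specified in the abstract.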

Related Literature