文章詳目資料

International Journal of Computational Linguistics And Chinese Language Processing THCI

  • 加入收藏
  • 下載文章
篇名 Improved Minimum Phone Error based Discriminative Training of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
卷期 13:3
作者 Liu, Shih-hungChu, Fang-huiLo, Yueng-tienChen, Berlin
頁次 343-361
關鍵字 Discriminative TrainingLarge Vocabulary Continuous Speech RecognitionTraining Data SelectionPhone Accuracy FunctionMinimum Phone ErrorTHCI Core
出刊日期 200809

中文摘要

英文摘要

This paper considers minimum phone error (MPE) based discriminative training of acoustic models for Mandarin broadcast news recognition. We present a new phone accuracy function based on the frame-level accuracy of hypothesized phone arcs instead of using the raw phone accuracy function of MPE training. Moreover, a novel data selection approach based on the frame-level normalized entropy of Gaussian posterior probabilities obtained from the word lattice of the training utterance is explored. It has the merit of making the training algorithm focus much more on the training statistics of those frame samples that center nearly around the
decision boundary for better discrimination. The underlying characteristics of the presented approaches are extensively investigated, and their performance is verified by comparison with the standard MPE training approach as well as the other related work. Experiments conducted on broadcast news collected in Taiwan demonstrate that the integration of the frame-level phone accuracy calculation and data selection yields slight but consistent improvements over the baseline system.

相關文獻