Title | A Comparative Study of Histogram Equalization (HEQ) for Robust Speech Recognition |
---|---|
Volume:Issue | 12:2 |
Authors | Lin, Shih-hsiang; Yeh, Yao-ming; Chen, Berlin |
Pages | 217-238 |
Keywords | Automatic Speech Recognition, Histogram Equalization, Robustness, Data Fitting, Temporal Average, THCI Core |
Publication Date | 2007-06 |
The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Quite a few techniques have been proposed to improve ASR robustness over the past several years. Histogram equalization (HEQ) is one of the most efficient techniques that have been used to reduce the mismatch between training and test acoustic conditions. This paper presents a comparative study of various HEQ approaches for robust ASR. Two representative HEQ approaches, namely the table-based histogram equalization (THEQ) and the quantile-based histogram equalization (QHEQ), were first investigated. Then, a polynomial-fit histogram equalization (PHEQ) approach was proposed, which uses a data fitting scheme to efficiently approximate the inverse of the cumulative density function of training speech for HEQ. Moreover, the temporal average (TA) operation was performed on the feature vector components to alleviate the influence of sharp peaks and valleys caused by non-stationary noises. Experimental results were initially demonstrated. The best recognition performance was achieved by combining PHEQ with TA. Relative word error rate reductions of 68% and 40% over the MFCC-based baseline system were obtained for clean- and multi-condition training, respectively.
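
To make the PHEQ and TA ideas concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It handles a single feature dimension. The function names, the polynomial order of 7, the rank-based CDF estimate, and the symmetric moving-average window are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def fit_inverse_cdf(train_values, order=7):
    """Fit a polynomial mapping CDF values in (0, 1) back to feature values.

    The polynomial order is an assumed, illustrative choice.
    """
    x = np.sort(np.asarray(train_values, dtype=float))
    n = x.size
    # Empirical CDF value of the i-th smallest sample (mid-rank estimate).
    cdf = (np.arange(1, n + 1) - 0.5) / n
    return np.polyfit(cdf, x, order)

def equalize(test_values, coeffs):
    """Map each test frame to the training distribution via its rank."""
    v = np.asarray(test_values, dtype=float)
    # 0-based rank of each frame within the utterance (ties broken arbitrarily).
    ranks = np.argsort(np.argsort(v))
    cdf = (ranks + 0.5) / v.size
    return np.polyval(coeffs, cdf)

def temporal_average(values, half_window=1):
    """Smooth a feature trajectory with a symmetric moving average.

    Edges are padded by repeating the first and last frames, so the
    output has the same length as the input.
    """
    v = np.asarray(values, dtype=float)
    width = 2 * half_window + 1
    padded = np.pad(v, half_window, mode="edge")
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")

# Usage on synthetic data: "training" features are standard normal,
# while "test" features are scaled and shifted to mimic a mismatch.
rng = np.random.default_rng(0)
train = rng.normal(size=5000)
test = 2.0 * rng.normal(size=300) + 3.0
coeffs = fit_inverse_cdf(train)
restored = temporal_average(equalize(test, coeffs))
```

Because only the polynomial coefficients need to be stored per feature dimension, a scheme of this kind avoids keeping a full lookup table, which is presumably the efficiency the abstract refers to. In practice the mapping would be applied to each feature dimension independently.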