Article Details

International Journal of Computational Linguistics And Chinese Language Processing THCI

Title: A Comparative Study of Methods for Topic Modeling in Spoken Document Retrieval
Volume/Issue: 17:1
Authors: Lin, Shih-hsiang; Chen, Berlin
Pages: 65-85
Keywords: Information Retrieval; Document Topic Models; Word Topic Models; Spoken Document Retrieval; THCI Core
Publication date: 2012-03

Chinese Abstract

English Abstract

Topic modeling for information retrieval (IR) has attracted significant attention and demonstrated good performance in a wide variety of tasks over the years. In this paper, we first present a comprehensive comparison of various topic modeling approaches, including the so-called document topic models (DTM) and word topic models (WTM), for Chinese spoken document retrieval (SDR). Moreover, different granularities of index features, including words, subword units, and their combinations, are also exploited to work in conjunction with various extensions of topic modeling presented in this paper, so as to alleviate SDR performance degradation caused by speech recognition errors. All of the experiments were performed on the TDT Chinese collection.
