Article Details

Concentric: Studies in Linguistics (Scopus, THCI)

Title: Automatic Extraction of English Collocations and their Chinese-English Bilingual Examples: A Computational Tool for Bilingual Lexicography
Volume/Issue: 40:1
Parallel title: 自動擷取英文搭配語及中英文例句:雙語辭典編纂學的計算工具
Author: 高照明
Pages: 95-121
Keywords: collocation; dependency relation; computational lexicography; parallel corpora; mutual information; t-score; log likelihood ratio
Keywords (Chinese): 搭配語; 依存關係; 計算辭典編纂學; 雙語平行語料庫; mutual information; t-score; log likelihood ratio
Publication date: 2014-05
DOI 10.6241/concentric.ling.40.1.04

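Chinese abstract (English translation)

This paper describes the design process of the EXEC online system. EXEC is built from a Chinese-English parallel corpus of 13 million English words and 27 million Chinese characters, and combines English collocation retrieval with Chinese-English bilingual retrieval. EXEC extracts English collocations using statistics and an English syntactic parser that outputs dependency relations. At query time, the user enters a keyword, the keyword's part of speech, and the part of speech of the collocate being sought. The program then automatically extracts candidate English collocations based on the parser's dependency relations and on statistical measures such as mutual information, t-score, and log likelihood ratio, and links each candidate to English example sentences that contain it together with their Chinese translations. Experiments show that EXEC performs reasonably well in accuracy and efficiency. EXEC can serve as a tool for automatically compiling Chinese-English bilingual collocation dictionaries, and can also help Chinese-speaking learners of English overcome language barriers in English.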

English abstract

This paper describes the development of EXEC, a web-based system that automatically extracts English collocations and their Chinese-English bilingual examples from parallel corpora. The system draws on statistics, dependency parsing, and Chinese-English parallel corpora of more than 13 million English words and 27 million Chinese characters. Given a word, its part of speech, and the part of speech of its collocate as input, the program generates collocation candidates based on syntactic dependency relations and on three statistical measures: mutual information, t-score, and log likelihood ratio. In conjunction with a Chinese-English bilingual concordancer, it then extracts English sentences containing the identified collocations along with their Chinese translations. Our evaluations suggest that the proposed system performs reasonably well in terms of accuracy and efficiency. EXEC can be used to facilitate the automatic compilation of bilingual collocation dictionaries and to help Chinese learners of English overcome the L2 language barrier.
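
The abstract names mutual information, t-score, and log likelihood ratio but does not give their formulas. The sketch below uses the common textbook definitions (pointwise mutual information, the (observed - expected) / sqrt(observed) t-score approximation, and Dunning's log likelihood ratio over a 2x2 contingency table), which is an assumption about what EXEC actually computes. The function name `association_scores` and all counts are hypothetical, chosen only for illustration.

```python
import math


def association_scores(pair_count, word_count, collocate_count, total):
    """Return (mutual information, t-score, log likelihood ratio) for one pair.

    The arguments are hypothetical counts over extracted dependency pairs:
    pair_count: times the word and the collocate occur together
    word_count: times the word occurs in any pair
    collocate_count: times the collocate occurs in any pair
    total: total number of extracted pairs
    """
    if pair_count <= 0:
        raise ValueError("pair_count must be positive")

    # Co-occurrence count expected if word and collocate were independent.
    expected = word_count * collocate_count / total

    # Pointwise mutual information, in bits.
    mi = math.log2(pair_count / expected)

    # t-score approximation: (observed - expected) / sqrt(observed).
    t_score = (pair_count - expected) / math.sqrt(pair_count)

    # Observed 2x2 table: rows = word present/absent,
    # columns = collocate present/absent.
    observed = [
        [pair_count, word_count - pair_count],
        [collocate_count - pair_count,
         total - word_count - collocate_count + pair_count],
    ]
    row_totals = [word_count, total - word_count]
    col_totals = [collocate_count, total - collocate_count]

    # Log likelihood ratio: 2 * sum(O * ln(O / E)); empty cells contribute 0.
    llr = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                e = row_totals[i] * col_totals[j] / total
                llr += o * math.log(o / e)
    llr *= 2.0

    return mi, t_score, llr


def rank_candidates(candidates, total):
    """Sort (collocate, pair_count, word_count, collocate_count) tuples
    by log likelihood ratio, highest first."""
    scored = [
        (name, association_scores(pc, wc, cc, total))
        for name, pc, wc, cc in candidates
    ]
    return sorted(scored, key=lambda item: item[1][2], reverse=True)
```

When a pair occurs exactly as often as independence predicts, all three scores are zero; they grow as the pair becomes more strongly associated. The three measures behave differently on rare pairs (mutual information tends to favor low-frequency pairs, while the t-score favors frequent ones), which is one reason a system might report all three rather than a single score.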

Related literature