|
1 |
Building a Bracketed Corpus Using φ2 Statistics
|
Lee, Yue-shi、Chen, Hsin-hsi
|
Bracketed Corpus
、
φ2 Statistics
、
Treebank
、
Probabilistic Chunkers
、
THCI Core
|
| |
|
2 |
Longest Tokenization
|
Jin, Guo
|
sentence tokenization
、
word identification
、
word segmentation
、
critical tokenization
、
maximum tokenization
、
tokenization disambiguation
、
THCI Core
|
| |
|
3 |
Segmentation Standard for Chinese Natural Language Processing
|
Huang, Chu-ren、Chen, Keh-jiann、Chen, Feng-yi、Chang, Li-li
|
THCI Core
|
| |
|
4 |
Aligning More Words with High Precision for Small Bilingual Corpora
|
Ker, J. Sue、Chang, S. Jason
|
Word alignment
、
machine readable dictionary and thesaurus
、
bilingual corpus
、
word sense disambiguation
、
THCI Core
|
| |
|
5 |
An Unsupervised Iterative Method for Chinese New Lexicon Extraction
|
Chang, Jing-shin、Su, Keh-yih
|
Unknown Word Identification
、
Lexicon
、
Chinese
、
Iterative Enhancement
、
Unsupervised Method
、
New Lexicon Extraction
、
THCI Core
|
| |