Title | TAICAR - The Collection and Annotation of an In-Car Speech Database Created in Taiwan |
---|---|
Volume:Issue | 10:2 |
Authors | Wang, Hsien-chang; Yang, Chung-hsien; Wang, Jhing-fa; Wu, Chung-hsien; Chien, Jen-tzung |
Pages | 237-249 |
Keywords | TAICAR; in-car speech; corpus collection and annotation; multi-channel recording; speech database; THCI Core |
Publication date | 200506 |

This paper describes a project that aims to create a Mandarin speech database for the automobile setting (TAICAR). Researchers from several universities and research institutes in Taiwan have participated in the project. The goal is to generate a corpus for the development and testing of various speech-processing techniques. The project has six recording sites. Various words, sentences, and spontaneous queries uttered in the vehicular navigation setting have been collected. A preliminary corpus has been created from utterances by 192 speakers recorded in different vehicles. The database contains more than 163,000 files, occupying 16.8 gigabytes of disk space.