Vol. 6(2) / 2008 / 7 / pp. 79-118
DEVELOPING AN ONLINE CORPUS OF FORMOSAN LANGUAGES
Authors
宋麗梅
(National Taiwan University)
蘇以文
(National Taiwan University)
謝富惠
(Tatung University)
Zhemin Lin
(Wistron Corporation)
Chinese Abstract
N/A
English Abstract
Information technologies have matured to the point where researchers can create repositories of language resources, especially for endangered languages. The development of an online corpus platform, made possible by recent advances in data storage, character encoding and web technology, has profound consequences for the accessibility, quantity, quality and interoperability of linguistic field data. This is of particular significance for the Formosan languages of Taiwan, many of which are on the verge of extinction. In response to this growing problem, the NTU Corpus of Formosan Languages was established to document and thus preserve valuable linguistic data, as well as relevant ethnological and cultural information. This paper introduces some of the theoretical bases behind this initiative, as well as the procedures, transcription conventions, database normalization, in-house system and three special features involved in creating the corpus.
Chinese Keywords
N/A
English Keywords
Formosan languages, Taiwan, corpus, database normalization, discourse, intonation unit (IU), 'Pear' story, 'Frog' story, cross-referencing retrievability, multilingual search, interoperability