Building the foundation text for Nanyang Technological University : multilingual corpus (NTU-MC).

dc.contributor.author Tan, Li Ling.
dc.date.accessioned 2012-04-13T04:46:34Z
dc.date.available 2012-04-13T04:46:34Z
dc.date.copyright 2011
dc.date.issued 2012-04-13
dc.identifier.citation Tan, L. L. (2011). Building the Foundation Text for Nanyang Technological University : Multilingual Corpus (NTU-MC). Final year project report, Nanyang Technological University.
dc.identifier.uri http://hdl.handle.net/10220/7790
dc.description.abstract The NTU-MC is a multilingual corpus that draws on the multilingual text available in Singapore. The current version of the NTU-MC contains approximately 375,000 words (15,096 sentences) in 6 languages (English, Chinese, Japanese, Korean, Indonesian and Vietnamese) from 6 language families (Indo-European, Japonic, Austro-Asiatic, Sino-Tibetan, Austronesian and Korean as a language isolate); all text in English, Chinese, Japanese, Korean and Vietnamese was part-of-speech (POS) tagged. This project focuses on compiling the foundation text for the NTU-MC, and this dissertation describes the motivations, the corpus compilation process, and the internal and cross-corpora evaluation of the corpus output. The corpus will be made available to the public under the Creative Commons Attribution 3.0 Unported license in Summer 2011.
dc.format.extent 47 p.
dc.language.iso en
dc.subject DRNTU::Humanities::Language::Linguistics.
dc.title Building the foundation text for Nanyang Technological University : multilingual corpus (NTU-MC).
dc.type Final Year Project (FYP)
dc.contributor.school School of Humanities and Social Sciences
dc.contributor.supervisor Francis Charles Bond.
dc.description.degree LINGUISTICS AND MULTILINGUAL STUDIES

Files in this item

File Size Format
Liling Tan.pdf 2.282 MB PDF

Statistics

Total views

Item Views
Building the foundation text for Nanyang Technological University : multilingual corpus (NTU-MC). 437

Total downloads

File Downloads
Liling Tan.pdf 430

Top country downloads

Country Downloads
United States of America 169
China 72
Singapore 35
Sweden 25
Germany 16

Top city downloads

City Downloads
Mountain View 74
Beijing 57
Singapore 32
Clarks Summit 26
Toronto 11

Downloads per month

File 2014-02 2014-03 2014-04 Total
Liling Tan.pdf 0 0 10 10