Title: Chinese text retrieval system
Authors: Lim, Hong Koon.
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
Issue Date: 1999
Abstract: In this fast-growing information age, information retrieval (IR) systems and related fields have attracted close attention from researchers in information science. Recently, as Asian languages such as Chinese, Japanese and Korean have gained popularity, they have come to be used as the language medium of IR systems, Chinese in particular. To use Chinese in the IR domain, a fundamental linguistic problem of Chinese text must first be resolved. Unlike English and other European languages, written Chinese does not use spaces or other punctuation marks to separate words. Therefore, to extract meaningful words from lines of Chinese text for text processing, Chinese text segmentation has to be carried out. This is an essential step when indexing the corpus for a Chinese text retrieval system. The primary objective of this project is to develop a prototype Chinese text retrieval system that can be used for future research. Instead of building one from scratch, the project adopts the mg system as an alternative. Because mg is a retrieval system capable of fast and efficient indexing and retrieval on both textual and graphical document collections, it was chosen as the base on which the Chinese text retrieval system is built.
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
Description: Main report
Size: 17.93 MB
Format: Adobe PDF
Access: Restricted Access

Page view(s) 5

Updated on Feb 27, 2021




Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.