Title: Multiword expressions : a study on representation of Japanese MWEs in wordnet and other lexical databases
Authors: Lee, Hui Shan
Keywords: DRNTU::Humanities
Issue Date: 2017
Abstract: Multiword expressions (MWEs) make up a significant portion of the lexicon and have the distinctive characteristics of non-compositionality, non-substitutability, and non-modifiability. They have been widely recognized as a highly problematic area of natural language processing (NLP), as current linguistic databases often lack sufficient coverage of MWEs. This paper attempts to fill a gap in research by examining how difficult it is to retrieve and process Japanese MWEs. The research presents an overview of 360 entries obtained through automatic retrieval (AR) and manual retrieval (MR) from the corpus. These entries are then compared across seven databases (goo dictionary, imiwa? dictionary, the JDMWE, the WWWJDIC, NINJAL, wordnet, and the N-gram count corpus) to test whether they are MWEs. The results suggest that the main problems in determining whether a phrase is an MWE include the coverage of the databases used, differences in how phrases are represented in each dictionary, complications caused by the different writing systems present in Japanese, and the need for human judgement.
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: HSS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Lee Hui Shan (U1221405G) - Combined FYP file.pdf
Access: Restricted Access
Size: 1.92 MB
Format: Adobe PDF

Page view(s): 10 (updated on Jan 18, 2021)
Download(s): 10 (updated on Jan 18, 2021)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.