Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/66745
Title: Processing and mining large-scale bibliography data
Authors: Kerk, Wei Yang
Keywords: DRNTU::Engineering
Issue Date: 2016
Abstract: Extracting information from large collections of semi-structured data can be a considerable challenge when much of the information is hidden and requires interpretation. This project has three primary objectives. The first is to parse a large XML file from the Digital Bibliography & Library Project (DBLP) into a purpose-designed database system. The second is to classify each author's ethnicity using an external name-based classification system, and to analyse DBLP authors to identify trends and patterns. The third is to conduct a link prediction experiment on collaboration between authors using well-known predictors. With ethnicity used as a feature for collaboration, the experiment examines whether any relationship exists between ethnicity and collaboration. The findings can be shared with the research community for further research or future experiments. This report gives a detailed explanation of the implementations used to achieve the objectives, the challenges encountered and the solutions used to overcome them, results, analysis, and recommendations for future work. Finally, this report highlights that an author's ethnicity is a relevant factor in collaboration for some ethnicities.
URI: http://hdl.handle.net/10356/66745
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Kerk Wei Yang (U1321927C) FYP report.pdf
Description: Restricted Access
Size: 1.63 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.