Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/74009
Title: Information extraction from bibliography data
Authors: Toh, Joel Zhu Er
Keywords: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing; DRNTU::Engineering::Computer science and engineering::Information systems::Information interfaces and presentation
Issue Date: 2018
Abstract: DBLP is a computer science bibliography hosted by the University of Trier in Germany. It contains bibliographic information on major computer science journals and proceedings. As of Dec 2017, it listed 4,004,065 publications, 2,012,222 authors, 5,263 conferences and 1,566 journals. Because of the sheer volume of information, it is tedious for users to gain valuable insights from the data. To bridge this gap, this report pursues four main objectives. First, parsing the large DBLP XML file and other datasets into a relational database to support efficient querying. Second, exploring techniques for extracting an author's career length, ethnicity, area of specialization and gender from the DBLP data; the report also explores the data to discover knowledge. Third, modeling the data for link prediction, that is, predicting whom an author might collaborate with in the future. This includes improving existing link prediction methods with the concept of homophily. Fourth, introducing a web application developed for analysis and visualization of the DBLP data, which helps users gain insight and make sense of the data. Finally, this report discusses the link prediction results and interprets the newly discovered insights.
URI: http://hdl.handle.net/10356/74009
Schools: School of Computer Science and Engineering
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
Files in This Item:
File | Description | Size | Format
---|---|---|---
TOH_ZHU_ER_JOEL_FYP_Report.pdf (restricted access) | Final Year Project Report on Information extraction from bibliography data | 3.55 MB | Adobe PDF
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.