Title: Collection and analysis on data from Drugs.com
Authors: Aw, Teng Teng
Keywords: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Issue Date: 2017
Abstract: Internet users rely on the Internet for its convenience and efficiency, and search engines save them time. Depending on the source of the results, search engines can return a large amount of highly accurate information. For example, professional medical websites such as Drugs.com and Wikipedia are considered reliable because their authors are professionals with medical knowledge. Members of the public without medical knowledge can access this information to learn more about their prescribed drugs. Web scrapers help researchers extract such data much faster within a given time frame. In this report, Scrapy, a web scraping framework written in Python, is used to extract data from Drugs.com, and the output is saved in JSON files. Scrapy adapts to webpages with different structures by using XPath selectors. The findings are presented in this report. The aim of this project is to use web scraping tools to collect data from Drugs.com for further analysis. The collected data can be reused in the future, saving time for researchers who intend to do the same work. The analysis of the collected data covers aspects of the website such as its structure and the accuracy of its information. In addition, this report compares different web scrapers in terms of cost, complexity, and accuracy of the extracted data, and concludes with a recommended choice of web scraper.
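Illustration: The idea of adapting to differently structured pages with XPath and saving the results as JSON can be sketched roughly as follows. This is not the report's code and does not use Scrapy itself; it is a minimal stand-in built on Python's standard library (xml.etree.ElementTree, which supports only a subset of XPath). The two page fragments, the field names, and the XPath expressions are all hypothetical and are not taken from Drugs.com.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragments with two different layouts.
PAGE_A = (
    "<html><body><h1 class='title'>Aspirin</h1>"
    "<div id='uses'><p>Pain relief</p></div></body></html>"
)
PAGE_B = (
    "<html><body><div class='header'><span class='drug-name'>Ibuprofen</span></div>"
    "<section class='uses'><p>Fever</p></section></body></html>"
)

# Assumed mapping: for each field, candidate XPath expressions tried in order.
FIELD_XPATHS = {
    "name": [".//h1[@class='title']", ".//span[@class='drug-name']"],
    "uses": [".//div[@id='uses']/p", ".//section[@class='uses']/p"],
}


def extract(page):
    """Return one record per page, using the first XPath that matches each field."""
    root = ET.fromstring(page)
    record = {}
    for field, xpaths in FIELD_XPATHS.items():
        record[field] = None  # stays None if no candidate expression matches
        for xpath in xpaths:
            node = root.find(xpath)
            if node is not None and node.text:
                record[field] = node.text.strip()
                break
    return record


records = [extract(PAGE_A), extract(PAGE_B)]

# Serialize the records as JSON; writing this string to a file would give a JSON output file.
output = json.dumps(records, indent=2)
print(output)
```

Listing several candidate expressions per field is one simple way to cope with pages that place the same information in different elements. Real-world HTML is often not well-formed XML, so a dedicated HTML-aware selector library would normally be needed in place of ElementTree.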
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: FYP Report_Aw Teng Teng _ U1422605F.pdf (Restricted Access)
Description: FYP Report submission for SCE 16-0379, Collection and Analysis on Data from Drugs.com
Size: 2.56 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.