Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/55016
Title: Web-based retrieval system for chemical structural formulas
Authors: Neo, Lok Tuan
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
Issue Date: 2013
Abstract: The drug discovery process relies heavily on chemical substructure and similarity search results for lead identification. Researchers often pool substructure and similarity search results to obtain a larger set of lead molecules for drug suitability evaluation in subsequent stages of the drug discovery process. However, existing chemical search engines require users to issue similarity and substructure search queries separately, and they display results only when the search is complete. In this project, a web-based chemical search engine is proposed and implemented that delivers both types of search results to users as soon as a match is found. Two approaches are proposed to support efficient chemical search:

• Effective Substructure Screening - By combining substructure information with chemical functional groups and chemical bonds, the accuracy of the screening step in a substructure search can be improved. Evaluation results showed that the combined chemical features improve precision, recall and F1 scores for almost all test queries.
• Publisher-Subscriber Infrastructure - Using the Publisher-Subscriber pattern in conjunction with an effective molecule filtering process, various types of chemical search can be carried out simultaneously and results can be delivered to users efficiently. Evaluation results indicate that the proposed infrastructure scales linearly on larger chemical databases, with significant speed-ups in search time when cached results are used to filter molecules for substructure search.

Together, the two approaches enhance the efficiency and effectiveness of chemical structural formula search. This report discusses the proposed substructure screening process and the proposed publisher-subscriber infrastructure, and evaluates the performance of both.
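The two ideas in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the feature labels, the toy `DATABASE`, the `Broker` class, the topic names, and the 0.5 similarity threshold are all hypothetical, chosen only to show how a feature-subset screen and a publisher-subscriber broker might deliver substructure and similarity matches as each one is found, and how precision, recall and F1 are computed for a screening result.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Hypothetical feature sets: each molecule is reduced to labels for
# substructure keys, functional groups and bond types. A real system
# would derive these from a cheminformatics toolkit.
DATABASE: Dict[str, FrozenSet[str]] = {
    "ethanol": frozenset({"C-C", "C-O", "hydroxyl"}),
    "acetic_acid": frozenset({"C-C", "C-O", "C=O", "hydroxyl", "carboxyl"}),
    "ethane": frozenset({"C-C"}),
}


def passes_screen(query: FrozenSet[str], molecule: FrozenSet[str]) -> bool:
    """Screening step: a molecule can contain the query substructure
    only if it has every feature of the query."""
    return query <= molecule


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class Broker:
    """Minimal in-process publisher-subscriber broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> None:
        for callback in self._subscribers[topic]:
            callback(message)


def run_search(broker: Broker, query: FrozenSet[str], threshold: float = 0.5) -> None:
    """Scan the database once, publishing each match as soon as it is found
    instead of waiting for the whole search to finish."""
    for name, features in DATABASE.items():
        if passes_screen(query, features):
            broker.publish("substructure", name)
        if tanimoto(query, features) >= threshold:  # threshold is an assumption
            broker.publish("similarity", name)


def precision_recall_f1(candidates: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Standard precision, recall and F1 of a candidate set against ground truth."""
    true_positives = len(candidates & relevant)
    precision = true_positives / len(candidates) if candidates else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("substructure", lambda m: print("substructure match:", m))
    broker.subscribe("similarity", lambda m: print("similarity match:", m))
    run_search(broker, frozenset({"C-O", "hydroxyl"}))
```

In this sketch a single pass over the database serves both query types, and subscribers receive each result immediately. A production system would replace the subset screen with a verified subgraph match on the surviving candidates.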
URI: http://hdl.handle.net/10356/55016
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: SCE12-0457.pdf (Restricted Access)
Size: 1.49 MB
Format: Adobe PDF

Page view(s): 201 (checked on Oct 25, 2020)
Download(s): 10 (checked on Oct 25, 2020)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.