Please use this identifier to cite or link to this item: http://hdl.handle.net/10356/55016
|Title:||Web-based retrieval system for chemical structural formulas|
|Authors:||Neo, Lok Tuan|
|Keywords:||DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval|
|Issue Date:||2013|
|Abstract:||The drug discovery process relies heavily on chemical substructure and similarity searches for lead identification. Researchers often pool the results of both search types to obtain a larger set of lead molecules for drug suitability evaluation in later stages of drug discovery. However, existing chemical search engines require users to issue similarity and substructure queries separately, and display results only once the search is complete. In this project, a web-based chemical search engine is proposed and implemented that delivers both types of search results to users as soon as a match is found. Two approaches are proposed to support efficient chemical search: • Effective Substructure Screening - By combining substructure information with chemical functional groups and chemical bonds, the accuracy of the screening step in a substructure search can be improved. Evaluation results showed that the combined chemical features improve precision, recall and F1 scores for almost all test queries. • Publisher-Subscriber Infrastructure - Using the Publisher-Subscriber pattern in conjunction with an effective molecule filtering process, several types of chemical search can be carried out simultaneously and results can be delivered to users efficiently. Evaluation results indicate that the proposed infrastructure scales linearly on larger chemical databases, with significant speed-ups in search time when cached results are used to filter molecules for substructure search. Together, the two approaches improve the efficiency and effectiveness of chemical structural formula search. This report discusses the proposed substructure screening process and publisher-subscriber infrastructure and evaluates their performance.|
|URI:||http://hdl.handle.net/10356/55016|
|Rights:||Nanyang Technological University|
|Fulltext Permission:||restricted|
|Fulltext Availability:||With Fulltext|
|Appears in Collections:||SCSE Student Reports (FYP/IA/PA/PI)|
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.