Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/63297
Title: Handwriting recognition and retrieval for chemical structural formulas
Authors: Tang, Peng
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Issue Date: 2015
Source: Tang, P. (2015). Handwriting recognition and retrieval for chemical structural formulas. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: Chemicals with similar structures often have similar chemical properties, chemical reactions and even physical properties. Therefore, many drug discovery projects need to search for drug-like compounds with similar chemical structures that are worthy of further synthetic investigation. However, most current search engines work well only for text-based information and do not provide good support for chemical structural search. Moreover, chemical structural search requires a chemical structural query as input. Compared to handwriting-based input, the traditional template-based input is much more complicated and non-intuitive. With the growing popularity of touch-based devices, handwriting-based input has become much more important. Due to the spatial complexity of chemical structural formulas, it is challenging to recognize handwritten chemical structural formulas with both precision and efficiency. In this research, we investigate techniques to support handwriting recognition and retrieval for chemical structural formulas, and we make the following contributions:
• Handwritten Chemical Symbol Recognition. We proposed CF44, a feature set of 44 chemical symbol features that model the writing process, visual appearance and contextual environment of handwritten chemical symbols. We also proposed a handwritten chemical symbol recognition approach based on a Support Vector Machine and the CF44 feature set.
• Progressive Chemical Structural Analysis. We proposed a chemical structural analysis approach to support progressive recognition of handwritten chemical structural formulas. In this approach, a Chemical Structural Graph models chemical structural formulas. We also proposed a novel connected bond analysis method and a ring closure detection method to support the recognition of complex chemical structures such as connected bonds and cyclic ring structures.
• Chemical Structural Similarity Retrieval. We proposed two approaches for chemical structural similarity retrieval, which retrieve chemical structural formulas that are functionally similar to the query. The two approaches are based on the Vector Space Model and Formal Concept Analysis respectively. We also proposed a web-based chemical retrieval system that uses the publish-subscribe model for efficient chemical structural similarity retrieval.
URI: http://hdl.handle.net/10356/63297
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: ThesisReport.pdf (Restricted Access)
Description: Main Article
Size: 3.93 MB
Format: Adobe PDF

Page view(s): 190 (updated on Nov 29, 2020)
Download(s): 14 (updated on Nov 29, 2020)


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.