Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/155296
Title: Comparing the performances of glass transition temperatures prediction : SMILES vs. Molfile
Authors: Goh, Kai Leong
Keywords: Science::Chemistry
Issue Date: 2021
Publisher: Nanyang Technological University
Source: Goh, K. L. (2021). Comparing the performances of glass transition temperatures prediction : SMILES vs. Molfile. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/155296
Project: CHEM/21/039 
Abstract: Glass transition temperature (Tg) is the temperature at which a polymer changes from a rigid, glassy state to a flexible, rubbery state. Tg is an important parameter for tailoring the physical properties of polymers, which have a wide variety of industrial applications. The field of machine learning (ML) has grown significantly in recent years due to advances in technology. In computational chemistry, ML takes the form of quantitative structure–property relationship (QSPR) modelling. The main objective of this project was to compare two digital representations of molecular structures in terms of QSPR model performance for the prediction of Tg. A dataset of 1200 polymer data points was collected from the PolyInfo polymer database. The two digital representations of molecular structures were the Simplified Molecular-Input Line-Entry System (SMILES) and MDL Molfiles (.mol files). The two sets of features used were Mordred-2D and ECFP4. XGBoost (Extreme Gradient Boosting) was selected as the regression algorithm, with R² and RMSE as the scoring metrics for evaluating model performance. For Mordred-2D, SMILES generally performed better than .mol files. For ECFP4, SMILES and .mol files yielded very similar results. The .mol file optimization process was noted to be more time-consuming than the SMILES string generation process. Based on the results obtained, it was concluded that SMILES would be the better choice for future studies in terms of efficiency. Future work will focus on collecting more data from the PolyInfo database and trying other machine learning algorithms.
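Note on the scoring metrics: the abstract names R² and RMSE but does not define them. The sketch below shows one common way to compute both in plain Python. It is not the project's code; the function names and the Tg values are hypothetical and chosen only for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical Tg values in kelvin (illustration only, not project data).
tg_measured = [350.0, 400.0, 450.0, 500.0]
tg_predicted = [360.0, 390.0, 460.0, 490.0]

print(rmse(tg_measured, tg_predicted))
print(r2_score(tg_measured, tg_predicted))
```

A lower RMSE and an R² closer to 1 both indicate predictions that track the measured values more closely, which is the sense in which the abstract compares the two representations.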
URI: https://hdl.handle.net/10356/155296
Schools: School of Physical and Mathematical Sciences 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SPMS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: CM4073_AY2021-22_Sem1_GohKaiLeong.pdf
Description: Restricted Access
Size: 688.17 kB
Format: Adobe PDF

Page view(s): 209 (updated on Oct 2, 2023)
Download(s): 24 (updated on Oct 2, 2023)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.