Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/157102
Full metadata record
DC Field | Value | Language
dc.contributor.author | Goh, Kai Leong | en_US
dc.date.accessioned | 2022-05-04T08:35:44Z | -
dc.date.available | 2022-05-04T08:35:44Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Goh, K. L. (2022). A stacked generalisation with gradient boosting for highly accurate predictions of polymer bandgap. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/157102 | en_US
dc.identifier.uri | https://hdl.handle.net/10356/157102 | -
dc.description.abstract | The bandgap (Egap) is the energy difference between the highest valence band and the lowest conduction band. Generally, the conductivity of a solid material increases as its Egap decreases. As the amount of experimental data stored in online databases has continued to grow over the years, it has become possible to use quantitative structure–property relationship (QSPR) modelling to predict the physical properties of synthetic materials. Recently, a paper by the Ramprasad Group reported a highly accurate QSPR model for predicting the Egap values of a dataset of 4209 polymers. This paper presents an alternative QSPR model named LGB-Stack, which achieves even higher accuracy scores on the same dataset. LGB-Stack performs a two-level stacked generalisation with the help of the LightGBM (Light Gradient Boosting Machine) algorithm, in which multiple weak models are first trained and then combined into a stronger final model. This paper also presents an extremely fast and efficient method of geometry optimisation that employs the Merck Molecular Force Field (MMFF). Prior to the actual model training, the Simplified Molecular Input Line Entry System (SMILES) notations of the polymers in the dataset were converted into 3D molecular objects and optimised using the MMFF method. Subsequently, four different molecular fingerprints were generated from the 3D molecular objects and used as the initial input features for training the weak models. The outputs of the weak models were then used as the new input features for training the final model, which completes the LGB-Stack model training process. | en_US
dc.language.iso | en | en_US
dc.publisher | Nanyang Technological University | en_US
dc.relation | CHEM/21/095 | en_US
dc.subject | Science::Chemistry | en_US
dc.title | A stacked generalisation with gradient boosting for highly accurate predictions of polymer bandgap | en_US
dc.type | Final Year Project (FYP) | en_US
dc.contributor.supervisor | Lu Yunpeng | en_US
dc.contributor.school | School of Physical and Mathematical Sciences | en_US
dc.description.degree | Bachelor of Science in Chemistry and Biological Chemistry | en_US
dc.contributor.supervisoremail | YPLu@ntu.edu.sg | en_US
item.fulltext | With Fulltext | -
item.grantfulltext | restricted | -
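
The abstract describes a two-level stacked generalisation but gives no implementation details, so the following is only a minimal sketch of the general technique, not the LGB-Stack code. It makes several assumptions: plain least-squares regressors stand in for the LightGBM models, two synthetic feature "views" stand in for the four molecular fingerprints, and every function name and data value is hypothetical. The one idea it is meant to show is that the final model is trained on out-of-fold predictions of the weak models.

```python
import random

def fit_linear(X, y):
    """Least squares with intercept, solved by Gauss-Jordan on the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y], with a tiny ridge term for numerical stability.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    for i in range(n):
        A[i][i] += 1e-9
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict_linear(w, X):
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) for x in X]

def stack_fit(views, y, k=5):
    """Level 1: one weak model per feature view. Level 2: a model on their outputs."""
    n = len(y)
    folds = [list(range(i, n, k)) for i in range(k)]
    oof = [[0.0] * len(views) for _ in range(n)]  # out-of-fold predictions
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        for v, X in enumerate(views):
            w = fit_linear([X[i] for i in train], [y[i] for i in train])
            for i, p in zip(held_out, predict_linear(w, [X[i] for i in held_out])):
                oof[i][v] = p
    base = [fit_linear(X, y) for X in views]  # refit weak models on all training data
    meta = fit_linear(oof, y)                 # final model sees only weak-model outputs
    return base, meta

def stack_predict(base, meta, views):
    level1 = list(zip(*(predict_linear(w, X) for w, X in zip(base, views))))
    return predict_linear(meta, level1)

def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

# Synthetic data: the target depends on two hidden factors, and each view sees only one.
rng = random.Random(0)
a = [rng.uniform(0, 1) for _ in range(200)]
b = [rng.uniform(0, 1) for _ in range(200)]
y = [2 * ai + 3 * bi + rng.gauss(0, 0.05) for ai, bi in zip(a, b)]
views = [[[ai] for ai in a], [[bi] for bi in b]]

train_views = [X[:150] for X in views]
test_views = [X[150:] for X in views]
y_train, y_test = y[:150], y[150:]

base, meta = stack_fit(train_views, y_train)
stacked_pred = stack_predict(base, meta, test_views)
stacked_mse = mse(stacked_pred, y_test)
base_mses = [mse(predict_linear(w, X), y_test) for w, X in zip(base, test_views)]
```

On this toy data each weak model sees only part of the signal, so the stacked model should have a lower test error than either weak model alone. In a setting closer to the one described in the abstract, the weak and final models would be gradient-boosted regressors and the views would be fingerprint matrices.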
Appears in Collections:SPMS Student Reports (FYP/IA/PA/PI)
Files in This Item:
File | Description | Size | Format
GohKaiLeong_U1840005E_CM4078.pdf (Restricted Access) |  | 1.22 MB | Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.