Chemometric techniques for multivariate calibration and their application in spectroscopic sensors
Date of Issue: 2012
School of Chemical and Biomedical Engineering
Chemometric modeling for multivariate calibration of spectroscopy is a crucial technique for ensuring product quality and process performance at low cost in many industries. It provides fast, noninvasive and nondestructive analysis of samples and processes by predicting analyte properties from measured spectra. Traditional multivariate calibration methods, such as principal component regression (PCR) and partial least squares (PLS), are only reliable when the relationship between analyte properties and spectra is linear. In practice, external disturbances such as light scattering and baseline noise introduce non-linearity into spectral data, degrading the prediction accuracy of PCR and PLS. In this thesis, several chemometric strategies are investigated to address this challenge, including pre-processing and non-linear calibration techniques. Pre-processing methods, namely first (D1) and second (D2) derivatives, standard normal variate (SNV), extended multiplicative signal correction (EMSC), and extended inverted signal correction (EISC), are proposed to remove the impact of disturbances first, so that a linear calibration method (PCR or PLS) can then be applied. In addition, a unique linear calibration strategy, optical path length estimation and correction (OPLEC), which involves building two linear calibration models, is also investigated. Non-linear calibration techniques, including artificial neural networks (ANN), least squares support vector machines (LS-SVM), and Gaussian process regression (GPR), aim to model the non-linearity directly. A comparison of the linear and non-linear calibration techniques shows that the non-linear techniques give more accurate predictions than the linear methods in most cases. However, non-linear calibration models are not robust enough: small changes in training data or model parameters may result in significant changes in prediction.
Therefore, the strategies of bagging/subagging are investigated to improve the prediction robustness of non-linear calibration models. Furthermore, when spectroscopic data are used to predict an analyte property, not all variables contribute to the calibration model, so selecting the useful variables is an effective way to improve prediction performance. Two penalized regression algorithms that perform variable selection using the LASSO (least absolute shrinkage and selection operator), penalized linear regression (PLR) and penalized Gaussian process regression (PGPR), are investigated to solve the variable selection problem. Finally, chemometric calibration techniques are applied to two practical problems: predicting the length distribution of single-walled carbon nanotubes (SWCNTs) from ultraviolet-visible-near-infrared (UV-vis-NIR) spectroscopy, and designing a soft sensor to monitor an industrial anaerobic wastewater treatment process.
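
As a rough illustration of the pre-processing step described in the abstract, the following sketch applies SNV and a first derivative (D1) to synthetic spectra. It is not the implementation used in the thesis: the function names `snv` and `first_derivative`, the Gaussian test band, and the scatter and offset values are all assumptions made for this example, and the derivative is a plain finite difference (in practice, derivatives are often combined with smoothing, for example a Savitzky-Golay filter).

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) by its own mean and std."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def first_derivative(spectra):
    """First derivative (D1) along the wavelength axis, using simple finite differences."""
    return np.diff(np.asarray(spectra, dtype=float), n=1, axis=1)


# Synthetic data (hypothetical): one Gaussian absorption band measured three times,
# each time with a different multiplicative scatter factor and additive baseline offset.
wavelengths = np.linspace(0.0, 1.0, 200)
pure = np.exp(-((wavelengths - 0.5) ** 2) / 0.01)
scale = np.array([0.8, 1.0, 1.3])[:, None]     # multiplicative disturbance
offset = np.array([0.05, 0.0, -0.1])[:, None]  # additive baseline shift
raw = scale * pure + offset

corrected = snv(raw)        # removes both the scale factor and the offset
d1 = first_derivative(raw)  # removes the constant offset only

print("max spread between raw spectra:     ", np.ptp(raw, axis=0).max())
print("max spread between SNV spectra:     ", np.ptp(corrected, axis=0).max())
```

In this idealised case the three SNV-corrected spectra coincide, because SNV is invariant to a positive per-spectrum scale factor and a constant offset. Real scatter effects are wavelength dependent, which is one motivation for the more elaborate EMSC and EISC corrections named above.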