Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/139619
Title: Understanding and comparing scalable Gaussian process regression for big data
Authors: Liu, Haitao
Cai, Jianfei
Ong, Yew-Soon
Wang, Yi
Keywords: Engineering::Computer science and engineering
Issue Date: 2018
Source: Liu, H., Cai, J., Ong, Y.-S., & Wang, Y. (2019). Understanding and comparing scalable Gaussian process regression for big data. Knowledge-Based Systems, 164, 324-335. doi:10.1016/j.knosys.2018.11.002
Journal: Knowledge-Based Systems
Abstract: As a non-parametric Bayesian model that produces an informative predictive distribution, the Gaussian process (GP) has been widely used in various fields, such as regression, classification and optimization. The cubic complexity of the standard GP, however, leads to poor scalability, which poses challenges in the era of big data. Hence, various scalable GPs have been developed in the literature to improve scalability while retaining desirable prediction accuracy. This paper investigates the methodological characteristics and performance of representative global and local scalable GPs, including sparse approximations and local aggregations, from four main perspectives: scalability, capability, controllability and robustness. Numerical experiments on two toy examples and five real-world datasets with up to 250K points offer the following findings. In terms of scalability, most of the scalable GPs have a time complexity that is linear in the training size. In terms of capability, the sparse approximations capture the long-term spatial correlations, whereas the local aggregations capture the local patterns but suffer from over-fitting in some scenarios. In terms of controllability, the performance of sparse approximations can be improved by simply increasing the inducing size, but this is not the case for local aggregations. In terms of robustness, local aggregations are robust to various initializations of hyperparameters due to the local attention mechanism. Finally, we highlight that a proper hybrid of global and local scalable GPs may be a promising way to improve both model capability and scalability for big data.
URI: https://hdl.handle.net/10356/139619
ISSN: 0950-7051
DOI: 10.1016/j.knosys.2018.11.002
Rights: © 2018 Elsevier B.V. All rights reserved. This paper was published in Knowledge-Based Systems and is made available with permission of Elsevier B.V.
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Journal Articles

Files in This Item:
File: Understanding and comparing scalable Gaussian process regression for big data.pdf
Size: 1.62 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.