Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/170149
Title: Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
Authors: Wei, Yuying
Law, Adrian Wing-Keung
Yang, Chun
Keywords: Engineering::Computer science and engineering
Issue Date: 2023
Source: Wei, Y., Law, A. W. & Yang, C. (2023). Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations. Journal of Computational Science, 67, 101977. https://dx.doi.org/10.1016/j.jocs.2023.101977
Journal: Journal of Computational Science
Abstract: In this study, we propose a new Data Assimilation (DA) framework, Probabilistic Optimal Interpolation (POI), that combines the predictions of Machine Learning (ML) models trained on historical data with real-time observations, with the key objective of improving the estimate of the system state. The framework uses the heteroscedastic uncertainty of the ML predictions and the residual-based uncertainty of the observations, and integrates the two through optimal interpolation. The quantification of both uncertainties is included within the framework itself. As an application example, we test the performance of POI on a multi-scale Lorenz 96 chaotic system with various levels of added noise. The ML model is a Long Short-Term Memory (LSTM) neural network, and Monte Carlo (MC) dropout is adopted for uncertainty quantification. The computational results show that POI can improve predictions of the system state with less uncertainty, and that it can filter the added noise effectively when the historical data are reasonably accurate. However, if the noise level is high, using the updated POI predictions as sequential inputs for the next time step does not guarantee better performance than using the real-time observations directly. Furthermore, under very noisy conditions, the average of the ML predictions over the MC dropout passes can already reduce the noise substantially, and these predictions might even be better than the POI updates. Therefore, the POI implementation (or data assimilation in general) is not recommended with an ML-based surrogate model in a noisy environment.
URI: https://hdl.handle.net/10356/170149
ISSN: 1877-7503
DOI: 10.1016/j.jocs.2023.101977
Schools: School of Civil and Environmental Engineering 
School of Mechanical and Aerospace Engineering 
Research Centres: Nanyang Environment and Water Research Institute 
Rights: © 2023 Elsevier B.V. All rights reserved.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: CEE Journal Articles
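
The abstract describes combining an ML prediction and an observation by weighting each according to its uncertainty. The sketch below illustrates the general idea only and is not the authors' implementation. It assumes that every state component is observed directly (identity observation operator), that both error covariances are diagonal, and that the observation variance is supplied by the caller rather than estimated from residuals as in the paper. The function names `mc_dropout_moments` and `oi_update` and all numbers are hypothetical.

```python
from statistics import fmean, pvariance


def mc_dropout_moments(samples):
    """Per-component mean and variance over stochastic forward passes.

    `samples` is a list of passes; each pass is a list of state components.
    In practice the passes would come from a network run with dropout active.
    """
    columns = list(zip(*samples))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]
    return means, variances


def oi_update(pred_mean, pred_var, obs, obs_var):
    """Component-wise optimal interpolation (assumes H = I, diagonal covariances).

    Each component needs pred_var + obs_var > 0.
    """
    analysis, analysis_var = [], []
    for xb, vb, y, vo in zip(pred_mean, pred_var, obs, obs_var):
        gain = vb / (vb + vo)  # weight given to the observation
        analysis.append(xb + gain * (y - xb))
        analysis_var.append((1.0 - gain) * vb)
    return analysis, analysis_var


# Hypothetical numbers: three dropout passes over a two-component state.
samples = [[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]]
pred_mean, pred_var = mc_dropout_moments(samples)
analysis, analysis_var = oi_update(
    pred_mean, pred_var, obs=[3.0, 5.0], obs_var=[2.0 / 3.0, 1.0]
)
```

In this toy case the first component has equal prediction and observation variance, so the update lands halfway between the two and the variance is halved. The second component has zero spread across the dropout passes, so the observation is ignored entirely, which hints at why an overconfident prediction uncertainty can make the assimilation step ineffective.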

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.