Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/169077
Title: Soil database development with the application of machine learning methods in soil properties prediction
Authors: Li, Yangyang
Rahardjo, Harianto
Satyanaga, Alfrendo
Rangarajan, Saranya
Lee, Daryl Tsen-Tieng
Keywords: Engineering::Civil engineering
Issue Date: 2022
Source: Li, Y., Rahardjo, H., Satyanaga, A., Rangarajan, S. & Lee, D. T. (2022). Soil database development with the application of machine learning methods in soil properties prediction. Engineering Geology, 306, 106769. https://dx.doi.org/10.1016/j.enggeo.2022.106769
Journal: Engineering Geology
Abstract: Excessive rainwater infiltration can be an important cause of both slope failures and whole-tree uprooting failures. Early warnings or stabilization measures for high-risk slopes or trees are critically important. To identify the high-risk areas, it is necessary to conduct seepage, slope and tree stability analyses over a large region. Given the spatial variability of soil properties, a soil database is therefore required before performing distributed or Geographical Information System (GIS)-based water balance and stability analyses. Considering that unsaturated soil properties can be very different from saturated soil properties, in this study a soil database containing both saturated and unsaturated hydraulic and mechanical soil properties was developed for the first time. Machine learning methods were used to predict the unknown soil properties. Based on the predicted soil properties, spatial distributions of different saturated and unsaturated soil properties were generated using the ordinary kriging method. The soil database was then developed with Singapore island divided into 97 zones, each zone having similar soil properties. The importance of different input variables in soil properties prediction was also investigated. In addition to soil plasticity (i.e., Liquid Limit (LL), Plastic Limit (PL) and Plasticity Index (PI)) and grain size distribution (i.e., gravel, sand, and fines fractions), location (i.e., longitude and latitude) was also found to be of high importance; all of these are recommended as input variables for predicting soil properties, especially when data volume is relatively limited. For soil properties that cover a large range of values, model performance was better when logarithmic values were used as the outputs.
Moreover, given the possible correlation between some output parameters, predicting the Soil-water Characteristic Curve (SWCC) with a multi-output model is recommended, based on a comparison of its performance with that of a single-output model. Furthermore, the performance of two commonly used machine learning methods (i.e., random forest regression and artificial neural network) in soil properties prediction was compared, and the prediction error of the random forest regression method was generally smaller. The developed database includes the mean values of saturated permeability, saturated and unsaturated shear strength parameters, and SWCC in each zone. The database can be applied in regional GIS-based water balance and slope stability analyses to account for spatial heterogeneity instead of assuming constant soil properties.
URI: https://hdl.handle.net/10356/169077
ISSN: 0013-7952
DOI: 10.1016/j.enggeo.2022.106769
Schools: School of Civil and Environmental Engineering 
Rights: © 2022 Elsevier B.V. All rights reserved.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: CEE Journal Articles
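
Illustrative note on the abstract: the abstract states that spatial distributions of soil properties were generated with the ordinary kriging method. The sketch below shows what an ordinary kriging estimate at a single location can look like. It is not the authors' implementation. The exponential semivariogram, its sill and range, the coordinates, and the log10 permeability values are all invented for illustration, and the function names are hypothetical.

```python
import math


def variogram(h, sill=1.0, corr_range=10.0):
    """Exponential semivariogram without a nugget (an assumed model)."""
    return sill * (1.0 - math.exp(-h / corr_range))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(points, values, target):
    """Return the ordinary kriging estimate at `target` and the sample weights."""
    n = len(points)
    # Left-hand side: semivariances between samples, bordered by the
    # unbiasedness constraint (weights must sum to 1).
    a = [
        [variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    a.append([1.0] * n + [0.0])
    # Right-hand side: semivariances between each sample and the target.
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


# Hypothetical sample locations and log10 saturated permeability values.
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
values = [-6.0, -7.0, -5.5, -6.5]

estimate, weights = ordinary_kriging(points, values, (5.0, 5.0))
print(estimate, weights)
```

Kriging the logarithm of permeability rather than the raw value echoes the abstract's remark about properties that span a large range of values, but the study's actual variogram model and fitted parameters are not given in this record.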

Scopus citations: 18 (updated on Jun 11, 2024)
Web of Science citations: 4 (updated on Oct 26, 2023)
Page view(s): 111 (updated on Jun 13, 2024)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.