Asia-Pacific Network for Global Change Research

Peer-reviewed publication

Reconstructing daily discharge in a megadelta using machine learning techniques

In this study, six machine learning (ML) models, namely random forest (RF), Gaussian process regression (GPR), support vector regression (SVR), decision tree (DT), least squares support vector machine (LSSVM), and multivariate adaptive regression spline (MARS), were employed to reconstruct missing daily-averaged discharge in a megadelta from 1980 to 2015 using upstream-downstream multi-station data. The performance and accuracy of each ML model were assessed and compared with stage-discharge rating curves (RCs) using four statistical indicators, Taylor diagrams, violin plots, scatter plots, time-series plots, and heatmaps. Model inputs were selected using mutual information and correlation coefficient methods after three data pre-processing steps: normalization, Fourier series fitting, and first-order differencing. The results showed that the ML models are superior to their RC counterparts and that MARS and RF are the most reliable algorithms, with MARS performing marginally better than RF. Compared to RC, MARS and RF reduced the root mean square error (RMSE) by 135% and 141% and the mean absolute error (MAE) by 194% and 179%, respectively, using year-round data. However, MARS and RF models developed separately for the rising (wet season) and falling (dry season) limbs performed slightly worse than those developed using the year-round data. Specifically, the RMSE of MARS and RF in the falling limb was 856 and 1,040 m³/s, respectively, whereas that obtained using the year-round data was 768 and 789 m³/s, respectively. The DT model is not recommended, while the GPR and SVR models provide acceptable results.
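As a rough illustration of some of the steps named in the abstract, the following minimal Python sketch shows min-max normalization, first-order differencing, a Pearson correlation coefficient for screening candidate inputs, and the RMSE and MAE indicators. It is not the authors' implementation: the function names and the sample values are hypothetical, min-max scaling is only one possible reading of "normalization", and Fourier series fitting and mutual information are left out for brevity.

```python
import math


def min_max_normalize(series):
    """Scale values to [0, 1]; an assumed form of normalization."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def first_difference(series):
    """First-order differencing: x[t] - x[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)


def mae(observed, simulated):
    """Mean absolute error between observed and simulated values."""
    n = len(observed)
    return sum(abs(s - o) for o, s in zip(observed, simulated)) / n


# Made-up daily discharge values (m^3/s), for illustration only.
upstream = [5200.0, 5400.0, 5900.0, 6100.0, 6000.0]
downstream = [4800.0, 5000.0, 5500.0, 5800.0, 5600.0]
reconstructed = [4750.0, 5100.0, 5450.0, 5700.0, 5650.0]

scaled_upstream = min_max_normalize(upstream)
diff_upstream = first_difference(upstream)
r_up_down = pearson_r(upstream, downstream)
error_rmse = rmse(downstream, reconstructed)
error_mae = mae(downstream, reconstructed)
```

In this sketch a lower RMSE or MAE indicates a closer match to the observed discharge, and a higher correlation between a candidate station and the target suggests a more informative input.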