A regression model that integrates data pre-processing and transformation, input selection techniques and a data-driven statistical model enabled accurate 7-day-ahead time-series forecasting of selected water quality parameters. A core feature of the modelling approach is a novel recursive input–output algorithm. The model development procedure described here was applied to a 7-day-ahead forecast of dissolved oxygen (DO) concentration in the upper hypolimnion of Advancetown Lake, Queensland, Australia. DO was predicted with R² > 0.8 and a normalised root mean squared error of 14.9% on a validation data set, using 10 inputs related to water temperature or pH. A key feature of the model is that it can handle nonlinear correlations, which was essential for this environmental forecasting problem. Data pre-processing revealed that some relevant inputs had a lag of only 6 days; consequently, those predictors were in turn forecast 1 day ahead using the same procedure, which preserved the targeted 7-day prediction horizon. The approach can be applied to a wide range of time-series forecasting problems in complex hydro-environmental research. Reservoir operators can use the resulting DO forecasting tool to manage water treatment more proactively and reliably.
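The recursive input–output idea summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, a plain least-squares fit stands in for the paper's data-driven statistical model (which handles nonlinear correlations), and the function names `fit_linear`, `predict_linear` and `nrmse` are hypothetical. Normalising the RMSE by the observed range is also an assumption, since the abstract does not state the convention used. The sketch shows a predictor that is informative only at a 6-day lag being forecast 1 day ahead first, so that the final forecast still reaches 7 days ahead.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept (stand-in for the paper's model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply coefficients returned by fit_linear."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def nrmse(obs, pred):
    """RMSE divided by the observed range (assumed normalisation)."""
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    return rmse / (obs.max() - obs.min())

# Synthetic data (assumption): x is an autocorrelated predictor and the
# target y depends on x with a 6-day lag, i.e. y(t+7) depends on x(t+1).
rng = np.random.default_rng(0)
n = 400
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.9 * x[t - 1] + rng.normal(scale=0.1)
y = np.full(n, np.nan)
y[6:] = 2.0 * x[:-6] + 1.0 + rng.normal(scale=0.05, size=n - 6)

# Forecast origins t for which the target y(t+7) exists.
t_idx = np.arange(0, n - 7)

# Stage 1: forecast the predictor 1 day ahead, x(t+1) from x(t).
stage1 = fit_linear(x[t_idx].reshape(-1, 1), x[t_idx + 1])
x_hat_next = predict_linear(stage1, x[t_idx].reshape(-1, 1))

# Stage 2: fit y(t+7) on the observed x(t+1), then forecast with the
# stage-1 estimate so that only information available at time t is used.
stage2 = fit_linear(x[t_idx + 1].reshape(-1, 1), y[t_idx + 7])
y_hat = predict_linear(stage2, x_hat_next.reshape(-1, 1))

score = nrmse(y[t_idx + 7], y_hat)
print(f"NRMSE of the 7-day-ahead forecast: {score:.3f}")
```

For brevity the sketch fits and scores on the same series; a held-out validation set, as used in the study, would give a more honest error estimate.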
Data-driven recursive input–output multivariate statistical forecasting model: case of DO concentration prediction in Advancetown Lake, Australia
Edoardo Bertone, Rodney A. Stewart, Hong Zhang, Cameron Veal; Data-driven recursive input–output multivariate statistical forecasting model: case of DO concentration prediction in Advancetown Lake, Australia. Journal of Hydroinformatics 1 September 2015; 17 (5): 817–833. doi: https://doi.org/10.2166/hydro.2015.131