A two-stage multiple-point conceptual model to predict river stage-discharge process using machine learning approaches

Because the river stage-discharge process is complex, the present study develops a strategy to predict it precisely. The proposed conceptual strategy has several advantages that address the shortcomings of existing approaches. First, it uses one model instead of several models, and it predicts multiple points instead of one point. On the one hand, the constructed model was inspired by physically based models, so that it includes the time-space attributes of the catchment. On the other hand, the ensemble empirical mode decomposition (EEMD) algorithm, wavelet transform (WT), and mutual information (MI) were employed as a hybrid pre-processing approach coupled to a support vector machine. To this end, a conceptual strategy (multistation model) was developed to forecast the Souris River discharge more accurately. The strategy was capable of covering the uncertainties and complexities of river discharge modeling. First, a classic model along with WT was used to predict the 1-day-ahead river discharge for each single station. Then, DWT-EEMD decomposition was performed, and MI-based feature selection was applied to the decomposed subseries to be used in the conceptual models. In the proposed feature selection method, uninformative subseries were deleted to achieve better performance. The results confirmed the efficiency of the proposed WT-EEMD-MI approach in improving the accuracy of different modeling strategies.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
doi: 10.2166/wcc.2020.006
http://iwaponline.com/jwcc/article-pdf/12/1/278/851835/jwc0120278.pdf
Farhad Alizadeh (corresponding author), Alireza Faregh Gharamaleki, Rasoul Jalilzadeh
East Azerbaijan Regional Water Company, Tabriz, East Azerbaijan, Iran
E-mail: f.alizadeh.ce@gmail.com


INTRODUCTION
Collecting continuous discharge measurements is a difficult task and, therefore, a stage-discharge (Q-H) relationship is commonly used to estimate stream discharge from measured stage values. Rating curves are widely used to determine the Q-H relationship, although they do not always provide sufficiently accurate results. A Q-H rating curve is a relationship between stream stage (water level) and discharge for a particular section of a stream.
Such a relationship is usually established by a regression-based analysis, and the curves are generally expressed in the form of a power equation (Kisi & Cobenar ; Farhan et al. ). However, a single Q-H relationship may not hold adequately across different river conditions.
For this reason, various researchers have suggested applying AI-based models, which have been shown to provide more accurate outcomes than the Q-H relationship (Potter et al. ; Ghorbani et al. ).
Artificial intelligence (AI) has been successfully applied in a number of diverse fields, including water resources. Wavelet transform (WT) and ensemble empirical mode decomposition (EEMD) are pre-processing methods that give considerable insight into the physical form of the data by representing information in both the time and frequency domains (Roushangar & Alizadeh ). EEMD has proven to be quite effective for extracting signals from data generated by noisy, nonlinear, and non-stationary processes (Huang et al. ). The WT breaks a time series down into a series of linearly independent detail signals and one approximation signal by using a specific wavelet function. Mallat () presented a complete theory for wavelet multi-resolution signal decomposition (also referred to as the pyramid decomposition algorithm).
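The pyramid idea of splitting a series into one approximation signal and several detail signals can be illustrated with a minimal sketch. It uses the Haar wavelet purely for simplicity, which is an assumption made here for illustration; the function names and the sample `flow` values are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_pyramid(signal, levels):
    """Pyramid decomposition: at each level only the approximation is split again."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def energy(values):
    """Sum of squares, used to check that the transform preserves energy."""
    return sum(v * v for v in values)

# Hypothetical daily discharge values (length must be divisible by 2**levels).
flow = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_pyramid(flow, levels=2)
```

Each detail list holds the variation removed at one dyadic scale, while `approx` keeps the smooth remainder. A longer wavelet filter would change the coefficients but not the pyramid structure.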
Research has confirmed that proper data pre-processing with the WT and EEMD helps models adequately represent the real characteristics of the underlying system. WT along with EEMD decomposes a non-stationary signal into a given number of stationary sub-signals. AI methods can then be combined with WT and EEMD to improve prediction accuracy. In the present research, feature extraction and learning from historical data are the key to improving forecasting accuracy. For this reason, much of the effort goes into feature extraction, data pre-processing, and transformations. Maheswaran
In the modeling process based on AI approaches, some input variables may be correlated, noisy, or have no significant relationship with the target variables, and in general they are not equally informative. Therefore, one important step is to determine dominant input variables that are independent, informative, and effective. Mutual information (MI), as a nonlinear measure of information content, can be useful for selecting effective inputs among the large number of WT-based or WT-EEMD subseries (Sannasiraja et
Eel River watershed runoff. The quality of the results supported the beneficial application of the approach in spatiotemporal modeling of hydrological processes. Multistation modeling is a conceptual approach whose capability has been confirmed and, in some cases, imposing geomorphological features on the model improved the results (Zhang & Govindaraju ; Sarangi et al. ).
The present study aims to predict the daily river stage-discharge process and to enhance the capability of modeling scenarios by using hybrid WT-EEMD-MI and machine learning models. To achieve this goal, the capability of a WT-EEMD-MI based multi-station model was investigated, the resulting improvements were studied, and the results were compared with the classic rating curve (RC).

Study area and used data
The Souris River is a river in central
Records from the selected stations (Sherwood, Foxholm, Minot, Verendrye, Bantry, and Westhope), shown in Figure 1, are continuous, which makes them suitable for the proposed models. The stage-discharge data are daily and range from 1 January 2012 to 1 January 2013. The dataset was separated into two parts: the first part, comprising 70% of the total data, was used as the calibration dataset, and the rest was used as the verification dataset. The historical daily discharge data for the six stations on the Souris River were available from the USGS website (http://co.water.usgs.gov/sediment).
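The 70/30 division described above can be sketched as a simple chronological split. This is a minimal illustration that assumes the split is made in time order without shuffling, which the text implies but does not state; `chronological_split` and the placeholder `records` list are hypothetical.

```python
def chronological_split(records, calibration_fraction=0.7):
    """Split a time-ordered sequence into calibration and verification parts."""
    if not 0.0 < calibration_fraction < 1.0:
        raise ValueError("calibration_fraction must be between 0 and 1")
    cut = int(len(records) * calibration_fraction)
    # The earliest records are used for calibration, the latest for verification.
    return records[:cut], records[cut:]

# Placeholder daily records; real data would be (stage, discharge) pairs per day.
records = list(range(365))
calibration, verification = chronological_split(records)
```

Keeping the order intact avoids leaking future observations into the calibration set, which matters for lagged time-series inputs.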

Rating curves (RC)
A discharge-height rating curve (RC) technique consists of a graph or equation relating discharge to stream stage (water level), which can be used to estimate the discharge from the water level record. The Q-H rating curve generally represents a functional relationship of the form given in Equation (1):
$Q = aH^{b}$ (1)
where Q is the stream discharge, H is the water level, and a and b are fitted coefficients.
The values of a and b for a particular stream are determined from data via a linear regression between log H and log Q.
A major limitation of this approach is that it cannot account for the hysteresis effect. In this study, the values of a and b are computed by using the least squares method.
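As a minimal sketch of the fitting step, the coefficients of a power-law rating curve Q = aH^b can be obtained by ordinary least squares on log H and log Q. The function name and the synthetic stage values below are assumptions for illustration, not the study's data.

```python
import math

def fit_rating_curve(stages, discharges):
    """Fit Q = a * H**b by least squares on (log H, log Q); returns (a, b)."""
    xs = [math.log(h) for h in stages]
    ys = [math.log(q) for q in discharges]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)     # intercept transformed back from log space
    return a, b

# Synthetic check: data generated from a known curve with a = 2.0 and b = 1.5.
stages = [0.5, 1.0, 1.5, 2.0, 3.0]
discharges = [2.0 * h ** 1.5 for h in stages]
a, b = fit_rating_curve(stages, discharges)
```

Because the regression is done in log space, it minimizes relative rather than absolute errors in Q, and stage and discharge values must be strictly positive.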

Support vector machine (SVM)
Support vector machine (SVM) is a supervised learning model with associated learning algorithms that analyze data and recognize patterns, and it is used for classification and regression analysis. An SVM model represents the samples as points in space, mapped so that the samples of the separate categories are divided by a clear gap that is as wide as possible.

Alternative regression approaches
In this study, in order to assess the performance of the SVM, five regression models were selected for comparison, as follows.

Adaptive neuro-fuzzy inference system (ANFIS)
The ANFIS is a fuzzy Sugeno model put in the framework of adaptive systems to facilitate learning and adaptation (Jang ). Such a framework makes ANFIS modeling more systematic and less reliant on expert knowledge. To present the ANFIS architecture, two fuzzy if-then rules based on a first-order Sugeno model are considered:
Rule 1: If $x$ is $A_1$ and $y$ is $B_1$, then $f_1 = p_1 x + q_1 y + r_1$
Rule 2: If $x$ is $A_2$ and $y$ is $B_2$, then $f_2 = p_2 x + q_2 y + r_2$
where $x$ and $y$ are the inputs, $A_i$ and $B_i$ are the fuzzy sets, $f_i$ are the outputs within the fuzzy region specified by the fuzzy rule, and $p_i$, $q_i$, and $r_i$ are the design parameters that are determined during the training process. The output of the ANFIS is calculated by employing the consequent parameters found in the forward pass. The detailed algorithm and mathematical background of the hybrid-learning algorithm can be found in Jang ().
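To make the two-rule structure concrete, the following sketch evaluates a first-order Sugeno system for fixed parameters. It assumes Gaussian membership functions, product firing strengths, and a weighted-average output, which are common choices rather than details given in the text; all parameter values are hypothetical, and no training is performed.

```python
import math

def gaussmf(value, center, sigma):
    """Gaussian membership degree of `value` in a fuzzy set."""
    return math.exp(-((value - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_output(x, y, rules):
    """Weighted average of the rule outputs f_i = p_i*x + q_i*y + r_i."""
    weights = []
    outputs = []
    for rule in rules:
        # Firing strength: product of the memberships of x in A_i and y in B_i.
        w = gaussmf(x, *rule["A"]) * gaussmf(y, *rule["B"])
        p, q, r = rule["pqr"]
        weights.append(w)
        outputs.append(p * x + q * y + r)
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, outputs)) / total

# Hypothetical parameters: each fuzzy set is (center, sigma), consequents are (p, q, r).
rules = [
    {"A": (0.0, 1.0), "B": (0.0, 1.0), "pqr": (1.0, 2.0, 3.0)},
    {"A": (2.0, 1.0), "B": (2.0, 1.0), "pqr": (0.5, 0.5, 0.0)},
]
result = sugeno_output(1.0, 2.0, rules)
```

In a trained ANFIS, the membership parameters and the consequent parameters would be fitted to data instead of being set by hand.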

Artificial neural network-radial basis function (ANN-RBF)
The radial basis function (RBF) neural network is based on supervised learning. RBF networks were independently proposed by many researchers and are a popular alternative to the multi-layer perceptron (MLP). RBF networks are also good at modeling nonlinear data, can be trained in one stage rather than through an iterative process as in the MLP, and learn the given application quickly (Venkatesan & Anitha ). The structure of an RBF neural network is similar to that of an MLP. It consists of layers of neurons.
The main distinction is that RBF has a hidden layer which contains nodes called RBF units. Each RBF has two key parameters that describe the location of the function's center and its deviation or width. The hidden unit measures the distance between an input data vector and the center of its RBF. The RBF has its peak when the distance between its center and that of the input data vector is zero and declines gradually as this distance increases. There is only a single
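The behavior of a single RBF unit described above can be sketched as follows. The Gaussian form is an assumption made for illustration, since the text states only that the response peaks at zero distance and declines as the distance grows; the function name is hypothetical.

```python
import math

def rbf_unit(inputs, center, width):
    """Gaussian RBF response: 1 at the center, decaying with squared distance."""
    dist_sq = sum((x - c) ** 2 for x, c in zip(inputs, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))

center = [1.0, 2.0]
at_center = rbf_unit([1.0, 2.0], center, width=0.5)
nearby = rbf_unit([1.2, 2.0], center, width=0.5)
far_away = rbf_unit([3.0, 4.0], center, width=0.5)
```

The `width` parameter controls how quickly the response falls off, so a larger width lets one unit respond to a wider neighborhood of inputs.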

Concept of modeling via AI-based approaches
According to the reviewed literature, most of the deficiencies in applying AI-based models are related to their input-output structure. A simple AI model may not be able to handle the natural uncertainty of specific hydrological processes completely. For instance, an ad hoc AI model is not designed to predict the variable of interest at an arbitrary point along the river. Therefore, models with semi-distributed characteristics are required. To this end, different scenarios for river discharge modeling were proposed in this research. Multi-station modeling via SVM was used as a conceptual model in this study and is discussed below.

Multi-station modeling of river discharge
The multi-station modeling of river discharge used in this study was designed to capture the nonlinearity, uncertainty, and complexity of the river discharge. For this purpose, the multi-station model was established according to the
where $2^{a}$ represents the dyadic scale of the DWT. Applying

Efficiency criteria
In this study, the total data were separated into calibration and verification sets. Two different criteria were selected to evaluate the performance of the proposed forecasting methods: the root mean square error (RMSE) and the determination coefficient (DC).
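The two criteria can be computed as in the sketch below. The defining equations are not reproduced in the text, so the DC form used here, one minus the ratio of the squared-error sum to the variance sum of the observations, is an assumption based on common hydrological practice; the function names and sample values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def determination_coefficient(observed, predicted):
    """DC = 1 - SSE / SST (assumed form); 1 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

observed = [1.0, 2.0, 3.0, 4.0]
perfect = list(observed)
mean_only = [2.5] * 4   # a model that always predicts the observed mean
```

Under this definition, a model that always predicts the observed mean scores DC = 0, so positive values indicate skill beyond the mean.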

RESULTS AND DISCUSSION
The RC is an empirical model which extracts information from recorded stage values. The RC was applied to all six hydrometric stations. For instance, Figures 4(a) and 4(b) show the RC for the Sherwood and Minot stations.
Fitting the RC by the least squares method led to the following results:
The modeling results are shown in Figure 4(c).
These results indicate that the RC performance needs to be improved in terms of DC, RMSE, and MAE. The results of modeling using SVM are discussed in the next section.

Optimizing of SVM parameters
In this study, SVM was used to predict the discharge values, wherein σ is a kernel-specific parameter and γ is the margin parameter.
The tuned parameters (σ, γ) were optimized via a grid search procedure. The resulting model is therefore called the optimized SVM (OSVM).
The RBF kernel function was selected as the core tool of the OSVM and was used for all the remaining models. The OSVM requires the choice of three crucial variables, namely ε, C, and γ, in which γ is a variable of the RBF kernel function. In order to set the above-mentioned variables, a systematic grid search was used. For this grid search, a normal range of variable settings was considered. Initially, tuned values of ε and C were obtained for a fixed γ, and then γ was varied. Performance optimization procedures were applied to find the best values. Figure 5 illustrates the statistical measures for various γ values of the SVM model (fed with the input configuration). Accordingly, optimal parameters were found for the rest of the models.
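The staged search described above, tuning ε and C for a fixed γ and then varying γ, can be sketched generically. The `score_fn` callable stands in for training an SVM and returning its verification error; the `toy_error` surface, the grids, and all names are hypothetical and serve only to show the control flow.

```python
import math
from itertools import product

def staged_grid_search(score_fn, eps_grid, c_grid, gamma_grid, gamma_start):
    """Stage 1: best (eps, C) at a fixed gamma. Stage 2: best gamma for that pair."""
    best_eps, best_c = min(
        product(eps_grid, c_grid),
        key=lambda pair: score_fn(pair[0], pair[1], gamma_start),
    )
    best_gamma = min(gamma_grid, key=lambda g: score_fn(best_eps, best_c, g))
    return best_eps, best_c, best_gamma, score_fn(best_eps, best_c, best_gamma)

def toy_error(eps, c, gamma):
    """Stand-in for a verification RMSE; its minimum is at eps=0.1, C=10, gamma=0.5."""
    return (eps - 0.1) ** 2 + (math.log10(c) - 1.0) ** 2 + (gamma - 0.5) ** 2

best = staged_grid_search(
    toy_error,
    eps_grid=[0.01, 0.1, 0.5],
    c_grid=[1.0, 10.0, 100.0],
    gamma_grid=[0.1, 0.5, 1.0],
    gamma_start=1.0,
)
```

A staged search evaluates far fewer combinations than a full three-dimensional grid, at the cost of possibly missing the joint optimum when the parameters interact strongly.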
A data-driven model's performance mostly depends on its structure. Hence, in this study, a conceptual model was developed to predict the Q-H process of the Souris River using only one model. First, different types of classic and wavelet-based modeling strategies based on temporal features (i.e., stage-discharge time series) were developed to assess the connection between discharge and river stage over the Souris River. Next, a unique multi-station model was constructed for the hydrometric stations.

Results of multi-station model
The river flow process in any cross section of a river system can be characterized as a function of various variables, such as discharge and the physical characteristics of the catchment and river. The relationship between river discharge and influential variables can be expressed by:
where $Q(t+1)$ denotes the river discharge in any cross section of the river system, $\varepsilon_{t+1}$ is the random error, and $x(t+1)$ is the input vector, which may include many variables such as
In order to determine input variables for the AI-based models, the input data were extracted based on the principle underlying the RC, in which different model structures were selected by combining stage and/or discharge variables, including $H_t, H_{t-1}, H_{t-2}, \ldots, H_{t-7}$, representing stage values at times $t, t-1, t-2, \ldots, t-7$, and $Q_{t-1}, Q_{t-2}, \ldots, Q_{t-7}$, representing discharge values at times $t-1, t-2, \ldots, t-7$. Models 1-3 extract information from stage values alone; Models 4-6 extract information from discharge values alone, with a structure similar to autocorrelation; and Models 7-10 extract information from both stage and discharge values. Although the RC is not the only technique, Table 2 shows the possible choices of dependent variables for each modeling technique; these structures must be optimized to avoid over-fitting. In the present study, only the best structures are used to report the performance of the proposed models. The performance measures outlined in 'Efficiency criteria' were used to select the particular model structure, but other techniques were also used to assist the choice, e.g., scatter diagrams or plots of the difference between individual model performances.
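The lagged input structures described above can be assembled as in the following sketch. It assumes the target is the discharge at time t, with stage lags starting at 0 and discharge lags starting at 1; the function name and the sample series are hypothetical.

```python
def build_lagged_inputs(stage, discharge, stage_lags, discharge_lags):
    """Build rows [H(t-l) for stage lags] + [Q(t-l) for discharge lags], target Q(t)."""
    all_lags = list(stage_lags) + list(discharge_lags)
    max_lag = max(all_lags)
    rows = []
    targets = []
    # Start at max_lag so that every requested lag exists in the record.
    for t in range(max_lag, len(discharge)):
        row = [stage[t - lag] for lag in stage_lags]
        row += [discharge[t - lag] for lag in discharge_lags]
        rows.append(row)
        targets.append(discharge[t])
    return rows, targets

# Hypothetical short series; a combined structure using H(t), H(t-1), and Q(t-1).
stage = [10.0, 11.0, 12.0, 13.0, 14.0]
discharge = [0.0, 1.0, 2.0, 3.0, 4.0]
rows, targets = build_lagged_inputs(stage, discharge, stage_lags=(0, 1), discharge_lags=(1,))
```

Passing an empty tuple for one of the lag sets gives the stage-only or discharge-only structures, so the same helper covers all three model groups.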
Based on the correlation analysis and the modeling outcomes, Combs. 1, 2, 8, and 10 performed better, as explained below.
where L is decomposition level and N is the number of time
Results are presented in Figure 6. According to Figure 6, db(4,5) outperformed the classic modeling (antecedent input) and the other wavelet-based inputs. Figure 7 presents the scatterplots of observed against predicted values (best structures in the verification period) for the Foxholm station.
The proposed multi-station model, in addition to the Q-H time series, took advantage of spatially varying parameters (Nourani et al. ). In this way, the dimensionless geomorphological parameters were applied in the model. Therefore, the model construction for each sub-basin could be represented as:
In Equation (8)
In order to find an efficient structure for the proposed multi-station model, a sensitivity analysis was performed. To this end, the six best input combinations were considered to create the input matrix (Figure 8). Figure 8 shows the multi-station model's performance for different input combinations. In order to explore the efficiency of physical parameters, Comb.
According to the obtained results, imposing dimensionless physical parameters along with the temporal dataset increased modeling accuracy (Combs. (2)-(6)). Generally, Combs. (2)-(6)
Considering the improvement in Comb. 5, the WT-EEMD-MI pre-processing approach was applied to find more dominant subseries. For this reason, details 1 and 2 of db(4,5) were further decomposed using EEMD. MI was used to select the appropriate IMFs from the decomposed subseries.
According to Comb. 6, IMFs 8 and 9 from detail subseries 1 and IMFs 6 and 7 from detail subseries 2 were selected via the proposed pre-processing approach. Figure 9 presents
OSVMs' capability as a nonlinear prediction approach.
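The MI-based selection of subseries can be illustrated with a minimal histogram estimator. The equal-width binning, the bin count, and the ranking helper are assumptions made for this sketch, and the candidate names `imf_a` and `imf_b` are hypothetical.

```python
import math
from collections import Counter

def mutual_information(x, y, bins=4):
    """Histogram estimate of MI (in nats) between two equally long series."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0   # a constant series falls into one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    count_x = Counter(bx)
    count_y = Counter(by)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log( p(i,j) / (p(i) * p(j)) ), written with raw counts.
        mi += (c / n) * math.log(c * n / (count_x[i] * count_y[j]))
    return mi

def select_subseries(candidates, target, keep):
    """Rank candidate subseries by MI with the target and keep the top ones."""
    ranked = sorted(
        candidates,
        key=lambda name: mutual_information(candidates[name], target),
        reverse=True,
    )
    return ranked[:keep]

target = [float(v) for v in range(16)]
candidates = {"imf_a": list(target), "imf_b": [1.0] * 16}
selected = select_subseries(candidates, target, keep=1)
```

Histogram estimates depend on the bin count and are biased for short series, so in practice the ranking is more reliable than the absolute MI values.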
The structures of the ARIMAX and ANN-RBF models were optimized by a trial-and-error process, whereas ELM is a self-tuning regression tool. In a similar manner, and using the selected input combinations, the ANFIS model was also used to predict the river discharge. In ANFIS modeling, two substantial points must be considered: first, the ANFIS architecture (e.g., the number and type of MFs) and second, the calibration procedure (e.g., the selection of the iteration number and efficiency criteria). Based on the models, the Gaussian membership function (gaussmf) was selected because of its accurate results.

CONCLUSIONS
In this research, a two-stage conceptual model was proposed and verified to explore its quantitative and qualitative capability in predicting the river Q-H process. At the first stage, a novel DWT-EEMD based pre-processing approach was used to capture signals with more stationary properties. Application of the model led to results that fall into two categories: the proposed hydrological models and the proposed pre-processing approach. A conceptual strategy, namely a multi-station model, was proposed to study the Q-H relation of the river. All proposed models have unique capabilities and some deficiencies, which are discussed below.
• Rating curve: RC is a physical-based model which is used to extract information from river stage datasets.