This paper investigates the performance of wavelet-based regression models for monthly streamflow forecasting. The wavelet-based regression model combines the wavelet transformation and multiple linear regression (LR). Its forecasts are compared with those of the wavelet-based neural network, which combines the wavelet transformation and a feed forward neural network. The wavelet transformation has a significantly positive effect on modeling performance. In this study, different configurations of the wavelet-based models were applied to forecast the monthly flow. The results show that the wavelet-based feed forward neural network and the wavelet-based linear regression (WLR) produce very good results for 1-month-ahead streamflow forecasting, and that the two techniques perform almost equally well. Among the WLR models, the WLR5 model gives the best results in terms of the performance criteria.

## INTRODUCTION

Monthly river flow forecasting is very important for water resources system planning and management problems such as dam construction, reservoir operation, and flood control. Generally, river flow forecasting models can be grouped under two main techniques: physically based models and black-box models (Sivakumar *et al.* 2002). Physical models, however, depend on inaccurate initial conditions and on parameterization schemes of subscale phenomena, and errors in modeling and observational systems negatively affect the success of forecasting. Black-box models such as the artificial neural network (ANN), on the other hand, are increasingly being used in hydrologic time series forecasting (Maier & Dandy 2000; Cigizoglu 2003; Sudheer 2005; Jain & Kumar 2007; Gao *et al.* 2010). Singh & Borah (2013) successfully predicted the Indian summer rainfall using a feed forward neural network (FFNN). Aksoy & Dahamsheh (2009) used three different ANN models and a linear regression (LR) model for monthly precipitation forecasting. Sattari *et al.* (2012) tested the performance of neural networks for daily reservoir inflow prediction and showed that the FFNN has the best performance criteria.

Hybrid ANN models combined with the wavelet transformation have also been developed for forecasting in recent years (Kim & Valdes 2003; Partal & Cigizoglu 2007; Adamovski & Chan 2011; Ramana *et al.* 2013; Shoaib *et al.* 2014; Belayneh *et al.* 2014; Nourani *et al.* 2014; Makwaba & Tiwari 2014). Cannas *et al.* (2006) studied the prediction of monthly river flow data using the wavelet neural network (WNN) model. They examined two different approaches to river flow prediction and used only the values of the wavelet coefficients from the preceding month. Partal (2009) studied the potential of WNN models for streamflow forecasting. Wavelet-based regression models have likewise been used successfully in monthly streamflow forecasting in the last few years (Budu 2014; Liu *et al.* 2014; Kalteh 2015). Kisi (2009) used a wavelet regression (WR) model for monthly streamflow forecasting and showed that the WR model performed better than the ANN, LR, and auto regressive moving average models. Kisi (2010) used a linear wavelet regression model as an alternative to the ANN models for short-term streamflow forecasting. Sahay & Sehgal (2013) and Sehgal *et al.* (2014) used a wavelet regression model for forecasting 1-day-ahead river stages. Goyal (2014) showed that the wavelet regression method is better than the classical ANN for monthly rainfall prediction.

The purpose of this paper is to test the performance of the wavelet-based linear regression model for monthly streamflow modeling, and to compare it with the performance of the wavelet-based neural network. Two wavelet models were applied to forecast monthly streamflows. The first model combines the wavelet transformation and linear regression (WLR). The second model combines the wavelet transformation and the feed forward neural network (WFFNN). The performances of the wavelet-based models are also compared with those of the conventional methods (FFNN and LR). Different performance evaluation measures were employed to assess the results of the models. This study also examines the forecasting performance of rearranged wavelet series obtained by summing the appropriate wavelet series. The main purpose of this study is to demonstrate that the wavelet regression model is useful and is an alternative to the WNN model for streamflow forecasting.

## METHODS

### Wavelet transformation

The wavelet transformation is a mathematical tool that provides a time–frequency representation of a signal in the time domain.

For a continuous time series $x(t)$, $t \in (-\infty, \infty)$, the wavelet function $\psi(\tau, s)$ can be written as:

$$\psi(\tau, s) = s^{-1/2}\,\psi\!\left(\frac{t-\tau}{s}\right)$$

where $t$ stands for time, $s$ for the wavelet scale, and $\tau$ for the time step in which the window function is iterated (Meyer 1993). The continuous wavelet transformation of $x(t)$ is defined as:

$$W(\tau, s) = s^{-1/2}\int_{-\infty}^{\infty} x(t)\,\psi^{*}\!\left(\frac{t-\tau}{s}\right)\mathrm{d}t$$

where ($*$) indicates the complex conjugate. $W(\tau, s)$ presents a two-dimensional picture of wavelet power under different scales.

In practice, the transformation is computed on a discrete set of scales and locations (Cannas *et al.* 2006). This transformation is called the discrete wavelet transformation and is described as below:

$$\psi_{m,n}\!\left(\frac{t-\tau}{s}\right) = s_0^{-m/2}\,\psi\!\left(\frac{t - n\tau_0 s_0^{m}}{s_0^{m}}\right)$$

where $m$ and $n$ are integers that control, respectively, the scale and time; $s_0$ is a specified fixed dilation step greater than 1; and $\tau_0$ is the location parameter and must be greater than zero. Here, the translation step, $n\tau_0 s_0^{m}$, depends on the dilation, $s_0^{m}$. The most common and simplest choice for the parameters $s_0$ and $\tau_0$ is 2 and 1 (time steps), respectively, so that the sampling of the frequency axis corresponds to dyadic sampling (Cannas *et al.* 2006).

For a discrete time series $x_i$, where $x_i$ occurs at discrete time $i$ (i.e., here integer time steps are used), the discrete wavelet transformation becomes:

$$W_{m,n} = 2^{-m/2}\sum_{i=0}^{N-1} x_i\,\psi\!\left(2^{-m} i - n\right)$$

In this equation, $N$ is the length of the series and $W_{m,n}$ is the wavelet coefficient for the discrete wavelet of scale $s = 2^{m}$ and location $\tau = 2^{m} n$.

The Daubechies wavelet db2 was selected as the mother wavelet in this study. The db2, db3, db4, db5, db6, db7, db8, db9, and db10 wavelets are the common members of the Daubechies wavelet family. Many former studies have generally used the irregular db2 and db4 wavelets (Cannas *et al.* 2006; Nourani *et al.* 2014). The general choice of db2 in previous studies may be due to the fact that only limited information of the data is contained in the wavelet coefficients, which is successfully expressed by the relatively simpler wavelet function db2, a polynomial with two coefficients (Shoaib *et al.* 2014).
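As a minimal sketch of one level of the dyadic discrete transformation described above, the following Python code applies the four-tap Daubechies (db2) filter pair to a short series. The periodic boundary handling, the function name `dwt_db2_level`, and the sample values are assumptions made for illustration; the study's own implementation is not given in the text.

```python
import math

SQRT3 = math.sqrt(3.0)
SCALE = 4.0 * math.sqrt(2.0)

# Low-pass (scaling) filter of the four-tap Daubechies wavelet (db2).
LOW = [(1 + SQRT3) / SCALE, (3 + SQRT3) / SCALE,
       (3 - SQRT3) / SCALE, (1 - SQRT3) / SCALE]
# High-pass (wavelet) filter from the quadrature-mirror relation.
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def dwt_db2_level(x):
    """One dyadic decomposition level, wrapping around at the series end."""
    n = len(x)
    if n < 4 or n % 2:
        raise ValueError("need an even number of samples, at least 4")
    approx, detail = [], []
    for k in range(n // 2):
        # Step by 2 samples (dyadic translation), window of 4 samples.
        window = [x[(2 * k + j) % n] for j in range(4)]
        approx.append(sum(h * v for h, v in zip(LOW, window)))
        detail.append(sum(g * v for g, v in zip(HIGH, window)))
    return approx, detail


# Made-up monthly values, not data from the study.
flows = [12.0, 30.0, 55.0, 41.0, 20.0, 9.0, 4.0, 2.0]
approx, detail = dwt_db2_level(flows)
```

Applying the function again to `approx` gives the coefficients of the next coarser level, which is how the scale doubles from one level to the next.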

### FFNN

A neural network trained with the feed forward back-propagation algorithm is often used in water resources planning (Wang *et al.* 2011). The network structure has one input layer, one hidden layer of neurons, and one output layer. Neurons in different layers are connected through adjustable weights, and each neuron is connected only with neurons in the next layer (Cigizoglu 2004). Each neuron sums its weighted inputs and then produces its output through an activation function. In this study, the tangent sigmoid function is used as the neuron transfer function.

Predicted output values generally differ from the observed values. The connection weights are modified based on the differences between the computed and observed values at the output layer; this is the back-propagation process. The feed forward pass is then repeated until a target total error or a prescribed number of iterations is reached. More details on neural networks can be seen in Cigizoglu (2003).

### Multiple LR

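Multiple LR assumes that the relationship between the dependent variable $Y$ and the predictor variables $X_j$ is linear. The LR equation is defined as follows:

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$

where $\alpha$ is called the intercept and the $\beta_j$ are called slopes or coefficients. The LR equation can be used to make a forecast of the value of $Y$ with the appropriate values of $X_j$.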
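The text does not say how the intercept and slopes are estimated; ordinary least squares is the usual choice and is assumed in the sketch below. The function names and the small data set are hypothetical and serve only to show the mechanics.

```python
def fit_linear_regression(rows, targets):
    """Least-squares fit of targets = alpha + sum_j beta_j * x_j."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept
    p = len(design[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    aug = [[sum(d[i] * d[j] for d in design) for j in range(p)]
           + [sum(d[i] * y for d, y in zip(design, targets))]
           for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    coeffs = [aug[i][p] / aug[i][i] for i in range(p)]
    return coeffs[0], coeffs[1:]


def predict(alpha, betas, row):
    """Evaluate the fitted LR equation for one set of predictor values."""
    return alpha + sum(b * x for b, x in zip(betas, row))


# Hypothetical data generated from Y = 1 + 2*X1 - 0.5*X2.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
targets = [1.0 + 2.0 * a - 0.5 * b for a, b in rows]
alpha, betas = fit_linear_regression(rows, targets)
```

Because the example targets are exactly linear in the predictors, the fit recovers the generating intercept and slopes up to rounding error.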

### Model performance comparison

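The root mean square error (RMSE) and correlation coefficient ($R$) statistics were used to evaluate the accuracy of the forecasting model. $R$ shows the degree to which two variables are linearly related and takes a value between +1 and −1 inclusive. The RMSE is defined as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_i^{\mathrm{obs}} - Y_i^{\mathrm{for}}\right)^{2}}$$

in which $N$ is the number of data values, and $Y_i$ is the monthly streamflow (observed, obs, or forecasted, for).

Model performance was also evaluated with the Nash–Sutcliffe model efficiency coefficient ($E$), which is used to assess the predictive power of hydrological models (Adamovski & Chan 2011):

$$E = 1 - \frac{\sum_{i=1}^{N}\left(Y_i^{\mathrm{obs}} - Y_i^{\mathrm{for}}\right)^{2}}{\sum_{i=1}^{N}\left(Y_i^{\mathrm{obs}} - \overline{Y}^{\mathrm{obs}}\right)^{2}}$$

where $\overline{Y}^{\mathrm{obs}}$ is the mean of the observed streamflows. More details on the Nash–Sutcliffe coefficient can be seen in Pulido-Calvo & Gutierrez-Estrada (2009).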
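The three performance criteria can be computed with a few lines of Python. This is a minimal sketch using the standard definitions of RMSE, the Pearson correlation coefficient, and the Nash–Sutcliffe efficiency; the function names and the four sample values are illustrative and are not taken from the study.

```python
import math


def rmse(observed, forecast):
    """Root mean square error between observed and forecast values."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, forecast)) / n)


def correlation(observed, forecast):
    """Pearson correlation coefficient R, between -1 and +1."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return cov / math.sqrt(var_o * var_f)


def nash_sutcliffe(observed, forecast):
    """Nash-Sutcliffe efficiency E; 1 means a perfect forecast."""
    mean_o = sum(observed) / len(observed)
    residual = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    spread = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / spread


# Made-up flows in m^3/s.
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 37.0]
scores = (rmse(observed, forecast),
          correlation(observed, forecast),
          nash_sutcliffe(observed, forecast))
```

Unlike $R$, the efficiency $E$ penalizes a systematic bias, so a forecast can correlate well with the observations and still have a low $E$.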

### Mann–Whitney homogeneity test

The homogeneity of the model predictions was examined with the Mann–Whitney *U*-test to present more evidence on the success of the models. The test assumes that there are two independent samples from two populations, and that the samples have the same shapes and spreads. The test statistic *z* can be calculated by using the formula given by Mann & Whitney (1947). The critical *z* value at the 0.05 significance level is 1.96. If the absolute *z* value is below the critical value, the hypothesis that the two samples come from the same distribution cannot be rejected. More details can be seen in Yue & Wang (2002).
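A sketch of the test statistic is given below. It uses the common large-sample normal approximation without a tie correction, which is an assumption: the exact variant used in the study is not stated, so this code is not expected to reproduce the reported *z* values. The function name and sample values are hypothetical.

```python
import math


def mann_whitney_z(sample_a, sample_b):
    """Return (U, |z|) for sample_a using the normal approximation."""
    n1, n2 = len(sample_a), len(sample_b)
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        mean_rank = (i + 1 + j) / 2.0
        rank_sum_a += mean_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return u, abs(u - mean_u) / sd_u


CRITICAL_Z = 1.96  # two-sided critical value at the 0.05 level

# Hypothetical samples.
u_stat, z_value = mann_whitney_z([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
same_distribution_not_rejected = z_value < CRITICAL_Z
```

For series as long as the 120-month testing period the normal approximation is reasonable; for very small samples such as the one above, exact tables would normally be used instead.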

### Global wavelet spectra

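The global wavelet spectrum (GWS) is the time average of the wavelet power at each scale:

$$\overline{W}^{2}(s) = \frac{1}{T}\sum_{\tau=0}^{T-1}\left|W(\tau, s)\right|^{2}$$

where $T$ is the number of points in the time series. The smoothed Fourier spectrum approaches the GWS when the amount of necessary smoothing is decreased with increasing scale. Hence, the GWS provides an unbiased and consistent estimation of the true power spectrum. More details on the GWS can be seen in Torrence & Compo (1998).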

### Case study

Some of the statistical properties of the monthly flow data are presented in Table 1 (for the entire data set and the testing data set). For station 518 (Manisa), the flows in the entire data set range from 0.1 to 460 m^{3}/s, while those in the testing data set range from 0.1 to 176 m^{3}/s. The training data therefore span a wider range than the testing data, so the trained forecasting models do not have to extrapolate beyond the range seen in training. The auto-correlations of the data are significant (*r*_{1} = 0.73, *r*_{2} = 0.46 for Manisa station).

Some parameters of the streamflow data^{a}

Station | Data set | *x*_{min} (m^{3}/s) | *x*_{mean} (m^{3}/s) | *x*_{max} (m^{3}/s) | *r*_{1} | *r*_{2} | *r*_{3} | *r*_{4}
---|---|---|---|---|---|---|---|---
Manisa (station no. 518) | Entire data | 0.1 | 42 | 460 | 0.73 | 0.46 | 0.24 | 0.09
Manisa (station no. 518) | Testing data | 0.1 | 16 | 176 | 0.59 | 0.28 | 0.11 | 0.06
Aydın (station no. 706) | Entire data | 0.2 | 59 | 300 | 0.8 | 0.56 | 0.35 | 0.18
Aydın (station no. 706) | Testing data | 0.2 | 29 | 174 | 0.54 | 0.24 | 0.09 | 0.04


^{a}*x*_{min}, *x*_{mean}, and *x*_{max} denote minimum, mean, and maximum monthly streamflow; *r*_{1}, *r*_{2}, *r*_{3}, and *r*_{4} denote lag 1, lag 2, lag 3, and lag 4 autocorrelation coefficients, respectively.

## RESULTS

### Wavelet decomposition

In this study, the monthly streamflow data were decomposed into various detail (D) series at different resolution levels by using the wavelet transformation algorithm proposed by Mallat (1989). The algorithm provides wavelet coefficients at determined scales, which enables study of the components at different scales. The flow data were decomposed into an approximation and eight detail components (2, 4, 8, 16, 32, 64, 128, and 256 monthly). D1 (2 monthly) is the highest frequency component, while D8 (256 monthly) is the lowest frequency component. Wavelet-based model results depend on the number of decomposition levels applied in the wavelet transformation. Too few decomposition levels result in poor model performance, whereas an excessive number of levels results in significant computational work. A larger number of levels, however, makes it easier to identify ineffective components in the data, so more successful model results generally require a higher number of decomposition levels. The maximum level of decomposition depends on the size of the data. In this study, the size of the data (468 values) allows a maximum of eight decomposition levels.
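The limit of eight levels is consistent with requiring that the longest dyadic period not exceed the record length. The check below is a minimal sketch under that assumption; the rule itself is not stated in the text, and the function name is hypothetical.

```python
def max_dyadic_level(n_samples):
    """Largest m such that 2**m does not exceed the series length."""
    if n_samples < 1:
        raise ValueError("series must contain at least one value")
    # For a positive integer n, bit_length() - 1 equals floor(log2(n)).
    return n_samples.bit_length() - 1


N_VALUES = 468  # monthly values in the record
levels = max_dyadic_level(N_VALUES)
# Nominal period, in months, of each detail component D1 .. D8.
periods = [2 ** m for m in range(1, levels + 1)]
```

With 468 values, 2^8 = 256 still fits within the record while 2^9 = 512 does not, which gives the eight detail components listed above.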

The correlations between the D series and the observed streamflow are presented in Table 2 for Manisa station. The correlations between the D series at time *t*−1 and the observed streamflow at time *t* show that the D3 component has the highest magnitude (0.54). The D2, D4, D5, D6, D7, D8, and approximation components also show significantly high correlations, while D1 has a negative correlation (−0.14). For time *t*−2 (2 months), the D3, D4, D5, D6, D7, D8, and approximation components have high positive correlations, while the D1 and D2 components have negative correlations. The number of selected components ultimately depends on the user's preference, but setting a limit correlation value is helpful for this purpose. The limit correlation value for selecting the D components was taken as 0.20.

Discrete wavelet components | *t*−1 | *t*−2 | *t*−3 | *t*−4 | *t*−5
---|---|---|---|---|---
D1 (2 monthly) | −0.14 | −0.06 | 0.03 | 0.004 | 0.001
D2 (4 monthly) | 0.26 | −0.16 | −0.19 | −0.16 | 0.008
D3 (8 monthly) | 0.54 | 0.28 | −0.05 | −0.18 | −0.17
D4 (16 monthly) | 0.42 | 0.35 | 0.26 | 0.14 | 0.02
D5 (32 monthly) | 0.32 | 0.31 | 0.29 | 0.26 | 0.22
D6 (64 monthly) | 0.21 | 0.21 | 0.21 | 0.21 | 0.20
D7 (128 monthly) | 0.38 | 0.38 | 0.38 | 0.38 | 0.38
D8 (256 monthly) | 0.43 | 0.43 | 0.43 | 0.43 | 0.43
A (Approximation) | 0.31 | 0.31 | 0.31 | 0.31 | 0.31


SD1 = D2 + D3 + D4 + D5 + D6 + D7 + D8 + A

SD2 = D3 + D4 + D5 + D6 + D7 + D8 + A

SD3 = D4 + D5 + D6 + D7 + D8 + A

SD4 = D5 + D6 + D7 + D8 + A

SD5 = D6 + D7 + D8 + A
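The membership of the summed series can be related to the 0.20 correlation limit. The sketch below assumes that the limit is applied separately at each lag to the Table 2 correlations for Manisa station; this reading is an assumption, and the names `CORR`, `LIMIT`, `select_components`, and `sum_series` are illustrative.

```python
# Correlations from Table 2: component -> values at lags t-1 .. t-5.
CORR = {
    "D1": [-0.14, -0.06, 0.03, 0.004, 0.001],
    "D2": [0.26, -0.16, -0.19, -0.16, 0.008],
    "D3": [0.54, 0.28, -0.05, -0.18, -0.17],
    "D4": [0.42, 0.35, 0.26, 0.14, 0.02],
    "D5": [0.32, 0.31, 0.29, 0.26, 0.22],
    "D6": [0.21, 0.21, 0.21, 0.21, 0.20],
    "D7": [0.38, 0.38, 0.38, 0.38, 0.38],
    "D8": [0.43, 0.43, 0.43, 0.43, 0.43],
    "A": [0.31, 0.31, 0.31, 0.31, 0.31],
}
LIMIT = 0.20


def select_components(lag):
    """Names of the components whose correlation at this lag reaches LIMIT."""
    return [name for name, r in CORR.items() if r[lag - 1] >= LIMIT]


def sum_series(components, names):
    """Add the selected component series value by value."""
    length = len(components[names[0]])
    return [sum(components[name][t] for name in names) for t in range(length)]


sd1_members = select_components(1)
sd2_members = select_components(2)
```

For lags 1 to 4 this selection reproduces the member lists of SD1 to SD4 given above.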

Summed wavelet series | *t*−1 | *t*−2 | *t*−3 | *t*−4 | *t*−5
---|---|---|---|---|---
SD1 | 0.83 | 0.55 | 0.25 | 0.10 | 0.06
SD2 | 0.82 | 0.63 | 0.47 | 0.17 | 0.03
SD3 | 0.63 | 0.66 | 0.58 | 0.53 | 0.47
SD4 | 0.53 | 0.55 | 0.57 | 0.55 | 0.53
SD5 | 0.49 | 0.49 | 0.49 | 0.49 | 0.49


Figure 2 shows how the SD series change with time. As seen in Figure 2, the SD1 and SD2 series agree fairly closely with the observed data, with SD1 agreeing most closely.

Table 3 presents the cross-correlations between the SD series and the observed streamflow for Manisa station. For time *t*−1, SD1 has the highest correlation (0.83), and SD2 also has a significantly high correlation (0.82). For time *t*−2, SD3 and SD2 show the highest correlations (0.66 and 0.63, respectively). For time *t*−3, SD3 has the highest correlation (0.58). The correlations between the SD series and the observed data are significantly higher than the auto-correlations of the data. For instance, while the lag 3 autocorrelation of the streamflow data is 0.24, the correlation between the SD3 series at time *t*−3 and the streamflow data is 0.58 (Tables 1 and 3).

### Structure of the models

The purpose of the wavelet hybrid models is to forecast the monthly flow using the new wavelet components (SD). The SD series were used as inputs for forecasting. Each of the SD series has a distinct contribution to the observed time series. The SD series were obtained by summing the selected D series as defined in the above section. In the model structures, SD_{t−i} and *Q*_{t−i} denote the SD series at time *t*−*i* and the observed streamflow at time *t*−*i* (*i* = 1, 2, 3, …), respectively. *Q*_{t} is the observed streamflow at time *t*.
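The input structure described above amounts to pairing each target value with earlier values of the chosen series. A minimal sketch, with hypothetical names (`lagged_inputs`, `WLR1_SPEC`) and made-up series values:

```python
def lagged_inputs(series_by_name, spec, target):
    """Build input rows from (series name, lag) pairs, aligned with target[t]."""
    max_lag = max(lag for _, lag in spec)
    rows, y = [], []
    # Start at the first time step for which every lagged value exists.
    for t in range(max_lag, len(target)):
        rows.append([series_by_name[name][t - lag] for name, lag in spec])
        y.append(target[t])
    return rows, y


# Inputs SD1 at t-1, t-2, t-3, and t-4, i.e. a model with four inputs.
WLR1_SPEC = [("SD1", 1), ("SD1", 2), ("SD1", 3), ("SD1", 4)]

# Made-up series, only to show the alignment.
sd1 = [float(v) for v in range(10)]
q = [100.0 + v for v in range(10)]
rows, y = lagged_inputs({"SD1": sd1}, WLR1_SPEC, q)
```

The first four time steps are dropped because their lagged values are not available, so a series of length 10 yields six input rows.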

### Stream flow forecasting

Fourteen different forecast models were evaluated for the monthly flow forecasting (Table 4). The WLR1, WLR2, WLR3, and WLR4 models and the WFFNN1, WFFNN2, WFFNN3, and WFFNN4 models use the SD series as the inputs of the LR and the neural networks, respectively. The LR1 and LR2 models and the FFNN1 and FFNN2 models use the original streamflow values as the inputs of the LR and the neural networks, respectively. The data from 1962 to 1990 (348 values) were used for training, and the data from 1991 to 2000 (120 values) were used for testing. Before being applied, the selected input data were normalized to the range [0, 1] by dividing by their maximum values. The neural network simulations were developed using the MATLAB software program. The tangent sigmoid function is used for the hidden and output node(s). The hidden layer node numbers of each model were determined after trying various network structures. To avoid overfitting in the neural network, the 'early stopping' technique was considered, and the neural network training was stopped after 200 iterations.
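The data split and scaling described above can be sketched as follows. Taking the maximum over the whole series is an assumption (the text does not say whether the maximum of the full record or of the training period was used), and the function names and the synthetic series are hypothetical.

```python
N_TRAIN = (1990 - 1962 + 1) * 12  # 29 years of monthly values
N_TEST = (2000 - 1991 + 1) * 12   # 10 years of monthly values


def scale_by_maximum(series):
    """Divide a non-negative series by its maximum so it lies in [0, 1]."""
    peak = max(series)
    if peak <= 0:
        raise ValueError("series must contain a positive value")
    return [v / peak for v in series], peak


def split_train_test(series, n_train):
    """First n_train values for training, the remainder for testing."""
    return series[:n_train], series[n_train:]


# Synthetic stand-in for a 468-month record, not the observed data.
flows = [10.0 + (month % 12) * 5.0 for month in range(N_TRAIN + N_TEST)]
scaled, peak = scale_by_maximum(flows)
train, test = split_train_test(scaled, N_TRAIN)
```

Forecasts made on the scaled data are multiplied by `peak` to return them to m^{3}/s before the performance criteria are computed.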

Model | Model inputs | Model structure | RMSE (m^{3}/s) | *R* | *E*
---|---|---|---|---|---
WLR1 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} | (4,1) | 7.94 | 0.931 | 0.875
WLR2 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} | (6,1) | 7.54 | 0.941 | 0.885
WLR3 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} | (7,1) | 7.11 | 0.949 | 0.905
WLR4 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} + SD4_{t−4} | (8,1) | 6.81 | 0.950 | 0.907
LR1 | Q_{t−1} + Q_{t−2} + Q_{t−3} | (3,1) | 19.90 | 0.588 | 0.354
LR2 | Q_{t−1} + Q_{t−2} + Q_{t−3} + Q_{t−4} | (4,1) | 20.02 | 0.587 | 0.342
FFNN1 | Q_{t−1} + Q_{t−2} + Q_{t−3} | (3,2,1) | 19.36 | 0.602 | 0.362
FFNN2 | Q_{t−1} + Q_{t−2} + Q_{t−3} + Q_{t−4} | (4,2,1) | 18.04 | 0.621 | 0.394
WFFNN1 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} | (4,3,1) | 7.48 | 0.958 | 0.928
WFFNN2 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} | (6,3,1) | 7.16 | 0.961 | 0.929
WFFNN3 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} | (7,4,1) | 6.75 | 0.957 | 0.928
WFFNN4 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} + SD4_{t−4} | (8,4,1) | 8.29 | 0.931 | 0.874
WLR5 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} + SD4_{t−4} + Q_{t−1} + Q_{t−2} + Q_{t−3} | (11,1) | 1.70 | 0.997 | 0.994
WFFNN5 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} + Q_{t−1} + Q_{t−2} + Q_{t−3} + Q_{t−4} | (11,4,1) | 2.07 | 0.994 | 0.988


Among the WLR1 to WLR4 models, the WLR4 model with the configuration (8,1) provided the best performance criteria for Manisa station (RMSE = 6.81 m^{3}/s; *E* = 0.907, *R* = 0.950). Here, the configuration (8,1) denotes an LR model comprising eight inputs and one output. The WLR4 equation is:

[WLR4 regression equation]

For the LR1 model, the correlation coefficient in the testing period was 0.588 (the model equation is *Q*_{t} = 0.85*Q*_{t−1} − 0.10*Q*_{t−2} − 0.07*Q*_{t−3}).

The WFFNN2 model, with six inputs (SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2}) and the configuration (6,3,1), provided one of the best performance criteria (RMSE = 7.16 m^{3}/s, *R* = 0.961, *E* = 0.929). The *E* coefficient shows a very strong relationship between the forecasted and the observed values. Here, the configuration (6,3,1) denotes a neural network model comprising six input nodes, three hidden nodes, and one output node. The results of the WFFNN2 model are slightly better than those of the WLR4 model in terms of the *E* and *R* criteria, whereas the WLR4 model is slightly better than the WFFNN2 model in terms of the RMSE criterion. The WLR models are superior to the LR models: while the correlation coefficient obtained by the LR1 model is 0.588, with the WLR4 model it increases to 0.950. Previous studies have also found the WLR models to be more accurate than the LR models (Budu 2014). In addition, the results of the FFNN models are better than those of the LR models in terms of the performance criteria.

The WLR5 and WFFNN5 models use the SD series (SD_{t−i}) and the observed series (*Q*_{t−i}) in the same input layer for forecast. A regression equation is used for estimating the relationships among variables, and it focuses on the relationship between dependent and independent variables. The ANN approach, by contrast, employs a non-linear function to model the input–output relationship. Both approaches are designed to identify the connection between the inputs and the outputs without analyzing the internal structure of the physical process (Sivakumar *et al.* 2002). The success of the forecasting models is sensitive to the selected inputs. The WLR5 model consists of 11 inputs and has the best performance criteria among all the models (RMSE = 1.70 m^{3}/s; *R* = 0.997; *E* = 0.994 at Manisa station). The WLR5 equation obtained for Manisa station is:

[WLR5 regression equation, Manisa station]

The results show that the WLR5 model performed better than the other WLR models. While the *R* value of the WLR4 model is 0.95, the *R* value of the WLR5 model is 0.997. The RMSE differences between the WLR5 and the other models in the testing period were also quite apparent. The WFFNN5 model, having 10 inputs (SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2}) and the configuration (10,4,1), provided one of the best performance criteria (RMSE = 2.07 m^{3}/s; *R* = 0.993, *E* = 0.986). The estimated accuracy is 99% for the WLR5 and the WFFNN5 models.

For Aydın station (Table 5), the WFFNN3 model provided one of the best performance criteria (RMSE = 6.73 m^{3}/s; *R* = 0.959, *E* = 0.92). For the WLR4 model, the correlation coefficient was 0.939. The equation of the WLR4 model is:

[WLR4 regression equation, Aydın station]

While the *R* value obtained by the FFNN1 model is 0.545, for the LR1 model this value is 0.501. On the other hand, the estimated accuracy is 95% with the WFFNN3 model and 99% with the WFFNN5 model. The *E* value is 0.992 for the WLR5 model. The equation of the WLR5 model is:

[WLR5 regression equation, Aydın station]

From the WLR equations, it is clear that adding previous SD1 values to the inputs generally increases the accuracy of the models. The WLR equations also show that the SD1 series have high regression coefficients. SD1 contains all the wavelet components except D1, which is assumed to be the noise component in the data. The WFFNN5 model and the WLR5 model show the best performance in terms of the performance criteria. The forecasting performance of the SD series obtained by summing the appropriate wavelet series was investigated, and the results show that SD1 is the most used series for all the wavelet-based model types.

Model | Model inputs | Model structure | RMSE (m^{3}/s) | *R* | *E*
---|---|---|---|---|---
WLR4 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} + SD4_{t−4} | (8,1) | 7.54 | 0.939 | 0.892
LR1 | Q_{t−1} + Q_{t−2} + Q_{t−3} | (3,1) | 21.50 | 0.501 | 0.299
FFNN1 | Q_{t−1} + Q_{t−2} + Q_{t−3} | (3,3,1) | 21.46 | 0.545 | 0.289
WFFNN3 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} | (7,3,1) | 6.73 | 0.959 | 0.920
WLR5 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} + SD4_{t−4} + Q_{t−1} + Q_{t−2} + Q_{t−3} | (11,1) | 2.06 | 0.996 | 0.992
WFFNN5 | SD1_{t−1} + SD1_{t−2} + SD1_{t−3} + SD1_{t−4} + SD2_{t−1} + SD2_{t−2} + SD3_{t−3} + Q_{t−1} + Q_{t−2} + Q_{t−3} + Q_{t−4} | (11,5,1) | 2.79 | 0.997 | 0.992


### Comparison of the models

Table 6 compares the observed maximum streamflow values with the corresponding forecasted values for each year in the testing period (for Manisa station). The differences from the observed peaks and the total absolute differences (m^{3}/s) were computed. The models generally underestimate the corresponding streamflows. For instance, for months 73–84 of the testing period, the WLR4 model underestimates the maximum peak (measured value 36 m^{3}/s) by 8.17 m^{3}/s, while the WFFNN2 model underestimates it by 5.51 m^{3}/s. For the whole testing period, the total absolute difference is 78.77 m^{3}/s for the WLR4 model and 62.59 m^{3}/s for the WFFNN2 model. The total absolute differences for the WLR5 model and for the WFFNN5 model are 3.85 m^{3}/s and 11.41 m^{3}/s, respectively. The WLR5 model therefore performed markedly better than the other models, mainly because its errors at the peak streamflows are much smaller. For example, the WLR5 model underestimates the maximum peak for months 73–84 (measured value 36 m^{3}/s) by only 0.04 m^{3}/s. The total absolute difference for the LR1 model is 234.80 m^{3}/s. The results clearly show that the WLR models are much better than the LR models.

(Observed − forecasted) differences from peaks (m^{3}/s), by model

Months | Observed peaks (m^{3}/s) | WLR4 | WFFNN2 | WLR5 | WFFNN5 | FFNN2 | LR1
---|---|---|---|---|---|---|---
0–12 | 19 | 1.78 | −2.50 | 0.03 | −1.04 | 1.62 | −3.96
13–24 | 9.7 | −0.59 | −4.17 | −0.40 | −1.64 | −1.33 | −9.05
25–36 | 29 | 0.27 | −3.04 | −0.21 | −1.07 | 6.06 | −0.45
37–48 | 17.2 | 3.42 | 1.67 | −0.43 | −0.87 | 11.02 | 4.04
49–60 | 35.8 | 13.34 | 11.74 | 0.73 | −0.46 | 20.44 | 17.23
61–72 | 40.4 | 7.21 | 8.99 | 0.28 | 0.08 | 21.63 | 20.29
73–84 | 36 | 8.17 | 5.51 | 0.04 | −0.61 | 23.44 | 19.05
85–96 | 46.8 | 6.56 | 7.08 | 0.20 | −0.71 | 13.08 | 11.33
97–108 | 176 | 35.43 | 15.55 | 0.51 | −3.72 | 124.22 | 145.48
109–120 | 29.2 | 1.99 | −2.33 | 1.03 | 1.22 | 1.83 | −4.01
Total absolute differences | | 78.77 | 62.59 | 3.85 | 11.41 | 224.66 | 234.80
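The totals in the last row can be recomputed from the yearly peak differences. The sketch below does this for four of the models using the tabulated values; the names `PEAK_DIFFS` and `total_absolute_difference` are illustrative.

```python
# (Observed - forecasted) peak differences in m^3/s, one value per year.
PEAK_DIFFS = {
    "WLR4": [1.78, -0.59, 0.27, 3.42, 13.34, 7.21, 8.17, 6.56, 35.43, 1.99],
    "WFFNN2": [-2.50, -4.17, -3.04, 1.67, 11.74, 8.99, 5.51, 7.08, 15.55, -2.33],
    "WLR5": [0.03, -0.40, -0.21, -0.43, 0.73, 0.28, 0.04, 0.20, 0.51, 1.03],
    "WFFNN5": [-1.04, -1.64, -1.07, -0.87, -0.46, 0.08, -0.61, -0.71, -3.72, 1.22],
}


def total_absolute_difference(diffs):
    """Sum of the absolute yearly peak errors."""
    return sum(abs(d) for d in diffs)


def count_underestimates(diffs):
    """Years in which the forecast peak is below the observed peak."""
    return sum(1 for d in diffs if d > 0)


totals = {model: total_absolute_difference(d) for model, d in PEAK_DIFFS.items()}
```

The recomputed totals agree with the tabulated ones to within about 0.01 m^{3}/s, the difference being due to rounding of the individual entries.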


In this section, the success of the model forecasting was also examined with the Mann–Whitney *U*-test, which presents more evidence on the success of the models. If the *z*-value is below the critical value (1.96) at the 95% confidence level, the two populations have the same median, which means that they come from the same distribution. The results show that there is no statistically significant difference in the testing period between the forecasts of the WLR4, WLR5, WFFNN2, and WFFNN5 models and the observed values (Table 7). The *z*-value of the WLR5 model is the lowest (0.28 for Manisa station). In contrast, the FFNN and LR forecasts are not homogeneous with the observations for the testing set. In terms of the *z* statistic, the WLR5 results are better than the results of the other models.

Model | Mann–Whitney *U* statistic | *z*
---|---|---
WLR4 | 7,891 | 1.06
WFFNN2 | 8,241 | 1.84
WLR5 | 7,414 | 0.28
WFFNN5 | 7,901 | 1.09
FFNN | 9,590 | 3.84
LR | 11,040 | 7.03


## CONCLUSIONS

The aim of this study was to investigate the capability of the wavelet-based regression model for forecasting monthly streamflow. Five different WLR models were applied to forecast the monthly flow. The performances of the wavelet-based regression models were compared with those of the other models (WFFNN, FFNN, and LR), and different performance evaluation measures were employed to assess the results.

The observed flow data were decomposed into the D series by the wavelet transformation. Later, the SD series, arranged by collecting the appropriate D series, were employed as inputs of the forecast models. One of the greatest difficulties in wavelet modeling lies in determining the appropriate model inputs for such a problem. Therefore, choosing the most appropriate inputs for forecasting modeling is very important. Thus, the different SD series were tested as the inputs of the models in this study. As a result, SD1 was found, in general, to be the most appropriate series.

The model results show that the WLR and WFFNN models provide better results than the LR and FFNN models. The *R* value obtained by the LR model was 0.58, whereas the WLR models provided values in the range of 0.90–0.95. Similarly, the *R* values obtained by the FFNN model are in the range of 0.54–0.62, whereas the WFFNN models provided values in the range of 0.90–0.96. This means that using the optimal wavelet series affects the forecast ability positively. The results show that the WLR5 model demonstrated much better performance than the other WLR models in terms of the performance criteria. For example, the forecasted accuracy was 96% with the WLR4 model while it was 99% with the WLR5 model. The *E* value obtained by the WFFNN5 model was within the interval 0.982–0.988, whereas the WLR5 model provided values in the range of 0.992–0.994.

The significant contribution of the present study is to show that wavelet-based models improve forecast ability considerably, because the wavelet transformation allows the noise component in the data to be identified. The noise component is mainly responsible for poor forecast accuracy, so removing it from the data provides smoother and more efficient forecasting results. The significant point here is to determine the appropriate wavelet components for the inputs of the models. This study shows that the SD1 components were, in general, the most appropriate series for streamflow forecasting. The other significant point this study demonstrates is that the wavelet-based regression models give results as good as those of the WNN models when the appropriate inputs are selected. The success of the WLR model can be explained by the high auto-correlation coefficients of the monthly streamflow data and the significantly high correlation between the wavelet SD series and the observed monthly streamflow data. The training performance of the FFNN depends on the random weights assigned at the beginning of each training simulation, so a large number of simulations is needed to select the best neural network. Previous studies had already shown that the WNN models perform better than the classical ANN, ARIMA, and regression models for streamflow forecasting. Another advantage of wavelet-based models (WNN, WLR, etc.) is a better estimation of peak values.

Monthly flows were forecasted accurately by wavelet-based regression models. This study shows that the wavelet regression model is a good alternative as compared to the WNN for monthly streamflow forecasting. Therefore, the WLR models can be successfully used for streamflow forecasting. Future studies should address the comparison of the use of different types of wavelet functions (e.g., Daubechies, Mexican Hat) in wavelet decomposition.

## ACKNOWLEDGEMENTS

The author would like to thank Ondokuz Mayis University for supporting this work.