For the prediction of river flow sequences, the accuracy for extreme river flow is insufficient owing to the non-stationarity and randomness of the sequence. In this study, the sparse factor of the loss function in a sparse autoencoder was adjusted using the inverse method of simulated annealing, yielding an enhanced sparse autoencoder (ESA), and the river flow at the Kenswat Station in the Manas River Basin in northern Xinjiang, China, at 9:00, 15:00, and 20:00 daily during June, July, and August of 1998–2000 was considered as the study sequence. With initial sparse factor values β0 of 5, 10, 15, 20, and 25 and hidden layers of 60, 70, 80, 90, and 100 neurons, experiments were designed to explore the relationship between the output characteristics of the hidden layer and the original river flow sequence after the network was trained under the various sparse factors and numbers of hidden neurons. Meanwhile, the orthogonal experimental groups ESA1, ESA2, ESA3, ESA4, and ESA5 were designed to predict the daily average river flow in September 2000, and the results were compared with those of the support vector machine (SVM) and the feedforward neural network (FFNN). The results indicate that after ESA training, the output of the hidden layer contains a large number of features of the original river flow sequence, and the boundaries of these features can reflect the parts of the series with large changes; the upper bound of the features can reflect the characteristics of the river flow during floods. Meanwhile, the prediction results of the orthogonal experimental groups indicate that the ESA predicts the sequence best when the number of hidden neurons is 90 and β0 = 15. In particular, its fit on days when the river flow 'swells up' is more satisfactory than that of the SVM and FFNN. These results are significant, as they provide a guide for exploring the evolution of river flow under drought and flood as well as for optimally dispatching and managing water resources.

  • An enhanced sparse autoencoder for river flow prediction is proposed.

  • The optimal sparse factor and the number of hidden neurons of ESA are discussed and obtained.

  • The output value of the sparse layer implies extreme flow characteristics.


Under the impact of frequent human activities, river flow sequences demonstrate significant non-stationarity and randomness (Ye et al. 2013), making their prediction increasingly difficult. Frequent human activities cause changes in the local atmospheric circulation, underlying surfaces, and surface and underground river flow, all of which complicate river flow prediction (Han et al. 2009). A large gap exists between the predicted and actual river flow during various time periods every day. Effectively predicting river flow at various times is therefore important for the optimal management and profitable dispatch of water resources.

After carefully considering the meteorological conditions, underlying surface, and other factors related to the study area, several models have been applied for daily average river flow prediction. In short- to medium-term prediction, the Soil and Water Assessment Tool (SWAT; Easton et al. 2010; Jimeno-Saez et al. 2018; Li et al. 2018; Fereidoon et al. 2019), Hydrological Simulation Program – Fortran (HSPF; Lee et al. 2020), Système Hydrologique Européen (SHE; Abbott et al. 1986), Physically based Distributed Tank (PDTank; Lee & Singh 1999), Stanford (Crawford & Linsley 1966), and other conceptual and distributed models can achieve good prediction results in different basins. In mid- to long-term forecasting, machine learning models generally show good prediction performance in the complex environments of different basins; examples include the support vector machine (SVM; Sahoo et al. 2019), extreme learning machine (Zhu et al. 2019), artificial neural network (Jimeno-Saez et al. 2018; Tsakiri et al. 2018; Zhang et al. 2019b), wavelet neural network (Shafaei & Kisi 2017; Alizadeh et al. 2018; Rakhshandehroo et al. 2018; Sharghi et al. 2018; Nourani et al. 2019; Santos et al. 2019; Sharghi et al. 2019; Sun et al. 2019), and fuzzy neural network (Badrzadeh et al. 2018; Bou-Fakhreddine et al. 2018). However, on some specific days, owing to the many factors affecting the occurrence of floods, it is a challenge to sufficiently consider all the natural and human conditions that cause them. Using water–heat exchange and river water quality models combined with GIS and other technologies, it is possible to make relatively accurate short-term predictions of the extreme flow of a river, but the long-term prediction effect is generally insufficient (Ye 2010). On the one hand, these prediction models have significant limitations in fitting practical engineering problems, and the fitting capability differs between models. On the other hand, researchers have been unable to comprehensively consider all the factors that have affected river flow evolution since the 20th century, such as various complex human activities, local meteorological conditions, and underlying surface changes (Engeland et al. 2017; Gangrade et al. 2018; Kisakye et al. 2018; Leta et al. 2018; Yang et al. 2018; Schreiner-McGraw et al. 2019).

Several factors lead to insufficient prediction accuracy for the extreme river flow in a sequence. Therefore, several researchers have studied the evolution of individual constituent sequences during the flood occurrence period (Mirzaei et al. 2015; Requena et al. 2016; Ajadi et al. 2017; Konrad & Dettinger 2017; Yan et al. 2017; Pandey et al. 2018; Sanchez-Garcia et al. 2019). Simultaneously, various data-driven models have been used to explore the prediction of extreme river flow laws, including distributed models, such as the Kalinin–Milyukov method (Dooge 1959) and ISBA-TOPMODEL (Bouilloud et al. 2010), and machine learning models, such as the particle swarm optimization extreme learning machine (Niu et al. 2018), particle swarm optimization SVM (Zaini et al. 2018), and long short-term memory (Widiasari et al. 2018; Wang & Lou 2019). Numerous studies have reported that the main bottleneck affecting the accuracy of river flow prediction lies in the lack of prediction capability of many models in extreme cases at certain moments in the basin. The sparse autoencoder was proposed to perform feature learning on unsupervised data (Chen & Li 2017). Its advantage is that it can compress the original input sequence while retaining the important information it contains. In the prediction of river flow series, the output values in the hidden layer can often reflect the important characteristics of the original series.

In practical engineering, when predicting the river flow during a non-stationary time period, the sequence often includes a flood period. In fact, in some research areas, floods do not occur frequently, appearing only in certain special years, seasons, and months. Therefore, the forecast of river flow in the corresponding time period usually includes occasional flood periods. To explore the periodicity of floods, a period spanning several years of flood occurrence is usually selected for research (Toonen 2015; Bhat et al. 2019; Sanchez-Garcia et al. 2019; Zhang et al. 2019a). This type of exploration can often formulate the law of flood evolution of a specific study area using the macroscopic and general flood evolution periodicity. However, in actual short- to mid-term river flow prediction, the prediction results for extreme flow have a significant impact on the general accuracy, and the macroscopically calculated flood evolution periodicity is difficult to apply to short- to mid-term decisions. Research on short- to mid-term river flow prediction must fully mine the internal laws of the original river flow sequence. Therefore, after the network is trained, exploring the relationship between the output results in the hidden layer of the sparse autoencoder and the original sequence can often help identify some characteristics of the input sequence, particularly the flow characteristics during the flood period.

In this study, the inverse method of simulated annealing is used to improve the sparse autoencoder, and the proposed enhanced sparse autoencoder (ESA) is applied to study the river flow sequence. To the best of our knowledge, this is the first study to explore the characteristic relationship between the output of the ESA hidden layer and the original sequence for improving the prediction accuracy of extreme flow. The designed orthogonal experiment is used to verify that the output of the hidden layer of the ESA contains substantial boundary information from the original sequence, and it provides the ESA parameters with the best prediction result in the experimental groups.

River flow sequence data at the Kenswat Station

The Manas River Basin is located in the hinterland of Eurasia, between N43°27′–N45°21′ and E85°01′–E86°32′. The elevation ranges from 170 to 5,242.5 m, and the whole drainage basin is fan-shaped with a total length of 420 km. The annual average precipitation in this area is 100–200 mm, and its spatio-temporal distribution is uneven. The annual average evaporation is 1,500–2,100 mm, the annual average temperature is 4.7–5.7 °C, and the total area is approximately 3.099 × 10^4 km2.

The Kenswat Station is located in the middle reaches of the Manas River and can control 92% of the water volume of the Manas River; the reservoir has a total length of approximately 8 km, a maximum dam height of 126.8 m, and a total storage capacity of 1.91 billion m3. The installed capacity of the power station is 100 MW, and the designed annual power generation is 2.76 billion kW·h, as illustrated in Figure 1. The daily river flow data at 9:00, 15:00, and 20:00 in June, July, and August of 1998–2000 used in this study were provided by the Kenswat Station of the Manas River, which has an elevation of 900 m and a control area of 5,156 km2. The general trend of annual river flow and the distribution trend of river flow at each time point are presented in Figure 2.

Figure 1

Locations of Manas River and Kenswat Station.

Figure 2

(a) General river flow trend in June, July, and August 1998–2000. (b) River flow distribution trend at 9:00, 15:00, and 20:00.

Strategies for an enhanced sparse autoencoder

Regular initialization of weights

The sparse autoencoder is composed of simple neurons connected by a series of weights. The output of the upper layer can be used as the input for the next layer during training. Figure 3 illustrates a simple three-layer structure consisting of the input, hidden, and output layers. In the figure, '+1' is the threshold unit, ai(k) represents the output of the ith neuron in the kth layer, xi represents the ith input vector, hw,b(x) represents the output value of the input sample x under the connection of W and b, and Li represents the ith layer. The network is trained by gradient back propagation; refer to Chen & Li (2017) for the specific training process.
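To make the notation concrete, the following is a minimal sketch of the forward pass of such a three-layer network, assuming sigmoid activations (a common choice for sparse autoencoders with a small sparsity target); the function and variable names are illustrative, not from the original implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass of a three-layer autoencoder.
    x: (n_in,) input sample; W1: (n_hid, n_in); b1: (n_hid,) hidden
    thresholds (the '+1' units); W2: (n_in, n_hid); b2: (n_in,).
    Returns the hidden activations a2 and the reconstruction h_wb(x).
    """
    a2 = sigmoid(W1 @ x + b1)      # a_i^(2): hidden-layer outputs
    h_wb = sigmoid(W2 @ a2 + b2)   # h_{W,b}(x): reconstruction of x
    return a2, h_wb
```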

Figure 3

Simple structure of a three-layer sparse autoencoder.
In gradient back propagation, the gradient value decreases as the error propagates backward through the layers. Glorot & Bengio considered the linear mechanism of the weights and the relationship between the activation function and the network output values in the hidden layer and derived random initial weights (Glorot & Bengio 2010). They explored how to keep the error information unblocked from the perspectives of forward and backward propagation and proposed that a normalization factor is essential during propagation. Considering the multiplicative effect of each layer, the following equation was used to initialize the weights and thresholds of the deep network to effectively maintain the activation and gradient variances:
$$W \sim U\!\left[-\frac{\sqrt{6}}{\sqrt{n_j + n_{j+1}}},\ \frac{\sqrt{6}}{\sqrt{n_j + n_{j+1}}}\right] \quad (1)$$
where $n_j$ and $n_{j+1}$ are the numbers of neurons in layers $j$ and $j+1$, respectively.
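As a sketch, the normalized initialization of Equation (1) can be implemented as follows (NumPy is used for illustration; the helper name is ours):

```python
import numpy as np

def glorot_uniform(n_j, n_jp1, rng=None):
    """Sample weights between layer j (n_j neurons) and layer j+1
    (n_jp1 neurons) from U[-sqrt(6)/sqrt(n_j + n_jp1),
    +sqrt(6)/sqrt(n_j + n_jp1)], per Glorot & Bengio (2010)."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0) / np.sqrt(n_j + n_jp1)
    return rng.uniform(-limit, limit, size=(n_jp1, n_j))

# Example: 100 input flow values feeding a hidden layer of 90 neurons.
W1 = glorot_uniform(100, 90)
```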

Simulated annealing theory

The simulated annealing algorithm is an effective method to solve the problem of local minimum energy in stochastic networks. The basic idea is to simulate the process of metal annealing. Let X represent the microscopic state of a system and E(X) represent the internal energy in that state.

For a given temperature T, when the system is in thermal equilibrium during the cooling annealing process, the probability P(E) that the system remains in a certain energy state obeys the Boltzmann distribution (Bengio et al. 2013). When the temperature is constant, the higher the energy of the material system, the lower the probability of it being in that state, and the internal energy of the system tends to evolve in the direction of decreasing energy. The higher the temperature T, the more easily the system state changes. To make the material system finally converge to the equilibrium state at a low temperature, a higher temperature should be set at the beginning of the annealing and then gradually lowered. Finally, there is a high probability that the entire system will converge to the lowest energy state. The common way to set the temperature T is the following:
(2)
where t is the number of iterations.

Implementation of sparse coefficient selection based on the simulated annealing theory

When the sparse autoencoder is optimized using simulated annealing, a sparse factor is defined to play the role of the metal annealing temperature, but its change mode is opposite to that of simulated annealing. The initial value β0 in the loss function of the sparse autoencoder takes the role of the initial temperature, and various initial values are simulated in combination with various numbers of neurons in the hidden layer. By exploring the initial value of the sparse factor, the output sequence can be made more consistent with the original sequence.

In the process of sparse autoencoder learning, the choice of the sparse factor β affects the sparsity of the network. The higher the value of β, the heavier the sparsity penalty, meaning the outputs of the hidden layer in the network tend toward 0; the lower the value of β, the lighter the sparsity penalty, meaning the sparsity of the whole network is lower. To facilitate the observation of effective feature extraction in hydrological sequences, the inverse method of simulated annealing is used to dynamically select β: the sparsity penalty term of the network should increase from low to high, so the value of β is taken from small to large. The value of β is expressed as follows:
(3)
where β0 is the initial value of β and t is the number of network iterations. Figure 4 depicts the values of β when β0 = 5, 10, 15, 20, and 25.
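The sketch below illustrates the inverse annealing idea behind Equation (3): the sparse factor grows with the iteration count instead of decaying. The logarithmic form used here, mirroring the classical cooling law T = T0/ln(1 + t), is an assumption chosen only to reproduce the qualitative small-to-large behaviour shown in Figure 4; the authors' exact expression may differ.

```python
import numpy as np

def beta_schedule(beta0, t):
    """Illustrative inverse-annealing sparse factor: beta grows from
    small to large with iteration t. The logarithmic form is an assumed
    stand-in for Equation (3), not the paper's confirmed formula."""
    return beta0 * np.log(1.0 + t)

t = np.arange(1, 101)              # the study trains for t = 100 iterations
for beta0 in (5, 10, 15, 20, 25):  # initial values used in the study
    print(beta0, beta_schedule(beta0, t)[-1])  # beta at the final iteration
```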
Figure 4

Value curve of sparse factor β under various initial values β0.
After the sparse factor is selected using the inverse method of simulated annealing and combined with Equation (3), the cost function of the improved sparse autoencoder is described as follows:
$$J_{\mathrm{sparse}}(W, b) = J(W, b) + \beta(t) \sum_{j=1}^{L} \mathrm{KL}\!\left(\rho \,\middle\|\, \hat{\rho}_j\right) \quad (4)$$

where $J(W,b)$ is the reconstruction error term of the standard autoencoder, $\beta(t)$ is the sparse factor from Equation (3), $\rho$ is the sparsity target, and $\hat{\rho}_j$ is the average activation of the $j$th hidden neuron.

The sparse autoencoder is trained based on Equation (4). After training, the output value of the hidden layer is extracted to explore the internal characteristics of the input samples.
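A minimal sketch of the cost in Equation (4), assuming the standard sparse-autoencoder form (mean squared reconstruction error plus a KL sparsity penalty; any weight-decay term is omitted for brevity):

```python
import numpy as np

def kl_div(rho, rho_hat):
    """KL divergence between the sparsity target rho and the mean
    activation rho_hat of one hidden neuron."""
    return (rho * np.log(rho / rho_hat)
            + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def esa_cost(X, X_rec, A2, beta_t, rho=0.05):
    """Sketch of the cost of Equation (4) at iteration t.
    X, X_rec: (m, n) inputs and their reconstructions;
    A2: (m, L) hidden activations over the m samples;
    beta_t: annealed sparse factor from Equation (3);
    rho: sparsity target (the paper discusses rho = 0.05)."""
    m = X.shape[0]
    j_rec = 0.5 * np.sum((X_rec - X) ** 2) / m            # J(W, b)
    rho_hat = np.clip(A2.mean(axis=0), 1e-6, 1 - 1e-6)    # mean activations
    return j_rec + beta_t * np.sum(kl_div(rho, rho_hat))  # add KL penalty
```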

ESA for studying the river flow sequence

In this study, we examine the characteristic relationship between the output values in the hidden layer and the original input sequence after ESA training and provide new ideas for predicting the future evolution of river flow sequences. The original input river flow sequence can be inferred from the features of the river flow sequence extracted from the hidden layer. The training process of the network is illustrated in Figure 5. The edge features extracted from the hidden layer can reflect apparent changes in the original sequence; an edge feature represents the 'swell' or 'slump' of flow in the sequence, constitutes the boundary constraining the sequence, and can reflect extreme flow. In Figure 5, the sequence is divided into n sub-sequence samples, each of which contains 100 days of river flow in the corresponding period as the input of the ESA. The weights and thresholds of the ESA are initialized in the regularized manner described above. Training ends when the number of iterations reaches t = 100; the training is repeated 30 times, and the trained model is verified using the verification set.
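As a sketch of the data preparation in Figure 5 (non-overlapping windows are assumed here, since the stride is not stated explicitly):

```python
import numpy as np

def make_subsequences(flow, window=100):
    """Split a flow series into consecutive 100-day sub-sequence
    samples, each serving as one ESA input vector."""
    n = len(flow) // window
    return np.asarray(flow[: n * window], dtype=float).reshape(n, window)

# Placeholder series standing in for the Kenswat flow record.
flow = np.random.rand(600) * 1000.0   # synthetic values, m3/s
samples = make_subsequences(flow)     # shape (6, 100): six 100-day samples
```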

Figure 5

River flow sequence division and network training.

Parameter calibration and orthogonal experimental design

The following parameters were set for the studied sequence. The collected experimental data were divided into six groups of 100 samples each, with 70 samples used for the training set and 30 for the verification set during training. The river flow data from September 2000 were reserved as the test set. During training, training was stopped when the verification set error began to rise, to avoid overfitting. The initial values of the sparse factor, β0, were 5, 10, 15, 20, and 25, and the numbers of neurons in the hidden layer, L, were 60, 70, 80, 90, and 100. In the experiment exploring the relationship between the output values of the hidden layer and the original sequence, β0 and L were combined orthogonally to form 25 groups of values, namely, (L, β0) = {(60, 5), (60, 10), …, (60, 25), (70, 5), …, (70, 25), …, (100, 5), …, (100, 25)}. The results of the output layer were compared with the original input, and the feature compression rate and extraction rate were used as the comparison indexes of each experimental group (L, β0). Subsequently, the optimal number of hidden neurons L and initial value of the sparse factor β0 for river flow feature mining were explored.
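The 25 orthogonal (L, β0) combinations can be enumerated as follows (a trivial sketch, included to make the experimental grid explicit):

```python
from itertools import product

hidden_sizes = (60, 70, 80, 90, 100)   # L values
beta0_values = (5, 10, 15, 20, 25)     # initial sparse factors
grid = list(product(hidden_sizes, beta0_values))
assert len(grid) == 25
assert grid[0] == (60, 5) and grid[-1] == (100, 25)
```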

The daily average temperature (X1), daily average precipitation (X2), daily average evapotranspiration (X3), and average humidity (X4) were selected as the factors that affect the evolution of river flow (Y) at the Kenswat Station, and the orthogonal experiment was designed accordingly. In the experimental groups ESA1, ESA2, ESA3, ESA4, and ESA5, the daily average flow in September 2000, during which the great flood occurred, was predicted. The prediction results were compared with those obtained from the SVM (Chang et al. 2018; Bafitlhile & Li 2019) and the feedforward neural network (FFNN; Yilmaz & Muttil 2014; Yaseen et al. 2016). The comparison indexes used were the correlation coefficient (R), root mean square error (RMSE), mean absolute error (MAE), 85% anomaly coincidence rate, and 80% pass rate. The daily average flows in June, July, and August during 1998–2000 were selected as samples of the prediction series. A Gaussian radial basis kernel function was used in the SVM (Fasshauer & McCourt 2012), with the penalty term c = 3.5 and the kernel parameter γ = 0.8. The number of hidden neurons of the FFNN was selected by trial and error and by experience; a large number of experimental tests showed that five hidden neurons achieve good results, so the network structure of the FFNN is 4-5-1, and its weights and thresholds are initialized randomly from the normal distribution N(0, 1). In the different orthogonal experimental groups, the weights and thresholds of the ESA are generated using Equation (1), and β is generated using the inverse method of simulated annealing represented by Equation (3). After 30 predictions, the average value is taken as the final prediction result of the experiment. All experiments were carried out in MATLAB R2014a.
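The standard indexes (R, RMSE, MAE) and a relative-error pass rate can be computed as follows; the 20% tolerance follows the error interval mentioned in the results, while the 85% anomaly coincidence rate is not reproduced because its exact definition is not given in the text.

```python
import numpy as np

def evaluate(obs, pred, pass_tol=0.20):
    """Correlation coefficient R, RMSE, MAE, and the fraction of days
    whose relative error is within pass_tol (assumed 20% tolerance)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    r = np.corrcoef(obs, pred)[0, 1]
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    mae = np.mean(np.abs(obs - pred))
    pass_rate = np.mean(np.abs(pred - obs) / obs <= pass_tol)
    return r, rmse, mae, pass_rate

# Example using the observed and ESA4 high-flow values from Table 2.
obs = [84, 94, 96.5, 101, 136, 87.8, 82.8, 96.5, 113]
esa4 = [81, 92, 100, 96, 124, 90, 77, 93, 116]
print(evaluate(obs, esa4))
```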

The output results of the hidden and output layers in the network are obtained under various combinations of L and the sparse factor β. In Figure 6, panels (e1), (e2), and (e3) present, for L = 60, the logarithmic distribution of the hidden layer output values after training under different values of the sparse factor β0 at 9:00, 15:00, and 20:00, respectively. It can be seen from Figure 6 that when β0 = 5 and 10, the difference between the values on both sides of the median is significantly smaller than when β0 = 15, 20, and 25. When β0 = 20 and 25, the difference between the two sides of the median increases evidently, and there are fewer scattered outliers, which indicates that when the sparse factor is very small, the output values of the hidden layer mostly exceed ρ = 0.05. This illustrates that as β increases, the ability of the ESA to compress the input sequence improves, and the extracted river flow features move increasingly toward the edge. Similarly, Figures 7(e), 8(e), 9(e), and 10(e) depict the box-plot distributions of the output values in the hidden layer under different values of β when L is 70, 80, 90, and 100, respectively; (e1)–(e3) are the distributions at 9:00, 15:00, and 20:00, respectively.

Figure 6

When the number of neurons in the hidden layer is 60, (e1)–(e3) are output feature distributions of the hidden layer at 9:00, 15:00, and 20:00, respectively; (a1) and (a2), (b1) and (b2), and (c1) and (c2) are comparisons of output values of the output layer under training and validation sets with the original input sequence at 9:00, 15:00, and 20:00, respectively.

It can be seen from each (e1) diagram that, for the same value of β0 and different values of L, the distribution gap at 9:00 is small, and the general distribution is relatively stable. When the value of β is small, the ability to compress the sequence is weak; therefore, the feature output values are larger, and as β increases, the difference between the two sides of the box plot increases evidently and the compression of the sequence is enhanced. In (e2) and (e3), there are some 'abnormal values'. When β0 = 5, 10, and 15, the distribution values are relatively scattered; when β0 = 20 and 25, there is a significant difference between the two sides of the median. The primary reason is that the selected datasets – June, July, and August – were months with large changes in the flow sequence. At 15:00 and 20:00, there was sufficient sunshine, and the rising temperature accelerated the melting of the snow in the upper reaches of the Kenswat Station. In addition, in 2000, there were large floods and extreme flow evolution events in the station area. From June to August 2000, the minimum daily river flow was 20 m3/s, whereas during the flood period, the maximum flow reached 1,100 m3/s. This led to the non-stationarity and randomness of the research sequence. The dispersion of the eigenvalues in (e2) and (e3) is also caused by the dates on which the river flow gap is evident. It can be seen that the characteristic output values in the hidden layer can reflect the change in the river flow sequence in a certain sense.

Figures 6–10 present the flow output diagrams when L is 60, 70, 80, 90, and 100, respectively. Parts (a), (b), and (c) compare, at 9:00, 15:00, and 20:00, respectively, the output values of the output layer of the ESA for different values of β0 with the measured values. Parts (a1) and (a2), (b1) and (b2), and (c1) and (c2) in each figure present the output results of the divided training and validation sets. To make the data on the graphs clearer, the curves in (a1), (b1), and (c1) are offset as follows: when β0 = 5, the output value is shifted by +20; when β0 = 10, by +10; when β0 = 20, by −10; and when β0 = 25, by −20. The embedded tables in each figure present the correlation coefficients R between the output values of the training and validation sets and the corresponding measured values for different values of β0. As can be seen from Figures 6–10(a), when L = 60, the accuracy of the output of the verification set evidently decreases as β increases. In each of Figures 6–10(a2), the output values at β0 = 20 and 25 are significantly worse than those at β0 = 5 and 10; however, the simulation results are relatively good on days with large variations in river flow. In Figures 6–10(b) and 6–10(c), more time periods with extreme values occur at 15:00 and 20:00, and the general sequence is more non-stationary at these two time points. The output results indicate that the values at β0 = 5 and 10 are evidently better than those at β0 = 20 and 25. When β0 = 15, R is closer to the β0 = 10 side, indicating that β0 = 15 also performs acceptably. As L increases, the output results improve at different time periods for all values of β; for example, at 9:00, when L = 60 and β0 = 25, R = 91.6%, whereas when L = 100 and β0 = 25, R = 99.65%. In addition, it can be seen from Figures 6–10(a), (b), and (c) that increasing L has a positive effect on R between the output values of the output layer in the verification set and the measured values. However, the results presented in Figures 6(e), 7(e), 8(e), 9(e), and 10(e) show that as L increases, the number of output values in the hidden layer that are significantly greater than ρ = 0.05 approaches the number of input values, which is evidently not good for studying the relationship between the output values of the hidden layer and the extreme evolution values in the river flow series. An analysis and discussion of the appropriate number of hidden neurons and the value of the sparse factor β as ESA parameters can help improve the prediction accuracy of the sequences.

Figure 11 presents a comparison of the prediction results of the ESA with different parameters, the SVM, and the FFNN for daily average flow in September 2000 under the orthogonal experiment. In September 2000, a large flood event occurred at the Kenswat Station. It can be seen from Figure 11 that the prediction result of ESA4 is the best; particularly on 11 September 2000, 18 September 2000, and other days with large flow changes, ESA4 fits the parts with large flow changes relatively well. The FFNN and SVM are the most widely used models for mid- to long-term prediction. In the September forecast, the SVM and FFNN demonstrate acceptable accuracy for stable and relatively small changes in river flow; however, their forecasts are relatively poor on days with large changes. In addition, the orthogonal experimental group ESA5 also demonstrates relatively good prediction capacity for high flow. Therefore, Figure 11 illustrates that the orthogonal experimental groups ESA4 and ESA5 achieve relatively good sequence prediction results on days with large flow evolution.

Figure 7

When the number of neurons in the hidden layer is 70, (e1)–(e3) are output feature distributions of the hidden layer at 9:00, 15:00, and 20:00, respectively; (a1) and (a2), (b1) and (b2), and (c1) and (c2) are comparisons of output values of the output layer under training and validation sets with the original input sequence at 9:00, 15:00, and 20:00, respectively.

Figure 8

When the number of neurons in the hidden layer is 80, (e1)–(e3) are the output feature distributions of the hidden layer at 9:00, 15:00, and 20:00, respectively; (a1) and (a2), (b1) and (b2), and (c1) and (c2) are comparisons of output values of the output layer under training and validation sets with the original input sequence at 9:00, 15:00, and 20:00, respectively.

Figure 9

When the number of neurons in the hidden layer is 90, (e1)–(e3) are output feature distributions of the hidden layer at 9:00, 15:00, and 20:00, respectively; (a1) and (a2), (b1) and (b2), and (c1) and (c2) are comparisons of output values of the output layer under training and validation sets with the original input sequence at 9:00, 15:00, and 20:00, respectively.

Figure 10

When the number of neurons in the hidden layer is 100, (e1)–(e3) are the output feature distributions of the hidden layer at 9:00, 15:00, and 20:00, respectively; (a1) and (a2), (b1) and (b2), and (c1) and (c2) are comparisons of output values of the output layer under training and validation sets with the original input sequence at 9:00, 15:00, and 20:00, respectively.

Figure 11

Comparison of daily average flow prediction results for September 2000 under various orthogonal experimental groups. The dotted box depicts the flow prediction results for days with large changes in river flow.

Figure 12(a)–12(c) present the comparison of compression and extraction rates calculated at 9:00, 15:00, and 20:00, respectively, for various values of β0 and L, and Figure 13(a)–13(c) present the comparison of R calculated at 9:00, 15:00, and 20:00, respectively, for various values of β0 and L. Figure 12(a) illustrates that the compression rate is generally the highest when β0 = 25 under the various values of L; when β0 = 15 and 20, the compression rate is concentrated at approximately 55%. When the number of hidden neurons is 70, the values of R of the verification set results are 99% and 98.8%, indicating that a better verification effect can be obtained when L = 70 and β0 = 15. When L = 60, the result of the verification set is generally poor; however, when L ≥ 70 and β0 = 5, 10, or 15, the result of the verification set is generally higher than 99%, indicating that β0 = 15 or 20 is appropriate. When L ≥ 70, the ESA not only extracts the features of the original sequence more completely but also restores the original sequence with satisfactory accuracy. In the general performance shown in Figures 12 and 13, except for the case at 9:00 with L = 60 and β0 ≥ 15, where the verification result differs from the original sequence, the value of R between the verification results and the original sequence is almost 1, which indicates that the ESA can extract most features of the sequence. These features can reflect the law of the whole research sequence, particularly the parts with large variation in river flow.

Figure 12

When the numbers of neurons in the hidden layer are 60, 70, 80, 90, and 100, (a–c) present comparisons of compression and extraction ratios at 9:00, 15:00, and 20:00, respectively.

Figure 13

When values of the sparse factor are 5, 10, 15, 20, 25, (a–c) present comparisons of correlation coefficient R between output values in the output layer and those measured at 9:00, 15:00, and 20:00, respectively.

It can be seen from the prediction results of the orthogonal experiment in Figure 11 that on some special flow dates, such as 11 September 2000 and 18 September 2000, ESA4 and ESA5 fit the actual situation well. The SVM and FFNN also demonstrate good prediction results during time periods when the river flow is relatively stable; however, they cannot fit well during the time periods with large evolution. In ESA1, although the predicted results differ considerably from the real values, the general trend of the predicted curve is similar to that of the measured one. The comparison indexes presented in Table 1 demonstrate the prediction capability of the ESA, particularly during the special flow time periods. In the comparison of prediction performance indicators, the values of R for ESA4, ESA5, and SVM are higher than those for the other models, with the highest value of 91% for the SVM. However, the curves depicted in Figure 11 illustrate that ESA4 and ESA5 outperform the SVM in the prediction of some special flows. Apparently, the prediction results of the SVM and FFNN are smoother and generally more stable, but weaker during the flood period, than those of ESA4 and ESA5. The RMSE and MAE are the lowest for ESA4; the values decreased by 19.2, 6.59, 4.34, 2.46, 8.71, and 8.83 and by 19.16, 5.03, 3.57, 1.75, 8.4, and 8.48, respectively, compared with those for ESA1, ESA2, ESA3, ESA5, SVM, and FFNN. In the general prediction error comparison, it is evident that the orthogonal experimental groups ESA4 and ESA5 perform better. The 85% anomaly coincidence rate can reflect the gap in the capacity of the different models to capture extreme flow under certain circumstances. The coincidence rates of both ESA4 and ESA5 reached 0.90, indicating that the errors between most predicted and measured values are small. Although the values of R for the SVM and FFNN reached 91% and 83%, respectively, their anomaly coincidence rates reached only 0.43 and 0.47. This clearly indicates that in the high-flow prediction part, the results of the SVM and FFNN are worse than those of ESA4 and ESA5. In addition, with an error interval of 20%, the pass rate of ESA4 reaches 100% and those of ESA3 and ESA5 reach 97%, and the prediction results in the stable-flow part are also satisfactory.

Table 1

Comparison of prediction performance indicators

Model  R (%)  RMSE   MAE    85% anomaly coincidence rate  80% pass rate
ESA1   71     23.30  22.23  0.13                          0.27
ESA2   82     10.69  8.42   0.63                          0.87
ESA3   83     8.44   6.96   0.83                          0.97
ESA4   87     4.10   3.39   0.90                          1.00
ESA5   88     6.56   5.14   0.90                          0.97
SVM    91     12.81  11.79  0.43                          0.80
FFNN   83     12.93  11.84  0.47                          0.77

Bold indicates the optimal value.

In the prediction of the special river flow parts, the models in the ESA orthogonal experimental groups generally perform better, whereas the FFNN and SVM yield good prediction results at some time points but generally tend to be stable. Table 2 presents the prediction results of the various models during the period of large flow change in September. As can be seen from the entries in bold, the prediction results of ESA4 are generally the best, indicating that satisfactory sequence prediction results can be obtained when L = 90 and β0 = 15.

Table 2

Comparison of prediction results of various models on high-flow dates

Date               Observation  ESA1  ESA2  ESA3  ESA4  ESA5  SVM  FFNN
10 September 2000  84           67    73    76    81    80    72   70
11 September 2000  94           76    78    85    92    100   78   81
16 September 2000  96.5         77    90    87    100   99    85   79
17 September 2000  101          80    103   90    96    89    95   92
18 September 2000  136          95    105   117   124   135   110  114
19 September 2000  87.8         66    71    81    90    90    78   78
27 September 2000  82.8         63    73    67    77    85    69   76
28 September 2000  96.5         78    96    85    93    82    85   83
30 September 2000  113          89    94    100   116   101   99   108

When the sequence was processed by the ESA, most output values of the hidden neurons tended to ρ = 0.05. Therefore, the ESA can be considered to compress the original input sequence to a certain extent. Most information in the original sequence is carried by the neurons whose output values in the hidden layer are far higher than ρ = 0.05, which can reflect the internal rules and characteristics of the original sequence. This edge feature reflects the upper and lower boundaries of the original sequence, and the upper boundary reflects the value of extreme flow in the sequence. The results provide new ideas for exploring the evolution of regional floods.
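A sketch of how such informative neurons can be isolated after training; the threshold of twice the sparsity target is an illustrative choice, not a value given in the paper.

```python
import numpy as np

def salient_units(A2, rho=0.05, factor=2.0):
    """Return the indices and activations of hidden neurons whose mean
    activation stays well above the sparsity target rho = 0.05; per the
    discussion, these carry most of the information of the original
    sequence, including its upper and lower boundaries."""
    rho_hat = A2.mean(axis=0)            # mean activation per hidden unit
    keep = rho_hat > factor * rho        # units far above the target
    return np.flatnonzero(keep), A2[:, keep]
```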

In this study, the traditional sparse autoencoder was enhanced using the inverse method of simulated annealing. Considering the river flow data observed at the Kenswat Station as a research example, this study explored the feature extraction of river flow sequences and the prediction of future river flow sequences under various values of L when the sparse factor β takes different initial values. Orthogonal experiments were designed, and the forecast results for the flood month obtained by the orthogonal experimental groups were compared with those of the SVM and FFNN. By analyzing the output characteristics of the hidden layer under the various combinations and the prediction results of the orthogonal experiments, the following conclusions are reached:

  • (1)

    The output value of the hidden layer in the ESA contains the evolution characteristics of the original sequence and can reflect the edge of non-stationary sequences.

  • (2)

    When L = 90 and β0 = 15, the prediction result of the river flow sequence at the Kenswat Station is the best. At 9:00, 15:00, and 20:00, the compression ratio of the original sequence is 55, 54, and 54%, respectively.

  • (3)

    In the output feature of the ESA hidden layer, the upper bound of the features can reflect the eigenvalues of river flow during the flood in the input sequence, thereby providing some new ideas for exploring the law of evolution of regional floods.

In the future, we will continue to explore the relationship between the output value of the hidden layer of the ESA and the law of evolution of non-stationary river flow sequences and investigate the law of evolution of river flow under drought and flood, which can facilitate better management and planning of water resources.

The authors declare that they have no conflict of interest.

Financial supports from the National Natural Science Foundation of China (U1803244), the National Key R&D Program of China (2017YFC0404304), the Key Science and Technology Project in special issues of Bingtuan (2019AB035), and the Talent initiate scientific research projects of the Shihezi University (RCZK2018C23) are gratefully acknowledged.

All relevant data are included in the paper or its Supplementary Information.

References

Abbott, M. B., Bathurst, J. C., Cunge, J. A. & O'Connell, P. E. 1986 An introduction to the European Hydrological System – Système Hydrologique Européen, 'SHE', 2: structure of a physically-based, distributed modelling system. Journal of Hydrology 87 (1), 45–59.

Ajadi, O. A., Meyer, F. J., Liljedahl, A. & IEEE 2017 Detection of aufeis-related flood areas in a time series of high resolution SAR images using Curvelet transform and unsupervised classification. In: IEEE International Symposium on Geoscience and Remote Sensing IGARSS, pp. 177–180.

Alizadeh, M. J., Nourani, V., Mousavimehr, M. & Kavianpour, M. R. 2018 Wavelet-IANN model for predicting flow discharge up to several days and months ahead. Journal of Hydroinformatics 20 (1), 134–148. doi:10.2166/hydro.2017.142.

Badrzadeh, H., Sarukkalige, R. & Jayawardena, A. W. 2018 Intermittent stream flow forecasting and modelling with hybrid wavelet neuro-fuzzy model. Hydrology Research 49 (1), 27–40. doi:10.2166/nh.2017.163.

Bengio, Y., Courville, A. & Vincent, P. 2013 Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (8), 1798–1828. doi:10.1109/tpami.2013.50.

Bhat, M. S., Ahmad, B., Alam, A., Farooq, H. & Ahmad, S. 2019 Flood hazard assessment of the Kashmir Valley using historical hydrology. Journal of Flood Risk Management 12. doi:10.1111/jfr3.12521.

Bou-Fakhreddine, B., Mougharbel, I., Faye, A., Abou Chakra, S. & Pollet, Y. 2018 Daily river flow prediction based on two-phase constructive fuzzy systems modeling: a case of hydrological–meteorological measurements asymmetry. Journal of Hydrology 558, 255–265. doi:10.1016/j.jhydrol.2018.01.035.

Bouilloud, L., Chancibault, K., Vincendon, B., Ducrocq, V., Habets, F., Saulnier, G. M., Anquetin, S., Martin, E. & Noilhan, J. 2010 Coupling the ISBA land surface model and the TOPMODEL hydrological model for Mediterranean flash-flood forecasting: description, calibration, and validation. Journal of Hydrometeorology 11 (2), 315–333.

Chang, M.-J., Chang, H.-K., Chen, Y.-C., Lin, G.-F., Chen, P.-A., Lai, J.-S. & Tan, Y.-C. 2018 A support vector machine forecasting model for typhoon flood inundation mapping and early flood warning systems. Water 10 (12). doi:10.3390/w10121734.

Chen, Z. & Li, W. 2017 Multisensor feature fusion for bearing fault diagnosis using sparse autoencoder and deep belief network. IEEE Transactions on Instrumentation and Measurement 66 (7), 1693–1702. doi:10.1109/tim.2017.2669947.

Crawford, N. H. & Linsley, R. E. 1966 Digital Simulation in Hydrology: Stanford Watershed Model IV. Evapotranspiration. Technical Report No. 39.

Dooge, J. C. I. 1959 A general theory of the unit hydrograph. Journal of Geophysical Research 64 (2), 241–256.

Easton, Z. M., Fuka, D. R., White, E. D., Collick, A. S., Ashagre, B. B., McCartney, M., Awulachew, S. B., Ahmed, A. A. & Steenhuis, T. S. 2010 A multi-basin SWAT model analysis of runoff and sedimentation in the Blue Nile, Ethiopia. Hydrology and Earth System Sciences 14 (10), 1827–1841. doi:10.5194/hess-14-1827-2010.

Engeland, K., Borga, M., Creutin, J.-D., Francois, B., Ramos, M.-H. & Vidal, J.-P. 2017 Space-time variability of climate variables and intermittent renewable electricity production – a review. Renewable & Sustainable Energy Reviews 79, 600–617. doi:10.1016/j.rser.2017.05.046.

Fasshauer, G. E. & McCourt, M. J. 2012 Stable evaluation of Gaussian radial basis function interpolants. SIAM Journal on Scientific Computing 34 (2), A737–A762. doi:10.1137/110824784.

Gangrade, S., Kao, S.-C., Naz, B. S., Rastogi, D., Ashfaq, M., Singh, N. & Preston, B. L. 2018 Sensitivity of probable maximum flood in a changing environment. Water Resources Research 54 (6), 3913–3936. doi:10.1029/2017wr021987.

Glorot, X. & Bengio, Y. 2010 Understanding the difficulty of training deep feedforward neural networks. Journal of Machine Learning Research 9, 249–256.

Han, R. G., Ding, Z. H. & Feng, P. 2009 Study on influence of human activity on surface runoff in Haihe River Basin. Water Resources & Hydropower Engineering 40 (3), 4–7.

Jimeno-Saez, P., Senent-Aparicio, J., Perez-Sanchez, J. & Pulido-Velazquez, D. 2018 A comparison of SWAT and ANN models for daily runoff simulation in different climatic zones of peninsular Spain. Water 10 (2). doi:10.3390/w10020192.

Kisakye, V., Akurut, M. & Van der Bruggen, B. 2018 Effect of climate change on reliability of rainwater harvesting systems for Kabarole District, Uganda. Water 10 (1). doi:10.3390/w10010071.

Konrad, C. P. & Dettinger, M. D. 2017 Flood runoff in relation to water vapor transport by atmospheric rivers over the Western United States, 1949–2015. Geophysical Research Letters 44 (22), 11456–11462. doi:10.1002/2017gl075399.

Lee, Y. H. & Singh, V. P. 1999 Tank model using Kalman Filter. Journal of Hydrologic Engineering 4 (4), 344–349.

Lee, D. H., Kim, J. H., Park, M.-H., Stenstrom, M. K. & Kang, J.-H. 2020 Automatic calibration and improvements on an instream chlorophyll a simulation in the HSPF model. Ecological Modelling 415, 108835. https://doi.org/10.1016/j.ecolmodel.2019.108835.

Leta, O. T., El-Kadi, A. I. & Dulai, H. 2018 Impact of climate change on daily streamflow and its extreme values in Pacific Island watersheds. Sustainability 10 (6). doi:10.3390/su10062057.

Li, D., Qu, S., Shi, P., Chen, X., Xue, F., Gou, J. & Zhang, W. 2018 Development and integration of sub-daily flood modelling capability within the SWAT model and a comparison with XAJ model. Water 10 (9). doi:10.3390/w10091263.

Mirzaei, M., Huang, Y. F., El-Shafie, A., Chimeh, T., Lee, J., Vaizadeh, N. & Adamowski, J. 2015 Uncertainty analysis for extreme flood events in a semi-arid region. Natural Hazards 78 (3), 1947–1960. doi:10.1007/s11069-015-1812-9.

Niu, W.-j., Feng, Z.-k., Cheng, C.-t. & Zhou, J.-z. 2018 Forecasting daily runoff by extreme learning machine based on quantum-behaved particle swarm optimization. Journal of Hydrologic Engineering 23 (3). doi:10.1061/(asce)he.1943-5584.0001625.

Nourani, V., Tajbakhsh, A. D., Molajou, A. & Gokcekus, H. 2019 Hybrid wavelet-M5 model tree for rainfall-runoff modeling. Journal of Hydrologic Engineering 24 (5). doi:10.1061/(asce)he.1943-5584.0001777.

Pandey, H. K., Dwivedi, S. & Kumar, K. 2018 Flood frequency analysis of Betwa River, Madhya Pradesh, India. Journal of the Geological Society of India 92 (3), 286–290. doi:10.1007/s12594-018-1007-6.

Rakhshandehroo, G., Akbari, H., Igder, M. A. & Ostadzadeh, E. 2018 Long-term groundwater-level forecasting in shallow and deep wells using wavelet neural networks trained by an improved harmony search algorithm. Journal of Hydrologic Engineering 23 (2). doi:10.1061/(asce)he.1943-5584.0001591.

Requena, A. I., Flores, I., Mediero, L. & Garrote, L. 2016 Extension of observed flood series by combining a distributed hydro-meteorological model and a copula-based model. Stochastic Environmental Research and Risk Assessment 30 (5), 1363–1378. doi:10.1007/s00477-015-1138-x.

Sahoo, B. B., Jha, R., Singh, A. & Kumar, D. 2019 Application of support vector regression for modeling low flow time series. KSCE Journal of Civil Engineering 23 (2), 923–934. doi:10.1007/s12205-018-0128-1.

Sanchez-Garcia, C., Schulte, L., Carvalho, F. & Carlos Pena, J. 2019 A 500-year flood history of the arid environments of southeastern Spain. The case of the Almanzora River. Global and Planetary Change 181. doi:10.1016/j.gloplacha.2019.102987.

Santos, C. A. G., Freire, P. K. M. M., da Silva, R. M. & Akrami, S. A. 2019 Hybrid wavelet neural network approach for daily inflow forecasting using tropical rainfall measuring mission data. Journal of Hydrologic Engineering 24 (2). doi:10.1061/(asce)he.1943-5584.0001725.

Schreiner-McGraw, A. P., Ajami, H. & Vivoni, E. R. 2019 Extreme weather events and transmission losses in arid streams. Environmental Research Letters 14 (8). doi:10.1088/1748-9326/ab2949.

Sharghi, E., Nourani, V., Najafi, H. & Molajou, A. 2018 Emotional ANN (EANN) and wavelet-ANN (WANN) approaches for Markovian and seasonal based modeling of rainfall-runoff process. Water Resources Management 32 (10). doi:10.1007/s11269-018-2000-y.

Sharghi, E., Nourani, V., Molajou, A. & Najafi, H. 2019 Conjunction of emotional ANN (EANN) and wavelet transform for rainfall-runoff modeling. Journal of Hydroinformatics 21 (1), 136–152. doi:10.2166/hydro.2018.054.

Sun, Y., Niu, J. & Sivakumar, B. 2019 A comparative study of models for short-term streamflow forecasting with emphasis on wavelet-based approach. Stochastic Environmental Research and Risk Assessment 33 (10), 1875–1891. doi:10.1007/s00477-019-01734-7.

Tsakiri, K., Marsellos, A. & Kapetanakis, S. 2018 Artificial neural network and multiple linear regression for flood prediction in Mohawk River, New York. Water 10 (9). doi:10.3390/w10091158.

Wang, Z. & Lou, Y. 2019 Hydrological time series forecast model based on wavelet de-noising and ARIMA-LSTM.

Widiasari, I. R., Nugoho, L. E., Widyawan & Efendi, R. 2018 Context-based hydrology time series data for a flood prediction model using LSTM.

Yan, L., Xiong, L., Liu, D., Hu, T. & Xu, C.-Y. 2017 Frequency analysis of nonstationary annual maximum flood series using the time-varying two-component mixture distributions. Hydrological Processes 31 (1), 69–89. doi:10.1002/hyp.10965.

Yang, X., Warren, R., He, Y., Ye, J., Li, Q. & Wang, G. 2018 Impacts of climate change on TN load and its control in a River Basin with complex pollution sources. Science of the Total Environment 615, 1155–1163. doi:10.1016/j.scitotenv.2017.09.288.

Yaseen, Z. M., El-Shafie, A., Afan, H. A., Hameed, M., Mohtar, W. H. M. W. & Hussain, A. 2016 RBFNN versus FFNN for daily river flow forecasting at Johor River, Malaysia. Neural Computing & Applications 27 (6), 1533–1542. doi:10.1007/s00521-015-1952-6.

Ye, Y. 2010 Analysis of water environment quality and the water quality exchange trend during the year of 2000–2009 in Xijiang River. Water Purification Technology 29 (4), 65–70.

Yilmaz, A. G. & Muttil, N. 2014 Runoff estimation by machine learning methods and application to the Euphrates Basin in Turkey. Journal of Hydrologic Engineering 19 (5), 1015–1025. doi:10.1061/(asce)he.1943-5584.0000869.

Zaini, N., Malek, M. A., Yusoff, M., Mardi, N. H. & Norhisham, S. 2018 Daily river flow forecasting with hybrid support vector machine – particle swarm optimization. In: IOP Conference Series: Earth and Environmental Science, Vol. 140, 4th International Conference on Civil and Environmental Engineering for Sustainability.

Zhang, N., Song, D., Zhang, J., Liao, W., Miao, K., Zhong, S., Lin, S., Hajat, S., Yang, L. & Huang, C. 2019a The impact of the 2016 flood event in Anhui Province, China on infectious diarrhea disease: an interrupted time-series study. Environment International 127, 801–809. doi:10.1016/j.envint.2019.03.063.

Zhang, Q., Li, Z., Snowling, S., Siam, A. & El-Dakhakhni, W. 2019b Predictive models for wastewater flow forecasting based on time series analysis and artificial neural network. Water Science and Technology 80 (2), 243–253. doi:10.2166/wst.2019.263.

Zhu, S., Heddam, S., Wu, S., Dai, J. & Jia, B. 2019 Extreme learning machine-based prediction of daily water temperature for rivers. Environmental Earth Sciences 78 (6). doi:10.1007/s12665-019-8202-7.