Abstract
In this study we have developed a conjunction model, WGP, of discrete wavelet transform (DWT) and genetic programming (GP) for forecasting river floods when the only data available are the historical daily flows. DWT is used for denoising and smoothening the observed flow time series on which GP is implemented to get the next-day flood. The new model is compared with autoregressive (AR) and stand-alone GP models. All models are calibrated and tested on the Kosi River which is one of the most devastating rivers of the world with high and spiky monsoon flows, modeling of which poses a great challenge. With different inputs, 12 models, four in each class of WGP, GP and AR, are devised. The best performing WGP model, WGP4, with four previous daily flow rates as input, forecasts the Kosi floods with an accuracy of 87.9%, root mean square error of 123.9 m3/s and Nash–Sutcliffe coefficient of 0.993, the best performance indices among all the developed models. The extreme floods are also better simulated by the WGP models than by AR and GP models.
INTRODUCTION
Floods kill more people and destroy more property every year than any other type of natural disaster. According to the estimate by the World Resources Institute, India has a larger percentage of population exposed to flood damage than any other country of the world. Bangladesh, China, Vietnam, Pakistan and Indonesia are other countries with significant percentages of their population at flood risk. Additionally, it is worrying that climatic change is increasing the frequency and severity of extreme events (Li et al. 2016). However, flood damage can be considerably reduced if their occurrence is predicted reliably in advance. Authorities require a timely and reliable estimate of any impending deluge to devise appropriate action plans. Numerous models developed for flood forecasting in the last decades broadly employ three approaches: conceptual, statistical, or artificial intelligence. Although the conceptual approach has an apparent physical concept and consistent accuracy, its heavy hydrometeorological data requirement is a deterrent to its use. The statistical approach, on the other hand, is simple to formulate and execute, but does not give satisfactory results as it is static and deficient in revealing the nonlinear relationship that may exist between the affecting variables. In comparison, models based on artificial intelligence, such as artificial neural networks (ANN), have the ability to learn the fluctuating relationship between the input and output without knowledge of the physical processes occurring within the system; however, their inability to produce an explicit model for use of other investigators is a clear disadvantage. In addition, their optimal structures are hard to determine and the network solutions are easily trapped in local optima.
Genetic programming (GP), however, is a recently developed evolutionary computing technique which generates a transparent and structured representation of the studied system (Koza 1992). It is free from the deficiencies of ANN and provides an explicit model. These advantages are successfully applied in hydrological investigations (Savic et al. 1999; Babovic & Keijzer 2002; Dorado et al. 2003; Aytek & Kisi 2008; Nourani et al. 2008; Azamathulla et al. 2010; Shiri & Kisi 2010; Kisi & Shiri 2012; Shiri et al. 2012a, 2012b; Fallah-Mehdipour et al. 2012; Rodríguez-Vázquez et al. 2012; Fallah-Mehdipour et al. 2013; Kisi et al. 2013; Ashofteh et al. 2014; Aytek et al. 2014; Garg 2014; Hakimzadeh et al. 2014; Uyumaz et al. 2014; Yaseen et al. 2015; Adhikary et al. 2016; Barge & Sharif 2016; Karimi et al. 2017). However, when the time series is highly fluctuating and nonstationary, preprocessing of the series through a suitable technique is desirable for better predictability (Cannas et al. 2006). The discrete wavelet transform (DWT) is an important signal decomposition technique that extracts valuable characteristics of signals at different resolution levels and which, when combined with an evolutionary algorithm such as GP, provides an efficient nonlinear approximation model. Very little literature is available on the combined use of DWT and GP in hydrology. Kisi & Shiri (2011) formulated WGP for daily precipitation forecasting. Shiri & Kisi (2012) modeled daily sediment load and showed that it performed better than wavelet-neuro-fuzzy and wavelet-ANN models. Nourani et al. (2012) developed a hybrid model of wavelet transform and GP to optimize the ANN structure for rainfall-runoff forecasting and the results thus obtained were found to be better than ANN and GP models. Gorgij et al. (2017) developed a WGP model for groundwater budget forecasting in East Azerbaijan, Iran. Karimi et al. (2016) implemented WGP for flow forecasting in the small River Filyos in Turkey and showed it to be more efficient than autoregressive moving average and ANN models.
In the present work, a novel model combining DWT and GP is developed for predicting the next-day Kosi flows for the monsoon period at Baltara gauge site (India) where the only records available are the previous day's flow rates. The motivation for using the WGP conjunction model arises from the fact that the wavelet analysis captures the nonstationary and seasonal effects of the flow time series and the input parameters are minimized based on sensitivity analysis through the GP framework (Nourani et al. 2012). As mentioned above, only limited literature is available in which wavelet transform has been combined with GP for stream flow forecasting in small rivers. However, this should be perhaps the first time that WGP has been evaluated for flood forecasting in a large river like the Kosi, which has a highly undulating monsoon flow that is marked by drifts, trends and abrupt changes.
MODELS
Autoregression


Discrete wavelet transform



In this work, the time series is decomposed using the Mallat (1989) filtering process in which the time series is passed through the low-pass and high-pass filters simultaneously. The outcomes through the low-pass filters are the high-scale and low-frequency wavelets. They are called the approximation wavelets/subtime series, Ai, at resolution level i. They represent gross and slow-changing features of the signal. On the other hand, the outcomes through the high-pass filters are low-scale and high-frequency wavelets, called the detail wavelets/subtime series, Di. They measure the rapidly changing features of the signal. The approximation subtime series can be further decomposed into approximation and detail wavelets for the next decomposition/resolution level. A three-level wavelet decomposition is illustrated in Figure 2.
Mallat (1989) method of time series decomposition into approximation and details subtime series.
Mallat (1989) method of time series decomposition into approximation and details subtime series.
The original signal can again be regained by adding all the detail subtime series and the approximation subtime series of the last resolution level.
Genetic programming
GP is a comparatively new machine learning method that randomly generates a population of computer programs. These programs are potential candidates for solution. The fitness, i.e., suitability of each generated candidate for solution, is evaluated. The ‘fit’ candidates are kept and ‘misfit’ candidates are removed from the next generation. In addition, some new candidates are formed randomly by changing (mutation) or swapping parts of the other candidates (crossover). The process is repeated until the generated programs of the newer generation meet the objective of minimizing the root mean square error between the observed and the predicted flows for the derivation dataset. Although GP and genetic algorithms (GA) have similar working structure, GP has some distinct advantages over GA. While GA has a fixed length, meaning the resulting function has bounded complexity, GP inherently has a variable length making it more flexible with no bounded complexity. Furthermore, GA relies on operator precedence which could be seen as a limitation, whereas GP uses an explicit structure to avoid operator precedence (Kisi & Shiri 2011). The books by Koza (1992) and Banzhaf et al. (1998) have given a good description of GP.
Discrete wavelet transform – genetic programming (WGP)
The WGP is an integration of DWT and GP. DWT is used for denoising the flow time series. This is done by decomposing the series by DWT into approximation and detail subtime series. Generally, the detail subtime series on the first resolution level, i.e., D1, is found to be the noisiest and least correlated subtime series with the original series (Kisi & Shiri 2012). Therefore, by deleting D1 and recombining other wavelets, a wavelet-smoothened series is formed. This transformed series is the input for the GP implementation. The working arrangement for WGP is shown in Figure 3.
DATA
Kosi river
The monsoon floods are a recurring hazard in Eastern India. The Kosi river, in particular, causes widespread human suffering every year through flooding. It is a transboundary Himalayan river with the major portion of its catchment lying in China and Nepal. Although only about 11,410 km2 out of its total catchment area of 74,030 km2 lies in India, it bears the brunt of the Kosi's fury. Any heavy rainfall in its upper catchment results in high floods which rush towards the Indian plain and take a heavy toll on life and property. Additionally, its steep slope in the gorge zone causes high flow velocity which erodes the fragile Himalayan topsoil resulting in a silt yield of about 19 m³/ha/year, one of the highest in the world. The silt is released when the flow velocity diminishes in the plain, causing frequent shifts in the river's course. It has moved laterally westward by about 114 km in the last 250 years. To control the flood damage and confine its course, the government of the state of Bihar has constructed earthen embankments on both sides of the river, without much satisfactory result. The embankments are often breached or overtopped, submerging a large area. The disaster happens almost every year. In the year 2008, the embankment breach was so catastrophic that the government had to declare it a national calamity and a large contingent of the army had to be called in to evacuate the affected people. The Kosi flow increases manyfold during monsoon as many big rivers such as the Kamala, Bagmati and Bhutahi Balan flow into it. The Kosi catchment along with its adjoining rivers is shown in Figure 4.
In this research, a MATLAB-based open-source genetic-programming tool, GPTIPS (Searson et al. 2010) is used on the wavelet-smoothened time series of the Kosi river at Baltara to predict the next-day flows. Different GP parameters, i.e., population size, number of generations, tournament size, elitism, depth of tree, number of genes in a tree and the allowed mathematical functions are tried in every run. The objective for each GP-run is to minimize the root mean square error between the observed and the predicted flows on the derivation dataset. The derivation dataset consisted of daily discharges of the monsoon period (June–October) for the years 2001–4, 2006–7 and 2009–12. The discharges for the years 2005 and 2008 could not be considered, as the river breached its banks in these years and formed two streams. After deriving the proposed models satisfactorily using the derivation dataset, they are tested on the verification dataset which consisted of monsoon discharges for the years 2013–16. The statistical information on the Kosi flow at Baltara is shown in Table 1.
The flow characteristics for the Kosi at Baltara (2002–16)
Parameter (m3/s) . | Derivation dataset . | Verification dataset . |
---|---|---|
Max. daily discharge | 7,265 | 6,268 |
Min. daily discharge | 866 | 653 |
Mean daily discharge | 4,255 | 3,905 |
Standard deviation | 1,367 | 1,293 |
Range | 6,399 | 5,615 |
Parameter (m3/s) . | Derivation dataset . | Verification dataset . |
---|---|---|
Max. daily discharge | 7,265 | 6,268 |
Min. daily discharge | 866 | 653 |
Mean daily discharge | 4,255 | 3,905 |
Standard deviation | 1,367 | 1,293 |
Range | 6,399 | 5,615 |
As the first step, the observed flow time series (OFTS) of the Kosi is dissociated into wavelets /subtime series at three decomposition levels (Figure 5). Wang & Ding (2003) suggested int[log(n)] decomposition levels sufficient for bringing out the attributes of the signal, where n is the data length. As only 1,137 daily flow data are used for deriving the models, three decomposition levels are considered adequate. Furthermore, a suitable mother wavelet is critical to the efficiency of a wavelet-based model. Some of the popular wavelets used in hydrological studies are haar, coif5, dmey, bior6.5, rbio6.8, sym9 and db5. In the present study, rbio6.8 is found to be the most appropriate wavelet for preprocessing of the Kosi flows at Baltara, as the prediction based on it has the maximum correlation coefficient and the minimum root mean square error, both for the derivation and the verification datasets (Table 2).
Performance of the WGP4 model with different mother wavelets. NSC, Nash–Sutcliffe coefficient; RSME, root mean square error
Mother wavelet . | Derivation period . | Verification period . | ||
---|---|---|---|---|
NSC . | RMSE (m3/s) . | NSC . | RMSE (m3/s) . | |
coif5 | 0.983 | 161 | 0.973 | 159 |
dmey | 0.956 | 172 | 0.963 | 169 |
bior6.8 | 0.949 | 181 | 0.950 | 178 |
rbio6.8 | 0.989 | 155 | 0.992 | 124 |
sym9 | 0.957 | 170 | 0.953 | 174 |
db5 | 0.986 | 159 | 0.980 | 160 |
haar | 0.883 | 196 | 0.869 | 191 |
Mother wavelet . | Derivation period . | Verification period . | ||
---|---|---|---|---|
NSC . | RMSE (m3/s) . | NSC . | RMSE (m3/s) . | |
coif5 | 0.983 | 161 | 0.973 | 159 |
dmey | 0.956 | 172 | 0.963 | 169 |
bior6.8 | 0.949 | 181 | 0.950 | 178 |
rbio6.8 | 0.989 | 155 | 0.992 | 124 |
sym9 | 0.957 | 170 | 0.953 | 174 |
db5 | 0.986 | 159 | 0.980 | 160 |
haar | 0.883 | 196 | 0.869 | 191 |
The Kosi flow decomposition into approximation and detail wavelets at three resolution levels.
The Kosi flow decomposition into approximation and detail wavelets at three resolution levels.
Again, a correlation study between the subtime series and the original flow time series showed D1, the first detail subtime series is found to be the noisiest and the least correlated wavelet to the original flow time series. Therefore, ignoring D1 and re-adding wavelets D2, D3 and A3, a wavelet-smoothened flow time series (WFTS) is developed which formed inputs for WGP models. The AR and GP models use OFTS for inputs. Based on different inputs, 12 models (four in each class of WGP, GP and AR) are developed, i.e., models WGP1, WGP2, WGP3 and WGP4 in the WGP class, models GP1, GP2, GP3 and GP4 in the GP class, and models AR1, AR2, AR3 and AR4 in the AR class. The model WGP3, for instance, has three inputs, i.e., the current-day, one-day-before and two-days-before flow rates from WFTS, while models GP3 and AR3 have inputs from OFTS (Table 3). The desired output for all the proposed models is the next-day flow.
Input variables of the proposed models
Model . | Inputs . | Target . |
---|---|---|
AR1/GP1 | Qo,t | Qo,t+1 |
AR2/GP2 | Qo,t, Qo,t-1 | Qo,t+1 |
AR3/GP3 | Qo,t, Qo,t-1,Qo,t-2 | Qo,t+1 |
AR4/GP4 | Qo,t, Qo,t-1,Qo,t-2,Qo,t-3 | Qo,t+1 |
WGP1 | qt | Qo,t+1 |
WGP2 | qt, qt-1 | Qo,t+1 |
WGP3 | qt, qt-1,qt-2 | Qo,t+1 |
WGP4 | qt, qt-1,qt-2,qt-3 | Qo,t+1 |
Model . | Inputs . | Target . |
---|---|---|
AR1/GP1 | Qo,t | Qo,t+1 |
AR2/GP2 | Qo,t, Qo,t-1 | Qo,t+1 |
AR3/GP3 | Qo,t, Qo,t-1,Qo,t-2 | Qo,t+1 |
AR4/GP4 | Qo,t, Qo,t-1,Qo,t-2,Qo,t-3 | Qo,t+1 |
WGP1 | qt | Qo,t+1 |
WGP2 | qt, qt-1 | Qo,t+1 |
WGP3 | qt, qt-1,qt-2 | Qo,t+1 |
WGP4 | qt, qt-1,qt-2,qt-3 | Qo,t+1 |
Note:Qo,t-i and qt-i, (i=0 to 3) are i-day-before flows from OFTS and WFTS, respectively, while Qo,t+1 is the next-day flow from OFTS.
After models are satisfactorily derived, they are evaluated using the following performance indices:
- (i)
- (ii)




RESULTS
The proposed AR, GP and WGP models are derived and verified for forecasting of the next-day Kosi floods for the monsoon period (June–October) at the Baltara gauge site. The performance of the devised models as presented in Table 4 clearly suggests that WGP is superior to GP and AR models. This is because the WGP model captures long, intermediate and short patterns in the flow time series, whereas GP and AR capture short patterns only (Nourani et al. 2012; Papacharalampous et al. 2017).
The performance of the proposed models for the Kosi River at Baltara (India)
Set . | Model . | Derivation dataset . | Verification dataset . | ||||
---|---|---|---|---|---|---|---|
NSC . | RMSE (m3/s) . | Accuracy (%) . | NSC . | RMSE (m3/s) . | Accuracy (%) . | ||
Set 1 | AR1 | 0.96 | 258 | 78.0 | 0.97 | 214 | 70.5 |
GP1 | 0.96 | 257 | 77.8 | 0.97 | 215 | 69.9 | |
WGP1 | 0.97 | 227 | 79.4 | 0.98 | 185 | 74.6 | |
Set 2 | AR2 | 0.97 | 249 | 81.1 | 0.97 | 207 | 72.7 |
GP2 | 0.97 | 249 | 81.0 | 0.97 | 207 | 72.9 | |
WGP2 | 0.98 | 198 | 83.2 | 0.99 | 159 | 79.4 | |
Set 3 | AR3 | 0.97 | 249 | 81.1 | 0.97 | 207 | 72.7 |
GP3 | 0.96 | 246 | 81.2 | 0.97 | 206 | 73.0 | |
WGP3 | 0.98 | 175 | 86.9 | 0.99 | 135 | 77.4 | |
Set 4 | AR4 | 0.97 | 248 | 81.0 | 0.97 | 207 | 72.7 |
GP4 | 0.97 | 245 | 81.4 | 0.97 | 206 | 73.1 | |
WGP4 | 0.99 | 155 | 88.4 | 0.99 | 124 | 87.9 |
Set . | Model . | Derivation dataset . | Verification dataset . | ||||
---|---|---|---|---|---|---|---|
NSC . | RMSE (m3/s) . | Accuracy (%) . | NSC . | RMSE (m3/s) . | Accuracy (%) . | ||
Set 1 | AR1 | 0.96 | 258 | 78.0 | 0.97 | 214 | 70.5 |
GP1 | 0.96 | 257 | 77.8 | 0.97 | 215 | 69.9 | |
WGP1 | 0.97 | 227 | 79.4 | 0.98 | 185 | 74.6 | |
Set 2 | AR2 | 0.97 | 249 | 81.1 | 0.97 | 207 | 72.7 |
GP2 | 0.97 | 249 | 81.0 | 0.97 | 207 | 72.9 | |
WGP2 | 0.98 | 198 | 83.2 | 0.99 | 159 | 79.4 | |
Set 3 | AR3 | 0.97 | 249 | 81.1 | 0.97 | 207 | 72.7 |
GP3 | 0.96 | 246 | 81.2 | 0.97 | 206 | 73.0 | |
WGP3 | 0.98 | 175 | 86.9 | 0.99 | 135 | 77.4 | |
Set 4 | AR4 | 0.97 | 248 | 81.0 | 0.97 | 207 | 72.7 |
GP4 | 0.97 | 245 | 81.4 | 0.97 | 206 | 73.1 | |
WGP4 | 0.99 | 155 | 88.4 | 0.99 | 124 | 87.9 |



Figure 6 shows the comparison of the high floods (>6,000 m3/s) predicted for the derivation dataset by the best models of their class, i.e., models WGP4, GP4 and AR4. The WGP4 is again seen to be the most reliable for forecasting the high flows, as the predicted series almost resembles the observed high flows of the Kosi, whereas models AR4 and GP4 show greater deviation. Once the models in Set 4 are satisfactorily derived, they are tested for the verification dataset. It can be seen from Table 4 that in this dataset also the WGP4 is clearly the best forecasting model. It predicts the floods with the highest NSC value of 0.99, the least RMSE value of 124 m3/s and the maximum accuracy of 87.9%.
The comparison of the predicted and observed high flows (>6,000 m3/s) of the Kosi at Baltara (derivation dataset).
The comparison of the predicted and observed high flows (>6,000 m3/s) of the Kosi at Baltara (derivation dataset).
Figure 7 compares the predicted and the observed high floods (flows >5,500 m3/s) for the verification dataset. It can be observed from Table 4 that WGP4 predicts the high flows better than the GP4 and AR4 models. Its predictions for the highest three floods (of 6,268 m3/s, 6,203 m3/s and 6,181 m3/s) as 6,228 m3/s, 6,214 m3/s and 6,179 m3/s respectively, are closest to the observed values, while models AR4 and GP4 either under- or over-predict these floods as 6,160 m3/s, 6,233 m3/s and 6,029 m3/s respectively, and 6,150 m3/s, 6,222 m3/s and 6,019 m3/s respectively. Understandably, there is prevalence of more noise in high discharges which are not normally measured directly but extrapolated from the stage-discharge rating curves (Tiwari & Chatterjee 2011). All models perform similarily in case of low flows which are influenced more by the moisture and vegetative condition of the soil.
The comparison of the predicted and observed high flows (>5,500 m3/s) of the Kosi at Baltara (verification dataset).
The comparison of the predicted and observed high flows (>5,500 m3/s) of the Kosi at Baltara (verification dataset).
The summarized results show that the WGP model is suitable for flood forecasting even in large and unpredictable rivers like the Kosi, and outperforms the AR and GP models. The AR and GP models, in comparison, have modeling abilities which are in agreement with the findings of other investigators (Savic et al. 1999; Nourani et al. 2012). By combining GP with wavelet transform, seasonality characteristics of a complex nonlinear process could be captured. However, as also reiterated by Papacharalampous et al. (2017), it must be noted here that it is the specific context, available data and the process under which a model is developed that render it effective or ineffective. For example, in the present research, WGP is seen forecasting next-day Kosi flows at Baltara satisfactorily and compares well to GP and AR models, but it must be remembered that we could include monsoon data for only 13 years with limited number of high flows. This data length may be insufficient for modeling a large and complex river system like the Kosi. Moreover, during monsoon, heavy downpours in the upper reaches of Tibet and Nepal generate flooding in the Kosi, while in winter, glacier melt appears as stream flow. Therefore, it would be imprudent to suggest that the same model would be appropriate for the other rivers as well because the relevant parameters may be different.
CONCLUSION
The main objective of the present research is to examine the usefulness of the conjunction model of DWT and GP for flood forecasting in large rivers when the only data available are historical flow records. The results clearly suggest that wavelet-transfomed data improve the approximation ability of GP. The best performing WGP model, with four previous daily discharges as input, predicts the flows with the highest accuracy of 87.9% for the verification dataset of the Kosi River at the Baltara site, while autoregressive and stand-alone GP models predict flows with only 72.7% and 73.1% accuracies respectively. The peak flows are also predicted more accurately by the WGP model than by AR and GP models. In conclusion, DWT is an important data preprocessing method which denoises the flow time series and improves the efficiency of a forecasting tool.