## Abstract

Rainfall–runoff modelling is crucial for enhancing the effectiveness and sustainability of water resources. Conceptual models can have difficulties, such as coping with nonlinearity and needing more data, whereas data-driven models can be deprived of reflecting the physical process of the basin. In this regard, two hybrid model approaches, namely Génie Rural à 4 paramètres Journalier (GR4 J)–wavelet-based data-driven models (i.e., wavelet-based genetic algorithm–artificial neural network (WGANN); GR4 J–WGANN_{1} and GR4 J–WGANN_{2}), were implemented to improve daily rainfall–runoff modelling. The novel GR4 J–WGANN_{1} hybrid model includes the outflow (*QR*) and direct flow (*QD*) obtained from the GR4 J model, and the GR4 J–WGANN_{2} hybrid model includes the soil moisture index (*SMI*) obtained from the GR4 J model as input data. In hybrid models, wavelet analysis and the Boruta algorithm were implemented to decompose input data and select wavelet components. Four gauging stations in the Eastern Black Sea and Kızılırmak basins in Turkey were used to observe modelling performance. The GR4 J model exhibited poor performance for extreme flow forecasting. The novel GR4 J–WGANN_{1} approach performed better than the GR4 J–WGANN_{2} model, and the hybrid models improved modelling performance up to 40% compared to the GR4 J model. In this regard, integrated conceptual–wavelet-based data-driven models can be useful for improving the conceptual model performance, especially regarding extreme flow forecasting.

## HIGHLIGHTS

Two hybrid models (i.e., GR4 J–WGANN

_{1}(Génie Rural à 4 paramètres Journalier-wavelet–genetic algorithm–artificial neural network) and GR4 J–WGANN_{2}) and the GR4 J conceptual model were implemented for rainfall–runoff modelling.The Boruta algorithm indicated a higher importance level of outputs obtained from the GR4 J model as input data than other variables in hybrid models.

The hybrid models improved the performance by up to 40% compared to the GR4 J model.

The proposed GR4 J–WGANN

_{1}model yielded better results than the GR4 J–WGANN_{2}model.

### Graphical Abstract

## INTRODUCTION

Rainfall–runoff modelling has been one of the significant issues in hydrology concerning water resources management and planning. Due to climate change, extreme events, such as floods and droughts, have been observed frequently worldwide in recent years. Hydrological modelling has become crucial for efficient water management and precautions against extreme events. Different hydrological modelling methods, such as conceptual, data-driven, and physically based models, have been put forward. Beven (1989) discussed the applicability of the physically based models in reality and stated the limitations derived from the incompatibility of the equations in the models for the heterogeneous conditions, the lack of theory related to scaling integration, and practical restrictions on solution methodologies and dimensionality problems in parameter calibration. Beven (2001) argued the problems related to implementing distributed hydrological models, such as nonlinearity, scale, equifinality, uniqueness, and uncertainty problems, and focused on the alternative approach based on the prior evaluation of models regarding physical realism on the value of data in model rejection. Frame *et al.* (2021) investigated the performance of three different long short-term memory (LSTM)-based models depending on the usage of only United States National Water Model (NWM) outputs (LSTM_PP), LSTM post-processor trained on the NWM outputs and atmospheric forcings (LSTM_PPA), and an LSTM model trained only on atmospheric forcing (LSTM_A) for daily streamflow prediction. It was stated by Frame *et al.* (2021) that all three LSTM deep learning model structures performed better than the NWM, and LSTM significantly contributed to the performance improvement of the NWM. Rozos *et al.* (2022) used the machine learning approximators (i.e., LSTM and recurrent neural network (RNN)) to evaluate the HYMOD2 and linear regression hydrological modelling (LRHM) models (Rozos *et al.* 2020), and they found significant performance improvements for runoff forecasting. They pointed out that this approach can be useful for dealing with the poor performance of the hydrological models due to anomalies in the catchment dataset without physical processes being taken into account. Koutsoyiannis & Montanari (2022) introduced a method called ‘Bluecat’ for the simulation and prediction of hydrological processes, which depends on using a generic deterministic model that can be converted to a stochastic formulation. Thus, it can connect the deterministic and stochastic model approaches and simulate and predict hydrologic variables with uncertainty evaluation.

Conceptual models are based on the relationship between hydrometeorological variables, which are usually shown via mathematical equations. Conceptual models have many advantages, such as observing the physical relationship between the parameters and having fewer required parameters than the distributed models, which can also be used for rainfall–runoff modelling as an another approach (Zhou *et al.* 2019). However, conceptual models can require a larger dataset and can have challenges for the calibration process (Bai *et al.* 2021; Jahandideh-Tehrani *et al.* 2021). Several conceptual models, such as Génie Rural à 4 paramètres Journalier (GR4 J), have been extensively used for rainfall–runoff modelling in the last two decades. The GR4 J model is a lumped daily conceptual model with four free parameters introduced by Perrin *et al.* (2003). Perrin *et al.* (2003) showed the usefulness and effectiveness of the GR4 J model by comparing it with other conceptual models, which have more parameters, such as Xinanjiang (XAJ) (Zhao & Liu 1995), Hydrologiska Byråns Vattenbalansavdelning (HBV) (Bergström 1995), and Identification of unit Hydrographs And Component flows from Rainfall Evapotranspiration and Streamflow (IHACRES) (Jakeman *et al.* 1990). In this regard, the GR4 J model has been implemented in many basins across the world (Massmann 2015; Dakhlaoui *et al.* 2017; Brulebois *et al.* 2018; Sezen *et al.* 2019). Anshuman *et al.* (2018) implemented GR4 J and Australian Water Balance (AWBM) models for rainfall–runoff modelling at two catchments in the upper Godavari basin in India. They revealed that the GR4 J model yielded better than the AWBM model for streamflow forecasting, and the GR4 J model can be useful for water balance modelling despite its simple structure. Ghimire *et al.* (2020) used GR4 J, IHACRES catchment wetness index (IHACRES CWI), Hydrological Engineering Center-Hydrological Modeling System (HEC-HMS), and Soil & Water Assessment Tool (SWAT) models in Ayeyarwady, in Myanmar. They stated that process-based and relatively complex models could be useful in data-rich catchments, whereas conceptual models with simple model structures, such as the GR4 J model, can be applicable in data-limited basins for rainfall–runoff modelling. Wijayarathne & Coulibaly (2020) investigated the performance of GR4 J, Sacramento Soil Moisture Accounting (SAC-SMA), McMaster University HBV (MAC-HBV), HEC-HMS, and University of Waterloo Flood Forecasting System (WATFLOOD) models in the Waterford River watershed, Canada. They found out that SAC-SMA and GR4 J models exhibited better performance than other models with regard to the estimation of low, medium, and high flows. All these studies showed the usefulness and applicability of the GR4 J model in different basins.

The data-driven models, which are based on the relationship between input and output data, are less dependent on the physical process of the basin. Artificial intelligence-based data-driven models can have advantages such as coping with the nonlinear hydrological process and a large amount of dynamicity in datasets (Nourani *et al.* 2014). Papacharalampous *et al.* (2019) implemented several stochastic (such as autoregressive integrated moving average (ARIMA) and autoregressive fractionally integrated moving average (ARFIMA)) and machine learning methods (such as artificial neural networks (ANN), support vector machines (SVM), and random forest (RF)) for the hydrological processes forecasting. They found out that the forecasting performance of the stochastic and machine learning methods did not vary significantly. Tyralis *et al.* (2019) reviewed the different implementations of RF in water resources. They indicated the applicability and usefulness of RF in many aspects such as regression, classification, and variable selection. Although the applicability of data-driven models has been indicated in many studies, they can also have some shortcomings such as forecasting extreme values and inadequacy for handling non-stationary data (Cannas *et al.* 2006; Kurtulus & Razack 2007; Shoaib *et al.* 2018). In this regard, data pre-processing techniques, such as wavelet transformation with the data-driven models, have been widespread (Kalteh 2013; Ramana *et al.* 2013; Sharghi *et al.* 2019). Wavelet transformation can improve the performance of data-driven models by isolating the noise and providing spatial and temporal information for time series (Nourani *et al.* 2013; Shoaib *et al.* 2018). In recent years, the performance of conceptual and data-driven models has been compared, and they have been combined for rainfall–runoff modelling. Humphrey *et al.* (2016) integrated the GR4 J model into the ANN model and used the hybrid model for monthly streamflow forecasting. They compared the performance of the hybrid model with GR4 J and ANN models. In this regard, Humphrey *et al.* (2016) pointed out that the hybrid model yielded more accurate forecasting results than the GR4 J and ANN models, especially for high flows. Kumanlioglu & Fistikoglu (2019) integrated the GR4 J and ANN models by using net rainfall (*Pn*), net evapotranspiration (*En*), percolation (*Perc*), and the part of *Pn* filling the production store (*Ps*) obtained from the GR4 J model as input data and genetic algorithm (GA) for the calibration of the conceptual model. They stated that the integrated model approach could be useful with regard to more robust and better modelling performance, especially in highly nonlinear conditions. Ghaith *et al.* (2020) implemented a hybrid model, which is based on the integration of the HYdrological MODel (HYMOD) and ANN models, a hybrid hydrological data-driven (HHDD) model, for daily streamflow forecasting. They stated that HHDD yielded better and more robust prediction results than the HYMOD. Wang *et al.* (2021) integrated the XAJ model with the wavelet-based random forest (WRF) model for daily rainfall–runoff modelling. They revealed that the XAJ–WRF model yielded better simulation results than the XAJ and RF models, and wavelet analysis contributed to the performance improvement. Okkan *et al.* (2021) implemented two different hybrid model approaches, namely nested and parallel coupling methods, to integrate the dynamic water balance model (dynwbm) and ANN, support vector regression (SVR) models for monthly rainfall–runoff modelling in the Gediz River basin, Turkey. They indicated the outperformance of the hybrid model approaches against the dynwbm model and stated that the nested hybrid model approach could be useful for streamflow simulation. As it can be seen, with the integration of the benefits of both the conceptual and data-driven models, hybrid models can improve the rainfall–runoff modelling performance.

In this study, two different hybrid GR4 J and wavelet-based genetic algorithm–ANN (WGANN) (i.e., GR4 J–WGANN_{1} and GR4 J–WGANN_{2}) model approaches were implemented to improve the modelling performance of the GR4 J model. In the first hybrid model (GR4 J–WGANN_{1}), routing store outflow (*QR*) and direct flow (*QD*) acquired from the GR4 J model, which represent the novelty of this study, were used as input variables. In the second model (GR4 J–WGANN_{2}), the soil moisture index (*SMI*) obtained from the GR4 J model was used as an input variable in the optimized ANN model by the GA. Implementing the coupled GR4 J conceptual and wavelet-based data-driven models will be another novel aspect of this study. Furthermore, the Boruta algorithm was also implemented to select wavelet components for each input data in hybrid models as a new approach. Thus, the performance of the GR4 J, GR4 J–WGANN_{1}, and GR4 J–WGANN_{2} models will be compared for daily rainfall–runoff modelling. The models will be implemented in three sub-basins of the Eastern Black Sea, and one sub-basin of the Kızılırmak basin, Turkey. In this regard, it will be observed whether both hybrid models can develop the performance of the GR4 J model.

## DATA AND METHODS

### Basin characteristics and data

In this study, discharge data of four gauging stations, namely Cevizdere Düzköy, Akçay Duraklı, and Cura Çayı Curi, in the western part of the Eastern Black Sea basin and Çağşur Çayı Esenler in the Kızılırmak basin in Turkey were used for daily rainfall–runoff modelling. The Turkish General Directorate of State Hydraulic Works provided the runoff data, and the Turkish State Meteorological Service provided temperature and precipitation data. The Eastern Black Sea basin has an irregular flood regime, and flood cases can be observed. In this regard, floods have led to economic and human losses in recent years (Republic of Turkey Ministry of Agriculture & Forestry General Directorate of Water Management 2020). In addition, precipitation also has a changing pattern due to the geographical formation of the basin. Precipitation occurs regularly in coastal parts of the basin during all seasons, whereas the summer season is usually arid in the terrestrial part of the basin. In general, coastal parts have moderate climatic conditions, while continental climate conditions dominate the interior parts. In the Kızılırmak basin, moderate climate conditions also dominate the coastal parts compared to the interior parts. However, the precipitation is relatively less in the coastal part of the Kızılırmak basin than in the Eastern Black Sea basin. The meteorological and discharge gauging stations used in this study are shown in Figure 1. Areal precipitation was calculated by using a Thiessen polygon for Ceviz, Akçay, Cura, and Çağşur streams via ArcGIS software (Figure 1). The temperature data of Ünye (17624) for Ceviz and Cura streams, Terme/Kozluk Beldesi (18544) for the Akçay stream (18544) in the Eastern Black Sea basin, and Bafra (17622) for the Çağşur stream in the Kızılırmak basin were used. The runoff and hydrometeorological data cover the period between 01.01.2012 and 30.09.2020 in Ceviz, Akçay, Cura, and Çağşur streams. Accordingly, the total data of meteorological and discharge stations consists of 3,196 days. The data ranges for meteorological and discharge gauging stations can be seen in Table 1. The first year of the data period was used as a warm-up period in the GR4 J model. Then, the period between 01.01.2013 and 04.06.2018 was used for the calibration and 05.06.2018 and 30.09.2020 for the validation in conceptual and hybrid models for Ceviz, Akçay, Cura, and Çağşur streams. The suggestion of Klemeš (1986) was taken into account to determine the calibration and test periods. Klemeš (1986) referred that the data can be split as 70% for the calibration and 30% for the validation if the available data are not long enough. Missing precipitation data were imputed by using RF via the ‘missForest’ package, part of R software (Stekhoven & Bühlmann 2012; Stekhoven 2013), whereas missing temperature data of the Terme/Kozluk Beldesi station were imputed by regression analysis using the temperature data of the Ünye station. The applicability and useful implementation of missForest, which is one of the RF algorithms, were shown for imputing missing data (Tang & Ishwaran 2017; Aguilera *et al.* 2020; Addi *et al.* 2022). Addi *et al.* (2022) stated that missForest imputation yielded good performance in predicting the dry and wet periods and extreme rainfall values. As for imputing temperature data, regression analysis was preferred due to the high correlation (*r*=0.993) between the temperature data of Terme/Kozluk Beldesi and Ünye stations. The potential evapotranspiration data can be calculated using several methods such as Penman––Monteith, Hargreaves, Oudin, and Thornthwaite. The performance of different evapotranspiration calculation methods has been investigated in many studies (Tegos *et al.* 2015; Kodja *et al.* 2020). Tegos *et al.* (2015) compared the performance of an evapotranspiration model depending on a simplification of the Penman–Monteith formula with radiation and temperature-based methods. They found that the simplified version of the Penman–Monteith model approach can be useful and efficient in evaluating potential evapotranspiration fields. Kodja *et al.* (2020) implemented the Penman–Monteith and Oudin methods for the evapotranspiration calculation in the GR4 J model in tropical basins of West Africa. They found that implementing different evapotranspiration formulations had few impacts on the GR4 J model. Flores *et al.* (2021) also indicated the effectiveness and outperformance of the Oudin formula for rainfall–runoff modelling. Accordingly, evapotranspiration was calculated using the Oudin formula (Oudin *et al.* 2005), which is based on the daily temperature data and the basin location, in this study. Oudin *et al.* (2005) showed that the Oudin formula could be practical and efficient for rainfall–runoff modelling. It requires less data to calculate evapotranspiration than other methods like the Penman formula. One can refer to the study of Oudin *et al.* (2005) for further details regarding the calculation of the Oudin formula. Statistical data regarding the runoff and hydrometeorological data can be seen in Table 2. In this respect, the calculated minimum, maximum, mean, standard deviation, skewness, and excess kurtosis (i.e., kurtosis coefficient minus 3) values are given in Table 2. The statistical data, such as mean, standard deviation, skewness, and excess kurtosis belonging to the time series used in this study are generally in agreement with the rest of the hydrometeorological data of many stations around the world, which were statistically analysed by Dimitriadis *et al.* (2021a, 2021b). This agreement can be evaluated as a significant indicator with regard to the compatibility of the time series for the analysis. As for the investigation of the long-term persistence of time series, the Hurst phenomenon has been a widely used method in many studies (Koutsoyiannis 2003, 2020; Dimitriadis *et al.* 2021a, 2021b). In this regard, the Hurst exponent, which was introduced by Hurst (1951), was calculated to identify the long-term persistence in precipitation, evapotranspiration, and streamflow time series in this study and is presented in Table 2. The Hurst parameter (*H*) for 0.5<*H*<1 refers to long-term persistence behaviour in the time series, for 0<*H*<0.5 the anti-persistent behaviour, and for *H*=0.5 a white-noise behaviour (Dimitriadis *et al.* 2021a). According to Table 2, *H* values show the long-term persistence behaviour in precipitation, evapotranspiration, and streamflow time series used in this study. Furthermore, the precipitation has remarkably less long-term persistence behaviour than the evapotranspiration and streamflow data and is even very close to white-noise behaviour, especially in Ceviz, Cura, and Akçay streams. The cross-correlations for the precipitation–runoff and evapotranspiration–runoff are presented in Table 3 to observe the relationship between variables. Accordingly, the correlations for the lag0 and lag1 are high compared to the subsequent lags for precipitation–runoff. On the other hand, the correlations for each lag are close to each other for evapotranspiration–runoff.

Station types . | Station no. . | Station name . | Location . | Data range . | Data used in the warm-up period for the conceptual model . | Modelling period . | Missing data ratio in the modelling period (%) . | |
---|---|---|---|---|---|---|---|---|

Calibration . | Validation . | |||||||

Meteoorological stations | 17622 | Bafra | 35.92 E, 41.55 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – |

18134 | Vezirköprü | 35.45 E, 41.14 N | 01.06.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

18539 | Havza | 35.71 E, 40.99 N | 01.01.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 12.9 | |

18540 | Kavak | 36.06 E, 41.1 N | 01.01.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 12.9 | |

18544 | Terme/Kozluk Beldesi | 37.16 E, 41.14 N | 01.02.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 14 | |

18531 | Kumru | 37.25 E, 40.86 N | 01.02.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 14 | |

18526 | Çaybaşı | 37.1 E, 41.02 N | 01.02.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 14 | |

18131 | Akkuş | 37.02 E, 40.79 N | 01.11.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

17624 | Ünye | 37.29 E, 41.14 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

18529 | İkizce/Şenbolluk | 37.00 E, 41.05 N | 01.03.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 15 | |

Discharge gauging stations | E22A066 | Cevizdere Düzköy | 37.31 E, 41.06 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – |

E22A054 | Akçay Duraklı | 37.15 E, 41.11 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

E22A071 | Cura Çayı Curi | 37.20 E, 41.12 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

E15A049 | Çağşur Çayı Esençay | 35.84 E, 41.33 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – |

Station types . | Station no. . | Station name . | Location . | Data range . | Data used in the warm-up period for the conceptual model . | Modelling period . | Missing data ratio in the modelling period (%) . | |
---|---|---|---|---|---|---|---|---|

Calibration . | Validation . | |||||||

Meteoorological stations | 17622 | Bafra | 35.92 E, 41.55 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – |

18134 | Vezirköprü | 35.45 E, 41.14 N | 01.06.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

18539 | Havza | 35.71 E, 40.99 N | 01.01.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 12.9 | |

18540 | Kavak | 36.06 E, 41.1 N | 01.01.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 12.9 | |

18544 | Terme/Kozluk Beldesi | 37.16 E, 41.14 N | 01.02.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 14 | |

18531 | Kumru | 37.25 E, 40.86 N | 01.02.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 14 | |

18526 | Çaybaşı | 37.1 E, 41.02 N | 01.02.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 14 | |

18131 | Akkuş | 37.02 E, 40.79 N | 01.11.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

17624 | Ünye | 37.29 E, 41.14 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

18529 | İkizce/Şenbolluk | 37.00 E, 41.05 N | 01.03.2014–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | 15 | |

Discharge gauging stations | E22A066 | Cevizdere Düzköy | 37.31 E, 41.06 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – |

E22A054 | Akçay Duraklı | 37.15 E, 41.11 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

E22A071 | Cura Çayı Curi | 37.20 E, 41.12 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – | |

E15A049 | Çağşur Çayı Esençay | 35.84 E, 41.33 N | 01.01.2012–30.09.2020 | 01.01.2012–31.12.2012 | 01.01.2013–04.06.2018 | 05.06.2018–30.09.2020 | – |

Rainfall stations for Thiessen polygon . | Discharge gauging stations . | Streams . | P (mm). | E (mm). | Q (mm/d). | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Min. . | Max. . | Mean . | Std. . | Skew. . | Kurt. . | Hurst exp. . | Min. . | Max. . | Mean . | Std. . | Skew. . | Kurt. . | Hurst exp. . | Min. . | Max. . | Mean . | Std. . | Skew. . | Kurt. . | Hurst exp. . | |||

Çaybaşı (18526), Ünye (17624), Akkuş (18131), Kumru (18531), Terme/Kozluk Beldesi (18544) | Cevizdere Düzköy (E22A066) | Ceviz | 0 | 57.8 | 2.9 | 5.7 | 3.8 | 20.1 | 0.51 | 0.2 | 5.5 | 2.6 | 1.5 | 0.27 | −.1.4 | 0.66 | 0.05 | 35.4 | 1.5 | 1.96 | 4.6 | 41.0 | 0.68 |

Çaybaşı (18526), Akkuş (18131), İkizce/Şenbolluk (18529), Terme/Kozluk Beldesi (18544) | Akçay Duraklı (E22A054) | Akçay | 0 | 133.3 | 4.0 | 9.1 | 5.0 | 41.0 | 0.53 | 0.13 | 5.4 | 2.5 | 1.5 | 0.26 | −1.4 | 0.66 | 0.07 | 50.9 | 2.2 | 3.8 | 4.7 | 35.3 | 0.67 |

Çaybaşı (18526), Akkuş (18131), İkizce/Şenbolluk (18529), Terme/Kozluk Beldesi (18544) | Cura Çayı Curi (E22A071) | Cura | 0 | 100.1 | 3.4 | 7.4 | 4.7 | 34.8 | 0.53 | 0.2 | 5.5 | 2.6 | 1.5 | 0.27 | −.1.4 | 0.66 | 0 | 30.6 | 1.9 | 2.36 | 3.6 | 22.9 | 0.70 |

Bafra (17622), Vezirköprü (18134), Havza (18539), Kavak (18540) | Çağşur Çayı Esençay (E15A049) | Çağşur | 0 | 43.7 | 1.7 | 4.0 | 3.9 | 20.3 | 0.58 | 0.007 | 5.5 | 2.5 | 1.5 | 0.25 | −1.4 | 0.67 | 0.007 | 15.1 | 0.55 | 0.87 | 5.2 | 51.3 | 0.70 |

Rainfall stations for Thiessen polygon . | Discharge gauging stations . | Streams . | P (mm). | E (mm). | Q (mm/d). | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Min. . | Max. . | Mean . | Std. . | Skew. . | Kurt. . | Hurst exp. . | Min. . | Max. . | Mean . | Std. . | Skew. . | Kurt. . | Hurst exp. . | Min. . | Max. . | Mean . | Std. . | Skew. . | Kurt. . | Hurst exp. . | |||

Çaybaşı (18526), Ünye (17624), Akkuş (18131), Kumru (18531), Terme/Kozluk Beldesi (18544) | Cevizdere Düzköy (E22A066) | Ceviz | 0 | 57.8 | 2.9 | 5.7 | 3.8 | 20.1 | 0.51 | 0.2 | 5.5 | 2.6 | 1.5 | 0.27 | −.1.4 | 0.66 | 0.05 | 35.4 | 1.5 | 1.96 | 4.6 | 41.0 | 0.68 |

Çaybaşı (18526), Akkuş (18131), İkizce/Şenbolluk (18529), Terme/Kozluk Beldesi (18544) | Akçay Duraklı (E22A054) | Akçay | 0 | 133.3 | 4.0 | 9.1 | 5.0 | 41.0 | 0.53 | 0.13 | 5.4 | 2.5 | 1.5 | 0.26 | −1.4 | 0.66 | 0.07 | 50.9 | 2.2 | 3.8 | 4.7 | 35.3 | 0.67 |

Çaybaşı (18526), Akkuş (18131), İkizce/Şenbolluk (18529), Terme/Kozluk Beldesi (18544) | Cura Çayı Curi (E22A071) | Cura | 0 | 100.1 | 3.4 | 7.4 | 4.7 | 34.8 | 0.53 | 0.2 | 5.5 | 2.6 | 1.5 | 0.27 | −.1.4 | 0.66 | 0 | 30.6 | 1.9 | 2.36 | 3.6 | 22.9 | 0.70 |

Bafra (17622), Vezirköprü (18134), Havza (18539), Kavak (18540) | Çağşur Çayı Esençay (E15A049) | Çağşur | 0 | 43.7 | 1.7 | 4.0 | 3.9 | 20.3 | 0.58 | 0.007 | 5.5 | 2.5 | 1.5 | 0.25 | −1.4 | 0.67 | 0.007 | 15.1 | 0.55 | 0.87 | 5.2 | 51.3 | 0.70 |

In parenthesis, ‘Station no.’ was written for both rainfall and discharge gauging stations. The calculated kurtosis values in the table are the excess kurtosis values (i.e., kurtosis coefficient minus 3).

Streams . | Correlated conjugates . | Lags . | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|

lag0 . | lag1 . | lag2 . | lag3 . | lag4 . | lag5 . | lag6 . | lag7 . | lag8 . | lag9 . | lag10 . | ||

Ceviz | P–Q | 0.51 | 0.22 | 0.04 | 0.004 | −0.004 | 0.005 | −0.02 | −0.02 | −0.01 | −0.004 | −0.008 |

E–Q | −0.34 | −0.33 | −0.33 | −0.32 | −0.31 | −0.31 | −0.30 | −0.29 | −0.29 | −0.28 | −0.28 | |

Akçay | P–Q | 0.65 | 0.24 | 0.03 | 0.004 | −0.003 | 0.02 | −0.008 | −0.03 | −0.02 | −0.008 | −0.01 |

E–Q | −0.29 | −0.28 | −0.27 | −0.27 | −0.26 | −0.26 | −0.25 | −0.25 | −0.24 | −0.24 | −0.24 | |

Cura | P–Q | 0.58 | 0.26 | 0.06 | 0.02 | 0.004 | 0.017 | 0.009 | 0.004 | −0.004 | −0.014 | −0.008 |

E–Q | −0.34 | −0.34 | −0.33 | −0.32 | −0.31 | −0.31 | −0.30 | −0.29 | −0.29 | −0.28 | −0.28 | |

Çağşur | P–Q | 0.27 | 0.11 | 0.06 | 0.06 | 0.04 | 0.06 | 0.04 | 0.02 | 0.03 | 0.05 | 0.03 |

E–Q | −0.13 | −0.12 | −0.11 | −0.10 | −0.09 | −0.08 | −0.07 | −0.06 | −0.05 | −0.04 | −0.04 |

Streams . | Correlated conjugates . | Lags . | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|

lag0 . | lag1 . | lag2 . | lag3 . | lag4 . | lag5 . | lag6 . | lag7 . | lag8 . | lag9 . | lag10 . | ||

Ceviz | P–Q | 0.51 | 0.22 | 0.04 | 0.004 | −0.004 | 0.005 | −0.02 | −0.02 | −0.01 | −0.004 | −0.008 |

E–Q | −0.34 | −0.33 | −0.33 | −0.32 | −0.31 | −0.31 | −0.30 | −0.29 | −0.29 | −0.28 | −0.28 | |

Akçay | P–Q | 0.65 | 0.24 | 0.03 | 0.004 | −0.003 | 0.02 | −0.008 | −0.03 | −0.02 | −0.008 | −0.01 |

E–Q | −0.29 | −0.28 | −0.27 | −0.27 | −0.26 | −0.26 | −0.25 | −0.25 | −0.24 | −0.24 | −0.24 | |

Cura | P–Q | 0.58 | 0.26 | 0.06 | 0.02 | 0.004 | 0.017 | 0.009 | 0.004 | −0.004 | −0.014 | −0.008 |

E–Q | −0.34 | −0.34 | −0.33 | −0.32 | −0.31 | −0.31 | −0.30 | −0.29 | −0.29 | −0.28 | −0.28 | |

Çağşur | P–Q | 0.27 | 0.11 | 0.06 | 0.06 | 0.04 | 0.06 | 0.04 | 0.02 | 0.03 | 0.05 | 0.03 |

E–Q | −0.13 | −0.12 | −0.11 | −0.10 | −0.09 | −0.08 | −0.07 | −0.06 | −0.05 | −0.04 | −0.04 |

*API*),

*SMI*,

*QR*, and

*QD*will also be used for the rainfall–runoff modelling in hybrid models. The computation of the API can be seen as follows (Kohler & Linsley 1951; Heggen 2001):where

*i*is the number of antecedent days,

*k*is the recession constant, and

*P*

_{j}is the precipitation for day

*j*. Accordingly, it can be assumed that the earlier precipitation could have less effect on the streamflow regime than recent precipitation (Humphrey

*et al.*2016). The value of the recession constant,

*k*, is between 0.80 and 0.98 (Viessman & Lewis 1996; Heggen 2001). In this paper, the

*k-*value was implemented as 0.98 for the calculation of

*API*. The

*i-*value was used as 7 days, which is one of the typical values in the API calculation (Heggen 2001).

### Models

#### GR4 J conceptual model

The GR4 J model is a lumped conceptual model for daily rainfall-runoff modelling, which has four parameters, introduced by Perrin *et al.* (2003). In the GR4 J model, precipitation (*P*) and evapotranspiration (*E*) are used as input data. The GR4 J model has *x*_{1}, *x*_{2}, *x*_{3}, and *x*_{4} parameters, where *x*_{1} (mm) stands for the maximum capacity of the production store, *x*_{2} (mm) for the groundwater exchange coefficient, *x*_{3} (mm) for 1 day ahead maximum capacity of the routing store, and *x*_{4} (days) for the time base of the unit hydrograph of UH1. In the GR4 J model, first, net precipitation and evapotranspiration are obtained, and in the final step, the simulated discharge was calculated via the summation of the *QD* and routing store *QR*. The *SMI*, routing store *QR*, and *QD* were obtained from the GR4 J model and used in hybrid models. The *SMI* values are sequences of the computed water level in production store (*S*) values in the GR4 J model (Figure 2) (Anctil *et al.* 2004). One can refer to the study of Perrin *et al.* (2003) for further details regarding the GR4 J model structure and Anctil *et al.* (2004) for the computation of the *SMI*. The method described by Michel (1991) was used, and root mean square error (RMSE) was utilized as the objective function to calibrate the GR4 J model. Firstly, in the calibration algorithm, a screening process is performed using either a predefined grid or a list of initial parameter sets, and then the steepest descent local search algorithm is carried out (Michel 1991; Coron *et al.* 2017, 2021). In this paper, the calibration and simulation of runoff data for the GR4 J model were fulfilled via the *AirGR* package (Coron *et al.* 2017, 2021) that is part of the R software (R Development Core Team 2015).

#### Wavelet transformation

*ψ*(

*τ*,

*s*), can be stated as the translation and dilation of a mother wavelet function as follows:where

*t*is the time function,

*τ*is the time translation, and

*s*is the scale parameter of the wavelet. The continuous wavelet transform (CWT) coefficients are obtained by the convolution of

*x*(

*t*) with

*ψ*(

*τ*,

*s*).where (

***) denotes the complex conjugates and

*W*(

*τ*,

*s*) stands for the continuous wavelet coefficient. The CWT can require more computation time than the discrete wavelet transformation (DWT) (Adamowski & Sun 2010). The DWT is utilized to calculate the wavelet coefficients only at defined scales. In this regard, the discrete wavelet analysis can be more practical for avoiding large data and more computation time for each scale. The DWT can be calculated as follows:where

*m*stands for the decomposition level and

*n*stands for the translation factor of time. stands for a specified fixed dilation step, where , and stands for the location parameter, where . The widely used parameters in the DWT are and (Daubechies 1990). For a discrete time series , where takes place at the discrete time

*i*, the dyadic wavelet transform becomes as below:where represents the wavelet coefficients for the scale and location for the DWT. An approximation series and detailed series at different scales are obtained using DWT analysis filtering techniques. In this study, the maximum overlap discrete wavelet transformation (MODWT) using the ‘db4’ mother wavelet filter was implemented for the multi-resolution analysis (MRA). The db4 is one of the powerful mother wavelets with regard to obtaining the time-localization characteristics of time series with a short memory, in particular (Nourani

*et al.*2014). Danandeh Mehr

*et al.*(2014) stated that the db4 wavelet filter is a strong tool to capture the non-stationary characteristics of the streamflow process. Catalao

*et al.*(2011) implemented the db4 for wind power forecasting and pointed out that the db4 can provide an appropriate trade-off between the wavelength and smoothness, especially for short-term forecasting. In addition, the db4 was successfully used in many hydrological modelling studies (Nourani

*et al.*2013; Sharghi

*et al.*2019). The MRA depends on the decomposition of the original time series by using the pyramid algorithm (Mallat 1989). The MRA can be useful for observing the signal at different decomposition levels and obtaining information regarding the time series, such as separating the trends from the noise (Heidinger

*et al.*2012).

#### ANN optimized by GA

*et al.*2007). There are many types of ANN such as convolutional neural network, radial basis function neural network (RBFNN), and feed-forward neural network (FNN). The FNN can be practical and efficient for input and output mapping (Hornik

*et al.*1989; Khaki

*et al.*2016). The FNN model is a commonly used ANN type in hydrological modelling studies (Wang

*et al.*2015; Meshram

*et al.*2019; Tran Anh

*et al.*2019). The FNN can be summarized as follows:where

*w*stands for the weight vector,

_{i}*x*for the input vector,

_{i}*b*for the bias, and

*f*for the transfer function. In this study, the weights and bias of the FNN were optimized by using the GA. The GA is a computational technique based on biological evolution (Forrest 1996). Significant elements, such as selection, crossover, and mutation, in the GA process affect generating a new population to find an optimum solution for the problem. The ideal solution for the optimization problem can be assessed according to the fitness function. Until reaching the optimum solution or the maximum generation number, reproducing the new population continues. In this study, mean square error was used as a fitness function to obtain weights and bias for FNN. The utilization of the GA with the data-driven models has become widespread in forecasting hydrometeorological variables (Nasseri

*et al.*2008; Bahrami

*et al.*2016). These studies showed that the utilization of the GA with the ANN could be useful for improving the model performance. In this study, the Roulette Wheel selection was implemented, and operators of GA, such as crossover and mutation rates, were used as 0.9 and 0.1 according to the trial–error methodology and by considering previous studies (Sedki

*et al.*2009; Reshma

*et al.*2015).

#### Boruta algorithm

Deciding the input variables for predicting the output variable is one of the critical points in data-driven models. In this study, the Boruta algorithm was used to determine wavelet components and input variables for rainfall–runoff modelling. The Boruta algorithm is a feature selection algorithm, a wrapper around an RF algorithm (Kursa *et al.* 2010). The implementation of the Boruta algorithm can be sorted as follows for a set of *P* samples of predictor variables (*x _{p}ɛR*

_{n}),

*n*stands for the number of inputs and

*p*=1, 2, 3,…,

*P*, and

*y*stands for the target variable (Kursa

_{p}*et al.*2010; Kursa & Rudnicki 2010; Prasad

*et al.*2019):

- (1)
A randomly ordered shadow input vector, , is generated for the respective input vector,

*x*, to isolate the correlations between the input variable and provide the randomness._{p} - (2)
The RF model is performed to predict the

*y*by using the_{p}*x*and input vectors._{p} - (3)For each run, variable importance is computed. In this regard, permutation importance or Mean Decrease Accuracy (MDA) for every input variable (
*x*) and shadow input (_{p}*x*′_{p}) is computed over all trees. According to the previous studies, the number of trees, mtree, was decided as 500 (Strobl*et al.*2008; Hur*et al.*2017; Prasad*et al.*2019). The computation of the MDA is as follows:where*I*(.) stands for the indicator function,*OOB*(out-of-bag) for the prediction error of each of the training samples utilizing bootstrap aggregation, () for the predicted values before permuting, and () for the predicted values after permuting. - (4)
- (5)
Accordingly,

*Z*-scores of input variables are compared with the corresponding shadow variables. The input variables for*Z*<*MZSA*are assessed as ‘Unimportant’, whereas input variables for*Z*>*MZSA*are evaluated as ‘Important’. - (6)
For each iteration, new shadows are generated, and the Boruta algorithm halts if all variables are confirmed as ‘Important’ or ‘Unimportant’ or the maximum iteration number is reached.

- (7)
The unconfirmed input variables are labelled as ‘Tentative’ if the maximum iteration number is reached. They can be either confirmed or rejected by comparing the respective median

*Z*-scores with the median*Z*-scores of the best shadow input variable.

#### Conceptual–data-driven hybrid model structure

In this study, two different outputs of the GR4 J model were used in the hybrid models. In the first hybrid model (GR4 J–WGANN_{1}), *QR* and *QD* obtained from GR4 J, *P*, *E*, and *API* were used as input variables. In this respect, preceding 2 days (i.e., *P _{t}*

_{−2},

*E*

_{t}_{−2},

*API*

_{t}_{−2},

*QR*

_{t}_{−2}, and

*QD*

_{t}_{−2}), preceding 1 day (i.e.,

*P*

_{t}_{−1},

*E*

_{t}_{−1},

*API*

_{t}_{−1},

*QR*

_{t}_{−1}, and

*QD*

_{t}_{−1}), and data for the day

*t*(i.e.,

*P*,

*E*,

*API*,

*QR*, and

*QD*) were used for runoff prediction in the GR4 J–WGANN

_{1}model. Thus, the effect of flow outputs of the GR4 J model (i.e., routing store

*QR*and

*QD*from the unit hydrograph 2) was investigated for runoff forecasting, representing the novelty of this study. In the second hybrid model (GR4 J–WGANN

_{2}), the

*SMI*obtained from GR4 J,

*P*,

*E*, and

*API*was implemented as input data. In the GR4 J–WGANN

_{2}model, preceding 2 days (i.e.,

*P*

_{t}_{−2},

*E*

_{t}_{−2},

*API*

_{t}_{−2}, and

*SMI*

_{t−2}), preceding 1 day (i.e.,

*P*

_{t}_{−1},

*E*

_{t}_{−1},

*API*

_{t}_{−1}, and

*SMI*

_{t}_{−1}), and data for the day

*t*(i.e.,

*P*,

*E*,

*API*, and

*SMI*) were used for runoff estimation. Thus, initial basin conditions were provided by using the

*API*and the

*SMI*as input variables, as implemented in previous studies (Anctil

*et al.*2004; Brocca

*et al.*2008; Humphrey

*et al.*2016). In both hybrid models, wavelet transform was used to acquire the wavelet components of each input variable in the hybrid model. Then, the Boruta algorithm was implemented to select the wavelet components for rainfall–runoff modelling. Accordingly, the rejected components were discarded from the input variables. Then, runoff forecasting was fulfilled by using optimized ANN by the GA. The flow chart for the GR4 J–WGANN

_{1}and GR4 J–WGANN

_{2}models can be seen in Figures 2 and 3, respectively. With the utilization of the hybrid models, it will be observed whether using different outputs of the conceptual model in hybrid GR4 J–WGANN models will improve the rainfall–runoff modelling performance compared to the GR4 J model.

### Assessment of the model performance

_{1}, and GR4 J–WGANN

_{2}models was assessed according to the Nash–Sutcliffe efficiency (NSE) (Nash & Sutcliffe 1970), RMSE, mean absolute error (MAE), and Kling–Gupta efficiency (KGE) (Gupta

*et al.*2009), as seen in Equations (9)–(12):where

*N*, , and refer to the data length, simulated runoff, and observed runoff for

*i*th time in Equations (9)–(11), respectively. represents the mean of the observed runoff in Equation (9). In addition,

*r*stands for the correlation coefficient,

*α*for the ratio of simulated mean runoff to observed mean runoff, and

*β*for the ratio of the standard deviation of simulated runoff to the standard deviation of observed runoff in Equation (12). In Equation (9) (i.e., NSE), the denominator term refers to the total variation of the observed values about the mean, and the numerator part is the sum of errors that are based on the difference between a predicted value and the corresponding observed value (McCuen

*et al.*2006). The RMSE indicates the square root of the second sample moment of differences between the predicted and observed values (Nabavi-Pelesaraei

*et al.*2021). The MAE depends on each prediction error, which is the difference between predicted and observed values (Sammut & Webb 2011). The KGE depends on the decomposition of the NSE into principal components such as correlation, variability bias, and mean bias (Knoben

*et al.*2019). The low values of the MAE and RMSE indicate better performance of models. If KGE values are closer to 1, it shows better performance of the model. In addition, NSE values below 0.3 can be assessed as unsatisfactory, range between 0.3 and 0.5 as satisfactory, range between 0.5 and 0.7 as good, and values larger than 0.7 as a very good performance for daily scale evaluation (Moriasi

*et al.*2007; Kalin

*et al.*2010; Jimeno-Sáez

*et al.*2018).

## RESULTS AND DISCUSSION

### The performance of the GR4 J model

First, the calibration of the GR4 J model was performed and the calibration parameters were obtained. Then, the runoff simulation was carried out. The simulation results of the GR4 J model can be seen in Table 4 and Figures 4(a)–7(a). According to the NSE, the GR4 J model yielded a good performance, as seen in Table 4. However, the GR4 J model overestimated low flows, whereas it underestimated high flows in Akçay, Ceviz, Cura, and Çağşur streams, as seen in Figures 4(a)–7(a). Furthermore, it was observed that the GR4 J model yielded the best simulation results in the Akçay stream, while it yielded the worst results in the Ceviz stream, according to Table 4. In recent years, the flashiness of precipitation and flash floods, which could be related to climate change, has tended to increase in the area especially for the Ceviz stream (Beden & Ulke Keskin 2021), and this could affect the rainfall–runoff modelling performance of the GR4 J model. Poncelet *et al.* (2017) also stated that the flashiness of precipitation and streamflow could negatively affect the model performance. In this regard, models can underestimate the flow variability and not yield better results for predicting peak flows (Gupta *et al.* 2009; van Esse *et al.* 2013; Poncelet *et al.* 2017). It can negatively affect the runoff estimation performance of the model. According to Figures 8(a)–11(a), the GR4 J model underestimated high flows and overestimated low flows in all streams. In Akçay, Ceviz, and Cura streams, the GR4 J model underestimated flows in the winter and summer seasons, whereas it overestimated flows during the spring season. The Çağşur stream has a lower flow period than other streams, and the GR4 J model overestimated low flows in the summer and autumn seasons while it underestimated flows in winter and spring, as seen in Figure 11(a). Scatter diagrams also indicated the overestimation of low flows and the underestimation of high flows in all streams according to Figures 8(a)–11(a).

Streams . | GR4 J . | GR4 J–WGANN_{1}. | GR4 J–WGANN_{2}. | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|

NSE . | KGE . | MAE (mm/d) . | RMSE (mm/d) . | NSE . | KGE . | MAE (mm/d) . | RMSE (mm/d) . | NSE . | KGE . | MAE (mm/d) . | RMSE (mm/d) . | |

Akçay | 0.61 | 0.58 | 1.35 | 2.79 | 0.71 | 0.67 | 1.1 | 2.39 | 0.70 | 0.68 | 1.12 | 2.45 |

Ceviz | 0.50 | 0.54 | 0.68 | 1.19 | 0.70 | 0.77 | 0.5 | 0.93 | 0.67 | 0.83 | 0.55 | 0.98 |

Cura | 0.55 | 0.60 | 0.93 | 1.67 | 0.69 | 0.73 | 0.75 | 1.38 | 0.68 | 0.72 | 0.71 | 1.4 |

Çağşur | 0.54 | 0.56 | 0.22 | 0.45 | 0.64 | 0.62 | 0.19 | 0.40 | 0.60 | 0.58 | 0.20 | 0.43 |

0.55 | 0.57 | 0.795 | 1.525 | 0.685 | 0.698 | 0.635 | 1.275 | 0.663 | 0.703 | 0.645 | 1.315 |

Streams . | GR4 J . | GR4 J–WGANN_{1}. | GR4 J–WGANN_{2}. | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|

NSE . | KGE . | MAE (mm/d) . | RMSE (mm/d) . | NSE . | KGE . | MAE (mm/d) . | RMSE (mm/d) . | NSE . | KGE . | MAE (mm/d) . | RMSE (mm/d) . | |

Akçay | 0.61 | 0.58 | 1.35 | 2.79 | 0.71 | 0.67 | 1.1 | 2.39 | 0.70 | 0.68 | 1.12 | 2.45 |

Ceviz | 0.50 | 0.54 | 0.68 | 1.19 | 0.70 | 0.77 | 0.5 | 0.93 | 0.67 | 0.83 | 0.55 | 0.98 |

Cura | 0.55 | 0.60 | 0.93 | 1.67 | 0.69 | 0.73 | 0.75 | 1.38 | 0.68 | 0.72 | 0.71 | 1.4 |

Çağşur | 0.54 | 0.56 | 0.22 | 0.45 | 0.64 | 0.62 | 0.19 | 0.40 | 0.60 | 0.58 | 0.20 | 0.43 |

0.55 | 0.57 | 0.795 | 1.525 | 0.685 | 0.698 | 0.635 | 1.275 | 0.663 | 0.703 | 0.645 | 1.315 |

Bold numbers demonstrate average values of the NSE, KGE, MAE, and RMSE.

### The performance of the hybrid GR4 J–WGANN_{1} model

In GR4 J–WGANN_{1}, first *QR* and *QD* were obtained from the GR4 J model, and the *API* was calculated. Then, *P*, *E*, *API*, *QR*, and *QD* input data were decomposed into 10 detailed components and one approximation component by using the DWT. To determine the wavelet decomposition level, the study by Sang (2012) takes times-series length into consideration. Accordingly, Sang (2012) stated that one minus value of the maximum wavelet decomposition level according to the data length (i.e., the integer part of log_{2}(2,830)−1) could be appropriate for the wavelet decomposition level selection for de-noising series. Then, the wavelet components of each variable used in runoff forecasting were determined using the Boruta algorithm. According to Figures 12(a) and 13(a), *QR* and *QD* have more important wavelet components than other input variables for the Akçay and Çağşur streams. The unimportant components (generally *D*_{1} and *D*_{2} components) were extracted from the *P*, *E*, and *API* data*.* Similar findings from the Boruta analysis were also obtained for Ceviz and Cura streams. The extraction of high-frequency components, which generally represent the noisy part of data (especially *D*_{1}), can improve the model performance, as emphasized in previous studies (Santos *et al.* 2014; Tayyab *et al.* 2019). The GR4 J–WGANN_{1} model yielded better performance than the GR4 J model with regard to runoff forecasting, as seen in Figures 4(b)–7(b). According to Table 4, the NSE values change between 0.64 and 0.71, and the increase in runoff forecasting performance is 15, 40, 25, and 14% for the GR4 J–WGANN_{1} model compared to the GR4 J model in Akçay, Ceviz, Cura, and Mert streams, respectively. Other evaluation criteria showed more superior performance of the GR4 J–WGANN_{1} model than the GR4 J model for each stream. The GR4 J–WGANN_{1} model outperformed the GR4 J–WGANN_{2} model, as seen in Table 4. Using more input data obtained from the GR4 J model, which has a high importance level (i.e., *QR* and *QD*) compared to just single-input data from the GR4 J model (i.e., *SMI*), could lead to better runoff prediction performance. The GR4 J–WGANN_{1} model also improved low and high runoff forecasting performance, as seen in Figures 4(b)–7(b). In addition, the simulated monthly runoff by the GR4 J–WGANN_{1} model is more compatible with the observed monthly runoff than the GR4 J model, especially in the Ceviz stream (Figures 8(b)–11(b)). The runoff simulation for April–December was successfully predicted, particularly in Akçay and Ceviz streams. Furthermore, the GR4 J–WGANN_{1} model yielded better low flow simulation in the Çağşur stream in the summer season. The scattering diagrams also showed that the simulated runoff was more consistent with the observed runoff than the GR4 J model, especially regarding low- and high-flow simulation. Accordingly, using the *QR* and *QD* in the GANN model as input data and wavelet transform enhanced the rainfall–runoff modelling performance, particularly regarding low and high runoff simulation.

### The performance of the hybrid GR4 J–WGANN_{2} model

In GR4 J–WGANN_{2}, *SMI*, *API*, *P*, and *E* were decomposed by using the wavelet analysis initially. Accordingly, 10 detailed components and 1 approximation component for each component were obtained. Then, the Boruta algorithm was implemented to determine the wavelet components used for rainfall–runoff modelling. The Boruta analysis indicated that the importance level of the *SMI* is high compared to wavelet components of other input variables in the Akçay and Çağşur streams, as seen in Figures 12(b) and 13(b). Similarly, the importance level of the *SMI* components was also higher than the wavelet components of other input variables in the Ceviz and Cura streams. The rejected components (mostly *D*_{1} and *D*_{2}) were extracted from the input variables and important components were used for runoff forecasting. With regard to the importance of soil moisture, Tramblay *et al.* (2010) pointed out that soil moisture can be useful for providing the initial conditions for rainfall–runoff modelling, particularly in small basins. Tayfur & Brocca (2015) stated that the utilization of soil moisture and the rainfall data in a fuzzy model enhanced the rainfall–runoff modelling performance, particularly in capturing the peak flow values. The runoff forecasting performance of the GR4 J–WGANN_{2} model can be observed in Table 4. According to Table 4, NSE values change between 0.60 and 0.70, and the performance improvement is 15, 34, 24, and 7% for the GR4 J–WGANN_{2} model compared to the GR4 J model in Akçay, Ceviz, Cura, and Çağşur streams, respectively. Other evaluation criteria also showed that the GR4 J–WGANN_{2} model outperformed the GR4 J model. The GR4 J–WGANN_{2} model underperformed compared to the GR4 J–WGANN_{1} model. As seen in Figures 4(c)–7(c), the GR4 J–WGANN_{2} model yielded better results with regard to capturing low and peak flows than the GR4 J model. Humphrey *et al.* (2016) also found that the hybrid GR4 J–ANN model produced more precise forecasting results than the individual GR4 J and ANN models, particularly for forecasts based on climatology. They also stated that the hybrid model yielded more reliable forecasting results than the ANN and GR4 J models regarding the high-flow simulation. Furthermore, the inclusion of the *SMI* and the *API* could lead to better performance of the hybrid model than the GR4 J model since the *SMI* and the *API* can give more information regarding the initial catchment conditions (Humphrey *et al.* 2016) and for the GANN model. Anctil *et al.* (2004) also used the *API* and the *SMI* in the ANN model as input data. They found out that slow-response data could be helpful to develop the streamflow prediction efficiency of the ANN model in addition to the fast-response time series such as recently observed streamflow and rainfall sequences. Furthermore, Sharghi *et al.* (2019) put forward that wavelet transform could be useful to handle the non-stationary behaviour of data and found out that wavelet analysis developed the estimation of peak flows. Badrzadeh *et al.* (2018) stated that the decomposition of historical time series at different frequency levels via wavelet transform could predict extreme values more accurately. According to Figures 8(c)–11(c), the GR4 J–WGANN_{2} model yielded more accurate monthly runoff predictions than the GR4 J model. As seen in scattering diagrams, the simulation of low and high flows is also more reliable than the simulated low and high runoff by the GR4 J model.

The hybrid models performed better than the GR4 J model; however, they underestimated high flows. In recent years, flash floods and extreme precipitation events have increased in the study area (especially in coastal parts). The excessiveness of high-flow events and irregular hydrologic regime in the study area (i.e., coastal parts of the Black Sea in Turkey, in particular) can prevent better high-flow forecasting performance of the models. Pulido-Calvo & Portela (2007) implemented the computational neural networks (CNNs) for daily streamflow forecasting in Portuguese watersheds and stated that a highly irregular hydrologic regime could prevent better high-flow prediction performance. In addition, they also found that a comparatively small number of high flows in the training set could make the learning process of CNN models difficult with regard to high-flow prediction. In this regard, using longer data with a comparatively larger number of extreme events in the calibration and validation period can improve the hybrid model performance. Collier (2007) found that the hydrological simulation of peak flows can be very uncertain, and it is important to understand spreading uncertainty through the flood forecast chain. In this regard, this uncertainty could lead to limitations on predictability under a changing climate. Beven & Binley (2014) discussed the implementation of the generalized likelihood uncertainty estimation (GLUE) methodology for uncertainty estimation and stated that the uncertainties related to the input and evaluation data as well as model parameters and structures could have significant effects on modelling performance. Tian *et al.* (2014) investigated the model parameter and structure uncertainty for the GR4 J, HBV, and XAJ models using the GLUE methodology. They revealed that parameter uncertainty became larger when discharge values increased, and this could be related to the rarity of extreme values in the observed data compared to normal discharge values. In addition, Koutsoyiannis (2019) investigated the irreversibility of stochastic characterization and the simulation of the atmospheric and hydrological processes. In this study, it was revealed that atmospheric processes, such as temperature and rainfall, did not show explicit irreversibility, while the irreversibility was apparent for scales of several days for streamflow. In this regard, Koutsoyiannis (2019) stated that this situation created the need for reproducing discharge time series in flood simulations. Vavoulogiannis *et al.* (2021) also emphasized that temporal irreversibility can be an explicit feature of the streamflow process that evinces itself over several days. Concerning the long-range dependence, Serinaldi & Kilsby (2016) stated that complexity streamflow dynamics need modelling approaches that take temporal asymmetry and nonlinearity into consideration. They also found that nonlinearity could conceal linear short- and long-range dependence. Szolgayova *et al.* (2014b) analysed the long-term variability of the flow regime in the Danube River and the relationship between flow and precipitation and air temperature using wavelet analysis approaches. They revealed that long-range dependence in precipitation spreads into discharge and precipitation data could be used with wavelet decomposition to attain a multivariate stochastic discharge model, especially at a monthly scale. Similarly, Szolgayova *et al.* (2014a) indicated the usefulness of wavelet transform for modelling and forecasting daily discharge time series exhibiting long-range dependence. Accordingly, wavelet transformation can be evaluated as a useful method for the performance improvement in hybrid models when using precipitation, evapotranspiration, and streamflow data, which exhibited almost the white-noise and long-term persistent behaviours, respectively, in this study. As can be seen from previous studies, many factors can affect the rainfall–runoff modelling performance and these factors will be investigated in future studies.

## CONCLUSIONS

In this study, the performance of the GR4 J conceptual model and GR4 J–WGANN_{1} and GR4 J–WGANN_{2} hybrid models were investigated for daily rainfall–runoff modelling. The study mainly aimed to implement a new hybrid wavelet-based GR4 J–GANN_{1} (i.e., GR4 J–WGANN_{1}) model that uses the *QR* and *QD*, outputs of the GR4 J model as input data in the WGANN model. Another aim of this study is to observe the improvement in the rainfall–runoff modelling performance using the hybrid GR4 J–WGANN_{1} and GR4 J–WGANN_{2} hybrid models compared to the GR4 J model. The western part of the Eastern Black Sea basin and one sub-basin in the Kızılırmak basin were selected as study areas. The deductions of this study can be summarized as follows:

The GR4 J model yielded good performance in all streams (NSE≥0.50). However, the GR4 J model overestimated low flows and underestimated high flows in Ceviz, Cura, Akçay, and Çağşur streams. Flash precipitation, flood events, and unstable flow regime could prevent the better modelling performance of the GR4 J model in the Eastern Black Sea basin regarding the estimation of extreme flows. In the Çağşur stream, the drier period is relatively long compared to other streams, which could negatively affect the GR4 J performance.

In the GR4 J–WGANN

_{1}model, QR and QD obtained from the GR4 J model, the API, precipitation, and evapotranspiration; in the GR4 J–WGANN_{2}model, the SMI obtained from the GR4 J model and the API, precipitation, and evapotranspiration were used as input data. After the input data were decomposed using wavelet transformation, the Boruta algorithm was implemented to determine the wavelet components used in rainfall–runoff modelling. The Boruta algorithm indicated that the importance level of QR and QD's wavelet components in the GR4 J–WGANN_{1}model and SMI's wavelet components in the GR4 J–WGANN_{2}model were remarkably high compared to wavelet components of other input variables. In this regard, using wavelet transform and different outputs of the GR4 J model as input data in data-driven models improved the rainfall–runoff modelling performance compared to the GR4 J model.The GR4 J–WGANN

_{1}and GR4 J–WGANN_{2}hybrid models developed low- and high-flow simulation performance compared to the GR4 J model. It shows that wavelet-based conceptual–data-driven hybrid models can remarkably improve the rainfall–runoff modelling performance in basins with challenging hydrological conditions.The proposed GR4 J–WGANN

_{1}model yielded better results than the GR4 J–WGANN_{2}model. The Boruta algorithm showed that*QR*and*QD*generally have more important wavelet components than the*SMI*. In other words, using more significant input data obtained from the GR4 J model (i.e.,*QR*and*QD*) compared to using only the*SMI*could improve the performance and the novel GR4 J–WGANN_{1}model approach can be an alternative to the GR4 J–WGANN_{2}model.Using the Boruta algorithm in hybrid models for rainfall–runoff modelling, which is another novel aspect of this study, could also be advantageous and remarkable with regard to observing the significance of input data. Thus, it can be inferred that using different hybrid model approaches that combine the advantages of the conceptual and data-driven models can provide better results for rainfall–runoff modelling.

Although the hybrid models performed better than the GR4 J model, they could not perform very well, especially with regard to forecasting high flows. The irregular hydrologic regime, uncertainty, and smaller number of extreme events in the calibration and validation period could prevent better high-flow prediction performance of the hybrid models. In this regard, the aim is that the uncertainty of model parameters and the prediction performance will be investigated in future studies. In addition, the implementation of the larger datasets at different scales (such as hourly, daily, and monthly) in hybrid models for rainfall–runoff modelling is planned. In further studies, the performance of different hybrid models, which also distinguish the contribution of the wavelet transform and the hybrid model structure, will be investigated to enhance the rainfall–runoff modelling performance of both the conceptual and data-driven models.

## ACKNOWLEDGEMENTS

The authors are grateful to the Turkish State Meteorological Service and the General Directorate of State Hydraulic Works for providing the data. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICTS OF INTEREST STATEMENT

The authors declare there is no conflict.

## REFERENCES

*ACM Computing Surveys*(

*CSUR*)

*Hydrologie appliquée aux petits bassins ruraux, Hydrology Handbook*(

*in French*)

*Computer Models of Watershed Hydrology, Chap.*7