ABSTRACT
Among various monthly water balance models, one of the models that has the simplest structure and offers a well-behaved conceptual platform is the GR2M. Despite the widespread use of the model with two-free parameters, the fact that it tends to produce relatively large errors in peak flow months necessitates some modifications to the model. The reason for the mentioned simulation deficiencies could be that the relationship between the routing reservoir and the external environment of the basin is controlled by a single parameter, making the storage–discharge relationship linear. Therefore, in this study, least squares support vector regression, one of the nonlinear data-driven models, has replaced the routing part of the GR2M to enhance the monthly runoff simulation. The performance of the three-parameter hybrid model (GR3M), which was developed by considering the parameter parsimony point of view and including a machine learning (ML)-based nonlinear routing scheme, was examined in some locations in the Gediz River Basin in western Turkey. Statistical performance measures have shown that GR3M, which both leverages the capabilities of an ML model and blends conceptual outputs within a nested scheme, clearly outperforms the original GR2M. The proposed modification has brought significant improvements, especially to high-flow simulations.
HIGHLIGHTS
The routing component of the GR2M model has been replaced with least squares support vector regression (LSSVR).
The model GR3M, which is entirely hybrid, has improved the monthly runoff simulation.
Incorporating LSSVR into GR2M has made high-flow simulation more robust.
GR3M also preserves the conceptual structure of GR2M with a nested hybridization.
INTRODUCTION
Monthly water balance models (MWBMs) are still of interest thanks to their limited demand for input data and the practical usability of their outputs in long-term hydrological drought assessments and reservoir operation studies. Despite some MWBMs capturing basin dynamics with more intensive parameter sets, most of them tend to oversimplify the hydrological process by conceptualizing watershed with several interconnected storages and generally have a number of parameters ranging from two to five (Bai et al. 2015). Notwithstanding a great deal of expertise pertaining to many models, there remains a need to identify and rectify their shortcomings (Xu & Singh 1998). For example, the majority of these conceptual-type models can suffer from linearity assumptions in generating runoff (Vidyarthi & Jain 2022). Moreover, the residuals of such models can be subject to seasonality and have limited capabilities, particularly in high-flow simulations (Fathi et al. 2023). Although there have been attempts to modify the evapotranspiration process to enable water balance models to produce more accurate runoff simulations (e.g., Bai et al. 2015), recent studies have shown that the main focus should be on nonlinearity in the mechanisms of the runoff components (Cheng et al. 2020; Fathi et al. 2023). However, there are a scant number of papers on the development of the runoff partitioning structure in MWBMs. For example, Cheng et al. (2020) claimed that the linear storage–discharge relationship on short timescales has a solid physical basis due to the negligibility of recharging from the current rainfall event. When analyzing MWBMs, however, recharge to groundwater storage from the current month's precipitation and therefore the baseflow generation process comes into consideration (Hrachowitz et al. 2014). To this end, Cheng et al. (2020) demonstrated that by arranging the baseflow components of several MWBMs in nonlinear exponential form, the storage–discharge dynamics could be better described, enhancing the monthly runoff simulations. In another study, Fathi et al. (2023) have integrated a seasonal component regulated by an exponent parameter into the simple GR2M model by allowing its routing coefficient to change from month to month as opposed to maintaining a constant value throughout the year. Their modified model has been verified to outperform the GR2M over 1,469 catchments in the USA.
Additionally, in recent years, studies on the integration of conceptual models with machine learning (ML) techniques have begun to demonstrate promise in improving simulation accuracy while preserving hydrological interpretability (Kapoor et al. 2023). Some of these models are of the coupled type and rely on employing multiple conceptual outputs derived from conceptual models as auxiliary predictors to strengthen ML-based training (e.g., Chen & Adams 2006; Humphrey et al. 2016; Wang et al. 2021; Sezen & Partal 2022; Vidyarthi & Jain 2022). According to McIntyre (2013), performance enhancements are often the result of theoretical nonlinearity in the routing process, which merits greater consideration in conceptual modeling to contribute to process understanding. In the study presented by Vidyarthi & Jain (2022), Australian water balance model (AWBM) outputs were coupled with artificial neural networks (ANNs), intrinsically adopting the notion ascribed by McIntyre (2013). They employed AWBM to obtain runoff hydrographs at individual sub-catchment levels in upstream reaches of the Kentucky River Basin, USA, and subsequently trained ANNs in place of linear Muskingum to perform runoff routing at the basin outlet.
On the contrary, in a rather limited number of studies, the internal structures of conceptual models have been replaced with ML approaches (e.g., Kumanlioglu & Fistikoglu 2019; Okkan et al. 2021; Bhasme et al. 2022; Durgut & Ayvaz 2023; Kapoor et al. 2023). Kapoor et al. (2023), for instance, substituted modern deep learning models for the processes in the routing storage of the GR4J model. Given that the parameters of the considered part of the conceptual model and those of the ML model are calibrated jointly in this type of hybridization, accounting for potential parameter interactions results in a more robust simulation. Despite studies motivating the incorporation of conceptual and ML models into hybrid modeling frameworks in daily runoff simulation (e.g., Kumanlioglu & Fistikoglu 2019; Durgut & Ayvaz 2023), only a few studies have embedded ML tools into existing MWBMs to upgrade their runoff-generating mechanisms and enhance monthly simulations (Okkan et al. 2021; Bhasme et al. 2022). However, while adopting this synergistic manner, the parameter density of each model used might complicate the calibration process of MWBMs. From this standpoint, it would be more consistent not to deviate from the perspective of parameter parsimony and to search for hybridization facilities that can be easily adapted to operational studies. Therefore, our study's objectives are:
i. to replace the routing part of a simple MWBM, which considers the underlying physical process under the assumption of linearity, with a robust ML model;
ii. to propose a hybrid framework that preserves soil moisture storage within the physics-guided conceptual model structure and to question to what extent the complex relationships between intermediate outputs provided by the conceptual layer of the hybrid model and target runoff can be extracted with ML; and
iii. to evaluate an alternative parametric model that accounts for the nonlinear storage–discharge relationship against the efficacy of ML while managing the routing process in an MWBM, and to evaluate the performances of all models across various conditions.
For the objectives stated above, we examined potential modifications to the GR2M model, which is without a doubt one of the most parsimonious MWBMs. Due to its success in simulating monthly runoff and its simple structure, this two-parameter model is regarded as a benchmark for many basin-based studies conducted on different continents (Fathi et al. 2023). We replaced the linearly behaving routing storage of GR2M with least squares support vector regression (LSSVR), which can avoid several drawbacks of ANNs, and proposed a novel hybrid model with a total of three free parameters, maintaining the soil moisture (production) storage of the original model. The hybrid model (hereafter denoted as GR3M) is in line with the parsimonious structure of the modified GR2M suggested by Fathi et al. (2023) and deals with capturing the nonlinear rainfall–runoff relationship in a different way. The virtues of GR3M have been tested at the flow gauging stations across the Gediz River Basin (GRB), one of the most important agricultural basins in Turkey. To the best of our knowledge, there has been no prior investigation that has modified the internal structure of GR2M through employing any ML technique, with the exception of a scant number of papers (e.g., Okkan et al. 2021; Bhasme et al. 2022), which have integrated ML models into MWBMs with more intricate parameterization. Those studies have substituted a chosen ML model for malfunctioning or insufficiently functioning storages in the conceptual model, as implemented in this study. Nevertheless, the topologies within these models do not explicitly utilize nonlinear routing; rather, they treat much more conceptual outputs as refined inputs. Moreover, controlling the conceptual layer of the novel hybrid model with a singular parameter, while controlling the nonlinear routing layer utilizing LSSVR with two parameters could facilitate the calibration process for parsimonious model users. The novel aspects of the developed hybrid modeling framework are its ability to keep the original model's simplicity and successfully embed LSSVR into the conceptual structure as a nonlinear routing tool, thus having the potential to overcome the reported shortcomings of the GR2M model. This article is organized into five sections: the subsequent two sections detail the methodology adopted and the studied catchment and data, respectively. The results are then presented in Section 4, including discussions. Concluding remarks are given in the final section.
METHODOLOGY
Hybrid water balance model (GR3M) as an alternative to the GR2M
The hybrid GR3M model is sequentially composed of a production store and an LSSVR-based nonlinear routing component, as depicted in Figure 1. While applying GR3M, the original production storage of the GR2M model, driven by P and Ep inputs and initial soil moisture storage, is preserved. Afterward, the excess precipitation and the released water from the production store are transmitted to the nonlinear routing part. Therein, the LSSVR, which has two parameters to be tuned, substitutes for the routing reservoir guided by the parameter RC and converts the production store outputs (P1 and P2) to the simulated monthly runoff at the catchment outlet. LSSVR has been favored among a pool of data-driven models when incorporating ML into an existing water balance model because its simplified algorithm requires less training effort and offers a computational benefit over standard support vector regression. Moreover, it incorporates structural risk minimization as opposed to empirical risk minimization, which is utilized in many ML models such as ANNs, and exhibits strength in its efforts to enhance rainfall–runoff relationships (Katipoglu & Sarigol 2023; Xu et al. 2023).
In Figure 1, the equations for how to perform nonlinear routing with LSSVR are also specified. It should be noted that the production store outputs and observed runoff data are scaled to lie within the range [0, 1]. The radial basis kernel function (RBF) is employed to map nonlinearly the normalized outputs conveyed by the production reservoir into a higher-dimensional feature space (for the kernel matrix obtained for the calibration period, see w10 in Figure 1). It is anticipated that RBF can effectively address the nonlinear nature of the routing process, where optimal estimation of the kernel width parameter (σ) would play a crucial role. When the regularization parameter (Cr) is also taken into account, the Lagrangian multipliers (α) and the bias term (b) are estimated (see w11 in Figure 1). After the kernel matrix for the validation period (Kval), which may not be in square form, is constituted, the simulated outputs generated throughout the periods of calibration and validation are compiled (see w12 in Figure 1). Finally, all these values are converted back into runoff estimations that use the same unit as the original target.
Another nonlinear routing-themed model
In this alternative model, referred to as GR3M_PA, the constants m and b need to be calibrated together with SM so as to simulate total runoff. The nonlinear variability between water recharged from the production store and routed runoff is controlled by these new parameters.
Automatic calibration of the models
While the mutation process could gain stochastic search ability depending on two individuals randomly selected from the population (r1, r2 ∈{1, 2,…,np}), the extent of mutation is regulated by the SF1 and SF2 scaling factors.
It should also be noted that the steps of the mDEA, apart from the mutation operator, were executed as suggested by Storn & Price (1997).
Model evaluation criteria
Furthermore, FITEVAL software developed by Ritter & Muñoz-Carpena (2013) was implemented to evaluate the goodness-of-fit of the calibrated models since it can yield the NSE-based relative frequency distribution (RFD) produced by bootstrapping and the corresponding statistical significance. In order to facilitate comparisons and quantify discrepancies between the observed and simulated runoff, a Taylor diagram (Taylor 2001) is also utilized. This offers a brief statistical overview of the consistency between the simulated data and the observations, as measured by the root mean square difference (RMSD) and correlation coefficients. Section 4 elaborates on the application and discussion of the hybrid GR3M model.
STUDY AREA AND DATA
RESULTS AND DISCUSSION
In all models used, the production store was initially assumed to be at 10% capacity, while the initial value for the routing reservoir of the GR2M is fixed at 10 mm. To avoid any bias associated with the initial storage values, Fathi et al. (2023) have suggested implementing a 12-month warm-up period. The data set is therefore separated into three subsets: (i) the warm-up period from October 1979 to September 1980; (ii) the calibration period from October 1980 to September 1995; and (iii) the validation period from October 1995 to September 2010. As alluded to earlier, the control parameters of LSSVR, which supersedes the simple routing storage of GR2M, and the production store parameter SM were embedded into the GR3M scheme. Then, the metaheuristics optimizer, mDEA, assisted in the joint calibration of these parameters. Given the unknown physical limits of the LSSVR parameters responsible for controlling the nonlinear routing component of GR3M, it should be noted that the optimization process of these parameters has been initiated by generating random values within a specified range (i.e., [0.1, 100]). Subsequently, these values have not been constrained within a particular bound throughout the iterations. Additionally, we would like to highlight that recent research indicates that k-fold cross validation could potentially mitigate overfitting concerns and reduce uncertainties associated with input data set partitioning (Vu et al. 2022). The training set is subdivided into k-folds, and the validation procedure to be iterated k times, according to this concept. The mean value of fitness values throughout the k loops is the performance metric to be improved (Bazrkar & Chu 2022). In the study, we observed that by assigning the fold number k, population size of mDEA and maximum number of iterations to 5, 30, and 50, respectively, there could be sufficient convergence in the objective function without being affected by overfitting issue for GR3M. The other mDEA control settings were chosen in pursuit of several experiments. While both scaling factors were fixed at 0.5, CR was taken as 0.7. Under these settings, mDEA was executed 30 times during parameter optimization for each model. An iterative simulation–optimization procedure was implemented to calibrate randomly assigned parameter solutions in each independent run. Thereafter, the final parameter estimations were stored for any model, and the best solution from the solution pool could be extracted.
Stations . | GR2M . | GR3M_PA . | GR3M . | |||||
---|---|---|---|---|---|---|---|---|
SM . | RC . | SM . | b . | m . | SM . | σ . | Cr . | |
E05A023 | 258.98 | 0.78 | 425.33 | 11.08 | 10.67 | 498.05 | 11.43 | 140.80 |
E05A022 | 216.08 | 0.73 | 406.62 | 14.52 | 5.97 | 409.29 | 1.14 | 148.55 |
E05A014 | 231.99 | 0.59 | 599.75 | 63.82 | 2.68 | 415.86 | 0.26 | 2,399.61 |
D05A038 | 645.61 | 0.66 | 1,140.15 | 21.69 | 16.15 | 861.42 | 0.50 | 1,694.24 |
D05A031 | 85.91 | 0.62 | 394.35 | 114.19 | 7.71 | 452.90 | 2.60 | 45.42 |
Stations . | GR2M . | GR3M_PA . | GR3M . | |||||
---|---|---|---|---|---|---|---|---|
SM . | RC . | SM . | b . | m . | SM . | σ . | Cr . | |
E05A023 | 258.98 | 0.78 | 425.33 | 11.08 | 10.67 | 498.05 | 11.43 | 140.80 |
E05A022 | 216.08 | 0.73 | 406.62 | 14.52 | 5.97 | 409.29 | 1.14 | 148.55 |
E05A014 | 231.99 | 0.59 | 599.75 | 63.82 | 2.68 | 415.86 | 0.26 | 2,399.61 |
D05A038 | 645.61 | 0.66 | 1,140.15 | 21.69 | 16.15 | 861.42 | 0.50 | 1,694.24 |
D05A031 | 85.91 | 0.62 | 394.35 | 114.19 | 7.71 | 452.90 | 2.60 | 45.42 |
The SM parameter is in mm. The possible ranges of SM and RC were assigned as 10–1,000 mm and 0.01–1.0, respectively.
.
The RMSE values are in mm. The optimal results with regard to the performance indices assessed are marked in bold fonts.
Upon evaluating all data points, one could argue that the original GR2M is already sufficient and that the usage of GR3M is not essential, even if it improves overall performance. Nevertheless, it has been detected that GR2M's functionalities in simulating high flows are restricted, as Fathi et al. (2023) stressed. This determination was made based on the indices calculated between the points above the third quartile of the observed series and those in the corresponding simulations (Table 2). It is clear from Table 2 that the performance of the GR2M under high-flow conditions during the validation period was unsatisfactory at all stations, excluding E05A022. It is noteworthy that the GR3M tends to carry high-flow simulations into acceptable or good grades for the same stations. It is also intriguing, however, that GR3M_PA has gone beyond what it provided for all flows within the scope of high-flow simulations. While it ensures a PI greater than 10% at each station, except for E05A022 and D05A038, the high-flow simulations produced have remained unsatisfactory. In turn, it is worth mentioning that the PI values delivered by hybrid GR3M at stations E05A023, E05A014, and D05A038 surpass those of GR3M_PA by over 20% (Figure 3).
Furthermore, Taylor diagrams were employed to assess the competence of runoff simulations at each station during the validation period, considering all flows (Supplementary Figure S2) and high flows (Supplementary Figure S3). It is discernible from these diagrams which MWBMs are in closest proximity to the observations. Accordingly, GR3M yielded RMSD values that were comparatively small, providing further evidence that the transition of the routing process to an ML-driven nonlinear form is consistent.
In addition to the advantages associated with hybrid models using ML, there are some obstacles that need to be addressed and research priorities. As simulations from these models are blended with ML approaches, they may inevitably inherit some of the limitations of these techniques (Slater et al. 2023). Frequently addressed limitations of ML models include the need for large data sets to train, sensitivity to collinearity, and overtraining (Ghaith et al. 2020). In our study, nearly 30 years of data were used, and the training and testing of GR3M seem not to be affected by the length of the data. Yet, it is advisable to validate the suggested model in other catchments employing data of varying durations. On the contrary, since the LSSVR embedded into the GR3M is routing with P1 and P2 from the production storage, the collinearity in this dimension is marginal. However, we recommend focusing on collinearity in cases where multiple variables computed from the conceptual layer of the hybrid model are serving as input for the data-driven phase (e.g., Kumanlioglu & Fistikoglu 2019).
We also believe that there are opportunities to increase the effectiveness of and disseminate nested hybrid models like GR3M. One avenue worth trying would be to incorporate physics-informed ML designs into the nested framework that tends to obey the law of conservation of mass. For instance, some recent articles focusing on physics-guided deep learning models (i.e., long short-term memory (LSTM) networks and convolutional networks) for runoff simulation attract attention (e.g., Xie et al. 2021; Bhasme et al. 2022; Frame et al. 2023; Zhong et al. 2023). Among these, Xie et al. (2021) revealed that the physics-guided LSTM network model outperformed the traditional LSTM in experiments conducted in more than 500 basins. Conversely, Frame et al. (2023) demonstrated that physics-guided deep learning models can exhibit weaker performance when confronted with biases detected in observations, as opposed to unconstrained deep learning models that account for residuals. Although mass conservation was taken as the basis for the production storage of GR3M proposed in this study, findings in the recent literature made us wonder how a nested model, which is hybridized LSTM with mass conservation constraints, could yield runoff simulations. We have the intention of conducting an investigation of this context in the future.
Furthermore, how to incorporate techniques for isolating noises in time series into the nested hybridization procedure will be the subject of another future study. For example, the wavelet transformation has been implemented in recent years to enhance the performance of runoff simulations by decomposing the periodic structure in conceptual model outputs (e.g., Wang et al. 2021; Sezen & Partal 2022) or hydroclimate records (e.g., Remesan et al. 2009; Bajirao et al. 2021). Thus, our further research might focus on evaluating the viability of this methodology as an intermediate preprocessing within a nested hybrid model.
In addition to all these, as reported by Slater et al. (2023), there may be concerns about how these hybrid models could complement or supersede traditional models in operational contexts. To persuade operational forecasters that hybrid models can bring value to the hydrology literature when compared with conventional ones, it is recommended to adopt the nested hybridization style as demonstrated in this paper. The success of blending the strengths of a conceptual model and an ML algorithm in one combinatorial scheme and ensuring that their parameters interact with one another has been revealed in a rare number of papers (e.g., Okkan et al. 2021). This study has shown that interfering with the linearly behaved components in even simpler MWBMs with the mentioned hybridization strategy can considerably enhance the monthly simulation accuracy and hopes to provide an insight for operational forecasters.
CONCLUSION
This study was conceived with the following two subsequent motivations in mind: (i) there are rather limited papers focusing on the modification of the runoff partitioning structure in MWBMs using ML techniques for better simulation performance. (ii) Given the significance of high-flow simulation, particularly in regions with intensive irrigation efforts (e.g., GRB), the fact that a commonly employed model (GR2M) generates biased estimates during periods of high precipitation could be a major worry. Despite a limited number of studies addressing these two concerns (e.g., Cheng et al. 2020; Fathi et al. 2023), ML techniques have not yet been widely utilized in the internal structural modification of MWBMs. Although it was previously discovered by Okkan et al. (2021) that there are ML-oriented improvements in more parameter-intensive MWBMs, this study reveals for the first time how the nested hybridization style yields outcomes in simpler structured and benchmark models such as GR2M.
In the study carried out considering all these motivations, a hybrid model termed GR3M was constructed by substituting only the routing store of GR2M with LSSVR, and it was studied to what extent this novel approach provided accurate monthly simulation. The performance measures suggest that the hybrid model manipulating nonlinear routing is more capable of capturing the rainfall–runoff relationship as opposed to the standard GR2M and GR3M_PA. The hybrid model, which is also not laborious to calibrate due to its simplicity, can be recognized as being rather robust in simulating both all flows and high flows. Nevertheless, diagnostics of the performance of the model over numerous catchments are undoubtedly essential for generalization. Moreover, the models in this hybridization style cannot claim to deal with basin dynamics in all their aspects. But the notion that ML techniques such as LSSVR, which contain few parameters to be tuned and have the flexibility to blend the conceptual outputs, can be utilized in MWBMs that require intervention in the runoff-generating mechanism has been criticized. The developed hybrid model has the potential to be adapted to operational studies as it preserves the conceptual structure of the original model with a nested hybridization.
AUTHORS' CONTRIBUTION
U.O. wrote the main manuscript text and provided the conceptualization of the study. U.O. and Z.B.E. prepared the MATLAB codes, all figures, and tables. U.O. and O.F. contributed to the development of the methodology proposed. U.O. and Z.B.E. revised the manuscript. All authors reviewed the manuscript.
FUNDING
No funding support was received for this study.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.