Accurate streamflow prediction is essential for optimal water management and disaster preparedness. While data-driven methods’ performance often surpasses process-based models, concerns regarding their ‘black-box’ nature persist. Hybrid models, integrating domain knowledge and process modeling into a data-driven framework, offer enhanced streamflow prediction capabilities. This study investigated watershed memory and process modeling-based hybridizing approaches across diverse hydrological regimes – Korean and Ethiopian watersheds. Following watershed memory analysis, the Soil and Water Assessment Tool (SWAT) was calibrated using the recession constant and other relevant parameters. Three hybrid models, incorporating watershed memory and residual error, were developed and evaluated against standalone long short-term memory (LSTM) models. Hybrids outperformed the standalone LSTM across all watersheds. The memory-based approach exhibited superior and consistent performance across training, evaluation periods, and regions, achieving 17–66% Nash–Sutcliffe efficiency coefficient improvement. The residual error-based technique showed varying performance across regions. While hybrids improved extreme event predictions, particularly peak flows, all models struggled at low flow. Korean watersheds’ significant prediction improvements highlight the hybrid models’ effectiveness in regions with pronounced temporal hydrological variability. This study underscores the importance of selecting a specific hybrid approach based on the desired objectives rather than solely relying on statistical metrics that often reflect average performance.

  • Three hybrid machine learning models were developed, considering watershed memory and hydrological model residual error.

  • We evaluated hybrid models for enhanced streamflow prediction across diverse hydro-meteorological regions.

  • Watershed memory-based hybrid models offered superior and consistent performance across regions.

Reliable streamflow prediction is essential for effective water resource allocation and disaster management (Erdal & Karakurt 2013; Chu et al. 2021; Reis et al. 2021). Hydrological models, whether data-driven, conceptual, or physically based, play a pivotal role in improving our understanding of watershed physical processes and how these processes interact (Bourdin et al. 2012). Although criticized for parametric complexity and challenging implementation, process-based watershed models remain indispensable when a comprehensive understanding of hydrological processes and the regional water balance is required (Fatichi et al. 2016). Their complexity, however, brings drawbacks such as heightened uncertainty, calibration and validation challenges, and modeling costs (Moges et al. 2020; Herrera et al. 2022).

Data-driven modeling has emerged as a viable alternative to traditional hydrological techniques (Fahimi et al. 2017; Lee et al. 2023a). These approaches leverage the power of data to uncover patterns from time-series input–output data, thereby reducing the reliance on extensive physical data inputs (Karandish & Šimůnek 2016). In essence, data-driven models are ‘black-box’ methodologies that establish relationships between inputs and outputs without explicitly modeling the underlying physical hydrological processes (Solomatine & Ostfeld 2008; Bourdin et al. 2012; Chu et al. 2021). Despite this limitation, data-driven models have consistently shown superior performance compared to traditional approaches (Nearing et al. 2021; Reis et al. 2021; Yu et al. 2023). This performance advantage has fueled a surge in the adoption of data-driven modeling for hydrological analysis (Yaseen et al. 2015; Zounemat-Kermani et al. 2020; Mohammadi et al. 2024).

However, in addition to their inherent black-box nature, data-driven models have faced criticism for their vulnerability to overfitting or underfitting, which is contingent upon the model's complexity, as well as the volume and quality of the data (Ghaith et al. 2020; Mudiyanselage Viraj et al. 2021). Furthermore, both data-driven and process-based modeling approaches exhibit suboptimal performance when predicting extreme events (Zheng et al. 2018; Tan et al. 2020; Zounemat-Kermani et al. 2021). Bridging the gap, expert knowledge-based approaches have surged in popularity, offering both enhanced performance and interpretability for accurate prediction (Mudiyanselage Viraj et al. 2021; Willard et al. 2023). In hydrology, these methods involve integrating watershed memory indicators such as baseflow analysis with other input data (Zemzami & Benaabidate 2016; Tongal & Booij 2018; Li et al. 2022), simulating intermediate variables (Humphrey et al. 2016; Noori & Kalin 2016), and accounting for residual errors (Tian et al. 2018; Kassem et al. 2020; Kim et al. 2021; Cho & Kim 2022).

Data-driven modeling that incorporates watershed memory, domain knowledge, or process-based model outputs is known by various terms, including physics-informed machine learning, hybrid machine learning, and theory-based machine learning. It offers several benefits, including improved physical consistency, interpretability, and prediction accuracy (Humphrey et al. 2016; Noori & Kalin 2016). However, the choice between simulating watershed processes and leveraging watershed memory for performance enhancement remains unclear, with limited research exploring their comparative effectiveness and the reasoning behind the selection. Residual error modeling, in particular, aims solely at improving model performance and offers little insight into the sources of uncertainty. Simulating watershed processes also adds complexity to streamflow prediction, whereas watershed memory-related techniques such as baseflow analysis are comparatively straightforward. This disparity in complexity highlights the need for clear reasoning behind technique selection, further comparative research, and examination of alternatives within the framework.

This study addresses this gap by investigating the relative effectiveness of simulating hydrological processes and employing watershed memory techniques for improving data-driven streamflow prediction. We explored the efficacy of the baseflow index in enhancing prediction accuracy, a novel approach not previously investigated in this context. Additionally, we used watershed memory for the calibration and validation of process-based models, highlighting its potential for wider application. Most importantly, we asked whether hybrid models can improve the prediction of extreme streamflow events, where traditional methods falter. The findings will provide insights for researchers and practitioners in hydrological modeling, informing the selection of suitable approaches for data-driven streamflow prediction and contributing to the advancement of hybrid modeling approaches.

This paper is structured as follows. The first section provides an overview of streamflow improvement techniques utilizing watershed memory and residual error modeling. Next, the study regions and dataset are described in detail. Subsequently, the methods section outlines the models and approaches employed in the study. The results and discussion section presents the key model outputs, followed by a concluding section that summarizes the main findings and their implications.

Beyond the present: watershed memory for improved model predictions

Watershed memory quantifies the retention and release of water within a watershed over time. It reflects the duration for which past climate, hydrogeological features, and watershed characteristics influence the current hydrological response. Conceptualizing streamflow into quick flow, interflow, and baseflow components helps understand this dynamic (Tallaksen 1995; Duncan 2019).

Baseflow, distinguished by its persistence and sustained nature within streamflow, primarily originates from groundwater and delayed sources (Hall 1968; Lim et al. 2005; Duncan 2019; McMahon & Nathan 2021). Understanding baseflow dynamics is crucial for maintaining the delicate balance between healthy ecosystems, reliable water supplies, and clean water, with applications ranging from assessing drought risks to calibrating hydrological models (Tallaksen 1995; Brutsaert 2008; Eckhardt 2008). Recognizing its significance, hydrologists and hydrogeologists have dedicated over a century to studying baseflow, gaining valuable insights into aquifer properties, and improving water resource management (Lim et al. 2005; Thiesen et al. 2019; McMahon & Nathan 2021).

While plenty of baseflow separation techniques are available, digital filters have gained significant traction due to their ease of application and computational efficiency compared to tracer-based techniques (complex, time-consuming) and graphical methods (potentially irregular patterns) (Lim et al. 2010; Collischonn & Fan 2013; Cheng et al. 2022). The Eckhardt digital filter (Eckhardt 2005), in particular, has emerged as a leading choice, exhibiting superior performance in various regional studies (Xie et al. 2020; Cheng et al. 2022; Eckhardt 2023). Recently, the Eckhardt filter-based baseflow analysis has proven highly effective in enhancing streamflow prediction in data-driven modeling frameworks (Taormina et al. 2015). The equation for the Eckhardt Filter is as follows:
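$$b_t = \frac{(1 - \mathrm{BFI}_{\max})\,\alpha\, b_{t-1} + (1 - \alpha)\,\mathrm{BFI}_{\max}\, Q_t}{1 - \alpha\,\mathrm{BFI}_{\max}}$$
(1)
where $b_t$ and $b_{t-1}$ are baseflow at time step t and the previous time step; $Q_t$ is the streamflow at time step t; BFImax is the maximum baseflow index (the long-term ratio of baseflow to streamflow); and $\alpha$ is a filtering coefficient, or recession constant.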
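The recursion in Equation (1) can be illustrated with a minimal Python sketch. The function names, the initial condition (baseflow started at BFImax times the first streamflow value), and the synthetic flow series are assumptions made for illustration; the cap that keeps baseflow at or below streamflow is the usual constraint applied with this filter.

```python
def eckhardt_filter(streamflow, alpha, bfi_max, b0=None):
    """Apply the two-parameter Eckhardt recursive digital filter to a flow series."""
    if not streamflow:
        return []
    # Assumption: without a supplied initial value, start at BFImax * Q[0].
    prev = bfi_max * streamflow[0] if b0 is None else b0
    prev = min(prev, streamflow[0])
    baseflow = [prev]
    denom = 1.0 - alpha * bfi_max
    for q in streamflow[1:]:
        b = ((1.0 - bfi_max) * alpha * prev + (1.0 - alpha) * bfi_max * q) / denom
        b = min(b, q)  # baseflow cannot exceed total streamflow
        baseflow.append(b)
        prev = b
    return baseflow


def baseflow_index(streamflow, baseflow):
    """Ratio of total baseflow to total streamflow over the record."""
    return sum(baseflow) / sum(streamflow)


# Hypothetical daily flows (m3/s); alpha and BFImax are the Gapcheon values reported in this study.
flows = [10.0, 50.0, 30.0, 20.0, 15.0, 12.0, 10.0]
base = eckhardt_filter(flows, alpha=0.978, bfi_max=0.426)
bfi = baseflow_index(flows, base)
```

Under steady flow the filter returns a constant baseflow equal to BFImax times the streamflow, which is a convenient sanity check on any implementation.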
Recession analysis (Langbein 1938) is used to estimate the filter parameter $\alpha$ (Eckhardt 2008). The daily mean streamflow values during recession periods ($Q_t$) are plotted against subsequent day values ($Q_{t+1}$). The resulting line's gradient, passing through the origin, represents the recession constant ($\alpha$) according to the equation $Q_{t+1} = \alpha Q_t$. BFImax, capturing the region's hydrogeologic fingerprint, represents the long-term baseflow-to-streamflow ratio, typically ranging from 0.25 to 0.8 (Eckhardt 2005, 2008). As field investigations of hydrogeology are often challenging, a backward filter method utilizes the recession constant ($\alpha$) to estimate BFImax (Collischonn & Fan 2013). This iterative approach, based on the following equation, calculates BFImax by dividing the maximum potential baseflow by the total streamflow.
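$$b_{t-1} = \min\left(\frac{b_t}{\alpha},\; Q_{t-1}\right), \qquad \mathrm{BFI}_{\max} = \frac{\sum_{t} b_t}{\sum_{t} Q_t}$$
(2)
where $b_t$ is the backward-filtered (maximum potential) baseflow and $Q_t$ is the streamflow at time step t.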
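The two estimation steps described above can be sketched as follows. This is an illustrative reading rather than the authors' procedure: the recession constant is taken as the least-squares slope through the origin over all day pairs with declining flow, and the backward filter starts from the last observation and divides by the recession constant at each step back in time, capped by the observed flow. The function names and test series are hypothetical.

```python
def recession_constant(streamflow):
    """Slope through the origin of Q[t+1] against Q[t] over declining-flow day pairs."""
    num = 0.0
    den = 0.0
    for q_now, q_next in zip(streamflow, streamflow[1:]):
        if 0.0 < q_next < q_now:  # assumption: any day-to-day decline counts as recession
            num += q_now * q_next
            den += q_now * q_now
    if den == 0.0:
        raise ValueError("no recession periods found in the series")
    return num / den


def backward_bfi_max(streamflow, alpha):
    """Estimate BFImax with a backward filter: b[t-1] = min(b[t] / alpha, Q[t-1])."""
    n = len(streamflow)
    b = [0.0] * n
    b[-1] = streamflow[-1]  # assumption: the final flow value is all baseflow
    for t in range(n - 1, 0, -1):
        b[t - 1] = min(b[t] / alpha, streamflow[t - 1])
    return sum(b) / sum(streamflow)


# A pure exponential recession recovers its own decay rate and a BFImax of one.
pure = [100.0 * 0.95 ** t for t in range(30)]
alpha_hat = recession_constant(pure)
bfi_pure = backward_bfi_max(pure, alpha_hat)

# A series with a storm peak yields a BFImax below one.
stormy = [10.0, 9.5, 40.0, 20.0, 9.0, 8.55]
bfi_stormy = backward_bfi_max(stormy, 0.95)
```

In practice the recession periods would be screened more carefully (for example, by minimum duration), so the simple decline test here should be read as a placeholder.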

BFI reflects how watershed features, geology, and land use impact water storage (Bloomfield et al. 2009; Van Loon & Laaha 2015; Sutanto & Van Lanen 2022). Watershed memory is typically represented by BFI and baseflow recession constant (Sutanto & Van Lanen 2022; Gu et al. 2023). Studies also show a positive link between this memory and accurate streamflow prediction (Harrigan et al. 2018; Girons Lopez et al. 2021).

Several data-driven studies leverage baseflow separation to enhance streamflow predictions. Corzo & Solomatine (2007) demonstrated that even traditional methods, such as the constant slope approach, can enhance simulation accuracy, with further improvement achieved using optimized baseflow filtering equations. Similarly, Zemzami & Benaabidate (2016) found that recursive digital filters outperformed simpler alternatives when paired with artificial neural networks (ANN). Baseflow separation is used in two main ways: (1) baseflow as a predictor, in which baseflow is integrated with other inputs (Chen et al. 2021; Tongal & Booij 2018, 2022); and (2) separate models for baseflow and excess flow, in which an individual model is built for each component and the results are combined for prediction (Corzo & Solomatine 2007; Isik et al. 2013; Taormina et al. 2015). While both approaches demonstrably improve forecasting, comparative studies and clear selection guidance are lacking, presenting an interesting research avenue.

Leveraging residual error modeling for enhanced model performance

Uncertainties in hydrological models – arising from observations, structure, and parameters – limit their ability to replicate hydrographs (Wu et al. 2014; Li et al. 2017; Moges et al. 2020). This has led to growing interest in studying residual errors for accurate predictions and successful inference (Smith et al. 2015). By systematically identifying and rectifying biases through data-driven approaches, residual error modeling has been used to improve the predictive accuracy of hydrological models (Konapala et al. 2020; Li et al. 2017, 2021; Sikorska-Senoner & Quilty 2021). It stands as one of the earliest and most commonly embraced methods for addressing the constraints inherent in physics-based models (Willard et al. 2023). The fundamental idea hinges on uncovering systematic biases in a physical model compared to real-world observations, leveraging this knowledge to fine-tune its predictions for enhanced accuracy (Mekonnen et al. 2015; Smith et al. 2015; Willard et al. 2023) – Equation (3).
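$$\varepsilon_t = Q_{\mathrm{obs},t} - Q_{\mathrm{sim},t}(X_t, \theta)$$
(3)
where $\varepsilon_t$ is the process-based model error, t represents time, $Q_{\mathrm{obs},t}$ is observed discharge, $Q_{\mathrm{sim},t}$ is simulated discharge, $X_t$ denotes forcing data, and $\theta$ refers to a set of unknown model parameters.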
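As a minimal illustration of the residual idea, the sketch below computes the residual series (observed minus simulated discharge) and adds a predicted residual back onto the simulation. The function names and numbers are hypothetical, and the "predicted" residuals simply stand in for the output of whatever data-driven model is trained on the errors.

```python
def residuals(observed, simulated):
    """Process-based model error: observed minus simulated discharge, per time step."""
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    return [o - s for o, s in zip(observed, simulated)]


def apply_correction(simulated, predicted_residuals):
    """Corrected prediction: simulated discharge plus the predicted residual."""
    if len(simulated) != len(predicted_residuals):
        raise ValueError("series must have the same length")
    return [s + e for s, e in zip(simulated, predicted_residuals)]


# Hypothetical discharge values (m3/s).
obs = [12.0, 30.0, 18.0, 9.0]
sim = [10.0, 22.0, 20.0, 8.5]
err = residuals(obs, sim)

# With a perfect residual predictor the corrected series reproduces the observations.
corrected = apply_correction(sim, err)
```

The quality of the corrected series therefore depends entirely on how well the residuals can be predicted from the available inputs.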

Hydrological model residuals have been used to enhance streamflow prediction in two ways: (1) training data-driven models solely on process-based model errors, and (2) using residual errors as additional input variables (Tian et al. 2018; Kassem et al. 2020; Sikorska-Senoner & Quilty 2021; Cho & Kim 2022). An important consideration in this kind of hybridization is the effect of calibrating the process-based model before coupling it with a machine learning framework. Shen et al. (2022) addressed this issue by evaluating the predictive performance of both calibrated and uncalibrated versions of the PCRaster Global Water Balance (PCR-GLOBWB) model when coupled with random forest (RF). They found significant accuracy gains for both configurations, indicating the substantial potential of residual error correction within this hybridization approach.

Watersheds from two worlds: South Korea and Ethiopia

This study investigated two geographically and hydrometeorologically distinct regions: the Gapcheon and Chogang watersheds in South Korea and the Melka Kunture watershed in Ethiopia (Figure 1). The South Korean watersheds exhibit diverse topography and hydrological characteristics, covering approximately 600 km² (Gapcheon) and 625 km² (Chogang). Elevation in the region ranges from 20 to 1,140 m above sea level, and the average annual precipitation recorded from 2000 to 2022 was 1,375 mm. The Melka Kunture watershed covers 4,456 km² in the upper Awash River region of Ethiopia. It is an agricultural area of rugged slopes and valleys, with elevations ranging from 1,262 to 3,599 m.
Figure 1

Study watersheds: South Korea (Gapcheon and Chogang), Ethiopia (Melka Kunture).

Although their hydro-meteorological regimes differ, the two regions share the timing of their primary rainy seasons (Figure 2(a) and 2(b)). Marked disparities emerge, however, in precipitation volume and the characteristics of extreme events. Despite its considerably larger area, the Melka Kunture watershed experiences significantly lower peak discharges than the two smaller Korean watersheds, while its low flows are relatively higher (Figure 2(c)).
Figure 2

A comparative analysis of hydro-meteorological regimes in the Melka Kunture, Gapcheon, and Chogang watersheds.


Overview of model input dataset and data sources

In this study, diverse datasets comprising weather data, streamflow, land use/land cover, and the digital elevation model (DEM) were acquired from various data-providing agencies (Table 1). The study areas exhibit distinct variations in land use, soil composition, and topographic features (Figure 3). Specifically, the watersheds located in South Korea are predominantly characterized by forest cover. In contrast, the Ethiopian watershed is primarily agricultural. Among the Korean watersheds, the Gapcheon watershed stands out for its mixed urban and agricultural land use practices.
Table 1

Details of key model inputs and data sources for South Korean and Ethiopian (*) watersheds

| Data description | Details of the data | Data sources (online and office access) |
| --- | --- | --- |
| Weather | Daily from 2000 to 2020 | Korea Meteorological Administration, KMA (KMA 2022) |
| Streamflow | Daily from 2000 to 2020 | Han River Flood Control Office, HRFCO (HRFCO 2023) |
| DEM | 30 m | National Geographic Information Institute, NGII (NGII 2022) |
| Land use/land cover | 30 m | Environmental Geographic Information Service, EGIS (EGIS 2022) |
| Soil classes | – | Rural Development Administration, RDA (RDA 2022) |
| Weather* | Daily from 1990 to 2013 | National Meteorological Service Agency of Ethiopia |
| Streamflow* | Daily from 1990 to 2015 | Ministry of Water Resources, Irrigation, and Electricity, Ethiopia |
| DEM* | 30 m | USGS ‘earthexplorer’ website |
| Land use/land cover* | 30 m | GLOBELAND 30 (Jun et al. 2014) |
| Soil classes* | – | Harmonized World Soil Database (FAO) |
Figure 3

Simplified land use/land cover from SWAT model input: (a) Gapcheon and Chogang watershed and (b) Melka Kunture watershed.


The Gapcheon and Chogang watersheds benefit from readily available, up-to-date streamflow and weather observations. Conversely, the Melka Kunture watershed presents a data limitation, with complete observations accessible only until the year 2013. Similar data availability constraints are evident in other studies focusing on this region, with model simulations and observed data restricted to this period (e.g., Shawul & Chakma 2020; Birhanu et al. 2021; Mitiku et al. 2023).

The hydrologic soil group is a crucial property influencing watershed hydrological processes. These data were processed from SWAT model inputs (Figure 4). In the South Korean watersheds, the soils are categorized into groups A, B, and C, with group A being the predominant classification. Conversely, the Ethiopian watershed exhibits soils belonging to groups A, B, and D, with group D emerging as the dominant soil type in the region.
Figure 4

Spatial distribution of hydrologic soil group in the study regions: (a) Gapcheon and Chogang watersheds and (b) Melka Kunture watershed.


Soil and Water Assessment Tool

The Soil and Water Assessment Tool (SWAT) (Arnold et al. 1998) is a semi-distributed continuous-time model developed by the U.S. Department of Agriculture, Agricultural Research Service (USDA, ARS). SWAT uses a hydrologic response unit (HRU) as its fundamental computation unit. HRUs aggregate areas within a subbasin that share similar soil, land use/land cover, and slope characteristics based on user-defined thresholds. The core water balance equation can be distilled into four primary components: ET, surface runoff, soil water, and groundwater (Equation (4)).
$$SW_t = SW_0 + \sum_{i=1}^{t} \left(Prec_i - Q_{s,i} - ET_i - Perc_i - Q_{g,i}\right)$$
(4)
where $SW_0$ and $SW_t$ are initial and final soil water content; t is the time (days); Prec, ET, Perc, $Q_g$, and $Q_s$ represent daily rainfall, evapotranspiration, percolation, groundwater discharge, and surface runoff, respectively, with all units in mm.
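The water balance in Equation (4) amounts to a running sum of daily inputs and losses. The sketch below shows that bookkeeping only; it is not SWAT code, and the function name and the two days of values are hypothetical.

```python
def soil_water(sw0, prec, et, perc, qg, qs):
    """Daily soil water content (mm) from the cumulative water balance."""
    sw = sw0
    series = []
    for p, e, w, g, s in zip(prec, et, perc, qg, qs):
        # rainfall in; surface runoff, evapotranspiration, percolation, groundwater flow out
        sw += p - s - e - w - g
        series.append(sw)
    return series


# Two hypothetical days, all values in mm.
sw = soil_water(
    sw0=100.0,
    prec=[10.0, 0.0],
    et=[2.0, 3.0],
    perc=[1.0, 1.0],
    qg=[0.5, 0.5],
    qs=[3.0, 0.0],
)
```

In this example the first day gains 3.5 mm of storage and the second day loses 4.5 mm, since there is no rainfall to offset the losses.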

The SWAT model was set up for the Gapcheon and Chogang watersheds with 26 and 27 subbasins and 1,295 and 1,199 HRUs, respectively. HRUs in both watersheds were generated using a 7/7/7% land use/soil/slope threshold. Forest covers approximately 73% of the total area in both watersheds. Slopes of less than 25% account for over 56% of the Gapcheon watershed and 44% of the Chogang watershed.

The Melka Kunture watershed was divided into around 37 subbasins, and HRUs were generated using a 3/3/3% land use/soil/slope threshold. Around 86% of the Melka Kunture watershed is used for rainfed agriculture.

The SWAT model was calibrated and validated for the selected parameters using observed streamflow data from the outlets of each watershed (Table 2). Sequential uncertainty fitting version 2 (SUFI-2) was applied within the SWAT-calibration and uncertainty program (SWAT-CUP) user interface (Abbaspour et al. 2007; Abbaspour 2011) for calibration and validation. For each watershed, the calibration and validation periods were as follows:

  • Gapcheon watershed: calibration: 2002–2012; validation: 2013–2020

  • Chogang watershed: calibration: 2006–2014; validation: 2015–2020

  • Melka Kunture: calibration: 2000–2007; validation: 2008–2013

Table 2

Calibrated SWAT model parameters: Parameters marked with (*) were calibrated only for the Gapcheon and Chogang watersheds, while those marked with (**) were calibrated only for the Melka Kunture watershed

| Parameter | Description of parameters | Initial range |
| --- | --- | --- |
| r__CN2.mgt | Soil Conservation Service (SCS) runoff curve number | −0.2 to 0.5 |
| v__GWQMN.gw | Threshold depth of water in the shallow aquifer required for return flow to occur (mm) | 0–5,000 |
| v__RCHRG_DP.gw | Deep aquifer percolation fraction | 0–1 |
| *v__CH_K2.sub | Effective hydraulic conductivity in main channel alluvium | 0.01–150 |
| *r__OV_N.hru | Manning's ‘n’ value for overland flow | −0.25 to 0.25 |
| v__CANMX.hru | Maximum canopy storage | 0–100 |
| r__SOL_K().sol | Saturated hydraulic conductivity | −0.25 to 0.25 |
| v__SURLAG.hru | Surface runoff lag time | 0.05–24 |
| v__ALPHA_BF.gw | Baseflow recession factor | 0.02–0.07 |
| *v__ESCO.hru | Soil evaporation demand coefficient | 0–1 |
| **v__ALPHA_BNK.rte | Baseflow alpha factor for bank storage | 0–1 |
| **r__SOL_Z.sol | Depth from the soil surface to the bottom of layer | −0.2 to 0.5 |
| r__SOL_AWC.sol | Soil available moisture capacity (mm H2O/mm soil) | −0.5 to 0.5 |

Note: v__ means that the default parameter is replaced by a given value, and r__ means that the existing parameter value is multiplied by (1 + a given value).
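The two change types in the note can be made concrete with a short sketch. This illustrates the arithmetic only and is not SWAT-CUP code; the function name, the parsing of the prefix, and the default values in the example are hypothetical.

```python
def apply_change(parameter, default, value):
    """Return the parameter value after a 'v__' (replace) or 'r__' (relative) change."""
    prefix = parameter.lstrip("*").split("__", 1)[0].lower()
    if prefix == "v":
        return value  # default is replaced by the given value
    if prefix == "r":
        return default * (1.0 + value)  # default is scaled by (1 + given value)
    raise ValueError(f"unknown change type in {parameter!r}")


# Hypothetical defaults: a baseflow recession factor of 0.048 and a curve number of 70.
alpha_bf = apply_change("v__ALPHA_BF.gw", default=0.048, value=0.03)
cn2 = apply_change("r__CN2.mgt", default=70.0, value=-0.2)
```

A relative change preserves the spatial pattern of a parameter across HRUs, which is why it is commonly preferred for quantities such as the curve number that vary with soil and land use.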

The calibration range for the baseflow recession factor (ALPHA_BF.gw) was determined based on baseflow recession analysis results and set to around ±30% of those values. For other parameters, the initial ranges were determined systematically using published values (Shawul & Chakma 2020; Lee et al. 2023b), expert knowledge, and sensitivity simulations exploring parameter space and parameter types.

Watershed memory: baseflow filtering using two-parameter digital filter

Recession analysis and BFImax estimation were performed to separate baseflow from streamflow. The recession constant for the Gapcheon watershed was 0.978, with an average BFImax of 0.426. For the Chogang watershed, the recession constant was 0.977, and the BFImax was 0.344. In the Melka Kunture watershed, the recession constant was 0.959, and the BFImax was 0.475.
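One way to relate these daily recession constants to the SWAT baseflow recession factor is sketched below. It assumes an exponential groundwater recession with a daily time step, so that the recession constant equals exp(−ALPHA_BF); the text does not state which conversion was used, so this should be read as an assumption. The ±30% band follows the calibration range described for ALPHA_BF, and the function names are hypothetical.

```python
import math


def alpha_bf_from_recession(k):
    """ALPHA_BF (1/day) implied by a daily recession constant k, assuming k = exp(-ALPHA_BF)."""
    if not 0.0 < k < 1.0:
        raise ValueError("recession constant must lie between 0 and 1")
    return -math.log(k)


def calibration_range(center, spread=0.30):
    """Lower and upper bounds at +/- spread around a central value."""
    return center * (1.0 - spread), center * (1.0 + spread)


# Recession constants reported for the three watersheds.
constants = {"Gapcheon": 0.978, "Chogang": 0.977, "Melka Kunture": 0.959}
ranges = {
    name: calibration_range(alpha_bf_from_recession(k))
    for name, k in constants.items()
}
```

Under this assumption the implied values are roughly 0.022 to 0.042 per day, which is broadly in line with the 0.02–0.07 initial range listed for ALPHA_BF in Table 2.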

Distinct baseflow characteristics emerge across the three watersheds examined in this study (Figure 5). While Melka Kunture shows a moderate balance between baseflow and runoff, Gapcheon and Chogang have a relatively high proportion of runoff. This variation can likely be attributed to several factors, including climate, geology and soil characteristics, and land cover and vegetation. Notably, the comparison between the two study regions reveals significantly higher streamflow in the South Korean watersheds during rainy seasons, which can be primarily attributed to the distinct climate conditions, with intense rainfall events in South Korea leading to substantial surface runoff.
Figure 5

Observed streamflow and filtered baseflow in the study regions: (a) Gapcheon watershed, (b) Chogang watershed, and (c) Melka Kunture watershed.


Long short-term memory

Long short-term memory (LSTM) (Hochreiter & Schmidhuber 1997) is a type of recurrent neural network (RNN) designed to excel at capturing long-term dependencies in sequential data. Unlike traditional RNNs, which suffer from the vanishing gradient problem, LSTMs employ a memory cell, gates, and a complex gating mechanism to manage the flow of information. These gates allow LSTMs to selectively retain or discard information as needed, enabling them to handle long-term dependencies effectively.

LSTM has gained significant traction in hydrology for its ability to predict short- and long-term hydrological variables (Shen 2018; Kim et al. 2021; Xie et al. 2022). Detailed explanations of LSTM's inner workings are provided by numerous researchers, including Kim et al. (2021) and Kratzert et al. (2018).

This study used an LSTM layer with 50 units and the Adam optimizer with a learning rate of 0.001. The model was trained for 60–700 epochs with a batch size of 32–128. A dropout rate of 0.1 was employed for regularization. The architecture included multiple dense layers with varying numbers of units (256, 512, 512, 64) and rectified linear unit (ReLU) activation functions. These hyperparameters, including the LSTM units, learning rate, batch size, number of epochs, dropout rate, and dense layer architecture, are pivotal in influencing the model's ability to capture temporal patterns in the input data and can be fine-tuned to optimize prediction performance.
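The configuration described above could be assembled along the following lines. This is a sketch under several assumptions: the text does not name a software library, so TensorFlow/Keras is used here purely for illustration; the lookback window, number of input features, placement of the dropout layer, single-unit output layer, and mean squared error loss are not given in the text and are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LSTMConfig:
    lstm_units: int = 50
    dense_units: tuple = (256, 512, 512, 64)
    dropout: float = 0.1
    learning_rate: float = 0.001
    batch_size: int = 32   # the study reports values between 32 and 128
    epochs: int = 60       # the study reports values between 60 and 700
    lookback: int = 30     # assumption: input window length in days
    n_features: int = 4    # assumption: number of input variables


def build_model(cfg):
    """Build and compile the network; requires TensorFlow to be installed."""
    # Imported lazily so the configuration can be inspected without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(layers.Input(shape=(cfg.lookback, cfg.n_features)))
    model.add(layers.LSTM(cfg.lstm_units))
    model.add(layers.Dropout(cfg.dropout))  # assumption: dropout applied after the LSTM layer
    for units in cfg.dense_units:
        model.add(layers.Dense(units, activation="relu"))
    model.add(layers.Dense(1))  # assumption: one-step-ahead streamflow output
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=cfg.learning_rate),
        loss="mse",  # assumption: loss function is not stated in the text
    )
    return model


config = LSTMConfig()
```

With TensorFlow available, `build_model(config)` returns a compiled model that can be trained with `model.fit(x, y, epochs=config.epochs, batch_size=config.batch_size)`, where `x` has shape (samples, lookback, n_features).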

A hybrid machine learning approach for improved streamflow prediction

Section 2 provides an overview of two widely adopted hybridization techniques for streamflow prediction: residual error-based and watershed memory-based methods. While the former demonstrably increases prediction accuracy, it demands substantial computational resources by executing all process-based modeling steps to extract residual errors for subsequent machine learning analysis. This translates to increased cost and time investment in streamflow prediction. Conversely, watershed memory-based approaches offer relative simplicity but lack clear justifications for their selection over residual error methods.

Furthermore, while both process-based models and data-driven methods struggle with extreme events (Zhang et al. 2018; Tan et al. 2020; Yifru et al. 2024), it remains unclear which specific hydrological extremes benefit from hybridized approaches.

This study therefore addresses three key questions regarding these two widely used hybridization techniques: (1) Which method performs better overall? (2) Which approach better aids the prediction of hydrological extremes, particularly low flow or peak flow? (3) Do alternative watershed memory terms enhance prediction beyond commonly used baseflow data? To answer these questions, we combined baseflow analysis, watershed modeling with SWAT, and the development of standalone and hybrid machine learning models (Figure 6).
Figure 6

Integrated modeling framework used in this study: The process-based model residual error was computed after calibration. A recursive digital filter was used for hydrograph separation and to estimate the recession constant for SWAT model calibration.


The methodology involves three key steps. First, baseflow recession analysis was used to inform calibration and validation of the SWAT model, supporting accurate simulation of baseflow contributions to streamflow. Next, baseflow, baseflow index, and SWAT model residuals were used as additional inputs to train LSTM models, with the aim of improving streamflow prediction accuracy. Finally, four modeling approaches were compared within this framework. The first, the standalone LSTM, used weather and observed streamflow data for prediction. The second, Bflow-LSTM, added baseflow data to the weather and streamflow inputs. The third, BFI-LSTM, used baseflow index, weather, and streamflow data. The fourth, SWAT-LSTM, used an LSTM to predict and update residual errors from weather data and SWAT model residual errors. For all watersheds and models, data were divided into 80% for training and 20% for testing. Furthermore, 20% of the training data were allocated for cross-validation to guide model tuning.
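The four input configurations and the data split can be summarized in a short sketch. The input labels are hypothetical names for the variables described above, and the split shown is chronological with the validation block taken from the end of the training period; the text gives only the 80/20 proportions, so the ordering is an assumption.

```python
# Input variables for each model, as described in the text (labels are illustrative).
MODEL_INPUTS = {
    "LSTM": ["weather", "streamflow"],
    "Bflow-LSTM": ["weather", "streamflow", "baseflow"],
    "BFI-LSTM": ["weather", "streamflow", "baseflow_index"],
    "SWAT-LSTM": ["weather", "swat_residual"],
}


def chronological_split(records, train_frac=0.8, val_frac=0.2):
    """Split a time-ordered sequence into training, validation, and test blocks."""
    n_train_total = int(len(records) * train_frac)
    n_val = int(n_train_total * val_frac)  # validation share is taken from the training block
    train = records[: n_train_total - n_val]
    val = records[n_train_total - n_val : n_train_total]
    test = records[n_train_total:]
    return train, val, test


days = list(range(100))  # stand-in for 100 daily records
train, val, test = chronological_split(days)
```

Keeping the blocks in time order avoids leaking information from the test period into training, which matters for autocorrelated series such as daily streamflow.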

Model performance indices and evaluations

Three key metrics were used to assess model performance: Nash–Sutcliffe efficiency (NSE) (Nash & Sutcliffe 1970), percent bias (PBIAS) (Sorooshian et al. 1993), and coefficient of determination (R²) (Equations (5) and (6)). NSE measures the model's ability to capture the observed variability, offering a comprehensive assessment of its overall performance. PBIAS reveals whether the model overestimates or underestimates streamflow, highlighting potential biases in the predictions. R² evaluates how well the model explains the observed variations in streamflow, providing insight into its fit and explanatory power.

While model performance evaluation can vary based on the specific simulation objectives and lacks a universally accepted standard, achieving certain statistical indicators is crucial for ensuring accurate streamflow simulation in process-based models like SWAT. Moriasi et al. (2007) recommend aiming for an NSE greater than 0.5 and maintaining PBIAS within the range of −25% to +25%. Reaching these benchmarks suggests that the model effectively captures the observed streamflow dynamics and provides reliable predictions. NSE was used as an objective function in this study.
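$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} (Q_i - S_i)^2}{\sum_{i=1}^{n} (Q_i - \bar{Q})^2} \qquad (5)$$

$$\mathrm{PBIAS} = \frac{\sum_{i=1}^{n} (Q_i - S_i)}{\sum_{i=1}^{n} Q_i} \times 100 \qquad (6)$$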

In Equations (5) and (6), Qi and Si are the observed and computed data for the ith day, respectively, n is the length of data considered, and Q̄ is the mean of the n observed data.
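The three indices can be computed with a few lines of NumPy. The sketch below uses the standard definitions of NSE and PBIAS with the symbols defined above (observed Q, computed S). Two points are assumptions rather than statements from the text: PBIAS is taken as positive when the model underestimates, and R² is taken as the squared Pearson correlation between observed and computed flow.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def pbias(obs, sim):
    """Percent bias; positive values indicate underestimation (assumed convention)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 100.0 * np.sum(obs - sim) / np.sum(obs)


def r2(obs, sim):
    """Coefficient of determination as squared Pearson correlation (assumption)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.corrcoef(obs, sim)[0, 1] ** 2


observed = [1.0, 2.0, 3.0, 4.0]
computed = [0.5, 1.0, 1.5, 2.0]  # illustrative values only
print(nse(observed, computed), pbias(observed, computed), r2(observed, computed))
```

In this illustrative case the computed series is exactly half the observed one, so R² is 1 while PBIAS is 50%, which shows why a high R² alone does not rule out a large systematic bias.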

While traditional model performance indices offer a general assessment of overall model accuracy, they often fail to reveal specific limitations in predicting critical flow regimes like peak and low flows. To address this, we employed flow duration curves (FDCs) to visually evaluate the performance of all modeled outputs and observed flow. This approach leverages the inherent sensitivity of FDCs to changes in flow frequency and magnitude, allowing us to identify potential biases in model predictions at both extreme and intermediate flow levels. To specifically assess model performance at extreme events, we zoomed in on the far ends of the FDC plots, enabling a close-up examination of peak and low flow predictions.
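A flow duration curve can be built by ranking flows in descending order and assigning each an exceedance probability. The sketch below assumes the Weibull plotting position, rank / (n + 1); the text does not state which plotting position was used.

```python
import numpy as np


def flow_duration_curve(flows):
    """Return (exceedance probability in %, flows sorted in descending order).

    Assumption: Weibull plotting position, P = 100 * rank / (n + 1).
    """
    q = np.sort(np.asarray(flows, float))[::-1]
    rank = np.arange(1, q.size + 1)
    exceedance = 100.0 * rank / (q.size + 1)
    return exceedance, q


# Illustrative daily flows; observed and modeled series would each get a curve.
p, q = flow_duration_curve([3.0, 1.0, 2.0, 4.0])
for prob, flow in zip(p, q):
    print(f"{prob:5.1f}%  {flow:.1f}")
```

Plotting the observed and modeled curves on the same axes, with a logarithmic flow axis for the low-flow end, makes differences at the tails easier to see than a single summary statistic.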

SWAT model calibration and validation

The SWAT model's calibration and validation results demonstrated its ability to represent the overall streamflow pattern, as evidenced by NSE values ranging from 0.70 to 0.72 in the Gapcheon watershed. Nevertheless, the model's performance at low and peak flows exhibited notable limitations. During both calibration and validation, the model consistently underestimated the streamflow. In the Chogang watershed, despite capturing the general flow pattern, the model significantly underestimated peak flows. These discrepancies between observed and simulated streamflow likely stem from land use change (Lee et al. 2023b) and significant differences in hydrological processes between dry and wet years, particularly the occurrence of consecutive dry periods. For instance, the streamflow in 2020 was considerably higher in both watersheds compared to previous wet years (Figure 7(a) and 7(b)). In the Melka Kunture watershed, the model performed well overall (Figure 7(c)), though it overestimated the peak flow.
Figure 7

Comparison of observed and simulated streamflow in the study watersheds: (a) Gapcheon, (b) Chogang, and (c) Melka Kunture.


According to the widely used model performance rating of Moriasi et al. (2007), the Gapcheon watershed model achieved a ‘good’ rating for its NSE values (Table 3). However, its PBIAS-based rating differed between periods: ‘unsatisfactory’ for calibration (PBIAS of 32.7%) and ‘satisfactory’ for validation (PBIAS of 23.8%). In contrast, the Chogang watershed model's performance was ‘unsatisfactory.’ As discussed in other studies (Tigabu et al. 2023), the observed high inter-annual flow variability in this Korean watershed might be a contributing factor to the SWAT model's underperformance. The Melka Kunture watershed model exhibited the best performance, achieving a ‘generally good’ rating.

Table 3

SWAT model performance indices

| Watershed | NSE | PBIAS (%) | R² |
| --- | --- | --- | --- |
| Gapcheon | 0.72 (0.70) | 32.7 (23.8) | 0.72 (0.70) |
| Chogang | 0.53 (0.37) | 52.4 (37.6) | 0.69 (0.46) |
| Melka Kunture | 0.70 (0.68) | −3.3 (−19.8) | 0.74 (0.71) |

Values in parentheses are for the validation period.
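The rating classes of Moriasi et al. (2007) can be applied to the values in Table 3 with a small helper. The class boundaries below are the streamflow thresholds from that guideline, which were proposed for a monthly time step; whether the authors applied exactly these classes to their results is an assumption.

```python
def rate_nse(value):
    """Performance class for NSE following Moriasi et al. (2007)."""
    if value > 0.75:
        return "very good"
    if value > 0.65:
        return "good"
    if value > 0.50:
        return "satisfactory"
    return "unsatisfactory"


def rate_pbias(value):
    """Performance class for streamflow PBIAS (%) following Moriasi et al. (2007)."""
    magnitude = abs(value)
    if magnitude < 10:
        return "very good"
    if magnitude < 15:
        return "good"
    if magnitude < 25:
        return "satisfactory"
    return "unsatisfactory"


# (calibration, validation) values copied from Table 3.
table3 = {
    "Gapcheon": {"NSE": (0.72, 0.70), "PBIAS": (32.7, 23.8)},
    "Chogang": {"NSE": (0.53, 0.37), "PBIAS": (52.4, 37.6)},
    "Melka Kunture": {"NSE": (0.70, 0.68), "PBIAS": (-3.3, -19.8)},
}
for watershed, metrics in table3.items():
    nse_classes = [rate_nse(v) for v in metrics["NSE"]]
    pbias_classes = [rate_pbias(v) for v in metrics["PBIAS"]]
    print(watershed, nse_classes, pbias_classes)
```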

Overall performance of hybrid models

Model performance on streamflow prediction varied noticeably across watersheds (Figure 8). The hybrid models stand out for their close alignment with the statistical distribution of the observed streamflow. In terms of statistical indices, the benchmark (LSTM) exhibited moderate performance in the Korean watersheds, with NSE values of 0.57 and 0.71 in the training phase and 0.32 and 0.43 in the testing phase for the Gapcheon and Chogang watersheds, respectively (Figure 9). Conversely, the LSTM model achieved good performance in the semi-dry environment (Melka Kunture watershed) during training and testing, with NSE values of 0.89 and 0.82, respectively.
Figure 8

Observed and predicted streamflow (Box and Whisker Plots): (a) Gapcheon watershed, (b) Chogang watershed, and (c) Melka Kunture watershed.

Figure 9

Radial plot comparing streamflow prediction during the training and testing periods: (a) Nash–Sutcliffe efficiency (NSE) and (b) percent bias (PBIAS). The bold axis on the right indicates model performance during the training period, while the left side depicts performance during testing. As R² values were highly correlated with NSE, they are not presented here.


Across study regions, performance differences between the standalone SWAT and LSTM models were negligible. The hybrid models demonstrated superior predictive accuracy compared to the standalone LSTM model. Notably, the Bflow-LSTM and BFI-LSTM models significantly outperformed the weather and streamflow-based LSTM predictions. This finding underscores the critical importance of incorporating watershed memory, particularly through baseflow and baseflow index, to enhance model robustness and ensure consistent performance across training and testing phases.

In the South Korean watersheds, the hybrid models achieved NSE values of 0.79–0.99 during training and 0.55–0.98 during testing (Table 4). In the Ethiopian watershed, all tested hybrid models achieved an NSE value of 0.99 during training, with values ranging from 0.98 to 0.99 in the testing period. Across all watersheds, Bflow-LSTM consistently outperformed the other models.

Table 4

Summary of performance comparison of benchmark and hybrid models during training and testing phases

| Study area | Modeling scenario | Training NSE | Training PBIAS (%) | Training R² | Testing NSE | Testing PBIAS (%) | Testing R² |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Gapcheon | LSTM | 0.57 | 30.2 | 0.54 | 0.32 | 36.8 | 0.27 |
| Gapcheon | Bflow-LSTM | 0.98 | 6.8 | 0.98 | 0.98 | 15.2 | 0.98 |
| Gapcheon | BFI-LSTM | 0.79 | −3.7 | 0.80 | 0.73 | 1.21 | 0.73 |
| Gapcheon | SWAT-LSTM | 0.88 | 10.6 | 0.88 | 0.71 | 12.5 | 0.71 |
| Chogang | LSTM | 0.71 | 31.5 | 0.73 | 0.43 | 27.4 | 0.46 |
| Chogang | Bflow-LSTM | 0.99 | −1.8 | 0.99 | 0.97 | −4.1 | 0.98 |
| Chogang | BFI-LSTM | 0.89 | 22.9 | 0.90 | 0.77 | 8.6 | 0.83 |
| Chogang | SWAT-LSTM | 0.85 | 24.4 | 0.86 | 0.55 | 4.2 | 0.57 |
| Melka Kunture | LSTM | 0.89 | −5.0 | 0.89 | 0.82 | 8.6 | 0.82 |
| Melka Kunture | Bflow-LSTM | 0.99 | 2.9 | 0.99 | 0.99 | 6.2 | 0.99 |
| Melka Kunture | BFI-LSTM | 0.99 | −1.4 | 0.99 | 0.98 | 5.7 | 0.97 |
| Melka Kunture | SWAT-LSTM | 0.99 | 3.9 | 0.99 | 0.99 | 4.4 | 0.99 |
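As a worked check on Table 4, the testing-phase NSE gain of Bflow-LSTM over the standalone LSTM can be computed as a simple difference. The values are copied from the table; reading the 17–66% NSE improvement quoted in the abstract as this absolute difference (times 100) is an assumption, although the numbers match.

```python
# Testing-phase NSE values copied from Table 4.
test_nse = {
    "Gapcheon": {"LSTM": 0.32, "Bflow-LSTM": 0.98},
    "Chogang": {"LSTM": 0.43, "Bflow-LSTM": 0.97},
    "Melka Kunture": {"LSTM": 0.82, "Bflow-LSTM": 0.99},
}

# Absolute NSE gain of the memory-based hybrid over the benchmark.
gains = {
    watershed: round(scores["Bflow-LSTM"] - scores["LSTM"], 2)
    for watershed, scores in test_nse.items()
}

for watershed, gain in gains.items():
    print(f"{watershed}: +{gain:.2f} NSE")
```

The gain is largest where the benchmark is weakest (Gapcheon) and smallest where the standalone LSTM already performs well (Melka Kunture).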

While the standalone LSTM served as the primary benchmark for evaluating hybrid model performance, comparisons with the standalone SWAT model further revealed significant improvements achieved by the SWAT residual error-based hybrid models. This enhancement was particularly noteworthy in the Korean watersheds, where the standalone SWAT model exhibited limitations. The hybrid SWAT-LSTM model demonstrably improved core streamflow prediction metrics (NSE and PBIAS) in the Korean watersheds.

Although the hybrid SWAT-LSTM substantially improved streamflow prediction over the standalone SWAT model in the Korean watersheds, it achieved near-perfect performance only in the Melka Kunture watershed. This exceptional outcome might be directly linked to the standalone SWAT model's initial performance in this specific case. While other studies (Shen et al. 2022) suggest equivalent performance improvements from hybridizing machine learning models regardless of process-based model calibration, our findings hint at a potential influence of calibration performance on hybrid model outcomes. Further investigation is warranted to explore this possible connection.

Hybrid models’ adaptability to streamflow extremes

Scrutinizing the slope variations at both ends of the plotted FDC lines provided valuable insight into each model's proficiency in capturing extreme events. The thresholds for extreme flows varied across the watersheds. Low flow in the Gapcheon and Chogang watersheds was defined as flow with a probability of exceedance above 80%, while the Melka Kunture watershed used a stricter threshold of 90%. Conversely, peak flow in the Gapcheon and Chogang watersheds was defined as flow with a probability of exceedance below 10%, while in the Melka Kunture watershed the threshold was 20%. A closer examination of the plots within these thresholds revealed each model's performance in each watershed (Figure 10). A common challenge emerged: almost all models struggled to capture extreme low and peak flows. Previous studies (Konapala et al. 2020; Hauswirth et al. 2021) likewise reported that machine learning models, including hybrid approaches, capture low flow less well than peak flow.
Figure 10

Model performance at extreme events: (a) Gapcheon, (b) Chogang, and (c) Melka Kunture. Dual axes (log-x/linear-y, linear-x/log-y) shed light on model performance disparities at both high and low flows.
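The exceedance thresholds described above can be turned into masks that isolate low-flow and peak-flow days for a separate error check. This is a minimal sketch: it assumes the thresholds are applied to the observed series through percentiles (a flow exceeded 80% of the time is the 20th percentile), and the mean error used here is an illustrative measure, not one reported in the study.

```python
import numpy as np


def extreme_masks(obs, low_exceed=80.0, peak_exceed=10.0):
    """Boolean masks for low-flow and peak-flow days.

    low_exceed and peak_exceed are exceedance probabilities in percent;
    80/10 are the Gapcheon and Chogang values, 90/20 those for Melka Kunture.
    """
    obs = np.asarray(obs, float)
    low_threshold = np.percentile(obs, 100.0 - low_exceed)
    peak_threshold = np.percentile(obs, 100.0 - peak_exceed)
    return obs <= low_threshold, obs >= peak_threshold


def mean_error(obs, sim, mask):
    """Mean of (sim - obs) over the selected days; positive means overestimation."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.mean(sim[mask] - obs[mask]))


# Illustrative data: flows 1..100 with a model that overestimates by 1 unit.
observed = np.arange(1.0, 101.0)
modeled = observed + 1.0
low, peak = extreme_masks(observed)
print(low.sum(), peak.sum(), mean_error(observed, modeled, low))
```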


The Bflow-LSTM model significantly outperformed other models in peak flow prediction across all study watersheds. Interestingly, the SWAT model achieved performance comparable to the hybrid modeling approaches at low-flow prediction. This finding aligns with previous studies, such as Kim et al. (2021), who reported that machine learning models often excel at peak flow prediction while process-based models are better suited for low-flow simulations. Notably, each model exhibited significantly different performance at extremely low-flow prediction, with the watershed memory-based models consistently overestimating extreme low flow in all watersheds.

Recent studies (Vinuesa et al. 2020) have underscored the potential of artificial intelligence in advancing various sustainable development goals (SDGs). The environment-related SDGs are SDG 13 (climate action), SDG 14 (life below water), and SDG 15 (life on land) (United Nations Development Programme 2016). This research aligns with these broader initiatives by illuminating the performance of various modeling approaches in capturing extreme events. Improved prediction of low flows enables more effective management of water resources during droughts, supporting equitable access to safe water for all. Furthermore, understanding how models handle extreme events facilitates the development of strategies for flood preparedness and mitigation, safeguarding communities and infrastructure from the impacts of climate change.

Limitations and outlook

The focus on the popular residual error approach in this study, while motivated by its widespread application, restricted the exploration of other promising hybrid techniques. This limited scope prevented an in-depth investigation of incorporating additional variables, feeding process-based model outputs directly into machine learning algorithms, and even replacing specific modules within the models themselves (Bhasme et al. 2022; Cho & Kim 2022; Feng et al. 2022; Liu et al. 2022; Yu et al. 2023). Additionally, the study solely explored baseflow-based memory techniques, overlooking the potential of other intermediate process outputs such as evapotranspiration that could further enhance model performance. Therefore, this study can serve as a valuable springboard for future research to delve into these untapped avenues.

Unlocking the full potential of hybrid hydrological models demands investigating promising alternatives beyond the residual error approach. Can incorporating residual error as a predictor refine process-based simulations, or should it be included as an input variable? Similarly, does a modular approach using watershed memory terms, encompassing both surface runoff and baseflow, outperform individual variable inputs? Evaluating the efficacy of other deep learning models within this framework also presents an intriguing avenue for future research. Answering these seemingly simple questions will unlock a new era of efficiency and accuracy in hybrid machine learning prediction studies, paving the way for more robust and adaptable hydrological models.

This study examined the efficacy of two prevalent streamflow prediction hybridization techniques, residual error-based and watershed memory-based methods, across two geographically and hydrometeorologically diverse regions. While the former demonstrably elevates accuracy by capitalizing on comprehensive process-based modeling, it incurs substantial computational expenses. Conversely, watershed memory-based approaches present a computationally efficient alternative, though their competitive edge against the more data-intensive residual error methods and the rationale for selecting one over the other remain unclear. To bridge this knowledge gap and inform best practices, this study investigated three key questions:

  1. Which hybridization methodology demonstrates superior overall performance?

  2. Do these techniques significantly enhance forecasts of specific hydrological extremes (low flow, peak flow)?

  3. Can alternative watershed memory terms surpass the efficacy of the traditional baseflow data-based hybridization approach?

Employing a methodological framework incorporating baseflow analysis, watershed process modeling using SWAT, and development of standalone and hybrid LSTM models, our study yielded key insights. First, the Bflow-LSTM model, leveraging baseflow data, consistently outperformed all other investigated methods across diverse watersheds and metrics (NSE, PBIAS, and R²) during training and testing. Notably, it significantly improved extreme-event predictions, particularly peak flows, relative to the standalone LSTM. Second, while its overall performance remained modest, the standalone SWAT model performed comparably well for low-flow events. This highlights the continued relevance of process-based models in capturing low-flow dynamics. Third, our exploration of alternative memory terms yielded nuanced results. Although the BFI-LSTM model, which utilizes the baseflow index as memory, showed promising performance, it did not consistently surpass the Bflow-LSTM model across all watersheds and metrics.

The choice between watershed memory and residual error-based hybrid models may depend on the specific study focus. In the context of drought-related or low-flow studies, the standalone SWAT model may be a well-suited tool. However, for peak flow or flood events, the superior performance of the Bflow-LSTM model makes it a valuable tool for hydrological studies and disaster management applications.

This study was supported by the Surface Soil Conservation and Management (SS) projects, funded by the Ministry of Environment (MOE) of Korea, under grant number 2019002820003, and by a National Research Foundation of Korea (NRF) grant, which is funded by the Korea government (MSIT), with the grant number 2022R1F1A1073748.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abbaspour K. C. 2011 SWAT-CUP4: SWAT Calibration and Uncertainty Programs – a User Manual. Swiss Federal Institute of Aquatic Science and Technology, Dübendorf.
Abbaspour K. C., Yang J., Maximov I., Siber R., Bogner K., Mieleitner J., Zobrist J. & Srinivasan R. 2007 Modelling hydrology and water quality in the pre-alpine/alpine Thur watershed using SWAT. J. Hydrol. 333, 413–430. https://doi.org/10.1016/j.jhydrol.2006.09.014
Arnold J. G., Srinivasan R., Muttiah R. S. & Williams J. R. 1998 Large area hydrologic modeling and assessment part I: Model development. J. Am. Water Resour. Assoc. 34, 73–89. https://doi.org/10.1111/j.1752-1688.1998.tb05961.x
Bhasme P., Vagadiya J. & Bhatia U. 2022 Enhancing predictive skills in physically-consistent way: Physics informed machine learning for hydrological processes. J. Hydrol. 615, 128618. https://doi.org/10.1016/J.JHYDROL.2022.128618
Birhanu B., Kebede S., Charles K., Taye M., Atlaw A. & Birhane M. 2021 Impact of natural and anthropogenic stresses on surface and groundwater supply sources of the Upper Awash Sub-Basin, Central Ethiopia. Front. Earth Sci. 9. https://doi.org/10.3389/feart.2021.656726
Bloomfield J. P., Allen D. J. & Griffiths K. J. 2009 Examining geological controls on baseflow index (BFI) using regression analysis: An illustration from the Thames Basin, UK. J. Hydrol. 373, 164–176. https://doi.org/10.1016/j.jhydrol.2009.04.025
Bourdin D. R., Fleming S. W. & Stull R. B. 2012 Streamflow modelling: A primer on applications, approaches and challenges. Atmosphere-Ocean 50, 507–536. https://doi.org/10.1080/07055900.2012.734276
Brutsaert W. 2008 Long-term groundwater storage trends estimated from streamflow records: Climatic perspective. Water Resour. Res. 44. https://doi.org/10.1029/2007WR006518
Chen H., Xu Y. P., Teegavarapu R. S. V., Guo Y. & Xie J. 2021 Assessing different roles of baseflow and surface runoff for long-term streamflow forecasting in southeastern China. Hydrol. Sci. J. 66, 2312–2329. https://doi.org/10.1080/02626667.2021.1988612
Cheng S., Tong X. & Illman W. A. 2022 Evaluation of baseflow separation methods with real and synthetic streamflow data from a watershed. J. Hydrol. 613, 128279. https://doi.org/10.1016/j.jhydrol.2022.128279
Cho K. & Kim Y. 2022 Improving streamflow prediction in the WRF-Hydro model with LSTM networks. J. Hydrol. 605. https://doi.org/10.1016/J.JHYDROL.2021.127297
Chu H., Wei J., Wu W., Jiang Y., Chu Q. & Meng X. 2021 A classification-based deep belief networks model framework for daily streamflow forecasting. J. Hydrol. 595, 125967. https://doi.org/10.1016/J.JHYDROL.2021.125967
Collischonn W. & Fan F. M. 2013 Defining parameters for Eckhardt's digital baseflow filter. Hydrol. Process. 27, 2614–2622. https://doi.org/10.1002/HYP.9391
Corzo G. & Solomatine D. 2007 Baseflow separation techniques for modular artificial neural network modelling in flow forecasting. Hydrol. Sci. J. 52, 491–507. https://doi.org/10.1623/hysj.52.3.491
Duncan H. P. 2019 Baseflow separation – a practical approach. J. Hydrol. 575, 308–313. https://doi.org/10.1016/J.JHYDROL.2019.05.040
Eckhardt K. 2005 How to construct recursive digital filters for baseflow separation. Hydrol. Process. 19, 507–515. https://doi.org/10.1002/HYP.5675
Eckhardt K. 2008 A comparison of baseflow indices, which were calculated with seven different baseflow separation methods. J. Hydrol. 352, 168–173. https://doi.org/10.1016/J.JHYDROL.2008.01.005
Eckhardt K. 2023 Technical note: How physically based is hydrograph separation by recursive digital filtering? Hydrol. Earth Syst. Sci. 27, 495–499. https://doi.org/10.5194/hess-27-495-2023
EGIS 2022 Environmental Geographic Information Service, EGIS [WWW Document]. Available from: https://egis.me.go.kr/ (accessed 12 September 2023).
Erdal H. I. & Karakurt O. 2013 Advancing monthly streamflow prediction accuracy of CART models using ensemble learning paradigms. J. Hydrol. 477, 119–128. https://doi.org/10.1016/J.JHYDROL.2012.11.015
Fahimi F., Yaseen Z. M. & El-shafie A. 2017 Application of soft computing based hybrid models in hydrological variables modeling: A comprehensive review. Theor. Appl. Climatol. 128, 875–903. https://doi.org/10.1007/s00704-016-1735-8
Fatichi S., Vivoni E. R., Ogden F. L., Ivanov V. Y., Mirus B., Gochis D., Downer C. W., Camporese M., Davison J. H., Ebel B., Jones N., Kim J., Mascaro G., Niswonger R., Restrepo P., Rigon R., Shen C., Sulis M. & Tarboton D. 2016 An overview of current applications, challenges, and future trends in distributed process-based models in hydrology. J. Hydrol. 537, 45–60. https://doi.org/10.1016/J.JHYDROL.2016.03.026
Ghaith M., Siam A., Li Z. & El-Dakhakhni W. 2020 Hybrid hydrological data-driven approach for daily streamflow forecasting. J. Hydrol. Eng. 25, 1–9. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001866
Girons Lopez M., Crochemore L. & Pechlivanidis I. G. 2021 Benchmarking an operational hydrological model for providing seasonal forecasts in Sweden. Hydrol. Earth Syst. Sci. 25, 1189–1209. https://doi.org/10.5194/hess-25-1189-2021
Gu H., Xu Y. P., Liu L., Xie J., Wang L., Pan S. & Guo Y. 2023 Seasonal catchment memory of high mountain rivers in the Tibetan Plateau. Nat. Commun. 14. https://doi.org/10.1038/s41467-023-38966-9
Hall F. R. 1968 Base-flow recessions – a review. Water Resour. Res. 4, 973–983. https://doi.org/10.1029/WR004i005p00973
Harrigan S., Prudhomme C., Parry S., Smith K. & Tanguy M. 2018 Benchmarking ensemble streamflow prediction skill in the UK. Hydrol. Earth Syst. Sci. 22, 2023–2039. https://doi.org/10.5194/hess-22-2023-2018
Hauswirth S. M., Bierkens M. F. P., Beijk V. & Wanders N. 2021 The potential of data driven approaches for quantifying hydrological extremes. Adv. Water Resour. 155, 104017. https://doi.org/10.1016/j.advwatres.2021.104017
Herrera P. A., Marazuela M. A. & Hofmann T. 2022 Parameter estimation and uncertainty analysis in hydrological modeling. Wiley Interdiscip. Rev. Water 9, 1–23. https://doi.org/10.1002/wat2.1569
Hochreiter S. & Schmidhuber J. 1997 Long short-term memory. Neural Comput. 9, 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
HRFCO 2023 Han River Flood Control Office, HRFCO [WWW Document]. Available from: https://hrfco.go.kr/main.do (accessed 12 September 2023).
Isik S., Kalin L., Schoonover J. E., Srivastava P. & Graeme Lockaby B. 2013 Modeling effects of changing land use/cover on daily streamflow: An artificial neural network and curve number based hybrid approach. J. Hydrol. 485, 103–112. https://doi.org/10.1016/j.jhydrol.2012.08.032
Jun C., Ban Y. & Li S. 2014 Open access to earth land-cover map. Nature 514, 434. https://doi.org/10.1038/514434c
Karandish F. & Šimůnek J. 2016 A comparison of numerical and machine-learning modeling of soil water content with limited input data. J. Hydrol. 543, 892–909. https://doi.org/10.1016/j.jhydrol.2016.11.007
Kassem A. A., Raheem A. M., Khidir K. M. & Alkattan M. 2020 Predicting of daily Khazir basin flow using SWAT and hybrid SWAT-ANN models. Ain Shams Eng. J. 11, 435–443. https://doi.org/10.1016/J.ASEJ.2019.10.011
KMA 2022 Korea Meteorological Administration [WWW Document]. Korea Meteorol. Adm. http://www.kma.go.kr/eng/biz/observation_01.jsp (accessed 30 November 2022).
Konapala G., Kao S.-C., Painter S. L. & Lu D. 2020 Machine learning assisted hybrid models can improve streamflow simulation in diverse catchments across the conterminous US. Environ. Res. Lett. 15, 104022. https://doi.org/10.1088/1748-9326/aba927
Kratzert F., Klotz D., Brenner C., Schulz K. & Herrnegger M. 2018 Rainfall-runoff modelling using long short-term memory (LSTM) networks. Hydrol. Earth Syst. Sci. 22, 6005–6022. https://doi.org/10.5194/hess-22-6005-2018
Langbein W. B. 1938 Some channel-storage studies and their application to the determination of infiltration. Trans. Am. Geophys. Union 19, 435. https://doi.org/10.1029/TR019i001p00435
Lee J., Abbas A., McCarty G. W., Zhang X., Lee S. & Hwa Cho K. 2023a Estimation of base and surface flow using deep neural networks and a hydrologic model in two watersheds of the Chesapeake Bay. J. Hydrol. 617, 128916. https://doi.org/10.1016/j.jhydrol.2022.128916
Lee J., Park M., Min J.-H. & Na E. H. 2023b Integrated assessment of the land use change and climate change impact on baseflow by using hydrologic model. Sustainability 15, 12465. https://doi.org/10.3390/su151612465
Li M., Wang Q. J., Robertson D. E. & Bennett J. C. 2017 Improved error modelling for streamflow forecasting at hourly time steps by splitting hydrographs into rising and falling limbs. J. Hydrol. 555, 586–599. https://doi.org/10.1016/j.jhydrol.2017.10.057
Li D., Marshall L., Liang Z., Sharma A. & Zhou Y. 2021 Characterizing distributed hydrological model residual errors using a probabilistic long short-term memory network. J. Hydrol. 603, 126888. https://doi.org/10.1016/j.jhydrol.2021.126888
Li K., Huang G., Wang S. & Razavi S. 2022 Development of a physics-informed data-driven model for gaining insights into hydrological processes in irrigated watersheds. J. Hydrol. 613, 128323. https://doi.org/10.1016/j.jhydrol.2022.128323
Lim K. J., Engel B. A., Tang Z., Choi J., Kim K.-S., Muthukrishnan S. & Tripathy D. 2005 Automated Web GIS based hydrograph analysis tool, WHAT. J. Am. Water Resour. Assoc. 41, 1407–1416. https://doi.org/10.1111/j.1752-1688.2005.tb03808.x
Lim K. J., Park Y. S., Kim J., Shin Y.-C., Kim N. W., Kim S. J., Jeon J.-H. & Engel B. A. 2010 Development of genetic algorithm-based optimization module in WHAT system for hydrograph analysis and model application. Comput. Geosci. 36, 936–944. https://doi.org/10.1016/j.cageo.2010.01.004
Liu B., Tang Q., Zhao G., Gao L., Shen C. & Pan B. 2022 Physics-guided long short-term memory network for streamflow and flood simulations in the Lancang–Mekong River Basin. Water 14, 1429. https://doi.org/10.3390/w14091429
McMahon T. A. & Nathan R. J. 2021 Baseflow and transmission loss: A review. Wiley Interdiscip. Rev. Water. https://doi.org/10.1002/wat2.1527
Mekonnen B. A., Nazemi A., Mazurek K. A., Elshorbagy A. & Putz G. 2015 Hybrid modelling approach to prairie hydrology: Fusing data-driven and process-based hydrological models. Hydrol. Sci. J. 60, 1473–1489. https://doi.org/10.1080/02626667.2014.935778
Mitiku A. B., Meresa G. A., Mulu T. & Woldemichael A. T. 2023 Examining the impacts of climate variabilities and land use change on hydrological responses of Awash River basin, Ethiopia. HydroResearch 6, 16–28. https://doi.org/10.1016/J.HYDRES.2022.12.002
Moges E., Demissie Y., Larsen L. & Yassin F. 2020 Review: Sources of hydrological model uncertainties and advances in their analysis. Water 13, 28. https://doi.org/10.3390/w13010028
Mohammadi B., Vazifehkhah S. & Duan Z. 2024 A conceptual metaheuristic-based framework for improving runoff time series simulation in glacierized catchments. Eng. Appl. Artif. Intell. 127, 107302. https://doi.org/10.1016/j.engappai.2023.107302
Moriasi D. N., Arnold J. G., Van Liew M. W., Bingner R. L., Harmel R. D. & Veith T. L. 2007 Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE 50, 885–900. https://doi.org/10.13031/2013.23153
Mudiyanselage Viraj H., Herath V., Chadalawada J. & Babovic V. 2021 Hydrologically informed machine learning for rainfall-runoff modelling: Towards distributed modelling. Hydrol. Earth Syst. Sci. 25, 4373–4401. https://doi.org/10.5194/hess-25-4373-2021
Nash J. E. & Sutcliffe J. V. 1970 River flow forecasting through conceptual models part I – a discussion of principles. J. Hydrol. 10, 282–290. https://doi.org/10.1016/0022-1694(70)90255-6
Nearing G. S., Kratzert F., Sampson A. K., Pelissier C. S., Klotz D., Frame J. M., Prieto C. & Gupta H. V. 2021 What role does hydrological science play in the age of machine learning? Water Resour. Res. 57, e2020WR028091. https://doi.org/10.1029/2020WR028091
NGII 2022 National Geographic Information Institute [WWW Document]. Available from: https://www.ngii.go.kr/eng/main.do (accessed 24 May 2022).
Noori N. & Kalin L. 2016 Coupling SWAT and ANN models for enhanced daily streamflow prediction. J. Hydrol. 533, 141–151. https://doi.org/10.1016/j.jhydrol.2015.11.050
RDA 2022 Rural Development Administration, RDA [WWW Document]. Available from: https://www.rda.go.kr/ (accessed 12 September 2023).
Reis G. B., da Silva D. D., Fernandes Filho E. I., Moreira M. C., Veloso G. V., Fraga M. d. S. & Pinheiro S. A. R. 2021 Effect of environmental covariable selection in the hydrological modeling using machine learning models to predict daily streamflow. J. Environ. Manage. 290, 112625. https://doi.org/10.1016/j.jenvman.2021.112625
Shawul
A. A.
&
Chakma
S.
2020
Suitability of global precipitation estimates for hydrologic prediction in the main watersheds of upper Awash basin
.
Environ. Earth Sci.
79
,
53
.
https://doi.org/10.1007/s12665-019-8801-3
.
Shen
C.
2018
A transdisciplinary review of deep learning research and its relevance for water resources scientists
.
Water Resour. Res.
54
,
8558
8593
.
https://doi.org/10.1029/2018WR022643
.
Shen
Y.
,
Ruijsch
J.
,
Lu
M.
,
Sutanudjaja
E. H.
&
Karssenberg
D.
2022
Random forests-based error-correction of streamflow from a large-scale hydrological model: Using model state variables to estimate error terms
.
Comput. Geosci.
159
,
105019
.
https://doi.org/10.1016/j.cageo.2021.105019
.
Sikorska-Senoner
A. E.
&
Quilty
J. M.
2021
A novel ensemble-based conceptual-data-driven approach for improved streamflow simulations
.
Environ. Modell. Software
143
.
https://doi.org/10.1016/J.ENVSOFT.2021.105094
.
Smith
T.
,
Marshall
L.
&
Sharma
A.
2015
Modeling residual hydrologic errors with Bayesian inference
.
J. Hydrol.
528
,
29
37
.
https://doi.org/10.1016/j.jhydrol.2015.05.051
.
Solomatine
D. P.
&
Ostfeld
A.
2008
Data-driven modelling: Some past experiences and new approaches
.
J. Hydroinf.
10
,
3
22
.
https://doi.org/10.2166/hydro.2008.015
.
Sorooshian
S.
,
Duan
Q.
&
Gupta
V. K.
1993
Calibration of rainfall-runoff models: Application of global optimization to the Sacramento soil moisture accounting model
.
Water Resour. Res.
29
,
1185
1194
.
https://doi.org/10.1029/92WR02617
.
Sutanto
S. J.
&
Van Lanen
H. A. J.
2022
Catchment memory explains hydrological drought forecast performance
.
Sci. Rep.
12
,
1
11
.
https://doi.org/10.1038/s41598-022-06553-5
.
Tallaksen
L. M.
1995
A review of baseflow recession analysis
.
J. Hydrol.
165
,
349
370
.
https://doi.org/10.1016/0022-1694(94)02540-R
.
Tan
M. L.
,
Gassman
P. W.
,
Yang
X.
&
Haywood
J.
2020
A review of SWAT applications, performance and future needs for simulation of hydro-climatic extremes
.
Adv. Water Resour.
143
,
103662
.
https://doi.org/10.1016/j.advwatres.2020.103662
.
Taormina
R.
,
Chau
K. W.
&
Sivakumar
B.
2015
Neural network river forecasting through baseflow separation and binary-coded swarm optimization
.
J. Hydrol.
529
,
1788
1797
.
https://doi.org/10.1016/J.JHYDROL.2015.08.008
.
Thiesen
S.
,
Darscheid
P.
&
Ehret
U.
2019
Identifying rainfall-runoff events in discharge time series: A data-driven method based on information theory
.
Hydrol. Earth Syst. Sci.
23
,
1015
1034
.
https://doi.org/10.5194/hess-23-1015-2019
.
Tian
Y.
,
Xu
Y. P.
,
Yang
Z.
,
Wang
G.
&
Zhu
Q.
2018
Integration of a parsimonious hydrological model with recurrent neural networks for improved streamflow forecasting
.
Water (Switzerland)
10
.
https://doi.org/10.3390/w10111655
.
Tigabu T. B., Wagner P. D., Narasimhan B. & Fohrer N. 2023 Pitfalls in hydrologic model calibration in a data scarce environment with a strong seasonality: Experience from the Adyar catchment, India. Environ. Earth Sci. 82, 367. https://doi.org/10.1007/s12665-023-11047-2
Tongal H. & Booij M. J. 2018 Simulation and forecasting of streamflows using machine learning models coupled with base flow separation. J. Hydrol. 564, 266–282. https://doi.org/10.1016/j.jhydrol.2018.07.004
Tongal H. & Booij M. J. 2022 Simulated annealing coupled with a Naïve Bayes model and base flow separation for streamflow simulation in a snow dominated basin. Stoch. Environ. Res. Risk Assess. 37, 89–112. https://doi.org/10.1007/s00477-022-02276-1
United Nations Development Programme 2016 The 17 Goals | Sustainable Development [WWW Document]. Sustain. Dev. Available from: https://sdgs.un.org/goals (accessed 16 February 2024).
Van Loon A. F. & Laaha G. 2015 Hydrological drought severity explained by climate and catchment characteristics. J. Hydrol. 526, 3–14. https://doi.org/10.1016/j.jhydrol.2014.10.059
Vinuesa R., Azizpour H., Leite I., Balaam M., Dignum V., Domisch S., Felländer A., Langhans S. D., Tegmark M. & Fuso Nerini F. 2020 The role of artificial intelligence in achieving the sustainable development goals. Nat. Commun. 11, 233. https://doi.org/10.1038/s41467-019-14108-y
Willard J., Jia X., Xu S., Steinbach M. & Kumar V. 2023 Integrating scientific knowledge with machine learning for engineering and environmental systems. ACM Comput. Surv. 55, 1–37. https://doi.org/10.1145/3514228
Wu B., Zheng Y., Tian Y., Wu X., Yao Y., Han F., Liu J. & Zheng C. 2014 Systematic assessment of the uncertainty in integrated surface water-groundwater modeling based on the probabilistic collocation method. Water Resour. Res. 50, 5848–5865. https://doi.org/10.1002/2014WR015366
Xie J., Liu X., Wang K., Yang T., Liang K. & Liu C. 2020 Evaluation of typical methods for baseflow separation in the contiguous United States. J. Hydrol. 583, 124628. https://doi.org/10.1016/J.JHYDROL.2020.124628
Xie J., Liu X., Tian W., Wang K., Bai P. & Liu C. 2022 Estimating gridded monthly baseflow from 1981 to 2020 for the contiguous US using Long Short-Term Memory (LSTM) networks. Water Resour. Res. 58, e2021WR031663. https://doi.org/10.1029/2021WR031663
Yaseen Z. M., El-shafie A., Jaafar O., Afan H. A. & Sayl K. N. 2015 Artificial intelligence based models for stream-flow forecasting: 2000–2015. J. Hydrol. 530, 829–844. https://doi.org/10.1016/J.JHYDROL.2015.10.038
Yifru B. A., Lim K. J. & Lee S. 2024 Enhancing streamflow prediction physically consistently using process-based modeling and domain knowledge: A review. Sustainability 16, 1376. https://doi.org/10.3390/su16041376
Yu Q., Jiang L., Wang Y. & Liu J. 2023 Enhancing streamflow simulation using hybridized machine learning models in a semi-arid basin of the Chinese Loess Plateau. J. Hydrol. 617, 129115. https://doi.org/10.1016/j.jhydrol.2023.129115
Zemzami M. & Benaabidate L. 2016 Improvement of artificial neural networks to predict daily streamflow in a semi-arid area. Hydrol. Sci. J. 61, 1801–1812. https://doi.org/10.1080/02626667.2015.1055271
Zhang Z., Zhang Q., Singh V. P. & Shi P. 2018 River flow modelling: Comparison of performance and evaluation of uncertainty using data-driven models and conceptual hydrological model. Stoch. Environ. Res. Risk Assess. 32, 2667–2682. https://doi.org/10.1007/s00477-018-1536-y
Zheng F., Maier H. R., Wu W., Dandy G. C., Gupta H. V. & Zhang T. 2018 On lack of robustness in hydrological model development due to absence of guidelines for selecting calibration and evaluation data: Demonstration for data-driven models. Water Resour. Res. 54, 1013–1030. https://doi.org/10.1002/2017WR021470
Zounemat-Kermani M., Matta E., Cominola A., Xia X., Zhang Q., Liang Q. & Hinkelmann R. 2020 Neurocomputing in surface water hydrology and hydraulics: A review of two decades retrospective, current status and future prospects. J. Hydrol. 588. https://doi.org/10.1016/J.JHYDROL.2020.125085
Zounemat-Kermani M., Batelaan O., Fadaee M. & Hinkelmann R. 2021 Ensemble machine learning paradigms in hydrology: A review. J. Hydrol. 598, 126266. https://doi.org/10.1016/J.JHYDROL.2021.126266
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).