This work compares the accuracy of streamflow estimated by a data-driven artificial neural network (ANN) and the physically based soil and water assessment tool (SWAT). The models were applied in two small watersheds, one highly urbanized and the other primarily covered with evergreen forest and shrubs, in the San Antonio region of central south Texas, where karst geologic features are prevalent. Both models predicted daily streamflow in the urbanized watershed very well, with the ANN and SWAT having the Nash–Sutcliffe coefficient of efficiency (NSE) values of 0.76 and 0.72 in the validation period, respectively. However, both models predicted streamflow poorly in the nonurban watershed. The NSE values of the ANNs significantly improved when a time series autoregressive model structure using historical streamflow data was implemented in the nonurban watershed. The SWAT model only achieved trivial performance improvement after using the SWAT-CUP SUFI-2 calibration procedure. This result suggests that an ANN model may be more suitable for short-term streamflow forecasting in watersheds heavily affected by karst features where the complex processes of rapid groundwater recharge and discharge strongly influence surface water flow.

  • The ANN models outperformed the SWAT models by varying degrees in the two karstified watersheds in central Texas.

  • The calibrated SWAT and ANN models performed better in the urban watershed while having poor statistical performance in the rural watershed.

  • Applying the correction factor approach caused different extents of improvement in statistical results when a relatively higher uncertainty level was assumed.

The prediction of streamflow using rainfall-runoff models is vital in water resources management. It aims to improve decision-making for a wide range of hydrological problems. Streamflow is the result of complex natural processes at the watershed scale. In the past several decades, various computer-based hydrological models have been developed to simulate streamflow, most of which focus on capturing the rainfall-runoff process since precipitation is usually the primary driving force of a hydrological system (Moradkhani & Sorooshian 2009). As more and more models became available, hydrologists started to classify them into different categories based on their structure. One standard classification divides hydrological models into lumped or distributed models. A lumped hydrological model considers the study watershed as a single unit. The parameters representing spatial characteristics related to the rainfall-runoff process are averaged or ignored for the entire watershed (Brirhet & Benaabidate 2016). In a distributed model, the variation of watershed characteristics and hydrological processes in space are explicitly considered, usually through discretizing the watershed into a large number of rectangular grid cells or a limited number of subbasins based on the drainage and topographic features (Islam 2011). Another frequently used classification approach considers models as deterministic or stochastic. A deterministic model generates only a single output value with a given input dataset and model parameters (Beven 2011). In contrast, stochastic hydrological models usually provide probability distributions of the target variables (Beaumont 1979). Exploring the suitability and analyzing different models' limitations is a popular topic in modern hydrology.

One major issue in hydrological modeling is deciding the appropriate level of model complexity. Beven (2011) summarized two widely accepted views of modeling. The first suggests that hydrological models are merely tools for extrapolating available data in time and space. Therefore, if the model inputs and outputs can be successfully related, it is not essential to elaborate on the details of a watershed. The second view maintains that models should reflect the physical processes involved to the extent possible to ensure confidence when extrapolating beyond the existing observations. Consequently, more complex physically based hydrological models are usually adopted under the second perspective. The soil and water assessment tool (SWAT) and artificial neural network (ANN) are two widely adopted watershed-level hydrological models in recent literature (Kassem et al. 2020). The more complex SWAT model simulates hydrological processes using mathematical equations that describe important physical laws of conservation of mass, energy, and momentum (Arnold et al. 2012), while the ANN models simply identify the statistical connections between hydrological inputs and outputs (ASCE 2000b). A detailed comparison of these two models can provide insights into the issue of model complexity in hydrological simulation.

A few studies have compared SWAT and ANN regarding streamflow prediction. Kim et al. (2015) applied SWAT and ANN to impute missing streamflow observations in the Taehwa River Watershed in Korea and found that SWAT was better at simulating low flows, while the neural network model generally performed better at simulating high flows. Jimeno-Sáez et al. (2018) compared the accuracy of streamflow prediction between SWAT and ANN models at the Ladra River Basin and the Segura River Basin in Spain. The authors also concluded that ANN is better at estimating higher flows. Demirel et al. (2009) applied SWAT and ANN models to simulate streamflow in the Pracana River Basin in Portugal and concluded that ANN was more successful at forecasting peak streamflow, whereas the SWAT model performed better in terms of overall goodness-of-fit indicators. Srivastava et al. (2006) analyzed the performance of streamflow prediction from SWAT and ANN in the agricultural-dominated Honey Brook Watershed in Pennsylvania. They found that the ANN model produced simulation results with the Nash–Sutcliffe coefficient of efficiency (NSE) and coefficient of determination (R2) better than the SWAT model. In a recent study, Zakizadeh et al. (2020) used ANN and SWAT to simulate the rainfall-runoff relationship in a small watershed near Tehran city, Iran. The authors concluded that ANN produced simulations with minor error and uncertainty, although both models could achieve excellent predictive performance.

While previous studies have explored the suitability of SWAT and ANN models in various regions, none of these studies were conducted in the state of Texas, where its hot climate condition and growing population have made water sustainability a contentious issue. Additionally, the San Antonio region of central south Texas has extensive karst terrain, making hydrological modeling more difficult and rendering model performance patterns vastly different from non-karstified watersheds (Jakada & Chen 2020). Furthermore, few studies on SWAT and ANN have considered uncertainty in the measurement and modeling process, yet considering uncertainty estimates allows modelers to quantify their confidence in the observed and simulated values (Harmel et al. 2010). Considering model uncertainty is vitally essential for the stochastic ANN model, in which the simulation outcome of each data point is a distribution rather than a single determined value (Hu et al. 2001). Hence, the objectives of this study are (a) to parameterize SWAT and an ANN model to simulate streamflow in two small watersheds in the San Antonio Region, one rural and one urban; (b) to compare SWAT and ANN model performance in karstic watersheds; (c) to analyze SWAT and ANN performance under different dominant land-use type; and (d) to enhance the model evaluation using modified goodness-of-fit indicators that incorporate measurement and model uncertainty.

Description of SWAT and ANN models

The SWAT is a physically based, semi-distributed, deterministic model developed to assess water quality and quantity in large river basins with varying soils, land uses, land cover types, and management practices (Arnold et al. 2012). The SWAT model divides a study watershed into a few subbasins. The model further groups land with homogeneous slopes, soil types, and land cover types into hydrologic response units (HRUs) at a finer spatial scale. The HRUs may not be spatially continuous and can be found at different locations within a subbasin. This strategy effectively simplifies the simulation of watershed processes (Gassman et al. 2007; Licciardello et al. 2011; Tuppad et al. 2011). The development of the SWAT model spans the last three decades, with new functions and routines continuously added to the model. The current SWAT model has been widely applied to complex systems' water, sediment, agricultural chemicals, and contaminant yields (Gassman et al. 2007; Abbaspour et al. 2015). Moreover, the SWAT model is closely integrated with a geographic information system (GIS) since most SWAT input data have spatial characteristics (Srinivasan & Arnold 1994; Jayakrishnan et al. 2005; Olivera et al. 2006). The ArcGIS and QGIS platforms are integrated with the SWAT model to date.

A physically based hydrological model is often referred to as a white box since its modeling processes are established on known scientific principles of mass and energy fluxes (Moradkhani & Sorooshian 2009). In contrast to the complex structure of physically based models, statistical models have gained popularity for rainfall-runoff modeling based on their simplicity and low computational resource demand. An ANN is a computing system that resembles the structure of the human biological neural network. It can identify nonlinear relationships from given patterns and fit nonparametric models on multivariate input data (Govindaraju & Rao 2013). ANNs have been applied to rainfall-runoff modeling since the 1990s. However, when used as a rainfall-runoff model, an ANN only identifies the relationship between historical inputs (meteorological data) and outputs (streamflow) without considering any of the physical processes involved, therefore falling into the category of a lumped model and is often referred to as a black-box model (ASCE 2000a). In addition, the ANNs are stochastic models, given that the fitted parameter values often vary from one training process to another for a fixed training dataset (Jain & Srinivasulu 2004), creating a distribution of outputs rather than one value.

The physically based SWAT model simulates streamflow by incorporating descriptive mathematical equations designed to conceptualize the hydrological processes in a watershed. The model requires grid-based geospatial data as inputs in addition to meteorological data and incorporates a significant number of model parameters (Fatichi et al. 2016; Noori & Kalin 2016). Solving the equations for the state variables at the level of HRUs for each time step can be time-consuming and computationally demanding (Noori & Kalin 2016; Jimeno-Sáez et al. 2018). In this regard, statistical models such as ANN hold a clear advantage. ANNs do not require a priori knowledge of the watershed's physical characteristics as model input, which reduces the model setup procedures. Meanwhile, the time it takes to train and select the best neural network is significantly shorter than calibrating a SWAT model (Minns & Hall 1996; Jimeno-Sáez et al. 2018), capable of achieving ‘unreasonable effectiveness’ in hydrological applications when sufficient training data are available (Worland et al. 2019). On the other hand, several studies have noted the disadvantages of applying ANN as a rainfall-runoff model. In particular, ANNs' lack of explanations for the underlying physical hydrological processes has generated much concern among hydrologists (Karunanithi et al. 1994; ASCE 2000a; Ha & Stenstrom 2003; Jain & Srinivasulu 2004; Yaseen et al. 2015). The inability to capture physical dynamics at the watershed level means that ANNs are unsuitable for environmental impact studies, such as modeling streamflow under changing climate conditions (Milly et al. 2008; Humphrey et al. 2016). Like other statistical models, extrapolating beyond the training data range often undermines ANNs' predictive performance (Minns & Hall 1996; Sahoo et al. 2006).

Study area

Two small watersheds in the San Antonio region of Texas are selected for this study, both located within the Medina River Basin (HUC8: 12100302), which covers part of the San Antonio urban area and a large rural area to the west of the city (Figure 1(a)). The study area has a subtropical and semi-humid climate with long, hot summers and short, warm winters. A part of the Edwards Balcones Fault Zone (BFZ) aquifer lies under the study area, where karst geologic features are prevalent (Loáiciga et al. 2000). Unique hydrological processes, including diversion of surface runoff into sinkholes, fast groundwater movement through subsurface conduits, and recharge to surface water from springs linked directly to the aquifers, are often found in such terrain (Jakada & Chen 2020).
Figure 1

(a) Location of the UMRW and LCW in the state of Texas, with zone illustration of the Edwards BFZ Aquifer; (b) DEM of the UMRW; (c) DEM of the LCW.

Figure 1

(a) Location of the UMRW and LCW in the state of Texas, with zone illustration of the Edwards BFZ Aquifer; (b) DEM of the UMRW; (c) DEM of the LCW.

Close modal

The two study watersheds were delineated in ArcSWAT using the digital elevation model (DEM). The outlets of both study watersheds were selected at the USGS gages with long-term consistent streamflow records. The Leon Creek Watershed (LCW) covers the western part of the San Antonio urban area. It is centered at 98.67° west longitude, 29.56° north latitude, and has a drainage area of 535.76 km2. The drainage area of the LCW is situated across the contributing, recharge, and artesian zones of the Edwards (BFZ) aquifer (Figure 1(a)). Elevation of the LCW decreases from north to south, ranging from 546 m in the northern part to 177 m near the watershed outlet (Figure 1(c)). The LCW is heavily urbanized. According to the 2011 National Land Cover Database (NLCD2011) land-use land cover (LULC) classification (Yang et al. 2018), 47.2% of the LCW is classified as developed urban areas with different levels of development intensity. Besides the extensive impervious surface of the city, urban afforestation has made up about 34.2% of the LCW evergreen or deciduous forest. Leon Creek is the main waterway in the LCW. It originates from multiple smaller creeks in the northern part of LCW and flows south before merging into the lower part of the Medina River.

The Upper Medina River Watershed (UMRW) is about 30 km northwest of San Antonio. It centers at 99.32° west longitude and 29.82° north latitude. The drainage area of the UMRW is 847.03 km2 and is located entirely within the contributing zone of the Edwards BFZ aquifer (Figure 1(a)). The UMRW is primarily rural, and its dominant land cover types are forest and shrub. Deciduous and evergreen forests cover 48.5% of the drainage area, while shrubland covers 38.9%, according to the NLCD2011 classification. Elevation of the UMRW declines from 727 m at its highest point in the northwest to 364 m at the lowest point in the southeast watershed outlet (Figure 1(b)). The upper Medina River is the main waterway in the UMRW. It follows the topographic decline of the UMRW and flows southeastwards. The Medina River is a tributary of the San Antonio River, and it eventually merges into the San Antonio River further south outside the Medina River Basin.

The authors used the hydrological simulations of the same period in the LCW demonstrated in this work as a part of an earlier study that explored the application of various gridded weather products (Mei et al. 2022). In this work, different ANN model configurations were explored in LCW. Meanwhile, unlike the earlier study, this work presented a different investigation of the model output to conduct paired watershed comparisons and incorporate model uncertainty analysis.

Data acquisition

All the data used for constructing hydrological models in this study were obtained from publicly accessible data sources. To set up model simulations, SWAT requires four main data files, including the DEM, LULC, soil, and meteorological data. The National Elevation Dataset (NED), which has a 30 m resolution, was used for the DEM data. The 2011 National Land Cover Dataset (NLCD2011) was collected as the LULC data. The NED and NLCD2011 datasets for the study area were obtained from the USDA Natural Resources Conversation Service (NRCS) geospatial data gateway (USDA-NRCS 2014). The state soil geographic (STATSGO) database preloaded with the ArcSWAT interface was used as the soil data. Most importantly, the Parameter-elevation Regressions on Independent Slopes Model (PRISM) daily spatial climate dataset AN81d (Daly et al. 2008) was acquired as meteorological inputs using the Google Earth Engine (GEE). This cloud-based geospatial analysis platform allows easy access to many government-supported accessible geospatial data archives (Gorelick et al. 2017). The area-averaged daily precipitation, minimum, maximum, and mean temperature of the PRISM dataset were collected by applying shapefile masks of the two study watersheds on GEE.

The USGS daily streamflow observations at Leon Creek (USGS 08181480) and Upper Medina River (USGS 08178880) from the years 2000 to 2009 were used (U.S. Geological Survey 2016) in this study. This study period was chosen as streamflow observations were available at both sites, ensuring a comparison between the watersheds. Observations from 2000 to 2006 were used to calibrate the SWAT model and provided a training target for the ANN model. Observations from 2007 to 2009 were used as standalone testing data.

SWAT modeling approach

SWAT formulates hydrological processes in a watershed using mathematical equations describing important physical laws of mass, energy, and momentum conservation. Water balance within the system at each time step is calculated to produce simulation results of hydrological and water quality (H/WQ) variables. A detailed description of the SWAT modeling process can be found in the SWAT theoretical documentation (Neitsch et al. 2011). This study used a DEM covering a much larger spatial extent than the study area to define the two study watersheds. Based on the topography provided by the DEM, the subbasin thresholds (minimum area for initiating stream networks) were applied to define the number and location of subbasins (Her et al. 2015). The subbasins were further discretized into HRUs, and a 10% threshold was applied to remove minor slope, soil, and land-use classes to restrict the total number of HRUs for improving computational efficiency (Femeena et al. 2022). As a result, 25 subbasins and 298 HRUs were defined for LCW, while 23 subbasins and 169 HRUs were defined for UMRW.

While SWAT adopts the more traditional approach of utilizing gauge weather data as inputs, the PRISM dataset for the contiguous United States is in a gridded format. This study used GEE to calculate the area-averaged meteorological data for the two study watersheds. The watershed centroids were used as ‘virtual rain gauges’ (Elhassan et al. 2016). The SWAT model simulation used daily precipitation and maximum and minimum temperatures from 1998 to 2009. The model simulations were set up on a daily time step for the 12-year simulation period. The calendar years of 1998–1999 were used for model warm-up, and 2000–2006 were used for model calibration. Model calibration minimizes the difference between model simulation and observation by adjusting model parameters. The models are then validated using observations from 2007 to 2009 without further change to the calibrated parameters. The SWAT model calibration was conducted in SWAT Calibration and Uncertainty Programs (SWAT-CUP) using the SUFI-2 procedure (Abbaspour 2011), with the NSE adopted as the calibration objective function. Table 1 summarizes 15 parameters selected for calibration, all considered sensitive for streamflow simulation according to the literature (Arabi et al. 2007; Qi et al. 2017; Koycegiz & Buyukyildiz 2019; Chen et al. 2020). Snow-melt parameters were left at default values since snow rarely occurs in the San Antonio region. Meanwhile, multiple groundwater parameters were adjusted due to their significant impact on modeling the recharge and discharge of the Edwards BFZ aquifer. More details of the adjusted parameters are discussed in the results section.

Table 1

Description of the calibrated SWAT parameters

Hydrology input parameterDescriptionFile extensionType of changeInitial value range
CN2 SCS runoff curve number for antecedent moisture condition II .mgt Relative (−10%, 10%) 
ALPHA_BF Base flow alpha factor (days) .gw Replace (0, 1) 
GW_DELAY Delay time for aquifer recharge (days) .gw Replace (0, 500) 
GWQMN Threshold depth of water in the shallow aquifer required for return flow to occur (mm H2O) .gw Replace (0, 5,000) 
GW_REVAP Groundwater ‘revap’ coefficient .gw Replace (0.02, 0.2) 
REVAPMN Threshold depth of water in the shallow aquifer for ‘revap’ or percolation to the deep aquifer to occur (mm H2O) .gw Replace (0, 500) 
RCHRG_DP Deep aquifer percolation fraction .gw Replace (0, 1) 
SOL_AWC Available water capacity of the soil layer (mm H2O/mm soil) .sol Relative (−5%, 5%) 
SOL_K Soil-saturated hydraulic conductivity (mm/h) .sol Relative (−5%, 5%) 
ESCO Soil Evaporation compensation factor .hru Replace (0.6, 0.95) 
CANMX Maximum canopy storage (mm H2O) .hru Replace (0, 100) 
CH_K1 Effective hydraulic conductivity in tributary channel alluvium (mm/h) .sub Replace (5, 130) 
CH_K2 Main channel hydraulic conductivity (mm/h) .rte Replace (5, 130) 
CH_N2 Manning's ‘n’ value for the main channel .rte Replace (0.01, 0.3) 
SURLAG Surface runoff lag coefficient (days) .bsn Replace (1, 24) 
Hydrology input parameterDescriptionFile extensionType of changeInitial value range
CN2 SCS runoff curve number for antecedent moisture condition II .mgt Relative (−10%, 10%) 
ALPHA_BF Base flow alpha factor (days) .gw Replace (0, 1) 
GW_DELAY Delay time for aquifer recharge (days) .gw Replace (0, 500) 
GWQMN Threshold depth of water in the shallow aquifer required for return flow to occur (mm H2O) .gw Replace (0, 5,000) 
GW_REVAP Groundwater ‘revap’ coefficient .gw Replace (0.02, 0.2) 
REVAPMN Threshold depth of water in the shallow aquifer for ‘revap’ or percolation to the deep aquifer to occur (mm H2O) .gw Replace (0, 500) 
RCHRG_DP Deep aquifer percolation fraction .gw Replace (0, 1) 
SOL_AWC Available water capacity of the soil layer (mm H2O/mm soil) .sol Relative (−5%, 5%) 
SOL_K Soil-saturated hydraulic conductivity (mm/h) .sol Relative (−5%, 5%) 
ESCO Soil Evaporation compensation factor .hru Replace (0.6, 0.95) 
CANMX Maximum canopy storage (mm H2O) .hru Replace (0, 100) 
CH_K1 Effective hydraulic conductivity in tributary channel alluvium (mm/h) .sub Replace (5, 130) 
CH_K2 Main channel hydraulic conductivity (mm/h) .rte Replace (5, 130) 
CH_N2 Manning's ‘n’ value for the main channel .rte Replace (0.01, 0.3) 
SURLAG Surface runoff lag coefficient (days) .bsn Replace (1, 24) 

Relative means the existing parameter value is multiplied by 1 plus the given value; Replace means the given value replaces the existing parameter value.

ANN modeling approach

The primary purpose of applying ANN as a rainfall-runoff model is to determine a model structure that best captures the nonlinear relationships between the input meteorological variables and the target streamflow. A comprehensive review of ANN application in hydrology can be found at ASCE (2000a). A three-layered feed-forward neural network using a back-propagation algorithm was employed in this study. Several neural networks were trained before a model selection phase that selected the model with the best generalization capability (Donate et al. 2013). Table 2 summarizes six model structures examined for LCW and UMRW, including scenarios 1 through 3, which only used meteorological variables as predictors, and scenarios 4 through 6, which included precipitation data and streamflow from previous time steps as predictors. The considered predictors include daily precipitation (Pt), precipitation of the previous n days (Ptn), daily mean air temperature (Tt), streamflow observation of the previous n days (Qtn), and total precipitation for the preceding n days (Pn). All input variables are normalized to the range of 0–1 using the following equation to speed up model training.
(1)
where denotes the ith observed data point. and denote the minimum and maximum values in the observed time series, and denotes the normalized values.
Table 2

ANN model input combinations

Prediction scenarioInput combinationOutput
Pt, Pt−1, Pt−2, Pt−3, Pt−4, Tt Qt 
Pt, Pt−1, Pt−2, Pt−3, Pt−4, Pn Qt 
Pt, Pt−1, Pt−2, Pt−3, Pt−4, Pn,Tt Qt 
Pt, Qt−1, Qt−2 Qt 
Pt, Qt−1, Qt−2, Qt−3 Qt 
Pt, Qt−1, Qt−2, Qt−3, Qt−4 Qt 
Prediction scenarioInput combinationOutput
Pt, Pt−1, Pt−2, Pt−3, Pt−4, Tt Qt 
Pt, Pt−1, Pt−2, Pt−3, Pt−4, Pn Qt 
Pt, Pt−1, Pt−2, Pt−3, Pt−4, Pn,Tt Qt 
Pt, Qt−1, Qt−2 Qt 
Pt, Qt−1, Qt−2, Qt−3 Qt 
Pt, Qt−1, Qt−2, Qt−3, Qt−4 Qt 

Cross-validation is the simplest and most widely used method for estimating prediction errors in statistical modeling (Hastie et al. 2009). However, since the time dependency and autocorrelation of a streamflow time series contradict the basic assumption of traditional cross-validation that the data are independent and identically distributed (Bergmeir & Benítez 2012), this study proposes to use the blocked cross-validation (BlockedCV) for determining the best model structure. Using BlockedCV, the data points are grouped into consistent blocks, and their sequential order is preserved in the training/validation data split process. The data from 2000 to 2006 were used for modeling training and validation, while data from 2007 to 2009 were used for testing to match the SWAT model simulation period discussed in the previous section. Unlike the notations commonly used in physically based hydrological model analysis, the validation dataset in statistical cross-validation often refers to the dataset of which statistical outcome is used for model selection. In contrast, the testing dataset is used for standalone model verification without further changing the model parameters.

There is, in general, no commonly accepted rule for determining the number of hidden units for a three-layered neural network. However, some studies have discussed the size of hidden units to explore. For example, Ha & Stenstrom (2003) evaluated 1–9 hidden units in their study of water quality estimation. In another similar study for water quality prediction, Kalin et al. (2010) searched from 1 to 10 hidden units. Demirel et al. (2009) supported the number of hidden units to be two-thirds of the sum of the number of input and output nodes, while Gazzaz et al. (2012) suggested that the hidden units size falls between i and 2i + 1, where i represents the number of input nodes. This study investigated the number of hidden layer units from 1 to 10 for each prediction scenario in Table 2. The root mean square error (RMSE) of the validation dataset of all trained models was calculated to find the best model structure. The RMSE is commonly adopted for evaluating statistical models and represents how far the residuals are from 0 on average (Kuhn & Johnson 2013). This study used the R software (R Core Team 2019) for all ANN model training and structure selection.

Model performance measures

Previous studies have suggested that no single metric is sufficient to verify a hydrological model (Harmel et al. 2014; Yaseen et al. 2018). This study used two goodness-of-fit indicators to evaluate model performance: the NSE and percent bias (PBIAS). The NSE is a normalized statistic that determines the magnitude of residual variance compared to the observed data variance. It ranges from –∞ to 1.0, with an NSE = 1.0 representing an optimal fit (Nash & Sutcliffe 1970). The PBIAS measures the average tendency of model overestimation or underestimation. A smaller absolute PBIAS indicates a better model fit. We adopted the evaluation criteria Moriasi et al. (2007) proposed to evaluate the model performance. The rating criteria are slightly relaxed since the simulations in this study are conducted on the finer daily time step. A more detailed rating guideline is available at ASABE (2017). Table 3 displays the form of NSE and PBIAS and their corresponding model performance criteria.

Table 3

Goodness-of-fit indicators and model performance evaluation criteria

Performance ratingNSE PBIAS (%)
Very good   
Good   
Satisfactory   
Unsatisfactory   
Performance ratingNSE PBIAS (%)
Very good   
Good   
Satisfactory   
Unsatisfactory   

is the simulated data; is the observed data; is the mean of the observed data.

Incorporating measurement and model uncertainty

Uncertainty should always be accounted for in a hydrological model application since water resources management decisions are increasingly based on H/WQ modeling (Harmel & Smith 2007; Beven 2011). In order to incorporate both measurement and model uncertainty, Harmel et al. (2010) proposed using a correction factor to modify the error term (, as shown in Table 3) in the traditional statistical indicators. The theoretical basis of the correction factor is that the more the uncertainty distribution of the simulated and observed data pairs overlap, the closer the simulated and observed values are to one another (Harmel et al. 2010). The degree of overlap is represented by the joint probability of the two uncertainty distributions and can be expressed by (Harmel et al. 2010):
(2)
where is the degree of overlap of distributions for each simulated () and observed () pair, is the probability density function (pdf) of the simulated value (), and is the probability density function of the observed value (). The pdfs of the simulated and observed values were determined in the R software package using the parameters of the assumed uncertainty distributions, which will be further discussed in the following paragraphs. The degree of overlap is calculated by integrating the pdfs of each simulated and observed pair.
The degree of overlap ranges from 0 to 1, and it is then used to calculate the correction factor and modify the error term, which can be expressed by (Harmel et al. 2010):
(3)
(4)
where is the correction factor that incorporates measurement and model uncertainty for each simulated () and observed () pair. The term is the modified error term for each simulated () and observed () pair. The modified error term is then used to substitute the simple error term () in the NSE and PBIAS presented in Table 3.
The uncertainty level of data involving complex measurement and modeling processes is difficult to quantify since the sources of the uncertainty usually cannot be directly measured. Under this circumstance, a prior probability distribution is often adopted to quantify the uncertainty (Rowe 1994). In this work, we assume the uncertainty distributions of the observed and simulated data points to be normally distributed, which is one of the uncertainty distributions adopted by Harmel et al. (2010). A normal distribution has two parameters: the mean and standard deviation. To generate the normal distributions, we used the observed and simulated streamflow values as their mean. Their corresponding standard deviations (Std.Dev) were generated by multiplying the mean of the distributions with the coefficients of variation (Cv). Harmel et al. (2010) proposed using four Cv values ranging from low to high. This study adopted two Cv values (0.026 and 0.256) that Harmel et al. (2010) recommended. The standard deviation is derived from the following equation (Haan 2002):
(5)
where Std.Dev is the sample standard deviation, is the sample mean of the uncertainty distribution.

Statistical assessment of the observed streamflow

The statistical summary (i.e., mean, median, standard deviation, Std.Dev, coefficient of variation, Cv, maximum, Qmax, minimum, Qmin, first-, second-, third-, and fourth-order autocorrelation coefficients, r1, r2, r3, r4) of the observed streamflow for both study watersheds are displayed in Table 4. In LCW, the overall streamflow difference between the calibration and validation periods is smaller than in UMRW. This study proposes using the streamflow of previous time steps as one of the predictors in the ANN models. Hence, autocorrelation coefficients of the streamflow up to the fourth order were calculated and are presented in Table 4. The autocorrelation decreases significantly as the lag increases for all modeling periods and watersheds. The autocorrelation is more substantial in the LCW during the calibration period. In the UMRW, the autocorrelation of the calibration period is much weaker than that of the validation period.

Table 4

Statistical summary of daily streamflow observations

WatershedTime periodStreamflow data (m3/s)
MeanMedianStd.DevCvQmaxQminr1r2r3r4
LCW Calibration 1.581 0.185 16.419 10.386 580.150 0.025 0.621 0.490 0.296 0.112 
Validation 1.486 0.198 10.474 7.046 277.623 0.018 0.390 0.071 0.076 0.079 
UMRW Calibration 6.358 1.837 65.366 10.280 2,943.200 0.009 0.552 0.342 0.267 0.094 
Validation 3.992 0.819 10.420 2.610 163.291 0.000 0.707 0.607 0.538 0.499 
WatershedTime periodStreamflow data (m3/s)
MeanMedianStd.DevCvQmaxQminr1r2r3r4
LCW Calibration 1.581 0.185 16.419 10.386 580.150 0.025 0.621 0.490 0.296 0.112 
Validation 1.486 0.198 10.474 7.046 277.623 0.018 0.390 0.071 0.076 0.079 
UMRW Calibration 6.358 1.837 65.366 10.280 2,943.200 0.009 0.552 0.342 0.267 0.094 
Validation 3.992 0.819 10.420 2.610 163.291 0.000 0.707 0.607 0.538 0.499 

In addition, the cumulative probability curve for the daily discharge of each year at Leon Creek and Upper Medina River is displayed in Figure 2. The discharge is converted into a log scale to demonstrate the difference among the data points. Overall, the measured discharge at Upper Medina River is significantly larger than that of Leon Creek during most of the study period. Nevertheless, the high flow points of Leon Creek are close to that of Upper Medina River, especially in the year 2001 (Figure 2(b)) and years 2006–2009 (Figure 2 (plots g to j)). During these years, the highest daily flow of Leon Creek exceeded that of the larger Upper Medina River. It seems possible that these observations are due to the extensive impervious cover in the urban LCW, which causes higher discharge peaks during flood events. Whereas storm runoff in the rural UMRW quickly recharges into the Edwards Aquifer, reducing the discharge peaks at Upper Medina River.
Figure 2

Observed daily streamflow of Leon Creek and Upper Medina River from 2000 to 2009 (Plot a through j).

Figure 2

Observed daily streamflow of Leon Creek and Upper Medina River from 2000 to 2009 (Plot a through j).

Close modal

SWAT model calibration and validation

The SUFI-2 algorithm is an iterative procedure that requires updating the parameter ranges after each iteration until model performance stabilizes. For both LCW and UMRW, five calibration iterations, each with 500 simulations, were run. The parameter range from the last calibration iteration was used without further change in the validation iteration, which was composed of 500 simulations as well. The calibrated parameter value range and the best-fitted parameter values for the validation period are summarized in Table 5.

Table 5

Calibrated SWAT parameter range and the best-fitted validation value

Hydrology input parameterDefault value in SWATCalibrated parameter range
Best-fitted validation value
LCWUMRWLCWUMRW
CN2a 35–98 (−15.5%, −8.3%) (12.0%, 16.4%) −14.7% 12.0% 
ALPHA_BF 0.048 (0.075, 0.224) (0.751, 0.882) 0.086 0.774 
GW_DELAY 31 (228, 420) (334, 411) 363 370 
GWQMN 1,000 (3,712, 4,378) (2,521, 3,496) 4,268 3,302 
GW_REVAP 0.02 (0.125, 0.150) (0.037, 0.076) 0.136 0.056 
REVAPMN 750 (331, 412) (85, 225) 348 154.2 
RCHRG_DP 0.05 (0.049, 0.159) (0.152, 0.286) 0.157 0.179 
SOL_AWCa 0.01–0.42 (−5.3%, 1.5%) (−3.4%, −1.4%) −2.6% −2.8% 
SOL_Ka 0–2,000 (−3.2%, −1.6%) (−4.4% −1.9%) −2.1% −4.4% 
ESCO 0.95 (0.862, 0.921) (0.745, 0.797) 0.872 0.796 
CANMX (65, 82) (79, 100) 79 81 
CH_K1 (77, 114) (9, 24) 107 23 
CH_K2 (43, 60) (111, 130) 51 114 
CH_N2 0.014 (0.010, 0.031) (0.247, 0.300) 0.014 0.273 
SURLAG (3.81, 9.75) (19.41, 24.00) 7.94 23.42 
Hydrology input parameterDefault value in SWATCalibrated parameter range
Best-fitted validation value
LCWUMRWLCWUMRW
CN2a 35–98 (−15.5%, −8.3%) (12.0%, 16.4%) −14.7% 12.0% 
ALPHA_BF 0.048 (0.075, 0.224) (0.751, 0.882) 0.086 0.774 
GW_DELAY 31 (228, 420) (334, 411) 363 370 
GWQMN 1,000 (3,712, 4,378) (2,521, 3,496) 4,268 3,302 
GW_REVAP 0.02 (0.125, 0.150) (0.037, 0.076) 0.136 0.056 
REVAPMN 750 (331, 412) (85, 225) 348 154.2 
RCHRG_DP 0.05 (0.049, 0.159) (0.152, 0.286) 0.157 0.179 
SOL_AWCa 0.01–0.42 (−5.3%, 1.5%) (−3.4%, −1.4%) −2.6% −2.8% 
SOL_Ka 0–2,000 (−3.2%, −1.6%) (−4.4% −1.9%) −2.1% −4.4% 
ESCO 0.95 (0.862, 0.921) (0.745, 0.797) 0.872 0.796 
CANMX (65, 82) (79, 100) 79 81 
CH_K1 (77, 114) (9, 24) 107 23 
CH_K2 (43, 60) (111, 130) 51 114 
CH_N2 0.014 (0.010, 0.031) (0.247, 0.300) 0.014 0.273 
SURLAG (3.81, 9.75) (19.41, 24.00) 7.94 23.42 

aIndicate percent change to existing parameter values.

Due to their spatial heterogeneity, the SCS runoff curve number (CN2) and soil parameters are adjusted through a percentage change. Similarly, groundwater (.gw) and general HRU parameters (.hru) have spatial heterogeneity to the level of HRUs, and subbasin (.sub) and routing parameters (.rte) have spatial heterogeneity to the level of subbasins. However, this study adjusted all other parameters through direct replacement since their default values are fixed across the entire watershed in SWAT 2012. It should be noted that only the watershed outflows were adopted in the model calibration process due to the limitation in accessible observed streamflow data. Meanwhile, the two study watersheds were divided into many HRUs. Applying only one value for some spatially distributed parameters across all HRUs can cause model over-parameterization and equifinality.

The groundwater (.gw) parameters were the focus of this study as they control the recharge speed into the Edwards Aquifer. The baseflow alpha factor (ALPHA_BF) is slightly increased from the default in LCW and significantly increased in UMRW. The larger ALPHA_BF indicates that the groundwater flow response to the changes in recharge is more rapid in UMRW. The groundwater delay time (GW_DELAY) was extended in LCW and UMRW, which indicates that water exits the soil profile and enters the shallow aquifer relatively slowly in both watersheds. Meanwhile, the threshold depth of water in the shallow aquifer required for return flow to occur (GWQMN) is significantly increased from its default value, possibly reflecting a large storage capacity of the shallow aquifer in both study watersheds, which is very common in karstic regions. A larger groundwater ‘revap’ coefficient was found in LCW than in UMRW, suggesting that water movement from the shallow aquifer to the root zone is faster in LCW. The fast water discharge is likely occurring through sinkholes in the region when the hydraulic head of groundwater is high. In addition, a relatively small deep aquifer percolation fraction (RCHRG_DP) was found in LCW and UMRW, indicating that only a tiny proportion of water in the root zone was recharged into the deep aquifer for our calibrated models. A possible explanation might be that the deep aquifer in this karstic region mainly receives fast recharge through the sinkholes. The deep aquifer has a relatively high hydraulic head, which reduces the amount of water recharge by percolating the soil layers.

ANN model selection

The three-layer feed-forward neural network structure contains an input, hidden, and output layer. All nodes are fully connected to nodes in their adjacent layers with links that contain weight and bias information. As mentioned in the methodology section, for all six prediction scenarios presented in Table 2, the hidden layer size was explored from 1 to 10. The RMSE of the cross-validation dataset is calculated to select the best model structure. The models with the smallest RMSE for each prediction scenario are displayed in Table 6 for LCW and Table 7 for UMRW. Among the six prediction scenarios, scenario 6 of LCW and scenario 5 of UMRW had the smallest cross-validation RMSE and were selected as the best model for each study watershed.

Table 6

Best ANN model structure and performance result for the LCW

Prediction scenarioHidden nodesTraining
Testing
Validation RMSE (m3/s)
NSEPBIAS (%)NSEPBIAS (%)
0.901 60.7 0.736 −16.9 2.247 
0.891 59.4 0.671 1.3 2.178 
0.908 68.3 0.760 −6.3 2.231 
0.882 73.2 0.759 −2.0 2.183 
0.901 57.7 0.770 −11.2 2.149 
0.887 56.2 0.756 −13.4 2.111 
Prediction scenarioHidden nodesTraining
Testing
Validation RMSE (m3/s)
NSEPBIAS (%)NSEPBIAS (%)
0.901 60.7 0.736 −16.9 2.247 
0.891 59.4 0.671 1.3 2.178 
0.908 68.3 0.760 −6.3 2.231 
0.882 73.2 0.759 −2.0 2.183 
0.901 57.7 0.770 −11.2 2.149 
0.887 56.2 0.756 −13.4 2.111 
Table 7

Best ANN model structure and performance result for the UMRW

Prediction scenarioHidden nodesTraining
Testing
Validation RMSE (m3/s)
NSEPBIAS (%)NSEPBIAS (%)
0.949 40.5 −0.076 −89.7 4.150 
0.961 8.8 −0.029 −90.3 4.238 
0.950 44.7 −0.077 −89.5 3.984 
0.670 3.2 0.294 −72.8 3.518 
0.892 3.7 0.316 −79.8 2.360 
0.863 −11.7 0.355 −79.1 2.658 
Prediction scenarioHidden nodesTraining
Testing
Validation RMSE (m3/s)
NSEPBIAS (%)NSEPBIAS (%)
0.949 40.5 −0.076 −89.7 4.150 
0.961 8.8 −0.029 −90.3 4.238 
0.950 44.7 −0.077 −89.5 3.984 
0.670 3.2 0.294 −72.8 3.518 
0.892 3.7 0.316 −79.8 2.360 
0.863 −11.7 0.355 −79.1 2.658 

The trained ANN models overall had very good performance in LCW (Table 6). All six scenarios had NSE values around 0.9 during the training period, although the PBIAS values were above 50%, indicating that streamflow is continuously overestimated during the training process. In the testing period, the NSE values ranged between 0.67 and 0.77, while all PBIAS values are below 25%, overall indicating very good streamflow estimation performance.

In UMRW (Table 7), the NSE and PBIAS performance for all six prediction scenarios in the training period is in the range of ‘good’ to ‘very good,’ with NSE ranging from 0.67 to 0.95 and PBIAS ranging from −11.7 to 44.7%. However, the predictive performance of the testing period is much worse. The NSE values were below 0 for scenarios 1 through 3, in which the observed mean is a better predictor than the model. The NSE values are near 0.3 for scenarios 4 through 6, near the boundary of unsatisfactory and satisfactory performance. The PBIAS of all six scenarios were below −70%, suggesting severe streamflow underestimation. As mentioned in the methodology section, prediction scenarios 4 through 6 included the streamflow of previous time steps as predictors, while scenarios 1 through 3 only used the meteorological data. It is apparent from Tables 6 and 7 that the inclusion of streamflow as one of the predictors did not cause a clear difference in the predictive performance in LCW but significantly improved the predictive performance in UMRW.

SWAT and ANN model performance comparison

The goodness-of-fit indicators, traditionally calculated and modified with the correction factor for SWAT and ANN models, are presented in Table 8. Similar to the results in Harmel et al. (2010), when the uncertainty level is low (Cv = 0.026), there was almost no noticeable improvement in the indicator values. However, when the more significant uncertainty level was applied (Cv = 0.256), both NSE and PBIAS values showed levels of improvement in the two study watersheds. Overall, incorporating the correction factors into the goodness-of-fit assessment yielded results consistent with the opinion of Harmel et al. (2010), as very good model performance did not improve considerably, while poor performance did not become satisfactory. In this study, the slightly improved values provided a more comprehensive assessment of the model performance.

Table 8

Statistical performance of SWAT and ANN models

WatershedModelIndicatorCalibration (training) (2000–2006)
Validation (testing) (2007–2009)
Traditional calculationCv = 0.026Cv = 0.256Traditional calculationCv = 0.026Cv = 0.256
LCW ANN NSE 0.89vg 0.89vg 0.91vg 0.76vg 0.76vg 0.79vg 
PBIAS (%) 56.2s 56.2s 56.4s −13.4vg −13.3vg −10.7vg 
SWAT NSE 0.90vg 0.90vg 0.92vg 0.72vg 0.72vg 0.74vg 
PBIAS (%) 49.6g 49.6g 48.9g 36.8g 36.8g 33.5g 
UMRW ANN NSE 0.89vg 0.89vg 0.96vg 0.32s 0.32s 0.34s 
PBIAS (%) 3.7vg 3.7vg 5.8vg −79.8u −79.8u −78.4u 
SWAT NSE 0.54g 0.54g 0.56g −0.02u −0.02u 0.07u 
PBIAS (%) 4.3vg 4.3vg 3.5vg 27.3g 27.3g 25.3g 
WatershedModelIndicatorCalibration (training) (2000–2006)
Validation (testing) (2007–2009)
Traditional calculationCv = 0.026Cv = 0.256Traditional calculationCv = 0.026Cv = 0.256
LCW ANN NSE 0.89vg 0.89vg 0.91vg 0.76vg 0.76vg 0.79vg 
PBIAS (%) 56.2s 56.2s 56.4s −13.4vg −13.3vg −10.7vg 
SWAT NSE 0.90vg 0.90vg 0.92vg 0.72vg 0.72vg 0.74vg 
PBIAS (%) 49.6g 49.6g 48.9g 36.8g 36.8g 33.5g 
UMRW ANN NSE 0.89vg 0.89vg 0.96vg 0.32s 0.32s 0.34s 
PBIAS (%) 3.7vg 3.7vg 5.8vg −79.8u −79.8u −78.4u 
SWAT NSE 0.54g 0.54g 0.56g −0.02u −0.02u 0.07u 
PBIAS (%) 4.3vg 4.3vg 3.5vg 27.3g 27.3g 25.3g 

Superscripts represent the performance levels, ‘vg’ – ‘very good’, ‘g’ – ‘good’, ‘s’ – ‘satisfactory’, ‘u’ – ‘unsatisfactory’.

In the urban LCW, both SWAT-LCW and ANN-LCW models produced very good simulation results, with the calibration (training) NSE reaching approximately 0.90 and validation (testing) NSE above 0.70. In addition, both SWAT-LCW and ANN-LCW models overestimated streamflow in the calibration (training) period. The SWAT-LCW model further overestimated the streamflow in the validation period, while the ANN model underestimated the streamflow in the testing period. Overall, the statistical indicators suggest that the ANN model slightly outperformed the SWAT in the urban LCW. These results corroborate the findings of some previous work comparing ANN and process-based approaches for streamflow modeling, which had often reported that the ANN models provided more accurate simulations using the criteria of statistical indicators (Srivastava et al. 2006; Jimeno-Sáez et al. 2018; Kassem et al. 2020).

In the rural UMRW, the ANN model performed significantly better than the SWAT model. The NSE performance of the training period was classified as very good for the ANN-UMRW model, using either the traditional calculation or the modified indicators. In addition, the testing NSE of the ANN-UMRW model was classified as satisfactory. In contrast, the calibration period SWAT-UMRW model only had good performance with NSE ranging from 0.54 to 0.56 from different uncertainty levels, and the validation period SWAT-UMRW model had unsatisfactory NSE performance on all uncertainty levels. Interestingly, the PBIAS statistics of the calibration (training) period for both SWAT-UMRW and ANN-UMRW models were classified as very good. However, the validation (testing) of PBIAS was similar to that in LCW, with ANN-UMRW underestimated streamflow and SWAT-UMRW made overestimation.

The calibration (training) results did not offer knowledge of the actual model predictive performance in either watershed. Hence, only the validation (testing) phase was considered for graphical analysis. The hydrographs with precipitation records (Figure 3) for the validation (testing) period (2007–2009) were created to analyze further the difference between observed daily streamflow and SWAT and ANN simulation results. In Figure 3, the annual validation (testing) period hydrograph for LCW was displayed in plots (a) through (c), and for UMRW was displayed in plots (d) through (f). In LCW, both SWAT and ANN models captured the timing of most major streamflow peaks except for one significant storm event in the late summer of 2008, before which a wet period with a relatively large precipitation volume was recorded. The SWAT-LCW model performed better during 2007 and 2009 and less accurately during 2008, with extended periods of deficient observed flow records. The ANN-LCW model had more consistent predictive performance throughout the testing period than the SWAT-LCW model.
Figure 3

Daily precipitation, observed, and simulated streamflow of year (a) 2007, (b) 2008, (c) 2009 for the LCW, and year (d) 2007, (e) 2008, and (f) 2009 for the UMRW.

Figure 3

Daily precipitation, observed, and simulated streamflow of year (a) 2007, (b) 2008, (c) 2009 for the LCW, and year (d) 2007, (e) 2008, and (f) 2009 for the UMRW.

Close modal

Additionally, the hydrographs in LCW show that the SWAT model overestimated the magnitude of almost every prominent streamflow peak in 2007 and 2009, while the magnitude of the ANN model predicted peak flow was much smaller. In the rural UMRW, SWAT and ANN models failed to capture the timing and magnitude of the significant storm events. More specifically, the ANN-UMRW model caught the timing of the storm events from 2007 to 2009 but often severely underestimated their magnitudes. On the other hand, the SWAT-UMRW model failed to capture the timing and magnitude of the storm events and falsely simulated a few nonexistent storm flow events. Additionally, the recession limbs simulated by SWAT-UMRW were much longer than those from the actual flow. It seems most likely that these results are due to the fast recharge through the sinkholes, which are found plenty in regions with karstic features.

Figure 4 presents the validation/testing scatter plots from 2007 to 2009 for LCW and UMRW with their corresponding correlation coefficients. The observed high and low flows were divided at the 5% exceedance probability. In LCW, the SWAT and ANN models performed well, predicting the top 5% of observed streamflow with a slight underestimation and correlation coefficient © of 0.86 and 0.90, respectively (Figure 4(a)). However, for the bottom 95% of streamflow, both SWAT and ANN performed poorly with weak correlations, in which the streamflow was hugely overestimated (Figure 4(b)). In the rural UMRW, the SWAT-UMRW and ANN-UMRW models performed poorly for low and high-flow conditions, resulting in weak correlations. However, the correlation coefficients of the ANN model were more significant than that of the SWAT model in both low- and high-flow conditions. In addition, the scatter plots (Figure 4(c) and 4(d)) suggest that SWAT-UMRW overestimated streamflow for low and high conditions, while the ANN-UMRW models underestimated predicted streamflow.
Figure 4

Scatter plots of validation(testing) period daily streamflow of (a) LCW high flow, (b) LCW low flow, (c) UMRW high flow, and (d) UMRW low flow.

Figure 4

Scatter plots of validation(testing) period daily streamflow of (a) LCW high flow, (b) LCW low flow, (c) UMRW high flow, and (d) UMRW low flow.

Close modal

The LCW and UMRW are situated above the Edwards BFZ Aquifer, with UMRW entirely located within the contribution zone and LCW located through the contributing, recharge, and artesian zone (Figure 1(a)). Karst geological features are prevalent in the Edwards BFZ Aquifer region, and a significant amount of surface runoff in the contributing zone infiltrates the water table aquifer (Loáiciga et al. 2000). In the rural UMRW, the karst terrain makes rainfall-runoff modeling particularly difficult. The observed streamflow time series did not correspond well with precipitation, mostly likely due to the fast infiltration or direct recharge through sinkholes. The water recharging into the Edwards BFZ Aquifer generally moves from west to east through large subterranean conduits (Loáiciga et al. 2000) and eventually discharges back to the surface through a few large springs far east outside of the watershed boundary of UMRW.

In the meantime, the two hydrological models employed in this study, especially the SWAT model, heavily relied on the input precipitation for estimating the watershed outflow. The daily streamflow simulation results suggest that the calibrated SWAT model failed to quantify the amount of groundwater recharge and discharge in UMRW, hence overestimating the watershed outflow. On the contrary, estimating watershed outflow in the urban LCW was much more successful for SWAT and ANN. These results are likely due to the extensive impervious surface in LCW, causing more direct surface runoff into the stream instead of recharging into the water table aquifer. Thereby, the rainfall-runoff process in the urban watershed was not evidently affected by the karst topography underneath. A few previous studies pointed out that the traditional SWAT models have generally not yielded satisfactory runoff estimates in karst watersheds, and some suggested applying additional modules or using the trace test method to quantify streamflow and infiltration (Spruill et al. 2000; Shao et al. 2019; Wang et al. 2019; Jakada & Chen 2020), which are beyond the scope of this study.

This study investigated watershed-level daily streamflow simulation using SWAT and ANN in two small watersheds in the San Antonio Region, where karst terrain is prevalent. This study set out to compare the efficacy of the SWAT and ANN models in estimating streamflow in the karstic watershed in central south Texas. The paired watershed approach assesses SWAT and ANN performance under different dominant land cover types. Additionally, we applied the correction factor approach to enhance model evaluation using modified goodness-of-fit indicators incorporating measurement and model uncertainty. The conclusions of this study are summarized as follows:

  • (1)

    Six ANN prediction scenarios were set up for the two study watersheds, and BlockedCV was applied to select the best neural network structure for each watershed. The model selection results (Tables 6 and 7) show that in urban LCW, the inclusion of previous time step streamflow as one of the predictors did not noticeably improve model performance. In contrast, in the rural UMRW, the model performance significantly improved when a time series autoregressive model structure using historical streamflow as input data was implemented.

  • (2)

    Considering the statistical performance and graphical comparison between the observed and simulated time series, both SWAT and ANN models made very good daily streamflow estimations in the urban LCW, while their performance in the rural UMRW was much less satisfactory. This discrepancy could be attributed to the extensive impervious surface in the urban watershed, causing more direct surface runoff into stream networks than in the rural watershed.

  • (3)

    The calibrated SWAT model has inferior performance in the rural UMRW, which is located within the contributing zone of Edwards BFZ Aquifer, where karst geological features are prevalent. This finding is consistent with a few previous studies, which suggested that the conventional SWAT model is less capable in a karst environment.

  • (4)

    The statistical indicators performance results for the standalone validation (testing) period suggest that the ANN model performed slightly better than the SWAT model in LCW and significantly better in UMRW. These findings, taken together with the fact that the data-driven ANN model has a short response time compared with SWAT, suggest that ANN is a better real-time simulator of streamflow, although it does not address the physical aspect of a hydrological system.

  • (5)

    Applying the correction factor approach to modify goodness-of-fit indicators to incorporate measurement and model uncertainty yielded similar results in the two study watersheds. When the uncertainty level was low (Cv = 0.026), the indicator values showed almost no visible improvement. However, when the more significant uncertainty level was applied (Cv = 0.256), both NSE and PBIAS values showed different levels of improvement for the ANN and SWAT models in both study watersheds.

This work was limited to hydrological simulations in small-sized watersheds in the San Antonio region. The study area was unique with its vast karstic groundwater aquifer with rapid recharge and discharge capabilities. An issue that was not addressed was whether the paired watershed study between urban and rural watersheds would yield closer model performance in an area without similar karst geological features. More importantly, although the results of this investigation show that ANNs achieved promising performance and outperformed the SWAT model by different margins, the most significant limitation lies in the fact that no spatially distributed watershed processes were considered for the ANNs in this work. Further research could usefully explore ANN forms that explicitly address the spatial distribution of hydrological processes within a watershed.

This manuscript is a chapter of the first author's dissertation.

X. M. and P. S. contributed significantly to developing the methodology applied in this study. X. M. conducted the model simulations and wrote the original draft. X. M., P. S. and J. L. analyzed the results and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

This research was supported by the Yunnan Provincial Department of Education basic scientific research fund under grant NO. 2023J0446 and NO. 2023Y0884.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abbaspour
K. C.
2011
SWAT-CUP4: SWAT Calibration and Uncertainty Programs – A User Manual
, Vol.
106
.
Swiss Federal Institute of Aquatic Science and Technology, Eawag, Dübendorf
.
Abbaspour
K. C.
,
Rouholahnejad
E.
,
Vaghefi
S.
,
Srinivasan
R.
,
Yang
H.
&
Kløve
B.
2015
A continental-scale hydrology and water quality model for Europe: Calibration and uncertainty of a high-resolution large-scale SWAT model
.
Journal of Hydrology
524
,
733
752
.
Arabi
M.
,
Govindaraju
R. S.
&
Hantush
M. M.
2007
A probabilistic approach for analysis of uncertainty in the evaluation of watershed management practices
.
Journal of Hydrology
333
(
2–4
),
459
471
.
Arnold
J.
,
Kiniry
J.
,
Srinivasan
R.
,
Williams
J.
,
Haney
E.
&
Neitsch
S.
2012
Soil and Water Assessment Tool, Input/Output Documentation Version 2012
.
Texas Water Resources Institute
,
College Station, TX
.
ASABE
2017
Guidelines for Calibrating, Validating, and Evaluating Hydrologic and Water Quality (H/WQ) Models
.
ASCE
2000a
Artificial neural networks in hydrology. I: Preliminary concepts
.
Journal of Hydrologic Engineering
5
(
2
),
115
123
.
ASCE
2000b
Artificial neural networks in hydrology. II: Hydrologic applications
.
Journal of Hydrologic Engineering
5
(
2
),
124
137
.
Beaumont
C.
1979
Stochastic models in hydrology
.
Progress in Physical Geography
3
(
3
),
363
391
.
Bergmeir
C.
&
Benítez
J. M.
2012
On the use of cross-validation for time series predictor evaluation
.
Information Sciences
191
,
192
213
.
Beven
K. J.
2011
Rainfall-runoff Modelling: the Primer
.
John Wiley & Sons
,
Hoboken, NJ
.
Brirhet
H.
&
Benaabidate
L.
2016
Comparison of two hydrological models (lumped and distributed) over a pilot area of the Issen watershed in the Souss basin, Morocco
.
European Scientific Journal
12
(
18
),
347
358
.
Chen
M.
,
Gassman
P. W.
,
Srinivasan
R.
,
Cui
Y.
&
Arritt
R.
2020
Analysis of alternative climate datasets and evapotranspiration methods for the Upper Mississippi River Basin using SWAT within HAWQS
.
Science of the Total Environment
270
,
137562
.
Daly
C.
,
Halbleib
M.
,
Smith
J. I.
,
Gibson
W. P.
,
Doggett
M. K.
,
Taylor
G. H.
,
Curtis
J.
&
Pasteris
P. P.
2008
Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States
.
International Journal of Climatology
28
(
15
),
2031
2064
.
Demirel
M. C.
,
Venancio
A.
&
Kahya
E.
2009
Flow forecast by SWAT model and ANN in Pracana basin, Portugal
.
Advances in Engineering Software
40
(
7
),
467
473
.
Donate
J. P.
,
Cortez
P.
,
SáNchez
G. G.
&
De Miguel
A. S.
2013
Time series forecasting using a weighted cross-validation evolutionary artificial neural network ensemble
.
Neurocomputing
109
,
27
32
.
Elhassan
A.
,
Xie
H.
,
Al-othman
A. A.
,
Mcclelland
J.
&
Sharif
H. O.
2016
Water quality modelling in the San Antonio River Basin driven by radar rainfall data
.
Geomatics, Natural Hazards and Risk
7
(
3
),
953
970
.
Fatichi
S.
,
Vivoni
E. R.
,
Ogden
F. L.
,
Ivanov
V. Y.
,
Mirus
B.
,
Gochis
D.
&
Ebel
B.
2016
An overview of current applications, challenges, and future trends in distributed process-based models in hydrology
.
Journal of Hydrology
537
,
45
60
.
Femeena
P. V.
,
Karki
R.
,
Cibin
R.
&
Sudheer
K.
2022
Reconceptualizing HRU threshold definition in the soil and water assessment tool
.
JAWRA Journal of the American Water Resources Association
58
(
4
),
508
516
.
Gassman
P. W.
,
Reyes
M. R.
,
Green
C. H.
&
Arnold
J. G.
2007
The soil and water assessment tool: historical development, applications, and future research directions
.
Transactions of the ASABE
50
(
4
),
1211
1250
.
Gazzaz
N. M.
,
Yusoff
M. K.
,
Aris
A. Z.
,
Juahir
H.
&
Ramli
M. F.
2012
Artificial neural network modeling of the water quality index for Kinta River (Malaysia) using water quality variables as predictors
.
Marine Pollution Bulletin
64
(
11
),
2409
2420
.
Gorelick
N.
,
Hancher
M.
,
Dixon
M.
,
Ilyushchenko
S.
,
Thau
D.
&
Moore
R.
2017
Google Earth engine: planetary-scale geospatial analysis for everyone
.
Remote Sensing of Environment
202
,
18
27
.
Govindaraju
R. S.
&
Rao
A. R.
2013
Artificial Neural Networks in Hydrology
, Vol.
36
.
Springer Science & Business Media
,
Dordrecht
.
Haan
C.
2002
Statistical Methods in Hydrology
, 2nd edn.
Iowa State Press
,
Ames, IA
, p.
496
.
Harmel
D.
,
Smith
P.
,
Migliaccio
K.
,
Chaubey
I.
,
Douglas-Mankin
K.
,
Benham
B.
,
Shukla
S.
,
Muñoz-Carpena
R.
&
Robson
B. J.
2014
Evaluating, interpreting, and communicating performance of hydrologic/water quality models considering intended use: A review and recommendations
.
Environmental Modelling & Software
57
,
40
51
.
Hastie
T.
,
Tibshirani
R.
&
Friedman
J.
2009
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
.
Springer Science & Business Media
,
New York, NY
.
Her
Y.
,
Frankenberger
J.
,
Chaubey
I.
&
Srinivasan
R.
2015
Threshold effects in HRU definition of the soil and water assessment tool
.
Transactions of the ASABE
58
(
2
),
367
378
.
Hu
T.
,
Lam
K.
&
Ng
S.
2001
River flow time series prediction with a range-dependent neural network
.
Hydrological Sciences Journal
46
(
5
),
729
745
.
Islam
Z.
2011
A Review on Physically Based Hydrologic Modeling
.
University of Alberta
,
Edmonton, AB
,
Canada
.
Jakada
H.
&
Chen
Z.
2020
An approach to runoff modelling in small karst watersheds using the SWAT model
.
Arabian Journal of Geosciences
13
(
8
),
1
18
.
Jayakrishnan
R.
,
Srinivasan
R.
,
Santhi
C.
&
Arnold
J.
2005
Advances in the application of the SWAT model for water resources management
.
Hydrological Processes: An International Journal
19
(
3
),
749
762
.
Jimeno-Sáez
P.
,
Senent-Aparicio
J.
,
Pérez-Sánchez
J.
&
Pulido-Velazquez
D.
2018
A comparison of SWAT and ANN models for daily runoff simulation in different climatic zones of peninsular Spain
.
Water
10
(
2
),
192
.
Kalin
L.
,
Isik
S.
,
Schoonover
J. E.
&
Lockaby
B. G.
2010
Predicting water quality in unmonitored watersheds using artificial neural networks
.
Journal of Environmental Quality
39
(
4
),
1429
1440
.
Karunanithi
N.
,
Grenney
W. J.
,
Whitley
D.
&
Bovee
K.
1994
Neural networks for river flow prediction
.
Journal of Computing in Civil Engineering
8
(
2
),
201
220
.
Kassem
A. A.
,
Raheem
A. M.
,
Khidir
K. M.
&
Alkattan
M.
2020
Predicting of daily Khazir basin flow using SWAT and hybrid SWAT-ANN models
.
Ain Shams Engineering Journal
11
(
2
),
435
443
.
Kim
M.
,
Baek
S.
,
Ligaray
M.
,
Pyo
J.
,
Park
M.
&
Cho
K.
2015
Comparative studies of different imputation methods for recovering streamflow observation
.
Water
7
(
12
),
6847
6860
.
Kuhn
M.
&
Johnson
K.
2013
Applied Predictive Modeling
, Vol.
26
.
Springer
,
New York, NY
.
Licciardello
F.
,
Rossi
C.
,
Srinivasan
R.
,
Zimbone
S.
&
Barbagallo
S.
2011
Hydrologic evaluation of a Mediterranean watershed using the SWAT model with multiple PET estimation methods
.
Transactions of the ASABE
54
(
5
),
1615
1625
.
Loáiciga
H.
,
Maidment
D.
&
Valdes
J. B.
2000
Climate-change impacts in a regional karst aquifer, Texas, USA
.
Journal of Hydrology
227
(
1–4
),
173
194
.
Mei
X.
,
Smith
P. K.
,
Li
J.
&
Li
B.
2022
Hydrological evaluation of gridded climate datasets in a Texas urban watershed using soil and water assessment tool and artificial neural network
.
Frontiers in Environmental Science
10
,
1128
.
Milly
P. C.
,
Betancourt
J.
,
Falkenmark
M.
,
Hirsch
R. M.
,
Kundzewicz
Z. W.
,
Lettenmaier
D. P.
&
Stouffer
R. J.
2008
Stationarity is dead: Whither water management?
Science
319
(
5863
),
573
574
.
Minns
A.
&
Hall
M.
1996
Artificial neural networks as rainfall-runoff models
.
Hydrological Sciences Journal
41
(
3
),
399
417
.
Moradkhani
H.
&
Sorooshian
S.
2009
General review of rainfall-runoff modeling: Model calibration, data assimilation, and uncertainty analysis
. In:
Hydrological Modelling and the Water Cycle
(Sorooshian, S., Hsu, K-L., Coppola, E., Tomassetti, B., Verdecchia, M. & Visconti, G., eds.)
.
Springer
,
New York, NY
, pp.
1
24
.
Moriasi
D. N.
,
Arnold
J. G.
,
Van Liew
M. W.
,
Bingner
R. L.
,
Harmel
R. D.
&
Veith
T. L.
2007
Model evaluation guidelines for systematic quantification of accuracy in watershed simulations
.
Transactions of the ASABE
50
(
3
),
885
900
.
Neitsch
S. L.
,
Arnold
J. G.
,
Kiniry
J. R.
&
Williams
J. R.
2011
Soil and Water Assessment Tool Theoretical Documentation Version 2009
.
Olivera
F.
,
Valenzuela
M.
,
Srinivasan
R.
,
Choi
J.
,
Cho
H.
,
Koka
S.
&
Agrawal
A.
2006
ARCGIS-SWAT: A geodata model and GIS interface for SWAT
.
Journal of the American Water Resources Association
42
(
2
),
295
309
.
R Core Team
2019
R: A Language and Environment for Statistical Computing: R Foundation for Statistical Computing
.
Rowe
W. D.
1994
Understanding uncertainty
.
Risk Analysis
14
(
5
),
743
750
.
Spruill
C. A.
,
Workman
S. R.
&
Taraba
J. L.
2000
Simulation of daily and monthly stream discharge from small watersheds using the SWAT model
.
Transactions of the ASAE
43
(
6
),
1431
.
Srinivasan
R.
&
Arnold
J. G.
1994
Integration of a basin-scale water quality model with GIS 1
.
Journal of the American Water Resources Association
30
(
3
),
453
462
.
Srivastava
P.
,
McNair
J. N.
&
Johnson
T. E.
2006
Comparison of process-based and artificial neural network approaches for streamflow modeling in an agricultural watershed
.
Journal of the American Water Resources Association
42
(
3
),
545
563
.
Tuppad
P.
,
Douglas-Mankin
K.
,
Lee
T.
,
Srinivasan
R.
&
Arnold
J.
2011
Soil and water assessment tool (SWAT) hydrologic/water quality model: Extended capability and wider adoption
.
Transactions of the ASABE
54
(
5
),
1677
1684
.
USDA-NRCS
2014
Geospatial Data Gateway
.
U.S. Geological Survey
2016
National Water Information System Data Available on the World Wide Web (USGS Water Data for the Nation)
.
Worland
S. C.
,
Steinschneider
S.
,
Asquith
W.
,
Knight
R.
&
Wieczorek
M.
2019
Prediction and inference of flow duration curves using multioutput neural networks
.
Water Resources Research
55
(
8
),
6850
6868
.
Yang
L.
,
Jin
S.
,
Danielson
P.
,
Homer
C.
,
Gass
L.
,
Bender
S. M.
,
Case
A.
,
Costello
C.
,
Dewitz
J.
&
Fry
J.
2018
A new generation of the United States national land cover database: Requirements, research priorities, design, and implementation strategies
.
ISPRS Journal of Photogrammetry and Remote Sensing
146
,
108
123
.
Yaseen
Z. M.
,
El-Shafie
A.
,
Jaafar
O.
,
Afan
H. A.
&
Sayl
K. N.
2015
Artificial intelligence based models for stream-flow forecasting: 2000–2015
.
Journal of Hydrology
530
,
829
844
.
Yaseen
Z. M.
,
Sulaiman
S. O.
,
Deo
R. C.
&
Chau
K.-W.
2018
An enhanced extreme learning machine model for river flow forecasting: state-of-the-art, practical applications in water resource engineering area and future research direction
.
Journal of Hydrology
569
,
387
408
.
Zakizadeh
H.
,
Ahmadi
H.
,
Zehtabian
G.
,
Moeini
A.
&
Moghaddamnia
A.
2020
A novel study of SWAT and ANN models for runoff simulation with application on dataset of metrological stations
.
Physics and Chemistry of the Earth, Parts A/B/C
120
,
102899
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).