Abstract
This study evaluated the performance and hydrologic utility of four satellite precipitation datasets (SPDs), namely GPM (IMERG_F), PERSIANN_CDR, CHIRPS, and CMORPH, for predicting daily streamflow and sediment load (SL) using the SWAT hydrological model as well as SWAT-coupled soft computing models (SCMs), i.e., artificial neural networks (SWAT-ANN), random forests (SWAT-RF), and support vector regression (SWAT-SVR), in the mountainous Upper Jhelum River Basin (UJRB), Pakistan. The SCMs were developed using the outputs of un-calibrated SWAT models to improve the predictions. Overall, GPM showed the highest performance for the entire simulation, with R2 and PBIAS varying from 0.71 to 0.96 and −13.1 to 0.01%, respectively. Among the GPM-based models, SWAT-RF showed a superior ability to simulate streamflow, with an R2 of 0.96, compared with SWAT-ANN (R2 = 0.90), SWAT-SVR (R2 = 0.87), and SWAT-CUP (R2 = 0.71). Similarly, SWAT-ANN performed best in simulating the SL, with an R2 of 0.71, compared with SWAT-RF (R2 = 0.66), SWAT-SVR (R2 = 0.52), and SWAT-CUP (R2 = 0.42). Hence, hydrological models coupled with SCMs and driven by SPDs could be an effective technique for simulating hydrological parameters, particularly in complex terrain where gauge network density is low or uneven.
HIGHLIGHTS
Development of soft computing models using the outputs of un-calibrated SWAT models to improve the prediction of daily streamflow and sediment load in rivers.
Effectiveness of the hydrological coupled soft computing models based on satellite precipitation datasets for simulating hydrological parameters.
Auto-optimization of different sensitive parameters of the soft computing models to improve predictions.
INTRODUCTION
Streamflow and sediment transportation are the most critical variables of water resources management. The accurate prediction and estimation of these variables are required for assessing future floods, reservoir life and management, development of agricultural policies, and operational planning of the hydraulic structures. However, finding accurate methods to compute precise streamflow and sediment load in the fluvial network is a challenging task because of the highly complex and non-linear features of the hydrological/hydro-sedimentological processes affected by various climatic, hydrographic, and anthropogenic factors in the river basin (Zounemat-Kermani et al. 2020).
In hydrological modelling, accurate precipitation (P) data with high spatial and temporal resolution are essential (Rahman et al. 2020; Talchabhadel et al. 2021). However, the spatial variability of P and the lack of ground-based gauge observations, particularly over topographically complex basins, adversely affect rainfall-runoff simulation (Gourley & Vieux 2006). Various methods have been applied to estimate P, each with its own advantages and disadvantages. Generally, ground-based rain gauges (RGs) and weather radar (WR) are installed to record P directly, but how well they capture the spatial variability of P over time depends strongly on the gauge network density (Yan & Bardossy 2019; Nanding et al. 2021). RGs and WR are sparsely distributed in mountainous areas, making it hard to record the spatiotemporal variability in P (Ma et al. 2018). Likewise, technical problems in developing countries, such as data sharing for transboundary river basins, pose substantial challenges in collecting gauge P data (Thu & Wehn 2016). To overcome these limitations, researchers have developed various global and regional gridded precipitation datasets (GPDs) derived from reanalysis and remote sensing data.
Remote sensing offers satellite-based precipitation datasets (SPDs) of different temporal and spatial resolutions with long archives. The SPDs provide uninterrupted records of P at local and global scales. Furthermore, SPDs alleviate the data scarcity problems associated with sparsely distributed RGs or ungauged areas (Hong et al. 2019). Nevertheless, no single SPD consistently produces superior results worldwide; therefore, SPDs must be evaluated before being applied to any study area. SPDs such as the Global Precipitation Measurement (GPM)-based Integrated Multi-Satellite Retrievals for GPM (IMERG), Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks – Climate Data Records (PERSIANN-CDR), Climate Hazards Group Infrared Precipitation with Station data (CHIRPS), Climate Prediction Center MORPHing (CMORPH), and many others offer P datasets with high spatiotemporal resolution. Moreover, these datasets are freely available for scientific research, including rainfall-runoff modelling.
Various hydrological models with varying degrees of complexity have been developed to compute the rainfall-runoff relationship and predict runoff and sediment load. Numerous hydrological models based on empirical, physical, and theoretical approaches exist in the literature to compute the runoff and sediment transport processes in the fluvial network (Hajigholizadeh et al. 2018; Jaiswal et al. 2020; Borrelli et al. 2021). These models are helpful in predicting the runoff and sediment concentration produced under various climatic conditions, land use and land cover changes, and different management strategies (Ndulue et al. 2015; Daramola et al. 2022). Generally, these models require intensive data inputs and can ultimately support decision-making at the local or regional levels for urban planning and watershed management.
Numerous studies led by different researchers worldwide have determined that the physical-based, semi-distributed SWAT (Soil and Water Assessment Tool) model provides acceptable estimation and prediction of the runoff and sediment load (Worku et al. 2017; Tan et al. 2019; Awasthi & Kumar 2021). In a physical-based model, runoff and sediments generated by different natural processes in a river basin are simulated based on the mass and energy conservation laws. They provide an understanding of river basin processes, which is helpful for evaluating the impacts of soil and water conservation measures (Singh et al. 2014). However, the drawback of these models is that they need extensive input data such as topographical, hydro-meteorological, land use, soil, and crop management data for the model development. Moreover, calibrating the several parameters of these models is highly complicated due to the non-linearity of the runoff and sediment transport processes (Zhihua et al. 2020; Jimeno-Sáez et al. 2022). Therefore, they demand proficient background knowledge and high computational time compared with data-driven models. Due to the inherent drawbacks of these models, more reliable approaches are required for assessing runoff and sedimentation.
Over the past two decades, data-driven models have emerged as the prevailing tool for the computation of highly complex and non-linear processes. These models provide a comparatively flexible and fast technique to understand the complexity and non-linearity of the system by computing the values of the objective function. The data-driven models show excellent competency in modelling complex and non-linear characteristics of the hydro-meteorological time series without prior knowledge of fundamental processes (Khazaee Poul et al. 2019). These models, such as computational intelligence (CI), data mining (DM), soft computing (SC), artificial intelligence (AI), and machine learning (ML) analyse the complex system and develop the functional connection between input and output variables without considering the analytical and physical relationship of the objective system.
Recent studies have applied different data-driven models for computing and forecasting different hydrological variables such as streamflow (Kassem et al. 2020; Khodakhah et al. 2022), reservoir water level (Hrnjica & Bonacci 2019), reservoir evaporation (Zhang et al. 2018; Yuan et al. 2022), water quality (Khadr & Elshemy 2017), and suspended sediment transport in rivers (Khan et al. 2021; Jimeno-Sáez et al. 2022). Khan et al. (2021) compared the performance of various conventional sediment-rating curves (SRC) and soft computing approaches such as local linear regression (LLR) and artificial neural network (ANN) model for assessing the daily suspended sediment load (SSL) based on the current and previous streamflows and former sediment data. Khan et al. (2020) predicted the missing and future daily SSL with the ANN model. They used it as the boundary condition of the hydraulic model to compute the precise thalweg levels and delta advancement rate of the Mangla reservoir, Pakistan.
SWAT and SC approaches have each been employed independently to model runoff and sediment transport. However, few studies have compared both approaches. Jimeno-Sáez et al. (2022) compared two ML algorithms, M5P and random forest (RF), with the SWAT model for the estimation of SSL in the Oskotz river basin and concluded that the RF and M5P models are more suitable than the SWAT model for computing SSL. Sirabahenda et al. (2020) compared SWAT and the adaptive neuro-fuzzy inference system (ANFIS) for estimating runoff and SSL and concluded that the ANFIS model outperforms the SWAT model. Pradhan et al. (2020) concluded that the ANN is a valuable alternative to the SWAT model for predicting SSL. Koycegiz & Buyukyildiz (2019) compared the SWAT model with ANN and support vector regression (SVR) for the estimation of streamflow and obtained better performance with the data-driven models (ANN and SVR).
This study mainly evaluates the performance of a hydrological model (i.e., SWAT) and of hybrid hydrological and data-driven models (i.e., SWAT-ANN, SWAT-RF, and SWAT-SVR) by applying four different SPDs (i.e., GPM (IMERG_F), PERSIANN_CDR, CHIRPS, and CMORPH) for the precise estimation of streamflow and SSL of the Upper Jhelum River basin (UJRB; Mangla Dam Watershed); the Jhelum is one of the three major rivers of the Indus basin irrigation system (IBIS). This study also evaluates the capabilities of the hybrid models for predicting streamflow and SSL by employing the un-calibrated output variables of the SWAT model (i.e., without optimization of the hydrological model parameters) as inputs for the data-driven models. In addition, it evaluates the performance of the hybrid models where high-quality in-situ P measurements are not available. The accuracy of these models was assessed for the period 2000–2013 using the most widely applied statistical parameters. To the best of our knowledge, no previous research has compared the SWAT hydrological model with these hybrid models by applying different SPDs.
STUDY SITE
Location map of UJRB and spatial distribution of meteorological stations.
Two precipitation systems govern the hydro-climatology of the UJRB: western disturbances and the monsoon, which prevail in the northwestern and southern parts of the basin, respectively. The mean annual precipitation of the northwestern and southern parts of the UJRB is 846 and 1,893 mm, respectively. The average yearly minimum (Tmin) and maximum (Tmax) temperatures of the UJRB are 0.3 °C (Dec–Jan) and 29.5 °C (Jun–Jul), respectively. Temperature drops as elevation rises (from south to north), but P does not follow any particular trend in such complex topography. The mean annual discharge at Azad Pattan station is 833 m3/s. About 75% of the overall inflow into the Mangla reservoir occurs between March and August, primarily due to monsoon rains with a small amount of snowmelt. River runoff increases from March to April and peaks in May–July owing to monsoon rain (June–August) and snowmelt. The average annual P of the climatic stations over the UJRB and their characteristics are listed in Table 1.
Inventory of climate stations
Sr. No. | Station Name | Longitude (°E) | Latitude (°N) | Elevation (m a.s.l.) | Precipitation (mm/year) |
---|---|---|---|---|---|
1 | Naran | 73.65 | 34.9 | 2,362 | 1,154 |
2 | Balakot | 73.35 | 34.55 | 995.4 | 1,701 |
3 | Bagh | 73.77 | 33.98 | 1,067 | 1,422 |
4 | Garidoptta | 73.62 | 34.22 | 814 | 1,567 |
5 | Gujarkhan | 73.13 | 33.25 | 458 | 830 |
6 | Kotli | 73.9 | 33.52 | 614 | 1,245 |
7 | Muzaffarabad | 73.47 | 34.37 | 702 | 1,423 |
8 | Mangla | 73.63 | 33.121 | 282 | 1,779 |
9 | Murree | 73.38 | 33.91 | 2,213 | 889 |
10 | Plandri | 73.71 | 33.72 | 1,402 | 1,443 |
11 | Rawalakot | 73.77 | 33.86 | 1,676 | 1,397 |
12 | Kupwara | 74.25 | 34.51 | 1,609 | 1,245 |
13 | Srinagar | 74.83 | 34.08 | 1,587 | 721 |
14 | Qazigund | 75.08 | 33.58 | 1,690 | 1,290 |
15 | Gulmarg | 74.375 | 34.189 | 2,705 | 1,121 |
16 | Pahalgam | 75.33 | 34 | 2,310 | 1,204 |
DATA DESCRIPTION
Observed data
The daily observed meteorological data (P, Tmax, and Tmin) from sixteen climate stations for the period of 1995–2007 were obtained from the Pakistan Meteorological Department (PMD), the Water and Power Development Authority (WAPDA) Pakistan, and the Indian Meteorological Department (IMD). The meteorological data of the Kupwara, Srinagar, Qazigund, Gulmarg, and Pahalgam stations were extracted from the daily gridded P and temperature datasets derived from a dense network of IMD meteorological stations (Pai et al. 2014). The remaining station data were obtained from PMD and WAPDA and used as a reference dataset for the bias correction (BC) of the SPDs. The daily observed mean flow (Q) and SSL data from 1998 to 2013 at the Azad Pattan gauging station were obtained from WAPDA. Furthermore, temperature, wind speed, humidity, and solar radiation data from the Climate Forecast System Reanalysis (CFSR) dataset were employed for the hydrological simulations.
Satellite-based Precipitation Datasets (SPDs)
This study employs daily-based P data from four selected SPDs whose description is presented in Table 2.
GPM (IMERG) final run V06, PERSIANN-CDR, CHIRPS V2.0, and CMORPH V1.0
Dataset Name | Spatial resolution | Temporal resolution | Coverage | Period | References |
---|---|---|---|---|---|
GPM (IMERG_F) | 0.1° | 30 min and daily | 60°S–60°N | 2000–recent | Hou et al. (2014) |
PERSIANN_CDR | 0.25° | Daily | 60°S–60°N | 1983–recent | Ashouri et al. (2015) |
CHIRPS | 0.05° | Daily | 50°S–50°N | 1981–recent | Funk et al. (2015) |
CMORPH | 0.25° | 30 min and daily | 60°S–60°N | 1998–recent | Joyce et al. (2004) |
The Integrated Multi-satellitE Retrievals for GPM (IMERG) algorithm combines information from the GPM satellite constellation to estimate P over most of the Earth's surface. This dataset is particularly pertinent to regions with sparsely distributed RGs and to ungauged parts of the Earth's surface. The new version, IMERG V06, combines P previously estimated by the Tropical Rainfall Measuring Mission (TRMM) satellite (2000–2015) with recent P estimates from the Global Precipitation Measurement (GPM) satellite (2014–present). The GPM satellite carries two primary sensors: first, the GPM Microwave Imager (GMI), which observes P intensity, type, and size; and second, the Dual-frequency Precipitation Radar (DPR), which measures the inner structure of storms within and under the clouds. In addition, this information serves as a reference standard for combining P measurements from other satellites within the same constellation (Huffman et al. 2015). In this study, the IMERG_F V06 P data at a spatial resolution of 0.1° for 2000–2013 were downloaded from the following website: https://giovanni.gsfc.nasa.gov/giovanni/.
The PERSIANN_CDR provides daily rainfall computed at a spatial resolution of 0.25° in the latitude band 60°S–60°N from 1983 to the near present. The PERSIANN algorithm estimates P using the infrared satellite data from GridSat-B1. The bias-corrected PERSIANN-CDR data are then computed by training an artificial neural network using infrared satellite data and stage IV P data from the National Centers for Environmental Prediction (NCEP) (Knapp 2008). In this study, the bias-corrected PERSIANN-CDR data for 1995–2013 were downloaded from the following website: https://app.climateengine.org/climateEngine.
CHIRPS V2.0 is a merged P product which combines pentad P data. It creates gridded rainfall time series for trend analysis and seasonal drought monitoring by combining three datasets: in-house climatology, CHPclim, 0.05° resolution satellite imagery, and in-situ station data (Funk et al. 2015). In this study, daily P data of CHIRPS V2.0 at a spatial resolution of 0.05° for the period of 1995–2013 were downloaded from: https://app.climateengine.org/climateEngine.
CMORPH uses satellite passive microwave surveillance data corrected by high-resolution IR imagery to estimate P. The former version of the CMORPH is a pure satellite precipitation product based solely on observation from the satellite. In the latest CMORPH V1.0, BC was applied by comparing satellite and daily rain gauge estimates (Joyce et al. 2004). In this study, daily P data of CMORPH V1.0 with a spatial resolution of 0.25° for the period of 1995–2013 were downloaded from: https://app.climateengine.org/climateEngine.
Spatial data
In hydrological analysis, the topography is used to delineate watersheds, develop drainage patterns, and explore topographical features. It affects the flow rate and flow direction over the land surface. In this study, the UJRB area was delineated by ARCSWAT, using a 30 × 30 m Digital Elevation Model (DEM) from the Shuttle Radar Topography Mission (SRTM) of National Aeronautics and Space Administration (NASA).
Figure 2(b) depicts the soil map of the UJRB. The soil data were acquired from the Harmonized World Soil Database (HWSD) of the Food and Agriculture Organization (FAO) of the United Nations and classified into different soil classes. The dominant soil group is Gleyic Solonchaks, which covers around 49% of the basin area and falls in the high mountainous region. The remaining soil groups are Calcaric Phaeozems (23%), found in the mid ranges, and Mollic Planosols (21%), found in the northern valley areas.
METHODOLOGY
In this study, different hybrid models SWAT-ANN, SWAT-RF, and SWAT-SVR, based on the SWAT model, were developed to improve the daily streamflow and sediment load prediction. These hybrid models are based on the SWAT hydrological model, which requires highly precise P data with a high spatial and temporal resolution. Therefore, bias-corrected P data from four freely available SPDs (i.e., GPM (IMERG_F), PERSIANN_CDR, CHIRPS, and CMORPH) were fed into the SWAT model in order to obtain precise results. The SWAT model was then calibrated and validated (through parameter optimization) using SUFI-2 (Sequential Uncertainty Fitting version 2) in SWAT-CUP (SWAT Calibration and Uncertainty Program).
Similarly, hybrid models were developed by using the output variables of the un-calibrated SWAT model (i.e., a model whose parameter values had not been optimized) as inputs. Before training, the inputs were pre-processed according to each hybrid model's requirements for better prediction. In the next step, the hybrid models were trained and tested using randomly selected 70 and 30% subsets of the data, respectively.
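The random 70/30 partition described above can be illustrated with a minimal sketch. The function name `random_split`, the fixed seed, and the synthetic `records` list are assumptions made for illustration only; the study does not report how the random selection was implemented.

```python
import random

def random_split(samples, train_fraction=0.7, seed=42):
    """Shuffle a copy of `samples` and split it into training and testing parts."""
    rng = random.Random(seed)          # fixed seed (assumption) for reproducibility
    shuffled = list(samples)           # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical daily records: (day index, input features, observed target)
records = [(day, [float(day), day * 0.5], day * 2.0) for day in range(100)]
train, test = random_split(records)
```

Because the days are drawn at random rather than as contiguous blocks, the training and testing subsets both sample the full range of flow conditions, at the cost of ignoring the temporal ordering of the series.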
Bias Correction (BC)
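The local intensity scaling (LOCI) method of BC was employed, using historical observations to correct the biases in the SPDs. The LOCI method proposed by Schmidli et al. (2006) rectifies the biases in the mean, wet-day frequency, and wet-day intensity of the P time series in three steps: (1) a wet-day threshold is determined for the satellite series so that its wet-day frequency matches that of the observations; (2) a scaling factor is computed from the mean wet-day intensities of the observed and satellite series; and (3) the satellite series is rescaled with this factor, with days below the threshold set to dry.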
According to Schmidli et al. (2006), the corrected control and scenario P series have the same mean, wet-day frequency, and intensity as the observed period.
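A minimal sketch of the LOCI idea for a single pair of series is given below. It is not the authors' implementation: the function name `loci_correct`, the 1 mm observed wet-day threshold, and the sample values are assumptions, and in practice the correction is usually derived separately for each calendar month and each grid cell or station.

```python
def loci_correct(obs, sat, obs_wet_threshold=1.0):
    """Local intensity scaling of a satellite series against gauge observations.

    obs, sat: equal-length lists of daily precipitation (mm) for the same days.
    obs_wet_threshold: wet-day threshold for the observations (assumed 1 mm).
    """
    # Step 1: pick the satellite wet-day threshold so that the satellite series
    # has as many wet days as the observed series (ties may add a few days).
    n_wet_obs = sum(1 for p in obs if p >= obs_wet_threshold)
    if n_wet_obs == 0:
        return [0.0] * len(sat)
    sat_threshold = sorted(sat, reverse=True)[n_wet_obs - 1]

    # Step 2: scaling factor from the mean wet-day intensities above the thresholds.
    obs_wet = [p for p in obs if p >= obs_wet_threshold]
    sat_wet = [p for p in sat if p >= sat_threshold]
    mean_obs = sum(obs_wet) / len(obs_wet)
    mean_sat = sum(sat_wet) / len(sat_wet)
    excess_sat = mean_sat - sat_threshold
    scale = (mean_obs - obs_wet_threshold) / excess_sat if excess_sat > 0 else 1.0

    # Step 3: rescale wet days; days below the satellite threshold become dry.
    return [
        obs_wet_threshold + scale * (p - sat_threshold) if p >= sat_threshold else 0.0
        for p in sat
    ]

# Made-up example values (mm/day)
obs = [0.0, 0.0, 5.0, 10.0, 0.0, 3.0, 0.0, 0.0]
sat = [1.0, 0.5, 4.0, 6.0, 2.0, 0.0, 3.0, 0.2]
corrected = loci_correct(obs, sat)
```

In this toy example the corrected series has the same number of wet days and the same mean wet-day intensity as the observations, which is the property the method is designed to enforce over the calibration period.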
Hydrological model
In this study, the SWAT model, developed by the Agricultural Research Service of the United States Department of Agriculture (USDA), is employed for hydrological modelling. It is a physically based, semi-distributed, time-continuous model that has been successfully applied to hydrological modelling, assessment of the impact of land use and land cover (LULC) variability on runoff, sediment transport and nutrient simulation, water quality, and water management practices at different time intervals (Yin et al. 2017; Tuo et al. 2018). It divides a large basin into sub-basins and then creates hydrological response units (HRUs), which are the smallest units of a sub-basin. SWAT generates the HRUs in two ways: first, from LULC and soil data combined with the DEM, and second, by assigning threshold values (Arnold et al. 2012). It simulates the hydrological processes in two distinct phases, the land phase and the routing phase. The land phase computes the surface runoff for each HRU using the Soil Conservation Service (SCS) curve number or the Green and Ampt infiltration method (USDA 1985; Heber Green & Ampt 2009). The routing phase routes the computed surface runoff from all HRUs to the basin outlet through the channel network using the Muskingum or variable storage method (Duan et al. 2019). Furthermore, the SWAT model simulates the hydrological cycle using the water balance equation given by Neitsch et al. (2011), and evapotranspiration (ET) was estimated using the Hargreaves method. Similarly, it computes the sediment yield for each HRU by applying the Modified Universal Soil Loss Equation (MUSLE) (Williams 1975). A detailed description of the SWAT model can be found in the SWAT theoretical documentation at https://swat.tamu.edu/docs/.
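For orientation, the two relations mentioned above can be written as a short sketch following the general form given in the SWAT theoretical documentation: the daily soil water balance and the MUSLE sediment yield. The function names, dictionary keys, and numeric inputs are illustrative assumptions, not values from this study, and the sketch omits everything else SWAT does (routing, snow, channel processes).

```python
def soil_water_content(sw_initial_mm, days):
    """Water balance: SW_t = SW_0 + sum(R_day - Q_surf - E_a - w_seep - Q_gw).

    Each entry of `days` holds daily depths in mm (keys are hypothetical names).
    """
    sw = sw_initial_mm
    for d in days:
        sw += d["precip"] - d["surface_runoff"] - d["et"] - d["seepage"] - d["return_flow"]
    return sw

def musle_sediment_yield(q_surf_mm, q_peak_m3s, area_ha, k, c, p, ls, cfrg=1.0):
    """MUSLE sediment yield (metric tons) for one HRU on one day.

    k, c, p, ls are the USLE erodibility, cover, practice, and topographic
    factors; cfrg is the coarse fragment factor.
    """
    return 11.8 * (q_surf_mm * q_peak_m3s * area_ha) ** 0.56 * k * c * p * ls * cfrg

# Illustrative one-day example
one_day = [{"precip": 10.0, "surface_runoff": 2.0, "et": 3.0,
            "seepage": 1.0, "return_flow": 0.5}]
sw_end = soil_water_content(100.0, one_day)
```

The MUSLE form makes clear why sediment predictions inherit any error in simulated surface runoff and peak flow: both enter the runoff energy term raised to the power 0.56.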
Artificial Neural Network (ANN)
An ANN is a computing technique that mimics the work of the human brain and nervous system with the aid of a mathematical structure. It can learn, memorize, and reveal the various relationships in the dataset using its network capability. This technique can model complex non-linear hydrological and sedimentological processes in a watershed without comprehensive knowledge of its physical characteristics. Neural networks are made up of neuronal processing units that are layered and linked by several connections. The two most common architectures of ANNs are recurrent and non-recurrent. Non-recurrent architectures comprise a single-layer or multilayer perceptron (Mohammadi et al. 2020). In this work, the simulation of hydrological processes was carried out using feed-forward multilayer perceptron neural networks (MLPNNs), which are extensively applied in hydrological simulations. Generally, MLPNNs are composed of three layers, the input layer, the hidden layer, and the output layer. Layers are connected by weighted synaptic connections between their nodes (neurons). A weight is associated with each node connection, which signifies their connection strength. In general, these networks operate as follows: (i) neurons process the information received from the array of inputs or signals; (ii) through connection links, information is passed between neurons of adjacent layers; (iii) to produce an output signal, neurons combine multiple input signals linearly as per their input weights and then pass them through a non-linear activation (transfer) function.
The performance of the ANNs can be improved by applying the appropriate activation functions and optimizing the parameters, such as the size of the hidden layer and its neurons, gains, and threshold. The neural network is trained by employing a suitable learning algorithm that optimizes the connection weights among the neurons of the training dataset. A thorough description of the MLPNNs method can be found in existing studies (Nourani 2014; Kushwaha & Kumar 2017).
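The operation of a three-layer MLPNN described above (weighted sum of inputs, non-linear activation in the hidden layer, linear combination at the output) can be sketched as a single forward pass. The hyperbolic tangent activation, the layer sizes, and all weight values are assumptions chosen for illustration; the study does not state them at this point, and training (weight optimization) is not shown.

```python
import math

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer perceptron with a single output.

    x: input vector; w_hidden: one weight list per hidden neuron;
    b_hidden: hidden biases; w_out: output weights; b_out: output bias.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)  # assumed activation
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    # Linear output neuron, as is common for regression targets such as streamflow
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output
y = mlp_forward(
    x=[1.0, 2.0],
    w_hidden=[[0.5, -0.25], [0.1, 0.2]],
    b_hidden=[0.0, 0.0],
    w_out=[1.0, 2.0],
    b_out=0.3,
)
```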
Random Forest (RF)
The RF algorithm is a supervised ML algorithm that is computationally efficient compared with other types of ML algorithms. It is one of the prevailing ensemble-learning algorithms: it combines the predictions of multiple trained decision trees (DTs) and takes their mean as the final output. To train the DTs, RF uses a bagging (bootstrap) approach to create several subsets of the training sample, which are randomly selected with replacement. This helps to reduce the variance of the predictions when the underlying data vary. RF is generally regarded as a parallel process because it builds multiple trees from a dataset in parallel, with each tree independent of the others. Overall, the RF algorithm offers the benefit of high accuracy with less training time.
In RF modelling, the two most pertinent parameters are: (1) the number of trees used to build the forest (ntree, n-estimator), which is the critical variable of RF; and (2) the number of variables or features in the random subset considered at each node (mtry). For a more detailed description and the mathematical equations of RF, the reader is referred to Breiman (2001) and Liaw & Wiener (2002).
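The roles of bootstrap sampling, ntree, and mtry can be made concrete with a deliberately simplified sketch in which each tree is a one-split regression tree (a stump). This is an assumption made to keep the example short; a real random forest grows deep trees and redraws the mtry features at every node. All function names and the toy data are hypothetical.

```python
import random

def fit_stump(rows, targets, feature_ids):
    """Fit a one-split regression tree using only the features in `feature_ids`."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue
            mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - mean_l) ** 2 for t in left) + sum((t - mean_r) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold, mean_l, mean_r)
    if best is None:                       # no valid split: predict the sample mean
        mean_all = sum(targets) / len(targets)
        return (0, float("inf"), mean_all, mean_all)
    return best[1:]

def predict_stump(stump, row):
    f, threshold, mean_l, mean_r = stump
    return mean_l if row[f] <= threshold else mean_r

def fit_forest(rows, targets, ntree=25, mtry=1, seed=0):
    """Train `ntree` stumps, each on a bootstrap sample and `mtry` random features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]      # sampling with replacement
        feats = rng.sample(range(n_features), mtry)     # random feature subset
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Final output is the mean of the individual tree predictions."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)

# Toy data: the target jumps from 0 to 10 when the first feature reaches 10
rows = [[float(i), 0.0] for i in range(20)]
targets = [0.0 if i < 10 else 10.0 for i in range(20)]
forest = fit_forest(rows, targets, ntree=25, mtry=2)
low, high = predict_forest(forest, [2.0, 0.0]), predict_forest(forest, [17.0, 0.0])
```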
Support Vector Regression (SVR)
The support vector machine (SVM) is a statistical model of a supervised ML algorithm devised by Cortes & Vapnik (1995). The SVMs are applied for classification and regression problems to get the least amount of error in grouping data or fitting the function. SVM is applied for classification and regression analyses as support vector classification (SVC) and support vector regression (SVR), respectively (Burges 1998; Smola & Schölkopf 2004). In the SVR method, different kernel functions are used to model non-linear data, i.e., linear, polynomial, radial basis function (RBF), and sigmoid kernel. A kernel function transforms a complex non-linear problem in the original input space into a simple linear problem in the feature space (i.e., some higher-dimensional space). In contrast to other kernels, the RBF kernel is less prone to numerical difficulties and more capable of handling non-linear input–output relationships (Lin et al. 2006). Therefore, this study adopts the commonly used RBF kernel. An SVR application that uses an RBF kernel has three critical parameters: the penalty error parameter C (C > 0), the Gaussian RBF kernel parameter (γ), and the deviation of the error margin (ε). For a more detailed description and mathematical equations of SVR-RBF, the reader is referred to Cortes & Vapnik (1995) and Aljanabi et al. (2018).
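To illustrate two of the three SVR-RBF parameters named above, the sketch below evaluates the Gaussian RBF kernel for a given γ and the ε-insensitive loss that defines the error margin. The function names and numeric values are assumptions for illustration; the penalty parameter C, which weights this loss against model flatness during training, is not exercised because the optimization itself is not shown.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel: exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger errors cost their excess."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)   # identical inputs
unit_apart = rbf_kernel([0.0], [1.0], gamma=1.0)             # inputs one unit apart
inside_margin = epsilon_insensitive_loss(10.0, 10.05, epsilon=0.1)
outside_margin = epsilon_insensitive_loss(10.0, 12.0, epsilon=0.5)
```

A larger γ makes the kernel decay faster with distance, so each training point influences a smaller neighbourhood; a larger ε widens the tube within which residuals are ignored.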
Hybrid models development
A hybrid modelling approach was developed to predict daily streamflow by coupling the SWAT model with SCMs (ANN, RF, and SVR), resulting in the SWAT-ANN, SWAT-RF, and SWAT-SVR models. In this approach, the outputs of the SWAT model with its default parameter combination (without calibration) were employed as inputs for the SCMs. During this development, SWAT was considered a comprehensive interface that integrates weather, terrain, LULC, and soil to produce output variables, which serve as inputs to the SCMs. The summary output files (output.std and output.sed) of the SWAT model contain the required output variables, which are listed in Table 3.
SWAT models output variables (Input variables for SCMs)
Sr. No. | Variable Name | Description |
---|---|---|
1 | PREC | Daily mean precipitation in the watershed (mm) |
2 | SURQ | Surface runoff contribution to streamflow from HRUs (mm) |
3 | LATQ | Daily lateral flow (mm) |
4 | GWQ | Daily groundwater contribution to stream (mm) |
5 | PERCOLATE | Daily water percolation (mm) |
6 | SW | Daily amount of water stored in soil profile (mm) |
7 | EVAP | Average daily rate of water loss from reach by evaporation (m3/s) |
8 | WATER YIELD | Daily water yield to streamflow from HRUs (mm) |
9 | SED_OUT | Daily total sediment transported out of reach (tons) |
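As an illustration of how the SWAT output variables listed above could be arranged as SCM inputs, the sketch below builds a feature matrix from daily records and rescales each column to [0, 1]. The record layout, the function names, the sample values, and the choice of min–max scaling are all assumptions; the study only states that inputs were pre-processed according to each model's requirements, and parsing of output.std and output.sed is not shown.

```python
FEATURES = ["PREC", "SURQ", "LATQ", "GWQ", "PERCOLATE", "SW", "EVAP",
            "WATER YIELD", "SED_OUT"]

def to_matrix(daily_records):
    """Turn a list of per-day dicts (keyed by variable name) into rows of floats."""
    return [[float(rec[name]) for name in FEATURES] for rec in daily_records]

def min_max_scale(matrix):
    """Rescale every column to [0, 1]; constant columns are mapped to 0."""
    columns = list(zip(*matrix))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in matrix
    ]

# Two made-up days of un-calibrated SWAT output
days = [
    {"PREC": 0.0, "SURQ": 0.0, "LATQ": 0.4, "GWQ": 1.1, "PERCOLATE": 0.2,
     "SW": 80.0, "EVAP": 0.1, "WATER YIELD": 1.5, "SED_OUT": 120.0},
    {"PREC": 12.0, "SURQ": 3.0, "LATQ": 0.9, "GWQ": 1.1, "PERCOLATE": 1.0,
     "SW": 95.0, "EVAP": 0.2, "WATER YIELD": 5.0, "SED_OUT": 900.0},
]
X = min_max_scale(to_matrix(days))
```

In a real workflow the scaling bounds would normally be taken from the training subset only and then applied to the testing subset, so that no information from the test period leaks into training.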
Model performance evaluation
In this study, five statistical indicators, R2 (coefficient of determination), NSE (Nash–Sutcliffe efficiency), PBIAS (percent bias), RMSE (root mean square error), and RSR (ratio of RMSE to the standard deviation of the observations), were applied to assess the performance of the different models in estimating river discharge and sediment load. R2 describes the collinearity between observations and simulations. NSE indicates how well the plot of observed versus simulated values fits the 1:1 line. PBIAS indicates whether simulations are, on average, larger or smaller than observations. RMSE quantifies the difference between the observed and simulated series. Similarly, RSR expresses the residual variance of a prediction relative to the variability of the observations. Generally, a better-performing model has higher R2 and NSE and lower PBIAS (in absolute value), RMSE, and RSR. These statistical indicators are listed in Table 4, along with their mathematical expressions and ranges.
Performance metrics
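Mathematical equation | Range |
---|---|
$R^2 = \dfrac{\left[\sum_{i=1}^{n}(O_i - O_{avg})(S_i - S_{avg})\right]^2}{\sum_{i=1}^{n}(O_i - O_{avg})^2 \, \sum_{i=1}^{n}(S_i - S_{avg})^2}$ | Range [0,1], and 1 is the optimal value (o.v.) |
$NSE = 1 - \dfrac{\sum_{i=1}^{n}(O_i - S_i)^2}{\sum_{i=1}^{n}(O_i - O_{avg})^2}$ | Range [−∞,1], and 1 is the o.v. |
$PBIAS = 100 \times \dfrac{\sum_{i=1}^{n}(S_i - O_i)}{\sum_{i=1}^{n} O_i}$ | Range [−∞, +∞], and 0 is the o.v. |
$RMSE = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(O_i - S_i)^2}$ | Range [0, +∞], and 0 is the o.v. |
$RSR = \dfrac{\sqrt{\sum_{i=1}^{n}(O_i - S_i)^2}}{\sqrt{\sum_{i=1}^{n}(O_i - O_{avg})^2}}$ | Range [0, +∞], and 0 is the o.v. |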
Note: O is the observed value and S is the simulated value, avg is the average of the total values, and n is the total number of observations.
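The five indicators can be computed with a few lines of code, sketched here under stated assumptions. The formulas are the commonly used definitions; the PBIAS sign convention (simulated minus observed, so that negative values indicate underestimation) is an assumption chosen to match how PBIAS is interpreted in the results of this study. The function name and sample series are hypothetical.

```python
import math

def performance_metrics(obs, sim):
    """Return R2, NSE, PBIAS (%), RMSE, and RSR for paired observed/simulated series."""
    n = len(obs)
    o_avg = sum(obs) / n
    s_avg = sum(sim) / n
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_ss = sum((o - o_avg) ** 2 for o in obs)          # observed sum of squares
    sim_ss = sum((s - s_avg) ** 2 for s in sim)          # simulated sum of squares
    cov = sum((o - o_avg) * (s - s_avg) for o, s in zip(obs, sim))
    return {
        "R2": cov ** 2 / (obs_ss * sim_ss),
        "NSE": 1.0 - sq_err / obs_ss,
        # Assumed sign convention: negative PBIAS means the model underestimates
        "PBIAS": 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs),
        "RMSE": math.sqrt(sq_err / n),
        "RSR": math.sqrt(sq_err) / math.sqrt(obs_ss),
    }

# Made-up example: the simulation is exactly half of the observation
obs = [1.0, 2.0, 3.0, 4.0]
sim = [0.5, 1.0, 1.5, 2.0]
scores = performance_metrics(obs, sim)
```

The example shows why R2 alone can be misleading: a simulation that is perfectly correlated with the observations but systematically too low still reaches R2 = 1, while NSE and PBIAS expose the bias.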
RESULTS AND DISCUSSION
Comparing observed data with SPDs and their BC
(a) Mean monthly precipitation distribution (basin-averaged) without bias correction and (b) after bias correction applied.
(a) Violin plots of precipitation distribution without bias correction and (b) after bias correction applied.
Evaluation of SWAT models
To establish a baseline, a SWAT model was run with default parameters for each bias-corrected SPD (GPM, PERSIANN, CHIRPS, and CMORPH) at a daily time step for the period 2000–2013, including a two-year warm-up period (2000–2001). The default values of the parameters were derived from the Food and Agriculture Organization (FAO) Global Map of land cover and soil characteristics (Abbaspour et al. 2004, 2007). Thereafter, a calibration process was performed using SUFI-2 in SWAT-CUP, starting from the default-parameter results, to bring the simulations closer to the observed values. The model was calibrated using daily Jhelum River flow data at Azad Pattan station from January 2002 to September 2010 and validated using data from October 2010 to December 2013. For the calibration of the sensitive parameters, 22 frequently used streamflow calibration parameters were selected through an extensive review of the literature on basins close to the study area (Abbaspour et al. 2015; Babur et al. 2016; Saddique et al. 2019, 2022; Shahid et al. 2021; Rahman et al. 2022). In SWAT-CUP, a Global Sensitivity Analysis (GSA) was applied to select the most sensitive parameters for streamflow. Hence, 28 parameters were considered for the sensitivity analysis, out of which 8 parameters were found to be the most sensitive to streamflow in the calibration phase based on maximum t-test values and minimum p-values, as shown in Table 5. Based on these most critical parameters, SWAT-CUP was run three times with 1,000 simulations each time to calibrate streamflow and to narrow down the parameter ranges at every step (Table 5). Detailed information on the calibration procedure of the model can be found in Abbaspour et al. (2015).
The SWAT model parameters
Parameter | Final parameter range | t-test | p-value |
---|---|---|---|
v_SMFMN | 2.85 to 5 | 0.76 | 0.86 |
r_SOL_AWC | −0.08 to 0.14 | 1.19 | 0.23 |
v_TIMP | 0.47 to 1.5 | 1.52 | 0.13 |
v_SFTMP | −2.32 to 3.02 | 1.92 | 0.05 |
v_SMTMP | −1.18 to 5 | 3.56 | 0.004 |
v_SMFMX | 1.25 to 4.5 | 5.05 | 0.0001 |
v_ALPHA_BF | 0.33 to 1 | 10.22 | 0.0001 |
r_CN2 | −0.33 to 0.02 | 4.08 | 0.0001 |
The parameters used to calibrate SL | | | |
v__CH_COV1 | 0.02 to 0.40 | 0.47 | 0.63 |
v__SPEXP | 1.12 to 1.39 | 0.62 | 0.62 |
r__USLE_K | −0.05 | 0.73 | 0.46 |
v__USLE_P | −0.45 to 0.51 | 0.9 | 0.36 |
v__SPCON | 0.002 to 0.007 | 1.03 | 0.30 |
v__CH_COV2 | −0.16 to 0.62 | 1.35 | 0.19 |
Following the criteria suggested by Moriasi et al. (2007), the accuracy of the SWAT-CUP models was evaluated using several statistical indices: R2, NSE, PBIAS, RMSE, and RSR. A comparison of the calibration and validation results of the different SWAT-CUP models is shown in Table 6. In the calibration and validation phases, the CHIRPS, PERSIANN, and GPM models showed satisfactory to good competence in computing streamflows, with R2, NSE, PBIAS, and RSR varying from 0.68 to 0.70 and 0.75 to 0.78, 0.62 to 0.67 and 0.73 to 0.76, −19.0 to −8.20% and −10.50 to +7.0%, and 0.57 to 0.62 and 0.49 to 0.52, respectively. In contrast, the performance of the CMORPH model was poor, especially in the calibration phase, where it had the lowest R2 (0.45) and NSE (0.36) and strongly underestimated streamflow, with a PBIAS of −22.6%. Based on Figure 6 and Table 6, the GPM model performed slightly better than the CHIRPS and PERSIANN models, with a higher R2 and NSE, acceptable PBIAS, and minimum RMSE and RSR. Previous studies by Popovych & Dunaieva (2021), Saddique et al. (2022), and Tang et al. (2020) have shown similar results.
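For reference, the five indices can be computed from paired observed and simulated series as in the sketch below. It assumes the definitions given by Moriasi et al. (2007), in which PBIAS is computed from observed minus simulated values, so a positive PBIAS indicates underestimation; some studies use the opposite sign, and the convention used for the tables here is not restated in this section. The function name and the sample values are hypothetical.

```python
import math


def performance_indices(obs, sim):
    """Return R2, NSE, PBIAS (%), RMSE and RSR for paired series."""
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("obs and sim must have the same length (>= 2)")
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))   # residual sum of squares
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)         # spread of observations
    ss_sim = sum((s - mean_sim) ** 2 for s in sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    return {
        "R2": cov ** 2 / (ss_obs * ss_sim),                # squared Pearson correlation
        "NSE": 1.0 - ss_res / ss_obs,
        # Assumed sign convention: observed minus simulated.
        "PBIAS": 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs),
        "RMSE": math.sqrt(ss_res / n),
        "RSR": math.sqrt(ss_res) / math.sqrt(ss_obs),      # RMSE / std. dev. of obs
    }


# Hypothetical daily flows in m3/s.
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 190.0, 310.0, 390.0]
indices = performance_indices(observed, simulated)
```

With these definitions RSR equals the square root of (1 − NSE), which offers a quick consistency check on reported values; for example, an NSE of 0.66 corresponds to an RSR of about 0.58.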
SWAT-CUP model performance using different SPDs
SPDs model | R2 Cal. | R2 Val. | NSE Cal. | NSE Val. | PBIAS (%) Cal. | PBIAS (%) Val. | RMSE (m3/s) Cal. | RMSE (m3/s) Val. | RSR Cal. | RSR Val. |
---|---|---|---|---|---|---|---|---|---|---|
GPM | 0.70 | 0.75 | 0.66 | 0.73 | −14.20 | −9.40 | 365.94 | 261.61 | 0.58 | 0.52 |
PERSIANN | 0.70 | 0.76 | 0.62 | 0.74 | −19.00 | −10.50 | 386.49 | 257.16 | 0.62 | 0.51 |
CHIRPS | 0.68 | 0.78 | 0.67 | 0.76 | −8.20 | 7.00 | 359.83 | 249.96 | 0.57 | 0.49 |
CMORPH | 0.45 | 0.62 | 0.36 | 0.61 | −22.60 | −6.00 | 499.96 | 316.04 | 0.80 | 0.62 |
Evaluation of SWAT-ANN models
The hybrid SWAT-ANN models were developed using the outputs of the baseline (un-calibrated) SWAT models for each SPD (listed in Table 4). Before being fed into the ANN models, the inputs were first transformed into a dimensionless state using the data normalization techniques suggested by Cigizoglu (2003). Afterwards, a Gamma test (GT) was performed to identify the most effective input variables for constructing a reliable and smooth ANN model. A detailed discussion of the GT and its application can be found in Stefánsson et al. (1997) and Tsui et al. (2002). After the effective inputs had been selected through the GT, the ANN model was trained by optimizing its parameters. In this study, the best ANN model was a single-hidden-layer network trained with the Levenberg–Marquardt learning algorithm, with 11 hidden nodes and the tansig activation function in the hidden layer, the purelin activation function in the output layer, 1,000 epochs, and a performance goal of 10⁻⁵.
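The network architecture described above can be illustrated with a short forward-pass sketch. It is not the authors' trained model: the weights are random, Levenberg–Marquardt training is omitted, and min-max scaling is only an assumed stand-in for the normalization of Cigizoglu (2003), whose exact form is not given here. The only fixed ingredients are the 11 hidden nodes, the tansig hidden activation (mathematically identical to tanh), and the linear purelin output. All function names are hypothetical.

```python
import math
import random


def min_max_normalize(values):
    """Scale a series to [0, 1] (assumed normalization, see lead-in)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def init_network(n_inputs, n_hidden=11, seed=0):
    """Random, untrained weights for a single-hidden-layer network."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": rng.uniform(-1, 1),
    }


def forward(net, x):
    """tansig (tanh) hidden layer followed by a purelin (identity) output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


# Three hypothetical normalized SWAT outputs for one day.
network = init_network(n_inputs=3)
prediction = forward(network, [0.2, 0.5, 0.9])
```

Because the output layer is linear, the prediction is not bounded to the normalized range, so it would have to be transformed back to physical units with the inverse of whatever normalization was applied to the target.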
The performance of the different SWAT-ANN hybrid models in predicting daily streamflow during the training and testing phases is given in Table 7. According to the Moriasi et al. (2007) criteria, the GPM, PERSIANN, and CHIRPS SWAT-ANN hybrid models demonstrated very good competence in computing streamflows. For the training and testing periods, the performance indices R2, NSE, PBIAS, and RSR of these three models ranged from 0.86 to 0.90 and 0.82 to 0.89, 0.86 to 0.90 and 0.81 to 0.89, −0.53 to −0.02% and −0.4 to 5.16%, and 0.31 to 0.37 and 0.34 to 0.44, respectively. Furthermore, the performance of the CMORPH SWAT-ANN model improved from satisfactory to good when compared with the calibrated SWAT-CUP model in predicting streamflow. The outcomes (Tables 6 and 7) revealed that the SWAT-ANN hybrid models performed much better than the calibrated SWAT-CUP models.
SWAT-ANN model performance using different SPDs
SPDs model | R2 Training | R2 Testing | NSE Training | NSE Testing | PBIAS (%) Training | PBIAS (%) Testing | RMSE (m3/s) Training | RMSE (m3/s) Testing | RSR Training | RSR Testing |
---|---|---|---|---|---|---|---|---|---|---|
GPM | 0.90 | 0.89 | 0.90 | 0.89 | −0.05 | −0.07 | 186.53 | 201.51 | 0.31 | 0.34 |
PERSIANN | 0.88 | 0.83 | 0.88 | 0.83 | −0.02 | −0.40 | 210.39 | 243.87 | 0.35 | 0.41 |
CHIRPS | 0.86 | 0.82 | 0.86 | 0.81 | −0.53 | 5.16 | 218.43 | 237.14 | 0.37 | 0.44 |
CMORPH | 0.80 | 0.71 | 0.80 | 0.70 | −1.04 | 5.68 | 339.19 | 253.69 | 0.45 | 0.55 |
Scatterplots for daily streamflow predictions from different SWAT-ANN hybrid models.
Evaluation of SWAT-RF models
The Grid-Search contour map for the selection of the sensitive parameters' values.
Based on the optimized parameters, SWAT-RF models were trained and tested to predict daily streamflows. These models were evaluated using the same performance indices as in the previous sections, as listed in Table 8. Across all SWAT-RF hybrid models, R2, NSE, RMSE, and RSR for training and testing were 0.98 and 0.87 to 0.92, 0.98 and 0.87 to 0.92, 78.24 to 93.9 m3/s and 171.47 to 203.28 m3/s, and 0.13 to 0.15 and 0.29 to 0.36, respectively. Additionally, PBIAS was less than 1%, indicating that all models accurately predicted streamflow. The accuracy of all RF models is excellent according to the criteria proposed by Moriasi et al. (2007). Furthermore, a comparison of Tables 6–8 indicated that the SWAT-RF hybrid models outperformed both the SWAT-CUP and SWAT-ANN models.
SWAT-RF model performance using different SPDs
SPDs model | R2 Training | R2 Testing | NSE Training | NSE Testing | PBIAS (%) Training | PBIAS (%) Testing | RMSE (m3/s) Training | RMSE (m3/s) Testing | RSR Training | RSR Testing |
---|---|---|---|---|---|---|---|---|---|---|
GPM | 0.98 | 0.92 | 0.98 | 0.92 | 0.00 | 0.00 | 78.24 | 171.47 | 0.13 | 0.29 |
PERSIANN | 0.98 | 0.90 | 0.98 | 0.90 | 0.00 | 0.00 | 85.25 | 193.14 | 0.14 | 0.32 |
CHIRPS | 0.98 | 0.89 | 0.98 | 0.89 | 0.00 | 0.00 | 88.50 | 196.50 | 0.15 | 0.33 |
CMORPH | 0.98 | 0.87 | 0.98 | 0.87 | 0.00 | 0.01 | 93.90 | 203.28 | 0.15 | 0.36 |
Scatterplots for daily streamflow predictions from different SWAT-RF hybrid models.
Evaluation of SWAT-SVR models
In this section, the hybrid SWAT-SVR models were developed based on the outputs (listed in Table 4) of the baseline (un-calibrated) SWAT models for each SPD. As with the ANN models, the inputs were first transformed into a dimensionless state; here, the data were normalized using the standard scaling approach. The next step was to train the SVR models on the normalized data while fine-tuning the hyper-parameters to reduce uncertainty in the modelling. A Grid-Search (GS) approach with five-fold cross-validation was employed to fine-tune the pertinent SVR model parameters, namely the kernel, C, γ, and ε. The initial search ranges were kernel (linear, polynomial, RBF, and sigmoid), C (0.1–10,000, step = 100), γ (0.01–20, step = 0.05), and ε (0.01–0.3, step = 0.05). The final optimal values were kernel = RBF, C = 1,000, γ = 0.1, and ε = 0.1.
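The grid search with five-fold cross-validation can be sketched in a dependency-free way as follows. To keep the example short, an RBF-kernel weighted average (a Nadaraya–Watson smoother) stands in for SVR, so only γ is searched; C, ε, and the kernel type have no counterpart in this stand-in. The folds are contiguous blocks, the data are synthetic, and scaling is applied once to the whole series for brevity, whereas a careful workflow would fit the scaler on each training fold only. All function names are hypothetical.

```python
import itertools
import math


def standard_scale(column):
    """Zero-mean, unit-variance scaling of one input column."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]


def k_fold_indices(n, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds (requires n >= k)."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    for i in range(k):
        test = list(range(bounds[i], bounds[i + 1]))
        train = [j for j in range(n) if j < bounds[i] or j >= bounds[i + 1]]
        yield train, test


def rbf_smoother(x_train, y_train, x_new, gamma):
    """Stand-in model (NOT SVR): RBF-weighted average of training targets."""
    weights = [math.exp(-gamma * (x_new - xt) ** 2) for xt in x_train]
    total = sum(weights)
    if total == 0.0:  # all weights underflowed; fall back to the mean
        return sum(y_train) / len(y_train)
    return sum(w * yt for w, yt in zip(weights, y_train)) / total


def grid_search_cv(x, y, grid, k=5):
    """Return (best_params, best_mse) over every combination in the grid."""
    best = None
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        fold_mse = []
        for train, test in k_fold_indices(len(x), k):
            x_tr = [x[i] for i in train]
            y_tr = [y[i] for i in train]
            errors = [(y[i] - rbf_smoother(x_tr, y_tr, x[i], **params)) ** 2
                      for i in test]
            fold_mse.append(sum(errors) / len(errors))
        score = sum(fold_mse) / k  # mean cross-validated MSE
        if best is None or score < best[1]:
            best = (params, score)
    return best


# Synthetic one-dimensional example.
x = standard_scale([i / 10 for i in range(50)])
y = [math.sin(i / 10) for i in range(50)]
best_params, best_mse = grid_search_cv(x, y, {"gamma": [0.01, 0.1, 1.0, 10.0]})
```

The same loop structure extends to several hyper-parameters simply by adding more keys to the grid, which is how the four-parameter search described above grows combinatorially with the step sizes chosen.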
Finally, the accuracy of the SWAT-SVR models for estimating streamflow was evaluated and compared according to the Moriasi et al. (2007) criteria. The statistical performance indices R2, NSE, PBIAS, and RSR for the GPM, PERSIANN, CHIRPS, and CMORPH SWAT-SVR hybrid models for training and testing varied from 0.76 to 0.87 and 0.75 to 0.85, 0.75 to 0.87 and 0.75 to 0.85, −7.6 to −2.1% and −7 to −1.8%, and 0.36 to 0.5 and 0.38 to 0.50, respectively, as presented in Table 9. Furthermore, all the SVR models performed well compared with the calibrated SWAT-CUP model in predicting streamflow. Overall, the results (Tables 6–9) show that the SWAT-based models ranked as follows in predicting streamflow: SWAT-RF > SWAT-ANN > SWAT-SVR > SWAT-CUP.
SWAT-SVR model performance using different SPDs
SPDs model | R2 Training | R2 Testing | NSE Training | NSE Testing | PBIAS (%) Training | PBIAS (%) Testing | RMSE (m3/s) Training | RMSE (m3/s) Testing | RSR Training | RSR Testing |
---|---|---|---|---|---|---|---|---|---|---|
GPM | 0.87 | 0.85 | 0.87 | 0.85 | −2.10 | −1.80 | 216.68 | 224.98 | 0.36 | 0.38 |
PERSIANN | 0.83 | 0.82 | 0.83 | 0.82 | −3.30 | −2.10 | 247.86 | 249.25 | 0.41 | 0.43 |
CHIRPS | 0.83 | 0.82 | 0.82 | 0.82 | −2.90 | −2.40 | 252.93 | 248.18 | 0.42 | 0.42 |
CMORPH | 0.76 | 0.75 | 0.75 | 0.75 | −7.60 | −7.00 | 301.62 | 300.00 | 0.50 | 0.50 |
Scatterplots for daily streamflow predictions from different SWAT-SVR hybrid models.
Models comparison for different SPDs
Comparison of models performance using different SPDs. Please refer to the online version of this paper to see this figure in colour: https://dx.doi.org/10.2166/wcc.2023.470.
Suspended sediment load prediction
According to the comparison of the overall performance of all models using the various SPDs, the GPM-based models performed better than the CHIRPS, PERSIANN, and CMORPH models. Thus, the GPM-based models were selected for further use in the prediction of the SSL. Furthermore, all the information related to the sediment rating curve of the study site can be found in Khan et al. (2022).
Similar to the streamflow parameter calibration in SWAT-CUP, the sensitivity of sediment parameters was assessed by using GSA in SWAT-CUP. Then, the calibration and validation were performed by optimizing the six most sensitive sediment parameters (listed in Table 5) using SUFI-2 in SWAT-CUP. A summary of the statistical performance of SSL simulation for calibration and validation is given in Table 10.
Model performance using the GPM P dataset for the prediction of SSL
Model | R2 Train | R2 Test | NSE Train | NSE Test | PBIAS (%) Train | PBIAS (%) Test | RMSE (tons/day) Train | RMSE (tons/day) Test | RSR Train | RSR Test |
---|---|---|---|---|---|---|---|---|---|---|
SWAT-CUP | 0.54 | 0.35 | 0.52 | 0.32 | −10.35 | −11.85 | 82,243.94 | 170,556.42 | 0.69 | 0.82 |
SWAT-RF | 0.67 | 0.63 | 0.67 | 0.63 | 1.73 | 4.12 | 83,790.21 | 104,196.61 | 0.57 | 0.60 |
SWAT-ANN | 0.74 | 0.66 | 0.74 | 0.65 | 3.17 | 3.74 | 81,808.82 | 82,729.57 | 0.50 | 0.60 |
SWAT-SVR | 0.51 | 0.52 | 0.51 | 0.52 | 1.80 | 1.5 | 105,069.12 | 111,570.7 | 0.69 | 0.70 |
Furthermore, the hybrid models were developed using the output variables of an un-calibrated SWAT model in the same way as streamflow models. Following this, each hybrid model was trained and tested in accordance with the same procedure as in the streamflow prediction models.
After training and testing, the performance of all models was evaluated using the same statistical indices, including R2, NSE, PBIAS, and RSR, as presented in Table 10. The results in Table 10 indicate that the SWAT-ANN model performed better than the other models in both training and testing, with higher NSE (0.74 and 0.65) and lower RMSE (81,808.82 and 82,729.57 tons/day), respectively. In terms of overall performance, the SWAT-ANN model performed the best, followed by SWAT-RF, SWAT-SVR, and SWAT-CUP.
Scatterplots for the predictions of SSL (in million tons per day) from different models.
CONCLUSIONS
In hydrological modelling, a precise estimation of streamflow and SL is essential for understanding the hydrodynamics of rivers. The accuracy of their predictions is greatly dependent on the presence of high-quality precipitation data throughout the entire basin. The availability of high-quality precipitation data is a major concern, especially in developing countries, where most of the land is poorly gauged or ungauged (particularly in mountainous areas). Nowadays, different SPDs are freely available to overcome this problem.
Therefore, this study aimed to evaluate the applicability of four SPDs, namely GPM (IMERG_F), PERSIANN_CDR, CHIRPS, and CMORPH, for estimating streamflow and SL in a transboundary high-altitude catchment (UJRB) using both hydrological and hybrid SCMs. A further objective was to explore the performance of the SWAT hydrological model as well as hybrid SCMs, including SWAT-ANN, SWAT-RF, and SWAT-SVR, in estimating streamflow and SL based on the aforementioned SPDs for the years 2000–2013. The main findings of the current study are summarized as follows:
Evaluation with multiple statistical indicators revealed that the models based on GPM, PERSIANN, CHIRPS, and CMORPH P datasets performed reasonably well in simulating streamflows for UJRB, with moderate to excellent model performance. However, the performance of the CMORPH-based models is relatively poor compared with other models for the UJRB.
Regarding the hydrological and hybrid SCMs, results revealed that all the SCMs developed by the outputs of the SWAT model (un-calibrated) outperformed the calibrated SWAT-CUP model. Furthermore, multiple statistical indices revealed that the SWAT-RF models performed well in simulating the streamflows, followed by SWAT-ANN and SWAT-SVR. For the entire simulation period, SWAT-RF models performed very well in estimating streamflow, with NSE ranging between 0.95 and 0.96 and PBIAS ranging between 0.01 and 0.4%. According to a comparison of SWAT-ANN and SWAT-SVR models, SWAT-ANN was slightly ahead of SWAT-SVR, but the difference was not significant, so both models can be considered as viable alternatives for streamflow forecasting.
Due to the high performance of the GPM-based models, we used them exclusively to further predict the SSL. As evaluated by multiple statistical indicators, the SWAT-ANN and SWAT-RF models were fairly accurate in simulating SSL for the UJRB compared with the SWAT-SVR and SWAT-CUP models. The results also indicated that SWAT-ANN models performed better than SWAT-RF models for SSL prediction, whereas SWAT-RF models performed better than SWAT-ANN models for streamflow prediction. This may be because ANNs are less sensitive to unreliable training data, although the performance of both models declined to some extent in these cases.
To conclude, SPD-based hybrid soft computing models have considerable potential for predicting streamflow and SSL in basins where in-situ P data are sparse or unavailable.
ACKNOWLEDGEMENTS
The first author was financially supported by a doctoral scholarship from the Higher Education Commission of Pakistan (HEC) and the German Academic Exchange Service (DAAD). The authors wish to acknowledge the Mangla Dam Authority and the Surface Water Hydrology Project (SWHP) of WAPDA for providing data that was used in this study. We are thankful to the GPM, PERSIANN_CDR, CHIRPS, and CMORPH research communities for making the satellite precipitation data available for this work. We extend our thanks to the Institute of Hydraulic Engineering and Technical Hydromechanics (IWD), Technische Universität Dresden, for providing the opportunity to perform this research.
AUTHOR CONTRIBUTIONS
M.A.K. conceptualized the study, formulated the study methodology, and was involved in the software setup and execution; M.A.K. did the validation; M.A.K. performed the formal analysis; M.A.K. wrote the original draft; and J.S. supervised the study.
FUNDING
The authors did not receive support from any organization for the submitted work.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.