Assessment of gridded precipitation products in the hydrological modeling of a flood-prone mesoscale basin

Precipitation plays a critical role in hydrometeorological studies. A predictive analysis of gridded rainfall datasets may provide a cost-effective alternative to conventional rain gauge observations. Here, our objective is to evaluate the performance of satellite and reanalysis precipitation products in the hydrological modeling of a mesoscale watershed. The research also examines the accuracy of hydrological simulations in a sizeable flood-prone watershed in the absence of observed data associated with the myriad water-retaining structures present in the catchment. We use three precipitation datasets, namely Tropical Rainfall Measuring Mission (TRMM) 3B42 Version 7, Climate Forecast System Reanalysis (CFSR), and daily precipitation data recorded at multiple rain gauges in the upper Huai River Basin, to simulate streamflow. The Soil & Water Assessment Tool (SWAT) is utilized for runoff modeling, while SWAT-CUP is used to perform sensitivity analysis and to calibrate and validate the simulation results. Nash–Sutcliffe efficiency, percent bias, and Kling-Gupta efficiency (KGE) are employed to evaluate modeling efficiency for the three precipitation datasets on different temporal scales. The results indicate that the TRMM and CFSR datasets provide satisfactory results on both daily and monthly scales. In addition, the SWAT model performs better at monthly simulations than daily simulations for all precipitation datasets used.


INTRODUCTION
One of the biggest obstacles faced by hydrologic modelers and water scientists is the lack of data pertaining to precipitation, discharge, land cover, and soil characteristics. The paucity and inadequacy of ground observations, particularly in remote and underdeveloped regions of the world, pose an enormous challenge in flood monitoring and water resource assessment (Biancamaria et al. 2016). It is widely believed among researchers that the majority of the global watersheds are either poorly gauged or entirely ungauged, leading to inadequacies in modeling results (Kundzewicz 2007). Thus, hydrological modeling becomes an arduous task in such data-sparse basins (Adjei et al. 2014).
Precipitation is a fundamental and the most significant component in runoff modeling (Liu et al. 2015), and its spatial and temporal resolution plays a crucial role in the accurate estimation and prediction of water resources (Darand et al. 2017). Consequently, streamflow simulation results are immensely reliant on the accuracy of precipitation measurements (Zhao et al. 2015). The assessment of surface and groundwater resources is closely related to the spatial and temporal resolution of precipitation data (Li et al. 2018). Conventionally, precipitation data are obtained through either rain gauge measurement or a ground-based radar system. The accuracy of these measurements is highly reliant on the distribution and density of rain gauges (Islam et al. 2012). However, the number of rain gauge stations around the world has significantly declined over the [...] assessment in northeastern Brazil (de Lima & Alcântara 2019). The precipitation dataset provided by CFSR demonstrated more rainy days and high-intensity rainfall than in situ observations in the Bahe River basin in China (Hu et al. 2017). In northern Mexico, the CFSR dataset was used to identify spatial and temporal variations and trends in precipitation and drought over a 35-year period (Martinez-Cruz et al. 2020). CFSR precipitation datasets tend to overestimate rainfall in Bolivia, whereas TRMM 3B42 demonstrated an overall underestimation (Blacutt et al. 2015). Eini et al. (2019) concluded that CFSR can substitute gauge precipitation in the Maharlu Lake basin in Iran, albeit with slightly lower efficiency in streamflow modeling. CFSR weather data may be used as a substitute in water budget as well as crop yield simulations in areas where high-quality observed weather data are unavailable (Dile & Srinivasan 2014). Moreover, the long-term coverage and availability of CFSR datasets make them helpful in modeling climate and land-use change scenarios.
Although the application of gridded precipitation datasets has been assessed in many areas over the years, few studies have focused on their performance in hydrological modeling in basins impacted by extensive human activities. This study investigates the use of the TRMM daily precipitation product 3B42 Version 7 and the NCEP reanalysis precipitation product CFSR for streamflow simulation in the upper Huai basin (UHB). In addition, we attempt to simulate streamflow from a large watershed where data associated with water-retaining and water-diversion structures, such as dams and sluices, are inaccessible. The SWAT model is employed to perform hydrological simulation, and SWAT Calibration and Uncertainty Programs (SWAT-CUP) is used for model calibration and sensitivity analysis. Moreover, hydrological modeling is performed using gauge precipitation data to assess the simulation accuracy of the two gridded precipitation datasets.

METHODOLOGY

Study area
The Huai River Basin (HRB) lies between latitudes 30°55′ and 36°36′ N and longitudes 111°55′ and 121°25′ E. With a catchment area of 270,000 km², the HRB is among the six largest river basins in China. The HRB experiences a semi-humid monsoon climate where precipitation shows an increasing trend from north to south, with an annual average of 911 mm predominantly occurring between June and September (Chen et al. 2011). Similarly, temperatures display a gradual increase from an annual average of 11°C in the north to 16°C in the south. The mean annual potential evapotranspiration is estimated to be between 950 and 1,150 mm using the Penman method (Chu et al. 2019). In comparison, the annual average runoff depth is approximated to be 231 mm, predominantly occurring in the flood season (Song et al. 2015). The period from June to September is regarded as the flood season, whereas the period between October and May is seen as the non-flood season in the area.
The basin has a population density of approximately five times the national average, with a total of over 165 million inhabitants. Despite having water resources of 83.5 billion m³, the per capita water availability is less than one-fifth of the nation's average (Zhang et al. 2013). The HRB has a relatively flat topography with some mountainous areas in the northwest, west, and south of the basin (Yang et al. 2020). The region is highly susceptible to natural disasters, particularly recurrent flooding, as well as severe environmental degradation. Historically, the area has experienced frequent disastrous flooding resulting in significant economic damage and loss of human lives (Shrestha et al. 2006). To minimize the impacts of hydrological disasters, over 5,700 dams and 5,000 floodgates have been constructed in the area since the foundation of the People's Republic of China in 1949. Consequently, most natural streams and rivers in the area have been fragmented into relatively isolated and disassociated water bodies. Extensive human activities have transformed the natural hydroclimate, making it extremely challenging to accurately mimic flow regimes considering the actual operations of sluices in the area (Zhang 2010). Previous studies indicate substantial variations in flow regimes resulting from the construction and operation of hydrological structures in the basin. Under these circumstances, reservoir operation data become essential for efficient streamflow simulation. However, access to reservoir operation management data is generally restricted (Lv et al. 2016), particularly in places like China.
The primary focus area of this study is the upstream region of the Bengbu hydrological station (Figure 1). The natural streamflow at Bengbu hydrological station has significantly declined, primarily due to the increased withdrawal of water for agricultural, industrial, and municipal consumption in the upstream area (Sun et al. 2016).

Terrain data
Terrain data used in this study include the digital elevation model (DEM), soil type maps, and land use maps. These datasets are further elaborated on in the following sections.

DEM
The DEM plays a critical part in hydrological modeling, as all topographic features of the catchment, its sub-basins, and its stream networks are obtained from the DEM dataset. Attributes derived from the DEM dataset comprise watershed slope and boundary, reach length, and channel slope. We used NASA's Shuttle Radar Topography Mission (SRTM) 90-m (3 arc second) dataset for this research (Jarvis et al. 2008).

Land cover
Land cover provides the synthesis of various material types covering the earth's surface and their corresponding features and characteristics. The spatial distribution of different land covers affects the interaction of moisture between land and atmosphere, hence influencing the hydrological cycle. Globeland30, developed by the National Geomatics Center of China (NGCC), offers open-access global land cover datasets at 30-m spatial resolution. Globeland30 categorizes land cover into ten different classes, namely cultivated land, forest, grassland, shrubland, wetland, water bodies, tundra, artificial surfaces, bareland, and permanent snow and ice. The land-use dataset for the year 2010 was acquired in raster format from http://www.globeland30.org/ and was subsequently merged and extracted to cover the UHB (Table 1).

Soil Map
Soil characteristics have a significant effect on catchment hydrology. Here, we used the digital soil map of China with 30 arc seconds resolution. The database is compiled by the Institute of Soil Sciences, Chinese Academy of Sciences (ISSCAS), and is part of the Harmonized World Soil Database (HWSD) (Batjes 2016). The soil map was acquired from the Food and Agriculture Organization (FAO) and was downloaded from https://www.isric.org/projects/soil-and-terrain-soter-databaseprogramme in vector format. The soil map was classified according to the dominant soil type (Table 2). The HWSD Viewer software was used to obtain the hydrological attributes of these soil types.

Hydroclimatic data
The satellite-based precipitation data are available on NASA's TRMM website and were downloaded free of charge. The satellite coverage extends from 50°N to 50°S with a spatial resolution of 0.25° × 0.25° (Huffman et al. 2010). The reanalysis precipitation dataset provided by CFSR was downloaded from the Texas A&M University global weather website, globalweather.tamu.edu.
Daily precipitation data recorded at three meteorological stations in the proximity of the UHB were acquired from China Meteorological Administration (CMA). Moreover, satellite precipitation data, 3B42V7, for the period 2001 to 2013 were downloaded and unified on a daily scale and analyzed for any discrepancies while using the gauge measurements as the standard reference points. Daily streamflow recorded at the Bengbu hydrological station has been applied to calibrate and validate the model and evaluate the simulation results.

Models used
SWAT
SWAT, a continuous-time simulation model, was initially developed to analyze the impacts of land management and nonpoint source pollution in large-scale basins. SWAT is a computationally efficient process-based model with the capability [...]. In the SWAT model, water from hydrologic response units (HRUs) is routed into streams associated with each sub-basin, then to the main channel, and finally to the basin outlet. The following governing equation is used to calculate water balance in a hydrological basin (Arnold et al. 1998):

$$SW_t = SW + \sum_{i=1}^{t} \left( R_i - Q_i - ET_i - P_i - QR_i \right)$$

where SW_t is the soil water content on day t; SW represents the initial soil moisture content minus the 15-bar water content; and R, Q, ET, P, and QR represent precipitation, runoff, evapotranspiration, percolation, and return flow, respectively. All these components of the water balance equation are measured and estimated in millimeters at a daily time scale, t. The model improves the accuracy of water balance estimation by calculating ET for different soil and crop types in the subareas of a complex basin, thus increasing the efficiency of the model in predicting surface runoff generated from the catchment area.
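The daily accumulation described by the water balance above can be illustrated with a minimal sketch. This is not SWAT's implementation: the `DailyFluxes` container, the `soil_water_series` function, and the three days of flux values are hypothetical and serve only to show how the five terms enter the running balance.

```python
from dataclasses import dataclass


@dataclass
class DailyFluxes:
    """Daily water-balance terms for one unit, all in millimeters (hypothetical container)."""
    precipitation: float       # R
    surface_runoff: float      # Q
    evapotranspiration: float  # ET
    percolation: float         # P
    return_flow: float         # QR


def soil_water_series(initial_sw, fluxes):
    """Return the soil water content after each day: SW_t = SW + sum(R - Q - ET - P - QR)."""
    series = []
    sw = initial_sw
    for day in fluxes:
        sw += (day.precipitation - day.surface_runoff - day.evapotranspiration
               - day.percolation - day.return_flow)
        series.append(sw)
    return series


# Illustrative values only; they are not taken from the study.
example = [
    DailyFluxes(12.0, 3.0, 2.5, 1.0, 0.5),
    DailyFluxes(0.0, 0.0, 3.0, 0.5, 0.5),
    DailyFluxes(20.0, 8.0, 2.0, 2.0, 1.0),
]
sw_series = soil_water_series(50.0, example)
print(sw_series)
```

In the model itself, each flux is computed by a separate process routine per HRU; the sketch only shows the bookkeeping that ties them together.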

SWAT-CUP
SWAT-CUP can perform sensitivity analysis, calibration, validation, and uncertainty analysis for SWAT models. The software offers several optimization techniques, namely Sequential Uncertainty Fitting version 2 (SUFI-2), Generalized Likelihood Uncertainty Estimation (GLUE), Particle Swarm Optimization (PSO), Parameter Solution (ParaSol), and Markov Chain Monte Carlo (MCMC) (Khalid et al. 2018). SUFI-2 adopts a stochastic approach to perform model calibration and assess uncertainty in the simulation process. The technique attempts to consider all sources of uncertainty arising from model input, parameters, and measured data. Evidence from existing research in different regions suggests that SUFI-2 is computationally efficient and exhibits reasonable accuracy in calibrating and validating SWAT models (Yang et al. 2014). Wu & Chen (2015) demonstrated that SUFI-2 outperforms GLUE and ParaSol in the calibration and uncertainty analysis of the Wenjing river in China. Therefore, we chose SUFI-2 for the calibration, validation, sensitivity, and uncertainty analyses of the simulation results for the UHB. The SUFI-2 algorithm describes uncertainty as the disparity between the observed and simulated variables. Uncertainties in the calibrations are measured in terms of the P-factor and the R-factor. The P-factor refers to the percentage of measured data bracketed by the 95% prediction uncertainty (95PPU), whereas the R-factor indicates the average thickness of the 95PPU divided by the standard deviation of the measured data. The P-factor can be used as an index to quantify the model's capability in capturing uncertainties. The R-factor is regarded as an index to assess the quality of calibration and is calculated using the following equation (Rouholahnejad et al. 2012):

$$R\text{-factor} = \frac{\frac{1}{n} \sum_{t=1}^{n} \left( X_{s,97.5\%}^{t} - X_{s,2.5\%}^{t} \right)}{\sigma_{obs}}$$

where X_{s,97.5%} represents the upper and X_{s,2.5%} the lower boundary of the 95PPU for the simulated variable X_s at time step t, n is the number of observed data points, and σ_obs is the standard deviation of the observed data.
The 95PPU is measured at the 2.5 and 97.5% levels of the cumulative distribution of an output variable that is acquired through Latin hypercube sampling while rejecting 5% of the inferior simulations (Abbaspour et al. 2004). The values of the P-factor range between 0 and 1, where 1 suggests a perfect agreement between the simulated and measured data. A P-factor higher than 0.7 and an R-factor lower than 1.5 indicate adequate accuracy in model calibration and validation. However, the quest for a higher P-factor will also result in a higher R-factor. Therefore, a balance between the two factors is sought to achieve optimum results (Abbaspour et al. 2015). Detailed theoretical descriptions of SUFI-2 and other uncertainty analysis methods can be found in Abbaspour et al. (2004, 2007).
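The two uncertainty indices defined above can be sketched in a few lines. This is an illustration rather than SWAT-CUP's own code: the function names are hypothetical, the example numbers are invented, and the use of the sample standard deviation is an assumption, since the text does not state which estimator is applied.

```python
import statistics


def p_factor(observed, lower, upper):
    """Fraction of observations that fall inside the 95PPU band [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


def r_factor(observed, lower, upper):
    """Mean width of the 95PPU band divided by the standard deviation of the observations."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return mean_width / statistics.stdev(observed)


# Invented example: four observations with their 2.5% and 97.5% bounds.
observed = [10.0, 20.0, 30.0, 40.0]
lower_95ppu = [8.0, 15.0, 32.0, 35.0]
upper_95ppu = [12.0, 25.0, 38.0, 45.0]

p_value = p_factor(observed, lower_95ppu, upper_95ppu)
r_value = r_factor(observed, lower_95ppu, upper_95ppu)
print(p_value, round(r_value, 3))
```

The trade-off noted in the text is visible here: widening the band so that the third observation is bracketed would raise the P-factor but also the mean band width, and therefore the R-factor.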

Model setup
ArcSWAT
Several steps are followed to delineate the watershed, define flow paths, and calculate basin parameters according to the soil, land use, and topographic inputs. Subsequently, weather data are added to calculate different hydrological components of the basin. These steps are elaborated below.
1. DEM-based delineation is carried out to define the stream network (Figure 2), followed by output section and watershed delineation. Subsequently, sub-basin parameters are determined using the 'Calculate sub-basin parameter' function. The tool divides the entire study area into 27 sub-basins.
2. Soil and land use maps in raster format (Figures 3 and 4) are imported into the model. In addition to soil type and land use data, slope classes are discretized to define HRUs in the study basin. As a result, the study area is distributed into 1,135 HRUs using the ArcSWAT tool.
3. Daily precipitation data obtained from meteorological stations were added using the weather generator. The remaining climatic variables, including daily minimum and maximum temperatures, wind speed, relative humidity, and solar radiation, were simulated using the SWAT weather generator, WGEN. The generator simulates synthetic weather data using various core weather statistics primarily based on the occurrence of wet or dry days at a particular location (Hayhoe & Stewart 1996). WGEN requires no additional user input. Details of the SWAT weather generator and the climatic variable simulation can be found in the SWAT theoretical documentation (Neitsch et al. 2011). The same procedure was repeated while running the model for the TRMM and CFSR precipitation datasets.
4. The SWAT model is set up and run for a period starting January 1, 1998, until December 31, 2010. The first 3 years (1998-2000) are selected as a warm-up period before the model is run on a daily scale. The same procedure is followed to run the model on a monthly scale.

Setting up SWAT-CUP's SUFI-2 algorithm includes selecting parameters, defining their ranges, and choosing an appropriate objective function before running the iteration. The following steps are taken to perform calibration, validation, and sensitivity analysis for runoff simulation of the UHB.
1. Twenty-four parameters influencing runoff, listed in Table 4 [...]
3. In File.cio, the numbers of years simulated are adjusted for the calibration (2001-2006) and validation (2007-2010) periods.
4. The objective function is defined as NSE, with 0.5 as the minimum threshold value for behavioral simulations.
5. Daily values of observed discharge are added to the observed_rch.txt and observed.txt files.
6. The executable files SUFI2_pre.bat, SUFI2_run.bat, SUFI2_post.bat, and SUFI2_extract.bat are run to perform the iteration.
7. New ranges of values for the 24 parameters are acquired after each iteration. These ranges are used to modify the minimum and maximum values in the parameter input file before running a new iteration. The program also performs global sensitivity analysis to rate parameter sensitivity based on their respective P-values and t-statistics.
8. A total of three iterations (1,500 simulations) were run to obtain an acceptable value of the objective function. An additional iteration with the same number of simulations (500) and the same parameter ranges from the third iteration is run for the validation period (2007-2010).
9. The same procedure is followed to perform calibration, validation, and uncertainty analysis on a monthly scale.

Several objective functions are available to assess the performance of the SWAT model on both daily and monthly scales. These indicators include the coefficient of determination (R²), modified coefficient of determination (bR²), Nash-Sutcliffe efficiency (NSE), modified Nash-Sutcliffe efficiency (MNS), percent bias (PBIAS), and Kling-Gupta efficiency (KGE). SWAT-CUP can perform calibration and uncertainty analysis based on any of the aforementioned objective functions.
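Each SUFI-2 iteration in the procedure above draws its parameter sets by Latin hypercube sampling within the current parameter ranges. The following minimal sketch illustrates that sampling idea only; it is not SWAT-CUP's code, the function name `latin_hypercube` is hypothetical, and the two parameter ranges are invented for illustration rather than taken from Table 4.

```python
import random


def latin_hypercube(ranges, n_samples, seed=None):
    """Draw n_samples parameter sets. Each parameter range is split into
    n_samples equal strata and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n_samples
        # One uniform draw inside each stratum, then shuffle the strata
        # so that parameters are paired at random across samples.
        values = [low + (k + rng.random()) * width for k in range(n_samples)]
        rng.shuffle(values)
        columns[name] = values
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


# Hypothetical ranges for two of the parameters named in the text.
ranges = {"CN2": (-0.2, 0.2), "ALPHA_BNK": (0.0, 1.0)}
samples = latin_hypercube(ranges, n_samples=500, seed=42)
print(len(samples), samples[0])
```

With 500 samples per iteration, as used in this study, each parameter range is covered evenly, which is the property that makes the approach more economical than plain random sampling.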
In simple terms, R² is the squared value of the linear correlation, R, between the observed and simulated values. The value of R² varies from 0 to 1. Higher values of R² show minor error variance, hence indicating good model performance. A value lower than 0.5 is generally considered unacceptable, depicting an unsatisfactory performance of the model. R² is calculated using the following equation (Krause et al. 2005):

$$R^2 = \left( \frac{\sum_{i=1}^{n} \left( Q_{obs,i} - \bar{Q}_{obs} \right) \left( Q_{sim,i} - \bar{Q}_{sim} \right)}{\sqrt{\sum_{i=1}^{n} \left( Q_{obs,i} - \bar{Q}_{obs} \right)^2} \sqrt{\sum_{i=1}^{n} \left( Q_{sim,i} - \bar{Q}_{sim} \right)^2}} \right)^2$$

where Q_{obs,i} and Q_{sim,i} are the observed and simulated discharge at time step i, \bar{Q}_{obs} and \bar{Q}_{sim} are their respective means, and n is the number of time steps.

NSE is a widely used indicator to measure the efficiency of hydrological models. It is interpreted as one minus the sum of squared differences between observed and simulated discharge measurements normalized by the variance of the observed values during the investigation period. NSE values range from −∞ to 1, with 1 being the optimal value. An NSE value higher than 0.5 is believed to indicate a satisfactory model performance, whereas negative values would indicate that the average of observed discharge is a better predictor than the simulated discharge value (Nash & Sutcliffe 1970). Mathematically, it is calculated as:

$$NSE = 1 - \frac{\sum_{i=1}^{n} \left( Q_{obs,i} - Q_{sim,i} \right)^2}{\sum_{i=1}^{n} \left( Q_{obs,i} - \bar{Q}_{obs} \right)^2}$$

PBIAS measures the tendency of overestimation or underestimation in the simulated discharge data compared with the actual observed data. The optimum value for PBIAS is 0, with absolute values of 25% or more indicating unsatisfactory model simulation (Yapo et al. 1996). The following mathematical equation is used to calculate PBIAS [...]

It has been noted that runoff peaks tend to be underestimated when NSE is employed in optimization, which may lead to the inadequate calculation of peak flows. Therefore, KGE, another model evaluation indicator, was proposed by Gupta et al. in 2009 to address issues associated with the underestimation of flood peaks. It is worth mentioning that KGE also underestimates runoff peaks when used in optimization. However, the effect will be milder in comparison with NSE (Gupta et al. 2009).
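The following equation is used to compute KGE:

$$KGE = 1 - \sqrt{(r - 1)^2 + \left( \frac{\sigma_{sim}}{\sigma_{obs}} - 1 \right)^2 + \left( \frac{\mu_{sim}}{\mu_{obs}} - 1 \right)^2}$$

where r is the linear correlation coefficient between the simulated and observed values, σ_sim and σ_obs are the standard deviations of the simulated and observed values, respectively, and μ_sim and μ_obs are the simulation and observation means, respectively. The optimal value for KGE is 1, indicating a perfect agreement between the observed and simulated values. KGE ≤ 0.41 is considered unacceptable, and values higher than 0.5 are desired for a satisfactory model simulation.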
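The four indicators discussed above can be computed from paired observed and simulated series as in the following sketch. The function names and the example series are hypothetical. The sign convention for PBIAS is an assumption chosen so that positive values mean overestimation, which matches how positive and negative PBIAS values are interpreted in most of the results below; some references define PBIAS with the opposite sign.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def pearson_r(obs, sim):
    """Linear correlation coefficient between observed and simulated values."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def r_squared(obs, sim):
    """Coefficient of determination as the squared linear correlation."""
    return pearson_r(obs, sim) ** 2


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mo = _mean(obs)
    return 1.0 - (sum((o - s) ** 2 for o, s in zip(obs, sim))
                  / sum((o - mo) ** 2 for o in obs))


def pbias(obs, sim):
    """Percent bias. ASSUMED sign convention: positive means overestimation."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    mo, ms = _mean(obs), _mean(sim)
    sd_o = math.sqrt(_mean([(o - mo) ** 2 for o in obs]))
    sd_s = math.sqrt(_mean([(s - ms) ** 2 for s in sim]))
    r = pearson_r(obs, sim)
    return 1.0 - math.sqrt((r - 1) ** 2 + (sd_s / sd_o - 1) ** 2 + (ms / mo - 1) ** 2)


# Invented discharge values, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 41.0]
print(r_squared(obs, sim), nse(obs, sim), pbias(obs, sim), kge(obs, sim))
```

Because the standard deviations enter KGE only as a ratio, the choice between the population and sample estimator does not change the result.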

Rain gauge
Daily streamflow simulated using conventional rain gauge precipitation along with the observed daily streamflow is plotted in Figure 5 for calibration and validation periods. The graphs also illustrate the results of prediction uncertainties for each period. For the Bengbu hydrological station, an NSE value of 0.77 and an R² value of 0.79 were observed during the calibration period. Nonetheless, the value of NSE fell to 0.68 and R² to 0.72 during the validation period for daily streamflow simulation. Daily simulation using the SWAT model overestimated discharge (PBIAS = 9.5%) during calibration and underestimated it (PBIAS = −12.4%) during the validation period. However, the calibration and validation performances are regarded as 'good' according to the model evaluation guidelines provided in Moriasi et al. (2007). The model uncertainty analysis indicates a P-factor of 0.86 and an r-factor of 1.05 during the calibration period. Meanwhile, the P-factor was 0.89, and the r-factor was 1.07 during validation. Additionally, both calibration and validation plots exhibit a reasonable agreement with the daily streamflow recorded at Bengbu station. Figure 5 demonstrates that the model tends to overestimate runoff volumes during the summer peaks of 2003 and 2007.

TRMM 3B42
TRMM's daily precipitation product 3B42 Version 7 was used to drive hydrological simulation for the UHB. Daily simulated streamflow using satellite data along with the observed streamflow data is plotted in Figure 6 for calibration and validation periods. Figure 6 also includes 95PPU plots for the calibration and validation of the SWAT model for the upstream catchment area of the Bengbu hydrological station. The results for the objective function, NSE, and other performance indicators are listed in Table 6. It can be noted that the SWAT model produced satisfactory simulation results for both calibration and validation periods. The model showed negligible bias (PBIAS = 0.8%) during calibration and a moderate underestimation (PBIAS = −16.3%) in the validation period. The NSE values were 0.70 and 0.66 during the calibration and validation periods, respectively. The percentage of data bracketed by 95PPU was reasonable during calibration (P-factor = 0.74) and validation (P-factor = 0.73). The r-factor, however, was 2.12 during calibration and 2.23 during validation. Nonetheless, both visual and mathematical examination of the calibration and validation results show that the simulated streamflow is in acceptable agreement with the observed streamflow data.

NCEP-CFSR
The daily precipitation data series obtained from NCEP-CFSR was used to drive the SWAT model for daily streamflow simulation for 2001-2010. The calibration period (2001-2006) achieved NSE and R² values of 0.80 and a PBIAS value of −2.3%. The resultant P-factor and R-factor were 0.65 and 0.95, respectively, exhibiting a satisfactory calibration strength. Both NSE and R² dropped to 0.73 with an underestimation of 6.9% during the validation period (2007-2010). PPU plots for calibration and validation of the daily SWAT simulation using NCEP-CFSR data are visualized in Figure 7. Model performance, given in Table 6, using CFSR data can be deemed 'good' during both calibration and validation periods.

Rain gauge
The results of the SWAT model monthly streamflow simulations for the Bengbu hydrological station are presented in Figure 8 for the calibration and validation periods. The plot also illustrates prediction uncertainty for both the calibration and validation periods. A visual inspection reveals that the simulated monthly streamflow coincides nicely with the mean monthly discharge recorded at the Bengbu hydrological station. The model also appears to represent summer peaks reasonably well, with some discrepancies observed in the summer months of 2005, 2006, and 2010. The model uncertainty analysis resulted in an r-factor of 0.84 during both the calibration and validation periods. The P-factor, however, improved from 0.60 during calibration to 0.69 during validation.
The outcomes of the objective function as well as other performance indices, including R², NSE, KGE, and PBIAS, are provided in Table 7. An NSE value of 0.82 and an R² value of 0.84 were achieved in the calibration of the SWAT model while using the conventional precipitation data. However, an overestimation bias (PBIAS = −18.0%) was observed during the calibration period for monthly streamflow simulation. During the validation period, the model exhibited an equivalent performance with an NSE value of 0.81, an R² value of 0.82, and a PBIAS value of 6.4%. The results indicate that the SWAT model presents a satisfactory performance in simulating monthly streamflow for the UHB while using rain gauge precipitation.
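The monthly evaluation compares simulated monthly streamflow with the mean monthly discharge recorded at the station. A minimal sketch of how a daily discharge series might be averaged into calendar-month means is given below; the function name `monthly_means` and the example values are hypothetical, and the sketch assumes a gap-free daily series starting at a known date.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(start, daily_values):
    """Average a gap-free daily series into calendar-month means keyed by (year, month)."""
    buckets = defaultdict(list)
    for offset, value in enumerate(daily_values):
        day = start + timedelta(days=offset)
        buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


# Invented discharge values: constant 100 in January 2001, constant 200 in February 2001.
daily = [100.0] * 31 + [200.0] * 28
means = monthly_means(date(2001, 1, 1), daily)
print(means)
```

Averaging smooths day-to-day timing errors, which is one reason monthly scores tend to be higher than daily ones for the same model.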

TRMM 3B42
TRMM's daily precipitation product 3B42 Version 7 was used to drive the SWAT model to simulate monthly streamflow generated at the upstream catchment of the Bengbu hydrological station. Results of model simulation for both the calibration and validation periods, along with the measured mean monthly discharges and 95PPU, are illustrated in Figure 9. Upon visual inspection, it can be observed that the simulated results correspond reasonably well with the observed discharges recorded at the Bengbu hydrological station. However, it is evident that the simulated results failed to capture the summer peaks of 2010. It is also observed that the model underestimates peak flows in the summers of 2006 and 2008 while overestimating streamflow in 2005 and 2009. The P-factor was 0.60 throughout the streamflow simulation, whereas the r-factor was 0.79 during calibration and 0.86 during validation. Table 7 lists the results of performance indicators when the model is driven on a monthly scale using satellite precipitation data. NSE values of 0.73 and 0.69 were observed in the calibration and validation periods for monthly streamflow simulation. Similarly, the model's R² values were 0.74 and 0.70 for calibration and validation periods, respectively. The model performed fairly well in terms of estimation bias, with PBIAS of −9.1% for calibration and −7.0% for validation. The model [...]

NCEP-CFSR
NCEP's daily precipitation reanalysis product CFSR was used to simulate monthly streamflow at the Bengbu hydrological station. The model results after calibration and validation are illustrated in Figure 10. A visual inspection of the validation plot reveals that the model underestimates the peak flow of summer 2007 and overestimates that of summer 2008. Overall, the model performed satisfactorily in terms of NSE, R², KGE, and PBIAS (Table 7) during both the calibration and validation periods. The uncertainty in model parameters was evaluated using the P-factor and R-factor. A P-factor of 0.72 was observed for calibration and 0.81 for validation, indicating that over 70% of the observed streamflow data fall within the 95PPU envelope. The thickness of the 95PPU envelope, represented by the R-factor, was 1.10 and 0.96 for calibration and validation periods, respectively.

Sensitivity analysis
Sensitivity analysis measures how responsive a model is to the variation in its input parameters and data. It employs various techniques to assess the association between the uncertainty in model output and the uncertainty in model input (Salciccioli et al. 2016). We performed a sensitivity analysis using SWAT-CUP to identify significant parameters influencing streamflow in the UHB. SWAT uses a slew of input parameters impacting the hydrological component of watershed modeling. However, we chose 24 parameters to perform sensitivity analysis and to calibrate and subsequently validate the model. Global parameter identifiers were set for all HRUs, i.e., parameter ranges and calibrated values remain the same throughout the UHB. Parameters with multiplicative terms (r prefix) indicate an increment in the respective parameter of a particular HRU, whereas for the remaining parameters (v prefix), the existing parameter value is replaced by a given value for all HRUs and sub-basins in the UHB. P-value and t-stat were used as indicators to measure the sensitivity of a particular parameter. Model sensitivity to a certain parameter increases with an increase in the absolute value of the parameter's t-stat and a decrease in its P-value. The detailed global sensitivity results of each parameter, along with their calibrated (fitted) values acquired through the SUFI-2 algorithm, are given in Table 8. The outcomes of sensitivity analysis for each set of precipitation data and the temporal scale of the simulation are elaborated below.

RCHRG_DP, GW_DELAY, CN2, EPCO, ESCO, ALPHA_BNK, and SMFMX were found to be the most sensitive parameters in global sensitivity analysis when rain gauge precipitation data were used to drive the SWAT model on a daily scale.
Moreover, the developed model exhibited a high sensitivity to changes in the CN2, ALPHA_BNK, EPCO, CH_N2, GW_REVAP, GW_DELAY, and SURLAG parameters when the model was run to simulate daily streamflow using the satellite precipitation product 3B42. For the CFSR dataset, however, ALPHA_BNK, CH_K2, CN2, ESCO, SOL_AWC, REVAPMN, and EPCO were observed to be the most sensitive parameters.
For monthly streamflow simulation, we observe that CN2, GW_DELAY, SOL_BD, RCHRG_DP, SOL_K, ALPHA_BNK, SMFMX, and GWQMN exert the most significant impact on the model's output when using gauge precipitation data. In contrast, GW_DELAY, SOL_BD, CN2, RCHRG_DP, ALPHA_BNK, SMTMP, SOL_AWC, and CH_K2 were observed as the most sensitive parameters with a considerable influence on simulation outputs when using satellite precipitation data as the model input. Among the 24 parameters used, ALPHA_BNK, SMTMP, SMFMN, SOL_AWC, ESCO, TIMP, and SLSUBBSN were seen as the most sensitive after a global sensitivity analysis was performed for the CFSR-driven monthly simulation model.
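The distinction between the r-prefixed and v-prefixed parameters described in this section can be made concrete with a small sketch. It follows the description given above, with a relative change multiplying the existing value by one plus the given term and a replacement overwriting it; the function `apply_change` and the example values are hypothetical and are not part of SWAT-CUP.

```python
def apply_change(current_value, change_type, x):
    """Apply a calibration change to a single parameter value.

    'r' (relative): the existing value is multiplied by (1 + x).
    'v' (replace):  the existing value is discarded and x is used instead.
    """
    if change_type == "r":
        return current_value * (1.0 + x)
    if change_type == "v":
        return x
    raise ValueError(f"unknown change type: {change_type!r}")


# Invented example values: a curve number of 75 reduced by 10%,
# and a bank-flow recession constant replaced outright.
new_cn2 = apply_change(75.0, "r", -0.1)
new_alpha_bnk = apply_change(0.05, "v", 0.3)
print(new_cn2, new_alpha_bnk)
```

A relative change preserves the spatial pattern of a parameter across HRUs, whereas a replacement assigns the same value everywhere, which is why spatially variable parameters such as CN2 are commonly adjusted in relative terms.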

DISCUSSION
Basin-scale model calibration is an arduous challenge due to uncertainties arising from the simplification of intricate processes occurring in the catchment area, or from processes that might have been inadvertently ignored. For instance, the existence of reservoirs and wetlands within the watershed may significantly alter discharges at the downstream outlet (Abbaspour et al. 2015). Information associated with these variables is not always known to the modeler. This is particularly true for large basins with limited ground observations. Another prominent source of uncertainty in distributed hydrological models stems from the regionalization of climatic inputs such as precipitation and temperature data. A reasonably dense and reliable network of stations and gauges is required for accurate hydrological modeling and flood forecasting. Recent developments in space technology may provide cost-effective alternatives to ground-based observations (Kumar et al. 2016). This study investigates one such example of a large watershed with limited measured data using precipitation from three different sources.
We analyzed the application of the gridded rainfall products 3B42 version 7 provided by TRMM and CFSR provided by the NCEP in driving the SWAT model in the UHB on both daily and monthly scales. The results were compared with the simulation results obtained by using precipitation data acquired from three rain gauge stations in the vicinity of the study area. All modeling scenarios were calibrated and validated against discharge data measured at the Bengbu hydrological station. We used SWAT-CUP's SUFI-2 algorithm for automatic calibration, validation, and uncertainty analysis of different input parameters. Several indices, including NSE, R², PBIAS, R-factor, and P-factor, were employed to assess the efficiency of simulation outcomes. Our results are reasonably consistent with the existing research on hydrological modeling using TRMM precipitation datasets in other watersheds discussed in the Introduction section.
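For readers who wish to reproduce the goodness-of-fit indices outside SWAT-CUP, the following is a minimal sketch of NSE, PBIAS, R², and KGE. The formulas are the commonly used definitions and are an assumption here, since this section does not restate them; in particular, PBIAS is written so that positive values indicate underestimation, and KGE uses the original three-component form. The function names are illustrative and are not part of SWAT or SWAT-CUP.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    # Population standard deviation (assumed; the choice cancels out in the KGE ratio).
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    m = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den


def pbias(obs, sim):
    """Percent bias; positive values mean the simulation underestimates (assumed convention)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def pearson_r(obs, sim):
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim))
    return cov / (so * ss)


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    return pearson_r(obs, sim) ** 2


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    r = pearson_r(obs, sim)
    alpha = _std(sim) / _std(obs)    # variability ratio
    beta = _mean(sim) / _mean(obs)   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Each function takes two equal-length sequences of observed and simulated discharge at the same time step, so the same code applies to daily and monthly series.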
The simulated streamflow at Bengbu station using rain gauge precipitation and satellite precipitation (TRMM) is plotted against the recorded daily discharges in Figure 5. A visual inspection of the hydrograph reveals that simulated streamflow using rain gauge data corresponds reasonably well with the observed daily streamflow recorded at Bengbu station. However, the daily simulation curve using TRMM precipitation drifts from the measured data at various points. This discrepancy is pronounced during the peak flows of summer 2005 and 2009 (overestimation) and the summer of 2010 (underestimation). Indices utilized to assess the accuracy of daily simulation results using rain gauge precipitation show a 'good' to 'very good' performance. In comparison, daily simulated streamflow using TRMM data led to 'satisfactory' performance during the model calibration and validation phases. The P-factor and R-factor were employed to quantify the strength of the calibration and validation analysis. The analysis reveals acceptable values of both the P-factor (≥ 0.86) and the R-factor (≤ 1.07) for the daily streamflow simulations using rain gauge precipitation. Daily simulation using TRMM precipitation also indicates that the 95PPU band reasonably envelops the observed streamflow (P-factor ≥ 0.73). However, large R-factor values (≥ 2.12) were observed, indicating high uncertainties in the calibration and validation of daily streamflow using TRMM data.
CFSR, on the other hand, outperformed daily rain gauge and TRMM simulations in terms of NSE, R², KGE, and PBIAS. However, higher parameter uncertainties (P-factor = 0.65) were observed in model calibration for CFSR-driven daily SWAT simulations.
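The two uncertainty measures can be illustrated with a short sketch. It assumes the usual SUFI-2 definitions: the 95PPU band is bounded by the 2.5th and 97.5th percentiles of the simulated ensemble at each time step, the P-factor is the fraction of observations falling inside the band, and the R-factor is the mean band width divided by the standard deviation of the observations. The function names and the interpolation rule for percentiles are illustrative assumptions, not SWAT-CUP's own code.

```python
import math


def _percentile(sorted_vals, q):
    """Percentile by linear interpolation between ranks (q in 0..100)."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def ppu95(ensemble):
    """Lower and upper 95PPU bounds from an ensemble of simulated series.

    `ensemble` is a list of equal-length lists, one per sampled parameter set.
    """
    lower, upper = [], []
    for step_values in zip(*ensemble):
        vals = sorted(step_values)
        lower.append(_percentile(vals, 2.5))
        upper.append(_percentile(vals, 97.5))
    return lower, upper


def p_factor(obs, lower, upper):
    """Fraction of observations bracketed by the 95PPU band (ideal value: 1)."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)


def r_factor(obs, lower, upper):
    """Mean band width relative to the spread of the observations (ideal value: near 0)."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(obs)
    mean_obs = sum(obs) / len(obs)
    # Population standard deviation is assumed here.
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    return mean_width / std_obs
```

Under these definitions, a wide band can bracket most observations and yield a high P-factor while still producing a large R-factor, which is the pattern reported for the TRMM-driven daily simulations.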
The model performance improved for CFSR, TRMM, and ground-based precipitation when the simulation was carried out on a monthly scale. Visual analysis of the monthly scale hydrograph using rain gauge data and CFSR data reveals a reasonable consistency between simulated and observed discharges during both high and low flow periods. However, for the rain gauge dataset, the simulated hydrograph overestimates the summer peak of July 2004 and underestimates the peak in July 2005. For the CFSR dataset, an overestimation in peak flow is seen in the summers of 2001 and 2008, whereas an underestimation is observed during the summer of 2007. Monthly scale simulation using TRMM data shows a significant underestimation of the summer peaks of 2003, 2004, and 2010 and a significant overestimation in 2005. The performance indices showed comparable results for rain gauge, reanalysis (CFSR), and satellite (TRMM) precipitation when the model was calibrated and validated for monthly streamflow simulations (Figure 11). The values of the R-factor and P-factor also show an equivalent performance for the three precipitation datasets in the uncertainty analysis of monthly simulation results. Nonetheless, CFSR exhibits superior performance in terms of nearly all the indices employed. Overall, we observe that the model gave better results for rain gauge, reanalysis, and satellite rainfall when the simulation was executed on a monthly scale. Moreover, rain gauge precipitation, despite its sparsity in the study area, produced better simulation results than TRMM. Nevertheless, the reanalysis precipitation dataset produced the best simulation results in the UHB.
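As a rough illustration of how a daily discharge series relates to its monthly counterpart, the sketch below averages daily values by calendar month. This is an assumption made for illustration only: in this study the monthly results come from monthly-scale SWAT simulations, not from post-processing of daily output, and the helper name `monthly_means` is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(dates, values):
    """Average daily values by (year, month), returned in chronological order."""
    buckets = defaultdict(list)
    for d, v in zip(dates, values):
        buckets[(d.year, d.month)].append(v)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


# Example: 59 days of synthetic daily discharge covering January and February 2005.
start = date(2005, 1, 1)
days = [start + timedelta(days=i) for i in range(59)]
flow = [1.0 if d.month == 1 else 3.0 for d in days]
monthly = monthly_means(days, flow)
```

Averaging over a month smooths out timing errors of a few days in simulated peaks, which is one plausible reason why indices computed on monthly series tend to be higher than those computed on daily series.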
Sensitivity analyses were performed using the SUFI-2 algorithm for six sets of SWAT simulations, i.e., daily and monthly scale models driven by rain gauge data, by 3B42 version 7, and by CFSR as the precipitation input. The analysis was carried out independently for each model using 24 parameters. CN2.mgt, RCHRG_DP.gw, ALPHA_BNK.rte, GW_DELAY.gw, SMTMP, ESCO, SOL_BD.sol, and EPCO.hru were seen as the most sensitive parameters influencing runoff generated from the UHB. A stark reduction was observed in the curve number after the models were calibrated using rain gauge and TRMM datasets. Lower CN2 values result in less runoff generated from rainfall and less streamflow conveyed to the watershed outlet. This trend implies that, before calibration, the simulated discharge from the UHB was higher than the observed discharge at the Bengbu hydrological station. The significant reduction in generated streamflow can be attributed to extensive human activities, including retention structures such as reservoirs and water withdrawal for agricultural and industrial consumption within the catchment area. The significant reduction in CN2 values, and the subsequent decrease in simulated streamflow, was not observed when using the CFSR dataset. This may be attributed to the underestimation of precipitation seen in the CFSR rainfall dataset, as illustrated in Figure 11. A close visual inspection of Figure 11 also reveals that the variations in simulated streamflow remain relatively consistent with the input precipitation.
Furthermore, it is pertinent to mention that the best-fit values of the parameters and their ranks vary among models using different precipitation inputs and simulation time scales, as each set was calibrated and validated individually (Table 8). However, it is impractical to determine the true physical value of best-fit parameters in a large watershed with complex processes involving water movements through land and atmosphere (Strauch et al. 2012). Incorporating the influences of anthropogenic activities may provide a more holistic depiction of the actual hydrologic processes taking place in the basin. Nonetheless, the model developed in this study serves as a viable alternative in areas where data on these activities are limited or where access to such information is restricted.
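The link between a lower curve number and lower runoff can be illustrated with the standard SCS curve number equation in its metric form. This sketch is a simplification and rests on assumptions: it uses the conventional initial abstraction of 0.2 times the retention parameter and a fixed curve number, whereas SWAT adjusts the curve number daily according to soil moisture. The function name and the example rainfall depth are illustrative.

```python
def scs_runoff_mm(rain_mm, cn, ia_ratio=0.2):
    """Surface runoff depth (mm) from a daily rainfall depth using the SCS-CN method.

    cn is the curve number (0 < cn <= 100); ia_ratio is the assumed ratio of
    initial abstraction to the retention parameter S.
    """
    retention = 25400.0 / cn - 254.0          # retention parameter S in mm
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        return 0.0                             # all rainfall is abstracted
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Example: the same 50 mm storm under two curve numbers.
runoff_cn80 = scs_runoff_mm(50.0, 80.0)
runoff_cn70 = scs_runoff_mm(50.0, 70.0)
```

For the same storm, the lower curve number yields the smaller runoff depth, which is why calibration can offset unrepresented storage and withdrawals by reducing CN2.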

CONCLUSION
Access to accurate and reliable hydrometeorological data is essential for optimal water resource assessment. Gridded precipitation products offer a readily available and cost-effective alternative to ground-based rain gauge datasets. Here, we assessed the efficiency of TRMM 3B42 and NCEP-CFSR precipitation datasets in driving a hydrological model for streamflow simulation. The SWAT model was implemented to simulate daily and monthly streamflow at the Bengbu hydrological station located downstream of the UHB. Daily and monthly simulations were also performed using daily gauge precipitation data recorded at three stations in the area. It is worth noting that numerous water storage and diversion structures have been constructed to regulate the river flow, as the region lies in a floodplain. However, data associated with these structures were not accessible for this research. SWAT-CUP's SUFI-2 algorithm was utilized to perform sensitivity analysis and to calibrate and validate the modeling results. NSE, R², PBIAS, and KGE values were used as performance indicators to evaluate the accuracy of the simulation results. Their values demonstrate that rain gauge precipitation provides better streamflow simulation results than the TRMM dataset in the area on both monthly and daily scales. The model performance was enhanced further when CFSR precipitation datasets were used for streamflow simulations. The models performed better on monthly scales than on daily scales for all three precipitation inputs. We can deduce that TRMM and CFSR can provide reliable estimates for streamflow modeling in the absence of gauge rainfall records in the basin. Additionally, SWAT-CUP can be used to successfully calibrate different parameters influencing runoff when ground observations are insufficient.

DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.