Evaluation of multisource precipitation input for hydrological modeling in an Alpine basin: a case study from the Yellow River Source Region (China)

Alpine basins are typically poorly gauged and inaccessible owing to the harsh prevailing environment and complex terrain. In this study, two representative satellite precipitation products (Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42RTV7 and Integrated Multi-Satellite Retrievals for GPM (IMERG) Final Run Version 06) and two reanalysis precipitation products (China Meteorological Assimilation Driving Datasets for the SWAT model (CMADS) and Climate Forecast System Reanalysis (CFSR)) in the Yellow River Source Region (YRSR) were selected for evaluation and hydrological verification against gauge-observed (GO) data. Results show that the accuracy of these precipitation products is higher in the warm season than in the cold season, and IMERG exhibits the best performance, followed by CMADS, CFSR, and 3B42RTV7. Models using GO as input yielded satisfactory performance during 2008-2013, whereas models driven by the precipitation products alone gave poor simulation results. Although the model using IMERG as input yielded unsatisfactory performance during 2014-2016, this does not rule out IMERG as a potential data source for the YRSR. The model driven by the combination of GO and CMADS precipitation performed the best of all scenarios (R² = 0.77, Nash-Sutcliffe efficiency (NSE) = 0.72 at the Tangnaihai station; R² = 0.53, NSE = 0.48 at the Jimai station).


INTRODUCTION
Accurate precipitation data are key to hydrological modeling (Strauch et al. 2012; Galván et al. 2014; Monteiro et al. 2016; Duan et al. 2019). However, owing to the sparsity of many gauge networks and the large spatio-temporal variability of precipitation events (Zhu et al. 2016; Lu et al. 2018), obtaining accurate precipitation data has always been challenging, especially in Alpine basins (Guo et al. 2016; Yuan et al. 2018; Bhatta et al. 2019), which has greatly hindered hydrological simulation research in these basins (Tuo et al. 2016). Satellite and reanalysis precipitation products provide an unprecedented opportunity to obtain precipitation data with a high spatio-temporal resolution.
To date, many satellite and reanalysis precipitation products have been developed and released to the public, such as the Global Precipitation Measurement (GPM) (Hou et al. 2013), Tropical Rainfall Measuring Mission (TRMM) (Huffman et al. 2010a), Climate Hazards Group Infrared Precipitation with Station data (CHIRPS) (Funk et al. 2015), China Meteorological Assimilation Driving Datasets for the SWAT model (CMADS) (Meng et al. 2019), and Climate Forecast System Reanalysis (CFSR) (Saha et al. 2010). As these products offer extensive coverage, high spatio-temporal resolution, and continuity of measurement (Bajracharya et al. 2015; Prakash et al. 2016), they have been widely applied in hydrological studies across many regions (Fuka et al. 2014; De Almeida Bressiani et al. 2015; Auerbach et al. 2016; Roth & Lemann 2016; Cao et al. 2018; Awange et al. 2019; Duan et al. 2019). Research on satellite and reanalysis precipitation products in hydrological models falls into two categories. In the first, the products directly drive hydrological models to study the influence of precipitation data quality on the accuracy of hydrological simulations (Strauch et al. 2012; Tuo et al. 2016; Zhu et al. 2016; Nhi et al. 2018; Duan et al. 2019). In the second, hydrological models are driven by precipitation products before and after correction to evaluate the quality of the correction method, especially for precipitation products that do not perform well in the hydrological model (Sheng et al. 2017; Deng et al. 2019; Wang et al. 2020). However, most of these studies focus on low-altitude basins with dense in situ gauge observations, because satellite and reanalysis precipitation products in such areas are less affected by topography and the abundant gauge-observed (GO) data make them easier to evaluate and correct. Alpine basins are important for the conservation of water resources (Viviroli & Weingartner 2004; Immerzeel et al. 2009) and are sentinel outposts that respond to climate change (Immerzeel et al. 2010; Shakil et al. 2015); an example is the Tibetan Plateau, known as the 'Asian water tower' (Immerzeel et al. 2010). It is therefore more meaningful to assess the quality of satellite and reanalysis precipitation products in an Alpine basin and to identify products suitable for hydrological-runoff simulation there.
In recent years, many scholars (Yuan et al. 2018; Deng et al. 2019; Duan et al. 2019; Yw et al. 2019) have discussed the hydrological application of satellite and reanalysis precipitation products in Alpine basins. However, most studies focus on the influence of precipitation product quality on hydrological simulation accuracy. These studies show that precipitation data from sparse in situ gauge observation stations perform better in hydrological models than satellite and reanalysis precipitation products with a high spatio-temporal resolution. Yuan et al. (2018) evaluated the quality of the TRMM Multi-satellite Precipitation Analysis 3B42V7 and the Integrated Multi-satellite Retrievals for GPM (IMERG) Final Run Version 05 precipitation products and their hydrological utility in the Yellow River Source Region (YRSR). Their results showed that GO performs better than IMERG and TRMM precipitation data. In the Upper Gilge Abay basin, Duan et al. (2019) verified the applicability of CHIRPS, TRMM, and CFSR in hydrological models using the Soil and Water Assessment Tool (SWAT) and likewise found that GO performed best. Ground-based rain gauge data are generally considered more accurate because they are direct measurements of precipitation (Qin et al. 2014). However, rain gauges provide point measurements and are commonly unevenly distributed (Chappell et al. 2013), so they may not effectively reflect the spatio-temporal variability of precipitation systems (Anagnostou et al. 2009). Satellite and reanalysis precipitation products offer large areal coverage (Bajracharya et al. 2015; Prakash et al. 2016) and can supplement precipitation information in areas without stations. Coordinating the advantages of GO, satellite, and reanalysis precipitation data is therefore the key to hydrological-runoff simulation in Alpine basins.
The YRSR, with high solar radiation and a low temperature, was selected as a case study in the present research. Combined with the distributed hydrological model SWAT, two satellite precipitation products (TMPA 3B42RTV7 and IMERG Final Run Version 06) and two reanalysis precipitation products (CMADS and CFSR) were statistically and hydrologically validated. This study entailed (1) using GO to evaluate the quality of 3B42RTV7, IMERG, CMADS, and CFSR at grid and basin scales; (2) driving the hydrological model with precipitation data before and after correction; and (3) driving the hydrological model with a combination of GO and satellite or reanalysis precipitation products, namely using GO in areas with gauges and satellite or reanalysis precipitation products in areas without gauges. To the best of our knowledge, the hydrological evaluation of the combination of GO and satellite or reanalysis precipitation products in the YRSR has not yet been reported. The results of this study have implications for improving water supply, flood forecasting, and ecosystem protection in Alpine basins and their downstream regions.

Study area
The YRSR, with a drainage area of ∼122,000 km² accounting for ∼15% of the area of the Yellow River basin, is located in the north-east of the Qinghai-Tibetan Plateau (roughly between 95°30′-103°30′E and 32°30′-36°20′N). Elevations range from 2,675 to 6,253 m and decrease from the south-west to the north-east (Figure 1). The YRSR is classified as having a typical Alpine climate (Xu & He 2006), with intense sunshine and diurnal temperature changes. Rainfall is predominantly concentrated in the flood season (June-October), which accounts for ∼75% of annual precipitation, and snowfall is primarily concentrated between September and May (Hu et al. 2011). Precipitation runoff is the predominant form of runoff in the YRSR, accounting for ∼96% of all runoff (Liu & Chang 2005).
The YRSR was selected as the study area for three reasons: (1) the YRSR provides fresh water to hundreds of millions of people downstream; (2) it is subject to little anthropogenic impact, with approximately half a million inhabitants in total; and (3) the YRSR is a sensitive zone in response to climate change (Junliang et al. 2013).

Precipitation dataset
Five types of precipitation datasets, namely the GO, IMERG Final Run V6, TMPA 3B42RTV7, CMADS, and CFSR, were selected for this study (Table 2).
The GO was derived from the daily surface meteorological data of the China Meteorological Data Network. There are only 11 in situ gauged observation stations in the YRSR; most of them are located downstream, and only one station (Maduo) is located upstream (Figure 1(c)). The TRMM was launched in 1997 by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA) to provide satellite monitoring of global precipitation. In 2015, the TRMM mission ended, the instruments were shut down, and the spacecraft re-entered the Earth's atmosphere. In this study, the 3B42RTV7 daily precipitation product was used, which is a near real-time precipitation product generated using the TRMM TMPA Version 7 algorithm (Huffman et al. 2010b). To the best of our knowledge, the hydrological evaluation of the 3B42RTV7 daily precipitation product in the YRSR has not yet been studied. The GPM was launched in February 2014 as the successor to the TRMM, providing the next generation of global precipitation products. The IMERG precipitation products are GPM's level-3 products produced by the IMERG algorithm. The products can be divided into three levels in terms of timeliness: Early-Run, Late-Run, and Final-Run. The Final-Run product is generally considered more accurate than the quasi-real-time products (Early- and Late-Run). In this study, the IMERG Final-Run V6 daily precipitation product from January 1, 2008 to December 31, 2016 was selected, of which the precipitation data from January 2008 to January 2014 were calculated from the original TMPA remote sensing imagery by the IMERG algorithm. The TRMM and IMERG precipitation products are currently two satellite precipitation products that have been widely applied in hydrological simulations (Nhi et al. 2018; Yuan et al. 2018; Duan et al. 2019).
The CMADS is a reanalysis dataset established using the China Meteorological Administration atmospheric assimilation system technology and multiple other scientific methods (Meng et al. 2019). The CMADS covers 9 years (January 1, 2008 to December 31, 2016). The application potential of the CMADS in hydrological modeling has been verified in many watersheds in China (Li et al. 2019; Meng et al. 2019; Zhang et al. 2020). The CFSR is a reanalysis dataset developed by the National Centers for Environmental Prediction (NCEP) that covers 36 years (January 1, 1979 to December 31, 2014). The CFSR has two versions, CFSv1 (Saha et al. 2010) and CFSv2 (Saha et al. 2014), of which the CFSv1 only released data before 2011; data from 2011 to 2014 were provided by the CFSv2. The CFSR is currently one of the most widely used reanalysis datasets in hydrological modeling worldwide (Zhu et al. 2015; Ruan et al. 2017; Yang et al. 2020) because of advantages such as its long time span, high spatial resolution, and convenient data acquisition. The CMADS and CFSR are both available on the ArcSWAT official website.

Other data
In addition to precipitation data, the following data are needed for model construction and verification:
• Meteorological data: derived from the daily surface meteorological data of the China Meteorological Data Network (Version 3.0) (http://data.cma.cn/), including maximum/minimum temperature, relative humidity, wind speed, and hours of sunshine. The solar radiation was calculated using the Ångström-Prescott equation as detailed in Wu et al. (2012).
• Streamflow data: observed daily streamflow data at the Tangnaihai (TNH) station and Jimai (JM) station from January 1, 2008 to December 31, 2015 were collected from the Nanjing Hydraulic Research Institute, China.
Figure 1(c) shows the spatial distribution of meteorological and hydrological stations. The projection coordinate system of the DEM, land use, and soil map was set to WGS_1984_Albers, with a central meridian of 100°E and standard parallels (north latitude) of φ1 = 33.5° and φ2 = 38°.
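The Ångström-Prescott relation mentioned above estimates daily solar radiation from sunshine duration as Rs = (a + b·n/N)·Ra, where n is the recorded sunshine hours, N the maximum possible daylight hours, and Ra the extraterrestrial radiation. The following is a minimal sketch only: the coefficients a = 0.25 and b = 0.50 are the generic FAO-56 defaults and are an assumption here (the coefficients used in Wu et al. (2012) are not stated in this text), and the function names are illustrative.

```python
import math

SOLAR_CONSTANT = 0.0820  # MJ m^-2 min^-1, value used in FAO-56


def extraterrestrial_radiation(lat_deg, day_of_year):
    """Return (Ra in MJ m^-2 d^-1, maximum daylight hours N) following FAO-56."""
    phi = math.radians(lat_deg)
    angle = 2.0 * math.pi * day_of_year / 365.0
    d_r = 1.0 + 0.033 * math.cos(angle)        # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(angle - 1.39)     # solar declination (rad)
    omega_s = math.acos(-math.tan(phi) * math.tan(delta))  # sunset hour angle (rad)
    ra = (24.0 * 60.0 / math.pi) * SOLAR_CONSTANT * d_r * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )
    return ra, 24.0 / math.pi * omega_s


def angstrom_prescott(sunshine_hours, lat_deg, day_of_year, a=0.25, b=0.50):
    """Solar radiation Rs (MJ m^-2 d^-1); a and b are assumed default coefficients."""
    ra, max_hours = extraterrestrial_radiation(lat_deg, day_of_year)
    return (a + b * sunshine_hours / max_hours) * ra
```

In practice a and b would be replaced by locally calibrated values where these are available, since the defaults are not specific to high-altitude sites.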

METHODOLOGY
This study has two parts. In the first part (precipitation product evaluation), the quality of 3B42RTV7, IMERG, CMADS, and CFSR precipitation products was evaluated at grid and watershed scales based on GO. The IMERG should be divided into two eras: the TRMM era (2000-2013) and the IMERG era (2014-present) (Huffman et al. 2018). Therefore, GO was divided into GO1 and GO2 to compare the performance of the IMERG in the two eras (Table 2). In the second part (streamflow simulation evaluation), 12 precipitation scenarios were created to drive the hydrological model (Table 2). Scenarios S1-S7 were used to assess the runoff simulation effect of each precipitation dataset; scenarios S8 and S9 were SWAT models driven by corrected precipitation data to examine the influence of precipitation data correction on runoff simulation (Section 4.2.2 describes the reasons for correcting only CMADS and CFSR precipitation data). Scenarios S10, S11, and S12 cover CMADS precipitation data combined with GO1, corrected CFSR precipitation data combined with GO1, and IMERG precipitation data combined with GO2, respectively: these were designed to determine the effects of precipitation data combination on runoff simulation (Section 4.2.3 describes the reasons for choosing these three combinations). The analysis process used herein is shown in Figure 2.

Precipitation data evaluation
To quantify and evaluate the accuracy of the 3B42RTV7, IMERG, CFSR, and CMADS precipitation products in the YRSR, the precipitation derived from the four precipitation products was compared directly with GO. Six statistical metrics, namely the root mean square error (RMSE), percent bias (PBIAS), correlation coefficient (CC), probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI), were adopted to evaluate the agreement between the GO and the four precipitation products. The calculation equations, units, ranges, and optimal values of the evaluation indicators are listed in Table 3.
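The six metrics can be sketched as below using their standard definitions; Table 3 itself is not reproduced in this text, so the exact forms are assumed to match the usual ones (PBIAS positive for overestimation; POD = H/(H+M), FAR = F/(H+F), CSI = H/(H+M+F)). The rain/no-rain threshold of 0.1 mm/d and the function names are assumptions made for illustration.

```python
import math


def continuous_metrics(product, gauge):
    """Return (RMSE, PBIAS in %, CC) for paired product and gauge series."""
    n = len(gauge)
    mean_p = sum(product) / n
    mean_o = sum(gauge) / n
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(product, gauge)) / n)
    pbias = 100.0 * sum(p - o for p, o in zip(product, gauge)) / sum(gauge)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(product, gauge))
    var_p = sum((p - mean_p) ** 2 for p in product)
    var_o = sum((o - mean_o) ** 2 for o in gauge)
    return rmse, pbias, cov / math.sqrt(var_p * var_o)


def detection_metrics(product, gauge, threshold=0.1):
    """Return (POD, FAR, CSI); threshold (mm/d) is an assumed wet-day cutoff."""
    hits = misses = false_alarms = 0
    for p, o in zip(product, gauge):
        wet_p, wet_o = p >= threshold, o >= threshold
        if wet_p and wet_o:
            hits += 1           # H: observed and detected
        elif wet_o:
            misses += 1         # M: observed but not detected
        elif wet_p:
            false_alarms += 1   # F: detected but not observed
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi
```

The sketch does not guard against empty or all-dry series, for which the ratios are undefined.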

SWAT model and model setting
The SWAT is a semi-distributed, physics-based, eco-hydrological model, which runs in daily, monthly, or annual time steps (Arnold et al. 1998), and has been widely used in studies of hydrological processes (Grusson et al. 2015), soil erosion (Song et al. 2011), and nutrient transport (Wang et al. 2018). Previous studies have shown that dividing the YRSR into 25, 29 (Hao et al. 2013), and 97 (Mengyaun et al. 2019) sub-basins can yield reliable simulation results. Therefore, the YRSR was divided into 26 sub-basins to avoid unnecessary computation. The SWAT was originally developed to assess water resources in large agricultural basins and was not designed to model heterogeneous mountain basins typical of the western United States (Fontaine et al. 2002). Ten elevation zones (each covering an elevation change of 500 m) were established in the present work to divide each sub-basin and weaken the influence of topography on precipitation. According to previous research (Fontaine et al. 2002; Hao et al. 2013), the snowfall temperature (SFTMP), snow melt base temperature (SMTMP), maximum melt rate for snow during the year (SMFMX), minimum melt rate for snow during the year (SMFMN), snow pack temperature lag factor (TIMP), and minimum snow water content that corresponds to 100% snow cover (SNOCOVMX) in the snowmelt module were adjusted to reduce the influence of snowmelt on the model (Table 4).

Parameter calibration and model evaluation
Calibration and uncertainty analyses of the simulation results from the model were undertaken using the Sequential Uncertainty Fitting Version 2 (SUFI2) in the SWAT calibration and uncertainty program (SWAT-CUP) (Abbaspour 2015). Thirty sensitive parameters were initially selected according to previous studies on hydrological modeling in Alpine basins (Hao et al. 2013; Bhatta et al. 2019; Mengyaun et al. 2019; Shuai et al. 2019). Sixteen parameters with the highest sensitivity were then selected using the Latin hypercube and one-factor-at-a-time sampling (LH-OAT) method for calibration (Table 5). Due to limitations of space, we do not present any analysis of the calibration parameters. According to Abbaspour (2015), the model was calibrated using three iterations with 400 simulations each (a total of 1,200 simulations during calibration) using the Nash-Sutcliffe efficiency (NSE) (Nash & Sutcliffe 1970) and the coefficient of determination (R²) as the objective function. The range of each parameter was modified after each iteration according to both the new parameters suggested by the SWAT-CUP and their reasonable physical ranges. The criteria proposed by Moriasi et al. (2015) were adopted to classify model performance into the respective categories: 'very good' (NSE > 0.80; R² > 0.85), 'good' (0.70 < NSE ≤ 0.80; 0.75 < R² ≤ 0.85), 'satisfactory' (0.50 < NSE ≤ 0.70; 0.60 < R² ≤ 0.75), and 'unsatisfactory' (NSE ≤ 0.50; R² ≤ 0.60).
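The two objective functions and the performance classes can be illustrated as follows. This is a sketch under assumptions: NSE and R² use their standard definitions, and, because the text does not say how the two criteria are combined when they disagree, the helper below assigns the lower of the two classes. Function names are illustrative.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency of simulated against observed streamflow."""
    mean_o = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Coefficient of determination (squared Pearson correlation)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


LABELS = ["unsatisfactory", "satisfactory", "good", "very good"]


def classify(nse_value, r2_value):
    """Class per the thresholds quoted above; taking the lower class is an assumption."""
    nse_rank = sum(nse_value > cut for cut in (0.50, 0.70, 0.80))
    r2_rank = sum(r2_value > cut for cut in (0.60, 0.75, 0.85))
    return LABELS[min(nse_rank, r2_rank)]
```

Under this rule, for example, NSE = 0.72 with R² = 0.77 falls in the 'good' class, while NSE = 0.48 with R² = 0.53 is 'unsatisfactory'.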

Precipitation data pre-processing
Note to Table 3: n is the number of samples; P_i and O_i are the precipitation product data and rain gauge data, respectively; P̄ and Ō are the mean values of the precipitation product data and rain gauge data, respectively. H is the observed rainfall that is successfully detected, F is the rainfall that is detected but not observed by rain gauges, and M is the observed precipitation that is not detected. Q_o,i and Q_s,i are the observed and simulated streamflow, respectively; Q̄_o and Q̄_s are the mean values of observed and simulated streamflow, respectively.
Before modeling, the precipitation data were pre-processed as follows:
(1) The numbers of grids or stations of the 3B42RTV7, IMERG, CMADS, and CFSR precipitation products located in the YRSR are 200, 1,027, 198, and 122, respectively. Because the SWAT only uses data from the one weather station closest to the centroid of each sub-basin (Masih et al. 2011; Galván et al. 2014), it is impractical to divide the watershed into 1,027 sub-watersheds to match the grids one by one. Therefore, virtual weather stations were constructed for each sub-basin (Tuo et al. 2016; Ruan et al. 2017). The specific methods are as follows:
• Satellite raster or reanalysis station precipitation data falling in each sub-basin were extracted on the ArcGIS platform.
• The arithmetic average method was used to calculate the areal rainfall of each sub-basin, giving the precipitation data for each virtual precipitation station.
• The centroid of each sub-basin is the location of the virtual precipitation station (Figure 1(c)).
(2) Because the SWAT-CUP calibration period must consist of whole years, the period common to the CMADS (January 1, 2008 to December 31, 2016) and CFSR (January 1, 1979 to December 31, 2014) data is only 6 years (January 1, 2008 to December 31, 2013). After deducting the warm-up period of the SWAT model (1-2 years), the final simulation period would be even shorter (4-5 years), which is too short to reflect the quality of the data. Therefore, meteorological data from January 1, 2008 to December 31, 2010 and January 1, 2006 to December 31, 2007 were added for the warm-up of the SWAT model, so that the data time span used for the simulation becomes 6 years (January 1, 2008 to December 31, 2013).
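The virtual-station construction in step (1) amounts to an unweighted mean over the grid cells (or reanalysis stations) that fall in each sub-basin. The sketch below assumes the cell-to-sub-basin assignment has already been extracted (in the study this was done in ArcGIS); the data layout and the function name are hypothetical.

```python
from collections import defaultdict


def virtual_station_series(cell_to_subbasin, cell_series):
    """Average the daily series of all cells in each sub-basin.

    cell_to_subbasin: dict mapping cell id -> sub-basin id (assumed precomputed)
    cell_series: dict mapping cell id -> list of daily precipitation (mm/d)
    Returns a dict mapping sub-basin id -> daily areal-mean precipitation.
    """
    grouped = defaultdict(list)
    for cell, subbasin in cell_to_subbasin.items():
        grouped[subbasin].append(cell_series[cell])
    # zip(*series) iterates day by day across all cells of one sub-basin
    return {
        subbasin: [sum(day) / len(day) for day in zip(*series)]
        for subbasin, series in grouped.items()
    }
```

Each resulting series would then be assigned to a virtual station placed at the sub-basin centroid.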

Evaluation of multi-precipitation products
Before further use of satellite and reanalysis precipitation products, it is necessary to evaluate their quality. The errors in satellite precipitation products usually stem from the weak relationship between precipitation rate and remote sensing signals (Bitew & Gebremichael 2010), satellite revisit time (Thiemig et al. 2013), and the retrieval algorithm (Yan et al. 2020). For reanalysis precipitation products, the uncertainties and errors mainly come from data sources, interpolation algorithms, and data assimilation systems (Zhu et al. 2015).
TRMM precipitation products cannot express the spatio-temporal variability of precipitation over high-altitude, complex terrain. Compared with GO, the precipitation data of IMERG and IMERG_T are the closest, while the precipitation data of 3B42RTV7 and CFSR are significantly overestimated, and the precipitation data of CMADS are significantly underestimated. Several authors (Saha et al. 2014; Ghodichore et al. 2018; Graham et al. 2019) stated that reanalysis precipitation products either significantly overestimated or underestimated observed precipitation. The PBIAS, CC, and RMSE between the precipitation products and the GO were calculated on a monthly time scale to further reflect the differences between them. Based on Figure 4, the PBIAS values of TRMM, IMERG_T, IMERG, CMADS, and CFSR are low in the warm season and high in the cold season. 3B42RTV7 precipitation data were underestimated in January and February, and overestimated at other times, especially from October to December. This may be because the TRMM overestimated precipitation in the form of snow, hail, etc. (Villarini et al. 2009; Cai et al. 2015). IMERG_T precipitation data were underestimated in the rainy season (May-November) and overestimated in the dry season (December-April). IMERG precipitation data were underestimated in the dry season (December-April), but the IMERG performed best in observing precipitation in the rainy season (average PBIAS = -2.26%). CMADS precipitation data were underestimated in every month except December. The CFSR overestimated precipitation in all months. Except for IMERG, the CC values of the precipitation products are also lower in the warm season and higher in the cold season; the CFSR shows the best correlation with GO (average CC = 0.73), whereas CMADS, 3B42RTV7, IMERG_T, and IMERG perform poorly, with mean CC values of 0.23, 0.01, -0.01, and -0.28, respectively. The RMSE values of the five precipitation products, however, follow the seasonal cycle of precipitation in the YRSR, which is greater in the warm season and lower in the cold season (Hu et al. 2011). IMERG precipitation products have the smallest deviation, with an average RMSE of 13.71 mm, followed by CMADS (17.35 mm), CFSR (21.32 mm), IMERG_T (32.42 mm), and 3B42RTV7 (47.89 mm).

Evaluation at basin scale
To ascertain whether different precipitation products can capture precipitation events within various precipitation intensity (PI) groups, the probability density function approach was used to evaluate the daily PI, and PI (mm/d) was divided into nine bins (0 ≤ PI < 0.1, 0.1 ≤ PI < 1, 1 ≤ PI < 5, 5 ≤ PI < 10, 10 ≤ PI < 15, 15 ≤ PI < 20, 20 ≤ PI < 30, 30 ≤ PI < 40, and PI ≥ 40). Based on Figure 5, IMERG, IMERG_T, CMADS, and CFSR can correctly capture precipitation classifications, but 3B42RTV7 overestimates high rainfall of >10 mm/d. IMERG and CFSR overestimate the intensity of all precipitation events, especially the CFSR, which significantly overestimates moderate precipitation events of 1-10 mm/d. The precipitation underestimation by the CMADS is mainly concentrated within the range of 1-20 mm/d, whereas events within the range of 0.1-1 mm/d are overestimated.
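The binning step can be sketched as a relative-frequency count over the nine intensity classes. The bin edges come from the text; treating the result as the fraction of days per bin (rather than a normalized density) is an assumption, as is the function name.

```python
import bisect

# Upper edges (mm/d) separating the nine PI bins quoted in the text
BIN_EDGES = [0.1, 1.0, 5.0, 10.0, 15.0, 20.0, 30.0, 40.0]


def intensity_frequencies(daily_precip):
    """Return the fraction of days falling in each of the nine PI bins."""
    counts = [0] * (len(BIN_EDGES) + 1)
    for value in daily_precip:
        # bisect_right puts a value equal to an edge into the higher bin,
        # which matches the lower-inclusive bins (e.g. 0.1 <= PI < 1)
        counts[bisect.bisect_right(BIN_EDGES, value)] += 1
    total = len(daily_precip)
    return [count / total for count in counts]
```

Computing these fractions for GO and for each product over the same days gives directly comparable distributions.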

Evaluation at a grid scale
According to Figure 6, the quality of 3B42RTV7, IMERG_T, IMERG, CMADS, and CFSR is generally better in the south-east than in the north-west. The north-western areas are covered with snow all year round, owing to their high altitude and higher latitude. This leads to poor-quality precipitation observations in this area (Noh et al. 2009; Behrangi et al. 2016). The overestimation of 3B42RTV7 is the largest, with a PBIAS of 33.11-59.74% that gradually increases from downstream to upstream. The precipitation data of the CFSR are overestimated except at the Dari station, while CMADS precipitation data are underestimated except at the Maqu station. IMERG precipitation data are overestimated in the downstream area and underestimated upstream. Compared with satellite precipitation products (CC of 0.09-0.40), the reanalysis precipitation products (CC of 0.34-0.58) have a better correlation with GO. The RMSE values of the five precipitation products are large in the south-east and small in the north-west. According to the statistical indicators for the various precipitation products, the overall performance of CMADS precipitation products is the best, with a PBIAS of -27.22 to 2.48%, CC of 0.43-0.58, and RMSE of 2.68-4.96 mm/d, followed by IMERG, CFSR, IMERG_T, and 3B42RTV7. It should be noted that the evaluation period of IMERG is only 3 years, whereas that of the other precipitation products is 6 years. Elsewhere, it is reported that spatial sampling uncertainties tend to decrease with increasing accumulation time of precipitation (Villarini et al. 2008). The IMERG would therefore be expected to show better evaluation metrics over a longer period.
IMERG_T and 3B42RTV7 have the same detection index values (Figures 7(a) and 7(b)); the specific reason for this is given in Section 2.2.1, so here we only analyze 3B42RTV7. According to Figure 7, the four precipitation products have high detection rates (POD ≥ 0.60), of which the CFSR performs best (POD ≥ 0.90), followed by IMERG (0.67 ≤ POD ≤ 0.82), CMADS (0.63 ≤ POD ≤ 0.84), and 3B42RTV7 (0.60 ≤ POD ≤ 0.70). The FAR values of the four precipitation products increase with latitude. Tian & Peters-Lidard (2010) reported that satellite precipitation products demonstrate large uncertainty at high latitudes (beyond ±40°). Among the four precipitation products, 3B42RTV7 shows the highest FAR (0.40 ≤ FAR ≤ 0.57), followed by IMERG (0.40 ≤ FAR ≤ 0.57), CFSR (0.29 ≤ FAR ≤ 0.57), and CMADS (0.30 ≤ FAR ≤ 0.48). The CFSR has the highest comprehensive forecasting ability, with a CSI of 0.48-0.69, followed by the CMADS and IMERG, and 3B42RTV7 exhibits the worst comprehensive forecasting ability. According to the detection indicators of the various precipitation products, the overall performance of CFSR precipitation products is the best, with a POD of 0.90-0.98, FAR of 0.29-0.51, and CSI of 0.48-0.69, followed by the CMADS, IMERG, and 3B42RTV7.
The evaluation of grid precipitation based on GO data is itself uncertain (Villarini et al. 2008; Tian et al. 2018; Mandapaka & Lo 2020), and the uncertainty decreases as the density of the precipitation gauge network increases (Tian et al. 2018). Villarini et al. (2008) found that two in situ gauged observation stations are needed for the estimated areal rainfall of daily-scale grid precipitation products (pixels of ∼100 km²) to be within 20% of its true value. However, there are few in situ gauged observation stations in the YRSR (0.01 gauges per 100 km²), and the evaluation of grid precipitation products based on GO data at the basin scale is likely to underestimate their performance (Tian et al. 2018). Therefore, statistical methods alone are not sufficient to assess the performance of grid precipitation products. Hydrological simulation verification is a supplementary method for the evaluation of precipitation products (Guoqiang et al. 2015; Deng et al. 2019).

Results of streamflow simulation using different precipitation datasets
According to Figure 8, the runoff simulation results of Scenario S1 are the best overall, with R² and NSE values of 0.85/0.75 and 0.84/0.51 in the calibration/validation periods at the TNH and 0.81/0.57 and 0.80/0.39 in the calibration/validation periods at the JM. Scenario S6 performs the second best: in the validation periods (R² = 0.78, NSE = 0.53 at the TNH; R² = 0.64, NSE = 0.53 at the JM) it yields satisfactory performance and outperforms Scenario S1, but it performs poorly in the calibration period. Scenario S6 underestimates the runoff during the dry season, owing to the CMADS precipitation data being underestimated (Figure 4). The runoff simulation results of Scenarios S3 and S7 are significantly overestimated, and neither reaches satisfactory performance at the TNH or the JM station, especially Scenario S3 at the JM station. The reason for this is that the precipitation data of 3B42RTV7 and CFSR are overestimated (Figure 4), and 3B42RTV7 in particular overestimates the upstream precipitation (Figures 3(c) and 6(b)).
Based on Figures 8 and 9, the runoff simulation results of Scenario S5 are significantly better than those of Scenario S4. It is noted that IMERG_T represents the precipitation data of IMERG in the TRMM era, as calculated by the IMERG algorithm. The GPM IMERG, as a newer generation of the TRMM precipitation product, is of better quality owing to its use of a more advanced GPM microwave imager sensor, the dual-frequency precipitation radar onboard the GPM satellites, and more passive microwave samples (Tang et al. 2020). Although many researchers have also shown that the IMERG performs better than 3B42/3B42RT (Tang et al. 2015; Yuan et al. 2018), the gap is not large (Tang et al. 2020), and the IMERG may even perform worse than 3B42 in some regions. Tang et al. (2020) used statistical methods to find that the performances of IMERG in the TRMM era (2000-2013) and the GPM era (2014-present) are similar. This is consistent with our annual-scale assessment results (Figure 3), but differs markedly from the hydrological simulation results (Figures 8 and 9). There are two main reasons for this result: first, there are few in situ gauged observation stations in the YRSR, which increases the uncertainty of statistical evaluation methods (Villarini et al. 2008; Tian et al. 2018); second, the applicability of the IMERG calibration algorithm in the YRSR is limited (Yuan et al. 2018). Therefore, in data-scarce Alpine regions, the data of IMERG in the TRMM era should be evaluated using both statistical methods and hydrological verification results. Compared with Scenario S5, Scenario S2 is slightly better. In the calibration periods, the runoff simulation results of Scenario S2 (R² = 0.76, NSE = 0.75 at the TNH station; R² = 0.77, NSE = 0.70 at the JM station) and Scenario S5 (R² = 0.70, NSE = 0.65 at the TNH station; R² = 0.66, NSE = 0.66 at the JM station) yield satisfactory performance, but the performance of both in the validation periods is extremely poor (NSE ≤ 0.26). This may be due to the short time series of precipitation data in Scenarios S2 and S5 and the limited number of parameter calibration iterations, which cause significant differences between the simulation results in the calibration and validation periods. In summary, the runoff simulation results based on GO perform best overall, followed by IMERG, CMADS, CFSR, IMERG_T, and 3B42RTV7. IMERG and CMADS precipitation products can be used in this data-scarce Alpine region.

Results of streamflow simulation using corrected precipitation datasets
As mentioned in Section 4.1, the GO and the reanalysis precipitation products are highly correlated at both the basin and grid scales, whereas the correlation between the GO and the satellite precipitation products is poor (Figures 4 and 6). Therefore, we only corrected the precipitation data of the CMADS and the CFSR. First, the daily precipitation of GO1, CFSR, and CMADS at the basin scale was calculated; GO1 was then used to perform a daily-scale regression analysis on the CMADS and CFSR precipitation data at the basin scale. Among the fitting functions compared, the cubic polynomial gave the highest R². According to the cubic polynomial fitting, the R² of the CMADS is 0.827, and that of the CFSR is 0.934 (Figure 10). Figure 11 shows that the corrected CFSR precipitation data improve the simulation results at the TNH station. The simulation results change from unsatisfactory to satisfactory, and the R² (NSE) values during the calibration and validation periods increase by 0.28 (0.34) and 0.22 (0.27), respectively; however, the overall performance of the CMADS after correction remains unsatisfactory because the correlation between the GO and CFSR precipitation data is better than that between the GO and the CMADS (Figure 4). In contrast to the TNH station, the corrected CMADS and CFSR precipitation data yield no improvement in the runoff simulation at the JM station, where the simulated results remain unsatisfactory. This may be because precipitation stations in the YRSR are mostly distributed downstream, and there are only two precipitation stations in the basin above the JM station (Figure 1).
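The cubic polynomial correction described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in this study: gauge precipitation is regressed on product precipitation, negative corrected values are clipped to zero (our assumption), the function names are hypothetical, and the daily series are synthetic.

```python
import numpy as np

def fit_cubic_correction(product_precip, gauge_precip):
    """Fit gauge = f(product) with a cubic polynomial (least squares)."""
    coeffs = np.polyfit(product_precip, gauge_precip, deg=3)
    return np.poly1d(coeffs)

def apply_correction(poly, product_precip):
    """Apply the fitted polynomial; clip at zero because precipitation
    cannot be negative (clipping is an assumption of this sketch)."""
    corrected = poly(np.asarray(product_precip, dtype=float))
    return np.clip(corrected, 0.0, None)

def fit_r_squared(poly, product_precip, gauge_precip):
    """R2 of the fit: 1 - residual sum of squares / total sum of squares."""
    x = np.asarray(product_precip, dtype=float)
    y = np.asarray(gauge_precip, dtype=float)
    residual = y - poly(x)
    return 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic basin-scale daily precipitation (mm/d), for illustration only.
rng = np.random.default_rng(0)
product = rng.gamma(shape=0.8, scale=4.0, size=365)
gauge = np.clip(0.8 * product + rng.normal(0.0, 0.5, size=365), 0.0, None)

poly = fit_cubic_correction(product, gauge)
product_corrected = apply_correction(poly, product)
print(f"R2 of cubic fit = {fit_r_squared(poly, product, gauge):.3f}")
```

A cubic fit can behave poorly outside the range of the fitting data, so corrected values for precipitation amounts larger than any in the fitting period should be treated with caution.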

Results of streamflow simulation using combined precipitation datasets
In terms of the R² and NSE indicators, the simulation results obtained with the IMERG and CMADS precipitation data are close to, or even better than, those obtained with the GO in the calibration or validation periods (Figures 8 and 9). The performance of the CFSR precipitation data is better after correction (Figure 11). Therefore, the CMADS, CFSR_C, and IMERG precipitation data are each combined with the GO, corresponding to Scenarios S10, S11, and S12, respectively. The spatial distribution of precipitation stations is shown in Figure 1(b).
According to Table 2, the overall performance of Scenario S10, which combines the GO and the CMADS, is the best; the simulation results at the TNH station show good performance (R² = 0.77, NSE = 0.72), which is superior to that of Scenario S1 (R² = 0.80, NSE = 0.68) and Scenario S8 (R² = 0.59, NSE = 0.50). Although the simulation results at the JM station are unsatisfactory, they are close to being deemed satisfactory (calibration periods: R² = 0.50, NSE = 0.48; validation periods: R² = 0.55, NSE = 0.47). The runoff simulation results of Scenarios S11 and S12 are not as good as those of Scenarios S1 and S2, but are slightly better than those of Scenarios S5 and S9.

CONCLUSION
The overall objective of this study was to evaluate the hydrological application potential of 3B42RTV7, IMERG, CMADS, and CFSR in the YRSR. The major findings of this study are summarized as follows:
• At the basin scale, 3B42RTV7, IMERG, CMADS, and CFSR have higher detection accuracy in the warm season, and the PBIAS and CC values of each precipitation product are characterized by small warm-season and large cold-season values. Among the four precipitation products, the IMERG has the smallest deviation (average RMSE = 13.71 mm), while the CFSR has the best correlation (average CC = 0.73).
• At the grid scale, among the four precipitation products, the CMADS demonstrates the best performance for precipitation observation, with a PBIAS of −27.22 to 2.48%, a CC of 0.43-0.58, and an RMSE of 2.68-4.96 mm/d, followed by IMERG, CFSR, and 3B42RTV7. The CFSR has the best performance for precipitation events, with a POD of 0.90-0.98, a FAR of 0.29-0.51, and a CSI of 0.48-0.69, followed by CMADS, IMERG, and 3B42RTV7.
• Taken together, the IMERG has the best performance, followed by CMADS, CFSR, and 3B42RTV7. 3B42RTV7 severely overestimates high rainfall of >10 mm/d. The CFSR clearly overestimates moderate precipitation events of 1-10 mm/d, while the CMADS underestimates precipitation events of 1-20 mm/d.
• The models using the GO as input yield satisfactory performance during 2008-2013, whereas the models driven by the precipitation products yield poor simulation results. The simulation using the CMADS significantly underestimates the runoff during the dry season, but its performance in the validation periods (R² = 0.78, NSE = 0.53 at the TNH station; R² = 0.64, NSE = 0.53 at the JM station) is the best among the scenarios analyzed. The runoff simulated using 3B42RTV7 and CFSR is significantly overestimated, especially when using 3B42RTV7. Although the model using the IMERG as input yields unsatisfactory performance during 2014-2016, this does not affect the use of the IMERG as a potential data source for the YRSR.
• After bias correction, the quality of the CFSR improves significantly, with increases in R² and NSE of 0.25 and 0.31, respectively, at the TNH station. A SWAT model driven by the combination of GO and CMADS precipitation is the best across all scenarios. The simulation results at the TNH station yield a satisfactory performance (R² = 0.77, NSE = 0.72). Although the simulation results at the JM station yield an unsatisfactory performance, they are close to being deemed satisfactory (R² = 0.53, NSE = 0.48).
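For readers who wish to reproduce the statistical indices summarized above, the following minimal sketch computes PBIAS, CC, RMSE, POD, FAR, and CSI for a pair of daily series. It is an illustration under stated assumptions, not the code used in this study: PBIAS is taken as the relative total difference of the product from the gauge (negative values indicating underestimation), the 1 mm/d rain/no-rain threshold is hypothetical, and the data are invented.

```python
import numpy as np

def continuous_metrics(gauge, product):
    """PBIAS (%), Pearson CC, and RMSE of a product against gauge data."""
    g = np.asarray(gauge, dtype=float)
    p = np.asarray(product, dtype=float)
    pbias = 100.0 * np.sum(p - g) / np.sum(g)  # sign convention is an assumption
    cc = np.corrcoef(g, p)[0, 1]
    rmse = np.sqrt(np.mean((p - g) ** 2))
    return {"PBIAS": pbias, "CC": cc, "RMSE": rmse}

def detection_metrics(gauge, product, threshold=1.0):
    """POD, FAR, and CSI from a rain/no-rain contingency table.
    The threshold (mm/d) is a hypothetical choice for this sketch."""
    g_wet = np.asarray(gauge, dtype=float) >= threshold
    p_wet = np.asarray(product, dtype=float) >= threshold
    hits = np.sum(g_wet & p_wet)           # rain observed and detected
    misses = np.sum(g_wet & ~p_wet)        # rain observed but not detected
    false_alarms = np.sum(~g_wet & p_wet)  # rain detected but not observed
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "CSI": hits / (hits + misses + false_alarms),
    }

# Invented daily precipitation (mm/d), for illustration only.
gauge_daily = [0.0, 2.0, 5.0, 0.0, 12.0, 0.5, 3.0, 0.0]
product_daily = [1.5, 1.8, 0.2, 0.0, 14.0, 0.0, 4.0, 0.0]

print(continuous_metrics(gauge_daily, product_daily))
print(detection_metrics(gauge_daily, product_daily))
```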
In summary, although the satellite and reanalysis precipitation products represented by the TRMM and the CFSR have been widely used in hydrological modeling, the quality of these products leaves considerable room for improvement when they are applied to Alpine basins. In contrast, the IMERG performs better in observing solid precipitation owing to the more advanced GPM microwave imager sensor and the dual-frequency precipitation radar mounted on the GPM satellites. The findings of this assessment provide a valuable reference and feedback for the development of satellite and reanalysis precipitation products for use in Alpine basins. In addition, snowfall is the main form of precipitation in the YRSR from September to May; however, snowfall could not be assessed here owing to the lack of snowfall observation sites, a task that warrants investigation in future research.