Deterministic and probabilistic projections and their credibility in analyzing future precipitation variations in the Yellow River Basin, China

It remains a key challenge to obtain reliable future precipitation estimates and their reliability under different climate scenarios. In this study, the deterministic projection of future precipitation in the Yellow River Basin (YRB) was obtained within the Bayesian model averaging (BMA) framework. A probability estimation method based on the BMA weighting scheme was proposed to obtain the probabilistic projection of precipitation. We also analyzed the credibility of these two projections. The results showed that four indexes projected by the BMA method showed an increasing trend with a higher probability. The probabilities of increasing with varying degrees were more than those for decreasing for all the precipitation indexes. The credibility of the precipitation estimation under specific climate scenarios was testified by the lower ED (the mean of long-term annual relative simulation deviation) and VD (the variance of long-term annual relative simulation deviation). The estimation based on the BMA model is more trustworthy than any other model. For the four precipitation indicators, the accuracy between the calculated VR (Variation range, to describe the interval of variation of the indicators) with the greatest likelihood and the actual VR was 38.31 53.74%. In 81.93 94.70% of grids, the deviations were smaller than one level. Both the deterministic and probabilistic projections have high geographic distribution and variation trend consistency.


INTRODUCTION
The precipitation over the Yellow River Basin (YRB) has undergone continuous and drastic variation in the past decades , especially in terms of extremes and potential redistribution of water resources in time and space, threatening the lives of residents in the whole basin. Another concern lies in the future precipitation variation in the forthcoming decades and centuries under global environmental change. Studies have confirmed the incredible impact of climate change on people and the environment (Meshram et al. 2018(Meshram et al. , 2020Almazroui et al. 2021aAlmazroui et al. , 2021b, but it is still a significant challenge to reliably estimate future climate and its credibility under different climate scenarios and climate models. The vulnerable natural system in the YRB is of interest to researchers worldwide because it reflects the evolution of water resources under climate change and human activity in the future (Wu et al. 2016;Wang et al. 2019b). Moreover, the precipitation in the YRB has changed in time and space for decades (Lu et al. 2014;Zhang et al. 2014Zhang et al. , 2019Iqbal et al. 2018;Zhao et al. 2019), arousing many concerns and prompting research on future precipitation change as well as its impact on hydrology and water resources. The GCMs provide scenario determination and forcing data, forming the basis of climate change projection. They are an original estimation of long-term precipitation without specific precipitation indexes, but most of the projected precipitation indexes derive from the GCMs data. Typical indexes include the mean over time and extreme value. The key to obtaining a reliable future precipitation projection is to solve uncertainty in multiple models and get valid information. Multi-model ensemble, including equal-weighted or unequal weighted averaging, is the most common and effective solution (Radermacher & Tomassini 2012;Chen et al. 2017;Guo et al. 2017). For the YRB previous studies mainly focused on the impact of future precipitation change, while there was little research on future precipitation change itself. Of the research that has been done in this area, Wang et al. (2017a) and Chen et al. (2019) investigated the impacts of climate change on water resources (indicated by runoff) by using the projected precipitation and temperature, applying the Variable Infiltration Capacity (VIC) model, in the whole YRB and the source region of the YRB. The impacting mechanisms usually come from comparing runoff in the baseline period (the model output) and runoff change caused by the precipitation change (the model input). As for the impacts of future precipitation change on hydrology and water resources, Yin et al. (2016) analyzed the future extreme precipitation with different return periods in the Huang-Huai-Hai region using five GCMs multi-scenarios. The result indicated that the climate models had errors in simulating extreme historical precipitation. Liu et al. (2017) studied the projected climatic year types (CYTs) throughout China between 2015 and 2099 from spatial patterns, temporal changes, and frequency, demonstrating better performance and superiority of multi-model ensembles than individual models in projecting changes in CYTs by offsetting systematic errors. Nevertheless, all future water resources projections depend on the future climate scenarios, and the reliability of conclusions is subject to the credibility of the GCMs. Thus, the main challenge is evaluating and generating more reliable precipitation from GCMs.
Multi-model ensembles are a practical solution for hydrological, meteorological, and climatological predictions or forecasts. The ensemble projection could be understood as a weighted model average from the member models. Raftery et al. (2005) applied Bayesian model averaging (BMA) to forecast the surface temperature and sea level pressure in the Pacific Northwest for better and sharper forecasting, confirming the excellent performance of BMA in the multi-model ensemble when the ensemble members are overdispersed. Duan et al. (2007) also used BMA to acquire more accurate hydrological predictions from some original competing predictions in terms of more satisfying daily root mean square error and daily absolute mean error, as well as a better description of total predictive uncertainty than the members. These results make it a typical case of ensemble hydrological forecasting using BMA. From research on climate change, the ensemble members mainly come from climate models, including Regional Climate Models (RCMs) and GCMs. Dars et al. (2017) proved that BMA performed better than individual models in multi-model ensemble prediction with a weighted average of four CMIP5 models to assess the impacts of climate change on future precipitation. The result will significantly help related managers, experts, and policymakers. A similar study on temperature (Min & Hense 2006;Miao et al. 2014) inferred that BMA enjoys more reliability and less uncertainty than other model averaging methods or individual models. The application of BMA in climate change has yielded encouraging results and has been widely recognized (Duan & Phillips 2010;Notaro et al. 2016). Irrespective of whether the ensemble members come from hydrological models or climate models, the consistent process in the ensemble is to capture the practical information from those member models for a more reliable future projection.
Researchers have focused on the uncertainty and credibility of climate change projections due to enormous uncertainty in climate models and downscaling data processing. The output of climate scenarios is also uncertain for the complicated processes in the atmospheric system. It has been challenging to acquire desired results and assess the credibility of climate projection evaluations, especially without a practical evaluation system. The study (Fricker et al. 2013) pointed out validation of performance measures for the credibility of future climate projections. These issues about credibility aroused our keen interest in the credibility evaluation of future precipitation projection over water-scarce areas in the YRB. We aim to improve the projection level of climate elements by using appropriate indicators to measure the precipitation projection ability of GCM models.
In this paper, we proposed a probability precipitation estimation method based on the BMA method and the measurement of its credibility. For the future precipitation projection of the YRB, there would be two significant questions to resolve: (1) How to ensure good performance for BMA in terms of the precipitation indexes evaluation compared to the original climate models over the YRB and (2) How to ensure the credibility of the precipitation estimation based on BMA. To find the answer, we used the daily observed precipitation data during 1986-2005 and future precipitation projections selected from five GCMs during 2006-2099. Four precipitation indexes were adopted to describe the precipitation situation. Two indicators were used to assess the performance and credibility of precipitation ensemble projection conducted by the BMA method.

The Yellow River Basin
With a length of 5,464 km and a drainage area of 795,044 km 2 , the YRB ( Figure 1) is the second-largest river in China. It originates in the Tibetan Plateau and runs through four geomorphic unitsthe Qinghai-Tibetan Plateau, the Inner Mongolia Plateau, the Loess Plateau, and the North China Plain. Supplying water resources to 100 million people, the YRB runs through an arid region, a semi-arid region, and a semi-humid region, before reaching the Bohai Sea Wang et al. 2019a). These regions range in elevation between 1,000 and 4,000 m, have annual precipitation ranging from 123 to 1,021 mm, have annual evaporation varying from 700 to 1,800 mm, and have dramatic temperature fluctuations in time and space (Wang et al. 2019a). The long-time average annual value of precipitation, runoff (engaging a drastic decrease), and annual evapotranspiration were estimated to be about 460, 450, and 1,100 mm on average, respectively. Since the YRB is located in a temperate continental climate zone and is regulated by the East Asian monsoon, precipitation is the primary controlling source and restrictive factor of the runoff and water resources. This directly affects the water supply for 12% of China's population, 15% of China's arable land, and 8% of China's GDP approximately, even though the scarce runoff accounts for only 2% of China's total (Guan et al. 2016;Yin et al. 2017;Wang et al. 2019a). Therefore, the variation of precipitation plays a vital role in sustainable development and even in the daily life of the people of the YRB. Both in terms of time and space, the precipitation in the YRB is remarkably uneven, with June to October accounting for about 70% of annual total precipitation and the precipitation in the lower and middle reaches two times that of the upper reaches (Mu et al. 2013). Over the past 50 or 60 years, with the increase of air temperature, the precipitation of the YRB has fallen, and so too has runoff, exacerbating a profound contradiction between water demand and water supply and worsening the drought situation Wang et al. 2018Wang et al. , 2019a.

Data
The daily observed precipitation data during 1986-2015 came from the Meteorological Data Center of the China Meteorological Administration (CMA) (http://data.cma.cn/). It was gridded into the spatial resolution of 0.5°Â 0.5°, the data of the YRB were extracted and put into use.
After a comprehensive multi-criteria assessment, Zhou & Han (2018) selected 5 models (Table 1) from 18 CMIP5 GCMs. Therefore, the total rank of these selected models is more satisfactory. In the model evaluation of Zhou et al., several statistics on precipitation and temperature were calculated. The statistics consisted of climatological mean, interannual variation, seasonal cycle, inter-annual modes, and probability distribution function. They were assessed by Mean Absolute Error, Correlation Coefficient, Root Mean Square Error, and S-value (for the degree of overlap between the simulated probability distribution and the observed probability distribution). The eighteen models were ranked by Rank Score (RS) performance, with these five models ranking better than other models. These models were corrected by the Quantile-Mapping method (Tong et al. 2017;Han et al. 2018).
Each model consists of daily simulated climate data during 1986-2005 and daily projected climate data during 2006-2099, including climatic elements of precipitation and temperature data under three emissions scenarios of RCP 2.6, RCP 4.5, and RCP 8.5. The detailed information of the selected models is shown in Table 1. The gridded climate data in the YRB were extracted as the observed precipitation data. Notably, for comparing the precipitation variation in a different time in the future, we chose two representative periods from the future period, the period of 2021-2050 (the near future scenario) and 2061-2090 (the distant future scenario). Given the homogeneity of different climatic elements and emissions scenarios, the climatic element of precipitation under emissions scenarios from RCP 4.5 was only analyzed in this study.
Considering the non-correspondence of the spatial resolution between the observed precipitation and the climate data, we unified the grids between observed data and climate data by matching the longitude and latitude of each grid, confirming that there are 415 coincident grids in both observation data and climate data.

METHODOLOGY
In this section, we state our chosen precipitation indexes. The research methods mainly included BMA and the probability estimation method. BMA was used to calculate the optimal weight of each model for the ensemble and to project each precipitation index deterministically (in Section 4.3). The probability estimation method calculated the probabilistic projections of future variations of each precipitation index based on the BMA weights (in Section 4.4).

Precipitation indexes
There are four precipitation indexes adopted in this study ( Table 2). The PY indicates the total quantity of precipitation in the whole year over the YRB, while the PS indicates summer precipitation, which affects the water resources in the YRB. The R 10 indicates the frequency of heavy rain that occurs over the YRB. It reflects the extreme attributes of precipitation, in other words, the frequency of rainstorm events. The PI indicates the concentration of precipitation. It shows the concentration of precipitation events in time in a year. The PY and the PS are calculated from the annual precipitation, while the R 10 and the PI are daily. It is noted that the annual precipitation is the mean value of annual precipitation in all grids, and so is the daily precipitation. .0, and RCP8.5). BMA was compared with the simple averaging method and individual models, and the merged projection could better capture the changing behaviors of climatic factors due to its lowest bias. These pieces of evidence show that BMA has advantages in dealing with multi-model issues. According to their simulation or reproduction competency to our object variables, the effect of BMA is usually regarded as assigning weights to each member for its extraordinary ability to reduce the uncertainty of different outputs. In this study, BMA is used for future precipitation projection. All the variables that appear below are directly related to the background of this study. In other words, the definition and description of variables are from the perspective of this study of precipitation projection based on multiple climate models. If Pr is the predicted precipitation, is the observed precipitation data set used for BMA training with data length T, and M ¼ [M 1 , M 2 , …, M K ] is the ensemble of precipitation projections from K models, then, with the law of total probability, the BMA probabilistic projection of Pr can be expressed as: Where p(Pr|D) is the predicted probability of Pr given observed precipitation D, p(M k |D) is the posterior probability of a model projection M k given observed precipitation D, and p k (Pr|M k , D) is the posterior distribution of Pr given model projection M k and observed precipitation D there is always ∑ K k ¼ 1 p(M k |D) ¼ 1. Then, the posterior mean precipitation of the BMA projection is: where E(Pr|D) is the expected precipitation of BMA projection, w k refers to p(M k |D), and we can obtain ∑ K k ¼ 1 w k ¼ 1, and w k is the weight of the kth model as well as the right objective of the BMA method.
Considering the requirement of Gaussian distribution for p k (Pr|M k , D), we transform both the observed precipitation data and model projection data using the Box-Cox transformation first, and all the data would be transformed with the same parameter. Heavy rain index (R 10 ) The mean number of days with daily precipitation is more than 10 mm. Hanittinan et al.
day Annual mean precipitation intensity (PI) The mean ratio of annual precipitation and the number of days with daily precipitation greater than 1.0 mm.
as the parameters to be solved, then the log-likelihood function is: where s 2 k is the variance of model projection M k concerning observation D. Generally, w k is optimized by the Expectation-Maximization (EM) algorithm with the same distribution requirement as BMA. Since the transformed precipitation data approximately obeys the Gaussian distribution, the algorithm can be implemented legitimately. In brief, the EM algorithm consists of the E step and M step. The best solution of w k is generated through the alternation between these two steps. The following steps describe the procedure of the EM algorithm in detail.
Step 1: Initialization where T is the total number of data points in the training period.
Step 2: Computing the initial likelihood where g() denotes the Gaussian distribution.
Step 3: Computing a latent variable known as the expectation step where z k,t is a latent variable.
Step 4: Computing the weights known as the maximization step Step 5: Computing the projection variance Step 6: Updating the likelihood using Equation (7) Step 7: Checking convergence If L(θ Iter ) À L(θ IterÀ1 ) is no more than a prespecified tolerance level, stop; otherwise, go back to Step 3 for a new optimization circle.

Probabilities in different variation ranges
Based on the BMA weights, the probability of variation refers to the probability that the variation of a precipitation index locates in a particular range. According to the future changes of each index, six variation ranges are selected for each index. The variation ranges are shown in Table 3. In this paper, the variation ranges (VRs) are defined as events. For example, if the variation of PY is between À50 and À25 mm, a VR2 event is considered to have occurred in PY (the event level is 2).
The BMA method is based on the total probability formula. The principle is to solve the possibility of various causes (if they are mutually exclusive) through the result that has already occurred or find out the most likely cause for this result. Theoretically, the result must be caused by one of those causes. The cause of the current result must be found among all those causes that have been known in advance. Therefore, there is a prerequisite: every possible cause has an objective explanation for the result. For the precipitation projection, the observed precipitation is the result, and the precipitation in each climate model is the cause. There is no evidence to suggest that precipitation in these climate models can perfectly (or partially) simulate observed precipitation. However, we proposed that climate models can explain the observed precipitation process under the current understanding. There were two reasons for this. First, these climate models were designed to simulate precipitation as closely as possible, and researchers had considered the physical mechanisms of precipitation. Second, the climate model used in this study had been strictly evaluated, and the models with the best performance of precipitation simulation had been screened from many climate models. We believed that this hypothesis was reasonable and necessary for these two reasons.
Given the principle of the BMA method (to solve the probabilities of each cause based on the result) and the reasonable assumption, we can use the obtained probabilities (the BMA weights) of each cause (the models) to predict the future variations in precipitation and their probabilities. In this study, five climate models were applied. The probability of variation means the proportion of the projected results within a specific range. This can be defined as: where P is the probabilities of variation of i (i means one of the precipitation indexes), D i is the variation of the ith index, D i,j is the variation of the ith index in the jth model, w j is the BMA weight of the jth model, LVR and UVR is the lower bound and upper bound in the current variation ranges for index i (Table 3), respectively. For example, if the projected future variations of PY (i ¼ 1) from the five models (D i,j or D PY,j ) are 37, À6, 18, 20, and À9 mm, and the BMA weights of the five models (w j ) are 0.14, 0.23, 0.38, 0.09, and 0.16. Then, the probabilities of different variations of PY in the future are: PY (mm) ÀInf. to À 50 À 50 to À 25 À 25 to 0 0-25 25-50 50 to Inf.

Performance assessment of climate models
To better reflect the simulation and projection performance of all models and their ensembles precipitation in the whole YRB, the mean value of all grids in the basin was taken as the precipitation assessment index for the basin. The BMA weight of each model was calculated by taking the projected annual precipitation of each model as the input and the observed annual precipitation as the output. The superiority of the ensemble was evaluated by Root Mean Squared Error (RMSE) and total deviation (TD). The smaller the RMSE is, the better the model's performance in process simulation. In contrast, the smaller the TD is, the better the model's performance in quantitative simulation. The RMSE and TD are defined as: where M i is the ith historical simulation of each model, including the BMA, H i is the ith historical observation of each model, including the BMA, m is the total number of years.
Considering the primary purpose of this study is to explore the variation of precipitation indexes in the future based on the BMA model, it is necessary to investigate the simulation performance of the BMA model for precipitation indexes. Therefore, to accurately describe the simulation deviation of a model, the mean of the absolute value of simulation deviation (MASD) is applied, the MASD here is defined as: where the P si is the simulated precipitation index of the ith year and the P oi is the observed precipitation index of the ith year. The historical simulation performance of the climate model determines whether a model is satisfactory. In other words, the credibility of precipitation in the future predicted by a climate model rests with its historical performance, and simulation deviation is the right indicator to explain the historical performance of a climate model. The simulation deviation is always fluctuant, so the mean indicates its long-term average level of deviation from observation, while the variance indicates the stability of the simulation deviation. The following formula is used to calculate the mean and variance of annual relative simulation deviation: where E D is the mean long-term annual relative simulation deviation, V D is the variance of long-term annual relative simulation deviation, and M i is the ith historical simulated value of each model, H i is the ith historical observed value of each model, D i is the absolute value of relative simulation deviations, and i is the number of years.

BMA weight of each model and the superiority of ensemble
The BMA weight of each model was calculated, and the historical simulating performance was assessed by RMSE and TD simultaneously ( Table 4). The CS model has the maximum weight while the CN model has the minimum. The weight of the other models stands between the two. From a process perspective, the CS model has the minimum RMSE, and the CN model and the EC model have a similar level to the CS model, while the MI model has the maximum RMSE and the No model is close to it. From a quantitative perspective, the No model has the minimum TD, the EC model has the maximum, and the CS model is close. The BMA model has the minimum RMSE among all the models (since the calculation of the BMA weights is based on process performance). It has better performance than most models (better than 3/5 of the models) from the quantitative perspective, though it is not the best. The BMA model simulates the average state of historical precipitation indexes more precisely with the smaller MASD than any single model (Figure 2). Each model has different simulation performances in different precipitation indexes, but the BMA model consistently has the best performance in all four precipitation indexes with the minimum MASD. At the same time, every model among the five has its best or worst performance for different indexes. For instance, the CN model performs nicely for the PY and R 10 , but not so for the PS and PI, and even worse for the PS. Thus, it is necessary to do a model ensemble to simulate and project the precipitation over the YRB.
Climate models are inherently uncertain, so is the BMA model derived from climate models. Therefore, credibility is a crucial reference to precipitation index estimation. The credibility of precipitation indexes estimation is quantitatively described by the E D (mean long-term annual relative simulation deviation, Equation (14)) from the perspective of competence and the V D (variance of long-term annual relative simulation deviation, Equation (15)) from the robustness perspective, respectively. The E D and V D on each grid in Figures 3 and 4 are based on the relative simulation deviation instead of the absolute simulation deviation. It is worth noting that, for the BMA model in Figures 3 and 4, the E D and V D are calculated from the annual time series of precipitation indexes ensembled by BMA weights. The blank grids in R 10 are since the observed results of these grids are 0, which leads to abnormal calculation results.
There is a significant improvement in PY, PS, and PI estimation. Furthermore, from an E D perspective (Figure 3), the simulation deviation of the BMA model is smaller than that of any other model. Finally, from a V D point of view (Figure 4), the BMA model has a more stable simulation deviation than any other model.  Concerning the PY (Figure 3(a)), the relatively large E D in the north part of the CN, CS, EC, and MI models are as high as 0.5. However, that of the BMA model is no larger than 0.35. Thus, though the No model has less E D than the other models, it is still higher than the BMA model. The same conclusion could be seen with the PS (Figure 3(b)) and PI (Figure 3(d)), as well as in the east of the YRB. In all models, the E D in the west is smaller than in the east, coming in below 0.2 generally in the west and more significant than 0.3 in the east.
The R 10 (Figure 3(c)) has a different result, and it seems that the E D from the BMA model is not necessarily less than the other individual models. However, as can be seen from R 10 , the result of the BMA model is not worse overall.
As with the V D , the BMA model is not worse or better than any other model. Taking PY (Figure 4(a)) as a case, in the north YRB from the CN, CS, and EC model, and so with the north and southeast of the YRB from the MI model, the V D of these regions is more significant than 0.5 and even up to 0.8. However, it is correspondingly less than 0.4 with the BMA model. In short, the performance of the BMA model is superior to that of other individual models on R 10 (Figure 4(c)) and PI (Figure 4(d)). The PS (Figure 4(b)) is more acceptable than the EC and MI models while performing equivalent with the CN, CS, and No models.
In the projection of future precipitation, it is crucial to understand the credibility of future projection. The credibility here refers to whether the BMA model is more credible than any single model in the deterministic projection. In this study, the assessment indicators are set as E D , the mean long-term annual relative simulation deviation, and V D , the variance of long-term annual relative simulation deviation. As mentioned above, the simulation deviation is expressed as the absolute value of relative simulation deviation since it is interested in the deviation of both directions (i.e., positive and negative). The lower E D means the smaller range of deviation, and the lower V D , the more stable the deviation. As shown in Figures 2-4, compared with each model, the performance of reconstructing the historical precipitation situation of the BMA model is more satisfying in terms of lower MASD, E D , and V D . Thus, the BMA model has more significant advantages in predicting future climate under specific scenarios. Consequently, we get more rational ensemble weights and probability estimation of different changes in precipitation indexes. Based on the assumption of weight as a probability in the BMA model, the application of BMA improves the flexibility and the credibility of the projection.
Therefore, it is possible to conclude that the BMA model synthetically performs better than any single climate model in simulating the historical precipitation in the YRB due to valid information from each pattern. Thus, the precipitation in the future projected by the BMA model will be more rational than any single model.
(2) According to the observed precipitation data, the variation of each precipitation index was calculated, and they were converted into VRs (observed VR, Figure 5) according to Table 3. (3) Similarly, for the projected precipitation data (GCMs), the estimated variations of each model (2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) were calculated, respectively. Then, the probability of different variations of each precipitation index was calculated according to the probability estimation method. The VR with the highest probability (estimated VR) in each grid was selected as the estimated variation of each precipitation index. (4) The difference between the estimated VR and the observed VR was taken as the deviation of probability estimation. In each grid, the estimation deviation of each precipitation index was calculated separately. The proportion of deviations of different sizes in the whole basin was classified (the proportion of grids with various deviations in the total number of YBR grids). The statistical results of deviations are shown in Table 5.
We needed to learn about the observed precipitation variation during the validation period. There was no apparent spatial characteristic in the observed variation of each precipitation index shown in Figure 5. The VRs in the four precipitation indexes differed significantly in terms of area. The VR6 occurred in 35.1% of the grids in PY, while the VR1 (0%) and the VR2 (0.72%, only three grids) had no or almost none. In PS, the VR3 occurred in 34.9% of the grids, while the VR1 (0%) and the VR2 (6.5%) did not occur or very little. The VR4 occurred in 57.4% of grids in R 10 , while the VR1 (0%), the VR2 (0.7%), and the VR6 (1.2%) did not or very little. The VR4 occurred in 48.0% of the grids in PI, while the VR1 (0%), the VR2 (6.5%), the VR5 (8.2%), and the VR6 (1.2%) had no or very few occurrences.
Similarly, we counted the estimated variation of the precipitation during the validation period, as shown in Figure 6. Compared with Figure 5, there was no apparent spatial relationship (in grids) between the observed results and the estimated results. Each VR (whether observed or estimated) had no apparent spatial distribution characteristics. In each precipitation index, the distribution of the estimated proportions was relatively close to the observed results for each VR, among which the PY, R 10 , and PI were in good agreement. However, the estimated and observed results of PS were a bit different. The VR2 and the VR3 were underestimated by 8.9 and 20%, respectively (absolute value, not relative to the VR2 or the VR3, the same as below), and the VR4 was overestimated by 30.9%.
The deviation of the estimated result is a point of concern. Table 5 shows the proportion of grids with different estimation deviations in all grids of the whole basin. Among the four precipitation indexes of PY, PS, R 10 , and PI, the grids with a deviation of 0 accounted for 38. 31, 30.84, 47.47, and 53.74% of the whole basin, respectively, meaning that the VR with the highest estimated probability in these grids was consistent with observed VR. In other words, if the VR with the highest  Table 3 for VRs). The proportions in the graph refer to the percentages of the number of grids occurring in each VR. The difference in the number of VR, if a grid was estimated as VR2 and observed as VR4, the deviation is À2. Negative deviations mean underestimating and positive deviations mean overestimating. A deviation of 0 indicates an accurate estimation.
estimated probability was used as the estimated result, the estimation accuracies of these four indexes were between 30.84 and 43.74%. The estimation deviations of 81. 93, 82.41, 94.70, and 85.79% of the four precipitation indexes were between À1 and 1. According to the current classification of precipitation indexes (Table 3), if the deviation of 1 level was taken as the allowable deviation, the probability estimation method could capture the actual occurrence of precipitation indexes in a significant probability (81.93-94.70%). The grids with an estimated deviation of À1 accounted for 13.74-29.88% of the whole basin, which meant that the VR with the highest estimated probability in these grids was underestimated by 1 level than the observed VR. The grids with an estimated deviation of 1 in the whole basin are between 17.35 and 34.70%. Therefore, the VR with the highest estimated probability in these grids was overestimated by 1 level than the observed VR. The proportions of grids with a deviation of 2 (overestimated by two levels) and À2 (underestimated by two levels) were between 1.93 and 11.57%. The total proportion of grids with overestimated or underestimated deviation in the four precipitation indexes of more than three levels was less than 4.33%. Figure 7 shows the spatial distribution of the estimation deviation of each precipitation index and the main distribution areas of overestimation and underestimation. The overestimation of PY mainly appeared in the southern part of Subregion 3 and the western part of Subregion 5 and Subregion 7. The underestimation was mainly in the middle of Subregion 1, Subregion 2, and Subregion 6. The overestimation of PS was mainly in Subregion 3, Subregion 4, the west of Subregion 5, and the northwest of Subregion 6, while the underestimation was mainly in the southeast of Subregion 1, the northwest of Subregion 2, the east of Subregion 5, and the southeast of Subregion 6. The overestimation of R 10 was mainly in the west of Subregion 5 and the east of Subregion 7. The underestimation was mainly in the middle of Subregion 1, the northwest of Subregion 2, the north of Subregion 3 and Subregion 6. The overestimation of PI was mainly in Subregion 7 and Subregion 8, the central and western parts of Subregion 5 and Subregion 3, and the northeast of Subregion 6, and the underestimation was mainly in Subregion 4 and the north of Subregion 3. Therefore, the projection of future precipitation variation in these regions should be focused on and treated with caution.
Analyzing the estimated composition of each observed VR (that is, which VR is estimated by the probability estimation method for each observed VR) helps us determine the deviations of the overestimation or underestimation in Figure 7. Figure 8 shows the composition of the estimation of each observed VR in different precipitation indexes. In the probability estimation method, the observed VR in a grid could be estimated as any VR1-VR6 (taking the VR with the highest estimated probability). For example, in PY, all grids where VR6 had occurred accounted for 35.1% of the whole basin ( Figure 5(a)). In  Figure 5, the average variation is expressed as the classified VRs, but the VR at each grid has the highest estimated probability among six VRs.
the estimation results of these grids, the proportions of VR1-VR6 are 0, 0.7, 3.4, 21.2, 29.5, and 45.2%, respectively. The accuracy of the probability estimation method for VR6 in PY is 45.2%. Figure 9 shows the accuracy of each VR in all precipitation indexes, corresponding to the occurrence proportion of each VR. PY, PS, and R 10 showed an apparent positive correlation between the estimation accuracy and the observed occurrence proportion. We also noticed that there were some differences in PI. This was due to the less observed occurrence proportion of VR1 (1.9%), VR2 (6.5%), and VR6 (1.2%), which led to large estimation deviations. There was a positive correlation between the estimation accuracy and the observed occurrence proportion of other VRs in PI.
We can explain why some areas in Figure 7 were overestimated or underestimated. For all the precipitation indexes in Table 5, the area of overestimation by 1 level accounted for the most significant proportion among the proportions of different degrees of overestimation, and the same is true in the underestimation. Since the proportion of overestimated or underestimated of each VR was based on the occurrence proportion of this VR, we would not consider those VRs with very few occurrences. For example, the occurrence proportion of VR6 in PI was 1.2% (five grids), although there were 20% underestimated by 1 level, and only one grid (5*20%) was underestimated. After statistical analysis, we decided not to consider the VRs that occurred in less than ten grids. In Table 6, both the underestimated and overestimated by 1 level came from the grids where the VR with a high proportion of occurrences. However, among VRs that occurred more often, higher-level VRs tended to be underestimated, while lower-level VRs tended to be overestimated. Therefore, in Figure 7, the proportions underestimated by 1 level and overestimated by 1 level mainly came from those VRs with higher occurrences.
In the probabilistic projection, credibility is mainly reflected in two aspects. The first is the deviation, the difference between the projection with the highest probability and the observation. Precipitation varies from place to place, leading to the difference in the ability of climate models to simulate precipitation in different places, which is the reason for the spatial heterogeneity of deviations. In Figure 7, the spatial distribution of the simulated deviations is a semi-quantitative measure of credibility. The larger the classified deviations, the lower the credibility is. For example, in the future projection of PI in Figure 7(d), the deviation of the source region (Subregion 1) is 0, it can be considered that the credibility of this region is relatively higher, and the deviation of the downstream region (Subregion 8) is between 2 and 5 (overestimated), it indicates that the credibility of this region is relatively lower. Figure 7 | Deviation between the observed VR and estimated the VR. The estimated VR has the highest probability among all VRs at each grid. The deviation is represented by the difference between the estimated VR number and the observed VR number. Where green means the deviation is 0 (accurate estimation), yellow means the deviation is 1 (overestimate by 1 level), and cyan means the deviation is 1 (underestimate by 1 level).

Uncorrected Proof
We also analyzed the accuracy of the probability projection method for different precipitation events (VRs). For each estimation, we can find the accuracy of the result in Figure 8, and we can also know the probability that the estimation might be for other VRs. The probability here is obtained by normalizing the numbers in Figure 8 by column, slightly different from the estimated composition discussed above. It is worth noting that the results in Figure 8 were mainly for the VRs that occurred more frequently in the past (as discussed above) because the more occurrences of VRs, the higher the accuracy of the projection would be ( Figure 9).

Deterministic projection: the precipitation evolution in the future
The ensembled projected precipitation is calculated based on the BMA weights on all models. Figure 10 illustrates the estimation of the average variation for every precipitation index of the YRB in two 30-year periods (taking 2021-2050 as the near future scenario and 2061-2090 as the distant future scenario).
The PY in the YRB shows an increasing trend in both 2021-2050 ( Figure 10(a-1)) and 2061-2090 ( Figure 10(a-2)), and the annual precipitation variation in the northwest and the east (the downstream) of the YRB will be obviously and consistently more minor than the other region. The increase in annual precipitation will be greatest in the southwest of the YRB and up to more than 80 and 120 mm locally under the two future scenarios. There is a spatial pattern of low-high-low (L-H-L) for the distribution of PY variation from the northwest to the southeast of the YRB.
Compared with the PY, the PS (Figure 10(b-1) and 10(b-2)) in the YRB will vary. In terms of the overall trend, there will be a bit of a decreasing trend locally under both the future scenarios. The decrease in PS will reach 10 and 20 mm, respectively. Unlike the PY, the PS in the southwest and northeast tends to increase, but the southeast will be opposite, and both increases and decreases in 2061-2090 are more significant than that in 2021-2050. In particular, there is a spatial pattern of high-lowhigh (H-L-H) for the distribution of PS variation from the southwest to the northeast of the YRB.
The R 10 will have a trend of increasing in most areas of the YRB with an increment of 1-2 in 2021-2050 ( Figure 10(c-1)) and 2-4 in 2061-2090 ( Figure 10(c-2)). Dramatically, there will be a more significant increase of R 10 in the southwest of the YRB, where the R 10 will be up to 4 and 7 in 2021-2050 and 2061-2090, respectively. Thus, spatially, the distribution pattern is somewhat like the PY.
The PI of the YRB (Figure 10(d-1) and 10(d-2)) will have a slightly increasing trend in both periods. There is no recognizable spatial pattern of PI variation. Still, PI variation in the northwest of the YRB will be more stable wholly based on a consistent increase or decrease of less than 0.5.
With the future estimation of precipitation indexes in Figure 10 and the definition of these indexes in Table 2, the precipitation evolution in the future could be projected sequentially. The projection can be implemented from the four indexes. The annual precipitation is more concerning due to its dominant effect. In all areas of the YRB, the annual precipitation will increase. The increase from 2061 to 2090 is more significant than 2021-2050, while the annual increase in the southwestern region will be more significant than that in other regions. The summer precipitation is another focus for the impact on the flood problem. In the southwest and east regions, the increment will be dramatically higher than in both scenarios. Comparing the precipitation variation of PY and PS, the increment of annual precipitation mainly occurs in summer, which means the risk of flooding will increase in the future. The R 10 , one of the most direct indicators of rainstorm disaster, will also increase on average about four to seven times per year in the southwest and two to four times in the midland. With the PY, PS, and R 10 , the southwest of the YRB, namely the upstream, needs more attention with the most increment of these indexes. The situation will be more difficult in the distant future (2061-2090). By contrast, the PI will increase slightly, mainly in the south of the YRB, still increasing the whole YRB. As shown in Figure 11, the VR in every subgraph means the corresponding index has changed in a specific range. The meaning of VR corresponding to each indicator is shown in Table 3. For one grid, the sum of the probability values of all six VRs is 1, where the sum of the probabilities of VR1-VR3 is the probability of decreasing, and the sum of the probabilities of VR4-VR6 is the probability of increasing. Obviously, among all the indexes under both scenarios in Figure 11, the probability of increasing is greater than that of decreasing in most regions of the YRB. In most regions of the YRB, the probability of PY increasing by more than 50 mm in both scenarios is much higher, and the probability in 2021-2050 (Figure 11(a-1)) is greater than that in 2061-2090 (Figure 11(a-2)), especially in the south of the  YRB. In the northwest of the YRB, it may increase 0-50 mm with a probability of 40-60%, whatever the scenario. The probability of PY decreasing by more than 25 mm in almost all regions (VR1 and VR2 in Figure 11(a-1) and 11(a-2)) is close to 0 under both scenarios.
PS has a more decentralized probability distribution than PY and is the largest among all indexes. By comparison with PY, the probabilities of PS increasing by 0-25 mm or decreasing by 0-25 mm will be proportionable under both scenarios. In particular, the difference in the spatial distribution pattern of probability between the two scenarios is slight. The probability of increasing by 0-25 mm in the north of the YRB is closer to 100% than 2021-2050 (Figure 11(b-1)). The same is true for the southwest part.
The probabilities of R 10 changing concentrate more on a specific variation range than either PY or PS, overwhelmingly occurring to increase by 0-2 days and by 2-4 days. Compared with in 2021-2050 (Figure 11(c-1)), there are more significant and more recognizable probabilities of R 10 increasing by 0-2 days in the northwest regions in 2061-2090 (Figure 11(c-2)). Like PY, the probability of R10 decreasing by more than two days in most regions of the YRB is 0.
The probability distribution in each VR of PI is similar to R 10 . Most probability occurs in VR4 (PI increasing by 0-0.5 mm/ day) and VR5 (PI increasing by 0.5-1 mm/day). Under both scenarios, the probability of PI increasing by 0-0.5 mm/day will be up to more than 0.8 in most regions in the west and north of the YRB, while the probabilities of PI increasing by 0.5-1 mm/ day are between 20 and 40%. It mainly occurs in the east and south. The probabilities of other VRs are negligible.

DISCUSSION: THE CONSISTENCY BETWEEN THE DETERMINISTIC AND PROBABILISTIC PROJECTION
There is an apparent consistency between the deterministic projection in Section 4.3 and the probabilistic projection in Section 4.4, mainly manifested in spatial distribution and variation. In the deterministic projection in Figure 10(a-1), the increase of the PY of regions such as the southeast of Subregion 1 and Subregion 2 and Subregions 5-7 was significantly higher than that of other regions. In the probabilistic projection in Figure 11(a-1), the occurrences of VR6 (increased by more than 50 mm) in these regions were relatively high. However, as shown in Figure 10(a-1), the increases in the northwest of Subregions 1-2, Subregion 3, and Subregion 4 are mostly less than 50 mm. The variations of these regions are more likely to be between 0-25 mm or 25-50 mm, as shown in Figure 11(a-1).
We can also see that the future variations of the southeast of Subregions 1-2 in Figure 10(a-1) are between 50 and 100 mm, and the probability of increasing more than 50 mm in the future in these regions in Figure 11(a-1) is close to 1. Similarly, in the northeast of Subregion 5 and Subregion 6, the future variations of these regions are between 50 and 100 mm (Figure 10(a-1)), and the probabilistic projections indicate that these regions have the greatest probability of increasing by more than 50 mm in the future (Figure 11(a-1)). This shows consistency in the future variation. The projections of the two periods in the future have these consistencies in both aspects (Figures 10(a-2) and 11(a-2)).
In this study, we also found that the accuracy of the projection of a precipitation event (VR) is related to its occurrences (Figures 8 and 9). This leads us to think that this may be the best method to define precipitation events for each precipitation index to obtain more comprehensive and specific accuracy information of probability estimates. At present, we have considered four typical precipitation indexes. If we consider more different precipitation indexes, we can understand the ability of the probability estimation method to project other precipitation characteristics. This study is based on the data of five GCMs. If more models are applied, a higher estimation accuracy may be obtained. We will discuss the above issues in the subsequent work and constantly improve the probability estimation method of future precipitation.

CONCLUSION
This paper proposed an estimation method of future precipitation variation using the BMA method, constituting the deterministic and probabilistic projection of future precipitation. It was applied in the YRB. First, the spatial distribution of variations of four precipitation indexes (namely annual mean precipitation (PY), summer precipitation (PS), heavy rain index (R 10 ), and annual mean precipitation intensity (PI)) over the YRB was analyzed sequentially. Then, based on the precipitation projection and BMA weights of all selected models, the probability of variation in different ranges for all precipitation indexes at different times in the future was estimated spatially. Then, the credibility of the deterministic projection based on BMA was evaluated with E D (mean of long-term annual relative simulation deviation) and V D (variance of long-term annual relative simulation deviation). The deviation and accuracy evaluated the credibility of the probabilistic projection. The main conclusions are: (1) The BMA model performs well in reconstructing the precipitation conditions because of the lowest MASD (the mean of the absolute value of simulation deviation) in simulating the observed historical precipitation indexes. In addition, the relatively lower E D (the mean of long-term annual relative simulation deviation) and V D (the variance of long-term annual relative simulation deviation) in simulating precipitation situations under a given climate scenario give the BMA model more credibility than the individual models.
(2) The probability estimation method proposed in this study can give sufficient probability information of different future precipitation variations. Among the four precipitation indexes, the accuracy between the estimated VR with the highest probability and the observed VR is 38.31-53.74%. The deviations in 81.93-94.70% of grids were no more than 1 level. Thus, the estimation accuracy of different precipitation events (VRs) varies greatly, and there was a positive correlation between the accuracy and the occurrence of events. (3) The precipitation situations indicated by four indexes of PY, PS, R 10, and PI have an increasing tendency in the whole YRB for a distinct higher probability of increasing with varying degrees rather than of decreasing, which suggests an increased risk of summer flooding in the future, compared to the near future. The increment of all the precipitation indexes will be even more significant in the distant future, as will the probability for a range of variation. (4) There is a good consistency between the deterministic projection and the probabilistic projection of the future precipitation in the YRB, which is reflected in the remarkable coincidence of spatial distribution and variation trend. With a better method to define precipitation events (classification method) and applying more and more reliable climate models, it is expected that more reliable precipitation projection can be obtained in the future.