Comparison of spatial interpolation methods for the estimation of precipitation patterns at different time scales to improve the accuracy of discharge simulations

Interpolating precipitation data is of prime importance to hydrological design, modeling, and water resource management. Various models have been developed that estimate spatial precipitation patterns. The purpose of this study is to analyze different precipitation interpolation schemes at different time scales in order to improve the accuracy of discharge simulations. The study was carried out in the upstream area of the Changjiang River basin. The performance of all selected methods was assessed using cross-validation schemes, with the mixed methods ultimately displaying the best performance at all three time scales. However, the differences in performance between the spatial interpolation methods decreased with increasing time scales. The unifying catchment Soil and Water Assessment Tool (SWAT), ‘ abcd ’ , and the Budyko equation were employed at the daily, monthly, and annual scales, respectively, to simulate discharge. The performance of the discharge simulation at the monthly and annual time scales was consistent with their ranks of spatial precipitation estimation. For coarse, or long period, precipitation, there were no signi ﬁ cant differences. However, the mixed methods performed better than the single model for the daily, or short, time scale with respect to the accuracy of the discharge simulation.


INTRODUCTION
Precipitation is one of the most important variables in the water cycle. Accurate continuous spatial precipitation data plays a significant role in the hydrological planning, modeling, and management of water resource systems (Parker et al. ). As precipitation is often only recorded at individual gauge sites and is very dispersed at larger scales (Vicente-Serrano et al. ), continuous spatial data over a region cannot be directly collected from measurements.
In order to solve this problem, various spatial interpolation methods for the estimation of precipitation have been developed based on point measurements over the past several years (Chaubey et al. ; Zhang & Srinivasan ). Spatial interpolation methods range from conceptually simple weighting methods, such as Thiessen polygons (Thiessen ) or Inverse Distance Weighting schemes, to more complex approaches using artificial intelligence and covariates (Burrough & Mc Donnel ; Vicente-Serrano et al. ).
In addition to elevation data, other parameters extracted from digital elevation models (DEMs) are used as covariates for interpolating precipitation. Grid data, such as Radar (NEXRAD) (Teegavarapu et al. ) and satellite data acquired by Tropical Rainfall Measuring Mission (TRMM) products (Wagner et al. ), are increasingly used in spatial interpolation as covariates. The mixed, or multi-method ensemble, approach has been increasingly used to improve model estimations (Hagedorn et al. ). The contribution of each individual model in the mixed, or multi-method ensemble, approach is represented by its weight. The use of simple weighting methods is still very common, as there are always insufficient or low-resolution covariate data and requires only low intensity computation. As both simple weighting methods and complex approaches have their own advantages and disadvantages based on specific case studies, it is necessary to select an appropriate spatial interpolation method for a given input dataset (Burrough & Mc Donnell ).

METHODOLOGY
In order to evaluate the performance of various precipitation interpolation methods with regard to their suitability at different time scales, daily, monthly, and annual precipitation distributions were selected for testing via cross-validation schemes. The results from each precipitation interpolation method were also used to drive hydrological models in order to analyze the effects of the different precipitation interpolation methods on the discharge simulation. A framework of the entire workflow process employed in this study is depicted in Figure 1, with associated abbreviations shown in Table 1.

Spatial interpolation methods for estimating precipitation distribution
Spatial interpolation methods can be grouped into four categories: local methods, global methods, geostatistical methods, and mixed methods (Burrough & Mc Donnell ; Plouffe et al. ). Six different methods (Table 1) were applied and compared in our case study. These methods were evaluated on daily, monthly, and annual scales.
Nearest Neighbor (NN) and Inverse Distance Weighting (IDW) are local spatial interpolation methods that utilize the concept of Tobler's first law to estimate precipitation by distance (Chen & Liu ). The NN (which is also called the Thiessen polygons interpolation method) method assumes that point p u at the unsampled spatial location is equal to the closest available rainfall, p i . If λ u,i is the distance between locations u and i, the weight of λ u,i is 1 for the location i corresponding to the minimum distance, d u,i , and zero for all other locations (Thiessen ; Nikolopoulos et al. ). The IDW method, recommended by the Handbook of Hydrology by ASCE (), is easy to implement (Teegavarapu & Elshorbagy ). IDW (Shepard 1968) assumes that weights, λ u,i , are also related to the distance, d u,I , from the sampled point, p i , to the estimated point, p u , as given by Equation (1).
where N is the number of sample points, and α is the power parameter in the interpolation. The variable α is the friction distance (Vieux ), ranging from 1.0 to 6.0, and is the main factor affecting the estimation. It also controls the decay of the weights with distance. The higher the value of α, the greater the influence of the points closest to the interpolated location (Nikolopoulos et al. ).
Different from the basis of Tobler's first law, where the precipitation is estimated by distance for the two local spatial interpolation methods (NN and IDW), trend surface analysis (TSA) is a global method that treats each value in a geographic area as a centroid in a grid pattern (Wang ).
where λ i,j is the polynomial coefficient and N is the degree of the polynomial that represents the best surface of the precipitation pattern. λ i,j and N are determined by using the least-squares approximation (Krumbein ) where γ(x i À x j ) represents the value of the semivariogram function for the distance between points x i and x j . γ(x j À x u ) is the value of the distance between x j and the estimated location x u , and ϕ is the Lagrange parameter, which is equal to the mean distance. On the basis of the second-order stationary hypothesis, the semivariogram is derived by fitting a semivariogram model to the empirical one, and can be determined for all distances, h, by solving Equation (4).
where C 0 , C 0 þ C, and a are the parameters known as the nugget, the sill, and the range, respectively. CK offers the possibility of considering more than one variable in the kriging interpolation, especially elevation (Pardo-Iguzquiza et al. ), and is expressed as following: where the subscripts '1' and '2' refer to the primary and secondary networks, respectively, and N 1 and N 2 are the number of available rain gauges in the primary and secondary networks over the period, respectively. Weights λ 1 and λ 2 are obtained by solving Equation (7).
In order to combine the physical relationships between climatic data and geographic or topographic variables, as done in the global TSA method, and the spatial correlation between the information recorded at weather stations, as done in OK, a mixed method, called TSA-OK, has been proposed to improve the accuracy of spatial interpolation methods. The TSA is an inexact interpolator, and has residual errors (as shown in Equation (8)). The residuals from the TSA method are then interpolated by the OK method (Equation (9)). The exact climatic variable for the final prediction is obtained through Equation (10).
where Pr(p|f k ) is the probability prediction given by the simulation of f k alone, and Pr( f k |p T ) is the posterior probability of the method prediction, given the data set, p T . Pr(f k |p T ) is also identified as a fractional statistical weight, w k , of the kth spatial interpolation method, whose magnitude reflects how well f k matches p T , and P 5 k¼1 w k ¼ 1. In order to determinate the prior Pr(p|f k ) for the parameters and the prior probability of each method, it is computationally convenient to assume that Pr(p|f k ) is a Gaussian distribution defined by a mean, μ k , and variance, σ k 2 (Raftery et al. ; Duan & Phillips ).
where g(•) is the Gaussian probability density function, and is the parameter vector.
Substituting (12) into (11), The log-likelihood function, l, was maximized to obtain both the Bayesian weight, w k , and the parameter vector, θ k , for its convenient computation. The log-likelihood function, l, is approximated as: In order to eliminate the overreach of biasing inferences on or the bias from only one of several methods considered,

'Abcd' model
The 'abcd' model is a nonlinear monthly hydrological model that was originally proposed by Thomas () for national water assessment. The key assumption of the 'abcd' model is that the evapotranspiration opportunity is nonlinearly related to the available water in a manner such that the evapotranspiration opportunity increases quickly with the available water for water-limited conditions, but asymptotically approaches a maximum constant value for energy- Statistical criteria for assessing the agreements of the spatial interpolation methods and hydrological models Statistical criteria for the spatial interpolation used to estimate precipitation The performance of every spatial interpolation used to estimate precipitation was assessed and compared using the leave-one-out, or cross-validation, approach (Goovaerts Berndt & Haberlandt ). The procedure was undertaken on different time scales (daily, monthly, and annually in this study) for each of the spatial interpolation methods.
where O i andp i are the observation and estimation at the ith gauge station, respectively.

Performance assessment of hydrological models
To evaluate the fitness of the hydrological models, Nash-Sut-  (18) and (19).

STUDY AREA AND DATA SET
The study area is located upstream of Yichang city in Hubei and is defined as the upper reach of the Changjiang River

Parameters of precipitation spatial interpolation calibration
The critical problem associated with the TSA is selection of the optimal polynomial order (Chayes ). In this study, the degree of fitting (AIC) was used to determine the polynomial order of the TSA. The lower the AIC value a particular method produced, the better fitness the observed data provided. As presented in Table 2, a third order polynomial was assumed to be sufficient to capture all the daily, monthly, and annual precipitation variations.
To identify the optimal parameters of the IDW method at the daily, monthly, and annual time scales, the α value, number of sample points, N, the research radius, the minimum MAE and RMSE, and the maximum R should be determined. The values of MAE, RMSE, and R, shown in Figure 3, revealed that the optimal values for parameters N and α can be determined as shown in Table 3.
Three semivariogram (Gaussian, spherical, and exponential) models were adopted for fitting the empirical model in the OK and CK spatial interpolation methods.  Table 4) and was selected by the OK and CK spatial interpolation methods.
The global optimal BMA weights for the log-likelihood function (13) were obtained by using the EM algorithm with daily, monthly, and annual time steps (shown in   Figure 5) were more dispersive than at the monthly and annual scales. It can be seen that the performance of BMA combined with seven methods was better than TSA-OK at the daily scale. The more dispersive the BMA weights, the more accurate the results from the methods that combined one or more approaches. If the weights of every single method in BMA were equal or similar, there was no significant difference between the mixed methods that were combined with various numbers of single methods. As the spatial precipitation correlations for short periods (daily) varied more than those for longer periods (monthly and annually), it was determined that a mixed   method with more single models was more accurate than a mixed method with fewer ones. Although the best performance of the mixed method might be partly due to the different number of parameters, the input for every method was just the precipitation series, and recommendation of a mixed method will still be useful in practice.
Of the single methods evaluated, the best performance was from IDW. OK was better than CK, TSA was better than NN and CK, and OK was better than TSA at the monthly and annual time scales, while the former was worse than the latter at the daily scale. The spatial variation of the monthly precipitation was higher than that of the annual precipitation. The values of R increased from the daily, monthly, and annual time scales, while their variation ranges decreased, which was different from the values of R at each site. The performance of each single method was not significantly different at the monthly and annual scales from the perspective of the entire study area. The annual data compensated for the possible differences in daily precipitation amounts, as in the case of moving storms, which was one reason to improve the interpolation methods.
Therefore, while any method could be recommended to estimate the monthly or annual spatial precipitation distribution, IDW was the primary recommendation.
The performance of all seven methods was also examined at 53 gauge stations. At a given station, the larger MAE, R, and RMSE values are shown in darker red, while the smaller values are shown in darker blue, as seen in Figure 7. According to the gauge station distribution shown in Figure 2, the better performance locations were   found downstream of the study area where the elevation was lower. Estimations for the upstream sites from the 3rd to the 30th gauge stations were not as good as those downstream.
As the precipitation gauge station density was low at the border of the study area (see Figure 3), the poor performance at these sites can be partly attributed to the lack of precipitation gauges with similar characteristics. Considering the R values, the performance of each method at the monthly and annual time scales improved compared to the daily time scale (e.g., No. 13 gauge site). However, the R values at the monthly scale were better than those at the annual scale for most sites. As precipitation in the study area comes mainly from monsoons, the spatial correlations between sampling sites were strong and beneficial to spatial interpolation. The performance of each method at some sites was poor for every time scale (e.g., No. 17 gauge site).
In order to directly compare the performance of each method at a given gauge site, the worst method was given a  Table 5 and Figure 7). However, the ranks of PBIAS were consistent with those of the R values with respect to the entire study area (shown in Table 5 and Figure 6). The highest NSE values were obtained from the 'abcd' model at the  Table 5).
The range of their differences was 0.05-0.06 for NSE (which accounts for nearly 10% of the average NSE value) and 5.50%-5.76% for PBIAS (which accounts for nearly 50% of the average PBIAS value). However, the differences at the monthly and annual time scales were smaller. If the metrics NSE and PBIAS were determined for the total period from 1960 to 2008, BMA would be found to have the best performance for the discharge simulation at the three time scales. However, the ranks of the discharge performance driven by another spatial precipitation interpolation scheme were different from those of spatial precipitation based on cross-validation techniques. It can also be inferred from the discharge simulation shown in Table 5 that the sample splitting for calibration and validation led to uncertainties in the discharge simulation.
Scatterplots depicting the observed discharge versus the simulated discharge at different time scales for low discharges exhibited a closer linear relationship than at high discharges. In addition, the simulated discharge was lower than the observed at the daily and monthly time scales for the calibration and validation periods. The observed vs. simulated discharge plots driven by the NN spatially interpolated precipitation at the daily scale for the validation period in Figure 9(a2) showed more concordance between the simulated and observed values than in the calibration period ( Figure 9(a 1 )). The spatial precipitation relationships between each gauge station at the daily scale were not as strong as at the monthly and annual time scales. The land surface conditions, which consider such things as water storage in the soil, contributed to the flow generation and routing. Therefore, the performance of the discharge simulation at the daily scale was not exactly the same as the corresponding precipitation estimate. The difference in large discharge values for the calibration period was more than in the validation period, while the NSE and PBIAS values for the calibration period were less than in the validation period (Table 5).  ⃰ 'I' is for the calibration period ; 'II' is for validation period (1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008) and 'II' is for whole period .
with respect to the discharge simulation. The performance of the daily discharge simulation was not consistent with the performance of the daily precipitation interpolation, because factors such as water storage and initial water conditions influence the discharge simulation in addition to precipitation. It was typical for the discharge simulation to underestimate high discharge. Ultimately, there was no dominant spatial precipitation interpolation for the discharge simulation at small time scales (i.e., daily).
Although the selection of spatial interpolation methods for areal precipitation estimation and discharge simulation at different scales have been proposed based on our case study, more case studies should be done for more general conclusion and guideline.