Rainfall data are an essential input for many simulation models. In fact, these latter have a decisive role in the development and application of rational water policies. Since the accuracy of the simulation depends strongly on the available data, the task of optimizing the monitoring network is of great importance. In this paper, an application is presented aiming at the evaluation of a precipitation monitoring network by predicting monthly, seasonal, and interannual average rainfall. The method given here is based on the theory of the regionalized variables using the well-known geostatistical variance reduction method. The procedure, which involves different analysis methods of the available data, such as estimation of the interpolation uncertainty and data cross validation, is applied to a case study data set in Tunisia in order to demonstrate the potential for improvement of the observation network quality. Root mean square error values are the criteria for evaluating rainfall estimation, and network performance is discussed based on kriging variance reduction. Based on this study, it was concluded that some sites should be dropped to eliminate redundancy and some others need to be added to the existing network, essentially in the center and the south, to have a more informative network.
INTRODUCTION
Rainfall is highly dynamic and constantly changing in form and intensity across a given region, especially in semi-arid regions (Cudennec et al. 2007). An important stage in many meteorological applications, climatic analysis and engineering design projects is the point or areal estimation of climatic variables over a catchment area from data taken at several weather stations. Traditionally, rainfall rates are derived from rain-gauge measurements. To estimate precipitation properly, an optimal spatial distribution of rain-gauges is required (Xu et al. 2013; Al-Mukhtar et al. 2014; Asghari & Nasseri 2015; Wu et al. in press). Thus, special attention is required not only during the actual measurements but also in the design of the rain-gauge monitoring network.
Rain-gauge monitoring networks are often installed to provide measurements that characterize the temporal and spatial variations in rainfall. However, even though modern rain-gauges are capable of providing the rainfall rate in real time at a very fine resolution in time, the spatial variation of rainfall is still difficult to characterize without a well-designed rain-gauge network (Xu et al. 2015). Pardo-Iguzquiza (1998) has indicated that the problem of rain-gauge networks' optimization could be considered rather out of date, as current weather radar analysis provides an estimation of rainfall rates at an excellent spatial and temporal resolution. However, complete coverage by weather radar is still limited in some countries and, anyhow, remains dependent on assimilation with ground measurements, all the more in mountainous semi-arid contexts. On the other hand, although recent advances in satellite remote sensing seem to have the potential for providing full spatial coverage of pixel rainfall estimates (Chang et al. 1998; Wu et al. 2015), satellite images alone still cannot provide accurate rainfall estimates at the spatial resolution required to match rain-gauge measurements.
Otherwise, the selection of rain-gauge locations is affected by many factors, such as accessibility, ease of maintenance, and topographical aspects, and the minimum density required for a rain-gauge network is dependent on the temporal resolution of the desired rainfall measurements. The optimization of a rain-gauge network for large time scale (annual, monthly, daily, etc.) measurements does not imply that it is optimal for less important ones. Generally, it is not due to the position of the rain-gauges, but due to their number (Pardo-Igùzquiza 1998). A much denser network is often necessary to characterize rainfall patterns with great variability and intermittency, even if optimization is now sometimes used for cost-driven reduction of the networks.
Determining a precise methodology for both expanding and evaluating the existing rain-gauge network is essential to understand the capacity of the network and the quality of the data it provides. Consequently, when selecting rain-gauges, one must take into account both their number and location to ensure the greatest accuracy in estimating rainfall. However, increasing the number of sampling points may produce a redundancy in the data which in turn reduces cost efficiency. Cost considerations are especially critical in developing countries that face extreme pressures on both their water and financial resources.
Designing accurate rainfall monitoring networks has occupied researchers for many years. Over the last decade, a number of public agencies whose purpose is to monitor environmental variables (Nunes et al. 2004; Pardo-Igùzquiza & Dowd 2004; Baalousha 2010) and groundwater variables (Carrera et al. 1984; Prakash & Singh 2000; Theodossiou & Latinopoulos 2006; Yang et al. 2008) have invested great economic, technical, and human resources in planning and operating improvements on existing monitoring networks. Meteorological monitoring networks, and particularly those devoted to rainfall monitoring, are among those which have received most attention from researchers (Kassim & Kottegoda 1991; Papamichail & Metaxa 1996; Ashraf et al. 1997; Pardo-Igùzquiza 1998; Goovaerts 2000). Recent scientific literature has provided various approaches, characterized by different levels of complexity, capable of supporting optimal network design (Barca et al. 2008; Cheng et al. 2008; Chebbi et al. 2011, 2013; Shaghaghian & Abedini 2013; Shafiei et al. 2014; Adhikary et al. 2015; Xu et al. 2015; Wu et al. in press). Among others, one type of approach is a kriging-based geostatistical approach that finds wide applications in rain-gauge network design. Optimal network configuration can be achieved by the variance reduction method (Adhikary et al. 2015). In many studies, this technique was employed alone for the rain-gauge network design (Kassim & Kottegoda 1991; Papamichail & Metaxa 1996; Ashraf et al. 1997; Prakash & Singh 2000; Pardo-Igùzquiza & Dowd 2004; Theodossiou & Latinopoulos 2006; Cheng et al. 2008; Chebbi et al. 2013; Shafiei et al. 2014; Adhikary et al. 2015). However, some studies applied the kriging technique in combination with other techniques, such as entropy (Chen et al. 2008; Yeh et al. 2011), multivariate factor analysis (Shaghaghian & Abedini 2013), and simulated annealing (Pardo-Igùzquiza 1998; Barca et al. 2008; Chebbi et al. 2011) for the network design. For more details on geostatistical theory, readers can refer to Chiles & Delfiner (1999), Deutsch & Journel (1992), Feki et al. (2012) and Hudson & Wackernagel (1994).
Several important studies have been conducted in the field of spatial analysis of network design. Kassim & Kottegoda (1991) compared simple and disjunctive kriging in the estimation of optimum locations of rain-gauges as part of a network for the determination of storm characteristics to be used in forecasting and design. Papamichail & Metaxa (1996) applied ordinary and universal kriging (UK) to describe the spatial variability of monthly rainfall data, and they used the property of kriging variance for the optimal design of the rain-gauge network. A criterion using ordinary kriging variance is proposed by Cheng et al. (2008) to assess the accuracy of hourly and annual rainfall estimation using the acceptance probability defined as the probability that estimation error falls within a desired range. Chebbi et al. (2011) proposed a method for robust rain-gauge network optimization using intensity–duration–frequency data by minimizing the mean spatial kriging variance. Adhikary et al. (2015) have developed a methodology to design an optimal rain-gauge network through optimal positioning of additional stations as well as optimal relocating of existing redundant stations using the kriging-based geostatistical approach for daily rainfall in Australia. Pardo-Igùzquiza (1998) presented a method that defines the optimal network by combining kriging and simulated annealing for daily and annual rainfall. The methodology presented by Barca et al. (2008), which uses geostatistics and simulated annealing, has been applied to the design of an extension to a rainfall monitoring network. A seasonal aggregation of the available rainfall data was considered. Chebbi et al. (2013) combined the variance reduction method with simulated annealing to optimally extend a rain-gauge network in order to interpolate rainfall intensity and an erosivity index. A methodology for the design of a rain-gauge network was developed by Shaghaghian & Abedini (2013) using a combination of geostatistical tools and factor analysis for mean annual rainfall. Shafiei et al. (2014) evaluated the performance and augmentation of a rain-gauge network for annual time scale using a probabilistic approach combined with a geographic information system. Recently, in their study, Wu et al. (in press) intended to treat the radar grid-based data as the reference data set to propose a framework for evaluating the rain-gauge network. The evaluation framework is developed based on quantifying the fitness performance of the semi-variogram of rainfall characteristics.
In summary, there are numerous inter-related factors affecting rain-gauge network design as the objective of designing a network: the process considered, the attribute under consideration, the temporal scale, the spatial scale, the topographic setting, types of precipitation, the nature of the objective function used for and the optimization algorithm used for minimization or maximization purposes (Shaghaghian & Abedini 2013). The main objective of this article is to evaluate an existing rain-gauge network in a semi-arid region using the kriging variance reduction method for three time scales: monthly, seasonal, and interannual. The advantage of the kriging-based approach is that the variance of estimation is independent of the actual measurements; therefore, the network can be pre-designed. We first describe the topographic and rainfall of the study area and the selected data used, before presenting the methodology. Then, the results are given and discussed. The final section gives the conclusions drawn from this research.
MATERIALS AND METHODS
The case study and data
Monthly rainfall data for 140 rain-gauges in Tunisia (Figure 1) were collected from the General Direction of Water Resources in Tunisia of the Ministry of Agricultural (Direction Générale des Resources en Eau (DGRE)) and the National Institute of Meteorology (Institut National de la Météorologie (INM)). The analysis was performed on annual and seasonal precipitation totals as defined by climatological seasons: spring (March, April, May), summer (June, July, August), autumn (September, October, November), and winter (December, January, February). Then, monthly, seasonal, and interannual average rainfalls were calculated for each rain-gauge station with its own period of observation which varies from 30 to 100 years. Stationarity was confirmed for all rainfall series as they exceed 30 years (Feki 2010). All rainfall amounts were expressed as 1/10 mm and corresponding statistical analysis was carried out in Table 1.
. | S . | O . | N . | D . | J . | F . | M . | A . | MAY . | J . | JUL . | AUG . | AUT . | WIN . | SPR . | SUM . | Inter An . |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Minimum | 25 | 84 | 40 | 55 | 70 | 58 | 110 | 54 | 31 | 6 | 0 | 0 | 50 | 68 | 78 | 2 | 51 |
Maximum | 685 | 1,498 | 2,011 | 2,625 | 2,436 | 1,987 | 1,597 | 1,389 | 767 | 316 | 127 | 233 | 1,398 | 2,349 | 1,251 | 218 | 1,288 |
Mean | 322 | 476 | 451 | 502 | 493 | 413 | 402 | 326 | 220 | 108 | 33 | 88 | 416 | 469 | 316 | 77 | 320 |
Std. Dev | 126 | 242 | 313 | 417 | 393 | 312 | 218 | 191 | 126 | 72 | 27 | 58 | 213 | 373 | 175 | 50 | 190 |
Kurtosis | 0.3 | 3.7 | 7.5 | 8.1 | 7.1 | 7.7 | 11.0 | 9.8 | 2.5 | −0.2 | 0.9 | −0.2 | 5.0 | 7.7 | 8.7 | −0.1 | 7.1 |
Skewness | −0.6 | 1.3 | 2.3 | 2.4 | 2.2 | 2.3 | 2.6 | 2.2 | 1.1 | 0.6 | 1.0 | 0.6 | 1.5 | 2.3 | 2.1 | 0.6 | 2.0 |
. | S . | O . | N . | D . | J . | F . | M . | A . | MAY . | J . | JUL . | AUG . | AUT . | WIN . | SPR . | SUM . | Inter An . |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Minimum | 25 | 84 | 40 | 55 | 70 | 58 | 110 | 54 | 31 | 6 | 0 | 0 | 50 | 68 | 78 | 2 | 51 |
Maximum | 685 | 1,498 | 2,011 | 2,625 | 2,436 | 1,987 | 1,597 | 1,389 | 767 | 316 | 127 | 233 | 1,398 | 2,349 | 1,251 | 218 | 1,288 |
Mean | 322 | 476 | 451 | 502 | 493 | 413 | 402 | 326 | 220 | 108 | 33 | 88 | 416 | 469 | 316 | 77 | 320 |
Std. Dev | 126 | 242 | 313 | 417 | 393 | 312 | 218 | 191 | 126 | 72 | 27 | 58 | 213 | 373 | 175 | 50 | 190 |
Kurtosis | 0.3 | 3.7 | 7.5 | 8.1 | 7.1 | 7.7 | 11.0 | 9.8 | 2.5 | −0.2 | 0.9 | −0.2 | 5.0 | 7.7 | 8.7 | −0.1 | 7.1 |
Skewness | −0.6 | 1.3 | 2.3 | 2.4 | 2.2 | 2.3 | 2.6 | 2.2 | 1.1 | 0.6 | 1.0 | 0.6 | 1.5 | 2.3 | 2.1 | 0.6 | 2.0 |
S, September; O, October; N, November; D, December; J, January; F, February; M, March; A, April; J, June; JUL, July; AUG, August; AUT, autumn; WIN, winter; SPR, spring; SUM, summer; Inter An, interannual; Std. Dev, standard deviation.
Methodology
Kriging is a spatial interpolation method which predicts distribution of precipitation optimally (Abedini et al. 2013). One of the most interesting features of kriging, in general, is that the estimation variance can be calculated before actual measurements are available. This feature suggests its application for designing networks of monitoring stations (Carrera et al. 1984). The kriging variance depends on the structure and the geometric configuration of the data points. The further the point estimate is from its neighbors, the more the kriging variance increases. As the kriging weights and the semi-variogram between the estimated point and its neighbors are deducted from the theoretical model of the semi-variogram, the variance of the error depends on the distance between the points and their dispositions in space, regardless of values.
A rain-gauge network encompasses a number of gauges within an area of interest. Two major concerns for the evaluation and design of a network are the density and locations of these rain-gauges. The minimum density of a network depends on its functional objective and characteristics of dominant precipitation types. For water resources management (reservoir operation, water balance, etc.), estimates of long duration rainfall, such as mean monthly, seasonal, and annual rainfall, are needed (Chen et al. 2008; Shaghaghian & Abedini 2013). In evaluating and designing a rain-gauge network, the major concern is how to quantify the performance of an existing network. Intuitively, a measure of the network performance is the accuracy associated with rainfall estimates. The accuracy should reflect not only the absolute error, but also the error variance, since the errors depend heavily on rain field variability, and this variability is far from being consistent from one event to another and from one year to the next. If only unbiased estimators are considered, then the characteristics of estimation accuracy can thus be focused on the variance of estimation error (Cheng et al. 2008). The estimation accuracy of point rainfall varies within the study area and is dependent on the number and geometric layout of rain-gauges.
In this study, evaluation of the rainfall network passes through two major steps: (1) redundancy elimination and (2) network augmentation. To eliminate the redundant rain-gauges in the network, cluster analysis is used. Cluster analysis for hydrologic regionalization involves the grouping of observations or variables into clusters, with each cluster containing observations or variables with highly similar hydrologic characteristics, such as geographical, physical, statistical, or stochastic features (Soltani & Modarres 2006). Ward's method was used. It is an agglomerative clustering technique, where clusters are grouped together to include increasing numbers of objects in a stepwise fashion. At the lowest level of the clustering hierarchy, each cluster contains a single object. Larger clusters are formed by merging the two clusters that produce the smallest increase in the within-cluster variance. This merger step is repeated until a single cluster containing all objects is created. Ward's method has been found to produce satisfactory results for meteorological data in previous research (Alhamed et al. 2002). The progress and intermediate results of a cluster analysis are conventionally illustrated using the dendogram or ‘tree’ diagram, a bi-dimensional figure that represents the sequence and the distance at which the observations are clustered. In this case, the y axis indicates decreasing levels of similarity, and ‘branches’ on the dendogram indicate the level of similarity where objects are grouped into clusters.
The augmentation of the number of stations is based on geostatistical methods: UK (Papamichail & Metaxa 1996; Hengl et al. 2004; Brus & Heuvelink 2007) and kriging with external drift (KED) (Goovaerts 2000; Bourennane & King 2003; Feki et al. 2012) algorithms are compared to interpolate data from the monitoring network and to get rainfall maps and the kriging variance maps. According to Hengl et al. (2004), UK is a spatial prediction technique that jointly employs correlation with auxiliary maps and spatial correlation. Optimization of the sample pattern by minimizing the UK variance integrates optimization in feature and geographic space in one step. The method has no ambiguities and does not involve subjective choices (Brus & Heuvelink 2007). Many authors (Deutsch & Journel 1992; Hudson & Wackernagel 1994), however, agree that the term UK should be reserved for the case where the drift is modeled as a function of the coordinates only. The KED is more commonly used when the drift is defined ‘externally’ through some auxiliary variables (Chiles & Delfiner 1999; Feki et al. 2012). The advantage of KED is that the equations are solved at once. Note that, although the KED technique seems to be computationally more straightforward, it needs semi-variogram parameters of the global least square (GLS) regression residuals (which is often ignored), and therefore the GLS regression coefficients. Many researchers make different assumptions to escape the estimation of residuals' semi-variograms. For example, Marcotte (2012) assumes that for a small amplitude drift the semi-variogram of the main variable can be used directly at close range. Hudson & Wackernagel (1994) and Bourennane & King (2003) go further by directly assuming that the semi-variogram of the residuals and of the main variable are equal.
Our approach particularly uses the kriging standard deviation to identify points of high variance as potential points for monitoring. The drawback of this approach is that it only considers the spatial variation of the data without considering the accessibility to the region. In other words, if we consider an additional point si+1 to a set of n points si (i=1,…,n) it is possible to construct the kriging system corresponding to the n+1 points s1,…,sn,sn+1, and deduce the new variance kriging related to this new experimental configuration. This method is called ‘method of fictitious point’.
Equation (1) indicates that the semi-variogram is independent of spatial locations; it only depends on the distance between the two points under consideration in the isotropic case. An experimental semi-variogram is computed from data pairs of observations, for specific distance lags and directions. A mathematical authorized model may be fitted to the experimental semi-variogram, and the coefficients of this model are used for kriging. In semi-variogram models where the semi-variance reaches a finite variance and levels out, the maximum is referred to as the sill. This is termed a bounded model (spherical and exponential models). Models that do not reach a sill are termed unbounded (power models). Unbounded semi-variograms may indicate a large-scale trend (that is, systemic increase or decrease) in the values of the property characterized across the region of study (Kitanidis 1997).
RESULTS AND DISCUSSION
Redundancy elimination
Characterizing the spatial variability of rainfall
The 101 rain-gauges are used in the semi-variogram calculation using the VARIOWIN Software. Under the second order assumption, a semi-variogram approaches its sill. However, except for July, the remaining calculated semi-variograms are fitted visually by power type (Table 2). Therefore, the semi-variograms increase without a sill; it means that the random rainfall field is non-stationary due to a spatial trend of the expected rainfall values (Feki 2010). Arnaud & Emery (2000) mentioned that the representation of a variable by a semi-variogram is a subjective operation: there is no ‘real’ underlying model. In practice, it is necessary to ensure that the model meets the main characteristics of the experimental semi-variogram, i.e., behavior at the origin, to the infinity, etc. Moreover, the adjustment of the model is not done using only the experimental semi-variogram but must integrate all available information on the phenomenon. Therefore, using such automatic methods of fitting as least square method, cross-validation, etc. is not justified as much as the phenomenon must be well-known spatially.
. | Rainfall simple semi-variogram . | |||
---|---|---|---|---|
. | Model . | C0 . | α* . | β** . |
September | Power | 0 | 0.07 | 1 |
October | Power | 3,478 | 0.26 | 1 |
November | Power | 10,780 | 0.5 | 0.98 |
December | Power | 34,200 | 0.68 | 1 |
January | Power | 17,600 | 0.8 | 0.99 |
February | Power | 18,430 | 0.48 | 0.98 |
March | Power | 9,870 | 0.3 | 0.96 |
April | Power | 6,290 | 0.2 | 0.98 |
May | Power | 1,120 | 0.07 | 1 |
June | Power | 312 | 0.03 | 0.98 |
July | Spherical | 73 | 730 | 210,548 |
August | Power | 594 | 0.018 | 0.98 |
Autumn | Power | 24,586 | 1.78 | 1 |
Winter | Power | 221,000 | 6.5 | 0.98 |
Spring | Power | 47,600 | 1.14 | 1 |
Summer | Power | 690 | 0.2 | 0.96 |
Interannual | Power | 572,000 | 22.9 | 1 |
. | Rainfall simple semi-variogram . | |||
---|---|---|---|---|
. | Model . | C0 . | α* . | β** . |
September | Power | 0 | 0.07 | 1 |
October | Power | 3,478 | 0.26 | 1 |
November | Power | 10,780 | 0.5 | 0.98 |
December | Power | 34,200 | 0.68 | 1 |
January | Power | 17,600 | 0.8 | 0.99 |
February | Power | 18,430 | 0.48 | 0.98 |
March | Power | 9,870 | 0.3 | 0.96 |
April | Power | 6,290 | 0.2 | 0.98 |
May | Power | 1,120 | 0.07 | 1 |
June | Power | 312 | 0.03 | 0.98 |
July | Spherical | 73 | 730 | 210,548 |
August | Power | 594 | 0.018 | 0.98 |
Autumn | Power | 24,586 | 1.78 | 1 |
Winter | Power | 221,000 | 6.5 | 0.98 |
Spring | Power | 47,600 | 1.14 | 1 |
Summer | Power | 690 | 0.2 | 0.96 |
Interannual | Power | 572,000 | 22.9 | 1 |
C0 (1/10 mm2) nugget effect; α* (1/10 mm2) sill (for the spherical model) or slope (for the power model); β ** (1/10 m) range (for the spherical model) or power coefficient (for the power model).
Spatial variation of interannual rainfall
Spatial variation of seasonal rainfall
Spatial variation of monthly rainfall
Nugget effect
Rainfall cartography
In this study, we have assigned a deterministic status for the internal and external drift, and we have admitted that the semi-variograms of the main variable (rainfall) and the residuals are closer (Hudson & Wackernagel 1994; Bourennane & King 2003; Marcotte 2012). Based on these results, the simple semi-variogram parameters of the main variable (rainfall) are used and introduced into the kriging algorithms. The UK and the KED are applied for the estimation of the monthly, seasonal, and interannual average rainfalls. The four months of October, January, March, and June are used for illustration.
During the winter season (January), there is a visible important gradient in the north-west of the country, but in the center and the south there is no spatial structure. In spring, the isohyets keep the same direction, namely, the north-east/south-west, but with a more important gradient in the north than in the center. Finally, the interannual rainfall isohyets' disposition is a combination of the seasonal ones; therefore, we distinguish an important north-west/south-east gradient in the north-west and a less important gradient all over the center and the south.
Cross-validation
Distribution of estimation uncertainty
Localization of new rain-gauges
The proposed approach to study the optimization of a network and decide about its capacity can be as follows:
Calculate a semi-variogram that includes all events or any time lag of the studied field.
Calculate and map the kriging standard deviation of this field.
Identify areas of strongest local maxima of kriging standard deviation.
Implement fictitious observation stations in these areas and re-calculate the kriging standard deviation.
Calculate and map the gain in precision.
Identify areas where the gain exceeds a given threshold, these areas are concerned by the implementation of new stations.
Seek valuable sites in these areas to install actual station measures, responding to other informative and practical criteria such as the absence of obstacles, easy access, and availability of means of communication.
CONCLUSION
In this paper, we are interested in optimizing the design of the rain-gauge network in Tunisia based on reducing the variance of kriging. A first phase of statistic analysis, on mean monthly, seasonal, and interannual data, is followed by the computation of the experimental semi-variograms and the semi-variogram model fitting. The analysis of these semi-variogram models has clearly highlighted some peculiar characteristics. In fact, all the months except July, seasons and interannual time scales have a power type model of semi-variogram showing a large-scale trend. UK and KED are applied to develop rainfall contour maps using geographical coordinates as internal drift and altitude as external drift, respectively. These methods are compared and it is shown that the method of KED performs better than the UK method in terms of RMSE. The plots of kriging standard deviation show contours of increasing magnitude where station density is minimal. A test implementation of some new stations is conducted throughout the country and the gain assessed. We have concluded that in the northern part of the country the system interpolation is not sensitive when adding additional information. Therefore, the existing network satisfactorily describes rainfall spatial variability. Indeed, the maximum obtained gain in precision exceeds 10%. In the center, the system is averagely sensitive when adding new stations, since the gain in precision varies between 20 and 40%. In the south, the gain accuracy can reach 80%. Consequently, the network is insufficient to take into account the rainfall spatial variability, especially in autumn. A further planned improvement of the methodology consists of integrating the geographic coordinates and altitude as secondary variables altogether and by using other multivariate geostatistical methods such as regression kriging.
ACKNOWLEDGEMENTS
The authors thank the INM and the DGRE for providing rainfall data.