Rainfall data are an essential input for many simulation models. In fact, these latter have a decisive role in the development and application of rational water policies. Since the accuracy of the simulation depends strongly on the available data, the task of optimizing the monitoring network is of great importance. In this paper, an application is presented aiming at the evaluation of a precipitation monitoring network by predicting monthly, seasonal, and interannual average rainfall. The method given here is based on the theory of the regionalized variables using the well-known geostatistical variance reduction method. The procedure, which involves different analysis methods of the available data, such as estimation of the interpolation uncertainty and data cross validation, is applied to a case study data set in Tunisia in order to demonstrate the potential for improvement of the observation network quality. Root mean square error values are the criteria for evaluating rainfall estimation, and network performance is discussed based on kriging variance reduction. Based on this study, it was concluded that some sites should be dropped to eliminate redundancy and some others need to be added to the existing network, essentially in the center and the south, to have a more informative network.

## INTRODUCTION

Rainfall is highly dynamic and constantly changing in form and intensity across a given region, especially in semi-arid regions (Cudennec *et al.* 2007). An important stage in many meteorological applications, climatic analysis and engineering design projects is the point or areal estimation of climatic variables over a catchment area from data taken at several weather stations. Traditionally, rainfall rates are derived from rain-gauge measurements. To estimate precipitation properly, an optimal spatial distribution of rain-gauges is required (Xu *et al.* 2013; Al-Mukhtar *et al.* 2014; Asghari & Nasseri 2015; Wu *et al.* in press). Thus, special attention is required not only during the actual measurements but also in the design of the rain-gauge monitoring network.

Rain-gauge monitoring networks are often installed to provide measurements that characterize the temporal and spatial variations in rainfall. However, even though modern rain-gauges are capable of providing the rainfall rate in real time at a very fine resolution in time, the spatial variation of rainfall is still difficult to characterize without a well-designed rain-gauge network (Xu *et al.* 2015). Pardo-Iguzquiza (1998) has indicated that the problem of rain-gauge networks' optimization could be considered rather out of date, as current weather radar analysis provides an estimation of rainfall rates at an excellent spatial and temporal resolution. However, complete coverage by weather radar is still limited in some countries and, anyhow, remains dependent on assimilation with ground measurements, all the more in mountainous semi-arid contexts. On the other hand, although recent advances in satellite remote sensing seem to have the potential for providing full spatial coverage of pixel rainfall estimates (Chang *et al.* 1998; Wu *et al.* 2015), satellite images alone still cannot provide accurate rainfall estimates at the spatial resolution required to match rain-gauge measurements.

Otherwise, the selection of rain-gauge locations is affected by many factors, such as accessibility, ease of maintenance, and topographical aspects, and the minimum density required for a rain-gauge network is dependent on the temporal resolution of the desired rainfall measurements. The optimization of a rain-gauge network for large time scale (annual, monthly, daily, etc.) measurements does not imply that it is optimal for less important ones. Generally, it is not due to the position of the rain-gauges, but due to their number (Pardo-Igùzquiza 1998). A much denser network is often necessary to characterize rainfall patterns with great variability and intermittency, even if optimization is now sometimes used for cost-driven reduction of the networks.

Determining a precise methodology for both expanding and evaluating the existing rain-gauge network is essential to understand the capacity of the network and the quality of the data it provides. Consequently, when selecting rain-gauges, one must take into account both their number and location to ensure the greatest accuracy in estimating rainfall. However, increasing the number of sampling points may produce a redundancy in the data which in turn reduces cost efficiency. Cost considerations are especially critical in developing countries that face extreme pressures on both their water and financial resources.

Designing accurate rainfall monitoring networks has occupied researchers for many years. Over the last decade, a number of public agencies whose purpose is to monitor environmental variables (Nunes *et al.* 2004; Pardo-Igùzquiza & Dowd 2004; Baalousha 2010) and groundwater variables (Carrera *et al.* 1984; Prakash & Singh 2000; Theodossiou & Latinopoulos 2006; Yang *et al.* 2008) have invested great economic, technical, and human resources in planning and operating improvements on existing monitoring networks. Meteorological monitoring networks, and particularly those devoted to rainfall monitoring, are among those which have received most attention from researchers (Kassim & Kottegoda 1991; Papamichail & Metaxa 1996; Ashraf *et al.* 1997; Pardo-Igùzquiza 1998; Goovaerts 2000). Recent scientific literature has provided various approaches, characterized by different levels of complexity, capable of supporting optimal network design (Barca *et al.* 2008; Cheng *et al.* 2008; Chebbi *et al.* 2011, 2013; Shaghaghian & Abedini 2013; Shafiei *et al.* 2014; Adhikary *et al.* 2015; Xu *et al.* 2015; Wu *et al.* in press). Among others, one type of approach is a kriging-based geostatistical approach that finds wide applications in rain-gauge network design. Optimal network configuration can be achieved by the variance reduction method (Adhikary *et al.* 2015). In many studies, this technique was employed alone for the rain-gauge network design (Kassim & Kottegoda 1991; Papamichail & Metaxa 1996; Ashraf *et al.* 1997; Prakash & Singh 2000; Pardo-Igùzquiza & Dowd 2004; Theodossiou & Latinopoulos 2006; Cheng *et al.* 2008; Chebbi *et al.* 2013; Shafiei *et al.* 2014; Adhikary *et al.* 2015). However, some studies applied the kriging technique in combination with other techniques, such as entropy (Chen *et al.* 2008; Yeh *et al.* 2011), multivariate factor analysis (Shaghaghian & Abedini 2013), and simulated annealing (Pardo-Igùzquiza 1998; Barca *et al.* 2008; Chebbi *et al.* 2011) for the network design. For more details on geostatistical theory, readers can refer to Chiles & Delfiner (1999), Deutsch & Journel (1992), Feki *et al.* (2012) and Hudson & Wackernagel (1994).

Several important studies have been conducted in the field of spatial analysis of network design. Kassim & Kottegoda (1991) compared simple and disjunctive kriging in the estimation of optimum locations of rain-gauges as part of a network for the determination of storm characteristics to be used in forecasting and design. Papamichail & Metaxa (1996) applied ordinary and universal kriging (UK) to describe the spatial variability of monthly rainfall data, and they used the property of kriging variance for the optimal design of the rain-gauge network. A criterion using ordinary kriging variance is proposed by Cheng *et al.* (2008) to assess the accuracy of hourly and annual rainfall estimation using the acceptance probability defined as the probability that estimation error falls within a desired range. Chebbi *et al.* (2011) proposed a method for robust rain-gauge network optimization using intensity–duration–frequency data by minimizing the mean spatial kriging variance. Adhikary *et al.* (2015) have developed a methodology to design an optimal rain-gauge network through optimal positioning of additional stations as well as optimal relocating of existing redundant stations using the kriging-based geostatistical approach for daily rainfall in Australia. Pardo-Igùzquiza (1998) presented a method that defines the optimal network by combining kriging and simulated annealing for daily and annual rainfall. The methodology presented by Barca *et al.* (2008), which uses geostatistics and simulated annealing, has been applied to the design of an extension to a rainfall monitoring network. A seasonal aggregation of the available rainfall data was considered. Chebbi *et al.* (2013) combined the variance reduction method with simulated annealing to optimally extend a rain-gauge network in order to interpolate rainfall intensity and an erosivity index. A methodology for the design of a rain-gauge network was developed by Shaghaghian & Abedini (2013) using a combination of geostatistical tools and factor analysis for mean annual rainfall. Shafiei *et al.* (2014) evaluated the performance and augmentation of a rain-gauge network for annual time scale using a probabilistic approach combined with a geographic information system. Recently, in their study, Wu *et al.* (in press) intended to treat the radar grid-based data as the reference data set to propose a framework for evaluating the rain-gauge network. The evaluation framework is developed based on quantifying the fitness performance of the semi-variogram of rainfall characteristics.

In summary, there are numerous inter-related factors affecting rain-gauge network design as the objective of designing a network: the process considered, the attribute under consideration, the temporal scale, the spatial scale, the topographic setting, types of precipitation, the nature of the objective function used for and the optimization algorithm used for minimization or maximization purposes (Shaghaghian & Abedini 2013). The main objective of this article is to evaluate an existing rain-gauge network in a semi-arid region using the kriging variance reduction method for three time scales: monthly, seasonal, and interannual. The advantage of the kriging-based approach is that the variance of estimation is independent of the actual measurements; therefore, the network can be pre-designed. We first describe the topographic and rainfall of the study area and the selected data used, before presenting the methodology. Then, the results are given and discussed. The final section gives the conclusions drawn from this research.

## MATERIALS AND METHODS

### The case study and data

Located to the south of the Mediterranean Sea with an area of 164,150 km^{2}, Tunisia displays the best available meteorological observation network and data set of the southern Mediterranean area, in terms of density, history, coherence, and quality at the national level (Slimani *et al.* 2007; Baccour *et al.* 2012a, 2012b; Feki *et al.* 2012), allowing various meteorological applications and climatological assessments. Yet, this network remains far from ideal to capture all the spatial and temporal variables across a strong north–south, orography-driven and seasonally dependent organization. The Tunisian territory has a quite heterogeneous relief (Figure 1). Small plains where the Wadi Medjerda flows (major permanent river of North Africa) separate mountains in the north. They are progressively replaced by steppes, then by oriental coasts that stretch from the Cap Bon peninsula to the Libyan border. The coasts (1,300 km) are cut by profound gulfs and several islands. Towards the south, the mountainous chains are lower (Sahara). Climatologically, Tunisia is situated in the geographic transition zone between the humid temperate climate and the arid Saharan climate (Feki *et al.* 2012). Although the global climate is arid, there are three distinct climatic regions: (1) the north region is characterized by a Mediterranean climate with a humid and sub-humid atmosphere; (2) a semi-arid climate is found in the center and the east coast of Tunisia; and (3) the climate becomes Saharan in the south. Precipitation is characterized by an important north–south spatial distribution. In the north, the cumulative annual mean rainfall ranges from 400 mm, 1,000 mm, to 1,500 mm per year over the Kroumirie Mountains. The rainy season extends from September to April with a maximum rainfall of 234.9 mm during the winter months. In the center of Tunisia, mean rainfall ranges from 200 to 400 mm per year and is characterized by its variability from one year to another. South Tunisia is characterized by a Saharan climate where rainfall is erratic.

Monthly rainfall data for 140 rain-gauges in Tunisia (Figure 1) were collected from the General Direction of Water Resources in Tunisia of the Ministry of Agricultural (Direction Générale des Resources en Eau (DGRE)) and the National Institute of Meteorology (Institut National de la Météorologie (INM)). The analysis was performed on annual and seasonal precipitation totals as defined by climatological seasons: spring (March, April, May), summer (June, July, August), autumn (September, October, November), and winter (December, January, February). Then, monthly, seasonal, and interannual average rainfalls were calculated for each rain-gauge station with its own period of observation which varies from 30 to 100 years. Stationarity was confirmed for all rainfall series as they exceed 30 years (Feki 2010). All rainfall amounts were expressed as 1/10 mm and corresponding statistical analysis was carried out in Table 1.

S | O | N | D | J | F | M | A | MAY | J | JUL | AUG | AUT | WIN | SPR | SUM | Inter An | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Minimum | 25 | 84 | 40 | 55 | 70 | 58 | 110 | 54 | 31 | 6 | 0 | 0 | 50 | 68 | 78 | 2 | 51 |

Maximum | 685 | 1,498 | 2,011 | 2,625 | 2,436 | 1,987 | 1,597 | 1,389 | 767 | 316 | 127 | 233 | 1,398 | 2,349 | 1,251 | 218 | 1,288 |

Mean | 322 | 476 | 451 | 502 | 493 | 413 | 402 | 326 | 220 | 108 | 33 | 88 | 416 | 469 | 316 | 77 | 320 |

Std. Dev | 126 | 242 | 313 | 417 | 393 | 312 | 218 | 191 | 126 | 72 | 27 | 58 | 213 | 373 | 175 | 50 | 190 |

Kurtosis | 0.3 | 3.7 | 7.5 | 8.1 | 7.1 | 7.7 | 11.0 | 9.8 | 2.5 | −0.2 | 0.9 | −0.2 | 5.0 | 7.7 | 8.7 | −0.1 | 7.1 |

Skewness | −0.6 | 1.3 | 2.3 | 2.4 | 2.2 | 2.3 | 2.6 | 2.2 | 1.1 | 0.6 | 1.0 | 0.6 | 1.5 | 2.3 | 2.1 | 0.6 | 2.0 |

S | O | N | D | J | F | M | A | MAY | J | JUL | AUG | AUT | WIN | SPR | SUM | Inter An | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

Minimum | 25 | 84 | 40 | 55 | 70 | 58 | 110 | 54 | 31 | 6 | 0 | 0 | 50 | 68 | 78 | 2 | 51 |

Maximum | 685 | 1,498 | 2,011 | 2,625 | 2,436 | 1,987 | 1,597 | 1,389 | 767 | 316 | 127 | 233 | 1,398 | 2,349 | 1,251 | 218 | 1,288 |

Mean | 322 | 476 | 451 | 502 | 493 | 413 | 402 | 326 | 220 | 108 | 33 | 88 | 416 | 469 | 316 | 77 | 320 |

Std. Dev | 126 | 242 | 313 | 417 | 393 | 312 | 218 | 191 | 126 | 72 | 27 | 58 | 213 | 373 | 175 | 50 | 190 |

Kurtosis | 0.3 | 3.7 | 7.5 | 8.1 | 7.1 | 7.7 | 11.0 | 9.8 | 2.5 | −0.2 | 0.9 | −0.2 | 5.0 | 7.7 | 8.7 | −0.1 | 7.1 |

Skewness | −0.6 | 1.3 | 2.3 | 2.4 | 2.2 | 2.3 | 2.6 | 2.2 | 1.1 | 0.6 | 1.0 | 0.6 | 1.5 | 2.3 | 2.1 | 0.6 | 2.0 |

S, September; O, October; N, November; D, December; J, January; F, February; M, March; A, April; J, June; JUL, July; AUG, August; AUT, autumn; WIN, winter; SPR, spring; SUM, summer; Inter An, interannual; Std. Dev, standard deviation.

### Methodology

Kriging is a spatial interpolation method which predicts distribution of precipitation optimally (Abedini *et al.* 2013). One of the most interesting features of kriging, in general, is that the estimation variance can be calculated before actual measurements are available. This feature suggests its application for designing networks of monitoring stations (Carrera *et al.* 1984). The kriging variance depends on the structure and the geometric configuration of the data points. The further the point estimate is from its neighbors, the more the kriging variance increases. As the kriging weights and the semi-variogram between the estimated point and its neighbors are deducted from the theoretical model of the semi-variogram, the variance of the error depends on the distance between the points and their dispositions in space, regardless of values.

A rain-gauge network encompasses a number of gauges within an area of interest. Two major concerns for the evaluation and design of a network are the density and locations of these rain-gauges. The minimum density of a network depends on its functional objective and characteristics of dominant precipitation types. For water resources management (reservoir operation, water balance, etc.), estimates of long duration rainfall, such as mean monthly, seasonal, and annual rainfall, are needed (Chen *et al.* 2008; Shaghaghian & Abedini 2013). In evaluating and designing a rain-gauge network, the major concern is how to quantify the performance of an existing network. Intuitively, a measure of the network performance is the accuracy associated with rainfall estimates. The accuracy should reflect not only the absolute error, but also the error variance, since the errors depend heavily on rain field variability, and this variability is far from being consistent from one event to another and from one year to the next. If only unbiased estimators are considered, then the characteristics of estimation accuracy can thus be focused on the variance of estimation error (Cheng *et al.* 2008). The estimation accuracy of point rainfall varies within the study area and is dependent on the number and geometric layout of rain-gauges.

In this study, evaluation of the rainfall network passes through two major steps: (1) redundancy elimination and (2) network augmentation. To eliminate the redundant rain-gauges in the network, cluster analysis is used. Cluster analysis for hydrologic regionalization involves the grouping of observations or variables into clusters, with each cluster containing observations or variables with highly similar hydrologic characteristics, such as geographical, physical, statistical, or stochastic features (Soltani & Modarres 2006). Ward's method was used. It is an agglomerative clustering technique, where clusters are grouped together to include increasing numbers of objects in a stepwise fashion. At the lowest level of the clustering hierarchy, each cluster contains a single object. Larger clusters are formed by merging the two clusters that produce the smallest increase in the within-cluster variance. This merger step is repeated until a single cluster containing all objects is created. Ward's method has been found to produce satisfactory results for meteorological data in previous research (Alhamed *et al.* 2002). The progress and intermediate results of a cluster analysis are conventionally illustrated using the dendogram or ‘tree’ diagram, a bi-dimensional figure that represents the sequence and the distance at which the observations are clustered. In this case, the y axis indicates decreasing levels of similarity, and ‘branches’ on the dendogram indicate the level of similarity where objects are grouped into clusters.

The augmentation of the number of stations is based on geostatistical methods: UK (Papamichail & Metaxa 1996; Hengl *et al.* 2004; Brus & Heuvelink 2007) and kriging with external drift (KED) (Goovaerts 2000; Bourennane & King 2003; Feki *et al.* 2012) algorithms are compared to interpolate data from the monitoring network and to get rainfall maps and the kriging variance maps. According to Hengl *et al.* (2004), UK is a spatial prediction technique that jointly employs correlation with auxiliary maps and spatial correlation. Optimization of the sample pattern by minimizing the UK variance integrates optimization in feature and geographic space in one step. The method has no ambiguities and does not involve subjective choices (Brus & Heuvelink 2007). Many authors (Deutsch & Journel 1992; Hudson & Wackernagel 1994), however, agree that the term UK should be reserved for the case where the drift is modeled as a function of the coordinates only. The KED is more commonly used when the drift is defined ‘externally’ through some auxiliary variables (Chiles & Delfiner 1999; Feki *et al.* 2012). The advantage of KED is that the equations are solved at once. Note that, although the KED technique seems to be computationally more straightforward, it needs semi-variogram parameters of the global least square (GLS) regression residuals (which is often ignored), and therefore the GLS regression coefficients. Many researchers make different assumptions to escape the estimation of residuals' semi-variograms. For example, Marcotte (2012) assumes that for a small amplitude drift the semi-variogram of the main variable can be used directly at close range. Hudson & Wackernagel (1994) and Bourennane & King (2003) go further by directly assuming that the semi-variogram of the residuals and of the main variable are equal.

Our approach particularly uses the kriging standard deviation to identify points of high variance as potential points for monitoring. The drawback of this approach is that it only considers the spatial variation of the data without considering the accessibility to the region. In other words, if we consider an additional point *s _{i+1}* to a set of n points

*s*(

_{i}*i*

*=*

*1,…,n*) it is possible to construct the kriging system corresponding to the

*n*

*+*

*1*points

*s*, and deduce the new variance kriging related to this new experimental configuration. This method is called ‘method of fictitious point’.

_{1},…,s_{n},s_{n+1}*z*(

*s*), which corresponds to rainfall, represent realization of random variable

_{i}*Z*(

*s*) at particular points

_{i}*s*within a field

_{i}*S*. The intrinsic hypothesis assumes that for a random variable

*Z*(

*s*): (i) the expected value of

_{i}*Z*(

*s*) does not depend on the position

_{i}*s*, and (ii) the variance of does not depend on the position

_{i}*s*in

_{i}*S*for any separation vector

*h*. The semi-variogram function γ gives a measure of the spatial autocorrelation of a random variable

*Z*as a function of separation distance

*h*. A semi-variogram is defined as one half of the variance between any two locations separated by

*h*(Shafiei

*et al.*2014): where

*h*is the distance vector and

*s*is the location vector.

Equation (1) indicates that the semi-variogram is independent of spatial locations; it only depends on the distance between the two points under consideration in the isotropic case. An experimental semi-variogram is computed from data pairs of observations, for specific distance lags and directions. A mathematical authorized model may be fitted to the experimental semi-variogram, and the coefficients of this model are used for kriging. In semi-variogram models where the semi-variance reaches a finite variance and levels out, the maximum is referred to as the sill. This is termed a bounded model (spherical and exponential models). Models that do not reach a sill are termed unbounded (power models). Unbounded semi-variograms may indicate a large-scale trend (that is, systemic increase or decrease) in the values of the property characterized across the region of study (Kitanidis 1997).

*f*(

_{l}*l*= 0, 1, 2,…,

*L*) are deterministic functions of the geographic coordinates, are the kriging weights and is the semi-semi-variogram of

*Z(s*.

_{i})*et al.*2012) system is: where the minimal prediction variance: where and are the Lagrange parameters,

*f*(

_{l}*l*= 0, 1, 2,…,

*L*) are deterministic functions of the geographic coordinates, are the kriging weights, and is the semi-variogram of

*Z(s*.

_{i})## RESULTS AND DISCUSSION

### Redundancy elimination

The cluster analysis was performed based on Ward's method and using the interannual, seasonal, and monthly average rainfall (5 + 12 variables) of the selected rainfall stations. The rain-gauges' classification is carried out according to different scales. In fact, there are, initially, two big clusters which represent the north and the south of the country. Then, when the dissimilarity decreases, the number of clusters is more and more important. For this study, we have retained clusters corresponding to the third level of dissimilarity. This classification divided the rain-gauges set into 16 clusters with similar climatic and geographic characteristics. The matrix distance between the 140 rainfall stations is calculated. Then, the series of rain-gauges within 10 km distance of each other (Ferguson 1972) are compared and stations with more complete rainfall series are retained for each cluster. Finally, 101 stations were retained (Figure 2).

### Characterizing the spatial variability of rainfall

The 101 rain-gauges are used in the semi-variogram calculation using the VARIOWIN Software. Under the second order assumption, a semi-variogram approaches its sill. However, except for July, the remaining calculated semi-variograms are fitted visually by power type (Table 2). Therefore, the semi-variograms increase without a sill; it means that the random rainfall field is non-stationary due to a spatial trend of the expected rainfall values (Feki 2010). Arnaud & Emery (2000) mentioned that the representation of a variable by a semi-variogram is a subjective operation: there is no ‘real’ underlying model. In practice, it is necessary to ensure that the model meets the main characteristics of the experimental semi-variogram, i.e., behavior at the origin, to the infinity, etc. Moreover, the adjustment of the model is not done using only the experimental semi-variogram but must integrate all available information on the phenomenon. Therefore, using such automatic methods of fitting as least square method, cross-validation, etc. is not justified as much as the phenomenon must be well-known spatially.

Rainfall simple semi-variogram | ||||
---|---|---|---|---|

Model | C_{0} | α* | β** | |

September | Power | 0 | 0.07 | 1 |

October | Power | 3,478 | 0.26 | 1 |

November | Power | 10,780 | 0.5 | 0.98 |

December | Power | 34,200 | 0.68 | 1 |

January | Power | 17,600 | 0.8 | 0.99 |

February | Power | 18,430 | 0.48 | 0.98 |

March | Power | 9,870 | 0.3 | 0.96 |

April | Power | 6,290 | 0.2 | 0.98 |

May | Power | 1,120 | 0.07 | 1 |

June | Power | 312 | 0.03 | 0.98 |

July | Spherical | 73 | 730 | 210,548 |

August | Power | 594 | 0.018 | 0.98 |

Autumn | Power | 24,586 | 1.78 | 1 |

Winter | Power | 221,000 | 6.5 | 0.98 |

Spring | Power | 47,600 | 1.14 | 1 |

Summer | Power | 690 | 0.2 | 0.96 |

Interannual | Power | 572,000 | 22.9 | 1 |

Rainfall simple semi-variogram | ||||
---|---|---|---|---|

Model | C_{0} | α* | β** | |

September | Power | 0 | 0.07 | 1 |

October | Power | 3,478 | 0.26 | 1 |

November | Power | 10,780 | 0.5 | 0.98 |

December | Power | 34,200 | 0.68 | 1 |

January | Power | 17,600 | 0.8 | 0.99 |

February | Power | 18,430 | 0.48 | 0.98 |

March | Power | 9,870 | 0.3 | 0.96 |

April | Power | 6,290 | 0.2 | 0.98 |

May | Power | 1,120 | 0.07 | 1 |

June | Power | 312 | 0.03 | 0.98 |

July | Spherical | 73 | 730 | 210,548 |

August | Power | 594 | 0.018 | 0.98 |

Autumn | Power | 24,586 | 1.78 | 1 |

Winter | Power | 221,000 | 6.5 | 0.98 |

Spring | Power | 47,600 | 1.14 | 1 |

Summer | Power | 690 | 0.2 | 0.96 |

Interannual | Power | 572,000 | 22.9 | 1 |

C_{0} (1/10 mm^{2}) nugget effect; α* (1/10 mm^{2}) sill (for the spherical model) or slope (for the power model); β ** (1/10 m) range (for the spherical model) or power coefficient (for the power model).

### Spatial variation of interannual rainfall

Interannual rainfall semi-variograms are calculated for the 101 rain-gauges and fitted with power type model (Figure 3) which indicates the presence of a certain spatial trend. The model semi-variogram equation is: .

### Spatial variation of seasonal rainfall

Seasonal rainfall semi-variograms are fitted with a power type model (Figure 4), indicating the presence of a spatial drift.

### Spatial variation of monthly rainfall

Figure 5 shows the experimental and fitted sample semi-variograms of the monthly average rainfall for the four illustrative months: October, January, March, and June. The parameters of all the semi-variograms are given in Table 2. Except for the month of July, all the other months have a power type model of semi-variogram which highlights the presence of a drift showing a large-scale trend.

### Nugget effect

In theory, the semi-variogram value at the origin (0 lag distance) should be zero. If it is significantly different from zero for lags very close to zero, then this semi-variogram value is referred to as the nugget. The nugget represents variability at distances smaller than the typical sample spacing, including measurement error. In this study, the nugget effect values are important during the humid months (Figure 6) due to the great rainfall variability, with the largest value in December (34,200). Information changes rapidly in space during this season, which corresponds to the winter frontal disturbances coming from the north–west facing perpendicular channels of relief, and therefore large spatial rainfall variations (Slimani *et al.* 2007). Even if the network is relatively dense in the north, it turns out to be insufficient to account for certain micro-regionalization.

### Rainfall cartography

In this study, we have assigned a deterministic status for the internal and external drift, and we have admitted that the semi-variograms of the main variable (rainfall) and the residuals are closer (Hudson & Wackernagel 1994; Bourennane & King 2003; Marcotte 2012). Based on these results, the simple semi-variogram parameters of the main variable (rainfall) are used and introduced into the kriging algorithms. The UK and the KED are applied for the estimation of the monthly, seasonal, and interannual average rainfalls. The four months of October, January, March, and June are used for illustration.

In general, the two methods show the same rainfall spatial repartition with some localized differences (Figures 7–9). The most important precipitation amounts concern the north-west region for the autumn, winter, and spring seasons, and the interannual map with the highest amount during winter (January) due to the fact that the isohyets are closer in that region than in the center and the south. During summer, the most important precipitations concern the high altitude of the dorsal. Furthermore, for the two methods, the direction of the monthly average rainfall isohyets changes with regions and seasons; in fact, in autumn (October), the isohyets that have a north-east/south-west direction in the north change to a north-west/south-east direction in the center and south of the country with a low gradient.

During the winter season (January), there is a visible important gradient in the north-west of the country, but in the center and the south there is no spatial structure. In spring, the isohyets keep the same direction, namely, the north-east/south-west, but with a more important gradient in the north than in the center. Finally, the interannual rainfall isohyets' disposition is a combination of the seasonal ones; therefore, we distinguish an important north-west/south-east gradient in the north-west and a less important gradient all over the center and the south.

### Cross-validation

To assess the performance of each prediction method, cross-validation was used. This technique allows the comparison of the estimated and the actual values using the information available in our sample data set. The sample values are temporarily discarded from the sample data set; the value is then re-estimated using the remaining samples. The estimates are finally compared to the true values (Figure 10). There is a linear tendency between the actual and estimated values for all variables, and given by the two methods.

The two methods of interpolation show the same coefficient of determination between the estimated values and the actual values for autumn (Figure 11). From December to April, the KED method presents the largest value of the coefficient of correlation, while during summer the largest values are obtained by UK. In terms of the root mean square error (RMSE), the UK method provides the most accurate estimates for almost all months (Figure 12). The largest RMSE (and thus the most biased estimates) is for the winter (December and January).

### Distribution of estimation uncertainty

One of the advantages of the kriging algorithms is the fact that they calculate the associated estimation uncertainty. This estimation uncertainty, which is quantified by calculating the standard deviation maps, increases as the observation network density decreases. The general pattern of the uncertainty repartition obtained by UK and KED, for seasonal rainfall (Figure 13), interannual rainfall (Figure 14), and monthly average rainfall (Figure 15), is the same with a random aspect of isohyets, these latter becoming more structured to the south and the west of the country. Except in summer (or dry months), the minimum uncertainty estimation values are minimum at a direct neighborhood of points measure; they become more important in the regions with sparse networks, essentially the West and the South of the country. This contrast is more clear for the monthly time scale than the interannual and seasonal ones. The standard deviation of uncertainty estimation is largest in winter, compared with other seasons, because of the perturbations occurring during this season. In general, the KED method provides slightly weaker maximal values of standard deviation of uncertainty estimation than the UK for all seasons. Furthermore, the maps show more detailed contours, that is, in regions with sparse networks, incorporating the elevation in the rainfall interpolation gives more accurate estimations than geographical coordinate introduction.

### Localization of new rain-gauges

The locations of the additional rain-gauges can be optimized on the basis of the estimated variance. The simultaneous examination of the rainfall maps and standard deviation maps allowed the localization of new rain-gauges. Logically, regions where the standard deviation of the uncertainty estimation is high and where the rainfall gradient is important are the most concerned in increasing the density of stations. The comparison between the rainfall maps and the standard deviation uncertainty maps demonstrates that for the northern regions, the difference between the actual values and estimated values of rainfall is in the order of 10%, it increases to the center to about 25 to 30%, and it becomes more important to the south. The higher this percentage is, the worse is the estimation. The scenarios of increasing the monitoring network depend on the interpolation methods on one hand and the lag time on the other hand, but to ease the study a set of 17 fictitious points (Figure 16) were chosen as new stations in the region with a sparse network. Then, the information gain at any point of the study area was assessed in terms of reduction of the kriging variance: this type of approach is, therefore, an ‘analysis of variance reduction’ (Cheng *et al.* 2008).

The mapping of this indicator (Figure 17) shows that the maximum gains obtained by UK and KED are in the direct vicinity of the new stations. They are generally more important in the center and the south than in the north of the country. Using the KED interpolation method (Figure 17(b)) shows a small gain of accuracy in the extreme north of the country for the interannual rainfall. That seems be generated by winter and spring gain amounts (Figure 17(d)). The gains in accuracy reach 80% in the summer and autumn season in the south, and they vary between 0 and 10% in the north in winter and spring (Figure 17). The range of influence of the added stations on the gain of precision is also different from one region to another (Figure 18); it is largest in the south and the center, in contrast with the north where the influence is more localized. The reason is principally the position of the added station versus the initial network stations' positions. Finally, we notice that the two methods give slightly the same results, which means that the consideration of the altitude or the geographic coordinates in the kriging variance gave the same variance reduction.

The proposed approach to study the optimization of a network and decide about its capacity can be as follows:

Calculate a semi-variogram that includes all events or any time lag of the studied field.

Calculate and map the kriging standard deviation of this field.

Identify areas of strongest local maxima of kriging standard deviation.

Implement fictitious observation stations in these areas and re-calculate the kriging standard deviation.

Calculate and map the gain in precision.

Identify areas where the gain exceeds a given threshold, these areas are concerned by the implementation of new stations.

Seek valuable sites in these areas to install actual station measures, responding to other informative and practical criteria such as the absence of obstacles, easy access, and availability of means of communication.

## CONCLUSION

In this paper, we are interested in optimizing the design of the rain-gauge network in Tunisia based on reducing the variance of kriging. A first phase of statistic analysis, on mean monthly, seasonal, and interannual data, is followed by the computation of the experimental semi-variograms and the semi-variogram model fitting. The analysis of these semi-variogram models has clearly highlighted some peculiar characteristics. In fact, all the months except July, seasons and interannual time scales have a power type model of semi-variogram showing a large-scale trend. UK and KED are applied to develop rainfall contour maps using geographical coordinates as internal drift and altitude as external drift, respectively. These methods are compared and it is shown that the method of KED performs better than the UK method in terms of RMSE. The plots of kriging standard deviation show contours of increasing magnitude where station density is minimal. A test implementation of some new stations is conducted throughout the country and the gain assessed. We have concluded that in the northern part of the country the system interpolation is not sensitive when adding additional information. Therefore, the existing network satisfactorily describes rainfall spatial variability. Indeed, the maximum obtained gain in precision exceeds 10%. In the center, the system is averagely sensitive when adding new stations, since the gain in precision varies between 20 and 40%. In the south, the gain accuracy can reach 80%. Consequently, the network is insufficient to take into account the rainfall spatial variability, especially in autumn. A further planned improvement of the methodology consists of integrating the geographic coordinates and altitude as secondary variables altogether and by using other multivariate geostatistical methods such as regression kriging.

## ACKNOWLEDGEMENTS

The authors thank the INM and the DGRE for providing rainfall data.