This study proposes an evaluation framework to identify the optimal raingauge network in a watershed using grid-based quantitative precipitation estimation (QPE) with high spatial and temporal resolution. The proposed evaluation framework is based on comparison of the spatial and temporal variation in rainfall characteristics (i.e. rainfall depth and storm pattern) from the gauged data compared with those from QPE. The proposed framework first utilizes cluster analysis to separate raingauges into various clusters based on the locations and rainfall characteristics. Then, a cross-validation algorithm is used to identify the influential raingauge in each cluster based on evaluating performance of fitting weighted spatiotemporal semivariograms of rainfall characteristics from the gauged rainfall to the QPE data. Thus, the influential raingauges for a specific cluster number form the representative network. The optimal raingauge network is the one corresponding to the best fitness performance among the representative networks considered. The study area and data set are the hourly rainfall from 26 raingauges and 1,336 QPE grids for 10 typhoons in the Wu River watershed located in central Taiwan. The proposed evaluation framework suggests that a 10-gauge network is the optimal and can describe a good spatial and temporal variation in the rain field similar to the grid-based QPE from two additional typhoon events.

## INTRODUCTION

Rainfall data are essential in many hydrological analyses and hydraulic engineering designs, such as frequency analysis, rainfall-runoff analysis, and stormwater drainage design. For example, the water level used in designing a hydraulic structure, such as a levee, is estimated by using a rainfall-runoff model and flood wave propagation model with the design areal average rainfall hyetograph estimated from the raingauge network (Wu *et al.* 2012). Accurate estimation in the spatial distribution of rainfall requires a dense network, which entails large installation and operational costs, but reduces the opportunity of project failure (AI-Zahrani & Husain 1998; Putthividhya & Tanaka 2013; Adhikary *et al.* 2015a, 2015b). The World Meteorological Organization (WMO 1994) issued a guideline for recommended density of raingauges in a catchment based on the physiographic unit and area of the watershed. For example, 250 km^{2} per gauge is suggested for a small mountainous region with irregular rainfall, and for the flat region of a temperate zone. In addition, a modern raingauge network can provide real-time estimation and the rainfall forecast resulting from typhoons for the early flooding warning operation (e.g. Pandey *et al.* 1999; Tsintikidis *et al.* 2002; Chen *et al.* 2008). The areal average rainfall hyetographs are commonly designed using the gauged rainfall data, so they should be affected by the uncertainties in the measurement. These uncertainties might be attributed to climatic and geometric factors, such as the local topography, size of the study area, type of rainfall, large-scale atmospheric motions and forcing (e.g. Harris *et al.* 1996; Pandey *et al.* 1999; Tsintikidis *et al.* 2002). These uncertainties may also result from the lack of gauges and poor density of the raingauge network. Thus, evaluating the reliability and applicability of an existing raingauge network in a watershed is an important issue in modern hydrological analysis and water resources planning.

Many investigations have been proposed for evaluating raingauge networks in various watersheds, and the relevant resulting concepts can be grouped into two types: (1) geostatistics with semivariogram (e.g. Kassim & Kottegoda 1991; Papamichail & Metaxa 1996; Pardo-Iguzquiza 1998; Tsintikidis *et al.* 2002; Cheng *et al.* 2008; Chebbi *et al.* 2013; Putthividhya & Tanaka 2013; Shafiei *et al.* 2014) and (2) the information entropy method (e.g. Krstanovic & Singh 1992; Al-Zahrani & Husain 1998; Chen *et al.* 2008; Vivekanandam & Jagtap 2012). Geostatistics is a branch of statistics focusing on spatial or spatiotemporal data sets and has been widely applied in hydrological research. The Kriging method is a group of geostatistical techniques used to interpolate the value of a random field at an unobserved location from observations of its value at nearby locations. In the geostatistics method, each gauge is sequentially treated as an unobserved location, and its rainfall depth is estimated by using the Kriging equation. By comparing the estimations with true observed values, the Kriging error can be calculated. Recently, a number of advanced methods have been developed for quantifying the variogram, such as genetic programming (e.g. Adhikary *et al.* 2015a, 2015b) and the artificial neural network (e.g. Teegavarapu 2007). Eventually, the raingauge network associated with relatively small errors is identified as an optimal network. With respect to the information entropy method, Shannon's entropy proposed a measure of information concept, which depends on the current level of knowledge or uncertainty and the information entropy for the rainfall at each raingauge in the catchment (Shannon 1948). This entropy is calculated after all raingauges are reconstructed, and the one with the maximum entropy is selected as an important gauge. After that, the optimal raingauge network is composed of all important gauges. A number of investigations combines the above two methods to determine the spatial distribution of raingauges in a catchment (e.g. Chen *et al.* 2008; Awadallah 2012).

In addition to the above two methods, Cislerova & Hutchinson (1974) proposed a redesigning method to identify the raingauge network based on the statistical characteristics of random fields. The statistical characteristics include the covariance and correlation function with the criterion of maximum admissible interpolation error. Dymond (1982) established a simple expression equation for the mean square error in the rain field, i.e. the correlation between neighboring raingauges and the number of gauges, to carry out network reduction. Basalirwa *et al.* (1993) utilized principal component analysis to delineate Uganda into homogeneous rainfall sub-basins. Then, the representative raingauge in each sub-basin could be determined based on the communality index which measures the degree of association with other gauges in the data set. Yoo (2006) combined the sampling error of rainfall calculated from the gauged data, microwave attenuation measurements and satellite measurement; and then minimized combinations of the sampling error to determine the optimal raingauge network. Volkmann *et al.* (2010) restructured the raingauge network composed of the gauges and radar-based grids to optimize the network by comparing the resulting flash flood through the rainfall-runoff model. Jung *et al.* (2014) developed an evaluation method integrated with the average inter-gauge correlation coefficients. This method can find the optimal coverage of rainfall estimation in accordance with the estimated effective radius. By taking into account the estimation accuracy of point rainfall at ungauged sites, Shafiei *et al.* (2014) assessed the performance and augmentation of a raingauge network by integrating the geographic information system framework with the acceptance probability concept.

The aforementioned methods mostly focus on the assessment of the total rainfall amount measured at the raingauges during specific durations, such as annual rainfall depth (e.g. Cislerova & Hutchinson 1974), monthly rainfall depth (e.g. Papamichail & Metaxa 1996), or cumulative hourly rainfall depth (e.g. Garcia *et al.* 2008). In addition to the rainfall amount, Chebbi *et al.* (2013) utilized the parameters of rainfall intensity-duration-frequency formula derived using the rainfall depth of various durations to determine an appropriate raingauge network. However, the accurate and reliable spatiotemporal estimations of rainfall are essential for successfully predicting the catchment responses, such as the flash flood (e.g. Volkmann *et al.* 2010; Wu *et al.* 2015a, 2015b). Thus, the spatial and temporal distributions of rainfall during a rainstorm event are important for rainfall-runoff modeling, river routing, and inundation simulation due to their effect to the accuracy and reliability of input data. In summary, most investigations discussed the variation in the rainfall amount for a specific duration at all raingauges in the watershed, such as the annual rainfall amount, for the water resource analysis. This rainfall amount obviously varies with the location of the raingauge, so that it can be defined as a temporal variable. However, there is another important issue in hydrology analysis, i.e. flood warning during typhoon events. In general, flood warning could be achieved using rainfall-runoff modeling with the time series of rainfall forecasts (i.e. hyetograph) at all gauges in the watershed. A rainstorm event can be classified into three rainfall characteristics, i.e. rainfall duration, depth and storm pattern (distribution of rainfall in time) (see Figure 1) (Wu *et al.* 2006). Thus, the storm pattern changes not only with the location of the raingauge, but also with time. Accordingly, the storm pattern should be regarded as a spatial and temporal variable. As a result, the spatial and temporal variation in rainfall should be simultaneously considered for the evaluation of a raingauge network in a watershed.

Recently, remotely sensed rainfall products, such as radar rainfall estimations, have been widely used to provide information on the spatiotemporal structure of rainfall because of their large areal coverage and high resolution (Mandapaka *et al.* 2010). Radar rainfall's high spatial and temporal resolution and large areal coverage fares better when compared with traditional gauged measurements (Sharif *et al.* 2002). Therefore, radar rainfall estimations are widely used in water resource analysis, flood forecasting and warning over sparsely gauged catchments (e.g. AghaKouchak *et al.* 2010; Zhu *et al.* 2014). The accuracy of radar-based rainfall might be influenced by various sources of errors, such as an error in the measurement of the rainfall reflectivity (e.g. Zawadzki 1984; Austin 1987; Piccolo & Chirico 2005; AghaKouchak *et al.* 2010; Abdella & Alfredsen 2010; Wu *et al.* 2015a, 2015b). Nevertheless, its high resolution in time and space can be used for discussing and analyzing the varying rainfall trends in time and space and for simulating the surface runoff distribution resulting from heavy rainfall events (e.g. Brommundt & Bardossy 2007; Atencia *et al.* 2011). Several investigations have taken the radar or satellite grid-based rainfall into account in order to assess the raingauge network (e.g. Yoo 2006; Volkmann *et al.* 2010), but they have only regarded the grid-based rainfall as additional raingauges in order to identify an optimal network from the combination of gauges and grids.

Since the radar grid-based data have high spatial and temporal resolution, they should well describe the change in rainfall in time and space represented in terms of the spatiotemporal semivariogram. Therefore, this study intends to treat the radar grid-based data as the reference data set to propose a framework for evaluating the raingauge network. The proposed evaluation framework is developed based on quantifying the fitness performance of the spatiotemporal semivariogram of rainfall characteristics (rainfall depth and storm pattern) calculated from grid-based and gauged rainfall data, respectively. In detail, the evaluation would be carried out by assessing the degree of fitting the spatiotemporal semivariograms from the gauged data to that from the radar grid-based data. It is expected that the resulting fitness performance can provide helpful information on existing raingauges in the watershed so as to identify an optimal raingauge network. The optimal network can describe the variability of rainfall in time and space in a similar manner to the grid-based radar rainfall data.

## METHODOLOGY

### Basic concept

In theory, rainfall data are regarded as the spatial and temporal variables, so this study attempts to apply the geostatistics theory to evaluate the quality of the existing raingauge network. The evaluation is carried out by comparing the variation in gauged and radar rainfall in time and space. In general, the spatial and temporal variation of data can be quantified in terms of the spatiotemporal semivariogram. Therefore, this study utilizes the gauged and radar rainfall data, respectively, to calibrate parameters of the theoretical semivariogram models in order to estimate the spatiotemporal semivariogram. The resulting spatiotemporal semivariograms from the gauged data are compared based on its fitness performance to those from radar rainfall.

Although the number of raingauges in a watershed can be decided in advance according to the WMO's guideline (1994), it is difficult to identify which raingauges can contribute to the appropriate network. Therefore, this study first applies cluster analysis to classify the raingauges into different groups for a specific cluster number according to the gauge locations and rainfall characteristics (i.e. rainfall depth and storm pattern). Cluster analysis groups a set of objects in such a way that the objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other clusters. It is widely used in many hydrology applications in which hydrological variables are classified into a number of groups by the cluster analysis according to their characteristics (e.g. Wu *et al.* 2006; Kahya *et al.* 2008; Sawicz *et al.* 2011). After that, in order to find the influential raingauge in each cluster which significantly impacts the variability of rainfall in time and space, this study employs the cross-validation algorithm to quantify the contribution of variation in the rainfall at a single raingauge to the entire rainfall area.

In detail, all raingauges in the watershed are classified into several groups using cluster analysis. In each cluster, a raingauge is randomly selected as the validation gauge. The remaining raingauges serve as the calibration gauges in which the associated gauged rainfall data are used in the calibration of the theoretical semivariogram models. Subsequently, the iterations of extracting raingauges as the validation gauges are performed and the corresponding fitness performance indices of the semivariograms for the rainfall characteristics are calculated, excluding the rainfall data from the validation gauge. If an excluded raingauge for a specific cluster results in the worst fitness performance, this gauge can be defined as the influential one. In other words, the varying trend of rainfall in time and space from the gauge network without the influential gauge might obviously differ from those calculated using the radar rainfall. Accordingly, the influential raingauges in all clusters for a specific number of clusters can form the representative raingauge network.

The cluster analysis and cross-validation algorithm should be carried out for various numbers of clusters to find the corresponding representative raingauge network. Then, by recalculating the fitness performance with the rainfall data from the representative raingauge networks considered, the resulting optimal network should be the representative raingauge network in association with the best fitness performance. The graphic illustration of the proposed evaluation framework is shown in Figure 2 and the relevant theories and methods used are introduced in the following.

### Spatiotemporal semivariogram model

#### Theoretical semivariogram model

*et al.*2005). The spatiotemporal semivariogram is expressed as: where

*h*and

_{s}*h*represent the distance and time lag, respectively, and

_{t}*Z*(

*x,t*) denotes the spatiotemporal variable at time

*t*and position

*x*. Table 1 shows some well-known theoretical semivariogram models. These spatiotemporal semivariogram models have been published to describe the behavior of spatial and temporal semivariograms, such as the product of semivariograms (Rodriguez-Iturbe & Mejia 1974), the integrated product of semivariograms (Dimitrakopoulos & Luo 1993), and the product-sum model (De Cesare

*et al.*2001, 2002). Among the above spatiotemporal semivariogram models, the product-sum model can provide a large class of flexible models and require less constraint symmetry between the spatial and temporal correlation components without an arbitrary space-time metric (Gneiting

*et al.*2005). This study therefore adopts the product-sum method to calculate the spatio-temporal semivariogram of

*Z*(

*x,t*). The concept of the spatiotemporal semivariogram model is briefly described below.

Model | γ(h) | Range of h |
---|---|---|

1. Spherical model | 0 ≦ h ≦ a | |

c | h > a | |

2. Exponential model | h ≧ 0 | |

3. Gaussian model | h ≧ 0 | |

4. Power model | ch^{a} | h ≧ 0; 0 < a ≦ 2 |

5. Nugget model | 0 | h=0 |

c | h ≧ 0 | |

6. Linear model | ch | h ≧ 0 |

7. Linear-with-sill model | 0 ≦ h ≦ a | |

c | h > a | |

8. Circular model | 0 ≦ h ≦ a | |

9. Pentaspherical model | 0 ≦ h ≦ a | |

c | h > a | |

10. Logarithmic model | 0 | h=0 |

h >0 | ||

11. Periodic model | h ≧ 0 |

Model | γ(h) | Range of h |
---|---|---|

1. Spherical model | 0 ≦ h ≦ a | |

c | h > a | |

2. Exponential model | h ≧ 0 | |

3. Gaussian model | h ≧ 0 | |

4. Power model | ch^{a} | h ≧ 0; 0 < a ≦ 2 |

5. Nugget model | 0 | h=0 |

c | h ≧ 0 | |

6. Linear model | ch | h ≧ 0 |

7. Linear-with-sill model | 0 ≦ h ≦ a | |

c | h > a | |

8. Circular model | 0 ≦ h ≦ a | |

9. Pentaspherical model | 0 ≦ h ≦ a | |

c | h > a | |

10. Logarithmic model | 0 | h=0 |

h >0 | ||

11. Periodic model | h ≧ 0 |

Note: *c* and *a* denote the sill and influence ranges and *h* denotes distance (Davis 1973).

*C*and

_{s}*C*are the spatial and temporal covariances, respectively, and

_{t}*C*(0) and

_{s}*C*(0) stand for sills, which are defined as the limit values for the semivariograms; they are generally used as the parameters of theoretical semivariogram models.

_{t}*k*

_{1},

*k*

_{2}and

*k*

_{3}are the coefficients and they can be computed using the following equation (De Cesare

*et al.*2001, 2002): where

*C*(0,0) denotes the sill of the spatio-temporal semivariogram and, generally, is adopted as the maximum of

_{st}*C*(0) and

_{s}*C*(0). In addition, this study implements parameter calibration by means of the modified genetic algorithm method based on the sensitivity to the parameters (Wu

_{t}*et al.*2012) with an objective function

*F*as: where

_{obj}*N*is the number of distance ranges;

_{ds}*N*(

_{p}*h*) is the number of pairs within the lag

_{i}*h*for the

*i*th distance range; denotes the estimated semivariogram by the

*m*th theoretical semivariogram model, and is the experimental semivariogram calculated using measured spatial data.

#### Weighted semivariogram model

*et al.*(2011a, 2011b). The weighted semivariogram model combines results from several selected theoretical semivariogram models to provide the weighted average of semivariogram by using the following equation: in which

*M*is the number of theoretical semivariogram models of interest;

_{sv}*w*is the weighted factor; denotes the semivariogram estimated by the

_{sv}*m*th theoretical model; and denotes the objective function for the

*m*th theoretical model associated with the optimal model parameters. In view of Equation (4), the objective function value

*F*decreases with the error . Accordingly, the lower objective function value means that the estimated semivariogram fits better to the experimental one . Namely, the fitness of the estimated semivariogram to the experimental ones increases. Therefore, this study defines the weighted factor as a function of the inverse of the objective function value

_{obj}*F*. Note that the sum of

_{obj}*w*should be equal to one. Figure 3 shows a graphical illustration of theoretical semivariogram models and the weighted semivariogram model.

_{sv}### Fitness performance indices

*RMSE*) to quantify the fitness degree of spatiotemporal semivariograms of rainfall characteristics as follows: where and denote the weighted spatiotemporal semivariograms of rainfall characteristics calculated from the radar and gauged rainfall and

*N*is the number of separation distances defining the semivariogram diagrams. Note that a small

_{SV}*RMSE*indicates that an adopted semivariogram calculated using the gauged rainfall data is closer to those using the radar rainfall data. Since this study focuses on the spatial and temporal variation in the rainfall characteristics (i.e. rainfall depth and storm pattern), the performance index RMSE can be regarded as RMSE

_{rd}and RMSE

_{sp}, respectively, for the rainfall depth and storm pattern.

*k*is the number of parameters in the model, and

_{par}*L*stands for the value of the likelihood function. Since the likelihood function is difficulty derived for a complicated model, Equation (7) can be rewritten using the difference between the model outputs and observations as: where

_{f}*RSS*is the residual sums of the square errors of the model's outputs and

*n*is the number of observations. In Equation (8), the AIC has a clear interpretation in model fitting. That is, the first term on the right hand side is a measure of lack-of-fit of the model of interest, whereas the second term measures the increased unreliability of the chosen model attributed to the increased number of model parameters. According to the model concept mentioned above under the section ‘Basic concept’, the influential raingauge results in the worst fitness performance of semivariograms for rainfall characteristics. Therefore, the influential raingauge is the one that achieves the maximum AIC value in a cluster: in which

_{data}*n*

_{gauge}represents the number of gauges. Since two AIC values ( and ) are considered for determining the influential raingauge, this study uses a weighted objective function (Madsen 2000) to calculate the weighted average

*AIC*as: The cross-validation algorithm with

_{wavg}*AIC*is carried out to select the raingauges which significantly affect the temporal and spatial variation in rainfall under a specific number of clusters. In detail, the influential raingauge in a cluster can be selected in association with the maximum . Then all influential raingauges form the representative raingauge network under a specific number of clusters.

_{wavg}After that, the *AIC _{wavg}* values are recalculated using the gauged rainfall data from the representative raingauge networks under various numbers of clusters. Since the best-fit model is associated with the minimum AIC value (e.g. Mutua 1994), the resulting optimal raingauge network should be the one which achieves the minimum

*AIC*value among the representative networks considered.

_{wavg}### Evaluation framework

In summary, this study attempts to incorporate the cross-validation algorithm with the weighted semivariogram model to establish the semivariogram diagram of rainfall depth and storm pattern using the radar measurement and gauged rainfall data, respectively. Evaluation of the raingauge network for finding the influential raingauges can be accomplished by comparing the fitness performance index (RMSE) and the AIC for the spatiotemporal semivariograms of rainfall characteristics. Therefore, based on the above model concept and methods adopted in this study, the proposed evaluation framework can be summarized as follows:

**Step [1]**: Collect the hourly gauged and radar rainfall data from all gauges and extract the rainfall depth and storm pattern (i.e. rainfall characteristics) (see Figure 1).**Step [2]**: Classify the raingauges into several clusters through cluster analysis with the locations of the gauges and rainfall depth and storm pattern for a specific number of clusters.**Step [3]**: Use Equation (5) to estimate weighted theoretical spatiotemporal semivariograms of the rainfall characteristics from the radar rainfall as the reference base.**Step [4]**: Follow the cross-validation algorithm to select a raingauges in each cluster as the validation gauge, and the remaining gauges are calibration gauges.**Step [5]**: Calculate the weighted theoretical semivariogram using the observed rainfall characteristics from the calibration gauges. Then, use Equation (6) to calculate the corresponding fitness performance index (root mean square error) RMSE in regard to those from Step [3] (i.e. RMSE_{rd}and RMSE_{sp}). Accordingly, the corresponding AIC values are computed as and using Equation (9).**Step [6]**: Calculate the weighted average of and (i.e.*AIC*) using Equation (10) to determine the influential gauge in a cluster in association with the maximum_{wavg}*AIC*. Finally, all influential gauges are composed of the representative raingauge network for a specific number of clusters._{wavg}**Step [7]**: Repeat to Steps [2]–[6] for other numbers of clusters, the representative raingauge network for various cluster numbers could be obtained.**Step [8]:**Recalculate the*AIC*values using gauged rainfall data in various representative networks. By examining the_{wavg}*AIC*, the optimal raingauge network should construct the representative one associated with the minimum_{wavg}*AIC*._{wavg}

## RESULTS AND DISCUSSION

### Study area and data set

The study area, the Wu River watershed, is located in central Taiwan. The basin encompasses an area of about 2025.6 km^{2}, and the river is about 119.1 km long. Without any hydraulic storage structures, such as reservoirs and dams, the runoff of the Wu River intrinsically reflects the watershed's response to the rainfall characteristics. Within the Wu River watershed, there are 26 automatic recording raingauges (see Figure 4 and Table 2). In Taiwan, the QPESUMS (Quantitative Rainfall Estimation Using Multiple Sensors system) became operational in late 2001 and incorporates data from multiple sensors, such as raingauges, multiple radars, satellites, numerical models, and lightning detectors with the goal of making reasonable quantitative precipitation estimations (QPE) and quantitative forecast, applicable to flash flood and debris flow warnings (Jou *et al.* 2006). Although the QPE data are produced based on the radar data from the Weather Bureau in Taiwan, they have been corrected based on the gauged rainfall data (Wu *et al.* 2015a, 2015b). Thus, using the gauged data, the QPE data can be effectively corrected, and this results in fewer uncertainties in the QPE (e.g. Abdella & Alfredsen 2010). Therefore, in this study, the QPE data are treated as the grid-based rainfall data used in the model development. The QPESUMS provides 1,336 grids of QPE rainfall data in the Wu River watershed.

Location | |||
---|---|---|---|

No of gauge | Gauge | TM_X (m) | TM_Y(m) |

RG1 | Qing-Liu (1) | 244435.5 | 2662715 |

RG2 | Cui-Luan (1) | 269444.9 | 2675033 |

RG3 | Liu-Fen-Liao | 212393.2 | 2647843 |

RG4 | Cao-Tun | 216645.6 | 2652601 |

RG5 | Bei-Shan (1) | 238411.7 | 2653644 |

RG6 | Tou-Bain-Keng | 231025.8 | 2668203 |

RG7 | Hui-Sun | 253671.2 | 2665883 |

RG8 | Ri-Yue-Tan | 240635.6 | 2641873 |

RG9 | Yu-Chi | 242744.5 | 2644064 |

RG10 | Zhang-Hu | 234457.1 | 2644525 |

RG11 | Tai-Chung | 217891.9 | 2671200 |

RG12 | Qing-Liu | 246222.9 | 2663997 |

RG13 | Da-Du-Cheng | 245150.6 | 2651749 |

RG14 | Ling-Xiao | 251510.3 | 2656920 |

RG15 | Pu-Li | 249108.6 | 2650829 |

RG16 | Cui-Luan | 273035.8 | 2676373 |

RG17 | Feng-Shu-Lin | 257758.4 | 2653566 |

RG18 | Ren-Ai | 263320.5 | 2657113 |

RG19 | Kun-Yang | 277805.5 | 2668485 |

RG20 | Rui-Yan | 268616.3 | 2668748 |

RG21 | Cui-Feng | 270824.1 | 2667111 |

RG22 | Bei-Shan | 237235.1 | 2653415 |

RG23 | Shui-Chang-Liu | 235829 | 2661878 |

RG24 | Chang-Fu | 237785.6 | 2666339 |

RG25 | Wai-Da-Ping | 241638.9 | 2650466 |

RG26 | Pu-Zhong | 2635217 | 321.6667 |

Location | |||
---|---|---|---|

No of gauge | Gauge | TM_X (m) | TM_Y(m) |

RG1 | Qing-Liu (1) | 244435.5 | 2662715 |

RG2 | Cui-Luan (1) | 269444.9 | 2675033 |

RG3 | Liu-Fen-Liao | 212393.2 | 2647843 |

RG4 | Cao-Tun | 216645.6 | 2652601 |

RG5 | Bei-Shan (1) | 238411.7 | 2653644 |

RG6 | Tou-Bain-Keng | 231025.8 | 2668203 |

RG7 | Hui-Sun | 253671.2 | 2665883 |

RG8 | Ri-Yue-Tan | 240635.6 | 2641873 |

RG9 | Yu-Chi | 242744.5 | 2644064 |

RG10 | Zhang-Hu | 234457.1 | 2644525 |

RG11 | Tai-Chung | 217891.9 | 2671200 |

RG12 | Qing-Liu | 246222.9 | 2663997 |

RG13 | Da-Du-Cheng | 245150.6 | 2651749 |

RG14 | Ling-Xiao | 251510.3 | 2656920 |

RG15 | Pu-Li | 249108.6 | 2650829 |

RG16 | Cui-Luan | 273035.8 | 2676373 |

RG17 | Feng-Shu-Lin | 257758.4 | 2653566 |

RG18 | Ren-Ai | 263320.5 | 2657113 |

RG19 | Kun-Yang | 277805.5 | 2668485 |

RG20 | Rui-Yan | 268616.3 | 2668748 |

RG21 | Cui-Feng | 270824.1 | 2667111 |

RG22 | Bei-Shan | 237235.1 | 2653415 |

RG23 | Shui-Chang-Liu | 235829 | 2661878 |

RG24 | Chang-Fu | 237785.6 | 2666339 |

RG25 | Wai-Da-Ping | 241638.9 | 2650466 |

RG26 | Pu-Zhong | 2635217 | 321.6667 |

To consider the effect of the strength of data on the rainfall distribution in time and space, the gauged rainfall from 26 raingauges and QPE from 1,336 QPESUMS grids among 10 typhoon events (see Table 3) in the Wu River watershed are selected as the study data. Since the proposed evaluation framework to identify the representative raingauge networks first classifies the raingauges into various groups by means of cluster analysis, this study classifies 26 gauges into clusters of 5, 8, 10, 13, 15, 18, and 20, of which 10 clusters are suggested in WMO guidelines. Table 4 represents the results from the classification of existing raingauges carried out by cluster analysis.

Occurrence period | ||||
---|---|---|---|---|

No. of event | Typhoon | Starting time | Ending time | Duration (hr) |

EV1 | HAITANG | 2005/7/17 13:00:00 | 2005/7/20 19:00:00 | 79 |

EV2 | TALIM | 2005/8/31 10:00:00 | 2005/9/2 00:00:00 | 39 |

EV3 | BILIS | 2006/7/13 01:00:00 | 2006/7/17 00:00:00 | 96 |

EV4 | SEPAT | 2007/8/17 14:00:00 | 2007/8/20 00:00:00 | 59 |

EV5 | KROSA | 2007/10/6 01:00:00 | 2007/10/7 19:00:00 | 43 |

EV6 | KALMAEGI | 2008/7/17 11:00:00 | 2008/7/20 00:00:00 | 62 |

EV7 | FUNG-WONG | 2008/7/28 03:00:00 | 2008/7/29 17:00:00 | 39 |

EV8 | SINLAKU | 2008/9/12 14:00:00 | 2008/9/15 20:00:00 | 79 |

EV9 | JANGMI | 2008/9/28 05:00:00 | 2008/9/30 03:00:00 | 47 |

EV10 | MORAKOT | 2009/8/6 03:00:00 | 2009/8/11 01:00:00 | 119 |

Occurrence period | ||||
---|---|---|---|---|

No. of event | Typhoon | Starting time | Ending time | Duration (hr) |

EV1 | HAITANG | 2005/7/17 13:00:00 | 2005/7/20 19:00:00 | 79 |

EV2 | TALIM | 2005/8/31 10:00:00 | 2005/9/2 00:00:00 | 39 |

EV3 | BILIS | 2006/7/13 01:00:00 | 2006/7/17 00:00:00 | 96 |

EV4 | SEPAT | 2007/8/17 14:00:00 | 2007/8/20 00:00:00 | 59 |

EV5 | KROSA | 2007/10/6 01:00:00 | 2007/10/7 19:00:00 | 43 |

EV6 | KALMAEGI | 2008/7/17 11:00:00 | 2008/7/20 00:00:00 | 62 |

EV7 | FUNG-WONG | 2008/7/28 03:00:00 | 2008/7/29 17:00:00 | 39 |

EV8 | SINLAKU | 2008/9/12 14:00:00 | 2008/9/15 20:00:00 | 79 |

EV9 | JANGMI | 2008/9/28 05:00:00 | 2008/9/30 03:00:00 | 47 |

EV10 | MORAKOT | 2009/8/6 03:00:00 | 2009/8/11 01:00:00 | 119 |

Number of clusters | |||||||
---|---|---|---|---|---|---|---|

No. of gauge | 5 | 8 | 10 | 13 | 15 | 18 | 20 |

RG1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

RG2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |

RG3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |

RG4 | 3 | 4 | 4 | 4 | 4 | 4 | 4 |

RG5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |

RG6 | 4 | 6 | 6 | 6 | 6 | 6 | 6 |

RG7 | 1 | 1 | 1 | 12 | 7 | 7 | 7 |

RG8 | 5 | 8 | 8 | 8 | 8 | 8 | 8 |

RG9 | 5 | 8 | 8 | 9 | 9 | 9 | 9 |

RG10 | 5 | 8 | 10 | 10 | 10 | 10 | 10 |

RG11 | 4 | 6 | 6 | 11 | 11 | 11 | 11 |

RG12 | 1 | 1 | 1 | 1 | 12 | 12 | 12 |

RG13 | 5 | 5 | 9 | 13 | 13 | 13 | 13 |

RG14 | 1 | 1 | 1 | 13 | 14 | 14 | 14 |

RG15 | 5 | 5 | 9 | 13 | 15 | 15 | 15 |

RG16 | 2 | 2 | 2 | 2 | 2 | 16 | 16 |

RG17 | 1 | 7 | 7 | 7 | 14 | 17 | 17 |

RG18 | 2 | 7 | 7 | 7 | 14 | 18 | 18 |

RG19 | 2 | 2 | 2 | 2 | 2 | 16 | 19 |

RG20 | 2 | 2 | 2 | 2 | 2 | 2 | 20 |

RG21 | 2 | 2 | 2 | 2 | 2 | 2 | 20 |

RG22 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |

RG23 | 5 | 6 | 6 | 6 | 6 | 6 | 6 |

RG24 | 1 | 6 | 6 | 6 | 6 | 6 | 6 |

RG25 | 5 | 5 | 5 | 5 | 13 | 13 | 13 |

RG26 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |

Number of clusters | |||||||
---|---|---|---|---|---|---|---|

No. of gauge | 5 | 8 | 10 | 13 | 15 | 18 | 20 |

RG1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

RG2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |

RG3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |

RG4 | 3 | 4 | 4 | 4 | 4 | 4 | 4 |

RG5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |

RG6 | 4 | 6 | 6 | 6 | 6 | 6 | 6 |

RG7 | 1 | 1 | 1 | 12 | 7 | 7 | 7 |

RG8 | 5 | 8 | 8 | 8 | 8 | 8 | 8 |

RG9 | 5 | 8 | 8 | 9 | 9 | 9 | 9 |

RG10 | 5 | 8 | 10 | 10 | 10 | 10 | 10 |

RG11 | 4 | 6 | 6 | 11 | 11 | 11 | 11 |

RG12 | 1 | 1 | 1 | 1 | 12 | 12 | 12 |

RG13 | 5 | 5 | 9 | 13 | 13 | 13 | 13 |

RG14 | 1 | 1 | 1 | 13 | 14 | 14 | 14 |

RG15 | 5 | 5 | 9 | 13 | 15 | 15 | 15 |

RG16 | 2 | 2 | 2 | 2 | 2 | 16 | 16 |

RG17 | 1 | 7 | 7 | 7 | 14 | 17 | 17 |

RG18 | 2 | 7 | 7 | 7 | 14 | 18 | 18 |

RG19 | 2 | 2 | 2 | 2 | 2 | 16 | 19 |

RG20 | 2 | 2 | 2 | 2 | 2 | 2 | 20 |

RG21 | 2 | 2 | 2 | 2 | 2 | 2 | 20 |

RG22 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |

RG23 | 5 | 6 | 6 | 6 | 6 | 6 | 6 |

RG24 | 1 | 6 | 6 | 6 | 6 | 6 | 6 |

RG25 | 5 | 5 | 5 | 5 | 13 | 13 | 13 |

RG26 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |

*Note:* ‘1’, ‘2’, and ‘20’ mean the 1st cluster, 2nd cluster and 20th cluster.

### Rainfall characteristics analysis

According to the proposed evaluation framework, the rainstorm events should be characterized in advance. Figures 5–8 show the rainfall depth and storm patterns for gauged and QPE hourly rainfall data. From Figure 5, it is known that rainfall characteristics clearly change with the gauges for a particular event. For example, the average rainfall depths of 26 raingauges approximates 661.6 mm for EV10 (Typhoon Morakot), but the maximum and minimum depths are 891 mm and 415 mm, respectively. It is also observed that there significantly exist spatial variations in rainfall depth. Thus, the rainfall depth could be regarded as a spatial variable and the corresponding variation can be expressed in terms of the spatial semivariogram. Similar conclusions can be drawn from the QPE data as shown in Figure 6.

Examining the storm pattern shown in Figures 7 and 8, it is clear that the variation changes with the time as well as with the location. The storm pattern can be similarly treated as a spatiotemporal variable; thus, its variability can be represented by the spatiotemporal semivariogram. Although the storm pattern varies with location and time, it can be classified generally into three types: the advanced type, central type and delayed type. For the central storm pattern, the maximum dimensionless rainfall occurs at the middle time step of the duration (called the central time step), such as with EV10 at RG21 (Cui-Feng gauge). As for the advanced and delayed storm pattern, their maximum dimensionless rainfalls are at earlier and later time steps, respectively, such as with EV1 and EV9 at RG9 (Kun-Yang gauge) and RG11 (Taichung gauge). This implies that the storm pattern not only varies with location, but also changes with time. That is to say, the uncertainties probably exist in the storm pattern in time and space.

### Comparison of spatiotemporal semivariograms for rainfall characteristics

*et al.*1954; Hossain

*et al.*2004; Schröter

*et al.*2011; Zhu

*et al.*2013; De Coning 2013). Therefore, this study uses the dimensionless rainfall depth in the computation of spatial semivariograms as: where

*d*is the rainfall depth at the

_{i}*i*th gauge, and

*N*is the number of gauges in a watershed. Then, the spatial semivariogram for the QPE and gauged rainfall depth can be calculated from the aforementioned nondimensionalized values. Figure 9 presents the spatial semivariogram for the rainfall depth calculated from the QPE and gauged data, respectively. It is seen that spatial semivariograms from gauged rainfall data approach those obtained from the QPE data, except for EV1 and EV3. Also, the results from the comparison in spatiotemporal semivariograms randomly change with the rainstorm events.

_{g}Since the storm pattern belongs to the spatial and temporal variate, spatiotemporal semivariograms are required to quantify the variation for rainfall in time and space. Since a spatiotemporal semivariogram is a two-dimensional variable, it is related to location and time. This study compares the difference between the spatiotemporal semivariogram from the QPE and gauged data, respectively, by calculating their relative ratio as shown in Figure 10. It can be observed that the angle of the relative-ratio line of the spatiotemporal approximates 45° for EV10. The spatial and temporal variations in the storm pattern from the gauged data approaches those from QPE data. In other words, the temporal and spatial changes in the storm pattern extracted from gauged data resemble those from the QPE data. However, for EV2, the angle of the relative-ratio line exceeds 45°; this means that the spatiotemporal semivariograms for the storm pattern from the gauged data are higher than those from the QPE data. The same results can also be seen for other rainstorm events.

The above results reveal that there are significant variations in the rainfall depth and storm pattern extracted from the gauged rainfall data in time and space as compared with those from QPE data. To compare spatiotemporal semivariograms estimated from the 10 rainstorm events, the average values of the fitness performance indices (i.e. RMSE and AIC) are required for identifying the representative raingauge network.

### Identification of influential gauges

The proposed framework identifies the influential gauges and corresponding representative raingauge networks by carrying out cluster analysis and cross-validation. In cluster analysis, seven numbers of clusters, including 5-, 8-, 10-, 13-, 15-, 18- and 20-cluster networks, are considered and 10-clusters network are illustrated for identifying the influential raingauges. Table 4 presents the classification of all raingauges through the cluster analysis. It is known that single gauges (Cao-Tun gauge and Zhang-Hu gauge) are classified into the 4th and 10th clusters, so they are defined directly as the influential gauges in the 4th and 10th clusters, respectively.

Table 5 shows the average AIC values of semivariograms for the rainfall depth and storm pattern ( and ), respectively, computed from 10 rainstorm events in the 10-cluster gauge network. It can be seen that the average values for the rainfall depth are generally higher than those for the storm pattern. On average, and values for the rainfall depth and storm pattern approximate –42 and –1640, respectively. To find the influential raingauge for each cluster, the weighted average AIC (*AIC _{wavg}*) values are calculated using Equation (10) as shown in Table 5. In the 1st cluster which includes four gauges, the average AIC values for the rainfall depth and the storm pattern are about –41.6 and –1627.05 , respectively. The corresponding

*AIC*values are located between –882.3 and –836.7. Since the Qing-Liu (1) gauge has the maximum

_{wavg}*AIC*value (–836.7), it can be selected as the influential gauge for the 1st cluster. Similarly, in the 2nd cluster, the value at the Cui-Luan gauge is –44.91 and it is the minimum in five gauges; however, its value for the storm pattern (about –1610.1) is significantly higher than those of the remaining four gauges. This results in its weighted average AIC value (

_{wavg}*AIC*= –849.9) being the maximum value. Thus, the Cui-Luan gauge is defined as the sensitive gauge in the 2nd cluster. Additionally, for the 3rd cluster where two gauges are involved (i.e. Pu-Zhong and Liu-Fen-Liao), the average AIC values for the rainfall depth ( = –39.24) and the storm pattern ( = –1603.4) at the Pu-Zhong gauge are the maximum. Thus, the corresponding weighted average is at the maximum ( = –840.9). Therefore, the Zhong-Pu gauge should be identified as the influential gauge in the 3rd cluster. Using the same method as the aforementioned, the influential gauges in other clusters can also be determined as show in Figure 11. It can be seen that the 10 influential raingauges, the Qing-Liu gauge (RG1), Cui-Luan gauge (RG2), Liu-Fen-Liao gauge (RG3), Cao-Tun gauge (RG4), Ri-yue-tan gauge (RG8), Zhang-hu gauge (RG10), Taichung gauge (RG11), Pu-Li gauge (RG15), Feng-Shu-Liu gauge (RG17), and Bei-Shan (1) gauge (RG22), are approximately uniformly distributed throughout the Wu River Watershed. It concludes that these 10 raingauges can form a representative network for the Wu River watershed under the consideration of 10 clusters.

_{wavg}Average AIC | |||||
---|---|---|---|---|---|

No of cluster | Raingauge | Rainfall depth | Storm pattern | Weighted average AIC_{wavg} | |

1 | Qing-Liu (1) | RG1 | –42.58 | –1588.31 | –836.74* |

Hui-Sun | RG7 | –41.49 | –1617.49 | –850.24 | |

Qing-Liu | RG12 | –42.47 | –1679.75 | –882.34 | |

Ling-Xiao | RG14 | –40.11 | –1622.67 | –851.44 | |

2 | Cui-Luan | RG2 | –44.91 | –1610.10 | –849.96* |

Cui-Luan (1) | RG16 | –41.10 | –1696.38 | –889.29 | |

Kun-Yang | RG19 | –40.57 | –1646.60 | –863.86 | |

Rui-Yan | RG20 | –43.57 | –1668.08 | –877.61 | |

Cui-Feng | RG21 | –44.81 | –1622.81 | –856.22 | |

3 | Liu-Fen-Liao | RG3 | –39.24 | –1603.39 | –840.93* |

Pu-Zhong | RG26 | –39.99 | –1685.44 | –882.71 | |

4 | Cao-Tun | RG4 | –43.62 | –1611.26 | –849.25* |

5 | Bei-Shan (1) | RG5 | –41.87 | –1566.61 | –825.17* |

Bei-Shan | RG22 | –44.86 | –1637.54 | –863.63 | |

Wai-Da-Ping | RG25 | –42.11 | –1677.98 | –881.10 | |

6 | Tou-Bain-Keng | RG6 | –46.35 | –1655.45 | –874.07 |

Tai-Chung | RG11 | –38.15 | –1615.34 | –845.82* | |

Shui-Chang-Liu | RG23 | –39.58 | –1694.95 | –887.05 | |

Chang-Fu | RG24 | –40.25 | –1655.30 | –867.89 | |

7 | Feng-Shu-Lin | RG17 | –41.97 | –1634.72 | –859.33* |

Ren-Ai | RG18 | –46.14 | –1632.66 | –862.47 | |

8 | Ri-Yue_Tan | RG8 | –42.48 | –1631.12 | –858.04* |

Yu-Chi | RG9 | –43.55 | –1687.44 | –887.27 | |

9 | Da-Du-Cheng | RG13 | –40.44 | –1677.27 | –879.08 |

Pu-Li | RG15 | –45.08 | –1661.80 | –875.98* | |

10 | Zhang-Hu | RG10 | –40.69 | –1635.06 | –858.22* |

Average AIC | |||||
---|---|---|---|---|---|

No of cluster | Raingauge | Rainfall depth | Storm pattern | Weighted average AIC_{wavg} | |

1 | Qing-Liu (1) | RG1 | –42.58 | –1588.31 | –836.74* |

Hui-Sun | RG7 | –41.49 | –1617.49 | –850.24 | |

Qing-Liu | RG12 | –42.47 | –1679.75 | –882.34 | |

Ling-Xiao | RG14 | –40.11 | –1622.67 | –851.44 | |

2 | Cui-Luan | RG2 | –44.91 | –1610.10 | –849.96* |

Cui-Luan (1) | RG16 | –41.10 | –1696.38 | –889.29 | |

Kun-Yang | RG19 | –40.57 | –1646.60 | –863.86 | |

Rui-Yan | RG20 | –43.57 | –1668.08 | –877.61 | |

Cui-Feng | RG21 | –44.81 | –1622.81 | –856.22 | |

3 | Liu-Fen-Liao | RG3 | –39.24 | –1603.39 | –840.93* |

Pu-Zhong | RG26 | –39.99 | –1685.44 | –882.71 | |

4 | Cao-Tun | RG4 | –43.62 | –1611.26 | –849.25* |

5 | Bei-Shan (1) | RG5 | –41.87 | –1566.61 | –825.17* |

Bei-Shan | RG22 | –44.86 | –1637.54 | –863.63 | |

Wai-Da-Ping | RG25 | –42.11 | –1677.98 | –881.10 | |

6 | Tou-Bain-Keng | RG6 | –46.35 | –1655.45 | –874.07 |

Tai-Chung | RG11 | –38.15 | –1615.34 | –845.82* | |

Shui-Chang-Liu | RG23 | –39.58 | –1694.95 | –887.05 | |

Chang-Fu | RG24 | –40.25 | –1655.30 | –867.89 | |

7 | Feng-Shu-Lin | RG17 | –41.97 | –1634.72 | –859.33* |

Ren-Ai | RG18 | –46.14 | –1632.66 | –862.47 | |

8 | Ri-Yue_Tan | RG8 | –42.48 | –1631.12 | –858.04* |

Yu-Chi | RG9 | –43.55 | –1687.44 | –887.27 | |

9 | Da-Du-Cheng | RG13 | –40.44 | –1677.27 | –879.08 |

Pu-Li | RG15 | –45.08 | –1661.80 | –875.98* | |

10 | Zhang-Hu | RG10 | –40.69 | –1635.06 | –858.22* |

‘*’ means the influential gauge selected in each cluster.

### Establishment of the representative raingauge network

The same procedure as for the 10-gauge network is carried out to determine the representative raingauge networks for 5-, 8-, 13-, 15-, 18- and 20-gauge networks as shown in Figure 12. To find the optimal network, the *AIC _{wavg}* are re-calculated using the rainfall data from the influential gauges within the representative networks. Figure 13 shows the

*AIC*corresponding to the representative raingauge networks for seven numbers of clusters. It reveals that the

_{wavg}*AIC*decreases from the 5-cluster network (about –858) to the 10-cluster network (approximately –991.2), and increase to the 20-cluster network (about –72.5). Since the best-fit model is associated with the minimum AIC value, the 10-gauge raingauge network can be selected as the optimal network in the Wu River watershed.

_{wavg}In order to demonstrate the optimal 10-gauge network, this study utilizes two additional rainstorm events (Typhoons Fanapi in 2010 and Saola in 2012) to evaluate the difference in the semivariograms for the rainfall characteristics against the QPE data between all gauges and the optimal network. Figures 14 and 15 show the comparison of the semivariograms for the rainfall depth and storm pattern from the QPE data with those from and gauged data in all gauges and the optimal network, respectively. Note that the spatiotemporal semivariograms calculated from rainfall data at all raingauges (on average 1.84) significantly depart from those from the QPE data. Moreover, Figure 15 shows the relative-ratio line for the spatiotemporal semivariogram from gauged rainfall data in all raingauges and the optimal network, respectively. This reveals that the semivariograms for the optimal network are closer to those calculated from QPE data than those based on all gauges. The RMSE of the rainfall depth (0.01) and the storm pattern (0.002) for the optimal network are, respectively, significantly less than those for all gauges, 0.13 (rainfall depth) and 0.066 (storm pattern). As a result, the optimal 10-gauge network in the Wu River watershed determined using the proposed evaluation framework can provide the rainfall characteristics, of which spatial and temporal variation resemble those results from the grid-based QPE.

## CONCLUSIONS

This study proposes an evaluation framework to identify the influential raingauges which can form the representative raingauge network by integrating cluster analysis with weighted semivariogram models. The evaluation framework is based on quantifying the fitness performance index (AIC) for the spatiotemporal semivariograms of the rainfall characteristics in the raingauges to the grid-based quantitative rainfall estimation (QPE). The 26 raingauges and 1,336 grids of QPE data from the 10 typhoons in the Wu river watershed located in Central Taiwan are the study area and data set. By means of the proposed evaluation framework, the representative raingauge networks of 5, 8, 10, 13, 15, 18 and 20 clusters are obtained, and the 10-gauge network is selected as the optimal network. Two additional rainstorm events, Typhoon Fanapi (2010) and Saola (2012), are used to evaluate the applicability and reliability of the optimal network. The results reveal that the spatiotemporal semivariograms for the gauged rainfall characteristics calculated from the optimal 10-gauge network resemble the variation in the QPE's rainfall characteristics in time and space.

Although a number of investigations have proposed methods for identifying the raingauge networks, their results differ due to the diverse objective of selecting the optimal network. The optimal network identified in this study is expected to capture the change of the rainfall characteristic in time and space based on the grid-based radar data. Nevertheless, the comparison of the proposed framework with other methods is intended in future work. However, in addition to the identification of the optimal network, increasing the density of raingauges to provide essential and useful rainfall record is also an important issue in recent hydrology research, such as the correction of radar-based quantitative rainfall estimation (e.g. Wood *et al.* 2000; Yilmaz *et al.* 2005; Barca *et al.* 2008; Stisen *et al.* 2012; Adhikary *et al.* 2015a, 2015b). Therefore, future work should expand the proposed evaluation framework to determine the locations for new raingauges in a catchment area.