China has suffered from increasingly severe flood events in recent years, most of which are caused by heavy rains. The substantial casualties and damage caused by flooding necessitate a better understanding of precipitation extremes, especially in heavily populated urban areas. Based on L-moments from a regional perspective, this paper analyzes precipitation extremes in the Taihu Basin, utilizing annual maximum daily precipitation and partial duration series at 96 rain gages. The comparison of regional and at-site analysis results shows that the former provides more robust estimates, especially in the upper tail of a distribution (higher quantiles). Also, the use of partial duration series, which captures more information about extreme events, was found to be preferable for describing the extreme precipitation events in the Taihu Basin. Given the recently observed more frequent occurrence and greater magnitude of precipitation extremes, it is suggested that the flood design standard used in the basin should be updated, especially for the urbanizing zones.
INTRODUCTION
Since the late 20th century, the frequency of extreme precipitation events has increased substantially in many countries worldwide, including China (IPCC 2013). Extreme precipitation events very often cause severe floods. In most of the southeast and northwest parts of China, the frequency of precipitation-induced flood events has increased significantly (Li et al. 2012). Meanwhile, rapid urbanization in China in recent years has greatly aggravated flood damage. It is therefore imperative to investigate extreme rainfall-induced floods in urban areas, and the analysis of extreme precipitation events is a critical component of engineering design to protect against such events. To date, the quantification of rainfall frequency in China has been based mainly on at-site frequency analysis via the conventional moment method (CMM) using the Pearson type III distribution (PE3) (Ministry of Water Resources China 2006). This method has been applied since the 1950s without any modifications or updates. Previous studies have demonstrated that it underestimates extreme precipitation amounts and leads to correspondingly lower flood design standards (Wang et al. 2008). Therefore, to update the current design precipitation atlas and to enhance flood protection capabilities, it is necessary to examine the precipitation extremes using other approaches.
Regional frequency analysis, which trades space for time, provides another framework for characterizing the frequency distribution of hydrological events (Norbiato et al. 2007) and improves at-site statistical characterization by incorporating data from several sites in a homogeneous region. This method assumes that in a homogeneous region, the distributions of extremes are identical apart from a site-specific scaling factor. In this study, the index-flood (index-precipitation) model, one of the most commonly employed models, is used for the regional frequency analysis.
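The scaling assumption above can be illustrated with a minimal sketch: a site quantile is the product of a site-specific index value and a dimensionless regional growth factor shared by all sites in the region. The function name and all numbers below are hypothetical and are not taken from this study; using the site mean as the index value is a common choice that is assumed here.

```python
def site_quantile(index_value_mm, regional_growth_factor):
    """Index-flood model: site quantile = site scaling factor x regional growth factor."""
    return index_value_mm * regional_growth_factor


# Hypothetical numbers for illustration only (not results of this study).
site_mean_amdp_mm = 90.0      # index precipitation, assumed to be the site mean of the AMDP series
growth_factor_100yr = 2.1     # dimensionless regional growth factor for a 100-year return period

q100_mm = site_quantile(site_mean_amdp_mm, growth_factor_100yr)
```

Because the growth factor is estimated from all sites in the region, only the index value has to be estimated from the record of the individual site.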
Like conventional moments, L-moments provide measures of distributional location (mean), scale (variance), skewness (shape), and kurtosis (peakedness). Parameter estimation based on L-moments is robust to outliers and virtually unbiased for small samples (Hosking & Wallis 1997). The approach has been accepted as an accurate and robust method for selecting and parameterizing representative probability distribution functions (Lin & John 1993) and is used in many countries and regions worldwide (Parida et al. 1998; Sankarasubramanian & Srinivasan 1999; Yue & Wang 2004; Atiem & Harmancioğlu 2006; Liu et al. 2007; Kjeldsen & Jones 2009; Um et al. 2010; Chen & Hong 2012). In the United States, the L-moment method (LMM) for distribution parameter estimation has been applied as a design standard for hydrological works by the OHD of NOAA (Lin & John 1993; Lin et al. 2012). However, in China, there is still a lack of relevant research in urban areas. Thus it is worthwhile to test this approach and assess its applicability in China.
The Taihu Basin is one of the most important basins in China, with substantial recent increases in urban land. It also suffers from massive and severe flood damage induced by precipitation events every year (Wu 2000a, 2000b; Wu & Guan 2000; Ou & Wu 2001). A better understanding of the frequency of occurrence of extreme precipitation is urgently needed. Several studies of extreme rainfall frequency in the Taihu Basin have already been conducted. Liang et al. (2013) compared LMM and CMM utilizing rainfall data in the Taihu Basin, and the results demonstrated a better performance of LMM. Zhou et al. (2014b) investigated the correlation between rainfall amounts observed in several small sub-basins of the Taihu Basin. Zhou et al. (2014a) conducted at-site analysis at the rain gage with the longest rainfall records in the Taihu Basin using the parameter estimation methods of maximum likelihood and LMM. This study extends those previous studies and provides a framework for precipitation frequency analysis using regional L-moments analysis. Precipitation extremes are estimated based on L-moments from a regional perspective using two datasets, i.e., annual maximum daily precipitation (AMDP) and partial duration series (PDS). The estimates from the regional approach and the at-site approach are compared, as are the results obtained using AMDP and PDS.
METHODOLOGY
Theory of L-moments
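As a minimal sketch of how the sample L-statistics used in this paper (L-Cv, L-Cs, L-Ck) can be computed, the function below follows the standard route through unbiased probability-weighted moments described by Hosking & Wallis (1997). The function name, the dictionary keys, and the example values are illustrative assumptions, not the code or data used in this study.

```python
import numpy as np


def sample_l_moments(data):
    """Return the sample mean, L-Cv, L-Cs (L-skewness) and L-Ck (L-kurtosis)."""
    x = np.sort(np.asarray(data, dtype=float))   # ascending order statistics
    n = x.size
    if n < 4:
        raise ValueError("at least four observations are needed")
    j = np.arange(1, n + 1)                      # ranks 1..n

    # Unbiased probability-weighted moments b0..b3
    b0 = x.mean()
    b1 = np.sum((j - 1) / (n - 1) * x) / n
    b2 = np.sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((j - 1) * (j - 2) * (j - 3)
                / ((n - 1) * (n - 2) * (n - 3)) * x) / n

    # First four sample L-moments
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0

    return {"mean": l1, "L-Cv": l2 / l1, "L-Cs": l3 / l2, "L-Ck": l4 / l2}


# Hypothetical annual maxima in mm, for illustration only.
stats = sample_l_moments([62.0, 75.5, 81.2, 90.4, 104.8, 133.1, 158.9])
```

Because each L-moment is a linear combination of the ordered observations, a single very large value affects the estimates less than it affects conventional skewness and kurtosis, which involve third and fourth powers.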
The L-moment approach of regional frequency analysis
This approach starts by identifying the regional homogeneity and evaluating its validity through statistical tests. It investigates the data properties by L-statistics, groups the sites into homogeneous regions, and then estimates the distribution of extremes at a given site. The merit of regional analysis is that, by combining data from different sites in one homogeneous region, it reduces the sampling variability of the parameter estimates and makes the estimates of high quantiles (the upper tail of a distribution) more robust. Hence, it provides more accurate estimates than analysis of a single site.
The criteria for assessing the homogeneity of a region are (Hosking & Wallis 1990): if H < 1, the region is acceptably homogeneous; if 1 ≤ H < 2, the region is possibly heterogeneous; and if H ≥ 2, the region is definitely heterogeneous.
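A minimal sketch of these criteria is given below. It assumes the usual Hosking & Wallis form of the measure, in which an observed between-site dispersion V is standardized by the mean and standard deviation of the same dispersion obtained from simulated homogeneous regions. The function names and the input numbers are hypothetical.

```python
from statistics import mean, stdev


def heterogeneity_measure(v_observed, v_simulated):
    """H = (V_obs - mean(V_sim)) / sd(V_sim), with V_sim from simulated homogeneous regions."""
    return (v_observed - mean(v_simulated)) / stdev(v_simulated)


def classify_region(h):
    """Map a heterogeneity measure H to the three classes used in the text."""
    if h < 1:
        return "acceptably homogeneous"
    if h < 2:
        return "possibly heterogeneous"
    return "definitely heterogeneous"


# Hypothetical dispersion values, for illustration only.
h_example = heterogeneity_measure(0.031, [0.030, 0.034, 0.038, 0.042, 0.046])
label = classify_region(h_example)
```

Negative values of H, which occur when the observed dispersion is smaller than expected for a homogeneous region, fall in the first class.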
Extension to PDS methods
For a more reliable estimation of precipitation extremes, it is very important to include as many large events (peaks) as possible from the observed series. The AMDP series contains one peak per year and is accepted as an efficient and common series for frequency analysis in hydrology. However, the second largest event in a year is never included in the AMDP series, even when it exceeds the largest events of other years; thus the AMDP series may discard useful information (Zhou et al. 2014a). When dealing with values exceeding a certain threshold, PDS is preferred for hydrological frequency analysis, and is capable of capturing more information about extreme events than the AMDP series (Hosking & Wallis 1997; Norbiato et al. 2007; Pham et al. 2014a; Zhou et al. 2014a). This approach is effective for accurately estimating precipitation extremes, reducing the uncertainty and maximizing the utilization of available data (Pham et al. 2014b). Although there are still some difficulties in defining PDS (Madsen et al. 1997), the method has attracted growing research attention in recent years.
PDS is a series of selected data whose magnitude is greater than a threshold value (u), or a threshold characterized by the average number of peaks per year (λ). For a series of observations x = {x1, x2, …, xn} over n years, PDS is a set containing m values, y = {y1, y2, …, ym}, which exceed a chosen threshold u; thus, yi > u, i = 1, 2, …, m, and λ = m/n. The total number of peaks (m) therefore depends on either the chosen average number of peaks per year or the chosen threshold value. Compared with the AMDP series, the PDS is able to extract more information for the prediction of extreme rainfall.
In this study, the main procedure of PDS data selection at site i is as follows:
Produce the AMDP series, {xi,AMDP}, with a data length of Ni,AMDP.
Select the average number of peaks per year λ and determine the required data length Ni,PDS = λNi,AMDP. Note that all the peaks are assumed to be independent random variables.
Based on the required data length Ni,PDS, where, e.g., λ = 3 and Ni,PDS = 3Ni,AMDP, determine the required threshold xmin.
Based on xmin, select observations from individual years of record and form the PDS series, {xi,PDS}, where xi,PDS > xmin.
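The selection steps above can be sketched as follows. This is a simplified illustration rather than the procedure as implemented in this study: the function name and the data are hypothetical, and the input values are assumed to be independent events already (no declustering of consecutive wet days is performed).

```python
def select_pds(daily_by_year, peaks_per_year):
    """Build the AMDP series and a PDS with about peaks_per_year values per year.

    daily_by_year maps a year to its list of daily precipitation totals (mm).
    """
    # Step 1: AMDP series, one maximum per year of record.
    amdp = [max(values) for values in daily_by_year.values()]

    # Step 2: required PDS length = lambda x AMDP length.
    n_required = round(peaks_per_year * len(amdp))

    # Step 3: threshold x_min = largest pooled value that is left out.
    pooled = sorted((v for values in daily_by_year.values() for v in values),
                    reverse=True)
    if n_required >= len(pooled):
        raise ValueError("not enough observations for the requested lambda")
    x_min = pooled[n_required]

    # Step 4: keep every observation strictly above the threshold.
    # With tied values at the threshold, slightly fewer values are kept.
    pds = [v for v in pooled if v > x_min]
    return amdp, pds, x_min


# Hypothetical record (mm), for illustration only.
record = {
    2001: [10, 80, 65, 5],
    2002: [30, 40, 20, 15],
    2003: [90, 70, 12, 8],
}
amdp, pds, x_min = select_pds(record, peaks_per_year=2)
```

In this toy record the second largest values of 2001 and 2003 (65 mm and 70 mm) exceed the annual maximum of 2002 (40 mm); they are absent from the AMDP series but retained in the PDS.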
STUDY AREA AND DATASET
The Taihu Basin
The Taihu Basin (30.14°–32.26°N, 119.0°–121.89°E) is located in the Yangtze Delta plain at the mouth of the Yangtze River (Wang 2006); with an area of 36,900 km², it is one of the tributary regions of the Yangtze River Basin. Although the Taihu Basin only contains 0.4% of the total area of China, it is one of the most economically advanced areas of China (Yang & Wang 2003).
With a subtropical humid monsoon climate, the Taihu Basin receives about 1,181 mm of rainfall per year (Gou et al. 2010). Precipitation varies considerably in time and space in this region, generally decreasing from south to north. During the flood season from May to September, plum rains, characterized by long duration, and heavy rains, characterized by great intensity and short duration, both occur frequently (Wu 2000a, 2000b). Both types of rain contribute to serious flood risks. Considering that the region is suffering from frequent flood events during its recent fast urbanization, the Taihu Basin Authority (TBA) has updated its flood control plan for the Taihu Basin and has raised the longest return period for flood control up to 200 years in 2015 (Huang 2000; TBA 2008).
Dataset
No. | Sub-region | Number of sites | Mean annual precipitation (mm) |
---|---|---|---|
Region 1 | Taihu | 7 | 1,156 |
Region 2 | Hangjiahu | 21 | 1,225 |
Region 3 | Wuchengxiyu | 7 | 1,074 |
Region 4 | Yangchengdianmao | 9 | 1,098 |
Region 5 | Huxi | 12 | 1,132 |
Region 6 | Zhexi | 34 | 1,441 |
Region 7 | Pudongpuxi | 6 | 1,105 |
Taihu Basin | All sub-regions | 96 | 1,181 |
RESULTS AND DISCUSSION
Analysis of precipitation extremes (AMDP/L-moment)
The heterogeneity measures (Table 2) show that all the sub-regions can be considered acceptably homogeneous; the only value above 1 is H(3) for Region 5, which exceeds it only slightly (1.01). H(1) and H(2), calculated from L-Cv/L-Ck and L-Cs/L-Ck, are more representative indexes of heterogeneity than H(3). Thus, Region 5 is also considered to be approximately homogeneous.
Sub-region | H(1) | H(2) | H(3) |
---|---|---|---|
Region 1 | −1.69 | −1.11 | −0.91 |
Region 2 | −1.91 | −2.92 | −2.75 |
Region 3 | −1.28 | −1.26 | 0.12 |
Region 4 | −0.31 | 0.44 | 0.36 |
Region 5 | 0.14 | 0.39 | 1.01 |
Region 6 | −0.18 | −1.95 | −1.49 |
Region 7 | −0.42 | −1.32 | −0.66 |
Distribution | Statistic | Region 1 | Region 2 | Region 3 | Region 4 | Region 5 | Region 6 | Region 7 |
---|---|---|---|---|---|---|---|---|
GLO | L-Ck | 0.239 | 0.262 | 0.212 | 0.218 | 0.203 | 0.260 | 0.240 |
GLO | Z | 0.42 | −0.16 | −0.40 | −0.23 | 4.55 | 1.54 | 1.19 |
GEV | L-Ck | 0.212 | 0.239 | 0.178 | 0.186 | 0.167 | 0.237 | 0.212 |
GEV | Z | −0.37 | −1.25 | −1.42 | −0.78 | 2.75 | 0.12 | 0.43 |
GNO | L-Ck | 0.191 | 0.213 | 0.165 | 0.171 | 0.157 | 0.211 | 0.192 |
GNO | Z | −0.97 | −2.48 | −1.81 | −1.25 | 2.27 | −1.44 | −0.15 |
PE3 | L-Ck | 0.156 | 0.169 | 0.142 | 0.144 | 0.137 | 0.167 | 0.156 |
PE3 | Z | −2.00 | −4.58 | −2.54 | −2.09 | 1.30 | −4.10 | −1.15 |
GPA | L-Ck | 0.139 | 0.171 | 0.097 | 0.106 | 0.082 | 0.168 | 0.139 |
GPA | Z | −2.49 | −4.58 | −3.89 | −3.30 | −1.42 | −4.05 | −1.62 |
BEST | Distribution | GEV | GLO | GLO | GLO | PE3 | GEV | GNO |
Note that in regional frequency analysis, the aim is not to identify a ‘true’ distribution but to find a single frequency distribution that fits data from several sites and yields accurate quantile estimates for each of them (Hosking & Wallis 1997). In general, even a region regarded as homogeneous is slightly heterogeneous, and there will be no single ‘true’ distribution that applies to each site. Thus, the results of frequency analysis for sites 6032 and 6033 are reasonable here.
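The BEST row of the goodness-of-fit table can be reproduced from the tabulated Z values with the short sketch below. It assumes the selection rule commonly used with the Hosking & Wallis Z statistic, namely that a distribution is acceptable when |Z| ≤ 1.64 and that the acceptable candidate with the smallest |Z| is preferred; the function and variable names are illustrative.

```python
# Z statistics for Regions 1-7, copied from the goodness-of-fit table above.
Z_VALUES = {
    "GLO": [0.42, -0.16, -0.40, -0.23, 4.55, 1.54, 1.19],
    "GEV": [-0.37, -1.25, -1.42, -0.78, 2.75, 0.12, 0.43],
    "GNO": [-0.97, -2.48, -1.81, -1.25, 2.27, -1.44, -0.15],
    "PE3": [-2.00, -4.58, -2.54, -2.09, 1.30, -4.10, -1.15],
    "GPA": [-2.49, -4.58, -3.89, -3.30, -1.42, -4.05, -1.62],
}


def best_distribution(z_by_distribution, region_index, critical=1.64):
    """Return the acceptable distribution with the smallest |Z|, or None if none passes."""
    acceptable = {
        name: abs(z[region_index])
        for name, z in z_by_distribution.items()
        if abs(z[region_index]) <= critical
    }
    if not acceptable:
        return None
    return min(acceptable, key=acceptable.get)


best_by_region = [best_distribution(Z_VALUES, i) for i in range(7)]
```

For Region 5, only PE3 and GPA pass the |Z| ≤ 1.64 check, and PE3 has the smaller absolute value, which agrees with the tabulated choice.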
Further analysis of heterogeneity
Because heterogeneity may increase the bias in the estimated regional growth factor (RGF) if the ‘probably’ discordant sites are retained, the impacts of sites 6032 and 6033 on the regional estimates are evaluated via sensitivity tests.
Let Q0 be the regional precipitation estimates (with return periods = 25, 50, 100, 200 years) without site A, and Q1 be the regional precipitation estimates with site A included; the sensitivity index is defined as p = |Q1 − Q0|/Q0. The threshold is set as p = 5%: if p < 5%, site A can be retained; otherwise, the site may be deleted or grouped into another region.
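The sensitivity test can be written as a small sketch. The function names and the quantile values are hypothetical; only the definition of p and the 5% threshold come from the text.

```python
def sensitivity_index(q_with_site, q_without_site):
    """p = |Q1 - Q0| / Q0, where Q0 excludes the site and Q1 includes it."""
    return abs(q_with_site - q_without_site) / q_without_site


def can_retain_site(q_with_by_period, q_without_by_period, threshold=0.05):
    """A site is retained only if p stays below the threshold for every return period."""
    return all(
        sensitivity_index(q_with_by_period[period], q_without_by_period[period]) < threshold
        for period in q_without_by_period
    )


# Hypothetical regional estimates in mm, keyed by return period in years.
q0 = {25: 180.0, 50: 205.0, 100: 231.0, 200: 258.0}   # site A excluded
q1 = {25: 180.4, 50: 205.9, 100: 232.6, 200: 260.3}   # site A included
retain_site_a = can_retain_site(q1, q0)
```

Checking every return period, rather than a single one, matters because the influence of a discordant site typically grows toward the upper tail.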
The sensitivity test results are summarized in Table 4. They show that the p value for both sites is <5% when the return period is smaller than 200 years. This indicates that Region 6 can be treated as a homogeneous region and that the estimates obtained in the previous sections are acceptable.
Removed sites (Region 6) | Return period 10 years | 25 years | 50 years | 100 years | 200 years |
---|---|---|---|---|---|
6033 | 0.13 | 0.05 | 0.20 | 0.44 | 0.67 |
6032 | 0.13 | 0.05 | 0.20 | 0.44 | 0.67 |
6032 and 6033 | 0.45 | 0.69 | 0.85 | 0.97 | 1.11 |
PDS data analysis
The same analysis procedure is carried out using the PDS series instead of the AMDP series. The L-statistics based on the AMDP series and the PDS series for the seven sub-regions are provided in Table 5. Compared with the results based on AMDP, the estimated L-Cv values based on PDS are generally smaller, and the L-Cs and L-Ck values are generally larger. This indicates that the PDS series has a smaller sampling variance and larger skewness and kurtosis, which results in higher estimates for each site in the upper tail of the distribution. Furthermore, the best-fitting distributions for the seven sub-regions based on the PDS series differ from those based on the AMDP series. It is well recognized that the best distribution is influenced by the properties of the data (Wang et al. 2011) and that the GPA distribution is the most widely accepted model for PDS data (Begueria 2005; Trefry et al. 2005; Pham et al. 2014b). In this study, the GNO distribution is identified as the best distribution for Regions 1 and 2, the GLO distribution for Region 4, and the GPA distribution for the remaining four regions (Regions 3, 5, 6, and 7). The results therefore support the view that PDS data are most closely fitted by GPA distributions in regional analysis.
Sub-region | Series | L-Cv | L-Cs | L-Ck |
---|---|---|---|---|
Region 1 | AMDP | 0.23 | 0.30 | 0.22 |
Region 1 | PDS | 0.16 | 0.41 | 0.25 |
Region 2 | AMDP | 0.25 | 0.34 | 0.25 |
Region 2 | PDS | 0.23 | 0.34 | 0.26 |
Region 3 | AMDP | 0.23 | 0.23 | 0.21 |
Region 3 | PDS | 0.23 | 0.27 | 0.24 |
Region 4 | AMDP | 0.20 | 0.25 | 0.20 |
Region 4 | PDS | 0.21 | 0.24 | 0.18 |
Region 5 | AMDP | 0.22 | 0.21 | 0.11 |
Region 5 | PDS | 0.23 | 0.31 | 0.19 |
Region 6 | AMDP | 0.24 | 0.33 | 0.22 |
Region 6 | PDS | 0.22 | 0.34 | 0.26 |
Region 7 | AMDP | 0.23 | 0.30 | 0.19 |
Region 7 | PDS | 0.17 | 0.40 | 0.22 |
CONCLUSIONS
In this study, regional analysis based on L-moments is carried out in the Taihu Basin with the AMDP series and the PDS series, and the performance of AMDP and PDS is investigated. The regional analysis procedure was found to be more effective for extreme precipitation estimation in this basin than at-site analysis. The procedure is therefore recommended for other cases.
It was noted that the discordancy measures Di of site 6032 and site 6033 show a slight discordancy. However, after more detailed investigation of the data series and examination of the sensitivity test results, the two sites are still accepted. This verification procedure gives engineers a practical check to use in flood design in this area. The L-moment-based heterogeneity measure indicates that the seven sub-regions in the Taihu Basin are acceptably homogeneous even though Region 5 shows a slight heterogeneity in H(3). As the basin comprises different climatic regions, the seven hydrological sub-regions are appropriate for water resources management purposes. The different types of distributions suitable for the seven sub-regions partly reflect the different climatic conditions.
The robustness and accuracy of regional L-moment analysis are demonstrated through the comparison of at-site analysis and regional analysis results for sites 6032 and 6033. The comparison shows that outliers have a significant influence on the upper tail of a distribution, and that regional analysis helps to reduce the influence of insufficient data at individual sites. Based on the results of this study, the analysis method may even be used in ungauged areas.
PDS considers all of the extreme values, and the use of the PDS series enables a more complete analysis of extreme events than the use of the AMDP series, even with some outliers. For small return periods, estimates based on the AMDP series are lower than those based on PDS. It was found that PDS provides higher quantile estimates for practical purposes in the upper tail of a distribution, and that the GPA distribution is suitable for regions in the Taihu Basin. Although the PDS series is still under-used relative to the AMDP series because of its complexity, it is recommended that PDS be used for the study of extreme hydrological events. More research based on the PDS series is needed to describe the extreme hydrological events in the Taihu Basin.
In the mountainous areas of the west part of the Taihu Basin, extreme precipitation is estimated to be greater. Thus, it is suggested that the flood design standard should be updated in this region considering the more frequent occurrence and greater magnitude of precipitation extremes. In addition, this work can be extended to other regions of China.
ACKNOWLEDGEMENTS
This paper is the result of part of a research initiative titled ‘Key Analysis for Flood Control in Rural Areas’ sponsored by National Science & Technology Pillar Program (Grant No. 2014BAL05B02). This paper also addresses the research topic under ‘The Application of Hydrometeorological L-moments in Flood Control Plan’, a project which is supported by the Ministry of Water Resources P.R. China (No. 201001047-02). Our greatest gratitude goes first and foremost to Professor Bingzhang Lin, Dean of Applied Hydrometeorological Research Institute of Nanjing University of Information Science & Technology, China. Professor Lin provided us with many valuable suggestions. The authors would like to thank all the reviewers and editors for their helpful comments and suggestions on improving the quality of this paper.