A regional partial duration series (PDS) model is applied for estimation of intensity duration frequency relationships of extreme rainfalls in Denmark. The model uses generalised least squares regression to relate the PDS parameters to gridded rainfall statistics from a dense network of rain gauges with daily measurements. The Poisson rate is positively correlated to the mean annual precipitation for all durations considered (1 min to 48 hours). The mean intensity can be assumed constant over Denmark for durations up to 1 hour. For durations larger than 1 hour, the mean intensity is significantly correlated to the mean extreme daily precipitation. A Generalised Pareto distribution with a regional constant shape parameter is adopted. Compared to previous regional studies in Denmark, a general increase in extreme rainfall intensity for durations up to 1 hour is found, whereas for larger durations both increases and decreases are seen. A subsample analysis is conducted to evaluate the impacts of non-stationarities in the rainfall data. The regional model includes the non-stationarities as an additional source of uncertainty, together with sampling uncertainty and uncertainty caused by spatial variability.

## INTRODUCTION

Design of water infrastructure is often based on intensity duration frequency (IDF) relationships of extreme rainfall (e.g. Schilling 1991; Arnbjerg-Nielsen *et al.* 2013). They provide information about the mean rainfall intensity of different durations for various frequencies or return periods. IDF relationships are relevant for a wide range of temporal scales; from sub-hourly duration for design of storm water pipes in the upstream parts of sewer networks to several hours or days for design of retention basins that collect water from large catchments. IDF relationships can be estimated by performing an extreme value analysis of rainfall data at the site of interest. Such estimates, however, may be hampered by the lack of sufficiently long rainfall records when extrapolating to large return periods. In regional frequency analysis, data from several sites within a region are pooled whereby the estimation uncertainty can be reduced significantly (e.g. Madsen & Rosbjerg 1997a; Kyselý *et al.* 2011; Burn 2014). In addition, regional frequency analysis facilitates estimation of IDF relationships at ungauged sites by combining regional extreme value statistics and site specific climatic and physiographic characteristics.

A widely applied method in regional frequency analysis is the index-event approach (originally named the index-flood approach in flood frequency analysis) using L-moments (Hosking & Wallis 1993, 1997). This approach has been used in several regional frequency analysis studies of extreme rainfall, e.g. in Australia (Haddad *et al.* 2011), Canada (Alila 1999; Burn 2014), Czech Republic (Kyselý *et al.* 2011), Italy (Di Baldassarre *et al.* 2006), Slovakia (Gaál *et al.* 2008), South Africa (Smithers & Schulze 2001), and Washington State (Wallis *et al.* 2007). All these studies are based on the traditional index-event method using annual maximum series (AMS). Madsen & Rosbjerg (1997a) developed a regional index-event approach based on partial duration series (PDS) that includes all events above a specified threshold level in the extreme value analysis. Madsen *et al.* (1997) showed that the regional index-event PDS model with generalized Pareto distributed exceedances, in general, is more efficient (in terms of quantile estimation uncertainty) than the corresponding index-event AMS model based on the generalized extreme value distribution. The regional PDS model has been further developed and applied for estimation of IDF relationships in Denmark (Madsen *et al.* 2002, 2009).

In the traditional index-event approach, data are pooled within a fixed region that can be assumed to be homogenous with respect to certain statistical characteristics, typically second and higher order moments. Alternatively, a region of influence approach can be used to identify separate homogeneous pooling groups for each site (Burn 1990). The region of influence approach has been applied to regional rainfall analysis by Kyselý *et al.* (2011) and Burn (2014). Another method that relaxes the use of fixed regions, or can be used in combination with a fixed region or region of influence approach, is based on establishing regression relationships that describe the spatial variation of extreme rainfall statistics using covariate information in terms of physiographic and climatic characteristics. Such regional regression relationships also facilitate estimation at ungauged sites. In a regional analysis in Washington State, Wallis *et al.* (2007) found the L-Coefficient of variation (L-CV) and L-skewness to vary systematically with the mean annual precipitation (MAP). Di Baldassarre *et al.* (2006) also related L-CV and L-Skewness to MAP in their study of rainfall extremes in Northern Italy, and Madsen *et al.* (2002, 2009) found that the annual number of extreme events in a regional PDS model of Danish rainfall extremes could be related to MAP. Haddad *et al.* (2011) related L-CV and L-skewness as well as the index parameter to location and distance to the coast, whereas Beguería & Vicente-Serrano (2006) applied a regional regression model relating the PDS parameters to location, altitude and slope.

This study considers regional estimation of IDF relationships in Denmark. It builds on the regional PDS model developed by Madsen *et al.* (2002) and later updated by Madsen *et al.* (2009). The current study includes rainfall data up to 2012, corresponding to 50% more data in terms of station-years compared to the previous study by Madsen *et al.* (2009). In addition, the regional model is extended by using new covariate information in terms of gridded rainfall statistics from a dense rain gauge network measuring daily rainfall. In the update of the regional model by Madsen *et al.* (2009) a general increase in extreme rainfall was found, with most pronounced increases for durations between 10 min and 3 hours. In a recent study by Gregersen *et al.* (2013) a significant increase was found in the annual number of extreme events for all durations analysed between 1 and 24 hours and in the mean extreme intensity for 1- and 3-hour durations. In this study, the impacts of these non-stationarities on the regional model are investigated using subsample analysis.

## DATA AND METHODS

### Rainfall data

Rainfall data from a network of high-resolution rain gauges in Denmark are used in the analysis. The network is based on RIMCO tipping bucket gauges with 0.2 mm resolution and tips being recorded every minute. The network was established in 1979 and is operated by the Water Pollution Committee of the Society of Danish Engineers and the Danish Meteorological Institute (Jørgensen *et al.* 1998). The gauges have been maintained, but the principles of measuring and calibrating the gauges have not been changed in the period investigated.

The data analysed consist of rainfall intensities with a temporal resolution of 1 minute for individual rain events separated by dry periods of at least 1 hour. From the 1-minute intensity data, maximum rainfall intensities for durations ranging between 1 minute and 48 hours are extracted using a moving window approach (Madsen *et al.* 2002). For durations less than 1 hour, independent events are separated by at least 1 hour dry periods. For durations larger than 1 hour, independent events are separated by dry periods that are at least as large as the duration considered. In this case, the separate events defined for the 1-minute intensity data will be merged into fewer and larger independent events. For the extreme value analysis, PDS are derived for each duration from the series of event-based maximum intensities by including intensities above a pre-defined threshold level. The same threshold levels as applied in the previous analyses (Madsen *et al.* 2002, 2009) are used. Short-duration (less than 1–2 hours) extremes are primarily caused by convective rainfall in summer months, whereas long-duration (larger than 12–24 hours) extremes are caused by frontal rainfall and can occur all year round.

*et al.*(2009) included 66 stations with a total of 1,250 station-years, and hence the current study comprises an increase in station-years of 50%.

The development of the annual number of station-years shows a relatively constant level of about 40 station-years per year up to 1990, followed by a steady increase up to a level of about 70 station-years per year during the last 10 years (see Figure 1). To evaluate the impact of the development in data availability over time, a subsample of 31 stations that have more than 30 years of observations is analysed. The subsample includes 999 station-years in total.

*μ*) is estimated from the CGD data using a PDS model with a regional constant threshold level corresponding to approximately three events per year. It varies between 24.5 and 29.5 mm over Denmark, with larger values in eastern Zealand northern Jutland and southern islands (see Figure 2).

_{CGD}In the previous studies by Madsen *et al.* (2002, 2009) different physiographic characteristics (geographical location, altitude, shelter index) were included as covariates in the regression analysis. However, none of these were found to be significant for describing the regional variability and hence are not included in this study.

### Regional model

The regional extreme value model developed by Madsen *et al.* (2002) is applied in this study. The model is based on the PDS approach using a regional constant threshold level to define PDS of extreme rainfall intensities at the different stations. In the regional PDS model, the annual number of extreme events is assumed to follow a Poisson distribution, and the magnitude of the extreme events is assumed to follow a Generalised Pareto (GP) distribution. For determination of a regional parent distribution, the previous studies by Madsen *et al.* (2002, 2009) applied the L-moment goodness-of-fit test proposed by Hosking & Wallis (1993) and extended by Madsen *et al.* (2002) for application to two-parameter distributions used in PDS modelling. These studies showed that the GP distribution was, in general, preferable for the range of rainfall durations considered.

*λ*), and the mean (

*μ*) and L-CV (

*τ*) of the exceedance magnitudes are modelled as regional variables. The regional model estimate of the rainfall intensity for a given return period

_{2}*T*is then given by (Madsen

*et al.*2002).where

*z*is the regional threshold level, , , and are regional model estimates of the Poisson rate, mean, and L-CV, respectively, and is the corresponding estimate of the GP shape parameter.

_{0}*i*,

*M*is the number of stations, are the regression parameters, are the covariates, and are the model residuals with covariance matrix.In Equation (3), is the sampling error variance, is the residual model error variance, and is the sampling error correlation coefficient. Estimation of sampling variances and correlations to be used in the GLS regression model are described in Madsen

*et al.*(2002). is estimated along with the regression parameters using an iterative scheme, see Madsen & Rosbjerg (1997b) for details.

*T*-year estimate at a given location is then obtained from Equation (1). The variance of the

*T*-year estimate is calculated based on the variances of the PDS parameter estimates from the GLS regression models using a Taylor series approximation of Equation (1).where the partial derivatives are evaluated around the GLS parameter estimates.

The variances of the estimated PDS parameters include both residual model error variance and sampling variance corrected for intersite correlations. When only the intercept is included in the regression model, the model provides an estimate of the regional mean PDS parameter, and the estimate of the residual model error variance is then a measure of regional heterogeneity. The regional mean is, in general, different from the arithmetic mean since the GLS model weighs the estimated PDS parameters according to the error covariance matrix, hence giving less weight to more uncertain estimates and groups of sites that have higher inter-site correlations (Madsen & Rosbjerg 1997b). If , the region can be considered homogeneous and the observed variability of the PDS parameter estimates at the different sites in the region can be explained by sampling uncertainty. A residual model error variance larger than zero indicates a heterogeneous region, and one can then apply the GLS regression model with available covariate information to evaluate the potential of describing the regional variability.

*i*= 1,2,…,

*M*in the region.where is the covariance matrix of the estimated regression parameters. The prediction variance includes both the sampling uncertainty of the estimated regression model parameters and the residual model error variance. When comparing different regression models, the model with the smallest average prediction variance is preferred. The reduction in prediction variance (

*RPV*) between a regression model with

*k*explanatory variables, , and the regional mean model, , can be used as a measure of the value of covariate information.

Note that *RPV* can become negative in the case where the inclusion of explanatory variables only provides a minor reduction in residual model error variance, which is smaller than the corresponding increase in the sampling uncertainty of the estimated regression model parameters.

*et al.*(2004) proposed a pseudo coefficient of determination.where and are the residual model error variances for, respectively, a regression model with

*k*explanatory variables and the regional mean model. Note that if then

*R*= 1, although the model is not perfect. In this case, sampling errors account for the differences between the site specific PDS parameter estimates and the GLS regression model estimates. Compared to

^{2}*RPV*,

*R*only considers the reduction in residual model error variance by using covariate information.

^{2}Finally, the significance of the estimated regression parameters is evaluated using a standard t-test.

## RESULTS

### Regional model

*λ*, the GLS results show regional variability for all durations, and a part of this variability can be explained by MAP. The GLS regression models with MAP have smaller average prediction variances than the regional mean models.

*RPV*ranges between 0.01 and 0.54 and

*R*between 0.04 and 0.59, with the smallest values for the intermediate durations 30–360 minutes, and the largest values for the 24- and 48-hour durations. A t-test of the slope of the regression equation shows that the relationship with MAP can be considered significant for all durations at a significance level of 5%, except for 60-minute duration where the significance level is 7%. Estimated GLS regression models for 1-hour and 24-hour durations are shown in Figure 3. GLS regression results for all durations are summarised in Supplementary Material Table S1 (available with the online version of this paper).

^{2}For the mean value of threshold exceedances *μ*, the GLS regression results show regional variability for all durations. For durations of 3–48 hours a significant part of this variability can be explained by *μ _{CGD}*. For these durations,

*RPV*ranges between 0.05 and 0.44 and

*R*between 0.17 and 0.75, and the t-test shows that the relationship with

^{2}*μ*is significant at a 5% level. The largest

_{CGD}*RPV*,

*R*and most significant slopes of the regression line are obtained for 12- and 24-hour durations. For durations smaller than 3 hours, there is no clear pattern in the relationship with

^{2}*μ*. For some durations significant correlations are found, whereas for other durations the correlations are not significant and even result in poorer prediction variance compared to the regional mean model (negative

_{CGD}*RPV*for the 60-minute duration). For consistency, a regional mean model is applied for all durations smaller than 3 hours. Estimated GLS regression models for 1-hour and 24-hour durations are shown in Figure 3. GLS regression results for all durations are summarised in Supplementary Material Table S2 (available online).

For the L-CV of threshold exceedances the GLS regression results indicate regional variability for all durations except for 6 hours. No covariate information has been found to explain this variability, and a regional mean model is applied for all durations. Results are summarised in Supplementary Material Table S3 (available online).

Results of the regional model are shown in Figure 2. The figure shows estimated extreme intensities for 1- and 24-hour durations mapped on the CGD grid. It should be noted that the extreme intensities estimated from the regional model are point estimates, and the maps in Figure 2 show the estimates at the grid centre points as gridded values. The explanatory variables used in the regional model are mapped on the CGD grid in Figure 2 (top row). The spatial patterns of the estimated PDS parameters *λ* and *μ* correspond to the spatial patterns of, respectively, MAP and *μ _{CGD}*. For durations smaller than 3 hours, the regional variability is only due to the variability in

*λ*as explained by MAP (Figure 2, middle row), whereas for durations of 3–48 hours the regional variability in

*μ*as described by

*μ*also contributes to the regional differences in the extreme intensities (Figure 2, bottom row). For smaller return periods, the regional variability in

_{CGD}*λ*has a relatively larger contribution to the regional variability of extreme intensities, whereas for larger return periods the regional variability in

*μ*dominates.

*μ*for these durations. For durations larger than 1 hour, the relative range increases for increasing duration caused by an increasing regional variability of

*μ*and

*λ*. For 24- and 48-hour durations, the upper limit, of the 2 and 10-year events are similar to the lower limits of, respectively, the 10 and 100-year events.

### Subsample analysis

For the Poisson parameter *λ*, GLS regression results show, in general, larger *R ^{2}* values for the subsample compared to the full sample, except for the intermediate durations of 30–180 minutes. However, due to the smaller sample, the subsample has larger sampling uncertainties resulting in smaller

*RPV*values for most durations. The estimated slope of the regression models is smaller for the subsample for all durations and is not significant (at a 5% level) for the durations of 30–360 minutes. In general, the subsample has a smaller range of

*λ*-estimates over Denmark and smaller prediction uncertainties. The results are summarised in Supplementary Material Tables S1 and S4 (available online).

For the mean value of threshold exceedances, *μ* results from the subsample analysis show that the relationship with *μ _{CGD}* is not significant for durations up to 3 hours where negative

*RPV*values and non-significant slope estimates (at a 5% level) are obtained. For larger durations, slope estimates are significant for the subsample regressions but with smaller slope estimates (except for 12-hour duration, where similar slope estimates are found). In general, the subsample results show smaller

*μ*-estimates over Denmark. The subsample provides both smaller and larger prediction uncertainties, depending on duration, than those obtained from the full sample. The results are summarised in Supplementary Material Tables S2 and S5 (available online).

For the shape parameter in the regional GP distribution *κ* larger (less negative) shape parameters are obtained for the subsample, revealing lighter-tailed GP distributions. The subsample provides smaller prediction uncertainties for durations larger than 10 minutes, except for 6-hour duration. Results are summarised in Supplementary Material Table S3.

The analysis shows larger estimates of *λ* and *μ* in the full sample, which in combination with the increase in station-years included in the regional model indicate an increasing trend in *λ* and *μ*. These results correspond well with the findings of Gregersen *et al.* (2013) who analysed a subset of the rainfall data used in this study, including 70 stations with 10–31 years of observations in the period 1979–2009. They found a significant increasing trend of *λ* for all durations analysed (1, 3, 6, 12 and 24 hours). Increasing trends were also found for *μ* for all durations, but they were statistically significant only for 1 and 3-hour durations.

Larger estimates of *λ* and *μ*, and smaller (more negative) regional GP shape parameters in the full sample all point towards larger intensity estimates as shown in Figure 5. The larger prediction uncertainties generally found for *λ*, *μ* and *κ* using the full sample indicate that the impact of non-stationarities is more important than the expected reduction in sampling uncertainty for increasing sample size. However, it could also reflect an increase in the spatial variability caused by adding additional stations in the analysis. It is very difficult to verify which causes are predominant due to the spatial and temporal heterogeneity of the data.

### Comparison with previous studies

In the previous regional studies of Danish rainfall extremes (Madsen *et al.* 2002, 2009), it was also found that the Poisson rate is significantly correlated with MAP. In Supplementary Material Table S4, the range of *λ*-estimates over Denmark from the previous studies are compared to those obtained in the current study. A general increase in *λ* is seen, with more pronounced increases for smaller durations. It should be noted that in the studies by Madsen *et al.* (2002, 2009) a different MAP was used based on data from the standard normal period 1961–1990 (Frich *et al.* 1997).

The regional variability of the mean value of threshold exceedances was in the previous studies described by defining sub-regions with a constant mean. In the first study by Madsen *et al.* (2002), a larger mean intensity was seen in the Copenhagen area for durations larger than 1 hour, with differences between the western and eastern Copenhagen area for some durations. A regional model was defined with three sub-regions, respectively, (i) Copenhagen East, (ii) Copenhagen West, and (iii) the rest of the country. In the subsequent study by Madsen *et al.* (2009), the regional model was revised. For durations up to 3 hours, a regional mean model was applied for the whole country, whereas for larger durations significant differences between west and east Denmark were found and two sub-regions were defined, respectively, west and east of the Great Belt. In this study, new covariate information in terms of the extreme value statistic *μ _{CGD}* is applied. For durations of 3–48 hours, a significant part of the regional variability can be described by

*μ*, hence allowing a more elaborate assessment of the regional variability as compared to the previous studies. For durations of less than 3 hours, the results of the current study confirm the use of a regional mean model as in the previous studies. Regional model estimates of

_{CGD}*μ*from the different studies are compared in Supplementary Material Table S5. For durations up to 1 hour, a general increase in the regional mean of

*μ*is seen. For larger durations, the range of

*μ*over Denmark shows an increasing trend.

With respect to the L-CV, the current study provides similar results to the previous study, supporting the use of a regional constant L-CV (GP shape parameter). Results from the different studies are compared in Supplementary Material Table S3. For durations of up to 6 hours there is, in general, a decreasing trend towards more negative shape parameters (heavier-tailed distributions), whereas for the largest durations, 24–48 hours, an increasing trend (lighter-tailed distributions) is seen.

*et al.*(2009) are compared in Figure 7. For the 1-hour intensity, there is an increase in the Poisson rate, with a general increase from west (from about 2%) to east (up to about 30%). For the 24-hour intensity, a larger variation in the Poisson rate is seen, ranging from −25% to 58%. For the 1-hour intensity, there is an increase in the mean intensity of about 6%, which is constant over Denmark since the models have a regional constant mean intensity. For the 24-hour intensity, the change in mean intensity varies from −30% to 60%, with a regional pattern similar to

*μ*(Figure 2, top right). For the 1-hour intensity, the changes in the extreme intensities follow the west-east pattern of the changes in the Poisson rate, with an increase between 4% and 12% for the 2 and 100-year return periods. For the 24-hour intensity, the changes in the 2 and 100-year intensities follow the pattern of the changes in the mean intensity. There are both decreases and increases; from −13% to 27% for the 2-year event, and from −26% to 40% for the 100-year event. The main increases are seen in the northern part of Jutland, north-east Zealand, southern islands and Bornholm.

_{CGD}## DISCUSSION AND CONCLUSIONS

A new regional model has been developed for estimation of IDF relationships of extreme rainfall in Denmark. The model is based on 50% more data than used in the previous regional analysis by Madsen *et al.* (2009), and uses new covariate information in terms of gridded rainfall statistics from a dense network of gauges with daily measurements (CGD). The analysis confirms previous results regarding the spatial variability of the Poisson rate; that is, the rate increases for increasing MAP for all durations analysed between 1 minute and 48 hours. With respect to the mean value of threshold exceedances *μ*, significant correlation with the mean extreme intensity from CGD was found for durations between 3 and 48 hours. For durations below 3 hours, *μ* is assumed constant over Denmark in accordance with the previous studies. Finally, the analysis of L-CV of the exceedance magnitudes confirms the previous studies, and a regional constant L-CV (GP shape parameter) is applied in the model. The use of the mean extreme intensity from CGD as covariate information in the regional model allows a more elaborate assessment of the regional variability, and a more consistent estimation of extreme rainfall intensities in Denmark. Based on gridded maps of *μ _{CGD}* and MAP, the IDF relationships can be estimated at an arbitrary site in Denmark.

Compared to the previous study by Madsen *et al.* (2009), there is a general increase in extreme rainfall intensity for durations of up to 1 hour caused by a general increase in the Poisson rate and the mean extreme intensity, and a more negative GP shape parameter. For larger durations, both increases and decreases are seen due to the correlation with *μ _{CGD}* compared to the division into two regions with constant mean extreme intensity in the previous study.

To analyse the impacts of using the temporal heterogeneous dataset, a subsample analysis was conducted including only stations that cover almost the entire observation period. The analysis showed that the relatively larger contribution of station-years in recent years combined with increases in *λ* and *μ* and decreasing (more negative) GP shape parameters give larger estimates of extreme intensities compared to including only records that cover the full observation period in the regional model. The regional model based on the full sample has larger prediction uncertainty of intensity estimates than the model based on the subsample. This is due to the non-stationarities in the data, but may also reflect larger spatial variability in the full sample.

Gregersen *et al.* (2015) analysed long records of daily rainfall dating back to 1874 and found a general increase in the Poisson rate, but overlaid by a multi-decadal variability that indicated a cyclic behaviour. The increase seen in recent years is much larger than the long-term trend but may, at least to some extent, be attributed to the multi-decadal variability seen in the long records. Since it is currently not possible to attribute the recent increases to anthropogenic changes or natural variability, the regional model using the full sample provides the best estimate according to current knowledge of extreme rainfall characteristics and associated uncertainties. Rather than including the non-stationarities in the regional model implicitly as an additional source of uncertainty, a model that explicitly describes non-stationarities in the PDS parameters could be developed. This is currently being investigated.

## ACKNOWLEDGEMENTS

This work was carried out with the support of The Foundation for Development of Technology in the Danish Water Sector, contract no. 7492-2012, and the Danish Council for Strategic Research as part of the RiskChange project, contract no. 0603-00390 (http://riskchange.dhigroup.com).