Extreme rainfall estimation is a long-standing challenge for hydrological hazard assessment and infrastructure design, particularly if considering the need to deal with climate change. Advances in statistical methods and in rainfall data availability allow for frequent updates of regional rainfall frequency analyses. These allow for new estimates that, however, cannot simply replace older ones in the risk management, due to technical, socio-economic and legislative reasons. To preserve compatibility between old and new regional estimates a multi-model approach could be used, where model uncertainties can be combined to help reach a final decision. To make this possible, one has to face the uneasy retrieval of data and results of older analyses and, quite often, non-trivial areal rainfall estimates are needed with all methods. To give an answer to these technical needs, a tool named MultiRain has been developed. The tool computes depth–duration–frequency (DDF) curves, both related to a point and integrated over an area, from multiple regional statistical analyses. The MultiRain procedure is based on Python scripting, geographic information system (GIS) functions and web technologies, and can be performed via web browser or in a desktop GIS environment. A demonstration version has been built using four different regional analyses proposed in a 20-year period for the north-west of Italy.
Risk assessment related to severe precipitation is a growing priority of hydrological practice, due to the significant consequences of these phenomena on human safety, in a context of apparent increase of the number and severity of extreme meteorological events. Most of the adverse hydro-meteorological events that led to about 25,000 fatalities per year across the globe (Munich Re's NatCatService 2017) relate directly to intense rainstorms, which are difficult to forecast, both in location and in terms of expected effects, particularly in urban areas. Intensity–duration–frequency (IDF) or depth–duration–frequency (DDF) curves are commonly the starting point for design and operation of flood protection works (Grimaldi et al. 2011). They are mathematical relationships that allow the estimation of the average rainfall intensity i (or the total rainfall depth h) over a given timescale d for a given return period T (e.g., Koutsoyiannis et al. 1998). From the DDF one can obtain a design hyetograph, i.e., a reference rainfall sequence to be used in rainfall–runoff modelling with various methods (e.g., Alfieri et al. 2008). For the evaluation of the DDF curves in ungauged sites, or averaged over an area, regional rainfall frequency analysis (RFA) is required, and many studies have been carried out to predict rainfall depths with a certain probability of occurrence in a generic location within large areas (see, e.g., Reed et al. 1999; Svensson & Jones 2010). The estimation is usually carried out by interpolating spatially the data or the distribution parameters estimated at the station points (e.g., Myers 1994; Libertino et al. 2018), or by pooling the data to form statistically homogeneous regions (see, e.g., Hosking & Wallis 1997).
It is not uncommon that different methods of RFA of extreme precipitation exist for the same regions. Quite often they adopt different techniques for identifying homogeneous regions and use different probability distributions or different methods for estimating distribution parameters. Most of the time, a new RFA is based on the availability of new data, even only for a limited number of years or of stations. The rationale for not totally replacing the older results with the new ones is that the hydrologic design figures used to assess the reliability of infrastructure need to be consistent over time. That is, if the design of a given infrastructure is based on estimates obtained by an ‘old’ study, the design of related and successive works should comply with the previous one, even though data and knowledge have increased and improved over time. At the same time, the institutions in charge of the risk management of infrastructure are very interested in the results of new methodologies, that use advanced geo-statistical techniques and data with a higher spatio-temporal resolution, to face the challenges of climate change. In fact, even if the widespread perception of an increase in the frequency of severe rainstorms does not yet find clear confirmation in the scientific literature, local correlations between the increase in temperature and the intensification of extreme rainfall are being consolidated (e.g., Westra et al. 2013).
To help preserve the consistency in design rainfall estimates over time, some scientific studies have promoted the simultaneous use and association of different methodological approaches over the search of a single ‘best model’, sometimes hard to identify (see, e.g., Shamseldin et al. 1997; Di Baldassarre et al. 2009; Velázquez et al. 2011; Okoli et al. 2018). The so-called model-averaging is particularly efficient when the goal is the estimation of quantiles for large return periods (Claeskens & Hjort 2008). For these high quantiles, the high estimation variance generally allows different distribution functions to be admissible after the hypothesis testing. The need to comply with this undetermined model selection assumes practical significance, for example, when considering that hydraulic safety of dams in Italy relies on the estimate of a 1,000-year flood, and that for small basins it is very important to check that estimate using regional flood frequency analyses and rainfall–runoff methods.
The above considerations underline the usefulness of merging and harmonizing the information deriving from different methods which have evolved over time. This process, which includes the above-mentioned model-averaging, is called here a multi-model approach for the selection of the hydrologic design values. Non-trivial technical difficulties arise for the practitioners when managing this merging process, deriving from the different variables required by the methods, the different detail of reproducibility of estimates, the topographical bases used (from paper maps to geographic information systems (GIS)), the different probability distributions, etc.
This paper presents the technical rationale of the development of a tool that allows a fast comparison and combination of the results from different rainfall mapping studies (see, Grasso et al. 2018). More specifically, this tool, named MultiRain, is a set of GIS-based procedures designed to produce concurrent DDF estimates from various regional frequency analyses both at a point and over an area. To better describe the features of the MultiRain product, all its building phases are presented, starting with the case study of the north-west of Italy, where four existing RFA studies are available. The four RFA methodologies are first briefly summarized, along with some details required for the customization of the software tool. The MultiRain general features are then described, with a focus on the solutions adopted for harmonizing the presentation of the results of the four methods. An application of the MultiRain customized for north-west Italy is finally presented, with regard to the assessment of the rarity of a real rainfall event, where the results can be obtained with the combination of the different approaches demonstrating the tool capabilities.
DATA AND METHODS
Study area and available RF analysis
In this section, the case study will be introduced, and the available extreme rainfall mapping analyses will be described with the details needed to explain the customization required by the software tool. The case considered here refers to the Piemonte region, an area of about 25,000 km2 in the north-west of Italy. Four RFA procedures are currently available for the area of interest, each one providing maps and procedures for the estimation of the DDFs for each location in the region. Automatic procedures for DDF computation are partly available only for one of the methods, along with some GIS support. Automatic areal rainfall estimates are not possible for all the methods. In some cases, the RFA parameters are provided through non-spatial tables or non-properly geo-referenced maps, that poses critical technical challenges for their systematic application.
In Equation (1), ) is the design rainfall depth, is the mean depth–duration curve (the index value for the duration d) and KT is the growth function, i.e., the inverse of the non-dimensional frequency distribution, expressed as a function of T.
Three RFAs were built using almost the same database; however, different statistical approaches are adopted, and extreme precipitation quantiles can differ considerably. A brief description of the methodologies is provided below, focusing on the characteristics that can make a direct (manual) comparison quite cumbersome.
PAI-Po design rainfall map (Po River basin)
The PAI-Po procedure has been developed by the Po River Authority on the 60,000 km2 area covering the Po drainage basin, in the context of the Po River Hydrogeological Master Plan (PAI – Piano Stralcio per l'Assetto Idrogeologico, AdbPo 2001). This latter organization represents the framework behind the hydrologic and hydraulic risk assessment plan, providing rules, limitations and guidelines to be used in land and urban planning in the Po River basin.
The analysis has been carried out on 277 rain gauges, distributed over the whole basin area using the time series of the Servizio Idrografico e Mareografico Italiano (SIMN) of annual maxima of rainfall depth collected for the durations: 1, 3, 6, 12 and 24 hours. Data from 1929 to 1989 were considered.
The and parameters are then interpolated on a 2 km spatial grid using ordinary kriging, to estimate DDF curves at ungauged locations. All the parameters in the grid covering the Po basin are provided in a table in which the location of each cell is obtained by sequential ordering of rows and columns indices. Querying individual cells on an online map is also possible (AdbPo 2001).
As this procedure does not provide results according to the index-rainfall framework, a systematic conversion has been proposed here, to make PAI-Po results easily comparable to the other RFAs. All details are provided in the Appendix.
VAPI – De Michele & Rosso RFA (NW Italy)
For the whole of north-west Italy, a study was conducted (De Michele & Rosso 2001) in the framework of a research project for the estimation of floods in Italy (VAPI project), driven by the National Group for the Prevention of Hydrogeological Disasters (CNR-GNDCI) in the 1990s. The VAPI project represents an important pillar, at national level, in hydrological risk assessment, since its main objective was to produce methods for extreme rainfall and flood estimation at the national scale.
In this analysis (De Michele & Rosso 2001), a database of 366 stations was considered: 270 in the Po River basin and 96 in Liguria, out of the Po River basin. Station record length was at least 20 years and the average sample length was 34, ending up with a database not very different from that of the PAI-Po study. DDF curves were estimated through the index-rainfall method, using a generalized extreme value (GEV) distribution (Jenkinson 1955) to represent the growth curve KT. Parameters were estimated using the PWM method (Hosking et al. 1985). Ordinary kriging interpolation was used for mapping the parameters of Equation (2) and of the GEV at ungauged sites. The maps of a, n and of the GEV parameters (Equations (1) and (2)) for north-west Italy (including Lombardia, Piemonte, Valle d'Aosta, Liguria and Emilia-Romagna regions) were made available as images in pdf format (see Figure 1), so that the numeric information cannot be extracted.
VAPI – Villani RFA (Piemonte region)
The VAPI – Villani RFA (Villani 2003) was later developed for the Piemonte region according to the standards suggested by CNR-GNDCI. It uses a hierarchical approach based on the two-component extreme value (TCEV) distribution (Rossi et al. 1984), which takes into account separately the estimation of parameters of the ordinary and of the extra-ordinary components of the curve. The analysis was carried out on 517 rain gauges with at least 10 years of observations, adding some rainfall data collected by Regione Piemonte and by some neighbouring French institutions during 1913–1986 to the main database of the national hydrologic service (SIMN). The spatial mapping of the TCEV parameters is obtained through a noiseless kriging (Furcolo & Villani 1998). Note that the extraordinary component of the growth curve resulted as not significant in a large proportion of the region. In that case, the TCEV collapses on a Gumbel distribution, with the higher-order parameters set as and . Together with the publication of the procedure report (Villani 2003), a support software (named Va.Pi.) was developed, aimed at the production of DDF curves for single stations and for a given set of Piemonte basins. However, the limited distribution of Va.Pi., which was originally developed for Windows98 and never updated afterwards, discouraged its use. The Va.Pi. map grids have a resolution ranging from 1 km to 5 km, depending on the variability of the represented data.
Patched Kriging RFA (Piemonte region)
The Patched Kriging procedure (Libertino et al. 2018) was undertaken to take advantage of a consistent increment in the data availability that had occurred since the previous analyses were published. The study uses annual extreme data from 1913 to 2010, gathered from the SIMN and from the regional hydrologic services, for the durations 1, 3, 6, 12 and 24 hours. The methodology involves the sequential application in time of the ordinary kriging equations, producing a homogeneous dataset of synthetic series with uniform lengths. On these series, the index method is adopted and the IDF curves are computed according to Equations (1) and (2). The authors decided not to choose a ‘best’ probability distribution and to provide a map of the a and n parameters of the DDF and of the first three sample L-moments. This allows the end-users to choose the most appropriate probability distribution after autonomously checking of the goodness of fit. Results are provided on a 250 m gridded domain covering the areas of the Piemonte and Valle D'Aosta regions. After resampling on a 1 km grid, the results were included in a digital map embedded in a Web-GIS called Atlante delle piogge intense in Piemonte (ARPA Piemonte 2013) developed by the Regional Agency for Environmental Protection.
The MultiRain tool development
The software tool MultiRain described in this work operates in a geographic consistent framework, that facilitates the selection of the location of interest and the comparison of results. This is intended to allow easy comparison of results of different RFA methodologies and to compute the areal-averaged DDF curves within a basin. In the first part of this section we describe the customization problems we tackled to allow the uniform management of the existing RFAs in the tool. In the second part, the main approach adopted and the features that describe the possibilities of generalization for the adoption of MultiRain are presented. In particular, data preprocessing techniques are described, along with a brief explanation of the implementation process and of the technological details of the GIS application.
Customization issues for the Piemonte region
The main goal in the customization of the MultiRain tool is the building of a unique geospatial database containing all the necessary elements for the application of the different RFA methods available. Problems that can arise in this task derive from the heterogeneous criteria for mapping the parameters (e.g., digitized scanned images, tabulated values, ASCII raster data with different formats and resolutions) and by the methodological differences between the various regionalization approaches (e.g., the homogeneous region procedures are based on vector maps while the methods that involve the continuous spatial variation of the parameters refer to raster maps). To exemplify, some details of the customization of the four RFA analyses in Piemonte are provided below.
The PAI-Po methodology proved to be the most difficult to customize, as it provides the and parameters of Equation (3) on a 2 km grid, for T= 20, 100, 200 and 500 years. Index curve (Equation (2)) and growth curve (Equation (3)) are not provided. Starting from the knowledge that the PAI-Po procedure adopts a Gumbel distribution, for each cell of the gridded domain, the transformation of the results (provided as in Equation (3)) to a mapping of a and KT parameters was accomplished by following a procedure reported in a master thesis (Magro 2007) briefly explained in the Appendix.
After the reconstruction described in the Appendix, raster maps of parameters a and n (Equation (2)), and of the two Gumbel parameters and have been built. The computerized procedure needed to provide the tool's functionalities, compute the DDF curves for whatever return period in a single cell, and is applicable to all Gumbel-based RFAs. After customization, an extensive testing was done to compare rainfall quantiles obtained from the tool with the ones available, e.g., to selected stations from the original report.
For the VAPI – De Michele & Rosso method, the main issue to tackle was the need to digitize the parameters' maps, which are only available in paper format and as .pdf image (see, e.g., Figure 1 (left)). For this purpose, we used an image processing procedure implemented in Matlab for extracting the colour information from the .pdf map, relating the extracted colours to the provided colour bar, and attributing a numeric value to each pixel of the image. The digital images were first noise-cleaned, i.e., unnecessary graphical elements, such as the grid lines, the station indicators, etc. were removed (see Figure 1 (left)). The numeric values derived from the Matlab procedure were then converted into raster format, suitably geo-referenced and then merged, to connect the two separate maps provided for the Po basin and the Liguria catchments. An example of the resulting map obtained for the n coefficient is shown in Figure 1 (right).
For the VAPI – Villani RFA, we obtained from the Va.Pi software folder the raster files of parameters a and n for the mean, and of , , Geo-referred parameter maps were obtained converting the raster from a binary grid format (.grd) to the ASCII format (.asc) and referring the cell centroids to the WGS84/UTM Zone 32N coordinate system. In the cells where , the TCEV distribution is known to reduce to a Gumbel distribution, and this was reproduced in the MultiRain tool.
The maps related to the Patched Kriging methodology (Libertino et al. 2018) have been requested from the authors in raster format. Using the provided L-moments raster maps with a 250 m × 250 m resolution, we have computed the and parameters of the Gumbel distribution and the , and parameters of the GEV, allowing evaluation of the growth curve KT with two probability functions. This procedure has entailed minimum pre-processing customization efforts, as the raster maps of a and n were also available and already georeferenced.
MultiRain technical features
In the early stage of the MultiRain tool development, a Python plugin for the QGIS software (i.e., ‘QGIS local plugin version’) was built. In that version, the digitized regional maps, necessary to infer DDF curves at ungauged sites, were inserted into the plugin configuration package, to be used locally. Subsequently, with the aim to more easily distribute GIS data and functionality over the internet through standard internet protocols, specific web services have been implemented according to the OGC Web Processing Service (WPS) specification, accessible both by web browser and by using some desktop GIS programs. The use of server-side scripting allows the users to access calculations independently of the underlying software and avoid the problem of updating the plugin to keep it compatible with the updates of the various GIS programs. Moreover, data do not need to be stored locally (client-side) but are maintained by the hosting entity and can be updated and expanded on the server, without the need for the users to download updates. It should be added that computing times are usually shorter than those of the client scripting. This WPS format appeals to practitioners and planners as it reduces uncertainties and calculation errors and makes it possible to obtain DDF curves in an automatic way.
The connection to the WPS services is also available through a Desktop GIS software (i.e., QGIS, ArcGIS). This setting is interesting for several reasons: it allows the use of the geospatial algorithms available in the GIS software and the integration of data stored locally. Furthermore, results can be visualized directly on the map canvas enabling integration and overlay of spatial information from different sources in order to gain a deeper insight into relationships and scenario evaluation.
For sharing these procedures, a web platform has been developed, with free and open-source software, using PyWPS to set up the WPS processes, GRASS GIS as a backend to access all the geoprocessing functionalities, GisClient3 to build the WebGIS (accessible through client browsers) and Apache as web server. GisClient3 is an interesting web authoring tool configurator for PostGIS and MapServer that has the ability both to build Mapfiles and to provide OpenLayers maps. All the spatial data, necessary for RFAs implementation, have been registered in the GRASS database.
Despite all the pros deriving from the use of the WPS service, the use of the ‘QGIS local plugin version’ remains attractive, as it allows researchers to further explore rainfall frequency analysis and to carry out enlarged spatial analysis and also provides to any Python and QGIS skilled user the possibility to add other RFA methods.
The MultiRain tool, despite differences in scripting, has the same capabilities and provides the same benefits for the user in both the developed versions (the ‘QGIS local plugin’ and the WPS one). The user can either provide a point (as a shapefile or by inserting its coordinates) or select an area of interest (by uploading a proper shapefile). In the first case, the results are referred to the values of a pixel; in the second case, DDF curves of all the cells falling into the basin divide are computed and averaged spatially. Both versions of the tool allow for the following operations for all the available methodologies:
query of all the available maps, for extraction of the parameters (point or areal);
computation of the index rainfall curve ;
production, in a text format, of the values of the local or areal parameters of the DDF curves;
production of a graphical representation of the DDFs for some significant return periods.
Moreover, if the user specifies a return period T and a duration, the available scripts:
compute the growth factor KT;
provide the design rainfall for the duration and return period requested;
provide a graphical representation of the DDFs for the considered return period and duration, obtained with the different methodologies.
The WPS procedures implemented can be invoked client-side in two ways: directly from the built web-mapping application, accessible from the project website (soon available at http://www.idrologia.polito.it/MultiRainTool), or from any GIS desktop software that supports the OGC WPS standard (some software may require external plugins). In the first case, the user can either upload via web browser the polygon vector file of the area of interest or proceed, directly online, using an automatic watershed delineation service (performed using GRASS functionalities starting from a digital elevation model map, based on the NASA SRTM, designed by Ganora et al. (2016)). In the second case, the user can execute the WPS processes choosing the vector layer that delimitates the basin of interest from those already available into the current GIS session.
RESULTS AND DISCUSSION
The final configuration of the MultiRain tool customized for the Piemonte region is shown here with a series of screenshots and related comments.
Figure 2 shows an example of the execution of one of the WPS procedures from the QGIS desktop software using a WPS Client plugin. In particular, the figure represents the results of the application of the WPS process to estimate DDF curves with the PAI-Po methodology for a given catchment (the vector layer added into QGIS project and displayed in the map canvas) with return period T= 500 years and duration d= 12 hours. The WPS procedure provides data outputs as ‘text/plain’ and ‘image/svg’. The values of the areal parameters of the DDFs and the rainfall quantile for the duration and return period requested are reported in the textbox, while a graphical representation of the DDFs for the considered T and some others significant return periods are represented in a svg format.
When the WPS process is executed directly from the web-mapping application, textual results are displayed in a webpage message box, where an URL-endpoint for retrieving the created graph is provided. In case the watershed divide is not available, it can be delimited through a WPS procedure available through the web browser, as depicted in Figure 3, starting from the definition of the outlet's coordinates.
The MultiRain tool QGIS plugin version allows for more in-depth analysis not only at the basin scale, but also at regional level. Indeed, the plugin provides not only the graphs or the parameters' value of the DDF curves for the area of interest, but also allows maps to be built of the quantiles and of the growth factors for a selected return period. This last functionality is applied to build maps of Figure 4, where the rainfall quantiles for T= 1,000 years and d= 3 hours are represented for all methods. Consistent differences among the obtained quantiles are apparent, as well visible is how the final grid resolution reflects the different spatial resolution of the original analyses.
Application example and discussion
To demonstrate the capabilities of the MultiRain tool, the version we have customized for NW Italy is applied with regard to the assessment of rarity of a real rainfall event, where the results that can be obtained with the different approaches demonstrate a typical experience from the practitioner viewpoint.
The example concerns the Pellice River, a basin with an area of about 216 km2, which was affected by significant flooding in November 2016 (ARPA Piemonte 2016) generated by a strong and persistent precipitation event. The chosen reference rainfall station is Luserna San Giovanni (361006, 4963859 in the reference system WGS84 – UTM 32N), positioned at the outlet of the basin (see black triangle in the maps of Figure 4). The MultiRain tool has been applied to estimate the return period of the rainfall event, through the four described methods.
Five different families of DDF curves are obtained, since the Patched Kriging model can estimate quantiles with both the Gumbel and the GEV distributions. The curves obtained with T= 500 are represented in Figure 5 both at the station location (left) and averaged at the basin scale (right). Comparing local and areal curves, one can notice that the spatial average reduces the differences that emerge on site estimation.
In order to estimate the return period of the November 2016 event, the maximum rainfall depths for 1, 3, 6, 12 and 24 hour durations, recorded during the storm were considered (see Table 1, column 2) and compared with the DDF curves of the different RFAs, exploiting the potentialities of MultiRain. The return periods for a 24 hour duration is estimated by all methods between 200 and 500 years, with the exception of the VAPI – De Michele & Rosso method, that suggests a return period of over 5,000 years. Figure 6 (left) shows the graph where this assessment is made using the most recent and data-rich method (the Patched Kriging) using a Gumbel distribution.
|.||.||Return time T (years) for different models|
|Duration (h) .||Cumulated rainfall depth (mm) .||PAI .||VAPI Villani .||VAPI Rosso .||Patched Kriging Gumbel .||Patched Kriging GEV .|
|.||.||Return time T (years) for different models|
|Duration (h) .||Cumulated rainfall depth (mm) .||PAI .||VAPI Villani .||VAPI Rosso .||Patched Kriging Gumbel .||Patched Kriging GEV .|
To provide wider understanding of what can happen in similar situations, when various methods provide significantly different results and it is not possible to identify a ‘best’ model, the MultiRain tool helps in adopting a model-averaging approach, commonly considered a robust way to manage parameters and model uncertainty in quantile estimation.
In detail, for a certain return period, the averages of the a, n and K(T) among the available methodologies can be computed to obtain an averaged curve. Differential weight can be given to the different estimates in the averaging phase, according to the expertise of the user. As a demonstration of the model-averaged result, the T= 500 averaged curve is computed, giving equal weight to the considered RFAs, and shown in Figure 6 (right) together with the 2016 event dots. The result of the overlapping of the data with the curve shows that a robust estimation of the return period for the 24 hour duration is closer to T= 500 than to T= 200 or to T= 5,000 years.
When hydrologic design values encounter updates of the estimation procedures, information deriving from different sets of data and of regional statistical estimation procedures needs to be harmonized or merged. In other conditions, e.g., for critical infrastructures, when a ‘best model’ is difficult to identify, the multi-model approach allows for the consideration of a wide range of scenarios and the adoption of robust estimates in the design and control phases.
In this framework, the MultiRain tool presented here aims at providing a fast operational aid to apply a multi-model approach to the regional rainfall frequency estimation. The tool offers estimates of depth–duration–frequency curves both for a generic location in the region considered, and as a spatial average. The estimate and comparison of spatial averages is given when the user uploads a shapefile of the contour of the area of interest (usually a watershed divide). The MultiRain Web Processing Services implemented, accessible both by web browser and by desktop GIS software, allows the user to carry out a thorough comparative analysis based on the results derived from different methodologies, without installing software and downloading data.
Despite the web platform being developed in a modular and reproducible way to allow future integrations, the need for maintaining appropriate control on the security of the data, of the scripts and of the Web server itself has led us to keep the online platform closed to editing. All the extensions and editing of the MultiRain WPS Service must be performed by the authors. Skilled users, interested in expanding the potentialities of the tools, can take advantage of the free availability of the ‘QGIS local plugin version’, that allows the source code to be edited and the integration of further data and methodologies.
Future developments will be addressed first and foremost at integrating the tool with procedures for improving its customization, needed for its implementation in further regions.
Final refinements to the user-interface of the web services are in development and the MultiRain tool web page will be available soon at: wwww.idrologia.polito.it/MultiRainTool.
The Supplementary Material for this paper is available online at https://dx.doi.org/10.2166/hydro.2019.016.