Estimating the loss of life (LOL) resulting from dam failures is required for devising emergency action plans and strategies for alert issuance and evacuation. However, current models for simulating fatalities are computationally expensive, forced by highly uncertain variables, and not readily interpretable, which may limit their use in engineering and research. To circumvent these problems, we utilize the polynomial chaos expansion (PCE) technique for approximating the LOL, as obtained from the agent-based model LifeSim, and for propagating the uncertainty of the inputs, namely, alerted population, mobilized population, alert issuance, and hazard identification, to the model responses. We also benefit from the PCE spectral representation for assessing the effect of each input on the LOL associated with a dam failure in an urban area in Brazil, considering efficient and inefficient scenarios for alert and evacuation, during the day and night. The PCE error ranged from 10⁻³ to 10⁻², and the mean squared error between the metamodel output and LifeSim was between 1 and 2 fatalities. In the global sensitivity analysis, the variables alert issuance and hazard identification contributed the most to the number of fatalities. These findings provide objective guidelines for implementing more effective safety measures, potentially reducing the LOL resulting from a dam break in the study area.

  • We utilize a metamodel as a surrogate for a computational agent-based loss of life model.

  • We propagate input uncertainty through the metamodel.

  • We perform global sensitivity analysis by using Sobol indexes.

  • This research brings together information directly linked to major dam failures, loss of life and how to reduce them with efficient warning and evacuation systems, analyzing the uncertainties in hydrodynamic models.

Simulating the number of fatalities is a paramount step for assessing the consequences of dam failures (Lumbroso et al. 2021). In effect, estimating such a variable may be useful for evaluating critical scenarios and defining appropriate strategies for reducing and communicating risk, either at the design stage of a hydraulic structure or for adapting existing policies for flood mitigation and prioritizing safety measures. Hence, in view of the growing number of constructed dams and, accordingly, the higher number of people at risk, recent literature has focused on advancing knowledge and improving models for more accurately describing the potential loss of life stemming from dam failures (Kalinina et al. 2021; Lumbroso et al. 2021; Silva & Eleutério 2023; Peng et al. 2024; Wang et al. 2024).

In general, models for estimating loss of life differ in complexity and in the underlying structures and are usually classified as empirical and dynamic (USACE 2021). Empirical models (e.g., regression equations) are based on historical cases, considering the mortality rate of the population at risk and the characteristics of the flooding event (Dawson et al. 2011). Dynamic agent-based models, on the other hand, rely on spatial information on the flooding event, the exposed buildings, and the population vulnerability, and account for the interaction of these factors on the individual's decision-making and on the causes of fatalities (Lumbroso et al. 2021).

According to Dawson et al. (2011), agent-based modeling is the most suitable approach to address the challenges of simulating processes and consequences of flooding events, as such models are designed to capture dynamic interactions and responses in a spatial environment. Under this rationale, the dynamic models Life Safety Model and LifeSim have stood out in the scientific literature (Aboelata & Bowles 2005; Johnstone et al. 2005). In particular, Kalinina et al. (2021) discussed the differences between the models, pointing out that the Life Safety model is intended to evaluate the individual behavior during flooding events (a person or a car), whereas LifeSim extends the simulation to large spatial scales, i.e., considering the analysis of many individuals, and provides alert and evacuation scenarios – which hence broadens its scope for application in areas with more heterogeneous land uses, complex and irregular human occupation, and more intricate evacuation paths.

The LifeSim model was initially incorporated in a simplified version into the Hydrologic Engineering Center's - Flood Impact Analysis (HEC-FIA) program by the U.S. Army Corps of Engineers (USACE 2015), and then fully integrated into HEC-LifeSim (USACE 2018), which was recently updated to LifeSim 2.0 (USACE 2021). The model allows for simulating scenarios related to warning and evacuation systems for the exposed population (USACE 2021). As the progression of the flooding event and the actions of the population exposed to the danger are dynamic processes, LifeSim simulates the interactions between processes considering three main modules: Shelter Loss Module, Alert and Evacuation Module, and Loss of Life Module (USACE 2021). In more detail, LifeSim attempts to describe the behavior of the affected population, conditioned to shelter conditions (type and height of buildings), demographic features (total population, density of occupation, and age), and evacuation paths (road network and escape destinations), when an alert is issued as a response to a flooding event that exceeds some critical threshold.

Besides applications in engineering, LifeSim has been increasingly utilized for research purposes, mostly for assessing influential factors associated with loss of life in different parts of the world (e.g., Bilali et al. 2021, 2022; Kalinina et al. 2021 and references therein). It should be noted, however, that LifeSim is affected by many distinct sources of uncertainty, particularly those related to highly uncertain (or poorly estimated) inputs, or even to missing or faulty inputs, many of them averaged over large areas, and to the boundary/initial conditions for the simulations (e.g., the time of the dam failure or the time of the alert). As a result, model outputs (loss of life) can be widely dispersed around the median point prediction. Thus, quantifying and propagating input uncertainty is necessary for both model scrutiny and decision-making processes.

To accommodate the outlined uncertain conditions, LifeSim resorts to Monte Carlo simulations (USACE 2021). In this case, model inputs (but not necessarily all model parameters) are treated as stochastic variables, which are independently sampled at each model run. Conditionally on a particular input vector, the output is deterministically obtained, and, by rerunning the model with a large number of new random inputs, an ensemble of predictions is obtained. Arguably, this is a computationally expensive approach, as LifeSim is a heavily parameterized model. Hence, devising a less expensive framework for propagating input uncertainty in practical applications is desirable – this would allow exploring the output space in a more comprehensive manner (through a much larger number of runs) and tracking the most critical trajectories for the model responses (Sudret 2007).
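The Monte Carlo logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_loss_of_life` is a hypothetical stand-in for one deterministic model run (it is not LifeSim), and the uniform input ranges are invented for the example.

```python
import numpy as np

def toy_loss_of_life(alerted, mobilized, issuance_delay, identification_delay):
    """Hypothetical stand-in for one deterministic model run (NOT LifeSim)."""
    exposed = 1000.0 * (1.0 - alerted * mobilized)       # people still at risk
    delay_penalty = 1.0 + 0.5 * (issuance_delay + identification_delay)
    return 0.01 * exposed * delay_penalty                # toy fatality count

def monte_carlo(n_runs, seed=0):
    """Sample the inputs independently and evaluate the model once per sample."""
    rng = np.random.default_rng(seed)
    alerted = rng.uniform(0.5, 1.0, n_runs)              # illustrative ranges only
    mobilized = rng.uniform(0.5, 1.0, n_runs)
    issuance = rng.uniform(0.0, 2.0, n_runs)             # hours, assumed
    identification = rng.uniform(0.0, 2.0, n_runs)       # hours, assumed
    return toy_loss_of_life(alerted, mobilized, issuance, identification)

# Each entry of the ensemble is the deterministic output for one input vector.
ensemble = monte_carlo(10_000)
summary = {
    "median": float(np.median(ensemble)),
    "p05": float(np.percentile(ensemble, 5)),
    "p95": float(np.percentile(ensemble, 95)),
}
print(summary)
```

With a model as costly as an agent-based simulation, each of the 10,000 evaluations above would be a full run, which is the cost a surrogate is meant to avoid.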

Surrogate models (or metamodels) comprise an appealing alternative for dealing with the high computational costs of agent-based counterparts. Metamodels are flexible mathematical structures that can be estimated from only a few runs of the original numerical model and then utilized for extensive simulation at reasonable costs (Shields et al. 2019). Surrogate models have been applied in several fields of knowledge, such as the investigation of bistable energy collectors (Norenberg et al. 2022), wind turbine blades (Pavlack et al. 2022), marine turbines (Nispel et al. 2021), and urban drainage (Nagel et al. 2020). Particularly with respect to the estimation of loss of life, Kalinina et al. (2021) applied surrogate models to a hypothetical dam in Switzerland and performed an ad hoc sensitivity analysis on some input data from the LifeSim model – which included population characteristics and the delay in communicating hazards and in alert issuance. The authors concluded that, among the analyzed variables, the total population was the one that most strongly affected the model's outputs.

Within a broader class of surrogate models, in this paper, we utilize the polynomial chaos expansion (PCE) technique (Sudret 2007) for estimating the loss of life related to dam failure. In short, PCE maps a given computational model into a combination of orthonormal polynomial functions, which allows the inexpensive computation of the deterministic model outputs (Marelli et al. 2022). Moreover, PCE is a well-suited technique for global sensitivity analysis as it allows trivially decomposing the variance of the model outputs, from which the Sobol coefficients of all orders, which summarize the contribution of the inputs, can be readily retrieved. Hence, as opposed to the current LifeSim implementation, estimating the influence of each of the input variables on the model responses through a PCE model is straightforward, which may be helpful for more accurately assessing risk and establishing more effective evacuation plans for critical (pre-defined) dam failure events.

We apply the PCE metamodel to a hypothetical dam failure study in the city of Belo Horizonte, Brazil – a densely populated urban area with a variety of economic activities that strongly modulate the affected population in distinct scenarios of failure (e.g., daytime and nighttime). The objective of the study is to provide a comprehensive assessment of loss of life estimation under distinct initial/boundary conditions for simulation, as well as to investigate how the input variables (and their interactions) affect the model estimates as the dam failure scenarios change. The main novelty of the study is that we focus on alert and evacuation systems, which are paramount for the elaboration of emergency action plans (EAPs) and for reducing the number of fatalities associated with dam failures, as discussed and presented in the papers by Lumbroso et al. (2021) and Kalinina et al. (2021), but, to the best of our knowledge, have not been tackled within previous research. The paper is structured as follows. In the next section, we describe the study area and utilized datasets, as well as the formalism and underlying assumptions of the PCE model. We then discuss the obtained results related to loss of life and perform the global sensitivity analysis to identify the more influential input variables. Lastly, we present the main conclusions of the study and the envisaged research developments.

The methodology of the study can be summarized in the flowchart in Figure 1.
Figure 1

Methodology flowchart.

Case study – the Pampulha Reservoir

We utilize LifeSim and PCE for studying the loss of life associated with a hypothetical dam failure at the Pampulha Reservoir, which is located in Belo Horizonte, Minas Gerais (Figure 2). The dam is located in a densely populated urban area, which causes a large number of people to be exposed to a dam failure. In addition, a variety of activities in business and services are developed in the downstream valley, which provides a complex occupation dynamic with marked distinctions in the population distribution across the study area at different times of the day. Hence, this case study might offer valuable insights into critical scenarios of alert and evacuation in an urban area. The main characteristics of the Pampulha Reservoir Dam are summarized in Table 1.
Table 1

Data from the Pampulha Dam

| Description/feature | Value |
|---|---|
| Geographic coordinates | Latitude 19°50′44.69″S and longitude 43°58′1.43″W |
| Purpose | Flood attenuation/landscaping |
| Year of construction/inauguration | 1936/1938 |
| Crest elevation of the massif | El. 805.00 m |
| Elevation of dam foundation | El. 785.00 m |
| Normal maximum water level elevation | El. 801.00 m |
| Overall height | 20.0 m |
| Crest length | 450 m |
| Total reservoir volume (up to crest) | 30,084.312 m³ (El. 805.00 m) |
| Usable volume up to the threshold | 10,009.628 m³ (El. 801.00 m) |
| Massif | Compacted earth |
| Emergency spillway | Side channel with a width of 32.00 m (El. 801.00 m) |
| Auxiliary spillway | Tulip with a diameter of 12.54 m (El. 801.50 m) |
Figure 2

Location of the study area – Pampulha Reservoir.

The Pampulha Reservoir began operating in 1938, with the main purpose of supplying water to the municipality of Belo Horizonte. In 1954, internal erosion (piping) caused the dam to fail, with a volume of about 12.6 million m³ being routed downstream. Despite the material damages related to this event, no lives were claimed by the flood wave. In 2016, the surroundings of the Pampulha Reservoir, which encompass the lagoon and the architectural monuments, received the title of Cultural Heritage of Humanity, granted by the United Nations Educational, Scientific and Cultural Organization (UNESCO). Currently, it is a cultural and leisure symbol in the city of Belo Horizonte.

Hydrodynamic model

The hydrograph resulting from the hypothetical failure of the Pampulha Reservoir Dam (Figure 3) was computed by Nascimento et al. (2020) when preparing the emergency action plan. The failure mode was internal erosion (piping), as this was deemed the most likely to occur by the referred authors.
Figure 3

Hypothetical dam failure hydrograph.

The hydrodynamic flood model was computed using the Hydrologic Engineering Center's River Analysis System (HEC-RAS) software, version 6.3.1, considering the characteristics of the dam and the mentioned rupture hydrograph. The water depths in the flood zone ranged from 0.0097 to 14.5 m, with the greatest depths associated with the Ribeirão das Onças River (Figure 4). Most of the structures affected by the flood wave are commercial and residential buildings and viaducts; in addition, the Pampulha Airport is very close to the dam.
Figure 4

Maximum depths at the downstream valley of the Pampulha Reservoir due to dam failure.

Characterization of the downstream valley

For characterizing the downstream valley, we mainly relied on secondary data. The affected structures and population at risk were retrieved from the study of Nascimento et al. (2020), which developed the EAP for the downstream valley of the Pampulha Reservoir in the event of a dam rupture. The delimitation of the flood-affected area, as well as the survey of population in households, were, in turn, obtained from the statistical grid developed by the Brazilian Institute of Geography and Statistics (IBGE 2016).

This statistical grid allows a more detailed analysis of territorial divisions and provides data in smaller geographic units, which are composed of a set of regular cells that divide geopolitical territories. This allows the integration of data from distinct sources that were originally reported in incompatible geographic units. The smallest geographic unit, as obtained from the Brazilian demographic survey, is the so-called census tract, which does not have a homogeneous form. Thus, through statistical processes of aggregation and disaggregation, the information contained in these census tracts (population and households) is resampled to the 1 × 1 km resolution in rural areas and the 200 × 200 m resolution in urban areas – which now comprise homogeneous time-invariant units for modeling purposes (IBGE 2016).

For defining the statistical grid to be analyzed in our case study, we considered the limits of the dam failure flood map and a 100-m offset as a buffer. Then, 428 statistical grids were extracted for further analysis. From the delimited area, data were extracted using the QGIS geoprocessing tool v. 3.36.1, from which 107,029 inhabitants and 32,868 households were identified in the flood zone.

In addition to the information from the IBGE demographic survey, we utilized the geospatial information collected by the Informatics and Information Company of the Municipality of Belo Horizonte (PRODABEL), which is publicly available at the Geoprocessing Portal of information of the City of Belo Horizonte. In particular, we extracted the use and territorial occupation of the municipality in 2022 from the PRODABEL database. Figure 5 depicts the outlined information across the area of interest.
Figure 5

Uses and occupation of the downstream valley of the Pampulha Reservoir. Source: Adapted from PRODABEL (2022).

With these datasets, we were able to allocate the information extracted from IBGE's microdata. This allocation comprised two main stages. The first was the use of the algorithm built by Nascimento et al. (2020) to read the data in the form of numbers associated with each piece of information. In addition, the algorithm was updated so that people with disabilities (blind, deaf, limited mobility, and mentally disabled) could be aggregated into the variable ‘people with limited mobility’ in the LifeSim software. After reading these data, we proceeded to the characterization of the structures identified in the area of interest with the QGIS geoprocessing tool, which allows the proportional arrangement and the homogeneous distribution of the information among the households allocated in the study region.

To define the number of floors of the buildings, we assumed a homogeneous vertical distribution of households, i.e., the same number of apartments per floor. In addition, we considered buildings with four floors in the ‘apartment’ typology.

Regarding the variables related to the presence of people at home during the day and at night, the following hypotheses were adopted:

  • For the nighttime period, people who return home every day were considered;

  • For the daytime period, we considered people who do not work, those who perform domestic work at home, those who work at home, those who study in the morning and do not work (from the sixth year of elementary school to the third year of high school), and 50% of people who attend higher education and do not work;

  • The distribution of people over and under 65 years was carried out respecting the total population of each household.

Based on these assumptions, the number of people in households during non-business hours (02:00 a.m.), 105,577 people, and during business hours (02:00 p.m.), 55,317 people, could be estimated.

Estimates of the population in educational (elementary, higher, state, federal, and private), health (Emergency Care Units and health centers), and social assistance institutions were obtained from the PRODABEL database. A total of 583 units located in the Pampulha, North and Northeast regions of Belo Horizonte were identified. However, only 84 units are located within the area of interest. We then assumed that 50% of the employees of the health and social assistance units work during the day and 50% during the night, due to their full-time operation.

Based on PRODABEL data, we could also identify the number of people involved in economic activities. The dataset presents the location and type of activity, the area used for the development of the activity, the size of the company, the start date, the legal nature, the corporate name, and the numeric identification. The database revealed 379,402 economic activity facilities in the municipality of Belo Horizonte, of which 12,324 are located in the area of interest.

Finally, to avoid the redundant count of people in households and economic activity facilities, we have relied on the following assumptions:

  • Only activities that were located outside exclusively residential areas were considered – this should avoid the double counting of the exposed population since those people who work in households in these areas have already been considered; and

  • Activities related to teaching, health, and social assistance were not considered.

From this analysis, 156,241 economic activity facilities were identified in Belo Horizonte and 5,070 in the area of interest. The activities were then regrouped by working shift into nighttime and daytime activities. Nighttime activities (02:00 a.m.) comprised bars and establishments specialized in beverage services, as well as entertainment venues such as party and event houses, discos, and dance clubs. All other activities were assumed to be daytime ones. As a result, the 5,070 economic activities were divided into 5,038 and 32 for daytime and nighttime, respectively.

The total estimated population for business hours (02:00 p.m.) was 101,865 people and for non-business hours (02:00 a.m.) was 106,520 people. Regarding the construction material, all buildings were considered to be made of concrete, which, according to the stability criterion of USACE (2015), behaves similarly to masonry. It is noteworthy that the variable population within the study area was not considered.

The road network represents the paths that can be used by evacuees on foot or in vehicles. The data were obtained from the OpenStreetMap platform, which is included in the LifeSim interface and allows the identification of road directions.

In the EAP for the Pampulha Reservoir Dam, safe destinations were identified for the population to evacuate in the event of a dam failure. The points were provided by the Civil Defense via Public Data, which should guide the evacuation process. These destinations were planned to optimize population motion and avoid displacements into the risk area (Figure 6).
Figure 6

Safe destinations for evacuation at the downstream valley of the Pampulha Reservoir.

Loss of life modeling

Loss of life modeling was performed using the HEC-LifeSim software. The data were collected from the study by Nascimento et al. (2020), who developed maps of vulnerability and provided the characterization of the population and infrastructure in the downstream valley based on secondary data. The HEC-LifeSim software provides dynamic modeling in a spatially distributed system to estimate potential fatalities and direct economic damage from flood events (USACE 2021). The model performs explicit simulations relating the alert of the hazard resulting from the flood event and the mobilization of the potentially exposed population, inside buildings and on road networks (USACE 2021). The iterations of the model are performed using the Monte Carlo method. For this purpose, each simulation begins with the first evacuation warning or with the first arrival of the flood wave, and ends when the rupture hydrograph has been completely propagated downstream or each mobilized group has completed the evacuation actions (USACE 2021).

The exposure and vulnerability scenarios were built upon data from the last demographic survey conducted by IBGE in 2010 – the most recent information available – and on the database from PRODABEL, following the rationale discussed in Silva & Eleutério (2023).

The simulations in HEC-LifeSim were performed considering the daytime (02:00 p.m.) and nighttime (02:00 a.m.) for two scenarios: an efficient one, in which hazard identification, alert issuance, and mobilization of the population work well; and an inefficient one, which reproduces a region that is poorly prepared to react to the emergency of a dam failure, with respect to the responsibilities of the dam owner, the alert system, and the mobilization of the population. To represent these scenarios, the software has mobilization and alert curves, in which the coefficients of the equations represent the stages of hazard identification, communication to hazard managers, issuing of the alert, first alert received by the population, start of evacuation, and arrival at a safe destination.

Surrogate models – PCE

Metamodels are surrogate functional forms intended to replace a complex model, with high computational demands, with a statistically equivalent and low-cost simulation model, which is built upon a limited number of runs of the true model (Le Gratiet et al. 2017; Sudret et al. 2017; Sudret 2021). Therefore, metamodeling has become a tool with important applications in the field of engineering and applied mathematics. However, due to the underlying complexity of its formulation, this technique has seen relatively little use in other fields (Marelli et al. 2022).

PCE is a technique used to model and propagate uncertainties in stochastic computer simulations by approximation using a basis of orthonormal polynomials (Sudret 2007). This concept was originally introduced by Wiener (1938). However, its application only began at the end of the twentieth century, with the pioneering work of Ghanem & Spanos (1991). The main advantage of the PCE method is that the number of points required for the estimation of the output statistics is relatively low (Luthen et al. 2022). Also, as compared to the other spectral representations, PCE shows faster convergence rates with increasing order of expansion (Sun et al. 2021).

Originally proposed to solve stochastic differential equations, PCE allows for the creation of a robust relationship between the system's response and the random input variables, from which the mean and standard deviation of the random response can be determined (Sudret 2007). Xiu & Karniadakis (2002) generalized polynomial chaos to a broader family of orthogonal polynomials, applying it to the solution of elliptic partial differential equations with uncertainties. Sudret (2006, 2007) presented the use of PCE in the context of sensitivity analysis. PCE has the added benefit of allowing the direct estimation of variance-based sensitivity indices (e.g., Sobol indices), which is the main motivation for choosing PCE as a probabilistic model in this paper.

PCE is based on approximating a square-integrable function (which represents the output of a physical model) by means of a finite sum of orthogonal functionals. In other words, the model response Y can be represented as a sum of deterministic coefficients multiplying a basis of multidimensional orthogonal polynomials (Ramos 2014; Marelli et al. 2022). Formally (Marelli et al. 2022):
$$Y = \mathcal{M}(X) = \sum_{\alpha \in \mathbb{N}^{M}} y_{\alpha} \Psi_{\alpha}(X) \tag{1}$$
in which $X$ is the vector of $M$ input (random) variables; $\mathcal{M}(X)$ is the model response expressed through the expansion (the surrogate model, once the sum is truncated); $\Psi_{\alpha}(X)$ is the family of orthogonal polynomials with respect to the joint distribution function of the input variables; $\alpha \in \mathbb{N}^{M}$ are the multi-indices that identify the components of the multivariate polynomials; and $y_{\alpha}$ are real-valued deterministic coefficients (Marelli et al. 2022).
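In practical implementations, the sum in Equation (1) needs to be truncated to a finite number of terms. For this, a finite subset of multi-indices $\mathcal{A} \subset \mathbb{N}^{M}$ is retained for the approximation, as presented in the following equation (Marelli et al. 2022):
$$\mathcal{M}(X) \approx \mathcal{M}^{PC}(X) = \sum_{\alpha \in \mathcal{A}} y_{\alpha} \Psi_{\alpha}(X) \tag{2}$$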
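The polynomial basis $\Psi_{\alpha}(X)$ is commonly constructed from a set of orthonormal univariate polynomials $\phi_{k}^{(i)}(x_i)$, which should satisfy the inner product (Marelli et al. 2022):
$$\left\langle \phi_{j}^{(i)}, \phi_{k}^{(i)} \right\rangle = E\left[\phi_{j}^{(i)}(X_i)\,\phi_{k}^{(i)}(X_i)\right] = \int \phi_{j}^{(i)}(x_i)\,\phi_{k}^{(i)}(x_i)\,f_{X_i}(x_i)\,dx_i = \delta_{jk} \tag{3}$$
in which $E[\cdot]$ is the expectation operator; $i$ identifies the input variable with respect to which the polynomials are orthogonal; $j$ and $k$ are the polynomial degrees; $f_{X_i}$ is the marginal distribution of the $i$th input variable; and $\delta_{jk}$ is the Kronecker delta.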
From this, the multivariate polynomials $\Psi_{\alpha}(X)$ are constructed as the tensor product of their univariate equivalents (Equation (4)). The tensor product is an operation that combines univariate polynomials in all possible combinations, thus forming multivariate polynomials. In formal terms (Marelli et al. 2022):
$$\Psi_{\alpha}(x) = \prod_{i=1}^{M} \phi_{\alpha_i}^{(i)}(x_i) \tag{4}$$
This tensor-product construction is critical for creating a complete basis of multivariate polynomials that allows for an adequate representation of complex functions and multivariable phenomena. Due to the orthogonality relations in Equation (3), multivariate polynomials are also orthogonal, as shown in the following equation (Marelli et al. 2022):
$$E\left[\Psi_{\alpha}(X)\,\Psi_{\beta}(X)\right] = \delta_{\alpha\beta} \tag{5}$$
in which $\delta_{\alpha\beta}$ is the Kronecker delta for the multidimensional case.
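The orthonormality of the univariate polynomials and of their tensor products can be checked numerically. The sketch below assumes inputs that are uniform on [-1, 1], so that the appropriate family is the Legendre one rescaled by the factor sqrt(2k + 1); the function names `phi` and `psi` and the chosen multi-indices are illustrative, not part of any library or of the original study.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def phi(k, x):
    """Orthonormal Legendre polynomial of degree k for X ~ Uniform(-1, 1)."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0                      # select the degree-k Legendre polynomial
    return np.sqrt(2 * k + 1) * legval(x, coeffs)

def psi(alpha, x):
    """Multivariate basis function: product of univariate ones; x has shape (n, M)."""
    return np.prod([phi(a, x[:, i]) for i, a in enumerate(alpha)], axis=0)

# Gauss-Legendre rule; dividing the weights by 2 accounts for the uniform density 1/2.
nodes, weights = leggauss(10)
weights = weights / 2.0

# Univariate Gram matrix E[phi_j phi_k], expected to be the identity.
gram_1d = np.array([[np.sum(weights * phi(j, nodes) * phi(k, nodes))
                     for k in range(4)] for j in range(4)])

# Two-input tensor grid and Gram matrix E[psi_alpha psi_beta].
x1, x2 = np.meshgrid(nodes, nodes, indexing="ij")
grid = np.column_stack([x1.ravel(), x2.ravel()])
w2d = np.outer(weights, weights).ravel()
alphas = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]
gram_2d = np.array([[np.sum(w2d * psi(a, grid) * psi(b, grid))
                     for b in alphas] for a in alphas])

print(np.round(gram_1d, 12))
print(np.round(gram_2d, 12))
```

Both matrices come out as identity matrices up to rounding, which is the numerical counterpart of the Kronecker deltas in the univariate and multivariate orthonormality relations.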

Marelli et al. (2022) present the classical families of polynomials used as a basis for the expansion (Table 2). By combining these univariate polynomials using tensor products, it is feasible to efficiently propagate input uncertainty and account for output variability in complex systems, allowing one to straightforwardly perform sensitivity analysis.

Table 2

Polynomial families used in PCE

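| Distribution | Univariate polynomial family | Hilbertian basis |
|---|---|---|
| Uniform | Legendre $P_k(x)$ | $P_k(x)/\sqrt{1/(2k+1)}$ |
| Gaussian | Hermite $He_k(x)$ | $He_k(x)/\sqrt{k!}$ |
| Gamma | Laguerre $L_k^{a}(x)$ | $L_k^{a}(x)/\sqrt{\Gamma(k+a+1)/k!}$ |
| Beta | Jacobi $J_k^{a,b}(x)$ | $J_k^{a,b}(x)/\mathfrak{J}_{a,b,k}$, with $\mathfrak{J}_{a,b,k}$ the norm of the Jacobi polynomial |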

Source: Adapted from Marelli et al. (2022).

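Based on the polynomial families, the total degree truncation scheme is used, which retains all polynomials in the model's M input variables with a total degree less than or equal to p. Therefore, the number of terms in the polynomial basis grows rapidly with the number of inputs M and the degree p (Marelli et al. 2022):
$$\mathcal{A}^{M,p} = \left\{\alpha \in \mathbb{N}^{M} : |\alpha| = \sum_{i=1}^{M} \alpha_i \le p\right\}, \qquad \mathrm{card}\,\mathcal{A}^{M,p} = \binom{M+p}{p} = \frac{(M+p)!}{M!\,p!} \tag{6}$$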
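To make the growth of the basis concrete, the total degree set can be enumerated directly and its size compared with the binomial coefficient (M + p choose p). The sketch below uses M = 4, matching the four inputs considered in this study; the degrees shown are illustrative and are not the ones selected in the paper.

```python
from itertools import product
from math import comb

def total_degree_indices(n_inputs, degree):
    """All multi-indices alpha in N^M whose entries sum to at most `degree`."""
    return [alpha for alpha in product(range(degree + 1), repeat=n_inputs)
            if sum(alpha) <= degree]

n_inputs = 4  # alerted population, mobilized population, alert issuance, hazard identification
sizes = {p: len(total_degree_indices(n_inputs, p)) for p in range(1, 6)}

for p, size in sizes.items():
    # The enumerated size should agree with the closed-form binomial coefficient.
    print(f"p = {p}: {size} terms (binomial: {comb(n_inputs + p, p)})")
```

Since each retained term carries one coefficient to be estimated from runs of the original model, the basis size sets a lower bound on the number of model evaluations needed for a least-squares fit.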

The relevance of the terms in the basis, however, is not uniform. Frequently, the most important terms in the expansion are those in which only a few variables interact. This phenomenon is known as the principle of sparsity of effects. Truncation schemes that exploit it are designed to improve computational efficiency and the interpretation of results, allowing for a simplified and meaningful representation of the phenomena being studied (Marelli et al. 2022).

The hyperbolic truncation scheme (q-norm), with q defined between 0 and 1, is given by the following equations (Blatman 2009):
$$\mathcal{A}^{M,p,q} = \left\{\alpha \in \mathcal{A}^{M,p} : \|\alpha\|_{q} \le p\right\} \tag{7}$$
in which
$$\|\alpha\|_{q} = \left(\sum_{i=1}^{M} \alpha_i^{q}\right)^{1/q} \tag{8}$$
For q = 1, the hyperbolic truncation corresponds exactly to the standard total degree truncation scheme (Equation (6)). For values of q less than 1, however, the hyperbolic truncation retains all high-degree univariate terms but excludes high-degree terms with many interacting variables.
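The effect of the q-norm on the retained basis can be illustrated by filtering multi-indices. In this sketch, M = 4 and p = 3 are chosen only for illustration, and a small tolerance is added to the comparison to guard against floating-point rounding.

```python
from itertools import product

def q_norm(alpha, q):
    """Hyperbolic q-norm of a multi-index (0 < q <= 1)."""
    return sum(a ** q for a in alpha) ** (1.0 / q)

def hyperbolic_indices(n_inputs, degree, q, tol=1e-10):
    """Multi-indices whose q-norm does not exceed the maximum degree."""
    return [alpha for alpha in product(range(degree + 1), repeat=n_inputs)
            if q_norm(alpha, q) <= degree + tol]

full = hyperbolic_indices(4, 3, q=1.0)     # identical to total degree truncation
sparse = hyperbolic_indices(4, 3, q=0.5)   # interaction terms are penalized

print(len(full), len(sparse))
print((3, 0, 0, 0) in sparse, (1, 1, 0, 0) in sparse)
```

For q = 0.5, only the constant and the univariate terms survive in this small case: a pure cubic term such as (3, 0, 0, 0) is kept, whereas even the lowest-order interaction (1, 1, 0, 0) has a q-norm of 4 and is dropped.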

In this study, we considered the alerted population, the mobilized population, the issuance of the alert, and the hazard identification as input (random) variables, to which probability distributions must be assigned before the metamodel is constructed and the PCE coefficients are computed. The output of the surrogate model is the number of fatalities due to the dam failure.
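As a rough illustration of how PCE coefficients can be computed once input distributions are assigned, the sketch below fits a degree-3 expansion by ordinary least squares using NumPy. It rests on several assumptions that are not taken from the study: the four inputs are rescaled to independent uniform variables on [-1, 1], `toy_model` is a hypothetical polynomial standing in for the fatality simulator, and least squares replaces whatever estimation method was configured in UQLab.

```python
import numpy as np
from itertools import product
from numpy.polynomial.legendre import legval

def phi(k, x):
    """Orthonormal Legendre polynomial of degree k for X ~ Uniform(-1, 1)."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return np.sqrt(2 * k + 1) * legval(x, c)

def design_matrix(x, indices):
    """One column per multi-index: tensor product of univariate polynomials."""
    return np.column_stack([
        np.prod([phi(a, x[:, i]) for i, a in enumerate(alpha)], axis=0)
        for alpha in indices
    ])

def toy_model(x):
    """Hypothetical response (NOT LifeSim); columns follow the input order in the text."""
    return (5.0 + 0.5 * x[:, 0] + 0.2 * x[:, 1] ** 2
            + 2.0 * x[:, 2] + 3.0 * x[:, 3] + 1.5 * x[:, 2] * x[:, 3])

rng = np.random.default_rng(42)
indices = [a for a in product(range(4), repeat=4) if sum(a) <= 3]  # total degree 3

# Experimental design: a limited number of runs of the "expensive" model.
x_train = rng.uniform(-1.0, 1.0, size=(200, 4))
coeffs, *_ = np.linalg.lstsq(design_matrix(x_train, indices),
                             toy_model(x_train), rcond=None)

# With an orthonormal basis, the moments follow directly from the coefficients.
pce_mean = coeffs[0]                 # indices[0] is the constant term (0, 0, 0, 0)
pce_var = np.sum(coeffs[1:] ** 2)

# Validation on points that were not used for fitting.
x_test = rng.uniform(-1.0, 1.0, size=(1000, 4))
y_hat = design_matrix(x_test, indices) @ coeffs
print(pce_mean, pce_var, np.max(np.abs(y_hat - toy_model(x_test))))
```

Because the toy response is itself a polynomial of degree at most 3, the surrogate reproduces it almost exactly; for a real simulator, the validation error on held-out runs is what indicates whether the chosen degree and design size are adequate.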

Global sensitivity analysis – Sobol indexes

The aforementioned uncertainty propagation methods provide insights into the effects of the input variables on the variability of the model response (Sudret 2007). This ranking of the input variables is known as sensitivity analysis, which can be classified as local or global. Local analysis is performed by changing the value of one model parameter within a given range while keeping the others fixed. Global analysis, on the other hand, is performed by varying all the analyzed parameters simultaneously. Despite the increased complexity, this last option provides a better representation of the sensitivity of the output results, since the parameters are simulated together throughout the numerical iterations (Sudret 2007; Pavlack et al. 2022).

In this study, we resort to the Sobol indices for global sensitivity analysis. The Sobol indices (Sobol 1993) are based on the expansion of the computational model into summands of increasing dimension. This is a variance-based method for sensitivity analysis, in which the total variance of the model output is described as the sum of the variances of the summands (Marelli et al. 2022).

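Sobol indices are intended to quantify how the variation of the output values can be attributed to individual inputs of the model, as well as to the joint interaction among the input variables (Brevault et al. 2013; Zhang et al. 2015). In formal terms, the model output can be decomposed as:
$$f(\mathbf{X}) = f_0 + \sum_{i=1}^{M} f_i(X_i) + \sum_{1 \le i < j \le M} f_{ij}(X_i, X_j) + \dots + f_{1,2,\dots,M}(X_1, \dots, X_M) \tag{9}$$
in which f0 is a constant term; fi(Xi) is a function of random variable Xi; fij(Xi, Xj) is a function of random variables Xi and Xj, and so on, with M denoting the number of input variables. Equation (9) can be further used to define the total variance D, as related to f (X), as follows:
$$D = \mathrm{Var}\left[ f(\mathbf{X}) \right] = \sum_{i=1}^{M} D_i + \sum_{1 \le i < j \le M} D_{ij} + \dots + D_{1,2,\dots,M} \tag{10}$$
Partial variances are then calculated by:
$$D_{i_1 \dots i_s} = \mathrm{Var}\left[ f_{i_1 \dots i_s}(X_{i_1}, \dots, X_{i_s}) \right] \tag{11}$$
The Sobol sensitivity indices, for the first and higher orders, are then defined as the ratio of partial variance and total variance (Equation (12)). The total Sobol index of a variable is the sum of all indices in which that variable takes part:
$$S_{i_1 \dots i_s} = \frac{D_{i_1 \dots i_s}}{D} \tag{12}$$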

For calibrating the metamodel and estimating the Sobol indices, we utilized UQLab (Marelli et al. 2022), a tool developed by the Chair of Risk, Safety and Uncertainty Quantification of ETH Zurich, Switzerland. It was accessed through the MATLAB programming platform.

The stage of identification of LifeSim inputs was performed by investigating the functioning of the model. The software has an interface that allows analyzing the input data by modules, as shown in Figure 7.
Figure 7

Inputs required for simulating the LifeSim model.


The input data have different formats for forcing LifeSim, with most of the information being entered through shapefiles. The hydrodynamic model, which has the hydraulic information necessary to perform the simulations, can be exported in the format .HDF.

The structural inventory is inserted in shapefile format with information regarding the types of occupation, types of construction, total population, population with mobility difficulties, number of floors and height of the foundation. It is noteworthy that most of this information is provided by the IBGE demographic census through official surveys. The road network is exported from OpenStreetMap, which is a free and collaborative mapping tool. Based on these inputs, it is possible to delineate the study area; the platform then generates the routing data for the roads and official vehicle routes.

Emergency zones are evaluated for the association of alert and mobilization curves, in which the population is characterized and grouped based on the issuance and dissemination of the alert, as well as the preparation and perception of the danger. The destinations of escape routes are entered as safe points to which the population can move and be safe from the danger of flooding. Generally, these are points on high terrain and locations outside the flooded area. Two important variables are the time to identify the hazard and the delay in communicating the alert, as they influence the alert and mobilization curves and, consequently, the number of estimated fatalities.

The LifeSim output identification step was performed by evaluating all data at the end of each simulation. It is noteworthy that LifeSim is not currently an open-source software. In this sense, the numerical approach used by the surrogate model is a non-intrusive method, which does not alter the original source code of the computational model being studied in this research.

In the construction of the surrogate model based on the PCE, it is possible to calculate the Sobol indices from the polynomial coefficients (Marelli et al. 2022). To do so, the global sensitivity analysis must be linked to the fitted metamodel, so that it reuses the information already contained in the PCE approximation, and MATLAB commands are used to set up the analysis. To obtain higher-order Sobol indices, the maximum Sobol order must be set explicitly in the algorithm.
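The link between the polynomial coefficients and the Sobol indices can be made concrete with a short Python sketch. It assumes an orthonormal polynomial basis, so that the variance of the expansion is the sum of the squared coefficients of the non-constant terms; the function names, multi-indices and coefficients are hypothetical and serve only to illustrate the bookkeeping, not the UQLab routines used in this study.

```python
from collections import defaultdict


def sobol_from_pce(multi_indices, coefficients):
    """Sobol indices of every order from an orthonormal PCE.

    Returns a dict mapping each tuple of interacting input positions
    to the share of the output variance carried by that group.
    """
    partial = defaultdict(float)
    for alpha, coeff in zip(multi_indices, coefficients):
        active = tuple(i for i, deg in enumerate(alpha) if deg > 0)
        if active:  # the constant term carries the mean, not variance
            partial[active] += coeff ** 2
    variance = sum(partial.values())
    return {group: value / variance for group, value in partial.items()}


def total_index(indices, i):
    """Total Sobol index: sum of all indices whose group contains input i."""
    return sum(s for group, s in indices.items() if i in group)


# Toy two-input expansion (hypothetical coefficients, not from LifeSim).
alphas = [(0, 0), (1, 0), (0, 1), (1, 1)]
coeffs = [10.0, 2.0, 1.0, 2.0]
indices = sobol_from_pce(alphas, coeffs)
print(indices)
print(total_index(indices, 0), total_index(indices, 1))
```

Because the indices are obtained by regrouping squared coefficients, no further runs of the original model are needed once the expansion has been fitted.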

The Sobol index method was chosen for the global sensitivity analysis of the input variables because the indices follow directly from the coefficients of the PCE model. The Sobol indices, up to order 3, were calculated for each scenario without the need for additional samples from LifeSim, which is one of the advantages of the method and a justification for its choice. For the sake of simplicity, each input was associated with a symbol, as shown in Table 3.

Table 3

Variables used for sensitivity analysis – Sobol index

Variable                 Symbol
Alerted population       X1
Mobilized population     X2
Issuance of the alert    X3
Hazard identification    X4

Loss of life model

From the two-dimensional hydrodynamic model and the input data for the loss of life model, we performed simulations in the LifeSim software. The simulations considered efficient and inefficient scenarios, during the day and at night. The developed scenarios are described below.

  • Inefficient scenario: This is intended to represent a region that is poorly prepared for alert and evacuation. The coefficient ranges of the equations of delay in the dissemination of the alert and delay in the start of mobilization were determined to reflect this situation. For the interval between the identification of the threat and the issuance of the alert, a sufficiently long period was adopted, ranging from 0 to 24 h (i.e., from 0 to 1,440 min). Regarding the modes of evacuation, we considered that 50% of the evacuees would evacuate on foot and the other 50% would use vehicles.

  • Efficient scenario: The choice of the most efficient coefficient ranges, with respect to the dissemination of the alert and the beginning of the mobilization, represents efficient emergency planning at all stages. Regarding the modes of evacuation, we considered that 50% of the evacuees would evacuate on foot and the other 50% would use vehicles.

For each scenario, 1,000 simulations, based on the Monte Carlo method, were performed by drawing random values between the minimum and maximum of the ranges defined for alert advance, alert curves and mobilization. The results for all scenarios, considering the different levels of efficiency, are presented in Figure 8, which shows the number of fatalities as a function of the anticipation of the warning of danger to the population.
Figure 8

Simulation of loss of life under distinct scenarios of alert issuance and evacuation.


On the basis of these simulations, we observed a significant reduction in the number of fatalities from the implementation of efficient warning and evacuation systems, as well as from the adequate preparation of the population for evacuation. When the dam failure occurs at night, Figure 8 shows that an inefficient scenario with respect to alert issuance would entail, on average, 976 fatalities, while in the efficient scenario this would be reduced to an average of 319 fatalities. For the daytime period, there would be an average of 1,047 fatalities in the inefficient scenario and 325 in the efficient one. In other words, the implementation and operation of efficient warning and evacuation systems in the area affected by the flood wave would reduce the number of fatalities by 69% for the daytime period and 67% for the nighttime period. In the efficient scenarios, the minimum toll was eight fatalities at night and nine during the day. In contrast, in the inefficient scenarios, the minimum toll was much more severe, with 279 fatalities at night and 334 during the day. These results highlight the importance of efficient warning measures for reducing the number of fatalities.
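The percentage reductions quoted above follow directly from the mean tolls of each scenario. The short Python snippet below merely reproduces this arithmetic; the variable and function names are ours.

```python
# Mean fatalities reported above for each period and scenario.
mean_fatalities = {
    "night": {"inefficient": 976, "efficient": 319},
    "day": {"inefficient": 1047, "efficient": 325},
}


def relative_reduction(baseline, improved):
    """Fraction of the baseline toll avoided in the improved scenario."""
    return (baseline - improved) / baseline


reductions = {
    period: relative_reduction(v["inefficient"], v["efficient"])
    for period, v in mean_fatalities.items()
}
for period, value in reductions.items():
    print(f"{period}: {100 * value:.0f}% fewer fatalities")
```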

The results also indicate that the daytime failure scenario (02:00 p.m.) is more critical than the nighttime scenario (02:00 a.m.). This difference can be ascribed to commercial activities in the region during daytime hours, which imply a greater concentration of people exposed to risk during this period. We also note that, although not directly considered in the simulations, during business hours there would be an enhanced flow of people transiting through the risk area, which can have very negative effects on evacuation.

The results show significant potential in reducing human damage (fatalities) through the optimization of systems for identifying possible failures, disseminating alerts and organizing evacuation actions. This includes effective training of the population exposed to the hazard and the capacity building of the dam's emergency management team. In this sense, investing in technologies and strategies that improve the risk identification capacity and the efficiency of warning and evacuation systems is essential to ensuring an adequate response.

Surrogate model

The input variables were defined on the basis of their relevance to the elaboration of the EAP, as indicated in the study by Lumbroso et al. (2021). However, for the failure of the Pampulha Reservoir Dam, no detailed information on plausible ranges for these input variables is available, which increases the uncertainty about the underlying processes. As a result, we considered the full range of the empirical curves intended to estimate the alerted population, the mobilized population, the alert issuance, and the hazard identification in LifeSim. These ranges are shown in Table 4.

Table 4

Variables used for uncertainty and sensitivity analysis of the LifeSim model

Variable                 Unit       Values
Alerted population       Fraction   0 – 1
Mobilized population     Fraction   0 – 1
Issuance of the alert    Minutes    0 – 1,440
Hazard identification    Minutes    0 – 1,440

After determining the input variables, namely, the alerted population, the mobilized population, the alert issuance, and the hazard identification, it is necessary to choose the probability distribution that best represents the data. For this purpose, the marginal distributions of the input variables were analyzed for constructing the surrogate model based on PCE. As virtually no information on the inputs is available in the study area, we utilized uniform distributions, which would maximize uncertainty with respect to their outcomes.
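As an illustration of this input model, the Python sketch below draws samples of the four inputs from uniform distributions over the ranges of Table 4 and rescales them to the interval [-1, 1]. Two points are assumptions on our part: the inputs are sampled independently, and the rescaling to [-1, 1] is included because it is the usual preliminary step when a polynomial basis defined on that interval is used. The dictionary keys and function names are illustrative only.

```python
import random

# Ranges taken from Table 4; the dictionary keys are illustrative names.
BOUNDS = {
    "alerted_population": (0.0, 1.0),          # fraction
    "mobilized_population": (0.0, 1.0),        # fraction
    "alert_issuance": (0.0, 1440.0),           # minutes
    "hazard_identification": (0.0, 1440.0),    # minutes
}


def draw_inputs(n_samples, seed=0):
    """Draw independent uniform samples of the four inputs."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}
        for _ in range(n_samples)
    ]


def to_reference_interval(sample):
    """Rescale each input linearly from [lo, hi] to [-1, 1]."""
    return {
        name: 2.0 * (sample[name] - lo) / (hi - lo) - 1.0
        for name, (lo, hi) in BOUNDS.items()
    }


samples = draw_inputs(1000)
scaled = [to_reference_interval(s) for s in samples]
print(samples[0])
print(scaled[0])
```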

For constructing the surrogate model, it is necessary to choose the polynomial basis and its maximum degree, and then compute the coefficients of the polynomial expansion. Hence, for selecting the best-fit models in each scenario, we considered polynomial degrees ranging from 1 to 15 and truncation coefficients varying between 0.1 and 1.0. The polynomial degree was selected on the basis of the lowest estimated error under cross-validation and the extrapolation quality of the model.
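To make the notion of a polynomial basis more tangible, the sketch below evaluates Legendre polynomials, the family customarily paired with uniformly distributed inputs in PCE, and builds a multivariate basis term as a product of univariate polynomials. The choice of the Legendre family is an assumption here, since the basis used in the study is not restated in this section; the function names and the evaluation point are hypothetical.

```python
import math


def legendre_orthonormal(n, x):
    """Degree-n Legendre polynomial, normalised for a uniform input on [-1, 1]."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for k in range(1, n):
        # Bonnet recurrence: (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return math.sqrt(2 * n + 1) * p_curr


def basis_term(alpha, xi):
    """Multivariate basis function: product of univariate polynomials."""
    return math.prod(legendre_orthonormal(a, x) for a, x in zip(alpha, xi))


def inner_product(m, n, n_points=2000):
    """Midpoint-rule estimate of E[P_m(X) P_n(X)] for X ~ U(-1, 1)."""
    h = 2.0 / n_points
    nodes = (-1.0 + (i + 0.5) * h for i in range(n_points))
    return sum(
        legendre_orthonormal(m, x) * legendre_orthonormal(n, x) for x in nodes
    ) / n_points


# Orthonormality check: close to 1 for m == n and close to 0 otherwise.
print(inner_product(2, 2), inner_product(1, 3))
# One basis term for four inputs, evaluated at an arbitrary rescaled point.
print(basis_term((1, 0, 2, 0), (0.5, -0.2, 0.1, 0.9)))
```

Orthonormality of the basis is what allows the mean and variance of the response, and hence the Sobol indices, to be read directly from the expansion coefficients.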

For the calibration of the surrogate model, a sample of 1,000 LifeSim output simulations was utilized. Four metamodels of PCE were created: two for the pessimistic scenario of alert and mobilization (one for the daytime and one for the nighttime) and two for the optimistic scenario of alert and mobilization (one for the daytime and one for the nighttime). Table 5 summarizes the results generated in the construction of the PCE model. We note that the calibration errors are close to zero and similar to those found by Kalinina et al. (2021), although we could not find a clear pattern for the goodness-of-fit with respect to the proposed scenarios or the time of the day in which the hypothetical failure occurs.

Table 5

Truncation and polynomial degree results for each scenario in the construction in the surrogate model

Scenario      Period      Error    Polynomial degree   q-norm
Pessimistic   Daytime     0.0074   –                   0.9
Pessimistic   Nighttime   0.0124   15                  0.5
Optimistic    Daytime     0.0082   11                  0.8
Optimistic    Nighttime   0.0066   14                  0.8

Figure 9 depicts the polynomial coefficients estimated in each simulation for the fitted surrogate models. It is observed that, for all metamodels, many coefficients are close to zero, which indicates that certain polynomial terms have little impact on the description of the system and can be ignored to simplify the analysis or optimize computational calculations. This behavior is in agreement with the principle of effect sparsity, which indicates that only a subset of the variables has a significant effect on the model's response. In view of this fact, it is possible to directly relate the number of polynomial coefficients used in the construction of each surrogate model with the truncation coefficient (q-norm). The truncation scheme has the effect of excluding high-order terms, which contributes to simplifying the model without compromising its accuracy.
Figure 9

Logarithmic spectra of the PCE coefficients.


The pessimistic-nocturnal scenario had the highest truncation coefficient among all scenarios (q-norm = 0.9), which resulted in the use of a greater number of polynomial coefficients to fit the surrogate model – a total of 142 coefficients, as shown in Figure 9(a). On the other hand, the pessimistic-diurnal scenario had the lowest truncation coefficient among all scenarios (q-norm = 0.5), and, despite converging with a polynomial degree of 15, only 40 polynomial coefficients were necessary to stabilize the surrogate model (Figure 9).

In the optimistic scenario, for both periods (night and day), the same truncation coefficient (q-norm = 0.8) was obtained. However, when comparing the polynomial degrees, the daytime period converged to a lower polynomial degree (degree 11) than the nighttime period (degree 15). Therefore, 61 polynomial coefficients were generated for the daytime period (Figure 9) and 40 polynomial coefficients for the nighttime period (Figure 9).

Cross-validation

Cross-validation of the models was performed using the mean square error (MSE) as a metric, considering the responses related to each proposed scenario. MSE indicates how far the estimates simulated by the surrogate model are from the values obtained with the LifeSim computational model – the closer to zero, the better the model's performance. The formulation is presented in the following equation:
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2 \tag{13}$$
in which $\hat{y}_i$ is the output of the surrogate model (fatalities); $y_i$ is the output of the LifeSim computational model (fatalities); and N is the number of compared simulations.
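For reference, the metric can be computed as in the minimal Python sketch below. The fatality counts are hypothetical values used only to exercise the function, and the square root of the MSE is also reported because it is expressed in the same units as the outputs.

```python
import math


def mean_squared_error(surrogate, reference):
    """Mean of the squared differences between paired outputs."""
    if len(surrogate) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((s - r) ** 2 for s, r in zip(surrogate, reference)) / len(reference)


# Hypothetical fatality counts, used only to exercise the function.
pce_output = [320.5, 975.0, 12.0]
lifesim_output = [319.0, 976.0, 11.0]

mse = mean_squared_error(pce_output, lifesim_output)
rmse = math.sqrt(mse)  # same units as the outputs (fatalities)
print(mse, rmse)
```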

Table 6 provides the cross-validation results for each proposed scenario. It is possible to observe that the simulations for the optimistic scenarios present very similar results, with smaller errors with respect to the pessimistic counterparts.

Table 6

Mean square error between the data of the computational model and the surrogate model

Scenario      Period      MSE
Pessimistic   Daytime     1.89
Pessimistic   Nighttime   2.10
Optimistic    Daytime     1.41
Optimistic    Nighttime   1.43

However, it is important to emphasize that, overall, the performance of the models is adequate. The largest error found represents a difference of only two fatalities, more or less, with respect to the actual estimates. In this context, the identified error was considered acceptable.

As compared to Kalinina et al. (2021), the MSE values found in this study were higher. However, those authors relied on a much better database for defining the marginal distributions of the inputs, which obviously reduces uncertainty and entails less dispersed estimates during prediction. Despite the inherent uncertainties, considering the hypotheses adopted by LifeSim, such as the alert and mobility curves that are based on empirical data, the estimates were consistent and close to the evaluated computational model (LifeSim). This fact indicates that the estimated model has an acceptable degree of accuracy for estimating the number of fatalities in different scenarios.

After the calibration and validation of the surrogate model, we compared its computational cost to that of LifeSim. For this, we ran 100,000 simulations of each model. LifeSim required about 20 h to perform the 100,000 simulations, while the PCE model required 20 min. These times are similar to those found by Kalinina et al. (2021) for the same number of simulations.
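Taking the reported run times at face value (both are approximate), the implied speed-up and the average cost per simulation, with N = 100,000, can be worked out as follows:

```latex
\frac{t_{\text{LifeSim}}}{t_{\text{PCE}}}
  = \frac{20\ \text{h} \times 60\ \text{min/h}}{20\ \text{min}} = 60,
\qquad
\frac{t_{\text{LifeSim}}}{N} = \frac{72\,000\ \text{s}}{100\,000} = 0.72\ \text{s},
\qquad
\frac{t_{\text{PCE}}}{N} = \frac{1\,200\ \text{s}}{100\,000} = 0.012\ \text{s}
```

That is, the surrogate is roughly 60 times faster than the original model for this number of runs.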

Global sensitivity analysis

Once the metamodel was validated, we proceeded to the sensitivity analysis, based on the variables presented in Table 3. When evaluating the results obtained for the nighttime pessimistic scenario (Figure 10), we observed that, in the Total Sobol index, which summarizes the total influence of each variable (i.e., isolated and interacting with other variables), the variables ‘Alert Issuance’ and ‘Hazard Identification’ presented the most significant contribution to the variability of fatality estimates, with index values of 0.8005 and 0.8016, respectively, on a scale ranging from 0 to 1, in which values closer to 1 indicate a greater contribution of the input variable. In addition, the variable ‘Mobilized Population’ had a contribution value of 0.0206 and ‘Alerted Population’ had a contribution of 0.0007.
Figure 10

Sobol indexes for the pessimistic nighttime scenario.


The indices for the first, second, and third orders were calculated, and the results reinforced the importance of the variables ‘Alert Issuance’ and ‘Hazard Identification’ (Figure 10). For the first order, which evaluates the individual influence of each input by averaging over the variations of the other inputs, the values 0.1883 and 0.1891, respectively, were obtained for these variables. For the second-order analysis, the Sobol index was 0.6015, which indicates the significant influence of the interaction between the variables ‘Alert Issuance’ and ‘Hazard Identification’ on the fatality estimates. On the other hand, no significant pairwise interactions among the other combinations of inputs were found, as the corresponding second-order Sobol indices are very close to zero. In the third-order analysis, the Sobol index was 0.0085 for the influence of the interaction between the variables ‘Alert Issuance’, ‘Hazard Identification’, and ‘Mobilized Population’ on the fatality estimates (Figure 10). Being very low, this value indicates a minimal influence of the combination of the three variables on the variance of the model outputs – the same holds for the remaining third-order Sobol indices, which are virtually null. One should note that the summation of the ith order Sobol indices, i = 1, 2, 3, is equal to 1, as required.
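A simple consistency check on the values reported above, which is not part of the original analysis, is to add up the itemised contributions in which ‘Alert Issuance’ and ‘Hazard Identification’ take part and compare them with the corresponding total indices. The Python snippet below does so; the variable names are ours.

```python
# Indices reported in the text for the nighttime pessimistic scenario.
first_order = {"alert_issuance": 0.1883, "hazard_identification": 0.1891}
second_order_x3_x4 = 0.6015
third_order_x2_x3_x4 = 0.0085
total_reported = {"alert_issuance": 0.8005, "hazard_identification": 0.8016}

# Sum of the itemised contributions in which each variable takes part.
itemised = {
    name: s1 + second_order_x3_x4 + third_order_x2_x3_x4
    for name, s1 in first_order.items()
}
# Remainder attributable to the interactions that the text does not itemise.
remainder = {name: total_reported[name] - itemised[name] for name in itemised}

for name in itemised:
    print(f"{name}: itemised {itemised[name]:.4f}, remainder {remainder[name]:.4f}")
```

The small remainders are consistent with the statement that the interactions not itemised in the text are close to zero.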

When evaluating the daytime pessimistic scenario (Figure 11), we noted that the variables ‘Alert Issuance’ and ‘Hazard Identification’ presented the most significant contribution to the variability of fatality estimates, being 0.8545 and 0.8722, respectively. As compared to the nighttime pessimistic scenario, these inputs presented a slightly larger contribution. On the other hand, the index value associated with the variable ‘Mobilized Population’ is similar in both situations, with a low contribution of 0.0206.
Figure 11

Sobol indexes for the daytime pessimistic scenario.


In the daytime pessimistic scenario, the first-order Sobol indices, which evaluate the individual influence of each variable, were 0.0354, 0.0910, and 0.1086 for the variables ‘Mobilized Population’, ‘Alert Issuance’, and ‘Hazard Identification’, respectively (Figure 11). For the latter two variables, the values are lower than those found for the pessimistic night scenario, but some influence of the mobilized population is perceived in this scenario – in fact, a larger number of people may be at risk in vulnerable workplaces during the day, which may explain the higher contribution of this variable. For the second-order analysis, the Sobol index was 0.7609 for the influence of the interaction between the variables ‘Alert Issuance’ and ‘Hazard Identification’ on the fatality estimates (Figure 11), but again no other significant pairwise interactions were observed. In addition, the second-order index is close to that of the pessimistic nocturnal scenario. In the third-order analysis, the Sobol index was 0.0011 for the influence of the interaction between the variables ‘Alert Issuance’, ‘Hazard Identification’, and ‘Mobilized Population’ on the fatality estimates (Figure 11), and below 0.0002 for the interactions among the other variables. These values, which are even smaller than in the pessimistic night scenario, indicate that the combination of three variables has a low contribution to the variability of the model's output.

For the daytime optimistic scenario, the variables ‘Alerted Population’, ‘Mobilized Population’, ‘Alert Issuance’, and ‘Hazard Identification’ presented total Sobol indices of 0.0071, 0.1951, 0.8339, and 0.8698, respectively (Figure 12). As compared to the previous scenarios, input variables X1 and X2 had a greater contribution in this case, which can be ascribed to the greater importance of the efficiency of alerting and mobilizing the population. In fact, these variables characterize an optimistic scenario of population preparedness, which, consequently, leads to a reduction in the number of fatalities.
Figure 12

Sobol indexes for the optimistic daytime scenario.


In the daytime optimistic scenario, for the first order, the Sobol index values were 0.0900, 0.0280, and 0.0602 for the variables ‘Mobilized Population’, ‘Alert Issuance’, and ‘Hazard Identification’, respectively (Figure 12). The values of variables X3 and X4 are close to those found in the previous scenarios. However, variable X2 had a more significant contribution to the optimistic diurnal scenario, which is justified by the importance of this variable in the scenario that considers an efficiency in alerting and mobilizing the population.

For the second-order analysis, the Sobol index was 0.7098 for the interaction between variables ‘Alert Issuance’ and ‘Hazard Identification’ on the fatality estimates (Figure 12), 0.0157 between ‘Mobilized Population’ and ‘Hazard Identification’, and 0.0121 for the interaction between the variables ‘Mobilized Population’ and ‘Hazard Identification’. The contributions of variables X3 and X4 are close to those found in the previous scenarios, but the influence of variable X2 is stronger in this scenario and, therefore, its interaction with the other variables is also more prominent in this case.

In the third-order analysis, the Sobol index was 0.0769 for the influence of the interaction between the variables ‘Alert Issuance’, ‘Hazard Identification’, and ‘Mobilized Population’ on the fatality estimates (Figure 12). In addition, the Sobol index presented a value of 0.0069 for the interaction between the variables ‘Alerted Population’, ‘Alert Issuance’, and ‘Hazard Identification’. In this scenario, both the interactions among these variables and their individual effects contribute more than in the previous scenarios.

Finally, for the nighttime optimistic scenario, the total Sobol indices related to the variables ‘Alerted Population’, ‘Mobilized Population’, ‘Alert Issuance’, and ‘Hazard Identification’ were 0.0069, 0.2401, 0.7567, and 0.7567, respectively (Figure 13). The variable ‘Mobilized Population’ had its largest contribution among all scenarios, which can be justified by the importance of mobilization in the optimistic scenarios, which assume that the population is prepared to evacuate in cases of imminent risk.
Figure 13

Sobol indexes for the optimistic nighttime scenario.


For the first order, the Sobol index values were 0.2106, 0.0275, and 0.0275 for the variables ‘Mobilized Population’, ‘Alert Issuance’, and ‘Hazard Identification’, respectively (Figure 13). These results are in agreement with those found in the daytime optimistic scenario: variable X2 continues to have a larger contribution due to the importance of alerting and mobilization in the optimistic scenario. On the other hand, in the pessimistic scenarios, the variables X3 and X4 had greater contributions in all orders of Sobol analyzed in this study. For the second-order analysis, the Sobol index was 0.6988 for the influence of the interaction between the variables ‘Alert Issuance’ and ‘Hazard Identification’ on the fatality estimates (Figure 13), 0.0048 between ‘Mobilized Population’ and ‘Hazard Identification’, and 0.0048 for the interaction between the variables ‘Mobilized Population’ and ‘Hazard Identification’. These values also indicate the prevalence of variables X3 and X4 for the second order of Sobol, as found in all scenarios.

In the third-order analysis, the Sobol index was 0.0192 for the influence of the interaction between the variables ‘Mobilized Population’, ‘Alert Issuance’, and ‘Hazard Identification’ on the fatality estimates (Figure 13). In addition, the estimated Sobol index is 0.0057 for the interaction between the variables ‘Alerted Population’, ‘Alert Issuance’, and ‘Hazard Identification’, and lower than 0.001 for the other combinations. As in the daytime optimistic scenario, these variables contribute more to the output of the model, which can be justified by their importance for the evacuation and mobilization of the population in case of imminent danger, unlike the pessimistic scenario, which assumes that the population is unprepared and that the alert and mobilization system is inefficient.

The use of computational tools to analyze the consequences of dam failure events is essential for a better understanding of the involved processes, as well as to provide guidelines for the planning of emergency actions, which, if efficiently implemented, may reduce the number of fatalities and environmental and socioeconomic damage. In this context, the present study resorted to the LifeSim agent-based model for investigating the loss of life that would result from the failure of the Pampulha Reservoir Dam, in Brazil. In addition, to provide a broader understanding of the influence of input variables in LifeSim responses, which is useful for designing guidelines for evacuation, we utilized the computationally inexpensive PCE technique for approximating the numerical model and the Sobol Indices for decomposing the variance of the model responses.

For simulating LifeSim, we considered the inputs alerted population, mobilized population, issuance of the alert and hazard identification as uniformly distributed variates – as we did not have detailed information on them in our study area – and assumed four distinct scenarios that differ in the efficiency of the alerts and the time of day at which the dam failure takes place. For a nighttime event, the inefficient scenario would result in 976 fatalities on average, whereas for the efficient one, the toll would reduce to 319 fatalities. On the other hand, a daytime event would entail 1,047 and 325 fatalities, for the inefficient and efficient scenarios, respectively. We note that, despite the very distinct activities carried out in the study area during day and night periods, the number of fatalities was quite similar in both cases, which is likely a result of the very uncertain conditions under which LifeSim is forced. Hence, further research on the ‘actual’ distribution function of the input variables would be beneficial for properly distinguishing between these situations. In addition, the advantages of designing proper alert and evacuation strategies are noticeable: even under very uncertain inputs, the number of fatalities is considerably decreased when the alert is properly communicated and the population is properly trained to react to imminent danger.

To more thoroughly explore LifeSim's output space, we built four metamodels that accurately approximate the numerical model responses under cross-validation – of course, the metamodels are heavily parametrized and, as a result, low levels of bias were expected a priori. The metamodels considerably reduced the computational costs for the stochastic simulation of loss of life resulting from the dam failure. More important, however, is that decomposing the variance of the PCE model is straightforward, and the computation of the Sobol indices allowed the influence of the random inputs to be readily estimated. Alert issuance and hazard identification were, by far, the most important contributing factors in pessimistic scenarios, but the alerted population and the mobilized population presented nonnegligible indices for the optimistic ones. Second-order analysis highlighted the importance of studying the pairwise joint effects of the variables, but, beyond this order, the joint effects are deemed too low. Overall, our results highlighted the importance of well-designed alert strategies, which would then stand out as a priority for decision-makers. However, a limitation of this method is the need to construct a surrogate model that properly represents the computational model being studied.

To sum up, the combination of LifeSim and the PCE metamodel enabled propagating uncertainty at reasonable costs and further investigating conditioning factors to the population behavior as a response to a critical dam failure, even in situations in which little information on alert and mobilization is available. We believe this knowledge may underpin updates in policies and safety measures for reducing the number of fatalities in the case of a dam failure. Of course, there are several aspects for improvement in future work. In effect, other applications to well-documented structures and systems may provide additional insights for prescribing the probability distributions of the stochastic inputs. Moreover, we intend to assess whether large natural floods might be used as proxies for technological ones, at least in some locations of the study area, for aggregating more information on the behavior of the population at risk – which could, in turn, indicate further limitations of LifeSim in describing the evacuation process.

The authors acknowledge the support to this research from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG). The authors also wish to acknowledge the anonymous reviewers and editors for the valuable comments and suggestions, which greatly helped improve the paper.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).