## Abstract

An uncertainty assessment framework based on the Karhunen–Loève expansion (KLE) and the probabilistic collocation method (PCM) was introduced to deal with flood inundation modelling under uncertainty. The Manning's roughness coefficients for the channel and floodplain were treated as 1D and 2D random fields, respectively, and decomposed by KLE. The maximum flow depths were represented by the 2nd-order PCM. The applicability of KLE-PCM was demonstrated through a flood modelling case with steady inflow hydrographs, based on five designed testing scenarios. The results showed that treating the Manning's roughness as a 1D/2D random field could efficiently alleviate the burden of random dimensionality within the analysis framework, and that the introduced method could significantly reduce the repetitive runs of the physical model required by traditional Monte Carlo simulation (MCS). The study sheds some light on reducing the computational burden associated with flood modelling under uncertainty, which is useful for the related damage quantification and risk management.

## INTRODUCTION

Floods are attracting increasing attention worldwide, owing to their tremendous damaging effects among natural hazards and their increasing frequency and severity under the impact of climate change (Milly *et al.* 2002; Adger *et al.* 2005). Flood inundation modelling is a useful tool for evaluating the vulnerability of flood-prone areas and supporting mitigation efforts. However, due to the inherent complexity of the flood model itself, the large number of parameters involved, and errors associated with input data or boundary conditions, there are always uncertainties that could seriously affect the accuracy, validity and applicability of the model outputs (Levy & Hall 2005; Pappenberger *et al.* 2008; Blazkova & Beven 2009; Simonovic 2009; Pender & Faulkner 2011; Altarejos-García *et al.* 2012). Monte Carlo simulation (MCS) has been a traditional stochastic approach to deal with the propagation of uncertainties from the input to the output of a modelling process, where synthetic sampling is used with hypothetical statistical distributions (Ballio & Guadagnini 2004; Loveridge & Rahman 2014). Based on the MCS approach, many further developments have been reported on uncertainty quantification for flood modelling processes, such as Markov Chain Monte Carlo (MCMC) and Generalized Likelihood Uncertainty Estimation (GLUE) (Isukapalli *et al.* 1998; Balzter 2000; Aronica *et al.* 2002; Qian *et al.* 2003; Peintinger *et al.* 2007).

MCS and other related methods are conceptually simple, and straightforward to use. However, in flood modelling, the repetitive runs of the numerical models normally require significant computational resources (Ballio & Guadagnini 2004; Liu *et al.* 2006). Another problem in flood modelling is the heterogeneity issue in uncertainty assessment. Due to the distributed nature of geological formation and land use condition, as well as a lack of sufficient investigation to obtain such information at various locations of the modelling domain, some parameters associated with boundary value problems (BVPs), such as Manning's roughness and hydraulic conductivity, are random fields in space (Roy & Grilli 1997; Liu & Matthies 2010). However, in the field of flood inundation modelling, such uncertain parameters are usually assumed as homogeneous for specific types of domains (e.g. grassland, farms, forest, developed urban areas, etc.) rather than heterogeneous fields, which could lead to inaccurate representation of the input parameter fields (Balzter 2000; Peintinger *et al.* 2007; Simonovic 2009; Grimaldi *et al.* 2013).

To improve the computational efficiency of MCS, the Polynomial Chaos Expansion (PCE) approach was proposed and applied in structural mechanics, groundwater modelling, and many other fields (Isukapalli *et al.* 1998; Xiu & Karniadakis 2002; Li *et al.* 2011a, 2011b). PCE is one of the stochastic response surface methods (SRSMs). It uses Galerkin projection to determine the polynomial chaos coefficients that relate the uncertain inputs to the outputs, and thereby transforms the highly nonlinear stochastic differential equations of the numerical model into deterministic ones (Ghanem & Spanos 1991; Isukapalli *et al.* 1998; Betti *et al.* 2012). However, Galerkin projection, as one of the key and complicated procedures of the PCE method, produces a large set of coupled equations, and the related computational requirement rises significantly when the number of random inputs or the PCE order increases. The probabilistic collocation method (PCM), as a computationally efficient alternative, was introduced to carry out multi-parametric uncertainty analysis of numerical geophysical models (Webster *et al.* 1996; Tatang *et al.* 1997). It uses Gaussian quadrature instead of Galerkin projection to obtain the polynomial chaos, which makes it more convenient to obtain the PCE coefficients based on a group of selected special random vectors called collocation points (CPs) (Li & Zhang 2007). PCM has previously been applied in a wide range of fields, such as groundwater modelling and geotechnical engineering (Isukapalli *et al.* 1998; Li & Zhang 2007; Li *et al.* 2011a, 2011b; Zheng *et al.* 2011; Mathelin & Gallivan 2012).

In terms of spatial randomness associated with parameters within the numerical modelling domains, the Karhunen–Loève expansion (KLE) was proposed to solve some types of BVPs involved in groundwater modelling, in which the heterogeneous fields of the uncertain inputs are assumed to have corresponding spectral densities and their random processes are represented by truncated KLE (Ghanem & Spanos 1991; Huang *et al.* 2007; Phoon *et al.* 2002; Zhang & Lu 2004). Zhang & Lu (2004) applied KLE decomposition to the random field of log-transformed hydraulic conductivity within the framework of uncertainty analysis of flow in random porous media. Liu & Matthies (2010) applied different kinds of truncated KLE to represent the multi-input random field composed of three uncertain parameters, where the topography and floodplain Manning's roughness coefficient were considered as two-dimensional (2D) spatial random fields. That study used the PCE method based on Galerkin projection to build up the SRSMs representing the outputs. Huang & Qin (2014a, 2014b) applied KLE to decompose the multi-input field of channel and floodplain Manning's roughness and analyzed the uncertainty propagation during the flood modelling process.

To deal with the stochastic numerical modelling field, stochastic approaches based on combined KLE and PCM (KLE-PCM) were proposed (Huang *et al.* 2007; Li & Zhang 2007). The general framework involves decomposition of the random input field with KLE and representation of the output field by PCE, by which the complicated forms of stochastic differential equations are transformed into straightforward ones. Previous KLE-PCM applications were mainly reported in studies of groundwater modelling and structural systems modelling (Zhang & Lu 2004; Li & Zhang 2007; Li *et al.* 2009; Shi *et al.* 2010). However, in the field of flood modelling, the related studies are rather limited. Recently, Huang & Qin (2014a, 2014b) attempted to use integrated KLE and PCM to quantify uncertainty propagation from a single 2D random field of floodplain hydraulic conductivity. The study indicated that the floodplain hydraulic conductivity could be effectively expressed by truncated KLE, and that the SRSMs for the output fields (maximum flow depths) could be successfully built up by the 2nd- or 3rd-order PCMs. However, this preliminary study only considered a single input of a 2D random field, which is a rather simplified condition in practical applications. In fact, as an essential BVP parameter frequently investigated for flood modelling, the stochastic distributions of Manning's roughness for the channel and floodplain are spatially varying, due to the geology of the channel and, largely, the land use of the floodplain. To address such an issue, adopting a coupled 1D/2D modelling scheme turns out to be a reasonable and attractive choice (Finaud-Guyot *et al.* 2011; Pender & Faulkner 2011). However, this requires more CPs in PCM, and makes it necessary to address joint distributions among multiple random inputs.

Therefore, as an extension to our previous work, this study aims to apply combined KLE and PCM (KLE-PCM) to deal with flood inundation modelling involving a 1D/2D random field. The Manning's roughness coefficients in the channel and floodplain are assumed to be 1D and 2D random fields, respectively; the hydraulic conductivity of the floodplain is considered as a 2D random field. KLE is used to decompose the input fields and PCM is used to represent the output ones. Using five testing scenarios with different input/parameter settings on a simplified flood modelling case, we attempt to demonstrate the effectiveness of KLE-PCM over the traditional MCS in dealing with a 1D/2D random field, which has rarely been tackled in previous works.

## METHODOLOGY

### Stochastic differential equations for flood modelling

Taking the Manning's roughness coefficient $n(\mathbf{x})$, the hydraulic conductivity $K_s(\mathbf{x})$, and the water depth $h(\mathbf{x})$ to be the uncertain variables of concern (involving both uncertain inputs and outputs), the stochastic governing equations for the flood flow can be written as Equations (1a) and (1b) (FLO-2D Software 2012; Huang & Qin 2014a, 2014b), where $h$ is the flow depth (L); $t$ is the time (T); $V$ is the depth-averaged velocity in each of the eight directions $\mathbf{x}$ (L/T); $\mathbf{x}$ is the 2D Cartesian coordinate $\mathbf{x} = (x, y)$ in the 2D overflow modelling, or the longitudinal distance along the channel in the 1D channel flow modelling (L); $\eta$ is the soil porosity; $K_s$ is the hydraulic conductivity (L/T); $\Psi_f$ is the dry suction (L), generally taking negative values; $F$ is the total infiltration (L); $\theta_s$ and $\theta_o$ are the saturated and initial soil moistures, respectively; $n$ is the Manning's roughness coefficient, representing either the floodplain roughness $n_f$ or the channel roughness $n_c$; $r$ is the hydraulic radius (L); and $S_o$ is the bed slope. Equations (1a) and (1b) can be solved numerically by FLO-2D for each of the eight directions (FLO-2D Software 2012).

In this study, two types of uncertain inputs are considered in the flood inundation modelling. The first type is the Manning's roughness coefficient. The general symbol $n(\mathbf{x})$ in Equation (1) can be split into the channel roughness $n_c(\mathbf{x})$ (as a 1D random field) and the floodplain roughness $n_f(\mathbf{x})$ (as a 2D random field). The second type of uncertain parameter is the floodplain hydraulic conductivity, denoted as $K_f(\mathbf{x})$, over the 2D floodplain modelling domain. The maximum (max) flow depth distribution over the entire modelling domain, $h(\mathbf{x})$, is taken as the modelling output. Subsequently, Equations (1a) and (1b) become stochastic partial differential equations, with the other items (e.g. $\eta$ and $\Psi_f$) kept unchanged in the governing equations, and can be solved with existing numerical models. Therefore, the output fields $h(\mathbf{x})$ are presented as probabilistic distributions or statistical moments (i.e. the mean and standard deviation).

### KLE representation for coupled 1D and 2D (1D/2D) random field

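Assuming that $z(\mathbf{x}, \omega)$ (e.g. the roughness coefficient $n$) is a random field with a log-normal distribution, let $Z(\mathbf{x}, \omega) = \ln z(\mathbf{x}, \omega)$, where $\mathbf{x} \in D$, $D$ is the measure of the domain size (length for a 1D domain, area for a 2D domain and volume for a 3D domain, respectively) and $\omega \in \Omega$ (a probabilistic space). Furthermore, $Z(\mathbf{x}, \omega)$ can be expressed as a normal random field with mean $\mu_Z(\mathbf{x})$ and fluctuation $Z'(\mathbf{x}, \omega)$, i.e. $Z(\mathbf{x}, \omega) = \mu_Z(\mathbf{x}) + Z'(\mathbf{x}, \omega)$. Herein, $Z(\mathbf{x})$ has a spatial fluctuation according to its bounded, symmetric, and positive covariance function $C_Z(\mathbf{x}, \mathbf{y}) = \langle Z'(\mathbf{x}, \omega) Z'(\mathbf{y}, \omega) \rangle$, which can be decomposed as (Ghanem & Spanos 1991; Zhang & Lu 2004):

$$C_Z(\mathbf{x}, \mathbf{y}) = \sum_{m=1}^{\infty} \lambda_m f_m(\mathbf{x}) f_m(\mathbf{y}) \tag{2}$$

where $\lambda_m$ are the eigenvalues and $f_m(\mathbf{x})$ are the eigenfunctions. The approximation of $Z(\mathbf{x})$ can be expressed by a truncated KLE with $M$ items in a limited form as follows (Ghanem & Spanos 1991):

$$Z(\mathbf{x}, \omega) \approx \mu_Z(\mathbf{x}) + \sum_{m=1}^{M} \sqrt{\lambda_m}\, f_m(\mathbf{x})\, \varsigma_m(\omega) \tag{3}$$

where $\varsigma_m(\omega)$ is the $m$th independent standard normal random variable (NRV). As $\sqrt{\lambda_m}$ and $f_m(\mathbf{x})$ generally show up in pairs, we can define an eigenpair as $\tilde{f}_m(\mathbf{x}) = \sqrt{\lambda_m}\, f_m(\mathbf{x})$. Then, Equation (3) can be simplified into (Zhang & Lu 2004):

$$Z(\mathbf{x}, \omega) \approx \mu_Z(\mathbf{x}) + \sum_{m=1}^{M} \tilde{f}_m(\mathbf{x})\, \varsigma_m(\omega) \tag{4}$$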

Theoretically, the uncertain input $z(\cdot)$ is assumed in advance to be a random process indexed by the position vector $\mathbf{x}$ defined over the domain $D$. In order to reach a predefined level of accuracy for the spectral representation of the random process, more items (i.e. eigenpairs) are needed in Equation (4). Correspondingly, more energy is kept within the input random field. Here, energy is an indicator of the accuracy of the spectral representation. However, keeping more items may lead to a higher computational requirement. For the 1D channel modelling domain, $M$ is the number of items kept in the 1D modelling direction; for a 2D rectangular physical domain, $M = M_x \times M_y$, where $M_x$ and $M_y$ represent the numbers of items kept in the $x$ and $y$ directions, respectively.
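The truncation and the notion of retained energy can be illustrated numerically. The following is a minimal sketch, not the procedure used in this study: it assumes an exponential covariance with an illustrative correlation length, discretizes the eigenproblem on a regular 1D grid with trapezoidal weights, and measures energy as the share of the total variance carried by the first few eigenvalues. The function names `kle_1d` and `realise_lognormal`, and all parameter values, are hypothetical.

```python
import numpy as np


def kle_1d(n_points=200, length=1.0, corr_len=0.3, variance=1.0, n_terms=3):
    """Discrete KLE of a 1D field with an (assumed) exponential covariance."""
    x = np.linspace(0.0, length, n_points)
    dx = x[1] - x[0]
    # Trapezoidal quadrature weights for the covariance eigenproblem
    w = np.full(n_points, dx)
    w[[0, -1]] = dx / 2.0
    cov = variance * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)
    # Symmetrise with sqrt(w) so that a symmetric eigen-solver can be used
    sqrt_w = np.sqrt(w)
    sym = sqrt_w[:, None] * cov * sqrt_w[None, :]
    eigvals, eigvecs = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1][:n_terms]   # largest eigenvalues first
    lam = eigvals[order]
    f = eigvecs[:, order] / sqrt_w[:, None]       # eigenfunctions on the grid
    # All eigenvalues sum to variance * length, so this is the retained share
    energy = lam.sum() / (variance * length)
    return x, lam, f, energy


def realise_lognormal(mean_log, lam, f, xi):
    """Evaluate exp(mu_Z + sum_m sqrt(lambda_m) f_m(x) xi_m) on the grid."""
    return np.exp(mean_log + f @ (np.sqrt(lam) * xi))


x, lam, f, energy = kle_1d(variance=0.1, n_terms=3)
# One realization of a log-normal roughness field; ln(0.03) is illustrative only
n_c = realise_lognormal(np.log(0.03), lam, f, np.array([0.0, np.sqrt(3.0), -np.sqrt(3.0)]))
print(f"share of variance retained by 3 items: {energy:.2f}")
```

Increasing `n_terms` raises the retained share at the cost of a higher random dimensionality, which is the trade-off described above.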

Moreover, in this study, a number of normalizations are applied in each dimension of the physical space, including: (i) a normalized length, where $L_x$ is the length of one side of the domain in a single direction (i.e. the $x$ direction defined in 1D channel modelling; the $x$ or $y$ direction for the 2D rectangular domain); (ii) a normalized correlation length; and (iii) normalized eigenvalues and normalized eigenfunctions (Zhang & Lu 2004). After normalization, the KLE representation of the 1D/2D input random field can be obtained based on the 1D and 2D random fields decomposed by Equation (4), respectively.

Taking $Z(\mathbf{x}) = \ln n(\mathbf{x})$ as an example, $n(\mathbf{x})$ can be divided into the 1D random field of the channel, $n_c(\mathbf{x})$, and the 2D random field of the floodplain, $n_f(\mathbf{x})$, with independent exponential functions as $Z_1(\mathbf{x}) = \ln n_c(\mathbf{x})$ and $Z_2(\mathbf{x}) = \ln n_f(\mathbf{x})$, respectively. Then, $Z_1$, $Z_2$ and $Z$ can be expressed by truncated KLEs as in Equation (5) (Huang *et al.* 2001), where $M_1$ and $M_2$ are the numbers of KLE items for $Z_1$ and $Z_2$, respectively. For the multi-input random field, the total number of KLE items depends on the dimensionality of each single 1D or 2D input random field and the relationship among them (Shi *et al.* 2010). For instance, if the 1D $n_c(\mathbf{x})$ and the 2D $n_f(\mathbf{x})$ are assumed under full correlation, the total random dimensionality $M$ of the 1D/2D random field $n(\mathbf{x})$ can be calculated by $M = M_1 + M_2 = M_1 + M_{2x} \times M_{2y}$, where $M_{2x}$ and $M_{2y}$ are the numbers of KLE items kept in the $x$ and $y$ directions of the rectangular domain, respectively. Compared with a coupled 2D/2D random field, the $n(\mathbf{x})$ in this study is treated as a 1D/2D field, so the total dimensionality of KLE ($M$) is reduced. When another input random field $K_s(\mathbf{x})$ is introduced, the dimensionality of this multi-input random field by KLE decomposition is calculated as $M = M_1 + M_2 + M_3 = M_1 + M_{2x} \times M_{2y} + M_{3x} \times M_{3y}$, where $M_{3x}$ and $M_{3y}$ are the numbers of KLE items kept in the $x$ and $y$ directions of the rectangular domain, respectively. Subsequently, the random field (single or multi-input) is transformed by KLE into a function of NRVs, and the dimensionality of the input random field is the number of NRVs used in Equation (5).

### PCE representation of max flow depth field *h*(**x**)

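PCE is used to represent the output field of the max flow depth $h(\mathbf{x})$ as follows (Li & Zhang 2007; Shi *et al.* 2009):

$$h(\mathbf{x}, \omega) = a_0(\mathbf{x}) + \sum_{i_1=1}^{\infty} a_{i_1}(\mathbf{x})\, \Gamma_1(\varsigma_{i_1}) + \sum_{i_1=1}^{\infty} \sum_{i_2=1}^{i_1} a_{i_1 i_2}(\mathbf{x})\, \Gamma_2(\varsigma_{i_1}, \varsigma_{i_2}) + \cdots \tag{6}$$

where $a_0(\mathbf{x})$ and $a_{i_1 i_2 \ldots i_d}(\mathbf{x})$ represent the deterministic PCE coefficients, and $\Gamma_d(\varsigma_{i_1}, \ldots, \varsigma_{i_d})$ are defined as a set of $d$-order orthogonal polynomial chaos for the random variables $(\varsigma_{i_1}, \ldots, \varsigma_{i_d})$. For this study, $(\varsigma_{i_1}, \ldots, \varsigma_{i_d})$ are assumed to be independent NRVs and $\Gamma_d$ are defined as $d$-order $Q$-dimensional Hermite polynomials (Wiener 1938; Li & Zhang 2007):

$$\Gamma_d(\varsigma_{i_1}, \ldots, \varsigma_{i_d}) = (-1)^d\, e^{\frac{1}{2} \boldsymbol{\varsigma}^{T} \boldsymbol{\varsigma}}\, \frac{\partial^d}{\partial \varsigma_{i_1} \cdots \partial \varsigma_{i_d}}\, e^{-\frac{1}{2} \boldsymbol{\varsigma}^{T} \boldsymbol{\varsigma}} \tag{7}$$

where $\boldsymbol{\varsigma}$ represents a vector of NRVs defined as $(\varsigma_{i_1}, \ldots, \varsigma_{i_d})^{T}$. The Hermite polynomials can be used to build up the best orthogonal basis for $\boldsymbol{\varsigma}$ and therefore to construct the random field of the output (Ghanem & Spanos 1991). For example, the 2nd-order PCE approximation of $h(\mathbf{x})$ can be expressed as (Shi *et al.* 2009):

$$h(\mathbf{x}) = a_0(\mathbf{x}) + \sum_{i=1}^{Q} a_i(\mathbf{x})\, \varsigma_i + \sum_{i=1}^{Q} a_{ii}(\mathbf{x}) \left(\varsigma_i^2 - 1\right) + \sum_{i=1}^{Q-1} \sum_{j>i}^{Q} a_{ij}(\mathbf{x})\, \varsigma_i \varsigma_j \tag{8}$$

where $Q$ is the number of NRVs. Equation (8) can be simplified as (Shi *et al.* 2009):

$$h(\mathbf{x}) = \sum_{i=1}^{P} c_i(\mathbf{x})\, \varphi_i(\boldsymbol{\varsigma}) \tag{9}$$

where $c_i(\mathbf{x})$ are the deterministic PCE coefficients, including $a_0(\mathbf{x})$ and $a_{i_1 i_2 \ldots i_d}(\mathbf{x})$, and $\varphi_i(\boldsymbol{\varsigma})$ are the Hermite polynomials in Equation (8). In this study, the number of NRVs is $Q$, and therefore the total number of items ($P$) can be calculated from $d$ and $Q$ as $P = (d + Q)!/(d!\, Q!)$.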
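The two counting rules introduced so far, the KLE dimensionality $M$ of a coupled 1D/2D field and the number of PCE items $P$, are simple enough to check in a few lines. This is only an illustrative sketch: the helper names `kle_dimension` and `pce_terms` are hypothetical, and the example inputs are the item counts reported in the case study (three items for the 1D field and 3 × 3 items for each 2D field).

```python
from math import comb


def kle_dimension(m_1d, m_2d_per_axis):
    """M = M_1 + sum of (M_x * M_y) over the 2D input fields."""
    return m_1d + sum(mx * my for mx, my in m_2d_per_axis)


def pce_terms(order, n_rv):
    """P = (d + Q)! / (d! Q!), i.e. the binomial coefficient C(d + Q, d)."""
    return comb(order + n_rv, order)


# 1D channel roughness plus 2D floodplain roughness
m_roughness = kle_dimension(3, [(3, 3)])
# Same field with an additional 2D hydraulic conductivity field
m_with_k = kle_dimension(3, [(3, 3), (3, 3)])

print(m_roughness, pce_terms(2, m_roughness))  # 12 91
print(m_with_k, pce_terms(2, m_with_k))        # 21 253
```

Because $P$ grows combinatorially with $Q$, each KLE item saved by treating the channel as 1D rather than 2D directly reduces the number of model runs.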

### KLE-PCM in flood inundation modelling

The general idea of PCM is a simplification of the traditional PCE method, in which the particular sets of $\boldsymbol{\varsigma}$ are chosen from the higher-order orthogonal polynomial for each parameter (Isukapalli *et al.* 1998; Li & Zhang 2007). By decomposing the spatially related random input fields with KLE and representing the output with PCM, KLE-PCM can transform the complicated nonlinear flood modelling problem into independent deterministic equations (Li & Zhang 2007; Sarma *et al.* 2005; Li *et al.* 2009). In this study, the framework of KLE-PCM is shown in Figure 1 and is described as follows (Li *et al.* 2011a, 2011b; Huang & Qin 2014a).

#### Step 1: KLE representation of inputs

We first identify $R$ uncertain parameters, i.e. $\mathbf{z}(z_1, z_2, \ldots, z_R)$, over the 1D/2D random field with assumed independent PDFs, according to the geological survey and site investigation. Then, we select $P$ CPs, with $P = (d + M)!/(d!\, M!)$, denoted as $\boldsymbol{\varsigma}_i = (\varsigma_{i1}, \varsigma_{i2}, \ldots, \varsigma_{iM})^{T}$, where $d$ is the order of the PCE, $M$ is the number of KLE items, and $i = 1, 2, \ldots, P$. The CPs are transformed by the truncated KLE into input combinations. The selection of an effective set of CPs $\boldsymbol{\varsigma}(\boldsymbol{\varsigma}_1, \boldsymbol{\varsigma}_2, \ldots, \boldsymbol{\varsigma}_P)$ follows the works of Webster *et al.* (1996), Huang *et al.* (2007), Shi & Yang (2009), and Li *et al.* (2011a, 2011b). As this is a crucial procedure of PCM that influences the method performance, an effective way of processing is to use the roots of the higher-order orthogonal polynomial, which has been shown to give higher precision than the Gaussian quadrature method (Huang *et al.* 2007; Shi & Yang 2009; Li *et al.* 2011a, 2011b). In this study, the CPs for the 2nd-order PCE expansion are chosen from the set $[0, -\sqrt{3}, \sqrt{3}]$, which are the roots of the 3rd-order Hermite polynomial $H_3(\varsigma) = \varsigma^3 - 3\varsigma$.
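The candidate values for the CPs can be checked numerically. The sketch below uses NumPy's probabilists' Hermite module to obtain the roots of $H_3$; the helper `n_candidates` is a hypothetical name that simply counts how many candidate points the tensor grid of roots contains, from which the $P$ CPs are then selected.

```python
import numpy as np
from numpy.polynomial import hermite_e

# Coefficients are given in the He_0..He_3 basis, so [0, 0, 0, 1] is
# He_3(s) = s^3 - 3s, the 3rd-order Hermite polynomial used in the text.
roots = hermite_e.hermeroots([0, 0, 0, 1])


def n_candidates(n_kle_terms, n_roots=3):
    """Size of the tensor grid of candidate CPs: one root per KLE item."""
    return n_roots ** n_kle_terms


print(roots)             # approximately [-1.732, 0, 1.732]
print(n_candidates(12))  # 531441 candidates, of which only P are kept
```

The gap between the number of candidates and the number of points actually kept is why the selection rule for the CPs matters for the performance of the method.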

#### Step 2: Numerical model runs

*P* realizations of input combinations are plugged into the numerical model (i.e. FLO-2D) to generate output field of the maximum flow depth *h*(**x**).

#### Step 3: Creation of SRSM

The SRSM is built by solving the linear system $\mathbf{M}\mathbf{C}(\mathbf{x}) = \mathbf{h}(\mathbf{x})$. The coefficient matrix $\mathbf{C}(\mathbf{x})$ is defined as $\mathbf{C}(\mathbf{x}) = [c_1(\mathbf{x}), c_2(\mathbf{x}), \ldots, c_P(\mathbf{x})]^{T}$, and $\mathbf{M}$ is a $P \times P$ matrix of Hermite polynomials constructed from the selected CPs, $\mathbf{M} = [\varphi_1(\boldsymbol{\varsigma}), \varphi_2(\boldsymbol{\varsigma}), \ldots, \varphi_i(\boldsymbol{\varsigma}), \ldots, \varphi_P(\boldsymbol{\varsigma})]^{T}$, which satisfies the condition $\mathrm{rank}(\mathbf{M}) = P$ and corresponds to the Hermite polynomial items in Equation (9) evaluated at the selected CPs. $\mathbf{h}(\mathbf{x})$ is the output matrix of the $P$ model runs generated in Step 2. Solving $\mathbf{M}\mathbf{C}(\mathbf{x}) = \mathbf{h}(\mathbf{x})$ gives the coefficient matrix $\mathbf{C}(\mathbf{x})$, which is identified as an SRSM for a specified multi-input random field involved in numerical modelling (i.e. flood inundation modelling). Subsequently, the statistical moments, such as the means and standard deviations of the max flow depths $h(\mathbf{x})$ in this study, can be calculated directly by (Li & Zhang 2007; Shi *et al.* 2009):

$$\langle h(\mathbf{x}) \rangle = c_1(\mathbf{x}), \qquad \sigma_h^2(\mathbf{x}) = \sum_{i=2}^{P} c_i^2(\mathbf{x})\, \langle \varphi_i^2 \rangle \tag{10}$$
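Steps 1 to 3 can be traced end to end on a toy problem. The following is a minimal sketch under stated assumptions, not the implementation used with FLO-2D: it takes only $Q = 2$ NRVs, replaces the numerical model with a hypothetical quadratic function `toy_model`, and selects CPs with a simple greedy rule (fewest non-zero entries first, keeping a point only if it raises the rank of $\mathbf{M}$), which is one possible reading of the rank condition above. The mean is the first coefficient and the variance uses $\langle \varphi_i^2 \rangle$ for standard normal inputs.

```python
import itertools
import numpy as np


def hermite_basis_2nd(xi):
    """2nd-order Hermite basis at one point: 1, xi_i, xi_i^2 - 1, xi_i * xi_j."""
    q = len(xi)
    terms = [1.0]
    terms += [xi[i] for i in range(q)]
    terms += [xi[i] ** 2 - 1.0 for i in range(q)]
    terms += [xi[i] * xi[j] for i in range(q) for j in range(i + 1, q)]
    return np.array(terms)


def basis_norms_2nd(q):
    """<phi_i^2> for standard normal inputs, in the same order as the basis."""
    return np.array([1.0] + [1.0] * q + [2.0] * q + [1.0] * (q * (q - 1) // 2))


def select_cps(q):
    """Greedy CP selection (assumed rule): keep points that raise rank(M)."""
    roots = (0.0, -np.sqrt(3.0), np.sqrt(3.0))
    candidates = sorted(itertools.product(roots, repeat=q),
                        key=lambda p: sum(v != 0.0 for v in p))
    p_terms = len(basis_norms_2nd(q))
    rows, points = [], []
    for cand in candidates:
        trial = rows + [hermite_basis_2nd(cand)]
        if np.linalg.matrix_rank(np.array(trial)) == len(trial):
            rows, points = trial, points + [cand]
        if len(rows) == p_terms:
            break
    return np.array(points), np.array(rows)


def fit_srsm(model, q):
    """Run the model at each CP, solve M C = h, and return C, mean and Std."""
    points, m_matrix = select_cps(q)
    h = np.array([model(p) for p in points])  # one model run per CP
    coeffs = np.linalg.solve(m_matrix, h)
    mean = coeffs[0]
    std = np.sqrt(np.sum(coeffs[1:] ** 2 * basis_norms_2nd(q)[1:]))
    return coeffs, mean, std


def toy_model(xi):
    """Hypothetical stand-in for the flood model: exactly quadratic in xi."""
    return 1.0 + 0.2 * xi[0] + 0.1 * (xi[1] ** 2 - 1.0) + 0.05 * xi[0] * xi[1]


coeffs, mean, std = fit_srsm(toy_model, q=2)
print(coeffs, mean, std)
```

Because the toy model lies exactly in the span of the 2nd-order basis, the six runs recover its coefficients; for a real flood model the SRSM is only an approximation, which is why Step 4 compares it against MCS.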

#### Step 4: Selection of optimal SRSM

Based on the means and standard deviations of $h(\mathbf{x})$ obtained in Step 3, the root mean squared error (RMSE), the coefficient of determination ($R^2$), the relative error of the predicted means ($E_{c,k}$) and the relative error of the predicted confidence interval ($E_{b,k}$) are used to evaluate the validity and applicability of the KLE-PCM models, as defined in Equation (11) (O'Connell *et al.* 1970; Karunanithi *et al.* 1994; Yu *et al.* 2015). In Equation (11), $k$ denotes the $k$th grid element of concern and $K$ represents the total number of the concerned grid elements; $h_k$ and its KLE-PCM counterpart are the maximum water depths in the $k$th grid element predicted by the MCS approach and by KLE-PCM, respectively, each with its corresponding mean; and the subscripts $u$, $c$ and $l$ represent the 95th, 50th, and 5th percentiles of the maximum water depths predicted by KLE-PCM and MCS. By using Equation (11), the performance of the established SRSMs is compared with the results calculated directly by MCS, from which the optimal SRSM is chosen for future predictions. Therefore, within a physical domain involving a multi-input random field, if an appropriate SRSM is developed for one scenario, it can be used to carry out predictions for future scenarios that occur in the same modelling domain with the same BVP. More technical details about the methodology can be found in the Supplementary material (available with the online version of this paper).
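As a rough illustration of how such a comparison against MCS might be scripted, the sketch below computes RMSE, $R^2$ and a relative error of the central estimate over a set of grid elements. The exact definitions in Equation (11) are not reproduced here, so the forms used are assumptions: $R^2$ is taken as the squared Pearson correlation, and the relative error as the absolute difference divided by the MCS value. All function names are hypothetical.

```python
import numpy as np


def rmse(h_mcs, h_pcm):
    """Root mean squared error between MCS and KLE-PCM values over K grid elements."""
    h_mcs, h_pcm = np.asarray(h_mcs, float), np.asarray(h_pcm, float)
    return float(np.sqrt(np.mean((h_mcs - h_pcm) ** 2)))


def r_squared(h_mcs, h_pcm):
    """Squared Pearson correlation (assumed definition of R^2)."""
    r = np.corrcoef(h_mcs, h_pcm)[0, 1]
    return float(r ** 2)


def relative_error(h_mcs, h_pcm):
    """|KLE-PCM - MCS| / MCS per grid element (assumed form of the relative error)."""
    h_mcs, h_pcm = np.asarray(h_mcs, float), np.asarray(h_pcm, float)
    return np.abs(h_pcm - h_mcs) / h_mcs


# Illustrative depths (m) only; these are not results from the study
mcs = [1.20, 0.85, 0.40, 0.10]
pcm = [1.18, 0.88, 0.41, 0.11]
print(rmse(mcs, pcm), r_squared(mcs, pcm), relative_error(mcs, pcm))
```

A relative error of this form becomes unstable where the MCS depth is close to zero, so in practice it would be restricted to inundated grid elements.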

## CASE STUDY

### Background

We chose a flood inundation case modified from Horritt & Bates (2001) and Aronica *et al.* (2002) to demonstrate the applicability of the 2nd-order KLE-PCM method. The basic settings are as follows (Aronica *et al.* 2002; Huang & Qin 2014a): (i) boundary conditions: a steady upstream flow (i.e. inflow hydrograph) of 73 m³/s, corresponding to a five-year return period flood event, with a steady downstream flow of 67.139 m³/s at a gauged weir at Buscot; (ii) relatively flat topography within a rectangular modelling domain: a 50-m resolution DEM varying from 67.73 to 83.79 m, with the modelling domain divided into 3,648 (76 × 48) grid elements; (iii) channel cross-section: rectangular, 25 m in width by 1.5 m in depth; (iv) Manning's roughness coefficient (*n*): *n* is suggested as 0.06 for the floodplain and 0.03 for the channel. More information about this testing case can be found in Aronica *et al.* (2002). The flood inundation is numerically modelled by FLO-2D, with channel flow being 1D and floodplain flow being 2D.

In order to evaluate the performance of KLE-PCM in dealing with flood simulation with a 1D/2D random input field, five scenarios are designed (Table S1 in the Supplementary material, available with the online version of this paper). Scenarios 1 and 2 are used to evaluate the uncertainty assessment based on the 1D/2D random field of Manning's roughness, namely $n_c(\mathbf{x})$ for the channel and $n_f(\mathbf{x})$ for the floodplain, without and with the 2D random field of floodplain hydraulic conductivity $K_f(\mathbf{x})$, respectively. Scenarios 3–5 are designed with three different inflows (i.e. 36.5, 146 and 219 m³/s, respectively). Scenarios 1 and 2 are meant for identifying the optimal SRSM, and Scenarios 3–5 are employed for evaluating the performance of the optimal SRSM in predicting different flooding events under uncertainty. Six parameter sets for the 2nd-order SRSMs and six for the 3rd-order ones are tested based on KLE-PCM under Scenarios 1–5 (the parameter sets are given in the Supplementary material). These parameter settings are shortlisted based on the testing outcomes from a large number of SRSMs (more than 200) under each scenario. For benchmarking purposes, the results from 5,000 realizations of MCS-based sampling for Scenario 1, and 10,000 realizations for Scenarios 2–5, are calculated. Based on our tests, the adopted numbers are sufficient to ensure PDF convergence of the results; a further increase in these numbers causes only marginal changes in the outputs.

### Results analysis

#### 1D/2D Random field of Manning's roughness coefficient

In Scenario 1, the random field $n(\mathbf{x})$ is decomposed by KLE, which requires 12 items (i.e. $M = M_1 + M_2 = 3 + 3^2$, where $M_1 = 3$ and $M_2 = 3^2$ are taken for the 1D and 2D random fields, respectively). Accordingly, 91 CPs (i.e. $P = (d + M)!/(d!\, M!) = (2 + 12)!/(2!\, 12!) = 91$) are chosen for the 2nd-order KLE-PCM, leading to 91 realizations of the 1D/2D random fields (namely 91 runs of the numerical model). Table S2 (in the Supplementary material, available online) shows two sets of CPs for Scenario 1, and Figure 2 illustrates four corresponding random field realizations for the floodplain Manning's roughness over the modelling domain. It can be seen that the 1D/2D random field (i.e. $n_c(\mathbf{x})$ coupled with $n_f(\mathbf{x})$) generated by KLE (in KLE-PCM) is essentially different from MCS-based sampling (in the MCS method), and these sets of CPs can be used for further computation of statistical moments (as shown in Equation (10)). In addition, the random fields are reflected within the domains of both the floodplain and the channel. For example, Figure 2(c) represents the 35th realization of the 1D/2D random field of the channel/floodplain Manning's roughness over the entire modelling domain, which is different from the 35th realization of the pure 2D random field shown in Figure 2(a). The coupled 1D/2D random field is therefore more reasonable and closer to real flood modelling scenarios.

In Scenario 1, the 2nd-order KLE-PCM model built up with 91 realizations (denoted as SRSM-91) is applied to the flood inundation case. The number 91 is selected based on our tests with various alternatives. These tests indicate that, to ensure a reasonable fitting performance of the SRSM, the appropriate range of $\eta_{nc}$ or $\eta_{nf}$ should be between 0 and 0.1; after further testing many possible combinations of $\eta_{nc}$ and $\eta_{nf}$, we have selected the six best sets of $\eta_{nc}$ and $\eta_{nf}$ for building the corresponding SRSMs (Table S3 in the Supplementary material, available online). Figure 3 shows the simulated means and standard deviations (Stds) of the maximum flow depth $h(\mathbf{x})$ from the six SRSM-91s and MCS (with 5,000 realizations) along the cross-sections of $x_N$ = 11/76, 30/76 and 60/76 over the physical domain. These cross-sections are located in the upstream, middle stream and downstream of the channel. It can be seen that the mean depths from all SRSM-91s with different sets of $\eta_{nc}$ and $\eta_{nf}$ (denoted as Set 1 to Set 6 in Table S3) fit fairly well with those from MCS at the three profiles. However, when it comes to approximating the Stds of $h(\mathbf{x})$, these SRSM-91s demonstrate different simulation capacities, and Set 1 shows the most satisfying performance (average RMSE = 0.0083, as shown in Table S3). The approximation performance of the SRSM-91s also varies with profile location. Taking SRSM-91 with Set 1 for instance, when the location of the profile changes from upstream to downstream, the corresponding RMSE increases from 0.0043 to 0.0115 m. The above results demonstrate that the 2nd-order KLE-PCM (i.e. SRSM-91 with Set 1) could reasonably reproduce the stochastic results of MCS in Scenario 1, but with only 91 runs of the numerical model (compared with 5,000 realizations of MCS). Overall, establishing an SRSM with suitable parameters appears to be a cost-effective way of addressing uncertainty associated with large-scale spatial variability in flood inundation modelling.

#### 1D/2D random field of Manning's roughness coefficient coupled with 2D random field of floodplain hydraulic conductivity

Based on the random field in Scenario 1, an additional 2D random input field of floodplain hydraulic conductivity $K(\mathbf{x})$ is added in Scenario 2. Such a case represents a more complicated multi-input random field that is more common in flood modelling. For this scenario, the random dimensionality of KLE is $M = 3 + 3^2 + 3^2 = 21$, and accordingly, the number of items for the 2nd-order PCM is $P = (d + M)!/(d!\, M!) = (2 + 21)!/(2!\, 21!) = 253$. The performance of the 2nd-order KLE-PCM is examined and compared with MCS based on 10,000 realizations.

In Scenario 1, SRSM-91 with Set 1 achieves the best performance among the six alternatives. Similarly, for Scenario 2, based on the best SRSM-91 parameters, we have further built up six SRSMs (with various combinations of $\eta_{nc}$–$\eta_{nf}$–$\eta_{kf}$ values) to test the applicability of the 2nd-order KLE-PCM with 253 items. Table S3 shows the detailed RMSE values of SRSM-253 regarding the Std fitting. It is found that the approximations of the mean depths from the SRSM-253s are generally in good agreement with the MCS results for the concerned profiles; however, the approximations of the Stds show more notable variations compared with those from MCS. The SRSM-253 with Set 1 ($\eta_{nc}$ = 0.03, $\eta_{nf}$ = 0.03 and $\eta_{kf}$ = 0.03) achieves the best performance among all SRSM-253 alternatives. It is also found that the capability of the SRSM varies with profile location, which is consistent with the results of SRSM-91. A potential reason is that there is relatively less meandering within the upper part of the channel and the water is deeper, which could lead to a higher deviation of the model results.

Figure 4 shows the spatial distributions of the means and Stds of $h(\mathbf{x})$ over the entire modelling domain simulated by SRSM-253 with Set 1 of parameters and by MCS. Overall, the statistics of the simulated $h(\mathbf{x})$ from SRSM-253 are fairly close to those from MCS, especially for the means. The simulated Stds of $h(\mathbf{x})$ from the two methods are generally consistent with each other, except that SRSM-253 leads to some overestimation in the middle part of the floodplain. In terms of computational efficiency, SRSM-253 needs to run the numerical model 253 times, which is significantly fewer than the 10,000 runs used by MCS for the same random field.

#### Prediction under different inflow scenarios

From the above test, the SRSM-253 with Set 1 ($\eta_{nc}$ = $\eta_{nf}$ = $\eta_{kf}$ = 0.03) is found to be the optimal SRSM-253 to deal with the BVP involving the multi-input random field in Scenario 2. In this section, we examine the performance of this optimal surrogate in predicting different inflow scenarios with the same random field as in Scenario 2. The steady inflows in the testing scenarios (i.e. Scenarios 3–5) are designed as 36.5, 146 and 219 m³/s, respectively, representing low, medium and high levels of future flooding for the study region.

Figure 5 shows a comparison of the statistics of results forecasted by SRSM-253 with Set 1 and the corresponding MCS (with 10,000 realizations) along the cross-section profile *x _{N}* = 21/76. It appears that more grid elements become inundated as the inflow level increases, which leads to a wider range of higher means and Stds under higher inflow conditions. From Figure 5, the predicted means are fairly close to those from MCS, with RMSE being 0.0488, 0.0724 and 0.0811 m and R^{2} being 0.998, 0.998 and 0.998 for inflows of 36.5, 146 and 219 m^{3}/s, respectively. The predicted Stds from SRSM-253 generally fit well with those from MCS for Scenarios 3–5. However, when the inflow changes to different levels, the predicted Stds for some grid elements deviate somewhat from the benchmark. Figure 5(b) shows that the predicted Stds at the two extreme points (i.e. around the channel area with an index of 0.23 along profile *x _{N}* = 21/76) are about 35.8% higher than those from MCS when the future inflow is 36.5 m^{3}/s. When the inflow increases to 146 m^{3}/s, there is a series of overestimations of Stds along the indexes from 0.4 to 0.5, with average relative errors of around 20%. When the inflow increases to 219 m^{3}/s, there is some underestimation (about 11.4–31.2%) around the channel area and overestimation (about 0.4–45.1%) over the floodplain (with indexes ranging from 0.3 to 0.6). Considering that the magnitudes of the Stds are much lower than those of the means, the overall fitting of SRSM-253 is quite comparable to that of MCS, while its computational needs are significantly lower.
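The goodness-of-fit statistics quoted above can be computed along a profile as in the following sketch. It assumes that R^{2} is the coefficient of determination with the MCS result as the reference, since the exact definition is not restated in this section, and the depth values are hypothetical numbers chosen only for illustration.

```python
import math

def rmse(pred, ref):
    """Root mean square error of the surrogate against the reference."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def r_squared(pred, ref):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_ref = sum(ref) / len(ref)
    ss_res = sum((r - p) ** 2 for p, r in zip(pred, ref))
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - ss_res / ss_tot

def percent_deviation(pred, ref):
    """Signed relative deviation of a predicted value from the reference, in percent."""
    return (pred - ref) / ref * 100.0

# Hypothetical mean depths (m) at five grid elements along one profile.
mcs_means = [0.0, 0.5, 1.0, 1.5, 2.0]
srsm_means = [0.1, 0.5, 0.9, 1.5, 2.0]

profile_rmse = rmse(srsm_means, mcs_means)
profile_r2 = r_squared(srsm_means, mcs_means)
print(f"RMSE = {profile_rmse:.4f} m, R2 = {profile_r2:.3f}")
```

The same `percent_deviation` helper applies to the Std comparison at individual grid elements, where a positive value indicates overestimation by the surrogate.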

Figure 6 shows the confidence intervals of max flow depths for three different locations. They are generated based on the predicted means and Stds from the optimal SRSM (SRSM-253 with Set 1) and MCS under the three inflow conditions. Herein, the max flow depths are the peak values occurring along the profiles *x _{N}* = 21/76, 30/76 and 68/76, and their locations are grid (21/76, 11/48) in the upstream, grid (30/76, 17/48) in the middlestream and grid (68/76, 22/48) in the downstream. It can be seen from Figure 6(a) that, for the lower inflow condition (36.5 m^{3}/s), the SRSM provides better prediction for peak depths located in the downstream than for those in the upstream and middlestream. This may be because of the existence of more complicated terrains (i.e. meanders) around grids (30/76, 17/48) and (68/76, 22/48), which leads to a more strongly nonlinear relationship and more divergence of the predicted intervals. For higher inflow levels (146 and 219 m^{3}/s), the predicted intervals of peak depths reproduce those from MCS very well for the three locations, with average *E _{b,c}* being 3.2% and average *E _{b,k}* being 19.1%. This implies that SRSM is better suited to higher flow conditions, where sensitive areas such as dry or meandering locations could become less sensitive once they are inundated. Overall, the study results verify that the SRSM-253 with Set 1 could be used to predict peak depths for different events within the 1D/2D modelling domain involving the multi-input random field, which is useful for further flood inundation risk assessment.
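As an illustration of how an interval can be derived from a predicted mean and Std, the sketch below builds a two-sided interval under a normal approximation. The normality assumption, the 95% level, the clipping of the lower bound at zero depth and the input values are all assumptions made for this example; the construction used for Figure 6 may differ.

```python
from statistics import NormalDist

def confidence_interval(mean, std, level=0.95):
    """Two-sided interval mean +/- z*std under a normal approximation (assumed)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = max(mean - z * std, 0.0)   # a flow depth cannot be negative
    upper = mean + z * std
    return lower, upper

# Hypothetical peak-depth statistics (m) at one grid location.
srsm_ci = confidence_interval(2.0, 0.5)
mcs_ci = confidence_interval(2.1, 0.45)
print("SRSM interval:", srsm_ci)
print("MCS interval: ", mcs_ci)
```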

#### Further discussion

From the five testing scenarios, KLE-PCM is demonstrated to be cost-effective in dealing with complex BVPs involving coupled 1D/2D random fields of Manning's roughness and hydraulic conductivity. The calibration process still involves some effort in testing the optimal parameters by comparison with MCS; however, the prediction process becomes more efficient for future events, as only a limited number of runs of the numerical model is needed. In terms of accuracy, KLE-PCM has proved effective in generating results comparable to those from the direct MCS-based sampling scheme. Compared with applications of KLE-PCM in the groundwater modelling field (Li & Zhang 2007; Shi & Yang 2009), there are a number of differences. Firstly, flood modelling involves a much larger spatial variability of input parameters due to a larger modelling domain of surface land conditions. This leads to a more complicated (single- or multi-) input random field affecting the output field, whose representation by KLE would involve a notably different scale of correlation lengths and a different number of KLE items. Secondly, the flood inundation problem normally involves a higher level of non-linearity and complexity due to the coupled 1D and 2D considerations for input parameters; by comparison, the groundwater modelling system only involves 1D or 2D settings. Thirdly, groundwater cases normally involve limited point-based measurements of soil properties, which gives sufficient grounds for conducting a random-field-based uncertainty assessment. In flood inundation modelling, by contrast, the land cover is normally observable and the range of spatial variability would be restricted by the specific type of land use. However, there could still be large uncertainty associated with the parameter estimation, as there are hardly any effective approaches to accurately measure parameters such as the roughness coefficient over a large region like a floodplain. This study has successfully demonstrated the effectiveness of KLE-PCM in dealing with large-scale spatial variability of BVP parameters and coupled 1D/2D random fields. The related findings are useful for supporting real-scale flood modelling under uncertainty and the related risk assessment and management.

In this study, only two types of land cover are considered: the floodplain (pre-classified as cropland with a mean roughness coefficient of around 0.06) and the river channel (with a mean roughness coefficient of around 0.03). Each land cover is assumed to be a random field with stochastic features described by assumed distributions. The related stochastic parameters should be properly bounded within a reasonable range in order to generate stochastic variable values that are consistent with physical observations. In more complicated cases, there could be more than two types of land cover. It would then be necessary to apply KLE-PCM to each surface type in order to better represent the physical distribution of the fields. However, this may considerably increase the computational burden, which could make the method less attractive compared with conventional MCS. Further exploration of this point is necessary. Another point deserving notice is that our study focuses on forward-type uncertainty propagation, which is similar to MCS. The mean values of the roughness coefficients were obtained from a model calibration in which an actual satellite image of the inundation was used (Aronica *et al.* 2002). In this study, we used these mean values to help define the stochastic parameters of concern. Observation data are not essential in our study, as the MCS results can be used as the benchmark for comparison. This has been a common practice in KLE/PCM-related studies, e.g. Xiu & Karniadakis (2002), Shi *et al.* (2010), and Li *et al.* (2011a, 2011b). Potential studies on inverse problems, such as those using the Bayesian approach, could be explored in the future.

Although the computational burden is largely alleviated by KLE-PCM compared with traditional MCS, there are also some limitations. Firstly, when more input random fields are involved in the modelling system, accurately decomposing such fields requires a KLE with more items and a polynomial chaos matrix of much higher rank to build the corresponding SRSM, whose construction is time-consuming. Secondly, in this study, we only consider steady inflow conditions. In practical applications, there could be unsteady inflow scenarios, which involve much more strongly non-linear relationships and more parameters for building acceptable SRSMs. Finally, the selection of CPs is also time-consuming when the dimensionality of the multi-input random field represented by KLE is high. In order to obtain a higher accuracy of the SRSM, a full-rank matrix of Hermite polynomials is required, so the selection of CPs is a crucial procedure for the whole framework of KLE-PCM. How to conduct a cost-effective stochastic sampling of the CPs needs further exploration.
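The growth in cost with dimensionality can be illustrated by counting expansion terms. The sketch below assumes the standard total-degree truncation, under which a *d*-th order polynomial chaos expansion in *M* random dimensions has (*M* + *d*)!/(*M*! *d*!) terms, and that at least one collocation point (one model run) is needed per unknown coefficient. The dimensionalities 12 and 21 are inferred here only from the item counts 91 and 253 quoted in this study and are not stated in this section.

```python
from math import comb

def n_pce_terms(n_dims, order=2):
    """Number of terms in a total-degree polynomial chaos expansion."""
    return comb(n_dims + order, order)

# Item counts consistent with SRSM-91 and SRSM-253 under the assumed truncation.
print(n_pce_terms(12))            # 91
print(n_pce_terms(21))            # 253

# Adding random dimensions or raising the order quickly increases the count.
for n_dims in (12, 21, 40):
    print(n_dims, n_pce_terms(n_dims, order=2), n_pce_terms(n_dims, order=3))
```

Under these assumptions, moving from a 2nd-order to a 3rd-order expansion in 21 dimensions raises the count from 253 to 2,024, which shows why higher-order or higher-dimensional SRSMs quickly lose their cost advantage.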

## CONCLUSIONS

This study addressed the issue of parameter uncertainty associated with a coupled 1D and 2D (1D/2D) random field of Manning's roughness in modelling the flood inundation process under steady inflow conditions. We built an optimal 2nd-order KLE-PCM with 91 items (SRSM-91) to deal with the 1D/2D random input field of Manning's roughness in Scenario 1, and then a 2nd-order KLE-PCM with 253 items (SRSM-253) to handle a multi-input random field (by adding hydraulic conductivity) in Scenario 2. Both SRSMs were used to test the applicability of KLE-PCM. Furthermore, Scenarios 3–5 (designed with steady inflows of 36.5, 146 and 219 m^{3}/s, respectively) were used to test the prediction capability of the established SRSM-253 with the best parameter set under different flood scenarios. The study results demonstrated that KLE-PCM was cost-effective in obtaining the means and standard deviations of the water depth compared with MCS. They also indicated that the established SRSM-253 had good prediction capacity in terms of the confidence intervals of the max flow depths within the flood modelling domain.

From this study, a number of limitations were found and are expected to be tackled in future work: (i) many practical flood simulations involve unsteady inflow hydrographs, which means the inflow hydrograph should be addressed as one of the temporal uncertainties; (ii) when more 1D/2D input random fields are involved in the flood modelling process, the dimensionality of the multi-input random field would increase notably, which calls for more efficient algorithms for identifying CPs; (iii) when the flood inundation modelling is to be coupled with other processes such as hydrological modelling, the cost-effectiveness of KLE-PCM needs to be further verified.

## ACKNOWLEDGEMENTS

This research is based on research work supported by Singapore's Ministry of Education (MOE) AcRF Tier 1 (Ref no. RG188/14; WBS no. M4011420.030) Project and in part by Nanyang Technological University (NTU) Start-Up Grant (WBS no. M4081327.030). The authors are also grateful to Dr Paul Bates (University of Bristol) for providing the relevant test data.

## REFERENCES

*.*

**10**, 012208

*.*

*MIT Joint Program on the Science and Policy of Global Change Report Series No. 4*