## Abstract

Uncertainty quantification is very important in environmental management to allow decision makers to consider the reliability of predictions of the consequences of decision alternatives and relate them to their risk attitudes and the uncertainty about their preferences. Nevertheless, uncertainty quantification in environmental decision support is often incomplete and the robustness of the results regarding assumptions made for uncertainty quantification is often not investigated. In this article, an attempt is made to demonstrate how uncertainty can be considered more comprehensively in environmental research and decision support by combining well-established with rarely applied statistical techniques. In particular, the following elements of uncertainty quantification are discussed: (i) using stochastic, mechanistic models that consider and propagate uncertainties from their origin to the output; (ii) profiting from the support of modern techniques of data science to increase the diversity of the exploration process, to benchmark mechanistic models, and to find new relationships; (iii) analysing structural alternatives by multi-model and non-parametric approaches; (iv) quantitatively formulating and using societal preferences in decision support; (v) explicitly considering the uncertainty of elicited preferences in addition to the uncertainty of predictions in decision support; and (vi) explicitly considering the ambiguity about prior distributions for predictions and preferences by using imprecise probabilities. In particular, (v) and (vi) have mostly been ignored in the past and a guideline is provided on how these uncertainties can be considered without significantly increasing the computational burden. The methodological approach to (v) and (vi) is based on expected expected utility theory, which extends expected utility theory to the consideration of uncertain preferences, and on imprecise, intersubjective Bayesian probabilities.

## INTRODUCTION

Scientific integrity principles and ethical concerns require scientists to be open about, and to proactively communicate, the uncertainty in their predictions. In environmental decision support, predictions of the consequences of decision alternatives have to be assessed for the fulfilment of societal goals. This adds further uncertainty to the decision support process, because elicited societal preferences are themselves uncertain. Sources of this uncertainty are the uncertainty of and temporal changes in the preferences of individuals, the different perceptions and values of different individuals in society, and the uncertainties induced by the parameterization of quantified preferences and by the elicitation process.

Over the past few decades, many promising concepts and methodologies have been developed to address these uncertainties. Nevertheless, most environmental decision support processes address uncertainty only incompletely. To stimulate more comprehensive uncertainty analyses in the future, it is the goal of this paper to review and discuss techniques that can contribute to better considering uncertainty in environmental research and decision support and to outline a decision support procedure that more comprehensively addresses uncertainty. In particular, the goal is to emphasize how the uncertainty of preferences and the ambiguity in specifying prior distributions can be included in a decision support process as these uncertainties, despite their high relevance, are often neglected in environmental decision support.

Many reviews on modelling environmental systems and on the modelling process exist. Most of these reviews emphasize the importance of quantifying uncertainty (e.g. Clark 2005; Refsgaard *et al.* 2007; Schuwirth *et al.* 2019). Other studies focus on expert or stakeholder involvement in model building (e.g. Voinov & Bousquet 2010; Krueger *et al.* 2012; Voinov *et al.* 2016) which also relates to extending structural diversity and considering uncertainty. This article builds on this literature, but focuses on useful methodologies for uncertainty quantification rather than on guidelines for the model building or decision support process.

In a modelling process that primarily focuses on increasing our understanding of the investigated system, we need:

(a) a model that attempts to describe the underlying mechanisms of the observed behaviour of a system and makes it possible to test these model formulations (‘hypotheses’) with observed data.

To support environmental decisions, we need:

(b1) a description of societal preferences about what should be achieved, ideally in quantitative terms expressed as functions of observable system attributes; and

(b2) a model that is based on the current state of scientific knowledge and predicts the consequences of decision alternatives on output variables that are relevant for assessing the fulfilment of the societal goals (the attributes mentioned in b1).

Note that the requirements for models of category (a) and models of category (b2) are somewhat different, so that despite many synergies, a model designed for (a) is not always the best model for (b2). In particular, a model for decision support (b2) has to predict the output variables (attributes) used to quantify the preferences (b1) and it has to describe the dependence of these on input variables that distinguish the decision alternatives (Schuwirth *et al.* 2019).

In the following sections, conceptual aspects of uncertainty quantification and model building are discussed before moving on to models for predicting the consequences of decision alternatives and models for quantifying preferences. Then a decision support framework with comprehensive uncertainty quantification is described. This is followed by a short section on numerical approaches and software to implement the suggested techniques. Finally, conclusions are drawn about the transfer of the suggested techniques to research and practice.

## CONCEPTUAL ASPECTS OF UNCERTAINTY QUANTIFICATION AND MODEL BUILDING

### The need for Bayesian techniques

The most important conceptual decision is on how to describe uncertain knowledge. Bayesian (epistemic) probabilities are the most straightforward choice as they provide a consistent framework for conditional beliefs and iterative learning and they are compatible with adopting randomness (aleatory probabilities) as part of the uncertain knowledge about future outcomes (Reichert *et al.* 2015).

As environmental decisions should be based on the best available current state of scientific knowledge, prior knowledge must be carefully elicited as intersubjective knowledge (Gillies 1991; Gillies 2000; Reichert *et al.* 2015). This is implemented in practice by using well-designed eliciting techniques (Morgan & Henrion 1990; Meyer & Booker 2001; O'Hagan *et al.* 2006; Rinderknecht *et al.* 2011) to elicit priors, combining the assessments from multiple experts, and by carefully documenting the use of prior information from the literature to get justifiable priors.

### Beyond prior times likelihood – considering intrinsic uncertainty

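Bayesian inference is commonly summarized by Bayes' theorem for the model parameters:

$$p(\boldsymbol{\theta} \mid \mathbf{y}_{\mathrm{obs}}, \mathbf{x}) = \frac{p(\mathbf{y}_{\mathrm{obs}} \mid \boldsymbol{\theta}, \mathbf{x})\, p(\boldsymbol{\theta})}{\int p(\mathbf{y}_{\mathrm{obs}} \mid \boldsymbol{\theta}', \mathbf{x})\, p(\boldsymbol{\theta}')\, \mathrm{d}\boldsymbol{\theta}'} \tag{1}$$

Here, $\boldsymbol{\theta}$ are model parameters; $\mathbf{y}_{\mathrm{obs}}$ represent potentially observed states; $p(\mathbf{y}_{\mathrm{obs}} \mid \boldsymbol{\theta}, \mathbf{x})$ is the probabilistic model for observations conditional on the input, $\mathbf{x}$, and the parameter values (called likelihood function if viewed as a function of the parameters with actual observed values substituted for $\mathbf{y}_{\mathrm{obs}}$); $p(\boldsymbol{\theta})$ is the probability distribution that quantifies the prior knowledge about the model parameters; and $p(\boldsymbol{\theta} \mid \mathbf{y}_{\mathrm{obs}}, \mathbf{x})$ is the probability distribution representing the updated, posterior knowledge about the parameters given the observations. Although Equation (1) is correct in the right context (learning about model parameters), its interpretation (‘prior times likelihood’) obscures: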

(i) that in nearly all practical applications, the main part of the prior knowledge is formulated by the probabilistic model (likelihood), which is needed to define the meaning of the parameters (or may not have parameters);

(ii) that one is usually interested in predicting future or conditional states, $\mathbf{y}_{\mathrm{new}}$ (conditional on changes in influence factors or potential measures suggested as decision alternatives), $p(\mathbf{y}_{\mathrm{new}} \mid \mathbf{y}_{\mathrm{obs}}, \mathbf{x})$, rather than in model parameters (which are just an auxiliary tool for an intermediate step); and

(iii) that, from a statistical perspective, there is often no fundamental difference between model parameters and unobserved states and Bayesian inference can be applied to condition the joint distribution of observed and unobserved model variables on those observed to get an update of those unobserved.

In the context above, it would be better to state: ‘our joint prior knowledge of parameters and observed states is equal to the likelihood times the prior of its parameters; conditioning this joint probability on the observations leads to the posterior of the parameters’.
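The reading of Bayesian inference as conditioning a joint distribution, followed by prediction of new states, can be sketched numerically. The following is a minimal illustration under stated assumptions, not a model from this article: a single scalar parameter on a grid, a normal observation model with known standard deviation, and made-up observations. All names (`theta_grid`, `posterior_on_grid`, `predictive_density`) are hypothetical.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a normal distribution, used here as the probabilistic model."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

# Assumed setup: scalar parameter theta on a grid, known observation sd.
theta_grid = [i * 0.05 for i in range(-100, 101)]        # -5.0 ... 5.0
prior = [normal_pdf(t, 0.0, 2.0) for t in theta_grid]    # prior shape on the grid
y_obs = [1.2, 0.8, 1.5]                                  # made-up observations
sd_obs = 1.0

def posterior_on_grid(theta_grid, prior, y_obs, sd_obs):
    """Condition the joint p(y_obs | theta) p(theta) on y_obs and normalize."""
    joint = []
    for t, p in zip(theta_grid, prior):
        lik = 1.0
        for y in y_obs:
            lik *= normal_pdf(y, t, sd_obs)
        joint.append(lik * p)
    total = sum(joint)
    return [j / total for j in joint]                    # probability mass per grid point

def predictive_density(y_new, theta_grid, post, sd_obs):
    """p(y_new | y_obs): the observation model averaged over the posterior of theta."""
    return sum(normal_pdf(y_new, t, sd_obs) * w for t, w in zip(theta_grid, post))

post = posterior_on_grid(theta_grid, prior, y_obs, sd_obs)
post_mean = sum(t * w for t, w in zip(theta_grid, post))
```

In this sketch the parameter posterior is only an intermediate result; `predictive_density` is the quantity that a decision analysis would typically use.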

[…] ($\mathbf{z}$), true outputs ($\mathbf{y}$), observed outputs ($\mathbf{y}_{\mathrm{obs}}$), and parameters of the system model ($\boldsymbol{\zeta}$, $\boldsymbol{\theta}$, $\boldsymbol{\psi}$) and of the observation model ($\boldsymbol{\xi}$) as a hierarchical model (see e.g. Clark 2005). The factorization of the joint distribution into conditionals given by Equation (2) decomposes the model into submodels for the investigated system, the parameters, and the observation process, as illustrated in Figure 1. This considerably facilitates model construction because the submodels are less complex as they describe more specific subsystems with a smaller number of input and output variables and parameters.

Model construction based on a graphical model as shown in Figure 1 and formalized by Equation (2) became prominent as ‘Bayesian belief network modelling’ (see e.g. Borsuk *et al.* 2004). However, the underlying concept applies to all probabilistic models and is also known as ‘hierarchical modelling’ (with a slight variation in focus, see e.g. Clark 2005). Note also that ‘structural equations modelling’ (see e.g. Kline 2011) is based on very similar concepts (although initially limited to normal distributions). In the context of time-series models, very similar model structures are known as ‘state-space models’ (see e.g. Künsch 2001), ‘hidden Markov models’ (see e.g. Künsch 2001) or ‘dynamic Bayesian belief network models’ (see e.g. Murphy 2002). These different terminologies for essentially the same underlying concepts can be confusing.
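The idea of building a joint distribution from simple conditional submodels can be sketched by forward simulation. The example below is a generic three-level illustration under assumed distributions (a normal parameter prior, a linear system model with additive noise, and a normal observation error). It is not the factorization of Equation (2), and all names and numbers are hypothetical.

```python
import random

def simulate_hierarchical(x, n_draws, seed=1):
    """Draw (theta, y_true, y_obs) by sampling each submodel in turn."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        theta = rng.gauss(2.0, 0.3)                  # parameter submodel
        y_true = theta * x + rng.gauss(0.0, 0.5)     # system submodel given theta and input x
        y_obs = y_true + rng.gauss(0.0, 0.2)         # observation submodel given the true output
        draws.append((theta, y_true, y_obs))
    return draws

draws = simulate_hierarchical(x=3.0, n_draws=5000)
mean_obs = sum(d[2] for d in draws) / len(draws)
var_obs = sum((d[2] - mean_obs) ** 2 for d in draws) / (len(draws) - 1)
```

Each line inside the loop corresponds to one conditional distribution, so a submodel can be exchanged without touching the others. The variance of the simulated observations collects the contributions of all three levels.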

### Imprecise probabilities

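[…] *et al.* 2012, 2014). The following outline is based on the density ratio class to formulate imprecise probabilities (DeRobertis & Hartigan 1981; Rinderknecht *et al.* 2012, 2014). This class is defined by constraining the shape of probability densities by a lower, $l$, and an upper, $u$, non-normalized probability density and then normalizing the shapes in between:

$$\Gamma_{l,u} = \left\{ p(\boldsymbol{\theta}) = \frac{f(\boldsymbol{\theta})}{\int f(\boldsymbol{\theta}')\, \mathrm{d}\boldsymbol{\theta}'} \;\middle|\; l(\boldsymbol{\theta}) \le f(\boldsymbol{\theta}) \le u(\boldsymbol{\theta}) \ \text{for all}\ \boldsymbol{\theta} \right\} \tag{3}$$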

This concept is illustrated in Figure 2. The left panel illustrates how shapes of non-normalized probability densities (green) are bounded by upper (*u*, red) and lower (*l*, blue) non-normalized densities. The right panel illustrates the non-normalized density (green, dashed) with the highest probability in the interval [*θ*_{1}, *θ*_{2}], which follows *u* inside the interval and *l* outside it.

This class of probability distributions was chosen because of its invariance under Bayesian updating, under marginalization, and under propagation through deterministic functions (see Rinderknecht *et al.* 2014), which makes sequential learning possible within the same framework.
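The bounds on the probability of an interval implied by a density ratio class can be computed directly from the bounding densities. The sketch below assumes a one-dimensional parameter, hypothetical bounds (an upper bound equal to twice a standard normal shape), and a simple midpoint rule. The function names are illustrative. The highest probability is obtained from the shape that equals the upper bound inside the interval and the lower bound outside, and the lowest probability from the reverse.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def interval_probability_bounds(lower, upper, t1, t2, lo, hi):
    """Lower and upper probability of [t1, t2] over all densities in the class.

    lower, upper: non-normalized densities with lower <= upper on [lo, hi].
    """
    l_in = integrate(lower, t1, t2)
    u_in = integrate(upper, t1, t2)
    l_out = integrate(lower, lo, t1) + integrate(lower, t2, hi)
    u_out = integrate(upper, lo, t1) + integrate(upper, t2, hi)
    p_low = l_in / (l_in + u_out)     # shape: l inside the interval, u outside
    p_high = u_in / (u_in + l_out)    # shape: u inside the interval, l outside
    return p_low, p_high

# Assumed bounds for illustration: u = 2 * l, with l a standard normal shape.
def lower(t):
    return math.exp(-0.5 * t * t)

def upper(t):
    return 2.0 * lower(t)

p_low, p_high = interval_probability_bounds(lower, upper, -1.0, 1.0, -8.0, 8.0)
```

For these assumed bounds, the precise probability of about 0.68 for the interval [-1, 1] under a standard normal widens to an interval of roughly 0.52 to 0.81.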

## MODELS FOR UNDERSTANDING AND PREDICTION

### Exploring data

Data science methodologies provide new opportunities for exploring data (see e.g. LeCun *et al.* 2015; or more specifically regarding application in ecology and water resources research, Peters *et al.* 2014; Shen 2018). Even though the main purpose of these methods is exploration and prediction without explicitly considering mechanisms, they can contribute to a primarily mechanistically oriented modelling process by:

- Developing ‘prediction models’ for aspects for which understanding is not as important as for other aspects (e.g. image recognition from remote sensing data or plankton species identification to compile input data for mechanistic models).
- Identifying patterns to stimulate mechanistic model development (e.g. finding clusters of organisms that have similar properties or behave similarly).
- Developing ‘black box’ models that serve as benchmarks to analyse the potential for the improvement of more sparsely parameterized mechanistic models (Kratzert *et al.* 2019a).
- Trying to interpret the functioning of the model developed by applying machine learning techniques in the sense of ‘interpretable data science’ (e.g. Papernot & McDaniel 2018; Gilpin *et al.* 2019; Kratzert *et al.* 2019b).
- Constraining the model or the learning algorithm to consider physical characteristics of the system to facilitate interpretation and learning or developing ‘hybrid models’ that bridge between mechanistic and data-based models (‘theory-guided data science’, e.g. Karpatne *et al.* 2017).

### Constructing stochastic, mechanistic models

Wherever possible, models that are intended to predict beyond their calibration range should be designed to represent underlying mechanisms. Hardly any environmental system behaves deterministically at the level at which we observe its behaviour. Reasons for this can be true stochasticity resulting from quantum-mechanical processes, demographic stochasticity of birth and death processes, genetic stochasticity, or apparent non-deterministic behaviour caused by the limited temporal and spatial resolution of the initial state of the modelled system and of external driving forces. For these reasons, to get an appropriate description of system behaviour and of its uncertainty, we need stochastic, mechanistic models. Environmental stochasticity can be considered by making inputs and/or model parameters stochastic processes in time (see e.g. Reichert & Mieleitner 2009). Typically, considering stochasticity leads to hierarchical models that complement unobserved parameters with unobserved states as illustrated in Equation (2) and Figure 1. As shown by Chou & Greenman (2016) for a density-dependent, age-structured population model, it can be inconsistent if one tries to formulate deterministic models directly: they may not necessarily represent the development of the mean of a (more realistic) stochastic model. Ideally, stochastic modelling is combined with a multi-model approach to account for the consequences of structural uncertainty on predictions even better.
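The point that a directly formulated deterministic model need not describe the mean of a stochastic model can be illustrated with a much simpler example than the age-structured model of Chou & Greenman (2016). The sketch below assumes a Beverton-Holt update with hypothetical parameter values and multiplicative lognormal environmental noise with mean 1. Because the update is concave in the population size, the mean of the stochastic runs ends up below the deterministic trajectory.

```python
import math
import random

def beverton_holt(n, growth=2.0, capacity=100.0):
    """Deterministic density-dependent update (hypothetical parameter values)."""
    return growth * n / (1.0 + n / capacity)

def deterministic_run(n0, steps):
    n = n0
    for _ in range(steps):
        n = beverton_holt(n)
    return n

def stochastic_mean(n0, steps, sigma, replicates, seed=42):
    """Mean final state when each step is multiplied by lognormal noise with mean 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        n = n0
        for _ in range(steps):
            noise = math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma ** 2)
            n = noise * beverton_holt(n)
        total += n
    return total / replicates

n_det = deterministic_run(10.0, 30)
n_mean = stochastic_mean(10.0, 30, sigma=0.5, replicates=4000)
```

Here the deterministic run settles at the equilibrium of 100, whereas the mean of the stochastic runs is clearly lower even though the noise leaves the expected one-step growth factor unchanged.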

## MODELS OF PREFERENCES

### Multi-criteria decision analysis (MCDA)

The multi-criteria decision analysis (MCDA) methodology (Keeney & Raiffa 1976; Keeney 1992; Eisenführ *et al.* 2010) provides an ideal framework for modelling preferences for environmental decision support (see e.g. Reichert *et al.* 2015).

[…] *et al.* 2011; Reichert *et al.* 2019). This leads to a value function of the main objective that, through the aggregation functions and the lowest-level value functions, depends on all attributes used to characterize the degrees of fulfilment of all lowest-level sub-objectives. Equation (4) and Figure 3 show an example of the construction of the value function, $\nu$, for the main objective that depends through the value functions for lowest-level sub-objectives $\nu_{1}$, $\nu_{2}$, $\nu_{3a}$, and $\nu_{3b}$ and the aggregation functions […]_{3} and on the attributes $y_{1}$ to $y_{5}$:

Similarly to the case of the outcome prediction model (Equation (2) and Figure 1) this decomposition of the overall value function into value functions of lowest-level sub-objectives and aggregation functions at higher levels facilitates the construction of the overall value function by requiring less complex functions at each decomposition step.
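The decomposition of an overall value function into lowest-level value functions and aggregation functions can be sketched as nested function calls. The example below is hypothetical throughout: the assignment of the attributes to the sub-objectives, the linear lowest-level value functions, the attribute ranges, the weights, and the choice of weighted-mean (additive) aggregation are assumptions for illustration. Additive aggregation is only one of several possible aggregation functions.

```python
def linear_value(y, worst, best):
    """Lowest-level value function: 0 at the worst and 1 at the best attribute level."""
    v = (y - worst) / (best - worst)
    return min(1.0, max(0.0, v))

def weighted_mean(values, weights):
    """Additive aggregation of sub-objective values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def overall_value(y):
    """y: dict with attributes 'y1' to 'y5' (assumed assignment to sub-objectives)."""
    v1 = linear_value(y["y1"], 0.0, 10.0)
    v2 = weighted_mean([linear_value(y["y2"], 0.0, 1.0),
                        linear_value(y["y3"], 100.0, 0.0)], [1.0, 1.0])
    v3a = linear_value(y["y4"], 0.0, 50.0)
    v3b = linear_value(y["y5"], 20.0, 5.0)
    v3 = weighted_mean([v3a, v3b], [0.6, 0.4])             # aggregation for sub-objective 3
    return weighted_mean([v1, v2, v3], [0.5, 0.3, 0.2])    # aggregation for the main objective

best_case = overall_value({"y1": 10.0, "y2": 1.0, "y3": 0.0, "y4": 50.0, "y5": 5.0})
mid_case = overall_value({"y1": 5.0, "y2": 0.5, "y3": 50.0, "y4": 25.0, "y5": 12.5})
```

Each function in this sketch can be elicited and checked separately, which is the practical benefit of the hierarchical construction.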

[…] *et al.* 2010 for more details). Given the utility function, $u$, of the main objective and the probability distributions, $p(\mathbf{y} \mid a)$, of the system attributes, $\mathbf{y}$, from probabilistically modelling the consequences of each alternative, $a$, the expected utilities, EU, can be calculated for all alternatives:

$$\mathrm{EU}(a) = \int u(\mathbf{y})\, p(\mathbf{y} \mid a)\, \mathrm{d}\mathbf{y} = \iint u(\mathbf{y})\, p(\mathbf{y} \mid \boldsymbol{\theta}, a)\, p(\boldsymbol{\theta} \mid a)\, \mathrm{d}\boldsymbol{\theta}\, \mathrm{d}\mathbf{y} \tag{5}$$

Here, $p(\boldsymbol{\theta} \mid a)$ is the prior or posterior (if the prior was updated) distribution of the model parameters for alternative $a$, and there may be additional integration across internal variables and states if the model is hierarchical. Decision support is then based on ranking the alternatives according to decreasing values of their expected utilities.
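Expected utilities are usually approximated by Monte Carlo simulation: sample the parameters, sample the attributes given the parameters, and average the utilities. The following sketch assumes a single attribute on the range [0, 1], an exponential utility function with a hypothetical risk aversion, and three made-up alternatives described by a mean and a spread. None of these choices are taken from this article.

```python
import math
import random

def utility(y, risk_aversion=2.0):
    """Assumed exponential utility on [0, 1], scaled so that u(0) = 0 and u(1) = 1."""
    return (1.0 - math.exp(-risk_aversion * y)) / (1.0 - math.exp(-risk_aversion))

def sample_outcome(alternative, rng):
    """Assumed predictive model: draw theta, then the attribute y given theta."""
    mean, spread = alternative
    theta = rng.gauss(mean, 0.05)            # parameter uncertainty
    y = theta + rng.gauss(0.0, spread)       # outcome uncertainty given theta
    return min(1.0, max(0.0, y))             # keep the attribute in its range

def expected_utility(alternative, n=20000, seed=0):
    rng = random.Random(seed)
    return sum(utility(sample_outcome(alternative, rng)) for _ in range(n)) / n

alternatives = {"A": (0.60, 0.05), "B": (0.62, 0.30), "C": (0.40, 0.05)}
eu = {name: expected_utility(alt) for name, alt in alternatives.items()}
ranking = sorted(eu, key=eu.get, reverse=True)
```

With the assumed risk-averse utility, alternative A ranks above B although B has the slightly higher mean outcome, because B's predictions are much more uncertain.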

### Expected expected utilities

[…] *et al.* 2019b). Here, the utilities are parameterized, and the uncertainty of their parameters is described by a probability distribution. Utilities are only defined up to an affine transformation (e.g. Eisenführ *et al.* 2010). To compare utilities, we need to ‘standardize’ all utilities of the uncertain set at extreme values (typically to 0 and 1 for the worst and best outcomes). If this is done, the alternatives can be ranked according to their expected expected utility, EEU (Boutilier 2003).
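In symbols, the expected expected utility averages the expected utility over the distribution of the preference parameters. The notation in the following expression is assumed for illustration and is not taken from the original equation: $\boldsymbol{\phi}$ denotes the uncertain parameters of the utility function, $p(\boldsymbol{\phi})$ their distribution, $u_{\boldsymbol{\phi}}$ the standardized utility function for a given parameter value, and $p(\mathbf{y} \mid a)$ the predictive distribution of the attributes, $\mathbf{y}$, under alternative $a$.

$$\mathrm{EEU}(a) = \int \mathrm{EU}_{\boldsymbol{\phi}}(a)\, p(\boldsymbol{\phi})\, \mathrm{d}\boldsymbol{\phi} = \iint u_{\boldsymbol{\phi}}(\mathbf{y})\, p(\mathbf{y} \mid a)\, p(\boldsymbol{\phi})\, \mathrm{d}\mathbf{y}\, \mathrm{d}\boldsymbol{\phi}$$

The second form assumes that the preference parameters and the predicted outcomes are independent, which is what allows the two sources of uncertainty to be sampled separately.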

This framework leads to a unique ranking that considers the uncertain utilities. As outlined in the next section, combining this framework with imprecise probabilities makes it possible to assess the ambiguity of this ranking resulting from the ambiguity about the probability distributions used to quantify uncertainty of outcome predictions as well as values and utilities; this adds an essential new element to the analysis.
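A Monte Carlo approximation of the expected expected utility only adds an outer average over samples of the preference parameters. The sketch below assumes one attribute on [0, 1], an exponential utility whose risk aversion is uncertain (uniform between 0.5 and 4, a made-up range), and two hypothetical alternatives. Every utility function in the uncertain set is standardized to 0 at the worst and 1 at the best outcome.

```python
import math
import random

def utility(y, risk_aversion):
    """Exponential utility with u(0) = 0 and u(1) = 1 for every parameter value."""
    if abs(risk_aversion) < 1e-12:
        return y
    return (1.0 - math.exp(-risk_aversion * y)) / (1.0 - math.exp(-risk_aversion))

def expected_expected_utility(outcomes, preference_draws):
    """Average utility over outcome samples and over samples of the utility parameter."""
    total = 0.0
    for c in preference_draws:
        total += sum(utility(y, c) for y in outcomes) / len(outcomes)
    return total / len(preference_draws)

rng = random.Random(3)
preference_draws = [rng.uniform(0.5, 4.0) for _ in range(100)]   # uncertain risk aversion
outcomes = {
    "safe": [min(1.0, max(0.0, rng.gauss(0.55, 0.03))) for _ in range(2000)],
    "risky": [min(1.0, max(0.0, rng.gauss(0.60, 0.35))) for _ in range(2000)],
}
eeu = {a: expected_expected_utility(ys, preference_draws) for a, ys in outcomes.items()}
```

The result is a single number per alternative and therefore a unique ranking, even though the ranking for an individual draw of the risk aversion may differ.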

## A COMPREHENSIVE FRAMEWORK FOR UNCERTAINTY ANALYSIS IN DECISION SUPPORT

By combining the ‘standard steps’ of decision support, such as problem definition, stakeholder analysis, elicitation of an objectives hierarchy, etc. (see e.g. Reichert *et al.* 2015) with the techniques outlined above, a comprehensive uncertainty analysis can be performed by modifying the following three steps.

### Elicitation and construction of value and utility functions

When using parameterized value functions and a parameterized conversion function to utilities, parameter estimation can be done based on elicited discrete choice selections (Hoyos 2010) or indifference replies (Haag *et al.* 2019a) using Bayesian inference with density ratio class priors. This leads to a posterior density ratio class of the value/utility parameters (Rinderknecht *et al.* 2014) that jointly describes uncertainty and the ambiguity about its quantification. If a multi-model approach was performed, either the best fitting value model can be selected or multiple models considered for further analysis.

### Prediction of outcomes of decision alternatives

For predicting the outcomes of the decision alternatives, either density ratio class priors (for their elicitation see e.g. Rinderknecht *et al.* 2011) can be used directly, or they can be updated by Bayesian inference if data is available. In the latter case, this also leads to a (posterior) density ratio class of the parameters (Rinderknecht *et al.* 2014). Again, a multiple model approach is useful for considering structural uncertainty.

### Compilation of results

Evaluating the expected expected utilities in Equation (6) for imprecise probability distributions of the model parameters and of the preference parameters leads to an interval of expected expected utilities, EEU(*a*), for each alternative, *a*. These intervals for all alternatives lead to an incomplete ordering of the decision alternatives that reflects the ambiguity in addition to the uncertainty. Although such an ordering is less convenient than a unique ranking, it seems to be the best representation of knowledge and uncertainty and thus provides the best information to be communicated to decision makers.
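How intervals lead to an incomplete ordering can be sketched with a simple dominance check. The example assumes that one alternative is considered preferred to another only if its whole interval lies above the other's (interval dominance), which is a conservative reading and not necessarily the rule intended here. The interval values are made up.

```python
def interval_dominance(intervals):
    """Return pairs (a, b) where the lower bound of a exceeds the upper bound of b.

    In that case a has the higher expected expected utility for every
    combination of distributions from the classes.
    """
    ordered = []
    for a, (low_a, _) in intervals.items():
        for b, (_, high_b) in intervals.items():
            if a != b and low_a > high_b:
                ordered.append((a, b))
    return ordered

# Hypothetical EEU intervals (lower bound, upper bound) for three alternatives.
intervals = {"A": (0.70, 0.80), "B": (0.55, 0.72), "C": (0.30, 0.50)}
ordered = interval_dominance(intervals)
incomparable = [(a, b) for a in intervals for b in intervals
                if a < b and (a, b) not in ordered and (b, a) not in ordered]
```

In this example both A and B are preferred to C, whereas A and B remain unordered because their intervals overlap.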

## NUMERICAL CHALLENGES AND SOFTWARE

Efficient numerical algorithms, mainly based on Markov Chain Monte Carlo (MCMC) techniques, such as Metropolis-Hastings sampling (see e.g. Gelman *et al.* 2013), have been developed to sample from posteriors in Bayesian inference for cases in which the likelihood function can easily be evaluated. This is no longer true for hierarchical models, which require computationally much more demanding techniques, such as Gibbs sampling (see e.g. Gelman *et al.* 2013), Particle Markov Chain Monte Carlo (PMCMC; see e.g. Andrieu *et al.* 2010), Hamiltonian Monte Carlo (HMC; see e.g. Betancourt 2017), or Approximate Bayesian Computation (ABC; see e.g. Beaumont 2010; Albert *et al.* 2015). Krapu & Borsuk (2019) provide an overview of software designed for this purpose.
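For the simple case in which the posterior density can be evaluated up to a constant, a random-walk Metropolis sampler takes only a few lines. The sketch below is a minimal illustration for a scalar parameter with a symmetric Gaussian proposal. The target (a normal shape with mean 1 and standard deviation 0.5), the step size, and the burn-in length are assumptions chosen for the example.

```python
import math
import random

def metropolis(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis for a scalar parameter with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    chain = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

def log_post(theta):
    """Assumed unnormalized log posterior: normal shape with mean 1 and sd 0.5."""
    return -0.5 * ((theta - 1.0) / 0.5) ** 2

chain = metropolis(log_post, start=0.0, step=1.0, n_samples=30000)
kept = chain[3000:]                               # discard burn-in
post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((t - post_mean) ** 2 for t in kept) / (len(kept) - 1))
```

Only differences of log posterior values enter the acceptance step, so the normalizing constant is never needed. Hierarchical models break this convenience because the marginal likelihood of the parameters is itself an integral over unobserved states.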

On the other hand, the extension to imprecise probabilities based on the density ratio class does not add much computational effort, as an (unweighted) sample of a single distribution from the class can easily be turned into a weighted sample of any other distribution from the class without having to do inference again (Rinderknecht *et al.* 2014). Still, calculating the EEU-intervals requires the evaluation of expected expected utilities for a large sample of elements from the class.
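The reweighting step can be sketched with self-normalized importance weights: each sample point receives a weight proportional to the ratio of the new shape to the shape it was sampled from. The example assumes two hypothetical members of a class bounded by a standard normal shape and twice that shape. Within a density ratio class this ratio is bounded, which keeps the weights well behaved.

```python
import math
import random

def reweight(sample, shape_from, shape_to):
    """Self-normalized weights turning a sample of shape_from into one of shape_to."""
    raw = [shape_to(t) / shape_from(t) for t in sample]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights))

# Assumed class members: a standard normal shape, and the same shape doubled
# for theta > 0 (it stays between l = shape_a and u = 2 * shape_a).
def shape_a(t):
    return math.exp(-0.5 * t * t)

def shape_b(t):
    return shape_a(t) * (2.0 if t > 0.0 else 1.0)

rng = random.Random(7)
sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]   # unweighted sample of shape_a
weights = reweight(sample, shape_a, shape_b)
mean_b = weighted_mean(sample, weights)
```

The same stored sample can be reused for every member of the class that needs to be evaluated, so bounds on an expectation require repeated weighted averages rather than repeated inference.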

## CONCLUSIONS

An ideal framework for environmental decision support would be to combine mechanistic, stochastic models for prediction with uncertain utilities for quantifying preferences and to use imprecise probabilities for robustness analysis. We outlined how this can be done without significantly increasing the computational burden. It would be a very interesting next step to apply this procedure to an actual decision problem and to analyse the benefits and challenges of considering all sources of uncertainty. In particular, it would be interesting to explore whether decision makers are willing to accept the outcome in the form of an incomplete ranking based on intervals of expected expected utility rather than a unique ranking. Easy-to-apply software could significantly contribute to facilitating the application of the suggested approach in practice.

## ACKNOWLEDGEMENTS

The development of this paper has benefited from discussions with many scientists over the last few years. I would like to mention in particular Nele Schuwirth, Simon Lukas Rinderknecht, Johanna Mieleitner, Carlo Albert, Simone Langhans, Fridolin Haag, and Judit Lienert. This paper is dedicated to my former PhD students Johanna Mieleitner (1975–2019) and Simon Lukas Rinderknecht (1979–2019), who contributed significantly to the development of some of the techniques discussed in this paper.