## Abstract

Different hydrological models provide diverse perspectives of the system being modeled, and inevitably, are imperfect representations of reality. Irrespective of the choice of models, the major source of error in any hydrological modeling is the uncertainty in the determination of model parameters, owing to the mismatch between model complexity and available data. Sensitivity analysis (SA) methods help to identify the parameters that have a strong impact on the model outputs and hence influence the model response. In addition, SA assists in analyzing the interactions between parameters, their preferable ranges, and their spatial variability, all of which influence the model outcomes. Various methods are available to perform SA, and their perturbation techniques vary widely. This study attempts to categorize the SA methods depending on the assumptions and methodologies involved. The pros and cons associated with each SA method are discussed. The sensitivity pertaining to the impact of space and time resolutions on model results is highlighted, and the applicability of different SA approaches for various purposes is examined. This study further elaborates the objectives behind the selection and application of SA approaches in hydrological modeling, hence providing valuable insights on the limitations, knowledge gaps, and future research directions.

## INTRODUCTION

Reckless and continuous anthropogenic activities adversely affect the quantity and quality of water resources, thereby altering various hydrological fluxes and feedback processes (Pielke *et al*. 1999; Rose & Peters 2001; Sivapalan *et al*. 2003; Jiang *et al*. 2012). Hydrological modeling is a mathematical representation of these hydrological processes, which primarily influence the energy and water balance of a watershed (McCuen 1973). In general, models attempt to represent the complex spatially distributed interactions of water, energy, and vegetation by means of mathematical equations. Broadly, two types of models exist: conceptual and physically based hydrological models (HMs) (Beven & O'Connell 1982; Beven 1989; Madsen 2000). Regardless of the modeling approach, inevitably, all models are imperfect representations of reality, with each providing different perspectives of the system and approximating various physical processes with some constants and parameters. In most of the approaches, though, many parameters would be observable (e.g., basin area, slope, elevation, vegetation type), and a few would be unobservable conceptualizations of basin characteristics. Irrespective of the choice of models, the primary problem in any hydrological modeling is the uncertainty in the determination of these model parameters. Further complications arise owing to the mismatch between model complexity and the data which are available to parameterize, initialize, and calibrate such models (Troch *et al*. 2003; Ye *et al*. 2008; Gupta *et al*. 2012; Zhang *et al*. 2012; Song *et al*. 2015).

A sensitivity analysis (SA), to determine the possible values to be assigned to the parameters and the qualitative and/or quantitative variations in the output of an associated model, should be an integral part of any hydrological modeling study (Vemuri *et al*. 1969; Saltelli *et al*. 2000). Despite SA being an unavoidable step in modeling, no unique definition exists; rather, there are different definitions depending upon the field of application (Razavi & Gupta 2015; Song *et al*. 2015). In hydrological modeling, SA can be simply defined as the change in the output responses due to the change in one or more model inputs or parameters. However, considering the fact that the sensitivity of model parameters is only one of the objectives of SA, this definition is incomplete. One probable reason behind the absence of any generalized definition is the limited practice of SA methods as a tool to design and study hydrologic models (McCuen 1973).

Critical examination of the relations between model inputs and outputs is essential, which may: (i) help to identify any potential deficiencies in model structure and formulation; (ii) provide guidance for model order reduction and parameterization; and (iii) analyze the information content of available observations (Castaings *et al*. 2007). SA methods, hence, help to identify the parameters that have a strong impact on the model outputs, thereby influencing the efficacy of any model. Dawdy & O'Donnell (1965) stated that ‘In general, the greater the sensitivity of the model response to a parameter, the closer and sooner will that parameter be optimized’ and concluded that the insensitive parameters often did not approach the true value. It is worthwhile to mention here that SA takes into account the effect of parameters as well as the uncertainties in model forcings (D'Agnese *et al*. 1999; Hill & Tiedeman 2006). Traditionally, SA has been carried out through manual calibration, which is very tedious and time-consuming (Boyle *et al*. 2000; Madsen *et al*. 2002). SA based on automatic calibration procedures can be classified into local and global search strategies (Sorooshian & Gupta 1995). While local approaches deal with assessing the effect of parameters on the output by varying each parameter, one at a time around any base case, global approaches assess the change in output by varying all the parameters simultaneously over the entire feasible range.

Past studies provide insight into the application of SA in various fields such as economics (Coyle *et al*. 2003), food safety (Frey & Patil 2002; Mokhtari & Frey 2005), groundwater (Mishra *et al*. 2009), chemical modeling (Saltelli *et al*. 2005, 2012), and environmental modeling (Hamby 1994). The application of SA methods in hydrological modeling, although very limited (Blasone *et al*. 2007), has been gaining attention in the recent past. Initially, SA in HMs was carried out using local, deterministic, gradient-based methodologies (Dawdy & O'Donnell 1965; Nash & Sutcliffe 1970; Johnston & Pilgrim 1976). Later, robust and effective global search procedures evolved (Wang 1991; Duan & Gupta 1992; Sumner *et al*. 1997). In addition, recent works have highlighted hybrids of local and global approaches in which derivative-based methods are employed to obtain the distribution of parameter sensitivity across the entire feasible parameter space (Rakovec *et al*. 2014); the definition of feasible ranges has a distinct subjective component, however. Apart from analyzing the sensitivity of various parameters, the sensitivity of HM responses to spatial resolution (Bloschl & Sivapalan 1995; Bruneau *et al*. 1995; Finnerty *et al*. 1997; Merz *et al*. 2009) and to the uncertainties in the spatial rainfall patterns were also explored in the past (Obled *et al*. 1994; Arnaud *et al*. 2011).

A critical review of different SA and sampling methods is absolutely necessary, considering the facts below:

First, most of the past reviews concentrate on global analysis approaches (Iooss & Lemaître 2015; Song *et al*. 2015). Despite the advancements in these global approaches, local approaches are found to be beneficial in regional modeling studies, especially in semi-distributed/distributed hydrological modeling (Liang 1994; Rakovec *et al*. 2014). Thus, in addition to global approaches, local approaches need to be given more focus.

Second, a rigorous classification of SA approaches that simultaneously considers the different physical aspects of modeling studies and the suitability of SA methods needs to be studied. Since perturbations in each SA approach may vary depending on the distribution of model parameters (whether lumped, semi-distributed, or distributed), a critical review of each SA method that takes into account the choice of HMs is of the utmost importance (Saltelli *et al*. 2008).

Third, a review of SA and sampling techniques based on their application in different parts of the world (regional characteristics) has not been attempted by any study so far. We found a strong affinity between the SA techniques deployed and the regions modeled (Spruill *et al*. 2000; Tucker & Whipple 2002). Such a revelation will provide a platform for the modeler to adopt an appropriate SA method depending on the study area being modeled. This is particularly significant considering that the objectives, assumptions, and methodology of a suitable SA technique are largely influenced by the hydrological and physical characteristics of the region to be modeled. This information may help in developing a more realistic representation of the hydrologic linkages of the region through models and will save time.

Finally, a comprehensive review of the different SA objectives and goals to be considered in hydrologic modeling, highlighting the importance of conducting such an analysis, is necessary to aid the modelers' community.

Hence, the purpose of this paper is to critically review the existing local and global approaches of SA methods deployed in hydrological modeling. Emphasis is also placed on the categorization of the SA methods, which will facilitate an effective selection of a suitable SA method. The importance of SA is highlighted in the section ‘Need for SA’. Various SA approaches are detailed in the section ‘Classification of sampling methods, SA approaches, and their pros and cons’ by binning the approaches based on the methodology, number of parameters, etc. The pros and cons of different approaches are also elaborated upon. The spatio-temporal dependency of HMs is discussed in the section ‘HMs and SA application’. Different key factors which should be considered in SA in hydrological modeling are highlighted in the section ‘Key factors to be considered in SA’. The regional dependence of SA and sampling approaches is detailed in the section ‘Regional dependence of SA methods and sampling strategies’. Finally, possible challenges which may be encountered while applying SA approaches are discussed in the section ‘Discussion and conclusions’.

## NEED FOR SA

The study of sensitivity or SA serves a range of purposes depending upon the nature of application, of which a few are summarized below:

### Determination of response surface

Response surface is a region or space that represents the possible changes in model response in accordance with the changes in model input parameters. In any modeling study, identification of the most sensitive area among the entire factor space is the key requirement from SA, in order to avoid any possible perfunctory analysis. The determination of response surface helps to reduce the parameter space by identifying the most influential parameters, and thus reducing the complexity of models. Various approaches to assess the response surface developed by past studies (Sorooshian & Arfi 1982; Faravelli 1990; Bauer *et al*. 1999; Isukapalli *et al*. 2000) primarily focused on the reduction of the total number of model runs to speed up the simulations and also on the interaction between the fluxes in order to closely replicate the physical system under consideration.

### Determination of the most influential uncertain parameters

Ranking the influential parameters in the order of their relative contribution to the variability of the model responses, termed factor prioritization, is one major objective of SA. Identification and ranking of the input parameters are significant in driving the model simulations close to the true value. Proper understanding of the effect of even small variations in input factors on model responses, and of its feedback effects on model processes, will help to minimize the overall uncertainty (Kuczera & Parent 1998; Crosetto *et al*. 2000; Razavi & Gupta 2015; Song *et al*. 2015).

### Determination of interaction between the parameters

Assessment of interactions between various parameters involved in model processes is necessary to reveal the nature of processes and the possible influence of one variable on any other variables (dependent or independent) (Norton *et al*. 2004). However, computation of interaction between parameters is often complex and limited because of the inherent non-linear nature of most of the HMs (e.g., Frey & Patil 2002; Saltelli *et al*. 2004; Nossent *et al*. 2011).

### Concordance between the model and the physical system

Another aspect of SA is to understand the concordance between the model structure and the assumptions on which it is based. This will help to theoretically evaluate the processes of the relevant system and to assess the extent to which the realistic representation of the real world has been achieved in the model (Reiter 1987; Boyle *et al*. 2000).

### Model simplification and identification of non-influential parameters

Another objective of SA is the efficient and optimum use of the model by removing redundant parameters, thereby reducing computation time and errors. The concept of over-parameterization, which is usually neglected in many modeling exercises, is critically addressed by a few studies (Spear & Hornberger 1980; Beven 1989; van Griensven *et al*. 2006). These studies highlighted that removing non-influential parameters during parameterization/calibration improved model efficiency.

## CLASSIFICATION OF SAMPLING METHODS, SA APPROACHES, AND THEIR PROS AND CONS

Approaches adopted for SA depend on the type of HMs to a great extent. HMs may be classified as lumped, semi-distributed, and distributed (Chow *et al*. 1988), with the distributed model allowing more variability of parameters over the entire feasible space. Lumped models are approximate representations of the system and are mostly inefficient in accurately replicating the real system since the parameters involved are lumped, i.e., area weighted averages, and do not represent the exact physical features of hydrologic processes (Beven 1989; Hughes & Beater 1989; Arnaud *et al*. 2011). Although distributed modeling provides a platform which allows the variability of all the parameters, it is highly complex due to the associated uncertainty in the determination of the parameters. SA will help to reduce the complexity and uncertainty and help to gain better understanding of the concordance between the output and the inputs in any hydrological modeling study.

The approaches commonly adopted to perform SA can be classified into two categories: local and global (Saltelli *et al*. 2004; van Griensven *et al*. 2006). However, this classification is crude and vague. The decision on the type of approach, i.e., whether global or local, and the requirement of sampling methods to screen the number of simulations primarily depend on the range of the parameter space (King 2009). The classification of SA methods is a challenging task in its own way, mainly because of the contradictory nature of inherent assumptions in practical applications. Saltelli *et al*. (2004) highlight four important points which need to be implemented during SA: (i) ordering, (ii) fixing, (iii) mapping of parameters, and (iv) reducing the variability of output. Keeping all the above points under consideration, this study adopts a more meaningful classification of SA methods with an intent to provide a simple guideline for selection of the SA method in hydrological modeling exercises. Hence, in this paper, SA methods are placed in the following categories: (1) scale dependent and scale independent methods (Morris 1991; Saltelli 1999); (2) function dependent and function independent methods (Razavi & Gupta 2015); (3) derivative-based and non-derivative-based methods (Saltelli *et al*. 2004); and (4) single parameter and multiple parameter methods (Hamby 1994; Saltelli *et al*. 2008). Broad classifications with examples are given in Table 1. A brief description of each method is further discussed. Some SA approaches demand high computational resources, hence selection of an appropriate sampling strategy can help in reducing this demand. Therefore, along with SA approaches, sampling methods are also described.

**Table 1** | Broad classification of SA methods with examples

| S. no. | Sub-class | Method | Description | Common techniques (e.g.) |
|---|---|---|---|---|
| 1 | 1.1 | Scale dependent | Depends on the selection of the step size, i.e., the amount by which the factor or parameter is varied | EE method; OAT sampling |
| | 1.2 | Scale independent | Independent of the step size | Variance-based method; Monte Carlo sampling |
| 2 | 2.1 | Function dependent | Depends on the assumed functional form or model | OAT sampling; FF sampling |
| | 2.2 | Function independent | Devoid of any assumptions regarding function or model | Variance-based method; EE method |
| 3 | 3.1 | Derivative based | All local approaches are derivative based; less computation time | All local approaches: Monte Carlo; OAT sampling |
| | 3.2 | Non-derivative based | All global approaches, which allow interaction between parameters | All global approaches: LH method |
| 4 | 4.1 | Single parameter | Only one parameter is involved | Random values; stratified sampling; mean and variance estimates |
| | 4.2 | Multiple parameter | More than one parameter is involved in model formulation | OAT sampling; multivariate stratified sampling |


### Random value sampling

In random value sampling, also referred to as non-uniform sampling, the value of a parameter is selected randomly within its range (say, varying from *X*_{1} to *X _{N}*). This may create different densities of values in different intervals (Saltelli *et al*. 2008), with the possibility of too many values falling in a short interval and too few in other regions. However, given a sufficient sample size, random sampling generates unbiased estimates of the mean and variance, since, as the sample size increases, the clusters of values taken by a parameter balance out and spread over the entire range.
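The behavior described above can be illustrated with a minimal sketch (the range [0, 1] and the sample sizes are arbitrary choices for illustration): small random samples may cluster, but large samples spread over the range and recover the true mean.

```python
import random

def random_sample(lo, hi, n, seed=0):
    """Draw n parameter values uniformly at random from [lo, hi]."""
    rng = random.Random(seed)
    return [rng.uniform(lo, hi) for _ in range(n)]

# A small sample may leave gaps or cluster; a large sample spreads over
# the whole range, so the sample mean converges to the true mean (0.5).
small = random_sample(0.0, 1.0, 10)
large = random_sample(0.0, 1.0, 100_000)
mean_large = sum(large) / len(large)
```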

Random sampling has found application in various fields. Price & Dixon (1987) developed two versions of controlled random search procedures for computer-aided design workstations, which are also applicable in other fields. Brazil & Krajewski (1987) compared a uniform random search and an adaptive random search (ARS) procedure for the modeling of the Sacramento Soil Moisture Accounting (SAC-SMA) model and concluded that the ARS method is an improvement over the uniform random search approach. A major disadvantage of the random search procedure is that it does not consider the nature of the response surface, making the search inefficient.

### Stratified sampling

Stratified sampling reduces the computation time by rapidly converging the calculated values close to the observed value (McKay *et al*. 1979; Thomas & Lewis 1995; Christiaens & Feyen 2002). In stratified sampling, screening is done systematically in such a manner that each subinterval of equal length contains the same number of sample points. As compared to random sampling, stratified sampling is more ‘process-based’ and is easier to implement.
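A minimal sketch of this idea (the range and number of strata are illustrative choices): the parameter range is split into equal-length subintervals, and the same number of points is drawn from each, so coverage of the range is guaranteed rather than left to chance.

```python
import random

def stratified_sample(lo, hi, n_strata, per_stratum=1, seed=0):
    """Split [lo, hi] into n_strata equal subintervals and draw the same
    number of points from each, guaranteeing even coverage of the range."""
    rng = random.Random(seed)
    width = (hi - lo) / n_strata
    samples = []
    for k in range(n_strata):
        a = lo + k * width                      # left edge of stratum k
        samples.extend(rng.uniform(a, a + width) for _ in range(per_stratum))
    return samples

pts = stratified_sample(0.0, 1.0, 10)  # exactly one point in each tenth
```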

For example, the topographic index (*λ*) is given by:

$$\lambda_i = \ln\!\left(\frac{a_i}{\tan\beta_i}\right)$$

where $a_i$ is the upslope area drained to a point *i*, and $\beta_i$ is the local slope angle at that point (Beven & Kirkby 1979).

### Monte Carlo sampling

In Monte Carlo sampling, screening of input parameters is done on the basis of their distribution function. The Monte Carlo estimate of a function *f*(*x*) is equal to the average value of *f*(*x*) over *n* points selected at random from the entire space. Since long records of data can be generated, the results from Monte Carlo sampling are statistically valid (Krajewski *et al*. 1991). Monte Carlo sampling is flexible and easy to put into practice, which is evident from its wide range of applications in various areas, e.g., study of hydrologic effects of spatial variability of rainfall (Smith & Hebbert 1979), ecological modeling (Annan 2001), distributed catchment modeling (Binley *et al*. 1989a, 1989b; Krajewski *et al*. 1991), and subsurface studies (Sharma *et al*. 1987). Despite its ease of application, one drawback of Monte Carlo is that as the number of parameters increases, a large number of simulations is required to get precise results (Boyle *et al*. 1997). Another difficulty in the implementation of Monte Carlo sampling is the assignment of input ranges and derivation of linked probability distribution functions (PDFs). Many times, model parameters are either not directly measurable or it would be cost prohibitive to collect adequate random samples of input to derive the true PDFs and ranges (Annan 2001; Muleta & Nicklow 2005).
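The Monte Carlo estimate described above can be sketched as follows (the toy function *f*(*x*) = *x*² with *x* ~ U(0, 1), whose true mean is 1/3, is an illustrative assumption, not an example from the source):

```python
import random

def monte_carlo_mean(f, sampler, n=100_000, seed=0):
    """Monte Carlo estimate of E[f(x)]: the average of f over n points
    drawn at random from the parameter space by `sampler`."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n

# Toy 'model': f(x) = x**2 with x ~ U(0, 1); the true mean is 1/3,
# and the estimate converges toward it as n grows.
est = monte_carlo_mean(lambda x: x * x, lambda rng: rng.random())
```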

### Latin hypercube sampling

As described above, Monte Carlo and derived approaches are ineffective when modeling involves a large number of input parameters. The concept of the Latin hypercube (LH) simulation (McKay *et al*. 1979; Iman & Conover 2007) has been developed to obviate this limitation. In LH sampling, the entire factor space is divided into *P* equal intervals with probability of occurrence equal to *1/P* using stratified sampling. Random values of the parameters are generated such that sampling is done only once for each parameter and for each interval. This approach results in *P* independent simulations, with the model run *P* times (van Griensven *et al*. 2006). This method simultaneously allows for the sampling of all parameters and gives a direct measure of the distribution of output.
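The mechanics above can be sketched with a hand-rolled generator on the unit hypercube (the point and parameter counts are illustrative assumptions): each parameter's range is split into *P* strata, each stratum is sampled exactly once, and the columns are shuffled so the strata combine randomly across parameters.

```python
import random

def latin_hypercube(n_points, n_params, seed=0):
    """Latin hypercube sample on [0, 1]^n_params: each parameter's range
    is split into n_points equal intervals, and each interval is sampled
    exactly once per parameter."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_params):
        # one random point inside each of the n_points strata, then shuffle
        col = [(k + rng.random()) / n_points for k in range(n_points)]
        rng.shuffle(col)
        columns.append(col)
    # transpose: one row per model run
    return list(zip(*columns))

runs = latin_hypercube(n_points=5, n_params=3)  # 5 model runs, 3 parameters
```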

The diverse areas where LH sampling is employed include water quality modeling (Melching & Anmangandla 1993), watershed modeling (Melching 1995), and distributed hydrological modeling (Christiaens & Feyen 2002). Many studies have also compared LH sampling with different methods such as differential analysis and factorial design (McKay *et al*. 1979; Iman & Helton 1985; Kleijnen 2005) and claimed that LH sampling produces reliable results. Numerous studies have also employed the LH-one-at-a-time (LH-OAT) method in hydrological modeling studies using the Soil and Water Assessment Tool (SWAT), in which the OAT method is repeated for each point sampled using LH sampling (Holvoet *et al*. 2005; van Griensven *et al*. 2006; Mulungu & Munishi 2007; Ndomba *et al*. 2008). In particular, LH sampling is preferred in distributed HMs due to its efficiency in dealing with the curse of dimensionality.

### Log-odds ratio estimates

The log-odds ratio (LOR) approach measures sensitivity through the change in the log-odds ratio, *Δ*LOR, which is given by:

$$\Delta\mathrm{LOR} = \ln\!\left(\frac{P_1}{1-P_1}\right) - \ln\!\left(\frac{P_0}{1-P_0}\right)$$

where $P_1$ is the probability of the occurrence (and $1-P_1$ of the non-occurrence) of an event when the input parameter changes by some amount, and $P_0$ is the probability of the occurrence (and $1-P_0$ of the non-occurrence) of an event when there is no change in the input parameters.

LOR gives a measure of the dependency of the output on the input. A positive LOR indicates a higher probability of occurrence of an event, while a negative LOR indicates a lower probability of occurrence. Stiber *et al*. (1999) state that the effect of a particular input on the output is proportional to the magnitude of the LOR estimate. LOR estimates have been widely used in hydrology (e.g., rainfall analysis (Hughes *et al*. 1999; Mackay *et al*. 2001) and forecast studies (Stephenson 2000)). LOR considers the cross-correlation between dependent (e.g., daily rainfall depths, stream-flow patterns) and independent (e.g., rainfall occurrences, precipitation, soil depths) variables. Despite its wide application, this method does not give any information related to the interaction between the inputs and is therefore difficult to implement in non-linear models.
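A minimal sketch of the *Δ*LOR computation (the probabilities 0.40 and 0.55 are invented for illustration):

```python
import math

def delta_log_odds(p_changed, p_base):
    """Change in log-odds ratio when an input is perturbed:
    dLOR = ln[P1/(1-P1)] - ln[P0/(1-P0)]."""
    lor = lambda p: math.log(p / (1.0 - p))
    return lor(p_changed) - lor(p_base)

# Suppose the event probability rises from 0.40 to 0.55 after perturbing
# an input: the positive dLOR signals that the output odds increase
# with this input.
d = delta_log_odds(0.55, 0.40)
```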

### OAT method

In this method, as the name itself implies, the effect on the response or output is judged by varying one input parameter at a time from the pool of sensitive parameters, while keeping the other parameters constant. OAT design is suggested in studies where spatial variations are more prominent than temporal variations. Although OAT sampling is efficient in generating trustworthy results for factor prioritization and is endorsed by many past studies, its applicability is questioned by a few studies. For modeling exercises with a large number of parameters, only a few of which are influential, OAT sampling is prone to produce uncertain results (Saltelli *et al*. 2008). The reliability of sensitivity results for a particular input variable is largely dependent on the values chosen for the other model parameters. In addition, sensitivity results in OAT sampling largely depend on the scale by which the parameter is varied (Razavi & Gupta 2015). The interaction between different inputs is largely ignored (Saltelli 1999) in OAT sampling. Saltelli & Annoni (2010), in their article titled ‘How to avoid a perfunctory SA’, illustrated the inadequacy of OAT through a novel geometric proof, thus providing a realistic and state-of-the-art argument against OAT to the modeling community. Methods like Monte Carlo, elementary-effect, linear regression, and factorial design are improvements over OAT sampling (Saltelli 1999; Crosetto *et al.* 2000; Campolongo *et al.* 2007; Dimov & Georgieva 2010; Saltelli *et al.* 2010; Shin *et al.* 2013).
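The OAT procedure can be sketched as follows (the two-parameter toy model and the parameter names `k` and `c` are illustrative assumptions, not from any particular HM):

```python
def oat_sensitivity(model, base, delta=0.01):
    """One-at-a-time screening: perturb each parameter in turn about a
    base point, holding the others fixed, and record the normalized
    output change."""
    y0 = model(base)
    effects = {}
    for name, value in base.items():
        perturbed = dict(base)
        perturbed[name] = value + delta   # vary only this parameter
        effects[name] = (model(perturbed) - y0) / delta
    return effects

# Toy model: the output depends strongly on 'k' and weakly on 'c',
# so OAT screening ranks 'k' as the more influential parameter.
model = lambda p: 10.0 * p["k"] + 0.1 * p["c"]
eff = oat_sensitivity(model, {"k": 1.0, "c": 5.0})
```

Note the caveats from the text apply: the result is local to the chosen base point, and any interaction between `k` and `c` would be invisible to this design.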

### Fractional factorial design

Fractional factorial (FF) design (Fisher 1942) considers a fraction of the total number of possible combinations of input factors in order to reduce the number of experimental runs. The overall effect, and also the effect of possible interactions, are estimated by grouping the entire factor space into different ‘levels’. The indices are calculated for each combination of levels. Hence, a full factorial design with *k* factors has *n*^{k} runs, where *n* is the number of levels, and therefore suffers from the curse of dimensionality, being efficient only when the number of factors is small. When the sample size is moderate or large, three-way and higher order interactions prove to be insignificant (Kleijnen 2005). FF design is proposed for moderate and higher dimensional factor spaces, with *n*^{k−p} runs, where *1/n*^{p} is the size of the fraction. FF design has been employed in many past studies (Henderson-Sellers 1993; Cryer & Havens 1999; Xu & Gertner 2007; Meng & Quiring 2008). For detailed information on FF sampling, readers are referred to Saltelli *et al*. (1995) and Kleijnen (2005).
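The run-count arithmetic can be illustrated with a two-level design (the even-parity defining relation used to pick the half-fraction is one common textbook choice, assumed here for illustration): a full factorial in *k* = 5 factors needs 2^5 = 32 runs, while a half-fraction (*p* = 1) needs only 2^4 = 16.

```python
from itertools import product

def full_factorial(k, levels=(0, 1)):
    """Full factorial design: every combination of levels for k factors,
    i.e. n**k runs for n levels."""
    return list(product(levels, repeat=k))

def half_fraction(k, levels=(0, 1)):
    """A 2^(k-1) half-fraction: keep only the runs whose level indices
    sum to an even number (one common choice of defining relation)."""
    idx = {lev: i for i, lev in enumerate(levels)}
    return [run for run in full_factorial(k, levels)
            if sum(idx[v] for v in run) % 2 == 0]

full = full_factorial(5)   # 2**5 = 32 runs
half = half_fraction(5)    # 2**(5-1) = 16 runs, 1/2 the cost
```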

### Variance-based approach

Consider a model in which *Y* denotes the vector of *m* outputs, *X* denotes the vector of *n* inputs, and $\beta = (\beta_1, \beta_2, \ldots, \beta_k)$ is the vector of *k* model parameters. The total variance of the model output can then be expressed as:

$$V(Y) = \sum_{i} V_i + \sum_{i} \sum_{j>i} V_{ij} + \cdots + V_{12 \ldots k}$$

where the conditional variances are expressed as $V_i = V\left(E(Y \mid \beta_i)\right)$, $V_{ij} = V\left(E(Y \mid \beta_i, \beta_j)\right) - V_i - V_j$, and so on.

The first term $V_i$ represents the main effect of $\beta_i$, i.e., the variance of the expected value of the output conditioned on the input parameter $\beta_i$. The second term $V_{ij}$ represents the interaction between $\beta_i$ and $\beta_j$, i.e., the part of the variance of the expected value of the output conditioned on both parameters that is not explained by their individual main effects.

The sensitivity of the *i*th parameter is determined by computing the variation in the model output obtained by keeping the *i*th parameter constant and resampling the other input parameters. The first order sensitivity index is hence defined as:

$$S_i = \frac{V_i}{V(Y)}$$

The presence of interaction or interdependence between the model parameters can also be judged by calculating the total effect index, which is the combination of the first order sensitivity of the *i*th factor and its interactions with the other sensitive parameters:

$$S_{Ti} = S_i + \sum_{j \ne i} S_{ij} + \cdots$$

The total effect of the *i*th parameter is measured by estimating the variation of the model output obtained by resampling the *i*th parameter while keeping the other sensitive parameters constant. Computation of each term in the expression above is a cumbersome task. In order to overcome this, an equivalent expression of the total effect index (Saltelli *et al*. 2008) is:

$$S_{Ti} = 1 - \frac{V\left(E(Y \mid \beta_{\sim i})\right)}{V(Y)}$$

where $\beta_{\sim i}$ denotes all parameters except $\beta_i$. Variance-based methods consider the entire range of the parameter space and also the interactions between parameters. However, for semi-distributed models, such an approach may lead to complexity because of the existence of a large number of parameters and their interactions. Therefore, prior investigation is needed to decide the applicability of this method for SA in semi-distributed HMs.

Saltelli *et al*. (2008) state that the total effect index is always greater than or equal to the first order index of a particular factor. However, this is contradicted by the fact that, in the case of correlated factors, the total effect index may take a value smaller than the first order sensitivity index (Pappenberger *et al*. 2008). The variance-based method is often preferred over other methods since it is independent of any assumed model function and can capture interaction effects efficiently. It has been adopted in various hydrological modeling studies, such as distributed modeling and rainfall–runoff modeling (Lilburne *et al*. 2006; Sobol & Kucherenko 2009; Saltelli *et al*. 2010; Zhan *et al*. 2013; Rakovec *et al*. 2014). An extension of this approach is also adopted in multi-objective SA (Gupta *et al*. 1999; Bastidas *et al*. 2006; Pappenberger *et al*. 2008; van Werkhoven *et al*. 2008a; Wagener *et al*. 2009). The major drawback of the variance-based method is that it requires considerable computation time (model runs), which in turn demands a viable, efficient, and robust sampling technique for its implementation.
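A brute-force sketch of the first-order index $S_i = V(E(Y \mid \beta_i))/V(Y)$ makes the computational cost concrete (the additive toy model $y = 4x_0 + x_1$ with uniform inputs is an illustrative assumption; analytically $S_0 = 16/17 \approx 0.94$). Note the nested resampling: even this crude estimator needs thousands of model runs, which is the drawback noted above.

```python
import random

def first_order_index(model, sampler, i, n_grid=50, n_per_point=200, seed=0):
    """Brute-force first-order Sobol index S_i = V(E[Y|b_i]) / V(Y):
    fix parameter i on a grid of values, resample the other parameters,
    and compare the variance of the conditional means with the total
    output variance."""
    rng = random.Random(seed)
    cond_means, all_y = [], []
    for b in range(n_grid):
        xi = (b + 0.5) / n_grid            # fixed value of parameter i
        ys = []
        for _ in range(n_per_point):
            x = sampler(rng)
            x[i] = xi                      # hold parameter i, vary the rest
            ys.append(model(x))
        cond_means.append(sum(ys) / len(ys))
        all_y.extend(ys)
    mean_y = sum(all_y) / len(all_y)
    var_y = sum((y - mean_y) ** 2 for y in all_y) / len(all_y)
    var_cond = sum((m - mean_y) ** 2 for m in cond_means) / len(cond_means)
    return var_cond / var_y

# Toy additive model y = 4*x0 + x1 with x ~ U(0,1)^2: analytically
# S_0 = 16/17 (about 0.94) and S_1 = 1/17 (about 0.06).
model = lambda x: 4.0 * x[0] + x[1]
sampler = lambda rng: [rng.random(), rng.random()]
s0 = first_order_index(model, sampler, 0)
```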

### Elementary effect method

The elementary effect (EE) method (Morris 1991) is known for its ease of application and computational benefits, making it suitable for computationally demanding modeling exercises (Shin *et al*. 2013). Unlike derivative-based methods, it allows larger variability of the inputs while following the concept of local variation around a base point. A sensitivity measure is ultimately derived, which is an average of many local measures over the entire input space. This method is found to be efficient when the number of input parameters is large (Saltelli *et al*. 2008). The EE method was first introduced to examine the significance of each factor individually by ranking them in the order of their importance. This method is model independent and gives a qualitative measure of sensitivity.

For each parameter, the mean of the elementary effects, *μ*, accounts for the total effect on the output responses, and the standard deviation *σ* measures the interaction among different input parameters. Campolongo *et al*. (2007) substituted the measure *μ* with *μ**, the mean of the absolute values of the elementary effects, to neutralize the effects of opposite signs, especially when the model is non-monotonic. *μ** is also able to approximate the total sensitivity measure when working with groups of factors, i.e., when more factors are moved at the same time, the factors are first considered in groups. The modified mean *μ** is defined as:

$$\mu_i^* = \frac{1}{r} \sum_{j=1}^{r} \left| EE_i^{j} \right|$$

where *r* is the number of elementary effects computed for the *i*th parameter and $EE_i^{j}$ is the *j*th elementary effect of that parameter.

A higher *μ** indicates that the parameter *i* significantly influences the model response, whereas higher *σ* values indicate that the interaction of *i* with other parameters is significant. Campolongo *et al*. (2007) further state that *μ** is a good alternative to the total sensitivity index S_{T}. Sobol & Kucherenko (2009) suggest that the assessment of global sensitivity for individual factors, or groups of such factors, can be improved by estimating the mean of the squares.

Each sensitive parameter is resampled once to compute the model response. The EE of each parameter is then computed by subtracting the base model response from the response generated by the change in the *i*th parameter and normalizing by the amount by which the *i*th parameter was changed. These EEs are calculated for each sensitive parameter and the sensitivity indices are estimated. The EE method has found wide application in the field of hydrology, such as studies of runoff generation using the Distributed Time Variant-Gain Model (DTVGM) (Zhan *et al*. 2013), the Xinanjiang model (Song *et al*. 2012, 2013), and the MIKE/NAM rainfall–runoff model (Liu & Sun 2010). Also, in water quality modeling studies, the EE method proved its efficacy in finding the optimum number of repetitions (*r*) of the EEs (Ruano *et al*. 2011). Further, Rakovec *et al*. (2014) combined the EE method effectively with the variance-based and regional SA methods, termed the distributed evaluation of local sensitivity analysis (DELSA) method, and demonstrated its application to a simple non-linear reservoir model to find the most influential parameters. Despite its numerous advantages, such as lower computational cost and ease of implementation, the EE method may lead to false conclusions due to scaling issues (Razavi & Gupta 2015) and because a ranking based only on *μ*, *μ**, and *σ* (Sobol & Kucherenko 2009) may be ineffective for highly non-linear functions.
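The procedure described above can be sketched as follows. This is a simplified radial variant of Morris's trajectory design (one base point per repetition, each parameter perturbed once), applied to a hypothetical three-parameter test function with a strong linear factor, a weaker non-linear factor, and one inert factor:

```python
import numpy as np

def morris_ee(model, n_params, r, delta=0.25, rng=None):
    """Radial sketch of the Morris elementary-effect (EE) screening method:
    r base points; each parameter is perturbed once per base point."""
    if rng is None:
        rng = np.random.default_rng()
    ee = np.empty((r, n_params))
    for t in range(r):
        x = rng.random(n_params) * (1.0 - delta)   # keep x + delta inside [0, 1]
        y0 = model(x)
        for i in range(n_params):
            xi = x.copy()
            xi[i] += delta
            ee[t, i] = (model(xi) - y0) / delta    # elementary effect of factor i
    mu = ee.mean(axis=0)                 # mean effect (signs may cancel)
    mu_star = np.abs(ee).mean(axis=0)    # Campolongo et al.'s |EE| mean
    sigma = ee.std(axis=0, ddof=1)       # spread: non-linearity / interactions
    return mu, mu_star, sigma

# hypothetical test function: y = 3*x0 + x1^2 (x2 has no influence)
f = lambda x: 3.0 * x[0] + x[1] ** 2
mu, mu_star, sigma = morris_ee(f, 3, r=200, rng=np.random.default_rng(7))
```

For this function, *μ** correctly ranks x0 first and x2 as non-influential, while *σ* is near zero for the linear factor x0 and clearly positive for the non-linear factor x1.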

## HMs AND SA APPLICATION

Available conceptual as well as distributed HMs, namely the Variable Infiltration Capacity (VIC) model, SWAT, TOPMODEL, MIKE-SHE, etc., demand the screening of sensitive model parameters. Past studies have applied a wide range of SA approaches to various HMs. Table 2 summarizes previous local and global SA studies carried out with various HMs.

| SA methods | Type of SA approach | HMs | Number of parameters | Number of runs | Reference |
|---|---|---|---|---|---|
| DELSA | Hybrid of local and global | Five Framework for Understanding Structural Errors (FUSE) model structures (FUSE-016, FUSE-014, FUSE-160, FUSE-072, and FUSE-170) | 14 | 300 | Rakovec et al. (2014) |
| Central difference | Local | TOPKAPI | 35 | 71 | Foglia et al. (2009) |
| Monte-Carlo sampling based | Local | VIC | 10 | 59,049 | Demaria et al. (2007) |
| Combination of Monte-Carlo sampling and likelihood estimates; Generalized Likelihood Uncertainty Estimation (GLUE) | Global | VIC | 6, 7 | – | Bao et al. (2011), He & Pang (2015) |
| | | MIKE-SHE | 8 | >400 | Blasone et al. (2008) |
| LH sampling with regression analysis | Global | SWAT | 35 | – | Muleta & Nicklow (2005) |
| LH sampling with probability distribution of outputs | Global | MIKE-SHE | 5 | 25 | Christiaens & Feyen (2002) |
| LH-OAT | Global | SWAT | 27, 41 | >100 | Holvoet et al. (2005), van Griensven et al. (2006) |
| Elementary-effect (EE)/Morris method | Global | SAC-SMA | 14 | 280 | Gan et al. (2014) |
| | | HL-RDHM | 78 × 14 | 20,000 | Herman et al. (2013a, 2013c) |
| | | REALM | 14 | 18,000 | King & Perera (2013) |
| | | MIKE/NAM | 9 | >6,000 | Liu & Sun (2010) |
| Variance-based (VB)/Sobol method | Global | SAC-SMA | 14 | 1,050 | Gan et al. (2014), van Werkhoven et al. (2008a, 2009), Wagener et al. (2009) |
| | | HL-RDHM | 78 × 14 | >6 million | Herman et al. (2013a) |
| | | HBV | 12 | 10,000 | Herman et al. (2013b) |
| | | SWAT | 26 | 336,000; 60,000 | Nossent et al. (2011), Zhang et al. (2013) |
| | | TOPMODEL | 9 | 5,632 | Reusser et al. (2011) |
| | | Five FUSE model structures (FUSE-016, FUSE-014, FUSE-160, FUSE-072, and FUSE-170) | 14 | 2,000,000 | Rakovec et al. (2014) |
| Combination of EE and VB method | Global | Xin'anjiang model | 10 | 100,000 | Song et al. (2013) |
| | | DTVGM | 14 | 600; 4,000 | Zhan et al. (2013) |

FUSE: Framework for Understanding Structural Errors; TOPKAPI: Topographic Kinematic Approximation and Integration; VIC: Variable Infiltration Capacity Model; MIKE-SHE, MIKE/NAM: rainfall–runoff models; SWAT: Soil and Water Assessment Tool; SAC-SMA: Sacramento Soil Moisture Accounting; HL-RDHM: Hydrology Laboratory – Research Distributed Hydrologic Model; REALM: Resource Allocation Model; HBV: Hydrologiska Byrans Vattenbalansavdelning Model; TOPMODEL: Topography-based Hydrological Model; DTVGM: Distributed Time Variant Gain Model.

As mentioned earlier, in addition to global approaches, local SA methods play an important role in the SA of various HMs, especially in dealing with the major issue of computational requirements. Rakovec *et al*. (2014) proposed an SA methodology combining local and global approaches (the EE and VB methods, respectively). A comparison of the DELSA approach with the traditional Sobol method on two models, i.e., a simple non-linear reservoir model and a complex conceptual HM framework (FUSE), demonstrated the potential utility (lower computational cost) of the DELSA approach (Rakovec *et al*. 2014). Furthermore, SA using a combination of LH sampling and probability distributions in the MIKE-SHE model required only 25 runs (Christiaens & Feyen 2002), while the same exercise implemented using the GLUE approach required more than 400 runs (Blasone *et al*. 2008). Similarly, SA of the SWAT model carried out with the LH-OAT design requires fewer model simulations than the traditional Sobol method (Holvoet *et al*. 2005; van Griensven *et al*. 2006; Nossent *et al*. 2011; Zhang *et al*. 2013). Gan *et al*. (2014) comprehensively evaluated various SA approaches using the SAC-SMA model and demonstrated that the Morris method requires fewer model evaluations than the Sobol method. It was also shown that traditional methods like regression and correlation analysis are not efficient in dealing with real-world problems, which are non-linear or non-monotonic in nature. The efficiency of the Morris method relative to the Sobol method is also demonstrated in Herman *et al*. (2013a). Further, SA of the 78 × 14 parameters of the HL-RDHM model was carried out by both the Morris (20,000 model runs) and Sobol (>6 million model runs) methods (Herman *et al*. 2013a, 2013c).

## SPATIO-TEMPORAL DEPENDENCE OF SA

SA of dynamic hydrologic systems is predominantly dependent on both space and time (McCuen 1973; Finnerty *et al*. 1997; Bastola & Murphy 2013). The physical significance of the input factors can be assessed by considering their sensitivity to space and time resolutions. The dependence of SA on space and time is attributed to the availability of input data from distinct real data sources at different spatial and temporal scales. In fully distributed HMs, the spatial variability of input parameters is considered to capture catchment heterogeneity in the model response. The prevalent use of these complex distributed models necessitates a proper understanding of their physical basis, both spatially and temporally (O'Loughlin *et al*. 2013).

Finnerty *et al*. (1997) explored the space–time scale sensitivity of the Sacramento model and found that a decrease in spatial resolution lowers the evapotranspiration estimates, which in turn increases the total channel runoff. An increase in surface runoff is also witnessed with a decrease in time scale (for a constant spatial scale). In addition to the Sacramento model, the responses from TOPMODEL also indicate sensitivity to space and time (Bruneau *et al*. 1995). The importance of considering various temporal scales in SA, to gain adequate knowledge about the underlying model processes, is highlighted in various past studies (Wagener *et al*. 2003, 2009; Kavetski & Clark 2010; Reusser & Zehe 2011; Reusser *et al*. 2011; Herman *et al.* 2013a, 2013b, 2013c). However, most of the related studies assessing temporal variability have ignored many of the important parameters in an attempt to consider longer simulation periods. A possible reason could be that many processes depend on definite climatic conditions, which occur at regular time intervals and are confined to small areas, and hence are identifiable only at short time scales. For example, when a conceptual rainfall–runoff model is operated at a daily time scale, almost all input parameters are significant; whereas when the time scale is increased to yearly, only a few factors relating to soil storage and recession constants show high sensitivities (Massmann & Holzmann 2012). Bruneau *et al*. (1995) state that model efficiency is more sensitive to the choice of time scale than to the grid size, and demonstrated this by simulating surface runoff at a coarser spatial resolution and a medium temporal resolution. Likewise, in hydrological modeling, the intrinsic linkage between the input parameters and the space–time scale has been established by many studies (Obled *et al*. 1994; Bruneau *et al*. 1995; Finnerty *et al*. 1997; Annan 2001).
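The time-scale effect reported by Massmann & Holzmann (2012) can be reproduced qualitatively with a toy linear-reservoir model; all parameter names, values, and forcing below are hypothetical, chosen only for illustration. A recession constant that dominates day-scale flow variability contributes little to the year-scale water balance, while a runoff coefficient matters at both scales:

```python
import numpy as np

def simulate(c, k, rain):
    """Toy linear reservoir: effective rain c*r fills storage S; outflow q = k*S."""
    S, q = 0.0, np.empty(len(rain))
    for t, r in enumerate(rain):
        S += c * r
        q[t] = k * S
        S -= q[t]
    return q

def norm_sens(metric, c, k, rain, h=1e-5):
    """Normalized central-difference sensitivities of a scalar metric to c and k."""
    base = metric(simulate(c, k, rain))
    dc = (metric(simulate(c + h, k, rain)) - metric(simulate(c - h, k, rain))) / (2 * h)
    dk = (metric(simulate(c, k + h, rain)) - metric(simulate(c, k - h, rain))) / (2 * h)
    return abs(dc * c / base), abs(dk * k / base)

rng = np.random.default_rng(0)
rain = rng.exponential(5.0, 365)                 # one year of hypothetical daily rain
daily = norm_sens(np.std, 0.5, 0.2, rain)        # day-scale variability metric
annual = norm_sens(np.sum, 0.5, 0.2, rain)       # year-scale water-balance metric
```

The runoff coefficient c scales the whole hydrograph, so its normalized sensitivity is 1 at both scales; the recession constant k strongly shapes daily variability but barely affects the annual total, mirroring the shrinking set of sensitive parameters at coarser time scales.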

## KEY FACTORS TO BE CONSIDERED IN SA

### Objective of SA and associated definition

The first and foremost step before undertaking SA is to obtain adequate information on the objective behind it. As discussed earlier, there are various objectives of SA. Limited knowledge of the purpose of the study and lack of proper definition of sensitivity will produce inaccurate or even conflicting results regarding the sensitivities of the associated parameters. Sensitivity results differ as objective functions vary (van Werkhoven *et al*. 2008b; Wagener *et al*. 2009; Liu & Sun 2010; Shin *et al*. 2013; Song *et al*. 2013; Zhan *et al*. 2013). However, this can be addressed through an intuitive selection of single weighted objective functions that incorporate the essence of different objective functions, in the form of weights (Hill & Tiedeman 2006; Foglia *et al*. 2009).
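The dependence of sensitivity rankings on the objective function is easy to demonstrate with a hypothetical two-parameter recession hydrograph (names and values below are illustrative, not from the cited studies): a peak-flow objective ranks the scale parameter first, while a low-flow objective ranks the recession parameter first.

```python
import numpy as np

def hydrograph(a, b, t):
    """Toy recession hydrograph: q(t) = a * exp(-b t)."""
    return a * np.exp(-b * t)

def local_sensitivity(objective, params, h=1e-6):
    """Central-difference sensitivity of a scalar objective to each parameter."""
    grads = []
    for i in range(len(params)):
        up, dn = params.copy(), params.copy()
        up[i] += h
        dn[i] -= h
        grads.append((objective(up) - objective(dn)) / (2 * h))
    return np.abs(np.array(grads))

t = np.arange(10.0)
peak = lambda p: hydrograph(p[0], p[1], t).max()      # peak-flow objective
low_flow = lambda p: hydrograph(p[0], p[1], t)[-1]    # low-flow objective

p0 = np.array([1.0, 0.3])                             # nominal (a, b)
s_peak = local_sensitivity(peak, p0)
s_low = local_sensitivity(low_flow, p0)
```

The peak q(0) = a is independent of b, whereas the tail q(9) = a·exp(−9b) is roughly nine times more sensitive to b than to a at this nominal point, so the two objectives reverse the parameter ranking.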

### Computational budget

SA approaches differ in terms of their application or computational cost. The computational budget depends on the number of model runs required to simulate realistic model responses. Since more realistic results generally require more complex models, substantial computational resources are needed for most modeling exercises. To circumvent this problem, Razavi *et al*. (2012a, 2012b) proposed a surrogate modeling approach in which complex models that require more computational resources, in terms of time and cost, are replaced by proxy models that are relatively fast, thereby greatly reducing the computational demands.

### Selection of appropriate method of SA for desired model

Proper selection of the SA method is a key step in making any analysis successful. The choice of SA approach for a given modeling exercise depends on a number of factors, such as the physical basis of the model employed, the time available to carry out the analysis, and the number of input factors. Selection of the best-suited SA approach ensures robust and realistic results.

### Range of the parameter and the number of samples

In any sampling method, the number of samples to be considered is one of the crucial factors determining the model outcome. The range of a parameter is determined from the physical basis of that parameter and simulated results from past studies. The same parameter may exhibit a wide range in one study region and may take a near-constant value in another (Shin *et al*. 2013; Zhan *et al*. 2013). It is to be noted that any inaccuracy in the determination of parameter ranges may result in an insensitive parameter being identified as sensitive and vice versa (Shin *et al*. 2013). The number of samples is also found to significantly affect the model responses (Tong 2006).
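A common way to keep the sample count manageable while still covering the full parameter range is Latin hypercube (LH) sampling. The sketch below stratifies each parameter into n equal-probability intervals and draws exactly one value per interval; the parameter ranges at the end are hypothetical, for illustration only.

```python
import numpy as np

def latin_hypercube(n_samples, n_params, rng):
    """Latin hypercube sample on [0, 1]^k: each dimension is split into
    n_samples equal strata and exactly one point is drawn per stratum."""
    strata = np.tile(np.arange(n_samples), (n_params, 1))   # (k, n) stratum ids
    strata = rng.permuted(strata, axis=1).T                 # shuffle per dimension
    return (strata + rng.random((n_samples, n_params))) / n_samples

# scale to hypothetical ranges for two model parameters
lo, hi = np.array([0.0, 0.1]), np.array([1.0, 0.9])
u = latin_hypercube(100, 2, np.random.default_rng(3))
X = lo + u * (hi - lo)
```

Compared with plain random sampling of the same size, the stratification guarantees that no part of a parameter's range is left unexplored, which is why LH designs are popular when each model run is expensive.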

### Distinguishing SA and uncertainty analysis

As stated before, uncertainty in the input parameters increases the uncertainty in the model responses, which in turn significantly influences how decision-makers interpret model outputs in sensitivity studies. While the objectives of SA and uncertainty analysis (UA) are different, they often complement each other. Saltelli & Annoni (2010) state that, while UA gives a measure of the certainty or uncertainty of the inputs or the outcome, SA complements UA by finding the source of the uncertainty. Although the two analyses have different objectives, they are often paired (Yapo *et al*. 1996; Gupta *et al*. 1999; Crosetto *et al*. 2000; Madsen 2000).

### Correlation between input parameters

The majority of SA methods neglect the often strong correlation between inputs and consider the input parameters to be independent. The inter-relations between various inputs significantly affect the model outcomes and the associated uncertainties (Lemke *et al*. 2004; Xu & Gertner 2007; Manache & Melching 2008; Pan *et al*. 2011). Therefore, a proper understanding of the dependencies between input factors and their influence on model results is important in any modeling study.

## REGIONAL DEPENDENCE OF SA METHODS AND SAMPLING STRATEGIES

Although a pool of methods is available to perform SA, the approaches differ widely in the distribution pattern of parameters, mathematical representation, pertinence, and sensitivity indices, so one must be conscientious while selecting a befitting SA strategy. To generate significant and convincing results, the SA strategy should be selected by considering the primary objective of the study and the computational time needed. Adoption of a particular SA and sampling strategy also depends, to some extent, on the modeler's technical familiarity with the region (or basin) being modeled and its hydrological characteristics. Our review revealed a strong affinity between the SA/sampling techniques deployed and the regions modeled. A systematic literature review with different SA and sampling methods as keywords, i.e., ‘Random sampling', ‘Stratified sampling', ‘LH sampling', ‘Monte-Carlo method', ‘EE method', and ‘Variance based method', along with the keyword ‘hydrology', was carried out to explore the regional preference of SA methods in hydrological modeling. An approximate analysis of the worldwide usage of various approaches thus derived is illustrated in Figure 1. Although different regions employ different sampling and sensitivity techniques, the most popular techniques in the Asian region are the variance-based (18%) and Monte Carlo (18%) methods. EE (33%) and LH (31%) methods are widely used in North America. LH (35%) and Monte Carlo (33%) methods are popular in Central and South America. Random sampling and stratified sampling are widely used in Africa, Europe (including the Middle East), and Australia, with percentage usages of 19%, 13%, and 24%, respectively, in each region. Interestingly, the LH, Monte Carlo, EE, and variance-based techniques and their extensions have very little exposure in Africa and in European countries including the Middle East.

## DISCUSSION AND CONCLUSIONS

SA, one of the most important components of hydrologic modeling, helps to simplify complexities and understand the physical processes of complex hydrologic systems in a comprehensive way. The principal aim of SA is to assess the variability of the response surface with respect to significant changes in the input factors and to prioritize these factors by identifying the non-influential ones. This simplifies the model either by omitting a few trivial input parameters or by assigning constant values to them. Various SA methods exist, differing in mathematical approach, assumptions, availability, cost of application, and applicability. The choice of SA approach depends on the field of application and the definition of sensitivity adopted. In this paper, different SA techniques are reviewed and categorized in terms of the number of parameters, scale and function dependency, and derivative or non-derivative basis. It is widely accepted that an ambiguous definition of SA produces different or even contradictory outcomes (Hamby 1994). Hence, the first and foremost step in performing SA is defining the problem domain according to the nature of the application, so as to produce worthwhile results.

Most of the existing methods have their own pros and cons and therefore may not be considered superior or inferior to each other. Many studies have also questioned the reliability of depending on any one sensitivity technique, because as model complexity increases, the task of identifying correlated parameters and interaction effects becomes more challenging and time-consuming. Also, when modeling is done with the intention of depicting the physical processes more realistically, or in other words, when the model becomes more distributed, the number of parameters or factors contributing to the modeling study also increases. This further increases the uncertainty in the model outcomes, primarily due to the uncertain model inputs. Hence, multi-objective SA finds application in many studies (Gupta *et al*. 1998; Bastidas *et al*. 2006). A few studies have even highlighted the applicability of a single best objective function comprising all the features of the different objectives (Blasone *et al*. 2007; Foglia *et al*. 2009).

For low-dimensional systems, various approaches like OAT, stratified sampling, and LOR estimates may simulate realistic results. However, with an increase in the number of input factors, the analysis suffers from the curse of dimensionality and becomes cumbersome. The EE method, preferred in the case of complex models, may also give erroneous results, mistaking non-influential parameters for influential ones, since it is highly dependent on the step size or scale. In any complex system where interaction between parameters plays an important role, the variance-based method is adopted, as it does not depend on the formulation of an objective function. Variance-based methods accommodate the spatial–temporal dependency of the input factors. The major drawback of the variance-based technique is the high cost of analysis, which can be reduced through proper screening of samples prior to implementation. The variance-based method, together with sampling methods like LH sampling and the EE method, may eliminate many drawbacks, such as computational cost, interaction effects between variables, and the spatial–temporal dependency of the outcome on the input factors. It is evident that almost all the SA techniques presented are susceptible to one limitation or another. The majority of the methods assume independence among the inputs, which may not reflect real-world problems. Therefore, considering all these factors, modelers should exercise subjective judgment while carrying out SA in hydrologic modeling studies, which in turn demands thorough knowledge of the inherent characteristics of the system and the interactions between the various variables involved. These facts further reinforce the need to develop a comprehensive SA method that would aid in automating the steps in hydrologic modeling, thus helping decision-makers in the adequate management of system components.

## ACKNOWLEDGEMENTS

The authors would like to thank the Department of Science and Technology (India) for partly supporting this study through the grant no: SB/FTP/ETA-0110/2013 to Dr Dhanya C.T. The authors express sincere thanks to the Indian Institute of Technology, Delhi for supporting this work.
