## Abstract

Quantifying climate change impact on water resources systems at regional or catchment scales is important in water resources planning and management. General circulation models (GCMs) represent our main source of knowledge about future climate change. However, several key limitations restrict the direct use of GCM simulations for water resource assessments. In particular, the presence of systematic bias and the need for its correction is an essential pre-processing step that improves the quality of GCM simulations, making climate change impact assessments more robust and believable. What exactly is systematic bias? Can systematic bias be quantified if the model is asynchronous with observations or other model simulations? Should model bias be sub-categorized to focus on individual attributes of interest or aggregated to focus on lower moments alone? How would one address bias in multiple attributes without making the correction model complex? How could one be confident that corrected simulations for the yet-to-be-seen future bear a closer resemblance to the truth? How can one meaningfully extrapolate correction to multiple dimensions, without being impacted by the ‘Curse of Dimensionality’? These are some of the questions we attempt to address in the paper.

## HIGHLIGHTS

Importance of procedures for correcting systematic biases is discussed.

Extensive literature is presented on bias correction and its use.

The importance of correcting specific attributes for water resources applications is illustrated.

Challenges in formulating a bias correction alternative are highlighted.

Information is added on why correcting these biases is critical before any dynamical or statistical downscaling application.

## INTRODUCTION

‘All Models Are Wrong, Some Are Useful’. These famous words from nearly half a century ago (Box 1976) are a reminder of the limits all models are constrained by. The usefulness of models, however, improves, often significantly, if systematic bias can be removed. To do this, the bias needs to be quantified, and appropriate corrections need to be devised. These corrections may be simple re-parameterizations of the model, an update in model state variables, or a more comprehensive re-specification of the model structure. Identifying, characterizing, and removing such bias becomes especially tricky if the model is complex and its inputs (as well as the observed responses used for assessment) are uncertain. Such is the scenario that characterizes the biases inherent in climate model simulations of the future, which form the focus of this article.

Climate change impact studies often require simulated outputs from climate impact models. Coupled atmosphere-ocean general circulation models (GCMs) or regional climate models (RCMs) are our main source of knowledge about likely changes in the climate into the future but are known to contain systematic errors (biases) in simulations when compared with observations (Collins *et al.* 2013). In particular, regional and local-scale events and extremes are not well represented in GCMs (Mehrotra & Sharma 2015). As a result of this, the majority of the ‘raw’ simulations of climate models do not statistically match with observations (Mehrotra & Sharma 2015). Therefore, it is necessary to remove or minimize the differences between the observed and raw GCM/RCM simulations as these are used to drive downscaling and impact assessment models. Bias correction (BC) has become even more important in climate change impact research due to growing databases of global and regional climate model simulations and relative simplicity and effectiveness of BC approaches. Over recent years, many different methods have been developed to perform BC. These vary from very simplistic methods, such as the delta-change method, which takes the difference between the current and future models and then adds the difference to the current observation to obtain a projected future (Xu 1999; Hay *et al.* 2000; Lenderink *et al.* 2007), or correcting only the statistical mean of the simulations (Wilby & Wigley 1997), or the variance correction using simple scaling adjustment (Berg *et al.* 2012) to more sophisticated methods, for example, based on complex stochastic modelling aimed at fixing bias in various dependence attributes (Eden *et al.* 2014; Wong *et al.* 2014). The most common BC approach in hydrological studies uses quantile-mapping (QM) (Panofsky *et al.* 1958; Haddad & Rosenfeld 1997; Wood *et al.* 2004; Déqué 2007; Piani *et al.* 2010; Gudmundsson *et al.* 2012). 
QM has been applied to climate model output globally (Haerter *et al.* 2011; Teutschbein & Seibert 2012; Maurer & Pierce 2014) and its variants have been presented and thoroughly reviewed by researchers (Maraun 2016; Shrestha *et al.* 2017; Guo *et al.* 2020).

A number of studies have also highlighted the common issues of the application of QM BC, specifically when assessing projected climate change impacts. It was found that QM can modify long-term changes in simulated series (Hagemann *et al.* 2011; Themeßl *et al.* 2012; Maraun 2013; Maurer & Pierce 2014). Maurer & Pierce (2014) evaluated the effect of QM on GCM-simulated precipitation changes between two historical periods and noted that the approach can affect trends in extreme quantiles differently than trends in the mean. They suggested that further work is needed to assess the impact of QM on the tails of the distribution.

Although climate model performance has improved, deficiencies remain in representing regional variability on interannual to decadal time scales (Sheffield *et al.* 2013; Beadling *et al.* 2020; Fan *et al.* 2020; Wang *et al.* 2021). Biases in the temporal variability of raw climate series can pose problems for impact modelling. For example, a model might have appropriate variability at hourly or daily time scales, however, little or no variability at annual time scales. To deal with this type of bias, a single timescale (e.g., daily or monthly) BC approach such as delta change or QM may not be the best option to be used, since it focuses only on daily or monthly statistics of the time series and therefore cannot necessarily correct for the variability biases at aggregated time scales and the resulting bias-corrected series might show a biased representation of the magnitude and variability of the low-frequency phenomena such as El Niño and La Niña (Ashfaq *et al.* 2011; Bellenger *et al.* 2014). Such biases could also modify the simulations and statistics of heat waves, flooding events, agriculture estimates, and reservoir simulations (Pierce *et al.* 2015).

Considering standard BC approaches ignore the biases in aggregated time series, nested bias correction (NBC) offers a solution as it corrects for the biases in both distribution and persistence at multiple time scales. As the BC works over pre-defined time scales, biases at other time scales not directly corrected may not be eliminated (Johnson & Sharma 2012). To avoid the specification of such pre-defined time scales (e.g., daily, monthly, and annual), researchers have suggested converting the time series into the frequency domain using an appropriate transformation and applying BC there (Pierce *et al.* 2015; Nguyen *et al.* 2016), as correction in the frequency domain corresponds to applying NBC at multiple time scales in the time domain. The idea of frequency bias correction (FBC) is based on the logic that when a time series is analysed in the frequency domain, the variance of the time series can be expressed as a function of frequency, making it more suitable for studying variability across different time scales. Similar to FBC, time-frequency-based BC approaches have been developed using wavelet transforms. The discrete wavelet-based bias correction (DWBC) (Kusumastuti *et al.* 2021) employs the discrete wavelet transform (DWT) to bias correct climate model simulations in the time and frequency domains. The correction is performed across the spectrum, recognizing the trend in the time series. Another refined approach, continuous wavelet-based bias correction (CWBC), offers more precise correction than DWBC since it also applies a correction to the non-dyadic spectrum, which is missed by the DWT (Kusumastuti *et al.* 2022). The wavelet-based bias correction (WBC) approaches offer the next level of BC through their ability to maintain the climate change signal, which QM BC is known to alter (Cannon *et al.* 2015; Maraun 2016).

Climate change impact assessment studies and both dynamical and statistical downscaling approaches require several climate variables as inputs and these often have strong physical inter-dependencies in observations (Li *et al.* 2014; Vrac & Friederichs 2015; Mehrotra & Sharma 2015, 2016). Univariate BC approaches, e.g., Delta, Scaling, QM, or time nesting approaches (NBC or recursive nested bias correction, RNBC), assume independence across variables or GCM grid cells, apply BC independently for each variable or grid cell, and do not recognize the spatial and cross-variable dependences of climate variables (White & Toumi 2013; Li *et al.* 2014; Chen *et al.* 2015). For example, it was found that univariate QM matched the time series distribution quite well but mismatched the observed and bias-corrected spatial and inter-variable dependencies of the time series (White & Toumi 2013). Similarly, Mehrotra & Sharma (2015, 2016) found that univariate BC approaches, namely QM and NBC, were ineffective in correcting spatial and cross-variable dependencies.

A bivariate correction procedure was proposed by Piani & Haerter (2012) to simultaneously correct temperature and precipitation. This required correcting one time series (i.e., precipitation) conditional to the bias-corrected time series for the other variable (i.e., temperature). Alternate bivariate corrections were proposed by Li *et al.* (2014) by constructing bivariate distributions given univariate marginal based on Copula theory to jointly correct precipitation and temperature biases in climate models. This concept was later extended to multiple variables (Mao *et al.* 2015; Vrac & Friederichs 2015; Cannon 2016, 2018). The copula theory states that any multivariate distribution can be formulated from the individual marginal cumulative distributions and a copula function describing the dependence between the individual distributions.
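In symbols (the notation here is introduced only for illustration), this statement, known as Sklar's theorem, can be written for $d$ variables with marginal CDFs $F_1, \dots, F_d$ and joint CDF $F$ as

$$
F(x_1, \dots, x_d) = C\big(F_1(x_1), \dots, F_d(x_d)\big),
$$

where $C$ is the copula, itself a joint CDF on $[0,1]^d$ with uniform marginals. For the bivariate temperature and precipitation case this reads $F_{T,P}(t, p) = C\big(F_T(t), F_P(p)\big)$, so the marginals and the dependence structure can, in principle, be corrected separately.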

A multivariate extension of the linear mean/variance rescaling approach was proposed by Bürger *et al.* (2011) using a Cholesky decomposition of the covariance matrix, to correct multivariate GCM fields for use in a regression-based downscaling. Bárdossy & Pegram (2012) reproduced the observed precipitation spatial dependence field using the Cholesky decomposition in rank space. They suggested sequential correlation based on either a matrix modification or a sequential regression in order to preserve the spatial correlation. Mehrotra & Sharma (2015) presented a multivariate recursive nesting BC (MRNBC) and a multivariate extension of QM, multivariate recursive quantile nesting BC (MRQNBC) (Mehrotra & Sharma 2016) to correct systematic biases in multiple variables and across multiple time scales. A multivariate correction approach was similarly proposed by Cannon (2018), using the Cholesky decomposition BC rationale (Bürger *et al.* 2011) and a univariate QM. The approach relies on recursion to achieve convergence (based on the matching of observed and BC time series distributions).
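The idea of rescaling with a Cholesky factor of the covariance matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of any of the cited studies: the function name `cholesky_rescale` and the synthetic data are hypothetical, and both covariance matrices are assumed to be positive definite.

```python
import numpy as np

def cholesky_rescale(model, obs):
    """Linearly transform `model` (n_time, n_vars) so that its mean vector and
    covariance matrix match those of `obs`. Hypothetical helper for illustration."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Lower-triangular Cholesky factors: cov = L @ L.T
    L_mod = np.linalg.cholesky(np.cov(model, rowvar=False))
    L_obs = np.linalg.cholesky(np.cov(obs, rowvar=False))
    # A = L_mod^{-T} L_obs^{T}; anomalies @ A then have the observed covariance
    A = np.linalg.solve(L_mod.T, L_obs.T)
    return obs.mean(axis=0) + (model - model.mean(axis=0)) @ A

# Synthetic two-variable example (values are arbitrary assumptions)
rng = np.random.default_rng(0)
obs = rng.multivariate_normal([10.0, 3.0], [[4.0, 1.5], [1.5, 1.0]], size=500)
model = rng.multivariate_normal([12.0, 2.0], [[9.0, -0.5], [-0.5, 2.0]], size=500)
corrected = cholesky_rescale(model, obs)
```

Because the transformation is linear, it reproduces only the first two moments (means, variances, and Pearson correlations); higher-order distributional biases are left untouched, which is one reason the cited studies combine such steps with QM or recursion.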

While the multivariate approaches, for example, multivariate recursive nesting bias correction (MRNBC) and QM-based MRNBC (MRQNBC), represent the state of the art to bias correct multiple variables across multiple time scales, they still require specification of time scales of interest by the user. The mathematical complexity of these approaches grows with the increase in the number of variables and grid points/locations. Nguyen *et al.* (2018) proposed a multivariate extension of the frequency-based BC approach. The approach is aimed at addressing biases in high- and low-frequency variations in the individual time series as well as the biases across multiple variables for each of these variations, all in the frequency space thus avoiding the need for time scale nesting and matrix manipulation involved in MRNBC and MRQNBC procedure. Their results suggest that the approach can reduce both intra- and inter-variable dependence biases in the corrected simulations.

A different approach to the multivariate methods above stemmed from the Schaake Shuffle (SS) (Clark *et al.* 2004), which proposed a data re-shuffling strategy to alter dependence patterns in multiple ensembles of simulated climate variables. In this approach, the ensemble members for a given time step are ranked and matched with the rank of observation data from other time steps randomly selected from similar dates in the historical record. At the end of the procedure, the ensembles are shuffled following the observed time sequence thereby recovering the observed space–time variability in simulated ensembles. This rank-based shuffling technique with modifications has been applied for the purpose of multivariate BC or downscaling of climate simulations (Cannon 2018; Vrac & Friederichs 2015). A more comprehensive shuffling-based BC method has been introduced by Vrac & Friederichs (2015) and Mehrotra & Sharma (2019) for correcting multivariate structure of climate model outputs. The approach combines univariate QM with SS but attempts to capture the pattern of change suggested for future climates.

The rest of the paper is organized as follows. The next section discusses the biases in a model leading to the development of BC approaches from the simplest to the more sophisticated ones as introduced earlier. Section 3 focuses on the other aspects of BC that need to be considered when applying BC approaches in hydrological studies. In section 4, the limitations and assumptions of BC approaches are discussed. Finally, the discussion is summarized in section 5.

## MODEL BIASES AND METHODS FOR BC

All climate models, both global (GCMs) and regional (RCMs), have systematic errors (biases) in simulations. For example, climate models often simulate excessive rainy days, creating a drizzle effect. Similarly, systematic errors are present in the timing of the monsoon, the amount of seasonal rainfall, underestimation of rainfall extremes, and consistently too-high or too-low temperatures (Pörtner *et al.* 2022). The use of uncorrected simulations in impact models or assessments can yield unrealistic results.

Errors in climate models occur due to a range of factors. Key factors include limited spatial resolution (large grid sizes), uncertain historical forcing, discretization, our limited understanding of physical processes and their simplified formulation in models, and imperfect initialization of models (Gohar *et al.* 2017).

To overcome these biases, a range of BC methods have been developed. For all methods, it is important to realize that the quality of the observational datasets determines the quality of the BC. For an effective BC, it is important to have a good dataset of observations. Long-term datasets are needed if correcting extreme statistics (Mehrotra & Sharma 2015).

The simplest approach used for BC is the so-called delta change or change factor approach (Hay *et al.* 2000). The delta-change approach has a long history in climate impact research. It is not a BC as such, but employs the model's response to climate change to modify observations to represent a plausible future. As it is easy to use, it has formed a useful benchmark for BC in climate impact assessment studies. The delta change assumes that GCMs can accurately simulate relative changes. The observed time series is adjusted by either adding the difference or multiplying by the ratio between the future and present climate as simulated by the GCM, without altering the pattern over time. For rainfall, a percentage change is usually calculated. If the climate model predicts a 20% increase in rainfall, a new time series will be made by multiplying the historic rainfall by 1.2. More complexity in the approach can be introduced by defining different change factors for different months or seasons. For temperature, an additive correction is adopted; for example, if the climate model predicts 3 °C higher temperatures, 3 °C is added to all historic observations to construct a new time series representing the future climate. It may be noted that this method does not take into account changes in climate variability such as an increase in extreme rainfall or longer dry or wet spells. The selection of temporal domains, time periods, time scale, and type and number of change factors varies depending on the meteorological parameters and the specific needs of the study (Hansen *et al.* 2017).
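The two worked cases above (a 20% rainfall increase and a 3 °C warming) can be reproduced with a short sketch. The function name `delta_change` and all numbers are hypothetical, and a single change factor is used for the whole series rather than separate monthly or seasonal factors.

```python
import numpy as np

def delta_change(obs, model_hist, model_fut, kind="additive"):
    """Perturb an observed series by the model-simulated change in the mean.
    Hypothetical helper; uses one change factor for the whole record."""
    obs = np.asarray(obs, dtype=float)
    if kind == "additive":          # typical for temperature
        return obs + (np.mean(model_fut) - np.mean(model_hist))
    if kind == "multiplicative":    # typical for rainfall
        return obs * (np.mean(model_fut) / np.mean(model_hist))
    raise ValueError("kind must be 'additive' or 'multiplicative'")

# Rainfall: the model mean rises from 5 to 6, i.e. a change factor of 1.2
obs_rain = np.array([0.0, 5.0, 10.0, 2.5])
future_rain = delta_change(obs_rain, [4.0, 6.0], [4.8, 7.2], kind="multiplicative")

# Temperature: the model mean rises from 15 to 18, i.e. +3 degrees
obs_temp = np.array([12.0, 15.5, 20.0])
future_temp = delta_change(obs_temp, [14.0, 16.0], [17.0, 19.0], kind="additive")
```

Note that the sequence of wet and dry days in `future_rain` is identical to that in `obs_rain`, which illustrates why the method cannot represent changes in variability or spell lengths.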

Another popular alternative is quantile mapping (QM) (Panofsky *et al.* 1958; Haddad & Rosenfeld 1997; Wood *et al.* 2004; Déqué 2007; Piani *et al.* 2010; Gudmundsson *et al.* 2012). The approach modifies a modelled value by mapping quantiles of the model's distribution onto those of the observations. This approach adjusts not only the mean and variance but also the quantiles of the variable of interest. The major limitation of the basic QM method, or for that matter, of almost all BC approaches, is the assumption of stationarity (i.e., the observed cumulative distribution functions (CDFs) represent the baseline period used for correction, consequently forcing a model that remains stationary, or creating future distributions that are patterned after the observed). Considering this, many variants of QM have been proposed. For example, the equidistant cumulative distribution function matching (EDCDF) by Li *et al.* (2010) incorporates information from the model projection CDF rather than the historic model distribution. The method does not assume that the observed distribution applies to the future period. Instead, it assumes that the difference between the observed and model-predicted data remains constant in the future. Similarly, Michelangeli *et al.* (2009) proposed a cumulative distribution function transformation (CDF-t), which assumes that the historical mapping between the model and observed CDFs applies to the future period as well. EDCDF preserves the GCM-predicted change at each quantile additively (i.e., as future minus observed). However, changes in precipitation are often more usefully evaluated as multiplicative changes, since a fixed amount of precipitation change has different implications in wet and arid regions (Pierce *et al.* 2015). The quantile delta mapping (QDM) (Cannon *et al.* 2015) preserves relative changes in simulated precipitation quantiles. This is equivalent to applying QM to the detrended climate series and adding back the trends thereafter.
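A minimal sketch of basic empirical QM is given below. It is one of several possible implementations (parametric transfer functions are also common); the function name `quantile_map`, the number of quantile nodes, and the synthetic data are assumptions made for illustration, and the variable is assumed to be continuous (no large blocks of tied values such as dry days).

```python
import numpy as np

def quantile_map(model_hist, obs, model_target, n_quantiles=101):
    """Empirical quantile mapping with a piecewise-linear transfer function.
    Hypothetical helper; assumes a continuous variable without many ties."""
    probs = np.linspace(0.0, 1.0, n_quantiles)
    q_model = np.quantile(model_hist, probs)   # model quantiles (calibration period)
    q_obs = np.quantile(obs, probs)            # observed quantiles (same period)
    # Values outside the calibration range are clamped to the end points.
    return np.interp(model_target, q_model, q_obs)

# Synthetic example: the model is too warm and too variable
rng = np.random.default_rng(0)
obs = rng.normal(10.0, 2.0, size=5000)
model_hist = rng.normal(12.0, 3.0, size=5000)
corrected = quantile_map(model_hist, obs, model_hist)
```

Applying the same transfer function to a future simulation (passed as `model_target`) is exactly where the stationarity assumption enters: the mapping calibrated on the baseline period is taken to hold unchanged in the future.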

The QM approach is known to alter the magnitude of mean change as projected by the raw GCM simulations (Hagemann *et al.* 2011; Pierce *et al.* 2013; Maurer & Pierce 2014). Since BC represents a purely statistical ‘fix’, it does not discriminate between the physical processes associated with anthropogenic forcing and shorter-term fluctuations associated with natural climate variability internal to the model. To reduce the disparity between global modelling studies based on bias-corrected outputs from a GCM/RCM, it is prudent to implement a BC alternative that does not alter the original GCM trend. A simple adjustment to EDCDF can retain the model-predicted future change in mean precipitation that has been evaluated as a ratio (Wang & Chen 2014).
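The change-preserving idea behind QDM, described in the preceding paragraph, can be sketched as follows. This is a simplified reading of the approach with empirical CDFs, not a reference implementation: the function name, the plotting-position formula, and the synthetic data are assumptions, and the multiplicative branch assumes strictly positive values.

```python
import numpy as np

def quantile_delta_mapping(obs, model_hist, model_fut, kind="multiplicative"):
    """Sketch of quantile delta mapping with empirical CDFs (hypothetical helper)."""
    obs, model_hist, model_fut = (np.asarray(a, dtype=float)
                                  for a in (obs, model_hist, model_fut))
    n = model_fut.size
    # Non-exceedance probability of each future value within the future simulation
    ranks = np.argsort(np.argsort(model_fut))
    tau = (ranks + 0.5) / n
    q_hist = np.quantile(model_hist, tau)   # historical model value at the same quantile
    q_obs = np.quantile(obs, tau)           # observed value at the same quantile
    if kind == "multiplicative":            # preserves relative change (e.g. precipitation)
        return q_obs * (model_fut / q_hist)
    if kind == "additive":                  # preserves absolute change (e.g. temperature)
        return q_obs + (model_fut - q_hist)
    raise ValueError("kind must be 'multiplicative' or 'additive'")

# Synthetic example: the model projects a uniform 20% increase
rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 3.0, size=2000)
model_hist = rng.gamma(2.0, 4.0, size=2000)
model_fut = 1.2 * model_hist
corrected = quantile_delta_mapping(obs, model_hist, model_fut)
```

In this example the corrected series has a mean roughly 20% above the observed mean, so the model-projected relative change survives the correction, whereas basic QM offers no such guarantee.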

### NBC approach

The QM, CDF-t, and EDCDF approaches described in the preceding section focus on either monthly or daily attributes of the climate variables being corrected. However, in some situations and for certain types of variables, longer-term variability needs better representation to enable aggregated time scale characterization, for example, in the case of modelling drought and the availability of water resources. The missing interannual variability is addressed by using observed time series attributes at multiple time scales and performing time nesting, termed NBC (Johnson & Sharma 2012; Mehrotra & Sharma 2012). The NBC approach considers BC at multiple time scales. For example, a three-level NBC starts with BC at the daily time scale. The daily observed and bias-corrected series are then aggregated to a monthly time scale and BC is applied. The observed monthly and BC series are aggregated to an annual time scale and BC is applied again. Finally, the corrections at all time scales are combined. As time nesting was found to create artefacts in bias-corrected series attributes at daily time scales, Mehrotra & Sharma (2012) repeated the NBC procedure recursively to minimize the biases at all time scales. This modification was termed RNBC (Mehrotra & Sharma 2012).
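The three-level nesting described above can be sketched as below. This is a deliberately simplified illustration and not the published algorithm: it corrects only the mean and standard deviation at each level (the lag-one autocorrelation correction and the recursion of RNBC are omitted), it assumes a 360-day calendar stored as an array of shape `(n_years, 12, 30)`, and it assumes a strictly positive variable so that the ratio weights are well defined. All names are hypothetical.

```python
import numpy as np

def rescale(x, ref, axis=0):
    """Match the mean and standard deviation of x to those of ref along `axis`."""
    z = (x - x.mean(axis=axis, keepdims=True)) / x.std(axis=axis, keepdims=True)
    return z * ref.std(axis=axis, keepdims=True) + ref.mean(axis=axis, keepdims=True)

def nested_bias_correction(model, obs):
    """Simplified three-level nesting for arrays shaped (n_years, 12, days)."""
    # Level 1: daily values, corrected separately for each calendar month
    daily = rescale(model, obs, axis=(0, 2))
    # Level 2: monthly means of the corrected daily series, per calendar month
    monthly = daily.mean(axis=2)
    monthly_c = rescale(monthly, obs.mean(axis=2), axis=0)
    # Level 3: annual means of the corrected monthly series
    annual = monthly_c.mean(axis=1)
    annual_c = rescale(annual, obs.mean(axis=(1, 2)), axis=0)
    # Combine: weight each daily value by the monthly and annual correction ratios
    w_month = (monthly_c / monthly)[:, :, None]
    w_year = (annual_c / annual)[:, None, None]
    return daily * w_month * w_year

# Synthetic example: the model has too little interannual variability
rng = np.random.default_rng(1)
n_years = 30
obs = 20.0 + rng.normal(0, 1.0, (n_years, 1, 1)) + rng.normal(0, 3.0, (n_years, 12, 30))
model = 23.0 + rng.normal(0, 0.2, (n_years, 1, 1)) + rng.normal(0, 4.0, (n_years, 12, 30))
corrected = nested_bias_correction(model, obs)
```

After the final weighting, the annual means of `corrected` have the observed mean and standard deviation, but the daily statistics are slightly perturbed by the weights. This is the kind of artefact that motivates repeating the procedure recursively.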

### Multivariate BC approaches

BC approaches are generally applied to a single variable at a location and at a given time scale (Li *et al.* 2014; Mehrotra & Sharma 2015, 2016; Vrac & Friederichs 2015). As BC is applied to each variable independently, the physical dependencies across variables of interest are ignored (Colette *et al.* 2012; Maraun 2013). Often, bias-corrected variables are used collectively to estimate quantities such as potential evapotranspiration, or as joint inputs to impact assessment models, such as rainfall–runoff models. Similarly, variables bias-corrected in isolation at different locations might poorly reproduce observed spatial correlations and lead to an improper spatial response (Hnilica *et al.* 2017; Nahar *et al.* 2017).

Multivariate BC alternatives have been proposed to address some of these issues. A bivariate correction was proposed by Piani & Haerter (2012) to simultaneously correct temperature and precipitation. This was achieved by correcting one time series (i.e., precipitation) conditionally to the bias-corrected values of the other variable's time series (i.e., temperature). Copula-based methods have also been proposed to consider the joint dependence between variables or the spatial dependence across grids (Mao *et al.* 2015; Vrac & Friederichs 2015). Mehrotra & Sharma (2015) proposed a parametric multivariate extension of NBC, named as multivariate recursive nested bias correction (MRNBC), while a multivariate and multi-timescale extension of quantile matching-based nonparametric BC, named multivariate recursive quantile nested bias correction (MRQNBC), was suggested by Mehrotra & Sharma (2016).

Two versions of a multivariate BC were proposed by Cannon (2018). These approaches consider quantile mapping to match observed marginal distributions and Pearson correlation and Spearman rank correlations to match dependence structure. François *et al.* (2020) compared the performances of four parametric and nonparametric multivariate BC approaches and reported that depending on the dimensional configuration, the instability of some methods can possibly affect the bias-corrected simulations. In particular, for multivariate parametric alternatives, increasing the number of variables often led to a deterioration of spatial properties.

With an excessive number of variables and grid points, a parametric option, for example, MRNBC or MRQNBC might struggle with dimensionality issues as it maintains spatiotemporal dependence through statistics of matrices of variables at grid points. This is referred to as the ‘Curse of Dimensionality’ in the literature. With increasing dimensions, stability could be an issue. In such cases, the use of shuffling-based BC provides an effective means of dealing with the dimensionality. In shuffling-based approaches, a univariate BC is used to correct the distribution biases while shuffling (reordering) is used to correct the dependence biases. As both procedures are applied to individual variables, the approach is independent of the number of variables or grid points. An equivalent alternative has been recently proposed by Kusumastuti *et al.* (2023) using a continuous wavelets transformation, also highlighting the importance of not extending corrections into the far future, as the parameterization of the BC model in the current climate simulation and observed record may not remain valid under extreme change scenarios.

Vrac & Friederichs (2015) combined a univariate BC method with the shuffling technique presented by Clark *et al.* (2004) and referred to the technique as the empirical copula–BC (EC–BC) approach. A further modification to this approach was proposed by Vrac (2018) wherein a multivariate method of rank resampling for distributions and dependences was proposed. The approach was named the R2D2 BC approach. The author claimed that the approach provides some stochasticity since it allows generating as many multivariate corrected outputs as the number of grid cells times the number of climate variables (statistical dimensions) of the simulations to be corrected.
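The reordering step shared by these shuffling-based approaches can be sketched as follows. The sketch assumes that a univariate BC has already been applied to each column and that the observed template has the same number of time steps; it is not the EC–BC or R2D2 algorithm itself, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def reorder_to_template(corrected, template):
    """Reorder each column of `corrected` (n_time, n_vars) so that its rank
    sequence follows the corresponding column of `template`."""
    corrected = np.asarray(corrected, dtype=float)
    template = np.asarray(template, dtype=float)
    out = np.empty_like(corrected)
    for j in range(corrected.shape[1]):
        # Rank (0 = smallest) of the template value at each time step
        ranks = np.argsort(np.argsort(template[:, j]))
        # Place the corrected value of the same rank at that time step
        out[:, j] = np.sort(corrected[:, j])[ranks]
    return out

# Synthetic example: the template is correlated, the corrected columns are not
rng = np.random.default_rng(2)
template = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=200)
corrected = rng.normal(size=(200, 2))
shuffled = reorder_to_template(corrected, template)
```

Each column keeps exactly the same set of values (so the marginal correction is untouched), while the rank dependence between columns becomes that of the template. Since each column is processed on its own, the cost grows only linearly with the number of variables or grid points.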

While the above-referenced studies focus on correcting multivariate biases present in GCM simulations mostly, an exciting new domain is the correction of lateral and lower boundary biases in RCM simulations (Rocheta *et al.* 2017, 2020). Correction of such biases has been shown to improve the consistency with which extreme precipitation is simulated for future climates across model simulations (Kim *et al.* 2020), with arguments on why multivariate biases are critical to address to improve this further (Kim *et al.* 2021). This has since been confirmed with RCM lateral and lower boundaries simulated using MRQNBC (Kim *et al.* 2023) and shown to improve even the simulations of compound extreme events (Kim *et al.* 2023). This is a promising direction for future applications before any dynamical downscaling is undertaken.

### Correcting bias in the frequency domain

Daily or monthly QM BC has been used in many climate change impact studies (Wood *et al.* 2004; Boé *et al.* 2007; Haerter *et al.* 2011). However, where low-frequency variability and persistence attributes are important, QM exhibits limited advantage as it corrects for biases at a given time scale and not at higher aggregated time scales (Haerter *et al.* 2011; Mehrotra & Sharma 2016). One possible solution to this is the nesting of corrections at multiple pre-defined time scales (Haerter *et al.* 2011; Johnson & Sharma 2012; Mehrotra & Sharma 2016). However, biases at other time scales, not included in the nesting, may not be corrected. Another solution is to look beyond specific time scales. One such possibility is representing the time series in the frequency domain. The use of the frequency space makes the time series more suitable for modulating variability across different time scales (Bloomfield 2004; Maheswaran & Khosa 2012). A Fourier transform or spectral analysis has been widely employed in meteorology and hydroclimatology to examine seasonal, annual, and multi-annual periodicities (Leite & Peixoto 1995; Hegge & Masselink 1996; Fleming *et al.* 2002). Besides, the Fourier transform is also applied to generate surrogate time series, maintaining the linear properties present in the synthetically generated series. More details on this are found in Prichard & Theiler (1994), Chen *et al.* (2010) and Keylock (2012). Fast Fourier transform (FFT) has also been used to bias correct the variability for each frequency component (Pierce *et al.* 2015; Nguyen *et al.* 2016). Wavelet analysis is a step beyond Fourier analysis, localizing the signal in the frequency domain and providing spectral representation as a function of time. Both discrete and continuous wavelet transform have been used for a range of applications (Farge 1992; Torrence & Compo 1998; Torrence & Webster 1999; Jiang *et al.* 2020). 
Discrete wavelet transforms disaggregate the time series (signals) into multiple high- and low-frequency components (Percival & Walden 2000). Therefore, the slow-moving component of a time series, which represents the underlying low-frequency behaviour, can be extracted and modelled separately (Maheswaran & Khosa 2012; Belayneh *et al.* 2016; Sang *et al.* 2016). Additionally, the lowest-frequency component can serve to represent the underlying trend (Adarsh & Janga Reddy 2015). A DWBC alternative was proposed by Kusumastuti *et al.* (2021) to obtain the underlying trend and to remove bias in variability at varying time scales (frequencies). Discrete analysis provides information on the disaggregated time series only at dyadic frequency components. For some climate variables, the underlying information, such as the low-variability component, may lie in the non-dyadic spectrum. Keeping this in mind, Kusumastuti *et al.* (2022) proposed a continuous wavelet transform-based bias correction (CWBC). A note of caution for using BC under simulations of extreme change (for a 4×CO₂ change scenario) was provided in Kusumastuti *et al.* (2023), which showed that even while multivariate biases were properly corrected and not affected by the Curse of Dimensionality, the parameterization of the BC model becomes unstable far into the future where the change (and hence bias) is not manifested in the current climate simulations.
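As an illustration of correcting variability in the frequency domain, the FFT-based idea can be sketched as below. This is a minimal sketch under stated assumptions rather than the method of the cited studies: a per-frequency gain is taken as the ratio of raw observed to raw model amplitude spectra (in practice the spectra would usually be smoothed), all series are assumed to have the same length, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def frequency_bias_correction(obs, model_hist, model_target):
    """Scale each Fourier component of `model_target` by the ratio of observed
    to historical-model amplitude, keeping the model's phases."""
    obs = np.asarray(obs, dtype=float)
    model_hist = np.asarray(model_hist, dtype=float)
    model_target = np.asarray(model_target, dtype=float)
    amp_obs = np.abs(np.fft.rfft(obs - obs.mean()))
    amp_hist = np.abs(np.fft.rfft(model_hist - model_hist.mean()))
    gain = amp_obs / np.maximum(amp_hist, 1e-12)   # guard against division by zero
    gain[0] = 1.0   # the mean is handled separately through the shift below
    spec = np.fft.rfft(model_target - model_hist.mean())
    corrected = np.fft.irfft(spec * gain, n=model_target.size)
    return corrected + obs.mean()

# Synthetic example: the model underestimates a 64-step cycle and is too noisy
rng = np.random.default_rng(3)
n = 512
t = np.arange(n)
obs = 15.0 + 2.0 * np.sin(2 * np.pi * t / 64) + rng.normal(0, 0.5, n)
model_hist = 17.0 + 0.5 * np.sin(2 * np.pi * t / 64 + 1.0) + rng.normal(0, 1.0, n)
corrected = frequency_bias_correction(obs, model_hist, model_hist)
```

When applied to the calibration series itself, the corrected series inherits the observed amplitude at every frequency (and hence the observed variance) without any time scale having been specified in advance. A wavelet-based variant would additionally let the gain vary in time.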

## OTHER CONSIDERATIONS IN BC

Climate models employ fundamental mass and energy balance principles and therefore future changes across climate variables are derived in a physically consistent way. Unfortunately, while reducing in magnitude in each new generation of the climate model, biases continue to exist. For some applications, these biases may present significant obstacles when attempting to project into the future, particularly if the impact or application is sensitive to non-linearities in the system such as univariate or multivariate threshold effects. Statistical methods can correct model bias, though doing so always endangers the physical consistency between the corrected climate variables. This is especially the case with univariate corrections for individual climate variables but continues to impact even with more sophisticated multivariate alternatives. It is recommended to use bias-corrected data if the application only requires precipitation and/or temperature and it is sensitive to non-linearities (such as thresholds) in precipitation and/or temperature. Univariate BC can be used if the application requires variables other than precipitation and temperature but it is not sensitive (or has little sensitivity) to the coupled effect of other variables with precipitation and temperature. Univariate BC should be avoided if the application has strong sensitivities to the combined effect of temperature and/or precipitation with other climate variables such as humidity or wind (in which case physical consistency may be a dominant requirement) (Evans & Argüeso 2014).

Climate models are routinely bias corrected by comparing the modelled simulations for the current climate with the observations. Therefore, the selection of the reference time period warrants greater attention in impact studies. The bias in climate model results cannot be uniquely defined but depends on the reference period used (Maraun 2016). An associated challenge is the presence of a trend within the observed period, which, if significant (as is the case with temperature) needs careful consideration.

Precipitation and temperature (minimum and maximum) are the variables most commonly used in impact assessment applications. These variables are routinely bias corrected as they are the variables with long, reliable observational time series. However, for some studies, other variables are also needed.

Apart from direct statistics of interest, GCMs substantially misrepresent other crucial phenomena as well. For example, key processes governing low-frequency variability such as the El Niño/Southern Oscillation (ENSO), monsoonal systems, or the mid-latitude storm track are biased. Also, for precipitation, GCMs are known to have a ‘drizzling’ bias, that is, one characterized by high precipitation frequency and low intensity (Dai 2006; DeMott *et al.* 2007). The ‘drizzling’ bias impedes the realistic representation of precipitation characteristics and hydrologic extremes in the climate models (Trenberth *et al.* 2003; Trenberth 2011). We discuss here some of these issues and the solutions that have been suggested.

### Choice of reference period in BC

The World Meteorological Organization (WMO) recommends using the 30-year 1961–1990 period as the climate normal when comparing with future periods. It is recommended that this should be maintained as a reference for monitoring long-term climate variability and change and BC (Trewin 2007; WMO 2014). However, a regularly updated 30-year baseline period, for example, 1981–2010, or 1991–2020 can also be employed. The Intergovernmental Panel on Climate Change (IPCC) used the 20-year period 1986–2005 as the baseline in many assessment results in the Fifth Assessment Report (Stocker *et al.* 2015) and years 1995–2014 in its Sixth Assessment Report (IPCC 2023). Scientists usually measure rising global temperatures against the baseline of the years between about 1850 and 1900 (defined as the ‘pre-industrial period’), ‘when fossil-fuel burning had yet to change the climate’. In principle, ‘pre-industrial levels’ could refer to any period before the industrial revolution. However, the number of direct temperature measurements decreases as one goes back in time. Defining a ‘pre-industrial’ reference is therefore a compromise between the reliability of the temperature information and how well it represents pre-industrial conditions.

The choice of the baseline period (as a reference for assessing future changes for any projected variable) can play an important role in the ensuing assessment. In regional climate impact studies, well-established or arbitrarily chosen baselines are often used without questioning. Also, BC methods require sufficient observational data to characterize the reference climatology. The use of 30 years of data to include some variations at the decadal timescale (Li *et al.* 2010; Mehrotra & Sharma 2016) is common.

It has been suggested (Maraun 2016) that the bias in climate model results cannot be uniquely defined but depends on the reference period adopted. While the 1961–1990 period is frequently used as a reference, CORDEX climate simulations were bias corrected using 1989–2010 as a reference time window. Hence, the climate simulated during 1961–1990 is likely to be different from observations, although the magnitude of the differences is unknown. Only a few studies have investigated this issue in detail. For example, Zhang *et al.* (2018) found that baseline period choice is an important source of uncertainty when studying the impacts of land use and climate change on hydrology. Similarly, Kotlarski *et al.* (2019) found that the observational uncertainty (i.e., uncertainty related to the differences between reference observation-based datasets) is smaller than RCM uncertainty at a European scale, although the observational uncertainty can be larger in some regions. A possibly large effect of observational uncertainty was also corroborated by Sun *et al.* (2018), who reported a variation in annual precipitation estimates of as much as 300 mm/year among 30 climate datasets.
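The dependence of the estimated bias on the reference period can be illustrated with a minimal sketch. The series below are hypothetical (an observed series warming at 0.02 °C/year and a model warming at 0.03 °C/year), and the function name `mean_bias` is introduced here purely for illustration; it is not taken from any of the cited studies.

```python
from statistics import mean

def mean_bias(model, obs, years, start, end):
    """Mean model-minus-observation difference over [start, end] inclusive."""
    pairs = [(m, o) for y, m, o in zip(years, model, obs) if start <= y <= end]
    if not pairs:
        raise ValueError("no data in the requested reference period")
    return mean(m for m, _ in pairs) - mean(o for _, o in pairs)

# Hypothetical annual-mean temperatures (deg C), both starting at 14.0 in 1961.
years = list(range(1961, 2011))
obs = [14.0 + 0.02 * (y - 1961) for y in years]     # assumed observed warming
model = [14.0 + 0.03 * (y - 1961) for y in years]   # assumed modelled warming

bias_early = mean_bias(model, obs, years, 1961, 1990)
bias_late = mean_bias(model, obs, years, 1981, 2010)
print(round(bias_early, 3), round(bias_late, 3))
```

Because the hypothetical model warms faster than the observations, the later window yields a larger mean bias than the earlier one, even though the model and the observations are unchanged.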

### BC approaches with trend correction

Some variables, for example, temperature, show clear increasing trends in the future and the BC method applied should preserve the trend of the raw data.

It is argued that trends of raw climate series should be preserved so that the BC does not impact climate sensitivity (Hempel *et al.* 2013). For atmospheric moisture and associated variables (precipitation), preserving the relative change signal is also important for maintaining physical scaling relationships with model-projected temperature change (e.g., Clausius–Clapeyron scaling of 7% suggests an equivalent increase in atmospheric moisture or column of water vapour for each degree Celsius rise in temperature; Wasko *et al.* 2016). A trend-preserving BC (Hempel *et al.* 2013) was applied to CMIP5 representative concentration pathway (RCP) GCM experiments (Taylor *et al.* 2012). The approach was found to preserve monthly trends, but trends in the daily extremes were modified by the QM. Similarly, a detrended quantile mapping (DQM) algorithm was applied by Bürger *et al.* (2013), and results indicated that while the approach tends to maintain the modelled long-term trend in the mean, it does not guarantee that trends in precipitation extremes, as governed by the tails of the distribution, are also preserved.
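As a worked illustration of the scaling quoted above, and assuming for simplicity that the 7% rate compounds per degree of warming (the symbols $W_0$, $W$ and $\Delta T$ are introduced here for illustration only), a 3 °C rise would imply an increase in atmospheric moisture of roughly 22.5%:

$$
\frac{W(\Delta T)}{W_0} = (1 + 0.07)^{\Delta T}, \qquad \frac{W(3)}{W_0} = 1.07^{3} \approx 1.225
$$

A BC that alters the relative change signal in precipitation would break this kind of correspondence with the model-projected temperature change.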

Another variant of QM, the quantile delta change or quantile perturbation method, was used for precipitation projections (Olsson *et al.* 2009; Willems & Vrac 2011; Sunyer *et al.* 2015). The approach modifies historical observations by superimposing relative trends in quantiles from the climate model. In this case, modelled trends in all quantiles, including in the tails, are preserved. It may be noted that, as the approach adjusts the observed series, it does not explicitly bias correct the raw time series of the climate model.
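The idea of perturbing observations by the model's relative change at each quantile can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the cited studies: the function names, the rank-based plotting position, and the hypothetical wet-day amounts are all introduced here.

```python
def empirical_quantile(sorted_vals, p):
    """Linearly interpolated quantile of pre-sorted data for 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def quantile_delta_change(obs, model_hist, model_fut):
    """Scale each observation by the model's relative change at its quantile."""
    n = len(obs)
    hist_sorted, fut_sorted = sorted(model_hist), sorted(model_fut)
    order = sorted(range(n), key=lambda i: obs[i])   # ties keep input order
    out = [0.0] * n
    for rank, i in enumerate(order):
        p = rank / (n - 1) if n > 1 else 0.5         # assumed plotting position
        q_hist = empirical_quantile(hist_sorted, p)
        q_fut = empirical_quantile(fut_sorted, p)
        ratio = q_fut / q_hist if q_hist > 0 else 1.0  # no change if undefined
        out[i] = obs[i] * ratio
    return out

# Hypothetical wet-day amounts (mm); the model intensifies only its upper quantiles.
obs = [2.0, 10.0, 4.0, 30.0, 6.0]
model_hist = [1.0, 2.0, 3.0, 4.0, 5.0]
model_fut = [1.0, 2.0, 3.3, 4.8, 7.5]
projected = quantile_delta_change(obs, model_hist, model_fut)
print([round(v, 2) for v in projected])
```

The output is a perturbed observed series, which reflects the point made above: the raw model time series itself is never corrected.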

A detrended QM was used by Hempel *et al.* (2013) and Bürger *et al.* (2013) to preserve monthly trends. A modification to the basic QM was presented by Cannon *et al.* (2015), termed DQM. Relative to QM, DQM incorporates additional but limited information about the climate model simulation corresponding to the projected period, in this case in the form of the projected mean. Depending on the degree of extrapolation still required after detrending (and the extrapolation algorithm used), the climate change signal from DQM will tend to match that of the underlying climate model. This applies to the mean but not necessarily to all quantiles, such as those representing the tails of the distribution that define climate extremes.

Similarly, Cannon *et al.* (2015) modified the QM method and outlined an approach called quantile delta mapping (QDM), which they claimed is not constrained by the stationarity assumption. They also showed that QDM is not very different from equidistant-quantile matching (EQM) if BC is applied additively. Switanek *et al.* (2017) proposed a nonstationary QM correction and named it ‘Scaled Distribution Mapping’, which is similar to QDM but explicitly accounts for the number of rainy days and wet spells.
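A minimal sketch of the additive form of this idea is given below, assuming empirical quantiles and a simple rank-based plotting position. The function names and the hypothetical temperature values are introduced here for illustration and are not taken from Cannon *et al.* (2015). Each future model value keeps its own quantile; the model-projected change at that quantile is added to the corresponding observed quantile.

```python
def empirical_quantile(sorted_vals, p):
    """Linearly interpolated quantile of pre-sorted data for 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def qdm_additive(obs, model_hist, model_fut):
    """Observed quantile plus the model-projected change at the same quantile."""
    n = len(model_fut)
    obs_sorted, hist_sorted = sorted(obs), sorted(model_hist)
    order = sorted(range(n), key=lambda i: model_fut[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        p = rank / (n - 1) if n > 1 else 0.5   # quantile within the future period
        delta = model_fut[i] - empirical_quantile(hist_sorted, p)
        out[i] = empirical_quantile(obs_sorted, p) + delta
    return out

# Hypothetical temperatures (deg C): the model is 2 deg C too warm and
# projects a further 3 deg C of warming at every quantile.
obs = [10.0, 12.0, 14.0, 16.0, 18.0]
model_hist = [12.0, 14.0, 16.0, 18.0, 20.0]
model_fut = [15.0, 19.0, 17.0, 23.0, 21.0]
corrected = qdm_additive(obs, model_hist, model_fut)
print(corrected)
```

In this example the corrected series sits 3 °C above the observations at every quantile, so the model-projected change is retained while the 2 °C bias is removed.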

Recent applications of time-frequency-based BC approaches, DWBC and CWBC, have shown that these approaches maintain the underlying trend of the time series while correcting for the biases in mean and standard deviation across the spectrum. The delta function introduced in the approach maintains the continuity of the trend from current to future climate (Kusumastuti *et al.* 2021, 2022).

### Retaining model-predicted mean changes in the bias-corrected time series

Previous studies have noted that traditional QM modifies the mean of climate change signals (Hagemann *et al.* 2011; Pierce *et al.* 2013; Maurer & Pierce 2014). This can introduce inconsistencies in the BC results. QM is also known to alter the variability and trend at all time scales (Pierce *et al.* 2013; Maurer & Pierce 2014). If a climate variable shows too much variability, QM reduces it and if the variable has too little variability, QM increases it.
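The effect on the mean change signal can be reproduced with a small sketch of empirical QM. The values are hypothetical, the function name `quantile_map` is introduced here, and the constant-correction extrapolation outside the calibration range is an assumption; other extrapolation choices would give different numbers.

```python
from bisect import bisect_right
from statistics import mean

def quantile_map(x, model_hist_sorted, obs_sorted):
    """Map x through the historical model-to-observation quantile relation."""
    h, o = model_hist_sorted, obs_sorted      # equal lengths assumed
    if x <= h[0]:
        return x + (o[0] - h[0])              # constant correction below range
    if x >= h[-1]:
        return x + (o[-1] - h[-1])            # constant correction above range
    j = bisect_right(h, x) - 1                # h[j] <= x < h[j + 1]
    w = (x - h[j]) / (h[j + 1] - h[j])
    return o[j] + w * (o[j + 1] - o[j])

# Hypothetical values: the model matches the observed mean (10) but has
# half the observed spread; the raw future series is the historical one + 2.
obs = [0.0, 5.0, 10.0, 15.0, 20.0]
model_hist = [6.0, 8.0, 10.0, 12.0, 14.0]
model_fut = [8.0, 10.0, 12.0, 14.0, 16.0]

h, o = sorted(model_hist), sorted(obs)
hist_bc = [quantile_map(x, h, o) for x in model_hist]
fut_bc = [quantile_map(x, h, o) for x in model_fut]

raw_change = mean(model_fut) - mean(model_hist)
bc_change = mean(fut_bc) - mean(hist_bc)
print(raw_change, round(bc_change, 2))
```

Here the raw model projects a mean change of 2.0, whereas the quantile-mapped series shows a change of about 4.4, because QM inflates the under-dispersed model variability and the change signal along with it.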

The model-predicted mean changes in the bias-corrected series can be retained by further adjusting the QM bias-corrected results. The approaches differ in what they assume is carried into the future. QM assumes that the error found for a given model value in the current climate applies to the same value in the future climate. EQM assumes that the magnitude of error in a current climate model value at a given quantile is preserved in the future. CDF-t similarly assumes that the magnitude of error in the current climate model quantile at a given quantile is preserved in the future. Thus, one needs to consider which of these assumptions about how the current climate error carries into the future is appropriate. EQM preserves model-predicted mean changes for variables where the correction is applied additively. However, the approach requires modification for variables where the correction is applied multiplicatively. Pierce *et al.* (2015) proposed an extension to the EQM approach to retain model-predicted mean changes. They suggested multiplying the EQM-corrected values by a correction factor defined in terms of the change in the value of the variable (*y*, calculated as a ratio) and the change in the mean value following BC, where each mean is taken over a time period, for example, a month.

One way in which extrapolation can be avoided to some extent and the trend in the raw GCM data can also be maintained is to first detrend the series, apply BC, and reimpose the trend afterward. The trend removal is likely to shift the future climate distribution closer to that of the current climate. Adding the trend afterward will retain the trend in the time series.
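The detrend, correct, and reimpose sequence can be sketched as below. As a simplifying assumption, the trend is represented by the shift in mean between the historical and future model periods, and BC is an empirical QM with constant-correction extrapolation; the function names and values are hypothetical and introduced only for illustration.

```python
from bisect import bisect_right
from statistics import mean

def quantile_map(x, model_hist_sorted, obs_sorted):
    """Map x through the historical model-to-observation quantile relation."""
    h, o = model_hist_sorted, obs_sorted      # equal lengths assumed
    if x <= h[0]:
        return x + (o[0] - h[0])              # constant correction below range
    if x >= h[-1]:
        return x + (o[-1] - h[-1])            # constant correction above range
    j = bisect_right(h, x) - 1                # h[j] <= x < h[j + 1]
    w = (x - h[j]) / (h[j + 1] - h[j])
    return o[j] + w * (o[j + 1] - o[j])

def detrended_qm(model_fut, model_hist, obs):
    """Remove the mean shift, apply QM, then add the shift back."""
    h, o = sorted(model_hist), sorted(obs)
    shift = mean(model_fut) - mean(model_hist)   # stand-in for the trend
    corrected = [quantile_map(x - shift, h, o) for x in model_fut]
    return [c + shift for c in corrected]        # reimpose the trend

# Hypothetical values: the model has half the observed spread and the
# raw future series is the historical one shifted by +2.
obs = [0.0, 5.0, 10.0, 15.0, 20.0]
model_hist = [6.0, 8.0, 10.0, 12.0, 14.0]
model_fut = [8.0, 10.0, 12.0, 14.0, 16.0]

fut_bc = detrended_qm(model_fut, model_hist, obs)
print(fut_bc, mean(fut_bc) - mean(obs))
```

After detrending, the future values fall inside the range of the historical model values, so no extrapolation beyond the calibration range is needed, and the corrected series retains the raw mean change of 2.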

### Adjustment of the number of wet days in the precipitation time series

If the number of wet days in a raw climate model precipitation simulation does not match the observed number, the values need to be adjusted, as the BC method might otherwise not be able to correct precipitation intensity appropriately. There are two cases. In the first, the raw data have too many wet days compared to the observations, a common case with the majority of GCMs. In the second, the raw data have fewer wet days than the observations.

When the raw model simulation has too many wet days, a simple approach is to define a threshold (TH), such that the frequency of days with model precipitation > TH is the same as the frequency of rainy days in the observed dataset (Ines & Hansen 2006; Schmidli *et al.* 2006; Lavaysse *et al.* 2012; Nguyen *et al.* 2018; Mehrotra & Sharma 2019). Before the application of BC, model precipitation values below TH are set to zero. As an alternative, it is also possible to apply a BC model directly on the complete time series, irrespective of dry or wet days (Piani *et al.* 2010; Vrac *et al.* 2012; Mehrotra & Sharma 2016). When QM is applied on the full time series and the model has an excessive number of wet days, some days will be assigned negative values in order to match the number of zeros; setting these to zero results in an appropriate number of wet days.
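The threshold approach for an excess of wet days can be sketched as follows. This is a minimal illustration: the function names, the 0.1 mm definition of an observed wet day, and the ten-day hypothetical series are assumptions introduced here, and ties at the threshold are not treated specially.

```python
def wet_day_threshold(model_pr, obs_pr, obs_wet_min=0.1):
    """Threshold TH so that the count of model days > TH matches observations."""
    n_obs_wet = sum(1 for v in obs_pr if v >= obs_wet_min)
    # Target number of wet days, scaled to the length of the model series.
    n_wet_target = round(n_obs_wet * len(model_pr) / len(obs_pr))
    ranked = sorted(model_pr, reverse=True)
    if n_wet_target >= len(ranked):
        return 0.0                     # model cannot be made any wetter this way
    # Days strictly above TH stay wet, so TH is the next-largest value.
    return ranked[n_wet_target]

def apply_threshold(model_pr, th):
    """Set model precipitation at or below TH to zero."""
    return [v if v > th else 0.0 for v in model_pr]

# Hypothetical daily precipitation (mm): 3 observed wet days, 8 modelled.
obs_pr = [0.0, 0.0, 5.2, 0.0, 12.0, 0.0, 0.0, 3.1, 0.0, 0.0]
model_pr = [0.2, 0.0, 4.0, 0.3, 9.5, 0.1, 0.4, 2.2, 0.0, 0.6]

th = wet_day_threshold(model_pr, obs_pr)
adjusted = apply_threshold(model_pr, th)
print(th, adjusted)
```

In this sketch the threshold is calibrated on the current climate series; whether the same TH or the same wet-day proportion is then carried to the future series is a separate choice made in the studies cited above.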

When the raw model simulation has fewer wet days than the observations, the threshold approach cannot be applied. Mehrotra & Sharma (2019) suggested adding more rainy days by picking the dry days at the start or end of dry spells and adding a minimum rainfall threshold. Adding the threshold converts the selected dry days into wet days and achieves the correct number of wet days. As with the case of an excessive number of wet days, QM can then adjust precipitation values and match the number of wet days. This adjustment procedure is applied by comparing the current climate data with the observations. Future climate data are adjusted in the same proportion.

### Climate variables used in BC

Precipitation and temperature are the two primary climate variables frequently considered in the BC approaches. The majority of studies have considered monthly or daily precipitation (Haddad & Rosenfeld 1997; Wood *et al.* 2004; Boé *et al.* 2007; Déqué 2007; Graham *et al.* 2007; Moore *et al.* 2008; Olsson *et al.* 2009; Piani *et al.* 2010; Willems & Vrac 2011; Gudmundsson *et al.* 2012; Lafon *et al.* 2013; Cannon *et al.* 2015). Precipitation is more difficult to bias correct than temperature because it is intermittent in nature (rainy days followed by days with no rainfall), has a skewed distribution with a lower bound of zero, and often behaves differently at aggregated spatial and temporal scales.

In addition to precipitation and temperature, other climate variables and phenomena including wind, humidity, streamflow, evaporation, bushfire, surface pressure, geopotential height, mean sea level pressure, and air temperature have also been bias-corrected and used in climate change impact studies (Bürger 1996; Matyasovszky & Bogardi 1996; Enke & Spekat 1997; Mehrotra *et al.* 2004; Mehrotra & Sharma 2010; Iizumi *et al.* 2017; Dowdy *et al.* 2019).

Please note that statistical and dynamical downscaling techniques are routinely used to enhance the regional information provided by GCMs by combining large-scale climatic information with small-scale behaviour and dynamics. These large-scale variables often require some kind of BC before their use in the downscaling models.

## LIMITATIONS, ASSUMPTIONS AND FUTURE OF BC APPROACHES

There is no perfect BC approach. All existing BC approaches have limitations, both in their methodology and in their application. These limitations and assumptions are a source of uncertainty in bias-corrected RCM rainfall and in the hydrological impact assessments and evaluations that rely on it (Teutschbein & Seibert 2012; Maraun & Widmann 2018; Potter *et al.* 2018). It is important to note that no BC approach can correct for errors inherited from the RCMs or GCMs in temporal sequencing, seasonality, or biases in large-scale circulation patterns that could lead to unrealistic and non-physical climate projections (Stocker *et al.* 2015).

In general, a BC is subject to the following limitations:

BC methods assume that the magnitude and nature of BC are stationary and will not change into the future. This assumption, however, has been questioned by researchers from time to time (Ehret *et al.* 2012; Maraun 2016). Recent studies (e.g., Nahar *et al.* 2017; Hui *et al.* 2019) have reported nonstationary biases in the climate model precipitation. However, realizing that using raw model simulations will perhaps be even more questionable, many studies choose to bias correct the raw climate data before use. Some modifications in the BC approaches trying to account for the non-stationarity in the biases have been proposed (Cannon *et al.* 2015; Switanek *et al.* 2017).

BC is a purely statistical alteration to the raw model simulation. It cannot discriminate between the physical processes determining trends associated with climate change, or with the natural climatic variability internal to the model.

BC methods require sufficient observational data to characterize the reference climatology. Usually, 30 years of data is considered as a minimum requirement, mostly to include low-frequency variability or variations across years at the decadal time scales (Mehrotra & Sharma 2016, 2019).

The quality of the observed data affects the quality of bias-corrected data and how well the climate model is able to represent the relevant physical processes that govern the variable of interest.

The physical consistency of the different climate variables may not be maintained if they are bias-corrected independently. For example, bias-correcting temperature may result in sub-zero values, whereas rainfall does not convert to snowfall in simulations. Negative values for diurnal temperature range may be generated from daily minimal and maximal temperatures. This will be important in some applications but not others. Similarly, it may rain when the temperature is above 35°, the occurrence of which has been documented as rare (Roderick *et al.* 2019, 2020).

BC is not recommended if the biases are large (Murphy *et al.* 2018). Also, the GCMs or ensembles that show large biases in the analysis should be dropped (Fung 2018). Interpretation of any bias-corrected data needs to take into account the assumptions stated above.

Simple daily rainfall BC approaches offer limited ability to correct for biases in the wet and dry spells (continuous sequence of wet or dry days) and wet spell totals (Chen *et al.* 2013; Addor & Seibert 2014; Evans *et al.* 2017).

When developing a BC method to adjust the simulations, the different sources of uncertainties contributing to the biases are not differentiated.

BC is a reasonable choice that allows using the output of existing climate models and conducting analyses to understand the climate change impacts. However, the assumption of stationarity requires careful consideration as there is evidence that a warming world can change the climate response in the future, for example, extreme precipitation, temperature, and evapotranspiration (Maraun 2016; Nahar *et al.* 2017; Hui *et al.* 2019). Given that the climate continues to change, traditional BC methods may become less effective, and new approaches will need to be developed to account for these changes. The use of innovative techniques, such as deep learning, may also play an increasingly important role in the future of BC. These methods have shown promise in other fields and may prove useful in the BC field as well. Overall, the future of BC approaches for climate variables is likely to involve the development of more advanced statistical and machine learning techniques that can address the many challenges associated with working with climate data.

## CONCLUSIONS

Outputs of GCMs and RCMs are routinely used as inputs to impact assessment models. Despite the modelling and scientific developments, climate simulations are often biased when compared to observations, leaving modellers with questions as to how they can rely on projections issued into the future. Using raw outputs can significantly affect the results of impact studies, necessitating the use and development of a range of alternatives that correct systematic climate model biases.

The usefulness of climate models improves, often significantly, if systematic bias can be removed. To do this, however, the bias needs to be quantified, and appropriate corrections need to be devised. These corrections may be simple re-parameterizations of the model or a re-specification of the model structure. Identifying, characterizing, and removing such bias, however, become especially tricky if the model is complex, and its inputs (as well as observed responses used for assessment) uncertain. Such is the scenario that characterizes the biases inherent in climate model simulations of the future.

BC is routinely used to correct the statistical biases of the raw simulations of climate models before using them as input in climate change assessment studies. It aims to adjust selected statistics of a climate model simulation to better match observed statistics over the current climate reference period. Depending upon the complexity, it may adjust marginal statistics only, or also multi-time scale, multi-variable, and multi-locational aspects. A fundamental assumption of BC is that the chosen climate model produces meaningful output, including a credible representation of climate change with consistent biases. BC cannot fix the fundamental problems of a climate model.

If the main interest of a study is to assess the relative change in a future climate, bias is not a major problem, and BC can be omitted. BC is important for studies involving threshold analysis (crop behaviour at certain temperatures, rainfall and temperature extremes, heat waves, etc.), for water availability analysis, where any bias in rainfall can have a large impact on the outcome, and for the estimation of extremes, where bias can distort the projected changes in flood risk.

There are trade-offs between bias-corrected climate data and directly modelled climate data used in climate change impact assessment studies. These trade-offs should be examined when designing impact assessment frameworks for specific applications; for example, it makes more sense to use bias-corrected data to assess the climate change impacts on reservoir storage and operation than to use it to investigate the impacts on emissions and air quality (Liu *et al.* 2014). Do not bias-correct if the biases are large; for example, the Met Office Hadley Centre Model (HadGEM3-GC3.05) subset of the global projections shows significant bias in winter surface air temperature in large parts of the Northern Hemisphere, although not over the UK (see Section 3.4 of the Land Science Report; Murphy *et al.* 2018). Members that show such large biases should be disregarded in the analysis.

The results discussed in the paper are based on the output of CMIP5 models, although the BC approach is independent of model type, group, or phase. The CMIP6 modelling project is based on an improved understanding of the Earth's climate system. The improvements in representations of processes such as clouds, aerosols, and biogeochemistry, spatial resolution, and data sharing in CMIP6 are expected to lead to more accurate and reliable projections of future climate change, as well as a better understanding of the underlying processes driving these changes, leading to reduced biases.

Currently used BC methods largely neglect feedback mechanisms, and it is unclear whether they are time-invariant under climate change conditions. Applying BC increases agreement of climate model output with observations in the current climate and hence narrows the uncertainty range of simulations and predictions without, however, providing a satisfactory physical justification. In any case, it should be acknowledged that a successful BC relies on a sound understanding not only of the statistical model but also of the relevant climatic processes and their representation in the considered climate model.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Environmental Research Letters* **18**(5), 1–7.

*IPCC, 2022: Summary for Policymakers*. Available at: https://www.ipcc.ch/report/ar6/wg2/downloads/report/IPCC_AR6_WGII_SummaryForPolicymakers.pdf