Abstract
The hydrological behaviour of climate change, in terms of downscaling monthly precipitation, is investigated in this paper by formulating an inclusive multiple modelling (IMM) strategy. IMM strategies manage multiple models at two levels; the paper uses a statistical downscaling model, Sugeno fuzzy logic and a support vector machine at Level 1 and feeds their outputs to a neuro-fuzzy model at Level 2. In the downscaling stage, large-scale NCEP (National Centers for Environmental Prediction)/NCAR (National Center for Atmospheric Research) data are used for a station with a local data record from 1961 to 2005 for training and testing the Level 1 models. The results are found to be ‘fit-for-purpose’, but the variations between them signal some room for improvement. The model at Level 2 combines the outputs of those at Level 1 and produces Level 2 results, which improve on those of the Level 1 models in terms of the dispersion of residual errors. In this way, IMM provides a more defensible modelling strategy for application in the projection stage. The comparison between observed and projected precipitation indicates that precipitation is likely to decrease relative to observed precipitation in the cold season (October–February) but is likely to increase slightly in the wet season (April and May).
HIGHLIGHTS
Inclusive multiple modelling (IMM) is formulated for the downscaling of precipitation.
IMM manages multiple regression and artificial intelligence models at two levels.
Performances of IMM are improved over Level 1 models in the downscaling stage.
IMM provides a more defensible strategy for application in the projection stage.
INTRODUCTION
Enhancing the accuracy of models for the statistical downscaling of precipitation is investigated in this paper using inclusive multiple modelling (IMM) practices, where statistical downscaling refers to transforming large-scale predictor variables into local climate variables, referred to as predictand(s) (e.g., precipitation), by a statistical tool. There is no single way of modelling a predictand variable, as the possibilities are large in a pluralistic modelling environment, and the paper tests the application of IMM practices. Khatibi et al. (2020) present these practices as a way of maximising the extracted information, or enhancing the accuracy of local models, through a selection of a limited number of models without exhaustive testing of all available models. IMM is outlined further below and is implemented through formulating strategies to enhance the correlation between large-scale predictors and local precipitation (the predictand) at two levels: at Level 1, a number of generally available models are constructed using the predictors and the predictand(s); and at Level 2, yet another model is constructed, which reuses the predictions of the Level 1 models as its inputs.
General circulation models (GCMs) provide simulations of large-scale weather variables in the future for climate studies. The impact of climate change on water resources is investigated in terms of climate variables on a local scale by downscaling climate variables from large to local scales. The following downscaling approaches are widely used: (i) dynamical or physical downscaling and (ii) statistical or empirical downscaling. Dynamical downscaling extracts large-scale information from GCMs at low resolution and maps it onto higher-resolution models known as limited area models (LAMs) or regional climate models (RCMs) (e.g., Ishida et al. 2020). Extracting local information using LAMs or RCMs requires complex design and computational time, which limits their use (Anandhi et al. 2008). Statistical downscaling seeks an empirical relationship between the predictors (large-scale climate variables) and the predictand (local climate variables) in the historical period. Although statistical methods are simple and widely used, their main drawback is the assumption that the relationship derived between predictors and predictand in the historical period can be extrapolated to future periods.
Statistical downscaling models are categorised into (i) statistical or regression models and (ii) artificial intelligence (AI)-based models. Statistical models include the statistical downscaling model (SDSM) (Wilby et al. 2002) and the Long Ashton Research Station Weather Generator (LARS-WG) (Semenov & Barrow 1997), which use multiple linear regression (MLR) and are widely employed to downscale daily and monthly precipitation and temperature (e.g., Chen et al. 2010; Pervez & Henebry 2014; Gulacha & Mulungu 2017). Overall, they seek a complex relationship between predictors and the predictand for precipitation downscaling. Notably, SDSM performances at monthly scales are reportedly better than at daily scales, as extreme events lie beyond the range of regression models (Pervez & Henebry 2014), and hence a further focus is given below to both statistical and AI models.
Previous studies in statistical downscaling have addressed various aspects of the problem, which include (i) using statistical and AI models to identify the relation between predictors and the predictand(s) for a local area (e.g., Chen et al. 2010; Li et al. 2020); (ii) using different screening techniques to select predominant predictors among a large number of predictors (e.g., Nourani et al. 2019); (iii) evaluating different GCMs or RCMs for statistical downscaling (e.g., Singh et al. 2019; Li et al. 2020) and (iv) combining the results of different models (e.g., Su et al. 2019). Table 1 presents an outline in terms of their variations, their downscaling models and predictor screening techniques.
| Researcher(s) | Study area | Climate model | Statistical model(s) | AI model(s) | Predictor screening technique | Time scale | Calibration period | Validation period | Downscaled Prcp. | Downscaled Temp. | Prcp. transformation | Temp. transformation | Scenario |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chen et al. (2010) | Shih-Men Reservoir basin (Taiwan) | Had-CM3^a | SDSM^b, discriminant analysis, multiple regression | SVM^c, SVC^d, SVR^e | Kolmogorov–Smirnov test | Daily | 1964–1990 | 1991–2000 | ✓C | − | Norm. | − | A2, B2 |
| Hashemi et al. (2011) | Clutha Watershed (New Zealand) | Had-CM3 | SDSM | GEP^f | GEP | Daily | 1961–1990 | 1991–2000 | ✓ | − | − | − | − |
| Pervez & Henebry (2014) | Two river basins (South Asia) | CGCM3.1^g | SDSM | − | CC | Daily, monthly | 1988–1997 | 1998–2003 | ✓C | − | Fourth root | − | A2, A1B |
| Gulacha & Mulungu (2017) | Wami-Ruvu River Basin (Tanzania) | Had-CM3 | SDSM | − | Partial correlation, significance value | Monthly | 1961–1975 | 1976–1990 | ✓C | MaxT, MinT ✓ UC | − | − | A2, B2 |
| Nourani et al. (2019) | Tabriz station (Iran) | Can-ESM2^h, BNU-ESM^i, CGCM3 | MLR^j | ANN^k | Decision tree, mutual information, CC | Monthly | 75% | 25% | ✓ | MeanT ✓ | Std. | − | RCP4.5, RCP8.5, A1B, B1 |
| Sachindra et al. (2018) | Victoria State (Australia) | NOAA^l | − | GEP, ANN, SVM, RVM^m | CC | Monthly | 1950–1991 | 1992–2014 | ✓ | − | Std. | − | − |
| Su et al. (2019) | Heihe River basin (China) | NOAA | SRM^n, BMA^o | − | PCA | Monthly | 1971–2012 | − | ✓ | − | Std. | − | − |
| Singh et al. (2019) | India | GCMs including Can-ESM2; RCMs including CNRM-CM5^p | Linear regression | − | − | Daily | 1951–2000 | − | ✓ | − | − | − | RCP4.5, RCP8.5 |
| Ahmed et al. (2020) | Pakistan | 15 GCMs including HadGEM2^r (reported as the best) | − | ANN, KNN^s, SVM, RVM | − | Monthly | 1961–1992 | 1993–2005 | ✓ | MaxT, MinT ✓ | − | − | − |
| Li et al. (2020) | Ontario (Canada) | GCMs including Can-ESM2, EC^t-EARTH; RCMs including CanRCM4^u, CRCM5^u | Arithmetic mean, MLR | LSTM, SVM | − | Daily | 1980–1986 | 1987–1989 | − | ✓ | − | − | − |
^a Hadley Centre Coupled Model, Version 3.
^b Statistical downscaling model.
^c Support vector machine.
^d Support vector classification.
^e Support vector regression.
^f Gene expression programming.
^g Third-generation coupled global climate model.
^h Canadian Centre for Climate Modelling and Analysis Earth System Model.
^i Beijing Normal University Earth System Model.
^j Multiple linear regression.
^k Artificial neural network.
^l National Oceanic and Atmospheric Administration.
^m Relevance vector machine.
^n Stepwise regression model.
^o Bayesian model averaging.
^p Centre National de Recherches Météorologiques Model 5.
^q Earth system model, low resolution.
^r Hadley Centre Global Environment Model version 2.
^s K-nearest neighbour.
^t European Community.
^u Canadian regional climate models.
Research activities in the first group include Chen et al. (2010), who use SDSM, multiple regression and support vector machine (SVM) for daily precipitation downscaling. They used discriminant analysis and support vector classification to classify days as wet or dry, and utilised multiple regression and SVM to estimate precipitation on wet days. Their results show that SVM produces more reasonable results, although SDSM performs better when daily precipitation amounts are small. Hashemi et al. (2011) compared SDSM with gene expression programming (GEP) for downscaling daily precipitation; they observe that GEP offers a slight improvement and some advantages, including automatic screening of the predictors. Employing artificial neural networks (ANNs), GEP, SVM and relevance vector machine (RVM) to downscale monthly precipitation, Sachindra et al. (2018) observe that AI models perform better in estimating mean precipitation but underestimate the maximum and standard deviation of the precipitation series. They observe that SVM and RVM score the highest performance metrics but recommend RVM or ANN over SVM or GEP for flood prediction studies and RVM for drought studies.
Activities in the second group include screening techniques to select predominant predictors. Hammami et al. (2012) applied a penalised regression method, the least absolute shrinkage and selection operator (LASSO), to identify the dominant predictors, and the results indicated that LASSO performed better than the classic approach. Using ANN to downscale monthly precipitation, Nourani et al. (2019) investigate three screening techniques to select dominant large-scale predictors: decision tree, mutual information and correlation coefficient (CC). They report that the trained ANN performs better when predictors are selected by the decision tree. Teegavarapu & Goly (2018) investigate a stepwise regression model (SRM), mixed-integer nonlinear programming (MINLP) and ANN to select dominant predictors and report that MINLP and ANN improve downscaling performances, but selecting predominant predictors is site-specific. Jafarzadeh et al. (2021) identify dominant predictors by four different algorithms and observe that, among them, the Bayesian theorem and stepwise regression performed better than the others.
The third group of activities evaluates or ranks different GCMs or RCMs. Ahmed et al. (2020) report on downscaling monthly precipitation and maximum and minimum temperatures using ANN, K-nearest neighbour (KNN), SVM and RVM. They rank 15 GCMs, report HadGEM2-AO as the most skilled model and conclude that KNN and RVM exhibit higher performance than SVM and ANN. Employing linear regression for downscaling precipitation from predictors belonging to various GCMs and RCMs, Singh et al. (2019) report that their results reveal underestimation by the GCMs and overestimation by the RCMs, but show the RCMs to perform closer to observed precipitation.
The fourth group of activities addresses topical research on combining different results of statistical downscaling. Using Bayesian model averaging (BMA) to combine different regression models, Su et al. (2019) incorporate different predictors and compare the results with an SRM. They conclude that BMA performs better than SRM because using too many predictors in a single regression model decreases its predictive power, a limitation that BMA overcomes. Downscaling daily temperature using statistical and AI models, Li et al. (2020) report on statistical models comprising the arithmetic mean and MLR, which combine the predictors with equal and unequal weights, respectively; their AI models comprise long short-term memory (LSTM) and SVM. Their results show that both statistical and AI models perform well but are unable to estimate extremes accurately. This group of models may be regarded as IMM practice, but this paper develops an IMM strategy for a deeper understanding of the problem.
One of the main gaps in models for the statistical downscaling of precipitation is the absence of explicit techniques to extract full information from the local data, hence this paper. The application of IMM practices to these models at the monthly time scale at a single station is tested for a case study in a region on the southern coast of the Caspian Sea, in the north of Iran. Notably, limiting the study area to one station makes it possible to study the IMM capability from an elementary basis without undue complications. This simplification is also found in the literature (e.g., Chen et al. 2010; Nourani et al. 2019), albeit for other reasons. IMM combines the results of statistical and AI models at Level 1 through a combiner AI model at Level 2. The novelty of the paper lies in the application of IMM to the statistical downscaling of precipitation, and the developed models project precipitation into the future, as detailed next.
METHODOLOGY
Critical view of conventional approaches
The various aspects of past research on statistical or AI-based downscaling models are outlined above, based on which the paper seeks a deeper understanding of IMM practices. This is against a background where most modelling research activities are largely focussed on ranking, with the main goal of selecting the better-performing model, often referred to as the ‘superior’ model. Khatibi et al. (2020) and Khatibi & Nadiri (2020) refer to these types of modelling practices as exclusionary multiple models (EMMs), and hence such models are better described as ‘fit-for-purpose’ but not superior. One reason is that EMM practices do not take full benefit of the multiple models they construct: the culture of ranking transforms models into ‘the end’, whereas in reality models ought to be regarded as a ‘means to an end’, serving as a learning tool as stipulated by IMM practices.
The paper takes on board IMM practices, which are driven by formulating modelling strategies that seek ‘defensibility’ of the results by ensuring that the information extracted from site-specific data is enhanced to demonstrable levels. Nadiri et al. (2019) explain that there is already an established mathematical basis for the mean of multiple models to have a root-mean-square error (RMSE) less than the average of the RMSEs of the individual models. Thus, simple averaging of multiple models at Level 1 can be regarded as a set of results at Level 2, suitable for benchmarking the results of a more sophisticated ‘combiner model’ at Level 2. Moreover, past results published by the authors show that the quality of the results at Level 2 is also improved in various ways, and these collectively add to the ‘defensibility’ of the results; by contrast, the Level 1 results remain ‘fit-for-purpose’ because they do not use the full potential of the constructed multiple models.
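For illustration, the following minimal sketch (in Python, with hypothetical precipitation values rather than the study's data) compares the RMSE of a simple average of three Level 1 predictions with the average of their individual RMSEs, the benchmarking idea described above.

```python
import numpy as np

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2)))

# Hypothetical observed monthly precipitation and three Level 1 predictions (mm)
obs = np.array([120.0, 85.0, 40.0, 10.0, 5.0, 60.0])
level1_preds = {
    "SDSM": np.array([135.0, 70.0, 55.0, 18.0, 2.0, 48.0]),
    "SFL":  np.array([110.0, 95.0, 30.0,  6.0, 9.0, 71.0]),
    "SVM":  np.array([125.0, 80.0, 45.0, 12.0, 4.0, 57.0]),
}

individual_rmse = {name: rmse(p, obs) for name, p in level1_preds.items()}
ensemble_mean = np.mean(list(level1_preds.values()), axis=0)

print("Individual RMSEs:", individual_rmse)
print("Average of individual RMSEs:", np.mean(list(individual_rmse.values())))
print("RMSE of the simple average (Level 2 benchmark):", rmse(ensemble_mean, obs))
```

The simple average serves only as a benchmark; the combiner model at Level 2 is expected to improve on it.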
The authors build up evidence for the scope of IMM strategies by testing them in different fields of water and environmental modelling. Past applications include both explicit evidence on IMM (e.g., Karimi et al. 2020; Khatibi & Nadiri 2020; Khatibi et al. 2020; Nadiri et al. 2020) and tacit evidence on IMM (e.g., Khatibi et al. 2020; Ghorbani et al. 2018; Nadiri et al. 2018; Sadeghfam et al. 2019; Moazamnia et al. 2020). These also reflect on the IMM evolution, which goes back to the 1960s, as reviewed by Clemen (1989), although an exhaustive review is beyond the scope of the paper.
An IMM modelling strategy is formulated in the paper, which aims to benefit from multiple models at Level 1 (referred to as base models), while at Level 2 a combiner model reuses the base models. Notably, any statistical or AI-based models can be selected and organised as base models and a combiner model; see the illustration in Figure 1. The figure depicts the activities by level as follows: at Level 0, the available data are reviewed and decisions are made on the model structure and modelling strategies; at Level 1, the base models comprise one statistical and two AI-based models (SDSM, Sugeno fuzzy logic (SFL) and SVM); and at Level 2, neuro-fuzzy (NF) is selected as the combiner model. These models are specified in this section only to the extent needed for third parties to reproduce the results reported in the paper. All the required procedures were implemented on the MATLAB platform.
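The data flow of the two-level strategy can be sketched as follows. The scikit-learn regressors below are only stand-ins for SDSM, SFL and the NF combiner (which the study implements in MATLAB); the sketch illustrates the IMM structure, not the authors' actual models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

def fit_imm(X_train, y_train):
    """Two-level IMM skeleton: Level 1 base models, Level 2 combiner."""
    base_models = {
        "regression": LinearRegression(),                 # stand-in for SDSM
        "svm": SVR(kernel="rbf", C=10.0),                 # SVR as the SVM base model
        "fuzzy": MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000,
                              random_state=0),            # stand-in for SFL
    }
    for model in base_models.values():
        model.fit(X_train, y_train)

    # Level 2: reuse the Level 1 predictions as inputs to the combiner
    level1_out = np.column_stack([m.predict(X_train) for m in base_models.values()])
    combiner = MLPRegressor(hidden_layer_sizes=(6,), max_iter=5000,
                            random_state=0)               # stand-in for the NF combiner
    combiner.fit(level1_out, y_train)
    return base_models, combiner

def predict_imm(base_models, combiner, X):
    """Predict with the Level 1 models and pass their outputs to Level 2."""
    level1_out = np.column_stack([m.predict(X) for m in base_models.values()])
    return combiner.predict(level1_out)
```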
Level 0: pre-processing – selecting dominant predictors
Downscaling involves a selection of predominant predictors, and the paper selects them from the NCEP (National Centers for Environmental Prediction)/NCAR (National Center for Atmospheric Research) predictors. The NCEP/NCAR reanalysis data are historical, large-scale climate variables with a resolution of 2–4°; they are globally gridded data derived from observations and numerical weather predictions. Table 2 presents the list of the 26 NCEP/NCAR predictor variables.
| Predictor variable | Name | Predictor variable | Name |
|---|---|---|---|
| 1,000 hPa wind speed | p1_f | 850 hPa wind speed | p8_f |
| 1,000 hPa zonal wind component | p1_u | 850 hPa zonal wind component | p8_u |
| 1,000 hPa meridional wind component | p1_v | 850 hPa meridional wind component | p8_v |
| 1,000 hPa relative vorticity of true wind | p1_z | 850 hPa relative vorticity of true wind | p8_z |
| 1,000 hPa wind direction | p1th | 850 hPa geopotential height | p850 |
| 1,000 hPa divergence of true wind | p1zh | 850 hPa wind direction | p8th |
| 500 hPa wind speed | p5_f | 850 hPa divergence of true wind | p8zh |
| 500 hPa zonal wind component | p5_u | Specific humidity at 500 hPa | s500 |
| 500 hPa meridional wind component | p5_v | Specific humidity at 850 hPa | s850 |
| 500 hPa relative vorticity of true wind | p5_z | Specific near-surface humidity | Shum |
| 500 hPa geopotential height | p500 | Mean temperature at 2 m | Temp. |
| 500 hPa wind direction | p5th | Total precipitation | Prcp. |
| 500 hPa divergence of true wind | p5zh | Mean sea-level pressure | Mslp |
Reported studies provide evidence on the performance of decision trees in downscaling by AI models (Nourani et al. 2019). A decision tree is a supervised AI technique structured in four levels (from the upper to the lower level): root, branches, nodes and leaves. The M5 model tree was developed by Quinlan (1992) as a classifier technique to understand the relationship between dependent and independent variables. M5 fits a regression equation to each leaf as a small part of the data, whereas classic regression fits one equation to the whole dataset. The required procedure comprises (Pal & Deswal 2009): (i) using a split criterion to create a tree model, in which M5 utilises the reduction of the standard deviation within each class and (ii) pruning the branches and substituting them with regression equations. The dominant predictors are those appearing at the upper nodes, which have a high priority with respect to the predictand; predictors at lower nodes, or not appearing in the structure of the tree, have lower priorities.
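As an illustration of tree-based predictor screening, the sketch below ranks predictors with a standard CART regression tree from scikit-learn as a stand-in for the M5 model tree used here; the function and variable names are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def screen_predictors(X, y, names, n_keep=6):
    """Rank large-scale predictors with a regression tree.

    Predictors that split the data near the root of the tree receive the
    highest importance and are retained as the dominant predictors.
    """
    tree = DecisionTreeRegressor(max_depth=4, random_state=0)
    tree.fit(X, y)
    order = np.argsort(tree.feature_importances_)[::-1]
    return [names[i] for i in order[:n_keep]]

# Hypothetical usage: X has one column per NCEP/NCAR predictor, y is local precipitation
# dominant = screen_predictors(X, y, predictor_names, n_keep=6)
```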
Level 1: base models
Statistical downscaling model
Since precipitation as the predictand has a highly variable nature on a local scale, it cannot be fully described by large-scale predictors. Therefore, stochastic techniques artificially inflate the variance of the downscaled precipitation time series (Wilby et al. 2002), where variance inflation refers to scaling the variance of the downscaled predictand to achieve better agreement with observed values.
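A minimal sketch of the idea of variance inflation is given below; it simply rescales the downscaled series about its mean so that its variance matches the observed calibration series, and illustrates the general concept rather than the exact SDSM formulation.

```python
import numpy as np

def inflate_variance(downscaled, observed):
    """Rescale a downscaled series about its mean so that its variance
    matches the observed calibration series (illustrative only)."""
    downscaled = np.asarray(downscaled, dtype=float)
    observed = np.asarray(observed, dtype=float)
    factor = np.std(observed) / np.std(downscaled)
    return np.mean(downscaled) + factor * (downscaled - np.mean(downscaled))
```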
Support vector machine
Sugeno fuzzy logic
SFL is a widely used AI technique; for more details, see Takagi & Sugeno (1985), and for the authors' implementation, see Sadeghfam et al. (2018). To the best knowledge of the authors, SFL is yet to be used as a statistical downscaling model, but the paper uses it as a downscaling model at Level 1. Its implementation requires a clustering technique to identify groupings within the site-specific data, to derive the rule base automatically and to identify the inherent parameters of the membership functions (MFs), where the MF retains the contribution from the original fuzzy logic. The grouping in the data is identified through the subtractive clustering (SC) method given by Chiu (1994). SC uses a cluster radius to control the number of clusters and fuzzy rules (Chen & Wang 1999). The optimum cluster radius is identified by systematically varying the radius in the range of 0–1 until the optimum value is found in terms of RMSE.
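A compact sketch of subtractive clustering in the spirit of Chiu (1994) is given below for data scaled to [0, 1]. The squash radius of 1.5 times the cluster radius and the rejection ratio are common defaults assumed here, not values reported by the study; the cluster radius is the parameter varied between 0 and 1 in the tuning described above.

```python
import numpy as np

def subtractive_clustering(X, radius=0.5, reject_ratio=0.15):
    """Minimal subtractive clustering on data scaled to [0, 1].

    Each point's potential measures how many neighbours lie within the
    cluster radius; peaks of the potential become cluster centres and
    their influence is subtracted before the next centre is picked."""
    X = np.asarray(X, dtype=float)
    alpha = 4.0 / radius ** 2
    beta = 4.0 / (1.5 * radius) ** 2          # squash radius = 1.5 * radius

    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    potential = np.exp(-alpha * d2).sum(axis=1)

    centres = []
    first_peak = potential.max()
    while True:
        k = int(np.argmax(potential))
        peak = potential[k]
        if peak < reject_ratio * first_peak:  # stop when potentials are exhausted
            break
        centres.append(X[k].copy())
        # Subtract the influence of the new centre from all remaining potentials
        potential -= peak * np.exp(-beta * np.sum((X - X[k]) ** 2, axis=1))
    return np.array(centres)
```

A smaller radius yields more clusters (and fuzzy rules), which is why the radius is tuned against RMSE.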
Level 2: combiner model
The IMM strategy uses NF, originally given by Jang (1993), as the combiner model at Level 2, in which the precipitation downscaled by the base models at Level 1 is reused as input data for improved performance. NF determines fuzzy rules by integrating ANN with the fuzzy inference system, such that the parameters of the membership functions in SFL are tuned by a hybrid algorithm using neural networks. The implementation of NF involves five layers, which are detailed in textbooks (Jang et al. 1997). The authors' implementation is similar to Moazamnia et al. (2019), which may be referred to for further details.
Performance metrics
The authors do not encourage the use of superlatives such as ‘superior models’ in the inter-comparison of models based on anecdotal test cases. There is unlikely to be any superior model, but one model may explain a set of local data better than others. As discussed by Khatibi et al. (2020) and Khatibi & Nadiri (2020), it is more appropriate to express model performances as ‘fit-for-purpose’ for those based on EMM practices and as ‘defensible’ for those based on IMM practices. The ultimate aim is to ensure that any information contained in the local data is extracted to its full extent, in which case the models are defensible. This requires a statistical closure, and more appropriate ways need to be developed. This paper demonstrates defensibility by visual means and by counting the number of residual data points within a statistical range, although other approaches are also feasible.
The performance of each individual model is studied using the Nash–Sutcliffe coefficient (NSC) and RMSE for the models at Levels 1 and 2 in the calibration/training and validation/testing phases. For a ‘perfect’ model, the NSC value is close to 1 and the RMSE is close to 0; poorer performances are reflected in NSC values well below 1 and larger RMSE values.
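The two metrics can be computed as in the following sketch, where the observed and predicted series are assumed to be aligned arrays of monthly precipitation.

```python
import numpy as np

def nash_sutcliffe(obs, pred):
    """Nash–Sutcliffe coefficient: 1 is a perfect fit; values well below 1
    (or negative) indicate poor performance."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, pred):
    """Root-mean-square error: 0 for a perfect fit."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))
```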
The literature review indicates that two types of performance metrics are popularly reported in statistical downscaling studies: (i) performance metrics for all datasets in the time scale of monthly or daily and (ii) performance metrics for averaged values in each of the 12 months of the years. The review shows that performance metrics of the first type are significantly less popular than the second type. Gulacha & Mulungu (2017) listed a set of studies related to the statistical downscaling of precipitation and showed that R2 varies in the range of 0.06–0.52 as per the first type of performance metric. However, performance metrics as per the second type are higher than the first type, albeit this has not been explicitly stated in previous studies. The first type of performance metric is reported in the study.
Data availability and study area
The study area, shown in Figure 2, comprises two grid cells: the main cell is located to the south of the Caspian Sea in the north of Iran, largely covering southern Gilan province, and the second cell covers territories of the East Azerbaijan, Ardabil and Zanjan provinces. These cells belong to Can-ESM2 (2.79° lat × 2.81° lon, a resolution of approximately 309 × 309 km²), as used in the Fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change (IPCC). The synoptic station is located in the main cell, but Grid Cell 2 is also considered, as its climatic pattern can affect the climate of adjacent grid cells. Notably, Can-ESM2 data are available on the website of Canadian Climate Data and Scenarios (http://climate-scenarios.canada.ca). Table 2 presents the list of predictors from Can-ESM2.
The study uses the data obtained from the synoptic station of Rasht, Gilan province, run by the Iran Meteorological Organization. The annual precipitation is 1,100 mm in the study area, which is approximately four times greater than the country-wide average for the 10-year period 2004–2013. This precipitation produces approximately 4,500 × 10⁶ m³ of runoff throughout the province (Gilan Regional Water Authority). The annual maximum and minimum temperatures in the same period are 37.8 and −3.5 °C, respectively. According to Emberger (1930), the study area is characterised by a cold and humid climate.
The study uses two sets of data: (i) the predominant predictors among the NCEP/NCAR reanalysis data listed in Table 2 and (ii) the monthly precipitation at the synoptic station of Rasht, serving as the predictand in the downscaling process. Predictors and the predictand were divided into 1961–1995 for the training (or calibration) phase and 1996–2005 for the testing (or validation) phase. Accordingly, 540 data points were used in the training phase and 120 data points in the testing phase. Notably, training/testing is the common terminology in AI-based modelling, whereas calibration/validation is used for statistical (regression) modelling.
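Assuming the monthly records are held in a date-indexed table, the split used here can be sketched as follows; the file name and column layout are hypothetical.

```python
import pandas as pd

# Hypothetical monthly DataFrame indexed by date, with predictor columns and 'prcp'
# df = pd.read_csv("rasht_monthly.csv", index_col="date", parse_dates=True)

def split_by_year(df, train_end=1995):
    """1961-1995 for training/calibration, 1996-2005 for testing/validation."""
    train = df[df.index.year <= train_end]
    test = df[df.index.year > train_end]
    return train, test
```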
The study also uses a third set of results based on GCM scenario outputs to project the calibrated and tested models into the future following best-practice procedures, which are outlined next. The paper uses Can-ESM2 to extract future climate variables under the RCP2.6, RCP4.5 and RCP8.5 scenarios. The difference between these scenarios lies in the projected temperature changes during 2011–2040, 2041–2070 and 2071–2100 (Table 3).
| Scenario | 2011–2040 (°C) | 2041–2070 (°C) | 2071–2100 (°C) |
|---|---|---|---|
| RCP2.6^a | 0.75 | 0.78 | 0.88 |
| RCP4.5 | 1.07 | 1.44 | 2.07 |
| RCP8.5 | 1.06 | 1.8 | 3.55 |
^a Representative concentration pathway.
MODELLING RESULTS
Level 0: pre-processing and identification of predominant predictors
In the pre-processing stage, the data quality was studied by applying Grubbs's outlier test to the local precipitation data and by analysing data gaps. No outliers were identified and the data gaps totalled 5 months, which does not impair the research in terms of data quality. The pre-processing activities also comprise the selection of GCMs, of grid cells around the synoptic station in the study area and of predominant predictors. Notably, this stage may involve a selection from Can-ESM2, Had-CM3, BNU-ESM, HadGEM2-AO and CGCM3(T63) using the CC between observed and predicted predictand, but this is considered out of the scope of the study. The selection of Can-ESM2 as a GCM with satisfactory results has also been reported in previous studies, e.g., Hosseini et al. (2020). Two grid cells were selected after examining other grid cells adjacent to those in Figure 2; based on preliminary evaluations, the predominant predictors are selected within the grid cells shown in Figure 2. This may be attributed to: (i) the Alborz mountains to the south of the study area, which break the climatic connection between the study area and the grid cells to the south, and (ii) the Caspian Sea to the north, which affects the climatic predictors within the corresponding grid cells and weakens their correlation with the cells shown in Figure 2.
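The outlier screening mentioned above can be illustrated with a generic two-sided Grubbs's test, sketched below with SciPy; this is a textbook formulation rather than the authors' exact procedure.

```python
import numpy as np
from scipy import stats

def grubbs_outlier(x, alpha=0.05):
    """Two-sided Grubbs's test for a single outlier (a minimal sketch).

    Returns the index of the suspected outlier, or None if the test
    statistic stays below the critical value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean, sd = x.mean(), x.std(ddof=1)
    idx = int(np.argmax(np.abs(x - mean)))
    g = abs(x[idx] - mean) / sd

    t = stats.t.ppf(1.0 - alpha / (2.0 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t ** 2 / (n - 2 + t ** 2))
    return idx if g > g_crit else None
```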
The input datasets for the models at Level 1 are the predominant predictors screened from the 26 NCEP/NCAR predictors belonging to Can-ESM2 at the two grid cells (see Figure 2); they are selected by CC for SDSM and by decision tree for the AI models. The predictor screening by CC selects four predominant predictors at Grid Cell 1, comprising (i) precipitation; (ii) the meridional wind component at 850 hPa; (iii) the wind speed at 1,000 hPa and (iv) the meridional wind component at 1,000 hPa. The predictor screening by decision tree selects six predictors across Grid Cells 1 and 2, comprising (i) precipitation; (ii) the mean sea-level pressure at Grid Cell 1; (iii) the divergence of true wind at 850 hPa; (iv) the wind direction at 1,000 hPa; (v) the geopotential height at 850 hPa and (vi) the specific humidity at 850 hPa at Grid Cell 2. Notably, the predominant predictors identified by the two approaches are not identical in terms of the type of predictors or the number of grid cells. The predictors are normalised between 0 and 1 to eliminate the effect of their differing ranges.
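A minimal sketch of the CC-based screening and the min–max normalisation is given below; the number of retained predictors and the variable names are illustrative.

```python
import numpy as np

def select_by_correlation(X, y, names, n_keep=4):
    """Rank predictors by the absolute Pearson correlation with the
    predictand and keep the strongest n_keep (as used for SDSM here)."""
    cc = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    order = np.argsort(np.abs(cc))[::-1]
    return [names[j] for j in order[:n_keep]]

def minmax_normalise(X):
    """Scale each predictor column to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    xmin, xmax = X.min(axis=0), X.max(axis=0)
    return (X - xmin) / (xmax - xmin)
```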
Level 1: results
Figure 3(a)–3(c) presents the precipitation time series predicted by SDSM, SVM and SFL at Level 1 for the training and testing phases and compares them against the observed precipitation time series. A visual comparison between the predicted and observed time series indicates that the models at Level 1 are fit-for-purpose and confirms that the performance somewhat deteriorates from the training to the testing phase, as expected. In the testing period, the values predicted by SDSM tend to be higher than those by the AI models. Also, some of the predicted extreme values suffer from larger errors in all models during both training and testing phases, but even in this case, the behaviour of the predicted time series is similar to that of the observed values.
Visual displays of the time series of predicted and observed values are limited in their scope, but a more critical understanding emerges from a study of their scatter diagrams, presented in Figure 4(a1), 4(b1) and 4(c1), which display observed versus predicted precipitation for the training and testing phases of SDSM, SVM and SFL at Level 1. A model with higher performance has lower dispersion about the 1:1 line. Further insight emerges by presenting the residual errors of the models versus observed precipitation, as given in Figure 4(a2), 4(b2) and 4(c2), according to which a model with higher performance has lower dispersion about the line of zero residuals. Figure 4(a2)–4(c2) indicates that SVM (4b2) has relatively lower dispersion than SFL (4c2) and SDSM (4a2), and SFL (4c2) has lower dispersion than SDSM (4a2). However, for the higher values of precipitation, there is a slight trend towards the upper part of the diagrams, and therefore extreme values of precipitation can be relatively underestimated.
A further study of the models at Level 1 is presented in Table 4 in terms of performance metrics of NSC and RMSE, although the intention is not to rank these models. As per NSC and RMSE, SVM performs better than SFL and SDSM, and SFL performs better than SDSM.
Figure 5(a) and 5(b) compares the averaged values of estimated and observed precipitation for each of the 12 months of the year during the calibration/training and validation/testing phases, respectively. The precipitation averaged over each of the 12 months across the whole period of data availability displays a pattern of behaviour that may be used as a convenient representative of the observed local precipitation data, and it serves as a visual basis for studying the behaviour of the modelling results. The figure indicates that the models have better performance metrics in the calibration/training period than in the validation/testing period, as expected. The figure also provides evidence that SVM performs relatively better than SDSM and SFL, except for May (see Figure 5(a)) and May–July (see Figure 5(b)). The fact that SVM does not outperform the other models in all months justifies the necessity of combining the models at Level 1 by IMM strategies.
Level 2: results of NF models
The results of the IMM-NF modelling strategy are also provided in Table 4, according to which the improvements in performance metrics are significant, with the single exception that SDSM performs slightly better in one case. Further comparisons are outlined below using the information presented in Figures 3–5. A comparison between the predicted precipitation at Level 2 in Figure 3(d) and that at Level 1 (Figure 3(a)–3(c)) visually suggests a possible improvement at Level 2, but this needs further attention. The improvements in the performance metrics (Table 4) and in the quality of the model fits (Figure 5) are evident. For instance, the residual-error time series in Figure 3(e) show that the lower residuals during both training and testing phases belong to IMM-NF at Level 2.
The scatter diagram for IMM-NF at Level 2 is presented in Figure 4(d1) and 4(d2), which makes it possible to compare visually the scatter diagrams for the models at both levels. It readily shows that NF has relatively lower dispersion than the models at Level 1, revealing that the enhanced performance metrics of IMM-NF at Level 2 are accompanied by quality improvements. Figure 4(e1) and 4(e2) highlights this issue and quantifies the dispersion of the results by counting the data points located in the transparent yellow band of ±50 mm. The figure indicates that 76% of the results by IMM-NF are located within the band, whereas those for SDSM, SFL and SVM are 63, 69 and 69%, respectively. According to the figure and Table 4, IMM-NF at Level 2 has significantly improved on the individual base models at Level 1 in terms of quantified dispersion and performance metrics. Also, Figure 5 indicates that the precipitation predicted by IMM-NF has a lower deviation from observed precipitation than the other models.
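The band-counting measure of dispersion can be reproduced with a short sketch such as the following, where the ±50 mm half-width matches the band used in Figure 4(e).

```python
import numpy as np

def share_within_band(obs, pred, half_width=50.0):
    """Percentage of residuals (obs - pred) falling inside a +/- half_width
    band, used to compare the dispersion of the Level 1 and Level 2 models."""
    residuals = np.asarray(obs, float) - np.asarray(pred, float)
    return 100.0 * np.mean(np.abs(residuals) <= half_width)
```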
Projections into the future
The downscaling models combined by IMM-NF are employed to project future precipitation under different scenarios, the procedure for which is outlined above. The results of the different scenarios are compared for the 12 months of the year during each projection period (2011–2040, 2041–2070 and 2071–2100). Figure 6(a) compares the results of the different scenarios during 2011–2040 and confirms insignificant differences between them. Similar behaviour is observed for the other periods (2041–2070 and 2071–2100), but the results are not presented for brevity. The figure also shows the observed and predicted precipitation during 1961–2005 and indicates that the projected precipitation is reduced compared with the past precipitation (1961–2005) in some months (October–February).
The precipitation projected by the models at Levels 1 and 2 is displayed in Figure 6(b)–6(d) for each projection period. According to the figure, the pattern of variation across the months is similar for all models except SDSM, which overestimates the projected precipitation from July to October. This is attributable to the linear relationship between predictors and the predictand in SDSM, whereas the AI models seek nonlinear relationships; notably, such significant differences between the models are not observed in the downscaling stage. The comparison between the projected and observed precipitation indicates that, generally, the projected precipitation is reduced relative to the observed precipitation from January to February and from October to December, despite some reversals, and is slightly increased relative to the observed precipitation from April to May.
DISCUSSION
The IMM strategy is in its infancy and the ongoing research activities of the authors demonstrate its feasibility in different fields of water resources engineering. This paper presents a new application of IMM in climate studies, in which IMM incorporates a set of models at Level 1 and combines their results at Level 2 in the downscaling and projection stages. The paper reveals an interesting feature of IMM, in which models have approximately identical behaviour at the downscaling stage during both the training and testing phases. Thus, there is not much difference between the modelling results produced at Levels 1 and 2 compared with the pattern of the averaged values of precipitation in all 12 months of the years as discussed in Section 3.2 and Figure 5. However, the modelling results at the projection stage display some discordant outcomes as displayed in Figure 6, in which the models at Level 1 are somewhat discordant in comparison with the above-mentioned pattern, whereas the projected modelling results at Level 2 are in concordance with the above pattern. The results presented in the paper serve as a proof-of-concept for the applications of IMM to modelling studies on climate change, as they enhance accuracy and provide new insights. However, there are no published results on similar applications to climate change, and therefore there are no comparable results to discuss in the paper.
The idea of statistical downscaling by multiple models is an ongoing research activity in the literature and is categorised into (i) ‘ensemble models’, which perform downscaling with different AI or regression models and compare the results (e.g., Li et al. 2020), and (ii) using statistical models to combine the downscaling results obtained by different models (e.g., Su et al. 2019). Although the second category can be included in the IMM modelling strategy, it is necessary to capture deeper information on the performance of AI models as downscaling or combiner models. This study provides a proof-of-concept for using IMM in downscaling and projecting precipitation, in which the data are taken from one synoptic station. A consideration of several stations using an IMM strategy is planned by the authors for the case of the Lake Urmia basin, where the lake is shrinking catastrophically.
The formulated IMM modelling strategy consists of two levels, in which SDSM, SFL and SVM are trained and tested at Level 1 and their outputs are fed to NF at Level 2. Notably, the choices for selecting the models at both levels, and the number of possible modelling strategies, are wide. Khatibi et al. (2020) also discuss the possibility of analysing the statistical distribution of error residuals to set a closure on minimising the number of strategies, but these are outside the scope of the paper, although they are planned for future studies.
The monthly time scale for precipitation downscaling was investigated in the study, but other downscaling resolutions are practised: a higher resolution of 3 h was used by Mendes & Maia (2020); daily intervals have been used by various authors (see Table 1) and lower resolutions of monthly (see Table 1) or annual intervals by Sachindra & Perera (2018). Generally, higher resolutions are appropriate for investigating hydrological models such as rainfall–runoff (e.g., Tavakolifar et al. 2017), whereas lower resolutions are used for groundwater (e.g., Tewari et al. 2015) or drought studies (e.g., Tabari et al. 2021). Considering the importance of the downscaling resolution, there is room for investigating the applicability of IMM to downscaling and projecting precipitation at different time resolutions.
CONCLUSION
This study formulated an IMM strategy to investigate the response of a study area to climate change through downscaling and projection of monthly precipitation. The IMM strategy incorporates three models at Level 1, which comprise SDSM, SFL and SVM, and feeds their outputs to NF at Level 2. The modelling activities also included the selection of predominant predictors among the NCEP/NCAR predictors using appropriate techniques. The results indicated that models at Level 1 are fit-for-purpose, but the model at Level 2 (NF) significantly improves the downscaling accuracy in terms of less dispersion in residual errors and performance metrics. All models somewhat underestimated the extreme precipitation values.
The future is uncertain and the paper presents its results knowing that projections for the next 100 years will be highly uncertain; nonetheless, the paper attempts to improve the reliability of the projected precipitation through the IMM strategy formulated in the downscaling stage. At first glance, it was observed that although the models presented close results in the downscaling stage, one of the Level 1 models (SDSM) showed a largely different behaviour during some months in the projection stage. IMM reduces this discordance between the Level 1 models, which is not evident in the downscaling stage. Also, the projection results indicate that precipitation will be reduced compared with observed precipitation in the cold season (October–February), but the projected precipitation will be slightly increased in the wet season (April to May).
The IMM strategy formulated by the study provided an insight into the downscaling and projection of precipitation. However, further investigations may be suggested based on the limitations of the study, summarised as follows: (i) an IMM modelling strategy was formulated in which the aim is to learn both from local data and from different GCMs; (ii) the residual analysis provided evidence that accuracy deteriorates with increasing precipitation, and there is room for further improvement by investigating different strategies to increase the performance; (iii) monthly precipitation was investigated in the paper, but lower and higher resolutions for both precipitation and temperature could be considered and (iv) the performance of IMM was highlighted in the projection stage, which may be attributed to the uncertainty associated with the differences between models; therefore, an uncertainty analysis could be carried out for better insight.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories. (http://climate-scenarios.canada.ca/)