Changes in climate might have a significant impact on rainfall characteristics, including extreme rainfall. This study aims to project the future daily rainfall, preserving most of the rainfall characteristics, including extreme rainfall incorporating climate changes. This paper presents two hybrid semi-parametric statistical downscaling models for future projection of IDF curves. The precipitation flux from seven scenarios of ten GCMs and observed daily rainfall data are considered as predictors and predictand variables, respectively. At site, daily rainfall occurrence is modeled using a two-state first-order Markov chain. Rainfall amounts on each wet day are modelled using a univariate nonparametric kernel density estimator. Two types of amount generation models are presented in this study. The bounded model (KDE-SP) is developed, considering the support for the kernel distribution as positive. In the unbounded model (KDE-Ext), the wet days are reclassified as extreme and non-extreme rainy days. A significant increasing trend can be observed in the future projected intensity–duration–frequency relationships. The maximum increment using empirical distribution is observed as 93.21 and 80.93% on a 5-year return period in the far future for the SSP5-8.5 scenario, using KDE-Ext and KDE-SP models, respectively. Although both methods show similar results, the KDE-Ext model performs better in simulating extreme rainfall.

  • This study introduces two new statistical downscaling techniques to simulate future daily rainfall time series based on a two-state, first-order Markov chain and the kernel density estimation technique.

  • Both models can produce long-term synthetic rainfall series different from the historical data, better preserving most of the statistical characteristics and extreme rainfall values.

Global warming and the resulting climate change are anticipated to have significant effects on ecosystems, agriculture, freshwater availability, and human civilization, all of which are susceptible to changes in precipitation (Kannan & Ghosh 2013; Kumar et al. 2023; Sahu et al. 2023). Water is essential for both civilization and the environment, so it is of enormous significance to understand how changes in the global climate may influence regional water availability. General circulation models (GCMs) are the most credible mathematical models for simulating global climate variables. They are run under the shared socioeconomic pathways (SSPs), which reflect mitigation and socioeconomic challenges and changing greenhouse gas concentration and radiative forcing levels, and they are generally used to assess the potential effects of climate change on hydroclimate variables (Dey et al. 2022). The spatial and temporal resolution of GCM outputs is often too coarse to assess local or point-scale climatic variables clearly (Kannan & Ghosh 2013; Salvi et al. 2013; Raju & Kumar 2020; Pham et al. 2021). Spatial or temporal downscaling techniques are therefore widely used in the literature to evaluate point- or local-scale hydroclimatic variables from GCM outputs and to address this scale mismatch (Giorgi & Mearns 1991; Ghosh & Mujumdar 2006; Fowler et al. 2007; Kannan & Ghosh 2013; Chandra et al. 2015; Tavakolifar et al. 2017; Halder & Saha 2021; Pham et al. 2021).

There are two primary ways of downscaling large-scale GCM output to a finer resolution: (a) dynamical approach and (b) statistical approach (Fowler et al. 2007; Kannan & Ghosh 2013; Salvi & Ghosh 2013; Chandra et al. 2015). Statistical downscaling approaches are computationally cost-effective and beneficial if there is enough historical data available for building the statistical or empirical relationships between the variables simulated by the large-scale GCMs, known as predictors, and station-scale climate variables, known as predictands (Mearns et al. 2003; Mujumdar & Kumar 2012).

In the literature, statistical downscaling methods are generally classified into three categories: weather generators or stochastic weather generators, weather typing, and regression models or transfer functions (Giorgi & Mearns 1991; Wilby & Wigley 1997; Wilby et al. 2002; Fowler et al. 2007; Kannan & Ghosh 2011; Chandra et al. 2015; Pham et al. 2021). Some statistical downscaling methods used for climate variable projections include Markov chain models based on transitional probability (TP) (Haan et al. 1976; Bardossy & Plate 1991; Hughes et al. 1993; Wilks 1999a), spell length models based on TP (Lall & Sharma 1996; Wilks 1999b), the nonhomogeneous Markov model (Rajagopalan et al. 1996), the nonhomogeneous hidden Markov model (Hughes & Guttorp 1994), nonparametric nonhomogeneous hidden Markov models (Mehrotra & Sharma 2005, 2006), the semi-parametric Markov model (Mehrotra & Sharma 2007), the fuzzy clustering technique (Ghosh & Mujumdar 2006), the k-nearest neighbour (k-NN) resampling technique (Rajagopalan & Lall 1999), artificial neural networks (Olsson et al. 2004), support vector machines (Pham et al. 2019), and the nonparametric kernel regression statistical downscaling model (Kannan & Ghosh 2011; Salvi & Ghosh 2013).

The typical basic Markov chain-based rainfall simulation model has been in the literature for the last few decades (Gabriel & Neumann 1962; Haan et al. 1976; Richardson 1981; Wilby 1994; Wilks 1992, 1989). The first statistical daily rainfall occurrence model using a first-order Markov chain was presented by Gabriel & Neumann (1962). Haan et al. (1976) simulate daily rainfall amounts using exponential distribution and uniform distribution for six nonzero classes, which are also identified by the first-order Markov model. Richardson (1981) and Wilby (1994) have also employed exponential distributions to model the daily rainfall amounts on each wet day. Wilks (1989) has modelled a two-state first-order Markov chain model to obtain daily rainfall occurrence and deployed a two-parameter gamma distribution to get the rainfall amount based on the probability density function to simulate the monthly rainfall. Wilks (1992) developed a statistical weather generator model using a two-state first-order Markov chain and two-parameter gamma distribution to generate daily rainfall amounts and also incorporate GCM outputs to assess the impacts of climate change. Chandra et al. (2015) proposed a statistical weather generator model to simulate extreme rainfall (ER) in three future time slices. They used a three-state first-order Markov chain model to determine the non-rainy days, rainy days of moderate intensity, and rainy days of high intensity. Chandra et al. (2015) fitted the three-state gamma distribution for both moderate and high-intensity rainfall to generate the daily rainfall amount for each wet day.

Rajagopalan et al. (1996) present a single-step, nonhomogeneous Markov model to generate daily rainfall at a single site. The one-step, 2 × 2 transitional probability matrices (TPMs) are estimated using a kernel density estimator through a weighted average of transition counts at the day of interest over the historical period. The rainfall amounts on each wet day were estimated using the kernel density estimation (KDE) technique centred on the day of interest over all the historical observed periods. Harrold et al. (2003a) present a nonparametric model based on the nearest neighbour approach to simulate daily rainfall occurrences for single sites. Harrold et al. (2003b) present a nonparametric stochastic model based on the KDE technique to generate daily rainfall amounts for single-site conditions on each wet day estimated using the method proposed by Harrold et al. (2003a). In that study, four distinct classes of previous-day rainfall amount were used as the predictor variable for the current-day rainfall amount. The seasonality of the daily rainfall series was achieved using an l-day moving window approach. Mehrotra & Sharma (2005) developed a nonparametric, nonhomogeneous hidden Markov model based on the k-NN technique to simulate daily rainfall occurrences for multiple sites using four atmospheric circulation variables. Mehrotra & Sharma (2007) proposed a semi-parametric stochastic modelling framework based on the KDE approach to generate multi-site daily rainfall amounts. The rainfall occurrence for each site was modelled using a two-state, first-order Markov chain model modified by the nearest neighbour approach with ‘aggregate’ predictor variables indicating how wet it has been over a particular period. The rainfall amounts were modelled using the nonparametric KDE technique with an l-day moving window.

Most rainfall simulation studies within the Markovian framework model rainfall occurrence using a Markov chain and rainfall amounts using a parametric distribution (e.g., gamma, exponential, log-normal, generalized extreme value (GEV)). The combination of a modified Markov chain with a nonparametric technique (e.g., k-NN and KDE) has also been used in the literature, as discussed earlier. However, to the authors' knowledge, no study has used both the Markov chain and the nonparametric KDE technique to simulate daily rainfall time series for future periods incorporating GCM precipitation outputs. This work addresses that gap with two downscaling methods in which both the Markov chain and the nonparametric KDE technique are used to simulate the daily rainfall time series for three future time slices.

This study presents two stochastic downscaling frameworks to simulate daily rainfall occurrences and amounts for a single site, such that the models can represent the sequential future rainfall time series at daily and longer timescales. The approach is structured so that the models preserve attributes such as ER, the number of wet and dry days, and the other statistics listed in Table 2, consistent with the observed historical rainfall record. Both downscaling frameworks operate in two parts. The first part downscales daily rainfall occurrence using a TP-based two-state first-order Markov chain (Richardson 1981). This part of the framework is named the rainfall occurrence downscaling model (RODM); details of the method can be found in Wilks & Wilby (1999). The second part, named the rainfall amounts downscaling model (RADM), simulates the daily rainfall amount for each day classified as wet by the RODM. The RADM is based on a univariate KDE function (Härdle et al. 2004; Scott 2015; Hollander et al. 2015; Silverman 2018). A major drawback of the nonparametric approach is its limited capacity to extrapolate daily precipitation values beyond the largest value recorded (Rajagopalan et al. 1996). Because the proposed methods combine the KDE with a perturbation factor (PF) derived from GCM data, they can generate values different from the historical data, which is the novel aspect of the approach.

Table 1

Detailed list of CMIP6 GCM outputs used in this study. Grid (Lon-Lat) in column 4 gives the number of grid divisions of the earth's surface along longitude and latitude, respectively

| Sl. No. | Source ID | Institution ID | Grid (Lon-Lat) | SSPs |
|---|---|---|---|---|
| 1 | CanESM5 | CCCma | 128 × 64 | SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP4-3.4, SSP4-6.0, SSP5-8.5 |
| 2 | CNRM-ESM2-1 | CNRM-CERFACS | 256 × 128 | SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP4-3.4, SSP4-6.0, SSP5-8.5 |
| 3 | IPSL-CM5A2-INCA | IPSL | 96 × 96 | SSP1-2.6, SSP3-7.0 |
| 4 | MRI-ESM2-0 | MRI | 320 × 160 | SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP4-3.4, SSP4-6.0, SSP5-8.5 |
| 5 | CMCC-CM2-SR5 | CMCC | 288 × 192 | SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP5-8.5 |
| 6 | E3SM-1-0 | E3SM-Project | 360 × 180 | SSP5-8.5 |
| 7 | EC-Earth3-Veg | EC-Earth-Consortium | 512 × 256 | SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP5-8.5 |
| 8 | MIROC-ES2L | MIROC | 128 × 64 | SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP4-3.4, SSP4-6.0, SSP5-8.5 |
| 9 | NESM3 | NUIST | 192 × 96 | SSP1-2.6, SSP2-4.5, SSP5-8.5 |
| 10 | TaiESM1 | AS-RCEC | 288 × 192 | SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP5-8.5 |
Table 2

Performance evaluation of both downscaling models using different statistical methods. The p-values presented for the null hypothesis tests in columns 3–6 represent the probability of observing a test statistic at least as extreme as the observed value under the null hypothesis

| Ensemble average statistical results from ten GCMs | Model name | Two-sample t-test for equal means | Two-sample Kolmogorov–Smirnov test for same distribution | Two-sample F-test for equal variances | Wilcoxon rank sum test for equal median of two populations | R2 | RMSE | MPE | MAPE |
|---|---|---|---|---|---|---|---|---|---|
| 30 years AMP series | 1. KDE-Ext | 0.47 | 0.94 | 0.67 | 0.49 | 0.96 | 0.69 | −9.01 | 9.19 |
| 30 years AMP series | 2. KDE-SP | 0.53 | 0.94 | 0.68 | 0.64 | 0.96 | 0.66 | −7.83 | 8.24 |
| GEV-distributed AMP series up to 100 years RP | 1. KDE-Ext | 0.16 | 0.89 | 0.10 | 0.22 | 0.98 | 0.66 | −8.23 | 8.23 |
| GEV-distributed AMP series up to 100 years RP | 2. KDE-SP | 0.23 | 0.96 | 0.11 | 0.33 | 0.98 | 0.62 | −7.04 | 7.04 |
| Daily mean | 1. KDE-Ext | 0.86 | 0.79 | 0.99 | 0.71 | 0.99 | 0.57 | −55.19 | 58.09 |
| Daily mean | 2. KDE-SP | 0.96 | 0.99 | 0.81 | 0.93 | 0.99 | 0.56 | −41.80 | 50.68 |
| Monthly mean | 1. KDE-Ext | 0.86 | 0.79 | 0.98 | 0.71 | 0.99 | 17.46 | −55.16 | 58.08 |
| Monthly mean | 2. KDE-SP | 0.96 | 0.99 | 0.82 | 0.93 | 0.99 | 17.06 | −41.80 | 50.70 |
| Daily median | 1. KDE-Ext | 0.04 | 0.07 | 0.91 | 0.04 | 0.89 | 2.21 | −67.07 | 67.07 |
| Daily median | 2. KDE-SP | 0.56 | 0.07 | 0.46 | 0.37 | 0.93 | 0.88 | −26.00 | 27.96 |
| Monthly median | 1. KDE-Ext | 0.72 | 0.74 | 0.81 | 0.47 | 0.99 | 29.31 | −66.10 | 66.20 |
| Monthly median | 2. KDE-SP | 0.89 | 0.74 | 0.99 | 0.74 | 0.99 | 17.24 | −47.42 | 51.34 |
| Daily standard deviation | 1. KDE-Ext | 0.67 | 0.99 | 0.99 | 0.75 | 0.26 | 6.80 | −24.80 | 41.67 |
| Daily standard deviation | 2. KDE-SP | 0.63 | 0.79 | 0.98 | 0.71 | 0.26 | 6.80 | −25.78 | 41.39 |
| Monthly standard deviation | 1. KDE-Ext | 0.55 | 0.19 | 0.13 | 0.62 | 0.89 | 30.35 | −26.07 | 59.46 |
| Monthly standard deviation | 2. KDE-SP | 0.51 | 0.19 | 0.13 | 0.58 | 0.90 | 30.75 | −21.85 | 56.63 |
| Daily skewness | 1. KDE-Ext | 0.92 | 0.99 | 0.52 | 0.67 | 0.13 | 0.94 | −2.23 | 25.25 |
| Daily skewness | 2. KDE-SP | 0.79 | 0.79 | 0.53 | 0.62 | 0.11 | 0.94 | −0.39 | 25.32 |
| Monthly skewness | 1. KDE-Ext | 0.22 | 0.19 | 0.95 | 0.14 | 0.50 | 0.61 | 10.83 | 44.57 |
| Monthly skewness | 2. KDE-SP | 0.27 | 0.19 | 1.00 | 0.19 | 0.50 | 0.60 | 7.31 | 44.42 |

Note: Daily and monthly rainfall statistics of daily rainfall time series are evaluated for all twelve months (January–December) for simulated and observed time series during VP, and the performance evaluation is carried out with respect to the observed time series statistics.

In this study, the rainfall time series from large-scale GCM outputs is taken as the predictor variable for simulating the point-scale rainfall time series. The KDE-based downscaling methods proposed in the literature (Mehrotra & Sharma 2005, 2006, 2010; Kannan & Ghosh 2013; Shashikanth & Ghosh 2013; Shashikanth et al. 2018) did not use rainfall time series from large-scale GCM outputs as predictor variables. This study proposes an approach that simulates daily rainfall time series from large-scale GCM precipitation series and station-scale observed rainfall series through the RODM and RADM, combining a TP-based two-state first-order Markov chain with the univariate KDE technique.

The Alipore meteorological station (Station Index No. 42807) shown in Figure 1, with geographical coordinates of 22.53° N and 88.33° E (Desamsetti et al. 2016), is the location of this study. Monthly, daily, and hourly rainfall data for Alipore station were collected from the India Meteorological Department (IMD), Pune, and IMD Kolkata. In the quality control phase, a few spurious hourly data were discarded. The discarded data and some missing data (≤5% of the total data) were filled in using a statistical method discussed in the Methodology section. The monthly rainfall series reports the heaviest 24-h rainfall for each year, and its maximum value from 1901 to 2013 is 369.6 mm/day. In the hourly rainfall series, we found three dates (6 September 1972, 26 September 1990, and 8 August 2014) on which rainfall exceeded this daily maximum. On these 3 days, all hourly data are missing from midnight, and the maximum hourly values are recorded as 135.2, 240.2, and 1,212.8 mm/h, respectively, which are very high compared with the other hourly maxima. Likewise, the total rainfall up to midnight for these 3 days is recorded as 382.6, 485.2, and 1,960.4 mm/day, respectively, which exceeds the maximum daily rainfall. We therefore discarded all the data from these three days as erroneous. The observed rainfall data are divided into the observed baseline period (OBP) and the observed validation period (OVP). The baseline period (BP) is 1969 to 1998, and the validation period (VP) is 1984 to 2013. The models are fitted to the BP data, so the BP can also be considered the calibration period. Because of the limited data length and the use of the PF to obtain the perturbed parameters for model validation and future projection, the calibration (or baseline) period and the VP overlap, as in Hundecha & Bárdossy (2004), Walton et al. (2015), and Halder & Saha (2021).
Figure 1

Study area map along with geographical coordinates. The name of the meteorological station is Alipore Meteorological Station, with Station Index No. 42807 and geographical coordinates of 22.53° N and 88.33° E, which is marked on all three maps.


A total of 35 GCM outputs containing daily precipitation data have been downloaded from the official website of the Coupled Model Intercomparison Project Phase 6 (CMIP6), which has historical and at least four scenario outputs. The historical and scenario data used in this study span 1850 to 2014 and 2015 to 2100, respectively. Seven scenarios, SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP4-3.4, SSP4-6.0, and SSP5-8.5, have been used for downscaling the daily rainfall into three different future periods of 2021–2050 (near future [NF]), 2051–2080 (middle future [MF]), and 2071–2100 (far future [FF]) (Verma et al. 2024). GCM historical data have been used for the GCM historical baseline period (G-HBP) (1969–1998) and the GCM historical validation period (G-HVP) (1984–2013).

Selection of GCMs

The number of GCMs is reduced to 10 GCMs to decrease the computational time of the models. However, the best 10 GCM models have been selected to maintain the efficiency of the downscaling models. Initially, 15 GCM models were discarded by comparing the GCM mean monthly rainfall (MMR) and the observed MMR from 1969 to 1988. After that, a threshold value is set as the 90th percentile of all rainy day rainfall values to get the peak over the threshold (POT) value for all the months, for all the GCMs, as well as for observed data during 1969–1988 and 1984–2013. Then, the PF is calculated for both observed and GCM data for all the months. These PF from observed and GCM data are used to calculate the root mean square error (RMSE) and mean absolute percentage error (MAPE). A weight based on the observed POT values is assigned to RMSE and MAPE corresponding to each GCM. Finally, two lists of GCMs corresponding to RMSE and MAPE (Verma et al. 2023) are prepared in ascending order, and the last 5 GCMs from both lists are discarded. RMSE values are given the first preference when selecting the ten best GCMs. The first 10 GCMs in the list of GCMs with the lowest RMSE value are shortlisted, which also belong to the list of GCMs with MAPE. The detailed list of GCM outputs, which are shortlisted and used in this study, is given in Table 1. The list of GCMs, which are discarded based on MMR, and the list of GCMs along with RMSE and MAPE are presented in the Supplementary Material.

This study presents two statistical downscaling methods, KDE-Ext (KDE for extreme and non-extreme series) and KDE-SP (KDE for wet day rainfall series considering support positive), to simulate daily rainfall time series. Both downscaling models are based on a two-state first-order Markov chain and a kernel density estimator. First, the TP of wet days and ER are obtained for every month from all rainfall time series. Both models then identify each day as wet or dry according to the two-state first-order Markov chain TP. In the KDE-SP method, the rainfall amount on each wet day is then simulated from the cumulative distribution function (CDF) based on the KDE technique. The support (boundary) of the density function in the KDE-SP method is restricted to positive values for each rainy day rainfall amount series (RRAS). In the KDE-Ext method, a wet day is reclassified as extreme or non-extreme using the probabilities of extreme rainfall (PER). The ER of the series for a particular month is defined as the values that are equal to or greater than a specific threshold value for that month. The threshold value (denoted as Th90), which is set by trial and error, is taken as the 90th percentile of the series for that month. After the wet days are classified as extreme or non-extreme, the rainfall amounts on the classified days are obtained from the CDF of the KDE. In the KDE-Ext method, the support of the CDF is unbounded (i.e., from −∞ to +∞) for each ER or non-ER series. For both downscaling methods, the bandwidth (h) of the KDE is selected automatically for the RRAS in order to simulate the intra-annual variations. The inverse distance weighting technique (Halder & Saha 2021) is used to determine the point-scale parameters or characteristics required for the downscaling method from the gridded GCM daily rainfall series.
The GEV distribution (Halder & Saha 2021) and empirical distribution are used to construct the intensity–duration–frequency (IDF) curves from a 30-year annual maximum precipitation (AMP) series.

The PF (Halder & Saha 2021) is used as a multiplication factor to obtain the perturbed TPM (P-TPM), PER (P-PER), and RRAS (P-RRAS). Mathematically,
$$PF_{RRAS}(m, p_e) = \frac{RRAS_f(m, p_e)}{RRAS_b(m, p_e)} \tag{1}$$
and
$$PF_{TPM}(m) = \frac{TPM_f(m)}{TPM_b(m)} \tag{2}$$
where $PF_{RRAS}(m, p_e)$ and $PF_{TPM}(m)$ are the PF for RRAS and TPM for the month $m$, respectively. $RRAS_f(m, p_e)$ and $RRAS_b(m, p_e)$ are the RRAS for the future and baseline periods corresponding to the same exceedance probability $p_e$ for the month $m$. $TPM_f(m)$ and $TPM_b(m)$ are the TPM for the future and baseline periods corresponding to the same sequential month $m$, where $m = 1, 2, 3, \ldots, 12$.
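As a worked illustration of the perturbation step, the sketch below computes a PF and applies it to an observed transition probability. It assumes the PF is the ratio of the future to the baseline GCM statistic; all numerical values are hypothetical, and the clipping of the perturbed probability to [0, 1] is an added safeguard that is not described in the text.

```python
def perturbation_factor(future, baseline):
    """PF taken as the future-to-baseline ratio of a GCM statistic (assumed form)."""
    if baseline == 0:
        raise ValueError("baseline statistic must be non-zero")
    return future / baseline


def perturb_probability(p_obs, pf):
    """Scale an observed probability by the PF; clipping to [0, 1] is an assumption."""
    return min(1.0, max(0.0, p_obs * pf))


# Hypothetical wet-after-dry probabilities (p01) for one month
p01_gcm_baseline = 0.40   # from the GCM historical baseline series
p01_gcm_future = 0.46     # from a GCM scenario series
p01_observed = 0.50       # from the observed baseline series

pf = perturbation_factor(p01_gcm_future, p01_gcm_baseline)
p01_perturbed = perturb_probability(p01_observed, pf)
print(f"PF = {pf:.3f}, perturbed p01 = {p01_perturbed:.3f}")
```

The same ratio would be applied entry by entry to the TPM and, per exceedance probability, to the RRAS.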
The multi-model ensemble averaging (MMEA) technique (Dey et al. 2022; Verma et al. 2023) is used to address the inter-model uncertainty due to the use of multiple GCMs. Mathematically,
$$P_m(t) = \frac{1}{n} \sum_{i=1}^{n} P_i(t) \tag{3}$$
where $P_m(t)$ is the ensemble mean of the variables/outputs from all GCMs for time $t$, $P_i(t)$ is the variable/output estimated from the downscaled rainfall series from the $i$th GCM for time $t$, and $n$ is the total number of GCMs.

Rainfall occurrence model

The rainfall occurrence model for downscaling the at-site daily rainfall is done using a two-state, first-order Markov chain technique (Gabriel & Neumann 1962; Wilks & Wilby 1999), assuming the probability of rainfall occurrence on any given day depends only on whether the preceding day was dry or wet. The RODM parameters (Markov chain transition probabilities) are estimated for each rainfall series (OBP, OVP, G-HBP, G-HVP, and GCM scenario series from each SSP available in each GCM for near, middle, and FF) for each month (January–December). The PF of TPM is obtained from the future and historical GCM's daily rainfall series and multiplied with the TPM obtained from OBP data to get the required P-TPM for RODM.

The occurrence of rainfall on a day can be obtained using the following equation:
$$Ro_t = \begin{cases} 1 \ (\text{wet}), & \mu_c \le p \\ 0 \ (\text{dry}), & \text{otherwise} \end{cases} \tag{4}$$
where $p$ is the perturbed transition probability from the P-TPM ($p_{01}$ or $p_{11}$) of current-day rainfall occurrence, chosen according to the previous day's rainfall condition. $Ro_t$ is the rainfall state on day $t$, generated conditionally by drawing a uniform [0, 1] pseudo-random number (RN) $\mu_c$ and comparing it with the associated TP $p$.
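The occurrence step can be sketched as follows. This is a minimal illustration of a two-state, first-order Markov chain, not the authors' implementation; the function name `simulate_occurrence`, the transition probabilities, and the choice of a dry day before the first simulated day are all assumptions.

```python
import random


def simulate_occurrence(n_days, p01, p11, seed=None, first_state=0):
    """Simulate a wet (1) / dry (0) sequence with a two-state, first-order Markov chain.

    p01: probability of a wet day following a dry day.
    p11: probability of a wet day following a wet day.
    first_state: state assumed for the day before the simulation starts.
    """
    rng = random.Random(seed)
    states = []
    previous = first_state
    for _ in range(n_days):
        p = p11 if previous == 1 else p01   # TP selected by the previous day's state
        u = rng.random()                    # uniform pseudo-random number
        current = 1 if u <= p else 0        # wet if the RN does not exceed the TP
        states.append(current)
        previous = current
    return states


# Hypothetical perturbed transition probabilities for one month
sequence = simulate_occurrence(n_days=31, p01=0.3, p11=0.7, seed=42)
print(sequence)
print("wet days:", sum(sequence))
```

In a monthly scheme such as the one described here, `p01` and `p11` would be switched to the values of the current calendar month as the simulation advances.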

Rainfall amount model

The univariate KDE method (Härdle et al. 2004; Scott 2015; Hollander et al. 2015; Gramacki 2018; Silverman 2018) is used to generate the rainfall amounts from kernel CDF conditioned on the rainfall state as wet day simulated by the RODM. A threshold of 0.1 mm/day rainfall intensity is considered in this study to define a wet day as guided by the IMD. The Gaussian kernel function (Sharma et al. 1997; Härdle et al. 2004; Gramacki 2018) is used in this study to estimate the probability density function for each month. This study used Silverman's rule of thumb for bandwidth estimation (Härdle et al. 2004; Scott 2015; Hollander et al. 2015; Gramacki 2018; Silverman 2018). The Gaussian kernel used an infinite support domain to estimate probability distribution and assign a small probability to some regions of the support domain where the outputs become negative, which is invalid for hydrological parameters such as rainfall.

This leakage of probability problem is addressed by determining whether the simulated values are negative or positive at each phase. A new sample is created from the same kernel slice each time a negative quantity is encountered by creating a new RN μk until a positive amount value is attained (Sharma et al. 1997; Kannan & Ghosh 2013). This technique is used in the KDE-Ext method.
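The redraw procedure can be sketched as below. The sketch draws from a Gaussian KDE by picking one observation (the kernel slice) and adding Gaussian noise with standard deviation h, which is equivalent in distribution to inverting the kernel CDF. The bandwidth uses one common form of Silverman's rule of thumb; the sample data, function names, and the fallback after `max_tries` are assumptions for illustration only.

```python
import random
import statistics


def silverman_bandwidth(data):
    """One common form of Silverman's rule of thumb for a Gaussian kernel."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(sd, (q3 - q1) / 1.34) if q3 > q1 else sd
    return 0.9 * spread * n ** (-0.2)


def sample_wet_day_amount(data, h, rng, max_tries=1000):
    """Draw one amount from a Gaussian KDE, redrawing negatives from the same kernel."""
    centre = rng.choice(data)              # kernel slice centred on one observation
    for _ in range(max_tries):
        amount = rng.gauss(centre, h)      # new random draw from that kernel
        if amount > 0:
            return amount
    return centre                          # fallback (assumed): keep the observed value


# Hypothetical wet-day rainfall amounts (mm/day) for one month
wet_day_amounts = [0.4, 1.2, 2.5, 3.1, 5.0, 7.8, 12.4, 20.3, 35.6, 61.0]
rng = random.Random(7)
h = silverman_bandwidth(wet_day_amounts)
simulated = [sample_wet_day_amount(wet_day_amounts, h, rng) for _ in range(5)]
print(f"h = {h:.2f} mm")
print([round(x, 1) for x in simulated])
```

In KDE-Ext, a draw of this kind would be made separately from the extreme and the non-extreme series of the month, depending on how the wet day was classified.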

Another way to solve the leakage of probability problem is to truncate the Gaussian kernel from the infinite support domain to a positive support region. This avoids assigning small probabilities to regions of the domain where the data are invalid or out of bounds. The truncation is achieved by setting the kernel to zero outside the desired support region. This approach is adopted in the KDE-SP technique. The detailed procedures of the KDE-Ext and KDE-SP methods are presented as flowcharts in Figure 2.
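A minimal sketch of sampling from a truncated kernel estimate is given below. It assumes that the Gaussian KDE is set to zero for non-positive values and then renormalised as a whole so that it integrates to one, and it inverts the resulting CDF by bisection. The function names, the bandwidth value, and the data are hypothetical; software packages may implement positive support differently.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for the kernel CDF


def truncated_kde_cdf(x, data, h):
    """CDF of a Gaussian KDE set to zero below 0 and renormalised (assumed form)."""
    if x <= 0:
        return 0.0
    below_zero = sum(_STD.cdf(-xi / h) for xi in data)      # mass lost to x <= 0
    below_x = sum(_STD.cdf((x - xi) / h) for xi in data)    # untruncated mass up to x
    return (below_x - below_zero) / (len(data) - below_zero)


def truncated_kde_icdf(u, data, h, tol=1e-8):
    """Invert the truncated CDF by bisection for u in (0, 1)."""
    lo, hi = 0.0, max(data) + 10.0 * h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_kde_cdf(mid, data, h) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical wet-day rainfall amounts (mm/day) and bandwidth (mm)
wet_day_amounts = [0.4, 1.2, 2.5, 3.1, 5.0, 7.8, 12.4, 20.3, 35.6, 61.0]
h = 4.0
median_amount = truncated_kde_icdf(0.5, wet_day_amounts, h)
print(round(median_amount, 2))
```

Passing a uniform random number as `u` yields one simulated wet-day amount, which is always positive by construction.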
Figure 2

Detailed overview of the proposed downscaling methods. (a) KDE-Ext and (b) KDE-SP. Here, PEROBS = probabilities of extreme rainfall (PER) for the observed baseline period (OBP), TPMOBP = transitional probability matrix for OBP, PERBP = PER of GCM data for the baseline period (BP), PERSP = PER of GCM data for the scenario period (SP), TPMBP = TPM of GCM data for the BP, TPMSP = TPM of GCM data for the SP, P-TPM = perturbed TPM, PFTPM = perturbation factor (PF) of the TPM, PFPER = PF of the PER, P-PER = perturbed PER, μc = random number (RN) for rainfall occurrence model, μr = RN for rainfall amount model, μke = RN for generation of ER amount using kernel distribution, μkne = RN for generation of non-ER amount using kernel distribution, μk = RN for generation of rainfall amount using kernel distribution, iCDFke = inverse cumulative kernel distribution of ER, iCDFk = inverse cumulative kernel distribution of rainfall.


Model validation

In order to validate the downscaling procedure, the downscaling models are run following the procedure as outlined in Figure 2 to project the daily rainfall time series for the VP (1984–2013), taking the GCM data and observed data for the BP (1969–1998). The models have generated 1,000 independent realizations of daily rainfall of length equal to the baseline historical rainfall record. All the statistics shown in this article are the average of all 1,000 independent realizations of daily rainfall series and MMEA. The performance of both models is assessed based on daily and monthly statistics to determine their ability to replicate the observed rainfall characteristics.

The statistical measures (R2, RMSE, MPE, and MAPE) and the hypothesis tests used to evaluate the two downscaling models are shown in Table 2; they generally indicate a good correlation between the statistics obtained from the observed and simulated series during the VP for both models. The four null hypothesis tests listed in Table 2 were carried out at a significance level of 0.05 (i.e., a 95% confidence level). In Table 2, the p-values in columns 3–6 represent the probability of observing a test statistic at least as extreme as the observed value under the null hypothesis. For almost all statistics, the null hypothesis cannot be rejected at the 0.05 level for any of the four tests, which means that the observed and simulated values of the statistics listed in the first column of Table 2 can be regarded as coming from the same population; the exception is the daily median from the KDE-Ext model, for which the t-test and the Wilcoxon rank sum test both give p = 0.04. Our primary focus is to create a model capable of simulating ER well. Table 2 shows that both models simulate ER well, as the statistics for the extreme-rainfall series (the AMP rows) are satisfactory.
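For reference, the four statistical measures can be computed as in the sketch below. The exact definitions used in the study are not stated, so the sketch assumes R2 is the squared Pearson correlation and that percentage errors are computed as (simulated − observed)/observed; the input values are hypothetical.

```python
import math


def evaluation_metrics(observed, simulated):
    """Return R2, RMSE, MPE (%), and MAPE (%) for paired observed/simulated values."""
    n = len(observed)
    errors = [s - o for o, s in zip(observed, simulated)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mpe = 100.0 * sum(e / o for e, o in zip(errors, observed)) / n
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    r2 = cov * cov / (var_o * var_s)       # squared Pearson correlation (assumed)
    return {"R2": r2, "RMSE": rmse, "MPE": mpe, "MAPE": mape}


# Hypothetical observed and simulated values of one statistic
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [9.0, 18.0, 27.0, 36.0]
metrics = evaluation_metrics(observed, simulated)
print({k: round(v, 3) for k, v in metrics.items()})
```

Under these definitions, MAPE equals the magnitude of MPE whenever every simulated value lies on the same side of its observed value, which is the pattern seen in the GEV-distributed AMP rows of Table 2.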

Daily IDF curves have been constructed from the outputs simulated by the downscaling models for the VP (1984–2013) and compared with the IDF curves obtained from the observed baseline (1969–1998) and VP (1984–2013) data series. In this study, all the IDF curves are drawn for a duration of 24 h (one day). The comparison of the IDF curves for the KDE-SP and KDE-Ext models is shown in Figures 3(a) and 3(b) and Figures 3(c) and 3(d), respectively. Using the empirical distribution, the daily IDF curve from the simulated outputs matches the IDF curves from both the observed BP and VP data series well for return periods (RP) up to 30 years. For the GEV-distributed data series, the simulated IDF curves show a good fit at lower RP, but at high RP they match the observed BP curve better than the observed VP curve.
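As an illustration of how an empirical return period can be attached to each annual maximum, the sketch below uses the Weibull plotting position, T = (n + 1)/m. The plotting position actually used in this study is not stated in this section, so that choice is an assumption, as is the function name.

```python
def empirical_return_periods(annual_maxima):
    """Pair each annual maximum with its empirical return period in years."""
    n = len(annual_maxima)
    ranked = sorted(annual_maxima, reverse=True)  # rank 1 = largest value
    # Assumption: Weibull plotting position, exceedance probability m / (n + 1).
    return [((n + 1) / m, value) for m, value in enumerate(ranked, start=1)]

# Illustrative (made-up) annual maximum daily rainfall values.
pairs = empirical_return_periods([50.0, 120.0, 80.0, 95.0])
```

Dividing each daily value by 24 h would convert it to a mean intensity for the 24-h duration used in the IDF curves.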
Figure 3

Daily IDF curves after ensemble averaging from all 10 GCMs using both downscaling methods. (a) and (c) show empirically distributed curves for RP from the KDE-SP and KDE-Ext models, respectively. (b) and (d) show the GEV-distributed curves from the KDE-SP and KDE-Ext models, respectively. Here, projected blue lines indicate the simulated ensemble average outputs of the models during the specified period. Also, Obs and Obs.VP represent the observed data and OVP data during the specified period, respectively.

In Figure 4, a shaded area is drawn using a 90% CI band of GEV-distributed observed VP of annual maximum rainfall series (AMRs). The IDF curves obtained from simulated rainfall series from both downscaling methods during the VP are also drawn in Figure 4. All the ensembled IDF curves simulated from ten GCM outputs fall within the confidence band, indicating that the model performs well in simulating ER as the simulated outputs are within the acceptable limit.
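For reference, a GEV-based IDF point at return period T is the (1 − 1/T) quantile of the fitted distribution. The expression below is the standard GEV distribution function and its quantile, written with assumed symbols μ (location), σ (scale), and ξ (shape); the parameter estimates and the estimation method used in this study are not restated in this section.

```latex
F(x) = \exp\left\{-\left[1 + \xi\,\frac{x-\mu}{\sigma}\right]^{-1/\xi}\right\},
\qquad
x_T = \mu + \frac{\sigma}{\xi}\left[\left(-\ln\left(1-\frac{1}{T}\right)\right)^{-\xi} - 1\right],
\quad \xi \neq 0 .
```

A confidence band such as the 90% band in Figure 4 reflects the sampling uncertainty of these fitted parameters at each T.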
Figure 4

Daily IDF curves for all GCMs and ensemble average outputs from all 10 GCMs using downscaling methods (a) KDE-SP and (b) KDE-Ext. The shaded area shows the 90% CI band of GEV-distributed observed AMRs during VP. Here, Obs and Obs.VP represent the observed data and observed VP data during the specified period, respectively.

The simulated ensemble average dry and wet spell lengths were checked against the observed BP and VP spell lengths, as shown in Figure 5. Both the KDE-SP and KDE-Ext methods preserve the dry and wet spell characteristics satisfactorily. The simulated ensemble number of wet and dry days per month was calculated for both models and compared with the observed BP and VP values, as shown in Figure 6. The results from both models closely match the observed numbers of dry and wet days for each month.
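The spell-length check can be illustrated with a short sketch that extracts consecutive runs of wet and dry days from a daily series. The wet-day threshold is a hypothetical parameter here (the threshold used in the study is not given in this section), and the function name is illustrative only.

```python
from itertools import groupby

def mean_spell_lengths(rain_mm, wet_threshold=0.0):
    """Return (mean wet spell length, mean dry spell length) in days."""
    # Assumption: a day is wet when its rainfall exceeds `wet_threshold`.
    wet_flags = [amount > wet_threshold for amount in rain_mm]
    wet_spells, dry_spells = [], []
    for is_wet, run in groupby(wet_flags):
        length = sum(1 for _ in run)  # number of consecutive days in this run
        (wet_spells if is_wet else dry_spells).append(length)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(wet_spells), mean(dry_spells)

# Illustrative (made-up) daily rainfall series in mm.
wet_mean, dry_mean = mean_spell_lengths([0, 0, 5, 3, 0, 12, 0, 0, 0])
```

Applying the same function to the observed and simulated series, and averaging the simulated result over realizations and GCMs, gives the kind of comparison shown in Figure 5.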
Figure 5

The ensemble average spell length of dry and wet days simulated by the KDE-Ext (a, b) and KDE-SP (c, d) models. The Y-axis represents the average number of wet or dry spell lengths. Here, Obs represents the observed data during the specified period.

Figure 6

The ensemble average outputs for the number of dry and wet days in each month, simulated by the KDE-Ext (a, b) and KDE-SP (c, d) models. The Y-axis represents the average number of wet or dry days in each month from January–December. Here, SEA represents the simulated ensemble average outputs of the models, and Obs represents the observed data.


Future projection

It is vital to project IDF relationships under climate change with reliable and precise approaches so that urban infrastructure can be built to cope with probable changes in flood frequency. Future projections were made for three time periods (NF, MF, and FF) using seven SSPs from the outputs of ten GCMs with both downscaling methods. The ensemble IDF curves for all SSPs and all three future periods are shown in Figures 7 (KDE-Ext) and 8 (KDE-SP), which also allow comparison between SSPs within each time slice and show the scenario uncertainty. The GCM, or inter-model, uncertainty arising from multiple GCMs is addressed using MMEA. The scenario uncertainty arising from the use of several SSPs was not quantified in this study. The projected daily IDF curves are compared with the observed BP IDF curve to identify the changes in the future projected rainfall series under the various emission scenarios. These changes are quantified as percentage changes, as shown in Table 3.
Table 3

Percentage changes of projected rainfall using empirical distribution in the NF, MF, and FF with respect to observed BP rainfall using both downscaling methods for all seven SSPs

Percentage change with respect to observed (1969–1998) rainfall; columns are RP in years for KDE-Ext and KDE-SP.

| Scenario | Period | KDE-Ext 2 | KDE-Ext 5 | KDE-Ext 10 | KDE-Ext 15 | KDE-Ext 30 | KDE-SP 2 | KDE-SP 5 | KDE-SP 10 | KDE-SP 15 | KDE-SP 30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Obs (a) | | 4.38 | 5.99 | 11.06 | 13.16 | 14.32 | 4.38 | 5.99 | 11.06 | 13.16 | 14.32 |
| VP | | 17.60 | 26.05 | −7.22 | −10.87 | −6.04 | 15.75 | 24.83 | −7.48 | −11.19 | −6.97 |
| SSP1-1.9 | NF | 18.10 | 22.02 | −13.04 | −14.33 | −2.93 | 16.17 | 20.25 | −13.17 | −14.06 | −3.57 |
| | MF | 25.22 | 36.41 | 1.16 | −1.64 | 5.83 | 23.56 | 35.32 | 0.59 | −1.95 | 3.80 |
| | FF | 22.48 | 34.24 | −3.13 | −7.28 | −0.93 | 20.23 | 32.91 | −3.92 | −7.65 | −2.72 |
| SSP1-2.6 | NF | 27.65 | 34.16 | −0.53 | −3.43 | 3.07 | 26.43 | 33.27 | −0.25 | −2.97 | 2.41 |
| | MF | 39.42 | 54.50 | 19.11 | 17.15 | 26.11 | 37.57 | 52.82 | 18.69 | 16.69 | 25.29 |
| | FF | 26.43 | 44.09 | 6.86 | 4.63 | 13.46 | 24.69 | 43.43 | 6.91 | 4.46 | 12.32 |
| SSP2-4.5 | NF | 18.27 | 24.56 | −10.41 | −13.85 | −8.41 | 16.45 | 23.07 | −11.23 | −14.51 | −9.52 |
| | MF | 23.33 | 33.77 | −2.27 | −6.55 | −1.64 | 21.05 | 30.70 | −4.15 | −7.80 | −3.37 |
| | FF | 27.37 | 32.09 | −3.62 | −4.57 | 8.50 | 25.61 | 31.94 | −3.62 | −4.18 | 8.46 |
| SSP3-7.0 | NF | 52.76 | 71.31 | 36.25 | 33.28 | 40.29 | 36.75 | 53.23 | 15.74 | 16.91 | 30.37 |
| | MF | 74.24 | 83.43 | 51.86 | 42.48 | 41.94 | 55.02 | 78.63 | 32.73 | 30.46 | 40.53 |
| | FF | 41.84 | 62.13 | 28.95 | 32.24 | 48.41 | 41.37 | 57.49 | 26.72 | 29.10 | 44.30 |
| SSP4-3.4 | NF | 34.81 | 33.18 | −6.62 | −10.25 | −3.97 | 34.72 | 32.85 | −5.89 | −10.04 | −4.56 |
| | MF | 34.08 | 39.99 | −0.31 | −6.82 | −2.84 | 33.24 | 37.62 | −0.40 | −7.34 | −6.66 |
| | FF | 39.29 | 46.53 | 2.93 | −3.05 | 2.03 | 36.60 | 44.61 | 1.98 | −4.84 | −1.45 |
| SSP4-6.0 | NF | 23.24 | 21.00 | −15.71 | **−19.51** | −13.57 | 21.90 | 19.83 | −16.47 | **−20.70** | −14.99 |
| | MF | 36.53 | 41.21 | 3.23 | −0.56 | 4.69 | 35.35 | 39.86 | 4.21 | −0.26 | 3.53 |
| | FF | 50.21 | 46.46 | −2.49 | −8.63 | −2.91 | 49.85 | 44.23 | −3.39 | −9.20 | −3.97 |
| SSP5-8.5 | NF | 21.62 | 35.66 | 2.65 | −9.84 | 6.96 | 19.39 | 29.89 | −1.67 | −3.51 | 2.79 |
| | MF | 47.99 | 59.34 | 12.83 | 2.84 | 14.87 | 34.05 | 48.68 | 12.78 | 7.94 | 12.56 |
| | FF | 75.27 | **93.21** | 52.83 | 38.66 | 62.50 | 61.72 | **80.93** | 39.84 | 38.98 | 51.24 |

(a) Obs represents the observed BP outputs.

The bold values signify the maximum and minimum percentage changes.

Figure 7

Future projected daily IDF curves after MMEA of all 10 GCM outputs using the KDE-Ext downscaling method for empirical distribution (a–c) and GEV distribution (d–f) for three time slices arranged chronologically. Here, Obs represents the observed data.

Figure 8

Future projected daily IDF curves after MMEA of all 10 GCM outputs using the KDE-SP downscaling method for empirical distribution (a–c) and GEV distribution (d–f) for three time periods arranged chronologically. Here, Obs represents the observed data.


The changes presented in Table 3 are calculated from the empirically distributed AMP series over each 30-year period. For both downscaling methods, all SSPs show larger increments at lower-order RP than at higher-order RP. The most extreme situation may occur under the SSP3-7.0, SSP1-2.6, and SSP5-8.5 scenarios in the NF, MF, and FF, respectively. SSP1-2.6 shows a major increasing trend in the MF, which means ER may occur in the middle of the century under SSP1-2.6. The other SSPs show an increasing trend in ER from the NF to the FF. The maximum increasing trend is observed under the SSP5-8.5 scenario. The highest increment occurs in the FF at the 5-year RP, at 93.21% for KDE-Ext and 80.93% for KDE-SP. Decreasing ER intensities are also found in some scenarios, especially at high RP, for both downscaling methods. The maximum negative deviation in ER intensity is observed under the SSP4-6.0 scenario in the NF at the 15-year RP, at about 20% for both models. Daily IDF curves using the GEV distribution were also obtained for RP up to 100 years, and the changes were similar to those from the empirical distribution.
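The arithmetic behind a percentage change such as those in Table 3 can be sketched as follows. The order of operations (averaging across GCMs first, then comparing with the observed value) is an assumption, and the function names, GCM labels, and numbers are hypothetical.

```python
def ensemble_mean(values_by_gcm):
    """Average one quantity (e.g. a return level) over all GCMs."""
    values = list(values_by_gcm.values())
    return sum(values) / len(values)

def percentage_change(projected, observed):
    """Change of `projected` relative to `observed`, in percent."""
    return 100.0 * (projected - observed) / observed

# Illustrative (made-up) return levels for one RP from two hypothetical GCMs.
projected = ensemble_mean({"gcm_a": 10.0, "gcm_b": 14.0})
change = percentage_change(projected, observed=10.0)
```

A positive result indicates an intensification relative to the observed BP value, and a negative result a reduction.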

Figure 6 shows that the simulated number of wet and dry days in every month is close to both the observed BP and VP values. However, the simulated numbers are closer to the OBP than to the OVP. The simulated number of wet and dry days is obtained from the RODM, which is based on the two-state, first-order Markov chain model. The model is therefore more skilled at replicating the BP data, because it is a stationary model. This is a limitation of the two-state, first-order Markov chain model. However, the simulated counts also depend on how much the observed numbers of wet and dry days differ between the BP and VP, and on the PF, which comes from the GCM outputs.
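A two-state, first-order Markov chain for rainfall occurrence can be sketched as below. The sketch only illustrates the general technique: the probability values, the multiplicative form of the perturbation, and the names `simulate_occurrence` and `perturb` are assumptions rather than details taken from this study.

```python
import random

def simulate_occurrence(n_days, p01, p11, start_wet=False, rng=None):
    """Simulate a wet (1) / dry (0) series from a two-state first-order Markov chain.

    p01 = P(wet today | dry yesterday), p11 = P(wet today | wet yesterday).
    """
    rng = rng or random.Random()
    state = 1 if start_wet else 0
    series = []
    for _ in range(n_days):
        p_wet = p11 if state == 1 else p01  # depends only on yesterday's state
        state = 1 if rng.random() < p_wet else 0
        series.append(state)
    return series

def perturb(prob, factor):
    """Assumed multiplicative perturbation of a probability, clipped to [0, 1]."""
    return min(1.0, max(0.0, prob * factor))

# Illustrative (made-up) transition probabilities and perturbation factors.
future = simulate_occurrence(365, perturb(0.2, 1.1), perturb(0.6, 0.95),
                             rng=random.Random(1))
```

Because the transition probabilities stay fixed over the whole simulation, the generated series is stationary, which is the limitation noted above.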

The results in Table 2 suggest that both models performed well, simulating outputs similar to the observed historical data. The R2 values of the ER series for both models lie between 0.96 and 0.98, which means that the RADM can generate ER values like those of the OVP rainfall series and supports the study's focus on the simulation of ER. Figure 4 also shows that both models can simulate ER values like those of the OVP rainfall series. Although both models slightly overestimated the OVP extremes and slightly underestimated the OBP extremes, the ensemble average of the extreme values lies between the OVP and OBP extremes. With respect to OVP ER, the most acceptable GCMs are 'TaiESM1', followed by 'CMCC-CM2-SR5' and 'E3SM-1-0', according to Figure 3.

The maximum increasing trend is observed under the SSP5-8.5 scenario for both the empirical and GEV distributions. This scenario combines high mitigation challenges with low adaptation challenges (Riahi et al. 2017) and follows the highest trajectories of radiative forcing (W/m2), global mean temperature, and global CO2 emissions (Gidden et al. 2019). These findings are consistent with Maity & Maity (2022), who reported a significant increase of about 41–44% in hourly rainfall intensity under the SSP585 scenario. For both downscaling methods, the maximum percentage increment under SSP5-8.5 occurs at the 5-year RP for the empirical distribution and at the 2-year RP for the GEV distribution. These increments in ER may significantly affect the design of urban drainage networks in the study area, as 2–5-year RP IDF curves are generally used. Mainly the SSP1-2.6, SSP3-7.0, and SSP5-8.5 scenarios indicate significant climate change in the future period under ER conditions. Crévolin et al. (2023) simulated IDF curves using the Quantile-Quantile Downscaling method for 30 major cities in Canada. They found that most cities may experience high-intensity storms, with an average increment of around 30% under SSP2-4.5 and 40% under SSP5-8.5 between 2071 and 2100, whereas this study found a maximum increment of 32.09% under SSP2-4.5 in the FF and a range of 38.66–93.21% in ER under SSP5-8.5 in the FF. Xu et al. (2024) observed maximum increments of 40, 31, 27, and 22% in the AMR under SSP1-2.6, SSP5-8.5, SSP2-4.5, and SSP3-7.0, respectively, in the study area of Barranquilla, Colombia, which is similar to the findings of this study. Halder & Saha (2021) simulated IDF curves for the same study location (Alipore IMD) using the quantile perturbation downscaling method with CMIP6 data. Like this study, they found a significant increase in ER intensity for most of the GCMs and SSPs in future periods.

Comparing the two downscaling methods on the validation results, both give similar results overall. The KDE-SP model simulates the values well at lower-order RP, whereas the KDE-Ext model simulates the values well at higher-order RP. The KDE-Ext model is therefore preferred when the expected design life of the infrastructure system is very long. In general, however, the KDE-SP model can be used, as it gives better results at lower-order RP and still gives good results at higher-order RP. The KDE-SP model also has the advantage of being simpler than KDE-Ext.

This study has presented two reasonably simple semi-parametric single-site rainfall simulation models that can generate long synthetic daily rainfall sequences and reflect both the short- and long-term variability present in the observed historical record. Rainfall occurrence is replicated using a first-order Markov model and its transition probabilities. The at-site daily rainfall amount on each day identified by the classification model as a rainy or extreme rainy day is simulated using the univariate KDE model. Because the approach uses rainfall data from GCM outputs as predictor variables, it can be applied to simulate rainfall in a changing climate as projected by GCMs.

In the literature, weather generators with a Markovian framework use parametric distributions to estimate rainfall amounts. The main disadvantage of this approach is that ER cannot be simulated properly using a single distribution. In addition, the same distribution cannot be applied universally. Even when different distributions are used for extreme and non-extreme rainfall, the method cannot replicate the observed rainfall properly. The methods proposed in this study are also based on the Markovian framework but use a nonparametric distribution, which is more robust because it can be applied to all types of rainfall data. The distribution can also be adjusted to the data type by changing the bandwidth. The results indicate that a single kernel distribution (the KDE-SP model) is enough to simulate extreme and non-extreme rainfall accurately for different types of rainfall data, which is not possible with a single parametric distribution.
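The role of the bandwidth and of a positive support can be illustrated with a small kernel sampler. This is a sketch under stated assumptions, not the procedure used in this study: it uses a Gaussian kernel, Silverman's rule-of-thumb bandwidth, a smoothed bootstrap in place of the inverse-CDF sampling shown in Figure 2, and simple rejection of non-positive draws to keep the support positive. All names and data values are hypothetical.

```python
import random
import statistics

def silverman_bandwidth(data):
    """Silverman's rule-of-thumb bandwidth for a Gaussian kernel."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(sd, (q3 - q1) / 1.34) if q3 > q1 else sd
    return 0.9 * spread * n ** (-0.2)

def sample_positive_kde(data, size, bandwidth=None, rng=None):
    """Draw `size` values from a Gaussian KDE of `data`, restricted to x > 0.

    `data` is assumed to hold strictly positive wet-day rainfall amounts.
    """
    rng = rng or random.Random()
    h = bandwidth if bandwidth is not None else silverman_bandwidth(data)
    draws = []
    while len(draws) < size:
        # Smoothed bootstrap: pick an observation, then add kernel noise.
        x = rng.choice(data) + rng.gauss(0.0, h)
        if x > 0.0:  # rejection step keeps the support positive
            draws.append(x)
    return draws

# Illustrative (made-up) wet-day amounts in mm.
wet_days = [2.5, 10.0, 4.2, 33.0, 7.1, 1.2, 15.6, 60.3]
amounts = sample_positive_kde(wet_days, 200, rng=random.Random(0))
```

A larger bandwidth spreads more probability beyond the largest observed amount, which is how a kernel model can generate extremes that do not appear in the historical record.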

In the validation stage, the RODM performed very well in both downscaling methods, as reflected in the checks on spell length and on the number of wet and dry days in each month during the OVP. The RADM also performed very well, as reflected in the comparison between the 90% CI band (for the IDF curve) of the GEV-distributed observed AMRs during the OVP and the model-predicted IDF curves for the same period.

The daily rainfall intensity is projected over three time periods in the 21st century, each 30 years long, and presented as daily IDF curves, which show considerable changes. The maximum changes are observed at the end of the century, i.e., in the FF. The rainfall is projected according to the different GCM scenarios available in CMIP6. Different scenarios give different results, but SSP5-8.5 shows the maximum increasing trend. These results can be used for the design of different hydraulic components. It is very difficult to suggest a particular scenario for design purposes. The choice of scenario may depend on the design requirements, the accepted level of risk, the design life of the structure, the climate of the location, and the assessment of climate change.

The most important objective of the study was to simulate ER accurately, and both models achieve this. The rainfall sequences developed for future emission scenarios will be an important source of data for research on how climate change will affect regional hydrology. In the present downscaling methods, observed and GCM rainfall data series are used to simulate both the rainfall state and amounts. There is scope for incorporating other climate variables along with the rainfall data, which may capture greater variability and thus make the simulation of rainfall state and amounts more robust.

The authors acknowledge the Department of Civil Engineering, Indian Institute of Engineering Science and Technology (IIEST), Shibpur, for providing the infrastructure support. The authors also acknowledge the India Meteorological Department (IMD) for providing the required meteorological data for research purposes.

The authors declare that no funding was received for doing this research.

S. H. conceptualized the whole article, developed the methodology and software, rendered support in formal analysis, wrote the original draft, and edited the article. U. S. supervised the article, and reviewed and edited the article.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Bardossy
A.
&
Plate
E. J.
(
1991
)
Modeling daily rainfall using a semi-Markov representation of circulation pattern occurrence
,
Journal of Hydrology
,
122
(
1–4
),
33
47
.
https://doi.org/10.1016/0022-1694(91)90170-M
.
Chandra
R.
,
Saha
U.
&
Mujumdar
P. P.
(
2015
)
Model and parameter uncertainty in IDF relationships under climate change
,
Advances in Water Resources
,
79
,
127
139
.
https://doi.org/10.1016/j.advwatres.2015.02.011
.
Crévolin
V.
,
Hassanzadeh
E.
&
Bourdeau-Goulet
S.-C.
(
2023
)
Updating the intensity-duration-frequency curves in major Canadian cities under changing climate using CMIP5 and CMIP6 model projections
,
Sustainable Cities and Society
,
92
,
104473
.
https://doi.org/10.1016/j.scs.2023.104473
.
Desamsetti
S.
,
Rani
S. I.
,
Mallick
S.
,
Gupta
D.
,
George
J. P.
&
Rajagopal
E. N.
(
2016
)
Comparison of NCMRWF and ECMWF Archives of Conventional Meteorological Observations
.
National Centre for Medium Range Weather Forecasting, Government of India, Noida
.
Dey
A.
,
Sahoo
D. P.
,
Kumar
R.
&
Remesan
R.
(
2022
)
A multimodel ensemble machine learning approach for CMIP6 climate model projections in an Indian river basin
,
International Journal of Climatology
,
42
(
16
),
9215
9236
.
https://doi.org/10.1002/joc.7813
.
Fowler
H. J.
,
Blenkinsop
S.
&
Tebaldi
C.
(
2007
)
Linking climate change modelling to impacts studies: Recent advances in downscaling techniques for hydrological modelling
,
International Journal of Climatology
,
27
(
12
),
1547
1578
.
https://doi.org/10.1002/joc.1556
.
Gabriel
K. R.
&
Neumann
J.
(
1962
)
A Markov chain model for daily rainfall occurrence at Tel Aviv
,
Quarterly Journal of the Royal Meteorological Society
,
88
(
375
),
90
95
.
https://doi.org/10.1002/qj.49708837511
.
Ghosh
S.
&
Mujumdar
P. P.
(
2006
)
Future
rainfall scenario over Orissa with GCM projections by statistical downscaling
,
Current Science
,
90
(
3
),
396
404
.
Gidden
M. J.
,
Riahi
K.
,
Smith
S. J.
,
Fujimori
S.
,
Luderer
G.
,
Kriegler
E.
,
van Vuuren
D. P.
,
van den Berg
M.
,
Feng
L.
,
Klein
D.
,
Calvin
K.
,
Doelman
J. C.
,
Frank
S.
,
Fricko
O.
,
Harmsen
M.
,
Hasegawa
T.
,
Havlik
P.
,
Hilaire
J.
,
Hoesly
R.
,
Horing
J.
,
Popp
A.
,
Stehfest
E.
&
Takahashi
K.
(
2019
)
Global emissions pathways under different socioeconomic scenarios for use in CMIP6: A dataset of harmonized emissions trajectories through the end of the century
,
Geoscientific Model Development
,
12
(
4
),
1443
1475
.
https://doi.org/10.5194/gmd-12-1443-2019
.
Giorgi
F.
&
Mearns
L. O.
(
1991
)
Approaches to the simulation of regional climate change: A review
,
Reviews of Geophysics
,
29
(
2
),
191
216
.
https://doi.org/10.1029/90RG02636
.
Gramacki
A.
(
2018
)
Nonparametric Kernel Density Estimation and Its Computational Aspects
.
 
Vol. 37
,
Cham
:
Springer International Publishing
.
https://doi.org/10.1007/978-3-319-71688-6
.
Haan
C. T.
,
Allen
D. M.
&
Street
J. O.
(
1976
)
A Markov chain model of daily rainfall
,
Water Resources Research
,
12
(
3
),
443
449
.
https://doi.org/10.1029/WR012i003p00443
.
Halder
S.
&
Saha
U.
(
2021
)
Future projection of extreme rainfall for flood management due to climate change in an urban area
,
Journal of Sustainable Water in the Built Environment
,
7
(
3
).
https://doi.org/10.1061/jswbay.0000954
.
Härdle
W.
,
Werwatz
A.
,
Müller
M.
&
Sperlich
S.
(
2004
)
Nonparametric and Semiparametric Models
.
Berlin, Heidelberg
:
Springer
.
https://doi.org/10.1007/978-3-642-17146-8
.
Harrold
T. I.
,
Sharma
A.
&
Sheather
S. J.
(
2003a
)
A nonparametric model for stochastic generation of daily rainfall amounts
,
Water Resources Research
,
39
(
12
).
https://doi.org/10.1029/2003WR002570
.
Harrold
T. I.
,
Sharma
A.
&
Sheather
S. J.
(
2003b
)
A nonparametric model for stochastic generation of daily rainfall occurrence
,
Water Resources Research
,
39
(
10
).
https://doi.org/10.1029/2003WR002182
.
Hollander
M.
,
A. Wolfe
D.
&
Chicken
E.
(
2015
)
Nonparametric Statistical Methods
.
Wiley
.
https://doi.org/10.1002/9781119196037
.
Hughes
J. P.
&
Guttorp
P.
(
1994
)
A class of stochastic models for relating synoptic atmospheric patterns to regional hydrologic phenomena
,
Water Resources Research
,
30
(
5
),
1535
1546
.
https://doi.org/10.1029/93WR02983
.
Hughes
J. P.
,
Lettenmaier
D. P.
&
Guttorp
P.
(
1993
)
A stochastic approach for assessing the effect of changes in synoptic circulation patterns on gauge precipitation
,
Water Resources Research
,
29
(
10
),
3303
3315
.
https://doi.org/10.1029/93WR01066
.
Hundecha
Y.
&
Bárdossy
A.
(
2004
)
Modeling of the effect of land use changes on the runoff generation of a river basin through parameter regionalization of a watershed model
,
Journal of Hydrology
,
292
(
1–4
),
281
295
.
https://doi.org/10.1016/j.jhydrol.2004.01.002
.
Kannan
S.
&
Ghosh
S.
(
2011
)
Prediction of daily rainfall state in a river basin using statistical downscaling from GCM output
,
Stochastic Environmental Research and Risk Assessment
,
25
(
4
),
457
474
.
https://doi.org/10.1007/s00477-010-0415-y
.
Kannan
S.
&
Ghosh
S.
(
2013
)
A nonparametric kernel regression model for downscaling multisite daily precipitation in the Mahanadi basin
,
Water Resources Research
,
49
(
3
),
1360
1385
.
https://doi.org/10.1002/wrcr.20118
.
Kumar
K.
,
Verma
S.
,
Sahu
R.
&
Verma
M. K.
(
2023
)
Analysis of rainfall trends in India, incorporating non-Parametric tests and wavelet synopsis over the last 117 years
,
Journal of Environmental Informatics Letters
,
https://doi.org/10.3808/jeil.202300117
.
Lall
U.
&
Sharma
A.
(
1996
)
A nearest neighbor bootstrap for resampling hydrologic time series
,
Water Resources Research
,
32
(
3
),
679
693
.
https://doi.org/10.1029/95WR02966
.
Maity
S. S.
&
Maity
R.
(
2022
)
Changing pattern of intensity–duration–frequency relationship of precipitation due to climate change
,
Water Resources Management
,
36
(
14
),
5371
5399
.
https://doi.org/10.1007/s11269-022-03313-y
.
Mearns
L. O.
,
Giorgi
F.
,
Whetton
P.
,
Pabon
D.
,
Hulme
M.
&
Lal
M.
(
2003
)
Guidelines for Use of Climate Scenarios Developed From Regional Climate Model Experiments
.
Available from
: https://www.ipcc-data.org/guidelines/dgm_no1_v1_10-2003.pdf.
Mehrotra
R.
&
Sharma
A.
(
2005
)
A nonparametric nonhomogeneous hidden Markov model for downscaling of multisite daily rainfall occurrences
,
Journal of Geophysical Research
,
110
(
D16
),
D16108
.
https://doi.org/10.1029/2004JD005677
.
Mehrotra
R.
&
Sharma
A.
(
2006
)
A nonparametric stochastic downscaling framework for daily rainfall at multiple locations
,
Journal of Geophysical Research
,
111
(
D15
),
D15101
.
https://doi.org/10.1029/2005JD006637
.
Mehrotra
R.
&
Sharma
A.
(
2007
)
A semi-parametric model for stochastic generation of multi-site daily rainfall exhibiting low-frequency variability
,
Journal of Hydrology
,
335
(
1–2
),
180
193
.
https://doi.org/10.1016/j.jhydrol.2006.11.011
.
Mehrotra
R.
&
Sharma
A.
(
2010
)
Development and application of a multisite rainfall stochastic downscaling framework for climate change impact assessment
,
Water Resources Research
,
46
(
7
).
https://doi.org/10.1029/2009WR008423
.
Mujumdar
P. P.
&
Kumar
D. N.
(
2012
)
Floods in a Changing Climate: Hydrologic Modeling
.
Cambridge
:
Cambridge University Press
.
https://doi.org/10.1017/CBO9781139088428
.
Olsson
J.
,
Uvo
C. B.
,
Jinno
K.
,
Kawamura
A.
,
Nishiyama
K.
,
Koreeda
N.
,
Nakashima
T.
&
Morita
O.
(
2004
)
Neural networks for rainfall forecasting by atmospheric downscaling
,
Journal of Hydrologic Engineering
,
9
(
1
),
1
12
.
https://doi.org/10.1061/(ASCE)1084-0699(2004)9:1(1)
.
Pham
Q.
,
Yang
T.-C.
,
Kuo
C.-M.
,
Tseng
H.-W.
&
Yu
P.-S.
(
2019
)
Combing random forest and least square support vector regression for improving extreme rainfall downscaling
,
Water
,
11
(
3
),
451
.
https://doi.org/10.3390/w11030451
.
Pham
H. X.
,
Shamseldin
A. Y.
&
Melville
B. W.
(
2021
)
Projection of future extreme precipitation: A robust assessment of downscaled daily precipitation
,
Natural Hazards
,
107
(
1
),
311
329
.
https://doi.org/10.1007/s11069-021-04584-1
.
Rajagopalan
B.
&
Lall
U.
(
1999
)
A k-nearest-neighbor simulator for daily precipitation and other weather variables
,
Water Resources Research
,
35
(
10
),
3089
3101
.
https://doi.org/10.1029/1999WR900028
.
Rajagopalan
B.
,
Lall
U.
&
Tarboton
D. G.
(
1996
)
Nonhomogeneous Markov model for daily precipitation
,
Journal of Hydrologic Engineering
,
1
(
1
),
33
40
.
https://doi.org/10.1061/(ASCE)1084-0699(1996)1:1(33)
.
Raju
K. S.
&
Kumar
D. N.
(
2020
)
Review of approaches for selection and ensembling of GCMs
,
Journal of Water and Climate Change
,
11
(
3
),
577
599
.
https://doi.org/10.2166/wcc.2020.128
.
Riahi
K.
,
van Vuuren
D. P.
,
Kriegler
E.
,
Edmonds
J.
,
O'Neill
B. C.
,
Fujimori
S.
,
Bauer
N.
,
Calvin
K.
,
Dellink
R.
,
Fricko
O.
,
Lutz
W.
,
Popp
A.
,
Cuaresma
J. C.
,
Leimbach M
K. C. S.
,
Jiang
L.
,
Kram
T.
,
Rao
S.
,
Emmerling
J.
,
Ebi
K.
,
Hasegawa
T.
,
Havlik
P.
,
Humpenöder
F.
,
Da Silva
L. A.
,
Smith
S.
,
Stehfest
E.
,
Bosetti
V.
,
Eom
J.
,
Gernaat
D.
,
Masui
T.
,
Rogelj
J.
,
Strefler
J.
,
Drouet
L.
,
Krey
V.
,
Luderer
G.
,
Harmsen
M.
,
Takahashi
K.
,
Baumstark
L.
,
Doelman
J. C.
,
Kainuma
M.
,
Klimont
Z.
,
Marangoni
G.
,
Lotze-Campen
H.
,
Obersteiner
M.
,
Tabeau
A.
&
Tavoni
M.
(
2017
)
The shared socioeconomic pathways and their energy, land use, and greenhouse gas emissions implications: An overview
,
Global Environmental Change
,
42
,
153
168
.
https://doi.org/10.1016/j.gloenvcha.2016.05.009
.
Richardson
C. W.
(
1981
)
Stochastic simulation of daily precipitation, temperature, and solar radiation
,
Water Resources Research
,
17
(
1
),
182
190
.
https://doi.org/10.1029/WR017i001p00182
.
Sahu
R. T.
,
Verma
S.
,
Verma
M. K.
&
Ahmad
I.
(
2023
)
Characterizing spatiotemporal properties of precipitation in the middle Mahanadi subdivision, India during 1901–2017
,
Acta Geophysica
,
72
(
2
),
1143
1158
.
https://doi.org/10.1007/s11600-023-01085-6
.
Salvi
K.
,
Kannan
S.
&
Ghosh
S.
(
2013
)
High-resolution multisite daily rainfall projections in India with statistical downscaling for climate change impacts assessment
,
Journal of Geophysical Research: Atmospheres
,
118
(
9
),
3557
3578
.
https://doi.org/10.1002/jgrd.50280
.
Scott
D. W.
(
2015
)
Multivariate Density Estimation
.
Wiley
.
https://doi.org/10.1002/9781118575574
.
Sharma
A.
,
Tarboton
D. G.
&
Lall
U.
(
1997
)
Streamflow simulation: A nonparametric approach
,
Water Resources Research
,
33
(
2
),
291
308
.
https://doi.org/10.1029/96WR02839
.
Shashikanth
K.
&
Ghosh
S.
(
2013
)
Fine resolution Indian summer monsoon rainfall projection with statistical downscaling
,
International Journal of Chemical, Environmental & Biological Sciences
,
1
(
4
),
615
618
.
Shashikanth
K.
,
Ghosh
S.
&
Karmakar
S.
(
2018
)
Future projections of Indian summer monsoon rainfall extremes over India with statistical downscaling and its consistency with observed characteristics
,
Climate Dynamics
,
51
(
1–2
),
1
15
.
https://doi.org/10.1007/s00382-017-3604-2
.
Silverman
B. W.
(
2018
)
Density Estimation for Statistics and Data Analysis
.
Routledge
.
https://doi.org/10.1201/9781315140919
.
Tavakolifar H., Shahghasemi E. & Nazif S. (2017) Evaluation of climate change impacts on extreme rainfall events characteristics using a synoptic weather typing-based daily precipitation downscaling model, Journal of Water and Climate Change, 8 (3), 388–411. https://doi.org/10.2166/wcc.2017.107.
Verma S., Kumar K., Verma M. K., Prasad A. D., Mehta D. & Rathnayake U. (2023) Comparative analysis of CMIP5 and CMIP6 in conjunction with the hydrological processes of reservoir catchment, Chhattisgarh, India, Journal of Hydrology: Regional Studies, 50, 101533. https://doi.org/10.1016/j.ejrh.2023.101533.
Verma S., Prasad A. D. & Verma M. K. (2024) A framework for the evaluation of MRP complex precipitation in a CORDEX-SA regional climate applied to REMO, International Journal of Hydrology Science and Technology, 17 (1), 17–45. https://doi.org/10.1504/IJHST.2024.135125.
Walton D. B., Sun F., Hall A. & Capps S. (2015) A hybrid dynamical–statistical downscaling technique. Part I: Development and validation of the technique, Journal of Climate, 28 (12), 4597–4617. https://doi.org/10.1175/JCLI-D-14-00196.1.
Wilby R. L. (1994) Stochastic weather type simulation for regional climate change impact assessment, Water Resources Research, 30 (12), 3395–3403. https://doi.org/10.1029/94WR01840.
Wilby R. L. & Wigley T. M. L. (1997) Downscaling general circulation model output: A review of methods and limitations, Progress in Physical Geography: Earth and Environment, 21 (4), 530–548. https://doi.org/10.1177/030913339702100403.
Wilby R. L., Dawson C. W. & Barrow E. M. (2002) SDSM – A decision support tool for the assessment of regional climate change impacts, Environmental Modelling & Software, 17 (2), 145–157. https://doi.org/10.1016/S1364-8152(01)00060-3.
Wilks D. S. (1989) Conditioning stochastic daily precipitation models on total monthly precipitation, Water Resources Research, 25 (6), 1429–1439. https://doi.org/10.1029/WR025i006p01429.
Wilks D. S. (1992) Adapting stochastic weather generation algorithms for climate change studies, Climatic Change, 22 (1), 67–84. https://doi.org/10.1007/BF00143344.
Wilks D. S. (1999) Multisite downscaling of daily precipitation with a stochastic weather generator, Climate Research, 11, 125–136. https://doi.org/10.3354/cr011125.
Wilks D. S. (1999) Interannual variability and extreme-value characteristics of several stochastic daily precipitation models, Agricultural and Forest Meteorology, 93 (3), 153–169. https://doi.org/10.1016/S0168-1923(98)00125-7.
Wilks D. S. & Wilby R. L. (1999) The weather generation game: A review of stochastic weather models, Progress in Physical Geography: Earth and Environment, 23 (3), 329–357. https://doi.org/10.1177/030913339902300302.
Xu M., Bravo de Guenni L. & Córdova J. R. (2024) Climate change impacts on rainfall intensity–duration–frequency curves in local scale catchments, Environmental Monitoring and Assessment, 196 (4), 372. https://doi.org/10.1007/s10661-024-12532-2.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
