Statistical Downscaling of High Resolution Precipitation in India using Convolutional Long Short Term Memory Networks

Empirical-statistical downscaling methods are widely applied in weather and climate studies to assess the effect of global-scale climate conditions on local-scale climate variables. Statistical downscaling of General Circulation Model (GCM) simulations is widely used for assessing changes in future climate at different spatiotemporal scales. The downscaling process is affected by uncertainty associated with the choice of GCM and of downscaling model. This study proposes a novel Statistical Downscaling (SD) model based on a Convolutional Long Short Term Memory (ConvLSTM) network. The methodology is applied to obtain future projections of rainfall at 0.25° spatial resolution over the entire Indian sub-continental region. Traditional multisite downscaling models typically perform downscaling on a single homogeneous rainfall zone, predicting rainfall at only one grid point in a single model run. The proposed model captures spatiotemporal dependencies in multisite local rainfall and predicts rainfall for the entire zone in a single model run. The study further proposes a shared ConvLSTM model. This modeling framework shares a ConvLSTM model across multiple neighboring regions in order to capture the similarity in their rainfall patterns and customizes it to individual grid points. This novel downscaling approach provides a single end-to-end supervised model for predicting the future precipitation series for the whole of India. The model captures the regional variability in rainfall better than a region-wise trained model. The proposed methodology also outperforms the state-of-the-art LSTM and Kernel Regression (KR) based methodologies previously applied to the Indian sub-continental region. The future rainfall projected with the proposed downscaling model for different climate change scenarios reveals an overall increase in mean rainfall over India.
The changes in future rainfall extremes over India are spatially non-uniform, with a probable increase over the Western Ghats and northeastern India. At the same time, rainfall extremes are observed to decrease in northern and western India as well as along the southeastern coastline. These results highlight the importance of conducting in-depth hydrologic studies for the different river basins of the country for future water availability assessment and for framing water resource policies.


Introduction
The change in earth's climate and its associated impacts on different components of the ecosystem have drawn increasing attention in recent years. Unexpected spurts of extremely high temperature and precipitation and intensification of the hydrologic cycle are some of the globally experienced implications of climate change [IPCC, 2012]. India is no exception to this. The repercussions of climate change portray a bleak picture for India mainly for two reasons. Firstly, the agricultural activities forming the backbone of the Indian economy largely depend on the monsoon rainfall, so any change in the features of monsoon rainfall directly leads to changes in crop production that affect the livelihood of the greater part of the country. Secondly, the high population density of the country increases vulnerability to climate extremes such as heat and cold waves, heavy precipitation and floods, cyclones etc.
Knowledge of the long-term changes in climate conditions, scientifically termed climate projections, is highly important for generating the strategic knowledge needed to overcome such disastrous situations.
Well-informed planning policies conditioned on realistic and region-specific future projections form crucial information in dealing with the impact of climate change. The impacts of global-scale climate change at regional scale are in general evaluated by downscaling large-scale climate variables simulated by General Circulation Models (GCMs) [Prudhomme et al., 2002; IPCC, 1999]. Though GCMs are capable of projecting large-scale circulations and spatially uniform climate variables such as temperature and pressure with some skill, they often fail to capture spatially non-uniform fields such as precipitation [Hughes and Guttorp, 1994; Meehl et al., 2005]. At the same time, GCM outputs cannot be directly applied for impact assessment due to their coarse spatial resolution, ranging from 0.5° to 3°. Downscaling therefore becomes essential to obtain a realization of the regional-scale hydrometeorological variables.
The downscaling process aims at obtaining data at fine resolution with the help of the available coarse-scale information. A plethora of methodologies falls under this topic. These methodologies are broadly classified into Dynamical Downscaling (DD) and Statistical Downscaling (SD). DD involves running a physics-based, high-resolution Regional Climate Model (RCM) that takes inputs from a coarse-resolution GCM dataset. DD is computationally expensive. SD, in contrast, develops a statistical relationship between the large-scale atmospheric variables (predictors) and a fine-resolution surface variable of interest (predictand). SD models are usually computationally inexpensive.
Different SD techniques have been developed, which are broadly categorized into the following three classes: 1) weather generators, 2) weather typing and 3) transfer functions. Weather generators can be regarded as complex random number generators based on the observed structure of climate variables [Katz and Parlange, 1996]. Their outputs resemble daily weather data at a specific place [Wilks and Wilby, 1999; Soltani and Hoogenboom, 2003; Wilby and Dawson, 2007]. Weather typing methodologies group atmospheric circulation variables into different categories [Wilby, 1998]. This is a standard approach of statistical downscaling. The various downscaling methods differ mainly in the predictor variables, the choice of statistical transfer function, or the mathematical procedure. Linear and non-linear regression [Wilby et al., 2004], Support Vector Machine (SVM) [Vapnik et al., 1995], Artificial Neural Network (ANN) [Karamouz et al., 2009], canonical correlation [Conway et al., 1996] and Relevance Vector Machine (RVM) [Ghosh and Mujumdar, 2006] have been used to establish the relationship between predictor and predictand. Transfer functions, or regression, are popular due to their simplicity but cannot model variability and extreme events very well. Generalized Linear Model (GLM) [Yang et al., 2005], Markov chain models [Hughes et al., 1999], hidden Markov chain models [Bellone et al., 2000], spell length models [Lall et al., 1996], conditional random fields [Raje and Mujumdar, 2009], beta regression [Mandal et al., 2016], fuzzy logic based methodologies [Ghosh and Mujumdar, 2006], the Bayesian Joint Probability (BJP) modelling methodology [Robertson and Wang, 2009], Artificial Neural Network (ANN) based methods [Crane and Hewitson, 1998; Mondal and Mujumdar, 2012], and the Stochastic Space Random Cascade (SSRC) methodology for precipitation downscaling with the help of GCM data [Groppelli et al., 2011] are a few of the documented statistical downscaling approaches used for climate variable projections.
The SD models are credible in capturing certain properties of the observed target datasets.
However, some limitations of the data-driven approaches lower their overall skill. The previously developed SD methodologies based on kernel regression [Kannan et al., 2011], LSTM [Misra et al., 2017], Support Vector Machine [Ghosh, 2010] and CRF [Raje et al., 2009] show different skills in capturing the statistical properties and spatial structure of the ISMR rainfall. Most SD models underperform mainly because they flatten the gridded input data into one dimension before passing it to the mathematical modeling framework. Furthermore, the traditional multisite downscaling models typically perform downscaling on a single homogeneous rainfall zone, predicting rainfall at one grid point in a single model run.
Convolutional Neural Networks (CNNs) [Krizhevsky et al., 2012] were designed to deal with multidimensional input volumes. A CNN maintains the spatial structure of the input data, so the predictor inputs can be passed directly to the CNN model without distorting their spatial structure. Vandal et al. [2017] used CNNs for downscaling daily precipitation over the continental United States. CNNs, however, often fail to capture temporal dependencies in the input data because their architecture has no temporal connections [Vandal et al., 2017]. At the same time, regression-based SD models such as kernel regression [Kannan et al., 2013] are limited in capturing extreme values [Salvi et al., 2013]. Addressing these complications, Shi et al. [2015] proposed the architecture for the Convolutional LSTM (ConvLSTM). Their study showed that a ConvLSTM network captures both temporal and spatial correlation in the input better than CNNs.
This study proposes a modified ConvLSTM approach, termed the "Shared ConvLSTM framework", as an SD model. This model is applied to obtain rainfall projections over the Indian land mass. The name "Shared" originates from the previous work of Kannan et al. [2013], in which the rainfall and predictor data of the homogeneous regions share spatial relationships and have overlapping boundaries. This study explores the advantage of using the rainfall and predictor data for the whole of India to train a single model, which we name the Shared ConvLSTM model. The manuscript is organized as follows: Section 2 describes the study region and data used, and Section 3 explains the methodology adopted to obtain the rainfall projections. Results and discussions are provided in Section 4, followed by a listing of the major contributions of this work in Section 5 and the summary and conclusions in Section 6.

Experimental Dataset
The study uses three datasets, namely gridded precipitation data, climate reanalysis data and GCM data. The gridded daily rainfall data for the Indian land mass (6.5°N-38.5°N and 66.5°E-100.5°E) at a spatial resolution of 0.25° latitude × 0.25° longitude is obtained from the India Meteorological Department (IMD) [Rajeevan and Bhate, 2008]. This gridded precipitation data provides a total of 4954 grid points at 0.25° spatial resolution constituting the entire Indian landmass. The rainfall projection is obtained for all of these 4954 grid points. The gridded rainfall data at 0.25° spatial resolution is referred to as observed rainfall in the following discussion.
The climate variables that are realistically simulated by GCMs are selected as input to the SD models (termed predictor variables) to simulate the local-scale climate variable (termed the predictand variable), which is rainfall in this study. Following Salvi et al. (2013) and Shashikant et al. (2017), this study uses five climate variables as predictors representing atmospheric circulation patterns of the western coast. These predictor variables are surface-level air temperature (AIRTEMP), mean sea level pressure (MSLP), specific humidity (SHUM), horizontal component of wind velocity (UWIND) and vertical component of wind velocity (VWIND). The gridded predictor dataset, having a spatial resolution of 2.5° latitude × 2.5° longitude and spanning the region 5°N-40°N and 65°E-100°E, is obtained from the NCEP/NCAR reanalysis dataset [Kalnay et al., 1996]. Akhtar et al. (2019) evaluated the performance of SD models based on selected atmospheric predictor variables in simulating monsoon precipitation over India and identified that most of these predictors show better predictive skill over different climate zones of the country. It is important to mention that other reanalysis datasets, which may be seen as a superior choice considering the relatively coarse resolution of the NCEP data, are not selected for two major reasons. Firstly, the NCEP data allows a longer period of analysis and, secondly, the methodology is developed to support climate conditions at coarse resolution.
The future state of the selected climate variables is obtained from simulations by the second generation Earth System Model (CanESM2) of the Canadian Centre for Climate Modelling and Analysis (CCCma). The model outputs utilized in this analysis are collected from the PCMDI CMIP5 data archive available at https://pcmdi.llnl.gov/mips/cmip5. We obtain the historical outputs from CCCma-CanESM2 available for the time period 1969-2005. The future projections of the selected five climate variables are obtained for the time period 2030-2100 for emission scenarios RCP 4.5 (medium) and RCP 8.5 (high). Saha et al. (2014), evaluating the historical outputs of 42 CMIP5 climate models for ISMR rainfall, show that the CCCma-CanESM2 GCM best simulates the atmospheric variability in the Indian region and provides a realistic simulation of Indian monsoon rainfall. CCCma-CanESM2 provides the required projections of different important climate variables for the emission scenarios (RCPs) prescribed by the IPCC for the entire 21st century.
The use of multiple GCMs and of an ensemble approach to incorporate GCM uncertainty has a significant effect on projections [Christensen et al., 2007; Sharma, 2000]. This study uses output from a single GCM, as its focus is on methodology development and on the comparative evaluation of the methodology against three established methodologies for downscaling daily precipitation over the Indian sub-continental region. The study is conceptualized to develop a novel downscaling approach that serves as a single end-to-end supervised model for predicting the future precipitation for the entire Indian region and for capturing the regional variability of ISMR. The following section elaborates the methodology development.

Methodology Overview
SD methodology statistically links coarse-resolution predictor variables with the fine-resolution predictand variable. This study aims to propose a computationally efficient and reliable SD methodology to obtain realistic projections of ISMR. We obtain projections of ISMR at 0.25° resolution using a Convolutional Long Short Term Memory (ConvLSTM) and a Shared ConvLSTM modelling framework. The projections are viewed in light of those from the two most recently used approaches, non-parametric Kernel Regression (KR) [Salvi et al., 2013] and LSTM [Misra et al., 2017]. Brief descriptions of the KR and LSTM methodologies are provided in Supplementary sections S1 and S2 respectively. This section provides an overview of the modelling framework and data pre-processing, followed by the steps for developing a standard ConvLSTM model as well as the shared ConvLSTM model.

Statistical downscaling framework and data preprocessing
The overview of the ConvLSTM based SD methodology is represented in ... Therefore, it is essential to bias correct the GCM predictors before they are taken as input to the model. The bias correction of GCM predictors is carried out prior to statistical downscaling, to reduce systematic biases in the mean and variance of the various GCM predictors. The bias correction is carried out with respect to the NCEP/NCAR predictors. Supplementary section S3 describes the quantile-based bias correction method for GCM predictors (Li et al., 2010).
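The idea behind quantile-based bias correction can be illustrated with a minimal sketch. It implements plain empirical quantile mapping, which is a simpler relative of the method cited above and is not the exact procedure of Supplementary section S3. The function name `quantile_map` and the synthetic data are hypothetical and stand in for the GCM and NCEP/NCAR predictor series at one grid point.

```python
import numpy as np

def quantile_map(gcm_hist, reanalysis, gcm_values):
    """Map GCM values onto the reanalysis distribution via empirical CDFs.

    Assumption: plain empirical quantile mapping at a single grid point.
    """
    gcm_sorted = np.sort(np.asarray(gcm_hist, dtype=float))
    ref_sorted = np.sort(np.asarray(reanalysis, dtype=float))
    # Plotting-position probabilities for the two empirical CDFs.
    p_gcm = (np.arange(1, gcm_sorted.size + 1) - 0.5) / gcm_sorted.size
    p_ref = (np.arange(1, ref_sorted.size + 1) - 0.5) / ref_sorted.size
    # Non-exceedance probability of each value under the GCM historical CDF.
    p = np.interp(gcm_values, gcm_sorted, p_gcm)
    # Inverse reanalysis CDF evaluated at those probabilities.
    return np.interp(p, p_ref, ref_sorted)

# Synthetic stand-ins (hypothetical): a biased "GCM" against a "reanalysis" series.
rng = np.random.default_rng(0)
reanalysis = rng.normal(loc=10.0, scale=2.0, size=5000)
gcm_hist = rng.normal(loc=13.0, scale=3.5, size=5000)

corrected = quantile_map(gcm_hist, reanalysis, gcm_hist)
print(round(gcm_hist.mean(), 2), round(corrected.mean(), 2), round(reanalysis.mean(), 2))
```

After mapping, the corrected historical series has the same mean and variance as the reference series, which is the property the bias correction step is meant to deliver. Future GCM values would be passed as the third argument while keeping the historical series as the calibration sample.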
To account for the highly variable climate patterns of the Indian landmass, the downscaling is performed independently for the seven climatologically homogeneous zones of the country (Parthasarathy et al., 1996). The extent of these regions is illustrated in Fig. 2a. The spatial extents of the NCEP/NCAR predictor variables used for predicting rainfall in these regions are shown by the boxes bounding the rainfall zones (Fig. 2b, Table 1). The geographical extent of each predictor region is selected following Salvi et al. (2013). Apart from the climate data representing the large-scale circulation over a greater region around the selected zone, lag-1 (previous day) rainfall of the zone is also included in the predictor dataset. The ConvLSTM model requires input data with regular boundaries (a square or rectangular box). Therefore the smallest square bounding box covering all the grid points in a particular zone is selected for training the ConvLSTM model.
The selected NCEP/NCAR variables, namely mean sea level pressure (MSLP), specific humidity (SHUM), horizontal component of wind velocity (UWIND) and vertical component of wind velocity (VWIND), have different numerical ranges. Therefore it is difficult to utilize them together as a combined predictor dataset. Standardization is applied to the predictor dataset to bring the variables to a common scale. The mean and standard deviation of each predictor variable are computed from its daily data for the monsoon season (JJAS) over the defined time period.
Large-scale climate predictor variables are standardized by subtracting the mean and dividing by the standard deviation of the respective variable taken over the predefined time period. In order to unify the grid size (geographical extent) of the grids within the complete predictor dataset, the NCEP/NCAR predictor dataset, which has a resolution of 2.5°, is interpolated to a grid size of 0.25° to match the rainfall data. The interpolation of the NCEP/NCAR predictor dataset is performed using bilinear interpolation. NCEP/NCAR data and gridded lag-1 rainfall data are concatenated to form the input. For example, for the central region, the size of the gridded precipitation data is 48×48. The size of the NCEP/NCAR data is 8×8×5 (for 5 predictor variables), which is interpolated to 48×48×5. Hence, the size of the final input to the model is 48×48×6. To ensure that the interpolated variables do not lose any information and maintain the spatial structure of the variables, we check the statistical properties of the selected variables at each grid point. Figure 3 presents the mean value of the NCEP/NCAR predictor variables before and after bias correction.
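The preprocessing chain described above (standardize, interpolate, concatenate) can be sketched as follows for the central-region example. This is an illustrative sketch on synthetic data: the helper names `standardize` and `bilinear_resize` are hypothetical, and the corner-aligned interpolation convention is an assumption, since the text does not state which grid alignment is used.

```python
import numpy as np

def standardize(x):
    """Z-score along the time axis (axis 0): subtract mean, divide by std."""
    return (x - x.mean(axis=0, keepdims=True)) / x.std(axis=0, keepdims=True)

def bilinear_resize(field, out_h, out_w):
    """Bilinearly interpolate an (H, W, C) field to (out_h, out_w, C).

    Assumption: the corner grid points of input and output coincide.
    """
    in_h, in_w = field.shape[:2]
    rows = np.linspace(0, in_h - 1, out_h)
    cols = np.linspace(0, in_w - 1, out_w)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, in_h - 1)
    c1 = np.minimum(c0 + 1, in_w - 1)
    wr = (rows - r0)[:, None, None]   # fractional row offsets
    wc = (cols - c0)[None, :, None]   # fractional column offsets
    top = field[r0][:, c0] * (1 - wc) + field[r0][:, c1] * wc
    bottom = field[r1][:, c0] * (1 - wc) + field[r1][:, c1] * wc
    return top * (1 - wr) + bottom * wr

rng = np.random.default_rng(0)
predictors = rng.normal(size=(30, 8, 8, 5))        # 30 days, 8x8 grid, 5 variables
lag1_rain = rng.gamma(2.0, 5.0, size=(48, 48))     # previous-day rainfall, 0.25° grid

standardized = standardize(predictors)
fine = bilinear_resize(standardized[0], 48, 48)    # one day: 8x8x5 -> 48x48x5
model_input = np.concatenate([fine, lag1_rain[..., None]], axis=-1)
print(model_input.shape)
```

The last axis of `model_input` holds the five interpolated predictors followed by the lag-1 rainfall channel, giving the 48×48×6 input volume mentioned in the text.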
The traditional multisite downscaling models typically perform downscaling on a single homogeneous rainfall zone, predicting rainfall at one grid point in a single model run. In contrast, the proposed ConvLSTM based modelling framework projects rainfall for all the grids in a zone with a single model run. Further to this, the study proposes a shared ConvLSTM model with a novel modelling framework that shares a ConvLSTM model across multiple neighbouring regions. The shared ConvLSTM model captures the similarity in rainfall patterns of the neighbouring regions and customizes it to individual grid points.
This downscaling approach provides a single end-to-end supervised model for predicting the future precipitation series for the whole of India. It captures the regional variability in rainfall better than a region-wise trained model. The following subsection describes the functioning of the ConvLSTM model applied to the pre-processed predictor dataset. Importantly, this model does not need specific feature engineering and can learn the features by itself. Moreover, ConvLSTMs can take three-dimensional inputs. Dimensionality reduction techniques like Principal Component Analysis (PCA) reduce the dimension of the predictor data, but this may cause the loss of a few important features. The proposed methodology completely omits the commonly mandated step of dimensionality reduction and hence preserves the complete information of the predictor dataset.

ConvLSTM model
LSTM is a Deep Learning (DL) Recurrent Neural Network (RNN) model proposed by Hochreiter and Schmidhuber. LSTM is designed to model temporal sequences with their long-range dependencies (Hochreiter and Schmidhuber, 1997). A standard Neural Network (NN) contains several simple, connected processors termed neurons, each of them producing a series of real-valued activations. The input neurons are triggered by sensors that perceive the environment, while the other neurons are triggered through weighted connections from previously activated neurons. Some neurons may also act in the reverse direction, triggering actions that influence the environment. Recurrent Neural Networks (RNNs) are a special kind of neural network designed to handle sequence dependence. Learning, or credit assignment, in a NN refers to finding the weights that make the NN demonstrate the desired behavior. A simple traditional NN captures information in the input very well but fails to capture long-term dependencies in the input data. As the name suggests, LSTMs are specially designed to remember information for long time periods and so avoid the long-term dependency problem. Deep Learning (DL) in the context of NNs refers to assigning credit across long chains of causal computational stages, where each stage transforms the aggregate activation of the network, often non-linearly.
LSTM is a widely used RNN that is proven to work well on a large range of problems such as: language modelling, image captioning, speech recognition, translation, etc.
An LSTM has a chain-like structure, the same as a traditional RNN, except that the repeating module has four interacting network layers in place of a single NN layer. The LSTM removes information from or adds information to the cell state through carefully controlled structures termed gates. Gates are composed of a sigmoid neural net layer and a point-wise multiplication operation. An LSTM has three different sigmoidal gates that determine the state of the cell. The sigmoid layer outputs numbers between zero and one describing how much of the information is retained: a value of zero means no retention, while a value of one means complete retention.
The first sigmoid layer, which decides what information will be discarded, is termed the 'forget gate layer'. For any time step t, the forget gate layer takes the hidden output at the previous time step (t−1), denoted h_{t−1}, and the current input data, denoted x_t, and outputs a number between 0 and 1 for each entry of the cell state C_{t−1}, given as: $f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1})$. Here, σ is the sigmoid function, W denotes a weight matrix, and the suffixes x, f, t and h denote the input variable, forget layer, time step and hidden output, respectively.
The next step decides what new information is retained in the cell state. This is achieved in two parts. First, a sigmoid layer termed the 'input gate layer' decides which values are to be updated. Second, a tanh layer creates a vector of new candidate values, Ĉ_t, that could be added to the state.
The following step combines these two pieces of information to update the old cell state, C_{t−1}, to the new cell state, C_t. This is achieved by multiplying the old state by f_t, which discards information, and adding i_t * Ĉ_t to determine the new scaled state of the cell at time step t.
The output of an LSTM cell is determined from the cell state C_t. The filtered output is obtained by running a sigmoid layer that decides which parts of the cell state will be retained in the output. To achieve this, the cell state is passed through tanh to confine the values to the range −1 to 1 and is multiplied by the output of the sigmoid gate.
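The gate operations described in the preceding paragraphs can be put together in a short sketch of a single LSTM time step. This follows the common textbook formulation rather than any specific implementation used in this study. The stacked weight matrix `W`, the bias vector `b` and all sizes are hypothetical, and the bias term in particular is an assumption, since the text defines only weight matrices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W maps the concatenated [h_prev, x_t] to the four stacked gate
    pre-activations; b is an assumed bias vector of the same length.
    """
    z = W @ np.concatenate([h_prev, x_t]) + b
    f, i, g, o = np.split(z, 4)
    f_t = sigmoid(f)                      # forget gate: what to discard from c_prev
    i_t = sigmoid(i)                      # input gate: which values to update
    c_hat = np.tanh(g)                    # candidate values
    o_t = sigmoid(o)                      # output gate: what to expose
    c_t = f_t * c_prev + i_t * c_hat      # new cell state
    h_t = o_t * np.tanh(c_t)              # filtered output
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 6, 3                     # hypothetical sizes
W = 0.1 * rng.normal(size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):    # a 5-step input sequence
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)
```

Because the cell state is carried forward additively through `f_t * c_prev`, information can persist over many steps, which is the property the text attributes to LSTMs.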
ConvLSTM is a variation of LSTM that performs a convolution operation inside the LSTM cell. The fully connected RNN layers of a standard LSTM are replaced by convolutional layers in a convolutional LSTM block. The model therefore consists of two basic units, convolutional layers and LSTM cells. These layers are tailored for spatiotemporal input-output prediction problems, such as the rainfall projection considered here. A basic unit of a convolutional LSTM cell is depicted in Fig. 4.
Convolutional LSTMs [Krizhevsky et al., 2012] can be used to model dependencies between three-dimensional input and output volumes with temporal dependencies, which is the structure of a rainfall projection problem. In this study, large-scale climate circulation and lag-1 rainfall act as predictors, or the input volume, and the gridded rainfall data is taken as the output volume. The ConvLSTM model is thus applied to all grid locations in the input and output at the same time. The network connections in a ConvLSTM, as shown in Fig. 5, are built in such a way that the model captures the relationship between the target (rainfall variable) at a grid point and the input (predictor variables) at all the grid locations in the input volume. A high degree of autocorrelation is generally observed in the daily rainfall series of any region. Exploiting this autocorrelation can significantly improve the predictive ability of a downscaling model, and capturing it is climatologically important in daily rainfall prediction studies. The ConvLSTM model thus essentially captures the relationship between large-scale predictors and local rainfall. It also captures the spatial correlation within both the input predictor fields and the target rainfall fields. Due to the backward network connections in the ConvLSTM model (as illustrated in Fig. 4), the model automatically takes care of the autocorrelation in the rainfall series. Shi et al. [2015] used the ConvLSTM model for precipitation nowcasting, showing a high accuracy in precipitation prediction. The present study demonstrates that the model possesses noteworthy skill in capturing all these dependencies and the autocorrelation in the daily rainfall series.
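To make the replacement of fully connected layers by convolutions concrete, the following is a minimal sketch of one ConvLSTM time step on a 48×48×6 input volume. It is not the four-layer architecture trained in this study: the kernel size (3), the number of hidden channels (4), the bias `b` and the helper names are hypothetical, and the sketch omits the peephole terms that appear in some ConvLSTM formulations.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv2d_same(x, kernel):
    """Zero-padded 'same' 2D cross-correlation.

    x: (H, W, Cin); kernel: (k, k, Cin, Cout) with odd k.
    """
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    H, W = x.shape[:2]
    out = np.zeros((H, W, kernel.shape[3]))
    for di in range(k):
        for dj in range(k):
            # Shifted window times the (Cin, Cout) slice of the kernel.
            out += xp[di:di + H, dj:dj + W, :] @ kernel[di, dj]
    return out

def convlstm_step(x_t, h_prev, c_prev, Wx, Wh, b):
    """One ConvLSTM step: LSTM gates computed by convolutions, no peepholes."""
    gates = conv2d_same(x_t, Wx) + conv2d_same(h_prev, Wh) + b
    i, f, g, o = np.split(gates, 4, axis=-1)
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h_t = sigmoid(o) * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(0)
n_hidden = 4                                         # hypothetical hidden channels
x = rng.normal(size=(48, 48, 6))                     # predictors + lag-1 rainfall
Wx = 0.1 * rng.normal(size=(3, 3, 6, 4 * n_hidden))
Wh = 0.1 * rng.normal(size=(3, 3, n_hidden, 4 * n_hidden))
b = np.zeros(4 * n_hidden)

h0 = np.zeros((48, 48, n_hidden))
c0 = np.zeros((48, 48, n_hidden))
h1, c1 = convlstm_step(x, h0, c0, Wx, Wh, b)
print(h1.shape)
```

Each output grid point depends on a spatial neighbourhood of the input and of the previous hidden state, so stacking several such layers widens the region of predictors that influences the rainfall at any one grid point.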

ConvLSTM model training
A four-layer Convolutional LSTM model is trained with the input dataset and the current day rainfall as output. This model, as depicted in Fig. ... where n is a subset of the gridded output data of size r×c, t_i is the true value of the target, and m_i is the mean value of the predictive distribution.
The following weighting scheme is used for model evaluation: ... The trained model is validated using the predictor dataset for the validation time period. Multiple rounds of model training and testing are carried out to obtain an optimum weight matrix. The trained model is applied to the GCM data to generate future rainfall projections. This modelling framework is applied independently for each rainfall zone. The basic architecture of the ConvLSTM models for each region is given in Table 2.
Gridded rainfall bounding boxes are taken as square so that the NCEP predictor input volumes can be easily interpolated to the shape of the gridded rainfall. The regridded data is concatenated to form the final input of the ConvLSTM based downscaling model. For example, for the all-India rainfall data, the NCEP predictors are of size 15×15 and the gridded rainfall is of size 129×135. The rainfall data therefore has to be brought to a shape to which the predictors can be easily interpolated. Hence, we take rainfall data from a larger region, of size 135×135. We could instead have padded the data with zeros rather than taking a larger region. The experimental investigation revealed that taking lag-1 rainfall from a larger region as input in the loss function worked better than the zero padding approach.

Shared ConvLSTM modeling framework
Previous works in the literature proposed region-based downscaling methodologies for large regions with heterogeneous rainfall patterns, like India, by dividing the region into homogeneous zones. The ConvLSTM model described above follows the same design. NNs have been shown to benefit from the sharing of related data and from multitasking. Therefore, this study proposes a model termed shared ConvLSTM. The shared ConvLSTM model explores whether taking the rainfall zones and the corresponding predictor data as a whole improves or worsens the results. This modelling framework considers the gridded rainfall dataset of size 129×135 at a spatial resolution of 0.25°×0.25°. In order to ease training, we convert it to size 135×135 by padding with zeros. The zeros in the region outside the Indian landmass are ignored during training through the use of the weighted loss function WConvLSTM. The NCEP/NCAR dataset used for the whole of India is of size 15×15 at a resolution of 2.5°×2.5°. We interpolate it to size 135×135 using bilinear interpolation as discussed above. The basic architecture of this model is the same as that presented in Fig. 5.
The basic steps are similar to those of the ConvLSTM model, with the following difference: the zero-padded gridded precipitation data is used as the predictand and the interpolated NCEP/NCAR data as the predictor for training the model. NCEP/NCAR data and gridded lag-1 rainfall data are concatenated to form the final input to the model. The four-layer ConvLSTM model is trained with this input and the current day rainfall as output. This model is trained using the Adam optimizer with an initial learning rate of 10^−5 and the same WNMSE ConvLSTM loss function as discussed above. The trained model is then validated using NCEP/NCAR data and rainfall data for the validation period. Bias correction of the GCM data is then performed, followed by validation of the model with CanESM2 GCM data for the historical time period. The future rainfall predictions are generated using CanESM2 GCM data for the future time periods.
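How zero padding and a weighted loss can work together is sketched below. The exact form of the weighted loss used in this study is not reproduced in the text, so the function `weighted_nmse` is an assumed form: squared error weighted by a land mask and normalised by the weighted variance of the target. The padding side (bottom and right), the synthetic land mask and the helper names are likewise hypothetical.

```python
import numpy as np

def pad_to_square(grid, size):
    """Zero-pad a (rows, cols) grid at the bottom/right to (size, size)."""
    rows, cols = grid.shape
    padded = np.zeros((size, size), dtype=float)
    padded[:rows, :cols] = grid
    return padded

def weighted_nmse(target, prediction, weights, eps=1e-12):
    """Weighted squared error normalised by the weighted target variance.

    Assumed form for illustration; cells with zero weight do not contribute.
    """
    mean_t = (weights * target).sum() / weights.sum()
    num = (weights * (target - prediction) ** 2).sum()
    den = (weights * (target - mean_t) ** 2).sum()
    return num / (den + eps)

rng = np.random.default_rng(0)
land = rng.random((129, 135)) < 0.4                  # synthetic land mask
rain = rng.gamma(2.0, 5.0, size=(129, 135)) * land   # rainfall only on land cells

target = pad_to_square(rain, 135)                    # 129x135 -> 135x135
weights = pad_to_square(land, 135)                   # 1 on land, 0 elsewhere
prediction = target + rng.normal(size=target.shape)

loss = weighted_nmse(target, prediction, weights)
print(target.shape, float(loss))
```

With weights of zero outside the landmass and in the padded rows, whatever the network predicts in those cells has no effect on the loss, which is the behaviour the text describes for the padded region.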

Results And Discussions
This section discusses the results and their interpretation. It presents the application of four different downscaling models, namely KR, LSTM, ConvLSTM and shared ConvLSTM, to provide rainfall projections at 0.25° spatial resolution over the Indian sub-continental region. All four models are trained and tested using the NCEP/NCAR reanalysis dataset. The best performing model is identified and applied to the historical dataset of the CCCma GCM. A fair evaluation of model results is carried out using a cross-validation time period long enough to estimate the relevant persistence characteristics of projected rainfall. The statistical performance indices are estimated from 13 years (2001-2013) of rainfall realizations using the NCEP/NCAR reanalysis dataset and 37 years (1969-2005) of realizations using the GCM historical dataset. The results are evaluated using various statistics representing the spatiotemporal characteristics of rainfall that are essential for the planning and management of water resources. The numerical comparisons are based on three basic statistical parameters, namely the mean, standard deviation and extremes of rainfall. Here, we consider the 95th percentile of the data (both predicted and observed) as extremes. To evaluate the spatiotemporal properties of future rainfall we use data for the RCP4.5 (medium) and RCP8.5 (high) climate scenarios of the CCCma GCM.

Model validation over baseline period: Comparison of statistical parameters
The ability of a model to capture the mean of a precipitation series is aptly measured using the mean square error as well as the mean absolute error. We evaluate the difference between the mean, standard deviation and extremes of the observed and predicted rainfall datasets at each grid point for the four selected models. The spatial distribution of the absolute errors of the selected models using NCEP/NCAR data for the testing time period (2001-2013) is presented in Fig. 6.
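The grid-wise comparison of the three statistics can be sketched as below, with the 95th percentile taken as the extreme, as defined earlier. The function name `gridwise_errors`, the array layout (days, rows, columns) and the synthetic data are hypothetical.

```python
import numpy as np

def gridwise_errors(observed, predicted):
    """Absolute difference in mean, std and 95th percentile at each grid point.

    Inputs have shape (days, rows, cols); outputs have shape (rows, cols).
    """
    stats = {
        "mean": lambda a: a.mean(axis=0),
        "std": lambda a: a.std(axis=0),
        "p95": lambda a: np.percentile(a, 95, axis=0),
    }
    return {name: np.abs(fn(observed) - fn(predicted)) for name, fn in stats.items()}

rng = np.random.default_rng(0)
observed = rng.gamma(2.0, 5.0, size=(100, 4, 5))   # synthetic daily rainfall (mm)
predicted = observed + 2.0                         # a prediction with a constant +2 mm bias

errors = gridwise_errors(observed, predicted)
print({name: float(err.max()) for name, err in errors.items()})
```

A constant bias shifts the mean and the 95th percentile by the same amount but leaves the standard deviation unchanged, which illustrates why the three statistics are reported separately.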
Here, it is observed that the KR model performs reasonably well in capturing the mean rainfall (Fig. 6a) and rainfall extremes (Fig. 6c), except for regions of high rainfall like the Western Ghats and NE hills, where an underestimation is observed. At the same time, the KR model underestimates the standard deviation in many parts of the country and overestimates it in parts of the Western Ghats and NE hills (Fig. 6b).
The LSTM model performs worse than the KR model in predicting the mean rainfall (Fig. 6j).
The model completely fails to capture the standard deviation (Fig. 6k) and extremes (Fig. 6l). The domination of red indicates a larger deviation from the observed rainfall. The difference lies between 0 and +20 mm for more than 80% of the grid points (Fig. 6k). The extremes of the observed and projected rainfall differ widely for this model. This may be attributed to the simple architecture, i.e., a single hidden layer LSTM network used for predicting rainfall at grid points having highly varied rainfall patterns.
The region wise ConvLSTM model performs well in capturing the mean, STD and extreme rainfall condition, except spurts of underprediction in regions of high rainfall like Western Ghats (Fig. 6g, i).Here, it is important to note that the underestimation of standard deviation as observed with the KR model is partly overcome with region wise ConvLSTM model (Fig. 6h).The model, however, depicts limitation to predict both extreme rainfall condition and STD over Western Ghats and parts of NE zone.The shared ConvLSTM model shows the least deviation in the mean rainfall condition standard deviation and extremes from the observed values for all the regions as compared to the other three models.The model performs e ciently with error in mean, standard deviation and extremes delimited to 0-5 mm (Fig. 6d-f).Supplementary gure S3 provides quanti cation of model performances for different climate zones of the country with the help of bar plots.It is observed that the KR model overpredicts the mean rainfall and under predicts the STD and extremes for all the zones except NE hills.The LSTM model underestimates mean, standard deviation and extremes to a larger extent for all the zones.Both KR and LSTM model depicts an over prediction for mean, STD and extremes for NE hills and NE zone.This model consistently underestimates the standard deviation and extremes for all the zones except NE zone.The ConvLSTM model predicts the mean rainfall with a higher precision than both KR and LSTM models for all the zones including the NE zone.However, this methodology underestimates the standard deviation and extremes of rainfall.The ConvLSTM model depicts an inferior performance compared to KR model but better then LSTM model in capturing standard deviation and extremes except Jammu and Kashmir region.The shared ConvLSTM model performs superior in capturing all the three statistical properties consistently for different climate zones of the country.
Cross correlation between observed and predicted rainfall at individual grid points is an essential property indicating whether the spatial structure of the projected rainfall is correct. The spatiotemporal variability captured in rainfall patterns is evaluated by examining the spatial cross correlation between observed and predicted rainfall. The grid-wise association between observed and predicted rainfall is shown in Fig. 7 for all zones. The figures show that the zone-wise cross correlation between observed rainfall at different grid points is best captured by the Shared ConvLSTM model, compared with the KR, LSTM and ConvLSTM models.
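One way to read this check is to compute the correlation between rainfall series at pairs of grid points, once for the observations and once for the predictions, and then compare the two sets of values. The sketch below follows that reading. It assumes Pearson correlation at zero lag and uses hypothetical function names and toy data; the study may have computed the statistic differently.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series; NaN if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

def pairwise_correlation(series_by_grid):
    """Correlation for every unordered pair of grid points in a zone."""
    ids = sorted(series_by_grid)
    return {
        (a, b): pearson(series_by_grid[a], series_by_grid[b])
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    }

def correlation_gap(observed, predicted):
    """Absolute difference between observed and predicted pair correlations."""
    obs = pairwise_correlation(observed)
    pred = pairwise_correlation(predicted)
    return {pair: abs(obs[pair] - pred[pair]) for pair in obs}

# Toy zone with two grid points (illustrative values only).
observed = {"g1": [0.0, 2.0, 4.0, 6.0], "g2": [1.0, 3.0, 5.0, 7.0]}
predicted = {"g1": [0.0, 2.0, 4.0, 6.0], "g2": [7.0, 5.0, 3.0, 1.0]}
gaps = correlation_gap(observed, predicted)
```

A small gap for every pair would indicate that the predicted field reproduces the spatial dependence seen in the observations; in the toy data the prediction reverses the relationship between the two grid points, so the gap is at its maximum.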
The Shared ConvLSTM model is therefore applied to predict future rainfall patterns from the CCCma GCM dataset. The rainfall projected by the Shared ConvLSTM model from GCM-simulated predictor variables is compared with observed rainfall in order to assess the capability of the GCM data to yield rainfall projections. The following sub-section details model performance using the bias-corrected historical GCM dataset.

Validation of model using bias corrected historical GCM dataset
The performance of Shared ConvLSTM model using historical GCM data

Future Rainfall Projections using Shared ConvLSTM model
The validated Shared ConvLSTM model is applied to the future period 2030-2070 for RCP4.5 (stabilization scenario) and RCP8.5 (business-as-usual or high-emission scenario), using bias-corrected CCCma GCM data, to obtain rainfall projections. The statistics of the projected rainfall for RCP4.5 and RCP8.5 are presented in Fig. 9. The difference between the mean of observed rainfall for 1951-2000 and that of predicted rainfall for 2030-2070 is illustrated in Fig. 9 (g, i). The results reveal a marked increase in mean rainfall in the NE region and NE hills. A moderate increase in mean rainfall is observed in the northern plains, the Jammu and Kashmir region, the Western zone and most of Southern India. The Western Ghats, NE region, NE hills and parts of central India show an increase in STD and extremes. As with the mean rainfall, the standard deviation and extremes are notably low in the Jammu and Kashmir region, the western zone and a major part of South India.
The results of this analysis are summarized as follows. The Convolutional LSTM model efficiently captures the mean, STD and extremes of the rainfall pattern and shows a considerable improvement over existing models such as nonparametric kernel regression and LSTM. It is essential to note that the region-wise ConvLSTM methodology estimates a single set of model parameters to predict rainfall over the entire region, whereas the KR model estimates model parameters grid by grid. The Convolutional LSTM can be used for gridded data of any size. ISMR exhibits large variability in both space and time. The ConvLSTM model offers an efficient way to predict future rainfall with high precision and greater computational efficiency. The shared ConvLSTM model outperforms the region-wise ConvLSTM model, which demonstrates the effectiveness of multi-task learning. The results also support the hypothesis that neighboring regions in the Indian subcontinent have spatiotemporal similarities in their rainfall patterns.
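The remark that a Convolutional LSTM can be applied to grids of any size follows from weight sharing: the number of trainable parameters depends on the kernel size and channel counts, not on the grid dimensions. The sketch below illustrates this with the standard parameter count of a ConvLSTM layer without peephole connections, 4f(k²(c + f) + 1) for f filters, kernel size k and c input channels, applied to the four-layer configuration reported for the model. The number of input predictor channels (8), the "same" padding implied by the unchanged grid size, and the second grid size are assumptions made only for illustration.

```python
def convlstm_params(in_channels, filters, kernel):
    """Trainable parameters of one ConvLSTM layer (no peephole connections).

    Four gates, each with an input kernel, a recurrent kernel and a bias.
    """
    return 4 * filters * (kernel * kernel * (in_channels + filters) + 1)

# (filters, kernel size) for the four layers reported for the model.
LAYERS = [(64, 5), (64, 5), (64, 5), (1, 1)]

def describe_stack(height, width, in_channels, layers=LAYERS):
    """Per-time-step output shape and parameter count of each layer.

    Assumes 'same' padding, so the grid size is preserved by every layer.
    """
    summary = []
    channels = in_channels
    for filters, kernel in layers:
        summary.append({
            "output_shape": (height, width, filters),
            "params": convlstm_params(channels, filters, kernel),
        })
        channels = filters
    return summary

N_PREDICTORS = 8  # hypothetical number of predictor channels
central = describe_stack(48, 48, N_PREDICTORS)  # 48x48 grid of the central region
other = describe_stack(64, 80, N_PREDICTORS)    # hypothetical grid of another size
total_params = sum(layer["params"] for layer in central)
```

Under these assumptions both grids give identical parameter counts layer by layer, and only the output shape changes with the region, which is what allows one set of weights to be shared across regions of different extent.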

Contributions Of This Work
To the best of our knowledge, this study presents the first application of Convolutional LSTM for statistical downscaling of daily precipitation. While earlier efforts examined the skill of rainfall projections over the Indian landmass using multiple GCMs, this analysis provides a useful guide to the skill of statistically downscaled seasonal precipitation using multiple models. The skill of four statistical downscaling models in precipitation projection is evaluated for the Indian landmass at spatial scales relevant to local decision-making. We provide an evaluation of the comparative performance of the proposed statistical methods using deep learning methods. Three SD  3).
Geographic variations in downscaled precipitation reveal that these three models are able to capture patterns across the diverse topography of the Indian landmass, principally in regions subject to high variability in orographic precipitation.
The previously proposed models operate region-wise, whereas here we propose a multitask-learning-based stacked ConvLSTM model that predicts rainfall all over India using a single model with considerably higher accuracy. The underestimation of mean rainfall observed with the previous models is significantly overcome by this method. The model performs well in capturing the standard deviation and extremes of rainfall and is well suited to predicting gridded precipitation.

Conclusion
Climate change responds clearly to ecological and socioeconomic drivers. In the global context, it is therefore one of the subjects of greatest concern today. Although GCMs are considered credible tools for predicting future climate change, the large uncertainties among different GCMs and the coarse spatial resolution of GCM datasets limit the application of GCM outputs, especially for regional-scale water management. Downscaling techniques largely overcome this limitation. The present study presents a comparative performance evaluation of four SD techniques for the Indian landmass at finer spatiotemporal scales, to suit advanced hydrological impact studies. The calibration and validation results demonstrate that the Shared ConvLSTM downscaling procedure shows a comparable ability to simulate precipitation. The evaluation revealed that Shared ConvLSTM with the CCCma CMIP5 GCM accurately reproduces the long-term STD of daily precipitation, resulting in the best capture of extreme events and of the distribution of daily precipitation over the entire data range. Here, the use of multiple GCMs is purposefully avoided, with the focus on methodological development. The results from the CCCma GCM show an increase in mean precipitation. The relative change in mean precipitation lies within ±20 mm. The results under both the RCP4.5 and RCP8.5 scenarios agree on the increase in mean precipitation. The corresponding change in downscaled precipitation STD lies within ±20 mm, while the change in extreme annual precipitation lies within ±100 mm. Importantly, the changes in STD and extremes of future rainfall are spatially nonuniform.
The present study shows shared ConvLSTM to be a promising SD methodology, though the experiments have some limitations. The study evaluates model performance when trained over a sufficiently long time period. The performance of the model when trained on a sparse training dataset remains to be examined. Furthermore, the capability of the proposed modeling framework for downscaling climate variables other than precipitation may be explored. Follow-up studies with the proposed framework may address open issues such as the use of multiple reanalysis datasets, multiple GCMs and the role of different climate variables within the predictor datasets.
Most importantly, the proposed model is able to capture spatial non-homogeneity in the downscaled projections within a zone, which is a key factor in obtaining reliable regional-level rainfall projections for adapting to climate change. Therefore, despite the limitations noted above, the study presents a conformable novel architecture for SD with superior predictive capabilities.
Future climate projections play an important role in understanding climate systems and form the basis for addressing a number of science and policy problems. With the development and availability of Phase 6 of the Coupled Model Intercomparison Project (CMIP6), which provides multi-model climate projections under future emissions and land use changes, state-of-the-art methodological developments in downscaling models such as the one presented here will strengthen a wide range of integrated studies across climate science modelling, including impacts, adaptation and vulnerability. Such projections will help policy makers make effective decisions about water storage and sustainability in drought-prone regions (very low rainfall) and in managing flood-prone regions (very heavy rainfall).

Declarations
Funding
The authors would like to thank the Ministry of Human Resource Development, India for funding this work as a part of the project "Artificial Intelligence for Societal needs".

Conflicts of interest/Competing interests

Fig. 1.
The basic steps of the ConvLSTM-based SD model are as follows: preparation of the predictor dataset, and application of the ConvLSTM model. The model is trained using the NCEP/NCAR reanalysis dataset for 1951-2000 and validated using data for 2001-2013. The 50-year training period is considered long enough to establish the correctness of an SD model. The model, trained and tested on the NCEP/NCAR reanalysis dataset, is then applied to the GCM data. It is first applied to historical GCM data for 1969-2005 to check that the developed model can be applied correctly to the GCM dataset. The model validated on GCM historical data is thereafter applied to GCM near-future data (2030-2070) and far-future data (2070-2100) to obtain rainfall projections under a changing climate. The GCM data show a systematic deviation from the climate reanalysis data, known as bias. Bias in GCM data may result in uncertainty in model predictions.
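The time periods used at each stage can be collected into a small lookup, which makes the workflow easier to follow. This is only a sketch: the dictionary keys and the `select_period` helper are hypothetical names, and records are assumed to be (year, value) pairs rather than the gridded arrays used in the study.

```python
# Inclusive year ranges for each stage of the workflow, as stated in the text.
PERIODS = {
    "training": (1951, 2000),         # NCEP/NCAR reanalysis
    "validation": (2001, 2013),       # NCEP/NCAR reanalysis
    "gcm_historical": (1969, 2005),   # GCM historical run
    "gcm_near_future": (2030, 2070),  # GCM scenario run
    "gcm_far_future": (2070, 2100),   # GCM scenario run
}

def select_period(records, name, periods=PERIODS):
    """Keep the (year, value) records that fall inside the named period."""
    start, end = periods[name]
    return [(year, value) for year, value in records if start <= year <= end]

def period_length(name, periods=PERIODS):
    """Number of calendar years covered by the named period."""
    start, end = periods[name]
    return end - start + 1

# Toy annual records; the values are placeholders.
records = [(year, 0.0) for year in range(1940, 2101)]
training = select_period(records, "training")
validation = select_period(records, "validation")
```

With inclusive bounds the training, validation and GCM historical periods span 50, 13 and 37 years respectively, matching the durations quoted in the text.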
The performance of the Shared ConvLSTM model using historical GCM data in capturing the mean, STD and extremes of rainfall is presented in Fig. 8. The model performs well in capturing the statistical properties of rainfall using the bias-corrected historical GCM data. The spatial distribution of observed mean rainfall (Fig. 8a) is well captured by the projected rainfall (Fig. 8d) obtained using the Shared ConvLSTM model. The absolute difference between model-simulated and observed mean rainfall (Fig. 8g) indicates that the model captures the mean rainfall with a high degree of accuracy. The rainfall variability in terms of standard deviation (Fig. 8b) is also well captured by the Shared ConvLSTM model (Fig. 8e). The difference between the standard deviation of projected and observed rainfall is presented in Fig. 8h. The overall positive differences indicate that the model projects a lower standard deviation than that of the observed rainfall. The spatial variability of the extremes in the observed rainfall (Fig. 8c) is also well captured by the Shared ConvLSTM model (Fig. 8f). The difference between observed and model-predicted rainfall extremes (Fig. 8i) indicates overall low absolute differences. Supplementary Figure S4 quantifies model performance for the different climate zones of the country using bar plots. The shared ConvLSTM model overpredicts the mean rainfall and underpredicts the standard deviation and extremes for all zones except the NE hills. At the same time, the model is consistently superior in capturing all three statistical properties across the zones of the country.

Figure 7
The ConvLSTM model consists of four ConvLSTM layers. The first three layers have 64 filters each, of size 5×5, whereas the last layer has 1 filter of size 1×1, which yields an output of the desired volume (48×48×1 in the case of the central region). It is essential to note that there are different combinations of hyperparameters for training the ConvLSTM model. Model training is performed using all possible combinations of layers with filter size 3×3, 5×5 and 7×7. The number of filters is chosen from 4, 8, 16, 32, 64 and 128, and the learning rate from 0.0001, 0.001, 0.01 and 0.1. The number of iterations is chosen from 5 to 500 in steps of 5. The best performing model is then identified based on the weighted mean squared error. The choice of optimization method plays a key role in training deep learning models. The advanced optimization method Adam (Kingma and Ba, 2015) is widely adopted for training very deep networks. Here, the Adam optimizer is selected mainly because it is an optimized version of stochastic gradient descent. This optimization algorithm is widely used for finding minima of convex optimization problems. It is generally expected to reach the global minimum in a certain number of iterations if the learning rate is carefully chosen. The model is trained with the Adam optimizer with an initial learning rate of 10⁻⁵. The model is developed in TensorFlow (Abadi et al., 2016) using the Keras library and Python on an NVIDIA Tesla P100-PCIE GPU.
Rainfall is highly variable across the grid points in a region; therefore, to appropriately capture both the mean and the extremes of rainfall at every grid point in a region, a weighted mean squared error loss function (WNMSE) is used for the ConvLSTM. The WNMSE applies weights of 1 to 5 to individual observed rainfall values at every grid point in the training set. The rainfall data have a highly skewed distribution, with values greater than 30 mm per day covering approximately 5% of the entire training set over the Indian region. The loss is the sum of the losses over all grid points within the zone for which the rainfall distribution is predicted. The training loss function for the ConvLSTM model is given by Eq. (8).
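The weighting idea behind this loss can be illustrated with a small sketch. It is not the paper's loss equation: the linear ramp of weights from 1 to 5, the use of 30 mm/day as the point where the weight saturates, and all function names are assumptions chosen only to show how heavier observed rainfall can be given more influence while the loss is summed over all grid points of a zone.

```python
def rainfall_weight(observed_mm, w_min=1.0, w_max=5.0, heavy_mm=30.0):
    """Hypothetical weight: rises linearly from w_min (dry) to w_max (>= heavy_mm)."""
    scaled = min(max(observed_mm, 0.0), heavy_mm) / heavy_mm
    return w_min + (w_max - w_min) * scaled

def weighted_squared_error(observed, predicted):
    """Sum over all grid points of weight * squared error for one day.

    Both arguments are 2-D lists (rows x columns) of rainfall in mm.
    """
    total = 0.0
    for obs_row, pred_row in zip(observed, predicted):
        for obs, pred in zip(obs_row, pred_row):
            total += rainfall_weight(obs) * (obs - pred) ** 2
    return total

def weighted_loss(observed_days, predicted_days):
    """Average of the per-day weighted squared error over the training days."""
    daily = [weighted_squared_error(o, p)
             for o, p in zip(observed_days, predicted_days)]
    return sum(daily) / len(daily)

# Toy 2x2 zone for a single day (mm/day, illustrative values only).
observed_day = [[0.0, 30.0], [15.0, 60.0]]
predicted_day = [[1.0, 20.0], [15.0, 50.0]]
loss = weighted_loss([observed_day], [predicted_day])
```

In the toy example the two grid points with heavy observed rainfall carry a weight of 5, so their 10 mm errors dominate the loss, whereas the same error at a dry grid point would count five times less.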
Three SD approaches, namely Kernel Regression (KR), Convolutional Long Short Term Memory (ConvLSTM) and Shared ConvLSTM, exhibit similar correlative skill measures. However, the Shared ConvLSTM model shows superior skill measures compared with the other three models. This is because the ConvLSTM model takes advantage of both convolution and LSTM. In addition, the model extracts features from the region instead of using one-dimensional time series data. By contrast, the LSTM model completely fails to capture the precipitation variation (standard deviation) and therefore the extremes. The deep learning models, namely LSTM, ConvLSTM and shared ConvLSTM, are trained to explore the dependence of precipitation occurrence on the predictor variables with a time lag of 1. In all cases, the deep learning models were trained to minimize the mean squared error between estimated and observed mean, STD and extremes in every epoch of the training period. Both KR and shared ConvLSTM give similar results, each producing a mean rainfall close to the observed one. However, the study finds a higher skill in the estimated STD and extremes of precipitation with shared ConvLSTM than with the KR model (Table 3).