## Abstract

Statistical downscaling is needed to investigate the hydrological consequences of climate change for local hydropower capacity. Global Circulation Models (GCMs) are crucial tools for simulating potential climate change effects, including changes in precipitation and temperature. Statistical downscaling methods establish relations between large-scale climatic parameters and local variables. This study presents a trend analysis of observed variables compared with statistically downscaled emission scenarios from the Canadian Second Generation Earth System Model (CanESM2) in the Göksu River basin, Turkey. The key purpose of the research is to evaluate both the predicted monthly precipitation and the GCM projections under the three scenarios RCP2.6, RCP4.5, and RCP8.5 using Gene Expression Programming (GEP). In addition, the statistically downscaled monthly mean precipitation is compared with the results of a Linear Regression (LR) model. The R-value of the GEP model for precipitation is 0.827 and 0.755 for the calibration and validation periods, respectively. Compared with the LR model over the calibration and validation periods (1971–2005), the results of the GEP model demonstrate its applicability in projecting monthly mean rainfall. In general, for the simulated period 2021–2100, all three scenarios forecast a decline in the monthly mean precipitation in the basin. Moreover, the RCP8.5 scenario produced projections better suited to the case study than those under RCP4.5 and RCP2.6. The mean of the statistically downscaled CanESM2 model was compared with the trend analysis of the areal mean precipitation over the case study area, and the trend was found to be decreasing. However, the RCP8.5 scenario was the most quasi-asymptotic with respect to the trend.

## HIGHLIGHTS

Climate change impact assessment is applied to catchment hydrology.

A new methodology is proposed.

Future projections of water resources are made.

Artificial intelligence is used.

Statistical downscaling is combined with artificial intelligence.

## INTRODUCTION

The climate may vary in many ways over different time scales. Global warming is currently a major concern for scientists. It results from human activities that intensify the natural greenhouse effect. Over the next century, the climate is expected to change dangerously as human activities continue to increase greenhouse gas concentrations. This will influence many areas of life, including the runoff of river catchments and the hydroelectric power stations that depend on it. Measuring the impacts of such change requires effort, for example computer-based modeling of how the electric power and hydroelectric power sectors will be influenced by climate change (Harrison *et al.* 1998). According to Dibike & Coulibaly (2005), Global Circulation Models (GCMs) represent, on a worldwide scale, the numerical and physical features of the atmosphere and the land cover system. Moreover, GCMs are designed to reproduce the climatic variables of the past, the present, and the future. The best methods presently available for simulating the climate system's response to growing greenhouse gas concentrations are numerical models (GCMs) that describe physical processes in the cryosphere, atmosphere, ocean, and on the land surface (IPCC 2013). GCMs represent physical mechanisms in the atmosphere, oceans, and land surface and are presently able to predict the global climate system's response to growing concentrations of greenhouse gases (Hulme & Carter 1999). However, GCM outputs cannot provide local climate details at fine spatial resolution because of the mismatch in spatial resolution between GCMs and hydrological models (Hadipour *et al.* 2013). Downscaling is therefore a method that derives the local-scale surface climate from large-scale atmospheric variables; it was developed to transform GCM outputs from a coarse spatial resolution to a finer one (Anandhi *et al.* 2013).

Two primary techniques, namely empirical, also called statistical, and dynamical downscaling, have been developed for downscaling. Statistical downscaling adopts statistical relationships between the regional climate and carefully selected large-scale parameters (Wilby *et al.* 2004), while dynamical downscaling makes use of regional climate models (RCMs) to simulate finer scale physical processes consistent with the large-scale weather evolution prescribed from a GCM, and is more expensive (Mearns *et al.* 2013). Zorita & von Storch (1999) stated that statistical downscaling rests on the principle that the local climate's relation with the large-scale movement continues to be effective through the varying future climatic conditions.

Gene Expression Programming (GEP), Genetic Algorithms (GAs), and Genetic Programming (GP) are algorithms that use populations of individuals, select them according to fitness, and introduce genetic variation with the help of one or more genetic operators (Rechenberg 1973). The central difference between the three algorithms lies in the nature of the individuals: in GAs, the individuals are linear strings of fixed length (chromosomes); in GP, the individuals are nonlinear entities of different sizes and shapes (parse trees); and in GEP, the individuals are encoded as linear strings of fixed length (the genome or chromosomes) that are afterwards expressed as nonlinear entities of different sizes and shapes (i.e., simple diagram representations or expression trees). GEP was designed by Ferreira in 1999 (Ferreira 2001). It combines the simple, linear, fixed-length chromosomes used in GAs with the ramified structures of different sizes and shapes found in the parse trees of GP. In other words, in GEP the genotype and phenotype are finally separated, and the system can benefit from all the advantages this brings. GEP has been widely applied in hydrologic engineering, particularly in the last decade, and especially in the prediction of hydro-meteorological variables (Guven 2009; Guven & Aytek 2009; Guven & Talu 2010; Seckin & Guven 2012; Traore & Guven 2013; Al-Juboori & Guven 2016).

The Mann-Kendall (MK) test, which was initially proposed by Mann (1945) and whose test statistic distribution was subsequently derived by Kendall (1975), is a rank-based non-parametric test for identifying a monotonic trend in a time series (Yue *et al.* 2002). Hydro-meteorological data (such as rainfall, streamflow, and temperature) generally have a skewed distribution, so non-parametric methods are more appropriate than parametric methods for trend detection. The MK test has therefore been widely used to evaluate the statistical significance of trends in these data series (Önöz & Bayazit 2003; Modarres & da Silva 2007; Kumar *et al.* 2010).

The key purpose of the research was to evaluate both the predicted monthly precipitation and the projections of the GCM within the three simulated scenarios by GEP. Moreover, for the observed climatic data between 1971 and 2005, this study has also used the MK trend analysis and the Sen slope estimator to compare the different pathways RCP2.6, RCP4.5, and RCP8.5 for the period 2021–2100.

## STUDY AREA AND DATA USED

The study area is the Göksu River basin, one of the sub-basins of the Dogu Akdeniz basin. The basin lies in southern Turkey near the Mediterranean coast, about 95 km from Mersin city, and covers approximately 17,048 km^{2}. The Karahacılı flow gauge station is used as the outlet point of the basin. The basin is located at 36°24′06.5″N latitude and 33°48′56.1″E longitude. Figures 1 and 2 show the location of the study area within the Göksu River basin. Five local gauge stations are used in the study: Ermenek, Mut, Karaman, Hadim, and Silifke. Tables 1 and 2 give essential information about the locations and properties of these gauge stations.

| | Station no. | Station name | Latitude (N) | Longitude (E) |
|---|---|---|---|---|
| A | 18,210 | ERMENEK | 36°38′01.0″ | 32°54′27.0″ |
| B | 17,956 | MUT | 36°39′05.0″ | 33°26′02.0″ |
| C | 17,246 | KARAMAN | 37°11′35.5″ | 33°13′12.7″ |
| D | 17,928 | HADIM | 36°59′21.5″ | 32°27′20.5″ |
| E | 17,330 | SILIFKE | 36°22′56.6″ | 33°56′14.3″ |

| Item | Silifke | Mut | Hadim | Ermenek | Karaman |
|---|---|---|---|---|---|
| Mean annual rainfall (mm) | 549.83 | 375.16 | 638.10 | 477.53 | 326.12 |
| Mean monthly rainfall (mm) | 46.25 | 31.43 | 53.42 | 40.04 | 27.34 |
| Maximum annual rainfall (mm) | 1,007.70 | 608.70 | 1,074.20 | 778.80 | 478.30 |
| Minimum annual rainfall (mm) | 300.50 | 135.80 | 421.30 | 268.00 | 212.60 |
| Maximum monthly rainfall (mm) | 504.50 | 248.80 | 405.80 | 289.00 | 144.10 |
| Duration of data (year) | 34 | 34 | 34 | 34 | 34 |
| Number of datasets (days) | 12,775 | 12,775 | 12,775 | 12,755 | 12,755 |

The monthly areal mean precipitation was divided into two periods: calibration and validation. The calibration period, from 1971 to 1995, was used for developing the statistical downscaling models, while the validation period, from 1996 to 2005, was used for examining the models' performance and comparing the downscaling results.

The mean precipitation data of these five stations were combined into the first dataset (the predictand set), the areal mean precipitation (P_{M}) of the basin, which was computed using the Thiessen polygons technique. The second dataset consists of the large-scale predictor variables, taken from the Canadian Center for Climate Modeling and Analysis (CCCMA), where various GCMs and future scenarios have been developed. This study uses only the Canadian Second Generation Earth System Model (CanESM2) with the RCP 2.6, RCP 4.5, and RCP 8.5 scenarios. These data were obtained from the official Canadian climate data and scenarios website (http://climate-modelling.canada.ca/climatemodeldata/cgcm4/CanESM2/). A GCM provides a numerical representation of the Earth's climate system. The model is built on vertical levels and a horizontal mesh of grid boxes for the land, ocean, and atmosphere. The reliability of any GCM depends on several factors, such as the size of the area considered, since large areas are resolved better than small ones. The location of the area also matters, because the level of agreement among GCM outputs varies considerably from one place to another.
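The Thiessen-weighted areal mean mentioned above can be illustrated with a minimal sketch. The polygon areas below are hypothetical placeholders (the paper does not report them); only the station precipitation values come from Table 2.

```python
def thiessen_areal_mean(station_precip, polygon_areas):
    """Area-weighted mean of station precipitation (Thiessen polygon method)."""
    total_area = sum(polygon_areas.values())
    weighted = sum(station_precip[name] * polygon_areas[name]
                   for name in polygon_areas)
    return weighted / total_area


# Mean monthly rainfall (mm) per station, taken from Table 2.
precip = {"Silifke": 46.25, "Mut": 31.43, "Hadim": 53.42,
          "Ermenek": 40.04, "Karaman": 27.34}

# HYPOTHETICAL polygon areas (km^2), for illustration only.
areas = {"Silifke": 3000.0, "Mut": 4000.0, "Hadim": 3000.0,
         "Ermenek": 4000.0, "Karaman": 3000.0}

p_m = thiessen_areal_mean(precip, areas)
print(f"Areal mean precipitation (illustrative): {p_m:.2f} mm")
```

With equal polygon areas the result reduces to the arithmetic mean of the stations, which is a quick sanity check for the weighting.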

The data are accessed from the website by grid cell. Each grid cell represents the area surrounding the corresponding model grid point. Data are requested by entering the decimal latitude and longitude of a location. In this study, the centroid of the basin was entered and the corresponding data were retrieved. The retrieved CanESM2 folders contain 26 variable datasets (predictors). The Canadian climate website provides the model variables on a daily time scale. Before the downscaling process could be applied to the large-scale data, the 26 daily GCM variables therefore had to be transformed to the corresponding maximum monthly mean data, as in the predictand set. Table 3 describes the 26 large-scale weather factors.
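As a rough illustration of the daily-to-monthly conversion described above, the sketch below averages daily values within each calendar month. The choice of a simple mean as the aggregation statistic and the synthetic input series are assumptions; the paper does not spell out the exact procedure.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(daily):
    """Average (date, value) pairs into a {(year, month): mean} mapping."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily:
        key = (day.year, day.month)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Synthetic daily predictor series (illustration only).
start = date(1971, 1, 1)
daily_series = [(start + timedelta(days=i), float(i % 10)) for i in range(90)]

for (year, month), value in monthly_means(daily_series).items():
    print(year, month, round(value, 3))
```

The same function would be applied to each of the 26 predictor series so that predictors and predictand share a monthly time step.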

| No. | CanESM2 predictors | Description | Short name |
|---|---|---|---|
| 1 | ceshmslpgl | Mean sea level pressure | V1 |
| 2 | ceshp1_fgl | 1,000 hPa Wind speed | V2 |
| 3 | ceshp1_ugl | 1,000 hPa Zonal velocity | V3 |
| 4 | ceshp1_vgl | 1,000 hPa Meridional velocity | V4 |
| 5 | ceshp1_zgl | 1,000 hPa Vorticity | V5 |
| 6 | ceshp1thgl | 1,000 hPa Wind direction | V6 |
| 7 | ceshp1zhgl | 1,000 hPa Divergence | V7 |
| 8 | ceshp5_fgl | 500 hPa Wind speed | V8 |
| 9 | ceshp5_ugl | 500 hPa Zonal velocity | V9 |
| 10 | ceshp5_vgl | 500 hPa Meridional velocity | V10 |
| 11 | ceshp5_zgl | 500 hPa Vorticity | V11 |
| 12 | ceshp5thgl | 500 hPa Wind direction | V12 |
| 13 | ceshp5zhgl | 500 hPa Divergence | V13 |
| 14 | ceshp8_fgl | 850 hPa Wind speed | V14 |
| 15 | ceshp8_ugl | 850 hPa Zonal velocity | V15 |
| 16 | ceshp8_vgl | 850 hPa Meridional velocity | V16 |
| 17 | ceshp8_zgl | 850 hPa Vorticity | V17 |
| 18 | ceshp8thgl | 850 hPa Wind direction | V18 |
| 19 | ceshp8zhgl | 850 hPa Divergence | V19 |
| 20 | ceshp500gl | Relative humidity at 500 hPa | V20 |
| 21 | ceshp850gl | Relative humidity at 850 hPa | V21 |
| 22 | ceshprcpgl | Total rainfall | V22 |
| 23 | ceshs500gl | Specific humidity at 500 hPa | V23 |
| 24 | ceshs850gl | Specific humidity at 850 hPa | V24 |
| 25 | ceshshumgl | Surface-specific humidity | V25 |
| 26 | ceshtempgl | Mean temperature at 2 m height | V26 |

## METHODOLOGY

The 26 large-scale weather factors provided by CanESM2 were used as predictors (inputs), and the areal mean precipitation over the basin was used as the predictand (output). All data groups are based on a monthly time scale. Both data classes are split into two subsets: calibration (1971–1995) and validation (1996–2005). The GEP technique was applied to forecast the downscaled monthly areal mean precipitation in the basin, and its results are compared with those of Linear Regression (LR). Finally, the MK trend analysis and Sen's slope estimator of P_{M} over the case study area were compared with the different pathways RCP2.6, RCP4.5, and RCP8.5 for 2021–2100.
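The calibration/validation split described above amounts to a filter on the year. The following sketch assumes the monthly data are held in a dictionary keyed by (year, month); that data layout and the function name are illustrative assumptions, not the authors' implementation.

```python
def split_by_year(records, calib=(1971, 1995), valid=(1996, 2005)):
    """Split {(year, month): value} records into calibration and validation sets."""
    calibration = {k: v for k, v in records.items() if calib[0] <= k[0] <= calib[1]}
    validation = {k: v for k, v in records.items() if valid[0] <= k[0] <= valid[1]}
    return calibration, validation


# Placeholder monthly records for 1971-2005 (values are dummies).
records = {(year, month): 0.0 for year in range(1971, 2006) for month in range(1, 13)}
calibration, validation = split_by_year(records)
print(len(calibration), "calibration months;", len(validation), "validation months")
```

For a complete monthly record this gives 300 calibration months (25 years) and 120 validation months (10 years).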

### Selection of the most effective predictors

Selecting the large-scale GCM weather factors (predictors) from the 26 candidate variables is a fundamental part of statistical downscaling. The variables were first transformed to maximize the association between the predictors and the predictand: both the predictand and the predictor datasets were converted to Ln variables (the natural logarithm of each value). Correlation analysis based on Pearson's correlation coefficient was then used to analyze the linear relationship between each input (the 26 large-scale weather factors) and the output (areal mean precipitation) separately, and the variables with the highest correlation coefficients were chosen as predictors for the downscaling models. This study retains predictors whose correlation exceeds 0.5 in absolute value (|R| > 0.5). Table 3 describes the 26 variables, and Table 4 lists the nine variables with |R| > 0.5, which are the GCM variables most strongly correlated with the areal mean precipitation. The magnitude of the correlation coefficient indicates the strength of the relationship between variables, and its sign shows whether the relationship is direct or inverse.
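The screening step can be sketched as follows: take the natural logarithm of each series, compute Pearson's R against the predictand, and keep the predictors with |R| > 0.5. This is a minimal sketch on synthetic data; it assumes strictly positive values (the paper does not state how non-positive predictor values were handled before the Ln transform), and the predictor names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_predictors(predictors, predictand, threshold=0.5):
    """Return {name: R} for predictors with |R| > threshold after an Ln transform.

    Assumes all values are strictly positive so that math.log is defined.
    """
    ln_y = [math.log(v) for v in predictand]
    selected = {}
    for name, series in predictors.items():
        r = pearson_r([math.log(v) for v in series], ln_y)
        if abs(r) > threshold:
            selected[name] = r
    return selected


# Synthetic example: two strongly related predictors and one unrelated one.
predictand = [1.0, 2.0, 4.0, 8.0, 16.0]
predictors = {
    "V_a": [2.0, 4.0, 8.0, 16.0, 32.0],   # direct relationship
    "V_b": [32.0, 16.0, 8.0, 4.0, 2.0],   # inverse relationship
    "V_c": [1.0, 3.0, 1.0, 3.0, 1.0],     # no linear relationship
}
selected = screen_predictors(predictors, predictand)
print(selected)
```

Using the absolute value matters here: predictors with a strong inverse relationship (negative R) are retained alongside those with a strong direct one.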

| No. | GCM predictors | Description | R |
|---|---|---|---|
| V1 | ceshmslpgl | Mean sea level pressure | 0.634 |
| V5 | ceshp1_zgl | 1,000 hPa vorticity | −0.581 |
| V7 | ceshp1zhgl | 1,000 hPa divergence | 0.509 |
| V17 | ceshp8_zgl | 850 hPa vorticity | −0.571 |
| V19 | ceshp8zhgl | 850 hPa divergence | 0.596 |
| V20 | ceshp500gl | Relative humidity at 500 hPa | −0.539 |
| V24 | ceshs850gl | Specific humidity at 850 hPa | −0.577 |
| V25 | ceshshumgl | Surface-specific humidity | −0.566 |
| V26 | ceshtempgl | Mean temperature at 2 m height | −0.662 |

Nine large-scale weather factors were used as the input data in the LR model: Mean sea level pressure (V1), 1,000 hPa Vorticity (V5), 1,000 hPa Divergence (V7), 850 hPa Vorticity (V17), 850 hPa Divergence (V19), Relative humidity at 500 hPa (V20), Specific humidity at 850 hPa (V24), Surface-specific humidity (V25), and Mean temperature at 2 m height (V26).
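A multiple linear regression on nine predictors can be fitted by ordinary least squares. The sketch below uses NumPy on synthetic data; the data, the coefficient values, and the function names are illustrative assumptions, and the paper does not state which software was used for its LR model.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a predictor matrix."""
    return coef[0] + np.asarray(X) @ coef[1:]


# Synthetic calibration set: 300 months x 9 predictors (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 9))
true_coef = np.arange(1, 10) / 10.0
y = 2.0 + X @ true_coef

coef = fit_linear_regression(X, y)
print(np.round(coef, 3))
```

Once fitted on the calibration months, the same coefficients would be applied to the validation-period predictors to obtain the LR estimates compared against GEP.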

### Downscaling using GEP technique

A function set (…, x^{2}, x^{3}, x^{4}, x^{5}, 3Rt, 4Rt, 5Rt, Ln) was chosen in the GEP program. Identifying the chromosome structure, which comprises the number of genes per chromosome and the size of each gene, is the second crucial stage for the GEP program. The best outcomes were achieved with four genes per chromosome for every GEP model. Choosing the linking function is the third important stage; in this study, ‘+’ (addition) was the chosen linking function. The fitness measure is the last stage: the Root-Relative Square Error (RRSE) of the training set is used as the fitness function. To calibrate the parameters and equations of the GEP downscaling model, the local precipitation (monthly areal mean precipitation) and the large-scale weather data of the calibration period were used. To produce the validation forecasts and the future predictions, the GEP model automatically selected the effective predictors, retaining eight of the nine inputs, and produced the calibration equation given in Equation (1). The eight large-scale weather factors used as input data in the GEP model to produce Equation (1) were Mean sea level pressure (V1), 1,000 hPa Vorticity (V5), 1,000 hPa Divergence (V7), 850 hPa Vorticity (V17), 850 hPa Divergence (V19), Relative humidity at 500 hPa (V20), Surface-specific humidity (V25), and Mean temperature at 2 m height (V26). The definition of each variable is given in Table 4.

In Equation (1), Rt denotes the root, × denotes multiplication between the data values, and Y represents the areal mean precipitation of the basin.
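The RRSE fitness measure named above compares a model's squared error with that of a trivial predictor that always returns the observed mean. A minimal sketch, using the standard definition of RRSE (the GEP software may rescale it internally, which is not covered here):

```python
import math


def rrse(observed, predicted):
    """Root relative squared error: 0 is a perfect fit, 1 matches the mean predictor."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    baseline_error = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(squared_error / baseline_error)


observed = [1.0, 2.0, 3.0]
print(rrse(observed, [1.0, 2.0, 3.0]))  # perfect prediction
print(rrse(observed, [2.0, 2.0, 2.0]))  # always predicting the mean
```

Values below 1 therefore indicate that a candidate expression outperforms simply predicting the mean of the calibration data.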

### Trend analysis

For the observed climatic data between 1971 and 2005, this study uses the MK trend analysis and the Sen's slope estimator to compare the different pathways RCP2.6, RCP4.5, and RCP8.5 for 2021–2100.

The descriptive statistics of the observed data and of the CanESM2 model under the scenarios RCP2.6, RCP4.5, and RCP8.5, including the number of observations, the mean, the standard deviation, and the maximum and minimum values, are listed in Table 5.

| Variable | Observations (years) | Average (mm) | Minimum (mm) | Maximum (mm) | Standard deviation (mm) |
|---|---|---|---|---|---|
| Observed (1971–2005) | 35 | 481.783 | 278.280 | 776.227 | 100.008 |
| RCP2.6 (2021–2100) | 80 | 341.312 | 332.098 | 350.526 | 5.420 |
| RCP4.5 (2021–2100) | 80 | 320.297 | 275.545 | 365.050 | 26.328 |
| RCP8.5 (2021–2100) | 80 | 281.669 | 203.190 | 360.148 | 46.169 |

To obtain the slope estimate for the non-parametric test on the series, a two-sided test is carried out at the 100(1 – *α*)% confidence level with *α* = 0.05 (Mondal *et al.* 2012). A negative slope Qi indicates a falling trend, and a positive slope indicates a rising trend.
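For readers who want to reproduce the trend statistics, the sketch below implements the MK test and Sen's slope estimator in plain Python. It is a minimal version that assumes no tied values (no tie correction in the variance) and uses the normal approximation for the two-sided *p*-value; the study itself used XLSTAT. With the no-tie formula, Var(S) = n(n − 1)(2n + 5)/18 gives 4,958.333 for n = 35 and 57,933.33 for n = 80, which matches the Var(S) values reported in Table 7.

```python
import math
from statistics import median


def mann_kendall(series):
    """Mann-Kendall trend test without tie correction (assumes no tied values)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, standard normal
    tau = s / (n * (n - 1) / 2)
    return {"S": s, "var_s": var_s, "z": z, "p": p_value, "tau": tau}


def sens_slope(series):
    """Sen's slope: median of all pairwise slopes for equally spaced data."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)


# Synthetic annual series with a steady decline (illustration only).
annual = [500.0 - 2.0 * year for year in range(35)]
result = mann_kendall(annual)
print(result["tau"], round(result["var_s"], 3), result["p"] < 0.05)
print(sens_slope(annual))
```

The null hypothesis of no trend is rejected when the *p*-value falls below *α* = 0.05; the sign of Sen's slope then gives the direction of the trend.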

## RESULTS AND DISCUSSION

The 26 large-scale weather factors obtained from CanESM2 were used as predictors, and the P_{M} was used as the predictand (output). All datasets are based on a monthly time scale. Both datasets were divided into two groups: the calibration period (1971–1995) and the validation period (1996–2005). The GEP downscaling technique was used to predict the downscaled P_{M} of the basin. The prediction results of the downscaling techniques were then compared with each other to identify the best model performance, with LR serving as a linear benchmark for the proposed nonlinear model. Finally, the trend analysis of P_{M} over the case study area was compared with the mean of the statistically downscaled CanESM2 projections.

Based on the performance of the candidate models in the calibration period, the GEP model provided the best results. Each model obtained from the calibration period was then validated using a simpler formulation; Equation (1) is used with the GEP model to generate the validation results. The results of the GEP and LR models are presented in terms of R for the validation period. Figure 3 shows scatter plots of the observed P_{M} (vertical axis, Y) against the predicted P_{M} (horizontal axis, X). Moving from left to right in Figure 3, the points for all models follow an upward pattern, consistent with the positive R-values, which implies that the models validate well against the observed data. The LR model had the lowest R-value (0.607), whereas the GEP model had the highest (0.775).

Table 6 compares the simulated P_{M} of the GEP and LR models with the observed P_{M} for the validation period 1996–2005. The simulated mean monthly areal precipitation is 34.322 mm for the GEP model and 30.763 mm for the LR model; both are below the observed mean of 38.589 mm, so both models underestimate the mean. Nevertheless, the GEP model simulates the mean better than the LR model. The GEP model also operates well for the simulated minimum precipitation, whereas the LR model performs poorly in this respect. There is a major difference between the maximum simulated P_{M} and the observed maximum. For the maximum precipitation, a minimum difference of 18% was recorded by the LR model. This gives the LR model the minimum underestimating difference of 8% when compared with the GEP model. Moreover, the GEP model has a lower mean absolute error (MAE) (19.231 mm) and root mean square error (RMSE) (31.896 mm) than the LR model, as shown in Table 6. The LR model's MAE and RMSE are about 3.3% and 3.7% greater, respectively, than those of the GEP model.
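The error measures used in this comparison can be written out in a few lines. The sketch below defines MAE and RMSE and then recomputes the relative differences between the LR and GEP models from the values reported in Table 6; the function names are illustrative.

```python
import math


def mae(observed, simulated):
    """Mean absolute error."""
    return sum(abs(s - o) for o, s in zip(observed, simulated)) / len(observed)


def rmse(observed, simulated):
    """Root mean square error."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / len(observed))


def percent_greater(value, reference):
    """How much larger `value` is than `reference`, in percent."""
    return (value - reference) / reference * 100.0


# MAE and RMSE values (mm) as reported in Table 6.
gep_mae, lr_mae = 19.231, 19.871
gep_rmse, lr_rmse = 31.896, 33.091

mae_gap = percent_greater(lr_mae, gep_mae)
rmse_gap = percent_greater(lr_rmse, gep_rmse)
print(f"LR MAE is {mae_gap:.1f}% greater; LR RMSE is {rmse_gap:.1f}% greater")
```

The recomputed gaps round to the 3.3% and 3.7% quoted in the text, so the two models differ only modestly in MAE and RMSE, while the difference in R is more pronounced.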

| Data | Mean (mm) | Min (mm) | Max (mm) | Std. deviation (mm) | MAE (mm) | RMSE (mm) | R | R^{2} |
|---|---|---|---|---|---|---|---|---|
| Observed | 38.589 | 1.000 | 265.254 | 40.666 | – | – | – | – |
| GEP model | 34.322 | 2.865 | 66.184 | 23.172 | 19.231 | 31.896 | 0.775 | 0.600 |
| Linear regression | 30.763 | 0.934 | 78.036 | 25.014 | 19.871 | 33.091 | 0.607 | 0.369 |

The P_{M} simulated by the GEP and LR models during the validation period is presented together with the observed data in Figure 4. The simulated P_{M} of all models along with the observed data is shown in Figure 5. That figure shows that, in July and August, the P_{M} was underestimated by the GEP and LR models. However, both models overestimated the P_{M} in February and November, as displayed in Figure 8. In contrast, both models misjudged the P_{M} in January, March, April, May, and December. The LR model underestimated the mean by 5.7% in May, which is less than the GEP model, as shown in Figure 8. Nevertheless, the GEP and LR models recorded the same precipitation and gave their best estimates in January, August, and December. The LR model's prediction of mean precipitation in February and November proved to be the best in comparison with the observed data. Compared with the LR model over the calibration and validation periods, the GEP model performs well in capturing the variations in monthly areal mean precipitation, as is clear from Figure 4, Figure 5, and Table 6.

For the validation period reported in Table 6, the GEP model performed best, with the highest correlation coefficient and slightly lower RMSE and MAE values. This suggests that the GEP model is suitable for projecting the P_{M} between 2021 and 2100 under the future emission scenarios RCP 2.6, RCP 4.5, and RCP 8.5.

## FUTURE PROJECTION OF P_{M} UNDER DIFFERENT EMISSION SCENARIOS

The downscaled outcomes from scenarios RCP 2.6, RCP 4.5, and RCP 8.5 were grouped into four periods of 20 years each: the 2020s (2021–2040), 2040s (2041–2060), 2060s (2061–2080), and 2080s (2081–2100). These were compared with the baseline period (1971–2005) to study the future change in P_{M} over the case study area. The GEP predictions of P_{M} under the CanESM2 scenarios for the various periods are provided in Figure 6, which shows that the predicted precipitation for each period is lower than the baseline mean precipitation. Likewise, compared with the baseline mean precipitation, the precipitation predicted by the GEP model decreases in the 2020s, 2040s, 2060s, and 2080s, with an increase in April of the 2020s, as displayed in Figure 7. Figure 8 also shows a decrease in the GEP-predicted precipitation relative to the baseline.

Under the scenarios RCP 2.6, RCP 4.5, and RCP 8.5, there will be a decline in the monthly mean precipitation P_{M} of each month for the years of the 2020s, 2040s, 2060s, and 2080s.

Under the RCP2.6 scenario, the P_{M} will be reduced in July; the decrease will be 51.6% for the 2020s, 48.5% for the 2040s, 51.3% for the 2060s, and 49.705% for the 2080s. Under the RCP4.5 scenario, there will be an obvious decrease in June: the decrease in P_{M} for the 2020s, 2040s, 2060s, and 2080s will be 77.7%, 81.2%, 82.9%, and 80.7%, respectively. Also in June, under the RCP8.5 scenario, the monthly mean precipitation of the 2020s, 2040s, 2060s, and 2080s will decline by 80.6%, 83.4%, 83%, and 85.3%, respectively. As shown in Figures 6–8, under RCP2.6, RCP4.5, and RCP8.5 the P_{M} will decrease in June, July, August, and September throughout the 2020s, 2040s, 2060s, and 2080s. Under RCP2.6, the projected mean precipitation is very close to the observed baseline mean precipitation in March and September. Similarly, under RCP4.5, the projected mean precipitation in August is close to the observed baseline mean precipitation. As shown in Figures 6 and 7, the precipitation predictions under the three scenarios differ in magnitude, but their patterns are similar. Moreover, the same figures show an annual decrease relative to the baseline period (1971–2005) under all scenarios.
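The percentage changes quoted above follow from comparing each 20-year period mean with the baseline mean. A minimal sketch with a synthetic projected series is given below; the data and function name are illustrative assumptions, and in the study the comparison is made month by month, so the function would be applied to one calendar month at a time.

```python
PERIODS = {
    "2020s": (2021, 2040),
    "2040s": (2041, 2060),
    "2060s": (2061, 2080),
    "2080s": (2081, 2100),
}


def percent_change_by_period(projected, baseline_mean):
    """Percent change of each period mean relative to the baseline mean.

    `projected` maps year -> value; negative results indicate a decrease.
    """
    changes = {}
    for label, (first, last) in PERIODS.items():
        values = [projected[y] for y in range(first, last + 1) if y in projected]
        period_mean = sum(values) / len(values)
        changes[label] = (period_mean - baseline_mean) / baseline_mean * 100.0
    return changes


# Synthetic projection declining by 0.5 units per year (illustration only).
projected = {year: 100.0 - 0.5 * (year - 2021) for year in range(2021, 2101)}
changes = percent_change_by_period(projected, baseline_mean=100.0)
for label, change in changes.items():
    print(label, f"{change:.2f}%")
```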

Figure 9 shows the expected P_{M} for the various periods under the scenarios RCP2.6, RCP4.5, and RCP8.5, as projected by the downscaling models. Figure 9 indicates that, under all scenarios from the period 2021–2049 to the period 2070–2099, the annual precipitation of the GEP model is the lowest. In Figure 9, the mean annual precipitation downscaled by the models for the baseline period 1971–2005 is represented by small points, while the mean observed precipitation is represented by the dotted straight line.

### Trend analysis of P_{M} over the case study area by mean of statistically downscaled CanESM2 future climate projection

The trend of P_{M} over the case study area during 1971–2005 was analyzed with the MK test and Sen's slope estimator using XLSTAT software; the same analysis was applied to the CanESM2 model under the scenarios RCP2.6, RCP4.5, and RCP8.5 for 2021–2100. The MK trend test indicated a decreasing trend in the observed series of mean yearly P_{M}. However, as the computed *p*-value (0.053) is greater than the significance level alpha = 0.05, one cannot reject the null hypothesis H0, so this trend is not statistically significant at the 95% confidence level. The direction of the trend in mean P_{M} was nevertheless falling in every series analyzed. Table 7 shows the findings of the non-parametric analyses (Kendall's tau, Var (S), *p*-value, and Sen's slope).

Series\Test | Kendall's tau | Var (S) | p-value | Sen's slope (mm/year) |
---|---|---|---|---|
Observed (1971–2005) | −0.230 | 4,958.333 | 0.053 | −2.906 |
RCP 2.6 (2021–2100) | −0.101 | 57,933.33 | 0.188 | −2.906 |
RCP 4.5 (2021–2100) | −0.395 | 57,933.33 | <0.0001 | −1.110 |
RCP 8.5 (2021–2100) | −0.504 | 57,933.33 | <0.0001 | −2.007 |

The tests summarized in Table 7 indicate a downward trend in the total, mean, maximum, and minimum precipitation in the study area between 1971 and 2005, but these trends were not statistically significant at the 95% confidence level. In addition, precipitation was lower in the period 1996–2005 than in the period 1971–1995. Furthermore, Sen's estimator revealed a trend of −2.906 mm/year in annual total precipitation for the period 1971–2005. In this section of the study, the observed trend was compared with the trends of the RCP2.6, RCP4.5, and RCP8.5 series for the period 2021–2100, as illustrated in Figure 10.
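For reference, the quantities reported in Table 7 can be related through the standard definitions of the MK test. The expressions below assume a series x_1, …, x_n of annual values with no ties; how XLSTAT treats ties or autocorrelation is not described in this section, so this is an illustrative restatement rather than the exact computation performed.

$$
S = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\operatorname{sgn}(x_j - x_i), \qquad
\tau = \frac{S}{n(n-1)/2}, \qquad
\operatorname{Var}(S) = \frac{n(n-1)(2n+5)}{18}
$$

$$
Z = \begin{cases}
\dfrac{S-1}{\sqrt{\operatorname{Var}(S)}}, & S > 0 \\[2ex]
0, & S = 0 \\[1ex]
\dfrac{S+1}{\sqrt{\operatorname{Var}(S)}}, & S < 0
\end{cases}
$$

With n = 35 (1971–2005), Var(S) = 35 · 34 · 75 / 18 ≈ 4,958.333, and with n = 80 (2021–2100), Var(S) = 80 · 79 · 165 / 18 ≈ 57,933.33, which agrees with the Var (S) column of Table 7. For the observed series, taking S ≈ −137 (an approximation recovered from the rounded tau of −0.230 and 595 pairs) gives Z ≈ −136 / 70.42 ≈ −1.93, corresponding to a two-sided *p*-value of about 0.053.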

The annual MK test and Sen's slope estimator were applied to each scenario over the entire basin, as for the observed values. The trend analysis of P_{M} for the period 2021–2100 found a decreasing trend for each scenario. For RCP2.6, the MK trend test indicated a decreasing trend in the series of mean annual P_{M} that is not statistically significant (as the computed *p*-value is higher than the significance level alpha = 0.05, one cannot reject the null hypothesis H0). The RCP4.5 scenario produced a decreasing curve and was slightly closer to the trend for the mean P_{M}. The RCP8.5 scenario produced a downward-sloping curve and was quasi-asymptotic for trend analysis (as the calculated *p*-value is lower than the significance level alpha = 0.05, one should reject the null hypothesis H0 and accept the alternative hypothesis Ha). Moreover, the scenario RCP8.5 projects more suitably for the case study than the scenarios RCP4.5 and RCP2.6. The trend analysis of P_{M} over the case study area was compared with the mean statistically downscaled CanESM2 model, and the trend was shown to be decreasing.
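The MK test and Sen's slope estimator described above can be sketched in a few lines. The study itself used XLSTAT, so the code below is only an illustrative re-implementation under stated assumptions: no tie correction, no correction for serial correlation, a two-sided normal approximation for the *p*-value, and evenly spaced (yearly) observations. The function names and the demo series are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(series):
    """Two-sided Mann-Kendall trend test (assumes no tied values)."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    # S: number of increasing pairs minus number of decreasing pairs,
    # always comparing a later value (b) with an earlier one (a).
    s = sum((b > a) - (b < a) for a, b in combinations(series, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # no tie correction
    tau = s / (n * (n - 1) / 2.0)
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return {"S": s, "tau": tau, "var_s": var_s, "z": z, "p": p}


def sens_slope(series):
    """Median of all pairwise slopes, in units per time step."""
    slopes = [(series[j] - series[i]) / (j - i)
              for i, j in combinations(range(len(series)), 2)]
    return median(slopes)


if __name__ == "__main__":
    # Hypothetical 35-year annual series (mm), NOT data from the study.
    demo = [600.0 - 3.0 * k + (25.0 if k % 2 else -25.0) for k in range(35)]
    result = mann_kendall(demo)
    alpha = 0.05
    verdict = "reject H0" if result["p"] < alpha else "cannot reject H0"
    print(result, sens_slope(demo), verdict)
```

The decision rule in the demo mirrors the one used in the text: H0 (no trend) is rejected only when the *p*-value falls below alpha = 0.05.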

The analyses in Table 7 identify a downward trend under all scenarios in the study area between 2021 and 2100. At the 95% confidence level, these trends are statistically significant for RCP4.5 and RCP8.5 (*p* < 0.0001) but not for RCP2.6 (*p* = 0.188). In addition, the values exceeded the expected overall mean rainfall, and the period 2070–2100 was drier than the period 2021–2069. Moreover, Sen's estimator revealed that annual total rainfall under the RCP2.6, RCP4.5, and RCP8.5 scenarios has trends of −0.282 mm/year, −1.110 mm/year, and −2.007 mm/year, respectively, as shown in Figure 11.

## CONCLUSIONS

The GEP downscaling method has been used to forecast the basin's downscaled monthly areal mean precipitation. To evaluate the outputs of the GEP model, its predictions have been compared with those of LR. The study also extends to the downscaling of GCM-scale output from the CanESM2 model under the RCP2.6, RCP4.5, and RCP8.5 future emission scenarios.

The suitability of the GEP model for estimating P_{M} in the calibration period (1971–1995) and the validation period (1996–2005) has been assessed from statistical measures of agreement between observed and downscaled data. The GEP's correlation coefficient (R) is 0.827 for the calibration period and 0.755 for the validation period.

For the three scenarios, the predictions show a similar, marked decreasing pattern; however, they differ in the quantity of P_{M}. In general, all three scenarios project a decline in the average yearly P_{M} during the coming century. To predict the potential shift in P_{M} in the basin, the downscaled outcomes of the scenarios RCP2.6, RCP4.5, and RCP8.5 have been split into four 20-year time ranges: the 2020s (2021–2040), 2040s (2041–2060), 2060s (2061–2080), and 2080s (2081–2100).

With the help of trend analysis, the modeled scenarios RCP2.6, RCP4.5, and RCP8.5 have been compared with the baseline period (1971–2005). To display the trend of the potential variations in P_{M} in the basin, the MK test and Sen's slope estimator have been applied. Over the period 2021–2100, the scenario RCP8.5 indicates a sharper decline in precipitation and projects more suitably for the case study than the scenarios RCP4.5 and RCP2.6.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.
