## Abstract

Climate change is a global issue posing threats to the human population and water systems. As Malaysia experiences a tropical climate with intense rainfall occurring throughout the year, accurate rainfall simulations are particularly important to provide information for climate change assessment and hydrological modelling. An artificial intelligence-based hybrid model, the bootstrap aggregated classification tree–artificial neural network (BACT-ANN) model, was proposed for simulating rainfall occurrences and amounts over the Langat River Basin, Malaysia. The performance of the proposed BACT-ANN model was evaluated and compared with the stochastic non-homogeneous hidden Markov model (NHMM). The observed daily rainfall series for the years 1975–2012 at four rainfall stations were used. The BACT-ANN model was found to perform better, although with slight underpredictions of the wet spell lengths. The BACT-ANN model scored better for the probability of detection (POD), false alarm rate (FAR) and the Heidke skill score (HSS). The NHMM tended to overpredict the rainfall occurrence while being less capable with the statistical measures such as distribution, equality, variance and statistical correlations of rainfall amount. Overall, the BACT-ANN model was considered the more effective tool for simulating the rainfall characteristics in the Langat River Basin.

## INTRODUCTION

According to the Intergovernmental Panel on Climate Change Fourth Assessment Report (AR4), climate change is an identifiable change in the state of the climate, caused by either anthropogenic activities or natural variability, that persists for decades or longer. Global warming and the frequent occurrence of extreme hydrological events are the major consequences triggered by climate change, and they bring about disasters and significant impacts on the environment and livelihoods around the world. Climate change has been, and still is, the most critical global issue for the world to tackle. The Global Climate Models (GCMs) are advanced and complicated models able to express the physical processes in the atmosphere over the land surface and ocean. The response of the global climate system to the rise in concentration of greenhouse gases under different scenarios is simulated well by these GCMs. However, these GCMs are three-dimensional models on a global scale, and their resolution is considered too coarse for direct use in any impact assessment over a region (Ahmed *et al.* 2013). Therefore, two types of downscaling approaches (dynamical and statistical downscaling) have been developed to bridge this gap by minimizing the associated discrepancies and providing the required local climate variables.

Statistical downscaling is a statistical technique that is used to predict the local climate variable through a robust relationship established between the large-scale (global) atmospheric variables and the local climate variables (Tang *et al.* 2016). Numerous studies have reported that artificial neural network (ANN) is a well-known regression-based model, capable of establishing a relationship between global atmospheric variables and local climate variables for downscaling purpose (Ahmed *et al.* 2015; Vu *et al.* 2016). In the study of Mendes *et al.* (2014), the proposed ANN model with backpropagation algorithm showed the capability in downscaling the observed rainfall with a correlation greater than 88%. Similarly, good performances were obtained in the study of Campozano *et al.* (2016), with the ANN showing overall better prediction ability than the statistical downscaling model (SDSM) in downscaling the monthly rainfall over the Paute River Basin in southern Ecuador. However, both models underpredicted the median value of the rainfall depth for November. They suggested the approach of downscaling with specific predictors in certain months or seasons may improve the performance of the model.

The approach of weather classification is a technique of grouping the local variables into different classes of large-scale climate variables in accordance with their patterns (Mehrotra & Sharma 2005). The non-homogeneous hidden Markov model (NHMM) is well known for its capability in capturing spatial and temporal variability by recognizing distinct weather patterns across multiple stations (Greene *et al.* 2011). Liu *et al.* (2011) evaluated the capability of both SDSM and NHMM in simulating the daily precipitation over an arid basin in China. As predicted, the NHMM performed slightly better than the SDSM in capturing the spatial distribution characteristics of rainfall. This is attributed to the NHMM modelling the multi-site rainfall with the consideration of spatial correlations. Both models exhibited less ability in reproducing the inter-annual variability of the monthly and annual rainfall series. However, the analysis of intra-annual monthly rainfall correlation indicated that the NHMM was capable of simulating the monthly rainfall well in all the months, while the SDSM was good only for certain months. Furthermore, the NHMM has been used as a downscaling model in numerous studies over the years and performed reasonably well in simulating the rainfall occurrence and amount (Gelati *et al.* 2010; Liu *et al.* 2013; Mares *et al.* 2014).

Classification is one of the supervised machine learning techniques used for statistical downscaling purposes, after a relationship is mapped between the atmospheric (large-scale/global) variables and local climate variables. The classification decision tree models the rainfall occurrence by predicting the rainfall state as a function of atmospheric variables (Kannan & Ghosh 2011). This model gives satisfactory results in capturing the occurrence of rainfall (Ingsrisawang *et al.* 2008). The daily rainfall amount can subsequently be generated using a regression method, such as a nonparametric kernel regression model (Kannan & Ghosh 2013) or a beta regression model (Mandal *et al.* 2016), conditioned on the predicted rainfall state from the classification decision tree. The classification and regression tree (CART) can be further improved by applying bootstrap aggregation (Gaitan *et al.* 2014) and a random forest algorithm (Jing *et al.* 2016) in modelling the rainfall occurrence and rainfall amount, respectively. In the study of Gaitan *et al.* (2014), the ensemble of classification trees showed an obvious improvement in the Peirce skill score (PSS) over a single classification tree in modelling the rainfall occurrence. Jing *et al.* (2016) compared four machine learning algorithms, namely, the CART, the k-nearest neighbours (k-NN), the support vector machine (SVM) and the random forests (RF) algorithms, to downscale the monthly rainfall data over North China. The validation results showed that the RF-based model achieved the highest accuracy, followed by the SVM, CART and lastly, k-NN.

In the field of hydrology, hybrid approaches have overcome the deficiency or poor performance of traditional individual models in predicting or downscaling the rainfall. The SDSM is the best example of an early hybrid model, combining a stochastic weather generator with multiple linear regression techniques. It has been frequently used in downscaling applications due to its good simulation performance and low cost (Amirabadizadeh *et al.* 2016). Over the years, numerous new hybrid models have been proposed. Examples include the hybrid WNN model (Ramana *et al.* 2013), which combines the wavelet technique with the ANN for monthly rainfall prediction; the coupling of logistic regression with ANN and partial least squares regression for downscaling the daily rainfall of the Saguenay watershed in Canada (Muluye 2012); the wavelet transform and SVM hybrid model (Halik *et al.* 2015) for reservoir inflow prediction under the GCM scenario; and the hybrid generalized linear model–artificial neural network model (Abdellatif *et al.* 2013) for future rainfall simulation. These hybrid models have exhibited better simulation performance when compared to their original individual models.

The main objective of this study is to propose an efficient rainfall simulation model for the Langat River Basin, Malaysia. An artificial intelligence-based hybrid model with a data pre-processing approach, namely, the bootstrap aggregated classification tree–artificial neural network (BACT-ANN) model, was therefore investigated for its efficiency in rainfall simulation for the Langat River Basin. The BACT-ANN model was calibrated and validated for the study area, and its capability was evaluated together with the stochastic non-homogeneous hidden Markov model (NHMM). Malaysia experiences unusually localized tropical rainfall, where the rainfall can occur over a small specific area but is not immediately observed over nearby localities (Ahmad & Sidek 2015; Ng *et al.* 2018). With this in mind, the scope of this study is to perform single-site rainfall simulation without considering the spatial dependence of rainfall data.

## STUDY AREA

The Langat River Basin is located in the mid-western part of Peninsular Malaysia with a total catchment area of 2,282 km². It lies within 2° 40′ 152″N to 3° 16′ 15″N latitudes and 101° 19′ 20″E to 102° 1′ 10″E longitudes. The three main tributaries of the Langat River Basin are the Langat River, the Labu River and the Semenyih River. It experiences a tropical climate with intense localized rainfall, high relative humidity and uniform temperatures throughout the year. There are two reservoirs in the Langat River Basin, namely, Langat and Semenyih, built for domestic and industrial water supply while serving as mitigation measures for flood protection.

## DATA

### Observed rainfall series

Four rainfall stations, namely, Pejabat JPS Sg. Manggis (station 2815001), P/KWLN/S Telok Gong (station 2917001), RTM Kajang (station 2913001) and Sek. Keb. Kg. Sg. Lui (station 3118102), with the collected rainfall series of years 1975–2012, were chosen for their location within the Langat River Basin and are indicated in Figure 1. The observed daily rainfall series were acquired from the Department of Irrigation and Drainage Malaysia (DID) and the Malaysian Meteorological Department (MMD). A threshold value of 0.1 mm rainfall was used to define a wet day (Suhaila & Jemain 2009; Syafrina *et al.* 2015), as the use of a larger threshold value may cause the underestimation of rainfall occurrence over the Langat River Basin.

### Reanalysis dataset for atmospheric predictors

The large-scale/global data required for this study were the reanalysis datasets from the National Centre for Environmental Prediction and National Centre for Atmospheric Research (NCEP & NCAR). These potential predictors are the daily values of 26 variables (1961–2005), which consist of circulation variables (i.e., geopotential and wind component), temperature, radiation and moisture variables (specific humidity), etc. For consistency between the observed rainfall series and reanalysis dataset, their daily data from years 1976–2005 were acquired from their respective raw dataset for modelling. Each potential variable went through the lag-transformation process (from lag −9 to lag 9) for selecting the suitable predictor variables, which are highly correlated with the observed rainfall series, at each station.
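The lag-transformation step can be illustrated with a short sketch. This is not the authors' implementation: the function names are hypothetical, the sign convention (lag *k* pairs the rainfall on day *t* with the predictor on day *t* + *k*) is an assumption, and a plain Pearson correlation stands in for the screening criteria used in the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_pairs(predictor, rainfall, lag):
    """Pair rainfall[t] with predictor[t + lag]; days falling off either end are dropped."""
    n = len(rainfall)
    days = [t for t in range(n) if 0 <= t + lag < n]
    return [predictor[t + lag] for t in days], [rainfall[t] for t in days]

def best_lag(predictor, rainfall, max_lag=9):
    """Score every lag from -max_lag to +max_lag and return the strongest one."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        x, y = lagged_pairs(predictor, rainfall, lag)
        scores[lag] = pearson(x, y)
    return max(scores, key=lambda k: abs(scores[k])), scores

# Toy check with made-up numbers: rainfall copies the predictor from the previous day.
predictor = [float((7 * i) % 11) for i in range(40)]
rainfall = [0.0] + predictor[:-1]
lag, scores = best_lag(predictor, rainfall)
print(lag)  # -1 under the sign convention assumed above
```

In practice the same scan would be repeated for each of the 26 candidate variables at each station, keeping only the lag with the strongest relationship to the observed series.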

## METHODOLOGY

The workflow of this study, which includes the analysis of rainfall series, selection of predictors, development of rainfall simulation models (BACT-ANN model and NHMM) and evaluation of models, is presented in Figure 2.

### Normality tests of observed rainfall series

A necessary step prior to model development and evaluation of results is to check the behaviour (normal or non-normal distribution) of the observed rainfall series. Three normality tests were used in this study to determine the behaviour of the observed rainfall series at each station, namely, the Anderson–Darling test, the Lilliefors test and the Jarque–Bera test. All three tests share the same null hypothesis, that the sample is normally distributed, against the alternative hypothesis that the sample is not normally distributed.
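As an illustration of how one of these tests operates, the sketch below computes the Jarque–Bera statistic from the sample skewness and kurtosis and converts it to a *p*-value using the asymptotic chi-square distribution with two degrees of freedom. The function name and the toy rainfall values are hypothetical, and the asymptotic *p*-value is only a rough guide for small samples.

```python
import math

def jarque_bera(sample):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in sample) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in sample) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Made-up daily series: mostly dry days with a few heavy events (strongly skewed).
toy_rain = [0.0] * 60 + [0.5, 1.2, 3.0, 4.5, 8.0, 12.5, 20.0, 35.0, 60.0, 95.0]
jb, p = jarque_bera(toy_rain)
print(p < 0.05)  # True: normality is rejected for this skewed toy series
```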

### Screening of predictors

In a complete NCEP & NCAR reanalysis dataset, the majority of the 26 potential predictors are mutually correlated. They contain plenty of information, but part of it is redundant since it has no significant effect on the downscaling process. To select a suitable set of predictors for this study, both the observed daily rainfall series and the NCEP & NCAR reanalysis dataset were screened to evaluate their relationship. During the screening process, the analyses of explained variance and partial correlation were used as the selection criteria for suitable predictors. Each predictor and its lag-transformed versions were screened against the daily observed rainfall series to select the most suitable set of predictors for each station.

### Bootstrap aggregated classification tree–artificial neural network (BACT-ANN) model

In this study, the motivation for combining the BACT and the ANN models is to enhance the rainfall amount simulation performance of the ANN model. The additional BACT rainfall occurrence model and the data pre-processing approach are what distinguish the BACT-ANN model from the conventional ANN model.

#### BACT rainfall occurrence model

In this study, the classification decision tree is more suitable to be used for rainfall occurrence modelling over the regression decision tree. This is because the outcome of rainfall occurrence state is either a rainy or a non-rainy day, which could be treated as a classification type problem. At this stage, the observed daily rainfall series was transformed into observed daily rainfall occurrence binary series with the threshold value of 0.1 mm for rainfall occurrence simulation. In order to enhance the prediction ability of an individual classification decision tree, the optimal bootstrap aggregation (bagging) with 150 trees was employed on the classification tree according to reported previous work (Lian *et al.* 2019). Each tree in the ensemble is generated with a freely drawn bootstrap replica of input data and the final classification result was determined based on the majority votes. Besides, the random forest algorithm was also adopted in the rainfall occurrence model by setting the random selection of the number of predictor variables for each node to ‘all’ in the Matlab platform.
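The bagging procedure can be sketched in a few lines. This is a simplified illustration under stated assumptions rather than the Matlab configuration used in the study: each ensemble member is a one-split tree (a decision stump) instead of a full classification tree, a day is treated as wet when rainfall is at least 0.1 mm, and all function names and toy values are hypothetical.

```python
import random
from collections import Counter

def to_occurrence(rain_mm, threshold=0.1):
    """Binary occurrence series: 1 = rainy day, 0 = non-rainy day (>= is an assumption)."""
    return [1 if r >= threshold else 0 for r in rain_mm]

def fit_stump(X, y):
    """Fit a one-split tree by exhaustive search over features, cuts and directions."""
    best = None
    for j in range(len(X[0])):
        for cut in sorted({row[j] for row in X}):
            for wet_above in (True, False):
                pred = [int((row[j] > cut) == wet_above) for row in X]
                errors = sum(p != t for p, t in zip(pred, y))
                if best is None or errors < best[0]:
                    best = (errors, j, cut, wet_above)
    return best[1:]

def predict_stump(stump, row):
    j, cut, wet_above = stump
    return int((row[j] > cut) == wet_above)

def fit_bagged(X, y, n_trees=150, seed=0):
    """Train each member on a bootstrap replica drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble

def predict_bagged(ensemble, row):
    """Final class is decided by majority vote across the ensemble."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]

# Made-up single-predictor example: rain falls only on the more humid days.
humidity = [[0.05], [0.15], [0.25], [0.35], [0.45], [0.55], [0.65], [0.75], [0.85], [0.95]]
rain = [0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 5.0, 5.0]
ensemble = fit_bagged(humidity, to_occurrence(rain))
print(predict_bagged(ensemble, [0.9]), predict_bagged(ensemble, [0.1]))
```

Averaging many trees trained on different bootstrap replicas reduces the variance of a single tree, which is the reason bagging tends to stabilize the simulated occurrence series.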

Data pre-processing is one of the important steps in developing the BACT-ANN model, which can enhance the performance of the ANN rainfall amount model by ensuring it simulates the rainfall amount conditional on the simulated wet days. The observed daily rainfall series was pre-processed using the trained BACT model, where some of the ‘rainy days’ might become ‘non-rainy days’ even though they are indicated as ‘rainy days’ in the original series. However, the observed rainy/non-rainy days that matched the simulated rainy/non-rainy days from the trained BACT model retain their original values in the observed daily rainfall series.
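One possible reading of this pre-processing step is sketched below. The function names are hypothetical, and the treatment of days that were observed dry but simulated wet is an assumption (they are left unchanged), since the text does not describe that case.

```python
def preprocess_rainfall(observed_mm, simulated_wet, threshold=0.1):
    """Reset observed rainy days to non-rainy when the occurrence model simulates a dry day."""
    processed = []
    for amount, wet in zip(observed_mm, simulated_wet):
        observed_wet = amount >= threshold
        if observed_wet and not wet:
            processed.append(0.0)      # observed rainy day reclassified as non-rainy
        else:
            processed.append(amount)   # assumed: every other day keeps its original value
    return processed

def wet_day_subset(processed_mm, predictors, threshold=0.1):
    """Keep only the wet days and their predictors for fitting the amount model."""
    return [(x, r) for x, r in zip(predictors, processed_mm) if r >= threshold]

# Made-up four-day example.
observed = [0.0, 5.2, 0.3, 12.0]
simulated = [0, 1, 0, 1]
processed = preprocess_rainfall(observed, simulated)
print(processed)  # [0.0, 5.2, 0.0, 12.0]
```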

#### ANN rainfall amount model

During this stage, the process of rainfall simulation is similar to that of the conventional ANN model. The only difference is that the pre-processed observed daily rainfall series (wet day series) and their corresponding predictors were fitted into the model instead of the whole dataset of observed daily rainfall series and selected predictors. The model only focuses on the rainy days with rainfall values, but not on the whole original series of rainy and non-rainy days. The Levenberg–Marquardt backpropagation algorithm was adopted in the ANN model, while the tan-sigmoid transfer function and linear transfer function were applied to the hidden layer and output layer, respectively. The optimum number of hidden neurons in a hidden layer was determined using the trial and error method, which has been deemed the best solution and is commonly used in previous studies (Mendes *et al.* 2014; Campozano *et al.* 2016). The simulation results of the BACT-ANN model and NHMM were evaluated for the rainfall occurrences and amount predictions.
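The network structure described above (one hidden layer with a tan-sigmoid transfer function and a linear output neuron) corresponds to the forward pass sketched below. Only the forward computation is shown; the Levenberg–Marquardt training step is omitted, and the weight initialization, dictionary layout and function names are illustrative assumptions.

```python
import math
import random

def init_network(n_inputs, n_hidden, seed=0):
    """Create a three-layer network with small random weights and zero biases."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }

def forward(net, x):
    """Tan-sigmoid (tanh) hidden layer followed by a linear output neuron."""
    hidden = [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]

# Five predictors and 60 hidden neurons, matching one of the reported configurations.
net = init_network(n_inputs=5, n_hidden=60)
print(forward(net, [0.2, -0.1, 0.4, 0.0, 0.3]))
```

A trial and error search would repeat the training for several candidate values of `n_hidden` and keep the one with the lowest validation error.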

### Non-homogeneous hidden Markov model (NHMM)

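In the NHMM, the rainfall observations at day *t* are assumed to be independent of all other variables, but they are conditional on the weather state at day *t*, which can be expressed as:

$$P(R_t \mid S_{1:T}, R_{1:t-1}, X_{1:T}) = P(R_t \mid S_t)$$

where $R_t$ is the rainfall observation at day *t*, $S_t$ is the hidden weather state at day *t*, $X_{1:T}$ is the sequence of atmospheric predictors and *T* is the length of the series.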

At this stage, the process is considered homogeneous in time because the transition probability matrix between the states remains unchanged over time. The optimum number of hidden states was determined by fitting the observed daily rainfall occurrence series into the NHMM using different numbers of hidden states (from one to seven).

The parameter estimation in NHMM was done using an algorithm of iterative expectation maximization (EM), and the EM algorithm was initialized ten times from different random starting points. The Bayesian information criterion (BIC) score was used as the criterion for selecting the optimum number of hidden states. Then, the NHMM was fitted with the selected predictors to generate the rainfall amount series, using delta and gamma distributions to model the dry days and wet days, respectively.
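The selection of the number of hidden states by BIC can be sketched as follows, assuming the common definition BIC = −2 ln *L* + *k* ln *n*, where *L* is the maximized likelihood, *k* the number of free parameters and *n* the number of observations. The log-likelihoods and parameter counts below are made-up values for illustration, not results from the study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_states(fits, n_obs):
    """fits maps number of hidden states -> (log-likelihood, number of free parameters)."""
    scores = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fits for one station (illustrative values only).
toy_fits = {1: (-7200.0, 2), 2: (-6900.0, 7), 3: (-6850.0, 14), 4: (-6840.0, 23)}
best_k, bic_scores = select_states(toy_fits, n_obs=10000)
print(best_k)  # 3: the likelihood gain from a fourth state does not offset its penalty
```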

### Evaluation of models

#### Rainfall occurrence model

HSS measures the fraction of correct predictions after removing those expected from a random prediction that is statistically independent of the observation. The HSS index ranges from −1 to 1, with a perfect forecast when HSS = 1. The forecast is treated as an unskilled random forecast when HSS = 0, while the random forecast is more accurate than the model forecast when HSS < 0.

In the above equations, *a* is the number of events which are predicted and actually happened; *b* is the number of events which are predicted but actually did not happen; *c* is the number of events which are not predicted but actually happened; and *d* is the number of events which are not predicted and actually did not happen.
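Using the contingency counts *a*, *b*, *c* and *d* defined above, the three indices can be computed as in the sketch below. The formulas are the commonly used definitions and are assumed here rather than taken from the study: POD = *a*/(*a* + *c*), FAR = *b*/(*a* + *b*) (the false alarm ratio form) and HSS = 2(*ad* − *bc*)/[(*a* + *c*)(*c* + *d*) + (*a* + *b*)(*b* + *d*)]. The function names are hypothetical.

```python
def contingency(observed, simulated):
    """Count the four outcomes of a binary wet/dry forecast."""
    a = sum(1 for o, s in zip(observed, simulated) if s and o)          # predicted, happened
    b = sum(1 for o, s in zip(observed, simulated) if s and not o)      # predicted, did not happen
    c = sum(1 for o, s in zip(observed, simulated) if not s and o)      # not predicted, happened
    d = sum(1 for o, s in zip(observed, simulated) if not s and not o)  # not predicted, did not happen
    return a, b, c, d

def skill_scores(a, b, c, d):
    """Return (POD, FAR, HSS); assumes the denominators are non-zero."""
    pod = a / (a + c)
    far = b / (a + b)
    hss = 2.0 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))
    return pod, far, hss

# Made-up six-day example: two hits, one false alarm, one miss, two correct dry days.
obs = [1, 1, 0, 1, 0, 0]
sim = [1, 1, 1, 0, 0, 0]
pod, far, hss = skill_scores(*contingency(obs, sim))
print(round(pod, 3), round(far, 3), round(hss, 3))  # 0.667 0.333 0.333
```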

These analyses evaluate the ability of a rainfall occurrence model to simulate the observed rainfall occurrence day by day, which is very useful in the evaluation of statistical downscaling.

#### Rainfall amount model

The monthly rainfall series (Table A1) was obtained through the summation of the daily rainfall series for comparing and evaluating the performance of both the BACT-ANN model and the NHMM in monthly rainfall simulation. If the distribution of the observed rainfall series was shown to be skewed and not normally distributed in the normality tests, then non-parametric tests were adopted for the evaluation of model performance. The distribution and equality of variance between the observed and simulated monthly rainfall series were evaluated using the Kolmogorov–Smirnov test, the Mann–Whitney U test and the squared-rank test.

The Kolmogorov–Smirnov (K-S) test is one of the non-parametric statistical hypothesis tests used to determine whether two independent samples of data come from the same or different distributions. The Mann–Whitney U test is used to evaluate the equality between the observed and simulated monthly rainfall series relative to their rankings. If the computed *p*-value exceeds the significance level, the null hypothesis that the difference of location between the samples equals zero cannot be rejected. The squared-rank test is another non-parametric test, which can be used to assess the equality of variance between the observed and simulated monthly rainfall series, with the null hypothesis that the samples come from identical distributions.
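To make the K-S comparison concrete, the sketch below computes the two-sample statistic *D* (the largest gap between the two empirical distribution functions) and an asymptotic *p*-value from the Kolmogorov series. The function names and toy values are hypothetical, and the asymptotic *p*-value is only approximate for samples as short as a 30-year monthly series.

```python
import math
from bisect import bisect_right

def ks_statistic(x, y):
    """Largest absolute difference between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    for v in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, v) / n   # share of x values <= v
        fy = bisect_right(ys, v) / m   # share of y values <= v
        d = max(d, abs(fx - fy))
    return d

def ks_p_value(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.3:
        return 1.0  # the series is numerically unreliable here; the true value exceeds 0.9999
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

# Made-up samples that do not overlap at all.
low = [float(v) for v in range(1, 11)]
high = [float(v) for v in range(101, 111)]
d = ks_statistic(low, high)
p = ks_p_value(d, len(low), len(high))
print(d, p < 0.05)  # 1.0 True
```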

In addition, the degree of association between the observed and simulated monthly rainfall series was checked using the Kendall's tau-b correlation and the Spearman's rho correlation statistics. The correlation coefficient of both analyses ranges from −1 to 1, where a positive correlation implies that the ranks of both variables increase together, while a negative correlation implies that the ranks of the two variables move in opposite directions. Both analyses are also performed as hypothesis tests, with the null hypothesis that there is no correlation between the two series.
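Both rank correlation coefficients can be computed directly from their definitions, as in the sketch below. Spearman's rho is taken as the Pearson correlation of the average ranks, and Kendall's tau-b as (*C* − *D*)/√[(*C* + *D* + *T*ₓ)(*C* + *D* + *T*ᵧ)], where *C* and *D* are the concordant and discordant pairs and *T*ₓ, *T*ᵧ are the pairs tied in only one variable. The function names and toy values are hypothetical, and the significance test is not shown.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))

def kendall_tau_b(x, y):
    concordant = discordant = tied_x = tied_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue            # tied in both variables: excluded from every count
            elif dx == 0:
                tied_x += 1
            elif dy == 0:
                tied_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / math.sqrt((pairs + tied_x) * (pairs + tied_y))

# Made-up observed and simulated monthly totals.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.0, 3.0, 2.0, 5.0, 4.0]
print(round(kendall_tau_b(observed, simulated), 3), round(spearman_rho(observed, simulated), 3))  # 0.6 0.8
```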

All the non-parametric tests in this study were carried out with a significance level of 0.05. The attributes of the observed and simulated monthly rainfall series were also identified by comparing the box and whisker plots of the rainfall series. A box and whisker plot is a graph that summarizes five statistical characteristics (minimum, first quartile, median, third quartile and maximum) of a time series and provides an easier way to visualize differences among groups.
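The five characteristics drawn in a box and whisker plot can be obtained as in the short sketch below. The quartile convention varies between software packages; the "inclusive" method of Python's `statistics.quantiles` is an assumption here, and the function name and toy values are hypothetical.

```python
from statistics import quantiles

def five_number_summary(series):
    """Minimum, first quartile, median, third quartile and maximum of a series."""
    q1, median, q3 = quantiles(series, n=4, method="inclusive")
    return min(series), q1, median, q3, max(series)

# Made-up monthly rainfall totals (mm).
toy_monthly = [1, 2, 3, 4, 5]
print(five_number_summary(toy_monthly))  # (1, 2.0, 3.0, 4.0, 5)
```

The interquartile range discussed in the results is the difference between the fourth and second elements of this summary, and the overall range is the difference between the last and first.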

## RESULTS AND DISCUSSIONS

### Normality tests

To determine the distribution type of rainfall time series, the observed daily rainfall time series of the study area were checked and evaluated based on the *p*-value of normality tests. In this study, the null hypothesis of each normality test (Anderson–Darling test, Lilliefors test and Jarque–Bera test) was rejected due to their computed *p*-value (<0.0001) being smaller than the significance level of 0.05. In that case, the alternative hypothesis was accepted, where the observed rainfall series was proved to be skewed and not normally distributed at each station. Hence, the evaluation of rainfall amount results was restricted to using the non-parametric tests due to the results obtained from these normality tests, and no normalization process was involved in this study.

### Predictor selection

The suitable combination of predictors for each station is presented in Table 1. It is noticeable that the observed rainfall series of selected stations were highly responsive to the near surface specific humidity and zonal velocity component at different geopotential heights. In the study of Amirabadizadeh *et al.* (2016), both the temperature and rainfall of their selected stations within the Langat River Basin were also found to be sensitive to the predictor of near surface specific humidity. An explanation for this is that the Langat River Basin experiences a tropical rainforest climate, which is hot and humid throughout the year. Therefore, this predictor was selected in this study again since it has a significant relationship with the observed rainfall series of the study area.

| Station | Selected predictors (optimum lag) |
|---|---|
| 2815001 | Geostrophic airflow velocity at 850 hPa height (2) |
| | Zonal velocity component at 850 hPa height (0) |
| | Specific humidity at 500 hPa height (1) |
| | Specific humidity at 850 hPa height (−1) |
| | Near surface specific humidity (0) |
| 2913001 | Zonal velocity component at 850 hPa height (−2) |
| | Vorticity at 850 hPa height (1) |
| | Specific humidity at 500 hPa height (1) |
| | Near surface specific humidity (−1) |
| 2917001 | Meridional velocity component (0) |
| | Zonal velocity component at 500 hPa height (0) |
| | Geostrophic airflow velocity at 850 hPa height (−1) |
| | Vorticity at 850 hPa height (1) |
| | Near surface specific humidity (−1) |
| 3118102 | Mean sea level pressure (2) |
| | Zonal velocity component at 500 hPa height (0) |
| | Divergence at 850 hPa height (0) |
| | Specific humidity at 850 hPa height (0) |
| | Near surface specific humidity (0) |


- Selected predictor (optimum lag transformation of predictors).

### Structure of neural network in the BACT-ANN model

The structure of the neural network in the BACT-ANN model consists of three layers, namely, input, hidden and output layers. The neuron numbers of input and output layers in the BACT-ANN model directly depend on the number of fitted predictors and observed rainfall series, respectively. Therefore, five input neurons were employed for station 2815001, station 2917001 and station 3118102, but four input neurons for station 2913001. One output neuron was employed since there was only one observed rainfall series used for each station. The hidden neuron number is determined using the trial and error method.

In this study, the BACT-ANN model employed 60 hidden neurons for station 2815001 and station 3118102, and 85 hidden neurons for station 2913001 and station 3118102. Theoretically, an ANN model needs a sufficiently large number of hidden neurons to describe the relationship between input and output data accurately. It is noticeable that the BACT-ANN model employed considerably large hidden neuron numbers in the range of 60–85. The increased accuracy of the BACT-ANN model is attributed to the pre-processing approach, which increased the number of wet days available for training the model.

### Determination of hidden states in NHMM

The number of hidden states used in the NHMM is required to be sufficient to express the rainfall state of observed rainfall series at each station. The NHMM was fitted with a different number of hidden states and its quality was evaluated based on their computed BIC scores. Based on the results in Figure 3, the computed BIC scores achieved their minimum points at three hidden states for station 2815001 and station 2913001; and at two hidden states for station 2917001 and station 3118102.

### Rainfall occurrence model

#### Spell lengths' distribution

Through the analyses of wet- and dry-spell lengths' distribution, the performances of both the NHMM and BACT models in simulating the rainfall occurrence throughout the whole study period (years 1976–2005) can be evaluated and compared. Both models exhibited overall good skills in capturing the observed wet- and dry-spell lengths. The spell lengths' distribution simulated by both models showed a similar pattern to the observed distribution, in the shape of a geometric distribution. However, the NHMM overpredicted the wet-spell lengths at station 2917001 (Figure 4(c)) and station 3118102 (Figure 4(d)) and the dry-spell lengths at station 2815001 (Figure 5(a)), especially for the short spell durations. Compared to the NHMM, the BACT model exhibited overall better performance, but it still slightly underpredicted the wet-spell lengths at station 2815001 (Figure 4(a)) and station 2917001 (Figure 4(c)) and the dry-spell lengths at station 2917001 (Figure 5(c)).

#### Matching

The performances of both rainfall occurrence models were further evaluated using the analyses of the POD, FAR and HSS during the validation period (1996–2005). Based on the results in Figure 6, the BACT model performed better than the NHMM in simulating the observed wet days, with higher POD indices in the range of 0.51–0.65, while the NHMM obtained POD indices of less than 0.44. This indicated that the NHMM was less capable of simulating wet days that coincide with the observed wet days. For the analysis of FAR, a higher index indicates that the model has a higher tendency to predict rainy days on the wrong days. Based on the results in Figure 7, the BACT model obtained FAR indices in the range of 0.41–0.6, which are smaller than those of the NHMM (0.54–0.66) at each station. This means that the NHMM has a higher tendency to predict rainy days on the wrong days than the BACT model.

According to the study of Kannan & Ghosh (2011), a prediction with an HSS index greater than 0.15 can be considered a convincingly good prediction. In this study, the BACT model showed good prediction ability with an HSS greater than 0.15 at each station except for station 2913001, as shown in Figure 8. In contrast, the prediction of the NHMM is closer to a random prediction, as it obtained scores nearly equal to zero at each station. Hence, the results presented above indicate that the BACT model outperformed the NHMM, reproducing the observed wet days with fewer false alarms of wet days at each station.

### Rainfall amount model

#### Non-parametric tests

The computed monthly *p*-values in the K-S test for both the NHMM and the BACT-ANN models at each station are presented in Table 2. For instance, the computed *p*-value of the BACT-ANN model in January was 0.952 at station 2815001, which is greater than the significance level. Therefore, the null hypothesis that the observed and simulated series follow the same distribution cannot be rejected. Both the NHMM and BACT-ANN models exhibited good performance, with their computed *p*-values exceeding the significance level of 0.05 in the majority of the 12 months. For station 2913001 in particular, both models were able to simulate the distribution of the observed rainfall series with *p*-values greater than 0.05 in every month. However, the BACT-ANN model passed the 0.05 significance level in more months (11 out of 12) than the NHMM (10 out of 12) at stations 2917001 and 3118102. Overall, the BACT-ANN model produced *p*-values in the range of 0.035–0.998, which were higher than those of the NHMM (0.003–0.952).

Grey shaded value indicates the p-value of the model exceeded the significance level of 0.05.

For the Mann–Whitney U test, both the NHMM and BACT-ANN models exhibited reasonably good results, with their computed *p*-values exceeding the significance level of 0.05 in the majority of the 12 months. As presented in Table 3, the *p*-values produced from the BACT-ANN model (0.012–0.971) were relatively higher than those from the NHMM. This indicated that the simulated monthly rainfall series from the BACT-ANN model were more consistent with the observed monthly rainfall series than those from the NHMM. Similar results were achieved in the squared-rank test, where both models also presented *p*-values exceeding the significance level of 0.05 in the majority of the 12 months, as shown in Table 4. Therefore, the simulated rainfall series from both the NHMM and BACT-ANN models were inferred to have no significant difference from the observed rainfall series regarding their variances. Apart from identical results at station 2917001, the BACT-ANN model still exhibited better performance than the NHMM, passing the significance level in a higher number of months.


In addition, both Kendall's tau-b and Spearman's rho correlations were used in this study to assess the statistical associations between observed and simulated monthly rainfall series. Overall, the BACT-ANN model outperformed the NHMM, with the computed *p*-values of its Kendall's tau-b correlation coefficients smaller than the significance level of 0.05 in every month. For example, the correlation coefficient of the BACT-ANN model at station 2815001 was 0.707 in January, with a *p*-value smaller than 0.05. Thus, the null hypothesis is rejected and the obtained correlation coefficient is significantly different from zero. As presented in Table 5, the correlation coefficients obtained by the BACT-ANN model ranged between 0.287 and 0.737 and were significantly different from zero. The NHMM showed relatively low correlation coefficients, with some computed *p*-values exceeding the significance level of 0.05 in certain months.

Grey shaded value indicates the coefficient with the p-value smaller than the significance level of 0.05.

The coefficients obtained from the Spearman's rho correlation analysis are generally greater than those from the Kendall's tau-b correlation analysis, and Spearman's rho is more sensitive to errors and discrepancies in the data. Based on the results in Table 6, the evaluation outcome remained the same, with the BACT-ANN model outperforming the NHMM. The BACT-ANN model exhibited higher, statistically significant coefficients in the range of 0.418–0.876. The NHMM produced both positive and negative coefficients, some of which were not significantly different from zero in certain months. Based on the results from both correlation analyses, the simulated rainfall series from the NHMM was shown to be less reliable than that from the BACT-ANN model at each station.
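The two rank correlations can be computed for a single month as in the following minimal sketch. It assumes SciPy's `kendalltau` (which computes tau-b by default) and `spearmanr`. The helper name `rank_correlations` and the sample values are hypothetical and are not taken from Tables 5 or 6.

```python
from scipy.stats import kendalltau, spearmanr


def rank_correlations(observed, simulated, alpha=0.05):
    """Return Kendall's tau-b and Spearman's rho with significance flags."""
    tau, tau_p = kendalltau(observed, simulated)  # tau-b is SciPy's default variant
    rho, rho_p = spearmanr(observed, simulated)
    return {
        "kendall_tau_b": tau,
        "kendall_significant": bool(tau_p < alpha),
        "spearman_rho": rho,
        "spearman_significant": bool(rho_p < alpha),
    }


# Hypothetical January totals (mm) over ten years; not data from this study
observed = [110.0, 125.0, 140.0, 155.0, 170.0, 185.0, 200.0, 215.0, 230.0, 245.0]
simulated = [105.0, 120.0, 150.0, 160.0, 165.0, 190.0, 210.0, 220.0, 250.0, 240.0]

for key, value in rank_correlations(observed, simulated).items():
    print(key, value)
```

A coefficient is flagged as significant when its *p*-value is below 0.05, that is, when the null hypothesis of zero correlation is rejected.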

#### Box and whisker plot

The distributions of the monthly rainfall series simulated by the NHMM and BACT-ANN models were evaluated and compared with the observed monthly rainfall series using box and whisker plots. The medians (middle line in the box) of both simulated series were relatively similar to those of the observed series at each station, as shown in Figure 9. However, the interquartile range (length of the box) and overall range (distance between the ends of the two whiskers) of the monthly rainfall series from the NHMM were the smallest, which indicated that the rainfall data from the NHMM were less dispersed around the median. Except for an obvious difference at station 2815001, the observed series and the BACT-ANN model showed reasonably similar interquartile ranges and overall ranges. Overall, the BACT-ANN model was better than the NHMM at reproducing the observed monthly rainfall series in terms of data dispersion and the maximum and minimum values (ends of the whiskers).
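The quantities read from a box and whisker plot can also be compared numerically. The sketch below uses only the Python standard library and assumes, as described above, that the whiskers extend to the minimum and maximum values. The function name and the rainfall values are hypothetical, and the quartile interpolation method is an assumption that may differ from the one used for Figure 9.

```python
import statistics


def five_number_summary(values):
    """Minimum, quartiles, maximum, interquartile range and overall range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    lowest, highest = min(values), max(values)
    return {
        "min": lowest,
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": highest,
        "iqr": q3 - q1,             # length of the box
        "range": highest - lowest,  # distance between the whisker ends
    }


# Hypothetical monthly rainfall totals (mm); not data from this study
series = {
    "observed": [60.0, 95.0, 130.0, 170.0, 210.0, 260.0, 340.0],
    "simulated": [90.0, 120.0, 140.0, 165.0, 190.0, 215.0, 250.0],
}

for name, values in series.items():
    summary = five_number_summary(values)
    print(name, {key: round(val, 1) for key, val in summary.items()})
```

A simulated series whose interquartile range and overall range are much smaller than the observed ones is less dispersed around the median than the observations.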

## CONCLUSIONS

This study proposed a two-stage artificial intelligence-based hybrid simulation model, combining a bootstrap aggregated classification tree (BACT) with an artificial neural network (ANN), to simulate rainfall occurrence and amount in the tropical Langat River Basin, Malaysia. It also demonstrated the capability of the BACT-ANN model to preserve the characteristics of rainfall data. The BACT-ANN model was compared with the NHMM. Overall, the BACT-ANN model was superior in simulating the wet and dry spell lengths, the probability of wet days and rainfall occurrence, with a higher probability of detection (POD), a lower false alarm rate (FAR) and a higher Heidke skill score (HSS). The NHMM tended to overpredict the wet spell and dry spell lengths. For the simulation of rainfall amount, the BACT-ANN model also outperformed the NHMM in simulating the distribution, equality, variance and statistical correlations of rainfall amount. The BACT-ANN model successfully preserved the characteristics of observed monthly rainfall (namely the minimum, first quartile, median, third quartile and maximum), as shown by the box and whisker plots. It can be concluded that the BACT-ANN model gave better results in simulating the important characteristics of rainfall amount and occurrence. As such, it is potentially applicable to hydrological applications and climate change assessment, especially in tropical regions. The proposed model can be used to downscale future rainfall using future predictors under different scenarios, and the results can then be used to investigate future rainfall trends and as input for impact assessments of future climate change.

In the Langat River Basin, the study period of 30 years is the longest period for which rainfall data are considered reliable. However, a longer study period is recommended for future research, both to investigate any further improvement of the BACT model in simulating the observed rainfall occurrence and to provide more information for the ANN to establish a more robust relationship between large-scale atmospheric variables and local climate variables.

## ACKNOWLEDGEMENTS

The authors would like to express their sincere appreciation to the *Ministry of Higher Education (MOHE) Malaysia* and Universiti Tunku Abdul Rahman, Malaysia, for providing the funds for this study.