Accurate precipitation prediction is of great importance for irrigation management and disaster prevention. In this study, back propagation artificial neural network (BPANN), radial basis function artificial neural network (RBFANN) and Kriging methods were applied and compared for predicting the monthly precipitation of Liaoyuan city, China. An autocorrelation analysis was first used to determine the model input variables, and then the BPANN, RBFANN and Kriging methods were applied to identify the relationship between previous and later precipitation using the monthly precipitation data of 1971–2009 in Liaoyuan city. Finally, the performance of the three models was compared based on model accuracy, stability and computational cost. The comparison showed that, for accuracy, RBFANN performed best, followed by Kriging, and BPANN performed worst; for stability and computational cost, the RBFANN and Kriging models performed better than the BPANN model. In conclusion, RBFANN is the best of the three methods for precipitation prediction in Liaoyuan city. Therefore, the developed RBFANN model was applied to predict the monthly precipitation for 2010–2019 in the study area.
Precipitation is an important component of the hydrological and water resources system, as well as an important factor in water resources assessment. Precipitation variability, manifested as floods and droughts, can cause serious natural disasters (Mar & Naing 2008; Aksoy & Dahamsheh 2009; Hung et al. 2009; Wu et al. 2010; Azadi & Sepaskhah 2012). As a result, advance information about hourly, daily, monthly, seasonal and annual rainfall is necessary, as it helps with water resources management and disaster prevention (Partal & Kisi 2007; Aksoy & Dahamsheh 2009; Lo et al. 2015).
A number of methods have been used for predicting precipitation in past years. Georgakakos & Bras (1984) developed a one-dimensional, physically based precipitation model with lead times of hours, using ground station data of temperature, dew-point temperature and ground-level pressure as input. Partal & Cigizoglu (2009) predicted daily precipitation from meteorological data from Turkey using the wavelet–artificial neural network (ANN) method. Dahamsheh & Aksoy (2009) used different ANNs and the linear regression (LR) method to forecast monthly precipitation in Jordan in the Middle East. Azadi & Sepaskhah (2012) used ANN to forecast annual precipitation for the west, southwest and south provinces of Iran. Generally, the hourly, monthly, seasonal and annual precipitation prediction models in use can be classified into two categories. The first category is the physical mechanism model, in which each step of the model is based on the physical laws of the hydrological process. The second category is the statistical data mining method, which is not concerned with the physical process of precipitation but only with the relationship between input and output (Luk et al. 2001). When vast and accurate precipitation-related climatological data are difficult to obtain, physical mechanism models are unsuitable, and statistical data mining models that use only previous precipitation data as their input vector are more suitable (Dahamsheh & Aksoy 2009).
Several data mining methods have been used to predict precipitation since the 1970s, such as autoregressive integrated moving average (ARIMA), LR and ANN. After the book ‘Time series analysis: forecasting and control’ was published (Box & Jenkins 1976), autoregressive moving-average and ARIMA models became standard time series models in the 1970s (Delleur & Kavvas 1978), and they are still in common use (Burlando et al. 1993; Weesakul & Lowanichchai 2005). As the simplest data mining method, the LR method has also been used for precipitation prediction. Shukla & Mooley (1987) developed an LR equation to predict the summer monsoon rainfall over India. DelSole & Shukla (2002) proposed a strategy for selecting the best LR prediction model for Indian monsoon rainfall. ANN is regarded as a powerful mathematical model for forecasting rainfall volume in time series. French et al. (1992) used an ANN for forecasting rainfall intensity fields at a lead time of 1 h. Bodri & Čermák (2001) used a back propagation artificial neural network (BPANN) to predict the monthly precipitation in Moravia, Czech Republic. Olsson et al. (2004) coupled two BPANNs in series to forecast the 12 h mean rainfall in the Chikugo River basin in southern Japan.
Among the above data mining methods, the ANN method has become increasingly popular in recent years. BPANN, the most commonly used ANN method, has some drawbacks: it is prone to overtraining, its convergence is very slow, and its hidden layer structure is hard to determine. Some researchers have demonstrated that the radial basis function artificial neural network (RBFANN) performs better than BPANN in data mining and prediction because it converges faster, is more stable and does not become trapped in a local optimum (Zhang et al. 2008; Luo et al. 2010). The Kriging method, first developed as a geostatistical method, has been used successfully for spatial analysis of precipitation since 1978 (Delhomme 1978; Borga & Vizzaccaro 1997; Haberlandt 2007). Because Kriging is a statistical estimation method, we attempt to use it for temporal prediction of precipitation.
For each area, selecting the most suitable model for precipitation prediction is of great importance. Dahamsheh & Aksoy (2009) used different ANNs as well as LR to forecast intermittent monthly precipitation in Jordan in the Middle East, and compared the accuracy of the different models. Lu et al. (2015) compared the accuracy of the generalized regression neural network and support vector regression for forecasting monthly rainfall in Western Jilin Province, China. Most previous studies used accuracy as the only factor for assessing prediction performance (Chattopadhyay 2007; Dahamsheh & Aksoy 2009; Lu et al. 2015). However, model stability and computational cost are also important factors that affect prediction model performance. To the authors’ knowledge, few published works have made a comprehensive analysis of the accuracy, stability and computational cost of different models in precipitation prediction. Moreover, few researchers have used the Kriging method for temporal prediction of precipitation.
Liaoyuan city is located in the south central region of Jilin province and has a semi-humid temperate continental monsoon climate. Precipitation prediction at the monthly scale is important for the management of surface water and groundwater resources in Liaoyuan city. As an extension of the previous research, the BPANN, RBFANN and Kriging methods were used to identify the relationship between previous and later precipitation data at Liaoyuan city, China. Model performance was compared based on accuracy, stability and computational cost. The best-performing model was used to predict the monthly precipitation of this area, and the temporal and spatial variation of precipitation in this area was analyzed.
The novelty of this paper is twofold: (1) the Kriging method was used for temporal analysis and prediction of monthly precipitation in Liaoyuan city; (2) different methods for identifying the relationship between previous and later precipitation data were compared based on accuracy, stability and computational cost.
STUDY AREA AND DATA
Liaoyuan city is located in the south central region of Jilin province, at the headwaters of the Dongliao and Huifa rivers (42°17′40″–43°13′40″ N, 124°51′22″–125°49′52″ E). Liaoyuan has a semi-humid temperate continental monsoon climate with four distinct seasons. The average annual precipitation is 651.8 mm, and the precipitation is unevenly distributed within the year because of the monsoon. Most of the precipitation is concentrated in June to September (the flood period), which accounts for 80% of the annual precipitation. The maximum monthly precipitation occurs in July and August, which together account for 48% of the annual precipitation.
BPANN, first proposed by McCelland & Rumelhart (1986), is a feed-forward network based on the error back propagation (BP) algorithm. The BP algorithm searches an error surface by gradient descent for the point(s) of minimum error (Yang et al. 2009). BPANN has been applied extensively in function approximation, pattern recognition and data compression (Zhang et al. 2008).
The radial basis function was first proposed as a solution to real multivariate interpolation problems by Powell (1987). RBFANN is a three-layer feed-forward network that uses radial basis functions as activation functions; it consists of an input layer, a hidden layer and an output layer (Shen et al. 2010). RBFANN has been very effective in a wide range of applications involving classification, function approximation, noise interpolation and regularization because of its fast convergence and high stability compared with some other methods (Kégl et al. 2000; Moradkhani et al. 2004; Luo et al. 2013).
Kriging is a geostatistical method developed by the French mathematician Georges Matheron based on the Master's thesis of Daniel Gerhardus Krige (Matheron 1963), and it has successfully been used for data interpolation, pattern recognition and surrogate model building, etc. (Abedini et al. 2008; Luo & Lu 2014).
Input variables determination with autocorrelation analysis
It is important to select the appropriate input variables before developing the precipitation prediction model (Bodri & Čermák 2001; Dahamsheh & Aksoy 2009). Autocorrelation analysis is a common method to determine the input variables by calculating the autocorrelation coefficient between the observed and predicted monthly precipitation (Monserud & Marshall 2001; Sarkar et al. 2002).
The precipitation of a given month is influenced significantly by the same month's precipitation in the n previous years, as well as by the precipitation of the previous k months of the current year. Therefore, two autocorrelation analyses were made to determine n and k, respectively, using the monthly precipitation data of 1971–2009 from the Weiguo and Shiyi stations. The autocorrelation analyses, run on the MATLAB 2014 platform, gave n = 20 and k = 6, which means that the precipitation of a given month was influenced significantly by the same month's precipitation in the 20 previous years, as well as by the precipitation of the previous 6 months of the current year. Therefore, there are 26 input variables (Pt−1, Pt−2, Pt−3, Pt−4, Pt−5, Pt−6, Pt−12, Pt−24, Pt−36, Pt−48, Pt−60, Pt−72, Pt−84, Pt−96, Pt−108, Pt−120, Pt−132, Pt−144, Pt−156, Pt−168, Pt−180, Pt−192, Pt−204, Pt−216, Pt−228, Pt−240) and one output variable (Pt) in the precipitation prediction model.
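As an illustration of how the 26 lagged inputs could be assembled, the following minimal Python sketch computes a sample autocorrelation coefficient and builds the input and output pairs. The function names and the synthetic series are hypothetical and are not station data; the threshold used in the study to judge a lag significant is not stated and is therefore not modelled here.

```python
from statistics import fmean

# Lags (in months) listed in the text: the previous 6 months (k = 6)
# plus the same month in each of the previous 20 years (n = 20).
LAGS = list(range(1, 7)) + [12 * year for year in range(1, 21)]


def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of `series` at a positive `lag`."""
    mean = fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, len(series)))
    return num / denom


def build_samples(series, lags=LAGS):
    """Return (inputs, targets) with one row of lagged values per target month."""
    max_lag = max(lags)
    inputs = [[series[t - lag] for lag in lags]
              for t in range(max_lag, len(series))]
    targets = [series[t] for t in range(max_lag, len(series))]
    return inputs, targets


# Synthetic placeholder series (NOT station data): 39 years x 12 months.
series = [float((m % 12) + 1) for m in range(39 * 12)]
X, y = build_samples(series)
print(len(LAGS), len(X))  # number of inputs per sample, number of samples
```

With 39 years of monthly data and a longest lag of 240 months, this construction leaves 468 − 240 = 228 usable samples.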
Prediction model developed with BPANN, RBFANN and Kriging
For each station, the monthly precipitation data of 1971–2009 yield a total of 228 input–output data sets (468 months minus the 240 months required by the longest lag). These data were divided into two sets: a training set containing the earlier 156 data sets and a validation set containing the later 72 data sets. Three prediction models were developed with the BPANN, RBFANN and Kriging methods for each precipitation station.
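The split described above can be sketched in a few lines of Python. The sketch assumes that the samples are kept in chronological order and that the first target month is January 1991 (the first month with 20 full years of history); both points are inferences from the lag structure rather than statements in the text, and the names are hypothetical.

```python
# Assumption: the first target month follows 20 full years of history.
FIRST_TARGET_YEAR = 1971 + 20


def chronological_split(samples, n_train=156):
    """Earlier samples go to training, later ones to validation (no shuffling)."""
    return samples[:n_train], samples[n_train:]


# Label each of the 228 samples with the (year, month) of its target value.
targets = [(FIRST_TARGET_YEAR + i // 12, i % 12 + 1) for i in range(228)]
train, valid = chronological_split(targets)
print(train[0], train[-1], valid[0], valid[-1])
```

Under these assumptions the training set covers 13 target years and the validation set covers the final 6 years of the record.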
For the BPANN models, at both the Weiguo and Shiyi stations, three-layer, four-layer and five-layer BPANNs were all used to model the relationship between the precipitation of a given month and previous precipitation. The log-sigmoid function was used as the transfer function of the hidden layer.
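For reference, the log-sigmoid transfer function and a forward pass through a three-layer network can be sketched as follows. The linear output node and all weight values are assumptions made for illustration; the text does not specify the output transfer function or the layer sizes.

```python
import math


def logsig(x):
    """Log-sigmoid transfer function, 1 / (1 + exp(-x)), in a stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: log-sigmoid hidden layer, linear output (assumed)."""
    hidden = [logsig(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical two-input, two-hidden-node network with arbitrary weights.
print(forward([0.0, 0.0], [[1.0, 1.0], [2.0, 2.0]], [0.0, 0.0], [1.0, 1.0], 0.0))
```

Back propagation then adjusts the weights and biases by gradient descent on the prediction error, starting from random initial values.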
For the RBFANN models, at both the Weiguo and Shiyi stations, the Gaussian function was used as the transfer function, and the orthogonal least squares method was used for network training.
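The idea of a Gaussian radial basis network can be illustrated with the simplified sketch below. It is not the orthogonal least squares procedure used in the study: as a stated simplification, every training point serves as a centre and the output weights come from solving one linear system. The spread value, the toy data and all names are hypothetical.

```python
import math


def gauss(r, spread):
    """Gaussian radial basis function of the distance r."""
    return math.exp(-(r / spread) ** 2)


def dist(a, b):
    """Euclidean distance between two input vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (M[r][n] - tail) / M[r][r]
    return w


def fit_rbf(X, y, spread):
    """Output weights for a network with one Gaussian unit per training point."""
    Phi = [[gauss(dist(a, b), spread) for b in X] for a in X]
    return solve(Phi, y)


def predict_rbf(x, centers, weights, spread):
    """Network output: weighted sum of the Gaussian hidden units."""
    return sum(w * gauss(dist(x, c), spread) for w, c in zip(weights, centers))


# Toy data for illustration only.
X_toy = [[0.0], [1.0], [2.0], [3.0]]
y_toy = [0.0, 1.0, 4.0, 9.0]
weights = fit_rbf(X_toy, y_toy, spread=1.0)
print([round(predict_rbf(x, X_toy, weights, 1.0), 6) for x in X_toy])
```

Because the weights follow from a linear solve rather than from a random starting point, fitting the same data twice gives the same network.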
For the Kriging models, at both the Weiguo and Shiyi stations, polynomial functions of order 1 and 2 were used as deterministic functions, while the Gaussian function was used as the correlation function.
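For readers unfamiliar with this formulation, a commonly used form of the Kriging model with a polynomial deterministic part and a Gaussian correlation function is given below. The notation is an assumption introduced here and is not taken from the study: f(x) is the vector of polynomial basis functions, β the regression coefficients, z(x) a zero-mean random process with variance σ², θ_d the correlation parameter of input d, F the matrix of basis functions evaluated at the training inputs, R the correlation matrix of the training inputs, r(x) the vector of correlations between x and the training inputs, and Y the vector of training outputs.

```latex
% Kriging model: polynomial trend plus a correlated random process
y(\mathbf{x}) = \mathbf{f}(\mathbf{x})^{\mathsf T}\boldsymbol{\beta} + z(\mathbf{x}),
\qquad
\operatorname{Cov}\!\left[z(\mathbf{x}_i), z(\mathbf{x}_j)\right]
  = \sigma^2 R(\mathbf{x}_i, \mathbf{x}_j)

% Gaussian correlation function over the 26 input variables
R(\mathbf{x}_i, \mathbf{x}_j)
  = \exp\!\left(-\sum_{d=1}^{26} \theta_d \,(x_{i,d} - x_{j,d})^2\right)

% Generalized least-squares trend coefficients and the Kriging predictor
\hat{\boldsymbol{\beta}}
  = \left(\mathbf{F}^{\mathsf T}\mathbf{R}^{-1}\mathbf{F}\right)^{-1}
    \mathbf{F}^{\mathsf T}\mathbf{R}^{-1}\mathbf{Y},
\qquad
\hat{y}(\mathbf{x})
  = \mathbf{f}(\mathbf{x})^{\mathsf T}\hat{\boldsymbol{\beta}}
  + \mathbf{r}(\mathbf{x})^{\mathsf T}\mathbf{R}^{-1}
    \left(\mathbf{Y} - \mathbf{F}\hat{\boldsymbol{\beta}}\right)
```

For fixed correlation parameters the predictor is a closed-form expression, so refitting the same data reproduces the same model.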
The parameters of all three models were determined through optimization with a genetic algorithm. In the genetic algorithm search, the selection probability, crossover probability and mutation probability were set to 0.9, 0.7 and 0.05, respectively, and the initial population size and the maximum number of generations were set to 200 and 100, respectively. All three prediction models were built and trained on the MATLAB 2014 platform.
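A genetic algorithm search with these settings might look like the sketch below, written for a single real-valued parameter. Several choices are assumptions, since the text does not describe the operators: the selection probability is interpreted as the chance that the fitter of two randomly drawn candidates wins a tournament, crossover is arithmetic, mutation is Gaussian, and the best individual is carried over unchanged. The quadratic cost function stands in for a model's validation error and is purely hypothetical.

```python
import random


def genetic_minimize(cost, low, high, pop_size=200, generations=100,
                     p_select=0.9, p_cross=0.7, p_mutate=0.05, seed=0):
    """Minimise `cost` over [low, high] with a simple real-coded GA."""
    rng = random.Random(seed)
    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    best = min(pop, key=cost)

    def pick():
        # Binary tournament (assumed): the fitter candidate wins
        # with probability p_select.
        a, b = rng.choice(pop), rng.choice(pop)
        better, worse = (a, b) if cost(a) <= cost(b) else (b, a)
        return better if rng.random() < p_select else worse

    for _ in range(generations):
        children = [best]  # elitism: keep the best-so-far unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < p_cross:
                mix = rng.random()
                child = mix * p1 + (1.0 - mix) * p2  # arithmetic crossover
            else:
                child = p1
            if rng.random() < p_mutate:
                child += rng.gauss(0.0, 0.1 * (high - low))  # Gaussian mutation
            children.append(min(max(child, low), high))
        pop = children
        best = min(pop, key=cost)
    return best


# Hypothetical stand-in for a validation error with its minimum at 3.0.
best_param = genetic_minimize(lambda s: (s - 3.0) ** 2, low=-10.0, high=10.0)
print(best_param)
```

Fixing the seed makes the search repeatable; in practice the cost function would retrain the model with each candidate parameter and return its error.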
RESULTS AND DISCUSSION
Comparison of prediction models performance
Prediction model accuracy analysis
[Tables: accuracy of the prediction models reported as RMSE (mm), MAE (mm) and R2; the table values are not preserved in this copy.]
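The three accuracy measures named in the table headers can be computed as in the following sketch. R2 is taken here to be the coefficient of determination, which is an assumption because the text does not give its definition; the observed and predicted values are illustrative and are not station data.

```python
import math


def rmse(obs, pred):
    """Root mean square error, in the units of the data (mm)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error, in the units of the data (mm)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Illustrative monthly values in mm (not station data).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

Lower RMSE and MAE and an R2 closer to 1 indicate a more accurate model on the validation set.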
Prediction model stability analysis
In the BPANN training process, markedly different results were obtained each time the network was trained, which indicates poor stability. In contrast, the RBFANN and Kriging models produced the same results each time they were trained, which indicates good stability.
Prediction model computational cost analysis
Each run of the BPANN prediction model needed an average of 6.0 seconds of CPU time on a PC with a 3.6 GHz Intel Core i7 CPU and 8 GB of RAM. Because BPANN obtained different results each time it was trained, 60 training runs were performed in this study before a proper result was obtained. For the RBFANN and Kriging models, only one training run was needed to reach a reasonable accuracy. On the same platform, each run of the RBFANN and Kriging prediction models needed an average of 2.9 and 0.5 seconds of CPU time, respectively.
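The total training cost implied by these figures can be worked out as below. The sketch assumes that each of the 60 BPANN training runs took the reported average of 6.0 seconds, which the text does not state explicitly; the variable names are hypothetical.

```python
# Average CPU seconds per run and number of training runs, as reported above.
runs = {"BPANN": (6.0, 60), "RBFANN": (2.9, 1), "Kriging": (0.5, 1)}

# Total CPU time per model, assuming every run costs the reported average.
total_cpu = {name: secs * count for name, (secs, count) in runs.items()}

for name, seconds in total_cpu.items():
    print(f"{name}: {seconds:.1f} s")
```

Under this assumption BPANN required about 360 seconds in total, compared with 2.9 seconds for RBFANN and 0.5 seconds for Kriging.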
From the above observations, we conclude that RBFANN is the most dependable method for precipitation prediction considering accuracy, stability and computational cost, followed by the Kriging method. The BPANN model has low accuracy, poor stability and high computational cost.
Next 10 years' precipitation prediction
Because RBFANN proved superior to the other two prediction models, the developed RBFANN model was used to predict the monthly precipitation of 2010–2019 at the Shiyi and Weiguo stations. The prediction results are presented in Figure 5, which shows that from 2010 to 2019 the precipitation at the Shiyi station is higher than that at the Weiguo station. For the Shiyi station, the maximum monthly precipitation occurs in July 2010, while for the Weiguo station it occurs in August 2013.
Autocorrelation analyses were first used to determine the input variables of the prediction models. The results showed that the precipitation of a given month was influenced significantly by the same month's precipitation in the 20 previous years, as well as by the precipitation of the previous 6 months of the current year. The BPANN, RBFANN and Kriging methods were used in this study to build the prediction models, and model performance was compared based on accuracy, stability and computational cost.
The comparison results showed the following. (1) RBFANN is clearly superior in accuracy, stability and computational cost, followed by Kriging. BPANN has low accuracy, low stability and high computational cost, and is therefore not suitable for precipitation prediction in this area. (2) The RBFANN model was used to predict the monthly precipitation for 2010–2019. These prediction results can provide technical support for water resources management and disaster prevention in Liaoyuan city.
This research was supported by the Nature Science Foundation of China (No. 41502221 and 41372237) and China Postdoctoral Science Foundation (No. 2015M570275).