Accurate prediction of precipitation is of great importance for irrigation management and disaster prevention. In this study, back propagation artificial neural network (BPANN), radial basis function artificial neural network (RBFANN) and Kriging methods were applied and compared for predicting the monthly precipitation of Liaoyuan city, China. An autocorrelation analysis was first used to determine the model input variables, and then the BPANN, RBFANN and Kriging methods were applied to identify the relationship between previous and later precipitation using the monthly precipitation data of 1971–2009 in Liaoyuan city. Finally, the performances of the three models were compared in terms of accuracy, stability and computational cost. The comparison showed that RBFANN was the most accurate, followed by Kriging, with BPANN the least accurate; in terms of stability and computational cost, the RBFANN and Kriging models both performed better than the BPANN model. RBFANN was therefore judged the best of the three methods for precipitation prediction in Liaoyuan city, and the developed RBFANN model was applied to predict the monthly precipitation for 2010–2019 in the study area.

INTRODUCTION

Precipitation is an important component of the hydrological and water resources system, as well as an important factor in water resources assessment. Precipitation variability, which manifests as flooding and droughts, can cause serious natural disasters (Mar & Naing 2008; Aksoy & Dahamsheh 2009; Hung et al. 2009; Wu et al. 2010; Azadi & Sepaskhah 2012). As a result, advance information about hourly, daily, monthly, seasonal and annual rainfall is necessary to support water resources management and disaster prevention (Partal & Kisi 2007; Aksoy & Dahamsheh 2009; Lo et al. 2015).

A number of methods have been used for predicting precipitation in past years. Georgakakos & Bras (1984) developed a one-dimensional, physically based precipitation model with lead times of hours, which takes ground station data of temperature, dew-point temperature and ground-level pressure as input. Partal & Cigizoglu (2009) predicted daily precipitation from meteorological data in Turkey using the wavelet–artificial neural network (ANN) method. Dahamsheh & Aksoy (2009) used different ANNs and the linear regression (LR) method to forecast monthly precipitation in Jordan in the Middle East. Azadi & Sepaskhah (2012) used ANN to forecast annual precipitation for the west, southwest and south provinces of Iran. Generally, the hourly, monthly, seasonal and annual precipitation prediction models in use can be classified into two categories. The first category comprises physical mechanism models, in which each step of the model is based on the physical laws of the hydrological process. The second category comprises statistical data mining methods, which are not concerned with the physical process of precipitation but only with the relationship between input and output (Luk et al. 2001). When extensive and accurate precipitation-related climatological data are difficult to obtain, physical mechanism models are unsuitable, and statistical data mining models that use only previous precipitation data as their input vector are more suitable (Dahamsheh & Aksoy 2009).

Several data mining methods have been used to predict precipitation since the 1970s, such as autoregressive integrated moving average (ARIMA), LR and ANN. After the publication of ‘Time series analysis: forecasting and control’ (Box & Jenkins 1976), autoregressive moving-average and ARIMA models became standard time series models in the 1970s (Delleur & Kavvas 1978), and they are still in common use (Burlando et al. 1993; Weesakul & Lowanichchai 2005). As the simplest data mining method, the LR method has also been used for precipitation prediction. Shukla & Mooley (1987) developed an LR equation to predict the summer monsoon rainfall over India. DelSole & Shukla (2002) proposed a strategy for selecting the best LR prediction model for Indian monsoon rainfall. ANN is regarded as a powerful mathematical model for forecasting rainfall volume in time series. French et al. (1992) used an ANN for forecasting rainfall intensity fields at a lead time of 1 h. Bodri & Čermák (2001) used a back propagation artificial neural network (BPANN) to predict the monthly precipitation in Moravia, Czech Republic. Olsson et al. (2004) coupled two BPANNs in series to forecast the 12 h mean rainfall in the Chikugo River basin in southern Japan.

Among the above data mining methods, the ANN method has become increasingly popular in recent years. BPANN, the most commonly used ANN method, has some drawbacks: it is prone to overtraining, its convergence is very slow, and the structure of its hidden layer is hard to determine. Some researchers have demonstrated that the radial basis function artificial neural network (RBFANN) performs better than BPANN in data mining and prediction because it converges faster, is more stable and does not become trapped in a local optimum (Zhang et al. 2008; Luo et al. 2010). The Kriging method, first developed as a geostatistical method, has been used successfully for spatial analysis of precipitation since 1978 (Delhomme 1978; Borga & Vizzaccaro 1997; Haberlandt 2007). Because Kriging is a statistical estimation method, this study attempts to use it for temporal prediction of precipitation.

For each area, selecting the most suitable model for precipitation prediction is of great importance. Dahamsheh & Aksoy (2009) used different ANNs as well as LR to forecast intermittent monthly precipitation in Jordan in the Middle East, and compared the accuracy of the different models. Lu et al. (2015) compared the accuracy of a generalized regression neural network and support vector regression for forecasting monthly rainfall in Western Jilin Province, China. Most previous studies used accuracy as the only factor for assessing prediction performance (Chattopadhyay 2007; Dahamsheh & Aksoy 2009; Lu et al. 2015). However, model stability and computational cost are also important factors that affect the performance of a prediction model. To the authors’ knowledge, few published works have made a comprehensive analysis of the accuracy, stability and computational cost of different models in precipitation prediction. Moreover, the Kriging method has rarely been used for temporal prediction of precipitation.

Liaoyuan city is located in the south central region of Jilin province, with a semi-humid temperate continental monsoon climate. Precipitation prediction at the monthly time scale is important for the management of surface water and groundwater resources in Liaoyuan city. As an extension of the previous research, BPANN, RBFANN and Kriging methods were used to identify the relationship between previous and later precipitation data at Liaoyuan city, China. Model performance was compared in terms of accuracy, stability and computational cost. The best-performing model was used to predict the monthly precipitation of this area, and the temporal and spatial variation of precipitation in this area was analyzed.

The novelty of the paper is twofold: (1) the Kriging method was used for temporal analysis and prediction of monthly precipitation in Liaoyuan city; (2) different methods for identifying the relationship between previous and later precipitation data were compared in terms of accuracy, stability and computational cost.

STUDY AREA AND DATA

Liaoyuan city is located in the south central region of Jilin province, at the head of the Dongliao and Huifa rivers (42°17′40″–43°13′40″ N, 124°51′22″–125°49′52″ E). Liaoyuan has a semi-humid temperate continental monsoon climate with four distinct seasons. The average annual precipitation is 651.8 mm, and the precipitation is unevenly distributed within the year owing to the effect of the monsoon. Most of the precipitation is concentrated in June to September (the flood period), which accounts for 80% of the annual precipitation. The maximum monthly precipitation occurs in July and August, which together account for 48% of the annual precipitation.

There are 17 precipitation stations in Liaoyuan (Figure 1). Precipitation time series from the Shiyi and Weiguo stations between January 1971 and December 2009, a period of 468 months (39 years), were used as the research data for predicting future precipitation. The precipitation data from 1971 to 2002 were collected by rain gauge, while the precipitation data from 2002 to 2009 were collected by a JDZ-1 type Rainfall Data Logger. The precipitation time series were provided by the Liaoyuan Sub-bureau, Hydrology and Water Resources Bureau of Jilin Province.
Figure 1. Study area and its precipitation stations.

METHODS

BPANN

BPANN, first proposed by McClelland & Rumelhart (1986), is a feed-forward network based on the error back propagation (BP) algorithm. The BP algorithm searches an error surface by gradient descent for the point(s) with minimum error (Yang et al. 2009). BPANN has been applied extensively in function approximation, pattern recognition and data compression (Zhang et al. 2008).

A BPANN consists of three or more layers, namely an input layer, one or more hidden layers and an output layer (Lee 2004; Fan et al. 2007; Aksoy & Dahamsheh 2009).

Let $X = (x_1, x_2, \ldots, x_T)$ be a T-dimensional input vector. The output of the ith neuron in the BPANN hidden layer is then:

$$h_i = f\left(\sum_{p=1}^{T} w_{pi} x_p + b_i\right) \qquad (1)$$

where $w_{pi}$ is the weight of the connection from the pth input neuron to the ith hidden neuron, $b_i$ is the bias for the ith hidden neuron, and $f$ is the transfer function, which is usually the sigmoid function expressed as (Lee 2004):

$$f(u) = \frac{1}{1 + e^{-u}} \qquad (2)$$

The outputs of the neurons in the BPANN output layer are transformed by:

$$y_k = f\left(\sum_{i} v_{ik} h_i + d_k\right) \qquad (3)$$

where $v_{ik}$ is the weight of the connection from the ith hidden neuron to the kth output neuron, and $d_k$ is the bias for the kth output neuron.
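
The forward pass described by Equations (1)–(3) can be sketched in a few lines of Python. This is only a minimal illustration with a single hidden layer; it assumes that the output layer also applies the sigmoid, which the text does not state explicitly. The function names (`sigmoid`, `bpann_forward`) and the toy weights are hypothetical and are not taken from the study, which used MATLAB.

```python
import math

def sigmoid(u):
    """Logistic sigmoid transfer function, 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def bpann_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a network with one hidden layer.

    x        : list of T inputs
    w_hidden : w_hidden[i][p] is the weight from input p to hidden neuron i
    b_hidden : bias of each hidden neuron
    w_out    : w_out[k][i] is the weight from hidden neuron i to output neuron k
    b_out    : bias of each output neuron
    """
    hidden = [sigmoid(sum(w * xp for w, xp in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Assumption: the output layer uses the same sigmoid transfer function.
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]

# Toy network: 2 inputs, 2 hidden neurons, 1 output, all weights and biases zero,
# so every neuron outputs sigmoid(0).
y = bpann_forward([1.0, 2.0],
                  [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],
                  [[0.0, 0.0]], [0.0])
```

Training by back propagation would adjust the weights and biases by gradient descent on the prediction error; only the forward computation is shown here.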

RBFANN

The radial basis function was first proposed as a solution to real multivariate interpolation problems by Powell (1987). RBFANN is a three-layered feed-forward network that uses radial basis functions as activation functions; it consists of an input layer, a hidden layer and an output layer (Shen et al. 2010). RBFANN has been very effective and covers a wide range of applications involving classification, function approximation, noise interpolation and regularization because of its fast convergence and high stability compared with some other methods (Kégl et al. 2000; Moradkhani et al. 2004; Luo et al. 2013).

Let $X = (x_1, x_2, \ldots, x_T)$ be a T-dimensional input vector. The output of the ith neuron in the RBFANN hidden layer is then:

$$h_i = \varphi\left(\lVert X - c_i \rVert\right) \qquad (4)$$

where $c_i$ is the center associated with the ith neuron in the radial basis function hidden layer, $i = 1, 2, \ldots, H$, where H is the number of hidden units, $\lVert X - c_i \rVert$ is the norm of $X - c_i$, and $\varphi(\cdot)$ is a radial basis function (Chen et al. 1991; Baddari et al. 2009). The outputs of the neurons in the RBFANN output layer are linear combinations of the hidden layer neuron outputs:

$$y_k = \sum_{i=1}^{H} w_{ik} h_i + \theta_k \qquad (5)$$

where $w_{ik}$ is the connecting weight from the ith hidden layer neuron to the kth output layer neuron, and $\theta_k$ is the threshold value of the kth output layer neuron.
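
As a rough illustration of Equations (4) and (5), the sketch below evaluates a single-output RBF network in Python. It assumes one common Gaussian form, exp(−r²/(2s²)), with a single spread parameter s shared by all hidden neurons; the names `gaussian_rbf`, `rbfann_forward` and `spread`, and all numerical values, are hypothetical rather than taken from the study.

```python
import math

def gaussian_rbf(r, spread=1.0):
    """Gaussian radial basis function of the distance r (assumed form)."""
    return math.exp(-(r ** 2) / (2.0 * spread ** 2))

def rbfann_forward(x, centers, weights, threshold, spread=1.0):
    """Output of a single-output RBF network.

    x         : list of T inputs
    centers   : one T-dimensional center per hidden neuron
    weights   : one connecting weight per hidden neuron
    threshold : threshold value of the output neuron
    """
    hidden = []
    for c in centers:
        # Euclidean norm of the difference between the input and the center.
        r = math.sqrt(sum((xp - cp) ** 2 for xp, cp in zip(x, c)))
        hidden.append(gaussian_rbf(r, spread))
    # The output is a linear combination of the hidden outputs plus the threshold.
    return sum(w * h for w, h in zip(weights, hidden)) + threshold

# Toy network with two hidden neurons; the first center coincides with the input.
y = rbfann_forward([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]], [2.0, 1.0], 0.5, spread=5.0)
```

Because the output layer is linear, the weights can be found by least squares once the centers are fixed, which is one reason an RBF network can be trained in a single deterministic pass.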

Kriging

Kriging is a geostatistical method developed by the French mathematician Georges Matheron based on the Master's thesis of Daniel Gerhardus Krige (Matheron 1963), and it has successfully been used for data interpolation, pattern recognition and surrogate model building, etc. (Abedini et al. 2008; Luo & Lu 2014).

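A Kriging model is a combination of two components (Queipo et al. 2005), deterministic functions and localized deviations:

$$y(x) = \sum_{j=1}^{k} \beta_j f_j(x) + z(x) \qquad (6)$$

where $\sum_{j=1}^{k} \beta_j f_j(x)$ is the term of deterministic functions, $\beta_j$ are the coefficients of the deterministic functions, $f_j(x)$ are k known regression functions, which are usually polynomial functions, and $z(x)$ is the term of localized deviations with the following characteristics:

$$E[z(x)] = 0, \quad \mathrm{Var}[z(x)] = \sigma^2, \quad \mathrm{Cov}[z(x_i), z(x_j)] = \sigma^2 R(x_i, x_j) \qquad (7)$$

where $\sigma^2$ is the variance of $z(x)$ and $R(x_i, x_j)$ is the correlation function between any two of the $n_s$ samples. The common types of correlation functions are the linear function, exponential function, Gauss function, spline function, etc. (Ryu et al. 2002).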
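
To make the role of the correlation function concrete, the following Python sketch builds the correlation matrix between a handful of samples using a Gauss correlation function. It assumes the widely used form exp(−Σ θ_l (x_il − x_jl)²); the parameter name `theta`, the sample points and the function names are hypothetical, and a full Kriging predictor (estimating the coefficients and solving the linear system) is deliberately left out.

```python
import math

def gauss_correlation(xi, xj, theta):
    """Gauss correlation between two samples (assumed form)."""
    return math.exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, xi, xj)))

def correlation_matrix(samples, theta):
    """Square matrix of correlations between all pairs of samples."""
    return [[gauss_correlation(a, b, theta) for b in samples] for a in samples]

samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]  # hypothetical sample points
theta = [1.0, 0.5]                               # hypothetical correlation parameters
R = correlation_matrix(samples, theta)
```

Each sample is perfectly correlated with itself, and the correlation decays smoothly with distance, so nearby samples influence a prediction more than distant ones.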

MODEL DEVELOPMENT

Input variables determination with autocorrelation analysis

It is important to select appropriate input variables before developing a precipitation prediction model (Bodri & Čermák 2001; Dahamsheh & Aksoy 2009). Autocorrelation analysis is a common method for determining the input variables: the autocorrelation coefficient between the monthly precipitation series and a lagged copy of itself is calculated, and the lags with significant coefficients are retained as inputs (Monserud & Marshall 2001; Sarkar et al. 2002).
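
A minimal sketch of the sample autocorrelation coefficient at a given lag is shown below. It uses the standard estimator (lagged covariance divided by the variance of the whole series) and a synthetic series with a perfect 12-step cycle; the function name `autocorrelation` and the data are illustrative assumptions, not the Liaoyuan records or the MATLAB routine used in the study.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom

# Synthetic monthly series with a wet summer peak, repeated for 10 "years".
cycle = [0, 0, 5, 20, 50, 90, 160, 150, 60, 30, 10, 0]
series = cycle * 10

r12 = autocorrelation(series, 12)  # same month one year earlier: strongly positive
r6 = autocorrelation(series, 6)    # opposite season: negative
```

For a strongly seasonal series, lags that are multiples of 12 months stand out, which is consistent with using the same month of previous years as inputs.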

The precipitation of a given month is expected to be related to the precipitation of the same month in the n previous years, as well as to the precipitation of the k immediately preceding months. Therefore, two autocorrelation analyses were carried out to determine n and k, respectively, using the monthly precipitation data of 1971–2009 at the Weiguo and Shiyi stations. The autocorrelation analyses, run on the MATLAB 2014 platform, gave n = 20 and k = 6, which means that the precipitation of a given month was significantly related to the precipitation of the same month in the 20 previous years and to that of the 6 preceding months. Therefore, the precipitation prediction model has 26 input variables (Pt−1, Pt−2, Pt−3, Pt−4, Pt−5, Pt−6, Pt−12, Pt−24, Pt−36, Pt−48, Pt−60, Pt−72, Pt−84, Pt−96, Pt−108, Pt−120, Pt−132, Pt−144, Pt−156, Pt−168, Pt−180, Pt−192, Pt−204, Pt−216, Pt−228, Pt−240) and one output variable (Pt).
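
The list of 26 lags can be generated programmatically, which makes the structure of the input vector explicit. The helper name `build_lags` and its parameter names are hypothetical; the values k = 6 and n = 20 are those reported above.

```python
def build_lags(k_months=6, n_years=20, period=12):
    """Lags in months: the k preceding months plus the same month in n previous years."""
    recent = list(range(1, k_months + 1))                  # 1, 2, ..., 6
    yearly = [period * y for y in range(1, n_years + 1)]   # 12, 24, ..., 240
    return recent + yearly

lags = build_lags()
```

The longest lag is 240 months, so a full input vector is only available from the 241st month of the record onward.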

Prediction model developed with BPANN, RBFANN and Kriging

For each station, the monthly precipitation series of 1971–2009 contains 468 values. Because the longest lag is 240 months, the first 240 months can serve only as inputs, which leaves 468 − 240 = 228 input–output samples. These samples were divided into two sets: a training set containing the earlier 156 samples and a validation set containing the later 72 samples. Three prediction models were developed with the BPANN, RBFANN and Kriging methods for each precipitation station.
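
The construction of the 228 samples and the 156/72 split can be sketched as follows. The placeholder series simply counts months and stands in for the real precipitation record; `make_samples` and the variable names are illustrative assumptions.

```python
def make_samples(series, lags):
    """Pair each target P_t with its lagged inputs [P_(t-lag) for each lag]."""
    start = max(lags)  # first index for which every lagged value exists
    inputs = [[series[t - lag] for lag in lags] for t in range(start, len(series))]
    targets = series[start:]
    return inputs, targets

lags = list(range(1, 7)) + [12 * y for y in range(1, 21)]
series = [float(m) for m in range(468)]  # placeholder for 468 monthly values

X, y = make_samples(series, lags)
X_train, y_train = X[:156], y[:156]      # earlier 156 samples
X_valid, y_valid = X[156:], y[156:]      # later 72 samples
```

Splitting chronologically rather than at random keeps the validation period strictly later than the training period, as in the study.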

For the BPANNs, at both Weiguo station and Shiyi station, three-layer, four-layer and five-layer networks were all tested to build the relationship between the precipitation of a given month and previous precipitation. The Log-Sigmoid function was used as the transfer function of the hidden layer.

For the RBFANNs, at both Weiguo station and Shiyi station, the Gauss function was used as the transfer function, and the orthogonal least squares method was used for network training.

For the Kriging models, at both Weiguo station and Shiyi station, polynomial functions of order 1 and order 2 were used as deterministic functions, while the Gauss function was used as the correlation function.

The parameters of all three models were determined through an optimization process with a genetic algorithm. In the genetic algorithm search, the selection probability, crossover probability and mutation probability were set to 0.9, 0.7 and 0.05, respectively, and the initial population size and the maximum number of generations were set to 200 and 100, respectively. All three prediction models were built and trained on the MATLAB 2014 platform.
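
The sketch below shows how a simple real-coded genetic algorithm with these settings might look. It is not the authors' MATLAB implementation: the operators (binary tournament selection, arithmetic crossover, uniform mutation, elitism) are assumptions, and so is the reading of the "selection probability" as the fraction of the population replaced by offspring in each generation. The toy objective `sphere` stands in for a model's validation error.

```python
import random

def genetic_minimize(objective, bounds, pop_size=200, generations=100,
                     p_select=0.9, p_cross=0.7, p_mut=0.05, seed=0):
    """Minimize `objective` over the box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(pop):
        # Binary tournament: the better of two randomly drawn individuals.
        return min(rng.sample(pop, 2), key=objective)

    pop = sorted((random_individual() for _ in range(pop_size)), key=objective)
    # Assumption: p_select is the fraction of the population replaced per generation.
    n_offspring = int(p_select * pop_size)
    for _ in range(generations):
        offspring = []
        while len(offspring) < n_offspring:
            a, b = tournament(pop), tournament(pop)
            child = list(a)
            if rng.random() < p_cross:
                # Arithmetic crossover: a random convex mix of the two parents.
                mix = rng.random()
                child = [mix * u + (1.0 - mix) * v for u, v in zip(a, b)]
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < p_mut:
                    child[j] = rng.uniform(lo, hi)  # uniform mutation of gene j
            offspring.append(child)
        # Elitism: the best parents fill the slots not taken by offspring.
        pop = sorted(pop[:pop_size - n_offspring] + offspring, key=objective)
    return pop[0]

def sphere(v):
    """Toy objective standing in for a model's validation error."""
    return sum(c * c for c in v)

best = genetic_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a setting like the one described, the decision variables would be model parameters such as the number of hidden neurons or the correlation parameters, and the objective would be the prediction error of the resulting model.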

RESULTS AND DISCUSSION

Comparison of prediction models performance

Prediction model accuracy analysis

For each model construction method, the developed prediction model was used to predict the precipitation of the 72 validation samples. Taking Shiyi station as an example, the effect of model structure on model accuracy was analyzed. Figure 2 shows boxplots of the absolute error of the different prediction models at Shiyi station. For the BPANN model, the four-layer BPANN obtained the highest approximation accuracy; for the RBFANN model, the network with the Gauss function as transfer function obtained acceptable accuracy, and parameter optimization gave an optimal number of 16 hidden neurons; for the Kriging model, the model with a first-order polynomial as regression function obtained the highest approximation accuracy. Therefore, the four-layer BPANN model, the RBFANN model with the Gauss function as transfer function and the Kriging model with a first-order polynomial as regression function were selected as the prediction models.
Figure 2. Absolute error boxplots of the prediction models.

In this study, root mean squared error (RMSE), mean absolute error (MAE) and coefficient of determination (R²) are used to assess the accuracy of each model. The accuracies of the three models and of an LR model used as a baseline are shown in Table 1. Scatter diagrams of observed vs. predicted precipitation for the different models are shown in Figure 3; the closer the points lie to the 1:1 line, the higher the accuracy. Both Table 1 and Figure 3 show that, at both stations, the RBFANN model had the highest prediction accuracy on all three measures. The Kriging model ranked second by RMSE and R², although its MAE at Shiyi station (29.00 mm) was marginally higher than that of the BPANN model (28.85 mm). Overall, the BPANN model had the lowest prediction accuracy of the three models. The RBFANN, BPANN and Kriging models all performed better than the LR model.
Table 1. Accuracy measures of the prediction models at Shiyi and Weiguo stations

Model | Shiyi RMSE (mm) | Shiyi MAE (mm) | Shiyi R² | Weiguo RMSE (mm) | Weiguo MAE (mm) | Weiguo R²
BPANN model | 43.41 | 28.85 | 0.64 | 39.95 | 27.80 | 0.64
RBFANN model | 31.59 | 21.16 | 0.84 | 29.43 | 20.82 | 0.80
Kriging model | 37.06 | 29.00 | 0.74 | 35.92 | 24.48 | 0.70
LR model | 48.54 | 31.83 | 0.56 | 43.46 | 28.92 | 0.57
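
The three accuracy measures reported in Table 1 can be computed as in the following sketch. The source does not state which definition of R² was used; the code assumes 1 − SS_res/SS_tot, and the observed and predicted values are made-up numbers for illustration only.

```python
import math

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

obs = [10.0, 20.0, 30.0, 40.0]    # hypothetical observed precipitation (mm)
pred = [12.0, 18.0, 33.0, 37.0]   # hypothetical predicted precipitation (mm)
scores = (rmse(obs, pred), mae(obs, pred), r_squared(obs, pred))
```

RMSE penalizes large errors more heavily than MAE, so a model can rank differently on the two measures, as Kriging and BPANN do at Shiyi station.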
Figure 3. Observed precipitation vs. predicted precipitation of validation samples with BPANN, RBFANN, Kriging and LR prediction model for: (a) Shiyi station and (b) Weiguo station.

To show how these prediction models perform throughout the validation period, the time series of the observed and predicted monthly precipitation are presented in Figure 4. Figure 4 shows that all of the prediction models tended to underestimate or overestimate the precipitation in June, July and August, whereas the estimates for the other months were closer to the observed data. A possible reason is that precipitation from June to August is extremely high and such high values are relatively rare in the training data, so the prediction models may be unable to learn appropriate rules for high precipitation from the training data set.
Figure 4. Time series of the observed and predicted monthly precipitation for: (a) Shiyi station and (b) Weiguo station.

Figure 5. Predicted monthly precipitation of 2010–2019 at (a) Shiyi station and (b) Weiguo station for Liaoyuan city with RBFANN.

Prediction model stability analysis

In the BPANN training process, markedly different results were obtained each time the network was trained, which indicates poor stability. In contrast, the RBFANN and Kriging models gave the same results each time they were trained, which indicates good stability.

Prediction model computational cost analysis

Each run of the BPANN prediction model needed an average of 6.0 s of CPU time on a PC with a 3.6 GHz Intel Core i7 CPU and 8 GB of RAM. Because BPANN obtained different results each time it was trained, 60 training runs were carried out in this study before a satisfactory result was obtained, which corresponds to roughly 60 × 6.0 s = 360 s of CPU time in total. For the RBFANN and Kriging models, a single training run was sufficient to reach reasonable accuracy; on the same platform, each run of the RBFANN and Kriging models needed an average of 2.9 s and 0.5 s of CPU time, respectively.

From the above observations, we conclude that RBFANN is the most dependable method for precipitation prediction when accuracy, stability and computational cost are considered together, followed by the Kriging method. The BPANN model has low accuracy, poor stability and a high computational cost.

Next 10 years' precipitation prediction

Because RBFANN was found to be superior to the other two prediction models, the developed RBFANN model was used to predict the monthly precipitation for 2010–2019 at the Shiyi and Weiguo stations. The prediction results are presented in Figure 5, which shows that from 2010 to 2019 the predicted precipitation at Shiyi station is higher than that at Weiguo station. For Shiyi station, the maximum predicted monthly precipitation occurs in July 2010, while for Weiguo station it occurs in August 2013.
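
The source does not describe how the lagged inputs were obtained for months after December 2009, when observed values for the most recent lags are no longer available. One common option, assumed here purely for illustration, is recursive forecasting, in which each predicted value is appended to the series and reused as an input for later months. In the sketch below, `recursive_forecast` and the stand-in model `seasonal_naive` are hypothetical; a trained RBFANN would take the place of the stand-in.

```python
def recursive_forecast(history, lags, predict_one, steps):
    """Forecast `steps` values ahead, feeding each prediction back as an input."""
    series = list(history)  # copy so the caller's data is left untouched
    for _ in range(steps):
        t = len(series)
        inputs = [series[t - lag] for lag in lags]
        series.append(predict_one(inputs))
    return series[len(history):]

lags = list(range(1, 7)) + [12 * y for y in range(1, 21)]

def seasonal_naive(inputs):
    """Stand-in model: repeat the value observed 12 months earlier."""
    return inputs[lags.index(12)]

history = [float(m % 12) for m in range(468)]  # placeholder for the 1971-2009 record
forecast = recursive_forecast(history, lags, seasonal_naive, steps=120)
```

With this approach, errors in early predictions propagate into later ones, so the uncertainty of a 10-year forecast is expected to grow with the lead time.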

CONCLUSIONS

Autocorrelation analyses were first used to determine the input variables of the prediction models. The analyses showed that the precipitation of a given month was significantly related to the precipitation of the same month in the 20 previous years, as well as to that of the 6 preceding months. BPANN, RBFANN and Kriging methods were then used to build the prediction models, and model performance was compared in terms of accuracy, stability and computational cost.

The comparison results showed the following. (1) RBFANN performed best overall: it was the most accurate model and, like Kriging, it was stable and needed only a single training run (2.9 s of CPU time, compared with 0.5 s for Kriging). Kriging ranked second. BPANN had low accuracy, low stability and a high computational cost, which makes it unsuitable for precipitation prediction in this area. (2) The RBFANN model was used to predict the monthly precipitation for 2010–2019. These prediction results can provide technical support for water resources management and disaster prevention in Liaoyuan city.

ACKNOWLEDGEMENTS

This research was supported by the Nature Science Foundation of China (No. 41502221 and 41372237) and China Postdoctoral Science Foundation (No. 2015M570275).

REFERENCES

Aksoy, H. & Dahamsheh, A. 2009 Artificial neural network models for forecasting monthly precipitation in Jordan. Stochastic Environmental Research and Risk Assessment 23 (7), 917–931.
Baddari, K., Aïfa, T., Djarfour, N. & Ferahtia, J. 2009 Application of a radial basis function artificial neural network to seismic data inversion. Computational Geosciences 35 (12), 2338–2344.
Box, G. E. & Jenkins, G. M. 1976 Time series analysis: forecasting and control. Holden-Day.
Burlando, P., Rosso, R., Cadavid, L. G. & Salas, J. D. 1993 Forecasting of short-term rainfall using ARMA models. Journal of Hydrology 144 (1), 193–211.
Chen, S., Cowan, C. F. N. & Grant, P. M. 1991 Orthogonal least squares learning algorithm for radial basis function networks. IEEE Transactions on Neural Networks, pp. 302–309.
Delhomme, J. 1978 Kriging in the hydrosciences. Advances in Water Resources 1 (5), 251–266.
Delleur, J. W. & Kavvas, M. L. 1978 Stochastic models for monthly rainfall forecasting and synthetic generation. Journal of Applied Meteorology 17 (10), 1528–1536.
DelSole, T. & Shukla, J. 2002 Linear prediction of Indian monsoon rainfall. Journal of Climate 15 (24), 3645–3658.
Fan, Z., Taishan, L. & Liping, Z. 2007 BP Neural Network Modeling of Infrared Methane Detector for Temperature Compensation. IEEE, pp. 4-123–4-126.
French, M. N., Krajewski, W. F. & Cuykendall, R. R. 1992 Rainfall forecasting in space and time using a neural network. Journal of Hydrology 137 (1), 1–31.
Georgakakos, K. P. & Bras, R. L. 1984 A hydrologically useful station precipitation model. 1. Formulation. Water Resources Research 20 (11), 1585–1596.
Hung, N. Q., Babel, M. S., Weesakul, S. & Tripathi, N. 2009 An artificial neural network model for rainfall forecasting in Bangkok, Thailand. Hydrology and Earth System Sciences 13 (8), 1413–1425.
Kégl, B., Krzyzak, A. & Niemann, H. 2000 Radial basis function networks and complexity regularization in function learning and classification. ICPR. IEEE, 2081 pp.
Luk, K. C., Ball, J. E. & Sharma, A. 2001 An application of artificial neural networks for rainfall forecasting. Mathematical and Computer Modelling 33 (6), 683–693.
Luo, Y., Xu, C. & Fan, Y. 2010 The comparison of RBF and BP neural network in decoupling of DTG. Proceedings of the Third International Symposium on Computer Science and Computational Technology, pp. 159–162.
Mar, K. W. & Naing, T. T. 2008 Optimum neural network architecture for precipitation prediction of Myanmar. World Academy of Science, Engineering and Technology 48, 130–134.
Matheron, G. 1963 Principles of geostatistics. Economic Geology 58 (8), 1246–1266.
McClelland, J. & Rumelhart, D. 1986 Parallel Distributed Processing. MIT Press, Cambridge, MA, USA.
Moradkhani, H., Hsu, K.-l., Gupta, H. V. & Sorooshian, S. 2004 Improved streamflow forecasting using self-organizing radial basis function artificial neural networks. Journal of Hydrology 295 (1), 246–262.
Olsson, J., Uvo, C., Jinno, K., Kawamura, A., Nishiyama, K., Koreeda, N., Nakashima, T. & Morita, O. 2004 Neural networks for rainfall forecasting by atmospheric downscaling. Journal of Hydrologic Engineering 9 (1), 1–12.
Partal, T. & Cigizoglu, H. K. 2009 Prediction of daily precipitation using wavelet–neural networks. Hydrological Sciences Journal 54 (2), 234–246.
Partal, T. & Kisi, Ö. 2007 Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. Journal of Hydrology 342 (1–2), 199–212.
Powell, M. J. 1987 Radial Basis Functions for Multivariable Interpolation: A Review. Algorithms for Approximation. Clarendon Press, New York, USA, pp. 143–167.
Queipo, N. V., Haftka, R. T., Shyy, W., Goel, T., Vaidyanathan, R. & Kevin Tucker, P. 2005 Surrogate-based analysis and optimization. Progress in Aerospace Sciences 41 (1), 1–28.
Ryu, J. S., Kim, M. S., Cha, K. J., Lee, T. H. & Choi, D. H. 2002 Kriging interpolation methods in geostatistics and DACE model. Journal of Mechanical Science and Technology 16 (5), 619–632.
Sarkar, A., Basu, S., Varma, A. & Kshatriya, J. 2002 Auto-correlation analysis of ocean surface wind vectors. Journal of Earth System Science 111 (3), 297–303.
Shukla, J. & Mooley, D. 1987 Empirical prediction of the summer monsoon rainfall over India. Monthly Weather Review 115 (3), 695–704.
Weesakul, U. & Lowanichchai, S. 2005 Rainfall forecast for agricultural water allocation planning in Thailand. Thammasat International Journal of Science and Technology 10 (3), 18–27.
Zhang, C., Qi, R. & Qiu, Z. 2008 Comparing BP and RBF neural network for forecasting the resident consumer level by MATLAB. Computer and Electrical Engineering, 2008. ICCEE 2008. International Conference on. IEEE, pp. 169–172.