Precipitation prediction is important in the Thua Thien Hue Province, which is affected by climate change. This paper therefore applies two models, the Seasonal Auto-Regressive Integrated Moving Average (SARIMA) model and the Long Short-Term Memory (LSTM) model, to predict precipitation in the province. The input data were collected at three meteorological stations for the period 1980–2018. A comparison of the two models showed that the LSTM model forecast precipitation more accurately than the SARIMA model at the Hue, Aluoi, and Namdong stations. The best forecast is for the Hue station (R2 = 0.94, NSE = 0.94, RMSE = 8.15), the second best is for the Aluoi station (R2 = 0.89, NSE = 0.89, RMSE = 12.72), and the least accurate is for the Namdong station (R2 = 0.89, NSE = 0.89, RMSE = 12.81). The results may also support stakeholders who apply these models to future data to mitigate natural disasters in Thua Thien Hue.

  • The SARIMA and LSTM models can improve the accuracy of monthly precipitation forecasting in the Thua Thien Hue Province.

  • A local precipitation forecast system, which relies on a neural network and on meteorological data collected from the Hue, Aluoi, and Namdong stations, is presented.

  • Min–Max normalization is applied to the data to improve the accuracy of the models' precipitation forecasts.

  • The SARIMA and LSTM forecasts are compared using NSE, R2, and RMSE.

  • The prediction of LSTM is significantly better than SARIMA for the monthly precipitation regime.

In recent decades, global climate change has caused sea-level rise, more frequent droughts, and extreme flooding. These dangerous weather phenomena, which are almost becoming a pattern in modern times, threaten food security and endanger the lives of several hundred million people on earth (Mall et al. 2006; Stuart et al. 2011; Busby 2018). Several countries are experiencing harsh climates, and the least-developed countries are the most affected (Mango et al. 2011; Elliott et al. 2014; Obianyo 2019; Oo et al. 2020). Climate change has led to the worst precipitation scenarios, because changes in the rain cycle have affected the agricultural sector, the management and reserves of water sources, groundwater, and the flow of rivers (Liu et al. 1998; Liu et al. 2000; Mirza 2003; Kotir 2011; Defrance et al. 2020; Javadinejad et al. 2020).

The mitigation and control of natural disasters require meteorological forecasts that are early, accurate, and easy to understand. Many recent meteorological prediction studies are based on artificial intelligence methods such as machine learning and neural networks, and their results show high accuracy in predicting precipitation, storms, and droughts over both short and long periods (Nourani et al. 2011; Deka 2014; Du et al. 2018). Weather-related data are usually time-series data, so time-series forecasting methods are commonly applied to weather prediction (Mishra et al. 2007; Park & Kim 2017). Several studies have used supervised machine learning models such as Long Short-Term Memory Recurrent Neural Networks (LSTM RNNs), as well as statistical models such as the Seasonal Auto-Regressive Integrated Moving Average (SARIMA) or Auto-Regressive Integrated Moving Average (ARIMA), to predict and analyze time-series data (Anwar et al. 2016; Chen et al. 2017; Parmezan et al. 2019). In an LSTM, signals can also travel backward, because these networks have feedback connections (Kalchbrenner et al. 2015; Salehinejad et al. 2017). Valipour (2015) used the SARIMA and ARIMA models to forecast long-term runoff for 2011 from data covering 1901 to 2010 in the United States; the results indicated that the SARIMA model was more accurate than the ARIMA model. Sampson et al. (2013) employed the SARIMA model to forecast precipitation using data from January 1980 to December 2010 in the Navrongo Municipality of Ghana. The best SARIMA model for that forecast had (p, d, q) × (P, D, Q)s parameters of (0, 0, 1) × (0, 1, 1)12. Bibi et al. (2014) applied the ARIMA time-series model to monthly precipitation prediction with 27 years of data (1980 to 2006) in Northeastern Nigeria. The model showed the monthly precipitation tendency and the number of rainy days per month over the six months from May to October. Hu et al. (2018) applied ANN and LSTM network models to precipitation-runoff prediction. The research data were collected from 14 precipitation stations and one hydrologic station in the catchment for flood events from 1971 to 2013 in the Fen River Basin. The results indicated that both network models were suitable for precipitation-runoff modeling. Ni et al. (2020) implemented the LSTM model to forecast streamflow and precipitation. The monthly streamflow volume data were collected from the Cuntan and Hankou stations in the Yangtze River basin, China. The results showed that LSTM was also suitable for time-series prediction.

This paper studies precipitation prediction using the SARIMA and LSTM models for the Thua Thien Hue Province. The input vectors used in the models are based on 468 months (39 years) of precipitation measured at the three main meteorological stations of the province: Hue, Aluoi, and Namdong. The comparison between SARIMA and LSTM relies on statistical parameters such as the mean (M), root mean square error (RMSE), Nash–Sutcliffe Efficiency (NSE), minimum (Min), maximum (Max), standard deviation (St. Dev.), coefficient of variation (Cv), skewness coefficient (Cs), and coefficient of determination (R2). Together, these results indicate how well each model performs for precipitation prediction. Furthermore, the forecasts can assist the province in mitigating natural disasters such as floods, droughts, and landslides.

The structure of the paper is organized as follows. Section 1 gives the introduction to the paper. Section 2 introduces the methodology used throughout this paper. Sections 3 and 4 describe the study results and present discussions. Finally, Section 5 presents the conclusions.

Study area

The Thua Thien Hue Province belongs to the central coast of Vietnam (see Figure 1). The province is a tropical monsoon region with a complicated topography and climate. This province is one of the areas greatly affected by natural disasters in Vietnam (Do 2002; Lee & Lee 2017; Huynh et al. 2021; Nguyen et al. 2021). Almost annually, the province has to contend with natural disasters like floods, storms, droughts, and so on. These extreme disasters include past events like the floods of 1983, 2007, 2011, and 2015, the destructive storms of 1985, 2006, and 2013, and the floods of 1999 and 2020 that were unprecedented in terms of recorded history. These disasters caused huge loss of human lives and damage to property in the province. Moreover, nowadays, they occur with intense frequency and unfailing regularity, breaking the established law of the weather and climate.

Figure 1

Position of three meteorological stations. Please refer to the online version of this paper to see this figure in colour: https://dx.doi.org/10.2166.wcc.2022.271.

The red–blue dots in Figure 1 denote the locations of the Hue, Aluoi, and Namdong meteorological stations. The precipitation observed in these stations greatly influences the province river system. The precipitation obtained in these stations indicates that the average annual precipitation of the province is unevenly distributed over both space and time. With regard to space, the precipitation of the mountain area fluctuates from 3,400 to 7,000 mm and that of the delta area ranges from 2,100 to 3,000 mm. With regard to time, the season of less rainfall is from January to August, and the actual rainy season is from September to December (for details, see Figure 2).

Figure 2

Heatmap of annual precipitation in the Hue, Aluoi, and Namdong stations.

Long short-term memory

LSTM is a member of the family of RNN and was first introduced in the year 1997 (Hochreiter & Schmidhuber 1997; Chakraborty et al. 2016; Vazhayil & Soman 2018). LSTM can record values from previous periods for future applications (Mackenzie et al. 2018; Siami-Namini et al. 2018). Before describing LSTM, this study will introduce the basic concept of neural networks.

Artificial neural network

A neural network includes at least three core layers, namely, an input layer, a hidden layer, and an output layer (Huang 2003). The number of features in the dataset determines the dimension, or number of nodes, of the input layer. Input nodes receive signals in a form that can be represented numerically. Each node is assigned a number called its activation value, and a higher number means a greater activation. This information is then transmitted through the network. Each connection carries a weight, and the activation value is passed from one node to another according to the connection strength (weight), inhibition or excitation, and the transfer function (Murray & Edwards 1993; Kaushik et al. 2020). Neural networks learn essentially by adjusting these weights (Smith & Demetsky 1994; Chang 2012). In the hidden layers, the nodes apply an activation function (ReLU in this study; see Table 1) to the weighted sum of their inputs and pass the result toward the output layer, which produces the predicted values. The output layer produces a vector of values for the output nodes, and training seeks the weights that give the minimum error rate, that is, the smallest difference between expected and predicted values. The error obtained after the first round of training may not be the best, because it depends on the initial assignment of the weight vectors. The ‘backpropagation’ algorithm therefore propagates the errors backward through the network, from the output layer to the hidden layers, and the weights are adjusted to improve the predicted values. The training procedure is repeated until the predicted values approach the actual values (Werbos 1990; Paola & Schowengerdt 1995).
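The forward pass and backpropagation update described above can be sketched for a network with one hidden layer. This is a minimal illustration, not the configuration used in the study: the function name `train_step`, the learning rate, the layer sizes, and the plain gradient-descent update (rather than Adam) are all assumptions made for the example.

```python
import numpy as np

def relu(z):
    """Rectified linear unit, applied element-wise."""
    return np.maximum(z, 0.0)

def train_step(x, y, W1, b1, W2, b2, lr=0.05):
    """One forward pass and one backpropagation update (hypothetical helper).

    x: (n, n_in) inputs, y: (n, 1) targets. Returns the MSE loss computed
    before the update and the updated parameters.
    """
    # Forward pass: weighted sum -> ReLU -> linear output layer.
    z1 = x @ W1 + b1
    a1 = relu(z1)
    y_hat = a1 @ W2 + b2
    err = y_hat - y
    loss = float(np.mean(err ** 2))

    # Backward pass: propagate the error from the output layer to the hidden layer.
    n = x.shape[0]
    d_yhat = 2.0 * err / n            # gradient of the MSE w.r.t. y_hat
    dW2 = a1.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_z1 = (d_yhat @ W2.T) * (z1 > 0)  # ReLU passes gradient only where z1 > 0
    dW1 = x.T @ d_z1
    db1 = d_z1.sum(axis=0)

    # Gradient-descent weight adjustment.
    return loss, (W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2)
```

Calling `train_step` repeatedly and feeding the returned parameters back in corresponds to repeating the training procedure until the predictions approach the targets.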

Recurrent neural network

The RNN is a special case of a neural network. Its goal is to predict the next step in a sequence of observations from the previous steps in the same sequence. In other words, an RNN has memory and can remember previously computed information (Mohan & Gaitonde 2018; Siami-Namini et al. 2018; Balderas et al. 2019). Traditional neural networks assume that inputs and outputs are independent of each other, whereas the idea behind the RNN is to take advantage of sequential observations and learn from previous periods to predict future trends. Hence, the earlier stages of the data need to be memorized when predicting the next steps (Kraus & Feuerriegel 2017; Siami-Namini et al. 2018, 2019). In RNNs, hidden layers act as internal storage for the information obtained in earlier steps of the sequence. RNNs are called ‘recurrent’ because they perform the same task for every element of the sequence, using previously collected information to predict future, unseen sequence data. The major problem of generic RNNs is that they remember only a few previous steps in the sequence and are not suitable for memorizing longer data sequences. This problem is addressed by the ‘memory stream’ used in LSTM (Lipton et al. 2015; Kraus & Feuerriegel 2017; Yang et al. 2018; Young et al. 2018; Wu et al. 2020a, 2020b).

Long short-term memory

LSTM consists of multiple connected LSTM cells, and the specific structure is shown in Figure 3 (Bermúdez et al. 2017; Wu et al. 2020a, 2020b). The idea behind LSTM is to add an internal (cell) state $s_t$ and three filter gates between the input and the output of each cell. These gates are the forget gate $f_t$, the input gate $i_t$, and the output gate $o_t$. At each time step t, the gates take the current input $x_t$ and the hidden state $h_{t-1}$ output by the memory cell at the previous time step t–1.

  • The forget gate decides whether the cell-state information from time step t–1 should be kept. Information from the current input and the previous hidden state is passed through the sigmoid function, whose output lies in the range (0, 1). If the forget gate output is close to 1, the information is retained; if it is close to 0, the information is discarded.

  • The input gate is responsible for updating the information in the cell state. The output of the sigmoid activation is multiplied by the output of the tanh activation to decide how much of the information from the current input and the previous hidden state should be written to the cell state.

  • The output gate is responsible for calculating the hidden state value of the next time step. Using the forget and input gates, the new value of the cell state can be calculated, and the current input and the hidden state values can be combined to calculate the next hidden state value. The next hidden state value is the prediction value.

Figure 3

LSTM neural network structure.

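The sequential update formulas, written here in the standard LSTM form, are
$$f_t = \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \tag{1}$$
$$i_t = \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \tag{2}$$
$$o_t = \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \tag{3}$$
$$\tilde{s}_t = \tanh\left(W_s\,[h_{t-1}, x_t] + b_s\right) \tag{4}$$
$$s_t = f_t \odot s_{t-1} + i_t \odot \tilde{s}_t \tag{5}$$
$$h_t = o_t \odot \tanh(s_t) \tag{6}$$
$$y_t = W_y\,h_t + b_y \tag{7}$$
where $x_t$ is the input vector (forcing and static attributes) for time step t; $[h_{t-1}, x_t]$ is the concatenation of the previous hidden state and the current input; $W_f$, $W_i$, $W_o$, $W_s$, and $W_y$ are the network weights; $b_f$, $b_i$, $b_o$, $b_s$, and $b_y$ are the bias parameters of the forget gate, input gate, output gate, state, and output layer, respectively; y is the output to be compared with observations; h is the hidden state; s is the cell state of the memory cells; σ is the sigmoid function; $\odot$ is element-wise multiplication; and tanh is the hyperbolic tangent function.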
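A single LSTM time step can be sketched in NumPy as follows. This is an illustrative sketch of the standard LSTM update with a concatenated `[h, x]` vector, not the implementation used in the study; the parameter names (`W_f`, `b_f`, and so on), the helper `init_params`, and the small random initialization are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, n_out, seed=0):
    """Hypothetical helper: small random weights and zero biases."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in ("f", "i", "o", "s"):
        p["W_" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_hidden + n_in))
        p["b_" + gate] = np.zeros(n_hidden)
    p["W_y"] = rng.normal(0.0, 0.1, (n_out, n_hidden))
    p["b_y"] = np.zeros(n_out)
    return p

def lstm_step(x_t, h_prev, s_prev, p):
    """One LSTM update: returns output y_t, hidden state h_t, cell state s_t."""
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    f = sigmoid(p["W_f"] @ z + p["b_f"])         # forget gate
    i = sigmoid(p["W_i"] @ z + p["b_i"])         # input gate
    o = sigmoid(p["W_o"] @ z + p["b_o"])         # output gate
    s_tilde = np.tanh(p["W_s"] @ z + p["b_s"])   # candidate cell state
    s = f * s_prev + i * s_tilde                 # keep old memory, add new
    h = o * np.tanh(s)                           # hidden state
    y = p["W_y"] @ h + p["b_y"]                  # linear output layer
    return y, h, s
```

Feeding a sequence through `lstm_step` one value at a time, while carrying `h` and `s` forward, is what allows the cell to retain information from earlier months.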

Obtaining the best generalization from a network model requires manual calibration of its hyperparameters. The main hyperparameters calibrated during model training are the number of layers, number of nodes, batch size, verbose setting, and number of epochs. With suitable hyperparameters, the model can make better predictions. In this paper, the main settings of the LSTM model built for the three meteorological stations of Hue, Aluoi, and Namdong are given in Table 1. The model consists of an LSTM layer followed by a single connected layer, as recommended by some studies (Zhang et al. 2018; Ayzel 2019). The trainable parameters of the LSTM neural network are its weights and biases, which are updated through the backpropagation through time (BPTT) algorithm (Werbos 1990). The training process is configured as follows. The loss function is the mean squared error (MSE). The number of hidden neurons in the LSTM layer is 200. The network is trained over many epochs to minimize the MSE using the adaptive moment estimation (Adam) optimizer (Kingma & Ba 2014). Two epoch limits are tested in this study: 2,000 and 10,000. The batch size equals the size of the training dataset (336 samples), so each update uses the full training set; this is known as batch gradient descent.
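One way the 336 training samples could arise is a sliding window in which the 12 preceding months form the input and the following month is the target. The text states only the number of inputs (12), the number of samples (336), and a 1980–2008 training period (348 months); the windowing scheme and the function name `make_windows` below are assumptions for illustration.

```python
import numpy as np

def make_windows(series, n_inputs=12):
    """Turn a 1-D series into (inputs, target) pairs.

    Row k of X holds series[k : k + n_inputs]; y[k] is the value that follows.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_inputs
    X = np.stack([series[k:k + n_inputs] for k in range(n_samples)])
    y = series[n_inputs:]
    return X, y

# 29 training years x 12 months = 348 values -> 348 - 12 = 336 samples.
X_train, y_train = make_windows(np.arange(348.0), n_inputs=12)
print(X_train.shape, y_train.shape)
```

Under this assumption, a batch size of 336 means that every gradient update sees all training windows at once.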

Table 1

Basic component of the LSTM model

| Component | Value | Component | Value |
|---|---|---|---|
| Number of inputs | 12 | Number of outputs | 1 |
| Number of hidden nodes | 200 | Activation function | ReLU |
| Optimizer | Adam | Loss function | MSE |
| Epoch | 2,000; 10,000 | Verbose | |
| Batch size | 336 | Metrics | Mean Absolute Error |

SARIMA model

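The SARIMA model is used to deal with seasonality; seasonal differencing of an appropriate order removes non-stationarity from the series. For a monthly time-series, s = 12, and the model is generally written as SARIMA (p, d, q) × (P, D, Q)s. Let $B$ denote the backshift operator, $B\,y_t = y_{t-1}$, so that the seasonal difference is $(1 - B^s)\,y_t = y_t - y_{t-s}$. The SARIMA model with (p, d, q) non-seasonal order terms and (P, D, Q) seasonal order terms has the structure given below (Kardakos et al. 2013; Vagropoulos et al. 2016; Mao et al. 2018):
$$\phi_p(B)\,\Phi_P(B^s)\,(1 - B)^d\,(1 - B^s)^D\,y_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t \tag{8}$$
where $\phi_p(B)$ and $\theta_q(B)$ denote the non-seasonal autoregressive and moving-average operators of orders p and q, $\Phi_P(B^s)$ and $\Theta_Q(B^s)$ are their seasonal counterparts of orders P and Q, d and D are the non-seasonal and seasonal differencing orders, and $\varepsilon_t$ represents white noise with zero mean and standard deviation $\sigma$. The operators can be written as follows:
$$\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p \tag{9}$$
$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \dots - \Phi_P B^{Ps} \tag{10}$$
$$\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \dots - \theta_q B^q \tag{11}$$
$$\Theta_Q(B^s) = 1 - \Theta_1 B^s - \Theta_2 B^{2s} - \dots - \Theta_Q B^{Qs} \tag{12}$$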
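The differencing part of the SARIMA model, $(1 - B)^d (1 - B^s)^D y_t$, can be illustrated directly. The sketch below is only a demonstration of the operators on a synthetic series; the function names and the example data are assumptions, and no model fitting is performed.

```python
import numpy as np

def difference(y, lag=1):
    """Apply (1 - B^lag): y_t - y_{t-lag}. The result is `lag` values shorter."""
    y = np.asarray(y, dtype=float)
    return y[lag:] - y[:-lag]

def sarima_difference(y, d=1, D=1, s=12):
    """Apply D seasonal differences of period s, then d ordinary differences."""
    out = np.asarray(y, dtype=float)
    for _ in range(D):
        out = difference(out, lag=s)
    for _ in range(d):
        out = difference(out, lag=1)
    return out

# Synthetic monthly series: linear trend plus a fixed 12-month pattern.
t = np.arange(48)
pattern = np.array([1, 2, 3, 5, 8, 13, 21, 13, 8, 5, 3, 2], dtype=float)
y = 0.5 * t + pattern[t % 12]

# The seasonal difference removes the repeating pattern (leaving the constant
# 12 * 0.5 = 6 from the trend); the ordinary difference then removes that too.
w = sarima_difference(y, d=1, D=1, s=12)
print(w.shape, np.allclose(w, 0.0))
```

With d = 1, D = 1, and s = 12, a series of n months yields n − 13 differenced values, which is worth keeping in mind when counting the observations available for estimation.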

In this paper, the SARIMA model is deployed in four phases: determination, estimation, verification, and execution (forecast). These phases are carried out to determine the optimal predictive model for each station (Salas & Obeysekera 1982; Burlando et al. 1993). In the first phase, the stationarity of the data and the general form, or estimated order, of the model are determined. In the second phase, the Augmented Dickey–Fuller (ADF) test is used to evaluate the p-value against the critical-value cutoffs. If the test statistic is lower than the critical value, the time-series is considered stationary (Guo & Ogata 1997; Avishek & Prakash 2017); the unit root test thus decides whether the null hypothesis of a unit root (non-stationarity) is rejected (Said & Dickey 1984; Dabral & Murry 2017). The maximum likelihood method is then applied to estimate the model parameters, and the Ljung–Box test is used to check that the residuals are white noise (Ljung & Box 1978). In the third phase, the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) are employed to determine the best model type. Each candidate model is assessed using the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and plots of the residuals, and the model with the lowest AIC and BIC is selected (Akaike 1998; Box et al. 2011). Finally, once the optimal model is determined, the forecast is made, and the results are compared with the validation vectors using R2, NSE, and RMSE to evaluate performance (Dastorani et al. 2016; Nury et al. 2017).

Data normalization

The precipitation data in this study show large, irregular variation, so an accurate precipitation forecast by both SARIMA and LSTM requires the data to be made dimensionless. The Min–Max normalization method is therefore used in this study. Min–Max normalization is a linear transformation, also known as deviation normalization, which maps the result into the range [0, 1] (Chen et al. 2018; Xia et al. 2018; Ju et al. 2019). The conversion formula is as follows:
$$Y' = \frac{Y - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}} \tag{13}$$
where Y and $Y'$ are the original data and the normalized data, and Min and Max are the minimum and the maximum values in the original sample data.
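A minimal sketch of Min–Max normalization and its inverse is given below. The function names are illustrative assumptions; the inverse step is included because forecasts made on the normalized scale usually have to be converted back to millimetres before they are compared with observations.

```python
import numpy as np

def min_max_scale(values):
    """Map values linearly onto [0, 1]; also return the Min and Max used."""
    v = np.asarray(values, dtype=float)
    v_min, v_max = v.min(), v.max()
    if v_max == v_min:
        raise ValueError("Min-Max scaling is undefined for a constant series")
    return (v - v_min) / (v_max - v_min), v_min, v_max

def min_max_inverse(scaled, v_min, v_max):
    """Undo the scaling: recover values in the original units."""
    return np.asarray(scaled, dtype=float) * (v_max - v_min) + v_min

scaled, lo, hi = min_max_scale([0.0, 50.0, 200.0])
print(scaled)                            # [0.   0.25 1.  ]
print(min_max_inverse(scaled, lo, hi))   # [  0.  50. 200.]
```

A common design choice, assumed here rather than stated in the text, is to compute Min and Max on the training period only and reuse them for the test period, so that no information from the test data leaks into the scaling.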

Performance metrics

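Forecasting results are evaluated by comparing the actual values with the forecasted values. The accuracy metrics are the RMSE, R2, and NSE, which are defined as follows (Touzani et al. 2018; Kardani et al. 2020):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2} \tag{14}$$
$$R^2 = \frac{\left[\sum_{t=1}^{n}\left(y_t - \bar{y}\right)\left(\hat{y}_t - \bar{\hat{y}}\right)\right]^2}{\sum_{t=1}^{n}\left(y_t - \bar{y}\right)^2\,\sum_{t=1}^{n}\left(\hat{y}_t - \bar{\hat{y}}\right)^2} \tag{15}$$
$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}{\sum_{t=1}^{n}\left(y_t - \bar{y}\right)^2} \tag{16}$$
where $\hat{y}_t$ and $y_t$ are the estimated value and the observed value at time t, n is the number of observed values in the testing data, $\bar{y}$ is the mean of the observed values, and $\bar{\hat{y}}$ is the mean of the estimated values. R2 is written here as the squared correlation between observed and estimated values, which is consistent with R2 and NSE taking different values in Table 5. R2 and NSE should approach 1 to indicate strong model performance, and the RMSE should be as close to zero as possible. These three accuracy measurements are applied to both the SARIMA and LSTM models.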
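The three metrics can be computed with a few lines of NumPy. This sketch assumes that R2 is the squared Pearson correlation between observed and estimated values (an assumption, chosen because R2 and NSE differ in Table 5); the function names are illustrative.

```python
import numpy as np

def rmse(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def r2(obs, pred):
    """Squared Pearson correlation between observed and estimated values."""
    return float(np.corrcoef(obs, pred)[0, 1] ** 2)

def nse(obs, pred):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2))

obs = np.array([1.0, 2.0, 3.0, 4.0])
biased = 2.0 * obs  # perfectly correlated with obs, but systematically too large
print(r2(obs, biased), nse(obs, biased), rmse(obs, biased))
```

The example illustrates why the two efficiency measures are reported separately: a forecast that is perfectly correlated with the observations but biased still has R2 equal to 1, while its NSE is strongly negative.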

The methodology of this study is described by the flowchart shown in Figure 4.

Figure 4

Flowchart of the research steps used in this study.

Data collection

This study used the precipitation observed at the Hue, Aluoi, and Namdong meteorological stations for the period 1980–2018, together with the annual statistical report by the Thua Thien Hue Centre for Hydro-Meteorological Forecasting. To double-check the reliability of the data, the annual statistical report by the Thua Thien Hue Province was also used. Table 2 summarizes the characteristics of the data employed in this study.

Table 2

Meteorological location, record period, and years considered

| Station | Location | Earliest record year | Latest record year | Number of months |
|---|---|---|---|---|
| Hue | Hue city | 1980 | 2018 | 468 |
| Aluoi | Aluoi district | 1980 | 2018 | 468 |
| Namdong | Namdong district | 1980 | 2018 | 468 |

The operation and maintenance of the meteorological sites may affect data collection, possibly producing extreme values outside the expected range of the prediction model that are inconsistent with the other data. Values of this kind are called outliers. The observed data must therefore be free from outliers to ensure that the best observation data are used in the models. To remove outliers from the datasets, the data were checked for reliability against the annual statistical report by the Thua Thien Hue Centre for Hydro-Meteorological Forecasting using the standard deviation method.
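A standard deviation check of this kind can be sketched as follows. The text does not state the threshold or how the check was grouped, so the cutoff of three standard deviations and the function name `flag_outliers` are assumptions made for illustration.

```python
import numpy as np

def flag_outliers(values, k=3.0):
    """Return a boolean mask marking values more than k sample standard
    deviations away from the mean (k = 3 is an assumed, conventional cutoff)."""
    v = np.asarray(values, dtype=float)
    mean, sd = v.mean(), v.std(ddof=1)
    return np.abs(v - mean) > k * sd

data = [10.0] * 20 + [1000.0]
mask = flag_outliers(data, k=3.0)
print(int(mask.sum()), bool(mask[-1]))
```

Because monthly precipitation is strongly seasonal and right-skewed, such a check would more plausibly be applied per calendar month than to the whole series, so that genuine rainy-season peaks are not flagged; flagged values would then be verified against the statistical report rather than deleted automatically.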

Statistical characteristic results from the monthly precipitation for each meteorological station are also presented in Table 3. The range of the following characteristics was computed from the observed monthly precipitation time-series: the mean, minimum and maximum values, standard deviation (St. Dev.), coefficient of variation (Cv), and skewness coefficient (Cs) (Eris et al. 2019).
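The listed characteristics can be computed as in the sketch below. The exact estimators used in the study are not stated, so the choices here (sample standard deviation with n − 1, Cv as a percentage of the mean, and the moment-based skewness coefficient) are assumptions, as is the function name `describe`.

```python
import numpy as np

def describe(values):
    """Mean, Min, Max, St. Dev., Cv (%), and skewness coefficient Cs."""
    v = np.asarray(values, dtype=float)
    mean = v.mean()
    sd = v.std(ddof=1)                 # sample standard deviation
    m2 = np.mean((v - mean) ** 2)      # second central moment
    m3 = np.mean((v - mean) ** 3)      # third central moment
    return {
        "mean": float(mean),
        "min": float(v.min()),
        "max": float(v.max()),
        "st_dev": float(sd),
        "cv_percent": float(100.0 * sd / mean),
        "cs": float(m3 / m2 ** 1.5),
    }

print(describe([1.0, 2.0, 3.0, 4.0, 5.0]))
```

A positive Cs indicates a right-skewed distribution, which is typical of monthly precipitation with a few very wet months.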

Table 3

Statistical characteristics of monthly precipitation data

| Station | Mean (mm), min | Mean (mm), max | Min (mm), min | Min (mm), max | Max (mm), min | Max (mm), max | St. Dev., min | St. Dev., max | Cv (%), min | Cv (%), max | Skewness (SkD), min | Skewness (SkD), max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hue | 50.6 | 788.3 | 3.2 | 35 | 353.7 | 2,452.3 | 46.7 | 451 | 48 | 106 | 0.39 | 2.26 |
| Aluoi | 68.2 | 912.4 | 4.5 | 132.7 | 499.0 | 2,590.0 | 68.2 | 912.4 | 31 | 89 | 0.37 | 1.93 |
| Namdong | 66.2 | 974.0 | 1.6 | 123.5 | 412.4 | 2,672.3 | 50 | 681.1 | 43 | 76 | 0.44 | 2.06 |

The input precipitation variables are separated into two parts. The first part is for the period January 1980–December 2008 and is used for the training phase, which contains about 74% of the entire data. The second part is for January 2009–December 2018 and is used for the test phase, which contains the remaining 26% of precipitation data. The monthly values of the dataset have been averaged for the entire January 1980–December 2018 period. Furthermore, considering the precipitation amplitude in these areas, this study has normalized this dataset by the Min–Max scaler method. The dataset after normalization is used in this study for precipitation prediction using the LSTM and SARIMA models. The lines in Figure 5 indicate the conversion of the precipitation data series of the three meteorological stations (Figure 5(a)) to the Min–Max scaler (Figure 5(b)).
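The split sizes follow directly from the month counts, as the short sketch below shows. The helper names are illustrative assumptions; the dates are those given in the text.

```python
def month_index(year, month, start_year=1980):
    """Zero-based position of a month in a series that starts in January of start_year."""
    return (year - start_year) * 12 + (month - 1)

def split_train_test(series, split_year=2009, split_month=1, start_year=1980):
    """Everything before the split month is training data; the rest is test data."""
    cut = month_index(split_year, split_month, start_year)
    return series[:cut], series[cut:]

months = list(range(468))               # January 1980 to December 2018
train, test = split_train_test(months)
print(len(train), len(test))            # 348 120
print(round(100 * len(train) / len(months)))  # 74
```

The split is chronological rather than random, which keeps the test period strictly after the training period, as is usual for time-series forecasting.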

Figure 5

(a) Calibration vector of the monthly precipitation time-series and (b) calibration vector of the transformed monthly precipitation time-series using the Min–Max scaler.

Application

Precipitation prediction by the SARIMA model

To predict the precipitation at the three stations, the following four steps are used to determine the optimal predictive models. First, the precipitation data are normalized using a Min–Max scaler. The calibration data in Figure 5(b) show that precipitation peaks about once a year. The highest peak occurs in the Namdong area, the second highest in the Aluoi area, and the lowest in the Hue area. This pattern can be taken as an indicator of seasonal behavior. However, seasonality is difficult to confirm visually from that figure, so a time-series decomposition is carried out to separate each series into its trend, seasonal, and random components. The four line charts in Figure 6 show the additive decomposition of the series. The series are highly seasonal, mainly following the typical unilateral precipitation regime at the stations, while no trend is maintained throughout the series.

Second, the ADF test gives p ≈ 0.00 (<0.05) for the models. These p-values reject the null hypothesis of a unit root, so the data, with differencing orders d = 1 and D = 1, are treated as stationary (Diebold & Senhadji 1996; MacKinnon 1996; Granger & Swanson 1997).

Third, the ACF and PACF correlation plots for the three stations (Figure 7) oscillate sinusoidally. These fluctuations indicate that the SARIMA model is suitable for representing the precipitation series of the Hue, Aluoi, and Namdong stations. Each pair of plots also shares two features: non-randomness of the time-series and a high lag-1 value (which will probably need a higher order of differencing d and D). The standardized residual plots in Figures 8(a), 9(a), and 10(a) show that the residuals behave as white noise over time. In the histograms of Figures 8(b), 9(b), and 10(b), the orange Kernel Density Estimation (KDE) curve follows the green N(0,1) curve rather closely. Figures 8(b) and 10(b) show histograms resembling a right-skewed distribution, whereas the histogram in Figure 9(b) resembles a left-skewed distribution. Overall, these are good signs for the SARIMA models of the Hue, Namdong, and Aluoi stations, in the sense that the residuals are approximately normally distributed. Likewise, in the Q–Q plots of Figures 8(c), 9(c), and 10(c), the ordered residuals (blue dots) lie along the red line, consistent with a mean of 0 and a standard deviation of 1; that is, the residuals follow a linear trend in these plots. The correlograms in Figures 8(d), 9(d), and 10(d) imply that the residuals have a low correlation with their own lagged values.

Finally, the p, d, q and P, D, Q parameters are adjusted until the optimal model, the one with the minimum AIC, is obtained. Table 4 indicates that the best-fit model for each of the Hue, Namdong, and Aluoi stations is SARIMA (0, 1, 1) × (1, 1, 1, 12), selected on the basis of the minimum AIC values listed there (−510 for Hue, and −321 and −322 for the other two stations).
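The sinusoidal behavior of the correlation plots for a seasonal series can be reproduced with a sample autocorrelation function. The sketch below uses a synthetic 12-month sine wave, not the station data, and the function name `acf` is an assumption; it only illustrates why a seasonal period of s = 12 shows up as alternating peaks in a correlogram.

```python
import numpy as np

def acf(y, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag (biased estimator)."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    denom = np.dot(y, y)
    return np.array([
        1.0 if k == 0 else np.dot(y[k:], y[:-k]) / denom
        for k in range(max_lag + 1)
    ])

# Synthetic monthly series with one peak per year (period of 12 months).
t = np.arange(240)
y = np.sin(2.0 * np.pi * t / 12.0)

r = acf(y, max_lag=24)
print(np.round(r[[6, 12, 18, 24]], 3))  # negative at half-year lags, positive at yearly lags
```

Strong positive values at lags 12 and 24 and strong negative values at lags 6 and 18 are the signature of annual seasonality that motivates the seasonal terms of the model.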

Table 4

Akaike information criterion (AIC) from some of the tested forecast models at the Hue, Namdong, and Aluoi stations

| Hue: SARIMA model | AIC | Namdong: SARIMA model | AIC | Aluoi: SARIMA model | AIC |
|---|---|---|---|---|---|
| (0, 1, 1) × (1, 1, 1, 12) | −510 | (0, 1, 1) × (1, 1, 1, 12) | −321 | (0, 1, 1) × (1, 1, 1, 12) | −322 |
| (0, 1, 1) × (0, 1, 1, 12) | −509 | (0, 1, 1) × (0, 1, 1, 12) | −320 | (0, 1, 1) × (0, 1, 1, 12) | −320 |
| (1, 1, 1) × (0, 1, 1, 12) | −509 | (1, 1, 1) × (1, 1, 1, 12) | −319 | (0, 1, 1) × (0, 1, 1, 12) | −320 |
| (1, 1, 1) × (1, 1, 1, 12) | −509 | (1, 1, 1) × (0, 1, 1, 12) | −317 | (1, 1, 1) × (1, 1, 1, 12) | −319 |
| (1, 1, 1) × (1, 1, 0, 12) | −458 | (0, 1, 1) × (1, 1, 0, 12) | −243 | (1, 1, 1) × (0, 1, 1, 12) | −318 |

Figure 6

Calibrating vector decomposition of the transformed precipitation time-series using the Min–Max scaler (time-series graphs with observed, seasonal, trend, and residual components).

Figure 7

ACF and PACF correlation plots for the SARIMA model using the Min–Max scaler for (a) the Hue station, (b) the Aluoi station, and (c) the Namdong station.

Figure 8

Plots of residuals of the Hue station: (a) residuals over time; (b) frequency distribution histogram; (c) QQ plot; and (d) autocorrelation.

Figure 9

Plots of residuals of the Aluoi station: (a) residuals over time; (b) frequency distribution histogram; (c) QQ plot; and (d) autocorrelation.

Figure 10

Plots of residuals of the Namdong station: (a) residuals over time; (b) frequency distribution histogram; (c) QQ plot; and (d) autocorrelation.

The experimental results in Figure 11 reveal that the values predicted by the SARIMA models are very close to the actual precipitation at the three stations. The plots also indicate that the predicted and actual precipitation lines overlap by about 85% at the three stations. According to the R2, NSE, and RMSE values in Table 5, the SARIMA forecasts reach R2 and NSE values of about 0.93 with RMSE = 30.10 at the Hue station, about 0.88 to 0.89 with RMSE = 33.32 at the Aluoi station, and 0.87 to 0.88 with RMSE = 36.73 at the Namdong station. These values are quite acceptable for the forecast of natural phenomena. In addition, the precipitation forecast is most accurate at the Hue station, followed by the Aluoi station, and least accurate at the Namdong station. These results are compared below with the precipitation forecasts simulated by the LSTM model.

Table 5

Accuracy parameters for precipitation prediction

| Station | RMSE (SARIMA) | RMSE (LSTM) | R2 (SARIMA) | R2 (LSTM) | NSE (SARIMA) | NSE (LSTM) |
|---|---|---|---|---|---|---|
| Hue | 30.09 | 8.15 | 0.93 | 0.94 | 0.93 | 0.94 |
| Aluoi | 33.32 | 12.72 | 0.88 | 0.89 | 0.89 | 0.90 |
| Namdong | 36.73 | 12.81 | 0.87 | 0.89 | 0.88 | 0.90 |
| Average | 33.38 | 12.21 | 0.90 | 0.91 | 0.90 | 0.91 |

Figure 11

Actual and predicted precipitation based on the SARIMA models in (a) summary and (b) Hue, (c) Aluoi, and (d) Namdong stations.

Precipitation prediction by LSTM

For the LSTM model, the analysis follows five steps that produced acceptable results. First, the input data are standardized by Min–Max normalization. Second, the network is configured with 12, 200, and 1 nodes in the input, hidden, and output layers, respectively. Third, the ReLU activation function is used, and the network is trained with the Adam algorithm. Fourth, the training settings (number of epochs, verbosity, and batch size) are tested with early stopping to approach the optimum of the objective, monitoring the loss function (MSE) against the metric (MAE). Finally, Figure 12 shows that the loss decreases as the number of epochs increases. In addition, the values in Table 6 show that the loss and metric curves in Figure 12(b) are lower and fluctuate less during training than those in Figure 12(a); that is, the curves in Figure 12(b) lie closer to the horizontal axis. Early stopping halts training before the limits of 2,000 and 10,000 epochs are reached, because the model would begin to overfit if training continued. A comparison of the error metrics for the two epoch limits shows that the run with 10,000 epochs achieves better accuracy, better performance, and greater stability than the run with 2,000 epochs.
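
The first two steps can be illustrated with a short sketch of the data preparation. It assumes that the 12 input nodes correspond to the 12 preceding monthly values used to predict the next month; the paper does not spell this out, so the window layout, the function names, and the sample series are all hypothetical.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return the bounds needed to undo the scaling."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / span for v in values], lo, hi


def inverse_min_max(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


def make_windows(series, n_inputs=12):
    """Build (inputs, target) pairs: n_inputs consecutive values predict the next one."""
    samples, targets = [], []
    for start in range(len(series) - n_inputs):
        samples.append(series[start:start + n_inputs])
        targets.append(series[start + n_inputs])
    return samples, targets


# Hypothetical two-year monthly series in mm (10, 20, ..., 240), not station data.
monthly_mm = [float(10 * m) for m in range(1, 25)]
scaled, lo, hi = min_max_scale(monthly_mm)
X, y = make_windows(scaled, n_inputs=12)
```

In practice the scaling bounds would normally be taken from the training period only and reused for the test period, so that no information from the test data leaks into training; predictions are mapped back to millimetres with `inverse_min_max` before RMSE is computed.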

Table 6

Hyperparameter values along with the number of epochs

Station   MSE (2,000th epoch)  MAE (2,000th epoch)  MSE (10,000th epoch)  MAE (10,000th epoch)
Hue       1.5508e-05           0.0019               1.4790e-05            0.0018
Aluoi     2.2966e-05           0.0028               2.1158e-05            0.0023
Namdong   3.1147e-05           0.0030               2.6396e-05            0.0027
Figure 12

Loss and MAE along the number of epochs. (a) Training stops before the 2,000th epoch and when early stopping is called and (b) training stops before the 10,000th epoch and when early stopping is called.
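
The early stopping rule referred to in the caption can be sketched as follows. This is a generic patience-based rule, not the authors' implementation: the monitored quantity, the `patience` and `min_delta` parameters, and the loss values are assumptions, since the paper does not report them.

```python
def early_stopping_epoch(losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the monitored loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. If that never happens,
    the last available epoch is returned as the stop epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(losses) - 1, best_epoch


# Hypothetical loss curve: improves, stalls for three epochs, then improves again.
monitored = [0.9, 0.5, 0.3, 0.31, 0.32, 0.33, 0.2]
stop_short, best_short = early_stopping_epoch(monitored, patience=3)
stop_long, best_long = early_stopping_epoch(monitored, patience=10)
```

With a short patience the run halts during the stall and misses the later improvement, while a longer patience lets training continue. This is one way a larger epoch budget can end with lower MSE and MAE, as in Table 6.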


The experimental results in Figure 13 show that the predicted and observed precipitation curves overlap by about 95% at the three stations. The R², NSE, and RMSE values in Table 5 are used to identify the best predictive model. The forecast is most accurate for the Hue station (R² = 0.94, NSE = 0.94, and RMSE = 8.15), second most accurate for the Aluoi station (R² = 0.89, NSE = 0.90, and RMSE = 12.72), and least accurate for the Namdong station (R² = 0.89, NSE = 0.89, and RMSE = 12.81).

Figure 13

Actual and predicted precipitation based on the LSTM model in (a) summary and (b) Hue, (c) Aluoi, and (d) Namdong stations.


In addition, the R² and NSE values of the LSTM models range from 0.89 to 0.94. In other words, the model response agrees with the observations to about 90%. Therefore, the LSTM forecasting model is acceptable for the three stations. These results are compared with the SARIMA forecasts in the following section.

A comparison of precipitation prediction between SARIMA and LSTM

The optimal precipitation forecasts for the Hue, Aluoi, and Namdong meteorological stations obtained with the SARIMA and LSTM models are presented in Sections 3.2.1 and 3.2.2 of this paper. In this section, the LSTM model is compared with the SARIMA model to identify the better forecasting model. First, both models rank the stations in the same order: the forecast is best for Hue, second-best for Aluoi, and lowest for Namdong. Second, the two models are compared to determine which gives the more accurate and acceptable precipitation forecast at each meteorological station. The line charts in Figure 14 show the training, testing, and forecasting curves; Figure 14(a)–(c) shows the precipitation predictions of the LSTM and SARIMA models at Hue, Aluoi, and Namdong, respectively. The average values in Table 5 also indicate that the LSTM model predicts precipitation better than the SARIMA model at these three meteorological stations.
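
The station-by-station comparison can be made explicit with a few lines of code. The sketch below uses the RMSE values from Table 5, reading the first numeric column as SARIMA and the second as LSTM (an assumption that matches the RMSE values quoted in the text); the dictionary layout and function names are illustrative only.

```python
# RMSE values from Table 5, assuming column order SARIMA then LSTM.
table5_rmse = {
    "Hue": {"SARIMA": 30.09, "LSTM": 8.15},
    "Aluoi": {"SARIMA": 33.32, "LSTM": 12.72},
    "Namdong": {"SARIMA": 36.73, "LSTM": 12.81},
}


def better_model(rmse_by_model):
    """Return the name of the model with the lower RMSE."""
    return min(rmse_by_model, key=rmse_by_model.get)


def relative_rmse_reduction(rmse_by_model):
    """Fraction by which the LSTM RMSE is lower than the SARIMA RMSE."""
    sarima = rmse_by_model["SARIMA"]
    lstm = rmse_by_model["LSTM"]
    return (sarima - lstm) / sarima


winners = {station: better_model(v) for station, v in table5_rmse.items()}
reductions = {station: relative_rmse_reduction(v) for station, v in table5_rmse.items()}
```

Under this reading of the table, the LSTM model has the lower RMSE at every station, and `reductions` expresses the gap as a fraction of the SARIMA error rather than as an absolute difference in RMSE.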

Figure 14

Best forecast of precipitation when using the SARIMA and LSTM models in (a) Hue, (b) Aluoi, and (c) Namdong stations.


Rain brings many benefits to natural landscapes and human life, but it can also bring disasters such as flooding, landslides, and inundation, and its absence causes drought. Hence, high-accuracy forecasts are needed to help mitigate the impact of these disasters. The Thua Thien Hue Province has the highest precipitation in Vietnam, about 7,000 mm per year. With this much precipitation, the province needs dedicated precipitation forecasting models. Hence, this study forecasts precipitation with the SARIMA and LSTM models, using data collected from the Hue, Aluoi, and Namdong meteorological stations. The models may prove useful for the province's precipitation prediction. The plots in Figure 15 provide the regression graphs for the precipitation forecasts of each model. Both models have R² values of 0.87 or higher, which shows a high correlation between the actual and the predicted data. This study uses two algorithmically different methods, but both produce quite similar results. Several recent studies have factored in rainfall and some of its effects to forecast precipitation. Barman et al. (2021) used the SARIMA model to predict precipitation in the state of Assam in India and reported an RMSE of 84.40, which is higher than the SARIMA RMSE in this study. Poornima & Pushpalatha (2019) deployed an advanced LSTM-based RNN with weighted linear units to estimate precipitation in the Hyderabad region of India and reported an RMSE of 0.35, which is lower than that of this study. Hence, the results of this study are acceptable when compared with those of the two studies.

Figure 15

Best performance of the determination coefficient of the SARIMA and LSTM models for the Thua Thien Hue Province in (a)-(a1) the Hue station, (b)-(b1) the Aluoi station, and (c)-(c1) the Namdong station.


However, many published works have shown that the mechanisms that generate rainfall are based on cloud phenomena. Sánchez-Monedero et al. (2014) classified them into three main groups: condensation nuclei, water vapor, and vertical motion. Accordingly, their input variables were selected at different pressure levels according to the relevant weather factors: total precipitable water, degree of rainfall, moisture, wind speed, wind direction, Convective Available Potential Energy (CAPE), and convective inhibition. Hashim et al. (2016) also confirmed the significant effect of cloud information when studying rainfall in the city of Patna, India, using a set of variables that included cloud coverage, vapor pressure, maximum and minimum temperatures, and frequency of wet days. Therefore, because rainfall is a complex nonlinear atmospheric process that depends largely on local-scale conditions (Applequist et al. 2002), it remains difficult to define a set of input variables for it. Meteorology is the discipline best placed to support the training of Artificial Intelligence (AI) models, because it accounts for the physical mechanism of the 24-h rain process.

Although this study has the limitations mentioned above, the results show that the best model for monthly rainfall forecasting is the LSTM model, followed by the SARIMA model. In addition, the results support the efforts of the provincial authorities and local people to develop a plan for the mitigation of natural disasters.

This paper is the first study to present precipitation predicted by neural networks for the Thua Thien Hue Province. The LSTM and SARIMA models are used to forecast precipitation, and the two models are compared to assess their effectiveness. The study shows that the LSTM model is more accurate than the SARIMA model, and that forecast accuracy is highest for the Hue station (R² = 0.94, NSE = 0.94, and RMSE = 8.15), second-highest for the Aluoi station (R² = 0.89, NSE = 0.90, and RMSE = 12.72), and lowest for the Namdong station (R² = 0.89, NSE = 0.89, and RMSE = 12.81). These statistical indicators also suggest that the monthly rainfall fluctuation at the Hue station is more stable than that at the Namdong and Aluoi stations. In addition, the Min–Max normalization method was applied to the data to improve the accuracy of the models' precipitation forecasts. The approach would be more useful for daily forecasting if it were based on daily rainfall measurements. Nevertheless, the results will prove useful for the Thua Thien Hue provincial government and the local inhabitants if they apply these models, with new data collected in the future, to mitigate natural disasters such as floods, droughts, and landslides. One possible direction for future work is to combine the two models into a hybrid model that may give more accurate rain forecasts.

Conceptualization, Material, and Methods: Nguyen Hong Giang; Writing, Original Draft Preparation: Tran Dinh Hieu, Nguyen Tien Thinh; Writing, Review, and Editing: Yu Ren Wang, Le Anh Phuong; Funding Acquisition: Tran Dinh Hieu, Nguyen Tien Thinh, and Nguyen Hong Giang. All authors have read and agreed to the published version of the manuscript.

This research was funded by Thu Dau Mot University, Vietnam. The APC was funded by Thu Dau Mot University.

We thank the school of Civil Engineering of the National Kaohsiung University of Science and Technology, Taiwan. We also thank Dr Hector M. Tibo and Dr June Raymond L. Mariano, the current PhD students of the National Kaohsiung University of Science and Technology, Taiwan.

The authors declare no conflict of interest.

Data cannot be made publicly available; readers should contact the corresponding author for details.

Akaike H. 1998 Information Theory and an Extension of the Maximum Likelihood Principle. Springer, New York, NY, pp. 199–213.
Applequist S., Gahrs G. E., Pfeffer R. L. & Niu X. F. 2002 Comparison of methodologies for probabilistic quantitative precipitation forecasting. Weather and Forecasting 17 (4), 783–799.
Avishek P. & Prakash P. 2017 Practical Time-Series Analysis: Master Time Series Data Processing, Visualization, and Modeling Using Python. Packt, Birmingham, UK.
Ayzel G. 2019 Does Deep Learning Advance Hourly Runoff Predictions. In: Proceedings of the V International Conference Information Technologies and High-Performance Computing (ITHPC-2019), Khabarovsk, Russia, pp. 16–19.
Balderas D., Ponce P. & Molina A. 2019 Convolutional long short term memory deep neural networks for image sequence prediction. Expert Systems With Applications 122, 152–162.
Barman U., Hussain A. E., Dahal M. J., Barman P. & Hazarika M. 2021 Time Series Analysis of Assam Rainfall Using SARIMA and ARIMA. In: Smart Computing Techniques and Applications (S. C. Satapathy, V. Bhateja, M. N. Favorskaya & T. Adilakshmi, eds.). Springer, Singapore, pp. 357–364.
Bermúdez J. D., Achanccaray P., Sanches I. D., Cue L., Happ P. & Feitosa R. Q. 2017 Evaluation of recurrent neural networks for crop recognition from multitemporal remote sensing images. In: Anais do XXVII Congresso Brasileiro de Cartografia, pp. 800–804.
Box G. E., Jenkins G. M. & Reinsel G. C. 2011 Time Series Analysis: Forecasting and Control. John Wiley & Sons, Hoboken, NJ.
Burlando P., Rosso R., Cadavid L. G. & Salas J. D. 1993 Forecasting of short-term precipitation using ARMA models. Journal of Hydrology 144, 193–211.
Busby J. 2018 Warming world: why climate change matters more than anything else. Foreign Affairs 97, 49.
Chakraborty B., Mukherjee P. S. & Bhattacharya U. 2016 Bangla online handwriting recognition using recurrent neural network architecture. In: Proceedings of The Tenth Indian Conference On Computer Vision, Graphics And Image Processing, pp. 1–8.
Chen P., Niu A., Liu D., Jiang W. & Ma B. 2018 Time series forecasting of temperatures using SARIMA: an example from Nanjing. IOP Conference Series: Materials Science and Engineering 394 (5), 1–7.
Dabral P. P. & Murry M. Z. 2017 Modelling and forecasting of precipitation time series using SARIMA. Environmental Processes 4 (2), 399–419.
Defrance D., Sultan B., Castets M., Famien A. M. & Baron C. 2020 Impact of climate change in West Africa on cereal production per capita in 2050. Sustainability 12 (18), 7585.
Deka P. C. 2014 Support vector machine applications in the field of hydrology: a review. Applied Soft Computing 19, 372–386.
Diebold F. X. & Senhadji A. S. 1996 The uncertain unit root in real GNP: comment. The American Economic Review 86, 1291–1298.
Do B. 2002 Floods and Storms in Central Viet Nam in 19th and 20th Centuries. Da Nang Publishing House, Vietnamese.
Elliott J., Deryng D., Müller C., Frieler K., Konzmann M., Gerten D., Glotter M., Flörke M. & Wada Y. 2014 Constraints and potentials of future irrigation water availability on agricultural production under climate change. Proceedings of the National Academy of Sciences 111, 3239–3244.
Eris E., Aksoy H., Onoz B., Cetin M., Yuce M. I., Selek B. & … Karakus E. U. 2019 Frequency analysis of low flows in intermittent and non-intermittent rivers from hydrological basins in Turkey. Water Supply 19 (1), 30–39.
Granger C. W. & Swanson N. R. 1997 An introduction to stochastic unit-root processes. Journal of Econometrics 80, 35–62.
Guo Z. & Ogata Y. 1997 Statistical relations between the parameters of aftershocks in time, space, and magnitude. Journal of Geophysical Research: Solid Earth 102, 2857–2873.
Hashim R., Roy C., Motamedi S., Shamshirband S., Petković D., Gocic M. & Lee S. C. 2016 Selection of meteorological parameters affecting rainfall estimation using neuro-fuzzy computing methodology. Atmospheric Research 171, 21–30.
Hochreiter S. & Schmidhuber J. 1997 Long short-term memory. Neural Computation 9, 1735–1780.
Huynh H. T. L., Nguyen H. X., Ngo T. T. & Van H. T. 2021 Pre-disaster assessment of flood risk for mid central Vietnam. International Journal of Disaster Resilience in the Built Environment 12 (3), 322–335.
Javadinejad S., Dara R. & Jafary F. 2020 Climate change scenarios and effects on snow-melt runoff. Civil Engineering Journal 6 (9), 1715–1725.
Kalchbrenner N., Danihelka I. & Graves A. 2015 Grid long short-term memory. arXiv preprint arXiv:1507.01526.
Kardakos E. G., Alexiadis M. C., Vagropoulos S. I., Simoglou C. K., Biskas P. N. & Bakirtzis A. G. 2013 Application of time series and artificial neural network models in short-term forecasting of PV power generation. In: 48th International Universities’ Power Engineering Conference (UPEC), pp. 1–6.
Kardani N., Zhou A., Nazem M. & Shen S.-L. 2020 Estimation of bearing capacity of piles in cohesionless soil using optimised machine learning approaches. Geotechnical and Geological Engineering 38, 2271–2291.
Kingma D. P. & Ba J. 2014 Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Lipton Z. C., Berkowitz J. & Elkan C. 2015 A critical review of recurrent neural networks for sequence learning. arXiv preprint arXiv: 00019.
Liu Q., Islam S., Rodriguez-Iturbe I. & Le Y. 1998 Phase-space analysis of daily streamflow: characterization and prediction. Advances in Water Resources 21, 463–475.
Liu J., Chang H., Hsu T. Y. & Ruan X. 2000 Prediction of the flow stress of high-speed steel during hot deformation using a BP artificial neural network. Journal of Materials Processing Technology 103, 200–205.
Ljung G. M. & Box G. E. 1978 On a measure of lack of fit in time series models. Biometrika 65 (2), 297–303.
Mackenzie J., Roddick J. F. & Zito R. 2018 An evaluation of HTM and LSTM for short-term arterial traffic flow prediction. IEEE Transactions on Intelligent Transportation Systems 20, 1847–1857.
MacKinnon J. G. 1996 Numerical distribution functions for unit root and cointegration tests. Journal of Applied Econometrics 11, 601–618.
Mall R. K., Gupta A., Singh R., Singh R. S. & Rathore L. S. 2006 Water resources and climate change: an Indian perspective. Current Science 90 (12), 1610–1626.
Mishra A. K., Desai V. R. & Singh V. P. 2007 Drought forecasting using a hybrid stochastic and neural network model. Journal of Hydrologic Engineering 12, 626–638.
Mohan A. T. & Gaitonde D. V. 2018 A deep learning based approach to reduced order modeling for turbulent flow control using LSTM neural networks. arXiv preprint arXiv: 09269.
Nguyen C. D., Ubukata F., Nguyen Q. T. & Vo H. H. 2021 Long-term improvement in precautions for flood risk mitigation: a case study in the low-lying area of Central Vietnam. International Journal of Disaster Risk Science 12 (2), 250–266.
Ni L., Wang D., Singh V. P., Wu J., Wang Y., Tao Y. & Zhang J. 2020 Streamflow and precipitation forecasting by two long short-term memory-based models. Journal of Hydrology 583, 124296.
Obianyo J. I. 2019 Effect of salinity on evaporation and the water cycle. Emerging Science Journal 3 (4), 255–262.
Oo H. T., Zin W. W. & Kyi C. T. 2020 Analysis of streamflow response to changing climate conditions using SWAT model. Civil Engineering Journal 6 (2), 194–209.
Park J.-M. & Kim J.-H. 2017 Online recurrent extreme learning machine and its application to time-series prediction. In: International Joint Conference on Neural Networks (IJCNN), pp. 1983–1990.
Salas J. D. & Obeysekera J. T. B. 1982 ARMA model identification of hydrologic time series. Water Resources Research 18, 1011–1021.
Salehinejad H., Sankar S., Barfett J., Colak E. & Valaee S. 2017 Recent advances in recurrent neural networks. arXiv preprint arXiv: 01078.
Sampson W., Suleman N. & Gifty A. 2013 Proposed seasonal autoregressive integrated moving average model for forecasting precipitation pattern in the Navrongo Municipality of Ghana. Journal of Environment and Earth Science 3, 80–85.
Sánchez-Monedero J., Salcedo-Sanz S., Gutiérrez P. A., Casanova-Mateo C. & Hervás-Martínez C. 2014 Simultaneous modelling of rainfall occurrence and amount using a hierarchical nominal–ordinal support vector classifier. Engineering Applications of Artificial Intelligence 34, 199–207.
Siami-Namini S. & Namin A. S. 2018 Forecasting economics and financial time series: ARIMA vs. LSTM. arXiv preprint arXiv:1803.06386.
Siami-Namini S., Tavakoli N. & Namin A. S. 2018 A comparison of ARIMA and LSTM in forecasting time series. In: 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1394–1401.
Siami-Namini S., Tavakoli N. & Namin A. S. 2019 The performance of LSTM and BiLSTM in forecasting time series. In: International Conference on Big Data (Big Data), pp. 3285–3292.
Smith B. L. & Demetsky M. J. 1994 Short-term traffic flow prediction models-a comparison of neural network and nonparametric regression approaches. In: Proceedings of IEEE International Conference on Systems, Man and Cybernetics, pp. 1706–1709.
Stuart M. E., Gooddy D. C., Bloomfield J. P. & Williams A. T. 2011 A review of the impact of climate change on future nitrate concentrations in groundwater of the UK. Science of the Total Environment 409, 2859–2873.
Touzani S., Granderson J. & Fernandes S. 2018 Gradient boosting machine for modeling the energy consumption of commercial buildings. Energy and Buildings 158, 1533–1543.
Vagropoulos S. I., Chouliaras G. I., Kardakos E. G., Simoglou C. K. & Bakirtzis A. G. 2016 Comparison of SARIMAX, SARIMA, modified SARIMA and ANN-based models for short-term PV generation forecasting. In: IEEE International Energy Conference (ENERGYCON), pp. 1–6.
Vazhayil A. & Soman K. P. 2018 Deep Proteomics: Protein family classification using Shallow and Deep Networks. arXiv preprint arXiv:1809.04461.
Werbos P. J. 1990 Backpropagation through time: what it does and how to do it. Proceedings of the IEEE 78, 1550–1560.
Wu X., Park Y., Li A., Huang X., Xiao F. & Asif U. 2020b Smart detection of fire source in tunnel based on the numerical database and artificial intelligence. Fire Technology 369, 113234.
Xia W., Zhu W., Liao B., Chen M., Cai L. & Huang L. 2018 Novel architecture for long short-term memory used in question classification. Neurocomputing 299, 20–31.
Yang Z.-L., Guo X.-Q., Chen Z.-M., Huang Y.-F. & Zhang Y.-J. 2018 RNN-stega: Linguistic steganography based on recurrent neural networks. IEEE Transactions on Information Forensics and Security 14, 1280–1295.
Young T., Hazarika D., Poria S. & Cambria E. 2018 Recent trends in deep learning based natural language processing. IEEE Computational Intelligence Magazine 13, 55–75.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).