## Abstract

Traditionally, tidal level is predicted by harmonic analysis (HA). In this paper, three hybrid models that couple varied pre-processing methods, which are empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), and empirical wavelet transform (EWT), with the nonlinear autoregressive networks with exogenous inputs (NARX) were applied to forecast tidal level. The models were, namely, EMD-NARX, EEMD-NARX, and EWT-NARX. The sub-series obtained by using EMD or EEMD or EWT were then used as the input vectors to the NARX with the original data as targets. Notably, the EWT-NARX model was employed to predict the tidal level for the first time. Simulations were based on the measurements from four tidal stations at the Pearl River Estuary, China. The results showed that the EWT-NARX, EEMD-NARX, and EMD-NARX outperformed the HA model. Specifically, EWT-NARX was optimal among the four. Moreover, from the Hilbert energy spectra we can see the EWT solved the mode-mixing problem that EMD and EEMD suffered from, thus enabling precise tidal level prediction. Simulations and experimental results confirmed that the EWT-NARX model can achieve prediction of the tidal level with high accuracy.

## HIGHLIGHTS

EWT-NARX model was applied to predict the tidal level for the first time.

EWT-NARX outperformed EEMD-NARX, EMD-NARX, and harmonic analysis.

EWT solves the mode-mixing problem that EMD and EEMD suffered so that the EWT-NARX model has better prediction performance.

### Graphical Abstract

## INTRODUCTION

Tides are an alternating variation of the sea level influenced mainly by the gravitation of the moon and the sun. Accurate tide forecasting is increasingly important for ship navigation safety, port operations, and coastal engineering. The proposed least-squares method for the determination of harmonic constants, harmonic analysis (HA), has been extensively used in tidal analysis. It assumes tides are composed of various tidal constituents, each of which is produced by a celestial body. Separating all tidal constituents using the HA method needs a time span of 18.6 years or more of tidal level records. Consequently, the HA is not enough for precise tidal prediction when the recorded tidal sequence does not last for such a long time. Especially for tides in estuaries, where the tidal waveform is more nonlinear and complex, the accuracy of the HA predictions cannot match researchers' expectations.

Recently, the artificial neural networks (ANN) have been widely used in hydrological series prediction because of their ability of nonlinearity and nonstationarity modeling (Jain *et al.* 1999; Toth *et al.* 2000; Cheng *et al.* 2005; Kisi & Cigizoglu 2007). Researchers have applied ANN to predict the tidal level in overcoming the limitations of the least-squares-based HA approach. A variety of studies have applied the feedforward neural network to forecast tidal level (Tsai & Lee 1999; Lee 2004; Makarynska & Makarynskyy 2008). Furthermore, a type of feedforward and feedback neural network, namely, nonlinear autoregressive networks with exogenous inputs (NARX), was employed to predict the tidal level and obtained better performances than the feedforward neural network (Rakshith *et al.* 2014; Salim *et al.* 2015).

Furthermore, the hybrid ANN models combined with data pre-processing techniques were proposed to improve single ANN method performance. Through the pre-processing progress, researchers can capture the features hidden in the original data. Then, the feature series is used as the input to ANN for further prediction, thus taking advantage of both data pre-processing methods and ANN. For instance, wavelet transform (WA) is a powerful tool in signal processing. A signal is decomposed into sub-series at different resolution levels by the discrete WA. The conjunction of WA and ANN (WA-ANN) has been successfully applied in hydrological applications (Adamowski & Chan 2011; Wei *et al.* 2012; Kalteh 2013; Altunkaynak & Nigussie 2017). Chen *et al.* (2007) utilized the WA-ANN to successfully forecast tides for one to five years in advance. A study by Seo *et al.* (2015) showed that the efficiency of WA-ANN is higher than that of single ANN in water level prediction. Thi *et al.* (2018) calculated the daily water level of Vietnam using WA-ANN. Another broadly used data pre-processing method is the empirical mode decomposition (EMD) (Huang *et al.* 1998). Unlike the WA, EMD is an adaptive method that does not suffer from the issue of the selection of basic functions. Su & Liu (2010) proposed the EMD and RBF neural network model in forecasting runoff time series. Liu *et al.* (2012) forecasted wind speed with the help of the EMD-ANN model. The EMD-ANN predicts stream flows with a higher prediction accuracy than ANN (Kisi *et al.* 2014).

However, both WA and EMD have apparent drawbacks. The WA relies highly on the selection of mother wavelet, while EMD suffers from the mode-mixing problem and lacks rigorous mathematical theory. As a new adaptive data analysis method, the empirical wavelet transform (EWT) was proposed by Gilles (2013) with the philosophy of combining both WA and EMD. EWT solves the mode-mixing problem of multicomponent signal decomposition by generating the adaptive wavelet. The combination of EWT and ANN (EWT-ANN) showed high accuracy in streamflow (Peng *et al.* 2017) and wind speed prediction (Liu *et al.* 2018).

Overall, the ANN models with data pre-processing imply great potential for tidal level forecasting and are worthy of further study. However, there are few studies on EWT-ANN. Notably, there is no research on EWT-ANN for tidal level prediction, to the best of our knowledge. In this paper, we will discuss the tidal level prediction performance of EWT-ANN in comparison with EMD-ANN, ensemble EMD (EEMD)-ANN, and HA. Simulations are conducted on the measurements derived from the tidal stations at Pearl River Estuary, China.

The remainder of the paper is organized as follows. In the next section, the traditional HA, data pre-processing methods, including EMD, EEMD, and EWT, are stated together with the adopted neural network model NARX. This is followed by a section describing the study area and data. The results section follows and then conclusions are drawn in the final section.

## METHODS

### Harmonic analysis (HA)

*N*is the total number of constituents, and , , are the amplitude, temporal frequency, and the initial phase of the

*i*th constituent, respectively. , , can be derived from measurements by applying the least-squares method. The prediction accuracy of the HA model depends on the time span of tidal level measurements. A time span of at least one year is considered suitable for precise tidal level prediction using HA.

### Empirical mode decomposition (EMD)

- (1)
Identify all the local maxima and minima of the signal .

- (2)
Connect all the local maxima by a cubic spline line to generate the upper envelope . The lower envelope is generated by connecting the local minima similarly.

- (3)
- (4)

Check whether satisfies the criteria of IMFs. If is an IMF then set ; if not, substitute for to repeat steps 1–3 until meets the criteria for IMFs.

The process stops if the number of extrema of is no larger than two; otherwise, regard as the new . Furthermore, repeat steps 1–4 to compute the next IMF.

By the process above, the signal is decomposed into IMFs and a residual. The IMFs reveal the local characteristics of the signal from high to low frequency.

### Ensemble empirical mode decomposition (EEMD)

EMD suffers from the mode-mixing problem, which implies either a single IMF consists of signals of widely disparate scales or a signal of a similar scale resides in different IMF components. To overcome this drawback, Wu & Huang (2009) proposed a noise-assisted analysis method, called EEMD. EEMD defines IMFs as the mean of an ensemble of trials, each of which is the decomposition of a signal adding a white noise of finite amplitude. The principle of the EEMD is simple: the added white noise would populate the whole time-frequency space uniformly with the constituent components of different scales. The signal with different scales is automatically projected onto proper scales of reference established by the white noise. The procedure of EEMD is described as follows:

- (1)
Add a set of white noise series to the original signal (in this paper, the ratio of the standard deviation of the added noise and that of the signal to be analyzed was 0.2).

- (2)
Decompose the signal with added white noise into IMFs by EMD.

- (3)
Repeat steps 1 and 2 multiple times with different white noise series; the final IMFs are the means of the corresponding IMFs in the ensemble.

### Empirical wavelet transform (EWT)

Inspired by the concept of EMD, Gilles (2013) proposed the EWT to build adaptive wavelets capable of extracting amplitude modulated-frequency modulated (AM-FM) components (modes) of a signal. With the main idea that the Fourier spectrum of each mode is compactly supported, researchers are able to extract all modes of a signal by using a series of adaptive wavelet filters. EWT has a more rigorous mathematical background than EMD. The process of EWT is presented as follows:

- (1)
Obtain the frequency component of a signal using Fast Fourier Transform (FFT).

- (2)
Extract different modes through proper segmentation of the Fourier spectrum.

- (3)
Apply the scaling function and wavelets to calculate the approximation and detail coefficients.

*ω*to be the limits between each segment (where

_{n}*ω*

_{0}

*=*0 and

*ω*=

_{N}*π*), the Fourier support [0,

*π*] is segmented into

*N*contiguous segments of [0,

*ω*], [

_{1}*ω*

_{1},

*ω*

_{2}],…, [

*ω*

_{N−1},

*π*]. The empirical wavelets are defined as bandpass filters on each segment. With the idea of Littlewood–Paley and Meyers wavelets (Daubechies 1992), Gilles defined the expressions for Fourier transform of a scaling function and wavelets as:where

*β*function is defined as:

### Nonlinear autoregressive network with exogenous inputs (NARX)

*u*(

*t*) and

*y*(

*t*) sequences, which fulfills the network with the ability of memory. The defining equation of the NARX model is:where

*y*is the output variable, and

*u*is the exogenous input variable. The function

*f*is unknown and can be approximated by using a feedforward neural network (Beale

*et al.*2011)

*.*When training the network, NARX is set an open-loop form where the observed data are used instead of feeding back the estimated output. This setting increases the accuracy of the NARX model. The training stops when the validation error reaches a minimum. Then, the network is transformed into a closed-loop form to perform multistep predictions when it has been well trained. A closed-loop network means the output is fed back to the input of the feedforward neural network instead of the observed data. NARX network is trained by the Levenberg–Marquardt (LM) algorithm because it is fast, accurate, and reliable (Adamowski & Karapataki 2010). Figure 1 shows the architecture of EMD-NARX, EEMD-NARX, and EWT-NARX models.

### Performance evaluation

*N*is the length of data.

## CASE STUDY

The hourly tidal level data implemented in this study were derived from Guanchong, Xipaotai, Huangjin, and Denglongshan tidal stations at the Pearl River Estuary (Figure 2). The time span of the tidal level records is from 1/1/2016 00:00 to 31/12/2016 23:00. For the Guanchong, Xipaotai, Huangjin, and Denglongshan tidal stations, the numbers of missing values for the whole year are 0, 13, 9, and 27, respectively. Generally, there are one or two missing values in one gap. The linear or quadratic interpolation was used depending on the number of missing values. As one of the most economically developed regions in China, the Pearl River Delta is densely populated. In addition, it is a place with strong sea–land interactions. The tides in this area are quite different from those in the ocean. They are mainly formed on the basis of the tidal wave from the Pacific Ocean, influenced by topography, runoff, and meteorological factors, including wind, tropical cyclone, and cold wave. The tidal waveform is more nonlinear and complicated. Traditional HA can only predict astronomical tide, which is not sufficient for the tidal level prediction in this area. Hence, a precise tidal level prediction is necessary and is of great benefit to the people living in this area.

## APPLICATIONS AND RESULTS

In this paper, all the methods mentioned above were implemented using MATLAB software. The simulations were run on a Microsoft Windows 7-based platform (2.2 GHz CPU and 8-GB RAM).

### Predictions of HA

For the HA model, each data set was divided into two parts, the first 85% of the data for model training (training set) and the remaining 15% for model testing (testing set). Figure 3 displays the amplitude of major tidal constituents. In addition, significant peaks are seen at the diurnal, semidiurnal, third, and quarter diurnal constituents (Figure 4). It is clear that M2 is the dominant tide in this area. Moreover, the amplitudes of semidiurnal constituents are the largest, while the amplitudes of third diurnal constituents are the smallest among the significant peaks. The tidal level forecasting performances using the HA model for the stations are shown in Figure 5.

### Predictions of EMD-NARX, EEMD-NARX, and EWT-NARX

In this section, each data set was divided into training (70%), validation (15%), and testing (15%) parts. The ‘divideblock’ function was used to divide the data sets to avoid the ‘future data’ issue. Each training set was decomposed into sub-series using EMD or EEMD or EWT. These sub-series are the exogenous inputs of NARX, with the original series as the target. Tapped delay lines were used for both the input and the output. A trial-and-error procedure was used to identify the optimal numbers of hidden neurons and delays. Experiences show that one can get a good performance without losing the calculating speed by using the same delay of targets () as the delay of inputs (). Hence, we adopted the same value of and in one simulation. The characteristics of the tide stations and the optimal numbers of hidden neurons and delays are shown in Table 1. The closed-loop form was used to perform multistep predictions of the testing data (1,319 time steps). Figures 6–8 show the tidal level forecasting performances of the EMD-NARX, EEMD-NARX, and EWT-NARX models.

Tidal station . | Data span . | Number of training records . | Number of validation records . | Number of testing records . | Delays-neurons . | ||
---|---|---|---|---|---|---|---|

EMD-NARX . | EEMD-NARX . | EWT-NARX . | |||||

Guanchong | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

Xipaotai | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

Huangjin | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

Denglongshan | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

Tidal station . | Data span . | Number of training records . | Number of validation records . | Number of testing records . | Delays-neurons . | ||
---|---|---|---|---|---|---|---|

EMD-NARX . | EEMD-NARX . | EWT-NARX . | |||||

Guanchong | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

Xipaotai | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

Huangjin | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

Denglongshan | 01/01/2016–31/12/2016 | 6,147 | 1,318 | 1,319 | 2–10 | 3–10 | 3–13 |

## COMPARISONS AND DISCUSSIONS

In this paper, MAE, RMSE, MAPE, and regression coefficient (r) of the forecasting models are compared to evaluate the performance of these models. The performance indices of the stations are given in Table 2. Moreover, the errors between observations and predictions are shown in Figure 9 for apparent recognition. It is evident that the EWT-NARX model outperformed the EEMD-NARX, EMD-NARX, and HA models with the smallest RMSE, MAE, and MAPE, and the largest r. For the four tidal stations studied, RMSE of the testing period for EWT-NARX ranged from 0.008 to 0.014 m, for EEMD-NARX ranged from 0.041 to 0.075 m, for EMD-NARX ranged from 0.105 to 0.167 m, for HA ranged from 0.163 to 0.191 m. In particular, with regard to the Guanchong station, the RMSE value equal to 0.010 m was obtained by the EWT-NARX model compared to a value equal to 0.075 m for the EEMD-NARX model, 0.120 m for the EMD-NARX model, and 0.190 m for the HA model in the testing period. Further analysis showed that the RMSE associated with the EWT-NARX model of the Guanchong was 86.7%, 91.7%, and 94.7% less as compared to the EEMD-NARX, EMD-NARX, and HA models, respectively. The EEMD-NARX was superior to EMD-NARX and HA but slightly weaker than the EWT-NARX. For Guanchong, the EEMD-NARX improved the HA model by 60.5% decreasing in RMSE, 62.1% decreasing in MAE, 55.6% decreasing in MAPE in the testing period. In comparison, the improvement of the EWT-NARX model to the HA model was 94.7%, 94.8%, and 94.6% decreasing in RMSE, MAE, and MAPE, respectively. The EMD-NARX outperformed the HA model but was significantly worse than the EWT-NARX and EEMD-NARX models. The RMSE of the EMD-NARX model in the testing period, when compared with the RMSE of the HA method, was reduced by about 36.8%, 25.7%, 6.2%, and 35.6% for Guanchong, Xipaotai, Huangjin, and Denglongshan, respectively. Regression plots comparing the degrees of correlations between observed values and predicted values of the HA, EMD-NARX, EEMD-NARX, and the EWT-NARX models of the four stations are illustrated in Figure 10. The closeness between the observed and forecast values of the EWT-NARX model can be further testified by the corresponding highest regression coefficient (r) of 0.999, while r for EEMD-NARX, EMD-NARX, and HA models was in the ranges of 0.991–0.996, 0.967–0.982, and 0.944–0.954, respectively.

Station . | Model . | Training . | Validation . | Testing . | |||||||
---|---|---|---|---|---|---|---|---|---|---|---|

RMSE (m) . | MAE (m) . | MAPE (%) . | RMSE (m) . | MAE (m) . | MAPE (%) . | RMSE (m) . | MAE (m) . | MAPE (%) . | r
. | ||

Guanchong | HA | 0.167 | 0.126 | 159.0 | 0.190 | 0.153 | 130.3 | 0.952 | |||

EMD-NARX | 0.076 | 0.055 | 79.7 | 0.084 | 0.064 | 45.8 | 0.120 | 0.094 | 99.6 | 0.977 | |

EEMD-NARX | 0.041 | 0.031 | 43.0 | 0.061 | 0.044 | 28.5 | 0.075 | 0.058 | 57.8 | 0.991 | |

EWT-NARX | 0.008 | 0.007 | 8.2 | 0.024 | 0.008 | 4.3 | 0.010 | 0.008 | 7.0 | 0.999 | |

Xipaotai | HA | 0.159 | 0.123 | 99.2 | 0.191 | 0.154 | 113.5 | 0.954 | |||

EMD-NARX | 0.078 | 0.059 | 57.9 | 0.072 | 0.055 | 36.3 | 0.142 | 0.113 | 72.8 | 0.967 | |

EEMD-NARX | 0.040 | 0.031 | 27.9 | 0.056 | 0.042 | 24.9 | 0.062 | 0.048 | 36.6 | 0.995 | |

EWT-NARX | 0.007 | 0.005 | 4.7 | 0.021 | 0.007 | 3.5 | 0.009 | 0.007 | 6.8 | 0.999 | |

Huangjin | HA | 0.160 | 0.127 | 75.4 | 0.178 | 0.141 | 51.2 | 0.944 | |||

EMD-NARX | 0.064 | 0.048 | 27.1 | 0.078 | 0.059 | 27.8 | 0.167 | 0.139 | 77.1 | 0.969 | |

EEMD-NARX | 0.035 | 0.026 | 15.2 | 0.069 | 0.048 | 23.0 | 0.043 | 0.030 | 15.4 | 0.996 | |

EWT-NARX | 0.006 | 0.004 | 2.9 | 0.017 | 0.006 | 2.8 | 0.014 | 0.011 | 4.2 | 0.999 | |

Denglongshan | HA | 0.147 | 0.118 | 121.3 | 0.163 | 0.128 | 105.4 | 0.951 | |||

EMD-NARX | 0.057 | 0.044 | 53.0 | 0.071 | 0.054 | 44.0 | 0.105 | 0.086 | 103.2 | 0.982 | |

EEMD-NARX | 0.033 | 0.025 | 29.8 | 0.064 | 0.044 | 33.1 | 0.041 | 0.029 | 25.7 | 0.996 | |

EWT-NARX | 0.005 | 0.004 | 4.9 | 0.015 | 0.006 | 4.4 | 0.008 | 0.006 | 6.9 | 0.999 |

Station . | Model . | Training . | Validation . | Testing . | |||||||
---|---|---|---|---|---|---|---|---|---|---|---|

RMSE (m) . | MAE (m) . | MAPE (%) . | RMSE (m) . | MAE (m) . | MAPE (%) . | RMSE (m) . | MAE (m) . | MAPE (%) . | r
. | ||

Guanchong | HA | 0.167 | 0.126 | 159.0 | 0.190 | 0.153 | 130.3 | 0.952 | |||

EMD-NARX | 0.076 | 0.055 | 79.7 | 0.084 | 0.064 | 45.8 | 0.120 | 0.094 | 99.6 | 0.977 | |

EEMD-NARX | 0.041 | 0.031 | 43.0 | 0.061 | 0.044 | 28.5 | 0.075 | 0.058 | 57.8 | 0.991 | |

EWT-NARX | 0.008 | 0.007 | 8.2 | 0.024 | 0.008 | 4.3 | 0.010 | 0.008 | 7.0 | 0.999 | |

Xipaotai | HA | 0.159 | 0.123 | 99.2 | 0.191 | 0.154 | 113.5 | 0.954 | |||

EMD-NARX | 0.078 | 0.059 | 57.9 | 0.072 | 0.055 | 36.3 | 0.142 | 0.113 | 72.8 | 0.967 | |

EEMD-NARX | 0.040 | 0.031 | 27.9 | 0.056 | 0.042 | 24.9 | 0.062 | 0.048 | 36.6 | 0.995 | |

EWT-NARX | 0.007 | 0.005 | 4.7 | 0.021 | 0.007 | 3.5 | 0.009 | 0.007 | 6.8 | 0.999 | |

Huangjin | HA | 0.160 | 0.127 | 75.4 | 0.178 | 0.141 | 51.2 | 0.944 | |||

EMD-NARX | 0.064 | 0.048 | 27.1 | 0.078 | 0.059 | 27.8 | 0.167 | 0.139 | 77.1 | 0.969 | |

EEMD-NARX | 0.035 | 0.026 | 15.2 | 0.069 | 0.048 | 23.0 | 0.043 | 0.030 | 15.4 | 0.996 | |

EWT-NARX | 0.006 | 0.004 | 2.9 | 0.017 | 0.006 | 2.8 | 0.014 | 0.011 | 4.2 | 0.999 | |

Denglongshan | HA | 0.147 | 0.118 | 121.3 | 0.163 | 0.128 | 105.4 | 0.951 | |||

EMD-NARX | 0.057 | 0.044 | 53.0 | 0.071 | 0.054 | 44.0 | 0.105 | 0.086 | 103.2 | 0.982 | |

EEMD-NARX | 0.033 | 0.025 | 29.8 | 0.064 | 0.044 | 33.1 | 0.041 | 0.029 | 25.7 | 0.996 | |

EWT-NARX | 0.005 | 0.004 | 4.9 | 0.015 | 0.006 | 4.4 | 0.008 | 0.006 | 6.9 | 0.999 |

Analysis reveals that EWT-NARX is the optimal model of the four. The following text explains why EWT-NARX outperformed EMD-NARX and EEMD-NARX. The differences between the three models are that they decompose the training set in varied ways. As is known to us, EMD suffers from the mode-mixing problem, which complicates the waveform of the decomposed series. EEMD is proposed to alleviate the mode-mixing problem by adding a series of white noise before the sifting. The Hilbert transform is used to identify the instantaneous frequency of a multicomponent signal, which helps to understand the mode-mixing problem in signal decomposition (Huang *et al.* 2009). Figures 11 and 12 show the waveforms and Hilbert energy spectra of the sub-series obtained from EMD, EEMD, and EWT of Guanchong and Xipaotai, respectively. It can be seen from Figures 11(b) and 12(b) that the diurnal, semidiurnal, third, and quarter diurnal constituents seriously interfere with each other. There is a slight difference for EEMD, in which the diurnal constituents can apparently be distinguished, and the interference around the period of 1/2 day is reduced (Figures 11(d) and 12(d)). While from the Hilbert spectra of EWT pre-processing (Figures 11(f) and 12(f)), the diurnal, semidiurnal, third, and quarter diurnal constituents are separated. It should be noted that the third diurnal constituents are hard to view at the Hilbert spectra because their amplitudes are the smallest among the significant constituents (Figure 4). Consequently, the EWT solves the mode-mixing problem that EMD and EEMD encountered, which reduces the irregular variations in the decomposed series (the left column of Figures 11 and 12). The greater the regularity of the sub-series, the higher the accuracy of the ANN predictions.

## CONCLUSIONS

In this study, the HA, EMD-NARX, EEMD-NARX, and EWT-NARX models were applied to forecast the tidal level at the Pearl River Estuary, China. It is worth mentioning that the EWT-NARX model was applied in tidal level prediction for the first time. The results lead to the following conclusions:

- (1)
EWT-NARX, EEMD-NARX, and EMD-NARX were superior to the HA model with high accuracy (r > 0.96).

- (2)
The EWT-NARX model outperformed the EEMD-NARX, EMD-NARX, and HA models. For the four tidal stations studied, the RMSE of the testing period was in the ranges of 0.008–0.014 m, 0.041–0.075 m, 0.105–0.167 m, and 0.163–0.191 m for the EWT-NARX, EEMD-NARX, EMD-NARX, and HA models, respectively.

- (3)
In the comparison of EMD-NARX, EEMD-NARX, and EWT-NARX, the EWT method decomposed the original data in a physically meaningful way, avoiding the mode-mixing problem that EMD and EEMD suffered. It reduced the irregular variation of the decomposition sequences so that the EWT-NARX model has better prediction performance. In addition, compared with EMD, the EEMD method slightly alleviated the mode-mixing problem but was worse than EWT. Consequently, the prediction performance of EEMD-NARX ranked second among the three.

- (4)
Although the EMD-NARX, EEMD-NARX, and EWT-NARX models outperformed the HA model, the optimal numbers of neurons and delays of the NARX model remain unknown before the simulation, on which the prediction accuracy is highly reliant. In this paper, we used a trial-and-error procedure to identify the optimal numbers at the cost of computational complexity. A trade-off between accuracy and efficiency was unavoidably made.

## ACKNOWLEDGEMENTS

This work was financially supported by Natural Science Foundation of Guangdong Province of China (Grant No. 2018A030313191) and Scientific Research Foundation for Key Teacher of Jimei University (Grant No. C619042). Ministry of Natural Science Foundation Committee of Guangdong Province of China and Jimei University are acknowledged.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.