In this study, a group method of data handling (GMDH) model based on the wavelet transform (WT) was developed to forecast significant wave height (SWH) at different lead times. The SWH dataset was collected from a buoy station located in the North Atlantic Ocean. The SWH time series was decomposed into subseries using the WT, and the decomposed series were then fed to the GMDH model to forecast the SWH. The performance of the wavelet group method of data handling (WGMDH) model was evaluated using the index of agreement (Ia), the coefficient of efficiency and the root mean square error. The analysis showed that the model accuracy is highly dependent on the decomposition level. The results showed that the WGMDH model is able to forecast the SWH with high reliability.

INTRODUCTION

Accurate forecasting of wave properties is very important for most coastal activities. Wave forecasting with different lead times at a specific location is required for the planning and maintenance of any marine activity. Therefore, many significant wave height (SWH) forecasting models have been developed in recent years (e.g. Yangyang et al. 2010; Muraleedharan et al. 2012). Chakrabarti (1987) defined SWH as the average height of the highest one-third of all waves at a specific location of the sea. It is denoted by H1/3 or Hs. In the last decade, artificial intelligence (AI) techniques such as artificial neural networks (ANNs), genetic programming, and fuzzy logic have been extensively utilized to forecast SWH in the fields of coastal and ocean engineering (Deo et al. 2001; Tsai et al. 2002; Basu et al. 2005; Makarynskyy et al. 2005; Londhe & Panchang 2006; Ozger & Sen 2007; Gaur & Deo 2008; Kamranzad et al. 2011).

Owing to the limited capability of the above-mentioned AI approaches in forecasting non-stationary and non-linear phenomena, hybrid AI models were developed to forecast SWH, and successful results have been reported (Ozger 2010; Deka & Prahlada 2012). The applications of hybrid wavelet–AI methods were described in detail by Nourani et al. (2014).

Hybridization of wavelet and fuzzy logic has been employed by Ozger (2010) to forecast SWH for a lead time up to 48 h. For comparison purposes, he applied fuzzy logic, ANN and autoregressive moving average methods for the same SWH time series. He reported that the wavelet fuzzy logic method has better results than other models in all cases.

On the application of the group method of data handling (GMDH) model, Najafzadeh et al. (2013) reported the superiority of GMDH in predicting the abutment scour depth. Zhang et al. (2013) successfully employed an improved GMDH model to predict debris flow. Some other researchers used hybridized GMDH models (Zhu et al. 2012; Atashrouz et al. 2014). These hybrid models present a better result compared with the single GMDH model.

The present study is intended to illustrate a hybrid model for SWH forecasting. The complex SWH time series is decomposed into several simpler time series using a discrete wavelet transform (DWT), so that some characteristics of the subseries can be seen more clearly than in the original signal. A GMDH model is then used as the predictor. The proposed model can improve on the low accuracy of existing models in long-range SWH forecasting.

BACKGROUND

Applications of the wavelet transform (WT) have increased in recent years. The WT is a tool for decomposing a signal into subseries in the time and frequency domains. It is derived from the Fourier transform but, unlike the Fourier transform, it can be used to study non-stationary time series, which is its most important benefit. Traditional transformation methods cannot provide both time and frequency information at high resolution, whereas the WT resolves this deficiency (Misiti et al. 2000; Boggess & Narcowich 2009; Ozger 2010; Danandeh Mehr et al. 2014).

Continuous wavelet transforms and DWT are two types of WT. The DWT computes the wavelet coefficients at discrete intervals of time and scale. The signal is passed through a series of high-pass filters and low-pass filters to analyze the high frequencies and low frequencies, respectively. The low-frequency content is the most important part for many signals. In wavelet analyses, frequency is addressed in terms of approximations and details. The approximation is the high-scale and low-frequency component of the signal. The detail is the low-scale and high-frequency component (Misiti et al. 2000; Deka & Prahlada 2012).

The GMDH is one of the evolutionary models based on exploratory self-organizing that was proposed by Ivakhnenko (1968). The GMDH can be combined with other evolutionary or AI methods (Amanifard et al. 2008).

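This model forecasts an output $y$ based on an input vector such as $X = (x_1, x_2, \ldots, x_n)$. It can be written as:
$$y = f(x_1, x_2, \ldots, x_n) \qquad (1)$$
and the forecast output is:
$$\hat{y} = \hat{f}(x_1, x_2, \ldots, x_n) \qquad (2)$$
The GMDH model learns the relationship between the input and output vectors. Here, a quadratic polynomial of GMDH is employed, which for a pair of inputs $x_i$ and $x_j$ is written as:
$$\hat{y} = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2 \qquad (3)$$
where $a_0$, $a_1$, $a_2$, $a_3$, $a_4$ and $a_5$ are weighting coefficients. These coefficients are calculated using regression techniques such that the sum of squared differences between observed and forecasted data is minimized.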
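
As an illustration of how the six weighting coefficients of one quadratic GMDH unit can be estimated by least squares, the following minimal sketch solves the normal equations in plain Python. It is not the authors' implementation: the ordering of the polynomial terms (constant, two linear terms, cross term, two squared terms), the function names and the synthetic data are all assumptions made for this example.

```python
def quad_features(xi, xj):
    """Terms of the quadratic polynomial for one pair of inputs."""
    return [1.0, xi, xj, xi * xj, xi * xi, xj * xj]


def solve_linear(matrix, rhs):
    """Solve matrix * a = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_quadratic_neuron(x_i, x_j, y):
    """Least-squares estimate of the six coefficients (normal equations)."""
    rows = [quad_features(u, v) for u, v in zip(x_i, x_j)]
    n_terms = 6
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(n_terms)]
           for p in range(n_terms)]
    aty = [sum(r[p] * t for r, t in zip(rows, y)) for p in range(n_terms)]
    return solve_linear(ata, aty)


def predict(coeffs, xi, xj):
    """Evaluate the fitted polynomial for one pair of inputs."""
    return sum(a * f for a, f in zip(coeffs, quad_features(xi, xj)))


# Hypothetical data generated from a known quadratic, so the fit can be checked.
true_coeffs = [0.5, 1.0, -2.0, 0.3, 0.1, -0.4]
x_i = [0.1 * (k % 7) for k in range(30)]
x_j = [0.2 * (k % 5) for k in range(30)]
y = [predict(true_coeffs, u, v) for u, v in zip(x_i, x_j)]
fitted = fit_quadratic_neuron(x_i, x_j, y)
```

On noise-free data such as this, the fitted coefficients reproduce the generating ones up to rounding. A full GMDH network would fit one such unit for every candidate pair of inputs and keep the best performers layer by layer, which this sketch does not attempt.
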
Because of the limitations of GMDH, particularly in the analysis of non-stationary series, this paper introduces a hybrid model called the wavelet group method of data handling (WGMDH), which is a combination of the DWT and the GMDH (Figure 1). In this model, the original SWH series is decomposed into an approximation and several detail subseries, and these series are then fed to the GMDH to forecast the SWH. The proposed model comprises a preprocessing stage and a simulation stage. In the WGMDH, the approximation and the details play vital roles in the model performance.
Figure 1

Schematic diagram of the proposed wavelet GMDH model.

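The correlation coefficient has been employed in many hydrology and ocean engineering studies, but as noted by Deka & Prahlada (2012), it is not the best error index, because it only shows the degree to which two variables are linearly related. Therefore, in this study other criteria, namely the index of agreement (Ia), the coefficient of efficiency (CE) and the root mean square error (RMSE), were used to evaluate the model performance (Equations (4)–(6)):
$$I_a = 1 - \frac{\sum_{i=1}^{n} (O_i - P_i)^2}{\sum_{i=1}^{n} \left( |P_i - \bar{O}| + |O_i - \bar{O}| \right)^2} \qquad (4)$$
$$CE = 1 - \frac{\sum_{i=1}^{n} (O_i - P_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} \qquad (5)$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (O_i - P_i)^2} \qquad (6)$$
where $O_i$, $P_i$, $\bar{O}$, $\bar{P}$ and n are observed SWH, predicted SWH, mean of observed SWH, mean of predicted SWH and number of observations, respectively.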

High values of CE and Ia (up to 1) and a small value of RMSE indicate high efficiency of the model.
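
For readers who wish to reproduce the three indices, a minimal Python sketch follows. It assumes the widely used definitions, namely Willmott's index of agreement, the Nash–Sutcliffe form of the coefficient of efficiency and the usual root mean square error; the function names and the sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def coefficient_of_efficiency(obs, pred):
    """Nash-Sutcliffe form (assumed): 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def index_of_agreement(obs, pred):
    """Willmott's index of agreement (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(obs, pred))
    return 1.0 - sse / potential


# Hypothetical observed and predicted wave heights (m).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
scores = {
    "Ia": index_of_agreement(observed, predicted),
    "CE": coefficient_of_efficiency(observed, predicted),
    "RMSE": rmse(observed, predicted),
}
```

A perfect forecast gives Ia = 1, CE = 1 and RMSE = 0, whereas a forecast equal to the mean of the observations gives CE = 0.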

ANALYSIS AND RESULTS

In the present study, the SWH data of station 41048 (Latitude and longitude ) located in the west of the North Atlantic Ocean have been utilized to forecast SWH with different lead times (Figure 2). The SWH time series statistical properties are given in Table 1. The initial 75% of the SWH data was employed for training and the remaining 25% for model validation. The statistical properties of the training and testing dataset are presented in Table 2, separately.
Table 1

Statistical properties of all data

| Station ID | Period | SWH min. (m) | SWH mean (m) | SWH max. (m) |
| --- | --- | --- | --- | --- |
| 41048 | 2012-1-1 to 2014-12-31 | 0.20 | 1.84 | 12.08 |

Table 2

Statistical properties of training and validation data

| Station ID | Study period | Data type | Min. (m) | Mean (m) | Max. (m) |
| --- | --- | --- | --- | --- | --- |
| 41048 | 2012-1-1 to 2014-4-1 | Training data | 0.50 | 1.92 | 12.08 |
| 41048 | 2014-4-1 to 2014-12-31 | Testing data | 0.20 | 1.58 | 8.15 |

Figure 2

Location map of the study area.

The WGMDH model was tested for various lead times of 3, 6, 12, 24 and 48 h using various decomposition levels, but only the optimum decomposition level for every lead time is presented. The predicted SWH is indicated as where n is the lead time. The predicted SWH accuracy is important for ocean engineering applications. The analysis results and SWH modeling are evaluated using three different indices including Ia, CE and RMSE.
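
The way a lead time translates into training samples can be sketched as follows. The example pairs each value with the value observed a fixed number of steps later and then applies the chronological 75%/25% split described earlier. It assumes hourly sampling, so that a 3 h lead time corresponds to three steps; this sampling interval, the function names and the toy series are assumptions of the sketch, not details taken from the study.

```python
def make_lead_pairs(series, lead_steps):
    """Pair the value at step t with the value lead_steps later."""
    inputs = series[:len(series) - lead_steps]
    targets = series[lead_steps:]
    return list(zip(inputs, targets))


def chronological_split(samples, train_fraction=0.75):
    """Keep time order: the first part for training, the rest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Hypothetical hourly record; the values simply encode the time index.
series = [float(t) for t in range(100)]
pairs = make_lead_pairs(series, lead_steps=3)
train, test = chronological_split(pairs)
```

Splitting by position rather than at random keeps every testing sample later in time than every training sample, which matches the forecasting setting.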

In this study, the DWT was employed to preprocess the SWH time series. First, the mother wavelet whose shape most closely resembles the SWH time series was sought. On this basis, the Daubechies wavelet of order 3 (db3) was selected for decomposing the main series.

The SWH time series contained considerable noise, which drastically decreased the model performance. To enhance the model performance, the SWH time series was decomposed into subseries. The noisy subseries was omitted, and the other subseries, which retain the main features of the original time series, were given to the GMDH model as input. The number of inputs in the input layer increases with increasing decomposition level. The results are presented in Table 3, which lists the three performance indices for the various lead times. The index of agreement changes with lead time: for the testing data, Ia decreases from 0.986 at the 3 h lead time to 0.827 at the 48 h lead time. Over the same range, the CE decreases from 0.94 to 0.53 and the RMSE increases from 0.196 m to 0.568 m. For the shortest (3 h) lead time, all performance indices show that the predicted SWH is very close to the observed SWH. Figure 3 shows the strong correlation between forecast and observed SWH for the 3 h lead time.
Table 3

The results of the wavelet GMDH (WGMDH) model

| Lead time (h) | Ia (train) | Ia (test) | CE (train) | CE (test) | RMSE (m, train) | RMSE (m, test) | Optimum level |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | 0.9892 | 0.9855 | 0.9581 | 0.9443 | 0.2237 | 0.1956 | |
| 6 | 0.9742 | 0.9733 | 0.9040 | 0.9011 | 0.3385 | 0.2608 | |
| 12 | 0.9471 | 0.9520 | 0.8164 | 0.8313 | 0.4681 | 0.3407 | |
| 24 | 0.8841 | 0.9029 | 0.6504 | 0.6937 | 0.6460 | 0.4595 | |
| 48 | 0.8296 | 0.8267 | 0.5379 | 0.5326 | 0.7428 | 0.5685 | |

Figure 3

Scatter diagram of observed and predicted SWH by the WGMDH model at 3 h lead time for testing data.

The optimum decomposition level increases with increasing lead time, as shown in Table 3. It is clear from the results that the decomposition level has an important impact on the model performance. For instance, the Ia and CE values increase from 0.52 and 0.18 at four decomposition levels to 0.83 and 0.53 at eight decomposition levels. Similarly, the RMSE decreases from 0.99 at four levels to 0.568 at seven levels for the 48 h lead time. Table 3 shows that higher decomposition levels give better SWH forecasting performance at longer lead times.

Comparison of Figures 4–6 shows that the WGMDH performance for SWH forecasting decreases gradually as the lead time increases. For easier comparison, the error indices are plotted in Figure 7 for the various lead times. This figure clearly indicates that forecasts at shorter lead times are more accurate than those at longer lead times.
Figure 4

Observed and predicted SWH by the WGMDH model at 6 h lead time for testing data.

Figure 5

Observed and predicted SWH by the WGMDH model at 12 h lead time for testing data.

Figure 6

Observed and predicted SWH by the WGMDH model at 48 h lead time for testing data.

Figure 7

Variation of Ia, CE and RMSE vs lead times.

The inverse relationship between lead time and model performance can be explained by the variation in the time series. As the interval between the current time (t) and the predicted time (t + n) increases, the model cannot simulate the variation of the time series as well as it does for shorter lead times. Nevertheless, the proposed model gave better results than most of the available models proposed by Gaur & Deo (2008) and Kamranzad et al. (2011). The results show that the WGMDH forecasts were close to the observed SWH and that the model can be applied in ocean studies.

On the whole, oceanic phenomena are non-stationary, and this is the most important limitation for AI models. The proposed model mitigates this shortcoming.

CONCLUSIONS

In the present research, a wavelet-based group method of data handling model, WGMDH, was proposed to forecast SWH at 3, 6, 12, 24 and 48 h lead times. The SWH time series covered the period from January 1, 2012 to December 31, 2014. These data are available from a buoy station located in the North Atlantic Ocean. The WGMDH model involves two stages: preprocessing and simulation. Here, db3 was selected as the mother wavelet to decompose the main time series (the preprocessing stage). The decomposed time series were then used as input to the GMDH model for SWH modeling (the simulation stage). The results demonstrated the vital role of the WT in SWH forecasting. In addition, the impact of the wavelet decomposition level on the model's ability was assessed by examining different decomposition levels. The performance of the proposed model was evaluated by three different indices. As in recent studies on SWH forecasting, the WGMDH forecasts at shorter lead times are more precise than those at longer ones. However, the forecasting accuracy is better than in similar studies, particularly at longer lead times. The index of agreement decreases from 0.98 at the 3 h lead time to 0.83 at the 48 h lead time.

ACKNOWLEDGEMENTS

The authors are grateful to the editor and three anonymous reviewers for their constructive and insightful comments, which helped enhance the quality of the manuscript.

REFERENCES

Amanifard, N., Nariman-Zadeh, N., Farahani, M.-H. & Khalkhali, A. 2008 Modelling of multiple short-length-scale stall cells in an axial compressor using evolved GMDH neural networks. Energy Conversion and Management 49 (10), 2588–2594.
Atashrouz, S., Pazuki, G. R. & Alimoradi, Y. 2014 Estimation of the viscosity of nine nanofluids using a hybrid GMDH-type neural network system. Fluid Phase Equilibria 372 (25), 43–48.
Basu, S., Sarkar, A., Satheesan, K. & Kishtawal, C. M. 2005 Predicting wave height in the North Indian Ocean using genetic algorithm. Geophysical Research Letters 32, 1–5.
Boggess, A. & Narcowich, F. J. 2009 A First Course in Wavelets with Fourier Analysis. John Wiley Publications, New Jersey.
Chakrabarti, S. K. 1987 Hydrodynamics of Offshore Structures. 1st edn. Computational Mechanics Publications, Billerica, MA, and WIT Press, Southampton, UK, chapter 5.
Danandeh Mehr, A., Kahya, E. & Ozger, M. 2014 A gene–wavelet model for long lead time drought forecasting. Journal of Hydrology 517, 691–699.
Deo, M. C., Jha, A., Chaphekar, A. S. & Ravikar, K. 2001 Neural networks for wave forecasting. Ocean Engineering 28, 889–898.
Gaur, S. & Deo, M. C. 2008 Real-time wave forecasting using genetic programming. Ocean Engineering 35, 1166–1172.
Ivakhnenko, A. G. 1968 The group method of data handling (GMDH). Automation 3, 57–83.
Kamranzad, B., Shahidi, A. E. & Kazeminezhad, M. H. 2011 Wave height forecasting in Dayyer, the Persian Gulf. Ocean Engineering 38, 248–255.
Londhe, S. & Panchang, V. 2006 One-day wave forecasts using buoy data and artificial neural networks. Journal of Atmospheric and Oceanic Technology 3, 2119–2123.
Makarynskyy, O., Pires-Silva, A. A., Makarynska, D. & Ventura-Soares, S. 2005 Artificial neural networks in wave predictions at the west coast of Portugal. Computers and Geosciences 31, 415–424.
Misiti, M., Misiti, Y., Oppenheim, C. & Poggi, J. 2000 Wavelet Toolbox: For Use with MATLAB. The MathWorks, Natick, MA.
Muraleedharan, G., Lucas, C., Soares, C. G., Nair, N. U. & Kurup, P. G. 2012 Modelling significant wave height distributions with quantile functions for estimation of extreme wave heights. Ocean Engineering 54, 119–131.
Najafzadeh, M., Barani, G. A. & Hessami Kermani, M. R. 2013 Abutment scour in clear-water and live-bed conditions by GMDH network. Water Science and Technology 67 (5), 1121–1128.
Nourani, V., Hosseini, B., Adamowski, J. & Kisi, O. 2014 Application of hybrid wavelet–artificial-intelligence models in hydrology: a review. Journal of Hydrology 514, 358–377.
Tsai, C. P., Lin, C. & Shen, J. N. 2002 Neural network for wave forecasting among multi-stations. Ocean Engineering 29, 1683–1695.
Yangyang, G., Dingyong, Y., Cuilin, L. & Delun, X. 2010 Calculation of significant wave height using the linear mean square estimation method. Journal of Ocean University of China 9, 327–332.
Zhu, B., He, C. Z., Liatsis, P. & Li, X. Y. 2012 A GMDH-based fuzzy modeling approach for constructing TS model. Fuzzy Sets and Systems 189 (1), 19–29.