Prediction of wave parameters is of great importance in the design of marine structures. In this paper, two shortcomings of the adaptive network-based fuzzy inference system (ANFIS) model for prediction of wave parameters are remedied by employing a genetic algorithm (GA). The first shortcoming is that the ANFIS model cannot extract fuzzy IF-THEN rules automatically, and the second is that it relies on gradient-based tuning of the antecedent and consequent parameters of those rules. To deal with these shortcomings, a combined FIS and GA model is developed in which the GA, as an evolutionary algorithm, simultaneously optimizes the subtractive clustering parameters and the antecedent and consequent parameters of the fuzzy IF-THEN rules. The combined model is then used to predict wave parameters, i.e., significant wave height and peak spectral period, at Lake Michigan. The results show that the developed model outperforms both the ANFIS model and the Coastal Engineering Manual (CEM) method in estimating the function that represents the generation process of wind-driven waves.
NOMENCLATURE
- Ai, Bi, Ci, Di, Ei
fuzzy set variables
drag coefficient
standard deviations of the Gaussian membership functions
mean values of the Gaussian membership functions
- D
total number of input and output variables in data set
wind direction at the ith hour
average wind direction for consecutive preceding i hours
- E
mean square error
- g
gravitational acceleration
significant wave height
number of training data
number of testing data
number of validation data
- Numrule
number of fuzzy IF-THEN rules
- MaxNumrule
maximum number of fuzzy IF-THEN rules
- oi, pi, qi, ri, si, ti
the linear consequent parameters
firing strength of rule i
degree of membership
- O
observed value
- P
predicted value
the potential of kth data point
the modified potential value of kth data point
the first cluster center potential value
clustering radius for the ith input variable
the minimum clustering radius for the ith variable
learning rate
the number of training epochs
- S1
the set of nonlinear antecedent parameters
- S2
the set of linear consequent parameters
peak spectral wave period
required time for fetch-limited condition
- U
wind speed
wind speed at the ith hour
the average of wind speed for consecutive preceding i hours
shear velocity
- F
fetch length
- t
wind duration
fuzzy IF-THEN rule parameters
- d
distance between the candidate clusters
radii of clusters
quash factor
acceptance ratio
rejection ratio
- X
corrected fetch length in meter
- x
value of data point
- xC1
cluster center with highest potential value
INTRODUCTION
Prediction of wave parameters, including significant wave height and peak spectral period, plays an important role in ocean activities. In that regard, several methods have been introduced to predict wave parameters. Simplified methods like SPM (Shore Protection Manual 1984) and CEM (Coastal Engineering Manual) (US Army 2003) were the first methods. These techniques are based on a constant wind definition, and evaluations against observed data sets at various sites have shown their deficiency in predicting wave parameters (Bishop 1983; Etemad-Shahidi et al. 2009).
Numerical models like the SWAN (Simulating WAves Nearshore) model, which work by solving energy equilibrium equations, are the second type of model used to predict wave parameters. Applying these methods requires not only the bathymetry of a lake but also several additional parameters such as bottom roughness (Ris et al. 1996).
The emergence of soft computing-based methods motivated scientists to employ these methods for prediction of wave parameters. These tools have been broadly used as an alternative method for modeling various complex civil engineering problems. Artificial neural networks (ANNs) (Agrawal & Deo 2002; Tsai et al. 2002; Altunkaynak 2013) and fuzzy inference systems (FISs) are examples of these applications. The FIS models have been widely employed by many researchers in the field of water engineering, including modeling of rainfall–runoff events (Sen & Altunkaynak 2004), estimating scour uncertainty around bridge piers (Johnson & Ayyub 1996), predicting scour depth at abutments of armored beds (Muzzammil 2010; Muzzammil & Alam 2011), optimizing water allocation systems (Kindler 1992), controlling reservoir operation (Pesti et al. 1996), analyzing regional drought (Pongracz et al. 1999), estimating pile group scour (Chen & Dao 2007), predicting stream flow (Shi et al. 1999; Özger 2009), finding scour location at the downstream of a spillway (Russo 2000), and forecasting flood flow (Rezaeianzadeh et al. 2014). Most of the reviewed studies show that the FIS models outperform the nonlinear regression approaches.
In the field of coastal engineering, Kazeminezhad et al. (2005) used an FIS whose structure was optimized by a hybrid model to estimate wave parameters. Özger & Sen (2007) applied fuzzy logic to investigate the relationship between the wind speed and previous and current wave characteristics in the Pacific Ocean. Mahjoobi et al. (2008) applied ANN and ANFIS models to hindcast wave parameters, using wind speed, wind direction, fetch length, and wind duration as input variables. Their findings showed the ANFIS models' superiority to the FIS and ANN models. Zanganeh et al. (2009) used genetic algorithm–adaptive network-based fuzzy inference systems (GA-ANFIS) for prediction of wave parameters. In their model, a GA was used as the optimizer to tune the subtractive clustering parameters, including the clustering radii and the quash factor, while at the same time a hybrid gradient-based method known as the ANFIS was used to tune the consequent and antecedent parameters of the fuzzy IF-THEN rules. Their results indicated that the GA-ANFIS model is more accurate than the ANFIS model in which the clustering parameters were generated randomly. In another study, Zanganeh et al. (2011) employed a PSO-FIS-PSO model to estimate the equilibrium depth of scouring beneath pipelines. In their model, two particle swarm optimization (PSO) algorithms were employed to tune the subtractive clustering parameters and the antecedent and consequent parameters of the fuzzy IF-THEN rules simultaneously. Their model outperformed the empirical methods for estimating the equilibrium scour depth. An important deficiency of the PSO-FIS-PSO model is the need to tune the parameters of its two PSO algorithms, such as the initial population and the cognitive and social parameters. Therefore, modifying the model to use a single evolutionary algorithm can be beneficial. Akpinar et al. (2014) employed FIS and empirical models along the south coast of the Black Sea to predict wave parameters, and the results showed the superiority of the ANFIS model over empirical models including the CEM and SPM methods. More recently, Zanganeh et al. (2016) applied ANN and ANFIS models to estimate coastal current velocities on the Ogata coast. The results showed the models' ability to capture the physical complexity of the generation process of coastal currents in both the longshore and cross-shore directions.
As mentioned above, FISs are models in which a phenomenon is estimated by mapping a nonlinear relationship between effective input and output variables. These models represent a phenomenon by a set of fuzzy IF-THEN rules, and optimizing these rules is important for capturing the nature of the phenomenon. The optimization of fuzzy IF-THEN rules can be accomplished either by a gradient-based method (ANFIS model) or by an evolutionary algorithm like the GA (combined FIS and GA model). In the combined FIS and GA model, the domain-independent behavior of the GA allows a global optimization to be achieved, whereas hill-climbing methods like the ANFIS require a specific domain to guide their search. To date, various efforts have been devoted to developing combined FIS and evolutionary algorithms either to estimate functions (Homaifar & McCormick 1995; Shi et al. 1999; Russo 2000; Hidalgo et al. 2012; Zacharia & Nearchou 2012) or to optimize fuzzy logic controllers (Seng et al. 1999; Chen & Dao 2007; Poursamad & Montazeri 2008).
The main objective of this paper is to employ the features of the GA as an evolutionary algorithm to optimize the structure of fuzzy IF-THEN rules; the resulting model is termed the combined GA and FIS model. In the model, the gradient-based learning algorithm of the ANFIS is replaced by a GA in order to tune the nonlinear antecedent and linear consequent parameters of the fuzzy rules. The subtractive clustering parameters are also optimized within the training process. Finally, the model, which is developed to estimate any function, is used to predict wind-driven wave parameters. Note that the learning process in the ANFIS model can be either the hybrid learning algorithm introduced by Jang (1993) or the steepest descent (SD) method.
The introduction above provides a brief review of previous work in the fields of water engineering and wave prediction. The next section gives an overview of the features of the FIS and ANFIS methods. It is followed by sections describing the GA as an optimizer and outlining the combined FIS and GA model developed to estimate any function. The performance of the model in predicting wind-driven wave parameters is then evaluated, followed by a section outlining the CEM method for predicting wave parameters. The final section contains the evaluation of the developed models.
FUZZY INFERENCE SYSTEMS
FIS structures
FISs are mathematical theories allowing one to model a natural process through some linguistic expressions. These methods are suitable to find relationships among effective input variables and the desired output of a system. These models can assign qualitative aspects of human knowledge by some linguistic expressions, so-called fuzzy IF-THEN rules. The rules are usually extracted from data sets representing the phenomenon.
Rule 1: If x is A1 and y is B1 then $f_1 = p_1 x + q_1 y + r_1$,
Rule 2: If x is A2 and y is B2 then $f_2 = p_2 x + q_2 y + r_2$.
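To make the two rules above concrete, the following is a minimal sketch of how a two-input, first-order Sugeno-type FIS could be evaluated. It assumes Gaussian membership functions (as listed in the nomenclature), a product operator for the "and" in each rule, linear consequents of the form $f_i = p_i x + q_i y + r_i$, and a weighted average of the rule outputs. The function names and the example parameter values are hypothetical and are not taken from the paper.

```python
import math

def gaussian_mf(value, mean, sigma):
    """Degree of membership of `value` in a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((value - mean) / sigma) ** 2)

def sugeno_output(x, y, rules):
    """Evaluate a two-input first-order Sugeno FIS (illustrative only).

    Each rule is a dict holding the (mean, sigma) pairs of its fuzzy sets
    A_i and B_i and its linear consequent parameters p, q, r.
    """
    weighted_sum = 0.0
    total_strength = 0.0
    for rule in rules:
        # Firing strength: product of the two membership degrees (assumed "and").
        w = gaussian_mf(x, *rule["A"]) * gaussian_mf(y, *rule["B"])
        # Linear consequent of this rule.
        f = rule["p"] * x + rule["q"] * y + rule["r"]
        weighted_sum += w * f
        total_strength += w
    # Overall output: firing-strength-weighted average of the rule outputs.
    return weighted_sum / total_strength

# Hypothetical two-rule system, used only to exercise the function.
example_rules = [
    {"A": (0.0, 1.0), "B": (0.0, 1.0), "p": 1.0, "q": 0.0, "r": 0.0},
    {"A": (4.0, 1.0), "B": (4.0, 1.0), "p": 0.0, "q": 1.0, "r": 2.0},
]
example_output = sugeno_output(2.0, 2.0, example_rules)
```

In this sketch the mean and sigma values play the role of the nonlinear antecedent parameters, while p, q and r are the linear consequent parameters that a tuning method (gradient-based or GA) would adjust.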
Subtractive clustering method
GENETIC ALGORITHM
Selection
This operator selects the individuals that contribute to the next generation. In this study, stochastic uniform selection is applied: the individuals are laid out on a line, each occupying a section whose length is proportional to its scaled value. The algorithm then moves along the line in equal steps and, at each step, selects as a parent the individual whose section it lands in.
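The selection scheme described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the scaled values are non-negative, the first step starts at a random position smaller than one step size, and the function name and arguments are hypothetical rather than taken from the paper or from any toolbox.

```python
import random

def stochastic_uniform_selection(scaled_values, n_parents, rng=random):
    """Return the indices of the selected parents.

    Individuals occupy consecutive sections of a line, each with a length
    equal to its scaled value; a pointer walks the line in equal steps.
    """
    total = sum(scaled_values)
    step = total / n_parents
    pointer = rng.uniform(0.0, step)  # random start within the first step
    parents = []
    index = 0
    section_end = scaled_values[0]
    for _ in range(n_parents):
        # Advance to the section that contains the current pointer position.
        while section_end <= pointer and index < len(scaled_values) - 1:
            index += 1
            section_end += scaled_values[index]
        parents.append(index)
        pointer += step
    return parents
```

Because the steps are equal, an individual whose section is k steps long is chosen close to k times, which keeps the selection less noisy than drawing each parent independently.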
Crossover
This operator combines two parents to form children for the next generation. In this paper, the scattered method is used as the crossover operator: a random binary string is created with the same length as the solution, and the child takes its value from the first parent wherever the string holds a 1 and from the second parent wherever it holds a 0.
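A minimal sketch of the scattered crossover described above is given below. The function name and the choice to return the mask together with the child are illustrative assumptions.

```python
import random

def scattered_crossover(parent1, parent2, rng=random):
    """Build one child from two parents using a random binary mask."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    # One random bit per gene: 1 takes the gene from parent1, 0 from parent2.
    mask = [rng.randint(0, 1) for _ in parent1]
    child = [g1 if bit == 1 else g2
             for bit, g1, g2 in zip(mask, parent1, parent2)]
    return child, mask
```

Each gene of the child is therefore inherited unchanged from exactly one of the two parents.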
Mutation
This operator applies sudden random changes to the parents to form new children, which generally makes the algorithm more robust against becoming stuck in local optima. In this paper, Gaussian mutation is used, which adds a random number to each vector entry of an individual. This random number is drawn from a Gaussian distribution centered on zero. The variance of this distribution is controlled by two parameters. The Scale parameter determines the variance at the first generation. The Shrink parameter controls how the variance shrinks over the generations. If the Shrink parameter is 0, the variance is constant, and if the Shrink parameter is 1, the variance shrinks to 0 linearly as the last generation is reached (Chipperfield et al. 1994).
The GA operators are controlled by the population size, crossover fraction, and mutation fraction, which must be set to values that suit the problem class being worked on. A very small mutation rate may lead to genetic drift. A recombination rate that is too high may lead to premature convergence of the GA. A mutation rate that is too high may lead to the loss of good solutions unless elitist selection is employed. Elitist selection copies some of the best individuals into the next generation unchanged.
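The role of the Scale and Shrink parameters can be illustrated with the following sketch. It assumes a simple linear schedule for the spread of the Gaussian noise, with generations counted from 0 to the last generation and Shrink between 0 and 1; it uses the standard deviation as the spread measure. The exact schedule used by a particular GA toolbox may differ, and the function names are hypothetical.

```python
import random

def mutation_std(generation, n_generations, scale, shrink):
    """Spread of the mutation noise under an assumed linear shrink schedule."""
    return scale * (1.0 - shrink * generation / n_generations)

def gaussian_mutation(individual, generation, n_generations,
                      scale=1.0, shrink=1.0, rng=random):
    """Add zero-mean Gaussian noise to every entry of an individual."""
    sigma = mutation_std(generation, n_generations, scale, shrink)
    return [gene + rng.gauss(0.0, sigma) for gene in individual]
```

With Shrink equal to 0 the spread stays at Scale for the whole run, and with Shrink equal to 1 it reaches zero at the last generation, so the final individuals are no longer perturbed.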
THE COMBINED MODEL OF FIS AND GA
In the figure, is the clustering radius for the ith input variable and the output variable , is the minimum clustering radius for the ith variable. MaxNumrule, is the maximum number of fuzzy IF-THEN rules determined based on the prediction errors for the training and validation data expressed as and , respectively.
Note that in this paper, as mentioned before, the fuzzy IF-THEN rules associated with the clustering parameters are extracted so that they have the lowest similarity to one another. The number of rules and the number of linguistic variables for each input variable are both equal to the number of clusters obtained from the clustering parameters. To keep the similarity among the fuzzy IF-THEN rules at a minimum, only linguistic variables at the same level are combined (Chiu 1994). For example, ‘A1’, the first linguistic variable of input variable A, forms a rule with ‘B1’, the first linguistic variable of input variable B. The same holds for the other rules, and the final rules can be expressed as follows:
Rule 1: If x is A1 and y is B1
Rule 2: If x is A2 and y is B2
…….
Rule N: If x is An and y is Bn.
The construction of the initial FIS follows the MATLAB GENFIS 2 commands, which keep the similarity among the rules low and thereby decrease the execution time.
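Since each cluster yields one rule, the clustering step determines the rule base. The sketch below shows subtractive clustering in the form commonly attributed to Chiu (1994), using the quantities named in the nomenclature (potential, clustering radii, quash factor, acceptance ratio, rejection ratio). It assumes the data are already normalized, an acceptance ratio below 1, and default values of 1.25, 0.5 and 0.15 for the quash factor, acceptance ratio and rejection ratio. It is an illustrative sketch, not the paper's implementation.

```python
import math

def subtractive_clustering(points, radii, quash=1.25,
                           accept_ratio=0.5, reject_ratio=0.15):
    """Return cluster centers chosen among `points` (tuples of floats)."""
    def scaled_sq_dist(a, b, r):
        # Squared distance with each dimension scaled by its own radius.
        return sum(((ai - bi) / ri) ** 2 for ai, bi, ri in zip(a, b, r))

    quash_radii = [quash * r for r in radii]
    # Potential of each point: sum of its closeness to all points.
    potentials = [
        sum(math.exp(-4.0 * scaled_sq_dist(p, q, radii)) for q in points)
        for p in points
    ]
    first_potential = max(potentials)
    centers = []
    while True:
        k = max(range(len(points)), key=potentials.__getitem__)
        candidate_potential = potentials[k]
        ratio = candidate_potential / first_potential
        if ratio > accept_ratio:
            accept = True
        elif ratio < reject_ratio:
            break  # remaining potentials are too low: stop
        else:
            # Gray zone: accept only if far enough from existing centers.
            d_min = min(math.sqrt(scaled_sq_dist(points[k], c, radii))
                        for c in centers)
            accept = d_min + ratio >= 1.0
        if accept:
            centers.append(points[k])
            # Reduce the potential of points near the new center.
            for i, p in enumerate(points):
                potentials[i] -= candidate_potential * math.exp(
                    -4.0 * scaled_sq_dist(p, points[k], quash_radii))
        else:
            potentials[k] = 0.0  # reject this candidate and try the next one
    return centers
```

Smaller radii tend to produce more clusters and therefore more rules, which is why the radii and the quash factor are natural decision variables for the GA.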
In addition to the development of the combined FIS and GA model, the subtractive clustering parameters and the antecedent and consequent parameters of the fuzzy IF-THEN rules can be optimized separately to clarify how important each optimization is in the prediction of wave parameters. To achieve this, the GA is also used in this paper to optimize the subtractive clustering parameters, including the clustering radii and the quash factor, without optimizing the fuzzy antecedent and consequent parameters. In this optimization process, two groups of data are used as the training and validation data sets, as in other data-driven algorithms. The training data sets are used directly to optimize the subtractive clustering parameters, whereas the validation data sets are used to evaluate the generalization capability of the model and to escape from the ‘curse of dimensionality’ deficiency in the FIS. In the tuning process, the generation that simultaneously has the minimum training and validation errors is chosen as the final solution. The same scenario can be employed to optimize the fuzzy antecedent and consequent parameters for a given set of fuzzy IF-THEN rules; here too, both training and validation data sets should be used to obtain a model with suitable generalization capability. In all these scenarios, the RMSE of the training data is the objective function, whereas the RMSE of the validation data is calculated to control the well-known over-fitting problem.
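The objective and the validation check described above can be sketched as follows. The RMSE function follows the usual definition. The rule for picking the final generation is only one possible reading of "minimum training and validation errors simultaneously": here the generation with the smallest sum of the two errors is taken, which is an assumption and not necessarily the criterion used in the paper. The function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

def pick_generation(train_errors, validation_errors):
    """Index of the generation with the smallest combined error (assumed rule)."""
    combined = [t + v for t, v in zip(train_errors, validation_errors)]
    return min(range(len(combined)), key=combined.__getitem__)
```

Tracking the validation RMSE alongside the training RMSE makes it visible when further generations only improve the fit to the training data.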
THE COMBINED FIS AND GA MODEL AT LAKE MICHIGAN FOR WAVE PREDICTION
Study area and data selection
To estimate the fetch length at the study area, the CEM (US Army 2003) method criterion was used, by which a fetch length for a certain direction was estimated by considering 30 radials from the point of interest (at 1-degree intervals) and extending them until they intersected the coastline. The fetch length is the arithmetic average of the obtained lengths.
Following the extraction of constant winds, 1,200 hourly data points were selected, of which 1,080 data points (the data set of year 2001) were used for training and the remaining data (the data set of year 2013) were used for testing. A different year was chosen for testing to provide a fair evaluation of the developed model under a different wind climate. Out of the 1,080 data points, 800 data points were chosen as the training data points and the remaining 220 data points were used as the validation data to avoid overtraining of the model. Statistical characteristics such as the minimum, maximum, average and range of all data points are reported in Table 1. In the selected intervals of the data set, the maximum and minimum recorded wind speeds are 16.52 m/s and 5.75 m/s, respectively, which indicates that the data set covers a wide range of wind climate at the lake. These data were selected from a total of 4,554 hourly data points, from which data points with a wave height of less than 0.5 m (the common calm condition in maritime design) were eliminated. In addition, existing gaps in the data set were either excluded or interpolated, and wind speeds were converted to wind speeds at 10 m above sea level.
In order to evaluate the performance of the combined FIS and GA model, the GA is applied in three states in this section. In the first state, the GA optimizes the subtractive clustering parameters, i.e., the radii of the input and output variables and the quash factor, leading to the extraction of fuzzy IF-THEN rules. In the second state, the GA is employed only to tune the antecedent and consequent parameters of the fuzzy IF-THEN rules resulting from the first state. In the third state, the GA simultaneously optimizes the subtractive clustering parameters and the antecedent and consequent parameters associated with the fuzzy IF-THEN rules selected by the clustering parameters. Note that in this form of GA application, the number of antecedent and consequent parameters depends on the subtractive clustering parameters. Therefore, the number of decision variables changes during the execution of the GA model.
Parameter | Min. | Max. | Average | Range |
---|---|---|---|---|
Wind speed (m/s) | 5.75 | 16.52 | 8.37 | 10.77 |
Fetch length (CEM) (km) | 76 | 329 | 129 | 253 |
Wind duration (hr) | 3 | 37 | 6.37 | 34 |
Significant wave height (m) | 0.51 | 4.75 | 1.22 | 4.24 |
Peak spectral period (s) | 2.98 | 7.3 | 4.21 | 4.32 |
GA application to extract the fuzzy IF-THEN rules
This equation has been selected according to GENFIS 2 commands and is proven by previous work (Zanganeh et al. 2016).
Following the above descriptions, in this part the fuzzy IF-THEN rules are extracted to predict wave parameters. These rules are extracted based on having minimum errors of training and validation data with respect to optimum subtractive clustering parameters.
The optimum values of the clustering parameters are as follows:
These clustering parameters are associated with the best execution out of ten runs of the predictor models. As is evident from Figure 6, the training error of the significant wave height predictor model varies from 0.273 m to 0.1705 m, while in Figure 7 this error changes from 0.2331 s to 0.2114 s for the peak spectral period predictor model. In addition, the validation error for the appropriate clustering parameters in the wave height predictor model is 0.3101 m, reached at generation 87. In the peak spectral period predictor model, the validation error is 0.3815 s, reached at generation 54. Beyond these generations there is no further decrease in the training and validation errors. It should therefore be investigated whether the obtained FISs can be improved by further optimization. At this step, the fuzzy antecedent and consequent parameters are not optimized in either predictor model, so the obtained FISs can generally be regarded as local solutions. In that regard, the fuzzy antecedent and consequent parameters should be optimized for both predictor models while the extracted fuzzy IF-THEN rules are kept fixed.
Referring to the optimized clustering parameters reveals that the wind speed has the lowest value of the clustering radius in both the wave height predictor model and peak spectral period predictor model , whereas the wind duration has the highest value of the clustering radius ( for wave height and for peak spectral period).
The GA application to tune the extracted fuzzy IF-THEN rules parameters
As mentioned above, optimizing the fuzzy antecedent and consequent parameters in both predictor models is important for improving the training and validation errors. Therefore, in this subsection the antecedent and consequent parameters of the fuzzy IF-THEN rules extracted in the previous section are optimized. The number of fuzzy IF-THEN rules associated with the obtained clustering parameters is 4 for both the wave height and the peak spectral period predictor models. As a result, the number of fuzzy antecedent and consequent parameters is 40, and these are the decision variables of the GA. Of these, 16 are linear consequent parameters and 24 are nonlinear antecedent parameters. In the generation process, the population size of the GA for both predictor models is 400, the crossover fraction is 0.7, the number of elite chromosomes is 20, and the remaining children are produced by mutation.
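The parameter counts quoted above can be checked with a short calculation. The sketch assumes three input variables (wind speed, fetch length and wind duration, as in Table 1), two Gaussian parameters (mean and standard deviation) per input per rule, and one linear coefficient per input plus a constant term per rule. The population breakdown assumes that the crossover fraction applies to the non-elite part of the population and is rounded to the nearest integer, which is a common convention but is not stated in the text. The function names are hypothetical.

```python
def count_fis_parameters(n_rules, n_inputs):
    """Number of (antecedent, consequent) parameters of a first-order Sugeno FIS."""
    antecedent = n_rules * n_inputs * 2    # mean and standard deviation per set
    consequent = n_rules * (n_inputs + 1)  # one coefficient per input plus constant
    return antecedent, consequent

def population_breakdown(population_size, elite_count, crossover_fraction):
    """Split a generation into elite, crossover and mutation children."""
    remaining = population_size - elite_count
    crossover_children = round(crossover_fraction * remaining)
    mutation_children = remaining - crossover_children
    return elite_count, crossover_children, mutation_children

antecedent, consequent = count_fis_parameters(n_rules=4, n_inputs=3)
total_decision_variables = antecedent + consequent
elite, crossed, mutated = population_breakdown(400, 20, 0.7)
```

Under these assumptions, 4 rules give 24 antecedent and 16 consequent parameters (40 decision variables), and a population of 400 splits into 20 elite, 266 crossover and 114 mutation children.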
Predictor model (error unit) | Method | Validation error | Training error |
---|---|---|---|
Significant wave height (m) | ANFIS | 0.3101 | 0.1705 |
Significant wave height (m) | FIS and GA | 0.2920 | 0.1604 |
Peak spectral period (s) | ANFIS | 0.3815 | 0.2114 |
Peak spectral period (s) | FIS and GA | 0.3421 | 0.2018 |
Application of the combined FIS and GA model
The RMSEs of the validation and training data sets are also presented in Table 3. As reported in the table, the GA model employed here has successfully decreased the RMSE for both the significant wave height and the peak spectral period predictor models. In the best run, the GA decreased the RMSE to 0.1533 m for the wave height predictor model and to 0.2045 s for the peak spectral period predictor model. In the generation process, the population size of the GA for both predictor models is 400, the crossover fraction is 0.7, the number of elite chromosomes is 20, and the remaining children are produced by mutation. The validation errors obtained for the two predictor models are 0.2911 m and 0.3461 s, respectively. These results show the efficiency of the combined GA and FIS models in predicting wave parameters, although the final evaluation of the developed models depends on their performance on the testing data, which were never used during the training process.
Predictor model (error unit) | Method | Validation error | Training error |
---|---|---|---|
Significant wave height (m) | Combined FIS and GA | 0.2911 | 0.1533 |
Peak spectral period (s) | Combined FIS and GA | 0.3461 | 0.2045 |
THE CEM METHOD
Evaluation of presented approaches against known methods with identical input variables is needed to verify the models and also to create a sound conclusion. To achieve this, in this paper the CEM formulas are employed for prediction of wave parameters and the following paragraphs outline this empirical method.
EVALUATION OF THE COMBINED FIS AND GA MODELS
Following the estimation of parameters, the results of the wave prediction models are reported in Table 4. From this table, it can be inferred that, in the studied case, the combined FIS and GA models are more accurate even than the ANFIS models whose structures have been optimized by the GA. All of the models have reasonable bias, indicating their accuracy in predicting the phenomenon.
Wave parameter | Combined FIS and GA: SI (%) | Combined FIS and GA: Bias | CEM: SI (%) | CEM: Bias | ANFIS: SI (%) | ANFIS: Bias |
---|---|---|---|---|---|---|
Significant wave height | 20.03 | 0.0034 | 42.85 | 0.241 | 22.1 | 0.045 |
Peak spectral period | 8.12 | 0.0131 | 58.2 | 0.347 | 9.97 | 0.0199 |
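The two statistics reported in the table can be computed as in the sketch below. The definitions used here are the ones commonly adopted in wave hindcasting studies and are assumptions, since the defining equations are not given in the text: bias is taken as the mean of predicted minus observed values, and the scatter index (SI) as the RMSE divided by the mean observed value, expressed in percent. O and P follow the nomenclature (observed and predicted values); the function names are hypothetical.

```python
import math

def bias(observed, predicted):
    """Mean difference between predicted and observed values (assumed definition)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def scatter_index(observed, predicted):
    """RMSE normalized by the mean observed value, in percent (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    mean_observed = sum(observed) / n
    return 100.0 * rmse / mean_observed
```

A bias close to zero indicates no systematic over- or under-prediction, while a lower SI indicates smaller errors relative to the typical magnitude of the observed parameter.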
SUMMARY AND CONCLUSIONS
Recently, soft computing tools such as ANFISs and ANNs have been used in the prediction of wave parameters, i.e., significant wave height and peak spectral period. The ANFIS is a gradient-based method in which the antecedent and consequent parameters of the fuzzy IF-THEN rules are optimized along the gradient, so it can become trapped in a local optimum. Therefore, in this study the GA was used to extract fuzzy IF-THEN rules and to optimize their antecedent and consequent parameters simultaneously, in a model called the combined FIS and GA model. The combined FIS and GA models were then employed for prediction of wave parameters. The results not only indicated the accuracy of the combined FIS and GA models in predicting wave parameters but also showed that the GA could optimize both the fuzzy IF-THEN rules and the fuzzy antecedent and consequent parameters. In addition, it was inferred that, in the wave predictor models, optimizing the clustering parameters had a more important effect than optimizing the fuzzy antecedent and consequent parameters. Two directions can be considered for future work. The first is to obtain an FIS-based model that captures the three known conditions in the study area, i.e., fetch-limited, duration-limited, and fully developed sea conditions. The second is to define a new membership function that further reduces the prediction errors and yields a more robust algorithm.
ACKNOWLEDGEMENTS
This study was partially supported by the Deputy of Research at Golestan University (GU) and the author would like to sincerely thank them for their support during the study. Also, the author wishes to thank Dr Mahmood Hajiani, faculty member of Birjand University, for his constructive comments on the manuscript.