Accurate prediction of peak outflows from breached embankment dams is a key element of dam risk assessment. In this study, efficient models were developed to predict peak breach outflows using artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) techniques. Historical data from 93 embankment dam failures were used to train the models and evaluate their applicability. Two scenarios were applied with each model: either considering the whole data set without classification, or classifying the set into small dams (48 dams) and large dams (45 dams). In this way, nine models were developed, and their results were compared with each other and with the results of the best available regression equations and recent gene expression programming. Among the different models, the ANFIS model of the first scenario exhibited better performance based on its higher efficiency (E = 0.98), higher coefficient of determination (R² = 0.98), and lower mean absolute error (MAE = 840.9). Moreover, models based on classified data improved the prediction of peak outflows, particularly for small dams. Finally, this study indicated the potential of the developed ANFIS and ANN models as predictive tools for peak outflow rates of embankment dams.

Failure of embankment dams can cause catastrophic flooding and consequently presents a high risk to human life and property downstream. In order to prevent and mitigate such a hazard, dam owners and agencies responsible for dam safety carefully study, analyze, and inspect dams to identify significant failure modes. Overtopping and piping are the most frequently encountered failure modes causing breach of embankment dams (Wahl 1998). The breach parameters (time of failure and breach width) and the peak outflow are crucial in dam risk assessment. Accurate prediction of such parameters remains a challenging task. Prediction of peak breach outflows ($Q_p$), which is the main theme of this study, is an essential factor in preparing emergency action plans and designing early warning systems that might reduce or eliminate the consequences of dam failure. Several methods are available in the literature to predict the resulting $Q_p$, including: comparative analysis of similar case studies, predictor regression equations (RE) based on historic embankment dam failures, and physically based breach models using principles of hydraulics and sediment transport. Many of these methods apply unrealistic assumptions of linearity and suffer from uncertainty and lack of accurate data on a wide variety of dams (Wahl 1998). In practice, numerous studies have attempted to relate peak breach outflows ($Q_p$) to water height above breach invert at time of failure ($h_w$); some used dam height ($h_d$), and reservoir storage at normal pool ($S$) or volume of water behind the dam at failure ($V_w$), or combinations of the two (Wahl 1998; Pierce et al. 2010). Kirkpatrick (1977), the Soil Conservation Service (SCS 1981) and the US Bureau of Reclamation (USBR 1982) proposed best-fit linear REs for $Q_p$ as a function of $h_w$. Singh & Snorrason (1982, 1984) presented relations for $Q_p$ as linear functions of $h_d$ and $S$.
Hagen (1982) and MacDonald & Langridge-Monopolis (1984) defined the ‘breach formation factor’ as the product $V_w h_w$ and developed equations relating the breach formation factor to $Q_p$. Froehlich (1995) used multiple regression for 22 case studies and introduced an equation for $Q_p$ as a power function of both $V_w$ and $h_w$. Wahl (1998) performed a literature search and produced a single database comprising a total of 108 embankment dam failures. Later on, Wahl (2004) presented a quantitative analysis of the uncertainty of various RE for predicting $Q_p$ and stated that the equation offered by Froehlich (1995) had the best prediction performance. Pierce et al. (2010) expanded the breach database of Wahl (1998) by collecting information about an additional 44 case studies, performed linear, curvilinear, and multivariable regression analyses on the composite database, and developed expressions correlating $h_w$ and $V_w$ to $Q_p$. Hooshyaripor & Tahershamsi (2012) applied artificial neural networks (ANN) with different training algorithms and developed a model to predict $Q_p$. Hooshyaripor et al. (2013) derived statistical expressions to predict $Q_p$ based on observed data and generated synthetic data using a copula method. Duricic et al. (2013) proposed a model using the kriging approach to predict $Q_p$. Sattar (2014) developed new empirical formulae for predicting $Q_p$ using gene expression programming (GEP). From the above literature, it is seen that several prediction equations for $Q_p$ have been developed from the analysis of historic embankment dam failures under various simplifying assumptions about the relations between the considered hydraulic variables. Many of these prediction equations are unable to predict $Q_p$ accurately because of the complexity of the phenomena involved, nonlinearity, and uncertainty of data and parameters.
The ANN and the adaptive neuro-fuzzy inference system (ANFIS) have been used in several engineering problems as alternatives to traditional statistical models and have shown advantages because of their tolerance to data errors and their ability to perform nonlinear mapping between a given input and a desired output (Azmathullah et al. 2005; Azmathulla & Ahmad 2013). These facts point to the need for such improved prediction tools. As a result, this study was initiated to develop new models for prediction of $Q_p$ based on ANN and ANFIS techniques and to compare the results of these models with those of the best available RE and GEP. The adequacy of the models was assessed using basic statistical error criteria.

Development of the ANN and ANFIS models in this study was based on the historical data of the 93 dam failures collected and presented by Hooshyaripor & Tahershamsi (2012). The dam type, material, and mode of failure were collected and added to these data as presented in Table 1. This table contains the required data for the variables to be used in this study.

Table 1

Database from historical embankment failures

No.  Dam name  $V_w$ × 10⁶ (m³)  $h_w$ (m)  $Q_p$ (m³/s)  $D_t$  $D_f$  $D_e$  Reference
1 Apishapa, USA 22.20 28.00 6,850 HD HE Xu & Zhang (2009)
2 Armando de S. Oliveira, Brazil 25.90 35.00 7,195 HD – Tahershamsi et al. (2003)
3 Baldwin Hills, USA 0.91 12.20 1,130 HD HE Wahl (2014)
4 Banqiao, China 607.50 31.00 78,100 DC HE Xu & Zhang (2009)
5 Bayi, China 23.00 28.00 5,000 HD ME Xu & Zhang (2009)
6 Big Bay, USA 17.50 13.59 4,160 ZD ME Pierce et al. (2010)
7 Boystown, USA 0.36 8.96 65.13 – – – Pierce et al. (2010)
8 Bradfield, UK 3.20 28.96 1,150 HD – Wahl (1998)
9 Break Neck Run, USA 0.05 7.00 9.20 – – – Wahl (1998)
10 Buffalo Creek, USA 0.48 14.02 1,420 HD – Wahl (1998)  
11 Butler, USA 2.38 7.16 810 HD – Wahl (1998)  
12 Caney Coon Creek, USA 1.32 4.57 16.99 – – – Pierce et al. (2010)  
13 Castlewood, USA 6.17 21.60 3,570 DC ME Xu & Zhang (2009)  
14 Chenying, China 5.00 12.0 1,200 HD ME Xu & Zhang (2009)  
15 Cherokee Sandy, USA 0.44 5.18 8.50 – – – Pierce et al. (2010)  
16 Colonial #4, USA 0.04 9.91 14.16 – – – Pierce et al. (2010)  
17 Dam Site #8, USA 0.87 4.57 4,899 – – – Pierce et al. (2010)  
18 Danghe, China 10.70 24.50 2,500 DC LE Xu & Zhang (2009)  
19 Davis Reservoir, USA 58.00 11.58 510 FD ME Xu & Zhang (2009)  
20 Dells, USA 13.0 18.30 5,440 DC HE Wahl (2014)  
21 DMAD, USA 19.70 8.80 793 HD – – Pierce et al. (2010)  
22 Dongchuankou, China 27.00 31.00 21,000 HD HE Xu & Zhang (2009)  
23 Eigiau, UK 4.52 10.50 400 – – – Singh & Scarlatos (1988)  
24 Elk City, USA 1.18 9.44 608.79 DC ME Tahershamsi et al. (2003)  
25 Frankfurt, Germany 0.35 8.23 79 HD LE Xu & Zhang (2009)  
26 Fred Burr, USA 0.75 10.20 654 HD – Wahl (2014)  
27 French Landing, USA 3.87 8.53 929 HD HE Xu & Zhang (2009)  
28 Frenchman, USA 16.00 10.80 1,420 HD ME Xu & Zhang (2009)  
29 Frias, Argentina 0.25 15.00 400 FD ME Xu & Zhang (2009)  
30 Goose Creek, USA 10.60 1.37 492 HD ME Tahershamsi et al. (2003)  
31 Gouhou, China 3.18 44.00 2,050 FD LE Xu & Zhang (2009)  
32 Grand Rapids, USA 0.26 7.50 7.50 DC ME Singh & Scarlatos (1988)  
33 Hatchtown, USA 14.80 16.80 3,080 ZD HE Wahl (2014)  
34 Hatfield, USA 12.30 6.80 3,400 DC HE Wahl (2014)  
35 Haymaker, USA 0.37 4.88 26.90 – – – Pierce et al. (2010)  
36 Hell Hole, USA 30.60 35.10 7,360 HD ME Wahl (2014)  
37 Hemet, USA 8.63 6.09 1,600 – – – Tahershamsi et al. (2003)  
38 Horse Creek, USA 12.80 7.01 3,890 FD ME Xu & Zhang (2009)  
39 Horse Creek #2, USA 4.80 12.50 311.49 – – – Pierce et al. (2010)  
40 Huqitang, China 0.42 5.10 50 HD LE Xu & Zhang (2009)  
41 Ireland No. 5, USA 0.16 3.81 110 HD – Froehlich (1995)  
42 Johnstown, USA 18.90 22.25 7,079.20 ZD ME Wahl (1998)  
43 Kelly Barnes, USA 0.78 11.30 680 HD HE Xu & Zhang (2009)  
44 Knife Lake, USA 9.86 6.10 1,098.66 – – – Tahershamsi et al. (2003)  
45 Kodaganar, India 12.30 11.50 1,280 HD ME Xu & Zhang (2009)  
46 Lake Avalon, USA 31.50 13.70 2,321.90 HD – Tahershamsi et al. (2003)  
47 Lake Latonka, USA 4.09 6.25 290 HD ME Xu & Zhang (2009)  
48 Lake Tanglewood, USA 4.85 16.76 1,351 – – – Pierce et al. (2010)  
49 Laurel Run, USA 0.56 14.10 1,050 HD – Froehlich (1995)  
50 Lawn Lake, USA 0.80 6.71 510 HD HE Wahl (2014)  
51 Lijiaju, China 1.14 25.00 2,950 HD ME Xu & Zhang (2009)  
52 Lily Lake, USA 0.09 3.35 71 HD – Froehlich (1995)  
53 Little Deer Creek, USA 1.36 22.90 1,330 HD HE Xu & Zhang (2009)  
54 Little Wewoka, USA 0.99 9.45 42.48 – – – Pierce et al. (2010)  
55 Liujiatai, China 40.54 35.90 28,000 DC ME Xu & Zhang (2009)  
56 Lower Latham, USA 7.08 5.79 340 HD – Froehlich (1995)  
57 Lower Reservoir, USA 0.60 9.60 157.44 DC – Pierce et al. (2010)  
58 Lower T. Medicine, USA 19.60 11.30 1,800 HD HE Xu & Zhang (2009)  
59 Mahe, China 23.40 19.50 4,950 HD HE Xu & Zhang (2009)  
60 Mammoth, USA 13.60 21.30 2,520 DC ME Xu & Zhang (2009)  
61 Martin Cooling, USA 136.00 8.53 3,115 FD HE Wahl (2014)  
62 Middle Clear Boggy, USA 0.44 4.57 36.81 – – – Pierce et al. (2010)  
63 Mill River, USA 2.50 13.10 1,645 – – – Wahl (1998)  
64 Murnion, USA 0.32 4.27 17.50 – – – Pierce et al. (2010)  
65 Nanaksagar, India 210.00 15.85 9,709.50 – – – Tahershamsi et al. (2003)  
66 North Branch, USA 0.02 5.49 29.50 HD – – Wahl (1998)  
67 Oros, Brazil 660.00 35.80 9,630 ZD LE Xu & Zhang (2009)  
68 Otto Run, USA 0.01 5.79 60 HD – – Singh & Scarlatos (1988)  
69 Owl Creek, USA 0.12 4.88 31.15 – – – Pierce et al. (2010)  
70 Peter Green, USA 0.02 3.96 4.42 – – – Pierce et al. (2010)  
71 Prospect, USA 3.54 1.68 116 HD HE Xu & Zhang (2009)  
72 Puddingstone, USA 0.62 15.20 480 HD – Froehlich (1995)  
73 Qielinggou, China 0.70 18.00 2,000 HD HE Xu & Zhang (2009)  
74 Quail Creek, USA 30.80 16.70 3,110 HD ME Xu & Zhang (2009)  
75 Salles Oliveira, Brazil 71.50 38.40 7,200 HD – Wahl (2014)  
76 Sandy Run, USA 0.06 8.53 435 HD – Singh & Scarlatos (1988)  
77 Schaeffer Reservoir, USA 4.44 30.50 4,500 DC HE Xu & Zhang (2009)  
78 Shimantan, China 117.00 27.40 30,000 HD HE Xu & Zhang (2009)  
79 Sinker Creek, USA 3.33 21.34 926 HD – Pierce et al. (2010)  
80 Site Y-30–95, USA 0.14 7.47 144.42 – – – Pierce et al. (2010)  
81 Site Y-31 A–5, USA 0.39 9.45 36.98 – – – Pierce et al. (2010)  
82 Site Y-36–25, USA 0.04 9.75 2.12 – – – Tahershamsi et al. (2003)  
83 South Fork, USA 18.90 24.60 8,500 – – – Froehlich (1995)  
84 S. Fork Tributary, USA 0.0037 1.83 122 HD – – Pierce et al. (2010)  
85 Stevens Dam, USA 0.08 4.27 5.92 – – – Pierce et al. (2010)  
86 Swift, USA 37.00 47.85 24,947 FD ME Xu & Zhang (2009)  
87 Taum S. Reservoir, USA 5.39 31.46 7,743 HD – Wahl (2014)  
88 Teton, USA 310.0 77.40 65,120 ZD ME Xu & Zhang (2009)  
89 Upper Clear Boggy, USA 0.86 6.10 70.79 – – – Pierce et al. (2010)  
90 Upper Red Rock, USA 0.25 4.57 8.50 – – – Pierce et al. (2010)  
91 Weatland No. 1, USA 11.60 12.20 566.34 HD – Pierce et al. (2010)  
92 Zhugou, China 18.43 23.50 11,200 DC HE Xu & Zhang (2009)  
93 Zuocun, China 40.0 35.00 23,600 DC HE Xu & Zhang (2009)  

$D_t$ = dam type; $D_f$ = failure mode; O = overtopping; P = seepage erosion/piping; HD = homogeneous dams; DC = dams with corewalls; FD = concrete-faced dams; ZD = zoned-fill dams; $D_e$ = dam erodibility; LE = low erodibility; ME = medium erodibility; HE = high erodibility.

The models were developed by employing two effective input variables that are known to have a direct effect on the phenomenon: namely, the height ($h_w$) and volume ($V_w$) of water behind the dam at failure. The desired output is the peak breach outflow ($Q_p$). Two scenarios were applied with each of the ANN and ANFIS models. The first scenario considered the whole data as one set without classification and the second classified the data into small dams (48 dams) and large dams (45 dams). The second scenario was proposed after noting that the available models and RE are unable to predict $Q_p$ of small dams at a reasonable level. Like large dams, small dams are not risk-free. From the physical standpoint, the two types of dams behave differently depending on several factors, including material composition, compaction conditions, dam geometry (height, side slopes, etc.), and reservoir capacity. When overtopping occurs, it often causes erosion of dam material starting at a weak point at the dam crest. This results in a vertically directed breach at that point, which continues until it reaches a non-erodible layer (e.g., the dam base). The breach then expands laterally to an extent depending on the reservoir capacity (Singh 1996; Wahl 1998). Larger breach widths result from large reservoir capacities and small dam heights. In such configurations the non-erodible layer at the dam base is quickly reached by the vertical breach erosion, and the breach then spreads laterally. The combination of these two factors (dam height and reservoir capacity) is usually considered a measure of potential risk downstream of the dam. The failure of a higher dam usually generates a larger $Q_p$ due to its higher potential energy compared with a small dam having the same storage capacity (Xu & Zhang 2009). Conversely, for both types of dams, the design of the riprap protection against wave action on the upstream slopes is independent of dam height; it depends essentially on the reservoir size (fetch and location).

In this study, the ANN models were trained using two network types: the neural networks tool (NNTool) and the neural networks fitting tool (NNFTool). The ANFIS model considered the Takagi–Sugeno-type fuzzy model (Takagi & Sugeno 1985), in which the antecedent part is a fuzzy proposition using Gaussian membership functions and the consequent part is a first-order polynomial (linear) function. The data vectors of the input and output variables were loaded into ANN (NNTool and NNFTool), giving two models for the first scenario and four models for the second scenario, and into ANFIS, giving one model for the first scenario and two models for the second scenario. In this way, nine models were developed and analyzed. The results of the training and testing phases were judged against separate sets of the observed values. Basic error criteria were calculated to assess the adequacy of the developed models.

ANFIS

An ANFIS combines fuzzy logic with neural networks in order to obtain better results for systems with nonlinear behavior and uncertain variables and data. The ANFIS can be described as a fuzzy inference system equipped with a training algorithm (Jang 1993). It consists of an IF-THEN fuzzy rule base, the membership functions used in the fuzzy rules, and a reasoning mechanism that performs the inference procedure upon the rules in order to obtain the desired output. The ANFIS uses a hybrid-learning rule combining back-propagation gradient descent with a least-squares algorithm to identify and optimize the Sugeno system's parameters. ANFIS modeling has been used in applications across many branches of engineering; however, no model is currently available for predicting $Q_p$ with this technique. In the present study, an ANFIS model is developed to predict $Q_p$ as a function of $h_w$ and $V_w$ under two scenarios, as given in Table 2, as follows:

  1. One model is developed using all available data vectors of ($h_w$), ($V_w$), and ($Q_p$) from the 93 historical dam failures in Table 1 without classification. This database is randomly subdivided into two sets without any pre-selection process. The bigger set (73 dams) is used in the training phase of the model and the smaller set (20 dams) is used in its testing phase.

  2. Two models were developed after classification of the data into large dams (45 dams) and small dams (48 dams), with one model prepared for each class. According to the International Commission on Large Dams (ICOLD), a dam lower than 15 m is a small dam, whereas a dam higher than 15 m, or between 10 and 15 m high with a sufficiently large reservoir volume, is a large dam (Singh 1996). The data in each class are further subdivided into two sets: one is used in the training phase of the models (39 small dams and 37 large dams) and the other in their testing phase (nine small dams and eight large dams).

The ANFIS models were trained and tested with the ANFIS editor of MATLAB® V7.10 (R2010a). The models were developed using the following steps in the ANFIS graphical user interface (GUI): (1) obtaining training data, (2) data sizing, (3) data partitioning, and (4) loading the data set. Table 3 presents the ranges and the linguistic labels of the fuzzy membership functions (MFs) of the input variables. Several trials were needed to reach the optimum number and shape of MFs that result in reliable estimates of the output. Figures 1–3 show the MFs of the developed ANFIS models for predicting $Q_p$ in both scenarios. To illustrate the ANFIS method using a first-order Takagi–Sugeno fuzzy model, consider a rule base consisting of two fuzzy IF-THEN rules expressed as follows:

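    Rule 1: If $h_w$ is $A_1$ and $V_w$ is $B_1$ then $f_1 = p_1 h_w + q_1 V_w + r_1$

    Rule 2: If $h_w$ is $A_2$ and $V_w$ is $B_2$ then $f_2 = p_2 h_w + q_2 V_w + r_2$

where $A_1$, $A_2$ and $B_1$, $B_2$ are the MFs of $h_w$ and $V_w$, respectively; $p_i$, $q_i$ and $r_i$ ($i$ = 1 or 2) are linear parameters in the consequent part of the first-order Takagi–Sugeno model.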
Figure 1

MFs for input variables (first scenario).

Figure 2

MFs for input variables (second scenario – small dams).

Figure 3

MFs for input variables (second scenario – large dams).

Table 2

The scenarios used to train the ANFIS models for prediction of $Q_p$

| Scenarios | Inputs | Custom ANFIS |
| --- | --- | --- |
| S1 | $h_w$, $V_w$; 93 case studies | Membership function type: Gaussian MF; Number of MFs: (4, 4) functions; Learning algorithm: hybrid learning algorithms; Sugeno type-system: first order; Output type: linear |
| S2 | $h_w$, $V_w$; small dams, 48 case studies | Membership function type: Gaussian MF; Number of MFs: (3, 3) functions; Learning algorithm: hybrid learning algorithms; Sugeno type-system: first order; Output type: linear |
| | $h_w$, $V_w$; large dams, 45 case studies | Membership function type: Gaussian MF; Number of MFs: (4, 4) functions; Learning algorithm: hybrid learning algorithms; Sugeno type-system: first order; Output type: linear |
Table 3

The linguistic labels for fuzzy membership functions of input variables

| Scenarios | The variable | The linguistic variable |
| --- | --- | --- |
| S1 | $h_w$ (m); range of data 1.37 to 77.4 m | SH; M; H; VH (Short; Medium; High; Very High) |
| | $V_w$ (Mm³); range of data 0.0037 to 660 | L; M; H; VH (Low; Medium; High; Very High) |
| S2, small dams | $h_w$ (m); range of data 1.37 to 15 m | SH; M; MH; H (Short; Medium; MedHigh; High) |
| | $V_w$ (Mm³); range of data 0.0037 to 136 | L; M; MH; H (Low; Medium; MedHigh; High) |
| S2, large dams | $h_w$ (m); range of data 10.5 to 77.4 m | SH; M; H (Short; Medium; High) |
| | $V_w$ (Mm³); range of data 0.617 to 660 | L; M; H (Low; Medium; High) |
These parameters have to be determined in the training process besides premise parameters which belong to MFs. The ANFIS architecture consists of five layers as illustrated in Figure 4.
Figure 4

Architecture of the ANFIS model.

In Figure 4, Layer 1 consists of adaptive nodes that assign membership degrees ($\mu_{A_i}$ or $\mu_{B_i}$) for linguistic labels (small, medium, large, etc.) depending on the premise input variables. For generalized bell MFs, the output node function in layer 1, $O_{1,i}$, is given by:

$$O_{1,i} = \mu_{A_i}(h_w) = \frac{1}{1 + \left| \dfrac{h_w - c_i}{a_i} \right|^{2 b_i}}$$

where {$a_i$, $b_i$, $c_i$} is the parameter set of the MFs in the premise part of the fuzzy IF-THEN rules that adjusts the shapes of the MFs. Table 4 shows the rule base of one of the ANFIS models.
Table 4

Rule base of one of the ANFIS models (first scenario)

| Rule no. | Rule |
| --- | --- |
| 1 | If $h_w$ (m) is SH and $V_w$ (Mm³) is L then ($Q_p$ (m³/s) is out1mf1) (1) |
| 2 | If $h_w$ (m) is M and $V_w$ (Mm³) is L then ($Q_p$ (m³/s) is out1mf2) (1) |
| 3 | If $h_w$ (m) is H and $V_w$ (Mm³) is L then ($Q_p$ (m³/s) is out1mf3) (1) |
| 4 | If $h_w$ (m) is VH and $V_w$ (Mm³) is L then ($Q_p$ (m³/s) is out1mf4) (1) |
| 5 | If $h_w$ (m) is SH and $V_w$ (Mm³) is M then ($Q_p$ (m³/s) is out1mf5) (1) |
| 6 | If $h_w$ (m) is M and $V_w$ (Mm³) is M then ($Q_p$ (m³/s) is out1mf6) (1) |
| 7 | If $h_w$ (m) is H and $V_w$ (Mm³) is M then ($Q_p$ (m³/s) is out1mf7) (1) |
| 8 | If $h_w$ (m) is VH and $V_w$ (Mm³) is M then ($Q_p$ (m³/s) is out1mf8) (1) |
| 9 | If $h_w$ (m) is SH and $V_w$ (Mm³) is H then ($Q_p$ (m³/s) is out1mf9) (1) |
| 10 | If $h_w$ (m) is M and $V_w$ (Mm³) is H then ($Q_p$ (m³/s) is out1mf10) (1) |
| 11 | If $h_w$ (m) is H and $V_w$ (Mm³) is H then ($Q_p$ (m³/s) is out1mf11) (1) |
| 12 | If $h_w$ (m) is VH and $V_w$ (Mm³) is H then ($Q_p$ (m³/s) is out1mf12) (1) |
| 13 | If $h_w$ (m) is SH and $V_w$ (Mm³) is VH then ($Q_p$ (m³/s) is out1mf13) (1) |
| 14 | If $h_w$ (m) is M and $V_w$ (Mm³) is VH then ($Q_p$ (m³/s) is out1mf14) (1) |
| 15 | If $h_w$ (m) is H and $V_w$ (Mm³) is VH then ($Q_p$ (m³/s) is out1mf15) (1) |
| 16 | If $h_w$ (m) is VH and $V_w$ (Mm³) is VH then ($Q_p$ (m³/s) is out1mf16) (1) |

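Layer 2 presents the firing strength of each rule. The output of each node is the fuzzy AND (minimum) of all membership degrees:

$$O_{2,i} = w_i = \min\left( \mu_{A_i}(h_w),\ \mu_{B_i}(V_w) \right)$$

where $\mu_{A_i}$ and $\mu_{B_i}$ are the layer 1 membership degrees of $h_w$ and $V_w$ for rule $i$.

Layer 3 outputs are the normalized firing strengths. Each node is a fixed node labeled N. The output of the $i$th node is calculated as:

$$O_{3,i} = \bar{w}_i = \frac{w_i}{\sum_j w_j}$$

Layer 4 consists of adaptive nodes that calculate the rule outputs based upon the consequent parameters $p_i$, $q_i$, and $r_i$ using the function:

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left( p_i h_w + q_i V_w + r_i \right)$$

Layer 5 transforms (defuzzifies) each rule's fuzzy results to a crisp output: $O_5 = \sum_i \bar{w}_i f_i = \sum_i w_i f_i / \sum_i w_i$.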
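The five layers described above can be sketched as a single forward pass. The following Python sketch is illustrative only: it assumes Gaussian MFs (as listed in Table 2), a minimum AND operator, and a two-rule base; the function names and all numerical parameters are hypothetical and are not the fitted values of the study.

```python
import numpy as np


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set centered at `center`."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_forward(h_w, v_w, premise_h, premise_v, consequent):
    """Forward pass of a two-input, first-order Takagi-Sugeno system.

    premise_h, premise_v: per-rule (center, sigma) pairs for h_w and V_w.
    consequent: per-rule (p_i, q_i, r_i) linear parameters.
    Returns the crisp output, the normalized firing strengths, and rule outputs.
    """
    premise_h = np.asarray(premise_h, dtype=float)
    premise_v = np.asarray(premise_v, dtype=float)
    consequent = np.asarray(consequent, dtype=float)

    # Layer 1: membership degrees of each input for each rule
    mu_h = gaussian_mf(h_w, premise_h[:, 0], premise_h[:, 1])
    mu_v = gaussian_mf(v_w, premise_v[:, 0], premise_v[:, 1])
    # Layer 2: firing strength as fuzzy AND (minimum)
    w = np.minimum(mu_h, mu_v)
    # Layer 3: normalized firing strengths (assumes at least one rule fires)
    w_bar = w / w.sum()
    # Layer 4: first-order rule outputs f_i = p_i*h_w + q_i*V_w + r_i
    f = consequent @ np.array([h_w, v_w, 1.0])
    # Layer 5: crisp output as the weighted sum of rule outputs
    return float(np.sum(w_bar * f)), w_bar, f


# Hypothetical parameters, chosen only to make the example run
PREMISE_H = [(5.0, 5.0), (30.0, 15.0)]      # (center, sigma) in m
PREMISE_V = [(1.0, 10.0), (100.0, 80.0)]    # (center, sigma) in Mm^3
CONSEQUENT = [(10.0, 5.0, 0.0), (200.0, 20.0, 100.0)]

q_p, w_bar, f = sugeno_forward(12.0, 3.0, PREMISE_H, PREMISE_V, CONSEQUENT)
print(q_p, w_bar)
```

Because the normalized firing strengths sum to one, the crisp output is always a weighted average of the individual rule outputs.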

In each iteration during the training of the ANFIS model, the node outputs are calculated up to layer 4. At layer 5, the consequent parameters are calculated using a least-squares regression method. The output of the ANFIS is calculated and the errors propagated back through the layers in order to determine the premise parameter (layer 1) updates (Jang et al. 1997). The criterion chosen for the development of the ANFIS model as shown in Table 2 was based on the selection of the number and type of MFs, learning algorithm, iteration size, and data size. The modeling criterion adopted was to effectively tune the MFs to minimize the output error and maximize performance index.
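The least-squares half of the hybrid learning step can be illustrated as follows. With the premise (MF) parameters held fixed, the ANFIS output is linear in the consequent parameters, so they can be estimated by ordinary least squares. This is a minimal sketch under stated assumptions: Gaussian MFs, a minimum AND operator, synthetic data, and hypothetical function names; it omits the back-propagation update of the premise parameters and is not the MATLAB implementation used in the study.

```python
import numpy as np


def normalized_firing(h_w, v_w, centers_h, sigmas_h, centers_v, sigmas_v):
    """Normalized firing strengths, shape (n_samples, n_rules)."""
    h_w = np.asarray(h_w, dtype=float)[:, None]
    v_w = np.asarray(v_w, dtype=float)[:, None]
    mu_h = np.exp(-0.5 * ((h_w - np.asarray(centers_h)) / np.asarray(sigmas_h)) ** 2)
    mu_v = np.exp(-0.5 * ((v_w - np.asarray(centers_v)) / np.asarray(sigmas_v)) ** 2)
    w = np.minimum(mu_h, mu_v)                 # fuzzy AND (minimum)
    return w / w.sum(axis=1, keepdims=True)    # normalize per sample


def fit_consequents(h_w, v_w, q_p, w_bar):
    """Least-squares estimate of (p_i, q_i, r_i) for fixed premise parameters."""
    inputs = np.column_stack([h_w, v_w, np.ones(len(h_w))])      # (n, 3)
    # Each rule contributes three columns: w_bar_i * [h_w, V_w, 1]
    design = (w_bar[:, :, None] * inputs[:, None, :]).reshape(len(inputs), -1)
    theta, *_ = np.linalg.lstsq(design, np.asarray(q_p, dtype=float), rcond=None)
    return theta.reshape(-1, 3), design


# Synthetic demonstration data (not the dam-failure database)
rng = np.random.default_rng(0)
h = rng.uniform(1.0, 40.0, 30)
v = rng.uniform(0.1, 100.0, 30)
w_bar = normalized_firing(h, v, [5.0, 30.0], [10.0, 20.0], [1.0, 60.0], [30.0, 50.0])
true_params = np.array([[10.0, 5.0, 0.0], [200.0, 20.0, 100.0]])
q = np.sum(w_bar * (np.column_stack([h, v, np.ones(30)]) @ true_params.T), axis=1)

params, design = fit_consequents(h, v, q, w_bar)
print(np.max(np.abs(design @ params.ravel() - q)))  # residual of the fit
```

In a full hybrid iteration, this least-squares solve would alternate with a gradient-descent update of the MF centers and widths.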

ANN

Based on the literature, the ANN technique can readily be applied to nonlinear complex systems that involve pattern recognition. ANNs resemble the functioning of the human brain and are built as complex interconnected sets of units, each taking a number of real-valued inputs and producing a single real-valued output. Neural Network Toolbox™ helps in creating, training, and simulating neural networks by providing many functions and applications. NNTool and NNFTool are the GUI tools included in Neural Network Toolbox. In this study, the input and output data are first loaded into these tools. Then the network type, training function, adaption learning function, performance function, and number of layers are chosen, and the neural network is created. A multi-layer feed-forward perceptron neural network with the back-propagation training method was used in this study (Rojas 1996). In the present ANN models there is an input layer for $h_w$ and $V_w$, one processing hidden layer, and a final processing layer for $Q_p$. Figure 5 shows the three-layer topology of the ANN model.
Figure 5

Structure of ANN model.


Each node in the hidden layer receives and processes weighted input from the input layer and transmits its output to the next layer through links. Each link is assigned a weight, which is a numerical estimate of the connection strength. The weighted summation of inputs to a node is converted to an output according to a transfer function: the sigmoid function (SIG) for NNFTool and the logarithmic sigmoid function (LOGSIG) for NNTool. The activation pattern of the input variables ($h_w$, $V_w$) is propagated through the network to produce the output target ($Q_p$). The two scenarios described before were applied: the first uses the data as one set without classification, and the second is applied after classification of the data into small dams and large dams. Accordingly, two models for the first scenario and four models for the second scenario were developed. The input and target data vectors from the 93 embankment dam failures are randomly divided into three sets, where ∼70% of the vectors are used to train the network and ∼30% are used to validate and test how well the network generalizes. Training on the training vectors continues as long as it reduces the network's error on the validation vectors. Once the validation error stops decreasing, which indicates that the network has started to memorize the training set, training is stopped. This technique helps avoid the problem of overfitting encountered in many optimization and learning algorithms. The structures of the ANN models together with the data division for the first and second scenarios are presented in Tables 5 and 6, respectively. The network used the Levenberg–Marquardt algorithm (LMA) as the learning function, and LOGSIG was selected as the transfer function. In order to avoid running out of memory in the case of a very large network, the scaled conjugate gradient back-propagation (trainscg) was used in this study as the training function of the large dams' models.
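The data division and validation-based stopping described above can be sketched in a few lines. The example below is an assumption-laden illustration rather than the toolbox procedure: it uses a 2-5-1 network with a log-sigmoid hidden layer and a linear output, plain gradient descent in place of the Levenberg–Marquardt algorithm, a 70/15/15 split, a hypothetical `patience` parameter, and synthetic inputs scaled to [0, 1].

```python
import numpy as np


def split_indices(n, frac_train=0.70, frac_val=0.15, seed=0):
    """Randomly split n sample indices into training, validation, and test sets."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(round(frac_train * n))
    n_val = int(round(frac_val * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def logsig(z):
    """Logarithmic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(params, x):
    """2-5-1 network: log-sigmoid hidden layer, linear output layer."""
    w1, b1, w2, b2 = params
    hidden = logsig(x @ w1 + b1)
    return hidden @ w2 + b2, hidden


def mse(params, x, y):
    pred, _ = forward(params, x)
    return float(np.mean((pred - y) ** 2))


def train_early_stopping(x, y, train, val, lr=0.1, max_epochs=5000,
                         patience=50, seed=0):
    """Gradient descent on the training set; stop when validation MSE stalls."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0.0, 0.5, (2, 5)), np.zeros(5),
              rng.normal(0.0, 0.5, (5, 1)), np.zeros(1)]
    best_val, best_params, wait = np.inf, None, 0
    n = len(train)
    for epoch in range(max_epochs):
        pred, hidden = forward(params, x[train])
        err = (pred - y[train]) * (2.0 / n)            # dMSE/dpred
        delta = (err @ params[2].T) * hidden * (1.0 - hidden)
        grads = (x[train].T @ delta, delta.sum(axis=0),
                 hidden.T @ err, err.sum(axis=0))
        for p, g in zip(params, grads):
            p -= lr * g                                # in-place weight update
        val_err = mse(params, x[val], y[val])
        if val_err < best_val:                         # validation still improving
            best_val, wait = val_err, 0
            best_params = [p.copy() for p in params]
        else:
            wait += 1
            if wait >= patience:                       # stop: no recent improvement
                break
    return best_params, best_val, epoch + 1


# Synthetic, pre-scaled data standing in for (h_w, V_w) -> Q_p
rng = np.random.default_rng(42)
x_demo = rng.uniform(0.0, 1.0, (93, 2))
y_demo = x_demo[:, :1] ** 1.2 * x_demo[:, 1:] ** 0.3
train_idx, val_idx, test_idx = split_indices(93)
best, best_val, epochs = train_early_stopping(x_demo, y_demo, train_idx, val_idx)
print(len(train_idx), len(val_idx), len(test_idx), epochs, best_val)
```

The weights returned are those with the lowest validation error, not the final ones, which is the usual way to apply this stopping criterion; the held-out test indices are left untouched for the final assessment.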

Table 5

ANN model's structure (without classification of data: S1)

| ANN type | Training algorithm | Transfer function | Learning function | Network architecture | Training (no. of dams) | Validation (no. of dams) | Testing (no. of dams) |
|---|---|---|---|---|---|---|---|
| NNFTool | BP | SIGMOID | SIGMOID | 2-5-1 | 55 | 19 | 19 |
| NNTool | BP | LOGSIG | LMA | 2-5-1 | 65 | 14 | 14 |

Table 6

ANN model's structure (after classification of data: S2)

| ANN type | Training algorithm | Transfer function | Learning function | Network architecture | Training (no. of dams) | Validation (no. of dams) | Testing (no. of dams) |
|---|---|---|---|---|---|---|---|
| NNFTool | BP | SIGMOID | SIGMOID | 2-5-1 | 28 SD | 10 SD | 10 SD |
| | | | | | 27 LD | 9 LD | 9 LD |
| NNTool | BP | LOGSIG | LMA | 2-5-1 | 28 SD | 10 SD | 10 SD |
| | | TANSIG | | | 27 LD | 9 LD | 9 LD |

SD = small dams; LD = large dams.

The predicted peak breach outflow ($Q_{p,i}^{\mathrm{pred}}$) is compared with the observed peak breach outflow ($Q_{p,i}^{\mathrm{obs}}$) to determine the mean square error (MSE) of the prediction of both models, with

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Q_{p,i}^{\mathrm{obs}} - Q_{p,i}^{\mathrm{pred}}\right)^{2} \qquad (1)$$

where n is the number of data points. In the backward computation, the MSE between the predicted and observed peak outflows is back-propagated through the network from the output layer to the input layer, and the weights of the connections are modified according to the delta learning rule. This rule is a type of learning that uses gradient descent to search for the weights that best reduce the difference between the predicted and observed values. The forward and backward steps are repeated until the error of the network is minimized. The delta learning rule controls the learning process by changing the present weight based on past weight changes.
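
The forward pass, the MSE, and the gradient-descent weight update can be illustrated with a small NumPy sketch of a 2-5-1 network. It is an illustration under stated assumptions rather than the toolbox's training routine: the linear output node, the learning rate, the weight initialization, the synthetic data, and all function names are assumptions, and plain batch gradient descent is used in place of the Levenberg–Marquardt or scaled conjugate gradient algorithms named in the text.

```python
import numpy as np


def logsig(x):
    """Logarithmic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-x))


def mse(pred, obs):
    """Mean square error between predicted and observed values."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(np.mean((obs - pred) ** 2))


def init_params(rng, n_in=2, n_hidden=5):
    """Small random weights for a 2-5-1 network (illustrative choice)."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, X):
    """Forward pass: LOGSIG hidden layer, linear output node (assumed)."""
    hidden = logsig(X @ params["W1"] + params["b1"])
    output = (hidden @ params["W2"] + params["b2"])[:, 0]
    return hidden, output


def gradient_step(params, X, y, lr=0.05):
    """One batch gradient-descent update of all weights from the MSE gradient."""
    hidden, pred = forward(params, X)
    d_out = (2.0 / len(y)) * (pred - y)[:, None]        # dMSE/d(output), shape (n, 1)
    # Error propagated back through the LOGSIG hidden layer, shape (n, 5)
    d_hidden = (d_out @ params["W2"].T) * hidden * (1.0 - hidden)
    grads = {
        "W2": hidden.T @ d_out,
        "b2": d_out.sum(axis=0),
        "W1": X.T @ d_hidden,
        "b1": d_hidden.sum(axis=0),
    }
    return {name: params[name] - lr * grads[name] for name in params}


# Synthetic, scaled data standing in for two inputs and one target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, (40, 2))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2

params = init_params(rng)
initial_error = mse(forward(params, X)[1], y)
for _ in range(500):
    params = gradient_step(params, X, y)
final_error = mse(forward(params, X)[1], y)
print(initial_error, final_error)
```

Repeating the forward and backward steps lowers the MSE on the synthetic data; a momentum term could be added to make each update depend on the previous weight change.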

Available RE and GEP

Based on his uncertainty analysis, Wahl (2004) concluded that the Froehlich (1995) equation performed better than other available RE for prediction of the peak breach outflow. Recently, Wahl (2014) evaluated and compared new RE for predicting the peak breach outflow and stated that including the erodibility factor proposed by Xu & Zhang (2009), in addition to other physical and geometrical input parameters, significantly increased the accuracy of the predictions. Hence, the predictions of the present ANN and ANFIS models are compared with these equations. The predictions were also compared with the results of the recent RE developed by Hooshyaripor et al. (2013) and the GEP developed by Sattar (2014). These equations have the following forms.

Xu & Zhang (2009) ‘best’ RE:
3
where g = 9.806 m/s² is the acceleration of gravity; 15 m is a reference height; and the combined coefficient equals b3 + b4 + b5, in which b3 = −0.503, −0.591, and −0.649 for DC, FD, and HD/ZD dam types, respectively; b4 = −0.705 and −1.039 for overtopping and seepage erosion/piping, respectively; and b5 = −0.007, −0.375, and −1.362 for HE, ME, and LE dam erodibility, respectively.
Hooshyaripor et al. (2013) RE:
4
Sattar (2014) GEP:
5
where is the reservoir shape factor, . For , 4 is given for HD, 3 for DC, 2 for ZD, and 1 for FD. For , 3 is assigned for HE, 2 for ME, and 1 for LE. For , 1.1 denotes piping failure, and 1.2 overtopping failure. The variables , , and in these equations are measured in m3/s, m3, and m, respectively.
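
The categorical coefficients listed for the Xu & Zhang (2009) RE lend themselves to a simple lookup. The sketch below only evaluates the sum b3 + b4 + b5 for a given dam; it does not reproduce the full regression equation. The magnitudes come from the text, the signs are taken as negative (an assumption about how the values should be read), and the dictionary keys and the function name `combined_coefficient` are hypothetical.

```python
# Coefficient magnitudes as listed in the text; negative signs are assumed.
B3_DAM_TYPE = {"DC": -0.503, "FD": -0.591, "HD": -0.649, "ZD": -0.649}
B4_FAILURE_MODE = {"overtopping": -0.705, "piping": -1.039}
B5_ERODIBILITY = {"HE": -0.007, "ME": -0.375, "LE": -1.362}


def combined_coefficient(dam_type, failure_mode, erodibility):
    """Return b3 + b4 + b5 for one dam from its three categorical attributes."""
    return (
        B3_DAM_TYPE[dam_type]
        + B4_FAILURE_MODE[failure_mode]
        + B5_ERODIBILITY[erodibility]
    )


# Hypothetical example: homogeneous dam, overtopping failure, medium erodibility.
print(combined_coefficient("HD", "overtopping", "ME"))
```

Keeping the three categories in separate tables mirrors the way the coefficients are defined and makes an unknown category fail loudly with a `KeyError`.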

Reliability of the ANN and ANFIS models

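The performances of the ANN and ANFIS models versus the available RE and GEP were evaluated according to basic statistical criteria: the mean absolute error (MAE) in m³/s, the Nash & Sutcliffe (1970) coefficient of efficiency (E), and the coefficient of determination (R²), with

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Q_{p,i}^{\mathrm{obs}} - Q_{p,i}^{\mathrm{pred}}\right| \qquad (6)$$

$$E = 1 - \frac{\sum_{i=1}^{n}\left(Q_{p,i}^{\mathrm{obs}} - Q_{p,i}^{\mathrm{pred}}\right)^{2}}{\sum_{i=1}^{n}\left(Q_{p,i}^{\mathrm{obs}} - \overline{Q}_{p}^{\mathrm{obs}}\right)^{2}} \qquad (7)$$

where $Q_{p,i}^{\mathrm{obs}}$ and $Q_{p,i}^{\mathrm{pred}}$ are the observed and predicted peak breach outflows, n is the number of data points, and $\overline{Q}_{p}^{\mathrm{obs}}$ is the mean of the observed peak breach outflow. These criteria support the comparison of observed versus predicted peak outflows obtained by the developed models, RE and GEP. Since the uncertainty in predicting such a phenomenon is large, the 95% confidence interval, computed with a critical value of 1.96 and the sample standard deviation, can be used as a quantitative measure of that uncertainty.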
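
The three evaluation criteria can be computed as in the following NumPy sketch. The MAE and the Nash & Sutcliffe efficiency follow their standard definitions; computing the coefficient of determination as the squared Pearson correlation between observed and predicted values is an assumption, as are the function names and the example values, which are hypothetical and not taken from the dam database.

```python
import numpy as np


def mean_absolute_error(obs, pred):
    """Mean absolute error, in the units of the data (here m3/s)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean(np.abs(obs - pred)))


def nash_sutcliffe_efficiency(obs, pred):
    """Nash & Sutcliffe (1970) efficiency: 1 minus error variance ratio."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2))


def coefficient_of_determination(obs, pred):
    """Squared Pearson correlation between observed and predicted values (assumed form)."""
    r = np.corrcoef(np.asarray(obs, float), np.asarray(pred, float))[0, 1]
    return float(r ** 2)


# Hypothetical peak outflows (m3/s); predictions overshoot by a constant factor.
q_obs = [100.0, 400.0, 900.0, 1600.0]
q_pred = [150.0, 600.0, 1350.0, 2400.0]

print(mean_absolute_error(q_obs, q_pred))
print(nash_sutcliffe_efficiency(q_obs, q_pred))
print(coefficient_of_determination(q_obs, q_pred))
```

Reporting the three criteria together is useful because each one responds differently to systematic bias in the predictions.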

In this study, two models based on the first scenario and four models based on the second scenario were developed by ANN in order to predict . Similarly, one model using the first scenario and two models using the second scenario were developed by the ANFIS technique. The results of these models are presented here.

Results of first scenario (S1)

The ANN and ANFIS models in this scenario were developed using the unclassified data vectors of the two input variables and the peak breach outflow from the available 93 dam failures.

  1. The ANN models were feed-forward neural networks employing SIG and LOGSIG transfer functions as activators, with a back-propagation algorithm for network learning. The input and output data vectors were randomly separated into three subsets for each trial: a training set (65 dams), a validation set (14 dams), and a testing set (14 dams) for the NNTool; and a training set (55 dams), a validation set (19 dams), and a testing set (19 dams) for the NNFTool. The ANN models had three layers: an input layer, a hidden layer, and an output layer. The optimal number of neurons in the hidden layer was obtained by trial and error.

  2. In the ANFIS model, the available data are subdivided into two sets. The larger set (73 dams) is used in the training phase of the model and the smaller set (20 dams) is used in its testing phase. The ANFIS model used a first-order Sugeno system with four Gaussian MFs for each input variable (Figure 1), the hybrid learning algorithm, and a linear function for the output.
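
The inference step of a first-order Sugeno system with four Gaussian MFs per input can be sketched as below. This shows only the forward evaluation, not the hybrid learning algorithm, and it rests on several assumptions: a grid partition that pairs every MF of the first input with every MF of the second (16 rules), inputs scaled to [0, 1], and placeholder MF centers, widths, and consequent parameters. The function names are hypothetical.

```python
import numpy as np


def gaussian_mf(x, centers, sigmas):
    """Membership degrees of a scalar x in Gaussian MFs with given centers and widths."""
    return np.exp(-0.5 * ((x - centers) / sigmas) ** 2)


def sugeno_first_order(x1, x2, centers, sigmas, consequents):
    """Output of a two-input first-order Sugeno system.

    centers, sigmas: arrays of shape (2, n_mf), one row per input.
    consequents: array of shape (n_mf * n_mf, 3), one row [p, q, r] per rule,
                 so that each rule output is p * x1 + q * x2 + r.
    """
    mu1 = gaussian_mf(x1, centers[0], sigmas[0])
    mu2 = gaussian_mf(x2, centers[1], sigmas[1])
    firing = np.outer(mu1, mu2).ravel()            # product of memberships per rule
    weights = firing / firing.sum()                # normalized firing strengths
    rule_outputs = consequents @ np.array([x1, x2, 1.0])
    return float(weights @ rule_outputs)


# Placeholder parameters: four evenly spaced MFs per input, identical rule consequents.
centers = np.tile(np.linspace(0.0, 1.0, 4), (2, 1))
sigmas = np.full((2, 4), 0.2)
consequents = np.tile([2.0, -1.0, 0.5], (16, 1))

print(sugeno_first_order(0.3, 0.8, centers, sigmas, consequents))
```

Because the firing strengths are normalized, identical consequents in every rule reduce the system to that single linear function, which gives a convenient sanity check before any training.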

The adequacy of the models was judged by calculating the selected error criteria and drawing scatter plots (Figure 6) of the predicted versus observed peak outflow values. A trend line is added to each plot to confirm the match between predicted and observed values. The data points for the ANN and ANFIS models follow a tight trend with a high R², although some points at low values deviate from the curve and might be read as outliers. This may be due to the relatively large data set and the small number of such points. Strictly, those points need not be considered outliers, because outliers are defined as points on a scatter diagram that are separated from the vast majority of the other points by a large gap containing no points. While a high R² is required for precise predictions, it is not sufficient by itself; R² should be evaluated in conjunction with other model statistics and residual plots to measure how well the model fits the observations. Figures 7 and 8 show the performance of the ANN and ANFIS models with unclassified sets of data. They represent the best models of the training and testing phases.
Figure 6

Scatter plots of the best models and Froehlich (1995) RE in the first scenario.

Figure 7

Schematic performance of the best ANN models in the first scenario.

Figure 8

Schematic performance of the best ANFIS models in the first scenario.


Table 7 presents the magnitudes of the error criteria used to compare the first scenario models. The ANN and ANFIS models showed a significant improvement in predicting peak breach outflow values compared to the Froehlich (1995) RE. They presented high efficiency and a high coefficient of determination (both E and R² ≥ 0.93) and low MAE compared to the Froehlich (1995) RE. The mean value of the peak outflow predicted by the present ANN and ANFIS models is given in Table 7. It is within the range of the 95% confidence interval and very close to the observed value. The ANFIS model had the best fitting criteria with E = 0.98, R² = 0.98, and MAE = 840.9 m³/s. Therefore, it presents the best performance among all models. The higher MAE and the lower E and R² of the RE may be attributed to the influence of the large number of small dams included in the database, which results in a skewed distribution of the peak outflow values. Moreover, RE cannot reasonably handle nonlinearity, complexity, and uncertainty of data and variables. The scatter plots of Figure 6 show that there are some outliers in the predictions of all models. Most outliers correspond to small dams (e.g., Boystown, PA; Caney Coon Creek, OK; Haymaker, MT; Horse Creek #2, CO) whose methods of determining peak outflows are unknown (Wahl 1998). Accordingly, one can conclude that the lack of historical data from a wide range of breached dams might be the reason for the poorer training of the network. This is evidenced by Table 8, which shows the error criteria calculated after separating the results of the first scenario models into small and large dams. For all models, the results for small dams showed lower E and R² than those for large dams, markedly so for the ANN and ANFIS models.

Table 7

Error criteria using the results of the models in the first scenario

Peak breach outflow Qp (m³/s)

| Criterion | NNTool | NNFTool | ANFIS | Froehlich | 95% CI | μQp,obs |
|---|---|---|---|---|---|---|
| MAE | 1,359.9 | 1,339.3 | 840.9 | 2,677.1 | | |
| E | 0.96 | 0.94 | 0.98 | 0.53 | | |
| R² | 0.95 | 0.93 | 0.98 | 0.67 | | |
| Mean predicted Qp | 4,762.3 | 4,541.4 | 4,757.1 | 2,906.2 | 2,338.6–7,140.8 | 4,739.7 |

Table 8

Error criteria after classification of the results of the models in the first scenario

Peak breach outflow Qp (m³/s)

| Criterion | NNTool, small dams | NNTool, large dams | NNFTool, small dams | NNFTool, large dams | ANFIS, small dams | ANFIS, large dams | Froehlich, small dams | Froehlich, large dams |
|---|---|---|---|---|---|---|---|---|
| MAE | 456.0 | 2,866.5 | 362.6 | 2,381.1 | 388.5 | 1,525.6 | 350.3 | 5,268.3 |
| E | 0.11 | 0.90 | 0.25 | 0.93 | 0.59 | 0.98 | 0.43 | 0.45 |
| R² | 0.24 | 0.89 | 0.39 | 0.93 | 0.55 | 0.98 | 0.50 | 0.61 |

Results of second scenario (S2)

In this scenario the data vectors from the 93 dam failures were classified as 48 small dams and 45 large dams, which were used for building four models using ANN and two models using ANFIS techniques.
  1. The ANN models employed SIG, LOGSIG, and TANSIG transfer functions as activators, with a back-propagation algorithm for learning. An important step in ANN model development is to ensure the generalization ability of the trained models, so that they produce accurate predictions for testing data subsets. This is often achieved by dividing the available data into training, validation, and test sets. The training set is used to calibrate the model, the validation set is used in cross-validation during the training process to avoid over-fitting, and the test set is used to assess the performance of the model on data that were not used during its training phase. The classified input and output data vectors were further separated by the network into three sets for each trial: a training set (28 small dams, 27 large dams), a validation set (10 small dams, nine large dams), and a testing set (10 small dams, nine large dams) for both NNTool and NNFTool.

  2. Training and testing the two ANFIS models were performed after dividing the classified data into two sets as: training set (39 small dams, 37 large dams) and testing set (nine small dams, eight large dams). The ANFIS models were developed using first-order Sugeno type-system with four Gaussian MFs for small dams' class (Figure 2) and three Gaussian MFs for large dams' class (Figure 3), hybrid as learning algorithm, and linear MF type for the output. The results of the models in each class were combined together and the scatter plots of the combined results are drawn in Figure 9.

Figure 9

Schematic performance of the best ANN and ANFIS models in the second scenario using the combined results of small and large dams’ models.

The schematic performance of only the NNFTool model is drawn in Figure 10 for the small dams' class and in Figure 11 for the large dams' class, as it gave the best results in this scenario for the training as well as the testing data. The quantitative results of the error criteria are presented in Table 9 for the second scenario models. The ANN and ANFIS models showed high E and R² (both ≥ 0.90) and low MAE for both classes of small and large dams. Based on the error criteria given in Tables 8 and 9, the MAE is reduced by about 65% for the small dams' class after classifying the data and building the models accordingly.
Table 9

Error criteria of the models in the second scenario

| Criterion | NNTool, small dams | NNTool, large dams | NNTool, all dams | NNFTool, small dams | NNFTool, large dams | NNFTool, all dams | ANFIS, small dams | ANFIS, large dams | ANFIS, all dams | Froehlich |
|---|---|---|---|---|---|---|---|---|---|---|
| MAE | 187.7 | 3,166.0 | 1,628.9 | 90.1 | 2,457.6 | 1,235.8 | 162.7 | 2,344.8 | 1,408.4 | 2,677.1 |
| E | 0.90 | 0.90 | 0.91 | 0.96 | 0.94 | 0.95 | 0.92 | 0.92 | 0.93 | 0.53 |
| R² | 0.90 | 0.90 | 0.91 | 0.96 | 0.94 | 0.95 | 0.92 | 0.91 | 0.93 | 0.67 |

Figure 10

Schematic performance of the best NNFTool model (second scenario – small dams).

Figure 11

Schematic performance of the best NNFTool model (second scenario – large dams).


Moreover, E and R² improved remarkably; e.g., the NNFTool model for small dams gave E = 0.96 and MAE = 90.1 m³/s in the second scenario compared to 0.25 and 362.6 m³/s before classification in the first scenario. Application of the RE to the same database produced relatively less accurate results (R² = 0.67 and MAE = 2,677.1 m³/s), especially for large dams.
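
The MAE reduction for the small dams' class can be checked with a few lines of arithmetic on the tabulated values. The sketch below takes the small-dam MAE of each model from Table 8 (before classification) and Table 9 (after classification); averaging the three model-wise reductions is an assumption about how the overall figure was obtained, and the variable and function names are hypothetical.

```python
# Small-dam MAE values (m3/s) read from Table 8 (first scenario) and Table 9 (second scenario).
mae_before = {"NNTool": 456.0, "NNFTool": 362.6, "ANFIS": 388.5}
mae_after = {"NNTool": 187.7, "NNFTool": 90.1, "ANFIS": 162.7}


def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


reductions = {
    model: percent_reduction(mae_before[model], mae_after[model])
    for model in mae_before
}
mean_reduction = sum(reductions.values()) / len(reductions)

for model, value in reductions.items():
    print(f"{model}: {value:.1f}%")
print(f"mean: {mean_reduction:.1f}%")
```

Under this reading, the individual reductions range from roughly 58% to 75%, and their average is consistent with the reduction of about 65% reported for small dams.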

Another separate set from the testing database was selected in order to evaluate the performance of the developed models in the first scenario and compare their results with the Froehlich (1995), Xu & Zhang (2009), and Hooshyaripor et al. (2013) RE and Sattar (2014) GEP. This testing set consists only of eight case studies (Qielinggou, China; Quail Creek, USA; Schaeffer Reservoir, USA; Shimantan, China; Swift, USA; Teton, USA; Zhugou, China; and Zuocun, China) because they were the only cases in the testing database having all the required data to perform the calculations of the mentioned models. Figure 12 shows the comparison and the performance of the various models in predicting using this testing set. The ANFIS and the Xu & Zhang (2009) models confirmed a better match between the predicted and observed peak outflow values in comparison to Hooshyaripor & Tahershamsi (2012) ANN model; Hooshyaripor et al. (2013) and Froehlich (1995) RE; and Sattar (2014) GEP. From the error criteria and the scatter plots in Figure 12, it is observed that the predicted values by the ANN and ANFIS models and the Xu & Zhang (2009) RE were 90% closer to the observed values. However, the predicted values by Sattar (2014) GEP, Froehlich (1995) and Hooshyaripor et al. (2013) RE were 71%, 68%, and 62% closer to the observed values, respectively.
Figure 12

Schematic performance of best ANN, ANFIS, RE, and GEP models.


It should be recognized that the above RE and GEP benefited from the selected testing data set, because several dams in this set were used in their derivation, which gives their predictions an advantage. In contrast, this testing data set was not part of the data used to train the developed ANN and ANFIS models. Despite this, the developed ANN and ANFIS models provided a quite reasonable match between the predicted and observed peak outflow values in comparison to the RE and GEP. In Figure 12, the trend line and the coefficient of determination are shown in addition to the 1:1 line of agreement between the predicted and observed peak outflow values. Although the R² value of the Sattar (2014) GEP is large (very close to 1), it does not indicate that this is the best model. The square root of R² indicates the scatter of the predicted values around the regression line. However, the modeler is interested in the degree of scatter of the predicted values about the line of perfect prediction (i.e., the 1:1 line of agreement). For this reason, other statistical error criteria such as the MAE, the mean square error, and the Nash & Sutcliffe (1970) coefficient of efficiency may be used in addition to the coefficient of determination. Models are considered adequate for prediction if they satisfy some or all of these criteria. From the scatter plots of Figure 12 it is evident that the models' performance is not reflected by R² values alone. For example, the Sattar (2014) GEP and the Froehlich (1995) and Hooshyaripor et al. (2013) RE showed high R² values although they grossly overpredicted most of the peak outflow rates. To quantify the results of the developed models and the best available RE and GEP, three statistical parameters were used: the MAE, the Nash & Sutcliffe (1970) coefficient of efficiency E, and the coefficient of determination R².
The high E and R² values and the low MAE of the developed ANN and ANFIS models in the testing phase demonstrate the potential of these models for predicting the peak outflow rates from breached embankment dams. Table 10 shows that MAE = 1,493.5 m³/s, E = 0.98, and R² = 0.98 for the ANFIS model, indicating higher accuracy and efficiency than the other ANN, RE, and GEP models. The Xu & Zhang (2009) ‘best’ RE comes second with MAE = 3,457 m³/s, E = 0.94, and R² = 0.95 and produced results similar to those of the NNTool and the NNFTool models.
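
The distinction between scatter about the trend line and scatter about the 1:1 line can be made concrete with a short sketch. It fits a least-squares trend line to predicted versus observed values and reports, separately, the average signed departure from the 1:1 line. The function names and the data are hypothetical; the example predictions are constructed to overpredict systematically and are not taken from any of the models discussed.

```python
import numpy as np


def trend_line(obs, pred):
    """Least-squares trend line pred = slope * obs + intercept, with its R^2."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    slope, intercept = np.polyfit(obs, pred, 1)
    fitted = slope * obs + intercept
    ss_res = np.sum((pred - fitted) ** 2)
    ss_tot = np.sum((pred - pred.mean()) ** 2)
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)


def mean_deviation_from_1to1(obs, pred):
    """Average signed distance pred - obs; positive values indicate overprediction."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean(pred - obs))


# Hypothetical peak outflows (m3/s) with a systematic overprediction.
q_obs = np.array([500.0, 2000.0, 8000.0, 30000.0])
q_pred = 1.6 * q_obs + 300.0

slope, intercept, r2 = trend_line(q_obs, q_pred)
bias = mean_deviation_from_1to1(q_obs, q_pred)
print(slope, intercept, r2, bias)
```

In this constructed case the points lie exactly on a trend line, so the trend-line fit is perfect, yet the slope differs from 1 and the mean deviation is positive, which is the situation a high R² alone cannot reveal.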

Table 10

Statistical error criteria of various models using testing data set of eight dams

| Model | MAE (m³/s) | E | R² |
|---|---|---|---|
| Present ANN model (NNTool) | 2,707.8 | 0.97 | 0.97 |
| Present ANN model (NNFTool) | 2,637.5 | 0.95 | 0.96 |
| Present ANFIS model | 1,493.5 | 0.98 | 0.98 |
| Froehlich (1995) RE | 9,933.0 | 0.62 | 0.90 |
| Xu & Zhang (2009) ‘best’ RE | 3,457.0 | 0.94 | 0.95 |
| Hooshyaripor et al. (2013) RE | 10,799.6 | 0.60 | 0.89 |
| Sattar (2014) GEP | 7,800.8 | 0.71 | 0.99 |

Although the application of the developed ANN and ANFIS models for predicting peak breach outflows is promising, the proposed models can further be enhanced by increasing the database, searching for optimum key variables, and tuning the membership functions.

This study compares the performance of two ANN GUI tools (NNTool and NNFTool) and an ANFIS for predicting peak breach outflows based on historical data from 93 breached embankments. Two scenarios were proposed in order to obtain the most effective model. In the first scenario, all the available data from the 93 case studies were used as one set without classification. The second scenario classifies the data into small and large dams. Nine models were developed and their results were analyzed. An extensive comparison was also made between the results of the developed models and the best available RE by Froehlich (1995), Xu & Zhang (2009) and Hooshyaripor et al. (2013) and the GEP by Sattar (2014). In general, the following can be concluded:

  1. Based on the statistical error criteria, all ANN and ANFIS models predicted the peak outflow rates very well, with a high coefficient of determination (R²), a high coefficient of efficiency (E), and a low MAE in comparison to the best available RE and GEP models.

  2. Among the models, the ANFIS model in the first scenario gave the best results with E = 0.98, R² = 0.98, and MAE = 840.9 m³/s; however, in the second scenario, the NNFTool model produced the best results (E = 0.96, R² = 0.96, and MAE = 90.1 m³/s for small dams and E = 0.94, R² = 0.94, and MAE = 2,457.6 m³/s for large dams).

  3. The Xu & Zhang (2009) RE produced quite reasonable results and came second after the ANFIS model. It produced similar results as the ANN models.

  4. The RE by Froehlich (1995) and Hooshyaripor et al. (2013) and GEP by Sattar (2014) overpredicted the peak outflow rates and presented less accurate results than the ANFIS and the ANN models.

  5. Classification of data into small and large dams and building the models accordingly enhanced the performance of the models, particularly for the small dams' class.

  6. The minimum value of MAE and the maximum values of R² and E showed the potential of the ANFIS methodology to be used in the future as a tool for predicting the peak outflow rates from breached embankments.

Azmathullah, H. Md., Deo, M. C. & Deolalikar, P. B. 2005 Neural networks for estimation of scour downstream of a ski-jump bucket. J. Hydraul. Eng. 131 (10), 898–908.
Duricic, J., Erdik, T. & Van Gelder, P. 2013 Predicting peak breach discharge due to embankment dam failure. J. Hydroinform. 15 (4), 1361–1376.
Froehlich, D. C. 1995 Peak outflow from breached embankment dam. J. Water Resour. Plann. Manage. 121 (1), 90–97.
Hagen, V. K. 1982 Re-evaluation of design floods and dam safety. In: Proceedings of the 14th Congress of International Commission on Large Dams, Rio de Janeiro, Brazil.
Hooshyaripor, F., Tahershamsi, A. & Golian, S. 2013 Application of copula method and neural networks for predicting peak outflow from breached embankments. J. Hydro-Environ. Res. 8 (3), 292–303.
Jang, J. S. R. 1993 ANFIS: adaptive network based fuzzy inference system. IEEE Trans. Syst. Man Cyber. 23 (3), 665–685.
Jang, J. S. R., Sun, C. T. & Mizutani, E. 1997 Neuro-Fuzzy and Soft Computing. Prentice Hall, NY, USA, p. 607.
Kirkpatrick, G. W. 1977 Evaluation guidelines for spillway adequacy. In: The Evaluation of Dam Safety, Engineering Foundation Conference, ASCE, New York, pp. 395–414.
MacDonald, T. & Langridge-Monopolis, J. 1984 Breaching characteristics of dam failures. J. Hydraul. Eng. 110 (5), 567–586.
Pierce, M. W., Thornton, C. I. & Abt, S. R. 2010 Predicting peak outflow from breached embankment dams. J. Hydrol. Eng. 15 (5), 338–349.
Rojas, R. 1996 Neural Networks: A Systematic Introduction. Springer-Verlag, Berlin, Germany.
Sattar, A. M. A. 2014 Gene expression models for prediction of dam breach parameters. J. Hydroinform. 16 (3), 550–571.
Singh, K. P. & Snorrason, A. 1982 Sensitivity of outflow peaks and flood stage to the selection of dam breach parameters and simulation models. SWS Contract Report 288, Illinois Department of Energy and Natural Resources, State Water Survey Division, Surface Water Section at the University of Illinois, p. 179.
Singh, V. P. 1996 Dam Breach Modelling Technology. Kluwer Academic Publishers, Dordrecht, The Netherlands.
Singh, V. P. & Scarlatos, P. D. 1988 Analysis of gradual earth dam failure. J. Hydraul. Eng. 114 (1), 21–42.
Soil Conservation Service (SCS) 1981 Simplified dam-breach routing procedure. Technical Release No. 66 (Rev.1), December, p. 39.
Tahershamsi, A., Shetty, A. V. & Ponce, V. M. 2003 Embankment dam breaching: geometry and peak outflow characteristics. Dam Eng. 14 (2), 73–87.
Takagi, T. & Sugeno, M. 1985 Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst. Man Cyber. 15, 116–132.
US Bureau of Reclamation (USBR) 1982 Guidelines for Defining Inundated Areas Downstream, Bureau of Reclamation dams. Reclamation Planning Instruction, Rep. No. 82-11.
Wahl, T. 1998 Prediction of Embankment Dam Breach Parameters: A literature review and needs assessment. DSO-98-004, Dam safety research report, US Department of the Interior, Bureau of Reclamation, Denver, CO, USA.
Wahl, T. 2014 Evaluation of Erodibility-Based Embankment Dam Breach Equations. Hydraulic Laboratory Report HL-2014-02. US Department of the Interior Bureau of Reclamation, Technical Service Center, Denver, CO, USA, p. 99.
Xu, Y. & Zhang, L. M. 2009 Breaching parameters for earth and rockfill dams. J. Geotech. Geoenviron. Eng. 135 (12), 1957–1970.