## Abstract

Accurate estimation of velocity distribution in open channels or streams (especially in turbulent flow conditions) is very important and its measurement is very difficult because of spatio-temporal variation in velocity vectors. In the present study, velocity distribution in streams was estimated by two different artificial neural networks (ANN), ANN with conjugate gradient (ANN-CG) and ANN with Levenberg–Marquardt (ANN-LM), and two different adaptive neuro-fuzzy inference systems (ANFIS), ANFIS with grid partition (ANFIS-GP) and ANFIS with subtractive clustering (ANFIS-SC). The performance of the proposed models was compared with the multiple-linear regression (MLR) model. The comparison results revealed that the ANN-CG, ANN-LM, ANFIS-GP, and ANFIS-SC models performed better than the MLR model in estimating velocity distribution. Among the soft computing methods, the ANFIS-GP was observed to be better than the ANN-CG, ANN-LM, and ANFIS-SC models. The root mean square errors (RMSE) and mean absolute errors (MAE) of the MLR model were reduced by 69% and 72%, respectively, using the ANFIS-GP model to estimate velocity distribution in the test period.

## INTRODUCTION

Stream flow, which is complex in nature, is an important study area for water resources and hydraulic engineers. Flows in streams are usually expressed by 1-D hydraulic equations. Many studies have been performed to determine the detailed properties of the hydrodynamics of complex flows using conventional methods, empirical formulas, and velocity samples (Hsu *et al.* 1998; Thomas & Williams 1999; Huang *et al.* 2002; Kar *et al.* 2015).

Three approaches, namely experimental measurement, theoretical methods, and computer simulation, are used to investigate flow properties in hydraulic engineering. Hydraulic systems usually show very complicated nonlinear behavior, so it is not easy to obtain an analytical solution that describes their characteristics. Theoretical methods may be used to analyze some simple flow cases (Kerh *et al.* 1994, 1997). Computer simulation using numerical methods, such as a Computational Fluid Dynamics package, is another approach to solving fluid mechanics problems. It can be used to determine the properties of fluid motion in hydraulic engineering when the boundary conditions are properly defined. Flow measurement data are always extremely valuable for researchers in the field of hydraulic engineering. Usable measurement data are needed to corroborate the accuracy and check the reliability of computer simulations (Kerh 2000, 2002). The velocity distribution is one of the basic properties of open channel flow. Hydraulic engineers use it to analyze flow characteristics such as discharge, shear stress, and watershed runoff, and to estimate erosion and sediment transport in alluvial channels. Also, recent research has shown that the velocity profile in streams is the driver of habitat quality for aquatic species (Booker 2003). For these reasons, determining the velocity distribution is a priority for solving hydraulic problems in open channels.

Numerous analytical and experimental studies have been conducted to obtain velocity distributions in stream flows (Kirkgoz 1989; Smart 1999; Ferro 2003). The power law and the Prandtl–von Karman universal velocity distribution law are well-known velocity distribution equations for open channel flows (Prandtl 1925; von Karman 1930). Unfortunately, the existing formulas cannot fully describe the velocity profile, particularly near the channel bed and water surface. More recently, an entropy concept based on a probabilistic approach was used to investigate velocity distributions in open channels (Chiu 1988; Xia 1997). According to the entropy method, there is a linear relationship between the mean and maximum velocity, which is described by an entropy parameter. Xia (1997) demonstrated that the relationship between the mean and maximum velocities was linear for all the river sections considered. The entropy concept, which is an alternative to the traditional method, is used to forecast flow properties. In the last decade, artificial neural networks (ANN) have also been used to determine the velocity profile. The ANN and adaptive neuro-fuzzy inference system (ANFIS) techniques have been satisfactorily used to solve problems in water resources and hydraulic engineering. Yang & Chang (2005) simulated velocity profiles and velocity contours and estimated the discharges by ANN. Kocabas & Ulker (2006) utilized the ANFIS approach for predicting the critical submergence for an intake in a stratified fluid medium. Dogan *et al.* (2007) utilized the ANN approach to forecast the sediment concentration obtained in an experimental study. Mamak *et al.* (2009) successfully analyzed bridge afflux through arched bridge constrictions by ANFIS and ANN techniques. Kocabas *et al.* (2009) used the ANN method for estimating the critical submergence for an intake in a stratified fluid medium. Bilhan *et al.* (2010) estimated the lateral outflow over rectangular side weirs by using two different ANN techniques. Emiroglu *et al.* (2011) utilized the ANN approach for predicting the discharge capacity of a triangular labyrinth side weir situated on a straight channel. Emiroglu & Kisi (2013) used a neuro-fuzzy method to predict the discharge coefficient of trapezoidal labyrinth side weirs located on a straight channel. Kisi *et al.* (2013) successfully predicted the flow discharge of weirs by ANFIS. Genc *et al.* (2014) analyzed the accuracy of ANN and ANFIS in determining the mean velocity and discharge of natural streams. They demonstrated that the ANFIS model, which has a determination coefficient (R^{2}) of 0.996, can successfully predict mean velocity and discharge. In this paper, the applicability of two different ANN and ANFIS approaches for estimating the velocity distribution of streams is investigated and the results are compared with the multiple-linear regression (MLR) model. For this purpose, field studies were carried out at different cross-sections in Kayseri using an acoustic Doppler velocimeter (ADV). These techniques have not been used for this purpose before.

## VELOCITY DISTRIBUTIONS

The velocity profile should be determined for stream flows to better understand the structures of turbidity, sediment discharge, energy loss, and shear stress distributions (Ardiclioglu *et al.* 2012). Velocity distribution is influenced by vegetation, channel geometry, channel slope, roughness, and the presence of bends in rivers. In river flow, the velocity is not uniform over the depth. It increases from zero at the channel bed to its highest value near the free water surface.

One of the most well-known velocity distribution models is the log-law (Sarma & Lakshminaraynan 1998). This logarithmic model is widely used to determine the two-dimensional velocity profile, particularly for hydraulically smooth and rough flow conditions. Two-dimensional open channel flows are divided into two zones, the inner and outer regions, because of the existence of turbulence and the impact of a rigid boundary, as shown in Figure 1. Here, *u_{ws}* indicates the velocity at the water surface, *u_{max}* is the maximum velocity at the boundary layer height *δ*, *H* is the water depth, *z_{0}* is the depth where *u* equals zero, and k represents the roughness coefficient.

In the inner region, *z/H* < 0.20, the velocity distribution for uniform and steady nonuniform open channel flows is presented using the log law in Equation (1):

$$\frac{u}{u_{*}} = \frac{1}{\chi}\ln\left(\frac{z}{k_{s}}\right) + B_{p} \qquad (1)$$

where *u* is the streamwise, time-mean flow velocity, u_{*} = (*τ_{0}*/*ρ*)^{1/2} is the shear velocity (*ρ* = density), *τ_{0}* is the boundary shear stress, *χ* = 0.40 (1/*χ* = 2.5) is the von Karman constant (the value of *χ* varies between 0.40 and 0.41), *z* is the distance from the bed, k_{s} is the equivalent sand roughness, and B_{p} is a constant of integration, with B_{p} = 8.5 ± 15% (Song & Graf 1996). Although the value of B_{p} depends on the nature of the wall surface, the value of *χ* does not. With these constants, Equation (1) takes the form shown in Equation (2):

$$\frac{u}{u_{*}} = 2.5\ln\left(\frac{z}{k_{s}}\right) + 8.5 \qquad (2)$$

The logarithmic law is valid throughout the inner region except on the channel bed. In practical applications, it is still widely presumed that the logarithmic law explains the velocity profile along the whole depth of uniform, steady open channel flows (Kundu & Ghoshal 2012).
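As a rough illustration of how a log-law profile is evaluated, the sketch below assumes the common rough-bed form u/u_{*} = (1/*χ*) ln(*z*/k_{s}) + B_{p}, with *χ* = 0.40 and B_{p} = 8.5 as quoted above. The function name and the sample shear velocity and roughness values are hypothetical and are not taken from the field data.

```python
import math


def log_law_velocity(z, u_star, k_s, chi=0.40, b_p=8.5):
    """Time-mean streamwise velocity u (m/s) at height z (m) above the bed.

    Assumes the rough-bed log law u/u_* = (1/chi) * ln(z/k_s) + B_p.
    """
    if z <= 0 or k_s <= 0:
        raise ValueError("z and k_s must be positive")
    return u_star * (math.log(z / k_s) / chi + b_p)


# Hypothetical inputs, for illustration only (not measured values).
u_star = 0.08  # shear velocity, m/s
k_s = 0.02     # equivalent sand roughness, m

profile = [(z, log_law_velocity(z, u_star, k_s)) for z in (0.04, 0.06, 0.08)]
for z, u in profile:
    print(f"z = {z:.2f} m -> u = {u:.3f} m/s")
```

At *z* = k_{s} the logarithm vanishes and the sketch returns u = B_{p} u_{*}, which is a quick sanity check on the constants.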

Here, *m* and *a* are the constants of the power law. Different *m* values have been presented in the literature for different flows when describing measured velocity profiles with the power law. González *et al.* (1996) reported that *m* = 1/6 in their studies in open channels. Ardiclioglu *et al.* (2005) investigated the power law constant (*a*) and exponent (1/*m*) in a stream and found them to be 4.0 and 1/5, respectively.

## FIELD MEASUREMENTS

The four data sets of fixed ADV measurements presented in this paper were collected in central Turkey by a team comprising the first and third authors. Twenty-two field measurements were taken at four different cross-sections in the Kızılırmak and Seyhan basins. The first data set was collected between 2009 and 2010 at the Sosun station, which is in the Seyhan basin, where the stream is a tributary of the Zamantı River. The other data sets were obtained between 2005 and 2010 at the Barsama, Şahsenem, and Bünyan stations, which are in the Kızılırmak basin on a tributary of the Kızılırmak River. The Kızılırmak basin, which is the second biggest basin in Turkey, extends across central Turkey and the Black Sea region (Figure 2). Six site visits were made to each of the Barsama, Bünyan, and Şahsenem stations and four visits to the Sosun station. In Figure 3, a sample of measurements at the Şahsenem station is shown.

The ADV was used to gather three-dimensional velocity data at the four stations. The ADV measures three-dimensional flow velocities (u, v, w) in the x, y, z directions in a sampling volume using the Doppler shift principle. At each measurement cross-section, the ADV records velocity data, location information, and water depth. The ADV sampling volume is located 10 cm in front of the probe head. Accordingly, the probe head itself has minimal effect on the flow field surrounding the measurement volume. The velocity range is ±0.001 m/s to 4.5 m/s, the resolution is 0.0001 m/s, and the accuracy is ±1% of the measured velocity (SonTek 2002). To determine the distribution of velocity in river flow, the experimental devices must be properly arranged.

During flow measurements, cross sections were divided into a number of slices for each flow condition according to the water surface width. Point velocity measurements were taken at different positions in the vertical direction, starting 4 cm from the streambed for each vertical. The velocities at the free water surface in all verticals were estimated by extrapolating the last two measurements of the verticals. Also, mean water surface velocities *u_{ws}* were measured at each of the visited stations. Water surface velocity can easily be measured with a light object that floats on the water surface, such as leaves, twigs, and so on.

The flow characteristics at every site are given in Table 1. In this table, the first and second columns show the station visit numbers and dates visited, *U_{m}* (= *Q/A*) is the mean velocity, *A* is the area of the cross section, *u_{ws}* is the measured water surface velocity, *H_{max}* is the maximum flow depth, *T* is the surface water width, *S_{ws}* is the water surface slope, Re (= 4*U_{m}R*/*ʋ*) is the Reynolds number, *ʋ* is the kinematic viscosity, and Fr (= *U_{m}*/(*gH_{max}*)^{1/2}) represents the Froude number. The Froude and Reynolds numbers calculated for all flow measurements indicate subcritical and turbulent flow conditions.
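The Froude number definition above can be checked against a row of Table 1. The sketch below is illustrative only: it assumes g = 9.81 m/s² and interprets the tabulated *H_{max}* values as centimetres, since that interpretation reproduces the listed Froude numbers. The function name is hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def froude_number(u_mean, depth_m):
    """Fr = U_m / (g * H_max)^(1/2), with the depth given in metres."""
    return u_mean / math.sqrt(G * depth_m)


# Bünyan_1 row of Table 1: U_m = 0.354 m/s and H_max listed as 72.0,
# interpreted here as centimetres (0.72 m). This unit is an assumption.
fr = froude_number(0.354, 72.0 / 100.0)
regime = "subcritical" if fr < 1.0 else "supercritical"
print(f"Fr = {fr:.3f} ({regime})")
```

With these inputs the result rounds to the tabulated Fr of 0.133 for that visit, well below 1, which matches the subcritical conditions reported for all measurements.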

Stations | Dates (d/m/y) | U_{m} (m/s) | u_{ws} (m/s) | H_{max} (cm) | T (m) | S_{ws} | Re (×10^{6}) | Fr |
---|---|---|---|---|---|---|---|---|
Barsama_1 | 28/05/2005 | 0.890 | 1.60 | 39.0 | 8.3 | 0.0091 | 0.76 | 0.481 |
Barsama_2 | 19/05/2006 | 1.051 | 1.85 | 40.0 | 9.0 | 0.0036 | 0.94 | 0.531 |
Barsama_3 | 19/05/2009 | 1.214 | 2.08 | 45.0 | 9.0 | 0.0094 | 1.47 | 0.578 |
Barsama_4 | 31/05/2009 | 0.590 | 1.14 | 26.0 | 8.4 | 0.0092 | 0.40 | 0.333 |
Barsama_5 | 24/03/2010 | 0.806 | 1.55 | 38.0 | 8.6 | 0.0097 | 0.61 | 0.417 |
Barsama_6 | 18/04/2010 | 0.865 | 1.63 | 38.2 | 8.8 | 0.0120 | 0.85 | 0.421 |
Bünyan_1 | 24/06/2009 | 0.354 | 0.65 | 72.0 | 4.0 | 0.0020 | 0.71 | 0.133 |
Bünyan_2 | 08/02/2010 | 0.214 | 0.40 | 66.0 | 4.0 | 0.0030 | 0.40 | 0.084 |
Bünyan_3 | 27/09/2009 | 0.301 | 0.54 | 72.0 | 3.9 | 0.0022 | 0.50 | 0.113 |
Bünyan_4 | 04/04/2010 | 0.405 | 0.74 | 85.0 | 4.0 | 0.0018 | 0.78 | 0.140 |
Bünyan_5 | 16/05/2010 | 0.426 | 0.54 | 86.0 | 4.0 | 0.0024 | 0.85 | 0.147 |
Bünyan_6 | 20/06/2010 | 0.286 | 0.53 | 79.0 | 3.9 | 0.0010 | 0.53 | 0.103 |
Şahsenem_1 | 29/03/2006 | 0.600 | 1.04 | 28.0 | 6.0 | 0.0059 | 0.47 | 0.350 |
Şahsenem_2 | 20/10/2007 | 0.529 | 0.93 | 32.0 | 5.4 | 0.0061 | 0.46 | 0.298 |
Şahsenem_3 | 22/03/2008 | 0.565 | 0.80 | 33.0 | 6.0 | 0.0037 | 0.49 | 0.314 |
Şahsenem_4 | 03/05/2008 | 0.518 | 1.00 | 32.0 | 5.4 | 0.0045 | 0.39 | 0.307 |
Şahsenem_5 | 11/10/2008 | 0.536 | 1.01 | 32.0 | 5.5 | 0.0046 | 0.44 | 0.303 |
Şahsenem_6 | 08/11/2008 | 0.516 | 1.00 | 34.0 | 5.6 | 0.0064 | 0.51 | 0.282 |
Sosun_1 | 19/05/2009 | 0.561 | 0.96 | 62.0 | 3.2 | 0.0032 | 0.84 | 0.227 |
Sosun_2 | 31/05/2009 | 0.285 | 0.63 | 43.0 | 3.0 | 0.0016 | 0.32 | 0.144 |
Sosun_3 | 24/03/2010 | 0.327 | 0.63 | 45.0 | 2.9 | 0.0026 | 0.37 | 0.156 |
Sosun_4 | 18/04/2010 | 0.541 | 0.93 | 54.0 | 2.3 | 0.0034 | 0.67 | 0.235 |

## ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM

ANFIS was first presented by Jang (1993). It is a universal approximator and is capable of approximating any real continuous function. The ANFIS structure consists of a number of nodes connected through directional links, and each node has a function comprising fixed or adjustable parameters (Jang *et al.* 1997).

Here, f_{1} and f_{2} refer to the output functions of rule 1 and rule 2, respectively. The ANFIS structure is shown in Figure 4. The node functions of each layer are explained next.

In the first layer, *x* is the *i*th node's input and A_{i} is a linguistic label, for example, ‘low’ or ‘high’, associated with this node function. O_{1,i} is the membership function of a fuzzy set A (= A_{1}, A_{2}, B_{1}, B_{2}, C_{1}, or C_{2}). It indicates the degree to which the given input *x* satisfies the quantifier A_{i}. The membership function is typically chosen to be a Gaussian function with a minimum equal to 0 and a maximum equal to 1. In Equation (7), a_{i} and b_{i} are the parameters. When the values of these parameters change, the Gaussian function varies accordingly, thus exhibiting various forms of membership functions on linguistic label A_{i} (Jang 1993). This layer's parameters are called the premise parameters (Emiroglu & Kisi 2013).

The output of each node shows the firing strength of a rule.
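The sketch below illustrates how a membership degree and a rule's firing strength might be computed. Equation (7) is not reproduced here, so the Gaussian form exp(−((x − b)/a)²), with a as a width and b as a centre, is an assumption, as is the use of the product of membership degrees for the firing strength (the rule commonly used in ANFIS). All names and numbers are hypothetical.

```python
import math


def gaussian_mf(x, a, b):
    """Gaussian membership degree in (0, 1].

    Assumed form: a is the width and b is the centre of the function.
    """
    return math.exp(-((x - b) / a) ** 2)


def firing_strength(inputs, params):
    """Product of the membership degrees of all inputs of one rule."""
    w = 1.0
    for x, (a, b) in zip(inputs, params):
        w *= gaussian_mf(x, a, b)
    return w


# Hypothetical rule over two normalized inputs, for example z/H and y/T.
rule_params = [(0.25, 0.5), (0.25, 0.5)]  # (a_i, b_i) for each input
w = firing_strength([0.5, 0.75], rule_params)
print(f"firing strength = {w:.4f}")
```

Changing a_{i} widens or narrows the bell and changing b_{i} shifts it, which is how training the premise parameters reshapes each linguistic label.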

## ARTIFICIAL NEURAL NETWORKS

Here, *θ_{j}* is the bias for neuron j, *O_{pi}* is the *i*th output of the previous layer, and W_{ij} is the weight between the first layer i and the second layer j. An output is calculated from each neuron in the second layer j and third layer k by passing its value of y through a non-linear activation function. A commonly used activation function is the logistic function:

$$f(y) = \frac{1}{1 + e^{-y}}$$

More details about ANN can be obtained from the related references (Haykin 2009).
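To make the neuron computation concrete, the following sketch forms the weighted sum of the previous layer's outputs plus a bias and passes it through the logistic function. It is an illustration only: the function names and all numerical values are hypothetical, and it does not include the CG or LM training algorithms.

```python
import math


def logistic(y):
    """Logistic activation, mapping any real y into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))


def neuron_output(prev_outputs, weights, bias):
    """Weighted sum of the previous layer's outputs plus bias, then logistic."""
    y = sum(w * o for w, o in zip(weights, prev_outputs)) + bias
    return logistic(y)


# Hypothetical values: four inputs feeding one hidden neuron j.
o_p = [0.2, 0.8, 0.5, 0.35]   # outputs of the previous layer (O_pi)
w_j = [0.4, -0.6, 0.1, 0.9]   # weights W_ij into neuron j
theta_j = 0.05                # bias of neuron j

out = neuron_output(o_p, w_j, theta_j)
print(f"neuron output = {out:.4f}")
```

Because the logistic output is bounded by 0 and 1, target values are usually rescaled into a sub-range of that interval before training.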

## APPLICATION AND RESULTS

The inputs to the models were the water surface velocity *u_{ws}*, water surface slope *S_{ws}*, *z/H*, and *y/T*. The 2,184 measured data were used for the ANN-CG, ANN-LM, ANFIS-GP, ANFIS-SC, and regression (MLR) analyses. After being randomly permuted, the data were split into two parts, training and testing. The first part (1,747 values, 80% of the whole data) was used for training and the second part (437 values, 20% of the whole data) was used for testing. Before application of the ANN models, the training input and output values were standardized using Equation (14):

$$x_{n} = a_{1}\frac{x - x_{min}}{x_{max} - x_{min}} + a_{2} \qquad (14)$$

in which *x_{n}* is the standardized value of *x*, and *x_{max}* and *x_{min}* are the maximum and minimum of the training and test data. In the present study, the a_{1} and a_{2} values were assigned as 0.6 and 0.2, respectively, so that the input and output data were standardized between 0.2 and 0.8. Different ANN structures were tried to obtain the optimal models. The evaluation criteria used in the applications are the root mean square error (RMSE), mean absolute error (MAE), and determination coefficient (R^{2}). The expressions of the RMSE and MAE are provided in the following equations:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_{i,\mathrm{measured}} - y_{i,\mathrm{estimated}}\right)^{2}}$$

$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|y_{i,\mathrm{measured}} - y_{i,\mathrm{estimated}}\right|$$

where *N* and *y_{i}* refer to the number of data sets and velocity, respectively.

For estimating the velocity distribution of the streams, four different input combinations were used. The correlations between the inputs *u_{ws}*, *S_{ws}*, *z/H*, and *y/T* and the output are 0.714, 0.547, 0.362, and 0.010, respectively. According to the correlation values, *u_{ws}* seems to be the most effective variable on velocity distribution while *y/T* is the least effective one. The optimal hidden node numbers were obtained for the ANN models by trial and error. Two different ANN models were obtained by using the CG and LM algorithms, which are more powerful and faster than the conventional gradient descent technique (Kisi 2007). Sigmoid activation functions were used for the hidden and output nodes. The training of the ANN networks was stopped after 1,000 iterations. Table 2 reports the training and test results of the optimal ANN models in estimating velocity distribution. The numbers given in the second column of the table indicate the optimal number of hidden nodes for each ANN model. It is clear from the table that the ANN-CG (4,7,1) model, comprising four inputs corresponding to *u_{ws}*, *S_{ws}*, *z/H*, and *y/T*, seven hidden nodes, and one output node, has the lowest RMSE and MAE and the highest R^{2} among the ANN-CG models in both the training and test periods. Out of the four ANN-LM models, the ANN-LM (4,9,1) model comprising four inputs performs better than the other models. The relative RMSE and MAE differences between the optimal ANN-LM and ANN-CG models are 12% in the test period.
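The data preparation and evaluation steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the standardization follows the linear rescaling implied by a_{1} = 0.6 and a_{2} = 0.2 (mapping data to the range 0.2 to 0.8), the random seed and function names are hypothetical, and the sample velocities are invented for the example.

```python
import math
import random


def split_indices(n, train_fraction=0.8, seed=0):
    """Randomly permute the indices 0..n-1 and split them into train/test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(train_fraction * n)
    return idx[:n_train], idx[n_train:]


def standardize(x, x_min, x_max, a1=0.6, a2=0.2):
    """Map x from [x_min, x_max] to [a2, a1 + a2] (0.2 to 0.8 by default)."""
    return a1 * (x - x_min) / (x_max - x_min) + a2


def rmse(measured, estimated):
    """Root mean square error between two equally long sequences."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)


def mae(measured, estimated):
    """Mean absolute error between two equally long sequences."""
    n = len(measured)
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / n


train_idx, test_idx = split_indices(2184)
print(len(train_idx), len(test_idx))

# Hypothetical velocities in m/s, for illustration only.
measured = [0.50, 0.80, 1.10, 0.30]
estimated = [0.55, 0.75, 1.00, 0.35]
print(rmse(measured, estimated), mae(measured, estimated))
```

With 2,184 records and an 80% training share, the split yields the 1,747 training and 437 test values stated in the text.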

Input | Parameters | Training RMSE | Training MAE | Training R^{2} | Test RMSE | Test MAE | Test R^{2} |
---|---|---|---|---|---|---|---|
ANN-CG | | | | | | | |
u_{ws} | 9 | 0.244 | 0.187 | 0.512 | 0.243 | 0.190 | 0.604 |
u_{ws} and S_{ws} | 8 | 0.238 | 0.180 | 0.536 | 0.237 | 0.186 | 0.626 |
u_{ws}, S_{ws} and z/H | 7 | 0.171 | 0.134 | 0.758 | 0.167 | 0.128 | 0.810 |
u_{ws}, S_{ws}, z/H and y/T | 7 | 0.106 | 0.081 | 0.908 | 0.098 | 0.074 | 0.934 |
ANN-LM | | | | | | | |
u_{ws} | 10 | 0.241 | 0.183 | 0.533 | 0.243 | 0.189 | 0.606 |
u_{ws} and S_{ws} | 5 | 0.238 | 0.181 | 0.532 | 0.238 | 0.186 | 0.623 |
u_{ws}, S_{ws} and z/H | 5 | 0.169 | 0.132 | 0.764 | 0.167 | 0.129 | 0.812 |
u_{ws}, S_{ws}, z/H and y/T | 9 | 0.095 | 0.073 | 0.925 | 0.086 | 0.065 | 0.950 |
ANFIS-GP | | | | | | | |
u_{ws} | (gaussmf, 3) | 0.247 | 0.190 | 0.499 | 0.245 | 0.196 | 0.599 |
u_{ws} and S_{ws} | (gaussmf, 4) | 0.237 | 0.180 | 0.536 | 0.239 | 0.187 | 0.618 |
u_{ws}, S_{ws} and z/H | (gaussmf, 2) | 0.177 | 0.138 | 0.743 | 0.167 | 0.127 | 0.810 |
u_{ws}, S_{ws}, z/H and y/T | (gaussmf, 4) | 0.037 | 0.027 | 0.989 | 0.066 | 0.046 | 0.971 |
ANFIS-SC | | | | | | | |
u_{ws} | (gaussmf, 0.2) | 0.240 | 0.183 | 0.523 | 0.242 | 0.189 | 0.610 |
u_{ws} and S_{ws} | (gaussmf, 0.2) | 0.237 | 0.179 | 0.539 | 0.239 | 0.187 | 0.617 |
u_{ws}, S_{ws} and z/H | (gaussmf, 0.4) | 0.174 | 0.137 | 0.752 | 0.171 | 0.132 | 0.802 |
u_{ws}, S_{ws}, z/H and y/T | (gaussmf, 0.4) | 0.107 | 0.082 | 0.906 | 0.103 | 0.081 | 0.928 |
MLR | | | | | | | |
u_{ws} | (0.58) | 0.250 | 0.193 | 0.488 | 0.247 | 0.196 | 0.591 |
u_{ws} and S_{ws} | (0.56; 4.72) | 0.250 | 0.192 | 0.489 | 0.246 | 0.195 | 0.593 |
u_{ws}, S_{ws} and z/H | (0.42; 4.51; 0.31) | 0.223 | 0.172 | 0.621 | 0.217 | 0.168 | 0.726 |
u_{ws}, S_{ws}, z/H and y/T | (0.48; 4.68; 0.37; −0.21) | 0.216 | 0.164 | 0.624 | 0.211 | 0.163 | 0.712 |

The training and test results of the ANFIS-GP and ANFIS-SC models are given in Table 2 for each input combination. The optimal number of membership functions and parameter values are also provided in the second column of the table. From Table 2, it is clear that the ANFIS-GP model comprising four Gaussian membership functions for the inputs *u_{ws}*, *S_{ws}*, *z/H*, and *y/T* performs better than the other ANFIS-GP models for both periods. The training and test results are in agreement, which indicates proper calibration of the applied models. The comparison of ANFIS approaches reveals that the optimal ANFIS-GP model has better accuracy than the optimal ANFIS-SC model with four inputs.

According to the ANN-CG, ANN-LM, ANFIS-GP, and ANFIS-SC results given in Table 2, *y/T* seems to be the most effective variable on velocity distribution in streams. However, *y/T* was found to be the least effective variable with respect to correlation (correlation = 0.010). This implies a strong nonlinear relationship between *y/T* and velocity distribution. This is also valid for *z/H*, which has a considerable nonlinear effect on velocity distribution even though it has a low correlation (correlation = 0.362). The main reason for this is that these variables (*y/T* and *z/H*) are related to the wetted area, and velocity depends on the wetted area (continuity equation, *V* = *Q*/*A*, where *V*, *Q*, and *A* are mean velocity, discharge, and wetted area, respectively). It is apparent from Table 2 that the first two input combinations provide low accuracy in the applied models. Although *u_{ws}* and *S_{ws}* have high correlations with velocity, they are not sufficient for accurate estimation of velocity distribution.

Table 2 also reports the RMSE, MAE, and R^{2} statistics of the MLR models. The second column of this table indicates the regression coefficients of the MLR models. The MLR model with four inputs performs better than the other MLR models. Comparison of the ANN-CG, ANN-LM, ANFIS-GP, ANFIS-SC, and MLR models reveals that the data-driven ANN and ANFIS based models perform better than the MLR model in estimating velocity distribution. The optimal ANFIS-GP model comprising four inputs has the lowest RMSE (0.066) and MAE (0.046) and the highest R^{2} (0.971) values in the test period. The main advantage of the ANFIS-GP over the ANFIS-SC method is that it uses all possible rule combinations in its structure, while ANFIS-SC reduces the possible rules to a limited number by using a clustering algorithm. ANFIS-SC has a simpler structure than ANFIS-GP but is less accurate. Table 3 reports the cross-correlations among the estimates of the optimal models with four input variables in the test period. It is apparent that the MLR has the lowest correlations with the soft computing models. Among the other models, the ANN-LM has the highest correlation with ANFIS-GP, which is consistent with the ANN-LM ranking second in estimating velocity distribution.
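The point that a near-zero linear correlation can hide a strong nonlinear dependence, and the cross-correlations reported in Table 3, both rest on the Pearson correlation coefficient. The sketch below is illustrative only: the function name is hypothetical and the parabolic "velocity across the section" values are invented to show the effect, not taken from the measurements.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical lateral positions y/T and a symmetric, parabolic velocity
# pattern that peaks at mid-width (invented values for illustration).
y_over_t = [0.0, 0.25, 0.5, 0.75, 1.0]
velocity = [1.0 - (2.0 * p - 1.0) ** 2 for p in y_over_t]

r_lateral = pearson_r(y_over_t, velocity)
print(f"r = {r_lateral:.3f}")
```

In this invented case the velocity is fully determined by the lateral position, yet the linear correlation is zero, which is why a low correlation alone does not rule out an input as a useful predictor for a nonlinear model.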

Model | ANN-CG | ANN-LM | ANFIS-GP | ANFIS-SC | MLR |
---|---|---|---|---|---|
ANN-CG | 1 | | | | |
ANN-LM | 0.991 | 1 | | | |
ANFIS-GP | 0.963 | 0.970 | 1 | | |
ANFIS-SC | 0.975 | 0.974 | 0.961 | 1 | |
MLR | 0.855 | 0.856 | 0.838 | 0.857 | 1 |

The measured and estimated velocity distributions by the ANN-CG, ANN-LM, ANFIS-GP, ANFIS-SC, and MLR models are shown in Figure 5 for the test period. It is clear that the estimates of the ANN and ANFIS models are closer to the corresponding measured velocities than those of the MLR model. As seen from the figure, the ANN-CG, ANN-LM, and ANFIS-SC models underestimate some peaks, while the ANFIS-GP model generally estimates them well. The significant under- and over-estimations of the MLR model are clearly seen. The scatterplots of the estimates in the test period are illustrated in Figure 6. It is clear from the fit line equations (assume that the equation is y = *a*x + *b*) given in the scatterplots that the *a* and *b* coefficients of the ANFIS-GP model are closer to 1 and 0, respectively, with a higher R^{2} than those of the ANN-CG, ANN-LM, ANFIS-SC, and MLR models. It is clear from Figure 6 that the MLR model is insufficient in estimating the velocity distribution of the natural streams.

In order to verify the robustness (the significance of differences between the model estimates and measured velocity values) of the models, the results were also tested using one-way ANOVA. The test is set at a 95% confidence level. The test statistics are provided in Table 4. The ANN and ANFIS models yield smaller test statistics with higher significance levels than the MLR. According to the ANOVA results, the ANFIS and ANN models are more robust (the closeness between the measured velocity values and model estimates is significantly high) in estimating velocity distribution than the MLR model. Among the data-driven models, the ANN-CG and ANFIS-SC models seem to be better than the ANN-LM and ANFIS-GP models regarding robustness.
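The F-statistic of a one-way ANOVA is the ratio of the between-group to the within-group mean square. The sketch below shows that calculation for two groups, measured and estimated velocities. How the groups were formed in the study is not stated, so this grouping is an assumption, and the function name and sample values are hypothetical. The significance level additionally requires the F distribution and is not computed here.

```python
def one_way_anova_f(*groups):
    """F-statistic of a one-way ANOVA for two or more groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical velocities in m/s, for illustration only.
measured = [0.50, 0.80, 1.10, 0.30]
estimated = [0.55, 0.75, 1.00, 0.35]
f_stat = one_way_anova_f(measured, estimated)
print(f"F = {f_stat:.4f}")
```

A small F value means the group means are close relative to the scatter within each group, which corresponds to the high significance levels listed for the ANN and ANFIS models.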

Model | F-statistic | Resultant significance level |
---|---|---|
ANN-CG (4,7,1) | 0.003 | 0.959 |
ANN-LM (4,9,1) | 0.021 | 0.885 |
ANFIS-GP (gaussmf, 4) | 0.038 | 0.845 |
ANFIS-SC (gaussmf, 0.4) | 0.004 | 0.950 |
MLR | 0.076 | 0.783 |

## CONCLUSION

Estimating velocity distribution in streams by ANN-CG, ANN-LM, ANFIS-GP, ANFIS-SC, and MLR approaches was examined in this study. The 2,184 field data measured at four cross sections at four stations on the Sarımsaklı and Sosun streams in central Turkey were used in the applications. To predict velocity distribution, the water surface velocity *u_{ws}*, water surface slope *S_{ws}*, *z/H*, and *y/T* were used as inputs to the models. The accuracy of the ANN-CG, ANN-LM, ANFIS-GP, and ANFIS-SC models was compared with that of the MLR models. Comparison results showed that the ANN and ANFIS models provided better accuracy than the regression model in estimating velocity distribution. The ANFIS-GP model was observed to be better than the ANN-CG, ANN-LM, and ANFIS-SC models. The optimal ANFIS-GP model reduced the RMSE and MAE by 69% and 72%, respectively, and increased the determination coefficient by 36% with respect to the optimal MLR model. The study suggests that the ANFIS and ANN techniques can be effectively used for estimating the velocity distribution of streams.
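The percentage improvements quoted above follow directly from the test-period statistics of the four-input MLR and ANFIS-GP models in Table 2. The short sketch below reproduces that arithmetic; the helper name is hypothetical.

```python
def percent_change(reference, value):
    """Relative change of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


# Test-period statistics of the four-input models in Table 2.
mlr = {"rmse": 0.211, "mae": 0.163, "r2": 0.712}
anfis_gp = {"rmse": 0.066, "mae": 0.046, "r2": 0.971}

rmse_reduction = -percent_change(mlr["rmse"], anfis_gp["rmse"])
mae_reduction = -percent_change(mlr["mae"], anfis_gp["mae"])
r2_increase = percent_change(mlr["r2"], anfis_gp["r2"])

print(round(rmse_reduction), round(mae_reduction), round(r2_increase))
```

Rounded to whole percentages, these give the 69% and 72% reductions in RMSE and MAE and the 36% increase in the determination coefficient.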

In this study, 2,184 field data measured at four cross-sections at four stations were used for model development. More data from different locations may improve the models' accuracy and, in this way, the generalization of the applied models.