Two new soft computing models, namely genetic programming (GP) and a genetic artificial algorithm (GAA) neural network (a combination of modified genetic algorithm and artificial neural network methods), were developed to predict the percentage of shear force in a rectangular channel with non-homogeneous roughness. The ability of these methods to estimate the percentage of shear force was investigated. Moreover, the effectiveness of the independent parameters in predicting the percentage of shear force was determined using sensitivity analysis. According to the results, the GP model outperformed the GAA model. The GP program identified as the best model was also compared with five equations obtained in prior research. The GP model, with the lowest error (root mean square error (RMSE) of 0.0515), performed best compared with the other equations presented for rough and smooth channels as well as smooth ducts. The equation proposed for rectangular channels with rough boundaries (RMSE of 0.0642) outperformed the prior equations for smooth boundaries.

## INTRODUCTION

Boundary shear stress is an important parameter in steady, fully developed open channel flow. Reliable prediction of the boundary shear distribution in open channel flow is very difficult in many critical engineering problems, such as sedimentation, channel design and energy loss calculation. The boundary shear stress distribution in rectangular channels with smooth or rough boundaries has been investigated by many researchers, such as Knight (1981), Knight *et al.* (1984) and Bilgil (2005). Because shear stress distribution in channels is difficult to measure by direct and indirect methods, a number of studies have been carried out to calculate it by different approaches. Some studies have focused on numerical or analytical means of predicting shear stress distribution (Berlamont *et al.* 2003; Guo & Julien 2005; Yu & Tan 2007; Ansari *et al.* 2011; Bonakdari *et al.* 2015; Sheikh & Bonakdari 2015).

Soft computing methods are now widely used to forecast phenomena in many fields. Techniques such as artificial neural networks (ANNs), evolutionary computation, fuzzy logic, genetic programming (GP) and gene-expression programming have been successfully applied to water engineering problems in recent years (Azamathulla & Jarrett 2013; Ebtehaj & Bonakdari 2014; Zaji & Bonakdari 2015; Azamathulla 2015). The boundary shear force carried by the walls is a major parameter related to mean shear stress. Only a few studies have addressed predicting shear stress in channels using soft computing methods. Cobaner *et al.* (2010) investigated the ability of the ANN approach to predict the percentage of shear force acting on walls (%*SF _{w}*) in smooth open channels and ducts. The authors indicated that the ANN model performed better than equations previously obtained by other researchers. In this study, the capability of GP and genetic artificial algorithm (GAA) models in predicting the %*SF _{w}* in rectangular channels with non-homogeneous roughness is investigated. To this end, two models are developed and the best one is selected. The performance of the best model is also compared with equations obtained by Knight (1981), Knight *et al.* (1984, 1994) and Seckin *et al.* (2006) for rectangular channels, and an equation proposed by Rhodes & Knight (1994) for smooth ducts.

## MODELS

### GP model

GP was introduced by Koza (1994) and is among the more practical genetic algorithm (GA) applications. The GP method is very similar to the GA, except that the chromosomes are in fact computer programs. GP begins with a random initial population consisting of computer programs. Each program is run to evaluate the cost of the corresponding chromosome, and the cost is calculated using a fitness function. Then, by sorting the initial population by cost and applying the crossover, mutation and elitism operators of the GA, a new population is produced. The main objective of GP is to find a computer program that can accurately predict shear force using the given input variables.

Three major variables affect GP performance: (i) selecting the appropriate input variables; (ii) selecting the functions that GP is permitted to use in the computer programs (functions are arithmetic operators such as + and − or mathematical functions such as exp, sin, power and logical functions like OR and NOT; in this study, four different function combinations are tested); and (iii) selecting a suitable fitness function (a fitness function is used to evaluate the efficiency of each individual, and selecting the appropriate fitness function directly affects the model's performance). The mean squared error (MSE) and mean absolute error (MAE) statistics methods are herein regarded as fitness functions.
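The GP loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function set, the tree encoding as nested tuples, the population settings and all names (`FUNCTIONS`, `random_program`, `evolve`, and so on) are assumptions chosen for illustration, and crossover is omitted for brevity so that only sorting by cost, elitism and mutation are shown.

```python
import math
import random

# Illustrative function set (an assumption, not one of the paper's four sets).
FUNCTIONS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "exp": (1, lambda a: math.exp(max(-50.0, min(a, 50.0)))),  # clamped to avoid overflow
}


def random_program(variables, depth, rng):
    """Grow a random expression tree: a variable name, a constant, or (function, *children)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(variables) if rng.random() < 0.7 else round(rng.uniform(-1, 1), 3)
    name = rng.choice(sorted(FUNCTIONS))
    arity = FUNCTIONS[name][0]
    return (name,) + tuple(random_program(variables, depth - 1, rng) for _ in range(arity))


def evaluate(program, row):
    """Evaluate a program on one sample (a dict mapping variable name to value)."""
    if isinstance(program, tuple):
        func = FUNCTIONS[program[0]][1]
        return func(*(evaluate(child, row) for child in program[1:]))
    if isinstance(program, str):
        return row[program]
    return program  # numeric constant


def mae_fitness(program, rows, targets):
    """Mean absolute error of a program; non-finite errors are treated as the worst cost."""
    err = sum(abs(evaluate(program, r) - t) for r, t in zip(rows, targets)) / len(rows)
    return err if math.isfinite(err) else math.inf


def mse_fitness(program, rows, targets):
    """Mean squared error of a program; non-finite errors are treated as the worst cost."""
    err = sum((evaluate(program, r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)
    return err if math.isfinite(err) else math.inf


def mutate(program, variables, rng):
    """Replace a randomly chosen subtree with a fresh random one."""
    if not isinstance(program, tuple) or rng.random() < 0.3:
        return random_program(variables, 2, rng)
    children = list(program[1:])
    i = rng.randrange(len(children))
    children[i] = mutate(children[i], variables, rng)
    return (program[0],) + tuple(children)


def evolve(rows, targets, variables, fitness, generations=30, pop_size=60, elite=5, seed=0):
    """Sort by cost, keep the elite unchanged, and refill the population with mutated parents."""
    rng = random.Random(seed)
    population = [random_program(variables, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, rows, targets))
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), variables, rng) for _ in range(pop_size - elite)]
        population = population[:elite] + children
    return min(population, key=lambda p: fitness(p, rows, targets))
```

Swapping `mae_fitness` for `mse_fitness` in the call to `evolve` changes only the cost used for ranking, which mirrors the fitness-function comparison described in the text.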

### GAA model

## DATA USED

^{−4}. The author assumed that %*SF _{w}* varies exponentially with aspect ratio, *B/h*, and relative roughness, as follows:

where *α* is a function of aspect ratio:

Knight *et al.* (1984) analyzed their own data as well as that of Knight (1981), plotted them on a log-log scale, assumed a simple relationship between %*SF _{w}* and *B/h*, and derived the following equation:

$$\log_{10}(\%SF_w) = A_1 \log_{10}\left(\frac{B}{h} + A_2\right) + A_3$$

where *A*_{1} = −1.4026, *A*_{2} = 3.00 and *A*_{3} = 2.6692.

Knight *et al.* (1994) suggested a general equation for smooth rectangular channels considering the relationship between %*SF _{w}* and the wetted perimeter ratio, *P _{b}/P _{w}*, as follows:

Seckin *et al.* (2006) experimentally studied boundary shear stress and shear force distributions in smooth rectangular channels. They derived a nonlinear regression-based equation from the experimental analysis to obtain the percentage of total shear force carried by walls as follows:

For smooth rectangular ducts, Rhodes & Knight (1994) carried out a series of experiments considering aspect ratios of up to 50 and proposed the following equation:

According to these relations, some parameters, such as channel geometry (*B*), flow depth (*h*), bed and wall roughness (*k _{sb}*, *k _{sw}*), energy slope (*S _{f}*), flow velocity (*V*), fluid density (*ρ*), gravitational acceleration (*g*) and hydraulic radius (*R*), affect the shear force carried by walls. Thus, %*SF _{w}* can be expressed as a function of these parameters. By using Buckingham's theorem, the dimensionless parameters affecting %*SF _{w}* are represented as follows:

$$\%SF_w = f\left(\frac{B}{h},\ Fr,\ Re,\ \frac{k_{sb}}{k_{sw}}\right)$$

where Fr is the Froude number and Re is the Reynolds number. About 75% of all data were selected randomly for training and the remainder were used for testing.
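The data preparation described here can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the Froude number is taken as the common open-channel definition Fr = V / sqrt(g h), and the Reynolds number is left out because its length scale and the fluid viscosity are not given in the text.

```python
import math
import random


def dimensionless_inputs(B, h, V, k_sb, k_sw, g=9.81):
    """Return (B/h, Fr, k_sb/k_sw) for one experiment.

    Assumption: Fr = V / sqrt(g * h); the paper's exact definition is not shown.
    """
    aspect_ratio = B / h
    froude = V / math.sqrt(g * h)
    roughness_ratio = k_sb / k_sw
    return aspect_ratio, froude, roughness_ratio


def split_train_test(samples, train_fraction=0.75, seed=42):
    """Shuffle a copy of the samples and split it into training and testing subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]
```

Fixing the seed makes the random 75/25 split reproducible, which matters when comparing models trained on the same partition.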

## ANALYSIS OF GP AND GAA MODELS

*X _{n}* is the normalized value, and *X _{min}* and *X _{max}* are the minimum and maximum values of the variables, respectively. Subsequently, in order to select the best among the GP and GAA models, different input combinations were investigated. Eight different combinations were grouped as quaternary, ternary and binary, as follows: (1) *B/h*, Fr, Re and *k _{sb}*/*k _{sw}*; (2) *B/h*, Fr and Re; (3) *B/h*, Fr and *k _{sb}*/*k _{sw}*; (4) *B/h*, Re and *k _{sb}*/*k _{sw}*; (5) Fr, *k _{sb}*/*k _{sw}* and Re; (6) *B/h* and *k _{sb}*/*k _{sw}*; (7) *B/h* and Fr; and (8) Fr and *k _{sb}*/*k _{sw}*. The root mean square error (RMSE) served to check the models' accuracy in each step. The RMSE, MSE, and MAE statistical indexes are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_{p,i} - X_{m,i}\right)^{2}}$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(X_{p,i} - X_{m,i}\right)^{2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|X_{p,i} - X_{m,i}\right|$$

where *X _{p}* is the wall shear stress predicted by the model, *X _{m}* is the wall shear stress measured in the laboratory and *n* is the number of studied dataset samples.
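The three error indexes can be computed with a few lines of Python. The sketch below uses the standard textbook definitions of RMSE, MSE and MAE; the function and argument names are illustrative choices, not taken from the paper.

```python
import math


def mse(predicted, measured):
    """Mean squared error between predicted and measured values."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(predicted)


def rmse(predicted, measured):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(predicted, measured))


def mae(predicted, measured):
    """Mean absolute error between predicted and measured values."""
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)
```

Because MSE squares each residual, it penalizes a few large errors more heavily than MAE does, which is one reason the two fitness functions can rank the same models differently.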

Upon selecting the best input combination for each model, the best fitness function was selected. In the GP model, input combination (2) and the default function were considered, and two fitness functions, MSE and MAE, were tested. The results indicated that for the GP model, the MAE fitness function (RMSE of 0.0521) outperformed the MSE fitness function (RMSE of 0.053). For the GAA model, however, the MSE fitness function (RMSE of 0.0791) outperformed the MAE fitness function (RMSE of 0.0873).

The final step in modeling with the GP and GAA methods differs for each method. In selecting the best GP model, four basic arithmetic operators and the mathematical functions were employed. When input combination (2) and MAE as the fitness function were used, four different structures were chosen to identify the best GP model. The principal range of the investigated functions is:

The results show that function *F*2, with an RMSE of 0.0515, outperforms functions *F*1, *F*3 and *F*4, with RMSE values of 0.0521, 0.0629 and 0.0699, respectively.

%*SF _{w}* plot of the two presented models for all data. As seen in Figure 3, the GP model made more appropriate predictions than the GAA model. According to the fit line equations of the scatter plots (assuming the equation is $y = a_1 x + a_2$), the *a*_{1} coefficient of the GP model is closer to 1 and the *a*_{2} coefficient is closer to 0 compared to the GAA model. It is obvious from these scatter plots that the GP model estimates are less scattered and closer to the exact line than those of the GAA model. Evidently, the GAA model estimates %*SF _{w}* with lower accuracy than the GP model, but the adjusted *R*^{2} value of the GAA model is higher than that of the GP model. This is acceptable because the output data of the GAA model are nearer to the mean of the experimental outputs.
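The scatter-plot diagnostics discussed here (fit line coefficients and adjusted *R*^{2}) can be reproduced with a short sketch. It assumes an ordinary least-squares line of the form y = a1·x + a2 fitted to predicted versus measured values, and the usual adjusted *R*^{2} correction with one predictor; how the paper computed these quantities is not stated, so the names and the formula choice are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a1 * x + a2; returns (a1, a2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a2 = mean_y - a1 * mean_x
    return a1, a2


def adjusted_r2(x, y, n_predictors=1):
    """Adjusted R^2 of the fitted line: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    a1, a2 = fit_line(x, y)
    n = len(x)
    mean_y = sum(y) / n
    ss_res = sum((yi - (a1 * xi + a2)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
```

Note that a high *R*^{2} of the fitted line measures how well the points follow some straight line, not how close they lie to the exact line y = x, so a model can score a higher adjusted *R*^{2} while still having a larger RMSE.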

%*SF _{w}* in rectangular channels. The best model program (Figure 4), in the form of Matlab code, was obtained by selecting input combination (2), MAE as the fitness function and mathematical function *F*2.

%*SF _{w}* for all data using the GP model and the five above-mentioned equations. According to Figure 5, the GP model program, with an adjusted *R*^{2} of 0.9652, performs best compared with the other equations. Among the existing equations, the one obtained by Knight (1981) performed best (adjusted *R*^{2} of 0.9396). The equations expressed by Knight *et al.* (1984, 1994) and Seckin *et al.* (2006) for smooth channels and Rhodes & Knight (1994) for smooth ducts predicted %*SF _{w}* well. It is worth noting that these equations predicted some points almost exactly for high aspect ratio values (*B/h* > 5). It can be deduced that as the aspect ratio (*B/h*) increases, roughness has less effect on the %*SF _{w}*. In the GP program, three important parameters were considered. However, input combination (2) does not include roughness, although the model made predictions very close to the experimental data. Knight's equation involves only two parameters, yet it accounts for roughness, and the results obtained with this relation were superior to those of the other mentioned equations. Nevertheless, the GP model produced better results than all of these equations.

## CONCLUSION

The efficiency of soft computing methods in predicting the %*SF _{w}* in rectangular channels with rough boundaries was investigated in this study. GP and GAA models were developed in three steps. First, the effective parameters were selected and different input combinations were tested to select the optimum combinations. Then the best fitness and transfer functions were studied and the superior one was accepted for both models. Finally, after developing the two models, their ability to predict the percentage of shear force was examined. According to the performance results, the GP model predicted %*SF _{w}* more accurately than the GAA model. The results of the proposed model were also compared with five equations presented by other researchers. The GP model predicted %*SF _{w}* with lower error (RMSE = 0.0515) and higher accuracy than the equations by Knight (1981), Knight *et al.* (1984, 1994), Seckin *et al.* (2006) and Rhodes & Knight (1994), with RMSE of 0.0642, 0.2413, 0.2327, 0.2424 and 0.2063, respectively. The equations proposed for smooth rectangular channels and ducts overestimated the %*SF _{w}* values. It can be deduced that the relations for smooth boundaries do not yield more accurate results for estimating %*SF _{w}* in channels with rough boundaries, except for high aspect ratios (*B/h* > 5). The equation obtained by Knight (1981) produced the most appropriate results after the best GP model, as the equation was also presented for rectangular channels with rough boundaries.