This article presents a fast and powerful new hybrid decision tree (DT) method based on multilayer perceptron neural networks (MLP-NN) to determine the limiting velocity in sediment transport for preventing solid matter deposition. The parameters with the greatest influence on limiting-velocity prediction are drawn from the literature to build the MLP-DT-based model in this study. The effect of each parameter that appears in the functional relationships of previous studies is first surveyed by means of sensitivity analysis with the MLP-NN. After identifying the most effective parameters, the hybrid MLP-DT method is used to predict the limiting velocity. A comparison between MLP (*R*^{2} = 0.957, *MARE* = 0.072, *RMSE* = 0.434, *SI* = 0.107, *BIAS* = 0.029) and MLP-DT (*R*^{2} = 0.975, *MARE* = 0.063, *RMSE* = 0.328, *SI* = 0.081, *BIAS* = −0.01) shows that combining MLP with DT improves the ability of the MLP-NN to predict the limiting velocity required to prevent sediment deposition. The approach developed in this study yields explicit expressions for practical applications.

## INTRODUCTION

The flow entering pipe channels usually contains suspended solid substances. Depending on the runoff velocity, solid substances along channel paths are washed away and transported with the passing flow. The suspended solid substances are deposited on the channel bed when the velocity of the flow entering a pipe channel with a constant gradient is less than a specific value (the limiting velocity). The simplest way to determine the limiting velocity is to use a fixed velocity. Different limiting velocity values have been adopted in a number of countries, but since this method does not consider the properties of the flow and sediment, it does not always give good results under different geographic and hydraulic conditions. As a result, the limiting velocity can be underestimated or overestimated (Bonakdari & Ebtehaj 2014). For this reason, several analytical and experimental studies have been conducted (Ab Ghani 1993; Nalluri & Ab Ghani 1996; Banasiak 2008; Almedeij 2012; Ota & Perrusquía 2013) to examine the effect of the hydraulic parameters that determine the limiting velocity (e.g. pipe diameter, hydraulic radius, particle size and sediment concentration). Researchers have used their own study results together with nonlinear regression (NLR) to present equations for determining the limiting velocity. The main drawback of NLR-based equations is that they do not produce good results for nonlinear and complex problems. For instance, when applied to seven sets of data (presented by Ackers *et al.* 1996), the equation of May *et al.* (1996) for predicting the limiting velocity departed substantially from the actual values above a specific velocity, producing an uneconomical design (Ebtehaj *et al.* 2014).

Artificial intelligence (AI) results are generally superior to those from classical methods such as NLR, but occasionally they are not good. Therefore, different hybrid methods have been presented in recent years to mitigate this problem. Memarian *et al.* (2013) evaluated the performance of a hybrid of an artificial neural network (ANN) and the genetic algorithm (GA) in predicting sediment load. According to the results, ANN-GA performs reliably in predicting sediment load. Ebtehaj & Bonakdari (2014b) demonstrated that the multilayer perceptron (MLP) does not perform very well over a large range of different experimental data, and that hybrid methods must be used to increase prediction accuracy. By using two different evolutionary algorithms, namely the imperialist competition algorithm and particle swarm optimization, Ebtehaj & Bonakdari (2015) optimized MLP neural network (MLP-NN) weighting. The results signify that both proposed algorithms performed well compared with MLP.

One way to increase ANN performance is to hybridize this AI method with decision trees (DTs), an approach that has been employed in real-life problems (Chang & Chen 2009). Tsai *et al.* (2012) employed DTs as a classification approach to enhance ANN capability in water-stage forecasting. The authors compared the results of the proposed method with conventional MLP and found that classification can increase the capability of an ANN in water-stage prediction. The goal of the present study is to compare the performance of MLP-DT with the simple MLP method in predicting the minimum velocity required to prevent sediment deposition. The key aim is to show the performance gain obtained by combining a classification method with a NN. First, to survey the effect of each parameter presented by Ebtehaj & Bonakdari (2014b), a sensitivity analysis is carried out using simple MLP. The best input combination found by MLP is then used for MLP-DT modeling, and the results of MLP and MLP-DT are compared.

## METHODS

### Sediment transport modeling

Previous studies (… *et al.* 2010; Ebtehaj *et al.* 2014) indicate that sediment transport in pipes depends on the properties of the sediment, channel hydraulics and flow. Thus, the parameters considered to affect the minimum velocity are the volumetric sediment concentration (*C*_{V}), the flow depth (*y*), the hydraulic radius (*R*), the pipe diameter (*D*), the mean diameter of particles (*d*), the specific gravity of the sediment (*s* = *ρ*_{s}/*ρ*), gravity acceleration (*g*) and the overall sediment friction factor (*λ*_{s}).

In recent works by Ebtehaj & Bonakdari (2014a, 2014b, 2015), dimensionless parameters are categorized in five different groups: movement (*Fr*), transport (*C*_{V}), sediment (*D*_{gr}, *d/D*), transport mode (*d/R*, *D*^{2}/*A*, *R/D*) and flow resistance (*λ*_{s}). The effects of the other four dimensionless groups are therefore considered in predicting the Froude number (*Fr*) in the ‘movement’ group. Taking into account that the ‘sediment’ group has two parameters, ‘transport mode’ has three parameters, and the ‘transport’ and ‘flow resistance’ groups have only one parameter each, Ebtehaj & Bonakdari (2014b) presented six different models in order to consider all parameters. The best results were obtained when *Fr* = *f*(*C*_{V}, *d/D*, *d/R*, *λ*_{s}).

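A sensitivity analysis is conducted in the present research using MLP-NN to assess the effect of each parameter in the model proposed by Ebtehaj & Bonakdari (2014b). The models examined, as listed in Table 1, are Model A, *Fr* = *f*(*C*_{V}, *d/D*, *d/R*, *λ*_{s}); Model B, *Fr* = *f*(*C*_{V}, *d/D*, *d/R*); Model C, *Fr* = *f*(*C*_{V}, *d/D*, *λ*_{s}); Model D, *Fr* = *f*(*C*_{V}, *d/R*, *λ*_{s}); and Model E, *Fr* = *f*(*d/D*, *d/R*, *λ*_{s}).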

The best model is selected after examining the effect of every parameter. The selected model is then also modeled with the hybrid DT-based MLP (MLP-DT) method.
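The leave-one-out structure of this sensitivity analysis can be sketched in a few lines of Python. This is only an illustration: the input labels are plain strings chosen here, and the function name `sensitivity_models` is hypothetical. The ordering (drop the last input first) is chosen so that the output matches the Model A to E labels used in Table 1.

```python
# Inputs of the full functional relationship (Model A); labels are illustrative strings.
FULL_INPUTS = ("C_V", "d/D", "d/R", "lambda_s")


def sensitivity_models(inputs):
    """Return the full input set followed by every leave-one-out subset.

    Inputs are dropped from last to first, which reproduces the
    Model A..E ordering of Table 1 for the four inputs above.
    """
    models = [tuple(inputs)]
    for dropped in reversed(range(len(inputs))):
        models.append(tuple(x for i, x in enumerate(inputs) if i != dropped))
    return models


for label, combo in zip("ABCDE", sensitivity_models(FULL_INPUTS)):
    print(f"Model {label}: Fr = f({', '.join(combo)})")
```

Each tuple would then be used as the input set of one MLP, and the resulting error indexes compared across the five models.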

### Data collection

In this study, 218 data points collected from the literature (Ab Ghani 1993; Ota & Nalluri 1999; Vongvisessomjai *et al.* 2010) are utilized to predict the limiting velocity in sediment transport in pipe channels. To survey sediment transport at the limit of deposition, Ab Ghani (1993) performed experimental tests using pipes of three different diameters (0.154, 0.305 and 0.45 m) with a length of 20.5 m. The maximum flow discharge in Ab Ghani's (1993) tests was 0.04 m^{3}/s. The smallest and largest pipes (0.154 and 0.45 m) were used with a smooth bed, while the 0.305 m pipe was used with both smooth and rough beds.

Using 18 m long pipes with 0.305 m diameter, Ota & Nalluri (1999) studied the sediment gradation. The authors conducted 24 tests in limit-of-deposition state in uniform (*d* = 0.71–5.61 mm) and non-uniform (*d* = 2 mm) conditions.

Vongvisessomjai *et al.* (2010) conducted different experimental tests with two pipes 16 m long. The pipe diameters were 0.1 and 0.15 m. The authors utilized different slopes (0.002, 0.004, 0.006) and sediments (*d* = 0.2, 0.3, 0.43 mm). The mean velocity was calculated as the average of the velocity near the bed, at intermediate depth and at the flow surface.

In order to train and test the considered models, 70% of the entire dataset was randomly selected as the training dataset and the remaining 30% was the testing dataset. The dataset ranges are: 0.237 < *V* (m/s) < 1.216; 1 < *C*_{V} (ppm) < 1,280; 0.072 < *d* (mm) < 8.3; 0.005 < *R* (m) < 0.136; 0.153 < *y/D* < 0.84 and 0.1 < *D* (m) < 0.45.
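A random 70/30 split of this kind could be sketched as follows. The seed, the rounding of the training-set size and the function name `train_test_split` are assumptions made for illustration; the study does not report how the random selection was implemented or the exact subset sizes.

```python
import random


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into training and testing subsets.

    The seed and the rounding rule are illustrative assumptions.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle of the indices
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# 218 placeholder records standing in for the experimental data points.
records = list(range(218))
train, test = train_test_split(records)
print(len(train), len(test))
```

Every record ends up in exactly one of the two subsets, so the testing data are never seen during training.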

### MLP ANN

(… *et al.* 2010). Therefore, in all MLP models applied in this study, the number of hidden layer neurons is determined by trial and error. In this procedure, for each input combination, the MLP is modeled with every number of hidden layer neurons from one up to the maximum allowable number, and the model with the highest accuracy is selected. In addition, this study uses the Levenberg–Marquardt (LM) algorithm (Levenberg 1944) to train the MLP model and determine the weights used in the hidden and output layers’ weighted summations. Training stops after 100 epochs, by which point the MLP model has converged; all models in the present study nearly converge within 50–60 epochs. Figure 1 shows the mean squared error of modeling at each epoch, obtained during the modeling procedure with the input variables. In this figure, convergence is reached at epoch 57.
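The trial-and-error search over hidden neuron counts amounts to a simple loop. In the sketch below, `evaluate` is a hypothetical callable that would train an MLP with the given number of hidden neurons and return its error; here it is replaced by a toy function so the example runs on its own. The upper limit of 12 neurons is the value stated later in this article.

```python
def select_hidden_neurons(evaluate, max_neurons):
    """Try 1..max_neurons hidden neurons and keep the count with the lowest error."""
    best_n, best_err = None, float("inf")
    for n in range(1, max_neurons + 1):
        err = evaluate(n)
        if err < best_err:
            best_n, best_err = n, err
    return best_n, best_err


def toy_error(n):
    """Hypothetical stand-in for 'train an MLP with n hidden neurons, return its error'."""
    return (n - 7) ** 2 + 1.0


best_n, best_err = select_hidden_neurons(toy_error, max_neurons=12)
print(best_n, best_err)
```

In practice the error returned by `evaluate` would be one of the indexes used in this study, such as the RMSE.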

### Hybrid DT-based MLP model

The novel, hybrid MLP-DT method is introduced in this section. This method uses the DT classification algorithm (Breiman *et al.* 1993) to increase MLP regression performance. The aim of DT classification is to predict the class of each sample using the input variables. In the DT procedure, recursive splitting is used to find the most appropriate partitions with the most suitable input variables. The whole dataset is first divided into parts using each of the input variables in turn in order to find the most appropriate input variable and splitting position. Subsequently, different input variables and split positions are tried on the partitioned dataset to identify the next split variables and positions. This procedure is repeated so as to find the classification tree of the dataset with the minimum number of misclassified samples. The stop criterion of DT classification employed in this study is the minimum parent size (MPS). After each split, the number of samples in each partitioned zone is checked. If the sample number is higher than the MPS, the partitioned zone becomes a branch and is split once again in the classification procedure. However, if the sample number of the partitioned zone is less than the MPS, the zone is considered a leaf and leaves the classification procedure. DT classification is described in more detail by Breiman *et al.* (1993).
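The recursive splitting with an MPS stop criterion can be sketched as below. This is a minimal illustration, not the implementation used in the study: the Gini impurity as split criterion, the midpoint thresholds, the treatment of a zone with exactly MPS samples as splittable, and all function names and toy data are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score, n = None, gini(y), len(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbouring values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(X, y, min_parent_size):
    """Grow a tree; zones with fewer samples than min_parent_size become leaves."""
    majority = Counter(y).most_common(1)[0][0]
    if len(y) < min_parent_size:
        return {"leaf": majority}
    split = best_split(X, y)
    if split is None:  # zone is pure or cannot be improved
        return {"leaf": majority}
    f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], min_parent_size),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], min_parent_size),
    }


def predict(tree, row):
    """Follow the splits from the root until a leaf is reached."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]


# Toy data: two input columns and two classes (values are made up).
X = [[100, 0.01], [200, 0.02], [300, 0.01], [800, 0.02], [900, 0.01], [1000, 0.02]]
y = [0, 0, 0, 1, 1, 1]
tree = build_tree(X, y, min_parent_size=2)
stump = build_tree(X, y, min_parent_size=10)  # MPS above the sample count: a single leaf
print(predict(tree, [250, 0.02]), predict(tree, [950, 0.01]))
```

A larger MPS stops the splitting earlier, which gives a coarser tree with more misclassified samples.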

The first stage is to train the DT algorithm. By using the training input and output variables, the dataset is divided into a specific number of classes (in this study, four). In the DT training procedure, the one essential parameter that should be considered is classification precision. The goal of the MLP-DT is not to reach high classification precision but to attain a model with high regression performance. High DT classification precision may lead to overtraining in regression and decrease MLP-DT performance. However, the many misclassified samples of a weak DT classification model may also decrease the MLP-DT regression performance. DT classification performance is controlled using the MPS parameter. Clearly, using an MPS of 1 leads to a DT model with no misclassified samples, and using higher values of MPS leads to a DT with more misclassified samples. In order to find the optimum DT classification accuracy, the trial-and-error method is used. In this procedure, the MLP-DT is modeled with different values of MPS. It should be noted that it is the regression accuracy of MLP-DT that matters, not the classification accuracy. As such, the optimum value of MPS is the one for which MLP-DT prediction performance is highest. The results show that in the present dataset, the MLP-DT method performs best when MPS is 60.

In the second stage of the MLP-DT procedure, the simple MLP is divided into smaller MLPs. The number of smaller MLPs is equal to the considered number of classes. To make a fair comparison between the MLP and MLP-DT methods, the number of MLP hidden neurons is considered equal to the sum of the hidden neurons of the smaller MLPs. In the present study, the maximum allowable number of hidden neurons in the MLP method is 12. Thus, the sum of the maximum allowable number of hidden neurons of the four smaller MLPs for the MLP-DT model is 12.

A further step for each MLP-DT model is to find the appropriate number of hidden neurons for each smaller MLP. In this study, trial and error is applied to each of the smaller MLPs, and the optimum numbers of hidden layer neurons are those for which the prediction performance of the MLP-DT is highest. The last step in the MLP-DT method is to collect the results of the separate classes into one set. This is done to export the final model results and compare them with those of the simple MLP model in order to study the performance gain of the MLP-DT model. The MLP-DT algorithm is presented in Box 1.
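The second stage and the final collection step can be sketched as a classify-then-regress routine. The sketch below is an illustration under stated assumptions, not the algorithm of Box 1: `MeanRegressor` is a placeholder that stands in for one of the smaller MLPs, `classify` is a hypothetical one-split rule standing in for the trained DT, and the data values are made up.

```python
from collections import defaultdict
from statistics import mean


class MeanRegressor:
    """Placeholder for one smaller MLP: predicts the mean training target."""

    def fit(self, X, y):
        self.value = mean(y)
        return self

    def predict(self, row):
        return self.value


def fit_hybrid(X, y, classify, make_regressor):
    """Train one regressor per class assigned by the classifier."""
    groups = defaultdict(lambda: ([], []))
    for row, target in zip(X, y):
        rows, targets = groups[classify(row)]
        rows.append(row)
        targets.append(target)
    return {label: make_regressor().fit(rows, targets)
            for label, (rows, targets) in groups.items()}


def predict_hybrid(models, classify, X):
    """Route each sample to its class model; results keep the input order."""
    return [models[classify(row)].predict(row) for row in X]


def classify(row):
    """Hypothetical one-split rule standing in for the trained decision tree."""
    return 0 if row[0] <= 550 else 1


X_train = [[100], [300], [800], [1000]]
y_train = [2.0, 3.0, 5.0, 6.0]
models = fit_hybrid(X_train, y_train, classify, MeanRegressor)
predictions = predict_hybrid(models, classify, [[200], [900]])
print(predictions)
```

Because the predictions are returned in input order, the combined output can be compared directly with that of a single MLP trained on the whole dataset.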

## RESULTS AND DISCUSSION

With *Fr* expressed as *Fr* = *f*(*C*_{V}, *d/D*, *d/R*, *λ*_{s}), the results obtained from the sensitivity analysis of this functional relationship are presented as Models A to E in Table 1 and are examined in this section. Figure 3 displays the results obtained from the qualitative examination of the sensitivity analysis. Model A, which considers all four parameters (*C*_{V}, *d/D*, *d/R*, *λ*_{s}) for predicting *Fr*, presents relatively good results (*R*^{2} = 0.95, *MARE* = 0.076, *RMSE* = 0.461, *SI* = 0.114, *BIAS* = −0.07), with most values predicted by this model having a relative error of less than 10%. This holds for almost all *Fr* predictions, although some predictions exhibit a relative error greater than 10%. Model B, which considers three parameters (*C*_{V}, *d/D*, *d/R*) for predicting *Fr*, produces results similar to those of Model A (*R*^{2} = 0.957, *MARE* = 0.072, *RMSE* = 0.434, *SI* = 0.107, *BIAS* = 0.029). According to Figure 3, however, Model B has smaller errors in the cases where *Fr* is predicted with a relative error greater than 10%. Therefore, eliminating *λ*_{s} does not decrease *Fr* prediction accuracy; it actually increases it slightly.

Model C does not consider *d/R* as an effective parameter in predicting *Fr*. Comparing Model C with Model A indicates that omitting this parameter significantly decreases the prediction accuracy, such that the majority of predictions made by this model are overestimated, which leads to an uneconomical design. Moreover, this model yields some predictions with a relative error of approximately 80% and a mean relative error of approximately 16.8% (*MARE* = 0.168). The other indexes presented in Table 1 also show that *Fr* prediction accuracy decreases when using this model. Therefore, using this model is not advisable, and including *d/R* as an effective parameter in predicting *Fr* is strongly recommended.

The *d/D* parameter also affects *Fr* prediction. Omitting it (Model D) decreases prediction accuracy such that, as the value of *Fr* increases, the relative error is mostly greater than 10%. However, it is evident from Figure 3 and Table 1 that this model performs better than Model C, since the mean relative error of Model D is almost 12% while the value of this index is more than 16% for Model C. It could therefore be stated that the *d/D* parameter influences *Fr* prediction less than the *d/R* parameter does. The reason may be that the hydraulic radius in the *d/R* parameter accounts for the effect of flow depth in addition to pipe diameter.

Model E shows that the volumetric sediment concentration (*C*_{V}) is the most important of all the parameters in Model A (*Fr* = *f*(*C*_{V}, *d/D*, *d/R*, *λ*_{s})). Omitting this parameter leads to overestimated *Fr* predictions and large relative errors (Figure 3), with a mean relative error greater than 20% (*MARE* = 0.228). Given the above, the best performing model according to the sensitivity analysis is Model B. Therefore, Model B is selected and modeled in this section with the hybrid DT-based MLP method (MLP-DT). Figure 2 presents the DT classification results that lead to the highest MLP-DT performance. According to this figure, the DT uses the *C*_{V} and *d/R* input variables in the classification procedure.

Dataset | Model | Functional relationship | *R*^{2} | *MARE* | *RMSE* | *SI* | *BIAS* |
---|---|---|---|---|---|---|---|
Train | MLP–Model A | *Fr* = *f*(*C*_{V}, *d/D*, *d/R*, *λ*_{s}) | 0.965 | 0.059 | 0.36 | 0.089 | 0.004 |
Train | MLP–Model B | *Fr* = *f*(*C*_{V}, *d/D*, *d/R*) | 0.974 | 0.051 | 0.31 | 0.076 | 0.01 |
Train | MLP–Model C | *Fr* = *f*(*C*_{V}, *d/D*, *λ*_{s}) | 0.889 | 0.123 | 0.645 | 0.159 | −0.001 |
Train | MLP–Model D | *Fr* = *f*(*C*_{V}, *d/R*, *λ*_{s}) | 0.95 | 0.068 | 0.434 | 0.107 | 0 |
Train | MLP–Model E | *Fr* = *f*(*d/D*, *d/R*, *λ*_{s}) | 0.825 | 0.153 | 0.81 | 0.2 | 0.001 |
Train | MLP-DT–Model B | *Fr* = *f*(*C*_{V}, *d/D*, *d/R*) | 0.983 | 0.038 | 0.255 | 0.063 | 0 |
Test | MLP–Model A | *Fr* = *f*(*C*_{V}, *d/D*, *d/R*, *λ*_{s}) | 0.95 | 0.076 | 0.461 | 0.114 | −0.07 |
Test | MLP–Model B | *Fr* = *f*(*C*_{V}, *d/D*, *d/R*) | 0.957 | 0.072 | 0.434 | 0.107 | 0.029 |
Test | MLP–Model C | *Fr* = *f*(*C*_{V}, *d/D*, *λ*_{s}) | 0.781 | 0.168 | 0.98 | 0.242 | −0.204 |
Test | MLP–Model D | *Fr* = *f*(*C*_{V}, *d/R*, *λ*_{s}) | 0.869 | 0.116 | 0.928 | 0.229 | −0.242 |
Test | MLP–Model E | *Fr* = *f*(*d/D*, *d/R*, *λ*_{s}) | 0.66 | 0.228 | 1.19 | 0.294 | −0.079 |
Test | MLP-DT–Model B | *Fr* = *f*(*C*_{V}, *d/D*, *d/R*) | 0.975 | 0.063 | 0.328 | 0.081 | −0.01 |

Figure 3 indicates that Model B MLP-DT predicts *Fr* well. This model makes most predictions with a relative error of less than 10%, and its accuracy does not decrease as *Fr* increases, whereas the accuracy of Model B MLP decreases to some extent as *Fr* increases. A quantitative comparison of Model B as modeled by the MLP and MLP-DT methods shows that every index in Table 1 favors MLP-DT. Therefore, using DT as a powerful data classification method improves *Fr* prediction.
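The indexes compared in Table 1 can be computed from observed and predicted values as sketched below. The definitions used here are common ones and are assumptions, since the defining equations are not reproduced in this text: *R*^{2} as the squared Pearson correlation, *MARE* as the mean absolute relative error, *SI* as *RMSE* divided by the mean observed value, and *BIAS* as the mean of predicted minus observed. The *SI* assumption is at least consistent with Table 1, where *RMSE*/*SI* is close to 4 in every row.

```python
import math


def performance_indexes(observed, predicted):
    """Compute R2, MARE, RMSE, SI and BIAS under the assumed definitions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mare = sum(abs(e) / abs(o) for e, o in zip(errors, observed)) / n
    bias = sum(errors) / n
    si = rmse / mean_obs  # scatter index: RMSE normalised by the mean observation

    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r2 = cov * cov / (var_obs * var_pred)  # squared Pearson correlation

    return {"R2": r2, "MARE": mare, "RMSE": rmse, "SI": si, "BIAS": bias}


# Toy values: every prediction is too high by exactly 1.
indexes = performance_indexes([2.0, 4.0, 6.0], [3.0, 5.0, 7.0])
print(indexes)
```

With this sign convention a positive *BIAS* indicates overestimation on average, although the convention adopted in the study may differ.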

The explicit expression for the model with the *C*_{V}, *d/D* and *d/R* input variables is presented in Equation (8). According to this equation, the hyperbolic tangent serves as the activation function.
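To show how an explicit expression with a hyperbolic tangent activation is evaluated, the following sketch computes the output of a one-hidden-layer MLP. The linear output layer, the function name `mlp_forward` and all weight values are assumptions for illustration; they are not the coefficients of Equation (8).

```python
import math


def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer MLP: tanh hidden units, linear output (assumed)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Made-up weights for three inputs (C_V, d/D, d/R) and two hidden neurons.
fr = mlp_forward(
    x=[0.5, 0.1, 0.2],
    hidden_weights=[[1.0, 0.0, 0.0], [0.0, 2.0, -1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[2.0, 1.0],
    output_bias=0.5,
)
print(fr)
```

Written out, the same computation is a sum of weighted tanh terms plus a constant, which is the form an explicit equation for practical use would take.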

## CONCLUSIONS

Pipe channels must be designed efficiently in view of the problems caused by sediment deposition on the pipe bed, such as decreased transport capacity. The required limiting velocity was predicted in this study using a new hybrid method comprising MLP based on decision trees (MLP-DT). The parameters in the functional relationship presented by Ebtehaj & Bonakdari (2014b) for predicting *Fr* first underwent sensitivity analysis with MLP. The results indicate that the overall sediment friction factor (*λ*_{s}) does not significantly affect *Fr* prediction and that omitting this parameter slightly improves *Fr* prediction accuracy. The *d/R* and *d/D* parameters affect *Fr* prediction more strongly, but not as much as the volumetric sediment concentration (*C*_{V}), which has the greatest effect. Omitting this parameter causes the mean relative prediction error to exceed 22% and leads to overestimated predictions, which make for an uneconomical design. The best functional relationship is therefore *Fr* = *f*(*C*_{V}, *d/R*, *d/D*) (Model B). This model was then modeled with MLP-DT. The results demonstrated that the MLP-DT hybrid model (*R*^{2} = 0.975, *MARE* = 0.063, *RMSE* = 0.328, *SI* = 0.081, *BIAS* = −0.01) is more accurate than MLP. Moreover, this hybrid model nearly eliminates overestimated predictions. The model presented in this study can ultimately offer an equation for practical use by engineers.