This article presents a fast and powerful new hybrid method that combines decision trees (DT) with multilayer perceptron neural networks (MLP-NN) to determine the limiting velocity in sediment transport required to prevent the deposition of solid matter. The parameters with the greatest influence on limiting-velocity prediction are drawn from the literature to build the MLP-DT-based model. The effect of each parameter that appears in the functional relationships of previous studies is first surveyed by means of a sensitivity analysis with the MLP-NN. After identifying the most effective parameters, the hybrid MLP-DT method is used to predict the limiting velocity. A comparison between MLP (R2 = 0.957, MARE = 0.072, RMSE = 0.434, SI = 0.107, BIAS = 0.029) and MLP-DT (R2 = 0.975, MARE = 0.063, RMSE = 0.328, SI = 0.081, BIAS = −0.01) shows that combining MLP with DT improves the ability of the MLP-NN to predict the limiting velocity required to prevent sediment deposition. The approach developed in this study yields explicit expressions for practical applications.

INTRODUCTION

The flow entering pipe channels usually contains suspended solid substances. Depending on the runoff velocity, solid substances along channel paths are washed away and transported with the passing flow. The suspended solid substances are deposited on the channel bed when the velocity of the flow entering a pipe channel with a constant gradient is less than a specific value (the limiting velocity). The simplest way to determine the limiting velocity is to use a fixed velocity. Different limiting velocity values have been adopted in a number of countries, but since this method does not consider the properties of the flow and sediment, it does not always give good results under different geographic and hydraulic conditions. As a result, the limiting velocity is underestimated or overestimated (Bonakdari & Ebtehaj 2014). For this reason, several analytical and experimental studies have been conducted (Ab Ghani 1993; Nalluri & Ab Ghani 1996; Banasiak 2008; Almedeij 2012; Ota & Perrusquía 2013) to examine the effect of the hydraulic parameters that govern the limiting velocity (e.g. pipe diameter, hydraulic radius, particle size and sediment concentration). Researchers have combined their own experimental results with nonlinear regression (NLR) to derive equations for determining the limiting velocity. The main drawback of NLR-based equations is that they do not produce good results for nonlinear and complex problems. For instance, when applied to seven sets of data (presented by Ackers et al. 1996), the equation of May et al. (1996) for predicting the limiting velocity produced an uneconomical design that departed widely from the actual values once the velocity exceeded a specific value (Ebtehaj et al. 2014).

Artificial intelligence (AI) results are generally superior to those from classical methods such as NLR, but they occasionally do not present good results. Therefore, different hybrid methods have been presented in recent years in order to mitigate this problem. Memarian et al. (2013) evaluated the performance of a hybrid artificial neural network (ANN) with the genetic algorithm (GA) method in predicting sediment load. According to the results, ANN-GA performs reliably in predicting sediment load. The results signify that both proposed algorithms performed well compared with multilayer perceptron (MLP). Ebtehaj & Bonakdari (2014b) demonstrated that MLP does not perform very well when using a large range of different experimental data, and hybrid methods must be used to increase prediction accuracy. By using two different evolutionary algorithms, namely the imperialist competition algorithm and particle swarm optimization, Ebtehaj & Bonakdari (2015) optimized MLP neural network (MLP-NN) weighting.

One way to increase ANN performance is to hybridize this AI method with decision trees (DTs), an approach that has been employed in real-life problems (Chang & Chen 2009). Tsai et al. (2012) employed DTs as a classification approach to enhance ANN capability in water-stage forecasting. The authors compared the results of the proposed method with those of a conventional MLP and found that classification can increase the capability of an ANN in water-stage prediction. The goal of the present study is to compare the performance of MLP-DT with that of the simple MLP method in predicting the minimum velocity required to prevent sediment deposition. The key aim is to show the performance gain obtained when a classification method is combined with an NN. First, to survey the effect of each parameter presented by Ebtehaj & Bonakdari (2014b), a sensitivity analysis is carried out using the simple MLP. The best input combination found by MLP is then used for MLP-DT modeling, and the results of MLP and MLP-DT are compared.

METHODS

Sediment transport modeling

The parameters influencing sediment transport must first be identified in order to examine and determine the minimum velocity required to transport sediment without deposition in pipe channels. Studies carried out in this field (Nalluri & Ab Ghani 1996; Vongvisessomjai et al. 2010; Ebtehaj et al. 2014) indicate that sediment transport in pipes depends on the properties of the sediment, channel hydraulics and flow. Thus, the parameters affecting the minimum velocity considered are as follows: 
formula
1
where CV is the volumetric sediment concentration, y is the flow depth, R is the hydraulic radius, D is the pipe diameter, d is the mean particle diameter, s (= ρs/ρ) is the specific gravity of the sediment, g is the gravitational acceleration and λs is the overall sediment friction factor.

In recent works by Ebtehaj & Bonakdari (2014a, 2014b, 2015), the dimensionless parameters are categorized into five groups: movement (Fr), transport (CV), sediment (Dgr, d/D), transport mode (d/R, D²/A, R/D) and flow resistance (λs). The effects of the other four dimensionless groups are therefore considered in predicting the Froude number (Fr), the parameter of the ‘movement’ group. Because the ‘sediment’ group has two parameters, the ‘transport mode’ group has three, and the ‘transport’ and ‘flow resistance’ groups have only one each, Ebtehaj & Bonakdari (2014b) presented six different models in order to consider all parameters. The best results were obtained when Fr = f(CV, d/D, d/R, λs).

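A sensitivity analysis is conducted in the present research using MLP-NN to examine the effect of each parameter in the model proposed by Ebtehaj & Bonakdari (2014b). The models considered are as follows:

Model A: Fr = f(CV, d/D, d/R, λs)
Model B: Fr = f(CV, d/D, d/R)
Model C: Fr = f(CV, d/D, λs)
Model D: Fr = f(CV, d/R, λs)
Model E: Fr = f(d/D, d/R, λs)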

The best model is selected after examining the effect of every parameter. The selected model is then also modeled with the hybrid DT-based MLP (MLP-DT) method.
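
The pattern behind Models A to E (see Table 1) is the full input set followed by every subset that leaves one parameter out. A minimal sketch of that enumeration is given below; the function name `sensitivity_models` and the string labels are illustrative assumptions, not part of the original study.

```python
from itertools import combinations

# Inputs of the full functional relationship Fr = f(CV, d/D, d/R, λs)
FULL_INPUTS = ("CV", "d/D", "d/R", "λs")

def sensitivity_models(inputs):
    """Return the full input set followed by every leave-one-out subset."""
    models = [tuple(inputs)]
    # combinations() keeps the original ordering of the remaining inputs
    models.extend(combinations(inputs, len(inputs) - 1))
    return models

for label, subset in zip("ABCDE", sensitivity_models(FULL_INPUTS)):
    print(f"Model {label}: Fr = f({', '.join(subset)})")
```

With this ordering of the inputs, the enumeration reproduces the input combinations listed for Models A to E in Table 1.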

Data collection

In this study, 218 data points collected from the literature (Ab Ghani 1993; Ota & Nalluri 1999; Vongvisessomjai et al. 2010) are used to predict the limiting velocity in sediment transport in pipe channels. To survey sediment transport at the limit of deposition, Ab Ghani (1993) performed experimental tests using pipes of three different diameters (0.154, 0.305 and 0.45 m) with a length of 20.5 m. The maximum flow discharge in the tests of Ab Ghani (1993) was 0.04 m³/s. The smallest and largest pipes (0.154 and 0.45 m) were used with a smooth bed, while the 0.305 m pipe was used with both smooth and rough beds.

Using 18 m long pipes with 0.305 m diameter, Ota & Nalluri (1999) studied the sediment gradation. The authors conducted 24 tests in limit-of-deposition state in uniform (d = 0.71–5.61 mm) and non-uniform (d = 2 mm) conditions.

Vongvisessomjai et al. (2010) conducted different experimental tests with two pipes 16 m long. The pipe diameters were 0.1 and 0.15 m. The authors utilized different slopes (0.002, 0.004, 0.006) and sediments (d = 0.2, 0.3, 0.43 mm). The mean velocity was calculated as the average of the velocity near the bed, at intermediate depth and at the flow surface.

In order to train and test the considered models, 70% of the entire dataset was randomly selected as the training dataset and the remaining 30% was the testing dataset. The dataset ranges are: 0.237 < V (m/s) < 1.216; 1 < CV (ppm) < 1,280; 0.072 < d (mm) < 8.3; 0.005 < R (m) < 0.136; 0.153 < y/D < 0.84 and 0.1 < D (m) < 0.45.
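
A random 70/30 partition of this kind can be sketched as follows. The seed, the rounding of the training size and the use of integer indices in place of the actual data records are assumptions made for illustration; the original study does not state them.

```python
import random

def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into training and testing parts."""
    shuffled = list(samples)
    # A fixed seed (an assumption here) makes the random split reproducible
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# 218 placeholder indices stand in for the collected data points
train, test = train_test_split(range(218))
print(len(train), len(test))
```

Every sample ends up in exactly one of the two parts, so the testing data are never seen during training.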

MLP ANN

Due to the flexible structure of MLP in predicting high-dimensional, complex problems (Haykin 1994), the MLP method has become one of the most common NNs for practical engineering problems. An MLP model is built from input, hidden and output layers. The input variables enter the MLP model through the input layer, so the number of input layer neurons is equal to the number of input variables of the model. The input layer transfers the input variables to the hidden layer. The hidden layer, which comprises hidden neurons, accumulates the input layer information using a separate weighted summation for each hidden neuron. After that, in order to mitigate the curse of dimensionality, the accumulated information is transferred into a nonlinear space using the transfer functions. The hidden layer's transfer function is a sigmoid. By definition, any function that is bounded and differentiable and that has a positive derivative at each point is called a sigmoid function (Smith 1993). The hyperbolic tangent transfer function, which is among the most practical, is used in this study as the hidden layer transfer function and is defined as: 
\[
f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2}
\]
Following the hidden layer's nonlinear transformation, the hidden neurons’ results are transferred to the output layer. The output layer, which prepares the model output, performs as a linear regressor. Thus, the final result of the MLP model is obtained from a weighted summation of the hidden neurons' results. There is no rule for determining the number of hidden neurons of an MLP model, so the trial-and-error method is commonly used (Bilhan et al. 2010). Therefore, in all MLP models applied in this study, the number of hidden layer neurons is determined by trial and error. In this procedure, for each input combination, the MLP is modeled with every number of hidden layer neurons from one up to the maximum allowable number, and the model with the highest accuracy is selected. In addition, this study uses the Levenberg–Marquardt (LM) algorithm (Levenberg 1944) to train the MLP model and determine the weights of the hidden and output layers. Training stops after 100 epochs, by which point the MLP model has converged; all models in the present study nearly converge within 50–60 epochs. Figure 1 shows the mean squared error at each epoch, obtained during the modeling procedure with the input variables. The figure shows that convergence is reached at epoch 57.
Figure 1

Convergence verification results for Model A.
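
The MLP structure described above (a weighted summation per hidden neuron, a hyperbolic tangent transfer function and a linear output layer) can be illustrated with a short forward-pass sketch. The bias terms and all numerical weights are hypothetical values chosen for illustration; training with the LM algorithm is not shown.

```python
import math

def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer MLP: tanh hidden neurons followed by a linear output neuron."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Separate weighted summation for each hidden neuron
        total = bias + sum(w * xi for w, xi in zip(weights, x))
        # Nonlinear transfer of the accumulated information into (-1, 1)
        hidden.append(math.tanh(total))
    # Linear output layer: weighted summation of the hidden-neuron results
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))

# Hypothetical weights for 3 inputs and 2 hidden neurons (illustration only)
y = mlp_forward(
    x=[0.5, -1.0, 2.0],
    hidden_weights=[[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.5, -2.0],
    output_bias=0.25,
)
print(y)
```

In the trial-and-error procedure, a function of this form would be evaluated for each candidate number of hidden neurons after training, and the most accurate configuration kept.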


Hybrid DT-based MLP model

The novel hybrid MLP-DT method is introduced in this section. This method uses the DT classification algorithm (Breiman et al. 1993) to increase MLP regression performance. The aim of the DT classification problem is to predict the class of each sample using the input variables. In the DT procedure, recursive splitting is used to find the most appropriate partitions with the most suitable input variables. The whole dataset is first divided into parts using each of the input variables in turn in order to find the most appropriate input variable and splitting position. Different input variables and split positions are then tried on the partitioned dataset to identify the next splitting variables and positions. This procedure is repeated so as to find the best classification tree of the dataset with the minimum number of misclassified samples. The stop criterion of DT classification employed in this study is the minimum parent size (MPS). After each split, the number of samples in each partitioned zone is checked. If the sample number is higher than the MPS, the partitioned zone is made into a branch and split once again in the classification procedure. However, if the sample number of the partitioned zone is less than the MPS, the zone is considered a leaf and leaves the classification procedure. DT classification is described in more detail by Breiman et al. (1993).
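
The recursive splitting with an MPS stop criterion can be sketched as below. This is a simplified illustration, not the implementation used in the study: the Gini impurity as the split score, the choice of observed values as split positions, the stop on pure zones and the toy data are all assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Try every input variable and split position; return the lowest-impurity split."""
    best = None  # (weighted impurity, variable index, threshold)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for t in values[:-1]:  # excluding the largest value keeps both sides non-empty
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def grow(rows, labels, min_parent_size):
    """Split recursively; a zone with fewer samples than the MPS becomes a leaf."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(labels) < min_parent_size or len(set(labels)) == 1:
        return {"leaf": majority}
    split = best_split(rows, labels)
    if split is None:  # all rows identical, nothing left to split on
        return {"leaf": majority}
    _, j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "var": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], min_parent_size),
        "right": grow([r for r, _ in right], [y for _, y in right], min_parent_size),
    }

def classify(tree, row):
    """Follow the branches until a leaf is reached and return its class."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["var"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]

# Toy data: two input variables, two classes (illustration only)
rows = [[1, 9], [2, 3], [3, 7], [4, 1], [5, 8], [6, 2]]
labels = [0, 0, 0, 0, 1, 1]
tree = grow(rows, labels, min_parent_size=1)
print(tree)
print(classify(tree, [3.5, 0]), classify(tree, [4.5, 0]))
```

A larger `min_parent_size` stops the splitting earlier, which yields a coarser tree with more misclassified samples.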

The role of the DT in the MLP-DT method is to optimize the allocation of the MLP model's regression power. In the MLP-DT method, instead of applying all the MLP regression power to the whole dataset, the dataset is first divided into classes, and simpler MLP models are then used to simulate each class separately. The MLP-DT procedure is as follows (Figure 2).
Figure 2

Most appropriate DT classification results.


The first stage is to train the DT algorithm. Using the training input and output variables, the dataset is divided into a specific number of classes (four in this study). In the DT training procedure, the one essential parameter to consider is classification precision. The goal of the MLP-DT is not to reach high classification precision but to attain a model with high regression performance. High DT classification precision may lead to overtraining in regression and decrease MLP-DT performance. Conversely, the many misclassified samples of a weak DT classification model may also decrease MLP-DT regression performance. DT classification performance is controlled using the MPS parameter. An MPS of 1 leads to a DT model with no misclassified samples, and higher MPS values lead to a DT with more misclassified samples. The optimum DT classification accuracy is found by trial and error: the MLP-DT is modeled with different MPS values, and because regression accuracy rather than classification accuracy is what matters, the optimum MPS is the one for which MLP-DT prediction performance is highest. The results show that, for the present dataset, the MLP-DT method performs best when the MPS is 60.

In the second stage of the MLP-DT procedure, the simple MLP is divided into smaller MLPs. The number of smaller MLPs is equal to the considered number of classes. To make a fair comparison between the MLP and MLP-DT methods, the number of MLP hidden neurons is considered equal to the sum of the hidden neurons of the smaller MLPs. In the present study, the maximum allowable number of hidden neurons in the MLP method is 12. Thus, the sum of the maximum allowable number of hidden neurons of the four smaller MLPs for the MLP-DT model is 12.
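
To give a sense of the search space this constraint defines, the sketch below enumerates the hidden-neuron allocations for four smaller MLPs. It assumes that each smaller MLP has at least one hidden neuron and that 12 is a cap on the total; the function name is hypothetical.

```python
from itertools import product

def neuron_allocations(n_models=4, max_total=12):
    """All ways to give each sub-model at least one hidden neuron within the total budget."""
    return [
        combo
        for combo in product(range(1, max_total + 1), repeat=n_models)
        if sum(combo) <= max_total
    ]

allocations = neuron_allocations()
print(len(allocations))  # number of candidate allocations
print(allocations[0], allocations[-1])
```

Under these assumptions there are 495 candidate allocations, which is small enough for an exhaustive trial-and-error search.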

Another process done for each MLP-DT model is to find the appropriate number of hidden neurons for each smaller MLP. In this study, trial and error is applied for each of the smaller MLPs. It is obvious that the optimum numbers of hidden layer neurons of these smaller MLPs are obtained when the prediction performance of the MLP-DT is highest. The last procedure in the MLP-DT method is to collect the separated classes’ results into one unit. This is done to export the final model results and compare them with the simple MLP model in order to study the performance increment of the MLP-DT model. The MLP-DT algorithm is presented in Box 1.

Box 1
The MLP-DT procedure.
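
The overall flow of the procedure (classify, fit one small model per class, route each sample to its class model, then collect the results) can be sketched as follows. To keep the example short and self-contained, a least-squares line stands in for each smaller MLP and a fixed threshold rule stands in for the trained DT; both are stand-ins, not the methods used in the study, and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = a*x + b (stand-in for a small per-class MLP)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def train_hybrid(xs, ys, classify):
    """Fit one small model per class assigned by the classifier."""
    groups = {}
    for x, y in zip(xs, ys):
        gx, gy = groups.setdefault(classify(x), ([], []))
        gx.append(x)
        gy.append(y)
    return {label: fit_line(gx, gy) for label, (gx, gy) in groups.items()}

def predict_hybrid(xs, classify, models):
    """Route each sample to its class model and collect results in input order."""
    out = []
    for x in xs:
        a, b = models[classify(x)]
        out.append(a * x + b)
    return out

def classify(x):
    """Hypothetical stand-in for a trained DT: two classes split at x = 5."""
    return 0 if x <= 5 else 1

xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [2 * x if x <= 5 else 20 - x for x in xs]  # piecewise-linear target
models = train_hybrid(xs, ys, classify)
predictions = predict_hybrid([2.5, 7.5], classify, models)
print(predictions)
```

A single line fitted to the whole toy dataset could not follow the change in slope at x = 5, whereas the two per-class lines reproduce it, which mirrors the motivation for dividing the dataset before regression.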

RESULTS AND DISCUSSION

The performance of the models presented in this paper consisting of MLP and MLP-DT is evaluated in this section with different statistical indexes: coefficient of determination (R2), mean absolute relative error (MARE), root mean square error (RMSE), scatter index (SI) and BIAS: 
formula
3
 
formula
4
 
formula
5
 
formula
6
 
formula
7
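
The five indexes can be computed as in the sketch below. The definitions used are common forms and are assumptions here: R2 as the squared Pearson correlation, MARE relative to the observed values, SI as RMSE divided by the mean observed value, and BIAS as the mean of predicted minus observed. The sample values are hypothetical.

```python
import math

def evaluate(observed, predicted):
    """Return R2, MARE, RMSE, SI and BIAS under the assumed definitions."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return {
        "R2": cov ** 2 / (var_o * var_p),  # squared correlation (assumed form)
        "MARE": sum(abs(o - p) / o for o, p in zip(observed, predicted)) / n,
        "RMSE": rmse,
        "SI": rmse / mean_o,  # RMSE scaled by the mean observed value
        "BIAS": sum(p - o for o, p in zip(observed, predicted)) / n,
    }

# Hypothetical observed and predicted Fr values (illustration only)
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.2, 3.8, 6.3, 7.9]
scores = evaluate(observed, predicted)
for name, value in scores.items():
    print(f"{name} = {value:.3f}")
```

The observed values must be nonzero and neither series may be constant, otherwise MARE or R2 is undefined.
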
The best functional relationship introduced by Ebtehaj & Bonakdari (2014b) for calculating the Fr is Fr = f(CV, d/D, d/R, λs). The results of the sensitivity analysis of this functional relationship are presented as Models A to E in Table 1 and are examined in this section. Figure 3 displays the results of the qualitative examination of the sensitivity analysis. Model A, which considers all four parameters (CV, d/D, d/R, λs) for predicting Fr, presents relatively good results (R2 = 0.95, MARE = 0.076, RMSE = 0.461, SI = 0.114, BIAS = −0.07), with most values predicted by this model having a relative error of less than 10%. This holds for almost all Fr predictions, although some predictions exhibit a relative error greater than 10%. Model B, which considers three parameters (CV, d/D, d/R) for predicting the Fr, produces results similar to those of Model A (R2 = 0.957, MARE = 0.072, RMSE = 0.434, SI = 0.107, BIAS = 0.029). According to Figure 3, however, Model B has smaller error values in the cases where the Fr is predicted with a relative error greater than 10%. Therefore, eliminating λs does not decrease Fr prediction accuracy; it actually increases it slightly. Model C does not consider d/R as an input parameter for predicting the Fr. Comparing Model C with Model A indicates that omitting this parameter significantly decreases the prediction accuracy, such that the majority of predictions made by this model are overestimated, which leads to an uneconomical design. Moreover, this model yields some predictions with a relative error of approximately 80% and a mean relative error of approximately 16.8% (MARE = 0.168). The other indexes presented in Table 1 also confirm that Fr prediction accuracy decreases when using this model. Therefore, using this model is not advisable, and including the d/R parameter in Fr prediction is strongly recommended. 
The d/D parameter also affects Fr prediction. Omitting this parameter (Model D) decreases Fr prediction accuracy such that, as the value of Fr increases, the prediction error grows, with a relative error mostly greater than 10%. However, Figure 3 and Table 1 show that this model performs better than Model C, since the mean relative error of Model D is almost 12% while that of Model C is more than 16%. It can therefore be stated that the d/D parameter influences Fr prediction less than the d/R parameter does. The reason may be that the hydraulic radius in the d/R parameter accounts for the effect of flow depth in addition to pipe diameter. Model E shows that the volumetric sediment concentration (CV) is the most important of all the parameters in Model A (Fr = f(CV, d/D, d/R, λs)). Omitting this parameter leads to overestimated Fr predictions and large relative errors (Figure 3), with a mean relative error greater than 20% (MARE = 0.228). Based on the sensitivity analysis, the best performing model is therefore Model B. Model B is thus selected and modeled in this section using the hybrid DT-based MLP method (MLP-DT). Figure 2 presents the DT classification results that lead to the highest MLP-DT performance. According to this figure, the DT uses the CV and d/R input variables in the classification procedure.
Table 1

Performance evaluation of MLP and MLP-DT in predicting the Fr

Dataset  Model            Functional relationship    R2     MARE   RMSE   SI     BIAS
Train    MLP–Model A      Fr = f(CV, d/D, d/R, λs)   0.965  0.059  0.36   0.089  0.004
Train    MLP–Model B      Fr = f(CV, d/D, d/R)       0.974  0.051  0.31   0.076  0.01
Train    MLP–Model C      Fr = f(CV, d/D, λs)        0.889  0.123  0.645  0.159  −0.001
Train    MLP–Model D      Fr = f(CV, d/R, λs)        0.95   0.068  0.434  0.107
Train    MLP–Model E      Fr = f(d/D, d/R, λs)       0.825  0.153  0.81   0.2    0.001
Train    MLP-DT–Model B   Fr = f(CV, d/D, d/R)       0.983  0.038  0.255  0.063
Test     MLP–Model A      Fr = f(CV, d/D, d/R, λs)   0.95   0.076  0.461  0.114  −0.07
Test     MLP–Model B      Fr = f(CV, d/D, d/R)       0.957  0.072  0.434  0.107  0.029
Test     MLP–Model C      Fr = f(CV, d/D, λs)        0.781  0.168  0.98   0.242  −0.204
Test     MLP–Model D      Fr = f(CV, d/R, λs)        0.869  0.116  0.928  0.229  −0.242
Test     MLP–Model E      Fr = f(d/D, d/R, λs)       0.66   0.228  1.19   0.294  −0.079
Test     MLP-DT–Model B   Fr = f(CV, d/D, d/R)       0.975  0.063  0.328  0.081  −0.01
Figure 3

Scatter plot for predicting the Fr using MLP and MLP-DT (Testing).


Figure 3 indicates that Model B MLP-DT predicts the Fr well. This model makes most predictions with a relative error of less than 10%, and its accuracy does not decrease as Fr increases, whereas the accuracy of Model B MLP decreases to some extent as Fr increases. A quantitative comparison of Model B as modeled with the MLP and MLP-DT methods shows that, for the testing data, MLP-DT gives a higher R2 and lower MARE, RMSE and SI than MLP, with a BIAS closer to zero (Table 1). Therefore, using DT as a data classification method improves Fr prediction.

The output equation of the most appropriate MLP-DT method that uses the CV, d/D and d/R input variables is presented in Equation (8). According to this equation, the hyperbolic tangent serves as the activation function. 
formula
8
 
formula
8.1
 
formula
8.2
 
formula
8.3
 
formula
8.4

CONCLUSIONS

Pipe channels must be designed efficiently in view of the problems caused by sediment deposition on the pipe bed, such as decreased transport capacity. In this study, the required limiting velocity was predicted using a new hybrid method that combines MLP with decision trees (MLP-DT). The parameters in the functional relationship presented by Ebtehaj & Bonakdari (2014b) for predicting the Fr first underwent a sensitivity analysis with MLP. The results indicate that the overall sediment friction factor (λs) does not significantly affect Fr prediction and that omitting this parameter slightly improves Fr prediction accuracy. The d/R and d/D parameters influence Fr prediction more than λs does, but the greatest effect is that of the volumetric sediment concentration (CV). Omitting CV raises the mean relative prediction error to approximately 23% (MARE = 0.228) and leads to overestimated predictions, which result in an uneconomical design. The best functional relationship is therefore Fr = f(CV, d/R, d/D) (Model B). This model was then modeled with MLP-DT. The results demonstrate that the MLP-DT hybrid model (R2 = 0.975, MARE = 0.063, RMSE = 0.328, SI = 0.081, BIAS = −0.01) is more accurate than MLP (R2 = 0.957, MARE = 0.072, RMSE = 0.434, SI = 0.107, BIAS = 0.029). Moreover, this hybrid model nearly eliminates overestimated predictions. The model presented in this study ultimately offers an equation for practical use by engineers.

REFERENCES

Ab Ghani, A. 1993 Sediment Transport in Sewers. PhD thesis, University of Newcastle Upon Tyne, UK.
Ackers, J. C., Butler, D. & May, R. W. P. 1996 Design of Sewers to Control Sediment Problems. Rep. No. CIRIA 141, Construction Industry Research and Information Association, London.
Almedeij, J. 2012 Rectangular storm sewer design under equal sediment mobility. American Journal of Environmental Science 8 (4), 376–384.
Banasiak, R. 2008 Hydraulic performance of sewer pipes with deposited sediments. Water Science and Technology 57 (11), 1743–1748.
Bonakdari, H. & Ebtehaj, I. 2014 Verification of equation for non-deposition sediment transport in flood water canals. In: River Flow 2014, Schleiss, A. J., de Cesare, G., Franca, M. J. & Pfister, M. (eds). CRC Press/Balkema, London and Leiden, The Netherlands, pp. 1527–1533.
Breiman, L., Friedman, J. H., Olshen, R. A. & Stone, C. J. 1993 Classification and Regression Trees. Chapman & Hall, Boca Raton, FL.
Chang, C. L. & Chen, C. H. 2009 Applying decision tree and neural network to increase quality of dermatologic diagnosis. Expert Systems with Applications 36 (2), 4035–4041.
Ebtehaj, I., Bonakdari, H. & Sharifi, A. 2014 Design criteria for sediment transport in sewers based on self-cleansing concept. Journal of Zhejiang University Science A 15 (11), 914–924.
Haykin, S. 1994 Neural Networks: A Comprehensive Foundation. Prentice Hall PTR, Upper Saddle River, NJ.
Levenberg, K. 1944 A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics 2, 164–168.
May, R. W. P., Ackers, J. C., Butler, D. & John, S. 1996 Development of design methodology for self-cleansing sewers. Water Science and Technology 33 (9), 195–205.
Memarian, H., Balasundram, S. K. & Tajbakhsh, M. 2013 An expert integrative approach for sediment load simulation in a tropical watershed. Journal of Integrative Environmental Sciences 10 (3–4), 161–178.
Nalluri, C. & Ab Ghani, A. 1996 Design option for self-cleansing storm sewers. Water Science and Technology 33 (9), 215–220.
Ota, J. J. & Nalluri, C. 1999 Graded sediment transport at limit deposition in clean pipe channel. In: 28th Congress of International Association Hydro-Environmental Engineering Research, Graz, Austria.
Ota, J. J. & Perrusquía, G. S. 2013 Particle velocity and sediment transport at the limit of deposition in sewers. Water Science and Technology 67 (5), 959–967.
Smith, M. 1993 Neural Networks for Statistical Modeling. John Wiley & Sons, Inc., New York.
Vongvisessomjai, N., Tingsanchali, T. & Babel, M. S. 2010 Non-deposition design criteria for sewers with part-full flow. Urban Water Journal 7 (1), 61–77.