A genetic algorithm-based support vector machine to estimate the transverse mixing coefficient in streams


The transverse mixing coefficient (TMC) is one of the most influential parameters in the two-dimensional simulation of water pollution, and increasing the accuracy of its estimation improves the modeling process. In the present study, a genetic algorithm (GA)-based support vector machine (SVM) was used to estimate the TMC in streams. The SVM has three principal parameters that must be adjusted during the estimation procedure; the GA assists the SVM by optimizing these three parameters automatically. The accuracy of the SVM and GA-SVM algorithms, along with previous models, was assessed for TMC estimation using a wide range of hydraulic and geometrical data from field and laboratory experiments. According to the statistical analysis, the performance of these models in both straight and meandering streams was more accurate than that of the regression-based models. Sensitivity analysis showed that the accuracy of the GA-SVM algorithm in TMC estimation correlated significantly with the number of input parameters: eliminating uncorrelated parameters and reducing the number of inputs lowers the complexity of the problem and improves the TMC estimate produced by GA-SVM.


INTRODUCTION
Increasing the accuracy of modeling the release of pollution into streams improves the ability to control stream quality and thereby reduces environmental damage. The capability to estimate the transport of pollutants in streams and waterways has therefore always been a considerable issue in many industrial and environmental projects (Abderrezzak et al. 2015). After being discharged into a river, contaminants and effluents mix with the river water while being transported downstream (Seo & Cheong 1998). The effluent spreads vertically, transversely, and longitudinally through advective and dispersive transport processes. In a shallow stream, once the contaminant has rapidly mixed over the depth, transport continues in the longitudinal and transverse directions (Ahmad et al. 2011). A full cross-sectional mix is not achieved unless the pollutant travels long distances, which are generally beyond the length of practical interest (Beltaos 1980). The length required for full cross-sectional mixing of contaminants is approximately 20 and 200 times the channel width for rough and smooth flows, respectively (Fischer 1967). Transverse mixing plays an important role in determining the effect of contaminants under steady-state conditions. This parameter is important in water quality management, especially in the case of point-source discharges or tributary inflows (Rutherford 1994; Boxall & Guymer 2003). According to Figure 1, three stages are considered for the effluent mixing process in rivers: (1) mixing near the discharge point due to initial momentum and flow buoyancy (between zones A and B); (2) transverse mixing due to turbulence (secondary turbulence transfer) and its secondary flows (between zones B and C); and (3) dispersion due to longitudinal shear flow (after zone C) (Fischer et al. 1979).
The distribution of tracer concentration can be written in a two-dimensional model according to the principle of mass conservation (Rutherford 1994; Sharma & Ahmad 2014):

$$\frac{\partial (HC)}{\partial t} + \frac{\partial (H U_x C)}{\partial x} + \frac{\partial (H U_z C)}{\partial z} = \frac{\partial}{\partial x}\left(H \varepsilon_x \frac{\partial C}{\partial x}\right) + \frac{\partial}{\partial z}\left(H \varepsilon_z \frac{\partial C}{\partial z}\right) \quad (1)$$

where t is time; H is the flow depth (m); C is the depth-averaged tracer concentration (kg/m³); z and x are the transverse and longitudinal directions, respectively; U_z and U_x are the velocities in the z and x directions (m/s), respectively; and ε_z and ε_x are the depth-averaged dispersion coefficients in the transverse and longitudinal directions (m²/s). By assuming that longitudinal dispersion of the tracer has not yet begun for a uniformly flowing stream, the time-derivative term of Equation (1) will be zero (Sharma & Ahmad 2014). Also, by assuming uniform flow and U_z = 0, Equation (1) can be simplified to:

$$U_x \frac{\partial C}{\partial x} = \frac{\partial}{\partial z}\left(\varepsilon_z \frac{\partial C}{\partial z}\right) \quad (2)$$

This equation has been used in many studies (Krishnappan & Lau 1977; Lau & Krishnappan 1981; Demetracopoulos 1994; Ahmad 2008; Aghababaei et al. 2017; Huai et al. 2018; Zahiri & Nezaratian 2020). More investigation of the role of the effective parameters in transverse mixing is required because of the complexity of the transverse mixing mechanism (Aghababaei et al. 2017). Thus, predicting the transverse mixing coefficient (TMC) for known flow conditions in a stream, in order to account for the pollutant concentration at any location downstream of the injection site, is genuinely essential (Azamathulla & Ahmad 2012). Generally, there are three approaches for predicting the TMC in stream mixing. Empirical methods develop equations using hydraulic and geometric datasets from rivers and experimental studies in order to establish a relationship for ε_z, while theoretical methods use the concept of shear flow to derive the dispersion coefficient (Baek & Seo 2013). Moreover, many researchers have recently used powerful predictive tools to find solutions for complex engineering problems.
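As an illustration, the simplified steady-state transverse mixing equation, U ∂C/∂x = ε_z ∂²C/∂z², can be integrated numerically by marching downstream with an explicit finite-difference scheme. The sketch below uses illustrative values (not from the study's dataset) for a unit point source at mid-channel in a uniform straight channel:

```python
import numpy as np

# Explicit downstream marching for U dC/dx = eps_z d2C/dz2.
# Illustrative values only: U (m/s), eps_z (m^2/s), channel width W (m).
U, eps_z = 0.5, 0.01
W, dz, dx = 10.0, 0.1, 0.2
nz = int(W / dz) + 1

C = np.zeros(nz)
C[nz // 2] = 1.0                    # unit tracer slug injected at mid-channel

r = eps_z * dx / (U * dz**2)        # diffusion number; needs r <= 0.5 for stability
assert r <= 0.5

for _ in range(500):                # march 500 * dx = 100 m downstream
    C[1:-1] = C[1:-1] + r * (C[2:] - 2.0 * C[1:-1] + C[:-2])
    C[0], C[-1] = C[1], C[-2]       # zero-gradient (no-flux) banks

print(f"peak concentration after 100 m: {C.max():.3f}")
```

The tracer spreads transversely as it advects downstream; with a larger ε_z the plume approaches cross-sectional uniformity sooner, which is exactly the behaviour the TMC controls.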
The significance of dispersion coefficients in water quality modeling and the complexity of the pollutant emission and mixing process have considerably increased the importance of using these tools (Zahiri & Nezaratian 2020). Soft computing techniques such as principal component analysis-based adaptive neuro-fuzzy inference systems (ANFIS-based PCA), particle swarm optimization (PSO), artificial neural networks (ANN), gene expression programming (GEP), differential evolution (DE), decision trees (M5), support vector machines (SVM), and adaptive neuro-fuzzy inference systems (ANFIS) have been widely used to estimate the longitudinal dispersion coefficient in streams by Parsaei et al. and others. Other researchers have predicted the TMC accurately by using decision trees (M5), multivariate adaptive regression splines (MARS), particle swarm optimization (PSO), multiple linear regression (MLR), genetic algorithms (GA), genetic programming for symbolic regression (GPSR), and GEP. The soft computing techniques used by the above-mentioned researchers show smaller statistical errors and higher accuracy than empirical methods in TMC prediction (Zahiri & Nezaratian 2020). According to previous studies, there is a strong relationship between the TMC and channel parameters such as channel width, flow depth, shear velocity, friction factor, curvature, and sinuosity (Fischer 1967; Beltaos 1979; Lau & Krishnappan 1981; Stefanovic & Stefan 2001; Boxall & Guymer 2003). Table 1 shows some of the best-known equations proposed for calculating the TMC.

Each of these algorithms has its strengths and weaknesses and may not, on its own, predict complex phenomena such as the TMC accurately. Selecting several meta-heuristic algorithms appropriately and using them simultaneously increases accuracy and decreases error in estimating the target values. Choosing a main algorithm together with an auxiliary algorithm that compensates for the main algorithm's weaknesses leads to a hybrid algorithm with higher performance. In previous investigations, several hybrid algorithms were used to estimate complex phenomena, and the ability of these algorithms was demonstrated convincingly (Pourbasheer et al. 2009; Wang et al. 2013; Li & Kong 2014; Zhou et al. 2016). In this study, two common algorithms were combined into a hybrid algorithm: the support vector machine (SVM) as the main algorithm and the genetic algorithm (GA) as the auxiliary algorithm. Coupling the GA to the SVM allows the optimal values of the SVM's adjustable parameters to be estimated in the shortest time and increases prediction accuracy. The purpose of this study is to develop a GA-SVM algorithm using 232 published datasets and to compare its performance with previous models. In addition, a sensitivity analysis was performed on the developed model to determine the effect of the input parameters on TMC modeling.

Data
In the present study, 232 data points (see Supplementary material) were collected from the technical literature (Yotsukura et al. 1970; Holley & Abraham 1973; Krishnappan & Lau 1977; Beltaos 1979; Rutherford 1994; Jeon et al. 2007; Baek & Seo 2008; Aghababaei et al. 2017). Table 2 presents a statistical analysis of all variables and implies that the studied cases varied from narrow rivers (W/H < 10) to very wide rivers (W/H > 100). U/U*, known as the friction term, represents the hydrodynamics and roughness of the channel bed (Seo & Cheong 1998) and varied from 0.026 to 28.571. This range of variation indicates that a wide range of streams with various geometrical and hydraulic features was used in this study, so the results can be related to many streams with different characteristics. The dataset was randomly divided into two sets: training (75% of the data) and testing (25% of the data). Although many unknown parameters may affect the TMC, according to previous studies the key parameters affecting the mixing process during steady flow in natural streams can be stated as:

$$\varepsilon_z = f(\rho, \mu, U, U_*, H, W, g, S_f, S_n) \quad (3)$$

where ρ is the fluid density; μ is the fluid viscosity; S_f and S_n are the bed shape factor and sinuosity, respectively; and g is gravity. Fischer et al. (1979) and Jeon et al. (2007) expressed this relation in terms of dimensionless parameters by using the Buckingham Pi theorem:

$$\frac{\varepsilon_z}{H U_*} = f\left(\frac{W}{H}, \frac{U}{U_*}, \frac{U}{\sqrt{gH}}, \frac{\rho H U}{\mu}, S_f, S_n\right) \quad (4)$$

where U/U* is the friction term; W/H is the channel width to flow depth ratio; U/√(gH) is the Froude number; and ρHU/μ is the Reynolds number. The bed shape factor, S_f, and the sinuosity, S_n, indicate vertical and transverse irregularities in natural streams, respectively (Etemad-Shahidi & Taghipour 2012). By developing secondary currents and shear flow, transverse and vertical irregularities affect the mixing processes in streams (Seo & Cheong 1998).
Generally, the flow in natural streams is fully turbulent, so the Reynolds number can be eliminated from Equation (4) as a first approximation (Seo & Cheong 1998; Kashefipour & Falconer 2002). The bed shape factor S_f can also be eliminated, since the Froude number (Fr) and the dimensionless roughness factor U/U* reflect the remaining effects of bed material roughness and bed slope (Sattar & Gharabaghi 2015). Finally, the best dimensionless form of ε_z based on previous findings, such as those of Yotsukura & Sayre (1976), Deng et al. (2001), Jeon et al. (2007), Azamathulla & Ahmad (2012), Aghababaei et al. (2017), and Zahiri & Nezaratian (2020), can be written as:

$$\frac{\varepsilon_z}{H U_*} = f\left(\frac{W}{H}, \frac{U}{U_*}, Fr, S_n\right) \quad (5)$$

where ε_z/(HU*) is the dimensionless form of ε_z and is used as the target parameter in this research. The correlations between all input and output parameters are displayed in Figure 2.
Based on Figure 2, there is no considerable correlation among the input variables; thus, the analysis problems that could arise from exaggerating the strength of the relations between variables are avoided (Sattar & Gharabaghi 2015).
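The dimensionless groups above can be assembled directly from raw channel measurements. The following sketch, with hypothetical records and column names (not the study's dataset), builds the four inputs W/H, U/U*, Fr, and Sn, the dimensionless target, and their correlation matrix:

```python
import numpy as np
import pandas as pd

# Hypothetical stream records (values and column names are illustrative):
# width W (m), depth H (m), velocity U (m/s), shear velocity U* (m/s),
# sinuosity Sn, measured transverse mixing coefficient eps_z (m^2/s).
data = pd.DataFrame({
    "W":      [20.0, 55.0, 8.5, 31.0],
    "H":      [0.8, 1.9, 0.4, 1.1],
    "U":      [0.45, 0.90, 0.30, 0.60],
    "U_star": [0.05, 0.08, 0.04, 0.06],
    "Sn":     [1.00, 1.30, 1.10, 1.45],
    "eps_z":  [0.004, 0.030, 0.001, 0.012],
})

g = 9.81
inputs = pd.DataFrame({
    "W/H":  data.W / data.H,
    "U/U*": data.U / data.U_star,
    "Fr":   data.U / np.sqrt(g * data.H),     # Froude number
    "Sn":   data.Sn,
})
target = data.eps_z / (data.H * data.U_star)  # dimensionless eps_z / (H U*)

print(inputs.corr().round(2))                 # pairwise Pearson correlations
```

A correlation matrix like this is what a figure such as Figure 2 summarizes; strongly inter-correlated inputs would make the model's attributions unreliable.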

Support vector machine (SVM)
Vapnik (1995) proposed a nonlinear regression and prediction method called the support vector machine (SVM), applicable to pattern recognition and to highly nonlinear classification and regression problems. The purpose of developing the SVM was to maximize the accuracy of prediction, i.e., to minimize the difference between the outputs and targets (Parsaie & Haghiabi 2017a, 2017b; Parsaie et al. 2019). To this end, the input parameters are mapped into a high-dimensional linear feature space by a nonlinear transformation in order to construct the optimal decision function. The dot product in the higher-dimensional feature space is replaced by a kernel function in the original space, and the global optimal solution is obtained by training on a finite sample (Zhou et al. 2016). In the current study, the SVM is used as the main algorithm for predicting the TMC and is briefly described below. If the data [(x₁, y₁), (x₂, y₂), …, (x_i, y_i), …, (x_n, y_n)] are taken as the training set, where x_i ∈ Rⁿ is the input vector, y_i ∈ R is the output, and n is the number of data pairs, the regression function of the SVM, called SVR, is formulated as:

$$f(x) = \omega^T \varphi(x) + b \quad (6)$$

where ω^T is the transpose of the weight vector ω; b is a bias; and ω is obtained through the constrained rules below. This function describes the observed output y within an error tolerance ε. φ(x) is a nonlinear transfer function that maps the input vectors into a high-dimensional feature space in which, theoretically, even a simple linear regression can overcome the complexity of the nonlinear regression of the input space (He et al. 2014). The errors tolerated within the ε-tube, as well as the losses penalized when data fall outside the tube, are defined by Vapnik's ε-insensitive loss function:

$$L_\varepsilon(y, f(x)) = \begin{cases} 0 & |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon & \text{otherwise} \end{cases} \quad (7)$$

The SVM problem can then be formulated as the optimization problem:

$$\min \; \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad (8)$$

subject to

$$y_i - \omega^T \varphi(x_i) - b \le \varepsilon + \xi_i, \qquad \omega^T \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0 \quad (9)$$

where the constant C is called the penalty factor, and C > 0 expresses the degree to which samples with errors exceeding ε are penalized (Liu & Jiao 2011). Here, the value of C is set to 1, which indicates that the complexity of the model is as important as the empirical error. The slack variables ξ_i and ξ_i* specify the upper and lower training errors subject to the error tolerance ε; they express the distance between the actual values and the corresponding boundary values of the ε-tube. Figure 3 depicts this situation graphically. The SVM reduces under-fitting and over-fitting through the two terms of the objective, which are called the regularization term and the training error term, respectively.
Thus, by considering the Lagrangian multipliers and the Karush-Kuhn-Tucker conditions in Equation (9), the dual Lagrangian form is obtained:

$$\max \; L(\alpha_i, \alpha_i^*) = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, K(x_i, x_j) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) \quad (10)$$

subject to Σ(α_i − α_i*) = 0 and 0 ≤ α_i, α_i* ≤ C, where α_i and α_i* are the Lagrangian multipliers, which satisfy α_i × α_i* = 0, and L(α_i, α_i*) is the Lagrange function. The Lagrange multiplier terms (α_i − α_i*) of the data lying inside the ε-insensitive tube are zero. The final regression function is calculated only from the datasets with non-zero coefficients (α_i − α_i*), which are known as the support vectors. There are two groups of support vectors: margin support vectors and error support vectors (Noori et al. 2011). In the first group, the absolute values of the weights |α_i − α_i*| are less than C; in the second group they equal C. In other words, the support vectors located outside the insensitive tube are the error support vectors, and those on its margin are the margin support vectors (Figure 3). Kernel functions are used to change the dimensionality of the input space so that the regression or classification task can be performed with more confidence (Azamathulla & Wu 2011). These functions yield the inner products of the feature-space images φ(x_i) and φ(x_j). A kernel function plays the most significant role in simplifying the learning process by changing the representation of the data in the feature space: although the data may be non-separable in the original input space, an appropriate choice of kernel function allows the data to become highly separable in the feature space (Patil et al. 2012). If there is no prior knowledge about the data features, the radial basis function (RBF), one of the most popular kernel functions across scientific fields, is recommended (Roushangar & Koosheh 2015). For this reason, RBF was used as the kernel function of the SVM model for TMC prediction in this study:
$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right) \quad (11)$$

where K(x_i, x_j) is a kernel function and γ is the parameter of the RBF kernel function.
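An ε-SVR with an RBF kernel of this form is available off the shelf. The sketch below, using scikit-learn and synthetic data rather than the study's dataset, shows how C, ε, and γ enter, and that only points on or outside the ε-tube end up as support vectors:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(80, 4))     # stand-in for scaled [W/H, U/U*, Fr, Sn]
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(80)

# C: penalty on errors beyond the eps-tube; epsilon: half-width of the tube;
# gamma: RBF width parameter in K(xi, xj) = exp(-gamma * ||xi - xj||^2).
model = SVR(kernel="rbf", C=1.0, epsilon=0.1, gamma=1.0)
model.fit(X, y)

# Samples strictly inside the eps-tube get zero dual coefficients and drop out.
print(f"{len(model.support_)} support vectors out of {len(X)} training points")
```

Widening ε thins out the support vectors (a flatter, cheaper model), while raising C or γ lets the fit follow the training data more closely, which is the over-fitting trade-off discussed above.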

Genetic algorithm (GA)
According to the mechanisms of genetics and Darwin's principles of natural selection, John Holland proposed a heuristic search method in 1975 and called it the genetic algorithm (GA). The method was named after the biological processes of inheritance, mutation, natural selection, and the genetic crossover that occurs when parents mate to produce offspring (Goldberg 1989). Technically, there are four differences between the structure of the GA and traditional optimization algorithms (Goldberg 1989): • The GA typically uses a coding of the decision variable set instead of the decision variables themselves.
• The GA searches from a population of decision variable sets instead of a single decision variable set.
• The GA uses the objective function itself instead of the derivative information.
• The GA uses probabilistic, rather than deterministic, search rules.
In the last decade, the GA has successfully been used to solve problems such as fitting nonlinear regressions to data, optimizing simulation models, solving systems of nonlinear equations, and machine learning (Deb 1998). Generally, a GA has five major components for solving a particular problem, briefly described below:
1. First, n chromosomes are generated randomly to form a population; these are the candidate solutions to the problem.
2. A fitness function evaluates the fitness of each chromosome. In the present study, the efficiency coefficient (EC) was used as the fitness function:

$$EC = 1 - \frac{\sum_{i=1}^{N} (d_i - y_i)^2}{\sum_{i=1}^{N} (d_i - \bar{d})^2} \quad (12)$$

where N is the total number of testing data, y_i is the predicted value, d_i is the observed value, and d̄ is the mean of the observed values.
3. The following steps are repeated until n offspring have been created: (a) Selection: this operator selects the best chromosomes in pairs from the population to play the role of parents and reproduce two offspring; more suitable chromosomes have a greater chance of being selected. (b) Crossover: this operator randomly chooses a locus between a pair of chromosomes to form two offspring. (c) Mutation: this operator creates new chromosomes by randomly flipping some of the bits in the chromosomes.
4. The current population is replaced with the new population.
5. If the stopping condition is satisfied, the best solution in the current population is returned; otherwise, step 2 is performed again.
The applied GA method settings in the present study are shown in Table 3.
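The components above can be sketched as a compact real-coded GA. The version below is a minimal illustration, not the paper's implementation: it uses the efficiency coefficient EC = 1 − Σ(dᵢ − yᵢ)²/Σ(dᵢ − d̄)² as the fitness, tournament selection, arithmetic crossover, and random-reset mutation.

```python
import numpy as np

rng = np.random.default_rng(42)

def efficiency_coefficient(d, y):
    """EC = 1 - sum((d_i - y_i)^2) / sum((d_i - mean(d))^2); 1 is a perfect fit."""
    d, y = np.asarray(d, float), np.asarray(y, float)
    return 1.0 - np.sum((d - y) ** 2) / np.sum((d - d.mean()) ** 2)

def ga_maximize(fitness, bounds, pop_size=20, generations=40, pc=0.8, pm=0.1):
    lo, hi = np.array(bounds, float).T
    pop = rng.uniform(lo, hi, size=(pop_size, len(bounds)))
    for _ in range(generations):
        fit = np.array([fitness(ind) for ind in pop])
        # selection: fitter of two random candidates survives (tournament)
        winners = [max(rng.choice(pop_size, 2, replace=False), key=lambda i: fit[i])
                   for _ in range(pop_size)]
        pop = pop[winners]
        # crossover: arithmetic blend of consecutive pairs
        for i in range(0, pop_size - 1, 2):
            if rng.random() < pc:
                a = rng.random()
                pop[i], pop[i + 1] = (a * pop[i] + (1 - a) * pop[i + 1],
                                      a * pop[i + 1] + (1 - a) * pop[i])
        # mutation: reset one randomly chosen gene within its bounds
        for i in range(pop_size):
            if rng.random() < pm:
                j = rng.integers(len(bounds))
                pop[i, j] = rng.uniform(lo[j], hi[j])
    fit = np.array([fitness(ind) for ind in pop])
    return pop[fit.argmax()], float(fit.max())

# Toy use: recover slope and intercept of d = 2x + 1 by maximizing EC.
x = np.linspace(0.0, 1.0, 30)
d = 2.0 * x + 1.0
best, best_ec = ga_maximize(lambda p: efficiency_coefficient(d, p[0] * x + p[1]),
                            bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(f"best chromosome {best}, EC = {best_ec:.3f}")
```

Real-valued genes with arithmetic crossover stand in here for the bit-string encoding described in the text; either encoding instantiates the same five-step loop.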

Genetic algorithm-based support vector machine
In this study, the training data (input and target parameters) are first presented to the GA-SVM algorithm. The GA then randomly generates an initial population of the SVM's unknown parameters (C, ε, and γ) in order to determine the optimal values that give the best prediction with the lowest error and the highest accuracy. The fitness function examines the performance of each model. A secondary population of SVM parameters is created by the GA operators (mutation, crossover, and selection), and these parameters are introduced to the SVM algorithm again. This cycle continues until the value of the fitness function reaches, or comes near, the stopping condition of the algorithm; the model outputs are therefore expected to move closer to the target values at each cycle. In the GA-SVM algorithm the two algorithms operate separately but assist each other in simplifying the problem: the SVM starts modeling with the random parameters generated by the GA, and the GA continues the procedure until the optimal values of the SVM's parameters are obtained. In each cycle, the GA tries to estimate the optimal combination of the three parameters (C, ε, and γ). C is a regularization parameter that controls the trade-off between maximizing the margin and minimizing the training error: low values of C place insufficient stress on fitting the training data, while high values make the algorithm over-fit the training data (Noori et al. 2011). Nevertheless, according to Wang et al. (2003), the prediction error is rarely influenced by C. γ denotes the width of the kernel function; an RBF with a large γ allows a support vector to have a strong influence over a larger area. The type of noise present in the data determines the optimal value of ε, which is usually unknown.
There is also a practical constraint on the number of resulting support vectors, even when enough knowledge of the noise is available to select an optimal value of ε (Liu et al. 2006). In the GA-SVM hybrid algorithm, the GA automatically searches for these SVM parameters and provides their optimal values, whereas in the plain SVM algorithm the optimal values were determined by a trial-and-error process: cross-validation, an improved version of the grid search method described by Hsu et al. (2010), was used to find the three parameters. In ν-fold cross-validation, the training set is divided into ν subsets of equal size, and each subset is tested in turn using the classifier trained on the remaining ν − 1 subsets. Each instance of the whole training set is therefore estimated once, and the cross-validation accuracy is the percentage of correctly classified data. The general flowchart of GA-SVM is illustrated in Figure 4. In the present study, SVM and GA-SVM were applied using the RBF kernel function and the input variables described above. Table 2 shows that all parameters used in this study have right-skewed distributions. Moreover, according to Figure 5, there is an abundance of outliers in the target and input parameters, except for Fr and U/U*. Observations that are uncommon and do not conform to the pattern of the majority of the data are called outliers (Rousseeuw & Van Zomeren 1990). The existence of outliers can increase error rates and reduce prediction accuracy; it can also lead to considerable distortion of statistical estimates under either parametric or nonparametric tests (Zimmerman 1994, 1995, 1998). One of the simplest ways to tackle this problem is a logarithmic transformation of the parameters, individually or collectively (Hubert & Van der Veeken 2008). Therefore, to reduce the negative effects of skewness and outliers on modeling, the whole dataset was transformed to a logarithmic scale and the logarithmic parameters were used for modeling.
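A minimal sketch of the hybrid loop, assuming scikit-learn and synthetic log-transformed data rather than the study's 232 records: each chromosome is a (C, ε, γ) triple, and its fitness is the mean cross-validated R², which has the same 1 − SSE/SST form as the EC. The "generation" here is reduced to keep-the-fitter-half-and-jitter, a deliberate simplification of the full selection/crossover/mutation cycle:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Synthetic right-skewed predictors, then log-transformed (cf. the outlier discussion)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=(60, 4))
y = np.log10(raw[:, 0] * raw[:, 3] + 0.1)
X = np.log10(raw)

LO = np.array([0.1, 0.01, 0.01])     # lower bounds for (C, eps, gamma)
HI = np.array([120.0, 1.0, 10.0])    # upper bounds (illustrative)

def fitness(chrom):
    """Mean 5-fold cross-validated R^2 (same form as the EC) for one (C, eps, gamma)."""
    C, eps, gamma = chrom
    model = SVR(kernel="rbf", C=C, epsilon=eps, gamma=gamma)
    return cross_val_score(model, X, y, cv=5, scoring="r2").mean()

pop = rng.uniform(LO, HI, size=(10, 3))
for _ in range(5):                                  # a few GA-style generations
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[-5:]]          # selection: keep the fitter half
    children = np.clip(parents * rng.normal(1.0, 0.1, parents.shape), LO, HI)
    pop = np.vstack([parents, children])            # mutation-like jitter

best = max(pop, key=fitness)
print("best (C, eps, gamma):", np.round(best, 3))
```

Because each fitness evaluation trains ν models, the GA's advantage over an exhaustive grid is simply that it spends those evaluations near promising (C, ε, γ) regions instead of uniformly.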

Model evaluation
In this study, both SVM and GA-SVM were used to estimate the TMC. The performance of the two models was assessed by evaluating scatter plots of the observed versus predicted results. In addition, the discrepancy ratio (DR), the root mean square error (RMSE), the mean absolute error (MAE), and the accuracy were used as statistical measures to evaluate the performance of the SVM, GA-SVM, and empirical models:

$$DR = \log\left(\frac{\varepsilon_{z_c}}{\varepsilon_{z_m}}\right) \quad (13)$$

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(\varepsilon_{z_c} - \varepsilon_{z_m}\right)^2} \quad (14)$$

$$MAE = \frac{1}{N}\sum_{i=1}^{N} \left|\varepsilon_{z_c} - \varepsilon_{z_m}\right| \quad (15)$$

where ε_zc and ε_zm are the predicted and observed TMCs, respectively, and N is the total number of data points. If DR equals zero, there is an exact match between the observed and predicted values; otherwise an overestimation (DR > 0) or underestimation (DR < 0) occurs. Previous researchers reported the percentage of DR values between −0.3 and 0.3 as an accuracy index (Seo & Cheong 1998; Kashefipour & Falconer 2002). In this research, to evaluate the models' performance and accuracy more strictly, the percentage of DR values between −0.15 and 0.15 was used as the accuracy index (Figure 6); DR < −0.15 and DR > 0.15 were considered underestimation and overestimation beyond the precision range, respectively. A comparison of the DR frequencies can be used to determine the symmetry and skewness of the TMC estimates produced by the different models.
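These indexes can be written as small helper functions. This is a sketch consistent with the definitions above (log base 10 is assumed for the DR), not code from the study:

```python
import numpy as np

def dr(pred, obs):
    """Discrepancy ratio: log10(predicted / observed); 0 means an exact match."""
    return np.log10(np.asarray(pred, float) / np.asarray(obs, float))

def rmse(pred, obs):
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def mae(pred, obs):
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(np.mean(np.abs(pred - obs)))

def accuracy(pred, obs, band=0.15):
    """Percentage of points whose DR lies within +/- band (0.15 in this study)."""
    return float(100.0 * np.mean(np.abs(dr(pred, obs)) <= band))

# e.g. two of three predictions fall inside the +/-0.15 DR band:
print(accuracy([1.1, 0.9, 2.0], [1.0, 1.0, 1.0]))
```

Because DR is a log ratio, it penalizes a factor-of-two overestimate and a factor-of-two underestimate symmetrically, which is why the DR histogram reveals estimation bias.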

RESULTS AND DISCUSSION
As mentioned before, estimating the TMC with the SVM first requires finding the optimal values of its three adjustable parameters (C, ε, and γ). During the grid search, all combinations of C, ε, and γ, each ranging from 0 to 120, were tested in each cross-validation routine. The optimal values of the three parameters were then determined using both the GA and the grid search algorithm, and are presented in Table 4. According to Table 4, although the GA and the grid search estimate parameter C to be approximately the same, their estimates differ for the other two parameters. It should be noted that the GA does not estimate the optimal value of each parameter separately; it estimates only the optimal combination of the three parameters. The performance of SVM, GA-SVM, and the previous methods in TMC estimation, in terms of the statistical indexes described above, is presented in Table 5. Along with the MAE, RMSE, and accuracy indexes, the balance between overestimation and underestimation is another important consideration in analyzing the models' performance. According to Table 5, among the previous regression models, those of Yotsukura et al. (1970) and Fischer & Park (1967) had the lowest performance in estimating the TMC, with accuracies of 8% and 28.5%, respectively. The models of Aghababaei et al. (2017) and Zahiri & Nezaratian (2020) performed accurately: the GPSR-based model of Aghababaei et al. (2017), with an accuracy of 80% and RMSE and MAE values of 0.148 and 0.096, respectively, and the simple data-driven model proposed by Zahiri & Nezaratian (2020), with a relatively good accuracy (75.8%) and a balance between overestimation and underestimation, were the most accurate regression-based models available for estimating this coefficient. Both the GA-SVM and SVM algorithms performed accurately and similarly.
In the testing stage, both had the lowest error rates and the highest accuracy compared with the previous regression-based models. Although both models were based on the SVM algorithm, GA-SVM improved the accuracy of the TMC estimation slightly over SVM, by 1.15% in the training stage and 1.7% in the testing stage. On the other hand, the grid search method is more time-consuming than the GA, which is why the GA-SVM model was chosen for estimating the TMC in this study. A comparison of the DR values of all expressions, along with the developed SVM and GA-SVM models, is given in Figure 7. In addition, Figure 8 shows the performance of the developed SVM and GA-SVM in estimating the TMC for the training and testing stages.
Based on Figure 7, the superiority of the GA-SVM and SVM performance is evident: both models show lower overestimation and underestimation than the models of Aghababaei et al. (2017) and Zahiri & Nezaratian (2020). In addition, Figure 8 shows the estimation accuracy of the SVM and GA-SVM models in the training and testing stages separately. The dataset used in this study included characteristics of both straight and meandering streams. According to Table 6, the performance of both SVM and GA-SVM in both straight and meandering streams was more accurate than that of the regression-based models, and all models estimated the TMC better in straight streams than in meandering ones.

Sensitivity analysis
Sensitivity analysis helps researchers determine which parameter has the greatest effect on reducing output uncertainty and/or which parameters are negligible and can be eliminated from the final model (Nezaratian et al. 2018). In this study, a sensitivity analysis was applied to determine the effect of each parameter on the performance of GA-SVM, the most accurate model for TMC estimation. Five scenarios of input parameter combinations were introduced to the GA-SVM algorithm for TMC estimation. Table 7 presents, for each scenario, the combination of inputs, the absent parameters, the SVM parameters, and the performance in the testing stage.
As presented in Table 7, the effect of eliminating each input parameter on the accuracy of the final GA-SVM model was determined. In the table, ΔAccuracy% expresses the difference between the final accuracy of each scenario and the overall accuracy in the testing stage. Note that this method depends strongly on the mathematical and theoretical structure of GA-SVM and may not identify the parameter that most affects the TMC itself; nevertheless, Table 7 indicates, to some extent, the effect of each input parameter on TMC estimation. The input combination in scenario 5 was based on Figure 2, according to which W/H and S_n have the highest correlation with the dimensionless TMC, while the lowest correlations belong to U/U* and Fr, respectively. Scenario 5 was therefore used to measure the impact of removing the least correlated parameters on modeling the TMC with GA-SVM. According to Table 7, in scenario 1, eliminating W/H from the input parameters increases the accuracy by 1.725%; in scenario 2, when W/H was restored and U/U* removed instead, the accuracy improved by 3.488%. By the same analysis, scenario 3 shows that Fr is the least effective parameter in TMC estimation by the GA-SVM algorithm, and scenario 4 shows that S_n is the most influential parameter in modeling the TMC. In scenario 5, only the inputs with a correlation coefficient above 0 were used, so U/U* and Fr were eliminated from the process; the result was a significant improvement in the final model, increasing the modeling accuracy by 8.26%. Table 7 demonstrates that reducing the number of input variables with low correlation with the target improved the performance of the final GA-SVM model. Eliminating weakly correlated input variables decreases the complexity of the modeling process and increases the accuracy. This finding agrees with the results of Zahiri & Nezaratian (2020) and Jeon et al. (2007), who showed that S_n and W/H, respectively, are the most influential parameters in estimating the TMC.
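The leave-one-input-out idea behind such a scenario table can be sketched as follows. This is a toy reconstruction with synthetic data, a plain SVR in place of the tuned GA-SVM, and R² in place of the accuracy index; it only shows the mechanics of scoring input subsets:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(7)
names = ["W/H", "U/U*", "Fr", "Sn"]

# Synthetic data in which Sn and W/H matter most, echoing the study's finding
X = rng.normal(size=(120, 4))
y = 0.8 * X[:, 3] + 0.5 * X[:, 0] + 0.1 * rng.normal(size=120)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)

base = SVR(kernel="rbf").fit(Xtr, ytr).score(Xte, yte)   # R^2 with all four inputs
for j in range(4):
    keep = [k for k in range(4) if k != j]               # drop input j
    score = SVR(kernel="rbf").fit(Xtr[:, keep], ytr).score(Xte[:, keep], yte)
    print(f"without {names[j]}: delta R^2 = {score - base:+.3f}")
```

A large drop when an input is removed marks it as influential; a flat or improved score marks a candidate for elimination, which is the logic behind comparing the scenarios.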

CONCLUSION
In this study, SVM and GA-SVM algorithms were developed to estimate the transverse mixing coefficient, which plays an important role in modeling pollutant release into streams. Three statistical indexes (accuracy, RMSE, and MAE) were used to assess the performance of the different models. The results showed the superiority of the proposed model over well-known regression-based models; among the latter, the models proposed by Aghababaei et al. (2017) and Zahiri & Nezaratian (2020) had the highest accuracy in estimating the TMC, respectively. Dividing the dataset into two groups (straight and meandering streams) showed that SVM and GA-SVM remain more reliable than the previous models. In this study, the grid search method used to develop the SVM algorithm was much more time-consuming than the GA; therefore, the GA-SVM model was chosen as the best model to estimate the TMC in streams.