Abstract
Machine learning models hybridized with optimization algorithms have been applied to many real-life applications, including the prediction of water quality. However, the emergence of newly developed advanced algorithms provides new scope and possibilities for further enhancement. In this study, the least-square support vector machine (LSSVM) integrated with advanced optimization algorithms is presented, for the first time, for the prediction of the water quality index (WQI) at the Klang River of Malaysia. Thereafter, the LSSVM model using the RBF kernel was optimized separately using the hybrid particle swarm optimization and genetic algorithm (HPSOGA), the whale optimization algorithm based on self-adapting parameter adjustment and mix mutation strategy (SMWOA) and the ameliorative moth-flame optimization (AMFO). It was found that the SMWOA-LSSVM model had the best performance for WQI prediction, achieving the best overall root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R2) and mean absolute percentage error (MAPE). A comprehensive comparison was made using the global performance indicator (GPI), whereby the SMWOA-LSSVM had the highest average score of 0.31. This could be attributed to the internal architecture of the SMWOA, which is designed to avoid local optima within a short optimization period.
HIGHLIGHTS
Advanced optimization algorithms were applied, for the first time, in WQI prediction.
LSSVM using the RBF kernel function was found to be the best base model.
All the hybrid LSSVM models integrated with optimization algorithms showed improved accuracy compared with the base model.
SMWOA-LSSVM was found to be the most suitable hybrid model for WQI prediction at the Klang River.
INTRODUCTION
The water quality index (WQI) is a crucial indicator for assessing the quality of different water bodies for better management (Mehta et al. 2018; Zhang 2019). The WQI is usually calculated from multiple water quality parameters that require on-site data collection. However, the process of obtaining water quality parameters through sampling at different points can be time-consuming, tedious and financially costly (Najah Ahmed et al. 2019). A more efficient means of obtaining the WQI would be a step in the right direction to ensure effective repeated monitoring of the quality of water bodies, especially in regions where pollution occurs frequently. This is especially vital for tracking down pollution sources before the water treatment plants and water supply of the respective regions are affected. Machine learning models fit this task extremely well due to their quick response to fluctuations in the water quality parameters.
Currently, the tree-based model, kernel-based model, fuzzy-based model and artificial neural network (ANN) model are the mainstream machine learning models that are used in water quality prediction (Rajaee et al. 2020). Asadollah et al. (2021) compared the performances of tree models and kernel-based support vector machine (SVM) in estimating WQI at the Lam Tsuen River in Hong Kong. Parsaie et al. (2018) assessed the ability of different ANN variants as well as the SVM for the prediction of different water quality parameters. However, it was found that the performance of such base models was not satisfactory and hence improvements needed to be carried out on the design of the machine learning models.
Kisi & Parmar (2016) utilized the least-square support vector machine (LSSVM) for the prediction of the chemical oxygen demand (COD) and compared its performance with the multivariate adaptive regression splines (MARS) and the M5 model tree. Similar work was conducted at the Perak River Basin of Malaysia for WQI modelling (Leong et al. 2019). The authors of these works opined that the LSSVM model is well suited for addressing problems with high non-linearity. Nevertheless, shortcomings of the LSSVM model were also detected, and it was recommended that further improvement through hybridization be considered in order to optimize its hyperparameters. This has been pursued by numerous researchers by incorporating different optimization algorithms to automate the tuning of the LSSVM model's hyperparameters (Bozorg-Haddad et al. 2017; Yaseen et al. 2018; Sun et al. 2019; Song et al. 2021).
In recent years, a new class of optimization algorithms, known as advanced optimization algorithms, has emerged due to the rapid development in soft computing technology. The hybrid of particle swarm optimization and genetic algorithm (HPSOGA) was proposed by Wang & Si (2010) with the intent of achieving mutual complementation between the particle swarm optimization (PSO) and the genetic algorithm (GA). The PSO has a high convergence speed but often fails to adjust its velocity step size correctly. This can be solved by incorporating the GA, which can reflect the granularity of the search space via dynamic evolution or mutation. In exchange, the slower convergence of the GA is compensated by the efficient PSO. Besides, Tong (2020) introduced the whale optimization algorithm based on self-adapting parameter adjustment and mix mutation strategy (SMWOA), which evolved from the whale optimization algorithm (WOA) developed by Mirjalili & Lewis (2016). The SMWOA improves the original WOA by including self-adjustment parameters and a mix mutation strategy. The former changes the progressive parameter in the WOA into a self-adjusting parameter to ensure that the global search ability is maintained, whereas the latter balances the trade-off between the 'exploration' and 'exploitation' phases to prevent premature convergence. Tian et al. (2019) developed the ameliorative moth-flame optimization (AMFO) based on the Kent chaotic map search strategy and a dynamic inertia weight. According to the researchers, such modifications help the original moth-flame optimization (MFO) avoid converging into local optima.
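The exact SMWOA update rules are detailed in Tong (2020); as a rough, hedged illustration only, the sketch below contrasts the linearly decaying coefficient of the baseline WOA with a hypothetical self-adapting schedule that slows the decay while the swarm's fitness values are still diverse. The function names and the diversity-based formula are illustrative assumptions, not the published algorithm.

```python
import numpy as np

def woa_coefficient_linear(t: int, t_max: int) -> float:
    """Progressive parameter 'a' of the original WOA: decays linearly from 2 to 0."""
    return 2.0 * (1.0 - t / t_max)

def woa_coefficient_adaptive(t: int, t_max: int, fitness: np.ndarray) -> float:
    """Hypothetical self-adapting alternative (illustrative, not the SMWOA formula):
    a diverse swarm (large relative fitness spread) slows the decay, preserving
    global search ability; a converged swarm accelerates it, favouring exploitation."""
    diversity = np.std(fitness) / (np.abs(np.mean(fitness)) + 1e-12)
    decay = (t / t_max) ** (1.0 + diversity)   # exponent > 1 when diverse -> slower decay
    return 2.0 * (1.0 - decay)
```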
To date, no study has applied advanced optimization algorithms to machine learning models for the prediction of the WQI. Yet, many previous studies claimed that the integration of conventional optimization algorithms can be a promising meta-heuristic approach for boosting the performance of machine learning models (Chia et al. 2021). Therefore, this work proposes a superior hybrid LSSVM model for sustainable river water quality management. This study pioneers the integration of advanced optimization algorithms into the LSSVM model, whereby the resulting model shall be beneficial for water quality monitoring of the Klang River in Selangor, Malaysia. The specific objectives of this study are as follows:
Select the most suitable kernel function for the LSSVM model by comparing their prediction accuracies, in order to formulate the best base model.
Enhance the performance of the hybrid LSSVM models in WQI prediction through the integration of various advanced optimization algorithms (HPSOGA, SMWOA and AMFO).
Investigate how the combinations of input water quality parameters affect the performance of the base and hybridized LSSVM models.
METHODS
Study area and data
The study was carried out at the Klang River in Selangor, Malaysia. The total length of the Klang River is 120 km and its basin covers an area of 1,280 km2. The Klang River originates from the Ulu Gombak Forest Reserve and flows westwards until it discharges into the Straits of Melaka. Due to human activities, the Klang River has become one of the most polluted rivers in Selangor. The chosen water sampling station is the 1K08 station shown in Figure 1. The 1K08 station is located in the middle of Kuala Lumpur city, where the surrounding land use is mainly commercial and the river water is deemed consistently polluted under the WQI standards of the Department of Environment, Malaysia (DOE). This subsequently captured the interest of the authors to adopt the Klang River as the study site. The water quality data, which included dissolved oxygen (DO, %), biological oxygen demand (BOD, mg/L), chemical oxygen demand (COD, mg/L), suspended solids (SS, mg/L), ammoniacal nitrogen (NH3-N, mg/L) and pH, spanning the years 1999 to 2018, were provided by the DOE.
Water quality index
Data pre-processing
The input data have to be normalized in order to avoid any effect of absolute scale (Feng et al. 2017). The normalization can be done by rescaling the inputs into the scale of 0–1 based on the minimum and maximum values. On top of that, the k-fold cross validation strategy was applied to reduce the risk of overfitting. The data were partitioned into five equal portions where one portion would be used as testing data for each fold.
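A minimal sketch of this pre-processing is given below, assuming scikit-learn is available and that the inputs X and the WQI target y are NumPy arrays. Whether the folds were shuffled and whether the scaler was fitted on the training portion only are assumptions, as these details are not stated in the text.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.preprocessing import MinMaxScaler

def five_fold_splits(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """Yield 0-1 rescaled train/test folds; one fifth of the data is held out per fold."""
    for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=seed).split(X):
        scaler = MinMaxScaler()                        # rescale each input to the 0-1 range
        X_train = scaler.fit_transform(X[train_idx])   # min/max taken from the training portion
        X_test = scaler.transform(X[test_idx])
        yield X_train, y[train_idx], X_test, y[test_idx]
```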
Least-square support vector machine
Hybrid of particle swarm optimization and genetic algorithm
Whale optimization algorithm based on self-adapting parameter adjustment and mix mutation strategy
Ameliorative moth-flame optimization
The resultant hybrid LSSVM models developed via the integration of the HPSOGA, SMWOA and AMFO, respectively, were labelled as HPSOGA-LSSVM, SMWOA-LSSVM and AMFO-LSSVM. For all the advanced optimization algorithms, 10 search agents were assigned with a maximum of 100 iterations.
Performance evaluation
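The RMSE, MAE, R2 and MAPE reported in Tables 2 and 3 follow their standard definitions; a minimal computational sketch is given below. The GPI, which aggregates the scaled individual indicators, is study-specific and is not reproduced here.

```python
import numpy as np

def evaluate(obs: np.ndarray, pred: np.ndarray) -> dict:
    """Standard error metrics between observed and predicted WQI values."""
    residual = pred - obs
    rmse = float(np.sqrt(np.mean(residual ** 2)))           # root mean square error
    mae = float(np.mean(np.abs(residual)))                  # mean absolute error
    mape = float(100.0 * np.mean(np.abs(residual / obs)))   # mean absolute percentage error (%)
    r2 = float(1.0 - np.sum(residual ** 2) / np.sum((obs - obs.mean()) ** 2))  # coefficient of determination
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2}
```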

RESULTS AND DISCUSSION
Preliminary screening and effect of kernel functions
During the preliminary screening, a total of 63 different combinations of input parameters were tested for their suitability in the development of the LSSVM models. The best combinations for each number of input parameters were selected for presentation in this paper. Initially, all six water quality parameters were fed into the LSSVM models for WQI prediction (namely C6). Then, the number of input parameters was reduced one at a time to create the C5, C4, C3, C2 and C1 combinations accordingly. The pH input was discarded first, followed by NH3-N, SS, BOD and lastly COD, until DO was the sole remaining input parameter, as shown in Table 1. The performance of the LSSVM models using different kernel functions and input combinations is summarized in Table 2.
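For context, 63 equals 2^6 − 1, i.e. every non-empty subset of the six water quality parameters; the short sketch below enumerates that pool, on the assumption that this is how the screening candidates were generated before the best combinations in Table 1 were retained.

```python
from itertools import combinations

PARAMS = ["DO", "BOD", "COD", "SS", "NH3-N", "pH"]
screening_pool = [c for r in range(1, len(PARAMS) + 1) for c in combinations(PARAMS, r)]
assert len(screening_pool) == 63   # all non-empty subsets of the six parameters
```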
Selected input combinations during preliminary screening
| Combination | Water quality parameters |
|---|---|
| C1 | DO |
| C2 | DO, COD |
| C3 | DO, BOD, COD |
| C4 | DO, BOD, COD, SS |
| C5 | DO, BOD, COD, SS, NH3-N |
| C6 | DO, BOD, COD, SS, NH3-N, pH |
Performance of the LSSVM using different kernel functions for WQI prediction
LSSVM (Linear)

| Combination | RMSE | MAE | R2 | MAPE (%) | GPI |
|---|---|---|---|---|---|
| C1 | 19.12 | 13.58 | 0.66 | 29.42 | −1.93 |
| C2 | 15.33 | 10.72 | 0.53 | 20.62 | −1.40 |
| C3 | 22.74 | 16.29 | 0.48 | 31.71 | −2.55 |
| C4 | 22.25 | 17.31 | 0.39 | 32.54 | −2.73 |
| C5 | 20.90 | 16.77 | 0.39 | 31.17 | −2.58 |
| C6 | 21.27 | 17.47 | 0.33 | 31.80 | −2.73 |

LSSVM (Polynomial)

| Combination | RMSE | MAE | R2 | MAPE (%) | GPI |
|---|---|---|---|---|---|
| C1 | 25.71 | 17.72 | 0.00 | 34.61 | −3.41 |
| C2 | 17.24 | 12.73 | 0.28 | 23.22 | −1.98 |
| C3 | 14.58 | 9.68 | 0.40 | 17.41 | −1.32 |
| C4 | 8.78 | 6.24 | 0.70 | 11.32 | −0.29 |
| C5 | 7.40 | 5.54 | 0.77 | 10.63 | −0.08 |
| C6 | 7.35 | 5.49 | 0.64 | 10.79 | −0.22 |

LSSVM (RBF)

| Combination | RMSE | MAE | R2 | MAPE (%) | GPI |
|---|---|---|---|---|---|
| C1 | 8.28 | 6.57 | 0.52 | 13.08 | −0.55 |
| C2 | 6.73 | 5.11 | 0.76 | 10.17 | −0.02 |
| C3 | 6.01 | 4.55 | 0.80 | 8.78 | 0.14 |
| C4 | 5.55 | 3.83 | 0.89 | 7.58 | 0.35 |
| C5 | 5.88 | 3.92 | 0.91 | 8.24 | 0.33 |
| C6 | 6.18 | 4.25 | 0.87 | 8.66 | 0.24 |
The three oxygen-based water quality parameters, namely DO, COD and BOD, appear to be the most important inputs for accurate modelling and prediction of the WQI. This is because the oxygen dynamics in the river water reflect the suitability of the water for the aquatic ecosystem. The oxygen-based parameters were followed by the SS, which represents the inorganic composition and possibly the turbidity of the river water. Meanwhile, NH3-N (the organic representation) and pH were deemed the least important parameters and had minimal influence on the accuracy of the models.
From Table 2, it can be clearly seen that the LSSVM model using the RBF kernel function had the best performance in predicting the WQI. This can be observed from the lower error metrics and the higher values of R2 as well as the GPI. From the results, one can observe that the linear kernel function failed to predict the WQI satisfactorily. This could be due to the highly non-linear nature of the interactions among the water quality parameters. When the kernel functions were switched to polynomial (degree of 3) and RBF, the accuracy of the prediction improved drastically, with the effect of the input combinations becoming more prominent. The increase in the number of input parameters from C1 to C6 generally enhanced the performance of the LSSVM models, as shown by the increasing values of the GPI. Thus, the LSSVM model with the RBF kernel function was selected as the base model for hybridization at the next stage, as it achieved the highest average GPI value (0.08) compared with those of the LSSVM models with linear (−2.32) and polynomial (−1.22) kernel functions. This could be due to the Gaussian function, which maps the inputs into the feature space more effectively.
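For reference, the Gaussian RBF kernel assumed here takes the textbook form below, where σ is the kernel width tuned during the subsequent hybridization; the notation may differ from that of Equations (3) and (9) in the methods.

```latex
K(\mathbf{x}_i, \mathbf{x}_j) = \exp\!\left( -\frac{\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}}{2\sigma^{2}} \right)
```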
Optimized LSSVM models
In this next stage, the LSSVM model with the RBF kernel function (the base model) was hybridized with the various advanced optimization algorithms. To be exact, each of the advanced optimization algorithms (HPSOGA, SMWOA and AMFO) was separately wrapped around the LSSVM model so that it could receive feedback from the LSSVM model in the form of a fitness value. Based on the fitness value received, the advanced optimization algorithm would search for the global optimum of the LSSVM hyperparameters. Since the RBF was the chosen kernel function, the hyperparameters tuned were γ and σ (refer to Equations (3) and (9)), with the search boundaries set between 0.001 and 100. Table 3 summarizes the performance of the three resultant optimized LSSVM models, namely the HPSOGA-LSSVM, SMWOA-LSSVM and AMFO-LSSVM.
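Before turning to Table 3, a minimal, self-contained sketch of this tuning loop is given here. Plain random search stands in for the HPSOGA/SMWOA/AMFO search agents purely for illustration, the closed-form solver follows the standard LSSVM regression dual, and none of the function names correspond to the authors' code.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the standard LSSVM dual system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:], A[1:, 0] = 1.0, 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                                    # bias b, coefficients alpha

def fitness(params, X_tr, y_tr, X_val, y_val):
    """Train an LSSVM with a candidate (gamma, sigma) pair and return the validation RMSE."""
    gamma, sigma = params
    b, alpha = lssvm_fit(X_tr, y_tr, gamma, sigma)
    pred = rbf_kernel(X_val, X_tr, sigma) @ alpha + b
    return np.sqrt(np.mean((pred - y_val) ** 2))

def tune(X_tr, y_tr, X_val, y_val, n_agents=10, max_iter=100, bounds=(0.001, 100.0), seed=0):
    """Random-search stand-in for the metaheuristic: 10 agents per iteration, 100 iterations."""
    rng = np.random.default_rng(seed)
    best, best_fit = None, np.inf
    for _ in range(max_iter):
        agents = rng.uniform(bounds[0], bounds[1], size=(n_agents, 2))  # candidate (gamma, sigma) pairs
        for a in agents:
            f = fitness(a, X_tr, y_tr, X_val, y_val)                    # LSSVM feeds fitness back
            if f < best_fit:
                best, best_fit = a, f
    return best, best_fit
```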
Performance of the hybrid optimized LSSVM models for WQI prediction
HPSOGA-LSSVM

| Combination | RMSE | MAE | R2 | MAPE (%) | GPI |
|---|---|---|---|---|---|
| C1 | 7.51 | 5.86 | 0.63 | 11.76 | −0.30 |
| C2 | 7.81 | 6.17 | 0.74 | 12.15 | −0.23 |
| C3 | 5.23 | 3.87 | 0.84 | 7.53 | 0.31 |
| C4 | 4.39 | 3.08 | 0.90 | 6.03 | 0.52 |
| C5 | 4.12 | 2.96 | 0.91 | 5.84 | 0.56 |
| C6 | 3.96 | 2.86 | 0.90 | 5.69 | 0.57 |

SMWOA-LSSVM

| Combination | RMSE | MAE | R2 | MAPE (%) | GPI |
|---|---|---|---|---|---|
| C1 | 6.97 | 5.46 | 0.66 | 10.46 | −0.17 |
| C2 | 6.67 | 5.06 | 0.78 | 10.22 | 0.01 |
| C3 | 5.23 | 3.87 | 0.84 | 7.53 | 0.31 |
| C4 | 4.33 | 3.07 | 0.90 | 6.01 | 0.53 |
| C5 | 4.10 | 2.95 | 0.92 | 5.83 | 0.57 |
| C6 | 3.93 | 2.90 | 0.90 | 5.62 | 0.57 |

AMFO-LSSVM

| Combination | RMSE | MAE | R2 | MAPE (%) | GPI |
|---|---|---|---|---|---|
| C1 | 7.09 | 5.67 | 0.65 | 10.80 | −0.21 |
| C2 | 6.67 | 5.06 | 0.78 | 10.22 | 0.01 |
| C3 | 4.96 | 3.71 | 0.85 | 7.26 | 0.36 |
| C4 | 5.18 | 3.93 | 0.91 | 7.66 | 0.38 |
| C5 | 4.14 | 3.01 | 0.91 | 5.93 | 0.55 |
| C6 | 4.94 | 3.46 | 0.90 | 6.96 | 0.44 |
If a comparison is made between Tables 2 and 3, the evidence that the advanced optimization algorithms improved the base LSSVM model is explicit. Taking the C6 case as an example, the RMSE, MAE, R2, MAPE and GPI of the base LSSVM model were 6.18, 4.25, 0.87, 8.66% and 0.24, respectively. In comparison, the same metrics were 3.96, 2.86, 0.90, 5.69% and 0.57 for the HPSOGA-LSSVM model, 3.93, 2.90, 0.90, 5.62% and 0.57 for the SMWOA-LSSVM model, and 4.94, 3.46, 0.90, 6.96% and 0.44 for the AMFO-LSSVM model. Similar observations were obtained for the models trained with different input combinations. The dispersion and distribution of the search agents initialized by the advanced optimization algorithms resolved the constraints caused by the random selection of the initial point in the base LSSVM model, thus resulting in better WQI predictions.
Although the HPSOGA-LSSVM, SMWOA-LSSVM and AMFO-LSSVM models outperformed their base LSSVM counterpart, a comparison is still needed to examine the efficiency and effectiveness of the individual advanced optimization algorithms. The HPSOGA-LSSVM model achieved an average GPI value of 0.24, whereas the average GPI values of the SMWOA-LSSVM and AMFO-LSSVM models were 0.31 and 0.26, respectively. From the macro aspect (in terms of the GPI), the SMWOA-LSSVM model predicted the WQI better than the HPSOGA-LSSVM and AMFO-LSSVM models. This could be due to the nature of the SMWOA, which incorporates a progressive mechanism whereby the potential search space gradually shrinks as the iterations increase. This notable characteristic is not found in the HPSOGA, which focuses on the survival of fitter swarms, or in the AMFO, which focuses on computational efficiency rather than the final results. To avoid premature convergence in the shrinking search space, the SMWOA further rebalances the 'exploration' and 'exploitation' phases so that the global optimum can be located without lengthening the optimization time. In other words, the SMWOA shows its upper hand over the HPSOGA (retaining fitter swarms being analogous to shrinking the potential search space) and the AMFO (rebalancing the phases to avoid premature convergence being analogous to optimizing the computational efficiency). This, in turn, resulted in the better performance of the SMWOA-LSSVM model compared with the other optimized models.
From the micro-scale point of view, as the number of input water quality parameters increases, the performance of the optimized LSSVM models also improves. This property was inherited from the base LSSVM model using the RBF kernel function. However, the improvement due to the increase in the number of input parameters reaches an optimum, after which it becomes less significant as further parameters are added, for example from C4 to C6. That is to say, after the four-parameter suite (DO, BOD, COD and SS) was utilized, the five-parameter option (plus NH3-N) and six-parameter option (plus NH3-N and pH) did not translate into significantly improved performance of the optimized models. This finding can be taken positively, as it indicates that the optimized LSSVM models, particularly the HPSOGA-LSSVM and SMWOA-LSSVM models, can predict equally well regardless of the number of input parameters, as long as at least four water quality parameters are provided; that is to say, the optimal number of inputs would be four. On top of that, the MAPE of the HPSOGA-LSSVM and SMWOA-LSSVM models using C4, C5 and C6 was at most 6%. Although DO appears to be the most important water quality parameter for WQI prediction using the optimized LSSVM models, prediction depending solely on DO was not reliable due to the high error incurred. In other words, complementary water quality parameters such as BOD, COD and SS are mandatory for good prediction, irrespective of the model used. At this stage, at least three water quality parameters are needed for the optimized LSSVM models to achieve WQI prediction with a MAPE of less than 10%.
The comparison between all the LSSVM models (base and optimized) in this study is illustrated in Figure 3. The advanced optimization algorithms improved the performance of the base LSSVM model, regardless of the input combination fed. The SMWOA-LSSVM model stood out among the hybridized models and is suggested for use as a predictive model for the WQI at the Klang River.
A comparison with similar studies carried out at the Klang River using other machine learning models is made here. According to the results reported by Hameed et al. (2016), although the RBF neural network performed better than the SMWOA-LSSVM model of this work (both using C6 as the input combination), the accuracy of the RBF neural network degraded as the number of input water quality parameters decreased. For inputs with limited parameters, the SMWOA-LSSVM model of this work performed better than both the RBF neural network and the back-propagation neural network. Besides, the SMWOA-LSSVM model developed in this study also performed similarly to the hybridized random forest, conditional random forest, random forest generator and extreme gradient boosting models reported by Tiyasha et al. (2021), although the models in the latter study used significantly more input water quality parameters. This means that the LSSVM models hybridized using advanced optimization algorithms are more resilient to a low number of input water quality parameters, which is more useful for river water quality monitoring and management.
CONCLUSIONS
In this study, three novel hybrid LSSVM models, each based on the separate integration of an advanced optimization algorithm (HPSOGA, SMWOA or AMFO) with the initially selected base model (LSSVM with the RBF kernel function), were developed for WQI prediction at station 1K08 of the Klang River in Kuala Lumpur. It was found that the optimized LSSVM models performed better than the standalone base model, as portrayed by lower RMSE, MAE and MAPE as well as higher R2 and GPI. Among the optimized LSSVM models, the SMWOA-LSSVM had the best performance, followed by the HPSOGA-LSSVM and then the AMFO-LSSVM. The RMSE, MAE, R2, MAPE and GPI of the SMWOA-LSSVM for WQI prediction were 3.93, 2.90, 0.90, 5.62% and 0.57, which represents a significant improvement over its base counterpart (6.18, 4.25, 0.87, 8.66% and 0.24) when C6 was used as the input combination. Similar observations were made for the other input combinations. The SMWOA encompasses the progressive reduction of the search space (faster convergence) as well as the trade-off between the 'exploration' and 'exploitation' phases (preventing premature convergence). This helps to tune the hyperparameters of the LSSVM so that minimal loss is achieved with optimum WQI prediction performance. Conversely, the absence of these notable SMWOA characteristics in the HPSOGA and AMFO led to the relatively weaker performance of the latter two models. In general, the SMWOA-LSSVM produced reliable predictions of the WQI at station 1K08 of the Klang River, as long as at least DO, BOD and COD were included in the input combination during the training of the optimized LSSVM model. Having said that, the validity of this study is constrained to this single sampling station along the Klang River. More comprehensive work involving greater spatial variability should be undertaken in future to further justify the integration of advanced optimization algorithms into the LSSVM model.
ACKNOWLEDGEMENTS
This research was funded by Universiti Tunku Abdul Rahman (UTAR), Malaysia, through the Universiti Tunku Abdul Rahman Research Fund under project number IPSR/RMC/UTARRF/2020-C2/K03. The authors would also like to express their gratitude to the Department of Environment, Malaysia for the provision of water quality data.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.