Improving the performance of machine learning (ML) algorithms is essential for accurately estimating water quality parameters (WQPs). This study introduces, for the first time, a novel hybrid framework, the adaptive neuro-fuzzy inference system–discrete wavelet transform–gradient-based optimization (ANFIS–DWT–GBO), for estimating electrical conductivity (EC) and total dissolved solids (TDS). Before estimating WQPs, the performance of the ANFIS–DWT–GBO is verified on several benchmark data sets. In addition, three benchmark algorithms, ANFIS, ANFIS–DWT, and ANFIS–GBO, are used to demonstrate the strength of the novel framework. The principal component analysis (PCA) method determines the best input combination for EC and TDS estimation. The results show that the ANFIS–DWT–GBO is very successful and competitive in benchmark data set modeling and WQP estimation compared to the other algorithms. This result is due to the simultaneous use of the DWT and an optimization algorithm in the proposed framework: the DWT preprocesses the WQP data before they are passed to the algorithms, and the GBO optimizes the tunable parameters of the ANFIS. The highest accuracy in estimating EC and TDS is obtained at the Mollasani and Gotvand stations, respectively. The correlation coefficient (R) is 0.99 at the Mollasani station and 0.98 at the Gotvand station.

  • Introducing a novel hybrid framework based on the ANFIS, discrete wavelet transform, and optimization algorithm, namely the ANFIS–DWT–GBO.

  • Proving the performance of the ANFIS–DWT–GBO by using several benchmark data sets.

  • Employment of the ANFIS–DWT–GBO for estimating water quality parameters (WQPs).

  • The proposed framework has the potential to analyze other engineering problems.

Graphical Abstract


Rivers are among the most important water resources, and their quality is critical (Kumar et al. 2020). River pollution has recently increased for various reasons and can cause many problems for humans and the environment. River water pollution is therefore a vital issue (Antanasijević et al. 2020), and the accurate assessment of water quality parameters (WQPs) is an important challenge. One way to estimate WQPs is repeated sampling and testing, which is prone to error. Physical and mathematical methods have further disadvantages: they require initial and boundary conditions and are costly (Imani et al. 2021). In recent years, the development of machine learning (ML) algorithms has addressed these problems of traditional methods (Dawood et al. 2021). ML algorithms offer high accuracy and speed and need no initial or boundary conditions (Shah et al. 2021). Modeling and analysis with artificial intelligence is among the best ways to investigate river pollution problems (Abba et al. 2020).

In recent years, researchers have used several classical algorithms to estimate and predict WQPs (Olyaie et al. 2017). Bilali et al. (2021) employed different ML algorithms for WQP prediction and showed that stochastic gradient descent (SGD) and the artificial neural network (ANN) were more accurate than the other algorithms. Guo et al. (2020) applied random forest (RF), support vector regression (SVR), and neural networks (NNs) to WQP estimation and found that the NN performed better than the other two. Deep learning methods have also predicted water quality with high accuracy (Khullar & Singh 2022). In a comparison of long short-term memory (LSTM), multi-linear regression (MLR), and ANN, the ANN and MLR gave better outcomes (Kouadri et al. 2022).

In recent years, researchers have sought to increase the accuracy of ML algorithms (Morshed-Bozorgdel et al. 2022). One way to improve the performance of simulation algorithms is to use optimization algorithms to determine their important parameters, which produces new hybrid algorithms (Farzin et al. 2022). New optimization algorithms have recently been introduced to find the best solutions to various problems (Abualigah et al. 2021), and hybrid algorithms built on them can model and estimate various parameters more accurately. Because the important fine-tuning parameters of simulation algorithms are otherwise set by trial and error, optimization algorithms are well suited to this task.

The shuffled frog leaping algorithm (SFLA) and genetic algorithm (GA) were used to train SVR to model and predict WQPs (Mahmoudi et al. 2016). Kadkhodazadeh & Farzin (2021) trained the least-squares support vector machine (LSSVM) with the gradient-based optimization (GBO) algorithm and showed that the LSSVM–GBO performs better than classical algorithms. Song et al. (2021a, 2021b) developed a hybrid algorithm, synchrosqueezed wavelet transform (WT)–improved sparrow search algorithm (ISSA)–LSTM, that predicted water quality with high accuracy. Hybrid algorithms based on one-dimensional residual convolutional neural networks (1-DRCNNs) and bi-directional gated recurrent units (BiGRU) outperformed previous algorithms (Yan et al. 2021). Banadkooki et al. (2020) used new hybrid ML algorithms to predict total dissolved solids (TDS). Tizro et al. (2021) applied ANN, the adaptive neuro-fuzzy inference system (ANFIS), and ANFIS-subtractive clustering to estimate the TDS of the Zayandehrood river. Wu et al. (2022) predicted dissolved oxygen (DO) by combining extreme gradient boosting (XGBoost), ISSA, and LSTM. Kadkhodazadeh & Farzin (2022) introduced a hybrid algorithm that integrates the arithmetic optimization algorithm (AOA) with LSSVM and applied it to WQP estimation. The technique for order of preference by similarity to ideal solution (TOPSIS) ranked the LSSVM–AOA above classical and hybrid algorithms, and the Monte Carlo method (MCM) showed that it has less uncertainty.

Another way to increase the accuracy of ML algorithms is to preprocess the input data with the WT. The WT, which is used for signal processing and time-series data, reduces data overlap and the effect of noise on ML models. WT methods have recently been widely used because they are simple to implement and cancel noise effectively (Alizadeh et al. 2017). The method decomposes a signal and removes its noisy components while preserving the important features of the time-series data (Yu et al. 2020). WT has been widely used in estimating WQPs and in studies related to water resources management and hydrology (Parmar et al. 2019; Bayatvarkeshi et al. 2020; Jamei et al. 2020).

To further improve the accuracy of WQP estimation, we propose a novel hybrid framework. In the reviewed literature, no study has estimated WQPs using the ANFIS, discrete wavelet transform (DWT), and GBO together. Using the DWT and GBO simultaneously can improve the accuracy of simulation algorithms significantly: the DWT analyzes the WQP data, and the GBO adjusts the ANFIS parameters. Therefore, the present study used the ANFIS–DWT–GBO to estimate EC and TDS in the Karun river. First, the ANFIS–DWT–GBO is evaluated on three benchmark data sets and compared with the ANFIS, ANFIS–DWT, and ANFIS–GBO. The principal component analysis (PCA) method is then used to determine the important inputs for EC and TDS estimation. Next, based on the best input combination, the ability of the ANFIS, ANFIS–DWT, ANFIS–GBO, and ANFIS–DWT–GBO to estimate EC and TDS is evaluated. Finally, a superior hybrid framework is proposed to improve the performance of the ANFIS.

Adaptive neuro-fuzzy inference system

The ANFIS structure consists of nonlinear and linear parts (Karaboga & Kaya 2019). ANFIS combines the ANN and the fuzzy inference system (FIS) and thereby draws on the strengths of both NNs and fuzzy logic (Farzin & Valikhan Anaraki 2021). As shown in Figure 1, the ANFIS consists of the following five layers:
Figure 1

The ANFIS structure.

Layer 1 (input layer): the nodes of this layer are adaptive. For a system with two inputs and two rules, the output of each node, O1,i, is the value of a fuzzy membership function (MF):
\[ O_{1,i} = \mu_{W_i}(x), \quad i = 1, 2 \]
(1)
\[ O_{1,i} = \mu_{Z_{i-2}}(y), \quad i = 3, 4 \]
(2)
where x and y are the inputs, and Wi and Zi are the linguistic labels characterized by the MFs μWi and μZi.
Layer 2 (rule layer): each node output in this layer, O2,i, represents the firing strength wi of its corresponding fuzzy rule:
\[ O_{2,i} = w_i = \mu_{W_i}(x)\,\mu_{Z_i}(y), \quad i = 1, 2 \]
(3)
where wi is the firing strength of the ith rule.
Layer 3 (average layer): each node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths:
\[ O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \]
(4)
where \bar{w}_i is the normalized firing strength, and O3,i is the output of the third layer.
Layer 4 (consequent layer): the nodes of this layer are adaptive processing units:
\[ O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left( p_i x + q_i y + r_i \right) \]
(5)
where O4,i is the output of the fourth layer, and pi, qi, and ri are the parameters of this layer, referred to as consequent parameters.
Layer 5 (output layer): this layer computes the overall output as the summation of all incoming signals:
\[ O_{5,i} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \]
(6)
where O5,i is the output.
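The five layers can be traced in a short forward pass. The sketch below assumes a first-order Sugeno system with two inputs, two rules, and Gaussian MFs; the function names and all parameter values are illustrative and are not taken from the study.

```python
import math

def gaussian_mf(value, mean, sd):
    """Gaussian membership degree of a crisp input."""
    return math.exp(-0.5 * ((value - mean) / sd) ** 2)

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input ANFIS.

    premise:    per rule, ((mean_x, sd_x), (mean_y, sd_y)) of the Gaussian MFs
    consequent: per rule, (p, q, r) of the linear consequent p*x + q*y + r
    """
    # Layer 1: membership degrees of both inputs for each rule
    mu = [(gaussian_mf(x, *mf_x), gaussian_mf(y, *mf_y)) for mf_x, mf_y in premise]
    # Layer 2: firing strength of each rule (product of membership degrees)
    w = [mu_x * mu_y for mu_x, mu_y in mu]
    # Layer 3: normalized firing strengths
    total = sum(w)
    w_bar = [w_i / total for w_i in w]
    # Layer 4: normalized strength times the linear consequent
    o4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output is the sum of the layer-4 signals
    return sum(o4), w_bar

# Illustrative parameters only
premise = [((0.0, 1.0), (0.0, 1.0)), ((2.0, 1.0), (2.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 0.5)]
output, w_bar = anfis_forward(1.0, 1.0, premise, consequent)
```

In this sketch the premise parameters (mean and SD of each Gaussian MF) are the quantities an optimizer would tune, while the normalized firing strengths always sum to one.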

Discrete wavelet transform

The WT method was introduced by Grossmann & Morlet (1984). WT is a data preprocessing method that removes noise from the data. A wavelet is a zero-mean function that provides information about frequency and time simultaneously (Malekzadeh et al. 2019). WT has many advantages over other data analysis techniques and is used to extract hidden information from time-series data (Jamei et al. 2020). WT reveals different and effective time-series factors (trends, discontinuities, breakdown points) by analyzing the data's frequency range and time scale. DWT is a type of WT. The DWT method is more useful in water resources and hydrology studies because hydrological data are measured at discrete time steps. In the DWT, wavelets are sampled discretely, and the input signal is converted to a set of functions called wavelets, which depend on the type of mother wavelet (Anaraki et al. 2021). Choosing a mother wavelet requires trial and error. This study used the third-order Daubechies (db3) wavelet as the base function. The DWT is expressed as (Mallat 1989):
formula
(7)
where i and L are integers that control the scale and translation (shift) parameters, respectively, h is the wavelet base function applied to the time-series data, and * denotes the complex conjugate. The wavelet decomposition level can be determined in two ways: by trial and error or by Equation (8) (Wang & Ding 2003):
formula
(8)
where L is the wavelet decomposition level, and n is the number of data. In this research, according to Equation (8), the number of wavelet decomposition levels is equal to 2.
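A two-level db3 decomposition can be sketched in plain Python as follows. The filter values are the standard six-tap db3 scaling coefficients; the periodic boundary extension, the function names, and the test series are assumptions made for illustration, since the study does not state its software or boundary handling.

```python
import math

# Standard db3 scaling (low-pass) filter coefficients, 6 taps.
DB3_LO = [
    0.3326705529509569, 0.8068915093133388, 0.4598775021193313,
    -0.13501102001039084, -0.08544127388224149, 0.035226291882100656,
]
# High-pass filter from the quadrature-mirror relation g[k] = (-1)^k * h[L-1-k].
DB3_HI = [(-1) ** k * DB3_LO[len(DB3_LO) - 1 - k] for k in range(len(DB3_LO))]

def dwt_step(signal):
    """One decomposition level with periodic extension (assumed boundary rule)."""
    n = len(signal)
    if n % 2:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for i in range(n // 2):
        # Filter and downsample by 2; indices wrap around the series end
        window = [signal[(2 * i + k) % n] for k in range(len(DB3_LO))]
        approx.append(sum(h * s for h, s in zip(DB3_LO, window)))
        detail.append(sum(g * s for g, s in zip(DB3_HI, window)))
    return approx, detail

def wavedec(signal, level):
    """Multilevel decomposition: [approx_level, detail_level, ..., detail_1]."""
    coeffs = []
    approx = list(signal)
    for _ in range(level):
        approx, detail = dwt_step(approx)
        coeffs.insert(0, detail)
    coeffs.insert(0, approx)
    return coeffs

# Illustrative series: a slow oscillation plus a fast alternating component
series = [math.sin(0.4 * t) + 0.1 * (-1) ** t for t in range(16)]
a2, d2, d1 = wavedec(series, level=2)
```

The approximation `a2` carries the smooth part of the series and `d1`, `d2` carry the finer-scale detail; under the periodic extension assumed here the transform is orthogonal, so the coefficients preserve the energy of the input series.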

Gradient-based optimization

The GBO algorithm, which combines population-based and gradient-based methods, was first introduced by Ahmadianfar et al. (2020). GBO uses Newton's method to reach better positions in the search space through two main operators, the gradient search rule (GSR) and the local escaping operator (LEO). The structure of GBO is summarized in the following sections. For more details, see Ahmadianfar et al. (2020).

Gradient search rule

The GSR, which is based on the gradient method, is the core of the GBO. It is used to enhance the exploration tendency and to accelerate the convergence rate, the latter through the direction of movement (DM) term. The current vector position is therefore updated as follows:
formula
(9)
formula
(10)
formula
(11)
where randn, rand, and are random numbers, is a balance factor, Δx is dealing with the difference between the best solution () and a randomly selected position (), and ε is a small number. The new vector () is obtained as follows:
formula
(12)
Finally, a new version of the solution is found as follows:
formula
(13)
and are random numbers. defined as:
formula
(14)

Local escaping operator

The LEO prevents the search from becoming trapped in local optima and speeds up convergence, which equips the GBO to solve complex problems. The LEO creates the best solution by combining multiple solutions as follows:
formula
(15)
where , and are uniform distributed random numbers ɛ [−1; 1], and is the probability, while , , and are three random numbers.

Novel hybrid framework (ANFIS–DWT–GBO)

ANFIS performs poorly in large search spaces and on difficult problems because it converges slowly and becomes trapped in local optima. The ANFIS has various parameters that greatly affect its performance, and determining them by trial and error is error-prone. In addition, analyzing the input data accurately with the DWT helps to increase the accuracy of algorithms. For these reasons, this paper presents, for the first time, a novel framework that integrates ANFIS, DWT, and GBO. Using the DWT and an optimization algorithm simultaneously greatly helps increase the accuracy of ANFIS. The framework uses the GBO for ANFIS training instead of the classical training algorithms. The ANFIS–DWT–GBO flow chart is shown in Figure 2. The steps of the ANFIS–DWT–GBO are as follows:
  • Input data are processed using DWT and divided into training (70%) and testing (30%) data.

  • The initial parameters of the GBO and ANFIS (mean and standard deviation (SD) of Gaussian MFs) as decision variables are randomly determined.

  • The GBO is used to train the ANFIS.

  • In this phase, the stop criterion is checked. The solution has been obtained if the stop criterion is satisfied; otherwise, the simulation process repeats.
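The control flow of these steps (split, random initialization, iterative training, stop check) can be sketched as follows. This is only a skeleton under loud assumptions: the optimizer is a simple random perturbation search standing in for the GBO, a hypothetical linear model stands in for the ANFIS, the 70/30 split is taken in order, and the DWT preprocessing step is omitted.

```python
import random

def split_train_test(samples, train_fraction=0.7):
    """Keep the first 70% of the samples for training and the rest for testing."""
    cut = int(round(train_fraction * len(samples)))
    return samples[:cut], samples[cut:]

def mean_squared_error(params, data, model):
    return sum((model(params, x) - y) ** 2 for x, y in data) / len(data)

def train(data, model, n_params, max_iter=500, tol=1e-6, seed=0):
    """Stand-in optimizer (random perturbation search); this is NOT the GBO."""
    rng = random.Random(seed)
    # Decision variables are initialized randomly
    best = [rng.uniform(-1.0, 1.0) for _ in range(n_params)]
    best_err = mean_squared_error(best, data, model)
    for _ in range(max_iter):
        if best_err < tol:  # stop criterion satisfied
            break
        trial = [p + rng.gauss(0.0, 0.1) for p in best]
        err = mean_squared_error(trial, data, model)
        if err < best_err:  # keep only improving candidates
            best, best_err = trial, err
    return best, best_err

def linear_model(params, x):
    """Hypothetical stand-in for the ANFIS."""
    return params[0] * x + params[1]

# Illustrative data, not from the study
samples = [(t / 10.0, 2.0 * (t / 10.0) + 1.0) for t in range(20)]
train_set, test_set = split_train_test(samples)
params, train_err = train(train_set, linear_model, n_params=2)
test_err = mean_squared_error(params, test_set, linear_model)
```

In the framework described above, the decision variables would be the means and SDs of the Gaussian MFs, and the update rule would be that of the GBO rather than the perturbation step used here.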

Figure 2

Flowchart of the ANFIS–DWT–GBO.


Principal component analysis

This study used PCA to select the best inputs for estimating EC and TDS. PCA is a statistical method for reducing the number of input variables (Jahin et al. 2020). Its purpose is to reduce the dimensionality of the variables without losing much information and to identify the prominent input data through the principal components (Kazakis et al. 2017). The principal components are computed as follows:
formula
(16)
where is eigenvector, and X is input variable.
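As an illustration of how a principal component is obtained from an eigenvector and the input variables, the sketch below standardizes a small made-up data set, forms its correlation matrix, and extracts the leading component by power iteration. Working on the correlation matrix, the power-iteration solver, and the sample values are assumptions for illustration; the study does not state how its PCA was computed.

```python
import math

def standardize(columns):
    """Scale each variable to zero mean and unit sample variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        out.append([(v - mean) / sd for v in col])
    return out

def correlation_matrix(z):
    """Correlation matrix of already standardized variables."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]

def leading_component(matrix, iterations=200):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    vec = [1.0] * len(matrix)
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the eigenvalue for the converged vector
    eigenvalue = sum(v * sum(m * u for m, u in zip(row, vec))
                     for v, row in zip(vec, matrix))
    return eigenvalue, vec

def scores(z, vec):
    """Component score of each sample: weighted sum of the standardized inputs."""
    return [sum(a * col[i] for a, col in zip(vec, z)) for i in range(len(z[0]))]

# Made-up values for three variables observed five times
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 7.8, 10.1],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
z = standardize(columns)
corr = correlation_matrix(z)
eigenvalue, vec = leading_component(corr)
pc1 = scores(z, vec)
explained = eigenvalue / len(columns)  # share of total variance carried by PC1
```

The entries of `vec` play the role of loadings: variables with large absolute entries dominate the component. Power iteration returns only the first component; further components would require deflation or a full eigendecomposition.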

Benchmark data set

To fully evaluate and validate the performance of the ANFIS–DWT–GBO, several benchmark data sets, including Housing, LSVT, and Servo, were used. The features of the benchmark data sets are listed in Table 1. Benchmark data sets are a good tool for evaluating the accuracy of new algorithms and comparing them with other algorithms (Henríquez & Ruz 2017). Each benchmark data set was obtained from several experiments or studies. For more information about benchmark data sets, please see https://archive.ics.uci.edu/ml/datasets.php.

Table 1

Details of benchmark data sets

| | Housing | LSVT | Servo |
|---|---|---|---|
| Train data | 354 | 88 | 117 |
| Test data | 152 | 38 | 50 |
| Attributes | 13 | 309 | |
| Attribute characteristics | Real | Real | Categorical, Integer |
| Data set characteristics | Multivariate | Multivariate | Multivariate |

Evaluation criteria

To evaluate the WQP values estimated by the different algorithms, four statistical criteria were used: mean absolute error (MAE), relative root mean square error (RRMSE), R, and coefficient of determination (R²). Expressions for these measures are given as follows (Kadkhodazadeh et al. 2022):
formula
(17)
formula
(18)
formula
(19)
formula
(20)
where N is the number of data, O and E are the observed and estimated values, and Ō and Ē are the means of the observed and estimated values, respectively.
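These criteria are straightforward to compute. The sketch below implements MAE and the Pearson correlation coefficient R in their usual forms; for RRMSE it assumes the root mean square error divided by the mean observed value, which is one common normalization and may differ from the definition used in the study. The sample values are invented for illustration.

```python
import math

def mae(obs, est):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)

def rrmse(obs, est):
    """RMSE divided by the mean observed value (assumed normalization)."""
    rmse = math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))
    return rmse / (sum(obs) / len(obs))

def pearson_r(obs, est):
    """Pearson correlation coefficient between observed and estimated values."""
    o_mean = sum(obs) / len(obs)
    e_mean = sum(est) / len(est)
    num = sum((o - o_mean) * (e - e_mean) for o, e in zip(obs, est))
    den = math.sqrt(sum((o - o_mean) ** 2 for o in obs)
                    * sum((e - e_mean) ** 2 for e in est))
    return num / den

# Invented EC-like values, not study data
observed = [500.0, 520.0, 540.0, 560.0]
estimated = [505.0, 515.0, 545.0, 555.0]
mae_value = mae(observed, estimated)
rrmse_value = rrmse(observed, estimated)
r_value = pearson_r(observed, estimated)
```

MAE keeps the units of the estimated parameter, whereas RRMSE and R are dimensionless, which is why the latter two are easier to compare across stations with different EC and TDS ranges.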

Case study and data sources

The Karun river, located in southwestern Iran, is the study area. With a basin area of 67,257 km2, the Karun river covers two provinces of Iran (Khouzestan and Chaharmahal and Bakhtiari). It is the most important river in Iran and the country's only navigable river. Four stations on the Karun river, Armand, Ahvaz, Gotvand, and Mollasani, were selected to estimate WQPs. These stations have different climates due to their geographical location. Figure 3 shows the geographic locations of the investigated stations.
Figure 3

Study area and hydrometric stations.

WQPs at the hydrometric stations are measured using instrumentation and are therefore highly reliable. Table 2 presents the statistical indicators of the WQPs: coefficient of skewness (CS), coefficient of variation (CV), SD, and average. The parameters are discharge (Q), calcium (Ca2+), chloride (Cl−), sodium adsorption ratio (SAR), bicarbonate (HCO3−), sum of anions (Sum.A), sum of cations (Sum.C), magnesium (Mg2+), sodium (Na+), sulfate (SO42−), pH, EC, and TDS. Figure 4 shows the correlation of the input data with the output data. According to this figure, Sum.A and Sum.C have the highest correlation with the output parameters (EC and TDS), which suggests that these inputs influence the outputs most.
Table 2

Statistical specifications of input and output data

Inputs: Q to pH. Outputs: EC and TDS. Units: Q in m3/s; EC in μmho/cm; TDS and ion concentrations in mg/l.

| Station | Statistic | Q | Ca2+ | Cl− | SAR | HCO3− | Sum.A | Sum.C | Mg2+ | Na+ | SO42− | pH | EC | TDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Armand | CS | 2.07 | 0.28 | 0.13 | 1.65 | 4.56 | 0.27 | 1.73 | 1.40 | 2.77 | 1.15 | −5.51 | 0.21 | 0.20 |
| Armand | CV | 0.66 | 0.33 | 0.32 | 0.05 | 0.39 | 0.30 | 0.09 | 0.33 | 1.04 | 0.27 | 0.40 | 0.22 | 0.22 |
| Armand | SD | 147.77 | 0.76 | 0.53 | 0.11 | 1.35 | 1.63 | 0.47 | 0.45 | 15.16 | 0.28 | 3.22 | 119.96 | 78.38 |
| Armand | Average | 221.51 | 2.30 | 1.61 | 2.00 | 3.65 | 5.41 | 0.47 | 1.33 | 13.14 | 1.15 | 7.97 | 534.97 | 347.36 |
| Ahvaz | CS | 2.52 | 1.86 | 1.20 | 1.88 | −0.91 | 1.11 | 1.88 | 0.74 | 0.73 | 1.20 | −8.00 | 1.08 | 1.01 |
| Ahvaz | CV | 0.92 | 0.26 | 0.27 | 0.04 | 0.22 | 0.21 | 0.18 | 0.24 | 0.34 | 0.27 | 0.35 | 0.38 | 0.40 |
| Ahvaz | SD | 691.85 | 1.31 | 1.95 | 0.33 | 0.61 | 2.67 | 2.44 | 0.47 | 2.84 | 1.05 | 2.83 | 556.33 | 364.78 |
| Ahvaz | Average | 746.43 | 4.85 | 7.06 | 7.00 | 2.69 | 12.57 | 13.00 | 1.95 | 8.16 | 3.81 | 7.98 | 1,430.26 | 902.13 |
| Gotvand | CS | 3.43 | 0.90 | 1.22 | 1.68 | −0.23 | 0.69 | 0.45 | 5.19 | 1.55 | 1.67 | −8.66 | 1.66 | 1.55 |
| Gotvand | CV | 0.94 | 0.26 | 0.29 | 0.06 | 0.25 | 0.18 | 0.16 | 0.25 | 0.30 | 0.24 | 0.39 | 0.37 | 0.38 |
| Gotvand | SD | 464.71 | 0.78 | 1.33 | 0.25 | 0.64 | 1.58 | 1.35 | 0.26 | 1.59 | 0.38 | 3.14 | 355.15 | 224.91 |
| Gotvand | Average | 493.65 | 2.93 | 4.61 | 6.54 | 2.56 | 8.43 | 7.98 | 1.13 | 5.15 | 1.61 | 7.98 | 941.53 | 586.76 |
| Mollasani | CS | 2.74 | 1.46 | 2.05 | 1.97 | −0.34 | 1.38 | 1.36 | 1.11 | 1.96 | 1.18 | −0.46 | 1.44 | 1.16 |
| Mollasani | CV | 0.97 | 0.31 | 0.51 | 0.45 | 0.15 | 0.40 | 0.41 | 0.41 | 0.57 | 0.50 | 0.34 | 0.39 | 0.39 |
| Mollasani | SD | 648.65 | 1.35 | 2.86 | 0.68 | 0.43 | 2.40 | 2.50 | 0.93 | 3.91 | 1.91 | 2.27 | 529.90 | 336.98 |
| Mollasani | Average | 665.14 | 4.35 | 6.77 | 3.64 | 2.83 | 13.37 | 13.35 | 2.28 | 6.78 | 3.76 | 7.91 | 1,341.65 | 844.94 |
Figure 4

Correlation coefficient results between input and output parameters in different stations.


Steps to present a novel hybrid framework and WQPs estimation

This study offers a novel hybrid framework for estimating WQPs. The following steps are performed to prove the performance of the proposed framework and to estimate WQPs:

  • The first step consisted of two parts:

    • Part 1. The performance of the ANFIS and ANFIS–GBO algorithms was evaluated on three benchmark data sets (Housing, LSVT, and Servo) without data preprocessing.

    • Part 2. Benchmark data sets were preprocessed by DWT and then applied to ANFIS and ANFIS–GBO algorithms. This means that the ANFIS–DWT and ANFIS–DWT–GBO algorithms were tested in benchmark data sets.

  • The PCA method selected the best input combination of WQPs for the different algorithms at the four stations.

  • After the best input combination of WQPs was determined, EC and TDS were estimated by the novel framework and the other algorithms in the same two parts as in the first step.

Figure 5 shows the steps to present a novel hybrid framework and WQPs estimation.
Figure 5

Flowchart to prove the performance of a novel hybrid framework and estimate WQPs.


Benchmark data sets modeling

In this section, the performance of the novel framework was tested on three benchmark data sets (Housing, LSVT, and Servo) and compared with the other algorithms. Table 3 presents the evaluation criteria of the examined algorithms on the benchmark data sets. First, the performance of ANFIS and ANFIS–GBO was compared. Then, after applying the DWT to the input data, the performance of ANFIS–DWT and ANFIS–DWT–GBO was investigated. ANFIS and ANFIS–DWT failed in the testing period (for example, test R values of −0.06 and 0.36 on Housing), and ANFIS–GBO had moderate accuracy. In contrast, the ANFIS–DWT–GBO was significantly more accurate than the other algorithms. According to Figure 6, the ANFIS–DWT–GBO gave the lowest MAE and RRMSE and the highest R in the testing period for all benchmark data sets. Its testing MAE, RRMSE, and R were 3.17, 0.06, and 0.68 for Housing; 135.86, 0.02, and 0.98 for LSVT; and 0.42, 0.02, and 0.95 for Servo. The ANFIS–DWT–GBO decreased the MAE and RRMSE by 63–93% and 97–100%, respectively, and increased the R by 42–146% compared to other algorithms. ANFIS–GBO and ANFIS–DWT ranked second and third, respectively, after the ANFIS–DWT–GBO.
Table 3

Evaluation criteria in benchmark data sets modeling

| Data set | Algorithm | Train MAE | Train RRMSE | Train R | Test MAE | Test RRMSE | Test R |
|---|---|---|---|---|---|---|---|
| Housing | ANFIS | 1.61 | 0.24 | 0.96 | 10.07 | 2.35 | −0.06 |
| Housing | ANFIS–DWT | 1.70 | 0.33 | 0.96 | 8.47 | 1.54 | 0.36 |
| Housing | ANFIS–GBO | 3.37 | 0.00 | 0.84 | 4.17 | 0.36 | 0.63 |
| Housing | ANFIS–DWT–GBO | 3.13 | 0.03 | 0.85 | 3.17 | 0.06 | 0.68 |
| LSVT | ANFIS | 0.04 | 0.00 | 1.00 | 1,754 | 11.25 | −0.46 |
| LSVT | ANFIS–DWT | 0.00 | 0.00 | 1.00 | 2,748 | 4.14 | 0.11 |
| LSVT | ANFIS–GBO | 304.68 | 0.01 | 0.96 | 260.14 | 0.03 | 0.94 |
| LSVT | ANFIS–DWT–GBO | 65.36 | 0.00 | 1.00 | 135.86 | 0.02 | 0.98 |
| Servo | ANFIS | 0.20 | 0.18 | 0.98 | 0.58 | 1.63 | 0.55 |
| Servo | ANFIS–DWT | 0.01 | 0.01 | 0.99 | 1.41 | 1.30 | 0.56 |
| Servo | ANFIS–GBO | 0.38 | 0.00 | 0.90 | 0.55 | 0.03 | 0.90 |
| Servo | ANFIS–DWT–GBO | 0.32 | 0.00 | 0.94 | 0.42 | 0.02 | 0.95 |
Figure 6

Comparison of the performance of algorithms in modeling benchmark data sets in the testing period.

Figure 7 shows the accuracy of modeling and data scattering by the best algorithm (ANFIS–DWT–GBO) and the worst algorithm (ANFIS). Based on Figure 7, there is a significant difference between the results of the ANFIS–DWT–GBO as the best algorithm and ANFIS as the worst algorithm. The high R-value in all benchmark data sets by the ANFIS–DWT–GBO indicates that the estimated values were close to the target data values. In contrast, the discrepancy between the target data and the estimated values indicates the low accuracy of the ANFIS. According to Figure 7, in modeling by the best algorithm, the distance between the points and the fitted line was minimal, while this distance between points and line was considerable in modeling with the worst algorithm.
Figure 7

Comparison of the performance of the worst algorithm (ANFIS) and the best algorithm (ANFIS–DWT–GBO) in benchmark data sets modeling.


Input selection by PCA

As shown in Figure 8, PCA was executed on 11 parameters for the four stations to identify the best input combination. The components with the highest eigenvalues are the most significant, and eigenvalues of 1.00 or greater are considered significant. Accordingly, the first three PCs were chosen as the inputs of the algorithms at all stations.
Figure 8

Percentage of eigenvalue variation in different stations.


Tables 4 and 5 show the results of the PCA method. At the Armand station, the first three PCs explained more than 76% of the input data variance: the first principal component (PC1) accounted for 37.35%, the second (PC2) for 26.20%, and the third (PC3) for 12.58% of the total variance. PC1 included Ca2+ and SAR as the most important parameters, PC2 included Sum.A, Sum.C, Mg2+, and Na+, and PC3 included Q, Cl−, and HCO3−. At the Ahvaz station, PC1 accounted for 63.92% of the total variance and included Ca2+, SAR, Sum.C, and Mg2+ as the most important parameters. PC2, which included Cl−, Sum.A, and SO42−, accounted for 11.65% of the total variance in both periods, and PC3, which included Q, HCO3−, and pH, accounted for 10.87% of the total variance in both periods. At the Gotvand station, PC1 (54.67% of the variance) was contributed mainly by Ca2+, SAR, and Sum.C. PC2 (14.02% of the variance) included Sum.A and SO42−, and PC3 (11.13% of the variance) included Q, Cl−, and pH. At the Mollasani station, the first three PCs explained 82.30% of the input data variance: PC1 accounted for 60.98%, PC2 for 11.73%, and PC3 for 9.59% of the total variance. PC1 included Ca2+, SAR, Sum.C, and Mg2+ as the most important parameters, PC2 included Sum.A, Na+, and SO42−, and PC3 included Q, Cl−, HCO3−, and pH.

Table 4

Loadings of 11 WQPs on the first three PCs

| Parameter | Armand PC1 | Armand PC2 | Armand PC3 | Ahvaz PC1 | Ahvaz PC2 | Ahvaz PC3 | Gotvand PC1 | Gotvand PC2 | Gotvand PC3 | Mollasani PC1 | Mollasani PC2 | Mollasani PC3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Q | −0.23 | 0.35 | 0.48 | −0.18 | 0.34 | 0.38 | −0.10 | 0.35 | 0.41 | −0.15 | 0.35 | 0.38 |
| Ca2+ | 0.74 | 0.11 | 0.04 | 0.57 | −0.19 | −0.01 | 0.70 | 0.23 | 0.01 | 0.45 | 0.06 | 0.01 |
| Cl− | 0.01 | −0.29 | −0.40 | −0.02 | −0.42 | −0.02 | −0.01 | −0.25 | 0.46 | 0.01 | −0.11 | −0.34 |
| SAR | 0.54 | −0.16 | 0.10 | 0.42 | −0.07 | 0.04 | 0.52 | −0.33 | 0.02 | −0.35 | 0.18 | 0.00 |
| HCO3− | 0.00 | −0.40 | 0.74 | −0.02 | −0.20 | 0.68 | 0.00 | −0.01 | −0.02 | −0.01 | −0.14 | 0.62 |
| Sum.A | 0.05 | −0.43 | −0.16 | 0.02 | 0.43 | −0.05 | −0.02 | 0.31 | −0.14 | 0.03 | 0.35 | −0.08 |
| Sum.C | 0.31 | −0.44 | 0.15 | 0.66 | −0.15 | 0.07 | −0.41 | 0.02 | −0.03 | −0.64 | 0.03 | −0.06 |
| Mg2+ | −0.12 | −0.24 | −0.02 | 0.52 | 0.13 | 0.04 | 0.14 | −0.01 | 0.02 | 0.49 | 0.16 | 0.07 |
| Na+ | 0.03 | −0.11 | −0.04 | −0.08 | 0.21 | 0.02 | −0.01 | −0.23 | 0.03 | −0.00 | 0.38 | 0.03 |
| SO42− | 0.03 | 0.03 | −0.09 | −0.02 | −0.44 | −0.02 | 0.01 | 0.71 | 0.00 | 0.02 | 0.71 | −0.17 |
| pH | 0.00 | 0.01 | −0.01 | 0.00 | 0.00 | 0.62 | 0.00 | 0.02 | 0.77 | 0.00 | −0.09 | −0.56 |
Table 5

Eigenvalues and percentage of variance in different stations

| Station | PC | Eigenvalue | Variance (%) | Cumulative eigenvalue | Cumulative variance (%) |
|---|---|---|---|---|---|
| Armand | PC1 | 4.10 | 37.35 | 4.10 | 37.35 |
| Armand | PC2 | 2.88 | 26.20 | 6.98 | 63.55 |
| Armand | PC3 | 1.38 | 12.58 | 8.36 | 76.13 |
| Ahvaz | PC1 | 7.03 | 63.92 | 7.03 | 63.92 |
| Ahvaz | PC2 | 1.28 | 11.65 | 8.31 | 75.57 |
| Ahvaz | PC3 | 1.19 | 10.87 | 9.50 | 86.44 |
| Gotvand | PC1 | 6.01 | 54.67 | 6.01 | 54.67 |
| Gotvand | PC2 | 1.54 | 14.02 | 7.55 | 68.69 |
| Gotvand | PC3 | 1.22 | 11.13 | 8.77 | 79.82 |
| Mollasani | PC1 | 6.70 | 60.98 | 6.70 | 60.98 |
| Mollasani | PC2 | 1.29 | 11.73 | 7.99 | 72.71 |
| Mollasani | PC3 | 1.05 | 9.59 | 9.04 | 82.30 |
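The link between the eigenvalues and the variance percentages in Table 5 can be checked with a few lines of arithmetic. The sketch assumes that the PCA was run on the correlation matrix of the 11 standardized inputs, so that the eigenvalues sum to 11; the study does not state this explicitly, and the function name is illustrative.

```python
N_VARIABLES = 11  # number of input WQPs entered into the PCA

# Eigenvalues of the first three PCs, copied from Table 5
eigenvalues = {
    "Armand": [4.10, 2.88, 1.38],
    "Ahvaz": [7.03, 1.28, 1.19],
    "Gotvand": [6.01, 1.54, 1.22],
    "Mollasani": [6.70, 1.29, 1.05],
}

def explained_variance(eigs, n_variables):
    """Percent variance per component and running total (assumes trace = n_variables)."""
    percents = [100.0 * e / n_variables for e in eigs]
    cumulative = []
    running = 0.0
    for p in percents:
        running += p
        cumulative.append(running)
    return percents, cumulative

for station, eigs in eigenvalues.items():
    percents, cumulative = explained_variance(eigs, N_VARIABLES)
    # Retention rule from the text: keep components with eigenvalue >= 1.00
    retained = [e for e in eigs if e >= 1.0]
    print(station, [round(p, 2) for p in percents],
          round(cumulative[-1], 2), len(retained))
```

Under this assumption the recomputed cumulative variances agree with Table 5 to within about 0.15 percentage points, and all three listed components at every station meet the eigenvalue threshold of 1.00.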

EC and TDS estimating

After verifying the performance of the proposed framework on the benchmark data sets, we estimated the EC and TDS parameters. As in the benchmark data set modeling, the WQPs were first estimated with ANFIS and ANFIS–GBO without preprocessing the input data. Next, the same algorithms were run on data preprocessed by the DWT to test the accuracy of the proposed framework in estimating WQPs. Table 6 presents the results of the ML algorithms in EC estimation at the four stations. The MAE, RRMSE, and R of the GBO-based algorithms (ANFIS–GBO and ANFIS–DWT–GBO) differed from those of ANFIS and ANFIS–DWT. ANFIS and ANFIS–DWT were the weakest algorithms in the testing period, and ANFIS–DWT–GBO was the best. ANFIS–GBO was more accurate than ANFIS and ANFIS–DWT, and ANFIS–DWT–GBO in turn performed better than ANFIS–GBO. For the ANFIS–DWT–GBO in the testing period, the MAE, RRMSE, and R were 22.41, 0.09, and 0.96 at the Armand station; 73.39, 0.04, and 0.99 at the Ahvaz station; 61.70, 0.04, and 0.98 at the Gotvand station; and 66.07, 0.03, and 0.99 at the Mollasani station.

Table 6

Results of algorithms for estimating EC in different stations

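| Station | Algorithm | Train MAE | Train RRMSE | Train R | Test MAE | Test RRMSE | Test R |
|---|---|---|---|---|---|---|---|
| Armand | ANFIS | 19.37 | 0.23 | 0.97 | 26.07 | 0.30 | 0.95 |
| Armand | ANFIS–DWT | 8.78 | 0.12 | 0.99 | 23.17 | 0.27 | 0.96 |
| Armand | ANFIS–GBO | 23.15 | 0.00 | 0.95 | 22.74 | 0.10 | 0.96 |
| Armand | ANFIS–DWT–GBO | 21.17 | 0.00 | 0.96 | 22.41 | 0.09 | 0.96 |
| Ahvaz | ANFIS | 43.83 | 0.15 | 0.98 | 98.58 | 0.21 | 0.96 |
| Ahvaz | ANFIS–DWT | 33.14 | 0.09 | 0.99 | 81.63 | 0.25 | 0.97 |
| Ahvaz | ANFIS–GBO | 44.72 | 0.00 | 0.98 | 78.21 | 0.04 | 0.98 |
| Ahvaz | ANFIS–DWT–GBO | 44.91 | 0.00 | 0.98 | 73.39 | 0.04 | 0.99 |
| Gotvand | ANFIS | 30.59 | 0.14 | 0.98 | 75.16 | 0.28 | 0.96 |
| Gotvand | ANFIS–DWT | 17.48 | 0.06 | 0.99 | 42.35 | 0.22 | 0.97 |
| Gotvand | ANFIS–GBO | 32.46 | 0.00 | 0.98 | 65.58 | 0.09 | 0.98 |
| Gotvand | ANFIS–DWT–GBO | 40.62 | 0.01 | 0.98 | 61.70 | 0.04 | 0.98 |
| Mollasani | ANFIS | 40.20 | 0.12 | 0.99 | 91.27 | 0.22 | 0.97 |
| Mollasani | ANFIS–DWT | 28.95 | 0.08 | 0.99 | 55.56 | 0.15 | 0.98 |
| Mollasani | ANFIS–GBO | 43.74 | 0.00 | 0.99 | 68.25 | 0.04 | 0.98 |
| Mollasani | ANFIS–DWT–GBO | 43.62 | 0.00 | 0.99 | 66.07 | 0.03 | 0.99 |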

According to Table 7, the best TDS results at the four stations were obtained with ANFIS–DWT–GBO. In the testing period, its MAE, RRMSE, and R values were 14.68, 0.06, and 0.97 at the Armand station; 55.90, 0.02, and 0.98 at the Ahvaz station; 44.48, 0.00, and 0.98 at the Gotvand station; and 50.97, 0.00, and 0.98 at the Mollasani station.

Table 7

Results of algorithms for estimating TDS in different stations

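| Station | Algorithm | Train MAE | Train RRMSE | Train R | Test MAE | Test RRMSE | Test R |
|---|---|---|---|---|---|---|---|
| Armand | ANFIS | 13.43 | 0.25 | 0.96 | 34.38 | 0.63 | 0.84 |
| Armand | ANFIS–DWT | 7.69 | 0.14 | 0.98 | 15.78 | 0.35 | 0.93 |
| Armand | ANFIS–GBO | 20.22 | 0.02 | 0.98 | 18.55 | 0.17 | 0.96 |
| Armand | ANFIS–DWT–GBO | 15.23 | 0.00 | 0.99 | 14.68 | 0.06 | 0.97 |
| Ahvaz | ANFIS | 42.55 | 0.22 | 0.97 | 98.76 | 0.75 | 0.81 |
| Ahvaz | ANFIS–DWT | 30.61 | 0.11 | 0.99 | 78.50 | 0.45 | 0.90 |
| Ahvaz | ANFIS–GBO | 43.69 | 0.00 | 0.97 | 61.05 | 0.04 | 0.97 |
| Ahvaz | ANFIS–DWT–GBO | 43.89 | 0.00 | 0.97 | 55.90 | 0.02 | 0.98 |
| Gotvand | ANFIS | 32.14 | 0.24 | 0.96 | 85.90 | 1.06 | 0.79 |
| Gotvand | ANFIS–DWT | 24.04 | 0.14 | 0.98 | 34.58 | 0.21 | 0.97 |
| Gotvand | ANFIS–GBO | 34.46 | 0.00 | 0.96 | 41.62 | 0.03 | 0.97 |
| Gotvand | ANFIS–DWT–GBO | 33.85 | 0.00 | 0.96 | 44.48 | 0.00 | 0.98 |
| Mollasani | ANFIS | 42.28 | 0.18 | 0.98 | 71.10 | 0.29 | 0.95 |
| Mollasani | ANFIS–DWT | 26.33 | 0.10 | 0.99 | 57.48 | 0.22 | 0.97 |
| Mollasani | ANFIS–GBO | 45.63 | 0.00 | 0.97 | 60.03 | 0.05 | 0.97 |
| Mollasani | ANFIS–DWT–GBO | 44.82 | 0.00 | 0.97 | 50.97 | 0.00 | 0.98 |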

Figure 9 shows the evaluation criteria for EC and TDS estimation at the four stations for the different algorithms in the training and testing periods. As the figure shows, the highest accuracy in EC estimation was obtained at the Mollasani station. In the testing period, ANFIS–DWT–GBO reduced the MAE relative to ANFIS, ANFIS–DWT, and ANFIS–GBO by 27–57%, 7–29%, and 9–21%, respectively. Furthermore, it reduced the RRMSE relative to ANFIS, ANFIS–DWT, and ANFIS–GBO by 90–100%, 82–100%, and 50–100%, respectively. ANFIS–DWT–GBO also had the highest R-value among the algorithms. The highest accuracy in TDS estimation was obtained at the Gotvand station. According to Figure 9, the lowest RRMSE values and the highest R values in the testing period were observed for ANFIS–DWT–GBO. A comparison of the algorithms indicates that the novel ANFIS–DWT–GBO framework has the best EC and TDS estimation performance at all stations.
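
The percentage reductions quoted above are relative differences in an error measure. As a worked example, the sketch below applies that calculation to the testing-period MAE values for TDS at the Armand station from Table 7. The text does not state which parameter or stations the quoted ranges were computed from, so the choice of these rows is an assumption made only to illustrate the arithmetic; the function name `percent_reduction` is likewise illustrative.

```python
def percent_reduction(baseline, proposed):
    """Relative reduction of an error measure, in percent of the baseline."""
    return 100.0 * (baseline - proposed) / baseline

# Testing-period MAE for TDS at the Armand station (Table 7).
armand_tds_test_mae = {
    "ANFIS": 34.38,
    "ANFIS-DWT": 15.78,
    "ANFIS-GBO": 18.55,
}
proposed_mae = 14.68  # ANFIS-DWT-GBO, same station and period

reductions = {
    name: percent_reduction(value, proposed_mae)
    for name, value in armand_tds_test_mae.items()
}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}%")
# ANFIS: 57.3%
# ANFIS-DWT: 7.0%
# ANFIS-GBO: 20.9%
```

For this station the three values coincide with one end of each MAE range quoted in the text (57%, 7%, and 21%).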
Figure 9

Evaluation criteria of different algorithms for estimating EC and TDS in different stations.

As explained earlier in this study, the Q, Ca2+, , SAR, , Sum.A, Sum.C, Mg2+, Na+, , and pH parameters were used as inputs. Because of the breadth of the data, the PCA method was used to identify the principal components of the WQPs that have the greatest impact on the output parameters. The EC and TDS parameters were then estimated separately at the four stations. Figures 10 and 11 show the observed EC and TDS data and the values estimated by the new framework at the different stations, together with scatter plots for the testing period with the best algorithm (ANFIS–DWT–GBO). According to Figures 10 and 11, ANFIS–DWT–GBO performed very well in estimating EC and TDS values. In most cases, the difference between the estimated and observed values was very small, and the high R-values reflect the high accuracy of the novel hybrid framework in estimating WQPs. Most of the points in the scatter plots lie close to the bisector (1:1) line, which again indicates high accuracy. The maximum and minimum estimated values were also very close to the observed maximum and minimum.
Figure 10

Performance of the novel framework (ANFIS–DWT–GBO) for estimating EC in different stations.

Figure 11

Performance of the novel framework (ANFIS–DWT–GBO) for estimating TDS in different stations.


Based on the modeling of the three benchmark data sets and the estimation of EC and TDS values in the Karun river, the studied algorithms ranked, from most to least accurate, as ANFIS–DWT–GBO, ANFIS–GBO, ANFIS–DWT, and ANFIS. The improved accuracy of ANFIS–DWT–GBO compared to the other algorithms can be explained by the use of the GBO optimization algorithm to determine the important ANFIS parameters and of DWT to process the input data. The simultaneous use of DWT and the optimization algorithm significantly increased the accuracy of ANFIS. By finding the optimal ANFIS parameter values, GBO allowed the various parameters to be estimated with maximum accuracy. The superiority of ANFIS–DWT and ANFIS–GBO over ANFIS also demonstrated the efficiency of DWT and of the optimization algorithm in improving modeling accuracy. The GBO has operators that find the membership function parameters more accurately than the standalone ANFIS. Its positive impact on ANFIS accuracy was due to the use of Newton's approach to reach better positions in the search space. In addition, the GSR and LEO operators prevent GBO from becoming trapped in local optima and help it find the optimal values with high accuracy and speed. Finally, DWT helps extract hidden information from the input data by providing frequency and time information simultaneously.
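
To illustrate how a DWT exposes time and frequency information at once, the following minimal sketch performs a single-level decomposition of a short series into approximation (low-frequency) and detail (high-frequency) coefficients and then reconstructs the series. The Haar wavelet, the single decomposition level, and the sample values are assumptions chosen for simplicity; this section does not state which mother wavelet or level the study used.

```python
import numpy as np

def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(signal, dtype=float)
    if len(x) % 2 != 0:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # smoothed, low-frequency part
    detail = (even - odd) / np.sqrt(2.0)  # local changes, high-frequency part
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original series."""
    approx = np.asarray(approx, dtype=float)
    detail = np.asarray(detail, dtype=float)
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

# Illustrative series only, not data from the study.
series = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
approx, detail = haar_dwt(series)
rebuilt = haar_idwt(approx, detail)

print("approximation:", np.round(approx, 3))
print("detail:       ", np.round(detail, 3))
print("perfect reconstruction:", bool(np.allclose(rebuilt, series)))
```

Each coefficient is tied to a specific pair of time steps, so the detail coefficients show where rapid changes occur, while the approximation keeps the slower trend. In a hybrid setup of the kind described here, such sub-series would be passed to the estimation algorithm in place of, or alongside, the raw inputs.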

The superiority of hybrid algorithms over classical algorithms, and the improvement of the accuracy of simulation algorithms by optimization algorithms, have been shown in other studies (Azad et al. 2018; Milan et al. 2021; Song et al. 2021a, 2021b; Wu et al. 2022). The combined effect of DWT and optimization algorithms on increasing the accuracy of simulation algorithms has also been reported by Montaseri et al. (2018) and Anaraki et al. (2021). In addition, compared with the results of Kadkhodazadeh & Farzin (2021), ANFIS–DWT–GBO performed better than LSSVM–GBO. This result could be due to DWT and to the use of ANFIS instead of LSSVM. According to the results, ANFIS–DWT–GBO is capable of achieving high accuracy and outperforming competing algorithms.

The present study introduced a novel hybrid framework based on ANFIS, DWT, and an optimization algorithm. ANFIS–DWT–GBO was compared with ANFIS, ANFIS–DWT, and ANFIS–GBO on three benchmark data sets (Housing, LSVT, and Servo). The PCA method determined the best input combination for estimating EC and TDS at each station. After its performance had been demonstrated, ANFIS–DWT–GBO was applied to EC and TDS estimation at four stations on the Karun river: Armand, Ahvaz, Gotvand, and Mollasani. The main results are as follows:

  1. ANFIS–DWT–GBO had the highest accuracy on the three benchmark data sets. MAE, RRMSE, and R were 3.17, 0.06, and 0.68 for Housing; 135.86, 0.02, and 0.98 for LSVT; and 0.42, 0.02, and 0.95 for Servo.

  2. According to the results of the PCA method, the first three components accounted for the highest percentage of variance at all stations.

  3. Based on the evaluation criteria, ANFIS–DWT–GBO performed best among the algorithms in estimating EC and TDS at the four stations. The highest accuracy in estimating EC and TDS was obtained at the Mollasani and Gotvand stations, respectively. MAE, RRMSE, and R were 66.07, 0.03, and 0.99 at the Mollasani station (EC) and 44.48, 0.00, and 0.98 at the Gotvand station (TDS).

In this paper, the advantages of the GBO optimization algorithm (high estimation accuracy, fast convergence, and easy implementation) and of DWT were used to improve the performance of ANFIS. Given these results and the advantages of the proposed framework, the next step is to extend its application to other engineering problems. ANFIS–DWT–GBO also has high potential for modeling and predicting various water resources parameters. In this way, the study achieved its goal of presenting a new hybrid algorithm.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abba S. I., Hadi S. J., Sammen S. S., Salih S. Q., Abdulkadir R. A., Pham Q. B. & Yaseen Z. M. 2020 Evolutionary computational intelligence algorithm coupled with self-tuning predictive model for water quality index determination. Journal of Hydrology 587, 124974. https://doi.org/10.1016/j.jhydrol.2020.124974

Abualigah L., Diabat A., Mirjalili S., Elaziz M. A. & Gandomi A. H. 2021 The arithmetic optimization algorithm. Computer Methods in Applied Mechanics and Engineering 376, 113609. https://doi.org/10.1016/j.cma.2020.113609

Ahmadianfar I., Bozorg-Haddad O. & Chu X. 2020 Gradient-based optimizer: a new metaheuristic optimization algorithm. Information Sciences 540, 131–159. https://doi.org/10.1016/j.ins.2020.06.037

Alizadeh M. J., Kavianpour M. R., Kisi O. & Nourani V. 2017 A new approach for simulating and forecasting the rainfall-runoff process within the next two months. Journal of Hydrology 548, 588–597. https://doi.org/10.1016/j.jhydrol.2017.03.032

Anaraki M. V., Farzin S., Mousavi S.-F. & Karami H. 2021 Uncertainty analysis of climate change impacts on flood frequency by using hybrid machine learning methods. Water Resources Management 35, 199–223. https://doi.org/10.1007/s11269-020-02719-w

Antanasijević D., Pocajt V., Perić-Grujić A. & Ristić M. 2020 Multilevel split of high-dimensional water quality data using artificial neural networks for the prediction of dissolved oxygen in the Danube river. Neural Computing and Applications 32, 3957–3966. https://doi.org/10.1007/s00521-019-04079-y

Azad A., Karami H., Farzin S., Saeedian A., Kashi H. & Sayyahi F. 2018 Prediction of water quality parameters using ANFIS optimized by intelligence algorithms (case study: Gorganrood river). KSCE Journal of Civil Engineering 22, 2206–2213. https://doi.org/10.1007/s12205-017-1703-6

Banadkooki F. B., Ehteram M., Panahi F., Sammen S. S., Othman F. B. & EL-Shafie A. 2020 Estimation of total dissolved solids (TDS) using new hybrid machine learning models. Journal of Hydrology 587, 124989. https://doi.org/10.1016/j.jhydrol.2020.124989

Bayatvarkeshi M., Mohammadi K., Kisi O. & Fasihi R. 2020 A new wavelet conjunction approach for estimation of relative humidity: wavelet principal component analysis combined with ANN. Neural Computing and Applications 32, 4989–5000. https://doi.org/10.1007/s00521-018-3916-0

Bilali A. E., Taleb A., Nafii A., Alabjah B. & Mazigh N. 2021 Prediction of sodium adsorption ratio and chloride concentration in a coastal aquifer under seawater intrusion using machine learning models. Environmental Technology & Innovation 23, 101641. https://doi.org/10.1016/j.eti.2021.101641

Dawood T., Elwakil E., Novoa H. M. & Delgado J. F. G. 2021 Toward urban sustainability and clean potable water: prediction of water quality via artificial neural networks. Journal of Cleaner Production 291, 125266. https://doi.org/10.1016/j.jclepro.2020.125266

Farzin S. & Valikhan Anaraki M. 2021 Modeling and predicting suspended sediment load under climate change conditions: a new hybridization strategy. Journal of Water and Climate Change 12 (6), 2422–2443. https://doi.org/10.2166/wcc.2021.317

Farzin S., Valikhan Anaraki M., Naeimi M. & Zandifar S. 2022 Prediction of groundwater table and drought analysis; a new hybridization strategy based on bi-directional long short-term model and the Harris hawk optimization algorithm. Journal of Water and Climate Change 2022066. https://doi.org/10.2166/wcc.2022.066

Grossmann A. & Morlet J. 1984 Decomposition of Hardy function into square integrable wavelets of constant shape. Journal of Mathematical Analysis and Applications 5, 723–736. https://doi.org/10.1137/0515056

Guo H., Huang J. J., Chen B., Guo X. & Singh V. P. 2020 A machine learning-based strategy for estimating non-optically active water quality parameters using Sentinel-2 imagery. International Journal of Remote Sensing 42 (5), 1841–1866. https://doi.org/10.1080/01431161.2020.1846222

Henríquez P. A. & Ruz G. A. 2017 Extreme learning machine with a deterministic assignment of hidden weights in two parallel layers. Neurocomputing 226, 109–116. https://doi.org/10.1016/j.neucom.2016.11.040

Imani M., Hasan M. M., Bittencourt L. F., McClymont K. & Kapelan Z. 2021 A novel machine learning application: water quality resilience prediction model. Science of The Total Environment 768, 144459. https://doi.org/10.1016/j.scitotenv.2020.144459

Jahin H. S., Abuzaid A. S. & Abdellatif A. D. 2020 Using multivariate analysis to develop irrigation water quality index for surface water in Kafr El-Sheikh Governorate, Egypt. Environmental Technology & Innovation 17, 100532. https://doi.org/10.1016/j.eti.2019.100532

Jamei M., Ahmadianfar I., Chu X. & Yaseen Z. M. 2020 Prediction of surface water total dissolved solids using hybridized wavelet-multigene genetic programming: new approach. Journal of Hydrology 589, 125335. https://doi.org/10.1016/j.jhydrol.2020.125335

Kadkhodazadeh M. & Farzin S. 2021 A novel LSSVM model integrated with GBO algorithm to assessment of water quality parameters. Water Resources Management 35, 3939–3968. https://doi.org/10.1007/s11269-021-02913-4

Kadkhodazadeh M. & Farzin S. 2022 Introducing a novel hybrid machine learning model and developing its performance in estimating water quality parameters. Water Resources Management. https://doi.org/10.1007/s11269-022-03238-6

Karaboga D. & Kaya E. 2019 Adaptive network based fuzzy inference system (ANFIS) training approaches: a comprehensive survey. Artificial Intelligence Review 52, 2263–2293. https://doi.org/10.1007/s10462-017-9610-2

Kazakis N., Mattas C., Pavlou A., Patrikaki O. & Voudouris K. 2017 Multivariate statistical analysis for the assessment of groundwater quality under different hydrogeological regimes. Environmental Earth Sciences 76, 349. https://doi.org/10.1007/s12665-017-6665-y

Khullar S. & Singh N. 2022 Water quality assessment of a river using deep learning Bi-LSTM methodology: forecasting and validation. Environmental Science and Pollution Research 29, 12875–12889. https://doi.org/10.1007/s11356-021-13875-w

Kouadri S., Pande C. B., Panneerselvam B., Moharir K. N. & Elbeltagi A. 2022 Prediction of irrigation groundwater quality parameters using ANN, LSTM, and MLR models. Environmental Science and Pollution Research 29, 21067–21091. https://doi.org/10.1007/s11356-021-17084-3

Kumar A., Mishra S., Taxak A. K., Pandey R. & Yu Z. G. 2020 Nature rejuvenation: long-term (1989–2016) vs short-term memory approach based appraisal of water quality of the upper part of Ganga river, India. Environmental Technology & Innovation 20, 101164. https://doi.org/10.1016/j.eti.2020.101164

Mahmoudi N., Orouji H. & Fallah-Mehdipour E. 2016 Integration of shuffled frog leaping algorithm and support vector regression for prediction of water quality parameters. Water Resources Management 30, 2195–2211. https://doi.org/10.1007/s11269-016-1280-3

Malekzadeh M., Kardar S., Saeb K., Shabanlou S. & Taghavi L. 2019 A novel approach for prediction of monthly ground water level using a hybrid wavelet and non-tuned self-adaptive machine learning model. Water Resources Management 33, 1609–1628. https://doi.org/10.1007/s11269-019-2193-8

Mallat S. G. 1989 A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (7), 674–693. https://doi.org/10.1109/34.192463

Milan S. G., Roozbahani A., Azar N. A. & Javadi S. 2021 Development of adaptive neuro fuzzy inference system – Evolutionary algorithms hybrid models (ANFIS-EA) for prediction of optimal groundwater exploitation. Journal of Hydrology 598, 126258. https://doi.org/10.1016/j.jhydrol.2021.126258

Montaseri M., Zaman Zad Ghavidel S. & Sanikhani H. 2018 Water quality variations in different climates of Iran: toward modeling total dissolved solid using soft computing techniques. Stochastic Environmental Research and Risk Assessment 32, 2253–2273. https://doi.org/10.1007/s00477-018-1554-9

Morshed-Bozorgdel A., Kadkhodazadeh M., Valikhan Anaraki M. & Farzin S. 2022 A novel framework based on the stacking ensemble machine learning (SEML) method: application in wind speed modeling. Atmosphere 13 (5), 758. https://doi.org/10.3390/atmos13050758

Olyaie E., Abyaneh H. Z. & Mehr A. D. 2017 A comparative analysis among computational intelligence techniques for dissolved oxygen prediction in Delaware river. Geoscience Frontiers 8 (3), 517–527. https://doi.org/10.1016/j.gsf.2016.04.007

Parmar K. S., Makkhan S. J. S. & Kaushal S. 2019 Neuro-fuzzy-wavelet hybrid approach to estimate the future trends of river water quality. Neural Computing and Applications 31, 8463–8473. https://doi.org/10.1007/s00521-019-04560-8

Shah M. I., Abunama T., Javed M. F., Bux F., Aldrees A., Tariq M. A. U. R. & Mosavi A. 2021 Modeling surface water quality using the adaptive neuro-fuzzy inference system aided by input optimization. Sustainability 13 (8), 4576. https://doi.org/10.3390/su13084576

Song C., Yao L., Hua C. & Ni Q. 2021a A novel hybrid model for water quality prediction based on synchrosqueezed wavelet transform technique and improved long short-term memory. Journal of Hydrology 603, 126879. https://doi.org/10.1016/j.jhydrol.2021.126879

Song C., Yao L., Hua C. & Ni Q. 2021b Comprehensive water quality evaluation based on kernel extreme learning machine optimized with the sparrow search algorithm in Luoyang river basin, China. Environmental Earth Sciences 80, 521. https://doi.org/10.1007/s12665-021-09879-x

Tizro A. T., Fryar A. E., Vanaei A., Kazakis N., Voudouris K. & Mohammadi P. 2021 Estimation of total dissolved solids in Zayandehrood river using intelligent models and PCA. Sustainable Water Resources Management 7, 22. https://doi.org/10.1007/s40899-021-00497-w

Wang W. & Ding J. 2003 Wavelet network model and its application to the prediction of hydrology. Journal of Nature and Science 1, 67–71.

Wu Y., Sun L., Sun X. & Wang B. 2022 A hybrid XGBoost-ISSA-LSTM model for accurate short-term and long-term dissolved oxygen prediction in ponds. Environmental Science and Pollution Research 29, 18142–18159. https://doi.org/10.1007/s11356-021-17020-5

Yan J., Liu J., Yu Y. & Xu H. 2021 Water quality prediction in the Luan river based on 1-DRCNN and BiGRU hybrid neural network model. Water 13 (9), 1273. https://doi.org/10.3390/w13091273

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).