ABSTRACT
Accurate prediction of streamflow is crucial for managing water resources. Machine learning approaches have gained popularity for their ability to handle noisy, non-linear data and to develop models that detect relationships in the data they are given. This study compared the performance of three machine learning algorithms (extreme learning machine (ELM), random forest (RF), and gene expression programming (GEP)) and their hybrid versions in predicting the monthly streamflow of the Leaf River catchment. The models were tested under three scenarios, and the most accurate scenario was selected for implementing the hybrid models. The results of all models were examined with a new evaluation index called the general index (GI), which is calculated from three statistical error indices. The GEP model achieved the best result across all scenarios, with GI = 11.268 in the M3 scenario, followed by the ELM algorithm, whose best performance was GI = 12.811 in the M2 scenario, while the RF model had the worst overall performance. Regarding the hybrid models, using the empirical mode decomposition (EMD) and principal components analysis (PCA) methods decreased the precision of the models, with GI values fluctuating around 31.
HIGHLIGHTS
This study was conducted to compare the performance of three machine learning algorithms (including extreme learning machine, random forest, and gene expression programming) and their hybrid versions in predicting the monthly streamflow.
In this study, a new index called general index is used to perform evaluations based on three other indices, root mean square error, mean absolute error, and normalized root mean square error.
LIST OF ACRONYMS
- RMSE: root mean square error
- MAE: mean absolute error
- LSSVM: least square support vector machine
- ENN: emotional neural network
- GP: genetic programming
- ACF: autocorrelation function
- PACF: partial autocorrelation function
- ELM: extreme learning machine
- GEP: gene expression programming
- GPR: Gaussian process regression
- ML: machine learning
- SVM: support vector machine
- M5T: M5 Tree
- LSTM: long short-term memory
- LWLR: local weighted linear regression
- ANN: artificial neural networks
- M5P: M5Prime
- RF: random forest
- GA: genetic algorithm
INTRODUCTION
Climate change and human activities are increasing the demand for water resources and the uncertainty in their supply (Chu & Huang 2020). The availability of water resources is essential for human survival and is an integral part of socio-economic protection (Chu & Huang 2020). Accurate flow forecasting is one of the most essential tasks in the planning, development, and optimal use of water resources, and it plays an important role in preserving them (Tongal & Booij 2018). Consequently, it is imperative to develop accurate and efficient methods of managing water resources, including methods for predicting streamflow (Tongal & Booij 2018). Over the past few decades, water resource management has encouraged the development of streamflow simulations for river basins of various scales, and many researchers have focused on this topic (Naghibi & Pourghasemi 2015). Hydrologists have concentrated on research aimed at enhancing the precision and accuracy of short-term streamflow forecasting, which is a difficult undertaking (Naghibi & Pourghasemi 2015). Simulating streamflow and analyzing the behavior of a hydrological system are essential functions of hydrological models (Remesan & Mathew 2015). An accurate and detailed prediction of streamflow is crucial for allocating water resources efficiently and using them sustainably (Remesan & Mathew 2015). Simulating streamflow is very difficult because it behaves in a non-linear and non-stationary way and depends on factors such as weather, land type, and surface vegetation (Young et al. 2017). In addition to providing valuable information for reducing the impact of natural disasters such as floods and droughts, streamflow prediction can facilitate the safe and economical operation of reservoirs and help organize water use across different sectors to maximize the benefits of the water system (Young et al. 2017).
The accuracy of complicated hydrological models has been evaluated in a variety of ways using many models, and much research is currently being conducted to improve those approaches (Sahoo et al. 2017). The number of watershed hydrological models based on machine learning (ML) approaches has grown in recent years (Sahoo et al. 2017). Developing algorithms that can accurately predict streamflow remains a significant problem in catchment hydrology (Sahoo et al. 2017). ML aims to produce models that can handle noisy and non-linear data, detect relationships within the data, and improve their performance as the number of training samples grows (Hastie et al. 2009). A number of ML models have been developed and tested for streamflow analysis: artificial neural network (ANN) (Zhou et al. 2018), extreme learning machine (ELM) (Huang et al. 2018), random forest (RF) (Li et al. 2019), least square support vector machine (LSSVM), gene expression programming (GEP) (Mehr 2018), emotional neural network (ENN) (Yaseen et al. 2020), and M5 model tree (M5T)/M5Prime (M5P) (Khosravi et al. 2021; Tiwari et al. 2021). Essam et al. (2022) employed ML algorithms to predict the streamflow of 11 rivers throughout Peninsular Malaysia using ANNs, long short-term memory (LSTM), and support vector machines (SVMs). They concluded that the ANN model produced the best results among the methods used. Abda et al. (2022) estimated daily streamflow in the Oued Sebaou Watershed using local weighted linear regression (LWLR), ANNs, and RF. They showed that the RF model performed better than the other ML models used in the study, namely the ANN and LWLR models.
Hybrid methods have been used in many research studies to date, including principal components analysis (PCA) (Fan et al. 2017) and rotated principal components analysis (RPCA) (Meng et al. 2019; Scherl et al. 2020). In one study on the Wei River in China, the monthly flow was predicted using the ANN, SVM, WA–SVM, EMD–SVM, and M-EMD–SVM methods, and the EMD–SVM method produced better results than the other approaches. Noori et al. (2011) used the SVM method together with PCA, GT, and FS for monthly flow forecasting at the Alaviyan Dam in Iran and compared the results. Their research showed that the PCA–SVM method had the highest accuracy and the lowest risk compared with the other methods. Chamani & Roushangar (2020) modeled the daily and monthly discharge of the Arkansas River in the USA and compared the ELM, GPR, and CEEMD methods. Using the CEEMD method improved the accuracy and performance of the models.
In this paper, three ML algorithms (GEP, RF, and ELM) were used to model and predict the monthly streamflow of the Leaf River, United States of America, and the results were compared using three criteria. The main focus of this research was to investigate the efficiency and precision of these three algorithms with three forcing inputs: streamflow rate (Q), precipitation (R), and evaporation (E). Given the advanced capabilities and performance of current models, it has become conventional to use hybrid models and functions linked to them. Combinations may take different forms, such as optimization, modification of the input data, and modification of the output data. In this article, the EMD and PCA algorithms, which act on the input data, are used. After the input data were decomposed into intrinsic mode functions (IMFs) using the EMD algorithm, PCA was used to reduce the resulting increase in the amount of input data.
In general, this research is a comparison between single methods and hybrid methods. Choosing the best model from a variety of options is one of the most challenging tasks for hydrologists. Different indices are available for evaluating different aspects, including model performance, model accuracy, and model error. In this study, a new index called the general index (GI) is used to perform evaluations based on three other indices: RMSE, MAE, and NRMSE. Because these three indicators behave similarly, with lower values indicating higher modeling accuracy, the GI is calculated as their weighted average. Employing this criterion makes the process of selecting the best model simpler, more accurate, and more reliable. Research in other fields can utilize this index, which can be extended in subsequent studies. In this research, the flow, precipitation, and evaporation data of the study area were first collected; data preparation, model selection, and modeling were then carried out; and after the evaluation indices were calculated, the most efficient model was determined from the results. The remaining sections describe each of these steps in detail.
MATERIALS AND METHODS
Study area and data
(a) Location of the study area and (b) elevational situation of the study area.
Methods
ML was first proposed in the middle of the 20th century (Lange & Sippel 2020). The topic of artificial intelligence, which originated from ML, was discussed by researchers a little later (Lange & Sippel 2020). ML is generally based on the principle that the best output is produced by analyzing the relationships within the input data (Hastie et al. 2009). Three ML algorithms were used to predict the monthly streamflow: RF, GEP, and ELM.
Random forest
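In RF regression, the forest prediction is the average of the individual tree predictions, $Y = \frac{1}{N}\sum_{i=1}^{N} T_i(x)$, where Y is the predicted value, $T_i(x)$ is the prediction of tree i, x is the input value, and N is the number of trees in the forest (Breiman 2001).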
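As a rough illustration of how an RF regression combines its trees, the sketch below averages the predictions of N trees for one input. The three threshold rules standing in for trained trees, and the name `forest_predict`, are hypothetical and are not taken from the study.

```python
from statistics import mean


def forest_predict(trees, x):
    """Average the predictions of all trees for input x: Y = (1/N) * sum_i T_i(x)."""
    return mean(tree(x) for tree in trees)


# Hypothetical stand-ins for trained trees: each maps a lagged flow value
# to a predicted flow with a single threshold rule.
trees = [
    lambda x: 10.0 if x < 20 else 40.0,
    lambda x: 12.0 if x < 25 else 38.0,
    lambda x: 8.0 if x < 15 else 42.0,
]

print(forest_predict(trees, 18.0))
```

In a real forest each tree is fitted on a bootstrap sample of the training data, which is what makes the averaged prediction more stable than any single tree.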
Gene expression programming
Extreme learning machine
ELM, an ML algorithm developed by Huang et al. (2004), is a learning scheme for single-hidden-layer feedforward neural networks, the simplest type of ANN. As a relatively new ML algorithm, ELM has received attention because of its fast learning speed and good generalization capabilities. In addition to being effective for classification and regression problems, ELM can also solve problems that involve high-dimensional input spaces (Huang et al. 2018). The ELM algorithm involves three steps: input weight generation, hidden node activation, and output weight computation. In the first step, the input weights are generated from a uniform distribution. In the second step, a non-linear activation function, such as the sigmoid function, is used to activate the hidden nodes, and the activations form the output matrix of the hidden layer. In the final step, a linear system solver is used to calculate the output weights that minimize the error between the actual output and the desired output (Huang et al. 2018).




The output weight vector β connects the hidden layer to the output nodes. W represents the output of the model, that is, the predicted discharge (Huang et al. 2018).
In this article, the optimized ELM algorithm is used. The process consists of two parts: repetition and optimization. Various articles report that the size of the weight matrix W is commonly set to around 10; in this optimization method, however, the dimensions of the weight matrix W are varied according to the number of input data, a model is built for each configuration, and, after all the outputs are analyzed, the result with the least error is selected.
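The three ELM steps described in this section, followed by a simple search over hidden-layer sizes, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the uniform range [-1, 1], the small ridge term added for numerical stability, the validation-based selection rule, and all function names are assumptions made for illustration.

```python
import math
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_output(X, weights, biases):
    """Hidden-layer output matrix H: one row per sample, one column per node."""
    return [[sigmoid(sum(w * xi for w, xi in zip(w_row, x)) + b)
             for w_row, b in zip(weights, biases)] for x in X]


def train_elm(X, y, n_hidden, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    n_inputs = len(X[0])
    # Step 1: input weights and biases from a uniform distribution (assumed range).
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    # Step 2: activate the hidden nodes to obtain H.
    H = hidden_output(X, weights, biases)
    # Step 3: output weights beta from (H^T H + ridge * I) beta = H^T y.
    HtH = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    Hty = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    beta = solve_linear_system(HtH, Hty)
    return weights, biases, beta


def predict_elm(model, X):
    weights, biases, beta = model
    return [sum(h * b for h, b in zip(row, beta))
            for row in hidden_output(X, weights, biases)]


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def select_hidden_size(X_tr, y_tr, X_val, y_val, sizes):
    """Assumed selection rule: keep the hidden size with the lowest validation RMSE."""
    return min(sizes, key=lambda n: rmse(y_val, predict_elm(train_elm(X_tr, y_tr, n), X_val)))
```

The sketch assumes inputs scaled to a modest range (for example [0, 1]) so that the sigmoid does not overflow. Only the output weights are fitted; the input weights stay at their random values, which is what gives ELM its fast training.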
Hybrid models
Researchers have increasingly focused on hybrid methods in recent years, and these methods are now widely used. They are generally composed of several techniques, such as statistical analysis, ML, and rule-based algorithms, which increases their accuracy and robustness. The most important advantage of hybrid models is that they combine the strengths of their constituent methods and reduce their weaknesses. Hybrid models make it possible to analyze the non-linearity and complexity of flow data and to make more accurate and reliable predictions of flow dynamics from these data. In this article, two methods, EMD and PCA, are combined with the ML models for flow prediction.
Empirical mode decomposition
A candidate component can be accepted as the first IMF when the sifting stopping criterion (commonly denoted SD) falls in the range of 0.2–0.3. Many tests have shown that when this value is in the range of 0.2–0.3, the obtained IMFs retain the correct physical meaning (Wang et al. 2018).
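The stopping check used during EMD sifting can be sketched as below. It assumes that the quantity referred to in the text is the standard-deviation criterion commonly used in EMD, computed between two successive sifting results; the function names, the default threshold of 0.3, and the small `eps` guard against division by zero are assumptions made for illustration.

```python
def sifting_sd(h_prev, h_curr, eps=1e-12):
    """SD between successive sifting results: sum((h_prev - h_curr)^2 / h_prev^2)."""
    return sum((p - c) ** 2 / (p ** 2 + eps) for p, c in zip(h_prev, h_curr))


def should_stop_sifting(h_prev, h_curr, threshold=0.3):
    """Stop sifting once SD drops below a threshold chosen in the 0.2-0.3 range."""
    return sifting_sd(h_prev, h_curr) < threshold


# Two hypothetical successive sifting results for a short series.
h_prev = [1.0, -2.0, 1.0, -2.0]
h_curr = [0.9, -1.9, 1.1, -2.1]
print(sifting_sd(h_prev, h_curr), should_stop_sifting(h_prev, h_curr))
```

A full EMD additionally needs envelope interpolation through the local extrema, which is omitted here.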
Principal component analysis
In the PCA equation, the PCs are expressed as the projection of the input variables, X, onto the eigenvectors (Fan et al. 2017).
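To make the projection onto eigenvectors concrete, the sketch below computes the first principal component of a small data matrix. It uses power iteration on the sample covariance matrix, a simple technique chosen here for brevity and not necessarily the solver used in the study; the function names and the toy data are hypothetical.

```python
import math


def center(X):
    """Subtract the column means from every row."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]


def covariance(Xc):
    """Sample covariance matrix of centered data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)] for i in range(d)]


def leading_eigenvector(C, iterations=200):
    """Power iteration: repeatedly apply C and renormalize."""
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def first_pc_scores(X):
    """Return the leading eigenvector and the projection of each sample onto it."""
    Xc = center(X)
    v = leading_eigenvector(covariance(Xc))
    return v, [sum(a * b for a, b in zip(row, v)) for row in Xc]


# Toy input: two perfectly correlated columns, so one PC carries all the variance.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
vector, scores = first_pc_scores(X)
print(vector, scores)
```

In the hybrid workflow described in this paper, the columns of X would be the IMFs produced by EMD, and only the leading PCs would be passed on to the ML models.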
Evaluation criteria
RMSE is one of the most common evaluation criteria; it is used to measure goodness of fit for high streamflows and has been applied in extensive research. MAE also measures goodness of fit, but it is more balanced than RMSE and is mostly used for moderate streamflows. NRMSE, the normalized form of RMSE, evaluates prediction accuracy, especially in regression problems, and has been used in many studies. Choosing the right index or indices to evaluate modeling performance and error has always been a concern for researchers, and an appropriate index makes choosing the best model, one of the most difficult stages of research, easier and more accurate. This article introduces a new criterion called GI, which is defined from the three other evaluation indicators. The criterion has so far been proposed and used only in this article, but it could be used in other studies where conditions allow. Because the evaluation criteria considered here behave similarly and lower values indicate better results, the GI combines the three indices as a weighted average. The GI has no specific acceptable range; because it is defined from RMSE, MAE, and NRMSE, lower values indicate better results. Using the proposed index makes it easier for researchers to determine the best model. Figure 2 describes the overall workflow of this research.
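The evaluation indices described above can be sketched as follows. Two points are assumptions rather than details from the text: NRMSE is normalized here by the mean of the observations, and the GI weights default to equal values because the weights are not stated. With these placeholders the function is not expected to reproduce the GI values reported in the tables.

```python
import math


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nrmse(obs, pred):
    # Assumption: RMSE normalized by the mean of the observed values.
    return rmse(obs, pred) / (sum(obs) / len(obs))


def general_index(obs, pred, weights=(1.0, 1.0, 1.0)):
    """Weighted average of RMSE, MAE, and NRMSE; lower values are better."""
    values = (rmse(obs, pred), mae(obs, pred), nrmse(obs, pred))
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical observed and predicted flows.
observed = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]
print(general_index(observed, predicted))
```

Because all three components decrease as predictions improve, any set of positive weights preserves the "lower is better" reading of the index.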
RESULTS AND DISCUSSION
The flow of the Leaf River in Mississippi has been used by many researchers to investigate the accuracy of new methods, algorithms, and numerical models. For the modeling in this article, daily flow, evaporation, and rainfall data for the river were collected. Table 1 shows the statistical parameters of the rainfall, evaporation, and flow data.
Table 1. Statistical components of observational data
Variable | Symbol | Unit | Mean | Standard deviation | Variance | Skewness | Maximum | Minimum |
---|---|---|---|---|---|---|---|---|
Flow | Q |  | 28.28 | 64.48 | 4157.51 | 7.6311 | 1313.91 | 1.56 |
Rainfall | R | mm | 3.710 | 9.690 | 93.899 | 4.5835 | 124.106 | 0 |
Evaporation | E |  | 2.9810 | 1.8704 | 3.4982 | 0.4563 | 8.4977 | 0.0062 |
To build the models and verify their accuracy and efficiency, the input data were divided into training and testing sets: 80% of the data entered the modeling process as training data and 20% as test data. Based on the ACF and PACF plots of the flow data (Figure 3), three scenarios, namely M1, M2, and M3, with 1, 2, and 3 time lags, respectively, were examined. The rainfall, evaporation, and flow data at the previous lags were used as inputs, and the flow at the following time step was used as the output.
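The construction of the lagged inputs and the 80/20 split can be sketched as below. The sketch assumes that scenario Mk uses lags 1 to k of all three variables and that the split is chronological (the first 80% of samples for training); neither detail is stated explicitly in the text, and the function names are hypothetical.

```python
def build_lagged_dataset(Q, R, E, n_lags):
    """Inputs: Q, R, E at lags 1..n_lags; output: Q at the current time step."""
    X, y = [], []
    for t in range(n_lags, len(Q)):
        row = []
        for lag in range(1, n_lags + 1):
            row += [Q[t - lag], R[t - lag], E[t - lag]]
        X.append(row)
        y.append(Q[t])
    return X, y


def chronological_split(X, y, train_fraction=0.8):
    """Keep the time order: the first part trains, the remainder tests."""
    cut = int(len(X) * train_fraction)
    return X[:cut], y[:cut], X[cut:], y[cut:]


# Hypothetical short series standing in for flow, rainfall, and evaporation.
Q = [float(i) for i in range(10)]
R = [10.0 + i for i in range(10)]
E = [20.0 + i for i in range(10)]

X, y = build_lagged_dataset(Q, R, E, n_lags=2)  # scenario M2
X_train, y_train, X_test, y_test = chronological_split(X, y)
print(len(X_train), len(X_test))
```

A chronological split avoids training on samples that come after the test period, which matters for autocorrelated series such as streamflow.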
M1 model
Table 2. The evaluation criteria for the M1 model
Model | RMSE | MAE | NRMSE | GI |
---|---|---|---|---|
ELM | 41.153 | 12.40 | 3.14 | 14.37 |
GEP | 41.227 | 13.143 | 3.145 | 14.46 |
RF | 58.778 | 14.216 | 4.484 | 20.21 |
As Table 2 shows, the lowest value of the GI corresponds to the ELM algorithm with a value of 14.37. Therefore, the ELM algorithm has the best performance in scenario M1. After that, there are the GEP and RF models with values of 14.46 and 20.21, respectively. As the values of the GI show, the two algorithms, ELM and GEP, worked almost identically and acceptably, while the RF method produced unrealistic results.
M2 model
Table 3. The evaluation criteria for the M2 model
Model | RMSE | MAE | NRMSE | GI |
---|---|---|---|---|
ELM | 36.932 | 10.354 | 2.818 | 12.82 |
GEP | 36.366 | 11.811 | 2.774 | 12.78 |
RF | 56.401 | 12.877 | 4.303 | 19.34 |
Unlike in scenario M1, in scenario M2 the GEP model slightly surpassed the ELM algorithm, as shown by GI values of 12.78 for the GEP model and 12.82 for the ELM algorithm. In this scenario, the GEP and ELM models performed satisfactorily and almost equally, whereas the RF method still produced predictions far from reality. In general, a comparison of the calculated GI values shows that the results of scenario M2 improved on those of scenario M1.
M3 model
Table 4. The evaluation criteria for the M3 model
Model | RMSE | MAE | NRMSE | GI |
---|---|---|---|---|
ELM | 45.208 | 11.514 | 3.449 | 15.59 |
GEP | 31.824 | 11.184 | 2.428 | 11.27 |
RF | 56.269 | 13.404 | 4.293 | 19.33 |
In this scenario, the GEP model performed the best with the GI = 11.27, followed by the ELM and RF models with values of 15.59 and 19.33, respectively.
The R2 and Taylor diagram of independent model. (a) R2-independent ELM. (b) R2-independent GEP. (c) R2-independent RF. (d) Taylor diagram of independent models.
Because the M3 scenario gave the most accurate performance, the EMD and PCA modeling was applied only to this scenario, and the results are presented below.
Hybrid models
Table 5. The evaluation criteria for the EMD and PCA
Model | RMSE | MAE | NRMSE | GI |
---|---|---|---|---|
Hybrid ELM | 85.20 | 40.323 | 6.50 | 31.49 |
Hybrid GEP | 78.439 | 37.807 | 5.984 | 29.09 |
Hybrid RF | 85.302 | 50.769 | 6.508 | 33.16 |
The comparison of predicted data and actual data for the EMD and PCA.
The large amount of observational data and its complexity, along with extreme fluctuations, make modeling and forecasting difficult and error-prone. The GEP algorithm, with its mathematical nature and the use of genetic algorithm and genetic programming, has a high ability to match, analyze, and optimize data modeling. This allows the algorithm to effectively model and predict complex data with extreme fluctuations. On the other hand, the RF algorithm, derived from a set of decision trees, has a weaker ability to understand complex mathematical relationships compared to the GEP algorithm. It struggles to identify complex non-linear relationships. The ELM approach, known for its simplicity and ease in training neural networks and ML, has a weakness in understanding complex and non-linear relationships. In the current research, the GEP algorithm is found to have the best performance in forecasting and modeling. All in all, it can be said that in this article, the use of the EMD and PCA algorithms had a negative effect on the accuracy of modeling results, and individual models performed better than hybrid models. The EMD and PCA were used as data preprocessing steps to enhance ML model performance, as EMD can analyze non-linear and non-stationary time series, while PCA reduces noise and redundancy through dimensionality reduction. However, the results indicated that the combined models performed worse than the individual ones. This may be due to the complexities introduced by decomposition and dimensionality reduction, which could have obscured important temporal patterns in the raw data. This outcome highlights the need to align data preprocessing with ML algorithms and underscores the limitations of these methods in hydrological applications.
Similar research has been carried out on this catchment. Sadeghi & Pourreza Bilondi (2015) used four optimization methods to analyze the uncertainty of the conceptual rainfall–runoff model (HYMOD) and obtained acceptable results. In another study, Pourreza Bilondi et al. (2015) used data from the Leaf River catchment with the support vector method to simulate daily runoff. The results of these studies are in line with this research and show the usefulness and efficiency of these algorithms in the basin.
CONCLUSION
The performance of the models in each scenario was evaluated using the GI, which is calculated from the RMSE, MAE, and NRMSE evaluation criteria. The GEP algorithm obtained GI values of 14.46, 12.78, and 11.27 in scenarios M1, M2, and M3, respectively; it performed best in M2 and M3 and was only marginally behind the ELM algorithm in M1. The ELM algorithm obtained GI values of 14.37, 12.82, and 15.59 in scenarios M1, M2, and M3, respectively, with its best result in scenario M2. The RF model, with GI values of 20.21, 19.34, and 19.33 in scenarios M1, M2, and M3, respectively, showed the weakest performance. Turning to the hybrid models, using the EMD and PCA to build the EMD-PCA-ELM, EMD-PCA-GEP, and EMD-PCA-RF models in the M3 scenario considerably decreased the accuracy of the predictions, as shown by GI values averaging around 31. These findings suggest that the GEP model, used on its own, is a valuable tool for predicting streamflow in the Leaf River, given that it showed superior streamflow forecasting performance compared with the other ML models. This study highlights the potential of ML approaches in hydrological modeling and emphasizes the importance of precise streamflow prediction for effective water resource management. A review of other research in this area showed that the results of this article align with the results and claims of other studies.
ETHICAL APPROVAL
This article does not contain any studies with human participants or animals performed by any of the authors.
INFORMED CONSENT
Informed consent was obtained from all individual participants included in the study.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.