The main goal of this study is to improve the precision and reliability of monthly runoff forecasts for the complex Navrood watershed in northern Iran. A defining feature of the study is the use of a waveform matching algorithm to select the mother wavelet, a critical component of wavelet analysis. To the authors' knowledge, this approach departs from established practice in hydrological research, where the mother wavelet is usually chosen according to its modeling performance. Model performance is assessed with the Technique for Order of Preference by Similarity to the Ideal Solution (TOPSIS), which ranks the models by their closeness to the ideal solution. The best results were obtained with the hybrid multiresolution analysis (MRA) methodology, which integrates the maximal overlap discrete wavelet transform (MODWT) with a random forest (RF) model and is referred to as MRA–RF. During testing, this model achieved a Nash–Sutcliffe efficiency (NSE) of 0.94, a mean absolute error (MAE) of 0.36 m³/s, a p-factor of 73.5%, and a d-factor of 37.9%.

  • Using appropriate lag times for the input data of ML methods in monthly runoff forecasting.

  • Combining ML methods with wavelet preprocessing methods, stepwise regression, and PCA.

  • Using the waveform matching algorithm to find the optimal mother wavelet.

  • Introducing RF–MODWTMRA as the top model and analyzing the modeling uncertainty.

Floods are among the major natural hazards, with widespread destructive impacts on people, the environment, and property. Heavy rains, insufficient drainage practices, climate change, and improper management of water resources are regarded as the most frequent causes of floods. Therefore, a high-precision evaluation of river runoff can play a significant role in flood management policies. Because ground-truth monitoring stations are scarce and conventional flood warning systems are costly, it is necessary to develop a machine learning (ML) model that accurately predicts river runoff using innovative and feasible approaches.

Wang et al. (2023) used a decomposition method to reduce input uncertainty and artificial neural networks (ANNs) and least-squares support vector machines (LSSVMs) to predict monthly stream discharge. Also, Ibrahim et al. (2022) reviewed studies that predicted flow discharge with ML methods. They reported that a number of studies such as Tikhamarine et al. (2019, 2020); Fathian et al. (2019); Niu et al. (2018); Zhang et al. (2018); and Choong et al. (2017) applied different ML methods for streamflow prediction.

In recent years, pre-processing techniques such as wavelet transform (WT) have been broadly used in hydrological studies to improve the performance of ML models (Aussem & Murtagh 1997; Labat et al. 2000; Cannas et al. 2006; Nourani et al. 2009, 2011, 2019; Liu et al. 2014; Farajpanah et al. 2020; Nalley et al. 2020; Adib et al. 2021; Azarpira & Shahabi 2021; Abebe et al. 2022; Ahmadi et al. 2022; Esmaeili-Gisavandani et al. 2022; Syed et al. 2023).

Several algorithms have been established to choose the optimal mother wavelet:

  • Gabor's method, which applies a relatively difficult approach to find the appropriate mother wavelet for each data set (Burrus et al. 1998).

  • The energy matching algorithm, in which the Fourier transform (FT) is applied to a raw signal to identify the frequency ranges with dominant signal energy. The signal is then decomposed using different mother wavelets, and the signal energy is calculated within the same frequency ranges. Based on Parseval's theorem, the more appropriate mother wavelet gives the closer match between these two calculated signal energies (Burrus et al. 1998).

  • The entropy matching algorithm, in which, similar to the energy matching algorithm, the high-entropy ranges are identified through the FT. Within these ranges, the wavelet whose entropy is most compatible with that of the FT is considered the most desirable mother wavelet (Passoni et al. 2005).

  • The waveform matching algorithm, in which the optimal mother wavelet is the one whose shape is most similar to the partially known source wavelet. Each wavelet applied to the same signal produces results different from the others; hence, visual shape matching is usually applied to obtain the most suitable mother wavelet.

The efficiency of different forms of mother wavelets was investigated for measuring the timing of multiunit bursts in surface electromyograms; db2 was selected as the wavelet most similar to the corresponding signal (Flanders 2002). Ahadi & Bakhtiar (2010) examined acoustic emission leakage signal signatures via the visual inspection method. They found that the Gaussian mother wavelet most closely resembled the real signal. The results also demonstrated that the spectrograms generated via proper wavelets could be beneficially applied to leak detection. Tang et al. (2010) used the Morlet wavelet for denoising oscillation signals in wind turbines. The results showed that the Morlet wavelet corresponded most closely to the mechanical impulse signal: the greater the similarity of the wavelet to the mechanical characteristics of the impulse response, the higher the coefficients for the related impulses. However, it was difficult to perform a visual shape matching of the signal to the mother wavelet.

Two criteria, the information extraction of the signal and the distribution error, were proposed to obtain a proper wavelet for the image correction process, and the results revealed the superiority of bior1.3 (Zhang et al. 2005). For partial discharge detection, db4 was chosen as the most appropriate wavelet because it had the maximum cross-correlation with the ultrahigh-frequency signal (Yang et al. 2004). Mao et al. (2021) showed that ANNs can simulate monthly runoff more accurately than long short-term memory (LSTM) neural networks, while LSTM performs better than ANNs for simulating daily runoff. Pal & Talukdar (2020) showed that hybrid wavelet ANNs can predict flow discharge better than ML methods such as support vector machines (SVMs) and random forest (RF).

To the authors' knowledge, only a few studies have examined the process of selecting a mother wavelet, and these are in fields other than hydrology (Ngui et al. 2013; Jang et al. 2021). In hydrology, no study has combined the waveform matching algorithm to select the appropriate mother wavelet, the principal component analysis (PCA) method to reduce the dimensions of the data, and uncertainty analysis together with the technique for order of preference by similarity to the ideal solution (TOPSIS) to select the best model. This study attempts to provide a framework based on ML models for the accurate estimation of watershed runoff. In most studies, the selection of the optimal mother wavelet depends on its performance in modeling and in recognizing the desired features. In this study, river runoff is predicted using mother wavelets coupled with the climatic variables and hybrid models comprising the adaptive neuro-fuzzy inference system (ANFIS), LSSVM, group method of data handling (GMDH), multivariate adaptive regression splines (MARS), and RF. The uncertainty of the best-performing models is then analyzed. The study aims to develop a preprocessing technique that selects an appropriate mother wavelet function for each input parameter of the modeling process.

Study area

Navrood is a forested, mountainous watershed with an area of 266 km², located in the Talysh Mountain range between the longitude 35°48 to 48°54 E and latitude 36°37 to 45°37 N in Talesh County, Gilan province, Iran. The maximum, minimum, and average elevations of the watershed are 3,025, 133, and 1,398 m, respectively. The main stream is 26 km long with an average slope of 7%; the average slope of the watershed is 45%, and its time of concentration is ∼4 h. According to the Köppen climate classification, this watershed falls in class C (temperate). The watershed contains two first-class hydrometric stations, Kharjgil (watershed outlet) and Khalian (watershed centroid) (Figure 1).
Figure 1

Stations and location of the Navrood watershed.


Input data to modeling

In this study, three weather stations, Nav, Khalian, and Kharjgil, provided monthly data sets including precipitation (PCP), maximum and minimum temperatures, evaporation, relative humidity (RH), sun hours (SH) in a month (hours/month), and runoff from 1989 to 2017. The spatial average of the data was obtained through Thiessen polygons. The runoff data for the Navrood River were extracted from the Kharjgil station. During the study period, the average annual PCP was 1,256 mm, the average maximum and minimum temperatures were 19.9 and 11 °C, the average annual evaporation was 751 mm, the average annual RH was 82.4%, the average SH per month was 104 h, and the average runoff was 4.4 m³/s.

To complete data at stations with gaps in their time series, this study employed the double mass method. For this purpose, data from the station that exhibits the highest correlation with the data of the considered station are utilized (typically, this station is the one closest to the considered station). It is worth noting that the amount of missing meteorological data (precipitation and temperature data) in this study was negligible (<1% of the data) and river flow data were complete.

One of the most important steps in the modeling process is to provide a suitable combination of input variables to the model. Therefore, first, the mutual correlation between input and output variables was calculated. The values of correlation with Q are 0.33, −0.34, −0.36, −0.34, 0.32, and −0.39 for RH, SH in a month, maximum and minimum temperatures (Tmax, Tmin), precipitation (PCP) and evaporation (Eva), respectively. These correlations are significant at the 0.01 level (two-tailed).

Therefore, the input variables have interdependent effects, and these effects are considered in the modeling process using correlation calculations, selection of an appropriate mother wavelet, stepwise regression, and PCA. These processes have helped to improve the accuracy of river flow forecasting and achieve better results.

Then, according to the autocorrelation function (ACF) and partial autocorrelation function (PACF) diagrams, a relatively high correlation (0.4) was found for the 1-month lag runoff. Because the ACF and PACF detect only linear relationships, the mutual information method was used to find non-linear relationships between the variables. Its results confirmed that the meteorological and river flow data considered here are suitable for forecasting Qt (see the Supplementary material). Finally, the runoff of the Navrood River can be considered as the following function:
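\[ Q_t = f\left(Q_{t-1},\ \mathrm{PCP}_t,\ \mathrm{Eva}_t,\ \mathrm{RH}_t,\ T_{\min,t},\ T_{\max,t},\ \mathrm{SH}_t\right) \qquad (1) \]
where Q is the flow discharge (m³/s), PCP is the precipitation (mm/month), Eva is the evaporation (mm/month), RH is the relative humidity (%), Tmin is the minimum temperature (°C), Tmax is the maximum temperature (°C), and SH is the sun hours in a month (hours/month). The subscripts t and t−1 denote the current and the previous month, respectively.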
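As a minimal sketch of how the lag analysis and the input structure of Equation (1) could be set up, the following Python fragment computes a sample autocorrelation and pairs each Qt with Qt−1 and the same-month meteorological values. The function names, the dictionary layout, and the toy numbers are illustrative assumptions and are not taken from the study's MATLAB code.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, n))
    return numer / denom


def build_samples(q, met):
    """Pair each Q_t with Q_{t-1} and the same-month meteorological values.

    q   : list of monthly runoff values
    met : dict mapping a variable name (e.g. "PCP") to a list aligned with q
    Returns the column names, the input rows, and the targets.
    """
    names = sorted(met)  # fixed, reproducible column order
    inputs = [[q[t - 1]] + [met[name][t] for name in names]
              for t in range(1, len(q))]
    targets = q[1:]
    return names, inputs, targets


# Toy values for illustration only (not data from the Navrood watershed).
q = [1.0, 2.0, 3.0, 4.0, 5.0]
met = {"PCP": [10, 20, 30, 40, 50], "Eva": [5, 4, 3, 2, 1]}
names, inputs, targets = build_samples(q, met)
lag1 = autocorrelation(q, 1)
```

Each input row holds Qt−1 first, followed by the meteorological variables of month t, so the first month of the record is used only as a lagged predictor.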

The correlation between Qt and Qt−1 and the correlation between Qt and meteorological data such as precipitation and temperature are nearly equal. Therefore, to determine Qt, all of these variables should be considered. This watershed has a small size, such that in winter, there is a high correlation between precipitation and river flow in the same month, and in spring, when river flow is mostly due to snowmelt, a high correlation between temperature and river flow is observed in the same month. Due to the small size of the watershed, the correlation between river flow in two consecutive months decreases.

The Navrood watershed is small, and the correlation between flow discharge and the meteorological parameters (PCP, Eva, RH, SH, Tmin, and Tmax) of previous months is low. Therefore, the flow discharge in each month is a function of the meteorological parameters in the same month and the flow discharge in the previous month.

At first, the preprocessing wavelet technique was employed to eliminate the deterministic trend in the time series. In recent years, these transformations have been widely used in hydrology (Sang 2013; Nourani et al. 2014; Farajpanah et al. 2020).

Waveform matching algorithm

Waveform matching is a method to select a mother wavelet that closely resembles the shape of the signal being analyzed. This leads to more accurate analysis and better results.

The waveform matching algorithm works based on the following steps:

  • Identify the signal you want to analyze.

  • Pick a set of common wavelets.

  • Compare the shapes of these wavelets to the signal's shape using visual inspection and calculations like correlation.

  • Choose the wavelet that most closely matches the signal's shape.

  • Analyze the signal using the chosen wavelet.
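The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure implemented in the study: the three candidate wavelets (Haar, Mexican hat, real Morlet) are chosen only because they have closed forms, the similarity score is the largest absolute Pearson correlation between the sampled wavelet and any equally long window of the signal, and all function names are hypothetical.

```python
import math


def haar(t):
    """Haar wavelet supported on [-1, 1)."""
    if -1.0 <= t < 0.0:
        return 1.0
    if 0.0 <= t < 1.0:
        return -1.0
    return 0.0


def ricker(t):
    """Mexican-hat (Ricker) wavelet, up to a constant factor."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)


def morlet(t):
    """Real part of a Morlet wavelet with centre frequency 5."""
    return math.cos(5.0 * t) * math.exp(-t * t / 2.0)


def sample(fn, width, n):
    """Sample fn at n evenly spaced points on [-width, width]."""
    return [fn(-width + 2.0 * width * i / (n - 1)) for i in range(n)]


def pearson(a, b):
    """Pearson correlation; returns 0 when either input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        return 0.0
    return cov / math.sqrt(va * vb)


def best_match(signal, candidates, width=4.0, n=33):
    """Score each candidate by its best |correlation| over sliding windows."""
    scores = {}
    for name, fn in candidates.items():
        template = sample(fn, width, n)
        best = 0.0
        for start in range(len(signal) - n + 1):
            r = abs(pearson(signal[start:start + n], template))
            best = max(best, r)
        scores[name] = best
    winner = max(scores, key=scores.get)
    return winner, scores


# Synthetic signal: a scaled, shifted Mexican-hat pulse on a flat baseline.
pulse = sample(ricker, 4.0, 33)
signal = [0.0] * 10 + [1.0 + 2.0 * v for v in pulse] + [0.0] * 10
candidates = {"haar": haar, "ricker": ricker, "morlet": morlet}
winner, scores = best_match(signal, candidates)
```

Because the Pearson correlation ignores offset and scale, the score reflects shape only, which is the property the waveform matching idea relies on. A fuller implementation would also compare several dilations of each wavelet.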

Waveform matching offers several advantages:

  • Increased accuracy due to a better match between the signal and the wavelet.

  • Reduced errors in the analysis.

  • More applicable to signals with some prior knowledge.

Traditionally, choosing a mother wavelet relied on experience or using well-known wavelets by default. Waveform matching provides a more data-driven approach for selecting the optimal wavelet.

This study used waveform matching to select the best wavelet for analyzing river flow data. Waveform matching showed better performance compared with other methods due to:

  • Higher accuracy in matching the river flow signal.

  • Better results in modeling monthly river flow.

  • Simplicity and ease of implementation.

  • Generalizability for various types of signals.

The study also highlights that different input parameters might require different mother wavelets due to their unique characteristics. Selecting an optimal wavelet for each parameter leads to:

  • Better capturing of the unique characteristics of each parameter.

  • More accurate extraction of important features from the data.

  • Increased accuracy in modeling and prediction of the phenomenon.

WT and the corresponding algorithm to choose a suitable wavelet

The main idea behind the WT is to overcome the limitations of the FT in dealing with frequency–time resolution and non-stationary signal analysis. The family of wavelets includes a wide range of functions (known as 'mother wavelets'), each with its own filtering characteristics. In this study, several families of transforms, coupled with the corresponding ML models, were compared to identify the combination that extracts the maximum possible decomposed information from a raw signal.

In the majority of studies, the optimal mother wavelet has been selected based on its performance in identifying the desired characteristics within the modeling process. That is, a set of different mother wavelets is first employed to analyze the corresponding time series, and the modeling results then reveal which mother wavelet performs best. This study instead attempts to find the appropriate mother wavelet function for each of the input parameters (hydroclimatic variables) before the modeling process, via the waveform matching algorithm. To explain the importance of selecting the mother wavelet for analyzing different data, consider one of the input parameters, Qt−1, which can be decomposed by the three mother wavelet functions db2, bior3.1, and sym14. Figure 2 shows the time series and the three mother wavelet functions.
Figure 2

Example of runoff time series decomposition by three mother wavelets.

According to Figure 2, the shape of the wavelet function sym14 is the most similar to the shape of the time series Qt−1; hence, this mother wavelet is expected to extract the maximum possible information when the time series is decomposed. The time series was decomposed into three levels with each of the three mother wavelets db2, bior3.1, and sym14, and the approximation and detail coefficients were extracted. Table 1 shows the correlation between the sub-signals (approximation and details) obtained with each mother wavelet and the target (denoted as Q). The sub-signals were then used as inputs for modeling the river runoff (Figure 3).
Table 1

Correlation of the wavelets, sym14, db2, and bior3.1 with Q (target) using LSSVM

| Wavelet name | d1 (Qt−1) | d2 (Qt−1) | d3 (Qt−1) | a3 (Qt−1) |
|---|---|---|---|---|
| bior3.1 | −0.17 | 0.26 | 0.14 | 0.19 |
| db2 | −0.17 | 0.21 | 0.39 | 0.45 |
| sym14 | −0.26 | 0.24 | 0.45 | 0.43 |
Figure 3

Evaluation of the choice of the mother wavelet in monthly runoff forecasting based on the waveform matching algorithm.


To find the best mother wavelet function, the LSSVM method was coupled with different mother wavelet functions: 21 Daubechies, 19 Symlets, 5 Coiflets, 15 Biorthogonal, 15 Reverse biorthogonal, 5 Fejér–Korovkin, and the discrete Meyer (dmey) wavelet. The MATLAB code written for these mother wavelet functions is provided in the Supplementary material.

These mother wavelet functions were used for forecasting monthly runoff. The simulation process was repeated 10 times, and the results were then averaged and compared (Table 2). For each family, the best mother wavelet function, namely the one with the highest R-value and the lowest RMSE-to-standard-deviation ratio (RSR) and mean absolute error (MAE), is shown in Tables 1 and 2 and Figure 3. The daily time series of Qt was also considered. The mother wavelet functions that conform most closely to the shape of the daily time series of Qt were found to provide the maximum possible information when the time series is decomposed. The superior mother wavelet functions for the daily time series were the same as those for the monthly time series. The shape of the daily time series, the shapes of the mother wavelet functions with the highest similarity (blue color) and the lowest similarity (red color) to the time series, and the correlations between Qt and the components of Qt−1 decomposed by these wavelets are shown in the Supplementary material.

Table 2

Performance of combination of LSSVM and wavelet model in monthly runoff prediction

| Wavelet | No. of runs | Train R | Train RSR | Train MAE (m³/s) | Test R | Test RSR | Test MAE (m³/s) |
|---|---|---|---|---|---|---|---|
| bior3.1 | 10 | 0.61 | 0.79 | 1.38 | 0.65 | 0.77 | 1.24 |
| db2 | 10 | 0.63 | 0.78 | 1.36 | 0.68 | 0.75 | 1.20 |
| sym14 | 10 | 0.75 | 0.66 | 1.14 | 0.73 | 0.69 | 1.16 |

As indicated in Table 1, sym14 generally outperformed the two other mother wavelets, bior3.1 and db2, in analyzing the time series: the correlations of its approximation and detail components with the target are, on the whole, higher than those of the other mother wavelets. Table 2 likewise demonstrates the superiority of sym14 in simulating the monthly river runoff. This suggests that the maximum possible information is extracted from the time series when it is decomposed with sym14, the wavelet most similar in shape to the main time series.

The wavelet transform measures the similarity between the mother wavelet function and the given time series. Since not all mother wavelets are equally effective, a given wavelet function may provide accurate results for one or more specific signals while being unsuitable for others. Therefore, choosing the appropriate mother wavelet is a crucial step in wavelet hybrid modeling.

The main point is that several input data may be needed to model a phenomenon. In this study, the monthly runoff of the Navrood River is a function of the meteorological parameters in the same month and the runoff of the previous month (Equation (1)).

Since the discrete wavelet transform (DWT) has been widely used to analyze hydrological time series, this study also incorporates two generalized forms of the DWT: the maximal overlap discrete wavelet transform (MODWT) and the maximal overlap discrete wavelet transform multiresolution analysis (MODWTMRA).

DWT, MODWT, and MODWTMRA decompose a signal but differ in their strengths and weaknesses.

DWT: efficient (uses smaller filters) and widely used, but may lose information (especially at high frequencies).

MODWT: preserves information better (due to maximum overlap filters) and is good for non-stationary signals, but requires more computation.

MODWTMRA: enables detailed analysis with multiresolution, but can be very demanding on computational resources and storage space.

The best choice depends on your specific needs: signal type, size, desired accuracy, and available computing power.
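To make the difference between MODWT coefficients and their multiresolution analysis concrete, the following Python sketch implements both for the Haar wavelet with circular boundary handling. It is a minimal illustration under stated assumptions: the study used other mother wavelets (such as sym14) through MATLAB, the Haar filter is chosen here only because it keeps the code short, and the function names and sample values are hypothetical.

```python
def modwt_haar(x, levels):
    """Haar MODWT with circular boundaries.

    Returns the detail coefficients W_1..W_J and the final smooth V_J.
    Every level keeps the full length of x (no downsampling).
    """
    n = len(x)
    v = list(x)
    details = []
    for j in range(1, levels + 1):
        s = 2 ** (j - 1)  # filter taps are spread 2^(j-1) samples apart
        w = [(v[t] - v[(t - s) % n]) / 2.0 for t in range(n)]
        v = [(v[t] + v[(t - s) % n]) / 2.0 for t in range(n)]
        details.append(w)
    return details, v


def imodwt_haar(details, smooth):
    """Invert modwt_haar by running the pyramid backwards."""
    n = len(smooth)
    v = list(smooth)
    for j in range(len(details), 0, -1):
        s = 2 ** (j - 1)
        w = details[j - 1]
        v = [(w[t] - w[(t + s) % n] + v[t] + v[(t + s) % n]) / 2.0
             for t in range(n)]
    return v


def modwt_mra_haar(x, levels):
    """Multiresolution analysis: components D_1..D_J and S_J that sum to x."""
    details, smooth = modwt_haar(x, levels)
    zeros = [0.0] * len(x)
    components = []
    for j in range(levels):
        # Keep only level j and zero every other coefficient set.
        only_j = [details[k] if k == j else zeros for k in range(levels)]
        components.append(imodwt_haar(only_j, zeros))
    components.append(imodwt_haar([zeros] * levels, smooth))
    return components


# Illustrative values only.
x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
details, smooth = modwt_haar(x, 3)
mra = modwt_mra_haar(x, 3)
rebuilt = [sum(component[t] for component in mra) for t in range(len(x))]
```

The MODWT coefficients preserve the energy of the signal, whereas the MRA components are additive: summing the details and the smooth returns the original series, which is why they are convenient as separate model inputs.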

The decomposed time series was modeled by the stepwise regression; in addition, the PCA was employed to reduce the input dimension; finally, three WT modes, DWT, MODWT, and MODWTMRA coupled with the ML models, ANFIS, LSSVM, GMDH, MARS, and RF were used for forecasting monthly runoff. The performance criteria were employed to evaluate the modeling results; the best models were introduced via the TOPSIS method; as a complementary task, the uncertainty analysis was carried out on the top two models. Figure 4 demonstrates the flowchart of the method used in this study.
Figure 4

Flowchart of the method used in this study.


Stepwise regression method

In this method, the independent variables x1, x2, …, xn are introduced into the related equation based on some predefined criteria. A variable in the equation may be replaced by a new variable in the equation or discarded from the equation altogether (Thompson 1995).

In stepwise regression, the criteria that determine how a variable is entered, replaced, or eliminated are as follows:

  • The stepwise procedure with F-test

  • Replacing with F-test

  • The stepwise procedure with multiple correlation coefficient R2

  • Swapping with R2
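As a hedged illustration of the F-test entry criterion listed above, the sketch below implements forward selection only: at each step it adds the candidate with the largest partial F statistic and stops when no candidate reaches a threshold. The removal and swapping steps of a full stepwise procedure are omitted, the threshold of 4.0 and all names are assumptions, and the data are made up for the example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is non-singular (no perfectly collinear predictors).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept and columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(design[i][a] * beta[a] for a in range(k))) ** 2
               for i in range(n))


def forward_stepwise(candidates, y, f_enter=4.0):
    """Add predictors one at a time while the best partial F >= f_enter."""
    selected = []
    current = rss([], y)
    while True:
        best = None
        for name, column in candidates.items():
            if name in selected:
                continue
            cols = [candidates[s] for s in selected] + [column]
            dof = len(y) - len(cols) - 1
            if dof <= 0:
                continue
            new = rss(cols, y)
            f_stat = float("inf") if new <= 1e-12 else (current - new) / (new / dof)
            if best is None or f_stat > best[1]:
                best = (name, f_stat, new)
        if best is None or best[1] < f_enter:
            return selected
        selected.append(best[0])
        current = best[2]


# Made-up example: y depends on x1 only, plus small noise.
x1 = [float(i) for i in range(1, 13)]
x2 = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0, 6.0, 0.0, 3.0, 8.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.2, 0.3, -0.2]
y = [2.0 * a + e for a, e in zip(x1, noise)]
selected = forward_stepwise({"x1": x1, "x2": x2}, y)
```

In the setting of this study, the candidate columns would be the wavelet sub-series (approximation and detail components) of each hydroclimatic variable.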

Principal component analysis

The PCA provides new coordinates that preserve the directions of highest variance in the data. The purpose of this method is therefore to decompose multivariate data sets into components that represent the maximum possible variation. After sorting the components by variance, those with the lowest variance can be ignored. In this way, the approach reduces the dimensions of the data while preserving the part that contains the most useful information. The primary variables are transformed into components that are not correlated with each other, and each new component is a linear combination of the original variables. Specifically, the method seeks linear combinations of m indices, x1, x2, …, xm, that produce independent indices, z1, z2, …, zl, with l < m. The steps of the PCA method can be summarized as follows: (1) data averaging and normalizing, (2) establishing a covariance matrix of the normalized data, (3) calculating the eigenvalues and eigenvectors of the covariance matrix, (4) arranging the eigenvectors in descending order of the corresponding eigenvalues, and (5) selecting an appropriate set of eigenvectors (Abdi & Williams 2010).
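The five steps can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions rather than the routine used in the study: eigenpairs are obtained by power iteration with deflation (adequate for a small symmetric covariance matrix), population variances are used for normalization, and the function names and sample data are hypothetical.

```python
import math


def standardize(rows):
    """Step 1: centre each column and scale it to unit variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]


def covariance(rows):
    """Step 2: covariance matrix of already centred data."""
    n, m = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / n for b in range(m)]
            for a in range(m)]


def dominant_eigenpair(mat, iters=1000):
    """Step 3: largest eigenvalue and eigenvector by power iteration."""
    m = len(mat)
    v = [1.0 / (i + 1) for i in range(m)]  # asymmetric starting vector
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            return 0.0, v
        v = [x / norm for x in w]
    value = sum(v[a] * sum(mat[a][b] * v[b] for b in range(m))
                for a in range(m))
    return value, v


def pca(rows, keep):
    """Steps 1-5: return eigenvalues, variance ratios, and projected scores."""
    z = standardize(rows)
    mat = covariance(z)
    m = len(mat)
    values, vectors = [], []
    for _ in range(m):
        # Step 4: deflation yields eigenpairs in descending order.
        val, vec = dominant_eigenpair(mat)
        values.append(val)
        vectors.append(vec)
        mat = [[mat[a][b] - val * vec[a] * vec[b] for b in range(m)]
               for a in range(m)]
    total = sum(values)
    ratios = [val / total for val in values]
    # Step 5: project the data on the first `keep` eigenvectors.
    scores = [[sum(r[j] * vec[j] for j in range(m)) for vec in vectors[:keep]]
              for r in z]
    return values, ratios, scores


# Made-up data: two nearly collinear columns and one alternating column.
data = [[1.0, 2.1, 1.0], [2.0, 3.9, -1.0], [3.0, 6.2, 1.0], [4.0, 7.8, -1.0],
        [5.0, 10.1, 1.0], [6.0, 11.9, -1.0], [7.0, 14.2, 1.0], [8.0, 15.8, -1.0]]
values, ratios, scores = pca(data, keep=2)
```

Because the columns are standardized, the eigenvalues sum to the number of variables, and each ratio is the share of total variance explained by one component.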

This study investigated whether using PCA (a dimensionality reduction technique) to simplify the data would affect the accuracy of monthly river flow prediction. Models trained on PCA-reduced data were compared with models trained on the original data using performance criteria such as the mean squared error and the coefficient of determination (R2), together with a sensitivity analysis. The comparison showed that PCA can significantly reduce the complexity of the data (reduce dimensions) without harming the prediction accuracy. This success is attributed to the fact that the principal components selected by PCA captured a substantial portion of the data variance, so that little critical information was lost during the reduction. Overall, the study confirms that PCA is a valuable tool for simplifying data analysis in tasks like river flow prediction, while still retaining the key information needed for accurate results.

In this study, preventing overfitting was a crucial aspect of model tuning. For this purpose, various methods and techniques were employed.

  • (a) Hybrid wavelet–ML models: hybrid models utilize wavelet decomposition to separate input signals into different frequency components. This allows the models to focus on each component individually, reducing model complexity and overfitting.

  • (b) Feature selection technique: feature selection, which involves identifying and retaining important features while eliminating irrelevant ones, is an effective approach to overfitting prevention. In this study, stepwise regression and PCA were employed for feature selection and dimensionality reduction.

  • (c) Model and parameter tuning: the various models used in this study have multiple parameters that require careful tuning. Optimization algorithms and trial-and-error methods were employed to optimize these parameters.

  • (d) Independent testing: to evaluate the models definitively, independent data not used in the training process was utilized. This assesses model performance in real-world scenarios and prevents overfitting.

  • (e) Model comparison: the results of different models were compared to select the one with the lowest error and least overfitting.

Uncertainty analysis

In this study, the uncertainty has been quantified using two criteria, the p-factor and the d-factor. The p-factor denotes the percentage of the observed data bracketed by the 95% prediction uncertainty (95PPU) band, and therefore its optimal value is 100%. Since each observation has its own 95PPU range with its own width, a second criterion is also used: the d-factor, the ratio of the average width of the uncertainty range over all observed data (the difference between the 97.5th and 2.5th percentiles of the simulated values) to the standard deviation of the observed data (the desired value is zero). A small d-factor therefore means that the model uncertainty is low. However, reducing the d-factor also tends to reduce the p-factor, which is not desirable, so it is essential to establish a proper balance between these two criteria. In runoff forecasting, a p-factor of at least 70% and a d-factor of at most 1.5 are considered acceptable (Abbaspour et al. 2015).
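The two criteria can be computed directly from an ensemble of simulations, as in the following sketch. It assumes that a list of simulated values is available for every time step, uses linear interpolation for the percentiles, and uses the population standard deviation of the observations; these choices, the function names, and the synthetic numbers are assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Percentile q (0-100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    low = int(math.floor(pos))
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)


def p_and_d_factor(observed, ensemble):
    """Return (p-factor in %, d-factor) for a 95PPU band.

    observed : list of observed values, one per time step
    ensemble : list of lists; ensemble[t] holds the simulated values at step t
    """
    lower = [percentile(sims, 2.5) for sims in ensemble]
    upper = [percentile(sims, 97.5) for sims in ensemble]
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    p_factor = 100.0 * inside / len(observed)
    mean_obs = sum(observed) / len(observed)
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / len(observed))
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    return p_factor, mean_width / std_obs


# Synthetic example: 21 simulations per step, spread +/-1 around each observation;
# the last step is shifted so that its band misses the observation.
observed = [1.0, 2.0, 3.0, 4.0]
ensemble = [[o + (k - 10) / 10.0 for k in range(21)] for o in observed]
ensemble[3] = [v + 5.0 for v in ensemble[3]]
p_factor, d_factor = p_and_d_factor(observed, ensemble)
```

Widening the band raises the p-factor and the d-factor together, which is the trade-off described above.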

In this study, the p-factor and d-factor methods are used to assess uncertainty in monthly flow modeling. These two methods are statistical tools employed to examine the agreement between the predicted and observed values of monthly flow. The goal of these two methods is to evaluate the performance of monthly flow models and ensure that the model accurately predicts changes in flow.

Performance criteria

In this study, the performance criteria, including the correlation coefficient (R), Kling–Gupta efficiency (KGE), Nash–Sutcliffe efficiency (NSE), the ratio of the root mean square error (RMSE) to the standard deviation of the observations (RSR), and MAE, have been employed to analyze the results and to quantify both the accuracy and efficiency of the developed models. A brief description of these criteria is presented below.

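The KGE: The KGE, developed by Gupta et al. (2009), indicates the effects of mean, variance, and correlation on the model performance, separately. The ideal value of the KGE is 1:
\[ \mathrm{KGE} = 1 - \sqrt{(R - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}, \qquad \alpha = \frac{\sigma_C}{\sigma_O}, \quad \beta = \frac{\bar{C}}{\bar{O}} \qquad (2) \]
The RSR: The RSR is the ratio of the RMSE to the standard deviation of the observed data (Singh et al. 2005); it combines an error index with a normalization by the variability of the observations. An RSR = 0 denotes the ideal situation, while RSR > 0.7 indicates that the performance is not reasonable (Hamaamin et al. 2016):
\[ \mathrm{RSR} = \frac{\sqrt{\sum_{i=1}^{n} (O_i - C_i)^2}}{\sqrt{\sum_{i=1}^{n} (O_i - \bar{O})^2}} \qquad (3) \]
The NSE: It is defined as (Gupta et al. 2009)
\[ \mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} (O_i - C_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} \qquad (4) \]
where \(O_i\) is the ith observed value, \(C_i\) is the ith calculated value, \(\bar{O}\) and \(\bar{C}\) are the means of the observed and calculated data, \(\sigma_O\) and \(\sigma_C\) are their standard deviations, R is the correlation coefficient, and n is the total number of observations. The values in the range of 0.0–1.0 are usually considered to show reasonable performance (Nash & Sutcliffe 1970).
The MAE: It is the average of all absolute errors and is defined as
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| O_i - C_i \right| \qquad (5) \]
The correlation coefficient: It is defined as
\[ R = \frac{\sum_{i=1}^{n} (O_i - \bar{O})(C_i - \bar{C})}{\sqrt{\sum_{i=1}^{n} (O_i - \bar{O})^2 \sum_{i=1}^{n} (C_i - \bar{C})^2}} \qquad (6) \]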
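For reference, the five criteria can be written as short Python functions. This is a sketch using the standard textbook definitions (observed values O, calculated values C, population standard deviations); the function names and the sample numbers are illustrative assumptions.

```python
import math


def _mean(v):
    return sum(v) / len(v)


def _std(v):
    m = _mean(v)
    return math.sqrt(sum((x - m) ** 2 for x in v) / len(v))


def correlation(obs, calc):
    """Pearson correlation coefficient R."""
    mo, mc = _mean(obs), _mean(calc)
    num = sum((o - mo) * (c - mc) for o, c in zip(obs, calc))
    den = math.sqrt(sum((o - mo) ** 2 for o in obs) *
                    sum((c - mc) ** 2 for c in calc))
    return num / den


def nse(obs, calc):
    """Nash-Sutcliffe efficiency (ideal value 1)."""
    mo = _mean(obs)
    return 1.0 - (sum((o - c) ** 2 for o, c in zip(obs, calc)) /
                  sum((o - mo) ** 2 for o in obs))


def rsr(obs, calc):
    """RMSE divided by the standard deviation of the observations (ideal 0)."""
    rmse = math.sqrt(sum((o - c) ** 2 for o, c in zip(obs, calc)) / len(obs))
    return rmse / _std(obs)


def mae(obs, calc):
    """Mean absolute error."""
    return sum(abs(o - c) for o, c in zip(obs, calc)) / len(obs)


def kge(obs, calc):
    """Kling-Gupta efficiency (ideal value 1)."""
    r = correlation(obs, calc)
    alpha = _std(calc) / _std(obs)    # variability ratio
    beta = _mean(calc) / _mean(obs)   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Illustrative values only.
obs = [2.0, 4.0, 6.0, 8.0]
calc = [2.5, 3.5, 6.5, 7.5]
summary = {"R": correlation(obs, calc), "NSE": nse(obs, calc),
           "RSR": rsr(obs, calc), "MAE": mae(obs, calc), "KGE": kge(obs, calc)}
```

With these definitions, RSR² equals 1 − NSE, so the two criteria always rank models identically; they are reported together mainly for ease of comparison with published thresholds.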

The proper mother wavelet using the waveform matching algorithm

In time series analysis, WTs are employed to eliminate the trend and overcome non-stationarity. In the literature, the appropriate mother wavelet has commonly been selected based on its performance in identifying the desired features during the modeling process. That is, all the candidate wavelets were first applied to the whole data set and the modeling process was completed; the results then revealed which mother wavelet improved the simulation most. In this study, by contrast, the waveform matching algorithm was used to select the proper mother wavelet. This algorithm attempts to match the form of the original time series with each wavelet and selects the wavelet with the greatest similarity. Seven parameters were incorporated to model the monthly runoff of the Navrood River. Each parameter was analyzed using the most appropriate mother wavelet function selected by the waveform matching algorithm (Figure 5).
Figure 5

Proper mother wavelet for each input parameter.


Modeling of the decomposed time series using stepwise regression

The selection of input parameters is one of the most important steps in building a successful ML model. The data collection process should not be costly, and one of the main criteria is to reduce the size of the network while preserving high performance. In this study, the input parameters were selected based on Pearson's correlation coefficient at the 99% confidence level and the stepwise regression method. The time series of the hydroclimatic variables were decomposed by the most suitable mother wavelet; then, different combinations of the resulting sub-series (including approximation and detail components) were evaluated via stepwise regression to obtain an optimal model configuration.

The PCA results

The third step of the procedure employed in this study (Figure 4) is the PCA technique to acquire the principal components of the corresponding variables, which accordingly leads to reducing the dimensions of the input data. PCA is often used for dimensionality reduction, where the goal is to reduce the number of variables while preserving the most important information. By selecting the top k principal components, which capture the most variance in the data, we can represent the data set in a lower-dimensional space. In this study, the PCA algorithm has been implemented on the results of the wavelet-based (DWT, MODWT, and MODWTMRA) models.

First, the boxplots of the outputs of all three WT techniques are shown in Figure 6. These plots indicate the variation range of each decomposed parameter that enters the PCA algorithm. The + sign marks data points that exceed three times the standard deviation, which are treated as outliers in this study.
Figure 6

Boxplot outputs for different WT techniques: (a) DWT, (b) MODWTMRA, and (c) MODWT.
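
The three-standard-deviation rule used to mark outliers can be written in a few lines. This sketch assumes the deviation is measured from the sample mean and uses the sample standard deviation; the function name and the data are hypothetical.

```python
from statistics import mean, stdev

def three_sigma_outliers(values, k=3.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) > k * sigma]

# Synthetic example: a regular pattern plus one extreme value.
values = [float(i % 5) for i in range(30)] + [50.0]
outliers = three_sigma_outliers(values)
print(outliers)
```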

PCA was used to decompose the results of the wavelet-based models (DWT, MODWT, and MODWTMRA) into principal components. Figure 7(a)–7(c) illustrates the relationship between the first and second most important components for each WT technique.
Figure 7

First important component is plotted against the second important component for each WT technique: (a) DWT, (b) MODWTMRA, and (c) MODWT. Output of the variance (%) for each principal component: (d) DWT, (e) MODWTMRA, and (f) MODWT.

As depicted in Figure 7(a)–7(c), the first principal component spans a significantly larger range, and therefore shows more dispersion, than the second principal component. This indicates that the first component is more important than the second. To determine the number of components that can effectively represent the original data, the variance explained by each component can be plotted together with the cumulative variance. This analysis shows how much variance each component captures and identifies the point beyond which adding more components contributes little to the total variance explained. Figure 7(d)–7(f) shows the variance of the individual components, which collectively represent nearly 95% of the total variance in the original data.

As observed in Figure 7(d)–7(f), the explained variance changes more gradually from one component to the next in the DWT model than in the other models. The first four components account for approximately two-thirds of the total variance, so they were retained and the remaining components were disregarded. Conversely, both the MODWTMRA and MODWT models display a noticeable drop from the first to the second component. In these wavelet models, the first component explains ∼40% of the total variance. For the MODWTMRA and MODWT models, the first three and the first two components, respectively, capture two-thirds of the total variance. Consequently, the first three components are chosen for the MODWTMRA model, while the first two components are selected for the MODWT model.
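
The selection rule described above (retain the smallest number of leading components whose cumulative variance reaches a target share) can be sketched as follows. The per-component percentages in the example are hypothetical and are not read from Figure 7; only the ∼40% share of the first component echoes the text.

```python
def components_for_threshold(variance_pct, threshold_pct):
    """Smallest number of leading components whose cumulative variance reaches the threshold."""
    cumulative = 0.0
    for k, pct in enumerate(variance_pct, start=1):
        cumulative += pct
        if cumulative >= threshold_pct:
            return k
    return len(variance_pct)

# Hypothetical variance shares (%) per component, sorted in decreasing order.
example_pct = [40.0, 27.0, 15.0, 8.0, 5.0, 5.0]
k_two_thirds = components_for_threshold(example_pct, 200.0 / 3.0)
k_ninety = components_for_threshold(example_pct, 90.0)
print(k_two_thirds, k_ninety)
```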

Comparison of the developed hybrid models

The simulation results of the models, comprising RF, MARS, GMDH, LSSVM, and ANFIS, coupled with DWT, MODWT, and MODWTMRA are presented in the Supplementary material.

As expected, the models perform somewhat better during training than during testing. In the simulation process, it is very important to avoid overfitting the model. The models were ranked using the TOPSIS method for both the training and testing data (see the Supplementary material).

The results revealed that the RF–MRA model outperformed the others, with R = 0.98, KGE = 0.88, NSE = 0.96, RSR = 0.20, MAE = 0.36, and score = 0.9910 during the training period and R = 0.98, KGE = 0.84, NSE = 0.94, RSR = 0.24, MAE = 0.36, and score = 0.9805 during the testing period. This performance is attributed to the structure of RF, which combines many decision trees, and no overfitting was observed during the simulation process.

In other words, RF is an ensemble learning method for classification and regression. It builds many decision trees during training and outputs either the class chosen by most trees (classification) or the average of the predictions of the individual trees (regression).
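
For readers who want to reproduce the criteria reported above, the sketch below computes R, NSE, RSR, MAE, and KGE. It assumes the standard definitions (NSE after Nash & Sutcliffe 1970, KGE after Gupta et al. 2009, RSR as the RMSE divided by the standard deviation of the observations); the function name and the four-point example are illustrative only.

```python
import math

def evaluation_metrics(obs, sim):
    """Return R, NSE, RSR, MAE, and KGE for observed and simulated series."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))   # squared errors
    sso = sum((o - mo) ** 2 for o in obs)               # spread of observations
    sss = sum((s - ms) ** 2 for s in sim)               # spread of simulations
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    r = cov / math.sqrt(sso * sss)
    alpha = math.sqrt(sss / sso)   # variability ratio, sigma_sim / sigma_obs
    beta = ms / mo                 # bias ratio, mean_sim / mean_obs
    return {
        "R": r,
        "NSE": 1.0 - sse / sso,
        "RSR": math.sqrt(sse / sso),
        "MAE": sum(abs(o - s) for o, s in zip(obs, sim)) / n,
        "KGE": 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2),
    }

# Illustrative data: the simulation is offset from the observations by a constant 0.5.
metrics = evaluation_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
print(metrics)
```

Under these definitions NSE = 1 − RSR², which agrees with the reported pairs (NSE = 0.96 with RSR = 0.20, and NSE = 0.94 with RSR = 0.24).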

Meanwhile, the results indicated that for both the training and testing periods, the three highest scores were assigned to the same models: RF–MRA, RF–DWT, and MARS–DWT. The worst performances were obtained by ANFIS–MODWT for the training period and LSSVM–MODWT for the testing period. The comparison of preprocessing techniques also showed a distinct order of preference for each ML model. For the RF model, RF–MRA had the best performance (score = 0.9910), followed by RF–DWT (score = 0.9791) and RF–MODWT (score = 0.9410) during training; during testing, RF–MRA (score = 0.9805) was followed by RF–DWT (score = 0.9527) and RF–MODWT (score = 0.7866). For MARS, the ranking was MARS–DWT, MARS–MRA, and MARS–MODWT for both the training and testing periods.

For GMDH, the ranking was GMDH–DWT, GMDH–MRA, and GMDH–MODWT for both the training and testing periods. For the ANFIS model, ANFIS–MRA outperformed ANFIS–DWT and ANFIS–MODWT during the training period, whereas during the testing period ANFIS–DWT performed best, followed by ANFIS–MRA and ANFIS–MODWT. A similar ranking was observed for the wavelet-based LSSVM models. Thus, for the RF, MARS, and GMDH models, the preprocessing techniques (DWT, MODWT, and MODWTMRA) had comparable ranks during both the training and testing periods, while for ANFIS and LSSVM the order of the wavelet techniques differed between the two periods. Therefore, it can be concluded that no single wavelet technique gives the best performance for all ML models, owing to the inherent differences among the studied models.
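
The TOPSIS scores behind the rankings above follow a fixed recipe: normalize the decision matrix, weight it, locate the ideal and anti-ideal solutions, and score each alternative by its relative closeness to the ideal. The sketch below assumes vector normalization and equal weights, neither of which is stated in this section, and the three alternatives and their values are hypothetical rather than taken from the study.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness scores in [0, 1]; rows are alternatives, columns are criteria.

    benefit[j] is True when larger values of criterion j are better.
    """
    m, n = len(matrix), len(matrix[0])
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(m))) for j in range(n)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n)] for i in range(m)]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ideal)))
        d_neg = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, anti)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical models scored on R, KGE, NSE (higher is better) and RSR, MAE (lower is better).
criteria_benefit = [True, True, True, False, False]
decision_matrix = [
    [0.95, 0.80, 0.90, 0.32, 0.40],  # model_A
    [0.90, 0.75, 0.80, 0.45, 0.55],  # model_B
    [0.85, 0.60, 0.70, 0.55, 0.70],  # model_C
]
topsis_scores = topsis(decision_matrix, [0.2] * 5, criteria_benefit)
print(topsis_scores)
```

A score close to 1 means that the model lies near the ideal solution on all criteria at once, which is how scores such as 0.9910 and 0.9805 should be read.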

Uncertainty quantification in the developed hybrid models

The uncertainty analysis is discussed in terms of the p-factor and d-factor. In this study, over 50 model executions, the best-performing results according to the statistical criteria were used to calculate the 95PPU for each observed value. This had a great impact on the optimized p-factor and d-factor values. Table 3 presents the uncertainty analysis for the top two hybrid models during the training and testing periods. A comparison between the simulated and observed values, including the 95PPU range, for the top two models during the training and testing periods is shown in the Supplementary material.

Table 3

Uncertainty parameters for top two hybrid models with different natures during the training and testing periods

| Criteria | RF–MODWTMRA, train data | RF–MODWTMRA, test data | MARS–DWT, train data | MARS–DWT, test data |
| --- | --- | --- | --- | --- |
| p-factor | 0.735 | 0.713 | 0.808 | 0.752 |
| d-factor | 0.379 | 0.493 | 0.919 | 1.010 |
| Average bandwidth | 0.92 | 0.92 | 2.23 | 1.89 |
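
The quantities in Table 3 can be computed from an ensemble of model runs along the following lines. This is a sketch under common conventions, assumed here rather than confirmed by the text: the 95PPU is bounded by the 2.5th and 97.5th percentiles of the simulated values at each time step, the p-factor is the fraction of observations inside the band, and the d-factor is the average bandwidth divided by the standard deviation of the observations. The function names and the synthetic 50-run ensemble are hypothetical.

```python
import math

def percentile(sorted_vals, q):
    """Percentile with linear interpolation; q is in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def uncertainty_factors(obs, runs):
    """p-factor, d-factor, and average 95PPU bandwidth; runs holds one series per execution."""
    lower, upper = [], []
    for t in range(len(obs)):
        vals = sorted(run[t] for run in runs)
        lower.append(percentile(vals, 2.5))
        upper.append(percentile(vals, 97.5))
    n = len(obs)
    inside = sum(1 for o, l, u in zip(obs, lower, upper) if l <= o <= u)
    mean_obs = sum(obs) / n
    sigma_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / (n - 1))
    bandwidth = sum(u - l for l, u in zip(lower, upper)) / n
    return {"p_factor": inside / n,
            "d_factor": bandwidth / sigma_obs,
            "avg_bandwidth": bandwidth}

# Synthetic ensemble of 50 runs: four time steps are bracketed, the last peak is missed.
obs = [1.0, 2.0, 3.0, 4.0, 10.0]
base = [1.0, 2.0, 3.0, 4.0, 7.0]
offsets = [-0.5 + k / 49.0 for k in range(50)]
runs = [[b + d for b in base] for d in offsets]
factors = uncertainty_factors(obs, runs)
print(factors)
```

This definition of the d-factor is consistent with Table 3: dividing the average bandwidth by the d-factor gives nearly the same value for both models within each period (about 2.43 for training and 1.87 for testing), as expected if that value is the standard deviation of the observed runoff.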

From Table 3, it is concluded that for MODWTMRA–RF, the p-factor of 73.5% (more than 70%), the d-factor of 37.9%, and the average bandwidth of 0.92 (<1.5) during the training period are acceptable values for runoff forecasting. In addition, the diagrams reveal that the uncertainty interval (95PPU) for the minimum runoff is much wider than that for the maximum runoff, which means that the minimum runoff has a higher uncertainty than the maximum runoff. In contrast, the uncertainty range is violated more often for the maximum runoff than for the minimum runoff. In fact, for the minimum runoff, the wider uncertainty interval results in a higher d-factor but a lower p-factor, whereas the opposite holds for the maximum runoff. During the testing period of MODWTMRA–RF, the p-factor, d-factor, and average uncertainty bandwidth are 71.3% (more than 70%), 49.3%, and 0.92 (<1.5), respectively. In addition, the diagram indicates that the uncertainty band has almost the same distribution for both the maximum and minimum runoff. During the training of DWT–MARS, the p-factor is 80.8%, the d-factor is 91.9%, and the average width of the uncertainty band is 2.23. Compared with RF, the p-factor is higher, which shows that a larger percentage of the observed data falls within the 95PPU uncertainty band; the d-factor is also considerably higher, because the average width of the uncertainty band is large relative to the standard deviation of the observed data. The relatively wide uncertainty band around nearly all observed data explains the high values of both the p-factor and d-factor. As with RF, the uncertainty of the minimum data is greater than that of the maximum data because of the wide uncertainty band.
In general, uncertainties in the modeling process may arise from simplifying assumptions, processes that occur in the watershed but were not included in the simulation, data limitations, and the uncertainty and quality of the input data, all of which lead to prediction errors. Similar behavior was observed during the testing of DWT–MARS. Despite the acceptable p-factor (75.2%), the average width of the uncertainty band is greater than the standard deviation of the observed data, which makes the d-factor slightly greater than one. Compared with the training period, the average width of the uncertainty band decreased to 1.89 (∼15% reduction), but it still exceeds the standard deviation of the observed data. In addition, the uncertainty bandwidth for the minimum values is greater than that for the maximum values, which indicates the higher uncertainty of the minimum values. It should be mentioned that the results are based on the top 50 model executions during both the training and testing periods. Using a smaller number of superior models might give seemingly more acceptable results; however, such models might not generalize properly during validation, leading to overfitting.

Climate change can affect the performance of precipitation and runoff models built using ML. These impacts can occur directly or indirectly through changes in precipitation patterns, temperature, and runoff. To address them, precipitation and runoff models should be updated so that they can predict these changes more accurately. This includes using new climate data, improving modeling algorithms, and adjusting model parameters based on current and future climate conditions. It is noteworthy that Lotfirad et al. (2023) investigated the effect of climate change on the runoff of watersheds in the Hyrcanian region, including the Navrood watershed, and observed appropriate performance of the rainfall–runoff model in future periods under climate change.

The proposed models in this study have demonstrated high prediction accuracy, flexibility, generalization capability, stability, and reliability, making them suitable for various applications:

  • Water resource management: the proposed models can be highly effective in water resource planning and management. Accurate and reliable information about future river flows can aid in better decision-making regarding water allocation, flood management, and agricultural planning.

  • Flood forecasting: the proposed models can serve as valuable tools for flood forecasting. By accurately predicting future river flows, proactive measures can be taken to mitigate flood damage.

  • Agricultural planning: in agriculture, knowing future water flows is crucial for planning crop planting and harvesting. The proposed models can assist farmers and agricultural managers in developing better water management plans based on accurate forecasts.

In this study, data-driven ML models were coupled with preprocessing methods comprising WT, the stepwise algorithm, and PCA to develop accurate hybrid models for predicting river runoff. The waveform matching algorithm was applied for the first time in this study to obtain the optimal mother wavelet for data preprocessing. Unlike conventional methods, which establish the efficiency of a wavelet only after the modeling process has been carried out, this method identifies the optimal wavelet before modeling begins by matching the shape of the original time series with the candidate wavelets. Once the raw time series had been decomposed into the corresponding sub-series (approximation and details), a stepwise method was used to test different combinations and identify the most efficient one. The output of the stepwise method was passed to the complementary preprocessing technique, PCA, to reduce the dimensions of the input data for the modeling process. By changing the coordinates of the input data, PCA preserves only the components that convey the most information. The PCA results revealed that for the DWT, roughly the first four components expressed 75% of the total variance. For the MODWTMRA and MODWT, the first three and the first two components, respectively, expressed 75% of the total variance. The preprocessed input data were introduced into the ML models (RF, MARS, GMDH, LSSVM, and ANFIS) to carry out the simulation. For the training and testing periods, the statistical criteria MAE, RSR, NSE, KGE, and R were applied to compare the performance of the models. The RF–MRA model outperformed the other ML models in both periods. The last step was the uncertainty analysis based on the p-factor and d-factor.
In this study, each model was executed 50 times, and the best-performing models were extracted based on the statistical criteria. The MODWTMRA–RF model showed the best performance, with p-factor = 73.5% (more than 70%), d-factor = 37.9%, and an average uncertainty bandwidth of 0.92 (<1.5) during the training period (Table 3), representing confident forecasting of the river runoff. For the DWT–MARS model during training, the p-factor, d-factor, and average width of the uncertainty band were 80.8% (more than 70%), 91.9%, and 2.23, respectively. The combination of ML models with preprocessing techniques such as wavelets has been discussed in many studies. Nourani et al. (2019) integrated DWT with the M5 model tree to predict daily and monthly river runoff in Iran and Australia; db4 was introduced as the most appropriate mother wavelet for rainfall–runoff simulation. Freire et al. (2019) coupled an ANN model with 54 mother wavelets from the DWT to predict daily runoff; the Meyer wavelet was selected as the most suitable mother wavelet. Farajpanah et al. (2020) incorporated eight ML models and the corresponding combinations with DWT to estimate the daily runoff of the Navrood River; they identified sym7 as the optimal mother wavelet for the simulation process. Wang et al. (2022) combined five ML models with the DWT to evaluate the monthly runoff of two rivers in the United States; db4 was selected as the best mother wavelet. A noteworthy point is that none of these studies used a specific framework to choose the mother wavelet; the choice was based solely on the improvement in the modeling results.
However, in the current study, the waveform matching algorithm, an innovative method in hydroclimatic studies, was used to identify the most appropriate mother wavelet, and sym14 was determined to be the optimal mother wavelet for predicting monthly runoff.

The manuscript is an original work with its own merit, has not been previously published in whole or in part, and is not being considered for publication elsewhere.

The authors have read the final manuscript, have approved the submission to the journal, and have accepted full responsibilities pertaining to the manuscript's delivery and contents.

The authors agree to publish this manuscript upon acceptance.

The authors declare that they all contributed to the preparation of this manuscript.

The authors did not receive support from any organization for the submitted work.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abbaspour, K. C., Rouholahnejad, E., Vaghefi, S., Srinivasan, R., Yang, H. & Kløve, B. 2015 A continental-scale hydrology and water quality model for Europe: Calibration and uncertainty of a high-resolution large-scale SWAT model. Journal of Hydrology 524, 733–752. https://doi.org/10.1016/j.jhydrol.2015.03.027.
Abdi, H. & Williams, L. J. 2010 Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics 2 (4), 433–459. https://doi.org/10.1002/wics.101.
Abebe, S. A., Qin, T., Zhang, X. & Yan, D. 2022 Wavelet transform-based trend analysis of streamflow and precipitation in Upper Blue Nile River basin. Journal of Hydrology: Regional Studies 44, 101251. https://doi.org/10.1016/j.ejrh.2022.101251.
Adib, A., Zaerpour, A., Kisi, O. & Lotfirad, M. 2021 A rigorous wavelet-packet transform to retrieve snow depth from SSMIS data and evaluation of its reliability by uncertainty parameters. Water Resources Management 35 (9), 2723–2740. https://doi.org/10.1007/s11269-021-02863-x.
Ahadi, M. & Bakhtiar, M. S. 2010 Leak detection in water-filled plastic pipes through the application of tuned wavelet transforms to acoustic emission signals. Applied Acoustics 71 (7), 634–639. https://doi.org/10.1016/j.apacoust.2010.02.006.
Ahmadi, F., Mehdizadeh, S. & Nourani, V. 2022 Improving the performance of random forest for estimating monthly reservoir inflow via complete ensemble empirical mode decomposition and wavelet analysis. Stochastic Environmental Research and Risk Assessment 36 (9), 2753–2768. https://doi.org/10.1007/s00477-021-02159-x.
Aussem, A. & Murtagh, F. 1997 Combining neural network forecasts on wavelet-transformed time series. Connection Science 9 (1), 113–121. https://doi.org/10.1080/095400997116766.
Azarpira, F. & Shahabi, S. 2021 Evaluating the capability of hybrid data-driven approaches to forecast monthly streamflow using hydrometric and meteorological variables. Journal of Hydroinformatics 23 (6), 1165–1181. https://doi.org/10.2166/hydro.2021.105.
Burrus, C. S., Gopinath, R. A. & Guo, H. 1998 Introduction to Wavelet and Wavelet Transforms: A Primer. Prentice Hall, Upper Saddle River, NJ, p. 268.
Cannas, B., Fanni, A., See, L. & Sias, G. 2006 Data preprocessing for river flow forecasting using neural networks: Wavelet transforms and data partitioning. Physics and Chemistry of the Earth, Parts A/B/C 31 (18), 1164–1171. https://doi.org/10.1016/j.pce.2006.03.020.
Choong, S. M., El-Shafie, A. & Mohtar, W. H. M. W. 2017 Optimisation of multiple hydropower reservoir operation using artificial bee colony algorithm. Water Resources Management 31 (4), 1397–1411. https://doi.org/10.1007/s11269-017-1585-x.
Esmaeili-Gisavandani, H., Farajpanah, H., Adib, A., Kisi, O., Riyahi, M. M., Lotfirad, M. & Salehpoor, J. 2022 Evaluating ability of three types of discrete wavelet transforms for improving performance of different ML models in estimation of daily-suspended sediment load. Arabian Journal of Geosciences 15 (1), 29. https://doi.org/10.1007/s12517-021-09282-7.
Farajpanah, H., Lotfirad, M., Adib, A., Esmaeili-Gisavandani, H., Kisi, Ö., Riyahi, M. M. & Salehpoor, J. 2020 Ranking of hybrid wavelet-AI models by TOPSIS method for estimation of daily flow discharge. Water Supply 20 (8), 3156–3171. https://doi.org/10.2166/ws.2020.211.
Fathian, F., Mehdizadeh, S., Sales, A. K. & Safari, M. J. S. 2019 Hybrid models to improve the monthly river flow prediction: Integrating artificial intelligence and non-linear time series models. Journal of Hydrology 575, 1200–1213. https://doi.org/10.1016/j.jhydrol.2019.06.025.
Flanders, M. 2002 Choosing a wavelet for single-trial EMG. Journal of Neuroscience Methods 116 (2), 165–177. https://doi.org/10.1016/S0165-0270(02)00038-9.
Freire, P. K. D. M. M., Santos, C. A. G. & da Silva, G. B. L. 2019 Analysis of the use of discrete wavelet transforms coupled with ANN for short-term streamflow forecasting. Applied Soft Computing 80, 494–505. https://doi.org/10.1016/j.asoc.2019.04.024.
Gupta, H. V., Kling, H., Yilmaz, K. K. & Martinez, G. F. 2009 Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. Journal of Hydrology 377 (1–2), 80–91. https://doi.org/10.1016/j.jhydrol.2009.08.003.
Hamaamin, Y. A., Nejadhashemi, A. P., Zhang, Z., Giri, S. & Woznicki, S. A. 2016 Bayesian regression and neuro-fuzzy methods reliability assessment for estimating streamflow. Water 8 (7), 287. https://doi.org/10.3390/w8070287.
Ibrahim, K. S. M. H., Huang, Y. F., Ahmed, A. N., Koo, C. H. & El-Shafie, A. 2022 A review of the hybrid artificial intelligence and optimization modelling of hydrological streamflow forecasting. Alexandria Engineering Journal 61, 279–303. https://doi.org/10.1016/j.aej.2021.04.100.
Jang, Y. I., Sim, J. Y., Yang, J. R. & Kwon, N. K. 2021 The optimal selection of mother wavelet function and decomposition level for denoising of DCG signal. Sensors 21 (5), 1851. https://doi.org/10.3390/s21051851.
Labat, D., Ababou, R. & Mangin, A. 2000 Rainfall–runoff relation for karstic spring. Part II: Continuous wavelet and discrete orthogonal multiresolution analyses. Journal of Hydrology 238 (3–4), 149–178. https://doi.org/10.1016/S0022-1694(00)00322-X.
Liu, Z., Zhou, P., Chen, G. & Guo, L. 2014 Evaluating a coupled discrete wavelet transform and support vector regression for daily and monthly streamflow forecasting. Journal of Hydrology 519, 2822–2831. https://doi.org/10.1016/j.jhydrol.2014.06.050.
Lotfirad, M., Adib, A., Riyahi, M. M. & Jafarpour, M. 2023 Evaluating the effect of the uncertainty of CMIP6 models on extreme flows of the Caspian Hyrcanian forest watersheds using the BMA method. Stochastic Environmental Research and Risk Assessment 37 (2), 491–505. https://doi.org/10.1007/s00477-022-02269-0.
Mao, G., Wang, M., Liu, J., Wang, Z., Wang, K., Meng, Y., Zhong, R., Wang, H. & Li, Y. 2021 Comprehensive comparison of artificial neural networks and long short-term memory networks for rainfall–runoff simulation. Physics and Chemistry of the Earth, Parts A/B/C 123, 103026. https://doi.org/10.1016/j.pce.2021.103026.
Nalley, D., Adamowski, J., Khalil, B. & Biswas, A. 2020 A comparison of conventional and wavelet transform based methods for streamflow record extension. Journal of Hydrology 582, 124503. https://doi.org/10.1016/j.jhydrol.2019.124503.
Nash, J. E. & Sutcliffe, J. V. 1970 River flow forecasting through conceptual models part I – A discussion of principles. Journal of Hydrology 10 (3), 282–290. https://doi.org/10.1016/0022-1694(70)90255-6.
Ngui, W. K., Leong, M. S., Hee, L. M. & Abdelrhman, A. M. 2013 Wavelet analysis: Mother wavelet selection methods. Applied Mechanics and Materials 393, 953–958. https://doi.org/10.4028/www.scientific.net/AMM.393.953.
Niu, W. J., Feng, Z. K., Cheng, C. T. & Zhou, J. Z. 2018 Forecasting daily runoff by extreme learning machine based on quantum-behaved particle swarm optimization. Journal of Hydrologic Engineering 23 (3). https://doi.org/10.1061/(ASCE)HE.1943-5584.0001625.
Nourani, V., Komasi, M. & Mano, A. 2009 A multivariate ANN-wavelet approach for rainfall-runoff modeling. Water Resources Management 23 (14), 2877–2894. https://doi.org/10.1007/s11269-009-9414-5.
Nourani, V., Kisi, Ö. & Komasi, M. 2011 Two hybrid artificial intelligence approaches for modeling rainfall–runoff process. Journal of Hydrology 402 (1–2), 41–59. https://doi.org/10.1016/j.jhydrol.2011.03.002.
Nourani, V., Baghanam, A. H., Adamowski, J. & Kisi, O. 2014 Applications of hybrid wavelet–artificial intelligence models in hydrology: A review. Journal of Hydrology 514, 358–377. https://doi.org/10.1016/j.jhydrol.2014.03.057.
Nourani, V., Tajbakhsh, A. D., Molajou, A. & Gokcekus, H. 2019 Hybrid wavelet–M5 model tree for rainfall–runoff modeling. Journal of Hydrologic Engineering 24 (5). https://doi.org/10.1061/(ASCE)HE.1943-5584.0001777.
Pal, S. & Talukdar, S. 2020 Modelling seasonal flow regime and environmental flow in Punarbhaba River of India and Bangladesh. Journal of Cleaner Production 252, 119724. https://doi.org/10.1016/j.jclepro.2019.119724.
Passoni, I., Pra, A. D., Rabal, H., Trivi, M. & Arizaga, R. 2005 Dynamic speckle processing using wavelet based entropy. Optics Communications 246 (1–3), 219–228. https://doi.org/10.1016/j.optcom.2004.10.054.
Sang, Y. F. 2013 A review on the applications of wavelet transform in hydrology time series analysis. Atmospheric Research 122, 8–15. https://doi.org/10.1016/j.atmosres.2012.11.003.
Singh, J., Knapp, H. V., Arnold, J. G. & Demissie, M. 2005 Hydrological modeling of the Iroquois River watershed using HSPF and SWAT. Journal of the American Water Resources Association 41 (2), 343–360. https://doi.org/10.1111/j.1752-1688.2005.tb03740.x.
Syed, Z., Mahmood, P., Haider, S., Ahmad, S., Jadoon, K. Z., Farooq, R., Syed, S. & Ahmad, K. 2023 Short–long-term streamflow forecasting using a coupled wavelet transform–artificial neural network (WT–ANN) model at the Gilgit River basin, Pakistan. Journal of Hydroinformatics 25 (3), 881–894. https://doi.org/10.2166/hydro.2023.161.
Tang, B., Liu, W. & Song, T. 2010 Wind turbine fault diagnosis based on Morlet wavelet transformation and Wigner–Ville distribution. Renewable Energy 35 (12), 2862–2866. https://doi.org/10.1016/j.renene.2010.05.012.
Thompson, B. 1995 Stepwise regression and stepwise discriminant analysis need not apply here: A guidelines editorial. Educational and Psychological Measurement 55 (4), 525–534. https://doi.org/10.1177/0013164495055004001.
Tikhamarine, Y., Souag-Gamane, D. & Kisi, O. 2019 A new intelligent method for monthly streamflow prediction: Hybrid wavelet support vector regression based on grey wolf optimizer (WSVR–GWO). Arabian Journal of Geosciences 12 (17), 540. https://doi.org/10.1007/s12517-019-4697-1.
Tikhamarine, Y., Souag-Gamane, D., Ahmed, A. N., Kisi, O. & El-Shafie, A. 2020 Improving artificial intelligence models accuracy for monthly streamflow forecasting using grey wolf optimization (GWO) algorithm. Journal of Hydrology 582, 124435. https://doi.org/10.1016/j.jhydrol.2019.124435.
Wang, K., Band, S. S., Ameri, R., Biyari, M., Hai, T., Hsu, C. C., Hadjouni, M., Elmannai, H., Chau, K. W. & Mosavi, A. 2022 Performance improvement of machine learning models via wavelet theory in estimating monthly river streamflow. Engineering Applications of Computational Fluid Mechanics 16 (1), 1833–1848. https://doi.org/10.1080/19942060.2022.2119281.
Wang, J., Wang, X. & Khu, S. T. 2023 A decomposition-based multi-model and multi-parameter ensemble forecast framework for monthly streamflow forecasting. Journal of Hydrology 618, 129083. https://doi.org/10.1016/j.jhydrol.2023.129083.
Yang, L., Judd, M. D. & Bennoch, C. J. 2004 Denoising UHF signal for PD detection in transformers based on wavelet technique. In: Paper Presented at 2004 Annual Report Conference on Electrical Insulation and Dielectric Phenomena, CEIDP ’04, Boulder, United States, pp. 166–169. https://doi.org/10.1109/CEIDP.2004.1364215.
Zhang, L., Bao, P. & Wu, X. 2005 Multiscale LMMSE-based image denoising with optimal wavelet selection. IEEE Transactions on Circuits and Systems for Video Technology 15 (4), 469–481. https://doi.org/10.1109/TCSVT.2005.844456.
Zhang, D., Lin, J., Wang, D., Yang, T., Sorooshian, S., Liu, X. & Zhuang, J. 2018 Modeling and simulating of reservoir operation using the artificial neural network, support vector regression, deep learning algorithm. Journal of Hydrology 565, 720–736. https://doi.org/10.1016/j.jhydrol.2018.08.050.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data