Accurate river flow prediction is critical for sustainable water resource management, particularly in arid and semi-arid regions. However, balancing model accuracy, computational efficiency, and interpretability remains a significant challenge due to the complex and nonlinear nature of hydrological systems. This study employs a granular computing (GRC) model to predict monthly inflows to the Alavian Dam in Iran. Principal component analysis (PCA) was used to reduce input dimensionality, identifying six key variables to enhance computational performance. The predictive performance of GRC was compared with artificial neural networks (ANNs) and support vector machines (SVMs), using R2, RMSE, and MAE as evaluation metrics. The GRC model achieved R2 values of 0.93 during calibration and 0.94 during validation, outperforming both ANN and SVM. Notably, GRC demonstrated superior accuracy in capturing extreme flow events, which are crucial for flood and drought management. This advantage is attributed to its rule-based structure and local learning approach, which enables effective modeling of nonlinearities and sparse data. Furthermore, the interpretability of the GRC model - facilitated by its use of granules and transparent if-then rules - offers valuable insights into variable influence. These strengths highlight GRC as a reliable and efficient tool for hydrological forecasting and climate-adaptive water resource planning.

  • This study presents granular computing (GRC) as an innovative approach for monthly inflow prediction, surpassing traditional models like artificial neural network and support vector machine in both accuracy and interpretability.

  • With R2 values of 0.93 and 0.94 during calibration and validation, GRC demonstrated superior predictive performance while avoiding overfitting.

  • Unlike black-box models, GRC ensured consistent, interpretable, and reliable predictions.

The sustainable management of water resources has become one of the most critical global challenges, especially in regions facing diminishing supplies due to rapid population growth and increasing demand across domestic, agricultural, and industrial sectors. Accurate river flow prediction is essential for effective water resource planning, helping decision-makers allocate resources more efficiently and mitigate risks associated with both water scarcity and extreme hydrological events. In developing countries, where water infrastructure is often limited, effective runoff modeling becomes particularly important in risk reduction efforts. Multiple runoff modeling techniques exist, and enhancing these models through accurate calibration is essential to address growing water demands and the intensifying impacts of climate change on hydrological systems (Jodhani et al. 2023a, b, 2024; Mazandarani Zadeh et al. 2023; Alisoltani et al. 2024; Vyas et al. 2024; Heshmati et al. 2025).

Over the past decades, researchers have developed a variety of approaches to address the complex nature of river flow prediction. These range from traditional regression methods and conceptual hydrological models to advanced black-box techniques such as artificial neural networks (ANN) and support vector machines (SVM). For instance, Fallah Kalaki et al. (2025) evaluated weather and river flow predictions in the Karun River Basin using the North American Multi-Model Ensemble (NMME) system in combination with the Soil & Water Assessment Tool (SWAT) model. The study demonstrated improved long-term forecasting performance, particularly in spring and autumn, by integrating multiple NMME models with statistical downscaling techniques such as multiple linear regression and K-nearest neighbors.

Neuro-fuzzy models have also gained attention for their potential to combine the strengths of fuzzy logic and neural networks. Early work by Roger & Gulley (1995) introduced this hybrid approach, showing its effectiveness in capturing complex relationships through both rule-based reasoning and adaptive learning. Further applications by Coulibaly et al. (2000), Nayak et al. (2004), and Farokhnia & Morid (2010) demonstrated the superior performance of neuro-fuzzy systems compared to traditional ANN models, particularly in reducing output uncertainty. Other studies, such as those by Anusree & Varghese (2016) and Nath et al. (2020), explored enhancements to Adaptive Neuro-Fuzzy Inference System (ANFIS) models using optimization algorithms like particle swarm optimization (PSO), leading to improved predictive accuracy and computational efficiency.

In addition to ANN and neuro-fuzzy models, SVMs have been successfully applied in river flow forecasting due to their solid theoretical foundation and capacity for handling nonlinear data. Notable contributions include those by Asefa et al. (2005), Yu et al. (2006), and He et al. (2014), who reported satisfactory results in seasonal and short-term flow prediction tasks. Complementing this work, Noori et al. (2009, 2010, 2011a) highlighted the benefits of preprocessing techniques such as principal component analysis (PCA) and wavelet transforms in improving model performance.

Despite the promise of ANN, SVM, and fuzzy models, these techniques often function as black boxes, offering limited insight into the physical relationships among input parameters. Their inability to interpret nonlinear hydrological behavior, especially during extreme events such as floods and droughts, remains a key limitation. Furthermore, their performance typically degrades when faced with unstable or highly uncertain input data. These challenges underscore the need for models that combine high accuracy and efficiency with interpretability and robustness across varying conditions.

In response to these challenges, granular computing (GRC) has emerged as a promising framework for modeling complex environmental systems. Compared to conventional data-driven models, GRC offers notable advantages, including better interpretability, flexible handling of sparse or imprecise data, and rule-based knowledge structuring. These attributes make GRC particularly effective in scenarios involving uncertainty, such as estimating the longitudinal dispersion coefficient in rivers. Hybrid models that integrate GRC with neural networks have shown enhanced performance without sacrificing model transparency, positioning GRC as a compelling alternative to black-box methods in environmental applications.

The GRC approach defines computational models through classes, clusters, groups, and intervals, making it well-suited for large, complex datasets (Zhao et al. 2007). It utilizes multi-criteria decision-making to extract classification rules while minimizing inconsistency, thereby reducing problem uncertainty. However, its limitation lies in the fixed interval assumption, which can restrict flexibility. In recent years, GRC has been applied to water science problems, including the estimation of riverbed particle dimensions and longitudinal dispersion coefficients, with encouraging results reported by Naghikhani et al. (2015), Noori et al. (2017a, b), and Ghiasi et al. (2019, 2022).

This study adopts the GRC model – a method more common in electrical engineering than in water resource applications – for streamflow prediction. Given the supervised nature of the current problem, unsupervised methods such as self-organizing maps, hierarchical clustering, and semi-supervised models are unsuitable. Similarly, evolutionary algorithms and reinforcement learning techniques are not optimal for developing a predictive model in this context, although they can support pattern recognition tasks. Among supervised learning models, linear and logistic regression approaches lack the complexity-handling capabilities required here, while decision-tree-based models risk providing suboptimal, localized results (Maimon & Rokach 2008; Ben-Gal et al. 2014). Neural networks, though powerful, are often non-interpretable, computationally intensive, and sensitive to initialization conditions. In contrast, the GRC model provides a consistent and interpretable framework where model behavior aligns with its underlying assumptions and is not dependent on random weight initialization. Furthermore, its precision can be fine-tuned through parameter adjustment, unlike neural networks that may yield variable outputs across different runs.

GRC is particularly suitable for predicting monthly inflow to the Alavian Dam, where input factors such as precipitation, climatic variability, and hydrological conditions are subject to high uncertainty and fluctuation. Traditional models may fall short in such settings, whereas GRC's hierarchical structure aids in uncovering hidden patterns within the data, enhancing both predictive accuracy and computational performance. By analyzing information at varying levels of granularity, GRC simplifies the processing of large datasets – an essential feature in big-data-driven hydrological modeling. Moreover, appropriate parameterization of the GRC model helps mitigate overfitting risks (Sheikhian et al. 2017).

Despite the high predictive power of ANN and SVM models, their lack of interpretability and high computational demand present limitations. GRC overcomes these issues by decomposing data into granules and establishing interpretable interrelationships. This transparency fosters greater stakeholder trust and enables more informed decision-making. Additionally, GRC's hierarchical design better accommodates the inherent uncertainty and complexity of hydrological data compared to black-box models, and it reduces computational costs through its efficient problem decomposition.

Nevertheless, GRC does face challenges, such as scalability and memory inefficiency when applied to extremely large datasets. Addressing these limitations requires the development of optimized algorithms, implementation of parallel processing techniques, and use of advanced computational infrastructures.

Consistent with recent research practices, PCA is employed in this study as a dimensionality reduction technique to manage a high number of input variables and improve computational efficiency. By projecting the data into a lower-dimensional space, PCA retains essential patterns while streamlining the model training process. This preprocessing step was applied prior to implementing the GRC model. Its effectiveness depends on the data's structure and variability. Following PCA, the GRC model is applied alongside ANN and SVM models to predict precipitation and river flow, with particular attention given to their performance in extreme flow conditions. The objective of this study is to evaluate the accuracy, robustness, and interpretability of the GRC model in comparison to conventional machine learning approaches.

Study area

Alavian Dam is constructed on the Sufi Chay River, on the southern slopes of Mount Sahand in the Urmia Lake watershed, 3.5 km north of Maragheh city in East Azerbaijan Province (Figure 1). The dam was built between 1990 and 1995, and the Sufi Chay River serves as its main inlet. The dam's minimum and maximum storage capacities are 3 and 60 m3, respectively, while the average annual evaporation and precipitation are 1885.51 and 330.55 mm, respectively. The dam's basin covers an area of 914.3 km2. The primary objectives of constructing the dam include supplying agricultural water to 1,200 hectares in the Maragheh and Bonab regions, providing part of the drinking water for the cities of Maragheh, Miandoab, Bonab, and Malekan, supplying water for regional industries, and controlling floods from the Sufi Chay River.

The meteorological data were obtained from the synoptic station of Maragheh. The average river discharge data were obtained from the hydrometric station in Tazkand, located upstream of the Alavian Dam. Both stations have long records, and the data used here cover the years 1983 to 2005. The relevant information, comprising 18 variables with three time lags, forms the foundational data for this study. Figure 1 illustrates the location of this watershed and the corresponding station in East Azerbaijan Province. The objective of the calculations in this article is to predict future discharge from this dataset using a granular computing method. Due to the large number of input variables, PCA, as performed by Noori et al. (2011b), was employed to select the most significant variables for the model. On this basis, six input variables were used to predict discharge at time t + 1: maximum temperature at time t − 2, solar radiation at time t − 1, discharge at time t, discharge at time t − 1, precipitation at time t, and precipitation at time t − 2. A summary of the statistical information for the data used in this study is provided in Table 1.
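The way the six lagged inputs map onto the target Q(t + 1) can be illustrated with a short sketch. The monthly values below are invented for illustration only, and the names `records`, `SELECTED_INPUTS`, and `build_samples` are hypothetical; only the six (variable, lag) pairs are taken from the text.

```python
# Hypothetical monthly records (values invented for illustration only).
records = {
    "Q":    [1.2, 1.5, 3.8, 9.6, 6.1, 2.4, 1.1],          # discharge (m3/s)
    "R":    [12.0, 30.5, 55.1, 40.2, 8.3, 0.0, 2.1],      # precipitation (mm)
    "Rad":  [210.0, 260.0, 340.0, 420.0, 510.0, 580.0, 600.0],  # radiation
    "Tmax": [4.0, 7.5, 12.0, 18.3, 24.1, 29.8, 33.0],     # max temperature (deg C)
}

# The six (variable, lag) pairs listed in the text.
SELECTED_INPUTS = [("Tmax", 2), ("Rad", 1), ("Q", 0), ("Q", 1), ("R", 0), ("R", 2)]


def build_samples(records, inputs, target="Q", horizon=1):
    """Pair lagged inputs at month t with the target at month t + horizon."""
    n = len(records[target])
    max_lag = max(lag for _, lag in inputs)
    samples = []
    # Start at max_lag so every lag exists; stop early so the target exists.
    for t in range(max_lag, n - horizon):
        features = {}
        for name, lag in inputs:
            label = f"{name}(t-{lag})" if lag else f"{name}(t)"
            features[label] = records[name][t - lag]
        samples.append((features, records[target][t + horizon]))
    return samples


samples = build_samples(records, SELECTED_INPUTS)
```

With seven months of records, a maximum lag of two, and a one-month horizon, four usable samples remain; the first pairs Q(t) = 3.8 with the target Q(t + 1) = 9.6.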

Table 1

Statistical summary of the data used

Variables | R(t−2) (mm) | R(t) (mm) | Q(t−1) (m3/s) | Q(t) (m3/s) | Rad(t−1) (cal/cm2) | Tmax(t−2) (°C) | Q(t+1) (m3/s)
Maximum | 101.20 | 101.34 | 23.200 | 23.200 | 609.9 | 35.000 | 23.200
Median | 20.90 | 20.97 | 1.570 | 1.570 | 372.3 | 18.600 | 1.570
Minimum | 0.00 | 0.00 | 0.149 | 0.149 | 130.2 | −1.400 | 0.149
Variance | 566.73 | 625.01 | 20.043 | 20.737 | 22242.7 | 114.271 | 20.915
St Dev | 23.81 | 25.00 | 4.477 | 4.554 | 149.1 | 10.690 | 4.573
SE Mean | 1.62 | 1.71 | 0.305 | 0.311 | 10.2 | 0.729 | 0.312
Mean | 25.85 | 26.67 | 3.623 | 3.683 | 382.9 | 18.223 | 3.713
Skewness | 0.86 | 0.93 | 2.20 | 2.14 | −0.02 | −0.05 | 2.11

In this study, PCA was employed not only for dimensionality reduction but also to enhance the performance of the GRC model. High-dimensional input spaces in hydrological modeling often lead to overfitting, reducing the generalizability of traditional models. By applying PCA, we identified the most informative features while maintaining computational efficiency within the GRC framework.

Unlike previous studies where PCA was primarily used as a preprocessing step for neural networks or regression models, this research integrates PCA within a rule-based GRC approach, representing an innovative application in hydrological modeling. The findings indicate that this integration not only improves predictive accuracy but also enhances model interpretability by filtering out noise from less relevant variables.
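As a rough illustration of how PCA loadings can indicate which candidate inputs carry the most shared variance, the sketch below standardizes a small table, forms its correlation matrix, and extracts only the leading principal component. It is a simplification under stated assumptions: the data are invented, the names are hypothetical, power iteration stands in for a full eigendecomposition, and ranking variables by their loading on the first component is only one possible selection criterion, not necessarily the one used by Noori et al. (2011b).

```python
import math


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
            for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def correlation_matrix(z_rows):
    """Covariance of standardized data, which equals the correlation matrix."""
    n, p = len(z_rows), len(z_rows[0])
    return [[sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_component(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector (loadings)."""
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in product))
        vec = [v / norm for v in product]
    # Rayleigh quotient of a unit vector estimates the eigenvalue.
    eigenvalue = sum(vec[i] * matrix[i][j] * vec[j]
                     for i in range(p) for j in range(p))
    return eigenvalue, vec


# Hypothetical candidates: x1 and x2 are strongly related, x3 is not.
names = ["x1", "x2", "x3"]
rows = [(1.0, 2.1, 5.0), (2.0, 3.9, 3.0), (3.0, 6.2, 4.0),
        (4.0, 7.8, 6.0), (5.0, 10.1, 2.0), (6.0, 12.0, 4.5)]

corr = correlation_matrix(standardize(rows))
eigenvalue, loadings = leading_component(corr)
explained_share = eigenvalue / len(names)  # the trace of a correlation matrix is p
ranked = sorted(zip(names, loadings), key=lambda item: abs(item[1]), reverse=True)
```

Here `ranked` orders the candidates by the magnitude of their loading, so the weakly related variable x3 falls last. A complete analysis would inspect several components and their cumulative explained variance before discarding any input.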

The values required for predicting future discharge are summarized in Table 1, which reports the maximum, median, minimum, variance, standard deviation, standard error of the mean, mean, and skewness of each variable. These values can be compared across time lags. Given these values for past precipitation, radiation, temperature, and discharge, the corresponding values for future discharge are obtained, which are needed for the calculations in subsequent stages.

Granular computing

GRC is a set of theories, methods, tools, and techniques that use granules (classes, groups, and clusters of the reference set) to solve problems. The GRC model forms information granules based on similarity (indiscernibility) and extracts classification rules from them (Yao & Zhong 2002). Its implementation consists of several key stages.

The first stage defines the problem space and collects the relevant data. The input and output variables are identified, and the necessary data are gathered from sources such as historical records, laboratory experiments, or simulations. The data are then preprocessed, which includes normalization, cleaning, and the removal of missing values.

Next, the data are granulated, that is, grouped into clusters or granules based on shared characteristics such as similarity, proximity, or degree of certainty. From this point on, the analysis works with information granules that represent subsets of the original data rather than with individual data points. The GRC model arranges all the information in the problem as objects with a specified number of attributes, one of which serves as the class identifier or output value. Rules are extracted from this sorted set of information (the information table) using GRC rule-extraction algorithms (Yao & Zhong 2002). In essence, the information table provides a descriptive representation of objects or patterns: each row corresponds to an object, and each column represents an attribute.

The relationships among granules are then analyzed. This step examines the interactions and dependencies among the granules, for example by identifying which granules have the greatest impact on the output or whether certain patterns recur across granules. Tools such as fuzzy logic and statistical methods can be employed to extract and interpret these relationships.

Finally, model performance is evaluated using metrics such as root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2). The information table is defined formally by the following equation (Yao & Zhong 2002).
$$S = \left(U,\ At,\ L,\ \{V_a \mid a \in At\},\ \{f_a \mid a \in At\}\right) \tag{1}$$
In Equation (1), U is a non-empty set of objects, At is a non-empty set of attributes, L is a linguistic term set of features for the set At, Va is a set of values for the features in At, and fa is an information function that, for each a ∈ At and x ∈ U, yields the value f(x, a) ∈ Va. Rules in this model take an if-then form. If an object has a specific set of attribute values from the information table in the form of a concept φ, and it also belongs to a class or has a specific value of the decision variable ψ, the relationship between the two is defined as follows: ‘If an object belongs to the concept φ, then it will have a value ψ for the decision variable’ (Yao & Zhong 2002). The GRC model performs multiple measurements on objects, concepts, and rules to extract relevant information. The parameters defined in this model for this purpose are generality, absolute support, and coverage, which are discussed below (Yao & Zhong 2002). These parameters are used to extract and prioritize rules. Rules with higher generality, absolute support, and coverage, along with lower inconsistency, are more suitable for subsequent predictions. Equation (2) defines generality, the degree to which a concept φ is represented among the available information. In effect, G(φ) indicates the relative size of the grain representing φ (Yao & Zhong 2002).
$$G(\varphi) = \frac{|m(\varphi)|}{|U|} \tag{2}$$
In this equation, |m(φ)| represents the size of the grain constituting φ, and |U| represents the size of the grain constituting the reference set. The absolute accuracy of the rule φ → ψ is its absolute support, expressed as the conditional probability P(ψ | φ) (Yao & Zhong 2002).
$$AS(\varphi \Rightarrow \psi) = \frac{|m(\varphi \wedge \psi)|}{|m(\varphi)|} \tag{3}$$
with |m(φ ∧ ψ)| indicating the size of the grain whose constituting elements simultaneously satisfy φ and ψ.
The coverage of ψ by φ signifies the likelihood of the rule being invoked and, at the same time, the accuracy level of the rule. In other words, it is the conditional probability that an object randomly selected within the range of ψ also falls within the range of φ. This measure is calculated with the following equation (Yao & Zhong 2002).
$$CV(\varphi \Rightarrow \psi) = \frac{|m(\varphi \wedge \psi)|}{|m(\psi)|} \tag{4}$$
Conditional inconsistency is calculated according to Equation (5). Some reasons for conditional inconsistency include a large volume of data, insufficient awareness of the accuracy of the information, and inconsistencies present in the opinions provided by researchers (Yao & Zhong 2002; Zhao et al. 2007).
$$H(\psi \mid \varphi) = -\sum_{i=1}^{n} P(\psi_i \mid \varphi)\,\log P(\psi_i \mid \varphi) \tag{5}$$

In Equation (5), P(ψi | φ) represents the true positive rate for the class ψi.
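The rule measures described above can be computed directly from an information table. The following sketch uses a small invented table with discretized attributes; the attribute names, the `low`/`high` granules, and the function names are hypothetical. Generality, absolute support, and coverage follow the definitions in the text, while the inconsistency measure is taken here to be the base-2 Shannon entropy of the class distribution inside the granule, which is an assumption about the form of Equation (5).

```python
import math

# Hypothetical information table: each row is an object, each key an attribute.
# "Q_next" plays the role of the decision (class) attribute.
table = [
    {"Q_t": "low",  "R_t": "low",  "Q_next": "low"},
    {"Q_t": "low",  "R_t": "low",  "Q_next": "low"},
    {"Q_t": "low",  "R_t": "high", "Q_next": "low"},
    {"Q_t": "low",  "R_t": "high", "Q_next": "high"},
    {"Q_t": "high", "R_t": "low",  "Q_next": "high"},
    {"Q_t": "high", "R_t": "high", "Q_next": "high"},
    {"Q_t": "high", "R_t": "high", "Q_next": "high"},
    {"Q_t": "high", "R_t": "low",  "Q_next": "low"},
]


def meaning_set(objects, concept):
    """Objects that satisfy every attribute=value pair of the concept."""
    return [o for o in objects if all(o[a] == v for a, v in concept.items())]


def generality(objects, phi):
    """Relative size of the granule of phi within the reference set."""
    return len(meaning_set(objects, phi)) / len(objects)


def absolute_support(objects, phi, psi):
    """Share of objects in the granule of phi that also satisfy psi."""
    m_phi = meaning_set(objects, phi)
    return len(meaning_set(m_phi, psi)) / len(m_phi) if m_phi else 0.0


def coverage(objects, phi, psi):
    """Share of objects in the granule of psi that also satisfy phi."""
    m_psi = meaning_set(objects, psi)
    return len(meaning_set(m_psi, phi)) / len(m_psi) if m_psi else 0.0


def conditional_entropy(objects, phi, decision):
    """Assumed inconsistency measure: entropy of classes inside the granule."""
    m_phi = meaning_set(objects, phi)
    entropy = 0.0
    for cls in {o[decision] for o in m_phi}:
        p = sum(o[decision] == cls for o in m_phi) / len(m_phi)
        entropy -= p * math.log2(p)
    return entropy


phi = {"Q_t": "low"}        # condition part of the rule
psi = {"Q_next": "low"}     # decision part of the rule
g = generality(table, phi)
support = absolute_support(table, phi, psi)
cov = coverage(table, phi, psi)
h = conditional_entropy(table, phi, "Q_next")
```

For the rule "if Q_t is low then Q_next is low", four of the eight objects match the condition and three of those four also match the decision, so the generality is 0.5 and the absolute support is 0.75.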

Figure 2 illustrates the steps of applying the GRC model to streamflow prediction.
Figure 2

GRC model flowchart for flow prediction (Noori et al. 2017a, b).


The extraction of patterns from the data in the GRC model involves three main components. In the first part, possible rules are extracted. In the second part, rules of higher quality are selected from the set extracted in the previous stage. In the third part, the rules are prioritized. The strength of the GRC model lies in the criteria used to determine and rank the quality of rules. These criteria are expressed in Equations (2)–(5). Rules with higher absolute support, higher coverage, and higher generality have higher confidence, while rules with higher conditional inconsistency have lower confidence. Absolute support ranges between zero and one, with values close to one being desirable. This metric indicates the degree of support the body of a rule provides for its output class. A set of rules reflects the level of support for a specific output class and also the support level for the selected set of rules from the set of classes or existing output values. Essentially, this parameter indicates the likelihood of a specific output occurring under specific conditions, and therefore it does not definitively assign an output to the input conditions. The coverage parameter expresses the support of an output class by the set of input values. It indicates the probability with which different rules will be invoked if a particular output class is chosen. Taken together, absolute support and coverage determine the strength of the two-way relationship between input values and a specific output class. The generality of a rule reflects how much of the training dataset the rule covers. The higher the generality of a rule, the more samples it applies to, and thus the higher its level of confidence. However, rules with very high generality often lack precise information, and they are screened out by also considering absolute support and coverage.

Conditional inconsistency measures the degree of inconsistency of a rule within the set of rules. According to this metric, the accuracy of a rule may be called into question in cases where the extracted rule, valid for a particular dataset, is considered an exception. Such a rule, labeled as an exception, lacks compatibility with the other extracted rules and its utilization may introduce ambiguity into the created model. This parameter is one of the strengths of the model in distinguishing a reliable and accurate model, not just one that is precise (Sheikhian et al. 2015).

The GRC algorithm first extracts possible rules from the data and then calculates the mentioned parameters for each. To isolate accurate and reliable rules from the extracted set of rules, the algorithm removes rules with an inconsistency greater than 0.5 and absolute support less than 0.7. Next, it ranks the rules based on higher values of generality and coverage. For predicting an output value for input data, the model utilizes one or more rules with higher rankings to ensure both prediction accuracy and precision at satisfactory levels (Sheikhian et al. 2017).
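The filtering and ranking step can be sketched as follows. The thresholds 0.5 and 0.7 come from the text; everything else is an assumption made for illustration: the rule values are invented, the `Rule` class and `select_rules` function are hypothetical, a rule is dropped if it fails either threshold, and ties in generality are broken by coverage.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    generality: float
    absolute_support: float
    coverage: float
    inconsistency: float


def select_rules(rules, max_inconsistency=0.5, min_support=0.7):
    """Drop unreliable rules, then rank the rest by generality and coverage."""
    kept = [r for r in rules
            if r.inconsistency <= max_inconsistency
            and r.absolute_support >= min_support]
    # Assumed ordering: generality first, coverage as the tie-breaker.
    return sorted(kept, key=lambda r: (r.generality, r.coverage), reverse=True)


# Hypothetical rules with invented measure values.
rules = [
    Rule("r1", generality=0.30, absolute_support=0.90, coverage=0.60, inconsistency=0.20),
    Rule("r2", generality=0.45, absolute_support=0.65, coverage=0.80, inconsistency=0.30),
    Rule("r3", generality=0.40, absolute_support=0.80, coverage=0.70, inconsistency=0.60),
    Rule("r4", generality=0.35, absolute_support=0.75, coverage=0.50, inconsistency=0.40),
]

ranked = select_rules(rules)
ranked_names = [r.name for r in ranked]
```

In this example r2 is removed for low absolute support and r3 for high inconsistency, leaving r4 ahead of r1 because of its higher generality.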

The GRC model utilized six input variables to predict future flow rates, with river discharge at time t + 1 serving as the output parameter. A total of 215 data points were processed, of which 145 (about 67% of the dataset, a nominal 70/30 split) were allocated for calibration and 70 (about 33%) were reserved for validation. The calibration phase involved tuning the GRC model parameters to optimize accuracy, while the validation phase tested the model's performance on unseen data.
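A minimal sketch of the calibration and validation split is given below. The counts 215, 145, and 70 are from the text; the assumption that the split is chronological (earliest records for calibration) is ours, since the text does not state how the points were assigned, and the function name is hypothetical.

```python
def split_calibration_validation(samples, n_calibration):
    """Split an ordered sample list into calibration and validation parts."""
    if not 0 < n_calibration < len(samples):
        raise ValueError("n_calibration must leave samples in both subsets")
    return samples[:n_calibration], samples[n_calibration:]


# Placeholder indices standing in for the 215 monthly samples.
all_samples = list(range(215))
calibration, validation = split_calibration_validation(all_samples, 145)
calibration_share = len(calibration) / len(all_samples)
```

With these counts the calibration share is 145/215, or roughly 0.67.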

The results demonstrated the GRC model's robust performance, with R2 of 0.93 and 0.94 for calibration and validation, respectively. Figure 3 and Figure 4 illustrate the observed versus predicted discharge values, showcasing the model's ability to closely replicate observed data. Furthermore, the model effectively captured both extremely high and low inflows, a critical capability for accurate hydrological modeling.
Figure 3

Relationship between observed and estimated discharge values during calibration.

Figure 4

Relationship between observed and estimated discharge values during validation.


The sensitivity analysis indicated that river discharge at time t, Q(t), was the most influential variable in predicting inflow at t + 1. This finding aligns with the well-established temporal dependency in river discharge, where past discharge levels strongly influence future inflow rates. Additionally, solar radiation, Rad(t − 1), and discharge at t − 1 emerged as critical predictors, underscoring the role of evaporation and groundwater contributions in hydrological processes.

Notably, precipitation at t − 2 had a relatively minor impact on inflow predictions. This suggests that the hydrological response to precipitation is primarily governed by immediate conditions rather than past rainfall patterns. The weaker influence of older precipitation data can likely be attributed to watershed storage effects, where infiltration and delayed runoff processes mitigate the direct impact of earlier rainfall events.

These results emphasize the importance of incorporating recent hydrological variables in inflow prediction models. They further suggest that approaches focusing on short-term dependencies may achieve greater predictive accuracy than those relying on extended precipitation trends.

In this article, three statistics (R2, MAE, and RMSE) were employed to assess the accuracy of the GRC model. The results for the calibration and validation stages are presented in Table 2 and Figure 5.
Table 2

Performance evaluation of the ANN, SVM, and GRC models: R2, RMSE, and MAE statistics

Statistical index | ANN calibration | ANN validation | SVM calibration | SVM validation | GRC calibration | GRC validation
R2 | 0.92 | 0.81 | 1.00 | 0.82 | 0.93 | 0.94
RMSE (m3/s) | 1.36 | 1.69 | 0.04 | 1.69 | 1.33 | 0.87
MAE (m3/s) | 0.89 | 1.05 | 0.03 | 0.99 | 0.87 | 0.54
Figure 5

Heatmap of performance evaluation metrics for SVM, ANN, and GRC methods in the calibration and validation steps.
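For reference, the three statistics reported in Table 2 can be computed as in the sketch below. The observed and predicted values are invented for illustration, and R2 is implemented here as one minus the ratio of residual to total sum of squares; that choice is an assumption, since some hydrological studies report the squared Pearson correlation instead.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical discharge values (m3/s), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

scores = {
    "R2": r_squared(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
}
```

For these four points the residual sum of squares is 0.5 and the total sum of squares is 5, giving R2 = 0.9, RMSE ≈ 0.354, and MAE = 0.25.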

To compare the accuracy of the model developed in this study with earlier work, the results of the SVM and ANN models of Noori et al. (2009, 2011b) are presented alongside the results of the GRC model in Figure 6.
Figure 6

Comparison of R2 and RMSE values for three SVM, ANN, and GRC methods in the calibration and validation steps.

Figure 7

Residual plot (differences between observed and estimated discharge values) versus the estimated discharge for GRC methods.


As evident in Figure 6, the R2 of the SVM method reaches 1 in the calibration stage but falls to 0.82 in the validation stage. For the ANN method, R2 is 0.92 in calibration and 0.81 in validation. For the GRC method, R2 is 0.93 in calibration and 0.94 in validation. This comparison indicates that the GRC method performs more consistently across calibration and validation and is, on average, closer to the observations than the other two methods. Excessive training of the SVM method reduced its MAE to nearly zero (0.03) in the calibration stage, but in the validation stage the MAE rises to 0.99. For the ANN method, the MAE is 0.89 in calibration and 1.05 in validation. For the GRC method, the MAE values in calibration and validation are 0.87 and 0.54, respectively. Here again, the GRC method performs considerably better than the other methods in the validation stage.

Similarly, in Figure 5 the RMSE of the SVM method is close to zero (0.04) in the calibration stage and 1.69 in the validation stage. For the ANN method, the RMSE is 1.36 in calibration and 1.69 in validation (Table 2). For the GRC method, the RMSE values for calibration and validation are 1.33 and 0.87, respectively. Thus, once again, the GRC method outperforms the other two methods in validation. Overall, the SVM model performs weakly when faced with new data. In addition, the difference between the calibration and validation stages is smaller for the GRC model than for the other methods, which indicates that the GRC model performs well and is not heavily dependent on the training data. The superior overall performance of the GRC model compared to the SVM and ANN models is attributed to its consistent performance in both stages, its lack of strong dependence on training data, and the absence of black-box behavior, which leads to reduced uncertainty.

Unlike ANN and SVM, the GRC model offers interpretable rules that provide insights into the underlying relationships between input and output variables. This transparency is particularly advantageous for water resource management, as it enables decision-makers to understand and trust the model's predictions. The model's robustness and interpretability make it well-suited for practical applications, including flood forecasting, reservoir management, and climate adaptation strategies.

Based on the residual plot (Figure 7), the model demonstrates good performance for most predicted values, with no systematic pattern observed in the errors. However, the high concentration of points at lower Q(Est) values suggests that the model may have more data points in this range, with residuals close to zero. Additionally, the random dispersion of residuals indicates the model's overall adequacy.

Evaluation of model accuracy at extreme flow values

The results of comparing the performance of the three models – GRC, ANN, and SVM – in predicting streamflow under extreme conditions (Figure 8) demonstrate that the GRC model consistently outperforms the other two approaches. The evaluation of the R2 reveals that GRC achieves values of 0.41 for low extremes and 0.84 for high extremes, indicating its superior ability to capture critical fluctuations in streamflow behavior. In contrast, ANN and SVM exhibit lower accuracy, particularly in predicting low extreme values, where ANN yields an R2 of only 0.02, highlighting its inability to generalize effectively under sparse and volatile data conditions.
Figure 8

Comparative evaluation of GRC, ANN, and SVM Models for extreme streamflow prediction.

Complementary error metrics, including RMSE and MAE, further confirm the advantage of GRC. For low extreme flows, GRC records the lowest RMSE (0.51) and MAE (0.38), while maintaining superior performance for high extremes with RMSE and MAE values of 1.87 and 1.16, respectively. These quantitative results underscore the robustness of the GRC model in modeling nonlinear and unstable streamflow dynamics.
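
The evaluation on extreme flows can be illustrated with a short Python sketch. It assumes that low and high extremes are selected by percentile thresholds on the observed series. The 10th and 90th percentiles used here are hypothetical defaults chosen for illustration, and all function names are illustrative rather than taken from the study.

```python
import math

def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def rmse(obs, est):
    """Root mean square error."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))

def mae(obs, est):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)

def extreme_scores(obs, est, low_q=10.0, high_q=90.0):
    """RMSE and MAE on the low and high tails of the observed flows.

    low_q and high_q are hypothetical thresholds, not values from the study.
    """
    low_thr = percentile(obs, low_q)
    high_thr = percentile(obs, high_q)
    subsets = {
        "low": [(o, e) for o, e in zip(obs, est) if o <= low_thr],
        "high": [(o, e) for o, e in zip(obs, est) if o >= high_thr],
    }
    scores = {}
    for name, pairs in subsets.items():
        sub_obs = [p[0] for p in pairs]
        sub_est = [p[1] for p in pairs]
        scores[name] = {
            "n": len(pairs),
            "rmse": rmse(sub_obs, sub_est),
            "mae": mae(sub_obs, sub_est),
        }
    return scores
```

Calling `extreme_scores(observed, estimated)` once per model gives a table of tail errors that can be compared across GRC, ANN, and SVM. Because the tails contain few samples, the scores are sensitive to the chosen thresholds.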

The observed superiority of the GRC model can be attributed to its rule-based structure, which enables it to decompose the input space into multiple localized granules. This granulation process allows the model to capture subtle variations and nonlinear dependencies within the data – particularly critical in extreme hydrological conditions where conventional global modeling approaches often fail. Moreover, GRC's flexibility in rule generation and selection promotes adaptability to data uncertainty and sparsity, which are common challenges in hydrological modeling. This local learning capability contrasts with the ANN and SVM models, which rely more heavily on global approximations and often struggle with overfitting or poor generalization in edge-case scenarios.

Therefore, incorporating GRC techniques not only enhances predictive accuracy but also provides greater model stability and interpretability, especially under extreme flow conditions. These results suggest that GRC offers a reliable and effective framework for hydrological forecasting applications, including flood and drought risk assessment.

Interpretability of the GRC model

A key advantage of the GRC model over black-box approaches such as ANNs and SVMs is its interpretability. ANNs and SVMs rely on intricate mathematical transformations and weight distributions that are often difficult to interpret, whereas the GRC model organizes data into meaningful granules. Each granule represents a subset of similar data points, facilitating a clearer understanding of how input variables influence predictions.

In this study, the GRC model identified distinct granules based on key input variables, including river discharge at time t, solar radiation, and precipitation levels. These granules were then utilized to generate if-then rules, enhancing the model's transparency. For instance, a rule such as:

‘If precipitation at t − 1 is high and discharge at t exceeds the median threshold, then inflow at t + 1 is likely to be high’

provides an explicit reasoning framework that can be easily interpreted by hydrologists and decision-makers. This level of transparency is particularly valuable in water resource management, as it allows stakeholders to trust the model's predictions while gaining insights into the underlying hydrological processes.
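
To make the structure of such a rule concrete, the following Python sketch encodes the example rule as a small data object. It is an illustration under stated assumptions, not the rule-extraction procedure of the GRC model: the cut-off for "high" precipitation (the value at the 75% position of the sorted record) is hypothetical, and the class and function names are invented for this example.

```python
from dataclasses import dataclass
from statistics import median

@dataclass(frozen=True)
class InflowRule:
    """If-then rule: high precipitation at t-1 and above-median discharge at t."""
    description: str
    precip_high_threshold: float  # hypothetical cut-off for "high" precipitation
    discharge_median: float       # median discharge from the calibration record

    def fires(self, precip_prev, discharge_now):
        """Return True when both antecedent conditions hold."""
        return (
            precip_prev >= self.precip_high_threshold
            and discharge_now > self.discharge_median
        )

def build_rule(precip_history, discharge_history, high_fraction=0.75):
    """Derive the rule thresholds from historical records.

    high_fraction is an assumed definition of "high" precipitation.
    """
    ordered = sorted(precip_history)
    idx = min(len(ordered) - 1, int(high_fraction * len(ordered)))
    return InflowRule(
        description=(
            "If precipitation at t-1 is high and discharge at t exceeds the "
            "median threshold, then inflow at t+1 is likely to be high"
        ),
        precip_high_threshold=ordered[idx],
        discharge_median=median(discharge_history),
    )
```

Because the thresholds are stored as plain attributes, a hydrologist can inspect exactly which values trigger the rule, which is the kind of transparency described above.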

Effective water resource management is becoming increasingly critical in addressing challenges related to water scarcity, especially in arid and semi-arid regions such as Iran. Accurate prediction of river flow is a key component in ensuring optimal allocation, operational planning, and disaster mitigation. In this study, the GRC approach was used to model monthly inflows to the Alavian Dam. PCA was applied to identify the most influential input variables, reducing the data dimensionality and enhancing model efficiency without compromising accuracy. The GRC model achieved high predictive performance, with R2 values of 0.93 and 0.94 during calibration and validation, respectively, outperforming both ANN and SVM models.

Importantly, the GRC model demonstrated a superior ability to predict extreme flow conditions. Unlike ANN and SVM, which often struggle with data sparsity or overfitting in such cases, GRC's rule-based and granule-oriented structure allowed it to effectively model localized and nonlinear relationships. This makes it particularly valuable for flood and drought forecasting, where capturing rare events is crucial. The model's transparency and interpretability – through explicit rule formulation – also offer practical advantages for operational use. Water managers and policymakers can gain insight not only into the outputs but also into the driving input variables, enabling informed and adaptive decision-making.

Looking ahead, integrating GRC with optimization algorithms (e.g., genetic algorithms (GA) and particle swarm optimization (PSO)) and real-time monitoring systems could further enhance its adaptability to changing hydrological patterns. Additionally, GRC's potential for integration with climate models positions it as a promising tool for long-term water resource planning under climate change scenarios. Overall, the findings suggest that GRC offers a balanced solution combining accuracy, efficiency, and interpretability, and can play a pivotal role in sustainable water management and hydroclimatic resilience planning.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Alisoltani T., Shafiepour Motlagh M. & Ashrafi K. (2024) Concurrent heat stress and air pollution episodes by considering future projection of climate change, Scientific Reports, 14 (1), 29301.
Anusree K. & Varghese K. O. (2016) Streamflow prediction of Karuvannur River basin using ANFIS, ANN and MNLR models, Procedia Technology, 24, 101–108. https://doi.org/10.1016/j.protcy.2016.05.015
Asefa T., Kemblowski M., McKee M. & Khalil A. (2005) Multi-time scale stream flow predictions: the support vector machines approach, Journal of Hydrology, 318 (1–4), 7–16. https://doi.org/10.1016/j.jhydrol.2005.06.001
Ben-Gal I., Dana A., Shkolnik N. & Singer G. (2014) Efficient construction of decision trees by the dual information distance method, Quality Technology & Quantitative Management, 11 (1), 133–147. https://doi.org/10.1080/16843703.2014.11673330
Coulibaly P., Anctil F. & Bobee B. (2000) Daily reservoir inflow forecasting using artificial neural networks with stopped training approach, Journal of Hydrology, 230 (3–4), 244–257.
Fallah Kalaki M., Delavar M., Farokhnia A., Morid S., Shokri Kuchak V., Hajihosseini H., Shahbazi A., Nourmohammadi F., Motamedi A. & Eini M. R. (2025) An ensemble multi-model approach for long-term river flow forecasting in managed basins of the Middle East: insights from the Karkheh river basin, Journal of Hydrology, 654, 132846.
Farokhnia A. & Morid S. (2010) Uncertainty analysis of artificial neural networks and neuro-fuzzy models in river flow forecasting, Iran-Water Resources Research, 5 (3), 14–27.
Ghiasi B., Sheikhian H., Zeynolabedin A. & Niksokhan M. H. (2019) Granular computing–neural network model for prediction of longitudinal dispersion coefficients in rivers, Water Science and Technology, 80 (10), 1880–1892. https://doi.org/10.2166/wst.2020.006
Ghiasi B., Noori R., Sheikhian H., Zeynolabedin A., Sun Y., Jun C., Hamouda M., Bateni S. M. & Abolfathi S. (2022) Uncertainty quantification of granular computing-neural network model for prediction of pollutant longitudinal dispersion coefficient in aquatic streams, Scientific Reports, 12 (1), 4610. https://doi.org/10.1038/s41598-022-08417-4
Heshmati S., Nazari B. & Nikoo M. R. (2025) Enhancing accuracy in streamflow prediction under climate change scenarios based on an integrated machine learning–metaheuristic optimization approach, Journal of Water and Climate Change, 16 (2), 456–473. https://doi.org/10.2166/wcc.2025.499
Jodhani K. H., Patel D., Madhavan N. & Singh S. K. (2023a) Soil erosion assessment by RUSLE, Google Earth Engine, and geospatial techniques over Rel river watershed, Gujarat, India, Water Conservation Science and Engineering, 8 (1), 49.
Jodhani K. H., Patel D. & Madhavan N. (2023b) A review on analysis of flood modelling using different numerical models, Materials Today: Proceedings, 80, 3867–3876. https://doi.org/10.1016/j.matpr.2021.07.405
Jodhani K. H., Patel D., Madhavan N., Gupta N., Singh S. K. & Rathnayake U. (2024) Unravelling flood risk in the Rel river watershed, Gujarat using coupled earth observations, multi criteria decision making and Google Earth Engine, Results in Engineering, 24, 102836.
Maimon O. Z. & Rokach L. (2008) Data mining with decision trees: theory and applications, World Scientific, 69, 1–21.
Mazandarani Zadeh H., Fallah Kalaki M. & Azizian A. (2023) Simulation of the effects of climate change on runoff using artificial neural network models and adaptive fuzzy neural inference system (Case study: Tashk-Bakhtegan basin), Iran-Water Resources Research, 18 (4), 1–18.
Naghikhani A., Sheikhian H., Noori R. & Ghiasi B. (2015) Estimating the scour hole of Ski jumps at downstream of Dam dimensions using granular computing model, Journal of Hydraulic Engineering, 9 (3), 45–60.
Nath A., Mthethwa F. & Saha G. (2020) Runoff estimation using modified adaptive neuro-fuzzy inference system, Environmental Engineering Research, 25 (4), 545–553. https://doi.org/10.4491/eer.2019.166
Nayak P. C., Sudheer K. P., Rangan D. M. & Ramasastri K. S. (2004) A neuro-fuzzy computing technique for modeling hydrological time series, Journal of Hydrology, 291 (1), 52–66. https://doi.org/10.1016/j.jhydrol.2003.12.010
Noori R., Farokhnia A., Morid S. & Riahi Madvar H. (2009) Effect of input variables preprocessing in artificial neural network on monthly flow prediction by PCA and wavelet transformation, Journal of Water and Wastewater, 1, 13–22.
Noori R., Khakpour A., Omidvar B. & Farokhnia A. (2010) Comparison of ANN and principal component analysis-multivariate linear regression models for predicting the river flow based on developed discrepancy ratio statistic, Expert Systems with Applications, 37 (8), 5856–5862. https://doi.org/10.1016/j.eswa.2010.02.020
Noori R., Khakpour A., Dehghani M. & Farokhnia A. (2011a) Monthly stream flow prediction using support vector machine based on principal component analysis, Journal of Water and Wastewater, 80, 118–123.
Noori R., Karbassi A. R., Moghaddamnia A., Han D., Zokaei-Ashtiani M. H., Farokhnia A. & Ghafari-Goushesh M. (2011b) Assessment of input variables determination on the SVM model performance using PCA, gamma test, and forward selection techniques for monthly stream flow prediction, Journal of Hydrology, 401 (3–4), 177–189. https://doi.org/10.1016/j.jhydrol.2011.02.021
Noori R., Ghiasi B., Sheikhian H. & Adamowski J. F. (2017a) Estimation of the dispersion coefficient in natural rivers using a granular computing model, Journal of Hydraulic Engineering, 143 (5), 04017001. https://doi.org/10.1061/(ASCE)HY.1943-7900.0001276
Noori R., Sheikhian H., Hooshyaripor F., Naghikhani A., Adamowski J. F. & Ghiasi B. (2017b) Granular computing for prediction of scour below spillways, Water Resources Management, 31 (1), 313–326. https://doi.org/10.1007/s11269-016-1526-0
Roger J. S. & Gulley N. (1995) Fuzzy Logic Toolbox. For use with MATLAB. Natick, MA: The MathWorks.
Sheikhian H., Delavar M. R. & Stein A. (2015) Integrated estimation of seismic physical vulnerability of Tehran using rule based granular computing, The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 40 (3), 187. https://doi.org/10.5194/isprsarchives-XL-3-W3-187-2015
Sheikhian H., Delavar M. R. & Stein A. (2017) A GIS-based multi-criteria seismic vulnerability assessment using the integration of granular computing rule extraction and artificial neural networks, Transactions in GIS, 21 (6), 1237–1259. https://doi.org/10.1111/tgis.12274
Vyas U., Patel D., Vakharia V. & Jodhani K. H. (2024) Integrating GEE and IWQI for sustainable irrigation: a geospatial water quality assessment, Groundwater for Sustainable Development, 27, 101332.
Yao Y. & Zhong N. (2002) Granular computing using information tables. In: Lin, T. Y., Yao, Y. Y. & Zadeh, L. A. (eds.) Data Mining, Rough Sets, and Granular Computing, 02–124. Cham, Switzerland: Springer. https://doi.org/10.1007/978-3-7908-1791-1_5
Yu P. S., Chen S. T. & Chang I. F. (2006) Support vector regression for real-time flood stage forecasting, Journal of Hydrology, 328 (3–4), 704–716. https://doi.org/10.1016/j.jhydrol.2006.01.021
Zhao Y., Yao Y. Y. & Yan M. (2007) ICS: an interactive classification system, Proceedings of the 20th Canadian Conference on Artificial Intelligence (CAI'07), pp. 134–145. https://doi.org/10.1007/978-3-540-72665-4_12

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).