## Abstract

Successful application of one-dimensional advection–dispersion models in rivers depends on the accuracy of the longitudinal dispersion coefficient (LDC). In this regard, this study aims to introduce an appropriate approach to estimate LDC in natural rivers based on a hybrid method of granular computing (GRC) and an artificial neural network (ANN) model (GRC-ANN). Adaptive neuro-fuzzy inference system (ANFIS) and ANN models were also developed to investigate the accuracy of three credible artificial intelligence (AI) models and their performance over different LDC ranges. Comparison with empirical models developed in other studies revealed the superior performance of GRC-ANN for LDC estimation. Sensitivity analysis of the three intelligent models developed in this study was carried out to determine the sensitivity of each model to its input parameters, especially the most important ones. The sensitivity analysis results showed that the W/H parameter (W: channel width; H: flow depth) has the most significant impact on the output of all three models in this research.

## INTRODUCTION

Dispersion is a dominant process for mixing pollutants in narrow water bodies such as natural rivers. In natural rivers, whose width and depth are small compared to their length, hydrodynamic and water quality variations usually occur along the length of the river, except in near-field regions close to the pollution source. In this regard, flow and mixing in rivers in the far field (after cross-sectional mixing) can be considered one-dimensional. Therefore, for river water quality modelling in the far field, lateral and vertical mixing are less relevant than longitudinal dispersion (Deng *et al.* 2001). The lateral difference of longitudinal velocity causes longitudinal dispersion, and the intensity of this mixing is determined by the longitudinal dispersion coefficient (LDC) (Seo & Cheong 1998). Due to the importance of LDC, many studies have been devoted to this parameter.

Many studies focusing on the estimation of LDC have led to different solutions, which are based on integral methods (Fischer 1967; Fischer *et al.* 1979; Deng *et al.* 2001), empirical formulae (Seo & Cheong 1998; Kashefipour & Falconer 2002; Azamathulla & Ghani 2011; Zeng & Huai 2014; Noori *et al.* 2017a), dye tracing measurements (Clark *et al.* 1996), and also more sophisticated methods based on artificial intelligence (AI) (Riahi-Madvar *et al.* 2009; Sahay & Dutta 2009; Parsaie & Haghiabi 2015; Noori *et al.* 2017b; Balf *et al.* 2018).

Generally, the estimation of LDC in natural rivers is very complicated due to the high variability of the hydraulic and geometric characteristics that significantly affect this parameter. Application of the basic method proposed by Fischer *et al.* (1979) to determine LDC requires a detailed transverse profile of velocity and river geometry; otherwise, the lateral profile of longitudinal velocity must be assumed (Zeng & Huai 2014). Also, many researchers have demonstrated that some tracers (e.g., chemical or fluorescent tracers) are non-conservative in natural waters (Smart & Laidlaw 1977). The toxicity of by-products and the high cost of soluble tracers are other flaws related to the application of dye tracing measurements (Clark *et al.* 1996). These shortcomings have limited the use of integral, empirical and dye-tracer methods for LDC determination in rivers. Regarding AI methods, although they have recorded acceptable performance for LDC estimation, their uncertainty can limit the application of the results for water quality management and control strategies in rivers (Noori *et al.* 2015).

In this study, considering the disadvantages and limitations of proposed methods for the prediction of LDC, an alternative approach was introduced that is based on a hybrid method of the granular computing (GRC) approach and an artificial neural network (ANN) model. GRC is a general theory of information granulation and processing, acting upon different levels of information discretization (Bargiela & Pedrycz 2003). Yao (2001) defined the GRC as a natural data-mining tool to solve real-world problems by analyzing, presenting and decision-making based on available information about the problem. The GRC model was first introduced by Zadeh (1996), focusing on interval-based information granulation and was developed to conceptualize granulation by Lin *et al.* (2002), which operates based on the information table. Pedrycz *et al.* (2008) discussed different combinations of fuzzy sets and GRC, proving that fuzzy sets can be applied to the presentation of GRC outputs, or as the input granules of the GRC model. Qian *et al.* (2010) extended Pawlak's rough set model (Pawlak 1982) to a multi-granulation rough set model, where granules are extracted in a hierarchical structure. The GRC approach has been applied in the fields of computer science and engineering such as machine learning (Yao & Yao 2002a, 2002b), knowledge reduction (Wu *et al.* 2009), intelligent data analysis (Chen & Yao 2006) and intelligent social networks (Yager 2008). However, different types of GRC have been proposed to improve its capabilities including GRC-based interval analysis (Moore 1966), integration of rough set theory and the GRC approach (Pawlak 1982) and embedding fuzzy sets into GRC (Zadeh 1997). Integration of GRC with interval analysis and rough set theory is mostly used for simplifying the problem, which is not the matter of interest in this research, and the theory of fuzzy-granulation is a more advanced and flexible method. 
However, it suffers from a lack of transparency and validity of the results. To overcome the limitations of the mentioned methods, the integration of ANN and GRC (GRC-ANN) approaches is proposed here. This approach provides a model with unique capabilities in comparison to fuzzy, rough set theory and interval information granulations. Reliability, transparency, efficiency, and robustness can be considered the advantages of applying the GRC-ANN approach, which can lead to more reliable and accurate estimation of LDC.

The main goal of this study is to provide a new model to accurately estimate the LDC with the GRC-ANN model and to compare the GRC-ANN, ANN and adaptive neuro-fuzzy inference system (ANFIS) models with models from other studies. Finally, sensitivity analysis is performed to check the models' performance and how the input parameters affect the results.

## METHODOLOGY

### GRC-ANN model

In the GRC approach, the available information about the problem is organized in an information table S = (*U*, *At*, *L*, {*V_{a}* | *a* ∈ *At*}, {*I_{a}* | *a* ∈ *At*}), where:

- *U* is the universe, i.e., the whole set of objects used to solve the problem;
- *At* is a set of attributes for the mentioned objects;
- *L* is a language of defining object attributes;
- *V_{a}* is the set of valid values for the attribute *a*; and
- *I_{a}* is the function relating the objects to their attributes.

In the GRC approach, for prediction, a set of rules is extracted in the form of IF-THEN: ‘if an object satisfies *φ*, then the object satisfies *ψ*’, where the concepts *φ* and *ψ* are sets of attribute values for a set of objects and the assigned output value, respectively. In the process of rule extraction, the GRC algorithm applies measurements on granules formed in the procedure in order to select the best set of possible rules and to prioritize the rules. Generality (*G*), absolute support (*AS*), coverage (*CV*) and conditional entropy (*CE*) are the four parameters used to extract the optimum rules and to construct the best model.

The contingency table in Table 1 represents the quantitative information about the rule *φ* ⇒ *ψ*. The values in the four cells are not independent. They are linked by the constraint a + b + c + d = n. The 2 × 2 contingency table has been used by many authors for representing information of rules (Gaines 1991; Silverstein *et al.* 1998).

| | ψ | ¬ψ | Totals |
|---|---|---|---|
| φ | \|m(φ) ∩ m(ψ)\| = a | \|m(φ) ∩ m(¬ψ)\| = b | \|m(φ)\| |
| ¬φ | \|m(¬φ) ∩ m(ψ)\| = c | \|m(¬φ) ∩ m(¬ψ)\| = d | \|m(¬φ)\| |
| Totals | \|m(ψ)\| | \|m(¬ψ)\| | \|U\| |


A concept is more general if it covers more instances of the universe; generality is defined as *G*(*φ*) = |*m*(*φ*)|/|*U*|, where *m*(*φ*) is the set of objects satisfying *φ*. If *G*(*φ*) = *α*, then (100*α*)% of objects in *U* satisfy *φ*. The quantity may be viewed as the probability of a randomly selected element satisfying *φ*. Obviously, 0 ≤ *G*(*φ*) ≤ 1.

*AS*, which is the conditional probability that a randomly selected object satisfies both *φ* and *ψ* given that it satisfies *φ* (Yao & Yao 2002a, 2002b), can be achieved through Equation (3): *AS*(*ψ*|*φ*) = |*m*(*φ*) ∩ *m*(*ψ*)|/|*m*(*φ*)|.

It may be interpreted as the degree to which *φ* implies *ψ*. If *AS*(*ψ*|*φ*) = *α*, then (100*α*)% of objects satisfying *φ* also satisfy *ψ*. It is in fact the conditional probability of a randomly selected element satisfying *ψ* given that the element satisfies *φ*. In set-theoretic terms, it is the degree to which *m*(*φ*) is included in *m*(*ψ*). Clearly, *AS*(*ψ*|*φ*) = 1, if and only if *m*(*φ*) ⊆ *m*(*ψ*).

*CE*, represented by *H*(*ψ*|*φ*), reveals the uncertainty of the formula *ψ* given the formula *φ* and is defined by Equation (4) (Yao & Yao 2002a, 2002b): *H*(*ψ*|*φ*) = −Σ_{i} *P*(*ψ_{i}*|*φ*) log *P*(*ψ_{i}*|*φ*), where *P*(*ψ_{i}*|*φ*) = |*m*(*φ*) ∩ *m*(*ψ_{i}*)|/|*m*(*φ*)| and the sum runs over the output classes *ψ_{i}*.

*CV* denotes the conditional probability of a randomly selected object satisfying *φ* given that it satisfies *ψ* (Yao & Yao 2002a, 2002b): *CV*(*ψ*|*φ*) = |*m*(*φ*) ∩ *m*(*ψ*)|/|*m*(*ψ*)|.

Unlike the absolute support, the change of support, *CS*(*ψ*|*φ*) = *AS*(*ψ*|*φ*) − *G*(*ψ*), varies from −1 to 1. One may consider *G*(*ψ*) to be the prior probability of *ψ* and *AS*(*ψ*|*φ*) the posterior probability of *ψ* after knowing *φ*. The difference of posterior and prior probabilities represents the change of our confidence regarding whether *φ* actually confirms *ψ*. For a positive value, one may say that *φ* confirms *ψ*; for a negative value, one may say that *φ* does not confirm *ψ*.

Rule extraction starts with extracting and grouping instances possessing the same set of attribute values in the information table to construct concepts. Decision attributes are then grouped in the form of concepts showing the output classes. GRC applies the same measures to construct high-quality classification rules in the form of IF-THEN statements (Yao 2001). The GRC approach extracts the rules from the dataset based on *CE* and *AS* so that rules with minimum *CE* and maximum *AS* values are extracted. To form a granular decision tree, the priority of rules in the tree is determined based on higher *G* and *CV*. The rule extraction procedure using GRC for LDC modelling has been illustrated in the first section of Figure 1.
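The four measures above can be sketched in a few lines. This is an illustrative toy example, not the authors' implementation: the function name and the binary (ψ/¬ψ) form of the conditional entropy are assumptions; the paper's Equation (4) may sum over all output classes.

```python
import math

def rule_measures(satisfies_phi, satisfies_psi):
    """GRC rule-quality measures for a rule phi => psi.

    satisfies_phi / satisfies_psi: 0/1 lists over the universe U marking
    which objects satisfy the antecedent phi and the consequent psi.
    """
    n = len(satisfies_phi)
    m_phi = sum(satisfies_phi)                    # |m(phi)|
    m_psi = sum(satisfies_psi)                    # |m(psi)|
    m_both = sum(p and q for p, q in zip(satisfies_phi, satisfies_psi))

    G = m_phi / n        # generality: P(phi)
    AS = m_both / m_phi  # absolute support: P(psi | phi)
    CV = m_both / m_psi  # coverage: P(phi | psi)

    # binary conditional entropy of psi given phi (illustrative choice)
    CE = 0.0
    for p in (AS, 1 - AS):
        if p > 0:
            CE -= p * math.log2(p)
    return G, AS, CV, CE

# toy universe of 10 objects
phi = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
psi = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
G, AS, CV, CE = rule_measures(phi, psi)  # G=0.4, AS=0.75, CV=0.75
```

A rule is desirable when *AS*, *CV* and *G* are high and *CE* is low.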

This paper proposes an integrated model of GRC rule generation and ANN (the GRC-ANN model). It considers the relationships among the rules extracted by the GRC rule extraction algorithm of the previous section: the group of rules satisfied by an input pattern is applied, ranked by the rule-importance measurements undertaken by the GRC. This approach allows the ANN to use the mentioned rule quality parameters to construct the approximator structure, instead of a time-consuming iterative learning procedure.

To integrate GRC and ANN, existing relations in the data are extracted in the form of classification rules. A classifier network is then constructed from the rules considering their quality parameters computed during the rule extraction procedure. In this way, the network is constructed using the rule quality parameters.

The GRC-ANN classifier addresses the drawbacks and limitations of both granular computing and neural network methods. To do so, it embeds the rules extracted by granular computing into a feed-forward multi-layer ANN structure. This contributes significantly to the transparency of the resulting structure: layers, neurons, and connection weights are all accompanied by physical descriptions. The network uses the information provided by a set of rules to decide about the presented pattern and takes advantage of all available information; in this way, no input pattern remains unclassified, and learning is done by granular computing rather than by the network itself. The constructed network does not contain any connection or node without a clear description, unlike conventional neural networks, whose hidden neurons and connection weights are obtained from a black-box learning algorithm (Sheikhian *et al.* 2017). The proposed method gives an exact and robust prediction, whereas neural networks employ a learning algorithm that relies on the initial weights, yielding a different network structure after each learning run. In the proposed model, the number of inputs, outputs and hidden neurons and the pattern of the connections are determined automatically from the dataset.

The second section of Figure 1 presents the suggested algorithm for integrating GRC rule generation with the ANN model.

The quality of prediction depends on the rule set quality and accuracy, including both test dataset adequacy and quality, and rule extraction algorithm reliability and robustness. The structure of the proposed procedure is illustrated in Figure 1.

The ANN used in this paper comprises four layers integrated with rule parameters, including input layer, pattern layer, rule firing layer and output layer (Figure 2). The number of nodes used in the input layer equals the attributes of data records. Pattern layer nodes contain normalized and quantized valid values of the criteria in the input layer. The classification rules are embedded in the rule-firing layer and the output node is one node which contains the value of the output parameter.

The links between the pattern layer and the rule-firing layer are defined by the *CE* of the rules, which shows the degree of certainty in the utilized rules. The links between the rule-firing layer and the output node are defined by the *AS* of the rules, which shows the degree of confidence provided by the supporting rules for the output class or value. The outputs of the rule-firing layer are used to compute the overall output of the network as an *AS*-weighted combination of the fired rules (Equation (6)), where LDC is the predicted value for the input pattern, *AS_{i}* is the support of the *i*th fired rule, used as the weight of the connection between the rule-firing layer and the output layer, *k* is the number of rules fired and *O_{i}* is the output of the *i*th rule-firing layer node, obtained through Equation (7). In Equation (7), *CE_{i}* is the weight of the link entering the *i*th node in the rule-firing layer, which is the *CE* of the rule corresponding to that node, and *w_{ij}* is the weight of the connection between the *i*th node in the input layer and the *j*th node in the pattern layer, which is Boolean: *w_{ij}* is unity if the *j*th node in the pattern layer is a valid interval for the value of the *i*th node in the input layer, and zero otherwise. Finally, *R_{i}* is the output of the *i*th rule fired by the presented pattern of the dataset, from which the final output value is obtained.

In the process of prediction, the set of rules satisfied by an input pattern of the data is triggered, and these rules are then grouped according to their output partition classes. For each output partition class, an activation strength value is calculated based on its triggered rules and their parameters computed by GRC. The output is the average of the centres of the partitions with maximum activation values. In this way, no object remains unclassified. The quality of the classification depends upon the accuracy and adequacy of the extracted and pruned rule set, the quality of the dataset, and the reliability and accuracy of the rule extraction method.

Considering the provided description, it is evident that the network is independent of the number of input criteria and the number of extracted rules.
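The prediction step can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the `Rule` container and the activation formula `AS · (1 − CE)` are assumptions; the two sample rules reuse the intervals of Table 2, and `None` stands in for the full model's guarantee that every pattern is covered.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    intervals: dict   # input name -> (low, high) antecedent interval
    out_class: tuple  # (low, high) output partition
    AS: float         # absolute support (rule-firing -> output weight)
    CE: float         # conditional entropy (pattern -> rule-firing weight)

def predict(rules, pattern):
    """Fire the rules satisfied by the input pattern, accumulate an
    activation strength per output partition, and return the average
    centre of the partition(s) with maximum activation."""
    activation = {}
    for r in rules:
        fired = all(lo <= pattern[k] <= hi for k, (lo, hi) in r.intervals.items())
        if fired:
            # illustrative activation: support weighted by rule certainty
            activation[r.out_class] = activation.get(r.out_class, 0.0) + r.AS * (1 - r.CE)
    if not activation:
        return None  # the full model guarantees coverage; sketch does not
    best = max(activation.values())
    winners = [cls for cls, a in activation.items() if a == best]
    centres = [(lo + hi) / 2 for lo, hi in winners]
    return sum(centres) / len(centres)

# two rules taken from Table 2 (dimensionless LDC/Hu* output partitions)
rules = [
    Rule({'U_ustar': (0.41, 0.78), 'W_H': (0.08, 0.21)}, (0.01, 0.06), 1.0, 0.3),
    Rule({'U_ustar': (0.25, 0.41), 'W_H': (0.31, 0.49)}, (0.021, 0.43), 0.9, 0.1),
]
pred = predict(rules, {'U_ustar': 0.5, 'W_H': 0.1})  # fires the first rule
```

Only the first rule fires for this pattern, so the prediction is the centre of its output partition, (0.01 + 0.06)/2 = 0.035.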

### Sensitivity analysis

Two coefficients were used for the sensitivity analysis: S_{c}, the marginal sensitivity coefficient, and S_{n}, the normalized sensitivity coefficient, defined by the equations below:

S_{c} = ΔY/ΔX

S_{n} = (X/Y)(ΔY/ΔX)

where ΔY is the deviation in the output Y caused by the deviation ΔX in the input X. The sensitivity of the error is investigated by incrementing each input parameter by 10%.
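Assuming the forms S_c = ΔY/ΔX and S_n = (X/Y)(ΔY/ΔX), which reproduce most of the tabulated values in Table 9, the two coefficients can be computed as:

```python
def sensitivity_coeffs(X, Y1, Y2, frac=0.10):
    """S_c and S_n for an input X incremented by frac*X (10% in this paper).

    X:  base input value, Y1: base model output,
    Y2: model output after the input is incremented by frac*X.
    """
    dX = frac * X
    dY = Y2 - Y1
    S_c = dY / dX         # marginal sensitivity coefficient
    S_n = (X / Y1) * S_c  # normalized (dimensionless) sensitivity coefficient
    return S_c, S_n

# reproduces the ANN / U-over-u* column of Table 9 (206.58 and 2.02 tabulated)
S_c, S_n = sensitivity_coeffs(X=7.75, Y1=793.21, Y2=953.35)
```

Small differences from the tabulated values come from rounding ΔX to 0.77 in the table.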

### Models evaluation

The coefficient of determination (R^{2}), root mean square error (RMSE) and mean absolute error (MAE) were used to evaluate the models' performance. The best value for R^{2} is one, and for the two other indices it is zero. However, these criteria show the average error of a model and give no information about the error distribution. Hence, it is important to test the models' performance using other criteria such as threshold statistics (*TS*) (Equation (11)) (Jain & Indurthy 2003) and the developed discrepancy ratio (DDR) (Equation (12)) (Noori *et al.* 2010). In the DDR index, a higher crown in a model's distribution denotes better estimation performance. In Equation (11), *TS_{x}* = (*n_{x}*/*n*) × 100, where *n_{x}* is the number of points for which the absolute value of the relative error is less than *x*% and *n* is the total number of data points.
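A minimal sketch of these criteria follows. The R² here is computed as 1 − SS_res/SS_tot; the paper may instead use the squared correlation coefficient, and the toy observed/predicted values are illustrative only.

```python
import math

def evaluate(obs, pred):
    """RMSE, MAE and coefficient of determination for paired samples."""
    n = len(obs)
    err = [p - o for o, p in zip(obs, pred)]
    rmse = math.sqrt(sum(e * e for e in err) / n)
    mae = sum(abs(e) for e in err) / n
    mean_o = sum(obs) / n
    ss_res = sum(e * e for e in err)
    ss_tot = sum((o - mean_o) ** 2 for o in obs)
    r2 = 1 - ss_res / ss_tot  # assumption: 1 - SS_res/SS_tot form of R^2
    return rmse, mae, r2

def threshold_stat(obs, pred, x):
    """TS_x: percentage of points whose absolute relative error is below x%."""
    are = [abs((p - o) / o) * 100 for o, p in zip(obs, pred)]
    return 100 * sum(a < x for a in are) / len(are)

obs = [10.0, 20.0, 40.0, 80.0]   # illustrative measured values
pred = [11.0, 19.0, 44.0, 70.0]  # illustrative model outputs
rmse, mae, r2 = evaluate(obs, pred)
```

For this toy sample, `threshold_stat(obs, pred, 10)` reports the share of predictions within a 10% relative error.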

### Dataset

In this research, a dataset including 100 patterns of hydraulic and geometric characteristics measured in rivers (Seo & Cheong 1998; Noori *et al.* 2011) was used to calibrate and verify the GRC-ANN model (details of the dataset are available in Appendix B, Table A3). These data were collected from the published works of Seo & Cheong (1998), Kashefipour & Falconer (2002), Tayfur (2006), Toprak & Cigizoglu (2008), and Etemad-Shahidi & Taghipour (2012). The hydraulic characteristics were the cross-sectional average velocity (U) and shear velocity (u*), and the geometric characteristics were the channel width (W) and the flow depth (H). To eliminate possible inconsistency due to different data dimensions, the dimensionless parameters W/H, U/u* and LDC/Hu^{∗} were used for calibration and verification of the GRC-ANN model. From the existing dataset, 70 patterns were randomly selected for calibration and the remaining 30 were used for validation of the results. In this dataset, the W and H parameters showed more variability than U and u*, and the measured LDC values showed the highest variability. Therefore, the LDC should be treated as a nonlinear and complicated parameter because of its high variability.
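The data preparation described above can be sketched as follows. The sample record values are hypothetical, and the seeded random split is one plausible implementation of the 70/30 selection, not the authors' procedure.

```python
import random

def to_dimensionless(records):
    """Form the dimensionless groups used for calibration/verification:
    W/H, U/u* and LDC/(H*u*)."""
    rows = []
    for r in records:
        rows.append({
            'W_H': r['W'] / r['H'],                      # width-to-depth ratio
            'U_ustar': r['U'] / r['ustar'],              # velocity ratio
            'target': r['LDC'] / (r['H'] * r['ustar']),  # dimensionless LDC
        })
    return rows

def split(rows, n_cal=70, seed=1):
    """Randomly pick 70 patterns for calibration; the rest validate."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    return rows[:n_cal], rows[n_cal:]

# hypothetical record (W, H in m; U, u* in m/s; LDC in m^2/s)
sample = [{'W': 59.5, 'H': 1.1, 'U': 0.26, 'ustar': 0.06, 'LDC': 14.0}]
dimless = to_dimensionless(sample)
```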

## RESULTS AND DISCUSSION

### GRC-ANN model evaluation

The data used in the GRC-ANN model must lie in the range (0, 1). The input and output data were therefore rescaled as x_{n} = (x − x_{min})/(x_{max} − x_{min}), where x_{min} and x_{max} are the minimum and maximum values in the dataset. Normalized calibration data were used as the input of the GRC-ANN model in the training procedure. Network design starts with rule induction through the GRC algorithm, which consists of two main sections: extracting the set of possible rules and selecting the high-quality rules. The quality of the rules is assessed based on the measurements undertaken by GRC, which include generality, absolute support, coverage, and entropy. Higher values of absolute support, coverage and generality indicate more desirable rules, while higher entropy values show the opposite. The GRC algorithm used in this research applies thresholds of maximum 0.5 for entropy and minimum 0.7 for absolute support to achieve high-quality rules. In the case of generality and coverage, high values cannot be expected except in some cases, but rules with higher values of these parameters gain priority in the set of rules. Four of the 43 extracted rules are shown in Table 2.

| Rule | Generality | Absolute support | Coverage | Entropy |
|---|---|---|---|---|
| If 0.41 < U/u* < 0.78 and 0.08 < W/H < 0.21 then 0.01 < LDC/Hu* < 0.06 | 0.08 | 1 | 0.6 | 0.3 |
| If 0.25 < U/u* < 0.41 and 0.31 < W/H < 0.49 then 0.021 < LDC/Hu* < 0.43 | 0.06 | 0.9 | 0.25 | 0.1 |
| If 0.25 < U/u* < 0.33 and 0.07 < W/H < 0.21 then 0.01 < LDC/Hu* < 0.06 | 0.11 | 0.91 | 0.1 | 0.06 |
| If 0.008 < U/u* < 0.12 and 0.16 < W/H < 0.36 then 0.006 < LDC/Hu* < 0.024 | 0.09 | 0.77 | 0.16 | 0.32 |


The induction process can be briefly described as follows. Based on a measure of association between two partitions, the universe is divided into partitions. If a partition is not a subset of a user-defined or data-driven class, it is further divided using another measure. The process continues until a set of rules is found that correctly classifies all records.

In a granule network, each node contains a subset of objects. A larger granule is divided into smaller granules by an atomic formula; in other words, a smaller granule is obtained by selecting those objects of the larger granule that satisfy the atomic formula. The set of the smallest granules thus forms a conjunctively definable coverage of the universe. Atomic formulas define basic granules, which are the basis for the granule network constructed from the data. Each node in the granule network is a conjunction of some basic granules, and thus a conjunctively definable granule. The algorithm is a heuristic search algorithm, and the measures discussed can be used to define different fitness functions for it.

After forming the initial network, a pruning process is undertaken to reduce the number of rules while maintaining useful information. In this process, granular computing measures are used to prune the rules not satisfying the minimum requirements for inclusion in the classification rule set. The GRC-ANN structure is then constructed from the set of qualified rules, with the constraint of covering the dataset values. The acquired structure was examined against the expected values to ensure that the existing relations between the data patterns are reflected in the structure. In this regard, the best network structure that could be built upon the GRC quality measurements was formed and tested against the 30 normalized instances used as the test dataset.

Figure 3 shows the quality measurements for the extracted rules. Because of the thresholds considered for the *AS* and *CE* parameters, these measures fluctuate within the defined ranges. In the case of the *G* parameter, most of the rules acquired low values because of the complexity of the problem, which requires detailed rules to cover different situations.
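The thresholding described above can be sketched as follows. The rule list reuses the quality measures of Table 2 plus one hypothetical disqualified rule, and the priority ordering by *G* then *CV* is an assumption about how ties are resolved.

```python
rules = [
    # quality measures of the four rules shown in Table 2
    {'id': 1, 'G': 0.08, 'AS': 1.00, 'CV': 0.60, 'CE': 0.30},
    {'id': 2, 'G': 0.06, 'AS': 0.90, 'CV': 0.25, 'CE': 0.10},
    {'id': 3, 'G': 0.11, 'AS': 0.91, 'CV': 0.10, 'CE': 0.06},
    {'id': 4, 'G': 0.09, 'AS': 0.77, 'CV': 0.16, 'CE': 0.32},
    # hypothetical rule failing both thresholds
    {'id': 5, 'G': 0.05, 'AS': 0.60, 'CV': 0.20, 'CE': 0.70},
]

# keep rules meeting the paper's thresholds: CE at most 0.5, AS at least 0.7
qualified = [r for r in rules if r['CE'] <= 0.5 and r['AS'] >= 0.7]

# prioritize by higher generality, then higher coverage (assumed tie-break)
qualified.sort(key=lambda r: (r['G'], r['CV']), reverse=True)
order = [r['id'] for r in qualified]
```

With these numbers, rule 5 is pruned and the remaining rules are prioritized by generality.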

### Comparison

To compare the acquired GRC-ANN results with previously implemented models, they are compared with the outputs of the empirical equations applied to the test data of this research. Table 3 shows the detailed results of the comparison.

| Reference | RMSE | MAE | R^{2} |
|---|---|---|---|
| Fischer *et al.* (1979) | 456.46 | 179.23 | 0.6185 |
| Seo & Cheong (1998) | 94.68 | 55.28 | 0.8180 |
| Deng *et al.* (2001) | 82.83 | 42.31 | 0.8349 |
| Kashefipour & Falconer (2002) | 97.47 | 46.44 | 0.7957 |
| Sahay & Dutta (2009) | 92.06 | 45.36 | 0.7897 |
| Zeng & Huai (2014) | 105.65 | 48.10 | 0.8320 |
| Disley *et al.* (2015) | 108.79 | 51.20 | 0.9103 |
| Sattar & Gharabaghi (2015) | 87.17 | 44.47 | 0.8891 |
| GRC-ANN | 28.97 | 10.08 | 0.9824 |


Table 3 shows that the empirical methods fall far short of the accuracy of the GRC-ANN model. This may be due to the smaller datasets used to develop them compared with the GRC-ANN model. The Deng *et al.* (2001) method, which showed the best performance among the empirical methods, had approximately three and four times higher RMSE and MAE values, respectively, than the GRC-ANN model. Considering R^{2}, the empirical methods reached 0.8349 in the best case (the Deng *et al.* (2001) method), while the GRC-ANN R^{2} was 0.9824. The GRC-ANN model also outperformed the newer methods proposed by Disley *et al.* (2015) and Sattar & Gharabaghi (2015): its RMSE and MAE are 28.97 and 10.08, respectively, which are better than those of the mentioned methods, while its R^{2} of 0.982 is higher.

Predicted values of the mentioned models were compared with the measured LDC values. Figure 4 demonstrates this comparison for the calibration and test data. Apart from some anomalies, the ANFIS model outputs were significantly closer to the desired values, while the ANN model was not capable of precisely following the pattern of measured values. Among the implemented models, GRC-ANN estimated the measured LDC values most closely, which verifies the accuracy and reliability of the GRC-ANN model once more.

Most AI methods are unstable in estimating low or high values of LDC (Noori *et al.* 2017a). Some of these models ignore LDC values over 100 to inflate their apparent accuracy, while in others the lowest estimated LDC value is far higher than the lowest measured value. To examine this, the stability of the three implemented models was tested in estimating LDC values higher than 200; the results are given in Table 4. Although the ANFIS model showed acceptable performance in estimating LDC values higher than 200, its accuracy was not convincing, its RMSE being more than twice that of the GRC-ANN model. The GRC-ANN model provided the best results among all the methods at extremely high LDC values, which certifies its good performance in the prediction of LDC.

| Model | RMSE | MAE | R^{2} |
|---|---|---|---|
| ANN | 281.58 | 150.66 | 0.5186 |
| ANFIS | 97.18 | 41.63 | 0.9315 |
| GRC-ANN | 39.86 | 15.11 | 0.9867 |


Natural rivers have a wide range of W/H ratios, which affects the LDC. To investigate this effect on the model predictions, low and high values of the W/H ratio were considered separately. The results are presented in Table 5.

| Model | RMSE (W/H < 25) | RMSE (W/H > 100) | MAE (W/H < 25) | MAE (W/H > 100) | R^{2} (W/H < 25) | R^{2} (W/H > 100) |
|---|---|---|---|---|---|---|
| ANN | 88.08 | 15.70 | 60.98 | 8.28 | 0.0003 | 0.9993 |
| ANFIS | 58.00 | 0.2819 | 34.87 | 0.12 | 0.3821 | 0.9999 |
| GRC-ANN | 21.53 | 0.3729 | 16.13 | 0.23 | 0.8864 | 0.9999 |


Table 5 illustrates that the ANN and ANFIS models are more affected by the W/H ratio than GRC-ANN is. Excluding the W/H < 25 range, the accuracy of the ANN and ANFIS models improves, because these models perform better at W/H > 100. While the ANN and ANFIS models show weak performance at low W/H values, the GRC-ANN model performs well over the whole range: its R^{2} is close to 1.000 at W/H > 100 and close to 0.9 at W/H < 25. Therefore, changes in W/H only slightly affect the GRC-ANN model, and it is applicable to a wide range of natural rivers.

In the case of U/u*, GRC-ANN model outperformed the other models, as in the case of W/H.

Table 6 indicates that the ANN model showed the worst performance in both U/u* ranges. Although the GRC-ANN performance for U/u* < 3 is worse than for U/u* > 15, it is still better than that of the other models: the RMSE of the ANN and ANFIS models is about 4 and 3 times that of GRC-ANN when U/u* < 3, and about 148 and 4 times when U/u* > 15.

| Model | RMSE (U/u* < 3) | RMSE (U/u* > 15) | MAE (U/u* < 3) | MAE (U/u* > 15) | R^{2} (U/u* < 3) | R^{2} (U/u* > 15) |
|---|---|---|---|---|---|---|
| ANN | 36.92 | 332.18 | 22.00 | 140.28 | 0.3682 | 0.6970 |
| ANFIS | 24.77 | 9.78 | 11.00 | 3.38 | 0.9862 | 0.9996 |
| GRC-ANN | 9.65 | 2.24 | 7.27 | 1.04 | 0.9619 | 0.9999 |


According to Tables 5 and 6, the models can be ranked based on their performance in determining low and high values of W/H and U/u*. The performance of the three models is prioritized in Table 7.

| | ANN | ANFIS | GRC-ANN |
|---|---|---|---|
| W/H < 25 | 3 | 2 | 1 |
| W/H > 100 | 3 | 1 | 2 |
| U/u* < 3 | 3 | 2 | 1 |
| U/u* > 15 | 3 | 2 | 1 |


In this table, numbers 1, 2 and 3 indicate the preference order of the models for rivers falling within a particular W/H or U/u* range. Rank 1 indicates that the model has the lowest error among those presented in Tables 5 and 6 (it is considered the best model), while rank 3 indicates the highest error. The table shows that the GRC-ANN model can be chosen for nearly all natural rivers, with the best performance and least error. The ANFIS model showed better performance than the ANN model, so after GRC-ANN, the ANFIS model can be chosen in special conditions. The ANN model was ranked last in every category, so its performance is the worst under these conditions.

In Figure 5, a stronger tendency toward the centreline in the error distribution graph and larger values of the maximum QDDR denote higher accuracy. By comparing the DDR results illustrated in Figure 5, we can conclude that the GRC-ANN and ANFIS models significantly outperformed the ANN model.
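A sketch of the DDR analysis, under the assumption that DDR = (predicted/observed) − 1 following Noori *et al.* (2010), with the standardized DDR values mapped onto a standard normal density whose peak ('crown') height and centring around zero indicate accuracy:

```python
import math
import statistics

def ddr_curve(obs, pred):
    """Developed discrepancy ratio sketch: DDR values (assumed form),
    standardized and evaluated under the standard normal density."""
    ddr = [p / o - 1 for o, p in zip(obs, pred)]  # assumed DDR definition
    mu = statistics.mean(ddr)
    sigma = statistics.stdev(ddr)
    z = [(d - mu) / sigma for d in ddr]
    # standard normal density at each standardized DDR value
    q = [math.exp(-zi * zi / 2) / math.sqrt(2 * math.pi) for zi in z]
    return ddr, q

# illustrative observed and predicted values
obs = [10.0, 20.0, 40.0]
pred = [11.0, 19.0, 44.0]
ddr, q = ddr_curve(obs, pred)
```

Plotting `q` against the standardized DDR values yields the crown-shaped curves compared in Figure 5.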

To compare the TS analysis, the error distribution of the testing step for the ANN, ANFIS and GRC-ANN models was computed by means of TS analysis and the absolute relative error (ARE) for 10, 25, 50, 75, 90, 95 and 100% of the datasets; the results are given in Table 8.

| Percent of datasets (%) | ANN | ANFIS | GRC-ANN |
|---|---|---|---|
| 10 | 5.27 | 0.04 | 0.06 |
| 25 | 16.82 | 3.07 | 0.33 |
| 50 | 39.40 | 16.26 | 1.18 |
| 75 | 91.20 | 49.16 | 4.98 |
| 90 | 167.38 | 99.97 | 13.53 |
| 95 | 289.48 | 216.06 | 22.43 |
| 100 | 3674.66 | 387.8 | 67.89 |


### Sensitivity analysis

Sensitivity analysis was performed for the three methods: GRC-ANN, ANFIS and ANN. X and ΔX are the model inputs; for each method, the output Y is computed and compared with the new Y obtained after perturbing the input. Table 9 shows the computed Y (Y_{1}), the new Y (Y_{2}), S_{c} and S_{n} for these methods.

| | GRC-ANN (W/H) | GRC-ANN (U/u*) | ANFIS (W/H) | ANFIS (U/u*) | ANN (W/H) | ANN (U/u*) |
|---|---|---|---|---|---|---|
| X (input) | 51.46 | 7.75 | 51.46 | 7.75 | 51.46 | 7.75 |
| ΔX (input increment) | 5.14 | 0.77 | 5.14 | 0.77 | 5.14 | 0.77 |
| Y_{1} (computed output of model) | 695.97 | 695.97 | 775.25 | 775.25 | 793.21 | 793.21 |
| Y_{2} (sensitivity analysis output) | 780 | 812 | 803.87 | 1840.75 | 856.65 | 953.35 |
| ΔY (Y_{2} − Y_{1}) | 84.03 | 116.03 | 28.62 | 1065.49 | 63.44 | 160.14 |
| S_{c} (marginal sensitivity coefficient) | 0.0612 | 149.67 | 5.56 | 1374.45 | 12.33 | 206.58 |
| S_{n} (normalized sensitivity coefficient) | 0.83 | 1.67 | 0.37 | 13.74 | 0.8 | 2.02 |


Table 9 shows that the input parameter of W/H has a greater effect on the output of the model in all three models in this research. The importance of W/H has also been confirmed by Kashefipour & Falconer (2002), Tayfur & Singh (2005) and Sattar & Gharabaghi (2015).

Although the W/H parameter has the greater impact in all three models, the size of that impact varies among them. The values of S_{n} indicate that the W/H parameter has the highest effect in the ANFIS model and the least effect in the GRC-ANN model. The smaller gap between the S_{n} values of the two input parameters also shows that the impact difference between the two parameters is lowest in the GRC-ANN model.
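The coefficients in Table 9 are consistent with the common definitions S_{c} = ΔY/ΔX and S_{n} = S_{c} · (X/Y_{1}); this reconstruction is inferred from the tabulated values rather than stated explicitly in the text. A short sketch, checked against the ANFIS W/H column:

```python
def sensitivity_coefficients(x, dx, y1, y2):
    """Marginal (S_c) and normalized (S_n) sensitivity of output Y to input X."""
    dy = y2 - y1
    s_c = dy / dx        # marginal sensitivity coefficient, S_c = dY/dX
    s_n = s_c * x / y1   # normalized (dimensionless) sensitivity, S_n = S_c * X / Y1
    return s_c, s_n

# ANFIS model, W/H perturbation (values taken from Table 9)
s_c, s_n = sensitivity_coefficients(x=51.46, dx=5.14, y1=775.25, y2=803.87)
```

Because S_{n} is dimensionless, it allows the two input parameters, which have different units and scales, to be compared directly.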

### Robustness

In order to investigate model robustness, the GRC-ANN model was rerun three times with the calibration dataset reduced from 70% of the data (the base run) to 60, 50 and 40%, and model performance was examined under these conditions. The R^{2} and RMSE of these runs in the calibration and test steps are shown in Table 10.

| Size of calibration | R^{2} (calibration) | R^{2} (test) | RMSE (calibration) | RMSE (test) |
|---|---|---|---|---|
| 40% | 0.95 | 0.92 | 52 | 80 |
| 50% | 0.98 | 0.93 | 32 | 46 |
| 60% | 0.99 | 0.98 | 19 | 32 |
| 70% | 0.99 | 0.99 | 15 | 22 |


Even when only 40% of the dataset is used for calibration, the RMSE is 52 in the calibration step and 80 in the test step; both values are below the RMSE of the best empirical model, Deng *et al.* (2001), which equals 82. The results in Table 10 therefore demonstrate that GRC-ANN is a reliable and robust model across different calibration-dataset sizes.
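The rerun procedure can be sketched as repeated refits on shrinking calibration subsets. In the snippet below, a linear least-squares fit stands in for the GRC-ANN model (the actual model and dataset are not reproduced here), so only the procedure mirrors the paper, not the reported scores:

```python
import numpy as np

def rmse(obs, pred):
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def r_squared(obs, pred):
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def robustness_runs(X, y, fractions=(0.4, 0.5, 0.6, 0.7), seed=0):
    """Refit on varying calibration fractions and score both splits.

    A linear least-squares fit is a stand-in for the GRC-ANN model here.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    results = {}
    for f in fractions:
        n_cal = int(f * len(y))
        cal, test = idx[:n_cal], idx[n_cal:]
        coef, *_ = np.linalg.lstsq(X[cal], y[cal], rcond=None)
        results[f] = {
            "R2_cal": r_squared(y[cal], X[cal] @ coef),
            "R2_test": r_squared(y[test], X[test] @ coef),
            "RMSE_cal": rmse(y[cal], X[cal] @ coef),
            "RMSE_test": rmse(y[test], X[test] @ coef),
        }
    return results

# Synthetic check with an exactly linear relation (illustrative only)
X = np.column_stack([np.ones(50), np.arange(50.0)])
y = 2.0 + 3.0 * np.arange(50.0)
res = robustness_runs(X, y)
```

A model is judged robust when, as in Table 10, the test-step RMSE degrades gracefully rather than collapsing as the calibration fraction shrinks.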

## CONCLUSION

Many researchers have tried to predict the LDC more accurately through experimental methods and intelligence models, and these efforts have produced numerous intelligence models and empirical equations. Although intelligent models are more accurate than empirical models, they still lack sufficient accuracy and have some weaknesses. In this study, in order to improve the performance of the GRC model, it was combined with an ANN model. In addition, ANN and ANFIS models were developed so that the results of the intelligent models could be examined thoroughly. The results of this research can be summarized as follows.

- Compared to empirical relationships, the GRC-ANN model has the best performance; the error of the Deng *et al.* (2001) equation is about three times greater than that of the GRC-ANN model.
- The GRC-ANN model conforms well to the pattern of the observed values and has the best performance among the GRC-ANN, ANN and ANFIS models.
- The GRC-ANN model has higher accuracy and lower error than the ANN and ANFIS models for LDC > 200.
- The GRC-ANN model was more accurate than the ANFIS and ANN models in the ranges W/H > 100, W/H < 25, U/u* < 3 and U/u* > 15. As a result, the GRC-ANN model can be used to estimate LDC values in most rivers with high confidence.
- The error-distribution diagrams of the models developed in this study indicate the advantage of the GRC-ANN model.
- The input parameter W/H has the most significant effect on the output of all three models in this research.
- The W/H parameter has the highest effect in the ANFIS model and the least effect in the GRC-ANN model.
- Given the results of the GRC-ANN model for different calibration-dataset sizes, the model is very reliable and robust.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this paper is available online at https://dx.doi.org/10.2166/wst.2020.006.

## REFERENCES

*A Formal Theory of Granularity*, PhD thesis.