Abstract
Artificial neural networks (ANNs) are labeled as black-box techniques which limit their operational uses in hydrology. Recently, researchers explored techniques that provide insight into the various elements of ANN and their relationship with the physical components of the system being modeled which are commonly known as knowledge extraction (KE) techniques. However, the physical components of rainfall-runoff (RR) process utilized in these KE techniques are obtained from primitive baseflow separation techniques without considering other components of RR process utilizing mostly base flow and surface flow till now. To identify if ANN acquires physical components of the RR process, a well-established water balance model (Australian Water Balance Model) has been utilized first time in this study. For this purpose, correlation and visualization techniques have been used for the Kentucky River basin, USA. Results show that ANN architecture having a single hidden layer with four hidden neurons was the best in simulating RR process and each of the four hidden neurons corresponds to certain subprocesses of the overall RR process, i.e., two hidden neurons are capturing surface flow dynamics with lower and higher flows, one is capturing base flow dynamics, and last one is having good relations with past rainfalls showing that ANN captures physics of basin's RR process.
HIGHLIGHTS
Australian water balance model (AWBM) and ANN model are developed employing data obtained from the real basin.
The conceptual components from AWBM and the hidden neuron outputs from ANN are calculated and their relations are assessed using correlation and graphical techniques.
The hidden neurons of the best ANN architecture so obtained have traces of various subprocesses of the overall rainfall-runoff process.
INTRODUCTION
Artificial neural networks (ANNs) have been successfully applied for solving function approximation (FA), classification, and clustering problems in hydrologic and water resources areas and demonstrated their superiority over conventional conceptual models (Solomatine & Dulal 2003; Solomatine & Ostfeld 2008). The physical processes that are modeled by approximating a set of functions are known as FA types of problems; one such problem is the modeling of the rainfall-runoff (RR) process using ANN. Despite ANNs performing very well in RR modeling, they receive major criticism due to their weakness of being black-box models (Tickle et al. 1998). This criticism mainly stems from the fact that no satisfactory explanation of their internal behavior has been offered yet (Jain & Kumar 2009). This is a significant weakness, for without the ability to produce comprehensive decisions, it is hard to trust the reliability of networks addressing real-world problems (Benitez et al. 1997). However, it must be realized that if the nonlinear function being mapped by the ANN is explored further, one may be able to shed some light on the physical processes of the system being modeled that are inherent in a trained ANN model (Jain & Kumar 2009; Sudheer & Jain 2009; Maier et al. 2010). Identifying components of a physical process in a trained ANN is an area of research that is more or less virgin and remains unexplored, especially in hydrology except for a few works on knowledge extraction (KE) from the ANN RR models by investigating its massively parallel structure. Sensitivity analysis, graphical visualization, and correlation analysis are a few of the methods that have been employed to analyze a trained ANN to explore the possibility of physics embedded inside an ANN in RR modeling. Wilby et al. (2003) detected that the hidden nodes in the neural network correspond to dominant processes (such as quickflow, baseflow, and soil moisture accounting) within the conceptual RR model using correlation analysis. Sudheer & Jain (2004) proposed a method of visualization and understanding the internal behavior of an ANN model by hypothesizing that an ANN is able to map a function similar to the flow duration curve while modeling the river flow. Jain et al. (2004) examined whether or not the physical processes in a watershed are inherent in a trained ANN RR model by analyzing definite statistical measures of the strength of the relationship between the disintegrated hidden neuron responses of an ANN model and its input variables and various deterministic components of a conceptual RR model. See et al. (2008) applied graphical and statistical methods to visualize hidden neuron behavior in a trained neural network RR model developed for the River Ouse catchment in northern England. Sudheer (2005) extracted the knowledge of a trained ANN river flow model using a perturbation analysis for determining the order of influence of the elements in the input vector on the output vector for the Narmada Basin, India. Kingston et al. (2006) quantified the uncertainty associated with the optimized weight vector of an ANN for hydrological predictions using the Monte–Carlo approach to calculate the corresponding input importance measures, forming distributions that encapsulate the uncertainty in the modeled relationship. Jain & Kumar (2009) systematically dissected the massively parallel architectures of trained ANN hydrologic models to determine if they learn the underlying physical subprocesses during training using simple qualitative and quantitative techniques and showed that the number of hidden neurons determined during training for a particular data set correspond to certain subprocesses of the overall physical process being modeled. Londhe & Shah (2017) extracted the knowledge from a trained ANN model developed for estimating evaporation. Recently, Vidyarthi & Jain (2020) extracted the knowledge in terms of simple rules from a trained ANN drought classification model using the decision tree technique and concluded that the trained ANN learns specific rules during their training for forecasting specific drought classes.
It is shown from the literature that while training an ANN, different hidden neurons learn the individual relationships with physical components inherent in the overall RR relationship or with different portions of a flow hydrograph. In most of these works, however, the baseflow and surface flow components which are estimated using a basic formulation based on primitive base flow separation techniques have been utilized to identify them in ANN architecture and are not determined from a well-established water balance model. Therefore, further work is required in the area of KE from ANN models which can identify more physical components of the RR process obtained from a water balance model.
In this paper, the best ANN architecture has been explored by examining the hidden neuron outputs at the hidden layer for identifying the physical components of the overall RR process. The uniqueness of this study is to include the physical components obtained from a well-established Australian Water Balance Model (AWBM) to identify them in the trained ANN architecture. The AWBM model is a conceptual model and other physical components can be estimated using its calibrated parameters apart from base flow and surface flow components. The input variables possess important information to mimic the RR process and therefore, the input variables are also utilized along with the physical components for KE of trained ANN RR models. The specific objectives of this paper are to (i) develop an AWBM model for simulating the RR process in the basin; (ii) develop ANN models for simulating the RR process for the same data obtained from the same basin; (iii) extract knowledge from the trained ANN RR model using correlation and visualization techniques to identify the physical components of RR process. The daily rainfall (in mm), runoff (in m3/s), and temperature (°C) data derived from the Kentucky River Basin, USA, is employed to test the methodologies proposed in this paper.
The paper starts with a brief description of the tools and techniques employed in this study. The description of the study area and the data employed are then presented before describing the procedure for model development for the RR process using AWBM and ANN techniques. A detailed description of the procedure for KE is then discussed. The obtained results are discussed next before making concluding remarks.
METHODOLOGY, STUDY AREA AND DATA
The modeling methods employed to carry out the investigations in this study include the AWBM, ANN, and correlation techniques. A brief overview of the AWBM with steps involving the calculation of various physical components, ANN, and the correlation technique along with the explanation of the study area and data are presented next.
Australian water balance model
The rainfall-runoff library (RRL) which is a toolkit for various water balance models under one library developed by the Cooperation Research Centre for Catchment Hydrology, Australia, has been used for the development of the AWBM in this study. The sum of square error (SSE) as an objective function and genetic algorithm (GA) as an optimizer were used for the calibration of AWBM parameters (Kim et al. 2005).
Artificial neural network
For the minimization of errors in the standard FFBP network, the gradient-decent (GD) algorithm is normally used during training, which suffers from a slow convergence rate. In this study, a more efficient and effective algorithm known as the Levenberg–Marquardt algorithm (Hagan & Menhaj 1994; Hagan et al. 1996) has been used for training the ANN models.
Statistics for performance evaluation and KE
Five different standard statistics are used to evaluate the performance of the developed models during training and testing in this study. These are average absolute relative error (AARE), root mean square error (RMSE), Pearson coefficient of correlation (R), Nash–Sutcliffe efficiency (E), and threshold statistics (TSx) between observed and predicted discharges. Most of these error statistics have been extensively used for model evaluation by many researchers in the past (Jain & Indurthy 2003; Chidthong et al. 2009; Vidyarthi et al. 2020; Vidyarthi & Jain 2022) and their detailed descriptions can be found in these papers. The coefficient of correlation has been also used for identifying the physical components obtained from AWBM in ANN architecture during the KE step.
Study area and data
where j is the individual day of the month. PET data for a day in any month is calculated by multiplying this coefficient to the PET (mm/month) of that month.
The actual ET per day can be calculated by comparing total daily rainfalls (PPT), i.e., if PPT is greater than 2.5 mm/day, ET is half of PET otherwise ET is equal to the value of PET itself (Haan 1972). The data set is divided into two subsets: the first is used for calibration/training (covers the period 1 January 1960–31 December 1972) and another used for validation/testing (covers the period 1 January 1977–31 December 1989) of the models.
RR MODEL DEVELOPMENT AND KE
The model development procedure for the RR process simulation and the KE steps are discussed in this section. The AWBM and ANN models are developed to estimate the runoff at LD14 for the Kentucky River Basin. The model developed for the Kentucky River basin uses area, daily average rainfall (mm/day) which is the average of rainfall obtained from four rain gauge stations (Manchester, Hyden, Jackson, and Heidelberg), and ET (mm/day) as input variables, and runoff or flow (m3/s) at LD14 as observed output variable.
AWBM model development
ANN model development
The results obtained from ANN and AWBM models developed for runoff simulation in the Kentucky River basin are presented in Table 1.
Model . | AARE . | R . | RMSE . | TS50 . | TS75 . | TS100 . |
---|---|---|---|---|---|---|
During calibration/traininga | ||||||
AWBM | 116.3 | 0.874 | 0.047 | 46.2 | 57.9 | 65.8 |
ANN | 28.2 | 0.970 | 0.013 | 85.2 | 93.5 | 96.7 |
During validation/testinga | ||||||
AWBM | 104.5 | 0.863 | 0.043 | 45.1 | 58.4 | 66.7 |
ANN | 30.3 | 0.960 | 0.015 | 84.3 | 92.0 | 95.4 |
Model . | AARE . | R . | RMSE . | TS50 . | TS75 . | TS100 . |
---|---|---|---|---|---|---|
During calibration/traininga | ||||||
AWBM | 116.3 | 0.874 | 0.047 | 46.2 | 57.9 | 65.8 |
ANN | 28.2 | 0.970 | 0.013 | 85.2 | 93.5 | 96.7 |
During validation/testinga | ||||||
AWBM | 104.5 | 0.863 | 0.043 | 45.1 | 58.4 | 66.7 |
ANN | 30.3 | 0.960 | 0.015 | 84.3 | 92.0 | 95.4 |
aCalibration/validation is used for the AWBM model while training/testing has been used for the ANN model.
The results shown in Table 1 indicate that although both AWBM and ANN are simulating the RR process under acceptable limits, the RR simulation ability of ANN is far better than the AWBM modeling technique with the value of AARE, R, RMSE, and TS100 obtained from ANN technique to be 28.2, 0.97, 0.013, and 96.7, respectively, against 116.3, 0.874, 0.047, and 65.8, respectively, obtained from AWBM models during training. A similar trend of performances is observed during testing also, where the values of AARE, R, RMSE, and TS100 from ANN technique obtained were 30.3, 0.96, 0.015, and 95.4, respectively, against 104.5, 0.863, 0.043, and 66.7, respectively, obtained from AWBM models.
RESULTS AND DISCUSSION ON KE FROM TRAINED ANN RR MODEL
In this study, extraction of knowledge from the ANN RR model is done by investigating its massively parallel structure using physical components obtained from the AWBM model by looking at hidden neuron outputs of a trained ANN. For this purpose, the previously developed ANN model with optimum architecture 6–4–1 is explored by examining the hidden neuron outputs at the hidden layer for KE. The input variables possess important information to mimic the RR process and therefore, the input variables along with the physical components are used for KE of the trained ANN RR model. The uniqueness of this study is to include the physical components obtained from a well-established AWBM model to identify them in the trained ANN architecture. The AWBM model is a physics-based model and physical components can be calculated using its calibrated parameters. Described in the next section are the many ways of looking into the hidden neuron outputs of a trained ANN model, and the one considered in this study.
Hidden neuron outputs
The hidden neuron outputs from the four hidden neurons were obtained by the following equation used in the feed-forward step till the hidden layer, i.e., . The basic statistics of the hidden neuron outputs thus obtained are presented in Table 2. It can be clearly observed from this table that the hidden neuron output, HN1 possesses a higher magnitude while HN4 possesses a lower magnitude. The HN2 and HN3 magnitudes on the other hand have ranged almost between HN1 and HN4. Also, the HN3 magnitude is higher than that of HN2.
Statistics . | HN1 . | HN2 . | HN3 . | HN4 . |
---|---|---|---|---|
Minimum | 1,957.9 | 0.0 | 30.1 | 0.6 |
Maximum | 2,740.8 | 294.9 | 1,915.7 | 200.0 |
Mean | 2,545.9 | 281.0 | 1,751.9 | 43.7 |
Std. dev. | 49.2 | 35.8 | 144.2 | 37.3 |
Skewness | −0.4 | −5.7 | −3.8 | 1.0 |
Kurtosis | 11.4 | 37.5 | 21.8 | 0.9 |
Statistics . | HN1 . | HN2 . | HN3 . | HN4 . |
---|---|---|---|---|
Minimum | 1,957.9 | 0.0 | 30.1 | 0.6 |
Maximum | 2,740.8 | 294.9 | 1,915.7 | 200.0 |
Mean | 2,545.9 | 281.0 | 1,751.9 | 43.7 |
Std. dev. | 49.2 | 35.8 | 144.2 | 37.3 |
Skewness | −0.4 | −5.7 | −3.8 | 1.0 |
Kurtosis | 11.4 | 37.5 | 21.8 | 0.9 |
In the feed-forward step of ANN modeling, the inputs to the hidden neurons at the hidden layer are the weighted sum of the raw input variables (at the input layer) and therefore, the raw input variables used during the training of the ANN models can be an important asset in analyzing the knowledge acquired by the ANN during their training. In this study, 6–4–1 architecture was selected for KE, and hence six input variables (P(t), P(t − 1), P(t − 2), Q(t − 1), Q(t − 2), and Q(t − 3)) were used in its modeling. The minimum and maximum values of the rainfall are 0 and 85.8 mm/day and those for discharge are 2 and 2,449.4 m3/s, respectively.
Physical components of the RR process
Statistics . | QG (m3/s) . | QS (m3/s) . | RE (mm/day) . |
---|---|---|---|
Minimum | 0.0 | 0.0 | 0.0 |
Maximum | 200.0 | 1,437.3 | 85.5 |
Mean | 42.0 | 79.5 | 1.5 |
Std. dev. | 36.4 | 132.7 | 5.1 |
Skewness | 1.0 | 3.1 | 5.9 |
Kurtosis | 0.8 | 14.8 | 52.4 |
Statistics . | QG (m3/s) . | QS (m3/s) . | RE (mm/day) . |
---|---|---|---|
Minimum | 0.0 | 0.0 | 0.0 |
Maximum | 200.0 | 1,437.3 | 85.5 |
Mean | 42.0 | 79.5 | 1.5 |
Std. dev. | 36.4 | 132.7 | 5.1 |
Skewness | 1.0 | 3.1 | 5.9 |
Kurtosis | 0.8 | 14.8 | 52.4 |
KE by dissecting trained ANN model
In this study, two methods are employed for KE, namely: visualization (a.k.a., graphical) techniques, and correlation analysis. Each method considered the hidden neuron output (HNj) at the hidden layer. The investigation of the physical significance shown in the ANN architecture for the Kentucky River basin is described in the following sub-sections.
KE by visualization techniques
The scatter plots of HN2 with various physical components such as surface flow, baseflow, and RE are shown in Figure 9(g)–9(i), respectively. The scatter plot between the output HN2 and QS given in Figure 9(g) indicates that there is an existence of a relationship between them. From Figure 9, it is seen that out of the three conceptual components (QS, QG, and RE), HN2 seems to be most close to QS with a negative correlation.
Since the magnitude of HN2 itself is in the lower range (200–300), it cannot contribute to surface flow significantly. Therefore, it can be concluded that HN2 is contributing to the low-magnitude surface flow. The scatter plot shown in Figure 9(i) shows that the HN2 is moderately related to RE.
The scatter plots of HN4 output with various physical components such as QS, QG, and RE are shown in Figure 11(g)–11(i), respectively. It is evident from Figure 11(g) and 11(i) that QS and RE are insensitive to HN4 output. As the ranges of the values of HN4 output and baseflow are almost the same and also the scatter plot between them (Figure 11(h)) are quite matching to each other suggests a strong positive relationship between HN4 output and QG which indicates that HN4 may be modeling baseflow. Moreover, it is interesting to note the ranges exhibited by HN1, HN2, HN3, and HN4 outputs. Out of the four hidden neurons, only HN4 output ranges between 0 and 200 while, HN1 output ranges mostly between 2,000 and 3,000; whereas, HN2 and HN3 outputs range almost between HN1 and HN4. As mentioned earlier, the maximum flow for Kentucky River data during training is about 2,449 m3/s. As mentioned earlier, the maximum flow for Kentucky River data during training is about 2,449 m3/s.
The ranges in which HN1, HN2, HN3, and HN4 operate show a clear demarcation of the HN1 to contribute at the crest part, HN2 and HN3 as the surface flow of low and high magnitudes in the middle portion of a flow hydrograph and HN4 to the falling limb as a low flow component of the hydrograph, respectively. Thus, it can be concluded that there is a definite hidden neuron specialization in the 6–4–1 ANN RR model developed for Kentucky River with the first hidden neuron modeling peak flow, the second and third hidden neurons modeling surface flows of low and high magnitudes and the fourth hidden neuron modeling the base flow. Here, the simple graphical methods of plotting hidden neuron outputs in different forms are able to suggest hidden neuron specialization. These are the preliminary findings that need to be verified further using correlation analysis described in the next section.
KE by correlation analysis
The preliminary inferences drawn using graphical methods need to be verified using other methods. For this purpose, the correlation analysis was used in this study. The intermediate outputs from hidden neurons can be correlated with the inputs or the known subprocesses of the overall hydrologic system. In this study, the strength of the relationship between HNi and various inputs was first determined by calculating the coefficient of correlation between them. Then, the strength of the relationship between HNi and physical components QS, QG, and RE was determined and inferences were drawn by examining the coefficient of correlation between them. The cross-correlation coefficients of hidden neuron outputs (HNi) with both input variables, and physical components (QS, QG, and RE) are presented in Table 4. Looking at the overall correlation values of HN1 output from Table 4, HN1 is highly correlated with the rainfalls P(t − 1) and P(t − 2) with the value of correlation coefficients 0.61 and 0.63, respectively, as the crest portion of the hydrograph mainly affected by the rainfall, it seems that the HN1 is contributing mainly to the crest portion of the hydrograph.
Input variables and physical components . | Correlation coefficient . | |||
---|---|---|---|---|
HN1 . | HN2 . | HN3 . | HN4 . | |
Q(t − 1) | 0.01 | − 0.68 | − 0.66 | 0.44 |
Q(t − 2) | −0.32 | −0.45 | −0.39 | 0.44 |
Q(t − 3) | −0.39 | −0.28 | −0.24 | 0.44 |
P(t) | −0.11 | −0.49 | − 0.60 | 0.07 |
P(t − 1) | 0.63 | − 0.62 | − 0.74 | 0.07 |
P(t − 2) | 0.61 | −0.47 | −0.39 | 0.08 |
QG | −0.11 | −0.40 | −0.45 | 0.83 |
QS | 0.09 | − 0.79 | − 0.80 | 0.39 |
RE | −0.07 | −0.53 | −0.57 | 0.18 |
Input variables and physical components . | Correlation coefficient . | |||
---|---|---|---|---|
HN1 . | HN2 . | HN3 . | HN4 . | |
Q(t − 1) | 0.01 | − 0.68 | − 0.66 | 0.44 |
Q(t − 2) | −0.32 | −0.45 | −0.39 | 0.44 |
Q(t − 3) | −0.39 | −0.28 | −0.24 | 0.44 |
P(t) | −0.11 | −0.49 | − 0.60 | 0.07 |
P(t − 1) | 0.63 | − 0.62 | − 0.74 | 0.07 |
P(t − 2) | 0.61 | −0.47 | −0.39 | 0.08 |
QG | −0.11 | −0.40 | −0.45 | 0.83 |
QS | 0.09 | − 0.79 | − 0.80 | 0.39 |
RE | −0.07 | −0.53 | −0.57 | 0.18 |
Bold values signify higher correlation.
Looking at the correlation of HN2, it seems that it is highly correlated with the past flow Q(t − 1) and past rainfall P(t − 1) and reasonably correlated with P(t), P(t − 2), and Q(t − 2). HN2 is strongly correlated with the surface flow QS (−0.79). Since HN2 is correlated with all the rainfalls and QS, it indicates that HN2 is modeling rainfall and the surface flow components which are responsible for the rising limb of the hydrograph. Analyzing the correlations of HN3 output, it appears that HN3 is similar to HN2 with HN3 having slightly higher correlation values with the current and previous rainfalls (P(t) and P(t − 1)) which are responsible for producing surface flows of higher magnitudes. This inference is strengthened by the fact that HN3 is strongly correlated with QS (−0.80). The HN2 and HN3 both are also reasonably correlated with the excess rainfall, RE which points out the fact that the quick effect of excess rainfall is mostly on the surface flow. Thus, the correlation analysis on HN2 and HN3 suggests that HN2 and HN3 are modeling the surface flow of low and high magnitudes, respectively. Looking at the correlation of HN4, it is observed that HN4 is only slightly correlated with all the past observed flows Q(t − 1), Q(t − 2), and Q(t − 3) (with correlation value 0.44) and strongly correlated with the QG (with correlation value 0.83). Also, it is poorly correlated with rainfall (0.07–0.08). Further, the correlation value between HN4 with QG (i.e., 0.83) is the highest as compared with the correlation values of QS (0.39) and RE (0.18). Given these findings and the fact that values of HN4 are less than 200, it suggests that HN4 is modeling QG. This inference is strengthened by the fact that HN4 is poorly correlated with the rainfalls that are responsible for producing most of the surface flows. These findings are consistent with the findings based on the visualization methods.
SUMMARY AND CONCLUSION
This paper presents the findings of a study that aims to answer the question that does ANN really acquire the physics of the hydrologic system. This has been achieved by extracting the knowledge acquired by the ANN during training using conceptual components from an established water balance model. The study describes the procedure and results obtained from KE from a trained ANN RR model (ANN-LM) developed by employing the data extracted from the Kentucky River Basin using graphical techniques and correlation analysis. The hidden neuron outputs at the hidden layer were first calculated using the equations used in the feed-forward step of the ANN model. The uniqueness of this study is the inclusion of the AWBM for estimating the physical components of the RR process in the basin. In addition to the input variables, three physical components, viz., surface flow (QS), baseflow (QG), and RE were used as conceptual components of the RR process to use in KE. In the graphical technique, the scatter plots between the hidden neuron outputs and the various input and physical components were plotted, while in the correlation technique, the cross-correlation values between the hidden neuron outputs and the various input and physical components were calculated. The study on KE suggested that there is a definite hidden neuron specialization during the training of the ANN RR model. The number of optimum hidden neurons during the training of the ANN RR model was found to be four. One out of these four hidden neurons was found to model the peak flows at the crest, two of them were found to model the surface flow of low and high magnitudes and the last was modeling the baseflow. These findings were confirmed by both the methods employed viz. visualization and correlation. The specific findings of the investigation carried out in this study are summarized below:
The best ANN model consisted of four hidden neurons, of which two (HN2 and HN3) appeared to capture the surface flow dynamics with one modeling the lower magnitude flows and the other specializing in modeling higher magnitude flows.
HN4 is contributing to the falling limb of the hydrograph where runoff is dominated by base flows.
Although the behavior of HN1 is not very clear, due to its higher magnitude and good correlation with past rainfalls, it seems to contribute to the crest portion of the hydrograph.
The overall conclusion of this study suggests that the best ANN architecture acquires the physical components of the system. In this study, a lumped conceptual model has been used; however, a distributed model estimating distributed physical components may provide ANN with a better opportunity to learn them. The development of an accurate and comprehensible ANN model, especially in solving hydrological problems, needs to be extensively explored by researchers in an attempt to make them acceptable to traditional practitioners and decision-makers. It is hoped that future research efforts will focus on some of these directions.
ACKNOWLEDGEMENTS
The data used in this study has been obtained freely from USGS (https://waterdata.usgs.gov/nwis) and is duly acknowledged. The help and support provided by the staff members of Kentucky Water Science Center, USGS, Kentucky, USA, in troubleshooting the data-related issues is highly acknowledged.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories (https://waterdata.usgs.gov/nwis/uv/?referred_module=sw).
CONFLICT OF INTEREST
The authors declare there is no conflict.