Artificial neural networks (ANNs) are labeled as black-box techniques which limit their operational uses in hydrology. Recently, researchers explored techniques that provide insight into the various elements of ANN and their relationship with the physical components of the system being modeled which are commonly known as knowledge extraction (KE) techniques. However, the physical components of rainfall-runoff (RR) process utilized in these KE techniques are obtained from primitive baseflow separation techniques without considering other components of RR process utilizing mostly base flow and surface flow till now. To identify if ANN acquires physical components of the RR process, a well-established water balance model (Australian Water Balance Model) has been utilized first time in this study. For this purpose, correlation and visualization techniques have been used for the Kentucky River basin, USA. Results show that ANN architecture having a single hidden layer with four hidden neurons was the best in simulating RR process and each of the four hidden neurons corresponds to certain subprocesses of the overall RR process, i.e., two hidden neurons are capturing surface flow dynamics with lower and higher flows, one is capturing base flow dynamics, and last one is having good relations with past rainfalls showing that ANN captures physics of basin's RR process.

  • Australian water balance model (AWBM) and ANN model are developed employing data obtained from the real basin.

  • The conceptual components from AWBM and the hidden neuron outputs from ANN are calculated and their relations are assessed using correlation and graphical techniques.

  • The hidden neurons of the best ANN architecture so obtained have traces of various subprocesses of the overall rainfall-runoff process.

Artificial neural networks (ANNs) have been successfully applied for solving function approximation (FA), classification, and clustering problems in hydrologic and water resources areas and demonstrated their superiority over conventional conceptual models (Solomatine & Dulal 2003; Solomatine & Ostfeld 2008). The physical processes that are modeled by approximating a set of functions are known as FA types of problems; one such problem is the modeling of the rainfall-runoff (RR) process using ANN. Despite ANNs performing very well in RR modeling, they receive major criticism due to their weakness of being black-box models (Tickle et al. 1998). This criticism mainly stems from the fact that no satisfactory explanation of their internal behavior has been offered yet (Jain & Kumar 2009). This is a significant weakness, for without the ability to produce comprehensive decisions, it is hard to trust the reliability of networks addressing real-world problems (Benitez et al. 1997). However, it must be realized that if the nonlinear function being mapped by the ANN is explored further, one may be able to shed some light on the physical processes of the system being modeled that are inherent in a trained ANN model (Jain & Kumar 2009; Sudheer & Jain 2009; Maier et al. 2010). Identifying components of a physical process in a trained ANN is an area of research that is more or less virgin and remains unexplored, especially in hydrology except for a few works on knowledge extraction (KE) from the ANN RR models by investigating its massively parallel structure. Sensitivity analysis, graphical visualization, and correlation analysis are a few of the methods that have been employed to analyze a trained ANN to explore the possibility of physics embedded inside an ANN in RR modeling. Wilby et al. (2003) detected that the hidden nodes in the neural network correspond to dominant processes (such as quickflow, baseflow, and soil moisture accounting) within the conceptual RR model using correlation analysis. Sudheer & Jain (2004) proposed a method of visualization and understanding the internal behavior of an ANN model by hypothesizing that an ANN is able to map a function similar to the flow duration curve while modeling the river flow. Jain et al. (2004) examined whether or not the physical processes in a watershed are inherent in a trained ANN RR model by analyzing definite statistical measures of the strength of the relationship between the disintegrated hidden neuron responses of an ANN model and its input variables and various deterministic components of a conceptual RR model. See et al. (2008) applied graphical and statistical methods to visualize hidden neuron behavior in a trained neural network RR model developed for the River Ouse catchment in northern England. Sudheer (2005) extracted the knowledge of a trained ANN river flow model using a perturbation analysis for determining the order of influence of the elements in the input vector on the output vector for the Narmada Basin, India. Kingston et al. (2006) quantified the uncertainty associated with the optimized weight vector of an ANN for hydrological predictions using the Monte–Carlo approach to calculate the corresponding input importance measures, forming distributions that encapsulate the uncertainty in the modeled relationship. Jain & Kumar (2009) systematically dissected the massively parallel architectures of trained ANN hydrologic models to determine if they learn the underlying physical subprocesses during training using simple qualitative and quantitative techniques and showed that the number of hidden neurons determined during training for a particular data set correspond to certain subprocesses of the overall physical process being modeled. Londhe & Shah (2017) extracted the knowledge from a trained ANN model developed for estimating evaporation. Recently, Vidyarthi & Jain (2020) extracted the knowledge in terms of simple rules from a trained ANN drought classification model using the decision tree technique and concluded that the trained ANN learns specific rules during their training for forecasting specific drought classes.

It is shown from the literature that while training an ANN, different hidden neurons learn the individual relationships with physical components inherent in the overall RR relationship or with different portions of a flow hydrograph. In most of these works, however, the baseflow and surface flow components which are estimated using a basic formulation based on primitive base flow separation techniques have been utilized to identify them in ANN architecture and are not determined from a well-established water balance model. Therefore, further work is required in the area of KE from ANN models which can identify more physical components of the RR process obtained from a water balance model.

In this paper, the best ANN architecture has been explored by examining the hidden neuron outputs at the hidden layer for identifying the physical components of the overall RR process. The uniqueness of this study is to include the physical components obtained from a well-established Australian Water Balance Model (AWBM) to identify them in the trained ANN architecture. The AWBM model is a conceptual model and other physical components can be estimated using its calibrated parameters apart from base flow and surface flow components. The input variables possess important information to mimic the RR process and therefore, the input variables are also utilized along with the physical components for KE of trained ANN RR models. The specific objectives of this paper are to (i) develop an AWBM model for simulating the RR process in the basin; (ii) develop ANN models for simulating the RR process for the same data obtained from the same basin; (iii) extract knowledge from the trained ANN RR model using correlation and visualization techniques to identify the physical components of RR process. The daily rainfall (in mm), runoff (in m3/s), and temperature (°C) data derived from the Kentucky River Basin, USA, is employed to test the methodologies proposed in this paper.

The paper starts with a brief description of the tools and techniques employed in this study. The description of the study area and the data employed are then presented before describing the procedure for model development for the RR process using AWBM and ANN techniques. A detailed description of the procedure for KE is then discussed. The obtained results are discussed next before making concluding remarks.

The modeling methods employed to carry out the investigations in this study include the AWBM, ANN, and correlation techniques. A brief overview of the AWBM with steps involving the calculation of various physical components, ANN, and the correlation technique along with the explanation of the study area and data are presented next.

Australian water balance model

The conceptual AWBM (Boughton 1995, 2004) has been used to estimate the physical components of the RR process in this study. The AWBM is a simple lumped water balance model whose calibration and implementation are easy. The schematic diagram of the AWBM is shown in Figure 1. In the AWBM, each of the three surface storage (SS) elements calculates water balance independently of the others. The water in excess of the capacity of the storage element becomes runoff and part of this runoff recharges the base flow storage with a factor called base flow index (BFI), and the rest of the runoff, i.e. (1 − BFI) × runoff, is surface runoff. The BS is depleted at the rate of (1 − KB) × BS, where KB is the base flow recession constant. The SS is depleted at the rate of (1 − KS) × SS, where, KS is the surface runoff recession constant.
Figure 1

Schematic diagram of AWBM model.

Figure 1

Schematic diagram of AWBM model.

Close modal

The rainfall-runoff library (RRL) which is a toolkit for various water balance models under one library developed by the Cooperation Research Centre for Catchment Hydrology, Australia, has been used for the development of the AWBM in this study. The sum of square error (SSE) as an objective function and genetic algorithm (GA) as an optimizer were used for the calibration of AWBM parameters (Kim et al. 2005).

Artificial neural network

The ANNs are one of the effective data-driven techniques to model nonlinear systems. These are analogous to biological neurons in the human brain. The feed-forward back propagation (FFBP) is the most commonly employed ANN in engineering applications. An FFBP consists of an input layer, one or more hidden layers, and an output layer in which the input signal propagates through the network on a layer-to-layer basis in the forward direction with a connection strength also called ‘weight’. Each neuron actuates a response using a nonlinear activation function. The structure of a feed-forward ANN is shown in Figure 2 in which the circles represent neurons; the lines joining the neurons represent weights; the inputs are represented by X's; Y represents the output; wji and wkj represent the weights between the input to hidden and hidden to output layers, respectively. An important step in developing an ANN model is to determine its weight matrix through training. In the back propagation algorithm (Rumelhart et al. 1986) for the training mechanism, the input data are presented at the input layer, the information is processed in the forward direction, and the output is calculated at the output layer. The target values are known at the output layer so that the error can be estimated. The total error at the output layer is distributed back to the ANN and the connection weights are adjusted. This process of feed-forward mechanism and backpropagation of errors and weight adjustment is repeated iteratively until convergence in terms of an acceptable level of error is achieved. Precisely, the primary objective in the backpropagation step is to minimize the error (mean square error (MSE)) at the output layer by searching for an optimum set of connection weights that produces outputs closest to the targets. Weight adjusting during the training of neural network with a gradient descent may result in the local minimum problem. Training with random starting weights (a.k.a., initial weight) and selecting the best set of initial weights out of them is among the popular methods to avoid this problem and has been adopted in this study. This whole process is called the training of the ANN.
Figure 2

Structure of a feed-forward ANN.

Figure 2

Structure of a feed-forward ANN.

Close modal

For the minimization of errors in the standard FFBP network, the gradient-decent (GD) algorithm is normally used during training, which suffers from a slow convergence rate. In this study, a more efficient and effective algorithm known as the Levenberg–Marquardt algorithm (Hagan & Menhaj 1994; Hagan et al. 1996) has been used for training the ANN models.

Statistics for performance evaluation and KE

Five different standard statistics are used to evaluate the performance of the developed models during training and testing in this study. These are average absolute relative error (AARE), root mean square error (RMSE), Pearson coefficient of correlation (R), Nash–Sutcliffe efficiency (E), and threshold statistics (TSx) between observed and predicted discharges. Most of these error statistics have been extensively used for model evaluation by many researchers in the past (Jain & Indurthy 2003; Chidthong et al. 2009; Vidyarthi et al. 2020; Vidyarthi & Jain 2022) and their detailed descriptions can be found in these papers. The coefficient of correlation has been also used for identifying the physical components obtained from AWBM in ANN architecture during the KE step.

Study area and data

The data derived from the Kentucky River, USA, as shown in Figure 3, is employed to develop the RR models in this study. The first Lock & Dam 14 (LD14) in its flow direction is taken as the outflow location whose drainage area near Heidelberg, Kentucky is approximately 6,881 km2. The average daily streamflow (m3/s) from the Kentucky River at LD14 near Heidelberg, and the daily rainfalls (mm) by averaging the rainfalls obtained from the four rain gauges (Manchester, Hyden, Jackson, and Heidelberg) scattered up to LD14 of the Kentucky River basin were used. The average of the temperature data (in °C) obtained at Manchester and Lexington Airport was used in this study. The rainfall, runoff, and temperature data for 26 years (between years 1960 and 1989) are used for all model development in this study. For the development of the conceptual models, Evapotranspiration (ET) is calculated for the entire Kentucky catchment using temperature data. To calculate ET, potential evapotranspiration (PET) is first calculated using the equation given by Thornthwaite (1948), and then using rainfall pattern, ET is calculated. At the time of using Thornthwaite equation, the negative average monthly temperature is taken as zero (Xu & Singh 2001). Using temperature data, PET (mm/month) is calculated for all months. A coefficient, t is calculated for each day as follows:
Figure 3

Kentucky River catchment up to LD14, USA.

Figure 3

Kentucky River catchment up to LD14, USA.

Close modal

where j is the individual day of the month. PET data for a day in any month is calculated by multiplying this coefficient to the PET (mm/month) of that month.

The actual ET per day can be calculated by comparing total daily rainfalls (PPT), i.e., if PPT is greater than 2.5 mm/day, ET is half of PET otherwise ET is equal to the value of PET itself (Haan 1972). The data set is divided into two subsets: the first is used for calibration/training (covers the period 1 January 1960–31 December 1972) and another used for validation/testing (covers the period 1 January 1977–31 December 1989) of the models.

The model development procedure for the RR process simulation and the KE steps are discussed in this section. The AWBM and ANN models are developed to estimate the runoff at LD14 for the Kentucky River Basin. The model developed for the Kentucky River basin uses area, daily average rainfall (mm/day) which is the average of rainfall obtained from four rain gauge stations (Manchester, Hyden, Jackson, and Heidelberg), and ET (mm/day) as input variables, and runoff or flow (m3/s) at LD14 as observed output variable.

AWBM model development

The physical mechanism, associated equations, and meaning of various parameters corresponding to the AWBM model were discussed previously and the development of the AWBM model can be further found in Vidyarthi & Jain (2022). After preparations of the data, the parameters of the AWBM model are estimated by calibrating the model against observed runoff data at LD14 for the Kentucky River of the calibration period using the RRL toolkit taking SSE as an objective function and GA as an optimizer. The best parameter set was obtained by running the model multiple times using different initial values to get suitable and meaningful calibrated parameters. The GA used in this study has four parameters: number of points (population), probability of mutation, trapezoidal probability density function (PDF), and maximum iterations. The best GA optimizer parameters were obtained using the trial and error method varying one parameter at a time. The trapezoidal PDF was varied from 1 to 2 and the number of iterations was varied from 20 to 100. The value of trapezoidal PDF = 2, and mutation probability = 0.01, were found to be the best for the calibration of the AWBM model. As shown in Figure 4, iteration 80 was found to be the best for the calibration of the LAWBM model as the AARE and RMSE were the minimum and the R and E values were the maximum. After the calibration step, the parameters of AWBM for the Kentucky River basin were obtained as 0.134 and 0.433 for partial areas, A1 and A2, respectively; 0.722 for percolation constant, BFI; 48.6, 154.5, and 125.5 for SS capacities, C1, C2, and C3, respectively; 0.773 for base flow recession constant, KB; and 0.988 for surface flow recession constant, KS. In this study, a more efficient and effective algorithm known as the Levenberg–Marquardt (LM) algorithm has been used for training the ANN models. The interested readers can refer to Hagan & Menhaj (1994) and Hagan et al. (1996) for a detailed description of the LM algorithm.
Figure 4

Optimum iteration selection for AWBM model.

Figure 4

Optimum iteration selection for AWBM model.

Close modal

ANN model development

For the development of the ANN model, the significant input vectors were selected on the basis of the values of cross-correlation function (CCF), auto-correlation function (ACF), and partial auto-correlation function (PACF) of the daily data employed in this study. PACF is used in this study because, in the case of more than two independent variables, the use of PACF is very useful to know the association between two variables in the presence of another influential variable. Based on the analysis, the flow values lagged up to 3 days, Qt−1, Qt−2, Qt−3, and the rainfalls, Pt, Pt−1, Pt−2, were found as significant for predicting Qt at LD14. The ANN model architecture, 6–N–1, was thus trained using the LM algorithm by varying the hidden neuron number (N) from 1 to 20. A number of trial values of the LM parameter, initial damping factor (μ), were tried and a value of μ = 0.001 was found to be the best and selected to train the ANN model. The objective function was taken as the MSE. As soon as MSE reaches 0.0001 or a maximum iteration of 50,000 reaches, the training was terminated. The 6–4–1 architecture of the ANN model thus obtained was found to be the best among 20 different architectures as the error statistics of this architecture were found to be optimum as shown in Figure 5.
Figure 5

Architecture selection for ANN model.

Figure 5

Architecture selection for ANN model.

Close modal

The results obtained from ANN and AWBM models developed for runoff simulation in the Kentucky River basin are presented in Table 1.

Table 1

Performance of AWBM and ANN RR models

ModelAARERRMSETS50TS75TS100
During calibration/traininga 
AWBM 116.3 0.874 0.047 46.2 57.9 65.8 
ANN 28.2 0.970 0.013 85.2 93.5 96.7 
During validation/testinga 
AWBM 104.5 0.863 0.043 45.1 58.4 66.7 
ANN 30.3 0.960 0.015 84.3 92.0 95.4 
ModelAARERRMSETS50TS75TS100
During calibration/traininga 
AWBM 116.3 0.874 0.047 46.2 57.9 65.8 
ANN 28.2 0.970 0.013 85.2 93.5 96.7 
During validation/testinga 
AWBM 104.5 0.863 0.043 45.1 58.4 66.7 
ANN 30.3 0.960 0.015 84.3 92.0 95.4 

aCalibration/validation is used for the AWBM model while training/testing has been used for the ANN model.

The results shown in Table 1 indicate that although both AWBM and ANN are simulating the RR process under acceptable limits, the RR simulation ability of ANN is far better than the AWBM modeling technique with the value of AARE, R, RMSE, and TS100 obtained from ANN technique to be 28.2, 0.97, 0.013, and 96.7, respectively, against 116.3, 0.874, 0.047, and 65.8, respectively, obtained from AWBM models during training. A similar trend of performances is observed during testing also, where the values of AARE, R, RMSE, and TS100 from ANN technique obtained were 30.3, 0.96, 0.015, and 95.4, respectively, against 104.5, 0.863, 0.043, and 66.7, respectively, obtained from AWBM models.

In this study, extraction of knowledge from the ANN RR model is done by investigating its massively parallel structure using physical components obtained from the AWBM model by looking at hidden neuron outputs of a trained ANN. For this purpose, the previously developed ANN model with optimum architecture 6–4–1 is explored by examining the hidden neuron outputs at the hidden layer for KE. The input variables possess important information to mimic the RR process and therefore, the input variables along with the physical components are used for KE of the trained ANN RR model. The uniqueness of this study is to include the physical components obtained from a well-established AWBM model to identify them in the trained ANN architecture. The AWBM model is a physics-based model and physical components can be calculated using its calibrated parameters. Described in the next section are the many ways of looking into the hidden neuron outputs of a trained ANN model, and the one considered in this study.

Hidden neuron outputs

ANN is one of the data-driven techniques which uses input-target data sets during their training. The input and the target data which are presented to the ANN during training contain vital information to establish the relationship that exists among them inherently. The ANN tries to learn this input–output relationship in some form unlike the conceptual models, where the modeler provides the empirical or physics-based equations to solve complex problems. It is, therefore, possible that the various components of the optimum architecture obtained during the training of an ANN model for a particular data set correspond to certain subprocesses of the overall physical system being modeled (RR process in this case). These components may be in terms of individual weights or in the combination of weights. The hidden neuron outputs are generally obtained from the combination of weights and biases associated with the optimum architecture. Furthermore, the hidden neuron outputs can be examined in various ways to infer the physical subprocesses that may have been captured in a trained ANN model. In the present study, the hidden neuron outputs obtained just after the activation function at the hidden layer were examined to establish the conclusions of this study. The dissection of the optimum architecture and the equation for finding out the hidden neuron outputs are shown in Figure 6.
Figure 6

Dissection of optimum architecture 6–4–1 of ANN-LM RR model.

Figure 6

Dissection of optimum architecture 6–4–1 of ANN-LM RR model.

Close modal

The hidden neuron outputs from the four hidden neurons were obtained by the following equation used in the feed-forward step till the hidden layer, i.e., . The basic statistics of the hidden neuron outputs thus obtained are presented in Table 2. It can be clearly observed from this table that the hidden neuron output, HN1 possesses a higher magnitude while HN4 possesses a lower magnitude. The HN2 and HN3 magnitudes on the other hand have ranged almost between HN1 and HN4. Also, the HN3 magnitude is higher than that of HN2.

Table 2

Basic statistics of the hidden neuron outputs

StatisticsHN1HN2HN3HN4
Minimum 1,957.9 0.0 30.1 0.6 
Maximum 2,740.8 294.9 1,915.7 200.0 
Mean 2,545.9 281.0 1,751.9 43.7 
Std. dev. 49.2 35.8 144.2 37.3 
Skewness −0.4 −5.7 −3.8 1.0 
Kurtosis 11.4 37.5 21.8 0.9 
StatisticsHN1HN2HN3HN4
Minimum 1,957.9 0.0 30.1 0.6 
Maximum 2,740.8 294.9 1,915.7 200.0 
Mean 2,545.9 281.0 1,751.9 43.7 
Std. dev. 49.2 35.8 144.2 37.3 
Skewness −0.4 −5.7 −3.8 1.0 
Kurtosis 11.4 37.5 21.8 0.9 

In the feed-forward step of ANN modeling, the inputs to the hidden neurons at the hidden layer are the weighted sum of the raw input variables (at the input layer) and therefore, the raw input variables used during the training of the ANN models can be an important asset in analyzing the knowledge acquired by the ANN during their training. In this study, 6–4–1 architecture was selected for KE, and hence six input variables (P(t), P(t − 1), P(t − 2), Q(t − 1), Q(t − 2), and Q(t − 3)) were used in its modeling. The minimum and maximum values of the rainfall are 0 and 85.8 mm/day and those for discharge are 2 and 2,449.4 m3/s, respectively.

Physical components of the RR process

The AWBM model for the Kentucky River basin developed using the lumped approach previously is utilized for finding out the various physical components for use in KE in this study. Three major physical components, the routed surface runoff, baseflow (QG), and rainfall excess (RE) were calculated using calibrated parameters and used for the KE by identifying them in the trained ANN RR model. For simplicity, the routed surface runoff is represented as simply surface flow (QS) further in this study. The basic statistics of QS, QG, and RE thus calculated and obtained are presented in Table 3. Looking at Table 3, it is clear that the range of the baseflow is low, having a maximum value of 200 m3/s, while the range of surface flow is higher having a maximum value of 1,437.3 m3/s. The time series plot of QS, QG, and the total runoff is presented in Figure 7(a), and that for the RE is shown in Figure 7(b).
Table 3

Basic statistics of various physical components used in KE

StatisticsQG (m3/s)QS (m3/s)RE (mm/day)
Minimum 0.0 0.0 0.0 
Maximum 200.0 1,437.3 85.5 
Mean 42.0 79.5 1.5 
Std. dev. 36.4 132.7 5.1 
Skewness 1.0 3.1 5.9 
Kurtosis 0.8 14.8 52.4 
StatisticsQG (m3/s)QS (m3/s)RE (mm/day)
Minimum 0.0 0.0 0.0 
Maximum 200.0 1,437.3 85.5 
Mean 42.0 79.5 1.5 
Std. dev. 36.4 132.7 5.1 
Skewness 1.0 3.1 5.9 
Kurtosis 0.8 14.8 52.4 
Figure 7

Time series plots of physical components of the AWBM model. (a) QS, QG, and total runoff, (b) Rainfall excess.

Figure 7

Time series plots of physical components of the AWBM model. (a) QS, QG, and total runoff, (b) Rainfall excess.

Close modal

KE by dissecting trained ANN model

In this study, two methods are employed for KE, namely: visualization (a.k.a., graphical) techniques, and correlation analysis. Each method considered the hidden neuron output (HNj) at the hidden layer. The investigation of the physical significance shown in the ANN architecture for the Kentucky River basin is described in the following sub-sections.

KE by visualization techniques

The graphical techniques can be very useful in getting insight about the physical subprocesses captured by a trained ANN RR model. For this purpose, the hidden neuron outputs (HNi) can be plotted against the raw inputs of the trained ANN model and also against the conceptual components obtained from the AWBM model. The raw inputs of the trained ANN model and the physical components of the AWBM model were already discussed in the previous sections. The number of hidden neurons in the architecture of the ANN-LM model (6–4–1) for Kentucky River data was 4, namely HN1, HN2, HN3, and HN4. Six different scatter plots were prepared showing the relationships of hidden neuron HN1 with various input variables, shown in Figure 8(a)–8(f). The scatter plots of HN1 with various physical components such as surface flow, baseflow, and RE are shown in Figure 88(i), respectively. Although there does not seem to be an obvious relationship between HN1 outputs and any of the input variables or physical components, the higher range (between 2,000 and 3,000) of HN1 in the scatter plots shows its affinity toward the high flow magnitudes and that, in turn, shows its contribution on the crest portion of the hydrograph.
Figure 8

Scatter plots between HN1 and various input variables and physical components.

Figure 8

Scatter plots between HN1 and various input variables and physical components.

Close modal
The scatter plots to visualize the relationship of hidden neuron HN2 output with various input variables are shown in Figure 9(a)–9(f). From Figure 9, there does not seem to be a clear pattern or relationship between HN2 and Q's and P's except that the value of HN2 decreases as P's and Q's increase. A clear picture may appear based on correlation analysis which is presented in the next section.
Figure 9

Scatter plots between HN2 and various input variables and physical components.

Figure 9

Scatter plots between HN2 and various input variables and physical components.

Close modal

The scatter plots of HN2 with various physical components such as surface flow, baseflow, and RE are shown in Figure 9(g)–9(i), respectively. The scatter plot between the output HN2 and QS given in Figure 9(g) indicates that there is an existence of a relationship between them. From Figure 9, it is seen that out of the three conceptual components (QS, QG, and RE), HN2 seems to be most close to QS with a negative correlation.

Since the magnitude of HN2 itself is in the lower range (200–300), it cannot contribute to surface flow significantly. Therefore, it can be concluded that HN2 is contributing to the low-magnitude surface flow. The scatter plot shown in Figure 9(i) shows that the HN2 is moderately related to RE.

The scatter plots between the HN3 output and the input variables are presented in Figure 10(a)–10(f). The HN3 shows good relation with rainfall at time steps t and (t − 1) and also with the previous day's runoff as shown in Figure 10(a), 10(b) and 10(d), respectively. The scatter plots between the HN3 output and the physical components QS, QG, and RE are presented in Figure 10(g)–10(i), respectively. Out of all scatter plots presented in Figure 10, the scatter plot between the HN3 output and QS given in Figure 10(g) clearly indicates that there is an existence of a relationship between them with a negative correlation. As the values of the HN3 mostly lie between 1,000 and 2,000, it seems that the HN3 has a good relationship with the high magnitude of the surface flow. Looking at the scatter plot between the HN3 output and QG as shown in Figure 10(h), it seems that there is no relation between them as the ranges of the magnitude of the baseflow (less than 200 m3/s) and HN3 outputs (mostly between 1,000 and 2,000) are different which further supports the claim that HN3 is related to the surface flow of high magnitudes. The scatter plot shown in Figure 10(i) indicates that the HN3 output is moderately related to RE. This strengthens the argument that HN3 is related to high-magnitude surface flows.
Figure 10

Scatter plots between HN3 and various input variables and physical components.

Figure 10

Scatter plots between HN3 and various input variables and physical components.

Close modal
The scatter plots for showing the relationships of hidden neuron HN4 output with various input variables are shown in Figure 11(a)–11(f). From these figures, it is clear that the HN4 output has no relationship with any of the rainfall input variables but shows a relationship with the past runoff input variable.
Figure 11

Scatter plots between HN4 and various input variables and physical components.

Figure 11

Scatter plots between HN4 and various input variables and physical components.

Close modal

The scatter plots of HN4 output with various physical components such as QS, QG, and RE are shown in Figure 11(g)–11(i), respectively. It is evident from Figure 11(g) and 11(i) that QS and RE are insensitive to HN4 output. As the ranges of the values of HN4 output and baseflow are almost the same and also the scatter plot between them (Figure 11(h)) are quite matching to each other suggests a strong positive relationship between HN4 output and QG which indicates that HN4 may be modeling baseflow. Moreover, it is interesting to note the ranges exhibited by HN1, HN2, HN3, and HN4 outputs. Out of the four hidden neurons, only HN4 output ranges between 0 and 200 while, HN1 output ranges mostly between 2,000 and 3,000; whereas, HN2 and HN3 outputs range almost between HN1 and HN4. As mentioned earlier, the maximum flow for Kentucky River data during training is about 2,449 m3/s. As mentioned earlier, the maximum flow for Kentucky River data during training is about 2,449 m3/s.

The ranges in which HN1, HN2, HN3, and HN4 operate show a clear demarcation of the HN1 to contribute at the crest part, HN2 and HN3 as the surface flow of low and high magnitudes in the middle portion of a flow hydrograph and HN4 to the falling limb as a low flow component of the hydrograph, respectively. Thus, it can be concluded that there is a definite hidden neuron specialization in the 6–4–1 ANN RR model developed for Kentucky River with the first hidden neuron modeling peak flow, the second and third hidden neurons modeling surface flows of low and high magnitudes and the fourth hidden neuron modeling the base flow. Here, the simple graphical methods of plotting hidden neuron outputs in different forms are able to suggest hidden neuron specialization. These are the preliminary findings that need to be verified further using correlation analysis described in the next section.

KE by correlation analysis

The preliminary inferences drawn using graphical methods need to be verified using other methods. For this purpose, the correlation analysis was used in this study. The intermediate outputs from hidden neurons can be correlated with the inputs or the known subprocesses of the overall hydrologic system. In this study, the strength of the relationship between HNi and various inputs was first determined by calculating the coefficient of correlation between them. Then, the strength of the relationship between HNi and physical components QS, QG, and RE was determined and inferences were drawn by examining the coefficient of correlation between them. The cross-correlation coefficients of hidden neuron outputs (HNi) with both input variables, and physical components (QS, QG, and RE) are presented in Table 4. Looking at the overall correlation values of HN1 output from Table 4, HN1 is highly correlated with the rainfalls P(t − 1) and P(t − 2) with the value of correlation coefficients 0.61 and 0.63, respectively, as the crest portion of the hydrograph mainly affected by the rainfall, it seems that the HN1 is contributing mainly to the crest portion of the hydrograph.

Table 4

Correlation coefficient of hidden neurons outputs

Input variables and physical componentsCorrelation coefficient
HN1HN2HN3HN4
Q(t − 1) 0.01 − 0.68 − 0.66 0.44 
Q(t − 2) −0.32 −0.45 −0.39 0.44 
Q(t − 3) −0.39 −0.28 −0.24 0.44 
P(t−0.11 −0.49 − 0.60 0.07 
P(t − 1) 0.63 − 0.62 − 0.74 0.07 
P(t − 2) 0.61 −0.47 −0.39 0.08 
QG −0.11 −0.40 −0.45 0.83 
QS 0.09 − 0.79 − 0.80 0.39 
RE −0.07 −0.53 −0.57 0.18 
Input variables and physical componentsCorrelation coefficient
HN1HN2HN3HN4
Q(t − 1) 0.01 − 0.68 − 0.66 0.44 
Q(t − 2) −0.32 −0.45 −0.39 0.44 
Q(t − 3) −0.39 −0.28 −0.24 0.44 
P(t−0.11 −0.49 − 0.60 0.07 
P(t − 1) 0.63 − 0.62 − 0.74 0.07 
P(t − 2) 0.61 −0.47 −0.39 0.08 
QG −0.11 −0.40 −0.45 0.83 
QS 0.09 − 0.79 − 0.80 0.39 
RE −0.07 −0.53 −0.57 0.18 

Bold values signify higher correlation.

Looking at the correlation of HN2, it seems that it is highly correlated with the past flow Q(t − 1) and past rainfall P(t − 1) and reasonably correlated with P(t), P(t − 2), and Q(t − 2). HN2 is strongly correlated with the surface flow QS (−0.79). Since HN2 is correlated with all the rainfalls and QS, it indicates that HN2 is modeling rainfall and the surface flow components which are responsible for the rising limb of the hydrograph. Analyzing the correlations of HN3 output, it appears that HN3 is similar to HN2 with HN3 having slightly higher correlation values with the current and previous rainfalls (P(t) and P(t − 1)) which are responsible for producing surface flows of higher magnitudes. This inference is strengthened by the fact that HN3 is strongly correlated with QS (−0.80). The HN2 and HN3 both are also reasonably correlated with the excess rainfall, RE which points out the fact that the quick effect of excess rainfall is mostly on the surface flow. Thus, the correlation analysis on HN2 and HN3 suggests that HN2 and HN3 are modeling the surface flow of low and high magnitudes, respectively. Looking at the correlation of HN4, it is observed that HN4 is only slightly correlated with all the past observed flows Q(t − 1), Q(t − 2), and Q(t − 3) (with correlation value 0.44) and strongly correlated with the QG (with correlation value 0.83). Also, it is poorly correlated with rainfall (0.07–0.08). Further, the correlation value between HN4 with QG (i.e., 0.83) is the highest as compared with the correlation values of QS (0.39) and RE (0.18). Given these findings and the fact that values of HN4 are less than 200, it suggests that HN4 is modeling QG. This inference is strengthened by the fact that HN4 is poorly correlated with the rainfalls that are responsible for producing most of the surface flows. These findings are consistent with the findings based on the visualization methods.

This paper presents the findings of a study that aims to answer the question that does ANN really acquire the physics of the hydrologic system. This has been achieved by extracting the knowledge acquired by the ANN during training using conceptual components from an established water balance model. The study describes the procedure and results obtained from KE from a trained ANN RR model (ANN-LM) developed by employing the data extracted from the Kentucky River Basin using graphical techniques and correlation analysis. The hidden neuron outputs at the hidden layer were first calculated using the equations used in the feed-forward step of the ANN model. The uniqueness of this study is the inclusion of the AWBM for estimating the physical components of the RR process in the basin. In addition to the input variables, three physical components, viz., surface flow (QS), baseflow (QG), and RE were used as conceptual components of the RR process to use in KE. In the graphical technique, the scatter plots between the hidden neuron outputs and the various input and physical components were plotted, while in the correlation technique, the cross-correlation values between the hidden neuron outputs and the various input and physical components were calculated. The study on KE suggested that there is a definite hidden neuron specialization during the training of the ANN RR model. The number of optimum hidden neurons during the training of the ANN RR model was found to be four. One out of these four hidden neurons was found to model the peak flows at the crest, two of them were found to model the surface flow of low and high magnitudes and the last was modeling the baseflow. These findings were confirmed by both the methods employed viz. visualization and correlation. The specific findings of the investigation carried out in this study are summarized below:

  • The best ANN model consisted of four hidden neurons, of which two (HN2 and HN3) appeared to capture the surface flow dynamics with one modeling the lower magnitude flows and the other specializing in modeling higher magnitude flows.

  • HN4 is contributing to the falling limb of the hydrograph where runoff is dominated by base flows.

  • Although the behavior of HN1 is not very clear, due to its higher magnitude and good correlation with past rainfalls, it seems to contribute to the crest portion of the hydrograph.

The overall conclusion of this study suggests that the best ANN architecture acquires the physical components of the system. In this study, a lumped conceptual model has been used; however, a distributed model estimating distributed physical components may provide ANN with a better opportunity to learn them. The development of an accurate and comprehensible ANN model, especially in solving hydrological problems, needs to be extensively explored by researchers in an attempt to make them acceptable to traditional practitioners and decision-makers. It is hoped that future research efforts will focus on some of these directions.

The data used in this study has been obtained freely from USGS (https://waterdata.usgs.gov/nwis) and is duly acknowledged. The help and support provided by the staff members of Kentucky Water Science Center, USGS, Kentucky, USA, in troubleshooting the data-related issues is highly acknowledged.

All relevant data are available from an online repository or repositories (https://waterdata.usgs.gov/nwis/uv/?referred_module=sw).

The authors declare there is no conflict.

Benitez
J. M.
,
Castro
J. L.
&
Requena
I.
1997
Are artificial neural networks black boxes?
IEEE Transactions on Neural Networks
8
(
5
),
1156
1164
.
Boughton
W. C.
1995
An Australian water balance model for semiarid watersheds
.
Journal of Soil Water Conservation
50
(
5
),
454
457
.
Boughton
W. C.
2004
The Australian water balance model
.
Environmental Modelling and Software
19
(
10
),
943
956
.
Chidthong
Y.
,
Tanaka
H.
&
Supharatid
S.
2009
Developing a hybrid multi-model for peak flood forecasting
.
Hydrological Processes
23
,
1725
1738
.
Haan
C. T.
1972
A water yield model for small watersheds
.
Water Resources Research
8
(
1
),
58
69
.
Hagan
M. T.
&
Menhaj
M. B.
1994
Training feedforward networks with the Marquardt algorithm
.
IEEE Transactions on Neural Networks
5
,
998
993
.
Hagan
M. T.
,
Demuth
H. B.
&
Beale
M. H.
1996
Neural Network Design
.
PWS Publishing
,
Boston, MA
,
USA
.
Jain
A.
&
Kumar
S.
2009
Dissection of trained neural network hydrologic models for knowledge extraction
.
Water Resources Research
45
(
7
),
W07420
.
doi:10.1029/2008WR007194
.
Jain
A.
,
Sudheer
K. P.
&
Srinivasulu
S.
2004
Identification of physical processes inherent in artificial neural network rainfall runoff models
.
Hydrological Processes
18
,
571
581
.
Kim
S.
,
Vertessy
R. A.
,
Perraud
J. M.
&
Sung
Y.
2005
Integration and application of the rainfall runoff library
.
Water Science and Technology
52
(
9
),
275
282
.
Kingston
G. B.
,
Maier
H. R.
&
Lambert
M. F. A.
2006
A probabilistic method for assisting knowledge extraction from artificial neural networks used for hydrological prediction
.
Mathematical and Computer Modelling
44
(
5–6
),
499
512
.
Londhe
S. N.
&
Shah
S.
2017
A novel approach for knowledge extraction from artificial neural networks
.
ISH Journal of Hydraulic Engineering
25
(
3
),
1
13
.
Rumelhart
D. E.
,
Hinton
G. E.
&
Williams
R. J.
1986
Learning representations by back-propagating errors
.
Nature
323
,
533
536
.
See
L. M.
,
Jain
A.
,
Abrahart
R. J.
,
Dawson
C. W.
,
2008
Visualisation of hidden neuron behavior in a neural network rainfall-runoff model
. In:
Practical Hydroinformatics: Computational Intelligence and Technological Developments in Water Applications
(
Abrahart
R. J.
,
See
L. M.
&
Solomatine
D.
, eds),
Water Science and Technology Library, vol 68. Springer, Berlin, Heidelberg, pp. 87–99. https://doi.org/10.1007/978-3-540-79881-1_7
.
Solomatine
D.
&
Ostfeld
A.
2008
Data-driven modelling: Some past experiences and new approaches
.
Journal of Hydroinformatics
10
,
3
22
.
Sudheer
K. P.
2005
Knowledge extraction from trained neural network river flow models
.
Journal of Hydrologic Engineering
10
(
4
),
264
269
.
Sudheer
K. P.
&
Jain
A.
2004
Explaining the internal behaviour of artificial neural network river flow models
.
Hydrological Processes
18
(
4
),
833
844
.
Sudheer
K. P.
&
Jain
A.
2009
Recent advances in knowledge extraction from neural network based hydrologic models
.
ISH Journal of Hydraulic Engineering
15
(
sup1
),
75
83
.
doi: 10.1080/09715010.2009.10514969
.
Thornthwaite
C. W.
1948
An approach toward a rational classification of climate
.
Geographical Review
38
,
55
94
.
Vidyarthi
V. K.
&
Jain
A.
2020
Knowledge extraction from trained ANN drought classification model
.
Journal of Hydrology
585
.
HYDROL_124804. https://doi.org/10.1016/j.jhydrol.2020.124804
.
Vidyarthi
V. K.
&
Jain
A.
2022
Incorporating non-uniformity and non-linearity of hydrologic and catchment characteristics in rainfall-runoff modeling using conceptual, data-driven, and hybrid techniques
.
Journal of Hydroinformatics
24
(
2
),
350
366
.
doi: https://doi.org/10.2166/hydro.2022.088
.
Vidyarthi
V. K.
,
Jain
A.
&
Chourasiya
S.
2020
Modeling rainfall-runoff process using artificial neural network with emphasis on parameter sensitivity
.
Modeling Earth Systems and Environment
6
,
2177
2188
.
https://doi.org/10.1007/s40808-020-00833-7
.
Wilby
R. L.
,
Abrahart
R. J.
&
Dawson
C. W.
2003
Detection of conceptual model rainfall–runoff processes inside an artificial neural network
.
Hydrological Sciences Journal
48
(
2
),
163
181
.
Xu
C. Y.
&
Singh
V. P.
2001
Evaluation and generalization of radiation-based methods for calculating evaporation
.
Hydrological Processes
15
,
305
319
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).