Time series analysis and artificial neural networks (ANNs) have proven to be useful tools for characterizing and modelling non-linear hydrological processes. In this paper, these methods are used to characterize and to predict the discharge of the Lor River (North Western Spain) 1, 2 and 3 days ahead. Time series analysis shows a coefficient of correlation of 0.53 between precipitation and discharge for a lag of 1 day. Temperature and discharge, on the other hand, have a negative coefficient of correlation (−0.43) for a delay of 19 days. The ANNs developed provide good results for the validation period, with *R*^{2} between 0.92 and 0.80. Furthermore, these prediction models have been tested with discharge data from a period 16 years later. Results for this testing period also show a good correlation, with *R*^{2} between 0.91 and 0.64. Overall, results indicate that ANNs are a good tool to predict river discharge with a small number of input variables.

## INTRODUCTION

Nowadays, river systems are subject to high anthropic stresses, which are leading to increases in the frequency and severity of droughts and floods. In addition, climate change is likely to modify hydrological cycles, increasing extreme events (IPCC 2012). Due to the high human occupation of flood plains, predicting floods sufficiently in advance is of vital importance to minimize the effects on the population. For these reasons, hydrological prediction is an essential tool for the adequate management of water resources from the social, environmental and economic points of view (Thornton *et al.* 2007; Marques *et al.* 2015).

The behaviour of hydrologic systems is complex and basically nonlinear (Sivakumar & Singh 2012). It is directly influenced by different kinds of variables, such as edaphic, geological, geographical and climatic (Post & Jakeman 1996; Soulsby *et al.* 2006; López-Moreno *et al.* 2013). Due to the nonlinearity of the behaviour, and the variety of the variables involved, artificial neural networks (ANNs) can be considered as an excellent prediction method.

Time series analysis studies the behaviour of different variables over time and how they are correlated, even when the output variable responds with a delay to changes in the input variable. It allows determination of the lag time between the two variables and of the statistical significance of the correlation (Sahu *et al.* 2009). In this way, it is possible to know how long it takes for the discharge to respond to precipitation and evapotranspiration. This lag time is important because it represents the period of time that must be modelled to predict the response of discharge to other variables.

ANNs have been used in many types of applications because they are a very useful tool for characterization (Willis *et al.* 1991; Mariey *et al.* 2001; Coppola *et al.* 2005; El Ouahed *et al.* 2005; Papadopoulos *et al.* 2005; Corma *et al.* 2006), modelling (Thompson & Kramer 1994; Lek & Guégan 1999; Araújo *et al.* 2005; Smith *et al.* 2011), or time series forecasting (Zhang *et al.* 1998, 2000; Bunn 2000; Zhang 2003; Antanasijevic *et al.* 2013). Traditional approaches, such as Box–Jenkins or autoregressive integrated moving average, assume that the time series are generated from a linear process; however, real world systems are often nonlinear (Granger & Teräsvirta 1993; Zhang *et al.* 1998). ANNs are a set of computational methods inspired by the human brain (Sutariya *et al.* 2013) and the way it works using the fundamental cell of the neural networks (neuron) (Zhang *et al.* 1998). An ANN has a high number of neurons interconnected with other neurons, and it is this fact that gives the neural network its capability of generalization; that is, the ANN can learn from the data presented and can then infer correct information even when the requested data contain noise (Zhang *et al.* 1998).

The aims of this study were: (a) to characterize the behaviour of water discharge of an undammed river in relation to temporal distribution of rainfall and temperature, employing time series analysis; and (b) to model discharge 1, 2 and 3 days ahead through the use of ANNs.

## MATERIALS AND METHODS

### Description of the study river

The area of the Lor River basin is 372 km^{2} and the difference in elevation between the highest point (1,203 masl) and the outlet (223 masl) is 980 m; the length of the river is 53 km. The area has a mountainous climate with a Mediterranean tendency at the lowest part of the basin. The mean annual precipitation is 1,560 mm and the annual mean evapotranspiration is 687 mm (Martínez Cortizas & Pérez-Alberti 1999).

The daily Lor River discharge data from October 1984 to September 1994 were divided into two subsets: a first data set, from October 1984 to December 1992, used as the training period of the ANNs, and a second data set, from January 1993 to September 1994, used as the validation period. Additionally, data from the period 2008 to 2011 were used as the testing period.

The mean discharge was 11.4 m^{3}s^{−1} over the training period, 13.6 m^{3}s^{−1} over the validation period and 10.9 m^{3}s^{−1} over the testing period. For the period 1959–2007, the mean discharge was 13.3 m^{3}s^{−1} (with a coefficient of variation of 42%). The maximum annual discharge was in 2000 with 25.4 m^{3}s^{−1} and the minimum discharge was in 2001 with 3.1 m^{3}s^{−1}.

### Data base

Meteorological and hydrological data were collected from different sources. The discharge data were obtained from Ministerio de Agricultura, Alimentación y Medio Ambiente (Lor Station in Parada, http://sig.magrama.es/geoportal/). Precipitation data were obtained from Conselleria de Medio Ambiente Territorio e Infraestructura (http://www.meteogalicia.es) and temperature data were collected from the network of AEMET (Agencia Estatal de Meteorología) meteorological stations (www.datosclima.es). Maximum temperature was considered because it is well correlated with evapotranspiration and is available in many regions of the world (Enku & Melesse 2014).

Data from 1984 to 1994 were divided into two subsets: one for training (1984–1992), to develop the best model to predict the discharge, and a second subset (1993–1994) to validate the model. The best model selected in the validation phase was then used to predict the discharge of the Lor River for the period 2008–2011 (testing subset).

### Time series analysis

Time series analysis seeks to find the correlation between different variables (cross-correlation) as well as within the variable itself (autocorrelation). These trends or seasonal variations should be accounted for when implementing the prediction model (Sahu *et al.* 2009). In this study, univariate (autocorrelation) and bivariate (cross-correlation) methods are applied to study the Lor River hydrology using the following time series data: Julian day, precipitation, maximum temperature and discharge. The autocorrelation coefficient varies in the interval [−1, 1], and the representation of this parameter versus lag visually indicates the periodicity of the event throughout time. If the perturbation of the variable (e.g. rainfall) has a long effect in time, the slope is gentle, while if the event has a short range in the time series, the slope is more pronounced (Lee & Lee 2000; Sahu *et al.* 2009). The cross-correlation represents the relationship between the input series and the output series (Lee & Lee 2000), in this case precipitation and temperature vs. discharge. In the cross-correlation function, the delay is the time lag between lag 0 and the maximum cross-correlation coefficient value; this lag determines the transfer velocity of the system, for example how long it takes for the river discharge to respond to precipitation (Lee & Lee 2000; Sahu *et al.* 2009). The computer program PAST was used to obtain the auto- and cross-correlograms (Hammer *et al.* 2001).
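The delay described above can be illustrated with a minimal sketch. It assumes a plain Pearson correlation between the input series at day *t* and the output series at day *t* + lag; it is not the PAST implementation, and the function names and the synthetic series are hypothetical.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for lag = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = []
    for lag in range(max_lag + 1):
        # Pair each input value with the output observed `lag` steps later.
        x_part = x[: len(x) - lag]
        y_part = y[lag:]
        coeffs.append(np.corrcoef(x_part, y_part)[0, 1])
    return np.array(coeffs)

def delay(coeffs):
    """Lag at which the cross-correlation has its largest absolute value."""
    return int(np.argmax(np.abs(coeffs)))

# Synthetic data (not the Lor River records): flow follows rain one day later.
rng = np.random.default_rng(0)
rain = rng.gamma(shape=0.5, scale=10.0, size=400)
flow = np.zeros_like(rain)
flow[1:] = 0.8 * rain[:-1]
flow += rng.normal(0.0, 0.5, size=rain.size)

ccf = cross_correlation(rain, flow, max_lag=10)
print("delay (days):", delay(ccf))
```

The absolute value is used so that a negative relationship, such as the one reported here between temperature and discharge, is also located at its strongest lag.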

### ANNs

*S*_{j} is the propagation function for the intermediate neuron *j*, *N* corresponds with the number of neurons in the first layer (input layer), *w*_{ij} corresponds with the value of the importance (weight) between the input neuron *i* and the intermediate neuron *j*, and finally *b*_{j} corresponds with the bias associated to the neuron *j* of the intermediate layer. The value obtained by the propagation function is used by the activation function to provide an output value for each input entered into the ANN system (Astray *et al.* 2013). These values are propagated to all the neurons on the following layers, and finally to the last neuron in the network, the output neuron. The value provided by the output neuron (*y*_{o}) is compared with the experimental value (*d*_{o}), and therefore the error (*E*) in the prediction is calculated (Equation (3)).

*et al.* 1991). In this work the sigmoidal function (Equation (4)) was chosen.
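The forward pass described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the propagation function is taken to be the weighted sum of the inputs plus the bias, the logistic form of the sigmoid is assumed, a sigmoid is also assumed at the output neuron, and a squared-error form is assumed for *E*. The weights and inputs are random or hypothetical values.

```python
import numpy as np

def sigmoid(s):
    """Logistic sigmoid activation (assumed form)."""
    return 1.0 / (1.0 + np.exp(-s))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Output y_o of a network with one intermediate layer and one output neuron."""
    # Propagation function of each intermediate neuron j:
    # S_j = sum_i w_ij * x_i + b_j  (assumed from the symbol definitions)
    s_hidden = x @ w_hidden + b_hidden
    h = sigmoid(s_hidden)
    # The same propagation + activation steps applied at the output neuron.
    s_out = h @ w_out + b_out
    return sigmoid(s_out)

# Hypothetical 4-2-1 topology with random weights and normalised inputs.
rng = np.random.default_rng(1)
x = np.array([0.2, 0.5, 0.1, 0.7])
w_hidden = rng.normal(size=(4, 2))
b_hidden = np.zeros(2)
w_out = rng.normal(size=2)
b_out = 0.0

y_o = forward(x, w_hidden, b_hidden, w_out, b_out)
d_o = 0.6                           # hypothetical experimental value
error = 0.5 * (d_o - y_o) ** 2      # assumed squared-error form of E
print(y_o, error)
```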

### Statistical evaluation of ANNs
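The results below are reported in terms of *R*^{2} and RMSE. A minimal sketch of these two statistics follows. It assumes that *R*^{2} is the squared Pearson correlation of the linear fit between observed and predicted discharge and that RMSE is the usual root mean square error; the function names are hypothetical.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error, in the units of the data (here m^3 s^-1)."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((o - p) ** 2)))

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(o, p)[0, 1] ** 2)

# Hypothetical discharge values, for illustration only.
obs = [10.0, 12.0, 30.0, 18.0, 11.0]
pred = [11.0, 12.5, 24.0, 17.0, 10.5]
print(r_squared(obs, pred), rmse(obs, pred))
```

Note that a squared correlation measures linear association only, so a model that systematically underestimates high flows can still score a high *R*^{2}; RMSE is sensitive to such deviations.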

## RESULTS AND DISCUSSION

### Time series analysis of precipitation, temperature and discharge

To identify the relationship between discharge and the variables maximum temperature and precipitation, cross-correlation functions were calculated. The results show the cross-correlation coefficient vs. lag time between two variables (Figure 3(c) and 3(d)): precipitation–discharge and maximum temperature–discharge. In these models, precipitation and maximum temperature were used as input series, and discharge was used as the output series. The cross-correlograms show that the coefficient of correlation for precipitation–discharge is 0.53 for a lag time of 1 day. On the other hand, the cross-correlation between maximum temperature and discharge has a minimum value of −0.43 at a lag time of 19 days, indicating that the effect of temperature on water discharge takes longer than that of precipitation.

### Development and training of ANNs

The development of ANNs requires implementing a large number of networks by trial and error to obtain the best neural network. In this study, over 1,500 neural networks were implemented, with different input variables (Table 1), topologies and training cycles, to identify the networks with the best fit for previously unseen cases when predicting the discharge 1 (ANN_{1}), 2 (ANN_{2}) and 3 days (ANN_{3}) ahead. All neural network topologies have an input layer whose number of neurons depends on the type of neural network implemented (Table 1). Intermediate layers with different numbers of neurons were also constructed. Finally, the output layer had a single neuron.

| Variables | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 | T10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Julian day (Jd) | | | | | | | | | | |
| Precipitation (P) | | | | | | | | | | |
| Precipitation 1 day before (P_{−1}) | | | | | | | | | | |
| Precipitation 2 days before (P_{−2}) | | | | | | | | | | |
| Precipitation 3 days before (P_{−3}) | | | | | | | | | | |
| Maximum temperature (T_{max}) | | | | | | | | | | |
| Maximum temperature 1 day before (T_{max−1}) | | | | | | | | | | |
| Minimum temperature (T_{min}) | | | | | | | | | | |
| Discharge (Q) | | | | | | | | | | |
| Discharge 1 day before (Q_{−1}) | | | | | | | | | | |
| Discharge 2 days before (Q_{−2}) | | | | | | | | | | |
| Discharge 3 days before (Q_{−3}) | | | | | | | | | | |
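The lagged inputs listed in Table 1 (for example P_{−1} or Q_{−1}) and the target discharge 1, 2 or 3 days ahead can be assembled as in the sketch below. The helper `build_dataset` and the choice of variables in the example are hypothetical and only illustrate the idea; they do not reproduce any particular network type.

```python
import numpy as np

def build_dataset(series, lags, target, horizon):
    """Build an input matrix X and a target vector y.

    series : dict mapping a variable name to a 1-D daily series
    lags   : dict mapping a variable name to a list of lags in days
             (0 = same day, 1 = one day before, ...)
    target : 1-D daily discharge series
    horizon: number of days ahead to predict
    """
    target = np.asarray(target, dtype=float)
    max_lag = max(max(ks) for ks in lags.values())
    # Only days with a complete history and a known future target are usable.
    days = range(max_lag, len(target) - horizon)
    columns = []
    for name, ks in lags.items():
        values = np.asarray(series[name], dtype=float)
        for k in ks:
            columns.append([values[t - k] for t in days])
    X = np.array(columns).T
    y = np.array([target[t + horizon] for t in days])
    return X, y

# Toy series (not real data): inputs P, P_{-1} and Q; target Q two days ahead.
p = np.arange(10, dtype=float)
q = 10.0 * np.arange(10)
X, y = build_dataset({"P": p, "Q": q}, {"P": [0, 1], "Q": [0]}, target=q, horizon=2)
print(X.shape, y.shape)
```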

Table 2 shows the best neural networks implemented for each of the selected types of ANNs for the training phase (October 1984 to December 1992). This table also shows the linear fit coefficients (*R*^{2}) and the RMSE of the best neural networks for each type of input variable selection. The same types of input selection were studied for the predictions 1, 2 and 3 days ahead. As expected, the fits of the predictive models are better when the time window is smaller. The types of neural networks that offer the best results in the training phase use different selections of input variables for the different prediction days. Thus, the best ANN_{1} is type T5 (Topology 6-5-1), with *R*^{2} = 0.94 and RMSE = 3.57 m^{3}s^{−1}. For ANN_{2} the best type is T3 (Topology 8-5-1) and for ANN_{3} the best type is T1 (Topology 9-7-1). These findings are consistent with the time series analysis, which indicates that the lag time between precipitation and discharge is 1 day. In addition, the time of concentration (the time needed for water to flow from the farthest point in a watershed to the outlet) was calculated with the Temez expression (Temez 1991), and the result was 14 hours (less than our time step of 1 day).

Training phase (RMSE in m^{3}s^{−1})

| T | ANN_{1} Top. | ANN_{1} Cycles | ANN_{1} R^{2} | ANN_{1} RMSE | ANN_{2} Top. | ANN_{2} Cycles | ANN_{2} R^{2} | ANN_{2} RMSE | ANN_{3} Top. | ANN_{3} Cycles | ANN_{3} R^{2} | ANN_{3} RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 7-2-1 | 8·10^{4} | 0.922 | 4.18 | 7-3-1 | 8·10^{4} | 0.809 | 6.52 | 9-7-1 | 1·10^{5} | 0.761 | 7.30 |
| 2 | 4-2-1 | 3·10^{5} | 0.905 | 4.59 | 4-3-1 | 3.5·10^{5} | 0.790 | 6.84 | 4-3-1 | 2·10^{5} | 0.716 | 7.96 |
| 3 | 8-2-1 | 2·10^{5} | 0.923 | 4.14 | 8-5-1 | 3·10^{5} | 0.819 | 6.35 | 8-7-1 | 4·10^{5} | 0.733 | 7.72 |
| 4 | 6-2-1 | 4·10^{5} | 0.919 | 4.25 | 6-3-1 | 4·10^{5} | 0.803 | 6.62 | 6-3-1 | 8·10^{5} | 0.723 | 7.86 |
| 5 | 6-5-1 | 8·10^{5} | 0.943 | 3.57 | 6-2-1 | 8·10^{5} | 0.789 | 6.86 | 6-3-1 | 2·10^{5} | 0.722 | 7.87 |
| 6 | 5-2-1 | 8·10^{5} | 0.909 | 4.50 | 5-3-1 | 8·10^{5} | 0.797 | 6.73 | 5-3-1 | 2·10^{5} | 0.716 | 7.95 |
| 7 | 3-2-1 | 4·10^{5} | 0.904 | 4.63 | 3-2-1 | 2·10^{6} | 0.778 | 7.04 | 3-2-1 | 2·10^{5} | 0.697 | 8.22 |
| 8 | 5-2-1 | 8·10^{5} | 0.918 | 4.28 | 5-3-1 | 4·10^{5} | 0.803 | 6.63 | 5-4-1 | 2·10^{5} | 0.721 | 7.88 |
| 9 | 6-2-1 | 8·10^{5} | 0.919 | 4.24 | 6-2-1 | 2·10^{5} | 0.793 | 6.79 | 6-4-1 | 4·10^{5} | 0.72 | 7.91 |
| 10 | 7-2-1 | 4·10^{5} | 0.920 | 4.22 | 7-4-1 | 1·10^{5} | 0.807 | 6.56 | 7-4-1 | 3·10^{5} | 0.733 | 7.72 |

Validation phase (RMSE in m^{3}s^{−1})

| T | ANN_{1} Top. | ANN_{1} Cycles | ANN_{1} R^{2} | ANN_{1} RMSE | ANN_{2} Top. | ANN_{2} Cycles | ANN_{2} R^{2} | ANN_{2} RMSE | ANN_{3} Top. | ANN_{3} Cycles | ANN_{3} R^{2} | ANN_{3} RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 7-2-1 | 8·10^{4} | 0.894 | 5.71 | 7-3-1 | 8·10^{4} | 0.804 | 7.88 | 9-7-1 | 1·10^{5} | 0.769 | 8.60 |
| 2 | 4-2-1 | 3·10^{5} | 0.917 | 5.15 | 4-3-1 | 3.5·10^{5} | 0.835 | 7.41 | 4-3-1 | 2·10^{5} | 0.803 | 8.02 |
| 3 | 8-2-1 | 2·10^{5} | 0.891 | 5.78 | 8-5-1 | 3·10^{5} | 0.805 | 7.82 | 8-7-1 | 4·10^{5} | 0.778 | 8.28 |
| 4 | 6-2-1 | 4·10^{5} | 0.892 | 5.78 | 6-3-1 | 4·10^{5} | 0.804 | 7.92 | 6-3-1 | 8·10^{5} | 0.762 | 8.72 |
| 5 | 6-5-1 | 8·10^{5} | 0.902 | 5.66 | 6-2-1 | 8·10^{5} | 0.805 | 7.90 | 6-3-1 | 2·10^{5} | 0.756 | 8.78 |
| 6 | 5-2-1 | 8·10^{5} | 0.912 | 5.22 | 5-3-1 | 8·10^{5} | 0.828 | 7.49 | 5-3-1 | 2·10^{5} | 0.770 | 8.56 |
| 7 | 3-2-1 | 4·10^{5} | 0.875 | 6.30 | 3-2-1 | 2·10^{6} | 0.804 | 8.04 | 3-2-1 | 2·10^{5} | 0.745 | 9.18 |
| 8 | 5-2-1 | 8·10^{5} | 0.890 | 5.86 | 5-3-1 | 4·10^{5} | 0.804 | 7.87 | 5-4-1 | 2·10^{5} | 0.678 | 10.09 |
| 9 | 6-2-1 | 8·10^{5} | 0.888 | 5.89 | 6-2-1 | 2·10^{5} | 0.798 | 8.01 | 6-4-1 | 4·10^{5} | 0.707 | 9.58 |
| 10 | 7-2-1 | 4·10^{5} | 0.892 | 5.78 | 7-4-1 | 1·10^{5} | 0.804 | 7.86 | 7-4-1 | 3·10^{5} | 0.744 | 9.05 |

### Validation of ANNs

Once the fit of the different ANNs in the training phase had been calculated for each type of model, the neural networks were validated using the period 1993–1994. As Table 2 shows, the models with the best fit in the validation phase are those of type 2 (Julian day, precipitation, maximum temperature and discharge). Again, the fit of the neural network predicting the discharge 1 day ahead (*R*^{2} = 0.917, RMSE = 5.15 m^{3}s^{−1}) is better than that of ANN_{2} (*R*^{2} = 0.835, RMSE = 7.41 m^{3}s^{−1}) and ANN_{3} (*R*^{2} = 0.803, RMSE = 8.02 m^{3}s^{−1}) (Table 2). The latter two ANNs have a lower predictive power than ANN_{1}, but still show high *R*^{2} coefficients, always over 0.80.

The best neural network should be chosen not on the basis of the fit in the training phase but on the best fit in the validation phase, because the validation cases are truly unknown to the neural network and therefore give a better idea of how it will fit future data. In this sense, and as mentioned above, the best-fitting networks are those of type 2 (T2).

For ANN_{1}, the best implemented network has a topology of 4-2-1 and was trained for 300,000 (3·10^{5}) cycles, giving an *R*^{2} coefficient of 0.917 and the lowest RMSE (5.15 m^{3}s^{−1}), which represents an APD of 18.9% (Table 2 and Figure 4(a) and 4(b)). This ANN was trained with a starting learning rate of 0.6 (to control the variation of the weights (Yu *et al.* 1995)) and a momentum of 0.8 (to speed up convergence and maintain generalization power (Istook & Martinez 2002)); both were decreased during the training phase of the neural network. The ANN_{2} for predicting the discharge 2 days ahead has a topology of 4-3-1 (Table 2 and Figure 4(c) and 4(d)). This ANN_{2} presents a good fit (*R*^{2} = 0.835 and RMSE = 7.41 m^{3}s^{−1}) that corresponds with an APD of 15.9%. Finally, the ANN_{3} to predict the discharge 3 days ahead presents a good predictive power for previously unseen cases; its fit is, as expected, worse than those of ANN_{1} and ANN_{2}, with *R*^{2} of 0.803 and RMSE of 8.02 m^{3}s^{−1}, which corresponds with an APD of 19.3% (Table 2 and Figure 4(e) and 4(f)). ANN_{2} and ANN_{3} have the same topology, with a starting learning rate of 0.6 and a momentum of 0.8; both parameters were decreased during the training phase, but the number of training cycles differed, 3.5·10^{5} and 2·10^{5}, respectively.

The predictions 2 and 3 days ahead are 44.0 and 55.9% worse, respectively, than the prediction 1 day ahead. This large difference between the fits of ANN_{2} and ANN_{3} and that of ANN_{1} is directly related to the lag identified by the cross-correlation function for precipitation and discharge (1 day); thus, trying to predict the discharge of the Lor River more than 3 days ahead is, as was thought at first, an unnecessary task.

The temporal distribution for the three ANNs is shown in Figure 4(b), 4(d) and 4(f), which show a good fit between observed (grey shading) and predicted (black line) discharge for low flows. For periods of high flows the models tend to underestimate the discharge, and this is more pronounced for ANN_{2} and ANN_{3}.

The importance of each input variable depends on the weights connecting its input neuron to the neurons in the intermediate layer; the sum of the absolute values of these weights determines the importance of the input variable for predicting the discharge (Table 3).

| | Julian day | Precipitation | T_{max} | Discharge |
|---|---|---|---|---|
| ANN_{1} | 2.5 | 18.0 | 4.5 | 74.9 |
| ANN_{2} | 7.2 | 16.0 | 15.7 | 61.2 |
| ANN_{3} | 17.3 | 9.1 | 4.6 | 69.1 |
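The weight-based importance measure can be sketched as follows. The sketch assumes that the importance of each input is the sum of the absolute weights linking it to the intermediate layer, expressed as a percentage of the total over all inputs; the normalisation to percentages is an assumption based on the rows of Table 3 adding up to roughly 100. The weight matrix is hypothetical.

```python
import numpy as np

def input_importance(w_hidden):
    """Percentage share of summed absolute input-to-intermediate weights.

    w_hidden has one row per input neuron and one column per
    intermediate neuron.
    """
    totals = np.abs(w_hidden).sum(axis=1)
    return 100.0 * totals / totals.sum()

# Hypothetical weights of a 4-2-1 network
# (rows: Julian day, precipitation, Tmax, discharge).
w_hidden = np.array([
    [0.1, -0.1],
    [0.5, 0.4],
    [-0.2, 0.1],
    [2.0, -1.7],
])
importance = input_importance(w_hidden)
print(np.round(importance, 1))
```

Because only the first layer of weights is used, this measure ignores how strongly each intermediate neuron feeds the output neuron, so it is best read as a rough ranking.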

For the prediction 2 days ahead (ANN_{2}) the temperature becomes more important compared with the other two models implemented. This effect can be related to the importance of temperature in evapotranspiration. To confirm that, the daily potential evapotranspiration (PET) was calculated using Hamon's method (Dingman 2008) (Equation (7)), where *D* corresponds with day length in hours and *e**_{a}(T_{a}) corresponds with the saturation vapour pressure at the mean daily temperature and was calculated by Equation (8) (Dingman 2008).
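A minimal sketch of a Hamon-type PET calculation is given below. It assumes the commonly cited form of Hamon's equation, PET = 29.8 · *D* · *e**_{a}(T_{a}) / (T_{a} + 273.2), with PET in mm per day, *D* in hours, temperature in °C and vapour pressure in kPa, together with the exponential approximation *e** = 0.611 · exp(17.3 T / (T + 237.3)). These coefficients are assumptions and should be checked against Equations (7) and (8) of the paper.

```python
import math

def saturation_vapour_pressure(t_a):
    """Saturation vapour pressure (kPa) at air temperature t_a (deg C).

    Assumed exponential approximation; coefficients are not taken
    from the text.
    """
    return 0.611 * math.exp(17.3 * t_a / (t_a + 237.3))

def hamon_pet(day_length_h, t_a):
    """Daily potential evapotranspiration (mm/day), assumed Hamon form.

    day_length_h : day length D in hours
    t_a          : mean daily air temperature in deg C
    """
    e_star = saturation_vapour_pressure(t_a)
    return 29.8 * day_length_h * e_star / (t_a + 273.2)

# Hypothetical day: 12 hours of daylight, mean temperature 20 deg C.
print(round(hamon_pet(12.0, 20.0), 2))
```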

### Testing of ANNs

Once the top three neural networks had been developed, validated and selected (model 4-2-1 for 1 day prediction ahead, and model 4-3-1 for 2 and 3 days prediction ahead), the river discharge of years 2008 to 2011 (testing period) was predicted.

For the prediction 1 day ahead, the fit obtained for the testing period (*R*^{2} = 0.908) is similar to that obtained for the validation period (*R*^{2} = 0.917). Similar results were obtained for the discharge predictions 2 and 3 days ahead, although in these cases the difference is larger, around 12.4% and 21.9%, respectively. Nevertheless, the values of RMSE are better in the testing period for all days ahead: 5.15 m^{3}s^{−1} (validation) vs. 3.74 m^{3}s^{−1} (testing) (27.3% less) for 1 day ahead, 7.41 m^{3}s^{−1} vs. 6.42 m^{3}s^{−1} (13.3%) for 2 days ahead, and 8.02 m^{3}s^{−1} vs. 7.58 m^{3}s^{−1} (5.5%) for 3 days ahead. This is probably related to the fact that in the validation period a large error was observed in a short period of high flows, resulting in a significant increase of the RMSE value. As in the validation period, the ANNs underestimate discharge in high flow periods, and this is more pronounced for ANN_{2} and ANN_{3}.

| Period | ANN_{1} (1 day ahead) R^{2} | ANN_{1} RMSE | ANN_{2} (2 days ahead) R^{2} | ANN_{2} RMSE | ANN_{3} (3 days ahead) R^{2} | ANN_{3} RMSE |
|---|---|---|---|---|---|---|
| 1993–1994 | 0.917 | 5.15 | 0.835 | 7.41 | 0.803 | 8.02 |
| 2008–2011 | 0.908 | 3.74 | 0.731 | 6.42 | 0.635 | 7.58 |

## CONCLUSIONS

In this study, the combination of time series analysis and ANNs has provided useful information to analyse the hydrologic behaviour of an undammed river, such as the lag time between precipitation and discharge and the prediction of discharge 1, 2 and 3 days ahead.

The key findings are summarized below.

A lag time of 1 day is found between precipitation and discharge of Lor River. Knowledge of the lag of the river discharge is important to determine the optimal time window threshold to predict the discharge.

The ANNs implemented in this study have shown a good ability to predict discharge 1, 2 and 3 days ahead. In the validation period, all models feature a high linear correlation, with *R*^{2} always greater than 0.80.

The prediction in the testing period shows a good correlation for 1 day ahead (*R*^{2} = 0.91), but for 2 and 3 days ahead the correlation worsened considerably because the time lag between precipitation and discharge of this river is 1 day, and therefore there is no clear relation between precipitation and discharge beyond this 1 day period.

Taking into account the different models implemented and the results obtained, ANNs have proved to be a valid tool to predict the Lor River discharge, with lower RMSE for 1 day ahead; for 2 and 3 days ahead, however, the models present an important deviation, especially in periods of high flows, in which stream discharge is underestimated.

Long-term predictions using a small number of variables must be considered with caution. The lead time achievable when forecasting water discharge with ANNs is probably limited by the lag time between rainfall and river discharge.

## ACKNOWLEDGEMENTS

G. Astray thanks Xunta de Galicia, Consellería de Cultura, Educación e Ordenación Universitaria, for the Postdoctoral grant (Plan I2C) and CIA Project for financial support to develop this communication. M. A. Iglesias thanks University of Vigo for his predoctoral fellowship which supported this research.