ABSTRACT
This study examines the complexity of hydrometeorological variables in the Savitri River basin in India. Specifically, it estimates the dimensionality of daily rainfall, runoff, maximum temperature, minimum temperature, pan evaporation, relative humidity, sunshine duration, and wind speed observed during 2000–2010 at two stations (Kangule and Birwadi). The false nearest neighbour (FNN) algorithm is employed to estimate the dimensionality of each variable. The dimensionality represents the number of variables dominantly governing the system. The FNN dimension values of the eight daily hydrometeorological series from each station range between 4 and 7, which may be considered as exhibiting a medium level of complexity. Among the eight series, wind speed is found to be the least complex, whereas minimum temperature and sunshine duration are the most complex. The effect of temporal scale on the complexity of the hydrometeorological variables is also examined by analysing the hourly rainfall and runoff series. The results indicate that, for both rainfall and runoff, the hourly data exhibit greater complexity, with two to three additional influential variables. The present results have important implications for hydrometeorological modelling, prediction, and disaster management in the small and flood-prone Savitri River basin.
HIGHLIGHTS
The complexity of eight daily hydrometeorological variables in the Savitri River basin in India is examined.
The false nearest neighbour algorithm is employed.
The eight variables exhibit medium complexity, with dimensions between 4 and 7.
Wind speed exhibits the least complexity, and sunshine duration and minimum temperature are the most complex.
Rainfall and runoff exhibit higher complexity at the hourly scale than at the daily scale.
INTRODUCTION
Hydrometeorological variables are generally complex, nonlinear, and interdependent in nature. Understanding the level of complexity of hydrometeorological variables is important and useful for a wide range of purposes, including selection of appropriate complexity of model, prediction, and interpolation/extrapolation.
Due to the complex, irregular, and seemingly random nature of hydrometeorological variables, stochastic modelling approaches have been, and continue to be, widely employed for modelling and prediction of hydrometeorological variables (Feller 1950; Matalas 1967; Salas et al. 1995; Buishand & Brandsma 2001; Prairie et al. 2007; Li & Singh 2014). However, advances in nonlinear dynamic theories, especially chaos theory (e.g., Lorenz 1963), have revealed that complex, irregular, and random-looking behaviour could also arise from simple nonlinear deterministic systems with a few interdependent variables and sensitive dependence on initial conditions. Motivated by this discovery, several methods have subsequently been developed to identify (qualitatively or quantitatively) the chaotic behaviour of a time series. These methods include phase space reconstruction (Packard et al. 1980), correlation dimension method (Grassberger & Procaccia 1983), Lyapunov exponent method (Wolf et al. 1985), local approximation prediction method (Farmer & Sidorowich 1987), false nearest neighbour (FNN) method (Kennel et al. 1992), and close returns plot (Gilmore 1993).
Applications of chaos theory methods for the identification of the existence of chaotic behaviour in hydrometeorological variables and their prediction have been an important area of research in recent decades; see, for example, Tsonis (1992), Abarbanel (1996), and Sivakumar (2004, 2017) for comprehensive accounts. While rainfall and streamflow time series have been far more widely studied (Rodriguez-Iturbe et al. 1989; Wilcox et al. 1991; Jayawardena & Lai 1994; Sivakumar et al. 2001, 2014; Islam & Sivakumar 2002; Khatibi et al. 2012; Zounemat-Kermani 2016; Mihailović et al. 2019), there have been notable studies on other hydrometeorological variables as well, including temperature, pressure, wind speed, relative humidity, sunshine duration, and lake water level (Fraedrich 1986; Tsonis & Elsner 1988; Zeng et al. 1992; Wang 1995; Sangoyomi et al. 1996; Jayawardena & Gurung 2000; Millán et al. 2010; Ogunjo et al. 2017; Fuwape et al. 2017; Özgür & Yilmaz 2022). Some studies have explored the multivariable analysis of hydrometeorological data for chaos identification and prediction (Porporato & Ridolfi 2001; Sivakumar et al. 2005; Dhanya & Kumar 2011; Tongal 2020).
In the context of identification of the level of complexity and chaotic behaviour in hydrometeorological time series (or any time series, for that matter), a widely used indicator is ‘dimensionality’, such as the one resulting from the correlation dimension and FNN methods. The dimensionality is generally considered to be an indication of the number of variables dominantly governing the system dynamics and, hence, a representation of the minimum number of variables required to model the system. This means that a time series yielding a higher dimension value generally indicates a more complex system and, hence, the need for a more complex model, whereas a time series yielding a lower dimension value generally indicates a less complex system that can be represented by a less complex model.
From the dimensionality perspective in particular (and others, such as prediction), the outcomes of the chaos studies are generally encouraging, as they suggest less complex or simpler models for studying hydrometeorological systems. However, it should also be noted that only a very few of these studies have examined multiple hydrometeorological variables from the same basin for identification of chaotic behaviour, while a significant majority have examined only a single variable. Since different hydrometeorological variables in the same basin may exhibit different types of behaviours and extent of complexity, investigation of the dynamic behaviour and complexity of each of the variables is important for a more complete understanding and interpretation of the system dynamics and development of suitable models. This is particularly crucial in the case of small river basins that are susceptible to floods and flash floods. This provides the motivation for the present study.
The present study assesses the complexity of multiple hydrometeorological variables in a small river basin: the Savitri River basin in the state of Maharashtra, India. The study examines eight different hydrometeorological variables: rainfall, runoff, maximum temperature, minimum temperature, pan evaporation, relative humidity, sunshine duration, and wind speed. Daily data of these variables observed during the monsoon season (July–September) over a period of 11 years (2000–2010) from two stations (Kangule and Birwadi) are analysed. The FNN algorithm (Kennel et al. 1992) is employed to estimate the dimensionality of the time series. An attempt is also made to examine the effect of temporal scale on the complexity of hydrometeorological variables by analysing the available hourly rainfall and runoff data from the two stations.
As mentioned earlier, there exist several methods for estimating the dimensionality and complexity of a time series. The FNN method is used in this study for estimation of dimensionality and identification of complexity, since this method is considered to be a more refined and better estimator of dimensionality. Indeed, the FNN method was specifically designed to overcome certain potential limitations that may exist in some other widely used dimensionality methods, such as the correlation dimension method. In particular, the FNN method helps in eliminating the ‘false’ neighbours in the reconstructed phase space of the time series, a problem generally encountered in the correlation dimension method. In addition, the FNN method has some other advantages: it is more efficient and generally works well even for short and noisy time series, which is generally the case in hydrometeorological time series; see, for example, Tsonis et al. (1993), Wang & Gan (1998), Schreiber & Kantz (1996), and Sivakumar (2000, 2001, 2005) for details on the influence of data size, noise, and other data-related issues in chaos identification and prediction.
The rest of this paper is organised as follows. Section 2 gives a brief account of the study area and details of the data used in this study. Section 3 describes the FNN method. Section 4 presents the results obtained from the study and also discusses them. Section 5 reports the conclusions drawn from this work.
STUDY AREA AND DATA
The Savitri River basin comprises several tributaries. The major tributaries are Kundalika (58.7 km), Gandhari (20.9 km), Kal (45.5 km), and Savitri (38.6 km). The Savitri River, originating from the Mahabaleshwar plateau, flows down and joins the Arabian Sea at the Bankot Creek. The location of the basin and the reach of each sub-basin are shown in Figure 1. All four major tributaries and other streams are short in length with small areas but experience high-intensity rainfall in the range of 61–112 mm/h during the monsoon season, which makes the basin prone to flash floods. For instance, a flash flood in 2016 led to the Mahad bridge collapse, in which 10 bodies were recovered and 38 persons were feared to have been washed away (The Hindu 2016). The frequent floods and the associated impacts in the Savitri River basin necessitate a better understanding of the complexity of the individual hydrometeorological variables and their interactions, for more accurate modelling and prediction of hydrometeorological variables and proper disaster prevention and mitigation measures in the basin.
The hydrometeorological data pertaining to the Savitri River basin are collected from the Hydrology Project, Nasik, India. There are seven hydrometeorological stations in this basin. For this study, data for eight different variables are collected: rainfall, runoff, maximum temperature, minimum temperature, pan evaporation, relative humidity, sunshine duration, and wind speed. Table 1 lists the names of the stations and the periods of data available from each station for each variable. There are five stations for rainfall (Ambiwali, Kangule, Birwadi, Varandoli, and Waki (kd)), four stations for runoff (Kangule, Bhave, Birwadi, and Kokkare), and two meteorological stations (Kangule and Birwadi) for the remaining variables. However, the available data length/period of each variable varies for the different stations. Considering that using a common period in which data for all the hydrometeorological variables are available would be better for consistency in the analysis and interpretation of the results, only monsoon data (July to September) over the period 2000–2010 from only two stations, namely Kangule and Birwadi, are used in this study.
| Variable | Name of station | Data length |
|---|---|---|
| Daily rainfall | Ambiwali | 01/06/1988–30/09/2012 |
| | Kangule | 01/06/1988–30/09/2011 |
| | Birwadi | 01/06/1988–30/09/2011 |
| | Varandoli | 01/06/1988–30/09/2011 |
| | Waki (kd) | 01/06/1988–30/09/2011 |
| Daily runoff | Kangule | 30/06/1995–29/10/2010 |
| | Bhave | 01/06/2000–15/10/2011 |
| | Birwadi | 01/06/2000–31/10/2011 |
| | Kokkare | 15/06/1999–21/10/2012 |
| Daily maximum temperature | Kangule | 01/01/2000–31/12/2011 |
| | Birwadi | 01/01/2000–31/12/2011 |
| Daily minimum temperature | Kangule | 01/01/2000–31/12/2011 |
| | Birwadi | 01/01/2000–31/12/2011 |
| Daily pan evaporation | Kangule | 01/01/2000–31/12/2011 |
| | Birwadi | 01/01/2000–31/12/2011 |
| Daily relative humidity | Kangule | 01/01/2000–31/12/2011 |
| | Birwadi | 01/01/2000–31/12/2011 |
| Daily sunshine duration | Kangule | 01/01/2000–31/12/2010 |
| | Birwadi | 01/01/2000–31/12/2010 |
*Remark: The common data period (July, August, and September from 2000 to 2010) is used in this study.
| Data | Station | μ | σ | cv | α | β | Ω (% Ω) | Max | Min |
|---|---|---|---|---|---|---|---|---|---|
| Pd (mm) | Kangule | 26.56 | 38.30 | 1.44 | 3.55 | 24.28 | 130 (12.84) | 442 | 0.00 |
| | Birwadi | 32.04 | 45.46 | 1.42 | 3.17 | 17.36 | 126 (12.45) | 403 | 0.00 |
| Qd (m3/s) | Kangule | 103.26 | 160.61 | 1.56 | 4.59 | 34.19 | 0 (0) | 1,743.20 | 1.16 |
| | Birwadi | 182.60 | 231.21 | 1.27 | 3.35 | 18.17 | 0 (0) | 1,980.95 | 8.21 |
| Tx (°C) | Kangule | 27.77 | 2.08 | 0.08 | 0.70 | 0.23 | 0 (0) | 34.90 | 23.50 |
| | Birwadi | 27.92 | 2.08 | 0.07 | 0.48 | −0.30 | 0 (0) | 35.00 | 23.50 |
| Tn (°C) | Kangule | 23.88 | 1.27 | 0.05 | −1.21 | 2.41 | 0 (0) | 26.50 | 18.00 |
| | Birwadi | 24.12 | 0.84 | 0.03 | −0.60 | 2.27 | 0 (0) | 27.00 | 19.00 |
| EVP (mm/day) | Kangule | 1.26 | 0.82 | 0.65 | 0.99 | 0.62 | 0 (0) | 4.40 | 0.20 |
| | Birwadi | 1.36 | 1.50 | 1.10 | 1.36 | 0.80 | 0 (0) | 7.10 | 0.20 |
| RH (%) | Kangule | 89.83 | 4.58 | 0.05 | −0.76 | −0.06 | 0 (0) | 96.00 | 75.00 |
| | Birwadi | 89.70 | 5.17 | 0.06 | −1.08 | 1.06 | 0 (0) | 98.00 | 64.00 |
| SD (h) | Kangule | 2.01 | 2.50 | 1.24 | 1.14 | 0.18 | 387 (38.24) | 9.90 | 0.00 |
| | Birwadi | 1.77 | 2.38 | 1.35 | 1.31 | 0.56 | 402 (39.72) | 9.50 | 0.00 |
| WD (km/h) | Kangule | 2.63 | 1.52 | 0.58 | 0.88 | 0.82 | 0 (0) | 8.57 | 0.07 |
| | Birwadi | 2.73 | 1.52 | 0.56 | 1.19 | 1.73 | 0 (0) | 9.08 | 0.07 |
μ, mean; σ, standard deviation; cv, coefficient of variation; α, skewness; β, kurtosis; Ω, number of zeros; % Ω, percentage of zeros; Min, minimum; Max, maximum; Pd, daily rainfall (mm/day); Qd, daily runoff (m3/s); Tx, maximum temperature (°C); Tn, minimum temperature (°C); EVP, pan evaporation (mm/day); RH, relative humidity (%); SD, sunshine duration (h); WD, wind speed (km/h).
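The descriptive statistics in the table above can be reproduced for any series with a few lines of NumPy. The following is a minimal sketch, not the authors' procedure: the function name `summary_stats` is hypothetical, and the use of the sample standard deviation and of excess kurtosis (kurtosis minus 3) are assumptions, since the text does not state which conventions were used.

```python
import numpy as np

def summary_stats(x):
    """Descriptive statistics of a 1-D series, in the style of the table above."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std(ddof=1)                    # sample standard deviation (assumed convention)
    z = (x - mu) / x.std(ddof=0)             # standardised values used for the moment ratios
    n_zero = int(np.sum(x == 0))             # number of zero values (e.g. dry days)
    return {
        "mean": mu,
        "std": sigma,
        "cv": sigma / mu,                    # coefficient of variation
        "skewness": np.mean(z ** 3),
        "kurtosis": np.mean(z ** 4) - 3.0,   # excess kurtosis (assumed convention)
        "zeros": n_zero,
        "pct_zeros": 100.0 * n_zero / x.size,
        "max": x.max(),
        "min": x.min(),
    }
```

As a consistency check on the tabulated values, the coefficient of variation of daily rainfall at Kangule is 38.30/26.56 ≈ 1.44, which agrees with the cv column.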
METHODOLOGY
The FNN method, introduced by Kennel et al. (1992), is based on the concept that the neighbour of point $y(n)$, denoted as $y^{(r)}(n)$, at dimension $m$ will remain a neighbour when the dimension is increased to $m+1$ if it is a true neighbour. If it is a false neighbour, then $y^{(r)}(n)$ moves farther away at dimension $m+1$. The FNN method starts with finding the Euclidean distance between all possible pairs of points in the phase space at dimension $m$ and finding the nearest neighbour for each point. This is repeated for dimension $m+1$ as well. The next step is to identify the true set of nearest neighbours when the dimension is increased from $m$ to $m+1$. Two criteria are generally used to decide whether a pair of points are true neighbours or not when the dimension is increased from $m$ to $m+1$. These are as follows:
- 1) Distance tolerance: This criterion is used to check whether the nearest neighbours have moved far apart by increasing the dimension. The distance tolerance is given by:

  $$\sqrt{\frac{R_{m+1}^{2}(n,r) - R_{m}^{2}(n,r)}{R_{m}^{2}(n,r)}} > R_{tol}$$

  where $R_{m+1}^{2}(n,r)$ is the square of the Euclidean distance between point $n$ and its nearest neighbour ($r$) in dimension $m+1$, $R_{m}^{2}(n,r)$ is the square of the Euclidean distance between point $n$ and its nearest neighbour ($r$) in dimension $m$, and $R_{tol}$ is the threshold that takes values greater than 10 and less than 50. In this study, the value is taken as 15, as this has been adopted and recommended by several earlier studies (e.g., Sangoyomi et al. 1996). If the distance tolerance criterion is satisfied, then the pair of points is identified as false neighbours.
- 2) Loneliness tolerance: This criterion is necessary to account for short and finite data sets. For such data, the points get stretched far apart as the analysis is repeated at higher embedding dimensions, but they cannot be stretched farther than the extent of the attractor. Therefore, it is important to consider this criterion in the analysis of hydrometeorological time series, since such time series are usually short and finite. The loneliness tolerance is given by:

  $$\frac{R_{m+1}(n,r)}{R_{A}} > A_{tol}$$

  where $R_{m+1}(n,r)$ is the distance between point $n$ and its neighbour ($r$) in dimension $m+1$, $R_{A}$ is the size of the attractor (here it is taken as the standard deviation of the time series), and $A_{tol}$ is the threshold. Generally, this threshold value is found by experimentation. In this study, a value of 2 is adopted, based on earlier studies (e.g., Sangoyomi et al. 1996).
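The two criteria above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the code used in the study: the function name `fnn_percentage` and the parameter names `r_tol` and `a_tol` are hypothetical, the nearest neighbour is found by brute force over all pairs, and the phase space is built from delay vectors of the single series with delay `tau`. The defaults follow the thresholds quoted in the text (15 for the distance tolerance and 2 for the loneliness tolerance).

```python
import numpy as np

def fnn_percentage(x, m, tau=1, r_tol=15.0, a_tol=2.0):
    """Percentage of false nearest neighbours when going from dimension m to m + 1."""
    x = np.asarray(x, dtype=float)
    n_vec = x.size - m * tau           # delay vectors that exist in both m and m + 1
    if n_vec < 2:
        raise ValueError("series too short for this m and tau")
    attractor_size = x.std()           # attractor size taken as the series standard deviation

    # Squared Euclidean distances between all pairs of m-dimensional delay
    # vectors, accumulated one coordinate at a time.
    d2 = np.zeros((n_vec, n_vec))
    for k in range(m):
        col = x[k * tau : k * tau + n_vec]
        d2 += (col[:, None] - col[None, :]) ** 2
    np.fill_diagonal(d2, np.inf)       # a point is not its own neighbour

    idx = np.arange(n_vec)
    nn = d2.argmin(axis=1)             # nearest neighbour of each point in dimension m
    r2_m = d2[idx, nn]                 # squared distance to it in dimension m

    # Going to m + 1 adds one coordinate, x(n + m*tau), to every delay vector.
    extra = x[idx + m * tau] - x[nn + m * tau]
    r2_m1 = r2_m + extra ** 2          # squared distance in dimension m + 1

    # Criterion 1 (distance tolerance), written without a division so that
    # zero distances (e.g. runs of dry days) do not cause warnings.
    far_apart = extra ** 2 > (r_tol ** 2) * r2_m
    # Criterion 2 (loneliness tolerance).
    lonely = np.sqrt(r2_m1) > a_tol * attractor_size

    return 100.0 * np.mean(far_apart | lonely)
```

The all-pairs distance matrix needs memory proportional to the square of the series length, which is manageable for about a thousand daily values but may call for a tree-based neighbour search on long hourly records.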
By considering these two criteria, the percentage of false nearest neighbours (%FNN) at each dimension is found. The embedding dimension at which the %FNN becomes zero or attains its minimum value is identified as the optimum embedding dimension ($m_{opt}$). This optimum embedding dimension value is the minimum number of variables required (i.e., dominant governing variables) to represent the complexity of the system. Based on the optimum dimension value, also termed the ‘FNN dimension’ here, the system can be categorised as low-, medium-, or high-dimensional. However, there is currently no clear quantitative guideline in the literature for identifying low, medium, and high levels of complexity of time series based on dimensionality. Oftentimes, identification of low, medium, and high levels of complexity is made according to the range of dimensions obtained, especially when data of multiple variables or multiple stations are involved; see, for example, Sivakumar & Singh (2012), Sivakumar et al. (2014), and Vignesh et al. (2015) for further details about how the classification has been made according to the range of dimensions obtained in these studies.
Based on the %FNN values for different dimensions, system behaviour can be identified as least complex or highly complex as follows, as reported by Kennel et al. (1992): (i) The FNN plot for a series with less complexity in the ideal case should be decreasing from 100% and reaching zero or minimum value at the optimum embedding dimension and remaining at that value beyond the optimum embedding dimension; (ii) the FNN percentage for a highly complex system will decrease from 100% and reach a minimum value, but then again rise to 100% or go beyond 100%; and (iii) if the FNN percentage decreases from 100% and reaches zero or a minimum value and then again increases but does not reach 100% as the dimension increases, then such can be considered to indicate less complex behaviour with some amount of noise in the data.
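The three cases above can be turned into a simple rule applied to a list of %FNN values. The sketch below is only one possible reading of those cases: the function name `describe_fnn_curve` and the tolerance `flat_tol` (how much the curve may rise after its minimum and still count as "remaining at that value") are assumptions introduced here for illustration and do not come from the text.

```python
def describe_fnn_curve(pct_fnn, flat_tol=1.0):
    """Return (optimum dimension, behaviour) for %FNN values at m = 1, 2, ..."""
    # Optimum embedding dimension: first dimension at which %FNN is minimal.
    m_opt = min(range(len(pct_fnn)), key=lambda i: pct_fnn[i]) + 1
    floor = pct_fnn[m_opt - 1]
    # Largest %FNN reached at dimensions beyond the optimum (if any).
    peak_after = max(pct_fnn[m_opt:], default=floor)

    if peak_after >= 100.0:
        behaviour = "highly complex"             # case (ii): rises back to 100%
    elif peak_after - floor > flat_tol:
        behaviour = "less complex with noise"    # case (iii): rises, but stays below 100%
    else:
        behaviour = "less complex"               # case (i): stays at the minimum
    return m_opt, behaviour
```

For example, `describe_fnn_curve([100, 30, 10, 25, 40])` returns `(3, "less complex with noise")`, because the curve reaches its minimum at dimension 3 and then rises without returning to 100%.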
RESULTS AND DISCUSSION
The FNN analysis is performed on the time series of the eight hydrometeorological variables from the Savitri River basin. The FNN analysis starts with the reconstruction of the phase space. In this study, embedding dimension values (m) from 1 to 15 are used for phase space reconstruction. Regarding the selection of delay time (τ), various methods and guidelines have been made available and applied. These include the autocorrelation function (Holzfuss & Mayer-Kress 1986), average mutual information (Fraser & Swinney 1986), and correlation integral (Liebert & Schuster 1989). Some studies have used the delay time window, instead of just the delay time (Martinerie et al. 1992). However, none of these methods or guidelines has proven to be definitive for choosing τ. Extensive details of the delay time selection methods and associated issues are already available in the literature; see, for example, Sivakumar (2017). In the present study, for simplicity and consistency, a delay time of 1, which is equal to the sampling time, is used for phase space reconstruction. The optimum embedding dimension (i.e., FNN dimension) is identified as the dimension of the phase space corresponding to the minimum percentage of false nearest neighbours (%FNN) value. Based on the FNN dimensions obtained for the eight hydrometeorological variables in the present study, we consider 1–3 as low-dimensional, 4–7 as medium-dimensional, and >7 as high-dimensional and, correspondingly, the level of complexity as low, medium, and high, respectively. As mentioned earlier, there is no clear quantitative guideline available in the literature for the classification of the level of complexity based on dimensionality.
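As an illustration of the two steps described above, the sketch below builds the delay vectors used for phase space reconstruction and maps an FNN dimension to the complexity classes adopted in this study (1–3 low, 4–7 medium, >7 high). The function names `reconstruct_phase_space` and `complexity_level` are hypothetical, and the code is a minimal sketch rather than the implementation used for the results reported here.

```python
import numpy as np

def reconstruct_phase_space(x, m, tau=1):
    """Rows are delay vectors [x(n), x(n + tau), ..., x(n + (m - 1) * tau)]."""
    x = np.asarray(x, dtype=float)
    n_vec = x.size - (m - 1) * tau     # number of complete delay vectors
    if n_vec <= 0:
        raise ValueError("series too short for this m and tau")
    return np.column_stack([x[k * tau : k * tau + n_vec] for k in range(m)])

def complexity_level(fnn_dimension):
    """Class limits as adopted in this study; they are not a general standard."""
    if fnn_dimension <= 3:
        return "low"
    if fnn_dimension <= 7:
        return "medium"
    return "high"
```

With a delay time of 1, a series `[1, 2, 3, 4, 5]` embedded at `m = 3` gives the three vectors `[1, 2, 3]`, `[2, 3, 4]`, and `[3, 4, 5]`. Repeating the reconstruction for `m` from 1 to 15 gives the set of phase spaces on which the %FNN values are evaluated.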
Daily data from the Kangule station
| Data | Station | Optimum embedding dimension (delay time = 1) | %FNN (delay time = 1) |
|---|---|---|---|
| Pd | Kangule | 6 | 18.17 |
| | Birwadi | 5 | 16.63 |
| Ph | Kangule | 8 | 36.89 |
| | Birwadi | 8 | 35.82 |
| Qd | Kangule | 6 | 9.73 |
| | Birwadi | 5 | 8.80 |
| Qh | Kangule | 9 | 2.74 |
| | Birwadi | 8 | 2.98 |
| Tx | Kangule | 5 | 10.60 |
| | Birwadi | 6 | 6.16 |
| Tn | Kangule | 7 | 6.50 |
| | Birwadi | 7 | 22.61 |
| EVP | Kangule | 5 | 8.89 |
| | Birwadi | 6 | 21.48 |
| RH | Kangule | 6 | 19.66 |
| | Birwadi | 6 | 11.80 |
| SD | Kangule | 6 | 28.50 |
| | Birwadi | 7 | 22.99 |
| WD | Kangule | 5 | 7.13 |
| | Birwadi | 4 | 6.82 |
Pd, daily rainfall (mm/day); Ph, hourly rainfall (mm/h); Qd, daily runoff (m3/s); Qh, hourly runoff (m3/s); Tx, maximum temperature (°C); Tn, minimum temperature (°C); EVP, pan evaporation (mm/day); RH, relative humidity (%); SD, sunshine duration (h); WD, wind speed (km/h).
Daily data from the Birwadi station
The phase space diagrams and FNN results for the eight hydrometeorological time series from the Kangule and Birwadi stations indicate that: (1) all eight series exhibit a medium level of complexity; (2) wind speed exhibits the lowest level of complexity; (3) minimum temperature and sunshine duration exhibit the highest level of complexity; (4) there is not much difference in the level of complexity of the respective variables between Kangule and Birwadi; and (5) the eight time series from each station contain a certain amount of noise. The FNN results also indicate that the runoff and wind speed series have a lower %FNN value (at the optimum embedding dimension) and a lower amount of noise (i.e., a smaller increase in the %FNN value at $m$ greater than $m_{opt}$) when compared with the other six variables. The noise in the data limits the accuracy of prediction, since the prediction error is always greater than the noise level (Schreiber & Kantz 1996; Sivakumar et al. 1999). Therefore, the lower amount of noise in runoff and wind speed may be an indication of their better predictability when compared with that of the other variables.
The above results are useful for further advancing modelling and prediction of the hydrometeorological variables in the Savitri River basin. For instance, a variable that exhibits a greater level of complexity generally requires a more complex model (and more data) when compared with a variable that is less complex. This means, for example, that a more complex model is required for the minimum temperature time series in the Savitri River basin than for the wind speed series in the same basin. It is also important to note that the level of complexity of hydrometeorological variables can, and indeed often does, change with the temporal scale (Stehlik 1999; Sivakumar 2001; Salas et al. 2005). In view of this, an attempt is made in this study to analyse also the hourly data from the Kangule and Birwadi stations. Since hourly data are available only for rainfall and runoff, the analysis is limited to these two variables. The details of the results for the hourly rainfall and runoff data are presented next. A comparison of these results with those for the daily rainfall and runoff data is also made. For the purpose of illustration, results for only one station (Birwadi) are presented for each variable.
Temporal scale and complexity: analysis of hourly rainfall and runoff data
Figures 8(d) and 8(e) show the phase space diagrams for the daily and hourly runoff series, respectively, at the Birwadi station. The FNN dimensions for the daily and hourly runoff series are found to be 5 and 8, respectively, clearly indicating the more complex nature of the hourly runoff series. There are also differences in the relationship between %FNN values and embedding dimensions for the two series. For both series, the %FNN value decreases with an increase in the embedding dimension until it reaches the minimum value (at the optimum embedding dimension), as shown in Figure 8(f). The difference, however, lies in what happens after the %FNN value reaches the minimum. For the daily runoff series, the %FNN value shows a clear increase when the embedding dimension is increased further. For the hourly runoff series, however, the %FNN value almost saturates (insignificant change) when the embedding dimension is increased further. These observations seem to indicate that the noise level in the daily runoff series is higher than that in the hourly runoff series. The results for the daily and hourly runoff series from the Kangule station lead to similar conclusions, i.e., hourly runoff is more complex (FNN dimension is 9) when compared with daily runoff (FNN dimension is 6) (Table 3).
The FNN results (and also the phase space diagrams) for the rainfall and runoff series from the Kangule and Birwadi stations reveal that both rainfall and runoff are more complex at the hourly scale when compared with that at the daily scale. The results indeed support our general intuition that aggregation of data in time (and space) leads to more smoothing of data and, thus, less complexity; i.e., finer-scale data have greater complexity and vice versa. However, the present results are contradictory to those reported by some previous chaos theory-based studies that examined hydrometeorological data at different temporal scales (e.g., Stehlik 1999; Sivakumar 2001; Salas et al. 2005). It is important to note, at this point, that most such previous chaos theory-based studies have used the correlation dimension method, which has certain limitations, especially in the presence of a large number of zeros in the time series, in addition to issues of data size and noise, as discussed by Tsonis et al. (1993), Wang & Gan (1998), and Sivakumar (2001). Nevertheless, one needs to be careful in interpreting the results, since the complexity may not always increase (decrease) when the temporal scale gets finer (coarser). Indeed, there may be a particular ‘intermediate’ scale where the level of complexity may be the highest (or lowest), depending upon the underlying dynamics at play. The study by Sivakumar et al. (2001) addressed this issue from a different perspective (i.e., scaling and disaggregation), by examining the weights of distribution of data between two different temporal scales, rather than examining the original data themselves. In view of these, the issue of complexity of hydrometeorological data with respect to temporal scale needs further exploration.
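The aggregation step referred to above can be illustrated with a short sketch that collapses an hourly series into daily values, so that the same dimensionality analysis can be repeated at both scales. This is a hypothetical illustration only: the text does not state that the daily series used in this study were derived from the hourly records, and the function name `aggregate_hourly` and the choice of a daily sum (suitable for rainfall depths) or a daily mean (suitable for runoff rates) are assumptions.

```python
import numpy as np

def aggregate_hourly(x_hourly, how="sum"):
    """Aggregate an hourly series into daily values over complete 24-h blocks."""
    x = np.asarray(x_hourly, dtype=float)
    n_days = x.size // 24                        # incomplete trailing day is dropped
    blocks = x[: n_days * 24].reshape(n_days, 24)
    if how == "sum":
        return blocks.sum(axis=1)                # e.g. rainfall depth per day
    if how == "mean":
        return blocks.mean(axis=1)               # e.g. mean daily runoff rate
    raise ValueError("how must be 'sum' or 'mean'")
```

Each daily value combines 24 hourly values, which is the smoothing effect that the discussion above associates with lower complexity at the coarser scale.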
CONCLUSIONS
Understanding the level of complexity of hydrometeorological variables is vital for a wide range of studies related to modelling and prediction. Chaos theory-based methods have gained significant attention in studying the complexity of hydrometeorological variables. However, most such studies have examined only a single hydrometeorological variable (from a single or multiple river basins) and, thus, are not adequate for a more complete understanding of system dynamics and development of suitable models. This study examined the complexity of multiple hydrometeorological variables in the Savitri River basin in India. The FNN method, a chaos theory-based dimensionality method, was applied to daily rainfall, runoff, maximum temperature, minimum temperature, pan evaporation, relative humidity, sunshine duration, and wind speed observed during 2000–2010 at two stations (Kangule and Birwadi) in the basin.
The results indicate that (1) all eight series at the daily scale exhibit dimensionality between 4 and 7, i.e., a medium level of complexity; (2) wind speed exhibits the lowest level of complexity, and minimum temperature (and also sunshine duration) exhibits the highest level of complexity; (3) there is not much difference in the level of complexity of the respective time series between Kangule and Birwadi; and (4) the hydrometeorological time series from each station contain a certain amount of noise. The FNN results also suggest a lower amount of noise in the runoff and wind speed series, which implies better predictability, since noise limits the prediction accuracy. In addition, examination of rainfall and runoff data at two different temporal scales (daily and hourly) at the two stations indicates that the hourly rainfall and runoff data exhibit a greater level of complexity (with at least two to three additional dominant influencing variables) when compared with the daily data. These results support our general intuition that data aggregation leads to less complexity, although one needs to exercise caution in such interpretations, since the complexity may not always increase (decrease) when the temporal scale gets finer (coarser). The analysis also reveals that the noise level in the daily runoff series is higher than that in the hourly runoff series, implying better predictability of runoff at the hourly scale than at the daily scale. However, one should exercise caution in this interpretation as well, since there may be other factors (including the level of complexity) that influence predictability.
Since dimensionality is a representation of the number of dominant governing variables influencing the system dynamics, the outcomes from the present study have important implications for selection of appropriate complexity of the models required for the eight hydrometeorological variables. However, identification of what these actual dominant governing variables are is a different problem and not that straightforward. We are currently performing research in this direction, especially by using multiple variables for reconstruction (i.e., state phase reconstruction) based on chaos theory concepts and estimating the FNN dimensions. The multivariable reconstruction is an extensive analysis, since the variables involved can be ordered in different ways for reconstruction; for instance, for studying the runoff dynamics, the order of the variables involved, including runoff, can be chosen based on correlation between runoff and the other variables. Therefore, depending upon the criterion chosen, there can be a large number of combinations for any embedding dimension, and the optimum embedding dimension should be identified based on such combinations. Such an exercise can be particularly useful for cross-verifying the results obtained using the single-variable phase space reconstruction procedure as well. For instance, if the optimum dimension from the single-variable phase space reconstruction and multivariable reconstruction match, then one can be more confident in making inferences about the actual dominant variables. Other chaos identification methods, such as the nonlinear local approximation approach, for both single-variable and multivariable cases, can also help further verify the complexity level and identification of the dominant variables, since, in general, a time series exhibiting a lower dimension is expected to yield more accurate predictions when compared with a time series that exhibits a higher dimension. 
We hope to report the outcomes of such investigations in the near future.
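As a rough illustration of the correlation-based ordering mentioned above, the sketch below ranks candidate variables by the absolute Pearson correlation with a target series (e.g., runoff) and stacks the target with the top-ranked candidates to form multivariable state vectors. The function names, the use of Pearson correlation, and the standardisation of each column are assumptions made for this example only; other ordering criteria would yield other combinations.

```python
import numpy as np


def order_by_correlation(target, candidates):
    """Candidate names sorted by |Pearson r| with the target, strongest first.

    `candidates` is a dict mapping a variable name to a series of the same
    length as `target` (hypothetical input layout).
    """
    strength = {
        name: abs(np.corrcoef(target, series)[0, 1])
        for name, series in candidates.items()
    }
    return sorted(strength, key=strength.get, reverse=True)


def multivariable_vectors(target, candidates, dim):
    """State vectors of dimension `dim`: target plus the dim - 1 best candidates."""
    names = order_by_correlation(target, candidates)[: dim - 1]
    columns = [np.asarray(target, dtype=float)]
    columns += [np.asarray(candidates[n], dtype=float) for n in names]
    data = np.column_stack(columns)
    # Assumption: standardise each column so that variables with different
    # units contribute comparably to neighbour distances.
    data = (data - data.mean(axis=0)) / data.std(axis=0)
    return names, data
```

The resulting vectors could then be passed to a nearest-neighbour test in place of single-variable delay vectors, repeating the exercise for each ordering or combination of interest.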
Finally, it is appropriate to note that the estimation of dimensionality using the FNN method (or any other chaos theory-based method, for that matter) is not limited to any particular variable or time series. In the present study, the analysis was performed on eight hydrometeorological variables from two stations in the Savitri River basin because data were available for these eight variables. If data for other variables (e.g., soil moisture, subsurface) become available, then the analysis can be easily extended to those time series as well. Therefore, although the present study uses data from only two stations in the Savitri River basin (due to data availability), the outcomes have important and broad implications, not only for the Savitri River basin but also for other river basins, especially those that are small and experience frequent floods or flash floods. We would also like to highlight that the present study is arguably unique in examining many hydrometeorological variables for a small basin from the perspective of chaos theory. The results can serve as an important initial step for hydrological models of the study area and for more extensive studies on water planning and management. With the availability of data from a large number of stations, it would be possible to relate catchment size and land use/land cover to the dimensionality of the variables. Such an extensive study can be undertaken in the future for a better understanding of the system.
ACKNOWLEDGEMENTS
The authors are thankful to the Hydrology Project, Nasik for providing the hydrometeorological data in the Savitri basin. The authors also would like to thank the two reviewers and the editor for their constructive comments and useful suggestions on an earlier version of this manuscript.
AUTHOR CONTRIBUTIONS
N.E.S. contributed to the study conception, methodology, data collection, analysis, and writing of the original draft. V.J. supervised the study and contributed to the conceptualisation of the study, methodology, validation, and reviewing and editing of the manuscript. B.S. supervised the study and contributed to the methodology, validation, and reviewing and editing of the manuscript. All authors have read and approved the final manuscript.
FUNDING
No funding was received for conducting this study.
ETHICS APPROVAL
All authors have read, understood, and have complied as applicable with the statement on ‘Ethical responsibilities of Authors’ as found in the Instructions for Authors.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.