In this paper, using the wavelet technique we analysed rainfall behaviour in the country across different agro-climatic zones over a century. Findings indicate that at the national level there is no significant trend in rainfall in the long run, but there are pockets of change in the rainfall pattern. There was a significant increase in the rainfall in the arid zone, whereas in the humid, semi-arid tropics and semi-arid temperate zones the trend was downward but insignificant. The behaviour of rainfall was different during this period. Except in the arid zone, we find a similar trend in other zones – increasing initially, tapering off in the middle and then declining but with some difference in time intervals. In the arid zone, the behaviour of rainfall had been erratic. In the short run, the direction of change in trend remains the same as in the long run but the change is statistically significant.

## INTRODUCTION

Indian agriculture is rain-dependent, with almost two-thirds of the net cropped area being rain-fed. Therefore, a regular rainfall pattern is crucial for agricultural growth; too much or too little rain can have a devastating effect on agriculture. A regular pattern in rainfall may not be observed all the time. The series may behave erratically, with highs and lows. Recent evidence shows monsoon rainfall in India is becoming more erratic (Manton *et al.* 2001; Kumar *et al.* 2004; Goswami *et al.* 2006; Mall *et al.* 2006; Ghosh *et al.* 2010; Paul *et al.* 2011). Auffhammer *et al.* (2011) have shown that the Indian climate has become drier and warmer, adversely affecting the crop yields.

India is a large country, and regional patterns in rainfall and temperature are more important for agriculture. A few studies have analysed behaviour of rainfall at disaggregated spatial units, at the level of meteorological sub-division, state and district. Guhathakurta & Rajeevan (2006) found a sharp decrease in the number of rainy days and amount of rainfall in many sub-divisions of India. They reported a decline in rainfall in the months of June, July and September and an increase in August in a few sub-divisions. Goswami *et al.* (2006) studied rainfall behaviour over central India for the period 1951–2000 and observed a significant increase in heavy rainfall events and a decrease in moderate rainfall events. They also reported a rise in the frequency of extreme rainfall events such as droughts and floods. Krishnakumar *et al.* (2009) studied variations in monthly, seasonal and annual rainfall in the state of Kerala for the period 1871–2005, and found a significant decline in the south west monsoon (June–September) rains, especially in the months of June and July; and an upward trend in winter rainfall in the months of January, February and April. Soltani *et al.* (2012) computed rainfall and rainy day trends in Iran.

Some studies have also attempted to predict rainfall behaviour and estimation of trends using recently developed nonparametric techniques like wavelets. Kallache *et al.* (2005) applied discrete wavelet transform (DWT) analysis to river run-off data from gauges in Southern Germany to separate the variability of small and large time-scales, assuming that the trend component is part of the latter. Adamowski *et al.* (2009) proposed a method for trend detection and estimation based on continuous wavelet transform (CWT), which provides a very flexible and accurate tool for detecting and estimating complicated signals in hydro-climatologic series in Canada. Paul *et al.* (2011) have also used DWT for the estimation of trends in Indian monsoon rainfall and reported that that there is a significant decreasing trend in the same. Nalley *et al.* (2012) studied the DWT technique to analyse and detect trends in monthly, seasonally-based, and annual data from eight flow stations and seven meteorological stations in southern Ontario and Quebec during 1954–2008. Using wavelet and fuzzy logics to describe the frequency decomposition of annual rainfall and summer monsoon rainfall, Singh & Bhatla (2012) suggest that a combination of these two models provides a better forecast of rainfall behaviour. Ramana *et al*. (2013) used wavelet and artificial neural network (ANN) models for predicting monthly rainfall and found wavelet neural network models more effective in predicting rainfall than ANN models alone. Mahajan & Mazumdar (2013) applied non-linear regenerative predicting codes for rainfall forecasts, and predicted more than 1 year's rainfall trend with 90% or more accuracy. Paul *et al.* (2013) applied maximal overlap DWT (MODWT) for forecasting Indian monsoon rainfall. Adarsh & Reddy (2015) carried out trend analysis based on DWT on the post-monsoon rainfall time series of the Kerala subdivision, and the results show that there is a dominance of short-term periodicity of less than a decade in the subdivision.

Most of these studies have analysed rainfall behaviour either at national level or at state level. Estimates at these levels are too aggregative to provide regional variation in climate. India, being a large country, has considerable heterogeneity in agro-climatic conditions even within a state or sub-division, while an analysis of climate variables at a more disaggregated level, say the district, is desirable for a better understanding of their location-specific behaviour. In this paper, we have examined rainfall behaviour at the level of an agro-climatic zone by clustering the districts having similar agro-climatic conditions into homogeneous zones. The study of rainfall trends at the zone level may provide inputs to policymakers in designing regionally differentiated adaptation and mitigation strategies to cope with climate change. Moreover, in almost none of the previous studies, has MODWT been explored for estimation of trends in climate variables. In the present study, MODWT and DWT have been applied. The paper is organized as follows: the next section deals with the methodology, followed by the results and discussion section, and ending with the conclusions in the final section.

## METHODOLOGY

*et al.*(2005). Rao

*et al.*(2005) have grouped districts into different agro-climatic zones based on the average annual temperature and length of growing period (LGP). The districts with an LGP range from 70 to 180 days, and with a mean monthly temperature (all months) of greater than 18 °C and a daily mean temperature greater than 20 °C during the growing period, are classified as semi-arid tropics. For the districts in the semi-arid temperate zone the LGP ranges from 70 to 180 days, and the daily mean temperature during the growing period ranges from 5–20 °C. One or more months in these districts have a monthly mean temperature below 5 °C. The districts in the arid zone have an LGP of fewer than 75 days, and those in the humid zone have an LGP of more than 270 days. These zones are arid, semi-arid tropics, semi-arid temperate and humid (Figure 1). Semi-arid tropics comprise the largest zone, and the arid zone has the smallest area.

To identify the trend in rainfall we have used two nonparametric methods, viz. Mann-Kendall (M-K) and MODWT. The advantage of nonparametric methods is that these are based on fewer assumptions underlying the rainfall distribution and are not very sensitive to abrupt breaks, which are a common feature of a rainfall series. We have also used a parametric method, that is, the deterministic linear trend. A brief description of these is given below.

### Deterministic linear trend

*Y*is rainfall at time

_{t}*t, β*'s are the unknown parameters and is the error term that follows independent, identically distributed (IID) normal distribution, i.e. ∼ Normal (0, ). The advantage of linear regression is that it provides an estimate of slope, confidence interval and goodness of fit. However, it cannot handle missing data and may be greatly affected by outliers.

### M-K test

_{0}) for this test is that there is no trend in the series. Following Jayawardene

*et al.*(2005), the first step in this test is to replace observations (

*x*) by their ranks (

_{i}*k*), and then calculate

_{i}*N*as the number of

_{i}*k*preceding it such that

_{j}*k*>

_{j}*k*. The

_{i}*t*-statistic for the test is computed as:Then, we define a range,

*r*, for a desired probability point,

_{m}*r*, as:If

_{g}*t*lies outside the range , then there exists a significant trend in the series. Further, to find the direction of the trend, the

_{m}*S*-statistic is used.where

*sgn*() is the signum function. A signum function extracts the sign of a real number. If

*S*is greater or lower than zero, the trend is increasing or decreasing respectively. For large values of

*n*,

*S*is approximately normally distributed with mean zero, i.e.

*E*(

*S*) = 0 and variance:where

*t*is the number of ties among observations having the same rank.

_{i}*k*) are arranged in increasing order, and then

_{i}*d*=

_{i}*k*–1 is calculated. The test statistic is given by:The significance of can be tested by calculating

_{i}*t*, where

_{S}*t*is:The trend is absent if

_{S}*t*lies within the desired limits.

_{s}### DWT

Wavelets are fundamental building block functions, analogous to the trigonometric sine and cosine functions. As with a sine or cosine wave, a wavelet function oscillates around zero. This property makes the function a *wave*. A description of wavelets can be found in Daubechies (1992), Ogden (1997) and Percival & Walden (2000). A completely different use of wavelets in statistical time series analysis is described in Rioul & Vetterli (1991) and Flandrin (1998). Wavelets, being time-scale representation methods, deliver a tool complementary to both classical and localized Fourier analyses. The search for localized time-scale representations of stochastic correlated signals leads to both analysing and synthesizing (i.e. modelling) mainly non-stationary processes with wavelets or wavelet-like bases (Nason & von Sachs 1999).

There are two main waves of wavelets; the first is known as a CWT designed to work with time-series defined over the entire real axis; another is the DWT which deals with series defined essentially over a range of integers. The DWT of a time-series is used to capture high and low frequency components. This enables modelling of time-series data through computation of the inverse DWT. Computation of the DWT is carried out by the ‘pyramid algorithm’, as described in Percival & Walden (2000). A brief description of this protocol can be written as follows.

**X**be an

*N*1 vector of time series observations with

*N*= 2

*;*

^{J}**P**be an

*N*

*N*real valued matrix defining DWT and satisfying orthonormality property, i.e.

**P**′

**P**=

**I**

*, where*

_{N}**I**

*is the*

_{N}*N*

*N*identity matrix. Then the DWT (

**) of the time-series vector**

*W***X**is computed by

**W**

*=*

**P X**. Now the elements of the vector

**W**are decomposed into

*J*+ 1 sub-vectors. The first

*J*sub-vectors contain all of the DWT coefficients for scale . Then

**W**can be formed by concatenating all the

*J*+ 1 sub-vectors as follows:

### MODWT

The MODWT is a linear filtering operation that transforms a series into coefficients related to variations over a set of scales. It is similar to the DWT in that both are linear filtering operations producing a set of time-dependent wavelet and scaling coefficients. Both have basis vectors associated with a location *t* and a unit-less scale *τ _{j}* = 2

*− 1 for each decomposition level*

^{j}*j*= 1, …,

*J*

_{0}. The MODWT retains down-sampled values at each level of decomposition that would otherwise be discarded by the DWT. The MODWT is well defined for all sample sizes

*N*, whereas for a complete decomposition of

*J*levels, the DWT requires

*N*to be an integer multiple of 2

*. In the MODWT, the output signal is never sub-sampled (not decimated). Instead, the filters are upsampled on each level. The redundancy of the MODWT facilitates alignment of the decomposed wavelet and scaling coefficients at each level with the original time-series, thus enabling a ready comparison between the series and its decomposition. Finally, the redundancy of the MODWT wavelet coefficients modestly increases the effective degrees of freedom on each scale, and thus decreases the variance of certain wavelet-based statistical estimates.*

^{J}MODWT is a highly redundant non-orthogonal transform. For a redundant transform like the MODWT, an *N* samples input time-series will have an *N* samples resolution scale for each resolution level. Therefore, the features of wavelet coefficients in a multi-resolution analysis (MRA) will be lined up with the original time-series in a meaningful way.

*X*with arbitrary sample size

*N*, the

*j*th level MODWT wavelet () and scaling () coefficients are defined as:where are the

*j*th level MODWT wavelet filters, and are the

*j*th level MODWT scaling filters.

*L*is the width of the

_{j}*j*th level equivalent wavelet and scaling filter. A wavelet filter must satisfy the following three basic properties:The first stage scaling coefficients are obtained as: .

### Estimation of trend by wavelets

*X*}:where

_{t}*μ*is a constant term,

*T*is an unknown deterministic polynomial trend function of order

_{t}*r*,

*Z*is a residual term which is a long-memory process defined by

_{t}*Z*= , where, is the long memory parameter, {} is a Gaussian white noise process with mean zero and

_{t}*>*

*0*. Here,

*B*, is the back shift operator such that

*BZ*=

_{t}*Z*.

_{t−1}**W**can be written as the sum of two vectors:

**W**=

**W**+

_{w}**W**

_{s}*,*where

**W**is an

_{w}*N*×

*1*vector containing the wavelet coefficients and zeros at all other locations, and

**W**is an

_{s}*N*×

*1*vector containing the scaling coefficients and zeros at all other locations. Since

**X**=

**P**′

**W**, therefore:Here, is an estimator of the polynomial trend

*T*at level

*J*, while is the estimate of residual

*Z*. The issue of choosing the level of the estimate depends on the goal of the application.

*J*should be chosen to be small for detecting the local trends and cycles. In other applications,

*J*is set to be large, if the objective is to detect the global trend.

**P**implies that the DWT is an energy preserving transform so that:Given the structure of the wavelet coefficients, the energy in

**X**is decomposed, on a scale by scale basis, via:so that represents the contribution to the energy of {

*X*} due to changes at scale , whereas represents the contribution due to variations at scale . So the estimated variance of the time-series in terms of wavelet and scaling coefficients can be expressed as:where is the estimated variance of the wavelet coefficients at scale , and is the estimated variance of the trend.

_{t}*H*

_{0}:

*Trend*=

*0*, against the alternative hypothesis

*H*

_{1}:

*Trend*≠ 0, Almasri

*et al.*(2008) proposed a test statistic as:

The test statistic (*N* – *N/*2* ^{J}*)/(

*N/*2

*– 1)*

^{J}*G*follows

*F*distributed with

*(N/*2

*– 1*

^{J}*)*and

*(N*–

*N/*2

*degrees of freedom (under the normality assumption of the scaling coefficients).*

^{J})## RESULTS AND DISCUSSION

A preliminary investigation into rainfall behaviour across agro-climatic zones shows stable mean and variance, indicating that the series is stationary. The series is also not autoregressive as indicated by the values of autocorrelation and partial autocorrelation function at various lag orders (results are not reported here due to lack of space, but are available on request). The coefficients of the linear regression model and Theil–Sen slope estimates are given in the last two columns of Table 1, respectively. At the all India level, the trend in rainfall is positive but insignificant. It is positive and significant only in the arid zone. In other zones, the trend is negative and not significant. From these we find that over the years rainfall is becoming less, except in the arid zone where rainfall increased by 0.64 mm per year. The M-K test and Spearman's rank test confirm these observations except for the humid zone where one of the tests is significant. Here, the direction of change in rainfall in M-K and Spearman's test is opposite to that provided by linear regression. One possible reason could be the presence of some extreme events, i.e. very high or very low rainfall in some years.

. | M-K test . | Spearman's rank test . | . | Linear regression . | . | ||
---|---|---|---|---|---|---|---|

Zones . | S . | t_{m}
. | r_{m}
. | t_{s}
. | Direction . | b . | Theil–Sen slope . |

Humid | −527 | 0.1023 | 0.1316 | −1.6641 | Decreasing | −0.6272 | −0.553 |

Semi-arid temperate | 109 | −0.0211 | 0.1316 | 0.1841 | Increasing | −0.0351 | −0.161 |

Semi-arid tropic | −125 | 0.0242 | 0.1316 | −0.5410 | Decreasing | −0.2104 | −0.180 |

Arid | 805 | −0.1562^{a} | 0.1316 | 2.2670 ^{a} | Increasing | 0.6418 ^{a} | 0.597^{a} |

All India | 437 | −0.0813 | 0.1316 | 0.8971 | Increasing | 0.4690 | 0.113 |

. | M-K test . | Spearman's rank test . | . | Linear regression . | . | ||
---|---|---|---|---|---|---|---|

Zones . | S . | t_{m}
. | r_{m}
. | t_{s}
. | Direction . | b . | Theil–Sen slope . |

Humid | −527 | 0.1023 | 0.1316 | −1.6641 | Decreasing | −0.6272 | −0.553 |

Semi-arid temperate | 109 | −0.0211 | 0.1316 | 0.1841 | Increasing | −0.0351 | −0.161 |

Semi-arid tropic | −125 | 0.0242 | 0.1316 | −0.5410 | Decreasing | −0.2104 | −0.180 |

Arid | 805 | −0.1562^{a} | 0.1316 | 2.2670 ^{a} | Increasing | 0.6418 ^{a} | 0.597^{a} |

All India | 437 | −0.0813 | 0.1316 | 0.8971 | Increasing | 0.4690 | 0.113 |

^{a}Denotes significant trend at 5% level of significance.

For examining the local and global fluctuations in rainfall series, we computed MODWT coefficients along with the MRA for all the zones. The MODWT can accommodate any sample size *N*. In order to preclude decomposition at scales longer than the total length of the time-series, we select the highest scale (*J _{0}*) such that . In particular, for alignment of wavelet coefficients with the original series, condition (i.e. the width of the equivalent filter at the th level is less than the sample size) should be satisfied to prevent multiple wrappings of the time-series at level

*J*

_{0}. In our case, we found as 6. The choice of wavelet filter has a great significance on feature extraction of a time series. The decomposition would vary significantly, particularly for high frequency components with the choice of a different filter. A variety of ‘mother wavelet functions' is available that can be used for DWT analysis (e.g. the Haar wavelet, Dsaubechies wavelet). The mother wavelet function should reflect the type of features present in the time series. For time series with ‘steps’, one would choose a boxcar-like mother wavelet function such as the Haar wavelet, while for more smoothly varying time series one would choose a smoother mother wavelet function such as the Dsaubechies wavelet. In the present paper, both the filters have been used. Daubechies wavelets optimally capture the polynomial trends, while the Haar wavelet is discontinuous and resembles a step function. It is to be noted that for trend detection only, a low frequency component is used.

*X*denotes the original time-series;

*W*to

_{1}*W*denote wavelet details components, and

_{6}*V*denotes the smoothed component of MODWT. A perusal of Figure 2(a) indicates that localized variation in the data is detected at a lower scale, whereas global variation is detected at a higher scale. At the lower scales the series are much more irregular compared to the series in the upper scale. In the lower scale we can detect trends for small time intervals, whereas in the larger scale we can visualize trends for the whole time period. The V

_{6}_{6}series is considered to be the proxy for trend.

Coefficients plotted at the top of Figure 2(a) are low-frequency components, and at the bottom are high-frequency components. The wavelet coefficients do not remain constant over time, and reflect changes in the data series at various time points. The locations of abrupt jumps can be spotted by looking at vertical clustering of the relatively larger coefficients. To obtain a visual idea regarding this, the MODWT coefficients at 3–5 levels, i.e. W_{3}, W_{4} and W_{5}, have been plotted in Figure 2(b). A perusal of Figure 2(b) represents that whenever there are abrupt changes in rainfall, it is depicted by clusters of long spikes on either side of the X-axis.

The behaviour of rainfall in the arid zone is in contrast with that in others. In this zone, the overall trend in rainfall is increasing, but the intervals of change in direction are greater, indicating more erratic rainfall behaviour. The intensity of rainfall decreases during 1901–1916, remains almost constant until 1931, increases until 1976 and then starts decreasing. At the all India level, rainfall has increased from 1950 onwards (Figure 3(e)).

*et al.*(2012). A global increase in extreme rainfall events is a consequence of the emission of anthropogenic greenhouse gases causing atmospheric and oceanic warming (Ghosh

*et al.*2012). However, the spatial heterogeneity in rainfall as observed here could also be due to local influences such as topography, deforestation, non-uniform population growth and urbanization.

The trend analysis studies by Guhathakurta & Rajeevan (2006), Goswami *et al.* (2006) and Krishnakumar *et al.* (2009) indicate that trends significantly depend upon the period and the locations. To see this, we analysed rainfall behaviour for the period 1939–2002, applying the same methods as for the entire period but instead of MODWT we applied DWT (the DWT requires that the number of observations (N) should be sufficient such that N = 2^^{J}, J takes a positive integer). Also, we used two wavelet filters, viz. Haar and Daubechies (D4).

The results are reported in Table 2. With application of the wavelet method, the direction of trend in rainfall remains as in the case of the entire period, but the trend coefficients are now significant. These results indicate that there has been a significant decline in rainfall in the past six decades or so (1939–2002) in all the zones, except in the arid zone and at the all India level, where it has increased. The wavelet analysis could determine a significant trend globally as well as locally that could not be captured through other methods.

. | M-K test . | Linear regression . | . | Test statistic based on wavelets . | |||
---|---|---|---|---|---|---|---|

Zones . | S . | Tm . | Rm . | B . | Theil–Sen slope . | D4 . | Haar . |

Humid | −418 | 0.207^{a} | 0.167 | −1.873^{a} | −1.855^{a} | 17.859^{a} | 7.122^{a} |

Semi-arid temperate | −306 | 0.151 | 0.167 | −1.861^{a} | −1.732^{a} | 15.876^{a} | 24.191^{a} |

Semi-arid tropic | −378 | 0.187^{a} | 0.167 | −1.999^{a} | −1.965^{a} | 13.573^{a} | 9.241^{a} |

Arid | 208 | −0.103 | 0.167 | 0.636 | 0.722 | 9.608^{a} | 12.812^{a}* |

All India | −927 | 0.308^{a} | 0.167 | −5.268^{a} | −4.842^{a} | 19.645^{a} | 25.470^{a} |

. | M-K test . | Linear regression . | . | Test statistic based on wavelets . | |||
---|---|---|---|---|---|---|---|

Zones . | S . | Tm . | Rm . | B . | Theil–Sen slope . | D4 . | Haar . |

Humid | −418 | 0.207^{a} | 0.167 | −1.873^{a} | −1.855^{a} | 17.859^{a} | 7.122^{a} |

Semi-arid temperate | −306 | 0.151 | 0.167 | −1.861^{a} | −1.732^{a} | 15.876^{a} | 24.191^{a} |

Semi-arid tropic | −378 | 0.187^{a} | 0.167 | −1.999^{a} | −1.965^{a} | 13.573^{a} | 9.241^{a} |

Arid | 208 | −0.103 | 0.167 | 0.636 | 0.722 | 9.608^{a} | 12.812^{a}* |

All India | −927 | 0.308^{a} | 0.167 | −5.268^{a} | −4.842^{a} | 19.645^{a} | 25.470^{a} |

^{a}Denotes statistically significant trend at 5% level.

*Note:* Statistics for testing trend using D4 and Haar wavelets were estimated using (*N*–*N/*2* ^{J}*)/(

*N/*2

*–1)G described above under the section ‘Estimation of trend by wavelets’.*

^{J}## CONCLUSIONS

Agriculture in India is highly sensitive to rainfall, therefore any change in rainfall will influence crop yields. An understanding of the spatial and temporal distribution and changing patterns in rainfall is important for planning and management of water resources. In this paper we examined behaviour of rainfall in different agro-climatic zones over a period of more than 100 years using wavelet techniques. The trends were tested for two periods, a long-term period (1901–2002) and for a shorter period (1939–2002). Findings indicate that at the national level there is no significant trend in rainfall in the long run, but there are pockets of change in the rainfall pattern. There was a significant increase in the rainfall in the arid zone, whereas in the humid, semi-arid tropics and semi-arid temperate zones the trend was downward but insignificant. In the short run, the direction of change in trend remains the same as in the long run, but the change is statistically significant. Since rainfall has been decreasing in three of the four agro-climatic zones, a possible sign of temporal change in Indian rainfall over the past century may be apparent.

## ACKNOWLEDGEMENTS

This paper has been drawn from the project ‘Enhancing resilience of agriculture to climate change’, funded by the Indian Council of Agricultural Research under a mega project ‘National Initiative on Climate Resilient Agriculture’. The authors are grateful to the anonymous reviewers for providing useful comments which helped improve the paper.