ABSTRACT
Climate change has made rainfall patterns more uneven and unpredictable. Recent advances in machine learning and deep learning make it possible to handle the complex, nonlinear nature of weather input parameters, leading to more reliable predictions. In this study, trend detection and rainfall prediction using logistic machine learning and deep learning models were carried out for the Bhopal region of central India. The Mann–Kendall test, Sen's slope estimator, and the Pettitt test were applied to detect trends, estimate slopes, and identify change points in weather parameters. In the performance assessment, the deep learning model achieved a higher F1 score for classification (0.93) than the logistic model (0.56), indicating a greater capability to capture variations in rainfall. The sensitivity of the deep learning model was studied using gradients of the loss function (mean-square error) with respect to the input variables. The gradient-based sensitivity measure revealed that rainfall was highly sensitive to RHmin, BSS, and RHmax. Deep learning-based rainfall prediction may support real-time decisions such as irrigation scheduling, and thereby water resource management.
HIGHLIGHTS
Rainfall prediction was conducted using logistic machine learning and deep learning models.
Trend analysis used the Mann–Kendall, Sen's slope, and Pettitt tests.
A significant increasing trend was found in maximum temperature, while minimum temperature showed no significant trend.
Rainfall was significantly influenced by month, humidity, temperature, and wind velocity.
The deep learning model outperformed the logistic model.
INTRODUCTION
Climate change is a complex and far-reaching global issue that affects nations worldwide, including India (IPCC 2021). This evolving scenario is characterized by several notable global trends, such as rising temperatures, shifting rainfall patterns, and an increasing frequency of extreme weather events (Pachauri et al. 2014; Gaddikeri et al. 2024). Over the past century, global temperatures have steadily increased, with the Earth's average temperature rising by roughly 1.2 °C (2.2 °F) since the late 1800s (Hansen et al. 2012). Recent decades have witnessed an accelerated warming trend, with the most recent years ranking among the warmest on record (IPCC 2021). The primary cause of this temperature increase is anthropogenic climate change (Gogoi et al. 2019), driven by activities such as the burning of fossil fuels, which raise the Earth's surface temperature. This warming trend has led to an increase in extreme weather events, including storms, droughts, floods, and wildfires (IPCC 2021), causing significant damage to ecosystems, contributing to biodiversity loss, and impacting economies. Furthermore, the melting of ice caps and glaciers, coupled with climate-induced sea-level rise, poses a severe threat to coastal areas, aggravating coastal flooding and erosion and potentially causing lasting damage to both the environment and human societies (Pachauri et al. 2014).
Climate change has disrupted rainfall patterns, leading to prolonged droughts in some regions and heavy rainfall in others (IPCC 2021; Trenberth 2011). These changes have significant implications for agricultural productivity because rainfall plays a pivotal role in crop cultivation practices. In this context, rainfall is the most important factor because it has the potential to intensify the impact of hydrometeorological disasters (Irwandi et al. 2023). Prediction of rainfall provides valuable awareness to individuals, enabling them to anticipate rainfall in advance and take necessary precautions to safeguard their crops from potential damage (Basha et al. 2020). Moreover, globally, agriculture itself is responsible for a substantial portion (30–40%) of all greenhouse gas emissions (Vermeulen et al. 2012; Abbass et al. 2022), along with other major contributors, including the burning of fossil fuels, deforestation, land-use changes, and industrial processes (IPCC 2014).
The Paris Agreement was adopted in 2015, marking a significant global effort to combat the effects of climate change. The primary objective of the agreement is to limit global warming to well below 2 °C above preindustrial levels, with an even stronger goal of restricting it to 1.5 °C (Rogelj et al. 2016; IPCC 2021). However, recent projections suggest a potential global temperature increase ranging from 1.4 to 5.8 °C by 2100 (Balasubramanian & Birundha 2012; Collins et al. 2013). A climate change assessment report for the Hindu Kush Himalaya region indicates that even if global warming is limited to 1.5 °C, the Himalayan region could still experience a temperature increase of approximately 2 °C by 2100 (Wester et al. 2019; Karki et al. 2022). These trends highlight the urgent need for international cooperation and individual action to mitigate the impacts of climate change, preserve the environment, and secure a sustainable future (IPCC 2018; Rogelj et al. 2018).
India, a country with a substantial population and extensive low-lying coastal areas, is one of the world's most vulnerable nations to the consequences of climate change (Revi et al. 2014; IPCC 2021). Agriculture is a vital pillar of India's economy, and climate-induced changes have profound implications for crop production and food security. India faces significant challenges due to climate change, which necessitates proactive measures to mitigate its impacts on agriculture, water resources, and vulnerable communities. Adaptation strategies and sustainable practices are crucial in this regard. India has experienced increasing weather unpredictability, including temperatures surpassing 50 °C in certain regions and an erratic monsoon season with respect to its timing and rainfall amount, as highlighted by Chateau et al. (2023) and Rohini et al. (2016). The fluctuations in monsoon rainfall have a significant impact on agriculture and the nation's economy, and effective water resource management depends on them, as emphasized by Chakraborty et al. (2023) and Revadekar & Preethi (2012). In addition, studies under various greenhouse gas scenarios show a general decline in runoff availability in different river basins, as detailed by Balasubramanian & Birundha (2012) and Pervez & Henebry (2015). The Indian monsoon, a crucial weather system for agriculture and water resources, is becoming increasingly irregular, which is a matter of great concern (Krishnan et al. 2019). Climate change has interfered with monsoon patterns, resulting in prolonged dry spells interspersed with intense rainfall events, leading to an increase in extreme weather phenomena (Goswami et al. 2006). The notable shifts in India's summer monsoon rainfall from 1951 to 2015 highlighted this vulnerability. During this time, there was a 6% reduction in summer monsoon rainfall, with the Indo-Gangetic Plains and Western Ghats experiencing the most significant declines (Krishnan et al. 2020). Furthermore, the frequency of daily rainfall extremes in central India, characterized by rainfall intensities surpassing 150 mm per day, increased by 75% from 1950 to 2015 (Roxy et al. 2017; Mukherjee et al. 2018). The escalating frequency of droughts in India has been attributed to the overall decrease in seasonal summer monsoon rainfall over the past 6–7 decades. Regions such as central India, the southwest coast, the southern peninsula, and northeastern India have experienced an average of more than two droughts per decade, each with unique consequences due to variations in agricultural geography. Additionally, the area affected by drought has seen a 1.3% increase per decade during the same period (Krishnan et al. 2020).
Therefore, India is facing the urgent need to adapt to climate change, ensure agricultural sustainability, and maintain a resilient food supply for its population (Gupta & Pathak 2016). Studies have examined the impacts of climate change on specific regions in India. For instance, Singh et al. (2015) studied urban flooding caused by heavy rainfall events in Bhopal, finding a decreasing trend in seasonal heavy rainfall events (≥65 mm). In another study, Duhan et al. (2013) examined temperature variations from 1901 to 2002 in Madhya Pradesh, central India, and found that annual mean, maximum, and minimum temperatures increased over the past century, with more significant warming in winter than in summer. Kundu et al. (2015) analyzed monthly rainfall data from 1901 to 2011 for 45 stations in Madhya Pradesh and identified a significant decrease in annual and monsoon rainfall, with an overall decrease of 6.75% in annual rainfall over the 111-year period, which is consistent with the decline in monsoon rainfall reported above.
Climate change impacts vary regionally and are influenced by a wide range of atmospheric factors, leading to uncertainties in rainfall projections (IPCC 2021). Measuring the effects of climate change on temperature and rainfall aims to predict the amount, spatial distribution, and variability of rainfall (Jain & Kumar 2012; Chauhan et al. 2022). Climate change studies continue to advance with the incorporation of several modeling strategies, including machine learning approaches. Machine learning offers the ability to analyze vast datasets rapidly, providing insights into complex climate patterns and trends, which aids more accurate predictions and proactive mitigation strategies. Rainfall patterns are influenced by complex, nonlinear interactions among meteorological variables such as temperature, humidity, pressure, and wind speed, and traditional statistical models often struggle to capture these interactions effectively. Deep learning models excel in identifying hidden patterns and relationships within large, multivariate datasets, offering superior predictive capabilities. Their application to rainfall prediction represents a substantial advance in meteorological modeling because of their capacity to handle intricate, nonlinear relationships within high-dimensional datasets. Weesakul et al. (2018) developed a deep learning neural network (DNN) for forecasting monthly rainfall for optimum reservoir operation and water resources management in the eastern region of Thailand. Adewoyin et al. (2021) used a deep learning approach to improve the prediction of high-resolution precipitation from coarse-resolution simulations of other weather variables. Praveena et al. (2023) applied logistic regression to predict rainfall and reported better accuracy for a support vector machine than for logistic regression.
However, to study the effects of climate change, it is necessary to analyze prevailing trends in climate conditions using long-term meteorological data. Detailed studies using advanced climatic models with machine learning approaches are required to assess the specific effects of climate change on temperature and rainfall. Regional studies can help identify relevant trends to develop strategies for adaptation and mitigation, minimizing the potential negative impacts of climate change.
The objective of this study was to analyze trends in weather parameters and to predict rainfall using logistic and deep learning models, in order to better understand the complex dynamics of climate change, in particular the rise in the Earth's average temperature and increasingly erratic rainfall.
MATERIALS AND METHODS
This study utilized primary weather data to examine the impact of climate change in the study region. Weather data from 1983 to 2023 were collected from the Agro-meteorological Observatory at the ICAR-Central Institute of Agricultural Engineering (CIAE), Bhopal (India). Nonparametric trend analysis techniques were employed to estimate the trends in the weather parameters and their magnitudes. Furthermore, machine learning algorithms were applied to capture variations in rainfall patterns. The weather parameters investigated in this study comprised time-series data on rainfall, relative humidity, temperature, bright sunshine hours, evaporation, and wind velocity.
Site information
Data structure and statistical tools and techniques
The primary data on weather parameters, including maximum temperature (Max T), minimum temperature (Min T), maximum relative humidity (RHmax), minimum relative humidity (RHmin), bright sunshine hours (BSS), evaporation (E Pan), and wind velocity (Wind V), were utilized on a date-, month-, and year-wise basis. Trend analysis for the minimum and maximum temperatures was conducted using the Mann–Kendall test (Mann 1945; Kendall 1975; Gilbert 1987). The Pettitt test (Pettitt 1979) was employed to detect change points in the minimum and maximum temperatures, while the magnitude of change in the minimum and maximum temperatures (slope) was estimated using Sen's slope estimator (Sen 1968).
Mann–Kendall test
The Mann–Kendall test is a nonparametric statistical test used to identify monotonic trends in a time-series dataset. The test statistic (S) is calculated from the signs of the differences between all pairs of data points. Yan et al. (2022) applied the Mann–Kendall test to detect trends in extreme precipitation indices. Because the test is distribution free and little affected by outliers, it is the most widely used method for trend detection in climatic parameters.
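The calculation described above can be sketched in Python as follows. This is a minimal illustration that assumes the standard formulation of the test (sign-based statistic S, tie-corrected variance, and a continuity-corrected normal approximation); the function name `mann_kendall` and the example series are hypothetical, and the study itself performed this analysis with R packages.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, Var(S), Z, two-sided p value) for a sequence of numbers."""
    n = len(series)

    # S sums the signs of all pairwise differences x_j - x_i with j > i.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S with the usual correction for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p value under the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, var_s, z, p_value


# Illustrative series only; these are not data from the study.
s, var_s, z, p = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(s, var_s, round(z, 3), p < 0.05)
```

A trend is declared significant when the p value falls below the chosen significance level, for example 0.05.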
Sen's slope estimator
The confidence interval for Sen's slope is bounded by $X_{(N-k)/2}$ and $X_{(N+k)/2+1}$, where $k = se \times z$, $N$ is the total number of pairs [$n(n-1)/2$], and $X$ represents the slope for a given pair, with the slopes ranked in ascending order.
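The point estimate around which this interval is built is the median of the pairwise slopes. A minimal Python sketch is given below, assuming equally spaced observations so that the time difference between two points is the difference of their positions; the function name `sens_slope` and the example series are hypothetical.

```python
from statistics import median


def sens_slope(series):
    """Median of the slopes (x_j - x_i) / (j - i) over all pairs with j > i."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# Illustrative series only; these are not data from the study.
print(sens_slope([2, 4, 6, 8, 10]))   # exactly linear series
print(sens_slope([1, 2, 3, 100, 5]))  # the single outlier barely moves the median
```

Because the estimator is a median, one extreme value changes only a few of the pairwise slopes and leaves the estimate largely unaffected.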
Pettitt test
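As an illustration of how a change point can be located, the sketch below assumes the standard formulation of the Pettitt test: the statistic U_t compares all observations before position t with all observations from t onward, K is the largest absolute U_t, and the p value uses the usual approximation 2 exp(−6K²/(n³ + n²)). The function name `pettitt_test` and the example series are hypothetical, and the study itself used R packages for this step.

```python
import math


def pettitt_test(series):
    """Return (K, change_index, approximate p value).

    change_index is the number of observations in the first segment, that is,
    the 0-based position of the first observation after the change.
    """
    n = len(series)
    best_k, change_index = 0, None
    for t in range(1, n):
        # U_t sums sign(x_i - x_j) for every i before t and every j from t on.
        u_t = 0
        for i in range(t):
            for j in range(t, n):
                diff = series[i] - series[j]
                u_t += (diff > 0) - (diff < 0)
        if abs(u_t) > best_k:
            best_k, change_index = abs(u_t), t

    # Approximate two-sided p value, capped at 1.
    p_value = min(1.0, 2.0 * math.exp(-6.0 * best_k**2 / (n**3 + n**2)))
    return best_k, change_index, p_value


# Illustrative step series only; these are not data from the study.
k, idx, p = pettitt_test([1, 1, 1, 1, 1, 5, 5, 5, 5, 5])
print(k, idx, round(p, 3))
```

A change point is reported as significant when the p value is below the chosen significance level.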
Rainfall prediction
Logistic regression machine learning
Deep learning architectures
Statistical analysis and software used
Data entry, organization, and descriptive statistics were performed using Microsoft Excel (2016). The trend analysis was carried out in the open-source software R 4.3 using the Kendall, trend, and changepoint packages. The logistic model parameters (coefficients) were estimated using PROC LOGISTIC in SAS 9.3 (SAS Institute Inc., Cary, NC, USA). The weights of the deep learning model were optimized in the open-source language Python 3.11, using the numpy, pandas, tensorflow, sklearn, and matplotlib libraries.
RESULTS
Trend analysis in rainfall and temperature data
Trend analysis for annual rainfall
Season | Mann–Kendall trend | Mann–Kendall p value | Pettitt change point | Pettitt p value | Sen's slope (mm/year) |
---|---|---|---|---|---|
Total | No | 0.72 | No (2021) | 0.73 | 1.17 (p = 0.73) |
Kharif | No | 0.91 | No (2021) | 0.52 | −0.41 (p = 0.91) |
Rabi | No | 0.20 | No (2018) | 0.66 | 0.87 (p = 0.20) |
Trend analysis for rainy days
Season | Mann–Kendall trend | Mann–Kendall p value | Pettitt change point | Pettitt p value | Sen's slope (days/year) |
---|---|---|---|---|---|
Total | Yes | 0.02 | Yes (1992) | 0.02 | 0.33 (p = 0.01) |
Kharif | No | 0.19 | – | – | – |
Rabi | Yes | 0.04 | No (1995) | 0.66 | 0.08 (p = 0.04) |
Trend analysis in monthly rainfall
Month | Mann–Kendall trend | Mann–Kendall p value | Pettitt change point | Pettitt p value | Sen's slope (mm/year) |
---|---|---|---|---|---|
January | No | 0.28 | No (2013) | 0.28 | 0 |
February | No | 0.90 | No (2003) | 0.90 | 0 |
March | No | 0.37 | No (2006) | 0.37 | 0 |
April | No | 0.51 | No (2006) | 0.51 | 0 |
May | No | 0.88 | No (2007) | 0.88 | 0 |
June | No | 0.75 | No (2019) | 0.75 | −0.32 |
July | No | 0.36 | No (2021) | 0.33 | 1.85 |
August | No | 0.70 | No (2021) | 0.35 | −0.79 |
September | No | 0.95 | No (2018) | 0.81 | 0.12 |
October | No | 0.98 | No (2018) | 0.56 | 0 |
November | No | 0.05 | Yes (2010) | 0.04 | 0 |
December | No | 0.16 | No (1997) | 0.16 | 0 |
Regarding the monthly minimum and maximum temperatures (Tables 4 and 5), a significant increasing trend in minimum temperature was observed over the years for all months except December. For maximum temperature, a significant increasing trend was observed only in September and October; no significant trend was found for the other months during the study period.
Trend analysis in minimum temperature over the months
Month | Trend | M-K p value | Change point | Sen's slope (°C/year) |
---|---|---|---|---|
January | + | 0.01 | 2013 | 0.12 |
February | + | 0.03 | 2014 | 0.13 |
March | + | 0.02 | 2015 | 0.13 |
April | + | 0.002 | 2015 | 0.25 |
May | + | 0.001 | 2017 | 0.22 |
June | + | 0.002 | 1985 | 0.07 |
July | + | 0.001 | 1988 | 0.05 |
August | + | 0.005 | 2008 | 0.03 |
September | + | 0.002 | 2014 | 0.13 |
October | + | 0.002 | 2012 | 0.11 |
November | + | 0.003 | 2013 | 0.14 |
December | No | 0.12 | – | – |
Trend analysis in maximum temperature over the months
Month | Trend | M-K p value | Change point | Sen's slope (°C/year) |
---|---|---|---|---|
January | No | 0.25 | – | – |
February | No | 0.32 | – | – |
March | No | 0.39 | – | – |
April | No | 0.14 | – | – |
May | No | 0.44 | – | – |
June | No | 0.88 | – | – |
July | No | 0.59 | – | – |
August | No | 0.22 | – | – |
September | + | 0.03 | 1999 | 0.03 |
October | + | 0.01 | 2012 | 0.11 |
November | No | 0.52 | – | – |
December | No | 0.09 | – | – |
Rainfall prediction using logistic regression and deep learning models
Logistic regression
The relationship between the weather parameters and rainfall was modeled using a logistic machine learning model and a deep learning algorithm to predict rainfall. The parameter estimates (coefficients) of the logistic model are presented in Table 6. The intercept term was highly significant (p < 0.01) for predicting the occurrence of rainfall within 1, 3, and 5 days. For the 1-day prediction, April, February, and July had no significant influence on rainfall, whereas for the 3-day prediction, all months except April, August, and February had a significant effect. All months except March were significant in predicting rainfall occurrence within 5 days. Relative humidity and minimum temperature significantly affected rainfall in all categories, and wind velocity also contributed significantly in all cases. Evaporation had no significant effect in any case, while maximum temperature was significant only for the 1-day prediction (p < 0.05). No lack of fit was found in the fitted logistic models (p > 0.05). The results were also supported by the Wald chi-square test presented in Table 7. All independent variables (month, relative humidity, temperature, and wind velocity) had a significant effect on rainfall, except for BSS and E pan. The association between the predicted probability of rainy days and actual rainy days is presented in Table 8, whose statistics describe the strength of that association. The percent concordant is the proportion of pairs of observations with different outcomes (one rainy day and one non-rainy day) in which the rainy day received the higher predicted probability, that is, in which the predictions agree with the actual outcomes in relative ranking. A high percentage indicates good predictive ability of the model. The measured percent concordant (90.70%) showed good agreement between actual and predicted rainy days. The percent discordant is the proportion of such pairs in which the non-rainy day received the higher predicted probability.
The low percent discordant (9.10%) indicated that the logistic model ranked few pairs incorrectly. The percent tied (0.20%) is the proportion of pairs in which the rainy and non-rainy days received the same predicted probability. Somers' D, Gamma, Tau-a, and c are rank correlation indices for assessing the predictive ability of a model. Somers' D (0.82), the difference between the concordant and discordant pairs divided by the total number of pairs, showed a strong positive correlation between actual and predicted rainy days. Gamma (0.82), a nonparametric measure of association computed as the difference between the concordant and discordant pairs divided by their sum (ties excluded), likewise showed a strong positive association. Tau-a (0.20) suggested a moderate association, whereas the concordance index c (0.91) indicated a satisfactory predictive ability of the logistic model.
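The relationship between these indices and the pair percentages can be checked with a short calculation. The sketch below assumes the usual definitions of Somers' D, Gamma, and c in terms of concordant, discordant, and tied pairs; the function name `association_measures` is hypothetical. Tau-a is omitted because it also depends on the total number of observations, which is not reported in Table 8.

```python
def association_measures(pct_concordant, pct_discordant, pct_tied):
    """Return (Somers' D, Gamma, c) from the percentages of pairs."""
    total = pct_concordant + pct_discordant + pct_tied
    # Somers' D: net concordance over all pairs, ties included.
    somers_d = (pct_concordant - pct_discordant) / total
    # Gamma: net concordance over untied pairs only.
    gamma = (pct_concordant - pct_discordant) / (pct_concordant + pct_discordant)
    # c: concordant pairs plus half of the tied pairs, over all pairs.
    c_index = (pct_concordant + 0.5 * pct_tied) / total
    return somers_d, gamma, c_index


# Percentages reported in Table 8.
d, g, c = association_measures(90.70, 9.10, 0.20)
print(round(d, 2), round(g, 2), round(c, 2))
```

With the percentages from Table 8, the rounded results agree with the tabulated Somers' D, Gamma, and c.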
Coefficients of logistic model and their significance
Parameters | Estimate, within 1 day | Estimate, within 3 days | Estimate, within 5 days |
---|---|---|---|
Intercept | −8.7539* | −6.0486* | −4.7140* |
April | 0.3491 | 0.1412 | −0.3641** |
August | −0.3074** | 0.2442 | 1.0861* |
December | −0.9922* | −1.2830* | −1.3596* |
February | 0.1615 | −0.2256 | −0.4955* |
January | −0.6323** | −0.8528* | −0.7409* |
July | 0.272 | 0.9185* | 1.6341* |
June | 1.0872* | 1.5182* | 1.6857* |
March | 0.9699* | 0.6038* | 0.1051 |
May | 0.9485* | 0.7884* | 0.4417* |
November | −0.8399* | −1.2654* | −1.4797* |
October | −0.6449* | −0.6262* | −0.8550* |
RHmax | 0.0248* | 0.0227* | 0.0263* |
RHmin | 0.0621* | 0.0465* | 0.0220* |
Min_T | 0.1524* | 0.1108* | 0.0935* |
Max_T | −0.0522** | −0.0371 | −0.0218 |
BSS | −0.0198 | −0.0369** | −0.0369** |
E_Pan | 0.0372 | 0.00941 | 0.0118 |
Wind velocity | −0.0572** | −0.0530* | −0.0592* |
Goodness of fit | 0.09 | 0.80 | 0.15 |
Note: * and ** indicate significance at the 1 and 5% levels, respectively.
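One way to read the coefficients of the continuous predictors in Table 6 is through odds ratios: exponentiating a logistic coefficient gives the multiplicative change in the odds of rainfall for a one-unit increase in that predictor, with the other predictors held fixed. The sketch below applies this general property of logistic regression to the 1-day estimates; the names `coefficients_1day` and `odds_ratio` are hypothetical, and the month terms are left out because their coding scheme is not stated in the text.

```python
import math

# 1-day coefficients of the continuous predictors, copied from Table 6.
coefficients_1day = {
    "RHmax": 0.0248,
    "RHmin": 0.0621,
    "Min_T": 0.1524,
    "Wind velocity": -0.0572,
}


def odds_ratio(beta):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(beta)


for name, beta in coefficients_1day.items():
    print(f"{name}: odds ratio = {odds_ratio(beta):.3f}")
```

An odds ratio above 1 (for example, for RHmin and Min_T) means that the odds of rainfall rise as the predictor increases, while a value below 1 (wind velocity) means that they fall.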
Significance of independent variables on rainfall for logistic model
Effect | DF | Wald chi-square | p value |
---|---|---|---|
Month | 11 | 117.1486 | <0.0001 |
RHMax | 1 | 15.4630 | <0.0001 |
RHMin | 1 | 150.9556 | <0.0001 |
Min_T | 1 | 39.9901 | <0.0001 |
Max_T | 1 | 4.3611 | 0.0368 |
BSS | 1 | 1.4932 | 0.2217 |
E_Pan | 1 | 1.5322 | 0.2158 |
Wind_Ve | 1 | 11.1245 | 0.0009 |
Association of predicted probabilities and actual responses (rainy days)
Statistics | Measurements |
---|---|
Percent concordant | 90.70 |
Percent discordant | 9.10 |
Percent tied | 0.20 |
Somers’ D | 0.82 |
Gamma | 0.82 |
Tau-a | 0.20 |
C | 0.91 |
Deep learning models
Fine-tuned hyperparameters of deployed deep learning architecture
S No | Hyperparameters | Values |
---|---|---|
1 | Input features | 19 |
2 | Hidden layer | 1 |
3 | Neurons in hidden layer | 5 |
4 | Neurons in output layer | 1 |
5 | Batch size | 32 |
6 | Epochs | 50 |
7 | Learning rate | 0.001 |
Loss and accuracy plots of deep learning models with confusion matrix: (a) for 1 day, (b) for 3 days, and (c) for 5 days.
Table 10 compares the performance measures of the logistic and deep learning models. The accuracy of both models was comparable. However, the correct classification of rainfall requires both precision and sensitivity. For the 1-day prediction, the precision and sensitivity of the deep learning model (0.96 and 0.90) were better than those of the logistic model (0.66 and 0.48). A higher precision indicates that there are very few instances of predicting rain that did not actually occur. Likewise, sensitivity is higher when there are fewer instances of predicting no rain when it does rain. A good overall measure of performance is the F1 score, the harmonic mean of precision and sensitivity (recall), which gives greater weight to the smaller of the two values. The F1 score was better for the deep learning model (0.93) than for the logistic model (0.56). Of all the performance measures, only specificity was better for the logistic model (0.96) than for the deep learning model (0.63). Overall, based on several performance measures, the deep learning model outperformed the logistic model.
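The F1 scores quoted above follow directly from the reported precision and sensitivity. The short sketch below uses the standard harmonic-mean definition and the 1-day values from Table 10; the function name `f1_score` is chosen here for illustration and is not taken from the study's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


# 1-day precision and sensitivity reported in Table 10.
logistic_f1 = f1_score(0.66, 0.48)
deep_learning_f1 = f1_score(0.96, 0.90)
print(round(logistic_f1, 2), round(deep_learning_f1, 2))
```

The logistic model's low sensitivity pulls its F1 score well below its precision, which is why the F1 score separates the two models even though their accuracies are nearly equal.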
Performance measures of logistic and deep learning model
Performance measures | Within 1 day | Within 3 days | Within 5 days |
---|---|---|---|
Logistic machine learning | |||
Accuracy (%) | 89.11 | 85.66 | 84.75 |
Sensitivity | 0.48 | 0.69 | 0.74 |
Specificity | 0.96 | 0.91 | 0.89 |
Precision | 0.66 | 0.73 | 0.77 |
F1 score | 0.56 | 0.71 | 0.76 |
Deep learning model | |||
Accuracy (%) | 88.93 | 84.98 | 84.39 |
Sensitivity | 0.90 | 0.89 | 0.88 |
Specificity | 0.63 | 0.67 | 0.76 |
Precision | 0.96 | 0.88 | 0.89 |
F1 score | 0.93 | 0.89 | 0.88 |
Gradient-based sensitivity measure of input variables
Input variables | Gradient | Absolute gradient | Sensitivity index (rank) |
---|---|---|---|
April | −0.0106 | 0.011 | 0.048 (14) |
August | 0.0349 | 0.035 | 0.545 (3) |
December | −0.0107 | 0.011 | 0.050 (13) |
February | −0.0102 | 0.010 | 0.039 (17) |
January | −0.0103 | 0.010 | 0.042 (16) |
July | 0.0329 | 0.033 | 0.504 (4) |
June | 0.0092 | 0.009 | 0.018 (18) |
March | −0.0107 | 0.011 | 0.051 (12) |
May | −0.0121 | 0.012 | 0.079 (10) |
November | −0.0106 | 0.011 | 0.047 (15) |
October | −0.0109 | 0.011 | 0.055 (11) |
September | 0.0083 | 0.008 | 0.000 (19) |
RHMax | 0.0275 | 0.027 | 0.393 (5) |
RHMin | 0.0572 | 0.057 | 1.000 (1) |
Min_T | 0.0247 | 0.025 | 0.336 (6) |
Max_T | −0.0178 | 0.018 | 0.196 (9) |
BSS | −0.0424 | 0.042 | 0.697 (2) |
E_Pan | −0.0210 | 0.021 | 0.259 (7) |
Wind_v | 0.0185 | 0.018 | 0.209 (8) |
The rank of each sensitivity index is given in parentheses. The top four sensitivity ranks are highlighted in bold.
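The tabulated sensitivity index appears to be consistent with min-max scaling of the absolute gradients, so that the largest absolute gradient maps to 1 and the smallest to 0. This is an inference from the tabulated numbers, not a definition given in the text, and the sketch below should be read in that light. The gradients are copied from the table above; the names `gradients` and `sensitivity_index` are hypothetical. Small differences in the third decimal place relative to the table are expected because the published gradients are rounded.

```python
# Gradients of the loss with respect to each input, copied from the table above.
gradients = {
    "April": -0.0106, "August": 0.0349, "December": -0.0107,
    "February": -0.0102, "January": -0.0103, "July": 0.0329,
    "June": 0.0092, "March": -0.0107, "May": -0.0121,
    "November": -0.0106, "October": -0.0109, "September": 0.0083,
    "RHMax": 0.0275, "RHMin": 0.0572, "Min_T": 0.0247,
    "Max_T": -0.0178, "BSS": -0.0424, "E_Pan": -0.0210, "Wind_v": 0.0185,
}


def sensitivity_index(grads):
    """Min-max scale the absolute gradients to the range [0, 1] (assumed definition)."""
    magnitudes = {name: abs(g) for name, g in grads.items()}
    low, high = min(magnitudes.values()), max(magnitudes.values())
    return {name: (m - low) / (high - low) for name, m in magnitudes.items()}


index = sensitivity_index(gradients)
# Rank the inputs from most to least sensitive.
ranked = sorted(index, key=index.get, reverse=True)
print(ranked[:4])
print(round(index["BSS"], 3), round(index["RHMax"], 3))
```

Under this assumption, the four highest-ranked inputs and the index values for BSS and RHMax match the table.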
Measured gradients of loss function with respect to input variables.
DISCUSSION
Trend analysis in rainfall and temperature data
The lack of significant trends in the annual and monthly rainfall data for Bhopal, as indicated by the p values and Sen's slope estimator, suggests that rainfall patterns have remained relatively stable over the analyzed period. However, the observed increase in the number of rainy days, particularly during the Rabi season, could have implications for crop planning and water management strategies. The larger number of months that were significant for the 3-day and 5-day predictions suggests that seasonal effects on rainfall occurrence become more evident over longer prediction windows.
Moreover, although minimum temperature did not exhibit a significant trend, the significant increasing trend in maximum temperature, with an estimated rate of 0.05 °C/year, is concerning. This rise in maximum temperature could exacerbate heat stress, affecting crop yields, water demand, and human health, and necessitating adaptation measures such as heat-tolerant crop varieties, efficient irrigation systems, and improved heat action plans. The change point in maximum temperature identified around 2015 by the Pettitt test underscores the need for continuous monitoring and assessment of climatic variables to inform timely and appropriate adaptation strategies.
Rainfall prediction using logistic regression and deep learning models
Traditional statistical methods, which rely on linear assumptions, often fall short of accurately predicting rainfall. Linear models based on historical trends have been found less effective in predicting extreme weather events or subtle climatic variations. Recent developments in machine learning and deep learning, however, can handle the complex, nonlinear nature of weather input parameters, leading to more reliable predictions. Endalie et al. (2022) developed a deep learning model, based on long short-term memory (LSTM), for daily rainfall prediction in Jimma, Ethiopia, and their performance assessment revealed that deep learning outperformed the existing statistical methods. By employing logistic machine learning models and deep learning algorithms, the present study aimed to create a robust predictive framework for rainfall. The significant intercept term across all prediction periods underscores the strong baseline relationship between the chosen parameters and rainfall occurrence. Relative humidity and minimum temperature consistently emerged as significant predictors of rainfall. This finding aligns with meteorological principles, as higher humidity often precedes rainfall, and minimum temperatures can influence condensation processes. The consistent significance of wind velocity across all prediction periods is also noteworthy, since wind patterns can influence moisture transport and cloud formation, directly affecting the likelihood of rainfall. Finally, the strong association between predicted and actual rainy days, as shown in Table 8, supports the practical applicability of the model. This association indicates that the model can translate weather parameter data into accurate rainfall predictions, making it a useful tool for weather forecasting and related applications.
In summary, the study's findings highlight the complexity of rainfall prediction and the importance of specific weather parameters. The results not only validate the chosen model but also offer a foundation for future research to refine and enhance rainfall prediction methods. Climate change projections and observed trends highlight the need for proactive adaptation measures to reduce vulnerability and build resilience, especially in sectors such as agriculture that are highly dependent on climatic conditions (Datta et al. 2022; Baraj et al. 2024).
CONCLUSIONS
The investigation aimed to quantify the impact of climate change in the Bhopal region of India based on historical time-series weather data spanning from 1983 to 2023. The following conclusions were drawn from the tested hypotheses and the fitted logistic machine learning and deep learning models:
1. The average annual rainfall of the region was estimated to be 1,076 (±120) mm, and annual rainfall showed no significant trend (estimated slope, 1.17 mm/year; p = 0.47). A significant increasing trend was observed in total rainy days and in rainy days during the Rabi season, whereas no trend was detected in the number of Kharif rainy days.
2. Minimum temperature showed no significant trend over the years (p = 0.18), although its estimated rate of increase was 0.02 °C/year. There was a highly significant increasing trend in maximum temperature (p = 0.07), which is increasing at a rate of 0.05 °C/year.
3. A significant increasing trend was observed in monthly minimum temperature over the years, except for December. However, no significant trend was observed in maximum temperature for any month over the years.
4. The Wald chi-square test in the logistic machine learning model detected a significant effect of month, relative humidity, temperature, and wind velocity on rainfall, whereas BSS and E pan had no significant effect.
5. The deep learning architecture with one hidden layer containing five neurons was found to be the best in predicting rainfall (learning rate, 0.0001; batch size, 32; and epochs, 50).
6. Based on the F1 score, the deep learning model performed better in predicting rainfall (F1 score: 0.93) than the logistic model (0.56). Of the five performance measures, only specificity was higher for the logistic model (0.96) than for the deep learning model (0.63). Overall, it was concluded that the deep learning model outperformed the logistic model in capturing the variation in rainfall.
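A model can score well on F1 yet poorly on specificity because F1 is built from precision and recall only and therefore ignores true negatives. The Python sketch below shows how these measures follow from a confusion matrix. The function name and the counts are hypothetical and were chosen only for illustration; they are not the confusion matrices of this study, and the set of measures shown is an assumption about which ones are of interest.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary-classification measures from confusion counts.

    tp: rainy days predicted as rainy      fp: dry days predicted as rainy
    fn: rainy days predicted as dry        tn: dry days predicted as dry
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (illustration only): most rainy days are detected,
# but 10 of the 25 dry days are wrongly flagged as rainy.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical case the F1 score is high because rainy days are detected reliably, while specificity is modest because a sizeable share of dry days is misclassified, so both measures are worth reporting when the cost of false alarms matters.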
Rainfall prediction may play a critical role in crop production. Timely prediction of rainfall could help farmers make decisions on sowing and harvesting crops. In addition, accurate rainfall prediction is important for optimizing irrigation schedules, which may result in better crop water productivity. The deep learning model may be further trained with long-term time-series data from multiple stations in other regions. Other influencing input variables (e.g. atmospheric pressure, net radiation, dew point) may also be considered for the prediction of rainfall. The findings and methodology adopted in this study may provide valuable insights for researchers and academics interested in exploring climate change through statistical analysis and machine learning approaches.
ACKNOWLEDGEMENTS
The authors would like to thank ICAR-Central Institute of Agricultural Engineering, Bhopal, India, for providing Agro-meteorological Observatory data for this study.
AUTHOR CONTRIBUTIONS
Manoj Kumar: Conceptualization, data collection, data analysis, and original draft preparation; Mukesh Kumar: Conceptualization, data collection, and original draft preparation; Ranjay K. Singh: Original draft preparation, manuscript revision; Abhishek M. Waghaye: Conceptualization and interpretation; V. BhushanaBabu: Manuscript revision; Ravindra D. Randhe: Writing – review and editing.
FUNDING
This research received no external funding.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.