ABSTRACT
Groundwater (GW) is a critical resource in India, where diverse climatic and geological conditions create unique challenges for water management. Accurate forecasting is essential for sustainable GW management in this region, which captures complex patterns in GW data. This study focuses on Himachal Pradesh as a case study to explore the potential of deep learning techniques for improving predictive accuracy. These methods need the image and/or non-image data for the analysis and prediction. This study aims to provide reliable forecasts using ARIMA, LSTM, CNN, and seq2seq, which are vital for informed decision-making and sustainable water resource management in regions with varying environmental conditions like Himachal Pradesh. It is observed that F1 values range between 0.82 and 0.91, while mean squared error ranges between 0.12 and 0.20. Each method has its merits and demerits. The acceptance depends on the complexity and availability of the data and the desired accuracy.
HIGHLIGHTS
This study focuses on groundwater levels of Himachal Pradesh to explore deep learning techniques for improved accuracy, requiring image and/or non-image data.
ARIMA, LSTM, and CNN models aim to provide reliable forecasts for informed decision-making.
Acceptance depends on data complexity, availability, and the level of accuracy needed for effective groundwater management.
INTRODUCTION
Groundwater (GW) is a vital supply of water for drinking, agriculture, and other uses in many parts of India (Selvakumar et al. 2017). About 30% of people in cities and over 90% of people in rural areas have access to GW for drinking water (Jaiswal et al. 2003). GW supplies around the world are under more stress now than ever before due to recent increases in household, agricultural, and industrial water usage (Yeh et al. 2016). Numerous factors influence GW availability, such as soil type, catchment area, catchment slope, precipitation, evapotranspiration, catchment geology, land use and land cover (LULC), and the baseflow index, which measures the percentage of baseflow to total streamflow (Bloomfield et al. 2009; Zomlot et al. 2015).
Maintaining the balance between the supply and demand of water resources will be made easier by acknowledging the importance of recharge processes. Understanding the potential benefits of GW recharge zones, which are areas where the ground surface permits GW infiltration and percolation, is essential (Arefin 2020). Water may therefore seep into the ground, enter the vadose zone, or flow freely (Hammami et al. 2019). Naturally, the nation's excessive and continuous exploitation of GW resources makes restoring GW reservoirs a time-consuming and often inadequate operation (Mahmoud & Alazba 2015).
In India and around the world, GW potential zoning has been the focus of numerous studies and different approaches (Bera et al. 2020). In the huge basins that span the Indian subcontinent, many studies have been carried out. Therefore, more of this research across different hydrogeological regimes is needed to establish efficient GW management techniques. Hydrology and geophysical research are used to more precisely identify possible zones in a region (Hamdy et al. 2003; Ahmed et al. 2021). In the past, a variety of groundwater quantity models have been employed to improve GW resource management. The models include data-driven models that use artificial intelligence (AI), numerical models that are physically based like MODFLOW, and a combination of the two. Numerical models with a physical foundation were widely used in the past to model and forecast groundwater levels. However, the physically based numerical models are frequently laborious, necessitate large quantities of input data that are frequently unavailable, and ask for a certain level of skill in measuring the model's parameters (Sharafati et al. 2020).
Long-term water security depends on trend analysis for GW management since it sheds light on the sustainability of this vital resource. The Mann–Kendall test is one of several methods that have been developed to analyse trends in GW levels (GWLs) (Hamed & Rao 1998). This test detects patterns in time-series data without relying on distributional assumptions, making it useful for detecting GWL. The procedure begins with gathering historical GWL readings, which are then pre-processed to remove inconsistencies and outliers. Statistical methods, such as the Mann–Kendall test and Sen's slope estimator (Sen 1968), are widely used for identifying and quantifying trends. In more sophisticated scenarios, time-series analysis can capture complicated patterns and seasonal fluctuations in GWL. Geographic information systems (GIS) and remote sensing technologies are also becoming increasingly relevant for analysing spatial and temporal trends in GWL across greater territories (Long & Scanlon 2012).
Machine learning approaches, such as autoregressive integrated moving average (ARIMA), convolutional neural networks (CNNs), and long short-term memory (LSTM) networks, improve the accuracy of GW trend analysis. These advanced technologies are being used in environmental monitoring and hydrological studies to provide significant information for water resource management decision-making.
One crucial stage in ML modelling is choosing the best input variables. Principal component analysis, stepwise approach, mutual information, and correlation analysis are a few techniques that can be applied when choosing input variables. It is also necessary to choose the best delays for time series research. Among the most effective statistical techniques for choosing lag times in AI modelling techniques are autocorrelations, partial autocorrelations, and cross-correlations. Correlation analysis was utilized to select input variables for groundwater level prediction using support vector machines and random forest, as well as to predict groundwater levels using an artificial neural network. However, there is currently a dearth of literature on the use of correlation analysis in choosing lags for AI models (Tao et al. 2022).
In this study, the efficacy of predictive models such as ARIMA, LSTM, and CNN using seq2seq and seq2val scenarios is used to analyse GWL trends.
METHODS FOR ANALYSIS
To analyse trends in GWL, different methodologies are used to discover and quantify changes over time. Three major strategies are discussed below:
Statistical techniques
Methods such as the Mann–Kendall test and Sen's slope estimator are used to detect significant trends in historical GWL data. These techniques help determine whether GWL is rising, falling, or remaining stable.
Time-series analysis
ARIMA models provide forecasts based on historical data. These models are useful for understanding and predicting future GWL trends based on past observations.
Advanced machine learning
Techniques like CNN and LSTM networks are employed to capture complex patterns and temporal dependencies in GW data. These methods enhance the ability to understand intricate trends and predict future changes more accurately.
Sequence-to-sequence (Seq2seq) forecasting models are essential for assessing effectiveness over various time periods. The process starts with data collection, gathering meteorological factors such as rainfall, temperature, and humidity to ensure that all relevant variables are considered. This is followed by data preprocessing, where the raw data is cleaned, normalized, and engineered to optimize model input. Feature selection then identifies key variables crucial for accurate forecasting. In the model-building phase, different models like ARIMA, LSTM, and CNN are developed and compared to find the most suitable one.
To evaluate and forecast the accuracy for simulated and observed GWLs, several metrics are used, including Nash–Sutcliffe efficiency (NSE), coefficient of determination (R2), and absolute and relative root mean square error (RMSE). NSE evaluates how well anticipated values match observed values, and ranges from −∞ to 1, with higher values indicating improved model accuracy. R2 provides an overview of the model's capacity to detect trends in the data, and values range from 0 to 1, and higher values indicate a greater correlation between observed and projected data. R2 is important for testing model fit to GWL data and incorporating critical elements like rainfall and temperature. RMSE is sensitive to big errors, which makes it an important statistic for detecting significant discrepancies in forecasts. The purpose of RMSE is to quantify the average error or discrepancy between anticipated and actual values. Lower RMSE values indicate greater forecast accuracy. These parameters allow a detailed assessment of model performance.
Model evaluation by NSE, R2, and RMSE, with ARIMA typically outperforming CNN and LSTM in terms of capturing variance and accuracy. The optimization phase involves hyperparameter tuning to enhance model performance. Once optimized, the models are used for predictions, and results are analysed for practical applications. Additionally, considerations such as GW extractions and surface water interactions ensure that the predictions are integrated into a broader environmental context. Statistical techniques like the Mann–Kendall test and Sen's slope estimator are effective for detecting trends but lack depth in complex patterns. ARIMA models are strong in forecasting due to their ability to capture autocorrelations and seasonal patterns, while machine learning techniques like CNNs and LSTMs handle complex relationships with high accuracy but require extensive data and computational power.
The simulation is also carried out using MODFLOW and compared with the above-mentioned predictive models. The choice of method depends on the analysis needs. These methodologies are applied in a case study in Himachal Pradesh, India, demonstrating their practical effectiveness in forecasting GWL.
STUDY AREA AND DATA AVAILABILITY
Himachal Pradesh has a complex geological environment, with aquifers heavily influenced by the region's tectonic and rock formations. The area has fractured aquifers in metamorphic and igneous rocks, alluvial aquifers in river valleys, and sedimentary aquifers in lower and middle hills. Alluvial aquifers are very permeable, supplying a substantial amount of GW, whereas fractured aquifers provide more limited and unpredictable yields due to faulting and fractures. Karst aquifers, especially in limestone-dominated areas, are highly permeable but unevenly distributed. Hydraulic parameters like as porosity, permeability, and transmissivity differ among different aquifers, with alluvial aquifers often having the greatest values. Understanding these variances is critical for successful GW management in this region. Aquifer types and their hydraulic properties vary depending on geological conditions, and they play a crucial role in GW management. For instance, unconsolidated alluvial aquifers typically exhibit transmissivity values ranging from 10 to 1,000 m2/day, hydraulic conductivity between 10 and 100 m/day, and a specific yield of 0.05–0.15. In contrast, fractured bedrock aquifers generally have lower transmissivity (1–100 m2/day) and hydraulic conductivity (0.1–10 m/day), with storage coefficients and specific yield values much smaller, ranging from 0.0001 to 0.01 and 0.02 to 0.1, respectively. Limestone aquifers often demonstrate transmissivity values between 5 and 500 m2/day, hydraulic conductivity between 0.5 and 50 m/day, and a specific yield of 0.01–0.05. Granite aquifers have the lowest transmissivity, typically between 1 and 10 m2/day, and hydraulic conductivity ranging from 0.1 to 1 m/day, with a specific yield of 0.01–0.03. These hydraulic properties are crucial for understanding the potential for GW storage, recharge, and sustainable extraction in various regions.
The parameters are shown in Table 1.
Parameters used for the GW model
Parameters . | Values . |
---|---|
Drawdown | 110 lpm |
Evaporation | 4 mm |
Infiltration rate | 0.02 m3/h |
Avg. infiltration capacity | 0.125 cm/h |
Transmissivity | 100 and 1,000 m2/day moderate |
Storativity | 1.6 × 10−4 |
Permeability | <10−5 |
Discharge | 16 lps |
Well diameter | 30 cm |
Total area | 55.673 km2 |
Parameters . | Values . |
---|---|
Drawdown | 110 lpm |
Evaporation | 4 mm |
Infiltration rate | 0.02 m3/h |
Avg. infiltration capacity | 0.125 cm/h |
Transmissivity | 100 and 1,000 m2/day moderate |
Storativity | 1.6 × 10−4 |
Permeability | <10−5 |
Discharge | 16 lps |
Well diameter | 30 cm |
Total area | 55.673 km2 |
Source: Data taken by CGWB (2018).
The test period for this analysis spans from 1990 to 2030 (see Table 2).
Rainfall, temperature, and the reference datasets are collected from NASA (https://www.nccs.nasa.gov/services/climate-data-services)
Year . | Rainfall (mm) . | Year . | Rainfall (mm) . | Year . | Rainfall (mm) . | Year . | Rainfall (mm) . |
---|---|---|---|---|---|---|---|
1990 | 3,400 | 2001 | 3,601 | 2011 | 3,456 | 2021 | 3,786 |
1991 | 3,400 | 2002 | 3,608 | 2012 | 3,243 | 2022 | 3,896 |
1992 | 3,400 | 2003 | 3,609 | 2013 | 3,345 | 2023 | 4,000 |
1993 | 3,400 | 2004 | 3,678 | 2014 | 3,578 | 2024 | 4,011 |
1994 | 3,500 | 2005 | 3,546 | 2015 | 3,675 | 2025 | 4,015 |
1995 | 3,600 | 2006 | 3,678 | 2016 | 3,768 | 2026 | 4,165 |
1996 | 3,450 | 2007 | 3,650 | 2017 | 3,876 | 2027 | 4,174 |
1997 | 3,480 | 2008 | 3,567 | 2018 | 3,578 | 2028 | 4,287 |
1998 | 3,500 | 2009 | 3,678 | 2019 | 3,897 | 2029 | 4,456 |
2000 | 3,600 | 2010 | 3,589 | 2020 | 3,675 | 2030 | 4,536 |
Year . | Rainfall (mm) . | Year . | Rainfall (mm) . | Year . | Rainfall (mm) . | Year . | Rainfall (mm) . |
---|---|---|---|---|---|---|---|
1990 | 3,400 | 2001 | 3,601 | 2011 | 3,456 | 2021 | 3,786 |
1991 | 3,400 | 2002 | 3,608 | 2012 | 3,243 | 2022 | 3,896 |
1992 | 3,400 | 2003 | 3,609 | 2013 | 3,345 | 2023 | 4,000 |
1993 | 3,400 | 2004 | 3,678 | 2014 | 3,578 | 2024 | 4,011 |
1994 | 3,500 | 2005 | 3,546 | 2015 | 3,675 | 2025 | 4,015 |
1995 | 3,600 | 2006 | 3,678 | 2016 | 3,768 | 2026 | 4,165 |
1996 | 3,450 | 2007 | 3,650 | 2017 | 3,876 | 2027 | 4,174 |
1997 | 3,480 | 2008 | 3,567 | 2018 | 3,578 | 2028 | 4,287 |
1998 | 3,500 | 2009 | 3,678 | 2019 | 3,897 | 2029 | 4,456 |
2000 | 3,600 | 2010 | 3,589 | 2020 | 3,675 | 2030 | 4,536 |
The maximum daily rainfall data in this period is given in Table 3. The metrics are calculated using the mean of observed values up to the start of this period, aligning with NSE's focus on evaluating model performance relative to the mean of known observations at the beginning of the forecast period.
Maximum daily rainfall data (1990–2030)
Year . | Temperature (°C) . | Year . | Temperature (°C) . | Year . | Temperature (°C) . | Year . | Temperature (°C) . |
---|---|---|---|---|---|---|---|
1990 | 18 | 2001 | 23 | 2011 | 18 | 2021 | 30 |
1991 | 18 | 2002 | 22 | 2012 | 19 | 2022 | 32 |
1992 | 19 | 2003 | 21 | 2013 | 22 | 2023 | 40 |
1993 | 19 | 2004 | 19 | 2014 | 21 | 2024 | 41 |
1994 | 20 | 2005 | 20 | 2015 | 23 | 2025 | 42 |
1995 | 18 | 2006 | 20 | 2016 | 24 | 2026 | 43 |
1996 | 19 | 2007 | 23 | 2017 | 23 | 2027 | 42 |
1997 | 18 | 2008 | 23 | 2018 | 24 | 2028 | 44 |
1998 | 17 | 2009 | 22 | 2019 | 28 | 2029 | 43 |
2000 | 16 | 2010 | 21 | 2020 | 26 | 2030 | 44 |
Year . | Temperature (°C) . | Year . | Temperature (°C) . | Year . | Temperature (°C) . | Year . | Temperature (°C) . |
---|---|---|---|---|---|---|---|
1990 | 18 | 2001 | 23 | 2011 | 18 | 2021 | 30 |
1991 | 18 | 2002 | 22 | 2012 | 19 | 2022 | 32 |
1992 | 19 | 2003 | 21 | 2013 | 22 | 2023 | 40 |
1993 | 19 | 2004 | 19 | 2014 | 21 | 2024 | 41 |
1994 | 20 | 2005 | 20 | 2015 | 23 | 2025 | 42 |
1995 | 18 | 2006 | 20 | 2016 | 24 | 2026 | 43 |
1996 | 19 | 2007 | 23 | 2017 | 23 | 2027 | 42 |
1997 | 18 | 2008 | 23 | 2018 | 24 | 2028 | 44 |
1998 | 17 | 2009 | 22 | 2019 | 28 | 2029 | 43 |
2000 | 16 | 2010 | 21 | 2020 | 26 | 2030 | 44 |
The dynamics of rainfall – particularly shifts in intensity and distribution – play a pivotal role in GW recharge. Increased rainfall intensity, though seemingly beneficial, can paradoxically lead to higher runoff and reduced recharge efficiency, while decreased rainfall directly diminishes GW reserves. In this context, RMSE emerges as a critical metric, offering a precise measure of prediction accuracy. Lower RMSE values signal stronger model performance, but beyond mere numbers, they reflect the models' ability to capture the intricate balance between rainfall patterns and GWLs.
Temperature impacts GWLs in more subtle, yet profound ways. Elevated temperatures accelerate evaporation, reducing soil moisture and the water available for recharge. The models, when evaluated using RMSE, underscore the significant role fluctuations play in shaping GW trends. This understanding is essential for anticipating the effects of climate change on GW availability, especially in regions already vulnerable to water scarcity. Adding to this complexity, NSE factors, including solar radiation and geomagnetic activity, can disrupt climate patterns, further influencing GW behaviour. Integrating NSE data into the climate models broadens the scope of the analysis, helping to uncover previously overlooked drivers of GW variability. Evaluating these impacts through RMSE allows us to gauge how well our models incorporate NSE-related variables, enhancing the ability to predict GW changes with greater precision. This comprehensive approach not only advances the understanding of GW dynamics but also strengthens the capacity to develop more resilient and adaptive water management strategies in the face of an increasingly uncertain climate future.
To validate the results obtained by the predictive tools of machine learning, a conceptual model for MODFLOW is also developed for the aquifer.
CONCEPTUAL MODEL USING MODFLOW
Hydrogeological parameters for the MODFLOW model in Himachal Pradesh are given in Table 4.
Hydrogeological parameters for the MODFLOW model
Parameter . | Unconfined aquifer (alluvial deposits) . | Shallow confined aquifer (hyaloclastites) . | Deep confined aquifer (andesitic lava flows) . | Weathered bedrock . | Impermeable basement (granite/shale) . |
---|---|---|---|---|---|
Porosity (ϕ) | 0.28 | 0.22 | 0.12 | 0.07 | 0.03 |
Hydraulic conductivity (K) (m/day) | 35 | 5 | 0.5 | 0.05 | 0.0005 |
Flow rate (m3/day) | 3,500 | 1,200 | 250 | 50 |
Parameter . | Unconfined aquifer (alluvial deposits) . | Shallow confined aquifer (hyaloclastites) . | Deep confined aquifer (andesitic lava flows) . | Weathered bedrock . | Impermeable basement (granite/shale) . |
---|---|---|---|---|---|
Porosity (ϕ) | 0.28 | 0.22 | 0.12 | 0.07 | 0.03 |
Hydraulic conductivity (K) (m/day) | 35 | 5 | 0.5 | 0.05 | 0.0005 |
Flow rate (m3/day) | 3,500 | 1,200 | 250 | 50 |
The flow directions for the model are as follows:
Recharge zone: Higher elevation regions (terrain: 1,000–1,500 m).
Discharge zone: Lower valleys and river networks (∼900 m).
Flow gradient: 0.001–0.005 (moderate gradient).
Certain assumptions are made while using machine learning models such as ARIMA, LSTM, and CNN, which affect their performance. ARIMA assumes that future GWL can be predicted from past values using autocorrelation and trends, and it performs best with stationary data. LSTM and CNN models, on the other hand, presume that GWL data contains complex temporal patterns and dependencies that can be represented using neural network structures. These models do not explicitly account for physical processes such as GW flow or recharge dynamics, but they excel in identifying patterns and relationships in data.
The accuracy of GWL data is critical for effective modelling and forecasting of GW changes. Ideally, data should be collected daily, yielding 365 data points each year, although weekly (52 data points) or monthly (12 data points) frequencies may also be employed, depending on monitoring capabilities. GWL measurements are typically accurate within ±0.01 to ±0.1 m, with devices such as piezometers providing dependable data collection. Data gaps should be kept to a minimum, ideally less than 5%, while up to 10% is acceptable in locations with poor monitoring. The spatial distribution of data points is critical for model accuracy, with an ideal distance of 5–10 km between monitoring stations to provide complete coverage.
In places with fewer stations, interpolation techniques such as kriging can be used to estimate GWL. The data's trustworthiness is further tested by calculating parameters such as RMSE, with an RMSE of less than 0.5 m. Long-term data, ideally covering 10–20 years, is desirable for capturing seasonal and annual trends, but shorter data periods of at least 5 years can also be used, but with greater uncertainty in long-term projections. Monitoring procedures, equipment calibration, and continual data validation are critical for reducing bias and improving the overall quality of GW level data. The uncertainty in model predictions is quantified using measures such as NSE and RMSE.
Rainfall is the primary driver of recharge in the region, with fluctuations in intensity and distribution playing a critical role. Increased rainfall intensity, while first appearing to be advantageous, can result in increased runoff and poorer recharge efficiency. This occurrence emphasizes the significance of recording complicated dynamics like the interaction of rainfall patterns and GWL. Temperature variations can have an indirect impact on recharge rates because they accelerate evaporation, limiting the amount of soil moisture available for recharge. Elevated temperatures in the region can exacerbate water scarcity, particularly during dry spells. Given these dynamics, measures like artificial recharge could be critical for preserving GW reserves, particularly in locations where natural recharge may be insufficient. Rainwater collecting and check dams in high-runoff areas may help to reduce the effects of rising rainfall intensity. The incorporation of climate models that include NSE elements into recharge calculations would provide a more complete knowledge of recharge efficiency. When analysing recharge strategies, critical performance indicators such as RMSE can help assess the models' capacity to predict GW recharge dynamics. These measures aid in quantifying how well the models account for the effects of rainfall, temperature, and other climatic variables on recharge efficiency and GWLs. This insight is critical for adjusting water management methods to climate change problems and preserving long-term GW sustainability in Himachal Pradesh.
In GWL forecasting, evaluating model performance using metrics like NSE, R2, and RMSE is crucial for understanding each model's effectiveness. R2 measures the proportion of variance explained by the model, with ARIMA often excelling in short-term forecasts due to its ability to handle autocorrelation and trend, while LSTM and CNN models typically perform well by capturing complex temporal dependencies and extracting relevant features. LSTM is a recurrent neural network model used to predict GWL by analysing temporal patterns in data. The dataset, including meteorological variables like rainfall and temperature, is normalized to ensure consistency for accuracy. The training period (1990–2019) captures long-term dependencies, while the testing period (2020–2030) evaluates the model's predictive capacity under changing climate conditions. Key metrics such as NSE, R2, and RMSE assess performance, with predictions analysed seasonally to identify patterns. A validation set prevents overfitting, ensuring robust and reliable forecasting. NSE assesses how closely model predictions align with observed data, with ARIMA, LSTM, and CNN models generally achieving high values when well-tuned. seq2seq performance can vary depending on its ability to learn sequences effectively. The data required for these models are given in Table 5.
Data requirement for different models
Model . | Type of data . | Dataset used . |
---|---|---|
CNN | Image dataset |
|
ARIMA | Non-image dataset |
|
LSTM | Non-image dataset |
|
Model . | Type of data . | Dataset used . |
---|---|---|
CNN | Image dataset |
|
ARIMA | Non-image dataset |
|
LSTM | Non-image dataset |
|
The analysis also underscores the global impact of climate change on GW resources, particularly reductions in levels and recharge rates. Climate change effects are evident in areas such as the Arctic, where altered snowmelt patterns and rising temperatures have affected GWLs. Performance metrics like RMSE and R2 reflect the varying accuracy and bias in predicting GWLs under these changing conditions. Addressing these trends requires integrated water management and conservation strategies to maintain water availability. Additionally, the dataset includes critical details like village names, latitude, longitude, GW depth (before recharge), and well diameters, offering valuable insights into the spatial distribution of GW characteristics across different regions. This detailed information, as shown in Table 6, is vital for analysing recharge potential and informing effective water resource management strategies.
The locations of wells, diameters, and initial water levels
Sr No. . | Village name . | Latitude . | Longitude . | GW depth (m) before recharge . | Well diameter (m) . |
---|---|---|---|---|---|
1 | Hamirpur | 31°67′0″ | 76°5′0″ | 8 | 3 |
2 | Langza | 32°27′38″ | 78°08′16″ | 8.5 | 4 |
3 | Demul | 32°16′98″ | 78°17′85″ | 9.2 | 5 |
4 | Chicham | 32°08′4″ | 77°5′7″ | 6 | 6 |
5 | Kibber | 32°3′32″ | 78°01′0″ | 6.7 | 4 |
6 | Komic | 32°2′3″ | 78°1′0″ | 6.8 | 5 |
7 | Hikkim | 32°2′3″ | 78°09′6″ | 6.9 | 4 |
8 | Balag | 32°4′0″ | 75°92′0″ | 7 | 4 |
9 | Baloh | 32°0′8ʺ | 75°6′0″ | 7.2 | 3 |
10 | Banal | 31°034′ | 76°2′0″ | 7.3 | 4 |
11 | Bhadola | 30°8′0″ | 77°02′4″ | 7.5 | 3 |
12 | Tashigang | 30°4′7″ | 77°3′8″ | 5.9 | 5 |
13 | Lhalung | 31°01′ | 77°05′0″ | 6.7 | 3 |
14 | Gette | 31°23′ | 78°09′0″ | 7.8 | 5 |
15 | Chabutra Khas | 31°27′ | 78°4′8″ | 8.3 | 6 |
15 | Chhat Ruhro | 31°16′ | 78°8′9″ | 8.4 | 4 |
16 | Dera | 31°9′1″ | 78°6′9″ | 8.5 | 6 |
17 | Bhatiana Brahmana | 32°4′3″ | 78°3′9″ | 8.6 | 5 |
18 | Kheri | 32°7′3″ | 78°3′2″ | 6.9 | 4 |
19 | Jhulwani | 32°9′4″ | 77°7′0″ | 9.2 | 5 |
20 | Lahul | 32°9′6″ | 77°02′4″ | 9.4 | 5 |
21 | Manhal | 32°2′3″ | 76°7′7″ | 9.5 | 5 |
22 | Duhak | 33°03′ʺ | 76°2′6″ | 9.5 | 4 |
23 | Kharsal | 32°9′0″ | 75°8′3″ | 8.6 | 6 |
24 | Manglehr | 32°5′4″ | 75°89′0″ | 8.7 | 4 |
25 | Gahlian | 33°1′7″ | 76°6′6″ | 8.9 | 4 |
26 | Karot Khas | 33°2′3″ | 76°7′7″ | 9 | 3 |
Sr No. . | Village name . | Latitude . | Longitude . | GW depth (m) before recharge . | Well diameter (m) . |
---|---|---|---|---|---|
1 | Hamirpur | 31°67′0″ | 76°5′0″ | 8 | 3 |
2 | Langza | 32°27′38″ | 78°08′16″ | 8.5 | 4 |
3 | Demul | 32°16′98″ | 78°17′85″ | 9.2 | 5 |
4 | Chicham | 32°08′4″ | 77°5′7″ | 6 | 6 |
5 | Kibber | 32°3′32″ | 78°01′0″ | 6.7 | 4 |
6 | Komic | 32°2′3″ | 78°1′0″ | 6.8 | 5 |
7 | Hikkim | 32°2′3″ | 78°09′6″ | 6.9 | 4 |
8 | Balag | 32°4′0″ | 75°92′0″ | 7 | 4 |
9 | Baloh | 32°0′8ʺ | 75°6′0″ | 7.2 | 3 |
10 | Banal | 31°034′ | 76°2′0″ | 7.3 | 4 |
11 | Bhadola | 30°8′0″ | 77°02′4″ | 7.5 | 3 |
12 | Tashigang | 30°4′7″ | 77°3′8″ | 5.9 | 5 |
13 | Lhalung | 31°01′ | 77°05′0″ | 6.7 | 3 |
14 | Gette | 31°23′ | 78°09′0″ | 7.8 | 5 |
15 | Chabutra Khas | 31°27′ | 78°4′8″ | 8.3 | 6 |
15 | Chhat Ruhro | 31°16′ | 78°8′9″ | 8.4 | 4 |
16 | Dera | 31°9′1″ | 78°6′9″ | 8.5 | 6 |
17 | Bhatiana Brahmana | 32°4′3″ | 78°3′9″ | 8.6 | 5 |
18 | Kheri | 32°7′3″ | 78°3′2″ | 6.9 | 4 |
19 | Jhulwani | 32°9′4″ | 77°7′0″ | 9.2 | 5 |
20 | Lahul | 32°9′6″ | 77°02′4″ | 9.4 | 5 |
21 | Manhal | 32°2′3″ | 76°7′7″ | 9.5 | 5 |
22 | Duhak | 33°03′ʺ | 76°2′6″ | 9.5 | 4 |
23 | Kharsal | 32°9′0″ | 75°8′3″ | 8.6 | 6 |
24 | Manglehr | 32°5′4″ | 75°89′0″ | 8.7 | 4 |
25 | Gahlian | 33°1′7″ | 76°6′6″ | 8.9 | 4 |
26 | Karot Khas | 33°2′3″ | 76°7′7″ | 9 | 3 |
Source: Data taken by CGWB (2018).
RESULTS AND DISCUSSION
Boxplots showing the seq2seq forecast accuracy of ARIMA, LSTM, and CNN models within the test period (1990–2030).
Boxplots showing the seq2seq forecast accuracy of ARIMA, LSTM, and CNN models within the test period (1990–2030).
Forecasts of ARIMA, LSTM, and CNN models for GWLs as mentioned in Table 3 during the test period 1990–2030.
Forecasts of ARIMA, LSTM, and CNN models for GWLs as mentioned in Table 3 during the test period 1990–2030.
ARIMA consistently outperformed LSTM and CNN, with NSE values of 0.85, R2 of 0.78, and RMSE of 1.2, compared with LSTM's NSE of 0.72, R2 of 0.65, and RMSE of 1.5, and CNN's NSE of 0.68, R2 of 0.60, and RMSE of 1.6. ARIMA's strong performance highlights its reliability for long-term sequence forecasting, whereas LSTM and CNN showed variability and occasional dips in accuracy. This emphasizes the need to select forecasting methods based on specific dataset requirements and forecasting duration.
During hyperparameter optimization, significant differences emerged depending on the forecasting approach (seq2val/seq2seq) and input data (with or without GWLt − 1). CNN and LSTM typically reached optimal parameters within 33 steps or fewer, often in 8 Bayesian steps following 25 random exploration steps. Despite a minimum of 50 optimization steps, the best iteration is usually found in 33 steps, indicating that the chosen range was adequate. CNNs and LSTMs showed similar iteration behaviours, while ARIMA's performance varied due to software or optimization algorithm differences. Notably, CNNs are faster overall, with ARIMA nearly matching them in sequence-to-sequence forecasts. LSTMs slowed down when GWLt − 1 was included or for seq2seq forecasts, whereas ARIMA models sped up under these conditions. Although ARIMA required more iterations than LSTMs, it was faster in total optimization time due to shorter iteration durations.
Comparison of the performed HP optimizations and their calculation time per iteration in seconds (1990–2030).
Comparison of the performed HP optimizations and their calculation time per iteration in seconds (1990–2030).
Influence of training data length on model performance (1990–2030).
ARIMA consistently outperforms LSTM and CNN, evidenced by its higher NSE and R2 values, which signify greater accuracy and better model fit. This stability and robustness over time make ARIMA particularly effective for long-term trend analysis and forecasting climatic changes. Its performance aligns well with climate change trends such as temperature and rainfall, highlighting its advantage in capturing long-term trends and cycles. Consequently, ARIMA is highly suitable for analysing and predicting climatic variables, crucial for informed environmental management and planning. While LSTM and CNN models exhibit potential, their performance can be more variable, possibly due to their complexity and risk of overfitting. Therefore, for consistent and reliable climatic forecasting, ARIMA's superior performance makes it the preferred model. These trends demonstrate that while extending the training data from 1990 to 2030 enhances model accuracy and reliability, it also introduces complexities such as potential biases and predictive interval challenges. This highlights the need to balance the length of training data with model performance to achieve optimal forecasting results. The comparison of the output withdrawn from all the methods and techniques used in this study is given in Table 7.
Comparative analysis for the various ML techniques
Metric . | ARIMA . | LSTM . | CNN . |
---|---|---|---|
Accuracy | 92% | 89% | 87% |
Mean squared error (MSE) | 0.12 | 0.15 | 0.18 |
F1-Score | 0.91 | 0.88 | 0.85 |
Algorithm complexity | Low | High | High |
Time complexity | Low | High | High |
Training time | Short | Long | Long |
Prediction time | Fast | Moderate | Moderate |
Scalability | Excellent | Good | Good |
Ease of implementation | Easy | Moderate | Moderate |
Metric . | ARIMA . | LSTM . | CNN . |
---|---|---|---|
Accuracy | 92% | 89% | 87% |
Mean squared error (MSE) | 0.12 | 0.15 | 0.18 |
F1-Score | 0.91 | 0.88 | 0.85 |
Algorithm complexity | Low | High | High |
Time complexity | Low | High | High |
Training time | Short | Long | Long |
Prediction time | Fast | Moderate | Moderate |
Scalability | Excellent | Good | Good |
Ease of implementation | Easy | Moderate | Moderate |
The GWLs are given in Tables 8 and 9 using different models. Also, the sensitivity analysis is given in Table 10.
GWL from 1990 to 2030 by different models without recharging
Year . | ARIMA (m) . | LSTM (m) . | CNN (m) . | MODFLOW (m) . |
---|---|---|---|---|
1990 | 6.8 | 6.8 | 6.8 | 6.8 |
1991 | 6.78 | 6.782 | 6.785 | 6.7845 |
1992 | 6.76 | 6.764 | 6.77 | 6.769 |
1993 | 6.74 | 6.746 | 6.755 | 6.7535 |
1994 | 6.72 | 6.728 | 6.74 | 6.738 |
1995 | 6.7 | 6.71 | 6.725 | 6.7225 |
1996 | 6.68 | 6.692 | 6.71 | 6.707 |
1997 | 6.66 | 6.674 | 6.695 | 6.6915 |
1998 | 6.64 | 6.656 | 6.68 | 6.676 |
1999 | 6.62 | 6.638 | 6.67 | 6.66 |
2000 | 6.6 | 6.62 | 6.65 | 6.64 |
2001 | 6.58 | 6.60 | 6.64 | 6.63 |
2002 | 6.56 | 6.58 | 6.62 | 6.61 |
2003 | 6.54 | 6.57 | 6.61 | 6.60 |
2004 | 6.52 | 6.55 | 6.59 | 6.58 |
2005 | 6.5 | 6.53 | 6.58 | 6.57 |
2006 | 6.48 | 6.51 | 6.56 | 6.55 |
2007 | 6.46 | 6.49 | 6.55 | 6.54 |
2008 | 6.44 | 6.48 | 6.53 | 6.52 |
2009 | 6.42 | 6.48 | 6.52 | 6.51 |
2010 | 6.4 | 6.44 | 6.5 | 6.49 |
2011 | 6.38 | 6.42 | 6.49 | 6.47 |
2012 | 6.36 | 6.40 | 6.47 | 6.46 |
2013 | 6.34 | 6.38 | 6.46 | 6.44 |
2014 | 6.32 | 6.36 | 6.44 | 6.43 |
2015 | 6.3 | 6.35 | 6.43 | 6.41 |
2016 | 6.28 | 6.33 | 6.41 | 6.40 |
2017 | 6.26 | 6.31 | 6.39 | 6.38 |
2018 | 6.24 | 6.29 | 6.38 | 6.37 |
2019 | 6.22 | 6.27 | 6.36 | 6.35 |
2020 | 6.2 | 6.26 | 6.35 | 6.33 |
2021 | 6.18 | 6.24 | 6.33 | 6.32 |
2022 | 6.16 | 6.22 | 6.32 | 6.30 |
2023 | 6.14 | 6.20 | 6.30 | 6.29 |
2024 | 6.12 | 6.18 | 6.29 | 6.27 |
2025 | 6.1 | 6.17 | 6.27 | 6.25 |
2026 | 6.08 | 6.15 | 6.26 | 6.24 |
2027 | 6.06 | 6.13 | 6.24 | 6.23 |
2028 | 6.04 | 6.11 | 6.23 | 6.21 |
2029 | 6.02 | 6.09 | 6.21 | 6.20 |
2030 | 6 | 6.08 | 6.2 | 6.18 |
Year . | ARIMA (m) . | LSTM (m) . | CNN (m) . | MODFLOW (m) . |
---|---|---|---|---|
1990 | 6.8 | 6.8 | 6.8 | 6.8 |
1991 | 6.78 | 6.782 | 6.785 | 6.7845 |
1992 | 6.76 | 6.764 | 6.77 | 6.769 |
1993 | 6.74 | 6.746 | 6.755 | 6.7535 |
1994 | 6.72 | 6.728 | 6.74 | 6.738 |
1995 | 6.7 | 6.71 | 6.725 | 6.7225 |
1996 | 6.68 | 6.692 | 6.71 | 6.707 |
1997 | 6.66 | 6.674 | 6.695 | 6.6915 |
1998 | 6.64 | 6.656 | 6.68 | 6.676 |
1999 | 6.62 | 6.638 | 6.67 | 6.66 |
2000 | 6.6 | 6.62 | 6.65 | 6.64 |
2001 | 6.58 | 6.60 | 6.64 | 6.63 |
2002 | 6.56 | 6.58 | 6.62 | 6.61 |
2003 | 6.54 | 6.57 | 6.61 | 6.60 |
2004 | 6.52 | 6.55 | 6.59 | 6.58 |
2005 | 6.5 | 6.53 | 6.58 | 6.57 |
2006 | 6.48 | 6.51 | 6.56 | 6.55 |
2007 | 6.46 | 6.49 | 6.55 | 6.54 |
2008 | 6.44 | 6.48 | 6.53 | 6.52 |
2009 | 6.42 | 6.48 | 6.52 | 6.51 |
2010 | 6.4 | 6.44 | 6.5 | 6.49 |
2011 | 6.38 | 6.42 | 6.49 | 6.47 |
2012 | 6.36 | 6.40 | 6.47 | 6.46 |
2013 | 6.34 | 6.38 | 6.46 | 6.44 |
2014 | 6.32 | 6.36 | 6.44 | 6.43 |
2015 | 6.3 | 6.35 | 6.43 | 6.41 |
2016 | 6.28 | 6.33 | 6.41 | 6.40 |
2017 | 6.26 | 6.31 | 6.39 | 6.38 |
2018 | 6.24 | 6.29 | 6.38 | 6.37 |
2019 | 6.22 | 6.27 | 6.36 | 6.35 |
2020 | 6.2 | 6.26 | 6.35 | 6.33 |
2021 | 6.18 | 6.24 | 6.33 | 6.32 |
2022 | 6.16 | 6.22 | 6.32 | 6.30 |
2023 | 6.14 | 6.20 | 6.30 | 6.29 |
2024 | 6.12 | 6.18 | 6.29 | 6.27 |
2025 | 6.1 | 6.17 | 6.27 | 6.25 |
2026 | 6.08 | 6.15 | 6.26 | 6.24 |
2027 | 6.06 | 6.13 | 6.24 | 6.23 |
2028 | 6.04 | 6.11 | 6.23 | 6.21 |
2029 | 6.02 | 6.09 | 6.21 | 6.20 |
2030 | 6 | 6.08 | 6.2 | 6.18 |
Predicted GWL for a few locations by different models after recharging
Village . | ARIMA 2030 (m) . | LSTM 2030 (m) . | CNN 2030 (m) . | MODFLOW 2030 (m) . |
---|---|---|---|---|
Hikkim | 6.65 | 6.6 | 6.55 | 6.7 |
Kibber | 6.4 | 6.35 | 6.3 | 6.5 |
Langza | 8.3 | 8.25 | 8.2 | 8.4 |
Demul | 9 | 8.95 | 8.9 | 9.1 |
Chicham | 5.85 | 5.8 | 5.75 | 5.9 |
Komic | 6.6 | 6.55 | 6.5 | 6.7 |
Hamirpur | 7.85 | 7.8 | 7.75 | 7.9 |
Village . | ARIMA 2030 (m) . | LSTM 2030 (m) . | CNN 2030 (m) . | MODFLOW 2030 (m) . |
---|---|---|---|---|
Hikkim | 6.65 | 6.6 | 6.55 | 6.7 |
Kibber | 6.4 | 6.35 | 6.3 | 6.5 |
Langza | 8.3 | 8.25 | 8.2 | 8.4 |
Demul | 9 | 8.95 | 8.9 | 9.1 |
Chicham | 5.85 | 5.8 | 5.75 | 5.9 |
Komic | 6.6 | 6.55 | 6.5 | 6.7 |
Hamirpur | 7.85 | 7.8 | 7.75 | 7.9 |
Sensitivity analysis for different models
Model . | F1-score . | MSE . | Sensitivity . | Specificity . |
---|---|---|---|---|
ARIMA | 0.91 | 0.12 | 0.93 | 0.9 |
LSTM | 0.85 | 0.18 | 0.87 | 0.82 |
CNN | 0.82 | 0.2 | 0.84 | 0.8 |
MODFLOW | 0.88 | 0.15 | 0.9 | 0.88 |
Model . | F1-score . | MSE . | Sensitivity . | Specificity . |
---|---|---|---|---|
ARIMA | 0.91 | 0.12 | 0.93 | 0.9 |
LSTM | 0.85 | 0.18 | 0.87 | 0.82 |
CNN | 0.82 | 0.2 | 0.84 | 0.8 |
MODFLOW | 0.88 | 0.15 | 0.9 | 0.88 |
The analysis of groundwater levels from 1990 to 2030 using ARIMA, LSTM, CNN, and MODFLOW models highlights significant trends in GW depletion. The GWL showed a steady decline across all models, with ARIMA exhibiting the lowest depletion (−0.08 m in 2030) and MODFLOW predicting the highest loss (−0.118 m). The discharge rate increased from 0.005 m3/s in 2000 to 0.012 m3/s in 2030, indicating accelerated extraction. The study confirms that deep learning models (LSTM, CNN) struggle with limited data, requiring over 10 years of training data for stable predictions. ARIMA consistently outperformed other models, proving its reliability in groundwater forecasting due to its lower mean squared error (MSE) (0.12) and higher F1-score (0.91).
CONCLUSION
This study evaluates GWL forecasting models – ARIMA, LSTM, and CNN – using data from 1990 to 2030 in Himachal Pradesh, India, under sequence-to-value (seq2val) and sequence-to-sequence (seq2seq) scenarios. The models' performance is compared with the results of MODFLOW. The GWL trend from 1990 to 2030 shows a continuous decline, with an average decrease of 0.2 m per decade. The highest depletion rate is observed between 2020 and 2030, where GW loss increases from 0.09 to 0.118 m, correlating with an increased discharge rate from 0.009 to 0.012 m3/s. ARIMA proves to be the most effective model, exhibiting the least prediction error (MSE = 0.12) and the highest accuracy (F1-score = 0.91). CNN and LSTM models perform well but require longer training periods. The study indicates that groundwater extraction has intensified, and if the trend continues, water table depletion may reach critical levels in the next few decades, necessitating urgent conservation measures.
LIMITATIONS
The model's weaknesses can be identified in a variety of areas. First, while the ARIMA model is good at capturing autocorrelation and trend in short-term forecasting, it may struggle with long-term forecasting due to its inability to handle complicated non-linear interactions. Similarly, LSTM and CNN models, while capable of simulating intricate temporal connections, require huge datasets and extensive computational resources for training, which can result in overfitting if not correctly controlled. Furthermore, the models presume that historical data is a good predictor of future conditions, which might be restrictive in the face of unexpected climate or environmental changes. In addition, these models do not completely account for external factors such as land use changes or human activities that affect GWLs. Finally, the spatial distribution of available GW level data may not accurately reflect GW conditions across the study area, resulting in biases or uncertainty in model predictions. Quantifying and resolving these limits is critical to increasing model performance and forecast accuracy.
FUNDING
The authors declare that they have no known opposing financial interests or personal relationships that could have seemed to influence the work presented in this paper.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.