ABSTRACT
The Mid-Gangetic Plain, a vital farmland in India, faces increasing groundwater quality deterioration due to anthropogenic activities. This study aimed to assess groundwater quality and contamination sources in the region utilizing statistical methods. A total of 78 groundwater samples were collected and analyzed using standard methods. The hydrochemistry analysis of samples revealed that several parameters such as Ca2+, Mg2+, HCO3−, NO3−, F− and PO43− surpassed the limits prescribed by the Bureau of Indian Standards (BIS). The principal component analysis yielded three significant factors, explaining 68.96% variation, highlighting geogenic and anthropogenic influences on groundwater chemistry. Hierarchical cluster analysis categorized groundwater into three clusters based on the parameters with similar trends of variation. Furthermore, discriminant analysis identified four significant variables (Mg2+, F−, Cl− and NO3−) responsible for creating the distinction among the identified clusters. Hydrogeochemical categorization and multivariate statistical analyses indicated that rock–water interaction, weathering, leaching and anthropogenic activities collectively influenced groundwater quality throughout the studied region. The Water Quality Index reveals that 59% of samples have good water quality, while 41% exhibit poor quality predominantly concentrated in the south-western, south-eastern and central regions. This study demonstrates the efficacy of statistical techniques to interpret complex datasets and grasp water quality dynamics, enhancing groundwater management.
HIGHLIGHTS
Statistical analysis revealed that geogenic and anthropogenic factors significantly influenced the groundwater chemistry.
Discriminant analysis identified Mg2+, F−, Cl− and NO3− as key variables in shaping distinct water quality clusters.
The WQI revealed that 59% of samples have good water quality, while 41% exhibit poor quality predominantly concentrated in the south-western, south-eastern, and central regions.
ABBREVIATIONS
INTRODUCTION
Groundwater (GW) is widely recognized as an invaluable global resource that serves a multitude of applications. It is an essential part of the hydrological cycle and provides a steady supply of fresh water for agricultural, industrial and domestic use. Yet a range of human and natural factors can degrade and pollute GW. According to a 2019 report by the WHO/UNICEF Joint Monitoring Program for Water Supply, global access to safe drinking water increased from 61 to 71% between 2000 and 2017. However, 2.2 billion people still lacked safe drinking water, with 785 million lacking basic services. In India, with a population of 1.4 billion, 35 million individuals do not have access to clean drinking water, and 678 million lack access to proper sanitation facilities (Water.org 2023). India is the largest consumer of GW globally, using approximately 251 billion cubic meters annually, which accounts for more than one-quarter of the global total. Unregulated usage has led to dried-up wells and declining water levels (Niti Aayog 2021). GW is predominantly used in the agricultural sector (87%), followed by the domestic (10%) and industrial (3%) sectors (Alam et al. 2024). GW is the predominant source of potable water in most rural regions of India, where access to a public water supply or a reliable supply of clean drinking water is absent or limited. Furthermore, inhabitants of metropolitan areas are increasingly dependent on GW because of the inconsistent and insufficient availability of other water resources (Sreedhar et al. 2019). The Gangetic plains in India are currently facing significant threats from GW contamination due to both natural and anthropogenic inputs (Mandal et al. 2019; Mukherjee et al. 2020).
The decline in GW quality is a result of increasing GW withdrawals, water constraints, rapid development, extensive industrialization and urbanization (Alam & Kumar 2023; Kumar et al. 2019a, 2019b, 2023). Numerous geogenic processes, such as bedrock degradation and mineral dissolution, have been associated with the degradation of GW quality. Along with geogenic processes, agricultural activities, and industrial and municipal discharge also influenced the GW (Maity et al. 2020; Alam & Singh 2022). These activities lead to the introduction of a variety of trace elements and pollutants such as nitrate, fluoride and arsenic and have a direct effect on the physicochemical characteristics of GW (Chen et al. 2017; Kumar et al. 2022). Due to their potential to harm human health, these pollutants represent a serious threat. Consequently, quantifying the extent of GW pollution necessitates a comprehensive assessment.
The Water Quality Index (WQI) is a well-acknowledged numerical indicator that provides a broad evaluation of water quality from the physicochemical properties of water. Owing to its versatility, adaptability and statistical compactness, it facilitates the interpretation of complex water quality data and has gained significant interest among researchers (Alam & Singh 2022; Hinge et al. 2022; Gad et al. 2023). The emergence of GIS has provided a valuable tool for precise water quality mapping. In previous studies, several researchers have assessed GW quality under various conditions by employing GIS (Nas & Berktay 2010; Marko et al. 2014; Assiuti & Governorate 2020; Gad et al. 2023). The integration of the WQI with GIS enables the assessment of water quality in inaccessible or geographically challenging areas: the water quality conditions at neighboring gauged sites are used to estimate the water quality in ungauged regions (Alam et al. 2024). Water quality analysis has been the subject of several studies in which multivariate statistical approaches such as principal component analysis (PCA) and hierarchical cluster analysis (HCA) have been used. These methods have been effectively utilized in numerous recent studies to investigate the factors contributing to water pollution and to gain a comprehensive understanding of water quality dynamics (Kumar et al. 2018; Das et al. 2019; Sreedhar et al. 2019; Govind et al. 2021; Panghal & Bhateria 2021; Hinge et al. 2022; Mishra & Lal 2023). In addition to HCA, discriminant analysis (DA) is applied in the current study to check the reliability of HCA. DA is a statistical technique that identifies patterns in a complex dataset and determines the variables most responsible for the differences between groups.
Very few studies have applied DA to determine the most influential parameters responsible for cluster pattern formation (Ajorlo et al. 2013; Yang et al. 2016; Chen et al. 2018; Masood et al. 2022). A study conducted by Alam & Singh (2023) in the Gaya district of south Bihar utilized PCA and HCA for source apportionment of contamination in a regional aquifer. The authors found that over-exploitation of GW triggered geogenic processes that raised fluoride levels across the region. Govind et al. (2021) conducted a study in the Arwal and Jehanabad districts of south Bihar to check the suitability of GW for agriculture and human consumption. They used the WQI to determine the suitability of GW for consumption and PCA to determine the probable sources of contamination. They concluded that a significant part of the Mid-Gangetic plains in south Bihar has fluoride levels above the prescribed limits and that the GW of the region is adversely affected by lithological as well as anthropogenic sources.
RESEARCH SIGNIFICANCE
The selected study area lies in the Mid-Gangetic Plain of south Bihar which is considered as a significant farmland in India. As per the census (2011), a population of about 2.5 million lives within the study area and GW is the sole supply of drinking water for this population. The water quality in the research region may have been adversely affected because of inadequate sewage drainage infrastructure, significant agricultural practices, and over-exploitation of GW resources. People have been greatly impacted by GW contamination. Based on the information provided in an earlier section we identified that one significant research gap is the absence of a baseline assessment of GW quality in the study area. The hydrochemistry of GW in the Nalanda district has not been reported and analyzed using statistical methods, and there has been no effort made to identify the potential contamination sources. Along with baseline assessment, understanding the sources and causes of GW contamination is crucial for effective management and mitigation strategies. A comprehensive study to establish the status of GW quality, identification of pollution patterns and pollution source characterization is essential.
This paper aims to fill the identified research gap. The objective of the current research is to evaluate the status of GW quality and check its suitability for human consumption by utilizing the WQI. It also seeks to identify potential contamination sources and factors affecting GW quality variation using multivariate statistical techniques, while determining GW quality distribution patterns through GIS analysis. This assessment offers vital insights for planners, authorities, and policymakers regarding current GW quality for drinking purposes, aiding in the development of effective management strategies applicable not only to our specific region but also to similar geographic areas facing similar challenges worldwide. It also demonstrates the innovative use of multivariate statistical techniques to interpret complex data and understand water quality dynamics.
MATERIALS AND METHODS
Description and hydrogeological settings of the study area
Water sample collection and analysis
To choose the sampling locations, a systematic grab sampling strategy was employed by dividing the entire research area into 5 × 5 km grids (Kumar et al. 2018). After that, the selection of sampling locations within each grid was done in a manner that ensured representation of the entire area and the livelihoods associated with it. A total of 78 samples (n = 78) of GW were collected across different marked sampling locations during the month of June 2022 in the pre-monsoon season in Nalanda district, India (Figure 1). The entire set of samples was acquired from unconfined subsurface sources, comprising bore wells and hand pumps with depths varying between 5 and 25 m throughout the study area. High-density polyethylene screw-capped bottles of capacity 1 L were pre-sterilized and cleaned with deionized water and were utilized to collect GW samples. Prior to sampling, the source of sampling underwent flushing for 3–5 min to remove any stagnated water, and the sampling bottles were rinsed twice using source water to avoid any possible outer contamination. Immediately following sampling, parameters including pH and electrical conductivity (EC) were promptly assessed in situ utilizing a Thermo-Scientific Orion Star A2140 (S.no-X34633) multiparameter kit. The obtained samples were then passed through filtration using a 0.45-micron membrane syringe filter. A portion of filtered samples underwent acidification to avoid any possible wall deposition to preserve the chemical integrity of the water sample and prevent reactions that occur during the storage (APHA 2017; Kumar et al. 2019a, 2019b). To avoid matrix decomposition, the non-acidified samples were stored away from light at 4 °C, until the assessment was finished (APHA 2017). 
The samples of GW were examined for their physiochemical characteristics, such as pH, EC, total dissolved solids (TDS), total hardness (TH), total alkalinity (TA), dissolved oxygen (DO), calcium (Ca2+), magnesium (Mg2+), sulfate (SO42−), phosphate (PO43−), nitrate (NO3−), fluoride (F−), chloride (Cl−), and bicarbonate (HCO3−). The quantification of Ca2+ and Mg2+ in the samples was assessed through the conventional EDTA titrimetric methods, as outlined in APHA (2017) guidelines. Chloride levels were determined using the argentometric method and bicarbonate was determined through titration method. Sulfate and nitrate levels were determined utilizing a UV-spectrophotometer (Thermo-Scientific Evolution-201, S.no-V05044). The concentration level of fluoride was quantified via potentiometric analysis using an ion selective electrode (Thermo-scientific, electrode Sno. UQ1-10725) (Kumar & Kumar 2015). The American Public Health Association's recommendations for the study of GW samples were followed in conducting the quantification and analysis of these parameters. These techniques guarantee the correctness, dependability, and comparability of the results.
Data analysis and methodology
Principal component analysis
Multivariate statistical methods are beneficial for establishing relationships and finding links among extensive and diverse datasets. PCA is a statistical approach employed to reduce dimensions (Jolliffe & Cadima 2016; Hinge et al. 2022). PCA reduces numerous potentially linked variables into a reduced set of linearly uncorrelated variables referred to as principal components (PC). This transformation preserves the majority of information included in the raw dataset. Following PCA, the subsequent procedure is factor analysis (FA), which aims to further optimize the effectiveness of the data acquired via PCA. In PCA, weighted observable variables are combined linearly; the weights are established using eigenvalue calculations, and only components having eigenvalues higher than 1 are retained (Alam & Singh 2023).
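The retention rule described above can be illustrated with a minimal sketch, assuming NumPy is available and that PCA is run on the correlation matrix of standardized parameters. The function name `pca_kaiser` and the synthetic `demo` matrix are hypothetical and are not the study's data; the sketch returns unrotated loadings and does not reproduce the rotation step implied by the varifactors reported later.

```python
import numpy as np

def pca_kaiser(data):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1."""
    X = np.asarray(data, dtype=float)
    # Standardize each parameter so that different units become comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                          # eigenvalue-greater-than-1 rule
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = 100.0 * eigvals[keep] / eigvals.sum()
    scores = Z @ eigvecs[:, keep]                 # component scores per sample
    return eigvals[keep], loadings, explained, scores

# Hypothetical data: 78 samples, three correlated parameters plus one independent.
rng = np.random.default_rng(0)
base = rng.normal(size=(78, 1))
demo = np.hstack([base + 0.1 * rng.normal(size=(78, 1)) for _ in range(3)]
                 + [rng.normal(size=(78, 1))])
eigvals, loadings, explained, scores = pca_kaiser(demo)
```

For a correlation-matrix PCA, the squared loadings of each retained component sum to its eigenvalue, which offers a quick sanity check on any reported loading table.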
Hierarchical cluster analysis
HCA is a data mining method which groups the cases or observations on the basis of similarities in measurable variables. Its goal is to find natural clusters or groups among a set of information. HCA is a popular way of clustering water quality characteristics in GW. This approach divides variables into clusters based on their commonalities, with each cluster reflecting a distinct process inside the system. The technique operates on the principle of building a binary data tree, gradually merging similar point groups to achieve significant homogeneity within clusters and noticeable heterogeneity between clusters (Zhou et al. 2007). By finding these clusters, we gain understanding of the underlying causes of trends and associations in the data being examined (Sreedhar et al. 2019). In the current study, HCA was conducted on the experimental data with the aid of Ward's linkage technique. The results are depicted in a 2-D plot known as a dendrogram which displays the relationships between groups and their closeness. This dendrogram offers a simplified representation of the dimensionality of the original data.
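The merging principle behind Ward's linkage can be sketched in a few lines of plain Python. This is a naive illustration only, not the software routine used in the study: the function name `ward_clusters` and the six two-dimensional `demo_points` are hypothetical, and the search over all cluster pairs is far slower than a production implementation.

```python
def ward_clusters(points, n_clusters):
    """Naive agglomerative clustering with Ward's criterion (illustration only)."""
    clusters = [[i] for i in range(len(points))]   # start with singletons

    def centroid(members):
        dim = len(points[0])
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def ward_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]    # merge the cheapest pair
        del clusters[j]
    return clusters

# Hypothetical standardized observations forming three obvious groups.
demo_points = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (10.0, 0.0), (10.1, 0.2)]
demo_clusters = ward_clusters(demo_points, 3)
```

Recording the merge cost at each step would give the heights needed to draw a dendrogram.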
Discriminant analysis
WQI for drinking
Geostatistical analysis
The present research used a geostatistical model to examine and depict the spatial distribution pattern of water quality within the designated study region (Thomas 2023). The ordinary kriging method with log transformation was utilized with several semivariogram models, namely Circular, Spherical, Exponential and Gaussian (Nas & Berktay 2010). These models were applied to the GW quality dataset with the objective of identifying the most appropriate model for assessing spatial variability in relation to the WQI. Kriging, being a stochastic technique, relies on a statistical model to analyze the data. Spatial interpolation by kriging also yields standard errors, which quantify the uncertainty associated with the predicted values. Kriging methods perform best when the data approximate a normal distribution. Transformations were therefore applied to achieve a normal distribution and to comply with the condition of equal variability of the data. Prediction performance was assessed through cross-validation, enabling the selection of the model that yields the most precise predictions. For a model to predict accurately, the mean standardized error (MS) should be close to 0. Moreover, both the root-mean-square error (RMSE) and the average standard error (ASE) should be as small as possible, particularly when comparing models (Omran 2012). Furthermore, the root-mean-square standardized error (RMSS) should ideally approach 1 (Johnston et al. 2001; Hossain et al. 2020).
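As an illustration of how these cross-validation statistics relate to one another, the following sketch computes them from observed values, cross-validated predictions and kriging standard errors. The formulas follow the definitions commonly used in geostatistical software and are assumed here rather than taken from the study; the function name `kriging_cv_metrics` and the four demonstration values are hypothetical.

```python
import math

def kriging_cv_metrics(observed, predicted, std_errors):
    """Cross-validation summary statistics for a kriging model."""
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "mean": sum(errors) / n,                                   # mean error
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),         # root-mean-square error
        "ASE": math.sqrt(sum(s * s for s in std_errors) / n),      # average standard error
        "MS": sum(standardized) / n,                               # mean standardized error
        "RMSS": math.sqrt(sum(z * z for z in standardized) / n),   # RMS standardized error
    }

# Hypothetical WQI values at four validation points.
obs = [40.0, 55.0, 62.0, 48.0]
pred = [42.0, 53.0, 60.0, 50.0]
se = [2.0, 2.0, 2.0, 2.0]
metrics = kriging_cv_metrics(obs, pred, se)
```

In this toy case the errors cancel and match the stated standard errors, so MS is 0 and RMSS is 1, which is the ideal pattern described above.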
RESULTS AND DISCUSSION
General hydrochemistry
Table 1 presents the descriptive statistics of the physicochemical characteristics of the 78 GW samples collected in Nalanda district. The analysis confirms a notable variation in physicochemical characteristics across the study area. The recommended pH range of water fit for human consumption is 6.5–8.5 (BIS 2012). In all samples, pH was within the recommended BIS limits. The GW samples collected in the Nalanda district were predominantly alkaline in nature. EC is a quantitative measure of water's capacity to carry electric current (Alam et al. 2024). It depends on the quantity of ions in the water, which conduct electricity, and is a general indicator of overall salinity. A significant disparity in EC was detected across the research area. TDS offers a comprehensive evaluation of water quality, encompassing a broader range of dissolved substances and providing more detailed insights than EC (Azhdarpoor et al. 2019). The TDS levels in the samples ranged from 188.20 to 1,856 mg/L, with an average concentration of 411.89 mg/L. Most of the samples were found to exceed the desirable TDS concentration. Following the classification of Subba Rao (2017), only 10% of the samples in the present study are of the very fresh water type, 87% are of the fresh water type and only 4% are brackish in nature. Alkalinity describes the ability of water to neutralize acid. The primary causes of alkalinity in GW are carbonates, bicarbonates, hydroxides, and other natural components (Panghal & Bhateria 2021). The water samples exhibited a total alkalinity range of 22.935 to 69.570 mg/L, with a mean value of 37.470 mg/L. According to BIS, the acceptable limit for TH is 200 mg/L, whereas the permissible limit is 600 mg/L. It was observed that 96% of the samples were below the permissible limit.
Hardness above 250 mg/L may cause kidney stones, and consumption of water with excessive hardness may lead to stomach-related diseases and, eventually, permanent damage to the stomach (Patil & Patil 2010).
Parameter | Min. | Max. | Mean | Std. deviation | WHO (2017) | BIS IS 10500 (2012) |
---|---|---|---|---|---|---|
pH | 6.74 | 7.740 | 7.294 | 0.199 | 6.5–8.5 | 6.5–8.5 |
EC | 44.50 | 3,790 | 776.459 | 490.420 | – | 750 |
TDS | 188.20 | 1,856 | 411.895 | 233.297 | 600–1,000 | 500 |
TA | 22.935 | 69.570 | 37.470 | 8.290 | 200 | 200 |
TH | 151.965 | 782.85 | 338.586 | 107.973 | 200 | 200 |
DO | 1.170 | 7.74 | 2.428 | 0.945 | 6 | 5 |
Ca2+ | 59.865 | 419.05 | 167.610 | 65.371 | 75 | 75 |
Mg2+ | 4.605 | 409.84 | 170.975 | 77.822 | 50 | 30 |
F− | 0.180 | 1.5 | 0.519 | 0.287 | 1.5 | 1–1.5 |
Cl− | 12.009 | 754.146 | 56.256 | 87.416 | 200 | 250 |
NO3− | 0.018 | 69.77 | 12.641 | 17.929 | 50 | 45 |
SO42− | 10.615 | 172.054 | 47.579 | 28.305 | 250 | 200 |
PO43− | 0.089 | 0.67 | 0.146 | 0.094 | – | 0.1 |
HCO3− | 229.350 | 695.695 | 374.703 | 82.904 | 300 | 300 |
Note: All parameters are in mg/L except pH (unitless) and EC (μS/cm).
Major ions
In the present study, the cationic dominance follows the sequence Ca2+> Mg2+, whereas the anionic dominance is in the sequence HCO3−> Cl−> SO42−> NO3−> F−> PO43−. The abundance of calcium and magnesium represents the property of freshwater systems. Both cations are major contributors to the hardness of water. The recommended BIS value for calcium is 75–200 mg/L, and for magnesium, it is 30–100 mg/L. The calcium concentration in 85% of the samples was found in the range of BIS prescribed limits, while the magnesium concentration in 87% of the samples was found above the BIS limiting value. Long-term intake of water excessively enriched with calcium and magnesium may lead to cardiovascular illness, diarrhea, and retarded growth in children (Mandal et al. 2019; Mukherjee et al. 2020).
Elevated chloride levels act as a pollution tracer in GW (Sadat-Noori et al. 2014). High chloride intake may cause hypertension, have a laxative effect, and affect the metabolism of the human body (Ramakrishnaiah et al. 2009; Adimalla & Qian 2019). Chloride concentrations in the samples ranged from 12 to 754.15 mg/L, with an average value of 56.256 mg/L. Except for one sample with a concentration of 754.15 mg/L, all the samples were below the permissible BIS level of 250 mg/L. The SO42− concentration in all the samples was within the BIS acceptable limit (200 mg/L). The phosphate content in 60% of the samples was near the BIS limit (0.1 mg/L), while 40% of the samples exceeded the acceptable limit set by the BIS. The elevated levels of phosphate in GW may be attributed to phosphate-rich sewage and detergent-containing wastewater coming from community drains (Alam et al. 2024). The bicarbonate concentration in most of the samples surpassed the allowable level set by the BIS, which is 300 mg/L. Bicarbonates contribute to the alkalinity of the water. Bicarbonates in GW generally originate from the weathering of rocks, soil carbon dioxide, and the dissolution of carbonates. The nitrate concentration in the GW samples ranged from 0.018 to 69.77 mg/L, with a mean concentration of 12.641 mg/L. The BIS maximum permissible value of nitrate in drinking water is 45 mg/L with no relaxation. A total of nine samples (12% of the total) were above the BIS permissible limit. Several sources, such as fertilizers, manure applied to the soil, and sewage released from septic tanks, can introduce nitrates into GW. Drinking water with high nitrate levels can raise the risk of blue baby syndrome in infants, stomach cancer and gastroenterological illness, especially in young children (Kumar et al. 2019a, 2019b; Jandu et al. 2021).
Leaching from natural fluoride sources such as fluoride-bearing rocks and sediments may contribute to the presence of fluoride in subsurface water. The amount of fluoride intake determines its impact on human health. Fluoride at the right dose (up to 1 mg/L) helps prevent dental cavities by strengthening tooth enamel, but exposure to a high fluoride dose (above 1.5 mg/L) can lead to dental fluorosis, and prolonged exposure to such doses may promote bone fluorosis (Fordyce 2011; Ahmad et al. 2022). Fluoride concentrations in the present study ranged from 0.18 to 1.50 mg/L, with an average value of 0.519 mg/L. All the samples fell within the permitted limit set by the BIS (1.5 mg/L).
Correlation analysis
Hydrogeochemical processes
Ionic cross-plots (ICPs) are commonly employed graphical tools in GW chemistry that facilitate the analysis and visual representation of GW sample compositions. Because GW chemistry is influenced by the interaction among the hydrogeological components of the aquifer, ICPs are of great importance in understanding the hydrogeochemical aspects of GW, offering valuable insights into the mechanisms governing chemical composition (Das & Kaur 2007; Subba Rao et al. 2019; Malik et al. 2021). The current research uses several ICPs, also called scatterplots, to obtain insights into the hydrogeochemical evolution of GW across the study region.
Assessing the broad chemistry of GW, which is influenced by human interventions in the aquifer, is a complex procedure. For example, the prevalence of HCO3− and Cl− ions in GW is mostly attributed to irrigation return flow. Additionally, sewer effluents from the community serve as sources of HCO3−, Cl−, and NO3− ions. Furthermore, agrochemicals applied to boost crop yields are a common source of SO42− and NO3− ions in GW (Jalali 2009; Li et al. 2016). TDS quantifies the overall content of dissolved salts in GW. Correlating TDS with dissolved ions such as HCO3−, SO42−, Cl−, and NO3−, which are significant pollutants that also enter GW from non-lithological sources like household wastes and agricultural activities, gives insights into their respective origins. In the current study, TDS exhibited a positive correlation (Figure 3) with Cl−, SO42−, NO3−, and HCO3−, which is evidence that anthropogenic inputs have a significant impact on the GW chemistry in the studied region (Subba Rao et al. 2019). A positive correlation between TDS and NO3− + Cl−/HCO3− (r = 0.17504), as shown in Figure 4(f), supports the influence of anthropogenic inputs on GW pollution (Li et al. 2016).
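The correlation coefficients used in such cross-plots can be computed as in the following sketch. Two points are assumptions made for illustration: the ionic ratio is read as (NO3− + Cl−)/HCO3−, with all three concentrations in the same units, and the function names `pearson_r` and `anthropogenic_ratio` as well as the five demonstration samples are hypothetical rather than taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def anthropogenic_ratio(no3, cl, hco3):
    """Assumed reading of the ratio: (NO3- + Cl-) / HCO3-, same units throughout."""
    return (no3 + cl) / hco3

# Hypothetical samples: TDS, NO3-, Cl- and HCO3- concentrations.
tds = [250.0, 400.0, 520.0, 610.0, 900.0]
no3 = [2.0, 10.0, 15.0, 30.0, 60.0]
cl = [20.0, 35.0, 50.0, 80.0, 200.0]
hco3 = [300.0, 340.0, 360.0, 400.0, 450.0]
ratios = [anthropogenic_ratio(n, c, h) for n, c, h in zip(no3, cl, hco3)]
r_value = pearson_r(tds, ratios)
```

A coefficient near +1 indicates that the ratio rises with TDS, whereas a value near 0 indicates a weak linear association.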
KMO and Bartlett's test
The KMO and Bartlett's tests are statistical tests used to assess the appropriateness of a given dataset for PCA/FA. For PCA, KMO values are classified as sufficient between 0.8 and 1, reasonably acceptable between 0.5 and 0.8, and unsatisfactory below 0.5 (Li et al. 2020; Masood et al. 2022). As illustrated in Table 2, for the current dataset, the KMO sampling adequacy value was 0.613, and Bartlett's test of sphericity on the correlation matrix of the parameters was significant, yielding chi-square = 1,023.15 (p < 0.00001, df = 45). The KMO value is within the reasonably acceptable range. Furthermore, the significant Bartlett's test result shows that the variables in the dataset are correlated, which supports the use of PCA to find important trends and information in the data. Also, only parameters having communality >0.5 were considered for PCA.
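For readers who wish to reproduce these adequacy checks, the sketch below computes the overall KMO measure and Bartlett's chi-square from a data matrix using their standard textbook formulas. It assumes NumPy is available; the function name `kmo_and_bartlett` and the synthetic `demo` matrix are hypothetical and do not represent the study's data. With 10 parameters, the degrees of freedom equal 10 × 9 / 2 = 45, in agreement with the value reported above.

```python
import numpy as np

def kmo_and_bartlett(data):
    """Overall KMO measure and Bartlett's test of sphericity (chi-square, df)."""
    X = np.asarray(data, dtype=float)
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    inv_R = np.linalg.inv(R)
    # Partial correlations are obtained from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv_R), np.diag(inv_R)))
    partial = -inv_R / scale
    off = ~np.eye(p, dtype=bool)                  # mask of off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    kmo = r2 / (r2 + p2)
    # Bartlett: chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|, df = p(p - 1) / 2
    _, logdet = np.linalg.slogdet(R)
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return kmo, chi_square, df

# Hypothetical data: 78 samples, 10 parameters driven by two latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(78, 2))
demo = latent @ rng.normal(size=(2, 10)) + rng.normal(size=(78, 10))
kmo, chi_square, df = kmo_and_bartlett(demo)
```

The p-value follows from the upper tail of a chi-square distribution with `df` degrees of freedom, which is omitted here to keep the sketch free of further dependencies.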
Parameter | VF1 | VF2 | VF3 | Communalities |
---|---|---|---|---|
pH | −0.418 | 0.603 | 0.053 | 0.541 |
TDS | 0.810 | 0.400 | 0.065 | 0.819 |
Ca2+ | 0.496 | −0.652 | −0.095 | 0.68 |
Mg2+ | 0.799 | 0.061 | 0.183 | 0.676 |
TH | 0.876 | −0.351 | 0.074 | 0.896 |
F− | 0.161 | 0.731 | −0.468 | 0.779 |
Cl− | 0.087 | 0.313 | 0.865 | 0.855 |
NO3− | 0.699 | −0.160 | −0.096 | 0.523 |
SO42− | 0.617 | 0.266 | −0.221 | 0.501 |
HCO3− | 0.562 | 0.557 | 0.016 | 0.627 |
Eigenvalue | 2.995 | 2.764 | 1.138 | – |
% Variance | 29.946 | 27.645 | 11.376 | – |
Cum. % variance | 29.946 | 57.59 | 68.966 | – |
Validation: KMO = 0.613; Bartlett's test: approx. chi-square = 1,023.15, df = 45, Sig. < 0.00001.
Bold values highlight the set of strongly loaded physicochemical parameters on the different varifactors.
Principal component analysis
VF2, with an eigenvalue of 2.764, explained 27.645% of the overall variation and holds a strong loading of F− (0.731) and moderate loadings of pH (0.603) and HCO3− (0.557). The positive loadings of F− and pH, together with the negative loading of Ca2+, indicate hydrogeochemistry associated with calcite-fluorite equilibria and rock–water interaction. The loadings of HCO3− and pH also suggest the alkaline nature of the GW. VF2 therefore represents a geogenic factor affecting GW quality (Hinge et al. 2022). As depicted in Figure 6(b), which shows the spatial distribution of loadings on VF2, the loadings are scattered over the entire study region, with some concentration in the north-eastern and south-western parts of the study area. A high amount of HCO3− in water promotes alkalinity, which affects human health. Fluoride, which strongly influences VF2, shows elevated levels in the south-western and north-eastern regions. The South Bihar plains are severely affected by high fluoride concentrations, and in Nalanda district fluoride is likewise a major control on the geochemistry. As per the reports of CGWB, the neighboring districts of Nalanda, namely Gaya, Nawada and Aurangabad, have been reported to have high fluoride levels in GW. The current study has also found elevated fluoride concentrations in the GW of the district. Therefore, fluoride has been extracted here as a factor with strong positive loading (Table 2). VF3, with an eigenvalue of 1.138, explained 11.376% of the overall variation and holds a strong loading of Cl− (0.865). VF3 indicates an anthropogenic factor influencing GW quality; the high loading of chloride may be associated with the application of bleaching powder in agricultural activities and with other possible chloride sources such as wastewater and sewage originating from household activities. The loading distribution of VF3 across the research region is shown in Figure 6(c).
The spatial map shows that the south-western region and some parts of the eastern region have high loadings. This indicates that Cl− is also a control on the regional GW chemistry.
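The variance percentages quoted for the varifactors follow directly from the eigenvalues: for a PCA on the correlation matrix of 10 standardized parameters, the total variance equals 10, so each factor explains eigenvalue / 10 of the total. The short check below uses only the eigenvalues reported in Table 2; the variable names are hypothetical, and small differences in the last digit arise because the reported eigenvalues are rounded.

```python
# Eigenvalues of the three retained varifactors, as reported in Table 2.
eigenvalues = {"VF1": 2.995, "VF2": 2.764, "VF3": 1.138}
n_variables = 10  # parameters entering the PCA (rows of the loading table)

# Share of total variance explained by each factor, in percent.
percent_variance = {name: 100.0 * value / n_variables
                    for name, value in eigenvalues.items()}
cumulative_variance = sum(percent_variance.values())

for name, pct in percent_variance.items():
    print(f"{name}: {pct:.2f}% of total variance")
print(f"Cumulative: {cumulative_variance:.2f}%")
```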
Hierarchical cluster analysis
Discriminant analysis
Test of function(s) | Wilks' Lambda | Chi-square | df | Sig. | Eigenvalue | % variance | Canonical correlation |
---|---|---|---|---|---|---|---|
1 | 0.180 | 125.867 | 8 | 0.0001 | 3.073 | 89.5 | 0.869 |
2 | 0.735 | 22.654 | 3 | 0.0001 | 0.361 | 10.5 | 0.515 |
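The discriminant function statistics reported above are linked by standard relations, which the following sketch uses as a consistency check. It assumes 78 samples, four discriminating variables and three clusters, and it uses Bartlett's chi-square approximation for Wilks' Lambda; the function and variable names are hypothetical.

```python
import math

eigen = [3.073, 0.361]             # eigenvalues of DF1 and DF2 as reported
n_samples, n_vars, n_groups = 78, 4, 3

def wilks_lambda(eigs, start):
    """Wilks' Lambda for the test of functions start..end."""
    lam = 1.0
    for value in eigs[start:]:
        lam *= 1.0 / (1.0 + value)
    return lam

def bartlett_chi_square(lam):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(Lambda)."""
    return -(n_samples - 1 - (n_vars + n_groups) / 2.0) * math.log(lam)

wilks = [wilks_lambda(eigen, k) for k in range(len(eigen))]
chi_sq = [bartlett_chi_square(lam) for lam in wilks]
# Degrees of freedom for the test of functions k..end: (p - k) * (g - 1 - k).
dfs = [(n_vars - k) * (n_groups - 1 - k) for k in range(len(eigen))]
canonical = [math.sqrt(v / (1.0 + v)) for v in eigen]
pct_variance = [100.0 * v / sum(eigen) for v in eigen]
```

Under these assumptions the computed values agree with the tabulated Wilks' Lambda, chi-square, df, canonical correlation and percentage of variance to within rounding.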
Parameters . | Functiona . | Functionb . | ||
---|---|---|---|---|
DF1 . | DF2 . | DF1 . | DF2 . | |
Mg2+ | 0.014 | −0.004 | 0.759 | −0.208 |
F− | 1.741 | 3.173 | 0.450 | 0.820 |
NO3− | 0.066 | −0.021 | 0.706 | −0.221 |
Cl− | 0.007 | 0.008 | 0.540 | 0.593 |
(Constant) | −4.469 | −1.147 | – | − |
a Unstandardized coefficients.
b Standardized coefficients.
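The unstandardized coefficients define each discriminant function as a constant plus a weighted sum of the raw concentrations. The sketch below evaluates DF1 and DF2 for one sample. The sample values are hypothetical, and the concentrations are assumed to be in mg/L, which this section does not state. Assigning the sample to a cluster would additionally require the group centroids, which are not reported here.

```python
# Unstandardized coefficients from the table above.
UNSTANDARDIZED = {
    "DF1": {"const": -4.469, "Mg": 0.014, "F": 1.741, "NO3": 0.066, "Cl": 0.007},
    "DF2": {"const": -1.147, "Mg": -0.004, "F": 3.173, "NO3": -0.021, "Cl": 0.008},
}


def discriminant_scores(sample: dict[str, float]) -> dict[str, float]:
    """Return the DF1 and DF2 scores: constant + sum(coefficient * concentration)."""
    scores = {}
    for name, coef in UNSTANDARDIZED.items():
        score = coef["const"]
        for parameter, value in sample.items():
            score += coef[parameter] * value
        scores[name] = score
    return scores


# Hypothetical sample (assumed units: mg/L); not taken from the study data.
sample = {"Mg": 40.0, "F": 1.0, "NO3": 30.0, "Cl": 50.0}
scores = discriminant_scores(sample)
print({name: round(value, 3) for name, value in scores.items()})
```

Because the F− coefficients are by far the largest in both functions, a change of 1 mg/L in fluoride shifts the scores much more than the same change in any other parameter.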
GW quality distribution across the study area
In the current investigation, the WQI obtained using Equation (2) for the GW samples ranged from 35.67 to 96.79. Based on its WQI value, the water at each site can be categorized as excellent, good, poor, very poor, or unsuitable (Ram et al. 2021; Mishra & Lal 2023). The WQI ranges and the resulting categorization are summarized in Table 5. Of the total samples, 59% are classified as having good water quality and are suitable for human consumption, 35% are classified as having poor water quality, and 6% are classified as having very poor water quality and are not safe for consumption.
WQI value | Water quality status | No. of samples | Samples (%) |
---|---|---|---|
0–25 | Excellent | – | – |
25–50 | Good | 46 | 59 |
50–75 | Poor | 27 | 35 |
75–100 | Very poor | 5 | 6 |
>100 | Unsuitable | – | – |
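The classification in Table 5 can be written as a simple lookup. The sketch below is illustrative only: the table ranges share their end points, so it assumes that a value lying exactly on a boundary is assigned to the better class (upper bound inclusive). It also recomputes the reported percentages from the sample counts.

```python
# Upper bounds of the WQI classes in Table 5.
# Assumption: a boundary value (e.g. 50.0) belongs to the better class.
WQI_CLASSES = [
    (25.0, "Excellent"),
    (50.0, "Good"),
    (75.0, "Poor"),
    (100.0, "Very poor"),
]


def classify_wqi(wqi: float) -> str:
    """Map a WQI value to its water quality status."""
    for upper, label in WQI_CLASSES:
        if wqi <= upper:
            return label
    return "Unsuitable"


def class_percentages(counts: dict[str, int]) -> dict[str, int]:
    """Convert per-class sample counts to whole-number percentages."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Sample counts reported in Table 5 (78 samples in total).
reported_counts = {"Good": 46, "Poor": 27, "Very poor": 5}
percentages = class_percentages(reported_counts)

print(classify_wqi(35.67), "|", classify_wqi(96.79))
print(percentages)
```

The lowest and highest WQI values reported in the text fall in the good and very poor classes respectively, consistent with no sample being classed as excellent or unsuitable.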
Parameter | Model | Mean error | ASE | MS | RMSS | RMSE |
---|---|---|---|---|---|---|
GWQI | Circular | 0.220 | 10.91 | 0.0043 | 1.051 | 12.390 |
GWQI | Spherical | 0.198 | 10.93 | 0.0024 | 1.046 | 13.389 |
GWQI | Exponential | 0.168 | 10.91 | 0.0043 | 1.042 | 12.386 |
GWQI | Gaussian a | 0.216 | 10.90 | 0.0040 | 1.030 | 12.323 |
a Best-fitted model among all.
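This section does not state the criteria used to select the best-fitted semivariogram model. The sketch below assumes two commonly used cross-validation rules: the root-mean-square error (RMSE) should be as small as possible, and the root-mean-square standardized error (RMSS) should be as close to 1 as possible. The dictionary layout and variable names are illustrative.

```python
# Cross-validation statistics from the table above.
models = {
    "Circular": {"mean": 0.220, "ase": 10.91, "ms": 0.0043, "rmss": 1.051, "rmse": 12.390},
    "Spherical": {"mean": 0.198, "ase": 10.93, "ms": 0.0024, "rmss": 1.046, "rmse": 13.389},
    "Exponential": {"mean": 0.168, "ase": 10.91, "ms": 0.0043, "rmss": 1.042, "rmse": 12.386},
    "Gaussian": {"mean": 0.216, "ase": 10.90, "ms": 0.0040, "rmss": 1.030, "rmse": 12.323},
}

# Assumed rule 1: smallest root-mean-square error.
best_by_rmse = min(models, key=lambda name: models[name]["rmse"])

# Assumed rule 2: standardized RMS error closest to 1.
best_by_rmss = min(models, key=lambda name: abs(models[name]["rmss"] - 1.0))

print("Lowest RMSE:", best_by_rmse)
print("RMSS closest to 1:", best_by_rmss)
```

Under both assumed rules the Gaussian model ranks first, which agrees with the model marked as best fitted in the table.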
The spatial distribution map of the WQI was created by Ordinary Kriging with the Gaussian model using the Spatial Analyst tool in ArcGIS 10.8 and is shown in Figure 9(c). The map was classified according to Table 5. Water quality in the north-eastern region is good, indicating potable GW, whereas major parts of the south-western, south-eastern, and central regions of the study area have poor water quality. This could be a consequence of inadequate handling of household waste, intensive agricultural practices, and industrial waste in the region. Some scattered parts of the study region fall under very poor water quality, indicating highly polluted GW that is unfit for consumption.
CONCLUSION
The study, conducted in Nalanda, Bihar, assessed GW quality using physicochemical parameters from 78 samples across the study area, combining the WQI, geostatistics, hydrogeochemical characterization and multivariate statistical methods. It revealed the alkaline nature of the GW, with several physicochemical parameters, including hardness, alkalinity, calcium, magnesium, bicarbonate, fluoride, and nitrate, surpassing the limits prescribed by BIS Manual IS 10500 (BIS 2012). The cationic dominance follows the sequence Ca2+ > Mg2+, whereas the anionic dominance follows the sequence HCO3− > Cl− > SO42− > NO3− > F− > PO43−, indicative of freshwater systems. PCA identified three factors that collectively accounted for 68.96% of the total variance, highlighting the influence of bicarbonate, magnesium, and fluoride from geogenic sources and of NO3−, PO43−, Cl−, and SO42− from anthropogenic sources on the GW. HCA classified the GW into three clusters across the region, which was confirmed by DA. Stepwise DA generated two DFs, with four significant variables (Mg2+, F−, Cl− and NO3−) identified as the most important parameters discriminating among the clusters. Hydrogeochemical categorization and multivariate statistical analyses indicated that rock–water interaction, weathering, leaching, and anthropogenic activities collectively influenced GW quality throughout the studied region. The WQI varied between 35.67 and 96.79, categorizing 59% of samples as good, 35% as poor, and 6% as very poor, highlighting the need for treatment before consumption. The geostatistical analysis identified the Gaussian model as the best-fit model for interpolating the WQI.
The spatial distribution map indicated that GW quality is poor in significant portions of the south and central regions of the study area. Remedial measures are required in these regions to safeguard the health of inhabitants. This research provides a foundational framework for GW quality assessment and sets a baseline that allows policymakers to target specific pollutant sources and implement mitigation measures accordingly, which will help to improve public health. It also demonstrates the efficacy of multivariate statistical techniques for interpreting intricate datasets, assessing water quality, and understanding interactions among variables.
AUTHOR CONTRIBUTION
A.K. contributed to conceptualization, formal analysis, investigation, methodology, software, validation, writing – original draft. A.S. contributed to conceptualization, data curation, investigation, methodology, editing and reviewing draft, supervision, validation.
FUNDING
No funding was provided for this work.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.