ABSTRACT
Monitoring groundwater quality (GWQ) is vital for water sustainability, mainly in arid to semi-arid regions with intense agriculture and inadequate irrigation management systems. GWQ is closely linked to groundwater hydrochemistry (GWH). The present work aims to identify the factors controlling GWH in the Chougafiya shallow aquifer (Central Tunisia), which is known for groundwater hypersalinization and intensive agricultural activities under severe climate conditions. Based on data from 30 inventoried wells, the proposed approach combines (i) exploratory factor analysis and (ii) confirmatory factor analysis as a structural equation model (SEM) to detect the most relevant factors involved in GWH and their spatial variability. GWH was found to be governed by Na–Cl and Ca–SO4 water types, resulting from the dissolution of evaporite minerals. The results showed a total explained variance of 86% and allowed the identification of two main factors that control groundwater salinization: (1) a natural factor, mainly attributed to water–rock interactions, and (2) an anthropogenic factor, mainly related to anarchic farming practices. As a confirmatory analysis, the development of the SEM led to the identification of the most important factors that control water mineralization. These results could be helpful for strategic water resources management plans in the studied area.
HIGHLIGHTS
Thirty samples were considered for monitoring of hydrochemistry in agricultural areas.
Groundwater salinization was related to two main origins: natural and anthropogenic.
The EFA-CFA method was helpful for assessing groundwater mineralization with a limited database.
Hierarchical clustering is the most widely applied clustering approach in Earth sciences.
The high nitrate rate in groundwater reflects the high influence of agricultural practices.
INTRODUCTION
Water resource management is a crucial issue for the sustainability of developing countries. Each country therefore needs thorough monitoring to obtain the data needed to forecast continuous changes in the quality of these water resources, especially under arid to semi-arid climates.
The strategic significance of groundwater for the security of water, energy, and food is increasing due to human use and climatic changes. Groundwater hydrochemistry (GWH) in arid and semi-arid zones is characterized by a very high variability over time. This variability is caused primarily by surface and subsurface conditions, such as irregularities in climatological phenomena, and is influenced by topography, soil, and vegetation (Mhamdi et al. 2006; Scanlon et al. 2006; Ben Brahim et al. 2012; Makni et al. 2013; Haj-Amor et al. 2016, 2018; Haj-Amor & Bouri 2019). Moreover, GWH reflects the influence of both natural processes, such as water–rock interaction, and anthropogenic activities (domestic, agricultural, and industrial).
Today, due to the combined effects of the large spatial and temporal variability of precipitation and the intensification of pumping, most of the water bodies located particularly in Central Tunisia show a remarkable decrease in the piezometric level coupled to groundwater hypersalinization, especially in some locations (Farid et al. 2012; Hajji et al. 2018).
Multivariate analysis has been widely used to build comprehensive models of GWH around the world (Kolsi-Hajji et al. 2013; Hajji et al. 2018, 2020; Boughariou et al. 2018, 2021; El-Kholy et al. 2022). It is commonly divided into two phases: (1) exploratory analysis and (2) confirmatory analysis. Factor analysis (FA) and hierarchical cluster analysis (HCA) have been used as components of the exploratory analysis for the assessment of GWH in several studies (Acikel & Ekmekci 2018; Prajapati et al. 2020; Hajji et al. 2021a, 2021b).
In fact, FA is a multivariate statistical method used to identify linear relationships among observed variables by grouping them into unobserved factors. These factors are linear combinations of the original variables that explain variance within the data. FA is often employed to reduce the dimensionality of data and discover underlying structures.
The PCA method is similar to FA but specifically used to explore the correlation structure among several variables. It transforms the original variables into a set of new, uncorrelated variables called principal components, which capture the maximum variance in the original data. PCA is widely used for data visualization and identifying the most significant variables in a dataset.
In the scientific literature, these techniques are frequently employed to explore and interpret complex data in various domains, including social sciences, biology, economics, and psychology. They play a crucial role in reducing data dimensionality, visualizing hidden structures, and discovering relationships between variables, making them powerful tools for quantitative research. When integrating these techniques into an introduction, it is important to emphasize their relevance and specific application to the study domain, as well as their potential contribution to advancing knowledge and solving specific problems.
Furthermore, these methods are powerful tools for exploring data structure, uncovering hidden patterns, and interpreting complex relationships between variables. They are extensively used in scientific research, business data analysis, and other fields requiring a deep understanding of multidimensional data.
Using the hybrid HCA-FA method, Kim et al. (2005) demonstrated that GWQ in the coastal area of Kimje, South Korea, was influenced not only by seawater intrusion but also by chemical fertilizer and microbial activities. Moreover, Chapagain et al. (2010) indicated that the quality of deep groundwater was influenced by the natural hydrogeochemical environment and did not vary significantly with season. Acikel & Ekmekci (2018) demonstrated, using multivariate statistical techniques, that GWQ in the Azmak Spring Zone, Turkey, was partly affected by seasonal fresh surface water contribution.
Generally, FA, including PCA, is one of the most common EFA techniques. It reduces the dimension of a dataset without much information loss and indicates how the variables involved in GWH are associated with several factors of influence. HCA links samples into groups according to the similarity of their variables (Kim et al. 2005).
For the confirmatory analysis, SEM is considered the most powerful tool. SEM has been used to corroborate the grouping of parameters revealed by FA and one-way ANOVA: it yielded good fit indices, confirming that the spatial variation in GWH was mainly controlled by the major elements dissolved in groundwater, and it provides a technique for confirming the exploratory data analysis (Belkhiri & Narany 2015).
Furthermore, Wang et al. (2023) analyzed the relative water scarcity degree across different zones, using the SEM to identify the main factors affecting the water poverty index (WPI) and the contribution rate of each factor.
Moreover, Farid et al. (2015) investigated groundwater geochemistry in the Chougafiya Basin, Tunisia, using isotopes to understand origins, residence times, and controlling factors. Their work highlights significant variations in groundwater salinity and hydrochemical facies influenced by geological formations and tectonic activities. These findings underscore the need for sustainable groundwater management in semi-arid regions grappling with declining water levels and shifting climate conditions.
The hydrochemical study by Farid et al. (2012) revealed that the water composition varies between a chloride–sulfate–calcium type and a chloride–sodium type. The primary geochemical processes contributing to the saline content involve water–rock interaction (specifically the dissolution of evaporite minerals) and cation exchange.
In this case study, reduced rainfall, especially in recent years, has resulted in dams drying up and a low rate of groundwater renewal. The Chougafiya aquifer, located in the Kairouan Plain (Central Tunisia), faces anthropogenic and severe climatic constraints. Over the past 30 years, the aquifer has been increasingly and intensively exploited (Farid 2007), which has degraded groundwater in both quality and quantity. This situation requires an updated hydrodynamic and hydrochemical investigation of the Chougafiya shallow aquifer in order to suggest recommendations concerning the water resources of the study area.
The present paper aims to improve the understanding of GWH behavior using the hybrid EFA-CFA method in agricultural plains under semi-arid climate conditions and human pressure, with a limited database.
MATERIALS AND METHODS
Study area
Location map, wells’ position, and piezometric map in the study area.
The region's climate is semi-arid, with a mean annual precipitation of 292 mm/year, and a mean annual temperature of 21 °C over the period 2007–2021.
Regional geology in the Chougafiya Basin shows that recognized layers in the plain extend from the Triassic to the Quaternary.
Hydrogeological studies (Farid 2007) show that the main aquifer is hosted in the Mio-Plio-Quaternary level. Its thickness is very variable; it ranges from 15 m in the west of the basin (J. Merabtiha) to more than 130 m toward Sebkha el Batten. In addition, this formation shows a large lateral variation of facies and considerable lithological changes. The superficial aquifer shows alternations of conglomerates, gravel, clay, sand, and pebbles, enclosed in a clay matrix in the northern and western parts of the aquifer, with fine-textured sedimentation formed by clays, sandy and silty clays, and clay sands in the rest of the plain.
Piezometric map of the Chougafiya surface water table (2020) and continuous recordings in the piezometers (1), (2), (3), and (4).
Sampling, chemical analysis, and results’ quality assurance
Thirty samples were taken, in accordance with conventional sampling rules, from wells evenly distributed over the study area. The chemical analyses, which aimed to determine major element concentrations, were carried out in the Laboratory of Environmental Bioprocesses at the Sfax Biotechnology Centre (SBC). Ion concentrations were determined by high-performance liquid chromatography (HPLC).
The assessment of GWQ in the Chougafiya aquifer typically begins with the sampling phase, involving the collection of groundwater samples and the determination of physico-chemical parameters. These parameters include in situ measurements, such as pH, TDS (total dissolved solids), and temperature, and laboratory analyses of the major ions dissolved in water (cations and anions), such as Ca2+, Mg2+, Na+, K+, and Cl−, together with the sulfate and nitrate reported in Table 1. In this study, a sampling campaign was conducted at 30 groundwater wells in the Chougafiya aquifer during May and June 2022 (Figure 2). Field equipment included a GPS for the geolocation of wells, a portable multi-function conductivity meter for measuring pH, conductivity, dissolved oxygen, and water temperature, and clean plastic bottles for sample collection. The well name, sampling date, and sample number were recorded. Each plastic bottle was rinsed with water from the corresponding well before sampling, and 1 L of water was collected in each one. This process was repeated for each sample. The samples were then placed in a cooler and transported at 4 °C to the laboratory, where the major ions present in the water were analyzed.

In this study, ion balance values for the sampled specimens were consistently below 5%, indicating the reliability of the results.
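The ion balance check can be sketched as follows. The text does not state which formula was applied, so this example assumes the usual charge-balance error, 100 × (Σcations − Σanions) / (Σcations + Σanions), with concentrations converted from mg/L to meq/L. The sample values and the function name `ion_balance_error` are hypothetical and serve only as an illustration.

```python
# Equivalent weights (g/eq) = molar mass / ionic charge.
EQ_WEIGHT = {
    "Ca": 20.04, "Mg": 12.15, "Na": 22.99, "K": 39.10,        # cations
    "Cl": 35.45, "SO4": 48.03, "HCO3": 61.02, "NO3": 62.00,   # anions
}
CATIONS = ("Ca", "Mg", "Na", "K")
ANIONS = ("Cl", "SO4", "HCO3", "NO3")


def ion_balance_error(sample_mg_l):
    """Return the charge-balance error (%) of one water sample given in mg/L."""
    # Convert every reported ion from mg/L to meq/L.
    meq = {ion: conc / EQ_WEIGHT[ion] for ion, conc in sample_mg_l.items()}
    cations = sum(meq.get(ion, 0.0) for ion in CATIONS)
    anions = sum(meq.get(ion, 0.0) for ion in ANIONS)
    return 100.0 * (cations - anions) / (cations + anions)


# Hypothetical sample (mg/L), not taken from the study dataset.
sample = {"Ca": 100, "Mg": 40, "Na": 150, "K": 10,
          "Cl": 250, "SO4": 200, "HCO3": 180, "NO3": 20}
error = ion_balance_error(sample)
accepted = abs(error) < 5.0  # acceptance threshold quoted in the text
```

A sample is kept when the absolute error stays below the 5% threshold mentioned above.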
The proposed methodology
In order to achieve the pre-established objectives with the collected data, two kinds of statistical analysis were used in this study: an exploratory analysis followed by a confirmatory one. The exploratory analysis included two multivariate techniques, HCA and PCA. The aims were to divide the data into groups showing the same characteristics (Rousseeuw et al. 1996; Faes et al. 2001), to decrease data complexity (Landau & Everitt 2003), and to discover which variables cause the most data variability (Kaufmann & Schering 2007). The confirmatory analysis consisted of multivariate confirmatory factor analysis (CFA) using the SEM technique (Anderson & Gerbing 1988). However, as the Kolmogorov–Smirnov and Shapiro–Wilk tests revealed a clear deviation from normality for almost all water indicators, a non-parametric Kruskal–Wallis H test was used instead of the conventional one-way ANOVA test in order to explore variability within the dataset. For the same reason, a bootstrap analysis was used when conducting the SEM to ensure parameter stability and model validation (Bollen & Stine 1992; Schumacker & Lomax 2010).
Exploratory factor analysis
In this study, EFA was based on both HCA and principal component analysis.
Hierarchical cluster analysis
Hierarchical clustering, or HCA, is an algorithm that organizes similar objects into groups known as clusters. The result consists of a collection of clusters, each of which is distinct from the others and contains objects that are broadly similar to one another. Traditionally, cluster analysis was seen as a classification method without prior assumption of difference within a sample or population (Punj & Stewart 1983). As mentioned earlier, cluster analysis aims to subdivide data into a number of groups or clusters sharing the same properties (Bardhoshi et al. 2021). It is, therefore, a useful tool to discover the underlying homogeneous structures within a dataset (Yim & Ramdeen 2015).
According to Beckstead (2002), HCA is a type of cluster analysis in which the researcher has usually no prior idea about how many groups are behind data. HCA has a descriptive purpose; it enables a hierarchical classification of data (Köhn & Hubert 2014). This classification can be observed via a dendrogram (Bridges 1966).
Hierarchical clustering is the most widely used clustering approach in Earth sciences (Davis 1986) and is frequently used to classify hydrogeochemical data (Steinhorst & Williams 1985; Ribeiro & Macedo 1995; Güler 2002). Moreover, HCA has been used to rank water quality data into different classes, to test whether samples can be grouped into hydrochemical groups, and to group water samples into significantly different clusters (Zhang et al. 2012).
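The agglomerative procedure described above can be illustrated with a minimal sketch. The distance measure and linkage rule are not specified in the text, so Euclidean distance and average linkage are assumed here; the function name `average_linkage` and the five two-dimensional points are hypothetical.

```python
from itertools import combinations
from math import dist


def average_linkage(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    merges = []  # (members_a, members_b, distance), in merge order

    def cluster_distance(a, b):
        # Average linkage: mean Euclidean distance over all cross-cluster pairs.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        merges.append((clusters[a], clusters[b],
                       cluster_distance(clusters[a], clusters[b])))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return [sorted(c) for c in clusters], merges


# Hypothetical standardized values for five wells (not study data).
wells = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25), (2.00, 2.10), (2.20, 1.90)]
groups, merges = average_linkage(wells, n_clusters=2)
```

The recorded merge order and distances in `merges` are the information a dendrogram displays; in practice, variables are usually standardized first so that parameters with large ranges, such as EC, do not dominate the distances.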
Principal component analysis
PCA is a widely used tool for data modeling, compression, and visualization (Vidal et al. 2016). PCA transforms a number of possibly correlated variables within a dataset into a smaller number of new variables called principal components (Richardson 2009; Abdi & Williams 2010). While EFA and PCA are not exactly the same, they are usually employed for the same objective. Both are data-reduction techniques, aiming to explore the underlying factor structure within a dataset (Alavi et al. 2020). However, PCA is more popular than EFA (De Winter & Dodou 2016). For this reason, PCA was selected as the method of factor extraction in this study using SPSS.23 software. When choosing PCA as the EFA method, the researcher must determine the number of factors to retain and the rotation method (Jain & Shandliya 2013). Kaiser's rule suggests that any factor with an eigenvalue above 1 should be retained (Hair et al. 2010). For the rotation method, experts recommend considering both an oblique rotation (such as oblimin or promax) and an orthogonal one. Researchers may first try an oblique rotation; if low correlations between factors are found (less than ±0.32), an orthogonal Varimax rotation is then the best option (Brown 2009).
PCA (principal component analysis) is carried out here to multiply the approaches and to verify the results. It is a method for studying data that takes into account its multidimensional character. This particularly powerful method consists in finding a new space where variables are represented while losing the minimum amount of information (Kolsi-Hajji et al. 2013). It is based on the study of covariance between variables.
In the literature, this technique is used extensively to characterize and classify water and to determine the source of contamination (Ogasawara 1999; Hamzaoui-Azaza et al. 2009; Jiang 2009). When applying PCA, both Varimax rotation and Kaiser normalization can be used. These two options improve the results achieved (Yidana et al. 2010; Kolsi-Hajji et al. 2013). Moreover, the Varimax rotation method consists in applying an orthogonal matrix to the matrix of factors to increase the differences between factors, which helps to interpret the results (Yidana et al. 2010; Kolsi-Hajji et al. 2013).
In addition, this rotation increases the participation of variables with a high contribution and reduces that of variables with a low contribution. The first factor is associated with the highest eigenvalue. The Kaiser method (Kaiser 1958) is widely used in combination with the rotation method.
The factors chosen are those with eigenvalues at least equal to 1 (Ogasawara 1999; Yidana et al. 2010; Kolsi-Hajji et al. 2013). The PCA was run with SPSS version 28.0 (Statistical Package for the Social Sciences), a statistical analysis program that is used extensively in this area.
CFA – structural equation modeling
Structural equation modeling is a multivariate statistical analysis technique that is used to analyze structural relationships. This technique is a combination of FA and multiple regression analysis, and it is used to analyze the structural relationship between measured variables and latent constructs.
The model can then be statistically tested in a simultaneous analysis of the entire set of variables to see if it is consistent with the data. Several aspects of the SEM set differ from the previous generation of multivariate procedures (Fornell 1983).
First, it approaches data analysis in a confirmatory rather than exploratory manner, which makes SEM well suited to data analysis for inference. By contrast, many other multivariate procedures are primarily descriptive, making hypothesis testing difficult. Second, traditional multivariate procedures are incapable of assessing or correcting measurement errors, whereas SEM provides explicit estimates of the error variance parameters (Byrne 2001, 2013). The data were input into Amos 7.0 (Arbuckle 2006) for SEM analysis (Chenini & Khemiri 2009).
RESULTS AND DISCUSSION
Physico-chemical parameters
Temperature measurements in the Chougafiya shallow aquifer show values ranging from 22 to 30.6 °C during May and June 2022. These values reflect a strong influence of atmospheric temperature, which could be due to the shallow depth of the water points. The pH values recorded during sampling range from 6.5 to 7.8, with an average of 7.4 in May 2022.
The conductivity values of groundwater in the Chougafiya aquifer are very variable. They range between 115 and 11,950 μS/cm in June 2022. The lowest value was recorded at Chebika (well n°20, EC = 115 μS/cm). The highest value was recorded in the central part of the basin, in the vicinity of Sebkha Hariet el Batene (well n°6, EC = 11,950 μS/cm). This suggests an invasion of salt water from the Sebkha that significantly affected the Chougafiya aquifer (Fej Ruissate region).
Hydrochemical analysis results
The results of the analyses are summarized in Table 1. Then, the normality test was performed by both the Shapiro–Wilk test and Kolmogorov–Smirnov test (Table 2).
Descriptive statistics of chemical analyses of Chougafiya water (in mg/L)
Parameter | N | Minimum | Maximum | Mean | Standard deviation
---|---|---|---|---|---
EC (μS/cm) | 30 | 115 | 11,950 | 4,386.0 | 3,861.1 |
Cl | 30 | 27 | 5,371 | 1,461.8 | 1,606.2 |
NO3 | 30 | 3 | 208 | 25.6 | 36.1 |
SO4 | 30 | 29 | 2,671 | 503.5 | 531.1 |
Na | 30 | 57 | 2,162 | 673.3 | 603.9 |
K | 30 | 28 | 549 | 128.4 | 123.4 |
Mg | 30 | 23 | 461 | 160.9 | 142.0 |
Ca | 30 | 52 | 1,053 | 316.2 | 274.6 |
Normality assessment
Parameter | Kolmogorov–Smirnov statistic | Kolmogorov–Smirnov df | Kolmogorov–Smirnov Sig. | Shapiro–Wilk statistic | Shapiro–Wilk df | Shapiro–Wilk Sig.
---|---|---|---|---|---|---
EC | 0.200 | 30 | 0.003 | 0.879 | 30 | 0.003
Cl | 0.312 | 30 | <0.001 | 0.784 | 30 | <0.001
NO3 | 0.375 | 30 | <0.001 | 0.407 | 30 | <0.001
SO4 | 0.212 | 30 | <0.001 | 0.737 | 30 | <0.001
Na | 0.252 | 30 | <0.001 | 0.845 | 30 | <0.001
K | 0.292 | 30 | <0.001 | 0.755 | 30 | <0.001
Mg | 0.238 | 30 | <0.001 | 0.830 | 30 | <0.001
Ca | 0.248 | 30 | <0.001 | 0.823 | 30 | <0.001
The Kolmogorov–Smirnov test is generally used for n ≥ 50 samples, whereas the Shapiro–Wilk test is more suitable for smaller sample sizes (<50 samples), though it can also handle larger ones. The null hypothesis for both tests states that the data come from a normally distributed population. Both tests rejected this hypothesis for every parameter (Sig. ≤ 0.003 in Table 2), so the data deviate from normality.
This result confirms the hypothesis of mineralization acquisition by dissolution of primary Halite and secondary gypsum minerals.
Results of the statistics-analysis approach in HCA
Cluster 1: formed by wells number 3, 21, 29, 22, 28, 8, 19, 20, 4, 25, 26, 18, 23, 16, 24, 14, 15, 12, 27, and 30.
Cluster 2: the remaining wells, i.e. 6, 9, 10, 13, 5, 11, 2, 17, 1, and 7.
In addition, nitrogen is an element that exists naturally and is necessary for plant and animal life. Low amounts of nitrogen or nitrates are typical of both surface water and groundwater. Increased nitrate levels, on the other hand, are brought on by human-caused water pollution. Numerous sources, such as nitrogen-rich geologic deposits, animal feces, precipitation, and fertilizer, can release nitrate into groundwater (Follett et al. 1991). The development of agriculture, which requires heavy fertilization of crops with nitrogen fertilizers and livestock manure, has led to high nitrate concentrations in groundwater, exceeding standards (Izbicki et al. 2003). Nitrate concentrations in Chougafiya groundwater range widely from 3.43 to 57.57 mg/L.
High nitrate concentrations reflect the significant influence of irrigation water return. A review of the nitrate spatial distribution map (Figure 7) shows that the highest values appear to be strongly related to land-use patterns. In particular, the highest nitrate values, above 50 mg/L, are recorded in samples n°5, 6, and 7. This result confirms the one obtained previously by the statistical method (HCA), in which samples 5, 6, and 7 belong to cluster 2, corresponding to the agricultural factor.
For at least another 50 years, a portion of the nitrogen fertilizer applied to the soil today will continue to seep toward the groundwater, which is much longer than previously thought (Sebilo et al. 2013). Recent research revealed that in highly fertilized croplands, irrigation significantly affected nitrate leaching as well (Li et al. 2018; Su et al. 2022).
For nations where the use of N fertilizer has increased, the buildup of soil problems is a significant challenge (Ncibi et al. 2021). Even if the excessive use of N fertilizer stopped immediately, the cumulative nitrate deposits would persist for decades because nitrate can build up in the deep vadose zone and groundwater for a long time.
Results of the statistics-analysis approach in the main components (PCA)
A large number of variables are reduced to a small number using PCA. Thirty samples and eight variables (EC, Cl, NO3, K, SO4, Ca, Mg, and Na) were considered in this operation.
The KMO and Bartlett sphericity tests were used to confirm the validity of the method (Table 3). The KMO value was higher than the minimally acceptable threshold (KMO = 0.761 > 0.5), which suggests that the data satisfy the sampling adequacy requirement of PCA. The Bartlett sphericity test was also computed to find out whether the correlation matrix is an identity matrix. The result is highly significant (reported as p = 0.000, i.e. p < 0.001), which shows that PCA is appropriate. These two statistical tests indicate that the variables used comply with the requirements for applying PCA.
KMO and Bartlett test
Test | Statistic | Value
---|---|---
Kaiser–Meyer–Olkin measure of sampling adequacy | | 0.761
Bartlett sphericity test | Approx. chi-square | 354.730
 | df | 28
 | Sig. | 0.000
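As a consistency check on the table above, the degrees of freedom of the Bartlett sphericity test can be recovered from the number of variables. The expression below is the standard form of the test statistic and is given as an assumption, since the text does not state it; here $n$ is the number of samples, $p$ the number of variables, and $|R|$ the determinant of the correlation matrix.

```latex
\[
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln|R|,
\qquad
df = \frac{p(p-1)}{2}
\]
\[
n = 30,\; p = 8
\;\Rightarrow\;
df = \frac{8 \times 7}{2} = 28,
\qquad
n - 1 - \frac{2p + 5}{6} = 29 - 3.5 = 25.5
\]
```

The value $df = 28$ agrees with the table for the eight variables considered.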
The good correlations between Na, Cl, Ca, SO4, and Mg, on the one hand, and EC, on the other hand, reflect the significant contribution of these elements to the acquisition of mineralization in groundwater. The good correlation between Na and Cl (r = 0.96) (Table 4, in SM) confirms the common origin of these two elements, related to the dissolution of halite. The correlation between Ca and Mg, on the one hand, and SO4, on the other hand, is positive and high (r > 0.5), which indicates a common origin of these ions in the dissolution of sulfate minerals: gypsum and/or anhydrite.
The PCA method consists in reducing the workspace from eight parameters to two factors (F1 and F2) indicated by two eigenvalues > 1 (Fig. 8, in SM).
These main components explain most of the variance (Davis 1986; Kumar et al. 2019), with 64.812% for axis F1 and 21.320% for axis F2 (Table 5, in SM). All chemical elements have a good correlation with the F1 axis (with contribution factors λ > 0.6) except for the nitrate element which is well correlated with the F2 axis (λNO3 = 0.966) (Fig. 9 in SM).
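The link between these percentages and Kaiser's rule can be made explicit. Assuming the PCA was run on the correlation matrix of the $p = 8$ standardized variables, and that the reported percentages are the unrotated ones, the total variance equals $p$, and the share explained by component $k$ follows from its eigenvalue, written $\ell_k$ here to avoid confusion with the loadings $\lambda$ used in the text.

```latex
\[
\text{explained variance}_k = \frac{\ell_k}{\sum_{j=1}^{p} \ell_j} = \frac{\ell_k}{p},
\qquad p = 8
\]
\[
\ell_1 \approx 0.64812 \times 8 \approx 5.18,
\qquad
\ell_2 \approx 0.21320 \times 8 \approx 1.71,
\qquad
64.812\% + 21.320\% = 86.132\%
\]
```

Under these assumptions, both implied eigenvalues exceed 1, which is consistent with retaining two factors.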
A PCA with Varimax rotation resulted in two principal components fulfilling Kaiser's rule of factor retention (Fig. 9, in SM). These two factors represent more than 86% of the total variance, which largely exceeds the suggested threshold of 60% (Hair et al. 2010). In addition, all loading factors were above 0.7, which is considered a desirable level of correlation between factors and their corresponding items (Hair et al. 2010; Tabachnick & Fidell 2013). FA results are summarized in Table 6 (in SM), including an assessment of component reliability using the composite reliability coefficient (CR). Reliability is established if CR is above 0.6 (Bagozzi & Yi 1988).
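A composite reliability coefficient can be computed from standardized loadings as sketched below. The text does not give the formula used, so the common form CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) is assumed; the loadings in the example are hypothetical and are not those of Table 6.

```python
def composite_reliability(loadings):
    """Composite reliability of one factor from its standardized loadings."""
    sum_loadings_sq = sum(loadings) ** 2
    # For a standardized indicator, the error variance is 1 - loading^2.
    error_variance = sum(1 - value ** 2 for value in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_variance)


# Hypothetical loadings for one factor (not study results).
hypothetical_loadings = [0.90, 0.85, 0.80, 0.75]
cr = composite_reliability(hypothetical_loadings)
is_reliable = cr > 0.6  # threshold quoted in the text
```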
Kruskal–Wallis H test
The Kruskal–Wallis H test is a distribution-free test that is usually used to compare several populations without concern for normality or homogeneity of variance (Chan & Walmsley 1997). It is often seen as an alternative to the classic ANOVA test when the violation of normality and homogeneity is extreme and when data are ordinal, interval, or ratio level (Pagano 1994; Vargha & Delaney 1998). Given the non-normality of the data, the Kruskal–Wallis H test was used in this study to explore the data further and to detect variability in the groundwater indicators, taking the wells' initial grouping as the independent variable. Prior to this, the data were ranked on a 5-point Likert scale in order to fulfil the condition of ranked data type. All Kruskal–Wallis H tests and data scaling were performed with SPSS.28 software; results are presented in Table 7 (in SM).
The Kruskal–Wallis H tests showed significant mean rank differences between the two identified clusters for all groundwater indicators except NO3. This means that this variable did not contribute enough to the established data segmentation. However, the PCA revealed that this variable is strongly correlated with its respective factor (λNO3 = 0.934). The CFA will then show whether or not this variable should be omitted from the analysis of variability within the collected dataset.
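For illustration, the H statistic can be computed as in the following sketch, which uses average ranks for ties and the usual tie correction. It works on raw values and does not reproduce the Likert rescaling or the SPSS output described above; the two groups are hypothetical. The p-value helper is valid only for two groups (one degree of freedom), which matches the two-cluster case.

```python
from collections import Counter
from math import erfc, sqrt


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    ties = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in ties) / (n ** 3 - n)
    return h / correction


def p_value_two_groups(h):
    """Chi-square survival function for 1 degree of freedom."""
    return erfc(sqrt(h / 2))


# Hypothetical indicator values for two clusters (not study data).
cluster_1 = [1.0, 2.0, 3.0]
cluster_2 = [4.0, 5.0, 6.0]
h_stat = kruskal_wallis_h(cluster_1, cluster_2)
p_val = p_value_two_groups(h_stat)
```

A p-value below 0.05 would indicate a significant difference in mean ranks between the two groups.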
Confirmatory analysis results
In similar cases, SEM experts often recommend the use of a non-parametric bootstrap analysis, especially the Bollen–Stine bootstrap of the χ2 test statistic, in order to assess overall model fit (Walker & Smith 2017). Simply put, bootstrap analysis is a computational procedure that calculates confidence intervals for parameter estimates, or p-values for hypothesis tests, using bootstrap samples drawn from a parent sample without distributional assumptions (Corrêa Ferraz et al. 2022). Bootstrapping can be used to assess parameter stability and to check whether non-normality has affected parameter estimation (Byrne 2013).
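The resampling idea can be shown with a plain percentile bootstrap for a simple statistic. This is not the Bollen–Stine procedure, which additionally transforms the data so that the fitted model holds exactly before resampling; it is only a minimal sketch of drawing bootstrap samples with replacement from a parent sample. The function name `bootstrap_ci` and the data values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(values, stat=mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Each bootstrap sample is drawn with replacement and has the parent size.
    estimates = sorted(
        stat(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical parent sample (not study data).
parent_sample = [3, 8, 12, 15, 21, 25, 30, 41, 57]
low, high = bootstrap_ci(parent_sample)
```

No distributional assumption enters the calculation: the interval comes entirely from the spread of the resampled estimates.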
Our results showed a non-significant Bollen–Stine p-value (p = 0.066 > 0.05) (Table 9, in SM), which indicates that the model is valid (Walker & Smith 2017). In addition, the bootstrapped standard errors (Table 10, in SM) were compared with the regular maximum likelihood (ML) ones and were found to be close. The same held for the loading factors, whose p-values were all statistically significant (Table 11, in SM).
The first group is on the positive side of the F1 axis and includes the most mineralized samples, reflecting the influence of the dissolution of evaporite minerals and of dolomitization. The second group, which is on the positive side of the F2 axis, is strongly connected to nitrates, showing anthropogenic pollution related to agricultural practices in this area.
CONCLUSION
The management of water resources is essential for the sustainability of developing nations, especially those facing climate change effects and water resource scarcity. The Chougafiya shallow aquifer (Central Tunisia) suffers from a low recharge rate and the threat of GWQ degradation.
The combined HCA–PCA–SEM technique was applied to confirm the origin of mineralization in the Chougafiya shallow aquifer. It shows that groundwater mineralization was related mainly to the dissolution of halite and gypsum minerals. The exploratory analysis (HCA and PCA techniques) showed that the salinity acquisition process in the Chougafiya groundwater was linked to two main factors: (1) a natural factor, expressed by climatic conditions and a strong relationship between EC and Ca, Mg, Na, K, Cl, SO4, and HCO3, which is mainly due to rock–water interaction, groundwater leaching, and the influence of Sebkha El Baten; and (2) an anthropogenic factor, related to high nitrate concentrations generated by excessive use of fertilizers in some specific lands. The confirmatory analysis, performed by the SEM technique, indicated that agricultural activity was the main origin of anthropogenic contamination in the irrigation areas located in the central part of the basin. The south-eastern part (Fej Ruissat) is the most affected by groundwater salinization pressure and by poor suitability for human use.
This study confirmed the effectiveness of the applied hybrid method (EFA-CFA) in supporting decision-making where data are scarce, particularly for water management and sustainable development.
It is recommended to implement stringent management measures to limit overexploitation of the aquifer and prevent the continued decline in piezometric levels, and to develop and implement artificial recharge strategies to restore groundwater levels and prevent the intrusion of saline water. Additionally, it is worth promoting the adoption of sustainable agricultural practices, including the rational use of fertilizers, to reduce nitrate levels in groundwater. It is also recommended to enhance continuous monitoring of GWQ and to adjust management strategies based on the findings of hydrogeological and hydrochemical studies. Furthermore, it is necessary to raise awareness among farmers and local communities about the impacts of their activities on the aquifer and on water resource sustainability.
ACKNOWLEDGEMENTS
Firstly, all authors extend their appreciation to Taif University, Saudi Arabia, for supporting this work, and to all those who have contributed to it, through project number (TU-DSPP-2024-240). Secondly, the authors would like to acknowledge the Deanship of Scientific Research, Taif University, for supporting this work.
AUTHOR CONTRIBUTIONS
In the collaborative effort of conceiving and designing this study, all authors played a pivotal role, contributing their expertise to shape the research framework. Notably, Nourhene Akrout assumed a leadership role in crafting the initial draft of the manuscript. This foundational document served as the cornerstone for subsequent revisions and improvements, a process enriched by the valuable feedback and input provided by all authors. Through a series of iterative reviews, the manuscript underwent comprehensive scrutiny, ultimately resulting in a final version that garnered unanimous approval from all contributors.
DECLARATION OF COMPETING INTEREST
In adherence to ethical standards, the authors affirm that they possess no known competing financial interests or personal relationships that could have potentially influenced the integrity of the work reported in this paper. This declaration underscores the commitment to transparency and integrity in scholarly pursuits, ensuring the credibility and objectivity of the research findings.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.