ABSTRACT
Geostatistical modeling is a powerful tool for improving the characterization, management, and prediction of decision-making processes. The Wilaya of El-Oued is considered the backbone of agriculture in Algeria, covering 40% of the total cultivated area in the country. This makes it one of the regions with the highest consumption of agricultural fertilizers. It has also experienced significant population growth in recent years, coupled with a weak sewage network, which could lead to the direct contamination of groundwater and the region's natural resources. This study tests two deterministic interpolation methods, inverse distance weighting (IDW) and the radial basis function (RBF), together with a geostatistical method (point kriging), for mapping nitrates in 113 wells in the El-Oued aquifer. Surfer 20 was used for this purpose. Because the estimates differ significantly between methods, the resulting spatial distribution maps of nitrates are evaluated by comparing their predictive quality through the calculation of the mean estimation error. The results indicate the superiority of the point kriging technique over the deterministic estimates, with an R2 of the estimation errors of about 0.067.
HIGHLIGHT
This work aims to facilitate the management of the groundwater resources on which the study area depends, and to give an idea of the value of this water, by identifying the geographical locations of these resources, whether polluted or not.
INTRODUCTION
In light of the climate changes occurring globally, along with the scarcity of water and food resources, Algeria is striving to achieve self-sufficiency in food in an effort to reduce economic dependence on the hydrocarbons sector.
The country has vast areas in the south characterized by an arid desert climate. However, the country has taken on the challenge and, in recent years, has been working to reclaim, utilize, and invest in land for agriculture.
The El-Oued region has made significant progress in this field, leading in the production of various crops such as potatoes and dates, achieving 70 and 35% of national production, respectively. With the agricultural boom in the region, the availability of a large labor force, and rapid and unplanned population growth, water resources are diminishing in both quantity and quality (Khechana 2014; Brahim 2009).
This is particularly concerning given the use of fertilizers to improve soil quality and agricultural yield. On this basis, water has become a subject of special interest. In this race, groundwater has been drawn on heavily, which makes preserving this resource a major challenge.
Nitrates, chemical compounds containing nitrogen and oxygen, occur naturally in the environment as a product of the biological processes that decompose organic matter (Bouselsal & Kherici 2014; Abdelmonem et al. 2024).
However, nitrate concentrations in groundwater have increased because of the intensive use of nitrogen in modern agriculture, as well as discharges from industrial and domestic activities (Nas & Berktay 2006).
Both the Algerian standards and the World Health Organization (WHO) set the limit for nitrate at 50 mg/l (Official Gazette of the Algerian Republic 2011; WHO 2011). The presence of nitrates in groundwater is likely to have harmful effects on water quality and health.
Once consumed, nitrates can be converted into nitrites in the body, which compromises the capacity of the blood to transport oxygen and leads to serious health problems, particularly in infants (Majour et al. 2023a, 2023b).
Nitrates may also cause significant environmental problems (Majour et al. 2023a, 2023b). In order to eliminate, or at least reduce, the dangers of groundwater pollution by nitrates, the necessary data must be available to build models and draw maps that locate the most polluted areas (Attoui et al. 2024).
For this reason, we will use geostatistical methods. Geostatistical methods provide a set of statistical tools for spatial variation analysis and spatial interpolation (Renard & Comby 2006; Gundogdu & Guney 2007; Halimi & Djabri 2024).
These techniques produce not only prediction surfaces but also error or uncertainty surfaces. A map showing isolines is usually the visual culmination of this process and plays a major role in decision-making (Despagne 2006; Gomez-Hernandez & Garcia 1989).
Interpolation techniques can be divided into two main groups: deterministic and geostatistical. Deterministic techniques create surfaces from nearby measured points, based on either the extent of similarity (inverse distance weighting) or the degree of smoothing (radial basis functions) (Collins & Bolstad 1996). Geostatistical interpolation techniques (kriging) use the statistical properties of the measured points: they quantify the spatial autocorrelation among the measured points and account for the spatial configuration of the sample points around the prediction location (Despagne 2006; Theodossiou & Latinopoulos 2006).
The present study compares three of the most widely used interpolation methods in hydrochemistry: (1) inverse distance (IDW), (2) radial basis function (RBF), and (3) point kriging. To assess their performance, a quantitative statistical analysis methodology is used. It is based on (1) the examination of all nitrate concentration data interpolated by the cross-validation method and (2) the statistical and visual analysis (in the form of interpolation error maps) of the residuals.
The purposes of this paper are (1) to analyze the spatial pattern and mapping of nitrate concentration for the observation period and (2) to evaluate the distribution of nitrate concentration with geostatistical methods in the El-Oued region of Algeria in 2023.
METHODS
Study area
Data and methods
To map nitrate pollution in the groundwater of the phreatic aquifer of the El-Oued region, we used nitrate concentration data from water points well distributed across the study site, along with spatial and geographic data defining the surrounding terrain of each water point. Nitrate concentration data from the groundwater inventory were collected from 113 water points in March 2023.
Data processing was carried out using specialized software: Statistica 8.0 for statistical analysis, and Surfer 20 for data interpolation and the production of the maps presented in the results.
Three interpolation techniques were employed: the IDW method, the RBF method, and point kriging.
Inverse distance (commonly referred to by the acronym IDW) is a method of local and deterministic estimation, classified as an exact interpolator. With the inverse distance to a power, the data are weighted during interpolation in such a way that the influence of one point relative to another decreases with the distance from the grid node. Weighting is assigned to the data through the use of a weighting power, which controls how the weighting factors decrease as the distance from a grid node increases. The higher the weighting power, the less influence distant points have on the grid node during interpolation. As the power increases, the value of the grid node approaches the value of the nearest point. For a smaller power, the weights are distributed more uniformly among the neighboring data points.
It assumes that values very close to the unsampled location are more representative of the value to be estimated than those from more distant samples (Franke 1980). Its results are good when the values to be interpolated are regularly sampled. It is a simple method to use.
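The weighting scheme described above can be illustrated with a minimal Python sketch. It is not the Surfer 20 implementation: the function name `idw_estimate`, the default power of 2, and the three sample wells are assumptions chosen only for illustration.

```python
import math

def idw_estimate(points, x0, y0, power=2.0):
    """Inverse-distance-weighted estimate at the grid node (x0, y0).

    points: list of (x, y, value) tuples.
    power:  weighting power; larger values favour the nearest points.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        d = math.hypot(x - x0, y - y0)
        if d == 0.0:
            # Exact interpolator: a node located on a data point takes its value.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical wells (x in km, y in km, nitrate in mg/l), not the study's data.
wells = [(0.0, 0.0, 40.0), (1.0, 0.0, 60.0), (0.0, 1.0, 80.0)]

centre = idw_estimate(wells, 0.5, 0.5)                # equidistant from all wells
low_power = idw_estimate(wells, 0.1, 0.0, power=1.0)  # weights spread more evenly
high_power = idw_estimate(wells, 0.1, 0.0, power=8.0) # dominated by nearest well
print(centre, low_power, high_power)
```

With a high power the estimate near the first well approaches that well's value, while a low power pulls it toward the neighbouring values, which is the behaviour described in the text.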
The RBF method allows interpolation from randomly distributed data. It constitutes polynomial surfaces of a given order connecting the training points. In geostatistical analysis, there is the possibility to adjust the degree of locality (a polynomial surface per measure in a given neighborhood) and the degree of globality (a single polynomial surface for the entire study area expressing the first-order trend). This type of interpolation is made more flexible than polynomial interpolation by a tension parameter that controls the behavior of the interpolation function and the smoothing parameter (Benjamin 2007; Suthar et al. 2009).
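As an illustration of exact interpolation with radial basis functions, the sketch below uses a multiquadric kernel, one common choice of basis function. This is an assumption: the text does not state which kernel was selected in Surfer 20. The shaping parameter `r2`, the helper names, and the sample wells are likewise hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def multiquadric(h, r2):
    """Multiquadric basis function; r2 acts as a shaping (smoothing) factor."""
    return math.sqrt(h * h + r2)

def rbf_fit(points, r2=1.0):
    """Find one coefficient per data point so the surface honours the data."""
    a = [[multiquadric(math.hypot(xi - xj, yi - yj), r2) for xj, yj, _ in points]
         for xi, yi, _ in points]
    values = [v for _, _, v in points]
    return solve_linear(a, values)

def rbf_predict(points, coeffs, x0, y0, r2=1.0):
    """Evaluate the fitted surface at (x0, y0)."""
    return sum(c * multiquadric(math.hypot(x - x0, y - y0), r2)
               for (x, y, _), c in zip(points, coeffs))

# Hypothetical wells (x in km, y in km, nitrate in mg/l), not the study's data.
wells = [(0.0, 0.0, 40.0), (1.0, 0.0, 60.0), (0.0, 1.0, 80.0), (1.0, 1.0, 50.0)]
coeffs = rbf_fit(wells)
print(rbf_predict(wells, coeffs, 0.5, 0.5))
```

The fitted surface passes through every data value, and increasing `r2` produces a smoother surface between the points.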
Kriging is a local stochastic (or geostatistical) method developed by Matheron (1963, 1970). It is an exact interpolator that takes into account the size of the field to be estimated and the positions of the points relative to each other (i.e., the interdependence of variables) and considers the spatial continuity of the studied area (Leuangthong et al. 2004). It is a method of linear estimation that minimizes the estimation variance as calculated using the variogram.
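For reference, the standard ordinary point kriging system can be written as follows. This is the textbook form, given here as an assumption about the estimator used; the symbols λ_i (weights), μ (Lagrange multiplier), and γ (variogram) are notation introduced for this sketch and do not appear in the original text.

```latex
% Linear estimator at the unsampled location x_0, with unbiasedness constraint
Z^{*}(x_0) = \sum_{i=1}^{n} \lambda_i \, Z(x_i), \qquad \sum_{i=1}^{n} \lambda_i = 1

% Kriging system obtained by minimizing the estimation variance
\sum_{j=1}^{n} \lambda_j \, \gamma(x_i - x_j) + \mu = \gamma(x_i - x_0), \qquad i = 1, \dots, n

% Resulting (minimized) estimation variance
\sigma_K^{2}(x_0) = \sum_{i=1}^{n} \lambda_i \, \gamma(x_i - x_0) + \mu
```

The weights depend only on the variogram and on the geometry of the data points relative to each other and to the estimated location, not on the measured values themselves.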
The variogram can be interpreted in terms of spatial continuity or correlation if graphically inverted to obtain a decreasing function: the greater the distance between two measurements, the less correlated the measurements are (Ahmadi & Sedghamiz 2007). The variogram is the cornerstone of spatial analysis by kriging; it helps detect outlier points by their position relative to others. Isolated points should be removed to achieve good spatial correlation (Cay & Uyan 2009). Once the experimental variogram is established, it needs to be fitted with a model that best suits it; it is not always easy to find the theoretical model that matches it (Chappell et al. 2003; Diodato & Ceccarelli 2005).
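The experimental variogram mentioned above can be sketched in a few lines of Python, assuming the classical omnidirectional estimator (half the mean squared difference between pairs of values grouped by separation distance). The function name, the lag handling, and the sample points are assumptions for illustration and do not reproduce the Surfer 20 settings.

```python
import math

def experimental_variogram(points, lag, n_lags):
    """Return a list of (mean distance, semivariance, pair count) per lag class.

    points: list of (x, y, value) tuples.
    lag:    width of each distance class.
    n_lags: number of distance classes; pairs farther apart are ignored.
    """
    sq_diff = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            k = int(h // lag)  # class k covers [k * lag, (k + 1) * lag)
            if k < n_lags:
                sq_diff[k] += (zi - zj) ** 2
                dist_sum[k] += h
                counts[k] += 1
    return [(dist_sum[k] / counts[k], sq_diff[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Hypothetical points on a line (x in km, y in km, nitrate in mg/l).
sample = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (2.0, 0.0, 40.0)]
for mean_h, gamma, n_pairs in experimental_variogram(sample, lag=1.0, n_lags=3):
    print(mean_h, gamma, n_pairs)
```

Classes with few pairs give unreliable semivariance values, which is one reason isolated points can distort the fitted model.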
Methodology
The use of Surfer 20 software is essential for the regional mapping project (Christakos 2001). Grid data includes a suite of integrated applications: Data file and Worksheet, Map, and Gridding method. With the applications, one can perform all tasks in Surfer 20, from the simplest to the most advanced, including data management, geographic analysis, data updating, geoprocessing, and mapping. The mapping methods used in this work are those proposed by the Gridding method extension of Surfer 20.
The procedure for developing a nitrate content map is divided into several steps (Figure 2): The first step involves acquiring and structuring spatial nitrate data using the Data file and Worksheet application. The second step is the analysis, visualization, and spatial modeling of data using the Gridding method application, which includes common extensions such as geostatistical analysis, cross-validation, and map (Figure 2).
The third step involves interpolation using deterministic methods (IDW and RBF) and geostatistical methods (point kriging) using the Gridding method. Finally, we compare different methods through cross-validation and residual analysis (Figure 2).
Data analysis
The rigorous evaluation of the best interpolators is subject to a method of statistical and visual analysis. This method integrates numerous statistical parameters and coefficients and unfolds as follows.
First, the distribution of nitrate concentrations in groundwater was plotted against a normal distribution curve (Figure 3). The analysis shows that the dataset is asymmetrically distributed, with certain values appearing with high frequency; the statistical series is thus multimodal, with a low value of the skewness coefficient (SK), as shown in Figure 3(b). Moreover, the kurtosis value of 7.84, well above the value of 3 expected for a normal distribution, indicates heavy tails with a spread toward high values (right of the histogram). A logarithmic transformation would help make the data distribution more symmetric, but this transformation is discouraged by several authors.
The analysis of nitrate concentrations shows that the mean value exceeds the WHO standard (50 mg/l). Examination of the variogram surface obtained from the nitrate concentration data (Figure 3(c)) reveals a marked anisotropic behavior at large distances, with high variability in the 130° direction.
RESULTS AND DISCUSSION
The prepared nitrate concentration data (113 water points or nitrate concentration values) are interpolated using different techniques to produce corresponding numerical models, such as the IDW method, the RBF method, and point kriging. In order to select the map that appears most representative of reality, our choice must be based on objective criteria. For this purpose, it is necessary to compare the models used through cross-validation.
Interpolation of nitrate concentrations using deterministic methods
The nitrate concentrations obtained by the RBF method are smoother than those obtained by the IDW method, which shows a strong concentric pattern around the data points.
Thus, the shape of the nitrate concentration isolines obtained reflects the fact that the method used greatly depends on the location of the data relative to the considered node.
On the map obtained by the IDW method (Figure 4(a)), most of the study area shows nitrate concentrations between 44 and 64 mg/l, that is, medium to high concentrations over a large part of the study zone. The nitrate concentrations obtained by the RBF method (Figure 4(b)) range between 2 and 130 mg/l, from very low to very high, over most of the study region. Most of the values obtained by the RBF method lie between 40 and 80 mg/l, which is quite close to the measured nitrate concentrations, which range between 40 and 92 mg/l; the IDW method, in contrast, produced values only moderately similar to the measured ones.
Interpolation of nitrate concentrations using geostatistics
The tools of the exploratory study show that the nitrate concentration in the study region is not a stationary variable, which implies that the assumed model for this regionalized variable is Z(x) = Y(x) + m(x), consisting of a deterministic drift m(x) and a stationary residue Y(x) with zero expectation. In this case, linear kriging is the most appropriate.
The second step involves defining the scale at which we adapt the drift. If we consider a global drift (over the entire area), then we tend toward the extreme ‘global’; if, on the contrary, we have a local drift that only concerns small portions of the area, we tend toward ‘local’ (Ghazi et al. 2014).
The third step involves developing a semi-variance model: by visualizing the variogram surface and the variogram cloud at different distance steps, we identify an isotropic behavior at short distances. Subsequently, a variogram was computed with a lag of 1 km and a total of 18 lags. The shape of the semi-variance cloud suggests a linear model with a nugget effect of approximately 298, since nitrate concentration varies even at small scales (Figure 5).
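A linear variogram model with a nugget effect has the form γ(h) = C0 + b·h for h > 0, with γ(0) = 0. The sketch below fits such a model by ordinary least squares to experimental variogram points. The fitting approach, the function names, and the synthetic values (in particular the slope of 5 per km) are assumptions for illustration; only the nugget of about 298 and the 18 lags of 1 km come from the text.

```python
def fit_linear_variogram(lags, gammas):
    """Least-squares fit of gamma(h) = nugget + slope * h."""
    n = len(lags)
    mean_h = sum(lags) / n
    mean_g = sum(gammas) / n
    sxx = sum((h - mean_h) ** 2 for h in lags)
    sxy = sum((h - mean_h) * (g - mean_g) for h, g in zip(lags, gammas))
    slope = sxy / sxx
    nugget = mean_g - slope * mean_h
    return nugget, slope

def linear_variogram(h, nugget, slope):
    """Evaluate the model; the nugget is the jump just after h = 0."""
    return 0.0 if h == 0 else nugget + slope * h

# Synthetic experimental values on 18 lags of 1 km (not the study's data):
# a nugget of 298 as reported in the text and a hypothetical slope of 5.
lags = [float(k) for k in range(1, 19)]
gammas = [298.0 + 5.0 * h for h in lags]

nugget, slope = fit_linear_variogram(lags, gammas)
print(nugget, slope, linear_variogram(10.0, nugget, slope))
```

In practice the experimental points scatter around the line, and the fit is often weighted by the number of pairs in each lag class; the unweighted version is kept here for brevity.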
The map obtained by point kriging is characterized by very low to very high nitrate concentrations over a large part of the study zone. Most of the nitrate values obtained by the point kriging method range between 40 and 100 mg/l, which means that they are more similar to the measured nitrate concentrations, which range between 40 and 92 mg/l.
Test and validation of the choice of model
The validation of the obtained results is an essential post-interpolation step. The estimations are validated using two approaches: comparison between estimation and measurement on two separate samples, and cross-validation.
The cross-validation method (Table 1) allows for the evaluation of the estimation error for each tested interpolation method.
Interpolation method | IDW | RBF | Point kriging |
---|---|---|---|
Standard error (SE) | 1.90 | 1.94 | 1.80 |
Average error (ME) | −1.12 | −0.92 | −0.46 |
RMSE | 11.93 | 9.77 | 4.92 |
R | 0.7794 | 0.8043 | 0.8137 |
SK | 0.82 | 0.96 | 0.93 |
K | 4.97 | 4.91 | 4.42 |
The root mean square error (RMSE) measures the typical magnitude of the residuals, that is, the dispersion of the data around the regression axis; the higher the RMSE, the wider the dispersion of the residuals and the less accurate the model. It is calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Z_i - P_i\right)^2}$$

where Z is the observed value, P is the predicted value, and N is the total number of test observations.
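The mean error and RMSE used in the comparison can be computed from a cross-validation run as sketched below. The sketch assumes a leave-one-out scheme (each point is predicted from all the others) and uses a simple inverse-distance interpolator as a stand-in; the function names and sample wells are hypothetical, and errors are taken as observed minus predicted, so a negative mean error indicates overestimation.

```python
import math

def idw(points, x0, y0, power=2.0):
    """Stand-in interpolator: inverse distance weighting."""
    num = den = 0.0
    for x, y, z in points:
        d = math.hypot(x - x0, y - y0)
        if d == 0.0:
            return z
        w = d ** -power
        num += w * z
        den += w
    return num / den

def leave_one_out(points, interpolate):
    """Predict each point from all the others; return (observed, predicted)."""
    observed, predicted = [], []
    for i, (x, y, z) in enumerate(points):
        others = points[:i] + points[i + 1:]
        observed.append(z)
        predicted.append(interpolate(others, x, y))
    return observed, predicted

def mean_error(observed, predicted):
    """Mean of (observed - predicted); negative values mean overestimation."""
    return sum(z - p for z, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square of (observed - predicted)."""
    return math.sqrt(sum((z - p) ** 2 for z, p in zip(observed, predicted))
                     / len(observed))

# Hypothetical wells on a line (x in km, y in km, nitrate in mg/l).
wells = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (2.0, 0.0, 30.0)]
obs, pred = leave_one_out(wells, idw)
print(mean_error(obs, pred), rmse(obs, pred))
```

Any interpolator with the same call signature can be passed to `leave_one_out`, which makes it straightforward to compare methods on identical data.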
Table 1 shows that the standard error (SE) is similar for the IDW and RBF methods (1.90 and 1.94) and slightly lower for the point kriging method (1.80). This is confirmed by the values of the mean error (ME) and the RMSE. The SK indicates that all three methods result in a slightly skewed distribution of interpolated values.
The dispersion maintains approximately the same spatial distribution with all three methods. However, the mean errors show that the IDW method (ME = −1.12) leads to a greater overestimation than the RBF method (ME = −0.92) and the point kriging method (ME = −0.46) (Table 1). Moreover, the RMSE of the interpolation by the point kriging method (RMSE = 4.92) is lower than that of the IDW method (RMSE = 11.93) and the RBF method (RMSE = 9.77) (Table 1).
For the correlation coefficient (R), the closer the value is to 1, the more accurate the interpolation. For all tested techniques, R is reasonably close to 1, with a slight advantage for point kriging (R = 0.8137) (Table 1).
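The coefficient R can be computed as sketched below, assuming it is the Pearson correlation coefficient between measured and estimated values; the function name and sample values are hypothetical.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    norm_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    return cov / (norm_o * norm_p)

# Hypothetical measured and estimated nitrate values (mg/l).
measured = [40.0, 55.0, 70.0, 92.0]
estimated = [50.0, 65.0, 80.0, 102.0]  # every estimate is 10 mg/l too high
print(pearson_r(measured, estimated))
```

In this example the correlation is perfect even though every estimate is biased by 10 mg/l, which is why R is read together with the mean error and the RMSE rather than on its own.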
The coefficient of kurtosis (K) quantifies the deviation of the distribution shape from a normal distribution (Table 1).
We can therefore conclude that interpolation using point kriging is more effective than RBF and IDW.
The point kriging method has the advantage of being built on the experimental variogram, which describes the average variability of the phenomenon in space (spatial correlation). Kriging is therefore an interpolator that takes into account the spatial structure of the phenomenon, thus refining the estimation and reducing uncertainty in undersampled areas.
Analysis of residuals
The residual is the difference between the observed value (Zi) and the interpolated or predicted value (Pi) at a location i, which corresponds to the interpolation error.
Hence, the comparative study of the residual of all tested spatial interpolation techniques provides important information about each method and allows for a more precise evaluation of the margin of error it produces during interpolation.
For calculating the residual, the difference between the nitrate concentration values of the initial database before interpolation and those from a database extracted from the interpolated grid at the same locations was taken into account. This operation was performed for each interpolation method.
Table 2 indicates that the error distribution is closest to normal for point kriging and RBF, since their kurtosis coefficients (K) are the closest to 3.
Interpolation method | IDW | RBF | Point kriging |
---|---|---|---|
Min (mg/l) | −54.5 | −47.22 | −34.85 |
Max (mg/l) | 76.96 | 65.13 | 58 |
Standard error (SE) | 1.88 | 1.72 | 1.68 |
Standard deviation (SD) | 19.86 | 16.26 | 17.75 |
Kurtosis coefficient (K) | 6.05 | 3.89 | 3.08 |
SK | 0.94 | 0.54 | 0.56 |
RMSE | 19.80 | 16.20 | 17.67 |
This means that these two methods tend to homogenize the error over the grid produced by interpolation. The closer the SK coefficient is to zero, the more symmetric the distribution. In the present case, the values of this coefficient for the RBF and point kriging methods are moderately close to zero (SK = 0.54 and 0.56, respectively) (Table 2), from which we can say that these two methods are more accurate and less biased.
For point kriging, the difference between the minimum and maximum of the residual is the smallest (92.85 mg/l), compared with 112.35 mg/l for the RBF method and 131.46 mg/l for the IDW method (Table 2). The RMSE is highest for IDW (19.80).
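The residual statistics compared here (minimum, maximum, range, skewness, and kurtosis) can be computed as in the following sketch. It assumes population (uncorrected) moment estimators and non-excess kurtosis, so a normal distribution gives a kurtosis of 3; statistical packages may apply small-sample corrections and return slightly different values. The function name and sample values are hypothetical.

```python
def residual_summary(observed, predicted):
    """Summary statistics of the residuals (observed - predicted)."""
    res = [z - p for z, p in zip(observed, predicted)]
    n = len(res)
    mean = sum(res) / n
    m2 = sum((r - mean) ** 2 for r in res) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in res) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in res) / n  # fourth central moment
    return {
        "min": min(res),
        "max": max(res),
        "range": max(res) - min(res),
        "skewness": m3 / m2 ** 1.5,  # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2,    # 3 for a normal distribution
    }

# Hypothetical measured and interpolated values (mg/l), not the study's data.
measured = [8.0, 9.0, 10.0, 11.0, 12.0]
interpolated = [10.0, 10.0, 10.0, 10.0, 10.0]
summary = residual_summary(measured, interpolated)
print(summary)
```

Applied to the kriging column of Table 2, the range is simply 58 − (−34.85) = 92.85 mg/l, the value quoted in the text.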
This constitutes an original approach, allowing both an understanding of the spread of errors in the studied area and highlighting the relationship between the magnitude of residuals and the terrain characteristics.
However, these estimations show significant differences depending on the techniques used, without necessarily indicating a clear hierarchy between major method types (deterministic or stochastic).
The residuals of nitrate concentrations resulting from three interpolation methods are then calculated for 113 water points. They are subsequently interpolated by IDW, RBF, and kriging (Figure 7(a)–7(c)).
The modeling error resulting from the first and second methods (IDW and RBF) varies from −55 to 75 mg/l and from −45 to 65 mg/l, respectively, indicating a considerable overestimation of nitrate concentrations over a large part of the study zone (Figure 7(a) and 7(b)). The third method (point kriging) exhibits residuals ranging from −35 to +55 mg/l, a narrower range than that of the first and second methods.
Figure 7(c) illustrates the spatial variability of nitrate concentration residuals resulting from the third method (point kriging). It is characterized by low residuals ranging between −5 and +15 mg/l over a large area of the study zone. The kriging model is suitable for interpolating nitrate concentrations and generally exhibits a homogeneous distribution with low error values.
All three methods used yielded almost the same degree of correlation between measured and estimated values, with a slight advantage for the kriging method, with a correlation coefficient estimated at 81%; the RBF method yields a similar result with a coefficient of around 80%; the IDW method obtains a correlation cloud slightly away from the first bisector (R = 77%).
The joint study of the IDW, RBF, and point kriging methods shows that the groundwater of the phreatic aquifer, which covers the majority of the study area, is contaminated by nitrates, with concentrations (more than 100 mg/l) exceeding the WHO standard (50 mg/l), which can have harmful consequences on crops. Despite some inaccuracies, point kriging provides a more rigorous basis for spatial interpolation. It gives more accurate results that closely reflect reality by reducing errors. Of the three methods tested, point kriging is therefore the most suitable for interpolating nitrate concentrations to address current and future challenges.
CONCLUSIONS
The purpose of this study was to evaluate two deterministic interpolation methods (IDW and RBF) and a geostatistical method (point kriging) for mapping nitrates in the 113 wells of the El-Oued aquifer. The results show that the degree of correlation between estimated and measured values is moderate to strong: the correlation coefficient for IDW was estimated at 77%, while for the other two methods, RBF and point kriging, the correlation is strong, estimated at 80 and 81%, respectively. The overlap of error clouds resulting from interpolation against measured values shows that the cloud for the point kriging method is better clustered around the y = 0 line, with an R2 coefficient of around 0.067. Based on these results, we can affirm that point kriging is the most accurate of the three methods for creating a spatial model of nitrate concentrations in the phreatic aquifer of the El-Oued region. Implementing a combination of recommendations can significantly reduce nitrate contamination of groundwater. It requires coordinated efforts from farmers, policymakers, researchers, and the community to achieve sustainable management of nitrate levels and protect groundwater resources:
Use technology to more precisely administer fertilizers depending on soil tests and crop requirements.
Use fertilizers that gradually release nutrients, reducing the possibility of leaching.
Use crop rotation systems that incorporate legumes, which can fix atmospheric nitrogen and reduce the need for synthetic fertilizers.
Properly store and manage livestock manure to avoid nitrate runoff, and use compost manure to fix nitrogen before application.
Upgrade septic systems and sewage treatment facilities to properly remove nitrates from wastewater.
Implement policies to encourage best management practices and sustainable farming approaches.
Regularly check groundwater nitrate levels, particularly in sensitive locations, to identify pollution causes and trends.
Educate farmers about the effects of nitrate contamination and effective management strategies for reducing nitrate leaching.
Provide resources and assistance in adopting new technology and processes.
Raise public understanding about the sources and consequences of nitrate contamination, and encourage community participation in mitigation initiatives.
AUTHORS' CONTRIBUTIONS
M.A., Z.N., and A.B. wrote the article and produced the maps; K.A. and R.B. reviewed the text of the article and corrected the language.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.