ABSTRACT
Hydraulic conductivity (K) is a key parameter in aquifer studies and groundwater flow modelling. The main objective of the current study is to investigate the effectiveness of artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), Gaussian process regression (GPR), and random forest (RF) algorithms in estimating K using data from 270 borehole soil samples collected along the Beas riverbank in Kangra district, Himachal Pradesh, India. The grain size parameters d10 and d50, the coefficient of uniformity (Cu), and porosity (n) are used as input parameters for K estimation. The performance of the developed models was assessed using statistical indicators. While the performance of each model is satisfactory, the GPR model performed best during validation, with a determination coefficient of 0.985. The root mean square errors for ANN, ANFIS, GPR, and RF were 0.019, 0.017, 0.00853, and 0.019, respectively. The techniques used in the study offer precise K-prediction abilities that facilitate groundwater management and contaminant transport analysis. The GPR model outperforms the other models in estimating the K of the soil samples and serves as an efficient tool for managing soil water and solute transport.
HIGHLIGHTS
The primary goal of the study is to develop hydraulic conductivity models using four data-driven techniques, i.e., ANN, ANFIS, GPR, and RF.
These data-driven techniques enable quick estimation of hydraulic conductivity, facilitating precise determination of groundwater recharge rate.
Among all these models, the GPR model performs better in predicting the hydraulic conductivity of given soil samples.
INTRODUCTION
Groundwater recharge is essential for sustaining water resources, especially in arid and semi-arid regions where water resources are scarce, and ecosystems face significant stress (Noori et al. 2023). Hydraulic conductivity (K), a fundamental parameter for evaluating the drainage potential of transmitting media, plays a vital role in shaping groundwater recharge rates and, consequently, the surrounding environment and ecosystems (Mahdian et al. 2024). The investigation of the K of porous media is essential for assessing soil stability, predicting landslides, and modelling groundwater flow. The K of porous media depends on the physical properties of the flowing fluid and transmission medium such as particle size, porosity, and pore connectivity (Chandel et al. 2021). Despite the importance of K in geological and geotechnical characterization, the evaluation of K remains a challenging task due to the heterogeneity of the soil particles, difficulties in soil sampling, subsequent experimental procedure, and the large extent of flow area within soil deposit (López-Acosta et al. 2019). Measurement and prediction of soil's K value can be done using direct and indirect approaches (Mozaffari et al. 2024).
Hydraulic conductivity can be measured directly via field or laboratory methods. The laboratory methods include constant head and falling head tests, while the field methods include ring infiltrometer, instant profile, and test basins (Williams & Ojuri 2021; Chandel et al. 2023), borehole permeameter and pressure infiltrometer (Alagna et al. 2023), and auger hole and tension infiltrometer (Kargas et al. 2022). Due to temporal and geographical fluctuations, the direct measurement of K becomes impractical, costly, and time-consuming. As a result, indirect methods for estimating the K from more readily and affordably measured soil parameters, i.e., specific gravity, porosity, and the proportion of clay, silt, and sand content were developed and widely used (More et al. 2022). The indirect techniques comprise empirical equations and machine learning-based data-driven techniques for the estimation of K. The main advantage of empirical equations is that they can estimate the K value quicker than the direct measurement (Williams & Ojuri 2021). Due to the domain-specific development of the empirical equations, these are restricted to specific boundary conditions.
Recent studies demonstrate the effectiveness of data-driven techniques in addressing a range of environmental and engineering problems, including water engineering (Jafari-Asl et al. 2024), flood prediction (Ahmadi et al. 2024; Donnelly et al. 2024), climate parameters (Chatrabgoun et al. 2020; Yeganeh-Bakhtiary et al. 2023), and scouring (Habib et al. 2024). These techniques are capable of precisely and accurately performing complicated tasks, i.e., learning from the data, modelling, and deriving a relation from experimental data. The data-driven techniques can be either individually operated like support vector machines (SVMs), fuzzy, and artificial neural networks (ANNs) or they can be hybrid, i.e., adaptive neuro-fuzzy inference system (ANFIS) and genetic algorithm (GA)-ANN (Williams & Ojuri 2021). Sihag (2018) estimated the K of fly-ash-mixed soils using SVM, ANN, random forest (RF), Gaussian process regression (GPR), M5P model tree, and two other traditional models and concluded that SVM with Radial Basis Function (RBF) kernel was the most accurate technique among others. Boadu (2020) investigated Multiple Linear Regression (MLR) and support vector regression (SVR) for predicting K and concluded that the SVR model performed better for estimating K. Singh et al. (2021a, b) examined ANN, MLR, RF, and M5P tree-based models to determine K and outcomes demonstrated that all models have a reasonable ability for prediction, with RF-based models outperforming the others. Abdalrahman et al. (2022) predicted the infiltration rate of treated wastewater using ANN, multilayer perceptron model (MLP), and Elman neural network (ENN), and concluded that the MLP model exhibited superior performance. Similarly, Singh et al. (2021b) developed two hybrid machine learning-based pedotransfer functions (ML-PTFs) by integrating a genetic algorithm with multilayer perceptron (MLP-GA) and support vector machine (SVM-GA). Their results indicated that the SVM-GA PTF demonstrated greater efficiency compared with the MLP-GA algorithm.
The review of the research literature revealed that several data-driven methods are available for determining K from easily measurable soil properties. ANFIS uses the Takagi-Sugeno fuzzy inference system, an effective tool for capturing nonlinear relationships between inputs and outputs (Seifi et al. 2020). GPR is a probabilistic, nonparametric model that can model complex systems and has a favourable nonlinear mapping ability (Richardson et al. 2017). RF is a meta-estimator that controls overfitting and increases predictive accuracy by fitting several decision trees to various subsamples of the dataset and averaging their outputs. The literature review indicates that previous studies have not jointly applied the ANN, GPR, ANFIS, and RF techniques for estimating the hydraulic conductivity of porous media. These techniques can establish a correlation between the dependent and independent variables, even when the dataset contains inconsistent or noisy data points. Therefore, the study focuses on developing and validating K models using these four techniques. Moreover, the study examines the performance of the models through statistical indicators to identify the most suitable model for K estimation. The study presents a detailed comparison, offering insights into how data-driven techniques can enhance the prediction of hydraulic conductivity (K) in porous media.
MATERIALS AND METHODS
Experimental setup
Model development
In the second phase, ANN, ANFIS, GPR, and RF techniques were employed to estimate the K of the porous medium using observations obtained from the experimental investigations. The accuracy of these models was checked against the experimental data. To verify model performance, statistical measures such as the determination coefficient (R2), root mean square error (RMSE), relative absolute error (RAE), mean absolute error (MAE), Nash–Sutcliffe efficiency (NSE), root mean squared logarithmic error (RMSLE), and Kling–Gupta efficiency (KGE) were used. An overview of the modelling techniques employed in the study is given below.
Artificial neural network
ANN modelling strategy
Adaptive neuro-fuzzy inference system
Layer 1: The membership grade of an input variable is generated by each node. The fuzzy set associated with each node is categorized using a variety of membership functions (MFs), including generalized bell-shaped, Gaussian, triangle-shaped, and trapezoidal-shaped functions (Seifi et al. 2020).
Layer 2: Each node is a fixed node labelled ∏. Its output is the product of the incoming signals and represents the firing strength of a rule.
Layer 3: Each node is a fixed node labelled N and computes the normalized firing strength of a rule.
Layer 4: Each adaptive node i computes the contribution of the ith rule to the model output using its node function.
Layer 5: Adding together the outputs of all incoming signals gives the total output of the ANFIS.
ANFIS combines the advantages of neural networks and fuzzy logic, making it effective for modelling complex and uncertain systems. The key benefit of ANFIS lies in its interpretability, as it uses fuzzy rules to clarify the relationship between inputs and outputs. However, ANFIS can struggle with high-dimensional datasets, and its efficacy declines as the number of input data points increases. ANFIS, like ANN, does not provide a built-in way to quantify uncertainty in its predictions.
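The five layers described above can be sketched as a single forward pass. The following is a minimal illustration rather than the configuration used in the study: it assumes Gaussian membership functions, a first-order Takagi-Sugeno consequent for each rule, and one rule per combination of membership functions. The function names, membership parameters, and consequent coefficients are all hypothetical.

```python
import math
from itertools import product


def gaussian_mf(x, centre, sigma):
    """Gaussian membership grade of x (one possible MF shape)."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def anfis_forward(x, mf_params, consequents):
    """Forward pass of a first-order Sugeno ANFIS for one input vector x.

    mf_params[i] lists (centre, sigma) pairs for input i.
    consequents[k] holds [p_1, ..., p_m, r] for rule k, with rules ordered
    as itertools.product enumerates the MF combinations.
    """
    # Layer 1: membership grade of each input in each of its fuzzy sets.
    grades = [[gaussian_mf(xi, c, s) for c, s in mfs]
              for xi, mfs in zip(x, mf_params)]
    # Layer 2: firing strength of each rule (product of incoming grades).
    firing = [math.prod(combo) for combo in product(*grades)]
    # Layer 3: normalized firing strengths.
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4: each rule's weighted linear consequent.
    rule_outputs = [
        w * (sum(p * xi for p, xi in zip(coef[:-1], x)) + coef[-1])
        for w, coef in zip(normalized, consequents)
    ]
    # Layer 5: overall output is the sum of all rule outputs.
    return sum(rule_outputs), normalized


# Hypothetical parameters for two inputs (d10 in mm, n), two MFs each.
mf_params = [[(0.10, 0.10), (0.40, 0.10)], [(0.30, 0.05), (0.42, 0.05)]]
consequents = [[0.2, 0.1, 0.0], [0.3, 0.2, 0.0],
               [0.6, 0.1, 0.0], [0.8, 0.2, 0.0]]
k_estimate, weights = anfis_forward((0.20, 0.36), mf_params, consequents)
print(round(k_estimate, 4), [round(w, 3) for w in weights])
```

In a trained ANFIS the membership parameters and consequent coefficients are fitted to data; here they are fixed by hand only to show how the layers connect.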
ANFIS modelling strategy
Gaussian process regression
GPR modelling strategy
Random forest
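As a rough illustration of the idea behind RF regression (fitting trees to resampled data and averaging their predictions), the sketch below bags depth-1 regression trees. It is a deliberate simplification and not the model used in the study: practical random forests grow deeper trees and also draw a random subset of features at each split. All function names and sample values are hypothetical.

```python
import random
from statistics import mean


def fit_stump(X, y):
    """Fit a depth-1 regression tree: the single split with the lowest SSE."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[f] <= thr]
            right = [yi for row, yi in zip(X, y) if row[f] > thr]
            ml, mr = mean(left), mean(right)
            sse = (sum((v - ml) ** 2 for v in left)
                   + sum((v - mr) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, ml, mr)
    if best is None:  # no split possible: always predict the mean
        m = mean(y)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    f, thr, ml, mr = stump
    return ml if row[f] <= thr else mr


def fit_forest(X, y, n_trees=25, seed=0):
    """Fit each stump on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_stump(s, row) for s in forest)


# Made-up samples: (d10 in mm, n) -> K in cm/s.
X = [(0.10, 0.30), (0.15, 0.36), (0.30, 0.38), (0.40, 0.42)]
y = [0.01, 0.02, 0.20, 0.30]
forest = fit_forest(X, y)
print(predict_forest(forest, (0.12, 0.33)), predict_forest(forest, (0.35, 0.40)))
```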
RF modelling strategy
Performance evaluation of developed models
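The seven indicators named earlier (R2, RMSE, RAE, MAE, NSE, RMSLE, and KGE) can be computed as in the sketch below. The definitions are the commonly used ones and are an assumption here, since the study's own equations are not reproduced in this text. In particular, R2 is taken as the squared Pearson correlation, RMSLE uses log(1 + value), and KGE uses the ratio of standard deviations as its variability term. The function names and sample values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def evaluate(obs, pred):
    """Return the seven indicators under the assumed standard definitions."""
    n = len(obs)
    mo, mp = _mean(obs), _mean(pred)
    abs_err = [abs(o - p) for o, p in zip(obs, pred)]
    sq_err = [(o - p) ** 2 for o, p in zip(obs, pred)]
    r = pearson_r(obs, pred)
    std_o = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    std_p = math.sqrt(sum((p - mp) ** 2 for p in pred) / n)
    log_sq = [(math.log1p(p) - math.log1p(o)) ** 2 for o, p in zip(obs, pred)]
    return {
        "R2": r ** 2,
        "RMSE": math.sqrt(sum(sq_err) / n),
        "RAE": sum(abs_err) / sum(abs(o - mo) for o in obs),
        "MAE": sum(abs_err) / n,
        "NSE": 1 - sum(sq_err) / sum((o - mo) ** 2 for o in obs),
        "RMSLE": math.sqrt(sum(log_sq) / n),
        "KGE": 1 - math.sqrt((r - 1) ** 2 + (std_p / std_o - 1) ** 2
                             + (mp / mo - 1) ** 2),
    }


# Made-up observed and predicted K values (cm/s), for illustration only.
observed = [0.010, 0.050, 0.120, 0.342]
predicted = [0.015, 0.045, 0.130, 0.320]
for name, value in evaluate(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```

R2, NSE, and KGE approach 1 for a good model, while RMSE, RAE, MAE, and RMSLE approach 0.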




RESULTS AND DISCUSSION
The number of training samples influences how well a model estimates K in the porous medium. A total of 270 samples were selected for the study, of which 200 (approximately 75%) were used for training the models, while the remaining 70 (approximately 25%) were used for testing. In the present study, four input parameters (d10, d50, Cu, and n) are used to predict K. The first part of this section presents the statistical analysis of the dataset employed for model development and validation, while the subsequent part focuses on K estimation through the various models, along with their quantitative assessment.
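A 200/70 partition of the 270 samples can be reproduced with a seeded shuffle, as in the short sketch below. Whether the study split the data randomly is not stated, so the random assignment, the seed, and the function name are assumptions made for illustration.

```python
import random


def split_indices(n_samples, n_train, seed=42):
    """Shuffle sample indices reproducibly and split them into two groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(270, 200)
# Sizes of the two groups and the training fraction.
print(len(train_idx), len(test_idx), round(len(train_idx) / 270, 3))
# prints: 200 70 0.741
```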
Statistical analysis
Statistics summary of the characteristics of soil samples
Characteristics | Minimum | Maximum | Mean | Standard deviation |
---|---|---|---|---|
d10 (mm) | 0.100 | 0.409 | 0.201 | 0.093 |
n (dimensionless) | 0.286 | 0.426 | 0.358 | 0.028 |
d50 (mm) | 0.289 | 1.450 | 0.660 | 0.309 |
Cu (dimensionless) | 2.150 | 9.567 | 4.691 | 1.559 |
K (cm/s) | 0.010 | 0.342 | 0.067 | 0.068 |
Correlation matrix between various soil parameters and K
Parameter | d10 | n | d50 | Cu | K |
---|---|---|---|---|---|
d10 | 1 | | | | |
n | 0.13 | 1 | | | |
d50 | 0.91 | −0.22 | 1 | | |
Cu | −0.18 | −0.96 | 0.17 | 1 | |
K | 0.92 | 0.70 | 0.82 | −0.65 | 1 |
Correlation heatmap between the dependent and independent parameters.
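A lower-triangular matrix like the one above can be produced from column data as sketched here. The sketch assumes that the reported coefficients are Pearson correlations, which the text does not state explicitly. The sample values are invented for illustration and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping column name to values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Invented sample columns (not the study's measurements).
samples = {
    "d10": [0.10, 0.15, 0.20, 0.30, 0.40],
    "n": [0.30, 0.36, 0.33, 0.40, 0.38],
    "K": [0.01, 0.03, 0.05, 0.15, 0.30],
}
matrix = correlation_matrix(samples)
for name, row in matrix.items():
    print(name, [round(row[other], 2) for other in samples])
```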
Estimation of K using ANN
ANN models' performance evaluation parameter of the training and testing datasets
Statistical indicators | ANN 3 training | ANN 3 testing | ANN 4 training | ANN 4 testing | ANN 5 training | ANN 5 testing | ANN 6 training | ANN 6 testing |
---|---|---|---|---|---|---|---|---|
R2 | 0.930 | 0.882 | 0.927 | 0.895 | 0.951 | 0.902 | 0.947 | 0.889 |
RMSE | 0.0191 | 0.0223 | 0.0195 | 0.0217 | 0.0157 | 0.0206 | 0.0159 | 0.0218 |
RAE | 0.1278 | 0.2236 | 0.2048 | 0.2713 | 0.1458 | 0.1982 | 0.1567 | 0.2158 |
MAE | 0.0070 | 0.0121 | 0.0112 | 0.0147 | 0.0080 | 0.0108 | 0.0075 | 0.0117 |
NSE | 0.9231 | 0.8777 | 0.9199 | 0.8842 | 0.9464 | 0.8928 | 0.9464 | 0.8838 |
RMSLE | 0.0159 | 0.0197 | 0.0163 | 0.0194 | 0.0129 | 0.0181 | 0.0135 | 0.0191 |
KGE | 0.8762 | 0.8378 | 0.8777 | 0.8594 | 0.9237 | 0.8646 | 0.9200 | 0.8575 |
Scatter plot between the experimentally measured and model predicted K values for the training (left) and testing (right) stages. (a) ANN 3, (b) ANN 4, (c) ANN 5, and (d) ANN 6.
Box plot of the experimentally measured and ANN model predicted K values for the training (left) and testing (right) stages.
Violin plot of the experimentally measured and ANN model predicted K values for the training (left) and testing (right) stages.
Estimation of K using ANFIS
Assessment metrics for ANFIS model performance on the training and testing datasets
Statistical indicators | ANFIS triangular training | ANFIS triangular testing | ANFIS Gaussian training | ANFIS Gaussian testing | ANFIS Gbell training | ANFIS Gbell testing |
---|---|---|---|---|---|---|
R2 | 0.965 | 0.919 | 0.973 | 0.924 | 0.970 | 0.918 |
RMSE | 0.0134 | 0.0197 | 0.0112 | 0.0179 | 0.0125 | 0.0189 |
RAE | 0.0953 | 0.1785 | 0.0840 | 0.1612 | 0.0774 | 0.1763 |
MAE | 0.0052 | 0.0097 | 0.0046 | 0.0088 | 0.0048 | 0.0096 |
NSE | 0.9621 | 0.9060 | 0.9735 | 0.9215 | 0.9681 | 0.9121 |
RMSLE | 0.0110 | 0.0168 | 0.0093 | 0.0152 | 0.0087 | 0.0161 |
KGE | 0.9293 | 0.8561 | 0.9725 | 0.9373 | 0.9553 | 0.9274 |
Scatter plot between the experimentally measured and model predicted K values for the training (left) and testing (right) stages. (a) ANFIS triangular, (b) ANFIS Gaussian, and (c) ANFIS Gbell.
Box plot of the experimentally measured and ANFIS model predicted K values for the training (left) and testing (right) stages.
Violin plot of the experimentally measured and ANFIS model predicted K values for the training (left) and testing (right) stages.
Estimation of K using GPR
Performance evaluation parameter of the GPR model for the training and testing datasets
Statistical indicators | Rational quadratic GPR training | Rational quadratic GPR testing | Squared exponential GPR training | Squared exponential GPR testing | Matern 5/2 GPR training | Matern 5/2 GPR testing | Exponential GPR training | Exponential GPR testing |
---|---|---|---|---|---|---|---|---|
R2 | 0.970 | 0.954 | 0.974 | 0.951 | 0.979 | 0.962 | 0.995 | 0.985 |
RMSE | 0.0109 | 0.0144 | 0.0111 | 0.0147 | 0.0099 | 0.0127 | 0.0049 | 0.0085 |
RAE | 0.0943 | 0.1540 | 0.0976 | 0.1616 | 0.0773 | 0.1240 | 0.0317 | 0.0767 |
MAE | 0.0052 | 0.0084 | 0.0053 | 0.0088 | 0.0042 | 0.0067 | 0.0017 | 0.0042 |
NSE | 0.9747 | 0.9493 | 0.9742 | 0.9466 | 0.9792 | 0.9604 | 0.9948 | 0.9822 |
RMSLE | 0.0091 | 0.0124 | 0.0092 | 0.0127 | 0.0082 | 0.0109 | 0.0041 | 0.0075 |
KGE | 0.9784 | 0.9485 | 0.9782 | 0.9465 | 0.9805 | 0.9527 | 0.9874 | 0.9655 |
Scatter plot between the experimentally measured and model predicted K values for the training (left) and testing (right) stages. (a) Rational quadratic GPR, (b) squared exponential GPR, (c) Matern 5/2 GPR, and (d) exponential GPR model.
Box plot of the experimentally measured and GPR model predicted K values for the training (left) and testing (right) stages.
Violin plot of the experimentally measured and GPR model predicted K values for the training (left) and testing (right) stages.
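To make the role of the kernel concrete, the sketch below computes a GPR posterior mean and variance with an exponential kernel, k(a, b) = s² exp(−‖a − b‖ / ℓ), using only the standard library. It is an illustration under assumed hyperparameters (length scale, signal and noise standard deviations chosen by hand), not the fitted model from the study, where such values would normally be estimated from the training data. The sample values and function names are hypothetical.

```python
import math


def exponential_kernel(a, b, length_scale=1.0, signal_std=1.0):
    """Exponential kernel based on the Euclidean distance between a and b."""
    return signal_std ** 2 * math.exp(-math.dist(a, b) / length_scale)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gpr_predict(X_train, y_train, x_new, noise_std=1e-3, **kernel_args):
    """Posterior mean and variance at x_new for a constant-mean GP."""
    n = len(X_train)
    K = [[exponential_kernel(X_train[i], X_train[j], **kernel_args)
          + (noise_std ** 2 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k_star = [exponential_kernel(xi, x_new, **kernel_args) for xi in X_train]
    y_mean = sum(y_train) / n
    alpha = solve(K, [yi - y_mean for yi in y_train])
    mean = y_mean + sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    prior_var = exponential_kernel(x_new, x_new, **kernel_args)
    var = prior_var - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)


# Made-up one-dimensional example: d10 (mm) -> K (cm/s).
X_train = [(0.10,), (0.20,), (0.30,), (0.40,)]
y_train = [0.01, 0.04, 0.12, 0.30]
mean, var = gpr_predict(X_train, y_train, (0.25,), length_scale=0.2)
print(round(mean, 4), round(var, 4))
```

The predictive variance is what distinguishes GPR from the other three techniques here: it is close to zero at training points and grows as the query moves away from them.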
Estimation of K using RF regression
Scatter plot between the experimentally measured and model predicted K values for the training (left) and testing (right) stages for RF.
Box plot of the experimentally measured and RF model predicted K values for the training (left) and testing (right) stages.
Violin plot of the experimentally measured and RF model predicted K values for the training (left) and testing (right) stages.
Sensitivity analysis
A sensitivity analysis was conducted on the best-performing ANN, ANFIS, GPR, and RF models to evaluate the influence of each input parameter. Each model uses four input parameters (d10, n, d50, and Cu), whose significance was assessed by excluding them one at a time and re-evaluating the model. Table 6 shows that d10 and d50 are the most sensitive variables: removing either one led to a substantial decrease in R2 and an increase in RMSE. In contrast, excluding n or Cu had a comparatively small impact on model performance, suggesting lower sensitivity. These results highlight that d10 and d50 have a substantial influence on model performance, underscoring the importance of determining them carefully.
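The exclusion procedure can be sketched as a loop over input subsets, each with one variable left out, plus the full set as a baseline. The sketch below uses a one-nearest-neighbour regressor purely as a stand-in model so that it stays self-contained; it is not one of the four techniques from the study. The synthetic data are constructed so that the target depends only on the first column, and the column named "n" is an arbitrary pattern.

```python
import math
from itertools import combinations


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nearest_neighbour_predict(X_train, y_train, row):
    """Stand-in model: return the target of the closest training row."""
    distances = [math.dist(row, other) for other in X_train]
    return y_train[distances.index(min(distances))]


def input_sensitivity(names, X_train, y_train, X_test, y_test):
    """Test RMSE for every subset with one input removed, plus the full set."""
    all_cols = tuple(range(len(names)))
    subsets = list(combinations(all_cols, len(names) - 1)) + [all_cols]
    scores = {}
    for cols in subsets:
        train = [[row[c] for c in cols] for row in X_train]
        test = [[row[c] for c in cols] for row in X_test]
        pred = [nearest_neighbour_predict(train, y_train, row) for row in test]
        scores[", ".join(names[c] for c in cols)] = rmse(y_test, pred)
    return scores


# Synthetic data: the target equals the first column; the second is unrelated.
rows = [(i / 10, (i * 7 % 10) / 10) for i in range(10)]
targets = [first for first, _ in rows]
scores = input_sensitivity(["d10", "n"], rows[0::2], targets[0::2],
                           rows[1::2], targets[1::2])
for inputs, error in scores.items():
    print(f"inputs: {inputs:8s} RMSE: {error:.3f}")
```

A large rise in RMSE when a variable is dropped marks it as influential, which is the same reading applied to Table 6.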
Sensitivity analysis of the developed models
Variables | ANN R2 | ANN RMSE | ANFIS R2 | ANFIS RMSE | GPR R2 | GPR RMSE | RF R2 | RF RMSE |
---|---|---|---|---|---|---|---|---|
d10, n, d50 | 0.880 | 0.0279 | 0.919 | 0.0137 | 0.962 | 0.0110 | 0.926 | 0.0219 |
n, d50, Cu | 0.852 | 0.0730 | 0.887 | 0.0557 | 0.921 | 0.0279 | 0.901 | 0.0347 |
d10, d50, Cu | 0.891 | 0.0316 | 0.925 | 0.0221 | 0.954 | 0.0103 | 0.914 | 0.0279 |
n, d10, Cu | 0.837 | 0.0654 | 0.853 | 0.0706 | 0.917 | 0.0215 | 0.871 | 0.0423 |
d10, n, d50, Cu | 0.902 | 0.0206 | 0.924 | 0.0179 | 0.98 | 0.0085 | 0.94 | 0.0197 |
Radar plot illustrating variation in performance of the developed models based on the key statistical indicators.
Radar plot highlighting maximum performance indicators (R2, KGE, and NSE) (left) and variation in error metrics (RMSE, RMSLE, MAE, and RAE) (right) for the developed models.
Comparison of the experimentally observed and model predicted K values.
Taylor diagram showing different models developed for hydraulic conductivity prediction.
CONCLUSION
In the present study, four distinct machine learning techniques, i.e., ANN, ANFIS, GPR, and RF, are used. Based on the correlation analysis, the parameters that were most significant in predicting the K of the porous media are Cu, n, d10, and d50. The predictive performance of the developed models was compared with measured K values using various statistical indicators (R2, RMSE, MAE, RAE, NSE, RMSLE, and KGE) and scatter plots. The performance of all the developed models was satisfactory, but the GPR and RF models have higher prediction efficiency than the ANN and ANFIS models. The exponential GPR model outperformed the other models, achieving R2, RMSE, RAE, MAE, NSE, RMSLE, and KGE values of 0.985, 0.0085, 0.0767, 0.0042, 0.9822, 0.0075, and 0.9655, respectively, during validation, making it the most reliable method in the current study. The outcomes of the study highlight the potential of machine learning techniques in reducing uncertainties and overcoming traditional limitations in modelling soil hydraulic properties. The results and methods from the present study can be applied to address similar environmental problems. The developed models can be utilized for predicting the hydraulic conductivity of diverse media, supporting groundwater management, soil restoration, and agricultural planning. However, the relatively small dataset used in this study for modelling K may not adequately reflect the full range of porous media conditions. Evaluating the machine learning techniques with a larger dataset could provide further insights, and future studies may assess their performance in addressing other hydrological and environmental challenges.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.