The escalating challenge of water scarcity demands advanced methodologies for sustainable water management, particularly in agriculture. Machine learning (ML) has become a crucial tool in optimizing the hydrological cycle within both natural and engineered environments. This review rigorously assesses various ML algorithms, including neural networks, decision trees, support vector machines, and ensemble methods, for their effectiveness in agricultural water management. By leveraging diverse data sources such as satellite imagery, climatic variables, soil properties, and crop yield data, the study highlights the frequent use and superior predictive accuracy of the random forest (RF) model. Additionally, artificial neural networks (ANNs) and support vector machines (SVM) show significant efficacy in specialized applications like evapotranspiration estimation and water stress prediction. The integration of ML techniques with real-time data streams enhances the precision of water management strategies. This review underscores the critical role of ML in advancing decision-making through the development of explainable artificial intelligence, which improves model interpretability and fosters trust in automated systems. The findings position ML models as indispensable for real-time, data-driven management of agricultural water resources, contributing to greater resilience and sustainability under the dynamic pressures of global environmental change.

  • Machine learning (ML) optimizes agricultural water management amid water scarcity.

  • Random forest is the most frequently used model and often the most accurate for predicting water-related outcomes.

  • Artificial neural networks and support vector machines excel in evapotranspiration estimation and water stress prediction.

  • ML models enhance precision with real-time data integration.

  • Continued tech integration and explainable artificial intelligence are essential.

ART: adaptive resonance theory
BWET: blue water evapotranspiration
BWFP: blue water footprint
CNN: convolutional neural network
DAT: data assimilation technique
DT: decision tree
ELM: extreme learning machine
GAM: generalized additive model
GBR: gradient boosting regressor
GWET: green water evapotranspiration
GWFP: green water footprint
KNN: K-nearest neighbors
LR: linear regression
LSTM: long short-term memory
MAE: mean absolute error
MAPE: mean absolute percentage error
MLP: multi-layer perceptron
RAE: relative absolute error
RFR: random forest regressor
RL: reinforcement learning
RRSE: root relative squared error
RSS: residual sum of squares
SVR: support vector regression
SWAT: soil and water assessment tool
WF: water footprint

To enhance crop water productivity (CWP), two main strategies can be employed: increasing crop yields without raising water consumption or reducing water usage while sustaining or improving yields (Foley et al. 2020). On a global scale, improving CWP requires a thorough understanding of water usage patterns, including where and how water is utilized and its variability in producing specific crop quantities (Blatchford et al. 2018; Kilemo 2022). In recent years, the intersection of machine learning (ML) techniques and agricultural water management has garnered significant attention due to its potential to address pressing challenges in sustainable food production. The increasing strain on water resources due to population growth, climate variability, and competing sectoral demands makes optimizing agricultural water usage critical for food security and sustainability (Hameed et al. 2019; Jung et al. 2021). ML provides a data-driven approach to addressing these challenges by developing predictive models and optimizing resource management (Sarker 2021; Umutoni & Samadi 2024). Advanced algorithms such as neural networks, decision trees, and support vector machines (SVM) enable researchers to analyze water use patterns and develop innovative solutions for mitigating water-related risks (Veeragandham & Santhi 2020).

Water is a fundamental resource in agriculture, essential for crop growth, livestock production, and ecosystem health. The efficient and sustainable use of water is critical to ensure food security, support rural livelihoods, and preserve natural ecosystems. However, water scarcity, pollution, and competition from other sectors pose significant challenges to agricultural water management (D'Odorico et al. 2020). In many regions, unsustainable water practices, such as excessive irrigation and groundwater depletion, threaten the long-term viability of agricultural systems. Addressing these challenges requires innovative approaches that optimize water usage, improve water productivity, and promote conservation practices that minimize environmental impacts (Kılıç 2020).

While ML offers transformative potential, its application in agriculture faces significant hurdles, including the need for high-quality data, model interpretability, and integration with existing practices. This review addresses these challenges by showcasing original achievements, such as the superior performance of ensemble methods (e.g., RF, light gradient boosting machine (LightGBM)) in predicting water footprints and evapotranspiration, and the integration of real-time data streams to improve decision-making. These advancements demonstrate how ML can overcome traditional limitations, offering robust solutions for sustainable water management.

ML is a subfield of artificial intelligence (AI) that focuses on developing algorithms and statistical models that enable computers to learn from and make predictions or decisions based on data (Lee et al. 2017). In the context of agricultural water management, ML plays a crucial role in analyzing vast datasets related to water usage patterns, soil characteristics, climate variables, and crop growth parameters (Sahoo et al. 2017; Benos et al. 2021). By leveraging ML techniques, such as pattern recognition and predictive modeling, researchers can gain valuable insights into the complex interactions between these factors and optimize water resource allocation for sustainable agricultural practices. Recent advancements in ML have introduced sophisticated models such as EMD-LSTM, LSTM-INFO, RVFL-EROA, ANFIS-WCAMFO, and ANN-RUNAO, which show promise in time series modeling for hydrological and environmental applications (Li et al. 2018; Ikram et al. 2023; Adnan et al. 2024, 2025). However, their application in agricultural water management remains limited due to a lack of validation studies and practical implementation in this domain.

Despite the growing interest and potential benefits, the application of ML in agricultural water management presents several challenges and opportunities that warrant further investigation (Li et al. 2023). These include the need for robust data collection and quality assurance protocols, the development of interpretable and transparent models, and the integration of socio-economic factors and stakeholder perspectives into decision-support frameworks. Moreover, as climate change exacerbates water scarcity and variability, there is a pressing need to enhance the resilience of agricultural systems through adaptive management strategies informed by advanced analytics (Maleksaeidi & Karami 2013; Jung et al. 2021).

In this review, we investigate the application of ML techniques in agricultural water management, with a focus on optimizing water allocation, enhancing decision-making, and mitigating climate-related risks. In the following section, we outline the key ML techniques employed in this domain and discuss their effectiveness based on recent studies.

Our objective is to synthesize key findings from these investigations, providing a comprehensive overview of the current state of knowledge in this rapidly evolving field. The novelty of this work lies in evaluating diverse ML techniques, emphasizing ensemble methods and real-time data integration, to enhance agricultural water management, sustainability, and resilience under climate variability.

ML techniques in agricultural water usage

ML offers several advantages for analyzing water usage patterns and optimizing resource allocation in agriculture (Sun & Scanlon 2019; Latif et al. 2023). First, it can handle large and complex datasets with high dimensionality, which are common in agricultural systems due to the multitude of interacting variables. Second, ML algorithms are capable of detecting subtle patterns and trends in data that may not be apparent through traditional statistical methods. This ability is particularly valuable for identifying factors influencing water usage and predicting future water demand. Additionally, ML models can adapt and improve over time as more data becomes available (Horvitz & Mulligan 2015), making them well suited for dynamic and evolving agricultural environments.

ML techniques have emerged as powerful tools in addressing the complexities of agricultural water usage. These techniques leverage algorithms and computational models to analyze vast datasets and extract valuable insights into water consumption patterns, irrigation practices, and crop water requirements (Elbeltagi et al. 2020b; Abdel-Hameed et al. 2024; Umutoni & Samadi 2024). By employing methods such as predictive modeling, classification, and clustering, researchers can develop accurate models that forecast future water demands, optimize irrigation scheduling, and identify areas for water conservation. Moreover, ML enables the integration of diverse data sources, including satellite imagery, weather forecasts, soil moisture measurements, and crop characteristics, to enhance the precision and efficiency of water management strategies in agriculture.

Role of ML in water management

ML models have been widely used in water resource management due to their ability to process large datasets and extract meaningful insights. These models facilitate informed decision-making, optimize water allocation, and improve irrigation efficiency. Furthermore, ML enables the development of adaptive management frameworks that respond to changing environmental conditions and evolving agricultural needs. By integrating ML into water management systems, stakeholders can better understand the dynamics of water availability and usage, leading to more effective and resilient agricultural practices (Sun & Scanlon 2019; Fu et al. 2022).

Commonly used techniques

Several ML algorithms are frequently employed in agricultural water management analysis. This discussion highlights the most commonly used ML models in the field, as reviewed in the literature, and their typical applications in agriculture. These models have been employed to address a wide range of challenges, from optimizing irrigation schedules to predicting crop yields and water footprints.

Random forest

Random forest (RF) is an ensemble learning (EL) technique that constructs multiple decision trees to improve prediction accuracy and model stability. This method introduces two key sources of randomness: (1) each decision tree is grown from a randomly drawn set of training samples, which diversifies the data seen by the individual trees, and (2) a random subset of attributes is considered at each node during tree building, and the best attribute in that subset is chosen for the split (see Figure 1) (Magidi et al. 2021; Shao et al. 2021). The training samples are drawn by bootstrap sampling, in which multiple subsets of the original dataset are created by sampling with replacement. Each decision tree is trained on a different subset, and the final prediction is determined by aggregating the outputs of all trees, using majority voting for classification tasks or averaging for regression tasks (Xu et al. 2021).
Figure 1

The training process of RF.
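
The two sources of randomness and the aggregation step described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: the helper names (`bootstrap_indices`, `random_feature_subset`, `aggregate`), the sample sizes, and the per-tree outputs are all hypothetical.

```python
import random
from collections import Counter
from statistics import mean

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct candidate features to consider at one split."""
    return sorted(rng.sample(range(n_features), k))

def aggregate(predictions, task):
    """Combine per-tree outputs: majority vote or arithmetic mean."""
    if task == "classification":
        return Counter(predictions).most_common(1)[0][0]
    return mean(predictions)

rng = random.Random(0)
rows = bootstrap_indices(10, rng)            # rows used to grow one tree
features = random_feature_subset(6, 2, rng)  # candidates at one node

# Hypothetical outputs of three trees for a single field
vote = aggregate(["stressed", "ok", "stressed"], "classification")
avg = aggregate([4.0, 5.0, 6.0], "regression")
```

Because rows are drawn with replacement, some observations typically appear more than once in a tree's sample and others not at all, which is what makes the individual trees differ from one another.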


Artificial neural networks

ANNs are computational models inspired by the human brain's structure, consisting of an input layer, one or more hidden layers, and an output layer. Each node in the hidden layers applies nonlinear transformations to weighted inputs, enabling ANNs to capture complex patterns in data (see Figure 2) (Khairunniza-Bejo et al. 2014; Kumar et al. 2020). The depth and number of hidden layers can be adjusted based on the complexity of the task, making ANNs highly adaptable for diverse applications such as crop yield prediction, evapotranspiration estimation, and soil moisture modeling.
Figure 2

Neural network with two hidden layers.
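
As a rough illustration of the layered structure described above, the following sketch runs a forward pass through a network with two hidden layers using only the Python standard library. The layer sizes, the `tanh` activation, the random untrained weights, and the three input features are assumptions chosen for the example; a real application would train the weights on observed data.

```python
import math
import random

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: activation(W x + b)."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(42)
w1, b1 = init_layer(3, 4, rng)   # input layer (3 features) -> hidden layer 1
w2, b2 = init_layer(4, 4, rng)   # hidden layer 1 -> hidden layer 2
w3, b3 = init_layer(4, 1, rng)   # hidden layer 2 -> single output

def forward(x):
    h1 = dense(x, w1, b1, math.tanh)   # nonlinear transform of weighted inputs
    h2 = dense(h1, w2, b2, math.tanh)
    return dense(h2, w3, b3)[0]        # linear output, as used for regression

# Hypothetical scaled inputs: temperature, humidity, solar radiation
y_hat = forward([0.6, 0.3, 0.8])
```

Adding or removing `dense` calls in `forward` changes the depth of the network, which is the adjustment to task complexity mentioned in the text.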


Support vector machines

The SVM is a powerful supervised learning algorithm widely used for classification and regression. It identifies an optimal hyperplane that maximizes the margin between different classes in classification tasks or fits within a defined error margin for regression, enhancing model accuracy and generalization. To handle nonlinear relationships, SVM employs kernel functions that transform data into a higher-dimensional space, allowing for more effective pattern recognition (Guerrero et al. 2012; Löw et al. 2013). In agricultural water management, SVM is commonly used for crop classification, soil moisture prediction, and water stress detection, providing accurate insights even with complex and high-dimensional datasets. The SVM model seeks to maximize the margin between classified data points while minimizing misclassification errors. The optimization function is given by:
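$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^{*}\right) \tag{1}$$
where $\lVert w \rVert$ represents the norm of the weight vector, $C$ is a regularization parameter that balances model complexity and training error, and $\xi_i$ and $\xi_i^{*}$ are slack variables measuring errors that fall outside the margin.
Subject to the constraints:
$$y_i - (w^{\top} x_i + b) \le \varepsilon + \xi_i, \qquad (w^{\top} x_i + b) - y_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i,\ \xi_i^{*} \ge 0 \tag{2}$$

Here $x_i$ and $y_i$ are the inputs and the observed value of sample $i$, and $b$ is the bias term. These constraints ensure that the predictions do not deviate from the actual values by more than the specified $\varepsilon$, allowing some flexibility through the slack variables for points that are otherwise hard to fit under this strict margin (Shrestha & Shukla 2015).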
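
To make the role of the tolerance and the slack variables concrete, the sketch below evaluates the regression form of the SVM objective for a fixed linear model. It is an illustrative calculation only: the function name `svr_objective`, the tolerance argument `epsilon`, and the toy data are assumptions, and the slack of each point is taken as the part of its absolute residual that exceeds the tolerance.

```python
def svr_objective(w, b, X, y, C=1.0, epsilon=0.1):
    """Value of 0.5 * ||w||^2 + C * (total slack) for a linear model."""
    slack_total = 0.0
    for x_i, y_i in zip(X, y):
        pred = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        # Only the part of the residual beyond the tolerance is penalised
        slack_total += max(0.0, abs(y_i - pred) - epsilon)
    return 0.5 * sum(w_j * w_j for w_j in w) + C * slack_total

# Hypothetical data: one input feature, three observations
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.05, 7.0]

# Model y = 2x: residuals are 0, 0.05 and 1.0, so only the last
# point lies outside the 0.1 tolerance (slack 0.9)
objective = svr_objective([2.0], 0.0, X, y, C=10.0, epsilon=0.1)
```

A larger `C` makes points outside the tolerance more costly, while a larger `epsilon` lets more points count as fitted without penalty.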

Light gradient boosting machine

LightGBM is an efficient gradient boosting framework that uses tree-based learning algorithms. It is designed for distributed and efficient training, particularly on large datasets (Fan et al. 2019). LightGBM improves on traditional gradient boosting methods by using a histogram-based algorithm for faster processing and reduced memory usage. It can also handle large volumes of data while maintaining high accuracy, and it manages categorical features efficiently (Ustuner & Sanli 2019; McCarty et al. 2020).
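
The histogram idea can be illustrated with a simplified stand-in: a continuous feature is bucketed into a small number of bins, and only the bin boundaries are scanned as candidate splits. The sketch below is not LightGBM's implementation; the equal-width binning, the use of target sums in place of gradient statistics, the function names, and the data are all assumptions made for illustration.

```python
def build_histogram(values, targets, n_bins):
    """Bucket a feature into equal-width bins; keep count and target sum per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    for v, t in zip(values, targets):
        b = min(int((v - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += t
    return counts, sums

def best_bin_split(counts, sums):
    """Scan bin boundaries only. The score sum_left^2/n_left + sum_right^2/n_right
    is largest where the squared error around the two group means is smallest."""
    total_n, total_s = sum(counts), sum(sums)
    best_score, best_bin = float("-inf"), None
    n_left, s_left = 0, 0.0
    for b in range(len(counts) - 1):
        n_left += counts[b]
        s_left += sums[b]
        n_right = total_n - n_left
        if n_left == 0 or n_right == 0:
            continue
        score = s_left ** 2 / n_left + (total_s - s_left) ** 2 / n_right
        if score > best_score:
            best_score, best_bin = score, b
    return best_bin

# Hypothetical feature values and targets
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
targets = [5.0, 5.2, 4.9, 1.0, 1.1, 0.9]
counts, sums = build_histogram(values, targets, n_bins=4)
split_bin = best_bin_split(counts, sums)   # split after this bin
```

With the histogram in place, the cost of the scan depends on the number of bins rather than the number of observations, which is where the speed and memory savings come from.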

Regression trees

Regression trees (RT) are decision trees designed specifically for continuous outcome variables (regression problems). The tree splits the data into branches to form homogeneous groups with similar target values. At each node, the tree chooses the split that minimizes the variance in the target variable, ultimately leading to a model that predicts the mean of each group. This method is particularly useful for capturing nonlinear relationships and interactions between features (Mokhtar et al. 2021).
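
A single variance-minimizing split, as described above, might be found as follows. This is a minimal sketch for one feature; the function names and the data are hypothetical, and the sum of squared deviations within each group is used as the variance criterion.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, left_mean, right_mean) minimising the total SSE,
    or None when the feature has a single distinct value."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, mean(left), mean(right))
    return None if best is None else best[1:]

# Hypothetical data: one climatic feature and a continuous target
x = [10, 12, 14, 30, 32, 34]
y = [2.0, 2.2, 2.1, 5.0, 5.1, 4.9]
threshold, left_mean, right_mean = best_split(x, y)
```

A full regression tree applies the same search recursively to each resulting group and predicts the group mean at every leaf.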

Bagging (bootstrap aggregating)

Bagging, or bootstrap aggregating, is an EL technique that improves the stability and accuracy of ML models, particularly useful for high-variance models like decision trees (Schick et al. 2016; Winkler et al. 2018; Afrifa et al. 2022; Afrifa et al. 2023a). The process involves three main steps:

Bootstrap sample: Generate multiple subsets of the original dataset by sampling with replacement, creating diverse training sets (B1, B2, and B3).

Model training: Train individual models (M1, M2, and M3) on each subset in parallel.

Aggregation/voting: Combine the predictions from all models either by averaging (for regression) or voting (for classification) to produce a final output (Figure 3).
Figure 3

Bagging process flowchart.


This method reduces variance and helps prevent overfitting, enhancing the model's performance on unseen data. The diagram illustrates these steps, showing how data flows from the initial dataset through model training to the final aggregated output (Das et al. 2022).
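
The three steps above can be sketched as follows. The base learner here is a one-nearest-neighbor predictor, chosen only because it is a short, high-variance model; the function names, the number of models, and the data pairs are assumptions for illustration.

```python
import random
from statistics import mean

def fit_1nn(sample):
    """High-variance base learner: predict the target of the nearest x."""
    def predict(x_new):
        return min(sample, key=lambda p: abs(p[0] - x_new))[1]
    return predict

def bagging_fit(data, n_models, rng):
    models = []
    for _ in range(n_models):
        # Step 1: bootstrap sample (same size, drawn with replacement)
        sample = [rng.choice(data) for _ in data]
        # Step 2: train one model on each sample
        models.append(fit_1nn(sample))
    return models

def bagging_predict(models, x_new):
    # Step 3: aggregate by averaging (regression)
    return mean(m(x_new) for m in models)

# Hypothetical (feature, target) pairs
data = [(10, 30.0), (15, 26.0), (20, 21.0), (25, 17.0), (30, 12.0)]
models = bagging_fit(data, n_models=25, rng=random.Random(1))
prediction = bagging_predict(models, 18)
```

Each individual model returns one of the observed targets, whereas the averaged prediction varies more smoothly with the input, which illustrates the variance reduction.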

Decision trees

Decision trees are a non-parametric supervised learning method used for classification and regression. The model splits the data into subsets using a tree-like model of decisions, where nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome (Figure 4) (Murphy et al. 2016). Decision trees are easy to interpret and capable of handling both numerical and categorical data. They can easily overfit, but this can be mitigated by techniques such as pruning, setting the minimum samples per leaf, or using them as part of an ensemble method such as RF (Christias et al. 2020; Xu et al. 2021; Afrifa et al. 2023b).
Figure 4

Decision tree algorithm.
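
The node, branch, and leaf structure can be written out directly, which also shows why decision trees are easy to interpret. The tree below is hand-written for illustration rather than learned from data; the feature names, thresholds, and outcomes are hypothetical.

```python
# Internal nodes test a feature, branches encode the decision rule,
# and leaves hold the outcome.
tree = {
    "feature": "soil_moisture", "threshold": 20.0,
    "left": {   # soil_moisture <= 20.0
        "feature": "rain_forecast_mm", "threshold": 5.0,
        "left": {"leaf": "irrigate"},
        "right": {"leaf": "wait"},
    },
    "right": {"leaf": "no irrigation"},
}

def predict(node, sample):
    """Follow branches from the root until a leaf is reached."""
    while "leaf" not in node:
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]

def rules(node, path=()):
    """List every root-to-leaf path as a readable rule."""
    if "leaf" in node:
        return [" and ".join(path) + " -> " + node["leaf"]]
    f, t = node["feature"], node["threshold"]
    return (rules(node["left"], path + (f"{f} <= {t}",))
            + rules(node["right"], path + (f"{f} > {t}",)))

decision = predict(tree, {"soil_moisture": 15.0, "rain_forecast_mm": 2.0})
all_rules = rules(tree)
```

Pruning corresponds to replacing a subtree with a single leaf, which shortens the rule list at the cost of a coarser decision.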


Extreme gradient boosting

Extreme Gradient Boosting (XGBoost) is a highly efficient, optimized library that improves upon traditional gradient boosting methods. It is notable for its ability to handle missing data, implement regularization to prevent overfitting, and prune trees. XGBoost operates by combining ‘weak’ learners into a ‘strong’ learner through an additive process, enhancing both speed and predictive accuracy, which is essential in various industry applications (Ge et al. 2022; Huber et al. 2022).

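Predictions in XGBoost at any time step t are computed as (Fan et al. 2021):
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i) \tag{3}$$
where $f_t(x_i)$ is the current learner, and $\hat{y}_i^{(t-1)}$ is the previous prediction.
To minimize overfitting and assess model performance, XGBoost's objective function at step t is:
$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{k=1}^{t} \Omega(f_k) \tag{4}$$
where l is the loss function and $\Omega$ represents the regularization term:
$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}$$
in which $T$ is the number of leaves in the tree, $w$ is the vector of leaf weights, and $\gamma$ and $\lambda$ are penalty coefficients.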

This streamlined approach makes XGBoost a preferred choice for achieving superior results in data-driven competitions and practical applications.
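
The additive process, in which each new weak learner is added to the previous prediction, can be sketched with very simple learners. The example below is not the XGBoost library: it assumes a squared-error loss of the form 0.5 * (y - y_hat)^2, an L2 penalty `lam` on leaf weights, and two-leaf trees whose split thresholds are supplied by hand rather than searched. All names and data are hypothetical.

```python
def regularised_leaf_weight(residuals, lam):
    """Leaf weight for squared-error loss 0.5*(y - y_hat)^2 with an L2 penalty:
    sum(residuals) / (n + lam). The penalty shrinks the weight toward zero."""
    return sum(residuals) / (len(residuals) + lam)

def fit_stump(x, residuals, threshold, lam):
    """Two-leaf tree at a fixed threshold, fitted to the current residuals."""
    left = [r for xi, r in zip(x, residuals) if xi <= threshold]
    right = [r for xi, r in zip(x, residuals) if xi > threshold]
    w_left = regularised_leaf_weight(left, lam)
    w_right = regularised_leaf_weight(right, lam)
    return lambda xi: w_left if xi <= threshold else w_right

def boost(x, y, thresholds, lam=1.0):
    pred = [0.0] * len(y)                        # initial prediction
    for threshold in thresholds:                 # one weak learner per step t
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        f_t = fit_stump(x, residuals, threshold, lam)
        # Additive update: new prediction = previous prediction + f_t(x)
        pred = [pi + f_t(xi) for xi, pi in zip(x, pred)]
    return pred

def squared_error(y, pred):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred))

# Hypothetical data
x = [1, 2, 3, 4]
y = [1.0, 1.2, 3.0, 3.2]
pred_one = boost(x, y, thresholds=[2.5])
pred_three = boost(x, y, thresholds=[2.5, 2.5, 2.5])
```

Because of the penalty, a single step only partly closes the gap to the observations, and later steps fit what remains, so the training error falls as learners are added.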

Strengths and weaknesses of ML techniques

Various ML techniques, each with their own strengths, weaknesses, and typical applications in agriculture, are introduced in Table 1. This table provides a comprehensive summary derived from the review of several scholarly papers. The references for these studies, which discuss ML methods in detail, are elaborated upon in the subsequent section.

Table 1

ML techniques, strengths and weaknesses and their typical use in agriculture

Technique | Strengths | Weaknesses | Typical use cases
RF | High predictive accuracy, robust across datasets | Computationally intensive, less interpretable | Estimating water footprints, predicting evapotranspiration
ANNs | Capable of capturing complex nonlinear relationships; highly flexible | Requires large datasets and extensive training; prone to overfitting | Crop yield prediction, ETc estimation
SVM | Effective in high-dimensional spaces; robust against overfitting | Requires careful parameter tuning; not scalable to very large datasets | Water stress detection in crops, crop classification
LightGBM | Fast and efficient with large datasets; handles categorical features naturally | Can overfit on small datasets; sensitive to noisy data | Optimizing water use efficiency; soil moisture prediction
RT | Simple to understand and interpret; handles both numerical and categorical data | Prone to overfitting; unstable with small changes in data | Modeling relationships affecting crop health and yield
Bagging (Bootstrap Aggregating) | Reduces variance and avoids overfitting; improves model stability | Can be less interpretable; increase in computational burden | Enhancing prediction models' accuracy for crop disease detection
Decision trees | Easy to understand; no data scaling required | Sensitive to noisy data and prone to overfitting | Decision support for water management, threshold setting for irrigation
XGBoost | Highly efficient and flexible; excellent performance on structured data | Can overfit if not tuned properly; complex model tuning required | Enhancing irrigation management; resource allocation optimization

Recent studies on ML techniques in agricultural water management

This section provides a detailed analysis of recent scholarly papers focusing on the application of ML techniques in agricultural water management. Each paper is meticulously reviewed to elucidate the methodologies employed, the specific agricultural contexts in which ML was implemented, and the resultant findings. The discussion aims to critically evaluate the effectiveness of these techniques in enhancing water management practices in agriculture. Additionally, the insights and conclusions drawn from these studies are synthesized to understand the broader implications and potential future directions for integrating ML in agricultural water management.

Geng et al. (2023) found that ML model performance in simulating crop water footprints varies by crop type, spatial scale, and water supply scenario. The RF model was most effective for maize, soybean, and rice at the site scale, while LightGBM performed better for wheat. At the provincial scale, LightGBM was the most accurate model. The study also shows higher accuracy of ML models in irrigated scenarios compared to rainfed ones, highlighting the importance of irrigation. Spatiotemporal features were the most significant variables, indicating their usefulness in rapid WF estimation when detailed soil data are unavailable.

Azzam et al. (2022) evaluated ML models for estimating green and blue water evapotranspiration (GWET and BWET) in the Amu Darya River Basin. The RF model outperformed others, achieving correlation coefficients up to 0.99 and root mean square error (RMSE) as low as 0.2637 mm/day. ANN and SVM models were also effective but less consistent. The inclusion of precipitation data significantly improved GWET predictions, emphasizing its importance.

Fan et al. (2021) examined the performance of various ML models for estimating daily maize transpiration in Northwest China. The models evaluated included SVM, XGBoost, ANNs, and deep neural networks (DNN). Using a combination of meteorological data (Tmax, Tmin, RH, U, Rs), soil water content (SWC), and leaf area index (LAI), the study found that the DNN model achieved the highest accuracy, with R² values ranging from 0.816 to 0.954 and RMSE values between 0.344 and 0.621 mm/day. This research underscores the effectiveness of DNNs in modeling complex nonlinear relationships in transpiration estimation, offering significant potential for enhancing irrigation practices and water management in agriculture.

Granata's (2019) findings reveal the robustness of ML models in predicting ETa with high precision, where the M5P regression tree model exhibited the best performance across all tested models. This highlights the importance of a comprehensive input variable set in achieving high model accuracy, underscoring the critical role of detailed climatic and soil data in ETa estimation. The study also demonstrates the variability in model performance based on the complexity of the input variables, with simpler models still performing satisfactorily but less accurately than more complex ones.

Shrestha & Shukla (2015) explored the use of ML models to estimate crop coefficient (Kc) and crop evapotranspiration (ETc) for bell pepper and watermelon. The study compared the performance of SVM, ANNs, and relevance vector machines (RVM). The results demonstrated that the SVM model significantly outperformed the ANN and RVM models, achieving prediction errors of 2.6% for watermelon and 11.2% for bell pepper. The superior performance of the SVM model underscores its robustness and effectiveness in capturing the complex relationships between input variables and crop water use.

Elbeltagi et al. (2020b) evaluated various ML models for estimating the green and blue water footprints (GWFP and BWFP) of maize in the Nile Delta, Egypt. The study identified that ANNs achieved the highest predictive accuracy with coefficients of determination (R²) ranging from 0.94 to 0.99 and RMSE as low as 0.2637 mm/day. The inclusion of meteorological data such as temperature and precipitation significantly enhanced model performance, emphasizing the importance of comprehensive input data.

Mokhtar et al. (2021) investigated the performance of various ML models to estimate the blue and green water footprints (BWFP and GWFP) of rice in the Yunnan Province, southwest China. The study used four models: RT, RF, additive regression (AR), and reduced error pruning tree (REPT). Among these, the RT model in Scenario 1 (solar radiation, humidity, and vapor pressure deficit) outperformed others for BWFP estimation, with an RMSE value of 11.82 m³/ton and MAPE value of 0.5%. For GWFP, including precipitation in the input scenario significantly improved model accuracy. The study highlights the effectiveness of RT and RF models, particularly in scenarios with comprehensive climate and crop data, underscoring the potential of these models for reliable water footprint predictions in rice production under varying climatic conditions.

The findings of Geng et al. (2023) illustrate the potential of three EL algorithms, namely RF, bagging, and adaptive boosting (Ad), to estimate daily actual evapotranspiration (ETa) in tea plantations using various meteorological data scenarios. The RF model demonstrated superior performance, with RMSE values ranging from 0.41 to 0.56 mm/day, MAE from 0.32 to 0.42 mm/day, and R² between 0.84 and 0.91. Bagging and SVM also showed good performance, though the MLP model was the least accurate. These results underscore the RF model's effectiveness and stability in predicting daily ETa, particularly with comprehensive climate and plant data inputs.
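
The error statistics quoted throughout these studies (RMSE, MAE, MAPE, and R²) can be computed as in the sketch below. The definitions follow common usage, with R² taken as one minus the ratio of the residual sum of squares to the total sum of squares; individual studies may define R² differently (for example, as a squared correlation coefficient), and the observed and predicted values here are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error, in the units of the observations."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean absolute percentage error; undefined when an observation is 0."""
    return 100 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def r2(obs, pred):
    """Coefficient of determination: 1 - RSS / total sum of squares."""
    mean_obs = sum(obs) / len(obs)
    rss = sum((o - p) ** 2 for o, p in zip(obs, pred))
    tss = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - rss / tss

# Hypothetical daily evapotranspiration values (mm/day)
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "R2": r2(observed, predicted),
}
```

RMSE and MAE carry the units of the target variable, so values from studies that predict different quantities (for example, mm/day versus m³/ton) are not directly comparable, whereas MAPE and R² are unitless.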

Ge et al. (2022) evaluated the performance of several ML models for predicting daily evapotranspiration (ET) in greenhouse-grown tomatoes. The XGBoost model achieved the highest prediction accuracy, with a mean square error (MSE) of 0.032, RMSE of 0.163, and a coefficient of determination (R²) of 0.981. GBR and SVR also showed good performance but did not match the accuracy of XGBoost. This study underscores the importance of comprehensive input data, including net solar radiation, temperature, and relative humidity, to enhance model performance. Ge et al. highlight the potential of these ML models, particularly XGBoost, in optimizing irrigation practices and improving water resource management in greenhouse agriculture.

Elbeltagi et al. (2020a) demonstrate that DNNs effectively predict the water footprint under varying climatic scenarios, emphasizing the importance of precise and adaptable modeling techniques in agriculture. The study illustrates that DNN models can capture the complex interactions between climate factors and crop water use, providing reliable predictions that are crucial for strategic agricultural planning. The findings suggest that while DNNs offer significant advantages in modeling water footprints, their effectiveness depends on the quality of input data and the specific configurations of the neural network. This research underscores the potential of ML to enhance predictive accuracy and aid in the development of resilient agricultural practices against the backdrop of global climate change.

Dehghanisanij et al. (2022) demonstrated that a hybrid ML approach combining an adaptive neuro-fuzzy inference system (ANFIS) and seasonal optimization (SO) algorithm effectively estimates water-use efficiency (WUE) and yield in apple orchards. The SO–ANFIS model outperformed traditional methods with R² values of 0.989 for WUE and 0.988 for yield. The study highlighted the model's ability to handle complex irrigation and climatic data, significantly enhancing prediction accuracy and supporting efficient water management practices.

Virnodkar et al. (2020) explored the integration of remote sensing (RS) and ML techniques to determine crop water stress, emphasizing the superior accuracy and efficiency of ML models such as ANN, SVM, and RF over traditional RS methods. The study demonstrated that ML models, particularly SVM and ANNs, significantly enhance the accuracy of water stress predictions compared to traditional methods. These models effectively process large volumes of RS data, offering precise and efficient crop water stress assessments. The findings highlight the potential of combining RS with ML to improve water management and irrigation practices in precision agriculture.

Gao et al. (2023) provided a comprehensive review of the applications of ML in irrigation, emphasizing its potential to enhance agricultural water productivity. The study categorized ML applications into water scarcity diagnosis, water demand prediction, and irrigation decision-making, showcasing the effectiveness of models such as CNN, RF, LSTM, SVM, and XGBoost. These models demonstrated high accuracy in diagnosing water stress, predicting soil moisture content, and optimizing irrigation schedules, thereby significantly improving water use efficiency and crop yields. Despite the promising results, Gao et al. highlighted challenges such as data quality, model interpretability, and portability. To address these issues, they proposed a unified ML framework integrating deep learning and optimization algorithms to enhance model robustness and applicability across various agricultural contexts.

Liakos et al. (2018) demonstrate that ML technologies, when combined with the appropriate sensors and data, can significantly improve water management in agriculture. They highlighted that ML models, such as ANN, ELM, and SVM, significantly improved the estimation of evapotranspiration and soil moisture content, achieving high accuracy and efficiency. For example, regression models achieved a correlation coefficient (R) of 0.9999 in estimating monthly mean evapotranspiration in arid regions. Despite these successes, challenges like data quality and model interpretability remain. Liakos et al. emphasized the need for integrating ML with big data and high-performance computing to address these issues, underscoring ML's transformative potential in enhancing precision agriculture and sustainable water management.

Benos et al. (2021) conducted a comprehensive review of ML applications in agriculture, focusing on optimizing irrigation and enhancing water use efficiency. The study highlighted that ML models, such as ANNs, SVMs, and CNNs, significantly improved the accuracy of predicting evapotranspiration, soil moisture, and irrigation scheduling. The review underscored ML's potential to advance precision agriculture and sustainable water management, despite challenges like data quality and model interpretability.

Umutoni & Samadi (2024) illustrate the significant advancements ML has brought to irrigation practices, notably through predictive models that utilize data from diverse sources such as soil sensors and weather forecasts. The paper evaluates the strengths and limitations of various ML algorithms in real-world applications, noting that while ML significantly enhances decision-making, its effectiveness hinges on data quality, algorithm suitability, and the integration of expert knowledge. The discussion underscores the need for ongoing research to address data scarcity, improve model accuracy, and ensure the practical deployment of ML systems in diverse agricultural settings.

Abdel-Hameed et al. (2024) investigated the estimation of potato water footprint in arid regions using ML models. The research shows that models such as RF, ANN, and SVM can significantly enhance the accuracy of water footprint predictions by incorporating climatic, soil, and spatiotemporal data. The findings indicate that the RF model is particularly effective for site-scale predictions, while ANN models are robust across various climatic conditions. The study underscores the importance of detailed input data and site-specific modeling to improve the precision of water footprint estimates.

Elbeltagi et al. (2023) found that the performance of ML models in predicting vapor pressure deficit (VPD), a variable that is crucial for optimizing agricultural water use, varies significantly across semi-arid regions of Egypt. The study tested several ML models, including Linear Regression, Random SubSpace, REPTree, and RF, to determine their effectiveness in forecasting VPD. It concluded that the RF model was the most effective tool for this purpose, owing to its high accuracy and robust performance on complex datasets.
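For readers unfamiliar with the target variable, the following minimal sketch shows how VPD is commonly derived from air temperature and relative humidity. It assumes the Tetens-type saturation vapor pressure formula used in FAO-56; the function names are illustrative, and the reviewed study may have computed VPD differently.

```python
import math


def saturation_vapor_pressure_kpa(temp_c: float) -> float:
    """Saturation vapor pressure (kPa) at air temperature temp_c (deg C).

    Assumption: Tetens-type formula with the FAO-56 coefficients.
    """
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vapor_pressure_deficit_kpa(temp_c: float, rh_percent: float) -> float:
    """VPD (kPa) = saturation vapor pressure minus actual vapor pressure."""
    if not 0.0 <= rh_percent <= 100.0:
        raise ValueError("relative humidity must lie between 0 and 100 %")
    es = saturation_vapor_pressure_kpa(temp_c)
    # Actual vapor pressure is es * RH/100, so the deficit is es * (1 - RH/100).
    return es * (1.0 - rh_percent / 100.0)


if __name__ == "__main__":
    # Hypothetical hot, dry afternoon: 30 deg C and 50 % relative humidity.
    print(round(vapor_pressure_deficit_kpa(30.0, 50.0), 2), "kPa")
```

A series of such VPD values, paired with the climatic predictors, would form the training target for models like those compared in the study.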

Kumar et al. (2020) explored the use of two ANN models, K-SOM and FF-BP, to predict the crop water stress index (CWSI) for Indian mustard. Both models estimated CWSI with high accuracy. K-SOM achieved R² values of 0.97 in the development phase and 0.96 in the validation phase, with a low bias error. The FF-BP model achieved R² values of 0.861 and 0.747 for the development and validation phases, respectively, with a higher bias error. The study underscores the importance of precise input data, including air temperature, canopy temperature, and relative humidity, for model performance. Kumar et al. highlight the potential of these ANN models to improve water resource management by providing accurate stress index predictions that support irrigation optimization.
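As background on the predicted quantity, the sketch below computes CWSI under the widely used empirical definition, in which the canopy-air temperature difference is scaled between a non-water-stressed lower baseline and a fully stressed upper baseline. This definition and the baseline values in the example are assumptions made for illustration; the source does not state how the reference CWSI values were obtained.

```python
def crop_water_stress_index(
    canopy_temp_c: float,
    air_temp_c: float,
    lower_baseline_c: float,
    upper_baseline_c: float,
) -> float:
    """Empirical CWSI, clipped to the range [0, 1].

    lower_baseline_c: (Tc - Ta) expected for a well-watered crop.
    upper_baseline_c: (Tc - Ta) expected for a non-transpiring crop.
    Both baselines are crop- and site-specific inputs (hypothetical here).
    """
    if upper_baseline_c <= lower_baseline_c:
        raise ValueError("upper baseline must exceed lower baseline")
    delta = canopy_temp_c - air_temp_c
    index = (delta - lower_baseline_c) / (upper_baseline_c - lower_baseline_c)
    # Measurement noise can push the raw ratio outside 0..1, so clip it.
    return min(1.0, max(0.0, index))


if __name__ == "__main__":
    # Hypothetical reading: canopy 30 deg C, air 28 deg C, baselines -2 and +4 deg C.
    print(round(crop_water_stress_index(30.0, 28.0, -2.0, 4.0), 2))
```

Values near 0 indicate an unstressed canopy and values near 1 indicate severe stress, which is why accurate Ta and Tc measurements matter so much for any model trained to reproduce the index.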

Das et al. (2022) evaluated ML models for surface soil moisture mapping using combined optical, thermal, and microwave remote sensing data. Focusing on wheat, maize, mustard, and pulses on a farm in New Delhi, the study compared Cubist, RF, gradient boosting machine (GBM), and a stacking ensemble. The stacking model, which combined the outputs of Cubist, RF, and GBM using an elastic net as the meta-learner, achieved the lowest mean bias error (MBE) of 0.18% and the lowest RMSE of 5.03%. The RF model also performed strongly, with the highest correlation (r = 0.71) and an RMSE of 5.17%, close to that of the stacking model. The study highlights the potential of ensemble techniques for improving soil moisture estimation, which is essential for optimized irrigation and water management.
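The error statistics quoted in the studies above (MBE, RMSE, the correlation coefficient r, and the Nash-Sutcliffe efficiency, NSE) can be computed as in the following minimal sketch. The function names and sample values are illustrative, and the sign convention for MBE (predicted minus observed) is an assumption, since individual papers may define it the other way round.

```python
import math
from typing import Sequence


def _check(observed: Sequence[float], predicted: Sequence[float]) -> None:
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least two values")


def mean_bias_error(observed, predicted):
    """Mean of (predicted - observed); positive values mean overestimation."""
    _check(observed, predicted)
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    _check(observed, predicted)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (undefined for a constant series)."""
    _check(observed, predicted)
    mo = sum(observed) / len(observed)
    mp = sum(predicted) / len(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def nash_sutcliffe(observed, predicted):
    """NSE = 1 - SSE / variance of observations; 1 is a perfect fit."""
    _check(observed, predicted)
    mo = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - sse / sum((o - mo) ** 2 for o in observed)


if __name__ == "__main__":
    obs = [10.0, 20.0, 30.0, 40.0]   # hypothetical soil moisture observations (%)
    pred = [12.0, 18.0, 33.0, 41.0]  # hypothetical model predictions (%)
    for fn in (mean_bias_error, rmse, pearson_r, nash_sutcliffe):
        print(fn.__name__, round(fn(obs, pred), 3))
```

Reporting several of these together is useful because they capture different things: a model can have a near-zero MBE while still showing a large RMSE, and a high r does not rule out systematic bias.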

Comparative analysis of ML models in agricultural water management

Table 2 compares the ML models applied in agricultural water management across the reviewed studies. For each study, it lists the crop, the models evaluated, the aim, the input data, and the reported performance (for example R², RMSE, MAE, and NSE). Examining these entries side by side clarifies the strengths and limitations of each model and supports the selection of suitable ML approaches for specific water management problems, thereby contributing to more efficient and sustainable agricultural water use.

Table 2

Performance metrics of ML models in agricultural water management

Reference | Crop | Model | Aim | Input | Performance
Li et al. (2023) | Wheat, maize, soybean, rice | GAM, LightGBM, RF, ANN | Study on the adaptability of ML models for various crops and scenarios of WFs | Climatic, soil, and spatiotemporal data | The RF model was most accurate for maize, soybean, and rice at the site scale, whereas LightGBM was more applicable to wheat
Azzam et al. (2022) | Wheat | ANN, KNN, SVM, RF | Estimation of green and blue water evapotranspiration using ML models | Temperature, precipitation, solar radiation, humidity, wind speed, soil moisture | RF model showed highest accuracy for GWET with R² = 0.99, RMSE = 0.2637 mm/day, NSE = 0.99
Fan et al. (2021) | Maize | SVM, XGBoost, ANN, DNN | Estimate daily maize transpiration | Tmax, Tmin, RH, U, Rs, SWC, LAI | DNN: best accuracy (R² = 0.816–0.954, RMSE = 0.344–0.621 mm/day). SVM, XGBoost, ANN also evaluated
Granata (2019) | Various crops | M5P regression tree, Bagging, RF, SVR | To compare actual evapotranspiration estimation models based on AI theories | Net solar radiation, sensible-heat flux, soil moisture, wind speed, mean relative humidity, mean temperature | M5P regression tree showed the highest accuracy with NSE = 0.987, MAE = 0.14 mm/day, RMSE = 0.179 mm/day, RAE = 15.4%
Shrestha & Shukla (2015) | Bell pepper, watermelon | SVM, ANN, RVM | Estimation of crop coefficient (Kc) and crop evapotranspiration (ETc) | DAT, irrigation frequency, water table depth, soil moisture, relative humidity, rainfall, solar radiation, temperature, wind speed | SVM: superior to ANN and RVM, with prediction errors of 2.6% for watermelon and 11.2% for bell pepper
Elbeltagi et al. (2020b) | Maize | ANN models | To estimate, forecast, and model the green and blue water footprints of maize using ANN | Tmin, Tmax, precipitation, solar radiation, soil moisture, wind speed, vapor pressure deficit, humidity, crop coefficient (Kc) | The ANN models achieved high statistical significance, with coefficients of determination close to 1
Mokhtar et al. (2021) | Rice | RT, RF, AR, REPT | Prediction of crop water footprint | Climate and crop data | The RT model excelled in BWFP estimation with an RMSE of 11.82 m³/ton and MAPE of 0.5%. Including precipitation significantly improved GWFP accuracy
Geng et al. (2023) | Tea | KNN, SVM, RF, MLP, AdaBoost, Bagging | Estimate daily actual evapotranspiration (ETa) of tea plantations | Meteorological and evapotranspiration data collected from tea plantations over 12 years (2010–2021) | Bagging and RF models were the most stable across different scenarios
Ge et al. (2022) | Greenhouse tomato | XGBoost, LR, SVR, KNR, RFR, ABR, BR, GBR | Predict greenhouse tomato evapotranspiration | Rn, Ta, Tamin, Tamax, RH, RHmin, RHmax, V | XGBoost: highest accuracy (MSE = 0.032, RMSE = 0.163, R² = 0.981). GBR and SVR also performed well
Elbeltagi et al. (2020a) | Wheat, maize | DNN | Estimation of the water footprint under current and future scenarios | Historical climate data (2006–2017), future climate projections (2022–2040), crop yield data | DNN showed high accuracy with correlation coefficients between 0.92 and 0.97 for ETc predictions for both crops
Dehghanisanij et al. (2022) | Apple orchards | SO–ANFIS | Estimate WUE and yield under narrow strip irrigation | Irrigation and climate data from apple orchards, 2019–2021 | SO–ANFIS achieved high accuracy with R² values of 0.989 for WUE and 0.988 for yield
Virnodkar et al. (2020) | Various crops | ANN, SVM, RF, DT, XGBoost | Determine crop water stress using remote sensing and ML for precision agriculture | Remotely sensed data, including spectral indices and thermal images; ground truth data for model training | ML models demonstrated high accuracy in predicting crop water stress, surpassing traditional remote sensing methods
Gao et al. (2023) | Various | ANN, SVM, RF, Decision Trees, XGBoost | Review and propose an ML-based intelligent irrigation model framework | Various data sources including sensor, remote sensing, and manual data collection | Integrating ML models significantly improves the accuracy and efficiency of irrigation practices
Liakos et al. (2018) | Various (general) | ANN, SVM, decision trees, KNN, Bayesian | The use of ML in optimizing irrigation and enhancing water use efficiency | Data from sensors related to soil moisture, climate conditions, and water usage | Regression models achieved an R value of 0.9999 for estimating monthly mean evapotranspiration
Benos et al. (2021) | Various (general) | ANN, RF, SVM, etc. | Use of ML in optimizing irrigation and enhancing water use efficiency | Data including climate variables, soil properties, crop management information | ML models achieved R² values up to 0.99 for estimating evapotranspiration and soil moisture content
Umutoni & Samadi (2024) | Various | ANN, SVM, RF, Decision Trees, LSTM, RL | Review the use of ML in supporting irrigation decision-making | Data from sensors (soil moisture, climate conditions), and models like SWAT, AquaCrop, DSSAT | ML models achieved up to 20% water savings in irrigation compared to traditional methods
Abdel-Hameed et al. (2024) | Potato | M5P, ANN, RF, SVM | To estimate the water footprint of potato crops using various ML models in arid regions | Climatic data (temperature, precipitation), soil moisture, solar radiation, wind speed | M5P showed high accuracy with R² = 0.92, ANN models performed well with R² = 0.89, RF and SVM had lower performance with R² = 0.85 and R² = 0.82, respectively
Elbeltagi et al. (2023) | Various crops | LR, ART, RSS, RF, REPTree, M5P | Predict VPD for efficient water management | Climatic data, VPD history | RF model most accurate with CC = 0.9694, MAE = 0.0967, RMSE = 0.1252, RAE = 21.7297%, RRSE = 24.0356% in test stage
Kumar et al. (2020) | Indian mustard | ANN (K-SOM, FF-BP) | To predict the CWSI | Ta (air temperature), Tc (canopy temperature), RH (relative humidity) | K-SOM: R² = 0.97 (development), R² = 0.96 (validation), low bias error; FF-BP: R² = 0.861 (development), R² = 0.747 (validation), higher bias error compared to K-SOM
Das et al. (2022) | Wheat, maize, mustard, pulses | Cubist, RF, GBM, stacking | Surface soil moisture mapping using optical-thermal-microwave remote sensing synergies | Radar backscatter, visible, near-infrared, short-wave infrared, land surface temperature | Stacking model showed the lowest MBE (0.18%) and RMSE (5.03%). RF model had the highest correlation (r = 0.71) with an RMSE of 5.17%

Quantitative analysis of ML utilization and applications in agricultural water management

This section provides a quantitative analysis of the utilization rates and application frequencies of various ML models. The utilization rate of each ML model is shown in Figure 5. The RF model is the most frequently used, representing approximately 35% of the instances, followed by ANNs at 30% and SVM at 27%. XGBoost and M5P are also notable, accounting for 13% and 10%, respectively. Other models such as KNN, DNN, bagging, decision trees, and RT are used less frequently, each with a usage rate below 10%. Models with a frequency below 7% are not included in the figure.
Figure 5

Distribution of ML models.
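A tally of this kind can be reproduced from the Model column of Table 2, as in the minimal sketch below. The counting rule behind Figure 5 is not stated in the text, so the sketch assumes one plausible rule: the share of studies that evaluated each model. Only four rows of Table 2 are included, and the names `TABLE2_SAMPLE` and `model_usage_share` are hypothetical.

```python
from collections import Counter

# Four rows copied from the Model column of Table 2 (illustrative subset only).
TABLE2_SAMPLE = {
    "Li et al. (2023)": ["GAM", "LightGBM", "RF", "ANN"],
    "Azzam et al. (2022)": ["ANN", "KNN", "SVM", "RF"],
    "Fan et al. (2021)": ["SVM", "XGBoost", "ANN", "DNN"],
    "Granata (2019)": ["M5P", "Bagging", "RF", "SVR"],
}


def model_usage_share(studies: dict) -> dict:
    """Percentage of studies in which each model appears.

    Assumed rule: a model is counted once per study, so the shares can
    add up to more than 100 % when studies compare several models.
    """
    counts = Counter(m for models in studies.values() for m in set(models))
    total = len(studies)
    return {model: 100.0 * n / total for model, n in counts.most_common()}


shares = model_usage_share(TABLE2_SAMPLE)

if __name__ == "__main__":
    for model, pct in shares.items():
        print(f"{model}: {pct:.0f}%")
```

Under this per-study rule the shares are not expected to sum to 100%, because most of the reviewed studies compare several models against one another.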

Figure 6 illustrates the frequency of various ML applications in agricultural water management. ‘Water Use Prediction’ is the most common, accounting for 21% of use cases, followed by ‘Evapotranspiration Estimation’ at 17% and ‘Water Footprint Estimation’ at 14%. ‘Water Use Efficiency’ accounts for 12%, ‘Water Stress Prediction’ for 11%, and ‘Irrigation Scheduling’ for 10%, while ‘Decision Support for Water Management’ and ‘Crop Yield Prediction’ make up 8% and 7%, respectively.
Figure 6

Frequency of typical use cases for ML models.


ML has emerged as a game-changer in agricultural water management, enhancing irrigation efficiency, water demand forecasting, and climate resilience. This study underscores the advantages of ensemble methods like RF and LightGBM for handling large datasets and the strength of ANNs in specific water-related predictions. However, challenges such as data availability, model interpretability, and scalability must be addressed to fully harness ML's potential. Future efforts should focus on integrating advanced sensor networks, refining AI-driven decision-making tools, and ensuring accessibility for diverse agricultural stakeholders. By overcoming these barriers, ML-driven solutions can play a pivotal role in sustainable and resilient water management practices worldwide.

Challenges and strategic directions

Figure 7

ML limitations.

  • 1. Data quality and availability:

Challenge: ML algorithms require large volumes of high-quality data to function optimally. In many agricultural settings, especially in developing countries, data may be sparse, inaccurate, or outdated. This can lead to poor model performance and unreliable predictions.

Potential solution: Initiatives to improve data collection infrastructure, such as government-funded projects to deploy more sensors and satellites, can enhance data quality. Collaborative efforts between academic institutions, industry, and government to create open-source data repositories could also be beneficial (Kamilaris & Prenafeta-Boldú 2018).

  • 2. Model complexity and interpretability:

Challenge: Many advanced ML models, such as DNNs, are often seen as ‘black boxes’ due to their complex nature, making it difficult for practitioners to trust and interpret their predictions.

Potential solution: Research into explainable AI (XAI) is vital, as these techniques can make the workings of complex models more transparent and understandable to non-experts. Developing hybrid models that combine ML with traditional decision-making frameworks could also help enhance trust and interpretability (Ryo 2022).

  • 3. Integration with existing systems:

Challenge: Integrating ML systems with existing agricultural management practices can be challenging due to compatibility issues, the need for technical expertise, and resistance from traditional practitioners.

Potential solution: Developing user-friendly ML tools that require minimal setup and are compatible with existing technologies can ease integration. Training sessions and workshops for farmers and agricultural managers can help bridge the knowledge gap and promote acceptance (Wolfert et al. 2017).

  • 4. Scalability and adaptability:

Challenge: ML models developed in one region or for one type of crop may not perform well when applied to different regions or crops due to varying climatic and soil conditions.

Potential solution: Focused research on developing adaptable and scalable ML models that can adjust to different environmental conditions and agricultural practices is essential. Using transfer learning and fine-tuning models based on local data can improve scalability (Liakos et al. 2018).

  • 5. Cost and resource requirements:

Challenge: The deployment of sophisticated ML models can be resource-intensive, requiring significant computational power and expertise, which might be unaffordable for small to medium-sized farms.

Potential solution: Cloud-based ML solutions can reduce the need for on-site computational resources. Subsidies or financial incentives from governments or agricultural cooperatives can help smaller farms adopt these technologies (Shamshiri et al. 2018).

  • 6. Regulatory and ethical concerns:

Challenge: The use of ML in agriculture raises concerns about data privacy, ownership, and ethical use, especially when data from multiple sources are integrated and analyzed.

Potential solution: Establishing clear regulations and ethical guidelines for the use of agricultural data and ML models can help address these concerns. Stakeholder engagement in developing these regulations can ensure that they are broadly acceptable and effective (Ryan 2023).

  • 7. Climate change and uncertainty in future projections:

Challenge: Climate variability introduces significant uncertainties in ML-based water management models. Changes in temperature, precipitation patterns, and extreme weather events can disrupt model assumptions, reducing their predictive reliability.

Potential solution: Hybrid climate-aware ML models that integrate hydrological simulations with AI-driven predictions can enhance robustness against climate uncertainties. Additionally, ensemble modeling approaches that aggregate predictions from multiple ML models can improve reliability in uncertain scenarios (Azmat et al. 2022).
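The ensemble idea mentioned under the last item, aggregating predictions from several ML models, can be illustrated with a minimal sketch. It assumes one simple aggregation rule, weighting each model by the inverse of its validation RMSE; this rule, the function names, and the numbers are illustrative assumptions and are not taken from the cited work.

```python
def inverse_error_weights(validation_rmse: dict) -> dict:
    """Normalised weights proportional to 1 / RMSE (lower error, higher weight)."""
    if any(err <= 0 for err in validation_rmse.values()):
        raise ValueError("RMSE values must be positive")
    inverse = {name: 1.0 / err for name, err in validation_rmse.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def ensemble_predict(predictions: dict, weights: dict) -> list:
    """Weighted average of per-model prediction series of equal length."""
    lengths = {len(series) for series in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all models must predict the same number of points")
    n = lengths.pop()
    return [
        sum(weights[name] * predictions[name][i] for name in predictions)
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical validation errors and daily water demand forecasts (mm/day).
    weights = inverse_error_weights({"rf": 1.0, "ann": 2.0})
    combined = ensemble_predict({"rf": [3.0, 6.0], "ann": [6.0, 9.0]}, weights)
    print(weights, combined)
```

Because the weights are derived from validation error, they can be recomputed as new observations arrive, which is one way an ensemble might adapt when climate conditions drift away from those seen in training.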

Key findings

ML has increasingly become a pivotal technology in revolutionizing agricultural practices, particularly in water management. This review has illustrated that ML offers substantial promise for enhancing the productivity and sustainability of agricultural operations through precise, efficient water management – crucial in the face of mounting global challenges such as water scarcity and climate change.

Our extensive examination of ML applications in agricultural water management showcases these technologies' transformative capabilities. By integrating a variety of models such as RF, ANNs, SVM, and LightGBM, significant strides have been made in improving prediction accuracy and management strategies across diverse agricultural environments. These models facilitate nuanced decision-making, optimize resource allocation, and enable robust risk mitigation under varying climatic conditions.

Key findings from the papers reviewed reveal:

  • Predictive accuracy: ML models, particularly ensemble methods such as RF and LightGBM, deliver exceptional predictive accuracy, crucial for effective irrigation planning and water allocation.

  • Data integration and analysis: The successful integration of various data types, including remote sensing and ground sensor data, enhances model reliability and broadens their applicability, supporting comprehensive water management strategies.

  • Model-specific insights: RF and ANN excel in handling complex datasets and generating dependable predictions for water consumption and evapotranspiration – essential for informed decision-making in regions facing water scarcity. LightGBM's effectiveness underscores the importance of selecting models based on specific agricultural needs and environmental conditions.

Despite these advances, several challenges persist. High-quality, diverse datasets are essential for training robust models, and the complexity of integrating ML technologies into existing agricultural frameworks remains a significant hurdle. Additionally, ensuring model interpretability and managing socio-economic factors are critical for the practical application of these solutions.

To propel this field forward, future research should focus on:

  • Enhancing data collection techniques: Improved data quality and granularity will aid in refining ML models, making them more responsive to specific agricultural contexts.

  • Improving model transparency and usability: Developing user-friendly models with clear methodologies will facilitate broader adoption, especially by practitioners who are not ML experts.

  • Tailoring models to specific needs: Customizing ML approaches to accommodate particular crop types, climates, and water management systems will boost their effectiveness and relevance.

  • Fostering interdisciplinary collaborations: Engaging various scientific and industry disciplines can spur innovations that blend agronomic knowledge with cutting-edge ML techniques, leading to integrated solutions for water management.

In conclusion, ML not only offers a promising avenue for advancing agricultural water management practices but also plays a critical role in shaping the future of global agriculture. By continuing to leverage and refine these technologies, there is tremendous potential to improve the sustainability and efficiency of farming operations worldwide, ensuring food security and resource conservation in an increasingly unpredictable global climate. This synthesis provides a clear overview of the current capabilities of ML in agricultural water management and sets a comprehensive roadmap for ongoing research and practical applications.

All authors have reviewed and approved the final manuscript and consent to its publication.

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

M. M. and F. M. wrote the original draft, conducted the literature review, and designed the figures, tables, and illustrations. D. B. synthesized information from various sources, enhanced readability and flow, offered suggestions for improvement, oversaw final revisions, supervised the research process, and ensured methodological rigor. S. M., A. D., and J. L. N. reviewed and revised sections related to their expertise, provided critical feedback and suggestions, ensured technical accuracy, and contributed to the interpretation of findings. S. M. and H. G. reviewed and edited the article and revised the manuscript in response to reviewers' comments.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abdel-Hameed, A. M., Abuarab, M., Al-Ansari, N., Sayed, H., Kassem, M. A., Elbeltagi, A. & Mokhtar, A. (2024) Estimation of potato water footprint using machine learning algorithm models in arid regions, Potato Research, 1755–1774. https://doi.org/10.1007/s11540-024-09716-1.
Adnan, R. M., Mo, W., Ewees, A. A., Heddam, S., Kisi, O. & Zounemat-Kermani, M. (2024) Enhancing streamflow prediction accuracy: a comprehensive analysis of hybrid neural network models with Runge–Kutta with aquila optimizer, International Journal of Computational Intelligence Systems, 17 (1), 1–23.
Adnan, R. M., Mostafa, R. R., Wang, M., Parmar, K. S., Kisi, O. & Zounemat-Kermani, M. (2025) Improved random vector functional link network with an enhanced remora optimization algorithm for predicting monthly streamflow, Journal of Hydrology, 650, 132496.
Afrifa, S., Zhang, T., Appiahene, P. & Varadarajan, V. (2022) Mathematical and machine learning models for groundwater level changes: a systematic review and bibliographic analysis, Future Internet, 14 (9), 259.
Afrifa, S., Varadarajan, V., Appiahene, P. & Zhang, T. (2023a) A novel artificial intelligence techniques for women breast cancer classification using ultrasound images, Clinical and Experimental Obstetrics & Gynecology, 50 (12), 271.
Afrifa, S., Varadarajan, V., Appiahene, P., Zhang, T. & Domfeh, E. A. (2023b) Ensemble machine learning techniques for accurate and efficient detection of botnet attacks in connected computers, Eng, 4 (1), 650–664.
Azmat, M., Madondo, M., Dipietro, K., Horesh, R., Bawa, A., Jacobs, M., Srinivasan, R. & O'Donncha, F. (2022) Forecasting soil moisture using domain inspired temporal graph convolution neural networks to guide sustainable crop management. arXiv preprint arXiv:2212.06565.
Azzam, A., Zhang, W., Akhtar, F., Shaheen, Z. & Elbeltagi, A. (2022) Estimation of green and blue water evapotranspiration using machine learning algorithms with limited meteorological data: a case study in Amu Darya River Basin, Central Asia, Computers and Electronics in Agriculture, 202, 107403. https://doi.org/10.1016/j.compag.2022.107403.
Benos, L., Tagarakis, A. C., Dolias, G., Berruto, R., Kateris, D. & Bochtis, D. (2021) Machine learning in agriculture: a comprehensive updated review, Sensors, 21 (11), 3758. https://doi.org/10.3390/s21113758.
Blatchford, M. L., Karimi, P., Bastiaanssen, W. G. M. & Nouri, H. (2018) From global goals to local gains – a framework for crop water productivity, ISPRS International Journal of Geo-Information, 7 (11), 414. https://doi.org/10.3390/ijgi7110414.
Christias, P., Daliakopoulos, I. N., Manios, T. & Mocanu, M. (2020) Comparison of three computational approaches for tree crop irrigation decision support, Mathematics, 8 (5), 717. https://doi.org/10.3390/MATH8050717.
Cravero, A., Pardo, S., Sepúlveda, S. & Muñoz, L. (2022) Challenges to use machine learning in agricultural big data: a systematic literature review, Agronomy, 12 (3), 748. https://doi.org/10.3390/agronomy12030748.
Das, B., Rathore, P., Roy, D., Chakraborty, D., Jatav, R. S., Sethi, D. & Kumar, P. (2022) Comparison of bagging, boosting and stacking algorithms for surface soil moisture mapping using optical-thermal-microwave remote sensing synergies, Catena, 217, 106485. https://doi.org/10.1016/j.catena.2022.106485.
Dehghanisanij, H., Emami, H., Emami, S. & Rezaverdinejad, V. (2022) A hybrid machine learning approach for estimating the water-use efficiency and yield in agriculture, Scientific Reports, 12 (1), 6728. https://doi.org/10.1038/s41598-022-10844-2.
D'Odorico, P., Chiarelli, D. D., Rosa, L., Bini, A., Zilberman, D. & Rulli, M. C. (2020) The global value of water in agriculture, Proceedings of the National Academy of Sciences of the United States of America, 117 (36), 21985–21993. https://doi.org/10.1073/pnas.2005835117.
Elbeltagi, A., Aslam, M. R., Malik, A., Mehdinejadiani, B., Srivastava, A., Bhatia, A. S. & Deng, J. (2020a) The impact of climate changes on the water footprint of wheat and maize production in the Nile Delta, Egypt, Science of The Total Environment, 743, 140770. https://doi.org/10.1016/j.scitotenv.2020.140770.
Elbeltagi A., Deng J., Wang K. & Hong Y. (2020b) Crop water footprint estimation and modeling using an artificial neural network approach in the Nile Delta, Egypt, Agricultural Water Management, 235, 106080. https://doi.org/10.1016/j.agwat.2020.106080.
Elbeltagi A., Srivastava A., Deng J., Li Z., Raza A., Khadke L., Yu Z. & El-Rawy M. (2023) Forecasting vapor pressure deficit for agricultural water management using machine learning in semi-arid environments, Agricultural Water Management, 283, 108302. https://doi.org/10.1016/j.agwat.2023.108302.
Fan J., Ma X., Wu L., Zhang F., Yu X. & Zeng W. (2019) Light gradient boosting machine: an efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data, Agricultural Water Management, 225, 105758. https://doi.org/10.1016/j.agwat.2019.105758.
Fan J., Zheng J., Wu L. & Zhang F. (2021) Estimation of daily maize transpiration using support vector machines, extreme gradient boosting, artificial and deep neural networks models, Agricultural Water Management, 245, 106547. https://doi.org/10.1016/j.agwat.2020.106547.
Foley D. J., Thenkabail P. S., Aneece I. P., Teluguntla P. G. & Oliphant A. J. (2020) A meta-analysis of global crop water productivity of three leading world crops (wheat, corn, and rice) in the irrigated areas over three decades, International Journal of Digital Earth, 13 (8), 939–975. https://doi.org/10.1080/17538947.2019.1651912.
Fu G., Jin Y., Sun S., Yuan Z. & Butler D. (2022) The role of deep learning in urban water management: a critical review, Water Research, 223, 118973. https://doi.org/10.1016/j.watres.2022.118973.
Gao H., Zhangzhong L., Zheng W. & Chen G. (2023) How can agricultural water production be promoted? A review on machine learning for irrigation, Journal of Cleaner Production, 414, 137687. https://doi.org/10.1016/j.jclepro.2023.137687.
Ge J., Zhao L., Yu Z., Liu H., Zhang L., Gong X. & Sun H. (2022) Prediction of greenhouse tomato crop evapotranspiration using XGBoost machine learning model, Plants, 11 (15), 1923. https://doi.org/10.3390/plants11151923.
Geng J., Li H., Luan W., Shi Y., Pang J. & Zhang W. (2023) Estimation of daily actual evapotranspiration of tea plantations using ensemble machine learning algorithms and six available scenarios of meteorological data, Applied Sciences, 13 (23), 12961. https://doi.org/10.3390/app132312961.
Granata F. (2019) Evapotranspiration evaluation models based on machine learning algorithms – a comparative study, Agricultural Water Management, 217, 303–315. https://doi.org/10.1016/j.agwat.2019.03.015.
Guerrero J. M., Pajares G., Montalvo M., Romeo J. & Guijarro M. (2012) Support vector machines for crop/weeds identification in maize fields, Expert Systems with Applications, 39 (12), 11149–11155. https://doi.org/10.1016/j.eswa.2012.03.040.
Hameed M., Moradkhani H., Ahmadalipour A., Moftakhari H., Abbaszadeh P. & Alipour A. (2019) A review of the 21st century challenges in the food-energy-water security in the Middle East, Water (Switzerland), 11 (4), 682. https://doi.org/10.3390/w11040682.
Horvitz E. & Mulligan D. (2015) Data, privacy, and the greater good, Science, 349 (6245), 253–255. https://doi.org/10.1126/science.aac4520.
Huber F., Yushchenko A., Stratmann B. & Steinhage V. (2022) Extreme gradient boosting for yield estimation compared with deep learning approaches, Computers and Electronics in Agriculture, 202, 107346. https://doi.org/10.1016/j.compag.2022.107346.
Ikram R. M. A., Mostafa R. R., Chen Z., Parmar K. S., Kisi O. & Zounemat-Kermani M. (2023) Water temperature prediction using improved deep learning methods through reptile search algorithm and weighted mean of vectors optimizer, Journal of Marine Science and Engineering, 11 (2), 259.
Jung J., Maeda M., Chang A., Bhandari M., Ashapure A. & Landivar-Bowles J. (2021) The potential of remote sensing and artificial intelligence as tools to improve the resilience of agriculture production systems, Current Opinion in Biotechnology, 70, 15–22. https://doi.org/10.1016/J.COPBIO.2020.09.003.
Kamilaris A. & Prenafeta-Boldú F. X. (2018) Deep learning in agriculture: a survey, Computers and Electronics in Agriculture, 147, 70–90. https://doi.org/10.1016/j.compag.2018.02.016.
Khairunniza-Bejo S., Mustaffha S., Ishak W. & Ismail W. (2014) Application of artificial neural network in predicting crop yield: a review, Journal of Food Science and Engineering, 4 (1), 1.
Kılıç Z. (2020) The importance of water and conscious use of water, International Journal of Hydrology, 4 (5), 239–241. https://doi.org/10.15406/ijh.2020.04.00250.
Kumar N., Adeloye A. J., Shankar V. & Rustum R. (2020) Neural computing modelling of the crop water stress index, Agricultural Water Management, 239, 106259. https://doi.org/10.1016/j.agwat.2020.106259.
Latif S. D., Alyaa Binti Hazrin N., Hoon Koo C., Lin Ng J., Chaplot B., Feng Huang Y., El-Shafie A. & Najah Ahmed A. (2023) Assessing rainfall prediction models: exploring the advantages of machine learning and remote sensing approaches, Alexandria Engineering Journal, 82, 16–25. https://doi.org/10.1016/j.aej.2023.09.060.
Lee A., Taylor P. & Kalpathy-Cramer J. (2017) Machine learning has arrived!, Ophthalmology, 124 (12), 1726–1728. https://doi.org/10.1016/j.ophtha.2017.08.046.
L'Heureux A., Grolinger K., Elyamany H. F. & Capretz M. A. M. (2017) Machine learning with big data: challenges and approaches, IEEE Access, 5, 7776–7797. https://doi.org/10.1109/ACCESS.2017.2696365.
Li Z., Peng F., Niu B., Li G., Wu J. & Miao Z. (2018) Water quality prediction model combining sparse auto-encoder and LSTM network, IFAC-PapersOnLine, 51 (17), 831–836.
Li Z., Wang W., Ji X., Wu P. & Zhuo L. (2023) Machine learning modeling of water footprint in crop production distinguishing water supply and irrigation method scenarios, Journal of Hydrology, 625, 130171. https://doi.org/10.1016/j.jhydrol.2023.130171.
Liakos K. G., Busato P., Moshou D., Pearson S. & Bochtis D. (2018) Machine learning in agriculture: a review, Sensors (Switzerland), 18 (8), 2674. https://doi.org/10.3390/s18082674.
Löw F., Michel U., Dech S. & Conrad C. (2013) Impact of feature selection on the accuracy and spatial uncertainty of per-field crop classification using support vector machines, ISPRS Journal of Photogrammetry and Remote Sensing, 85, 102–119. https://doi.org/10.1016/j.isprsjprs.2013.08.007.
Magidi J., Nhamo L., Mpandeli S. & Mabhaudhi T. (2021) Application of the random forest classifier to map irrigated areas using Google Earth Engine, Remote Sensing, 13 (5), 876. https://doi.org/10.3390/rs13050876.
Maleksaeidi H. & Karami E. (2013) Social-ecological resilience and sustainable agriculture under water scarcity, Agroecology and Sustainable Food Systems, 37 (3), 262–290. https://doi.org/10.1080/10440046.2012.746767.
McCarty D. A., Kim H. W. & Lee H. K. (2020) Evaluation of light gradient boosted machine learning technique in large scale land use and land cover classification, Environments, 7 (10), 84. https://doi.org/10.3390/environments7100084.
Mokhtar A., He H., He W., Elbeltagi A., Maroufpoor S., Azad N., Alsafadi K. & Gyasi-Agyei Y. (2021) Estimation of the rice water footprint based on machine learning algorithms, Computers and Electronics in Agriculture, 191, 106501. https://doi.org/10.1016/j.compag.2021.106501.
Murphy H. M., Bhatti M., Harvey R. & Mcbean E. A. (2016) Using decision trees to predict drinking water advisories in small water systems, Journal – American Water Works Association, 108 (2), E109–E118. https://doi.org/10.5942/jawwa.2016.108.0008.
Sahoo S., Russo T. A., Elliott J. & Foster I. (2017) Machine learning algorithms for modeling groundwater level changes in agricultural regions of the U.S., Water Resources Research, 53 (5), 3878–3895. https://doi.org/10.1002/2016WR019933.
Schick S., Rössler O. & Weingartner R. (2016) Comparison of cross-validation and bootstrap aggregating for building a seasonal streamflow forecast model, Proceedings of The International Association of Hydrological Sciences, 374, 159–163. https://doi.org/10.5194/piahs-374-159-2016.
Shamshiri R. R., Weltzien C., Hameed I. A., Yule I. J., Grift T. E., Balasundram S. K., Pitonakova L., Ahmad D. & Chowdhary G. (2018) Research and development in agricultural robotics: a perspective of digital farming.
Shao G., Han W., Zhang H., Liu S., Wang Y., Zhang L. & Cui X. (2021) Mapping maize crop coefficient Kc using random forest algorithm based on leaf area index and UAV-based multispectral vegetation indices, Agricultural Water Management, 252, 106906. https://doi.org/10.1016/j.agwat.2021.106906.
Shrestha N. K. & Shukla S. (2015) Support vector machine based modeling of evapotranspiration using hydro-climatic variables in a sub-tropical environment, Agricultural and Forest Meteorology, 200, 172–184. https://doi.org/10.1016/j.agrformet.2014.09.025.
Sun A. Y. & Scanlon B. R. (2019) How can big data and machine learning benefit environment and water management: a survey of methods, applications, and future directions, Environmental Research Letters, 14 (7), 073001. https://doi.org/10.1088/1748-9326/ab1b7d.
Umutoni L. & Samadi V. (2024) Application of machine learning approaches in supporting irrigation decision making: a review, Agricultural Water Management, 294, 108710. https://doi.org/10.1016/j.agwat.2024.108710.
Ustuner M. & Sanli F. B. (2019) Polarimetric target decompositions and light gradient boosting machine for crop classification: a comparative evaluation, ISPRS International Journal of Geo-Information, 8 (2), 97. https://doi.org/10.3390/ijgi8020097.
Veeragandham S. & Santhi H. (2020) A review on the role of machine learning in agriculture, Scalable Computing: Practice and Experience, 21 (4), 583–589. https://doi.org/10.12694/scpe.v21i4.1699.
Virnodkar S. S., Pachghare V. K., Patil V. C. & Jha S. K. (2020) Remote sensing and machine learning for crop water stress determination in various crops: a critical review, Precision Agriculture, 21 (5), 1121–1155. https://doi.org/10.1007/s11119-020-09711-9.
Weersink A., Fraser E., Pannell D., Duncan E. & Rotz S. (2018) Opportunities and challenges for big data in agricultural and environmental analysis, Annual Review of Resource Economics, 10, 19–37. https://doi.org/10.1146/annurev-resource-100516-053654.
Winkler D., Haltmeier M., Kleidorfer M., Rauch W. & Tscheikner-Gratl F. (2018) Pipe failure modelling for water distribution networks using boosted decision trees, Structure and Infrastructure Engineering, 14 (10), 1402–1411. https://doi.org/10.1080/15732479.2018.1443145.
Wolfert S., Ge L., Verdouw C. & Bogaardt M. J. (2017) Big data in smart farming – a review, Agricultural Systems, 153, 69–80.
Xu J., Xu Z., Kuang J., Lin C., Xiao L., Huang X. & Zhang Y. (2021) An alternative to laboratory testing: random forest-based water quality prediction framework for inland and nearshore water bodies, Water (Switzerland), 13 (22), 3262. https://doi.org/10.3390/w13223262.
Zhou L., Pan S., Wang J. & Vasilakos A. V. (2017) Machine learning on big data: opportunities and challenges, Neurocomputing, 237, 350–361. https://doi.org/10.1016/j.neucom.2017.01.026.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).