This study investigates the performance of six machine learning (ML) models – Random Forest (RF), Adaptive Boosting (ADA), CatBoost (CAT), Support Vector Machine (SVM), Lasso Regression (LAS), and Artificial Neural Network (ANN) – against traditional empirical formulas for estimating maximum scour depth after sluice gates. Our findings indicate that ML models generally outperform empirical formulas, with correlation coefficients (CORR) ranging from 0.882 to 0.944 for ML models compared with 0.835–0.847 for empirical methods. Notably, ANN exhibited the highest performance, followed closely by CAT, with a CORR of 0.936. RF, ADA, and SVM performed competitive metrics around 0.928. Variable importance assessments highlighted the dimensionless densimetric Froude number (Fd) as significantly influential, particularly in RF, CAT, and LAS models. Furthermore, SHAP value analysis provided insights into each predictor's impact on model outputs. Uncertainty assessment through Monte Carlo (MC) and Bootstrap (BS) methods, with 1,000 iterations, indicated ML's capability to produce reliable uncertainty maps. ANN leads in performance with higher mean values and lower standard deviations, followed by CAT. MC results trend towards optimistic predictions compared with BS, as reflected in median values and interquartile ranges. This analysis underscores the efficacy of ML models in providing precise and reliable scour depth predictions.

  • Benchmarked six ML models against empirical formulas for estimating scour depth.

  • ML algorithms performed superior performance with CORR [0.882–0.944].

  • CAT, ANN, and RF models excelled in precision and accuracy.

  • Evaluated predictor importance using permutation and SHAP values.

  • Assessed uncertainty in predictions by Monte Carlo and Bootstrap methods.

This content is only available as a PDF.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (

Supplementary data