This study presents the ‘Dual Path CNN-MLP’, a novel hybrid deep neural network (DNN) architecture that merges the strengths of convolutional neural networks (CNNs) and multilayer perceptrons (MLPs) for regional groundwater flow simulations. This model stands out from previous DNN approaches by managing mixed input types, including both imagery and numerical vectors. Such flexibility allows the diverse nature of groundwater data to be efficiently utilized without the need to convert it into a uniform format, which often leads to oversimplification or unnecessary expansion of the dataset. When applied to the northeast Qatar aquifer, the model demonstrates high accuracy in simulating transient groundwater flow fields, benchmarked against the well-established MODFLOW model. The model's efficacy is confirmed through k-fold cross-validation, showing an error margin of less than 12% across all examined locations. The study also examines the model's ability to perform uncertainty analysis using Monte Carlo simulations, finding that it achieves around 1% average absolute percentage error in estimating the mean hydraulic head. Errors are mostly found in areas with significant variations in the hydraulic head. Switching to this machine learning model from the conventional MODFLOW simulator boosts computational efficiency by about 99%, showcasing its advantage for tasks like uncertainty analysis in repetitive groundwater simulations.

  • This study focuses on machine learning–based regional groundwater flow modeling.

  • The ‘Dual Path CNN-MLP’ architecture for mixed input data types is proposed here.

  • The model's efficacy in predicting the northeast Qatar aquifer's hydraulic head is assessed.

  • Alignment with MODFLOW outcomes is demonstrated in the study.

Numerical models of regional groundwater flow are vital tools for understanding groundwater systems and predicting how they react to groundwater extraction, climate variability, land use changes, and other influential factors (Zhou & Li 2011; Omar et al. 2020; Liu et al. 2022; Sun et al. 2023). To calculate the hydraulic head, these models solve an implicit system of equations derived from discretized versions of the partial differential equations governing groundwater flow and transport. This discretization is often achieved through finite difference or finite element methods, and the process is iteratively repeated at each time step. These numerical models are extensively employed in aquifer resource management (Omar et al. 2020; Miro et al. 2021; Ostad-Ali-Askari & Shayannejad 2021), design of aquifer storage and recovery schemes (LaHaye et al. 2021; Tiwari et al. 2022, 2023), drought risk management (Shivakoti et al. 2019; Wossenyeleh et al. 2021), and aquifer pollution control (Robinson et al. 2009; Guo et al. 2021; Panjehfouladgaran & Rajabi 2022; Shakeri et al. 2023). However, the computational expense of numerical methods used in regional groundwater simulations can pose a significant obstacle for tasks that involve repeated simulations, such as uncertainty analysis and simulation-based optimization. To overcome this issue, a solution is to use a limited number of numerical simulations to train data-driven surrogate models that can replace numerical models in such tasks.

Data-driven surrogate models utilized in this context are primarily defined within the scope of univariate (Tian et al. 2016; Mirarabi et al. 2019; Sarma & Singh 2022) or multivariate (Rizeei et al. 2018; Najafzadeh et al. 2022) time series forecasting (Gong et al. 2018; Azizpour et al. 2022; Eslami et al. 2022). These models typically utilize the historical time series of head data, occasionally along with auxiliary data such as precipitation, evaporation, discharge, and irrigation, to predict future heads. Table 1 lists various models previously applied in this context, accompanied by their respective references.

Table 1

Overview of selected data-driven surrogate models used in groundwater flow and contaminant simulation studies

| Model type | References |
|---|---|
| Feed-forward neural networks | Kouziokas et al. (2018) |
| Gaussian process emulators | Kopsiaftis et al. (2019), Rajabi (2019) |
| Polynomial chaos expansion | Laloy et al. (2013), Rajabi (2019) |
| Kriging | Clifton & Neuman (1982), Asadi & Adhikari (2022) |
| Radial basis functions | Nourani et al. (2017) |
| Support vector machines | Yoon et al. (2011), Lal & Datta (2018) |
| DNNs | Mo et al. (2020), Payne et al. (2022), Zheng et al. (2023), Xia et al. (2023) |

Surrogate models have been widely utilized in regional groundwater flow studies for various purposes, including pumping optimization (Christelis et al. 2019; Han et al. 2020), uncertainty quantification (Sreekanth & Datta 2011; Asher et al. 2015; Gadd et al. 2019), and sensitivity analysis (Miao et al. 2019; Chen et al. 2021). Although these models are effective in predicting pointwise head values, they face challenges in accurately representing the spatially continuous nature of groundwater hydraulic heads, which are better conceptualized as maps rather than discrete points. Traditional interpolation methods between pointwise estimates may oversimplify and distort reality, especially in areas with complex head distributions. Addressing this, image-to-image regression emerges as a robust solution, allowing for the direct prediction of hydraulic head distributions without the need for interpolation, thus preserving spatial details (Zhu & Zabaras 2018). This approach effectively captures the spatial relationships and patterns influenced by the region's geological and hydrological features (Celik & Aslan 2020), offering significant advantages over traditional methods by producing higher-quality outcomes (Rajabi et al. 2022).

Several recent studies, as summarized in Table 2, have utilized image-to-image regression for groundwater flow and contaminant transport simulations. These studies predominantly employed encoder–decoder convolutional deep neural networks (ED-CNNs), such as the attention U-Net, to perform image-to-image regression (Kontos et al. 2022; Xia et al. 2023; Zheng et al. 2023). Through a data-driven supervised learning approach, these models have demonstrated their ability to produce accurate results for both forward and inverse simulations, while significantly reducing computational time compared to state-of-the-art numerical solvers. Although most of these studies have focused on steady-state flow conditions (Taccari et al. 2022; Xia et al. 2023; Zheng et al. 2023), a few have also considered transient flow (e.g., Mo et al. 2019). In the case of transient flow, a typical approach involves using the output from the previous time step as an input to the network for predicting the current time step's output.

Table 2

Review of image-to-image DNN approaches for groundwater flow and contaminant simulations

| Reference | Test case | Input images | Output images | Model |
|---|---|---|---|---|
| Zheng et al. (2023) | Steady-state groundwater flow through a single-layer, heterogeneous confined aquifer, affected by a contaminant source with time-varying strength (a) | Hydraulic conductivity, contaminant source strength | Hydraulic head, contaminant concentration | Combines a generative adversarial network and a convolutional neural network |
| Xia et al. (2023) | Single-layer, heterogeneous aquifer with multiple time-varying source strengths | Hydraulic conductivity field, contaminant source | Hydraulic head, contaminant concentration | Convolutional encoding-decoding NN |
| Taccari et al. (2023) | Single-layer, heterogeneous confined aquifer with a pumping well | Pumping well location, hydraulic conductivity | Steady-state head distribution | DeepONet (vanilla and extended) |
| Kontos et al. (2022) | Two pumping wells and six suspected possible instantaneous contaminant sources | Scenario-specific factors: the pollution status of each node, the day on which each node becomes polluted, the pollution duration at each node, and the hydraulic head drawdown observed at each node | Contaminant source location | Convolutional encoder–decoder |
| Taccari et al. (2022) | Single-layer, heterogeneous confined aquifer | Head at the boundaries, location of the boundaries, hydraulic conductivity field | Steady-state head distribution | Data-driven Attention U-Net |
| Jiang et al. (2021) | Time-dependent multiphase flow in a single-layer, channelized geological system | Binary channelized formation, saturation and pressure maps at one time step | Saturation and pressure maps at the next time step | Autoregressive residual U-net |
| Zhou & Tartakovsky (2021) | Heterogeneous aquifer with a pointwise contaminant source (a) | Initial contaminant concentration distribution | Concentration distribution | Convolutional encoder–decoder NN |
| Mo et al. (2019) | Single-layer, heterogeneous aquifer with a pointwise contaminant source (b) | Hydraulic conductivity field, source terms | Hydraulic head, contaminant concentration | Convolutional encoder–decoder NN |
| Pan et al. (2022) | Single-layer, heterogeneous unconfined aquifer | Hydraulic conductivity | Hydraulic head | Convolutional-cycle generative adversarial NN |

Note: , Forward mapping; , Direct inverse mapping.

(a) Groundwater contaminant source identification is done by employing the surrogate model in the context of the iterative local updating ensemble smoother algorithm.

(b) The surrogate model is employed in the Markov Chain Monte Carlo algorithm for contaminant source identification.

In the studies reviewed above, there is a common practice of transforming all the inputs of the deep neural network (DNN) models into images to perform image-to-image regression. However, certain aspects, such as constant head or specified flux boundary conditions, pumping and injection rates, and total recharge rate, could potentially be more effectively represented as values or vectors rather than images. The simplification of homogenizing inputs using images is primarily driven by the unique challenges posed by mixed data obtained from groundwater flow and transport modeling, where each data type may require different preprocessing steps, including scaling, normalization, and feature engineering. The treatment of mixed data remains an active area of research, heavily influenced by the specific task and the desired outcome. Therefore, it is crucial to explore and develop approaches that can adequately handle the diverse nature of data types encountered in groundwater flow and contaminant simulations.

In addition, all studies reviewed in Table 2 focus solely on two-dimensional (2D) hypothetical test cases with square domains. While the surrogate models proposed in these studies have demonstrated promising results for addressing subsurface flow problems, the current models assume relatively simple heterogeneity, often relying on Gaussian or bimodal distributions (Zheng et al. 2023). Incorporating input data based on real geological heterogeneities would impose more demanding requirements on the image-to-image regression models (Jiang et al. 2021). This is an aspect that needs to be thoroughly addressed to ensure the models can effectively handle the complexities introduced by real-world geological conditions.

The current study aims to address the above research gaps by proposing an innovative hybrid DNN model capable of handling mixed inputs consisting of both images and numeric values. The primary objective is to leverage this model to accurately estimate the full hydraulic head distribution within a real-world aquifer characterized by two distinct layers and a heterogeneous hydraulic conductivity. This application will provide insights into the model's ability to handle the complexities introduced by real-world geological conditions, including layered structures, and varying hydraulic conductivity values. The accuracy and reliability of the model's predictions are assessed by comparing them against the outputs of an established numerical solver.

In this section, we present our test case and describe the process of numerical simulation used to generate training, validation, and test datasets for the hybrid DNN model. Subsequently, we outline the framework of the supervised training task and provide a detailed description of the model architecture and training approach.

Study area

The study area is the eastern part of the North Qatar aquifer (see Figure 1), which covers around 3,569 km² or 13% of the entire surface area of Qatar. The area is mostly flat, and surface elevation ranges from 0 m at the sea boundary to 48 m above mean sea level further inland. The topography is characterized by sunken formations that result from the dissolution of limestone, creating karst features that are beneficial for replenishing the groundwater supply (Baalousha 2016a).
Figure 1

The study area in northeast Qatar, and visualization of the elevation map.


The climate is generally hot and arid, with hot summers and short, mild winters. Rainfall is highly erratic and mostly occurs between November and March. The average temperature in summer can exceed 40°C, while in winter it can drop to around 15°C. Apart from flash floods, surface water is nonexistent. The average annual rainfall is 80 mm per year. Recharge, in the context of Qatar's hydrogeology, predominantly takes place via terrestrial depressions where rainwater runoff accumulates following precipitation events. Recharge from rainfall is estimated to be 7–10% of annual rainfall (Baalousha et al. 2015). Flash floods resulting from thunderstorms account for much of the recharge potential. Hence, there is high uncertainty in natural recharge rates.

The aquifer in this area is unconfined, extensively karstic, and fractured (Eccleston et al. 1981; Ahmad & Al-Ghouti 2020). Due to high salinity, the groundwater in Qatar is generally not used directly as drinking water for the public, but it is widely used for agricultural and domestic consumption (Ahmad & Al-Ghouti 2020). The study area of northeast Qatar has a relatively high density of pumping wells. According to Schlumberger Water Services (2009), the freshwater lens in the north-central part of Qatar has been reduced from 15% of the country's area in 1971 to less than 2% in 2009.

Numerical simulations

For the numerical simulations, the widely used MODFLOW model has been chosen. Several previous studies (e.g., Baalousha 2016a, 2016b; Ahmad & Al-Ghouti 2020) have focused on the numerical simulation of the Qatar aquifer or regions that overlap with our study area. Building on our understanding of this previous work, we modeled the aquifer as a two-layer system. Layer 1 represents the uppermost layer, with a thickness range of 16.34–103.58 m, while layer 2 constitutes the lower layer, with a thickness range of 339.73–356.83 m. The model boundaries consist of the sea to the east and north, representing constant head boundaries, while the western boundary accounts for lateral regional flow. According to previous studies, this lateral inflow is estimated to be approximately 2 million m³/year (Schlumberger Water Services 2009; Baalousha 2016b).

A quasi-3D model domain, which encompasses 713,878 cells, each with dimensions of 100 × 100 m, was set up in this study. These cells are structured into two layers, with each layer containing 356,939 cells. The MODFLOW model was developed using the spatial recharge distribution data from Baalousha et al. (2018). Groundwater recharge from rainfall varies spatially, influenced by factors such as rainfall data, surface topography, and soil type. In Qatar, recharge predominantly takes place within land depressions where rainwater gathers post-rainstorm events (Baalousha et al. 2018). Consequently, most of the recharge occurs in specific areas with limited surface extent.

The model was calibrated under steady-state conditions using groundwater level measurements from the predevelopment era of the 1950s. The calibration process employed the parameter estimation (PEST) code (Welter et al. 2015) to estimate the distribution of heterogeneous hydraulic conductivity. To account for the spatial variability of hydraulic conductivity and reduce the number of parameters, we adopted the pilot points approach. A total of 51 pilot points were randomly selected and utilized. During model calibration, total groundwater recharge was kept constant at 8,227.86 m³/day (about 3 million m³/year). This amount was calculated from a soil water budget model developed by Baalousha et al. (2018) to determine the distribution of net recharge from rainfall.

The calibrated steady-state numerical model was then utilized to construct a transient model. For this purpose, the steady-state groundwater conditions served as the initial conditions for the subsequent simulations. The transient model was designed to simulate 30 years, encompassing two stress periods per year, namely, the months with higher and lower precipitation. Specifically, in each year, the months from November through March experienced marginally higher precipitation, while the remaining seven months had slightly lower rainfall. We will refer to these two periods as ‘wet’ and ‘dry’ months, although it is important to note that the region experiences annual rainfall consistently less than 80 mm/year. Therefore, when we use the term ‘wet period’, we are referring to months with marginally higher precipitation, which are not truly ‘wet’ by typical standards.

Creating the training and validation dataset for model development

We integrated the transient MODFLOW model with the Python FloPy package (Bakker et al. 2016) to streamline model simulation and post-processing. Utilizing FloPy, we developed a custom code to automatically generate MODFLOW input files, execute the model, and extract hydraulic head distributions. To generate the training dataset for the hybrid DNN model, two model parameters are varied in the simulations, as follows:

  • (1) Total annual recharge: The total annual recharge is randomly varied between 2.7 and 12 million m³/year using a uniform probability distribution. This range is equivalent to 0.9–4 times the calibrated annual recharge value. The sampled recharge values are then distributed between the wet stress period months (November through March) and the dry stress period months of the year, with ratios of 0.8 and 0.2, respectively. As the spatial distribution of the total recharge remains consistent with the description provided in Section 2.2, the hybrid DNN model takes the total annual recharge as a numerical input.

  • (2) Hydraulic conductivity: The model domain is divided into 100 rectangular zones, each measuring 4,100 × 8,800 m. The hydraulic conductivity values for each zone are derived by multiplying the calibrated hydraulic conductivity values with random samples from a normal distribution, which has a mean value of 1.925 and a standard deviation of 0.4. This process was consistently applied to both layers, resulting in the creation of a checkerboard pattern. As this process changes the hydraulic conductivity distribution, the hydraulic conductivity is an image input to the machine learning (ML) model.
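The two sampling steps above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the random seed, the 10 × 10 zone layout expressed as 88 × 41 cells, and the orientation of the zones relative to the 876 × 404 grid are all assumptions. The recharge bounds, the 0.8/0.2 split, and the normal-distribution parameters are taken from the text.

```python
import numpy as np

N_ROWS, N_COLS = 876, 404            # grid size in 100 m cells (from the text)
ZONE_ROWS, ZONE_COLS = 88, 41        # 8,800 m x 4,100 m zones in cells (orientation assumed)
CALIBRATED_RECHARGE = 8227.86 * 365  # m^3/year, about 3 million (from the text)


def sample_recharge(rng):
    """Draw a total annual recharge and split it into wet and dry periods."""
    total = rng.uniform(0.9 * CALIBRATED_RECHARGE, 4.0 * CALIBRATED_RECHARGE)
    wet = 0.8 * total   # November through March
    dry = 0.2 * total   # remaining seven months
    return total, wet, dry


def sample_k_multiplier(rng, mean=1.925, std=0.4):
    """Return a (N_ROWS, N_COLS) checkerboard of zone-wise multipliers.

    One multiplier is drawn per zone and broadcast to every cell in that
    zone. Negative draws are extremely unlikely with these parameters,
    so no truncation is applied in this sketch.
    """
    zone_values = rng.normal(mean, std, size=(10, 10))
    row_zone = np.arange(N_ROWS) // ZONE_ROWS   # zone index of each grid row
    col_zone = np.arange(N_COLS) // ZONE_COLS   # zone index of each grid column
    return zone_values[np.ix_(row_zone, col_zone)]


if __name__ == "__main__":
    rng = np.random.default_rng(42)  # seed chosen only for reproducibility
    total, wet, dry = sample_recharge(rng)
    multiplier = sample_k_multiplier(rng)
    # The perturbed field would be calibrated_k * multiplier, applied to both layers.
    print(f"total recharge: {total:.0f} m^3/year, multiplier shape: {multiplier.shape}")
```

Because the same multiplier array is applied to both layers, the perturbed conductivity fields share the checkerboard pattern while keeping the calibrated within-zone variability.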

The above parameters, including the recharge rates and hydraulic conductivity values, are loosely based on existing literature and empirical data. It is important to emphasize that these parameters are utilized as illustrative values within the context of our study. Given the inherent variability and complexity of hydrogeological systems, the exact values for these parameters are challenging to determine precisely. The primary aim of employing these values is not to assert their absolute accuracy but rather to demonstrate the efficacy and applicability of our novel ML-based modeling approach to a real-case scenario.

The regional groundwater model is executed over a 30-year period, capturing the initial head distribution and the head distribution at the end of each year. The objective is to develop an ML model that can accurately simulate the head changes occurring during a 1-year period. For this purpose, the inputs are the hydraulic conductivity maps for the two layers (as two images), the head distribution at the start of the year (as an image), and the total recharge rate during that year (as a numeric value). With these inputs, the hybrid DNN aims to predict the head distributions at the end of the wet and dry periods of that year as two image outputs. All input and output images utilized in the model are configured to a resolution of 876 × 404 pixels. The simulations were executed on a computer equipped with an Intel Core i7-7700HQ CPU, clocked at 2.80 GHz, and supported by 12 GB of RAM. Our study entailed the completion of 1,000 numerical simulations, each generating about 105 MB of data across 63 distinct CSV files, for a total dataset of roughly 100 GB.
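The pairing of inputs and outputs described above can be illustrated with a short NumPy sketch. The function name, the argument names, and the array layout (channels last, one sample per simulated year) are assumptions made for illustration; how the heads are read from the simulation output files is not shown.

```python
import numpy as np


def build_samples(k_layer1, k_layer2, start_heads, wet_heads, dry_heads, recharge):
    """Assemble one simulation run into per-year training samples.

    k_layer1, k_layer2 : (H, W) conductivity maps, constant over the run
    start_heads        : (n_years, H, W) head at the start of each year
    wet_heads          : (n_years, H, W) head at the end of each wet period
    dry_heads          : (n_years, H, W) head at the end of each dry period
    recharge           : total annual recharge for the run (scalar)
    """
    n_years = start_heads.shape[0]
    k_stack = np.stack([k_layer1, k_layer2], axis=-1)              # (H, W, 2)
    k_rep = np.broadcast_to(k_stack, (n_years,) + k_stack.shape)   # (n_years, H, W, 2)
    # Three image channels: K of layer 1, K of layer 2, head at start of year.
    images = np.concatenate([k_rep, start_heads[..., None]], axis=-1)
    # One numeric input per sample: the total annual recharge.
    scalars = np.full((n_years, 1), recharge, dtype=float)
    # Two output channels: heads at the end of the wet and dry periods.
    targets = np.stack([wet_heads, dry_heads], axis=-1)
    return images, scalars, targets
```

With 30 simulated years per run, each call yields 30 samples, so 1,000 runs would give on the order of 30,000 input–output pairs under this layout.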

Surrogate modeling based on mixed inputs

Multimodal inputs, which consist of a combination of images and numerical data, introduce an additional layer of complexity to image-to-image regression tasks (Sinitsin et al. 2022). It requires learning a function that maps both the input images and the numerical inputs to an output image. The image inputs are denoted as $X_1, \dots, X_n$, and the scalar inputs as $s$. We seek a mapping function $f$ from these inputs to an output image $Y$ such that:
$$Y = f(X_1, \dots, X_n, s) \qquad (1)$$
where $X_n \in \mathbb{R}^{H \times W}$ ($X_n$ is the nth input channel, H is the image height, and W is the image width), $s \in \mathbb{R}$, and $Y \in \mathbb{R}^{H \times W \times C_{out}}$ ($C_{out}$ is the number of output channels, denoting predictions for the wet and dry periods). The loss function minimized is the mean squared error (MSE) (Chen et al. 2022; Taccari et al. 2022):
$$\mathrm{MSE} = \frac{1}{M} \sum_{i=1}^{M} \left( Y_i - \hat{Y}_i \right)^2 \qquad (2)$$
where $M$ is the total number of pixels in the output image, and $Y_i$ and $\hat{Y}_i$ are the target and predicted values at pixel $i$. This function f is trained on a dataset of matching input–output image pairs along with their associated scalar values and, once trained, can be used to predict the output image from a new set of input images and scalars.

Network architecture

We have developed a novel hybrid network architecture capable of effectively handling mixed data types for image-to-image regression tasks. The architecture consists of two distinct branches, each tailored to process and analyze different types of inputs: images and numerical data. The first branch is a CNN encoder, adept at capturing spatial hierarchies and intricate patterns within image data through the mathematical operation of convolution. In our model, the convolution layers use 3 × 3 kernels to perform localized weighted sums across the image, enabling the detection of spatial features and patterns. This operation is defined as
$$(X * K)(i, j) = \sum_{m} \sum_{n} X(i + m, j + n)\, K(m, n) \qquad (3)$$
where $*$ represents the convolution operation, $X$ is the input image, K is the kernel, $(i, j)$ indexes the output pixel, and $m$ and $n$ index the kernel entries. The convolution operation captures essential spatial information in the form of feature maps. A 3 × 3 kernel offers a good balance between computational efficiency and the ability to capture relevant spatial features. This choice is informed by prevalent practices in deep learning, where 3 × 3 kernels are widely adopted for their efficiency in various pretrained models and architectures, such as Visual Geometry Group (VGG) and ResNet.
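The localized weighted sum performed by a 3 × 3 kernel can be made concrete with a small NumPy sketch. This is an illustration only, not the layer implementation used in the study: it handles a single channel, uses zero padding so the output keeps the input size (as with padding = 'same'), and follows the deep learning convention of sliding the kernel without flipping it. The function names are hypothetical.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Single-channel sliding-window weighted sum with zero 'same' padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))   # zeros around the border
    out = np.zeros_like(image, dtype=float)
    h, w = image.shape
    # Accumulate one shifted copy of the image per kernel entry.
    for m in range(kh):
        for n in range(kw):
            out += kernel[m, n] * padded[m:m + h, n:n + w]
    return out


def relu(x):
    """Rectified linear unit applied element-wise."""
    return np.maximum(x, 0.0)


if __name__ == "__main__":
    image = np.ones((5, 5))
    mean_kernel = np.full((3, 3), 1.0 / 9.0)
    feature_map = relu(conv2d_same(image, mean_kernel))
    print(feature_map)
```

With a constant image and an averaging kernel, interior pixels keep their value while border pixels are reduced, because part of the kernel window falls on the zero padding.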
The second branch comprises a multilayer perceptron (MLP), specifically designed to process numerical inputs. Utilizing dense layers with rectified linear unit (ReLU) activation functions, the MLP branch models complex relationships between the numerical features, complementing the CNN's spatial analysis. The outputs of these two branches, denoted as vectors $v_{\mathrm{CNN}}$ and $v_{\mathrm{MLP}}$, are then merged using concatenation, forming a unified feature vector $v = [v_{\mathrm{CNN}}, v_{\mathrm{MLP}}]$. This concatenation allows the model to leverage the strengths of both spatial and numerical data analyses. The combined data are further processed through additional dense layers, enabling the extraction of relevant and informative features from the mixed inputs. Finally, the output image is generated through a dense layer employing the ReLU activation function, followed by a reshaping operation to match the desired output image dimensions $H \times W \times C_{out}$. This reshaping transforms the final dense layer's output $z$ into a 3D tensor:
$$Y = \mathrm{reshape}\left(z, (H, W, C_{out})\right) \qquad (4)$$
This reshaping restores the spatial layout of the output image, which is crucial for accurate spatial feature representation. Figure 2 shows the architecture of the hybrid network. By combining the strengths of CNNs in spatial pattern recognition and MLPs in numerical data analysis, the hybrid neural network can learn a highly expressive mapping function. We refer to this architecture as the ‘Dual Path CNN-MLP’, which reflects the two distinct processing branches for different types of data. Details of the model architecture are provided in Table 3.
Table 3

Architecture of dual path CNN-MLP model

| Layer | Specification | Activation | Output shape |
|---|---|---|---|
| Image input | Shape = (219, 101, 3) | None | (None, 219, 101, 3) |
| Conv2D | Filters = 16, kernel size = 3, padding = ‘same’ | ReLU | (None, 219, 101, 16) |
| Conv2D | Filters = 16, kernel size = 3, padding = ‘same’ | ReLU | (None, 219, 101, 16) |
| Conv2D | Filters = 32, kernel size = 3, padding = ‘same’ | ReLU | (None, 219, 101, 32) |
| Conv2D | Filters = 64, kernel size = 3, padding = ‘same’ | ReLU | (None, 219, 101, 64) |
| MaxPooling2D | Pool size = (2, 2) | None | (None, 109, 50, 64) |
| Dropout | Rate = 0.5 | None | (None, 109, 50, 64) |
| Flatten | – | None | (None, 348800) |
| Numerical input | Shape = (1,) | None | (None, 1) |
| Concatenate | – | None | (None, 348801) |
| Dense | Units = 32 | ReLU | (None, 32) |
| Dense | Units = 64 | ReLU | (None, 64) |
| Dense | Units = 128 | ReLU | (None, 128) |
| Dropout | Rate = 0.5 | None | (None, 128) |
| Dense | Units = 219 × 101 | ReLU | (None, 22119) |
| Reshape | Target shape = (219, 101, 1) | None | (None, 219, 101, 1) |
Figure 2

Schematic of proposed DNN architecture. Conv, convolution layer; Concat, concatenation layer; Fc, fully connected layers.


The Dual Path CNN-MLP model was developed to integrate both image-based and vector-based inputs after recognizing the limitations of a purely MLP-based approach. Initially, an MLP model, designed to handle numerical data, was fed hydraulic conductivity values, head distribution, and recharge rates. However, owing to the high input dimensionality (63,328 dimensions), it underfitted, failed to capture complex spatial patterns, and ultimately made inaccurate predictions of head distribution. These results are omitted here because the predictions were too poor to be informative. To address this limitation, the Dual Path CNN-MLP model leverages the spatial pattern recognition capabilities of CNNs to handle hydraulic conductivity as image data. This improves prediction accuracy by exploiting spatial correlations, and the convolution operations keep computation efficient by reducing dimensionality while preserving essential features. The approach significantly outperforms the MLP model, giving much more accurate predictions and faster convergence, which reduces training time and computational cost.
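The layer stack listed in Table 3 can be written with the Keras functional API roughly as follows. This is a sketch assembled from the table, not the authors' code: the function name, layer names, and the choice of the Adam optimizer are assumptions, while the MSE loss, layer sizes, and dropout rates follow the text and table. As in Table 3, the numeric input is concatenated directly with the flattened CNN features and the output has a single channel.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_dual_path_cnn_mlp(height=219, width=101, in_channels=3, out_channels=1):
    """Build a two-input model following the layer list in Table 3."""
    # Image path: conductivity of both layers plus the starting head.
    image_in = layers.Input(shape=(height, width, in_channels), name="image_input")
    x = image_in
    for filters in (16, 16, 32, 64):
        x = layers.Conv2D(filters, kernel_size=3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Flatten()(x)

    # Numeric path: total annual recharge as a single scalar.
    scalar_in = layers.Input(shape=(1,), name="numerical_input")

    # Merge both paths and map the joint features to the output image.
    merged = layers.Concatenate()([x, scalar_in])
    for units in (32, 64, 128):
        merged = layers.Dense(units, activation="relu")(merged)
    merged = layers.Dropout(0.5)(merged)
    flat_out = layers.Dense(height * width * out_channels, activation="relu")(merged)
    image_out = layers.Reshape((height, width, out_channels))(flat_out)

    model = tf.keras.Model(inputs=[image_in, scalar_in], outputs=image_out)
    # Optimizer is an assumption; the text only specifies the MSE loss.
    model.compile(optimizer="adam", loss="mse")
    return model


if __name__ == "__main__":
    model = build_dual_path_cnn_mlp()
    model.summary()
```

In this layout the first dense layer after concatenation holds 348,801 × 32 weights, about 11.2 million, so it dominates the parameter count; the pooling step before flattening is what keeps that layer tractable.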

Model training and validation

To reduce the computational demands of training the Dual Path CNN-MLP model, we applied average pooling with a (4, 4) pool size to down-sample the image data from 876 × 404 to 219 × 101 while retaining essential spatial patterns. In addition, the input data underwent min–max normalization to ensure robust model training. The dataset was split into training and validation sets using k-fold cross-validation, a widely recognized method for evaluating deep learning models more dependably (Khan et al. 2021). This technique, known for mitigating overfitting and improving generalization (Garre et al. 2020; Nguyen et al. 2021; Vu et al. 2022), divides the entire dataset into k equal parts. Each part, or fold, is used in turn for validation, with the remaining folds used for training, so that each model is trained on a slightly different subset of the data (Saud et al. 2020). This cycle yields k distinct models. Each model is then evaluated on its validation fold using metrics such as the coefficient of determination (R²) and root mean squared error (RMSE). The model exhibiting the highest average accuracy and lowest average error across all validation folds is selected. Its performance is further assessed on a test dataset to obtain an unbiased estimate of its capabilities. This methodology enhances the stability and reliability of the performance estimates by minimizing variability arising from random data partitioning. The model was developed in a Jupyter Notebook environment using the TensorFlow and Keras libraries.
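The preprocessing and splitting steps described above can be sketched in NumPy as follows. The function names are hypothetical, the number of folds is not stated in the text (k = 5 is used purely as an example), and computing the min–max bounds from the training folds only is an assumed design choice rather than something the source specifies.

```python
import numpy as np


def average_pool(image, pool=4):
    """Down-sample a 2D image by averaging non-overlapping pool x pool blocks."""
    h, w = image.shape
    if h % pool or w % pool:
        raise ValueError("image dimensions must be divisible by the pool size")
    return image.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))


def min_max_scale(values, lo, hi):
    """Scale values to [0, 1] given precomputed bounds lo and hi."""
    return (values - lo) / (hi - lo)


def k_fold_indices(n_samples, k, rng):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    head = rng.random((876, 404))              # stand-in for one head map
    pooled = average_pool(head, pool=4)        # -> (219, 101)
    scaled = min_max_scale(pooled, pooled.min(), pooled.max())
    for fold, (train_idx, val_idx) in enumerate(k_fold_indices(100, 5, rng)):
        print(fold, len(train_idx), len(val_idx), scaled.shape)
```

Both 876 and 404 are divisible by 4, so the (4, 4) pooling produces exactly 219 × 101 pixels without cropping or padding.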

The evaluation of our model's performance leverages three key metrics: RMSE measures the overall difference between predicted and actual values; mean absolute percentage error (MAPE) represents relative differences as a percentage; and explained variance score (EVS) assesses the matching degree of fluctuation between predicted and actual values, with values closer to 1 indicating better simulations (Cheng & Cao 2014; Du et al. 2021; Li et al. 2021). These three criteria are calculated as follows:
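$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{5}$$
$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{6}$$
$$\mathrm{EVS} = 1 - \frac{\operatorname{Var}\left(y - \hat{y}\right)}{\operatorname{Var}\left(y\right)} \tag{7}$$
where $y_i$ is the actual value of sample i, $\hat{y}_i$ is the predicted value of sample i, $\bar{y}$ is the average of the actual values, $\operatorname{Var}(y) = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2$ is the variance of the actual values, and N is the sample size.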
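As an illustration, the three metrics named above can be computed with a few lines of NumPy. This is a sketch using the standard definitions of RMSE, MAPE, and EVS (with population variance); the function names and the small example arrays are hypothetical and are not taken from the study.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (assumes no zero actual values)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

def evs(y_true, y_pred):
    """Explained variance score: 1 - Var(residuals) / Var(actual values)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(1.0 - np.var(y_true - y_pred) / np.var(y_true))

# Made-up head values purely for demonstration.
y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred = np.array([10.5, 11.5, 14.5, 15.5])

print(rmse(y_true, y_pred), mape(y_true, y_pred), evs(y_true, y_pred))
```

Because MAPE divides by the actual value, it is only meaningful where the actual heads are nonzero; RMSE and EVS do not share that restriction.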

In this section, the numerical simulation outputs are presented, and the Dual Path CNN-MLP surrogate model is validated. The model is then employed for uncertainty propagation analysis, and the resulting estimates are compared with the known target values.

Numerical simulation results

Figure 3(a) presents the head distribution from the MODFLOW-based steady-state simulation, representing the predevelopment conditions in the study area. The calibrated model reproduces the predevelopment hydraulic head distribution with reasonable accuracy and a mean absolute percentage error of less than 10%. Figures 3(d) and 3(e) illustrate the calibrated hydraulic conductivities for the study area. In layer 1, these values range from 0.01 to over 15 m/day, while in layer 2, they span from 0.01 to nearly 80 m/day. These figures underscore the significant heterogeneity of the partially karstified aquifer in this region. In general, the hydraulic conductivity values identified in this study align with those reported in earlier studies for Qatar, which varied between 0.01 and over 1,000 m/day (Schlumberger Water Services 2009; Baalousha 2016a; Baalousha et al. 2019; Ajjur & Al-Ghamdi 2022). These studies consistently highlight higher hydraulic conductivities in Qatar's northern regions. Moreover, within our study area, hydraulic conductivity increases gradually from the southwest eastward and northward toward the sea, a pattern also observed by Baalousha et al. (2019) for the same region.
Figure 3

Hydraulic head and conductivity distributions: (a) hydraulic head output from steady-state simulation, (b) hydraulic head at the conclusion of the dry period, (c) hydraulic head at the end of the wet period in year 30 for the transient simulation with identical recharge and hydraulic conductivity values as the steady-state model, (d) calibrated hydraulic conductivity in logarithmic scale in layer 1, and (e) calibrated hydraulic conductivity in logarithmic scale in layer 2.


Training results of the Dual Path CNN-MLP model

The selection of hyper-parameters for the Dual Path CNN-MLP model was carried out through iterative experimentation, with their specific values detailed in Table 4. The Adam optimizer was selected for its efficiency, coupled with the MSE as the chosen loss function to guide the training process. We configured the training with a batch size of 16, spanning over 60 epochs, and applied a learning rate of 0.001 to ensure precise model adjustments during optimization. The dataset encompassed 6,000 training samples, alongside 1,500 reserved for testing purposes.

Table 4

Hyperparameter values of the Dual Path CNN-MLP model

Hyperparameter      Value
Optimizer           Nadam
Loss function       MSE
Samples             6,000
Test samples        1,500
Validation split    0.166
Batch size          16
Epochs              60
Learning rate       0.001

To evaluate the model's performance, we employed k-fold cross-validation with $k = 6$, training each fold for 60 epochs using a similar subset size of approximately 6,000 samples. In each iteration, five folds were used for training and one fold was reserved exclusively for validation, so that the model was trained under six distinct conditions to enhance generalizability and robustness. The MSE variations during each validation phase are depicted in Figure 4, which also illustrates the average MSE across all training and validation folds. This visualization shows that the model improves consistently across folds, indicating that it reaches high accuracy without pronounced overfitting and supporting a more dependable performance assessment. The validation accuracy achieved across the k-fold cross-validation process was notably high, with scores of 0.998, 0.998, 0.996, 0.996, 0.997, and 0.996 for the six folds. The corresponding validation RMSE values were 0.012, 0.012, 0.018, 0.018, 0.013, and 0.018, showing consistent predictive precision. Based on these results, the model with the best balance of highest accuracy and lowest error was selected for further evaluation.
Figure 4

MSE during the validation periods for each fold and the average MSE for both training and validation datasets across all folds.
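The six-fold partitioning described above can be sketched as follows. This is an illustrative index-splitting routine only, assuming 6,000 samples and a random shuffle with a fixed seed; the helper name `k_fold_indices` is hypothetical, and the actual training of the network on each split is omitted.

```python
import numpy as np

def k_fold_indices(n_samples, k=6, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold serves once as validation."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    folds = np.array_split(shuffled, k)
    for i in range(k):
        val_idx = folds[i]
        # The other k - 1 folds together form the training set.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

splits = list(k_fold_indices(6000, k=6))
for fold, (train_idx, val_idx) in enumerate(splits, start=1):
    # A separate model would be trained on train_idx and scored on val_idx here.
    print(fold, len(train_idx), len(val_idx))
```

With 6,000 samples and six folds, every iteration trains on 5,000 samples and validates on 1,000, which matches the validation split of roughly one-sixth.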


The higher MSE observed in training than in validation in Figure 4 can be attributed to several factors. First, using five-sixths of the data for training exposes the model to a wider variety of cases and noise, thereby increasing the variability in the error calculation. This contrasts with the validation phase, which uses only one-sixth of the data and potentially captures less complexity and noise, resulting in a lower MSE. In addition, since each fold is validated only once without being directly trained on, the model may generalize well to this unseen data subset, which can lead to a lower MSE during validation. Moreover, the logarithmic scale used to display MSE exaggerates these differences, making the discrepancies between training and validation MSE appear more pronounced than they would on a linear scale.

The effectiveness of the selected model was subsequently evaluated against the test dataset, providing a detailed examination of its ability to generalize and the dependability of its performance. The scatter plots in Figures 5(a) and 5(b) compare the model's predicted head values against those obtained from numerical simulations for wet and dry periods. During the wet months, as illustrated in Figure 5(a), the model demonstrates a robust correlation with an R2 score of 0.995, although minor discrepancies are noted for higher head values. In the dry months (Figure 5(b)), the model's predictions align even more closely with the simulated values, with an R2 score of 0.998, despite a few instances where the model slightly overestimates the numerical simulations. This analysis underscores the model's strong predictive accuracy across different hydrological conditions.
Figure 5

Scatter plots comparing numerical simulations versus predicted head values (a) during the wet and (b) dry months for the test periods. The coefficient of determination is denoted as R2 and the red line represents the 1:1 line.


To rigorously assess the selected model's ability to predict groundwater head values during both wet and dry periods, we used MAPE, RMSE, and EVS as key performance indicators across the training, validation, and testing phases. As detailed in Table 5, the results highlight the model's high degree of accuracy and precision. Specifically, MAPE values fall within narrow ranges of 1.977–2.028% for wet months and 1.47–1.53% for dry months, demonstrating consistent accuracy and a small average error in estimating groundwater levels. RMSE values, lying between 0.016 and 0.018 m for wet months and between 0.0043 and 0.0045 m for dry months, further affirm the precision of the predictions. Moreover, EVS scores of 0.996–0.997 in both periods illustrate the model's capacity to account for the variance observed in the numerical simulations.

Table 5

Summary of statistical metrics for evaluating the Dual Path CNN-MLP model performance in simulating groundwater head values

Months    Period        MAPE (%)    RMSE (m)    EVS
Wet       Training      2.028       0.018       0.996
Wet       Validation    1.977       0.016       0.997
Wet       Testing       2.025       0.018       0.996
Dry       Training      1.53        0.0045      0.997
Dry       Validation    1.47        0.0043      0.997
Dry       Testing       1.52        0.0045      0.997

Head time series prediction

To further analyze the model's performance, we chose a specific observation point, shown in Figure 6(a), and present the 30-year time series of the numerically simulated head values (from a single numerical model simulation of the test dataset), the Dual Path CNN-MLP-predicted head values (generated from 30 one-year simulations, where the model's output at time $t$ becomes the input at time $t + 1$), and the corresponding recharge (Figure 6(b)). The results show strong agreement between the model's predictions and the numerically simulated head values during wet months, indicating that the simulated behavior is replicated well. During dry months, minor discrepancies emerge between the DNN-predicted and numerically simulated values, though the overall behavior of the time series is preserved.
Figure 6

Comparing a 30-year time series of head values at a specific observation point depicted in (a), between target values and predictions by the Dual Path CNN-MLP model (b).
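The chained prediction scheme described above, in which each one-year output is fed back as the next input, can be sketched as a simple loop. The names `rollout` and `toy_surrogate` are hypothetical, and the toy function is only a stand-in with an assumed call signature; it does not represent the trained Dual Path CNN-MLP or its actual inputs and outputs.

```python
import numpy as np

def rollout(surrogate, initial_head, conductivity, recharge_series):
    """Chain one-step predictions: the output at step t is the input at step t + 1."""
    heads = []
    head = initial_head
    for recharge in recharge_series:
        head = surrogate(head, conductivity, recharge)
        heads.append(head)
    return np.stack(heads)

def toy_surrogate(head, conductivity, recharge):
    """Hypothetical stand-in for a trained network (ignores conductivity)."""
    return 0.9 * head + 0.1 * recharge * np.ones_like(head)

initial_head = np.zeros((2, 3))          # tiny synthetic head map
conductivity = np.ones((2, 3))           # placeholder conductivity map
recharge_series = np.ones(30)            # 30 yearly recharge values

heads = rollout(toy_surrogate, initial_head, conductivity, recharge_series)
print(heads.shape)
```

One consequence of this design is that prediction errors can accumulate over the chain, since each step starts from a predicted rather than a simulated head field.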


Analyzing error distribution across the head map

To assess the performance of the neural network in simulating head distribution, we used a sample from the test data. Figure 7(a) presents the neural network's input parameters, comprising the hydraulic conductivity maps in layers 1 and 2 along with the recharge time series. Figure 7(b) depicts the reference hydraulic head distribution obtained through numerical simulation, and Figure 7(c) shows the hydraulic head estimates produced by the Dual Path CNN-MLP model, visualized as a contour plot. To highlight the differences, Figure 7(d) shows the spatial distribution of the absolute percentage error between the numerical and DNN model outcomes. The spatial distributions of hydraulic head generated by the numerical model and the DNN model are similar. The percentage error ranges between 0 and 12%, with the majority of points having errors of less than 4%. The most pronounced errors occur in regions with the steepest spatial gradients of head.
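A cell-by-cell absolute percentage error map of the kind described above can be computed as in the following sketch. The arrays are small made-up examples rather than model output, the function name is hypothetical, and the calculation assumes the reference heads are nonzero.

```python
import numpy as np

def absolute_percentage_error_map(reference, predicted):
    """Cell-wise absolute percentage error (assumes nonzero reference values)."""
    reference = np.asarray(reference, float)
    predicted = np.asarray(predicted, float)
    return 100.0 * np.abs((reference - predicted) / reference)

# Made-up 2 x 2 head maps for demonstration only.
reference = np.array([[10.0, 20.0], [5.0, 8.0]])
predicted = np.array([[10.2, 19.0], [5.5, 8.0]])

ape = absolute_percentage_error_map(reference, predicted)
share_below_4 = float(np.mean(ape < 4.0))   # fraction of cells with error under 4%
print(ape, share_below_4)
```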

Uncertainty propagation analysis

In this subsection, we assess the effectiveness of our Dual Path CNN-MLP model for uncertainty propagation analysis in regional groundwater flow simulations, with Monte Carlo simulations (MCSs) serving as the foundational method. MCS is a statistical technique that involves defining a model of the system in question with input variables $x_1, \dots, x_n$, each associated with a specific probability distribution that reflects the uncertainty or variability of that input. For each iteration of the simulation, a set of input values ($x_1, \dots, x_n$) is randomly drawn from these distributions, and the model calculates an outcome $y = f(x_1, \dots, x_n)$, where f represents the system's mathematical function or model. This process is repeated many times, generating a dataset of outcomes from which statistical properties, such as the mean ($\mu$), standard deviation ($\sigma$), and probability density function $p(y)$, are estimated. These properties provide insights into the expected behavior of the system, its variability, and the probabilistic relationship between inputs and the outcome. MCS is particularly useful for assessing risk and evaluating the impact of uncertain parameters (Baalousha 2016c; Wang et al. 2020).
Figure 7

(a) Input data: recharge and hydraulic conductivity in layers 1 and 2, (b) target head distributions, (c) Dual Path CNN-MLP-based predicted head distribution, and (d) absolute percentage error map.

To conduct MCS, we used 200 separate sets of input values (specifically, hydraulic conductivity maps and recharge values) within the MODFLOW numerical model. This process yields $\mu$ and $\sigma$ maps for the resulting head outputs. We then replaced the numerical model with the Dual Path CNN-MLP surrogate model and repeated the MCS with the same input sets. The $\mu$ and $\sigma$ distributions obtained for both the numerical model and the Dual Path CNN-MLP are juxtaposed in Figure 8. Figure 8 also shows the error maps, which are generated through a pixel-by-pixel comparison of the numerical model's and the DNN's $\mu$ and $\sigma$ values.
Figure 8

Maps of mean, standard deviation, and their respective estimated absolute errors for simulated head values in the 30th year during wet months (a–f) and dry months (g–l).
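The Monte Carlo comparison described above can be outlined in a few lines. In this sketch both models are toy functions on a tiny synthetic grid: `reference_model` stands in for the numerical simulator and `surrogate_model` for an imperfect emulator, and neither reflects MODFLOW or the trained network. The 200 realizations match the count stated in the text; the standard deviation is the population form (NumPy default).

```python
import numpy as np

def monte_carlo_stats(model, input_sets):
    """Run the model on every input set and return cell-wise mean and std maps."""
    outputs = np.stack([model(x) for x in input_sets])
    return outputs.mean(axis=0), outputs.std(axis=0)

rng = np.random.default_rng(42)
n_realizations = 200
# Synthetic conductivity realizations on a 4 x 5 grid (illustrative only).
input_sets = [rng.lognormal(0.0, 0.5, size=(4, 5)) for _ in range(n_realizations)]

def reference_model(k):
    """Toy stand-in for the numerical simulator."""
    return 10.0 + np.log(k)

def surrogate_model(k):
    """Toy stand-in for an imperfect surrogate of the reference model."""
    return 10.0 + 0.98 * np.log(k) + 0.01

mu_ref, sigma_ref = monte_carlo_stats(reference_model, input_sets)
mu_sur, sigma_sur = monte_carlo_stats(surrogate_model, input_sets)

# Pixel-by-pixel absolute error maps for the mean and the standard deviation.
mu_error = np.abs(mu_ref - mu_sur)
sigma_error = np.abs(sigma_ref - sigma_sur)
print(mu_error.max(), sigma_error.max())
```

Using the same input sets for both models, as here, means that differences in the resulting maps reflect the surrogate's error rather than sampling noise.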


In these figures, notable agreement in $\mu$ and $\sigma$ can be recognized between the outputs of the DNN and the numerical model during both wet and dry months. Specifically, during wet months, the alignment between the mean maps of the numerical model and the DNN model (Figure 8(b) and 8(c)) significantly surpasses that of the standard deviation maps (Figure 8(d) and 8(e)). Conversely, during dry months, the agreement between the standard deviation maps of the numerical model and the DNN model (Figure 8(g) and 8(h)) is more pronounced than that between the mean maps (Figure 8(j) and 8(k)). Maximum absolute errors stand at 0.009 and 0.008 for $\mu$ and 0.01 and 0.004 for $\sigma$ estimations during wet and dry months, respectively, primarily concentrated near regions of high head values. The average absolute percentage error for mean head values between the numerical model and the DNN model is 0.61% for wet months and 0.81% for dry months, indicating close agreement with the target maps in both periods, with slightly better agreement during wet months.

Computation efficiency

In this section, we quantify the computational efficiency of our proposed ML model relative to traditional numerical modeling, specifically using MODFLOW 2005. Our analysis centers on the computational time required by each approach, providing a clear metric for comparison. In our test case, a single simulation with MODFLOW 2005 requires 265.485 s to complete. In contrast, the neural network model, following an initial training phase of approximately 16,407 s, can execute a prediction in just 0.39 s. Hence, although the ML model demands a substantial one-time training investment, this phase does not recur for similar tasks, so subsequent predictions are fast. The neural network's ability to perform predictions in under a second represents a considerable advantage over numerical modeling. The computational efficiency improvement can be expressed as the percentage reduction in time from MODFLOW 2005 to the Dual Path CNN-MLP surrogate model for predictions:
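$$\text{Time reduction} = \frac{t_{\text{MODFLOW}} - t_{\text{surrogate}}}{t_{\text{MODFLOW}}} \times 100\% = \frac{265.485 - 0.39}{265.485} \times 100\% \approx 99.85\% \tag{8}$$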

This reduction in computation time illustrates the significant efficiency gains achievable with the proposed ML model.
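The arithmetic behind this comparison can be checked with a short calculation. The run times are those quoted above; the training time is taken as 16,407 s, which is an assumption about the reported figure. The break-even count is an added illustration and considers training time only, ignoring the cost of generating the training data with the numerical model.

```python
import math

modflow_seconds = 265.485      # one MODFLOW 2005 simulation
surrogate_seconds = 0.39       # one surrogate prediction
training_seconds = 16407.0     # assumption: one-time training cost of 16,407 s

# Percentage reduction in time per simulation when using the surrogate.
reduction_percent = 100.0 * (modflow_seconds - surrogate_seconds) / modflow_seconds

# Speed-up factor per simulation.
speedup = modflow_seconds / surrogate_seconds

# Number of simulations after which the time saved exceeds the training time.
break_even_runs = math.ceil(training_seconds / (modflow_seconds - surrogate_seconds))

print(round(reduction_percent, 2), round(speedup, 1), break_even_runs)
```

Under these assumptions, the one-time training cost is recovered after a few dozen surrogate runs, which is small compared with the hundreds of runs typical of Monte Carlo analysis.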

In this study, we developed and tested the ‘Dual Path CNN-MLP’, a novel hybrid model that combines CNNs and MLPs to address the challenge of simulating regional groundwater flow. By effectively handling a mix of input types, including imagery and numerical data, this model overcomes the traditional limitations associated with the homogenization of data formats, preserving the integrity of the diverse groundwater data and enhancing the model's utility in real-world applications. Our methodology involved applying this model to simulate transient groundwater flow within the northeast Qatar aquifer, comparing its performance against the established MODFLOW model through rigorous validation techniques, including k-fold cross-validation. Key findings from our study reveal that the Dual Path CNN-MLP model achieves remarkable precision in predicting groundwater flow, with an error margin of less than 12% across all tested locations. The model notably excels in uncertainty analysis, evidenced by its ability to estimate the mean hydraulic head with an average absolute percentage error of about 1%, and demonstrates substantial computational efficiency, offering a 99% reduction in computation time over the MODFLOW simulations.

The proposed model has the capacity to integrate additional factors such as porosity and surface topology, among others, either as images or vector inputs. However, in this investigation, we deliberately chose not to expand the model's input spectrum. This was based on the observation that our current methodology achieves notable accuracy without needing to factor in these additional elements. The decision on whether to include a more comprehensive set of parameters is dependent on the specific requirements of each study and presents an opportunity for further exploration in future research. Given that this study primarily explores 2D simulations, there is potential for extending our work into three-dimensional groundwater modeling. Such expansion would allow for a better understanding of aquifer behaviors and geological complexities. Future research could consider the model's adaptability to varied hydrogeological scenarios, its ability to process a broader array of data types, and ways to enhance its architecture for improved predictive performance. In addition, integrating physical loss functions, akin to those used in Physics-Informed Neural Networks, could offer a pathway to increase the physical realism of the simulations, thereby enhancing their feasibility and applicability in real-world settings.

The authors would like to acknowledge Hamad Bin Khalifa University, Qatar, and Sultan Qaboos University (SQU), Oman, for the support received through the awarded grants NPRP13S-0129-200198 (for SQU, EG/DVC/WRC/21/01). The support of the research group DR/RG/017 is also appreciated.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Ahmad A. Y. & Al-Ghouti M. A. 2020 Approaches to achieve sustainable use and management of groundwater resources in Qatar: A review. Groundwater for Sustainable Development 11, 100367.
Asher M. J., Croke B. F., Jakeman A. J. & Peeters L. J. 2015 A review of surrogate models and their application to groundwater modeling. Water Resources Research 51 (8), 5957–5973.
Azizpour A., Izadbakhsh M. A., Shabanlou S., Yosefvand F. & Rajabi A. 2022 Simulation of time-series groundwater parameters using a hybrid metaheuristic neuro-fuzzy model. Environmental Science and Pollution Research 29 (19), 1–17.
Baalousha H. M. 2016a Development of a groundwater flow model for the highly parameterized Qatar aquifers. Modeling Earth Systems and Environment 2, 1–11.
Baalousha H. M. 2016b Groundwater vulnerability mapping of Qatar aquifers. Journal of African Earth Sciences 124, 75–93.
Baalousha H. M. 2016c Using Monte Carlo simulation to estimate natural groundwater recharge in Qatar. Modeling Earth Systems and Environment 2, 1–7.
Baalousha H., McPhee H. M. & Anderson M. J. 2015 Estimation of natural groundwater recharge in Qatar using GIS. In MODSIM2015, 21st International Congress on Modelling and Simulation, pp. 2026–2032.
Baalousha H. M., Barth N., Ramasomanana F. H. & Ahzi S. 2018 Groundwater recharge estimation and its spatial distribution in arid regions using GIS: A case study from Qatar karst aquifer. Modeling Earth Systems and Environment 4, 1319–1329.
Bakker M., Post V., Langevin C. D., Hughes J. D., White J. T., Starn J. J. & Fienen M. N. 2016 Scripting MODFLOW model development using Python and FloPy. Groundwater 54 (5), 733–739.
Chen Y., Li L., Li W., Guo Q., Du Z. & Xu Z. 2022 AI Computing Systems: An Application Driven Perspective. Elsevier.
Christelis V., Kopsiaftis G. & Mantoglou A. 2019 Performance comparison of multiple and single surrogate models for pumping optimization of coastal aquifers. Hydrological Sciences Journal 64 (3), 336–349.
Eccleston B. L., Pike J. G. & Harhash I. 1981 The Water Resources of Qatar and Their Development, Vol. I. Food and Agricultural Organization of the United Nations, Rome, Italy.
Eslami P., Nasirian A., Akbarpour A. & Nazeri Tahroudi M. 2022 Groundwater estimation of Ghayen plain with regression-based and hybrid time series models. Paddy and Water Environment 20 (3), 429–440.
Han Z., Lu W., Fan Y., Lin J. & Yuan Q. 2020 A surrogate-based simulation–optimization approach for coastal aquifer management. Water Supply 20 (8), 3404–3418.
Kontos Y. N., Kassandros T., Perifanos K., Karampasis M., Katsifarakis K. L. & Karatzas K. 2022 Machine learning for groundwater pollution source identification and monitoring network optimization. Neural Computing and Applications 34 (22), 19515–19545.
Kopsiaftis G., Protopapadakis E., Voulodimos A., Doulamis N. & Mantoglou A. 2019 Gaussian process regression tuned by Bayesian optimization for seawater intrusion prediction. Computational Intelligence and Neuroscience 2019, 2859429.
Kouziokas G. N., Chatzigeorgiou A. & Perakis K. 2018 Multilayer feed forward models in groundwater level forecasting using meteorological data in public management. Water Resources Management 32, 5041–5052.
LaHaye O., Habib E. H., Vahdat-Aboueshagh H., Tsai F. T. C. & Borrok D. 2021 Assessment of aquifer storage and recovery feasibility using numerical modeling and geospatial analysis: application in Louisiana. JAWRA Journal of the American Water Resources Association 57 (3), 505–526.
Liu Q., Gui D., Zhang L., Niu J., Dai H., Wei G. & Hu B. X. 2022 Simulation of regional groundwater levels in arid regions using interpretable machine learning models. Science of The Total Environment 831, 154902.
Mirarabi A., Nassery H. R., Nakhaei M., Adamowski J., Akbarzadeh A. H. & Alijani F. 2019 Evaluation of data-driven models (SVR and ANN) for groundwater-level prediction in confined and unconfined systems. Environmental Earth Sciences 78, 1–15.
Miro M. E., Groves D., Tincher B., Syme J., Tanverakul S. & Catt D. 2021 Adaptive water management in the face of uncertainty: Integrating machine learning, groundwater modeling and robust decision making. Climate Risk Management 34, 100383.
Najafzadeh M., Homaei F. & Mohamadi S. 2022 Reliability evaluation of groundwater quality index using data-driven models. Environmental Science and Pollution Research 29 (6), 8174–8190.
Nguyen X. C., Nguyen T. T. H., La D. D., Kumar G., Rene E. R., Nguyen D. D., Chang S. W., Chung W. J., Nguyen X. H. & Nguyen V. K. 2021 Development of machine learning-based models to forecast solid waste generation in residential areas: A case study from Vietnam. Resources, Conservation and Recycling 167, 105381.
Omar P. J., Gaur S., Dwivedi S. B. & Dikshit P. K. S. 2020 A modular three-dimensional scenario-based numerical modelling of groundwater flow. Water Resources Management 34, 1913–1932.
Ostad-Ali-Askari K. & Shayannejad M. 2021 Quantity and quality modelling of groundwater to manage water resources in Isfahan-Borkhar Aquifer. Environment, Development and Sustainability 23 (11), 15943–15959.
Payne K., Chami P., Odle I., Yawson D. O., Paul J., Maharaj-Jagdip A. & Cashman A. 2022 Machine learning for surrogate groundwater modelling of a small carbonate island. Hydrology 10 (1), 2.
Rajabi M. M., Javaran M. R. H., Bah A. O., Frey G., Le Ber F., Lehmann F. & Fahs M. 2022 Analyzing the efficiency and robustness of deep convolutional neural networks for modeling natural convection in heterogeneous porous media. International Journal of Heat and Mass Transfer 183, 122131.
Robinson C., Brovelli A., Barry D. A. & Li L. 2009 Tidal influence on BTEX biodegradation in sandy coastal aquifers. Advances in Water Resources 32 (1), 16–28.
Sarma R. & Singh S. K. 2022 A comparative study of data-driven models for groundwater level forecasting. Water Resources Management 36 (8), 2741–2756.
Saud S., Jamil B., Upadhyay Y. & Irshad K. 2020 Performance improvement of empirical models for estimation of global solar radiation in India: A k-fold cross-validation approach. Sustainable Energy Technologies and Assessments 40, 100768.
Schlumberger Water Services 2009 Studying and Developing the Natural and Artificial Recharge of the Groundwater in Aquifer in the State of Qatar, Appendices; Schlumberger Water Services, Doha, Qatar. Project final report retrieved from Department of Agricultural and Water Research (DAWR), Ministry of Environment (MoE).
Shivakoti B. R., Villholth K. G., Pavelic P. & Ross A. 2019 Strategic use of groundwater-based solutions for drought risk reduction and climate resilience in Asia and beyond. Contributing paper to Global Assessment Report on disaster risk reduction (GAR 2019). United Nations Office for Disaster Risk Reduction, Geneva, Switzerland.
Sinitsin V., Ibryaeva O., Sakovskaya V. & Eremeeva V. 2022 Intelligent bearing fault diagnosis method combining mixed input and hybrid CNN-MLP model. Mechanical Systems and Signal Processing 180, 109454.
Sun Q., Gao M., Wen Z., Guo F., Hou G., Liu Z., Cai Z., Chang X., Zheng T. & Zhao G. 2023 Reactive transport modeling for the effect of pumping activities on the groundwater environment in muddy coasts. Journal of Hydrology 621, 129614.
Taccari M. L., Nuttall J., Chen X., Wang H., Minnema B. & Jimack P. K. 2022 Attention U-Net as a surrogate model for groundwater prediction. Advances in Water Resources 163, 104169.
Taccari M. L., Wang H., Goswami S., Nuttall J., Chen X. & Jimack P. K. 2023 Developing a cost-effective emulator for groundwater flow modeling using deep neural operators. arXiv preprint arXiv:2304.12299.
Tian J., Li C., Liu J., Yu F., Cheng S., Zhao N. & Wan Jaafar W. Z. 2016 Groundwater depth prediction using data-driven models with the assistance of gamma test. Sustainability 8 (11), 1076.
Welter D. E., White J. T., Hunt R. J. & Doherty J. E. 2015 Approaches in highly parameterized inversion – PEST ++ Version 3, a Parameter ESTimation and uncertainty analysis software suite optimized for large environmental models (No. 7-C12). US Geological Survey.
Wossenyeleh B. K., Worku K. A., Verbeiren B. & Huysmans M. 2021 Drought propagation and its impact on groundwater hydrology of wetlands: A case study on the Doode Bemde nature reserve (Belgium). Natural Hazards and Earth System Sciences 21 (1), 39–51.
Yoon H., Jun S. C., Hyun Y., Bae G. O. & Lee K. K. 2011 A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer. Journal of Hydrology 396 (1–2), 128–138.
Zhou Y. & Li W. 2011 A review of regional groundwater flow modeling. Geoscience Frontiers 2 (2), 205–214.
Zhou Z. & Tartakovsky D. M. 2021 Markov chain Monte Carlo with neural network surrogates: Application to contaminant source identification. Stochastic Environmental Research and Risk Assessment 35, 639–651.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).