## Abstract

The purpose of this study was to improve the optimization model for predicting more parameters in more difficult conditions (more grid cell numbers and high time interval numbers) than other studies in groundwater flow modelling. Also, the model needs fewer observation numbers for estimating parameters than other studies. In the present study, an optimization model based on model calibration was developed to estimate simultaneously four groundwater flow parameters – hydraulic conductivity, transmissivity, storage coefficient and leakance. The modified clonal selection algorithm, a class of artificial immune systems, was used as a heuristic optimization method. In order to simulate the groundwater flow, MODFLOW was used in conjunction with the model in MATLAB. The input files for MODFLOW were obtained by GMS groundwater simulator. The model was applied to two different hypothetical groundwater systems (two- and three-dimensional) under transient conditions to evaluate its performance. The results showed that the model was feasible for groundwater flow modelling and it could determine the groundwater flow parameters successfully with less observations and more grid cell numbers than the other studies.

## HIGHLIGHTS

The optimization model was improved for determining groundwater model parameters.

The model can estimate more parameters in more difficult conditions (more grid cell numbers and high time interval numbers) than other studies.

The model needs to less observation numbers for estimating parameters than other studies.

## INTRODUCTION

Parameter estimations of models are performed by model calibration. Model calibration is the minimizing process of the difference between the model predictions and field observations of hydraulic heads, flows, water quality (concentrations), etc. It is crucial for groundwater modelling, surface water and hydrologic modelling, etc. Predictions of models depend on the calibration. The models should be calibrated correctly in order to obtain good results. There are several parameters which are required to calibrate in the models. Thus, the model calibration process is a very hard task. According to the type of layer (confined or unconfined aquifer) and simulation (steady-state or transient conditions), various parameters such as hydraulic conductivity, transmissivity, storage coefficient and leakance need to be calibrated in groundwater flow modelling. In order to determine these parameters, numerous trials and errors should be carried out according to observed heads. In this context, many approaches have been performed to facilitate this task in the related literature (Kleinecke 1971; Cooley 1979; Neuman & Carrera 1985; Sadeghi-Tabas *et al.* 2017). Rajanayaka & Kulasiri (2001) used the stochastic inverse method to predict hydraulic conductivity and longitudinal hydrodynamic dispersion parameters. Karahan & Ayvaz (2005) developed the dual reciprocity finite differences method to solve groundwater parameter (transmissivity) estimation problems. Karahan & Ayvaz (2006, 2008) proposed the artificial neural network model for predicting aquifer parameters (transmissivity and storativity). Ayvaz *et al.* (2007) used kernel-based fuzzy c-means clustering and genetic algorithm for estimating aquifer parameter (transmissivity) and zone structure. Hendricks-Franssen & Kinzelbach (2008) applied the ensemble Kalman filter to estimate states and parameters (transmissivity) in groundwater modelling. Majdalani & Ackerer (2011) identified groundwater parameters (hydraulic conductivity) by using an adaptive multiscale method. Noronha & Lee (2013) applied an information theory-based approach to quantify parameter uncertainty (hydraulic conductivity) in groundwater flow modelling. Swathi & Eldho (2018) applied a simulation-optimization model based on meshless local Petrov–Galerkin method and particle swarm optimization for aquifer parameter (transmissivity) identification and zonation structure estimation. However, these studies focused on predicting only one or two parameters, either hydraulic conductivity or transmissivity and storage coefficient. Thus, it is unknown whether they are feasible for different groundwater systems consisting of both confined and unconfined aquifers which require all flow parameters. Some studies such as those of Kresic (1997) and Ayvaz & Gurarslan (2018, 2019) determined transmissivity, storage coefficient and leakage factor for one-dimensional groundwater systems including confined and leaky confined aquifers by using pumping test data. Thomas *et al.* (2019) applied a genetic algorithm, differential evolution, cat swarm optimization and particle swarm optimization to estimate transmissivity, longitudinal dispersivity and transverse dispersivity (dispersivity is a parameter of groundwater contaminant transport model). Koltsida & Kallioras (2019) used a nonlinear regression (within UCODE_2014 code) for estimating hydraulic conductivity, specific storage and recharge parameters. However, it is also unknown whether they are applicable for two- or three-dimensional complex groundwater systems. In this regard, this study proposes an optimization model based on model calibration using the modified clonal selection algorithm to predict simultaneously hydraulic conductivity (for unconfined aquifer), transmissivity (for confined aquifer), storage coefficient (both aquifers) and leakance. In order to test the performance of the model, it was applied to two- and three-dimensional hypothetical groundwater systems including different zones and aquifers, respectively, under transient conditions. Also, a minimum number of observed heads was used and selected randomly in the study areas throughout the model calibration for estimating the parameters. With this study, the aim was to develop an optimization model determining all required parameters which are missing in the other studies for groundwater flow modelling using fewer observations and more grid cell numbers.

## MATERIALS AND METHODS

### Governing groundwater flow equation

*K*,

_{xx}*K*and

_{yy}*K*are hydraulic conductivity values along the

_{zz}*x*,

*y*and

*z*coordinate axes (L T

^{−1}),

*h*is the hydraulic head (L),

*W*is the volumetric flux per unit volume representing sources and/or sinks of water, with

*W*< 0 for flow out of the groundwater system and

*W*> 0 for flow into the system (T

^{−1}),

*S*is the specific storage of the porous material (L

^{−1}) and

*t*is time (T). Block-centred finite difference approach is used for groundwater flow modelling. For confined aquifers, all cells in the layer are assumed to have constant transmissivity (

*T*=

*K·b*, where

*T*is the transmissivity (L

^{2}T

^{−1}) and

*b*is the saturated thickness of aquifer (L)) throughout the simulation. On the other hand, transmissivities are not constant for unconfined aquifers because of changing water level. Thus, transmissivity (

*T*) is used for confined aquifers while hydraulic conductivity (

*K*) is used for unconfined aquifers as parameter (constant). Moreover, there is a leakance (

*L*) parameter in addition to

*K*,

*T*and

*S*parameters for modelling multiple groundwater layers. This parameter is the vertical hydraulic conductivity divided by the thickness from a layer to the layer below.

### Optimization model

*h*under transient conditions. The objective function is as follows:where is the

*i*-th predicted hydraulic head at

*t*-th time, is the

*i*-th observed hydraulic head at

*t*-th time,

*Nh*is the number of observed heads and

*Nt*is the number of times (days etc.). The modified clonal selection algorithm (modified Clonalg) developed by Eryiğit (2015), a class of artificial immune systems, was used to minimize the objective function (Equation (2)). Figure 1 shows the modified Clonalg for optimization problems, where

*Ab*is an antibody set (population) randomly constituted,

*f*is an antibody's antigenic affinity corresponding to the objective function for a given antibody,

*C*is a set of antibodies cloned, and

*C**is a set of matured (mutated) antibodies after the cloning process. In the modified Clonalg, new genes are generated for each clone with a certain probability depending on a given problem, called ‘probability rate’. The number of clones generated for all antibodies can be calculated as follows (De Castro & Von Zuben 2002):where

*N*is the total number of clones in

_{c}*C*,

*β*is a multiplying coefficient, and ‘round’ is the rounding operator for an integer.

*α*is the mutation rate for the clones exposed to the maturation process,

_{i}*ρ*is a decay coefficient and

*f*is the affinity value (objective function value) normalized over the interval [0,1].

_{i}*Ab*:where

*N*is the total number of antibodies in

_{Ab}*Ab*,

*x*is the gene of

_{ij}*Ab*, corresponding to a decision variable of the objective function,

_{i}*nd*is the number of genes (decision variables) of

*Ab*. In this study,

_{i}*x*corresponds to

_{ij}*K*,

*T*and

*S*and

*L*parameters.

*f*was minimized depending on the parameters (genes) constituted and mutated throughout processes of the modified Clonalg. Lower and upper limits for

*K*,

*T*and

*S*and

*L*were assigned as [0.01, 1, 0.0001, 0.0001] and [10, 1,000, 0.1, 0.1], respectively.

The optimization model was coded and linked with MODFLOW 2005 in MATLAB 2018b. The input files for MODFLOW were created by using GMS 10.3 groundwater simulator (Majumder & Bhattacharjya 2011; Borah & Bhattacharjya 2013; Majumder & Eldho 2015). The flow chart of the optimization model is given in Figure 2. The model was run ten and five times for the applications until the maximum iteration number is reached in each run. Random seed (random number generation) was applied while constituting the initial population set in each run. The analyses were performed by using a PC with Intel Core I7-8700 CPU 3.2 GHz.

### Applications

#### Two-dimensional scenario

This two-dimensional hypothetical groundwater system consists of one confined aquifer (one layer) including four zones with different transmissivities and storage coefficients. The aim is to determine simultaneously transmissivity and storage coefficient values of the four zones under transient conditions. Transmissivity and storage coefficient values of the four zones are *T*_{1} = 54 m^{2} day^{−1}, *S*_{1} = 0.0022; *T*_{2} = 65 m^{2} day^{−1}, *S*_{2} = 0.0037; *T*_{3} = 78 m^{2} day^{−1}, *S*_{3} = 0.0041; *T*_{4} = 89 m^{2} day^{−1}, *S*_{4} = 0.0032. The dimensions of the aquifer are 3 km × 3 km (area of 9 km^{2}). The grid has 30 rows and 30 columns (grid cell size is 100 m × 100 m, Δx = Δy = 100 m). The aquifer thickness is 100 m. Flow boundary conditions are assumed to be 146 m and 150 m constant heads at the west and east sides of the aquifer, respectively. North and south sides are no flow boundaries. Starting heads in the aquifer are 148 m. The total simulation time is 100 days with 1-day intervals (Δt = 1 day). There are two pumping wells (PWs) (Q_{1} = 4,750 m^{3} day^{−1} and Q_{2} = 5,500 m^{3} day^{−1}) and 15 observation wells (OWs) in the aquifer. The hydraulic heads are observed at each time interval during the simulation (15 × 100 = 1,500 observation data). Also, there is a recharge varying between 0.0004 and 0.0005 m day^{−1} based on time at the whole aquifer. Figure 3 shows a structure of the two-dimensional aquifer and locations of the wells. Figure 4 illustrates the simulation results of hydraulic heads after 100 days for the two-dimensional aquifer.

#### Three-dimensional scenario

This three-dimensional hypothetical groundwater system consists of one unconfined and one confined aquifer (two layers). The aim is to determine simultaneously hydraulic conductivity, transmissivity, storage coefficient and leakance values of the two aquifers under transient conditions. The parameter values of the unconfined and confined aquifers are *K*_{1} = 0.76 m day^{−1}, *S*_{1} = 0.0039, *L*_{1} = 0.041 day^{−1}; *T*_{2} = 59.3 m^{2} day^{−1}, *S*_{2} = 0.0027, respectively. The dimensions of each aquifer are 3 km × 3 km (area of 9 km^{2} for each layer). The grid of each aquifer has 30 rows and 30 columns (grid cell size is 100 m × 100 m, Δx = Δy = 100 m). Each aquifer thickness is 100 m (total thickness is 200 m). Flow boundary conditions are assumed to be 200 m constant heads at the west and east sides of the upper aquifer (unconfined), respectively. North and south sides are no flow boundaries for the two aquifers. Starting heads in the aquifers are 199 m. The total simulation time is 100 days with 1-day intervals (Δt = 1 day). There are three PWs (Q_{1} = 5,100 m^{3} day^{−1}, Q_{2} = 4,300 m^{3} day^{−1} and Q_{3} = 3,800 m^{3} day^{−1}) and 20 OWs in the aquifers. The hydraulic heads are observed at each time interval during the simulation (20 × 100 = 2,000 observation data for each layer). Also, there is a recharge varying between 0.0009 and 0.001 m day^{−1} based on time at the entire aquifer. Figure 5 shows locations of the wells in the area. Figure 6 illustrates the simulation results of hydraulic heads in the layers after 100 days for the three-dimensional groundwater system.

## RESULTS AND DISCUSSION

The results were obtained as shown in Tables 1–3 after ten and five runs of the model for the scenarios. In Table 1, parameters and performances of the modified Clonalg are given. Comparison results of mean predicted and actual groundwater parameters of two- and three-dimensional groundwater systems are given in Tables 2 and 3, respectively. The applications were performed by considering the minimum number of OWs because of the cost (the model was tested, and best results were obtained by using minimum numbers of 15 and 20 OWs for Scenarios 1 and 2, respectively). In order to increase the prediction possibility of groundwater parameters, more OWs can be used. But, more observations mean more cost. In this regard, the important point is to determine the groundwater parameters by using minimum observations versus maximum grid cell number of the study area. While increasing a grid cell number, the estimation of parameters becomes more difficult and needs more OWs. If any model can determine the parameters with minimum number of observations, this demonstrates the predictive power of the model. Also, the number of time intervals affects the parameter predictions under transient conditions. The predicted groundwater parameters should provide actual hydraulic head values for each interval. Therefore, more time intervals make model calibration difficult as well. Karahan & Ayvaz (2006) used 400 grid cells to estimate a transmissivity parameter in a two-dimensional confined aquifer (Δx = Δy = 50 m, 1 km^{2}). They obtained good results by using 40–80 head observations (10–20% of the hydraulic heads) under steady-state conditions. Ayvaz *et al.* (2007) used 576 grid cells to determine transmissivities in a two-dimensional confined aquifer (Δx = Δy = 250 m, 36 km^{2}) under steady-state conditions. They used 37 OWs to obtain good results for four different transmissivity zones in the aquifer. Karahan & Ayvaz (2008) used 441 grid cells to predict transmissivity and storativity parameters in a two-dimensional confined aquifer (Δx = Δy = 100 m, 4.41 km^{2}) under transient conditions. They used 30 OWs to determine the parameters for 60 time intervals (monthly). Swathi & Eldho (2018) used 169 nodes (instead of grid cells) and 18 OWs to estimate transmissivities in a two-dimensional confined aquifer (Δx = Δy = 500 m, 36 km^{2}) under transient conditions for 100 days (Δt = 1 day). Thomas *et al.* (2019) used 49 nodes (instead of grid cells) and 18 OWs to predict transmissivities in a two-dimensional confined aquifer (Δx = Δy = 1,000 m, 36 km^{2}) under transient conditions for 1,000 days (Δt = 1 day).

Scenario . | N
. _{Ab} | β
. | ρ
. | Probability rate . | Iteration number . | Min. f (m)
. | Max. f (m)
. | Mean f (m)*
. | Mean run time (hour)* . |
---|---|---|---|---|---|---|---|---|---|

1 (15 OWs) | 10 | 1 | 6 for T, 3 for S | 0.1 | 500 | 8.48 | 39.89 | 22.60 ± 9.95 | 20.06 ± 0.005 |

2 (20 OWs) | 15 | 1 | 5 for K, 6 for T, 3 for S, 3 for L | 0.1 | 2,000 | 15.19 | 20.99 | 18.94 ± 2.19 | 257.9 ± 0.061 |

Scenario . | N
. _{Ab} | β
. | ρ
. | Probability rate . | Iteration number . | Min. f (m)
. | Max. f (m)
. | Mean f (m)*
. | Mean run time (hour)* . |
---|---|---|---|---|---|---|---|---|---|

1 (15 OWs) | 10 | 1 | 6 for T, 3 for S | 0.1 | 500 | 8.48 | 39.89 | 22.60 ± 9.95 | 20.06 ± 0.005 |

2 (20 OWs) | 15 | 1 | 5 for K, 6 for T, 3 for S, 3 for L | 0.1 | 2,000 | 15.19 | 20.99 | 18.94 ± 2.19 | 257.9 ± 0.061 |

*N _{Ab}*, number of population

*Ab*;

*β*, multiplying coefficient for the cloning;

*ρ*, decay coefficient;

*K*, hydraulic conductivity;

*T*, transmissivity;

*S*, storage coefficient;

*L*, leakance; OW, observation well.

*Average of ten and five runs for the scenarios, respectively.

Zone . | Mean predicted T (m^{2} day^{−1})*
. | Actual T (m^{2} day^{−1})
. | Mean predicted S (unitless)*
. | Actual S (unitless)
. |
---|---|---|---|---|

1 | 54.44 ± 0.31 | 54 | 0.00228 ± 8.08 × 10^{−5} | 0.0022 |

2 | 65.58 ± 0.43 | 65 | 0.00366 ± 2.17 × 10^{−5} | 0.0037 |

3 | 77.67 ± 0.31 | 78 | 0.00413 ± 1.72 × 10^{−5} | 0.0041 |

4 | 88.39 ± 0.42 | 89 | 0.00324 ± 1.89 × 10^{−4} | 0.0032 |

Zone . | Mean predicted T (m^{2} day^{−1})*
. | Actual T (m^{2} day^{−1})
. | Mean predicted S (unitless)*
. | Actual S (unitless)
. |
---|---|---|---|---|

1 | 54.44 ± 0.31 | 54 | 0.00228 ± 8.08 × 10^{−5} | 0.0022 |

2 | 65.58 ± 0.43 | 65 | 0.00366 ± 2.17 × 10^{−5} | 0.0037 |

3 | 77.67 ± 0.31 | 78 | 0.00413 ± 1.72 × 10^{−5} | 0.0041 |

4 | 88.39 ± 0.42 | 89 | 0.00324 ± 1.89 × 10^{−4} | 0.0032 |

*Average of ten runs.

Layer . | Mean predicted K (m day^{−1})*
. | Actual K (m day^{−1})
. | Mean predicted T (m^{2} day^{−1})*
. | Actual T (m^{2} day^{−1})
. | Mean predicted S (unitless)*
. | Actual S (unitless)
. | Mean predicted L (day^{−1})*
. | Actual L (day^{−1})
. |
---|---|---|---|---|---|---|---|---|

Unconfined | 0.764 ± 0.012 | 0.76 | – | – | 0.00388 ± 2.76 × 10^{−5} | 0.0039 | 0.0440 ± 0.0037 | 0.041 |

Confined | – | – | 58.98 ± 1.04 | 59.3 | 0.00272 ± 2.64 × 10^{−5} | 0.0027 | – | – |

Layer . | Mean predicted K (m day^{−1})*
. | Actual K (m day^{−1})
. | Mean predicted T (m^{2} day^{−1})*
. | Actual T (m^{2} day^{−1})
. | Mean predicted S (unitless)*
. | Actual S (unitless)
. | Mean predicted L (day^{−1})*
. | Actual L (day^{−1})
. |
---|---|---|---|---|---|---|---|---|

Unconfined | 0.764 ± 0.012 | 0.76 | – | – | 0.00388 ± 2.76 × 10^{−5} | 0.0039 | 0.0440 ± 0.0037 | 0.041 |

Confined | – | – | 58.98 ± 1.04 | 59.3 | 0.00272 ± 2.64 × 10^{−5} | 0.0027 | – | – |

*****Average of five runs.

In this study, 900 grid cells and 15 OWs were used to estimate transmissivities and storage coefficients in a two-dimensional confined aquifer for 100 days (Δt = 1 day, 100 time intervals). Although lower OW number and higher grid cell number were used, the flow parameters could be predicted accurately (see Table 2). This demonstrates that the model needs fewer observations in a higher number of grid cells (than the other studies) with high interval numbers. Moreover, the studies mentioned above aimed to determine only one (transmissivity) or two flow parameters (transmissivity and storativity) for a two-dimensional confined aquifer. Therefore, it is unknown whether they are applicable for three-dimensional groundwater systems consisting of both confined and unconfined aquifers which require other flow parameters (hydraulic conductivity and leakance) in addition to transmissivity and storativity. Thus, they need to be tested for these systems. Similarly, Koltsida & Kallioras (2019) used 898 grid cells and 41 OWs for estimating hydraulic conductivity and specific storage in a two-dimensional aquifer (Δx = Δy = 400 m, 160 km^{2}) under transient conditions for 182 days (Δt = 1 day). The present study is using significantly fewer OW numbers (15 OWs) with more grid cells (900 cells) than Koltsida & Kallioras (2019) for a two-dimensional aquifer. Also, their method should be applied to three-dimensional groundwater systems to prove it is feasible and successful.

Noronha & Lee (2013) used 324 grid cells (for each layer) to estimate hydraulic conductivities (transmissivities divided by thickness) in a three-dimensional groundwater system consisting of two confined aquifers (Δx = Δy = 1,000 m, 324 km^{2}) under transient conditions. They used eight OWs to determine the parameters for 12 time intervals. Although they studied the three-dimensional groundwater system, they only predicted one parameter (hydraulic conductivity) for each aquifer. In this study, 20 OWs and 900 grid cells for each layer were used to estimate four different flow parameters (hydraulic conductivity for upper layer, transmissivity for lower layer, storage coefficient for each layer and leakance for upper layer) in a three-dimensional groundwater system consisting of unconfined and confined aquifers for 100 days (100 time intervals). Furthermore, they used lower grid cell and interval numbers (324 and 12, respectively) for only one parameter in spite of using a lower OW number.

Similarly, Rajanayaka & Kulasiri (2001) used 45 grid cells and 45 OWs for predicting hydraulic conductivity based on depth (0.4–2.2 m) in a three-dimensional artificial confined aquifer (Δx = Δy = 1 m, 44 m^{2}) under transient conditions for 432 h (Δt = 2–4 hour). Their results are not compatible between actual and estimated values (137 and 261.6 m/d in 2.2 m depth). The present study uses a significantly lower OW number (20 OWs) with more grid cells (900 cells) than Rajanayaka & Kulasiri (2001) for a three-dimensional groundwater system.

As is seen, the model can predict more flow parameters even though higher grid cell and lower OW numbers are used. Consequently, it is more advantageous than the other approaches in terms of parameter estimation.

The iteration number (IN) was effective for estimating the parameters. Better results might be obtained by using higher IN. However, it increases the processing time (run time). In each run, approximate parameter values were obtained (see standard deviations in Tables 2 and 3). This indicates that the model is stable.

## CONCLUSION

In this study, the optimization model was improved to estimate groundwater flow parameters required for two- and three-dimensional groundwater systems under transient conditions. While other approaches in the related literature determine only one or two flow parameters, which are inadequate for different groundwater systems including both confined and unconfined aquifers, the present model can simultaneously predict four different flow parameters for these systems. Also, the model needs lower observation numbers in more grid cells with high time interval numbers. Consequently, the developed model is able to estimate more groundwater flow parameters by using fewer observations in more difficult conditions than other studies. It can be said that the present model is more useful and economic.

Parameters of many real groundwater basins are unknown. In future studies, this model can be utilized for calibrations of groundwater models applied by real observations to estimate hydraulic heads or contaminant concentrations.

## ACKNOWLEDGEMENTS

This study was performed while the author was at Purdue University. The author thanks Prof. Dr Bernard Engel for his acceptance as a visiting scholar at Purdue University. Also, the author thanks The Scientific and Technological Research Council of Turkey (TUBITAK) for financial support (No. 1059B191800383).

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.