Contaminants in groundwater may enter through various sources which are required to be identified for informed decision-making regarding remediation. In early stages of contamination of an aquifer, the sources of contamination are generally unknown. To estimate the unknown release histories at potential sources, the governing equations of contaminant transport are solved backwards in time which renders the inverse problem as ill-posed. Furthermore, due to measurement or operational errors, observed breakthrough curves can be noisy which can induce uncertainty. Simulation-optimization (SO) techniques are generally used for solving the inverse model. Nevertheless, swarm population-based optimization algorithms may get trapped in local optima in the nonlinear search space related to the problem. In this study, SO model based on meshless radial point collocation method (RPCM) and multiverse optimizer (MVO) is proposed. MVO implements gradual transformation from global exploration to local exploitation phase which helps in better estimation of unknowns. The performance of RPCM-MVO-SO model is compared with two other SO models developed by combining RPCM with grey wolf optimizer and particle swarm optimization in two case studies. The proposed model performs better compared to the other two SO models considered which suggests the efficacy of the model for release history estimation in groundwater.

  • A novel RPCM-MVO-SO model is proposed for contaminant source identification.

  • The RPCM-MVO model performs better compared to RPCM-PSO and RPCM-GWO for the two case studies considered.

  • RPCM-MVO model estimated release intensities have low variations for noisy observations.

Groundwater is one of the important water resources which gets frequently contaminated due to deliberate or, unattended discharges from various sources located underground such as chemical releases, septic tanks, storage tanks and tailing ponds (Batu 2005). These sources may inject several reactive or, conservative pollutants which severely affect water quality. An aquifer may be perceived as contaminated when water quality obtained in any one of the observation wells becomes objectionable or, nonusable for various activities such as drinking, agricultural or industrial purposes. When the contamination is first detected in an aquifer, many a time the details about the source are not generally known due to the slow movement of groundwater which results in a large time difference between the event of contamination and its detection. Obtaining the details of the contaminant source, i.e., its location and release history are essential as it assists in designing the optimal remedial actions, prediction of future contaminant plume movement and demarcation of the contaminated aquifer.

The transport of contaminants in groundwater occurs through the presence of a hydraulic gradient which results in advection and dispersion processes (Bear & Cheng 2010). These processes are represented by partial differential equations (PDEs) which are forward in time. Nevertheless, the estimation of release histories in an already contaminated aquifer requires solving the PDEs backwards in time using concentration measurements at the observation wells. However, solving the PDEs backwards in time becomes prohibitively complex due to the ill-posedness of the inverse problem (Jha & Datta 2013) and the physical infeasibility of the process (Gorelick et al. 1983). Generally, simulation-optimization (SO) models are used for release history estimation. In these models, the release history of contaminants from possible source locations is estimated by developing a simulation model which represents the aquifer hydrogeology and the information from observation wells. Further, the optimization model utilizes the simulation model to simulate the response of the aquifer at the observation wells as a result of contaminant release at the source. To predict the history of contaminant release at the source locations, the difference between observed and simulated breakthrough curves at the observation wells is minimized in the SO framework.

Due to the inherent complexity of the groundwater flow and transport process, the relationship between release rates and breakthrough curves is nonlinear. Therefore, stochastic population-based optimization models are usually used in inverse problems due to the nonrequirement of gradients. Several optimization algorithms such as particle swarm optimization (PSO) (Singh et al. 2018), genetic algorithm (Han et al. 2020), cat swarm optimization (CSO) (Thomas et al. 2018), differential evolution (DE) (Gurarslan & Karahan 2015), grey wolf optimization (GWO) (Majumder & Eldho 2020), etc., have been successfully used in SO models for groundwater problems. Apart from that, algorithms such as Improved Runge–Kutta Optimization (IRKO) (Devi et al. 2022), genetic firefly (Gupta et al. 2021), Yin-Yang firefly algorithm (Wang et al. 2020) and Harris hawk optimization (Sammen et al. 2020) are used in solving optimization problems in various fields. These algorithms traverse the nonlinear search space and evolve the solution iteratively to arrive at the global optimum solution. The search procedure in an optimization problem can be divided into global exploration and local exploitation phases (Mirjalili et al. 2016). In the global exploration phase, the potential locations of global optima are identified whereas in the location exploitation phase, the solutions at these potential locations are further refined iteratively. Further many researchers used surrogate models with SO models to reduce the computational cost to a great extent (Borah & Bhattacharjya 2016; Majumder & Eldho 2020). Anshuman & Eldho (2022b) proposed a surrogate SO model based on Entity Aware Sequence-to-sequence learning using Long Short Term Memory (EAS-LSTM) for simultaneous aquifer parameter estimation and contaminant source identification.

In this study, the goal is to evaluate the efficiency of a recently proposed optimization model Multiverse optimization (MVO) (Mirjalili et al. 2016) with grey wolf optimizer (GWO) (Mirjalili et al. 2014) and PSO (Eberhart & Kennedy 1995) for release history estimation in groundwater using the SO technique. The SO model is developed by linking the meshfree RPCM simulation model with the optimization models. The MVO iteratively improves the estimates of the unknowns by reducing the error between observed and simulated breakthrough curves. The MVO algorithm uses a parameter named wormhole existence probability (WEP) which allows local changes in limited population size which increases with the increase in iteration number. This technique helps in a gradual transition from global exploration to the local exploitation phase. This study compares the performance of RPCM-MVO-SO with that of RPCM-PSO and RPCM-GWO models for two case studies. Additionally, using multiple erroneous breakthrough curves in case study 2, this study attempts to compare the uncertainty of estimated release histories by different SO models as a result of observational errors.

Groundwater flow and transport

The governing equation representing steady-state flow in a two-dimensional confined aquifer is given by the following equation (Bear & Cheng 2010).
(1)
Here, h is the hydraulic head; is transmissivity in x-direction and is transmissivity in y-direction; Qw represents injection/pumping well and is Kronecker delta function. The value of is 1 subject to the presence of injection/pumping well at , otherwise, it is taken as zero. The governing equation for the transport of a contaminant species in a two-dimensional aquifer domain is given by (Bear & Verruijt 1987).
(2)
where C represents the concentration of the contaminant, and are seepage velocities, λ is the rate of first-order decay and Qc is the release rate of the contaminant at the source location. The terms and denote dispersion coefficients which are obtained using seepage velocities and dispersivities (longitudinal and transverse, i.e., and , respectively) (Rastogi 2007). The seepage velocities in a coupled system are obtained using Darcy's law which utilizes the effective porosity (ηe), hydraulic conductivity (κ) and head variation in the problem domain obtained using the flow model (Bear & Cheng 2010). Along with the governing Equations (1) and (2), corresponding boundary conditions which are applicable at the boundaries are solved to obtain head and contaminant concentrations in the problem domain (Singh et al. 2016).

Radial point collocation method

A meshless RPCM, previously used to model groundwater (Singh et al. 2016; Mategaonkar et al. 2018; Anshuman et al. 2019; Thomas et al. 2019) is utilized in this study. RPCM disintegrates the problem domain into many intersecting support domains which surround each node present in the problem domain. Equation (3) computes the state variable ‘h’ based on the n nodes in the support domain of a point of interest (Liu & Gu 2005).
(3)
Here, refers to the multi-quadrics radial basis function (MQ-RBF). This function can be evaluated for a node i in the support domain of kth node as follows (Kansa 1990).
(4)
Here, is the Euclidean distance between ith node in the support domain of the point of interest xk. dc represents the average distance between all internal nodes in a problem domain. MQ-RBF shape parameters are denoted by the terms q and α. The vector consisting of the coefficients in Equation (3) can be calculated by implementing the MQ-RBF moment matrix at all nodes in the local support domain as follows (Liu & Gu 2005).
(5)
Here represents the approximated solution of the state variable of h. Substitution of the coefficient vector presented in Equation (5) into Equation (3) yields the solution for the state variable as follows (Liu & Gu 2005).
(6)
In Equation (6), ϕ represents the shape function which is used for spatial discretization of governing equations. For estimating the derivatives of the shape function, the corresponding derivatives of the MQ-RBF vector are used in Equation (6). The governing Equation (1) after the implementation of shape functions and derivatives can be represented as shown in Equation (7) (Anshuman & Eldho 2022a).
(7)
Similarly, the equation governing contaminant transport in Equation (2) can also be discretized in the following equation (Singh et al. 2016).
(8)

As shown in Equation (8), a semi-implicit scheme is used to represent the temporal discretization of concentration in Equation (2) (Anshuman et al. 2019). The temporal discretization parameter takes a value of 0 for fully implicit and 1 for fully explicit as per Equation (8). Here a value of 0.5 is assigned to , which is also referred to as the Crank–Nicolson scheme (Rastogi 2007), as it is known to provide accurate transient simulations (Singh et al. 2016).

Multiverse optimization

Multiverse optimization (MVO) (Mirjalili et al. 2016) is a lately proposed optimization algorithm which is inspired by a popular physics theory which argues for the existence of multiple universes (Barrow et al. 2004). Here, the candidate solutions of the decision variables are referred to as universes. As per the optimization algorithm, the universes interact by swapping elements through white holes, black holes and wormholes. A black hole can devour anything within its gravitational field and is observed throughout the universe. Contrastingly, white holes are theorized to eject enormous matter into space (Eardley 1974). It is theorized that wormholes connect different areas of space-time, as well as the multiverse (Morris & Thorne 1988). A white hole is assumed to exist in the best universe in MVO, while a black hole is assumed to exist in the remaining universes (Mirjalili et al. 2016). The tunnels, which are determined by a roulette wheel mechanism, transport objects from the best universe white to other universes. It is represented as follows.
(9)
The jth variable is denoted by and is selected through the roulette wheel mechanism on the kth universe. Accordingly, refers to the jth variable, while refers to the normalized inflation rate for the ith universe. The term represents a random number in [0, 1]. A proportional inflation rate is assigned to the universes in the multiverse according to the objective function value. After each iteration, the universe with the best inflation rate is considered the best solution. The interaction between universes in MVO is governed by five principles based on inflation rates which are detailed in Mirjalili et al. (2016). On account of these principles, the exploration and exploitation of the search space in the universe are done using white holes/black and wormholes, respectively. The wormholes which can be present in any universe facilitate local changes in the universes using the following equation.
(10)
where represents the jth variable of the best universe. The upper and lower bounds of the jth variable are represented by and . The terms , and denote random numbers in [0,1]. TDR represents the travelling distance rate which is dependent on iteration number t and maximum iteration number . It is given by Equation (11).
(11)
(12)

The term WEP denotes WEP which is also dependent on t and . As observed in Equations (11) and (12), the WEP increases linearly with t while TDR decreases nonlinearly with iteration number t. The maximum and minimum values of WEP are set as 1 and 0.2 (Mirjalili et al. 2016). This essentially means that at the first iteration, approximately 20% population undergoes local changes as per Equation (10) and this percentage increases to approximately 100% as the iteration number reaches to . It may be noted that, at all iterations, the changes are also implemented through Equation (9). In this manner, the WEP parameter helps in the gradual transition of the search process from global exploration to local exploitation.

Particle swarm optimization

PSO, originally developed by Eberhart & Kennedy (1995), is based on the principle of the movement of a bird flock. Here, each candidate solution is referred to as a particle which is assigned a position and velocity in the search space. The velocity and position of the particles are updated at each iteration based on their previous position and the best position of the swarm as shown in the following equations.
(13)
(14)

Here, and represent the velocity and position of the ith particle for iteration number t. b and G represent the best solution of individual particle and swarm, respectively. The terms ω, c1 and c2 are constants. The terms r1 and r2 are random numbers in [0,1]. In this study, the parameters of PSO, i.e., ω, c1 and c2 are set as 0.5, 1.49 and 1.49 based on the study reported by Thomas et al. (2019).

Grey wolf optimization

The GWO algorithm is another stochastic population-based optimization algorithm that is based on the hunting behaviour of the grey wolf (Mirjalili et al. 2014). The alpha, beta, delta and omega wolves in the pack are ranked according to descending hierarchical order. The major steps in the hunting of grey wolves are tracking, encircling and attacking the prey (Muro et al. 2011). In GWO, the candidate solutions are represented as wolves and the alpha wolf corresponds to the solution with the best value of fitness function. Mathematically, the encircling behaviour of grey wolves is represented by Equation (15). The positions of the wolves are modified after each iteration as per Equation (16).
(15)
(16)

The coefficient C depends on a randomly generated number in the range [0,1] whereas A depends on a random number in the range [0,1] as well as iteration number t. These are explained in detail in Mirjalili et al. (2014).

Objective function

In contaminant source identification, the contaminant release histories at different source locations are required to be estimated using the information regarding aquifer hydrogeological parameters and routinely collected concentration measurements at the observation wells. This is accomplished by utilizing the SO model, which minimizes the error between the simulated and observed concentrations. Based on the RPCM simulation model, these simulations are generated in this study. Mathematically, the objective function is represented in the following equations (Singh et al. 2018).
(17)
(18)
The term indicates the observed concentrations, whereas represents the simulated concentrations acquired by the RPCM model at the observation points. The term indicates the number of observation wells while indicates the length of the observations. Concentrations at the observation points, , depend on the release intensities at locations. Here is a matrix which has dimensions of where is the number of sources and is the number of stress periods. It may be noted that since the problems considered in this study are based on hypothetical study areas, the RPCM-simulated concentrations for a certain combination of source strengths are assumed to be observed concentrations. Using the SO model and observed concentrations, the source strengths are required to be estimated. Considering the possibility of measurement errors and other factors, in this study, a random normal deviates () with zero mean and 0.1 standard deviation is introduced to the normalized observed concentrations () as follows.
(19)
(20)

Here and are erroneous concentrations and normalized erroneous concentrations. The term represents the normalization function. Here maxmin normalization is used. The release history estimation using the SO model is presented in case studies 4.1 and 4.2.1. In these problems, the simulated concentrations at the observation wells are used as observed concentrations. The SO models are also tested for estimation of release histories using erroneous concentrations in case 4.2.2 considering the possibility of measurement error. Here, the simulated concentrations at the observation wells with random normal error, i.e., perturbed concentrations are used in the objective function as in Equation (17).

SO model

In the SO model for contaminant source identification, the optimization utilizes the simulation model to minimize the error (see Equation (17)) between the simulated/predicted and observed breakthrough curves at the observation wells. With increasing iteration numbers, the optimization models improve these predictions. The step-wise procedure for the RPCM-MVO-SO model is described given below:

  • 1.

    The aquifer parameters such as aquifer properties, boundary conditions and source/sink information are provided in this step. Using this information and MQ-RBF parameters, coupled flow and transport RPCM model is developed.

  • 2.

    The populations are initialized for the first iteration based on the bounds provided for each variable to be obtained.

  • 3.

    The parameters WEP and TDR are evaluated based on the iteration number. The objective function is defined as a minimization of error between the concentrations and the corresponding observations evaluated for each universe/population. It is also named the inflation rate in MVO. The universes are sorted based on their inflation rates and normalized inflation rates are evaluated. The universe with the highest value of inflation rate is considered the best universe. In this study, the minimization of error between observed and simulated concentration is considered the objective function. Hence, the inflation rate is considered as −1 times the objective function value (Mirjalili et al. 2016).

  • 4.

    As mentioned earlier, in MVO, there is only one universe with a white hole and all other universes possess black holes. For each variable in a universe, the white hole is selected based on a roulette wheel selection technique if a random number (r1) is less than the corresponding normalized inflation rate. The variable from that universe is replaced by the variable from the selected white hole containing the universe.

  • 5.

    Furthermore, based on the random number value r2 and WEP, the wormhole tunnel is established between the concerned universe and the white hole containing the universe. Subsequently, the variables in the universes are changed based on the TDR and randomly generated number r3.

  • 6.

    Steps 4–6 are repeated until the termination criteria are met. The termination criteria can be the maximum number of iterations or, the defined value of the objective function. The universe containing the best value of the inflation rate is considered the solution obtained from the RPCM-MVO-SO model.

The developed RPCM flow and transport model with verified with analytical and other numerical solutions as reported by Singh et al. (2016) and Anshuman & Eldho (2020). Further, the RPCM model is linked with MVO for release history estimation. The flowchart of the developed RPCM-MVO-SO model is explained in Figure 1. It may also be noted that the performance of RPCM-MVO is compared with that of RPCM-GWO and RPCM-PSO in two case studies. The development of these SO models is very similar to that of RPCM-MVO with corresponding changes in steps 3–5.
Figure 1

Flowchart of RPCM-MVO-SO model.

Figure 1

Flowchart of RPCM-MVO-SO model.

Close modal

Performance indicators

In this study, the performance of an SO model is dependent on the estimated release intensities at the source locations. These estimates are compared with actual release intensities using the following metrics (Singh et al. 2018).
(21)
(22)

Here denotes the release intensities of the contaminant at the source locations. The actual and predicted values of are represented using superscripts act and pred, respectively. The terms and denote the number of source locations and stress periods, respectively. The value of R2 varies between 0 and 1. For higher correlation, the value of R2 tends to be 1. Meanwhile, Root Mean Squared Error (RMSE) can take any value, with lower values indicating lower errors and a better match between actual and estimated contaminant release histories.

Case study 1

In this study, a homogenous isotropic confined aquifer of 30.5 m thickness from Gorelick et al. (1983) is considered (see Figure 2(a)). The north and south boundaries of the aquifer are subjected to Dirichlet boundary conditions with head values being 100 and 87.8 m resulting in the southward direction of groundwater flow. The no-flow boundary condition is applied to the east and west boundaries. The hydrogeological parameters influencing flow and transport phenomena are provided in Table 1. The aquifer is recharged at the rate of 0.0067 m/day at the pond location as shown in Figure 2(a). The potential contamination sources (S1, S2, S3, S4 and S5) as shown in Figure 2(b) inject total dissolved solutes (TDS) into the aquifer for an overall period of 4 years consisting of four stress periods of 1 year each. The disposal rate or, release history of TDS into the aquifer at these sources is presented in Table 2. These constitute 20 unknowns (5 possible source locations 4 stress periods) which are required to be estimated for the concerned case study using the time series at the observation wells (OB1, OB2 and OB3) shown in Figure 2(b). The flow and transport in the aquifer are simulated by the meshless RPCM model which consists of 632 nodes with a nodal spacing of 45.7 m (see Figure 2(b)). The MQ-RBF parameters such as q, α and radius of the local support domain are set as 0.98, 5 and 3, respectively. The values of the parameters are adapted from previous studies which demonstrate different contaminant transport problems in groundwater (Singh et al. 2016; Anshuman & Eldho 2020). The sensitivity analysis of these parameters is also presented in detail in Anshuman & Eldho (2020) and Anshuman et al. (2019). The RPCM flow model simulated head distribution is presented in Figure 3(a). As seen in the figure, the hydraulic head contours in the vicinity of the pond location are altered due to incoming groundwater flux from the pond. Figure 3(b) illustrates the RPCM-simulated concentrations at the observation points. Figure 4 presents the concentration contours at different timesteps. It may be noted that, although contaminant is introduced in the sources during the first 4 years, the coupled RPCM flow and transport simulation model is run for 15 years considering the response time in the observation wells. The time step size in the simulation model is 36.5 days. The developed simulation model is coupled with optimization algorithms, i.e., PSO, GWO and MVO as discussed in section 2.
Table 1

Hydrogeological parameters and their values for case studies

Hydrogeological parameterCase study 1Case study 2
Hydraulic conductivity Kx 8.4 m/day 5 m/day 
Ky 8.4 m/day 2.5 m/day 
Porosity 0.2 0.3 
Dispersivity α7.6 m 50 m 
α2.3 m 5 m 
Hydrogeological parameterCase study 1Case study 2
Hydraulic conductivity Kx 8.4 m/day 5 m/day 
Ky 8.4 m/day 2.5 m/day 
Porosity 0.2 0.3 
Dispersivity α7.6 m 50 m 
α2.3 m 5 m 
Table 2

Contamination release history at the source locations for case studies

Stress periodTDS released at source
Case study 1 (g/s)
Case study 2
S1S2S3S4S5S1S2S3S4
SP1 51 15 47 27 12 23 20 
SP2 44 35 20 
SP3 11 12 12 17 25 
SP4 24 16 21 
Stress periodTDS released at source
Case study 1 (g/s)
Case study 2
S1S2S3S4S5S1S2S3S4
SP1 51 15 47 27 12 23 20 
SP2 44 35 20 
SP3 11 12 12 17 25 
SP4 24 16 21 
Figure 2

(a) Schematic representation of the study area, (b) Discretization of the problem domain for case study 1.

Figure 2

(a) Schematic representation of the study area, (b) Discretization of the problem domain for case study 1.

Close modal
Figure 3

(a) Head distribution in the aquifer, (b) breakthrough curves at the observation wells simulated by RPCM model for case study 1.

Figure 3

(a) Head distribution in the aquifer, (b) breakthrough curves at the observation wells simulated by RPCM model for case study 1.

Close modal
Figure 4

Concentration contours at different timesteps for case study 1.

Figure 4

Concentration contours at different timesteps for case study 1.

Close modal
All SO models considered are run for 1,000 iterations with a population size of 25. The convergence curves are obtained as presented in Figure 5(a). It is observed that the RPCM-MVO-SO model converges slower compared to other SO models. This is attributed to the fact that the MVO algorithm restricts local-level changes in the candidate solutions using the WEP parameter during the initial stages of the optimization. As the number of iterations increases, the WEP increases linearly between WEPmin and WEPmax as shown in Equation (12). On the other hand, PSO and GWO used in this study do not enforce such restrictions which result in faster convergence as shown in Figure 5(a). However, from the results obtained in Figure 5, it is observed that RPCM-MVO exhibits a better match with actual injection rates compared to RPCM-PSO and RPCM-GWO models for estimation of release histories at different sources. Similarly, the error indicators presented in Table 3 show that the RPCM-MVO-SO model shows higher accuracy compared to RPCM-PSO and RPCM-GWO with R2 and RMSE values of 0.982 and 3.139, respectively. The better performance of RPCM-MVO can be attributed to a smooth transition of global exploration to local exploitation of the search space using the WEP parameter.
Table 3

Performance metrics of the RPCM-PSO, RPCM-GWO and RPCM-MVO-SO models for case studies

Inverse modelCase study 1
Case study 2
R2RMSE (ppm)R2RMSE (ppm)
RPCM-MVO 0.982 3.139 0.992 1.054 
RPCM-GWO 0.858 9.214 0.991 1.079 
RPCM-PSO 0.912 8.109 0.967 2.299 
Inverse modelCase study 1
Case study 2
R2RMSE (ppm)R2RMSE (ppm)
RPCM-MVO 0.982 3.139 0.992 1.054 
RPCM-GWO 0.858 9.214 0.991 1.079 
RPCM-PSO 0.912 8.109 0.967 2.299 
Figure 5

(a) Comparison of convergence curves by SO model and estimated contaminant release by RPCM-PSO, RPCM-GWO and RPCM-MVO in comparison with actual release at source locations, (b) S1, (c) S2, (d) S3, (e) S4 and (f) S5 for case study 1.

Figure 5

(a) Comparison of convergence curves by SO model and estimated contaminant release by RPCM-PSO, RPCM-GWO and RPCM-MVO in comparison with actual release at source locations, (b) S1, (c) S2, (d) S3, (e) S4 and (f) S5 for case study 1.

Close modal

Case study 2

Here a hypothetical homogenous aquifer of dimension 400 m300 m is considered (see Figure 6(a)). The north, east and south boundaries are subjected to Dirichlet conditions, i.e., they are constant head boundaries whereas the north boundary is a no-flow boundary. The heads in the Dirichlet boundaries vary linearly from one end to another as shown in Figure 6(a). The hydrogeological parameters influencing flow and transport phenomena are provided in Table 1. The potential contamination sources (S1, S2, S3 and S4) as shown in Figure 6(b) inject TDS into the aquifer for an overall period of 2 years consisting of four stress periods of 6 months each. The TDS release history of the sources is presented in Table 2. These constitute 16 unknowns that are required to be estimated for the concerned case study. The flow and transport in the aquifer are simulated by the meshless RPCM model which consists of 4131 nodes of a nodal distance of 10 m (see Figure 6(b)). The MQ-RBF shape parameters are fixed the same as in case study 1. There are five observation wells, namely OB1, OB2, OB3, OB4 and OB5. The RPCM flow model-simulated head distribution is presented in Figure 6(c). Using the head values, the velocities and dispersion coefficients are computed. The RPCM transport model is run for a simulation period of 2 years with a timestep size of 5 days. The concentrations at the observation points are shown in Figure 6(d). The concentration contours at different timesteps are presented in Figure 7.
Figure 6

(a) Schematic representation of the study area, (b) discretization of the problem domain, (c) head distribution in the aquifer, (d) breakthrough curves at the observation wells simulated by RPCM model for case study 2.

Figure 6

(a) Schematic representation of the study area, (b) discretization of the problem domain, (c) head distribution in the aquifer, (d) breakthrough curves at the observation wells simulated by RPCM model for case study 2.

Close modal
Figure 7

Concentration contours at different timesteps for case study 2.

Figure 7

Concentration contours at different timesteps for case study 2.

Close modal

Observed breakthrough curves without error

The coupled RPCM flow and transport model is embedded with MVO, GWO and PSO for constituting SO models for the estimation of release histories at sources observed in Figure 6 which correspond to the concentration profiles at the observation wells. All SO models are run for 1,000 iterations with a population size of 25 which indicates a total of 25,000 evaluations of the coupled RPCM simulation model. The time taken by the RPCM-MVO, RPCM-GWO and RPCM-PSO is 21.1, 20.9 and 20.8 h, respectively. The results obtained by the SO models are presented in Figure 8 and Table 3. It is observed that the solution provided by the RPCM-MVO-SO model shows the closest resemblance to the actual release intensities among all SO models. The RPCM-GWO model estimates are observed to differ by high values for source location S1. All SO models are observed to provide reasonably accurate results for source location S2. The RPCM-PSO model estimated release rates show the highest difference at source location S3. On the other hand, RPCM-MVO is observed to be the most accurate both in terms of R2 and RMSE.
Figure 8

Comparison of estimated contamination releases for PSO, GWO and MVO algorithms with actual releases at source locations (a) S1, (b) S2, (c) S3 and (d) S4 for case study 2.

Figure 8

Comparison of estimated contamination releases for PSO, GWO and MVO algorithms with actual releases at source locations (a) S1, (b) S2, (c) S3 and (d) S4 for case study 2.

Close modal

Erroneous breakthrough curves

The field measurements of concentrations at the observation wells are prone to human or operational errors (Yeh et al. 2007). Here, breakthrough curves at the observation wells are introduced with random normal samples as explained in section 3.1. The perturbed breakthrough curves are presented in Figure 9(a). Here five breakthrough curves are used. For each breakthrough curve, the SO models are run to estimate the release intensities at the source locations. The population size and maximum iteration number of optimization models are kept the same as in section 4.2.1. The convergence curves obtained for the SO model simulations are presented in Figure 9(d). In the figure, the range of objective function values is estimated using where and signify the mean and standard deviation of the corresponding convergence curves of the SO models, respectively. It is observed in Figure 9(d) that the RPCM-MVO model converges to a lower value of objective function compared to RPCM-GWO and RPCM-PSO. Also, the range of estimates of the best values of the objective function for different breakthrough curves for RPCM-MVO, RPCM-GWO and RPCM-PSO are 0.021, 0.238 and 0.068. Hence, this denotes that RPCM-MVO exhibits lower uncertainty in the estimation of release history compared to other SO models. The estimated release histories shown in Figure 9 at different stress periods also denote that RPCM-MVO model estimates have lower deviations compared to other SO models. Also, RPCM-MVO model provides reasonably close estimates of the release intensities to the actual values for different sources and stress periods. In contrast, the range of estimated release rates is higher for RPCM-GWO and RPCM-PSO for source location 3 and 2, respectively, as shown in Figure 9. This suggests that the RPCM-MVO-SO model simulations are more reliable compared to the other SO models in consideration.
Figure 9

(a) Breakthrough curves at the observation wells after incorporating random noise, (d) convergence curves of SO models and comparison of estimated release intensities for RPCM-PSO, RPCM-GWO and RPCM-MVO algorithms with actual injection rates at source locations, (b) S1, (c) S2, (e) S3 and (f) S4 for case study 2.

Figure 9

(a) Breakthrough curves at the observation wells after incorporating random noise, (d) convergence curves of SO models and comparison of estimated release intensities for RPCM-PSO, RPCM-GWO and RPCM-MVO algorithms with actual injection rates at source locations, (b) S1, (c) S2, (e) S3 and (f) S4 for case study 2.

Close modal

There are a variety of sources that can introduce contaminants into groundwater, and these sources must be identified so that informed decisions can be made concerning groundwater remediation and prediction of plume movement. This study proposes an RPCM-MVO-SO model for the estimation of release histories utilizing contaminant observations and compares its efficacy to that of the RPCM-GWO and RPCM-PSO models in two case studies. In the first case study, the number of estimable variables is 20. Based on the performance indicators, the RPCM-MVO model produces significantly better results compared to that of RPCM-GWO and RPCM-PSO. Interestingly, the convergence curve of RPCM-MVO is observed to converge slower compared to the other SO models which are due to initial restrictions on local exploitation imposed using the WEP parameter in MVO. Due to this, the algorithm focuses on global exploration and gradually focuses on local exploitation with an increase in iteration number. On the other hand, the PSO and GWO algorithms tend to converge faster as they apply changes in the entire population using Equations (14) and (16), respectively.

In case study 2, the SO models are utilized for the estimation of 16 variables (4 sources × 4 stress periods). Similar to case study 1, the RPCM-MVO model is observed to perform better in terms of R2 and RMSE. Interestingly, the RPCM-PSO model performs well in terms of R2. However, due to imprecise estimations for source location 3, the RMSE computed is significantly higher than that of RPCM-MVO. Furthermore, considering measurement and operational errors in the measurement at observation points, the SO models are run using five noisy breakthrough curves. The performances indicate that the RPCM-MVO estimated release histories are close to the actual values for all source locations and stress periods. Additionally, the convergence curves and boxplots of the SO models in Figure 9 indicate low variability in RPCM-MVO estimates. In contrast, the RPCM-GWO and RPCM-PSO estimates show higher variability due to noisy breakthrough curves, especially in source locations 2 and 3. It may be noted that the breakthrough curves used for the evaluation of the objective function use fewer data points compared to the timesteps used for simulation. In case study 1, the measurement interval for case study 1 is 365 days whereas it is 30 days for case study 2 for simulation periods of 15 and 2 years, respectively. This indicates that the SO model can perform well in limited data availability. Based on the case studies, MVO exploits the multidimensional search space for the inverse problems better compared to GWO and PSO. The time required for all SO models is nearly equal as observed in the case studies. However, in terms of convergence, the MVO model is slower compared to GWO and PSO due to the restricted change to the population in the initial phase of optimization in MVO. Hence, for low-dimensional problems with relatively smoother objective functions, the GWO and PSO may provide accurate solutions in fewer iterations. Due to the use of the simulation model in conjunction with the optimization model, the PDEs related to governing equations are solved at every timestep increasing the computational cost. The computational efficiency of the inverse modelling can be improved significantly by using surrogate models (Borah & Bhattacharjya 2016). However, machine learning-based surrogate models also require sample generation for training which can be computationally expensive for a large study area.

In this study, an SO inverse model RPCM-MVO is used for contaminant source identification in groundwater. The performance of the proposed SO model is compared with RPCM-GWO and RPCM-PSO models for two case studies. The RPCM-MVO is observed to provide closer estimates of the actual contaminant release rates compared to other SO models considered. The RMSE values obtained are 3.14 ppm and 1.05 ppm for case studies 1 and 2, respectively. Furthermore, for inverse modelling using noisy observations, the RPCM-MVO model provides lower variation in the estimates of the release intensities at the source locations. The better performance of MVO can be attributed to the gradual transition between global exploration and local exploitation phases which also results in slower convergences as observed in the convergence curves in the case studies. However, this technique prevents MVO solutions from getting trapped in the local minima. On the other hand, GWO and PSO solutions converge faster initially. Nonetheless, they show little improvement with an increase in iterations after this initial phase. It may be noted that the use of simulation models in SO technique results in an increased computational cost, making it computationally expensive for large aquifers and long simulation periods. Using surrogate models derived from machine learning techniques in this case can be computationally effective. Moreover, the use of surrogate models can also assist in more comprehensive estimation of uncertainty in release history estimation due to observational errors by using a greater number of breakthrough curve realizations. Due to the presence of complexity in the flow and transport process in the groundwater, the search space for the optimization problem is highly nonlinear and multidimensional. From the case studies considered in this study, it can be concluded that the MVO algorithm traverses the nonlinear search space more efficiently compared to GWO and PSO indicating higher efficacy of the RPCM-MVO model for contaminant source identification.

The first author of the paper is grateful to Indian Institute of Technology Bombay for providing funding for the research. The authors are thankful to the anonymous reviewers for their constructive suggestions which helped in improving the quality of the paper.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Anshuman
A.
&
Eldho
T. I.
2020
Meshfree radial point collocation-based coupled flow and transport model for simulation of multispecies linked first order reactions
.
Journal of Contaminant Hydrology
229
.
https://doi.org/10.1016/j.jconhyd.2019.103582
.
Anshuman
A.
&
Eldho
T. I.
2022a
Meshless local strong form-based method for transport simulation of nonlinearly adsorbing solutes in a highly heterogeneous confined aquifer
.
Water Resources Management
36
(
4
),
1443
1461
.
https://doi.org/10.1007/S11269-022-03092-6
.
Anshuman
A.
&
Eldho
T. I.
2022b
Entity aware sequence to sequence learning using LSTMs for estimation of groundwater contamination release history and transport parameters
.
Journal of Hydrology
608
,
127662
.
https://doi.org/10.1016/J.JHYDROL.2022.127662
.
Anshuman
A.
,
Eldho
T. I.
&
Singh
L. G.
2019
Simulation of reactive transport in porous media using radial point collocation method
.
Engineering Analysis with Boundary Elements
104
.
https://doi.org/10.1016/j.enganabound.2019.03.016
.
Barrow
J. D.
,
Davies
P. C.
&
Harper Jr
C. L.
(eds).
2004
Science and Ultimate Reality: Quantum Theory, Cosmology, and Complexity. Cambridge University Press, Cambridge. ISBN 0-521-83113-X
.
Batu
V.
2005
Applied Flow and Solute Transport Modeling in Aquifers
.
https://doi.org/10.1201/9781420037470
.
Bear
J.
&
Cheng
A. H. D.
2010
Modeling Groundwater Flow and Contaminant Transport
.
https://doi.org/10.1007/978-1-4020-6682-5
.
Bear
J.
&
Verruijt
A.
1987
Modeling Groundwater Flow and Pollution
.
https://doi.org/10.1139/t88-098
.
Borah
T.
&
Bhattacharjya
R. K.
2016
Development of an improved pollution source identification model using numerical and ANN based simulation-optimization model
.
Water Resources Management
30
(
14
).
https://doi.org/10.1007/s11269-016-1476-6
.
Devi
R. M.
,
Premkumar
M.
,
Jangir
P.
,
Elkotb
M. A.
,
Elavarasan
R. M.
&
Nisar
K. S.
2022
IRKO: an improved Runge-Kutta optimization algorithm for global optimization problems
.
Computers, Materials and Continua
70
(
3
).
https://doi.org/10.32604/cmc.2022.020847
.
Eardley
D. M.
1974
Death of white holes in the early universe
.
Physical Review Letters
33
(
7
).
https://doi.org/10.1103/PhysRevLett.33.442
.
Eberhart
R.
&
Kennedy
J.
1995
Particle swarm optimization
. In
Proceedings of the IEEE International Conference on Neural Networks
, Vol.
4
, pp.
1942
1948
.
Gorelick
S. M.
,
Evans
B.
&
Remson
I.
1983
Identifying sources of groundwater pollution: an optimization approach
.
Water Resources Research
19
(
3
).
https://doi.org/10.1029/WR019i003p00779
.
Gupta
D.
,
Dhar
A. R.
&
Roy
S. S.
2021
A partition cum unification based genetic firefly algorithm for single objective optimization
.
Sadhana – Academy Proceedings in Engineering Sciences
46
(
3
).
https://doi.org/10.1007/s12046-021-01641-0
.
Gurarslan
G.
&
Karahan
H.
2015
Solving inverse problems of groundwater-pollution-source identification using a differential evolution algorithm
.
Hydrogeology Journal
23
(
6
),
1109
1119
.
https://doi.org/10.1007/S10040-015-1256-Z
.
Han
K.
,
Zuo
R.
,
Ni
P.
,
Xue
Z.
,
Xu
D.
,
Wang
J.
&
Zhang
D.
2020
Application of a genetic algorithm to groundwater pollution source identification
.
Journal of Hydrology
589
.
https://doi.org/10.1016/j.jhydrol.2020.125343
.
Jha
M.
&
Datta
B.
2013
Three-dimensional groundwater contamination source identification using adaptive simulated annealing
.
Journal of Hydrologic Engineering
18
(
3
).
https://doi.org/10.1061/(asce)he.1943-5584.0000624
.
Kansa
E. J.
1990
Multiquadrics – a scattered data approximation scheme with applications to computational fluid-dynamics-I surface approximations and partial derivative estimates
.
Computers and Mathematics with Applications
19
(
8–9
).
https://doi.org/10.1016/0898-1221(90)90270-T
.
Liu
G. R.
&
Gu
Y. T
.
2005
.
An Introduction to Meshfree Methods and Their Programming
.
https://doi.org/10.1007/1-4020-3468-7
Majumder
P.
&
Eldho
T. I.
2020
Artificial neural network and grey wolf optimizer based surrogate simulation-optimization model for groundwater remediation
.
Water Resources Management
34
(
2
).
https://doi.org/10.1007/s11269-019-02472-9
.
Mategaonkar
M.
,
Kamat
S.
&
Eldho
T. I.
2018
In-situ bioremediation of groundwater using a meshfree model and particle swarm optimization
.
Journal of Hydroinformatics
20
(
4
).
https://doi.org/10.2166/hydro.2018.110
.
Mirjalili
S.
,
Mohammad
S.
&
Lewis
A.
2014
Grey wolf optimizer
.
Advances in Engineering Software
69
,
46
61
.
Mirjalili
S.
,
Seyed
M. M.
&
Hatamlou
A.
2016
Multi-verse optimizer: a nature-inspired algorithm for global optimization
.
Neural Computing and Applications
.
https://doi.org/10.1007/s00521-015-1870-7
.
Morris
M. S.
&
Thorne
K. S.
1988
Wormholes in spacetime and their use for interstellar travel: a tool for teaching general relativity
.
American Journal of Physics
56
(
5
).
https://doi.org/10.1119/1.15620
.
Muro
C.
,
Escobedo
R.
,
Spector
L.
&
Coppinger
R. P.
2011
Wolf-pack (Canis lupus) hunting strategies emerge from simple rules in computational simulations
.
Behavioural Processes
88
(
3
).
https://doi.org/10.1016/j.beproc.2011.09.006
.
Rastogi
A. K.
2007
Numerical Groundwater Hydrology
.
Penram International Publication
,
India
.
Sammen
S. S.
,
Ghorbani
M. A.
,
Malik
A.
,
Tikhamarine
Y.
,
AmirRahmani
M.
,
Al-Ansari
N.
&
Chau
K. W.
2020
Enhanced artificial neural network with Harris hawks optimization for predicting scour depth downstream of ski-jump spillway
.
Applied Sciences (Switzerland)
10
(
15
).
https://doi.org/10.3390/app10155160
.
Singh
L. G.
,
Eldho
T. I.
&
Kumar
A. V.
2016
Coupled groundwater flow and contaminant transport simulation in a confined aquifer using meshfree radial point collocation method (RPCM)
.
Engineering Analysis with Boundary Elements
66
.
https://doi.org/10.1016/j.enganabound.2016.02.001
.
Singh
L. G.
,
Eldho
T. I.
&
Kumar
A. V.
2018
Identification of groundwater contamination sources using Meshfree RPCM simulation and particle swarm optimization
.
Water Resources Management
32
(
4
),
1517
1538
.
https://doi.org/10.1007/S11269-017-1885-1
.
Thomas
A.
,
Majumdar
P.
,
Eldho
T. I.
&
Rastogi
A. K.
2018
Simulation optimization model for aquifer parameter estimation using coupled meshfree point collocation method and cat swarm optimization
.
Engineering Analysis with Boundary Elements
91
.
https://doi.org/10.1016/j.enganabound.2018.03.004
.
Thomas
A.
,
Eldho
T. I.
,
Rastogi
A. K.
&
Majumder
P.
2019
A comparative study in aquifer parameter estimation using MFree point collocation method with evolutionary algorithms
.
Journal of Hydroinformatics
21
(
3
),
455
473
.
Wang
W. C.
,
Xu
L.
,
Chau
K. W.
&
Xu
D. M.
2020
Yin-Yang firefly algorithm based on dimensionally Cauchy mutation
.
Expert Systems with Applications
150
.
https://doi.org/10.1016/j.eswa.2020.113216
.
Yeh
H.
,
Chang
T.
&
Lin
Y.
2007
Groundwater contaminant source identification by a hybrid heuristic approach
.
Water Resources Research
43
(
9
),
https://doi.org/10.1029/2005WR004731
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).