## Abstract

Groundwater monitoring plays a significant role in groundwater management. This study presents an optimization method for designing groundwater-level monitoring networks. The proposed design method was applied to the Eshtehard aquifer in central Iran. Three scenarios were considered to optimize the locations of the observation wells: (1) designing new monitoring networks, (2) redesigning existing monitoring networks, and (3) expanding existing monitoring networks. The kriging method was used to estimate groundwater levels at unmonitored locations when preparing the design data base. The optimization of the groundwater monitoring network had two objectives: (1) minimizing the root mean square error and (2) minimizing the number of wells. The non-dominated sorting genetic algorithm (NSGA-II) was applied to optimize the network. Inverse distance weighting interpolation was used within the NSGA-II to estimate the groundwater levels while optimizing network design. Results of the study indicate that the proposed method successfully designs groundwater monitoring networks that are both accurate and cost-effective.

## INTRODUCTION

Groundwater monitoring plays an important role in collecting the data needed to assess changes in groundwater resources and their contamination. Loáiciga *et al.* (1992) classified groundwater monitoring design into hydrogeological approaches and statistical approaches. Hydrogeological approaches use hydrogeologic information and expert judgment to assess and design groundwater monitoring networks. Statistical approaches acknowledge the uncertainty associated with human knowledge of the underlying hydrogeology and treat aquifer properties as random or spatially correlated variables. Statistical approaches include simulation-based approaches, variance-based approaches, and probability-based approaches.

McKinney & Loucks (1992) developed a network design algorithm to improve the reliability of groundwater simulation model predictions. Hudak (2006) reported a Monte Carlo (MC) physics-based simulation approach to locate detection wells in aquifers beneath landfills. Yang *et al.* (2008) used the kriging standard deviation as a criterion for determining the density of groundwater-level monitoring networks. Dokou & Pinder (2009) created an optimal search strategy that identified contamination sources with the fewest water quality samples. The strategy included a MC stochastic groundwater flow and transport model, a predetermined set of potential source locations, and a Kalman filter which updated the simulated contaminant concentration field using contaminant concentration data. Mogheir *et al.* (2003) described the spatial structure of groundwater quality variables by means of trans-information and correlation models. The trans-information model was superior to the correlation model in describing the spatial variability (structure) of groundwater quality variables. Wu *et al.* (2006) compared the MC simple genetic algorithm (MCSGA) and noisy genetic algorithm (NGA) in the design of cost-effective sampling networks considering uncertainties in the hydraulic conductivity. Both methods combined the genetic algorithm (GA) with a numerical flow and transport simulator and a global plume estimator to optimize the sampling network for contaminant plume monitoring. Dhar & Patil (2012) designed groundwater quality monitoring networks under epistemic uncertainty considering spatiotemporal pollutant concentrations as fuzzy numbers. Their methodology incorporated fuzzy ordinary kriging (FOK) within the decision model formulation for spatial estimation of contaminant concentration values. The non-dominated sorting genetic algorithm (NSGA-II) was used to solve the design model. Their results showed the applicability of the methodology for network design under epistemic uncertainty.

In recent decades, several groundwater monitoring studies have turned to machine learning approaches to add or remove monitoring stations. Data mining is an analytic process that explores data in search of consistent patterns and/or systematic relationships between variables (Fallah-Mehdipour *et al.* 2013a, 2013b, 2014; Orouji *et al.* 2013, 2014; Akbari-Alashti *et al.* 2014; Aboutalebi *et al.* 2015, 2016a, 2016b, 2016c; Soleimani *et al.* 2016a, 2016b; Bozorg-Haddad *et al.* 2017). Asefa *et al.* (2004) described a methodology based on support vector machines to design monitoring networks. Khader & McKee (2014) applied a relevance vector machine (RVM) for groundwater monitoring network design. Their RVM method employed a MC simulation process to capture the uncertainties in recharge, hydraulic conductivity, and nitrate reaction processes.

This paper presents an optimization method to design reliable and efficient groundwater monitoring networks. The objectives of the method are to reduce costs and to increase the accuracy of groundwater-level monitoring.

## GROUNDWATER-LEVEL ESTIMATION

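The groundwater level at an unmonitored location is estimated as a weighted sum of the groundwater levels observed at monitoring locations:

$$\hat{Z}_{i,t} = \sum_{j=1}^{n} \lambda_{j}\, Z_{j,t} \qquad (1)$$

in which $\hat{Z}_{i,t}$ = estimated groundwater level at location *i* and time *t*, $Z_{j,t}$ = observed groundwater level at location *j* and time *t*, *n* = number of observation locations, and $\lambda_{j}$ = weight applied to $Z_{j,t}$, which differs according to the interpolation method used. Spatial interpolation methods belong to two main categories: (1) deterministic (e.g., inverse distance weighting (IDW), splines, radial basis functions, etc.) and (2) geostatistical (e.g., kriging, hierarchical models, copula, etc.). The former methods use a mathematical function to calculate values at unknown locations and provide deterministic estimates. The latter methods provide probabilistic estimates of a variable and its variance of estimation at points where measurements do not exist (Loáiciga *et al.* 2010).

In IDW the weights decrease with the distance between the estimation location and each observation location:

$$\lambda_{j} = \frac{d_{ij}^{-p}}{\sum_{k=1}^{n} d_{ik}^{-p}} \qquad (2)$$

in which $d_{ij}$ = distance between locations *i* and *j*, and *p* = power parameter, commonly set to *p* = 2 (Bivand *et al.* 2008).

Kriging derives the weights from the spatial structure of the data, which is described by the empirical semivariogram:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_{i}) - Z(x_{i} + h) \right]^{2} \qquad (3)$$

in which $\gamma(h)$ = semivariogram for the separation distance *h*, $N(h)$ = the total number of the variable pairs separated by a distance *h*, and $Z(x_{i})$ = the value of the variable at location $x_{i}$. A parametric or nonparametric model, such as a Gaussian, spherical, exponential, or linear model, is fitted to the empirical semivariogram.

The kriging weights are obtained by solving the following system of equations:

$$\sum_{j=1}^{n} \lambda_{j}\, \gamma(x_{k}, x_{j}) + \mu = \gamma(x_{k}, x_{i}), \quad k = 1, \ldots, n \qquad (4)$$

$$\sum_{j=1}^{n} \lambda_{j} = 1 \qquad (5)$$

in which *μ* = Lagrange multiplier and $\gamma(x_{k}, x_{j})$ = semivariogram between $x_{k}$ and $x_{j}$.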
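The two estimation ingredients described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: it assumes the standard IDW form (weights proportional to distance raised to the power −*p*, normalized to sum to one) and the classical empirical semivariogram (half the mean squared difference of pairs falling in a distance bin). The function names, argument names, and the binning by `bin_edges` are assumptions introduced here.

```python
import numpy as np


def idw_estimate(obs_xy, obs_values, target_xy, power=2.0):
    """Estimate values at target points as inverse-distance-weighted
    averages of the observed values (power = 2 by default)."""
    obs_xy = np.asarray(obs_xy, dtype=float)
    obs_values = np.asarray(obs_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Pairwise distances, shape (n_targets, n_obs)
    d = np.linalg.norm(target_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    estimates = np.empty(len(target_xy))
    for i, row in enumerate(d):
        exact = row == 0.0
        if exact.any():
            # Target coincides with an observation point: use its value
            estimates[i] = obs_values[exact][0]
        else:
            w = row ** -power
            estimates[i] = np.sum(w * obs_values) / np.sum(w)
    return estimates


def empirical_semivariogram(xy, values, bin_edges):
    """Half the mean squared difference of all pairs whose separation
    distance falls in each bin [bin_edges[b], bin_edges[b + 1])."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    first, second = np.triu_indices(len(values), k=1)  # each pair once
    d = np.linalg.norm(xy[first] - xy[second], axis=1)
    sq_diff = (values[first] - values[second]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)  # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        in_bin = (d >= bin_edges[b]) & (d < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = sq_diff[in_bin].mean() / 2.0
    return gamma
```

A semivariogram model (spherical, exponential, etc.) would then be fitted to the binned values before solving for the kriging weights; that step is omitted here.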

## THE NSGA-II

Multi-objective evolutionary algorithms (MOEAs) are divided into two categories (Deb *et al.* 2002): (1) non-elitist MOEAs and (2) elitist MOEAs. Elitism in evolutionary algorithms is the carrying over of the superior solutions of one generation into the next generation. One of the most common elitist MOEAs is the NSGA-II, an improved version of the NSGA (Srinivas & Deb 1994).

The NSGA-II is an effective algorithm for solving multi-objective problems. It is a stochastic search algorithm and a variant of the GA.

The NSGA-II applies selection, crossover, and mutation operators. The crossover operator recombines members of a population to create new members, and the mutation operator randomly perturbs individual members. The NSGA-II begins with the random generation of a population, which is subsequently sorted based on non-domination into several Pareto fronts. The first Pareto front is a completely non-dominated set: none of its members is dominated by any other member of the population. The members of the second front are dominated only by members of the first front, and so on for subsequent fronts.
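The sorting of a population into Pareto fronts can be sketched as follows. This is a minimal illustration of fast non-dominated sorting in the spirit of Deb *et al.* (2002), assuming that all objectives are minimized; the function names and the representation of solutions as tuples of objective values are assumptions made here.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized):
    a is no worse in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(objectives):
    """Return a list of fronts; each front is a list of indices into
    objectives. The first front holds the non-dominated solutions."""
    n = len(objectives)
    dominated_set = [[] for _ in range(n)]  # solutions that i dominates
    n_dominators = [0] * n                  # how many solutions dominate i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objectives[i], objectives[j]):
                dominated_set[i].append(j)
            elif dominates(objectives[j], objectives[i]):
                n_dominators[i] += 1
    fronts = []
    current = [i for i in range(n) if n_dominators[i] == 0]
    while current:
        fronts.append(current)
        next_front = []
        for i in current:
            for j in dominated_set[i]:
                # Remove the influence of the front just extracted
                n_dominators[j] -= 1
                if n_dominators[j] == 0:
                    next_front.append(j)
        current = next_front
    return fronts
```

For example, with objective pairs of the form (number of wells, *RMSE*), a network with 10 wells and an *RMSE* of 1.2 dominates one with 15 wells and an *RMSE* of 1.3, so the latter falls into a later front.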

Each member of a front is assigned a fitness value or rank (Deb *et al.* 2002). For example, members of the first front have rank 1, members of the second front have rank 2, and so on. The crowding distance is a measure used by the NSGA-II that quantifies the distance of a member to its neighbors in the same front; favoring members with larger crowding distances preserves diversity in the population.
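A minimal sketch of the crowding distance computation for a single front is given below. It follows the usual formulation of Deb *et al.* (2002): for each objective the front is sorted, the two boundary members receive an infinite distance, and each interior member accumulates the normalized gap between its two neighbors. The function name and data layout are assumptions made for illustration.

```python
def crowding_distance(front_objectives):
    """Crowding distance of each member of one front.

    front_objectives is a list of objective tuples; the result is a list
    of distances in the same order."""
    n = len(front_objectives)
    if n == 0:
        return []
    distance = [0.0] * n
    n_objectives = len(front_objectives[0])
    for m in range(n_objectives):
        # Indices of the members sorted by the m-th objective
        order = sorted(range(n), key=lambda i: front_objectives[i][m])
        lowest = front_objectives[order[0]][m]
        highest = front_objectives[order[-1]][m]
        # Boundary members are always kept
        distance[order[0]] = float("inf")
        distance[order[-1]] = float("inf")
        if highest == lowest:
            continue  # all values equal: this objective adds nothing
        for k in range(1, n - 1):
            gap = (front_objectives[order[k + 1]][m]
                   - front_objectives[order[k - 1]][m])
            distance[order[k]] += gap / (highest - lowest)
    return distance
```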

Once the rank and the crowding distance of all members of all fronts are determined, parents are selected from the population using binary tournament selection based on rank and crowding distance: the individual with the smaller rank is selected and, when ranks are equal, the individual with the greater crowding distance is selected (Figure 1). Crossover and mutation operators are applied to the selected parents to generate offspring. The current population and its offspring are then combined and sorted again, and the best individuals are selected for the next generation according to their rank and crowding distance. Figure 2 shows a flowchart of the NSGA-II.
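The parent selection step can be sketched as a binary tournament with the crowded comparison. In this illustration the lower rank wins, and the crowding distance is used only to break ties in rank, which is the usual reading of the NSGA-II selection rule. The function names and the use of index lists are assumptions made here.

```python
import random


def crowded_better(i, j, rank, crowding):
    """True if individual i is preferred to j: lower rank first,
    then larger crowding distance when ranks are equal."""
    if rank[i] != rank[j]:
        return rank[i] < rank[j]
    return crowding[i] > crowding[j]


def binary_tournament(rank, crowding, n_parents, rng=None):
    """Select n_parents indices by repeated two-way tournaments."""
    rng = rng or random.Random()
    population = range(len(rank))
    parents = []
    for _ in range(n_parents):
        i, j = rng.sample(population, 2)  # two distinct competitors
        parents.append(i if crowded_better(i, j, rank, crowding) else j)
    return parents
```

Because each tournament draws two distinct individuals, an individual whose rank is worse than that of every other member can never be selected as a parent.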

## CASE STUDY

The study area is the aquifer of the Eshtehard plain, which covers an area equal to 235 km^{2} in north central Iran (see a map in Figure 3). The plain is surrounded by the Qazvin plain and the Tehran-Karaj plain to the west and east, respectively. The study area features an arid climate, thus groundwater is the main source of water for residents. The uncontrolled exploitation of groundwater in the plain has produced a declining groundwater table. There is a need for an optimal monitoring network to characterize aquifer conditions as groundwater is mined to meet several water-supply functions. The existing groundwater monitoring network in the region has 18 wells. Historical data recorded at the existing observation wells over a four-year period (2009–2013) were chosen for the study.

## METHODOLOGY

First, a groundwater data base was prepared prior to designing a monitoring network and choosing optimal locations of its observation wells within the study area. Next, an optimal groundwater monitoring network was designed.

### Data base

Available records of groundwater level were used for estimating the groundwater level over the entire Eshtehard aquifer using kriging.

### Optimizing model

The two objective functions of the groundwater monitoring network design method are:

1. minimizing the number of observation wells in the area; and
2. minimizing the root mean square error (*RMSE*) between observational and estimated values at all locations within the aquifer:

$$\text{Minimize } f_{1} = \text{number of observation wells} \qquad (6)$$

$$\text{Minimize } f_{2} = RMSE = \sqrt{\frac{1}{N\,T} \sum_{t=1}^{T} \sum_{i=1}^{N} \left( Z_{i,t} - \hat{Z}_{i,t} \right)^{2}} \qquad (7)$$

in which *f*_{1} = the first objective function (number of wells), *f*_{2} = the second objective function (*RMSE*), $Z_{i,t}$ = observational groundwater level of point *i* at the end of period *t*, $\hat{Z}_{i,t}$ = groundwater-level estimation of point *i* at the end of period *t*, *T* = number of total time periods, and *N* = number of groundwater-level estimation points.

Estimated values of groundwater level were obtained for all the potential sampling locations using IDW in the optimization phase. The *RMSE* of the network is determined based on Equation (7). The estimated values with IDW and the observational values of the potential points are used to construct the second objective function (minimization of the *RMSE*). Figure 4 shows a flowchart of the methodology.
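The evaluation of one candidate network can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the kriged data base is represented as an array `grid_levels` of shape (*N*, *T*) holding the groundwater level at each potential point and time period, a candidate network is a list of indices into those points, and the *RMSE* is averaged over all *N* points and *T* periods. The names `idw`, `evaluate_network`, `grid_xy`, and `grid_levels` are hypothetical.

```python
import numpy as np


def idw(obs_xy, obs_levels, target_xy, power=2.0):
    """IDW estimates at the target points for every time period.

    obs_levels has shape (n_obs, T); the result has shape (n_targets, T)."""
    d = np.linalg.norm(target_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # avoid division by zero at coincident points
    w = d ** -power
    w /= w.sum(axis=1, keepdims=True)  # weights of each target sum to one
    return w @ obs_levels


def evaluate_network(selected, grid_xy, grid_levels):
    """Return (f1, f2) = (number of wells, RMSE) for a candidate network.

    selected lists the indices of the potential points used as wells."""
    grid_xy = np.asarray(grid_xy, dtype=float)
    grid_levels = np.asarray(grid_levels, dtype=float)
    selected = np.asarray(selected, dtype=int)
    estimated = idw(grid_xy[selected], grid_levels[selected], grid_xy)
    # Root mean square error over all N points and T periods
    rmse = np.sqrt(np.mean((grid_levels - estimated) ** 2))
    return len(selected), float(rmse)
```

Each call returns the pair of objective values that the NSGA-II would use to rank the candidate against the rest of the population.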

### Optimization algorithm

The NSGA-II was selected for solving the multi-objective optimization of the groundwater-level monitoring network. Three scenarios of monitoring network design were considered:

#### First scenario

Designing a new groundwater monitoring network in the study region. The aim of this scenario was to find the best locations of observation wells over the entire aquifer. The network was designed regardless of the positions of the existing observation wells, and the best well locations and the corresponding network *RMSE* were obtained for each number of observation wells.

#### Second scenario

Selecting the best set of wells among existing observation wells in the study area. This scenario retains a subset of the existing monitoring locations to achieve the optimization objectives. For instance, of the 18 observation wells that exist in the research area, five might be chosen based on the joint *RMSE*. This scenario therefore redesigns an existing monitoring network by keeping active a subset of the existing monitoring wells.

#### Third scenario

Adding extra wells to the observation wells of the network in the research area. This scenario expands an existing monitoring network considering the joint *RMSE*.
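One way to picture how the three scenarios differ is through the set of wells that is fixed and the set of locations the optimizer may choose from. The sketch below is a hypothetical encoding, not the one reported in the study: potential locations are indexed from 0, existing wells are given as a list of those indices, and a candidate solution is a 0/1 mask over the free locations. The names `scenario_search_space` and `decode` are assumptions.

```python
def scenario_search_space(scenario, n_points, existing):
    """Return (fixed, candidates) for a scenario.

    fixed: wells always present in the network.
    candidates: locations the optimizer may switch on or off."""
    existing_set = set(existing)
    all_points = list(range(n_points))
    if scenario == 1:
        # New design: any location may be chosen, nothing is fixed
        return [], all_points
    if scenario == 2:
        # Redesign: choose a subset of the existing wells only
        return [], sorted(existing_set)
    if scenario == 3:
        # Expansion: keep all existing wells, add new locations
        free = [p for p in all_points if p not in existing_set]
        return sorted(existing_set), free
    raise ValueError("scenario must be 1, 2, or 3")


def decode(mask, fixed, candidates):
    """Turn a 0/1 mask over candidates into the list of well indices."""
    chosen = [c for bit, c in zip(mask, candidates) if bit]
    return sorted(fixed + chosen)
```

Under this encoding the same objective evaluation and the same NSGA-II operators can be reused for all three scenarios; only the search space changes.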

## RESULTS AND DISCUSSION

A data base of groundwater levels in the Eshtehard aquifer, central Iran, was established using kriging. Monthly groundwater-level data from 18 observation wells over a four-year period were used for interpolation purposes. The NSGA-II algorithm was applied for solving the monitoring network optimization problem. The number of iterations, population size, crossover probability, and mutation probability were set equal to 1,000, 50, 0.7, and 0.2, respectively, in the NSGA-II. A maximum of 30 wells was designated for the optimized monitoring network. The summary of results is as follows.

### A – Scenario 1

The NSGA-II found the optimal groundwater monitoring locations in the aquifer, and the corresponding results were represented as a Pareto front (see Figure 5). Due to the randomized nature of the algorithm's solution, three separate runs were performed to find representative results. The calculated Pareto fronts shown in Figure 5 are very close for the three runs, which, in turn, indicates the reliable convergence of the NSGA-II over several runs.

Figure 6 shows a sample of 15 optimized groundwater monitoring wells obtained in the first run of the first scenario. It is clear in Figure 6 that the wells have a suitable distribution so that they cover all areas of the aquifer.

### B – Scenario 2

Under this scenario the monitoring network was redesigned from the existing observation wells in the Eshtehard area. Results of this scenario are presented in Figure 7 in the form of three Pareto fronts obtained in three separate runs. Notice the closeness of the Pareto fronts, which means the NSGA-II algorithm converged to almost the same solution in all runs.

A sample of optimized groundwater monitoring locations is presented in Figure 8.

### C – Scenario 3

Under this scenario the 18 available observation wells within the aquifer area were kept and the monitoring network was expanded with a few more wells. The results of the optimization are depicted as Pareto fronts in Figure 9, in which the closeness of the three Pareto fronts is evident. Given the randomized algorithmic search, the similarity of the results of the three runs implies adequate convergence to a near globally optimal solution.

A sample of the optimized groundwater monitoring network under Scenario 3 is portrayed in Figure 10. The chosen monitoring wells are suitably distributed throughout the aquifer.

Results of the first runs of the three scenarios are presented in Figure 11 as Pareto fronts. The flexible nature of the first scenario, compared to the other two scenarios, is clear in Figure 11. It is evident that optimizing the locations of the monitoring wells regardless of existing observation wells in the study region leads to efficient collection of accurate groundwater levels. Our results also show that taking into account the existing observation wells provides useful information for optimizing the entire monitoring network.

Under the second scenario, the main constraint is the fact that existing monitoring wells are used in the optimized network. This caused a rise of the *RMSE* in the network associated with Scenario 2 compared with the *RMSE* associated with Scenario 1. Choosing 10 wells, for instance, the *RMSE* values of the first and second scenarios were 1.180 and 1.489, respectively. Pareto fronts for the first and second scenarios are shown in Figure 11, where it is evident that the fully optimized monitoring network under Scenario 1 (where all monitoring locations are optimized) exhibits a lower *RMSE* than the Pareto fronts for Scenario 2.

It is seen in Figure 11 that the *RMSE* values of the first and third scenarios for 25 wells equaled 0.972 and 1.070, respectively. The proximity of the two fronts implies that when the number of wells in the monitoring network exceeds 18, the wells are numerous enough to cover all areas of the aquifer. This is affirmed by the closeness of the first and third fronts in Figure 11 for well numbers equal to 28, 29, and 30.
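As a rough illustration of the gaps between the fronts (a worked calculation from the *RMSE* values reported above only, not an additional result), the relative differences with respect to Scenario 1 are:

$$\frac{1.489 - 1.180}{1.180} \approx 0.262 \quad (\text{Scenario 2 vs. Scenario 1, 10 wells})$$

$$\frac{1.070 - 0.972}{0.972} \approx 0.101 \quad (\text{Scenario 3 vs. Scenario 1, 25 wells})$$

That is, restricting the search to the existing wells raises the *RMSE* by roughly 26% at 10 wells, whereas the gap between Scenarios 1 and 3 is about 10% at 25 wells.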

## CONCLUDING REMARKS

Proper characterization of groundwater conditions relies on well-designed groundwater monitoring networks. This work introduced multi-objective optimization of groundwater monitoring network design with the evolutionary algorithm NSGA-II under three scenarios. The methodology was applied to the Eshtehard plain aquifer, Iran. The first part of this paper created time series of groundwater-level values over the entire area of the plain with kriging, and stored the time series in a comprehensive data base. The second part of this paper optimized the network of observation wells employing the NSGA-II, with groundwater levels estimated by IDW during the optimization. The objectives of the optimization problem were the minimization of the number of monitoring wells and the minimization of the *RMSE*.

Three scenarios for groundwater network design were considered. Under the first scenario, all the monitoring wells' locations were optimized over the entire aquifer with the constraint that no more than 30 wells would be deployed in the monitoring network. The second scenario involved redesigning an existing monitoring network by choosing the best subset of the 18 existing wells that satisfies the multi-objective criteria. The third scenario involved expanding the monitoring network beyond the existing 18 monitoring wells. The key finding of this study is that the most efficient and accurate groundwater monitoring network design is achieved under Scenario 1, when all the monitoring locations are optimized regardless of existing monitoring wells.

The redesign results for Scenario 2 indicate that it is necessary to remove several wells among the existing wells of the network to achieve an optimized groundwater monitoring network according to the objectives of the design approach. Scenario 3 provides the solution of how to expand an existing groundwater monitoring network by choosing the optimal new wells.

This study relied on time series of groundwater levels to design groundwater monitoring networks. Previous studies on monitoring networks have not applied time series of groundwater levels because of complexities that arise in handling temporal variability within the spatial analysis. The optimization algorithm employed in this paper considered the entire area of the aquifer in search of the best monitoring sites. In brief, this study presented a groundwater monitoring network design method that could search all the aquifer area to find the best monitoring sites employing long-term groundwater-level data.
