Abstract
Groundwater planning problems are often multiobjective. Because of conflicting objectives and the non-linearity of the variables involved, several feasible solutions may have to be evolved rather than a single optimal solution. In this study, a simulation model built on the Analytic Element Method (AEM) and an optimization model built on the Non-dominated Sorting Genetic Algorithm (NSGA-II) were coupled and applied to a part of the Dore river catchment, France. Maximization of discharge, minimization of pumping cost and minimization of piping cost are the three objectives considered. A total of 2105 non-dominated groundwater planning strategies were generated. K-Means cluster analysis was employed to classify the strategies, and clustering was performed for 3 to 25 clusters. A cluster validation technique, namely the Davies–Bouldin (DB) index, was employed to find the optimal number of clusters of groundwater strategies, which was found to be 20. Multicriterion Decision-Making (MCDM) techniques, namely VIKOR and TOPSIS, were employed to rank the 20 representative strategies. Both techniques preferred representative strategy A5 (piping cost, pumping cost and discharge of 880,000 Euro, 679,000 Euro and 1,263.1 m3/s, respectively). The sensitivity analysis of parameter v in VIKOR showed that the ranking pattern changed for various values of v; however, the first position remained unchanged.
HIGHLIGHTS
The simulation model built on an Analytic Element Method (AEM) and the optimization model built on a Non-dominated Sorting Genetic Algorithm (NSGA-II) were linked and applied for groundwater planning.
The Davies–Bouldin (DB) index was employed to find the optimal number of clusters of groundwater strategies.
Multicriterion Decision-Making (MCDM) techniques, VIKOR and TOPSIS, were implemented to rank the 20 representative strategies.
INTRODUCTION
Growing water demand necessitates the development of cost-effective wells and transport systems, that is, of groundwater pumping and transport systems. A major part of the total energy consumption in a water distribution network is for extracting groundwater. This consumption becomes more dominant with unmanaged extraction and the corresponding depletion of groundwater resources (Ahlfeld & Laverty 2015). Therefore, optimal wells and transport systems, designed in a multiobjective framework, are expected to achieve sustainable economic benefits and utilization of groundwater resources and to avoid the reconstruction of pumping wells as groundwater resources decline. The use of a simulation–optimization model has been found suitable for different groundwater management problems (Onwunalu & Durlofsky 2010; Gaur et al. 2011a, 2011b; Ayvaz & Elci 2013).
Singh et al. (2008) presented an Interactive Multiobjective Genetic Algorithm for a hypothetical aquifer. It was concluded that expert interaction improved the prediction capability of the calibrated model. Raju et al. (2012) applied Multiobjective Differential Evolution, K-Means Cluster Analysis, Davies–Bouldin and Dunn's indices for an irrigation planning problem in India. Tabari & Soltani (2013) compared the sequential genetic algorithm and the Non-dominated Sorting Genetic Algorithm (NSGA-II) to the Karaj-Iran aquifer and found that NSGA-II performed better. Zekri et al. (2015) applied NSGA-II, MODFLOW and Compromise Programming to a coastal aquifer, in Oman. Studies revealed that the evolved annual groundwater abstraction resulted in considerable economic benefits. Farhadi et al. (2016) employed Nash modelling for the Daryan Aquifer, Fars Province, Iran. Fulfilling irrigation water demand, a reduction in the groundwater drawdown and an increase in equity of water allocation were the considered objectives. A MODFLOW simulation model, an Artificial Neural Network and an NSGA-II-based optimization model were employed. Sreekanth et al. (2016) presented a stochastic multiobjective formulation to assess the maximum volume of water which can be injected into a hypothetical confined aquifer. Well locations and injection rates were taken as decision variables. Reliability analysis was performed using Monte Carlo Analysis. Results concluded that stochastic-based multiobjective optimization was very efficient in identifying robust groundwater management strategies.
Rezaei et al. (2017) applied fuzzy Multiobjective Particle Swarm Optimization (f-MOPSO) and two other MOPSO algorithms to improve water resource planning in the plains of Najafabad, Iran, and found f-MOPSO to be superior. Alizadeh et al. (2017) applied a methodology based on NSGA-II, MODFLOW, social choice rules, the M5P model tree and fallback bargaining procedures to the Darian aquifer, Iran, to determine optimal groundwater policies. The evolved model resulted in a mean increase in groundwater level throughout the aquifer. Makaremi et al. (2017) coupled NSGA-II with EPANET for pipe network distribution.
Mortazavi-Naeini et al. (2015) proposed efficient multiobjective ant colony optimization-I (EMOACO-I) and compared it with NSGA-II, SMPSO and ɛMOEA for the Canberra system and the Sydney system, Australia. None of the optimization methods was found superior. Sadeghi-Tabas et al. (2017) employed the Cuckoo optimization algorithm (COA)-AMALGAM and MODFLOW for an arid groundwater system in Iran. Three minimization objectives were considered to generate Pareto solutions, and it was concluded that the model evolved in their study provided a sustainable groundwater management alternative. Cisty et al. (2017) proposed NSGA-II in a two-phase approach for the case study of the Balerma irrigation network. Bozorg-Haddad et al. (2017) compared the multiobjective developed firefly algorithm (MODFA), the multiobjective firefly algorithm (MOFA) and the multiobjective genetic algorithm (MOGA) for the Karoun basin, Iran. It was concluded that the MODFA performed better than the other two. Mirzaie-Nodoushan et al. (2017) applied NSGA-II for the cost-effective design of groundwater monitoring networks for the Eshtehard aquifer, in central Iran. Johns et al. (2020) applied a multiobjective adaptive locally constrained genetic algorithm (MOALCO-GA), NSGA-II and a multiobjective pipe smoothing genetic algorithm (MOPS-GA) to water distribution network systems. They used a hypervolume indicator for the comparison of algorithms. The other two algorithms were found to perform relatively well compared with NSGA-II.
Limited utilization of Multicriterion Decision-Making (MCDM) has been noted in groundwater management. Hajkowicz & Collins (2007), Geng & Wardlaw (2013) and Rousta & Araghinejad (2015) made extensive studies on the role of MCDM in water resources. Duckstein et al. (1994) discussed the prioritization of groundwater management strategies. Rahman et al. (2013) applied spatial multicriteria decision analysis (SMCDA) to find the most suitable sites for applying a Managed Aquifer Recharge technique. Six locations were identified and ranked on the basis of different decision criteria. The study recommended a combined use of mathematical modelling and SMCDA. An et al. (2016) proposed MCDM for the sustainability assessment of different remediation techniques for groundwater resources. Eight criteria were used and alternative technologies were evaluated. ELimination Et Choix Traduisant la REalité (ELECTRE) and the Analytic Hierarchy Process (AHP) were used to rank the alternatives. Bajić et al. (2017) applied Fuzzy AHP to choose the optimal groundwater management system for an open-cast mine and found the results satisfactory. Wang et al. (2017) applied the weighted product and simple weighted addition methods, the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) and Cooperative Game Theory for selecting an appropriate remediation technology for the Chengli oil field.
Roozbahani et al. (2018) developed a groundwater level prediction mechanism using Bayesian Networks and the MCDM techniques Simple Additive Weighting (SAW), PROMETHEE-II and TOPSIS. They applied these methodologies to the Birjand aquifer in Iran. The Borda method was used for aggregating the final ranking of the scenarios, and the methodologies were found satisfactory. The studies reviewed above used various multiobjective optimization algorithms for the generation of the Pareto front and the ranking of groundwater strategies. However, no study was reported on filtering or classifying a large set of generated non-dominated solutions to a manageable size when the generated Pareto front is too large to rank directly, which is one of the focuses and novelties of the present study. Clustering and validation indices that minimize duplication of data and facilitate optimum clusters play a major role in this regard. Therefore, the present study focuses on the application of clustering to find representative solutions from the generated Pareto front that can be used in multicriterion analysis. In view of these developments, the present study has the following objectives:
To explore NSGA-II as the multiobjective optimization algorithm for generating non-dominated groundwater strategies to a part of the Dore river catchment, France.
To develop a methodology for clustering and ranking of non-dominated strategies that can be used as the basis for groundwater policy studies.
Objectives are proposed to be achieved by employing the following four-stage procedure:
1. NSGA-II-based multiobjective optimization to generate non-dominated groundwater strategies based on objective functions: maximize discharge, minimize cost of piping and minimize cost of pumping.
2. K-Means-based cluster analysis for grouping the strategies generated in stage 1 into clusters.
3. The DB cluster validation index for finding the optimum cluster size.
4. Application of two MCDM methods, TOPSIS and VIKOR, to select the best groundwater strategy.
To our knowledge, no earlier study is reported in groundwater planning incorporating approaches similar to the ones suggested in the present work.
The next section presents the study area and model development; Section ‘Methodologies employed’ describes the employed methodologies. Section ‘Stage 1: model application’ presents results related to stage 1 (multiobjective modelling), Section ‘Stages 2 and 3: cluster analysis and validation’ is related to stages 2 and 3 (clustering aspects) and Section ‘Stage 4: application of VIKOR and TOPSIS’ is related to stage 4 (ranking aspects). The last section presents the conclusions of the study.
STUDY AREA AND MODEL DEVELOPMENT
The study area consists of part of the Dore river catchment, France, and is about 30 km2 in extent (Figure 1). The annual average rainfall is 780 mm. Different hydrological features along with other relevant information are taken from the maps developed by BRGM (National Service for Geological Survey). The aquifer is unconfined and most of the catchment consists of fluvial Quaternary sediments underlain by clay and marls. The impervious sub-stratum is a mixture of sand and clay. The Quaternary alluvium comprises sand, gravel and pebbles with silt. The twelve available piezometric measurements were considered for model calibration and validation. Eleven gauging sites were chosen to measure river stage data for the development of a conceptual model.
The proposed groundwater model is intended to establish new wells and a transport system by locating the optimal position of wells and the corresponding discharge rates. The wells were characterized by well elements and river sites were represented by 39 head line-sink elements. The conceptual model is based on the Analytic Element Method (AEM). The aquifer is considered to be homogeneous as it is located between the two rivers with alluvium property. The AEM is particularly well suited to simulation–optimization problems and was found efficient (Gaur et al. 2011a, 2011b). One advantage of the AEM is fast convergence, as the head can be computed directly at the desired location without computing the head for a whole grid system. The present study utilizes these benefits in applying simulation–optimization techniques to a groundwater management problem.
The AEM was calibrated using the 12 available piezometric measurements (Gaur et al. 2011a, 2011b): the values of hydraulic conductivity were changed methodically and the model output was compared with the observed values at a 95% confidence level. A constant hydraulic conductivity of 0.02 m3/day is considered owing to the homogeneous nature of the aquifer. A detailed description of AEM is available in Strack (1989). The optimization model calls the simulation model to generate the state variables, which greatly increases the computational burden. Therefore, the AEM is coupled with Particle Swarm Optimization (PSO) so that potentials from all the known elements are stored in a single matrix; for each run, the model then does not recalculate the whole equation. This coupling of the models reduces the convergence time (Gaur et al. 2011a, 2011b). The development of wells and transport systems is multiobjective, and the description of objectives is as follows:
METHODOLOGIES EMPLOYED
Although a number of advanced multiobjective optimization techniques are available, we propose to explore NSGA-II because of its elitism preservation, crowding-distance mechanism, fast non-dominated sorting mechanism, robustness and convergence in less computational time (Deb 2001; Deb et al. 2002). NSGA-II offers better flexibility, and this advantage makes the simulation–optimization process faster and more efficient (Chaturvedi et al. 2020). No algorithm has been found to be best for all situations and locations, because the superiority of one algorithm over another depends on the case study and the relevant parameters. The parameters in NSGA-II are the size of the population, the number of generations, crossover fraction, creation function, selection function, mutation rates, Pareto front population fraction and function tolerance. The working structure of NSGA-II is presented in Figure 2; it mainly includes initialization of variables, evaluation of objective functions, crossover, mutation and related operators (Johns et al. 2020).
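The non-dominated sorting at the core of NSGA-II rests on a pairwise dominance test. The following is a minimal Python sketch of that test only, not of the full algorithm (no crowding distance, selection or variation operators) and not of the MATLAB implementation used in the study. The function names and the four strategies are hypothetical; discharge is negated so that all three objectives are minimized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a dominates b when every objective is to be minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical strategies: (piping cost in Euro, pumping cost in Euro, discharge).
strategies = [
    (880_000.0, 679_000.0, 1263.1),
    (900_000.0, 700_000.0, 1200.0),   # costlier and lower discharge than the first
    (500_000.0, 400_000.0, 1000.0),
    (1_000_000.0, 760_000.0, 1337.0),
]

# Negate discharge so that maximizing it becomes a minimization objective.
minimized = [(pipe, pump, -q) for pipe, pump, q in strategies]
front = non_dominated(minimized)
print(front)  # [0, 2, 3]
```

The second strategy is dropped because the first is cheaper on both costs and delivers more discharge; the remaining three trade cost against discharge and therefore all stay on the front.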
K-Means is used for the grouping of groundwater planning strategies generated with NSGA-II (Raju & Nagesh Kumar 2014). K-Means is an iterative algorithm with an objective function related to the minimization of error (Raju & Nagesh Kumar 2018). The working structure of the K-Means algorithm is presented in Figure 3. Cluster validation indices are found to be advantageous for finding optimal clusters as they can assess the separation between clusters in an efficient way (Wang et al. 2009). The DB index is one such validation index (Davies & Bouldin 1979), employed in the present study for finding the optimal number of clusters of groundwater strategies. The DB index is based mainly on intercluster and intracluster error, and a lower index value is preferred.
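As an illustration of how the DB index rewards compact, well-separated clusters, here is a minimal Python sketch of its usual definition: for each cluster, take the worst ratio of summed within-cluster scatter to centroid separation, then average over clusters. It is not the CVAP implementation; the function names and the four sample points are hypothetical, and the sketch assumes Euclidean distance, at least two clusters and distinct centroids.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(points: List[Point], labels: List[int]) -> float:
    """Davies-Bouldin index; lower values mean better-separated clusters."""
    clusters: Dict[int, List[Point]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    ids = sorted(clusters)
    cents = {c: centroid(clusters[c]) for c in ids}
    # Intracluster scatter: mean distance of members to their centroid.
    scatter = {
        c: sum(math.dist(p, cents[c]) for p in clusters[c]) / len(clusters[c])
        for c in ids
    }
    total = 0.0
    for i in ids:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in ids if j != i
        )
    return total / len(ids)


# Hypothetical data: two tight pairs of points far apart from each other.
pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = davies_bouldin(pts, [0, 0, 1, 1])   # groups the nearby points together
bad = davies_bouldin(pts, [0, 1, 0, 1])    # groups distant points together
print(good, bad)
```

In a study such as this one, the index would be evaluated once per candidate number of clusters and the candidate with the lowest value retained.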
Two different distance-based MCDM techniques were employed to rank the representative strategies obtained by K-Means for the optimum cluster size, namely VIKOR (VIšekriterijumsko KOmpromisno Rangiranje; Wu et al. 2016) and TOPSIS (Opricovic & Tzeng 2004). TOPSIS was selected because the concept is rational and comprehensive and the computation is simple (Ulengin et al. 2010). VIKOR was chosen as it is one of the most successfully applied MCDM techniques for various problems (Salehi 2016). VIKOR is similar to TOPSIS, but with different aggregation functions (Opricovic & Tzeng 2004). These techniques are described in Tables 1 and 2, respectively.
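Table 1. Steps of the VIKOR method

Step | Description | Mathematical expression/Remark |
---|---|---|
1 | Ideal and anti-ideal values of criterion j | $f_j^{*}$ and $f_j^{-}$ (best and worst values of criterion $j$ over all strategies) |
2 | $S_i$ and $R_i$, i = 1, 2, …, N | $S_i = \sum_{j} w_j \frac{f_j^{*} - f_{ij}}{f_j^{*} - f_j^{-}}$; $R_i = \max_{j} \left[ w_j \frac{f_j^{*} - f_{ij}}{f_j^{*} - f_j^{-}} \right]$, where $f_{ij}$ is the value of criterion $j$ for strategy $i$ and $w_j$ is the weight of criterion $j$ |
3 | Value of $Q_i$ | $Q_i = v \frac{S_i - S^{*}}{S^{-} - S^{*}} + (1 - v) \frac{R_i - R^{*}}{R^{-} - R^{*}}$, with $S^{*} = \min_i S_i$; $S^{-} = \max_i S_i$; $R^{*} = \min_i R_i$; $R^{-} = \max_i R_i$; $v$ = weight for the strategy of the maximum group utility; $1 - v$ = weight of the individual regret |
4 | Ranking basis | Minimum $Q_i$ value is preferred |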
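Table 2. Steps of the TOPSIS method

Step | Description | Mathematical expression/Remark |
---|---|---|
1 | Ideal and anti-ideal values of criterion j | $f_j^{*}$, $f_j^{-}$ |
2 | Separation measure of each strategy a from the ideal | $D_a^{+} = \sqrt{\sum_{j} \left( f_j(a) - f_j^{*} \right)^2}$, where $f_j(a)$ is the weighted normalized value of criterion $j$ for strategy $a$ |
3 | Separation measure of each strategy a from the anti-ideal | $D_a^{-} = \sqrt{\sum_{j} \left( f_j(a) - f_j^{-} \right)^2}$ |
4 | Closeness index of each strategy a | $C_a = \frac{D_a^{-}}{D_a^{+} + D_a^{-}}$ |
5 | Ranking basis | Higher value is preferred |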
RESULTS AND DISCUSSION
Stage 1: model application
An AEM-based flow model (Gaur et al. 2011a) and an NSGA-II-based optimization model were coupled and used for simulation–optimization analysis. The MATLAB Global Optimization Toolbox with multiobjective optimization was used for the analysis (Matlab 2017). MATLAB functions were written for coupling the models. To couple the model with NSGA-II, the AEM model function was called through the cost function, with discharge and location as inputs. The outputs of the AEM model were then post-processed along with the imposition of constraints and penalties. The coupled AEM-NSGA-II model was applied to generate the Pareto front fulfilling the three objectives. Fifteen decision variables were considered based on the location and discharge of five pumping wells: the coordinate values X and Y and the discharge value of each well. A penalty function approach was employed to handle constraints, and the same weightage was given to each constraint. A weightage factor of $10^{7}$ Euros was adopted on the basis of trial and error (Ayvaz & Elci 2013), as the model was not able to converge for values less than $10^{7}$.
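The penalty function idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB cost function used in the study: the function name, the representation of constraints as non-negative violation magnitudes, the choice to add the same penalty to every objective and the example numbers are all hypothetical, and the simulation model that would supply the inputs is omitted.

```python
from typing import Sequence, Tuple


def penalized_objectives(
    discharge: float,
    piping_cost: float,
    pumping_cost: float,
    violations: Sequence[float],
    penalty_weight: float,
) -> Tuple[float, float, float]:
    """Return a minimization vector (-discharge, piping, pumping).

    Each entry of `violations` is the amount by which one constraint is
    exceeded (0.0 when satisfied). Every constraint gets the same weight,
    and the resulting penalty is added to all three objectives (assumed).
    """
    penalty = penalty_weight * sum(max(0.0, v) for v in violations)
    return (-discharge + penalty, piping_cost + penalty, pumping_cost + penalty)


# Hypothetical evaluations of one candidate design.
feasible = penalized_objectives(1263.1, 880_000.0, 679_000.0, [0.0, 0.0], 1.0e7)
infeasible = penalized_objectives(1263.1, 880_000.0, 679_000.0, [0.5, 0.0], 1.0e7)
print(feasible)
print(infeasible)
```

A large weight pushes any infeasible candidate far from the front, so it is unlikely to survive selection; too small a weight lets infeasible designs compete with feasible ones.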
For tuning the NSGA-II parameters, a range of values was tried for each parameter; where the solutions (minimum and maximum of the solution bounds) were found to be insensitive to the variation, the default value was used. The tuning of the parameters is discussed in detail below:
As per the MATLAB suggestions, if the number of decision variables is greater than 5, the population should be set near 200. In our case, there were 15 decision variables; so the population was set to 200.
We used the system without parallelization; hence, there is no migration of the sub-population and consequently no effect of migration fraction.
The creation function generated the initial population which was set as ‘feasible’. This creates uniformly distributed points and handles linear constraints, if any.
Tournament selection was chosen as the selection function for choosing parents for the next generations. For our problem, changing the mutation rate from 0.1 to 0.8 did not affect the range of solutions obtained; so it was again set to the default value of 0.1.
Decreasing the crossover fraction from 0.5 to 0.2 caused penalty solutions to persist, and increasing the value from 0.8 to 1 had no significant effect on diversity. Hence, the default value of 0.8 was chosen.
Pareto fraction defines the number of Pareto solutions to be obtained from the previous Pareto fronts. On increasing the Pareto front population fraction from 0.35 at intervals of 0.25, a decrease in penalty solutions was observed. Finally, the value of 0.85 was chosen.
If the change in best fitness was very small over 100 generations (stall generations), the solution was considered stable and the Pareto front was considered to have stopped moving. The function tolerance was set to $10^{-5}$. The number of generations was set high (700 generations) to prevent premature convergence; the function tolerance criterion was always satisfied before the full number of generations was reached.
The tuned AEM-NSGA-II was run up to a maximum of 700 generations. However, convergence occurred after 542 iterations. The model identified 2105 non-dominated solutions out of 108,400 evaluations (200 population × 542 iterations). Figure 4 presents the Pareto front generated by the AEM-NSGA-II model. The minimum and maximum values of the final Pareto front were found to be 840 and 1,337 m3/s for discharge, 183,110 and 1,014,811 Euros for piping cost and 187,863 and 761,799 Euros for pumping cost. Figure 4 also shows the mutual dependence of the objectives and their sensitivity to each other, i.e. the impact of a change in one objective on another. Note that if the functions are multiplied by a constant, it acts only as a scaling factor and the solution follows the same trend.
2105 non-dominated groundwater planning strategies were generated. These were difficult to handle directly for ranking; K-Means in conjunction with the DB index was used for finding the optimal size of clusters that could be further processed for ranking of representative strategies.
Stages 2 and 3: cluster analysis and validation
A normalization approach was used to make the criteria dimensionless (Raju & Nagesh Kumar 2014). The weights of the three criteria, piping cost, pumping cost and discharge, are taken as equal, in line with the NSGA-II analysis. Weighted normalized values, which are the product of weights and normalized values, are the input to cluster analysis. K-Means and the DB index were computed with the Cluster Validity Analysis Platform (CVAP) (Wang 2007). K-Means is performed for 3 to 25 clusters and the algorithm is run multiple times in an iterative manner. Accordingly, the DB index is also computed for 3 to 25 clusters. Most of the time, the optimum cluster size varies between 20 and 25. Considering the variation of the DB index over the number of iterations and data similarity, the optimum cluster size is chosen as 20. The percentage of the 2105 strategies falling in each cluster is presented in Figure 5. For example, A1(1946): 6.22% means that 6.22% of the 2105 strategies are part of cluster 1; 1946 is the representative strategy in cluster 1 and A1 is the notation for that representative strategy. It is observed that, in some of the clusters, the division of strategies is almost uniform. The representative strategy for each cluster is determined as follows: the squared error between each weighted normalized strategy in the group and the group mean is computed, and the strategy that gives the least error is picked as the representative for that group. For computational purposes, A1 to A20 denote strategy numbers 1946, 856, 1545, 314, 1274, 185, 1492, 1675, 1503, 1524, 658, 172, 62, 1654, 909, 1454, 1685, 1911, 61 and 2050, respectively. Figure 6 presents the 20 representative strategies A1 to A20.
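The clustering and representative-selection steps can be sketched as follows. This is a minimal Python illustration of plain K-Means (Lloyd's iterations) followed by picking, in each cluster, the member with the least squared error to the cluster mean. It is not the CVAP code; the function names, the fixed initial centroids and the six sample points (standing in for weighted normalized strategies) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def mean(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_err(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points: List[Point], init: List[Point],
            max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Assign points to the nearest centroid and update until labels settle."""
    cents = [list(c) for c in init]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        new_labels = [
            min(range(len(cents)), key=lambda c: sq_err(p, cents[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(cents)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                cents[c] = mean(members)
    return labels, cents


def representatives(points: List[Point], labels: List[int],
                    cents: List[List[float]]) -> Dict[int, int]:
    """For each cluster, index of the member closest to the cluster mean."""
    reps: Dict[int, int] = {}
    for c, cent in enumerate(cents):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            reps[c] = min(members, key=lambda i: sq_err(points[i], cent))
    return reps


# Hypothetical weighted normalized strategies (two criteria for brevity).
data = [(0.10, 0.10), (0.12, 0.11), (0.20, 0.15),
        (0.80, 0.90), (0.85, 0.92), (0.96, 0.97)]
labels, cents = k_means(data, init=[data[0], data[3]])
reps = representatives(data, labels, cents)
print(labels, reps)  # [0, 0, 0, 1, 1, 1] {0: 1, 1: 4}
```

The representative is always an actual strategy from the Pareto front rather than the cluster mean itself, so it remains a feasible, non-dominated design that can be ranked and implemented.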
Stage 4: application of VIKOR and TOPSIS
MATLAB-based VIKOR and TOPSIS codes were developed for ranking A1 to A20. The methodologies of VIKOR and TOPSIS are presented in Tables 1 and 2. The code has provision for browsing the input file and the weights file, as well as for choosing the normalization approach. In the VIKOR method, the user can choose any value of v (between 0 and 1). Figure 7 presents the values of the group utility $S_i$, the individual regret $R_i$ and $Q_i$ for v = 0.5; these vary between 0.5938–2.3763, 0.528–1 and 0–0.9834, respectively, over the 20 strategies. A5, A7 and A14 are the top three preferred strategies with $Q_i$ values of 0.0, 0.222 and 0.2523. The least preferred strategies are A11, A18 and A12 with $Q_i$ values of 0.9834, 0.9666 and 0.9368. Figure 8 presents the ranking pattern corresponding to v values of 0.1 and 0.5, 0.7, 1.0 (termed as S1, S2 and S3). The sensitivity analysis of changing values of v did not show any impact on the top-ranking strategy (A5), i.e. strategy number 1274 falling in cluster 5. The corresponding piping cost, pumping cost and discharge are 880,000 Euro, 679,000 Euro and 1,263.1 m3/s, respectively.
However, a significant effect is observed in other strategies. Spearman rank correlation (Gibbons 1971) between S1–S2, S1–S3 and S2–S3 are found to be 0.911, 0.844 and 0.977, respectively, indicating a reasonably strong correlation between different scenarios.
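To make the VIKOR ranking concrete, the following Python sketch follows the standard formulation of Opricovic & Tzeng (2004): a weighted, range-normalized distance from the ideal is summed (group utility) and maximized (individual regret) per strategy, and the two are blended with the weight v. It is not the MATLAB code developed in the study; the function name, the three strategies and the equal weights are hypothetical, and the sketch assumes that each criterion takes at least two distinct values.

```python
from typing import List, Sequence, Tuple


def vikor(matrix: List[Sequence[float]], weights: Sequence[float],
          benefit: Sequence[bool], v: float = 0.5
          ) -> Tuple[List[float], List[float], List[float]]:
    """Return (S, R, Q) per strategy; the smallest Q is preferred."""
    cols = list(zip(*matrix))
    # Ideal is the maximum for benefit criteria and the minimum for costs.
    best = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    S: List[float] = []
    R: List[float] = []
    for row in matrix:
        terms = [w * (fb - f) / (fb - fw)
                 for f, fb, fw, w in zip(row, best, worst, weights)]
        S.append(sum(terms))   # group utility
        R.append(max(terms))   # individual regret
    s_lo, s_hi, r_lo, r_hi = min(S), max(S), min(R), max(R)
    Q = [v * (s - s_lo) / (s_hi - s_lo) + (1 - v) * (r - r_lo) / (r_hi - r_lo)
         for s, r in zip(S, R)]
    return S, R, Q


# Hypothetical strategies: (piping cost, pumping cost, discharge).
alternatives = [(100.0, 100.0, 10.0), (200.0, 200.0, 20.0), (150.0, 120.0, 18.0)]
is_benefit = [False, False, True]          # only discharge is maximized
equal_weights = [1 / 3, 1 / 3, 1 / 3]

S, R, Q = vikor(alternatives, equal_weights, is_benefit, v=0.5)
vikor_order = sorted(range(len(Q)), key=lambda i: Q[i])
print(vikor_order)  # [2, 0, 1]
```

Re-running the function with different values of `v` reproduces the kind of sensitivity analysis described above: the ordering of middle-ranked strategies can shift while a clearly balanced strategy keeps the first position.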
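Figure 9 presents the values of the separation measure from the ideal, the separation measure from the anti-ideal and the closeness index computed by TOPSIS. These vary between 0.1937–1.0321, 0.2468–1.0299 and 0.213–0.8261, respectively, over the 20 strategies. A5, A3 and A2 occupied the first three positions (with closeness indices of 0.8261, 0.7751 and 0.7101) and A19, A4 and A11 occupied the last three positions (with closeness indices of 0.213, 0.2221 and 0.2307).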
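The TOPSIS computation can be sketched in Python as below: separation from the ideal and anti-ideal points is measured with Euclidean distance, and the closeness index is the share of the anti-ideal separation in the total. This is an illustration rather than the MATLAB code of the study. The function name and the three strategies are hypothetical, and vector (root-sum-of-squares) normalization is an assumption, since the normalization actually used is selectable in the authors' code.

```python
import math
from typing import List, Sequence, Tuple


def topsis(matrix: List[Sequence[float]], weights: Sequence[float],
           benefit: Sequence[bool]
           ) -> Tuple[List[float], List[float], List[float]]:
    """Return (d_plus, d_minus, closeness); higher closeness is preferred."""
    cols = list(zip(*matrix))
    # Assumed vector normalization, followed by weighting.
    norms = [math.sqrt(sum(x * x for x in c)) for c in cols]
    weighted = [[w * x / n for x, n, w in zip(row, norms, weights)]
                for row in matrix]
    wcols = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(wcols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(wcols, benefit)]
    d_plus = [math.dist(row, ideal) for row in weighted]
    d_minus = [math.dist(row, anti) for row in weighted]
    closeness = [dm / (dp + dm) for dp, dm in zip(d_plus, d_minus)]
    return d_plus, d_minus, closeness


# Hypothetical strategies: (piping cost, pumping cost, discharge).
options = [(100.0, 100.0, 10.0), (200.0, 200.0, 20.0), (150.0, 120.0, 18.0)]
d_plus, d_minus, closeness = topsis(
    options, weights=[1 / 3, 1 / 3, 1 / 3], benefit=[False, False, True]
)
topsis_order = sorted(range(len(closeness)), key=lambda i: -closeness[i])
print(topsis_order)  # [2, 0, 1]
```

The third strategy ranks first because it sits close to the ideal on all three criteria at once, whereas the other two are each ideal on some criteria and anti-ideal on the rest.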
Figure 10 presents the ranking of strategies by TOPSIS and its comparison with VIKOR (v = 0.5). The Spearman rank correlation between VIKOR and TOPSIS is 0.3503, indicating a not-so-strong correlation even though both are based on a similar methodology. To our knowledge, this is the first application of VIKOR to groundwater studies in conjunction with cluster analysis.
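For readers who want to reproduce such comparisons, the Spearman rank correlation between two rankings without ties is 1 − 6Σd²/(n(n² − 1)), where d is the difference between the two ranks of a strategy and n is the number of strategies. A minimal Python sketch follows; the function name and the five-strategy rankings are hypothetical and are not the rankings of the study.

```python
from typing import Sequence


def spearman(rank_a: Sequence[int], rank_b: Sequence[int]) -> float:
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ranks given to five strategies by two methods.
method_1 = [1, 2, 3, 4, 5]
method_2 = [2, 1, 3, 5, 4]

rho_same = spearman(method_1, method_1)
rho_swapped = spearman(method_1, method_2)
rho_reversed = spearman(method_1, method_1[::-1])
print(rho_same, rho_swapped, rho_reversed)  # 1.0 0.8 -1.0
```

A value near 1 means the two methods order the strategies almost identically, a value near 0 means the orderings are largely unrelated, and −1 means one ordering is the exact reverse of the other.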
The methodology in our opinion is robust and can be replicated with suitable modifications. However, the outcome may vary for different periods and locations which we propose to study in future with the present methodology. Since a similar methodology was not applied for the Dore catchment in the past, there is no mechanism to verify the effectiveness of the present approach. However, we propose to collect more micro level data and conduct field surveys to validate or test the efficacy of the present approach.
Water managers can perform sensitivity analysis to assess the impact of one unit of piping cost or pumping cost on discharge, which will help analyse the impact of changes in piping material cost or electricity cost due to inflation. Water managers can use this information to optimally locate the pumping wells and set their corresponding discharges to meet the water demand of the nearby city of Thiers in a unified manner under future economic scenarios. This also opens the path to putting the results into practical use. However, a few challenges remain. Field experts and end users need to be convinced of the reasonableness and ability of the methodologies, and that these suit their thinking and requirements, before implementation. The approach suggested here has both an academic and a practical flavour, and we are confident that it would assist water resources planners and researchers.
We intend to collect more field data, conduct field visits and compare the present methodology with traditional methodology to draw more meaningful inferences. Efforts will also be made to minimize the computational time so that the simulation and optimization process runs seamlessly.
CONCLUSIONS
The present study discussed a methodology consisting of a four-stage procedure comprising the application of NSGA-II, K-Means cluster analysis, the DB index and ranking techniques for the groundwater planning problem of the Dore river catchment, France. The following specific observations are made from the present study:
Representative strategy A5 (strategy number 1274 falling in cluster 5), with piping cost, pumping cost and discharge of 880,000 Euro, 679,000 Euro and 1,263.1 m3/s, respectively, is found to be the best by both VIKOR and TOPSIS.
Spearman rank correlation suggests a not-so-strong correlation between VIKOR and TOPSIS. The sensitivity analysis of changing values of v did not show any impact on the top-ranking strategy (A5).
Clustering of the NSGA-II-generated non-dominated strategies minimized duplication of data and facilitated identifying the optimum number of clusters.
The AEM-based groundwater model handles the simulation–optimization process efficiently.
The location of the storage tank and the corresponding piping cost significantly impact the optimal location of pumping wells.
ACKNOWLEDGEMENTS
Part of the paper is related to one module of the Council of Scientific and Industrial Research (CSIR) sponsored project. The second and third authors acknowledge the support of CSIR, New Delhi, through project no. 22(0782)/19/EMR-II dated 24 July 2019.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.