The problem of identifying an unknown pollution source in polluted aquifers, based on known contaminant concentration measurements, is part of the broader group of issues called inverse problems. This paper investigates the feasibility of solving the groundwater pollution inverse problem by using artificial neural networks (ANNs). The approach consists first in training an ANN to solve the direct problem, in which the pollutant concentration in a set of monitoring wells is calculated for a known pollutant source. Successively, the trained ANN is frozen and is used to solve the inverse problem, where the pollutant source is calculated which corresponds to a set of concentrations in the monitoring wells. The approach has been applied for a real case which deals with the contamination of the Rhine aquifer by carbon tetrachloride (CCl4) due to a tanker accident. The obtained results are compared with the solution obtained with a different approach retrieved from literature. The results show the suitability of ANN-based methods for solving inverse non-linear problems.

INTRODUCTION

Groundwater is an important source for the production of drinking water. Consequently, the protection of groundwater resource quality appears of extreme importance for life support systems. Nevertheless, groundwater is exposed to man-made pollution that might prevent its use for drinking as well as for other domestic, industrial and agricultural purposes. When groundwater is polluted, the restoration of quality and removal of pollutants is a very slow, hence long, and sometimes practically impossible task.

In the field of groundwater resource contamination, it should be highlighted that in some cases pollution might result from contaminations whose origins differ in time and place from the point where the contaminations were witnessed. To tackle such situations, it is necessary to develop specific techniques for identifying the behaviour of unknown contaminant sources from both spatial and temporal points of view. Getting to know the initial conditions of pollution is consistent with the implementation of the European Union Directive 2004/35/EC. This Directive, based on the ‘polluter-pays’ principle, concerns the environmental liability in relation to the prevention and compensation of environmental damages. The application of Directive 2004/35/EC requires the development of novel methodologies, such as that proposed in this work, for the identification of unknown pollution sources in contaminated aquifers. The problem of identifying an unknown pollution source in contaminated aquifers, based on known contaminant concentration measurement, is part of the wide group of issues called inverse problems. During the last decade, several studies have been dedicated to the development of different methods for solving inverse problems, however, works using artificial neural networks (ANNs) are less popular. Among these latest, in Zio (1997), an ANN is trained to identify the value of the dispersion coefficient in a simple analytic contaminant transport model; in Fanni et al. (2002), an ANN is used to locate a pollutant source; in Rajanayaka et al. (2002), a hybrid approach based on a combination of two types of ANN model is applied to estimate hydrogeological parameters; in Scintu (2004), ANNs are trained to predict the coordinates of the pollutant source and the time the pollution occurred; in Singh & Datta (2007), an ANN is used to identify unknown pollution sources with partially missing concentration observation data; in Foddis et al. (2013), ANNs are used for locating the source of a contamination event in time and space.

In this work, an ANN-based inverse problem solving method is presented for the identification of unknown contaminant sources and their characteristics. Compared to the state of the art, the originality of the developed method lies in the approach used for solving the inverse problem, which consists of inverting the ANN model instead of solving the equations that describe the physical system. In particular, different ANNs are trained to associate the pollution source features with the contaminant concentration measurements acquired in the monitoring wells. After this phase, the trained ANN is inverted by fixing the measurements at the wells and determining the corresponding pollutant source. This approach might be used for defining punctual contaminant sources to face groundwater quality problems. In particular, location and fluxes over time can be defined. The procedure described in this paper has been validated in a real case which tackles an actual contamination that occurred in the Rhine aquifer (France) (Vigouroux et al. 1983).

METHOD DESCRIPTION

Multi-layer perceptron network model

An ANN is a distributed, adaptive, generally non-linear learning machine, built from many different artificial neurons (AN) that interact with each other through weighted connections. The ANN is symbolized like a graph where patterns are represented in terms of numerical values attached to the ANs, the nodes of the graph. In particular, multi-layer perceptron (MLP) ANNs are logically arranged in two or more layers (Figure 1), and the numerical data conceptually flow forward from the input to the output layers. This is the structure of ANN used to model the system under study.
Figure 1

MLP structure.

Figure 1

MLP structure.

Such a neural model is trained by using a set of input–output pairs of patterns. These patterns are created by means of flux and transport contaminant modelling software, which calculates both the source of contaminant and the set of measurements in the monitoring wells. These data are converted into patterns which represent the input and the output of the ANN, respectively. Thanks to the weighted links, the ANN associate an output pattern to any input pattern. The training procedure consists of adapting the weights so that the ANN performs a desired association between the input and output patterns of a given training set of examples. In this work, the desired output values (target) represent the measured values so that the trained ANN is able to associate a given source of pollutant with the corresponding measurements in the monitoring wells. The Levenberg–Marquardt algorithm has been used to train the ANN.

The trained MLP realizes a relationship between input and output patterns described by the following system of algebraic equations: 
formula
1
where (see Figure 1): is the input of the network, is the weights matrix of the input layer, is the bias vector of the input layer, and are the input and the output of the hidden layer, respectively, is the hidden neurons logistic activation function, is the output of the network, is the weights matrix of the output layer, and is the bias vector of the output layer.

Solving system equations (1) for the input , given an assigned value of the output , means solving an inverse problem of determining the characteristics of the pollutant source. In the following two sections, the adopted methods to extract features from data and to solve system (1) are described.

Feature extraction

Data which describe groundwater flow and contaminant transport are characterized by huge volume and strong redundancy of information, so that in general it is possible and also mandatory to pre-process the data by reducing the volume without a meaningful loss of information. This operation is referred to as feature extraction and it allows the condensing of the information contained in the original data into a strongly reduced dimension of data. Feature extraction techniques have been chosen in compliance with two important requirements. Firstly, a high rate of compression of the information has to be ensured in order to guarantee a significant correspondence between the back transformed data and the original values. Secondly, the procedure can be easily backward applied to allow the reconstruction of the initial data set with the least possible error.

The adopted feature extraction procedure consists of: (1) calculating the two-dimensional discrete fast Fourier transform (2D-FFT) for constructing both the input and output matrices; (2) reducing the matrices by a fixed threshold of energy content; and (3) reducing the matrix size by means of the principal component analysis (PCA) method.

The training set is constituted by input-output pairs of matrices, calculated by a simulator, where the former reports the time evolution of the pollutant source at four different depths, while the latter one is the time evolution of measured pollutant concentration corresponding to the monitoring wells. Therefore, the former matrix has four rows, while the latter one has as many rows as the number of monitoring wells. The number of columns is equal to the number of time samples. Both input and output matrices have to be converted into a pattern, and this pair of patterns will represent a specific case or example. By performing a suitable number of simulations, we finally obtain an input and an output matrix having the same number of columns, each one corresponding to an example.

The feature extraction procedure is first performed on the two matrices which, respectively, describe source and monitoring wells for each example. The two-dimensional Fourier Transform (Mersereau & Dudgeon 1975) of the two matrices is calculated, switching to the frequency domain. Here, each element represents a spectral component of the signal, whose relevance in the signal is measured by its amplitude. Therefore we can define a preliminary threshold of amplitude in order to eliminate the components which are less relevant. The selected components have to be the same for all the examples, therefore all the components which overcome the threshold in at least one example will be kept. The same set of components will have to be kept for any future example. For each example, the selected components of both input and output matrices are rearranged in two column vectors, therefore at the end of this stage we have two matrices, input and output, respectively, for the entire training set, and for both, the number of columns is equal to the number of examples. The dimension of these two matrices can be further reduced, taking into account the redundancy of information due to the correlation existing among rows of the same matrix. The most efficient way to linearly eliminate such redundancy is the PCA method (Henebry & Rieck 1996). This analysis consists of rotating the reference axis in such a way that the variance of points distribution along some axis (principal) is maximum, so that the residual variance along the remaining axis can be neglected. The quantity of information which is lost with the neglected axis can be easily estimated, and in the real cases the PCA allows strong reduction of the dimension of patterns with very little loss of information. The dropped matrices are the input and output matrices that will be used to train the ANN. The training of the ANN is performed by using the Levenberg–Marquardt algorithm. The quality of the training affects the reliability of the successive inversion of the ANN, therefore all the standard expedients have to be adopted to guarantee the suitable approximation level and generalization capability of the trained ANN.

Inversion procedure

Once the training of the MLP ANN is completed it is used to solve the inverse problem by exploiting the MLP inversion method described in Fanni & Montisci (2003) and Carcangiu et al. (2007). It consists of determining the input pattern which corresponds to a given output pattern, which in our case means identifying the unknown pollution source on the basis of a set of measurements acquired in the monitoring wells. As in any identification problem, it is fundamental to guarantee the uniqueness of the solution. This implies that the two systems of linear equations in (1) have to be determined or over-determined, and so the number of input neurons has to be less than or equal to the number of hidden neurons, which has to be less than or equal to the number of output neurons. Secondly, for the same reason, the first and the second connection weights matrices and have to be full rank. The parameter that can affect such features of the ANN is, generally speaking, the approximation degree of the representation of both input and output data. This means that by increasing the precision used to describe the input and output data, we could find out that the weights matrices are not full rank. For this reason a trial-and-error procedure should be adopted in order to seek the best approximation level which guarantees the invertibility of the MLP ANN.

Inverting the MLP ANN means solving system (1) for the unknowns , and , while the output is fixed. If the constraints on the rank of and are fulfilled, solving system (1) is quite simple as the three unknown patterns , and can be determined one after the other univocally.

On the basis of the third equation in (1), starting from the output , the vector can be determined. To guarantee the uniqueness of the solution, the number of rows (number of output neurons) must be higher than the number of columns (number of hidden neurons). In this case, the results of the system of equations are over-determined and the uniqueness is ensured by assuming the solution which corresponds to the minimum mean squared error. Such a solution can be found by solving the following modified system of equations, which has a squared coefficients matrix: 
formula
2
where the mark T represents the transposition operator. If the number of hidden neurons is equal to the number of output neurons, the weights matrix is squared, and the solution corresponding to the minimum sum squared error can be directly calculated as 
formula
3
The second equation in (1) states a bi-univocal relationship between and , therefore the vector is 
formula
4
Finally, provided that the matrix is full rank, the input pattern can be calculated as 
formula
5

Also in this case, if the number of input neurons is equal to the number of hidden neurons, the system can be solved by directly inverting the matrix .

The pattern describes the unknown source of pollutant. To generate the time evolution of the source, the transforms that have been applied for the feature extraction have to be back applied to the obtained pattern .

APPLICATION OF THE METHOD TO A CASE STUDY

The performance of the proposed methodology has been evaluated by defining the behaviour in time and space of the unknown pollution source of the Alsatian aquifer.

General description of the Upper Rhine Graben and Alsatian aquifer

The Alsatian aquifer is located in the southern part of the Upper Rhine valley. The Upper Rhine Graben is a segment of the European Cenozoic rift system that developed in the north-western forelands of the Alps. It extends over 300 km from Basel (Switzerland) in the south to Frankfurt (Germany) in the north, with an average width of approximately 40 km. As shown in Figure 2 it is flanked in the south by the Vosges and Black Forest (Schwarzwald) mountains, to the west and the east, respectively (Bertrand et al. 2006).
Figure 2

The Rhine Valley and the Alsatian aquifer.

Figure 2

The Rhine Valley and the Alsatian aquifer.

The Alsatian aquifer surface is over 3,000 km2 and contains a volume of alluvium of about 250 billion m3. It represents one of the largest freshwater reserves in Europe. The groundwater reservoir contains about 50 billion m3 of water, with an annual renewal of 1.3 billion m3. This large aquifer is of vital importance since it supplies 75% of the drinking water requirements, 50% of the industrial water needs and 90% of the irrigation water needs in Alsace. The Alsatian part of the Rhine aquifer has a surface length of 160 km and a maximum width of 20 km (Hamond 1995; Bertrand et al. 2006).

The Alsatian aquifer is an extensive alluvial aquifer with a layered structure composed of a random superposition of different alluviums (clay, sand fine to rough, gravels, coarse etc.). This permeable alluvial has a thickness of a few metres at the Vosgean edge, and 150–200 m in the centre of the Rhine plain (Hamond 1995).

The groundwater reservoir is part of a complex hydrosystem which includes frequent exchanges between the rivers and the aquifer which vary with the seasons and it is highly exposed to contamination from rivers (Stengera & Willingerb 1998).

History of the pollution of the aquifer by CCl4

In 1970, a tanker truck containing carbon tetrachloride (CCl4), property of a Dutch company, overturned in the north of Benfeld (a small town located about 35 km south of Strasbourg). According to a note from Service Géologique d'Alsace-Lorraine of 21 December 1971, about 4,000 litres of CCl4 were spread in the area of the accident, infiltrating into the ground or disappearing by evaporation (Hamond 1995). In 1991, the analyses carried out by Bureau de Recherche Géologique et Minières showed abnormal quantities of CCl4 in the supplies of drinking water (about 60 μg/L). These quantities exceeded the safe limits recommended by the World Health Organization (2 μg/L). This high level of CCl4 concentration has caused a serious problem in the region by contaminating the main sources of drinking water in the Alsace Region (France) (Aswed 2008).

The exact amount of chemical infiltrated in the subsoil is unknown, as is the behaviour of the pollution source through time. The assessment of such characteristics constitutes the main issue for the individuation and remediation of the accident effects and it is the objective of this study.

ANN to study the Alsatian aquifer

The behaviour of the Alsatian aquifer unknown contaminant source is determined based on the contaminant concentration measured in the monitoring wells dislocated in the study area.

In a first phase, the ANN was trained to solve the direct problem by using a set of patterns created by means of flux and transport contaminant modelling software, where the patterns describe both the source of contaminant and the set of measurements in the monitoring wells,. The data set of the patterns was been constructed through a coherent number of hydrogeological scenarios, based on a 3D model of the domain developed by Aswed (2008). The input patterns were made of the pollution source features in terms of the injection rates in the four hydrogeological layers. The output patterns were contaminant concentration observation data at 45 monitoring wells. Sources and monitoring wells are related by a bi-univocal relationship, meaning that any specific profile of monitoring wells corresponds to one specific contaminant source behaviour.

In a second phase, the trained network was inverted in order to identify the source of contaminant corresponding to a set of data measured in correspondence with monitoring wells.

Groundwater flow and contaminant transport numerical model

The CCl4 pollution of the Alsatian aquifer has been the subject of several studies (Vigouroux et al. 1983; Aswed 2008). The 3D flux and contaminant transport numerical model used for constructing the patterns was calibrated using measured data of CCl4 concentration that were collected over 12 years (1992–2004) and simulations were performed for 1970 to 2024.

The patterns have been simulated using the non-commercial numerical software Transport of RadioActiver Elements in the Subsurface (TRACES) that combines mixed-hybrid finite elements and discontinuous finite elements to solve for hydrodynamic state and mass transfer in porous media (Hoteit & Ackerer 2003). The contaminated zone is enclosed within a 3D domain of 6 km width, 20 km length, and about 110 m depth. This aquifer domain is discretized using a 3D triangular prismatic grid with 25,388 nodes and 45,460 elements. According to the estimated geometry of the cross-sections (the landfill site was divided into eight zones by soil type) the domain is discretized into ten layers. The contaminant source is located in the first four layers (the thicknesses of the layers are, respectively, 16 m, 4 m, 5 m, and 5 m from the top to the bottom). The volume of contaminated aquifer is about 230–1,300 m3. The surface of contaminant infiltration is about 7–37 m2 (Aswed 2008).

Using TRACES, various states of pollution sources have been constructed adjusting the source characteristics in terms of injection rates over the vertical section. In total, 104 examples were constructed with the same duration of activity of about 31 years (11,520 days) and were located in the same positions in the domain (accident site in Benfeld). So for each of the 104 sources, a different evolution of the contaminant concentration through time for the four layers in the numerical domain has been considered. The total time of simulation is about 54 years (20,000 days).

ANN pattern construction: elaboration and feature extraction

The example patterns obtained with TRACES consist of 208 matrices of contaminant concentrations:

  • 104 matrices corresponding to the features of the pollution sources. These had dimensions [11,520 × 4]; 11,520 represents the time (days) and 4 represents the layers in the source location.

  • 104 matrices corresponding to contaminant concentration in the monitoring wells. These had dimensions of [4,000 × 45]; 4,000 represents the time (one value for each 5 days) and 45 represents the monitoring wells in the domain.

The 104 input matrices and 104 output matrices have been reorganized to form two matrices which describe the whole training set: one for the input and one for the output.

The two groups of 104 input and output matrices were considered separately (Figure 3).
Figure 3

Matrix reduction diagram.

Figure 3

Matrix reduction diagram.

For the first step, the 2D-FFT has been calculated for each matrix, using Matlab. For each case we obtained two vectors: one for the input that has the dimension of 46,084 elements and one for the output that has the dimension of 15,774 elements. Vectors, corresponding to the different cases, have been joined to form a matrix where each column corresponds to one example, so at the end of this step the dimensions of the two matrices are [46,084 × 104] for the input matrix and [15,774 × 104] for the output matrix, respectively.

In the second step, from each of these matrices the negligible components have been removed. A component is considered negligible if its amplitude does not pass a predetermined threshold in any of the examples. The value of the threshold is determined by seeking a crossover between the level of approximation of the acquired data and the dimension of the input and output matrices. For applying the PCA, the number of remaining values is pushed to have a number of rows less than or equal to the number of examples. As a result of this second step, the input matrix had the dimension [99 × 104] and the output matrix had the dimension [100 × 104].

In the third step, the PCA transform has been applied to further reduce the volume of data. The degree of compression that can be obtained depends on the fraction of information, evaluated as the variance of the distribution, that could be lost. Such a fraction has been determined by means of a trial-and-error procedure which provided a value of 2 × 10–3 for the input and 10–7 for the output. Consequently, the input matrix had dimension [11 × 104] and the output matrix dimension [36 × 104].

ANN training and inversion

One 11-11-36 MLP has been trained with the Levenberg–Marquardt algorithm by presenting the pairs of input (pollution source scenarios) and target patterns (contaminant concentration for the 45 monitoring wells). Special attention has been paid to train the ANN in such a way that it is able to generalize the information contained in the training set. To this end, during the training phase, the error was evaluated even in the validation set, and the training was interrupted as soon as the error in the validation set started to rise (cross-validation stop criterion). A third set (test set) of examples has been put aside in order to check on a totally independent set the generalization capability of the trained ANN. The 104 examples have been so subdivided: 74 in the training set, 19 in the validation set and 11 in the test set.

Once the training phase was completed, the trained network was inverted to identify the contaminant source. To this end, an output pattern was generated on the basis of the contaminant concentrations measured in the 45 monitoring wells and the procedure described above, in the section ‘Inversion procedure’, was applied. In the present case, the number of hidden neurons is lower than the number of output neurons, so the system is over-determined. By applying Equation (2), the solution corresponding to the minimum sum squared error for the vector was found. The second equation in (1) states a bi-univocal relation between and , and then the constant term of the third equation in (1) was determined. Finally, by solving this equation for , the solution of the inverse problem was determined.

RESULTS AND DISCUSSION

On the basis of the known contaminant concentration data in the monitoring wells, the injection rates of the pollution sources in the cross-section were identified. To evaluate the performance of the method, the reconstructed pollution source reported in Aswed (2008) was assumed as source of the simulation and the corresponding values of the well data were calculated with the simulator. The four diagrams in Figure 4 show the comparison between the behaviour by Aswed (blue) and the diagram obtained by inverting the ANN (green) for the four layers. The constant concentration in the early period could indicate that in the early stage of the process an accumulation in the surface layer continued to release contaminant at a constant rate for 10 years.
Figure 4

Simulated (dashed black) and inverted (continuous grey) source trends at layer 1 (16 m), layer 2 (20 m), layer 3 (25 m) and layer 4 (30 m), respectively (Foddis 2011).

Figure 4

Simulated (dashed black) and inverted (continuous grey) source trends at layer 1 (16 m), layer 2 (20 m), layer 3 (25 m) and layer 4 (30 m), respectively (Foddis 2011).

As can be noted, the calculated behaviour of the source satisfactorily fits the trend reported in Aswed (2008). The approximation is better in the deeper layers (layer 4) where the concentration is higher due to the deposition of the contaminant with time. The ripple and the boundary peaks do not correspond to a real behaviour of the source, but they are a consequence of the cut of frequencies performed in the feature extraction procedure. Standard signal filtering procedures, like Kalman filters, could be applied in order to improve the appearance of the signal, both filtering the ripple and automatically cutting the peaks by fixing a bound of the profile derivative. Nevertheless, this kind of post-processing of the results appears to be useless for an expert assessment.

CONCLUSION

This work mainly focuses on an ANN-based methodology for solving the inverse problem in aquifer pollution. A crucial aspect of the procedure is represented by the training of the ANN, which has to guarantee a satisfactory approximation of the physical system, but at the same time it has to fulfil several requirements in terms of number of neurons and rank of the connection weight matrices. The trained ANN is analytically inverted to solve the inverse problem, which consists of determining the profile of the pollutant source on the basis of a set of measurements in monitoring wells. This procedure has been tested on a real case: an accident, which took place in the north-east of France in 1970, gave rise to contamination by CCl4 of the Alsatian aquifer. Various numerical model scenarios of the CCl4 contamination in the Alsatian aquifer have been generated to train the ANN to solve the direct problem of associating the pollution source characteristics with the measurements acquired in the monitoring wells. The inverse problem is solved by fixing the output pattern and analytically calculating the corresponding input. Finally, the obtained results are compared with the pollution source behaviour reconstructed by another method retrieved from the literature. The obtained results show the suitability of the method, which can provide simple and reasonable solutions to various problem in hydrogeology, especially for this kind of non-linear problem.

REFERENCES

REFERENCES
Aswed
T.
2008
Modélisation de la pollution de la nappe d'alsace par solvants chlores
.
PhD thesis
,
Université Louis Pasteur, Institut de Mécanique des Fluides et des Solides, UMR-CNRS 7507
,
Strasbourg
,
France
.
Carcangiu
S.
Di Barba
P.
Fanni
A.
Mognaschi
M. E.
Montisci
A.
2007
Comparison of multi-objective optimisation approaches for inverse magnetostatic problems
.
Int. J. Comput. Maths Electrical Electronic Eng.
26
(
2
),
293
305
.
Fanni
A.
Montisci
A.
2003
A neural inverse problem approach for optimal design
.
IEEE Trans. Magnet.
39
(
3
),
1305
1308
.
Fanni
A.
Uras
G.
Usai
M.
Zedda
M. K.
2002
Neural network for monitoring groundwater
. In:
Hydroinformatics 2002: Proceedings of the Fifth International Conference on Hydroinformatics
,
IWA Publishing
,
London
,
1–5 July 2002
, pp.
687
692
.
Foddis
M. L.
2011
Application of Artificial Neural Networks in Hydrogeology: Identification of Unknown Pollution Sources in Contaminated Aquifers
.
PhD thesis
,
Universities of Cagliari and Strasbourg
.
Foddis
M. L.
Ackerer
P.
Montisci
A.
Uras
G.
2013
Polluted aquifer inverse problem solution using artificial neural networks
.
AQUA Mundi
IV
,
15
21
.
Hamond
M. C.
1995
Modélisation de l'extension de la pollution de la nappe phréatique d'Alsace par le tétrachlorure de carbone au droit et à l'aval de Benfeld
.
Mémoire pour le DESS sciences de l'environnement
.
Institut de Mécanique des Fluides et des Solides
,
Strasbourg
,
France
.
Henebry
G.
Rieck
D. R.
1996
Applying principal components analysis to image time series: effects on scene segmentation and spatial structure
. In:
Geoscience and Remote Sensing Symposium, 1996. IGARSS '96
.
‘Remote Sensing for a Sustainable Future’
.
IEEE, New York
.
Hoteit
H.
Ackerer
P.
2003
TRACES User's Guide V 1.00
.
Institut mécanique des fluides et des solides de Strasbourg
,
France
.
Mersereau
R. M.
Dudgeon
D. E.
1975
Two-dimensional digital filtering
.
Proc. IEEE
63
(
4
),
610
623
.
Rajanayaka
C.
Samarasinghe
S.
Kulasiri
D.
2002
Solving the inverse problem in stochastic groundwater modelling with artificial neural networks
.
In: Integrated Assessment and Decision Support, A. E. Rizzoli & A. J. Jakeman (eds), Proceedings of the 1st Biennial Meeting of the iEMSs, Volume 2, International Environmental Modelling and Software Society, Manno, Switzerland. http://www.iemss.org/iemss2002/proceedings/vol2.html.
Scintu
C.
2004
Reti neurali artificiali: una applicazione nello studio di acquiferi contaminati
.
PhD thesis, University of Cagliari, Italy
.