Reliable long-term groundwater level (GWL) prediction is essential to assess the availability of resources and the risk to drinking water supply in changing climatic and socio-economic conditions, especially in areas with water deficits. The modern approach in this area involves the use of machine learning methods. However, the greatest challenge in these methods lies in the optimization of input selection. The presented research concerns the selection of the best combination of predictors using the Hellwig method. It served as a preprocessing technique before GWL prediction using support vector regression (SVR) and multilayer perceptron (MLP) for three wells in the Greater Poland Province, where the largest water deficits occur, in the period 1975–2014. The results of this method were compared with those of the regression method, general regression model. For the case study under investigation, the Hellwig method provided the best set of predictors consisted of GWL at lags of −1 and −2 months, precipitation from the current month, and delayed from −1 to −6 months, and past temperature at −1, −3, −4 and −6 months. Such input led to a model accuracy of 0.003–0.022 for a mean squared error and r2 of >0.8. The results obtained with SVR were slightly better than those with MLP. Moreover, every well required an individual set of predictors, and additional meteorological inputs improved the models’ performance.

  • The performance of the Hellwig method in the selection of input variables for groundwater level (GWL) forecasting using support vector regression and multilayer perceptron.

  • GWL dependence on meteorological and hydrological conditions within the past 6 months.

  • Additional meteorological inputs can improve the performance of forecasts.

  • Neighbouring wells require an individual set of predictors.

  • GWL modelling in areas threatened by water scarcity.

Graphical Abstract

Graphical Abstract
Graphical Abstract

Differential access to water in the world causes temporary or permanent water deficits in various regions. This results from irregular water resources and the rainfall that supplies them, population increase, anthropogenic processes (like progressive urbanization), increase in the use of water for industry and agriculture, environmental pollution, climate change, etc. (Kubicz et al. 2018, 2021; Lochyński et al. 2019). Satisfying the water needs of various recipients in a sustainable way is, therefore, one of the major challenges of the modern world (Glorian et al. 2018). To meet this challenge, the potential of intelligent water resource management systems is currently being evaluated worldwide. The idea of these systems is to use complex IT techniques to model processes occurring within groundwater (and surface water). Rapid progress in the field of computer science enables the implementation of new algorithms for problem optimization in the area of water resource management. A very popular technique in this area is groundwater level (GWL) prediction.

In environmental engineering, GWL forecasting results in a specific (short-, medium-, and long-term) time horizon are important for the rational management of groundwater resources. Long-term predictions of the groundwater resources’ quantity and their trends’ analysis allow to assess the availability of resources in changing climatic (and socio-economic) conditions (Malakar et al. 2018), particularly in areas suffering from water scarcity (Ibrahim et al. 2015).

Until now, for this purpose, depending on the time horizon of forecast, machine learning methods such as genetic programming (GP) (Shiri & Kisi 2011; Sivapragasam et al. 2015), artificial neural networks (ANNs) (Tsanis et al. 2008; Jalalkamali et al. 2011; Chitsazan et al. 2015; Kumar & Bhattacharjya 2021), and support vector machines (SVMs) (Suryanarayana et al. 2014; Huang et al. 2017; Yadav et al. 2017; Rahman et al. 2020) were the most often used. The undoubted advantage of their use is the fact that detailed knowledge about the modelled phenomenon and information on the existence of dependencies between data, or whether such relationships exist at all, is not required. However, one of the greatest challenges in these methods lies in the optimal selection of input parameters for the prediction model (Chitsazan et al. 2015; Amaranto et al. 2018). Sivapragasam et al. (2015) indicated that the selection of predictors plays a crucial role in modelling efficiency, especially for limited data sets. Moreover, they stated that the study area needs a unique and individual approach in terms of GWL modelling because a model suitable for a particular site may not necessarily work well even for a neighbouring area. The authors use different data preprocessing techniques for this purpose. Wu et al. (2021) used the Boruta method to obtain the best input variable combinations for monthly GWL forecasting in the Zhangye Basin, northwest China. Sharafati et al. (2020) attempted the gamma test for monthly GWL prediction over the Rafsanjan aquifer in Iran. Rahman et al. (2020) used eXtreme Gradient Boosting and random forest to find predictors for GWL forecasts for Kumamoto City in southern Japan. Sahoo et al. (2017) employed an input variable selection (IVS) method based on singular spectrum analysis, mutual information, and genetic algorithms to model seasonal GWL changes in the High Plains aquifer system and the Mississippi River Valley alluvial aquifer, USA. However, less computational demanding methods for predictor selection in GWL forecasting based on correlations between input and output variables have been implemented. Zhao et al. (2016) tested the correlation between predictors and GWL before GWL modelling in the Shuping landslide in the Three Gorges Reservoir area, China. Iqbal et al. (2020) studied the correlation between meteorological parameters and target GWL for the area located between Ravi River and Sutlej River, Pakistan. Osman et al. (2021) used cross-correlation to choose the best input variables for predictive GWL models in Selangor, Malaysia. Kajewska-Szkudlarek & Łyczko (2021) assessed the Hellwig method for input selection and proved its effectiveness in filling the gaps in daily GWL time series on the basis of neighbouring wells. Obtained models with a root mean square error (RMSE) of 0.024–0.292 and r2 of 0.7–0.9 were considered as high quality. Moreover, they showed good performance for both high and low GWL values. In the present research, the Hellwig method was used to choose the best combination of input variables for monthly GWL forecasting; this method also fit in the correlation techniques for predictor selection. This method is otherwise known as the method of information capacity indicators (Hellwig 1968). Its high efficiency, confirmed by a comparison with other variable selection techniques (zero unitarization method, Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), etc.) (Omiotek et al. 2019; Wójcik-Leń et al. 2019; Szmidt et al. 2020), led the authors to apply it in GWL modelling. Authors who have used the Hellwig method in different applications have pointed out its several advantages like universality (Łopucki & Kiersztyn 2015), low complexity and easy implementation (Omiotek et al. 2019), wide availability (Omiotek et al. 2019), and low calculation effort (Kajewska-Szkudlarek & Łyczko 2021). Moreover, at the stage of predictor selection, it is based on the use of the variables that are strongly correlated with the target variable and poorly correlated with each other. (Łopucki & Kiersztyn 2015).

Previous studies indicate that the best GWL prediction models are obtained when the input variables are meteorological parameters describing weather conditions (or climatic conditions over longer periods) and conditions resulting from the hydrogeology of measurement sites. In addition, the authors consider the past levels of groundwater depth as the most significant variable, while the choice of other parameters depends on the location of the study area (Adamowski & Chan 2011; Jalalkamali et al. 2011; Shiri & Kisi 2011; Suryanarayana et al. 2014; Chitsazan et al. 2015; Sivapragasam et al. 2015). Among the most frequently considered input variables are: GWL, air temperature and humidity, rainfall, and evapotranspiration. The quality of predictive models is assessed based on the following measures: correlation coefficient r (0.95–0.98 in Chitsazan et al. 2015), squared correlation r2 (0.92–0.98 in Jalalkamali et al. 2011), mean squared error (MSE) (0.005–0.009 in Chitsazan et al. 2015), RMSE (0.131–0.276 in Jalalkamali et al. 2011), mean absolute error (MAE) (0.065–0.098 in Shiri & Kisi 2011), and mean absolute percentage error (0.157–0.423 in Jalalkamali et al. 2011).

Chitsazan et al. (2015) in their research used the feed-forward back propagation neural network to predict the monthly GWL of Aghili plain, Iran, with four different learning algorithms: descent with momentum, Levenberg Marquardt (LM), resilient back propagation (RP), and scaled conjugate gradient. Input data to the model included rainfall, evaporation, relative air humidity, air temperature, discharge of irrigation canals, recharge from the boundaries, and GWLs of wells in time t, t − 1, t − 2, and the water level value in t + 1. The climatological data covered 39 years of measurement. The authors adapted the clustering approach to the data, dividing them into three groups according to the hydrogeological characteristics of different parts of the plain surrounding each well. The measures that determined the quality of the prediction model at the individual stages of its creation were the MSE and correlation coefficient. The authors concluded that LM algorithm displayed best performance for all three hydrogeological groups.

Yadav et al. (2017) assessed the suitability of extreme learning machines (ELMs) and SVM for GWL prediction at two observation wells located in Canada. A monthly data set of 8 years from 2006 to 2014 consisting of both hydrological and meteorological parameters (rainfall, temperature, evapotranspiration, and GWL) was used for making a comparative study of the models. These variables were used in various combinations for performing univariate and multivariate analyses of the models. The prediction performance of the developed models was evaluated using the coefficient of determination, root mean square error, and Nash–Sutcliffe efficiency criterion. The study demonstrates that the proposed ELM model has better forecasting ability than the SVM model for monthly GWL forecasting.

Sivapragasam et al. (2015) used GP to model the spatial fluctuations of monthly GWL in the Arjuna Nadhi subbasin region, India, for a period of 10 years (2001–2010). Similar to Yadav et al. (2017), the authors proved that input parameter selection is essential for accurate prediction and that the parameters should be chosen by considering the individual conditions of research sites. They modelled the GWL in the output wells using information from neighbouring ones.

Jalalkamali et al. (2011) used ANN and neuro-fuzzy models for GWL prediction in Kerman plain, Iran. Their data set included 22 years of monthly records of air temperature, rainfall, and water levels in the output well as well as neighbouring wells. They looked for the best combinations among past GWL, temperature, and rainfall time series. However, they used only data from the current month and delayed by one month. Then, they evaluated the received models post factum on the basis of their performance.

Wunsch et al. (2018) used nonlinear autoregressive networks with exogenous input (NARX) to obtain GWL forecasts for several wells in southwest Germany. They determined the input and feedback delays by applying Seasonal and Trend decomposition using Loess on the temperature and rainfall time series and using auto- and cross-correlation functions on the remainders to choose significant time lags.

Currently, to improve the quality of forecasts (also the GWL), it is becoming increasingly common to create hybrids of predictive models combining several techniques or to use data preprocessing before implementing the models. It increases the efficiency of the model by combining the unique features of the constituent models to capture different patterns in the data (Suryanarayana et al. 2014). Adamowski & Chan (2011) hypothesize that introducing additional meteorological predictors, except for past hydrological time series, improves the prediction performance. Moreover, it is an attempt to answer the question if predictor selection may be universal for neighbouring wells or if every well should require an individual set of predictors.

The use of intelligent techniques for GWL forecasting is not a new phenomenon; however, the overwhelming majority of authors build predictive models based on relatively short time series of hydrological and meteorological parameters (usually less than 10 years), which are not sufficient to assess the response of groundwater to contemporary climate change.

Nash & Sutcliffe (1970) and Wheater et al. (1993) indicated that there is a need for including both simplicity and lack of duplication in the model structure as the most important principles of building models with optimized parameters. Driven by a motivation to use an IVS technique that has less computational burden than other methods in this area, the Hellwig method, which requires only a calculation of the correlation matrix between predictors and the output variable, was tested.

The purpose of the study was to assess how many and which lags in monthly temperature and precipitation as well as GWL time series will provide the most accurate GWL prediction for three wells located in the northern part of Greater Poland Province on the basis of data from the multiyear period 1975 to 2014. To find the optimal combination of hydrological and meteorological explanatory variables for GWL prediction modelling, the Hellwig method was used. To the best of the authors' knowledge, it is the first application of such methodology to GWL time-series forecasting approach apart from that of Kajewska-Szkudlarek & Łyczko (2021). Hydrological and basic meteorological time series (temperature and precipitation) were taken into account in the present research as the factors most influencing the GWL in areas with water scarcity. Another point examined in this study was to verify the hypothesis that introducing additional meteorological predictors, except for past hydrological time series, improves the prediction performance. Moreover, it is an attempt to answer the question if predictor selection may be universal for neighbouring wells or if every well should require an individual set of predictors.

The Greater Poland Province is the most vulnerable to water scarcity in the entire country. A great deal of local research is available on water problems in this province. Zieliński (2015) assessed the functioning of field farms from communes particularly threatened by agricultural drought and farms from other communes in that area. The author stated that farms suffering from drought were characterized by a greater labour input per 1 ha of used agriculture area (UAA), had smaller areas of UAA, and had worse economic conditions. Boczoń et al. (2016) studied the soil drought that occurred in 2015 in forests in almost the whole of Poland. The drought with the longest duration affected mostly Greater Poland, where a lack of water available to trees was mainly observed. This resulted in slow growth, a decrease in health status, and dying trees. Igliński & Skrzatek (2018) attempted to ask the question whether small hydropower plants could stop land stepping, while Nowak & Ptak (2018) focused on the important role of lakes as a component of small retention in that region. Waligórski et al. (2018) analysed reservoirs in the province in terms of their usefulness in water management. Kubicz & Bąk (2018) researched the effect of precipitation on the GWL and the impact of meteorological drought on the hydrogeological GWL in that area in the period 1981–2015.

Such a long database used in the present research (almost 40 years), describing GWL fluctuations and their predictions, provides information about groundwater resources’ response to contemporary climate change in areas threatened by water deficits. Moreover, the obtained results could support the process of groundwater resource management in environmental engineering from a long-term perspective.

The novelty in this research lies in the use of the Hellwig method for achieving the best combination of inputs in GWL modelling selection as well as defining GWL dependence on meteorological and hydrological conditions within the last 6 months. The main limitation is the low-resolution (monthly) data used, because sustainable groundwater management requires more specific (daily) results as well. In future research, the authors are planning to complement their studies with a short-term (higher-resolution) analysis of GWL variability on individual days and to attempt to answer the question about uncertainty in groundwater time-series modelling.

Research area

GWL data were collected from three wells (I/170/4, II/401/1, II/430/1) located in the northern part of Greater Poland Province, where the largest water deficits occur (Bąk 2003; Kubicz & Bąk 2018) (Figure 1).

Figure 1

Research area location.

Figure 1

Research area location.

Close modal

Monthly GWL time series in the first well covered the period 1975–2014, while in the other two sites, it was 1980–2014. These measuring points belong to the observation and research network of the Polish Geological Institute – National Research Institute (PIG–PIB). They are located in Borówiec (52°16′N, 17°02′E) – I/170/4, Ujście (53°04′N, 16°44E) – II/401/1, and Bęglewo (52°52′N, 16°11′E) – II/430/1. Summary information about the analysed wells is presented in Table 1.

Table 1

Summary information about the analysed wells (Polish Hydrogeological Survey 2019)

Well numberI/170/4II/401/1II/430/1
Site Borówiec Ujście Bęglewo 
Hydrogeological region (Paczyński & Sadurski 2007Middle Vistula Region – Lowlands Subregion Warta River Region – Lowlands Subregion Warta River Region – Lowlands Subregion 
Coordinates (WGS 84) 52°16′N, 17°02′E 53°04′N, 16°44′E 52°52′N, 16°11′E 
Type of research point Piezometer Drilled well Drilled well 
Type of aquifer Confined Unconfined Confined 
Stratigraphy Quaternary Quaternary Quaternary 
Lithology Sands and gravels Sands Sands 
Start of records 1975 1980 1980 
End of records 2014 2014 2014 
Well depth (m) 50 30 27.5 
Depth of aquifer top (m) 28 13 23 
Depth of aquifer bottom (m) 46 >30 >27.5 
WGW(1991–2015)a (m) 6.5 12.8 1.53 
SGW (1991–2015)b (m) 7.58 13.7 2.93 
NGW(1991–2015)c (m) 8.55 14.8 3.65 
Well numberI/170/4II/401/1II/430/1
Site Borówiec Ujście Bęglewo 
Hydrogeological region (Paczyński & Sadurski 2007Middle Vistula Region – Lowlands Subregion Warta River Region – Lowlands Subregion Warta River Region – Lowlands Subregion 
Coordinates (WGS 84) 52°16′N, 17°02′E 53°04′N, 16°44′E 52°52′N, 16°11′E 
Type of research point Piezometer Drilled well Drilled well 
Type of aquifer Confined Unconfined Confined 
Stratigraphy Quaternary Quaternary Quaternary 
Lithology Sands and gravels Sands Sands 
Start of records 1975 1980 1980 
End of records 2014 2014 2014 
Well depth (m) 50 30 27.5 
Depth of aquifer top (m) 28 13 23 
Depth of aquifer bottom (m) 46 >30 >27.5 
WGW(1991–2015)a (m) 6.5 12.8 1.53 
SGW (1991–2015)b (m) 7.58 13.7 2.93 
NGW(1991–2015)c (m) 8.55 14.8 3.65 

aMaximum GWL in a long-term period; the minimum value of the depth to water table.

bAverage GWL in a long-term period; the arithmetic mean of all measured values of the depth to water table.

cMinimum GWL in a long-term period; the maximum value of the depth to the water table.

The source of the meteorological data was the IMGW-PIB. These data included precipitation and air temperature time series covering the same period. Precipitation time series were obtained from Trzcianka (53°02′N, 16°27′E) and Mosina (52°14′N, 16°50′E), while maximum temperatures were obtained from Szamotuły (52°36′N, 16°34′E) and Kórnik (52°14′N, 17°05′E) (Figure 1).

According to the Institute of Meteorology and Water Management – National Research Institute (IMGW-PIB) measurements, the mean annual air temperature in the province in the period 1991–2020 was about 9.0 °C (9.0 °C in spring, 19.0 °C in summer, 9.0 °C in autumn, and 0.0 °C in winter). In the case of precipitation, it was 550 mm (120 mm in spring, 200 mm in summer, 120 mm in autumn, and 110 mm in winter). The mean maximum temperature for the period 1991–2020 was 28.5 °C, while the minimum was −8.5 °C. The driest years when precipitation was lower than 400 mm were 1992, 2003, 2015, and 2018, and the wettest years with an annual rainfall of about 700 mm were 2010, 2016, and 2017 (klimat.imgw.pl; Szyga-Pluta 2018). On the basis of long-term meteorological data measured since the mid-19th century, the Greater Poland Province was already classified in the first half of the 20th century as the warmest region in Poland in terms of thermal conditions and as the driest in terms of precipitation with the lowest annual rainfall of 450–500 mm (Bąk 2003).

Before starting the analysis, the hydrological and meteorological data were standardized.

Input variables’ selection

The literature indicates that past GWLs (Shiri & Kisi 2011; Suryanarayana et al. 2014; Chitsazan et al. 2015; Yadav et al. 2017), as well as rainfall and temperature time series, are the best predictors for GWL forecasting models (Jalalkamali et al. 2011). This is because these meteorological parameters are simple to measure and widely available and, thus, when chosen as predictors, they make the selected approach easily transferable (Wunsch et al. 2018). For achieving the optimal combination of explanatory variables’ selection, the Hellwig method, also known as the method of information capacity indicators (Hellwig 1968), was used. Although this method has not yet been used in GWL modelling (except for Kajewska-Szkudlarek & Łyczko 2021), its high efficiency is confirmed in the literature on the basis of a comparison with other variable selection techniques (Omiotek et al. 2019; Wójcik-Leń et al. 2019).

The full range of potential inputs included 20 time series: six for GWL (from previous to 6 months before actual month) as well as seven for rainfall and maximum temperature (from the actual month and from 1 to 6 months before). The number of all potential predictor combinations resulted from the formula
(1)
where m is the number of input variables and S is the number of combinations.

It is equalled as follows: 63 (GWL time series) and 127 (rainfall time series) as well as 127 (temperature time series), which is 317 for each well. Creating 317 models for each piezometer will involve a great deal of work and will be time-consuming, and, thus, the Hellwig method was used to create the rank of potential combinations which could serve as predictors to the forecasting models. The idea behind this method is to use input variables strongly correlated with the target variable and at the same time weakly dependent with each other. In addition, there is a numerical criterion, the so-called integral capacity of the combination of information carriers. The information carriers are all explanatory variables.

For all received combinations, the individual capacity of the information carriers was defined according to the following formula:
(2)
where j is the variable number in the combination under consideration, rj is the correlation coefficient of the potential explanatory variable number j with the explained variable (element of the vector of linear correlation coefficients between the explanatory variables and the explained variable R0), and rij is the correlation coefficient between the ith and the jth potential explanatory variables (elements of the correlation coefficients matrix between the potential explanatory variable R).

The HQ measures the magnitude of the information that variable Xj gives about variable Y in combination. When rj increases, HQ also increases, whereas it decreases when the variable Xj is correlated more with the other explanatory variables.

After calculating the individual capacities of the information carriers for all possible combinations, the integral capacity of the information carriers was calculated according to the formula
(3)
where n is the number of predictors, j is the variable number in the combination under consideration, and hj is the individual capacity of the information carriers.

The integral capacity of the information carriers for the combination is the sum of the individual capacities of the information carriers, which belong to this combination. It is the criterion for choosing the appropriate combination of explanatory variables, and the chosen combination is the one with the highest HQ (Hellwig 1968).

To validate the predictor selection made with the Hellwig technique, the regression method was used. It is aimed at assessing the relationship between predictors and a target. Regression methods make it possible to find factors that have a statistically significant impact on the output variable and to select the linear model of the best quality. The general form of the regression model can be written using the following relationship:
(4)
where Y is the output variable (target), X is the input variable (predictors), β is the coefficient, unknown, necessary to estimate, and ε is the random variable affected by the immeasurability of some factors, a lack of knowledge about them, or the inaccuracy of measurement.

The multiple linear general regression model (GRM) was used in this research for comparison with the results of the Hellwig method. The same hydrological and meteorological time series were used in variable selection as well as the regression modelling process. The input variable was the GWL in a given well, which was modelled on the basis of the above-mentioned predictors. The progressive and reverse stepwise and best-subset regression methods were used for IVS with which the best quality model was obtained.

Jalalkamali et al. (2011) also looked for the best combination between past GWL, temperature, and rainfall time series; however, they used only data from the current month and delayed by one month. In the present study, the authors hypothesized that such delay is not sufficient in the Greater Poland Province (in fact, the groundwater is relatively shallow; however, the analysed wells are characterized by a different type of aquifer), and, therefore, they proposed a 6-month lag during the creation of the input data sets. However, now the challenge was to find the optimal selection of precipitation and temperature data (monthly time series), which could serve as additional predictors apart from past GWL data.

GWL modelling

The methodology consists in using support vector regression (SVR) to predict GWL fluctuations. This technique is considered accurate not only in GWL modelling but also in other environmental approaches. It refers to both a single model and a combination (hybrid) of several models and techniques (Shabib et al. 2018; Liu et al. 2019).

SVR, which is an element of the SVM method, was developed to solve problems related to regression. Both SVM and SVR allow to find a hyperplane that reflects the nature of the data in the best way, and they control whether it meets the binding requirements of correct element classification for SVM and computational error in a specific range for SVR. Its aim is to find a function with at most ε deviation from the obtained targets for training data that are as flat as possible (Elshorbagy et al. 2010). For SVM, the so-called kernel trick is used as a standard approach that allows to introduce nonlinearity to the model and, thus, to improve its ability to cope with data that are not separated in a linear way (Cortes & Vapnik 1995). The nonlinear nature of these data may be captured as a result of using radial basis function (RBF). For SVR, the values of the coordinates of the vector w and the value of the absolute term b are found by solving the following optimization problem:

min ||w|| maintaining
(5)
where ||w|| is the length of vector w, w is the weight vector determining the hyperplane that separates points belonging to different classes, b is the absolute term determined during optimization, xi is the set of data belonging to two classes defined by yi variables, yi is the actual value corresponding to xi input vector, and is the non-optimized method parameter that defines the acceptable level of error, i.e., of the difference between the predicted values and those that exist in the teaching data.

The characteristics of the created SVR models are presented in Table 2. In the present study, the GWL time series were predicted with the use of SVR supported by a RBF algorithm. The RBF was determined by the gamma parameter γ (width of the kernel function). The parameters of the SVR model that defined the width of the margin of trust and the maximum value of weight that may be determined for the given vector were ε and C. Optimization of the hyperparameters C, ε, and γ was conducted using the cross-validation method. First, the models were created using only GWL predictors (a), and then temperature and rainfall input variables were added to determine if the quality of the model changes after adding meteorological parameters (b).

Table 2

SVR models’ characteristics

SVRParameters
No. of support vectorsThe kernel
Cεγ
a) Only GWL predictors 
 I/170/4 10.0 0.100 1.000 117 RBF 
 II/401/1 10.0 0.100 0.500 93 
 II/430/1 10.0 0.100 1.000 63 
b) All predictors 
 I/170/4 10.0 0.100 0.111 77 RBF 
 II/401/1 10.0 0.100 0.091 97 
 II/430/1 10.0 0.100 0.091 48 
SVRParameters
No. of support vectorsThe kernel
Cεγ
a) Only GWL predictors 
 I/170/4 10.0 0.100 1.000 117 RBF 
 II/401/1 10.0 0.100 0.500 93 
 II/430/1 10.0 0.100 1.000 63 
b) All predictors 
 I/170/4 10.0 0.100 0.111 77 RBF 
 II/401/1 10.0 0.100 0.091 97 
 II/430/1 10.0 0.100 0.091 48 

To avoid the phenomenon of overfitting, the data set was divided into two subsets: training and testing, which were about 75 and 25%, respectively. The training set covered the period from 1975/1980 (depending on the well) to April 2006 and the testing set from May 2006 until the end of 2014. For I/170/4, the training subset consisted of 372 elements (the average GWL was 7.36, while σ = 0.523) and a testing of 104 (mean value 7.42 and standard deviation 0.306). In the case of II/401/1, the training set was of 310 values (the GWL average was 13.57 and σ = 0.347), whereas testing counted 104 (the mean GWL value was 13.72, σ was 0.332). For II/430/1, it was 307 values in learning (2.96 and 0.262, respectively) and 104 in testing (2.84 and 0.330, respectively) subsets.

To verify the results obtained, the SVR GWL models were compared with the most commonly used ANN type: multilayer perceptron (MLP). For MLP, the BFGS (Broyden–Fletcher–Goldfarb–Shannon) learning algorithm is the most frequently used, with a sum of squares (SOS) error function. For each well, from many created neural networks, the one with the best quality according to the SOS error was chosen to be compared with the SVR model. Different MLP models with differing architectures were made: (a) MLP 1-6-1, MLP 2-5-1, and MLP 1-6-1, where the numbers mean the number of neurons in input, hidden, and output layers; (b) MLP 9-7-1, MLP 11-12-1, and MLP 11-12-1, respectively, for I/170/4, II/401/1, and II/430/1 (Table 3). In ANN modelling, the same input variables (the best combination of predictors according to the Hellwig method presented in Table 4a) were used as in SVR models.

Table 3

MLP models’ characteristics

MLPNo. of neurons
No. of learning epochsActivation function of neurons
Input layerHidden layerOutput layerHidden layerOutput layer
a) Only GWL predictors 
 I/170/4 44 tanh Logistic 
 II/401/1 57 Exponential tanh 
 II/430/1 20 tanh Linear 
b) All predictors 
 I/170/4 40 tanh Exponential 
 II/401/1 11 12 91 Logistic Logistic 
 II/430/1 11 12 13 Exponential Exponential 
MLPNo. of neurons
No. of learning epochsActivation function of neurons
Input layerHidden layerOutput layerHidden layerOutput layer
a) Only GWL predictors 
 I/170/4 44 tanh Logistic 
 II/401/1 57 Exponential tanh 
 II/430/1 20 tanh Linear 
b) All predictors 
 I/170/4 40 tanh Exponential 
 II/401/1 11 12 91 Logistic Logistic 
 II/430/1 11 12 13 Exponential Exponential 

Activation function of neurons in the input layer was linear in all cases.

Table 4

Best combination of predictors for GWL modelling

Well numberGWL (m)
(a) Hellwig method 
 I/170/4 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/401/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/430/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 P (mm) 
 I/170/4 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/401/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/430/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 T (oC) 
 I/170/4 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/401/1 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/430/1 T, T-1, T-2, T-3, T-4, T-5, T-6 
(b) General regression method 
 I/170/4 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/401/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/430/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 P (mm) 
 I/170/4 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/401/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/430/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 T (oC) 
 I/170/4 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/401/1 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/430/1 T, T-1, T-2, T-3, T-4, T-5, T-6 
Well numberGWL (m)
(a) Hellwig method 
 I/170/4 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/401/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/430/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 P (mm) 
 I/170/4 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/401/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/430/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 T (oC) 
 I/170/4 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/401/1 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/430/1 T, T-1, T-2, T-3, T-4, T-5, T-6 
(b) General regression method 
 I/170/4 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/401/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 II/430/1 GWL-1, GWL-2, GWL-3, GWL-4, GWL-5, GWL-6 
 P (mm) 
 I/170/4 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/401/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 II/430/1 P, P-1, P-2, P-3, P-4, P-5, P-6 
 T (oC) 
 I/170/4 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/401/1 T, T-1, T-2, T-3, T-4, T-5, T-6 
 II/430/1 T, T-1, T-2, T-3, T-4, T-5, T-6 

Bold values denote the best combination of predictors obtained by the Hellwig method.

The SOS error function was used to assess the current quality of the MLP networks in the learning process. The error function measures the consistency of network prediction with the defined value and it is used to determine the size of necessary neuron weight changes in each iteration. The learning process strives to minimize the size of the SOS error:
(6)
where N is the number of instances (input–output) pairs used for learning, yi is the network prediction (network output), and ti is the observed value (output based on data) for the ith instance.

The higher the difference between the actual and the predicted values, the larger the error and the more adjustment is required for the network (Dell Inc. 2016).

To eliminate the excessive adjustment phenomenon in SVR and MLP alike, the data were divided into learning and testing subsets.

Models’ performance metrics

Models’ quality assessment was based on the value of squared correlation (r2) for the actual and prognosed values, MSE as well as the MAE of the models according to the following formulas. Modelling was carried out using the Matlab and Statistica programs.
(7)
(8)
(9)
where n is the number of observations, xi is the predicted data, is the average value of predicted data, yi is the observed data, and is the average value of observed data.

Input variables’ selection

To select the input variables for the SVR model, the Hellwig method was used to create the ranking of the predictors’ combinations. The best combinations for the three analysed wells are presented in Table 4a. Within each parameter (GWL, P, and T), the selected ones were marked in bold. The best combinations of input variables were GWL-1, P-1, P-2, P-3, P-4, P-5, and P-6 and T-1 and T-4 for I/170/4. In case the of II/401/1, it was a combination of GWL-1 and GWL-2, all P predictors, and T-3 and T-6. For II/430/1, it was a combination of GWL-1, all P input variables, and T-1, T-3, and T-6. The set of predictors created in this way served as an input to the SVR and MLP models. The validation of the above IVS method with general regression models (progressive, reverse, and the best-subset regression) provided different results (Table 4b). In that case, the role of precipitation in the model was significantly smaller and the temperature lags taken into account were unlike, especially for II/430/1, where temperature was not included in the predictors’ combination.

GWL modelling

At the beginning, SVR modelling was performed only for predictors covering GWL time series (Table 5a). Then, temperature and rainfall time series were added (Table 5b).

Table 5

Prediction results

I/170/4
II/401/1
II/430/1
LearningTestingBothLearningTestingBothLearningTestingBoth
(a) For GWL predictors 
 SVR MSE 0.018 0.022 0.019 0.017 0.013 0.016 0.006 0.005 0.006 
MAE 0.104 0.111 0.106 0.093 0.082 0.090 0.055 0.059 0.056 
r2 0.927 0.903 0.920 0.866 0.897 0.873 0.925 0.939 0.928 
 MLP           
MSE 0.019 0.016 0.019 0.014 0.022 0.016 0.006 0.005 0.006 
MAE 0.107 0.097 0.105 0.085 0.101 0.089 0.057 0.051 0.056 
r2 0.918 0.933 0.925 0.885 0.838 0.861 0.923 0.942 0.933 
 GRM           
MSE 0.022 0.021 0.022 0.015 0.019 0.017 0.009 0.011 0.010 
MAE 0.112 0.100 0.106 0.091 0.100 0.096 0.062 0.071 0.067 
r2 0.913 0.921 0.917 0.883 0.830 0.857 0.926 0.941 0.934 
(b) For GWL, rainfall, and temperature predictors 
 SVR MSE 0.009 0.014 0.010 0.015 0.018 0.016 0.004 0.004 0.004 
MAE 0.071 0.088 0.076 0.088 0.093 0.089 0.044 0.047 0.045 
r2 0.964 0.937 0.958 0.877 0.864 0.871 0.958 0.960 0.958 
 MLP           
MSE 0.009 0.009 0.009 0.011 0.020 0.013 0.004 0.003 0.004 
MAE 0.072 0.068 0.071 0.076 0.091 0.080 0.043 0.045 0.044 
r2 0.961 0.963 0.961 0.908 0.858 0.883 0.953 0.959 0.956 
 GRM           
MSE 0.006 0.010 0.008 0.014 0.018 0.016 0.002 0.006 0.004 
MAE 0.082 0.086 0.084 0.091 0.093 0.092 0.050 0.056 0.053 
r2 0.964 0.966 0.965 0.860 0.880 0.870 0.953 0.959 0.956 
I/170/4
II/401/1
II/430/1
LearningTestingBothLearningTestingBothLearningTestingBoth
(a) For GWL predictors 
 SVR MSE 0.018 0.022 0.019 0.017 0.013 0.016 0.006 0.005 0.006 
MAE 0.104 0.111 0.106 0.093 0.082 0.090 0.055 0.059 0.056 
r2 0.927 0.903 0.920 0.866 0.897 0.873 0.925 0.939 0.928 
 MLP           
MSE 0.019 0.016 0.019 0.014 0.022 0.016 0.006 0.005 0.006 
MAE 0.107 0.097 0.105 0.085 0.101 0.089 0.057 0.051 0.056 
r2 0.918 0.933 0.925 0.885 0.838 0.861 0.923 0.942 0.933 
 GRM           
MSE 0.022 0.021 0.022 0.015 0.019 0.017 0.009 0.011 0.010 
MAE 0.112 0.100 0.106 0.091 0.100 0.096 0.062 0.071 0.067 
r2 0.913 0.921 0.917 0.883 0.830 0.857 0.926 0.941 0.934 
(b) For GWL, rainfall, and temperature predictors 
 SVR MSE 0.009 0.014 0.010 0.015 0.018 0.016 0.004 0.004 0.004 
MAE 0.071 0.088 0.076 0.088 0.093 0.089 0.044 0.047 0.045 
r2 0.964 0.937 0.958 0.877 0.864 0.871 0.958 0.960 0.958 
 MLP           
MSE 0.009 0.009 0.009 0.011 0.020 0.013 0.004 0.003 0.004 
MAE 0.072 0.068 0.071 0.076 0.091 0.080 0.043 0.045 0.044 
r2 0.961 0.963 0.961 0.908 0.858 0.883 0.953 0.959 0.956 
 GRM           
MSE 0.006 0.010 0.008 0.014 0.018 0.016 0.002 0.006 0.004 
MAE 0.082 0.086 0.084 0.091 0.093 0.092 0.050 0.056 0.053 
r2 0.964 0.966 0.965 0.860 0.880 0.870 0.953 0.959 0.956 

The relation between the observed and the predicted data indicated a strong dependence between them, which was confirmed by the high values of the squared correlation r2 (Table 5a and b, Figure 2). In the case of GWL predictors and SVR modelling, in the testing subset, the best model's fitting to real values (r2 = 0.939) was assessed for II/430/1, while the worst (r2 = 0.897) was assessed for II/401/1. Moreover, the lowest MSE value (0.005) was received for II/430/1 and the highest (0.022) for I/170/4. Compared with neural networks predictions (MLP), the best model's performance (r2 = 0.942, MSE = 0.005) was observed for II/430/1 and the worst (r2 = 0.838, MSE = 0.022) for II/401/1. The results obtained with SVR were slightly better than those obtained with MLP (Table 5a).

Figure 2

Modelling results for (a) I/170/4, (b) II/401/1, and (c) II/430/; left: relation between the observed values of GWL (m) and those predicted by SVR; right: residual histogram of the SVR GWL (m) model.

Figure 2

Modelling results for (a) I/170/4, (b) II/401/1, and (c) II/430/; left: relation between the observed values of GWL (m) and those predicted by SVR; right: residual histogram of the SVR GWL (m) model.

Close modal

The received results indicate that adding temperature and precipitation time series as predictors slightly improves the models’ quality. In the testing subsets, for SVR forecasting, the highest (r2 = 0.960, MSE = 0.004) was in the case of II/430/1 and the lowest (r2 = 0.864, MSE = 0.018) was for II/401/1. The results obtained with the MLP models were relatively poor (Table 5b).

The prediction results obtained with the GRM models created on the basis of predictor combinations according to regression methods were very close to those obtained with SVR and MLP models (similar MSEs, MAEs and r2s). All the above-mentioned techniques that provided GWL time-series forecasts demonstrated high efficiency.

Comparing the model's performance with the literature, it can be stated that the presented approach is reasonable and provides high-quality prediction results without using complicated techniques. MSE errors in the range of 0.003–0.022 correlate to those in Chitsazan et al. (2015) (0.004–0.052), Yadav et al. (2017) (0.0028–0.54), and Jalalkamali et al. (2011) (0.0004–0.076). The modelling results were better than those obtained by Sivapragasam et al. (2015), who created a GVL prediction model with r2 not higher than 0.82 (0.838–0.963) for the test subsets of ANN and SVR models in the current research. On the other hand, the presented performance (MSE error) is comparable to that of hybrid models combining prediction methods created by Suryanarayana et al. (2014): 0.039–0.285, Adamowski & Chan (2011): 0.002–0.306, as well as Sahoo et al. (2017): error in the test set is less than 2.0.

Figure 2 only shows the testing subsets of the SVR models created on the basis of GWL, rainfall, and temperature predictors. In I/170/4 and II/401/1 wells, the frequency distribution of SVR residues is similar in terms of negative-value domination, which means that the models overestimated the real GWL values. This research result converges with that of Yadav et al. (2017), Shiri & Kisi (2011), Suryanarayana et al. (2014), all of whom indicate a predominance of predicted values over the observed ones. In the case of I/170/4, the largest number of (46 and 30, respectively) intervals of differences between observed and predicted values were [−0.1; 0.0) and [−0.2; −0.1) (Figure 2(a)). For II/401/1, there were only about 13 positive differences (in cumulative classes from 0.0 to 0.4), while the largest were [−0.1; 0.0) and [−0.2; −0.1) where 33 and 37 events fell into (Figure 2(b)), respectively. In comparison with the above results, the histogram for II/430/1 showed that GWL residues were significantly lower and the most frequently (40) interval was [0.00; 0.05) (Figure 2(c)).

The course of the time series for all analysed wells indicated that the predictions clearly reflected reality in terms of GWL monthly fluctuations (Figure 3(a)–3(c)). These results confirmed the forgoing findings that the models obtained for I/170/4 and II/430/1 were more adjustable to the observed data than for II/401/1, but all received results indicated that SVR forecasts were characterized by high predictive ability.

Figure 3

Observed and SVR predicted GWL time series for: (a) I/170/4, (b) II/401/1, and (c) II/430/1.

Figure 3

Observed and SVR predicted GWL time series for: (a) I/170/4, (b) II/401/1, and (c) II/430/1.

Close modal

Long-term GWL forecasting supports optimization in water resource planning and management as well as assessing the risk to drinking water supply, particularly in areas with water scarcity. In this area, data-driven models using machine learning techniques are becoming increasingly popular and have great potential as they can be an alternative for multidimensional physics-based models. In the present study, SVR and MLP techniques, which are computational intelligence methods, were used to predict the monthly GWL time series in three wells located in the northern part of Greater Poland Province on the basis of the measurement period from 1975 to 2014. The main aim of the research was to find the optimal combination of predictors between hydrological and meteorological time series.

The obtained results indicated that two-stage approach to monthly GWL forecasting that combines the Hellwig method for input variables’ selection with machine learning (SVR and ANN) modelling is reasonable and improves the quality of prediction and minimizes error values. According to the Hellwig method, the most important predictors for the three wells are: GWL time series at lags −1 and −2, all precipitation from the current month, and delayed from 1 to 6 months, as well as −1, −3, −4, and −6 temperature lags. A comparison of the obtained results with general regression models (progressive, reverse, and the best-subset regression) provided different results. In this case, the most significant predictors for I/170/4 were GWL-1, P, P-1, P-3, P-4, T, T-2, and T-6 for II/401/1 GWL-1, GWL-2, GWL-5, P, and T-5, and GWL-1, P, P-1, and P-3 for II/430/1. Furthermore, with reference to the hypotheses stated in the Introduction, adding rainfall and temperature data while modelling provides better results (however marginally) than using only past GWL time series, whereas ANN (MLP) gives generally slightly worse forecasts than SVR. Additionally, each GWL measuring point needs an individual approach in terms of optimal input variables’ selection, which plays a crucial role in GWL time-series prediction.

High values of the model's quality measures like MSE errors in the range of 0.003–0.022 and r2 more than 0.8 were found to be acceptable in the field of regional hydrology in areas threatened by water deficits. Moreover, the created models showed good prediction ability for high and low GWL values. The GRM method using different set of predictors for each well provided predictive models’ accuracy very close to SVR and MLP based on input combinations according to the Hellwig method. However, the second approach required less computational effort (calculation correlation between inputs and target).

The main advantage of the proposed method is that it is simple and its implementation requires access only to basic hydrological and meteorological data. It may find application in groundwater management and planning, especially in actual conditions related to observed climate change and water deficit threats.

The research was performed in, and supported by, the Institute of Environmental Engineering Wrocław University of Environmental and Life Sciences. This research did not receive any specific grants from funding agencies in the public, commercial, or not-for-profit sectors. The source of the hydrological data is Polish Geological Institute – National Research Institute (PIG–PIB). The source of the meteorological data is the Institute of Meteorology and Water Management – National Research Institute (PIB). Data from the Institute of Meteorology and Water Management – National Research Institute have been processed.

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Data cannot be made publicly available; readers should contact the corresponding author for details.

Adamowski
J.
&
Chan
H. F.
2011
A wavelet neural network conjunction model for groundwater level forecasting
.
Journal of Hydrology
407
(
1–4
),
28
40
.
doi:10.1016/j.jhydrol.2011.06.013
.
Amaranto
A.
,
Munoz-Arriola
F.
,
Corzo
G.
,
Solomatine
D. P.
&
Meyer
G.
2018
Semi-seasonal groundwater forecast using multiple data-driven models in an irrigated cropland
.
Journal of Hydroinformatics
20
(
6
),
1227
1246
.
doi:10.2166/hydro.2018.002
.
Bąk
B.
2003
Climatic conditions of the Greater Poland and Kujawy regions
.
Woda-Środowisko-Obszary Wiejskie
3
,
11
38
(in Polish)
.
Boczoń
A.
,
Kowalska
A.
,
Dudzińska
M.
&
Wróbel
M.
2016
Drought in Polish forests in 2015
.
Polish Journal of Environmental Studies
25
(
5
),
1857
1862
.
doi:10.15244/pjoes/62797
.
Chitsazan
M.
,
Rahmani
G.
&
Neyamadpour
A.
2015
Forecasting groundwater level by artificial Neural Networks as an alternative approach to groundwater modelling
.
Journal of Geological Society of India
85
,
98
106
.
doi:10.1007/s12594-015-0197-4
.
Cortes
C.
&
Vapnik
V.
1995
Support-vector networks
.
Machine Learning
20
(
3
),
273
297
.
doi:10.1007/BF00994018
.
Dell Inc.
2016
Dell Statistica (data analysis software system) 13, software.dell.com.
Elshorbagy
A.
,
Corzo
G.
,
Srinivasulu
S.
&
Solomatine
D. P.
2010
Experimental investigation of the predictive capabilities of data-driven modeling techniques in hydrology – part 1: concepts and methodology
.
Hydrology and Earth System Sciences
14
,
1931
1941
.
doi:10.5194/hess-14-1931-2010
.
Glorian
H.
,
Schmalz
V.
,
Lochyński
P.
,
Fremdling
P.
,
Boernick
H.
,
Worch
E.
&
Dittmar
T.
2018
Portable analyzer for on-site determination of dissolved organic carbon – development and field testing
.
International Journal of Environmental Research and Public Health
15
(
11
),
1
13
.
doi:10.3390/ijerph15112335
.
Hellwig
Z.
1968
On the optimal choice of predictors
. In:
Toward a System of Quantitative Indicators of Components of Human Resources Development
(
Gostkowski
Z.
ed.).
UNESCO
,
Paris
.
Huang
F.
,
Huang
J.
,
Jiang
S. H.
&
Zhou
C.
2017
Prediction of groundwater levels using evidence of chaos and support vector machine
.
Journal of Hydroinformatics
19
(
4
),
586
606
.
doi:10.2166/hydro.2017.102
.
Ibrahim
B.
,
Wisser
D.
,
Barry
B.
,
Fowe
T.
&
Aduna
A.
2015
Hydrological predictions for small ungauged watersheds in the Sudanian zone of the Volta basin in West Africa
.
Journal of Hydrology: Regional Studies
4
(
B
),
386
397
.
doi:10.1016/j.ejrh.2015.07.007
.
Igliński
B.
&
Skrzatek
M.
2018
Will the restoration of small hydropower plants in Wielkopolska and Kujawy stop their steppe. Ekologia i Technika
148
(
1
),
13
18
(in Polish)
.
Iqbal
M.
,
Naeem
U. A.
,
Ahmad
A.
,
Rehman
H.-U.
,
Ghani
U.
&
Farid
T.
2020
Relating groundwater levels with meteorological parameters using ANN technique
.
Measurement
166
,
108163
.
doi:10.1016/j.measurement.2020.108163
.
Jalalkamali
A.
,
Sedghi
H.
&
Manshouri
M.
2011
Monthly groundwater level prediction using ANN and neuro-fuzzy models: a case study on Kerman plain, Iran
.
Journal of Hydroinformatics
13
(
4
),
867
876
.
doi:10.2166/hydro.2010.034
.
klimat.imgw.pl (assessed 25 August 2021)
.
Kubicz
J.
&
Bąk
B.
2018
The reaction of groundwater to several months’ meteorological drought in Poland
.
Polish Journal of Environmental Studies
28
(
1
).
doi:10.15244/pjoes/81691
.
Kubicz
J.
,
Pawełczyk
A.
&
Lochyński
P.
2018
Environmental health risk posed by contamination of the individual water wells
.
Chemosphere
208
,
247
256
.
doi:10.1016/j.chemosphere.2018.05.182
.
Kubicz
J.
,
Lochyński
P.
,
Pawełczyk
A.
&
Karczewski
M.
2021
Effects of drought on environmental health risk posed by groundwater contamination
.
Chemosphere
263
,
128145
.
doi:10.1016/j.chemosphere.2020.128145
.
Kumar
D.
&
Bhattacharjya
R. K.
2021
GRNN model for prediction of groundwater fluctuation in the state of Uttarakhand of India using GRACE data under limited bore well data
.
Journal of Hydroinformatics
108
.
doi:10.2166/hydro.2021.108
.
Liu
W.
,
Guo
G.
,
Chen
F.
&
Chen
Y.
2019
Meteorological pattern analysis assisted daily PM2.5 grades prediction using SVM optimized by PSO algorithm
.
Atmospheric Pollution Research
10
(
5
),
1482
1491
.
doi:10.1016/j.apr.2019.04.005
.
Lochyński
P.
,
Charazińska
S.
,
Łyczkowska-Widłak
E.
&
Sikora
A.
2019
Electropolishing of stainless steel in laboratory and industrial scale
.
Metals
9
(
8
),
1
15
.
doi:10.3390/met9080854
.
Łopucki
R.
&
Kiersztyn
A.
2015
Urban Green space conservation and management based on biodiversity of terrestrial fauna – a decision support tool
.
Urban Forestry & Urban Greening
14
(
3
),
508
518
.
doi:10.1016/j.ufug.2015.05.001
.
Malakar
P.
,
Mukherjee
A.
&
Sarkar
S.
2018
Potential application of advanced computational techniques in prediction of groundwater resource of India
. In:
Groundwater of South Asia
(
Mukherjee
A.
ed.).
Springer Hydrogeology, Springer
,
Singapore
, pp.
643
655
.
Nash
J. E.
&
Sutcliffe
J. V.
1970
River flow forecasting through conceptual models – a discussion of principles
.
Journal of Hydrology
10
,
282
290
.
doi:10.1016/0022-1694(70)90255-6
.
Nowak
B.
&
Ptak
M.
2018
Potential use of lakes as a component of small retention in Wielkopolska. E3S Web of Conferences 44, 00127. doi:10.1051/e3sconf/20184400127
.
Omiotek
Z.
,
Stepanchenko
O.
,
Wójcik
W.
,
Legiec
W.
&
Szatkowska
M.
2019
The use of the Hellwig's method for feature selection in the detection of myeloma bone destruction based on radio graphic images
.
Biocybernetics and Biomedical Engineering
339
,
328
338
.
doi:10.1016/j.bbe.2018.11.008
.
Osman
A. I. A.
,
Ahmed
A. N.
,
Chow
M. F.
,
Huang
Y. F.
&
El-Shafie
A.
2021
Extreme gradient boosting (Xgboost) model to predict the groundwater levels in Selangor Malaysia
.
Ain Shams Engineering Journal
doi:10.1016/j.asej.2020.11.011
.
Paczyński
B.
&
Sadurski
A.
2007
Hydrogeologia regionalna Polski, Tom I – Wody słodkie (Regional Hydrogeology of Poland, Volume I – Fresh Waters)
.
Polish Geological Institute
,
Warsaw
.
Polish Hydrogeological Survey. Polish Geological Institute National Research Institute 2019. Hydrogeological Annual Report. Hydrological year 2018, Warsaw, Poland
Rahman
A. T. M. S.
,
Hosono
T.
,
Quilty
J. M.
,
Das
J.
&
Basak
A.
2020
Multiscale groundwater level forecasting: coupling new machine learning approaches with wavelet transforms
.
Advances in Water Resources
141
,
103595
.
doi:10.1016/j.advwatres.2020.103595
.
Sahoo
S.
,
Russo
T. A.
,
Elliott
J.
&
Foster
I.
2017
Machine learning algorithms for modelling groundwater level changes in agricultural regions of the U.S
.
Water Resources Research
53
,
3878
3895
.
doi:10.1002/2016WR019933
.
Shabib
A.
,
Munir
A.
,
Noureen
H.
,
Salman
B.
,
Iftikhar
A.
&
Zahid
N.
2018
Rainfall prediction in Lahore City using data mining techniques
.
International Journal of Advanced Computer Sciences and Applications
9
(
5
).
doi:10.14569/IJACSA.2018.090439
.
Sharafati
A.
,
Asadollah
S. B. H. S.
&
Neshat
A.
2020
A new artificial intelligence strategy for predicting the groundwater level over the Rafsanjan aquifer in Iran
.
Journal of Hydrology
591
,
125468
.
doi:10.1016/j.jhydrol.2020.125468
.
Shiri
J.
&
Kisi
O.
2011
Comparison of genetic programming with neuro-fuzzy systems for predicting short-term water table depth fluctuations
.
Computer Geosciences
37
,
1692
1701
.
doi:10.1016/j.cageo.2010.11.010
.
Sivapragasam
C.
,
Kannabiran
K.
,
Karthik
G.
&
Raja
S.
2015
Assessing suitability of GP modelling for groundwater level
.
Aquatic Procedia
4
,
693
699
.
doi:10.1016/j.aqpro.2015.02.089
.
Suryanarayana
C.
,
Sudheer
C.
,
Mahammood
V.
&
Panigrahi
B. K.
2014
An integrated wavelet-support vector machine for groundwater level prediction in Visakhapatnam, India
.
Neurocomputing
145
,
321
335
.
doi:10.1016/j.neucom.2014.05.026
.
Szmidt
E.
,
Kacprzyk
J.
,
Bujnowski
P.
2020
Attribute selection via Hellwig's algorithm for Atanassov's intuitionistic fuzzy sets
. In:
Computational Intelligence and Mathematics for Tackling Complex Problems. Studies in Computational Intelligence
(
Kóczy
L.
,
Medina-Moreno
J.
,
Ramírez-Poussa
E.
&
Šostak
A.
, eds).
Springer
,
Cham
, p.
819
,
doi:10.1007/978–3–030–16024–1_11
.
Tsanis
I. K.
,
Coulibaly
P.
&
Daliakopoulos
I. N.
2008
Improving groundwater level forecasting with a feedforward neural network and linearly regressed projected precipitation
.
Journal of Hydroinformatics
10
(
4
),
317
330
.
doi:10.2166/hydro.2008.006
.
Waligórski
B.
,
Sojka
M.
,
Jaskuła
J.
&
Korytowski
M.
2018
Analysis of the use of selected reservoirs in the Wielkopolska Province
.
Annals of Warsaw University of Life Sciences – SGGW Land Reclamation
50
(
4
),
373
385
.
doi:10.2478/sggw-2018-0029
.
Wheater
H. S.
,
Jakeman
A. J.
,
Beven
K. J.
1993
Progress and directions in rainfall-runoff modeling
. In:
Modeling Change in Environmental Systems
(
Jakeman
A. J.
,
Beck
M. B.
&
McAleer
M. J.
, eds).
Wiley
,
Chichester
,
UK
, pp.
101
132
.
Wunsch
A.
,
Liesch
T.
&
Broda
S.
2018
Forecasting groundwater levels using nonlinear autoregressive networks with exogenous input (NARX)
.
Journal of Hydrology
567
,
743
758
.
doi:10.1016/j.jhydrol.2018.01.045
.
Yadav
B.
,
Ch
S.
,
Mathur
S.
&
Adamowski
J.
2017
Assessing the suitability of extreme learning machines (ELM) for groundwater level prediction
.
Journal of Water and Land Development
32
,
103
112
.
doi:10.1515/jwld-2017-0012
.
Zhao
Y.
,
Li
Y.
,
Zhang
L.
&
Wang
Q.
2016
Groundwater level prediction of landslide based on classification and regression tree
.
Geodesy and Geodynamics
7
,
348
355
.
doi:10.1016/j.geog.2016.07.005
.
Zieliński
M.
2015
Economic situation of farms specializing in field crops particularly endangered by agricultural drought in Wielkopolskie Province in the years 2006–2013
.
Woda-Środowisko-Obszary Wiejskie
15
(
4
),
83
92
(in Polish)
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).