This paper examines a series of connected and isolated lakes in the UK as a model system with historic episodes of heavy metal contamination. A 9-year hydrometeorological dataset for the sites was used to analyse the legacy of heavy metal concentrations within the selected lakes based on physico-chemical and hydrometeorological parameters, and to compare the complementary methods of multiple regression, time series analysis, and artificial neural networks (ANNs). The results highlight the importance of the quality of historic datasets, without which analyses such as those presented in this paper cannot be undertaken. The results also indicate that the ANNs developed were more accurate than the other methodologies considered (regression and time series analysis): the ANNs provided a higher correlation coefficient and a lower mean squared error than the regression models. However, quality assurance and pre-processing of the data were challenging and were addressed by transforming the relevant dataset and interpolating the missing values. The selection and application of the most appropriate temporal modelling technique, which relies on the quality of the available dataset, is crucial for the management of legacy contaminated sites, so that mitigation measures can be guided successfully and significant environmental and human health implications avoided.

  • Heavy metal contamination in aquatic ecosystems is a significant environmental concern.

  • A series of connected and isolated lakes were examined as a model system.

  • A 9-year hydrological and meteorological dataset was analysed.

  • Multiple regression, time series analysis, and artificial neural networks were used.

  • The accuracy of ANN models was better than the other methods tested.

Exposure to heavy metals within riverine systems and the potential effects of these metals on aquatic life are a significant environmental concern in many developed countries (Howard et al. 2015; Hurley et al. 2017, 2019; Mistri et al. 2020) and in low- and middle-income countries (Sow et al. 2019; Siddique et al. 2020; Tunde & Oluwagbenga 2020). When heavy metals enter aquatic ecosystems, they may have significant environmental consequences given their toxicity, long-term persistence in the environment and potential for biomagnification within terrestrial and aquatic food chains (Zhou et al. 2008). A wide range of anthropogenic sources are known to lead to metal pollution within the natural environment, including metal mines (Varol & Sen 2012; Alam et al. 2020), industrial wastes/effluents (Muhammad & Ahmad 2020), sewage treatment works (Dukes et al. 2020), inappropriately protected landfill sites (Wuana & Okieimen 2011; Chandrappa & Das 2021), agricultural land (Karaouzas et al. 2021), combusted fossil fuels (Siddique et al. 2020), and atmospheric deposition (Tunde & Oluwagbenga 2020). The most common heavy metals recorded at polluted sites include copper (Cu), zinc (Zn), lead (Pb), chromium (Cr), cadmium (Cd), nickel (Ni), manganese (Mn), and mercury (Hg) (Wuana & Okieimen 2011; Sodango et al. 2018; Karaouzas et al. 2021). Living organisms are affected by these metals in different ways. For example, Cu, Zn, and Mn play essential roles in metabolic functioning and are considered essential for the good health of humans and other organisms (Rainbow 2007). However, when these metal concentrations are elevated above their essential levels, they can have toxic effects (Kouba et al. 2010). There is therefore an urgent need for improved understanding of the associations between trace element concentrations and long-term hydrological and meteorological parameters.

There is significant evidence of historic and current heavy metal pollution in riverine water and sedimentary deposits (Yi et al. 2011; Li et al. 2012; Islam et al. 2014, 2015; Martin et al. 2015; Ahmed et al. 2015; Siddique et al. 2020; Usman et al. 2020; Karaouzas et al. 2021), which has resulted in serious long-term effects on aquatic ecosystems (Turner 2000; Brewer et al. 2005; Armitage et al. 2007; Rowland et al. 2011; Byrne et al. 2012; Howard et al. 2015; Hurley et al. 2017, 2019). Although extensive long-term historical datasets on heavy metal availability in the environment exist within some geographical regions, such as the UK, few studies have attempted to develop predictive analysis tools to forecast or identify the potential cause–effect relationships between metal concentrations in the environment and the abiotic parameters that may govern them (Byrne et al. 2012; Yaho et al. 2016). An exception is Rizal et al. (2023), who developed a hydrological model with the Hydrologic Engineering Center's Hydrologic Modeling System (HEC-HMS) using soil water data and daily rainfall records from 2009 to 2019. The theoretical runoff was calibrated and validated using the Soil Conservation Service curve number (SCS-CN) method with a good coefficient of determination. In the present study, predictive models were developed for forecasting heavy metal concentrations within the Attenborough Nature Reserve (ANR) lakes based on different parameters, combining different modelling techniques in order to conduct a comprehensive environmental risk assessment. This study analysed a long-term historical dataset to identify the key metals of concern, with a view to identifying the temporal patterns and the potential cause–effect relationships. A long-term monthly dataset (2009–2017) for a single site is relatively rare and is therefore exceptionally useful for environmental monitoring purposes.

Water quality models are useful to help guide monitoring activities and predict metal concentrations in the environment; however, an accurate mechanistic model that captures all physico-chemical parameters, hydrometeorological processes and the associations among them is difficult to develop. As a result, it is important to evaluate which approaches can most accurately capture the temporal patterns and trends in key environmental parameters, such as heavy metal concentrations. The accuracy of a temporal modelling technique depends on selecting the most appropriate method for the parameters available, and determining which method or model is the ‘best’ may therefore not be a trivial process (Chatfield 2000).

To address these needs, this paper aims to develop and compare the ability of statistical models (regression and time series analyses) and machine learning models (artificial neural networks (ANNs)) to predict the concentration of total Cu and Zn within freshwater lakes as functions of a range of abiotic parameters. Cu and Zn are the focus of this investigation because a long-term monitoring dataset is available for them and because they are the metals of greatest concern in the study area (see Section 2.1). The feasible alternatives to Zn and Cu were Al, Cd, Cr, Pb, and Ni. However, these were not used for model development because the recorded concentrations were very low. The major advantage of using Cu and Zn over the other metals for temporal modelling was that their data contained less noise and variation, which resulted in better predictive models. In previous research, Dunea & Iordache (2011) used autoregressive integrated moving average (ARIMA) modelling to create time series forecasting models and found that some models performed well for certain parameters (e.g., total suspended solids (TSS)), while the models for other parameters (e.g., pH) were less accurate. Similarly, multiple regression models were found to be accurate in some instances (Yaho et al. 2016) but performed relatively poorly in others (Tipping et al. 2003). In recent years, ANNs have been increasingly used with a high degree of accuracy to predict heavy metal concentrations in soil, waste waters, rivers and ponds (Xu & Liu 2013; Zhou et al. 2015; Elzwayie et al. 2017; Ayaz & Khan 2019). Other methodological approaches considered during the study included random forest (RF), additive regression (AR), and support vector regression (SVR). Our choice of methods was based on the initial analysis of the dataset.

The dataset is relatively simple, displaying weak seasonality and a relatively small number of variables/parameters, and our primary aim was to test whether the simplest machine learning model, such as a single-layer ANN, had an advantage over classical statistical techniques such as multiple regression and ARIMA. The move towards ANN was also motivated by the observation that the relationship between the copper concentration and the independent parameters was non-linear. In addition, as a parametric model, an ANN has a fixed size, while non-parametric models such as support vector machines (SVMs) can become quite large and expensive in terms of time and space consumption. Similarly, with no obvious underlying rules, methods such as decision trees were discounted for the current study. However, these methods could be applied in future studies to determine whether model accuracy and predictive power can be improved.

This paper addresses the following research objectives to identify the most suitable modelling technique, or a combination of techniques, required to (i) characterize the dominant temporal patterns of metal (Zn and Cu) concentrations derived from surface water samples and (ii) develop a series of models capturing the temporal variability of Zn and Cu concentrations in the environment to characterise the potential cause–effect relationships.

Quality assurance and pre-processing of the data were challenging. They were addressed by logarithmically transforming the data to achieve a normal distribution and interpolating the missing values with specific attention to the values before and after the missing data. The successful development and correct selection of a predictive model will help guide future assessment strategies and the identification of potential pollution hotspots that may require the development of appropriate remediation techniques.

Study site description and model variable selection criteria

River Erewash, UK

The River Erewash in the UK has been subject to historic pollution due to former industrial activities, mining and sewage treatment (National Rivers Authority (NRA) 1995). During the early 1990s, the river was recognised as one of the most polluted rivers in the UK (NRA 1995). The existence of heavy metals in historic metal and coal mine waters inflowing to this river was identified as being responsible for the decline in the abundance of aquatic organisms, including fish and invertebrate species (Derbyshire Wildlife Trust 2019). A number of sites on the river had a strong historic mismatch between the biological and chemical classifications of environmental quality (Environment Agency Thames Region 1995). As a result, the river has been extensively studied by the UK's environmental regulatory body, the Environment Agency.

ANR, UK

ANR in the UK was established on the site of a former gravel extraction/aggregate industry in 1966. Figure 1 presents the ANR site in detail. Three of the lakes in ANR, namely (i) Coneries lake, (ii) Tween pond, and (iii) Main pond, which are directly connected to flowing water from River Erewash, were considered in this research. In addition, two other lakes, which are isolated from the flowing water of the river, were examined, namely, (i) Church and (ii) Clifton ponds.
Figure 1

Sampling points used for conducting data analysis at ANR (adapted from Cross et al. 2014).

Due to the historic metal pollution associated with the River Erewash, the hydrologically connected lakes at ANR act as a potential storage area for suspended sediments prior to their entry into the nearby River Trent. In contrast, the lakes that are not connected provide ideal control sites against which to gauge the effectiveness of the connected lakes in storing sediment and any associated pollutants.

In the present study, Cu and Zn concentration data from all five water bodies were used, together with seven abiotic environmental parameters (pH, dissolved oxygen (DO), TSS, temperature, conductivity, flow (river discharge) and precipitation) from one connected lake (Coneries lake; Figure 1) and one isolated lake (Church pond; Figure 1). Discharge draining the River Erewash passes through the connected lakes but does not flow through the isolated waterbodies (Figure 1). Since the historical dataset was only available for the lakes of ANR, it was adopted as a model system for this study. A total of 108 monthly measurements of total Cu and Zn concentrations and the above abiotic parameters were taken at each study site during a 9-year longitudinal study between 2009 and 2017. The total Cu and Zn concentration, pH, DO, TSS, water temperature and conductivity were determined by the University of Nottingham in partnership with Cemex (UK) Ltd, with the data analysis conducted by the Environment Agency for 2009–2015 and by Enitial Consultants, UK for 2016–2017 (Table 1).

Table 1

Key parameters used for temporal modelling

| Name of the dataset | Data type | Water quality characteristics | Temporal resolution | Metal concentration | Temporal resolution |
|---|---|---|---|---|---|
| Appendix Dec 2019 (McGowan & Salgado-Bonnet 2019) | Physico-chemical data | Temperature; Conductivity; DO; pH; TSS | Monthly values between 2005 and 2019 | Cu; Zn | Monthly values between 2005 and 2019 |
| R_Erewash_ Sandiacre_1965–2018 (National River Flow Archive 2020) | River flow data | Gauged daily flow | Daily flow values between 1965 and 2018 | | |
| GHCNdaily_ NottWatnall_ 1960–2017 (GHCN-Daily 2021) | Meteorological data | Precipitation; Maximum stream temperature; Minimum stream temperature | Daily average values between 1960 and 2017 | | |

The monthly measurements of river flow (river discharge, m3 s−1), which were recorded by a multi-path ultrasonic time-of-flight gauge on the River Erewash at Sandiacre (1965–2018), were obtained from the National River Flow Archive (2020). Meteorological data were obtained for the Nottingham Watnall meteorological site (1960–2017) from the Global Historical Climatology Network daily website (GHCN-Daily 2021). All analyses in this study were undertaken using the Statistical Package for Social Sciences (SPSS), MATLAB, and Microsoft Excel.

Data analysis methodologies

Data collection and processing

Before conducting the analyses, quality assurance and pre-processing of the data were undertaken to ensure consistent format and units of the data. Following the quality assurance and interpolation of missing values, summary statistics were derived for each of the variables used in subsequent analyses (Table 2). The results of the statistical analysis indicated that both the Cu (skewness = 3.61, kurtosis = 25.12) and Zn (skewness = 3.04, kurtosis = 14.89) concentration data were positively skewed and not normally distributed. The data for the remaining parameters were distributed normally. Since most statistical methods have been developed for normally distributed data, and normally distributed parameters can therefore be predicted with better accuracy, the Cu and Zn concentration data were logarithmically transformed to achieve a normal distribution.
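The screening step described above can be illustrated with a minimal sketch. It assumes population (moment-based) skewness and excess kurtosis, which may differ slightly from the sample-adjusted values reported by SPSS, and a base-10 logarithm, since the base of the transformation is not stated. The concentration values are hypothetical and are not taken from the study dataset.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the third standardised moment."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardised moment minus 3."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 4 for v in values) / len(values) - 3.0


def log_transform(values):
    """Base-10 log transform (assumed base); needs strictly positive values."""
    return [math.log10(v) for v in values]


# Hypothetical right-skewed concentrations (ug/L), for illustration only.
cu = [2.1, 2.4, 2.8, 3.0, 3.3, 3.9, 4.4, 5.2, 7.5, 16.8]
cu_log = log_transform(cu)

print("raw skewness:", round(skewness(cu), 2))
print("log skewness:", round(skewness(cu_log), 2))
print("raw excess kurtosis:", round(excess_kurtosis(cu), 2))
```

On right-skewed data of this kind the transformed series has a smaller skewness than the raw series, which is the effect the transformation is intended to achieve.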

Table 2

Summary of descriptive data statistics

| Water quality characteristics | Spread of data distribution | Centre of data distribution |
|---|---|---|
| Cu concentration (μg L−1) | Skewness = 3.61; Kurtosis = 25.12; Standard deviation = 1.73; Positively skewed | Mean = 4.26; Max = 16.8; Min = 1.6 |
| Zn concentration (μg L−1) | Skewness = 3.04; Kurtosis = 14.889; Standard deviation = 9.39; Positively skewed | Mean = 13.98; Max = 72.28; Min = 2 |
| Temperature (°C) | Skewness = 0.04; Kurtosis = −1.19; Standard deviation = 5.33; Normal distribution | Mean = 11.55; Max = 22.8; Min = 2.2 |
| Conductivity (μS cm−1) | Skewness = −0.17; Kurtosis = −0.84; Standard deviation = 0.14; Normal distribution | Mean = 0.94; Max = 1.22; Min = 0.62 |
| DO (mg L−1) | Skewness = 2.29; Kurtosis = 9.60; Standard deviation = 5.01; Normal distribution | Mean = 13.26; Max = 41; Min = 6.48 |
| pH | Skewness = 0.26; Kurtosis = −0.23; Standard deviation = 0.58; Normal distribution | Mean = 8.39; Max = 10; Min = 6.85 |
| TSS (mg L−1) | Skewness = 1.56; Kurtosis = 2.96; Standard deviation = 10.37; Normal distribution | Mean = 13.41; Max = 57; Min = 1 |
| Precipitation (mm) | Skewness = 0.83; Kurtosis = 0.44; Standard deviation = 0.99; Normal distribution | Mean = 1.88; Max = 4.67; Min = 0.10 |
| Flow (m3 s−1) | Skewness = 1.62; Kurtosis = 3.12; Standard deviation = 1.12; Normal distribution | Mean = 1.81; Max = 6.68; Min = 0.59 |

Missing data

To improve the robustness of the analysis and model performance, missing values were estimated and imputed (Bennett 2001). The DO data had 18.5% missing values (Table 3), which exceeds the 10% level above which results may be biased (Bennett 2001). This problem was addressed by analysing the variation in the data as a whole, with specific attention to the values before and after each gap. The mean was used to replace missing values for the normally distributed variables, and the median was used for the Cu and Zn data, which were positively skewed.
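A minimal sketch of the imputation rule described above is given here, assuming missing observations are stored as `None`. The function names, the 10% flag and the sample values are illustrative assumptions, and the sketch does not reproduce the additional inspection of values before and after each gap.

```python
from statistics import mean, median


def missing_fraction(series):
    """Share of entries that are missing (None)."""
    return sum(v is None for v in series) / len(series)


def impute(series, skewed=False):
    """Fill gaps with the median for skewed data, otherwise with the mean."""
    observed = [v for v in series if v is not None]
    fill = median(observed) if skewed else mean(observed)
    return [fill if v is None else v for v in series]


# Hypothetical monthly values, for illustration only.
do = [9.8, None, 11.2, 12.5, None, 13.1, 10.4, 9.9]   # treated as normal
cu = [2.1, 2.4, None, 3.0, 16.8]                       # treated as skewed

needs_caution = missing_fraction(do) > 0.10  # more than 10% missing
do_filled = impute(do)
cu_filled = impute(cu, skewed=True)
print(needs_caution, do_filled, cu_filled)
```

Using the median for the skewed series keeps a single extreme value, such as 16.8 in the hypothetical Cu data, from pulling the fill value upwards.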

Table 3

Abiotic parameters with missing values

| Abiotic parameter | Percentage (%) of missing data |
|---|---|
| Conductivity (μS cm−1) | Less than 2% |
| Cu concentration (μg L−1) | Less than 2% |
| Zn concentration (μg L−1) | Less than 2% |
| Temperature (°C) | Less than 2% |
| pH | Less than 2% |
| TSS (mg L−1) | 2.7% |
| DO (mg L−1) | 18.5% |

Approaches used for the identification of the best predictive models

Regression analysis

In the current investigation, 95% confidence interval (CI) data for both metals (Cu and Zn) for all ponds (Coneries, Main, Tween, Church, and Clifton) from ANR were analysed to determine the metal concentration variability in relation to the connectivity with the River Erewash over four seasons (winter: December–February, spring: March–May, summer: June–August, autumn: September–November) based on the climatological year. Bivariate Pearson correlation coefficients and multiple linear regression models (MLRMs) were used to determine the associations and predict metal concentrations from other abiotic environmental parameters (pH, DO, temperature, TSS, river flow, conductivity, and precipitation). All regression analyses were carried out using International Business Machines (IBM) SPSS Statistics 25.
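The seasonal grouping and confidence intervals described above might be computed along the following lines. This is a sketch only: it assumes a normal-approximation interval (z = 1.96) rather than whatever interval SPSS produced, it needs at least two observations per season, and the records are hypothetical.

```python
import math
from statistics import mean, stdev

# Climatological seasons as defined in the text.
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn"}


def seasonal_ci(records, z=1.96):
    """Map (month, value) pairs to {season: (mean, lower, upper)}."""
    groups = {}
    for month, value in records:
        groups.setdefault(SEASONS[month], []).append(value)
    result = {}
    for season, vals in groups.items():
        m = mean(vals)
        half_width = z * stdev(vals) / math.sqrt(len(vals))
        result[season] = (m, m - half_width, m + half_width)
    return result


# Hypothetical (month, concentration in ug/L) records, for illustration only.
records = [(12, 5.0), (1, 6.0), (2, 7.0), (3, 4.0), (4, 4.5),
           (6, 3.0), (7, 3.4), (9, 3.8), (10, 4.2)]
ci = seasonal_ci(records)
for season, (m, lo, hi) in ci.items():
    print(f"{season}: mean={m:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```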

MLRMs have been widely used to determine the associations between metal concentrations and the abiotic environmental parameters in aquatic environments (Chatterjee & Hadi 2015; Schober et al. 2018). Sauvé et al. (1997) reported that some of the predictive models for metal concentrations performed well when only one parameter was included (Yaho et al. 2016) while other models performed better with multiple parameters (Janssen et al. 1997; Tipping et al. 2003; Meers et al. 2005). This research explores the utility of MLRMs in the present scenario and, in particular, their usefulness in comparison with other possible approaches such as time series analysis and ANN. The results are presented in Section 3.1.

Time series analysis

ARIMA modelling is a traditional time series analysis technique that facilitates the analysis of trends in a variable. It may be used to forecast future values (Ayaz & Khan 2019) and has been used for the prediction of heavy metal concentrations in the environment (Liu et al. 2012; Pandey et al. 2015). In the present study, ARIMA was used to characterise the temporal trends and patterns of the selected heavy metals (Cu and Zn) and the associated abiotic parameters individually, which cannot be achieved with regression analysis or ANN. The ARIMA model parameters, namely the autoregressive (AR) term (p), the order of differencing (d) and the moving average (MA) term (q), were selected based on sample partial autocorrelation function (PACF) plots, augmented Dickey–Fuller (ADF) tests and sample autocorrelation function (ACF) plots, respectively (Prabhakaran 2021). The number of lags above the significance line in the PACF and ACF plots determines the AR (p) and MA (q) terms, respectively, and the series was differenced until the ADF test indicated stationarity (Prabhakaran 2021). The best ARIMA models were identified by selecting the parameters that fitted the raw data most closely, checking that the p values for the AR (p) and MA (q) terms were significant, and using the Akaike information criterion (AIC) to quantify the goodness of fit (Prabhakaran 2021).
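Part of this order-selection procedure can be sketched in a few lines. The example below covers only first differencing and the sample ACF with an approximate 95% significance bound of 1.96/√n. It does not implement the PACF or the ADF test, for which a statistics package would normally be used, and the series is synthetic rather than drawn from the study.

```python
import math


def difference(series, d=1):
    """Apply d rounds of first differencing (the 'I' step of ARIMA)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    m = sum(series) / n
    c0 = sum((x - m) ** 2 for x in series)
    return [
        sum((series[t] - m) * (series[t + k] - m) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Lags whose |ACF| exceeds the approximate 95% bound 1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    values = acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(values[k]) > bound]


# Synthetic monthly series: linear trend plus a 12-month cycle (108 points).
y = [0.05 * t + math.sin(2 * math.pi * t / 12) for t in range(108)]
dy = difference(y, d=1)          # differencing removes the linear trend
print(significant_lags(dy, 12))  # candidate lags for the MA (q) term
```

Lags that stand out in the ACF of the differenced series are candidates for the MA term; the same counting would be applied to the PACF for the AR term.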

ANN analysis

ANNs emulate the functioning of the human brain (Gredilla et al. 2013) and are typically composed of three types of layer (an input layer, one or more hidden layers and an output layer) of interconnecting neurons (American Society of Civil Engineers (ASCE) 2000; IBM 2020). Each connection is associated with a weight, and the output of any neuron depends upon the weights of its input connections and its associated activation function (commonly used activation functions are sigmoid, tangent and hyperbolic tangent). The network is trained (i.e., the weights of connections are determined) using available historic data through algorithms that fall broadly under two categories, namely, supervised and unsupervised techniques (Friedel 2014). Elzwayie et al. (2017) stated that ANN is a flexible method that recognises the patterns within the dataset and finds the association between inputs and outputs to predict heavy metal concentrations.

A MATLAB toolbox was used to train and test various ANN configurations to estimate heavy metal concentrations (outputs) from the environmental parameters (inputs). Separate ANNs were created for Cu in the connected and isolated lakes because correlation analysis, an appropriate method for investigating the association between water quality variables, river discharge, and precipitation (Grum et al. 1997; Vervier et al. 1999), indicated that different parameters influenced Cu concentrations in the two lake types. The Zn data were not used for the ANN configurations because the regression and time series analyses indicated that the Zn results were not sufficiently reliable and that the data were noisier, with relatively large variation. As a result, only the ANN results for Cu are presented in the current paper.

The results of the t-test on connected and isolated lakes revealed that the mean values of Cu concentrations for connected and isolated lakes were not equal. Therefore, individual models were created to predict the Cu concentrations in connected and isolated lakes using Equation (1) for an ANN with ‘n’ number of input parameters:
$$Y = \sum_{i=1}^{n} W_i X_i + b \tag{1}$$
where W is the weight of each parameter, X is the input, b is the bias, and Y is the output.
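The way a weighted sum of this kind propagates through a small feedforward network can be sketched as follows. The sketch assumes that Equation (1) takes the usual weighted-sum-plus-bias form, that the hidden layer uses a hyperbolic tangent activation, and that the output neuron is linear. None of these choices is confirmed by the text, and all weights and inputs are hypothetical.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias term."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """One tanh hidden layer followed by a single linear output neuron."""
    hidden = [math.tanh(neuron(inputs, w, b))
              for w, b in zip(hidden_w, hidden_b)]
    return neuron(hidden, out_w, out_b)


# Seven scaled inputs (one per abiotic parameter) and five hidden neurons.
# All values below are hypothetical placeholders, not fitted weights.
x = [0.2, -0.1, 0.4, 0.0, 0.3, -0.5, 0.1]
hidden_w = [[0.1 * (j + 1) / (i + 1) for i in range(7)] for j in range(5)]
hidden_b = [0.0, 0.1, -0.1, 0.05, 0.0]
out_w = [0.3, -0.2, 0.5, 0.1, -0.4]
out_b = 0.2

print(forward(x, hidden_w, hidden_b, out_w, out_b))
```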

Several ANN architectures were tested, and the feedforward ANN using back propagation for error reduction with five neurons in its hidden layer yielded the highest coefficient of correlation (R) values for predicting Cu concentrations within both the connected and isolated lakes. R and mean squared error (MSE) were used as the performance metrics to characterise the accuracy of the models. Alternatives to R and MSE include mean absolute error (MAE), root mean square error (RMSE), relative absolute error (RAE), and root relative squared error (RRSE). Since R and MSE can be more easily interpreted, they were adopted in the current research.
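For reference, the two adopted metrics and two of the listed alternatives can be written out as below. This is a generic sketch using the standard definitions, and the observed and predicted values are hypothetical.

```python
import math


def mse(obs, pred):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(mse(obs, pred))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def correlation_r(obs, pred):
    """Pearson correlation coefficient (R) between observed and predicted."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


# Hypothetical observed and predicted log-concentrations.
observed = [0.32, 0.45, 0.51, 0.60, 0.72]
predicted = [0.35, 0.41, 0.55, 0.58, 0.70]
print(correlation_r(observed, predicted), mse(observed, predicted))
```

R is unitless and bounded between −1 and 1, whereas MSE is expressed in squared units of the modelled variable, so the two are usually read together.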

The back propagation algorithm has the advantage of being simple to adapt, with no additional parameters or functions to tune or learn. However, noise in the data can negatively affect the algorithm's performance and may even lead to convergence to local minima/maxima. The dataset used after pre-processing in the current study was relatively free of noise and major irregularities, which led us to adopt back propagation rather than more sophisticated learning algorithms. Future studies will explore techniques such as evolutionary approaches to improve the prediction accuracy of the back propagation neural network and to ensure global convergence. For training, the initial weights and biases of the model were set to 0 and the number of training iterations was set to a maximum of 100, with an option to halt when the accuracy of the model stopped improving. While different initial weights and biases were not tested, different numbers of epochs were tested and the accuracy did not improve any further, so the maximum number of iterations was fixed at 100.
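The training settings described above (zero initial weights and biases, at most 100 iterations, halting when accuracy stops improving) can be illustrated with a deliberately simplified sketch. It trains a single linear neuron by gradient descent on the MSE rather than the five-neuron hidden-layer network used in the study. The learning rate, the tolerance and the data are assumptions made for illustration.

```python
def train_linear_neuron(X, y, lr=0.1, max_epochs=100, tol=1e-9):
    """Gradient descent on MSE for one linear neuron, with early stopping."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0      # zero initial weights and bias
    best = float("inf")
    history = []
    for _ in range(max_epochs):
        preds = [sum(wi * xi for wi, xi in zip(w, row)) + b for row in X]
        errors = [p - t for p, t in zip(preds, y)]
        loss = sum(e * e for e in errors) / len(y)
        history.append(loss)
        if best - loss <= tol:          # no improvement: halt early
            break
        best = loss
        # Gradients of the MSE with respect to each weight and the bias.
        for j in range(n_features):
            grad_wj = 2 * sum(e * row[j] for e, row in zip(errors, X)) / len(y)
            w[j] -= lr * grad_wj
        b -= lr * 2 * sum(errors) / len(y)
    return w, b, history


# Hypothetical data generated from y = 2x + 1.
X = [[0.0], [0.5], [1.0], [1.5], [2.0]]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
w, b, history = train_linear_neuron(X, y)
print(w, b, len(history), history[-1])
```

A network with hidden layers generally cannot learn from all-zero initial weights, because every hidden neuron then receives the same zero gradient; that is one reason this sketch uses a single linear neuron, for which a zero start is unproblematic.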

To identify the models with the highest accuracy, predictive models were developed for heavy metal concentrations based on physico-chemical, biological, and meteorological parameters by comparing multiple regression, time series analysis, and ANN.

Statistical analysis

Cu and Zn concentrations (μg L−1) for all ponds in ANR and seasons were assessed to identify the effects of hydrological connectivity and seasonal patterns on the data, where winter season was from December to February, spring was from March to May, summer was from June to August, and autumn was from September to November. The effect of hydrological connectivity to the River Erewash for both Cu (Figure 2(a)) and Zn (Figure 2(b)) was clearly evident with the lakes connected to inflowing water from the River Erewash (Coneries, Main, and Tween) displaying higher concentrations than the unconnected lakes (Church and Clifton). A clear seasonal pattern was also apparent for the Zn concentration data with highest values recorded during the winter for the connected lakes before declining to a minimum during the summer (Figure 2(b)). In contrast, hydrologically unconnected lakes experienced the highest concentrations during the autumn (Figure 2(b)). This may reflect higher winter flows into the connected lakes resulting in greater Zn transport. However, during autumn, submerged aquatic plants in the connected lakes die and decompose in situ modifying the redox potential. A seasonal pattern was not as evident within the Cu series (Figure 2(a)).
Figure 2

Observed mean Cu concentration (μg L−1) with 95% CI (a) and Zn concentration (μg L−1) with 95% CI (b) for all ponds studied including those connected to the River Erewash (Coneries, Main, Tween) and unconnected ponds (Church and Clifton) at ANR for long-term climatological years; where, winter = December–February, Spring = March–May, Summer = June–August, Autumn = September–November.

Regression models

To investigate the associations between Cu and Zn concentration and other environmental parameters, regression analysis was undertaken (Grum et al. 1997; Vervier et al. 1999). Bivariate Pearson correlation coefficients (r) were calculated for Zn and Cu (dependent variables) in association with the environmental parameters: conductivity, precipitation, temperature, pH, TSS, flow, and DO. A weak but significant positive correlation was observed between Cu concentration (μg L−1) and conductivity (μS cm−1) (r = 0.22; p < 0.01; Table 4). There was also a weak, positive correlation between Zn concentration (μg L−1) and river discharge (m3 s−1) which was statistically significant (r = 0.23; p < 0.01) (Table 4). No other statistically significant correlation was recorded.
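The significance of a Pearson correlation coefficient is commonly assessed by converting r to a t statistic, t = r√(n − 2)/√(1 − r²). The sketch below does this and approximates the two-sided p value with the standard normal distribution, which is reasonable only for large samples. The values of r and n are hypothetical and are not those of the study.

```python
import math


def r_to_t(r, n):
    """t statistic for a Pearson correlation r computed from n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def two_sided_p_normal(t):
    """Two-sided p value using the normal approximation to the t distribution."""
    return math.erfc(abs(t) / math.sqrt(2))


# Hypothetical example: r = 0.30 from n = 102 paired observations.
t_stat = r_to_t(0.30, 102)
p_value = two_sided_p_normal(t_stat)
print(round(t_stat, 3), p_value)
```

The same coefficient can be significant in a large sample and non-significant in a small one, so the sample size should be reported alongside r.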

Table 4

Correlation coefficient (r) between metal concentrations and other environmental parameters

| Metal | TSS (mg L−1) | DO (mg L−1) | Conductivity (μS cm−1) | pH | Temp (°C) | Flow (m3 s−1) | Precipitation (mm) |
|---|---|---|---|---|---|---|---|
| Cu | 0.20 | 0.14 | 0.22 | −0.11 | −0.12 | 0.18 | 0.14 |
| Zn | 0.16 | 0.06 | 0.15 | −0.17 | −0.14 | 0.23 | 0.04 |

MLRMs were developed to predict Cu and Zn concentrations for all lakes. Both models included all the abiotic parameters and identified conductivity, TSS, flow and precipitation as the best predictor variables. The adjusted R2 values were 0.134 for Cu and 0.101 for Zn (Table 5); the models therefore explained more of the variance for Cu (13.4%) than for Zn (10.1%). Based on these results, regression analysis was not considered sufficiently robust for the prediction of either Cu or Zn concentrations.
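For readers less familiar with the statistic, the adjusted R2 penalises the ordinary R2 for the number of predictors k relative to the number of samples n. The standard definition is given below, followed by an illustrative back-calculation that assumes n = 530 (the sample size in Table 5) and k = 4 retained predictors. The value of k is an assumption, since the text does not state how many predictors entered the final models.

```latex
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}

% Illustrative only, assuming n = 530 and k = 4 for the Cu model:
% 1 - R^2 = (1 - 0.134)\,\frac{525}{529} \approx 0.859
% \quad\Rightarrow\quad R^2 \approx 0.141
```

With a sample this large the penalty is small, so the adjusted and unadjusted values differ by about 0.01 or less under this assumption.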

Table 5

Linear regression model parameters

| Model | Adjusted R2 | Number of samples | Predictor variable and sign |
|---|---|---|---|
| Cu | 0.134 | 530 | +Conductivity, +TSS, +Flow, +Precipitation |
| Zn | 0.101 | 530 | +Flow, +Conductivity, +TSS, +Precipitation |

Correlation and regression analysis have been used in previous research to examine abiotic environmental variables that may influence metal concentrations (Li et al. 2013; Yaho et al. 2016; Nasrabadi et al. 2018). For example, metal concentrations have been reported to increase at times of elevated TSS (Nasrabadi et al. 2018). Increased runoff from road surfaces due to precipitation may increase the suspended solid load in rivers resulting in an increase in metal concentrations (Yaho et al. 2016). Metal concentration may also increase under acidic conditions with subsequent high conductivity, or during periods of high river flow following precipitation input (Li et al. 2013). In the context of the current study, metals have run off from the surrounding industrial and historic mining areas.

Multiple regression analysis has been used in the past to develop predictive models of heavy metal contamination. For example, Tipping et al. (2003) used multiple regression analysis to forecast solid–solution partitioning of heavy metals in upland soils in England and Wales. Carlon et al. (2004) used multiple regression analysis with a stepwise forward procedure to predict the mobility of heavy metals and other pollutants in soil, drawing on data from five previously published studies (Gerritse & Van Driel 1984; Jopony 1991; Aten & Gupta 1996; Janssen et al. 1997; Sauvé et al. 1997). An equation was developed to predict the mobility of lead based on pH, but the adjusted R2 indicated that it performed poorly. Meers et al. (2005) used multivariate regression to predict the solubility of metals in soils in Belgium. Although the regression model with pH and total cadmium displayed a good model fit in that study, it was difficult to find a practical predictive model based on the equations derived. Similarly, Ivezić et al. (2012) used multiple regression to predict the solubility of metals in soil. However, the models overestimated the concentration of metals within the polluted soils and were not useful for unpolluted soils. Yaho et al. (2016) used two statistically significant linear regression models to predict the concentration of heavy metals in two rivers in China. The R2 values for the models ranged between 0.86 and 0.93 for samples from an industrial river and between 0.60 and 0.85 for the cleaner river. Kouadri et al. (2021) developed Water Quality Index (WQI) predictions using eight artificial intelligence algorithms based on two different scenarios. The performance of the models was assessed by correlation coefficient (R), MAE, RMSE, RAE, and RRSE. The results indicated that the multilinear regression (MLR) model provided the greatest accuracy for the first scenario and RF provided the greatest accuracy for the second scenario.

In the current research, the multiple regression models had low predictive ability for the metal concentrations and were unable to explain the variability within the data accurately. Consequently, other techniques were examined as outlined in the following sections.
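To make the link between the adjusted R2 values in Table 5 and the unadjusted R2 explicit, the following minimal sketch applies the standard adjustment formula and its inverse. The sample size (n = 530) and adjusted R2 values are taken from Table 5; the number of predictors (p = 4) is an assumption based on the four predictor variables listed there, and the function names are illustrative only.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


def r2_from_adjusted(adj_r2: float, n_samples: int, n_predictors: int) -> float:
    """Invert the adjustment to recover the unadjusted R^2."""
    return 1.0 - (1.0 - adj_r2) * (n_samples - n_predictors - 1) / (n_samples - 1)


# Values from Table 5; p = 4 is an assumption (the four listed predictors).
N_SAMPLES = 530
N_PREDICTORS = 4

for metal, adj in (("Cu", 0.134), ("Zn", 0.101)):
    r2 = r2_from_adjusted(adj, N_SAMPLES, N_PREDICTORS)
    print(f"{metal}: adjusted R2 = {adj:.3f}, implied unadjusted R2 = {r2:.3f}")
```

With several hundred samples and only a few predictors, the adjustment is small, so the low explained variance reflects weak linear relationships rather than a penalty for model size.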

Time series analysis of the historical metal data

Time series analysis was used to identify temporal patterns within the historic metal (Cu and Zn) data and other parameters using ARIMA models. The environmental parameters were assessed for one connected and one isolated lake (Figures 3 and 4), while the metal concentrations were assessed for all the connected and isolated lakes (Figures 5 and 6).
Figure 3

Fitted and observed results of the trend for abiotic environmental parameters: (a) pH, (b) temperature, (c) DO, (d) TSS, (e) conductivity, (f) flow, and (g) precipitation of a connected lake (Coneries Lake) using ARIMA models.

Figure 4

Fitted and observed results of the trend for abiotic environmental parameters: (a) pH, (b) temperature, (c) DO, (d) TSS, (e) conductivity, (f) flow, and (g) precipitation of an isolated pond (Church pond) using ARIMA models.

Figure 5

Fitted and observed results of the trend for Cu concentrations for all connected lakes: (a) Coneries, (b) Main, (c) Tween, and isolated lakes, (d) Church and (e) Clifton using ARIMA models.

Figure 6

Fitted and observed results of the trend for Zn concentrations for all connected lakes: (a) Coneries, (b) Main, (c) Tween, and isolated lakes, (d) Church and (e) Clifton using ARIMA models.


Trends and patterns of abiotic parameters of a connected lake

One connected lake (Coneries lake) was used for building the ARIMA models (Figure 3). The ARIMA models developed indicate good model fits for pH (Figure 3(a)), temperature (Figure 3(b)), DO (Figure 3(c)), conductivity (Figure 3(e)), and flow (Figure 3(f)), where the p values of the AR (p) and MA (q) terms were all highly significant (p < 0.05), with AIC values of −259.94, 33.68, 92.93, −122.22, and 183.91, respectively. This probably reflects the high temporal resolution of the data and the fact that all model parameters were selected after carefully reviewing the PACF plots, ACF plots, and ADF tests. However, the models for TSS (Figure 3(d)) and precipitation (Figure 3(g)) did not provide a good fit: although the p values of the AR (p) and MA (q) terms were significant (p < 0.05), the AIC values were high. This was probably because these parameters showed limited variability compared to the others, with many values below the limits of detection.
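The role of the AIC in comparing candidate time series models can be illustrated with a deliberately simplified sketch. It fits a first-order autoregressive model, AR(1), by ordinary least squares and compares it with a constant-mean model using the Gaussian least-squares form of the AIC. This is not the ARIMA procedure used in this study: the series is synthetic, the function names are hypothetical, and the AIC is computed only up to an additive constant, so the values are not comparable with those reported above.

```python
import math
import random


def fit_ar1(series):
    """Fit x[t] = c + phi * x[t-1] + e[t] by ordinary least squares."""
    previous, current = series[:-1], series[1:]
    n = len(current)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(previous, current))
    phi = sxy / sxx
    intercept = mean_curr - phi * mean_prev
    rss = sum((c - (intercept + phi * p)) ** 2 for p, c in zip(previous, current))
    return intercept, phi, rss, n


def gaussian_aic(rss, n, k):
    """AIC up to an additive constant for Gaussian errors: n * ln(RSS / n) + 2k."""
    return n * math.log(rss / n) + 2 * k


# Hypothetical monthly series (9 years x 12 months); not data from this study.
rng = random.Random(42)
series = [7.5]
for _ in range(107):
    series.append(2.25 + 0.7 * series[-1] + rng.gauss(0.0, 0.05))

intercept, phi, rss, n = fit_ar1(series)

# Baseline: constant-mean model fitted to the same n observations.
mean_level = sum(series[1:]) / n
rss_mean = sum((x - mean_level) ** 2 for x in series[1:])

# k counts the fitted coefficients (intercept and phi, or the mean alone).
print(f"AR(1): phi = {phi:.2f}, AIC = {gaussian_aic(rss, n, k=2):.1f}")
print(f"Mean only: AIC = {gaussian_aic(rss_mean, n, k=1):.1f}")
```

A lower AIC indicates a better trade-off between fit and model complexity, but only among models fitted to the same series.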

Trends and patterns of abiotic parameters of an isolated lake

The data for one isolated lake (Church pond) were used to build the ARIMA models (Figure 4) for analysing the trends and patterns of environmental parameters. The ARIMA models indicate a good model fit for pH (Figure 4(a)), temperature (Figure 4(b)), conductivity (Figure 4(e)), and flow (Figure 4(f)) where the p values of the AR (p) and MA (q) terms were highly significant (p < 0.05) with AIC values of 89.03, −288.1, 87.45, −246.79, 183.08, respectively. However, some of the models did not provide a good model fit, for example, DO (Figure 4(c)), TSS (Figure 4(d)) and precipitation (Figure 4(g)). For DO (Figure 4(c)), the p value of the AR (p) term was not significant (p > 0.05), although the AIC value was low. For TSS (Figure 4(d)) and precipitation (Figure 4(g)), the p values of the AR (p) and MA (q) terms were highly significant (p < 0.05) but the AIC values were high. This was probably due to the limited data variability (compared to other variables) and many values being below the limits of detection.

Trends and patterns of Cu concentrations for all connected and isolated lakes

Time series plots were prepared for all lakes to illustrate the variability of monthly Cu concentrations for the period 2009–2017 (Figure 5). A peak in Cu concentration (16.80 μg L−1) was observed in April 2013 for Coneries lake (Figure 5(a)). This was due to high river discharge, elevated water levels and flooding in Nottinghamshire during this period, when water levels were elevated at the Coneries lake outlet because water from the River Erewash was being diverted away from the Coneries–Tween–Main lake chain (McGowan & Salgado-Bonnet 2019). The time series plot for the unconnected lake, Church pond, illustrates the variation of monthly Cu concentration data over the same period (Figure 5(d)), with a peak in Cu concentration (12.60 μg L−1) also observed in April 2013. Metal contaminated river sediments and metal pollutants were mobilised and carried downstream during these events. Flooding is known to result in marked fluctuations in metal concentrations associated with increased mobilisation and re-suspension of sediments (Hurley et al. 2019) on the rising limb of the hydrograph. Thus, heavy metal contaminated sediments deposited and stored within the channel and floodplain may be easily mobilised and pose a threat to the aquatic environment (Environment Agency 2008). The AIC values obtained for Cu concentrations for the lakes were as follows: (a) Coneries = 28, (b) Main = 90, (c) Tween = 69, (d) Church = 87, and (e) Clifton = 120 (Figure 5).

Trends and patterns of Zn concentrations for all connected and isolated lakes

Time series plots were also prepared to analyse the variation of monthly Zn concentration values for all lakes (Figure 6) for the time-period of 2009–2017. The AIC values obtained for Zn concentrations of all the lakes were as follows: (a) Coneries = 161, (b) Main = 161, (c) Tween = 221, (d) Church = 187 and (e) Clifton = 213 (Figure 6).

The ARIMA models helped to identify and visualise temporal patterns within the historic metal (Cu and Zn) data and other parameters. The analysis demonstrated that the lakes directly connected to the inflowing River Erewash received a greater metal load than the unconnected lakes. However, it is also clear that metal concentrations do not necessarily follow seasonal patterns or the trend lines in all instances. The models performed well for some variables (e.g., pH, temperature, conductivity, river flow) but less well for other parameters (e.g., TSS and precipitation). The AIC values obtained for Cu were lower than for Zn, indicating a better goodness of fit for Cu.

Based on a review of the ARIMA models, it was established that time series analysis was not particularly accurate for the prediction of Cu and Zn concentrations. However, the method has been widely used in previous research centred on the analysis of seasonal and long-term water quality variation. For example, Lehmann & Rode (2001) used time series and trend analysis to characterise water quality and demonstrated that there were clear associations between parameters from 1984 to 1989, but that the patterns broke down after 1990. The correlated parameters were DO saturation and biological oxygen demand, DO saturation and nitrate, nitrate and discharge, and chlorophyll-a and pH. It was also demonstrated that ARIMA models forecast events more accurately in the short term and became less effective in the long term. Rao et al. (2011) used time series analysis to examine trace metal pollution in surface waters from historic Cu mining areas in China using 13 years of data from 1995 to 2007. The results indicated that the model performed well with a smoothing coefficient of α = 0.2, with an error between the estimated and actual values of below 5%.
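The smoothing coefficient reported for Rao et al. (2011) can be illustrated with a short sketch of simple (single) exponential smoothing. The exact variant used in that study is not described here, so the single-smoothing form, the initialisation with the first observation, and the example values are all assumptions made for illustration.

```python
def exponential_smoothing(values, alpha=0.2):
    """Return levels s[t] = alpha * x[t] + (1 - alpha) * s[t-1], with s[0] = x[0]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def relative_errors(values, smoothed):
    """One-step-ahead relative error: the forecast for time t is the level at t-1."""
    return [abs(x - s) / abs(x) for x, s in zip(values[1:], smoothed[:-1])]


# Hypothetical annual metal concentrations; not data from any cited study.
observed = [10.0, 12.0, 11.0, 13.0]
levels = exponential_smoothing(observed, alpha=0.2)
errors = relative_errors(observed, levels)
print([round(s, 3) for s in levels])
print([round(e, 3) for e in errors])
```

A small α such as 0.2 weights the accumulated history heavily, which suits slowly varying series but responds slowly to abrupt peaks such as flood-driven events.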

In the current study, the time series analysis demonstrated that it was not possible to predict metal concentrations accurately using ARIMA models because they reflected neither the temporal trend nor the seasonal pattern accurately. This observation led to the use of ANNs as outlined below.

ANN analysis

TSS, river flow, DO, and Cu were used as inputs into the ANN for connected lakes (Figure 7) to predict the Cu values, using a sigmoid activation function for all nodes in the hidden layer and a linear function for the output layer. It should be noted that ANN models for Zn are not presented within this paper. As outlined above, it was clear from both regression and ARIMA modelling that the secondary data for Zn concentrations were noisier. For example, there were multiple occurrences of 2 or 3 months having identical or unreliable values for Zn concentrations. This observation was confirmed through examination of the descriptive statistics presented in Table 2 and Figure 2, where a relatively large variation in Zn concentration values was observed, with no significant correlations with the factors that may have influenced them.
Figure 7

Structure of the ANNs with four inputs (TSS, flow, DO, and Cu) and five neurons in the hidden layer for the connected lakes with Sigmoid activation function for all nodes in the hidden layer and linear for the output layer.


The network for Cu (Figure 7) was trained to achieve the highest model accuracy. Tests were also conducted with alternative structures to those shown in Figures 7 and 9, such as ANNs with multiple hidden layers and more hidden nodes. However, those were rejected as they took longer to converge while showing no improvement in prediction accuracy. The validation set stopped the training procedure when the accuracy of the model no longer improved. DO and TSS were found to be the best predictors for Cu values.
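The validation-based stopping rule described above can be sketched as follows. The software settings used in this study are not reported here, so the patience value, the function name, and the validation error sequence are hypothetical and serve only to show the logic of stopping once the validation error stops improving.

```python
def early_stopping_epoch(validation_errors, patience=6):
    """Return the index of the epoch whose weights would be kept.

    Training stops once the validation error has failed to improve
    for `patience` consecutive epochs (equal values count as no improvement).
    """
    best_epoch, best_error = 0, validation_errors[0]
    for epoch, error in enumerate(validation_errors[1:], start=1):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break
    return best_epoch


# Hypothetical validation errors recorded after each training epoch.
history = [0.50, 0.40, 0.30, 0.35, 0.36, 0.20]
print(early_stopping_epoch(history, patience=2))
print(early_stopping_epoch(history, patience=6))
```

A short patience stops training sooner but may miss a later improvement, which is one reason the choice of stopping criterion affects the reported accuracy.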

A training set (70% of the data) was used to train the model. A validation set (15% of the data) was used to validate the model during training, and a test set (the remaining 15% of the data) was used to test the model after it was trained. The R (coefficient of correlation) values for each stage of the ANN were R = 0.89 for validation, R = 0.72 for training, and R = 0.71 for testing (Figure 8). The next ANN, for the isolated lakes, was developed by replacing flow with precipitation in the input architecture (Figure 9). DO and TSS were again found to be the best predictors. The R values obtained were R = 0.81 for validation, R = 0.79 for testing, and R = 0.70 for training (Figure 10).
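The 70/15/15 partition and the correlation coefficient R used to assess each subset can be sketched as below. Random shuffling before the split is an assumption, as the sampling scheme is not stated, and the function names and example values are illustrative rather than taken from the software used in this study.

```python
import math
import random


def split_70_15_15(records, seed=0):
    """Shuffle records and split them into training, validation and test subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = (70 * len(shuffled)) // 100
    n_val = (15 * len(shuffled)) // 100
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains, roughly 15%
    return train, validation, test


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical record indices and predictions; not data from this study.
train, validation, test = split_70_15_15(range(100))
print(len(train), len(validation), len(test))
print(round(pearson_r([1.0, 2.0, 3.0, 4.0], [1.2, 1.9, 3.4, 3.8]), 3))
```

Because the validation and test subsets are small, their R values are more variable than the training value, which should be kept in mind when comparing the three.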
Figure 8

Performance of the model and the value of R for the training, validation, and test for the connected lakes.

Figure 9

Structure of the ANNs with four inputs (TSS, precipitation, DO, and Cu) and five neurons in the hidden layer for the isolated lakes with Sigmoid activation function for all nodes in the hidden layer and linear for the output layer.

Figure 10

Performance of the model and the value of R for the training, validation, and test for the isolated lakes.


ANNs have been successfully used for building predictive models of heavy metal concentrations in the environment. For example, Rooki et al. (2011) used back propagation, multiple linear regression and general regression neural networks to predict heavy metal concentrations in the Shur River in southeast Iran, where a back propagation neural network displayed the highest performance. Cavalcante et al. (2013) used an ANN (multilayer perceptron) with one hidden layer of five neurons to predict the changes in physico-chemical parameters based on seasonal variability. Subida et al. (2013) compared univariate techniques, multivariate techniques and ANNs to assess their performance and to estimate heavy metal contamination in estuarine benthic habitats. Univariate techniques displayed poorer performance than multivariate techniques, and ANNs provided the results that were most straightforward to interpret with the most efficient analytical effort. Wang et al. (2014) developed a neural network to detect Cu, cadmium, and lead concentrations in industrial wastewaters. The performance of the model was assessed by the average MRE (mean of relative error), and a comparison with other popular chemometric neural network methods indicated that the model performed very well. Zhou et al. (2015) used ANNs (Levenberg–Marquardt algorithm and back propagation) to identify contaminant sources and to predict the concentration of heavy metals in soil from a city area of China. The results indicated that the sources for areas with high levels of heavy metal contamination were successfully identified and that the three-layer feed-forward ANNs displayed high accuracy in estimating heavy metal concentrations. Elzwayie et al. (2017) used a radial basis function neural network (RBFNN) to predict the heavy metal concentrations in two different lakes from Libya and Malaysia with different climatic and environmental conditions (e.g., tropical and arid; polluted and unpolluted lakes). The significant parameters were different in each study area and the accuracy of the models was very high. Ayaz & Khan (2019) developed a feed-forward neural network with back propagation to predict the heavy metal concentrations in Karachi coastal waters and harbour. The results clearly demonstrated that neural networks were the most useful method to predict metal concentrations. Hadjisolomou et al. (2021) investigated chlorophyll-a levels of a shallow eutrophic lake using an ANN model for different water quality parameters including pH, DO, water temperature, phosphorus, nitrogen, electric conductivity, and Secchi disk depth. K-fold cross validation and leave-one-out (LOO) cross validation were applied to train the ANN model. The results improved with higher values of k, and the best results were obtained using LOO cross validation.

In the current study, a 4:5:1 neural network was developed with four input nodes, five neurons in a single hidden layer, and one output node, where DO and TSS were found to be the best predictors with an R value of 0.72. The ANN model could predict Cu concentrations more accurately than the other methods considered (regression or ARIMA) for the abiotic parameters used. A thorough examination of the regression models in this research indicated that it was not possible to predict metal concentrations accurately using multiple regression. In addition, the results of the time series analysis indicated that it was not possible to accurately predict metal concentrations using time series (ARIMA) methods because the models for the target metals reflected neither the trend nor any seasonal patterns. However, the results of the ANNs indicated that all models developed displayed a lower MSE (MSE = 0.01) compared to regression models, indicating that ANNs were better suited to predicting metal concentrations than the other methods considered in this study. The ANN models developed may ultimately assist local managers of the ANR in managing the metal concentrations within the lakes, which may currently affect the birds, aquatic organisms, and the wider environment.
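The 4:5:1 structure (Figures 7 and 9) can be made concrete with a minimal forward-pass sketch: four inputs feed five sigmoid hidden neurons, whose outputs are combined by a single linear output node. The weights below are random placeholders rather than the trained weights from this study, the input values are hypothetical scaled readings, and the function names are illustrative only.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid activation used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: sigmoid hidden layer followed by a linear output node."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def mean_squared_error(targets, predictions):
    """Mean squared error between target and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


N_INPUTS, N_HIDDEN = 4, 5
rng = random.Random(0)

# Untrained placeholder weights: 4 x 5 hidden weights, 5 hidden biases,
# 5 output weights and 1 output bias (31 parameters in total).
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-1, 1)

# Hypothetical inputs scaled to [0, 1], in the order TSS, flow, DO, Cu.
sample = [0.2, 0.5, 0.8, 0.3]
print(forward(sample, w_hidden, b_hidden, w_out, b_out))
```

With only 31 adjustable parameters, a network of this size is small relative to the several hundred samples available, which is consistent with larger structures showing no gain in accuracy.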

Heavy metal contamination in aquatic ecosystems and its potential effects on aquatic life are a significant environmental concern in many developed economies as well as low- and middle-income countries. Despite significant historic and current issues associated with metal pollution, there have been comparatively few contemporary studies examining the long-term legacy and effect of heavy metals on aquatic ecosystems in post-industrial areas. Although extensive long-term historical datasets on heavy metal availability in the environment exist within some geographical regions, few studies have attempted to develop predictive models to forecast metal concentrations or to identify potential cause–effect relationships between metal concentrations in the environment and abiotic parameters. This study therefore compared multiple regression, time series analysis, and ANNs to identify the most accurate model for predicting metal concentrations based on different abiotic parameters.

A series of lakes within the ANR, UK, was used as a model system in this paper to analyse a 9-year hydrological and meteorological dataset. ANR is a conservation area in Nottinghamshire, UK, located near the confluence of the River Erewash and River Trent. This research developed a series of models (regression analysis, time series analysis and ANNs) to predict the concentration of Cu and Zn in freshwater lakes using a range of abiotic parameters. The correlation analysis conducted in this study indicated that electrical conductivity (Pearson correlation coefficient, r = 0.22 for Cu) and river flow (r = 0.23 for Zn) were significantly associated with metal concentrations. Electrical conductivity, TSS, flow volume and precipitation were the best predictors in regression models (explaining 13.4% of the variance for Cu and 10.1% of the variance for Zn). As expected, time series analysis in this work showed that the lakes directly connected to the inflowing river received a greater metal load than the unconnected lakes. Individual ANN models were created for metal concentrations, where DO and TSS were found to be the best predictors, with a high R value of 0.72 and a low MSE of 0.01. Overall, the results suggest that the ANN models were the most accurate of the methods considered for predicting metal concentrations based on abiotic parameters. Since ANNs could reveal different effects and patterns of the physico-chemical and seasonal changes in metal concentrations, and provided more accurate results, they were considered the most appropriate for building predictive models. Other methodological approaches could have been employed, such as RF, M5P tree (M5P), random subspace (RSS), AR, SVR, and locally weighted linear regression (LWLR). These methods could be applied in future studies to determine whether it is possible to improve model accuracy and predictive power.

The results of this study revealed that TSS and DO have a strong effect on the metal concentrations in the lakes of ANR. TSS raises the turbidity of a river system and is strongly related to the volume (flooding) and timing of runoff. It can therefore be inferred that metal concentrations tend to increase in the rainy season or during flood events when river flow is high. The environmental regulators responsible for ANR need to manage the effect of flooding in the area. The findings of this research may help environmental regulators, reserve managers and local authorities responsible for the River Erewash or ANR to develop strategies to manage environmental metal concentrations and identify more robust monitoring and mitigation measures.

The monitoring and analytical work was funded by CEMEX UK Operations Ltd as part of a monitoring programme overseen by Christopher Pointer, Principal Hydrogeologist at CEMEX. The work has included many University of Nottingham students and staff, with significant contributions from Iain Cross, Teresa Needham, Julie Swales, Ian Conway, and Sarah Taylor. We would like to express our sincere gratitude to Nottinghamshire Wildlife Trust and the ANR for allowing access to the sites.

B.B. and L.B. conceptualised the study, prepared the methodology, did formal analysis, investigated, wrote the original draft, wrote, reviewed, and edited the article, and visualised the study. L.D. and P.J.W. conceptualised the study, prepared the methodology, supervised the study, wrote, reviewed, and edited the article. S.M.G. did field work, prepared the methodology, did laboratory analyses, did data curation, wrote, reviewed, and edited the article. D.B.D. conceptualised the study, prepared the methodology, collected resources, supervised the study, wrote, reviewed, and edited the article. All of the authors have read and agreed to the submitted version of the manuscript.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Ahmed
M. K.
,
Shaheen
N.
,
Islam
M. S.
,
Al-Mamun
M. H.
,
Islam
S.
,
Mohiduzzaman
M.
&
Bhattacharjee
L.
2015
Dietary intake of trace elements from highly consumed cultured fish (Labeorohita, Pangasius pangasius and Oreochromis mossambicus) and human health risk implications in Bangladesh
.
Chemosphere
128
,
284
292
.
doi:10.1016/j.chemosphere.2015.02.016
.
Alam
R.
,
Ahmed
Z.
&
Howladar
M. F.
2020
Evaluation of heavy metal contamination in water, soil and plant around the open landfill site Mogla Bazar in Sylhet, Bangladesh
.
Groundwater for Sustainable Development
10
,
100311
.
doi:10.1016/j.gsd.2019.100311
.
Armitage
P. D.
,
Bowes
M. J.
&
Vincent
H. M.
2007
Long-term changes in macroinvertebrate communities of a heavy metal polluted stream: The River Nent (Cumbria, UK) after 28 years
.
River Research and Applications
23
,
997
1015
.
doi:10.1002/rra.1022
.
ASCE, T.
2000
Artificial neural networks in hydrology. I: Preliminary concepts
.
Journal of Hydrologic Engineering
5
(
2
),
115
123
.
Ayaz
M.
&
Khan
N.
2019
Forecasting of heavy metals contamination in coastal sea surface water of the Karachi harbor area using neural network approach
.
Nature Environment and Pollution Technology
18
,
719
733
.
Bennett
D. A.
2001
How can I deal with missing data in my study?
Australian and New Zealand Journal of Public Health
25
(
5
),
464
469
.
Brewer
P. A.
,
Dennis
I. A.
&
Macklin
M. G.
2005
The use of Geomorphological Mapping and Modeling for Identifying Land Affected by Metal Contamination on River Floodplains
.
DEFRA
,
London
.
Byrne
P.
,
Wood
P. J.
&
Reid
I.
2012
The impairment of river systems by metal mine contamination: A review including remediation options
.
Critical Reviews in Environmental Science and Technology
42
,
2017
2077
.
doi:10.1080/10643389.2011.574103
.
Cavalcante
Y. L.
,
Hauser-Davis
R. A.
,
Saraiva
A. C. F.
,
Brandão
I. L. S.
,
Oliveira
T. F.
&
Silveira
A. M.
2013
Metal and physico-chemical variations at a hydroelectric reservoir analyzed by multivariate analyses and artificial neural networks: Environmental management and policy/decision-making tools
.
Science of the Total Environment
442
,
509
514
.
doi:10.1016/j.scitotenv.2012.10.059
.
Chandrappa
R.
&
Das
D.
2021
Environmental Health – Theory and Practice. Volume 2: Coping with Environmental Health
.
Springer Nature
,
Basel, Switzerland
.
ISBN 978-3-030-64484-0
.
Chatfield
C.
2000
Time-Series Forecasting
.
CRC press
,
New York
.
doi:10.1201/9781420036206
.
Chatterjee
S.
&
Hadi
A. S.
2015
Regression Analysis by Example
.
John Wiley and Sons
,
Hoboken, NJ, USA
.
Cross
I.
,
McGowan
S.
,
Needham
T.
&
Pointer
C. M.
2014
The effects of hydrological extremes on former gravel pit lake ecology: Management implications
.
Fundamental and Applied Limnology/Archiv für Hydrobiologie
185
(
1
),
71
90
.
doi:10.1127/fal/2014/0573
.
Derbyshire Wildlife Trust
2019
The River Erewash – Derbyshire's wandering wonder
.
Dukes
A. D.
,
Eklund
R. T.
,
Morgan
Z. D.
&
Layland
R. C.
2020
Heavy metal concentration in the water and sediment of the lake greenwood watershed
.
Water, Air, & Soil Pollution
231
,
11
.
doi:10.1007/s11270-019-4364-x
.
Dunea
D.
&
Iordache
Ş
.
2011
Time series analysis of the heavy metals loaded wastewaters resulted from chromium electroplating process
.
Environmental Engineering and Management Journal (EEMJ)
10
(
3
),
469
482
.
doi:10.30638/eemj.2011.062
.
Elzwayie
A.
,
Afan
H. A.
,
Allawi
M. F.
&
El-shafie
A.
2017
Heavy metal monitoring, analysis and prediction in lakes and rivers: State of the art
.
Environmental Science and Pollution Research
24
,
12104
12117
.
doi:10.1007/s11356-017-8715-0
.
Environment Agency Thames Region
1995
Chironomids as Indicators of Biological Quality: Present Use and Future Potential
.
Environment Agency
,
Bristol
.
Environment Agency
2008
Assessment of Metal-Mining Contaminated River Sediments in England and Wales, Science Report – SC030136/SR4
.
Bristol, Almondsbury
,
UK
.
Gerritse
R. G.
&
Van Driel
W.
1984
The relationship between adsorption of trace metals, organic matter, and pH in temperate soils
.
Journal of Environmental Quality
13
,
197
204
.
GHCN-Daily
2021
Global Historical Climate Network Daily – Description [Online]
.
Available from: https://www.ncdc.noaa.gov/ghcn-daily-description (accessed 19 January 2021)
.
Gredilla
A.
,
De Vallejuelo
S. F. O.
,
de Diego
A.
,
Madariaga
J. M.
&
Amigo
J. M.
2013
Unsupervised pattern-recognition techniques to investigate metal pollution in estuaries
.
TrAC Trends in Analytical Chemistry
46
,
59
69
.
Grum
M.
,
Aalderink
R. H.
,
Lijklema
L.
&
Spliid
H.
1997
The underlying structure of systematic variations in the event mean concentrations of pollutants in urban runoff
.
Water Science and Technology
36
(
8–9
),
135
140
.
Hadjisolomou
E.
,
Stefanidis
K.
,
Herodotou
H.
,
Michaelides
M.
,
Papatheodorou
G.
&
Papastergiadou
E.
2021
Modelling freshwater eutrophication with limited limnological data using artificial neural networks
.
Water
13
(
11
).
doi:10.3390/w13111590
.
Hurley
R. R.
,
Rothwell
J. J.
&
Woodward
J. C.
2017
Metal contamination of bed sediments in the Irwell and Upper Mersey catchments, northwest England: Exploring the legacy of industry and urban growth
.
Journal of Soils and Sediments
17
,
2648
2665
.
doi:10.1007/s11368-017-1668-6
.
Hurley
R. R.
,
Woodward
J. C.
&
Rothwell
J. J.
2019
Highly conservative behaviour of bed sediment-associated metals following extreme flooding
.
Hydrological Processes
33
,
1204
1217
.
doi:10.1002/hyp.13382
.
IBM
2020
Neural Networks [Online]
.
Available from: https://www.ibm.com/cloud/learn/neural-networks (accessed 24 November 2021)
.
Islam
M. S.
,
Ahmed
M. K.
,
Habibullah-Al-Mamun
M.
,
Islam
K. N.
,
Ibrahim
M.
&
Masunaga
S.
2014
Arsenic and lead in foods: A potential threat to human health, Bangladesh
.
Food Additives & Contaminants
.
doi:10.1080/19440049.2014.974686
.
Islam
M. S.
,
Ahmed
M. K.
,
Raknuzzaman
M.
,
Habibullah-Al-Mamun
M.
&
Masunaga
S.
2015
Metal speciation in sediment and their bioaccumulation in fish species of three urban rivers in Bangladesh
.
Archives of Environmental Contamination and Toxicology
68
,
92
106
.
doi:10.1007/s00244-014-0079-6
.
Janssen
R. P.
,
Peijnenburg
W. J.
,
Posthuma
L.
&
Van Den Hoop
M. A.
1997
Equilibrium partitioning of heavy metals in Dutch field soils. I. Relationship between metal partition coefficients and soil characteristics
.
Environmental Toxicology and Chemistry: An International Journal
16
(
12
),
2470
2478
.
Jopony
M.
1991
The Solubility of Lead and Cadmium in Contaminated Soils
.
PhD Thesis
,
University of Nottingham
.
Karaouzas
I.
,
Kapetanaki
N.
,
Mentzafou
A.
,
Kanellopoulos
T. D.
&
Skoulikidis
N.
2021
Heavy metal contamination status in Greek surface waters: A review with application and evaluation of pollution indices
.
Chemosphere
263
,
128192
.
doi:10.1016/j.chemosphere.2020.128192
.
Kouba
A.
,
Buřič
M.
&
Kozák
P.
2010
Bioaccumulation and effects of heavy metals in crayfish: A review
.
Water, Air & Soil Pollution
211
,
5
16
.
doi:10.1007/s11270-009-0273-8
.
Lehmann
A.
&
Rode
M.
2001
Long-term behaviour and cross-correlation water quality analysis of the river Elbe, Germany
.
Water Research
35
(
9
),
2153
2160
.
doi:10.1016/S0043-1354(00)00488-7
.
Li
X.
,
Liu
L.
,
Wang
Y.
,
Luo
G.
,
Chen
X.
,
Yang
X.
,
Gao
B.
&
He
X.
2012
Integrated assessment of heavy metal contamination in sediments from a coastal industrial basin, NE China
.
PloS One
7
(
6
),
e39690
.
doi:10.1371/journal.pone.0039690
.
Li
H.
,
Shi
A.
,
Li
M.
&
Zhang
X.
2013
Effect of pH, temperature, dissolved oxygen, and flow rate of overlying water on heavy metals release from storm sewer sediments
.
Journal of Chemistry
2013
,
11
.
doi:10.1155/2013/434012
.
Liu
T.
,
Shen
X.
&
Wang
H.
2012
Forecast Study on Forecasting Pollutant Concentration of Heavy-Metal Contaminants in Streams [Online]
.
Available from: https://en.cnki.com.cn/Article_en/CJFDTOTAL-JSJH201203007.htm (accessed 27 December 2020)
.
Martin
J. A. R.
,
Arana
C. D.
,
Ramos-Miras
J. J.
,
Gil
C.
&
Boluda
R.
2015
Impact of 70years urban growth associated with heavy metal pollution
.
Environmental Pollution
196
,
156
163
.
doi:10.1016/j.envpol.2014.10.014
.
McGowan
S.
&
Salgado-Bonnet
J.
2019
Chemical and Biological Monitoring at ANR. Unpublished Report
.
University of Nottingham
.
Meers E., Unamuno V., Vandegehuchte M., Vanbroekhoven K., Geebelen W., Samson R., Vangronsveld J., Diels L., Ruttens A., Laing G. D. & Tack F. 2005 Soil-solution speciation of Cd as affected by soil characteristics in unpolluted and polluted soils. Environmental Toxicology and Chemistry: An International Journal 24 (3), 499–509.
Mistri M., Munari C., Pagnoni A., Chenet T., Pasti L. & Cavazzini A. 2020 Accumulation of trace metals in crayfish tissues: is Procambarus clarkii a vector of pollutants in Po Delta inland waters? The European Zoological Journal 87, 46–57. doi:10.1080/24750263.2020.1717653.
Nasrabadi T., Ruegner H., Schwientek M., Bennett J., Fazel Valipour S. & Grathwohl P. 2018 Bulk metal concentrations versus total suspended solids in rivers: Time-invariant and catchment-specific relationships. PLoS ONE 13 (1), e0191314. doi:10.1371/journal.pone.0191314.
National River Flow Archive 2020 28027 - Erewash at Sandiacre [Online]. Available from: https://nrfa.ceh.ac.uk/data/station/info/28027 (accessed 15 July 2020).
NRA 1995 River Erewash Catchment Management Plan: Action Plan: August 1995. National Rivers Authority, Shrewsbury.
Pandey M., Pandey A. K., Mishra A. & Tripathi B. D. 2015 Application of chemometric analysis and self organizing map-artificial neural network as source receptor modeling for metal speciation in river sediment. Environmental Pollution 204, 64–73. doi:10.1016/j.envpol.2015.04.007.
Prabhakaran S. 2021 ARIMA Model - Complete Guide to Time Series Forecasting in Python [Online].
Rainbow P. S. 2007 Trace metal bioaccumulation: Models, metabolic availability and toxicity. Environment International 33, 576–582. doi:10.1016/j.envint.2006.05.007.
Rao Y., Xu S. & Xiong L. 2011 Time series prediction of heavy metal contamination in mining areas based on exponential smoothing model. In: International Conference on Information Science and Technology. pp. 1318–1322. IEEE. doi:10.1109/ICIST.2011.5765081.
Rizal N., Umarie I., Munandar K. & Wardoyo A. 2023 Calibration and validation of CN values for watershed hydrological response. Civil Engineering Journal 9, 72–85. doi:10.28991/CEJ-2023-09-01-06.
Rooki R., Ardejani F. D., Aryafar A. & Asadi A. B. 2011 Prediction of heavy metals in acid mine drainage using artificial neural network from the Shur River of the Sarcheshmeh porphyry copper mine, Southeast Iran. Environmental Earth Sciences 64 (5), 1303–1316.
Rowland P., Neal C., Sleep D., Scholefield P. & Vincent C. 2011 Chemical quality status of rivers for the water framework directive: a case study of toxic metals in North West England. Water 3, 649–666. doi:10.3390/w3020650.
Sauvé S., McBride M. B. & Hendershot W. H. 1997 Speciation of lead in contaminated soils. Environmental Pollution 98, 149–155.
Schober P., Boer C. & Schwarte L. A. 2018 Correlation coefficients: Appropriate use and interpretation. Anesthesia and Analgesia 126 (5), 1763–1768.
Siddique M. A. M., Rahman M., Shahriar M. A. R., Hassan M. R., Fardous Z., Chowdhury M. A. Z. & Hossain M. B. 2020 Assessment of heavy metal contamination in the surficial sediments from the lower Meghna River estuary, Noakhali coast, Bangladesh: A baseline study. International Journal of Sediment Research. Elsevier. doi:10.1016/j.ijsrc.2020.10.010.
Sodango T. H., Li X., Sha J. & Bao Z. 2018 Review of the spatial distribution, source and extent of heavy metal pollution of soil in China: Impacts and mitigation approaches. Journal of Health and Pollution 8, 53–70. doi:10.5696/2156-9614-8.17.53.
Sow A. Y., Dee K. H., Lee S. W., Aweng A. L. & Rak E. 2019 An assessment of heavy metals toxicity in Asian clam, Corbicula fluminea, from Mekong River, Pa Sak River, and Lopburi River, Thailand. The Scientific World Journal 2019, 5. 1615298. doi:10.1155/2019/1615298.
Tipping E., Rieuwerts J., Pan G., Ashmore M. R., Lofts S., Hill M. T. R., Farago M. E. & Thornton I. 2003 The solid–solution partitioning of heavy metals (Cu, Zn, Cd, Pb) in upland soils of England and Wales. Environmental Pollution 125 (2), 213–225.
Tunde O. L. & Oluwagbenga A. P. 2020 Assessment of heavy metals contamination and sediment quality in Ondo coastal marine area, Nigeria. Journal of African Earth Sciences 170, 103903. doi:10.1016/j.jafrearsci.2020.103903.
Usman Q. A., Muhammad S., Ali W., Yousaf S. & Jadoon I. A. K. 2020 Spatial distribution and provenance of heavy metal contamination in the sediments of the Indus River and its tributaries, North Pakistan: Evaluation of pollution and potential risks. Environmental Technology and Innovation 21, 101184. doi:10.1016/j.eti.2020.101184.
Vervier P., Pinheiro A., Fabre A., Pinay G. & Fustec E. 1999 Spatial changes in the modalities of N and P inputs in a rural river network. Water Research 33 (1), 95–104.
Xu L. & Liu S. 2013 Study of short-term water quality prediction model based on wavelet neural network. Mathematical and Computer Modelling 58 (3–4), 807–813.
Yaho H., Zhuang W., Qian Y., Xia B., Yang Y. & Qian X. 2016 Estimating and predicting metal concentration using online turbidity values and water quality models in two rivers of the Taihu Basin, Eastern China. PLoS ONE 11 (3), e0152491. doi:10.1371/journal.pone.0152491.
Zhou Q., Zhang J., Fu J., Shi J. & Jiang G. 2008 Biomonitoring: An appealing tool for assessment of metal pollution in the aquatic ecosystem. Analytica Chimica Acta 606, 135–150. doi:10.1016/j.aca.2007.11.018.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).