Application of neural network techniques to predict the heavy metals in acid mine drainage from South African mines

Acid mine drainage (AMD) is the formation and movement of highly acidic water rich in heavy metals. Prediction of heavy metals in AMD is important in developing any appropriate remediation strategy. This paper attempts to predict heavy metals in AMD (Zn, Fe, Mn, Si and Ni) from South African mines using neural network (NN) techniques. The backpropagation (BP) neural network model has three layers, with an input layer (pH, SO₄²⁻ and total dissolved solids (TDS)) and an output layer (Cu, Fe, Mn and Zn). After BP training, the NN techniques were able to predict heavy metals in AMD with a tangent sigmoid transfer function (tansig) at the hidden layer with five neurons and a linear transfer function (purelin) at the output layer. The Levenberg-Marquardt back-propagation (trainlm) algorithm was found to be the best of 10 BP algorithms, with a mean-squared error (MSE) of 0.00041 and a coefficient of determination (R) for all data (training, validation and test) of 0.99984. The results indicate that NN can be considered an easy and cost-effective technique to predict heavy metals in AMD.


INTRODUCTION
Mining activities are the major contributors to acid mine drainage (AMD), which poses considerable risk to the environment (Fosso-Kankeu 2018). The environmental impacts of AMD are caused by its low pH and high concentrations of dissolved metals, which are toxic (Rodríguez-Galán et al. 2019). The most toxic dissolved metal in AMD is Fe, as presented in Table 1. AMD from mining activities has been reported to derive mostly from the oxidation of pyrite. Pyrite oxidation occurs in two stages: the initial stage produces sulphuric acid and ferrous sulphate (FeSO₄), and the next produces orange-red ferric hydroxide (Fe(OH)₃) and additional sulphuric acid (McCarthy 2011). The combination of water, oxygen and oxidizing bacteria causes the pyrite and other sulphide minerals in mine wastes to oxidize and form AMD. Pyrite oxidation is quite complex; therefore, different reactions can represent it under different conditions (Kefeni et al. 2017). Some examples of reactions for the major and common pyrite oxidation processes are as follows. The initial reaction involves the oxidation of the sulphide mineral into dissolved Fe²⁺, SO₄²⁻ and H⁺, as illustrated by Equation (1). The dissolved Fe²⁺, SO₄²⁻ and H⁺ represent an increase in total dissolved solids (TDS) and acidity in the water. If they are not neutralized, they cause the pH to decrease.
If the environment is sufficiently oxidizing, which depends on the oxygen concentration, pH and bacterial activity, much of the ferrous iron (Fe²⁺) will be oxidized to ferric iron (Fe³⁺), as shown in Equation (2). At pH values between 2.3 and 3.5, Fe³⁺ precipitates as Fe(OH)₃ and jarosite, leaving little Fe³⁺ in solution while simultaneously lowering the pH, as shown in Equation (3). Any Fe³⁺ from Equation (2) that does not precipitate from solution through Equation (3) can be used to oxidize additional pyrite, as shown in Equation (4). A combination of Equations (1)-(3) represents acid generation that produces Fe²⁺, which eventually precipitates as Fe(OH)₃, and gives Equation (5). The overall equation for stable Fe³⁺ that is used to oxidize additional pyrite, also described as a combination of Equations (1)-(3), is Equation (6). All the equations, except for Equations (2) and (3), assume that the mineral being oxidized is pyrite and that oxygen is the oxidant. Other sulphide minerals, such as pyrrhotite (FeS) and chalcocite (Cu₂S), have other metal-to-sulphide ratios and contain metals other than Fe. Additional oxidants and sulphide minerals follow different reaction pathways, rates and stoichiometries; however, research on these variations is limited (Akcil & Koldas 2005).
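The numbered reactions cited above can be written out explicitly. As a hedged reference, the forms below are the stoichiometries commonly given for these pyrite oxidation steps in the AMD literature (for example in reviews such as Akcil & Koldas 2005). The assignment of equation numbers (1) to (6) is an assumption based on the order in which the text cites them.

```latex
\begin{align}
\mathrm{FeS_2} + \tfrac{7}{2}\,\mathrm{O_2} + \mathrm{H_2O}
  &\rightarrow \mathrm{Fe^{2+}} + 2\,\mathrm{SO_4^{2-}} + 2\,\mathrm{H^+} \tag{1}\\
\mathrm{Fe^{2+}} + \tfrac{1}{4}\,\mathrm{O_2} + \mathrm{H^+}
  &\rightarrow \mathrm{Fe^{3+}} + \tfrac{1}{2}\,\mathrm{H_2O} \tag{2}\\
\mathrm{Fe^{3+}} + 3\,\mathrm{H_2O}
  &\rightarrow \mathrm{Fe(OH)_3}\,(s) + 3\,\mathrm{H^+} \tag{3}\\
\mathrm{FeS_2} + 14\,\mathrm{Fe^{3+}} + 8\,\mathrm{H_2O}
  &\rightarrow 15\,\mathrm{Fe^{2+}} + 2\,\mathrm{SO_4^{2-}} + 16\,\mathrm{H^+} \tag{4}\\
\mathrm{FeS_2} + \tfrac{15}{4}\,\mathrm{O_2} + \tfrac{7}{2}\,\mathrm{H_2O}
  &\rightarrow \mathrm{Fe(OH)_3}\,(s) + 2\,\mathrm{SO_4^{2-}} + 4\,\mathrm{H^+} \tag{5}\\
\mathrm{FeS_2} + \tfrac{15}{8}\,\mathrm{O_2} + \tfrac{13}{2}\,\mathrm{Fe^{3+}} + \tfrac{17}{4}\,\mathrm{H_2O}
  &\rightarrow \tfrac{15}{2}\,\mathrm{Fe^{2+}} + 2\,\mathrm{SO_4^{2-}} + \tfrac{17}{2}\,\mathrm{H^+} \tag{6}
\end{align}
```

Each reaction balances in Fe, S, O, H and charge. Adding (1), (2) and (3) term by term reproduces (5), which matches the description of (5) as their combination.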
In South Africa and other parts of the world, ground and surface water systems are interconnected, which makes it possible for high volumes of contaminated water to affect the water supply (Ghadimi 2015). Aluminium (Al) can cause a number of problems within the distribution network: aluminium precipitates can increase the final water turbidity and may enmesh micro-organisms, obstructing the disinfection process (Arendze & Sibiya 2017). The presence of iron (Fe) and manganese (Mn) results in staining as well as offensive tastes and appearances, which is an aesthetic and operational concern. In moderate doses, however, these metals are considered essential nutrients for human health (Demiral et al. 2021). Other heavy metals with problematic effects include zinc (Zn). High concentrations of Zn in human beings cause serious health problems such as nausea, vomiting, anaemia, stomach cramps, skin irritation and cholesterol problems (Fu & Wang 2011). Nickel (Ni) exceeding its critical level might cause serious lung and kidney problems, gastrointestinal distress, pulmonary fibrosis and skin dermatitis (Kabuba & Banza 2020). High concentrations of copper (Cu) cause illnesses of the brain, kidneys, immunological system and haematological system. Copper consumption may also lead to serious toxicological concerns, since it can be deposited in the brain, skin and liver (Moharbi et al. 2020). These toxic heavy metals are becoming serious environmental problems and cause severe health problems.
Various investigations have been conducted on the behaviour of the heavy metals in AMD (Cánovas et al. 2007; Rooki et al. 2011; Fosso-Kankeu 2018). The conventional technique of measuring the heavy metals involves sampling followed by time-consuming and costly laboratory analysis. A technique that predicts the concentrations of heavy metals in water affected by AMD is therefore needed to support remediation, monitoring and a comprehensive assessment of the potential environmental impact of AMD. Few studies have been conducted on the prediction of heavy metals in AMD using neural networks (NNs).
NNs are information-processing systems modelled on the structure of a biological neural system (Rooki et al. 2011; Kabuba et al. 2014). Because of the complexity of the processes involved, heavy metal concentrations are difficult to predict using conventional mathematical modelling. NNs are considered a promising method because of their simplicity in prediction and modelling (Elmolla et al. 2010). NNs have been applied in many diverse fields over the years, and such applications have been found successful and satisfactory. They include pattern recognition, signal and image processing, system identification and modelling, and stock market prediction. This success can be attributed to the fact that NNs process information in parallel and are therefore able to extract nonlinear and complex relationships (Ghadimi 2015). The main objective of the study is to develop an NN model to predict heavy metals in acid mine drainage from South African mines.

METHODOLOGY

Collection of raw and neutralized AMD samples
The raw samples were collected from the pipeline where acid water enters the treatment plant (influent). The raw sample is the acid water prior to any chemical addition. The point of sample collection was opened and water was allowed to run for 10 min. The samples were then taken using 2 L bottles and immediately stored in a dark space.
Water Science & Technology Vol 84 No 12, 3492

The water sample looked like pure water until it was allowed to stand; during storage, further oxidation of dissolved Fe²⁺ ions to ferric ions (Fe³⁺) takes place in the sample. The original colour of the wastewater samples changes to brownish and reddish. Thereafter, hydrolysis of Fe³⁺ with water forms solid Fe(OH)₃ (ferrihydrite), which is orange-red in colour, and releases additional acidity. This reaction is pH dependent: under very acidic conditions, below about pH 3.5, the solid mineral does not form and Fe³⁺ remains in solution. However, the sample pH was about 4, which is above this value, and a precipitate commonly referred to as 'yellow boy' was formed (Fosso-Kankeu 2018). The neutralized sample was taken from an area in the plant where treatment had already taken place by adding 95% Ca(OH)₂ for neutralization. The solid precipitates were redissolved into the water sample by acidification with HNO₃ prior to analysis.

Analysis for AMD properties and analytical techniques
• Inductively Coupled Plasma-Optical Emission Spectrometry (ICP-OES) analysis
The acid mine water samples were submitted on the day of collection to Setpoint laboratories in Johannesburg (South Africa) for ICP-OES analysis using a Varian 700-ES instrument. The water samples were characterized by ICP-OES to determine the concentrations of the inorganic elements they contained. Prepared standards of the inorganic elements to be detected were used to calibrate the ICP-OES equipment for the analysis of the selected inorganic elements (transition elements, lighter inorganic elements and hazardous heavy elements) in the leachate samples (Olesik et al. 1994). For digestion, 0.2 g of reference material was used. Wet digestion of samples was performed using 5 mL of an acid mixture, HNO₃:HClO₄ (3:1). Each sample was heated to 180°C for 3 h on the heating digestion block. The acid digest was then allowed to cool, filtered into a 25 mL volumetric flask using Whatman filter paper and made up to the mark with de-ionized water. The blank digest sample was processed in the same way. Calibration curves for the elements were created from mixtures of high-purity element standard solutions (Merck, Germany) in a 4% (v/v) HNO₃ matrix.

• Ion chromatography (IC) analysis
A sample of about 20 μL was injected into a Metrosep A Supp 4-250/4 anion-exchange column (stationary phase), which was held at 25°C, with a pressure of 5.83 MPa and a flow rate of 1.00 mL/min. Before analysis, the liquid samples were filtered through 0.22 μm Millipore filter paper. The IC analysis was carried out under isocratic conditions using disodium carbonate (Na₂CO₃) (1.8 mmol/L) and sodium hydrogen carbonate (NaHCO₃) (1.7 mmol/L) as the mobile phase at pH 10.30. Different standards were used during the IC analysis, including fluoride 2.0 mg/L, chloride 2.0 mg/L, nitrite 5.0 mg/L, bromide 10.0 mg/L and nitrate 10.0 mg/L.

• EC, TDS and pH values measurements
The electrical conductivity (EC), which indicates the level of salinity of the water, the TDS and the pH of the water samples were recorded using a Hanna HI 991301 pH meter with a portable EC/pH/TDS/temperature probe.

• NN model
In this work, a data set of 28 samples was collected to develop the NN model and was divided into an input matrix and a target matrix. The input variables for the NN model were selected based on the physical and chemical characteristics that have the greatest impact on AMD. These characteristics appear mostly in water that contains heavy metals and are considered to depend most strongly on the heavy metal content. The targets were chosen as the heavy metals that occur in high concentrations in the samples taken, are considered unhealthy in service water and need to be removed before the treated AMD is released to the public. The input variables were pH, SO₄²⁻ and TDS, while the targets were Zn, Fe, Mn, Si and Ni.
Zn, Fe, Mn, Si and Ni were selected because they were found in high concentrations in the Western Basin AMD plant samples, in agreement with the literature.

Data processing
A database is critical when modelling with an NN. In the first part, the database was generated by collecting a large number of data points from the experimental data. After all the experimental results had been evaluated, the collected data were arranged as a set of input vectors forming the columns of a matrix. A second set of target vectors (the correct output vector for each input vector) was then arranged in a second matrix in an MS Excel sheet. The input variables were pH, SO₄²⁻ and TDS. The corresponding Zn, Fe, Mn, Si and Ni concentrations were used as targets. To ensure that all variables in the input data are important, principal component analysis (PCA) was performed as an effective procedure for the determination of input parameters. In this research, a multi-layer perceptron (MLP) type BPNN was used for modelling.
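The arrangement of inputs and targets as column vectors, followed by a PCA check on the inputs, can be sketched as follows. This is a minimal illustration and not the authors' procedure: the numerical values are synthetic placeholders rather than the measured data, NumPy is assumed to be available, and PCA is carried out here by eigen-decomposition of the correlation matrix, which is only one of several ways it could have been done.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 28  # number of samples stated in the text

# Synthetic placeholder values (NOT the measured data).
pH = rng.uniform(2.5, 8.5, n_samples)
sulphate = rng.uniform(1000.0, 2000.0, n_samples)          # mg/L
tds = 1.4 * sulphate + rng.normal(0.0, 50.0, n_samples)    # mg/L

# One column per sample: 3 x 28 input matrix, 5 x 28 target matrix.
inputs = np.vstack([pH, sulphate, tds])
targets = rng.uniform(0.0, 300.0, (5, n_samples))  # Zn, Fe, Mn, Si, Ni placeholders


def pca_explained_variance(X):
    """PCA on a (variables x samples) matrix via the correlation matrix.

    Returns the explained-variance ratio of each component (descending)
    and the corresponding loading vectors as columns.
    """
    Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)
    corr = np.cov(Z)  # rows are treated as variables
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs


ratio, loadings = pca_explained_variance(inputs)
print("explained variance ratio:", np.round(ratio, 3))
```

A component with a very small explained-variance ratio suggests that one input is nearly a linear combination of the others, which is the kind of redundancy a PCA screening step is meant to reveal.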

NN design procedure
Five important aspects that must be determined in the design procedure of NNs are as follows: (1) selection of the BP training algorithm, (2) data distribution, (3) selection of the NN structure, (4) selection of the initial weight and (5) sensitivity analysis.

Selection of inputs, targets and functions
The data points, which are the results of the laboratory analyses, were exported to the Neural Network Toolbox V4.0 in MATLAB 2020a. The three inputs and five targets were defined from the exported data. The training, adaptation learning and performance functions were then selected.

Train NN using information and parameters
Training was done using the inputs and targets as training information. The training parameters include the number of epochs, the performance goal, the maximum training time and the command-line display setting.

Trial and error
The NN was tested using the parameters mentioned above as well as the gradient. The validation check parameter was used to check the validity of the NN, and the overall data regression was examined in the plots of the NN. The fit was then inspected; if the network was over- or under-fitted, the weights were re-initialized and edited. The cycle of training, testing, validation and regression was repeated until the fit was good. The outputs and errors were then retrieved, and the outputs were compared with the targets.
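The stopping logic implied by the training parameters (epochs, goal) and the validation check can be sketched as below. This is an illustrative reconstruction, not the toolbox's implementation: the function name `stopping_epoch`, the error histories and the limit of three validation failures are all hypothetical.

```python
def stopping_epoch(train_mse, val_mse, goal=0.0, max_epochs=1000, max_fail=6):
    """Return (stop_epoch, best_epoch, reason) for recorded error histories.

    Training stops when the training MSE reaches the goal, when the
    validation MSE has failed to improve max_fail times in a row
    (the validation check), or when max_epochs is reached.
    """
    best_val = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, (tr, va) in enumerate(zip(train_mse, val_mse)):
        if va < best_val:
            best_val, best_epoch, fails = va, epoch, 0
        else:
            fails += 1
        if tr <= goal:
            return epoch, best_epoch, "goal reached"
        if fails >= max_fail:
            return epoch, best_epoch, "validation checks"
        if epoch + 1 >= max_epochs:
            return epoch, best_epoch, "max epochs"
    return len(train_mse) - 1, best_epoch, "end of history"


# Hypothetical histories: training error keeps falling, validation error
# starts rising after epoch 2, which signals over-fitting.
train_hist = [0.50, 0.30, 0.20, 0.10, 0.08, 0.06, 0.05, 0.04]
val_hist = [0.60, 0.40, 0.30, 0.31, 0.32, 0.33, 0.34, 0.35]
result = stopping_epoch(train_hist, val_hist, max_fail=3)
print(result)
```

In this example the weights from `best_epoch` would be kept, since later epochs only improve the fit to the training subset.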

Analysis of raw AMD and treated AMD samples
The water characteristics of the raw samples and of the samples treated with Ca(OH)₂ are shown in Table 2. The results are for two representative samples.
The results obtained indicated that the lowest recorded pH of the raw AMD was 4.17 and that the sulphate content was between 1,627 mg/L (minimum) and 1,634 mg/L (maximum), which is very high. The maximum sulphate concentration at which human health is not affected is 600 mg/L (Penner et al. 2020). This confirms that the samples analysed were indeed AMD. According to Moodley et al. (2017), AMD is generally characterized by low pH, high heavy metal content and high salinity, although the sulphate and metal concentrations in the water vary from mine to mine.
Samples treated with Ca(OH)₂ met cleaner water standards. The data show an increase in pH from 2.57 to a maximum of 8.26 due to the addition of Ca(OH)₂. This also increased the total alkalinity of the treated samples due to the addition of Ca. An insignificant reduction in TDS is observed in the neutralized solutions compared with the raw solutions, which means that treatment of AMD has very little impact on TDS. This can be attributed to the fact that ions such as Al, Ag, Cr, Mo, Fe and Pb react with OH⁻ from the added Ca(OH)₂ at the high pH of 8.2 to form metal hydroxide precipitates (Penner et al. 2020). Conductivity indicates the level of salinity of water, and AMD conductivity is high due to high salinity. The addition of Ca(OH)₂ increases the pH and lowers the salinity of the AMD, and therefore the conductivity of the treated solution also decreased, as seen in Table 2. Heavy metals were identified, with Zn, Fe, Mn, Si and Ni being the ones with high concentrations. Heavy metal analysis was done on the raw and treated AMD samples to determine whether the heavy metals were removed by the neutralization process in order to bring the water to cleaner standards. Table 3 shows that in the presence of oxygen, ferrous iron was oxidized to ferric iron, which precipitated at a pH lower than 4 in other samples.
Ferric hydroxide formed a yellowish-orange solid known as yellow boy, which usually precipitates at a pH greater than 3.5. Therefore, when the pH increased to 8.26, ferric hydroxide was precipitated. In all samples, Fe went from high concentrations to <10.0 mg/L, which is below the detection limit. This can be seen in sample 1, where the concentration of Fe went from 289 mg/L to <10.0 mg/L. Manganese (Mn) precipitation is variable due to its many oxidation states, but it generally precipitates at a pH of 9.0 to 9.5. In samples that reached a pH of 9.0, it was precipitated. In samples that did not reach a pH of 9.0, Fe precipitation largely removed Mn from the water at pH 8 through co-precipitation, because the iron concentration in the water was much greater than the manganese content. Some of the Mn remains dissolved, in very small concentrations.
Complete precipitation of Zn occurs at pH 10.1; however, the pH of the samples was only increased to 8.26. At these pH values, most of the Zn forms zinc hydroxide and precipitates out of solution, while some of the Zn remains as dissolved Zn. In sample 1, for instance, the concentration of Zn was reduced from 0.33 mg/L to 0.07 mg/L. The same concept applies for Ni, which completely precipitates at pH 10.8 to 11. At pH between 7.0 and 8.0, most of the Ni forms nickel hydroxide and precipitates out of solution, leaving the rest as dissolved Ni. In sample 1, for instance, the concentration of Ni was reduced from 0.521 mg/L to 0.0272 mg/L.
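The reductions quoted for sample 1 can be expressed as removal percentages with a short calculation. The sketch below uses only the concentrations given in the text; the helper name `removal_efficiency` is hypothetical, and for Fe the treated value is reported only as below 10.0 mg/L, so the result is a lower bound.

```python
def removal_efficiency(c_raw, c_treated):
    """Percentage of a metal removed, from raw and treated concentrations (mg/L)."""
    if c_raw <= 0:
        raise ValueError("raw concentration must be positive")
    return 100.0 * (c_raw - c_treated) / c_raw


# Sample 1 concentrations quoted in the text (mg/L): (raw, treated).
sample_1 = {"Zn": (0.33, 0.07), "Ni": (0.521, 0.0272)}
efficiency = {m: removal_efficiency(raw, treated) for m, (raw, treated) in sample_1.items()}

# Fe is reported as < 10.0 mg/L after treatment, so this is a lower bound.
fe_lower_bound = removal_efficiency(289.0, 10.0)

for metal, value in efficiency.items():
    print(f"{metal}: {value:.1f}% removed")
print(f"Fe: at least {fe_lower_bound:.1f}% removed")
```

On these figures, roughly 79% of the Zn and 95% of the Ni were removed at pH 8.26, consistent with incomplete precipitation below the pH values quoted for complete removal.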

Application of NN
According to Toma et al. (2004), four important aspects must be determined when designing an NN: (1) selection of the BP training algorithm, (2) data distribution, (3) selection of the NN structure and (4) selection of the initial weight.
• Selection of the BP training algorithm
Ten BP training algorithms were compared, as illustrated in Table 4, in order to select the most suitable BP training algorithm.
The NNToolbox was used, into which the laboratory results of the samples were imported. For all the BP training algorithms, a three-layer NN with a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer was used. The chosen training algorithm was the Levenberg-Marquardt back-propagation (trainlm) because it had the smallest mean squared error (MSE), 0.00041, meaning that the error of this algorithm is very low. For cross validation of the NN, the coefficient of determination (R) and the MSE were evaluated. The MSE is defined in Equation (7):

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(SP_{\mathrm{cal},i} - SP_{\mathrm{exp},i}\right)^{2} \qquad (7)$$

where the subscripts cal and exp denote calculated and experimental values of SP (set point), respectively, and N is the number of validation and training data. The line of best fit for the data set used in this experiment, also known as the BLE, is described by Equation (8):

$$y = a + bX \qquad (8)$$

This equation allows the value of a dependent variable (y) to be estimated from a given independent variable (X), where b is the slope of the line and a is the intercept. The best linear equation chosen after training and testing was y = x + 1.4. It gave a clear straight line, with the R value for training, validation and test being 0.99908.
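The three quantities used to judge each algorithm (the MSE, the correlation R between predicted and experimental values, and the intercept and slope of the best linear equation) can be computed as in the sketch below. The function names and the five experimental and predicted values are made up for illustration and are not taken from the study.

```python
import math


def mse(calculated, experimental):
    """Mean squared error between calculated and experimental values."""
    n = len(experimental)
    return sum((c - e) ** 2 for c, e in zip(calculated, experimental)) / n


def linear_fit(x, y):
    """Least-squares line y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def pearson_r(x, y):
    """Correlation coefficient R between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical experimental targets and network outputs (mg/L).
experimental = [0.07, 0.33, 1.20, 2.50, 4.00]
predicted = [0.08, 0.31, 1.25, 2.45, 4.02]

error = mse(predicted, experimental)
intercept, slope = linear_fit(experimental, predicted)
r_value = pearson_r(experimental, predicted)
print(f"MSE = {error:.5f}, R = {r_value:.5f}, y = {slope:.4f}x + {intercept:.4f}")
```

A perfect prediction would give a slope of 1, an intercept of 0, an R of 1 and an MSE of 0, so all four numbers are worth reading together rather than relying on R alone.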

Data distribution
The NN model was based on the selected BP algorithm, Levenberg-Marquardt back-propagation (trainlm), which was applied to train the NN on the experimental data. During training, the output matrix was computed by a forward pass (feed-forward BP NN), in which the input matrix is propagated forward through the network to compute the output value of each unit. The output matrix was then compared with the desired matrix, which results in an error signal for each output unit. In order to minimize the error, appropriate adjustments were made to each of the weights of the network. Training with the Levenberg-Marquardt algorithm (LMA) was stopped at the iteration where the difference between the training errors and the validation errors started to increase. Eight iterations of training were performed, with different weights being adjusted. Figures 1-10 present the different algorithms: traingd, traingdm, trainbfg, traincgf, trainoss, traincgp, trainscg, traingdx, and trainlm.
The optimum algorithm was found to be trainlm for training, validation and test MSE, as shown in Figure 10. The Levenberg-Marquardt back-propagation (trainlm) algorithm resulted in an R value of 0.99993 during training, 0.99998 during validation and 0.9993 during testing, with an overall R of 0.99984 across all stages. This means that the output predicted by the network is nearly an exact fit to the output from the laboratory analysis, as also shown by the MSE of 0.00041. This algorithm gave the optimum structure because it had the smallest MSE value and its BLE showed a better fit than those of the other algorithms. The Levenberg-Marquardt back-propagation is therefore the ideal algorithm for training the NNToolbox to predict heavy metals in mine water. The training parameters for this algorithm are presented in Table 5.

Selection of NN structure
To determine the best performance of the NN structure, the optimal network architecture had to be defined. The number of hidden layers and the number of neurons in each were determined based on the minimum MSE of the training and prediction sets. The minimum MSE was 0.00041, obtained using the Levenberg-Marquardt back-propagation (trainlm). Figure 11 shows the optimal structure: 3-5-6, with three neurons in the input layer, five neurons in the hidden layer and six neurons in the output layer.
The network was fully connected, meaning that every neuron in each layer is connected to every neuron in the next layer. The NN structure is named after the number of neurons in each layer. Figure 12 shows the optimum NN structure in detail when the Levenberg-Marquardt back-propagation (trainlm) is applied to a three-layer NN with a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer. There is only one hidden layer, which consists of five neurons.
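The forward pass, error computation and weight adjustment described above can be illustrated with a compact Levenberg-Marquardt loop. This is a hedged sketch and not the trainlm implementation: it assumes NumPy, uses a finite-difference Jacobian for brevity, trains on synthetic data in place of the 28 measured samples, and the damping factors (multiply or divide `mu` by 10) and the names `train_lm`, `residuals` and `jacobian` are illustrative choices.

```python
import numpy as np


def unpack(theta, n_in, n_hid, n_out):
    """Split the flat parameter vector into layer weights and biases."""
    a = n_hid * n_in
    b = a + n_hid
    c = b + n_out * n_hid
    return (theta[:a].reshape(n_hid, n_in), theta[a:b],
            theta[b:c].reshape(n_out, n_hid), theta[c:])


def residuals(theta, X, T, sizes):
    """Forward pass (tanh hidden layer, linear output) minus targets, flattened."""
    W1, b1, W2, b2 = unpack(theta, *sizes)
    hidden = np.tanh(W1 @ X + b1[:, None])
    return (W2 @ hidden + b2[:, None] - T).ravel()


def jacobian(theta, X, T, sizes, eps=1e-6):
    """Finite-difference Jacobian of the residuals (simple, not fast)."""
    r0 = residuals(theta, X, T, sizes)
    J = np.empty((r0.size, theta.size))
    for j in range(theta.size):
        shifted = theta.copy()
        shifted[j] += eps
        J[:, j] = (residuals(shifted, X, T, sizes) - r0) / eps
    return J


def train_lm(X, T, n_hid=5, mu=1e-3, epochs=50, mu_max=1e10, seed=0):
    """Levenberg-Marquardt training; returns the parameters and MSE history."""
    sizes = (X.shape[0], n_hid, T.shape[0])
    n_par = n_hid * (sizes[0] + 1) + sizes[2] * (n_hid + 1)
    theta = np.random.default_rng(seed).uniform(-0.5, 0.5, n_par)
    r = residuals(theta, X, T, sizes)
    history = [float(np.mean(r ** 2))]
    for _ in range(epochs):
        J = jacobian(theta, X, T, sizes)
        A, g = J.T @ J, J.T @ r
        while mu <= mu_max:
            step = np.linalg.solve(A + mu * np.eye(n_par), -g)
            r_new = residuals(theta + step, X, T, sizes)
            if np.mean(r_new ** 2) < history[-1]:
                # Step accepted: keep it and trust the quadratic model more.
                theta, r, mu = theta + step, r_new, max(mu / 10, 1e-12)
                break
            mu *= 10  # Step rejected: move towards gradient descent.
        else:
            break  # No improving step found before mu exceeded mu_max.
        history.append(float(np.mean(r ** 2)))
    return theta, history


# Synthetic stand-in for 28 samples with 3 scaled inputs and 5 targets.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (3, 28))
T = np.vstack([np.sin(X[0]), X[1] * X[2], X[0] + X[1], np.cos(X[2]), X[0] ** 2])
theta, history = train_lm(X, T)
print(f"MSE: {history[0]:.4f} -> {history[-1]:.6f} after {len(history) - 1} accepted steps")
```

Because a step is kept only when it lowers the training MSE, the recorded history never increases. A validation subset, which this sketch omits, would be needed to decide when to stop before over-fitting.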
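A fully connected three-layer network of this kind reduces to two weighted sums with a transfer function in between. The sketch below shows the forward pass and the resulting parameter count in plain Python. The layer sizes are passed as parameters; the example uses three inputs, five hidden neurons and five outputs (one per listed target metal), which is an assumption, and the weight values are arbitrary. The hyperbolic tangent is used for the hidden layer because tansig is mathematically equivalent to it.

```python
import math


def forward(x, W1, b1, W2, b2):
    """Forward pass: tanh hidden layer (tansig) followed by a linear output (purelin)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def n_parameters(n_in, n_hid, n_out):
    """Number of weights and biases in a fully connected n_in-n_hid-n_out network."""
    return n_hid * (n_in + 1) + n_out * (n_hid + 1)


# Arbitrary illustrative weights for a 3-5-5 network.
n_in, n_hid, n_out = 3, 5, 5
W1 = [[0.1] * n_in for _ in range(n_hid)]
b1 = [0.0] * n_hid
W2 = [[0.2] * n_hid for _ in range(n_out)]
b2 = [0.0] * n_out

outputs = forward([1.0, 1.0, 1.0], W1, b1, W2, b2)
total = n_parameters(n_in, n_hid, n_out)
print(outputs, total)
```

In practice the inputs (pH, sulphate, TDS) differ by orders of magnitude, so they would normally be scaled to a common range before being passed through the tanh layer.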

Selection of initial weight
According to the literature, an important problem encountered when training an NN is the determination of appropriate initial values for the connection weights (Kabuba et al. 2014). These weights are modified during utilization so as to satisfy a performance criterion. A neuron basically sums the signals arriving at its inputs, each multiplied by the corresponding weight. If the result exceeds the threshold, the neuron fires and transmits a signal at its output through a transfer function (Kabuba et al. 2014).
Effective weight initialization is associated with performance characteristics such as the time needed to successfully train the network and the generalization ability of the trained network (Adam et al. 2014). A wrong choice of initial weights can increase the training time or even cause non-convergence of the training algorithm. To decide on the In Equation (9), W is the connection weight. The superscripts 'I', 'h' and 'o' refer to the input, hidden and output layers, respectively, and the subscripts 'k', 'm' and 'n' refer to input, hidden and output neurons, respectively.
The initial weights to layer 1 from input 1 using the Levenberg-Marquardt back-propagation algorithm were the following: These weights made the algorithm the optimum one because it resulted in an MSE of 0.00041. The combination of this algorithm and these weights gave the smallest MSE of the 10 algorithms tested, meaning that the error of this algorithm is very low. These weights also made the training time short and resulted in the BLE after training and testing being y = x + 1.4. It gave a clear straight line, with the R value for training, validation and test being 0.99908.

CONCLUSION
The application of NN techniques to predict the heavy metals in AMD from South African mines has been presented. The prediction of heavy metals (Zn, Fe, Mn, Si and Ni) using NN techniques with a BP algorithm was presented and compared with the experimental data. The BP configuration giving the smallest mean-squared error was the Levenberg-Marquardt algorithm (3-6-5) with a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer. The NN predictions are very close to the experimental results, with R (for all training, validation and test data) = 0.99984 and MSE = 0.00041. The results showed that NN techniques could effectively simulate and predict the heavy metals in AMD from South African mines.