Abstract
To improve the comprehensive utilization of mine water in the arid mining areas of western China and to mitigate water scarcity, this paper introduces an intelligent optimization algorithm and a neural network into mine water quality evaluation and proposes a principal component analysis (PCA)–particle swarm optimization (PSO)–back propagation (BP) mine water quality evaluation model. First, the model uses PCA to identify the primary factors affecting mine water quality; the PSO algorithm then searches for the optimal weights and thresholds of the BP neural network, yielding a PCA–PSO–BP evaluation model with nine input nodes, nine hidden-layer nodes, and one output node. Using the Shicaocun Mine as an example, the results demonstrate that the PCA–PSO–BP model evaluates mine water quality accurately, with a prediction accuracy of 86.8255%. This demonstrates the benefit of using PSO to improve the BP neural network. This study not only offers a novel theoretical framework for assessing and forecasting water quality in mining regions, but also sets the stage for broader use of neural networks and optimization algorithms in the coal mining industry.
HIGHLIGHTS
Intelligent algorithms and neural networks are introduced into mine water quality evaluation.
A PCA–PSO–BP model is established for mine water quality evaluation.
Accurate evaluation and reasonable prediction are achieved against the background of big data.
A reference is provided for in-depth research on optimization algorithms and neural networks in the field of water quality evaluation.
INTRODUCTION
The role of coal as the primary energy source in China will not change for a very long time (Wang et al. 2019). Coal resources in China are concentrated in the West and sparse in the East (Shang et al. 2016; Wang & Luo 2018), which is exactly the opposite of the distribution of water resources. Therefore, mining coal in the arid mining areas of western China depletes water resources and exacerbates the existing water scarcity in the mines (Wang et al. 2021). According to statistics, 2.1 t of mine water is produced for every 1 t of raw coal mined, which means that a huge amount of mine water is wasted. Mine water cannot be utilized directly because it contains suspended materials, salts, heavy metals, and other components. Instead, it must be graded based on its quality (Mitko et al. 2021; Zhang et al. 2022). In order to maximize the use of mine water resources and reduce adverse environmental consequences, it is crucial to accurately evaluate the quality category of mine water.
Several water quality evaluation methods have played a significant role in advancing the field. At present, mine water quality is mostly evaluated with the single-factor evaluation method and the comprehensive pollution index method. Although the single-factor method is easy to use, it employs a single index as the reference standard and therefore does not adequately reflect the overall state of water quality, which can lead to significant deviations (Zhao et al. 2021). The Nemerow pollution index is one of the most widely used comprehensive pollution index methods in evaluating mine water quality. The method is simple to calculate and its physical meaning is easy to understand, but it emphasizes the impact of the maximum pollution index on mine water quality, so it tends to exaggerate the effects of specific pollutants; in addition, the weight of each pollutant index is not determined objectively, which can eventually lead to inaccurate conclusions (Liu et al. 2022; Sharma & Krupadam 2022). In addition to the previously mentioned methods, related theoretical techniques and models include principal component analysis (PCA) (Liu et al. 2020), Bayesian theory (Huang et al. 2019), the fuzzy comprehensive evaluation method (Liu et al. 2021), gray system theory (Guo et al. 2020), cluster analysis (Prasad et al. 2020), the analytic hierarchy process (Yu et al. 2022), the entropy weight method (Ju & Hu 2021), fuzzy variable set theory (Li et al. 2022), the matter-element extension model (Shi et al. 2018), set pair analysis (Qu et al. 2021), and multi-criteria group decision-making models (Baghapour et al. 2020). These models also support accurate classification of water quality in mining areas and the evaluation of sudden water hazards. However, they have shortcomings such as instability in the calculation process, lack of universality, and, in the definition of index thresholds, a large gap between calculated and real outcomes.
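To illustrate why the Nemerow (also written Nemerov) index emphasizes the worst indicator, the following minimal sketch computes it in its commonly cited form, the root mean square of the maximum and the average single-factor indices. The function name and the sample values are hypothetical and are not taken from the mine data.

```python
import math

def nemerow_index(concentrations, standards):
    """Nemerow index from measured concentrations and their standard limits."""
    # Single-factor indices: measured value divided by its standard limit
    ratios = [c / s for c, s in zip(concentrations, standards)]
    p_max = max(ratios)
    p_avg = sum(ratios) / len(ratios)
    # Root mean square of the maximum and the average single-factor index
    return math.sqrt((p_max ** 2 + p_avg ** 2) / 2)

# Hypothetical sample: one pollutant at 4x its limit, three others at 0.2x
index = nemerow_index([4.0, 0.2, 0.2, 0.2], [1.0, 1.0, 1.0, 1.0])
print(round(index, 3))
```

In this hypothetical case the average single-factor index is 1.15, yet the Nemerow index is about 2.94, because the single exceeding pollutant dominates the result.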
Machine learning and intelligent optimization algorithms have been used extensively in the area of artificial intelligence and big data. Zhang & Li (2019) designed a fuzzy genetic neural network model to detect atmospheric quality. Liu (2022) adopted particle swarm optimization (PSO) to improve the traditional BP neural network (BPNN) and constructed a PSO-powered BP neural network (PSO–BPNN) model for intelligent emergency risk avoidance in sudden financial disasters. Nassif (2014) used evolutionary algorithms to optimize artificial neural network models in order to reduce the energy consumption of air conditioning systems in architectural design. Liang et al. (2020) used the IFPA algorithm to optimize the initial weights and thresholds of the BPNN and built an IFPA-BP network model for the intelligent diagnosis of natural gas pipeline defects. Yang (2022) introduced a BPNN optimization algorithm based on a multidirectional mutation genetic algorithm (MMGA-BP) and applied it to the intelligent optimization of English-teaching courses. To effectively predict the stock index, Yang et al. (2019) proposed a hybrid intelligent algorithm based on brain storm optimization and PSO to optimize the parameters of the system model. Du et al. (2013) proposed an integrated learning algorithm that combines the RCDPSO_DM algorithm with a Kalman filtering algorithm to optimize the antecedent and consequent parameters of T-S FNNs, for medical applications handling complex clinical pathway variances. In applying intelligent optimization algorithms to COVID-19 identification, Baghdadi et al. (2022) proposed a method for automatically and accurately classifying COVID-19 on CT lung images using convolutional neural networks (CNNs), pre-trained models, and the sparrow search algorithm (SSA).
As can be seen, intelligent optimization algorithms and neural networks have been used in a variety of fields. However, there have been relatively few studies on the evaluation of mine water quality using intelligent optimization algorithms and neural networks, even though these technologies have strong fault tolerance, high classification accuracy, and robustness. Therefore, this study proposes to incorporate an intelligent optimization algorithm and neural network into the evaluation of mine water quality. It also innovates and develops the mine water quality evaluation model. It can support the reasonable and scientific evaluation of mine water quality, as well as lay the theoretical scientific foundation for mine water treatment, utilization, and discharge.
THEORETICAL FOUNDATION
Principal component analysis
The principal components obtained after PCA processing can reflect most of the information of the original variables with fewer indicators and are independent of each other, eliminating the information redundancy of the original numerous factors and reducing the complexity of the problem (Hsieh & Tung 2009). In addition, PCA can eliminate the influence of correlation among evaluation indicators and overcome the shortcomings of correlation among indicators and overlapping information reflected by indicators in multi-indicator evaluation, which makes the evaluation results more accurate.
The BPNN
The BPNN can handle the non-smooth, non-time-series data involved in water quality evaluation. However, it easily becomes trapped in local minima, and the difficulty of determining its weights and thresholds may lead to inaccurate mine water quality evaluation results.
Particle swarm optimization
At the end of the 20th century, Kennedy and Eberhart proposed PSO, a swarm-intelligence search method (Mendes et al. 2004). The algorithm is mainly inspired by the foraging behavior of bird flocks: each candidate solution of the optimization problem is regarded as a bird flying through the search space and is called a particle. Each particle searches for its individual best solution in its local region of the solution space, and the particles together form a swarm that obtains the swarm-best solution by communicating with one another (Clarke et al. 2014).
The PSO algorithm is efficient and searches quickly. It is therefore used here to improve the BPNN, addressing the tendency of the BPNN to fall into local optima and thereby improving model accuracy.
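For reference, the widely used inertia-weight form of the PSO update is given below. The notation is an assumption based on common usage ($\omega$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, $r_1$ and $r_2$ are uniform random numbers in $[0, 1]$, $p_i$ is the best position found by particle $i$, and $g$ is the best position found by the swarm), and it may differ in detail from the authors' own Equations (2) and (3).

```latex
v_i^{k+1} = \omega\, v_i^{k} + c_1 r_1 \left(p_i^{k} - x_i^{k}\right) + c_2 r_2 \left(g^{k} - x_i^{k}\right)
\qquad
x_i^{k+1} = x_i^{k} + v_i^{k+1}
```

The first term keeps part of the particle's previous velocity, the second pulls it toward its own best position, and the third pulls it toward the swarm's best position.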
Construction of the BPNN model based on PCA and PSO
In constructing the BPNN evaluation model of mine water quality, the accuracy of the evaluation results can be improved by appropriately increasing the number of network input variables. However, many factors affect mine water quality, and different factors are often highly correlated, which produces a large amount of redundant information. This redundancy not only degrades the accuracy of the evaluation model but also makes data processing more difficult. PCA is the main method for handling correlation between factors. By analyzing the contribution rate and cumulative contribution rate of each component, PCA identifies the main factors affecting mine water quality, so that mine water quality can be predicted accurately by considering only those main factors.
The thresholds and weights of the BPNN affect the accuracy of the neural network model. Usually, the initial weights and thresholds are assigned randomly, but this tends to make the BPNN fall into local optimal solutions. To overcome this defect and accelerate convergence, this paper adopts the PSO algorithm to optimize the BPNN. The PSO algorithm enlarges the space searched for the optimal solution and has a strong global search capability; it finds the optimal weights and thresholds through continuous iterative updating. Determining the optimal weights and thresholds in this way improves the accuracy of the BPNN model and enables a scientific evaluation of the water environment in the mining area.
The steps for constructing a BPNN model based on PCA and PSO improvements are as follows.
- (1) Determine the structure and relevant parameters of the BPNN.
- (2) Set the swarm size and the initial velocity and position of each particle. The initial position of each particle is taken as its current best position, and the best position in the swarm is taken as the global best position.
- (3) Each particle has a fitness value that reflects how good it is. After the BPNN is trained, the training error is used as the fitness value of the particle's current position and is compared with the fitness of the particle's previous best position. If the current position is better, it replaces the previous best position; otherwise, the previous best position remains unchanged.
- (4) If the best position of the current particle has a better fitness than the global best position, the global best position is replaced by that particle's best position; otherwise, it remains unchanged.
- (5) Update the velocity and position of each particle according to Equations (2) and (3).
- (6) Check whether the algorithm meets the termination condition (the number of iterations reaches the maximum, or the error reaches the preset target accuracy). If the termination condition holds, output the best weights and thresholds and proceed to simulate the model; otherwise, return to step (3).
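The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the fitness is the mean squared error of a forward pass (no per-particle back-propagation training), the hyperparameters `w`, `c1`, `c2`, the swarm size, and the iteration count are illustrative choices, and the demo data are synthetic.

```python
import numpy as np

def unpack(theta, n_in, n_hid, n_out):
    """Split a flat particle vector into network weights and thresholds."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = theta[i:i + n_out]
    return W1, b1, W2, b2

def predict(theta, X, n_in, n_hid, n_out):
    W1, b1, W2, b2 = unpack(theta, n_in, n_hid, n_out)
    return np.tanh(X @ W1 + b1) @ W2 + b2  # tanh hidden layer, linear output

def fitness(theta, X, y, shape):
    """Training mean squared error; lower is better."""
    return float(np.mean((predict(theta, X, *shape) - y) ** 2))

def pso_optimize(X, y, shape, n_particles=30, n_iter=200, target=1e-6,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    n_in, n_hid, n_out = shape
    dim = n_in * n_hid + n_hid + n_hid * n_out + n_out
    # Step 2: initial positions, velocities, personal bests and global best
    pos = rng.uniform(-1.0, 1.0, (n_particles, dim))
    vel = rng.uniform(-0.1, 0.1, (n_particles, dim))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X, y, shape) for p in pos])
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
    for _ in range(n_iter):
        # Step 5: update velocities and positions
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        # Step 3: keep each particle's best position so far
        fit = np.array([fitness(p, X, y, shape) for p in pos])
        improved = fit < pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        # Step 4: update the global best if a particle beats it
        g = int(np.argmin(pbest_fit))
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), float(pbest_fit[g])
        # Step 6: stop early once the target error is reached
        if gbest_fit <= target:
            break
    return gbest, gbest_fit

# Synthetic demo data (hypothetical): 80 samples, 9 inputs, 1 output
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(80, 9))
y_demo = np.tanh(X_demo[:, :1]) + 0.5 * X_demo[:, 1:2]
shape = (9, 9, 1)
best_theta, best_fit = pso_optimize(X_demo, y_demo, shape)
print(best_theta.shape, best_fit)
```

In a full PSO–BP workflow the returned vector would normally serve as the initial weights and thresholds for subsequent back-propagation training, which this sketch omits.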
CASE STUDY
Background
The Shicaocun coal mine is a medium-to-large coal mine in the south of the Yuanyang Lake mining area of the Ningdong coalfield, Ningxia, and falls under the administrative jurisdiction of Ningdong Town, Lingwu City. Ningxia lies in western China, where water resources are scarce, and mining activities aggravate water scarcity in the mine area. The efficient use of mine water is therefore very important for protecting mine water resources, and a comprehensive evaluation of mine water quality is a prerequisite for that efficient use.
Eighty mine water observation points were selected in the Shicaocun coal mine, and the mine water at each point was tested in the laboratory. The main testing indexes were sulfide (f1), oxygen consumption (f2), hexavalent chromium (f3), total hardness (f4), ammonia nitrogen (f5), volatile phenols (f6), sulfate (f7), chloride (f8), nitrate (f9), total dissolved solids (f10), fluoride (f11), turbidity (f12), mercury (f13), selenium (f14), total alpha radioactivity (f15), total beta radioactivity (f16), anionic surfactant (f17), sodium (f18), zinc (f19), iron (f20), cadmium (f21), lead (f22), aluminium (f23), and total bacterial colony count (f24). Some of the specific test results are shown in Supplementary material, Appendix. According to the Groundwater Quality Standard (GB/T14848-2017) (Chen et al. 2022), mine water quality is classified into the five categories shown in Table 1. Detailed indicator ranges for each groundwater quality class are provided in Supplementary material, Appendix.
Table 1. Groundwater quality standard classification
Classification | Meaning |
---|---|
Class Ⅰ | Low groundwater chemical component content, suitable for various applications. |
Class Ⅱ | Lower groundwater chemical component content, suitable for various applications. |
Class Ⅲ | Medium groundwater chemical component content, suitable for centralized domestic drinking water sources. |
Class Ⅳ | High groundwater chemical component content, suitable for agriculture and some industrial water. |
Class Ⅴ | Groundwater with high chemical component content, not suitable as a source of domestic drinking water. |
We can determine the grade of each index by comparing the mine water test results with the Groundwater Quality Standard, but we cannot determine the overall mine water quality in this way. For instance, at monitoring point 01, the lead and cadmium indicators fall in Class Ⅴ, oxygen consumption falls in Class Ⅳ, and the total colony count falls in Class Ⅰ, so the overall water quality of the monitoring point cannot be determined. The standard can therefore assess the grade of a particular indicator but not the overall water quality of the mining area. As a result, a model for evaluating the overall water quality in mining areas must be created.
Although mine water quality evaluation has been studied previously, mine water quality is affected by a variety of factors, and it is difficult to evaluate it reasonably with a single indicator or a handful of indicators. In addition, mine water quality monitoring produces a large amount of data, and traditional evaluation methods cannot make full use of it. Intelligent algorithms and neural networks can therefore be introduced into mine water evaluation to make full use of all the data and achieve accurate evaluation and prediction of mine water quality.
PCA reduction
Because the number of water quality testing indicators is large, evaluating water quality directly on the raw data is problematic: the volume of data is large, and correlations among the indicators may distort the evaluation results. Therefore, PCA is first applied to reduce the dimensionality of the water quality impact factors. The PCA was performed in SPSS with an eigenvalue greater than 1 as the extraction condition, and the results are shown in Table 2.
Table 2. Results of principal component dimensionality reduction analysis
Component | Initial eigenvalues: total | Initial eigenvalues: % of variance | Initial eigenvalues: cumulative % | Extraction sums of squared loadings: total | Extraction sums of squared loadings: % of variance | Extraction sums of squared loadings: cumulative % |
---|---|---|---|---|---|---|
1 | 3.225 | 13.437 | 13.437 | 3.225 | 13.437 | 13.437 |
2 | 2.955 | 12.313 | 25.750 | 2.955 | 12.313 | 25.750 |
3 | 2.309 | 9.619 | 35.369 | 2.309 | 9.619 | 35.369 |
4 | 2.161 | 9.004 | 44.373 | 2.161 | 9.004 | 44.373 |
5 | 1.732 | 7.215 | 51.588 | 1.732 | 7.215 | 51.588 |
6 | 1.720 | 7.167 | 58.754 | 1.720 | 7.167 | 58.754 |
7 | 1.478 | 6.157 | 64.912 | 1.478 | 6.157 | 64.912 |
8 | 1.372 | 5.718 | 70.630 | 1.372 | 5.718 | 70.630 |
9 | 1.191 | 4.963 | 75.593 | 1.191 | 4.963 | 75.593 |
10 | 0.896 | 3.734 | 79.327 | | | |
11 | 0.836 | 3.485 | 82.812 | | | |
12 | 0.804 | 3.352 | 86.164 | | | |
13 | 0.665 | 2.769 | 88.933 | | | |
14 | 0.574 | 2.393 | 91.326 | | | |
15 | 0.442 | 1.841 | 93.167 | | | |
16 | 0.433 | 1.805 | 94.972 | | | |
17 | 0.330 | 1.377 | 96.348 | | | |
18 | 0.274 | 1.143 | 97.492 | | | |
19 | 0.229 | 0.955 | 98.446 | | | |
20 | 0.153 | 0.636 | 99.082 | | | |
21 | 0.105 | 0.437 | 99.519 | | | |
22 | 0.061 | 0.256 | 99.775 | | | |
23 | 0.030 | 0.127 | 99.901 | | | |
24 | 0.024 | 0.099 | 100.000 | | | |
Extraction method: principal component analysis.
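The eigenvalue-greater-than-1 extraction rule can be sketched as follows. This is a minimal NumPy illustration rather than the SPSS procedure used in the study: the function name is hypothetical, the demo matrix is synthetic, and SPSS may scale component scores differently.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1."""
    # Standardize each indicator so that all indicators carry equal weight
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)        # correlation matrix of the indicators
    eigvals, eigvecs = np.linalg.eigh(R)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                    # extraction condition
    explained = 100.0 * eigvals / eigvals.sum()  # percentage of variance
    scores = Z @ eigvecs[:, keep]           # projected data used as model inputs
    return eigvals, explained, scores

# Synthetic demo (hypothetical): 80 samples of 24 correlated indicators
rng = np.random.default_rng(0)
latent = rng.normal(size=(80, 5))
X_demo = latent @ rng.normal(size=(5, 24)) + 0.5 * rng.normal(size=(80, 24))
eigvals, explained, scores = pca_kaiser(X_demo)
print(scores.shape, np.cumsum(explained)[scores.shape[1] - 1])
```

Because the correlation matrix of 24 standardized indicators has trace 24, the eigenvalues sum to 24, so each percentage of variance equals the eigenvalue divided by 24 (for example, 3.225 / 24 ≈ 13.44% for the first component in Table 2).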
PSO–BPNN algorithm design
RESULTS
PSO–BP model training
The PSO–BP mine water quality evaluation and prediction model was built in MATLAB R2020b. The input nodes are the nine principal components obtained by PCA dimensionality reduction, the output node is the mine water quality grade, and the number of hidden-layer nodes is set to nine according to Equation (6).
The BPNN was configured with a maximum of 1,000 training iterations, a learning rate of 0.01, a target error of 1 × 10⁻⁶, the tansig transfer function for the hidden layer, the purelin transfer function for the output layer, and trainlm as the training function.
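As a rough illustration of this configuration, the sketch below reproduces the two transfer functions in NumPy and counts the weights and thresholds of a 9–9–1 network, which is the number of values a PSO particle would need to encode. The random weights and the input sample are placeholders, not trained values from the study.

```python
import numpy as np

def tansig(x):
    """Hyperbolic tangent sigmoid; mathematically identical to tanh."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def purelin(x):
    """Linear transfer function: returns its input unchanged."""
    return x

n_in, n_hid, n_out = 9, 9, 1
# Weights and thresholds: 81 + 9 for the hidden layer, 9 + 1 for the output layer
n_params = n_in * n_hid + n_hid + n_hid * n_out + n_out

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(n_in, n_hid)), np.zeros(n_hid)
W2, b2 = rng.normal(size=(n_hid, n_out)), np.zeros(n_out)
x = rng.normal(size=(1, n_in))  # one sample of nine principal component scores
y_hat = purelin(tansig(x @ W1 + b1) @ W2 + b2)
print(n_params, y_hat.shape)
```

With this layout, each candidate solution in the swarm is a vector of 100 numbers.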
Analysis of evaluation results
Discussion
To evaluate and compare the prediction accuracy of the models objectively, this study uses the mean square error (MSE), mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) as evaluation indexes for the two models; the results are shown in Table 3. The table shows that the PSO–BP model has a MAPE of 13.1745% and thus a prediction accuracy of 86.8255%, whereas the BPNN model has a MAPE of 49.943% and a prediction accuracy of only 50.057%, which confirms that the PSO–BP model predicts more accurately than the BPNN.
Table 3. Comparison of prediction accuracy indexes of two models
Model | MSE | MAE | RMSE | MAPE |
---|---|---|---|---|
BP | 4.3987 | 1.6615 | 2.0973 | 49.943% |
PSO–BP | 0.25972 | 0.46655 | 0.50963 | 13.1745% |
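The four indexes can be computed as in the minimal sketch below. The function name and the sample grades are hypothetical, and defining accuracy as 100% minus MAPE is an inference from the reported figures (100 − 13.1745 = 86.8255) rather than a formula stated in the text.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, MAPE (%) and accuracy (100 - MAPE)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(mse))
    # MAPE divides by the true values, so they must be non-zero
    mape = float(100.0 * np.mean(np.abs(err / y_true)))
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "MAPE": mape,
            "accuracy": 100.0 - mape}

# Hypothetical true and predicted water quality grades
m = error_metrics([1, 2, 4, 5], [1, 3, 4, 4])
print(m)
```

The tabulated values are internally consistent with these definitions: each RMSE is the square root of the corresponding MSE (√4.3987 ≈ 2.0973 and √0.25972 ≈ 0.50963).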
In addition, we analyzed how the training time of the model varies with the size of the training set. With 80 training samples the training time is 32.9816 s; with 50 samples it falls to 27.7433 s; and with 30 samples it is 26.3589 s. Training time therefore increases with the amount of training data, but the relationship is not proportional: as the training set shrinks further, the reduction in training time becomes progressively smaller.
Comparing the traditional BPNN with the PSO–BP model shows that the BPNN suffers from slow convergence and long training time. This study makes full use of the global search capability of the PSO algorithm to optimize the BPNN, which improves both network prediction performance and computational efficiency. Moreover, in multi-indicator evaluation problems, a large number of indicators can lead to excessive computation. This paper therefore uses PCA as a pre-processing step for indicator dimensionality reduction, which reduces the number of evaluation indicators while retaining the main information, so that the accuracy of the evaluation results is preserved. In conclusion, the PCA–PSO–BPNN model proposed in this paper can handle multi-indicator evaluation and offers high computational efficiency and accurate evaluation results. It predicts mine water quality accurately and provides a new assessment approach for mine water quality evaluation.
CONCLUSIONS
An intelligent optimization algorithm and a neural network are introduced into mine water quality evaluation, and a PCA–PSO–BPNN mine water quality evaluation model is proposed, which provides an intelligent evaluation and prediction model for research related to mine water quality. The main conclusions are as follows.
- (1) The quality of mine water is influenced by many factors. PCA is used in this study to reduce the dimensionality of the factors influencing mine water quality. It reduces the original 24 evaluation indexes to nine principal components while limiting information loss.
- (2) The PSO algorithm is used to improve the BPNN. To overcome the difficulty of determining the thresholds and weights of the conventional BPNN, the optimal weights and thresholds are found by PSO search.
- (3) Using the water testing data from the Shicaocun Coal Mine, the PCA–PSO–BP mine water quality evaluation and prediction model was validated and compared with the predictions of the conventional BPNN. The results show that the PCA–PSO–BP water quality evaluation model achieves a prediction accuracy of 86.8255%, compared with 50.057% for the conventional BPNN.
ACKNOWLEDGEMENTS
We appreciate the comments and suggestions from anonymous reviewers.
AUTHOR CONTRIBUTIONS
J.W. was involved in conceptualization, writing – original draft, data curation, analyses of results. Y.H. was involved in conceptualization, methodology, supervision.
FUNDING
This work was supported by the National Natural Science Foundation of China (Nos 52104103, 52022107, and 52174128) and the Natural Science Foundation of Jiangsu Province (Nos BK20210499 and BK20190031).
CONSENT FOR PUBLICATION
Written informed consent for publication was obtained from all participants.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.