Monitoring point optimization in lake waters

To understand the distribution of water quality indexes in lake water, Jinghu Lake at Guangxi University was taken as the experimental site, and a radial basis function (RBF) neural network was combined with a genetic algorithm, using data collected by an unmanned ship, to study the optimal selection of monitoring points. Single-objective and multi-objective optimization of water quality parameters were tested separately and used to produce fitted distribution maps. The results show that the genetic neural network gives a clearly lower distribution error for water quality parameters than traditional isometric monitoring, and that the optimized points still yield accurate and effective data at least six weeks after optimization. A genetic neural network can therefore significantly improve the efficiency of water quality monitoring.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/). doi: 10.2166/ws.2020.147
Gaoxuan Liu, Jiaoyan Ai (corresponding author), Jun Xu, Jianwu Zheng, Dongyi Yao. School of Electrical Engineering, Guangxi University, Nanning, Guangxi 530000, China. E-mail: shinin@vip.163.com


INTRODUCTION
With the increase in outdoor recreational activities, the impact of human activity on natural environments such as lakes has become a problem (Li). Water quality monitoring is therefore particularly important for tracking changes and trends in water quality. In water quality monitoring, the layout of the monitoring points directly affects the efficiency and accuracy of the monitoring work.
Optimal selection of monitoring points improves the working efficiency of measurement staff and reduces costs (Wang et al.). Long-term, effective data collection and analysis of lake waters helps reveal the patterns of lake environmental change and the distribution of various parameters, so that water pollution can be prevented and controlled in a timely and effective manner (Bai et al.).
In recent years, as water resources have received more attention, better ways of managing them have been explored. Lakes are commonly monitored by fixed water quality monitoring stations, but these are costly to build and difficult to maintain (Liu et al.). Mobile water quality monitoring is a feasible alternative, but existing mobile monitoring equipment is bulky, inconvenient to carry, energy-hungry and prone to causing secondary pollution. We therefore designed an autonomous mobile water quality monitoring system. It consists mainly of a monitoring platform (an unmanned water quality monitoring ship), a ground control terminal, a remote client and a hand-held terminal (Figure 1). The system moves autonomously along edited paths and performs on-line monitoring of temperature, pH, dissolved oxygen, conductivity and chlorophyll a.
At present, common methods for optimizing lake water monitoring points include cluster analysis, the dynamic closeness method, correspondence analysis, matter-element analysis and other mathematical statistics methods. Cluster analysis is simple to apply to small samples but ignores the spatial interconnection of the data (Mahbub et al.). The dynamic closeness method can reflect the dynamic changes of water quality parameters at different times and perform cluster analysis, but does not reflect the overall spatial distribution (Cui et al.). Correspondence analysis can intuitively place many sample variables on the same graph, but its results differ greatly between evaluation environments (Zheng et al.). Matter-element analysis is conceptually clear and simple to calculate, but does not consider the actual geographical location and environment of the monitoring points (Wang et al.).
Based on the above, this experiment uses an unmanned water quality monitoring ship, relying on its fast and efficient data collection, and combines a genetic algorithm with a radial basis function (RBF) neural network. The genetic algorithm follows the rule of 'survival of the fittest' and has good global search ability. The RBF neural network generalizes well in spatial fitting of data and converges quickly in learning. The two complement each other: the combination meets the experimental requirements for robustness and accuracy while retaining fast convergence, and reflects the parameter distribution of the waters (Simon).

Data sources
Jinghu Lake is located at Guangxi University and has an area of about 3,000 m². It is a typical small landscape lake. Previously, the management and maintenance of Jinghu Lake was generally guided by random sampling analysis or by visual experience. This approach cannot fully capture the water quality of Jinghu Lake, and makes it difficult to judge and predict how the water quality changes. A comprehensive understanding of the water quality requires adequate water quality testing, but extensive testing demands a lot of manpower and financial resources. We therefore use the developed water quality monitoring system to obtain water quality information more efficiently and economically.
Water quality parameters are collected by the unmanned ship. It can detect temperature [...] Table 1.

RBF neural network
In this study, the geographical coordinates of the monitoring points are used as the input of the RBF neural network, and the water quality parameter values are used as the output. The RBF neural network was used to establish the relationship between the geographic coordinates of the monitoring points and the water quality parameter value (Chen et al.). The function expression is as follows:
$$Z = f(x, y) \tag{1}$$
where (x, y) is the geographic coordinate of the sampling point, and Z is the water quality parameter value.
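The mapping from coordinates (x, y) to a parameter value Z can be illustrated with a minimal sketch. It assumes a Gaussian basis function, one center per sample and an exact-interpolation solve, none of which are stated in the text; the coordinates, temperatures and `width` value are hypothetical, and the function names are illustrative only.

```python
import math

def gaussian(r, width):
    """Gaussian radial basis function of the distance r."""
    return math.exp(-(r / width) ** 2)

def solve_linear(a, b):
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def fit_rbf(points, values, width=1.0):
    """Fit Z = f(x, y) with one Gaussian center per sample (assumed design)."""
    phi = [[gaussian(math.dist(p, q), width) for q in points] for p in points]
    weights = solve_linear(phi, values)

    def predict(x, y):
        return sum(w * gaussian(math.dist((x, y), c), width)
                   for w, c in zip(weights, points))

    return predict

# Hypothetical monitoring points (metres) and temperatures (deg C).
points = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 5.0)]
temps = [24.1, 24.6, 23.9, 24.4, 24.3]
f = fit_rbf(points, temps, width=8.0)
```

With one center per sample the fitted surface passes through every training point, so `f(5.0, 5.0)` returns the value measured there. A network with fewer centers than samples would smooth the data instead.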

Water temperature spatial distribution fitting
We take the temperature index of the first week as an example to carry out the experiment. The spatial distribution fitting of water temperature was completed by MATLAB.
First, the monitoring points are marked with coordinates, where x and y are the geographical location coordinates and Z is the corresponding temperature value. The points are then real-number encoded by monitoring serial number, and the corresponding data table is built to simplify decoding. The temperature data are shown in Table 2. Next, we use the 'meshgrid' function to build a grid on which the data are interpolated. After processing, a total of 1,021 points were obtained; we randomly selected 21 points as the test set and used the other 1,000 points as the training set for the RBF neural network. The parameters were adjusted by trial and error until the fit was satisfactory. Finally, we obtained the spatial distribution of temperature in Jinghu Lake for the first week (Figure 3).
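The grid construction and the random 21/1,000 split described above can be sketched as follows. This is a plain-Python stand-in rather than the MATLAB code used in the study; the grid extent, the seed and the index placeholders are assumptions for illustration.

```python
import random

def make_grid(x_min, x_max, y_min, y_max, nx, ny):
    """Return nx * ny (x, y) nodes on a regular grid, like a flattened meshgrid."""
    xs = [x_min + i * (x_max - x_min) / (nx - 1) for i in range(nx)]
    ys = [y_min + j * (y_max - y_min) / (ny - 1) for j in range(ny)]
    return [(x, y) for y in ys for x in xs]

def split_points(points, n_test, seed=0):
    """Shuffle a copy of the points and split it into (training, test) sets."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical 4 x 3 grid over a 60 m x 50 m bounding box.
grid = make_grid(0.0, 60.0, 0.0, 50.0, nx=4, ny=3)

# Placeholder indices standing in for the 1,021 interpolated points.
samples = list(range(1021))
train_set, test_set = split_points(samples, n_test=21)
```

Fixing the seed keeps the split reproducible, which helps when parameters are tuned by trial and error against the same test set.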

Genetic algorithm optimization
A genetic algorithm is a kind of evolutionary algorithm. It searches for the optimal solution by simulating the natural genetic mechanism under the principle of 'survival of the fittest' (Kaya). It offers good global optimization ability and robustness (Chen). This paper takes the data collected at the original water quality monitoring points as the reference standard, uses the spatial distribution of water quality indicators fitted by the RBF neural network as the basis of the fitness function, and uses the genetic algorithm to optimize the number and spatial layout of the monitoring points. Single-objective optimization is used below as an example to describe the flow of the genetic algorithm; the flow chart is shown in Figure 4.
First, the sample points are numbered from 1 to n using real-number coding, and an initial set of individuals is randomly generated to form the initial population. Each chromosome is an array of real-number codes, and the initial population is a matrix.
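A minimal sketch of this encoding step is given below. It assumes that each chromosome holds distinct point numbers, which the text does not state; the sizes reuse values mentioned elsewhere in the paper (50 candidate points, chromosome length 20, population size 40).

```python
import random

def init_population(n_points, chrom_length, pop_size, seed=0):
    """Build pop_size chromosomes, each a list of distinct point numbers in 1..n_points."""
    rng = random.Random(seed)
    return [rng.sample(range(1, n_points + 1), chrom_length)
            for _ in range(pop_size)]

# The population is a pop_size x chrom_length matrix of point numbers.
population = init_population(n_points=50, chrom_length=20, pop_size=40)
```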
Then, the fitness value of each chromosome is calculated. The fitness function is the mean square error between the fitted distribution of each chromosome and the standard distribution. The expression is:
$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{Z}_i - Z_i \right)^2 \tag{2}$$
where MSE is the mean square error, N is the number of contrast points, $\hat{Z}_i$ is the standard value of the water quality parameter, and $Z_i$ is the corresponding fitted value. The smaller the mean square error, the greater the fitness and the greater the probability of being selected.
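The mean square error fitness can be written out directly. In this sketch the reciprocal mapping from error to fitness is an assumption: the text only says that a smaller error means a greater fitness.

```python
def mean_square_error(standard, fitted):
    """Average squared difference between standard and fitted values."""
    if not standard or len(standard) != len(fitted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((s - f) ** 2 for s, f in zip(standard, fitted)) / len(standard)

def fitness(standard, fitted, eps=1e-12):
    """Smaller MSE gives larger fitness; the reciprocal form is an assumption."""
    return 1.0 / (mean_square_error(standard, fitted) + eps)
```

For example, standard values `[1.0, 2.0, 3.0]` against fitted values `[1.0, 2.0, 5.0]` give an MSE of 4/3, since only the last point differs, by 2.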
After the probabilities are set, the population undergoes crossover, mutation and selection to obtain the next generation. The objective function serves as the final selection criterion and is set to the mean absolute error. Its expression is:
$$R = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{Z}_i - Z_i \right| \tag{3}$$
where R is the mean absolute error of the objective function.
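One generation step of crossover, mutation and selection, with a mean-absolute-error stopping check, might look like the sketch below. The operator choices (one-point crossover with repair, tournament selection, elitism), the probabilities and the toy cost function are all assumptions for illustration. The cost function stands in for the RBF fitting error by predicting each point from the nearest selected point.

```python
import random

def mean_absolute_error(standard, fitted):
    """Average of |standard_i - fitted_i| over the contrast points."""
    return sum(abs(s - f) for s, f in zip(standard, fitted)) / len(standard)

def crossover(a, b, rng):
    """One-point crossover, repaired so the child keeps distinct point numbers."""
    head = a[:rng.randrange(1, len(a))]
    child = head + [g for g in b if g not in head]
    return child[:len(a)]

def mutate(chrom, n_points, rate, rng):
    """With probability rate per gene, swap it for a point number not in use."""
    chrom = chrom[:]
    for i in range(len(chrom)):
        if rng.random() < rate:
            unused = [p for p in range(1, n_points + 1) if p not in chrom]
            if unused:
                chrom[i] = rng.choice(unused)
    return chrom

def evolve(population, cost, n_points, generations=20,
           p_cross=0.8, p_mut=0.05, tol=0.0, seed=0):
    """Return the lowest-cost chromosome found; stop early once cost <= tol."""
    rng = random.Random(seed)
    best = min(population, key=cost)
    for _ in range(generations):
        if cost(best) <= tol:
            break
        # Tournament selection: the lower-cost individual of a random pair wins.
        parents = [min(rng.sample(population, 2), key=cost) for _ in population]
        mates = parents[1:] + parents[:1]
        children = [crossover(a, b, rng) if rng.random() < p_cross else a[:]
                    for a, b in zip(parents, mates)]
        children = [mutate(c, n_points, p_mut, rng) for c in children]
        children[0] = best[:]  # elitism: carry the best individual forward
        population = children
        best = min(population, key=cost)
    return best

# Hypothetical readings at 50 candidate points along a transect.
values = [20.0 + 0.002 * i * i for i in range(50)]

def cost(chrom):
    """Toy stand-in for the fitting error: predict each point from the nearest selected one."""
    fitted = [values[min(chrom, key=lambda g: abs(g - 1 - i)) - 1]
              for i in range(len(values))]
    return mean_absolute_error(values, fitted)

start_rng = random.Random(1)
initial = [start_rng.sample(range(1, 51), 20) for _ in range(40)]
best = evolve(initial, cost, n_points=50)
```

Because the best individual is copied into each new generation, the returned chromosome is never worse than the best member of the initial population.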
Finally, determine whether the last selected individual meets the condition of the objective function, and if less than [...] (Liu). This paper uses min-max normalization for data processing, as shown in Equation (4):
$$\hat{z} = \frac{Z - z_{\min}}{z_{\max} - z_{\min}} \tag{4}$$
where $\hat{z}$ is the standardized value of the water quality parameter, $Z$ is the water quality parameter value, $z_{\min}$ is the minimum value, and $z_{\max}$ is the maximum value. For the seven water quality indexes that need to be considered, seven sub-objective functions are set up: $F_{T}(z)$, $F_{pH}(z)$, $F_{Chl\text{-}a}(z)$, $F_{DO}(z)$, $F_{COND}(z)$, $F_{TN}(z)$ and $F_{TP}(z)$. The main function is denoted $F(z)$, and the weights are set according to the importance of the data.
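The min-max normalization that puts the seven indexes on a common scale can be sketched as below. The handling of a constant series (returning zeros) is an assumption added to avoid division by zero; the text does not discuss that case.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention for a constant series.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For instance, `min_max_normalize([2.0, 4.0, 6.0])` returns `[0.0, 0.5, 1.0]`. Normalizing each index separately stops indexes with large numeric ranges, such as conductivity, from dominating a combined error.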
The main function of the experiment is as follows in Equation (5).
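One general form consistent with this description is a weighted sum of the seven sub-objective functions. The weights $w_k$ are left symbolic here, and the convention that they sum to one is an assumption; the numerical weights used in the experiment are not given in the text.

```latex
F(z) = \sum_{k \in K} w_k \, F_k(z),
\qquad K = \{\mathrm{T}, \mathrm{pH}, \mathrm{Chl\text{-}a}, \mathrm{DO}, \mathrm{COND}, \mathrm{TN}, \mathrm{TP}\},
\qquad \sum_{k \in K} w_k = 1
```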
The single-objective optimization procedure is then applied to the multi-objective case. The difference is that the selection function is now the combined error of the seven sub-functions rather than the error of a single index.
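A combined selection error over the seven indexes could be computed as in the following sketch. The equal weights, the division by the total weight and the per-index error values are hypothetical and serve only to show the mechanics.

```python
INDEXES = ("T", "pH", "Chl-a", "DO", "COND", "TN", "TP")

def combined_error(sub_errors, weights):
    """Weighted average of per-index errors; both dicts are keyed by index name."""
    missing = [k for k in INDEXES if k not in sub_errors or k not in weights]
    if missing:
        raise KeyError(f"missing indexes: {missing}")
    total_weight = sum(weights[k] for k in INDEXES)
    return sum(weights[k] * sub_errors[k] for k in INDEXES) / total_weight

# Hypothetical normalized errors for one candidate set of monitoring points.
errors = {"T": 0.02, "pH": 0.04, "Chl-a": 0.10, "DO": 0.06,
          "COND": 0.03, "TN": 0.08, "TP": 0.09}
equal_weights = {k: 1.0 for k in INDEXES}
score = combined_error(errors, equal_weights)
```

Passing this score to the genetic algorithm in place of a single-index error leaves the rest of the procedure unchanged, which matches the description above.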

Single target monitoring point optimization
In the single-objective case, the layout of water quality monitoring points was optimized using the water temperature indicator as an example. When the genetic algorithm is used to optimize the selection of monitoring points, the initial population is established first. In this paper, the initial population size is set to 20 and 40 in turn, and the chromosome length (chromlength) is 20.
The number of iterations is set to 30, 50 and 100 for the com[...] Table 3. [...] Lake. Starting from the initial 50 monitoring points, 26 optimal monitoring points are obtained through genetic neural network optimization (Figure 5(c)). The single-parameter optimizations for pH, Chl-a, DO, COND, TN and TP proceed in the same way as for temperature.

Multi-objective optimization
In practical water quality monitoring, several water quality indicators usually need to be monitored together, so monitoring point optimization must be multi-objective rather than single-objective. With single-objective optimization, different water quality parameters would each require a different set of monitoring points, which is not feasible in real monitoring operations. This paper therefore extends single-objective optimization to multi-objective optimization, which yields one set of monitoring points suitable for monitoring multiple water quality parameters.
For the multi-objective optimization experiment, the initial population size is set to 40 and the number of iterations to 50. The result is compared with the traditional isometric sampling method (Figure 6(a)), and the experimental results are shown in Table 4.
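For reference, an isometric baseline can be sketched by taking points at equal index spacing from an ordered list of candidates. This is a simplified one-dimensional stand-in; the spatial layout actually used for the isometric method is the one shown in Figure 6(a), and the function name is illustrative.

```python
def isometric_sample(points, n_samples):
    """Pick n_samples points at (near-)equal index spacing, keeping both ends."""
    if not 1 <= n_samples <= len(points):
        raise ValueError("n_samples must be between 1 and len(points)")
    if n_samples == 1:
        return [points[0]]
    step = (len(points) - 1) / (n_samples - 1)
    return [points[round(i * step)] for i in range(n_samples)]

# Hypothetical: 26 of 50 numbered candidate points.
baseline = isometric_sample(list(range(1, 51)), 26)
```

Evaluating the baseline and the optimized set with the same error function gives a like-for-like comparison.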
From the experimental results, it can be seen that the fit-

Optimization prediction and verification
Based on the optimization analysis of the first week of October, we obtained the monitoring points corresponding to single-objective and multi-objective optimization. Over the following weeks, we continued to monitor the water quality of Jinghu Lake to verify whether the monitoring points from this first optimization remain effective for subsequent water quality monitoring.
Because TN and TP data cannot be detected directly by sensors, it is necessary to collect water samples for chemical detection, which will consume a lot of time and money.
Therefore, we established a BP neural network trained on multiple indicators, with temperature, pH, DO, COND and Chl-a as input variables and TN and TP as output variables.
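The structure of such a network, five inputs and two outputs trained by backpropagation, can be illustrated with the small sketch below. The hidden layer size, activation, learning rate and epoch count are assumptions. The training data are synthetic placeholders with an arbitrary linear relation, not measurements and not a claim about how TN and TP depend on the other indexes.

```python
import math
import random

def train_bp(samples, targets, n_hidden=6, epochs=500, lr=0.05, seed=0):
    """Train a one-hidden-layer network (tanh hidden, linear output) by backpropagation."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0]), len(targets[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(x):
        xb = list(x) + [1.0]  # append the bias input
        h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w1]
        hb = h + [1.0]
        y = [sum(w * v for w, v in zip(row, hb)) for row in w2]
        return xb, hb, y

    for _ in range(epochs):
        for x, t in zip(samples, targets):
            xb, hb, y = forward(x)
            err_out = [yk - tk for yk, tk in zip(y, t)]
            # Propagate the output error back through the tanh hidden layer.
            err_hid = [(1.0 - hb[j] ** 2) * sum(err_out[k] * w2[k][j] for k in range(n_out))
                       for j in range(n_hidden)]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w2[k][j] -= lr * err_out[k] * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w1[j][i] -= lr * err_hid[j] * xb[i]

    return lambda x: forward(x)[2]

def mse(model, samples, targets):
    """Mean squared error of the model over all outputs and samples."""
    total = sum((a - b) ** 2 for x, t in zip(samples, targets)
                for a, b in zip(model(x), t))
    return total / (len(samples) * len(targets[0]))

# Synthetic, normalized stand-ins for (T, pH, DO, COND, Chl-a) -> (TN, TP).
data_rng = random.Random(42)
inputs = [[data_rng.random() for _ in range(5)] for _ in range(30)]
outputs = [[0.5 * x[0] + 0.2 * x[2], 0.3 * x[1] + 0.1 * x[4]] for x in inputs]

untrained = train_bp(inputs, outputs, epochs=0)
trained = train_bp(inputs, outputs, epochs=500)
```

Comparing `mse(trained, inputs, outputs)` with `mse(untrained, inputs, outputs)` shows how much the fit improves with training. With real data, a held-out set would be needed to judge prediction quality.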
The training and prediction results of the BP neural network are shown in Figure 7. For the overall water quality of Jinghu Lake, the overall average error varies by week as shown in [...] range. On the whole, the fitting error of multi-objective optimization is lower than that of equidistant sampling, and the optimized monitoring points remain effective for at least 6 weeks.

CONCLUSION
Experiments show that, compared with the traditional selection of water quality monitoring points, the genetic neural network significantly improves the accuracy of the fitted water quality parameters. The optimization effect declines slightly over time, but the overall fitting error remains lower than that of traditional isometric monitoring, which greatly reduces the time and effort required and improves the efficiency and accuracy of water quality monitoring.
The model is not limited to the water quality parameters selected in this paper; it also applies to other numbers and kinds of water quality parameters, provided the corresponding genetic algorithm parameters are adjusted. To reduce interference from seasonal weather, all water quality data in this paper were collected while avoiding high winds, rain and similar conditions. Subsequent studies will add data collected under different weather conditions to see whether the algorithm can adapt to them, making the monitoring point optimization model more generalizable.