Abstract

To improve the prediction accuracy of ammonia nitrogen in water monitoring networks, bio-inspired algorithms are often combined with a back propagation neural network (BPNN). However, owing to the limitations of such algorithms, the combined model can still fall into local optima. In this paper, the seagull optimization algorithm (SOA) was first used to optimize the structure of the BPNN to obtain a better prediction model. Then, an improved SOA (ISOA) was proposed, and its optimization performance was verified on common benchmark functions. Finally, the ISOA was applied to improve the BPNN, giving the improved seagull optimization algorithm–back propagation (ISOA–BP) model. The simulation results showed that the prediction accuracy of ammonia nitrogen was greatly improved and that the proposed model is better suited to the prediction of complex water quality parameters in water monitoring networks.

HIGHLIGHTS

  • The structure of BPNN was optimized to obtain a better prediction model by using the seagull optimization algorithm (SOA).

  • We proposed an improved SOA (ISOA) and verified its optimization performance on common benchmark functions.

  • The ISOA was used to improve the BPNN, yielding the improved seagull optimization algorithm–back propagation (ISOA–BP) model.

INTRODUCTION

As an important part of the new generation of information technology, the Internet of Things (IoT) has been widely researched and applied in various scenarios (Fang et al. 2016, 2020a; Khan et al. 2020). Although the emergence of the IoT has greatly improved our lives, the growing type and number of sensor devices means that the amount of data to be processed is also increasing (Taonameso et al. 2019; Fang et al. 2020b, 2020c). How to process these data efficiently and accurately has become a research focus. As a data fusion method, neural networks abstract the neurons of the human brain from an information-processing perspective, build models, and form different networks according to different connection patterns, which can be used for analysis and prediction. However, neural networks also have shortcomings, such as falling into local optima and poor generalization ability.

As one of the most widely used neural network structures, the back propagation neural network (BPNN) is applied in many different fields. However, when solving complex problems it often fails to converge globally and falls into local minima, resulting in an ineffective learning process. Additionally, the learning algorithm converges slowly, especially near the target (Han & Huang 2019). Therefore, in practical applications, bio-inspired algorithms are often used to optimize the model structure. For example, the particle swarm optimization (PSO) algorithm has been used to optimize the BPNN to overcome the sensitivity of the gradient descent method to its initial values and the resulting error fluctuation, and to obtain globally optimal initial parameters that make the neural network converge quickly (Wu et al. 2018). Other such algorithms include the fruit fly optimization algorithm (FOA) (Wu et al. 2019), the genetic algorithm (GA) (Li et al. 2017) and the mind evolutionary algorithm (MEA) (Wang et al. 2018).

As a newly proposed bio-inspired algorithm, the seagull optimization algorithm (SOA) has been proven to achieve better performance than some traditional algorithms (Dhiman & Kumar 2019). Therefore, in this paper, SOA was used to optimize the structure of BPNN to obtain an improved ammonia nitrogen prediction model. Then, considering the shortcomings of the SOA, an improved algorithm was proposed, and some benchmark functions were used to verify its performance. Finally, the improved SOA (ISOA) was applied to improve the BPNN to obtain a better prediction model. The simulation results verified that the performance of the model is better than that of the traditional model and the model using PSO to optimize BPNN. In other words, the proposed algorithm can be applied to a more complex water quality environment for water quality detection.

MATERIALS AND METHODS

SOA

The SOA is a novel bio-inspired algorithm for solving computationally expensive problems. It imitates the migration of seagulls and the way a seagull circles over its prey when attacking: the algorithm has good global search ability, while the attack behaviour governs its local search ability (Dhiman & Kumar 2019; Jia et al. 2019).

The mathematical models of seagull migration and attack are described below. During migration, the algorithm simulates how a group of seagulls moves from one location to another. A seagull must satisfy the conditions in Equations (1)–(5).

To avoid collisions between adjacent search agents, an additional variable $A$ is used to calculate the new search agent location:
$$\vec{C}_s = A \times \vec{P}_s(x)$$
(1)
where $\vec{C}_s$ represents the position of the search agent which does not collide with other search agents, $\vec{P}_s$ represents the current position of the search agent, $x$ indicates the current iteration, and $A$ represents the movement behaviour of the search agent in a given search space.
$$A = f_c - \left(x \times \frac{f_c}{Max_{iteration}}\right)$$
(2)
where $f_c$ is introduced to control the frequency of employing variable $A$, which decreases linearly from $f_c$ to 0. After avoiding collisions between neighbours, the search agents move towards the direction of the best neighbour:
$$\vec{M}_s = B \times \left(\vec{P}_{bs}(x) - \vec{P}_s(x)\right)$$
(3)
where $\vec{M}_s$ represents the positions of the search agent towards the best-fit search agent $\vec{P}_{bs}$. The behaviour of $B$ is randomized and is responsible for balancing exploration and exploitation. $B$ is calculated as:
$$B = 2 \times A^2 \times rd$$
(4)
where $rd$ is a random number in the range [0, 1].
Lastly, the search agent can update its position with respect to the best search agent by:
$$\vec{D}_s = \left|\vec{C}_s + \vec{M}_s\right|$$
(5)
where $\vec{D}_s$ represents the distance between the search agent and the best-fit search agent.
Exploitation is designed to take advantage of the history and experience of the search process. When attacking prey, the seagulls perform a spiral movement in the air. This behaviour in the x, y and z planes is represented as follows:
$$x' = r \times \cos(k)$$
(6)
$$y' = r \times \sin(k)$$
(7)
$$z' = r \times k$$
(8)
$$r = u \times e^{kv}$$
(9)
where $r$ is the radius of each turn of the spiral, $k$ is a random number in the range $0 \le k \le 2\pi$, $u$ and $v$ are constants that define the spiral shape, and $e$ is the base of the natural logarithm. The updated position of the search agent is calculated using Equations (6)–(9), and the positions of the other search agents can be calculated using Equation (10):
$$\vec{P}_s(x) = \left(\vec{D}_s \times x' \times y' \times z'\right) + \vec{P}_{bs}(x)$$
(10)
where $\vec{P}_s(x)$ saves the best solution and updates the position of other search agents.

SOA is summarized in Table 1 (Dhiman & Kumar 2019).

Table 1

SOA procedure

Seagull optimization algorithm
Input: seagull population P_s
Output: optimal search agent P_bs
procedure SOA
    Initialize the parameters A, B, and Max_iteration
    Set f_c ← 2
    Set u ← 1
    Set v ← 1
    while (x < Max_iteration) do
        P_bs ← ComputeFitness(P_s)
        rd ← Rand(0, 1)
        k ← Rand(0, 2π)
        r ← u × e^(kv)
        Calculate the distance D_s using Equation (5)
        P ← x′ × y′ × z′
        P_s(x) ← (D_s × P) + P_bs
        x ← x + 1
    end while
    return P_bs
end procedure
procedure ComputeFitness(P_s)
    for i ← 1 to n do
        FIT_s[i] ← FitnessFunction(P_s(i, :))
    end for
    FIT_s_best ← BEST(FIT_s[])
    return FIT_s_best
end procedure
procedure BEST(FIT_s[])
    Best ← FIT_s[0]
    for i ← 1 to n do
        if (FIT_s[i] < Best) then
            Best ← FIT_s[i]
        end if
    end for
    return Best
end procedure
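
As a rough illustration of the procedure in Table 1, the following Python sketch applies Equations (1)–(10) to a minimization problem. It is a sketch under stated assumptions rather than the authors' implementation: the function name `soa_minimize`, the element-wise reading of Equation (10), the clipping of positions to the search bounds, the explicit best-so-far bookkeeping, and the defaults f_c = 2, u = v = 1 are choices made here for illustration, and the forms of A and B follow the standard SOA of Dhiman & Kumar (2019).

```python
import math
import random


def soa_minimize(fitness, dim, lower, upper, n_agents=30, max_iter=100,
                 f_c=2.0, u=1.0, v=1.0, seed=0):
    """Minimal seagull optimization sketch (assumed reading of Eqs. (1)-(10))."""
    rng = random.Random(seed)
    # Random initial population inside the search bounds.
    agents = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_agents)]
    best_pos = min(agents, key=fitness)[:]
    best_fit = fitness(best_pos)
    history = [best_fit]

    for x in range(max_iter):
        A = f_c - x * (f_c / max_iter)             # Eq. (2): falls from f_c towards 0
        for agent in agents:
            rd = rng.random()                      # random number in [0, 1)
            B = 2.0 * A * A * rd                   # Eq. (4)
            k = rng.uniform(0.0, 2.0 * math.pi)
            r = u * math.exp(k * v)                # Eq. (9)
            # Product x' * y' * z' from Eqs. (6)-(8).
            spiral = (r * math.cos(k)) * (r * math.sin(k)) * (r * k)
            for j in range(dim):
                c = A * agent[j]                   # Eq. (1): collision avoidance
                m = B * (best_pos[j] - agent[j])   # Eq. (3): move towards the best agent
                d = abs(c + m)                     # Eq. (5): distance to the best agent
                new = d * spiral + best_pos[j]     # Eq. (10): spiral attack
                agent[j] = min(max(new, lower), upper)  # assumption: clip to bounds
        # Assumption: keep the best solution found so far.
        candidate = min(agents, key=fitness)
        cand_fit = fitness(candidate)
        if cand_fit < best_fit:
            best_pos, best_fit = candidate[:], cand_fit
        history.append(best_fit)
    return best_pos, best_fit, history


def sphere(p):
    """Sphere test function, used here only as a toy objective."""
    return sum(t * t for t in p)


best_pos, best_fit, history = soa_minimize(sphere, dim=5, lower=-10.0, upper=10.0)
```

Because the best-so-far solution is stored separately, `history` is non-increasing by construction; how quickly it falls depends on the problem and on the parameter choices above.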

ISOA

Because factor B is random, the traditional SOA can optimize poorly and is prone to falling into local optima. Therefore, an improved factor B is defined as follows:
formula
(11)
To address the weak exploitation ability of the algorithm in the later stage, chaos theory (Wang et al. 2014; Jia et al. 2019; Yue et al. 2019; Zhang et al. 2019a) is adopted to increase the diversity of particles in the later stage and enhance the search ability. The logistic map is introduced, and the basic equation is:
$$z_{n+1}^{j} = \mu z_{n}^{j}\left(1 - z_{n}^{j}\right)$$
(12)
where $z_{n}^{j}$ represents the $n$th ($n = 1, 2, \ldots$) iteration of the $j$th chaotic variable, and $\mu$ is generally 4. The transformation between the chaotic variable and the original variable is as follows:
$$z^{j} = \frac{x^{j} - x_{min}^{j}}{x_{max}^{j} - x_{min}^{j}}$$
(13)
$$x'^{j} = x_{min}^{j} + z^{j}\left(x_{max}^{j} - x_{min}^{j}\right)$$
(14)
where $x_{max}^{j}$ and $x_{min}^{j}$ are the search upper and lower bounds of the $j$th dimension variable, respectively, and $x'^{j}$ is the value obtained by transforming the $j$th chaotic variable into the optimization variable after chaotic mapping.

The idea of the improved SOA is to apply chaotic iteration, after the SOA iteration, to the position of the seagull with the best fitness in order to increase diversity. First, the original variables are mapped to chaotic variables using Equation (13) and then iterated using Equation (12). Finally, the result is mapped back to the original search space by Equation (14). If the position after the chaotic search is better than the position before it, the new position is saved; otherwise, the original position is kept.
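
The chaotic step described above can be sketched in Python as follows. This is a minimal illustration under assumptions: the function names, the number of chaotic iterations `n_chaos`, and the use of one scalar bound pair for all dimensions are hypothetical choices, and the mapping equations are taken to be the usual linear scaling to and from [0, 1].

```python
def logistic_map(z, mu=4.0):
    """One step of the logistic map, as in Equation (12)."""
    return mu * z * (1.0 - z)


def chaotic_refine(best_pos, fitness, lower, upper, n_chaos=20):
    """Chaotic search around the best position (minimization).

    n_chaos is an assumed setting, not a value taken from the paper.
    """
    best = list(best_pos)
    best_fit = fitness(best)
    span = upper - lower
    # Eq. (13): map each coordinate of the best position into [0, 1].
    z = [(p - lower) / span for p in best]
    for _ in range(n_chaos):
        z = [logistic_map(zj) for zj in z]        # Eq. (12): chaotic iteration
        cand = [lower + zj * span for zj in z]    # Eq. (14): back to the search space
        cand_fit = fitness(cand)
        if cand_fit < best_fit:                   # keep the new position only if better
            best, best_fit = cand, cand_fit
    return best, best_fit


def sphere(p):
    """Toy objective used only for this example."""
    return sum(t * t for t in p)


start = [3.0, -4.0]
refined_pos, refined_fit = chaotic_refine(start, sphere, lower=-10.0, upper=10.0)
```

With μ = 4 the logistic map keeps values inside [0, 1], so every candidate stays within the search bounds, and the returned fitness is never worse than that of the starting position.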

Function tests

In this section, the improved algorithm is tested on some unimodal and multimodal benchmark functions (Dhiman & Kumar 2019). The information of these functions is shown in Table 2.

Table 2

Information of benchmark functions

Type         Name               Expression   Domain of definition   Global optimum   Optimal value
Unimodal     Sphere
             Schwefel's 2.22
             Schwefel's 1.2
             Schwefel's 2.21
             Noise
Multimodal   Rastrigin
             Ackley
             Griewank
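
For readers who want to reproduce this kind of test, the benchmark functions named in Table 2 can be written in Python as below. The expressions are the standard textbook forms and are an assumption here; the exact expressions, domains and noise term used by the authors may differ, and the function names are illustrative.

```python
import math
import random


def sphere(x):
    return sum(t * t for t in x)


def schwefel_2_22(x):
    return sum(abs(t) for t in x) + math.prod(abs(t) for t in x)


def schwefel_1_2(x):
    total, running = 0.0, 0.0
    for t in x:
        running += t              # partial sum x_1 + ... + x_i
        total += running * running
    return total


def schwefel_2_21(x):
    return max(abs(t) for t in x)


def quartic_noise(x, rng=random):
    # Quartic function plus uniform noise in [0, 1).
    return sum((i + 1) * t ** 4 for i, t in enumerate(x)) + rng.random()


def rastrigin(x):
    return sum(t * t - 10.0 * math.cos(2.0 * math.pi * t) + 10.0 for t in x)


def ackley(x):
    n = len(x)
    s1 = sum(t * t for t in x) / n
    s2 = sum(math.cos(2.0 * math.pi * t) for t in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20.0 + math.e


def griewank(x):
    s = sum(t * t for t in x) / 4000.0
    p = math.prod(math.cos(t / math.sqrt(i + 1)) for i, t in enumerate(x))
    return s - p + 1.0


origin = [0.0] * 30   # 30 dimensions, matching the dimension used in the tests
values_at_origin = [f(origin) for f in (sphere, schwefel_2_22, schwefel_1_2,
                                        schwefel_2_21, rastrigin, ackley, griewank)]
```

In these standard forms every deterministic function has its minimum value of 0 at the origin, which is why results close to 0 indicate good optimization performance.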

Here, the PSO, the traditional SOA and the ISOA are compared. The parameter settings of each algorithm are shown in Table 3; in addition, the maximum number of iterations is 500, the number of seagulls is 100 and the dimension of each seagull is 30.

Table 3

Parameter settings of each algorithm

Algorithm   Parameter   Value
ISOA  
 0.1 
 
SOA  
 0.1 
 
PSO  1.49445 
 1.49445 
 0.5 

The optimization results of the improved algorithm on different functions are shown in Figures 1–4.

Figure 1

(a) Search result of function F1 and (b) result of function F2.

Figure 2

(a) Search result of function F3 and (b) result of function F4.

Figure 3

(a) Search result of function F5 and (b) result of function F6.

Figure 4

(a) Search result of function F7 and (b) result of function F8.

Table 4 compares the optimization results of each function.

Table 4

Optimization results of each function

Function/algorithm   ISOA   SOA   PSO
F1   1.867 × 10^−33   2.888 × 10^−18   0.01132
F2   3.964 × 10^−15   6.757 × 10^−7   0.618
F3   9.142 × 10^−26   2.076 × 10^−15   0.2642
F4   3.201 × 10^−17   1.576 × 10^−8   0.1465
F5   0.000192   0.005356   0.1678
F6   5.386
F7   6.839 × 10^−14   5.152 × 10^−11   0.1454
F8   0.001802

As can be seen from the figures and tables, the SOA optimizes better than the PSO and converges faster: after about 100 iterations, the SOA already achieves a better result than the PSO does after 500 iterations. Furthermore, the ISOA achieves better optimization results and faster convergence than the traditional SOA.

IMPROVED NEURAL NETWORK AMMONIA NITROGEN PREDICTION MODEL

BPNN

There are many kinds of neural networks, among which the BPNN is one of the most widely used, particularly in the field of prediction. It has the advantages of simple structure, self-learning, self-organization, self-adaptation, fast training speed, local approximation and global convergence. It is generally composed of an input layer, a hidden layer and an output layer. The main idea of the BPNN is to divide learning into forward propagation of the signal and back propagation of the error. Specifically, in the learning process, a sample is fed in through the input layer and passed to the output layer through the operations of the hidden layer neurons. Then, the error between the actual data and the predicted data of the output layer is calculated and passed to the back propagation stage. During back propagation, the connection weights between the layers of neurons are adjusted repeatedly using gradient descent until the deviation between the predicted value and the actual value is minimized (Yang & Wang 2018; You et al. 2018; Zhang et al. 2019b). The model of the BPNN is shown in Figure 5.

Figure 5

Model of BPNN.

Suppose there are neurons in the input layer of BPNN model, a hidden layer with neurons and neurons in the output layer. The input is , where . represents the number of samples and the output is , where . The connection weight of th neuron in the input layer to th neuron in the hidden layer is , and the threshold value of th neuron in the hidden layer is . The connection weight of th neuron in the hidden layer and th neuron in the output layer is and the threshold is . The input of the th neuron in the hidden layer is and is the input of the th neuron in the output layer.

Assuming the actual value of th neuron in the output layer is , the total error of network output is:
formula
(15)
BPNN adopts the gradient descent method to adjust the weights and thresholds of the network to obtain better output. The following equation can be obtained by expanding the error to the hidden layer and then to the input layer:
formula
(16)
where is represented as an S function. It can be seen from the above that error E is related to the thresholds and weights. Therefore, the final error can be changed indirectly by changing the weights and thresholds.
The related equations of the weights and thresholds are as follows:
formula
(17)
formula
(18)
formula
(19)
formula
(20)
where is the number of samples.
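
To make the forward and back propagation steps concrete, the following Python sketch trains a single-hidden-layer network by gradient descent on the squared error. It is a generic illustration under assumptions, not a transcription of Equations (15)–(20): the class name `TinyBPNN`, the sigmoid activation on both layers, the learning rate, the initial weight range and the made-up training sample are all hypothetical.

```python
import math
import random


def sigmoid(a):
    """S-shaped activation function."""
    return 1.0 / (1.0 + math.exp(-a))


class TinyBPNN:
    """Single-hidden-layer network; each neuron computes f(sum(w * input) - threshold)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_in)]
        self.th1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)]
                   for _ in range(n_hidden)]
        self.th2 = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(x[i] * self.w1[i][j] for i in range(len(x))) - self.th1[j])
             for j in range(len(self.th1))]
        y = [sigmoid(sum(h[j] * self.w2[j][k] for j in range(len(h))) - self.th2[k])
             for k in range(len(self.th2))]
        return h, y

    def train_step(self, x, d, lr=0.5):
        """One gradient descent step on E = 1/2 * sum((d - y)^2); returns E before the step."""
        h, y = self.forward(x)
        # Error terms: derivative of E with respect to each neuron's net input.
        delta_out = [(y[k] - d[k]) * y[k] * (1.0 - y[k]) for k in range(len(y))]
        delta_hid = [h[j] * (1.0 - h[j])
                     * sum(delta_out[k] * self.w2[j][k] for k in range(len(y)))
                     for j in range(len(h))]
        for j in range(len(h)):
            for k in range(len(y)):
                self.w2[j][k] -= lr * delta_out[k] * h[j]
        for k in range(len(y)):
            self.th2[k] += lr * delta_out[k]      # threshold enters with a minus sign
        for i in range(len(x)):
            for j in range(len(h)):
                self.w1[i][j] -= lr * delta_hid[j] * x[i]
        for j in range(len(h)):
            self.th1[j] += lr * delta_hid[j]
        return 0.5 * sum((d[k] - y[k]) ** 2 for k in range(len(y)))


net = TinyBPNN(n_in=6, n_hidden=7, n_out=1)
sample_x = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7]   # made-up normalized inputs
sample_d = [0.9]                             # made-up target
errors = [net.train_step(sample_x, sample_d) for _ in range(200)]
```

The sketch shows why the initial weights and thresholds matter: gradient descent only moves downhill from its starting point, which is the motivation for choosing that starting point with a global optimizer.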

ISOA–BP

The main idea of the ISOA–BP hybrid model is to optimize the weights and thresholds of the BPNN using the ISOA. The main steps are as follows:

Step 1: Set the parameters of ISOA, including the number of seagulls. The weights and thresholds of the BPNN that need to be optimized are encoded as the initial seagull population.

Step 2: Initialize the positions of the seagulls and use Equation (1) to change their positions so as to avoid collisions.

Step 3: Calculate the fitness of all seagulls at present, and find the best one as the best seagull in this iteration.

Step 4: Update the position of each seagull according to Equations (1)–(11).

Step 5: Chaotic search (Equations (12)–(14)): the best position is mapped to (0, 1) and iterated using the logistic map. After the iteration, the inverse mapping returns it to the original solution space. The fitness value of the new solution is calculated, and the new solution is output only when it is better than the old one.

Step 6: Determine whether the maximum number of iterations or the required accuracy has been reached. If so, output the final position as the optimal seagull position; otherwise, return to Step 3.

Step 7: Decode the optimal output into the initial weights and thresholds of BPNN, and train the neural network until it meets the requirements.
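
The encoding in Step 1 and the decoding in Step 7 can be sketched as follows. The layout of the flat vector (input-to-hidden weights, hidden thresholds, hidden-to-output weights, output thresholds) and the function names are assumptions made for illustration; the paper does not specify the ordering.

```python
def n_parameters(n_in, n_hidden, n_out):
    """Number of weights and thresholds, i.e. the dimension of one seagull."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def decode(position, n_in, n_hidden, n_out):
    """Split a flat seagull position into BPNN weights and thresholds."""
    if len(position) != n_parameters(n_in, n_hidden, n_out):
        raise ValueError("position length does not match the network structure")
    it = iter(position)
    w1 = [[next(it) for _ in range(n_hidden)] for _ in range(n_in)]
    th1 = [next(it) for _ in range(n_hidden)]
    w2 = [[next(it) for _ in range(n_out)] for _ in range(n_hidden)]
    th2 = [next(it) for _ in range(n_out)]
    return w1, th1, w2, th2


def encode(w1, th1, w2, th2):
    """Flatten weights and thresholds into one seagull position (inverse of decode)."""
    flat = [w for row in w1 for w in row]
    flat += list(th1)
    flat += [w for row in w2 for w in row]
    flat += list(th2)
    return flat


dim = n_parameters(6, 7, 1)            # 6-7-1 structure: 42 + 7 + 7 + 1 = 57
position = [0.01 * i for i in range(dim)]
w1, th1, w2, th2 = decode(position, 6, 7, 1)
```

With this encoding, the fitness of a seagull would typically be the prediction error of the network built from its decoded weights and thresholds, so a lower fitness means a better starting point for training.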

Data pre-processing

Data from May to August 2016 for a river in Qinghai province were collected once a day, including water temperature (°C), pH, dissolved oxygen (mg/L), conductivity (μS/cm), turbidity (nephelometric turbidity units, NTU), permanganate index (mg/L) and ammonia nitrogen (mg/L). A total of 123 groups of data were collected. The first 100 groups of data were used to train the network and the last 23 groups of data were used to verify the network performance.

Before the sample data are used for training, missing or erroneous values must be pre-processed. After that, the following equation was used for normalization:
$$x' = \frac{x - x_{min}}{x_{max} - x_{min}}$$
(21)
where $x$ and $x'$ are the data before and after normalization, and $x_{max}$ and $x_{min}$ are the maximum and minimum values of the data before normalization.
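
A minimal Python sketch of this kind of normalization is shown below. It assumes the common min-max form that scales each variable to [0, 1]; the function names, the handling of a constant column and the sample values are illustrative assumptions, not taken from the paper's data.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information and maps to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(normalized, lo, hi):
    """Map normalized values back to the original scale."""
    return [lo + z * (hi - lo) for z in normalized]


# Made-up ammonia nitrogen readings in mg/L, for illustration only.
readings = [2.0, 4.0, 6.0]
scaled = min_max_normalize(readings)
restored = denormalize(scaled, min(readings), max(readings))
```

Each input variable would be normalized separately, and predictions made on the normalized scale can be mapped back with `denormalize` before errors are reported in physical units.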

Simulation results

In this subsection, the back propagation (BP), PSO–BP, SOA–BP and ISOA–BP models are compared to verify the performance of the BPNN optimized by the ISOA. The parameters of each model are shown in Table 5. In addition, the maximum number of training iterations is 1,000, the training function is trainrp, the number of seagulls is 100, the number of seagull iterations is 30 and the structure of the BPNN is 6-7-1.

Table 5

Parameter settings of different models

Algorithm   Parameter   Value
ISOA  
 0.1 
 
SOA  
 0.1 
 
PSO  1.49445 
 1.49445 
 0.5 

The convergence comparison of different models is shown in Figure 6. As can be seen from Figure 6, the proposed ISOA–BP model has a smaller convergence value, which is better than that of the BP, PSO–BP and SOA–BP models.

Figure 6

Convergence comparisons of various algorithms.

Figure 6 also shows that the proposed ISOA–BP model converges to its optimal value faster than the BP, PSO–BP and SOA–BP models. In other words, the predicted value of the proposed algorithm is closer to the actual value. The simulation results for the 23 groups of validation data from the different models are shown in Figure 7.

Figure 7

Comparison of the predicted values of the four models.

As can be seen from Figure 8, the average error of the predicted value of the proposed algorithm is the smallest. The predicted value of the proposed ISOA–BP model is therefore closer to the actual value, giving higher prediction accuracy than the traditional BP, PSO–BP or SOA–BP prediction models.

Figure 8

Comparison of prediction errors of four models.

Figure 8 shows the error comparison of the four models, where the value plotted is the absolute value of the error. It can be seen from the figure that the proposed model has the lowest error for the verification samples, the lowest average error and the highest prediction accuracy.

The following two evaluation methods are used to evaluate the prediction accuracy of different models (Yu & Bai 2018).

1: Root Mean Square Error (RMSE)
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
(22)
2: Nash–Sutcliffe Efficiency (NS)
$$NS = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
(23)
where $y_i$ represents the actual output of the test data, $\hat{y}_i$ is the predicted output of the test data, $\bar{y}$ is the average of the actual output of the test data, and $n$ is the number of samples. For the first evaluation method, the lower the value, the better the effect. For the second, the closer the value is to 1, the better.
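
Both measures are straightforward to compute. The Python sketch below uses the standard definitions of RMSE and Nash–Sutcliffe efficiency; the function names and the small sample series are illustrative and are not the paper's validation data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def nash_sutcliffe(actual, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches predicting the mean."""
    mean = sum(actual) / len(actual)
    residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    variance = sum((a - mean) ** 2 for a in actual)
    return 1.0 - residual / variance


# Made-up series for illustration only.
actual = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]
mean_only = [2.0, 2.0, 2.0]

rmse_perfect = rmse(actual, perfect)
ns_perfect = nash_sutcliffe(actual, perfect)
ns_mean_only = nash_sutcliffe(actual, mean_only)
```

A model that always predicts the mean of the observations scores NS = 0, which gives a useful baseline when reading efficiency values between 0 and 1.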

The results calculated with the above two evaluation methods are shown in Table 6.

Table 6

Comparison of different evaluation algorithms

Algorithm/evaluation   RMSE       NS
BP                     0.658100   0.699794
PSO–BP                 0.109997   0.949822
SOA–BP                 0.053962   0.975384
ISOA–BP                0.046361   0.978851

As can be seen from Table 6, compared with the traditional single neural network prediction model and the PSO–BP and SOA–BP models, the proposed ISOA–BP model has the lowest RMSE and the NS value closest to 1, that is, the highest prediction accuracy.

CONCLUSIONS

Since the traditional BPNN is easily trapped in local optima, which leads to low accuracy in ammonia nitrogen prediction, this paper adopts the SOA, which has strong optimization performance, to optimize the weights and thresholds of the BPNN. To address the shortcomings of the SOA, an improved algorithm based on chaos is then proposed to optimize the BPNN. The simulation results show that the prediction accuracy of the proposed model is higher than that of the traditional BPNN, PSO–BP and SOA–BP models, so it can be applied to predict ammonia nitrogen in more complex water environments.

ACKNOWLEDGEMENTS

This work is supported by the National Natural Science Foundation of Qinghai Province, China (No. 2020-ZJ-724).

DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

REFERENCES

Fang, W., Zhang, C., Shi, Z., Zhao, Q. & Shan, L. 2016 BTRES: beta-based trust and reputation evaluation system for wireless sensor networks. Journal of Network and Computer Applications 59 (1), 88–94.
Fang, W., Cui, N., Chen, W., Zhang, W. & Chen, Y. 2020a A trust-based security system for data collecting in smart city. IEEE Transactions on Industrial Informatics. doi: 10.1109/TII.2020.3006137. (Early Access).
Fang, W., Zhang, W., Chen, W., Liu, Y. & Tang, C. 2020b TMSRS: trust management-based secure routing scheme in industrial wireless sensor network with fog computing. Wireless Networks 26 (5), 3169–3182.
Fang, W., Zhang, W., Chen, W., Pan, T., Ni, Y. & Yang, Y. 2020c Trust-based attack and defense in wireless sensor networks: a survey. Wireless Communications and Mobile Computing 2020, 20, Article ID 2643546.
Han, Z. Y. & Huang, X. G. 2019 GA-BP in thermal fatigue failure prediction of microelectronic chips. Electronics 8 (542), 1–14.
Jia, H. M., Xing, Z. K. & Song, W. L. 2019 A new hybrid seagull optimization algorithm for feature selection. IEEE Access 7, 49614–49631.
Khan, P., Zhu, W., Huang, F., Gao, W. & Khan, N. A. 2020 Micro–nanobubble technology and water-related application. Water Supply 20 (6), 2021–2035.
Taonameso, S., Mudau, L. S., Traoré, A. N. & Potgieter, N. 2019 Borehole water: a potential health risk to rural communities in South Africa. Water Supply 19 (1), 128–136.
Wang, G. G., Guo, L. H., Gandomi, A. H., Hao, G.-S. & Wang, H. 2014 Chaotic krill herd algorithm. Information Sciences 274, 17–34.
Wu, J., Li, Z. B., Zhu, L. & Li, G. Y. 2018 Optimized BP neural network for dissolved oxygen prediction. IFAC-PapersOnLine 51 (17), 596–601.
Wu, L., Yang, Y. W., Maheshwari, M. & Li, N. 2019 Parameter optimization for FPSO design using an improved FOA and IFOA BP neural network. Ocean Engineering 175, 50–61.
Yu, T. T. & Bai, Y. 2018 A comparative study of extreme learning machine, least squares support vector machine, back propagation neural network for outlet total phosphorus prediction. In 2018 Prognostics and System Health Management Conference (PHM-Chongqing), Chongqing, China, pp. 717–722.
Zhang, Z. X., Yang, R. N. & Li, H. Y. 2019a Antlion optimizer algorithm based on chaos search and its application. Journal of Systems Engineering and Electronics 30, 352–365.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).