Abstract
Chlorophyll-a (Chl-a) is an important index in water quality assessment by remote sensing. For measuring Chl-a in rivers and lakes, there are several classical methods, such as curve fitting, the back propagation (BP) neural network and the radial basis function (RBF) neural network, each with corresponding applications. With the rise of computing power and deep learning, this study analyzed the measurement of water quality and Chl-a with deep learning (DL) and compared it with several classical methods, so as to explore and develop better methods. Taking Taihu Lake in China as the case, the study used measured Chl-a data from Taihu Lake in 2017 together with Landsat8 data from the corresponding times. The four methods were used to invert the distribution of Chl-a values in Taihu Lake. Among the curve fitting models, the power model (∑Residual² of fitting = 90.469) and the inverse model (∑Residual² of fitting = 602,156.608) had better results than the other curve fitting models; however, judging from the segmentation result images, they were not as accurate as the machine learning methods, which showed better accuracy. The mean squared errors of testing of the three machine learning methods (BP, RBF, DL) were 1.436, 4.479 and 4.356, respectively. Thus, the BP and DL methods gave the better results in this study.
HIGHLIGHTS
BP had the best results, and DNN also achieved a low mean squared error.
In the future, the data volume and equipment condition need to be discussed to choose the optimal method.
The machine learning method had less error than the curve fitting methods.
INTRODUCTION
There have been many studies of water quality monitoring in Taihu Lake (China) using satellite remote-sensing data as spectroscopy technology has developed. The Landsat series satellites are the most popular data resource among the different satellites used in China (Wu & Yang 2019). Thus, monitoring water quality, especially chlorophyll-a, through remote-sensing data is an important research issue. The earliest studies obtained data from Landsat 5, which carried the MSS (Multispectral Scanner) and TM (Thematic Mapper) instruments (Li et al. 1995a; She et al. 1996; She & Cai 1997; Wang & Ma 2000; Ma & Dai 2005a; Yin et al. 2005). These studies established the relation between spectral data and chlorophyll-a concentration values, developed regression models for it, and analyzed the accuracy and applicability of the models. The estimation of chlorophyll-a concentration was not precise enough because of the low spectral resolution of the TM instrument (Wang & Ma 2000; Tao et al. 2019; Wang & Wu 2019). Landsat-8 was launched in 2013 with the OLI (Operational Land Imager) and the TIRS (Thermal Infrared Sensor) and has a 16-day repeat cycle around the Earth. Compared with medium- and low-resolution satellite sensors (e.g., MODIS), the OLI provides better spatial information, with a spatial resolution of 30 m (https://www.usgs.gov/land-resources/). Several studies have shown that the Landsat 8 OLI imagery database can be used for water quality monitoring in inland shallow lakes (Barsi et al. 2014; Lobo et al. 2015; Mushtaq & Lala 2016; Zheng et al. 2016).
Applications of remotely sensed data for water quality monitoring mainly follow three methods: the analytical method, the semi-analytical method, and the empirical method (Morel & Gordon 1980; Zhou et al. 2004; Odermatt et al. 2012; Wang et al. 2012; Wu & Yang 2019). The analytical method relies purely on theoretical formulae. The semi-analytical method integrates theoretical and empirical approaches. The empirical method refers to linear regression, curve fitting or other non-linear functions (Gitelson 1992; Gurlin et al. 2011). It has been widely utilized for estimating chlorophyll-a concentration, being a simple, suitable, high-accuracy method for a small area (Simis et al. 2005; Matthews 2011; Odermatt et al. 2012; Wu & Yang 2019). Based on statistical analysis, researchers select the optimal band or band-combination data from remote-sensing data, derive an inversion algorithm between laboratory-measured water quality parameters and the band (or band-combination) data, and then analyze the accuracy and applicability of the algorithm. For Taihu Lake, typical empirical methods have included linear regression, the single-band method, band-combination methods (e.g., band ratio and band difference), artificial neural networks, principal component analysis and so on (Li et al. 1995b; She et al. 1996; Wang & Ma 2000; Zhou et al. 2004; Ma & Dai 2005a; Sun et al. 2009; Duan et al. 2012; Wang et al. 2013; Zhang et al. 2016).
Taihu Lake is a typical Case II water body, and there are interactions among the water quality constituents, so the relation between spectral characteristics and chlorophyll-a is complex. Inversion accuracy improved further when exponential curve fitting was used instead of linear regression. Studies on chlorophyll-a concentration inversion have shown the curve fitting algorithm to be a potentially useful method (Hansen & Schjoerring 2003; Liu et al. 2004; Wang et al. 2015a).
In recent years, many scholars have tried to use the ANN (Artificial Neural Network) method to retrieve water quality parameters because this method can describe non-linear and complex systems. ANN can 'learn' from observed data because it is based on the non-linear mapping structures of the human brain; it is a universal and highly flexible function approximator. Many studies have established neural network estimation models of chlorophyll-a concentration in specific waters, and inversion accuracy was improved significantly (Scardi 1996; Keiner & Yan 1998; Schiller & Doerffer 1999; Zhang et al. 2002; Matthews et al. 2010; Matthews 2011). There have also been studies on Taihu Lake (Ma & Dai 2005b; Song et al. 2013; Li et al. 2014; Qi et al. 2014; Zhu et al. 2017; Zhang et al. 2018), most of them applying one or two neural-network algorithms to Chl-a concentration inversion (Sun et al. 2009; Wang et al. 2015b; Cao et al. 2016; Zhu et al. 2017). However, different algorithms have their own advantages and disadvantages; for example, the generalization ability of the Back Propagation (BP) model is limited due to 'over-fitting' and other reasons (Haykin 2008). Deep learning has been widely used in many fields, and the Taihu Lake area has also been the subject of some water quality monitoring research based on neural network algorithms, with some results. However, these studies usually employ only one or two algorithms; simulations using more than two methods are rare, so there is a lack of precision comparison across multiple methods.
In this research, four models were used to calculate the relation between algal chlorophyll-a and Landsat 8 spectrum data: the curve fitting model, the BP (Back Propagation) ANN model, the RBF (Radial Basis Function) model, and the DL (Deep Learning) model. By comparing the performance of these models, the study chose the best one for inversion of chlorophyll-a concentration by multispectral satellite remote sensing in Taihu Lake.
MATERIALS
The overview of the study area
Taihu Lake is the third largest freshwater lake in China, located in the south of the Yangtze River Delta (30°56′–31°34′N, 119°54′–120°36′E), as shown in Figure 1 with 32 observation points (the red points). The total area of the lake is about 2,338 km² (http://www.tba.gov.cn/). Large-area blue-green algal blooms appear in April or May almost every year. Algae-polluted water in the lake has adversely affected and interrupted the normal life of the several million nearby residents, and water eutrophication has become a serious environmental problem. It is therefore important to estimate chlorophyll-a concentration in a timely way by remote sensing in order to predict outbreaks of blue-green algal blooms.
The study area with observation points (the red points). Please refer to the online version of this paper to see this figure in color: http://dx.doi.org/10.2166/ws.2021.137.
In addition, the study used Landsat-8 data downloaded from http://www.gscloud.cn/ and matched in time to the measured data at the 32 points in 2017.
METHODOLOGY
Curve fitting methods
Curve fitting is a method of approximating discrete points on a plane by continuous curves. Commonly used forms are linear fitting, polynomial fitting, exponential fitting, Gaussian fitting, etc. Generally, the curve type can be determined from domain knowledge; if not, a scatter diagram can be drawn and an appropriate curve type selected according to the distribution of the points (Liu et al. 2004).
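As a minimal illustration of the idea, the power form Y = a·x^b (one of the curve types compared later) can be fitted by linear least squares after a log-log transform. The data points below are synthetic, not the Taihu Lake measurements:

```python
import numpy as np

def fit_power(x, y):
    """Fit y = a * x**b by linear least squares in log-log space.
    All x and y must be positive (as the power model requires)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(intercept), slope  # a, b

# Synthetic scatter generated from y = 2 * x**(-1.5)
x = np.linspace(0.5, 3.0, 20)
y = 2.0 * x ** -1.5

a, b = fit_power(x, y)
# the ∑Residual² criterion reported for each curve type in Table 1
sum_residual_sq = float(np.sum((y - a * x ** b) ** 2))
```

On noise-free data the recovered coefficients match the generating ones exactly; with real measurements the residual sum of squares is the quantity used to rank the curve types.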
BP neural network
The BP neural network is a multi-layer feed-forward neural network whose neuron transfer function is an S-type (sigmoid) function, so each output is a continuous quantity between 0 and 1; the network can realize arbitrary nonlinear mappings from input to output.
The BP algorithm belongs to the delta (δ) class of supervised learning algorithms. Its main idea is as follows: given learning samples X1, X2, …, Xn with corresponding target outputs O1, O2, …, Om, the network produces actual outputs A1, A2, …, Am. The errors between Ai and Oi are used to modify the weights so that each Ai (i = 1, 2, …, m) becomes as close as possible to the expected Oi (Zheng et al. 2017); in other words, the error sum at the network output layer is minimized. The simple structure of the BP neural network is shown in Figure 2, where W1, …, Wj, …, Wm are the weights of the indexes. The algorithm consists of two phases: the forward transmission of information and the back propagation of error. In forward propagation, the input information is transmitted from the input layer through the hidden layers to the output layer by layer-by-layer calculation, and the state of each layer of neurons affects only the state of the next layer. If the desired output is not obtained at the output layer, the error of the output layer is calculated, and the error signal is then transmitted back along the original connection paths to modify the weights of the neurons in each layer until the desired target is achieved.
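The two phases described above can be sketched with a one-hidden-layer sigmoid network trained by gradient descent. The toy data, layer sizes and learning rate below are illustrative assumptions, not the configuration used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy regression: 2 inputs -> 1 output, targets scaled into (0, 1)
X = rng.uniform(-1, 1, size=(64, 2))
O = 0.3 * X[:, :1] - 0.2 * X[:, 1:] + 0.5

n_hidden, lr = 4, 0.5
W1 = rng.normal(0, 0.5, (2, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.5, (n_hidden, 1)); b2 = np.zeros(1)

for _ in range(2000):
    # forward transmission of information
    H = sigmoid(X @ W1 + b1)
    A = sigmoid(H @ W2 + b2)
    # back propagation of error through the sigmoid derivatives
    dA = (A - O) * A * (1 - A)
    dH = (dA @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dA / len(X); b2 -= lr * dA.mean(0)
    W1 -= lr * X.T @ dH / len(X); b1 -= lr * dH.mean(0)

A = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)  # final forward pass
mse = float(np.mean((A - O) ** 2))           # output-layer error to be minimized
```

Each iteration performs one forward pass followed by one backward pass, exactly the two-phase cycle described in the text.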
Radial basis function (RBF) neural network
Both the RBF neural network and the BP neural network are nonlinear multi-layer feed-forward networks, but the middle part of a BP network can have many layers, while the RBF network has only one input layer, one hidden layer and one output layer. Besides its simple structure, the RBF neural network is superior to the BP neural network in approximation ability, classification ability and learning speed. The basic idea of the RBF neural network is to use radial basis functions (RBFs) as the 'basis' that forms the hidden-layer space. The hidden layer transforms the input vector from a low-dimensional space to a high-dimensional space, which can make a problem that is linearly inseparable in the low-dimensional space linearly separable in the high-dimensional space. An RBF is a non-negative nonlinear function that is radially symmetric about a center point and decays with distance from it; the mapping from the input layer to the hidden layer is determined by the RBFs, while the mapping from the hidden layer to the output layer is linear. Therefore, the RBF neural network learning algorithm requires three sets of parameters: the centers of the radial basis functions, their variances, and the hidden-to-output layer weights (Wang et al. 2013).
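A compact sketch of this three-parameter structure (centers, widths, linear output weights) on a toy 1-D problem; the Gaussian basis, the center placement and the shared width below are illustrative choices, not the study's settings:

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian RBF activations: radially symmetric about each center,
    decaying with distance (the hidden-layer transform)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

# Toy 1-D regression target
X = np.linspace(0.0, 1.0, 50)[:, None]
y = np.sin(2 * np.pi * X[:, 0])

centers = X[::5]   # every 5th sample as a center (10 centers)
width = 0.1        # shared variance parameter
Phi = rbf_design(X, centers, width)

# the hidden-to-output mapping is linear, so the output weights
# come directly from ordinary least squares
w, *_ = np.linalg.lstsq(np.c_[Phi, np.ones(len(X))], y, rcond=None)
pred = np.c_[Phi, np.ones(len(X))] @ w
mse = float(np.mean((pred - y) ** 2))
```

Because only the output layer is trained by (closed-form) linear fitting, learning is much faster than the iterative weight updates of BP, which matches the speed advantage claimed in the text.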
Deep learning (DL) method
DL builds multiple hidden layers on the basis of an artificial neural network, learning a more complex nonlinear network structure; it can mine the essential characteristics of data sets from limited samples by learning the internal regularities and representation levels of the sample data. The information obtained in the learning process greatly helps the interpretation of data such as text and images (Goodfellow et al. 2016). This research implemented the DL model with Keras. The B1–B11 band data of Landsat8 were used as the input layer, three fully connected layers built the deep network, and the output layer was the Chl-a concentration value.
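The study's Keras network itself is not reproduced here; as a stand-in sketch, the code below uses scikit-learn's MLPRegressor with three fully connected hidden layers on synthetic 11-band data. The layer widths, sample counts and the synthetic target are all assumptions for demonstration, not the paper's data or hyperparameters:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Synthetic stand-in for per-sample Landsat8 band values B1-B11 and Chl-a
X = rng.uniform(0.0, 1.0, size=(200, 11))
chl_a = X @ rng.uniform(-1.0, 1.0, 11) + 0.1 * rng.normal(size=200)

# 8:2 train/test split, mirroring the proportion used in the study
X_tr, X_te, y_tr, y_te = train_test_split(X, chl_a, test_size=0.2,
                                          random_state=0)

# Three fully connected hidden layers (widths are illustrative)
model = MLPRegressor(hidden_layer_sizes=(32, 16, 8), max_iter=2000,
                     random_state=0)
model.fit(X_tr, y_tr)

mse_test = float(np.mean((model.predict(X_te) - y_te) ** 2))
```

The testing MSE computed this way corresponds to the criterion the study reports for the DNN model.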
RESULTS AND DISCUSSION
Curve fitting models
Many curve fitting models have been used to predict Chl-a concentration from remote sensing data (Liu & Woods 2004). In addition, NDV values (B4/B3) have been used as the key index in inversion of Chl-a (Gu & Pei 2017). Therefore, the study took the NDV values of the Landsat8 data as the X-axis and the measured Chl-a values as the Y-axis for curve fitting.
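Computing the B4/B3 predictor from band rasters is a one-line array operation; the tiny arrays below are made-up reflectance values standing in for real Landsat8 pixels:

```python
import numpy as np

# Made-up 2x2 "rasters" standing in for the Landsat8 B3 and B4 bands
b3 = np.array([[0.10, 0.20],
               [0.30, 0.40]])
b4 = np.array([[0.05, 0.30],
               [0.15, 0.60]])

# Per-pixel NDV value B4/B3, guarding against zero-valued B3 pixels
with np.errstate(divide="ignore", invalid="ignore"):
    ndv = np.where(b3 > 0, b4 / b3, np.nan)
```

The resulting NDV raster is what the fitted curves are applied to, pixel by pixel, to map Chl-a levels.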
From Figure 3 and Table 1, the accuracy indexes R² and ∑Residual² show that the Power model and S model had better results than the other curve fitting models, with ∑Residual² = 90.466 and 90.469, respectively. Therefore, the study used these models, Y = 14.85988184091882*x^(−4.582236735711132) and Y = e^(−1.33226861857868 + 4.060797975393096/x), to map the Chl-a level from the RS data. In addition, the Logarithmic model and Inverse model also had reasonable results according to the R² index (R² = 0.001), which indicates low volatility of the curve fitting; the result can be seen in Figure 4.
Model description based on curve fitting method
Model name: MODEL_1. Dependent variable: Chl-a (mg/m³). Independent variable: B4/B3 (NDV). Constant: included. Tolerance of the input term in the equation: 0.0001.

| # | Function | Equation | R² | ∑Residual² |
|---|---|---|---|---|
| 1 | Linear | Y = −67.88532101122092*x + 107.573046753554 | 0.001 | 602,224.882 |
| 2 | Logarithmic | Y = 39.74886869014925 − 63.28937890992608*log(x) | 0.001 | 602,190.653 |
| 3 | Inverse | Y = −18.80174551607722 + 58.64419935909177/x | 0.001 | 602,156.608 |
| 4 | Quadratic | Y = 3,465.543442615852 − 7,638.402568342078*x + 4,262.740570842534*x² | 0.004 | 600,288.726 |
| 5 | Cubic | Y = 2,378.697088001991 − 3,908.075599447289*x + 0*x² + 1,621.809320132796*x³ | 0.004 | 600,227.393 |
| 6 | Compound^a | Y = 2,503.918815021841*0.005752479108287114^x | 0.020 | 90.475 |
| 7 | Power^a | Y = 14.85988184091882*x^(−4.582236735711132) | 0.020 | 90.466 |
| 8 | S^a | Y = e^(−1.33226861857868 + 4.060797975393096/x) | 0.020 | 90.469 |
| 9 | Growth^a | Y = e^(7.825612309578503 − 5.158124367823692*x) | 0.020 | 90.475 |
| 10 | Exponential^a | Y = 2,503.918815021841*e^(−5.158124367823692*x) | 0.020 | 90.475 |
| 11 | Logistic^a | Y = 1/(0 + 0.0003993739709133802*173.8380933116968^x) | 0.020 | 90.475 |
^a The model requires all non-missing values to be positive.
Result maps of the four models with better R² and ∑Residual² than the other models.
The following figures show the results of these models based on the Landsat8 satellite data, processed with the ENVI software. The best two curve models (the Inverse model and the Power model) were finally selected by considering R² and ∑Residual².
BP neural network
The study used 77.2% of the data for training and 22.8% for testing with the BP neural network, whose structure is shown in Figure 5. The B1–B11 band data and Chl-a values were used as the input (Figure 5).
In the end, the sum-of-squares error of training was 23.625 with a relative error of 0.675, and the sum-of-squares error of testing was 1.436 with a relative error of 0.522. From the parameter estimation of the BP neural network (Table 2), the normalized importance of each band was 61.4%, 46.3%, 13.2%, 8.3%, 45.3%, 30.1%, 100.0%, 44.5%, 35.5%, 17.0%, 28.3% (from B1 to B11).
Parameter estimation of BP neural network
Input rows give the input-to-hidden-layer weights; the last three rows give the hidden-to-output weights.

| Predictor | H(1:1) | H(1:2) | Chl-a (output) | Importance | Normalized importance |
|---|---|---|---|---|---|
| Input: (Bias) | −0.659 | −0.359 | | | |
| B1 | 0.512 | −0.223 | | 0.143 | 61.4% |
| B2 | −0.555 | −0.064 | | 0.108 | 46.3% |
| B3 | −0.169 | −0.234 | | 0.031 | 13.2% |
| B4 | −0.029 | 0.048 | | 0.019 | 8.3% |
| B5 | −0.907 | −0.109 | | 0.105 | 45.3% |
| B6 | −0.312 | −0.539 | | 0.070 | 30.1% |
| B7 | −0.763 | 0.366 | | 0.233 | 100.0% |
| B8 | 0.119 | −0.343 | | 0.103 | 44.5% |
| B9 | 0.648 | 0.857 | | 0.083 | 35.5% |
| B10 | 0.414 | 0.525 | | 0.040 | 17.0% |
| B11 | 0.068 | 0.266 | | 0.065 | 28.3% |
| Hidden layer 1: (Bias) | | | −0.017 | | |
| Hidden layer 1: H(1:1) | | | −0.916 | | |
| Hidden layer 1: H(1:2) | | | 0.922 | | |
As can be seen from Figure 6, the Chl-a values predicted by the BP method were spatially smooth overall, and the southwestern area had higher Chl-a values than the other areas.
RBF neural network
The study used 77.2% of the data for training and 22.8% for testing with the RBF neural network, whose structure is shown in Figure 7. The B1–B11 band data and Chl-a values were used as the input (Figure 7).
In the end, the sum-of-squares error of training was 17.890 with a relative error of 0.542, and the sum-of-squares error of testing was 4.479 with a relative error of 0.458. From the parameter estimation of the RBF neural network (Table 3), the normalized importance of each band was 73.2%, 78.1%, 71.3%, 80.8%, 100.0%, 70.0%, 66.8%, 74.2%, 72.5%, 75.4%, 62.2% (from B1 to B11).
Parameter estimation of RBF neural network
Input rows give the hidden-unit center components for each band; the H(1)–H(9) rows give the hidden-to-output weights.

| Predictor | H(1) | H(2) | H(3) | H(4) | H(5) | H(6) | H(7) | H(8) | H(9) | Chl-a (output) | Importance | Normalized importance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B1 | 0.312 | −1.353 | 0.725 | 1.114 | −0.275 | −0.370 | −0.332 | 1.613 | 1.741 | | 0.089 | 73.2% |
| B2 | 0.332 | −1.303 | 0.544 | 1.053 | −0.277 | −0.377 | −0.522 | 1.711 | 1.803 | | 0.095 | 78.1% |
| B3 | −0.015 | −1.259 | 0.346 | 1.371 | −0.444 | 0.014 | −0.156 | 1.770 | 1.741 | | 0.086 | 71.3% |
| B4 | 0.059 | −1.108 | 0.100 | 1.186 | −0.397 | −0.193 | −0.503 | 1.882 | 2.081 | | 0.098 | 80.8% |
| B5 | −0.482 | −0.705 | −0.272 | 0.520 | −0.442 | 1.116 | 3.197 | 0.007 | 0.254 | | 0.121 | 100.0% |
| B6 | −0.407 | −0.685 | −0.055 | 0.563 | −0.431 | 0.345 | 3.585 | 0.354 | 0.566 | | 0.085 | 70.0% |
| B7 | −0.386 | −0.706 | 0.155 | 1.116 | −0.498 | −0.158 | 2.745 | 0.882 | 1.249 | | 0.081 | 66.8% |
| B8 | 0.073 | −1.213 | 0.262 | 1.251 | −0.408 | −0.131 | −0.373 | 1.845 | 1.868 | | 0.091 | 74.2% |
| B9 | 0.851 | 0.366 | 0.579 | 0.905 | −0.949 | −0.945 | −0.744 | 0.657 | 1.025 | | 0.088 | 72.5% |
| B10 | −1.184 | −1.288 | 1.324 | 1.167 | 0.109 | 0.214 | 0.845 | 1.140 | 0.932 | | 0.091 | 75.4% |
| B11 | −0.629 | −0.636 | 0.886 | 0.791 | 0.187 | 0.235 | 0.587 | 0.776 | −6.538 | | 0.075 | 62.2% |
| Hidden unit width | 0.450 | 0.471 | 0.612 | 0.446 | 0.339 | 0.515 | 1.862 | 0.555 | 0.339 | | | |
| H(1) | | | | | | | | | | −0.362 | | |
| H(2) | | | | | | | | | | −0.276 | | |
| H(3) | | | | | | | | | | 0.138 | | |
| H(4) | | | | | | | | | | 3.450 | | |
| H(5) | | | | | | | | | | −0.229 | | |
| H(6) | | | | | | | | | | 0.684 | | |
| H(7) | | | | | | | | | | −0.333 | | |
| H(8) | | | | | | | | | | −0.188 | | |
| H(9) | | | | | | | | | | −0.249 | | |
^a The H(1)–H(9) columns display the center vector for each hidden unit.
From Figure 8, the Chl-a values predicted by the RBF method were spatially smooth overall.
Deep learning method
The last model was the deep learning model (Barzegar et al. 2020; Peterson et al. 2020; Gebler et al. 2021; Maier et al. 2021), a Deep Neural Network (DNN) model, which included three fully connected layers (Ning et al. 2021). The simplified flow of the DNN model is shown in Figure 9. The B1–B11 band values were the inputs and the Chl-a value was the target; the proportion of training data to testing data was 8:2. The model was trained for 80 epochs with batch_size = 1 and verbose = 2. Finally, the mean squared errors (MSE) of the DNN model were 26.995 (training) and 4.356 (testing), and the mean absolute error (MAE) was 0.635 (testing). The results of this deep network structure are shown in Figure 10.
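The two reported criteria, MSE and MAE, are computed as below; the vectors are made-up numbers, and only the formulas mirror the evaluation applied to the models:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))

# Made-up measured vs. predicted Chl-a values
measured = [1.0, 2.0, 3.0]
predicted = [1.5, 1.5, 3.5]
mse_val = mse(measured, predicted)  # 0.25
mae_val = mae(measured, predicted)  # 0.5
```

MSE penalizes large deviations more strongly than MAE, which is why the two criteria can rank models differently.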
After trying the different models, the segmentation results of the Landsat8 data for assessing the Chl-a level are shown in Figure 11; several areas with apparent Chl-a can be seen in the study area. Figure 11 helps to show the visual performance of the Chl-a level for these models. Clearly, the curve fitting models were not as accurate as the machine learning models in Figure 11. The power curve fitting model had the best results among the curve fitting models, with ∑Residual² of fitting = 90.469; however, this was still worse than the machine learning models. Among the three machine learning models, in terms of the mean squared error of testing (Table 4), BP gave the best results, followed by DNN, and then RBF.
Model analysis table
| Model | ∑Residual² of fitting | Mean squared error of training | Mean squared error of testing |
|---|---|---|---|
| Inverse curve fitting model | 602,156.608 | / | / |
| Power curve fitting model | 90.469 | / | / |
| BP model | / | 23.625 | 1.436 |
| RBF model | / | 17.890 | 4.479 |
| DNN model | / | 26.995 | 4.356 |
Segmentation results of the Landsat8 data: (a) input original images, (b) results of inverse model images, (c) results of power model images, (d) results of BP model images, (e) results of RBF model images, (f) results of DNN model images.
CONCLUSIONS
In this study, Chl-a values in Taihu Lake were retrieved by several methods based on Landsat8 data from 2017. Among the curve fitting methods, the power model and the inverse model had better ∑Residual² of fitting than the other curve fitting models. However, the machine learning methods had less error than the curve fitting methods. Of course, the curve fitting methods were simpler than the machine learning methods and did not demand much computing power or machine configuration. Of the three machine learning methods, BP had the best results, and DNN also achieved a low mean squared error. Therefore, the best method in this study was the BP model. Although deep learning is very popular now, its result was slightly inferior to the BP method here. The reason may be that the function between the last hidden layer and the output layer of the deep learning model adopted in this study was the same as in the BP method. The data came from multiple measurement points over multiple months of 2017; with a larger amount of data, deep learning might outperform the BP method. In any case, the advantages of a deep network were not fully demonstrated, and the other methods proved more effective in this study. In the future, data volume and equipment conditions need to be considered to choose the optimal method.
ACKNOWLEDGEMENTS
Chlorophyll-a concentration data were from CERN (Taihu Laboratory for Lake Ecosystem Research, Nanjing Institute of Geography and Limnology). This research was supported by grants from the Development Program of China: 'Research and demonstration of ecological construction of typical islands in the South China Sea and the monitoring technology of ecological things in the South China Sea', No. 2017YFC0506304, and 'Based on remote sensing geology survey, the application information extraction and drawing of national defense construction', No. DD2016007637.
AUTHOR CONTRIBUTIONS
Xiaolan Zhao and Haoli Xu designed the experiments; Xiaolan Zhao, Haoli Xu and Zhibin Ding performed the experiments; Tingfong Wu and Wei Li obtained the measured Chl-a data; Daqing Wang, Zhengdong Deng and Yi Wang analyzed the data; Zhao Lu and Guangyuan Wang contributed materials and analysis tools; Haoli Xu and Xiaolan Zhao wrote the paper.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.