Chlorophyll-a (Chl-a) is an important index in water quality assessment by remote sensing. Many classical methods exist for estimating Chl-a in rivers and lakes, such as curve fitting, the back propagation (BP) neural network and the radial basis function (RBF) neural network, and all of them have been applied in practice. With the rise of computing power and deep learning (DL), this study set out to apply DL to the estimation of water quality and Chl-a and to compare it with several classical methods, so as to explore and develop better methods. Taking Taihu Lake, China, as the case, the study used Chl-a data measured in the lake in 2017 together with Landsat 8 data from the corresponding dates. Four methods (curve fitting, BP, RBF and DL) were used to invert the distribution of Chl-a in Taihu Lake. Among the curve fitting models, the power model (∑Residual² of fitting = 90.469) and the inverse model (∑Residual² of fitting = 602,156.608) gave better results than the others; however, judging from the segmentation result images, they were not as accurate as the machine learning methods. The mean squared errors of testing for the three machine learning methods (BP, RBF, DL) were 1.436, 4.479 and 4.356, respectively. Thus, the BP and DL methods gave the better results in this study.

  • BP had the best results, and the DNN also had a good mean squared error.

  • In the future, the data volume and equipment condition need to be discussed to choose the optimal method.

  • The machine learning method had less error than the curve fitting methods.

With the development of spectroscopy technology, many studies have monitored water quality in Taihu Lake (China) using satellite remote-sensing data. Among the available satellites, the Landsat series is the most popular data source in China (Wu & Yang 2019). Monitoring water quality, especially chlorophyll-a, through remote-sensing data is therefore an important research issue. The earliest studies obtained data from Landsat 5, which carried the MSS (Multispectral Scanner) and TM (Thematic Mapper) instruments (Li et al. 1995a; She et al. 1996; She & Cai 1997; Wang & Ma 2000; Ma & Dai 2005a; Yin et al. 2005). These studies established the relation between spectral data and chlorophyll-a concentration, developed regression models for it, and analyzed the accuracy and applicability of the models. The estimates of chlorophyll-a concentration were not precise enough because of the low resolution of the TM spectrometer (Wang & Ma 2000; Tao et al. 2019; Wang & Wu 2019). Landsat 8 was launched in 2013 carrying the OLI (Operational Land Imager) and the TIRS (Thermal Infrared Sensor), and it has a 16-day repeat cycle. Compared with medium- and low-resolution satellite sensors (e.g., MODIS), the OLI provides better spatial information, with a spatial resolution of 30 m (https://www.usgs.gov/land-resources/). Several studies have shown that Landsat 8 OLI imagery can be used for water quality monitoring in inland shallow lakes (Barsi et al. 2014; Lobo et al. 2015; Mushtaq & Lala 2016; Zheng et al. 2016).

Remotely sensed data can be applied to water quality monitoring mainly through three methods: the analytical method, the semi-analytical method, and the empirical method (Morel & Gordon 1980; Zhou et al. 2004; Odermatt et al. 2012; Wang et al. 2012; Wu & Yang 2019). The analytical method relies only on theoretical formulae. The semi-analytical method integrates theoretical and empirical methods. The empirical method refers to linear regression, curve fitting or non-linear functions (Gitelson 1992; Gurlin et al. 2011). It has been widely used for estimating chlorophyll-a concentration, and it is a simple, suitable and accurate method for a small area (Simis et al. 2005; Matthews 2011; Odermatt et al. 2012; Wu & Yang 2019). Based on statistical analysis, researchers select the optimal band or band combination from the remote-sensing data, derive an inversion algorithm relating the laboratory-measured water quality parameters to the band data (or band combination data), and then analyze the accuracy and applicability of the algorithm. For Taihu Lake, typical empirical methods have included linear regression, the single-band method, the band-combination method (e.g., band ratio and band difference), artificial neural networks, principal component analysis and so on (Li et al. 1995b; She et al. 1996; Wang & Ma 2000; Zhou et al. 2004; Ma & Dai 2005a; Sun et al. 2009; Duan et al. 2012; Wang et al. 2013; Zhang et al. 2016).

Taihu Lake is typical Case II water, in which the water constituents interact with one another, so the relation between spectral characteristics and chlorophyll-a is complex. Inversion results improved further when exponential curve fitting was used instead of linear regression. Studies on chlorophyll-a concentration inversion have shown the curve fitting algorithm to be a potentially useful method (Hansen & Schjoerring 2003; Liu et al. 2004; Wang et al. 2015a).

In recent years, many scholars have tried to use the ANN (Artificial Neural Network) method to retrieve water quality parameters because it can describe non-linear and complex systems. An ANN can ‘learn’ from observed data because it is based on the non-linear mapping structures of the human brain, and it is a universal and highly flexible function approximator. Many studies have established neural network estimation models of chlorophyll-a concentration in specific waters, and inversion accuracy was improved significantly (Scardi 1996; Keiner & Yan 1998; Schiller & Doerffer 1999; Zhang et al. 2002; Matthews et al. 2010; Matthews 2011). There have also been some studies on Taihu Lake that monitor water quality with neural network algorithms (Ma & Dai 2005b; Song et al. 2013; Li et al. 2014; Qi et al. 2014; Zhu et al. 2017; Zhang et al. 2018), most of them using one or two neural-network algorithms for Chl-a concentration inversion (Sun et al. 2009; Wang et al. 2015b; Cao et al. 2016; Zhu et al. 2017). However, different algorithms have their own advantages and disadvantages; for example, the generalization ability of the Back Propagation (BP) model is limited by ‘over-fitting’ and other factors (Haykin 2008). Deep learning has been widely used in many fields. Because the existing studies usually employ only one or two algorithms, simulations using more than two methods are rare, and precision comparisons between multiple methods are lacking.

In this research, four models were used to calculate the relation between algal chlorophyll-a and Landsat 8 spectral data: the curve fitting model, the BP (Back Propagation) ANN model, the RBF (Radial Basis Function) model, and the DL (Deep Learning) model. By comparing the performance of these models, the study chose the best one for inversion of chlorophyll-a concentration in Taihu Lake by multispectral satellite remote sensing.

The overview of the study area

Taihu Lake, the third largest freshwater lake in China, is located in the south of the Yangtze River Delta (30°56′–31°34′N, 119°54′–120°36′E); Figure 1 shows it with the 32 observation points (the red points). The total area of the lake is about 2,338 km² (http://www.tba.gov.cn/). Large-area blue-green algal blooms appear in April or May almost every year. Algae-polluted water in the lake has adversely affected and disrupted the normal life of several million nearby residents, and water eutrophication has become a serious environmental problem. It is therefore important to estimate chlorophyll-a concentration in a timely way by means of remote sensing in order to predict outbreaks of blue-green algal blooms.

Figure 1

The study area with observation points (the red points). Please refer to the online version of this paper to see this figure in color: http://dx.doi.org/10.2166/ws.2021.137.

In addition, the study used Landsat 8 data downloaded from http://www.gscloud.cn/ and matched in time to the 2017 measurements at the 32 observation points.

Curve fitting methods

Curve fitting approximates discrete points on a plane with a continuous curve. Commonly used methods include linear, polynomial, exponential and Gaussian fitting. Generally, the curve type can be chosen from domain knowledge of the problem; if not, a scatter diagram can be drawn and an appropriate curve type selected according to the distribution of the points (Liu et al. 2004).
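As a minimal illustration of curve fitting, the sketch below fits a power curve y = a·x^b by ordinary least squares on the log-transformed data. This is one common way to estimate such a curve and is an assumption here, not necessarily the procedure used in the study; the function names and the synthetic data are hypothetical.

```python
import math

def fit_power_curve(xs, ys):
    """Fit y = a * x**b by least squares on (ln x, ln y)."""
    # The log transform only exists for strictly positive values.
    if any(x <= 0 for x in xs) or any(y <= 0 for y in ys):
        raise ValueError("log-linearised fit needs strictly positive values")
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log-log space = ln(a)
    return a, b

def log_residual_sum_of_squares(xs, ys, a, b):
    """Sum of squared residuals measured in log space."""
    return sum((math.log(y) - math.log(a * x ** b)) ** 2 for x, y in zip(xs, ys))

# Synthetic, noise-free data generated from y = 2 * x**-3 (illustrative only).
xs = [0.8, 0.9, 1.0, 1.1, 1.2]
ys = [2.0 * x ** -3.0 for x in xs]
a, b = fit_power_curve(xs, ys)
rss = log_residual_sum_of_squares(xs, ys, a, b)
```

Because the fit works on logarithms, it cannot handle zero or negative values, which is why the helper raises an error for them.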

BP neural network

The BP neural network is a multi-layer feedforward neural network whose neurons use an S-type (sigmoid) transfer function, so the output is a continuous quantity between 0 and 1, and the network can realize arbitrary nonlinear mapping from input to output.

The BP algorithm is a delta (δ) rule algorithm, which is a supervised learning algorithm. Given input learning samples X1, X2, …, Xn with known corresponding output samples (targets) O1, O2, …, Om, learning uses the errors between the actual network outputs A1, A2, …, Am and the targets O1, O2, …, Om to modify the weights, so that each Ai (i = 1, 2, …, m) comes as close as possible to the expected Oi (Zheng et al. 2017). In other words, the error sum of the network output layer is minimized. The simple structure of the BP neural network is shown in Figure 2, where W1…Wj…Wm are the weights of the indexes. The algorithm consists of two parts: the forward transmission of information and the back propagation of error. In forward propagation, the input information is passed from the input layer through the hidden layers to the output layer, computed layer by layer, and the state of each layer of neurons affects only the state of the next layer. If the desired output is not obtained at the output layer, the error of the output layer is calculated, and the error signal is then propagated back through the network along the original connection paths to modify the weights of the neurons in each layer until the desired target is reached.
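The two phases described above (forward transmission, then back propagation of error) can be sketched for a tiny network with one hidden layer and sigmoid neurons. This is an illustrative toy under stated assumptions, not the study's network: the layer sizes, learning rate, single training sample and all names are hypothetical, and the target is taken to be already scaled into (0, 1) because the sigmoid output lies in that range.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, net):
    """Forward pass: input -> hidden (sigmoid) -> output (sigmoid)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, output

def train_step(x, target, net, lr=0.5):
    """One gradient-descent update for the error E = 0.5 * (output - target)**2."""
    hidden, output = forward(x, net)
    error = 0.5 * (output - target) ** 2
    # Error signal at the output neuron (derivative of E w.r.t. its pre-activation).
    delta_out = (output - target) * output * (1.0 - output)
    # Propagate the error signal back to the hidden neurons (using the old weights).
    delta_hidden = [delta_out * w * h * (1.0 - h)
                    for w, h in zip(net["w_out"], hidden)]
    # Modify the weights layer by layer.
    net["w_out"] = [w - lr * delta_out * h for w, h in zip(net["w_out"], hidden)]
    net["b_out"] -= lr * delta_out
    net["w_hidden"] = [[w - lr * d * xi for w, xi in zip(ws, x)]
                       for ws, d in zip(net["w_hidden"], delta_hidden)]
    net["b_hidden"] = [b - lr * d for b, d in zip(net["b_hidden"], delta_hidden)]
    return error

rng = random.Random(0)
net = {
    "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)],
    "b_hidden": [0.0, 0.0],
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(2)],
    "b_out": 0.0,
}
sample, target = [0.2, 0.7], 0.9   # hypothetical scaled input and target
errors = [train_step(sample, target, net) for _ in range(200)]
```

Each call to `train_step` returns the error measured before that update, so the list `errors` traces how the output error shrinks as the weights are adjusted.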

Figure 2

The simple structure of BP neural network.

Radial basis function (RBF) neural network

Both the RBF neural network and the BP neural network are nonlinear multi-layer feedforward networks, but the BP neural network can have many hidden layers, while the RBF neural network has only one input layer, one hidden layer and one output layer. In addition to its simple structure, the RBF neural network is superior to the BP neural network in approximation ability, classification ability and learning speed. The basic idea of the RBF neural network is to use radial basis functions as the ‘basis’ of the hidden layer space. The hidden layer transforms the low-dimensional input vector into a high-dimensional space, so that a problem that is linearly inseparable in the low-dimensional space becomes linearly separable in the high-dimensional space. An RBF is a non-negative nonlinear function that is radially symmetric about a center and decays with distance from it. Once the centers of the RBFs are determined, the mapping from the input layer to the hidden layer is determined, and the mapping from the hidden layer to the output layer is linear. Therefore, the RBF neural network learning algorithm needs to determine three parameters: the centers of the radial basis functions, their variances, and the weights from the hidden layer to the output layer (Wang et al. 2013).
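The three parameter groups named above (centers, widths, and hidden-to-output weights) can be seen in a minimal forward pass. The sketch assumes a Gaussian basis function, which is a common choice but is not stated in the text; the function names and the numeric values are hypothetical.

```python
import math

def gaussian_rbf(x, center, width):
    """Radially symmetric, non-negative response that decays away from the center."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbf_network(x, centers, widths, out_weights, out_bias=0.0):
    """Nonlinear input-to-hidden mapping followed by a linear hidden-to-output mapping."""
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, widths)]
    return out_bias + sum(w * h for w, h in zip(out_weights, hidden))

# Hypothetical two-unit network on two-dimensional inputs.
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [0.5, 0.5]
out_weights = [2.0, -1.0]

at_first_center = rbf_network([0.0, 0.0], centers, widths, out_weights)
far_away = rbf_network([10.0, 10.0], centers, widths, out_weights)
```

At a center the corresponding hidden unit responds with 1, and far from every center all hidden responses vanish, so the output falls back to the bias.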

Deep learning (DL) method

DL builds multiple hidden layers on top of an artificial neural network to learn a more complex nonlinear network structure, and it can mine the essential characteristics of a data set from limited samples. Deep learning learns the internal regularities and representation levels of sample data, and the information obtained in the learning process is of great help in interpreting data such as text and images (Goodfellow et al. 2016). In this research, the DL model was implemented with Keras. The B1–B11 band data of Landsat 8 were used as the input layer, three fully connected layers were used to build the deep network, and the output layer gave the Chl-a concentration.
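To make the data flow concrete, the sketch below pushes one 11-band sample through three fully connected layers. It deliberately uses only the Python standard library rather than Keras, so it shows the structure and not the authors' code. The layer widths (16 and 8), the ReLU activation, the random untrained weights and the reading of "three layers" as two hidden layers plus the output layer are all assumptions.

```python
import random

def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def init_layer(n_in, n_out, rng):
    """Random small weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, layer, activation):
    """Fully connected layer: every output neuron sees every input."""
    weights, biases = layer
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def predict_chla(bands, layers):
    """11 band values -> hidden -> hidden -> single Chl-a estimate."""
    h = dense(bands, layers[0], relu)
    h = dense(h, layers[1], relu)
    return dense(h, layers[2], identity)[0]

rng = random.Random(42)
LAYER_SIZES = [11, 16, 8, 1]  # hypothetical widths; 11 inputs for B1-B11
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]

bands = [0.1] * 11            # placeholder band values, not real data
prediction = predict_chla(bands, layers)
```

With untrained weights the prediction is meaningless; training would adjust the weights to reduce the error against measured Chl-a values.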

Curve fitting models

There have been many curve fitting models used to predict Chl-a concentration with remote sensing data (Liu & Woods 2004). In addition, NDV values (B4/B3) have been used as the key index in inversion of Chl-a (Gu & Pei 2017). Therefore, the study took the NDV values of Landsat8 data as the X-axis and the measured Chl-a values as the Y-axis to achieve curve fitting.

From Figure 3 and Table 1, the accuracy indices R² and ∑Residual² show that the Power model and the S model had better results than the other curve fitting models, with ∑Residual² = 90.469 and ∑Residual² = 90.466. Therefore, the study used these models to map the level of Chl-a in the RS data based on Y = 14.85988184091882*x^(−4.582236735711132) and Y = e^(−1.33226861857868 + 4.060797975393096/x). In addition, the Logarithmic model and the Inverse model also had good results according to the R² index (R² = 0.001), which shows that the volatility of curve fitting was low; the results can be seen in Figure 4.
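As a quick way to read the fitted equations, the sketch below evaluates the Power and S models for a given band ratio x = B4/B3 and also checks that the Compound, Growth and Exponential rows of Table 1 describe the same curve written in different forms. The coefficients are copied from Table 1; the function names and the sample ratios are hypothetical.

```python
import math

def power_model(x):
    """Power model from Table 1: Y = 14.8599 * x^(-4.5822)."""
    return 14.85988184091882 * x ** (-4.582236735711132)

def s_model(x):
    """S model from Table 1: Y = e^(-1.3323 + 4.0608 / x)."""
    return math.exp(-1.33226861857868 + 4.060797975393096 / x)

def compound_model(x):
    return 2503.918815021841 * 0.005752479108287114 ** x

def growth_model(x):
    return math.exp(7.825612309578503 - 5.158124367823692 * x)

def exponential_model(x):
    return 2503.918815021841 * math.exp(-5.158124367823692 * x)

# Hypothetical band ratios; both models fall as the ratio rises.
for ratio in (0.9, 1.0, 1.1):
    print(ratio, power_model(ratio), s_model(ratio))

# Compound, Growth and Exponential agree up to rounding of the coefficients.
x = 1.0
spread = max(compound_model(x), growth_model(x), exponential_model(x)) - \
         min(compound_model(x), growth_model(x), exponential_model(x))
```

The agreement of the three forms is consistent with their identical R² and ∑Residual² values in Table 1.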

Table 1

Model description based on curve fitting method

Model name: MODEL_1
Dependent variable: Chl-a (mg/m³)
Independent variable: B4/B3 (NDV)
Constant: included
Tolerance of the input term in the equation: 0.0001

| Function | Equation | R² | ∑Residual² |
|---|---|---|---|
| Linear | Y = −67.88532101122092*x + 107.573046753554 | 0.001 | 602,224.882 |
| Logarithmic | Y = 39.74886869014925 − 63.28937890992608*log(x) | 0.001 | 602,190.653 |
| Inverse | Y = −18.80174551607722 + 58.64419935909177/x | 0.001 | 602,156.608 |
| Quadratic | Y = 3,465.543442615852 − 7,638.402568342078*x + 4,262.740570842534*x*x | 0.004 | 600,288.726 |
| Cubic | Y = 2,378.697088001991 − 3,908.075599447289*x + 0*x*x + 1,621.809320132796*x*x*x | 0.004 | 600,227.393 |
| Compound (a) | Y = 2,503.918815021841*0.005752479108287114^x | 0.020 | 90.475 |
| Power (a) | Y = 14.85988184091882*x^(−4.582236735711132) | 0.020 | 90.466 |
| S (a) | Y = e^(−1.33226861857868 + 4.060797975393096/x) | 0.020 | 90.469 |
| Growth (a) | Y = e^(7.825612309578503 − 5.158124367823692*x) | 0.020 | 90.475 |
| Exponential (a) | Y = 2,503.918815021841*e^(−5.158124367823692*x) | 0.020 | 90.475 |
| Logistic (a) | Y = 1/(0 + 0.0003993739709133802*173.8380933116968^x) | 0.020 | 90.475 |

(a) The model requires all non-missing values to be positive.

Figure 3

The results of 11 fitting models.

Figure 4

Result maps of the four models with better R² and ∑Residual² than the others.

These figures show the results of the models applied to the Landsat 8 satellite data using the ENVI software. Considering R² and ∑Residual², the two best curve models (the Inverse model and the Power model) were finally selected.

BP neural network

The study used 77.2% of the data for training and 22.8% for testing, with the BP neural network structure shown in Figure 5. The B1–B11 band data were the inputs and the Chl-a value was the output (Figure 5).

Figure 5

The structure of the BP neural network.

After training, the sum-of-squares error of training was 23.625 with a relative error of 0.675, and the sum-of-squares error of testing was 1.436 with a relative error of 0.522. From the parameter estimation of the BP neural network (Table 2), the normalized importance of the bands was 61.4%, 46.3%, 13.2%, 8.3%, 45.3%, 30.1%, 100.0%, 44.5%, 35.5%, 17.0% and 28.3% (from B1 to B11).
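The normalized importance values are the raw importance values divided by the largest one and expressed as percentages. The sketch below recomputes them from the Importance column of Table 2; because the tabulated importances are rounded to three decimals, the recomputed percentages can differ from the reported ones in the last digit. The variable and function names are hypothetical.

```python
BANDS = [f"B{i}" for i in range(1, 12)]
# Importance column of Table 2, in order B1..B11.
IMPORTANCE = [0.143, 0.108, 0.031, 0.019, 0.105, 0.070,
              0.233, 0.103, 0.083, 0.040, 0.065]

def normalized_importance(importance):
    """Scale so that the most important predictor is 100%."""
    top = max(importance)
    return [100.0 * value / top for value in importance]

normalized = normalized_importance(IMPORTANCE)

# Rank the bands from most to least important.
ranked = sorted(zip(BANDS, normalized), key=lambda pair: pair[1], reverse=True)
for band, share in ranked:
    print(f"{band}: {share:.1f}%")
```

Ranking the bands this way shows which bands contribute most to the BP prediction according to Table 2.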

Table 2

Parameter estimation of BP neural network

Predicted: hidden layer 1 units H(1:1) and H(1:2), and output Chl-a.

| Predictor | H(1:1) | H(1:2) | Chl-a (output) | Importance | Normalized importance |
|---|---|---|---|---|---|
| Input (Bias) | −0.659 | −0.359 | | | |
| Input B1 | 0.512 | −0.223 | | 0.143 | 61.4% |
| Input B2 | −0.555 | −0.064 | | 0.108 | 46.3% |
| Input B3 | −0.169 | −0.234 | | 0.031 | 13.2% |
| Input B4 | −0.029 | 0.048 | | 0.019 | 8.3% |
| Input B5 | −0.907 | −0.109 | | 0.105 | 45.3% |
| Input B6 | −0.312 | −0.539 | | 0.070 | 30.1% |
| Input B7 | −0.763 | 0.366 | | 0.233 | 100.0% |
| Input B8 | 0.119 | −0.343 | | 0.103 | 44.5% |
| Input B9 | 0.648 | 0.857 | | 0.083 | 35.5% |
| Input B10 | 0.414 | 0.525 | | 0.040 | 17.0% |
| Input B11 | 0.068 | 0.266 | | 0.065 | 28.3% |
| Hidden layer 1 (Bias) | | | −0.017 | | |
| Hidden layer 1 H(1:1) | | | −0.916 | | |
| Hidden layer 1 H(1:2) | | | 0.922 | | |

As can be seen from Figure 6, the results of predicting the Chl-a value showed that the total values were smooth based on the BP method. In addition, the southwestern area had a higher Chl-a value than the other areas.

Figure 6

The prediction results of Chl-a based on BP neural network.

RBF neural network

The study used 77.2% of the data for training and 22.8% for testing, with the RBF neural network structure shown in Figure 7. The B1–B11 band data were the inputs and the Chl-a value was the output (Figure 7).

Figure 7

The structure of the RBF neural network.

After training, the sum-of-squares error of training was 17.890 with a relative error of 0.542, and the sum-of-squares error of testing was 4.479 with a relative error of 0.458. From the parameter estimation of the RBF neural network (Table 3), the normalized importance of the bands was 73.2%, 78.1%, 71.3%, 80.8%, 100.0%, 70.0%, 66.8%, 74.2%, 72.5%, 75.4% and 62.2% (from B1 to B11).

Table 3

Parameter estimation of RBF neural network

Predicted: hidden layer (a) units H(1)–H(9), and output Chl-a.

| Predictor | H(1) | H(2) | H(3) | H(4) | H(5) | H(6) | H(7) | H(8) | H(9) | Chl-a (output) | Importance | Normalized importance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Input B1 | 0.312 | −1.353 | 0.725 | 1.114 | −0.275 | −0.370 | −0.332 | 1.613 | 1.741 | | 0.089 | 73.2% |
| Input B2 | 0.332 | −1.303 | 0.544 | 1.053 | −0.277 | −0.377 | −0.522 | 1.711 | 1.803 | | 0.095 | 78.1% |
| Input B3 | −0.015 | −1.259 | 0.346 | 1.371 | −0.444 | 0.014 | −0.156 | 1.770 | 1.741 | | 0.086 | 71.3% |
| Input B4 | 0.059 | −1.108 | 0.100 | 1.186 | −0.397 | −0.193 | −0.503 | 1.882 | 2.081 | | 0.098 | 80.8% |
| Input B5 | −0.482 | −0.705 | −0.272 | 0.520 | −0.442 | 1.116 | 3.197 | 0.007 | 0.254 | | 0.121 | 100.0% |
| Input B6 | −0.407 | −0.685 | −0.055 | 0.563 | −0.431 | 0.345 | 3.585 | 0.354 | 0.566 | | 0.085 | 70.0% |
| Input B7 | −0.386 | −0.706 | 0.155 | 1.116 | −0.498 | −0.158 | 2.745 | 0.882 | 1.249 | | 0.081 | 66.8% |
| Input B8 | 0.073 | −1.213 | 0.262 | 1.251 | −0.408 | −0.131 | −0.373 | 1.845 | 1.868 | | 0.091 | 74.2% |
| Input B9 | 0.851 | 0.366 | 0.579 | 0.905 | −0.949 | −0.945 | −0.744 | 0.657 | 1.025 | | 0.088 | 72.5% |
| Input B10 | −1.184 | −1.288 | 1.324 | 1.167 | 0.109 | 0.214 | 0.845 | 1.140 | 0.932 | | 0.091 | 75.4% |
| Input B11 | −0.629 | −0.636 | 0.886 | 0.791 | 0.187 | 0.235 | 0.587 | 0.776 | −6.538 | | 0.075 | 62.2% |
| Hidden unit width | 0.450 | 0.471 | 0.612 | 0.446 | 0.339 | 0.515 | 1.862 | 0.555 | 0.339 | | | |
| Hidden layer H(1) | | | | | | | | | | −0.362 | | |
| Hidden layer H(2) | | | | | | | | | | −0.276 | | |
| Hidden layer H(3) | | | | | | | | | | 0.138 | | |
| Hidden layer H(4) | | | | | | | | | | 3.450 | | |
| Hidden layer H(5) | | | | | | | | | | −0.229 | | |
| Hidden layer H(6) | | | | | | | | | | 0.684 | | |
| Hidden layer H(7) | | | | | | | | | | −0.333 | | |
| Hidden layer H(8) | | | | | | | | | | −0.188 | | |
| Hidden layer H(9) | | | | | | | | | | −0.249 | | |

(a) Displays the center vector for each hidden unit.

From Figure 8, the results of the prediction of Chl-a value showed that the total values were smooth based on the RBF method.

Figure 8

The prediction results of Chl-a based on RBF neural network.

Deep learning method

The last model was a deep learning model (Barzegar et al. 2020; Peterson et al. 2020; Gebler et al. 2021; Maier et al. 2021): a Deep Neural Network (DNN) with three fully connected layers (Ning et al. 2021). The simplified flow of the DNN model is shown in Figure 9. The B1–B11 band values and the Chl-a values were used as the input data, and the ratio of training data to testing data was 8:2. The model was trained with epochs = 80, batch_size = 1 and verbose = 2. The mean squared errors (MSE) of the DNN model were 26.995 (training) and 4.356 (testing), and its mean absolute error (MAE) was 0.635 (testing). The results of this deep network structure are shown in Figure 10.
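The 8:2 split and the two error measures reported above can be illustrated with a few lines of plain Python. This is only a sketch of the bookkeeping, not the Keras workflow used in the study: the helper names are hypothetical, the shuffling with a fixed seed is an assumption, and the numbers are toy values.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy example: ten sample indices split 8:2.
train, test = train_test_split(list(range(10)))

# Toy measured and predicted values (not data from the study).
measured = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
mse = mean_squared_error(measured, predicted)   # (0 + 0 + 4) / 3
mae = mean_absolute_error(measured, predicted)  # (0 + 0 + 2) / 3
```

MSE penalizes large misses more heavily than MAE, which is why the two measures can rank models differently.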

Figure 9

The simplified flow of the DNN model.

Figure 10

The prediction results of Chl-a based on deep learning.

After the different models had been tried, the segmentation results of the Landsat 8 data for assessing the Chl-a level were obtained, as shown in Figure 11. There were some areas in the study area where Chl-a was apparent, and Figure 11 helps to show the visual performance of each model in representing the Chl-a level. Figure 11 shows that the curve fitting models were not as accurate as the machine learning models. The power curve fitting model had the best result among the curve fitting models, with ∑Residual² of fitting = 90.469; however, this was still worse than the machine learning models. Of the three machine learning models, BP was the best, followed by DNN and then RBF, in terms of the mean squared error of testing (Table 4).

Table 4

Model analysis table

| Model | ∑Residual² of fitting | Mean squared error of training | Mean squared error of testing |
|---|---|---|---|
| Inverse curve fitting model | 602,156.608 | | |
| Power curve fitting model | 90.469 | | |
| BP model | | 23.625 | 1.436 |
| RBF model | | 17.890 | 4.479 |
| DNN model | | 26.995 | 4.356 |

Figure 11

Segmentation results of the Landsat8 data: (a) input original images, (b) results of inverse model images, (c) results of power model images, (d) results of BP model images, (e) results of RBF model images, (f) results of DNN model images.

In this study, Chl-a values in Taihu Lake in 2017 were retrieved by various methods based on Landsat 8 data. Among the curve fitting methods, the power model and the inverse model had better ∑Residual² of fitting than the other curve fitting models. However, the machine learning methods had less error than the curve fitting methods. On the other hand, the curve fitting methods were simpler than the machine learning methods and did not demand much computing power or a high machine configuration. Of the three machine learning methods, BP had the best results, and DNN also had a good mean squared error. Therefore, the best method in this study was the BP model. Although deep learning is a very popular method now, its result was slightly inferior to that of the BP method in this study. The reason may be that the function between the last hidden layer and the output layer of the deep learning method adopted in this study was the same as in the BP method. The data came from multiple measurement points over multiple months of 2017. If a study has only a small amount of data, deep learning may be better than the BP method. In any case, the advantages of a deep network were not fully demonstrated here, and the other methods studied were more effective in this study. In the future, data volume and equipment conditions need to be considered when choosing the optimal method.

Chlorophyll-a concentration data were provided by CERN (Taihu Laboratory for Lake Ecosystem Research, Nanjing Institute of Geography and Limnology). This research was supported by a grant from the Development Program of China: research and demonstration of ecological construction of typical islands in the South China Sea and the monitoring technology of ecological things in the South China Sea, NO.2017YFC0506304, and by the project based on remote sensing geology survey, the application information extraction and drawing of national defense construction, NO.DD2016007637.

Xiaolan Zhao and Haoli Xu designed the experiments, Xiaolan Zhao, Haoli Xu and Zhibin Ding performed the experiments, Tingfong Wu and Wei Li got the measured Chl-a data, Daqing Wang, Zhengdong Deng and Yi Wang analyzed the data, Zhao Lu and Guangyuan Wang contributed materials and analysis tools, Haoli Xu and Xiaolan Zhao wrote the paper.

All relevant data are included in the paper or its Supplementary Information.

Barsi
J. A.
,
Lee
K.
,
Kvaran
G.
,
Markham
B.
&
Pedelty
J. A.
2014
The spectral response of the Landsat-8 operational land imager
.
Remote Sensing
6
,
10232
10251
.
doi:10.3390/rs61010232
.
Barzegar
R.
,
Aalami
M. T.
&
Adamowski
J.
2020
Short-term water quality variable prediction using a hybrid CNN–LSTM deep learning model
.
Stochastic Environmental Research and Risk Assessment
34
(
2
),
415
433
.
doi:10.1007/s00477-020-01776-2
.
Cao
H. Y.
,
Gong
T.
,
Yuan
C. Z.
&
Jiang
J. Q.
2016
Quantitative retrieval of chlorophyll-a concentration in northern part of Lake Taihu based on RBF model
.
Chinese Journal of Environmental Engineering
10
(
11
),
6449
6504
.
doi:10.12030/j.cjee.201506134
.
Duan
H. T.
,
Ma
R. H.
&
Hu
C. M.
2012
Evaluation of remote sensing algorithms for cyanobacteria pigment retrievals during spring bloom formation in several lakes of East China
.
Remote Sensing of Environment
126
,
126
135
.
doi:10.1016/j.rse.2012.08.011
.
Gebler
D.
,
Kolada
A.
,
Pasztaleniec
A.
&
Szoszkiewicz
K.
2021
Modelling of ecological status of Polish lakes using deep learning techniques
.
Environmental Science and Pollution Research
28
(
5
),
5383
5397
.
doi:10.1007/s11356-020-10731-1
.
Goodfellow
I.
,
Bengio
Y.
&
Courville
A.
2016
Deep Learning
.
The MIT Press
,
Cambridge, MA
, USA.
Gu
J. P.
&
Pei
L.
2017
Retrieval of chlorophyll content and temperature in Taihu based on Landsat 8-OLI/TIRS and HJ-1B
.
Geomatics & Spatial Information Technology
40
(
5
),
146
151 + 156
.
doi:10.3969/j.issn.1672-5867.2017.05.046
.
Gurlin
D.
,
Gitelson
A. A.
&
Moses
W. J.
2011
Remote estimation of chl-a concentration in turbid productive waters – return to a simple two-band NIR-red model?
Remote Sensing of Environment
115
,
3479
3490
.
doi:10.1016/j.rse.2011.08.011
.
Haykin
S.
2008
Neural Networks and Learning Machines
.
Prentice Hall
,
New York, USA
.
Keiner
L. E.
&
Yan
X. H.
1998
A neural network model for estimating sea surface chlorophyll and sediments from thematic mapper imagery
.
Remote Sensing of Environment
66
(
2
),
153
165
.
doi:10.1016/S0034-4257(98)00054-6
.
Li
X. W.
,
Ji
G. S.
&
Yang
J.
1995a
Estimating cyanophyta biomass standing crops in Meiliang Gulf of Lake Taihu by satellite remote sensing
.
Remote Sensing for Land & Resources
2
,
23
28
.
Li
X. W.
,
Ji
G. S.
&
Yang
J.
1995b
Satellite remote sensing of phytoplankton in Taihu Lake
.
Journal of Lake Sciences
7
(
1
),
65
68
.
doi:10.18307/1995.0109
.
Liu
C. C.
&
Woods
J.
2004
Deriving four parameters from patchy observations of ocean color for testing a plankton ecosystem model
.
Deep-Sea Research Part II: Topical Studies in Oceanography
51
(
10–11
),
1053
1062
.
doi:10.1016/j.dsr2.2003.10.007
.
Liu
H. X.
,
Zhang
C. M.
&
Liang
X. X.
2004
Conic fitting of scattered data points on a plane
.
Journal of Computer-Aided Design and Graphics
16
(
11
),
1594
1598
.
doi:10.3321/j.issn:1003-9775.2004.11.023
.
Lobo
F. L.
,
Costa
M. P. F.
&
Novo
E. M. L. M.
2015
Time-series analysis of Landsat-MSS/TM/OLI images over Amazonian waters impacted by gold mining activities
.
Remote Sensing of Environment
157
,
170
184
.
doi:10.1016/j.rse.2014.04.030
.
Ma
R. H.
&
Dai
J. F.
2005a
Investigation of chlorophyll-a and total suspended matter concentrations using Landsat-ETM and field spectral measurement in Taihu Lake, China
.
International Journal of Remote Sensing
26
(
13
),
2779
2795
.
doi:10.1080/01431160512331326648
.
Maier
P. M.
,
Keller
S.
&
Hinz
S.
2021
Deep learning with WASI simulation data for estimating chlorophyll-a concentration of inland water bodies
.
Remote Sensing
13
(
4
), 718.
doi:10.3390/rs13040718
.
Matthews
M. W.
2011
A current review of empirical procedures of remote sensing in inland and near-coastal transitional waters
.
International Journal of Remote Sensing
32
(
21
),
6855
6899
.
doi:10.1080/01431161.2010.512947
.
Matthews M. W., Bernard S. & Winter K. 2010 Remote sensing of cyanobacteria-dominant algal blooms and water quality parameters in Zeekoevlei, a small hypertrophic lake, using MERIS. Remote Sensing of Environment 114 (9), 2070–2087. doi:10.1016/j.rse.2010.04.013.
Morel A. Y. & Gordon H. R. 1980 Report of the working group on water color. Boundary-Layer Meteorology 18, 343–355. doi:10.1007/BF00122030.
Mushtaq F. & Lala M. G. N. 2016 Remote estimation of water quality parameters of Himalayan lake (Kashmir) using Landsat 8 OLI imagery. Geocarto International 32, 274–285. doi:10.1080/10106049.2016.1140818.
Ning H. T., Jiang P. & Wu Y. L. 2021 Research on aerosol optical depth retrieval of Himawari-8 data based on deep neural networks. The Administration and Technique of Environmental Monitoring 33 (1), 8–12. doi:10.19501/j.cnki.1006-2009.2021.01.003.
Odermatt D., Gitelson A. A., Brando V. E. & Schaepman M. 2012 Review of constituent retrieval in optically deep and complex waters from satellite imagery. Remote Sensing of Environment 118, 116–126. doi:10.1016/j.rse.2011.11.013.
Peterson K. T., Sagan V. & Sloan J. J. 2020 Deep learning-based water quality estimation and anomaly detection using Landsat-8/Sentinel-2 virtual constellation and cloud computing. GIScience & Remote Sensing 57 (4), 510–525. doi:10.1080/15481603.2020.1738061.
Scardi M. 1996 Artificial neural networks as empirical models for estimating phytoplankton production. Marine Ecology Progress Series 139 (1–3), 289–299. doi:10.3354/meps139289.
Schiller H. & Doerffer R. 1999 Neural network for emulation of an inverse model operational derivation of Case II water properties from MERIS data. International Journal of Remote Sensing 20 (9), 1735–1746. doi:10.1080/0143.
She F. N. & Cai Q. M. 1997 Principal-component-supervised classification and its application to image recognition of water quality. Journal of Lake Sciences 9 (3), 261–268. doi:10.18307/1997.0311.
She F. N., Li X. W., Cai Q. M. & Chen Y. W. 1996 Quantitative analysis on chlorophyll-a concentration in Taihu Lake using thematic mapper data. Journal of Lake Sciences 8 (3), 201–207. doi:10.18307/1996.0302.
Simis S. G. H., Peters S. W. M. & Gons H. J. 2005 Remote sensing of the cyanobacterial pigment phycocyanin in turbid inland water. Limnology and Oceanography 50, 237–245. doi:10.4319/lo.2005.50.1.0237.
Song K. S., Li L., Tedesco L. P., Li S., Duan H. T., Liu D. W., Hall B. E., Du J., Li Z. C., Shi K. & Zhao Y. 2013 Remote estimation of chlorophyll-a in turbid inland waters: three-band model versus GA-PLS model. Remote Sensing of Environment 136, 342–357. doi:10.1016/j.rse.2013.05.017.
Sun D. Y., Li Y. M. & Wang Q. 2009 A unified model for remotely estimating chlorophyll a in Lake Taihu, China, based on SVM and in situ hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing 47 (8), 2957–2965. doi:10.1109/TGRS.2009.2014688.
Tao R., Peng J. C., Zhang H., Wu Y. X. & Zhang D. R. 2019 Research progress on chlorophyll-a monitoring in inland waters based on remote sensing. Geomatics World 26 (4), 44–53.
Wang X. J. & Ma T. 2000 The application of remote sensing technology in monitoring the water quality of Taihu Lake. Environmental Science 21 (6), 65–68. doi:10.13227/j.hjkx.2000.06.015.
Wang X. Y. & Yang W. 2019 Water quality monitoring and evaluation using remote sensing techniques in China: a systematic review. Ecosystem Health and Sustainability 5 (1), 47–56. doi:10.1080/20964129.2019.1571443.
Wang H., Zhao D. Z., Wang L. & Huang F. R. 2012 Advance in remote sensing of water quality. Marine Environmental Science 31 (2), 285–288. doi:10.3969/j.issn.1007-6336.2012.02.030.
Wang X. C., Shi F., Yu L. & Li Y. 2013 MATLAB Neural Network 43 Case Analysis. Beihang University Press, Beijing, China, pp. 59–66.
Wang H. S., Yan X. F., Chen H. P., Chen C. & Guo M. J. 2015a Chlorophyll-a predicting model based on dynamic neural network. Applied Artificial Intelligence 29 (10), 962–978. doi:10.1080/08839514.2015.1097142.
Wang Y., Jiang H., Jin J. X., Zhang X. Y., Lu X. H. & Wang Y. Q. 2015b Temporal-spatial variations of chlorophyll-a in the adjacent sea area of the Yangtze River Estuary influenced by Yangtze River discharge. International Journal of Environmental Research and Public Health 12 (5), 5420–5438. doi:10.1109/GEOINFORMATICS.2010.5567848.
Wu X. Y. & Yang W. 2019 Water quality monitoring and evaluation using remote sensing techniques in China: a systematic review. Ecosystem Health and Sustainability 5 (1), 47–56. doi:10.1080/20964129.2019.1571443.
Yin Q., Gong C. L., Kuang D. B., Zhou N., Hu Y., Zhang F. L., Xu W. D. & Ma Y. Q. 2005 Method of satellite remote sensing of lake water quality and its applications. Journal of Infrared and Millimeter Waves 24 (3), 198–202. doi:10.3321/j.issn:1001-9014.2005.03.009.
Zhang Y. Z., Pulliainen J., Koponen S. & Hallikainen M. 2002 Application of an empirical neural network to surface water quality estimation in the Gulf of Finland using combined optical data and microwave data. Remote Sensing of Environment 81 (2–3), 327–336. doi:10.1016/S0034-4257(02)00009-3.
Zhang Y. C., Ma R. H., Duan H. T., Loiselle S., Zhang M. W. & Xu J. D. 2016 A novel MODIS algorithm to estimate chlorophyll a concentration in eutrophic turbid lakes. Ecological Indicators 69, 138–151. doi:10.1016/j.ecolind.2016.04.020.
Zhang Y. Z., Hallikainen M., Zhang H. S., Duan H. T., Li Y. & Liang X. S. 2018 Chlorophyll-a estimation in turbid waters using combined SAR data with hyperspectral reflectance data: a case study in Lake Taihu, China. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11 (4), 1325–1336. doi:10.1109/JSTARS.2017.2789247.
Zheng Z. B., Ren J. L., Li Y. M., Huang C. C., Liu G., Du C. G. & Lyu H. 2016 Remote sensing of diffuse attenuation coefficient patterns from Landsat 8 OLI imagery of turbid inland waters: a case study of Dongting Lake. Science of the Total Environment 573, 39–54. doi:10.1016/j.scitotenv.2016.08.019.
Zheng G. Z., Le X. D., Wang H. P. & Hua W. H. 2017 Inversion of water depth from WorldView-02 satellite imagery based on BP and RBF neural network. Earth Science 42 (12), 2345–2353. doi:10.3799/dqkx.2017.552.
Zhou Y., Zhou W. Q., Wang S. X. & Zhang P. 2004 Applications of remote sensing techniques to inland water quality monitoring. Advances in Water Science 15 (3), 312–317. doi:10.3321/j.issn:1001-6791.2004.03.009.
Zhu Y. F., Zhu L., Li J. G., Chen Y. J., Zhang Y. H., Hou H. Q., Ju X. & Zhang Y. Z. 2017 The study of inversion of chlorophyll a in Taihu based on GF-1 WFV image and BP neural network. Acta Scientiae Circumstantiae 37 (1), 130–137. doi:10.13671/j.hjkxxb.2016.0275.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).