Chlorophyll-a (Chl-a) concentration is one of the most important indicators of lake eutrophication and an essential component of lake water quality monitoring. Remote sensing imagery offers an efficient, economical and convenient way to monitor Chl-a concentration. Taking the Wuliangsuhai Lake as an example, neural network models were built with the relevant bands of Sentinel-2 images as the input and the Chl-a concentration as the output. In building the model, we mainly studied and tested how adding time features to the model input affects model accuracy. The experiments showed that the month feature and the day difference feature (the number of days between image acquisition and Chl-a measurement) both improved the prediction accuracy of Chl-a concentration, to different degrees. The best model was a neural network with the 12 bands of Sentinel-2 images combined with month features as inputs, one hidden layer of eight neurons, and Chl-a concentration as output. The accuracy of the model was then validated with test sets accounting for 20 and 30% of the samples, and good results were obtained.

  • Chlorophyll-a (Chl-a) concentration has been evaluated as an essential component of lake water quality monitoring.

  • The accuracy of Chl-a estimation can be significantly improved by adding month features and day difference features to the model input.

  • The Chl-a prediction model is of great theoretical and practical significance for intelligent monitoring of lake water quality.

Lakes are vital freshwater resources and also provide a variety of services, including aquaculture, species protection, transportation, agricultural irrigation and water resource storage (Wang et al. 2021a; Quan et al. 2022). Maintaining the health of lakes is critical for both productivity and survival (Longo et al. 2019). However, with rapid socio-economic growth in lake regions, large volumes of domestic and industrial effluent are discharged into lakes. The levels of nutrients such as nitrogen and phosphorus in the water bodies are well over the recommended levels (Li et al. 2021). This results in the eutrophication of lakes, which leads to the abnormal proliferation of aquatic organisms such as algae, the loss of water body functions, and ultimately the death of large numbers of aquatic animals and plants (Zhang et al. 2019; Quan et al. 2020; Dong et al. 2021).

Lake eutrophication is a worldwide biological and environmental issue. Chl-a concentration in water bodies reflects phytoplankton dispersion and is an indicator of algal biomass (Guo et al. 2021a, 2021b) as well as an important indicator of eutrophication status (Bramich et al. 2021; Chen et al. 2021). Physical sampling and chemical analysis are the most common approaches for monitoring Chl-a levels in lakes, although they do not offer a full picture of the water body. These traditional methods are time-consuming and labor-intensive, with drawbacks such as inconvenience, a long monitoring period and high costs, and they can only provide sparse, discrete data. Due to the wide coverage and timeliness of satellite monitoring (Yang et al. 2022) and the low cost of image acquisition (Zhao et al. 2020, 2021), satellite remote sensing has become an efficient tool for monitoring Chl-a concentration in lakes (Mohebzadeh & Lee 2021), which is essential for preventing and controlling the eutrophication of water bodies. By analyzing the correlation between satellite band data and Chl-a concentration, the bands or band combinations most closely related to Chl-a concentration can be found, and a model relating satellite images to Chl-a concentration can then be created to predict Chl-a concentration.

Traditional Chl-a concentration prediction methods rely primarily on multivariate statistical regression models between remote sensing images and measured Chl-a concentrations (Mishra & Mishra 2012; Zhang et al. 2015; Bi et al. 2018; Xu et al. 2020). Nowadays, machine learning is increasingly used to build models. He et al. (2021) used a double-hidden-layer ANN (artificial neural network) model to predict the Chl-a concentration, which to some extent shows that deep learning has advantages over shallow learning in extracting the main features of information. Pan et al. (2021) quantitatively inverted the Chl-a concentrations in Taihu Lake using a band ratio model and a three-band model. Yue et al. (2020) performed a quadratic polynomial regression of summer Chl-a concentration with B5/B4 as the independent variable and obtained the best results, with a coefficient of determination (R2) of 0.816. Martinez et al. (2020) used SVR (support vector regression) to reconstruct and study the annual variation of Chl-a in the Pacific and Indian Oceans. Shin et al. (2020) combined the RNN (recurrent neural network) model with a rolling window learning method to obtain the best prediction of Chl-a concentration. Zhang et al. (2020) used SVM (support vector machine) to obtain the best combination of bands for inverting Chl-a, with R2 = 0.774. Su et al. (2021) used the machine learning model LightGBM to obtain the best results for inverting Chl-a, with R2 = 0.785. Park et al. (2020) used the RF (random forest) model to reconstruct the missing values of Chl-a concentration data. Saberioon et al. (2020) used Sentinel-2A and machine learning algorithms to monitor Chl-a (R2 = 0.85, RMSE (root mean square error) = 48.572) and TSS (R2 = 0.80, RMSE = 19.55) in small inland waters. However, the reflection spectra of inland lake waters are usually complex (Mishra et al. 2009), and atmospheric variations can interfere with the output values of the models (Feng 2021). This can lead to poor inversion accuracy when statistical and linear models are applied to complex nonlinear problems (Zhang et al. 2009). The relationship between water quality parameters and the optical or physical properties of the water body is typically nonlinear, and neural networks are well suited to modeling this relationship (Hou et al. 2021).

In this study, based on the requirements of the project, Chl-a was measured monthly, and remote sensing images could also be obtained monthly. The 12 months of each year belong to four seasons. The impact of seasons on the lake environment is significant, so the monthly features of the sampled data should be very obvious. Additionally, the number of days between the image and the measurement time (day difference) could also be regarded as a feature. Therefore, the month features (Ms) and day difference features (Ds) would be used together with the remote sensing data as the input of the model to predict the Chl-a concentration. The purpose of this study was to verify the performance of time features in improving the prediction accuracy of the model, and establish a neural network model to predict the concentration of Chl-a in the Wuliangsuhai Lake.

The objectives of this study were (1) to compare and analyze the influence of different time features on the accuracy of neural network models, (2) to develop a neural network model with remote sensing images combined with time features as input and Chl-a concentration as output, (3) to validate the accuracy of the neural network model and (4) to use the established model to construct a distribution map of Chl-a concentration in the Wuliangsuhai Lake for a certain period. Through this work, the eutrophication status of the Wuliangsuhai Lake can be quickly obtained, which is of great significance for the environmental monitoring and management of water resources.

Study area

The Wuliangsuhai Lake, located in northern China and western Inner Mongolia (Figure 1), is a multifunctional lake in an arid and semi-arid region and the largest freshwater lake in the Yellow River Basin (Mao et al. 2021). It mainly receives Yellow River water, the surplus water after agricultural irrigation in the Hetao Irrigation Area, and rural production and domestic sewage (Wang et al. 2021b). The Wuliangsuhai Lake (40°36′N–41°03′N, 108°43′E–108°57′E) is one of the typical and important freshwater lakes in northern China (Shi et al. 2020; Tang et al. 2020). The lake is surrounded by agricultural land, desert and saline areas, with dense reeds growing along the shore and in parts of the interior. The area of the Wuliangsuhai Lake is about 330 km2, the average elevation of the lake is about 1,018.5 m, the capacity of the lake is 250–300 million m3, and the depth of the lake is 0.5–2.5 m in most areas (Lv et al. 2018). The Wuliangsuhai basin has low rainfall and high evaporation, and belongs to the temperate monsoon climate zone (Wang et al. 2022, 2023). The average annual precipitation is 153.1 mm, and the annual evaporation is 2,000 mm. The lake freezes at the beginning of November every year and melts between the end of March and early April of the next year. The ice thickness is about 0.3–0.6 m, and the freezing period is about 5 months. It is a typical cold and arid wetland with important ecological functions in the Yellow River Basin, and the protection of its water ecological environment is of great significance to the ecological protection of the Yellow River Basin (Du et al. 2019; Yue et al. 2021). As a famous scenic spot, it has attracted more and more tourists, thus promoting tourism and economic growth. Unfortunately, due to the rapid development of urbanization and industry, a large amount of industrial wastewater and domestic sewage is directly or indirectly discharged into the Wuliangsuhai Lake, resulting in the decline of water quality. In addition, the Wuliangsuhai Lake is relatively closed, its self-purification and repair ability is weak, and the lake water is renewed slowly. As a result, nutrients continue to accumulate, forming a eutrophic environment (Sun et al. 2019; Fang et al. 2021).
Figure 1

Location map of the study area: (a) China, (b) Inner Mongolia, an Autonomous Region of China and (c) Wuliangsuhai Lake.


Datasets

Measured data

The measured Chl-a data were from 19 sampling points in the Wuliangsuhai Lake (Figure 1(c)). The sample points were evenly distributed and located in the center of the lake. A handheld GPS device was used to record the coordinates. We used a collector to collect water within 0.5 m depth. The water samples were then taken back to the laboratory for Chl-a concentration extraction by Ultraviolet spectrophotometer. The measured Chl-a data in the study area were collected once a month from September 2015 to July 2018. During this period, 108 satellite images with measured data close to its acquisition time (within 2 days) were also selected as sample data to establish the model.

Satellite images

The remote sensing images are Sentinel-2 satellite images released by the European Space Agency (ESA) (Tian et al. 2021a, 2021b). The images were selected from 2015 to 2018, on dates when the weather was clear with low cloudiness and the lake was not frozen. Sentinel-2 images come from two satellites, satellite A and satellite B (Yin et al. 2022). Satellite A was launched on June 23, 2015, and satellite B was launched on March 7, 2017. Sentinel-2 carries a multispectral imager (MSI) at an altitude of 786 km, which covers 13 spectral bands (Table 1) with a swath width of 290 km and a ground resolution of up to 10 m. The revisit period of a single satellite is 10 days; with the two satellites complementing each other, the combined revisit period is 5 days. Its bands span the visible, near-infrared and shortwave infrared, and compared with other satellites it has great advantages in revisit cycle and resolution (Sun et al. 2021; Mao et al. 2022). The Sentinel-2 images were obtained from the USGS official website (https://earthexplorer.usgs.gov) (Tian et al. 2019). In this study, sen2cor-02.08.00 was used to preprocess the Sentinel-2 images, including radiometric calibration and atmospheric correction.

Table 1

Sentinel-2 image bands

| Band | Name | Central wavelength (μm) | Resolution (m) |
|---|---|---|---|
| B01 | Coastal aerosol | 0.433 | 60 |
| B02 | Blue | 0.490 | 10 |
| B03 | Green | 0.560 | 10 |
| B04 | Red | 0.665 | 10 |
| B05 | Vegetation Red Edge | 0.705 | 20 |
| B06 | Vegetation Red Edge | 0.740 | 20 |
| B07 | Vegetation Red Edge | 0.783 | 20 |
| B08 | NIR | 0.842 | 10 |
| B8A | Vegetation Red Edge | 0.865 | 20 |
| B09 | Water vapor | 0.945 | 60 |
| B10 | SWIR-Cirrus | 1.375 | 60 |
| B11 | SWIR1 | 1.610 | 20 |
| B12 | SWIR2 | 2.190 | 20 |

BP neural network

When using the BP neural network to create a model, it is not necessary to specify the mathematical relationship between the model input and output (Ahmadi et al. 2020, 2021). By repeatedly adjusting (training) the connection weights of the neurons in the network, the expected output value corresponding to the input can be obtained. The BP neural network model uses the gradient descent method to adjust the connection weights, so as to minimize the mean square error between the model output value and the expected output value. The model is mainly composed of an input layer, hidden layers and an output layer. Besides being simple and relatively easy to implement, its algorithm has strong self-learning, self-organizing and adaptive capabilities (Ahmadi et al. 2021).

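The BP algorithm back-propagates the error E (Equation (1)) and updates the weight value w (Equation (2)) and the offset term value b (Equation (3)) along the negative gradient direction in order to minimize the error E. The updates are scaled by the learning rates $\eta_w$ and $\eta_b$, which can be set to the same value:
$$E = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 \tag{1}$$
$$w \leftarrow w - \eta_w \frac{\partial E}{\partial w} \tag{2}$$
$$b \leftarrow b - \eta_b \frac{\partial E}{\partial b} \tag{3}$$
where N is the number of training samples, $y_i$ is the expected output value and $\hat{y}_i$ is the model output value.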
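The update rule can be illustrated on the smallest possible case. The following is a minimal sketch, not the network used in this study: it fits a single linear neuron by gradient descent on the mean squared error, and the function name, learning rates and toy data are all assumptions made for illustration.

```python
def train_linear_neuron(xs, ys, lr_w=0.05, lr_b=0.05, epochs=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error E."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals of the current predictions
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of E = (1/n) * sum(error^2) with respect to w and b
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient, scaled by the learning rates
        w -= lr_w * grad_w
        b -= lr_b * grad_b
    return w, b


# Toy data (made up for illustration) generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear_neuron(xs, ys)
```

In a multi-layer network the same two updates are applied to every weight and offset, with the partial derivatives obtained by propagating the error backwards through the layers.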

Theoretically, it can simulate any complex nonlinear relationship. Therefore, the BP neural network model was chosen to predict the Chl-a concentration in the study area.

Development of the model

Determination of input bands and model middle layers

Sentinel-2 data originally had 13 bands. After preprocessing, band B10 disappeared, leaving only 12 bands. At present, there is no established scientific method for directly selecting the band combinations that correlate most strongly with chlorophyll concentration. Therefore, this study selected all bands (aBs), bands with positive correlation (pBs) and bands with negative correlation (nBs) with Chl-a concentration as the inputs of the neural network, and the measured Chl-a concentration as the output, for the construction and training of the BP neural network.

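The selection of positive and negative correlation bands is based on the correlation coefficient $r_x$ (Equation (4)):
$$r_x = \frac{\sum_{i=1}^{n}\left(B_{x,i}-\bar{B}_x\right)\left(C_i-\bar{C}\right)}{\sqrt{\sum_{i=1}^{n}\left(B_{x,i}-\bar{B}_x\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(C_i-\bar{C}\right)^2}} \tag{4}$$
where $B_{x,i}$ is the xth band in the ith sample, $\bar{B}_x$ is the mean value of the xth band in all samples, $C_i$ is the measured Chl-a concentration value in the ith sample, $\bar{C}$ is the mean value of Chl-a concentration in all samples and n is the number of samples.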

The correlation coefficient rx between each band and Chl-a concentration is shown in Table 2. According to Table 2, the positive correlation bands are B05, B06, B07, B08, B8A and B09, and the negative correlation bands are B01, B02, B03, B04, B11 and B12.

Table 2

Correlation coefficient rx between each band and Chl-a concentration

| Band | rx | Band | rx | Band | rx |
|---|---|---|---|---|---|
| B01 | −0.295 | B05 | 0.161 | B8A | 0.189 |
| B02 | −0.226 | B06 | 0.234 | B09 | 0.145 |
| B03 | −0.098 | B07 | 0.242 | B11 | −0.103 |
| B04 | −0.18 | B08 | 0.224 | B12 | −0.149 |
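The band grouping follows directly from the sign of the correlation coefficient. The sketch below implements the Pearson coefficient in plain Python and splits the bands using the values reported in Table 2; the function names `pearson_r` and `split_by_sign` are illustrative and not taken from the study's code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Correlation coefficients as reported in Table 2
R_X = {"B01": -0.295, "B02": -0.226, "B03": -0.098, "B04": -0.18,
       "B05": 0.161, "B06": 0.234, "B07": 0.242, "B08": 0.224,
       "B8A": 0.189, "B09": 0.145, "B11": -0.103, "B12": -0.149}


def split_by_sign(r_by_band):
    """Return (positively correlated bands, negatively correlated bands)."""
    positive = [band for band, r in r_by_band.items() if r > 0]
    negative = [band for band, r in r_by_band.items() if r < 0]
    return positive, negative


pBs, nBs = split_by_sign(R_X)
```

Note that all coefficients in Table 2 are small in magnitude (below 0.3), so the grouping depends only on sign and not on a strength threshold.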

The neural network structure was implemented using the PyTorch framework. During neural network training, the learning rate was set to $10^{-3}$, the activation function was set to ‘ReLU’, and the number of epochs was set to $6 \times 10^{3}$. The R2 (Equation (5)) and RMSE (Equation (6)) were used to evaluate the accuracy of the model:
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \tag{5}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \tag{6}$$
where N is the total number of test set samples, $y_i$ denotes the measured value, $\bar{y}$ is the measured mean value and $\hat{y}_i$ refers to the value predicted by the model.

For a model that fits at least as well as the mean of the measured values, R2 lies in [0,1], and the closer it is to 1, the higher the prediction accuracy of the model. The root mean square error (RMSE) measures the typical deviation of the predicted values from the measured values. For nonlinear fitting, the smaller the RMSE, the better.
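Both metrics are short enough to compute by hand. The following sketch uses plain Python; the measured and predicted values are made up purely to show the arithmetic and are not data from this study.

```python
import math


def r2_score(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_measured) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / n)


# Illustrative values only (not study data)
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2 = r2_score(measured, predicted)    # 1 - 1/20 = 0.95
error = rmse(measured, predicted)     # sqrt(1/4) = 0.5
```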

Since Chl-a data were measured monthly, both the training set and the test set should include data from each month. For this reason, the test set was formed by randomly selecting samples from the data of each month and combining them. Here, 80% of the sample data were used as training data and 20% as test data. aBs, pBs and nBs were used as inputs, respectively, and the concentration of Chl-a was used as output. Then, network structures with one hidden layer and different numbers of neurons were trained and tested. The complexity of a neural network increases with the number of hidden layers and neurons, so the principle in building the network structure was to use the simplest structure that achieves the highest accuracy. According to the test, when the number of neurons exceeded eight, the accuracy improved only slightly, so the network structure with one hidden layer and eight neurons achieved the best test results with the fewest neurons (Figure 2).
Figure 2

Neurons quantity test.
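The month-wise selection of the test set described above can be sketched as follows. This is one possible reading of the procedure, written in plain Python; the tuple layout `(month, features, chl_a)`, the function name and the synthetic samples are assumptions for illustration only.

```python
import random
from collections import defaultdict


def split_by_month(samples, test_fraction=0.2, seed=0):
    """Split samples so that every month contributes to both sets.

    samples: list of (month, features, chl_a) tuples.
    """
    rng = random.Random(seed)
    by_month = defaultdict(list)
    for sample in samples:
        by_month[sample[0]].append(sample)

    train, test = [], []
    for month in sorted(by_month):
        group = by_month[month][:]
        rng.shuffle(group)
        # At least one test sample per month, but never the whole group;
        # a month with a single sample goes entirely to the training set.
        n_test = max(1, round(len(group) * test_fraction))
        n_test = min(n_test, len(group) - 1) if len(group) > 1 else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Synthetic samples: 10 per month for May to September (illustrative only)
samples = [(m, [0.1 * k], float(k)) for m in (5, 6, 7, 8, 9) for k in range(10)]
train, test = split_by_month(samples)
```

Passing `test_fraction=0.3` gives the 30% split used later for the stability check.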


Figure 2 shows that once the number of neurons reaches eight, adding more neurons does not improve the accuracy significantly, and R2 remains below 0.8. In fact, both the satellite images and the measured data are sampled periodically, so the data themselves are periodic. We therefore added month features to the input bands, so that the bands and month features were used together as the input, and explored whether the month features helped to improve the accuracy of the model. The month information is represented by a month switch vector composed of 0s and 1s, with one element for each month that occurs in the sample data, in calendar order. The element corresponding to the month of the current sample is set to 1 and all other elements are set to 0. For example, if the samples come from six different months, the vector has six elements. If the sample data come from May, June, July, August and September, and the current sample is from August, the vector representing the month feature is (0, 0, 0, 1, 0).
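The month switch vector is a one-hot encoding over the months present in the data. A minimal sketch in plain Python, with an illustrative function name, reproduces the August example:

```python
def month_one_hot(month, months_in_dataset):
    """One-hot 'month switch' vector over the months present in the data."""
    ordered = sorted(months_in_dataset)
    if month not in ordered:
        raise ValueError(f"month {month} does not occur in the dataset")
    return [1 if m == month else 0 for m in ordered]


# Samples come from May to September; the current sample is from August
months = [5, 6, 7, 8, 9]
august = month_one_hot(8, months)   # -> [0, 0, 0, 1, 0]
```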

As mentioned earlier, there may be a difference of a few days between the acquisition of a satellite image and the measurement of chlorophyll, so a day difference vector is also defined. Because the day difference is at most 2 days and the image can be earlier or later than the measurement, the difference D can only take the values −2, −1, 0, 1 and 2, and the vector has five positions corresponding to these values in turn. The definition rule of the day difference vector is: if the difference between the image and the measured data in the current sample is D days, the element at the position corresponding to D is set to D, and the other positions are set to 0. For example, if the current image is 1 day earlier than the actual measurement, the vector representing the day difference feature is (0, −1, 0, 0, 0). Next, the two time features were combined with aBs, pBs and nBs, respectively, and the combined features were used as the input of the neural network to test the impact of time features on the prediction accuracy.
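The day difference vector and the assembly of the combined model input can be sketched as below. The function names are illustrative, and the simple concatenation order (bands, then month vector, then day vector) is an assumption, since the text does not state how the features were ordered.

```python
DAY_OFFSETS = (-2, -1, 0, 1, 2)   # negative: image earlier than measurement


def day_difference_vector(day_diff):
    """Place the value D at the position that corresponds to D, zeros elsewhere."""
    if day_diff not in DAY_OFFSETS:
        raise ValueError("day difference must be between -2 and 2")
    return [day_diff if offset == day_diff else 0 for offset in DAY_OFFSETS]


def build_input(bands, month_vector=None, day_vector=None):
    """Concatenate band values with the optional time features (assumed order)."""
    return list(bands) + list(month_vector or []) + list(day_vector or [])


one_day_early = day_difference_vector(-1)     # -> [0, -1, 0, 0, 0]
x = build_input([0.1, 0.2], [0, 0, 0, 1, 0], one_day_early)
```

With this rule a day difference of 0 yields an all-zero vector, so same-day samples carry no day difference signal.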

Comparison method of various inputs

Firstly, the prediction results of the model with all bands, positive correlation bands and negative correlation bands as input were compared. Then, the month features and the day differences were combined in each band combination separately to test the effects of these two features on the model accuracy. Finally, month features and day differences were added to each band combination at the same time to test the effect of both features on the model accuracy. The best form of model input could be obtained by comparing the effects of various inputs on model accuracy. At this point the optimal model was also achieved. After obtaining the optimal prediction model, the model was used to predict the Chl-a concentrations of the Wuliangsuhai Lake in a certain period. The prediction results were compared with the existing research conclusions to verify the practicability of the model.
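The comparison covers three band sets and four time-feature settings, which gives 12 input configurations. The sketch below enumerates them and picks the one with the highest R2; the helper name `select_best` is illustrative, and the two R2 values in the usage example are the ones quoted in this paper for aBs (0.731) and aBs+Ms (0.929).

```python
from itertools import product

BAND_SETS = ("aBs", "pBs", "nBs")
TIME_FEATURES = ((), ("Ms",), ("Ds",), ("Ms", "Ds"))

# All 3 x 4 = 12 input configurations, e.g. "aBs", "aBs+Ms", "aBs+Ms+Ds"
configs = ["+".join((bands,) + extras)
           for bands, extras in product(BAND_SETS, TIME_FEATURES)]


def select_best(r2_by_config):
    """Return the configuration name with the highest test-set R2."""
    return max(r2_by_config, key=r2_by_config.get)


best = select_best({"aBs": 0.731, "aBs+Ms": 0.929})
```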

Effect of time features on R2 and RMSE when the test set was 20%

The results in Figure 3 show that, for prediction accuracy, $R^2_{aBs} > R^2_{nBs} > R^2_{pBs}$ and $\mathrm{RMSE}_{aBs} < \mathrm{RMSE}_{nBs} < \mathrm{RMSE}_{pBs}$. However, in this case, the R2 of aBs, which had the highest accuracy, was only 0.731, which could not meet the accuracy needs of practical application.
Figure 3

Comparison of R2 and RMSE on the test set when aBs, pBs and nBs were combined with Ms and Ds as model inputs.


Fortunately, when aBs, pBs and nBs were combined with Ms or Ds as input, the prediction accuracy was greatly improved, and combining with Ms gave the larger improvement. This is because the month, as a periodic time indicator, can have a great impact on the prediction accuracy. In this case, the model accuracy ranking was aBs + Ms > nBs + Ms > pBs + Ms, so adding the month feature did not change the previous accuracy ranking. Furthermore, after adding the month feature, the improvement rate differed between band combinations, and the improvement rate of pBs was the largest: the accuracy of pBs + Ms was almost twice that of pBs. After aBs, pBs and nBs were each combined with the month feature as input, R2 reached above 0.8 in all cases. The R2 of aBs + Ms as model input even exceeded 0.9, which is very good for practical application.

For aBs, pBs and nBs, if Ms and Ds were combined as model inputs at the same time, the model accuracy obtained was higher than that obtained by combining Ds only. However, it could not exceed the model accuracy obtained only by combining Ms, which shows that Ds has not played a very positive role in improving the model accuracy. Consequently, the above results showed that Ms, as an inherent time feature of remote sensing data, could fundamentally and directly improve the prediction accuracy of the model.

The scatter plots in Figure 4 compare the measured data with the predicted results for the various inputs. They show that the Chl-a concentrations were mainly distributed within 20 μg/L. With the Ds feature, many predicted values were much larger than the measured values, which again reflects that the Ds feature did not improve the prediction accuracy as much as Ms. For each remote sensing band combination xBs (aBs, pBs or nBs), the impact of time features on prediction accuracy followed the same rule, that is, $R^2_{xBs+Ms} > R^2_{xBs+Ms+Ds} > R^2_{xBs+Ds} > R^2_{xBs}$ and $\mathrm{RMSE}_{xBs+Ms} < \mathrm{RMSE}_{xBs+Ms+Ds} < \mathrm{RMSE}_{xBs+Ds} < \mathrm{RMSE}_{xBs}$. Among all inputs, aBs+Ms obtained the highest prediction accuracy.
Figure 4

Scatter plots of the measured data vs. the model predictions using different inputs.


Influence of time features on loss reduction process in model training

The MSE (mean squared error) function in PyTorch was used to calculate the loss (Equation (7)). The loss over the first 3,000 epochs of the neural network learning process was used to draw the curve:
$$\mathrm{loss} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2 \tag{7}$$
where N is the number of samples, $y_i$ is the measured value and $\hat{y}_i$ is the value predicted by the model.
The loss descent curve of each input during model training is shown in Figure 5. For xBs (aBs, pBs or nBs), when xBs+Ms and xBs+Ms+Ds were used as inputs, the loss fell fastest and the final losses were the lowest and close to each other, with xBs+Ms slightly better. For aBs, the losses of aBs+Ms and aBs+Ms+Ds fluctuated as they declined, and finally they were close to those of aBs and aBs+Ds. For pBs, the final loss values of pBs+Ms and pBs+Ms+Ds were very close to each other, but far from the loss values of pBs and pBs+Ds. For nBs, the loss values obtained by nBs+Ms, nBs+Ms+Ds and nBs+Ds were close, but there was a big gap between them and nBs. Among aBs+Ms, pBs+Ms and nBs+Ms, aBs+Ms finally achieved the minimum loss, which again confirmed, from the perspective of loss, that aBs+Ms gave the best prediction as the model input.
Figure 5

The loss reduction process when the test set was 20%.


Model accuracy and loss decline process when the test set was 30%

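To verify the stability of the model, the test set was set to 30% and the input forms were tested again. The accuracy comparison results were similar to those for 20%, namely $R^2_{xBs+Ms} > R^2_{xBs+Ms+Ds} > R^2_{xBs+Ds} > R^2_{xBs}$ and $\mathrm{RMSE}_{xBs+Ms} < \mathrm{RMSE}_{xBs+Ms+Ds} < \mathrm{RMSE}_{xBs+Ds} < \mathrm{RMSE}_{xBs}$ (Figure 6).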
Figure 6

Comparison of R2 and RMSE when the test set was 30%.

When the test set was 30%, the loss descent curve corresponding to each input during model training is shown in Figure 7. Unlike the 20% case, for xBs (aBs, pBs or nBs), the initial descent of the loss with xBs+Ds as input was relatively fast, but the final loss value was still not as low as with xBs+Ms and xBs+Ms+Ds. As before, when xBs+Ms and xBs+Ms+Ds were used as inputs, the final losses were the lowest and close to each other. Among aBs+Ms, pBs+Ms and nBs+Ms, aBs+Ms was optimal as the model input in terms of both the descent speed and the final value of the loss.
Figure 7

The loss reduction process when the test set was 30%.

The above experiments show that aBs + Ms was the best model input whether the test set accounted for 20 or 30%. Therefore, the Chl-a prediction model in this study was finally determined to be a neural network with aBs+Ms as the model input, a single hidden layer composed of eight neurons, and chlorophyll concentration as the output (Figure 8). When building the model, the 12 bands of data corresponding to the measurement points in the Sentinel-2 remote sensing images were used as input along with the month vectors to calibrate the determined neural network model (Huang et al. 2021). Calibration was stopped when the optimal precision was obtained on the test set, the weights of each node of the model were saved, and the saved weights were then used to predict the chlorophyll concentration.
Figure 8

An optimal neural network prediction model structure.
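The structure in Figure 8 can be written out as a forward pass in plain Python. This is only a sketch of the architecture (12 bands plus a month vector, one hidden layer of eight ReLU neurons, one output), not the trained PyTorch model: the weights are random and untrained, the month vector length of 5 is an assumption, and all names are illustrative.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def linear(weights, biases, inputs):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def predict(params, inputs):
    """Input -> 8 hidden ReLU neurons -> single Chl-a output."""
    (w1, b1), (w2, b2) = params
    hidden = relu(linear(w1, b1, inputs))
    return linear(w2, b2, hidden)[0]


rng = random.Random(0)
N_BANDS, N_MONTHS, N_HIDDEN = 12, 5, 8   # N_MONTHS = 5 is an assumption
params = (init_layer(N_HIDDEN, N_BANDS + N_MONTHS, rng),
          init_layer(1, N_HIDDEN, rng))

# One input sample: 12 band values followed by the month switch vector
x = [0.05] * N_BANDS + [0, 0, 0, 1, 0]
chl_a = predict(params, x)   # meaningless until the weights are trained
```

Saving the calibrated model then amounts to storing `params`, and prediction for a new pixel is a single call to `predict` with its band values and month vector.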


Application of neural network model in prediction

In this work, Modified Normalized Difference Water Index (MNDWI) (Equation (8)) was selected to extract the clear water area of the Wuliangsuhai Lake (Xu 2005). Using the satellite images of the Wuliangsuhai Lake in September 2015, June 2016, June 2017 and August 2017, the optimal model (Figure 8) was applied to predict the Chl-a concentration at the corresponding time (Figure 9). Using latitudes 40°57′30″ and 40°53′30″ as the dividing line, the Wuliangsuhai Lake was divided into three parts: North Area, Central Area and South Area. The results show that the distribution law of Chl-a concentrations in the Wuliangsuhai Lake in September 2015, June 2017 and August 2017 was gradually decreasing from the north to the south of the lake area. That was, for Chl-a concentrations, North Area > Central Area > South Area. In 2016, the concentration of Chl-a was at a low level, and the change trend was more stable than that in other years. The above prediction results were highly consistent with the research conclusions of Jiang et al. (2019).
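$$\mathrm{MNDWI} = \frac{\mathrm{Green} - \mathrm{SWIR1}}{\mathrm{Green} + \mathrm{SWIR1}} \tag{8}$$
where SWIR1 is the first shortwave infrared band, which is the 11th band in the Sentinel-2 images, and Green is the third band. Pixels whose MNDWI value lies within the range (0,1) are treated as water body.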
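The water mask can be sketched per pixel as follows. The function names and the reflectance values are illustrative assumptions; the zero-denominator guard is also an addition for robustness and is not described in the text.

```python
def mndwi(green, swir1):
    """Modified Normalized Difference Water Index for one pixel."""
    denom = green + swir1
    if denom == 0:
        return 0.0   # guard against division by zero (assumed handling)
    return (green - swir1) / denom


def is_water(green, swir1):
    """A pixel is treated as water when its MNDWI lies in (0, 1)."""
    return 0.0 < mndwi(green, swir1) < 1.0


# Illustrative reflectances: water reflects more in Green (B03) than in SWIR1 (B11)
water_index = mndwi(0.08, 0.02)    # approximately 0.6
land_flag = is_water(0.05, 0.20)   # negative index, so not water
```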
Figure 9

Wuliangsuhai Lake Chl-a prediction.


The model predicted the Chl-a concentration at the 29,695 coordinate points that formed the clear water area in the August 2017 image. The mean chlorophyll concentration of the 8,191 coordinate points in the North Area was 25.06 μg/L, that of the 7,660 coordinate points in the Central Area was 21.59 μg/L, and that of the 13,844 coordinate points in the South Area was 11.97 μg/L. The north of the lake receives external water, mainly from the Yellow River, domestic sewage and residual water after farmland irrigation. The water after farmland irrigation contains large amounts of chemical fertilizers, pesticides and other pollutants, so the concentration of nitrogen and phosphorus in the water body in this area is high and the degree of eutrophication is serious. The lake as a whole includes a reed area, a plant area and a water area. The water quality gradually improves towards the southern part of the lake because only a small amount of external water is discharged there. Therefore, most of the time, the chlorophyll concentration decreases from north to south.
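The zone statistics can be reproduced with a short aggregation over predicted points. The sketch below uses the two dividing latitudes given above; the function names and the four sample points are made up for illustration and are not the study's predictions.

```python
def dms_to_degrees(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0


NORTH_LIMIT = dms_to_degrees(40, 57, 30)   # boundary between North and Central
SOUTH_LIMIT = dms_to_degrees(40, 53, 30)   # boundary between Central and South


def zone_of(latitude):
    if latitude > NORTH_LIMIT:
        return "North"
    if latitude > SOUTH_LIMIT:
        return "Central"
    return "South"


def zone_means(points):
    """points: iterable of (latitude, predicted Chl-a in ug/L) pairs."""
    totals = {}
    for latitude, chl_a in points:
        zone = zone_of(latitude)
        running_sum, count = totals.get(zone, (0.0, 0))
        totals[zone] = (running_sum + chl_a, count + 1)
    return {zone: s / c for zone, (s, c) in totals.items()}


# Made-up points for illustration only
points = [(41.00, 26.0), (40.98, 24.0), (40.92, 22.0), (40.80, 12.0)]
means = zone_means(points)
```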

In this study, all bands of Sentinel-2 satellite images, the bands positively correlated and the bands negatively correlated with the measured Chl-a concentration were used as inputs, and neural networks were constructed to predict the Chl-a concentration in the Wuliangsuhai Lake. After many experiments, a neural network model with the 12 bands of Sentinel-2 images combined with the month feature as the input, one hidden layer of eight neurons and Chl-a concentration as the output was determined. The R2 reached 0.929 when the test set accounted for 20%. The month feature improved model accuracy more than the day difference feature did. Applying the model to satellite images of the Wuliangsuhai Lake from specific time periods demonstrated that the model developed in this work is suitable for predicting Chl-a concentration, with high accuracy and practicality. The model has considerable theoretical and practical relevance for guiding lake ecosystem management and pollution prevention and control, and it has important reference value for intelligent monitoring of the lake water environment. At present, only three band combinations have been used as input in this research. Whether other band combinations are more helpful for improving the prediction accuracy of the model remains to be explored with suitable selection methods. Moreover, due to weather conditions, sampling time, satellite transit frequency and other reasons, the data available for model training are relatively limited, and the day difference between images and samplings in particular may introduce inaccuracy into the model. This is bound to have some negative impact on the generalizability of the model. In future work, we plan to try statistical methods for band selection and hope to obtain higher model accuracy on larger data sets.

This research was supported by the Research Program of Science and Technology at the Universities of Inner Mongolia Autonomous Region (NJZZ23044), the National Natural Science Foundation of China (61962047), the National Key Research and Development Program of China (2019YFC0409205) and the Natural Science Foundation of Inner Mongolia Autonomous Region of China (2019MS06015, 2020MS06011, 2021MS06009).

All data are available from the corresponding author.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Ahmadi M., Kheyroddin A. & Dalvand A. 2020 New empirical approach for determining nominal shear capacity of steel fiber reinforced concrete beams. Construction and Building Materials 234, 117293.
Ahmadi M., Kheyroddin A. & Kioumarsi M. 2021 Prediction models for bond strength of steel reinforcement with consideration of corrosion. Materials Today: Proceedings 45, 5829–5834.
Bi S., Li Y. M. & Lv H. 2018 Estimation of chlorophyll-a concentration in Lake Erhai based on OLCI data. Lake Sciences 30 (3), 701–712.
Bramich J., Bolch C. J. S. & Fischer A. 2021 Improved red-edge chlorophyll-a detection for Sentinel 2. Ecological Indicators 120, 106876.
Chen X., Ou M. W. & Shi J. 2021 Remote sensing retrieval and evaluation of chlorophyll-a concentration in East Dongting Lake, China. IOP Conference Series. Earth and Environmental Science 668 (1), 12035.
Du D. D., Li C. Y. & Shi X. H. 2019 Seasonal changes of nutritional status of Lake Wuliangsuhai. Journal of Arid Land Resources and Environment 33 (12), 186–192.
Fang X., Wang Q., Wang J., Xiang Y., Wu Y. & Zhang Y. 2021 Employing extreme value theory to establish nutrient criteria in bay waters: a case study of Xiangshan Bay. Journal of Hydrology 603, 127146. doi: 10.1016/j.jhydrol.2021.127146.
He E. Y., Yang J. & Li S. L. 2021 Intelligent prediction method for Chl-a based on the artificial neural network. Marine Forecasts 38 (1), 44–54.
Huang D., Wang J. & Khayatnezhad M. 2021 Estimation of actual evapotranspiration using soil moisture balance and remote sensing. Iranian Journal of Science and Technology, Transactions of Civil Engineering 45, 2779–2786. https://doi.org/10.1007/s40996-020-00575-7.
Jiang X. Y., Li C. Y. & Shi X. H. 2019 Spatial and temporal distribution of chlorophyll-a concentration and its relationships with environmental factors in Lake Ulansuhai. Ecology and Environmental Sciences 28 (5), 964–973.
Li W., Shi Y., Zhu D., Wang W., Liu H., Li J. & Fu S. 2021 Fine root biomass and morphology in a temperate forest are influenced more by the nitrogen treatment approach than the rate. Ecological Indicators 130, 108031. doi: 10.1016/j.ecolind.2021.108031.
Longo M., Knox R. G., Medvigy D. M., Levine N. M., Dietze M. C., Kim Y. & Moorcroft P. R. 2019 The biophysics, ecology, and biogeochemistry of functionally diverse, vertically and horizontally heterogeneous ecosystems: the ecosystem demography model, version 2.2 – part 1: model description. Geoscientific Model Development 12 (10), 4309–4346. doi: 10.5194/gmd-12-4309-2019a.
Lv J., Li C. Y. & Zhao S. N. 2018 Evaluation of nutritional status in Wuliangsuhai in frozen and non-frozen seasons. Journal of Arid Land Resources and Environment 32 (01), 109–114.
Mao Y., Sun R., Wang J., Cheng Q., Kiong L. C. & Ochieng W. Y. 2022 New time-differenced carrier phase approach to GNSS/INS integration. GPS Solutions 26 (4), 122. doi: 10.1007/s10291-022-01314-3.
Martinez E., Gorgues T. & Lengaigne M. 2020 Reconstructing global chlorophyll-a variations using a non-linear statistical approach. Frontiers in Marine Science 7, 464.
Pan X., Yang Z. & Yang Y. B. 2021 Mass concentration inversion analysis of chlorophyll a in Taihu Lake based on GF-6 satellite data. Journal of Hohai University 49 (1), 50–56.
Shi R., Zhao J. X. & Shi W. 2020 Comprehensive assessment of water quality and pollution source apportionment in Wuliangsuhai Lake, Inner Mongolia, China. International Journal of Environmental Research and Public Health 17 (14), 5054.
Sun X., Li X. & Li J. R. 2019 Distribution characteristics of different forms of nitrogen, phosphorus and phytoplankton in the whole season of Wuliangsuhai Lake. Ecological Science 38 (1), 64–70.
Sun R., Wang J., Cheng Q., Mao Y. & Ochieng W. Y. 2021 A new IMU-aided multiple GNSS fault detection and exclusion algorithm for integrated navigation in urban environments. GPS Solutions 25, 147. doi: 10.1007/s10291-021-01181-4.
Tian H., Huang N., Niu Z., Qin Y., Pei J. & Wang J. 2019 Mapping winter crops in China with multi-source satellite imagery and phenology-based algorithm. Remote Sensing 11 (7), 820.
Tian H., Qin Y., Niu Z., Wang L. & Ge S. 2021a Summer maize mapping by compositing time series Sentinel-1A imagery based on crop growth cycles. Journal of the Indian Society of Remote Sensing 49 (11), 2863–2874. doi: 10.1007/s12524-021-01428-0.
Tian H., Wang Y., Chen T., Zhang L. & Qin Y. 2021b Early-season mapping of winter crops using Sentinel-2 optical imagery. Remote Sensing 13 (19), 3822. doi: 10.3390/rs13193822.
Wang X. H., Yang F. & Ma W. J. 2021a Seasonal variation of nitrate sources in Wuliangsuhai Lake. Research of Environmental Sciences 34 (5), 1091–1098.
Wang S., Zhang K., Chao L., Li D., Tian X., Bao H. & Xia Y. 2021b Exploring the utility of radar and satellite-sensed precipitation and their dynamic bias correction for integrated prediction of flood and landslide hazards. Journal of Hydrology 603, 126964. doi: 10.1016/j.jhydrol.2021.126964.
Wang H. Y., Chen B., Pan D., Lv Z. A. & Huang S. Q. 2022 Optimal wind energy generation considering climatic variables by Deep Belief network (DBN) model based on modified coot optimization algorithm (MCOA). Sustainable Energy Technologies and Assessments 53 (Part C), 102744.
Wang G., Zhao B., Wu B., Zhang C. & Liu W. 2023 Intelligent prediction of slope stability based on visual exploratory data analysis of 77 in situ cases. International Journal of Mining Science and Technology 33, 47–59. https://doi.org/10.1016/j.ijmst.2022.07.002.
Xu H. Q. 2005 A study on information extraction of water body with the Modified Normalized Difference Water Index (MNDWI). Journal of Remote Sensing 05, 589–595.
Xu Y. F., Chen L. M. & Chen L. G. 2020 High temporal resolution remote monitoring of chlorophyll a concentration change after rainstorm based on GOCI data in Lake Taihu. Water Resources and Hydropower Engineering 51 (10), 151–158.
Yang Z., Yu X., Dedman S., Rosso M., Zhu J., Yang J. & Wang J. 2022 UAV remote sensing applications in marine monitoring: knowledge visualization and review. Science of The Total Environment 838, 155939. https://doi.org/10.1016/j.scitotenv.2022.155939.
Yin L., Wang L., Zheng W., Ge L., Tian J., Liu Y. & Liu S. 2022 Evaluation of empirical atmospheric models using swarm-C satellite data. Atmosphere 13 (2), 294. doi: 10.3390/atmos13020294.
Yue C. P., Li X. & Bao L. S. 2020 Using remote sensing to estimate seasonal variation in phytoplankton biomasses in the Lake Wuliangsuhai. Journal of Irrigation and Drainage 39 (8), 122–128.
Zhang Y. C., Qian X. & Qian Y. 2009 Quantitative retrieval of chlorophyll a concentration in Taihu Lake using machine learning methods. Environmental Science 30 (5), 1321–1328.
Zhang L. H., Dai X. F. & Bao Y. H. 2015 Inversion of chlorophyll-a concentration based on TM remote sensing image in Wuliangsuhai Lake. Environmental Engineering 33 (6), 133–138.
Zhang K., Wang S., Bao H. & Zhao X. 2019 Characteristics and influencing factors of rainfall-induced landslide and debris flow hazards in Shaanxi Province, China. Natural Hazards and Earth System Sciences 19 (1), 93–105. doi: 10.5194/nhess-19-93-2019.
Zhang T., Huang M. T. & Wang Z. J. 2020 Estimation of chlorophyll-a concentration of lakes based on SVM algorithm and Landsat 8 OLI images. Environmental Science and Pollution Research 27 (13), 14977–14990.
Zhao T., Shi J., Lv L., Xu H., Chen D., Cui Q. & Zhang Z. 2020 Soil moisture experiment in the Luan River supporting new satellite mission opportunities. Remote Sensing of Environment 240, 111680.
Zhao T., Shi J., Entekhabi D., Jackson T. J., Hu L., Peng Z. & Kang C. S. 2021 Retrievals of soil moisture and vegetation optical depth using a multi-channel collaborative algorithm. Remote Sensing of Environment 257, 112321. doi: 10.1016/j.rse.2021.112321.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).