Abstract
Given the increased risks of water scarcity and the presence of polluting agents in water resources, this paper presents a computational tool capable of assessing water quality based on digital processing techniques applied to satellite images. Initially, a database was created for Brazilian regions, consisting of satellite images of hydrographic basins associated with the Water Quality Index (WQI), according to the criteria established by the National Water Agency (ANA). To date, the database consists of 85 images, of which 61 were used in the training stage and 24 in the testing stage. In both stages, the images were subjected to thresholding using Otsu's Method, binarization, linear expansion by saturation, application of a Laplacian filter, extraction of characteristics using co-occurrence matrices and classification by the Bayes Discriminant. These techniques were implemented on a computational platform in the MATLAB® environment, which provides the interface between the system and its users. The proposed system achieved a success rate of approximately 70% in the classification of WQIs, which can be improved as more information becomes available to expand the databases.
HIGHLIGHTS
Application of satellite images and processing techniques to assess water quality.
Assessment of water quality in hydrographic basins without on-site measurements.
Application of Otsu's method, linear expansion by saturation and Laplacian filter.
Water features extraction and classification by co-occurrence matrices and Bayesian Linear Discriminant Analysis Classifier.
INTRODUCTION
Water is essential to both biological and social systems. In addition to mediating the biochemical reactions in living organisms, it is also used as a means of transportation, leisure, and electricity generation, among others. The increasing demand for water due to population growth, industrial and mineral exploration activities, and improper waste management has been compromising the quality and durability of the existing water resources on the planet. In this context, recent environmental disasters in Brazil not only show the need to review the regulatory and inspection tools, but also raise awareness of the need to deploy resources for monitoring and control by society in general.
In 2015, a Vale mining tailings dam collapsed in the municipalities of Mariana and Barra Longa, in Minas Gerais, causing the greatest environmental impact in the country's history and the world's largest disaster involving tailings dams (Brazil dam burst engulfs homes in Minas Gerais, available at https://www.bbc.com/news/world-latin-america-34742272). Four years after the Mariana disaster, in 2019, another dam controlled by the same company collapsed in the Córrego do Feijão region, in Brumadinho. That failure resulted in a major catastrophe, regarded as one of the biggest industrial, environmental and humanitarian disasters, with more than 300 casualties, according to https://www.bbc.com/news/av/world-latin-america-48958016. In the face of these recent calamities, analyzing and monitoring the water quality of rivers, lakes and dams has become an ever-growing necessity for the populations directly affected by such failures, and a matter of public interest.
Several studies aimed at analyzing water quality are available in the current literature, among which there are several proposals devoted to the use of satellite images for the identification of chemical components present in hydrographic basins.
In 2013, the Landsat 8 satellite was launched with two sensors, OLI (Operational Land Imager) and TIRS (Thermal Infrared Sensor), and 11 satellite frequency bands with characteristics that allow the identification of different contents in the target regions.
Trescott & Park (2013) used bands 1, 2 and 3 to analyze eutrophication, and Wang et al. (2018) used the same bands to analyze oxygen and dissolved permanganate concentrations. In addition to bands 2 and 3, Pisani et al. (2016) included band 4 to determine the sedimentation present in two Brazilian rivers. Bonansea & Fernandez (2013) used band 4 to analyze suspended solids in water. Ferral et al. (2018) worked with bands 1 and 4 to evaluate chlorophyll concentrations and band 6 to estimate water temperature.
Watanabe et al. (2017) discussed the performance of satellite images in the study of chlorophyll in Barra Bonita Reservoir, Brazil and Nazeer & Nichol (2015) used two satellites to analyze suspended solids in Hong Kong's coastal waters.
In Wong et al. (2018), the authors used data collection to perform a quantitative parameter assessment and thus, to define the degree of reliability of the proposed algorithm.
Manickam et al. (2018) analyzed the evolution of river pollution through low-resolution satellite images and images extracted from Google Maps.
Bilal et al. (2019) applied some techniques for the analysis of water quality; among them, image variance and threshold segmentation with path discretization, highlighting the limitations to in-situ data collection and the importance of continuous monitoring of water quality data.
Batur & Maktav (2019) used images from the Landsat 8 and Sentinel 2A satellites to establish the relationship between water quality parameters and spectral reflectance and to validate surface water quality values using the Support Vector Machine (SVM), among other methods.
Proposals for calculating water quality using the Internet of Things (IoT) can be found in Verma et al. (2019), in which the authors propose a network of wireless sensors connected to a device that sends information to a virtual platform in real time. More recently, it has also been possible to observe an increasing number of studies using Machine Learning (ML) techniques for analyzing climatic and environmental conditions (Moradi et al. 2020; Sit et al. 2020). However, the large amount of information normally required for training and testing ML algorithms becomes a strong limitation for using these techniques if adequate databases are not available.
Qi et al. (2020) proposed monitoring four water quality parameters (pH, DO, CODMn and NH3-N) in Taihu Lake, China, from 2013 to 2018, based on a Long Short-Term Memory (LSTM) network. The resulting model was applied to satellite images from several periods to estimate the water quality parameters at each pixel and to track water quality changes.
Aldhyani et al. (2020) use artificial intelligence algorithms to predict WQI and WQC (Water Quality Classification) based on seven parameters from data collected in different Indian states from 2005 to 2014, which are available on Kaggle. NARNET (Nonlinear Autoregressive Neural Network) and LSTM networks were used for the WQI prediction phase. For the WQC phase, SVM, K-Nearest Neighbor (KNN) and Naïve Bayes techniques were used to classify the data in five classes (Excellent, Good, Poor, Very Poor and Unsuitable for drinking).
Based on the Ebinur Lake Basin region in China, Li et al. (2020) applied four machine learning algorithms to model and predict the WQI from 9 Water Quality Parameters (WQPs), spectral band remote sensing and 2D modeling spectral index using Sentinel-2 MSI data.
Ma et al. (2021) presented a study on Land Use and Land Cover (LUCS) of a typical agricultural basin near the Danjiangkou Reservoir, whose purpose is to determine the impact of natural and human processes on water quality. This basin was subdivided into 13 parts with a monitoring point in each sub-basin. The parameters obtained consist of total nitrogen (TN), total phosphorus (TP), ammonia nitrogen (NH4+-N), nitrate nitrogen (NO3−-N) and chemical oxygen demand (COD), which were collected and sampled six times throughout 2018.
The purpose of this work is to develop an open interactive platform for analyzing water quality based on digital processing of satellite images, comparing the spectral content of bands 2 and 5 to the WQI. The images were obtained from the National Institute for Space Research (INPE) and were selected based on data collected from the parameters comprising the WQI. The collection data were made available by the State Water Management Institute (IGAM). An open computational platform for interaction with users was developed in a MATLAB® - R2019b environment, composed of techniques for noise elimination, analysis of statistical characteristics and the Bayesian Linear Discriminant Analysis Classifier, among others.
MATERIALS AND METHODS
Images and WQI database
In this work, the authors built a database from satellite images of different rivers and lakes in the Brazilian hydrographic basins, which were associated with the water quality parameters measured in situ that are necessary for the calculation of WQIs. These parameters are provided by IGAM at http://200.198.57.118:8080/jspui/handle/123456789/3216 and the images by INPE at http://www.dgi.inpe.br/CDSR/; the resulting database is available at https://github.com/GEPSIN/WQI-image-satellite-brazilian-states-dataset.
The training database consisted of 61 images, 27 of which were related to the hydrographic basins in the states of Alagoas (AL), Minas Gerais (MG), Mato Grosso (MT), Paraná (PR), Rio de Janeiro (RJ), Rio Grande do Norte (RN) and Rio Grande do Sul (RS). The other 34 images were obtained from the states of Bahia (BA), Ceará (CE), Espírito Santo (ES), Goiás (GO), Mato Grosso do Sul (MS), Paraíba (PB), Pernambuco (PE) and São Paulo (SP). These groups of states will henceforth be referred to as Group 1 and Group 2, respectively. The validation tests were performed using a total of 24 images: 13 for Group 1 and 11 for Group 2.
Water quality
The WQI was created in 1970 in the United States and was first used in Brazil by the São Paulo State Environmental Company (CETESB) in 1975. After being disseminated to other Brazilian states in the following decade, it became the main benchmark for assessing water quality in the country.
The WQI was developed to assess the quality of raw water for public supply before treatment. The parameters presented in Table 1 are used for its calculation, which reflects mainly the contamination caused by the discharge of domestic sewage and industrial wastewater within the hydrographic basins. The WQI is calculated as a weighted product of the parameters, and its numerical value varies from 0 to 100. The Brazilian states use different classification ranges, as shown in Table 2.
Table 1. Parameters used in the WQI calculation and their weights.
Parameter | Weight |
---|---|
Dissolved oxygen | 0.17 |
Thermotolerant coliforms | 0.15 |
Hydrogenic potential | 0.12 |
Biochemical oxygen demand | 0.10 |
Water temperature | 0.10 |
Total nitrogen | 0.10 |
Total phosphorus | 0.10 |
Turbidity | 0.08 |
Total residue | 0.08 |
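The weighted product can be written as $\mathrm{WQI} = \prod_{i=1}^{9} q_i^{w_i}$, where $w_i$ is the weight of parameter $i$ in Table 1 and $q_i$ is the quality score (from 0 to 100) assigned to that parameter. The following Python sketch only illustrates this arithmetic: the dictionary keys and the scores in the usage example are hypothetical, and the rating curves that convert raw measurements into $q_i$ are not reproduced here.

```python
# Weights taken from Table 1; they sum to 1.
WEIGHTS = {
    "dissolved_oxygen": 0.17,
    "thermotolerant_coliforms": 0.15,
    "hydrogenic_potential": 0.12,
    "biochemical_oxygen_demand": 0.10,
    "water_temperature": 0.10,
    "total_nitrogen": 0.10,
    "total_phosphorus": 0.10,
    "turbidity": 0.08,
    "total_residue": 0.08,
}


def wqi(scores):
    """Weighted product of per-parameter quality scores (each in 0..100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    result = 1.0
    for name, weight in WEIGHTS.items():
        result *= scores[name] ** weight
    return result


# Hypothetical quality scores, for illustration only.
example_scores = {name: 80.0 for name in WEIGHTS}
example_scores["thermotolerant_coliforms"] = 40.0
print(round(wqi(example_scores), 2))
```

Because the weights sum to 1, identical scores for all parameters return that same value, while a single poor score pulls the index down in proportion to its weight.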
Table 2. WQI classification ranges adopted by each group of states.
Group 1 (AL, MG, PR, RJ, RN, RS) | Group 2 (BA, CE, ES, GO, MS, PB, PE, SP) | Classification range |
---|---|---|
91–100 | 80–100 | Excellent |
71–90 | 52–79 | Good |
51–70 | 37–51 | Fair |
26–50 | 20–36 | Bad |
0–25 | 0–19 | Worst |
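As an illustration of how the ranges in Table 2 map a WQI value to a class, the sketch below uses the lower bound of each range as a threshold. The function name is hypothetical, and the treatment of fractional values that fall between two integer ranges (for example 90.5 in Group 1, which this sketch assigns to the lower class) is an assumption, since the table does not specify it.

```python
# Lower bound of each class per state group, taken from Table 2.
LOWER_BOUNDS = {
    1: [(91, "Excellent"), (71, "Good"), (51, "Fair"), (26, "Bad"), (0, "Worst")],
    2: [(80, "Excellent"), (52, "Good"), (37, "Fair"), (20, "Bad"), (0, "Worst")],
}


def classify_wqi(value, group):
    """Return the Table 2 class of a WQI value for state group 1 or 2."""
    if not 0 <= value <= 100:
        raise ValueError("WQI must lie between 0 and 100")
    # Bounds are listed from best to worst, so the first match wins.
    for lower, label in LOWER_BOUNDS[group]:
        if value >= lower:
            return label


# The same WQI can fall into different classes depending on the group.
print(classify_wqi(85.40, 1), classify_wqi(85.40, 2))
```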
As shown in the block diagram presented in Figure 1, the proposed system comprises four stages: an image acquisition stage, which is manual in the presence of clouds or automatic in their absence; a pre-processing stage, which separates the areas of interest; a post-processing stage, which extracts characteristics; and an image classification stage, which compares the characteristics extracted from the image under analysis with the characteristics previously stored in the database.
The proposal is that the user can insert any Landsat 8 image of a hydrographic basin whose WQI is unknown. The characteristics of this image are compared with those extracted from the database images used in the training phase, making it possible to obtain the WQI value and to classify the hydrographic basin according to Table 2.
Initial and pre-processing stages
The pre-processing stage consists of a set of procedures that prepare the image by removing noise and non-relevant information, so that the extraction of the characteristics of interest and the classification of the images are more accurate and efficient (Gonzalez & Woods 2008). The procedures that make up this stage are described below.
Segmentation method and masks
The objects of analysis are the surface water bodies, so any visual information that does not consist of water is irrelevant. This step provides the manual and automatic segmentation tools for the approximate selection of the water body, intended for images with and without the presence of clouds, respectively, both segmented from the scan and pixel comparison (Solomon & Breckon 2011).
Band 5, shown in Figure 2(a), was chosen for the segmentation stage because it exhibits a difference between water and land that is more noticeable to the naked eye, which facilitates the clipping of the region of interest. With the image already cropped, as shown in Figure 2(b) and 2(c), the image is binarized using a threshold of 9,000 applied to the pixel values. This threshold was defined through experimental tests and characterizes water bodies as regions whose pixel values are below that value.
In automatic mode, the thresholding is performed by the Otsu Method, after which the image is binarized and a pixel-by-pixel scan is performed to exclude pixels with values greater than 9,000, a limit obtained by trial and error. The remaining pixels with characteristics of water-containing regions are set to zero.
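The automatic mode can be sketched in Python as follows. Otsu's method selects the level that maximizes the between-class variance of the two resulting pixel classes. How the Otsu threshold and the fixed limit of 9,000 are combined is an assumption here (a pixel is kept as water only if it passes both), as is the mask polarity (True marks water); the function names are hypothetical.

```python
from collections import Counter

FIXED_LIMIT = 9000  # empirical band 5 limit reported in the text


def otsu_threshold(pixels):
    """Return the level t that maximizes the between-class variance
    of the classes {p <= t} and {p > t}."""
    hist = Counter(pixels)
    levels = sorted(hist)
    total = len(pixels)
    total_sum = sum(level * count for level, count in hist.items())
    best_t, best_var = levels[0], -1.0
    count_low = 0  # number of pixels in the low class
    sum_low = 0    # sum of intensities in the low class
    for level in levels[:-1]:  # the high class must stay non-empty
        count_low += hist[level]
        sum_low += level * hist[level]
        count_high = total - count_low
        mean_low = sum_low / count_low
        mean_high = (total_sum - sum_low) / count_high
        var_between = count_low * count_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, level
    return best_t


def water_mask(band5):
    """Binary mask of candidate water pixels (True = water)."""
    flat = [p for row in band5 for p in row]
    limit = min(otsu_threshold(flat), FIXED_LIMIT)
    return [[p <= limit for p in row] for row in band5]


# Tiny synthetic band 5 crop: low values stand for water, high for land.
band5 = [[3000, 3500, 15000],
         [4000, 16000, 20000]]
mask = water_mask(band5)
```

The between-class variance is computed up to a constant factor, which does not change the level at which the maximum occurs.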
At the end of this stage, there is a binarized image or mask that highlights the region of interest, where noise may still occur. For noise reduction, the resulting image is subjected to a cluster removal of N = 4 pixels using the bwareaopen function, available in the MATLAB® environment.
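For readers without access to MATLAB®, the effect of `bwareaopen(BW, 4)` (removing connected groups of fewer than 4 foreground pixels) can be approximated with the pure-Python sketch below. It assumes 8-connectivity, which is the MATLAB® default for 2-D images, and a mask in which True marks the foreground; the function name is hypothetical.

```python
from collections import deque


def remove_small_objects(mask, min_size=4):
    """Keep only 8-connected foreground components with >= min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) >= min_size:
                for y, x in component:
                    cleaned[y][x] = True
    return cleaned


noisy = [[True, True, False, False, False],
         [True, True, False, False, True],
         [False, False, False, False, False],
         [False, False, True, False, False]]
cleaned = remove_small_objects(noisy, min_size=4)
```

In this example the 2×2 block survives, while the two isolated pixels are discarded as noise.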
Applying the mask to band 2
Band 2 was used for extracting characteristics and classifying, as it yielded the best results in terms of classification during the training and test phases in relation to the other bands available in the Landsat 8 satellite images (Trescott & Park 2013; Wang et al. 2018).
The segmentation performed from band 5 results in a binary mask to select only the region of interest for analysis purposes, discarding any noisy and irrelevant information. This mask is then applied to band 2 and, as a result, the final image is generated in the pre-processing step, as shown in Figure 2(d).
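Applying the mask amounts to an element-wise selection, as in the short sketch below. It assumes that the mask and the band 2 image have the same dimensions, that True marks the region of interest, and that discarded pixels are replaced by a fill value of 0; these conventions and the function name are illustrative assumptions.

```python
def apply_mask(band, mask, fill=0):
    """Keep band values where the mask is True; replace the rest by `fill`."""
    return [
        [value if keep else fill for value, keep in zip(band_row, mask_row)]
        for band_row, mask_row in zip(band, mask)
    ]


# Hypothetical 2x2 example.
band2 = [[10, 20],
         [30, 40]]
roi = [[True, False],
       [False, True]]
masked = apply_mask(band2, roi)
```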
Post-processing and classification
The content of a sensor's spectral bands can be represented by means of histograms, which indicate the number of pixels present in the image with a certain intensity value. From the histograms, the user can obtain information about the image, such as the gray-level intensities of the pixels in the compared images and the number of different classes, among others. The histogram contains only the image's radiometric information, but no spatial information (Gonzalez & Woods 2008).
When it comes to image processing, the histogram's characteristics make it possible to select more appropriate techniques for its expansion, aiming at improving contrast. As each spectral band has a particular form of histogram, different options for increasing the contrast can be chosen for different bands of a sensor, such as linear expansion by saturation, expansion by parts, and expansion by equalization, among others (Gonzalez & Woods 2008).
Linear expansion by saturation
Considering band 5, linear expansion by saturation showed better performance since it improves the contrast of the images. For that purpose, it was necessary to change the amplitude scale of each pixel through the basic rule of histogram expansion; that is, redistribution of pixels.
The modified or highlighted image will have the same number of pixels as the original image, but with its brightness values expanded to the total quantization range of the image. Saturation is then applied: the lowest and the highest pixel intensity values are identified for the expansion, so that the gray intensity levels of the output pixels are those that most closely match the intensity levels of the input pixels. This effect results in more differentiated output intensity levels (Gonzalez & Woods 2008; Meneses & Almeida 2012).
Figure 3(a) shows an original band 2 image and its histogram. Figure 3(b) shows the image and its histogram after the application of linear expansion by saturation, where a contrast improvement can be observed.
The great advantage of linear expansion by saturation is the maintenance of the original reflectance ratios, without any radiometric alteration in the highlighted image. For this reason, linear expansion is usually preferred among the techniques available for improving the spectral contrast of satellite images (Gonzalez & Woods 2008).
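A minimal Python sketch of linear expansion by saturation is given below. The lower and upper cut-offs are taken at the 2nd and 98th percentiles and the output range is 0 to 255; these values are assumptions chosen for illustration, since the text does not state the saturation limits used, and the function name is hypothetical.

```python
def linear_stretch_saturation(pixels, low_pct=2.0, high_pct=98.0, out_max=255):
    """Linearly map [low cut, high cut] onto [0, out_max], saturating the tails."""
    ordered = sorted(pixels)
    n = len(ordered)
    low = ordered[int((n - 1) * low_pct / 100)]
    high = ordered[int((n - 1) * high_pct / 100)]
    if high == low:
        # Flat image: there is no contrast to expand.
        return [0 for _ in pixels]
    stretched = []
    for p in pixels:
        clipped = min(max(p, low), high)  # saturate values outside the cuts
        stretched.append(round((clipped - low) * out_max / (high - low)))
    return stretched


# Synthetic pixel values occupying only a narrow part of the range.
narrow = list(range(100, 201))
expanded = linear_stretch_saturation(narrow)
```

The mapping is linear between the two cut-offs, so the relative ordering and spacing of the unsaturated values are preserved while the histogram is spread over the full output range.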
Laplacian filter
Figure 3(c) shows the image after the linear expansion by saturation, with the details highlighted by the Laplacian filter, as well as its respective histogram. This image is then submitted to the digital processing methods for feature extraction.
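The text does not specify the Laplacian kernel or how its response is combined with the image. As a hedged illustration, the sketch below uses the common 4-neighbour kernel (center −4, direct neighbours +1) with replicated borders, and highlights details by subtracting the Laplacian response from the image; both choices and the function names are assumptions.

```python
def laplacian(image):
    """4-neighbour Laplacian response with replicated borders."""
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        # Clamp indices so that border pixels are replicated.
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return image[r][c]

    return [
        [
            pixel(r - 1, c) + pixel(r + 1, c) + pixel(r, c - 1) + pixel(r, c + 1)
            - 4 * image[r][c]
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def highlight_details(image):
    """Sharpen by subtracting the Laplacian response from the image."""
    response = laplacian(image)
    return [
        [image[r][c] - response[r][c] for c in range(len(image[0]))]
        for r in range(len(image))
    ]


# A single bright pixel on a dark background.
spot = [[0, 0, 0],
        [0, 10, 0],
        [0, 0, 0]]
response = laplacian(spot)
sharpened = highlight_details(spot)
```

Uniform regions produce a zero response, so only intensity transitions such as edges and fine details are emphasized.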
Texture descriptor and co-occurrence matrix
The texture characteristics used for the classification are extracted using the co-occurrence matrix method. Although the concept of texture does not have a formal definition according to Tuceryan & Jain (1998), the texture of an image can be described as the spatial variation of pixel intensities or, according to Backes & Sá (2016), the texture pattern can be interpreted as the repetition of a model in its exact form or with minor variations.
According to Haralick et al. (1973), for an image with $N_g$ discrete gray levels of intensity, the co-occurrence matrix will have dimension $N_g \times N_g$, and each position $(i, j)$ will receive the number of times that gray levels $i$ and $j$ are present in the image at a distance $d$ and orientation angle $\theta$. If the co-occurrence matrix is divided by the sum of its values, it becomes a probability matrix $P(i, j)$, whose entries give the probability of finding pairs of pixels with gray levels $i$ and $j$ at a distance $d$ and orientation angle $\theta$. In addition, co-occurrence matrices can be symmetric if the counting of the pixel pairs occurs in both directions; otherwise they will be asymmetric (Backes & Sá 2016).
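The construction of the co-occurrence matrix can be sketched in Python as follows, here for distance $d = 1$ and angle $\theta = 0°$ (horizontal neighbours) on a small 4-level image. The contrast descriptor $\sum_{i,j} (i - j)^2 P(i, j)$ is included only as an example of a texture feature; the text does not list which descriptors were actually extracted, and the function names are hypothetical.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Count gray-level pairs (i, j) separated by offset = (row step, column step)."""
    d_row, d_col = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # also count the pair in the reverse direction
    return counts


def normalize(counts):
    """Divide by the sum of all entries to obtain the probability matrix."""
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def contrast(prob):
    """Contrast descriptor: sum over (i, j) of (i - j)^2 * P(i, j)."""
    return sum(
        (i - j) ** 2 * p
        for i, row in enumerate(prob)
        for j, p in enumerate(row)
    )


image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
counts = glcm(image, levels=4)      # d = 1, theta = 0 degrees, symmetric
prob = normalize(counts)
```

With `symmetric=False` only the left-to-right pairs are counted, which gives the asymmetric variant mentioned above.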
Bayesian classifier
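As a rough illustration of how a Bayesian linear discriminant produces per-class probabilities such as those reported in the validation tables, the sketch below fits a one-feature model with Gaussian class-conditional densities and a pooled variance, and computes posteriors from the linear discriminant scores $\delta_k(x) = x\mu_k/\sigma^2 - \mu_k^2/(2\sigma^2) + \ln \pi_k$. The single scalar feature, the training values and the function names are all hypothetical; the actual system works on texture characteristics whose details are not given here.

```python
import math


def fit_lda_1d(samples):
    """samples: dict mapping class label -> list of scalar feature values."""
    n_total = sum(len(values) for values in samples.values())
    means = {k: sum(v) / len(v) for k, v in samples.items()}
    # Pooled (shared) variance across classes, as linear discriminants assume.
    pooled_var = sum(
        (x - means[k]) ** 2 for k, v in samples.items() for x in v
    ) / (n_total - len(samples))
    priors = {k: len(v) / n_total for k, v in samples.items()}
    return means, pooled_var, priors


def posterior(x, means, pooled_var, priors):
    """Posterior probability of each class for feature value x."""
    scores = {
        k: x * means[k] / pooled_var
        - means[k] ** 2 / (2 * pooled_var)
        + math.log(priors[k])
        for k in means
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(s - top) for k, s in scores.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


# Hypothetical training features for two classes.
training = {"Fair": [1.0, 2.0, 3.0], "Good": [5.0, 6.0, 7.0]}
model = fit_lda_1d(training)
halfway = posterior(4.0, *model)    # equidistant from both class means
near_fair = posterior(2.0, *model)  # at the mean of the Fair class
```

The class with the highest posterior is taken as the prediction, and multiplying the posteriors by 100 gives percentages comparable in form to those listed for each test image.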
RESULTS AND DISCUSSION
Tests and validation
Based on information accessible in the INPE and ANA databases, a new database was built from satellite images associated with in-situ measurements of quality parameters. It comprises 27 images for the Fair and Good classes in Group 1 and 34 images for the Fair, Good and Excellent classes in Group 2. In total, 61 images were used during the training phase, distributed according to Table 3.
Table 3. Distribution of the training images by state group and class.
State group | Fair | Good | Excellent |
---|---|---|---|
Group 1 (AL, MG, PR, RJ, RN, RS) | 12 | 15 | – |
Group 2 (BA, CE, ES, GO, MS, PB, PE, SP) | 6 | 17 | 11 |
The validation tests were performed using a total of 24 images for the two state groups, 13 and 11 images for Group 1 and Group 2 respectively, as shown in Tables 4 and 5. The success rates achieved considering this data collection were 69.23% and 72.73%, respectively.
Table 4. Validation results for Group 1: IGAM reference values and class probabilities given by the proposed system.
Test image | Lat. (decimal degrees) | Long. (decimal degrees) | Acquisition date | WQI (IGAM) | Classification (IGAM) | Measurement date | Fair (%), proposed system | Good (%), proposed system |
---|---|---|---|---|---|---|---|---|
1 | −19.35 | −41.24 | 09/07/2017 | 75.70 | Good | 09/14/2017 | 57.18 | 42.82 |
2 | −19.51 | −41.01 | 10/09/2017 | 72.30 | Good | 10/12/2017 | 57.17 | 42.83 |
3 | −19.35 | −41.24 | 02/14/2018 | 51.45 | Fair | 02/08/2018 | 68.30 | 31.70 |
4 | −20.69 | −46.36 | 08/22/2015 | 79.80 | Good | 08/20/2015 | 54.70 | 45.30 |
5 | −21.17 | −45.13 | 08/11/2017 | 64.90 | Fair | 08/09/2017 | 51.41 | 48.59 |
6 | −21.17 | −45.13 | 12/04/2015 | 68.20 | Fair | 11/04/2015 | 58.12 | 41.88 |
7 | −21.17 | −45.13 | 05/06/2016 | 67.30 | Fair | 05/04/2016 | 51.82 | 48.18 |
8 | −18.85 | −44.79 | 05/06/2016 | 85.40 | Good | 05/06/2016 | 26.07 | 73.93 |
9 | −17.96 | −45.66 | 08/22/2015 | 84.55 | Good | 08/13/2015 | 44.09 | 55.91 |
10 | −17.96 | −45.66 | 11/17/2015 | 67.95 | Fair | 11/19/2015 | 50.99 | 49.01 |
11 | −19.51 | −41.01 | 09/07/2017 | 75.70 | Good | 09/14/2017 | 57.40 | 42.60 |
12 | −19.51 | −41.01 | 02/14/2018 | 51.45 | Fair | 02/08/2018 | 68.45 | 31.55 |
13 | −18.19 | −45.25 | 08/06/2015 | 83.60 | Good | 08/14/2015 | 42.09 | 57.91 |
Success rate | 69.23% |
Table 5. Validation results for Group 2: IGAM reference values and class probabilities given by the proposed system.
Test image | Lat. (decimal degrees) | Long. (decimal degrees) | Acquisition date | WQI (IGAM) | Classification (IGAM) | Measurement date | Fair (%), proposed system | Good (%), proposed system | Excellent (%), proposed system |
---|---|---|---|---|---|---|---|---|---|
1 | −19.35 | −41.24 | 9/7/2017 | 75.70 | Good | 9/14/2017 | 17.18 | 48.45 | 34.37 |
2 | −19.51 | −41.01 | 10/9/2017 | 72.30 | Good | 10/12/2017 | 8.48 | 54.32 | 37.20 |
3 | −19.35 | −41.24 | 2/14/2018 | 51.45 | Good | 2/8/2018 | 18.19 | 58.49 | 23.32 |
4 | −21.17 | −45.13 | 8/11/2017 | 64.90 | Good | 8/9/2017 | 14.48 | 42.04 | 43.48 |
5 | −21.17 | −45.13 | 12/4/2015 | 68.20 | Good | 11/4/2015 | 18.68 | 46.11 | 35.21 |
6 | −18.19 | −45.25 | 5/20/2016 | 67.50 | Good | 5/13/2016 | 10.41 | 39.66 | 49.92 |
7 | −17.96 | −45.66 | 8/8/2016 | 85.05 | Excellent | 8/11/2016 | 13.18 | 36.67 | 50.15 |
8 | −18.85 | −44.79 | 5/6/2016 | 85.40 | Excellent | 5/6/2016 | 22.11 | 27.75 | 50.14 |
9 | −18.19 | −45.25 | 8/8/2016 | 84.40 | Excellent | 8/12/2016 | 13.23 | 41.97 | 44.80 |
10 | −20.01 | −47.88 | 9/14/2015 | 47.30 | Fair | 9/21/2015 | 95.30 | 3.65 | 1.05 |
11 | −19.76 | −50.20 | 9/30/2016 | 47.25 | Fair | 9/30/2016 | 14.08 | 46.15 | 39.77 |
Success rate | 72.73% |
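The reported success rates can be reproduced from the tabulated percentages if the predicted class is taken to be the one with the highest percentage, which is an assumption about how the tables were scored. The Python sketch below encodes the reference classes and percentages from Tables 4 and 5; the function names are hypothetical.

```python
def predicted_class(classes, percentages):
    """Class with the highest percentage."""
    return classes[percentages.index(max(percentages))]


def success_rate(classes, rows):
    """Fraction of rows whose predicted class equals the reference class."""
    hits = sum(1 for actual, pct in rows if predicted_class(classes, pct) == actual)
    return hits / len(rows)


GROUP1_CLASSES = ["Fair", "Good"]
GROUP1_ROWS = [  # (reference class, [Fair %, Good %]) from Table 4
    ("Good", [57.18, 42.82]), ("Good", [57.17, 42.83]), ("Fair", [68.30, 31.70]),
    ("Good", [54.70, 45.30]), ("Fair", [51.41, 48.59]), ("Fair", [58.12, 41.88]),
    ("Fair", [51.82, 48.18]), ("Good", [26.07, 73.93]), ("Good", [44.09, 55.91]),
    ("Fair", [50.99, 49.01]), ("Good", [57.40, 42.60]), ("Fair", [68.45, 31.55]),
    ("Good", [42.09, 57.91]),
]

GROUP2_CLASSES = ["Fair", "Good", "Excellent"]
GROUP2_ROWS = [  # (reference class, [Fair %, Good %, Excellent %]) from Table 5
    ("Good", [17.18, 48.45, 34.37]), ("Good", [8.48, 54.32, 37.20]),
    ("Good", [18.19, 58.49, 23.32]), ("Good", [14.48, 42.04, 43.48]),
    ("Good", [18.68, 46.11, 35.21]), ("Good", [10.41, 39.66, 49.92]),
    ("Excellent", [13.18, 36.67, 50.15]), ("Excellent", [22.11, 27.75, 50.14]),
    ("Excellent", [13.23, 41.97, 44.80]), ("Fair", [95.30, 3.65, 1.05]),
    ("Fair", [14.08, 46.15, 39.77]),
]

rate_group1 = success_rate(GROUP1_CLASSES, GROUP1_ROWS)  # 9 of 13 images
rate_group2 = success_rate(GROUP2_CLASSES, GROUP2_ROWS)  # 8 of 11 images
print(f"{rate_group1:.2%}, {rate_group2:.2%}")
```

Under this scoring rule, 9 of the 13 Group 1 images and 8 of the 11 Group 2 images are classified correctly, matching the 69.23% and 72.73% shown in the tables.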
Better results are expected as the number of images associated with in-situ measurements of water quality indexes increases. Another limitation to be overcome is the reduced availability of satellite images in which the hydrographic basins have sufficient dimensions for segmentation.
Computational platform
A computational platform whose graphical user interface is shown in Figure 4 was developed in a MATLAB® environment to provide interaction between the user and the proposed system. This interface makes it possible to view and select a previously stored satellite image for analysis. Then, it is necessary to define the group to which the image belongs; that is, Group 1 referring to the images of the hydrographic basins in the states of AL, MG, MT, PR, RJ, RN, and RS or Group 2, referring to the states of BA, CE, ES, GO, MS, PB, PE, and SP.
The next step depends on the visibility of the water body in the image; that is, whether the image is covered with clouds or not. If the image has clouds, the manual button must be selected to crop the image in order to obtain a region with better visibility. If the image has no clouds, the user can choose between the automatic and manual options.
Once the region of interest has been defined, the user may request a histogram view, allowing visual inspection of the gray-level distribution of the image. Finally, the classification option provides the estimated WQI value as well as the quality classification range of the water under analysis. The home button restarts the platform for the input of a new image.
The computational platform was created using the Graphical User Interface Development Environment (GUIDE), a tool available in the MATLAB® environment. This tool associates programming routines with specific events through call-backs in such a way that, if an event (clicking a button, opening an image, etc.) is detected, a call-back triggers the associated code. This MATLAB® application requires few modifications in the original program, makes it possible to use functions stored in different directories, and facilitates expanding the number of operations. The platform is freely available at https://github.com/GEPSIN/AQAIS.
CONCLUSIONS
The maintenance and preservation of environmental resources is an issue that has increasingly played an important role in the face of the progressive scarcity of water resources and the disasters and tragedies that compromise natural resources.
In order to assess the water quality of watersheds through satellite images, the Landsat 8 satellite was chosen to obtain the images due to its greater number of frequency bands for the respective analyses; however, it has insufficient spatial resolution for some narrower water bodies, making the image unsuitable for segmentation and leading to limitations in building up the database. Another difficulty encountered was the limited amount of information provided by the Brazilian agencies responsible for analyzing water quality and the lack of a national database with standardized information.
Despite such limitations, the platform showed good results with success rates of around 70%, with good prospects for it to be expanded and optimized.
The main contribution of the proposed system is the association of the parameters used to measure water quality in rivers and lakes with satellite images available in public databases. This association makes it possible to estimate water quality in regions where assessment was previously limited to in-situ measurements, which are normally inaccessible to the general population.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories. The platform is freely available at https://github.com/GEPSIN/AQAIS. The database with images and WQI values is freely available at https://github.com/GEPSIN/WQI-image-satellite-brazilian-states-dataset.