Treated wastewater may affect water quality and thereby significantly alter physicochemical and biological water quality parameters. The impact of wastewater treatment plants (WWTPs) on receiving water bodies is a multivariate problem. In this study, we investigated the effect of 45 full-scale WWTPs on tropical receiving water bodies in Mato Grosso do Sul State, Brazil. Most of the Pantanal wetland area lies within Mato Grosso do Sul State, thus representing a region of great hydrological relevance. Partial least squares-discriminant analysis (PLS-DA) was employed to discriminate samples collected at four WWTP monitoring sites: influent, final effluent, upstream, and downstream of the discharges. The model demonstrated excellent accuracy when discriminating the influent from the effluent samples, but poor accuracy when discriminating upstream and downstream samples, indicating that the high dilution capacity of the receiving water bodies is a critical factor in water resources management. The results demonstrate the great potential of the methodology for better water resources management. It can be applied to even more complex WWTP databases, allowing the impacts of effluent disposal to be assessed in detail. It is recommended to use this methodology in water-limited regions to determine the effect of disposals in areas with different characteristics.
Effluents from WWTPs may affect the water quality of receiving water bodies.
The impact of WWTPs on receiving water bodies is a multivariate problem.
The PLS-DA technique supports better water resources management and helps determine the effects of WWTPs on receiving water bodies.
Classification models’ accuracy highlights the tropical Brazilian watercourses’ high dilution capacity.
Surface water quality is strongly influenced by anthropogenic activity. Effluents from wastewater treatment plants (WWTPs) are among the main anthropogenic sources, and their discharge may alter some of the water quality parameters of receiving water bodies (Yotova et al. 2020). As the composition of WWTP effluents does not match that of the receiving body, the effluent may significantly impact its chemical and biological characteristics (Drury et al. 2013), thereby compromising downstream uses (Abily et al. 2021).
WWTP monitoring data are not public in Brazil and other Latin American countries, making it difficult to obtain and analyze them. Besides, existing datasets usually have many missing data points and long periods without any data collection. Consequently, there is a lack of knowledge and experience in this regard. Vast areas and distinct environments with their peculiarities must be monitored, but the human and economic resources are limited. A comprehensive and rapid approach based on a one-sampling campaign with distinct environmental indicators might effectively diagnose environmental systems (Beghelli et al. 2016). In this context, appropriate methods may provide an overview of the WWTPs when a long monitoring period is infeasible.
Various methods have been employed to assess the impact of WWTP disposal on receiving water bodies. Lu et al. (2019) investigated the temporal and spatial variations in the water quality of an urban river receiving effluent from a WWTP in China. The authors used multivariate statistical techniques to assess six sampling sites along the river and found that the effluent considerably affected the river's water quality. In Brazil, the influence of season and land use on water quality was assessed using multivariate analysis of data from 80 monitoring points collected between 2010 and 2011 (de Souza Pereira et al. 2019), which identified the areas without sewage collection and treatment as those having the worst values of the water quality index (WQI). Dantas et al. (2021) assessed the performance of 14 full-scale WWTPs in Brazil by evaluating the parameters monitored in both raw and treated wastewater. To assess the impact of disposal on surface water quality, the monitoring data of the receiving water bodies collected at points upstream and downstream of each WWTP disposal were statistically compared. Venelinov et al. (2021) assessed the impact of discharges from three WWTPs on a river in Bulgaria. The authors compared the observed concentrations of the river's physicochemical parameters and trace elements to legally established limits. They also calculated the contributions of effluent pollutant loads to total river loads.
Environmental pollution issues, such as the impact of WWTPs on the contamination of receiving waters, are multivariate problems influenced by several variables with varying correlation values (Khatoonabadi et al. 2021). Therefore, the use of multivariate statistical analysis in the assessment of these impacts has gained importance (Shin et al. 2013; Zhang et al. 2016; Lu et al. 2019).
An example of a multivariate technique is partial least squares-discriminant analysis (PLS-DA), a valuable statistical tool for water quality assessment (Yotova et al. 2019). PLS-DA was first formally reported by Barker & Rayens (2003) and combines dimensionality reduction and discriminant analysis into one algorithm (Lee et al. 2018). PLS-DA is a variant of partial least squares regression (PLS-R), which can be used when the response variable is categorical (Fordellone et al. 2019). It is a classification technique used to determine which group a sample is most likely to belong to, based on a set of analytical measurements (Brereton & Lloyd 2014) and a well-known classification method for the prioritization of features discriminating different classes of environmental samples (Khatoonabadi et al. 2021).
The PLS-DA classification model can be constructed using either the PLS1-DA or PLS2-DA algorithms. PLS1-DA models a binary classification problem, i.e., of two classes, whereas PLS2-DA models a multi-class problem, i.e., the number of classes is greater than two (Lee et al. 2018). Yotova et al. (2019) used PLS1-DA to discriminate water quality factors and parameters between treated effluents and receiving water bodies in 21 Bulgarian WWTPs. The authors employed binary classification problems to discriminate treated effluent from surface water samples and to discriminate upstream from downstream samples in receiving waters. Mihaylova et al. (2022) have used PLS1-DA to discriminate different classes of samples (treated, untreated wastewaters, surface waters) in 11 Bulgarian WWTPs based on physicochemical parameters and ecotoxicological endpoints.
To the best of our knowledge, no study has employed PLS2-DA to discriminate the four sampling locations (influent, final effluent, upstream, and downstream of the disposals) and to assess the impact of WWTPs in tropical receiving waters with high dilution capacity. The current study, thus, aims to evaluate the use of the PLS-DA technique to determine the effects of Brazilian WWTPs on receiving water bodies by discriminating the water quality samples among influent, final effluent, upstream, and downstream of the discharge points.
Sampling of wastewater treatment plants
Figure 1 shows the location of the WWTPs, their treatment technologies, and their sizes. The design flow of the 45 WWTPs ranges from 2 to 120 L/s; therefore, the sewage treatment plants under study are small- (< 50 L/s) and medium-sized (50 L/s < design flow < 200 L/s) according to Brazilian legislation (Resolution 377/2006 of the National Environment Council). Supplementary Material, Table S1 characterizes the 45 WWTPs under study, including the design flow, the design population equivalent, the treatment facilities, and the WWTP coordinates.
There are two stabilization pond facilities, one septic tank facility, 20 upflow anaerobic sludge blanket (UASB) reactors, and 22 UASB reactors followed by post-treatment (Figure 1). The configurations of the stabilization ponds are as follows: one WWTP is a facultative pond, and the other consists of anaerobic ponds followed by facultative ponds. The post-treatments applied after the UASB reactors vary among WWTPs, namely anaerobic filters, trickling filters, maturation ponds, biodiscs, submerged aerated biofilters, and physical-chemical treatments.
Preliminary statistical analysis
The Wilcoxon non-parametric test was applied at a 5% significance level to compare the influent and effluent values of each parameter. The test was also applied to compare the data gathered upstream and downstream of the WWTP discharge points. The results are presented in box plot graphs.
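As an illustration of the comparison described above, the following minimal sketch implements a two-sided Wilcoxon signed-rank test with the normal approximation. It assumes that the paired (signed-rank) form of the test is meant, which the text does not state explicitly. It is written in Python rather than the R used in the study, and the function name and the example values are hypothetical. The normal approximation is reasonable only for moderately large samples; exact tables are preferable for small ones.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed-rank test (normal approximation).

    Returns (W, p_value), where W is the smaller of the positive and
    negative rank sums. Zero differences are dropped; tied absolute
    differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w - mean_w) / math.sqrt(var_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, p_value

# Made-up influent/effluent values (mg/L), for illustration only.
influent = [110, 220, 330, 440, 550, 660, 770, 880, 990, 1100]
effluent = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
w_stat, p = wilcoxon_signed_rank(influent, effluent)
significant = p < 0.05  # 5% significance level, as in the text
```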
PLS-DA was carried out in four steps: binary classification problems to differentiate between (i) influent and effluent samples, and (ii) upstream and downstream samples; (iii) a multi-class problem to classify the four sampling sites (influent, effluent, upstream, and downstream); and (iv) a binary classification problem to differentiate wastewater samples (considering both influent and effluent) and receiving water samples (considering both upstream and downstream of the disposals).
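The four steps differ mainly in how the sampling-site label is coded as the response. The sketch below shows one possible coding, assuming a single 0/1 column for the binary problems and one indicator column per class for the multi-class problem. It is written in Python rather than the R used in the study, and all function and variable names are hypothetical.

```python
SITES = ["influent", "effluent", "upstream", "downstream"]

def binary_response(sites, positive, negative):
    """Steps (i) and (ii): keep only two classes and code them as 1 / 0.

    Returns the indices of the retained samples and their 0/1 labels.
    """
    indices, y = [], []
    for i, site in enumerate(sites):
        if site in (positive, negative):
            indices.append(i)
            y.append(1 if site == positive else 0)
    return indices, y

def dummy_matrix(sites, classes=SITES):
    """Step (iii): one indicator column per sampling site."""
    return [[1 if site == c else 0 for c in classes] for site in sites]

def grouped_response(sites):
    """Step (iv): wastewater (influent or effluent) = 1, receiving water = 0."""
    return [1 if site in ("influent", "effluent") else 0 for site in sites]

# Illustrative labels for five samples.
sample_sites = ["influent", "effluent", "upstream", "downstream", "effluent"]
step_i = binary_response(sample_sites, "influent", "effluent")
step_ii = binary_response(sample_sites, "upstream", "downstream")
step_iii = dummy_matrix(sample_sites)
step_iv = grouped_response(sample_sites)
```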
Data were autoscaled prior to model construction. In PLS-DA modeling, it is necessary to determine the number of PLS components to be retained during model formation (Lee et al. 2018). The optimal number of components selected for each step was the one that resulted in the minimum classification error rate (Fordellone et al. 2019).
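Autoscaling and the selection rule can be sketched as follows. Autoscaling is taken here to mean centering each variable to zero mean and scaling it to unit sample standard deviation. The rule for breaking ties towards fewer components is an assumption, and the error rates in the example are invented for illustration. The sketch uses Python rather than the R used in the study, and the function names are hypothetical.

```python
from statistics import mean, stdev

def autoscale(rows):
    """Center each column to mean 0 and scale it to unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    if any(sd == 0 for sd in sds):
        raise ValueError("a constant column cannot be autoscaled")
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def select_n_components(error_by_ncomp):
    """Return the number of components with the lowest classification error rate.

    Ties are broken in favour of fewer components (an assumption).
    """
    return min(sorted(error_by_ncomp), key=lambda k: error_by_ncomp[k])

# Illustrative data: three samples, two variables.
scaled = autoscale([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
best = select_n_components({1: 0.20, 2: 0.12, 3: 0.08, 4: 0.10})
```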
The variable importance on projection (VIP), a measure of the importance of the variables in the classification model, was calculated. Water quality indicators with VIP scores greater than 1 were considered to have significant discriminative power in the classification model (Yotova et al. 2019; Mihaylova et al. 2022).
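For reference, a commonly used definition of the VIP score of variable $j$ is given below. The notation is introduced here for illustration and does not appear in the original text, and the exact formula implemented in a given software package may differ in detail.

```latex
\mathrm{VIP}_j = \sqrt{\, p \,
  \frac{\sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^{2}}
       {\sum_{a=1}^{A} SS_a} }
```

Here $p$ is the number of water quality variables, $A$ is the number of retained PLS components, $w_{ja}$ is the weight of variable $j$ on component $a$, and $SS_a$ is the sum of squares of the response explained by component $a$. Under this definition the squared VIP scores average to one across the $p$ variables, which is why a threshold of 1 is typically used to flag variables of above-average importance.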
The receiver operating characteristic (ROC) curve plots the sensitivity against 1-specificity of the model for different values of the discrimination threshold. The area under the curve (AUC), a commonly used measure of a classifier's discriminative ability (Rohart et al. 2017), is employed as the main figure of merit for the obtained PLS-DA models (Yotova et al. 2019; Mihaylova et al. 2022) and was calculated for each model.
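The AUC can be illustrated with the following minimal sketch, which uses the equivalent pairwise formulation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. It is written in Python rather than the R used in the study, and the function name and example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from classifier scores and 0/1 labels.

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Illustrative scores for four samples.
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to a classifier that ranks samples no better than chance, and an AUC of 1 corresponds to perfect separation of the two classes.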
All analyses were performed using the R programming language (R Core Team 2021). The ‘PackagePLSDA’ package (Ipopa et al. 2018) was used for model construction and to obtain the confusion matrix, classification error rate, and VIP values. The ‘mixOmics’ package (Rohart et al. 2017) was used for the biplot, ROC curve, and AUC results.
RESULTS AND DISCUSSION
Preliminary statistical analysis
In Brazil, the National Environment Council (CONAMA) Resolution n. 430/2011 establishes conditions for the disposal of effluents into water bodies at the national level. Of the parameters assessed in the study (BOD, COD, chlorides, NH4, P, pH, turbidity, and thermotolerant coliforms), only BOD and pH have legislated standards for sanitary sewage discharge. The requirements of CONAMA 430/2011 for municipal wastewater discharge are met for most of the samples (Table 1). However, it is important to highlight that this legislation is lenient compared with international standards.
| Parameter | Standard | Samples not complying (n) |
|---|---|---|
| BOD | Maximum effluent concentration of 120 mg/L, or removal efficiency of at least 60% | 5 |
| pH | Between 5.0 and 9.0 | 0 |
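The two standards in Table 1 can be expressed as a simple check, sketched below in Python. The thresholds are those given in the table. Treating the limits as inclusive is an assumption, and the function names and example concentrations are hypothetical.

```python
def complies_bod(influent_bod, effluent_bod, max_conc=120.0, min_removal=0.60):
    """BOD standard: effluent at or below 120 mg/L, OR removal efficiency of at least 60%."""
    if influent_bod <= 0:
        raise ValueError("influent BOD must be positive")
    removal = (influent_bod - effluent_bod) / influent_bod
    return effluent_bod <= max_conc or removal >= min_removal

def complies_ph(ph, low=5.0, high=9.0):
    """pH standard: between 5.0 and 9.0 (limits assumed inclusive)."""
    return low <= ph <= high

# Illustrative values in mg/L: effluent above 120 mg/L but with 62.5% removal.
ok_by_removal = complies_bod(influent_bod=400.0, effluent_bod=150.0)
```

Because the two BOD criteria are alternatives, an effluent above 120 mg/L still complies when the plant removes at least 60% of the influent BOD.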
Another Brazilian legal requirement is the granting of rights to use water resources. Thus, facilities that discharge effluent into water bodies may have more restrictive standards than 120 mg/L for BOD disposal. However, not all Brazilian WWTPs have a granting concession, in which case this more stringent standard may not be required.
Of the 45 WWTPs in the study, six are required to meet two criteria for their granting concession: (i) a more restrictive standard than the CONAMA Resolution for BOD effluent concentration and (ii) BOD removal efficiency. Four WWTPs did not meet the removal efficiency limit; however, five complied with the maximum effluent concentration. Two WWTPs among the six met both criteria.
Of the 45 WWTPs in the study, 22 are required to meet only the criterion of a more restrictive standard than the CONAMA Resolution for BOD effluent concentration. Among them, 14 complied with the requirement (64%). Half of these 14 WWTPs use UASB reactors without post-treatment, which reflects the high dilution capacity of the region's receiving water bodies. Thus, more restrictive criteria are needed to guarantee better effluent quality and better water resources management for multiple uses. WWTPs that comply only with the (less restrictive) CONAMA Resolution and do not comply with the granting concession criteria may generate conflicts over the use of water resources in the watersheds (da Silva et al. 2015).
The optimal number of components was determined to be three, resulting in a classification error rate of 7.8%. The misclassified samples were influent samples classified as effluent, indicating more diluted sewage in these samples. These occurred in four WWTPs with a UASB reactor, two with a UASB reactor followed by a trickling filter and secondary sedimentation tank, and one with a UASB reactor followed by a maturation pond.
The variables with significant discriminative power (VIP > 1) were BOD, COD, turbidity, and thermotolerant coliforms. These variables were significantly different in the Wilcoxon test when comparing the influent and effluent values (Figure 3). The high AUC value (0.9165) indicates the excellent accuracy of the classification model (Zhu et al. 2010), which was expected considering the removal of contaminants by the biological process. Mihaylova et al. (2022) also found excellent model performance while using the PLS1-DA classification model to discriminate the samples assigned to untreated wastewaters (11 samples) and treated wastewaters (11 samples) classes.
The majority (72%) of misclassified samples are downstream samples classified as upstream. This result indicates that the discharged treated wastewater did not substantially affect the surface water quality in the respective receiving water bodies in the one-sampling campaign (Yotova et al. 2019).
Yotova et al. (2019) found good accuracy (AUC value of 0.81) for discriminating between upstream and downstream samples of 21 Bulgarian WWTPs disposals. According to the authors, the results represent the impact of WWTPs on water quality. Significant water quality parameters for the classification model were pH, Cl, Mn, Zn, and Se (VIP > 1).
Khatoonabadi et al. (2021) assessed a river in Germany at points upstream and downstream of a WWTP disposal. The authors applied PLS-DA to discriminate river water samples and observed a clear differentiation between the upstream and downstream classes, with model error rate and accuracy equal to 0.06 and 0.94, respectively. Of a total of 148 features (organic pollutants), 66 had VIP values greater than one.
Yotova et al. (2019) and Khatoonabadi et al. (2021) showed the importance of including more water quality parameters in the classification model, such as heavy metals and contaminants of emerging concern. These variables are not included in the Brazilian WWTPs monitoring programs. The inclusion of these critical variables in the analysis could demonstrate the impact of discharges into water bodies more accurately.
The results provide an overview of the orders of magnitude of the dilution factors of the WWTPs under study (Figure 7). According to the ANA (2017), 60 out of 79 municipalities located in Mato Grosso do Sul State have a high effluent dilution capacity, with receiving water bodies presenting at least 2,000 L/inhabitant.day, considering the urban population of the municipalities.
During dry weather conditions, the dilution of treated wastewater was found to be far less than 10 in receiving waters downstream of more than 100 Swiss WWTPs (Ort & Siegrist 2009). Rice & Westerhoff (2017) calculated the dilution factor for receiving streams in the United States. During low-flow conditions (Q95%), the authors found dilution factor results for the 25th, 50th, and 75th percentiles of 2, 14, and 134, respectively. All of these statistics were higher in Mato Grosso do Sul State.
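The text does not state how the dilution factor is computed. The sketch below assumes the simple ratio of receiving-water flow to effluent flow. Some authors instead use the ratio of the combined flow to the effluent flow, which adds one to the value. The function name and example flows are hypothetical.

```python
def dilution_factor(river_flow_l_s, effluent_flow_l_s):
    """Dilution factor as river flow divided by effluent flow (assumed definition).

    Both flows must be in the same units, here L/s.
    """
    if effluent_flow_l_s <= 0:
        raise ValueError("effluent flow must be positive")
    return river_flow_l_s / effluent_flow_l_s

# Illustrative flows: a 100 L/s discharge into a river carrying 1,400 L/s.
df = dilution_factor(1400.0, 100.0)
```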
Downstream samples classified as upstream represent 65% of the model's misclassified samples. This result of the multi-class model agrees with those of the previous binary classification models, wherein the classification model discriminating influent and effluent classes exhibited excellent accuracy, and the model discriminating between upstream and downstream classes exhibited poor accuracy.
The high rate of downstream samples misclassified as upstream samples is probably related to the high dilution capacity of the receiving waters in Mato Grosso do Sul State. López et al. (2019) assessed the contribution of six Spanish WWTPs to receiving water bodies in terms of microbiological parameters. The bacterial concentration was not significantly different between the upstream and downstream points of the treated wastewater discharge. According to the authors, the result may be a consequence of the low percentage of the flow rate discharge compared with the river flow rate, which in all six cases was < 1%. The discharged effluents were rapidly diluted in the rivers, and the treated wastewater did not significantly influence the bacterial quality of the receiving bodies.
Yotova et al. (2019) found that their model discriminated WWTP final effluents from surface waters very well, with 93.65% of predictions being accurate. Among the significant water quality indicators (VIP > 1) found by the authors, some (electrical conductivity, P, N, Cl, and Zn) had higher values in the effluents, whereas others (TSS and Fe) were more concentrated in the surface waters. In this study, all variables with significant discriminative power (VIP > 1) had higher values in raw and treated wastewater (Figure 3) than in the upstream and downstream samples (Figure 4).
According to Yotova et al. (2019), the misclassification of water samples as wastewater samples could be attributed to an unauthorized discharge in the corresponding river areas. In the current model for WWTPs in Mato Grosso do Sul, all misclassified samples were wastewater classified as surface water.
BOD and COD were the only variables with VIP > 1 in all four PLS-DA models, and P had a VIP > 1 in three of the four models. The pH had the lowest VIP in all four models and had no significant discriminative power in any of the achieved classification models. There was little variability in the pH data (Figures 3 and 4). Dantas et al. (2021) found that the pH values of WWTPs effluents were generally close to neutrality and found no significant difference between upstream and downstream pH values in the study's receiving water bodies.
In this study, monitoring data from 45 full-scale WWTPs were assessed by considering the four sampling sites (influent, effluent, upstream, and downstream of the disposals). There was a significant difference between influent and effluent values of BOD, COD, chlorides, turbidity, and thermotolerant coliforms; and between upstream and downstream values of BOD, P, and thermotolerant coliforms (p < 0.05). These results indicate the absence of tertiary treatment in WWTPs.
The PLS-DA model developed to discriminate influent and effluent classes showed excellent accuracy owing to contaminant removal in the biological process. Conversely, the binary model employed to discriminate upstream and downstream classes had poor accuracy. These results are attributed to the high dilution capacities of receiving water bodies in Mato Grosso do Sul. The multi-class model developed to categorize all four classes and the binary model to discriminate wastewater and surface water samples corroborate the results of the previous models and highlight the high dilution factor of the watercourses being studied. This methodology could also be used in receiving water bodies of water-limited regions with lower dilution capacities, enabling assessment of the impact of disposal in regions with different characteristics.
The study adopted the proposed PLS-DA method to broadly assess tropical Brazilian WWTPs, whose receiving water bodies have high dilution capacities, and to discriminate samples according to their monitoring sites. More data must be collected and included in the statistical analysis to evaluate the performance of each WWTP and the impact on water quality in greater depth. We also recommend assessing the method's application to more complete databases, including contaminants of emerging concern, the macroinvertebrate community, and ecotoxicological endpoints. These critical variables must be included in the monitoring programs of the WWTPs and will allow better detection of the effects of WWTP effluent on receiving water bodies.
The authors would like to thank Marjuli Morishigue and the sanitation service provider Aegea Saneamento e Participações S.A. and Ambiental MS Pantanal SPE S.A. for providing the monitoring data, and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for their financial support during the course of the research.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.