When environmental disasters affect water resources, deploying an emergency monitoring network to assess water quality is among the first measures to be taken. Emergency networks usually cover a large set of water quality variables and monitoring stations along the watershed. Focusing on variables that pose a greater risk to the environment and have a less predictable spatial and temporal distribution is a strategy to optimize monitoring efforts. The goal of this study is to assess the use of Shannon's entropy to identify non-critical water quality variables in an emergency monitoring network implemented in a watershed impacted by the collapse of an iron ore tailings dam, the Doce River watershed (Brazil). Monitoring stations were grouped into water quality subregions through cluster analysis, and Shannon's entropy was used to estimate the information redundancy of the monitored variables. Non-critical water quality variables were identified from their information redundancy, after checking for compliance with environmental regulations. Results indicated that non-critical variables represent 32–50% of the variables monitored. Emergency network managers can find in this method a robust tool to improve network performance. However, special attention should be paid to the presence of outliers, which can bias analyses based on Shannon's entropy.

  • Water quality variable selection criteria based on Shannon's entropy have not been explored in emergency monitoring networks.

  • The selection criterion based on the concept of ‘non-critical variables’ is appropriate for water quality variables.

  • The set of selected non-critical variables was substantial and demonstrates the potential of the method to guide adjustments in emergency networks.

  • Metals and metalloids are predominant in terms of the chemical nature of the main non-critical variables in the Doce River watershed.

Water is one of the main natural resources affected by disasters involving environmental impacts, such as failures of mine tailings dams (da Cunha Richard et al. 2020; Baudson et al. 2021), leakage or spillage of toxic and harmful substances (Raven & Georg 1989; Pedrosa 2007; Hou 2012), gas pipe drilling (Graham & Wilcox 2021), and spills of ash (Deonarine et al. 2013), oil (He et al. 2023), vinasse (da Silva et al. 2022), radioactive substances (Koo et al. 2014) and other hazardous residues from anthropogenic activities (e.g. Guo & Duan 2021; Cacciuttolo & Cano 2022). The COVID-19 pandemic has shown that health disasters can also have environmental implications and influence water consumption (Alvisi et al. 2021; Berglund et al. 2022), wastewater characteristics (Sharif et al. 2021; de Araújo et al. 2022) and water quality (Jerez et al. 2023; Lian et al. 2023).

Disasters related to water pollution are often sudden, difficult to predict and lead to serious environmental, social and economic impacts (Hou et al. 2014). Regarding space-time behavior, the impacts resulting from environmental disasters can range from short to long-term and can be localized or cover large areas (Semenova 2020; Gabriel et al. 2021).

Environmental disasters can directly impact water resources and aquatic ecosystems. Changes in water quality, alterations in the morphology of riverbed sediment, and the bioavailability of pollutants previously immobilized in sediment can occur (Miller et al. 2023). Regarding biological aspects, there may be mortality of fish and other aquatic organisms, habitat destruction, mutagenicity, bioaccumulation of pollutants, carcinogenicity, and genotoxicity (Deonarine et al. 2013; Graham & Wilcox 2021; da Silva et al. 2022; Lusweti et al. 2022).

Additionally, depending on the characteristics of pollutants released by disasters, there may be persistence in the environment and a risk to human health (Deonarine et al. 2013; Graham & Wilcox 2021; Lusweti et al. 2022).

The occurrence of such events requires quick actions from the stakeholders aiming at mitigating impacts and, when possible, avoiding additional losses (Tang et al. 2019; He et al. 2023). Environmental monitoring data, especially water quality data, will play an important role, both for short-term responses such as helping to predict space–time dispersion of pollutants (Ding & Fang 2019; Pereira et al. 2021), and for medium and long-term responses as for designing and monitoring recovery strategies.

Emergency networks for water quality monitoring usually differ from pre-existing regulatory networks. First, disaster impacts on aquatic ecosystems are, as a rule, little known because they may result in complex mechanisms involving physicochemical and biological interactions which may have synergistic impacts on aquatic biota and be amplified over the trophic chain (Brinkmann & Rowan 2018; Zorzal-almeida & Fernandes 2021). Consequently, emergency monitoring networks cover a broader spectrum of pollutants (some unusual in regulatory networks but directly related to disasters), in an attempt to encompass all likely impacts on water uses and aquatic ecosystem.

Additionally, emergency monitoring networks have a greater number of monitoring stations aiming at producing a great volume of spatially distributed information; however, the length of monitored time series is generally short (UNEP 2005; Shi et al. 2018; Jing et al. 2019; Manley et al. 2020; Mendes et al. 2022; Oehrig et al. 2023; Pacheco et al. 2023; Wild et al. 2023). Finally, data analysis from emergency monitoring networks is a challenging task because many statistical methods make assumptions regarding data distribution and/or require many observations (N) compared to the number of variables, known in the literature as ‘small N large P problem’ (Mirauda & Ostoich 2020).

Studies on the emergency network for water quality monitoring are scarce in the literature, but water quality monitoring networks for regulatory purposes have been the object of numerous studies in the past decades (Karamouz et al. 2009) which aimed to identify the best scenarios regarding the monitored variables (Khalil et al. 2010; Barcellos & Souza 2022), monitoring frequency (Do et al. 2013; da Luz et al. 2022), number of samples and location of monitoring stations (Nguyen et al. 2020; Reina-García et al. 2020; de Almeida et al. 2022). The large number of water quality variables leads to the need for strategies for optimizing monitoring networks (Calazans et al. 2018a); however, most of the studies have focused on other aspects of monitoring network design.

In a review study, Nguyen et al. (2019) found that only 14 of 311 research studies investigated strategies for the selection of critical water quality variables. Identifying non-critical variables allows distinct monitoring strategies to be proposed for them, such as a reduction in monitoring frequency aimed at lowering costs without loss of relevant information.

As for the methods used for assessing water quality variables relevance in monitoring programs, there has been a predominance of statistical methods, with emphasis on principal components analysis (Calazans et al. 2018b), correlation-regression (Khalil et al. 2010) and discriminant analysis (Wang et al. 2014). Alternative approaches are considered promising, especially those based on information entropy (Shannon 1948).

Shannon's entropy, also known as information entropy, measures uncertainty about random processes and has been applied to several water resources engineering problems (e.g. Baran et al. 2017; Xiong et al. 2018; Liuzzo et al. 2019; Wang et al. 2021). Recent research has highlighted that information entropy remains a valuable tool for hydrological studies (Yazdi 2018; Singh et al. 2019; Mirauda & Ostoich 2020; Ursulak & Coulibaly 2021). In absolute numbers, however, such studies are still few and much of the method's potential remains unexplored (Singh et al. 2019). By 2023, only nine studies using information entropy for the selection of water quality variables had been published (Jiang et al. 2020a; Barbaros 2022). Our literature review revealed that none of them addressed water quality variables within the context of emergency networks. On the other hand, Shannon's entropy is widely employed in research concerning the design, redesign and optimization of hydrological monitoring networks (Keum et al. 2017; Nguyen et al. 2019; Jiang et al. 2020a).

An optimized monitoring of water quality should focus on variables that pose a greater risk to the environment and present a less predictable spatial and time distribution. In this sense, Shannon's entropy can be a useful tool by treating water quality parameters as random variables and allowing quantifying their information content (uncertainty). A more predictable water quality variable is a candidate for lower-frequency monitoring. In this context, the goal of this study is to assess the use of Shannon entropy to identify non-critical water quality variables in the context of emergency monitoring networks implemented in watersheds impacted by environmental disasters.

Background on information entropy

Information theory deals with the quantification of the information content in a random variable and is based on the entropy concept developed by Shannon (1948). Information entropy, also known as entropy or marginal entropy, is a measurement of information or uncertainty. Given an event that can be described by a discrete random variable X whose occurrence probability is $p(x_j)$, with $x_j$ the different outcomes of X, the entropy $H(X)$, besides being a measure of uncertainty about the occurrence of $x_j$, also assesses the information content of this event. Events with a high occurrence probability need less information to be characterized; on the other hand, the less likely an event is, the greater the information needed to characterize it or the greater the information content it produces (Mirauda & Ostoich 2020). In the case of discrete variables, marginal entropy is mathematically described in the following equation:
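$$H(X) = -\sum_{j=1}^{n} p(x_j)\,\log_2 p(x_j) \tag{1}$$
where j is a state of a discrete random variable X, n is the number of states, $x_j$ is the result corresponding to the state j, and $p(x_j)$ is the probability of occurrence of the result $x_j$. With the base-2 logarithm, $H(X)$ is expressed in bits.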
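As an illustration of Equation (1), the following minimal Python sketch estimates marginal entropy in bits from the relative frequencies of discrete states. The function name `marginal_entropy` is hypothetical, and the use of empirical frequencies as probability estimates is an assumption of the sketch; the study itself reports using R packages.

```python
import math
from collections import Counter


def marginal_entropy(states):
    """Shannon entropy in bits of a sequence of discrete states.

    Probabilities are estimated by relative frequency (assumption).
    """
    n = len(states)
    if n == 0:
        raise ValueError("cannot compute entropy of an empty series")
    counts = Counter(states)
    # p * log2(1/p) is the contribution of each observed state
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Two equally frequent states carry one bit; a constant series carries none.
print(marginal_entropy([1, 1, 2, 2]))
print(marginal_entropy([5, 5, 5]))
```

A series that collapses to a single state after discretization has zero entropy, which is the situation described later for variables with very small variance.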

Marginal entropy can be applied to both discrete and continuous data, measured at different time scales (annual, monthly, seasonal or others), to assess the uncertainty associated with the dataset (Shannon 1948; Singh et al. 2019). Maximum entropy is the entropy value obtained from an appropriately chosen probability distribution that maximizes $H(X)$ given the existing constraints. In the case of discrete variables with a defined domain [a,b] (finite interval) and no restrictions on moments, the uniform distribution maximizes entropy (Singh 2013; Singh et al. 2019). When formulating the maximum entropy solution, the estimates improve if physical principles are considered, which helps to eliminate physically inconsistent solutions (Perdigão et al. 2020).

Relative entropy (Equation (2)) is the ratio between marginal entropy and maximum entropy for a given random variable, and lies within the interval [0,1] (Shannon 1948):
$$H_R(X) = \frac{H(X)}{H_{max}(X)} \tag{2}$$
The information redundancy R (Equation (3)) of each variable can be calculated from the relative entropy, as proposed by Singh (2013):
$$R = 1 - H_R(X) \tag{3}$$

Information redundancy assumes values in the range [0,1]. Variables with redundancy close to 0 exhibit a high degree of uncertainty, making them highly informative. Conversely, variables with high predictability, indicated by information redundancy approaching the value of 1, are less informative (Shannon & Weaver 1949; Singh 2013).
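The relation between relative entropy and redundancy can be sketched as follows, taking redundancy as one minus the ratio of marginal to maximum entropy. The function name is hypothetical. Two choices are assumptions of this sketch rather than statements from the text: a variable whose maximum entropy is zero is treated as fully redundant, and the ratio is clipped to [0,1] in case an estimated maximum falls below the observed marginal entropy.

```python
def information_redundancy(h, h_max):
    """Redundancy R = 1 - H / H_max, kept within [0, 1]."""
    if h_max <= 0:
        # Assumption: a single possible state means full predictability.
        return 1.0
    relative = min(max(h / h_max, 0.0), 1.0)  # clipping is an assumption
    return 1.0 - relative


# A variable using a quarter of its maximum entropy is 75% redundant.
print(information_redundancy(1.0, 4.0))
```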

Study area

The study area is the Doce River watershed (Figure 1), located in the southeastern region of Brazil. Its drainage area covers approximately 83,400 km², of which 86% is part of the State of Minas Gerais and the remaining 14% is part of the State of Espírito Santo. The Doce River length is approximately 853 km, and it is formed from the confluence of Carmo and Piranga Rivers (Santolin et al. 2015).
Figure 1

Geographic location of the Doce River watershed with a zoomed frame on to the region where the Fundão dam was located. Location of monitoring stations of PMQQS program.


The regional climate is characterized by a humid tropical type, exhibiting clearly defined seasonality. The wet season in the watershed contributes to 85% of the annual precipitation and spans from October to March, while the dry season persists from April to September. The annual rainfall ranges from 900 to 1,500 mm and the air temperature exceeds 18 °C (Kütter et al. 2023).

Doce River historically records high values of thermotolerant coliforms, turbidity, and total phosphorus. The presence of some metals above the permitted levels, such as dissolved iron, total manganese, and total lead, was highlighted in the watershed's water resources plan, occurring at several locations and indicating the impact of industrial and agricultural activities (ANA 2016).

Mining is one of the main economic activities in the watershed, especially in the upper Doce River region, and large mining projects have been conducted in this region for decades (Espindola et al. 2017). On 5 November 2015, at the Germano mining complex, in the municipality of Mariana (MG), the Fundão dam collapsed. The reservoir held approximately 50 million m³ of iron mining tailings. The released tailings reached the Santarém reservoir, causing its overtopping and forcing the wave along 55 km of the Gualaxo do Norte River until it flowed into the Carmo River. From there, the plume traveled 22 km until it reached the Doce River, where the small hydroelectric reservoir Risoleta Neves retained part of the tailings (Figure A1 – Supplementary material). From there, it continued to the watershed outlet, where the tailings were released into the Atlantic Ocean on 21 November 2015, totaling 663.2 km of directly impacted water bodies (IBAMA 2015).

Along its trajectory, the tailings caused numerous biophysical impacts on the river system and the coastal zone, affecting the channel and the banks of the Doce River, impairing the water quality, and making it unfit for human and animal consumption and for aquatic biota. In addition, water supply was suspended to 12 cities served directly by the Doce River, affecting an estimated population of 424,000 people (IBAMA 2015; Sánchez et al. 2018).

Data collection and treatment

The methodological steps of this research are shown in the diagram presented in Figure 2.
Figure 2

Methodological steps for the selection of non-critical water quality variables.


This research uses data from the Systematic Quali-quantitative Water and Sediment Monitoring Program (PMQQS), operated by the Renova Foundation in the Doce River watershed (RENOVA 2017). The PMQQS program started on 31 July 2017, and is permanent. The network currently operates 92 stations distributed among coastal and estuarine zones, the Doce River, its lagoons and its tributaries. This study focused on lotic water bodies, comprising a total of 39 monitoring stations (Figure 1) (Table A1 – Supplementary material).

For this study, variables that had a percentage of invalidated data greater than 30% were discarded. Likewise, variables with the percentage of missing values above 30% were discarded. Out of the 90 water quality variables (refer to Table A2 in Supplementary material), 78 were chosen for analysis (see Table 1). Data below the limit of quantification or outside the limits of analytical detection (censored data) were replaced by the respective limits of detection. Considering all non-missing records as valid data, a total of 107,585 values from monthly monitoring conducted between August 2017 and December 2020 were used in this research.
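The screening rules described above can be sketched in Python as follows. The data layout is hypothetical (missing records as `None`, censored records flagged with a boolean), as are the function names; only the 30% cut-off and the substitution by the detection limit come from the text.

```python
def missing_fraction(values):
    """Share of missing records (None) in a series."""
    return sum(v is None for v in values) / len(values)


def keep_variable(values, max_fraction=0.30):
    """A variable is discarded when more than 30% of its records are missing."""
    return missing_fraction(values) <= max_fraction


def substitute_censored(records, detection_limit):
    """Replace censored records by the detection limit.

    records: list of (value, is_censored) pairs (hypothetical layout).
    """
    return [detection_limit if censored else value
            for value, censored in records]


series = [0.4, None, 0.6, 0.5, 0.7, 0.4, 0.5, 0.6, 0.5, 0.4]
print(keep_variable(series))
print(substitute_censored([(0.02, False), (None, True)], detection_limit=0.005))
```

The same `keep_variable` test could be applied to the share of invalidated records, which the text treats with the same 30% limit.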

Table 1

Set of water quality variables from the PMQQS network selected for this research

Water quality variables – PMQQS network
Total alkalinity | Total copper | Ammoniacal nitrogen
Dissolved aluminum | In situ conductivity | Total Kjeldahl nitrogen
Total aluminum | True color | Organic nitrogen
Dissolved antimony | Dissolved chromium | In situ dissolved oxygen
Total antimony | Total chromium | In situ saturated dissolved oxygen
Dissolved arsenic | BOD | Polyphosphate
Total arsenic | Total hardness | In situ redox potential
Dissolved barium | Escherichia coli | Dissolved silver
Total barium | Pheophytin | Total silver
Dissolved beryllium | Iron II | Dissolved selenium
Total beryllium | Iron III | Total selenium
Dissolved boron | Dissolved iron | Total sodium
Total boron | Total iron | Total dissolved solids
Dissolved cadmium | Dissolved phosphorus | Sedimentable solids
Total cadmium | Total phosphorus | Total suspended solids
Total calcium | Total magnesium | Total solids
Dissolved organic carbon | Dissolved manganese | Sulfides as undissociated H2S
Total organic carbon | Total manganese | Total sulfides
Dissolved lead | Dissolved mercury | In situ sample temperature
Total lead | Total mercury | In situ turbidity
Free cyanide | Dissolved molybdenum | Dissolved vanadium
Total chloride | Total molybdenum | Total vanadium
Chlorophyll a | Dissolved nickel | Dissolved zinc
Dissolved cobalt | Total nickel | Total zinc
Total cobalt | Nitrate | In situ pH
Dissolved copper | Nitrite | Laboratory pH

It was also necessary to identify and remove outliers because Shannon's entropy, and particularly the maximum entropy, is sensitive to them (Nooghabi & Nooghabi 2016). Several outlier detection algorithms were tested and the Gini method (Nair 1936) gave the best estimates.

Setting water quality spatial and temporal boundaries

The Doce River watershed was divided into homogeneous water quality subregions by applying cluster analysis to identify groups of monitoring stations whose data showed strong similarity. This analysis was conducted with the aim of defining subregions with similar characteristics, as large watersheds may encompass different hydrological and environmental contexts within them (Robertson et al. 2006; Versiani et al. 2009; Baldan et al. 2022).

Cluster analysis is a multivariate statistical method usually used in exploratory analyses to identify patterns in datasets and to gather similar observations into groups (Kettenring 2006). In this study, Ward's method (Ward 1963) was adopted, which is an agglomerative hierarchical technique, and the Euclidean distance was taken as a measurement of dissimilarity. Ward's method forms quite homogeneous groups with minimal internal variance (Szekely & Rizzo 2005) and is extensively applied in water quality studies (Azhar et al. 2015; Lobo et al. 2015; Hajigholizadeh & Melesse 2017; Kändler et al. 2017; Pinto et al. 2018; Li et al. 2019; da Silva 2020; Jiang et al. 2020b).

Data were standardized, as suggested by Härdle & Simar (2015), to eliminate distortions arising from the different measurement scales of water quality variables. As a criterion for determining the ideal number of groups, the analysis of the fusion behavior in the dendrogram (Hair et al. 2006) was combined with knowledge about the spatial distribution of stations.
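To make the clustering step concrete, the sketch below standardizes station profiles and applies a naive agglomerative procedure with Ward's criterion, merging at each step the pair of groups whose fusion least increases the within-group sum of squares. It is a simplified illustration under stated assumptions, not the study's R implementation: the number of groups `k` is passed directly instead of being read from the dendrogram, the station profiles are hypothetical, and every variable is assumed to have non-zero variance.

```python
import statistics


def standardize(rows):
    """Z-score each column (assumes non-zero variance in every column)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def ward_clusters(rows, k):
    """Merge groups until k remain, using Ward's minimum-variance criterion."""
    members = {i: [i] for i in range(len(rows))}
    centroids = {i: list(r) for i, r in enumerate(rows)}
    while len(members) > k:
        best = None
        ids = list(members)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                na, nb = len(members[a]), len(members[b])
                d2 = sum((x - y) ** 2
                         for x, y in zip(centroids[a], centroids[b]))
                cost = na * nb / (na + nb) * d2  # increase in within-group SS
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        na, nb = len(members[a]), len(members[b])
        centroids[a] = [(na * x + nb * y) / (na + nb)
                        for x, y in zip(centroids[a], centroids[b])]
        members[a] += members.pop(b)
        del centroids[b]
    return sorted(sorted(group) for group in members.values())


# Four hypothetical stations described by two variables on different scales.
stations = [[0.0, 10.0], [0.1, 11.0], [5.0, 900.0], [5.1, 910.0]]
print(ward_clusters(standardize(stations), k=2))
```

Standardizing first keeps the variable with the largest numerical range from dominating the Euclidean distances.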

To assess the temporal variability of water quality, data were also classified according to the period of the year in the wet (October to March) or dry (April to September) season.
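The seasonal classification can be written as a small helper. The function name and the string labels are hypothetical; the month ranges are those stated above.

```python
def season(month):
    """Classify a calendar month (1-12) as 'wet' (Oct-Mar) or 'dry' (Apr-Sep)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return "wet" if month >= 10 or month <= 3 else "dry"


print([season(m) for m in (1, 4, 9, 10)])
```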

Computing information redundancy of water quality variables

For every water quality variable observed in each monitoring station, redundancy was obtained from treated time series through (1) discretizing observed data; (2) calculating marginal entropies according to Equation (1); (3) calculating maximum entropy and; (4) obtaining relative entropy (Equation (2)) and redundancy (Equation (3)).

The procedure applied in this work for discretizing water quality variables is described in Section 2.5. Discretizing random variables was necessary because it is particularly difficult to determine the marginal entropy of continuous variables. The use of discrete distributions, even in monitoring networks where continuous variables prevail, has become increasingly frequent (Keum & Coulibaly 2017a; Mirauda & Ostoich 2020). Maximum entropy was obtained from uniform distribution and using Monte Carlo Simulation (MCS) as explained in Section 2.5.1.

An exploratory analysis was performed to assess marginal and maximum entropy and information redundancy considering each water quality subregion and data seasonality.

Quantization of continuous random variables

The quantization of variables has been extensively explored, with results often considered satisfactory (Alfonso et al. 2013; Keum & Coulibaly 2017b; de Pádua et al. 2019; Foroozand & Weijs 2021). The quantization proposed by Alfonso et al. (2013) uses the mathematical floor function to transform a set of continuous values into a discrete set, mapping a value x to an integer multiple, $x_q$, of a constant a (Equation (4)).
$$x_q = a \left\lfloor \frac{2x + a}{2a} \right\rfloor \tag{4}$$
where $x_q$ is the quantized discrete value of the variable x, $\lfloor \cdot \rfloor$ is the mathematical floor function, and a is the bin size.
The mathematical floor function does not require a parametric distribution and allows considering physical aspects for setting a value for a (Mirauda & Ostoich 2020). However, determining an appropriate value for a is a complex task and in this study the criterion of Scott (1979) was applied (Equation (5)), supported by suggestions and satisfactory results obtained in previous works (e.g. Singh 2013; Mirauda & Ostoich 2020):
$$a = 3.49\,\sigma\,N^{-1/3} \tag{5}$$
where a is the bin size, $\sigma$ is the standard deviation of the random variable X, and N is the dataset size of the random variable X.
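A minimal sketch of the floor-function quantization combined with Scott's bin size is given below. It assumes the quantization has the form x_q = a·floor((2x + a)/(2a)), as commonly attributed to Alfonso et al. (2013), and that the standard deviation in Scott's rule is the sample standard deviation; the function names are hypothetical.

```python
import math
import statistics


def scott_bin_size(values):
    """Scott (1979) bin size: a = 3.49 * sigma * N^(-1/3)."""
    n = len(values)
    return 3.49 * statistics.stdev(values) * n ** (-1.0 / 3.0)


def quantize_floor(x, a):
    """Map x to an integer multiple of the bin size a (assumed form)."""
    return a * math.floor((2 * x + a) / (2 * a))


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
a = scott_bin_size(data)
print(a, [quantize_floor(x, a) for x in data])
```

Larger bin sizes merge more observations into the same state and therefore lower the marginal entropy, which is why the choice of a matters.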

In the case of the PMQQS program, many water quality variables concern trace elements and present censored values. Using Equation (4) as proposed by Alfonso et al. (2013) would distort the probability density function (PDF), leading to a loss of variability and reducing the number of discrete states, which strongly affects marginal entropy values (Çengel 2021). To overcome this limitation, we proposed to approximate the quantized value of each observation to a number of decimals one unit lower than the sensitivity level of the variable's analytical method. For example, if the variable had a measurement precision represented by the fourth decimal digit, the third decimal digit was taken as the precision of the quantized variable. This ensured that relatively close values represented the same discrete state and that remarkably different values became distinct states of the new discrete variable after quantization.
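The precision-based quantization can be illustrated as below. The text does not say whether values were rounded or truncated to the coarser precision, so rounding is an assumption of this sketch, and the function name and sample values are hypothetical.

```python
def quantize_by_precision(x, analytical_decimals):
    """Keep one decimal fewer than the analytical method reports.

    Rounding (rather than truncation) is an assumption of this sketch.
    """
    return round(x, analytical_decimals - 1)


# Values reported to four decimals are represented with three.
samples = [0.0121, 0.0124, 0.0190]
print([quantize_by_precision(v, 4) for v in samples])
```

In this example the first two values share one discrete state, while the third becomes a separate state.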

Maximum entropy using MCS

To obtain the maximum entropy of each variable, the uniform distribution was considered as the one that maximizes the marginal entropy. The uniform distribution was selected because (i) the variables used to determine marginal entropy had been previously discretized through a quantization method, (ii) we adopted a defined domain [a,b] based on the data original range, and (iii) no assumption was made on the behavior of data mean and variance, as time series are generally considered to be relatively short for this purpose.

MCS was applied to obtain the maximum entropy through the pseudorandom number generator Mersenne Twister (Matsumoto & Nishimura 1998) which was developed in the late 1990s for use in Monte Carlo simulation. For every water quality variable, an ensemble of 100 different synthetic datasets was drawn from the uniform distribution taking the observed data interval as a domain to preserve the variable characteristics. Thus, considering the number of variables and stations of the PMQQS network investigated in this research, 304,200 simulations were performed.
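As a quick check of this count, and assuming it is the product of the 78 retained variables, the 39 lotic stations and the 100 synthetic datasets drawn per series (the text does not state the breakdown explicitly):

```latex
78 \times 39 \times 100 = 304{,}200
```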

Each synthetic dataset was quantized using the same criteria established for the quantization of the original variables and, later, its marginal entropy was calculated. Each marginal entropy calculated in this step represents one of the many possibilities of maximum entropy of the original variable which was estimated through the following equation:
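$$H_{max}(X) = P_{90}\left[ H\left(X^{s}_{q}\right) \right] \tag{6}$$
where $H_{max}(X)$ is the maximum entropy of the variable X, $H(X^{s}_{q})$ is the marginal entropy of the quantized synthetic variable $X^{s}_{q}$, and $P_{90}$ is the 90th percentile of the marginal entropy of the quantized synthetic variables.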
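The Monte Carlo estimate of maximum entropy can be sketched in Python, whose `random` module also relies on the Mersenne Twister generator. Several details are assumptions of this sketch rather than facts from the text: each synthetic dataset has the same size as the observed series, the 90th percentile is taken by the nearest-rank rule, and the quantizer is passed in as a function. All names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy_bits(states):
    """Shannon entropy in bits from relative frequencies."""
    n = len(states)
    return sum((c / n) * math.log2(n / c) for c in Counter(states).values())


def max_entropy_mcs(observed, quantize, n_sim=100, seed=0):
    """90th percentile of entropies of quantized uniform synthetic datasets."""
    rng = random.Random(seed)  # Mersenne Twister under the hood
    low, high = min(observed), max(observed)
    entropies = []
    for _ in range(n_sim):
        # Assumption: synthetic series has the same length as the observed one
        synthetic = [rng.uniform(low, high) for _ in observed]
        entropies.append(entropy_bits([quantize(v) for v in synthetic]))
    entropies.sort()
    rank = math.ceil(0.90 * len(entropies))  # nearest-rank percentile
    return entropies[rank - 1]


observed = [i * 0.5 for i in range(21)]  # hypothetical series on [0, 10]
print(max_entropy_mcs(observed, quantize=math.floor))
```

Because the synthetic series is finite, the estimate stays at or below the logarithm of the number of observations, so short time series cap the attainable maximum entropy.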

The native stats package of R was used for the Monte Carlo simulations (R Core Team 2021).

Assessing non-critical water quality variables

In this work, variables that present information redundancy R greater than or equal to a determined threshold are classified as low information content variables (LICV). Then, for each water quality subregion established according to cluster analysis, the relative representativeness of LICV ($RR_{LICV}$) was computed through the following equation:
$$RR_{LICV} = \frac{n_{LICV}}{N} \tag{7}$$
where $RR_{LICV}$ denotes the relative representativeness of LICV in each subregion and seasonal period, $n_{LICV}$ denotes the number of stations where the variable was classified as LICV, and N is the total number of stations in the subregion.
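The classification and the share of stations flagged as LICV can be sketched as follows. The function name is hypothetical, the share is returned as a fraction between 0 and 1 (the text does not specify whether a percentage was used), and the 0.80 in the example is the threshold the study reports adopting in its results.

```python
def licv_share(redundancies, threshold):
    """Fraction of stations in a subregion where a variable is LICV.

    redundancies: one redundancy value (0..1) per station for one variable.
    """
    flagged = sum(r >= threshold for r in redundancies)
    return flagged / len(redundancies)


# Hypothetical redundancies of one variable at four stations of a subregion.
print(licv_share([0.90, 0.80, 0.50, 1.00], threshold=0.80))
```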

The threshold for classifying variables as LICV was determined by using informational redundancy of all water quality variables in both seasonal periods across the investigated watershed. The threshold was defined by observing the percentile in which there was a leap in informational redundancy toward 100% or values close to this threshold.

Variables classified as LICV in all monitoring stations of a subregion (maximum relative representativeness) in both seasonal periods were compared against the maximum admissible limits for class 2 fresh waters established by resolution no. 357/2005 of the National Council for the Environment (BRASIL 2005). The variables that exceeded the maximum limits set by the legislation were removed from the group of ‘non-critical variables’. For this consistency step, the LICV series included even the measurements identified as outliers. The LICV that remained within the legislation limits were identified as non-critical variables in the Doce River watershed and could be subjected to a reduction in their monitoring frequency.
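The selection rule can be summarized in a short sketch: a variable is retained as non-critical only if it is LICV at every station of the subregion in both seasons and never exceeds its legal limit. The data layout, variable names and numerical values below are entirely hypothetical, and only upper limits are handled; standards expressed as minimum values or ranges would need a different comparison.

```python
def select_non_critical(summary):
    """Return variables that are LICV everywhere and within the upper limit.

    summary maps a variable name to a dict with the LICV share (0..1) in the
    dry and wet seasons, the maximum observed value (outliers included) and
    the upper legal limit. All keys are hypothetical.
    """
    selected = []
    for name, s in summary.items():
        licv_everywhere = s["share_dry"] == 1.0 and s["share_wet"] == 1.0
        compliant = s["max_observed"] <= s["upper_limit"]
        if licv_everywhere and compliant:
            selected.append(name)
    return selected


example = {
    "var_a": {"share_dry": 1.0, "share_wet": 1.0,
              "max_observed": 0.2, "upper_limit": 0.5},
    "var_b": {"share_dry": 1.0, "share_wet": 1.0,
              "max_observed": 0.9, "upper_limit": 0.5},  # exceeds the limit
    "var_c": {"share_dry": 1.0, "share_wet": 0.8,
              "max_observed": 0.1, "upper_limit": 0.5},  # not LICV everywhere
}
print(select_non_critical(example))
```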

Monitored variables in dissolved form without environmental standards were analysed according to established limits for their total concentration.

Validation

Principal component analysis (PCA) was used to validate non-critical water quality variables selected in the previous step. PCA is a commonly employed technique for handling multivariate data, with the aim of organizing and reducing dimensionality. By linearly combining the original variables, PCA produces a new set of orthogonal variables, referred to as principal components. The components are independent, non-correlated, and the sum of their variances equals that of the original variables (Abdi & Williams 2010; Olsen et al. 2012; Sergeant et al. 2016).

It is expected that, due to their low information content, the non-critical water quality variables will make a small contribution to the first two components, showing that they are not determinant for water quality characteristics in each subregion. The data used in the PCA were the same as described in Section 3.2. However, since PCA does not admit the presence of missing data, it was necessary to eliminate sampling campaigns with missing data. The dataset was standardized to eliminate distortions arising from the different measurement scales of water quality variables. As for the correlation between the variables, one variable from each pair whose Pearson correlation coefficient was greater than 0.8 was eliminated. This procedure aimed at reducing redundant variables, optimizing the set of analysed variables (Hair et al. 2006) and avoiding distortions in the results that could attribute greater importance to multicollinear variables. Outliers were not eliminated, to preserve as much as possible the information contained in the dataset.
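The correlation filter applied before the PCA can be sketched as below. The text does not say which variable of a correlated pair was dropped or whether negative correlations were considered, so keeping the first variable encountered and comparing absolute values are assumptions of this sketch; the names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_collinear(data, limit=0.8):
    """Keep a variable only if |r| <= limit with every variable already kept."""
    kept = []
    for name in data:
        if all(abs(pearson(data[name], data[k])) <= limit for k in kept):
            kept.append(name)
    return kept


data = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],  # nearly proportional to "a"
    "c": [4.0, 1.0, 3.0, 2.0],
}
print(drop_collinear(data))
```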

Water quality spatial boundaries

In the development of all methodological stages, the models and functions were implemented in the R programming language using entropy, FactoMineR and MultivariateAnalysis packages (Le et al. 2008; Hausser & Strimmer 2021; Azevedo 2022).

The cluster analysis using Ward's algorithm resulted in the segmentation of the 39 monitoring stations into 14 main groups, considered water quality subregions in the watershed (Figure 3). The groups were quite heterogeneous in terms of the number of stations, which ranged from one to seven. Subregions were named from A to N, from upstream to downstream in the watershed (Figure 3) (Table A3 and Figure A2, Supplementary material).
Figure 3

Clusters of monitoring stations in the PMQQS program.


Subregion A is formed by one station located in Gualaxo do Norte River, upstream Fundão dam. Subregions B, C and D are formed by monitoring stations in Gualaxo do Norte River, downstream Fundão dam. Subregion E is formed by monitoring stations in Carmo River. Subregions F, H, K and L are formed by monitoring stations in Doce River tributaries (Piranga, Santo Antônio, Piracicaba, Guandu and Caratinga Rivers), while subregions G, J and M are formed by monitoring stations located both in the Doce River and in its tributaries (Piranga, Matipó, Suaçui Grande and Manhuaçu Rivers). Subregions I and N have one monitoring station each, in Doce River.

Assessing entropy measures

Marginal entropy, maximum entropy and, by extension, information redundancy, of the water quality variables were influenced by seasonality in Doce River watershed. The maximum observed value of marginal entropy ranged from 4.09 bits during the dry period to 4.39 bits during the wet season, corresponding to an increase of 7%. A gain in information content during the wet season was observed in all water quality subregions, which was highlighted by the higher values of the 50th and 75th percentiles of marginal entropy when compared to the dry season (Figure 4(a)). The effect of seasonality on the Shannon's entropy of water quality variables had already been demonstrated in the previous studies in water resources (Tanos et al. 2015).
Figure 4

Marginal entropy (a), maximum entropy (b) grouped by water quality subregion during dry and wet seasons and (c) demonstrates the distribution of sample sizes between dry and wet seasons by water quality subregion. ‘n’ represents the size of each seasonal sample.


Marginal entropy values are asymmetrically distributed, with greater spread up to the 50th percentile and higher values in the upper tail of the distribution. The lowest marginal entropy obtained, in both the dry and wet seasons, was zero, which indicates the absence of uncertainty. The variance of the observed data was represented by discrete states after quantizing the variables. The smaller the variance of a given variable, the smaller the number of discrete states needed to describe it; in the limit, a variable occupies a single state and has zero uncertainty (null entropy).
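The relation between quantization and entropy described above can be illustrated with a short sketch. It is not the authors' implementation: the function name `marginal_entropy`, the equal-width quantization scheme and the `bin_width` parameter are assumptions made for this example, and the two series are hypothetical.

```python
import numpy as np

def marginal_entropy(values, bin_width):
    """Shannon entropy (bits) of a series quantized into equal-width states."""
    values = np.asarray(values, dtype=float)
    # Quantization (assumed scheme): each observation maps to the index
    # of the equal-width interval that contains it.
    states = np.floor(values / bin_width).astype(int)
    _, counts = np.unique(states, return_counts=True)
    p = counts / counts.sum()
    # H = sum of p * log2(1 / p); a single state gives p = [1] and H = 0.
    return float(np.sum(p * np.log2(1.0 / p)))

# Hypothetical series whose values all fall in one state
constant_series = [0.001] * 10
# Hypothetical series spread evenly over four states
variable_series = [0.5, 1.5, 2.5, 3.5]

h_constant = marginal_entropy(constant_series, bin_width=0.01)
h_variable = marginal_entropy(variable_series, bin_width=1.0)
```

In this sketch the constant series yields null entropy, while the series spread evenly over four states yields log2(4) = 2 bits, the largest value four states can carry.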

The maximum entropy was limited to the highest value of marginal entropy: 4.09 bits in the dry season and 4.39 bits in the wet season. As with marginal entropy, seasonality drives maximum entropy toward greater values in the wet season (Figure 4(b)).

The similarity between the highest values of marginal and maximum entropy was expected: when a uniform distribution over a finite interval obtained from the observed data is used, the results of the Monte Carlo simulations preserve the statistical characteristics of the variables.

Selection of non-critical water quality variables in the Doce River watershed

The cumulative histogram of information redundancy values (calculated from Equation (3)), considering all water quality variables and monitoring stations, showed an abrupt rise to 100% close to the 50th percentile, which corresponds to an information redundancy of 80% (Figure 5). This value was used as the threshold for classifying variables as LICV.
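The thresholding step can be sketched as follows. The sketch assumes the commonly used definition of redundancy, R = (1 − H/H_max) × 100, which may differ in detail from Equation (3); the function names are hypothetical, and the marginal entropy values passed in are illustrative. Only the 80% threshold and the 4.09-bit maximum come from the text.

```python
LICV_THRESHOLD = 80.0  # percent; threshold reported in the text

def information_redundancy(h_marginal, h_max):
    """Redundancy in percent, ASSUMING R = (1 - H / H_max) * 100."""
    if h_max <= 0:
        raise ValueError("h_max must be positive")
    return (1.0 - h_marginal / h_max) * 100.0

def is_licv(h_marginal, h_max, threshold=LICV_THRESHOLD):
    """True when redundancy exceeds the LICV threshold."""
    return information_redundancy(h_marginal, h_max) > threshold

H_MAX_DRY = 4.09  # bits; dry-season maximum entropy reported in the text

r_null = information_redundancy(0.0, H_MAX_DRY)  # null entropy
flag_low = is_licv(0.5, H_MAX_DRY)   # hypothetical marginal entropy of 0.5 bits
flag_high = is_licv(2.0, H_MAX_DRY)  # hypothetical marginal entropy of 2.0 bits
```

Under this assumed definition, a variable with null entropy is fully redundant, and a variable is flagged as LICV only when its marginal entropy is below one fifth of the maximum entropy.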
Figure 5

Cumulative histogram of information redundancy values (calculated from Equation (3)) considering all water quality variables and monitoring stations in the study area. The red line represents the threshold for classifying LICV.


A total of 51 variables in the wet season and 62 variables in the dry season were classified as LICV across the 14 water quality subregions. The variables with information redundancy greater than 80% in all subregions in both seasonal periods totaled 41. Comparing their values with Brazilian environmental standards revealed non-conformances for some of them. In these cases, despite the low information content of the variables at all stations of the same subregion, they were not labeled as non-critical variables.
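The two-step selection rule (low information content at every station of a subregion, plus compliance with environmental standards) can be expressed compactly. The sketch below is illustrative only: the function name, the data layout, the variable labels `X1` to `X3` and all numbers are hypothetical.

```python
def non_critical_variables(redundancy_by_station, conforming, threshold=80.0):
    """Select variables that are LICV at every station AND meet the standards.

    redundancy_by_station: {variable: [redundancy (%) at each station]}
    conforming: {variable: True if all values meet the environmental standards}
    """
    selected = []
    for variable, values in redundancy_by_station.items():
        low_information = all(r > threshold for r in values)
        # A non-conforming variable is never labeled non-critical,
        # regardless of how redundant it is.
        if low_information and conforming.get(variable, False):
            selected.append(variable)
    return sorted(selected)

# Hypothetical subregion with two stations and three variables
redundancy_by_station = {
    "X1": [100.0, 92.0],  # redundant everywhere
    "X2": [95.0, 90.0],   # redundant everywhere, but non-conforming
    "X3": [20.0, 30.0],   # informative
}
conforming = {"X1": True, "X2": False, "X3": True}

result = non_critical_variables(redundancy_by_station, conforming)
```

Here only `X1` is selected: `X2` is excluded by the compliance check and `X3` by its low redundancy.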

Subregion L (in Caratinga River) accounted for the highest number of variables with non-conforming values, four in total. Subregions D (in Gualaxo do Norte River) and J (in Doce River) recorded three non-conforming variables each. Subregions B (in Gualaxo do Norte River and Suaçuí Grande River), C (in Gualaxo do Norte River), E (in Carmo River), G (in Doce River), M (in Manhuaçu River) and N (in Doce River) recorded two non-conforming variables each. Subregions F (in Santo Antônio River), H (in Piracicaba River), I (in Doce River) and K (in Guandu River) had the lowest number, with one non-conforming variable each. In subregion A (station RGN-01 in Gualaxo do Norte River, upstream of the Fundão dam), all variables were within environmental standard limits. The number of non-critical water quality variables ranged from 25 to 39, according to the water quality subregion (Figure A3 – Supplementary material).

Among the non-conforming variables, biochemical oxygen demand (BOD) stood out the most in the comparison with environmental standards, with non-conforming values in eight subregions. It was followed by total phosphorus (six subregions), total lead (four subregions), total mercury and dissolved phosphorus (two subregions each), and dissolved boron, total boron, total cadmium and total nickel (one subregion each) (Table A4 – Supplementary material).

The final set of variables selected as non-critical is dominated by metals and metalloids. In summary, 15 non-critical variables were identified in all 14 subregions, another six variables in 13 subregions and five variables in 12 subregions. The remaining non-critical variables cover a smaller number of subregions (Table 2).

Table 2

Non-critical water quality variables in the subregions

 
 

Non-critical variables are characterized by low information entropy, which means small uncertainty in their expected values. Uncertainty is linked to the degree of redundancy, understood here as the repeatability of the data. When these variables are measured repeatedly, each new value is highly predictable and adds little or no new information. These variables are likely classified as ‘non-critical’ because they are frequently below quantification limits and/or are not related to the mine tailings discharged in the watershed. Moreover, the monitoring program started a few years after the dam break, and pollutant levels may have decreased due to physical–chemical processes, such as sedimentation, taking place in the aquatic system.

Spatial and temporal variability of non-critical variables

In all subregions except subregion E, the variables identified as non-critical, as well as their number, varied between the dry and wet periods in different proportions (Figure 6).
Figure 6

Number of non-critical water quality variables for each water quality subregion by seasonal period.


Due to changes in rainfall and runoff patterns associated with the spatial dynamics of land use/occupation in watersheds, seasonality often impacts water quantity and quality (Rodrigues et al. 2018; Schliemann et al. 2021). Seasonality influence was also observed by researchers who investigated water quality in the Doce River watershed, before and after the Fundão disaster (Petrucio et al. 2005; Nogueira et al. 2021; Passos et al. 2021).

The dry period had more non-critical water quality variables than the wet period. Point source pollution loads, mainly from untreated and treated sewage, remain nearly constant over the year, whereas during the dry period low rainfall volumes and reduced flows in the river network generate lower diffuse pollution loads from the watershed and less erosion and resuspension of sediments in the rivers (Oliveira & Quaresma 2017; da Cunha Richard et al. 2020). Consequently, water quality variable concentrations in the dry period present less variability (less uncertainty) and lower entropy when compared to the wet period.

Entropy measures varied along the watershed according to the variable and the season. For illustration purposes, Figures 7 and 8 show the spatial distribution of information redundancy, respectively, during dry and wet periods, for dissolved arsenic and total manganese. Total manganese was considered an important water quality variable in all subregions, whereas dissolved arsenic was identified as a non-critical variable in 10 subregions.
Figure 7

Spatial distribution of information redundancy for dissolved arsenic in dry and wet seasons in the Doce River watershed.

Figure 8

Spatial distribution of information redundancy for total manganese in the wet and dry season in the Doce River watershed.


Information redundancy of dissolved arsenic shows a wide spread of values along the watershed (Figure 7), in both dry and wet seasons, ranging from 0 to 100%. Zones of lower redundancy, with values below 40%, are well delimited in the upper reach of the Doce River. The monitoring stations in this area correspond to all stations of subregions E, G and I. Subregion E corresponds to the Carmo River watershed, which was historically exploited for gold mining (Borba et al. 2000; Daus et al. 2005; Silva et al. 2018). This activity, especially in past centuries when mining was rudimentary and little care was taken regarding environmental impacts, promotes arsenic release from soil and minerals, representing a serious risk to ecosystems and human health due to its toxicity (Alonso et al. 2020; Barcelos et al. 2020).

Subregions G and I are located in the upper reach of the Doce River and are probably influenced by gold mining in the Carmo watershed, as concentrations of dissolved arsenic in non-compliance with environmental standards have been recorded at this location both before and after the Fundão disaster. Comparing the maps of Figure 7 for dissolved arsenic, redundancy decreases from the dry to the wet season in the middle reach of the Doce River (subregion J). Less frequently, dangerous levels of arsenic have also been identified in water for human consumption closer to the middle reach of the Doce River (Teixeira et al. 2020). Dissolved arsenic was indicated as a non-critical variable in all subregions except these four (E, G, I and J).

The maps in Figure 8 show that, because total manganese is present and variable all over the watershed, the spatial distribution of redundancy is quite homogeneous and similar between dry and wet periods. Low levels of redundancy predominate throughout the year, ranging from 0 to 34.4% in the dry season and from 1.7 to 27.1% during the wet season. This behavior reflects the pronounced variability of total manganese, in both temporal and spatial terms, in the Doce River watershed. Recent studies report measurements of total manganese at high concentrations, even violating environmental standards, all over the Doce River watershed (de Carvalho et al. 2018). This is likely related to the Fundão disaster, since manganese is a component of iron mining tailings and tends to be gradually remobilized from sediment deposited in river channels to surface waters by chemical and physical–chemical processes, representing a medium- and long-term risk to the population and aquatic ecosystems (Carvalho et al. 2018; Baudson et al. 2021; Moreira et al. 2021; Duarte & Neves 2023).

Validation

The correlation analysis showed strong multicollinearity among the water quality variables in all subregions (see Figures A4 to A17 – Supplementary material). The number of variables eliminated from strongly correlated pairs ranged from 9 (subregion F) to 40 (subregion N) (Table A5 – Supplementary material). Variables whose variability was null and could not be standardized were also excluded from the analysis; therefore, the total set of variables investigated in the PCA ranged from 9 to 46 (Table A6 – Supplementary material).
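The screening applied before the PCA can be sketched as below. This is a minimal illustration under stated assumptions: the correlation cut-off `r_limit=0.9` and the rule of keeping the first variable of each strongly correlated pair are not given in this section and are chosen only for the example; the function name and the data are hypothetical.

```python
import numpy as np

def prune_variables(data, names, r_limit=0.9):
    """Drop null-variance variables, then one of each pair with |r| >= r_limit."""
    data = np.asarray(data, dtype=float)
    # Variables with null variability cannot be standardized.
    varying = [j for j in range(data.shape[1]) if data[:, j].std() > 0]
    corr = np.corrcoef(data[:, varying], rowvar=False)
    kept = []
    for a in range(len(varying)):
        # Keep a variable only if it is not strongly correlated
        # with any variable that has already been kept.
        if all(abs(corr[a, b]) < r_limit for b in kept):
            kept.append(a)
    return [names[varying[a]] for a in kept]

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
data = np.column_stack([
    x,                            # v1
    2.0 * x + 1.0,                # v2: perfectly correlated with v1
    np.full(5, 3.0),              # v3: null variability
    [1.0, -1.0, 0.0, -1.0, 1.0],  # v4: uncorrelated with v1
])
kept_names = prune_variables(data, ["v1", "v2", "v3", "v4"])
```

With these hypothetical data, `v3` is removed for null variability and `v2` for its correlation with `v1`, leaving `v1` and `v4` for the PCA.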

The PCA showed that the cumulative explained variance of the first two principal components ranged from 21.5% (subregion E) to 67.6% (subregion N), with an average across subregions of 29.9%. The eigenvalues of these components were greater than 1 in all subregions, which makes them representative according to the Kaiser criterion (Ferré 1995).

The PCA dataset of every subregion included non-critical water quality variables in some proportion. Subregion N had only one non-critical variable investigated in the PCA, while subregion M had 11 (Table A7 – Supplementary material).

Variables with a percentage of individual contribution higher than the average for each subregion were considered to have the greatest contribution. In subregions A, B, C, E, F, H, L, M and N, the set of variables with the greatest contribution to the first principal component (PC1) did not include non-critical variables. Subregions G, I and K had one non-critical variable, and subregions D and J had two each, contributing to the first component (Table 3) (Figure A18 – Supplementary material).
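The above-average contribution rule can be sketched with a correlation-matrix PCA. The sketch assumes a common definition of contribution (the squared loading on the unit-norm eigenvector, expressed in percent), which may not match the software used in the study; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def pca_contributions(data, component=0):
    """Percent contribution of each variable to one principal component."""
    data = np.asarray(data, dtype=float)
    # PCA on standardized variables is the eigen-decomposition
    # of the correlation matrix.
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]  # sort from largest to smallest
    loadings = eigenvectors[:, order[component]]
    # Eigenvectors have unit norm, so squared loadings sum to 1.
    contributions = 100.0 * loadings ** 2
    return contributions, eigenvalues[order]

# Synthetic data: two variables sharing a common signal, two independent ones
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 1))
data = np.hstack([base + 0.1 * rng.normal(size=(50, 2)),
                  rng.normal(size=(50, 2))])

contrib, eigvals = pca_contributions(data, component=0)
# Variables whose contribution to PC1 exceeds the average contribution
above_average = np.flatnonzero(contrib > contrib.mean())
# Kaiser criterion: components with eigenvalue greater than 1
kaiser = eigvals > 1.0
```

Because the contributions sum to 100%, the average contribution equals 100 divided by the number of variables, so the rule flags the variables that weigh more than an equal share of the component.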

Table 3

Non-critical water quality variables listed among the variables with the greatest contribution to PC1

Non-critical water quality variable | Subregions
Total chromium |
Total Kjeldahl nitrogen |
Total lead |
Total nickel |
Dissolved cobalt |
Total selenium |
Chlorophyll a |

Similarly, non-critical variables were also identified among those with the greatest contribution to the second principal component (PC2), except in subregions E, F and I. In the other subregions, the number of non-critical variables that contributed significantly to PC2 ranged from one (subregions A, G, H, M and N) to three (subregion B) (Table 4) (Figure A18 – Supplementary material).

Table 4

Non-critical water quality variables listed among those with the greatest contribution to PC2

Non-critical water quality variable | Subregions
Ammoniacal nitrogen | A, C, K
Free cyanide | B, H
Total Kjeldahl nitrogen |
Iron II |
Total sulfides | C, L
Settleable solids |
Total lead |
Polyphosphate | D, K, L, M
Dissolved antimony |
Total beryllium | J, K
Total selenium |

The PCA showed that, in most cases, the non-critical water quality variables proposed by this study had minor relevance in explaining the water quality variability of the Doce River watershed, as expected. Therefore, PCA and the method proposed here yielded similar results. However, some non-critical variables appear among those with the greatest contribution to the first two principal components, which was due to the presence of outliers (Figure 9). Outliers become important when they support statistical models, as they add variance to the data (Hair et al. 2006). Here, outliers gave these variables a high degree of relative importance in the PCA, which was not consistent with their characteristics.
Figure 9

Box-and-whisker plots of the non-critical water quality variables with the greatest contribution to PC1 (a) and PC2 (b) by subregion (variables were standardized to allow visualization).


Addressing outliers in environmental data is challenging, as these values are not necessarily indicative of errors and may represent extreme values that actually occurred in the environment. In this study, outlier removal was carried out only before applying the method based on information entropy. This strategy was adopted because outliers would distort the quantization process and, consequently, hinder the identification of non-critical water quality variables through the information entropy method. Outliers could also have been excluded before conducting the PCA, but this step was intentionally omitted to illustrate their potential impact on the results. Indeed, Figure 9 shows that the greater importance attributed to non-critical variables in PC1 and PC2 comes exclusively from the outliers, corroborating the selection of non-critical water quality variables using information entropy.

It is therefore recommended that future applications of the non-critical variable selection method keep outlier removal as a preliminary data treatment step. This step does not have to rely solely on the Gini method. Other methods should also be evaluated, as they may yield satisfactory results depending on the specific characteristics of each dataset.

Management context

The occurrence of large-scale environmental disasters has placed environmental management, especially the management of water resources, at the center of attention worldwide. Depending on the nature of the impacts and their extension in geographical terms, the environmental recovery process becomes quite complex and challenging, often requiring the construction of solutions that are not readily available.

In the case of water resources management, the entire chain of actions necessary for the recovery of the impacted ecosystems, from the immediate monitoring of the spatial dynamics of pollutant dispersion to the temporal monitoring of mitigation measures, depends on the establishment and operation of monitoring networks. In this context, these networks play an increasingly strategic role in safeguarding ecosystems and guaranteeing water security for the affected populations. Emergency monitoring networks are very often oversized, both in the number of stations and in the number of monitored parameters. Although this approach is legitimate and necessary, especially when pollutants are highly mobile and dangerous, strategies are needed to optimize the monitoring effort without losing sight of the specificities of each crisis scenario. In the Doce River watershed, for instance, emergency water quality monitoring must consider the dynamics of pollutant bioavailability. Under varying environmental conditions, elements immobilized in sediments may become bioavailable more rapidly through physicochemical processes (Costa et al. 2021; Queiroz et al. 2021a). This phenomenon is highlighted by several authors who caution against the potential for chronic effects stemming from the Fundão disaster. Metals and metalloids adsorbed on or complexed with iron oxides and hydroxides in sediments can undergo exchanges, raising their concentration in water resources and presenting a significant risk to ecosystems and human populations (Queiroz et al. 2018, 2021b; Gabriel et al. 2021).

Emergency monitoring programs require regular assessment. Identifying variables that are less informative and pose a lower risk to water resources and aquatic ecosystems makes it possible to reduce the monitoring frequency of these non-critical variables. This focuses monitoring on the most critical variables, minimizes wasted resources and allows surplus resources to be redirected to other actions of environmental recovery programs. The method can help managers of emergency monitoring networks make more confident and less subjective decisions regarding monitoring criteria.

This paper presented a method based on information entropy for selecting non-critical water quality variables of emergency monitoring networks in the context of environmental disasters impacting water resources. The method has the potential to support managers in decision-making regarding the planning and operation of emergency monitoring networks. For the Doce River case study, the main findings can be summarized as follows:

  • 14 subregions with homogeneous water quality characteristics were identified in the Doce River watershed through cluster analysis;

  • Seasonality strongly influences the information content of water quality variables, generally making them more informative during the wet season;

  • Across the watershed, 41 non-critical water quality variables were identified and distributed among distinct sets within each water quality subregion;

  • Non-critical water quality variables ranged from 32 to 50% of the total monitored variables in the water quality subregions;

  • Metals and metalloids stand out as the prevailing non-critical water quality variables in the Doce River watershed.

The Doce River watershed served as a case study; nevertheless, the applied method is easily transferable to other watersheds due to (i) its independence from data following a normal distribution; (ii) its ability to capture the spatiotemporal fluctuations in the information content of water quality variables, even for elements with concentrations as low as parts per billion (ppb); and (iii) its objectivity in the identification of non-critical water quality variables, avoiding the influence of personal judgments. In other study areas, outliers and the discretization method for continuous variables should be evaluated carefully.

We acknowledge IFMG – Campus Governador Valadares for the support provided as a waiver for exclusive dedication to the doctoral research of the first author.


All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abdi
H.
&
Williams
L. J.
2010
Principal component analysis
.
Wiley Interdisciplinary Reviews: Computational Statistics
2
,
433
459
.
https://doi.org/10.1002/wics.101
.
Agência nacional de águas (ANA)
2016
Encarte Especial sobre a Bacia do Rio Doce Rompimento da Barragem em Mariana/MG. Conjuntura dos Recursos Hídricos no Brasil
, Vol.
1
.
ANA
,
Brasília, Brasil
.
Alfonso
L.
,
He
L.
,
Lobbrecht
A.
&
Price
R.
2013
Information theory applied to evaluate the discharge monitoring network of the Magdalena River
.
Journal of Hydroinformatics
15
(
1
),
211
228
.
https://doi.org/10.2166/hydro.2012.066
.
Alonso
D. L.
,
Pérez
R.
,
Okio
C. K. Y. A.
&
Castillo
E.
2020
Assessment of mining activity on arsenic contamination in surface water and sediments in southwestern area of Santurbán Paramo, Colombia
.
Journal of Environmental Management
264
(
February
).
https://doi.org/10.1016/j.jenvman.2020.110478
.
Alvisi
S.
,
Franchini
M.
,
Luciani
C.
,
Marzola
I.
&
Mazzoni
F.
2021
Effects of the COVID-19 lockdown on water consumptions: Northern Italy case study
.
Journal of Water Resources Planning and Management
147
,
1
9
.
https://doi.org/10.1061/(asce)wr.1943-5452.0001481
.
Azevedo
A. M.
2022
Package Multivariate Analysis
.
Azhar
S. C.
,
Aris
A. Z.
,
Yusoff
M. K.
,
Ramli
M. F.
&
Juahir
H.
2015
Classification of river water quality using multivariate analysis
.
Procedia Environmental Sciences
30
,
79
84
.
https://doi.org/10.1016/j.proenv.2015.10.014
.
Baldan
D.
,
Chattopadhyay
S.
,
Prus
P.
,
Funk
A.
,
Keller
A.
&
Piniewski
M.
2022
Regionalization strategy affects the determinants of fish community structure
.
Ecohydrology
15
,
1
14
.
https://doi.org/10.1002/eco.2425
.
Baran
T.
,
Harmancioglu
N. B.
,
Cetinkaya
C. P.
&
Barbaros
F.
2017
An extension to the revised approach in the assessment of informational entropy
.
Entropy
19
(
12
),
1
18
.
https://doi.org/10.3390/e19120634
.
Barbaros
F.
2022
Entropy-assisted approach to determine priorities in water quality monitoring process
.
Environmental Monitoring and Assessment
194
,
12
.
https://doi.org/10.1007/s10661-022-10580-0
.
Barcellos
D. d. S.
&
Souza
F. T. d.
2022
Optimization of water quality monitoring programs by data mining
.
Water Research
221
(
June
),
1
12
.
https://doi.org/10.1016/j.watres.2022.118805
.
Barcelos
D. A.
,
Pontes
F. V. M.
,
Fernanda
A. N. G.
,
Castro
D. C.
,
Nathalia
O. A.
&
Castilhos
Z. C.
2020
Gold mining tailing: Environmental availability of metals and human health risk assessment
.
Journal of Hazardous Materials
397
(
April
),
122721
.
https://doi.org/10.1016/j.jhazmat.2020.122721
.
Baudson
E.
,
Aparecida
M.
,
Benda
F.
,
Oliveira
D.
,
Engel
M.
,
Henrique
C.
,
Oliveira
R. D.
,
Lang
D.
,
Tadeu
M.
,
Orlando
D. A.
,
Vinícius
C.
&
Turbay
G.
2021
Trace metals in Rio Doce sediments before and after the collapse of the Fundão iron ore tailing dam, Southeastern Brazil
.
Chemosphere
262
,
1
8
.
https://doi.org/10.1016/j.chemosphere.2020.127879
.
Berglund
E. Z.
,
Buchberger
S.
,
Cunha
M.
,
Faust
K. M.
,
Giacomoni
M.
,
Goharian
E.
,
Kleiner
Y.
,
Lee
J.
,
Ostfeld
A.
,
Pasha
F.
,
Pesantez
J. E.
,
Saldarriaga
J.
,
Shafiee
E.
,
Spearing
L.
,
van Zyl
J. E.
&
Ethan Yang
Y. C.
2022
Effects of the COVID-19 pandemic on water utility operations and vulnerability
.
Journal of Water Resources Planning and Management
148
,
1
12
.
https://doi.org/10.1061/(asce)wr.1943-5452.0001560
.
Borba
R. P.
,
Figueiredo
B. R.
,
Rawlins
B.
&
Matschullat
J.
2000
Arsenic in water and sediment in the iron quadrangle, State of Minas Gerais, Brazil
.
Revista Brasileira de Geociências
30
(
3
),
558
561
.
BRASIL
2005
Resolução n.o 357, de 17 de março de 2005
.
Diário Oficial da República Federativa do Brasil
,
Brasília
.
Brinkmann
L.
&
Rowan
D. J.
2018
Vulnerability of Canadian aquatic ecosystems to nuclear accidents
.
Ambio
47
(
5
),
585
594
.
https://doi.org/10.1007/s13280-017-0995-6
.
Cacciuttolo
C.
&
Cano
D.
2022
Environmental impact assessment of mine tailings spill considering metallurgical processes of Gold and Copper mining : Case studies in the Andean countries of Chile
.
Water
14
,
1
30
.
Calazans
G. M.
,
Pinto
C. C.
,
Costa
E. P. d.
,
Perini
A. F.
&
Oliveira
S. C.
2018a
The use of multivariate statistical methods for optimization of the surface water quality network monitoring in the Paraopeba river basin, Brazil
.
Environ Monit Assess
190
(
491
),
1
17
.
Calazans
G. M.
,
Pinto
C. C.
,
Costa
E. P. d.
,
Perini
A. F.
&
Oliveira
S. C.
2018b
Using multivariate techniques as a strategy to guide optimization projects for the surface water quality network monitoring in the Velhas river basin, Brazil
.
Environmental Modeling and Assessment
190
,
726
.
Carvalho
G.
,
Larissa
C.
,
Passos
S.
,
Onesorge
T.
,
Lopes
M.
,
Miura
T.
,
Merçon
J.
,
Silva
D.
,
Bianca
C.
,
Barbosa
V.
,
Sperandio
L.
,
Edgar
C.
,
Kampke
H.
,
Chippari-gomes
A. R.
&
Chippari-gomes
A. R.
2018
Genotoxic, biochemical and bioconcentration effects of manganese on Oreochromis niloticus (Cichlidae)
.
Ecotoxicology
27
,
1150
1160
.
https://doi.org/10.1007/s10646-018-1970-0
.
Çengel
Y. A.
2021
On entropy, information, and conservation of information
.
Entropy
23
,
6
.
https://doi.org/10.3390/e23060779
.
Costa
E. S.
,
Cagnin
R. C.
,
da Silva
C. A.
,
Longhini
C. M.
,
F.
,
Lima
A. T.
,
Gomes
L. E. d. O.
,
Bernardino
A. F.
&
Neto
R. R.
2021
Iron ore tailings as a source of nutrients to the coastal zone
.
Marine Pollution Bulletin
171
,
112725
.
https://doi.org/10.1016/j.marpolbul.2021.112725
.
da Cunha Richard
E.
,
Duarte Jr
H.
,
de
A.
,
Duque Estrada
G. C.
,
Bechtold
J. P.
,
Maioli
B. G.
,
de Freitas
A. H. A.
,
Warner
K. E.
&
Figueiredo
L. H. M.
2020
Influence of Fundão tailings dam breach on water quality in the Doce River watershed
.
Integrated Environmental Assessment and Management
16
,
583
595
.
https://doi.org/10.1002/ieam.4311
.
da Luz
N.
,
Tobiason
J. E.
&
Kumpel
E.
2022
Water quality monitoring with purpose: Using a novel framework and leveraging long-term data
.
Science of the Total Environment
818
,
151729
.
https://doi.org/10.1016/j.scitotenv.2021.151729
.
da Silva
K. R.
2020
Avaliação da qualidade da água de uma microbacia hidrográfica rural dos tabuleiros costeiros do Brasil
.
Universidade Federal de Sergipe
.
da Silva
D. L.
,
de Lima
A. R. B.
,
de Lima Souza
J. R.
&
Adam
M. L.
2022
Environmental monitoring for genomic damage after an environmental accident in a river in the Brazilian Northeast
.
Water, Air, and Soil Pollution
233
(
12
),
1
10
.
https://doi.org/10.1007/s11270-022-05967-1
.
Daus
B.
,
Wennrich
R.
,
Morgenstern
P.
,
Weiß
H.
,
Euge
H.
,
Nalini
H. A.
,
Leonel
L. V.
,
Monteiro
R. P. G.
&
Moreira
R. M.
2005
Arsenic speciation in plant samples from the Iron Quadrangle, Brazil
.
Microchimica Acta
180
,
175
180
.
https://doi.org/10.1007/s00604-005-0397-5
.
de Almeida
R. G. B.
,
Lamparelli
M. C.
,
Dodds
W. K.
&
Cunha
D. G. F.
2022
Spatial optimization of the water quality monitoring network in São Paulo State (Brazil) to improve sampling efficiency and reduce bias in a developing sub-tropical region
.
Environmental Science and Pollution Research
29
(
8
),
11374
11392
.
https://doi.org/10.1007/s11356-021-16344-6
.
de Araújo
J. C.
,
Mota
V. T.
,
Teodoro
A.
,
Leal
C.
,
Leroy
D.
,
Madeira
C.
,
Machado
E. C.
,
Dias
M. F.
,
Souza
C. C.
,
Coelho
G.
,
Bressani
T.
,
Morandi
T.
,
Freitas
G. T. O.
,
Duarte
A.
,
Perdigão
C.
,
Tröger
F.
,
Ayrimoraes
S.
,
de Melo
M. C.
,
Laguardia
F.
,
Reis
M. T. P.
,
Mota
C.
&
Chernicharo
C. A. L.
2022
Long-term monitoring of SARS-CoV-2 RNA in sewage samples from specific public places and STPs to track COVID-19 spread and identify potential hotspots
.
Science of the Total Environment
838
.
https://doi.org/10.1016/j.scitotenv.2022.155959
.
de Carvalho
G. O.
,
Pinheiro
A. d. A.
,
de Sousa
D. M.
,
Padilha
J. d. A.
,
Souza
J. S.
,
Galvão
P. M.
,
Paiva
T. d. C.
,
Freire
A. S.
,
Santelli
R. E.
,
Malm
O.
&
Torres
J. P. M.
2018
Metals and arsenic in water supply for riverine communities affected by the largest environmental disaster in Brazil: The dam collapse on Doce river
.
Orbital
10
,
299
307
.
https://doi.org/10.17807/orbital.v10i4.1081
.
Deonarine
A.
,
Bartov
G.
,
Johnson
T. M.
,
Ruhl
L.
,
Vengosh
A.
&
Hsu-Kim
H.
2013
Environmental impacts of the Tennessee Valley Authority Kingston coal ash spill. 2. Effect of coal ash on methylmercury in historically contaminated river sediments
.
Environmental Science and Technology
47
(
4
),
2100
2108
.
https://doi.org/10.1021/es303639d
.
de Pádua
L. H. R.
,
Nascimento
N. d. O.
,
Silva
F. E. O. E.
&
Alfonso
L.
2019
Analysis of the fluviometric network of Rio Das Velhas using entropy
.
Revista Brasileira de Recursos Hidricos
24
,
1
14
.
https://doi.org/10.1590/2318-0331.241920180188
.
Do
H. T.
,
Lo
S. L.
&
Phan Thi
L. A.
2013
Calculating of river water quality sampling frequency by the analytic hierarchy process (AHP)
.
Environmental Monitoring and Assessment
185
(
1
),
909
916
.
https://doi.org/10.1007/s10661-012-2600-6
.
Duarte
E. B.
&
Neves
M. A.
2023
Main chemical and mineralogical components of the Rio Doce sediments and the iron ore tailing from the Fundão Dam disaster, Southeastern Brazil
.
Environmental Monitoring and Assessment
1
13
.
https://doi.org/10.1007/s10661-023-11087-y
.
Espindola
H. S.
,
Elisa
I.
&
Mifarreg
G.
2017
Território da mineração : Uma contribuição teórica
.
Revista Brasileira de Geografia
62
(
2
),
67
93
.
Ferré
L.
1995
Selection of components in principal component analysis: A comparison of methods
.
Computational Statistics and Data Analysis
19
(
6
),
669
682
.
https://doi.org/10.1016/0167-9473(94)00020-J
.
Foroozand
H.
&
Weijs
S. V.
2021
Objective functions for information-theoretical monitoring network design: What is ‘optimal’?
Hydrology and Earth System Sciences
25
(
2
),
831
850
.
https://doi.org/10.5194/hess-25-831-2021
.
Gabriel
F. Â.
,
Ferreira
A. D.
,
Queiroz
H. M.
,
Vasconcelos
A. L. S.
,
Ferreira
T. O.
&
Bernardino
A. F.
2021
Long-term contamination of the Rio Doce estuary as a result of Brazil's largest environmental disaster
.
Perspectives in Ecology and Conservation
19
,
417
428
.
Graham
A.
&
Wilcox
D. A.
2021
The impacts of Marcellus Shale gas drilling accidents on amphibians in a Pennsylvania fen
.
Wetlands Ecology and Management
29
(
1
),
155
167
.
https://doi.org/10.1007/s11273-020-09775-4
.
Guo
G.
&
Duan
R.
2021
Simulation and assessment of a water pollution accident caused by phenol leakage
.
Water Policy
23
(
3
),
750
764
.
https://doi.org/10.2166/wp.2021.153
.
Hair
J. F.
Jr.
,
Black
W. C.
,
Babin
B. J.
,
Anderson
R. E.
&
& Tatham
R. L.
2006
Multivariate Data Analysis
, 6th edn.
Prentice Hall
,
Upper Saddle River, NJ, USA
.
Hajigholizadeh
M.
&
Melesse
A. M.
2017
Assortment and spatiotemporal analysis of surface water quality using cluster and discriminant analyses
.
Catena
151
,
247
258
.
https://doi.org/10.1016/j.catena.2016.12.018
.
Härdle
W. K.
&
Simar
L.
2015
Applied Multivariate Statistical Analysis
, 4th edn.
Springer, Org.
,
Berlin
.
https://doi.org/10.1007/978-3-662-45171-7
.
Hausser
J.
&
Strimmer
K.
2021
Entropy: Estimation of Entropy, Mutual Information and Related Quantities
.
He
F.
,
Ma
J.
,
Lai
Q.
,
Shui
J.
&
Li
W.
2023
Environmental impact assessment of a wharf oil spill emergency on a river water source
.
Water (Switzerland)
15
(
2
).
https://doi.org/10.3390/w15020346
.
Hou
Y.
2012
Environmental accident and its treatment in a developing country: A case study on China
.
Environmental Monitoring and Assessment
184
(
8
),
4855
4859
.
https://doi.org/10.1007/s10661-011-2307-0
.
Hou
D.
,
Ge
X.
,
Huang
P.
,
Zhang
G.
&
Loáiciga
H.
2014
A real-time, dynamic early-warning model based on uncertainty analysis and risk assessment for sudden water pollution accidents
.
Environmental Science and Pollution Research
21
,
8878
8892
.
https://doi.org/10.1007/s11356-014-2936-2
.
Instituto Brasileiro do Meio Ambiente e dos Recursos Naturais Renováveis (IBAMA)
2015
Laudo técnico preliminar: Impactos ambientais decorrentes do desastre envolvendo o rompimento da barragem de Fundão, em Mariana, Minas Gerais, IBAMA
.
Brasília, Brasil
.
Jerez
L. A. M.
,
Welker
A. L.
,
Kemp
S. J.
&
Smith
V. B.
2023
Effect of human mobility changes due to COVID-19 on stream water quality in watersheds with different predominant land uses
.
Journal of Water Resources Planning and Management
149
,
1
10
.
https://doi.org/10.1061/jwrmd5.wreng-5650
.
Jiang
J.
,
Tang
S.
,
Han
D.
,
Fu
G.
,
Solomatine
D.
&
Zheng
Y.
2020a
A comprehensive review on the design and optimization of surface water quality monitoring networks
.
Environmental Modelling and Software
132
(
104792
),
1
17
.
https://doi.org/10.1016/j.envsoft.2020.104792
.
Jiang
D. Y.
,
Wang
Y. Y.
,
Liao
Q.
,
Long
Z.
&
Zhou
S. Y.
2020b
Assessment of water quality and safety based on multi-statistical analyses of nutrients, biochemical indexes and heavy metals
.
Journal of Central South University
27
(
4
),
1211
1223
.
https://doi.org/10.1007/s11771-020-4361-7
.
Jing
P.
,
Yang
Z.
,
Zhou
W.
,
Huai
W.
&
Lu
X.
2019
Inversion of multiple parameters for river pollution accidents using emergency monitoring data
.
Water Environment Research
91
(
8
),
731
738
.
https://doi.org/10.1002/wer.1099
.
Kändler
M.
,
Blechinger
K.
,
Seidler
C.
,
Pavlů
V.
,
Šanda
M.
,
Dostál
T.
,
Krása
J.
,
Vitvar
T.
&
Štich
M.
2017
Impact of land use on water quality in the upper Nisa catchment in the Czech Republic and in Germany
.
Science of the Total Environment
586
,
1316
1325
.
https://doi.org/10.1016/j.scitotenv.2016.10.221
.
Karamouz
M.
,
Kerachian
R.
,
Akhbari
M.
&
Hafez
B.
2009
Design of river water quality monitoring networks: A case study
.
Environmental Modeling and Assessment
14
(
6
),
705
714
.
https://doi.org/10.1007/s10666-008-9172-4
.
Kettenring
J. R.
2006
The practice of cluster analysis
.
Journal of Classification
23
(
1
),
3
30
.
https://doi.org/10.1007/s00357-006-0002-6
.
Keum
J.
&
Coulibaly
P.
2017a
Information theory-based decision support system for integrated design of multivariable hydrometric networks
.
Water Resources Research
53
,
6239
6259
.
https://doi.org/10.1002/2016WR019981
.
Keum
J.
&
Coulibaly
P.
2017b
Sensitivity of entropy method to time series length in hydrometric network design
.
Journal of Hydrologic Engineering
22
(
7
),
04017009
.
https://doi.org/10.1061/(asce)he.1943-5584.0001508
.
Keum
J.
,
Kornelsen
K. C.
,
Leach
J. M.
&
Coulibaly
P.
2017
Entropy applications to water monitoring network design: A review
.
Entropy
19
,
1
21
.
https://doi.org/10.3390/e19110613
.
Khalil
B.
,
Ouarda
T. B. M. J.
,
St-Hilaire
A.
&
Chebana
F.
2010
A statistical approach for the rationalization of water quality indicators in surface water quality monitoring networks
.
Journal of Hydrology
386
(
1–4
),
173
185
.
https://doi.org/10.1016/j.jhydrol.2010.03.019
.
Koo
Y. H.
,
Yang
Y. S.
&
Song
K. W.
2014
Radioactivity release from the Fukushima accident and its consequences: A review
.
Progress in Nuclear Energy
74
,
61
70
.
https://doi.org/10.1016/j.pnucene.2014.02.013
.
Kütter
V. T.
,
Martins
G. S.
,
Brandini
N.
,
Cordeiro
R. C.
,
Almeida
J. P. A.
&
Marques
E. D.
2023
Impacts of a tailings dam failure on water quality in the Doce river: The largest environmental disaster in Brazil
.
Journal of Trace Elements and Minerals
5
,
1
13
.
https://doi.org/10.1016/j.jtemin.2023.100084
.
Le
S.
,
Josse
J.
&
Husson
F.
2008
FactoMineR: An R package for multivariate analysis
.
Journal of Statistical Software
25
(
1
),
1
18
.
https://doi.org/10.18637/jss.v025.i01
.
Li
P.
,
Tian
R.
&
Liu
R.
2019
Solute geochemistry and multivariate analysis of water quality in the Guohua phosphorite mine, Guizhou Province, China
.
Exposure and Health
11
(
2
),
81
94
.
https://doi.org/10.1007/s12403-018-0277-y
.
Lian
M.
,
Wang
J.
,
Wang
B.
,
Xin
M.
,
Lin
C.
,
Gu
X.
,
He
M.
,
Liu
X.
&
Ouyang
W.
2023
Spatiotemporal variations and the ecological risks of organophosphate esters in Laizhou Bay waters between 2019 and 2021: Implying the impacts of the COVID-19 pandemic
.
Water Research
233
,
1
10
.
https://doi.org/10.1016/j.watres.2023.119783
.
Liuzzo
L.
,
Sammartano
V.
&
Freni
G.
2019
Comparison between different distributed methods for flood susceptibility mapping
.
Water Resources Management
33
(
9
),
3155
3173
.
https://doi.org/10.1007/s11269-019-02293-w
.
Lobo
E. A.
,
Schuch
M.
,
Heinrich
C. G.
,
da Costa
A. B.
,
Düpont
A.
,
Wetzel
C. E.
&
Ector
L.
2015
Development of the trophic water quality index (TWQI) for subtropical temperate Brazilian lotic systems
.
Environmental Monitoring and Assessment
187
,
6
.
https://doi.org/10.1007/s10661-015-4586-3
.
Lusweti
E.
,
Kanda
E. K.
,
Obando
J.
&
Makokha
M.
2022
Effects of oil exploration on surface water quality – a review
.
Water Practice and Technology
17
(
10
),
2171
2185
.
https://doi.org/10.2166/wpt.2022.104
.
Manley
S. F.
,
Conner
R.
&
Holthouse Putz
A. R.
2020
Chicago's response to a hexavalent chromium spill
.
Journal – American Water Works Association
112
(
3
),
22
29
.
https://doi.org/10.1002/awwa.1460
.
Matsumoto
M.
&
Nishimura
T.
1998
Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator
.
ACM Transactions on Modeling and Computer Simulation (TOMACS)
8
(
1
),
3
30
.
Mendes
R. G.
,
do Valle Junior
R. F.
,
de Melo Silva
M. M. A. P.
,
de Morais Fernandes
G. H.
,
Fernandes
L. F. S.
,
Fernandes
A. C. P.
,
Pissarra
T. C. T.
,
de Melo
M. C.
,
Valera
C. A.
&
Pacheco
F. A. L.
2022
A partial least squares-path model of environmental degradation in the Paraopeba River, for rainy seasons after the rupture of B1 tailings dam, Brumadinho, Brazil
.
Science of the Total Environment
851
.
https://doi.org/10.1016/j.scitotenv.2022.158248
.
Miller
M. E.
,
Ghisolfi
R. D.
&
Barroso
G. F.
2023
Remote sensing monitoring of mining tailings in the fluvial-estuarine-coastal ocean continuum of the Lower Doce River Valley (Brazil)
.
Environmental Monitoring and Assessment
195
(
5
),
542
.
https://doi.org/10.1007/s10661-023-11123-x
.
Mirauda
D.
&
Ostoich
M.
2020
MIMR criterion application: Entropy approach to select the optimal quality parameter set responsible for river pollution
.
Sustainability (Switzerland)
12
(
5
).
https://doi.org/10.3390/su12052078
.
Moreira
E.
,
Hermínio
S.
,
Nalini
A.
,
Adriana
J.
,
Abreu
T.
&
Brandão
L.
2021
Mobilization of heavy metals in river sediments from the region impacted by the Fundão dam rupture, Brazil
.
Environmental Earth Sciences
80
(
24
),
1
10
.
https://doi.org/10.1007/s12665-021-10107-9
.
Nair
U. S.
1936
The standard error of Gini's mean difference
.
Biometrika
28
(
3
),
428
436
.
Nguyen
T. H.
,
Helm
B.
,
Hettiarachchi
H.
,
Caucci
S.
&
Krebs
P.
2019
The selection of design methods for river water quality monitoring networks: A review
.
Environmental Earth Sciences
78
(
3
),
1
17
.
https://doi.org/10.1007/s12665-019-8110-x
.
Nguyen
T. H.
,
Helm
B.
,
Hettiarachchi
H.
&
Caucci
S.
2020
Quantifying the information content of a water quality monitoring network using principal component analysis: A case study of the Freiberger Mulde River Basin, Germany
.
Water
12
(
420
),
1
21
.
Nogueira
L. B.
,
Sousa
S. M.
,
Santos
C. G. L.
,
Araújo
G. S.
,
Oliveira
L.
&
Nogueira
K. O. P. C.
2021
Water quality from Gualaxo do Norte and Carmo Rivers (Minas Gerais, Brazil) after the Fundão dam failure
.
Anuário do Instituto de Geociências
44
,
1
11
.
https://doi.org/10.11137/1982-3908_2021_44_37175
.
Nooghabi
M. J.
&
Nooghabi
E. K.
2016
On entropy of a Pareto distribution in the presence of outliers
.
Communications in Statistics – Theory and Methods
45
(
17
),
5234
5250
.
https://doi.org/10.1080/03610926.2014.941495
.
Oehrig
J.
,
Kananizadeh
N.
,
Wild
M.
,
Rouhani
S.
&
Odle
W.
2023
Applying multivariate techniques to fingerprint water quality impact of the Fundão Dam breach within the Rio Doce basin
.
Integrated Environmental Assessment and Management
00
(
00
),
1
15
.
https://doi.org/10.1002/ieam.4820
.
Oliveira
K. S. S.
&
Quaresma
V. d. S.
2017
Temporal variability in the suspended sediment load and streamflow of the Doce River
.
Journal of South American Earth Sciences
78
,
101
115
.
https://doi.org/10.1016/j.jsames.2017.06.009
.
Pacheco
F. A. L.
,
do Valle Junior
R. F.
,
de Melo Silva
M. M. A. P.
,
Tarlé Pissarra
T. C.
,
de Souza Rolim
G.
,
de Melo
M. C.
,
Valera
C. A.
,
Moura
J. P.
&
Sanches Fernandes
L. F.
2023
Geochemistry and contamination of sediments and water in rivers affected by the rupture of tailings dams (Brumadinho, Brazil)
.
Applied Geochemistry
152
,
1
17
.
https://doi.org/10.1016/j.apgeochem.2023.105644
.
Passos
J. B. de M. C.
,
Teixeira
D. B. de S.
,
Campos
J. A.
,
Lima
R. P. C.
,
Fernandes-Filho
E. I.
&
da Silva
D. D.
2021
Multivariate statistics for spatial and seasonal quality assessment of water in the Doce River basin, Southeastern Brazil
.
Environmental Monitoring and Assessment
193
(
3
),
1
16
.
Pedrosa
P.
2007
Optical resilience of the Paraíba do Sul River (Brazil) during a toxic spill of a wood-pulping factory: The Cataguazes accident
.
Environmental Monitoring and Assessment
129
(
1–3
),
137
150
.
https://doi.org/10.1007/s10661-006-9348-9
.
Perdigão
R. A. P.
,
Ehret
U.
,
Knuth
K. H.
&
Wang
J.
2020
Debates: Does information theory provide a new paradigm for earth science? Emerging concepts and pathways of information physics
.
Water Resources Research
56
(
2
),
1
13
.
https://doi.org/10.1029/2019WR025270
.
Pereira
S. F. P.
,
Rocha
R. M.
,
Pinheiro
L. S.
&
Nogueira
D. P.
2021
Integration of statistical models and computer simulation in environmental accidents: A study on leakage of red mud in the Pará River, Amazon, Brazil
.
Journal of the Brazilian Chemical Society
32
(
10
),
1997
2008
.
Petrucio
M. M.
,
Medeiros
A. O.
,
Rosa
C. A.
&
Barbosa
F. A. R.
2005
Trophic state and microorganisms community of major sub-basins of the middle Rio Doce basin, southeast Brazil
.
Brazilian Archives of Biology and Technology
48
(
4
),
623
633
.
https://doi.org/10.1590/s1516-89132005000500015
.
Pinto
C. C.
,
Almeida
K. d. B.
&
Oliveira
S. C.
2018
Spatial evaluation of the water quality from the Velhas river channel, in the state of Minas Gerais
.
Periódico Tchê Química
15
(
30
),
75
86
.
Queiroz
H. M.
,
Nóbrega
G. N.
,
Ferreira
T. O.
,
Almeida
L. S.
,
Romero
T. B.
,
Santaella
S. T.
,
Bernardino
A. F.
&
Otero
X. L.
2018
The Samarco mine tailing disaster: A possible time-bomb for heavy metals contamination?
Science of the Total Environment
637–638
,
498
506
.
https://doi.org/10.1016/j.scitotenv.2018.04.370
.
Queiroz
H. M.
,
Ferreira
T. O.
,
Barcellos
D.
,
Nóbrega
G. N.
,
Antelo
J.
,
Otero
X. L.
&
Bernardino
A. F.
2021a
From sinks to sources: The role of Fe oxyhydroxide transformations on phosphorus dynamics in estuarine soils
.
Journal of Environmental Management
278
.
https://doi.org/10.1016/j.jenvman.2020.111575
.
Queiroz
H. M.
,
Ying
S. C.
,
Bernardino
A. F.
,
Barcellos
D.
,
Nóbrega
G. N.
,
Otero
X. L.
&
Ferreira
T. O.
2021b
Role of Fe dynamic in release of metals at Rio Doce estuary: Unfolding of a mining disaster
.
Marine Pollution Bulletin
166
,
1
7
.
https://doi.org/10.1016/j.marpolbul.2021.112267
.
R Core Team
2021
R: A Language and Environment for Statistical Computing.
Vienna, Austria
.
Reina-García
J.
,
Toro-Vélez
A. F.
,
Peña-Varón
M. R.
,
Olaya-Ochoa
J.
&
Figueroa-Casas
A.
2020
Methodological design for the macro-location of a micropollutants monitoring network in tropical rivers: A case study in Cauca River
.
Environmental Monitoring and Assessment
192
,
4
.
https://doi.org/10.1007/s10661-020-8154-0
.
RENOVA
2017
Programa de monitoramento quali-quantitativo sistemático de água e sedimentos – PMQQS Relatório técnico abril 2017
.
RENOVA
,
Belo Horizonte, Brasil
.
Robertson
D. M.
,
Saad
D. A.
&
Heisey
D. M.
2006
A regional classification scheme for estimating reference water quality in streams using land-use-adjusted spatial regression-tree analysis
.
Environmental Management
37
,
209
229
.
https://doi.org/10.1007/s00267-005-0022-8
.
Rodrigues
V.
,
Estrany
J.
,
Ranzini
M.
,
de Cicco
V.
,
Martín-Benito
J. M. T.
,
Hedo
J.
&
Lucas-Borja
M. E.
2018
Effects of land use and seasonality on stream water quality in a small tropical catchment: The headwater of Córrego Água Limpa, São Paulo (Brazil)
.
Science of the Total Environment
622–623
,
1553
1561
.
https://doi.org/10.1016/j.scitotenv.2017.10.028
.
Sánchez
L. E.
,
Alger
K.
,
Alonso
L.
,
Barbosa
F. A. R.
,
Brito
M. C. W.
,
Laureano
F. V.
,
May
P.
,
Roeser
H.
&
Kakabadse
Y.
2018
Os impactos do rompimento da Barragem de Fundão: O caminho para uma mitigação sustentável e resiliente
.
Santolin
C. V. A.
,
Ciminelli
V. S. T.
,
Nascentes
C. C.
&
Windmöller
C. C.
2015
Distribution and environmental impact evaluation of metals in sediments from the Doce River Basin, Brazil
.
Environmental Earth Sciences
1235
1248
.
https://doi.org/10.1007/s12665-015-4115-2
.
Schliemann
S. A.
,
Grevstad
N.
&
Brazeau
R. H.
2021
Water quality and spatio-temporal hot spots in an effluent-dominated urban
.
Hydrological Processes
35
,
1
17
.
Semenova
G.
2020
Environmental disasters as a factor of environmental pollution
.
E3S Web of Conferences
217
,
1
8
.
https://doi.org/10.1051/e3sconf/202021704007
.
Sergeant
C. J.
,
Starkey
E. N.
,
Bartz
K. K.
,
Wilson
M. H.
&
Mueter
F. J.
2016
A practitioner's guide for exploring water quality patterns using principal components analysis and procrustes
.
Environmental Monitoring and Assessment
188
.
https://doi.org/10.1007/s10661-016-5253-z
.
Shannon
C. E.
1948
A mathematical theory of communication
.
Bell System Technical Journal
27
,
379
423
.
https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
.
Shannon
C. E.
&
Weaver
W.
1949
The Mathematical Theory of Communication
.
University of Illinois Press
,
Urbana, Illinois
.
Sharif
S.
,
Ikram
A.
,
Khurshid
A.
,
Salman
M.
,
Mehmood
N.
,
Arshad
Y.
,
Ahmed
J.
,
Safdar
R. M.
,
Rehman
L.
,
Mujtaba
G.
,
Hussain
J.
,
Ali
J.
,
Angez
M.
,
Alam
M. M.
,
Akthar
R.
,
Malik
M. W.
,
Baig
M. Z. I.
,
Rana
M. S.
,
Usman
M.
,
Ali
M. Q.
,
Ahad
A.
,
Badar
N.
,
Umair
M.
,
Tamim
S.
,
Ashraf
A.
,
Tahir
F.
&
Ali
N.
2021
Detection of SARS-CoV-2 in wastewater using the existing environmental surveillance network: A potential supplementary system for monitoring COVID-19 transmission
.
PLoS One
16
,
1
9
.
https://doi.org/10.1371/journal.pone.0249568
.
Shi
B.
,
Jiang
J.
,
Sivakumar
B.
,
Zheng
Y.
&
Wang
P.
2018
Quantitative design of emergency monitoring network for river chemical spills based on discrete entropy theory
.
Water Research
134
,
140
152
.
https://doi.org/10.1016/j.watres.2018.01.057
.
Silva
D. d. C.
,
Bellato
C. R.
,
Marques Neto
J. de O.
&
Fontes
M. P. F.
2018
Trace elements in river waters and sediments before and after a mining dam breach (Bento Rodrigues, Brazil)
.
Química Nova
41
(
8
),
857
866
.
https://doi.org/10.21577/0100-4042.20170252
.
Singh
V. P.
2013
Entropy Theory and Its Application in Environmental and Water Engineering
, 1st edn.
John Wiley & Sons
,
Chichester
.
Singh
K. R.
,
Dutta
R.
,
Kalamdhad
A. S.
&
Kumar
B.
2019
An investigation on water quality variability and identification of ideal monitoring locations by using entropy based disorder indices
.
Science of the Total Environment
647
,
1444
1455
.
https://doi.org/10.1016/j.scitotenv.2018.07.463
.
Szekely
G. J.
&
Rizzo
M. L.
2005
Hierarchical clustering via joint between-within distances: Extending Ward's minimum variance method
.
Journal of Classification
22
(
2
),
151
183
.
https://doi.org/10.1007/s00357-005-0012-9
.
Tang
X.
,
Zhai
A.
,
Ding
X.
&
Zhu
Q.
2019
Safety guarantee system of drinking water source in Three Gorges reservoir area and its application in Huangjuedu drinking water source area
.
Sustainability (Switzerland)
11
(
24
).
https://doi.org/10.3390/su11247074
.
Tanos
P.
,
Kovács
J.
,
Kovács
S.
,
Anda
A.
&
Hatvani
I. G.
2015
Optimization of the monitoring network on the River Tisza (Central Europe, Hungary) using combined cluster and discriminant analysis, taking seasonality into account
.
Environmental Monitoring and Assessment
187
(
9
).
https://doi.org/10.1007/s10661-015-4777-y
.
Teixeira
M. C.
,
Santos
A. C.
,
Fernandes
C. S.
&
Ng
J. C.
2020
Arsenic contamination assessment in Brazil – Past, present and future concerns: A historical and critical review
.
Science of the Total Environment
730
,
99
.
https://doi.org/10.1016/j.scitotenv.2020.138217
.
United Nations Environment Programme (UNEP)
2005
The Songhua River Spill, China
, December
.
Versiani
B. R.
,
Carneiro
R. d. M. F.
,
Amaral
I. R.
&
Quintão
C. M. F.
2009
Maximum flood regionalization in large basins: Study case of the Alto São Francisco region – Minas Gerais, Brazil
.
Hydrological Processes
23
,
3201
3206
.
https://doi.org/10.1002/hyp
.
Wang
Y. B.
,
Liu
C. W.
,
Liao
P. Y.
&
Lee
J. J.
2014
Spatial pattern assessment of river water quality: Implications of reducing the number of monitoring stations and chemical parameters
.
Environmental Monitoring and Assessment
186
(
3
),
1781
1792
.
https://doi.org/10.1007/s10661-013-3492-9
.
Wang
W. C.
,
Du
Y. J.
,
Chau
K. W.
,
Xu
D. M.
,
Liu
C. J.
&
Ma
Q.
2021
An ensemble hybrid forecasting model for annual runoff based on sample entropy, secondary decomposition, and long short-term memory neural network
.
Water Resources Management
35
(
14
),
4695
4726
.
https://doi.org/10.1007/s11269-021-02920-5
.
Ward
J. H.
1963
Hierarchical grouping to optimize an objective function
.
Journal of the American Statistical Association
58
(
301
),
236
244
.
Wild
M.
,
Rouhani
S.
,
Oehrig
J.
,
Alves
P. H. G.
,
Odle
W.
&
Gaspar
D. F. A.
2023
Using spatiotemporal ratio analyses to quantitatively estimate water quality recovery of the Rio Doce
.
Integrated Environmental Assessment and Management
00
(
00
),
1
13
.
https://doi.org/10.1002/ieam.4813
.
Xiong
F.
,
Guo
S.
,
Chen
L.
,
Chang
F. J.
,
Zhong
Y.
&
Liu
P.
2018
Identification of flood seasonality using an entropy-based method
.
Stochastic Environmental Research and Risk Assessment
32
(
11
),
3021
3035
.
https://doi.org/10.1007/s00477-018-1614-1
.
Yazdi
J.
2018
Water quality monitoring network design for urban drainage systems, an entropy method
.
Urban Water Journal
15
,
227
233
.
https://doi.org/10.1080/1573062X.2018.1424215
.
Zorzal-Almeida
S.
&
Fernandes
V. D. O.
2021
Ecological thresholds of periphytic communities and ecosystems integrity in lower Doce River basin
.
Science of the Total Environment
796
,
1
11
.
https://doi.org/10.1016/j.scitotenv.2021.148965
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).