Abstract
The total maximum daily load (TMDL) program requires each state in the United States to assess its water bodies, list those that are impaired, and develop TMDL plans or programs to restore them. However, the impact that spatial and socioeconomic variables have on TMDL progress is unknown. This study seeks to fill this gap through a nation-wide analysis of the influence of spatial and socioeconomic variables on indicators of TMDL progress. To do so, data were collected and analyzed for each state, including indicators of TMDL progress, spatial variables, and socioeconomic data. Then, these data were applied to identify overall trends and to define the relationships between indicators of TMDL progress and spatial and socioeconomic variables. Results indicate that the size of a state, the length of total streams, and median household income are related to the percentage of streams that are assessed within a state. In addition, states largely followed similar patterns in TMDL progress based on the US Environmental Protection Agency region that they were within, indicating that location plays a large role. Overall, this study helps to contextualize progress in TMDL development and aids our understanding of factors that influence the implementation of water quality programs.
HIGHLIGHTS
Progress of the total maximum daily load (TMDL) program across all US states is summarized.
Regression analysis identifies what spatial or socioeconomic factors are related to indicators of TMDL progress.
Size of a state, stream lengths, and per capita earnings are related to the percent of streams that are assessed.
Indicators of TMDL progress in each state are influenced by their Environmental Protection Agency region.
INTRODUCTION
Nonpoint source pollution is one of the greatest threats to water quality in streams and rivers across the world (Brown & Froemke, 2012; du Plessis, 2022). Within the United States, the quality of most freshwater bodies is regulated through the Clean Water Act (CWA). Section 303(d) of the CWA requires states to assess their water bodies and, for those that are identified as impaired, establish total maximum daily loads (TMDLs) (US Congress, 1972). A TMDL is defined as the maximum amount of a pollutant allowed to enter a water body so that it will meet water quality standards for that particular pollutant and designated use (i.e., fishable, swimmable, and/or drinkable) (US Congress, 1972). This program is important as it is the statutory method to address nonpoint source pollution, which the US Environmental Protection Agency (EPA) claims is the leading remaining cause of water quality problems (USEPA, 2022b), with over 610,000 miles of rivers and streams listed as impaired (USEPA, 2017). To address nonpoint source pollution, the TMDL program relies on a system of policy implementation where the federal government delegates primary authority to agencies at the state level to execute the TMDL development process. The TMDL development process has three distinct steps: (1) assessment of surface water bodies; (2) listing of impaired water bodies; and (3) development of TMDLs for impaired water bodies.
Parameters used in regression analysis of the TMDL study.
Dependent variable | Independent variable |
---|---|
Assessed (%) | Total stream miles |
Impaired (%) | Stream density |
TMDL complete (%) | State land area (miles) |
Assessed miles | Average area rainfall |
Impaired miles | Elevation gain (feet) |
TMDL complete miles | Average slope |
 | Population |
 | Population density |
 | GDP (billion USD) |
 | No high school diploma percentage |
 | Bachelor's degree or higher percentage |
 | Per capita earnings |
 | Median household income |
 | Remote area percentage |
Once TMDLs are developed, they can be included in watershed management plans and implemented to meet water quality standards by reducing both point source and nonpoint source pollution. Point source pollution is limited through National Pollutant Discharge Elimination System (NPDES) permits, which are regulated by Section 402 of the CWA. However, no federal regulation or authority exists to enforce pollutant reductions against nonpoint sources, including enforcement of TMDL implementation to achieve these target loads. Rather, since the implementation of pollutant controls is outside of the EPA's statutorily defined authority, the implementation of TMDLs is done by the states, often through incentives to polluters (Copeland, 2014; Lichtenberg, 2019). For example, the CWA section 319 grant programs provide funding for nonpoint source pollution controls when the control of point source pollution through NPDES alone does not achieve TMDL goals (Jones, 2014). TMDL development is the responsibility of state agencies; therefore, the extent to which states assess water bodies, identify impairments, and develop TMDLs can vary from state to state. It has been over 30 years since the EPA published regulations establishing TMDL requirements in 1992 (Houck, 1997), which presents an opportune time to reflect on the progress of the TMDL programs across the United States.
This reflection is valuable in understanding what factors may contribute to advancing or impeding the progress of TMDLs. The US states vary in their populations, geography, demographics, and economic output, which may influence both the approach and extent to which they enact water quality programs. To that end, there is a diversity of approaches to assessing and implementing TMDLs through monitoring and modeling technologies. This includes assessment methods to monitor water quality parameters that range from detailed monitoring requiring scientific expertise, to those that can be carried out by volunteer citizens (Loperfido et al., 2010; Nation & Johnson, 2016; Brett, 2017; Webster & Dennison, 2022). While technical approaches to assessing water bodies and applying models to develop TMDLs are well defined and available for use (Zhang & Quinn, 2019), the extent to which different states have implemented TMDL programs and the factors that contribute to their progress are less clear. This is important because understanding the factors that impact the progress of protecting the nation's water bodies can help to improve TMDL methods and ultimately reach the designated goal of obtaining fishable, swimmable, and/or drinkable waters.
However, the extent to which socioeconomics or the geography of a state influences the implementation of TMDLs is unclear. This gap in understanding is critical as socioeconomics, land use, and environmental dynamics have been shown to be important factors in developing effective TMDL plans (Mirchi & Watkins, 2012) and in implementing other aspects of the CWA, such as the NPDES permit program (McDonald & Naughton, 2019) that seeks to address water quality impairments. Water quality impairments have also been shown to have a relationship with socioeconomic factors, such as a negative correlation with housing prices (Papenfus, 2019), as well as significant correlations with education, ethnic composition, age structure, and population density (Farzin & Grogan, 2013). In addition, the economics of a region could influence the degree to which the public influences management decisions, as per capita income is positively correlated with environmental concerns (Andrew et al., 2019).
Beyond socioeconomic effects, there may also be regional or spatial similarities among states in their approach to managing water quality due to common geomorphologic or climatic similarities, or their presence in similar EPA regions. The US EPA is divided into 10 different regions with a regional office associated with each that is responsible for enforcing implementation of TMDL programs within their states. This regionalization of enforcement offices may therefore lead to different levels of program development, as these regions have largely developed independently due to different factors such as water body priorities, litigation, and resource availability (Neilson & Stevens, 2002). However, no studies have evaluated the relationship between socioeconomics, regionalization, and TMDL implementation in the United States. Therefore, efforts to manage water quality through the TMDL program may be impacted by the socioeconomic and spatial factors of a watershed. Understanding these relationships could allow watershed managers and planners to better allocate resources and guide efforts to improve upon TMDL development programs.
The objectives of this study are to analyze the state-level TMDL progress across the United States and evaluate its relationship to socioeconomic and spatial factors. To do so, data were gathered from the EPA Assessment and Total Maximum Daily Load Tracking and Implementation System (ATTAINS) database for each state, including TMDL progress and impairment causes, and then analyzed to identify spatial trends or other clusters in state-level TMDL progress, including the percentage of streams that have been assessed, the percentage of assessed streams that are impaired, and the percentage of impaired streams for which a TMDL is complete. In addition, US Census (gross domestic product (GDP), educational attainment levels, etc.) and regionalization (EPA region, climatic region, etc.) data were collected for each state. These data were then used to explore how socioeconomic factors and spatial relationships correlate to markers of TMDL progress. Ultimately, this study can help to contextualize progress in TMDL development and improve our understanding of the influence that socioeconomics or regionalization may have on the implementation of water quality programs.
METHODOLOGY
This section presents the methodology to analyze the state-level TMDL progress across the United States and evaluate its relationship to socioeconomic and spatial factors. Section 2.1 presents the methods and approaches of data collection, Section 2.2 presents data preparation, and Section 2.3 presents data analysis.
Data collection
TMDL assessment data were extracted from EPA ATTAINS archived data (US EPA, 2022a). For each state, ATTAINS includes data on the status of TMDL for surface water bodies, including the miles of water body that are assessed, miles unassessed, specific impairment, miles impaired, and miles with a completed TMDL. From these data, we derived indicators of TMDL progress including (1) the percent of streams and rivers that have been assessed; (2) the percentage of the assessed streams and rivers that are listed as impaired; and (3) the percentage of impaired streams and rivers for which a TMDL has been completed.
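The three indicators described above can be sketched as a short calculation. This is a minimal illustration rather than the study's actual processing code: the field names are hypothetical, and it assumes that total stream miles equal assessed plus unassessed miles.

```python
from dataclasses import dataclass


@dataclass
class StreamMiles:
    """Stream and river miles for one state (field names are hypothetical)."""
    assessed: float
    unassessed: float
    impaired: float
    tmdl_complete: float


def progress_indicators(m: StreamMiles) -> dict:
    """Return the three TMDL progress indicators as percentages."""
    # Assumption: total miles = assessed + unassessed.
    total = m.assessed + m.unassessed
    return {
        # (1) share of all stream miles that have been assessed
        "assessed_pct": 100.0 * m.assessed / total if total else 0.0,
        # (2) share of assessed miles that are listed as impaired
        "impaired_pct": 100.0 * m.impaired / m.assessed if m.assessed else 0.0,
        # (3) share of impaired miles with a completed TMDL
        "tmdl_complete_pct": (
            100.0 * m.tmdl_complete / m.impaired if m.impaired else 0.0
        ),
    }


# Made-up numbers for illustration only.
example = progress_indicators(
    StreamMiles(assessed=4000, unassessed=6000, impaired=1000, tmdl_complete=250)
)
```

Note that each indicator uses a different denominator (all miles, assessed miles, impaired miles), so the three percentages are not expected to sum to 100.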
In addition, US Census data were obtained for each state including the population, GDP, educational attainment, and per capita earnings (U.S. Census Bureau, 2019). Finally, other geographic information including state land area, elevation (highest and lowest elevation), and average annual rainfall were derived using GIS analysis and datasets listed in Table SI-1. From these data, elevation gain and average geographical slopes were derived as they have been shown to be a significant factor impacting stream water quality (Connolly et al., 2018). Remote land percentage was derived from data from the Economic Research Service (Cromartie & Nulph, 2019), which defines zip-code level land areas as frontier and remote (FAR) based on population and travel times from population centers. Specifically, a FAR area is defined as an area that is 60 min or more from an urban area of 50,000 or more people (Cromartie & Nulph, 2019). The FAR area is meant to represent the population's access to services, and in the case of assessing streams, it may be that the more remote areas the state has, the more effort and resources it takes to travel to and assess stream quality. Furthermore, remoteness may indicate the level of interest in better water quality among citizens, as water quality perceptions have been linked to income level and location (Andrew et al., 2019). All datasets and sources are listed in Table SI-1.
Data preparation
Once data were obtained from ATTAINS, they were processed to ensure consistency across each state. Because TMDLs are implemented at the state level, each state operates independently and may have different ways in which they classify pollutants, perform monitoring and testing, and report their results. This can make it challenging to summarize data across all states due to differences in methodology and reporting. For example, in evaluating the suspended matter in water, states may list this as total suspended solids or turbidity. Therefore, to enable comparisons among states, the data were aggregated into common categories as shown in Table SI-2.
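The aggregation step can be illustrated with a small lookup from state-reported impairment names to a common category. The mapping below is hypothetical and covers only the suspended-matter example given in the text; the category label and record layout are assumptions, and the study's actual categories are those in Table SI-2.

```python
from collections import defaultdict

# Hypothetical lookup; the category label "suspended matter" is an assumption.
CATEGORY_BY_REPORTED_NAME = {
    "total suspended solids": "suspended matter",
    "turbidity": "suspended matter",
}


def aggregate_impaired_miles(records):
    """Sum impaired miles per (state, common category).

    records: iterable of (state, reported_name, impaired_miles) tuples.
    Names missing from the lookup keep their own normalized label.
    """
    totals = defaultdict(float)
    for state, reported_name, miles in records:
        key = reported_name.strip().lower()
        category = CATEGORY_BY_REPORTED_NAME.get(key, key)
        totals[(state, category)] += miles
    return dict(totals)


# Made-up records for illustration only.
sample = [
    ("State A", "Total Suspended Solids", 120.0),
    ("State A", "Turbidity", 30.0),
    ("State B", "turbidity", 75.5),
]
summary = aggregate_impaired_miles(sample)
```

Normalizing case and whitespace before the lookup keeps minor reporting differences from creating spurious categories.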
Data analysis
To analyze the data, three types of analyses were performed: summary statistics, clustering, and linear regression. To summarize data across all states, descriptive statistics (mean, median, and standard deviation) were performed on the TMDL and socioeconomic data. Two types of clustering were used: k-means clustering for TMDL progress and hierarchical clustering among socioeconomic variables. Clustering is a common method to allocate objects (e.g., states) with multiple datasets (e.g., water quality parameters of each state) into groups that have similar attributes across datasets (Javadi et al., 2017). For TMDL progress, k-means clustering (Hartigan & Wong, 1979) was used to identify if states had any common groupings considering the percent of streams that were assessed, those that were impaired, and those in which a TMDL was completed. K-means clustering is one of the most commonly used clustering methods, and it can be used to generate an optimally defined number of clusters (Ali & Kadhum, 2017). The hypothesis for k-means cluster analysis was that, given an optimal k-value (number of clusters), the dataset would reveal meaningful groupings that could represent the progress of the TMDL process. By using k-means clustering, we sought to identify states whose TMDL progress varied between high and low values across all three variables (i.e., k = 3).
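To make the clustering step concrete, the sketch below groups states by their three progress percentages with k = 3. It is not the study's implementation: it uses the simpler Lloyd's algorithm instead of the cited Hartigan and Wong variant, it seeds the centroids with a deterministic farthest-point rule (an assumption chosen so the example is reproducible), and the six data points are made up.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point whose nearest already-chosen centroid is farthest away."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k=3, max_iter=100):
    """Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical (assessed %, impaired %, TMDL complete %) values for six states.
states = [(90, 20, 10), (88, 25, 12), (30, 60, 50),
          (35, 55, 45), (10, 80, 5), (12, 85, 8)]
labels, centroids = kmeans(states, k=3)
```

Because all three inputs are percentages on the same 0 to 100 scale, no rescaling is applied here; variables on different scales would normally be standardized first.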
In contrast, socioeconomic data included numerous variables, and therefore, hierarchical clustering was applied based on Euclidean distance and the complete linkage method (Murtagh & Contreras, 2012). Unlike the k-means method, hierarchical clustering does not require prior knowledge of the number of clusters. It is a stepwise clustering method that merges the most similar data points together into groups at each level. One objective of using this method to analyze socioeconomic data between states is to determine how clusters might be associated geographically or spatially. Once clustering was complete, we observed that, spatially, many of the k-means and hierarchical clusters appeared to mirror the boundaries of the EPA regions or a collection of them. Therefore, we performed descriptive statistics on both the clusters described earlier and the specific EPA region a state fell within, and performed analysis of variance (ANOVA) tests to determine the statistical significance of these groupings.
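The stepwise merging behind hierarchical clustering with Euclidean distance and complete linkage can be sketched as follows. This is an illustrative, unoptimized version with made-up two-dimensional points; it assumes the input variables have already been put on comparable scales, which the text does not specify.

```python
import math
from itertools import combinations


def complete_linkage_merges(points):
    """Agglomerative clustering with Euclidean distance and complete linkage.

    Returns the merge history as (members_a, members_b, height) tuples,
    where members are lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        merges.append((clusters[a], clusters[b], linkage(clusters[a], clusters[b])))
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return merges


# Made-up points: two tight pairs and one outlier.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 0.0)]
merges = complete_linkage_merges(points)
```

The merge history is the information a dendrogram displays; cutting it at a chosen height yields a flat set of clusters, which is why the number of clusters does not need to be fixed in advance.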
Single variable regression analysis for all relationships with a p < 0.1.
Dependent (y) | Independent (x) | Equation | p-Value | R2 |
---|---|---|---|---|
Assessed (%) | Total stream length (mi) | y = 64.3 − 0.0003 × x | 0.002 | 0.19 |
Assessed (%) | State land area (km2) | y = 59.1 − 0.0003 × x | 0.004 | 0.16 |
Assessed (%) | Median household income ($) | y = −31.4 + 0.0011 × x | 0.012 | 0.13 |
Assessed (%) | Population density (pop/mi2) | y = 33.3 + 0.0410 × x | 0.017 | 0.12 |
Assessed (%) | Per capita earnings ($) | y = −18.6 + 0.0013 × x | 0.050 | 0.08 |
Assessed (%) | Bachelor or higher degree (%) | y = −10.1 + 1.85 × x | 0.053 | 0.08 |
Assessed (%) | No high school diploma (%) | y = 73.2 − 2.50 × x | 0.088 | 0.06 |
TMDL complete (%) | No high school diploma (%) | y = 4.8 + 1.687 × x | 0.099 | 0.06 |
RESULTS AND DISCUSSION
Results for the TMDL data itself are presented in Section 3.1, including a summary of indicators of TMDL progress and impairment types, k-means clustering of these indicators, and the impact of regionalization on indicators of TMDL progress. Section 3.2 then summarizes the socioeconomic data through summary statistics and hierarchical clustering. Finally, the relationship between indicators of TMDL progress and socioeconomic factors is presented through single and multivariable regression in Section 3.3.
TMDL data
Indicators of TMDL progress
Distribution of the percent of streams within a state that have been assessed, listed as impaired, and have had a completed TMDL.
Impairment types
Summary of the top four impairments across each state with the number of states on the y-axis.
(a) Total number of states with each impairment cause and (b) distribution of the percent of assessed streams that are listed as impaired within each state for each impairment cause.
K-means clustering
(a) K-means clustering of assessed, impaired, and TMDL complete percentages, and (b) spatial representation of clusters. Groups 1 (blue), 2 (red), and 3 (green) represent clusters of states that behave similarly in terms of TMDL progress.
EPA regionalization
(a) Assessed percentage, (b) impaired percentage, (c) TMDL completed percentage for streams by EPA region, and (d) EPA regions.
Socioeconomic data
Summary of socioeconomic data
Boxplot distribution of socioeconomic data: (a) land area, (b) population, (c) GDP, (d) percent with no high school diploma, (e) percent of population with a bachelor's degree or higher, and (f) per capita earnings.
Hierarchical clustering
Cluster analysis of socioeconomic data color coded by EPA regions: black, 1, 2, 3; red, 4; green, 5; blue, 6, 7; light blue, 8, 9, 10.
Relationship between TMDLs and socioeconomics
Response screening was performed to determine if there are any correlations among TMDL progress and socioeconomic variables across all states. The results in Table 2 indicate that the percent of streams that are assessed is negatively correlated to total stream length (R2 = 0.19) and state land area (R2 = 0.16), both of which explain the largest degree of variability in the percent of streams that are assessed. This suggests that the larger the land area and length of streams to assess, the lower the percentage of streams that are actually assessed, likely due to the sheer size of the state and the length of its streams and rivers. In addition, the percent of streams that are assessed is positively correlated to median household income and population density (p < 0.05), per capita earnings (p = 0.050), and percent of population with a bachelor's degree or higher (p = 0.053), and is negatively correlated to percent of population with no high school diploma (p = 0.088). These positive correlations with median household income and bachelor's degree or higher suggest that the economic output of a state has an influence on the percentage of streams that are assessed. This implies that state economic resources are a contributing factor to TMDL progress. Funding for the development of TMDLs mainly comes from state taxes and, to a certain degree, from nonprofit organizations directly involved with the water quality concerns of the regions. Therefore, per capita earnings have a direct relationship to these processes.
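For readers who want to see how a single-variable fit of this kind is computed and applied, the sketch below implements ordinary least squares with R2 and then evaluates the total stream length equation from Table 2 for a hypothetical state. The function names and the 100,000-mile input are illustrative assumptions; the p-values in Table 2 require a t-test on the slope, which is not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                    # slope
    a = mean_y - b * mean_x          # intercept
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


def predicted_assessed_pct(total_stream_miles):
    """Apply the rounded Table 2 equation y = 64.3 - 0.0003 * x."""
    return 64.3 - 0.0003 * total_stream_miles


# Made-up, perfectly linear data to check the fitting routine.
a, b, r2 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Hypothetical state with 100,000 stream miles: roughly 34% assessed.
estimate = predicted_assessed_pct(100_000)
```

With an R2 of 0.19, such an estimate describes only the average trend across states and leaves most of the state-to-state variability unexplained.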
Finally, the percent of TMDLs that are complete has a weak positive correlation (p < 0.1) with the percentage of the population without a high school diploma. This trend is opposite to what is observed with the percent of streams that are assessed, and it is unclear how this variable would act to influence the completion of TMDLs. However, despite the moderate significance of the slope, this relationship also has the lowest model fit (R2 = 0.06), suggesting that it explains only a small portion of the variability in the percent of TMDLs that are complete. In addition, there were no correlations between any socioeconomic or spatial variables and the percentage of streams that are impaired when categorized based on the specific impairments themselves.
To understand how multiple variables might better predict TMDL progress, we performed multivariable linear regression using a stepwise approach. Results demonstrated that the prediction of the percentage of assessed streams could be improved by combining the total stream length and median household income, with an adjusted R2 of 0.29 (Table 3). No multivariable equations were statistically significant for the percent of impairments or TMDLs completed.
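A multivariable fit and its adjusted R2 can be sketched as below. This is a generic least-squares illustration on made-up data, not the study's stepwise procedure: the variable-selection step is omitted, and the normal equations are solved directly, which on real data with very differently scaled predictors (for example, stream miles and dollars) would usually call for rescaling or a dedicated statistics library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ...; returns (coefficients, r2, adjusted_r2)."""
    rows = [[1.0] + list(r) for r in X]   # prepend the intercept column
    p = len(rows[0])                      # parameters, including intercept
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum(
        (yi - sum(bj * v for bj, v in zip(beta, r))) ** 2
        for r, yi in zip(rows, y)
    )
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    n, k = len(y), p - 1                  # k = number of predictors
    # Adjusted R2 penalizes each added predictor.
    adjusted = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adjusted


# Made-up data generated from y = 1 + 2*x1 + 3*x2.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [9, 8, 19, 18, 29]
beta, r2, adjusted_r2 = fit_ols(X, y)
```

Adjusted R2 is the natural comparison metric here because plain R2 never decreases when a predictor is added, whereas the adjusted value rises only if the new predictor improves the fit by more than would be expected by chance.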
Multiple linear regression model output for percent of streams that are assessed.
Variables | Estimate | p-Value |
---|---|---|
Intercept | 0.00027 | – |
Total stream length (mi) | −0.00030 | 0.003 |
Median household income (USD) | 0.00097 | 0.025 |
(a) Comparison of mean assessed miles and per capita earnings in each EPA region and (b) their correlation.
DISCUSSION
This study presents TMDL progress across the country and the influence that spatial and socioeconomic factors have on the percent of streams that are assessed, those that are impaired, and those for which a TMDL is complete. The outcomes of this work demonstrate that not all states have implemented their TMDL programs to the same degree. While some of this can be attributed to the difference in state land areas and stream miles that must be assessed, there are other socioeconomic factors that have a similar degree of explanatory power.
To that end, median household income had a significant relationship with the percentage of streams that are assessed within a state. Without financial resources, states may not have the necessary personnel to collect the data that are needed for assessing streams. It could be that this lack of funding inhibits progress and should be addressed by permitting states to calculate TMDLs with alternative and less costly assessment methods, such as using proxies (e.g., impervious cover or stormwater volume) that are correlated with water quality (DeGioia, 2019). Other policy changes to advance TMDL progress could include more direct obligations to limit pollution with discretion and flexibility on how to do so, rather than whether and how much to do so (Stephenson et al., 2022). In addition, there may be an economic case that there are returns to investments to improve water quality. These returns can be quantified through economic models to determine the cost and benefits of water quality improvements; however, uncertainties in these models regarding pollution damages and economic benefits make them difficult to apply (Bosch et al., 2006).
Public participation, which is mandated through the CWA, is an important part of the TMDL development process, as members of the public can provide information regarding impairments, collect water quality data, and review and comment on impairment lists and TMDL drafts. This study shows that education, measured both as the population with college degrees and the population without high school diplomas, is related to the assessment of streams and rivers. Efforts to improve public awareness of water quality issues and their implications through improved communication methods and technologies may improve public participation in various stages of the TMDL process (Quinn et al., 2022). Furthermore, it has been shown that citizen scientist programs not only allow states to collect water quality data but also educate socioeconomically underprivileged communities through information dissemination (Webster & Dennison, 2022).
It is also clear that EPA regions themselves tend to have a similar level of TMDL progress. Some of this can be attributed to geographic similarities as demonstrated in the clustering analysis; however, as Figure 8 indicates, there could be economic factors common to certain regions that influence TMDL progress. That EPA regions have a significant influence on the implementation of programs is not surprising, as the regional offices are charged with enforcing and overseeing state adherence to the CWA. The findings suggest that progress towards assessing streams and completing TMDLs varies depending on the EPA region, with regions that have higher per capita earnings having a larger percentage of their streams assessed. This is aligned with other studies that have found similar differences among EPA regions in implementing elements of the CWA, including the NPDES program (Woods, 2021).
One limitation of this study is that we do not evaluate the methodology by which streams are assessed, which could also vary depending on what resources are available. For example, a robust assessment would have large amounts of data that demonstrate an impairment, while others could rely on a ‘drive by’ assessment (Neilson & Stevens, 2002). Furthermore, some states may list a large number of impairments due to a large amount of data, while others could have impaired streams that are not listed due to a lack of assessment data. As a matter of practical implementation, states with more streams have a greater workload, and therefore, the results indicate that these states have not assessed as many streams as a percentage of their overall stream lengths.
This study demonstrates the variability in TMDL progress across states and what factors may contribute to it; however, how to bridge the gap between states is unclear. As the TMDL programs continue to mature, modeling approaches and technologies may accelerate assessment and TMDL development for waters of the United States. For example, modeling approaches have evolved over time, and different impairment types may require different modeling tools (Quinn et al., 2019b), which can be selected based on the unique technical criteria and management constraints of a watershed (Sridharan et al., 2021). Furthermore, advancements in remote sensing and geospatial analysis can be used to support TMDL assessment and modeling (Quinn et al., 2019a; Sridharan et al., 2022). In addition, there may be ways to further improve the process of developing TMDLs with modelers, stakeholders, and regulatory entities. This is important as in many cases, the TMDL serves as the framework for contextualizing watershed science and regulatory policies toward stakeholders and the general public through collaboration and coordination of watershed management (Slota, 2021).
Finally, the outcomes of this study can be used by water resource decision makers to improve their approach to the TMDL development process mandated by the federal government. For example, for states or regions with higher economic capacities but low education levels, it might be efficient to reallocate resources to educate communities about the need for public participation in the TMDL process. Furthermore, while the TMDL is a federally mandated national requirement, this study demonstrates that progress toward meeting TMDL requirements differs among EPA regions. Therefore, it may be beneficial to investigate how to reallocate resources to EPA regions such that these discrepancies could be alleviated. To that extent, the use of new federal funding methods and regulations could be explored.
CONCLUSIONS
This study evaluated the relationships of socioeconomics and regionalization with TMDL progress and implementation. Outcomes indicate that TMDL progress and impairments had a large degree of variation, some of which could be explained by EPA region, spatial clustering, and socioeconomic variables. To that end, results suggest that the size of a state, the length of total streams, and the economic output are related to the percentage of streams that are assessed within a state. In addition, states largely followed similar patterns based on the EPA region that they were within, indicating that regions play a large role in TMDL progress. The outcomes of this study can be used by water resource decision makers to improve their approach to the TMDL development process mandated by the federal government, including the targeted allocation of resources among and within states. Overall, this study highlights the diversity in the implementation of TMDLs across states and identifies some of the factors that may explain variation in TMDL approaches to date.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.