Application of high-throughput 16S rRNA sequencing to identify fecal contamination sources and to complement the detection of fecal indicator bacteria in rural groundwater

Residents in rural communities across Canada collect potable water from aquifers. Fecal contaminants from sewage and agricultural runoff can penetrate aquifers, posing a public health risk. Standard methods for detecting fecal contamination test for fecal indicator bacteria (FIB), but their presence does not identify the sources of contamination. In contrast, DNA-based diagnostic tools can achieve this important objective. We employed quantitative polymerase chain reaction (qPCR) and high-throughput DNA sequencing to trace fecal contamination sources in Wainfleet, a rural Ontario township that has been under the longest active boil water advisory in Canada due to FIB contamination in groundwater wells. Using traditional methods, we identified FIBs indicating persistent fecal pollution in well waters. We used 16S rRNA sequencing to profile groundwater microbial communities and identified Campylobacteraceae as a fecal contamination DNA marker in septic tank effluents (STEs). We also identified Turicibacter and Gallicola as potential cow and chicken fecal contamination markers, respectively. Using human-specific Bacteroidales markers, we identified leaking septic tanks as the likely primary fecal contamination source in some of Wainfleet's groundwater. Overall, the results support the use of sequencing-based methods to augment traditional water quality testing methods and help end-users assess fecal contamination levels and identify point and non-point pollution sources.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/). doi: 10.2166/wh.2019.295
Paul Naphtali, Mahi M. Mohiuddin, Athanasios Paschos, Herb E. Schellhorn (corresponding author)
Department of Biology, McMaster University, Hamilton, ON, Canada
E-mail: schell@mcmaster.ca
This article has been made Open Access thanks to the generous support of a global network of libraries as part of the Knowledge Unlatched Select initiative.


INTRODUCTION
Fecal pollutants from sewage and agricultural runoff can penetrate decaying groundwater wells and render the well water unsafe to drink (USEPA ). Fecal contaminants may contain waterborne pathogens that transfer into aquatic environments and cause infectious disease (Harwood et al. ). Waterborne pathogen detection remains a challenge because of the diversity and low abundance of pathogens in water. Escherichia coli and Enterococcus, the standard fecal indicator bacteria (FIB), are present in high densities within the intestines of warm-blooded animals. FIB detection therefore acts as a proxy for high fecal contamination levels (Field & Samadpour ). Public officials in rural communities enact boil water advisories upon FIB detection to lower the risk of waterborne disease outbreaks.
While FIBs indicate fecal contamination, the source of contamination and the true abundance of pathogens cannot be identified using FIBs alone. Current practices for monitoring water quality include the use of microbial source tracking (MST) methods that target genetic markers such as the 16S rRNA gene to quantify and source fecal contamination (Field & Samadpour ). The HF183 16S rRNA sequence, belonging to a human Bacteroidales 16S rRNA gene fragment, was the first genetic DNA marker used to detect human fecal contamination in drinking water (Bernhard & Field ).
Quantitative polymerase chain reaction (qPCR) assays are now commonly used to quantify the HF183 marker as an indicator of human fecal pollution (Seurinck et al. ).
Human and animal-associated DNA markers can also be used to measure point and non-point source contamination inputs in freshwater environments (Staley et al. ).
In addition to MST-based approaches, next-generation DNA sequencing is now being used to identify fecal contamination sources and to examine the co-occurrence of fecal and source water bacteria in respective environments (Unno et al. ). Next-generation sequencing-based methods are also used to identify FIBs and pathogens in freshwater reservoirs (Mohiuddin et al. , ). Aquatic microbiota containing taxa belonging to fecal bacteria are more likely to be contaminated through fecal sources (Cao et al. ). A 16S rRNA gene-based sequencing approach can also identify additional human fecal markers, such as Lachnospiraceae, for further characterization with other genetic methods such as oligotyping (McLellan et al. ).
Despite the decreasing cost and increasing usefulness of next-generation sequencing methods for identifying fecal contamination sources, MST protocols have not yet integrated next-generation DNA sequencing as a standard for monitoring Canadian drinking water sources.
Wainfleet, a rural Ontario Township by Lake Erie, is under the longest active boil water advisory in Canada.
A previous monitoring study on fecal contamination in 280 private residential groundwater wells determined that 50% contained detectable FIBs (Niagara Region Report ). The town's boil water advisory provides an opportunity to test next-generation sequencing DNA methods in MST monitoring. Combining next-generation DNA sequencing approaches with traditional MST-based methods can augment current water quality monitoring approaches by incorporating new fecal-specific DNA markers to quantify host-specific contamination.
In this proof-of-principle study, we tested whether a next-generation DNA sequencing approach can be used to trace fecal contamination sources in Wainfleet's private well waters. Using both traditional methods of FIB detection and 16S rRNA amplicon sequencing, we quantified fecal contamination levels and identified likely fecal pollution sources. We also measured the concentration of human Bacteroidales gene markers as a proxy of sewage-based contamination. Information obtained from our analyses can augment traditional methods of water quality monitoring by providing additional information on fecal pollution markers in potable waters.

Study site description
Wainfleet is situated in the southwest portion of the Niagara Region (42.92° N, 79.38° W). Residents obtain potable water using on-site groundwater wells. Many residences are built in low-lying areas close to Lake Erie. Of the 107 residences surveyed in March 2005 that use on-site groundwater wells, 44 had septic tanks that are 20 years old, and 49% of the residences do not comply with current provincial building codes (Niagara Region Report ). Most homeowners install septic tanks to discharge septic tank effluents (STEs) into the underlying soils through tile beds. Many of the plots have an area too small to install functioning septic tanks to current building standards. Besides the potential for leaking, concentrated raw sewage seeps through the underlying bed and into aquifers that supply wells (Niagara Region Report ).

Groundwater tap sample collection
We identified nine test sampling sites (Sites A-K, Figure 1) and 21 wells based on FIB detection in the previous independent study (Niagara Region Report ). We received written consent from the township and identified volunteers based on the sampling site selection process. We collected tap water samples every month from April to November 2015 and grouped the samples based on season (spring, summer, and fall). A total of 48 samples were collected from nine test sites. For each sampling event, town technical staff collected groundwater tap samples from homeowners by filling 500 mL autoclaved plastic bottles (Nalgene). The water samples were then kept on ice, transported to the laboratory, and processed within 6 h of collection.

Septic tank effluent and manure sample collection
STEs were collected in fall 2015 from two septic tanks owned by two homeowners who participated in this study. Three biological replicates for each sewage sample were collected on three consecutive days. Manure samples were collected from manure mounds stored by a concentrated animal feeding operation for chickens, a cow farm, a horse hobby farm, and a pig farm. As with the septic samples, three biological replicates for each manure type were collected on three consecutive days. All samples were transferred into 500 mL autoclaved bottles, kept on ice, and transported to the laboratory. The samples were then stored at -80 °C until further analysis.

FIB detection assay
FIBs in water samples were detected using standard methods (APHA ). To enumerate E. coli and Enterococcus spp., 100-fold serial dilutions were prepared by transferring 1 mL of well water into 100 mL of 1× PBS solution (pH 7.0), up to a 10⁴-fold dilution. One hundred mL of the well water sample and of each PBS-diluted sample was passed through 0.45 μm pore-size, 47-mm-diameter sterile mixed cellulose ester membrane filters (Thermo Fisher Scientific, Burlington, ON, Canada). Each membrane filter was placed on a differential coliform (DC, Oxoid) or mEI agar (BD Difco) plate and incubated at 42 °C for 24 h. After incubation, E. coli and Enterococcus spp. colony forming units (CFUs) were enumerated for filters containing stock and diluted well water samples. To determine the concentration of FIBs (CFU/mL) in water samples, we multiplied the CFU count by its associated dilution factor. Detection of one or more CFU per 100 mL of a drinking water sample was considered positive for FIBs, and the water was deemed unsafe for drinking (Health-Canada ).
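The back-calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the choice to report results per 100 mL (to match the positivity criterion) are assumptions, not part of the published protocol.

```python
def cfu_per_100ml(colony_count, dilution_factor=1.0, filtered_volume_ml=100.0):
    """Scale a plate count back to CFU per 100 mL of undiluted well water.

    colony_count: colonies counted on one membrane filter
    dilution_factor: 1 for undiluted water, 100 for a 100-fold dilution, etc.
    filtered_volume_ml: volume passed through the filter (assumed 100 mL here)
    """
    if colony_count < 0 or dilution_factor < 1 or filtered_volume_ml <= 0:
        raise ValueError("invalid plate count, dilution factor, or volume")
    return colony_count * dilution_factor * (100.0 / filtered_volume_ml)


def is_fib_positive(cfu_100ml):
    """One or more CFU per 100 mL is treated as a positive FIB detection."""
    return cfu_100ml >= 1


# Hypothetical plate: 35 colonies on a filter from a 100-fold dilution.
estimate = cfu_per_100ml(35, dilution_factor=100)
print(estimate, is_fib_positive(estimate))
```

Keeping the dilution factor as an explicit argument makes it easy to check that plates from different dilutions of the same sample give consistent estimates.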

DNA extraction and library generation for sequencing
DNA was extracted from water samples as described earlier (Mohiuddin et al. ). Briefly, 500 mL of each groundwater sample was passed through 0.45-μm pore-size, 47-mm-diameter sterile mixed cellulose ester membrane filters. The filters were then cut into fragments (1 cm² in size) with sterile scissors, and the cut fragments were aseptically transferred with sterile forceps into 1.7 mL microfuge tubes for [...] with a sequence similarity threshold of 97% and a minimum query alignment length of 50% using UCLUST (Edgar ). The resulting OTU table was then rarefied to 5,000 OTU counts per sample before statistical analyses were performed. A potential DNA marker of a human- or animal-specific fecal or manure contaminant was identified if it was detected at high abundance (10% threshold) in one fecal sample type but at low abundance (1% maximum) in all other fecal samples. The relative abundance of the potential markers was then determined in the groundwater samples across the sampling months.
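Rarefying an OTU table to a fixed depth amounts to randomly subsampling reads without replacement. The sketch below illustrates the idea with the Python standard library only. It is not the pipeline used in the study, and the function name, the fixed random seed, and the example counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth=5000, seed=0):
    """Subsample one sample's OTU counts to a fixed depth.

    otu_counts: {otu_id: read_count} for a single sample
    Returns a new {otu_id: read_count} summing to `depth`, or None when the
    sample has fewer than `depth` reads (such samples are usually dropped).
    """
    total = sum(otu_counts.values())
    if total < depth:
        return None
    rng = random.Random(seed)
    # Expand the table to one entry per read, then draw without replacement.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical sample with 10,000 reads spread over three OTUs.
sample = {"OTU_1": 6000, "OTU_2": 3000, "OTU_3": 1000}
rarefied = rarefy(sample)
print(sum(rarefied.values()))
```

Sampling without replacement guarantees that no OTU ends up with more reads than it had originally, which keeps the rarefied table a true subset of the data.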

Detection of FIB in groundwater wells
As a preliminary assessment of fecal contamination levels in Wainfleet's well waters, we examined FIB detection frequency in tap water collected from private wells using Health Canada recommended guidelines (see above under FIB detection assay). Positive E. coli detection rates ranged from 60.0% to 70.0%. Enterococcus detection rates had a large seasonal range, from 10.0% to 81.0%, and were lowest in the spring and highest in the summer (Figure 2). Groundwater samples collected from two residential wells, B and K, also contained much higher mean E. coli and Enterococcus counts than the other groundwater wells (Figure 3).
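Seasonal detection rates of this kind are simply the percentage of samples in each season with at least one CFU per 100 mL. A minimal sketch follows, in which the function name, the data layout, and the example values are all hypothetical:

```python
from collections import defaultdict


def detection_rate_by_season(samples):
    """Return {season: percent of samples with >= 1 CFU per 100 mL}.

    samples: iterable of (season, cfu_per_100ml) pairs
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for season, cfu in samples:
        totals[season] += 1
        if cfu >= 1:
            positives[season] += 1
    return {s: 100.0 * positives[s] / totals[s] for s in totals}


# Hypothetical counts, not data from this study.
example = [
    ("spring", 0), ("spring", 12),
    ("summer", 3), ("summer", 40), ("summer", 0), ("summer", 7),
]
print(detection_rate_by_season(example))
```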

Basic sequencing data
As a first step towards identifying potential fecal contamination sources in the FIB-positive well waters, we profiled microbial communities from 21 groundwater samples, six samples from two STEs, and 12 manure samples from four different animal farms using 16S rRNA sequencing.
Sequence reads obtained from septic tank samples were classified into 972 (±89) OTUs. The number of reads obtained from animal manure and groundwater samples was higher than from septic tank samples, and reads from animal manure and groundwater wells were classified into 2,987 (±315) and 3,101 (±287) OTUs, respectively.

Identification of potential STE and animal manure contamination markers
To identify potential fecal contamination markers in groundwater, we searched the 16S rRNA sequencing data for sewage- and manure-associated bacteria. We defined a host-specific fecal marker as a microbial group detected at high relative abundance (10.0% threshold) in one type of fecal matter but at low abundance (1.0% maximum) in the other fecal sources. In the two STE samples, the mean abundance of Campylobacteraceae was 32.5% (Figure 4). At the genus level, sequences annotated to Sulfurospirillum and Arcobacter were the most abundant members of Campylobacteraceae. In contrast, the average abundance of Campylobacteraceae did not exceed 1% in any of the four animal manure types. Gallicola and Turicibacter were the most abundant genera in chicken and cow manure, comprising 42.2% and 9.4% of 16S rRNA sequences, respectively. In contrast, these markers did not exceed 1.0% abundance in the pig and horse manure and STE samples (Figure 4). No genetic marker was identified in pig and horse manure based on the classification criterion.
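The marker criterion can be expressed as a simple filter over per-source relative abundances. The sketch below is illustrative only: the function name, the inclusive comparisons (at least `high` in exactly one source, at most `low` in every other source), and the placeholder taxa are assumptions rather than the authors' implementation.

```python
def host_specific_markers(abundance, high=10.0, low=1.0):
    """Find taxa abundant in exactly one fecal source and rare in the rest.

    abundance: {source: {taxon: percent relative abundance}}
    Returns {taxon: source} for taxa meeting the criterion.
    """
    taxa = {t for profile in abundance.values() for t in profile}
    markers = {}
    for taxon in sorted(taxa):
        # Sources in which the taxon reaches the high-abundance threshold.
        hosts = [s for s, p in abundance.items() if p.get(taxon, 0.0) >= high]
        # Every remaining source must stay at or below the low limit.
        others_low = all(
            p.get(taxon, 0.0) <= low
            for s, p in abundance.items()
            if s not in hosts
        )
        if len(hosts) == 1 and others_low:
            markers[taxon] = hosts[0]
    return markers


# Placeholder profiles (percent abundance); not values from this study.
profiles = {
    "septic": {"taxon_a": 30.0, "taxon_b": 0.2, "taxon_c": 12.0},
    "chicken": {"taxon_a": 0.5, "taxon_b": 40.0, "taxon_c": 15.0},
    "cow": {"taxon_a": 0.1, "taxon_b": 0.3, "taxon_c": 0.4},
}
print(host_specific_markers(profiles))
```

In this example `taxon_c` is rejected because it is abundant in two sources, which is the situation the low-abundance limit is meant to exclude.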

Abundance of STE and manure-based OTUs in groundwater samples
To identify possible fecal contaminants in well water, we determined the relative abundance of the human- and animal-associated markers that we had identified in the reference fecal samples. Groundwater wells sampled in July, September, and November had the highest relative abundance of STE-based Campylobacteraceae sequences (Figure 5). Neither the chicken nor the cow marker had a relative abundance above 2.0%, and both were below 1.0% in wells sampled in July, August, and November (Figure 5).

Identification of human fecal contamination
To determine whether human-specific fecal contamination was present in the groundwater samples, we conducted qPCR assays of the HF183 human Bacteroidales marker in selected wells with or without FIB detection. We prepared a qPCR standard curve to quantify the human Bacteroidales marker in each run (Figure S1A). Each standard curve was robust for HF183 quantification (Figure S1, available with the online version of this paper). Melt curve analysis of the standards and of all groundwater samples examined after the qPCR assay yielded a single peak at 84 °C (Figure S1B).
The HF183 marker was more abundant in STE samples than in groundwater wells (Figure 6). qPCR of the human Bacteroidales marker in groundwater wells B, F, and K resulted in positive detection, with 30-50 genome copies/100 mL (Figure 6). A subset of groundwater wells without E. coli detection also tested positive for the HF183 marker. Groundwater well A, treated with UV light before sampling, contained more HF183 copies than the other groundwater wells (Figure 6). Groundwater well I had a slightly lower HF183 marker level than B, F, and K.
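A qPCR standard curve is conventionally a straight-line fit of quantification cycle (Cq) against log10 of the known copy number. The slope gives the amplification efficiency, E = 10^(-1/slope) - 1, and the fitted line converts sample Cq values into copy numbers. The sketch below shows that generic calculation; the function names and the dilution series are hypothetical and do not reproduce the study's data.

```python
import math


def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq = slope * log10(copies) + intercept.

    Returns (slope, intercept, r_squared, efficiency), where
    efficiency = 10 ** (-1 / slope) - 1 (1.0 means perfect doubling).
    """
    x = [math.log10(c) for c in copies]
    n = len(x)
    mx, my = sum(x) / n, sum(cq) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in cq)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency


def copies_from_cq(cq_value, slope, intercept):
    """Invert the standard curve to estimate copies in an unknown sample."""
    return 10 ** ((cq_value - intercept) / slope)


# Hypothetical 10-fold dilution series of a marker standard.
std_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
std_cq = [33.0, 29.68, 26.36, 23.04, 19.72]
slope, intercept, r2, eff = fit_standard_curve(std_copies, std_cq)
print(round(slope, 2), round(r2, 3), round(eff, 2))
```

A slope near -3.32 corresponds to an efficiency near 100%, which is the usual benchmark for judging whether a curve is suitable for quantification.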

DISCUSSION
The presence of FIBs in drinking water is a major health concern. Using national guidelines for drinking water and the results of a previous study as a reference, we sought to determine whether high levels of FIBs are still present in Wainfleet's well waters. Our analyses suggest that FIBs are still present in over half of the tested well water sites and, therefore, that drinking water quality may still be a concern in the majority of households within the township. Many of the sampled groundwater wells are located near the shores of Lake Erie, a low-lying region where most of the E. coli and total coliform exceedances were previously observed (Niagara Region Report ). Some well water samples also contained far higher FIB counts than others. Mean E. coli and Enterococcus levels in some groundwater wells were three and two orders of magnitude higher, respectively, than the average FIB counts for the groundwater wells at all other locations. These wells with higher than average FIB counts are located near septic tanks that may leak raw sewage into the aquifer (Niagara Region Report ). Poor well maintenance also facilitates sewage leaching into the groundwater (Howard et al. ). Altogether, chronic FIB contamination in individual groundwater wells such as B and K requires further investigation to confirm potential fecal contamination sources.

Characterization of fecal microbiota and identification of potential fecal markers
The presence of host-associated fecal markers may help to trace fecal contamination sources in complex environmental samples. As a proof-of-principle experiment, we examined whether we could complement FIB detection with 16S rRNA sequencing methods by identifying sewage- and manure-associated markers. We set a maximum abundance limit of 1.0% in non-host sources to ensure the specificity of each marker to its host. In cow manure, we identified Turicibacter spp., a member of the Firmicutes phylum present in high abundance at the genus level, which agrees with previous cow microbiome profiles (Kim et al. ). In chicken manure, Gallicola, also a member of the Firmicutes phylum, was the most abundant genus. Both markers were also detected at 0.1% abundance in other manure types and sewage samples. These genera may act as DNA markers of host-associated fecal contamination in the town's groundwater wells.
In the two STE sites, we identified the Campylobactera- [...] Interestingly, well waters collected in August contained the lowest abundance of the three host-associated markers.
Except for well water site K, E. coli counts did not exceed 100 CFUs/100 mL (raw data), suggesting diffuse fecal contamination among the well water samples collected in August. In other sampling months, particularly in the autumn, Campylobacteraceae sequences were far more abundant. The E. coli detected could therefore be due to septic tanks leaking at individual sites.

Quantification of human-based fecal contamination
To confirm the presence of STE contamination in selected well sites, we used the HF183 marker to quantify human-based contamination in DNA extracted from STE and groundwater samples. We first validated the use of the HF183 marker by preparing standard and melt curves of the HF183 qPCR assay. Although the qPCR assays were conducted with 100 bp fragments to minimize spurious fluorescence, we observed a high R² value and a robust amplification efficiency (E) value, indicating good-quality amplification reactions (Figure S1A). The melt curve analysis also shows that a single product was detected by the qPCR assay (Figure S1B). Furthermore, we detected the HF183 marker in the two STE sites at a 10-fold higher concentration than in groundwater samples. These results validate quantifying the entire HF183 amplicon in DNA extracted from the town's well water as a proxy for STE contamination.
We also detected the HF183 DNA marker in B and K, the groundwater wells containing the highest E. coli CFUs. Curiously, groundwater wells I and F, which did not contain E. coli, were also positive for the HF183 marker. A weak correlation between E. coli counts and HF183 marker concentrations has been reported in residential areas (Nshimyimana et al. ) and in Great Lakes beach sands (Staley et al. ). The weak correlation (Pearson correlation; r = 0.33) between E. coli counts and HF183 marker concentrations indicates that E. coli levels are not a reliable indicator of human fecal contamination in well waters. This is primarily due to factors that affect the viability of E. coli in well water and to contamination of well waters through sources other than humans. Within Wainfleet, groundwater wells B and K may receive E. coli from Lake Erie where surface water flows into the aquifer. However, these groundwater wells also receive human contamination loads that may come from leaking septic tanks, as evidenced by the positive HF183 detection in both the septic tanks and the wells.
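For reference, the Pearson coefficient cited above is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch follows; the function name and the paired values are hypothetical and are not the study's measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements per well (E. coli CFU, marker copies).
ecoli_cfu = [0, 0, 5, 80, 1200]
marker_copies = [45, 0, 10, 30, 50]
print(round(pearson_r(ecoli_cfu, marker_copies), 2))
```

Values near zero indicate that culturable counts carry little information about marker concentration, which is the pattern the text describes.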
Interestingly, the sample collected from groundwater well A contained almost five times the HF183 marker concentration of groundwater wells B and K. Well A's homeowner installed a UV treatment system in the residence to inactivate fecal microbes. While E. coli and Enterococcus were absent from the UV-treated well water, the HF183 marker was still present. This could be due to DNA from Bacteroidales killed or inactivated by the UV treatment, which was still amplified during the qPCR assay. Other groundwater wells, such as G and J, did not contain a detectable HF183 marker, which argues against STE contamination in those wells. Culturable FIBs were also absent in G and J, corroborating the absence of STE-based contamination.

CONCLUSIONS
A prior study found extensive FIB contamination throughout a rural community (Niagara Region Report ). In this study, we complemented culture-based methods with 16S sequencing and qPCR to re-assess fecal contamination levels and, further, to identify potential contamination sources in Wainfleet. We found FIB contamination in the well waters we tested, with E. coli counts as high as 10⁶ CFU/mL. In addition, the identification of STE DNA markers belonging to Campylobacteraceae and human Bacteroidales is consistent with the idea that human waste may impact residential groundwater wells. The low abundance (and absence) of animal manure-associated DNA markers (Turicibacter in cow manure and Gallicola in chicken manure) reduces the possibility that the observed contamination is due to animal sources. These results indicate that profiling microbial communities using 16S rRNA sequencing can augment culture-based methods in contamination analysis studies. The use of next-generation sequencing methods can specifically facilitate the assessment of groundwater quality by detecting host-associated markers and quantifying the relative contributions of likely fecal sources to groundwater contamination.
To trace fecal contamination sources in water more robustly, amplicon sequencing at higher depths (more reads per sample) may be useful for identifying novel or rare taxa. Shotgun metagenomic sequencing may also be used and, at a higher depth, may facilitate the identification of FIB at the species level. While deeper sequencing may facilitate species identification, such as of the Campylobacter spp. that comprise STEs and of possible pig- and horse-associated markers within Clostridium and other genera, the cost of such in-depth analyses is fairly high. Highly focused DNA sampling programs that target well water sites with high FIB levels may allow detailed identification of contamination inputs, including potential pathogens that, unlike E. coli, are difficult to culture.