With increasing stress on our water resources and recent waterborne disease outbreaks, understanding the epidemiology of waterborne pathogens is crucial for building surveillance systems. The purpose of this study was to explore techniques for describing microbial water quality in rural drinking water wells, based on spatiotemporal analysis, time series analysis and relative risk mapping. Test results for Escherichia coli and coliforms from private and small public well water samples, collected between 2004 and 2012 in Alberta, Canada, were used for the analysis. Overall, 14.6 and 1.5% of the wells were total coliform- and E. coli-positive, respectively. Private well samples were more often total coliform- or E. coli-positive compared with untreated public well samples. Using relative risk mapping we were able to identify areas of higher risk for bacterial contamination of groundwater in the province not previously identified. Incorporation of time series analysis demonstrated peak contamination occurring for E. coli in July and a later peak for total coliforms in September, suggesting a temporal dissociation between these indicators in terms of groundwater quality, and highlighting the potential need to increase monitoring during certain periods of the year.
INTRODUCTION
Surveillance of water for microbiological pathogens has traditionally involved the use of indicator organisms (Standridge 2008; World Health Organization 2011). Total coliforms and Escherichia coli have been used as indicators of water quality worldwide (Gleeson & Gray 1996; World Health Organization 2011). The World Health Organization recommends E. coli as an ‘essential parameter’ of minimum water monitoring (World Health Organization 2011). Protection of drinking water requires a multi-barrier approach, including monitoring and management, legislation and guidelines, empowering and informing the public and research for new technological solutions (Federal-Provincial-Territorial Committee on Drinking Water and CCME Water Quality Task Group 2004). Not everyone is subject to drinking water legislation; people living in rural areas often depend on groundwater, most often untreated, for their drinking water (Summers 2010), potentially putting them at greater risk for waterborne illness than their urban counterparts (Galanis et al. 2014).
In Canada, regulations regarding drinking water are overseen by the provincial governments, and thus vary by province, especially for small public and private systems. For instance, in British Columbia small systems with two or more connections fall under regulations, but in Quebec systems that serve 20 individuals or less are not regulated (Cook et al. 2013). In the United States, the Environmental Protection Agency (EPA) regulates public drinking water systems, which are defined as those systems serving 15 connections or 25 individuals (United States Environmental Protection Agency 2015). Private drinking water systems (accessed by approximately 15% of the US population) are not regulated by the EPA (United States Environmental Protection Agency 2012). Consequently, many people in North America and around the world consume groundwater from private wells for which public health is not protected through legislation. Those consuming groundwater without regular testing may be at risk for waterborne disease.
The testing provided by provincial or state laboratories can be used as a foundation of a surveillance plan for microbial water quality, but baseline levels including seasonality and trends need to be established for comparison to future levels. In addition, it is important to understand the current spatial distributions of contamination in order to effectively interpret potential outbreak data (Hay et al. 2013) and determine how future climatic changes may alter the distribution of waterborne pathogen risks and outbreaks (Bezirtzoglou et al. 2011; Galway et al. 2015). Passive data collection, often referred to as passive surveillance, is an economically advantageous method of sampling a large population, or developing a large dataset over a number of years, where active collection may not be feasible or affordable. Although passive data collection can have its drawbacks, such as self-selection bias or incomplete sampling, it is reported to have excellent sensitivity when the dataset is large enough, even if disease prevalence is low (Craighead et al. 2015).
With the advance of more user-friendly geographical information systems in the late 1990s, spatial analysis of epidemiological data has become a key tool for visualizing disease processes spatially, tracing the sources of disease and identifying areas with greater risk of disease (Stevenson et al. 2008). Spatial analysis methods in epidemiology include simple spatial visualization of health indicator patterns, local and global disease/pathogen cluster detection methods, spatial interpolation, spatial risk assessment and regression models which incorporate spatial dependency (Stevenson et al. 2008). These methods have been applied to water contamination research worldwide. The city of Puri, India, used point sampling of water wells and interpolation to create contour maps of groundwater levels in pre- and post-monsoon conditions and identify the seasonal patterns and distribution of bacterial and chemical contaminants (Vijay et al. 2011). The results allowed the authors to make several suggestions to reduce future water contamination. In Canada, a 2013 study of Ontario private well water used a spatial scan statistic methodology employing a circular window to identify spatial clusters of E. coli-positive wells (Krolik et al. 2013). Greater Vancouver, British Columbia used a number of variables including intrinsic aquifer susceptibility, well location records, digital elevation models, land use data and known groundwater contamination sites to create a risk map for water sources in the area. This project also produced a relative risk map, but this map was based on potential risk factors, not on actual contamination outcomes, and focused on a much smaller geographical area (Simpson et al. 2014).
Relative risk maps, also referred to as excess rate maps, are used to demonstrate areas of higher or lower risk for disease (Anselin et al. 2010). Using the overall mean rate of disease for a large region, an expected rate for smaller regions within the large region, such as counties, can be calculated based on the population in each county. The ratio of observed to expected cases provides a measure of relative risk in each county compared to neighbouring counties (Anselin et al. 2010). This methodology has been used to identify areas at risk for gastrointestinal illness in Northern Canada (Pardhan-Ali et al. 2012), and Cryptosporidium spp. contamination of surface water in Ireland (Samadder et al. 2010). Empirical Bayesian smoothing corrects raw rates in geographical areas with small populations, where small denominators can otherwise produce misleading rates (Owusu-Edusei & Owens 2009).
Time series analyses are techniques often used in epidemiology, as well as a number of other disciplines, not only to track trends over time, but also to model future outcomes based on current and past occurrences (Shumway & Stoffer 2006). Time series analysis was recently used to model the impact of hydroclimatic variables on waterborne gastrointestinal illness in British Columbia, Canada (Galway et al. 2015).
The objectives of this study were to investigate the use of relative risk mapping and time series analysis to establish baseline levels of contamination of rural groundwater with E. coli and total coliforms in the province of Alberta, Canada as a case study, and to explore the use of passive collection of voluntary water sample submissions as a tool for continued water surveillance activities. Specifically, we aimed to: (1) use spatiotemporal techniques to detect patterns in passively collected water contamination data; (2) test if patterns of contamination were spatially and temporally structured, and to what extent; (3) describe methodology for determining baseline levels of contamination and seasonality, as well as areas of greater or lower risk using spatiotemporal analysis and relative risk mapping techniques.
METHODS
Data sources
The study area included the entire province of Alberta, which is over 660,000 square kilometres and is located in Western Canada. The southern border of Alberta follows the 49th parallel and the northern border follows the 60th parallel. The eastern border with the province of Saskatchewan is delineated by the 110th meridian west, and the western border with the province of British Columbia is delineated by the 120th meridian from the north down to the continental divide, and then the border trends eastward following the divide. The province has a population of over four million people, representing 12% of the population of Canada (Statistics Canada 2014). Water submission data included 179,623 test results for E. coli and total coliforms for the years 2004–2012 for Alberta, Canada. Submissions were from rural well water samples (both small public systems and private wells). Testing was performed by the Alberta Provincial Laboratory for Public Health (ProvLab) (Calgary, AB, Canada) and accessed using the Data Integration for Alberta Laboratories (DIAL) tool, a web-based surveillance tool developed by ProvLab. ProvLab is an ISO 17025 accredited laboratory for analysis of microbiological water.
Re-samples, samples collected for quantitative analysis after a positive initial test, were not included in this study. All water samples were collected by the well owners, and inclusion of name, address and location information accompanying the samples was left to the discretion of the person providing the samples. This information was handwritten by the sample submitter on standard provincial water requisition forms at the time of submission and subsequently entered into a computer system by the receiving technologist. If the sample was positive, the local public health agency was contacted by ProvLab and it was their responsibility to inform the well owner/overseer and provide information concerning decontamination and further testing as per Alberta Health and Wellness's Environmental Public Health Field Manual (Technical Advisory Committee on Safe Drinking Water 2004).
Water testing
Water was tested using a presence/absence enzyme substrate test for E. coli/total coliforms (Colilert® IDEXX, Westbrook, ME, USA) according to the manufacturer's protocol (IDEXX Laboratories 2013). One hundred mL specimens of water were incubated with the Colilert® product for 24 hours at 35 ± 0.5 °C. Water samples collected >24 hours before delivery to the laboratory were not analysed (n = 429, 0.2%). A further 67 (0.04%) samples lacked results due to submission or technical errors (insufficient volume, poor specimen quality or laboratory error). After incubation, the water sample was examined under natural light for a colour change from clear to yellow caused by metabolization of ortho-nitrophenyl-β-galactoside (ONPG) by β-galactosidase, indicating a total coliform-positive test. If a colour change occurred, the sample was examined under ultraviolet light for fluorescence caused by metabolization of 4-methylumbelliferyl-β-D-glucuronide (MUG) by β-glucuronidase, indicating an E. coli-positive test with a sensitivity of 1 colony forming unit per 100 mL (IDEXX Laboratories 2013). Facilities that had positive tests were encouraged to re-submit samples for quantitative analysis (re-submissions following positive tests were not included in this study). Homeowners or private facility water operators are encouraged to submit multiple samples over the course of a year, aligning with the recommendations from Health Canada (Health Canada 2013).
Geolocation
Geographical coordinates of the submission data were derived from the Alberta Township Survey (ATS) System, a system for locating parcels of land in Alberta (Alberta Environment and Parks 2010). Each parcel is located by the closest meridian on its eastern side (the 4th, 5th or 6th), as well as its range, township, section and quarter section. This information allows a parcel of land to be georeferenced to a resolution of 1 quarter section (∼800 × 800 m or 0.65 km²) (Alberta Environment and Parks 2010). In addition, Alberta Health Services (the government administrative body for health in the Province of Alberta) has divided the province into nine geographical health regions and this information was also used for mapping purposes.
Frequency of contamination, overall and by water source
Samples were categorized as being submitted by a private landowner or by a public unregulated system. Public systems included in this study were defined according to Alberta Environment and Parks as those that are not regulated by this ministry, and included non-transient systems with <15 connections and transient systems such as campgrounds and community halls (Alberta Environment 2009). In addition, samples were classified as having complete or non-existent/incomplete/invalid ATS geolocation data. Differences in contamination occurrence with E. coli and total coliforms between public and private wells while controlling for geolocation were examined using Cochran–Mantel–Haenszel (CMH) analysis (Fidalgo & Madeira 2008) using WinEpiscope 2.0 (Thrusfield et al. 2001). Frequency of occurrence of E. coli and/or total coliforms is reported rather than prevalence, as the denominator does not represent all wells at risk in the province, but rather all wells tested by ProvLab from voluntary submissions.
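As an illustration of the pooled estimate used in this comparison, the following minimal Python sketch computes stratum-specific odds ratios and the Mantel–Haenszel pooled odds ratio for private versus public wells across two geolocation strata. The counts are hypothetical and are not the study data; the study itself used WinEpiscope 2.0, and the test of homogeneity across strata is not implemented here.

```python
from typing import List, Tuple

# One 2x2 table per stratum: (a, b, c, d) where
#   a = private wells positive, b = private wells negative,
#   c = public wells positive,  d = public wells negative.
Stratum = Tuple[int, int, int, int]


def stratum_odds_ratio(table: Stratum) -> float:
    """Odds ratio (private vs. public) within a single stratum."""
    a, b, c, d = table
    return (a * d) / (b * c)


def mantel_haenszel_or(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled odds ratio: sum(a*d/n) / sum(b*c/n)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical counts for illustration only (not the study data).
strata = [
    (300, 700, 100, 900),  # geolocated samples
    (200, 800, 80, 920),   # non-geolocated samples
]
stratum_ors = [stratum_odds_ratio(t) for t in strata]
pooled_or = mantel_haenszel_or(strata)
print("Stratum odds ratios:", stratum_ors)
print("Pooled odds ratio:", pooled_or)
```

The pooled value is only meaningful when the stratum-specific odds ratios are similar; when they differ, as a homogeneity test may indicate, the stratum-specific values are reported instead.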
Repeat testing
Among those samples geolocated by ATS to the quarter section, the occurrence of repeated tests (multiple tests in the same quarter section in a single year) was examined, overall and according to water source, public and private. The distribution of water tests per year for public and private wells was examined using the Mann–Whitney U test using R (version 2.14.0, R Development Core Team 2011).
In order to mitigate the bias presented by the repeated testing of these data for the remainder of the analysis, the tests were aggregated by quarter section. One positive test within the quarter section during a month was counted as a positive outcome for that quarter section month.
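The aggregation rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the record layout and the quarter section identifiers (`QS-A`, `QS-B`) are hypothetical and do not reflect the structure of the laboratory database.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (quarter_section_id, year, month, is_positive)
Record = Tuple[str, int, int, bool]
Key = Tuple[str, int, int]


def aggregate_by_quarter_section_month(records: Iterable[Record]) -> Dict[Key, bool]:
    """Collapse tests so that any positive test marks the whole
    quarter section month as positive."""
    outcome: Dict[Key, bool] = defaultdict(bool)
    for qs_id, year, month, positive in records:
        outcome[(qs_id, year, month)] |= positive
    return dict(outcome)


# Hypothetical test results for illustration only.
records = [
    ("QS-A", 2005, 7, False),
    ("QS-A", 2005, 7, True),
    ("QS-A", 2005, 7, False),
    ("QS-A", 2005, 8, False),
    ("QS-B", 2005, 7, False),
]

aggregated = aggregate_by_quarter_section_month(records)
sample_level = sum(r[3] for r in records) / len(records)
aggregated_level = sum(aggregated.values()) / len(aggregated)
print("Proportion positive, sample level:", sample_level)
print("Proportion positive, aggregated:", aggregated_level)
```

In this toy example the aggregated proportion is higher than the sample-level proportion, because repeated negative tests in the same quarter section month no longer dilute a single positive.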
Spatiotemporal analysis
Table 1. Test of seasonality and peak dates for E. coli- and total coliform-positive wells, by region

Region | χ² | df | p-value | Peak date |
---|---|---|---|---|
Escherichia coli | | | | |
South | 473.45 | 2 | <0.001 | July 23 |
Central | 261.25 | 2 | <0.001 | July 21 |
North | 94.87 | 2 | <0.001 | August 6 |
Total coliforms | | | | |
South | 641.34 | 2 | <0.001 | August 18 |
Central | 1,005.28 | 2 | <0.001 | September 6 |
North | 358.79 | 2 | <0.001 | September 12 |
When data are aggregated, areas with small numbers of observations can produce misleading results (Cressie 1995). We used empirical Bayesian smoothing to derive confidence from areas with larger populations and adjust observations in areas with smaller populations towards the global mean. Empirical Bayesian smoothing of the crude proportions of E. coli and total coliform contamination in the polygons was performed using GeoDa version 1.4.0 (Anselin et al. 2006). The output of this smoothing was used to produce a relative risk map by calculating observed over expected proportions of positives for each polygon (Figure 4). The number used for the expected proportion of positives was the per cent positive for the entire province (21.4% for total coliforms and 2.4% for E. coli). The output of this calculation was then used to detect areas of higher and lower risk using distance-weighted interpolation, which provided an interpolated relative risk for polygons where no data were available.
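The smoothing and relative risk steps can be illustrated with a short Python sketch. It assumes a global method-of-moments empirical Bayes estimator, in which each polygon's crude proportion is shrunk towards the overall proportion with a weight that grows with the number of observations in the polygon. The function name and the counts are hypothetical, and the sketch is not a reproduction of the GeoDa implementation used in the study.

```python
from typing import List, Sequence


def eb_smooth(positives: Sequence[int], totals: Sequence[int]) -> List[float]:
    """Global empirical Bayes smoothing of proportions (method of moments).

    Assumes every polygon has at least one observation (totals > 0).
    """
    n_areas = len(totals)
    total_n = sum(totals)
    prior_mean = sum(positives) / total_n
    raw = [o / p for o, p in zip(positives, totals)]

    # Method-of-moments estimate of the prior variance, floored at zero.
    weighted_var = sum(
        p * (r - prior_mean) ** 2 for p, r in zip(totals, raw)
    ) / total_n
    prior_var = max(weighted_var - prior_mean / (total_n / n_areas), 0.0)

    smoothed = []
    for r, p in zip(raw, totals):
        denom = prior_var + prior_mean / p
        weight = prior_var / denom if denom > 0 else 0.0
        # Small polygons get a small weight and move towards the prior mean.
        smoothed.append(weight * r + (1.0 - weight) * prior_mean)
    return smoothed


# Hypothetical polygon counts for illustration only (not the study data).
positives = [1, 30, 10, 45]
totals = [2, 200, 100, 150]

overall = sum(positives) / sum(totals)          # expected proportion
raw_props = [o / p for o, p in zip(positives, totals)]
smoothed = eb_smooth(positives, totals)
relative_risk = [s / overall for s in smoothed]  # observed over expected

for raw_p, s, rr in zip(raw_props, smoothed, relative_risk):
    print(f"raw={raw_p:.3f} smoothed={s:.3f} relative risk={rr:.2f}")
```

In this example the first polygon, with only two observations, is pulled strongly towards the overall proportion, whereas polygons with many observations stay close to their crude values.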
RESULTS
Geolocation
A survey method used in Western Canada, the ATS System, allows estimates of geographical coordinates. ATS data were provided by 72.6% (n = 130,366) of the water sample submitters. Tests having incomplete ATS data (5.8%, n = 10,363) as well as samples with complete but invalid ATS data (1.3%, n = 2,406) were excluded, resulting in 65.4% (n = 117,597) geolocated samples. In addition to geolocation, health region location of the water sample submitter was provided for 98.9% (n = 177,618) of the samples. The ATS coordinates allow for a parcel of land to be georeferenced to a resolution of 1 quarter section (∼800 × 800 m or 0.65 km²) (Alberta Environment and Parks 2010).
Frequency of contamination, overall and by water source
Overall, including repeated samples, 14.6 and 1.5% of the well samples were total coliform- and E. coli-positive, respectively. In the data aggregated by quarter section and month, the frequency of total coliform- and E. coli-positive results was 21.4 and 2.4%, respectively.
Tests were evenly divided between private and public well water sources (49.9 and 50.1%, respectively). A larger proportion of private well water samples (81.1%, n = 72,603) was geolocated compared with public samples (50.0%, n = 44,994). The difference in the frequency of total coliform-positive wells between public and private wells was not the same for geolocated and non-geolocated wells (Breslow–Day = 76.099, df = 1, p < 0.001), so geolocated and non-geolocated wells are presented separately. Private geolocated wells had 3.37 times (95% CI: 3.24–3.51) higher odds of being total coliform-positive compared to public geolocated wells, while non-geolocated private wells had 2.49 times (95% CI: 2.35–2.63) higher odds compared to public non-geolocated wells. For E. coli contamination, the difference between public and private wells did not vary by geolocation status (Breslow–Day = 1.233, df = 1, p = 0.362), so the CMH pooled odds ratio was used. A private well had 5.21 times (95% CI: 4.66–5.82) higher odds of being E. coli-positive than a public well, irrespective of geolocation.
Repeat testing
Public wells were more often repeatedly tested with a median of two times per year (Interquartile Range (IQR) 1,4) compared to private wells that were tested a median of one time per year (IQR 1,1) (W = 126,112,944, p < 0.001). The maximum number of repeats for a single quarter section per year for public wells was 399 and for private wells was 136.
Spatiotemporal analysis
The number of samples submitted for water quality testing varied over the week (χ² = 170,853, df = 6, p < 0.001). Samples were submitted most frequently on Tuesdays (29.5%) and Wednesdays (38.5%).
The STL decomposition demonstrated a peak in both E. coli- and total coliform-positive wells in 2005, and a second lower spike for total coliforms in 2009 (Figures 1 and 2). The time series plot also demonstrated a peak in 2005 for both E. coli and total coliforms, but did not indicate a second peak for total coliforms. There was a seasonal spike in both E. coli (χ² = 1,224, df = 2, p < 0.001) and total coliforms (χ² = 3,486, df = 2, p < 0.001) in the summer and fall months, respectively, with peak dates for all years combined (2004–2012) of August 25 for total coliforms and July 24 for E. coli using Edwards' test of seasonality.
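For readers unfamiliar with this test of seasonality, the sketch below shows one common formulation in Python: monthly counts are placed at the midpoints of 12 equal sectors of a circle, weighted by their square roots, and the distance d of the centre of gravity from the origin gives the statistic 8Nd², which is referred to a χ² distribution with 2 degrees of freedom. The monthly counts are hypothetical, the equal-length-month and sector-midpoint conventions are assumptions, and the software actually used in the study is not stated in the text.

```python
import math
from typing import Sequence, Tuple


def edwards_test(monthly_counts: Sequence[int]) -> Tuple[float, float, float]:
    """Return (chi-square statistic, p-value, peak angle in degrees)."""
    k = len(monthly_counts)
    n = sum(monthly_counts)
    weights = [math.sqrt(c) for c in monthly_counts]
    w_sum = sum(weights)
    # Angle at the midpoint of each sector (months treated as equal length).
    angles = [2 * math.pi * (i + 0.5) / k for i in range(k)]
    x_bar = sum(w * math.cos(a) for w, a in zip(weights, angles)) / w_sum
    y_bar = sum(w * math.sin(a) for w, a in zip(weights, angles)) / w_sum
    d = math.hypot(x_bar, y_bar)
    chi2 = 8 * n * d ** 2
    # Survival function of a chi-square distribution with 2 df.
    p_value = math.exp(-chi2 / 2)
    peak_deg = math.degrees(math.atan2(y_bar, x_bar)) % 360
    return chi2, p_value, peak_deg


def peak_day_of_year(peak_deg: float) -> float:
    """Convert the peak angle to an approximate day of a 365-day year."""
    return peak_deg / 360 * 365


# Hypothetical monthly counts of positive wells (January to December).
seasonal_counts = [10, 10, 10, 12, 16, 20, 24, 20, 16, 12, 10, 10]
uniform_counts = [15] * 12

chi2_seasonal, p_seasonal, peak_seasonal = edwards_test(seasonal_counts)
chi2_uniform, p_uniform, _ = edwards_test(uniform_counts)
print("Seasonal series:", chi2_seasonal, p_seasonal, peak_day_of_year(peak_seasonal))
print("Uniform series:", chi2_uniform, p_uniform)
```

Because the hypothetical seasonal series is symmetric about July, the estimated peak falls at the midpoint of the July sector, while a perfectly uniform series gives a statistic of essentially zero.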
Using the annual trends in the data aggregated by quarter section and month, the 2005 peaks for both total coliforms and E. coli were primarily limited to the southern region of Alberta. All three regions had significant seasonality by Edwards' test for both total coliforms and E. coli (Table 1). Peak dates (averaged across all years) were similar for the south and central regions of the province, with E. coli and total coliform peaks on July 23 and August 18 for south, and July 21 and September 6 for central Alberta, respectively. The northern region of the province reflected a later seasonal pattern with peaks on August 6 and September 12, respectively.
Using all data (not aggregated by quarter section/month) and examining frequency maps for E. coli and total coliforms by health region, the maps revealed a south to north gradient with higher rates of both E. coli- and total coliform-positive wells in the southern part of the province, moderate rates in the central portion of the province and lower rates in the north (Figure 3).
Using the quarter section/month aggregated data, an area of high relative risk of E. coli contamination (Figure 4) was found in the south (ranging from 2.4 to 3.4) and three areas of higher relative risk were found in the north (one at 1.6–2.1, one at 1.7–2.3 and one at 1.6–3.2). The total coliforms relative risk map had a more uniform appearance with higher risk areas in the same locations as the E. coli relative risk map, but the risk levels were lower with a maximum relative risk of 1.6 (Figure 4).
DISCUSSION
Overall, use of these techniques to assess routine test data from across a broad region provides a basis for a framework for routine, passively collected surveillance data, giving insight into seasonality, temporality and geographically located relative risk. Use of the empirical Bayesian smoothing technique with the relative risk map allowed visualization of high-risk areas that were not seen with other methods.
The time series analyses provided a visual representation of the baseline levels of contamination and departures from the baseline, and Edwards' test provided a statistical means of testing seasonality. The seasonality in E. coli-positive wells corresponds with the seasonal trend seen in human cases of E. coli O157:H7 (July peak) (Michel et al. 1999) and in prevalence of E. coli O157 in rectally collected faecal samples from cattle (peaks in spring and late summer) (Chapman et al. 1997). The STL decomposition demonstrates a peak in 2005 for E. coli and total coliform contamination. This corresponds with a flooding event that occurred in Alberta in 2005. Visually, the time series by region shows that the peak in 2005 was restricted to the southern part of the province, which is where the flooding occurred. We were unable to determine the cause of the second peak in 2009 in total coliform contamination revealed by the STL decomposition.
Contamination maps of E. coli and total coliforms aggregated by Alberta Health Services health region indicated areas of high contamination in the southern health regions 1 and 2, moderate rates in the centre of the province and low rates in the north (Figure 3). This is corroborated by earlier studies: spatial clusters of E. coli O157 cases in humans have been identified in the same general area of the province as one of the areas this study has identified as high risk for water contamination (Pearl et al. 2006) and incidence rates of human cases of cryptosporidiosis and campylobacteriosis are consistently higher for the south health zone than for Alberta overall (Government of Alberta 2015).
Relative risk maps created from point data aggregated by quarter section and month with empirical Bayesian smoothing applied identified the same areas of risk in the south of Alberta as the sample-level contamination maps (Figure 3), but also identified three areas of higher risk in the north-western part of the province (close to the cities of Grande Prairie, Peace River and High Level; Figure 4). The elevated risk in these regions may be due to geographical features such as type of soil or aquifer, precipitation patterns, well depths or land use patterns, as well as socio-economic factors. Distance-weighted interpolation allowed us to interpolate relative risk to geographical regions without data. Relative risk maps will be useful as a guide for further investigations into identifying hot spots of microbial contamination in groundwater and as a geographic baseline for future surveillance.
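To make the interpolation step concrete, the following Python sketch estimates the relative risk at a location without data as an inverse distance-weighted average of the values at locations with data. The coordinates, relative risk values and the power parameter of 2 are hypothetical choices for illustration; the settings used to produce the maps in this study are not given in the text.

```python
import math
from typing import Sequence, Tuple

# Hypothetical known point: (x, y, relative_risk)
KnownPoint = Tuple[float, float, float]


def idw_interpolate(
    known: Sequence[KnownPoint], x: float, y: float, power: float = 2.0
) -> float:
    """Inverse distance-weighted estimate at (x, y)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for kx, ky, value in known:
        distance = math.hypot(x - kx, y - ky)
        if distance == 0.0:
            # The target coincides with a known point: use its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one known point is required")
    return weighted_sum / weight_total


# Hypothetical polygon centroids with smoothed relative risks.
known_points = [(0.0, 0.0, 1.0), (10.0, 0.0, 3.0)]

at_known = idw_interpolate(known_points, 0.0, 0.0)
at_midpoint = idw_interpolate(known_points, 5.0, 0.0)
near_first = idw_interpolate(known_points, 2.0, 0.0)
print(at_known, at_midpoint, near_first)
```

Interpolated values always lie between the smallest and largest known values, so this approach fills gaps in the map but cannot reveal risk extremes in areas that were never sampled.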
Although this study uses data collected for other purposes, we suggest this form of surveillance with passively acquired data is generalizable to the larger population of wells across the province, understanding there may be some bias in the sampling frame. In addition, using passively collected data increased the power of the study due to the large size of the database (n = 179,623). The overall per cent of E. coli- and total coliform-positive wells in this study, 1.5 and 14.6%, respectively, is lower than that reported in studies from other regions of Canada and the United States (Goss et al. 1998; Borchardt et al. 2003; Hetcher-Aguila 2005). In Ontario, Canada, between 17 and 24% of purposively selected wells on agricultural land were E. coli-positive at least once in a two-year period, with a higher prevalence of positives in summer (Goss et al. 1998). In Wisconsin, 28 and 2% of 50 wells sampled with a focus on septic field density were total coliform- and E. coli-positive, respectively, in a longitudinal study over one year (Borchardt et al. 2003). Of 24 untreated public supply wells and 13 private residential wells in the Chemung River Basin, New York, USA, sampled in the summer of 2003, 32% were total coliform-positive and 16% E. coli-positive. Site selection for this study was based on selecting sites with greater vulnerability to contamination as well as good representation of the geographical extent of the study (Hetcher-Aguila 2005). A total of 22% of UK private water sources, primarily groundwater, were positive for total coliforms, faecal coliforms or faecal streptococci (Fewtrell et al. 2007). The purposive nature of the sample selection in these studies alone, selecting for areas more likely to be contaminated (Goss et al. 1998; Borchardt et al. 2003; Hetcher-Aguila 2005), could account for the higher levels of contamination in other studies. However, lower levels of contamination in our study may also be due to bias in our sampling.
Notably, most wells examined were voluntarily sampled by their owners and were not randomly selected from the population of wells at risk. A survey study in Alberta indicated that voluntary sampling of private wells represented only 11% of wells in service (Summers 2010). Specifically, bias would occur if people who provided samples were more conscientious in their well maintenance compared with people who did not provide samples. Alternatively, our sample would be biased in the opposite direction if people submitted water samples because of suspicions about water contamination. The impact of the voluntary bias could be better understood using a study which included both a survey portion to examine individual water testing habits coupled with a water test to indicate whether or not those inclined to test voluntarily are more or less likely to have contamination. This would allow a better understanding of the feasibility of using voluntary samples in a surveillance system. However, similar to our findings, a study of 816 wells in 2001 in Alberta demonstrated a total coliform prevalence of 13.8% and a faecal coliform prevalence of 3.1% (no E. coli-specific testing was performed in this study, and the study design was based on convenience sampling with mandatory inclusion of sites in each of 64 municipalities) (Fitzgerald et al. 2001).
Unlike other study designs where sampling frequency is mandated, in our study, water samples from the same wells could be submitted more than once per year (repeated testing). Repeated testing was important to examine because contamination with both total coliforms and faecal coliforms can be sporadic (Oliphant et al. 2002). A positive test in a well that has only been tested once in a year may have different implications than a well that has been tested 100 times in a year with one positive finding. It is not possible from our data to determine if a positive test from wells that have only been tested once represents a sporadic contamination event, or a constant problem. As using the proportion of positive tests as the statistical unit would make a single positive test over the course of a year's worth of testing seem unimportant, we decided to aggregate to quarter section and month. This has the effect of potentially biasing the outcome towards higher positivity, but from a public health perspective, treating the positives under all conditions as serious is not unreasonable. In addition, the data may be misleading in that test submissions were aggregated by quarter section and multiple tests in a year may have been from one well tested multiple times or several wells tested one or more times. High numbers of tests for a single quarter section were likely occurring where there were multiple wells on the quarter section. It is impossible to calculate what bias this may have introduced into the results. Geolocation using the ATS system allowed us to use the large database that we had, but was not ideal. Global positioning system (GPS) coordinates would have allowed better resolution for spatial analysis, and better transferability to other regions.
A portion of the population may be at risk but is not testing their water on a regular basis or at all. Summers (2010) previously identified that few private well owners were testing annually, and identified the most frequent reason for not testing was feeling there was no need to test. Regulated annual routine testing of private and small public system well water may be of benefit to Albertans, and our data supports the recommendations outlined in the Guidance on Waterborne Bacterial Pathogens by Health Canada (2013), where private and semi-public drinking water systems should be tested two or three times per year, with a focus on times when contamination is more likely, i.e., spring and summer.
Determining which samples were repeat tests was a difficult task on such a large database, as names were not entered consistently. For instance, J. Smith and John Smith could represent the same person. A fuzzy lookup function (K2 Enterprises 2012) could potentially identify inconsistent name entries, but this approach would not identify cases where multiple samples from the same well were submitted by family members or friends with different names. To address this problem, we aggregated all geolocated samples in the same quarter section, and used quarter sections as our statistical unit.
The use of CMH allowed examination of the differences in contamination between public and private wells while controlling for geolocation. Although total coliform and E. coli contamination was higher for private wells than public wells, private well overseers tested their wells less frequently. The majority of small public water systems have no regulations for testing, with the exception of campgrounds, which require water testing just prior to opening in the spring (Province of Alberta 2004), but the guidelines suggest more frequent testing than for private systems, depending on the water source. For instance, it is recommended that treated surface water and treated groundwater under the influence of surface water be tested weekly for communal/public supplies (Technical Advisory Committee on Safe Drinking Water 2007). In addition, because of multiple people involved with small public water systems, more resources and organization may be in place to ensure testing, maintenance and disinfection. These factors may account for the lower contamination in public wells. A 2011 survey of private well owners in Alberta demonstrated most study participants had a lack of knowledge about groundwater sources and well management and only 11% tested for microbial contamination on at least a yearly basis (Summers 2010). As a result of the higher contamination rates, private well owners do appear to be more at risk than public well users, so encouraging private well owners to test more frequently and providing more educational opportunities for private well owners may be of value.
Limitations of this study include the previously mentioned, largely voluntary nature of the test samples, allowing inference only to this population of tested well water samples, and the inability to geolocate all samples. In addition, private samples were better represented in the relative risk maps because it was possible to geolocate more private samples (81.1%, n = 72,603) than public samples (50.0%, n = 44,994), and this increased the contamination rates as private well water samples had higher contamination than public well water samples. The ability to geolocate all the samples may have changed the map and, potentially, the location of higher risk areas.
CONCLUSIONS
Using geolocation with empirical Bayes smoothing with a substantial, passively collected database of water tests, we were able to successfully identify new areas of concern in addition to corroboration of previously identified hot spots of contamination. This kind of information can be used by decision-makers to create targeted surveillance in these high-risk areas. Coordinates such as the ATS system used in this study are not ideal. Passively collected data are strengthened by the addition of GPS coordinates. Although passively collected data have limitations, they are an economical way to perform some types of surveillance and provide baseline information on contamination levels and trends over several years.
ACKNOWLEDGEMENTS
This study was funded by the Government of Alberta, the University of Calgary, Department of Ecosystem and Public Health, the Natural Sciences and Engineering Research Council of Canada, and Mitacs. We appreciate the contribution of Iqbal Jamal and his company, AQL Management, to the Mitacs funding. The authors wish to acknowledge the contributions of the personnel at ProvLab, especially Jennifer May-Hadford, Sumana Fathima and Lorraine Ingham for their help with data, maps and laboratory protocols. In addition, we appreciate Dr Henrik Stryhn's statistical advice and help provided by Peter Peller of the University of Calgary Library with geographic information data and software.