3 Journal of Water and Health | 04.Suppl 2 | 2006 Assessing waterborne risks: an introduction

Information in this paper can help readers evaluate the results of epidemiologic studies of waterborne disease risks. It is important that readers understand the various epidemiologic study designs, their strengths and limitations, and potential biases. Terminology used by epidemiologists to describe disease risks can be confusing. Thus, readers should not only evaluate the adequacy of the information to estimate waterborne risks but should also understand how the risk was estimated. For example, one author's definition of attributable risk may be quite different from another author's in terms of the population to which the risk may apply and how it should be interpreted.


INTRODUCTION
Sanitary practices for the disposal of sewage, source water protection, and the filtration and chlorination of drinking water have dramatically decreased waterborne disease risks in the United States, especially for typhoid fever mortality (Figure 1). In fact, the treatment of drinking water has been acclaimed as one of the major public health achievements of the 20th century (NRC 1999; 2004). Nevertheless, outbreaks associated with contaminated drinking water still occur in the United States, and a substantial fraction of waterborne illness may not be reported (Hauschild & Bryan 1980; Bennett et al. 1987).
An annual average of 17 drinking water outbreaks was reported during 1991-2002, only slightly fewer than the annual average of 23 outbreaks reported during 1920-30.
These outbreaks were associated with inadequately treated water systems and distribution system contamination. In some outbreaks, water systems had not exceeded water quality regulations. In addition to outbreaks, public health officials have now become more aware of the importance of non-outbreak waterborne risks. Payment et al. (1991; 1997) found that increased gastrointestinal illness was associated with the use of tap water in a major Canadian municipal water system that met current microbiological water quality standards and reported no outbreaks.

WATERBORNE ASSOCIATIONS
Although acute and chronic exposures to various chemicals in water can cause illness, microbiologically contaminated water is the concern of the authors of papers in this special issue of the Journal of Water and Health. Infectious waterborne diseases are usually caused by exposure to enteric pathogens that are transmitted by the "fecal-oral" pathway. Occasionally the pathogens may be in urine (e.g. Leptospira). Waterborne pathogens are excreted by infected persons and in many instances by wild or domestic animals.
Although the principal exposure to waterborne pathogens is ingestion through contaminated drinking water or food and hand-to-mouth activity, dermal contact or inhalation of contaminated aerosols may also be important. Illness Table 1 can also be waterborne, but the proportion of waterborne illnesses caused by various waterborne pathogens may be quite different. Although not capturing information on sporadic and endemic waterborne disease, reports of drinking-water outbreaks of acute gastrointestinal illness (AGI) associated with infectious agents can help identify the important waterborne etiologies. Of the waterborne outbreaks reported during 1971-2002, 54% had an unknown etiology. The remaining 46% comprised outbreaks due to bacteria (approximately 17% of all outbreaks), parasites (23%), and viruses (only 6%). In another paper in this special issue, Craun et al. (2006) discuss the causes of waterborne outbreaks in the United States. Additional information about waterborne pathogens is also available in several recent publications (AWWA 1999; Hunter et al. 2003; Cotruvo et al. 2004).

Infection and illness
If exposed to a waterborne pathogen, a human or animal host may become infected. The pathogen can multiply or pass through its life cycle within the host, and the host may excrete pathogens into the environment and become infectious to others. Illness refers to the symptomatic manifestation of an infection. The severity of illness can range from self-diagnosed, mild AGI to death (Figure 2). Other measures of severity include the duration of symptoms, impact on daily activity, and the cost of physician visits or hospitalizations (Rice et al. 2006). Chronic sequelae may occur. For example, bacterial infections due to several waterborne pathogens may act as triggers in susceptible persons for reactive arthritis, Reiter's syndrome, and ankylosing spondylitis (CAST 1994). Hemolytic uremic syndrome has been associated with Escherichia coli O157:H7 infection, and Campylobacter can be a precipitating factor for Guillain-Barré syndrome (CAST 1994).
Infection without illness is also important to consider.
Although not exhibiting illness symptoms, asymptomatic persons may be sources of continuing infection and illnesses in the community. For some pathogens, frequent low-level exposures from infected persons or other sources may confer protective immunity from illness; however, not all pathogens can confer protective immunity, and for some pathogens, the immunity may be short-lived. Epidemiologists use the term "herd immunity" to refer to the immunity of a population group or community and their resistance to the invasion and spread of an infectious agent (Last 1995).
A large proportion of asymptomatic infections in a population may reflect the ability of a pathogen to confer immunity. For some pathogens, asymptomatic infection can be studied by serological or other clinical tests (Casemore 2006).

Waterborne transmission
Confirming the waterborne transmission of disease in a  (Blackburn et al. 2004). For other waterborne illness, at least two cases must be reported for an epidemiologist to evaluate a potential common exposure and determine a mode of transmission.
In waterborne outbreak investigations, epidemiologists usually consider the primary mode of transmission (i.e. drinking water), but secondary infection can also occur, especially through person-to-person transmission.
In waterborne outbreaks caused by E. coli O157:H7 and Cryptosporidium, transmission to familial, institutional, or other contacts by a primary case has been confirmed epidemiologically. Secondary transmission of cryptosporidiosis associated with the 1993 Milwaukee waterborne outbreak was estimated at 5% among residents of all ages (MacKenzie et al. 1995) and 40% among the elderly (Naumova et al. 2003). Cordell & Addiss (1994) reported secondary cryptosporidiosis transmission rates of 12-22% from infected children to their household members and caregivers. When secondary transmission is not considered, the impact of a waterborne pathogen may be underestimated (Eisenberg et al. 2004).

The epidemiologic triad
Epidemiologists use disease models to help understand the cause and development of diseases (Rockett 1994). A simple model is the epidemiologic triad (Figure 3). Although the pathogen, environment, and host coexist independently, disease occurs through their interaction. The presence of a pathogen is necessary for exposure to occur, but usually the pathogen is not a sufficient cause of the illnesses. With few exceptions, cofactors or host characteristics play an important role in the development and severity of illness.
Thus, to assess waterborne illness risks, we need to understand pathogen-host-environmental interactions.
Waterborne pathogens (e.g. Campylobacter, Leptospira, and E. coli O157:H7) may have significant animal reservoirs in addition to human sources of contamination.
Some pathogens live and multiply in the water environment
Causal pathways can be complex, and these complexities are illustrated in a model diagram developed by Eisenberg et al. (2002). The model describes the relationship of drinking water, medication, and immune status in HIV-associated diarrhea (Figure 4). The most straightforward pathway is: an increased prevalence of diarrhea results in a concern for drinking water quality, which in turn may cause persons to seek additional water treatment or bottled water.
However, diarrhea prevalence may be increased due to the side effects of medication use, and although water quality may be improved, the prevalence of diarrhea may not necessarily be decreased. CD4+ T lymphocyte counts influence a physician's decision to use antiretroviral medication, and medication may decrease the prevalence of diarrhea. A low CD4+ count may also increase a person's susceptibility to infection by an enteric pathogen.

DETERMINING DISEASE RISKS
The risk of contracting an illness may be expressed as the probability of infection or illness that occurs during a defined time period or that may be attributed to an exposure (e.g. tap water). The risk may be an individual- or population-based risk. Various epidemiologic study designs are conducted to provide quantitative risk estimates, and these are briefly described here. Epidemiologic studies of waterborne risks and their findings are discussed in other papers of this special issue (Calderon & Craun 2006; Colford et al. 2006; Craun & Calderon 2006; Roy et al. 2006). Risk assessment approaches, which rely on modeling the occurrence, exposure, and transmission of known waterborne pathogens, can also be conducted (ILSI 2000). For example, quantitative microbial risk assessments have made use of dose-response information from human volunteer and other studies to estimate risks of infection or illness associated with concentrations of several waterborne pathogens. A framework for structuring risk assessments and the various exposure scenarios for waterborne pathogens are discussed in greater detail in this special issue (Soller 2006).

Figure 4 | Causal model diagram for HIV-infected patients and diarrhea (adapted from Eisenberg et al. 2002). Diagram elements: concern about water quality; water treatment (e.g., boiling, filtering, bottled water); diarrhea; medication (CD4 count influences use of antiretrovirals; side effect is diarrhea); CD4 count (low count increases susceptibility to enteric pathogens).
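The kind of calculation used in a quantitative microbial risk assessment can be illustrated with a minimal sketch. It assumes the single-parameter exponential dose-response model, which is one commonly used form and is not necessarily the model applied in the studies cited here. The infectivity parameter `r`, the pathogen concentration, and the daily water intake are hypothetical values chosen only for illustration.

```python
import math

def exponential_dose_response(dose, r):
    """Probability of infection from one exposure under the exponential model."""
    return 1.0 - math.exp(-r * dose)

def annual_risk(per_exposure_risk, exposures_per_year=365):
    """Probability of at least one infection, assuming independent exposures."""
    return 1.0 - (1.0 - per_exposure_risk) ** exposures_per_year

# Hypothetical inputs (assumptions, not values from the paper)
r = 0.004              # per-organism infectivity parameter
concentration = 0.001  # organisms per litre of tap water
daily_intake = 2.0     # litres of tap water consumed per day

daily_dose = concentration * daily_intake
p_daily = exponential_dose_response(daily_dose, r)
p_annual = annual_risk(p_daily)

print(f"daily risk of infection:  {p_daily:.2e}")
print(f"annual risk of infection: {p_annual:.2e}")
```

The annual figure treats each day's exposure as independent, which ignores acquired immunity and secondary transmission; both are discussed earlier as factors that can change the true risk.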

Disease occurrence
Epidemiologists use several measures of disease frequency to ascertain illness risk in populations. Two frequently used measures are prevalence and incidence. Prevalence, the proportion of people who have a specific disease, condition, or infection at any specified time, includes both new and existing cases (Last 1995). Incidence measures the new cases that occur in a population and is usually expressed as a rate during a defined time period (e.g. cases per person-year).
Cumulative incidence, the proportion of healthy persons who move to the disease state, is a measure of individual rather than population risk (Kleinbaum et al. 1982; Ahlbom & Norell 1990). Prevalence and incidence measures are related. Under most conditions, prevalence is directly proportional to incidence (Kleinbaum et al. 1982). A general rule of thumb is that prevalence is equal to incidence times the duration of the disease. A meaningful measure of either prevalence or incidence requires the accurate compilation of the conditions (e.g. cases of illness) of interest and an estimate of the susceptible or at-risk population from which the cases arise.
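The frequency measures described above can be sketched in a few lines. The function names and the numbers are hypothetical and serve only to show the arithmetic; the prevalence approximation assumes a roughly steady state and a disease that affects a small fraction of the population at any one time.

```python
def incidence_rate(new_cases, person_years):
    """New cases per unit of person-time at risk."""
    return new_cases / person_years

def cumulative_incidence(new_cases, persons_at_risk_at_start):
    """Proportion of initially disease-free persons who become cases."""
    return new_cases / persons_at_risk_at_start

def approx_prevalence(rate_per_person_year, mean_duration_years):
    """Rule of thumb: prevalence is about incidence times duration."""
    return rate_per_person_year * mean_duration_years

# Hypothetical cohort: 150 AGI episodes observed over 200 person-years
rate = incidence_rate(150, 200)              # episodes per person-year
prev = approx_prevalence(rate, 3 / 365)      # episodes last about 3 days
ci = cumulative_incidence(30, 120)           # 30 of 120 persons became ill

print(f"incidence rate:       {rate:.2f} per person-year")
print(f"approx. prevalence:   {prev:.4f}")
print(f"cumulative incidence: {ci:.2f}")
```

Because AGI episodes are short, a relatively high incidence rate corresponds to a low point prevalence, which is why incidence is usually the more informative measure for acute illness.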

Epidemic or outbreak disease
Epidemic disease is a clear increase in illness or other health-related events above that which is normally expected (Figure 5). The time period and geographic area in which cases occur must be specified (Last 1995). An increased number of illnesses may occur during a short time period ranging from a few hours (e.g. food intoxications) to several days or weeks, or the increase can continue for months or years (e.g. AIDS or a cholera epidemic). There is no generally accepted number or percentage increase of cases that may describe an epidemic. It is what public health officials consider, usually based on previous surveillance, to be greater than expected for a specific disease or symptoms for that area. The increase can be relatively small.
For example, a few sporadic cases of smallpox anywhere in the world would likely be characterized as an epidemic, since even a single case is more than would be expected.

Epidemic disease does not have a geographic restriction, nor is it limited to traditional geopolitical boundaries. Epidemics can occur in a few city blocks, encompass an entire municipality or country, or cross international boundaries (i.e. a pandemic). It is also possible for one population in a geographic area to experience an outbreak while another does not. For example, in 1996, a waterborne outbreak of Cryptosporidium in Canada caused symptomatic illness only among young children in the town and visitors. Although adult residents did not experience an increased incidence of illness, an increased serological response among these residents suggested that they had been exposed (Frost et al. 2000).
The recognition of an outbreak will vary based on the sensitivity of a surveillance system to detect disease or infection. Langmuir (1963)  an increased occurrence of illness can be recognized.
Officials may conduct active surveillance, relying on a formal surveillance network of health care providers and/or clinical laboratories to report cases (Frost et al. 2003).
Periodic contact is maintained to encourage reporting.
Officials may also passively wait for physicians to report a sudden increase in patients presenting with gastrointestinal symptoms or for clinical laboratories to report an increase in positive stools for a specific microorganism (Frost et al. 2003). The key characteristics of a surveillance system include the timely interpretation of the information and an action plan to respond to increased illnesses. The CDC has proposed guidelines for evaluating disease surveillance systems (CDC 2001).
In the United States, cases of some 40 or more infectious diseases, some of which may be transmitted by water, are voluntarily reported to the CDC. Requirements for reporting are based on legislation or administrative rule in the state or city; clinical laboratories may also be required to report the identification of positive cultures and immunodiagnostic tests for selected infections. Individual cases of AGI are not usually required to be reported, but the reporting of extraordinary occurrence or clustering of AGI cases may be required by the state or city.

Endemic disease
In contrast to epidemic, endemic refers to the persistent low to moderate level or the usual ongoing occurrence of illness in a given population or geographic area (Figure 5). Endemic can apply to a wide range of health outcomes, case definitions, and severity measures; asymptomatic infection may also be assessed. There is no generally accepted incidence or prevalence of AGI that is considered usual. What may be the usual occurrence in one population or geographic area may be unusual in another. The time period is an important factor when studying AGI occurrence, as there is often a seasonal change in the number of illnesses, with peaks of illness occurring in the late summer to early fall and the late winter to early spring (Monto & Koopman 1980). This seasonal component may vary depending on local factors.
In the United States, incidence rates of infectious AGI have been estimated to be less than one illness per adult person per year and two illnesses per person per year for children under 10 years of age (Hodges et al. 1956; Monto & Koopman 1980). Elsewhere in this special issue, Roy et al. (2006) report current estimates from the Foodborne Diseases Active Surveillance Network (FoodNet) and other sources.

Hyper-endemic and sporadic disease
A persistently high level of illness occurrence during a specific period is referred to as hyper-endemic; an irregular pattern of occurrence is called sporadic (Figure 5). Over a long period of time, the number of cases may be similar between endemic and sporadic disease, but when shorter time periods are considered, sporadic cases may appear infrequently or in a random-like fashion. Seasonal fluctuations of AGI may result in a sustained increase that is considered hyper-endemic. Last (1995) notes that hyper-endemic illness should affect all age groups equally; however, we feel this would be an unreasonable condition to impose for infectious diseases where factors such as the host's immune status may affect an age-group's susceptibility to symptomatic illness.

EPIDEMIOLOGIC STUDY DESIGNS
Many epidemiologists have discussed how endemic and epidemic waterborne disease associations should be interpreted given the potential biases that may affect study findings (Fewtrell & Bartram 2001; Craun et al. 2001; Craun & Frost 2002; Frost et al. 2003; Hunter et al. 2003). We provide a brief discussion here. Various epidemiology textbooks, as well as review articles on epidemiologic methods, can be consulted for a more comprehensive coverage of study designs and how to evaluate epidemiologic associations (Hill 1965; MacMahon & Pugh 1970; Lilienfeld 1976; Kleinbaum et al. 1982; Rothman 1986; Hennekens & Buring 1987; Monson 1990; Beaglehole et al. 1993; Rockett 1994; Gordis 2000; Craun et al. 2003).
Epidemiologic studies fall into two general categories: (1) experimental studies in which investigators control the conditions of exposure in the study, and (2) observational studies in which investigators study populations with selected diseases or health outcomes under exposure conditions as they exist (Table 2).

Experimental studies
Randomized, controlled trials or clinical studies can provide some of the strongest evidence that a given exposure or risk factor "causes" a health outcome. These studies involve ethical concerns, are expensive, and are challenging to conduct because the exposure must be controlled and participants must be randomly assigned to an exposure.
Dose-response studies where individuals are given known doses of a microorganism of interest have been conducted.
Studies have also assessed intervention effects of water treatment at the individual or household level (Colford et al. 2006), and health outcomes have been evaluated for populations in which a community water treatment process has changed (Calderon & Craun 2006). Community-intervention studies are considered quasi-experimental because they take advantage of a natural experiment. Since participants are not randomly assigned to a water treatment regimen, we have classified community-intervention studies as observational rather than experimental (Table 2).

Household-intervention trial
In this type of study, persons are randomly selected and placed into two groups: one that will receive an experimental treatment or intervention and the other that will not.
Randomization tends to produce comparability between the two groups with respect to other factors that might affect the health outcome being studied. The greatest objectivity is provided when the three primary groups involved in the study (i.e. the subjects, investigators, and statisticians analyzing the data) are unaware of the subject's allocation to a particular treatment or intervention. When this is achieved, the trial is said to be triple-blinded. The major advantage of a randomized, blinded trial is that the design precludes many sources of systematic error that may be associated with observational studies and tends to minimize the potential for confounding (Tables 3 and 4). However, microbial drinking water contaminants have multiple routes of exposure, and it may be difficult to separate primary (e.g. waterborne) from secondary (e.g. food, person-to-person) transmission. Other difficulties include separating household and non-household waterborne exposures, participant dropout, and noncompliance with study procedures. Also, it may not be possible to generalize the results to other populations due to differences in water system vulnerabilities or differences in the population studied. Because of their cost, randomized controlled trials are generally considered for environmental exposures only when a well-defined hypothesis has been identified and it is ethically feasible.

Observational studies
Observational epidemiologic studies can be descriptive or analytical (Table 2). Descriptive epidemiology is primarily
Because the results of ecologic studies may be difficult to interpret, investigators usually conduct them to help develop hypotheses for further evaluation with analytical studies (Greenland & Robins 1994a, b; Piantadosi 1994; Poole 1994). The major limitation is the so-called ecologic bias in which the observed association fails to accurately represent the biologic effect at the individual level. Ecologic studies, however, can be useful to study environmental exposures, especially when group exposure measures reflect individual exposures (Walter 1991).
Analytical studies that test specific hypotheses provide a quantitative estimate of the risk and help epidemiologists assess causality (Monson 1990). Because exposure and disease are measured at the same point in time, cross-sectional studies are most useful for studying diseases with a short latency or incubation period.
The incubation period is the time interval between initial contact with the pathogen and the first appearance of symptoms associated with infection. Hepatitis A may have an incubation period of 30 days or more. Cryptosporidium and Giardia may have an incubation period of seven or more days. However, most waterborne pathogens have a much shorter incubation period. FoodNet is a cross-sectional survey of the occurrence of AGI and possible foodborne exposures (Roy et al. 2006).

Table 4

Selection bias
A bias resulting from systematic differences in characteristics between those who are selected for study and those who are not. Comparable criteria should be used to select study participants.

Information bias
A bias resulting from flaws in measuring exposure or disease, especially the use of noncomparable methods to collect information that results in different quality (accuracy) of information between the groups being compared. This can be due to interviewers' subconscious or conscious gathering of information. Recall bias refers to differences in the accuracy or completeness of information provided by study participants due to their memory of past events or experiences. Misclassification bias is the erroneous classification of a study participant into either a disease or exposure category.

Confounding
A situation where the measure of the effect of an exposure on disease risk is distorted because of the association of the exposure with other factor(s) that influence the disease. Effect modification refers to a change in the magnitude of the effect and should not be confused with confounding.

Analytic bias
A bias resulting from the way the data are analyzed. An example would be a distortion in the shape of an exposure-response trend produced by poor categorization of a continuously measured exposure.

Case-control study
A case-control design is often used when investigators want to determine the association of a health outcome with multiple exposure factors rather than a single one (Craun & Calderon 2006). These studies are also called case-referent or case-comparison studies (Miettinen 1974). … (Tables 3 and 4), especially recall bias. A major advantage of the case-control design is that it is ideal for looking at relatively rare illnesses.

Cohort study
This design compares disease rates among groups of persons who have differing exposures (Craun & Calderon 2006). Cohort studies of infectious diseases can be prospective (i.e. current exposure information is recorded and the follow-up period occurs afterward) or retrospective (i.e. the cohort is based on a historical exposure period).
Disease incidence is determined during the follow-up period for the exposed and unexposed groups. An advantage of this study is that several health-related outcomes or diseases can be studied. The cost, however, is usually greater than the cost of a case-control study.

Other study designs
Epidemiologic studies such as community-intervention and longitudinal time-series studies (Craun & Calderon 2006) incorporate elements of the basic designs previously described. A community-intervention study may be considered when water utilities change their water source or drinking water treatment such as adding filtration. Time-series studies can evaluate how changes in water quality may affect AGI.

Quantifying the exposure-disease relationship

Measures of association
Several measures can convey information about risk. The appropriate measure depends on the study design and the way the data were collected. Analytical studies can provide a direct estimate of individual risk, and the incidence of illness among the unexposed and exposed can be directly compared. The basic measures are the risk or rate difference (RD), incidence rate ratio (IRR), cumulative incidence ratio (CIR), or odds ratio (OR). The RD is a measure of the absolute difference between two measures of incidence (e.g. incidence rate for the exposed minus the incidence rate for the unexposed in a cohort study) and includes units (e.g. cases per person-year). The IRR and CIR are relative measures of incidence. The IRR, CIR and OR are sometimes referred to as measures of the relative risk (RR), and if certain specific conditions are met, all of these measures will be similar.
Readers should note that the RD is sometimes referred to as the attributable risk (AR), and thus, they should consider how the AR was computed (Table 5).
A reported RR of unity (1.0) indicates no association; any other value signifies either an increased or decreased risk. Because participants in a case-control study are selected according to their disease status, the exposure odds ratio (OR) is determined. As defined by Last (1995), "The exposure-odds ratio for a set of case-control data is the ratio of the odds in favor of exposure among the cases to the odds in favor of exposure among non-cases." The OR may be defined differently in a cross-sectional study when the "odds of disease" are of interest (Last 1995). If the disease is rare in the general population, the OR is considered to be a good estimate of the RR (Cornfield & Haenszel 1960). If the cases are incident cases rather than old or prevalent cases, the OR is equivalent to the RR (Miettinen 1976). Even if neither of these conditions applies to the study, Monson (1990) notes that the OR can still be interpreted similarly to the RR. However, if a disease is not rare, the OR can overestimate (if RR > 1) or underestimate (if RR < 1) the IRR or the CIR (Kleinbaum et al. 1982).
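A short sketch may help show how these measures relate when computed from the same two-by-two table, and why the OR approximates the CIR only when illness is rare. The function name and all counts are hypothetical; the cell labels follow the usual textbook layout rather than any table in this paper.

```python
def two_by_two_measures(a, b, c, d):
    """Measures of association from a cohort-style 2x2 table.

    a: exposed and ill        b: exposed and not ill
    c: unexposed and ill      d: unexposed and not ill
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "RD": risk_exposed - risk_unexposed,   # absolute difference in risk
        "CIR": risk_exposed / risk_unexposed,  # cumulative incidence ratio
        "OR": (a * d) / (b * c),               # odds ratio
    }

# Hypothetical rare illness: risks of 0.2% versus 0.1%
rare = two_by_two_measures(20, 9980, 10, 9990)

# Hypothetical common illness: risks of 40% versus 20%
common = two_by_two_measures(400, 600, 200, 800)

for label, m in (("rare", rare), ("common", common)):
    print(f"{label:6s} RD={m['RD']:.4f} CIR={m['CIR']:.3f} OR={m['OR']:.3f}")
```

Both hypothetical tables have a CIR of 2, but the OR stays close to 2 only for the rare illness; for the common illness it is noticeably larger, which is the overestimation described above for RR > 1.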
Ecologic studies can be designed to estimate an IRR (Craun & Calderon 2006) or a standardized mortality or morbidity ratio (SMR). The SMR is the ratio of the observed number of cases to the number of expected cases if the population under study were unexposed. An SMR above 1 indicates an increased risk for the exposed group. For example, an SMR of 1.35 indicates that 35% more cases were observed in the exposed group than would be expected if the group were unexposed. An SMR is often multiplied by 100 for interpretation purposes; in the previous example, the SMR would then be reported as 135.
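The SMR arithmetic can be sketched as follows. The sketch assumes that expected cases are obtained by indirect standardization, that is, by applying stratum-specific rates from an unexposed reference population to the person-time of the study population; this is a common approach but is an assumption here, and all rates and counts are hypothetical.

```python
def standardized_ratio(observed_cases, reference_rates, study_person_years):
    """Return (SMR, expected cases) using indirect standardization.

    reference_rates: stratum-specific rates in the unexposed reference population
    study_person_years: person-years in the same strata of the study population
    """
    expected = sum(rate * py
                   for rate, py in zip(reference_rates, study_person_years))
    return observed_cases / expected, expected

# Hypothetical age strata: children, adults, elderly
reference_rates = [0.010, 0.004, 0.006]   # cases per person-year
study_person_years = [1000, 3000, 2000]
observed = 46

smr, expected = standardized_ratio(observed, reference_rates, study_person_years)
print(f"expected cases: {expected:.1f}")
print(f"SMR: {smr:.2f} (reported as {round(smr * 100)})")
```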
Epidemiologists have used several terms, including population attributable risk (PAR) or the PAR percent (PAR%), to describe the incidence or proportion of a disease or other outcome in a population that can be attributed to the exposure or risk factor in question.
Alternatively used terms include etiologic fraction (Miettinen 1974), attributable proportion (Rothman 1986), and attributable fraction, all of which are similar computations (Table 5). Epidemiologists may even use a different terminology to describe the PAR or they may estimate a different measure (Table 5). In general, all of these measures provide an estimate of the amount by which a particular disease rate (e.g. AGI) might be reduced if the specified exposure were removed (MacMahon & Pugh 1970). In defining the PAR, we quote from two epidemiology texts. Beaglehole et al. (1993) define the PAR as "a measure of the excess rate of disease in a total study population which is attributable to an exposure… and could be removed if the exposure were avoided completely." Hennekens & Buring (1987) state that the "population attributable risk percent expresses the proportion of disease in the study population that is attributable to the exposure and thus could be eliminated if the exposure were eliminated." In contrast to the PAR and PAR%, the attributable risk (AR), AR%, or AR (exposed) is the incidence or proportion of disease only among the exposed members of the population that can be attributed to the exposure. This distinction is important, because usually the incidence of disease in the exposed persons that are being studied will be greater than or at least equal to the incidence of disease in the entire study population of both exposed and unexposed persons.
The AR is often reported instead of the PAR. Epidemiologists may refer to the AR computation as the AR (exposed), risk difference (exposed), absolute risk, etiologic fraction (exposed), attributable proportion (exposed), or attributable fraction (exposed). When the exposure is beneficial, epidemiologists may compute the prevented fraction, which applies to only the exposed population, or the preventable fraction (population), which applies to both the exposed and unexposed members of the population. To add further confusion, the various terminologies may be used without specifying the population. For example, risk difference (RD) may be used rather than RD (population) or RD (exposed), and it may be difficult to know which is being reported. Readers should carefully evaluate the reported measure to ensure they understand the population to which it applies. Ideally, sufficient information is provided in the study results for the reader to calculate the risk measure.
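The distinction between measures for the exposed and measures for the whole population can be made concrete with a small sketch. The formulas used are the common textbook forms (incidence in the exposed, the unexposed, and the total population); they are assumed here for illustration and may differ from the specific computations listed in Table 5. The cohort counts are hypothetical.

```python
def attributable_measures(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Attributable risk among the exposed and in the total population."""
    i_e = cases_exposed / n_exposed        # incidence in the exposed
    i_u = cases_unexposed / n_unexposed    # incidence in the unexposed
    i_t = ((cases_exposed + cases_unexposed)
           / (n_exposed + n_unexposed))    # incidence in the total population
    return {
        "AR_exposed": i_e - i_u,
        "AR_percent_exposed": 100 * (i_e - i_u) / i_e,
        "PAR": i_t - i_u,
        "PAR_percent": 100 * (i_t - i_u) / i_t,
    }

# Hypothetical cohort: 300 exposed (90 ill) and 700 unexposed (140 ill)
m = attributable_measures(90, 300, 140, 700)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these hypothetical counts the excess risk among the exposed is about three times the excess risk in the total population, which shows why a reader needs to know which population a reported "attributable risk" refers to.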
From a public health perspective, estimation of the PAR is most useful when there is consensus about causality of the association and when the exposure is amenable to intervention (Hill 1965; Rothman 1986; Last 1995; Rockhill et al. 1998). Sufficient information is available to infer causality for known pathogens that might be important causes of waterborne disease in the United States. Information from clinical and microbiological studies is supported by waterborne outbreak data, and there is general knowledge of the biology of the illness and the microorganism.
To achieve the anticipated benefits suggested by the PAR, we should have some knowledge about the specific exposure and how we can eliminate it. An important concern is whether the estimates of PAR for endemic waterborne illness should be extrapolated to an entire population or only to select populations based on their exposure (e.g. type of water source and/or water treatment) or other factors (e.g. age, susceptibility). For example, a PAR that has been estimated for a conventionally filtered river water source may not be relevant for populations that use unfiltered water from protected reservoirs or populations that use groundwaters.

Interpreting epidemiologic results
Results from a relatively large number of studies in various geographical areas and using different designs allow for a more definitive estimate of the magnitude of an association.
However, the design, precision, and validity of each individual study should be evaluated before attempting to interpret the results of a group of studies that assess similar risks and exposures (Table 3).

Random error
The likelihood that an observed association is due to random error is assessed by the level of statistical significance ("p" value) or the confidence interval (C.I.). As a measure of the stability of the risk measure, the "p-value indicates the likelihood that, if the study were repeated a number of times, a (risk) as large as or larger than that obtained would occur, given no true association between exposure and disease," assuming the data are unbiased (Monson 1990). Monson sees little utility in the computation of a measure of stability for very small studies or for studies with biased data. It should also be remembered that random error or chance can never be completely ruled out, and similarly, studies may fail to observe an effect which is truly significant.
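As an illustration of how study size affects the stability of a risk estimate, the sketch below computes an approximate 95% confidence interval for a risk ratio using the standard large-sample method on the log scale. The function name and the counts are hypothetical, and the method is one common choice rather than the one used in any particular study cited here.

```python
import math

def risk_ratio_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Risk ratio with an approximate confidence interval (log method)."""
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    # Standard error of ln(RR) for cumulative incidence data
    se = math.sqrt(1 / cases_exposed - 1 / n_exposed
                   + 1 / cases_unexposed - 1 / n_unexposed)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Two hypothetical studies with the same risks (30% versus 20%)
small = risk_ratio_ci(30, 100, 20, 100)
large = risk_ratio_ci(300, 1000, 200, 1000)

for label, (rr, lo, hi) in (("small", small), ("large", large)):
    print(f"{label} study: RR={rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Both hypothetical studies give the same point estimate, but only the larger study yields an interval that excludes 1.0. Neither interval says anything about bias, which has to be assessed separately.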

Systematic error and confounding
Possible bias or confounding in each study and the consistency of the results among various studies should be evaluated. The importance of assessing and controlling bias and confounding has been extensively discussed (Murphy 1990; Last 1995). Bias refers to any trend in the collection, analysis, interpretation, publication, or review of data that can lead to distortion in an estimate of effect. As used by epidemiologists, bias does not carry an imputation of prejudice or the investigator's desire for a particular outcome and thus differs from the conventional usage, which refers to a partisan point of view (Last 1995). Last (1995) defines confounding and more than 26 specific biases. We have classified these into four major groups, which are briefly described in Table 4 (Craun et al. 2001).

Magnitude of risk measures
Historically, a small RR or OR was considered by many epidemiologists to indicate that the observed association was suspect because of possible uncontrolled confounding (Table 6). More recently, a substantial number of environmental epidemiologic studies have found relatively small associations, and public health officials have begun to look at a small RR or OR as having implications on a population basis rather than on an individual level. If a large proportion of the population is exposed and the outcome of interest is relatively common, environmental exposures with weak associations can have a substantive impact. Thus, for smaller associations, epidemiologists should thoroughly evaluate the possibility that the association is affected by uncontrolled confounding. Techniques to identify confounding factors include causal diagrams (Greenland 1999) and modeling approaches such as a change-in-estimate or backward elimination procedure (Rothman & Greenland 1998). On the other hand, a very large RR or OR is unlikely to be completely explained by an unidentified or uncontrolled confounding factor. The magnitude of an RR, however, has no bearing on the possibility that an association is due to bias. RRs for continuous quantitative exposures cannot be evaluated by the simplified descriptions in Table 6 because the magnitude then becomes scale dependent (e.g. RR per 1 unit increase of turbidity; 10 unit increase; or 100 unit increase).
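The scale dependence of an RR for a continuous exposure can be shown with a short sketch. It assumes a log-linear exposure-response model, in which the RR for an increase of k units is the per-unit RR raised to the power k; this model, the function name, and the per-unit RR of 1.01 are assumptions made for illustration and do not come from any study cited here.

```python
def rescale_relative_risk(rr_per_unit: float, units: float) -> float:
    """RR for an increase of `units`, assuming a log-linear exposure-response.

    Under this assumption ln(RR) is proportional to the size of the
    exposure increase, so RR(k units) = RR(1 unit) ** k.
    """
    if rr_per_unit <= 0.0:
        raise ValueError("rr_per_unit must be positive")
    return rr_per_unit ** units


# Hypothetical per-unit RR for a turbidity measure.
rr_1 = 1.01
for k in (1, 10, 100):
    print(f"RR per {k} unit increase: {rescale_relative_risk(rr_1, k):.4f}")
```

The same fitted association appears weak when reported per 1 unit and strong when reported per 100 units, so a verbal label for the size of an RR is meaningful only when the exposure increment is stated alongside it.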
that should be considered in the ascertainment of cases or conditions include the diagnostic criteria for defining a case, population source and selection of incident or prevalent cases and appropriate controls. Persons with the disease or outcome may be selected from a defined geographical area, hospital(s), clinic(s), or a cohort. A comparison group of persons (controls) in which the condition or disease is absent is also selected, preferably at random from the same population from which the cases were selected. The frequency of existing or past attributes and exposures thought to be relevant in the development of the disease are determined for all participants and compared among cases and controls. Information about the relevant past exposures or behaviors (e.g. drinking tap water, swimming activities) may be obtained by interview. Clinical specimens may be collected to help define the disease or condition. Environmental exposure information can be obtained from historical records, or if appropriate, current exposures can be assessed by analyzing environmental samples from the surrounding environment or the study participants' micro-environment. Case-control studies are subject to several sources of systematic error (Tables
However, there are important distinctions among the various measures. Some measures refer to the removal of the exposure among only the exposed members of the population; others refer to the removal of the exposure among the general population of both exposed and unexposed persons. In addition, the use of different terminology to describe the same computation can cause confusion. So as not to draw erroneous inferences when comparing statistics across different studies, readers should fully understand these risk measures, their computation, and the terminology used to describe them.
This special issue of the Journal of Water and Health is devoted exclusively to a discussion of waterborne risks in the United States and other developed countries. This paper provides an introduction to epidemiologic methods for assessing waterborne disease risks and identifies several important considerations when evaluating the current information about waterborne risks. Readers should consider the following questions when reading the subsequent papers in this special issue:
† Is sufficient information available to estimate the magnitude (e.g. PAR) of an increased risk of endemic AGI or a specific disease (e.g. cryptosporidiosis) that may be associated with drinking water systems in the United States or another developed country?
† If so, can the risk be generalized to the national population? Are certain populations (e.g. the elderly) at greater or less risk?
† What factors (e.g. water source and treatment processes) may modify the risk?
† What are the uncertainties associated with estimating the magnitude of waterborne disease risks? How can these uncertainties be addressed?
† How can waterborne outbreak information help inform the endemic waterborne risk estimate?
† How can microbial risk assessments help inform the estimate?
† Is it sufficient to estimate the number of cases or should the severity of the cases, the full disease burden, be considered?

Table 1 |
Estimated annual cases of acute gastroenteritis caused by known foodborne pathogens, United States a
a Mead et al. (1999). b Greater than 70% of cases acquired abroad.

Table 2 |
Types of epidemiological studies a
a Monson (1990).
used to summarize disease information, assess geographical or temporal patterns of disease, and develop hypotheses about disease etiologies. Ecologic studies, also called geographical, correlation, group, or aggregate studies, are descriptive and are used primarily to explore relationships between available health statistics and population characteristics or environmental and water quality measures.

Table 3 |
Assessing bias for reported associations

Table 4 |
Systematic error or bias in epidemiological studies a

Table 5 |
(Rothman 1986; Beaglehole et al. 1993; Rockett 1994; Last 1995; Sahai & Khurshid 1996)
Ie) / Ip When the exposure is preventative, proportion of disease in the population that would be prevented if the whole population were exposed to the factor or the intervention
Ie = incidence of disease in the exposed. Ip = incidence of disease in the entire population. Iu = incidence of disease in the unexposed. Pe = proportion of exposed persons in population.