Library-based microbial source tracking (MST) can assist in reducing or eliminating fecal pollution in waters by predicting sources of fecal-associated bacteria. Library-based MST relies on an assembly of genetic or phenotypic “fingerprints” from pollution-indicative bacteria cultivated from known sources, against which fingerprints of unknown origin are compared and identified. The success of the library-based approach depends on how well each candidate source is represented in the library and on which statistical algorithm or matching criterion is used to match unknowns. Because known-source libraries are often built based on convenience or cost, some sources may be better represented in the library than others. Depending on the statistical algorithm or matching criterion, predictions may become severely biased toward classifying unknowns into the library's dominant source category. We examined prediction bias for four of the most commonly used statistical matching algorithms in library-based MST when applied to disproportionately represented known-source libraries: maximum similarity (MS), average similarity (AS), discriminant analysis (DA), and k-means nearest neighbor (k-NN). MS was particularly sensitive to disproportionate source representation. AS and DA were more robust. k-NN provided a compromise between correct prediction and sensitivity to disproportional libraries, including increased matching success and stability, and should be considered when matching to disproportionally represented libraries.
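To make the matching rules named in the abstract concrete, the following is a minimal sketch of how MS, AS, and k-NN could each assign an unknown fingerprint to a source. It is not the authors' implementation: the numeric two-value fingerprints, the distance-based `similarity` function, the source names, and the toy library are all assumptions chosen for illustration, and DA is omitted because it requires a fitted model.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def similarity(a, b):
    """Hypothetical similarity in (0, 1]; 1.0 means identical fingerprints."""
    return 1.0 / (1.0 + dist(a, b))


def max_similarity(unknown, library):
    """MS: assign the source holding the single most similar isolate."""
    return max(
        library,
        key=lambda src: max(similarity(unknown, f) for f in library[src]),
    )


def average_similarity(unknown, library):
    """AS: assign the source with the highest mean similarity to the unknown."""
    return max(
        library,
        key=lambda src: sum(similarity(unknown, f) for f in library[src])
        / len(library[src]),
    )


def k_nearest(unknown, library, k=3):
    """k-NN: majority vote among the k most similar library isolates."""
    scored = [
        (similarity(unknown, f), src)
        for src, fingerprints in library.items()
        for f in fingerprints
    ]
    top = sorted(scored, reverse=True)[:k]
    return Counter(src for _, src in top).most_common(1)[0][0]


# Invented, disproportionate toy library: "human" has 8 isolates, "cattle" has 3.
# One atypical human isolate happens to sit inside the cattle cluster.
library = {
    "human": [
        (0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (1.0, 1.0),
        (0.5, 0.5), (1.0, 0.0), (0.0, 1.0), (5.1, 5.05),
    ],
    "cattle": [(5.0, 5.0), (5.4, 5.0), (4.6, 5.0)],
}
unknown = (5.1, 5.0)

ms_call = max_similarity(unknown, library)
as_call = average_similarity(unknown, library)
knn_call = k_nearest(unknown, library, k=3)
```

With this invented library, the MS rule follows the single atypical isolate from the over-represented source and returns `human`, while the AS and k-NN rules return `cattle`. The outcome depends entirely on the toy data and the assumed similarity measure, so it illustrates the mechanics of each rule rather than reproducing the study's results.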
Research Article
May 01 2007
A statistical appraisal of disproportional versus proportional microbial source tracking libraries
Brian J. Robinson;
1National Oceanic and Atmospheric Administration, 219 Fort Johnson Road, Charleston SC 29412-9110, USA
Tel: +1 843 762-8572 Fax: +1 843 762-8700; E-mail: [email protected]
Kerry J. Ritter;
2Southern California Coastal Water Research Project, 7171 Fenwick Lane, Westminster CA 92683, USA
R. D. Ellender
3The University of Southern Mississippi, Department of Biological Sciences, 118 College Drive #5018, Hattiesburg MS 39406-0001, USA
J Water Health (2007) 5 (4): 503–509.
Citation
Brian J. Robinson, Kerry J. Ritter, R. D. Ellender; A statistical appraisal of disproportional versus proportional microbial source tracking libraries. J Water Health 1 December 2007; 5 (4): 503–509. doi: https://doi.org/10.2166/wh.2007.044