Abstract

The severity of ill effects (SEV) index is based on a limited meta-analysis of previous peer-reviewed reports and consultations, and is described as a function of the duration of exposure to turbid conditions for fish, at various life stages, that are adapted to life in clear-water ecosystems. In this study, the performance of classification by the SEV index was investigated using the K-Means clustering algorithm. The study is based on 303 tests undertaken on aquatic ecosystem quality over a wide range of sediment concentrations (1–50,000 mg SS/L) and durations of exposure (1–35,000 h). The training and testing data include the concentration of suspended sediment, duration of exposure, species and life stage as input variables and the SEV index for fish as the output variable. The results indicate that the K-Means clustering algorithm, as an efficient novel approach with an acceptable range of error, can be used successfully to improve the performance of classification by the SEV index.

INTRODUCTION

The sudden release of large volumes of sediment may create serious problems downstream, such as channel aggradation and flooding, interference with water supply and cooling water intakes, as well as adverse impacts on fisheries and the environment (Khakzad & Elfimov 2015a, 2015b).

MacDonald & Newcombe (1993) grouped effects of suspended sediment on fish into three categories: lethal, sublethal and behavioral. These categories include the following:

  • Lethal effects kill individual fish, alter populations and decrease the capacity of fish to reproduce. They include sublethal and behavioral effects that give rise to reductions in population size.

  • Sublethal effects include tissue injury or changes in the physiology of an organism. The effect is chronic and may lead to an eventual decline in population size.

  • Behavioral effects are effects that result in any change in activity normally associated with a species in an undisturbed environment. These changes may result in immediate death, or changes in population size or death over time.

Newcombe & Jensen (1997) developed a risk index and presented six regression equations for management decisions that relate biological response to duration of exposure and suspended sediment concentration. The equations all have the form:

$$z = a + b \ln x + c \ln y$$

where z is the severity of ill effect, x is the duration of exposure (h), y is the concentration of suspended sediment (mg SS/L), a is the intercept, and b and c are slope coefficients. Although the study provided the primary available estimates of the onset of sublethal and lethal effects, it applied regression models to estimate SEV, and such models have difficulty showing the important factors affecting SEV (Khakzad & Elfimov 2015a, 2015b). In addition, the assumptions made in a regression model are likely to be violated if data for diseases or disorders are used, because linear regression models require assumptions about the linearity, normality, and homoscedasticity of the data, among others (Byeon 2014).
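To make the regression form concrete, the following minimal sketch evaluates a log-linear model of the kind described above, assuming the form z = a + b ln x + c ln y. The function name `sev_log_linear` and the coefficient values are hypothetical, chosen only for illustration; they are not the fitted values of any published equation.

```python
import math

def sev_log_linear(duration_h, concentration_mg_l, a, b, c):
    """Return z = a + b*ln(x) + c*ln(y).

    x is the duration of exposure (h) and y is the suspended sediment
    concentration (mg SS/L), following the variable definitions in the text.
    """
    if duration_h <= 0 or concentration_mg_l <= 0:
        raise ValueError("duration and concentration must be positive")
    return a + b * math.log(duration_h) + c * math.log(concentration_mg_l)

# Hypothetical coefficients for illustration only (not fitted values).
A, B, C = 1.0, 0.6, 0.7

# Same concentration, tenfold longer exposure.
z_short = sev_log_linear(24.0, 100.0, A, B, C)
z_long = sev_log_linear(240.0, 100.0, A, B, C)
print(round(z_short, 2), round(z_long, 2))
```

Under this form, a tenfold increase in duration raises z by b ln 10 regardless of concentration, which illustrates why a single pair of slope coefficients may struggle to capture differences between species and life stages.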

As noted above, predicting significant ill effects for fish is essentially an uncertain and random process, and it is not easy to accomplish using deterministic equations. The problem is therefore well suited to the K-Means clustering algorithm with various distance metrics, since such algorithms are primarily aimed at recognizing a random pattern in a given set of input values. K-Means clustering is helpful in predicting the output of a system from its corresponding random inputs, because its application does not require knowledge of the underlying physical process as a precondition.

K-Means is a prototype-based, simple partitional clustering algorithm that attempts to find K non-overlapping clusters. These clusters are represented by their centroids (a cluster centroid is typically the mean of the points in that cluster) (Wu 2012). The clustering process of K-Means is as follows. First, K initial centroids are selected, where K is specified by the user and indicates the desired number of clusters. Every point in the data is then assigned to the closest centroid, and each collection of points assigned to a centroid forms a cluster. The centroid of each cluster is then updated based on the points assigned to that cluster. This process is repeated until no point changes clusters.
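The clustering loop described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: it uses squared Euclidean distance, takes the initial centroids as an explicit argument (the text does not say how they were chosen), and runs on six made-up two-dimensional points.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid update until no point changes cluster."""
    centroids = [list(c) for c in initial_centroids]
    k = len(centroids)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed clusters
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

# Made-up illustrative data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, assignment = kmeans(points, initial_centroids=[points[0], points[3]])
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

Because the result depends on the initial centroids, practical use typically repeats the procedure from several starting points and keeps the partition with the smallest within-cluster sum of squared distances.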

Given the numerous clustering algorithms proposed in the literature, it may be asked why this paper focuses on K-Means clustering. There are two reasons. First, K-Means has distinct advantages over other clustering algorithms: it is very simple and robust, highly efficient, and can be used for a wide variety of data types. Indeed, it has been ranked second among the top 10 data mining algorithms (Wu et al. 2008) and has become the de facto benchmark for newly proposed methods. Second, K-Means as an optimization problem still poses some theoretical challenges.

The present study develops and presents a new expert system for improving the performance of classification by SEV, as an indicator of ill effect for fish, using the K-Means clustering algorithm, and compares the results with those of previous models.

MATERIAL AND METHOD

Material

In this study, we provide information (data from 303 tests) about aquatic ecosystem quality over a wide range of sediment concentrations, durations of exposure, species, life stages and severity of ill effect for fish (Table 1). Supporting data extracted from the review included taxonomic group, species of fish, natural history, life history phase, and sediment particle size range.
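As an illustration of how such records might be turned into numeric inputs for clustering, the sketch below encodes three rows drawn from Table 1. The encoding itself is an assumption made for this example and is not described in the study: concentration and duration are log10-transformed because they span several orders of magnitude, and life stage is one-hot encoded using a hypothetical three-level coding. The helper names `parse_number` and `to_feature_vector` are likewise hypothetical.

```python
import math

# Hypothetical life-stage coding, assumed for illustration only.
LIFE_STAGES = ["adult", "juvenile", "eggs and larvae"]

def parse_number(text):
    """Parse a table value such as '8,000.0' into a float."""
    return float(text.replace(",", ""))

def to_feature_vector(concentration, duration, life_stage):
    """Return [log10(concentration), log10(duration), one-hot life stage]."""
    one_hot = [1.0 if life_stage == s else 0.0 for s in LIFE_STAGES]
    return [
        math.log10(parse_number(concentration)),
        math.log10(parse_number(duration)),
    ] + one_hot

# (concentration mg/L, exposure duration h, life stage, SEV) from Table 1:
rows = [
    ("4,250.0", "588", "adult", 12),           # Trout (rainbow)
    ("8,000.0", "96.0", "juvenile", 10),       # Salmon (coho)
    ("500.0", "72", "eggs and larvae", 12),    # Bass (striped)
]
features = [to_feature_vector(c, d, s) for c, d, s, _ in rows]
sev = [z for *_, z in rows]
print(features[0], sev[0])
```

Species could be encoded in the same way; it is omitted here to keep the example short.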

Table 1

Available data on the effects of suspended sediments on biota. Data taken from the original literature.

Species Life stage Concentration (mg/L) Exposure duration (h) SEV Fish response description Reference 
Adult salmonids and rainbow smelt 
Grayling (Arctic) 100.0 0.1 Fish avoided turbid water Suchanek et al. (1984a, 1984b)
Grayling (Arctic) 100.0 1.008 Fish had decreased resistance to environmental stresses McLeay et al. (1984)  
Grayling (Arctic) 100.0 1.008 Impaired feeding McLeay et al. (1984)  
Grayling (Arctic) 100.0 1.008 Reduced growth McLeay et al. (1984)  
Salmon 25.0 Feeding activity reduced Phillips (1970)  
Salmon 16.5 24 Feeding behavior apparently reduced Ott (1984)  
Salmon 1,650.0 240 Loss of habitat caused by excessive sediment transport Coats et al. (1985)
Salmon 75.0 168 Reduced quality of rearing habitat Slaney et al. (1977b)
Salmon 210.0 24 10 Fish abandoned their traditional spawning habitat Hamilton (1961)  
Salmon (Atlantic) 2,500.0 24 10 Increased risk of predation Gibson (1933)  
Salmon (chinook) 650.0 168 No histological signs of damage to olfactory epithelium Brannon et al. (1981)  
Salmon (chinook) 350.0 0.17 Home water preference disrupted Whitman et al. (1982)  
Salmon (chinook) 650.0 168 Homing behavior normal, but fewer test fish returned Whitman et al. (1982)  
Salmon (chinook) 39,300.0 24 10 No mortality Newcomb & Flagg (1983)  
Salmon (chinook) 82,400.0 12 Mortality rate 60% Newcomb & Flagg (1983)  
Salmon (chinook) 207,000.0 14 Mortality rate 100% Newcomb & Flagg (1983)
Salmon (Pacific) 525.0 588 10 No mortality (other end points not investigated) Griffin (1938)  
Salmon (sockeye) 500.0 96  Servizi & Martens (1987)  
Salmon (sockeye) 1,500.0 96  Servizi & Martens (1987)  
Salmon (sockeye) 39,300.0 24 10 No mortality Newcomb & Flagg (1983)  
Salmon (sockeye) 82,400.0 12 Mortality rate 60% Newcomb & Flagg (1983)  
Smelt (rainbow) 3.5 168 Increased vulnerability to predation Swenson (1978)
Steelhead 500.0 Signs of sublethal stress (VA) Redding & Schreck (1982)
Steelhead 16,500.0 240 Loss of habitat caused by excessive sediment transport Coats et al. (1985)
Steelhead 500.0 Blood cell count and blood chemistry change Redding & Schreck (1982)
Trout 16.5 24 Feeding behavior apparently reduced Ott (1984)  
Trout 75.0 168 Reduced quality of rearing habitat Slaney et al. (1977)  
Trout 270.0 312 Gill tissue damaged Herbert & Merkens (1961)  
Trout 525.0 588 10 No mortality (other end points not investigated) Griffin (1938)  
Trout 300.0 720 12 Decrease in population size Newcomb & Flagg (1983)  
Trout (brook) 4.5 168 Fish more active and less dependent on cover Newcomb & Flagg (1983)  
Trout (brown) 18.0 720 10 Abundance reduced Newcombe & Jensen (1997)  
Trout (cutthroat) 35.0 Feeding ceased; fish sought cover Cordone & Kelley (1961)  
Trout (lake) 35.0 168 Fish avoided turbid areas Swenson (1978)  
Trout (rainbow) 66.0 Avoidance behavior manifested part of the time Lawrence & Scherer (1974)
Trout (rainbow) 100.0 0.1 Fish avoided turbid water (avoidance behavior) Suchanek et al. (1984a, 1984b)
Trout (rainbow) 100.0 0.25 Rate of coughing increased (FSS) Hughes (1975)  
Trout (rainbow) 250.0 0.25 Rate of coughing increased (FSS) Hughes (1975)  
Trout (rainbow) 810.0 504 Gills of fish that survived had thickened epithelium Herbert & Merkens (1961)
Trout (rainbow) 17,500.0 168 Fish survived: gill epithelium proliferated and thickened Slanina (1962)
Trout (rainbow) 50.0 960 Rate of weight gain reduced (CWS) Herbert & Richards (1963)  
Trout (rainbow) 50.0 960 Rate of weight gain reduced (WF) Herbert & Richards (1963)  
Trout (rainbow) 810.0 504 10 Some fish died Herbert & Merkens (1961)  
Trout (rainbow) 270.0 3,240 10 Survival rate reduced Herbert & Merkens (1961)  
Trout (rainbow) 200.0 24 10 Test fish began to die on the first day (WF) Herbert & Richards (1963)  
Trout (rainbow) 80,000.0 24 10 No mortality Herbert & Richards (1963)  
Trout (rainbow) 18.0 720 10 Abundance reduced Newcombe & Jensen (1997)  
Trout (rainbow) 59.0 2,232 10 Habitat damage: reduced porosity of gravel Slaney et al. (1977)  
Trout (rainbow) 4,250.0 588 12 Mortality rate 50% (CS) Herbert & Wakeford (1962)  
Trout (rainbow) 49,838.0 96 12 Mortality rate 50% (DM) Lawrence & Scherer (1974)
Trout (sea) 210.0 24 10 Fish abandoned traditional spawning habitat Hamilton (1961)  
Whitefish (lake) 16,613.0 96 12 Mortality rate 50% (DM) Lawrence & Scherer (1974)  
Whitefish (mountain) 10,000.0 24.0 10 Fish died; silt-clogged gills Langer (1980)  
Juvenile salmonids 
Grayling (Arctic) 100.0 756 Fish moved out of the test channel McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 1.008 Fish had frequent misstrikes while feeding McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 1.008 Fish responded very slowly to prey McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Rate of feeding reduced McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 840 Rate of feeding reduced McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 1.008 Fish failed to consume all prey McLeay et al. (1987)  
Grayling (Arctic) 300.0 840 Serious impairment of feeding behavior McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Respiration rate increased (FSS) McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Fish less tolerant of pentachlorophenol McLeay et al. (1987)  
Grayling (Arctic) YY 3,810.0 144 Mucus and sediment accumulated in the gill lamellae Simmons (1982)  
Grayling (Arctic) YY 3,810.0 144 Fish displayed many signs of poor condition Simmons (1982)  
Grayling (Arctic) YY 1,250.0 48 Moderate damage to gill tissue Simmons (1982)  
Grayling (Arctic) YY 1,388.0 96 Hyperplasia and hypertrophy of gill tissue Simmons (1982)  
Grayling (Arctic) 100.0 1.008 Growth rate reduced McLeay et al. (1984)  
Grayling (Arctic) 100.0 840 Fish responded less rapidly to drifting food McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Weight gain reduced McLeay et al. (1987)  
Grayling (Arctic) 300.0 756 10 Fish displaced from their habitat McLeay et al. (1987)  
Salmon (chinook) 943.0 72 Tolerance to stress reduced (VA) Stober et al. (1981)  
Salmon (chinook) 6.0 1,440 Growth rate reduced (LNFH) Newcomb & Flagg (1983)  
Salmon (coho) 240.0 24 Cough frequency increased more than 5-fold Servizi & Martens (1992)  
Salmon (coho) 1,547.0 96 Gill damage Noggle (1978)  
Salmon (coho) 2,460.0 24 Fatigue of the cough reflex Servizi & Martens (1992)  
Salmon (coho) 3,000.0 48 High level sublethal stress: avoidance Servizi & Martens (1992)  
Salmon (coho) 8,000.0 96.0 10 Mortality rate 1% Servizi & Martens (1991)  
Salmon (coho) 35,000.0 96 12 Mortality rate 50% Noggle (1978)  
Salmon (coho) 22,700.0 96 12 Mortality rate 50% Servizi & Martens (1991)  
Salmon (coho) F* 8,100.0 96 12 Mortality rate 50% Servizi & Martens (1991)  
Salmon (coho) PS 18,672.0 96 12 Mortality rate 50% Stober et al. (1981)  
Salmon (coho) 28,184.0 96 12 Mortality rate 50% (VA) Stober et al. (1981)  
Salmon (coho) 29,580.0 96 12 Mortality rate 50% Stober et al. (1981)  
Salmon (sockeye) 1,261.0 96 Body moisture content reduced Servizi & Martens (1987)  
Salmon (sockeye) 1,465.0 96 Hypertrophy and necrosis of gill tissue (CSS) Servizi & Martens (1987)  
Salmon (sockeye) 3,143.0 96 Hypertrophy and necrosis of gill tissue (FSS) Servizi & Martens (1987)  
Salmon (sockeye) 2,688.0 96 Hypertrophy and necrosis of gill tissue (MCSS) Servizi & Martens (1987)  
Salmon (sockeye) 2,100.0 96 10 No fish died (MFSS) Servizi & Martens (1987)  
Salmon (sockeye) 9,000.0 96 10 No mortality Servizi & Martens (1987)  
Salmon (sockeye) 13,900.0 96.0 10 Mortality rate 10% (FSS) Servizi & Martens (1987)  
Salmon (sockeye) 9,850.0 96 10 Gill hyperplasia, hypertrophy, separation, necrosis (MFSS) Servizi & Martens (1987)  
Salmon (sockeye) 9,400.0 36 12 Mortality rate 50% Newcomb & Flagg (1983)
Salmon (sockeye) 8,200.0 96 12 Mortality rate 50% (MFSS) Servizi & Martens (1987)  
Salmon (sockeye) 17,560.0 96 12 Mortality rate 50% (FSS) Servizi & Martens (1987)  
Salmon (sockeye) 23,900.0 96 14 Mortality rate 90% (FSS) Servizi & Martens (1987)  
Steelhead 102.0 336 Growth rate reduced (FC. BC) Sigler et al. (1984)  
Trout (brook) FF* 100.0 1,176.0 Test fish weighed 16% of controls (LNFH) Sykora et al. (1972)  
Trout (brook) FF 50.0 1,848 Growth rates declined (LNFH) Sykora et al. (1972)
Trout (rainbow) 4,887.0 384 Hyperplasia of gill tissue Goldes (1983)  
Trout (rainbow) 4,887.0 384 Parasitic infection of gill tissue Goldes (1983)  
Trout (rainbow) 171.0 96.0  Goldes (1983)  
Trout (rainbow) 7,433.0 672 11 Mortality rate 40% (CS) Herbert & Wakeford (1962)  
Salmonid eggs and larvae 
Grayling (Arctic) SF 25.0 24 10 Mortality rate 5.7% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 22.5 48 10 Mortality rate 14.0% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 65.0 24 10 Mortality rate 15.0% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 21.7 72 10 Mortality rate 14.7% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 20.0 96 10 Mortality rate 13.4% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 142.5 48 11 Mortality rate 26% Newcombe & Jensen (1997)
Grayling (Arctic) SF 185.0 72 12 Mortality rate 41.3% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 230.0 96 12 Mortality rate of 47% Newcombe & Jensen (1997)  
Salmon (coho) 157.0 1,728 14 Mortality rate 100% (controls, 16.2%) Shaw & Maga (1943)  
Steelhead 37.0 1,488 12 Hatching success 42% (controls, 63%) Newcombe & Jensen (1997)  
Trout (rainbow) 6.6 1,152 11 Mortality rate 40% Newcomb & Flagg (1983)  
Trout (rainbow) 57.0 1,488.0 12 Mortality rate 47% (controls, 32%) Newcomb & Flagg (1983)
Trout (rainbow) 120.0 384 13 Mortality rates 60–70% (controls, 38.6%) Erman & Lignon (1988)  
Trout (rainbow) 20.8 1,152 13 Mortality rate 72% Newcomb & Flagg (1983)
Trout (rainbow) 46.6 1,152 14 Mortality rate 100% Newcombe & Jensen (1997)
Trout (rainbow) 101.0 1,440 14 Mortality rate 98% (controls, 14.6%) Turnpenny & Williams (1980)
Nonsalmonid eggs and larvae (estuarine, group 4)
Bass (striped) 100.0 24 Hatching delayed Newcombe & Jensen (1997)  
Bass (striped) 1,000.0 68 11 Mortality rate 35% (controls, 16%) Auld & Schubel (1978)  
Bass (striped) 500.0 72 12 Mortality rate 42% (controls, 17%) Auld & Schubel (1978)  
Bass (striped) 485.0 24 12 Mortality rate 50% Morgan et al. (1973)  
Herring 10.0 Depth preference changed Johnson & Wildish (1982)
Herring (lake) 16.0 24 Depth preference changed Swenson & Matson (1976)  
Herring (Pacific) 2,000.0 Feeding rate reduced Newcombe & Jensen (1997)
Herring (Pacific) 1,000.0 24 Mechanical damage to epidermis Newcombe & Jensen (1997)  
Perch (white) 800.0 24 Egg development slowed significantly Morgan et al. (1983)
Perch (white) 100.0 24 Hatching delayed Newcombe & Jensen (1997)
Perch (white) 155.0 48 12 Mortality rate 50% Morgan et al. (1973)  
Perch (white) 373.0 24 12 Mortality rate 50% Morgan et al. (1973)  
Perch (white) 280.0 48 12 Mortality rate 50% Morgan et al. (1973)  
Perch (yellow) 500.0 96 11 Mortality rate 37% (controls, 7%) Auld & Schubel (1978)
Perch (yellow) 1,000.0 96 11 Mortality rate 38% (controls, 7%) Auld & Schubel (1978)  
Shad (American) 100.0 96 10 Mortality rate 18% (controls, 5%) Auld & Schubel (1978)  
Shad (American) 500.0 96 11 Mortality rate 36% (controls, 4%) Auld & Schubel (1978)  
Shad (American) 1,000.0 96 11 Mortality rate 34% (controls, 5%) Auld & Schubel (1978)
Adult nonsalmonids (estuarine or riverine-estuarine, group 5)
Anchovy (bay) 231.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Anchovy (bay) 471.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Bass (striped) 1,500.0 336 Haematocrit increased (FE) Sherk et al. (1975)
Bass (striped) 1,500.0 336 Plasma osmolality increased (FE) Sherk et al. (1975)  
Cunner 28,000.0 24 12 Mortality rate 50% (20.0–25.0 °C) Rogers (1969)
Cunner 133,000.0 12 12 Mortality rate 50% (15 °C) Rogers (1969)  
Cunner 100,000.0 24 12 Mortality rate 50% (15 °C) Rogers (1969)  
Cunner 72,000.0 48 12 Mortality rate 50% (15 °C) Rogers (1969)
Fish 3,000.0 240 10 Fish died Kemp (1949)  
Killifish (striped) 3,277.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)
Killifish (striped) 3,819.0 24 12 Mortality rate 50% Sherk et al. (1975)
Killifish (striped) 12,820.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Killifish (striped) 16,930.0 24 13 Mortality rate 90% Sherk et al. (1975)  
Menhaden (Atlantic) 154.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Menhaden (Atlantic) 247.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Minnow (sheepshead) 100,000.0 24 14 Mortality rate 90% (19 °C) Rogers (1969)
Mummichog 2,447.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)
Mummichog 3,900.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Mummichog 6,217.0 24 14 Mortality rate 90% Sherk et al. (1975)  
Perch (white) 985.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Perch (white) 3,181.0 24 14 Mortality rate 90% (FE) Sherk et al. (1975)  
Rasbora (harlequin) 40,000.0 24 10 Fish died (BC) Alabaster & Lloyd (1980)  
Silverside (Atlantic) 58.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Silverside (Atlantic) 250.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Silverside (Atlantic) 1,000.0 24 14 Mortality rate 90% (FE) Sherk et al. (1975)  
Spot 114.0 48 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Spot 1,309.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Spot 6,875.0 24 10 Mortality rate 10% Sherk et al. (1975)  
Spot 189.0 48 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Spot 2,034.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Spot 8,800.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Spot 11,263.0 24 14 Mortality rate 90% Sherk et al. (1975)  
Stickleback (fourspine) 100.0 24 10 Mortality rate <1% (IA) Rogers (1969)  
Stickleback (fourspine) 10,000.0 24 10 No mortality (KS; 10–12 °C) Rogers (1969)
Stickleback (fourspine) 300.0 24 12 Mortality rate ∼50% (IA) Rogers (1969)
Stickleback (fourspine) 18,000.0 24 12 Mortality rate 50% (15.0–16.0 °C) Rogers (1969)
Stickleback (fourspine) 53,000.0 24 12 Mortality rate 50% (10–12 °C) Rogers (1969)  
Stickleback (fourspine) 330,000.0 24 12 Mortality rate 50% (9.0–9.5 °C) Rogers (1969)  
Stickleback (fourspine) 500.0 24 14 Mortality rate 100% Rogers (1969)  
Stickleback (threespine) 28,000.0 96 10 No mortality in test designed to identify lethal threshold LeGore & DesVoigne (1973)  
Toadfish (oyster) 14,600.0 72 Fish largely unaffected, but developed latent ill effects Neumann et al. (1975)
Toadfish (oyster) 11,090.0 72 Latent ill effects manifested in subsequent test at low SS Neumann et al. (1975)
Adult nonsalmonids (freshwater, group 6)
Bass (largemouth) 62.5 720 Weight gain reduced ∼50% Buck (1956)  
Bass (largemouth) 144.5 720 Growth retarded Buck (1956)  
Bluegill 144.5 720 Growth retarded Buck (1956)  
Bluegill 62.5 720 Weight gain reduced ∼50% Buck (1956)  
Bluegill 144.5 720 12 Fish unable to reproduce Buck (1956)  
Carp (common) 25,000.0 336 10 Some mortality (MC) Wallen (1951)  
Fish 120.0 384 10 Density of fish reduced Erman & Lignon (1988)  
Fish 620.0 48 10 Fish kills downstream from sediment source Hesse & Newcomb (1982)  
Fish 900.0 720 12 Fish absent or markedly reduced in abundance Herbert & Richards (1963)  
Fish (warmwater) 100,000.0 252 10 Some fish died: most survived Wallen (1951)  
Fish (warmwater) 22.0 8,760 12 Fish populations destroyed Newcombe & Jensen (1997)  
Goldfish 25,000.0 336 10 Some mortality (MC) Wallen (1951)  
Sunfish (redear) 62.5 720 Weight gain reduced ∼50% compared to controls Buck (1956)  
Sunfish (redear) 144.5 720 Growth retarded Buck (1956)  
Species Life stage Concentration (mg/L) Exposure duration (h) SEV Fish response description Reference 
Adult salmonids and rainbow smelt 
Grayling (Arctic) 100.0 0.1 Fish avoided turbid water Suchanek et al. (1984a, 1984b
Grayling (Arctic) 100.0 1.008 Fish had decreased resistance to environmental stresses McLeay et al. (1984)  
Grayling (Arctic) 100.0 1.008 Impaired feeding McLeay et al. (1984)  
Grayling (Arctic) 100.0 1.008 Reduced growth McLeay et al. (1984)  
Salmon 25.0 Feeding activity reduced Phillips (1970)  
Salmon 16.5 24 Feeding behavior apparently reduced Ott (1984)  
Salmon 1,650.0 240 Loss of habitat causcd by excessive sediment transport Coats et al. (1985)  
Salmon 75.0 168 Reduced quality of rearing habitat Slaney et al. (1977)b)  
Salmon 210.0 24 10 Fish abandoned their traditional spawning habitat Hamilton (1961)  
Salmon (Atlantic) 2,500.0 24 10 Increased risk of predation Gibson (1933)  
Salmon (chinook) 650.0 168 No histological signs of damage to olfactory epithelium Brannon et al. (1981)  
Salmon (chinook) 350.0 0.17 Home water preference disrupted Whitman et al. (1982)  
Salmon (chinook) 650.0 168 Homing behavior normal, but fewer test fish returned Whitman et al. (1982)  
Salmon (chinook) 39,300.0 24 10 No mortality Newcomb & Flagg (1983)  
Salmon (chinook) 82,400.0 12 Mortality rate 60% Newcomb & Flagg (1983)  
Salmon (chinook) 207,000.0 14 Mortality rate I00% Newcomb & Flagg (1983)  
Salmon (Pacific) 525.0 588 10 No mortality (other end points not investigated) Griffin (1938)  
Salmon (sockeye) 500.0 96  Servizi & Martens (1987)  
Salmon (sockeye) 1,500.0 96  Servizi & Martens (1987)  
Salmon (sockeye) 39,300.0 24 10 No mortality Newcomb & Flagg (1983)  
Salmon (sockeye) 82,400.0 12 Mortality rate 60% Newcomb & Flagg (1983)  
Smell (rainbow) 3.5 168 Increased vulnerability to predation Swenson (1978)  
Stcelhcad 500.0 Signs of sublethal stress (VA) Redding & Schreck (1982)  
Steelhead 16,500.0 240 Loss of habit caused by excessive sediment transport Coats et al. (1985)  
Stcelhcad 500.0 Blood cell count and blood chemistry change Redding & Schreck (1982)  
Trout 16.5 24 Feeding behavior apparently reduced Ott (1984)  
Trout 75.0 168 Reduced quality of rearing habitat Slaney et al. (1977)  
Trout 270.0 312 Gill tissue damaged Herbert & Merkens (1961)  
Trout 525.0 588 10 No mortality (other end points not investigated) Griffin (1938)  
Trout 300.0 720 12 Decrease in population size Newcomb & Flagg (1983)  
Trout (brook) 4.5 168 Fish more active and less dependent on cover Newcomb & Flagg (1983)  
Trout (brown) 18.0 720 10 Abundance reduced Newcombe & Jensen (1997)  
Trout (cutthroat) 35.0 Feeding ceased; fish sought cover Cordone & Kelley (1961)  
Trout (lake) 35.0 168 Fish avoided turbid areas Swenson (1978)  
Trout (rainbow) 66.0 Avoidance behavior manifested part of the lime Lawrence & Scherer (1974)  
Trout (rainbow) 100.0 0.1 Fish avoided turbid water (avoidance behavior) Suchanek et al. (1984a, 1984b
Trout (rainbow) 100.0 0.25 Rate of coughing increased (FSS) Hughes (1975)  
Trout (rainbow) 250.0 0.25 Rate of coughing increased (FSS) Hughes (1975)  
Trout (rainbow) 810.0 504 Gills of fish that survived had thickencd epithelium Herbert & Merkens (1961)  
Trout (rainbow) 17,500.0 168 Fish survived: gill epithelium proliferated and thickencd Slanina (1962)  
Trout (rainbow) 50.0 960 Rate of weight gain reduced (CWS) Herbert & Richards (1963)  
Trout (rainbow) 50.0 960 Rate of weight gain reduced (WF) Herbert & Richards (1963)  
Trout (rainbow) 810.0 504 10 Some fish died Herbert & Merkens (1961)  
Trout (rainbow) 270.0 3,240 10 Survival rate reduced Herbert & Merkens (1961)  
Trout (rainbow) 200.0 24 10 Test fish began to die on the first day (WF) Herbert & Richards (1963)  
Trout (rainbow) 80,000.0 24 10 No mortality Herbert & Richards (1963)  
Trout (rainbow) 18.0 720 10 Abundance reduced Newcombe & Jensen (1997)  
Trout (rainbow) 59.0 2,232 10 Habitat damage: reduced porosity of gravel Slaney et al. (1977)  
Trout (rainbow) 4,250.0 588 12 Mortality rate 50% (CS) Herbert & Wakeford (1962)  
Trout (rainbow) 49,838.0 96 12 Mortality rate 50% (DM) Lawrence & Schercr (1974)  
Trout (sea) 210.0 24 10 Fish abandoned traditional spawning habitat Hamilton (1961)  
Whitefish (lake) 16,613.0 96 12 Mortality rate 50% (DM) Lawrence & Scherer (1974)  
Whitefish (mountain) 10,000.0 24.0 10 Fish died; silt-clogged gills Langer (1980)  
Juvenile salmonids 
Grayling (Arctic) 100.0 756 Fish moved out of the test channel McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 1.008 Fish had frequent misstrikes while feeding McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 1.008 Fish responded very slowly to prey McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Rate of feeding reduced McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 840 Rate of feeding reduced McLeay et al. (1987)  
Grayling (Arctic) 1,000.0 1.008 Fish failed to consume all prey McLeay et al. (1987)  
Grayling (Arctic) 300.0 840 Serious impairment of feeding behavior McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Respiration rate increased (FSS) McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Fish less tolerant of pentachlorophenol McLeay et al. (1987)  
Grayling (Arctic) YY 3,810.0 144 Mucus and sediment accumulated in the gill lamellae Simmons (1982)  
Grayling (Arctic) YY 3,810.0 144 Fish displayed many signs of poor condition Simmons (1982)  
Grayling (Arctic) YY 1,250.0 48 Moderate damage to gill tissue Simmons (1982)  
Grayling (Arctic) YY 1,388.0 96 Hyperplasia and hypertrophy of gill tissue Simmons (1982)  
Grayling (Arctic) 100.0 1.008 Growth rate reduced McLeay et al. (1984)  
Grayling (Arctic) 100.0 840 Fish responded less rapidly to drifting food McLeay et al. (1987)  
Grayling (Arctic) 300.0 1.008 Weight gain reduced McLeay et al. (1987)  
Grayling (Arctic) 300.0 756 10 Fish displaced from their habitat McLeay et al. (1987)  
Salmon (chinook) 943.0 72 Tolerance to stress reduced (VA) Stober et al. (1981)  
Salmon (chinook) 6.0 1,440 Growth rate reduced (LNFH) Newcomb & Flagg (1983)  
Salmon (coho) 240.0 24 Cough frequency increased more than 5-fold Servizi & Martens (1992)  
Salmon (coho) 1,547.0 96 Gill damage Noggle (1978)  
Salmon (coho) 2,460.0 24 Fatigue of the cough reflex Servizi & Martens (1992)  
Salmon (coho) 3,000.0 48 High level sublethal stress: avoidance Servizi & Martens (1992)  
Salmon (coho) 8,000.0 96.0 10 Mortality rate 1% Servizi & Martens (1991)  
Salmon (coho) 35,000.0 96 12 Mortality rate 50% Noggle (1978)  
Salmon (coho) 22,700.0 96 12 Mortality rate 50% Servizi & Martens (1991)  
Salmon (coho) F* 8,100.0 96 12 Mortality rate 50% Servizi & Martens (1991)  
Salmon (coho) PS 18,672.0 96 12 Mortality rate 50% Stober et al. (1981)  
Salmon (coho) 28,184.0 96 12 Mortality rate 50% (VA) Stober et al. (1981)  
Salmon (coho) 29,580.0 96 12 Mortality rate 50% Stober et al. (1981)  
Salmon (sockeye) 1,261.0 96 Body moisture content reduced Servizi & Martens (1987)  
Salmon (sockeye) 1,465.0 96 Hypertrophy and necrosis of gill tissue (CSS) Servizi & Martens (1987)  
Salmon (sockeye) 3,143.0 96 Hypertrophy and necrosis of gill tissue (FSS) Servizi & Martens (1987)  
Salmon (sockeye) 2,688.0 96 Hypertrophy and necrosis of gill tissue (MCSS) Servizi & Martens (1987)  
Salmon (sockeye) 2,100.0 96 10 No fish died (MFSS) Servizi & Martens (1987)  
Salmon (sockeye) 9,000.0 96 10 No mortality Servizi & Martens (1987)  
Salmon (sockeye) 13,900.0 96.0 10 Mortality rate 10% (FSS) Servizi & Martens (1987)  
Salmon (sockeye) 9,850.0 96 10 Gill hyperplasia, hypertrophy, separation, necrosis (MFSS) Servizi & Martens (1987)  
Salmon (sockeye) 9,400.0 36 12 Mortality rale 50% Newcomb & Flagg (1983)  
Salmon (sockeye) 8,200.0 96 12 Mortality rate 50% (MFSS) Servizi & Martens (1987)  
Salmon (sockeye) 17,560.0 96 12 Mortality rate 50% (FSS) Servizi & Martens (1987)  
Salmon (sockeye) 23,900.0 96 14 Mortality rate 90% (FSS) Servizi & Martens (1987)  
Steelhead 102.0 336 Growth rate reduced (FC. BC) Sigler et al. (1984)  
Trout (brook) FF* 100.0 1,176.0 Test fish weighed 16% of controls (LNFH) Sykora et al. (1972)  
Trout (brook) FF 50.0 1,848 Growlh rates declined (LNFH) Sykora et al. (1972)  
Trout (rainbow) 4,887.0 384 Hyperplasia of gill tissue Goldes (1983)  
Trout (rainbow) 4,887.0 384 Parasitic infection of gill tissue Goldes (1983)  
Trout (rainbow) 171.0 96.0  Goldes (1983)  
Trout (rainbow) 7,433.0 672 11 Mortality rate 40% (CS) Herbert & Wakeford (1962)  
Salmonid eggs and larvae 
Grayling (Arctic) SF 25.0 24 10 Mortality rate 5.7% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 22.5 48 10 Mortality rate 14.0% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 65.0 24 10 Mortality rate 15.0% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 21.7 72 10 Mortality rate 14.7% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 20.0 96 10 Mortality rate 13.4% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 142.5 48 11 Mortality rale 26% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 185.0 72 12 Mortality rate 41.3% Newcombe & Jensen (1997)  
Grayling (Arctic) SF 230.0 96 12 Mortality rate of 47% Newcombe & Jensen (1997)  
Salmon (coho) 157.0 1,728 14 Mortality rate 100% (controls, 16.2%) Shaw & Maga (1943)  
Steelhead 37.0 1,488 12 Hatching success 42% (controls, 63%) Newcombe & Jensen (1997)  
Trout (rainbow) 6.6 1,152 11 Mortality rate 40% Newcomb & Flagg (1983)  
Troul (rainbow) 57.0 1,488.0 12 Mortality rate 47% (controls, 32%) Newcomb & Flagg (1983)  
Trout (rainbow) 120.0 384 13 Mortality rates 60–70% (controls, 38.6%) Erman & Lignon (1988)  
Trout (rainbow) 20.8 1,152 13 Mortality rale 72% Newcomb & Flagg (1983)  
Troul (rainbow) 46.6 1,152 14 Mortality rale 100% Newcombe & Jensen (1997)  
Trout (rainbow) 101.0 1,440 14 Mortality rate 98% (controls, 14.6%) Nonsalmonid eggs and larvae (estuarined, group 4) TUrnpenny & Williams (1980)  
Nonsalmonid eggs and larvae 
Bass (striped) 100.0 24 Hatching delayed Newcombe & Jensen (1997)  
Bass (striped) 1,000.0 68 11 Mortality rate 35% (controls, 16%) Auld & Schubel (1978)  
Bass (striped) 500.0 72 12 Mortality rate 42% (controls, 17%) Auld & Schubel (1978)  
Bass (striped) 485.0 24 12 Mortality rate 50% Morgan et al. (1973)  
Herring 10.0 Depth preference changed Johnson & Wildish (1982)  
Herring (lake) 16.0 24 Depth preference changed Swenson & Matson (1976)  
Herring (Pacific) 2,000.0 Feeding rate reduced Newcombe & Jensen (1997)  
Herring (Pacific) 1,000.0 24 Mechanical damage to epidermis Newcombe & Jensen (1997)  
Perch (white) 800.0 24 Egg development slowed significantly Morgan et al. (1983)  
Perch (white) 100.0 24 Hatching delayed Newcombe & Jensen (1997)  
Perch (white) 155.0 48 12 Mortality rate 50% Morgan et al. (1973)  
Perch (white) 373.0 24 12 Mortality rate 50% Morgan et al. (1973)  
Perch (white) 280.0 48 12 Mortality rate 50% Morgan et al. (1973)  
Perch (yellow) 500.0 96 11 Mortality rate 37% (controls, 7%) Auld & Schubel (1978)  
Perch (yellow) 1,000.0 96 11 Mortality rate 38% (controls, 7%) Auld & Schubel (1978)  
Shad (American) 100.0 96 10 Mortality rate 18% (controls, 5%) Auld & Schubel (1978)  
Shad (American) 500.0 96 11 Mortality rate 36% (controls, 4%) Auld & Schubel (1978)  
Shad (American) 1,000.0 96 11 Mortality rate 34% (controls, 5%) Auld & Schubel (1978)  
Adult nonsalmonids (estuarine or riverine-estuarine, group 5) 
Anchovy (bay) 231.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Anchovy (bay) 471.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Bass (striped) 1,500.0 336 Haematocrit increased (FE) Sherk et al. (1975)  
Bass (striped) 1,500.0 336 Plasma osmolality increased (FE) Sherk et al. (1975)  
Cunner 28,000.0 24 12 Mortality rate 50% (20.0–25.0 °C) Rogers (1969)  
Cunner 133,000.0 12 12 Mortality rate 50% (15 °C) Rogers (1969)  
Cunner 100,000.0 24 12 Mortality rate 50% (15 °C) Rogers (1969)  
Cunner 72,000.0 48 12 Mortality rate 50% (15 °C) Rogers (1969)  
Fish 3,000.0 240 10 Fish died Kemp (1949)  
Killifish (striped) 3,277.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Killifish (striped) 3,819.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Killifish (striped) 12,820.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Killifish (striped) 16,930.0 24 13 Mortality rate 90% Sherk et al. (1975)  
Menhaden (Atlantic) 154.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Menhaden (Atlantic) 247.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Minnow (sheepshead) 100,000.0 24 14 Mortality rate 90% (19 °C) Rogers (1969)  
Mummichog 2,447.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Mummichog 3,900.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Mummichog 6,217.0 24 14 Mortality rate 90% Sherk et al. (1975)  
Perch (white) 985.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Perch (white) 3,181.0 24 14 Mortality rate 90% (FE) Sherk et al. (1975)  
Rasbora (harlequin) 40,000.0 24 10 Fish died (BC) Alabaster & Lloyd (1980)  
Silverside (Atlantic) 58.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Silverside (Atlantic) 250.0 24 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Silverside (Atlantic) 1,000.0 24 14 Mortality rate 90% (FE) Sherk et al. (1975)  
Spot 114.0 48 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Spot 1,309.0 24 10 Mortality rate 10% (FE) Sherk et al. (1975)  
Spot 6,875.0 24 10 Mortality rate 10% Sherk et al. (1975)  
Spot 189.0 48 12 Mortality rate 50% (FE) Sherk et al. (1975)  
Spot 2,034.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Spot 8,800.0 24 12 Mortality rate 50% Sherk et al. (1975)  
Spot 11,263.0 24 14 Mortality rate 90% Sherk et al. (1975)  
Stickleback (fourspine) 100.0 24 10 Mortality rate <1% (IA) Rogers (1969)  
Stickleback (fourspine) 10,000.0 24 10 No mortality (KS; 10–12 °C) Rogers (1969)  
Stickleback (fourspine) 300.0 24 12 Mortality rate ∼50% (IA) Rogers (1969)  
Stickleback (fourspine) 18,000.0 24 12 Mortality rate 50% (15.0–16.0 °C) Rogers (1969)  
Stickleback (fourspine) 53,000.0 24 12 Mortality rate 50% (10–12 °C) Rogers (1969)  
Stickleback (fourspine) 330,000.0 24 12 Mortality rate 50% (9.0–9.5 °C) Rogers (1969)  
Stickleback (fourspine) 500.0 24 14 Mortality rate 100% Rogers (1969)  
Stickleback (threespine) 28,000.0 96 10 No mortality in test designed to identify lethal threshold LeGore & DesVoigne (1973)  
Toadfish (oyster) 14,600.0 72 Fish largely unaffected, but developed latent ill effects Neumann et al. (1975)  
Toadfish (oyster) 11,090.0 72 Latent ill effects manifested in subsequent test at low SS Neumann et al. (1975)  
Adult nonsalmonids (freshwater, group 6) 
Bass (largemouth) 62.5 720 Weight gain reduced ∼50% Buck (1956)  
Bass (largemouth) 144.5 720 Growth retarded Buck (1956)  
Bluegill 144.5 720 Growth retarded Buck (1956)  
Bluegill 62.5 720 Weight gain reduced ∼50% Buck (1956)  
Bluegill 144.5 720 12 Fish unable to reproduce Buck (1956)  
Carp (common) 25,000.0 336 10 Some mortality (MC) Wallen (1951)  
Fish 120.0 384 10 Density of fish reduced Erman & Lignon (1988)  
Fish 620.0 48 10 Fish kills downstream from sediment source Hesse & Newcomb (1982)  
Fish 900.0 720 12 Fish absent or markedly reduced in abundance Herbert & Richards (1963)  
Fish (warmwater) 100,000.0 252 10 Some fish died: most survived Wallen (1951)  
Fish (warmwater) 22.0 8,760 12 Fish populations destroyed Newcombe & Jensen (1997)  
Goldfish 25,000.0 336 10 Some mortality (MC) Wallen (1951)  
Sunfish (redear) 62.5 720 Weight gain reduced ∼50% compared to controls Buck (1956)  
Sunfish (redear) 144.5 720 Growth retarded Buck (1956)  

*: A = adult; E = egg; EE = eyed egg; F = fry; F* = swim-up fry; FF = young fry (<30 weeks old); FF* = older fry (>30 weeks old); J = juvenile; L = larva; PS = presmolt; S = smolt; SF = sac fry; U = underyearling; Y = approximate yearling; YY = young of the year. Sediment particle sizes are abbreviated as follows: VFSS = very fine; FSS = fine; MFSS = medium to fine; MCSS = medium to coarse; CSS = coarse. Usual ‘sediments’ used: BC = bentonite clay; CS = calcium sulfate; CWS = coal washery solids; DE = diatomaceous earth; DM = drilling mud (nontoxic); FC = fire clay; FE = fuller's earth; IA = incinerator ash; KC = kaolin clay; KS = Kingston silt; LNFH = lime-neutralized ferric hydroxide; MC = montmorillonite clay; VA = volcanic ash; WF = wood fibers. NTU = nephelometric turbidity units.

We scored qualitative response data along a semiquantitative ranking scale (Table 1). Superimposed on a 15-point scale (0–14) were four major classes of effect: (1) nil effect, (2) behavioral effects, (3) sublethal effects (a category that also includes effects such as short-term reduction in feeding success), and (4) lethal effects (direct mortality, or its paralethal surrogates: reduced growth, reduced fish density, habitat damage such as reduced porosity of spawning gravel, delayed hatching, and reduction in population size). When these various effects could be compared directly, pollution episodes associated with sublethal or lethal effects also degraded habitat and reduced population size, which is why these seemingly disparate ill effects are grouped together in the hierarchy. For events between the extremes of nil effect and 100% mortality, we assumed for modeling purposes that the SEV scale represents proportional differences in true effects (Table 2). In this study, we define dose as the concentration of suspended sediment (SS) multiplied by the duration of exposure; dose has the units mg SS·h·L⁻¹. The single decision tree (SDT), which is the basis of data presentation in this study, encompasses all combinations of sediment concentration (1–500,000 mg SS/L) and duration of exposure.
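As a small worked illustration of the dose definition, the sketch below multiplies concentration by duration for one row of the data table (Arctic grayling sac fry exposed to 22.5 mg SS/L for 48 h). The function name `dose` is introduced here only for illustration and is not part of the original study.

```python
def dose(concentration_mg_per_l: float, duration_h: float) -> float:
    """Dose in mg SS·h/L: SS concentration (mg SS/L) times exposure duration (h)."""
    return concentration_mg_per_l * duration_h


# Arctic grayling sac fry row from the data table: 22.5 mg SS/L for 48 h
grayling_dose = dose(22.5, 48)
print(f"dose = {grayling_dose} mg SS·h/L")
```

For this row the dose works out to 22.5 × 48 = 1,080 mg SS·h/L.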

Table 2

Scale of the severity (SEV) of ill effects in fishes exposed to excess suspended sediment

| Class of effect | Severity index | Description of effect |
|---|---|---|
| Nil effect | 0 | No behavioral effect |
| Behavioral effects | 1 | Alarm reaction |
| | 2 | Abandonment of cover |
| | 3 | Avoidance response |
| Sublethal effects | 4 | Short-term reduction in feeding rate; short-term reduction in feeding success |
| | 5 | Minor physiological stress; increase in rate of coughing; increased respiration rate |
| | 6 | Moderate physiological stress |
| | 7 | Moderate habitat degradation; impaired homing |
| | 8 | Indications of major physiological stress; long-term reduction in feeding rate; long-term reduction in feeding success; poor condition |
| Lethal and paralethal effects | 9 | Reduced growth rate; delayed hatching; reduced fish density |
| | 10 | 0–20% mortality; increased predation; moderate to severe habitat degradation |
| | 11 | >20–40% mortality |
| | 12 | >40–60% mortality |
| | 13 | >60–80% mortality |
| | 14 | >80–100% mortality |

K-Means clustering

K-Means clustering identifies clusters of items that share the same target category; a new data item is then predicted to belong to the category of the nearest cluster center (Kim & Yamashita 2010).

K-Means clustering is similar to two other more modern methods:

  • Radial basis function (RBF) neural networks. An RBF network also identifies the centers of clusters, but it makes predictions by considering the Gaussian-weighted distance to all cluster centers rather than just the closest one.

  • Probabilistic neural networks (PNNs). Each data point is treated as a separate cluster, and a prediction is made by computing the Gaussian-weighted distance to each point.

Usually, both RBF networks and PNNs are more accurate than K-Means clustering models. PNNs are among the most accurate of all methods, but they become impractically slow when the training data file has more than about 10,000 rows. K-Means clustering is faster than RBF networks or PNNs, and it can handle large training files.

K-Means clustering can be used only for classification (i.e., with a categorical target variable), not for regression. The target variable may have two or more categories. The algorithm in its simplest form consists of the following steps:

  1. Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.

  2. Assign each object to the group that has the closest centroid.

  3. When all objects have been assigned, recalculate the positions of the K centroids.

  4. Repeat Steps 2 and 3 until the centroids no longer move.

This produces a separation of the objects into groups from which the metric to be minimized can be calculated. The K-Means clustering routine was not designed to show the relationship between clusters. Instead, K-Means clusters are constructed so that the average behavior in each group is distinct from any of the other groups. For example, in a time series experiment you could use K-Means clustering to identify unique classes of pedestrian-involved crashes that are determined in a time dependent manner.
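The four steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the software used in the study: the function name `kmeans`, the optional `init` argument, the choice of random data points as starting centroids, and the toy coordinates are all hypothetical.

```python
import math
import random


def kmeans(points, k, init=None, seed=0, max_iter=100):
    """Plain K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 1: place K initial centroids (assumed default: K random data points)
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Step 2: assign each object to the group with the closest centroid
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Step 3: recalculate each centroid as the mean of its members
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # an empty group keeps its centroid
        # Step 4: stop once the centroids no longer move
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


# Two well-separated toy groups (made-up numbers, not the study's data)
toy = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centers, assignment = kmeans(toy, k=2, init=[toy[0], toy[3]])
print(centers, assignment)
```

Passing `init` explicitly makes the run reproducible; with random starting points the result can depend on the initial choice, which is why the starting positions are treated as a separate step later in the paper.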

In the K-Means routine, a simple and widely used squared-error cost function is employed to measure the distance, defined as:
$$E = \sum_{j=1}^{k} \sum_{i=1}^{N} \lVert v_i - c_j \rVert^{2} \tag{1}$$
where E is the total squared error, N and k are the number of data samples and the number of centers, respectively, and the inner sum is taken over the data samples v_i (here, the coordinates of the ith sample in the input space) that belong to center c_j. During the clustering process, the centers are adjusted according to a certain set of rules such that, by searching for the center c_j as the data are presented, the total distance in Equation (1) is minimized. The Euclidean distances between the data sample and all the centers are calculated and the nearest center is updated according to:
formula
(2)
where z indicates the nearest center to the data sample v(t). Note that the centers and the data are written in terms of time t, where c_z(t − 1) represents the center location at the previous clustering step.
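One common form of such a nearest-center update is c_z(t) = c_z(t − 1) + η(t)[v(t) − c_z(t − 1)]. It is given here as an assumption, since it introduces an adaptation rate η(t) that is not defined in the text above. The sketch below applies this assumed rule to one sample; the function name `update_nearest_center` and the numbers are hypothetical.

```python
import math


def update_nearest_center(centers, v, eta):
    """Move the center nearest to sample v a fraction eta of the way toward v.

    centers: list of coordinate tuples, modified in place
    v:       the data sample presented at this step
    eta:     adaptation rate in (0, 1] (assumed parameter, not from the source)
    Returns the index z of the center that was updated.
    """
    # Find the nearest center z by Euclidean distance
    z = min(range(len(centers)), key=lambda j: math.dist(v, centers[j]))
    # c_z(t) = c_z(t-1) + eta * (v(t) - c_z(t-1))
    centers[z] = tuple(c + eta * (x - c) for c, x in zip(centers[z], v))
    return z


centers = [(0.0, 0.0), (10.0, 10.0)]
z = update_nearest_center(centers, (2.0, 0.0), eta=0.5)
print(z, centers[z])
```

Only the nearest center moves; all other centers keep their previous positions.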

RESULTS AND DISCUSSION

In this section, the K-Means clustering algorithm is used to evaluate the performance of classification using the SEV index for fish. There are two issues in creating a K-Means clustering model: (1) determining the optimal number of clusters to create; and (2) determining the center of each cluster.

In this paper, we provide an automatic search function that creates models using a varying number of clusters, tests each one and reports which is best. The model performance tests can be performed using cross-validation or holdout sampling. The misclassification rate (%) obtained for each number of clusters is shown in Figure 1, which indicates that the best number of clusters is 47.
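A search of this kind can be sketched as follows, assuming holdout sampling. This is an illustration only and not the search function used in the study: it clusters the training inputs, labels each cluster with the majority category of its members, predicts the category of the nearest center, and reports the holdout misclassification for each candidate number of clusters. All function names and the toy data (with made-up SEV labels 4, 10 and 12) are hypothetical.

```python
import math
import random
from collections import Counter


def nearest(p, centroids):
    """Index of the centroid closest to point p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))


def kmeans(points, k, rng, max_iter=100):
    """Plain K-Means started from k random data points; returns the centroids."""
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids


def fit(train_x, train_y, k, rng):
    """Cluster the inputs, then label each cluster with its majority category."""
    centroids = kmeans(train_x, k, rng)
    votes = [Counter() for _ in range(k)]
    for x, y in zip(train_x, train_y):
        votes[nearest(x, centroids)][y] += 1
    cluster_class = [v.most_common(1)[0][0] if v else None for v in votes]
    return centroids, cluster_class


def misclassification(model, xs, ys):
    """Percentage of holdout rows whose nearest-cluster category is wrong."""
    centroids, cluster_class = model
    wrong = sum(cluster_class[nearest(x, centroids)] != y for x, y in zip(xs, ys))
    return 100.0 * wrong / len(ys)


def search_k(train, test, candidates, seed=0):
    """Try each candidate number of clusters and return the best one."""
    (train_x, train_y), (test_x, test_y) = train, test
    results = {}
    for k in candidates:
        model = fit(train_x, train_y, k, random.Random(seed))
        results[k] = misclassification(model, test_x, test_y)
    best = min(results, key=results.get)
    return best, results


# Toy data: three separated groups with hypothetical SEV labels 4, 10 and 12
train_x = [(b + dx, dy) for b in (0.0, 10.0, 20.0)
           for dx, dy in ((0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2))]
train_y = [4] * 4 + [10] * 4 + [12] * 4
test_x = [(0.1, 0.1), (0.2, 0.2), (10.1, 0.1), (10.2, 0.2), (20.1, 0.1), (20.2, 0.2)]
test_y = [4, 4, 10, 10, 12, 12]

best_k, errors = search_k((train_x, train_y), (test_x, test_y), range(1, 5))
print(best_k, errors)
```

Replacing the single holdout split with k-fold cross-validation would only change how `results[k]` is averaged.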

Figure 1

Misclassification(%) with a varying number of clusters.

Given the number of clusters, the second part of the problem is determining where to place the center of each cluster. Often, points are scattered and don't fall into easily recognizable groupings. Cluster center determination is done in two steps:

A. Determine starting positions for the clusters. This is performed in two steps:

  1. Assign the first center to a random point.

  2. Find the point furthest from any existing center and assign the next center to it. Repeat this until the specified number of cluster centers have been found.

B. Adjust the center positions until they are optimized.
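Step A can be sketched as a farthest-point selection. The sketch assumes that "furthest from any existing center" means the point whose distance to its nearest already-chosen center is largest; that reading, the function name `farthest_point_init` and the toy points are assumptions made for illustration.

```python
import math
import random


def farthest_point_init(points, k, seed=0):
    """Choose k starting centers by repeatedly taking the most remote point."""
    rng = random.Random(seed)
    # Step 1: assign the first center to a random point
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Step 2: take the point whose nearest existing center is farthest away
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers


# Toy points forming three loose groups (made-up numbers)
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 8.0)]
print(farthest_point_init(pts, k=3))
```

The centers returned here would then be passed to the adjustment stage (Step B), for example as the starting centroids of the K-Means iteration.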

Tables 3 and 4 show the confusion matrix and the sensitivity and specificity for each target category of the SEV index. A confusion matrix tabulates the number of correct and incorrect predictions, broken down by class.
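To make the bookkeeping concrete, a confusion matrix can be tallied from paired actual and predicted categories as in the sketch below. The row/column convention (rows are actual categories, columns are predicted categories), the function name `confusion_matrix` and the toy labels are assumptions for illustration, not values from this study.

```python
from collections import Counter


def confusion_matrix(actual, predicted, categories):
    """Rows: actual category; columns: predicted category; cells: counts."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in categories] for a in categories]


# Hypothetical SEV labels for six test records
actual = [10, 10, 12, 12, 12, 14]
predicted = [10, 12, 12, 12, 10, 14]
categories = [10, 12, 14]

for cat, row in zip(categories, confusion_matrix(actual, predicted, categories)):
    print(cat, row)
```

Correct predictions fall on the diagonal; every off-diagonal cell counts one kind of misclassification.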

Table 3

Confusion matrix for each target category

Category 10 11 12 13 14 
10 
26 
13 
10 27 
11 
12 13 32 
13 
14 10 
Table 4

Sensitivity & specificity for each target category

 10 11 12 13 14 
Accuracy 97.69% 92.08% 94.72% 96.04% 96.04% 96.04% 84.16% 93.07% 78.22% 95.38% 75.58% 97.36% 86.14% 
True positive (TP) 0.00% 0.99% 3.30% 0.00% 0.33% 0.66% 8.58% 4.29% 8.91% 1.65% 10.56% 0.00% 1.98% 
True negative (TN) 97.69% 91.09% 91.42% 96.04% 95.71% 95.38% 75.58% 88.78% 69.31% 93.73% 65.02% 97.36% 84.16% 
False positive (FP) 1.32% 3.30% 2.31% 1.65% 1.32% 1.98% 8.91% 2.64% 11.55% 2.64% 12.87% 0.99% 7.26% 
False negative (FN) 0.99% 4.62% 2.97% 2.31% 2.64% 1.98% 6.93% 4.29% 10.23% 1.98% 11.55% 1.65% 6.60% 
Sensitivity 0.00% 17.65% 52.63% 0.00% 11.11% 25.00% 55.32% 50.00% 46.55% 45.45% 47.76% 0.00% 23.08% 
Specificity 98.67% 96.50% 97.54% 98.31% 98.64% 97.97% 89.45% 97.11% 85.71% 97.26% 83.47% 98.99% 92.06% 
Geometric mean of sensitivity and specificity 0.00% 41.27% 71.65% 0.00% 33.11% 49.49% 70.35% 69.68% 63.17% 66.49% 63.14% 0.00% 46.09% 
Positive Predictive Value (PPV) 0.00% 23.08% 58.82% 0.00% 20.00% 25.00% 49.06% 61.90% 43.55% 38.46% 45.07% 0.00% 21.43% 
Negative Predictive Value (NPV) 99.00% 95.17% 96.85% 97.65% 97.32% 97.97% 91.60% 95.39% 87.14% 97.93% 84.91% 98.33% 92.73% 
Geometric mean of PPV and NPV 0.00% 46.86% 75.48% 0.00% 44.12% 49.49% 67.03% 76.84% 61.60% 61.37% 61.86% 0.00% 44.58% 

A comparison of accuracy across the SEV target categories in Table 4 shows that categories 10 and 12 have the lowest accuracy (78.22% and 75.58%, respectively), and therefore a lower recognition rate, lower forecast accuracy and weaker practical value. A validation measure is therefore required to tell us how good the clustering is. Indeed, cluster validity has become the core task of cluster analysis, for which a great number of validation measures have been proposed and carefully studied in the literature.
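The per-category figures in Table 4 are consistent with the usual one-versus-rest definitions, which the sketch below illustrates for category 12. The counts used (TP = 32, FP = 39, FN = 35, TN = 197) are not given in the paper; they are back-calculated here from the category 12 percentages in Table 4 and the 303 records, so they should be read as an assumption. The function name `class_metrics` is hypothetical.

```python
import math


def class_metrics(tp, fp, fn, tn):
    """One-versus-rest metrics for a single target category, as fractions."""
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    npv = tn / (tn + fn) if tn + fn else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "gmean_sens_spec": math.sqrt(sensitivity * specificity),
        "ppv": ppv,
        "npv": npv,
        "gmean_ppv_npv": math.sqrt(ppv * npv),
    }


# Category 12: counts back-calculated from Table 4 percentages (assumption)
metrics = class_metrics(tp=32, fp=39, fn=35, tn=197)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Because the true negatives dominate each one-versus-rest comparison, accuracy stays high even when sensitivity is low, which is why sensitivity and the predictive values are reported alongside it.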

SUMMARY AND CONCLUSIONS

Clustering for understanding employs cluster analysis to automatically find conceptually meaningful groups of objects that share common characteristics. It plays an important role in helping people to analyze, describe and utilize the valuable information hidden in the groups. Despite the vast amount of research devoted to the SEV index, there is as yet no consistent and conclusive solution to cluster validation, and the most suitable measures to use in practice remain unclear. In this paper, we provide a simple and qualitative methodology, based on comparing the predictive clustering algorithm with various distance metrics, to improve the performance of classification by the SEV index for fish. We implemented the K-Means clustering algorithm on 303 results regarding aquatic ecosystem quality and 14 categories based on SEV. The results of this study show that the classification of SEV categories 10 and 12 should be improved, and that their accuracy has to be taken into account in calculations related to the analysis of aquatic ecosystem quality. Finally, the K-Means clustering algorithm can be used to make inferences that help us understand the ‘big picture’ of the model and to identify other areas of concern that may warrant further investigation, analysis, problem identification and countermeasure design.

REFERENCES

Alabaster, J. S. & Lloyd, R. 1980 Finely divided solids. In: Water Quality Criteria for Freshwater Fish. Butterworth, London, pp. 1–20.
Auld, B. & Schubel, L. 1978 Effects of Short Term Exposure to Suspended Sediments on the Behaviour of Juvenile Coho Salmon. Master's thesis, University of British Columbia, Vancouver.
Brannon, E. L., Whitman, R. P. & Quinn, T. P. 1981 Report on the Influence of Suspended Volcanic Ash on the Homing Behavior of Adult Chinook Salmon (Oncorhynchus tshawytscha). Final Report to Washington State University, Washington Water Research Center, Pullman.
Buck, D. H. 1956 Effects of turbidity on fish and fishing. Transactions of the North American Wildlife Conference 21, 249–261.
Byeon, H. 2014 The risk factors of laryngeal pathology in Korean adults using a decision tree model. Journal of Voice 29 (1), 59–64.
Coats, R., Collins, L., Florsheim, J. & Kaufman, D. 1985 Channel change, sediment transport, and fish habitat in a coastal stream: effects of an extreme event. Environmental Management 9, 35–48.
Cordone, A. J. & Kelley, D. W. 1961 The influences of inorganic sediment on the aquatic life of streams. California Fish and Game 47, 189–223.
Gibson, A. M. 1933 Construction and Operation of a Tidal Model of the Severn Estuary. His Majesty's Stationery Office, London.
Goldes, S. A. 1983 Histological and Ultrastructural Effects of the Inert Clay Kaolin on the Gills of Rainbow Trout (Salmo gairdneri Richardson). Master's thesis, University of Guelph, Guelph, Ontario.
Griffin, L. E. 1938 Experiments on the tolerance of young trout and salmon for suspended sediment in water. Oregon Department of Geology and Mineral Industries, Bulletin 10, 10, 28.
Hamilton, J. D. 1961 The effect of sand-pit washings on a stream fauna. Internationale Vereinigung für Theoretische und Angewandte Limnologie Verhandlungen 14, 435–439.
Herbert, D. W. M. & Merkens, J. C. 1961 The effect of suspended mineral solids on the survival of trout. International Journal of Air and Water Pollution 5, 46–55.
Herbert, D. W. M. & Richards, J. M. 1963 The growth and survival of fish in some suspensions of solids of industrial origin. International Journal of Air and Water Pollution 7, 297–302.
Herbert, D. W. M. & Wakeford, A. C. 1962 The effect of calcium sulphate on the survival of rainbow trout. Water and Waste Treatment 8, 608–609. (Not seen: cited by Alabaster and Lloyd 1980.)
Hesse, L. W. & Newcomb, B. A. 1982 Effects of flushing Spencer Hydro on water quality, fish, and insect fauna in the Niobrara River, Nebraska. North American Journal of Fisheries Management 2, 45–52.
Johnson, D. D. & Wildish, D. J. 1982 Effect of suspended sediment on feeding by larval herring (Clupea harengus harengus L.). Bulletin of Environmental Contamination and Toxicology 29, 261–267.
Kemp, H. A. 1949 Soil pollution in the Potomac River basin. American Water Works Association Journal 41, 792–796. (Not seen: cited by Cordone and Kelley 1961.)
Khakzad, H. & Elfimov, V. I. 2015b Estimate of time required for environmentally friendly flushing in Dez dam reservoir. Water Practice and Technology 10 (1), 73–85.
Langer, O. E. 1980 Effects of Sedimentation on Salmonid Stream Life. Environment Canada, Environmental Protection Service, unpublished report, North Vancouver, British Columbia.
Lawrence, M. & Scherer, E. 1974 Behavioral Responses of Whitefish and Rainbow Trout to Drilling Fluids. Canada Fisheries and Marine Service Technical Report 502.
MacDonald, D. D. & Newcombe, C. P. 1993 Utility of the stress index for predicting suspended sediment effects: response to comment. North American Journal of Fisheries Management 13, 873–876.
McLeay, D. J., Ennis, G. L., Birtwell, I. K. & Hartman, G. F. 1984 Effects on Arctic grayling (Thymallus arcticus) of prolonged exposure to Yukon placer mining sediment: laboratory study. Canadian Technical Report of Fisheries and Aquatic Sciences, report 1241.
McLeay, D. J., Birtwell, I. K., Hartman, G. F. & Ennis, G. L. 1987 Responses of Arctic grayling (Thymallus arcticus) to acute and prolonged exposure to Yukon placer mining sediment. Canadian Journal of Fisheries and Aquatic Sciences 44, 658–673.
Morgan, R. P. II, Rasin, J. V. Jr. & Noe, L. A. 1973 Effects of Suspended Sediments on the Development of Eggs and Larvae of Striped Bass and White Perch, Appendix 11. Final Report to U.S. Army Corps of Engineers, Contract DACW61-71-C0062, Philadelphia.
Morgan, R. P. II, Rasin, J. R. Jr. & Noe, L. A. 1983 Sediment effects on eggs and larvae of striped bass and white perch. Transactions of the American Fisheries Society 112, 220–224.
Neumann, D. A., O'Connor, J. M., Sherk, J. A. & Wood, K. V. 1975 Respiratory and hematological responses of oyster toadfish (Opsanus tau) to suspended solids. Transactions of the American Fisheries Society 104, 775–781.
Newcomb, T. W. & Flagg, T. A. 1983 Some effects of Mt. St. Helens ash on juvenile salmon smolts. U.S. National Marine Fisheries Service Marine Fisheries Review 45 (2), 8–12.
Newcombe, C. P. & Jensen, J. O. 1997 Channel Suspended Sediment and Fisheries: A Concise Guide. Resource Stewardship Branch, Ministry of Environment, Lands and Parks, Victoria, British Columbia.
Noggle, C. C. 1978 Behavioral, Physiological and Lethal Effects of Suspended Sediment on Juvenile Salmonids. Master's thesis, University of Washington, Seattle.
Ott, A. G. 1984 Personal Communication. Alaska Department of Fish and Game, Fairbanks. (Not seen: cited as personal communication in Lloyd 1985.)
Phillips, R. W. 1970 Effects of sediment on the gravel environment and fish production. In: Proceedings of the Symposium on Forest Land Use and Stream Environment. Oregon State University, Continuing Education Publications, Corvallis, Oregon, pp. 64–74.
Redding, J. M. & Schreck, C. B. 1982 Mount St. Helens ash causes sublethal stress responses in steelhead trout. In: Mt St. Helens: Effects on Water Resources. Washington State University, Washington Water Research Center, Report 41, Pullman, pp. 300–307.
Rogers, B. A. 1969 Tolerance Levels of Four Species of Estuarine Fishes to Suspended Mineral Solids. Master's thesis, University of Rhode Island, Kingston.
Servizi, J. A. & Martens, D. W. 1987 Some effects of suspended Fraser River sediments on sockeye salmon (Oncorhynchus nerka). Canadian Special Publication of Fisheries and Aquatic Sciences 96, 254–264.
Servizi, J. A. & Martens, D. W. 1991 Effect of temperature, season, and fish size on acute lethality of suspended sediments to coho salmon (Oncorhynchus kisutch). Canadian Journal of Fisheries and Aquatic Sciences 48, 493–497.
Servizi, J. A. & Martens, D. W. 1992 Sublethal responses of coho salmon (Oncorhynchus kisutch) to suspended sediments. Canadian Journal of Fisheries and Aquatic Sciences 49, 1389–1395.
Shaw, P. A. & Maga, J. A. 1943 The effect of mining silt on yield of fry from salmon spawning beds. California Fish and Game 29, 29–41.
Sherk, J. A., O'Connor, J. M. & Neumann, D. A. 1975 Effects of Suspended and Deposited Sediments on Estuarine Environments (Cronin, L. E., ed.). Estuarine Research 2. Academic Press, New York, pp. 541–558.
Sigler, J. W., Bjornn, T. C. & Everest, F. H. 1984 Effects of chronic turbidity on density and growth of steelheads and coho salmon. Transactions of the American Fisheries Society 113, 142–150.
Simmons, R. C. 1982 Effects of Placer Mining on Arctic Grayling of Interior Alaska. Master's thesis, University of Alaska, Fairbanks.
Slaney, P. A., Halsey, T. G. & Tautz, A. F. 1977 Effects of forest harvesting practices on spawning habitat of stream salmonids in the Centennial Creek watershed, British Columbia. British Columbia Ministry of Recreation and Conservation, Fish and Wildlife Branch, Fisheries Management Report 73, Victoria.
Slanina, K. 1962 Beitrag zur Wirkung mineralischer Suspensionen auf Fische. Wasser und Abwasser 1962, 186–194.
Stober, Q. J. & five coauthors 1981 Effects of Suspended Volcanic Sediment on Coho and Chinook Salmon in the Toutle and Cowlitz Rivers. University of Washington, Fisheries Research Institute, Technical Completion Report FRI-UW-8124, Seattle.
Suchanek, P. M., Marshall, R. P., Hale, S. S. & Schmidt, D. C. 1984a Juvenile Salmon Rearing Suitability Criteria. Alaska Department of Fish and Game, Susitna Hydro Aquatic Studies, 1984 Report 2, Part 3, Anchorage. (Not seen: cited by Lloyd 1985.)
Suchanek, P. M., Sundet, R. L. & Wenger, M. N. 1984b Resident Fish Habitat Studies. Alaska Department of Fish and Game, Susitna Hydro Aquatic Studies, 1984 Report 2, Part 6, Anchorage. (Not seen: cited by Lloyd 1985.)
Swenson, W. A. 1978 Influence of Turbidity on Fish Abundance in Western Lake Superior. U.S. Environmental Protection Agency, National Environmental Research Center, Ecological Research Series EPA 600/3-78-067. (Not seen: cited by Gradall and Swenson 1982.)
Swenson, W. A. & Matson, M. L. 1976 Influence of turbidity on survival, growth, and distribution of larval lake herring (Coregonus artedii). Transactions of the American Fisheries Society 105, 541–545.
Turnpenny, A. W. H. & Williams, R. 1980 Effects of sedimentation on the gravels of an industrial river system. Journal of Fish Biology 17, 681–693.
Wallen, E. I. 1951 The direct effect of turbidity on fishes. Oklahoma Agricultural and Mechanical College, Arts and Sciences Studies, Biological Series 48 (2), 2–27.
Whitman, R. P., Quinn, T. P. & Brannon, E. L. 1982 Influence of suspended volcanic ash on homing behavior of adult chinook salmon. Transactions of the American Fisheries Society 111, 63–69.
Wu, J. 2012 Advances in K-Means Clustering. A Data Mining Thinking. Springer-Verlag Berlin Heidelberg, Heidelberg.
Wu, X., Kumar, V., Quinlan, J. R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G. J., Ng, A., Liu, B., Yu, P. S., Zhou, Z. H., Steinbach, M., Hand, D. J. & Steinberg, D. 2008 Top 10 algorithms in data mining. Knowledge and Information Systems 14 (1), 1–37.