## Abstract

The measurement and statistical modeling of water quality data are essential to developing a region-based, stream-wise database that would serve the needs of the EPA. Such a database would also be useful in bio-assessment and in modeling processes related to the riparian vegetation surrounding a water body such as a stream network. Easily measurable data make it easier to build data-intensive numerical and computer models that explain the distribution of stream water quality and biological integrity and predict stream water quality patterns. Statistical assessments of nutrients, metallic and non-metallic stream water pollutants, organic matter, and biological species data are needed to describe pollutant effects accurately, to quantify health hazards, and to model water quality and its risk assessment. This study details the results of statistical nonlinear regression and artificial neural network models for the Upper Green River watershed, Kentucky, USA. The neural network models predicted the stream water quality parameters more accurately than the nonlinear regression models in both the training and testing phases. For example, neural network models of pH, conductivity, salinity, total dissolved solids, and dissolved oxygen gave an *R*^{2} coefficient close to 1.0 in the testing phase, while the nonlinear regression models gave values below 0.6. For the other parameters too, neural networks generalized better than nonlinear regression models.

## INTRODUCTION

The water quality in streams is fundamentally influenced by many factors such as stream morphology, longitudinal and lateral-mixing processes, evaporation processes, stream turbulence, small- to large-scale and localized random hydrological and meteorological cycle patterns, Hortonian-overland inflows into the stream, point and non-point source pollutants (diffuse source pollution) in the neighboring areas and creeks, and stream ecology. Along with the above factors, the two- and three-dimensional structure of stream network, spatial variability in land use, heterogeneity of stream-bed surface, spatially variable sources and sinks increase the complexity of modeling of stream-water quality even though the flow and transport processes are strictly governed by a well-known set of partial differential equations. In a mathematical sense, all the stream water quality parameters and nutrients can be treated as scalars that undergo advection, dispersion, sorption, diffusion, and reaction independently. However, the specific descriptions of different water quality parameters are strongly influenced by local effects of stream dynamics, meanderings and river morphology. All of the above factors induce ‘stochasticity’ into stream water quality contaminant description. Therefore, the statistical modeling of stream water quality is essential to understand stream flow processes, stream water quality distribution, stream contaminant transport modeling and in coming up with a water quality risk assessment framework that would use the available data for ecosystem restoration, land-use management, and design of agricultural and crop-related policy purposes (Anmala *et al.* 2015, Zhang *et al.* 2018).

A list of contaminants and maximum concentration levels (MCLs) as stipulated by the EPA's guideline of drinking water quality standards can be found in EPA (1999, 2002). Landscape attributes were significantly correlated with stream water quality (Ou & Wang 2011), and both natural and anthropogenic factors were found to be important for characterizing the stream water quality. Stormwater influences on stream water quality and an increased inflow of pollutants into constructed wetlands were reported (Maniquiz *et al.* 2012). It was also recommended to maximize the water holding capacity of wetlands for effective pollutant removal from streams and watersheds. Ouyang *et al.* (2013) estimated surface water quality for a stream in the Yazoo River basin, Mississippi, USA, using the recurrence interval analysis and duration curve approach. This study considered water temperature, dissolved oxygen (DO), specific conductivity, and pH in the analysis. Multivariate analysis was performed by Canobbio *et al.* (2013) to assess the habitat integrity of urban stream ecosystems. They identified a few metrics to assess habitat loss and found their results to be site-specific. Attempts at nonlinear regression were described using three cases (various combinations of the area and percentage imperviousness) and ten scenarios (various combinations of imperviousness, rainfall, slope, land usage) for a water quality assessment model (Cho 2014). The influence of hydrological components (precipitation and flow) on surface and subsurface nitrate loads was found to be the highest among natural and anthropogenic components (Li *et al.* 2015) using principal component analysis (PCA) and nearest neighborhood analysis. Water quality index (WQI) was estimated for a river in Algeria by Rachedi & Amarchi (2015) to determine its water quality from bacteriological and physicochemical parameters. They found the WQI to be poor and attributed this mainly to the lack of control of discharges and lack of water treatment. Cho (2016) presented the effectiveness of data-mining tools such as a model tree, artificial neural network (ANN) and radial basis function over traditional, physical watershed models by considering various cases and scenarios of watershed characteristics such as hydrology, geology, and land use. The linkage between the spatial variation of water quality and the land-use composition of sub-watersheds was emphasized (Xu *et al.* 2016) for a catchment in China. Fuzzy comprehensive evaluation was used to assess water quality (Zhang *et al.* 2017) for six parameters: chloride, chemical oxygen demand, ammonia nitrogen, nitrate, fluoride, and sulfate for a river on the Tibetan plateau. More recently, multivariate statistical methods such as cluster analysis, principal component analysis, factor analysis, and multiple linear regression were used to assess water quality and to identify the sources for a typical urban river (Huang *et al.* 2018). Temporal and spatial similarity analysis along with cluster analysis were used (Zhang *et al.* 2018) to identify the important periods and the main pollution sources for an improved understanding of the temporal and spatial variations of the river quality of the watershed.

Even though a large number of studies considered causal parameters as in the present study, they were limited to being site-specific in the applicability of their modeling attempts. A more general modeling framework is developed in the current study to predict water quality parameters from the precipitation, temperature and land-use data that can be applied to any site or river basin. The objectives of the current study are: (i) to perform statistical assessment of water quality parameters in the form of linear, polynomial and nonlinear regression models; and (ii) to study the applicability and potential of artificial neural network models to predict water quality data from easily measurable parameters such as precipitation, temperature and land-use data. ANN models have been chosen as the model functional form need not be assumed a priori and model development does not involve the inclusion of site-specific details, or in other words, the model is site-independent.

## STUDY AREA AND DATA

The water quality observations were obtained from water samples collected from the Green River and its tributaries from May 2002 to October 2002. Direct measurements of the water samples were used. The samples were collected monthly at 42 different locations along the stream network of the Green River basin, as a part of the Green River Enhancement Program of the Center for Water Resource Studies & Upper Green River Biological Reserve initiative of Western Kentucky University, Bowling Green, Kentucky, USA. Bacteriological and stream water quality samples were collected in sterile, clear glass bottles with well-protected openings. The bottles were sterilized before data collection, and the standard bacteriological and water quality testing was performed in the laboratory soon after the samples were collected. Some of these details and the sources of land-use data and water quality data are available in Anmala *et al.* (2015).

## RESULTS FROM LINEAR AND POLYNOMIAL REGRESSION ANALYSIS

The individual correlations of all the water quality parameters with two-day cumulative precipitation, temperature, and urban, forest, and agricultural land-use factors were obtained using correlation analysis. Only two pairs show a notable correlation: fecal coliform with precipitation (coefficient of 0.41) and dissolved oxygen with temperature (coefficient of 0.4). For the remaining cases, the multiple linear regressions of the water quality parameters on all five influencing variables give coefficients below 0.44. The details of this multiple linear regression analysis can be found in Anmala *et al.* (2015). Since the atrazine data consisted of fewer data points, a polynomial regression analysis was performed against each land use. The complete cubic model of atrazine gives a coefficient of determination (*R*^{2}) of 0.84 (Table 1). The individual correlations of the forest and agricultural land-use factors with atrazine look weak (*R*^{2} of 0.19 and 0.018), but the combined curvilinear model gives an *R*^{2} value of 0.76. More interestingly, the curvilinear model of forest and agricultural land use does better than the curvilinear models of urban and forest land use and of urban and agricultural land use. This clearly shows the nonlinear interactions between forest and agricultural land-use practices and the nonlinear influence of these interactions on atrazine. The only other model close to the best one combines the cubic term of the urban land-use factor with the first, second, and third powers of the forest and agricultural land-use factors, giving an *R*^{2} value of 0.83. The results of the detailed curvilinear models for atrazine are given in Table 1.

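Table 1. Curvilinear regression models for atrazine (U, F, A: urban, forest, and agricultural land-use factors)

| Model | *R*^{2} |
|---|---|
| X = [U] | 0.56 |
| X = [F] | 0.19 |
| X = [A] | 0.018 |
| X = [U U^{2} U^{3} F F^{2} F^{3} A A^{2} A^{3}] | 0.84 |
| X = [A A^{2} A^{3} F F^{2} F^{3}] | 0.76 |
| X = [U^{3} A A^{2} A^{3} F F^{2} F^{3}] | 0.83 |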
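The model structures in Table 1 can be read as design matrices for ordinary least squares. The sketch below is only an illustration of that reading, not the procedure used in the study: the intercept column, the normal-equation solver, the function names, and any sample values passed to it are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

U, F, A = 0, 1, 2          # column positions of the urban, forest, agricultural factors
Term = Tuple[int, int]     # (column index, power)

# Two of the Table 1 structures written as term lists (hypothetical encoding).
FULL_CUBIC: List[Term] = [(v, p) for v in (U, F, A) for p in (1, 2, 3)]
AGRI_FOREST: List[Term] = [(v, p) for v in (A, F) for p in (1, 2, 3)]


def design_matrix(samples: Sequence[Sequence[float]],
                  terms: Sequence[Term]) -> List[List[float]]:
    """Build one row [1, x_i**p, ...] per sample; the intercept is an assumption."""
    return [[1.0] + [s[i] ** p for i, p in terms] for s in samples]


def least_squares(X: List[List[float]], y: Sequence[float]) -> List[float]:
    """Solve the normal equations (X^T X) b = X^T y by Gaussian elimination.

    Needs more samples than columns and a non-singular X^T X.
    """
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    M = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (M[i][n] - tail) / M[i][i]
    return beta


def r_squared(X: List[List[float]], y: Sequence[float],
              beta: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    pred = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

Comparing `r_squared` across term lists such as `FULL_CUBIC` and `AGRI_FOREST` mirrors the row-by-row comparison in Table 1. With few data points, a model with many terms will fit closely almost by construction, so the values are best read as descriptive.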

## RESULTS FROM NONLINEAR REGRESSION ANALYSIS

Fecal coliform shows an *R*^{2} correlation of 0.65, larger than that obtained from the linear regression analysis. The stream water acidity or pH does not show much dependence on the hydrologic variables and land-use data (*R*^{2} = 0.15). The pH ranges spatially between 7.15 and 8.60, which implies alkalinity. Conductivity shows an *R*^{2} value of 0.33 and salinity an *R*^{2} value of 0.37. The results of other nonlinear models, which include spatial land-use practices alone, indicate that conductivity and salinity are affected by non-local influences between urban, forest and agricultural land-use factors. The nonlinear model of turbidity gives an *R*^{2} value of 0.57. The spatial variability of turbidity may imply phytoplankton growth and tributary runoff (EPA 2002). The concentration of total dissolved solids (TDS) gives an *R*^{2} value of 0.51. The specific conductivity correlations are found to be of the same order as those of salinity and the concentration of total dissolved solids, which is consistent with the findings of EPA (2002). Dissolved oxygen (a most important feature of water quality) depends primarily on temperature alone, with *R*^{2} = 0.58. The model fits the concentration of total suspended solids with an *R*^{2} value of 0.31. The five-parameter model of orthophosphates gives an *R*^{2} value of 0.52. The concentration of orthophosphates shows strong nonlinear dependence on temperature and precipitation. The total phosphates do not show much dependence on any of the parameters; the best model gives an *R*^{2} value of 0.21. The concentration of sulfates is predicted with an *R*^{2} value of 0.42. The statistical model that considers temperature, precipitation and the urban land-use factor alone simulates the concentration of ammonium nitrogen (NH_{4}N) with an *R*^{2} value of 0.27. Nitrate nitrogen (NO_{3}N) depends primarily on temperature; the model gives an *R*^{2} value of 0.22. Forest and urban land-use factors show a large influence on the concentration of atrazine, with correlations of 0.49 and 0.65 respectively. The combined model of urban, forest and agricultural land-use practices gives an *R*^{2} value of 0.89 for atrazine. This model also shows the nonlinear interactions between agricultural and the remaining two land-use practices. The results of the nonlinear regression model are presented in Table 2. The nonlinear functional form of the water quality parameter is a log-logistic model with fitted coefficients. The results of the lumped nonlinear model (Equation (2)) are given in Table 3. The table also compares the best model with the best competing model other than the complete lumped model (which includes all five input variables).

Table 2. *R*^{2} values from the nonlinear regression analysis (NRA)

| Water quality parameter | *R*^{2} (NRA) |
|---|---|
| Fecal coliform | 0.65 |
| pH | 0.15 |
| Conductivity | 0.28 |
| Salinity | 0.37 |
| Turbidity | 0.36 |
| TDS | 0.51 |
| DO | 0.58 |
| TSS | 0.31 |
| Ortho PO_{4} | 0.52 |
| Total P | 0.21 |
| Sulfates | 0.42 |
| NH_{4}N | 0.27 |
| NO_{3}N | 0.22 |
| Atrazine | 0.89 |

Table 3. Best and second-best lumped nonlinear regression (NRA) models, with the inputs of each model in parentheses (T: temperature; P: precipitation; U, F, A: urban, forest, and agricultural land-use factors)

| Water quality parameter | *R*^{2} of best model (NRA) | *R*^{2} of second-best model (NRA) |
|---|---|---|
| Fecal coliform | 0.64 (T, P, U, F, A) | 0.56 (T, P, U) |
| pH | 0.059 (U, F, A) | 0.047 (T, P) |
| Conductivity | 0.33 (T, P, U, F, A) | 0.31 (T, P, U) |
| Salinity | 0.32 (T, P, U, F, A) | 0.31 (U, F, A) |
| Turbidity | 0.57 (T, P, U, F) | 0.57 (T, P, U, F, A) |
| TDS | 0.31 (U, F, A) | 0.24 (A) |
| DO | 0.43 (T) | 0.10 (T, P) |
| TSS | 0.28 (T, P, U, F) | 0.27 (T, P, U, F, A) |
| Ortho PO_{4} | 0.52 (T, P, U, F, A) | 0.44 (T, P, U) |
| Total P | 0.14 (T, P, U) | 0.11 (T, P) |
| Sulfates | 0.14 (T, P, U) | 0.13 (U, F, A) |
| NH_{4}N | 0.26 (T, P, U) | 0.12 (T) |
| NO_{3}N | 0.16 (T) | 0.08 (T, P, U) |
| Atrazine | 0.89 (U, F, A) | 0.65 (U) |

## BEST-FIT DISTRIBUTIONS

All the water quality parameters and the input parameters were fitted with Weibull, normal, lognormal, logistic, log-logistic, gamma, smallest extreme value, largest extreme value and exponential distributions with multiple parameters using SPC, an MS Excel-based program. The best-fit distribution is the one with the minimum AIC (Akaike information criterion) score (Akaike 1973). AIC is an information-theoretic measure of the relative quality or suitability of candidate statistical distributions for a given data set. It serves the same purpose as choosing the best-fit distribution with a *χ*^{2} test or a Kolmogorov–Smirnov test. The best-fit distributions of the current study, as determined by the AIC value, are given in Table 4 for all of the parameters involved in the study. None of the parameters exhibits ‘normal’ or ‘Gaussian’ behavior at watershed scale for the period of study. The P-P plots of these distributions (in Table 4) were found to be straight with little skewness and data falling on the 1:1 line. The best-fit distributions as determined by the AIC method are shown in Figure 2 for a few of the water quality parameters.
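The minimum-AIC selection described above can be sketched in a few lines, using the standard definition AIC = 2k - 2 ln L, where k is the number of fitted parameters and L the maximized likelihood. This is a simplified illustration and not the SPC program used in the study: it assumes only three candidate distributions (normal, two-parameter lognormal, exponential) because their maximum-likelihood estimates have closed forms, and all function names are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def loglik_normal(x: Sequence[float]) -> float:
    """Maximized log-likelihood of a normal distribution (MLE mean, variance)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def loglik_lognormal(x: Sequence[float]) -> float:
    """Maximized log-likelihood of a 2-parameter lognormal (requires x > 0)."""
    logs = [math.log(v) for v in x]
    # Normal likelihood of the logs plus the Jacobian term -sum(ln x).
    return loglik_normal(logs) - sum(logs)


def loglik_exponential(x: Sequence[float]) -> float:
    """Maximized log-likelihood of an exponential (rate = 1 / mean, x > 0)."""
    n = len(x)
    rate = n / sum(x)
    return n * math.log(rate) - rate * sum(x)


def best_fit(x: Sequence[float]) -> Tuple[str, Dict[str, float]]:
    """Return the candidate with the minimum AIC and all AIC scores."""
    scores = {
        "normal": aic(loglik_normal(x), 2),
        "lognormal": aic(loglik_lognormal(x), 2),
        "exponential": aic(loglik_exponential(x), 1),
    }
    return min(scores, key=scores.get), scores
```

Only differences in AIC between candidates fitted to the same data are meaningful, which is why the scores can be negative for some parameters and in the thousands for others. Three-parameter families would generally need a numerical optimizer and are left out of this sketch.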

Table 4. Best-fit distributions as determined by the minimum AIC value

| S. no. | Water quality parameter or influencing parameter | Best-fit distribution | Minimum AIC value obtained |
|---|---|---|---|
| 1 | 2-day cumulative precipitation | LogNormal – 3 parameter | 1,098.1 |
| 2 | Temperature | Weibull – 3 parameter | 1,506.9 |
| 3 | Urban land-use factor | Gamma – 3 parameter | −536.6 |
| 4 | Forest land-use factor | Logistic | −318.8 |
| 5 | Agricultural land-use factor | Weibull | −451.3 |
| 6 | Fecal coliform | LogNormal | 4,286.9 |
| 7 | pH | Gamma | 36.15 |
| 8 | Conductivity | LogNormal – 3 parameter | 2,671.3 |
| 9 | Salinity | Logistic | −721.2 |
| 10 | Dissolved oxygen | Gamma | 836.1 |
| 11 | Total dissolved solids (TDS) | Logistic | −639.9 |
| 12 | Total suspended solids (TSS) | LogLogistic – 3 parameter | 899.9 |
| 13 | Sulfates | Gamma | 963.3 |
| 14 | Ortho PO_{4} | Gamma – 3 parameter | −28.75 |
| 15 | Total P | LogNormal – 3 parameter | −102.3 |
| 16 | NH_{4}N | LogLogistic – 3 parameter | 172.4 |
| 17 | NO_{3}N | LogLogistic – 3 parameter | 281.5 |

Table 5. Neural network inputs, outputs, architectures, and *R*^{2} values in training and testing

| Water quality parameter | Inputs | Output(s) | Network architecture | *R*^{2} (training) | *R*^{2} (testing) |
|---|---|---|---|---|---|
| Fecal coliform | P, T, U, F, A | Fecal coliform | 5-2-1 | 0.66 | 0.62 |
| Turbidity | P, T, U, F, A, Fecal coliform | Turbidity | 6-7-1 | 0.92 | 0.80 |
| pH | P, T, U, F, A | Turbidity, pH, Conductivity, Salinity, TDS, DO | 5-5-6 | 1.0 | 1.0 |
| Conductivity | P, T, U, F, A | Turbidity, pH, Conductivity, Salinity, TDS, DO | 5-5-6 | 0.94 | 0.96 |
| Salinity | P, T, U, F, A | Turbidity, pH, Conductivity, Salinity, TDS, DO | 5-5-6 | 0.92 | 0.95 |
| TDS | P, T, U, F, A | Turbidity, pH, Conductivity, Salinity, TDS, DO | 5-5-6 | 0.93 | 0.96 |
| DO | P, T, U, F, A | Turbidity, pH, Conductivity, Salinity, TDS, DO | 5-5-6 | 0.98 | 0.99 |
| TSS | P, T, U, F, A, Ortho-PO_{4}, Total P, FC | TSS | 8-6-1 | 0.72 | 0.53 |
| Ortho PO_{4} | P, T, U, F, A | TSS, Ortho PO_{4}, Total P, Sulfates, NH_{4}N, NO_{3}N | 5-4-6 | 0.67 | 0.32 |
| Total P | P, T, U, F, A | TSS, Ortho PO_{4}, Total P, Sulfates, NH_{4}N, NO_{3}N | 5-4-6 | 0.46 | 0.60 |
| Sulfates | P, T, U, F, A | TSS, Ortho PO_{4}, Total P, Sulfates, NH_{4}N, NO_{3}N | 5-4-6 | 0.64 | 0.80 |
| NH_{4}N | P, T, U, F, A | TSS, Ortho PO_{4}, Total P, Sulfates, NH_{4}N, NO_{3}N | 5-4-6 | 0.27 | 0.52 |
| NO_{3}N | P, T, U, F, A | TSS, Ortho PO_{4}, Total P, Sulfates, NH_{4}N, NO_{3}N | 5-4-6 | 0.83 | 0.32 |

## ANN MODELING RESULTS USING THE SPSS NEURAL NETWORK

The SPSS multilayer perceptron networks were trained and tested for all the water quality parameters with only one hidden layer. The regression results of predicted vs observed values, expressed as the *R*^{2} coefficient, are shown in Table 5 and in Figure 3. When a water quality parameter showed a linear correlation with another parameter, that parameter was included as an effective input variable along with precipitation, temperature, and urban, forest and agricultural land-use factors, as can be seen from Table 5. The network architectures are also given in Table 5. A small initial learning rate of 0.1 was used and convergence was achieved within 1,000 epochs. The 6-month data were divided into 70% training data and 30% testing data. Fecal coliform was predicted with an *R*^{2} coefficient of 0.66 in training and 0.62 in testing. The water quality parameters turbidity, pH, conductivity, salinity, TDS and DO are predicted with very high *R*^{2} coefficients (more than 0.9) in training and testing, which can be seen from Table 5. These parameters were predicted with a single network of six output nodes. For the remaining parameters, TSS, ortho PO_{4}, total P, sulfates, NH_{4}N, and NO_{3}N, the *R*^{2} values (with a single network) are not as high as those of the previous network, and these parameters need more advanced neural networks or learning algorithms. A clear advantage of neural network modeling is that no functional form has to be assumed, unlike the polynomial regression of atrazine discussed earlier. Also, the *R*^{2} coefficients obtained with neural networks are much higher than those of the nonlinear, linear and polynomial regression analyses discussed in the previous sections. Although the details are not given here, the *R*^{2} coefficient, root mean square error (RMSE) and bias were used as performance indicators in selecting the best neural network models (as in Najafzadeh *et al.* (2018) and Najafzadeh & Zahiri (2015)).
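The 70%/30% split and the three performance indicators named above can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure: *R*^{2} is taken as the squared Pearson correlation between observed and predicted values, bias as the mean of predicted minus observed, and the random split with a fixed seed is a choice made for the example. All function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def train_test_split(rows: Sequence, train_fraction: float = 0.7,
                     seed: int = 0) -> Tuple[List, List]:
    """Shuffle the rows reproducibly and cut them into training and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation of predicted vs observed (assumed definition)."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy * sxy / (sxx * syy)


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def bias(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of predicted minus observed (sign convention is an assumption)."""
    n = len(observed)
    return sum(p - o for o, p in zip(observed, predicted)) / n
```

Reporting RMSE and bias next to *R*^{2} is useful because a squared correlation ignores systematic offsets: predictions that are uniformly too high still give an *R*^{2} of 1, while the bias and RMSE expose the offset.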

## CONCLUSIONS

Firstly, the distributions of stream water quality parameters, land-use data, precipitation, and temperature data are found using the Akaike information criterion. Then, the prediction of stream water quality parameters using a more causal, input–output model is attempted with polynomial and nonlinear regression models and basic artificial neural network models. The success of neural network modeling promises even better results with more sophisticated and advanced algorithms for the highly nonlinear problem of predicting stream water quality from easily measurable data. From the current study, the following conclusions can be drawn:

1. The fecal coliform distribution shows a strong correlation with precipitation. Nonlinear and polynomial regression models fit the water quality distribution better than linear regression models.

2. Precipitation, temperature and land-use data are important governing factors to predict water quality parameters, considering the model complexity and scale of the model problem.

3. All the water quality parameters and land-use data exhibit non-Gaussian behavior over a large watershed-scale area for the period of study.

4. Artificial neural network models predict better than nonlinear and linear regression models for all of the water quality parameters.

## ACKNOWLEDGEMENTS

We would like to express our gratitude to Dr Ouida Meier, Prof. Albert Meier, Dr Scott Grubbs, Dr Stuart Foster, Director of Kentucky Climate Center, Western Kentucky University, Bowling Green, Kentucky, USA, and Mr Tim Rink for helping us with the required data.