Abstract

The measurement and statistical modeling of water quality data are essential to developing a region-based, stream-wise database that would serve the needs of the EPA. Such a database would also be useful in bio-assessment and in modeling processes related to the riparian vegetation surrounding a water body such as a stream network. With easily measurable data, it becomes easier to develop data-intensive numerical and computer models that explain the distribution of stream water quality and biological integrity and that predict stream water quality patterns. Statistical assessments of nutrients, metallic and non-metallic pollutants in stream water, organic matter, and biological species data are needed to describe pollutant effects accurately, to quantify health hazards, and to support water quality modeling and risk assessment. This study details the results of statistical nonlinear regression and artificial neural network models for the Upper Green River watershed, Kentucky, USA. The neural network models predicted the stream water quality parameters more accurately than the nonlinear regression models in both the training and testing phases. For example, neural network models of pH, conductivity, salinity, total dissolved solids, and dissolved oxygen gave an R2 coefficient close to 1.0 in the testing phase, while the nonlinear regression models gave values below 0.6. For the other parameters as well, neural networks generalized better than nonlinear regression models.

INTRODUCTION

Water quality in streams is influenced by many factors, such as stream morphology, longitudinal and lateral mixing processes, evaporation, stream turbulence, small- to large-scale and localized random patterns in the hydrological and meteorological cycles, Hortonian overland inflows into the stream, point and non-point source (diffuse) pollution in neighboring areas and creeks, and stream ecology. In addition, the two- and three-dimensional structure of the stream network, spatial variability in land use, heterogeneity of the stream-bed surface, and spatially variable sources and sinks increase the complexity of modeling stream water quality, even though the flow and transport processes are governed by a well-known set of partial differential equations. In a mathematical sense, all the stream water quality parameters and nutrients can be treated as scalars that undergo advection, dispersion, sorption, diffusion, and reaction independently. However, the specific behavior of different water quality parameters is strongly influenced by the local effects of stream dynamics, meandering and river morphology. All of the above factors induce ‘stochasticity’ into the description of stream water contaminants. Therefore, statistical modeling of stream water quality is essential for understanding stream flow processes and the distribution of stream water quality, for modeling stream contaminant transport, and for developing a water quality risk assessment framework that uses the available data for ecosystem restoration, land-use management, and the design of agricultural and crop-related policy (Anmala et al. 2015; Zhang et al. 2018).

A list of contaminants and maximum contaminant levels (MCLs) as stipulated by the EPA's drinking water quality standards can be found in EPA (1999, 2002). Landscape attributes were found to be significantly correlated with stream water quality (Ou & Wang 2011), and both natural and anthropogenic factors were important for characterizing it. Stormwater influences on stream water quality and increased inflows of pollutants into constructed wetlands were reported by Maniquiz et al. (2012), who also recommended maximizing the water holding capacity of wetlands for effective pollutant removal from streams and watersheds. Ouyang et al. (2013) estimated surface water quality for a stream in the Yazoo River basin, Mississippi, USA, using recurrence interval analysis and the duration curve approach; their study considered water temperature, dissolved oxygen (DO), specific conductivity, and pH. Canobbio et al. (2013) performed a multivariate analysis to assess the habitat integrity of urban stream ecosystems. They identified a few metrics to assess habitat loss and found their results to be site-specific. Cho (2014) described nonlinear regression for a water quality assessment model using three cases (various combinations of area and percentage imperviousness) and ten scenarios (various combinations of imperviousness, rainfall, slope, and land use). Using principal component analysis (PCA) and nearest neighbor analysis, Li et al. (2015) found that the influence of hydrological components (precipitation and flow) on surface and subsurface nitrate loads was the highest among natural and anthropogenic components. Rachedi & Amarchi (2015) estimated the water quality index (WQI) of a river in Algeria from bacteriological and physicochemical parameters. 
They found the WQI to be poor and attributed this mainly to the lack of control of discharges and the lack of water treatment. Cho (2016) demonstrated the effectiveness of data-mining tools such as a model tree, an artificial neural network (ANN) and a radial basis function over traditional, physical watershed models by considering various cases and scenarios of watershed characteristics such as hydrology, geology, and land use. Xu et al. (2016) emphasized the linkage between the spatial variation of water quality and the land-use composition of sub-watersheds for a catchment in China. Zhang et al. (2017) used fuzzy comprehensive evaluation to assess the water quality of a river on the Tibetan Plateau for six parameters: chloride, chemical oxygen demand, ammonia nitrogen, nitrate, fluoride, and sulfate. More recently, multivariate statistical methods such as cluster analysis, principal component analysis, factor analysis, and multiple linear regression were used to assess water quality and to identify pollution sources for a typical urban river (Huang et al. 2018). Temporal and spatial similarity analysis, together with cluster analysis, was used by Zhang et al. (2018) to identify the important periods and the main pollution sources, improving the understanding of temporal and spatial variations in the river quality of the watershed.

Even though a large number of studies considered causal parameters similar to those of the present study, the applicability of their models was limited to specific sites. A more general modeling framework is developed in the current study to predict water quality parameters from precipitation, temperature and land-use data, which can be applied to any site or river basin. The objectives of the current study are: (i) to perform a statistical assessment of water quality parameters in the form of linear, polynomial and nonlinear regression models; and (ii) to study the applicability and potential of artificial neural network models to predict water quality data from easily measurable parameters such as precipitation, temperature and land-use data. ANN models were chosen because the functional form of the model need not be assumed a priori and model development does not require site-specific details; in other words, the model is site-independent.

STUDY AREA AND DATA

The Upper Green River watershed is located in the lower western part of southern Kentucky. The main channel Green River flows through Hart, Barren, and Edmonson counties and divides into two tributaries at the tip of the watershed. The water sampling locations of the experiments of the current study are located in and near the main channel and its tributaries. The water quality sampling locations, stream flow measurement locations, and precipitation measurement locations are shown in the geographic information system (GIS) map in Figure 1. All the water quality sampling locations are located downstream of the dam (which exists at the mouth of the watershed) and spread over the entire watershed. The schematic of the stream network is also shown in Figure 1 with the main tributaries of the Green River. Using the GIS map layer of land use and GIS analysis, the land-use factors of the upstream sub-watersheds for each sampling location are estimated as follows: 
formula
Figure 1

GIS map consisting of stream network, sampling locations, precipitation, flow measurements and stream network schematic (with permission from ASCE).

Similarly, 
formula
and 
formula

The water quality observations were obtained from water samples collected from the Green River and its tributaries from May 2002 to October 2002. Direct measurement of water samples was considered. The samples were collected monthly at 42 different locations along the stream network of the Green River basin, as part of the Green River Enhancement Program of the Center for Water Resource Studies & Upper Green River Biological Reserve initiative of Western Kentucky University, Bowling Green, Kentucky, USA. Bacteriological and stream water quality samples were collected in sterile, clear glass bottles with well-protected openings. The bottles were sterilized before sample collection, and the standard bacteriological and water quality tests were performed in the laboratory soon after the samples were collected. Some of these details and the sources of the land-use and water quality data are available in Anmala et al. (2015).

RESULTS FROM LINEAR AND POLYNOMIAL REGRESSION ANALYSIS

The individual correlations of all the water quality parameters with two-day cumulative precipitation, temperature, and the urban, forest, and agricultural land-use factors were obtained using correlation analysis. Only fecal coliform with precipitation (coefficient of 0.41) and dissolved oxygen with temperature (coefficient of 0.4) show any significant correlation; the multiple linear regressions of the water quality parameters on all five influencing variables give coefficients of less than 0.44. The details of this multiple linear regression analysis can be found in Anmala et al. (2015). Because the atrazine data set had fewer data points, a polynomial regression analysis was performed against each land-use factor. The complete cubic model of atrazine gives an R2 value of 0.84 (Table 1). The independent correlations of the forest and agricultural land-use factors with atrazine look weak, but the combined curvilinear model gives an R2 value of 0.76. More interestingly, the curvilinear model of forest and agricultural land-use practices does better than the curvilinear models of urban and forest, and of urban and agricultural, land-use practices. This indicates nonlinear interactions between forest and agricultural land-use practices and a nonlinear influence of these interactions on atrazine. The only other model close to the best model combines the cubic term of the urban land-use factor with the first, second, and third powers of the forest and agricultural land-use factors, giving an R2 value of 0.83. The results of the detailed curvilinear models for atrazine are given in Table 1.

Table 1

Curvilinear regression model for atrazine (where U = urban land-use, F = forest land-use, A = agricultural land-use factors)

Model                                     R2
X = [U]                                   0.56
X = [F]                                   0.19
X = [A]                                   0.018
X = [U U^2 U^3 F F^2 F^3 A A^2 A^3]       0.84
X = [A A^2 A^3 F F^2 F^3]                 0.76
X = [U^3 A A^2 A^3 F F^2 F^3]             0.83
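The curvilinear models in Table 1 are ordinary least-squares fits on powers of the land-use factors. The following is a minimal sketch of that idea, not the authors' implementation: the data are synthetic and made up for illustration, and the helper names (solve_linear_system, cubic_design_row, fit_and_score) are hypothetical. It fits a cubic model in one factor and a cubic model in two factors and compares their in-sample R2.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def cubic_design_row(factors):
    """Intercept plus the first, second and third powers of each factor."""
    row = [1.0]
    for f in factors:
        row.extend([f, f ** 2, f ** 3])
    return row


def fit_and_score(factor_rows, y):
    """Least-squares fit via the normal equations; returns the in-sample R2."""
    X = [cubic_design_row(r) for r in factor_rows]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    y_hat = [sum(b * v for b, v in zip(beta, row)) for row in X]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-ins for forest (F) and agricultural (A) land-use factors.
# These values are invented for illustration; they are NOT the study data.
rng = random.Random(0)
forest = [rng.uniform(0.0, 1.0) for _ in range(30)]
agri = [rng.uniform(0.0, 1.0) for _ in range(30)]
response = [2.0 * f ** 3 - 1.5 * a + a ** 2 + rng.gauss(0.0, 0.05)
            for f, a in zip(forest, agri)]

r2_agri_only = fit_and_score([[a] for a in agri], response)
r2_forest_agri = fit_and_score([[f, a] for f, a in zip(forest, agri)], response)
print(f"R2, cubic in A only:  {r2_agri_only:.2f}")
print(f"R2, cubic in F and A: {r2_forest_agri:.2f}")
```

Because the one-factor model is nested in the two-factor model, the in-sample R2 of the larger model cannot be lower. With few data points, as for atrazine, a high in-sample R2 from a model with many polynomial terms should therefore be read with some caution.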

RESULTS FROM NONLINEAR REGRESSION ANALYSIS

A couple of nonlinear models with a functional form similar to the log-logistic model (Equation (1)) are studied. The model parameters are estimated using a nonlinear, gradient-search Newton convergence scheme. Nonlinear regression analysis is first performed for each of the water quality parameters separately to determine the individual influences on water quality. The best results are obtained when the multivariate log-logistic model is used with all five variables, namely precipitation, temperature, and the urban, forest, and agricultural land-use factors. The fecal coliform model gives an R2 value of 0.65, higher than that obtained from linear regression. The stream water pH does not show much dependence on the hydrologic variables and land-use data (R2 = 0.15). Across the sampling locations, pH ranges between 7.15 and 8.60, indicating slightly alkaline water. Conductivity shows an R2 value of 0.33 and salinity an R2 value of 0.37. From the results of other nonlinear models, which include spatial land-use practices alone, it is found that conductivity and salinity are influenced by non-local interactions among the urban, forest and agricultural land-use factors. The nonlinear model of turbidity gives an R2 value of 0.57. The spatial variability of turbidity may imply phytoplankton growth and tributary runoff (EPA 2002). The concentration of total dissolved solids (TDS) gives an R2 value of 0.51. The specific conductivity correlations are of the same order as those of salinity and the concentration of total dissolved solids, which is consistent with the findings of EPA (2002). Dissolved oxygen, one of the most important indicators of water quality, shows a primary dependence on temperature (R2 = 0.58). The model fits the concentration of total suspended solids with an R2 value of 0.31. The five-parameter model of orthophosphates gives an R2 value of 0.52. 
The concentration of orthophosphates shows a strong nonlinear dependence on temperature and precipitation. The total phosphates do not show much dependence on any of the parameters; the best model gives an R2 value of 0.21. The concentration of sulfates is predicted with an R2 value of 0.42. The statistical model that considers temperature, precipitation and the urban land-use factor alone simulates the concentration of ammonium nitrogen with an R2 value of 0.27. Nitrate nitrogen depends primarily on temperature, and the model gives an R2 value of 0.22. The forest and urban land-use factors show a large influence on the concentration of atrazine, with correlations of 0.49 and 0.65, respectively. The combined model of urban, forest and agricultural land-use practices gives an R2 value of 0.89 for atrazine. This model also shows the nonlinear interactions between agricultural land use and the remaining two land-use practices. The results of the nonlinear regression model are presented in Table 2. The nonlinear functional form of the water quality parameter is given by: 
formula
(1)
where are log-logistic model coefficients. The results of the lumped nonlinear model (Equation (2)) are given in Table 3. The table also gives a comparative study of the best competitive model other than the complete lumped model (which includes all the five input variables). 
formula
(2)
Table 2

Nonlinear regression models for water quality parameters (using log-logistic model (Equation (1)) for multi-variables)

Water quality parameter    R2 (NRA)
Fecal coliform             0.65
pH                         0.15
Conductivity               0.28
Salinity                   0.37
Turbidity                  0.36
TDS                        0.51
DO                         0.58
TSS                        0.31
Ortho PO4                  0.52
Total P                    0.21
Sulfates                   0.42
NH4                        0.27
NO3                        0.22
Atrazine                   0.89
Table 3

Nonlinear lumped regression models (using Equation (2)) for water quality parameters (the input variables are shown in brackets next to the correlation coefficients: T = temperature, P = precipitation, U = urban land-use, F = forest land-use, A = agricultural land-use factors)

Water quality parameter    R2 of best model (NRA)    Second best model (NRA)
Fecal coliform             0.64 (T, P, U, F, A)      0.56 (T, P, U)
pH                         0.059 (U, F, A)           0.047 (T, P)
Conductivity               0.33 (T, P, U, F, A)      0.31 (T, P, U)
Salinity                   0.32 (T, P, U, F, A)      0.31 (U, F, A)
Turbidity                  0.57 (T, P, U, F)         0.57 (T, P, U, F, A)
TDS                        0.31 (U, F, A)            0.24 (A)
DO                         0.43 (T)                  0.10 (T, P)
TSS                        0.28 (T, P, U, F)         0.27 (T, P, U, F, A)
Ortho PO4                  0.52 (T, P, U, F, A)      0.44 (T, P, U)
Total P                    0.14 (T, P, U)            0.11 (T, P)
Sulfates                   0.14 (T, P, U)            0.13 (U, F, A)
NH4                        0.26 (T, P, U)            0.12 (T)
NO3                        0.16 (T)                  0.08 (T, P, U)
Atrazine                   0.89 (U, F, A)            0.65 (U)

BEST-FIT DISTRIBUTIONS

All the water quality parameters and the input parameters were fitted with Weibull, normal, lognormal, logistic, log-logistic, gamma, smallest extreme value, largest extreme value and exponential distributions with multiple parameters using SPC, an MS Excel-based program. The best-fit distribution is the one with the minimum AIC (Akaike information criterion) score (Akaike 1973). AIC is an information-theoretic measure of the relative quality or suitability of candidate statistical distributions for a given data set. It serves a purpose similar to that of selecting the best-fit distribution using a χ2 test or a Kolmogorov–Smirnov test. The best-fit distributions of the current study as determined by the AIC value are given in Table 4 for all of the parameters involved in the study. None of the parameters exhibits ‘normal’ or ‘Gaussian’ behavior at the watershed scale for the period of study. The P-P plots of these distributions (listed in Table 4) were close to straight lines, with little skewness and the data falling along the 1:1 line. The best-fit distributions as determined by the AIC method are shown in Figure 2 for a few of the water quality parameters.
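The minimum-AIC selection described above can be sketched as follows. This is an illustrative example under stated assumptions, not the SPC program used in the study: only three candidate distributions with closed-form maximum-likelihood estimates are compared (normal, lognormal, exponential), the sample is a synthetic set of lognormal quantiles rather than measured data, and all function names are hypothetical. AIC is computed as 2k − 2 ln L, where k is the number of fitted parameters and L the maximized likelihood.

```python
import math
from statistics import NormalDist


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def normal_loglik(data):
    """Maximized log-likelihood of a normal distribution (MLE mean, variance)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(data):
    """Maximized log-likelihood of a 2-parameter lognormal distribution."""
    logs = [math.log(x) for x in data]
    return normal_loglik(logs) - sum(logs)


def exponential_loglik(data):
    """Maximized log-likelihood of an exponential distribution (rate = 1/mean)."""
    n = len(data)
    mean = sum(data) / n
    return -n * (math.log(mean) + 1)


# Synthetic positive, right-skewed sample: evenly spaced lognormal quantiles.
# The location (4.0) and scale (0.6) are arbitrary illustration values.
n = 100
std_normal = NormalDist()
sample = [math.exp(4.0 + 0.6 * std_normal.inv_cdf((i + 0.5) / n)) for i in range(n)]

# (maximized log-likelihood, number of fitted parameters) per candidate
candidates = {
    "normal": (normal_loglik(sample), 2),
    "lognormal": (lognormal_loglik(sample), 2),
    "exponential": (exponential_loglik(sample), 1),
}
aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best_fit = min(aic_scores, key=aic_scores.get)

for name, score in sorted(aic_scores.items(), key=lambda item: item[1]):
    print(f"{name:12s} AIC = {score:.1f}")
print("best fit:", best_fit)
```

AIC values are only comparable across candidates fitted to the same data set, which is why Table 4 reports a separate minimum for each parameter and why negative values there are not a sign of error.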

Table 4

Best-fit distributions using minimum AIC value

S. no.  Water quality parameter or influencing parameter  Best-fit distribution        Minimum AIC value obtained
1       2-day cumulative precipitation                    LogNormal – 3 parameter      1,098.1
2       Temperature                                       Weibull – 3 parameter        1,506.9
3       Urban land-use factor                             Gamma – 3 parameter          −536.6
4       Forest land-use factor                            Logistic                     −318.8
5       Agricultural land-use factor                      Weibull                      −451.3
6       Fecal coliform                                    LogNormal                    4,286.9
7       pH                                                Gamma                        36.15
8       Conductivity                                      LogNormal – 3 parameter      2,671.3
9       Salinity                                          Logistic                     −721.2
10      Dissolved oxygen                                  Gamma                        836.1
11      Total dissolved solids (TDS)                      Logistic                     −639.9
12      Total suspended solids (TSS)                      LogLogistic – 3 parameter    899.9
13      Sulfates                                          Gamma                        963.3
14      Ortho PO4                                         Gamma – 3 parameter          −28.75
15      Total P                                           LogNormal – 3 parameter      −102.3
16      NH4                                               LogLogistic – 3 parameter    172.4
17      NO3                                               LogLogistic – 3 parameter    281.5
Table 5

R2 coefficients of SPSS multilayer perceptron network (P = precipitation, T = temperature, U = urban land-use, F = forest land-use, A = agricultural land-use factors)

Water quality parameter  Inputs                                 Output(s)                                           Network architecture  Training  Testing
Fecal coliform           P, T, U, F, A                          Fecal coliform                                      5-2-1                 0.66      0.62
Turbidity                P, T, U, F, A, Fecal coliform          Turbidity                                           6-7-1                 0.92      0.80
pH                       P, T, U, F, A                          Turbidity, pH, Conductivity, Salinity, TDS, DO      5-5-6                 1.0       1.0
Conductivity             P, T, U, F, A                          Turbidity, pH, Conductivity, Salinity, TDS, DO      5-5-6                 0.94      0.96
Salinity                 P, T, U, F, A                          Turbidity, pH, Conductivity, Salinity, TDS, DO      5-5-6                 0.92      0.95
TDS                      P, T, U, F, A                          Turbidity, pH, Conductivity, Salinity, TDS, DO      5-5-6                 0.93      0.96
DO                       P, T, U, F, A                          Turbidity, pH, Conductivity, Salinity, TDS, DO      5-5-6                 0.98      0.99
TSS                      P, T, U, F, A, Ortho-PO4, Total P, FC  TSS                                                 8-6-1                 0.72      0.53
Ortho PO4                P, T, U, F, A                          TSS, Ortho PO4, Total P, Sulfates, NH4N, NO3        5-4-6                 0.67      0.32
Total P                  P, T, U, F, A                          TSS, Ortho PO4, Total P, Sulfates, NH4N, NO3        5-4-6                 0.46      0.60
Sulfates                 P, T, U, F, A                          TSS, Ortho PO4, Total P, Sulfates, NH4N, NO3        5-4-6                 0.64      0.80
NH4                      P, T, U, F, A                          TSS, Ortho PO4, Total P, Sulfates, NH4N, NO3        5-4-6                 0.27      0.52
NO3                      P, T, U, F, A                          TSS, Ortho PO4, Total P, Sulfates, NH4N, NO3        5-4-6                 0.83      0.32
Figure 2

Best-fit distributions of fecal coliform (FC), pH, conductivity, salinity, TDS, and DO.

ANN MODELING RESULTS USING THE SPSS NEURAL NETWORK

The SPSS multilayer perceptron networks were trained and tested for all the water quality parameters. The regression results of predicted vs observed values, expressed as R2 coefficients, are shown in Table 5 and in Figure 3. When a water quality parameter showed a linear correlation with another parameter, the latter was included as an effective input variable along with precipitation, temperature, and the urban, forest and agricultural land-use factors, as can be seen from Table 5. One hidden layer was used for all the networks, and the network architectures are given in Table 5. A small initial learning rate of 0.1 was used, and convergence was achieved within 1,000 epochs. The 6-month data set was divided into 70% training data and 30% testing data. Fecal coliform was predicted with an R2 coefficient of 0.66 in training and 0.62 in testing. The water quality parameters turbidity, pH, conductivity, salinity, TDS and DO are predicted with very high R2 coefficients (more than 0.9) in training and testing, as can be seen from Table 5. These parameters were predicted with a single network of six output nodes. For the remaining parameters (TSS, ortho PO4, total P, sulfates, NH4N, and NO3N), the predictions from a single network are not as good as those of the previous network, and more advanced neural networks or learning algorithms are needed. The advantage of modeling with neural networks is that no functional form has to be assumed, unlike in the polynomial regression of atrazine discussed earlier. Also, the R2 coefficients obtained with neural networks are much higher than those of the nonlinear, linear and polynomial regression analyses discussed in the previous sections. Although the details are not given here, the R2 coefficient, root mean square error (RMSE) and bias were used as performance indicators in selecting the best neural network models (as in Najafzadeh et al. (2018) and Najafzadeh & Zahiri (2015)).
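The 70%/30% data division and the three performance indicators mentioned above (R2, RMSE and bias) can be sketched as follows. This is a hedged illustration rather than the SPSS procedure: R2 is assumed here to be the coefficient of determination, 1 − SS_res/SS_tot (the study may instead report the squared correlation of predicted vs observed values), bias is assumed to be the mean of predicted minus observed, the split is a seeded random shuffle, and the record count of 252 assumes every one of the 42 sites was sampled in each of the 6 months. All function names are hypothetical.

```python
import math
import random


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def bias(observed, predicted):
    """Mean of (predicted - observed); positive values mean over-prediction."""
    n = len(observed)
    return sum(p - o for o, p in zip(observed, predicted)) / n


def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Toy observed/predicted values, invented purely to exercise the functions.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(f"R2 = {r_squared(obs, pred):.3f}, RMSE = {rmse(obs, pred):.3f}, "
      f"bias = {bias(obs, pred):.3f}")

# Hypothetical record indices: 42 sites x 6 monthly samples (assumed complete).
train, test = split_train_test(range(42 * 6))
print(len(train), "training records,", len(test), "testing records")
```

Reporting the three indicators separately for the training and testing subsets, as in Table 5 for R2, helps to show whether a network generalizes or merely fits the training data.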

Figure 3

ANN model performance in testing for fecal coliform, pH, conductivity, salinity, TDS, and DO.

CONCLUSIONS

First, the distributions of the stream water quality parameters, land-use data, precipitation, and temperature data are identified using the Akaike information criterion. Then, the prediction of stream water quality parameters using a more causal, input–output model is attempted with polynomial and nonlinear regression models and basic artificial neural network models. The success of neural network modeling suggests that even better results may be obtained with more sophisticated and advanced algorithms for the highly nonlinear problem of predicting stream water quality from easily measurable data. From the current study, the following conclusions can be drawn:

  1. The fecal coliform distribution shows a strong correlation with precipitation. Nonlinear and polynomial regression models fit the water quality distribution better than linear regression models.

  2. Precipitation, temperature and land-use data are important governing factors to predict water quality parameters, considering the model complexity and scale of the model problem.

  3. All the water quality parameters and land-use data exhibit non-Gaussian behavior over a large watershed-scale area for the period of study.

  4. Artificial neural network models predict better than nonlinear and linear regression models for all of the water quality parameters.

ACKNOWLEDGEMENTS

We would like to express our gratitude to Dr Ouida Meier, Prof. Albert Meier, Dr Scott Grubbs, Dr Stuart Foster, Director of Kentucky Climate Center, Western Kentucky University, Bowling Green, Kentucky, USA, and Mr Tim Rink for helping us with the required data.

REFERENCES

Akaike H. 1973 Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory, Tsahkadsor, Armenia, USSR, September 2–8, 1971 (Petrov B. N. & Csaki F., eds), Akademiai Kiado, Budapest, Hungary, pp. 267–281.

Anmala J., Meier O. W., Meier A. J. & Grubbs S. 2015 GIS and artificial neural network-based water quality model for a stream network in the Upper Green River Basin, Kentucky, USA. Journal of Environmental Engineering 141, 04014082.

Canobbio S., Azzellino A., Cabrini R. & Mezzanotte V. 2013 A multivariate approach to assess habitat integrity in urban streams using benthic macroinvertebrate metrics. Water Science & Technology 67 (12), 2832–2837.

Huang J., Xie R., Yin H. & Zhou Q. 2018 Assessment of water quality and source apportionment in a typical urban river in China using multivariate statistical methods. Water Science & Technology: Water Supply 18 (5), 1841–1851.

Li S., Bhattarai R., Wang L., Cooke R. A., Ma F. & Kalita P. K. 2015 Assessment of water quality in Little Vermillion River watershed using principal component and nearest neighbor analyses. Water Science & Technology: Water Supply 15 (2), 327–338.

Maniquiz M. C., Choi J. Y., Lee S. Y., Kang C. G., Yi G. S. & Kim L. H. 2012 System design and treatment efficiency of a surface flow constructed wetland receiving runoff impacted stream water. Water Science & Technology 65 (3), 525–532.

Najafzadeh M., Ghaemi A. & Emamgholizadeh S. 2018 Prediction of water quality parameters using evolutionary computing-based formulations. International Journal of Environmental Science and Technology. https://doi.org/10.1007/s13762-018-2049-4.

Ouyang Y., Parajuli P. B. & Marion D. A. 2013 Estimation of surface water quality in a Yazoo river tributary using the duration curve and recurrence interval approach. Water Science & Technology: Water Supply 13 (2), 515–523.

Rachedi L. H. & Amarchi H. 2015 Assessment of the water quality of the Seybouse River (north-east Algeria) using the CCME WQI model. Water Science & Technology: Water Supply 15 (4), 793–801.

US EPA (Environmental Protection Agency) 1999 National Recommended Water Quality Criteria – Correction. EPA/822/Z-99/001, Office of Water, Washington, DC, USA.

US EPA (Environmental Protection Agency) 2002 Environmental Curricula Handbook: Tools in Your Schools. EPA/625/R-02/009, Office of Environmental Information, Washington, DC, USA.

Xu E., Zhang H., Dong G., Kang L. & Zhen X. 2016 Spatial variation of water quality in upper catchment of Miyun Reservoir. Water Science & Technology: Water Supply 16 (3), 817–827.

Zhang Q., Wang S., Yousaf M., Wang S., Nan Z., Ma J., Wang D. & Zang F. 2017 Hydrochemical characteristics and water quality assessment of surface water in the northeast Tibetan Plateau of China. Water Science & Technology: Water Supply 18 (5), 1757–1768.