Surface waterbodies, on which the growing population of the Kashmir Valley relies in a variety of ways, are increasingly degraded by anthropogenic pollution from rapid economic development. This research aims to assess the quality of the surface waterbodies in the north-eastern region of the Kashmir Valley. Water samples from 11 sampling stations were analyzed for 14 physicochemical parameters using standard analytical procedures. Comparison of the results with the standard permissible levels showed that the water quality of rivers and lakes in the north-east Himalayan region has steadily declined. Furthermore, multivariate statistical techniques were used to identify the key variables that influence seasonal and sectional variations in water quality. Analysis of variance (ANOVA) revealed substantial spatio-temporal variability in the water quality parameters. According to the principal component analysis (PCA) results, four principal components, which together accounted for 79.23% of the total variance, could be used to evaluate all data. Chemical, organic, and conventional pollutants were found to be significant latent factors influencing the water quality of rivers in the study region. The results indicate that PCA and ANOVA may be used as vital tools to identify crucial surface water quality indices and the most contaminated river sections.

  • Water quality parameters of key rivers of Kashmir Valley exceed the standard permissible limits.

  • ANOVA shows significant spatio-temporal variability in the water quality parameters.

  • PCA reveals that the River Sindh has the poorest water quality and the upper stretches of River Jhelum are least affected.

  • Increase in side-stream pollution is a sign of increased anthropogenic pressure in the watersheds.

Graphical Abstract


Surface water quality is vital to human life, and its deterioration is a major global problem because surface waters are highly sensitive to disturbance. Since prehistoric times, riverine systems have been crucial to the growth of human civilizations because they provide a necessary and convenient source of fresh water for household, agricultural, and industrial uses. Unplanned industrialization and urbanization undertaken for socioeconomic advancement now threaten life on earth by polluting water (Avtar et al. 2019). Both anthropogenic and natural sources of pollution affect river water quality (Singh et al. 2009). Changes in precipitation, surface runoff, erosion, and weathering are examples of natural processes that affect water quality (Khatri & Tyagi 2015). Human activities, including the discharge of sewage, municipal waste, and effluents, as well as irrigation, also affect it (Hanjra et al. 2012).

Access to clean water is a basic necessity for human survival and well-being. Fresh water is already scarce in many regions of the world because of contamination from industrial activity, and it will become even scarcer as a result of population growth, urbanization, and climate change (Rosegrant et al. 2009; Gude 2017). The decline in water quality caused by anthropogenic activity is of serious concern and has created the need to examine and evaluate surface water quality, particularly with regard to its beneficial uses, such as drinking, industry, and agriculture.

Nature has long provided the large population of the Kashmir Valley with abundant water supplies, on which it depends in countless ways. Numerous direct and indirect activities connected to these water bodies provide people with a living. Despite the enormous benefits they provide, the rivers and lakes in Kashmir are treated as dumping sites for wastewater (Qayoom et al. 2022), which has led to significant degradation of water quality in the recent past. Furthermore, the biological and physicochemical characteristics of the valley's water resources have been significantly affected by extensive changes in land use, unplanned urbanization, forest cover degradation and deforestation, high tourism pressure, landform degradation, and the uncontrolled use of fertilizers and pesticides (Mir & Gani 2019). A number of recent studies have investigated the water quality of the River Jhelum and its tributaries (Qadir et al. 2008; Mehmood et al. 2017; Bhat et al. 2021). For instance, Showqi et al. (2014) observed that changes in land use/land cover (LULC), hydrometeorological conditions, and anthropogenic pressure had caused the River Jhelum to deteriorate. According to Ganie et al. (2021), the deterioration of Wular Lake is a result of changes in LULC and the hydrological changes that follow, such as decreased runoff, increased erosion, and sedimentation. However, it is worth mentioning that the water quality of major tributaries such as the Rivers Sindh and Lidder has a significant effect on the water quality of the River Jhelum and thereby on the hydro-geochemistry of Wular Lake (Khanday et al. 2021).

Effective pollution prevention requires knowledge of the degraded sections of a river and the actual sources of pollution along its various segments. Sedimentation is one of the biggest dangers faced by river ecosystems worldwide. The total dissolved solids (TDS) and total suspended solids (TSS) contents in river water have considerably increased. An analysis of the 145 largest rivers in the world with reliable long-term sediment records revealed that roughly 50% of them showed a statistically significant trend of drastically reduced flow because of sedimentation (Butler et al. 2020). In recent years, it has become clear that sediment transport in water bodies is one of the main causes of poor water quality. Researchers have used different algorithms and models to estimate the dissolved sediment load (DSL) and suspended sediment load (SSL) carried by rivers (Sun et al. 2021; Zhao et al. 2021). Many direct and indirect factors influence the variations in sediment movement, with evapotranspiration being one of the potential factors. Accurate daily evapotranspiration calculations can give scientific direction for the formulation of sediment management plans and the rational use of water resources (He et al. 2022). Furthermore, the estimate of the longitudinal dispersion coefficient is a critical measure for evaluating and regulating pollution spread in rivers (Goliatt et al. 2021). To obtain relevant results from water quality data, multivariate statistical approaches such as factor analysis (FA) and analysis of variance (ANOVA) have been widely used (Najar & Khan 2012; Aydin et al. 2021). These approaches have also been frequently used to characterize and assess water quality, to analyze spatio-temporal fluctuations brought on by anthropogenic and natural processes, and to examine heterogeneity between river sections (Machiwal et al. 2018). Multivariate statistical methods thus enable hidden information about potential environmental influences on water quality to be extracted from the dataset.

FA attempts to explain the relationships between observations in terms of underlying factors that are not directly observable. The first stage of FA is to derive the correlation matrix of the parameters, which quantifies how much variability each pair of water quality parameters shares. The eigenvalues and factor loadings of the correlation matrix are then calculated. The eigenvectors associated with the largest eigenvalues identify the sets of variables that are highly correlated with one another, whereas components with small eigenvalues add little explanatory power. Most of the parameter variability can be explained by the first few components alone (Tipping & Bishop 1999). Once the correlation matrix and eigenvalues have been obtained, factor loadings are used to gauge the degree of correlation between the variables and the factors (Garrido et al. 2013). A two-way ANOVA is used to determine how two categorical factors affect the mean of a quantitative variable, that is, whether there is a statistically significant difference between the means of the groups defined by the levels of the two factors.
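The eigenvalue and loading steps described above can be illustrated with a minimal sketch. It assumes a three-variable subset (hardness, calcium, and ammoniacal nitrogen) whose correlations are taken from the correlation matrix reported in Table 6, and it uses plain power iteration to obtain only the dominant component. The function name `leading_eigenpair` is hypothetical, and this is not the SPSS procedure used in the study.

```python
import math

# Correlations among hardness, calcium and NH3-N, as reported in Table 6.
R = [
    [1.000, 0.838, 0.391],
    [0.838, 1.000, 0.513],
    [0.391, 0.513, 1.000],
]

def leading_eigenpair(matrix, iterations=500):
    """Power iteration: return (eigenvalue, unit eigenvector) of the dominant component."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue that belongs to v.
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v

eigenvalue, eigenvector = leading_eigenpair(R)

# Loading = eigenvector element scaled by sqrt(eigenvalue); it is the
# correlation between each variable and the first component.
loadings = [math.sqrt(eigenvalue) * x for x in eigenvector]

# Share of the total variance (trace of R = number of variables) explained.
explained = eigenvalue / len(R)

print(round(eigenvalue, 3), [round(l, 3) for l in loadings], round(explained, 3))
```

For a full analysis with all 14 parameters, every eigenpair would be needed, which is usually obtained from a linear algebra library rather than from power iteration.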

Accurate information on water quality is crucial for effective and efficient water management, since it helps to identify the causes of degradation as well as the state of rivers and the landscapes that surround them. Using this information, we can create restoration strategies, calculate the ecological risks connected to proposed land use in a watershed, or decide among existing development choices to reduce river degradation. In this study, water samples from different sites of the River Jhelum and its major tributaries in the north-eastern Himalayas were analyzed as a preliminary survey of water contamination, with the following goals: (1) to assess the water quality of the major tributaries in the north-eastern Himalayas of Kashmir and their impact on the River Jhelum; (2) to detect the regional and temporal changes in water quality and potential sources of pollution using descriptive statistics (DS) and principal component analysis (PCA); and (3) to assess the variation of water quality measures at different stations using ANOVA. The findings of this study could help decision makers to manage water quality, stop pollution sources, and safeguard the water resources of the Kashmir Valley.

Study area and sampling points

The River Jhelum is the primary river of the Kashmir Division and flows along its entire length of 140 km. The majority of the villages and towns are situated along its banks. It originates from a spring known as ‘Verinag’. The Lidder River, which is the largest of all the tributaries and the source of the River Jhelum's headwaters, joins it on the right slope of the mountain at a distance of 2 km. The second-largest tributary, Nallah Sindh, joins it at Shadipora on the right bank, along with a few smaller input sources. The river enters Wular Lake near Banyari; the lake also receives the Arin and Madhumati streams. The river finally leaves the lake at its southwest corner and flows westward across the alluvial plain for 21 km until it reaches the Baramulla bridge.

To achieve the objectives of this investigation, 11 sampling points along the River Jhelum and its major tributaries were chosen in 2021 based on the strength of anticipated pollutants as well as the geology of the riverbed; they are shown in Figure 1 and listed in Table 1. The samples were collected during the spring (March–April), summer (June–July), autumn (September–October), and winter (December–January) seasons.
Table 1

Sampling locations

| Site | Location | Latitude | Longitude |
| --- | --- | --- | --- |
| Site 1 | River Jhelum: 1 km upstream of confluence point of Lidder and Jhelum | 33.7394 | 75.1295 |
| Site 2 | River Lidder: 1 km upstream of confluence point of Lidder and Jhelum | 33.7468 | 75.1382 |
| Site 3 | River Jhelum: 1 km downstream of confluence point of Lidder and Jhelum | 33.7478 | 75.1242 |
| Site 4 | River Jhelum: 1 km upstream of confluence point of Sindh and Jhelum | 34.1777 | 74.6805 |
| Site 5 | River Sindh: 1 km upstream of confluence point of Sindh and Jhelum | 34.1840 | 74.6853 |
| Site 6 | River Jhelum: 1 km downstream of confluence point of Sindh and Jhelum | 34.1930 | 74.6700 |
| Site 7 | River Jhelum: 1 km upstream of Jhelum entering Wular Lake | 34.3336 | 74.6356 |
| Site 8 | River Arin: 1 km upstream of Arin entering Wular Lake | 34.4172 | 74.6254 |
| Site 9 | River Madhumati: 1 km upstream of Madhumati entering Wular Lake | 34.4455 | 74.6463 |
| Site 10 | Wular Lake: Spot near Plan Bandipora | 34.3965 | 74.6144 |
| Site 11 | Wular Lake: Spot near Ashtongu | 34.4028 | 74.5756 |

Figure 1

Study area map showing drainage network, river gauging stations, and sampling sites.


Testing procedures

In this study, surface water samples were collected from 11 sites in each of the four seasons (spring, summer, autumn, and winter) of 2021. The sample containers were airtight and had a 3 L capacity, which was deemed adequate for sampling. The collected samples were examined at Tehkeek International's environmental laboratory and the State Pollution Control Board's (SPCB) water laboratory in Srinagar in accordance with the APHA (2017) standards. Transparency, pH, and conductivity were measured on site immediately after sampling, while the other parameters were determined in the laboratory. The analytical methods used to estimate the physicochemical parameters are given in Table 2.

Table 2

Water quality parameters and their analytical procedures

| Parameters | Abbreviation | Method | Unit |
| --- | --- | --- | --- |
| H+ concentration | pH | Digital pH meter | – |
| Transparency |  | Secchi disk | cm |
| Conductivity | EC | Conductivity meter | μS/cm |
| Total alkalinity | TA | Titrimetric method | mg/l |
| Hardness |  | EDTA method | mg/l |
| Calcium | Ca | EDTA method | mg/l |
| Chloride | Cl | Argentometric titration | mg/l |
| Free carbon dioxide | CO2 | Titration method | mg/l |
| Dissolved oxygen | DO | Winkler's method | mg/l |
| Phosphorus | PO4 | Perchloric acid method | μg/l |
| Ammoniacal nitrogen | NH3-N | Phenate-spectrophotometric method | μg/l |
| Nitrate nitrogen | NO3-N | Flotation-spectrophotometric method | μg/l |
| Total dissolved solids | TDS | Gravimetric analysis | mg/l |
| Total suspended solids | TSS | Gravimetric analysis | mg/l |

Statistical analysis

Graphs were used to present rapid and effective visual summaries that capture the significant content of the data and offer insight into it (Tripathi & Singal 2019). Graphs also make it easier to decide whether more intricate modeling is required (Ahmed et al. 2019). Line diagrams and box plots were employed in this study to summarize the dataset during exploratory data analysis. Data were evaluated using two-way ANOVA at a 0.05 (5%) level of significance to determine significant differences between the sites as well as between the seasons for all water quality measures. Furthermore, the multivariate technique known as PCA was applied to the stream water quality data. By exploring groups and sets of variables with similar qualities, PCA can uncover structure or patterns in noisy or confusing data, thus simplifying the explanation of observations (Molina et al. 2020). The SPSS (v. 26) software was used to conduct the statistical analysis. The dimension of a dataset with many correlated variables is reduced by transforming it into a new set of variables called principal components (PCs), which are orthogonal (uncorrelated) and arranged in decreasing order of importance. PCA is a data reduction method that indicates how many components are needed to explain the observed variance in the data (Nguyen & Holmes 2019). Mathematically, the PCs are obtained from the eigenvalues and eigenvectors of the covariance matrix, or of another cross-product matrix, which describes the dispersion of the observed parameters. A small number of PCs can thus explain most of the variance contained in the larger set of original variables. Additionally, PCA attempts to explain the relationships between the observations in terms of underlying factors that are not directly observable (Lawson et al. 2012).

The spatial and temporal variation in the water quality of the study region is depicted in Figure 2(a)–2(n). Along the downstream course of the River Jhelum (sites 1, 3, 4, 6, and 7), pH, DO, hardness, calcium, chloride, alkalinity, transparency, ammonia, and nitrate nitrogen levels showed a declining trend, whereas phosphorus, carbon dioxide, TDS, TSS, and conductivity showed an upward trend.
Figure 2

(a–n) Spatio-temporal variation of water quality parameters during all four seasons at different sites.


The water quality of the River Jhelum varies significantly along its length owing to instream pollution, with abrupt changes at the input sampling stations, namely, site 2 and site 5, which represent the water quality of the Rivers Lidder and Sindh, respectively. The water quality changes again where the river slows in Wular Lake, which also receives the Arin and Madhumati streams, whose quality parameters are shown at site 8 and site 9, respectively. The water quality indicators of Wular Lake, shown at site 10 and site 11, are quite different from the values at the starting section of the River Jhelum. Hence, the water quality of the River Jhelum varies greatly along its length until it reaches Wular Lake because of the inputs from its tributaries. Transparency, TDS, and TSS are at their peak values in Wular Lake due to higher levels of dissolved organic matter and sediment yield carried by its input water resources. Ammoniacal nitrogen and nitrate nitrogen levels are higher at all sampling stations of the River Jhelum, which may be due to residential and industrial areas adjacent to its banks contributing a greater amount of animal and human waste to the river.

At site 7, the conductivity, which reflects the concentration of dissolved ions and the level of mineralization in water, is greater than at the other 10 sites. This is attributed to a higher discharge of residential and municipal waste effluents into the river near this site, which is situated in the urbanized area of Bandipora district. The pH and alkalinity are greater in the River Sindh (site 5), which may be the result of excess contaminants interacting during flow, including chemicals, minerals, pollution, soil, or bedrock composition. Owing to factors such as low water temperature and low dissolved salt content, the dissolved oxygen levels in the Sindh and Lidder Rivers are considerably greater than those at the other sampling stations. Low calcium and hardness levels were found at sites 2 and 5, which may be related to the lack of limestone, dolomite, gypsum, and other calcium-containing rocks and minerals in the riverbeds of the Lidder and Sindh. The Cl concentration does not change drastically along the length of the River Jhelum; however, it is typically higher at the sampling stations of Wular Lake. Cl ions are typically found in the environment as NaCl, CaCl2, KCl, and MgCl2. The Cl ion and its salts are easily released into riverine systems close to the source because they are mobile (Huang et al. 2021). Phosphorus concentrations in the Lidder and Sindh are comparatively higher than at the other sampling points, which may be attributed to the high bank erosion of these rivers and the lack of sewage disposal facilities in these watersheds. Higher CO2 concentrations at the Wular Lake monitoring stations suggest that a large amount of dead organic matter is decomposing. This could be a naturally occurring phenomenon or the result of several forms of water contamination.
Furthermore, the line diagrams show significant seasonal variability of all parameters across all 11 sampling stations. The values of DO, TSS, TDS, and conductivity are higher in the spring season, whereas the values of hardness, calcium, alkalinity, and carbon dioxide are higher in the winter season. However, only a small number of parameters at some sites showed either a high concentration (positive peak) or a low concentration (negative peak) during all four seasons. Figure 3(a)–3(n) displays box plots that highlight the extreme values (minimum and maximum), median, degree of dispersion, degree of skew, and unusual values of the dataset.
Figure 3

(a–n) Box plots showing the seasonal variation of various water quality parameters.


Few studies have previously assessed the water quality parameters of the River Jhelum, which made comparison with past work difficult. Nevertheless, the available studies were compared with the results of this study, as shown in Table 3. There is a significant surge in TDS and TSS from 1982 to 2021, which hints at excessive soil and nutrient loss from the areas adjoining the River Jhelum in the last two decades. These findings indicate that the River Jhelum has become more eutrophic than in prior water quality assessments, showing a continuing decline in water quality.

Table 3

Comparison of mean values of water quality parameters from 1982 to 2021

| Parameters | Symbol | Units | Choudhary et al. (1982) | Qureshi et al. (2008) | Mir et al. (2016) | Current study 2021 |
| --- | --- | --- | --- | --- | --- | --- |
| H+ concentration | pH | – | 8.08 | 7.48 | 7.75 | 7.41 |
| Transparency |  | cm | – | – | – | 43.03 |
| Conductivity | EC | μS/cm | – | – | – | 203.96 |
| Total alkalinity | TA | mg/l | 10.72 | – | 231.9 | 134.64 |
| Hardness |  | mg/l | – | 104.95 | 119.1 | 164.39 |
| Calcium | Ca | mg/l | – | – | 33.65 | 116.75 |
| Chloride | Cl | mg/l | 1.54 | – | 5.5 | 15.16 |
| Free carbon dioxide | CO2 | mg/l | 36.35 | – | – | 9.44 |
| Dissolved oxygen | DO | mg/l | 5.5 | – | – | 8.06 |
| Phosphorus | PO4 | μg/l | – | – | – | 165.82 |
| Ammoniacal nitrogen | NH3-N | μg/l | – | – | – | 223.21 |
| Nitrate nitrogen | NO3-N | μg/l | – | – | 700.8 | 419.67 |
| Total dissolved solids | TDS | mg/l | 236.65 | 209.67 | 149.6 | 358.75 |
| Total suspended solids | TSS | mg/l | 315.69 | 288.02 | – | 374.82 |

ANOVA analysis

To assess the variation of the parameters affecting water quality, a two-way ANOVA was used. At a significance level of 5%, the F-value of each parameter was compared with the F-critical value, both between stations and between seasons. The findings indicate that F exceeds F-critical for all parameters, and that the p-value is smaller than the alpha value (0.05), except for phosphorus across stations. The null hypothesis (H0) is therefore rejected for almost all parameters, showing that there is significant variation in parameter values across the sampling stations as well as among the sampling seasons. The only case in which the null hypothesis cannot be rejected is phosphorus across sampling stations, showing no significant difference in its values across the 11 sampling sites. The results of the two-way ANOVA are shown in Table 4.
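The F-values in such an analysis can be reproduced with a short sketch. It assumes a two-way layout without replication (one observation per station and season), which the text does not state explicitly, and the function name `two_way_anova` and the small data table are hypothetical, not values from the study.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication.

    table[i][j] holds the single observation for row level i (e.g. a station)
    and column level j (e.g. a season). Returns the F statistics and the
    degrees of freedom; the interaction cannot be separated from the error.
    """
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "F_rows": (ss_rows / df_rows) / ms_error,
        "F_cols": (ss_cols / df_cols) / ms_error,
        "df": (df_rows, df_cols, df_error),
    }

# Hypothetical DO readings (mg/l): 3 stations (rows) x 4 seasons (columns).
do_table = [
    [8.1, 7.4, 7.9, 8.6],
    [9.0, 8.2, 8.8, 9.7],
    [7.2, 6.9, 7.0, 7.8],
]
result = two_way_anova(do_table)
print(result)
```

Each F statistic is then compared with the tabulated F-critical value for its degrees of freedom at the chosen alpha. For a layout of 11 stations and 4 seasons, the degrees of freedom would be 10, 3, and 30 for stations, seasons, and error, respectively.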

Table 4

Results of two-way ANOVA analysis

| Parameter | Source of variation | Sum of squares | Degrees of freedom | Mean squares | F-value | F-critical | p-value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pH | Seasons | 3.0873 | 10 | 0.3087 | 12.54 | 2.16 |  |
|  | Stations | 3.4043 |  | 1.1348 | 46.11 | 2.92 |  |
| NO3-N | Seasons | 1,283,238.9 | 10 | 128,323.8 | 8.81 | 2.16 |  |
|  | Stations | 0,266,841.88 |  | 988,947.29 | 6.11 | 2.92 | 0.0001 |
| TDS | Seasons | 430,912.68 | 10 | 43,091.26 | 35.94 | 2.16 |  |
|  | Stations | 34,737.52 |  | 1,198.67 | 9.66 | 2.92 |  |
| DO | Seasons | 12.9900 | 10 | 1.29 | 14.73 | 2.16 |  |
|  | Stations | 14.7755 |  | 4.92 | 5.87 | 2.92 |  |
| Conductivity | Seasons | 652,930.26 | 10 | 6,529.30 | 8.41 | 2.16 |  |
|  | Stations | 18,622.82 |  | 6,207.60 | 8.00 | 2.92 |  |
| TSS | Seasons | 117,946.18 | 10 | 11,794.61 | 42.26 | 2.16 |  |
|  | Stations | 101,875.70 |  | 33,958.56 | 121.67 | 2.92 |  |
| CO2 | Seasons | 139.87 | 10 | 13.98 | 9.44 | 2.16 |  |
|  | Stations | 75.42 |  | 25.14 | 16.98 | 2.92 |  |
| Hardness | Seasons | 72,101.68 | 10 | 7,210.16 | 8.54 | 2.16 |  |
|  | Stations | 16,830.25 |  | 5,610.08 | 6.65 | 2.92 |  |
| Calcium | Seasons | 31,645.50 | 10 | 3,164.55 | 21.32 | 2.16 |  |
|  | Stations | 8,875.09 |  | 2,958.36 | 19.93 | 2.92 |  |
| Alkalinity | Seasons | 14,927.18 | 10 | 1,492.71 | 6.30 | 2.16 |  |
|  | Stations | 4,262.18 |  | 1,420.72 | 5.99 | 2.92 | 0.0001 |
| Phosphorus | Seasons | 67,098.63 | 10 | 6,709.86 | 3.08 | 2.16 | 0.0082 |
|  | Stations | 9,696.61 |  | 3,232.20 | 1.48 | 2.92 | 0.1931 |
| Chloride | Seasons | 842.46 | 10 | 84.24 | 7.79 | 2.16 |  |
|  | Stations | 113.27 |  | 37.75 | 3.49 | 2.92 | 0.0038 |
| NH3-N | Seasons | 206,325.40 | 10 | 20,632.54 | 11.40 | 2.16 |  |
|  | Stations | 41,069.36 |  | 13,689.78 | 7.56 | 2.92 |  |
| Transparency | Seasons | 4,795.22 | 10 | 479.52 | 17.78 | 2.16 |  |
|  | Stations | 3,374.97 |  | 1,124.99 | 41.72 | 2.92 |  |

Factor analysis

FA is a multivariate statistical analysis technique for variables with numerous internal dependencies (Jehan et al. 2019). Its goal is to uncover the underlying structure of the data and identify a small number of ‘abstract’ indices that form its foundation (Auerswald & Moshagen 2019). These abstract indices, termed factors, explain the observed interdependence between variables; FA condenses the information into a few factors with the least amount of loss. The factors can represent a significant portion of the information carried by many of the original observed variables. Additionally, this approach can offer a scientific foundation for decision-making by supporting reasoned and systematic analysis and assessment (Sellbom & Tellegen 2019). PCA was used in this work to analyze 14 water quality indices at the 11 monitoring stations in the study area. First, the Kaiser-Meyer-Olkin (KMO) and Bartlett tests were used to determine whether PCA was applicable. These tests check, respectively, the adequacy of the sample and whether the variables are sufficiently correlated (Elsaman et al. 2022). The estimated values were KMO = 0.566 (>0.5) and Bartlett test value = 0 (<0.05), suggesting that the data are appropriate for PCA.
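As an illustration of how the two suitability checks are computed, the sketch below evaluates the KMO measure and Bartlett's test of sphericity from a correlation matrix. It assumes a three-variable subset (hardness, calcium, and NH3-N, with correlations from Table 6) and 44 observations (11 sites in 4 seasons), so its output is not expected to match the KMO of 0.566 reported for all 14 parameters. The function names are hypothetical and this is not the SPSS implementation.

```python
import math

def invert_with_det(matrix):
    """Gauss-Jordan inversion with partial pivoting; returns (inverse, determinant)."""
    n = len(matrix)
    a = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        p = a[col][col]
        det *= p
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a], det

def kmo(R):
    """Overall KMO: squared correlations relative to squared correlations plus squared partial correlations."""
    inv, _ = invert_with_det(R)
    n = len(R)
    r2 = q2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += R[i][j] ** 2
                q2 += partial ** 2
    return r2 / (r2 + q2)

def bartlett_sphericity(R, n_obs):
    """Chi-square statistic and degrees of freedom for H0: R is an identity matrix."""
    p = len(R)
    _, det = invert_with_det(R)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6.0) * math.log(det)
    return chi2, p * (p - 1) // 2

# Hardness, calcium and NH3-N correlations as reported in Table 6.
R = [
    [1.000, 0.838, 0.391],
    [0.838, 1.000, 0.513],
    [0.391, 0.513, 1.000],
]
kmo_value = kmo(R)
chi2, dof = bartlett_sphericity(R, n_obs=44)
print(round(kmo_value, 3), round(chi2, 2), dof)
```

A chi-square statistic well above the critical value for its degrees of freedom (7.815 for 3 degrees of freedom at alpha = 0.05) rejects the hypothesis that the variables are uncorrelated, which is the condition under which PCA is worthwhile.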

Data standardization

Using SPSS 26.0 software (IBM, Armonk, NY, USA), the original monitoring data were standardized and a correlation coefficient matrix was computed. The descriptive statistics (DS) obtained from the software are presented in Table 5, while the correlation matrix is shown in Table 6.
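The standardization step can be sketched as follows, assuming z-score standardization with the sample standard deviation, after which the Pearson correlation of two parameters is the sum of the products of their z-scores divided by n - 1. The readings are hypothetical and are not data from the study, and the function names are illustrative only.

```python
import math

def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / sd for x in values]

def correlation(x, y):
    """Pearson correlation computed from the z-scores of x and y."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def correlation_matrix(columns):
    """Build a nested dict of pairwise correlations from a dict of columns."""
    names = list(columns)
    return {a: {b: correlation(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical readings for four samples; NOT data from the study.
data = {
    "pH": [7.2, 7.6, 7.9, 8.1],
    "DO": [7.0, 8.1, 8.4, 9.2],
    "CO2": [11.0, 9.5, 8.0, 7.1],
}
corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Standardizing first puts parameters measured in different units (for example mg/l and μS/cm) on a common scale, so that no parameter dominates the analysis merely because of its magnitude.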

Table 5

Descriptive statistics

| Parameter | Range | Minimum | Maximum | Mean | Std. error of mean | Std. deviation | Variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pH | 1.80 | 6.50 | 8.30 | 7.5523 | 0.06182 | 0.41004 | 0.168 |
| TDS | 401.00 | 180.00 | 581.00 | 375.886 | 16.2825 | 108.006 | 11665.359 |
| EC | 134.00 | 156.00 | 290.00 | 217.590 | 5.41524 | 35.9206 | 1290.294 |
| TSS | 280.00 | 200.00 | 480.00 | 341.113 | 10.9822 | 72.8481 | 5306.847 |
| DO | 4.50 | 5.90 | 10.40 | 8.3500 | 0.12678 | 0.84096 | 0.707 |
| CO2 | 11.18 | 6.00 | 17.18 | 8.9318 | 0.37050 | 2.45762 | 6.040 |
| H | 202.00 | 54.00 | 256.00 | 136.113 | 7.77044 | 51.5432 | 2656.708 |
| Ca | 122.00 | 43.00 | 165.00 | 98.0000 | 4.87540 | 32.3397 | 1045.860 |
| TA | 115.00 | 71.00 | 186.00 | 144.636 | 3.72808 | 24.7293 | 611.539 |
| PO4 | 268.00 | 54.00 | 322.00 | 185.659 | 8.66495 | 57.4767 | 3303.579 |
| Cl | 22.33 | 6.43 | 28.76 | 13.3345 | 0.82252 | 5.45600 | 29.768 |
| NH3-N | 342.00 | 47.00 | 389.00 | 184.954 | 12.6269 | 83.7576 | 7015.347 |
| NO3-N | 832.00 | 89.00 | 921.00 | 316.204 | 32.4045 | 214.947 | 46202.353 |
| T | 51.00 | 28.00 | 79.00 | 50.0227 | 2.17848 | 14.4503 | 208.813 |

Table 6

Correlation matrix

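| Parameter | pH | TDS | EC | TSS | DO | CO2 | H | Ca | TA | PO4 | Cl | NH3-N | NO3-N | T |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pH | 1.00 |  |  |  |  |  |  |  |  |  |  |  |  |  |
| TDS | 0.068 | 1.00 |  |  |  |  |  |  |  |  |  |  |  |  |
| EC | 0.410 | 0.286 | 1.00 |  |  |  |  |  |  |  |  |  |  |  |
| TSS | 0.077 | 0.158 | 0.024 | 1.00 |  |  |  |  |  |  |  |  |  |  |
| DO | 0.466 | −0.188 | 0.534 | −0.114 | 1.00 |  |  |  |  |  |  |  |  |  |
| CO2 | −0.630 | 0.322 | −0.242 | −0.062 | −0.447 | 1.00 |  |  |  |  |  |  |  |  |
| H | −0.602 | −0.283 | −0.438 | 0.390 | −0.287 | 0.090 | 1.00 |  |  |  |  |  |  |  |
| Ca | −0.533 | −0.339 | −0.451 | 0.452 | −0.121 | 0.212 | 0.838 | 1.00 |  |  |  |  |  |  |
| TA | 0.364 | −0.212 | 0.508 | −0.175 | 0.742 | −0.349 | −0.377 | −0.240 | 1.00 |  |  |  |  |  |
| PO4 | 0.280 | 0.389 | 0.304 | −0.301 | 0.286 | 0.098 | −0.591 | −0.531 | 0.183 | 1.00 |  |  |  |  |
| Cl | −0.144 | 0.269 | −0.293 | 0.564 | −0.457 | 0.166 | 0.377 | 0.373 | −0.612 | −0.291 | 1.00 |  |  |  |
| NH3-N | −0.313 | −0.709 | −0.584 | 0.144 | −0.279 | 0.028 | 0.391 | 0.513 | −0.177 | −0.347 | −0.025 | 1.00 |  |  |
| NO3-N | 0.018 | −0.346 | −0.001 | 0.596 | −0.081 | −0.181 | 0.326 | 0.423 | −0.123 | −0.445 | 0.120 | 0.543 | 1.00 |  |
| T | 0.093 | −0.207 | −0.139 | −0.829 | 0.156 | −0.239 | −0.392 | −0.463 | 0.312 | 0.193 | −0.527 | −0.069 | −0.542 | 1.00 |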

Common degree analysis

Table 7 lists the communalities (common degrees) obtained from the factor analysis (FA). The initial communality equals 1 for all parameters, and the communality of each variable after factor extraction is listed in the third column. Seven indices (pH, TDS, TSS, DO, alkalinity, calcium, and transparency) have high communalities (>0.80), meaning that the extracted factors retain most of their information. The communalities of conductivity, CO2, hardness, phosphorus, chloride, NH3-N, and NO3-N are lower (<0.80), so less of their information is retained.

Table 7

Communalities

| Parameter | Initial | Extraction |
| --- | --- | --- |
| pH | 1.000 | 0.887 |
| Total dissolved solids | 1.000 | 0.874 |
| Conductivity | 1.000 | 0.787 |
| Total suspended solids | 1.000 | 0.887 |
| Dissolved oxygen | 1.000 | 0.813 |
| Carbon dioxide | 1.000 | 0.767 |
| Hardness | 1.000 | 0.763 |
| Calcium | 1.000 | 0.867 |
| Alkalinity | 1.000 | 0.822 |
| Total phosphorus | 1.000 | 0.530 |
| Chloride | 1.000 | 0.740 |
| Ammoniacal nitrogen | 1.000 | 0.760 |
| Nitrate nitrogen | 1.000 | 0.709 |
| Transparency | 1.000 | 0.885 |
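
The extraction communalities can be cross-checked against the unrotated component matrix reported in Table 9, because a variable's communality is the sum of its squared loadings on the retained components. The following is a minimal sketch in Python with NumPy; the variable names and the array layout are illustrative assumptions, and the loadings are copied from Table 9.

```python
import numpy as np

# Parameter labels in the order used by the tables (H = hardness, T = transparency).
params = ["pH", "TDS", "EC", "TSS", "DO", "CO2", "H", "Ca",
          "TA", "PO4", "Cl", "NH3-N", "NO3-N", "T"]

# Unrotated loadings on PC1-PC4, copied from Table 9.
loadings = np.array([
    [-0.595,  0.244,  0.483, -0.490],
    [-0.219, -0.824,  0.378,  0.061],
    [-0.601,  0.049,  0.572,  0.309],
    [ 0.544,  0.051,  0.767, -0.026],
    [-0.586,  0.513,  0.262,  0.371],
    [ 0.314, -0.624, -0.271,  0.454],
    [ 0.823,  0.157, -0.046,  0.245],
    [ 0.820,  0.252,  0.009,  0.361],
    [-0.623,  0.529,  0.091,  0.383],
    [-0.633, -0.345,  0.004,  0.098],
    [ 0.599, -0.423,  0.359, -0.273],
    [ 0.590,  0.513, -0.352, -0.155],
    [ 0.521,  0.500,  0.418, -0.116],
    [-0.560,  0.131, -0.716, -0.204],
])

# Communality of each variable = row-wise sum of squared loadings.
communalities = dict(zip(params, (loadings ** 2).sum(axis=1)))

# Variables above the 0.80 threshold used in the text.
well_represented = [p for p, h2 in communalities.items() if h2 > 0.80]

for p, h2 in communalities.items():
    print(f"{p:6s} {h2:.3f}")
print(well_represented)
```

Computed this way, the values agree with the Extraction column of Table 7 to within rounding, and the same seven parameters exceed 0.80.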

Total variance explained

Table 8 shows that the four extracted factors all have eigenvalues greater than one and together account for 79.234% of the total variance. Only these first four components were extracted and rotated. Rotation redistributed the explained variance among the factors and brought their individual contributions closer together (from 35.231%, 18.511%, 17.030%, and 8.463% before rotation to 23.274%, 21.904%, 18.727%, and 15.328% after rotation). These four components therefore capture the core information in the original data.

Table 8

Total variance explained

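| Component | Initial eigenvalues: total | Initial eigenvalues: % of variance | Initial eigenvalues: cumulative % | Extraction sums of squared loadings: total | Extraction sums of squared loadings: % of variance | Extraction sums of squared loadings: cumulative % | Rotation sums of squared loadings: total | Rotation sums of squared loadings: % of variance | Rotation sums of squared loadings: cumulative % |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 4.932 | 35.231 | 35.231 | 4.932 | 35.231 | 35.231 | 3.258 | 23.274 | 23.274 |
| 2 | 2.591 | 18.511 | 53.741 | 2.591 | 18.511 | 53.741 | 3.067 | 21.904 | 45.178 |
| 3 | 2.384 | 17.030 | 70.771 | 2.384 | 17.030 | 70.771 | 2.622 | 18.727 | 63.906 |
| 4 | 1.185 | 8.463 | 79.234 | 1.185 | 8.463 | 79.234 | 2.146 | 15.328 | 79.234 |
| 5 | 0.923 | 6.590 | 85.824 | | | | | | |
| 6 | 0.641 | 4.578 | 90.402 | | | | | | |
| 7 | 0.397 | 2.838 | 93.240 | | | | | | |
| 8 | 0.310 | 2.212 | 95.452 | | | | | | |
| 9 | 0.230 | 1.642 | 97.094 | | | | | | |
| 10 | 0.172 | 1.226 | 98.320 | | | | | | |
| 11 | 0.090 | 0.640 | 98.959 | | | | | | |
| 12 | 0.073 | 0.521 | 99.480 | | | | | | |
| 13 | 0.043 | 0.305 | 99.785 | | | | | | |
| 14 | 0.030 | 0.215 | 100.000 | | | | | | |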
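
The percentage and cumulative columns of Table 8 follow directly from the eigenvalues. As a rough illustration (a sketch, not the authors' SPSS procedure), the snippet below recomputes them in Python with NumPy and applies the eigenvalue-greater-than-one rule; the variable names are assumptions, and the eigenvalues are copied from the table.

```python
import numpy as np

# Initial eigenvalues of the 14 components, copied from Table 8.
eigenvalues = np.array([4.932, 2.591, 2.384, 1.185, 0.923, 0.641, 0.397,
                        0.310, 0.230, 0.172, 0.090, 0.073, 0.043, 0.030])

# For PCA on a correlation matrix the eigenvalues sum to the number of
# variables, so each component's share of variance is eigenvalue / 14.
n_vars = eigenvalues.size
pct_variance = 100 * eigenvalues / n_vars
cumulative = np.cumsum(pct_variance)

# Retain only components whose eigenvalue exceeds one.
n_retained = int((eigenvalues > 1).sum())

print("retained components:", n_retained)
print("cumulative % explained:", cumulative[n_retained - 1])
```

With the rounded eigenvalues from the table, four components are retained and their cumulative share is close to the reported 79.234%; small differences in the last digits come from rounding.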

Scree plot analysis

The scree plot helps in selecting the PCs and understanding the underlying data structure. It displays the eigenvalues of all factors and, together with the PCA selection principles, is useful for identifying the ideal number of principal components (Isiyaka et al. 2019). The slope flattens markedly after the fourth component. The first four principal components, which have eigenvalues greater than one and account for 79.234% of the dataset's variance, were retained. Figure 4 shows larger differences in the eigenvalues between factors 1 and 2 and between factors 3 and 4, as well as between factors 4 and 5, whereas there is little difference between factor 5 and the subsequent factors. This implies that the top four factors carry the most reliable general information, so it is appropriate to take them as the principal components representing all 14 variables.
Figure 4

Scree plot.

Table 9 shows the factor loading matrix, which gives the loading of each variable on each PC, before and after rotation. After rotation, the loadings are substantially polarized: most variables load strongly on a single component and only weakly on the others.

Table 9

Component matrix and rotated component matrix

| Parameter | PC1 (unrotated) | PC2 (unrotated) | PC3 (unrotated) | PC4 (unrotated) | PC1 (rotated) | PC2 (rotated) | PC3 (rotated) | PC4 (rotated) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pH | −0.595 | 0.244 | 0.483 | −0.490 | 0.255 | 0.040 | 0.228 | 0.877 |
| TDS | −0.219 | −0.824 | 0.378 | 0.061 | 0.877 | 0.192 | −0.237 | −0.108 |
| EC | −0.601 | 0.049 | 0.572 | 0.309 | 0.530 | 0.206 | 0.640 | 0.234 |
| TSS | 0.544 | 0.051 | 0.767 | −0.026 | −0.076 | 0.929 | −0.103 | 0.088 |
| DO | −0.586 | 0.513 | 0.262 | 0.371 | 0.073 | −0.029 | 0.868 | 0.232 |
| CO2 | 0.314 | −0.624 | −0.271 | 0.454 | 0.254 | −0.021 | −0.280 | −0.790 |
| H | 0.823 | 0.157 | −0.046 | 0.245 | −0.558 | 0.427 | −0.180 | −0.487 |
| Ca | 0.820 | 0.252 | 0.009 | 0.361 | −0.596 | 0.488 | −0.042 | −0.521 |
| TA | −0.623 | 0.529 | 0.091 | 0.383 | 0.023 | −0.190 | 0.867 | 0.182 |
| PO4 | −0.633 | −0.345 | 0.004 | 0.098 | 0.621 | −0.324 | 0.186 | 0.067 |
| Cl | 0.599 | −0.423 | 0.359 | −0.273 | 0.079 | 0.573 | −0.634 | −0.057 |
| NH3-N | 0.590 | 0.513 | −0.352 | −0.155 | −0.850 | 0.006 | −0.177 | −0.073 |
| NO3-N | 0.521 | 0.500 | 0.418 | −0.116 | −0.531 | 0.622 | 0.032 | 0.198 |
| T | −0.560 | 0.131 | −0.716 | −0.204 | −0.062 | −0.919 | 0.080 | 0.175 |
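
One way to read the rotated matrix is to sum the squared loadings in each column, which gives the variance carried by each rotated component, and to note the component on which each variable loads most strongly. The sketch below does this in Python with NumPy; the names are illustrative assumptions, and the loadings are copied from the rotated columns of Table 9.

```python
import numpy as np

# Parameter labels (H = hardness, T = transparency).
params = ["pH", "TDS", "EC", "TSS", "DO", "CO2", "H", "Ca",
          "TA", "PO4", "Cl", "NH3-N", "NO3-N", "T"]

# Rotated loadings on PC1-PC4, copied from Table 9.
rotated = np.array([
    [ 0.255,  0.040,  0.228,  0.877],
    [ 0.877,  0.192, -0.237, -0.108],
    [ 0.530,  0.206,  0.640,  0.234],
    [-0.076,  0.929, -0.103,  0.088],
    [ 0.073, -0.029,  0.868,  0.232],
    [ 0.254, -0.021, -0.280, -0.790],
    [-0.558,  0.427, -0.180, -0.487],
    [-0.596,  0.488, -0.042, -0.521],
    [ 0.023, -0.190,  0.867,  0.182],
    [ 0.621, -0.324,  0.186,  0.067],
    [ 0.079,  0.573, -0.634, -0.057],
    [-0.850,  0.006, -0.177, -0.073],
    [-0.531,  0.622,  0.032,  0.198],
    [-0.062, -0.919,  0.080,  0.175],
])

# Column-wise sums of squared loadings: variance carried by each rotated PC.
rotation_ss = (rotated ** 2).sum(axis=0)

# Component (1-based) on which each variable has its largest absolute loading.
dominant_pc = {p: int(np.argmax(np.abs(row))) + 1
               for p, row in zip(params, rotated)}

print(np.round(rotation_ss, 3))
print(dominant_pc)
```

The column sums agree, to within rounding, with the rotation sums of squared loadings in Table 8 (3.258, 3.067, 2.622, and 2.146), and the dominant components match the variables singled out in the interpretation (TDS on PC1, TSS on PC2, DO and alkalinity on PC3, pH on PC4).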

The first principal component (PC1) accounted for 35.23% of the total variance and had a strong positive loading on TDS (0.877 after rotation). The second principal component (PC2) accounted for 18.51% of the total variance and had a strong positive loading on TSS (0.929 after rotation). These loadings suggest that substantial sediment production from bank erosion, agricultural runoff, and watershed disintegration is responsible for the most serious pollution problem in the streams. For PC3, the highest loadings (>0.86) are on DO and alkalinity, indicating that soil erosion and bed decomposition (phosphates, limestone) are further significant sources of pollution in the rivers. The fourth principal component (PC4), which accounted for 8.46% of the total variance, had a strong positive loading on pH (0.877). The relationship between river water pH and other water quality indices is complex, but toxic pollution from industrial manufacturing may be responsible.

Component score

Table 10 presents the component score coefficient matrix of the PCs. The first and second components reflect the degree of water contamination by sewage and soil erosion, respectively, and represent inorganic and organic pollution factors. The third component is a comprehensive pollution factor that primarily reflects water pollution arising from the natural composition of the river-bed material, and the fourth component represents metal pollution.

Table 10

Component score coefficient matrix

| Parameter | PC1 | PC2 | PC3 | PC4 |
| --- | --- | --- | --- | --- |
| pH | 0.017 | 0.058 | −0.104 | 0.470 |
| Total dissolved solids | 0.324 | 0.109 | −0.091 | −0.075 |
| Conductivity | 0.167 | 0.171 | 0.284 | −0.056 |
| Total suspended solids | 0.033 | 0.326 | 0.008 | 0.093 |
| Dissolved oxygen | −0.008 | 0.073 | 0.389 | −0.081 |
| Carbon dioxide | 0.151 | −0.020 | 0.049 | −0.443 |
| Hardness | −0.120 | 0.098 | 0.076 | −0.212 |
| Calcium | −0.127 | 0.130 | 0.163 | −0.265 |
| Alkalinity | −0.032 | 0.011 | 0.388 | −0.109 |
| Total phosphorus | 0.181 | −0.059 | 0.042 | −0.056 |
| Chloride | 0.080 | 0.158 | −0.265 | 0.122 |
| Ammoniacal nitrogen | −0.281 | −0.073 | −0.061 | 0.066 |
| Nitrate nitrogen | −0.154 | 0.195 | 0.030 | 0.161 |
| Transparency | −0.100 | −0.333 | −0.084 | 0.088 |

Factor score analysis

The relevant factor scores at different sampling stations can be obtained using the SPSS software.

In the SPSS data editor, paste the four columns of component matrix data from Table 9 (label the variables as α, β, μ, and Ɩ). Use ‘Transform → Compute Variable’ to enter the formulas ‘A = α/SQRT(X1), B = β/SQRT(X2), C = μ/SQRT(X3), D = Ɩ/SQRT(X4)’, where Xi is the eigenvalue of the i-th PC as shown in Table 8, and A, B, C, and D are the corresponding eigenvectors. The standardized data are multiplied by the eigenvectors A, B, C, and D, and the four extracted PC expressions are derived as follows:
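$$F_1 = A_1 Z_1 + A_2 Z_2 + \cdots + A_{14} Z_{14} \tag{1}$$
$$F_2 = B_1 Z_1 + B_2 Z_2 + \cdots + B_{14} Z_{14} \tag{2}$$
$$F_3 = C_1 Z_1 + C_2 Z_2 + \cdots + C_{14} Z_{14} \tag{3}$$
$$F_4 = D_1 Z_1 + D_2 Z_2 + \cdots + D_{14} Z_{14} \tag{4}$$
where $F_k$ ($k = 1, \ldots, 4$) is the score of the $k$-th principal component; $A_j$, $B_j$, $C_j$, and $D_j$ are the elements of the corresponding eigenvectors; and $Z_j$ is the standardized Z-score of the $j$-th water quality parameter ($j = 1, \ldots, 14$).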
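
The conversion from loadings to eigenvectors and the computation of component scores can be sketched outside SPSS as well. The example below uses Python with NumPy on synthetic stand-in data, since the raw measurements are not reproduced here; the 44 x 14 shape (11 stations times 4 seasons, 14 parameters) and all variable names are assumptions made for illustration only.

```python
import numpy as np

# Synthetic stand-in data (NOT the study's measurements): 44 samples x 14 parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(44, 14))
X[:, 1] += 0.8 * X[:, 0]  # introduce some correlation between two columns

# Standardize each parameter to a Z-score.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix, sorted by decreasing eigenvalue.
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 4  # number of retained components
# Loadings (what a component matrix reports) = eigenvector * sqrt(eigenvalue).
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])

# The step described in the text: eigenvector = loading / sqrt(eigenvalue).
recovered = loadings / np.sqrt(eigvals[:k])

# Component scores: standardized data multiplied by the eigenvectors.
scores = Z @ recovered

print(np.linalg.norm(recovered, axis=0))  # each eigenvector has unit length
print(scores.shape)
```

Dividing each loading column by the square root of its eigenvalue returns a unit-length eigenvector, and the variance of each resulting score column equals the corresponding eigenvalue.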
Taking the share of variance explained by each of the four principal components (e1, e2, e3, and e4) as weights, the comprehensive evaluation function F can be calculated using formula (5).
$$F = \frac{e_1 F_1 + e_2 F_2 + e_3 F_3 + e_4 F_4}{e_1 + e_2 + e_3 + e_4} \tag{5}$$

The results obtained using the above formulas are shown in Table 11. They reveal that, of the 11 sampling stations, site 5 has the poorest water quality and site 1 is the least affected. Ranked by the comprehensive evaluation function F, the pollution status of the 11 cross-sections follows the order: site 5 > site 10 > site 9 > site 11 > site 8 > site 2 > site 7 > site 6 > site 3 > site 4 > site 1.

Table 11

Water quality principal component score and order of 11 sampling stations

| Site | F1 | F2 | F3 | F4 | F | Order |
| --- | --- | --- | --- | --- | --- | --- |
| Site 1 | −1.51235 | 0.596498 | −0.26949 | 0.439998 | −0.54403 | 11 |
| Site 2 | 0.2485 | −0.81411 | 0.79471 | 0.658 | 0.161391 | 6 |
| Site 3 | −0.96004 | 0.040725 | 0.197673 | 0.474013 | −0.32424 | 9 |
| Site 4 | −1.16214 | 0.21282 | −0.24717 | −0.18774 | −0.54019 | 10 |
| Site 5 | 1.02993 | −0.75451 | 0.569888 | 0.771443 | 0.486563 | 1 |
| Site 6 | −0.30473 | 0.012438 | 0.1461 | −0.07663 | −0.10937 | 8 |
| Site 7 | −0.73228 | 0.699215 | 0.481193 | −0.46036 | −0.10799 | 7 |
| Site 8 | 0.689618 | −0.80739 | 0.144203 | 0.205468 | 0.17095 | 5 |
| Site 9 | 0.742448 | −0.7055 | 0.39468 | 0.096533 | 0.260444 | 3 |
| Site 10 | 0.971715 | 0.70219 | −1.021 | −0.8013 | 0.291078 | 2 |
| Site 11 | 0.989328 | 0.817623 | −1.1908 | −1.11944 | 0.255402 | 4 |
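
The comprehensive score and the site ranking can be reproduced from the component scores in Table 11. The sketch below is written in plain Python and assumes that F is the variance-weighted mean of F1 to F4, with the weights taken as the "% of variance" of the initial eigenvalues in Table 8; under that assumption the computed values agree with the F column of the table to about three decimal places. The function and variable names are illustrative.

```python
# Component scores (F1, F2, F3, F4) per site, copied from Table 11.
site_scores = {
    1:  (-1.51235,  0.596498, -0.26949,   0.439998),
    2:  ( 0.2485,  -0.81411,   0.79471,   0.658),
    3:  (-0.96004,  0.040725,  0.197673,  0.474013),
    4:  (-1.16214,  0.21282,  -0.24717,  -0.18774),
    5:  ( 1.02993, -0.75451,   0.569888,  0.771443),
    6:  (-0.30473,  0.012438,  0.1461,   -0.07663),
    7:  (-0.73228,  0.699215,  0.481193, -0.46036),
    8:  ( 0.689618, -0.80739,  0.144203,  0.205468),
    9:  ( 0.742448, -0.7055,   0.39468,   0.096533),
    10: ( 0.971715,  0.70219, -1.021,    -0.8013),
    11: ( 0.989328,  0.817623, -1.1908,  -1.11944),
}

# Assumed weights e1..e4: % of variance of the four retained PCs (Table 8).
weights = (35.231, 18.511, 17.030, 8.463)


def comprehensive_score(scores, e=weights):
    """Variance-weighted mean of the four component scores."""
    return sum(ei * fi for ei, fi in zip(e, scores)) / sum(e)


F = {site: comprehensive_score(s) for site, s in site_scores.items()}

# Higher F means poorer water quality, so rank in descending order of F.
ranking = sorted(F, key=F.get, reverse=True)

for rank, site in enumerate(ranking, start=1):
    print(f"{rank:2d}  site {site:2d}  F = {F[site]: .4f}")
```

Sorting by the recomputed F places site 5 first and site 1 last, consistent with the order reported in the text.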

The key problem with conventional water quality monitoring systems is the rapid generation of massive physicochemical data matrices, which demand an efficient data processing system to evaluate the results, associate variables, and draw relevant conclusions. The results of this research show that PCA, a potent multivariate statistical tool, reduces a dataset's dimensionality while retaining as much of its variability as possible and enables the assessment of relationships between variables. The environmental variables highlighted by PCA clearly define and explain the pollution gradients.

The purpose of this study was to determine the overall surface water quality of the north-east Himalayan region of Kashmir Valley using standard testing procedures and to analyze the results using multivariate statistical tools. Furthermore, the impact of the side-streams, specifically Lidder, Sindh, Arin, and Madhumati, on the water quality of the River Jhelum and the Wular Lake was evaluated. Surface water quality was tested at 11 sampling stations during 2021, with samples collected in all four seasons to assess spatial and temporal variability. Line diagrams and box plots show that, at particular sites and in particular sampling seasons, certain parameters exceed the allowable limits of the WHO guidelines (Cotruvo 2017), making the water unsuitable for drinking, farming, fishing, or other household uses. The increasing side-stream pollution is a sign of increased anthropogenic pressure in the watersheds of the north-east Himalayas.

The sampled parameters were subjected to a two-way ANOVA to determine whether there was any seasonal or sectional variation. The results showed significant spatio-temporal variability. Furthermore, the most important indicator parameters affecting water quality and the potential sources of pollution were extracted using PCA, which integrates and simplifies a high-dimensional variable system while retaining as much of the original information as possible. PCA extracted four significant PCs from the 14 water quality parameters, which together account for 79.23% of the variance in the original dataset. PC1 (35.23%) and PC2 (18.51%) represented chemical pollutants, indicating the influence of bank erosion and other deposited sediments on water quality. PC3 (17.03%) had strong positive loadings on DO and alkalinity and represented pollution due to bed decomposition and industrial as well as residential wastes. PC4 (8.46%) had a strong positive loading on pH, representing toxic pollution from the municipal areas surrounding the surface water resources.

In order to improve decision-making and the efficient management of water resources in the River Jhelum and its related tributaries, this study makes use of the technique's strengths in assessing and interpreting sources of pollutants as well as the fluctuation of the water quality. The findings of this study could prompt deeper and logical considerations and lead to an improvement in critical management for the ecology and environment of surface water resources in the north-eastern Himalayas.

The main limitation of the methodology used in this study is the difficulty of interpreting the results. The PCs produced by PCA are not as readable and interpretable as the original parameters. PCs attempt to account for as much of the variance among the variables in a dataset as possible, but if the number of PCs is not carefully chosen, some critical information from the original set of features may be left out. Secondly, only field data from 2021 were used to determine the water quality status in the study watersheds; long-term monitoring data would have allowed a more accurate analysis of the patterns of water quality degradation. In view of this, we recommend a comprehensive monitoring programme that covers a longer period of time and more water quality components, such as heavy metals and other contaminants that might enter the River Jhelum through the use of pesticides on agricultural farms. Furthermore, an in-depth analysis of the potential factors responsible for river water pollution could be carried out using optimized machine learning models.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Ahmed A. N., Othman F. B., Afan H. A., Ibrahim R. K., Fai C. M., Hossain M. S., Ehteram M. & Elshafie A. 2019 Machine learning methods for better water quality prediction. Journal of Hydrology 578 (1), 124084.
American Public Health Association (APHA) 2017 Standard Methods for Examination of Water and Wastewater, 23rd edn. APHA, AWWA, WPCF, Washington.
Avtar R., Tripathi S., Aggarwal A. K. & Kumar P. 2019 Population–urbanization–energy nexus: a review. Resources 8 (3), 136.
Aydin H., Ustaoğlu F., Tepe Y. & Soylu E. N. 2021 Assessment of water quality of streams in northeast Turkey by water quality index and multiple statistical methods. Environmental Forensics 22 (1–2), 270–287.
Bhat S. U., Bhat A. A., Jehangir A., Hamid A., Sabha I. & Qayoom U. 2021 Water quality characterization of Marusudar River in Chenab Sub-Basin of North-Western Himalaya using multivariate statistical methods. Water, Air, & Soil Pollution 232 (11), 1–22.
Butler C. L., Lucieer V. L., Wotherspoon S. J. & Johnson C. R. 2020 Multi-decadal decline in cover of giant kelp Macrocystis pyrifera at the southern limit of its Australian range. Marine Ecology Progress Series 653, 1–18.
Choudhary N., Shukla N. P., Amin R. & Qalandar S. P. 1982 Study of water pollution in river Jhelum of Kashmir. In: Proceedings Water and Waste Engineering (Cotton A. & Pickford J., eds). WEDC Group, Loughborough, UK, pp. 151–160.
Cotruvo J. A. 2017 WHO guidelines for drinking water quality: first addendum to the fourth edition. Journal-American Water Works Association 109 (7), 44–51.
Garrido L. E., Abad F. J. & Ponsoda V. 2013 A new look at Horn's parallel analysis with ordinal variables. Psychological Methods 18 (4), 454.
Goliatt L., Sulaiman S. O., Khedher K. M., Farooque A. A. & Yaseen Z. M. 2021 Estimation of natural streams longitudinal dispersion coefficient using hybrid evolutionary machine learning model. Engineering Applications of Computational Fluid Mechanics 15 (1), 1298–1320.
Gude V. G. 2017 Desalination and water reuse to address global water scarcity. Reviews in Environmental Science and Bio/Technology 16 (4), 591–609.
Hanjra M. A., Blackwell J., Carr G., Zhang F. & Jackson T. M. 2012 Wastewater irrigation and environmental health: implications for water governance and public policy. International Journal of Hygiene and Environmental Health 215 (3), 255–269.
Isiyaka H. A., Mustapha A., Juahir H. & Phil-Eze P. 2019 Water quality modelling using artificial neural network and multivariate statistical techniques. Modeling Earth Systems and Environment 5 (2), 583–593.
Jehan S., Khan S., Khattak S. A., Muhammad S., Rashid A. & Muhammad N. 2019 Hydrochemical properties of drinking water and their sources apportionment of pollution in Bajaur agency, Pakistan. Measurement 139 (1), 249–257.
Lawson D. J., Hellenthal G., Myers S. & Falush D. 2012 Inference of population structure using dense haplotype data. PLoS Genetics 8 (1), e1002453.
Machiwal D., Cloutier V., Güler C. & Kazakis N. 2018 A review of GIS-integrated statistical techniques for groundwater quality evaluation and protection. Environmental Earth Sciences 77 (19), 1–30.
Mehmood M. A., Shafiq-ur-Rehman A. R. & Ganie S. A. 2017 Spatio-temporal changes in water quality of Jhelum River, Kashmir Himalaya. International Journal of Environmental Bioenergy 12 (1), 1–29.
Molina J. L., Zazo S., Martín-Casado A. M. & Patino-Alonso M. C. 2020 Rivers’ temporal sustainability through the evaluation of predictive runoff methods. Sustainability 12 (5), 1720.
Nguyen L. H. & Holmes S. 2019 Ten quick tips for effective dimensionality reduction. PLoS Computational Biology 15 (6), e1006907.
Qadir A., Malik R. N. & Husain S. Z. 2008 Spatio-temporal variations in water quality of Nullah Aik-tributary of the river Chenab, Pakistan. Environmental Monitoring and Assessment 140 (1), 43–59.
Qayoom U., Bhat S. U., Ahmad I. & Kumar A. 2022 Assessment of potential risks of heavy metals from wastewater treatment plants of Srinagar city, Kashmir. International Journal of Environmental Science and Technology 19 (9), 9027–9046.
Qureshi T. A. 2008 Effect of thermal pollution on the hydrological parameters of river Jhelum (J & K). Journal of Indian Fisheries Association 35 (1), 159–166.
Rosegrant M. W., Ringler C. & Zhu T. 2009 Water for agriculture: maintaining food security under growing scarcity. Annual Review of Environment and Resources 34 (1), 205–222.
Showqi I., Rashid I. & Romshoo S. A. 2014 Land use land cover dynamics as a function of changing demography and hydrology. GeoJournal 79 (3), 297–307.
Singh K. P., Basant A., Malik A. & Jain G. 2009 Artificial neural network modeling of the river water quality – a case study. Ecological Modelling 220 (6), 888–895.
Sun K., Rajabtabar M., Samadi S., Rezaie-Balf M., Ghaemi A., Band S. S. & Mosavi A. 2021 An integrated machine learning, noise suppression, and population-based algorithm to improve total dissolved solids prediction. Engineering Applications of Computational Fluid Mechanics 15 (1), 251–271.
Tipping M. E. & Bishop C. M. 1999 Probabilistic principal component analysis. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61 (3), 611–622.
Zhao N., Ghaemi A., Wu C., Band S. S., Chau K. W., Zaguia A., Mafarja M. & Mosavi A. H. 2021 A decomposition and multi-objective evolutionary optimization model for suspended sediment load prediction in rivers. Engineering Applications of Computational Fluid Mechanics 15 (1), 1811–1829.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).