Water resource management depends substantially on water quality (WQ). Anthropogenic and geogenic pollutants in water systems are challenging to identify, transport, and properly dispose of, and therefore demand frequent monitoring. This study applies a statistical approach to analyse patterns in, and to monitor, the WQ parameters of a region. It presents the computation of a water quality index (WQI) from various WQ parameters of the Daman Ganga River at Vapi, Gujarat, India. The 17 WQ parameters considered were pH, electrical conductivity, temperature (Temp), total dissolved solids, NO2 + NO3, total phosphorus (P-Tot), Ca, Mg, Na, K, Cl, SO4, CO3, HCO3, total hardness, sodium adsorption ratio (SAR), and calcium hardness (HAR_Ca). Quartile deviation was used as a preprocessing technique to remove outliers so that the trends followed by the parameters could be analysed fairly. PCA followed by varimax rotation factor analysis was applied to identify the contribution of the significant parameters. The methods developed by the Canadian Council of Ministers of the Environment (CCME) and British Columbia (BC) were applied to compute the WQI. The WQI values were 42.35 and 63.29 for CCME and BC, respectively, based on five significantly influencing parameters, namely HAR_Ca, SAR, CO3, Temp, and P-Tot. The study highlights the hardness and salinity factors affecting WQ and reduces subjectivity and bias in determining the WQI model.

  • The trends followed by 17 water quality parameters from 2000 to 2019 were studied.

  • Data preprocessing was followed by principal component analysis and varimax rotation factor analysis.

  • Factor analysis of the data indicates that five parameters influence the water quality index (WQI) values.

  • The WQI was computed with the methods developed by the Canadian Council of Ministers of the Environment and British Columbia.

Background information

Water quality (WQ) management is of paramount importance for environmental sustainability, public health, and ecosystem preservation. Water is a precious natural resource that is necessary for both human survival and the wellbeing of ecosystems. Over the last few decades, there has been an increase in human activity, particularly in industrial areas, which has had an impact on water bodies. These concerns are present worldwide. There have been expansions in population, industry, agricultural practices, and urban areas along the corridor or riverbanks in modern times due to scientific and technological advancements (Kumar et al. 2019). Large amounts of nutrients, heavy metals, and organic pollutants are transported into surface waters by anthropogenic factors including urbanization and agricultural development. This leads to the contamination of water bodies and sediments, biome outbreaks, and organic pollution (Khabouchi et al. 2020).

In the semi-arid region, the water surface serves as a main WQ indicator for sustainable development. Aspects of sustainable development and water resource management dependent on land cover and natural resource availability, such as surface energy balance, biogeochemical cycling, and climatic regulation, are crucial to the growth of agriculture and water resources (Haghiabi et al. 2018).

Climate change and other associated stresses, particularly rainfall and temperature, can make water reserves more vulnerable. Results from Khambhat City in Gujarat showed that increasing changes in land use and land cover were one of the main causes of change in groundwater quality between 2001 and 2011. Sharp reductions in freshwater resources have been noted in terms of both quantity (lower water table) and quality (high nutrient content, saltwater incursion) because of these factors and conditions (Kumar et al. 2019).

At different places, the main drivers of declining WQ are agriculture, industries, human activities, and climate changes. In recent years, several agricultural locations in Baghdad have been transformed into urban districts. There is little doubt that the decrease in agricultural activity has had an adverse effect on the quality of the Tigris River's water, which flows through Baghdad. In addition, human and industrial activities including home sewage and pollution from household and industrial wastes influence the river's WQ. Furthermore, the lack of rain brought on by climate change and neighboring countries' control over rivers that Iraq shares with them have a significant negative influence on the rivers’ salinity levels (Ahmed et al. 2023).

Pollutant transport is indirectly influenced by land use patterns in the surrounding region, but the configurations of water bodies (types, sizes, and location) are the direct elements that determine the effect of lowering water pollution. Season is a more sensitive factor in the impact of landscape on WQ than other criteria. According to Zhao et al. (2020), changing the layouts of water bodies and land use patterns can greatly lower the loads of pollutants.

Variations in surface water bodies are also common throughout the world for several reasons (Woolway et al. 2020). For instance, the extinction of Arctic ponds and distinct variations in the length of water bodies in China and the contiguous United States have been caused by climate change (Smol & Douglas 2007). The effects of groundwater pumping, the building and management of dams, and the enhancement of riverbeds further complicate the hydraulic connectivity between surface water and groundwater (Kandekar et al. 2021). An increasing evaporation/precipitation ratio has resulted in a reduction of the pond surface area in the Italian Alps (Salerno et al. 2014). However, it is possible to support the sustainable development of water resources and yield several benefits for humans by paying attention to the dynamics of surface water bodies and understanding the elements that influence them (Wang et al. 2020).

The preservation of ecosystem services, management of water resources, and promotion of economic growth depend on sustainable development (Moharir & Pande 2020). Yet, comprehensive information about long-term trends in surface water WQ is lacking. Field measurements give detailed data on the rivers being monitored for WQ. The problem with field WQ measurements, though, is that they are operationally demanding. The cost of WQ sampling using these methods may increase due to sensor calibration, cleaning, and technical issues (Wang et al. 2020).

Many authors have demonstrated that the overall quality of inland water can be efficiently assessed and monitored by computing a water quality index (WQI) from various WQ parameters (Asadi et al. 2007). The WQI is a mathematical tool that aggregates various water characteristics into a single value (Yogendra & Puttaiah 2008). Brown et al. (1972) first introduced the concept of the WQI. The WQIs used worldwide include the Weighted Arithmetic Water Quality Index (WA-WQI), British Columbia Water Quality Index (BC-WQI), Canadian Council of Ministers of the Environment Water Quality Index (CCME-WQI), Oregon Water Quality Index (O-WQI), and the US National Sanitation Foundation Water Quality Index (NSF-WQI) (Kumar et al. 2020). Gupta et al. (2017) studied the effects of eight physical, biological, and chemical parameters at six sites to compute the WQI along the Narmada River in Madhya Pradesh, India, using three methods (NSF-WQI, CCME-WQI, and WA-WQI) and found that the WQ is good in the summer and winter seasons but poor in the monsoon season. Roy et al. (2021), using Comprehensive Pollution Index and Eutrophication Index values for 11 parameters at 18 sites, found that the WQ of the Shilabati River, West Bengal, India, is poor and severely polluted. An et al. (2015) presented the successful application of principal component analysis (PCA) as a soft-computing technique for reducing dimensionality when assessing WQ. Ibrahim et al. (2023) investigated which of 22 WQ parameters most significantly affected WQ and suggested a decision-making model that uses PCA to develop the best inputs. Benkov et al. (2023) studied the Struma River catchment, Bulgaria, through trend analysis of WQI values computed with the CCME-WQI method and identified the latent factors controlling WQ using PCA. Ghoderao et al. (2022) computed the groundwater quality index using two methods, applying varimax rotation factor analysis at five sites with 11 WQ parameters, and found that three sites were not suitable for drinking, one site was moderately suitable, and one site was excellent.

This study is structured in three parts to compute the WQI of an inland water body in the Vapi region, Gujarat, India. (1) The raw dataset was pre-processed using the quartile deviation technique. (2) PCA followed by varimax rotation factor analysis was applied after standardizing the dataset. (3) The WQI was computed from the significantly influencing parameters identified, using two methods, i.e., CCME-WQI and BC-WQI.

Contextualizing the research problem

Vapi is a city and municipal corporation on the banks of the Daman Ganga River in Pardi, Valsad district, situated at the southern end of Gujarat. As it is home to numerous chemical industries and the factories of some well-known brands, it is also referred to as the ‘Chemical City of Gujarat’. This study proposes the following:

  • (1) To examine the quality of an inland water body through the analysis of 17 essential physicochemical parameters.

  • (2) To enhance the CCME-WQI and BC-WQI models through the implementation of PCA followed by factor analysis to more accurately reflect the unique WQ conditions of the rivers.

  • (3) To determine the elements regulated by the most critical WQ parameters that affect the overall WQ and apply PCA to allocate suitable weights for a more precise evaluation.

  • (4) To analyze the trends in WQ over time to gain insights into long-term changes and guide future water management strategies.

  • (5) To present practical insights to environmental authorities and policymakers for the betterment of the region's WQ management and monitoring programs.

Study area and data source

As shown in Figure 1, Vapi is a city and municipal corporation on the banks of the Daman Ganga River in Pardi, Valsad district, situated at the southern end of Gujarat. It is known as the ‘Chemical City of Gujarat’ because it is home to several chemical companies as well as the factories of some well-known brands. The basin lies between 19° 51′ and 20° 28′ North latitude and between 72° 50′ and 73° 38′ East longitude. The basin's total drainage area is 2,318 km², and it receives annual rainfall of about 2,200 mm from June to September.

Water sampling

WQ parameter data were provided by the Gujarat State Water Data Centre on request. The samples were collected on various dates from July 2000 to January 2019, generally once every 2 months. The time of day at which the water samples were collected for testing varied from 8:00 AM to 12:30 PM. The data provided included 25 WQ parameters (physical, chemical, and biological); of these, 17 were identified as usable and were considered in the computation.

Determination of properties of water

A dataset of 25 parameters was available, of which 17 physicochemical WQ parameters were considered in the study: pH, electrical conductivity (EC), temperature (Temp), total dissolved solids (TDS), total oxidized nitrogen (NO2 + NO3), total phosphorus (P-Tot), calcium (Ca), magnesium (Mg), sodium (Na), potassium (K), chloride (Cl), sulfate (SO4), carbonate (CO3), bicarbonate (HCO3), total hardness (HAR_Total), sodium adsorption ratio (SAR), and calcium hardness (HAR_Ca). Table 1 presents the basic statistical analysis of the 17 parameters.

Table 1

Statistical analysis of the 17 physicochemical WQ parameters with their details

| Abbreviation | Parameter | Variable ID | Unit | Permissible limits (BIS) | Mean | Variance | Standard deviation |
|---|---|---|---|---|---|---|---|
| pH | pH | VAR01 | pH units | 6.5–8.5 | 6.16 | 1.30 | 1.14 |
| EC | Electrical conductivity | VAR02 | μmho/cm | 1,000 (a) | 2,609.90 | 6,119,447.85 | 2,473.75 |
| Temp | Temperature | VAR03 | °C | 27 | 27.86 | 9.47 | 3.08 |
| TDS | Total dissolved solids | VAR04 | mg/L | 500 | 3,828.71 | 11,638,887.84 | 3,411.58 |
| NO2 + NO3 | Nitrogen, total oxidized | VAR05 | mg N/L | 45.00 | 18.93 | 363.65 | 19.07 |
| P-Tot | Total phosphorus | VAR06 | mg P/L | 1.00 | 0.75 | 0.52 | 0.72 |
| Ca | Calcium | VAR07 | mg/L | 75.00 | 188.04 | 32,795.96 | 181.10 |
| Mg | Magnesium | VAR08 | mg/L | 0.10 | 83.35 | 5,266.24 | 72.57 |
| Na | Sodium | VAR09 | mg/L | 200 (b) | 975.35 | 901,149.09 | 942.29 |
| K | Potassium | VAR10 | mg/L | 12 (b) | 1.07 | 3.73 | 1.93 |
| Cl | Chloride | VAR11 | mg/L | 250.00 | 1,456.69 | 2,023,581.63 | 1,422.53 |
| SO4 | Sulfate | VAR12 | mg/L | 200.00 | 153.34 | 15,770.99 | 125.58 |
| CO3 | Carbonate | VAR13 | mg/L | 30.00 | 5.71 | 58.01 | 7.62 |
| HCO3 | Bicarbonate | VAR14 | mg/L | 25.00 | 158.01 | 12,208.27 | 110.49 |
| HAR_Total | Total hardness | VAR15 | mg CaCO3/L | 600.00 | 817.23 | 368,763.11 | 607.26 |
| SAR | Sodium adsorption ratio | VAR16 | – | 3.00 | 14.12 | 199.91 | 14.14 |
| HAR_Ca | Calcium hardness | VAR17 | mg CaCO3/L | 200.00 | 470.10 | 204,974.76 | 452.74 |

(a) Values as per Environment Protection Agency.

(b) Values as per WHO.

Primary data and data treatment of WQ parameters

IBM® SPSS v.29 (International Business Machines Corporation Statistical Product and Service Solutions, Armonk, NY, USA), Online MATLAB® (Matrix Laboratory, MathWorks, Natick, MA, USA), and Microsoft® Excel (Redmond, WA, USA) for Windows were used for data processing. Simplified digital maps were developed using the ArcMap 10.8.2 GIS software (ESRI®; Environmental Systems Research Institute, Redlands, CA, USA).

Analyzing big datasets requires care during processing and when interpreting the results. Processing large datasets arbitrarily might lead to biased results that deviate from the anticipated trend. In addition, visual presentation of big datasets may add complexity and introduce errors (Sivarajah et al. 2017). This study deals with a data sample of 17 parameters, each with a sample size of 143, i.e., a dataset of 2,431 values. The provided sample of WQ parameters contained outliers, suggesting that the dataset was noisy. Thus, the quartile deviation method was used to de-noise the WQ parameter data. Quartile deviation measures the spread of the middle of the data. Only the values of WQ parameters lying between the lower and upper bounds derived from the quartile deviation were considered for further processing (Kumar et al. 2021). This process led to a final sample of 56 observations for each of the 17 parameters, i.e., a dataset of 952 values, as shown in Figure 2.
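The quartile-based screening described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not give the exact bounds, so the common fences Q1 − 1.5·IQR and Q3 + 1.5·IQR are assumed, and a sample is assumed to be dropped when any of its parameters falls outside its bounds. The function names and the toy data are hypothetical.

```python
import numpy as np

def quartile_bounds(values, k=1.5):
    """Return (lower, upper) bounds from the quartiles of one parameter.

    The multiplier k=1.5 is an assumption; the study does not state it.
    """
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr

def keep_inlier_rows(data, k=1.5):
    """Keep only the samples (rows) whose every parameter lies within its bounds."""
    data = np.asarray(data, dtype=float)
    mask = np.ones(data.shape[0], dtype=bool)
    for j in range(data.shape[1]):
        lower, upper = quartile_bounds(data[:, j], k)
        mask &= (data[:, j] >= lower) & (data[:, j] <= upper)
    return data[mask]

# Hypothetical toy data: 6 samples x 2 parameters with one obvious outlier row.
toy = np.array([[7.1, 250.0], [6.9, 260.0], [7.0, 255.0],
                [7.2, 245.0], [6.8, 9000.0], [7.0, 252.0]])
filtered = keep_inlier_rows(toy)
```

Dropping whole rows keeps the same number of observations for every parameter, which matches the equal sample size per parameter reported after screening.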

During data processing, a wide disparity among the parameters was noted. As the variables in this investigation have different units, the data were standardized so that the parameters could be combined. The dataset of 17 WQ parameters with 56 samples from 2000 to 2019 was standardized using Equation (1), where $x_{ij}$ is the value of parameter $j$ in sample $i$, and $\bar{x}_j$ and $s_j$ are the mean and standard deviation of parameter $j$:
\[
z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j} \tag{1}
\]
The comparison of raw dataset with standardized dataset and centered dataset of five significantly influencing WQ parameters is shown in Figure 2.
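As an illustration of the centered (c) and standardized (z) datasets compared in Figure 2, the two transformations can be sketched as below. The sketch assumes the usual z-score definition with the sample standard deviation (`ddof=1`); the toy values and function names are hypothetical and are not taken from the study data.

```python
import numpy as np

def center(data):
    """Subtract each parameter's mean (centered dataset, 'c')."""
    data = np.asarray(data, dtype=float)
    return data - data.mean(axis=0)

def standardize(data, ddof=1):
    """Subtract the mean and divide by the standard deviation (standardized dataset, 'z').

    ddof=1 (sample standard deviation) is an assumption.
    """
    data = np.asarray(data, dtype=float)
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=ddof)

# Hypothetical toy data: two parameters on very different scales.
toy = np.array([[6.5, 250.0], [7.0, 900.0], [7.5, 4000.0], [6.0, 1500.0]])
c = center(toy)
z = standardize(toy)
```

After standardization every parameter has zero mean and unit standard deviation, so parameters measured in different units become directly comparable.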
Figure 1

Study area map of the Daman Ganga River in Vapi, Pardi, Valsad.

Figure 2

Comparison of raw dataset with standardized (z) dataset and centered (c) dataset of parameters: (a) HAR_Ca, (b) Temp, (c) SAR, (d) P-Tot, and (e) CO3.


Principal component analysis

PCA is a method for finding the linear combinations of variables that account for as much variability as possible. The fundamental idea of PCA is to find the weights that maximize the variance of the combined parameters. Variance is maximized because it can be regarded as information; by combining parameters in this way, as much information as possible is retained in the combined variables. PCA can therefore be used to reduce the number of dimensions or parameters in a dataset before further analysis (An et al. 2015). PCA can be computed in several ways, such as singular value decomposition and eigendecomposition of the covariance matrix (Takane 2003). This article computes the PCA through eigendecomposition of the covariance matrix. In this research, the 17 parameters were transformed into just two components.

PCA first computes the covariance matrix, as shown in Figure 3. The pattern among the significant WQ parameters was studied through the covariance matrix, as shown in Figures 4 and 5.
Figure 3

Correlation among the WQ parameters. Red represents a high correlation whereas blue represents low correlation.

Figure 4

Box plot of parameters for (a)–(d) range vs raw dataset.

Figure 5

Box plot of parameters for (a)–(d) range vs standardized dataset.


Because the parameters differ in their units and in their permissible concentrations in water, their variances differ widely, as shown in Table 1 and Figure 4. The eigenvalues and eigenvectors of the covariance matrix were then computed using MATLAB. The eigenvectors were arranged as the columns of a matrix in descending order of eigenvalue, so that the eigenvector corresponding to the largest eigenvalue forms the first column. The eigenvectors hold the weights associated with the parameters; the values of five eigenvectors are presented in Table 2. The WQ parameters are combined using the eigenvector values as weights to transform the data into principal components.
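The eigendecomposition step can be sketched as follows. The study used MATLAB; this is an equivalent NumPy illustration, not the authors' script. The random data stand in for the standardized dataset, and the function name `pca_eig` is hypothetical.

```python
import numpy as np

def pca_eig(z):
    """Eigendecomposition of the covariance matrix, sorted by descending eigenvalue."""
    cov = np.cov(z, rowvar=False)           # parameters are columns
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # largest eigenvalue first
    return eigvals[order], eigvecs[:, order]

# Hypothetical stand-in data: 56 samples x 4 parameters, two of them correlated.
rng = np.random.default_rng(0)
raw = rng.normal(size=(56, 4))
raw[:, 1] += 2.0 * raw[:, 0]
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

eigvals, weights = pca_eig(z)  # weights[:, 0] holds the weights for PC1
```

Because the data are standardized, the eigenvalues sum to the number of parameters, and each column of `weights` has unit length.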

Table 2

Eigen vector matrix

| Parameter | EV1 | EV2 | EV3 | EV4 | EV5 |
|---|---|---|---|---|---|
| pH | −0.16 | −0.38 | −0.11 | −0.25 | 0.16 |
| EC | 0.20 | −0.01 | 0.27 | 0.48 | −0.21 |
| Temp | 0.04 | 0.17 | −0.32 | 0.42 | 0.44 |
| TDS | 0.29 | 0.01 | −0.04 | 0.09 | 0.22 |
| NO2 + NO3 | 0.13 | −0.36 | −0.13 | 0.14 | −0.05 |
| P-Tot | 0.17 | 0.20 | 0.17 | 0.18 | −0.54 |
| Ca | 0.30 | −0.12 | 0.39 | −0.22 | 0.15 |
| Mg | 0.27 | −0.16 | −0.17 | 0.29 | −0.02 |
| Na | −0.35 | −0.10 | 0.29 | 0.24 | 0.11 |
| K | 0.04 | −0.42 | −0.18 | 0.25 | 0.00 |
| Cl | 0.35 | 0.08 | −0.28 | −0.24 | −0.13 |
| SO4 | 0.24 | 0.10 | −0.01 | 0.07 | 0.46 |
| CO3 | 0.12 | −0.50 | −0.15 | 0.05 | −0.27 |
| HCO3 | −0.20 | −0.31 | −0.02 | −0.17 | 0.02 |
| HAR_Total | 0.36 | −0.17 | 0.21 | −0.02 | 0.11 |
| SAR | 0.26 | 0.17 | −0.41 | −0.27 | −0.14 |
| HAR_Ca | 0.30 | −0.12 | 0.39 | −0.22 | 0.15 |

Mathematically, PCA can be conceptualized as applying a weight to each parameter as a coefficient and then summing the weighted parameters, as shown in Equation (2). In this equation, $PC_i$ represents the principal component score of sample $i$, $w_j$ represents the weight associated with parameter $j$, and $P_{ij}$ represents the value of parameter $j$ in sample $i$:
\[
PC_i = \sum_{j=1}^{17} w_j P_{ij} = w_1 P_{i1} + w_2 P_{i2} + \cdots + w_{17} P_{i,17} \tag{2}
\]
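The weighted sum described above can be illustrated with a short sketch. The values and weights below are hypothetical and use three parameters instead of 17; the point is only that the per-sample weighted sum is the same operation as a matrix-vector product.

```python
import numpy as np

# Hypothetical standardized values: 3 samples (rows) x 3 parameters (columns).
P = np.array([[0.5, -1.0, 0.2],
              [-0.3, 0.4, 1.1],
              [1.2, 0.0, -0.7]])

# Hypothetical weights, one per parameter (in PCA these come from an eigenvector).
w = np.array([0.6, -0.3, 0.74])

def pc_score(sample, weights):
    """Weighted sum of the parameter values of one sample."""
    return sum(wj * pj for wj, pj in zip(weights, sample))

scores_loop = np.array([pc_score(row, w) for row in P])  # one score per sample
scores_matrix = P @ w                                     # same result in one step
```

For the first sample the score is 0.5·0.6 + (−1.0)·(−0.3) + 0.2·0.74.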

One of the aims of this study is to examine which parameters are related to which principal components. Principal components can be understood better by rotating them (Kilmer 2010). Rotation is a procedure in which the factors are rotated to achieve a simple structure, meaning that each factor has a few high loadings and the remaining loadings are zero or close to zero. Rotation methods are either orthogonal or oblique. Orthogonal rotation methods assume that the factors in the analysis are uncorrelated, and varimax is one such method. In this study, varimax rotation factor analysis is applied to the components. Applied after PCA, varimax rotation improves the interpretability of the weights (Acal et al. 2020).
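To make the idea of rotating towards a simple structure concrete, the sketch below implements a commonly used SVD-based iteration for varimax rotation. It is not the SPSS routine used in the study, and it omits Kaiser normalization; the loading matrix is hypothetical (a simple structure deliberately mixed by a 30° rotation), as are the function names.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=500, tol=1e-8):
    """Orthogonal varimax rotation via a standard SVD-based iteration.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        grad = L.T @ (Lr**3 - (gamma / p) * Lr @ np.diag((Lr**2).sum(axis=0)))
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                       # nearest orthogonal matrix to grad
        d_new = s.sum()
        if d_new < d_old * (1.0 + tol):  # stop when no further improvement
            break
        d_old = d_new
    return L @ R, R

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    sq = np.asarray(loadings, dtype=float) ** 2
    return float(sq.var(axis=0).sum())

# Hypothetical loadings: a simple structure mixed by a 30-degree rotation.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.9]])
theta = np.deg2rad(30.0)
mix = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix
rotated, R = varimax(unrotated)
```

The rotation is orthogonal, so each parameter's communality (row sum of squared loadings) is unchanged; only the distribution of the loadings across factors changes.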

WQI assessment methods

The WQI is a comprehensive index for evaluating WQ that can be used to measure the extent of pollution in water (Brown et al. 1972; Khan et al. 2023). Experts have developed diverse techniques to determine the WQI (Debels et al. 2005). It mathematically combines a large volume of WQ data into a single number that summarizes the overall WQ level (Şener et al. 2017). This study aims to compute the WQI for the Vapi district, Gujarat, India, in combination with the factor analysis method. Once the set of significant parameters had been identified, the WQI was computed using the CCME-WQI and BC-WQI methods.

CCME-WQI provides a quantitative framework for evaluating WQ conditions in relation to WQ objectives. It can be adapted to the type and number of WQ parameters examined, the time frame, and the type of water body (stream, river reach, lake, etc.). The CCME-WQI is built on three factors: scope (F1), frequency (F2), and amplitude (F3). From these factors, an index between 0 and 100 is computed to assess the WQ. Scope (F1) represents the extent of non-compliance with the WQ guidelines over the period of interest. Frequency (F2) represents the proportion of individual tests that fail to meet the permissible limits of the water parameters, i.e., failed tests. Amplitude (F3) represents the degree to which the failed tests fall short of their objectives. The equation to evaluate the WQI according to the CCME is:
\[
\text{CCME-WQI} = 100 - \frac{\sqrt{F_1^2 + F_2^2 + F_3^2}}{1.732} \tag{3}
\]
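A minimal sketch of the CCME-WQI calculation is given below. It follows the standard CCME definitions (F1 as the percentage of failed parameters, F2 as the percentage of failed tests, F3 from the normalized sum of excursions, and the index as 100 − √(F1² + F2² + F3²)/1.732), which are not spelled out in this text. It assumes that every objective is an upper limit; range-type objectives such as pH would need extra handling. The parameter names, values, and limits are hypothetical.

```python
import math

def ccme_wqi(samples, limits):
    """Compute (F1, F2, F3, WQI) for upper-limit objectives only (an assumption).

    samples: {parameter: [measured values]}, limits: {parameter: upper limit}.
    """
    total_vars = len(samples)
    total_tests = sum(len(values) for values in samples.values())
    failed_vars = 0
    failed_tests = 0
    excursion_sum = 0.0
    for name, values in samples.items():
        limit = limits[name]
        failures = [v for v in values if v > limit]
        if failures:
            failed_vars += 1
        failed_tests += len(failures)
        # Excursion: how far each failed test exceeds its objective.
        excursion_sum += sum(v / limit - 1.0 for v in failures)
    f1 = 100.0 * failed_vars / total_vars      # scope
    f2 = 100.0 * failed_tests / total_tests    # frequency
    nse = excursion_sum / total_tests          # normalized sum of excursions
    f3 = nse / (0.01 * nse + 0.01)             # amplitude
    wqi = 100.0 - math.sqrt(f1**2 + f2**2 + f3**2) / 1.732
    return f1, f2, f3, wqi

# Hypothetical example: two parameters, four tests each.
samples = {"A": [8.0, 12.0, 9.0, 15.0], "B": [1.0, 2.0, 3.0, 4.0]}
limits = {"A": 10.0, "B": 5.0}
f1, f2, f3, wqi = ccme_wqi(samples, limits)
```

In this toy case one of the two parameters fails (F1 = 50) and two of the eight tests fail (F2 = 25); the divisor 1.732 scales the result so that the index stays between 0 and 100.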
The BC-WQI compares the measured WQ parameters with predefined limits to determine violations. Its factors are calculated in the same way as for the CCME-WQI. The equation to evaluate the WQI according to the BC is:
\[
\text{BC-WQI} = \frac{\sqrt{F_1^2 + F_2^2 + \left(F_3/3\right)^2}}{1.453} \tag{4}
\]

WQ parameter

According to the report of the Central Pollution Control Board (CPCB), the river Daman Ganga falls under the Priority 1 category, which has a biochemical oxygen demand (BOD) value of more than or equal to 30 mg/L.

As seen from Table 1, the content of 11 of the 17 parameters, namely EC, Temp, TDS, Ca, Mg, Cl, HCO3, Na, HAR_Ca, SAR, and HAR_Total, in the river water samples collected from the monitoring stations during the 2000–2019 sampling period exceeds the permissible values given by the Bureau of Indian Standards (BIS) and the World Health Organization (WHO). The temporal distribution of the parameter contents in the river samples is depicted in Figure 4. The elevated contents of major cations (Ca, Mg, and Na), major anions (CO3, HCO3, and Cl), and dissolved solids in the river water samples are the major impurities in the water body and have a noxious impact on the aquatic habitat (Kumar et al. 2022).
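The exceedance count can be cross-checked directly against Table 1. The sketch below transcribes the mean values and permissible limits from Table 1 and lists the parameters whose mean exceeds the limit. Treating the pH objective by its upper end (8.5) only is an assumption made for this check; the variable names are hypothetical.

```python
# (mean, upper permissible limit) transcribed from Table 1.
# For pH only the upper end of the 6.5-8.5 range is used (an assumption).
table1 = {
    "pH": (6.16, 8.5),
    "EC": (2609.90, 1000.0),
    "Temp": (27.86, 27.0),
    "TDS": (3828.71, 500.0),
    "NO2 + NO3": (18.93, 45.0),
    "P-Tot": (0.75, 1.0),
    "Ca": (188.04, 75.0),
    "Mg": (83.35, 0.10),
    "Na": (975.35, 200.0),
    "K": (1.07, 12.0),
    "Cl": (1456.69, 250.0),
    "SO4": (153.34, 200.0),
    "CO3": (5.71, 30.0),
    "HCO3": (158.01, 25.0),
    "HAR_Total": (817.23, 600.0),
    "SAR": (14.12, 3.0),
    "HAR_Ca": (470.10, 200.0),
}

# Parameters whose mean value is above the permissible limit.
exceeding = [name for name, (mean, limit) in table1.items() if mean > limit]
```

Comparing means is only a coarse check; individual samples of the remaining parameters may still exceed their limits on particular dates.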

Furthermore, the significance of the WQ parameters can be understood with the help of a box plot and a loading plot developed through PCA. Figure 4(a)–(d) shows box plots of the raw WQ parameter values, whereas Figure 5(a)–(d) shows box plots of the standardized values. It was difficult to extract information from the plots of the raw dataset, as each parameter has a different permissible limit and unit of measurement. With the standardized dataset, the consistency of the data was easier to assess, and it revealed that parameters such as EC, TDS, Na, Cl, HCO3, HAR_Total, and HAR_Ca were more widely spread.

Based on the loading plot shown in Figure 6, three observations were made: (1) the parameters total phosphorus (P-Tot), SAR, SO4, Na, Cl, TDS, EC, HAR_Ca, Ca, HAR_Total, and Mg have more effect on PC1; (2) the parameters K and Temp have more effect on PC2; (3) the parameters pH, Temp, HCO3, CO3, and NO2 + NO3 affect both PC1 and PC2. However, the absolute values of the weights associated with PC1 were all found to be around 2, and those associated with PC2 were all around 1.20. From the absolute values alone, it was therefore not possible to tell which parameters are associated with PC1 or PC2. Hence, the relation between parameters and components was examined further using varimax rotation factor analysis.
Figure 6

Loading plot of PCs.


Principal component analysis

Computing PCA on the raw dataset produced biased weights, placing little weight on a few parameters and more weight on the others. Those parameters contributed very little information to the linear combination because the variance differed widely between parameters. Standardizing the WQ parameters before computing the principal components overcame this problem (Pessanha Santos 2023). The PCA score plot based on the standardized dataset is shown in Figure 7. The number of principal components was chosen from the scree plot by setting a cut-off of 75% of the variance, as seen in Figure 8.
Figure 7

PCA score plot.

Figure 8

Scree plot for PCA.


Jolliffe & Cadima (2016) noted that the variances of the principal components correspond to the eigenvalues. Thus, using the eigenvectors corresponding to the largest eigenvalues as weights arranges the dataset into new components with maximal variance. The weights can be interpreted as how much each WQ parameter contributes to a principal component. The weights for PC1 come from the first eigenvector, which has the highest eigenvalue, whereas the weights for PC2 come from the second eigenvector, which has the second highest eigenvalue. HAR_Total has the highest absolute weight (0.36) in the first eigenvector and was therefore given the most weight when calculating PC1. In contrast, CO3 has the largest absolute weight (0.50) in the second eigenvector and was given the most weight when calculating PC2. The weights assigned to SAR, SO4, Na, Cl, TDS, EC, HAR_Ca, Ca, HAR_Total, and Mg were larger for PC1, whereas the weights assigned to HCO3, P-Tot, K, pH, Temp, CO3, and NO2 + NO3 were larger for PC2.

The plots of PC1 against the standardized WQ parameters are shown in Figures 9(a) and 9(b), and those of PC2 in Figures 9(c) and 9(d). The parameters zHCO3 and zpH have negative weights for their respective components, which implies that they are negatively correlated with those components. The plot of PC1 against zpH in Figure 9(a) clearly shows this negative correlation. The weights assigned to the standardized WQ parameters are 0.2 for zEC on PC1, −0.31 for zHCO3 on PC2, and −0.17 for zTemp on PC2.
Figure 9

Plot of PC1 vs WQ parameters: (a) PC1 vs zpH, (b) PC1 vs zEC, (c) PC2 vs zHCO3, and (d) PC2 vs zTemp.


PC1–PC5 together capture about 75% of the variance, as reflected in the scree plot. This implies that PC1–PC5 retain most of the information about the WQ parameters. In addition, the covariances between the PCA scores were equal to zero, which implies that the principal components are uncorrelated with each other. On this basis, an orthogonal approach was chosen for the rotation factor analysis. However, PCA combines the variables in a way that maximizes the variance of the PCs, which is advantageous for reducing the number of dimensions but disadvantageous for interpreting the components.
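Both points in this paragraph, choosing the number of components from a cumulative-variance cut-off and the zero covariance between PCA scores, can be illustrated with a short sketch. The eigenvalues and the random data are hypothetical stand-ins, and `n_components_for` is a hypothetical helper name.

```python
import numpy as np

def n_components_for(eigvals, threshold=0.75):
    """Smallest number of leading components whose cumulative variance share reaches threshold."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return k, cumulative

# Hypothetical eigenvalues (not the values of this study).
k, cumulative = n_components_for([4.0, 2.0, 1.0, 0.5, 0.5])

# PCA scores computed from eigenvectors of the covariance matrix are uncorrelated.
rng = np.random.default_rng(1)
raw = rng.normal(size=(56, 5))
raw[:, 1] += raw[:, 0]                      # introduce correlation between parameters
z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
vals, vecs = np.linalg.eigh(np.cov(z, rowvar=False))
scores = z @ vecs                           # project the data onto the eigenvectors
score_cov = np.cov(scores, rowvar=False)
off_diagonal = score_cov - np.diag(np.diag(score_cov))
```

The diagonal of `score_cov` reproduces the eigenvalues and the off-diagonal entries vanish up to rounding, which is the property that motivates an orthogonal rotation.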

Furthermore, the analysis of the parameters using varimax rotation revealed the following findings as shown in Table 3: (a) Factor 1 (VF1), explaining 35.33% variance of the total variability, is a dipolar factor with high positive loadings (greater than +0.6) for HAR_Ca, Ca, and HAR_Total and moderate negative loading (−0.211) for Temp; (b) Factor 2 (VF2) accounts for 13.04% of the total variance and is a factor with high positive loadings for SAR, Na, and Cl (greater than +0.6) and moderate negative loading (−0.193) for pH; (c) Factor 3 (VF3), explaining 11.41% variance of the total variability, is a factor with high positive loadings (greater than +0.6) for CO3, K, and NO2 + NO3 and moderate negative loading (−0.194) for HCO3; (d) Factor 4 (VF4), explaining 7.889% variance of the total variability, is a factor with high positive loadings (greater than +0.6) for P-Tot and EC and high negative loading (−0.677) for pH; and (e) Factor 5 (VF5), explaining 6.51% variance of the total variability, is a factor with high positive loadings (greater than +0.6) for Temp only and moderate negative loading (−0.215) for pH.

Table 3

Varimax rotation factor analysis matrix

| S. No. | Variable ID | Parameter | VF1 | VF2 | VF3 | VF4 | VF5 |
|---|---|---|---|---|---|---|---|
| 1 | VAR17 | HAR_Ca | 0.955 | 0.143 | 0.048 | 0.141 | −0.093 |
| 2 | VAR07 | Ca | 0.955 | 0.143 | 0.048 | 0.141 | −0.093 |
| 3 | VAR15 | HAR_Total | 0.833 | 0.273 | 0.317 | 0.248 | 0.09 |
| 4 | VAR04 | TDS | 0.467 | 0.366 | 0.174 | 0.201 | 0.386 |
| 5 | VAR16 | SAR | 0.037 | 0.948 | 0.037 | 0.071 | 0.103 |
| 6 | VAR09 | Na | 0.294 | 0.918 | 0.125 | 0.165 | 0.116 |
| 7 | VAR11 | Cl | 0.295 | 0.913 | 0.153 | 0.168 | 0.092 |
| 8 | VAR13 | CO3 | 0.099 | 0.139 | 0.813 | −0.061 | −0.229 |
| 9 | VAR10 | K | −0.033 | −0.078 | 0.717 | −0.115 | 0.096 |
| 10 | VAR05 | NO2 + NO3 | 0.139 | 0.116 | 0.651 | −0.012 | 0.045 |
| 11 | VAR08 | Mg | 0.242 | 0.335 | 0.565 | 0.287 | 0.32 |
| 12 | VAR06 | P-Tot | 0.082 | 0.201 | −0.027 | 0.769 | −0.214 |
| 13 | VAR02 | EC | 0.324 | −0.146 | 0.271 | 0.727 | 0.114 |
| 14 | VAR01 | pH | −0.088 | −0.193 | 0.278 | −0.677 | −0.215 |
| 15 | VAR14 | HCO3 | 0.173 | 0.295 | −0.194 | 0.484 | 0.304 |
| 16 | VAR03 | Temp | −0.211 | 0.076 | 0.04 | 0.029 | 0.817 |
| 17 | VAR12 | SO4 | 0.487 | 0.252 | −0.027 | 0.064 | 0.553 |
| | | Eigenvalue | 6.01 | 2.22 | 1.94 | 1.341 | 1.11 |
| | | Variance % by component | 35.33 | 13.04 | 11.41 | 7.889 | 6.51 |
| | | Cumulative variance % by component | 35.33 | 48.37 | 59.79 | 67.67 | 74.18 |

VF1 is the most significant factor because it accounts for the highest proportion of the total variance among the parameters, as shown in Table 3. The variability of HAR_Ca, Ca, and HAR_Total can be attributed to the mixing of synthesized water with river water (Mahamat Nour et al. 2020), and the close relationship among these three parameters can be attributed to the presence of hardness in the water. Thus, VF1 expresses the hardness content of the river water samples and can be referred to as the ‘hardness factor’ for the examined datasets. VF1 also identified an inverse relation between Temp and HAR_Ca, Ca, and HAR_Total in the sampling campaigns, indicating that Temp influences these WQ parameters. VF2 accounts for the second highest proportion of the total variance, as shown in Table 3. The large fluctuation of SAR, Na, and Cl in the river samples might indicate the presence of ions and salts due to chemicals (Zaman et al. 2018). Thus, VF2 expresses the salinity content of the river water samples and can be referred to as the ‘salinity factor’ for the examined datasets. VF2 also identified an inverse relation between pH and SAR, Na, and Cl in the sampling campaigns, indicating that pH influences these WQ parameters. HAR_Ca, SAR, CO3, P-Tot, and Temp were selected as the representative parameters of VF1 through VF5, respectively.
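The selection of one representative parameter per factor can be illustrated with a minimal sketch that reads the rotated loadings of Table 3 and picks, for each factor, the parameter with the largest absolute loading. The function names are hypothetical, the label K for VAR10 is inferred from the text, and the largest-loading rule is an assumption about how the representatives were chosen; HAR_Ca and Ca tie on VF1, and the sketch simply keeps the first of the two.

```python
import numpy as np

# Rotated loadings copied from Table 3 (rows: parameters, columns: VF1..VF5).
# The label "K" for VAR10 is inferred from the text; the table row lacks a name.
parameters = ["HAR_Ca", "Ca", "HAR_Total", "TDS", "SAR", "Na", "Cl", "CO3", "K",
              "NO2+NO3", "Mg", "P-Tot", "EC", "pH", "HCO3", "Temp", "SO4"]
loadings = np.array([
    [0.955, 0.143, 0.048, 0.141, -0.093],
    [0.955, 0.143, 0.048, 0.141, -0.093],
    [0.833, 0.273, 0.317, 0.248, 0.090],
    [0.467, 0.366, 0.174, 0.201, 0.386],
    [0.037, 0.948, 0.037, 0.071, 0.103],
    [0.294, 0.918, 0.125, 0.165, 0.116],
    [0.295, 0.913, 0.153, 0.168, 0.092],
    [0.099, 0.139, 0.813, -0.061, -0.229],
    [-0.033, -0.078, 0.717, -0.115, 0.096],
    [0.139, 0.116, 0.651, -0.012, 0.045],
    [0.242, 0.335, 0.565, 0.287, 0.320],
    [0.082, 0.201, -0.027, 0.769, -0.214],
    [0.324, -0.146, 0.271, 0.727, 0.114],
    [-0.088, -0.193, 0.278, -0.677, -0.215],
    [0.173, 0.295, -0.194, 0.484, 0.304],
    [-0.211, 0.076, 0.040, 0.029, 0.817],
    [0.487, 0.252, -0.027, 0.064, 0.553],
])


def representative_parameters(names, matrix):
    """Return, per factor, the parameter with the largest absolute loading.

    Ties are resolved in favour of the first row (an arbitrary choice).
    """
    top_rows = np.abs(matrix).argmax(axis=0)
    return [names[i] for i in top_rows]


def strong_loadings(names, matrix, threshold=0.6):
    """Return, per factor, all parameters whose |loading| exceeds the threshold."""
    strong = np.abs(matrix) > threshold
    return [[names[i] for i in np.flatnonzero(strong[:, j])]
            for j in range(matrix.shape[1])]


print(representative_parameters(parameters, loadings))
print(strong_loadings(parameters, loadings))
```

Under these assumptions, the 0.6 threshold groups the same parameters that the text assigns to each factor, including the negative loading of pH on VF4.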

VF1 explained 35.33% of the total variance and was best represented by the hardness caused by calcium (HAR_Ca). VF2 accounted for 13.04% of the total variance and loaded most strongly on SAR. VF3 explained 11.41% of the variance, with its greatest positive loading on CO3. VF4 accounted for 7.889% of the total variance and had a substantial loading on P-Tot. VF5 accounted for 6.51% of the total variance and had a significant loading on Temp.
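The variance percentages above can be related to the eigenvalues in Table 3 with a short sketch. It assumes that the PCA was run on standardized data (a correlation matrix), so that each of the 17 parameters contributes one unit of variance; this assumption is not stated in the text. Because the eigenvalues are reported to two or three decimals, the recomputed percentages agree with Table 3 only approximately.

```python
# Eigenvalues of the five retained factors, copied from Table 3.
eigenvalues = [6.01, 2.22, 1.94, 1.341, 1.11]
# Number of WQ parameters; equals the total variance under the
# (assumed) correlation-matrix PCA.
n_parameters = 17


def explained_variance_percent(eigs, n_vars):
    """Share of total variance carried by each factor, in percent."""
    return [100.0 * e / n_vars for e in eigs]


def cumulative(values):
    """Running total of a list of numbers."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


percent = explained_variance_percent(eigenvalues, n_parameters)
cum = cumulative(percent)

for i, (p, c) in enumerate(zip(percent, cum), start=1):
    print(f"VF{i}: {p:.2f}% of variance, {c:.2f}% cumulative")
```

All five eigenvalues exceed 1, which is consistent with the commonly used eigenvalue-greater-than-one retention rule, although the text does not state which rule was applied.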

Water quality index

Based on various key WQ parameters, the WQI delivers a single figure that represents the total WQ at a certain place and time.

The WQI computed with the CCME method was 42.35, as shown in Table 4. As per Table 5, this value falls in the poor rank, which means that conditions often depart from natural or desirable levels. Because the computed value lies close to the marginal rank, it can be interpreted that the WQ has not yet deteriorated to a critical level. The WQI computed with the BC method was 63.29, as shown in Table 4; as per Table 6, it falls in the poor class but lies close to the borderline class.
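The ranking step can be sketched as a simple lookup against the class limits of Tables 5 and 6. The function names are hypothetical, and the handling of fractional values that fall between two tabulated integer limits (for example, 44.5 on the CCME scale) is an assumption, since the tables list integer ranges only.

```python
def ccme_rank(wqi):
    """Rank a CCME-WQI value using the class limits of Table 5.

    Higher values mean better water quality. Values between two tabulated
    integer limits are assigned to the lower class (an assumption).
    """
    if not 0 <= wqi <= 100:
        raise ValueError("CCME-WQI must lie between 0 and 100")
    if wqi >= 95:
        return "Excellent"
    if wqi >= 80:
        return "Good"
    if wqi >= 65:
        return "Fair"
    if wqi >= 45:
        return "Marginal"
    return "Poor"


def bc_rank(index):
    """Rank a BC-WQI value using the class limits of Table 6.

    Lower values mean better water quality. Values between two tabulated
    integer limits are assigned to the worse class (an assumption).
    """
    if not 0 <= index <= 100:
        raise ValueError("BC-WQI must lie between 0 and 100")
    if index <= 3:
        return "Excellent"
    if index <= 17:
        return "Good"
    if index <= 43:
        return "Fair"
    if index <= 59:
        return "Borderline"
    return "Poor"


print(ccme_rank(42.35), bc_rank(63.29))
```

The two scales run in opposite directions, which is why a low CCME value and a high BC value lead to the same rank here.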

Table 4

Computed WQI values for the river water sample

| S. No. | Computed WQI | Computation method | Remarks |
|---|---|---|---|
| 1 | 42.35 | CCME-WQI | Poor |
| 2 | 63.29 | BC-WQI | Poor |

Table 5

WQI classification as per CCME

| Rank | WQI value | Description |
|---|---|---|
| Excellent | 95–100 | Water quality is protected with a virtual absence of threat or impairment |
| Good | 80–94 | Water quality is protected with only a minor degree of threat or impairment |
| Fair | 65–79 | Water quality is usually protected but occasionally threatened or impaired |
| Marginal | 45–64 | Water quality is frequently threatened or impaired |
| Poor | 0–44 | Water quality is almost always threatened or impaired |

Table 6

WQI classification as per BC-WQI

| Rank (description of WQI) | Index value |
|---|---|
| Excellent | 0–3 |
| Good | 4–17 |
| Fair | 18–43 |
| Borderline | 44–59 |
| Poor | 60–100 |

Table 7

Comparison of various WQI methods

| WQI method | Advantages | Disadvantages |
|---|---|---|
| NSF-WQI | The aggregation method is straightforward and easy to use. It uses a reduced number of WQ parameters. | Assesses WQ by using individual parameter weights assigned by experts, which can be subjective and prone to an ‘eclipsing’ effect – where a single parameter disproportionately influences the overall score – creating sensitivity issues and limiting the method's ability to reflect the effects of individual WQ parameters. |
| CCME-WQI | The method is more objective based and allows flexibility regarding the type and amount of WQ parameters used; chosen based on the water's utilization purpose and data availability. | The method is more complex because it has to calculate the objectives F1, F2, and F3 values. The approach requires more sampling and testing of data. |
| Oregon Water Quality Index | Equal weighting is more suitable to determine the surface water's quality for general use. | Sub-indices equation is too ideal for the river in the method and prone to an ‘ambiguity’ effect. |

A closer examination of the factors behind these index values indicates that objective exceedances occurred for four of the five significant parameters, namely, Temp, P-Tot, HAR_Ca, and SAR. The objectives for CO3 were never exceeded. In addition, the exceedances observed for P-Tot were fairly minor, whereas the exceedances for SAR, Temp, and HAR_Ca were high. These exceedances resulted in the poor ratings computed by both CCME-WQI and BC-WQI.
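How objective exceedances translate into an index value can be sketched with the standard CCME-WQI formulation, in which F1 (scope) is the percentage of parameters that fail at least once, F2 (frequency) is the percentage of individual tests that fail, and F3 (amplitude) is derived from the normalized sum of excursions. The sketch below is illustrative only: the function name and the sample values are hypothetical and are not data from this study, and it assumes that every objective is an upper limit, which would not hold for a range-type objective such as pH.

```python
import math


def ccme_wqi(samples, objectives):
    """Compute the CCME-WQI and its three factors.

    samples:    dict mapping parameter name -> list of measured values
    objectives: dict mapping parameter name -> upper limit (must not be exceeded)
    Assumes at least one measurement overall and upper-limit objectives only.
    """
    total_vars = len(objectives)
    failed_vars = 0
    total_tests = 0
    failed_tests = 0
    excursion_sum = 0.0

    for name, limit in objectives.items():
        values = samples[name]
        total_tests += len(values)
        exceed = [v for v in values if v > limit]
        if exceed:
            failed_vars += 1
        failed_tests += len(exceed)
        # Excursion: how far a failed test lies above its objective.
        excursion_sum += sum(v / limit - 1.0 for v in exceed)

    f1 = 100.0 * failed_vars / total_vars        # scope
    f2 = 100.0 * failed_tests / total_tests      # frequency
    nse = excursion_sum / total_tests            # normalized sum of excursions
    f3 = nse / (0.01 * nse + 0.01)               # amplitude
    wqi = 100.0 - math.sqrt(f1**2 + f2**2 + f3**2) / 1.732
    return wqi, (f1, f2, f3)


# Hypothetical values for two placeholder parameters (not study data).
demo_objectives = {"param_a": 10.0, "param_b": 5.0}
demo_samples = {"param_a": [5.0, 20.0], "param_b": [1.0, 2.0]}
demo_wqi, demo_factors = ccme_wqi(demo_samples, demo_objectives)
print(round(demo_wqi, 2), demo_factors)
```

In this formulation, a parameter whose objective is never exceeded contributes nothing to F1, F2, or F3, whereas a few large excursions raise F3 sharply and lower the index.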

Benkov et al. (2023) described the estimation of WQI as one of the significant tasks for environmental agencies; their approach, which integrated the CCME-WQI method with PCA, revealed latent factors impacting WQ. Nath Roy et al. (2024) studied four Dhaka-based rivers and showed that the PCA approach reduces the subjectivity of WQI models. They also attributed the higher values in the WQI trend during the wet season to dilution caused by local rainfall. Guenouche et al. (2024) assessed the WQI from 16 physicochemical parameters and used PCA to establish inter-relationships between the parameters, which identified distinct characteristics of the various study sites. Dutta et al. (2018) demonstrated the application of PCA and cluster analysis with WQI to categorize the analyzed trend into four major polluting factors: (1) mineral and nutrient pollution, (2) heavy metal pollution, (3) organic pollution, and (4) fecal contamination.

Generalizability of findings and external validity

According to several reports released by different media outlets, the industries established near the Daman Ganga River, Vapi, include those that employ fly ash and silos and those that manufacture industrial chemicals, dyes, pulp, paper, board, fluoride-based products, and metals. Effluents from such industries may affect river health. In this study, the categorized factors are VF1, indicating the influence of the hardness factor, and VF2, expressing the dominance of salinity.

Wang et al. (2022) identified four processes that generate the majority of wastewater: pretreatment, dyeing, printing, and functional finishing. Chlorine, hardness, and pH are among the parameters used to characterize effluent from textile, pulp, paper, and board processes. Chockalingam et al. (2019) also observed high levels of hardness in the effluents collected from the textile industries of Tiruppur, which resulted in increased alkalinity and pH in the nearby environments. These observations are consistent with the close association among HAR_Ca, Ca, and HAR_Total identified by VF1 in this study.

In addition, numerous industrial operations, such as those in the chemical industry, pharmaceutical processing, and papermaking, produce significant volumes of high-salinity wastewater containing complex constituents and contaminants that are difficult to decompose. Direct discharge introduces waste and a significant amount of potential salt resources into the river water body (Guo et al. 2023). Zhang et al. (2024) gave an overview of the WQ characteristics of saline wastewater discharged from various industrial sectors and examined the consequences of increasing salinity for various treatment methods. Large-scale discharge of turbid effluent and a high pH are typical characteristics of wastewater from textile printing and dyeing processes (Xu et al. 2018). These observations are consistent with the inverse relation between pH and SAR, Na, and Cl identified by VF2 in the sampling campaigns, which indicates that pH influences these WQ parameters.

Modifying the CCME-WQI with PCA and varimax rotation factor analysis identified HAR_Ca, SAR, CO3, Temp, and P-Tot as the important components for determining the WQI under the methodology used in this study.

Although the present method works well for reducing the number of parameters, it has several drawbacks. The use of PCA can involve subjectivity, especially when deciding how many principal components to retain. Simpler methods such as NSF-WQI and WA-WQI offer more accessible and interpretable results, albeit with less precision, which may make them more practical than the PCA with factor analysis approach for policy and decision-making processes.

Any WQI model development nevertheless has several drawbacks. The applicability, precision, and reliability of WQI models are constrained, and machine learning techniques may be used to estimate and forecast these. Uncertainty in the model may result from the number of samples and from ecological variables that fluctuate over a wide range within a short period. Furthermore, WQI models are in some instances constrained in time and space.

Study significance: exploring aspects and relationships

Traditional WQI techniques (compared in Table 7), including the NSF-WQI and the WA-WQI, frequently depend on arbitrary weighting or presume that all parameters are equally important (Marselina et al. 2022). This study challenges these presumptions by showing that PCA can determine each parameter's significance objectively through statistical analysis. With this method, WQI calculations become more objective and regionally specific, offering more specialized insights for efficient water management. Moreover, the CCME-WQI method employed in this study introduces a dynamic approach by incorporating scope, frequency, and amplitude, which allows a more nuanced understanding of fluctuations in WQ data relative to the established guidelines. This contrasts with techniques such as NSF-WQI and WA-WQI, which rely on predefined weights and might not adequately account for the temporal fluctuations present in WQ.

Because of point source and non-point source pollution loading, each location has distinct geographical features that affect the river body. Each river also has distinct geomorphological features that influence the ecosystem of the surrounding hydrosphere. This study therefore offers several novel aspects that enhance knowledge and methods for evaluating WQ, especially in areas where industrial activity is present. One significant advancement is the use of varimax rotation in conjunction with PCA, which makes it possible to identify and appropriately weigh important WQ factors. Particularly in complex environments like the Vapi region, where data were available for 17 WQ parameters, this data-driven approach improves the accuracy of WQI estimates. The incorporation of two different WQI approaches, CCME-WQI and BC-WQI, within the same research framework is another addition. This dual technique improves reliability and depth by enabling a thorough comparison of WQ outcomes. In addition, the study addresses a topic that is often overlooked in research, with its focus on the highly industrialized Vapi region and its analysis of nearly two decades of data (2000–2019). This extensive record provides novel insights into the long-term impacts of industrial pollution. The evaluation procedure is also simplified by using PCA to select the five most significant WQ parameters: HAR_Ca, SAR, CO3, Temp, and P-Tot. Preprocessing the data with quartile deviation substantially enhances the study by removing noise and outliers and thereby providing a more reliable dataset. Finally, by providing insights into the effects of industrial pollution and directing focused decisions for sustainable water resource management, the study advances environmental and public health policies.
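The quartile-based preprocessing can be illustrated with a minimal sketch. The exact screening rule used in the study is not given in this section, so the sketch swaps in the widely used fence of 1.5 times the interquartile range as an assumption; the function names and the sample series are hypothetical.

```python
import numpy as np


def quartile_deviation(values):
    """Quartile deviation (semi-interquartile range): (Q3 - Q1) / 2."""
    q1, q3 = np.percentile(values, [25, 75])
    return (q3 - q1) / 2.0


def screen_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    The multiplier k = 1.5 is a common convention, assumed here;
    it is not taken from the study.
    """
    arr = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(arr, [25, 75])
    iqr = q3 - q1
    keep = (arr >= q1 - k * iqr) & (arr <= q3 + k * iqr)
    return arr[keep]


# Hypothetical series with one obvious outlier (not study data).
series = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(quartile_deviation(series))
print(screen_outliers(series))
```

Screening each parameter separately in this way keeps the sample sizes of different parameters independent, which would need to be reconciled before a multivariate step such as PCA.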

Summary of key findings

Because the selected WQ parameters are measured in different units, the choice of units affects how the variables are handled in the PCA. PCA extraction followed by varimax rotation classified the patterns in the data without requiring a prior theory. The most influential parameters, which account for the greatest proportion of the total variance in the WQ datasets, are HAR_Ca, SAR, CO3, Temp, and P-Tot. High SAR values affect the salinity of the water, which in turn makes direct use of the water for irrigation more difficult. Ding et al. (2023) also demonstrated through experiments how SAR radically changes the composition of bacterial communities in soil when such water is applied for irrigation. Although both the HAR_Total and CO3 categories show high values, the hardness factor has the greater impact on the river water. Tiwari & Bajpai (2012) suggested that constant use of hard or very hard water can lead to several health issues, so such water should be consumed only after due treatment. The heavy use of chemicals in the Vapi district of Gujarat suggests that the discharge of treated water from industries or nearby water treatment plants into the river may increase these factors. Combined human activities and natural factors can result in the accumulation of P-Tot in inland water. The varimax rotation factor analysis showed that the river water is influenced mainly by two categorized factors, the ‘hardness factor’ and the ‘salinity factor’. The statistical analysis also showed that the hardness factor is most affected by changes in temperature, whereas the salinity factor is most affected by changes in pH. Moreover, when it comes to contamination of the aquatic environment, it can be challenging to determine the relative importance of each factor intuitively (Zhang et al. 2023).
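The effect of measurement units on PCA can be illustrated with a minimal sketch in which each parameter is standardized to zero mean and unit variance before the components are extracted. This is one common way of handling mixed units and is assumed here for illustration; the function names and the random test data are hypothetical.

```python
import numpy as np


def standardize(X):
    """Scale each column to zero mean and unit (sample) variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def pca_loadings(X, n_components):
    """PCA of standardized data via eigen-decomposition.

    Returns all eigenvalues (descending) and the loadings of the first
    n_components, i.e. eigenvectors scaled by the square root of their
    eigenvalues.
    """
    Z = standardize(X)
    # Covariance of standardized data equals the correlation matrix of X.
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    return eigvals, loadings


# Hypothetical data: the same measurements expressed in different units.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X_rescaled = X * np.array([1.0, 1000.0, 0.01, 5.0])

eig_a, load_a = pca_loadings(X, 2)
eig_b, load_b = pca_loadings(X_rescaled, 2)
print(np.allclose(eig_a, eig_b))
```

After standardization the eigenvalues sum to the number of parameters and no longer depend on the units, which is the property that makes variance percentages comparable across parameters.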

CCME-WQI classifies the inland river water in the poor category, which implies that the WQ is almost always threatened or impaired, and BC-WQI leads to the same rank. However, exact information about the water parameters is needed to find the source of pollution. A limitation of computing a WQI for any region is that, although it can support the management of water bodies, it should not be used in place of a thorough review based on environmental modeling. It can, however, offer a comprehensive summary of environmental performance. The comparison of index values from different methods for the river water samples in this study therefore suggests that the competent authorities of the region should carry out regular audits of water withdrawal and water quality. WQ monitoring and management should also be prioritized to safeguard water resources from contamination and to provide technologies that make water suitable for residential and drinking uses.

Key takeaway points from our recent study on the WQ of the Daman Ganga River in Vapi, Gujarat, are as follows:

  • (1) The varimax rotation enhances the study by reducing ambiguity and making the loadings more distinct. This helped to identify clearly the variables that contribute significantly to each factor.

  • (2) Our analysis identified that the most significant factors affecting WQ are related to hardness (HAR_Ca, Ca, HAR_Total) and salinity (SAR, Na, Cl), with pH also playing a crucial role.

  • (3) These findings underscore the importance of monitoring these specific parameters to effectively manage and improve the WQ.
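How varimax rotation makes loadings more distinct can be sketched with a commonly used iterative, SVD-based implementation. This is a minimal illustration without Kaiser normalization, not the software routine used in the study; the function name and the example loading matrix are hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a loading matrix toward the varimax criterion.

    Returns the rotated loadings and the rotation matrix. Raw varimax
    (no Kaiser row normalization) is used for simplicity.
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient-like matrix of the varimax criterion.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(L.T @ target)
        R = u @ vt
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion < criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return L @ R, R


# Hypothetical example: a clean two-factor structure blurred by a 30 degree
# rotation, which varimax should sharpen again.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.1],
                   [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
angle = np.deg2rad(30.0)
blur = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle), np.cos(angle)]])
unrotated = simple @ blur
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each parameter's communality (the row sum of squared loadings) is unchanged; only the distribution of the loadings across factors becomes more distinct.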

Implications for health, efficiency, and economics

Encroachment of high salinity and hardness in the water directly affects the nearby soil. This may alter the future land use pattern and affect the nearby agricultural and forest areas. One major potential implication is that, because the Daman Ganga River ends at the Arabian Sea, pollution in the river will also affect the aquatic ecosystem in the seawater.

The WQ reflected by the WQI has significant consequences for both ecological systems and human health. Increased levels of SAR, HAR_Ca, and P-Tot indicate poor WQ, which contributes to waterborne diseases and affects both human and agricultural health. In ecological terms, these pollutants change the amount of dissolved oxygen, which affects fish and plants, and excess phosphorus causes eutrophication, which further disturbs biodiversity. The WQI therefore aids the early detection of these threats and directs strategies to lower pollution and safeguard ecosystems.

Although unique to Vapi, the study's conclusions can be applied to other industrial areas with comparable pollution levels. The elevated HAR_Ca, SAR, and P-Tot values in Vapi are indicative of pollutants found in many industrial areas. A strong framework is provided by the combination of PCA and factor analysis with the WQI; nevertheless, parameter selection and weighting may need to be adjusted to accommodate local circumstances. This flexible approach makes the methodology useful as a template while taking into account distinct geographic and industrial situations across various regions, allowing for region-specific WQ assessments and efficient environmental management measures.

There are significant financial and environmental advantages to using the WQI as a benchmark, particularly in industrial locations like Vapi. By encouraging industries to enhance wastewater management, the WQI can reduce the contamination load on public water systems and lower water treatment costs. In addition, WQI requirements drive industrial compliance by encouraging businesses to implement pollution control systems. Improved WQ increases production efficiency and quality, which benefits sectors such as agriculture and pharmaceuticals that rely on clean water. Better WQ also benefits communities by minimizing waterborne illnesses, which lowers healthcare expenses and lessens financial burdens.

Recommendations for future research

Geographic Information Systems and WQI models should be integrated in future studies to evaluate and visualize the effects of urban stressors on WQ. Machine learning techniques might improve the accuracy of prediction and categorization. It is crucial to examine how environmental factors affect WQ patterns and how WQ standards have changed over time, and to incorporate biological indicators for a comprehensive evaluation. In addition, concentrating on pollution sources and seasonal fluctuations would facilitate efficient control. Advanced technologies such as remote sensing and real-time monitoring can improve evaluations and inform sustainable water resource management methods for ecosystem preservation.

Stakeholder implications and societal impact

Studies from other industrial regions have also identified hardness, salinity, and chemical pollutants as dominant factors in determining WQ, which aligns with this study's identification of HAR_Ca, SAR, and P-Tot as key contributors requiring immediate attention. By providing a clear and quantifiable assessment of WQ in the Vapi region, the study equips policymakers with essential information for informed decision-making regarding water management strategies, allowing policymakers:

  • (1) to educate local communities about the management of water resources,

  • (2) to promote environmental supervision for sustainable management practices in industrialized regions,

  • (3) to implement effective strategies that safeguard WQ, specifically considering the significantly impacting parameters on priority,

  • (4) to ensure the integrity and diligence with strict oversight mechanisms, and

  • (5) to assess the impact on groundwater of seepage of impurities present in the surface flow.

The authors acknowledge the assistance provided by the State Water Data Centre, Gandhinagar, Government of Gujarat.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Acal
C.
,
Aguilera
A. M.
&
Escabias
M.
(
2020
)
New modeling approaches based on varimax rotation of functional principal components
,
Mathematics
,
8
(
11
),
1
15
.
https://doi.org/10.3390/math8112085
.
Ahmed
W.
,
Mohammed
S.
,
El-Shazly
A.
&
Morsy
S.
(
2023
)
Tigris river water surface quality monitoring using remote sensing data and GIS techniques
,
Egyptian Journal of Remote Sensing and Space Science
,
26
(
3
),
816
825
.
https://doi.org/10.1016/j.ejrs.2023.09.001
.
An
S.
,
Xie
X.
&
Ma
Y.
(
2015
)
Evaluation of water quality using principal component analysis
,
Nature Environment and Pollution Technology
,
14
(
4
),
855
858
.
Asadi
S. S.
,
Vuppala
P.
&
Reddy
M. A.
(
2007
)
Remote sensing and GIS techniques for evaluation of groundwater quality in municipal corporation of Hyderabad (Zone-V), India
,
International Journal of Environmental Research and Public Health
,
4
(
1
),
45
52
.
https://doi.org/10.3390/ijerph2007010008
.
Benkov
I.
,
Varbanov
M.
,
Venelinov
T.
&
Tsakovski
S.
(
2023
)
Principal component analysis and the water quality index – a powerful tool for surface water quality assessment: a case study on Struma river catchment, Bulgaria
,
Water (Switzerland)
,
15
(
10
), 1961.
https://doi.org/10.3390/w15101961
.
Brown
R. M.
,
McClelland
N. I.
,
Deininger
R. A.
&
O'Connor
M. F.
(
1972
)
A water quality index – crashing the psychological barrier
.
In: Thomas, W.A. (eds). Indicators of Environmental Quality. Environmental Science Research, vol 1. Springer, Boston, MA, USA. https://doi.org/10.1007/978-1-4684-2856-8_15.
Chockalingam
N.
,
Banerjee
S.
&
Muruhan
S.
(
2019
)
Characterization of physicochemical parameters of textile effluents and its impacts on environment
,
Environment and Natural Resources Journal
,
17
(
2
),
41
53
.
https://doi.org/10.32526/ennrj.17.2.2019.11
.
Debels
P.
,
Figueroa
R.
,
Urrutia
R.
,
Barra
R.
&
Niell
X.
(
2005
)
Evaluation of water quality in the Chillán River (Central Chile) using physicochemical parameters and a modified water quality index
,
Environmental Monitoring and Assessment
,
110
(
1–3
),
301
322
.
https://doi.org/10.1007/s10661-005-8064-1
.
Ding
B.
,
Bai
Y.
,
Guo
S.
,
He
Z.
,
Wang
B.
,
Liu
H.
,
Zhai
J.
&
Cao
H.
(
2023
)
Effect of irrigation water salinity on soil characteristics and microbial communities in cotton fields in Southern Xinjiang, China
,
Agronomy
,
13
(
7
),
1
19
.
https://doi.org/10.3390/agronomy13071679
.
Dutta
S.
,
Dwivedi
A.
&
Suresh Kumar
M.
(
2018
)
Use of water quality index and multivariate statistical techniques for the assessment of spatial variations in water quality of a small river
,
Environmental Monitoring and Assessment
,
190
(
12
), 718.
https://doi.org/10.1007/s10661-018-7100-x
.
Ghoderao
S. B.
,
Meshram
S. G.
&
Meshram
C.
(
2022
)
Development and evaluation of a water quality index for groundwater quality assessment in parts of Jabalpur District, Madhya Pradesh, India
.
Water Supply
,
22
(
6
),
6002
6012
.
https://doi.org/10.2166/ws.2022.174
.
Guenouche
F. Z.
,
Mesbahi-Salhi
A.
,
Zegait
R.
,
Chouia
S.
,
Kimour
M. T.
&
Bouslama
Z.
(
2024
)
Assessing water quality in North-East Algeria: a comprehensive study using water quality index (WQI) and PCA
,
Water Practice and Technology
,
19
(
4
),
1232
1248
.
https://doi.org/10.2166/wpt.2024.073
.
Guo
L.
,
Xie
Y.
,
Sun
W.
,
Xu
Y.
&
Sun
Y.
(
2023
)
Research progress of high-salinity wastewater treatment technology
,
Water
,
15
(
4
),
684
.
https://doi.org/10.3390/w15040684
.
Gupta
N.
,
Pandey
P.
&
Hussain
J.
(
2017
)
Effect of physicochemical and biological parameters on the quality of river water of Narmada, Madhya Pradesh, India
.
Water Science
,
31
(
1
),
11
23
.
https://doi.org/10.1016/j.wsj.2017.03.002
.
Haghiabi
A. H.
,
Nasrolahi
A. H.
&
Parsaie
A.
(
2018
)
Water quality prediction using machine learning methods
,
Water Quality Research Journal
,
53
(
1
),
3
13
.
https://doi.org/10.2166/wqrj.2018.025
.
Ibrahim
A.
,
Ismail
A.
,
Juahir
H.
,
Iliyasu
A. B.
,
Wailare
B. T.
,
Mukhtar
M.
&
Aminu
H.
(
2023
)
Water quality modelling using principal component analysis and artificial neural network
,
Marine Pollution Bulletin
,
187
,
114493
.
https://doi.org/10.1016/J.MARPOLBUL.2022.114493
.
Jollife
I. T.
&
Cadima
J.
(
2016
)
Principal component analysis: a review and recent developments
,
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
,
374
(
2065
), 20150202.
https://doi.org/10.1098/rsta.2015.0202
.
Kandekar
V. U.
,
Pande
C. B.
,
Rajesh
J.
,
Atre
A. A.
,
Gorantiwar
S. D.
,
Kadam
S. A.
&
Gavit
B.
(
2021
)
Surface water dynamics analysis based on sentinel imagery and Google Earth Engine platform: a case study of Jayakwadi dam
,
Sustainable Water Resources Management
,
7
(
3
),
1
11
.
https://doi.org/10.1007/s40899-021-00527-7
.
Khabouchi
I.
,
Khadhar
S.
,
Driouich Chaouachi
R.
,
Chekirbene
A.
,
Asia
L.
&
Doumenq
P.
(
2020
)
Study of organic pollution in superficial sediments of Meliane river catchment area: aliphatic and polycyclic aromatic hydrocarbons
,
Environmental Monitoring and Assessment
,
192
(
5
), 283.
https://doi.org/10.1007/s10661-020-8213-6
.
Khan
M. H. R. B.
,
Ahsan
A.
,
Imteaz
M.
,
Shafiquzzaman
M.
&
Al-Ansari
N.
(
2023
)
Evaluation of the surface water quality using global water quality index (WQI) models: perspective of river water pollution
,
Scientific Reports
,
13
(
1
),
1
15
.
https://doi.org/10.1038/s41598-023-47137-1
.
Kilmer
P. D.
(
2010
)
Review article: review article
,
Journalism
,
11
(
3
),
369
373
.
https://doi.org/10.1177/1461444810365020
.
Kumar, A., Tripathi, V. K., Sachan, P., Rakshit, A., Singh, R. M., Shukla, S. K., Pandey, R., Vishwakarma, A. & Panda, K. C. (2022) Chapter 10 – Sources of ions in the river ecosystem. In: Madhav, S., Kanhaiya, S., Srivastav, A., Singh, V. & Singh, P. (eds). Ecological Significance of River Ecosystems. Elsevier, pp. 187–202.
Kumar
M.
,
Munoz-Arriola
F.
,
Furumai
H.
&
Chaminda
T.
(
2020
)
Resilience, Response, and Risk in Water Systems. Shifting Management and Natural Forcings Paradigms
.
Springer, Singapore
. https://doi.org/https://doi.org/10.1007/978-981-15-4668-6.
Kumar
P.
,
Dasgupta
R.
,
Johnson
B. A.
,
Saraswat
C.
,
Basu
M.
,
Kefi
M.
&
Mishra
B. K.
(
2019
)
Effect of land use changes on water quality in an ephemeral coastal plain: Khambhat City, Gujarat, India
,
Water (Switzerland)
,
11
(
4
), 724.
https://doi.org/10.3390/w11040724
.
Kumar
R.
,
Singh
H.
&
Singh Chohan
J.
(
2021
)
A novel quartile deviation score evaluation (QDSE) method to minimize uncertainty in evaluation of solid waste management
,
Materials Today: Proceedings
,
48
,
1084
1088
.
https://doi.org/10.1016/j.matpr.2021.07.355
.
Mahamat Nour
A.
,
Vallet-Coulomb
C.
,
Bouchez
C.
,
Ginot
P.
,
Doumnang
J. C.
,
Sylvestre
F.
&
Deschamps
P.
(
2020
)
Geochemistry of the Lake Chad tributaries under strongly varying hydro-climatic conditions
,
Aquatic Geochemistry
,
26
(
1
),
3
29
.
https://doi.org/10.1007/s10498-019-09363-w
.
Marselina
M.
,
Wibowo
F.
&
Mushfiroh
A.
(
2022
)
Water quality index assessment methods for surface water: a case study of the Citarum River in Indonesia
,
Heliyon
,
8
(
7
), e09848.
https://doi.org/10.1016/j.heliyon.2022.e09848
.
Moharir
K. N.
,
Pande
C. B.
,
Singh
S. K.
&
Del Rio
R. A.
(
2020
)
Nath Roy
B.
,
Roy
H.
,
Rahman
K. S.
,
Mahmud
F.
,
Bhuiyan
M. M. K.
,
Hasan
M.
,
Bhuiyan
A. A. K.
,
Hasan
M.
,
Mahbub
M. S.
,
Jahedi
R. M.
&
Islam
M. S.
(
2024
)
Principal component analysis incorporated water quality index modeling for Dhaka-based rivers
,
City and Environment Interactions
,
23
(
April
),
100150
.
https://doi.org/10.1016/j.cacint.2024.100150
.
Pessanha Santos
N.
(
2023
)
The expansion of data science: dataset standardization
,
Standards
,
3
(
4
),
400
410
.
https://doi.org/10.3390/standards3040028
.
Roy
M.
,
Shamim
F.
&
Chatterjee
S.
(
2021
)
Evaluation of physicochemical and biological parameters on the water quality of Shilabati River, West Bengal, India
,
Water Science
,
35
(
1
),
71
81
.
https://doi.org/10.1080/23570008.2021.1928902
.
Salerno
F.
,
Gambelli
S.
,
Viviano
G.
,
Thakuri
S.
,
Guyennon
N.
,
D'Agata
C.
,
Diolaiuti
G.
,
Smiraglia
C.
,
Stefani
F.
,
Bocchiola
D.
&
Tartari
G.
(
2014
)
High alpine ponds shift upwards as average temperatures increase: a case study of the Ortles-Cevedale mountain group (Southern Alps, Italy) over the last 50 years
,
Global and Planetary Change
,
120
,
81
91
.
https://doi.org/10.1016/j.gloplacha.2014.06.003
.
Şener
Ş.
,
Şener
E.
&
Davraz
A.
(
2017
)
Evaluation of water quality using water quality index (WQI) method and GIS in Aksu River (SW-Turkey)
,
Science of the Total Environment
,
584–585
,
131
144
.
https://doi.org/10.1016/j.scitotenv.2017.01.102
.
Sivarajah, U., Kamal, M. M., Irani, Z. & Weerakkody, V. (2017) Critical analysis of big data challenges and analytical methods, Journal of Business Research, 70, 263–286. https://doi.org/10.1016/j.jbusres.2016.08.001.
Smol, J. P. & Douglas, M. S. V. (2007) Crossing the final ecological threshold in high Arctic ponds, Proceedings of the National Academy of Sciences of the United States of America, 104 (30), 12395–12397. https://doi.org/10.1073/pnas.0702777104.
Takane, Y. (2003) Relationships among Various Kinds of Eigenvalue and Singular Value Decompositions. In: Yanai, H., Okada, A., Shigemasu, K., Kano, Y., Meulman, J.J. (eds) New Developments in Psychometrics. Springer, Tokyo. https://doi.org/10.1007/978-4-431-66996-8_4.
Tiwari, D. & Bajpai, R. (2012) Assessment of water quality in terms of total hardness and iron of some freshwater resources of Kanpur and its suburbs, Nature Environment and Pollution Technology, 11 (2), 235–238.
Wang, X., Xiao, X., Zou, Z., Dong, J., Qin, Y., Doughty, R. B., Menarguez, M. A., Chen, B., Wang, J., Ye, H., Ma, J., Zhong, Q., Zhao, B. & Li, B. (2020) Gainers and losers of surface and terrestrial water resources in China during 1989–2016, Nature Communications, 11 (1), 1–12. https://doi.org/10.1038/s41467-020-17103-w.
Wang, X., Jiang, J. & Gao, W. (2022) Reviewing textile wastewater produced by industries: characteristics, environmental impacts, and treatment strategies, Water Science and Technology, 85 (7), 2076–2096. https://doi.org/10.2166/wst.2022.088.
Woolway, R. I., Kraemer, B. M., Lenters, J. D., Merchant, C. J., O'Reilly, C. M. & Sharma, S. (2020) Global lake responses to climate change, Nature Reviews Earth and Environment, 1 (8), 388–403. https://doi.org/10.1038/s43017-020-0067-5.
Xu, H., Yang, B., Liu, Y., Li, F., Shen, C., Ma, C., Tian, Q., Song, X. & Sand, W. (2018) Recent advances in anaerobic biological processes for textile printing and dyeing wastewater treatment: a mini-review, World Journal of Microbiology and Biotechnology, 34 (11), 165. https://doi.org/10.1007/s11274-018-2548-y.
Yogendra, K. & Puttaiah, E. T. (2008) Determination of water quality index and suitability of an urban waterbody in Shimoga Town, Karnataka, Proceedings of Taal2007: The 12th World Lake Conference: Determination, pp. 342–346.
Zaman, M., Shahid, S. A. & Heng, L. (2018) Guideline for Salinity Assessment, Mitigation and Adaptation Using Nuclear and Related Techniques. Springer Nature, Cham. https://doi.org/10.1007/978-3-319-96190-3.
Zhang, H., Zhou, X., Lv, X., Xu, X., Weng, Q. & Lei, K. (2023) Exploration of the factors that influence total phosphorus in surface water and an evaluation of surface water vulnerability based on an advanced algorithm and traditional index method, Journal of Environmental Management, 342, 118155. https://doi.org/10.1016/j.jenvman.2023.118155.
Zhang, L., Zhu, X., Wang, H. & Liu, X. (2024) Research progress in the treatment of high-salinity wastewater, Journal of Physics: Conference Series, 2706, 1. https://doi.org/10.1088/1742-6596/2706/1/012042.
Zhao, K., Wu, H., Chen, W., Sun, W., Zhang, H., Duan, W., Chen, W. & He, B. (2020) Impacts of landscapes on water quality in a typical headwater catchment, Southeastern China, Sustainability (Switzerland), 12 (2), 721. https://doi.org/10.3390/su12020721.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).