This research aimed to identify the pollution sources and the main parameters affecting the water quality of the Aras River, one of the important rivers of the Caspian Sea watershed, within the borders of Iran, and to present control solutions. To this end, canonical correlation analysis (CCA) and principal component analysis (PCA) were used to reduce data dimensions and determine the pollution criteria based on four physical and six chemical parameters measured at 19 stations between 2020 and 2022. The CCA results showed that pH and dissolved oxygen (as physical variables) and mercury, nitrate, and sulfate (as chemical variables) had a significant role in predicting the canonical variables of chemical and physical parameters, respectively. These pollutants predominantly originated from anthropogenic pollution sources, including runoff infiltration from the surrounding vast agricultural lands (based on the first principal component, which accounted for 39.3% of the variations) and soil erosion in the watershed (based on the second principal component, which accounted for 25.7% of the variations). To conclude, water quality management programs for the Aras River should focus more on controlling anthropogenic pollution sources and on monitoring the status of the influential physicochemical parameters using multivariate statistical methods, especially CCA and PCA.

  • CCA and PCA methods were appropriate to determine the main physical and chemical parameters affecting the Aras River water quality.

  • Physical parameters were a function of chemical parameters, which were more affected by anthropogenic activities.

  • The grouping of important water quality parameters was done well through PCA, which can be used in future studies.

Surface water quality is of vital importance for humans and aquatic organisms (Safizadeh et al. 2021). One of the key issues in the quality management of water resources is determining the relationship between the various physical and chemical parameters affecting water bodies (Valiallahi 2022). Addressing this issue is a fundamental step toward a more detailed exploration of pollution sources and provides appropriate solutions to reduce the adverse effects of pollutants and ultimately to improve the quality of such resources (Sajjadi et al. 2019). Water quality monitoring requires accuracy in the identification of pollution sources and the determination of physicochemical and biological parameters affecting the surrounding environment (Mackialeagha et al. 2022).

The nature of physical pollutants generally depends on the physical attributes and climate conditions of the watershed and the chemical pollutants are often produced by human activities. Hence, the physical and chemical parameters can be considered as dependent and independent variables, respectively. Thus, there is a need for comprehensive studies to determine the correlation between these parameters (Fataei & Shiralipoor 2011). By determining the degree of correlation between these two categories of variables, it can be claimed that the behavior of the dependent variables is a function of the behavior of the independent variables. If an acceptable correlation is found between these two variables, it can be concluded that the origin of both pollutants is anthropogenic activities (Abdolabadi et al. 2014).

To properly manage water quality, it is necessary and inevitable to know the reality of the problem and use models and tools that provide clear, reliable, and practical results as much as possible. Many environmental indicators have been proposed over the past years. One of the simple methods to evaluate the water quality conditions along the river is to use the water quality index (WQI), as one of the widely used indicators to classify surface water quality. In this method, a large amount of data obtained from water quality measurements is converted into a single and dimensionless number, which is defined on a graded scale and has meaning and interpretation. In the WQI method, it is not possible to use all the physicochemical and biological parameters at the same time and it is limited to only nine parameters, including total dissolved solids (TDS), pH, turbidity, phosphate, nitrate, biochemical oxygen demand (BOD), chemical oxygen demand (COD), dissolved oxygen (DO), temperature and coliform (Samadi et al. 2015; Ewaid 2017). Therefore, it is not possible to change variables through this method to determine other parameters affecting water quality (Uddin et al. 2017).

On the one hand, although quality parameters may be accurately measured, index results depend to a large extent on the opinions and judgments of experts. In addition, it has complications and weaknesses in the interpretation of different levels of water quality (very bad to very good) so the analyst has to analyze many water quality variables in some cases. Also, it is difficult to summarize and draw conclusions from a large amount of information in the WQI method, which reduces the value of this method. On the other hand, finding the connection between pollutants using laboratory tests involves huge costs, and it is also not possible to determine the degree of relationship between parameters and clarify their possible common origin using these methods (Sayadi & Gharehmahmoodlu 2019).

Reportedly, multivariate statistical methods lack these limitations and allow the use of a large and diverse number of parameters (Neissi & Tishehzan 2018). Therefore, the current research used canonical correlation analysis (CCA) and principal component analysis (PCA) to reduce the number of water quality parameters needed to determine the criteria and sources of pollution and to reduce data dimensions in water quality management of the Aras River, one of the important rivers of the Caspian Sea watershed, within the borders of Iran.

Different methods have been reported to determine the correlation between two variables. For example, the standard correlation coefficient or Pearson correlation coefficient (PCC) expresses the direction and degree of correlation between two sets of variables. In addition, non-parametric measurement methods, such as Kendall's Tau and Spearman's rank correlation coefficients, operate on the basis of ranking similarity between two variables. Multivariate regression is a technique capable of estimating the correlation between a dependent variable and a set of independent variables. Therefore, the use of correlation analysis methods, which can explain the correlation between a set of variables, will be effective in analyzing the correlation of multidimensional variables (Eskandare et al. 2015). To this end, many researchers have applied multivariate statistical methods, such as CCA, to engineering problems and subjects (Keskin & Yasar 2007).
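
The difference between the Pearson coefficient and a rank-based coefficient such as Spearman's can be illustrated with a minimal sketch. The data below are hypothetical values chosen only to show a monotone but non-linear relationship; they are not measurements from this study, and the helper names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y increases monotonically with x, but not linearly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)    # below 1 because the relationship is not linear
r_spearman = spearman(x, y)  # 1 because the ranking is identical
print(round(r_pearson, 3), round(r_spearman, 3))
```

Both coefficients describe a single pair of variables, which is why methods such as CCA are needed when two whole sets of variables are compared.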

In this regard, Noori et al. (2010) used the CCA method to determine the relationship between physical pollutants, including total suspended solids (TSS), turbidity, electrical conductivity (EC) and temperature, and chemical parameters, including BOD, COD, phosphate, nitrate, chloride, and hardness, in the catchment basin of the Karun River in southwestern Iran. Their results indicated a strong correlation between the studied physical and chemical variables. Rasi Nezami et al. (2013) studied the water quality of the Kor River (Iran) using CCA and PCA techniques. The PCA results demonstrated that all the quality parameters selected to check the water quality of the Kor River were of equal importance. According to CCA findings, the correlation coefficient for the first three standard variables was 0.896, 0.848, and 0.593, respectively, indicating that turbidity, DO and TSS among the physical parameters, and BOD, COD, total phosphorus (TP), and total nitrogen (TN) among the chemical parameters had the highest score among the canonical variables. Cheng Chan et al. (2013) analyzed water quality and correlation between physical and chemical variables using PCA and CCA methods in Macau Main Storage Reservoir, China. Among the 28 water quality parameters studied, six variables were less important in explaining the monthly variation of the water quality, and electro-conductivity and chloride compounds were dominant variables in the physical and chemical parameters, respectively. Banerjee et al. (2015) investigated the water quality of the Bakreswar reservoir in the Birbhum district of Bengal using PCA and CCA tests. Their results showed that the main factors affecting water quality were TN, TP, temperature, and relative humidity. The CCA results revealed the correlation of all water quality variables with temporal variation. 
Neissi & Tishehzan (2018) also applied CCA and PCA to find relationships between physicochemical variables and determine the main variables affecting the Dez River water quality in southern Iran. The results of their study revealed a significant relationship between the two categories of physical and chemical variables. Most of the changes were attributed to anthropogenic sources. In addition, the results of the principal components indicated that the main variables were EC, temperature, sodium adsorption ratio (SAR), sulfate, pH, and DO, which often originated from industrial pollutants and other sources of anthropogenic pollution. Soltani et al. (2019) monitored the surface water quality of the Aras watershed (Iran) using multivariate statistical techniques and concluded that the main causes of river water pollution were agricultural activities, domestic sewage, and weathering.

The Aras River is one of the most important and water-rich border rivers in Iran, belonging to the Caspian Sea watershed in the northwest of Iran, and it provides the water needed for drinking, agriculture, and industrial purposes in vast areas. The present research aimed to determine the pollution sources and the main parameters affecting the Aras River water quality, to investigate the correlation between the main physical and chemical variables affecting the water quality within the borders of Iran, and to present possible control solutions.

Study area

The Aras River is one of the most important border rivers in Iran. The watershed of this river covers parts of Turkey, Armenia, Azerbaijan, and Iran. It has a length of 1,072 km, of which 410 km form the joint border of Iran, Azerbaijan, and Armenia. This river receives different types of pollutants from various pollution sources, including sewage from nearby cities and villages, as well as agricultural, industrial, and mining effluents from nearby areas. Effluents from fish farms, villages, and cities located on the banks of the river in the Nakhchivan Autonomous Republic of Azerbaijan are discharged into the water resources of the Aras River without being treated. In addition, the wastewater of vast agricultural lands in these areas, as well as in Armenia and Iran, is discharged into this river through the main drains. On the other hand, this river receives pollutants from the route traveled in Turkey at the station entering Iran, which originates from the sewage of the Igdir Plain. It also experiences the wastewater of the industrial and mining areas of Armenia and the 200-hectare fish farm of this country, where different types of fish are raised. Untreated wastewater discharge in the cities of Poldasht and Maku in Iran also affects the water quality of this river. After the dissolution of the Soviet Union, not much action has been taken regarding the construction of sewage treatment plants in Azerbaijan and Armenia. In addition, the few sewage treatment plants left from that era were sometimes destroyed and no regular maintenance and repair operations have been done on them. Thus, the effluents of these sewage treatment plants are discharged into the Aras River without meeting the necessary standards. On the other hand, industrial effluents in these two countries are highly polluted due to the age and consumption of industrial units. 
In addition to the above factors, sediment organic load caused by deforestation and agriculture in the river basin is another source of pollution (Khoshnoodmotlagh et al. 2020). Figure 1 shows the geographical location of the 19 water sampling stations under study across the borders of Iran on the Aras River (covering the three provinces of East Azarbaijan, West Azarbaijan, and Ardabil).
Figure 1

Geographical location of the Aras River watershed.

Measured parameters

Samples were collected seasonally from 19 sampling stations over a 2-year period from 20 March 2020 to 19 March 2022. Water sampling containers were first washed with detergent and tap water, then with nitric acid, and finally with distilled water. At the time of sampling, they were rinsed three times with the water to be sampled. Prior to transport to the laboratory, the samples were stabilized with the required chemicals, the air and water temperatures were measured, and the samples were labeled (including sampling station specifications recorded by GPS, sampling time, and weather conditions). The samples were transported to the laboratory under suitable temperature conditions in the shortest possible time and refrigerated at 4 °C until testing.

Sampling containers for nitrate, sulfate, and heavy metal parameters were made of polyethylene and had a volume of one liter. Sampling containers for BOD and COD were sterilized dark glass bottles. The heavy metal samples were acidified (to pH ≤ 2) using concentrated nitric acid to prevent the possible precipitation of cations and the growth of microorganisms, and also to minimize adsorption by the container walls.

In the laboratory, the collected samples were analyzed for 10 parameters based on Standard Methods for the Examination of Water and Wastewater 2015, as described in Table 1. The concentration of heavy metals was measured through inductively coupled plasma mass spectrometry (ICP-MS). Nitrate and sulfate concentrations were measured by a spectrophotometer. DO, pH, EC, and Turb. were recorded by a Hach portable device. The values of BOD and COD were measured using Hach Company's measuring device.

Table 1

Water quality parameters, units, and methods of analysis used in the Aras River in Iran from 2020 to 2022

| Parameters | Abbreviations | Units | Analytical instruments and methods |
|---|---|---|---|
| Electrical conductivity | EC | μS cm−1 | Electrometer |
| Dissolved oxygen | DO | mg L−1 | Azide-Winkler titration method |
| Turbidity | Turb. | NTU | Turbidity meter |
| pH | pH | pH unit | pH meter |
| Nitrate | NO3 | mg L−1 | Spectrophotometry |
| Sulfate | SO4²⁻ | mg L−1 | Spectrophotometry |
| Chemical oxygen demand | COD | mg L−1 | Dichromate reflux method |
| Biochemical oxygen demand | BOD | mg L−1 | Azide-Winkler titration method |
| Heavy metals | Hg, As | ppb | Inductively coupled plasma mass spectrometry |

Statistical analysis

Data were analyzed by SPSS24 and MINITAB15 software using CCA, PCC, and PCA. The normality of the data was checked by the Kolmogorov–Smirnov test. Bartlett's test was used to determine the homogeneity of variances.

Canonical correlation analysis

CCA is one of the most common methods of multivariate analysis and aims to determine the linear correlation between two sets of multidimensional variables (Nash & Chaloud 2002). In this method, it can be useful to consider one set of factors as independent variables and the other set as dependent variables (Zamani et al. 2016). One advantage of CCA over ordinary correlation analysis (OCA) is that OCA depends on the coordinate system in which the variables are described. Even if there is a very strong linear relationship between two multidimensional variables, it may be missed in the coordinate system chosen for OCA, while the same relationship may yield a high correlation in another coordinate system. CCA finds the coordinate system in which the correlation coefficient is optimal. In some multivariate data sets, the variables are naturally divided into two separate groups, such as X1, X2, …, Xp (Xs) and Y1, Y2, …, Yq (Ys). Here, the response variables (physical parameters) and the predictor variables (chemical parameters) form these two groups; the canonical correlation matrix is used to interpret the correlation between them, and the method also serves as a tool to reduce the volume of data evaluated in the calculations. The goal of CCA is to build two new sets of components, U = a′X and V = b′Y, which are linear combinations of the primary variables. The (p+q)×(p+q) correlation matrix between the variables X1, X2, …, Xp and Y1, Y2, …, Yq is obtained from the sampling data, where p is the number of physical parameters and q is the number of chemical parameters (Noori et al. 2010). This matrix is partitioned into the p×p block A (correlations among the Xs), the q×q block B (correlations among the Ys), and the p×q block C (correlations between the Xs and the Ys). The matrix B−1C′A−1C can then be calculated, and the eigenvalues λ are obtained from the following equation:

$$
\begin{bmatrix} A & C \\ C' & B \end{bmatrix}, \qquad \left(B^{-1} C' A^{-1} C - \lambda I\right) b = 0 \tag{1}
$$

It should be noted that the resulting eigenvalues λ1 > λ2 > … > λr are the squares of the correlation coefficients between the canonical variables, and the corresponding eigenvectors b1, b2, …, br give the coefficients used to construct the canonical variables Vi from the primary variables Y. The coefficients of the X variables for the canonical variables Ui (that is, the ai) are obtained from the following equation:

$$
a_i = A^{-1} C \, b_i \tag{2}
$$

In all of these calculations, it is assumed that the primary variables X1, X2, …, Xp and Y1, Y2, …, Yq have been standardized, meaning they have a mean of 0 and a standard deviation of 1 (Boyacıoğlu et al. 2013). Finally, the canonical variables Ui = ai1X1 + ai2X2 + … + aipXp and Vi = bi1Y1 + bi2Y2 + … + biqYq are obtained. The correlation between each pair Ui and Vi is checked and presented as a canonical correlation coefficient with values between +1 and −1. The modeling steps of quality parameters in CCA are presented in Figure 2.
Figure 2

Modeling steps to determine the canonical correlation coefficient between canonical variables.
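
The calculation described in this section can be sketched with NumPy. This is a minimal illustration and not the SPSS/MINITAB procedure used in the study. It assumes the common textbook formulation in which the correlation matrix is partitioned into blocks A (among X), B (among Y), and C (between X and Y), and the eigenvalues of B⁻¹C′A⁻¹C are the squared canonical correlations. The data are synthetic and the function names `cca` and `standardize` are hypothetical.

```python
import numpy as np

def cca(X, Y):
    """Canonical correlations and coefficients from the partitioned correlation matrix."""
    p, q = X.shape[1], Y.shape[1]
    R = np.corrcoef(np.hstack([X, Y]), rowvar=False)  # (p+q) x (p+q)
    A, C, B = R[:p, :p], R[:p, p:], R[p:, p:]
    # q x q matrix B^-1 C' A^-1 C (assumed textbook form of the eigenproblem)
    M = np.linalg.solve(B, C.T) @ np.linalg.solve(A, C)
    eigvals, eigvecs = np.linalg.eig(M)
    # Keep the min(p, q) largest eigenvalues, in descending order
    order = np.argsort(eigvals.real)[::-1][: min(p, q)]
    lam = np.clip(eigvals.real[order], 0.0, 1.0)
    b = eigvecs.real[:, order]        # coefficients for the Y set
    a = np.linalg.solve(A, C @ b)     # a_i = A^-1 C b_i, coefficients for the X set
    return np.sqrt(lam), a, b

def standardize(M):
    """Scale each column to mean 0 and standard deviation 1."""
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)

# Synthetic data only: 4 "physical" and 6 "chemical" columns sharing one signal.
rng = np.random.default_rng(0)
n = 200
shared = rng.normal(size=n)
X = rng.normal(size=(n, 4))
X[:, 0] += 2 * shared
Y = rng.normal(size=(n, 6))
Y[:, 0] += 2 * shared

rho, a, b = cca(X, Y)
U1 = standardize(X) @ a[:, 0]   # first canonical variable of the X set
V1 = standardize(Y) @ b[:, 0]   # first canonical variable of the Y set
check = np.corrcoef(U1, V1)[0, 1]
print(np.round(rho, 3), round(float(check), 3))
```

The number of canonical pairs equals the size of the smaller set (four here), and the correlation between U1 and V1 reproduces the first canonical correlation, which is a useful sanity check on any implementation.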

Principal component analysis

PCA is one of the most powerful multivariate statistical methods for reducing data dimensions and extracting a smaller number of latent factors for correlation analysis between variables (Tokalıoglu & Kartal 2006). This reduction is obtained by transforming the data set into a new set of variables (principal components), each of which groups original variables that are strongly correlated with one another and closely related in subject. As a result, all variables are summarized into several groups and each group is considered a principal component (Marengo et al. 2008). Equation (3) shows the method of determining principal components.
$$
PC_{ij} = \sum_{k=1}^{m} W_{ik} X_{kj} \tag{3}
$$

where PCij is the score of the ith component for the jth observation, Wik is the loading of the kth original variable on the ith component, Xkj is the value of the kth variable for the jth observation, and m is the number of original variables. The final judgment on the variables used in this section and their role in the factors is determined through the total explained variance of the variables. The eigenvalue of each factor reflects the proportion of the total variance of the variables that is explained by that factor. Therefore, the eigenvalues show the explanatory significance of the factors in relation to the variables (Hashemi et al. 2012). Finally, the number of PCs that can comprehensively describe the variables is determined, and the variables are placed in their subgroups according to their degree of correlation with these components (Neissi & Tishehzan 2018).
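
A minimal sketch of this procedure with NumPy is given below. It assumes PCA is carried out on the correlation matrix of standardized variables, which is a common choice but is not stated explicitly in the source. The data are synthetic and the function name `pca_correlation` is hypothetical.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of the columns of X (assumed setup)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardized variables
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)              # R is symmetric
    order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
    eigvals, W = eigvals[order], eigvecs[:, order]
    # scores[j, i] = sum_k W[k, i] * Z[j, k]: score of component i for observation j
    scores = Z @ W
    explained = 100 * eigvals / eigvals.sum()         # percent of total variance
    return eigvals, W, scores, explained

# Synthetic data only: 152 observations of 10 variables, two of them correlated.
rng = np.random.default_rng(1)
X = rng.normal(size=(152, 10))
X[:, 1] += X[:, 0]

eigvals, W, scores, explained = pca_correlation(X)
print(np.round(eigvals, 2))
print(np.round(explained, 1))
```

With standardized variables the eigenvalues sum to the number of variables, so an eigenvalue above one means a component explains more variance than any single original variable.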

This method is used for purposes such as reducing the number of variables, determining periodic trends in data, assessing water quality and its influential factors, and extracting the important parameters in pollution (Rahnama & Sayari 2019). Data reduction is achieved by substituting the derived factors for the original variables in the data set. Factor analysis pursues three goals: determining the correlation between variables, recognizing representative variables from a large data set, and building a new (smaller) set of variables to replace the original variables in future analyses (Mohamed et al. 2015).

The CCA and PCA methods were used to analyze the data obtained from 2 years of sampling at 19 stations of the studied river in order to explore the correlation between physical and chemical parameters and to determine the most important parameters affecting the water quality of this river. Table 2 shows the results obtained from the CCA method. The smaller of the two variable sets, that is, the four physical parameters, determined the number of canonical categories created. In this method, after obtaining the U and V sets, their correlations were determined.

Table 2

The results of CCA for independent and dependent parameters in the Aras River watershed in Iran

| Statistic | Category 1 (V1, U1) | Category 2 (V2, U2) | Category 3 (V3, U3) | Category 4 (V4, U4) |
|---|---|---|---|---|
| Canonical correlation | 0.933 | 0.743 | 0.427 | 0.240 |
| Eigenvalues | 6.74 | 1.23 | 0.23 | 0.06 |
| Degrees of freedom | 24 | 15 | | |
| Significance level | 1.955* | 0.877 | 0.382 | 0.244 |

SCC, standard canonical coefficient; CL, canonical loading.

| Parameter | Cat. 1 SCC | Cat. 1 CL | Cat. 2 SCC | Cat. 2 CL | Cat. 3 SCC | Cat. 3 CL | Cat. 4 SCC | Cat. 4 CL |
|---|---|---|---|---|---|---|---|---|
| Physical parameters | | | | | | | | |
| DO | 0.228 | 0.874 | 0.441 | 0.331 | −0.560 | −0.061 | 1.775 | −0.350 |
| EC | −0.171 | −0.604 | −0.468 | −0.537 | −1.068 | −0.588 | −0.683 | −0.025 |
| pH | 0.728 | 0.954 | −0.731 | −0.274 | −0.130 | −0.085 | 1.151 | 0.081 |
| Turb. | −0.044 | −0.054 | 0.548 | −0.735 | 0.576 | 0.566 | −0.729 | −0.368 |
| Chemical parameters | | | | | | | | |
| As | −0.058 | −0.099 | 0.227 | 0.472 | 0.387 | 0.331 | 0.355 | 0.645 |
| BOD | −0.270 | 0.153 | 0.353 | −0.052 | 0.165 | 0.719 | 0.314 | −0.520 |
| COD | −0.349 | 0.032 | −0.418 | −0.644 | −0.571 | 0.251 | −0.391 | −0.404 |
| Hg | 0.034 | 0.648 | 0.640 | 0.603 | −0.540 | −0.151 | −0.991 | −0.418 |
| NO3 | 0.760 | 0.932 | 0.088 | −0.049 | −0.001 | −0.237 | 1.057 | 0.201 |
| SO4 | 0.328 | 0.713 | −0.493 | −0.511 | 0.321 | 0.300 | −0.481 | −0.312 |

* Significant at the 5% level

** Significant at the 1% level

According to Table 2, the canonical correlation coefficients for the first, second, and third categories of canonical variables were 0.933, 0.743, and 0.427, respectively, which were relatively high values. The correlation coefficient for the fourth category was 0.240. According to Tabachnick & Fidell (2000), the correlation between canonical variables can be interpreted only when its value is greater than 0.3; because of its lower correlation coefficient, the fourth category was not used for drawing conclusions. It is also noteworthy that the higher the correlation coefficient of a category, the more decisively a correlation between the sets of variables under investigation can be inferred. According to the p-value for each category, only the canonical variables of the first category were valid, because their significance level was less than 0.05, indicating a high correlation between the parameters for the first canonical variables. Therefore, at the 95% confidence level, it can be stated that the physical (dependent) and chemical (independent) variables in the first category, whose correlation coefficient was 0.933, were highly correlated. According to the obtained significance levels, the other three categories of canonical variables did not provide valid results regarding the existence of a correlation between parameters.
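
As a quick consistency check on Table 2, the short sketch below compares the reported eigenvalues with the ratio r²/(1 − r²) computed from the canonical correlations. That this ratio is the convention behind the "Eigenvalues" row is an inference from the numbers, not something the source states; it differs from the squared correlation r² described in the methods. The sketch also applies the 0.3 interpretability threshold cited above.

```python
# Values copied from Table 2.
canonical_r = [0.933, 0.743, 0.427, 0.240]
reported_eigenvalues = [6.74, 1.23, 0.23, 0.06]

def eigenvalue_from_r(r):
    """Shared-to-unshared variance ratio r^2 / (1 - r^2) (assumed convention)."""
    return r ** 2 / (1 - r ** 2)

derived = [eigenvalue_from_r(r) for r in canonical_r]

# Threshold of 0.3 cited from Tabachnick & Fidell (2000) in the text.
interpretable = [r > 0.3 for r in canonical_r]

for r, reported, value in zip(canonical_r, reported_eigenvalues, derived):
    print(f"r = {r:.3f}  r^2 = {r ** 2:.3f}  derived = {value:.2f}  reported = {reported:.2f}")
print(interpretable)
```

The derived and reported values agree to within rounding of the three-decimal correlations, and only the fourth category falls below the threshold.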

More precisely, it could be seen that the correlation between chemical and physical pollution parameters based on the canonical correlation result in Category 4 was very small due to the low correlation coefficient of this category, and it practically indicated the lack of correlation between the two categories of chemical and physical pollution parameters. In general, it was observed that there was an effective correlation between the physical and chemical parameters with a high canonical correlation for Categories 1 and 2, and the canonical variables of Categories 1 and 2 could be the basis of the conclusion that the chemical and physical parameters under study probably had the same pollution sources.

According to Table 2, the principal variables in the first canonical pair (Category 1) included the physical variables related to U1 and the chemical variables related to V1. The CCA results showed that pH and DO in column U1 of the physical variables had canonical coefficients of 0.728 and 0.228, respectively, and were thus more important, indicating that these two water quality variables had the greatest influence on the linear combination U1 (physical parameters). Among the chemical variables in V1, NO3, COD, and SO4 had canonical coefficients of 0.760, −0.349, and 0.328, respectively, indicating that NO3 had a relatively high correlation with V1 compared with the other parameters, so that an increase in this parameter raised the value of V1. Likewise, in the canonical category of physical variables U1, pH had a relatively high correlation with U1; therefore, as pH increased, the value of U1 also rose.

Based on the CCA results (Table 2), the physical variables pH and DO, with canonical loadings of 0.954 and 0.874, respectively, had a significant role in predicting the canonical variable of chemical parameters. According to the squared canonical loadings (0.910 and 0.764, respectively), the share of pH in predicting the variance of the chemical variables was 91% and the share of DO was 76%. On the other hand, the chemical variables NO3, SO4, and Hg, with canonical loadings of 0.932, 0.713, and 0.648, respectively, had a significant role in predicting the canonical variable of physical parameters, with squared canonical loadings of 0.868, 0.510, and 0.420, respectively. Hence, in predicting the variance of the physical variables, the contributions of NO3, SO4, and Hg were 87, 51, and 42%, respectively.
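
The percentage shares quoted above follow from squaring the Category 1 canonical loadings in Table 2. A minimal sketch of that arithmetic, using only values from the table:

```python
# Category 1 canonical loadings copied from Table 2.
loadings = {"pH": 0.954, "DO": 0.874, "NO3": 0.932, "SO4": 0.713, "Hg": 0.648}

# Squared loading, expressed as a whole-number percentage.
shares = {name: round(100 * value ** 2) for name, value in loadings.items()}

for name, share in shares.items():
    print(f"{name}: loading = {loadings[name]:.3f}, share = {share}%")
```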

According to the results obtained for the correlation of each pair of Ui and Vi, it could be stated that the physical variables of pH and DO and the chemical variables of NO3, SO4, and Hg in terms of p-value were the dominant variables among the parameters investigated in this research. Therefore, the results of this research could reveal a strong correlation between two categories of chemical and physical parameters. In fact, it could be claimed that both chemical and physical parameters originated from the same source. Since chemical factors in watersheds are often caused by anthropogenic activities, and physical factors can also be of anthropogenic or natural origin, based on the CCA results, it could be stated that both chemical and physical parameters were of anthropogenic origin in the Aras River watershed in Iran.

In evaluating water quality parameters using CCA and cluster analysis, Abdolabadi et al. (2014) concluded that the physical parameters of TDS and EC as well as the chemical parameters of pH and COD were the dominant variables according to the results obtained for the relationship of each pair of Ui and Vi. Also, Eskandare et al. (2015) presented a new solution for quality management of the Sefidrood River (Iran) based on the CCA method. They reported that EC and T in the linear combination of Ui (chemical parameters category) as well as NO3 and PO4 in the linear combination of Vi (physical parameters category) were the most influential parameters, which had the same man-made origin. Sayadi & Gharehmahmoodlu (2019) investigated and predicted water quality parameters of the Gamasyab River using CCA and time series. According to their results, HCO3 and Mg parameters at Pol-Chehr station with canonical coefficients of 0.938 and 0.933, respectively, were in the first category, and the Na parameter with a canonical coefficient of 0.845 was in the second category. At the DoAb station, HCO3 and Ca parameters with canonical coefficients of 0.945 and 0.0789 were placed in the first category, and Na and Cl with canonical coefficients of 0.930 and 0.800, respectively, were in the second category. The origin of the first and second categories of canonical variables could be related to the dissolution of limestone and evaporative deposits.

As previously discussed, based on the results of Table 2, there was a strong correlation between pH and DO parameters and NO3, COD, and SO4 parameters. The high coefficient of variation of pH and DO parameters was attributed to the discharge of agricultural effluents and sewage from villages and towns located on the banks of the river along the route traveled in Turkey, Armenia, and Azerbaijan. The discharge of these effluents has caused a decrease in the values of both pH and DO parameters at the discharge site.

On the other hand, the increase in NO3, COD, and SO4 values could be attributed to some sources of pollution, which are mentioned below:

  • Wastewater returned from Igdir Plain lands and water quality change caused by the volcanic nature of Mount Ararat along the route in Turkey.

  • Wastewater discharge of industrial areas and mines in Armenia.

  • Mining activities and mineral processing in the East Azarbaijan region of Iran.

  • Effluent of large sturgeon breeding complexes in Armenia and Nakhchivan Autonomous Republic of Azerbaijan.

Discharges of mine waste from active mines in Armenia and in East Azarbaijan province in Iran, where various organic and inorganic chemicals are used to separate metal from the deposit, introduced various chemicals and heavy metals into the Aras River watershed and thus affected its quality.

Table 3 and Figure 3 show the PCA results. In the present research, the PCA method was used to validate the correlation coefficient matrix obtained from CCA. According to this test, most of the variations could be explained by the first three components, which together accounted for 76.3% of the total variance. In each of the first three principal components, the parameters with the highest coefficients were recognized as the most important parameters.

Table 3

The results of PCA for water quality parameters of the Aras River watershed in Iran

Parameters          Significant principal component
                    PC1       PC2       PC3       PC4
pH                  0.442    −0.040     0.349    −0.066
EC                 −0.332    −0.200     0.396     0.345
DO                  0.448     0.169    −0.034     0.024
Turb.              −0.015    −0.429     0.183     0.183
BOD                 0.161    −0.376    −0.649    −0.019
COD                 0.079    −0.555    −0.123     0.093
SO4                 0.368    −0.312     0.146     0.161
NO3                 0.429     0.103     0.343    −0.005
Hg                  0.365     0.216    −0.291     0.333
As                 −0.091     0.377    −0.164    −0.659
Eigenvalues (%)     39.3      25.7      11.3      9.28

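
The way dominant parameters are read off Table 3 can be sketched in a few lines of Python. This is only an illustration: the loadings are copied from Table 3, but the cutoff of 0.30 on the absolute loading and the function name `dominant_parameters` are assumptions, since the paper does not state the threshold it applied.

```python
# Loadings copied from Table 3, in the order (PC1, PC2, PC3, PC4).
LOADINGS = {
    "pH":    (0.442, -0.040, 0.349, -0.066),
    "EC":    (-0.332, -0.200, 0.396, 0.345),
    "DO":    (0.448, 0.169, -0.034, 0.024),
    "Turb.": (-0.015, -0.429, 0.183, 0.183),
    "BOD":   (0.161, -0.376, -0.649, -0.019),
    "COD":   (0.079, -0.555, -0.123, 0.093),
    "SO4":   (0.368, -0.312, 0.146, 0.161),
    "NO3":   (0.429, 0.103, 0.343, -0.005),
    "Hg":    (0.365, 0.216, -0.291, 0.333),
    "As":    (-0.091, 0.377, -0.164, -0.659),
}

# Hypothetical cutoff: the source does not report the threshold it used.
THRESHOLD = 0.30


def dominant_parameters(component_index, threshold=THRESHOLD):
    """Return (name, loading) pairs whose absolute loading on the given
    component meets the cutoff, sorted from largest to smallest |loading|."""
    selected = [
        (name, values[component_index])
        for name, values in LOADINGS.items()
        if abs(values[component_index]) >= threshold
    ]
    return sorted(selected, key=lambda item: abs(item[1]), reverse=True)


for k in range(3):
    names = [name for name, _ in dominant_parameters(k)]
    print(f"PC{k + 1}: {', '.join(names)}")
```

With this assumed cutoff, PC1 and PC2 return the same parameter groups that the text discusses. PC3 additionally picks up NO3 (0.343), which the text does not list, so the cutoff actually applied by the authors may have been slightly different.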
Figure 3

Scree plot of PCA for physicochemical parameters.

A scree plot shows the eigenvalue of each component against the component number. Components with eigenvalues greater than one are selected as influential components. The quality changes between the stations can be interpreted with higher accuracy and confidence through the important parameters of these components (Neissi & Tishehzan 2018). As can be seen in the scree plot (Figure 3), the first three principal components had eigenvalues greater than one and were therefore the most significant.
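
The eigenvalue-greater-than-one rule can be checked against the last row of Table 3 with a short calculation. The sketch below assumes that the PCA was run on the correlation matrix of the 10 parameters in Table 3, so that the eigenvalues sum to 10 and each eigenvalue equals its percentage of variance times 10/100. The source does not state this explicitly, and the function names are illustrative.

```python
# Percentage of variance per component, copied from the last row of Table 3.
VARIANCE_PCT = [39.3, 25.7, 11.3, 9.28]

# Number of parameters listed in Table 3.
N_VARIABLES = 10


def eigenvalues_from_percent(variance_pct, n_variables):
    """Convert percentages of variance to eigenvalues.

    Assumption: PCA on the correlation matrix (standardized variables),
    so the eigenvalues sum to the number of variables.
    """
    return [pct * n_variables / 100.0 for pct in variance_pct]


def kaiser_retained(eigenvalues):
    """Return 1-based indices of components with eigenvalue greater than one."""
    return [i + 1 for i, value in enumerate(eigenvalues) if value > 1.0]


eigenvalues = eigenvalues_from_percent(VARIANCE_PCT, N_VARIABLES)
retained = kaiser_retained(eigenvalues)
cumulative_pct = sum(VARIANCE_PCT[i - 1] for i in retained)

print("Eigenvalues:", [round(value, 3) for value in eigenvalues])
print("Retained components:", retained)
print("Cumulative variance of retained components (%):", round(cumulative_pct, 1))
```

Under this assumption the fourth component falls just below one, which is consistent with the text retaining three components that together explain 76.3% of the variation.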

The effective components in PCA are identified through their eigenvalues (Debnath et al. 2023). The parameters influencing the water quality of the Aras River within Iran were identified from the principal components with the highest eigenvalues (Figure 3), and components with eigenvalues greater than 1 were examined as significant components (Yaganoglu et al. 2020). Figure 4 presents a three-dimensional plot of the important parameters in the first three principal components.

According to the PCA results (Table 3 and Figure 4), the first component (PC1) explained 39.3% of the variation. Because of their high coefficients, DO, pH, and EC from the physical group and NO3, SO4, and Hg from the chemical group were identified as the main variables of the Aras River water quality in this component. These parameters, especially the chemical parameters that strongly influence changes in the physical parameters, were attributed to anthropogenic pollution sources. Based on the parameters in PC1, the water quality of the river was affected first by runoff infiltration from vast agricultural lands and by the discharge of surrounding rural and urban wastewater. Mercury pollution, in turn, is generally associated with the lamp manufacturing industry. In this regard, there is a large lamp production plant called Electrolamp in Yerevan, Armenia, whose waste has been released into adjacent water resources in recent years.

PC2 explained 25.7% of the variation, with turbidity (the physical variable) and COD, BOD, As, and SO4 (the chemical variables) as the main variables of water quality in this component. The physical parameter in this component was attributed to soil erosion in the watershed, and the chemical parameters to anthropogenic pollution sources (discharge of industrial and mining wastewater). Although mining activities and mineral processing (iron, copper, molybdenum, lead, zinc, gold, silver, and aluminum) have decreased in Armenia, the threat of pollution from them remains. Accumulated tailings from past activities are washed away by rain and eventually discharged into the rivers as acidic water rich in all kinds of heavy metals. In addition, the use of various organic and inorganic chemicals to separate metals from deposits in the active mines of this country causes the discharge of various chemicals and heavy metals into water sources. These mines are generally located in the south of Armenia, on the border with Iran (East Azerbaijan province).

In PC3, pH, EC, and BOD were the most important parameters, which could be attributed to anthropogenic pollution sources (discharge of agricultural, industrial, and mineral pollutants), watershed erosion, and the composition of the river bed.

A temporal and spatial evaluation of quality changes in the Zohreh River, Iran, using PCA showed that the quality changes of the river were due to natural and anthropogenic pollution sources (Ravanbakhsh et al. 2019). Gull et al. (2023) used multivariate statistical methods to upgrade the Jhelum River water monitoring system and reported that the first and second principal components pointed to human sewage and soil erosion, respectively, while the third principal component pointed to the intrinsic composition of the river bed material. Solgi & Sheikhzadeh (2016) monitored the physicochemical water quality of the Aras River in Ardebil province, Iran, and concluded that the water quality of this river was mainly affected by agricultural runoff and municipal wastewater.

Figure 4

Three-dimensional scatter plot of 11 parameters at 19 stations of the Aras River relative to the first, second, and third principal components.

Table 4 presents the PCC matrix for physical and chemical parameters. As is clear from the correlations between the parameters, among the physical parameters, EC and pH had a statistically significant correlation with DO (P < 0.01). Among the chemical parameters, a significant correlation (P < 0.01) was found between BOD and COD, between Hg and NO3, and between NO3 and SO4, and there was also a significant correlation (P < 0.05) between As and COD and between COD and SO4. Thus, the canonical correlation results (Table 3) were consistent with the correlations between variables in the PCC test (Table 4). Similarly, Neissi & Tishehzan (2018) investigated the physicochemical parameters affecting the Dez River water quality in Iran and reported that the physical variables EC and temperature and the chemical variables SAR, COD, SO4, and NO3 were confirmed as influential by both canonical and Pearson correlations.

Table 4

Correlation matrix of physicochemical variables studied in the Aras River in Iran

Parameters  DO        EC        pH        Turb.     As        BOD       COD       Hg        NO3       SO4
DO
EC          0.002**
pH          0.001**   0.107
Turb.       0.420     0.673     0.770
As          0.911     0.367     0.451     0.344
BOD         0.577     0.317     0.652     0.322     0.324
COD         0.668     0.465     0.561     0.055*    0.028*    0.002*
Hg          0.001**   0.012*    0.052*    0.128     0.766     0.381     0.504
NO3         0.000**   0.055*    0.000**   0.696     0.814     0.793     0.914     0.015*
SO4         0.040*    0.260     0.000**   0.150     0.174     0.061     0.036*    0.331     0.017*

* Significant at the 5% level

** Significant at the 1% level
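
How a Pearson correlation and its significance stars can be obtained is sketched below using only the Python standard library. Everything in it is illustrative: the DO and pH values are made-up numbers, not the study's measurements, and the sample size of 19 (one value per station) is an assumption. The critical values are the standard two-tailed Student t values for 17 degrees of freedom.

```python
import math

# Two-tailed critical t values (5% level, 1% level) keyed by degrees of freedom.
# Only df = 17 is listed, matching the assumed n = 19 (one value per station).
CRITICAL_T = {17: (2.110, 2.898)}


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length, at least 3 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def significance_label(r, n):
    """Return '**' (1% level), '*' (5% level), or '' (not significant)."""
    df = n - 2
    if df not in CRITICAL_T:
        raise ValueError(f"no critical values stored for df={df}")
    t_5pct, t_1pct = CRITICAL_T[df]
    t_abs = abs(t_statistic(r, n))
    if t_abs > t_1pct:
        return "**"
    if t_abs > t_5pct:
        return "*"
    return ""


# Hypothetical station values, for illustration only.
do_values = [7.9, 8.1, 8.4, 8.0, 7.6, 7.2, 7.5, 7.8, 8.3, 8.6,
             8.2, 7.7, 7.4, 7.1, 7.3, 7.9, 8.5, 8.8, 8.0]
ph_values = [7.6, 7.8, 8.0, 7.7, 7.5, 7.2, 7.4, 7.5, 7.9, 8.1,
             7.9, 7.6, 7.3, 7.2, 7.2, 7.7, 8.0, 8.2, 7.8]

r = pearson_r(do_values, ph_values)
print(f"r = {r:.3f}{significance_label(r, len(do_values))}")
```

A statistics package would normally report an exact P value. The lookup of critical values is used here only to keep the sketch free of external dependencies.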

The Aras River is of vital importance in supplying drinking, agricultural, and industrial water in Iran. Meanwhile, many pollutants are discharged into this river from different pollution sources in Iran, Azerbaijan, Armenia, and Turkey. Therefore, it was necessary to evaluate the water quality of this river in a comprehensive and integrated manner along its entire course within Iran. The results of applying CCA and PCA to the water quality management of this river indicated that DO, pH, and EC (physical parameters) and NO3, SO4, and Hg (chemical parameters) had a strong impact on water quality, which was attributed to various natural and anthropogenic pollution sources. These sources were residential and industrial sewage, mining effluent, runoff from vast agricultural lands, and the erosion and destruction of pastures and forests in the catchment area of the river across the neighboring countries.

Therefore, special attention must be paid to reducing pollution in this important river. The Aras River water quality management programs should focus more on anthropogenic pollution sources and activities. Based on the research findings, the following suggestions are made for controlling the water quality of the river under study:

  1. Preventing effluent and drainage from mining, mineral processing, and active and abandoned mines from entering the river.

  2. Construction and improvement of residential and industrial wastewater treatment systems.

  3. Proper management to control the excessive use of agricultural fertilizers and pesticides, and establishment of a pretreatment system for agricultural drainage before it is released into the river.

  4. Serious effort in controlling floods and soil erosion and implementing soil protection and watershed management plans.

  5. Compilation of legally binding treaties and conventions on protecting the quality of river water, emphasizing the principle of ‘prohibition of causing damage to the territory of another state’.

  6. Creation of an ‘international binding legal mechanism for toxic parameters’.

  7. Establishment of a joint online quality monitoring system and exchange of monitoring information between neighboring countries.

Based on the research results, the CCA and PCA models performed with high accuracy in determining the correlation between parameters and in identifying the most effective water quality parameters, respectively, and provided valuable information. Therefore, these techniques can be used to select influential variables when developing water quality indicators for evaluating water resources, especially in different geographical locations. These methods also make the information needed to select variables for a comprehensive assessment of water resource quality readily accessible.

This research has been extracted from the Ph.D. dissertation written by M.M.S. in Environmental Engineering and Science at the Islamic Azad University, Ardabil Branch. The authors thank the president and the educational and research deputies of Ardabil Islamic Azad University for their cooperation in facilitating the implementation of this project.

No funding.

M.M.S. conceptualized the study; investigated the study; visualized the study; and wrote the article; A.A.I. did formal analysis; M.M.S. acquired funds. M.S. performed the methodology. E.F. did project administration. H.B. collected resources. A.A.I. did software analysis. E.F. supervised the study and validated the study.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abdolabadi H., Ardestani M. & Hasanlou H. 2014 Evaluation of water quality parameters using multivariate statistical analysis (Case study: Atrak River). Journal of Water and Wastewater 25 (3), 110–117.

Banerjee M., Mukherjee J., Banerjee A., Roy M., Bandyopdhyay G. & Ray S. 2015 Impact of environmental factors on maintaining water quality of Bakreswar reservoir, India. Computational Ecology and Software 5 (3), 239–253.

Boyacıoğlu H., Gündogdu V. & Boyacıoğlu H. 2013 Investigation of priorities in water quality management based on correlations and variations. Marine Pollution Bulletin 69 (1, 2), 48–54.

Cheng Chan M., Lou I., Kin Ung W. & Meng Mok K. 2013 Integrating principle component analysis and canonical correlation analysis for monitoring water quality in storage reservoir. Applied Mechanics and Materials 284–287, 1458–1462.

Debnath P., Roy S., Bharadwaj S., Hore S., Nath H., Mitra S. & Ciobotaru A.-M. 2023 Application of multivariable statistical and geo-spatial techniques for evaluation of water quality of Rudrasagar wetland, the Ramsar Site of India. Water 15, 4109. https://doi.org/10.3390/w15234109.

Eskandare A., Noure R., Rasoule A. & Vesale M. 2015 Offering a suitable method for water quality management of the Sefidrood River based on canonical correlation analysis. Environmental Researches 5 (9), 79–86.

Fataei E. & Shiralipoor S. 2011 Evaluation of surface water quality using cluster analysis: A case study. World Journal of Fish and Marine Sciences 3 (5), 366–370.

Gull S., Shah S. R. & Dar A. M. 2023 Assessment and interpretation of surface water quality in Jhelum River and its tributaries using multivariate statistical methods. Environmental Monitoring and Assessment 195 (6), 746. doi:10.1007/s10661-023-11346-y. PMID: 37237143.

Hashemi S. H., Ranjkesh Y., Ramezani S. & Ghasemi Ziarani E. 2012 A comparative analysis on water quality of Karaj River through Statistical Technique (Factor Analysis) and QUAL2K model. In: National Conference on Water Flow and Pollution. University of Tehran, Tehran, Iran.

Keskin S. & Yasar F. 2007 Use of canonical correlation analysis for determination of correlations among several traits in eggplant (Solanum melongena L.) under salt stress. Pakistan Journal of Botany 39 (5), 1547–1552.

Khoshnoodmotlagh S., Jochem Verrelst S., Daneshi A., Mirzaei M., Azadi H., Haghighi M., Hatamimanesh M. & Marofi S. 2020 Transboundary basins need more attention: anthropogenic impacts on land cover changes in Aras River Basin, monitoring and prediction. Remote Sensing 12 (20), 3329. https://doi.org/10.3390/rs12203329.

Mackialeagha M., Salarian M. B. & Behbahaninia A. 2022 The use of multivariate statistical methods for the classification of groundwater quality: A case study of aqueducts in the east of Tehran, Iran. Anthropogenic Pollution 6 (2), 1–9. doi:10.22034/ap.2022.1965587.1134.

Marengo E., Carla Gennaro M., Elisa R., Maiocchi A., Pavese G., Indaco A. & Rainero A. 2008 Statistical analysis of groundwater distribution in Alessandria Province (Piedmont-Italy). Microchemical Journal 88, 167–177.

Mohamed I., Othman F., Ibrahim A. L., Alaa-Eldin M. E. & Yunus R. M. 2015 Assessment of water quality parameters using multivariate analysis for Klang River basin, Malaysia. Environmental Monitoring and Assessment 187 (1), 41–82.

Nash M. S. & Chaloud D. J. 2002 Multivariate Analyses (Canonical Correlation and Partial Least Square (PLS)) to Model and Assess the Association of Landscape Metrics to Surface Water Chemical and Biological Properties Using Savannah River Basin Data. Report: U.S. Environmental Protection Agency, Las Vegas, NV, USA.

Neissi L. & Tishehzan P. 2018 Dez River water quality assessment by using multivariate statistical methods. Irrigation and Water Engineering 9 (1), 139–150.

Noori R., Sabahi M. S., Karbassi A. R., Baghvand A. & Taati-Zadeh H. 2010 Multivariate statistical analysis of surface water quality based on correlations and variations in the data set. Desalination 260, 129–136.

Rahnama S. & Sayari N. 2019 Survey and trends of chemical water quality parameters of Tajan River water quality using principal component analysis and aqua chem software. Human & Environment 17 (1), 13–25.

Rasi Nezami S., Nazariha M., Baghvand A. & Moridi A. 2013 Karkheh river water quality using multivariate statistical analysis and qualitative data variations. Health System Research 8 (7), 1280–1292.

Ravanbakhsh M., Tahmasebi Birgani Y., Dastoorpoor M. & Ahmadi Angali K. 2019 Evaluation of temporal and spatial variations of water quality parameters in Zohreh River, Iran. Avicenna Journal of Environmental Health Engineering 6 (2), 75–82.

Safizadeh E., Karimi D., Gahfarzadeh H. R. & Pourhashemi S. A. 2021 Investigation of physicochemical properties of water in downstream areas of selected dams in Aras catchment and water quality assessment (Case study: Aras catchment in the border area of Iran and Armenia). Anthropogenic Pollution 5 (1), 41–48. doi:10.22034/ap.2021.1912491.1082.

Sajjadi N., Davoodi M. & Jozi S. 2019 The quality assessment of Kan River's resources in terms of agricultural and drinking purposes. Anthropogenic Pollution 3 (1), 46–53. doi:10.22034/ap.2019.582368.1036.

Samadi M. T., Sadeghi S., Rahmani A. & Saghi M. H. 2015 Survey of water quality in Moradbeik river basis on WQI index by GIS. Environmental Health Engineering and Management Journal 2 (1), 7–11.

Sayadi M. & Gharehmahmoodlu M. 2019 Investigation and prediction of quality parameters of Gamasyab River using multivariate method of canonical correlation analysis and time series. Iranian Journal of Research in Environmental Health 5 (2), 108–122.

Solgi E. & Sheikhzadeh H. 2016 Study of water quality of Aras River using physico-chemical variables. Iran-Water Resources Research 12 (3), 207–213.

Soltani S., Ghohroudi Tali M. & Sadoogh S. 2019 Evaluation of the surface water quality using statistical multi-variate techniques, case study: Aras watershed. Iran-Water Resources Research 15 (2), 319–328.

Tabachnick B. L. & Fidell S. 2000 Using Multivariate Statistics. A Pearson education Company, Needham Heights, USA, p. 966.

Valiallahi J. 2022 Groundwater quality zoning based on wilcox index using geographic information system in jajarm district, north Khorasan, Iran. Anthropogenic Pollution 6 (2), 16–24. doi:10.22034/ap.2023.1964980.1136.

Yaganoglu E., Yaganoglu A. M., Arslan G. & Sonmez A. Y. 2020 Determination of spatial and temporal changes in surface water quality of Filyos River (Turkey) using principal component analysis and cluster analysis. Marine Science & Technology Bulletin 9 (2), 207–214. https://doi.org/10.33714/masteb.784959.

Zamani A., Pari Zanganeh A. H. & Khandoozi F. 2016 Canonical correlation analysis to determine the relationship between water quality parameters and heavy metals in water samples (Ramian City-Golestan Province). Journal of Water and Soil Conservation 23 (4), 267–280. doi:10.22069/jwfst.2016.9936.2431.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).