With growing urbanization, water contamination has become a serious problem. Water quality is conventionally assessed using physicochemical parameters, which require manual sample collection. Moreover, physicochemical parameters alone are insufficient for water quality monitoring, because heavy rainfall and an abundance of air pollutants also cause water pollution. This study therefore modifies water quality monitoring by treating natural factors as influencing parameters and by using the latest technology for easy sampling with global coverage. It investigates the Rawal watershed using (a) physicochemical parameters, (b) air pollutants such as nitrogen dioxide (NO2), and (c) meteorological variables such as wind speed for June 2018 to September 2022. Correlation and regression analyses are performed. The results show negative correlations of NO2 with total dissolved solids (TDS) (range, 0.51–0.85), turbidity (range, 0.53–0.65), pH (range, 0.5–0.75), and dissolved oxygen (DO) (range, 0.5–0.82), and a positive correlation with electrical conductivity (EC) (range, 0.54–0.85). Regression analysis with LightGBM, multi-layer perceptron (MLP), and support vector machine (SVM) is applied with air pollutants and meteorological parameters as independent variables, giving root-mean-square errors (RMSE) ranging from 0.015 to 0.18. MLP gave an RMSE of 0.18 and 0.003 for TDS and pH, respectively. SVM performed well for DO, turbidity, and EC, with RMSE ranging from 0.015 to 0.027. Moreover, the floods of August 2022 are taken as a case study.

  • Impact assessment of air pollutants on physicochemical parameters.

  • Meteorological features can have a moderate impact on water quality, i.e., wind speed with chl-α, EC, DO, and TDS, and air temperature with DO and TDS in August and September.

  • Machine learning approaches, i.e., LightGBM, MLP, and SVM, are applied for the analysis.

  • Floods can have a negative impact on water quality by introducing an excess of pollutants and nutrients into the water.

With the growing human population, the need for production of goods and other resources is increasing rapidly, giving rise to the pollution of water and air and in turn affecting the entire ecosystem and human health (Fuller et al. 2022). Water quality is affected by agricultural, industrial, and urban anthropogenic activities that cause large quantities of pollutants, which may include nutrients, pathogens, and toxins, to enter surface waters. According to the latest report by the United Nations (UN), more than 80% of the wastewater discharged into rivers is a result of human activities (Nations 2022). The discharge of wastewater (e.g., brine) degrades the quality of water, which then cannot be directly used as potable water (via desalination) or in industrial applications (Panagopoulos 2021, 2022; Panagopoulos & Giannika 2022). Along with the anthropogenic causes of water contamination, the effects of air pollution and climatic changes on water quality cannot be ignored (Kan et al. 2012; Matyssek et al. 2012). An abundance of air pollutants that contain nitrogen (N) can accelerate nutrient pollution or eutrophication in the water, resulting in a complex chain of events that harms the aquatic ecosystem (National Research Council 2000; Nie et al. 2018). In addition, high concentrations of atmospheric carbon dioxide (CO2) can increase biological productivity in water bodies and lead to acidification (Doney et al. 2009), which has a direct or indirect impact on marine organisms. Besides the effects of air pollutants, climatic changes are a cause for concern for water contamination, as they alter the water cycle (Stanković et al. 2019). Over the years, meteorological events such as heavy precipitation and floods, which affect water quality, have intensified with climatic changes (Puczko & Jekatierynczuk-Rudczyk 2020).

Pakistan suffers from contaminated surface water and groundwater because of its increasing urban population, which has resulted in a rise in agricultural and industrial pollution (Fatima et al. 2022; Mehmood et al. 2022). The resulting polluted water is the cause of major waterborne diseases such as typhoid and cholera (Shah et al. 2016). Like most countries, Pakistan relies on grab sampling, i.e., manually collecting samples on location, for water quality assessment and management activities (Ahmed et al. 2021). This is a hard and time-consuming task that depends on manual labour and is limited to collecting samples from the inlet and outlet streams of the sampling sites. Remote sensing is a newer approach that can enhance the sampling process by acquiring data at high spatial and temporal resolution from thousands of sampling points at a time (Usali & Ismail 2010; Gholizadeh et al. 2016). Moreover, remote sensing technology is also being used to monitor the atmosphere (Martin 2008; Yang et al. 2017). This brings an opportunity to analyse the air pollutants, meteorological factors, and other factors affecting the physicochemical parameters for any location at any time.

The associations amongst the air pollutants, meteorological factors, and physicochemical water quality parameters are not well established, as the influence of natural factors on water health is not backed by concrete evidence. To prove the link among such factors, samples must be collected manually with tools and equipment on a continuous basis, which is a complex and time-consuming task. In this study, remote sensing and data reanalysis techniques are proposed as a solution to this data collection problem. Physicochemical parameters alone are insufficient to determine the overall water quality because naturally occurring phenomena, such as heavy precipitation and an abundance of air pollutants, can decompose and transport harmful pollutants or nutrients to water bodies. Thus, the present study proposes an improved water quality monitoring model based on a hybrid of remote sensing and data mining techniques, with a unique set of monitoring data that considers natural factors as influencing parameters and uses the latest technology as a source for easy sample collection with global coverage. Three categories of data, i.e., (a) air pollutants, (b) meteorological parameters, and (c) physicochemical parameters, are collected for the monsoon months (June to September) of the years 2018–2022 for the stream network of Rawal watershed, located in Islamabad, Pakistan. 
The unique set of data encompasses a total of 16 parameters: six air pollutants, namely, (i) carbon monoxide (CO), (ii) nitrogen dioxide (NO2), (iii) ozone (O3), (iv) sulphur dioxide (SO2), (v) formaldehyde (HCHO), and (vi) methane (CH4), acquired from the Sentinel-5 Precursor Level 2 (S5P-L2) TROPOspheric Monitoring Instrument (TROPOMI); three meteorological parameters, namely, (i) air temperature, (ii) wind speed, and (iii) total precipitation, taken from the ERA5 Climate Reanalysis Project (ERA5-CRP); and seven physicochemical parameters, namely, (i) total dissolved solids (TDS), (ii) pH, (iii) electrical conductivity (EC), (iv) Secchi disk depth (SDD), (v) dissolved oxygen (DO), (vi) turbidity (Tur), and (vii) chlorophyll-α (chl-α), acquired from the Sentinel-2 Multispectral Imager (S2-MSI) Level 1C (L1C) product. Pearson correlation analysis and regression analysis using three machine learning techniques, i.e., LightGBM (LGBM), multi-layer perceptron (MLP), and support vector machine (SVM), are performed on the acquired data to explore the interrelationships among the three categories of data. The air pollutants and meteorological parameters are taken as independent variables, and the physicochemical water quality features are taken as dependent variables, giving lowest root-mean-square error (RMSE) values in the range of 0.015–0.18. In addition, the extracted data are examined by applying the weighted arithmetic water quality index (WAWQI) method. The new hybrid approach has led to practical and globally applicable methods for analysing the associations among the features and monitoring the health of any water body. 
Moreover, the study reveals that the air pollutants and the meteorological parameters have a significant impact on water quality. In particular, an abundance of certain air pollutants such as NO2, which has an inverse correlation with the physicochemical parameters, causes a prominent disturbance in the concentration levels of parameters including DO, pH, and TDS, with correlations ranging from 0.61 to 0.86. Overall, the major contributions of the present study are summarized as follows:

  1. An improved water quality monitoring model based on a hybrid of remote sensing and data mining techniques is proposed with a total of 16 unique parameters that are extracted for the stream network of the Rawal watershed which include three categories of data, i.e., (a) air pollutants, (b) meteorological, and (c) physicochemical parameters pertaining to the years 2018–2022 for the monsoon months of June to September. Previously, these sets of parameters have not been used in water quality monitoring models.

  2. Correlation analysis is performed on the unique dataset extracted using remote sensing satellites and data reanalysis techniques to observe the dependencies amongst the natural factors and the physicochemical water quality parameters. This is the first study that has practically acquired remote sensing data, i.e., air pollutants and meteorological features for monitoring the water quality along with the physicochemical water quality parameters.

  3. Regression analysis is proposed using three machine learning techniques, i.e., LGBM, MLP, and SVM to further find any dependencies amongst the 16 features with the air pollutants and meteorological parameters taken as independent variables to predict the five physicochemical water quality features individually, i.e., TDS, DO, pH, Tur, and EC.
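The two quantities that recur in contributions 2 and 3, the Pearson correlation coefficient and the RMSE, can be illustrated with a minimal sketch. The function names are illustrative, and the numeric series are made-up placeholders rather than data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder series (hypothetical values, not from the study):
# an air pollutant that rises while a water quality parameter falls.
pollutant = [1.0, 2.0, 3.0, 4.0]
water_param = [8.0, 7.5, 6.0, 5.5]
r = pearson_r(pollutant, water_param)  # negative for this toy series
```

A negative `r` corresponds to the inverse relationships reported in the text, and `rmse` is the error measure used to compare the regression models.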

This article is organized as follows: Section 2 discusses the related work. Section 3 covers the materials and methods used, i.e., the study area, data collection, and pre-processing, and describes the proposed methodology for the correlation and regression analysis along with the indexing method. Next, the results of the correlation and regression analysis are elaborated in Section 4 along with the flood case study. In Section 5, the conclusion and future work of this research are presented.

The relationship amongst water quality, air pollutants, and meteorological features is significant and has been studied over the years. The deposition of air pollutants such as NO2 can influence aquatic ecosystems. Eutrophication in water bodies is attributed to high concentrations of NO2 emissions (Lee & Schwartz 1981). In theory, eutrophication can result in a complex chain of events that disturbs water health (National Research Council 2000; Nie et al. 2018). However, to support this theory, one needs access to tools and equipment that can help prove how much these natural factors impact water health. In general, this is a complex task, as these tools require manual collection and the acquired data are not available in real time. Thus, even though the literature suggests the presence of complex interactions among the air pollutants, meteorological parameters, and physicochemical parameters, there is a gap between the suggested associations and the amount of work available that supports these theories and reflects the influence of air pollutants and air temperature on water quality. Moreover, no comparisons between such parameters using modern technology, such as remote sensing or machine learning techniques, were found. Most of the studies have gathered data from monitoring stations to analyse the air and water pollution of their respective study areas separately. These studies investigate the pollution level of the location using statistical methods such as the generalized additive model (GAM) and other GIS-based techniques (Balaji et al. 2022; Ruhela et al. 2022). Table 1 summarizes the data acquisition, methodology, and parameters used to address the relationships amongst meteorological, air, and water quality parameters.

Table 1

Previous work on relationship amongst meteorological, air, and water quality parameters

Paper | Study area | Time period | Data acquisition | Parameters | Methodology | Results
Balaji et al. (2022) | Madurai city, India | 2006 and 2020 | Tamil Nadu Pollution Control Board | Physicochemical, particulate matter, and lead | Spatial interpolation technique | Higher than prescribed limits
Zhang et al. (2017) | Tianjin, China | 2000–2011 | Water quality monitoring station, meteorological station | Suspended solids, total dissolved solids, wind speed, rainfall, and solar radiation | GAM | Positive correlations between suspended solids and meteorological parameters
Zhang & Zhi (2020) | Lake Erhai, China | 1999–2012 | China meteorological data network, Yunnan Province Environmental Bulletin | Physicochemical, air temperature, precipitation, wind speed, and sunshine hours | GAM | Lower rainfall leads to poor water quality; total nitrogen increases with air temperature
Zhang et al. (2021) | Lake Okeechobee, USA | January 1996–December 2010 | Eight sampling sites by South Florida Water Management District | Physicochemical | GAM and random forest | Total nitrogen as predictor for chl-α
Gintamo et al. (2021) | Cape Town, South Africa | 1979–2018 | South Africa Weather Services, National Groundwater Archive of South Africa | Physicochemical, temperature, precipitation | GIS | Decrease in water quality with high temperature and precipitation

In addition to climatic changes and weather conditions, certain landscape changes can also worsen environmental quality, as suggested by Chen et al. (2021), who examined the impact of river dust on the PM10 concentrations for the downstream areas of the Da'an and Dajia rivers. The results reveal that PM10 concentrations have increased during wet and dry seasons (Chen et al. 2021). Along with the traditionally used physicochemical features that are extracted by applying remote sensing techniques, other environmental factors can also be derived from satellite images. These include topographical parameters that can be extracted from Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) data. Beeson et al. (2014) extracted slope and other topographic data from a DEM and concluded that special attention should be paid to the selection of spatial resolution and input source, as these keep changing with advancements in remote sensing and can prove critical in water quality models. Oyedotun (2019) examined land use changes for Chaohu Lake using Landsat MSS and OLI/TIRS images of 1979–2015. The results showed a 25% increase in built-up area, causing degradation of the basin due to improper land use activities. Another study (Oyedotun & Timothy 2022) extracted hydrological parameters such as the drainage network for Chaohu Lake using SRTM DEM data created with Landsat MSS and OLI/TIRS images for the period 1979–2015. The study focused on analysing the dynamics of the different streams by extracting the changes in land use patterns.

Nonetheless, studies report the separate acquisition of physicochemical water quality parameters and air pollutants using remote sensing. The physicochemical parameters are gathered from high-resolution images from Landsat (Mohsen et al. 2021), Sentinel (Oiry & Barillé 2021), and MODIS satellites (Arıman 2021). Using semi-analytical methods, patterns are found that relate the physicochemical parameters to the satellite image bands and wavelengths, yielding equations that can estimate these parameters, including temperature (Ritchie et al. 2003), total suspended solids (Imen et al. 2015; Sharaf El Din 2020), chl-α (Gitelson & Merzlyak 1998; Liu et al. 2018; Xu et al. 2019), Tur (Harrington et al. 1992; Kapalanga 2015; Lim & Choi 2015), and DO (Theologou et al. 2015; Ahmed et al. 2022a). Similarly, air pollutants such as CO, SO2, NO2, and PM10 have been estimated using S5P-L2 (Al-Alola et al. 2022) and MODIS satellites (Dinoi et al. 2010).
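The idea of relating a measured parameter to image bands can be illustrated with the simplest possible case: an ordinary least-squares fit of the form y = a + b·(R_i/R_j), where R_i/R_j is a band ratio. This is only a sketch under assumptions; the single-ratio linear form, the function name, and the toy numbers are illustrative and are not taken from the cited studies.

```python
def fit_band_ratio(ratios, measured):
    """Ordinary least-squares fit of measured = a + b * ratio.

    Returns the intercept a and slope b.
    """
    n = len(ratios)
    if n != len(measured) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(ratios) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in ratios)
    if sxx == 0:
        raise ValueError("band ratios must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ratios, measured))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data generated from y = 2 + 3 * ratio (hypothetical values),
# so the fit should recover those two coefficients.
ratios = [0.5, 1.0, 1.5, 2.0]
measured = [3.5, 5.0, 6.5, 8.0]
a, b = fit_band_ratio(ratios, measured)
```

Published equations typically combine several bands or ratios, in which case the same idea extends to multiple linear regression.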

The literature has revealed that modern technologies such as remote sensing and data mining techniques offer a more robust and economical means of acquiring parameters. In addition, there are studies that discuss the influence of air pollutants and meteorological features on the physicochemical parameters of water. However, no evidence was found of such technologies, i.e., remote sensing and data mining techniques, being used to analyse the relationships amongst these features in order to enhance water quality monitoring and management. Thus, in this study, a hybrid of remote sensing and data mining techniques is employed to investigate the influence of air pollutants and meteorological variables on the physicochemical water quality parameters through correlation and regression analysis. Unlike previous studies that have assessed the air and water pollution of a study area separately, this study evaluates the interactions amongst the meteorological, air, and water quality parameters. In addition, the data are acquired through remote sensing and data reanalysis techniques on a large scale from multiple sample points of the study area, unlike the traditional collection from inlet or outlet points of the streams and selected air/water quality monitoring stations. This gives access to a substantial set of data points that are used to perform data analytics and establish strong associations among the extracted parameters.

This article analyses the associations among the air, meteorological, and physicochemical parameters for the study area of Rawal watershed stream network. The steps and the respective methods used are discussed in this section.

Study area

Rawal watershed (Ali et al. 2013) is located at latitude 33° 42′ N and longitude 73° 7′ E and lies in the capital city of Islamabad. With a population of 1.2 million and a total area of 906.50 km², Islamabad is the ninth most populous city of the nation. It has a humid subtropical climate with an average annual temperature of 28.5 °C. The cool winter season in Islamabad lasts for 3 months (December to March), with the lowest temperature of 4 °C in January, and the hot, humid summer (May to August) has its highest temperature of 38 °C in June. The monsoon season (June to September) brings an annual average precipitation of 1,143 mm. July has the most wet days, with an average of 15.2 days receiving at least 1.016 mm of precipitation. The average annual humidity of Islamabad is 77% (Weather by month 2022). The air quality of Islamabad lies in the ‘Unhealthy for Sensitive Groups’ category with an air quality index of 115. However, it was termed the cleanest city of Pakistan for the year 2019, with an annual average PM2.5 reading of 35.2 μg/m³. This makes it the tenth cleanest city of the nation, although the air quality status still remains unsafe for young children and inhabitants with sensitive health conditions (PM2.5 2022).

Rawal watershed is modified using hydrological GIS tools to extract the water bodies from the area, giving the Rawal Stream Network. This stream network is extracted from the DEM that is created with SRTM (Van Zyl 2001) data. Figure 1 shows the location map of the study area, i.e., Rawal watershed and the resultant stream network located in Islamabad, Pakistan.
Figure 1

(a) Map of Pakistan, (b) Rawal watershed, and (c) Rawal stream network.


Data collection

Three types of features are acquired in the current study: (a) air pollutants, (b) meteorological factors, and (c) physicochemical water quality features. The air pollutant data for the Rawal stream network are extracted from S5P-L2 satellite images (Veefkind et al. 2012) and comprise the concentrations of six pollutants: CO, NO2, ozone (O3), SO2, formaldehyde (HCHO), and methane (CH4). These data come from the TROPOMI imager of the S5P satellite, which operates with a swath of 2,600 km, a spatial resolution of 3.5 km × 7 km, four spectrometers, i.e., ultraviolet (UV), UV–visible (UV-VIS), near-infrared (NIR), and short-wave infrared (SWIR), and eight spectral bands. The NO2 concentrations are derived by measuring the solar light that is backscattered by the Earth's atmosphere using the UV and UV-VIS spectrometers. Band 4 is used for NO2 retrieval (Van Geffen et al. 2020). The HCHO, O3, and SO2 concentrations are derived from Band 3 of the UV-VIS spectrometer (Theys et al. 2017; De Smedt et al. 2018; Garane et al. 2019). Band 7 of the SWIR spectrometer is used to measure CH4 and CO concentrations (Magro et al. 2021). The details of S5P-L2 are given in Table 2. Table 3 describes the parameters along with their units, sources, effects, water solubility, and other details. The meteorological parameters, namely, air temperature, wind speed, and total precipitation, are extracted from ERA5-CRP of the Copernicus Climate Change Service (C3S). ERA5-CRP (Hersbach et al. 2020) has a Climate Data Store with a detailed record of the global atmosphere, ocean waves, and the land surface. It combines historical observations with advanced modelling through data assimilation into a globally consistent dataset. ERA5-CRP has a spatial resolution of 31 km and is intended to cover the period from 1950 to real time. However, only data from 1979 to July 2020 were available at the time of the study. 
The daily aggregates of total precipitation and the daily averages of wind speed and air temperature are taken for the study. The air temperature is taken at 2 m above the Earth's surface. The wind speed is taken at 10 m above the Earth's surface and represents the northward component of the neutral wind. Total precipitation is the accumulated rain and snow water that falls on the Earth's surface. Seven physicochemical parameters, namely, TDS, pH, EC, SDD, DO, Tur, and chlorophyll-α (chl-α), are extracted from S2-MSI Level 1C (L1C) (Baillarin et al. 2012). The S2 satellite carries the MSI imager, which has a swath width of 290 km. The L1C product contains geo-located top-of-atmosphere reflectance scaled by a factor of 10,000. The physicochemical parameters are estimated by applying adapted equations to the Sentinel images. Ground truth data from the Rawal Dam Water Filtration Plant have been used to verify the satellite-derived results. For each parameter, the equation with the lowest RMSE is selected as the adapted equation for the study area. The selected equation and the respective RMSE for each parameter are given in Table 4.
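The daily aggregation described above can be sketched as follows: hourly records are grouped by calendar date, precipitation is summed, and air temperature and wind speed are averaged. The record layout (a tuple of ISO timestamp, temperature in K, wind speed in m/s, and precipitation in m) and the sample values are assumptions made for illustration; they do not represent the ERA5 file format or the study's data.

```python
from collections import defaultdict
from datetime import datetime


def daily_aggregates(hourly_records):
    """Aggregate hourly records to daily values.

    hourly_records: iterable of (iso_timestamp, temp_k, wind_ms, precip_m).
    Returns {date: (mean_temp_k, mean_wind_ms, total_precip_m)}.
    """
    by_day = defaultdict(list)
    for timestamp, temp_k, wind_ms, precip_m in hourly_records:
        day = datetime.fromisoformat(timestamp).date()
        by_day[day].append((temp_k, wind_ms, precip_m))

    result = {}
    for day, rows in by_day.items():
        count = len(rows)
        result[day] = (
            sum(row[0] for row in rows) / count,  # mean air temperature
            sum(row[1] for row in rows) / count,  # mean wind speed
            sum(row[2] for row in rows),          # accumulated precipitation
        )
    return result


# Hypothetical hourly values for two days.
records = [
    ("2018-07-05T00:00:00", 300.0, 1.0, 0.001),
    ("2018-07-05T01:00:00", 302.0, 3.0, 0.002),
    ("2018-07-06T00:00:00", 299.0, 2.0, 0.0),
]
daily = daily_aggregates(records)
```

Summing precipitation while averaging the other two variables mirrors the distinction the text draws between aggregates and averages.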

Table 2

Product description

Product | Launch date | Coverage/cycle/revisit time | Resolution | Bands | Spectral range (nm)
S5P-L2 | 13 October 2017 | Global, <1 d, 16 d | 3.5 × 7 km² (launch date – August 2019); 3.5 × 5.5 km² (6 August 2019 to present) | UV (Bands 1, 2); UV-VIS (Bands 3, 4) | 270–495
 |  |  |  | NIR (Bands 5, 6) | 675–775
 |  |  |  | SWIR (Bands 7, 8) | 2,305–2,385
ERA5-CRP | Early 2020 | Global, 1 h | 0.28 × 0.28 (31) km² | – | –
S2-MSI L1C | 23 June 2015 | Global, 5 d | 10–60 m | Ultra-Blue, Blue, Green, Red | 443–665
 |  |  |  | VNIR (Bands 5, 6, 7, 8, 8a) | 705–865
 |  |  |  | SWIR (Bands 9, 10, 11, 12) | 940–2,190
Table 3

Six air pollutants, three meteorological, and seven physicochemical water quality parameters with their units and description

Category | Variable (unit) | Availability time period | Description
Air pollutants | CO (mol/m²) | 2018/06/28 − 2022/09/17 | Sources: Combustion of fossil fuels, atmospheric oxidation of methane and other hydrocarbons, exhausts of motor vehicles. Effects: Unintentional and suicidal poisonings, dizziness, confusion, unconsciousness, and death. Water solubility: Poor
 | NO2 (mol/m²) | 2018/06/28 − 2022/09/10 | Sources: Burning of fuel, emissions from cars, trucks and buses, power plants, animal manure, precipitation falls across hard surfaces. Effects: Acid rain, nutrient pollution, algae blooms, ozone, smog. Water solubility: High, forms HNO3, forms nitrogen monoxide (NO)
 | O3 (mol/m²) | 2018/09/08 − 2022/09/17 | Sources: Volatile organic compounds and nitrogen oxides; chemical plants, gasoline pumps, oil-based paints. Effects: Sensitive vegetation, smog. Water solubility: Partial, forms OH-radicals
 | SO2 (mol/m²) | 2018/12/05 − 2022/09/17 | Sources: Combustion of fossil fuels, steel making, fertilizer manufacturing. Effects: Respiratory problems. Water solubility: High, forms sulphuric acid
 | HCHO (mol/m²) |  | Sources: Oxidation of hydrocarbons, decomposition of plant residues, automotive exhaust, cigarette smoke. Effects: Allergic reaction, certain types of cancer. Water solubility: High, forms methylene glycol (CH₂(OH)₂)
 | CH4 (parts per billion (ppb)) | 2019/02/08 − 2022/09/17 | Sources: Agricultural activities, biomass burning. Effects: Ozone. Water solubility: Almost insoluble
Meteorological | Air temperature (K) | 1979/01 − 2020/07 | Temperature of air at 2 m above the surface of land, sea, or inland waters
 | Total precipitation (m) |  | Accumulated liquid and frozen water comprising rain and snow
 | Wind speed (m s⁻¹) |  | Northward component of the ‘neutral wind’
Physicochemical | pH | 2015/06/23 − 2022/09/20 | Measure of hydrogen ion activity
 | TDS (mg/l) |  | Measure of organic and inorganic materials dissolved in water
 | Tur (NTU) |  | Measure of clarity of a liquid
 | SDD (m) |  | Measure of light penetration into a water body
 | chl-α (mg/l) |  | Measure of the amount of algae growing in a water body
 | DO (mg/l) |  | Measure of the degree of pollution by organic matter
 | EC (mS/cm) |  | Measure of water capacity to convey electric current
Table 4

Adapted equations for the physicochemical water quality parameters

Variable | Adapted equation | Reference | RMSE
Turbidity | 35.121 − 14.489 (R3/R4) − 0.911 (R8a) | Khattab & Merkel (2014) | 7.65 NTU
pH | 8.790 + 0.141 (R11) − 0.228 (R3/R4) | Abdullah (2015) | 3.36
EC | 422.034 − 1,080.365 (R11) | Abdullah (2015) | 228.7 mS/cm
chl-α | 54.658 + 520.451 (R2) − 1,221.89 (R3) + 611.115 (R4) − 198.199 (R8a) | Lim & Choi (2015) | 10.15 mg/l
DO | 10.841 − 0.682 (R1/R8a) − 0.002 (R2/R8a + (B2)) | Abdullah (2015) | 2.82 mg/l
TDS | 120.750 + 264.752 (R8a/R1) | Abdullah (2015) | 111.92 mg/l
SDD | 0.2 + 1.4 ln (R2/R4) | Deutsch et al. (2014) | 0.22 m
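As a worked illustration of Table 4, the sketch below converts S2-MSI L1C digital numbers to reflectance by dividing by the 10,000 scale factor and then evaluates three of the adapted equations (TDS, SDD, and turbidity). It assumes that R1, R2, R3, R4, and R8a denote the reflectance of the corresponding bands, which the table does not state explicitly. The dictionary layout, function names, and pixel values are hypothetical; only the coefficients come from Table 4.

```python
import math

SCALE = 10_000  # L1C reflectance scale factor stated in the text


def to_reflectance(digital_numbers):
    """Convert scaled L1C digital numbers to reflectance, band by band."""
    return {band: value / SCALE for band, value in digital_numbers.items()}


def tds_mg_l(r):
    """TDS equation from Table 4 (Abdullah 2015)."""
    return 120.750 + 264.752 * (r["R8a"] / r["R1"])


def sdd_m(r):
    """SDD equation from Table 4 (Deutsch et al. 2014)."""
    return 0.2 + 1.4 * math.log(r["R2"] / r["R4"])


def turbidity_ntu(r):
    """Turbidity equation from Table 4 (Khattab & Merkel 2014)."""
    return 35.121 - 14.489 * (r["R3"] / r["R4"]) - 0.911 * r["R8a"]


# Hypothetical digital numbers for a single pixel (not study data).
pixel = {"R1": 1000, "R2": 2000, "R3": 1500, "R4": 1000, "R8a": 500}
reflectance = to_reflectance(pixel)

tds = tds_mg_l(reflectance)
sdd = sdd_m(reflectance)
tur = turbidity_ntu(reflectance)
```

The remaining equations in Table 4 follow the same pattern; the DO equation is left out here because its `B2` term is not defined in the table.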

Data pre-processing

Google Earth Engine (Google earth engine 2022) is used to pre-process the satellite images and to extract parameters and sample points from the study area. The maps are prepared in ArcMap 10.8 (ArcGIS 2022). Because the satellite images cover a larger region, GIS clipping tools are used to select the target boundaries of the area of interest, i.e., the Rawal stream network. Once the images are clipped, the dates are matched across the three datasets. The matching dates, the respective parameters, and the number of samples retrieved are shown in Table 5. A total of 4,998 samples are extracted from the clipped images at the same latitude–longitude values for each monsoon month of the year (i.e., on the matching dates of the month) for the three categories of data: air pollutants, meteorological parameters, and physicochemical parameters. This gives a total of 84,966 samples for the pre-processed images of the study area. Once the data are compiled, the sample points extracted for each monsoon month (17 months in total, per Table 5) on the matching dates and latitude–longitude values are averaged to give a single dataset per month.
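The date matching and monthly averaging steps can be sketched as follows. Dates present in all three sources are kept, and samples that share the same month and latitude–longitude pair are averaged. The data layout (lists of ISO date strings, samples as tuples) and all values are hypothetical; the study itself performs these steps in Google Earth Engine.

```python
from collections import defaultdict


def matching_dates(*date_collections):
    """Return the sorted dates that appear in every collection."""
    if not date_collections:
        return []
    common = set(date_collections[0])
    for dates in date_collections[1:]:
        common &= set(dates)
    return sorted(common)


def monthly_means(samples):
    """Average sample values per (year-month, lat, lon).

    samples: iterable of (iso_date, lat, lon, value) tuples.
    """
    groups = defaultdict(list)
    for iso_date, lat, lon, value in samples:
        month = iso_date[:7]  # "YYYY-MM"
        groups[(month, lat, lon)].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical acquisition dates for the three sources.
s5p_dates = ["2018-07-05", "2018-07-06", "2018-07-30"]
s2_dates = ["2018-07-05", "2018-07-30"]
era5_dates = ["2018-07-05", "2018-07-06", "2018-07-30", "2018-07-31"]
dates = matching_dates(s5p_dates, s2_dates, era5_dates)

# Hypothetical values of one parameter at one sample point on the matched dates.
samples = [
    ("2018-07-05", 33.7, 73.1, 2.0),
    ("2018-07-30", 33.7, 73.1, 4.0),
]
means = monthly_means(samples)
```

Grouping on the latitude–longitude pair keeps one averaged value per sample point per month, which matches the fixed count of samples per month reported in the text.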

Table 5

Samples compiled with matching dates between S5P-L2, S2-MSI, and ERA5-CRP

| Year | Matching dates | No. of pre-processed images per month | No. of parameters | No. of samples per month | Total samples per year |
|---|---|---|---|---|---|
| 2018 | 2018/07/05, 2018/07/30; 2018/08/09, 2018/08/19, 2018/08/24, 2018/08/29; 2018/09/03, 2018/09/08, 2018/09/18 | 126 (S5P-L2), 9 (S2-MSI), 9 (ERA5-CRP) | 12 | 4,998 | 14,994 |
| 2019 | 2019/06/05, 2019/06/10, 2019/06/25; 2019/07/05, 2019/07/15, 2019/07/20; 2019/08/04, 2019/08/19, 2019/08/24; 2019/09/03, 2019/09/08, 2019/09/18 | 168 (S5P-L2), 12 (S2-MSI), 12 (ERA5-CRP) | 15 | 4,998 | 19,992 |
| 2020 | 2020/06/04, 2020/06/14, 2020/06/29; 2020/07/09, 2020/07/14, 2020/07/29; 2020/08/03, 2020/08/23; 2020/09/07, 2020/09/12, 2020/09/17, 2020/09/22, 2020/09/27 | 182 (S5P-L2), 13 (S2-MSI), 13 (ERA5-CRP) | 16 | 4,998 | 19,992 |
| 2021 | 2021/06/04, 2021/06/09, 2021/06/14, 2021/06/19, 2021/06/24, 2021/06/29; 2021/07/04, 2021/07/24; 2021/08/03, 2021/08/13, 2021/08/18; 2021/09/02, 2021/09/12, 2021/09/22, 2021/09/27 | 210 (S5P-L2), 15 (S2-MSI), 15 (ERA5-CRP) | 13 | 4,998 | 19,992 |
| 2022 | 2022/06/04, 2022/06/09, 2022/06/14, 2022/06/19, 2022/06/24, 2022/06/29; 2022/08/13, 2022/08/23 | 112 (S5P-L2), 8 (S2-MSI), 8 (ERA5-CRP) | 13 | 4,998 | 9,996 |
| Total number of samples | | | | | 84,966 |

Correlation and regression analysis

Pearson correlation analysis is a commonly used technique for measuring the linear relationship between two parameters: it gives the strength and direction with which one variable changes as the other changes. The correlation analysis is performed on the collected data samples to identify trends among the physicochemical water quality parameters, the meteorological parameters, and the air pollutants. Because data are not available on the same dates for every source, certain parameters, including methane, wind speed, and other meteorological features, are unavailable for certain years. Various parameter combinations are therefore analysed over a number of years to deduce the important patterns.
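As a minimal sketch of the Pearson coefficient described above, the function below computes r for two equally long series. The CO and pH values are made up for illustration and are not measurements from the study; the function name `pearson_r` is likewise only illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical values: CO column density (mol/m2) and pH at five sample points.
co = [0.032, 0.035, 0.038, 0.041, 0.044]
ph = [8.62, 8.58, 8.55, 8.50, 8.47]
r = pearson_r(co, ph)
print(f"r = {r:.3f}")  # a value close to -1 indicates a strong negative relation
```

A coefficient near +1 or −1 indicates a strong positive or negative linear relation, and a value near 0 indicates no linear relation.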

Regression seeks a function that approximates unseen values with high accuracy. Here, regression analysis is performed using three machine learning algorithms: LGBM, MLP, and SVM. LGBM (Ke et al. 2017) is a highly efficient gradient-boosting decision tree (GBDT), i.e., an ensemble model with the decision tree as the base learner. It uses gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB). GOSS keeps the instances with large gradients and randomly samples those with small gradients, and the information gain used to determine the split point is estimated on this reduced set. EFB speeds up the training of the decision tree by bundling mutually exclusive features into fewer features. Thus, by employing EFB and GOSS, LGBM trains efficiently with little loss in accuracy. The MLP regressor (Murtagh 1991) is a network of neurons trained using backpropagation. The main difference between an MLP used for classification and one used for regression is that the regression output is a single neuron with no activation function and the loss function is the mean squared error. SVM (Awad & Khanna 2015) is a popular machine learning algorithm used for modelling complex engineering systems. It is based on the concept of structural risk minimization and finds connections between input and output features. The SVM maps the training data from the input space to a high-dimensional feature space, in which a hyperplane with maximum margin is constructed. The air pollutants and meteorological features are taken as independent variables, and five of the physicochemical parameters, i.e., TDS, DO, pH, Tur, and EC, are individually predicted from this feature set.
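The GOSS idea mentioned above can be sketched in a few lines. This is a simplified illustration of the sampling rule in Ke et al. (2017), not LightGBM's actual implementation; the function name `goss_sample` and the default rates are assumptions made for the example.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Select instances by gradient-based one-side sampling (simplified).

    Keeps the top `top_rate` share of instances by absolute gradient, draws
    `other_rate` of all instances at random from the remainder, and up-weights
    the randomly drawn ones by (1 - top_rate) / other_rate.
    """
    rng = random.Random(seed)
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    sampled = rng.sample(rest, min(n_other, len(rest)))
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights


# Hypothetical gradients for 100 training instances.
grads = [0.1 * i for i in range(100)]
indices, weights = goss_sample(grads)
print(len(indices), weights[0], weights[-1])
```

The up-weighting compensates for the under-representation of small-gradient instances, so that the gain estimate computed on the reduced set stays close to the one computed on all instances.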

Water quality index

Over the years, physicochemical and biological parameters have mostly been used to monitor water quality, which should fall within set standards and guidelines. Values of these parameters beyond the defined limits can be harmful to human health. To express water quality in a standard form, researchers have proposed a number of water quality indices (WQIs), which are an effective tool for describing the quality of water. WQI classification also helps to analyse the trend of water quality over time and to identify how environmental impacts and anthropogenic activities have affected the water quality for drinking or other uses.

The WAWQI (Chandra et al. 2017) is calculated from the extracted physicochemical parameters for the monsoon months. The WAWQI is calculated for the month by applying Equation (1). Here, n is the number of the physicochemical parameters, which in this case is 7. qn is the quality rating of the nth parameter, which is calculated by Equation (2). In Equation (2), Sn is the standard value, Vn is the observed value, and Vid is the ideal value of the nth water quality parameter. wn is the unit weight of the nth parameter calculated by Equation (3). In Equation (3), K is the proportionality constant defined by Equation (4). Table 6 shows the computed K, Vid, Sn, and wn values using Equations (1)–(4). These are then used to compute the final WAWQI value that is classified in six categories: (i) unfit for drinking (above 150), (ii) very poor (100–150), (iii) poor (75–100), (iv) fair (50–75), (v) good (25–50), and (vi) excellent (<25).
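$$\mathrm{WAWQI} = \frac{\sum q_n w_n}{\sum w_n} \tag{1}$$

$$q_n = 100 \times \frac{V_n - V_{id}}{S_n - V_{id}} \tag{2}$$

$$w_n = \frac{K}{S_n} \tag{3}$$

$$K = \frac{1}{\sum \left(1/S_n\right)} \tag{4}$$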
Table 6

WAWQI parameters calculated

| Variable | Standard value (Sn) | Ideal value (Vid) | Proportionality constant (K) | Unit weight (wn) |
|---|---|---|---|---|
| TDS | 1,000 mg/l | | 1.74003 | 0.00174 |
| pH | 8.5 | | 1.74003 | 0.20471 |
| EC | 2,000 mS/cm | | 1.74003 | 0.00087 |
| SDD | 18 m | | 1.74003 | 0.09667 |
| DO | 10 mg/l | 14.6 | 1.74003 | 0.17400 |
| Tur | 5 NTU | | 1.74003 | 0.34801 |
| chl-α | 10 mg/l | | 1.74003 | 0.17400 |
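The WAWQI computation can be sketched as below, assuming the usual weighted arithmetic form: K is the reciprocal of the sum of 1/Sn, each unit weight is K/Sn, each quality rating is 100 (Vn − Vid)/(Sn − Vid), and the index is the weighted mean of the ratings. The standard values are those of Table 6. Only the ideal value for DO appears in Table 6, so the other ideal values here (7.0 for pH, 0 elsewhere) are assumptions, as are the observed values, the handling of the class boundaries, and all function names.

```python
# Standard values S_n from Table 6 (units as given there).
STANDARD = {"TDS": 1000.0, "pH": 8.5, "EC": 2000.0, "SDD": 18.0,
            "DO": 10.0, "Tur": 5.0, "chl-a": 10.0}
# Ideal values V_id: only DO (14.6) is listed in Table 6. The pH value of 7.0
# and the zeros for the other parameters are ASSUMPTIONS for illustration.
IDEAL = {"TDS": 0.0, "pH": 7.0, "EC": 0.0, "SDD": 0.0,
         "DO": 14.6, "Tur": 0.0, "chl-a": 0.0}

K = 1.0 / sum(1.0 / s for s in STANDARD.values())   # proportionality constant
WEIGHT = {p: K / s for p, s in STANDARD.items()}    # unit weights w_n


def quality_rating(param, observed):
    """Quality rating q_n of one parameter from its observed value."""
    s, v_id = STANDARD[param], IDEAL[param]
    return 100.0 * (observed - v_id) / (s - v_id)


def wawqi(observed):
    """Weighted arithmetic index for a dict of observed parameter values."""
    num = sum(WEIGHT[p] * quality_rating(p, v) for p, v in observed.items())
    den = sum(WEIGHT[p] for p in observed)
    return num / den


def classify(index):
    """Map an index value to the six classes; boundary handling is assumed."""
    if index > 150:
        return "unfit for drinking"
    if index > 100:
        return "very poor"
    if index > 75:
        return "poor"
    if index > 50:
        return "fair"
    if index >= 25:
        return "good"
    return "excellent"


# Hypothetical observed values for one month (not data from the study).
sample = {"TDS": 300.0, "pH": 8.0, "EC": 500.0, "SDD": 1.0,
          "DO": 7.0, "Tur": 10.0, "chl-a": 8.0}
index = wawqi(sample)
print(f"K = {K:.5f}, WAWQI = {index:.1f}, class = {classify(index)}")
```

With the Table 6 standards, the computed K and unit weights agree with the tabulated values to the printed precision, and the weights sum to one.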

The aim of this study is to analyse the impact of air pollutants and meteorological factors on the quality of water as the sources of water pollution can be attributed to activities that are either anthropogenic or naturally occurring. Thus, the present study has proposed a hybrid of remote sensing and data mining techniques with a unique set of monitoring data. Firstly, the results of the correlation analysis using the Pearson method are discussed. Next, the results of the interrelationships amongst the three categories of data with the regression framework are presented. Finally, the WAWQI method results on the classification of the study area water quality are discussed.

Correlation analysis

The Pearson correlation method is applied to the air pollutants, meteorological, and physicochemical data to establish key relationships among the parameters for the monsoon months from July 2018 to August 2022. The binned scatter plots for the associations are shown in Figure 2 for the months of June and July and in Figure 3 for August and September. In these plots, the dataset is divided into equally sized bins of size 15, which reveal the positive and negative correlations between pairs of air pollutant and physicochemical parameters for the monsoon months. Furthermore, the correlation is classified with respect to the meteorological parameters, i.e., air temperature, wind speed, and total precipitation. For example, Figure 2(a) shows data points trending down from left to right: as the CO concentration increases over the range 0.032–0.044 mol/m2, pH decreases from 8.62 to 8.47. This means that the air pollutant CO has a negative correlation with the physicochemical parameter pH for June 2019. The binned scatter plot is then grouped by the meteorological parameters, as shown in Figure 2(b)–2(d). Figure 2(b) shows that the decline in the pH–CO relation coincides with a high air temperature of 31 °C. Similar observations hold in Figure 2(c) and 2(d), where the downward trend coincides with the highest wind speed (−0.12 m s−1) and a high precipitation rate of 0.002 m. However, this is not true for July, as shown in Figure 2(e), where the decline in the negative correlation between NO2 and TDS has a mixed classification across the three parameters (high air temperature, high wind speed, and medium to high precipitation).
Figure 2

The binned scatter plots for the associations are given as follows: (a) negative correlation between CO and pH for June 2019 with respect to (b) air temperature, (c) wind speed, and (d) total precipitation. Next, (e) negative correlation between NO2 and TDS for July 2019 with respect to (f) air temperature, (g) wind speed, and (h) total precipitation.

Figure 3

The binned scatter plots for the associations are given as follows: (a) negative correlation between NO2 and turbidity for August 2018 with respect to (b) air temperature, (c) wind speed, and (d) total precipitation. Next, (e) positive correlation between NO2 and chl-α for September 2018 with respect to (f) air temperature, (g) wind speed, and (h) total precipitation.


Figure 3(a) shows a downward trend from left to right for NO2 and Tur, i.e., these parameters have a negative correlation for August 2018. Figure 3(b)–3(d) shows that, with respect to the meteorological parameters, a high air temperature (29 °C), low wind speed (−0.12 m s−1), and a medium precipitation rate of 0.0015 m are observed for August. Figure 3(e) shows an upward trend for NO2 and chl-α with a high air temperature (28 °C), low wind speed (0.05 m s−1), and low precipitation level (0.00006 m). Thus, the correlation analysis shows that, for each monsoon month, there are a number of relationships between the air pollutants and the physicochemical parameters, indicating a connection between air and water pollution. Furthermore, the meteorological parameters tend to exhibit high values in cases of negative correlations and low values in cases of positive correlations between the air pollutants and physicochemical parameters. This suggests that meteorological conditions may influence the associations between air pollutants and physicochemical parameters (Stull 2017; Gupta et al. 2018).
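The binned scatter plots discussed above summarize many sample points by a small number of bin averages. The sketch below shows one way to compute such bin means; it reads the binning as 15 equal-count bins along the x-axis, which is an assumption about how the figures were produced, and the function name `binned_means` is hypothetical.

```python
def binned_means(x, y, n_bins=15):
    """Sort points by x, split them into equal-count bins, and average each bin.

    Returns a list of (mean_x, mean_y) pairs, one per non-empty bin.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    result = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer points than bins
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        result.append((mean_x, mean_y))
    return result


# Hypothetical data with a perfectly negative relation.
xs = list(range(30))
ys = [-v for v in xs]
bins = binned_means(xs, ys)
print(bins[:3])
```

Plotting the bin means instead of the raw points makes the direction of a trend easier to see; grouping the same bins by a third variable, such as air temperature, gives the grouped version described in the text.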

For the month of June, negative relationships of the NO2, HCHO, and CO pollutants with the physicochemical parameters DO, TDS, and Tur are prominent, along with positive relationships with EC and SDD. The relations vary from year to year. For June 2019, prominent negative relationships are observed for (i) wind speed with TDS; (ii) CO with DO, pH, TDS, and Tur; (iii) NO2 with DO, TDS, pH, and Tur; and (iv) HCHO with Tur and TDS. The positive relationships observed are (i) CO with chl-α, EC, and SDD; (ii) NO2 with EC and SDD; and (iii) HCHO with SDD. For June 2020, the negative relations include (i) NO2 with DO, TDS, pH, and Tur; (ii) O3 with DO, TDS, pH, and Tur; (iii) SO2 with EC and SDD; (iv) HCHO with DO and TDS; and (v) CH4 with DO, pH, TDS, and Tur. The positive relations for June 2020 include (i) NO2 with chl-α, EC, and SDD; (ii) O3 with SDD, EC, and chl-α; (iii) SO2 with pH, TDS, and Tur; and (iv) CH4 with chl-α, EC, and SDD. For June 2021, the negative relations include (i) NO2 with DO, TDS, and Tur and (ii) CO with DO, and the positive relations include NO2 with EC and SDD. For June 2022, the negative relations include NO2 with DO, TDS, and Tur, and the positive relations include NO2 with EC and SDD. Figure 4(i) shows the air pollutants and meteorological parameters that had the most impact on the physicochemical parameters, the WQI, and the correlation matrix for June 2019.
Figure 4

(i) (a) The mean CO concentrations of June 2019, (b) NO2 mean concentrations for June 2019, (c) the mean HCHO concentrations for June 2019, (d) the mean wind speed component for June 2019, (e) the WQI of June 2019, and (f) the correlation matrix for the air pollutants, meteorological, and physicochemical parameters for June 2019. (ii) (a) The mean CO concentrations of July 2020, (b) NO2 mean concentrations for July 2020, (c) the mean HCHO concentrations for July 2020, (d) the mean O3 concentrations for July 2020, (e) the WQI of July 2020, and (f) the correlation matrix for the air pollutants, meteorological, and physicochemical water parameters for July 2020.


For the month of July, strong correlations are observed with the air pollutant NO2. In July 2018, negative relationships of NO2 with TDS and DO are observed. For July 2019, the negative relationships include (i) CO with DO and TDS and (ii) NO2 with DO and TDS; the positive relations include (i) NO2 with EC and (ii) CO with EC. For July 2020, the negative relations include (i) NO2 with DO, TDS, pH, and Tur; (ii) CO with DO, pH, Tur, and TDS; (iii) O3 with DO, TDS, pH, and Tur; and (iv) HCHO with DO, TDS, pH, and Tur. The positive relations for July 2020 include (i) NO2 with chl-α, EC, and SDD; (ii) CO with chl-α, EC, and SDD; (iii) O3 with EC and SDD; and (iv) HCHO with chl-α, EC, and SDD. For July 2021, the negative relations include NO2 with TDS and Tur, and the positive relations include NO2 with EC. Figure 4(ii) shows the air pollutants and meteorological parameters that had the most impact on the physicochemical parameters, the WQI, and the correlation matrix for July 2020.

Figure 5 shows the correlation matrices for the air pollutants, meteorological, and water parameters for the month of August. For August, strong correlations are observed for NO2 and HCHO with the physicochemical parameters EC and TDS. In August 2018, (i) air temperature with DO and TDS, (ii) wind speed with chl-α and EC, and (iii) NO2 with DO and TDS show negative correlations. The positive relationships for August 2018 include (i) wind speed with DO and TDS and (ii) NO2 with chl-α and EC. For August 2019, the prominent negative relations include (i) NO2 with DO, TDS, and pH; (ii) O3 with DO and TDS; and (iii) HCHO with DO, pH, and TDS. The positive relationships for August 2019 include (i) NO2 with chl-α, EC, and SDD and (ii) HCHO with chl-α, EC, and SDD. For August 2020, the negative relations include (i) NO2 with TDS and Tur; (ii) O3 with TDS; and (iii) SO2 with chl-α, EC, and SDD. The positive relations for August 2020 include (i) NO2 with chl-α, EC, and SDD; (ii) O3 with chl-α, EC, and SDD; (iii) SO2 with TDS; and (iv) HCHO with EC. For August 2021, the negative relations include NO2 with DO, TDS, and Tur, and the positive relations include NO2 with EC. For August 2022, the negative relations include (i) NO2 with TDS, (ii) CO with TDS, (iii) HCHO with TDS, and (iv) SO2 with TDS. The positive relations for August 2022 include (i) NO2 with EC, (ii) CO with EC, (iii) SO2 with EC, and (iv) HCHO with EC.
Figure 5

The correlation matrices for the month of August for the years 2018–2021.


For the month of September, the major relationships are observed for the CO and NO2 pollutants with DO and EC. In September 2018, the prominent negative correlations include (i) air temperature with DO and TDS; (ii) wind speed with chl-α and EC; and (iii) NO2 with DO, TDS, pH, and Tur. The positive relationships for September 2018 include (i) wind speed with DO and TDS and (ii) NO2 with chl-α, EC, and SDD. For September 2019, negative relations are observed for (i) CO with DO, pH, and TDS; (ii) NO2 with DO, TDS, pH, and Tur; and (iii) SO2 with pH and Tur. The positive relations for September 2019 include (i) CO with EC; (ii) NO2 with chl-α, EC, and SDD; and (iii) SO2 with SDD and EC. For September 2020, the negative relations include (i) SO2 with EC and (ii) CO with chl-α, EC, and SDD. The positive relations for September 2020 include (i) CO with DO, pH, TDS, and Tur and (ii) SO2 with DO, pH, and TDS. For September 2021, the negative relations include (i) NO2 with DO, TDS, and Tur; (ii) CO with DO and TDS; (iii) CH4 with DO and TDS; and (iv) SO2 with Tur. The positive relations for September 2021 include (i) NO2 with chl-α and EC; (ii) CO with EC; (iii) SO2 with EC; (iv) HCHO with TDS; and (v) CH4 with chl-α and EC.

As the meteorological data are available up to July 2020, the relationships observed between physicochemical and meteorological parameters include (i) wind speed with chl-α, EC, DO, and TDS and (ii) air temperature with DO and TDS in the months of August and September. The impact of meteorological factors, including wind speed, depends on the topography and surroundings of the water body. The dynamics of TDS are affected by the maximum wind speed, as it induces sediment resuspension (Evans 1994; Zhang et al. 2017). Wind-driven sediment resuspension is directly linked to water quality, as it increases phosphorus and nitrogen concentrations (Reddy et al. 1996). The meteorological and physicochemical relationships are most prevalent in August. The average temperature and humidity of Islamabad in August are 28.7 °C and 73%, respectively. Meteorological features can intensify the hydrological cycle, i.e., change the precipitation patterns, which can mobilize the transport of pollutants to water bodies. Table 7 shows the negative and positive correlations found in the monsoon months. The relation between NO2 and TDS is dominant in all months, followed by that between NO2 and DO. Similarly, the positive relation between NO2 and EC is the most common. These relations are most prevalent in August, which can be related to the fact that Asian monsoon seasons have strong control over water flow regimes (Mamun et al. 2021).

Table 7

The prominent relationships amongst air pollutants, meteorological, and physicochemical features for monsoon season (2018–2022)

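| Sn | Relation | Correlation | Prevailing month |
|---|---|---|---|
| 1 | NO2 and TDS | Negative | August |
| 2 | NO2 and DO | Negative | June |
| 3 | NO2 and Tur | Negative | June |
| 4 | NO2 and pH | Negative | June and September |
| 5 | CO and DO | Negative | June and September |
| 6 | CO and TDS | Negative | September |
| 7 | NO2 and EC | Positive | August |
| 8 | NO2 and SDD | Positive | June |
| 9 | NO2 and chl-α | Positive | August and September |
| 10 | CO and EC | Positive | July and September |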

The most prominent relationships discovered over the 5-year time span are listed in Table 7 and are discussed in detail as follows:

NO2 and TDS

This relation is most dominant in the month of August. When NO2 dissolves in water and decomposes, it forms nitric acid (HNO3), which forms nitrate salts when neutralized. Thus, NO2 exists and reacts as a gas in the air, as an acid in water droplets, or as a salt (Neil Cape & Lammel 1996). Together, these gases, acids, and salts contribute to the pollution effects that have been observed and attributed to acid rain (Singh & Agrawal 2007). Nitrates contribute indirectly to changes in TDS levels in water as they fuel algal blooms (Paerl 1988; Lopes et al. 2021). An excess of nitrates, i.e., higher concentrations (10 mg/L) in the water, gives rise to nutrient pollution that results in dead zones (Diaz & Rosenberg 2008), a condition known as hypoxia (Khangaonkar et al. 2018). These zones have little to no oxygen because the algal blooms (red tides) consume oxygen during decomposition, and they are thus dangerous for the survival of aquatic life. These toxic red tides also shade the water, causing the death of other plants. This phenomenon is referred to as eutrophication (Jingzhong et al. 1985), which makes the bottom strata of water unsafe for fish and other aquatic animals or plants. It is also estimated that between 12 and 44% of the nitrogen loading of coastal water bodies comes from the air (Price et al. 1997).

NO2 and DO

This relation is most dominant in the month of June. An abundance of NO2 can lead to nutrient pollution or eutrophication. The sources of NO2 pollution include agricultural runoff and the burning of fuel. With an abundance of nitrate salts in the water, hypoxia occurs, i.e., the oxygen level becomes low and DO is depleted (Correll 1998; Kann & Welch 2005; Arend et al. 2011). The nutrients give rise to an overgrowth of algae, which sink and decompose at the bottom of the water body, consuming oxygen. Eutrophication thus affects the survival of aquatic life through dead zones or red tides. The negative correlations observed for this relationship are consistent with an excess of NO2 leading to a decline of DO in water.

NO2 and Tur

The relation between NO2 and turbidity is prominent in the month of June over the 5-year period. Eutrophic waters generally have low water quality, as there are frequent algal blooms and low levels of oxygen. The excess of nutrients results in high productivity in such waters, which can increase the turbidity and decrease the clarity of the water. The water turns green or brownish, making it hard for aquatic organisms to hunt prey and to look out for predators (Lehtiniemi et al. 2005). Thus, NO2 and turbidity are linked, and whether the relation is positive or negative depends on the range of concentrations of the two parameters: a parameter may have a positive effect on the other up to a certain abundance, and after a threshold is reached, the effect can become negative (Odum et al. 1979).

NO2 and pH

This relation is noticeable in the months of June and September. Ocean acidification (Gattuso & Hansson 2011) is a by-product of the chain of reactions set off by eutrophication. It occurs when carbon dioxide is produced in abundance by the decomposition of algae and plants, which decreases the pH of the water. This process slows the growth of fish, which can eventually reduce fisheries and result in smaller harvests (Turner & Chislock 2010). The relation observed between pH and NO2 is evidence of an inverse relationship between the two parameters.

NO2 and EC

The correlation between NO2 and conductivity is significant for August as compared to the other monsoon months. Conductivity in a water body is affected by the increase in nitrogen and phosphorus nutrients that drives eutrophication. The findings of this study are consistent with other studies that relate conductivity to nitrogen (Kløve 2001).

NO2 and chl-α

NO2 and chl-α have a dominant relation in the months of August and September. Nutrients such as nitrate and phosphate drive phytoplankton growth and metabolism (Filstrup & Downing 2017; Filstrup et al. 2018): the higher the nutrient concentrations, the higher the chlorophyll-α. This strong relationship is evident in Japanese lakes, which supports the relation established in this research.

NO2 and SDD

The relationship between NO2 and SDD is most dominant in June. As established above, nitrate salts in excess give rise to eutrophication, which increases turbidity and decreases water clarity, making the water cloudy and making it difficult for aquatic organisms to hunt prey. The resulting reduction in light penetration lowers the Secchi depth. Lake Tahoe is an example, where eutrophic water caused a decrease in the Secchi depth, as observed by Goldman et al. (2003). The relation between SDD and NO2 in this study is positive, but this depends on the range of concentrations of both parameters. Nevertheless, the two parameters have a prominent association.

CO and EC

This association is most prevalent in the months of July and September. CO may have an indirect association with the EC of water, as some major sources of CO production are wetlands and near-coastal regions. The formation of CO in oceans and lakes is attributed to methanogenic, sulphate-reducing, and acetogenic bacteria (Conrad 1988). Among the low-molecular-weight carbonyl compounds in natural water bodies, CO is the dominant one; it is a product of the photochemical degradation of dissolved organic matter (DOM) in the sea and is emitted to the atmosphere (Chen et al. 1978; Kieber et al. 1990; Mopper et al. 1991; Weber 2020). DOM affects the surface water ecosystem, influencing the water temperature, biogeochemical processes, and water transparency (Solomon et al. 2015). Moreover, DOM can accelerate hypoxia and eutrophication (Ledesma et al. 2012; Kritzberg et al. 2020), which in turn affect the conductivity of a water body.

Figure 6 depicts a pair plot of the relationship between the CO and EC parameters for September 2018. The plot is organized as a grid in which each cell represents a specific combination of CO and EC values; the shared y-axis across each row and the shared x-axis across each column allow for easy comparison. The elements of Figure 6 are (i) the average concentrations of CO, (ii) the correlation between CO and EC with EC on the y-axis and CO on the x-axis, (iii) the mean concentrations of EC, and (iv) the CO–EC correlation with CO on the y-axis and EC on the x-axis. The plots are further analysed with respect to two meteorological parameters: wind speed and air temperature. In Figure 6(a), the wind speed is relatively low (0.053 m/s) for the CO and EC parameters individually, whereas at the peak of the CO–EC relationship it reaches a medium-range value of 0.224 m/s. In Figure 6(b), the CO and EC parameters individually coincide with high air temperatures of around 28 °C, whereas at the correlation peak between CO and EC the air temperature decreases to 23.52 °C. Figure 6 thus shows how the CO–EC relationship varies with wind speed and air temperature during September 2018.
Figure 6

(i) The mean CO concentrations, (ii) CO–EC correlation with EC on y-axis and CO on x-axis, (iii) the mean EC concentrations, and (iv) CO–EC correlation with CO on y-axis and EC on x-axis is shown with respect to (a) wind speed and (b) air temperature for September 2018.


Regression analysis

A regression framework, shown in Figure 7, is proposed to analyse further patterns among the air pollutants, meteorological, and physicochemical water quality features. The framework is built using three machine learning algorithms: LGBM, MLP, and SVM. In the proposed model, the air pollutants and meteorological features are the independent variables. This set of parameters is passed to the LGBM, MLP, or SVM model to predict one of the five physicochemical parameters taken as dependent variables, i.e., TDS, DO, pH, Tur, and EC. The dataset is split into training and test sets in a 67:33 ratio.
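A workflow of this kind could look like the sketch below, which uses scikit-learn and synthetic data. Several things here are assumptions rather than details from the study: the data are randomly generated, the target is taken to be already scaled, the hyperparameters are arbitrary, and scikit-learn's `GradientBoostingRegressor` is used as a generic gradient-boosted tree stand-in for LGBM so that the example needs only one library.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the study's data): nine features play the role
# of six air pollutants plus three meteorological variables; y is one target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 9))
y = 0.5 + 0.1 * X[:, 0] - 0.05 * X[:, 1] + 0.02 * rng.normal(size=300)

# 67:33 train/test split, as stated in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0
)

models = {
    # Stand-in for LGBM: a generic gradient-boosted tree ensemble.
    "GBDT": GradientBoostingRegressor(random_state=0),
    "MLP": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    ),
    "SVM": make_pipeline(StandardScaler(), SVR(kernel="rbf", epsilon=0.01)),
}


def rmse(y_true, y_pred):
    """Root-mean-square error between two arrays."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))


scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = rmse(y_test, model.predict(X_test))
    print(f"{name}: RMSE = {scores[name]:.4f}")
```

Repeating the loop once per target (TDS, DO, pH, Tur, EC) and once per feature subset would give a grid of RMSE values with the same layout as a models-by-targets results table.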
Figure 7

Regression framework for predicting TDS, DO, pH, Tur, and EC.


Each model predicts the physicochemical parameter from the air pollutants and meteorological dataset and is evaluated by the RMSE. The RMSEs of the three regression models are shown in Table 8. The correlation analysis shows that the CO and NO2 variables have a significant association with the physicochemical parameters. Therefore, two reduced feature combinations, (a) CO with the meteorological parameters and (b) NO2 with the meteorological parameters, are also evaluated. The results show that the SVM regression model performed best for predicting the DO, EC, and Tur parameters, with RMSEs of 0.01477, 0.024616, and 0.026881, respectively. The lowest RMSE for TDS, 0.18330, is achieved by the MLP regressor with all six air pollutants and the meteorological parameters. The second-best RMSE for TDS, 0.18984, is obtained with the NO2 and meteorological dataset and the SVM regressor. Similarly, for the pH parameter, the best RMSE of 0.0029 is achieved with either the CO or the NO2 dataset with MLP. Overall, the SVM regressor performed best among the three regression models, with RMSEs in the range 0.015–0.03.

Table 8

RMSE of the regression models for estimating the concentrations of TDS, DO, pH, Tur, and EC

Variables (independent)                  Model   RMSE (TDS)   RMSE (DO)   RMSE (pH)   RMSE (Tur)   RMSE (EC)
All air pollutants and meteorological    LGBM    0.20559      0.06112     0.0422      0.08886      0.08579
All air pollutants and meteorological    MLP     0.18330      0.01626     0.0067      0.02776      0.072097
All air pollutants and meteorological    SVM     0.205118     0.01477     0.00302     0.026881     0.024616
CO and meteorological                    LGBM    0.27382      0.08491     0.04981     0.1158       0.10947
CO and meteorological                    MLP     0.1912       0.01562     0.002929    0.027552     0.042880
CO and meteorological                    SVM     0.189846     0.01688     0.0030      0.030389     0.091375
NO2 and meteorological                   LGBM    0.2612       0.08691     0.0518      0.1167       0.10571
NO2 and meteorological                   MLP     0.192496     0.015552    0.002927    0.0284148    0.04883
NO2 and meteorological                   SVM     0.18984      0.016887    0.003022    0.0304012    0.09137

Bold values show the “Best Results Achieved (Lowest RMSE)”.
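
As a cross-check, the lowest RMSE per target can be recovered directly from the Table 8 values. The short sketch below transcribes the table and picks the minimum of each column; the dictionary layout and the function name best_per_target are illustrative choices, not part of the original study.

```python
# RMSE values transcribed from Table 8; keys are (feature set, model).
TARGETS = ("TDS", "DO", "pH", "Tur", "EC")
TABLE_8 = {
    ("All", "LGBM"): (0.20559, 0.06112, 0.0422, 0.08886, 0.08579),
    ("All", "MLP"): (0.18330, 0.01626, 0.0067, 0.02776, 0.072097),
    ("All", "SVM"): (0.205118, 0.01477, 0.00302, 0.026881, 0.024616),
    ("CO", "LGBM"): (0.27382, 0.08491, 0.04981, 0.1158, 0.10947),
    ("CO", "MLP"): (0.1912, 0.01562, 0.002929, 0.027552, 0.042880),
    ("CO", "SVM"): (0.189846, 0.01688, 0.0030, 0.030389, 0.091375),
    ("NO2", "LGBM"): (0.2612, 0.08691, 0.0518, 0.1167, 0.10571),
    ("NO2", "MLP"): (0.192496, 0.015552, 0.002927, 0.0284148, 0.04883),
    ("NO2", "SVM"): (0.18984, 0.016887, 0.003022, 0.0304012, 0.09137),
}


def best_per_target(table, targets):
    """Return {target: ((feature set, model), rmse)} with the lowest RMSE."""
    best = {}
    for col, target in enumerate(targets):
        key = min(table, key=lambda k: table[k][col])
        best[target] = (key, table[key][col])
    return best


best = best_per_target(TABLE_8, TARGETS)
for target, (key, value) in best.items():
    print(f"{target}: {key[1]} with {key[0]} features, RMSE={value}")
```

Here "All" abbreviates the full set of air pollutants plus meteorological features, and "CO" and "NO2" abbreviate the two reduced feature sets.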

Water quality index

The WQI for the monsoon months of 2018–2022 is shown in Figure 8. The water quality of the Rawal stream network mostly falls in the ‘Unfit for drinking’ category. The WQI results depend on the flexibility and accuracy of the WAWQI method, which is prone to bias. Nevertheless, the worst months with respect to water quality are June and July. This can be attributed to high temperatures, as the average temperature of Islamabad in June lies in the range of 28 to 37.5 °C. Temperature and humidity can intensify the hydrological cycle, i.e., the rainfall patterns, which can affect water health. The year 2021 has better water quality than the other years, as it has fewer samples in the ‘Unfit for drinking’ class, with the exception of September.
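
For orientation, the weighted arithmetic index is commonly computed as WQI = Σ(W_i·Q_i)/ΣW_i, with quality rating Q_i = 100·(V_i − V_0)/(S_i − V_0) and unit weight W_i proportional to 1/S_i, where V_i is the observed value, S_i the permissible limit, and V_0 the ideal value. The sketch below follows that common formulation. The permissible limits, ideal values, and sample readings are illustrative assumptions, not the standards or data used in this study, and the function name wawqi is hypothetical.

```python
def wawqi(samples):
    """Weighted arithmetic water quality index.

    `samples` maps a parameter name to (observed, standard, ideal), where
    `standard` is the permissible limit S_i and `ideal` is the ideal value V_0.
    """
    # Unit weights are proportional to 1/S_i; the constant K cancels in the ratio.
    weights = {name: 1.0 / standard for name, (_, standard, _) in samples.items()}
    ratings = {
        name: 100.0 * (observed - ideal) / (standard - ideal)
        for name, (observed, standard, ideal) in samples.items()
    }
    total_weight = sum(weights.values())
    return sum(weights[n] * ratings[n] for n in samples) / total_weight


# Illustrative values only (assumed limits, not the study's configuration).
reading = {
    "pH": (7.5, 8.5, 7.0),
    "DO": (6.0, 5.0, 14.6),      # mg/L; the ideal is taken as saturation
    "TDS": (250.0, 500.0, 0.0),  # mg/L
}
print(f"WAWQI = {wawqi(reading):.1f}")
```

Because the weights scale with 1/S_i, parameters with small permissible limits dominate the index, which is one way the choice of parameters and weights can bias the result.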
Figure 8

WQI for the monsoon months of years 2018–2022.


Floods in Pakistan (2022)

In August 2022, Pakistan experienced severe flooding during the monsoon rains, affecting 81 districts. This catastrophic event had far-reaching consequences, including the loss of lives, property, and agricultural land, and impacted approximately 33 million people. One significant repercussion of such flooding is the introduction of excessive nutrients, pollutants, and harmful sediments into water bodies. These substances disrupt the delicate balance of the aquatic ecosystem and contaminate water and food resources. The resulting nutrient enrichment, known as eutrophication, leads to a decrease in the concentration levels of certain physicochemical parameters. Of particular concern is the proximity of the flood-affected districts in Punjab to Rawal Lake. Rawal Lake is a crucial water resource as it is part of the Soan River Basin, which forms the midstream of the three sub-basins of the Indus River. Since all the sub-basins eventually flow into the Indus River, the flooding events can indirectly impact the water health of Rawal Lake. The influx of floodwaters carrying sediments, pollutants, and nutrients can deteriorate the water quality of the lake, posing challenges to the sustainability of the ecosystem.

Furthermore, it is essential to recognize that Pakistan heavily relies on the Indus River and its tributaries for surface water resources. Therefore, the consequences of flooding extend beyond immediate areas of impact, as the disruption caused by flooding can affect the overall water health of these crucial water sources. The devastating floods in August 2022 have not only caused immediate devastation but also pose long-term challenges for water management and preservation in Pakistan. Addressing the indirect impacts of flooding on the water quality of Rawal Lake and the broader implications for surface water resources is critical in ensuring the well-being and sustainability of both the environment and the population reliant on these water bodies.

The heavy floods occurred on three dates: 2, 20, and 29 August 2022. However, S2 satellite images are available for only two dates, 13 and 23 August 2022. To assess the environmental impact on the water health of Rawal Lake during the floods, data on air pollutants, meteorological features, and physicochemical variables were extracted for these two dates. Figure 9 illustrates the air pollutants and meteorological parameters for 13 August, 11 days after the first flood hit Pakistan, and 23 August, 3 days after the second flood. The levels of CO in Rawal Lake reach up to 0.04 mol/m2, whereas concentrations are comparatively lower in other parts of the stream. For the second flood period (23 August), the CO concentrations in the lake fall within the mid-range when compared to the streams. Furthermore, the precipitation level during the second flood event is much lower than during the first. The wind speed shifts from low to medium levels, while the air temperature remains constant. Notably, the air temperature is higher over Rawal Lake than over other parts of the stream. This difference can be attributed to the south-facing slope of Rawal Lake, which experiences higher levels of warmth and humidity.
Figure 9

Air and meteorological parameters of Rawal network for the 13th (11 days after the first flood hit Pakistan) and 23rd (3 days after the second flood hit) of August 2022.

Figure 10 compares Rawal Lake before and after the flooding events. The first image shows the lake on 29 July 2022, 4 days before the initial flood on 2 August. In the image taken on 23 August, 3 days after the second flood, samples classified as ‘Good’ and ‘Fair’ before the flood are categorized as ‘Unfit for drinking’ according to the WAWQI. In addition, Figure 10 includes a map of Pakistan illustrating the flooded regions and waterways to emphasize their proximity to Rawal Lake.
Figure 10

Comparison of water quality of July (before flood) and August (after flood). (a) WQI for 29 July 2022, (b) WQI for 23 August 2022, and (c) map of Pakistan showing waterways, Rawal watershed, and the flooded areas.


In summary, the data and images provide insights into the impact of the floods on Rawal Lake's water health. The concentrations of CO and the classification of water quality indicate notable changes after the flooding events. Understanding these effects is crucial for assessing and mitigating the environmental consequences of such natural disasters on water bodies.

Rawal watershed is surrounded by high-population sites, and the water quality of the watershed is traditionally assessed with physicochemical parameters that include turbidity, pH, and DO. The data are collected by manual grab sampling during field visits. This yields a limited set of variables that is insufficient to determine the quality of the water, as these high-population sites have anthropogenic emissions of air pollutants, and meteorological factors can also influence water health. Thus, the present study collected three categories of data for the years 2018–2022: (a) physicochemical parameters, including pH, TDS, electrical conductivity, DO, SDD, turbidity, and chlorophyll-α, from S2-MSI L1C satellite imagery; (b) air pollutants, i.e., CO, NO2, O3, SO2, HCHO, and CH4, extracted from S5P-L2; and (c) meteorological parameters, i.e., air temperature, wind speed, and total precipitation, taken from the ERA5-CRP project. Environmental factors are thus taken as influencing parameters, with remote sensing providing easy and global coverage for sample collection, to propose a water quality monitoring model with a unique set of features. Pearson's correlation and regression analysis are performed on this new dataset, and the WAWQI method is applied to rank the water quality. Moreover, the floods of August 2022 are taken as an example to observe the impact of natural calamities on the quality of water.

The correlation analysis shows four prominent negative relationships between physicochemical parameters and air pollutants for all monsoon months. The strongest associations are NO2–TDS, with correlation magnitudes ranging from 0.51 to 0.85, and NO2–DO, ranging from 0.5 to 0.82. These are followed by NO2–Tur, ranging from 0.53 to 0.65, and NO2–pH, ranging from 0.5 to 0.75. These negative correlations are most common in June for Tur, pH, and DO, whereas the NO2–TDS relation is dominant in August. Both of these months have an ‘Unfit for drinking’ rating with the WAWQI method. The correlations are consistent with the eutrophication process that occurs when nitrate nutrients are abundant in the water, causing a chain reaction of changes in Tur, pH, TDS, and DO. Four positive associations are observed between the NO2 and CO pollutants and the physicochemical parameters: NO2–EC (range, 0.54–0.85), NO2–chl-α (range, 0.53–0.79), NO2–SDD (range, 0.5–0.74), and CO–EC (range, 0.51–0.67). NO2–chl-α and NO2–EC are most common in August, whereas NO2–SDD and CO–EC are prominent in June and July, respectively. These correlations are consistent with the expectation that a high amount of nitrate causes an increase in phytoplankton, giving rise to chl-α, decreasing water clarity (SDD), and increasing EC. The meteorological features can have a moderate impact on water quality; however, owing to the limited data available, the relationships observed for July 2018 to July 2020 include (i) wind speed with chl-α, EC, DO, and TDS and (ii) air temperature with DO and TDS in August and September. Wind speed has a positive correlation with DO (range, 0.55–0.60) and with TDS (range, 0.57–0.71). The relation between wind speed and TDS is plausible, as wind causes resuspension, which changes TDS levels.
Moreover, the meteorological features are examined for the flooding events of August 2022 to observe the negative impact of natural calamities. The flood-affected districts of Punjab lie in close proximity to Rawal Lake, so flooding can introduce an excess of pollutants and nutrients into the water bodies, giving rise to eutrophication and eventually lowering the water quality. The results show that the precipitation level is much lower after the flooding events, the wind speed shifts from low to medium levels, and the air temperature remains constant. The air temperature is, however, much higher over the lake than over other parts of the stream, as Rawal watershed has a south-facing slope that is warmer and prone to high air temperatures.
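
The coefficients quoted above are Pearson correlations. A minimal, dependency-free sketch of that statistic follows; the readings are invented for illustration and the function name pearson_r is hypothetical. It also shows how a negative relationship can be summarised by its magnitude |r|, which appears to be the convention behind the positive ranges quoted for the negative correlations.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up readings: a pollutant column density and a water quality parameter.
no2 = [1.0, 2.0, 3.0, 4.0, 5.0]
do = [9.1, 8.4, 8.0, 7.1, 6.5]
r = pearson_r(no2, do)
print(f"r = {r:.2f}, |r| = {abs(r):.2f}")
```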

Regression analysis using machine learning techniques, i.e., LGBM, MLP, and SVM, is applied with the air pollutants and meteorological parameters as independent variables to predict the TDS, pH, turbidity, EC, and DO parameters. MLP gave the best results for TDS and pH, with RMSEs of 0.18 and 0.003, respectively, while SVM performed well for DO, turbidity, and EC, with RMSEs of 0.015, 0.027, and 0.025, respectively. In addition, the WAWQI method is used to classify the water quality of the Rawal stream network. It is calculated from the physicochemical water parameters alone and does not consider air pollutants, meteorological factors, or other hydrological features (Ahmed et al. 2022b), i.e., slope, aspect, lithology, geology, and land cover/land use. Thus, the WAWQI method appears to be biased by location and by the weights assigned to specific water quality parameters.

Therefore, in the future, an improved water quality indexing technique to effectively analyse and interpret the impact of human activities on water quality shall be investigated. This may also include the integration of various natural factors, including topographical parameters such as slope and aspect, as well as hydrological parameters such as lithology, geology, and soil type, that directly or indirectly influence the overall water quality (Ahmed et al. 2022c). For instance, the slope steepness plays a critical role in water contamination by regulating the speed at which rainfall runoff flows down the slope. Steep slopes can result in rapid flow, leading to soil erosion, the swift transport of pollutants and sediments, and disturbance to aquatic ecosystems. Similarly, soil type is another important factor in water pollution. Soils with high infiltration capacity can reduce the amount of runoff, thereby mitigating potential contamination. Hence, in the future, the integration of natural factors and development of advanced indexing techniques based on machine learning technologies will contribute to effective water quality monitoring and management. In addition, studying the impact of natural disasters such as floods, landslides, wildfires, and droughts on water quality can also help identify patterns and comprehend the effects of these disasters on water health. Such analyses can provide valuable insights and inform the development of strategies aimed at protecting ecosystems for the benefit and welfare of humanity.

Research and development for this study were conducted in the IoT Lab, NUST-SEECS, Islamabad, Pakistan.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abdullah H. S. 2015 Water Quality Assessment for Dokan Lake Using Landsat 8 OLI Satellite Images. PhD Thesis, University of Sulaimani, Bakrajo Street, Sulaimaniyah, Iraq.
Ahmed M., Mumtaz R., Baig S. & Zaidi S. M. H. 2022c Assessment of correlation amongst physico-chemical, topographical, geological, lithological and soil type parameters for measuring water quality of Rawal watershed using remote sensing. Water Supply 22 (4), 3645–3660.
Al-Alola S. S., Alkadi I. I., Alogayell H. M., Mohamed S. A. & Ismail I. Y. 2022 Air quality estimation using remote sensing and GIS-spatial technologies along Al-Shamal Train pathway, Al-Qurayyat city in Saudi Arabia. Environmental and Sustainability Indicators 15, 100184.
Ali M., Qamar A. M. & Ali B. 2013 Data analysis, discharge classifications, and predictions of hydrological parameters for the management of Rawal Dam in Pakistan. In: 2013 12th International Conference on Machine Learning and Applications. Vol. 1, pp. 382–385. IEEE, Miami, FL, USA. https://doi.org/10.1109/ICMLA.2013.78.
Arend K. K., Beletsky D., DePinto J. V., Ludsin S. A., Roberts J. J., Rucinski D. K., Scavia D., Schwab D. J. & Höök T. O. 2011 Seasonal and interannual effects of hypoxia on fish habitat quality in Central Lake Erie. Freshwater Biology 56 (2), 366–383.
Awad M. & Khanna R. 2015 Support Vector Regression. In: Efficient Learning Machines. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4302-5990-9_4.
Baillarin S. J., Meygret A., Dechoz C., Petrucci B., Lacherade S., Trémas T., Isola C., Martimort P. & Spoto F. 2012 Sentinel-2 level 1 products and image processing performances. In: 2012 IEEE International Geoscience and Remote Sensing Symposium. pp. 7003–7006. IEEE, Munich, Germany. https://doi.org/10.1109/IGARSS.2012.6351959.
Balaji L., Muthukannan M. & Devi R. K. 2022 A GIS-based study of air and water quality trends in Madurai City, India. Nature Environment and Pollution Technology 21 (1), 21–32.
Beeson P. C., Sadeghi A. M., Lang M. W., Tomer M. D. & Daughtry C. S. T. 2014 Sediment delivery estimates in water quality models altered by resolution and source of topographic data. Journal of Environmental Quality 43 (1), 26–36.
Chandra D. S., Asadi S. S. & Raju M. V. S. 2017 Estimation of water quality index by weighted arithmetic water quality index method: a model study. International Journal of Civil Engineering and Technology 8 (4), 1215–1222.
Chen Y., Khan S. U. & Schnitzer M. 1978 Ultraviolet irradiation of dilute fulvic acid solutions. Soil Science Society of America Journal 42 (2), 292–296.
Chen C.-Y., Chen H. W., Sun C.-T., Chuang Y. H., Nguyen K. L. P. & Lin Y. T. 2021 Impact assessment of river dust on regional air quality through integrated remote sensing and air quality modeling. Science of the Total Environment 755, 142621.
Conrad R. 1988 Biogeochemistry and ecophysiology of atmospheric CO and H2. In: Advances in Microbial Ecology. Springer, Boston, MA, pp. 231–283. https://doi.org/10.1007/978-1-4684-5409-3_7.
Correll D. L. 1998 The role of phosphorus in the eutrophication of receiving waters: a review. Journal of Environmental Quality 27 (2), 261–266.
De Smedt I., Theys N., Yu H., Danckaert T., Lerot C., Compernolle S., Van Roozendael M., Richter A., Hilboll A., Peters E., Pedergnana M., Loyola D., Beirle S., Wagner T., Eskes H., van Geffen J., Boersma K. F. & Veefkind P. 2018 Algorithm theoretical baseline for formaldehyde retrievals from s5p TROPOMI and from the qa4ecv project. Atmospheric Measurement Techniques 11 (4), 2395–2426.
Deutsch E., Alameddine I. & El-Fadel M. 2014 Developing landsat based algorithms to augment in situ monitoring of freshwater lakes and reservoirs. In: 11th International Conference on Hydroinformatics. Vol. 1. City University of New York (CUNY), New York, USA.
Diaz R. J. & Rosenberg R. 2008 Spreading dead zones and consequences for marine ecosystems. Science 321 (5891), 926–929.
Dinoi A., Perrone M. R. & Burlizzi P. 2010 Application of modis products for air quality studies over Southeastern Italy. Remote Sensing 2 (7), 1767–1796.
Doney S. C., Fabry V. J., Feely R. A. & Kleypas J. A. 2009 Ocean acidification: the other CO2 problem. Annual Review of Marine Science 1, 169–192.
Esri 2022 ArcGIS Pro. Available from: https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview (accessed 4 October 2022).
Fatima S. U., Khan M. A., Siddiqui F., Mahmood N., Salman N., Alamgir A. & Shaukat S. S. 2022 Geospatial assessment of water quality using principal components analysis (PCA) and water quality index (WQI) in Basho Valley, Gilgit Baltistan (Northern Areas of Pakistan). Environmental Monitoring and Assessment 194 (3), 1–22.
Filstrup C. T., Wagner T., Oliver S. K., Stow C. A., Webster K. E., Stanley E. H. & Downing J. A. 2018 Evidence for regional nitrogen stress on chlorophyll a in lakes across large landscape and climate gradients. Limnology and Oceanography 63 (S1), S324–S339.
Fuller R., Landrigan P. J., Balakrishnan K., Bathan G., Bose-O'Reilly S., Brauer M., Caravanos J., Chiles T., Cohen A., Corra L., Cropper M., Ferraro G., Hanna J., Hanrahan D., Hu H., Hunter D., Janata G., Kupka R., Lanphear B., Lichtveld M., Martin K., Mustapha A., Sanchez-Triana E., Sandilya K., Schaefli L., Shaw J., Seddon J., Suk W., Téllez-Rojo M. M. & Yan C. 2022 Pollution and health: a progress update. The Lancet Planetary Health 6 (6), e535–e547.
Garane K., Koukouli M.-E., Verhoelst T., Lerot C., Heue K.-P., Fioletov V., Balis D., Bais A., Bazureau A., Dehn A., Goutail F., Granville J., Griffin D., Hubert D., Keppens A., Lambert J.-C., Loyola D., McLinden C., Pazmino A., Pommereau J.-P., Redondas A., Romahn F., Valks P., Van Roozendael M., Xu J., Zehner C., Zerefos C. & Zimmer W. 2019 TROPOMI/s5p total ozone column data: global ground-based validation and consistency with other satellite missions. Atmospheric Measurement Techniques 12 (10), 5263–5287.
Gattuso J.-P. & Hansson L. 2011 Ocean Acidification. Oxford University Press, Paris, France.
Gintamo T. T., Mengistu H. & Kanyerere T. 2021 GIS-based modelling of climate variability impacts on groundwater quality: Cape flats aquifer, Cape Town, South Africa. Groundwater for Sustainable Development 15, 100663.
Gitelson A. A. & Merzlyak M. N. 1998 Remote sensing of chlorophyll concentration in higher plant leaves. Advances in Space Research 22 (5), 689–692.
Goldman C. R., James M. R., Vant W. & Severne C. 2003 Requirements for Lake Management. In: Kumagai, M., Vincent, W.F. (eds) Freshwater Management. Springer, Tokyo. https://doi.org/10.1007/978-4-431-68436-7_6.
Google Earth Engine. Available from: https://earthengine.google.com/ (accessed 26 October 2022).
Gupta P. K., Gupta M. & Singh R. P. 2018 Meteorological factors influencing particulate matter concentration and elemental composition in the ambient air of an urban area. Atmospheric Research 205, 11–23. doi:10.1016/j.atmosres.2018.02.010.
Harrington J. A. Jr., Schiebe F. R. & Nix J. F. 1992 Remote sensing of Lake Chicot, Arkansas: monitoring suspended sediments, turbidity, and Secchi depth with Landsat MSS data. Remote Sensing of Environment 39 (1), 15–27.
Hersbach H., Bell B., Berrisford P., Hirahara S., Horányi A., Muñoz-Sabater J., Nicolas J., Peubey C., Radu R., Schepers D., Simmons A., Soci C., Abdalla S., Abellan X., Balsamo G., Bechtold P., Biavati G., Bidlot J., Bonavita M., De Chiara G., Dahlgren P., Dee D., Diamantakis M., Dragani R., Flemming J., Forbes R., Fuentes M., Geer A., Haimberger L., Healy S., Hogan R. J., Hólm E., Janisková M., Keeley S., Laloyaux P., Lopez P., Lupu C., Radnoti G., de Rosnay P., Rozum I., Vamborg F., Villaume S. & Thépaut J.-N. 2020 The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146 (730), 1999–2049.
Jingzhong Z., Liping D. & Baoping Q. 1985 Preliminary studies on eutrophication and red tide problems in Bohai Bay. Hydrobiologia 127 (1), 27–30.
Kan H., Chen R. & Tong S. 2012 Ambient air pollution, climate change, and population health in China. Environment International 42, 10–19.
Kann J. & Welch E. B. 2005 Wind control on water quality in shallow, hypereutrophic Upper Klamath Lake, Oregon. Lake and Reservoir Management 21 (2), 149–158.
Kapalanga T. S. 2015 Assessment and Development of Remote Sensing Based Algorithms for Water Quality Monitoring in Olushandja Dam, North-Central Namibia. PhD Thesis, MSc Thesis, University of Zimbabwe, Harare, Zimbabwe.
Ke G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q. & Liu T.-Y. 2017 Lightgbm: a highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems 30, 3146–3154.
Khangaonkar T., Nugraha A., Xu W., Long W., Bianucci L., Ahmed A., Mohamedali T. & Pelletier G. 2018 Analysis of hypoxia and sensitivity to nutrient pollution in Salish sea. Journal of Geophysical Research: Oceans 123 (7), 4735–4761.
Kritzberg E. S., Hasselquist E. M., Skerlep M., Löfgren S., Olsson O., Stadmark J., Valinia S., Hansson L.-A. & Laudon H. 2020 Browning of freshwaters: consequences to ecosystem services, underlying drivers, and potential mitigation measures. Ambio 49 (2), 375–390.
Ledesma J. L. J., Köhler S. J. & Futter M. N. 2012 Long-term dynamics of dissolved organic carbon: implications for drinking water supply. Science of the Total Environment 432, 1–11.
Lee Y.-N. & Schwartz S. E. 1981 Evaluation of the rate of uptake of nitrogen dioxide by atmospheric and surface liquid water. Journal of Geophysical Research: Oceans 86 (C12), 11971–11983.
Lehtiniemi M., Engström-Ost J. & Viitasalo M. 2005 Turbidity decreases anti-predator behaviour in pike larvae, Esox lucius. Environmental Biology of Fishes 73 (1), 1–8.
Liu H., Xu M. & Beck R. 2018 An ensemble approach to retrieving water quality parameters from multispectral satellite imagery. In: IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. pp. 9284–9287. IEEE, Valencia, Spain.
Lopes M. C., Martins A. L., Simedo M. B., Martins Filho M. V., Costa R. C., do Valle Junior R. F., Rojas N. E., Fernandes L. F., Pacheco F. A. & Pissarra T. C. 2021 A case study of factors controlling water quality in two warm monomictic tropical reservoirs located in contrasting agricultural watersheds. Science of the Total Environment 762, 144511.
Magro C., Nunes L., Gonçalves O. C., Neng N. R., Nogueira J. M., Rego F. C. & Vieira P. 2021 Atmospheric trends of CO and CH4 from extreme wildfires in Portugal using sentinel-5p TROPOMI level-2 data. Fire 4 (2), 25.
Mamun M., Atique U., Kim J. Y. & An K.-G. 2021 Seasonal water quality and algal responses to monsoon-mediated nutrient enrichment, flow regime, drought, and flood in a drinking water reservoir. International Journal of Environmental Research and Public Health 18 (20), 10714.
Martin R. V. 2008 Satellite remote sensing of surface air quality. Atmospheric Environment 42 (34), 7823–7843.
Matyssek R., Wieser G., Calfapietra C., de Vries W., Dizengremel P., Ernst D., Jolivet Y., Mikkelsen T. N., Mohren G. M. J., Le Thiec D., Tuovinen J.-P., Weatherall A. & Paoletti E. 2012 Forests under climate change and air pollution: gaps in understanding and future directions for research. Environmental Pollution 160, 57–65.
Mehmood Y., Qadar A. & Waheed A. 2022 Water Contamination, Households’ Risk Perceptions, and Averting Behavior: Evidence from the Nullah Lai, Rawalpindi, Pakistan. Journal of Asian and African Studies 0 (0). https://doi.org/10.1177/0021909622107611
Mohsen A., Elshemy M. & Zeidan B. 2021 Water quality monitoring of Lake Burullus (Egypt) using landsat satellite imageries. Environmental Science and Pollution Research 28 (13), 15687–15700.
Mopper K., Zhou X., Kieber R. J., Kieber D. J., Sikorski R. J. & Jones R. D. 1991 Photochemical degradation of dissolved organic carbon and its impact on the oceanic carbon cycle. Nature 353 (6339), 60–62.
Murtagh F. 1991 Multilayer perceptrons for classification and regression. Neurocomputing 2 (5–6), 183–197.
National Research Council 2000 Clean Coastal Waters: Understanding and Reducing the Effects of Nutrient Pollution. The National Academies Press, Washington, DC. https://doi.org/10.17226/9812.
Neil Cape J. & Lammel G. 1996 Nitrous acid and nitrite in the atmosphere. Chemical Society Reviews 25 (5), 361–369.
Nie J., Feng H., Witherell B. B., Alebus M., Mahajan M. D., Zhang W. & Yu L. 2018 Causes, assessment, and treatment of nutrient (N and P) pollution in rivers, estuaries, and coastal waters. Current Pollution Reports 4 (2), 154–161.
Odum E. P., Finn J. T. & Franz E. H. 1979 Perturbation theory and the subsidy-stress gradient. Bioscience 29 (6), 349–352.
Oyedotun T. D. T. 2019 Land use change and classification in chaohu lake catchment from multi-temporal remotely sensed images. Geology, Ecology, and Landscapes 3 (1), 37–45.
Paerl H. W. 1988 Nuisance phytoplankton blooms in coastal, estuarine, and inland waters. Limnology and Oceanography 33 (4part2), 823–843.
Panagopoulos A. 2021 Water-energy nexus: desalination technologies and renewable energy sources. Environmental Science and Pollution Research 28 (17), 21009–21022.
PM2.5 2022 Interactive Global map of 2021 pm2.5 Concentrations by City. Available from: https://www.iqair.com/world-air-quality-report (accessed 4 October 2022).
Price D., Birnbaum R., Batiuk R., McCullough M. & Smith R. 1997 Nitrogen Oxides: Impacts on Public Health and the Environment. Technical Report. Environmental Protection Agency, Office of Air and Radiation, Washington, DC, United States.
Reddy K. R., Fisher M. M. & Ivanoff D. 1996 Resuspension and Diffusive Flux of Nitrogen and Phosphorus in a Hypereutrophic Lake. Journal of Environmental Quality 25, 363–371. https://doi.org/10.2134/jeq1996.00472425002500020022x.
Ritchie J. C., Zimba P. V. & Everitt J. H. 2003 Remote sensing techniques to assess water quality. Photogrammetric Engineering & Remote Sensing 69 (6), 695–704.
Ruhela M., Sharma K., Bhutiani R., Chandniha S. K., Kumar V., Tyagi K., Ahamad F. & Tyagi I. 2022 GIS-based impact assessment and spatial distribution of air and water pollutants in mining area. Environmental Science and Pollution Research 29 (21), 31486–31500.
Shah A. A., Khan M. A., Kanwal N. & Bernstein R. 2016 Assessment of safety of drinking water in tank district: an empirical study of water-borne diseases in rural Khyber Pakhtunkhwa, Pakistan. International Journal of Environmental Sciences 6 (4), 418–428.
Sharaf El Din E. 2020 A novel approach for surface water quality modelling based on Landsat-8 tasselled cap transformation. International Journal of Remote Sensing 41 (18), 7186–7201.
Singh A. & Agrawal M. 2007 Acid rain and its ecological consequences. Journal of Environmental Biology 29 (1), 15.
Solomon C. T., Jones S. E., Weidel B. C., Buffam I., Fork M. L., Karlsson J., Larsen S., Lennon J. T., Read J. S., Sadro S. & Saros J. E. 2015 Ecosystem consequences of changing inputs of terrestrial dissolved organic matter to lakes: current knowledge and future challenges. Ecosystems 18 (3), 376–389.
Stull R. 2017 Meteorological influences on air quality: a review. Boundary-Layer Meteorology 162 (2), 351–367. doi:10.1007/s10546-016-0229-y.
Theologou I., Patelaki M. & Karantzalos K. 2015 Can single empirical algorithms accurately predict inland shallow water quality status from high resolution, multi-sensor, multi-temporal satellite data? The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 40 (7), 1511.
Theys N., De Smedt I., Yu H., Danckaert T., van Gent J., Hörmann C., Wagner T., Hedelt P., Bauer H., Romahn F., Pedergnana M., Loyola D. & Roozendael M. V. 2017 Sulfur dioxide retrievals from TROPOMI onboard sentinel-5 precursor: algorithm theoretical basis. Atmospheric Measurement Techniques 10 (1), 119–153.
United Nations 2022 Water and Sanitation – United Nations Sustainable Development. Available from: https://www.un.org/sustainabledevelopment/water-and-sanitation/ (accessed 4 October 2022).
Usali N. & Ismail M. H. 2010 Use of remote sensing and GIS in monitoring water quality. Journal of Sustainable Development 3 (3), 228.
Van Geffen J., Boersma K. F., Eskes H., Sneep M., Ter Linden M., Zara M. & Veefkind J. P. 2020 S5p TROPOMI NO2 slant column retrieval: method, stability, uncertainties and comparisons with OMI. Atmospheric Measurement Techniques 13 (3), 1315–1335.
Veefkind J. P., Aben I., McMullan K., Förster H., de Vries J., Otter G., Claas J., Eskes H. J., de Haan J. F., Kleipool Q., van Weele M., Hasekamp O., Hoogeveen R., Landgraf J., Snel R., Tol P., Ingmann P., Voors R., Kruizinga B., Vink R., Visse H. & Levelt P. F. 2012 TROPOMI on the ESA sentinel-5 precursor: a GMES mission for global observations of the atmospheric composition for climate, air quality and ozone layer applications. Remote Sensing of Environment 120, 70–83.
Weather Spark. Weather by month, average temperature (Pakistan). Available from: https://weatherspark.com/y/107761/.
Weber J. 2020 Humic substances and their role in the environment. EC Agriculture 3, 1–6.
Xu M., Liu H., Beck R., Lekki J., Yang B., Shu S., Liu Y., Benko T., Anderson R., Tokars R., Johansen R., Emery E. & Reif M. 2019 Regionally and locally adaptive models for retrieving chlorophyll-a concentration in inland waters from remotely sensed multispectral and hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing 57 (7), 4758–4774.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).