Abstract
Water quality modeling is essential for the management of water resources. In this study, the upper part of the Porsuk Basin in Türkiye is analyzed using the Soil and Water Assessment Tool (SWAT). In addition to the data provided by the General Directorate of State Hydraulic Works (DSI), freely available flow and water quality data from the GEMStat data portal were used. The study discusses the practicality of GEMStat data for a water quality model. For this purpose, the SWAT model was first constructed with freely available global data sources on elevation, land use/land cover, and soil type. Then, the model flow outputs were calibrated and validated against DSI and GEMStat data in three different time periods. At the daily time step, the Nash–Sutcliffe efficiency (NSE) was 0.64 for flow calibration and 0.44 for the first flow validation. The model was also validated using GEMStat flow data (NSE = 0.35) and calibrated using GEMStat water quality data for nitrate (NO3), total suspended solids (TSS), and dissolved oxygen (DO), with NSE values of 0.47, 0.42, and 0.61, respectively. Hence, the results showed that GEMStat flow and water quality data can be used as auxiliary open-source data in the modeling process.
HIGHLIGHTS
The importance of data monitoring and the benefit of the freely available data were emphasized.
The usability of the data obtained from the global data provider portal GEMStat was tested using SWAT.
Model-generated flow and water quality data showed reasonable fit (NSE values between 0.35 and 0.64) with the real data in the daily time step.
INTRODUCTION
The protection of aquatic ecosystems against pollution is an important global priority (United Nations General Assembly 2015). Studies on this issue aim to conserve clean water resources and improve water quality, both of which depend on measuring the parameters that cause pollution. Freshwater resources are becoming increasingly polluted due to agricultural and urban activities (Carpenter et al. 1998; Bhateria & Jain 2016). In recent years, measuring and monitoring water quality parameters in rivers, lakes, wetlands, and coastal waters has gained importance (USEPA 2000; Kirschke et al. 2020). Monitoring water quantity and quality is necessary for informed decision-making and for minimizing uncertainties in water management (WMO 2009; Stewart 2015), even though these procedures are laborious and costly (Barcelona et al. 1985; Chapman 1996; Zessner 2021). These measurements offer multiple benefits, such as flood forecasting and warning, protection of lives and property, floodplain mapping, determination of environmental or ecological flows, and regulation of pollutant discharges (USGS 2006). Such measurements are also essential for watershed modeling studies performed for the management and regulation of water quality. Moreover, calibration and validation of watershed models require long-term and accurate observations of water quantity and quality parameters, which are generally limited (Sivapalan 2003).
Although watershed models are robust tools for predicting water quality in a region, modeling procedures are challenging due to the data scarcity issue, especially in low- and middle-income countries (Wagner et al. 2009). Therefore, a simple but scientifically proven hydrological model that does not require too much data would be a useful tool for water quantity and quality estimations (Kwakye & Bárdossy 2020). Choosing an appropriate model and obtaining the necessary data is critical to construct a reliable model. Knowledge about the water quantity and quality of the basin can be obtained from modeling studies. There are numerous modeling tools for examining water quality parameters (Tsakiris & Alexakis 2012; Wang et al. 2013; Gao & Li 2014; Gülbaz 2019; Ejigu 2021; Bai et al. 2022). HSPF (Hydrological Simulation Program-FORTRAN) (Barnwell & Johanson 1981), WASP (Water Quality Analysis Simulation Program) (Di Toro et al. 1983), QUAL2E (Brown & Barnwell 1987), SWAT (Soil and Water Assessment Tool) (Arnold et al. 1998), and AQUATOX (Park et al. 2008) are some commonly used water quality models. Among them, SWAT is an effective tool for both water quantity and quality assessment, especially in agricultural watersheds. Costa et al. (2021) showed that SWAT is the most frequently used water quality model in the world.
Obtaining the data required for modeling is as important as selecting an appropriate modeling tool. Although many physical (such as elevation, land use/land cover, and soil type) and meteorological (such as precipitation and temperature) global data sources are available, obtaining previously measured water quality data is relatively difficult. Calibration or output testing in water quality models is also quite challenging. Studies in this field require great effort and extensive data archives. Required data can be measured by researchers, obtained from local organizations, or both. Open and free data sources have also been reported to be beneficial for hydrology studies (Newcomer et al. 2022). Researchers rely on other data sources when they have not conducted field measurements in the region. There are a few continental and global datasets such as CESI (Canadian Environmental Sustainability Indicators program), GEMStat (Global Freshwater Quality Database), GLORICH (GLObal RIver CHemistry), Waterbase, and WQP (Water Quality Portal) (Virro et al. 2021). Only GEMStat provides datasets about the water quality of some rivers in Türkiye. Among these rivers, water quality measurement data for Porsuk Stream are available in the GEMStat data portal. Furthermore, some researchers collected their own field measurements in the Porsuk Basin (Orak 2006; Yüce et al. 2006; Solak 2009; Gürel 2011; Çelen et al. 2014; Köse et al. 2018; Şahin 2018) and some gathered data from local organizations (Yerel 2010; Güngör 2011). In the present study, the GEMStat data portal was chosen as the sole global source of water quality measurement data, primarily because of its unique coverage of Turkish rivers. GEMStat also offers a user-friendly interface and convenient access to data through its map-based visual database.
Furthermore, it is important to highlight that GEMStat offers this valuable data at no cost. Both water quantity (flow) and water quality (nitrate (NO3), total suspended solids (TSS), and dissolved oxygen (DO)) data were used. The aim of this study is to investigate the availability of global water quality data, rather than to present a comprehensive water quality model. Accordingly, flow and water quality data obtained from the model were calibrated and validated in three different time periods. Therefore, the usability of GEMStat data in such a study is also examined. Our results are believed to provide more insight into the usability of GEMStat data in future studies.
MATERIALS AND METHODS
Study area and data
The upper part of the Porsuk Basin was selected as the study area. Porsuk Stream is an important tributary of the Sakarya River, whose basin is one of the 25 main basins in Türkiye. The stream is an important water source in Türkiye; in particular, the upper part of the Porsuk Basin (up to Eskişehir province) is used as a drinking and irrigation water basin (Köse et al. 2016). Therefore, examining the water quality in this region is of great importance for the health of ecosystems.
The spatial data consist of a digital elevation model (DEM), land use/land cover, and soil type maps. Shuttle Radar Topography Mission (SRTM) data with a 30 m cell size were used for watershed delineation and stream network generation. The European Environment Agency – Coordination of Information on the Environment (EEA-CORINE) data with a 100 m cell size and the Digital Soil Map of the World (FAO-DSMW) at a scale of 1:5,000,000, produced by the Food and Agriculture Organization and the United Nations Educational, Scientific and Cultural Organization (FAO-UNESCO), were employed for generating the land use/land cover and soil type maps, respectively. CORINE and FAO-DSMW data are widely used, particularly among users of the SWAT model (ShangGuan et al. 2014; Abbaspour et al. 2019; Busico et al. 2020; López-Ballesteros et al. 2023). Despite the coarse resolution and relatively outdated nature of the FAO-DSMW data, they were favored in this study partly because their soil classes match the SWAT model codes directly. The meteorological data were taken from the global Climate Forecast System Reanalysis (CFSR) dataset (Fuka et al. 2014), which is available for 1979–2013. There are two flow stations on the upper part of the Porsuk River. The first (Station Code: E12A003) is located at the outlet of the whole basin and has continuous discharge data for the 1980–2006 period. The second (Station Code: D12A033) is located in the middle of the basin and has continuous discharge data for 1997–2006. The second station also has non-continuous data for instantaneous flow, NO3, TSS, and DO.
These parameters were gathered from the United Nations Environment Programme Global Environment Monitoring System (UNEP GEMS, UNEP-GEMS/Water Programme 2006). The NO3 data covered the years 1984–1987, while discharge, TSS, and DO data covered 1980–1987. Therefore, the period 1979–2006 was selected as the modeling period. Table 1 shows the details of the data.
Table 1. Data description, source, and scale/resolution used in the modeling
Data type | Data source | Scale/Resolution |
---|---|---|
HRU definition data | ||
Digital elevation model (DEM) | SRTM | Grid cell 30 × 30 m |
Land use/land cover | CORINE (year 1990) | Grid cell 100 × 100 m |
Soil | FAO-UNESCO DSMW | Scale 1:5,000,000 |
Meteorological data | ||
Precipitation | CFSR | Grid cell ∼38 km |
Max./Min. temperature | CFSR | Grid cell ∼38 km |
Relative humidity | CFSR | Grid cell ∼38 km |
Solar radiation | CFSR | Grid cell ∼38 km |
Wind speed | CFSR | Grid cell ∼38 km |
Calibration/Validation data | ||
Flow | DSI and GEMStat | Ground station |
Water quality (NO3, TSS, and DO) | GEMStat | Ground station |
Model setup
The SWAT (Soil and Water Assessment Tool) model is a deterministic, semi-distributed, process-based watershed model developed by the USDA (United States Department of Agriculture) (Arnold et al. 1998; Neitsch et al. 2011). Successful applications of the SWAT model have been demonstrated across disciplines, from the field scale to the basin scale, and in different climatic zones and geographical conditions (Gassman et al. 2007; Ahn & Kim 2019).
For the current study, the ArcSWAT extension was used to set up the model for the Upper Porsuk Basin, which allows the model inputs to be incorporated easily in a GIS environment. In the first step, the Upper Porsuk Basin was subdivided into 20 sub-basins based on the DEM. Then, 586 hydrologic response units (HRUs) were automatically created by combining the spatial model inputs (elevation, land use/land cover, and soil). Ten elevation bands were used. The meteorological data, prepared in the appropriate format and covering the whole modeling period (1979–2006), were entered into the model. Given the available meteorological data, the Penman-Monteith method (Monteith 1965) was preferred for evapotranspiration calculation.
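The HRU definition step can be illustrated with a minimal sketch. It assumes that every grid cell has already been labelled with a sub-basin, a land-use class, a soil class, and a slope class derived from the DEM; the cell records and class names below are hypothetical, and ArcSWAT's threshold-based filtering of minor classes is not reproduced.

```python
from collections import Counter

# Hypothetical grid cells: (sub-basin id, land-use class, soil class, slope class).
cells = [
    (1, "AGRL", "Lc", "0-5"),
    (1, "AGRL", "Lc", "0-5"),
    (1, "FRST", "Lc", "5-15"),
    (2, "AGRL", "I", "0-5"),
    (2, "PAST", "I", "5-15"),
    (2, "PAST", "I", "5-15"),
]

def define_hrus(cells):
    """Group cells that share sub-basin, land use, soil, and slope class.

    Each distinct combination is treated as one HRU; the count is the
    number of cells (a proxy for area) that belong to it.
    """
    return Counter(cells)

hrus = define_hrus(cells)
for signature, n_cells in sorted(hrus.items()):
    print(signature, n_cells)
print("Number of HRUs:", len(hrus))
```

In this toy example six cells collapse into four HRUs. The same grouping logic, applied to the real raster overlays, is what produces an HRU count such as the 586 reported above.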
RESULTS
Flow calibration and validation
The SWAT-CUP (SWAT-Calibration and Uncertainty Program) automatic calibration program (Abbaspour 2012) was used to obtain optimized parameters. The SUFI-2 (Sequential Uncertainty Fitting version 2) algorithm (McKay et al. 1979) in SWAT-CUP was employed for calibration. Sensitive parameters were determined with the global sensitivity analysis performed in SWAT-CUP before the calibration. The sensitivity analysis identified CN2 as the most sensitive parameter, with by far the largest absolute t-stat. This was expected and is consistent with previous research using the SWAT model (White & Chaubey 2005; Van Griensven et al. 2006; Arnold et al. 2012a, 2012b; Abbaspour et al. 2015; Khalid et al. 2016). As shown in Table 2, the parameters with the highest absolute t-stat and a p-value < 0.05 are the most sensitive with respect to flow outputs. In the initial analysis, calibration using only the seven parameters that passed the sensitivity threshold (p-value < 0.05) did not reach acceptable performance. Subsequently, the number of calibration parameters was increased to 20, and the performance criteria were revised to attain more reasonable levels. The calibration was carried out at the daily time step in four iterations of 500 simulations each. The sensitive parameters, their descriptions, change methods, and final calibrated values are presented in Table 3.
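SUFI-2 draws its parameter sets by Latin hypercube sampling, the method introduced by McKay et al. (1979). The following is a simplified sketch of that sampling idea only, not SWAT-CUP's implementation; the function name and the seed are illustrative assumptions, and the ranges are the initial ranges of three directly replaced parameters listed in Table 3.

```python
import random

def latin_hypercube(ranges, n_samples, seed=0):
    """Return n_samples parameter sets drawn by Latin hypercube sampling.

    Each parameter range is split into n_samples equal strata, and every
    stratum is sampled exactly once per parameter.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n_samples
        # One random point inside each stratum, then shuffle the order so
        # that strata of different parameters are paired at random.
        points = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]

# Initial ranges from Table 3 (parameters changed with the "v_" method).
ranges = {"ESCO": (0.7, 1.0), "ALPHA_BF": (0.0, 1.0), "GW_DELAY": (1.0, 50.0)}
samples = latin_hypercube(ranges, n_samples=500)
print(len(samples), samples[0])
```

With 500 samples per iteration, as used in this study, every parameter range is covered evenly even though only 500 model runs are made.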
Table 2. Global sensitivity analysis results
Number | Parameter name | p-value | t-stat | Absolute t-stat |
---|---|---|---|---|
1 | CN2 | 0.0000 | −39.9703 | 39.9703 |
2 | SMFMN | 0.0000 | −5.1006 | 5.1006 |
3 | SOL_Z | 0.0000 | 4.8760 | 4.8760 |
4 | ESCO | 0.0003 | −3.6491 | 3.6491 |
5 | SOL_AWC | 0.0060 | 2.7683 | 2.7683 |
6 | SFTMP | 0.0073 | −2.7033 | 2.7033 |
7 | GW_DELAY | 0.0102 | 2.5870 | 2.5870 |
8 | CH_K1 | 0.0548 | 1.9285 | 1.9285 |
9 | SMFMX | 0.0660 | −1.8461 | 1.8461 |
10 | TLAPS | 0.0856 | −1.7253 | 1.7253 |
11 | CH_N1 | 0.1796 | 1.3454 | 1.3454 |
12 | GWQMN | 0.1917 | −1.3089 | 1.3089 |
13 | REVAPMN | 0.1967 | −1.2942 | 1.2942 |
14 | SMTMP | 0.2056 | −1.2689 | 1.2689 |
15 | RCHRG_DP | 0.2087 | −1.2601 | 1.2601 |
16 | TIMP | 0.6528 | −1.2087 | 1.2087 |
17 | ALPHA_BF | 0.2572 | 1.1355 | 1.1355 |
18 | SOL_K | 0.3118 | −1.0133 | 1.0133 |
19 | SLSUBBSN | 0.6863 | −0.7521 | 0.7521 |
20 | GW_REVAP | 0.5602 | −0.7352 | 0.7352 |
21 | PLAPS | 0.5046 | −0.6681 | 0.6681 |
22 | EPCO | 0.4628 | 0.5833 | 0.5833 |
23 | LAT_TTIME | 0.2278 | 0.4504 | 0.4504 |
24 | SOL_BD | 0.4526 | 0.4044 | 0.4044 |
25 | SURLAG | 0.7608 | −0.3048 | 0.3048 |
26 | FFCB | 0.7648 | −0.2995 | 0.2995 |
27 | OV_N | 0.8171 | −0.2316 | 0.2316 |
28 | CANMX | 0.8906 | 0.1377 | 0.1377 |
29 | CH_N2 | 0.9318 | 0.0856 | 0.0856 |
30 | CH_K2 | 0.9795 | 0.0257 | 0.0257 |
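The selection rule described for Table 2 (p-value < 0.05, ranked by absolute t-stat) can be reproduced with a short sketch. Only the ten highest-ranked rows of Table 2 are included; the function name is an illustrative assumption.

```python
# (parameter, p-value, t-stat) for the ten highest-ranked rows of Table 2.
rows = [
    ("CN2", 0.0000, -39.9703), ("SMFMN", 0.0000, -5.1006),
    ("SOL_Z", 0.0000, 4.8760), ("ESCO", 0.0003, -3.6491),
    ("SOL_AWC", 0.0060, 2.7683), ("SFTMP", 0.0073, -2.7033),
    ("GW_DELAY", 0.0102, 2.5870), ("CH_K1", 0.0548, 1.9285),
    ("SMFMX", 0.0660, -1.8461), ("TLAPS", 0.0856, -1.7253),
]

def sensitive_parameters(rows, alpha=0.05):
    """Keep parameters with a p-value below alpha, ranked by |t-stat|."""
    kept = [row for row in rows if row[1] < alpha]
    ranked = sorted(kept, key=lambda row: abs(row[2]), reverse=True)
    return [name for name, _, _ in ranked]

selected = sensitive_parameters(rows)
print(selected)
print(len(selected))
```

The filter returns the seven parameters from CN2 to GW_DELAY, which matches the seven parameters used in the initial calibration attempt.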
Table 3. Calibrated values for sensitive parameters with descriptions, change methods, and initial ranges
Parameter | Description | Initial range (min) | Initial range (max) | Change method* | Calibrated value |
---|---|---|---|---|---|
TLAPS | Temperature lapse rate (°C/km) | −6 | −3 | v_ | −5.60 |
SFTMP | Snowfall temperature (°C) | −3 | 3 | v_ | −1.01 |
SMTMP | Snowmelt base temperature (°C) | −3 | 3 | v_ | 1.18 |
SMFMX | Melt factor for snow on June 21 (mm H2O/°C-day) | 3 | 6 | v_ | 4.30 |
SMFMN | Melt factor for snow on December 21 (mm H2O/°C-day) | 0 | 3 | v_ | 1.63 |
TIMP | Snowpack temperature lag factor | 0 | 1 | v_ | 0.36 |
ESCO | Soil evaporation compensation factor | 0.7 | 1 | v_ | 0.82 |
CN2 | Initial SCS runoff curve number for moisture condition II | −0.3 | 0.3 | r_ | −0.30 |
ALPHA_BF | Baseflow alpha factor (1/days) | 0 | 1 | v_ | 0.58 |
GWQMN | Threshold depth of water in the shallow aquifer required for return flow to occur (mm H2O) | 100 | 3,000 | v_ | 1,402.56 |
GW_DELAY | Groundwater delay time (days) | 1 | 50 | v_ | 1.02 |
GW_REVAP | Groundwater revap coefficient | 0.02 | 0.2 | v_ | 0.14 |
REVAPMN | Threshold depth of water in the shallow aquifer for revap or percolation to the deep aquifer to occur (mm H2O) | 100 | 300 | v_ | 187.95 |
RCHRG_DP | Deep aquifer percolation fraction | 0 | 0.5 | v_ | 0.31 |
CH_K1 | Effective hydraulic conductivity in tributary channel alluvium (mm/h) | −0.3 | 0.3 | r_ | −0.03 |
CH_N1 | Manning's ‘n’ value for the tributary channels | −0.3 | 0.3 | r_ | −0.19 |
SOL_Z | Depth from soil surface to bottom of layer (mm) | −0.3 | 0.3 | r_ | 0.22 |
SOL_AWC | Available water capacity of the soil layer (mm H2O/mm soil) | −0.3 | 0.3 | r_ | −0.14 |
SOL_K | Saturated hydraulic conductivity (mm/h) | −0.3 | 0.3 | r_ | 0.30 |
SLSUBBSN | Average slope length (m) | −0.3 | 0.3 | r_ | −0.10 |
*v_ indicates that the existing parameter value is replaced directly by the given value. r_ indicates a relative change, i.e., the existing parameter value is multiplied by (1 + the given value).
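The two change methods can be made concrete with a minimal sketch. The calibrated values are taken from Table 3, whereas the initial values (75 for CN2 and 0.95 for ESCO) and the function name are hypothetical and serve only as an illustration.

```python
def apply_change(initial_value, change_method, calibrated):
    """Apply a SWAT-CUP style parameter change.

    "v_" replaces the existing value with the calibrated value.
    "r_" multiplies the existing value by (1 + calibrated value).
    """
    if change_method == "v_":
        return calibrated
    if change_method == "r_":
        return initial_value * (1.0 + calibrated)
    raise ValueError(f"unknown change method: {change_method}")

# Calibrated values from Table 3; the initial values are hypothetical.
cn2 = apply_change(75.0, "r_", -0.30)   # relative change: 75 * (1 - 0.30)
esco = apply_change(0.95, "v_", 0.82)   # direct replacement
print(round(cn2, 2), esco)
```

A relative change preserves the spatial pattern of a parameter such as CN2, which differs between HRUs, while a direct replacement assigns the same value everywhere.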
Precipitation and flow relation for (a) the calibration and the first validation, (b) the second validation, and flow values for (c) the third validation periods.
Water quality: NO3, TSS, and DO
After the flow calibration and validation procedure, the water quality simulations were examined for the available variables (NO3, TSS, and DO). The water quality observations provided by the GEMStat data portal were used to calibrate the SWAT model outputs. Calibration was done manually, guided by values in the literature (Arnold et al. 2012a, 2012b). Nitrate-related and TSS-related parameters were tested with multiple re-runs, and their sensitivities were determined accordingly. The most influential parameters were then adjusted as recommended by Arabi et al. (2008), Kannan (2012), Qiu et al. (2012), and Abbaspour et al. (2015) to improve the performance. As a result, eight parameters were adjusted manually, and the values giving the highest performance were retained. The calibrated values and descriptions of these parameters are given in Table 4. Model results were expressed in kg for NO3 and in tons for TSS. The in-stream DO concentration obtained from GEMStat was converted from mg/L to kg so that it could be compared with the SWAT model output. To convert concentrations (mg/L) to loads, the flow data provided by the GEMStat data portal, previously validated in the flow calibration and validation section (third validation period), were used.
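The conversion from concentration to load can be sketched as follows. The sketch assumes that the instantaneous flow measured with each sample is representative of the whole day, so that loads are expressed per day; the function names and the sample values are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def daily_load_kg(concentration_mg_per_l, flow_m3_per_s):
    """Convert a concentration (mg/L) and a flow (m3/s) to a load in kg/day.

    1 mg/L equals 1 g/m3, so concentration * flow is a mass flux in g/s.
    """
    grams_per_day = concentration_mg_per_l * flow_m3_per_s * SECONDS_PER_DAY
    return grams_per_day / 1000.0

def daily_load_tons(concentration_mg_per_l, flow_m3_per_s):
    """Same conversion expressed in metric tons per day (used here for TSS)."""
    return daily_load_kg(concentration_mg_per_l, flow_m3_per_s) / 1000.0

# Hypothetical sample: 2 mg/L nitrate and 50 mg/L TSS at a flow of 5 m3/s.
no3_load = daily_load_kg(2.0, 5.0)
tss_load = daily_load_tons(50.0, 5.0)
print(no3_load, tss_load)
```

Because the load is the product of concentration and flow, any error in the flow record carries over to the load, which is why the GEMStat flow data were validated first.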
Table 4. Calibrated values for water quality parameters with their descriptions
Parameter | Description | Calibrated value |
---|---|---|
RCN | Concentration of nitrogen in rainfall (mg N/L) | 1 |
NPERCO | Nitrate percolation coefficient | 0.2 |
CMN | Rate factor for humus mineralization of active organic nutrients (N and P) | 0.0003 |
CH_COV1 | Channel erodibility factor | 0.1 |
CH_COV2 | Channel cover factor | 0.1 |
SPCON | Linear parameter for calculating the maximum amount of sediment that can be re-entrained during channel sediment routing | 0.00015 |
SPEXP | Exponent parameter for calculating sediment re-entrained in channel sediment routing | 1.05 |
PRF | Peak rate adjustment factor for sediment routing in the main channel | 1.7 |
Table 5. Model performance results for flow and water quality variables with the periods and number of samples
Variable | Period | Source | Number of samples | NSE | R2 | PBIAS (%) |
---|---|---|---|---|---|---|
Flow-Calibration | 1.1.2002–12.31.2006 | DSI | Continuous-5 year daily | 0.64 | 0.65 | −11.36 |
Flow-Validation1 | 1.1.1997–12.31.2000 | DSI | Continuous-4 year daily | 0.44 | 0.50 | −35.19 |
Flow-Validation2 | 1.1.1980–12.31.2006 | DSI | Continuous-26 year daily | 0.40 | 0.52 | +1.77 |
Flow-Validation3 | 1.1.1980–12.31.1987 | GEMStat | 96 measurements | 0.35 | 0.40 | −7.36 |
NO3 | 1.1.1984–12.31.1987 | GEMStat | 40 measurements | 0.47 | 0.57 | +31.42 |
TSS | 1.1.1980–12.31.1987 | GEMStat | 96 measurements | 0.42 | 0.42 | −0.24 |
DO | 1.1.1980–12.31.1987 | GEMStat | 96 measurements | 0.61 | 0.62 | −8.15 |
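The three performance criteria can be computed as in the following minimal sketch. It uses the common definitions of NSE, R2 (squared Pearson correlation), and PBIAS, and assumes the sign convention in which a positive PBIAS indicates that the model underestimates the observations; the paper does not state its convention, and the observed and simulated values below are hypothetical.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - sum((O - S)^2) / sum((O - mean(O))^2)."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)

def pbias(obs, sim):
    """Percent bias: 100 * sum(O - S) / sum(O) (assumed sign convention)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

# Hypothetical paired observations and simulations (e.g., daily flow in m3/s).
obs = [2.0, 4.0, 6.0, 8.0]
sim = [3.0, 4.0, 5.0, 7.0]
print(round(nse(obs, sim), 3), round(r_squared(obs, sim), 3), round(pbias(obs, sim), 2))
```

For the sparse GEMStat records, the same functions would be applied only to the days on which a measurement exists, paired with the simulated value for that day.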
Calibrated outputs for (a) NO3, (b) TSS, and (c) DO in the sampling periods with the GEMStat measurement points.
In addition to the model performance metrics listed in Table 5, measurements reported in the literature for the same study area were examined, so that the reliability of the GEMStat data would not rest on SWAT model simulations alone. For this purpose, the river water quality observations obtained by different researchers in the Upper Porsuk Basin are listed in Table 6. This makes it possible to compare the order of magnitude of the GEMStat NO3, TSS, and DO data with the observation values available in the literature. The GEMStat data fall within a range similar to that of the other observations. Although the GEMStat TSS data contain a few exceptionally high values (5 of 96 samples exceed 500 mg/L), these cases are too few to affect the overall picture.
Table 6. NO3, TSS, and DO data available in the literature in the Upper Porsuk Basin with GEMStat
Data source | NO3-N (mg/L) | TSS* (mg/L) | DO (mg/L) | Observation date |
---|---|---|---|---|
GEMStat | 0.5–2.15 | 3–500 | 6.2–12.4 | 1984–1987 |
Köse et al. (2016) | 1.01 | NA | 7.56 | 2015 |
Orak (2006) | NA | NA | 8.24–9.11 | 2001–2002 |
Solak (2009) | NA | 280.2–497.3 | 4.22–10.13 | 2006 |
Güngör (2011) | NA | 90 | NA | 2003–2005 |
Gürel (2011) | 0.15–2.15 | NA | 1.9–13 | 2009 |
Şahin (2018) | 0.5–6 | NA | 9–11 | 2016 |
*Ranges and number of samplings for TSS
Range (mg/L) | Number of samplings |
---|---|
3–500 | 91 |
500–1,000 | 2 |
>1,000 | 3 |
CONCLUSIONS
Determining the hydrological behavior of the Upper Porsuk Basin is an important step in assessing water quality parameters. For this purpose, the flow outputs of the basin model developed in SWAT were examined first, over the calibration and validation periods. A reasonable flow model for the Upper Porsuk Basin was thus obtained, which provided a sufficient basis for developing a water quality model. The NSE values varied between 0.35 and 0.44 for the three validation periods. Although these values were below the NSE value of 0.64 obtained in the calibration, they were within the acceptable range. In the second stage of the study, the water quality outputs were calibrated. For this purpose, NO3, TSS, and DO values provided by the GEMStat data portal were compared with the SWAT model outputs, and the performance of the model was determined. The resulting NSE values were 0.47, 0.42, and 0.61 for NO3, TSS, and DO, respectively.
SWAT is a reliable and widely used modeling tool for water quantity and quality outputs. The GEMStat data portal is an important data source for checking the outputs of water quality models such as SWAT. Few measured datasets suitable for calibrating water quality models are available, and taking and analyzing samples is costly and time-consuming. Global or continental-scale data portals such as GEMStat help researchers overcome this difficulty. Collecting data from rivers is very important for the calibration and validation of water quality models. In this study, the practicability of the GEMStat global data was tested with a SWAT water quality model for a basin in Türkiye. Furthermore, GEMStat data were compared with available measurements of NO3, TSS, and DO from previous field studies conducted in the Upper Porsuk River section. The similarity in range between the observed values reported in the literature and the GEMStat data serves as an additional check that supports the reliability of GEMStat from a different perspective. Therefore, simulated NO3, TSS, and DO data can be validated using this freely available dataset. The results of this study showed that sharing previous measurements over the internet can save time and labor, since it reduces the need for difficult field measurements to calibrate and test water quality models. However, it should be noted that the GEMStat data used in this study date from before 2000. For this reason, we strongly recommend the development of up-to-date, open, and free data portals. As researchers and institutions make wider use of open and easily accessible data portals such as GEMStat for checking model outputs, modeling studies would become more accurate and effective. In conclusion, open data portals can help overcome data scarcity, which is the most important issue in strategy planning practices for water pollution and management.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.