Abstract
The high-resolution Climate Forecast System Reanalysis (CFSR) data have recently become an alternative input for hydrological models in data-sparse regions. However, the quality of CFSR data for running hydrological models in the Arctic has not yet been well studied. This paper compares the quality of CFSR data with that of ground-based data for hydrological modeling in an Arctic watershed, Målselv. The QSWAT model, a coupling of the hydrological model SWAT (soil and water assessment tool) and QGIS, was applied in this study. The model ran from 1995 to 2012 with a 3-year warm-up period (1995–1997). Calibration (1998–2007), validation (2008–2012), and uncertainty analyses were performed with the Sequential Uncertainty Fitting Version 2 (SUFI-2) algorithm in the SWAT Calibration Uncertainties Program for each dataset at five hydro-gauging stations within the watershed. The objective function, the Nash–Sutcliffe coefficient of efficiency, for calibration is 0.65–0.82 with CFSR data and 0.55–0.74 with ground-based data, which indicates that the high-resolution CFSR data perform better than the existing scattered ground-based data. The CFSR weather grid points showed higher variation in precipitation than the ground-based weather stations across the whole watershed. The average annual rainfall calculated from CFSR data for the whole watershed is approximately 24% higher than that from ground-based data, which results in higher values for some water balance components. The CFSR data also demonstrate a high capacity to replicate the streamflow hydrograph in terms of the timing and magnitude of peak and low flows. Based on the uncertainty coefficients, P-factor (≥0.7) and R-factor (≤1.5), this study concludes that CFSR data are a reliable source for running hydrological models in the Arctic watershed Målselv.
HIGHLIGHTS
The high-resolution CFSR dataset performs better than the existing scattered ground-based dataset in terms of the statistical coefficients R2, NSE, and RSR.
The CFSR dataset yields higher simulated values for some water balance components, e.g., actual evapotranspiration, lateral flow, and water yield, than the scattered conventional dataset.
The CFSR demonstrates a high capacity to replicate the streamflow hydrograph.
Uncertainty analysis reveals that CFSR is a reliable weather input for running hydrological models in the Arctic watershed Målselv.
The emerging and open-source QSWAT is a valuable tool for the SWAT scientific community because of its upgraded availability and functionality compared to other SWAT interfaces.
INTRODUCTION
A watershed is a basic land unit for studying the hydrological cycle and for water resource management and planning (Edwards et al. 2015; Yu & Duffy 2018). It is defined as a land area where most of the precipitation drains to the same place, e.g., a water body or a low-lying area (Edwards et al. 2015). The development of hydrological models has been a major goal of hydrologists (Ehret et al. 2014; Clark et al. 2017) in order to improve the understanding of hydrological processes and to support watershed management (Yu & Duffy 2018). However, a persistent and time-consuming challenge in modeling is collecting accurate and representative weather input data for hydrological models (Mehta et al. 2004; Kouwen et al. 2005; Fuka et al. 2014; Lu et al. 2019). Generally, ground-based weather stations do not always sufficiently represent the weather pattern across the whole watershed (Fuka et al. 2014) because (1) meteorological stations are sparsely distributed and often far from the watershed to be modeled (Zhang et al. 2016; Tolera et al. 2018); (2) time-series data usually contain gaps and errors; and (3) up-to-date datasets are not available. Due to these limitations of ground-based data, finding alternative sources of weather inputs for hydrological models is essential. This is especially crucial for the data-sparse Arctic region (Lindsay et al. 2014; WMO 2018). An alternative source, which has recently been favored by scientists, is multiyear global atmospheric reanalysis data (Fuka et al. 2014).
Atmospheric reanalysis data are generated through data assimilation, the process of integrating all available information to estimate as accurately as possible the characteristics of a system (Talagrand 1997). The information combines observed data (e.g., from ground-based gauges, ships, aircraft, and satellites) with forecast data (e.g., from numerical weather prediction models) (Parker 2016). Reanalysis provides comprehensive features of climate at regular time steps over a long period, usually from years to decades. Therefore, reanalysis data have been used in various fields, such as atmospheric dynamics (Kidston et al. 2010), investigation of climate variability (Kravtsov et al. 2014), evaluation of climate models (Gleckler et al. 2008), studying greenhouse gas fingerprints (Santer et al. 2004), and the study of hydrology and hydrological models (Lavers et al. 2012; Najafi et al. 2012; Quadro et al. 2013; Smith & Kummerow 2013; Fuka et al. 2014; Bressiani et al. 2015; Alemayehu et al. 2016; Tolera et al. 2018). Many atmospheric reanalysis products have been generated recently; some well-known ones are listed below (Lindsay et al. 2014):
The National Centers for Environmental Prediction (NCEP)–National Center for Atmospheric Research (NCAR) Reanalysis 1 (NCEP-R1) (Kalnay et al. 1996; Kistler et al. 2001);
The NCEP–U.S. Department of Energy (DOE) Reanalysis 2 (NCEP-R2) (Kanamitsu et al. 2002);
Climate Forecast System Reanalysis (CFSR) generated by the NCEP (Saha et al. 2010);
Twentieth-Century Reanalysis (20CR) generated by the National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory (ESRL)–Cooperative Institute for Research in Environmental Sciences (CIRES) (Whitaker et al. 2004; Compo et al. 2006, 2011);
Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2) generated by the National Aeronautics and Space Administration (NASA) Global Modeling and Assimilation Office (GMAO) (Gelaro et al. 2017; Tao et al. 2019);
ERA5, the successor of ERA-Interim, generated by European Centre for Medium-Range Weather Forecasts (ECMWF) (Hersbach et al. 2020); and
Japanese 25-year Reanalysis Project (JRA-25) generated by the Japanese Meteorological Agency (JMA) (Onogi et al. 2007).
A comparison of the characteristics of the above seven well-known reanalysis products is shown in Table 1. Of them, the CFSR and ERA5 have the highest spatial resolution, with a Gaussian grid (Washington & Parkinson 2005) of approximately 38 km (NCAR 2017) and approximately 31 km (Hersbach et al. 2020), respectively. However, CFSR is the only one that covers all input data (e.g., precipitation, maximum and minimum air temperature, relative humidity, solar radiation, and wind speed) required by the hydrological model used in this study, the SWAT (soil and water assessment tool) model. Therefore, the CFSR is selected for the evaluation of its performance in running the hydrological model in Arctic conditions.
Characteristic | NCEP-R1 | NCEP-R2 | CFSR | 20CR | MERRA-2 | ERA5 | JRA-25 |
---|---|---|---|---|---|---|---|
Sponsoring agencies | NCEP–NCAR | NCEP–DOE | NCEP | NOAA–ESRL–CIRES | NASA–GMAO | ECMWF | JMA |
Temporal coverage | 1948–present | 1979–present | 1979–2017 | 1871–2012 | 1980–2017 | 1950–2019 | 1979–2004 |
Temporal resolution | Sub-daily, daily, monthly | Sub-daily, daily, monthly | Sub-daily, monthly | Sub-daily, daily, monthly | Sub-daily, daily, monthly | Sub-daily, daily, monthly | Sub-daily, monthly |
Spatial coverage | Global grid | Global grid | Global grid | Global grid | Global grid | Global grid | Global grid |
Spatial resolution | 210 km | 210 km | 38 km | 210 km | 50 km | 31 km | 120 km |
References | Kalnay et al. (1996); Kistler et al. (2001) | Kanamitsu et al. (2002) | Saha et al. (2010) | Whitaker et al. (2004); Compo et al. (2006, 2011) | Gelaro et al. (2017); Tao et al. (2019) | Hersbach et al. (2020) | Onogi et al. (2007) |
The CFSR is a third-generation reanalysis product. The dataset is the result of cooperation between the National Center for Atmospheric Research (NCAR 2017) and the NCEP (NCEP 2010). Its key feature is the coupling of atmosphere–ocean–land surface–sea ice systems in order to offer the best estimate of the weather pattern in those coupled domains. The CFSR data have been verified as weather input for hydrological models in numerous studies under different climate conditions around the world (e.g., temperate, tropical, subtropical, Asian monsoon, and semi-arid) and provided reliable results. First, in the temperate climate zone, CFSR performed better than ground-based data for the simulation of daily streamflow variation in four watersheds in the USA, and CFSR could meet the challenge of hydrological simulation in ungauged watersheds (Fuka et al. 2014). In another study in the snow-dominated East River basin, Colorado, USA, CFSR was used as forcing data for the prediction of volumetric streamflow and returned good results (Najafi et al. 2012). Additionally, in a study of surface and atmospheric water budgets in the Upper Colorado River basin, CFSR showed a high capacity to capture the seasonal cycle of each water budget component (Smith & Kummerow 2013). CFSR was also used as weather input to detect the influences of atmospheric rivers on winter floods in nine river basins along the western coast of Great Britain and showed results consistent with other reanalysis products: ERA-Interim, 20CR, MERRA, and NCEP–NCAR (Lavers et al. 2012). Second, in a tropical climate zone in Ethiopia, CFSR performed better than ground-based data for the prediction of daily streamflow in the Gumera watershed (Fuka et al. 2014) and for the prediction of monthly streamflow in the Awash watershed (Tolera et al. 2018). Tolera et al. (2018) concluded that CFSR could perform better in large-scale basins.
CFSR also demonstrated a high capacity for predicting potential evapotranspiration in the data-scarce Upper Mara Catchment in Kenya and Tanzania (Alemayehu et al. 2016). Third, in a study conducted over South America, with climate characteristics varying from tropical to subtropical zones, CFSR provided the smallest bias in results, compared with other reanalysis products (e.g., MERRA and NCEP-R2), for simulation of the hydrological cycle (Quadro et al. 2013). Another study in the semi-arid climate of the Jaguaribe basin, Northeast Brazil, which used CFSR as weather input for studying monthly streamflow variation, stated that CFSR's results were good to very good and were the best among the weather input datasets compared (Bressiani et al. 2015). Lastly, in the region dominated by the Asian monsoon climate, CFSR performed well in simulating monthly streamflow variation in the largest river in China, the Yangtze River, and was considered an alternative input for large-scale basins (Lu et al. 2019). However, in some case studies, CFSR data performed worse than ground-based data and were not recommended (specifically for those study areas) as an alternative input to replace high-quality ground-based data (Dile & Srinivasan 2014; Roth & Lemann 2016). Although the CFSR dataset has demonstrated its performance in hydro-meteorological simulations around the world, this has yet to be verified well in the data-sparse Arctic region. Therefore, to fill this knowledge gap, this paper aims to:
1. Investigate the performance of the CFSR in running hydrological models in Arctic conditions, and
2. Examine whether CFSR data could be an alternative for weather input and could replace the limited ground-based data for hydrological models in the data-sparse Arctic region.
STUDY AREA
An Arctic watershed, Målselv, located in northern Norway, was chosen as the study area to investigate the performance of CFSR (Figure 1). The watershed lies at high latitudes, from 68°21′N to 69°17′N, approximately 200 km north of the Arctic Circle (66°33′N) as measured from the southernmost point of the watershed. It covers an area of approximately 5,913 km2. The ground surface elevation ranges from 0 to 1,718 m. According to long-term data from the Norwegian Water Resources and Energy Directorate (NVE), the average annual precipitation in the study area varies from approximately 500 to 1,500 mm. The average annual air temperature ranges from −5 to 6 °C. The whole watershed has approximately 11 categories of land use, with wooded tundra, mixed tundra, and deciduous broadleaf forest accounting for the highest percentages of the total land-use area: 32.38, 23.93, and 22.12%, respectively (Supplementary Material, Table S1). Sandy loam dominates the soil texture of the watershed (Supplementary Material, Table S1).
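As a rough check of the stated distance to the Arctic Circle, the latitude difference can be converted to kilometers. The factor of about 111 km per degree of latitude is a standard approximation and is an assumption here, not a value taken from this study:

```latex
\[
68^\circ 21' - 66^\circ 33' = 1^\circ 48' = 1.8^\circ, \qquad
1.8^\circ \times 111\ \text{km per degree} \approx 200\ \text{km}.
\]
```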
MATERIALS AND METHODS
SWAT model
The physically based (or process-based) (Neitsch et al. 2009), semi-distributed model SWAT was applied to test the quality of CFSR data. SWAT was developed to simulate anthropogenic impacts (Gassman et al. 2007) and climate change impacts (Dile et al. 2013) on water resources and environmental matters. The model is able to simulate large-scale catchments with complex conditions over long periods. In particular, SWAT fulfills a requirement of the current modeling philosophy, namely transparency of the model (Abbaspour et al. 2015): calibration, validation, sensitivity, and uncertainty analyses are all performed with the model.
Specifically, surface runoff in SWAT is calculated separately for each hydrologic response unit (HRU), using the Soil Conservation Service's curve number (CN) method, and then aggregated for each sub-basin (Reddy et al. 2018). The water balance is mainly controlled by climate factors, such as precipitation, maximum/minimum air temperature, solar radiation, wind speed, and relative humidity (Arnold et al. 2012). SWAT also accounts for snow, which is calculated whenever the air temperature falls below the freezing point. Additionally, soil temperature is calculated, since it influences water movement in the soil (Arnold et al. 2012).
The loadings of water and other components, such as sediment, nutrients, and pesticides from the land phase, are transferred to the mainstream, where the second phase (the routing phase) occurs (Arnold et al. 2012). In the routing phase, the loadings are routed through the mainstream and reservoirs within the catchment. In particular, the routing phase describes several processes taking place in the mainstream, including the movement of water, mass flow, chemical processes, flood routing, sediment routing, nutrient routing, and pesticide routing. Streamflow in the mainstream consists of the contributions of water yield (YIELD) from the sub-basins. The YIELD is calculated by summing surface runoff, lateral flow, and groundwater flow and subtracting the transmission loss (Tolera et al. 2018). In this study, streamflow and water balance components are the main outputs simulated by SWAT, and these results are used to compare the performance of the two weather input datasets.
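To make the curve number step concrete, the following is a minimal sketch of the standard SCS-CN runoff equation in metric units. It is not SWAT's own code: SWAT additionally adjusts the CN for soil moisture and slope, which is omitted here, and the function name and the default initial abstraction ratio of 0.2 are assumptions of this illustration.

```python
def scs_cn_runoff(precip_mm: float, cn: float, ia_ratio: float = 0.2) -> float:
    """Daily surface runoff depth (mm) from the standard SCS curve number equation.

    precip_mm : daily rainfall depth (mm)
    cn        : curve number, 0 < cn <= 100
    ia_ratio  : initial abstraction as a fraction of S (0.2 is the usual default;
                treated here as an assumption)
    """
    if not 0 < cn <= 100:
        raise ValueError("curve number must be in (0, 100]")
    retention = 25.4 * (1000.0 / cn - 10.0)       # retention parameter S (mm)
    initial_abstraction = ia_ratio * retention    # Ia (mm)
    if precip_mm <= initial_abstraction:
        return 0.0                                # all rainfall is abstracted
    excess = precip_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Hypothetical example: 50 mm of rain on an HRU with CN = 75
runoff = scs_cn_runoff(50.0, 75.0)
```

A lower CN gives a larger retention parameter and therefore less runoff for the same rainfall.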
QSWAT interface
The SWAT model runs on a GIS (Geographical Information System) platform, where GIS functions are used to collect, manipulate, visualize, and analyze the inputs and outputs of the model (Srinivasan & Arnold 1994). Several GIS platforms, e.g., GRASS-GIS (https://grass.osgeo.org/), ArcGIS (http://www.esri.com/software/arcgis), MapWindow GIS (https://www.mapwindow.org/), and the Quantum Geographical Information System QGIS (https://qgis.org/en/site/), have been coupled with the SWAT model. Of the resulting interfaces, GRASS-SWAT was the first and a major interface, while ArcSWAT, MWSWAT, and QSWAT were developed later (Dile et al. 2016). ArcSWAT is the most popular interface; however, it requires a licensed ArcGIS platform (Winchell et al. 2013), which is very costly (Dile et al. 2016). Additionally, the present version of ArcSWAT does not have integrated functionality for the visualization of model outputs (Dile et al. 2016). MWSWAT has the advantage of being open source, but it shows limitations with large watersheds and large input datasets (Chen et al. 2010). Among open-source GIS software packages, QGIS has been evaluated as outperforming the others (Chen et al. 2010). For example, QGIS can satisfy the functionalities desired for water resource management, and it offers most of the functions of a commercial GIS package. Because of these benefits, the SWAT user community strongly desired a coupling of QGIS with the SWAT model (Dile et al. 2016). QSWAT was developed in response and is currently considered an emerging SWAT interface. QSWAT was first tested in a study in the Gumera watershed, Ethiopia, and performed successfully (Dile et al. 2016). Building on that success, the present study applies the new interface QSWAT in order to verify its performance in Arctic conditions.
Data acquisition
To run the SWAT model, several inputs are required: (1) spatial data, including a digital elevation model (DEM), soil, and land use; and (2) time-series data, including climate data and river discharge (Table 2).
Data type | Resolution | Source of data |
---|---|---|
DEM | 10 × 10 m | Geonorge (2013) |
Land use | approximately 600 m | Waterbase (2007a) |
Soil | approximately 5,000 m | Waterbase (2007b) |
Climate | Ground-based data: four stations | ECAD (2002) |
Climate | CFSR data: 21 grid points, approximately 38 km grid | TAMU (2012) |
River discharge | Five stations | Sildre (2020) |
A high-resolution DEM (10 × 10 m) is collected from the Norwegian Mapping Authority. The DEM is used to define the catchment topography and generate the catchment boundary, sub-basins, and stream networks. Additionally, other important parameters of the sub-basins, e.g., terrain slope length, slope gradient, slope classes, and channel length, are generated from the DEM. The soil data (scale of 1:5,000,000) and land use (600 m resolution) are collected from the Waterbase organization. The soil and land use are reclassified to represent the specific land use (Supplementary Material, Table S1) and soil types (Supplementary Material, Table S2) of the catchment based on the SWAT database.
The climate inputs used in this study come from two data sources, whose performances are compared: (1) the CFSR weather data (Figure 2(a)) and (2) the ground-based data (Figure 2(b)). The CFSR global weather data cover a 36-year period from 1 January 1979 to 31 July 2014 (TAMU 2012). In total, 21 weather grid points located inside and near the catchment are selected by the SWAT model using nearest-neighbor search (NNS). The CFSR time-series data are almost continuous. In contrast, only four ground-based weather stations located within and near the study area have continuous time-series data covering the same time window as the CFSR data (with regard to the investigation period of this study). Most of the ground-based weather stations are located in the downstream part of the watershed. Two of them are inside the watershed, while the other two are outside but close to the watershed's boundary. The ground-based data are collected from the European Climate Assessment & Dataset project (ECAD). It is obvious from Figure 2 that the network of available ground-based weather stations (Figure 2(b)) in the Målselv watershed is highly scattered, while the CFSR weather grid points (Figure 2(a)) are denser. Detailed descriptions of the CFSR weather grid points and the ground-based weather stations, together with their rainfall data, are summarized in Table 3.
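The nearest-neighbor selection can be illustrated with a short sketch that assigns the closest weather point to a sub-basin centroid. The use of great-circle (haversine) distance, the function names, and the centroid coordinates are assumptions of this illustration; the distance measure that SWAT applies internally is not stated in the text. The three grid points are taken from Table 3.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) between two points given in decimal degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_station(point, stations):
    """Return the name of the station closest to point = (lat, lon)."""
    lat, lon = point
    return min(stations, key=lambda name: haversine_km(lat, lon, *stations[name]))


# Three CFSR grid points from Table 3 (latitude, longitude)
cfsr_points = {
    "CFSR_692184": (69.2, 18.4),
    "CFSR_688191": (68.8, 19.1),
    "CFSR_685200": (68.5, 20.0),
}

# Hypothetical sub-basin centroid, for illustration only
centroid = (68.9, 19.0)
assigned = nearest_station(centroid, cfsr_points)
```

With a denser grid, more sub-basins receive a weather point that lies close to them, which is the practical advantage of the CFSR network described above.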
Weather data | Station | Latitude (°N) | Longitude (°E) | Elevation (m) | Average annual rainfall (mm) |
---|---|---|---|---|---|
Ground-based | ECAD_1057 | 69.1 | 18.5 | 76 | 711 |
Ground-based | ECAD_2749 | 69.2 | 19.2 | 27 | 852 |
Ground-based | ECAD_2748 | 68.9 | 18.3 | 114 | 903 |
Ground-based | ECAD_2744 | 68.6 | 18.2 | 230 | 967 |
CFSR | CFSR_692184 | 69.2 | 18.4 | 156 | 1,413 |
CFSR | CFSR_692188 | 69.2 | 18.8 | 516 | 1,382 |
CFSR | CFSR_692191 | 69.2 | 19.1 | 194 | 1,350 |
CFSR | CFSR_692194 | 69.2 | 19.4 | 954 | 1,329 |
CFSR | CFSR_692200 | 69.2 | 20 | 440 | 1,303 |
CFSR | CFSR_692197 | 69.2 | 19.7 | 970 | 1,314 |
CFSR | CFSR_688197 | 68.8 | 19.7 | 587 | 1,059 |
CFSR | CFSR_688200 | 68.8 | 20 | 1,140 | 1,005 |
CFSR | CFSR_688188 | 68.8 | 18.8 | 1,000 | 1,267 |
CFSR | CFSR_688194 | 68.8 | 19.4 | 1,040 | 1,119 |
CFSR | CFSR_688191 | 68.8 | 19.1 | 800 | 1,192 |
CFSR | CFSR_688184 | 68.8 | 18.4 | 267 | 1,320 |
CFSR | CFSR_688181 | 68.8 | 18.1 | 175 | 1,366 |
CFSR | CFSR_688203 | 68.8 | 20.3 | 760 | 957 |
CFSR | CFSR_685203 | 68.5 | 20.3 | 686 | 750 |
CFSR | CFSR_685200 | 68.5 | 20 | 708 | 826 |
CFSR | CFSR_685188 | 68.5 | 18.8 | 837 | 1,345 |
CFSR | CFSR_685194 | 68.5 | 19.4 | 1,290 | 1,039 |
CFSR | CFSR_685184 | 68.5 | 18.4 | 1,041 | 1,437 |
CFSR | CFSR_685197 | 68.5 | 19.7 | 668 | 923 |
CFSR | CFSR_685191 | 68.5 | 19.1 | 880 | 1,192 |
River discharges, which are used for model calibration and validation, are collected from the Norwegian Water Resources and Energy Directorate. Five datasets from five hydro-gauging stations are gathered, with measurement intervals varying from 30 min to 1 h. The raw dataset is then averaged to a monthly interval dataset, in order to be compatible with the time step format of monthly simulation in the SWAT model. However, there are still some small gaps in the time-series data of river discharges due to technical errors or other reasons.
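The aggregation from sub-hourly records to monthly means can be sketched as follows. This is a simplified illustration under stated assumptions: records are given as (timestamp, discharge) pairs, gaps are marked with `None` and skipped, and a plain arithmetic mean is used. The function name and the sample values are hypothetical, and the authors' actual processing may differ, for example in how mixed 30-min and 1-h intervals are weighted.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def monthly_mean_discharge(records):
    """Average discharge per (year, month).

    records: iterable of (datetime, discharge) pairs; a discharge of None marks a gap.
    Months without any valid value are absent from the result.
    """
    buckets = defaultdict(list)
    for timestamp, discharge in records:
        if discharge is None:        # skip gaps in the time series
            continue
        buckets[(timestamp.year, timestamp.month)].append(discharge)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical sample values, for illustration only
sample = [
    (datetime(2000, 1, 1, 0, 0), 10.0),
    (datetime(2000, 1, 1, 0, 30), 20.0),
    (datetime(2000, 1, 1, 1, 0), None),   # gap
    (datetime(2000, 2, 1, 0, 0), 30.0),
]
monthly = monthly_mean_discharge(sample)
```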
Model setup
QSWAT version 1.9, a coupling of the hydrological model SWAT version 2012 and the open-source QGIS version 2.6.1, is applied in this study for the evaluation of the CFSR data. Before running the model, two necessary steps, watershed delineation and HRU creation, are performed (Dile et al. 2016). The watershed delineation step is carried out using the DEM as input. In this step, the sub-basins and their parameters are generated based on the stream networks and the locations of sub-basin outlets and watershed outlets. The second step, HRU creation, divides each sub-basin into smaller units with specific soil types, land uses, and terrain slope distributions. The HRUs are generated from the land-use map, the soil map, and the slope classification. In this study, five slope classes are defined: 0–5, 5–10, 10–25, 25–30, and >30% (Supplementary Material, Table S3). In total, 459 sub-basins, comprising 5,601 HRUs, are generated. The sizes of the sub-basins vary from 205 to 7,075 hectares (ha). The QSWAT is run with monthly time steps from 1995 to 2012, including a 3-year warm-up period to let the model stabilize from the estimated initial conditions (Arnold et al. 2012; Kim et al. 2018). A 10-year period, 1998–2007, is used for model calibration, and the remaining 5 years, from 2008 to 2012, are used for model validation. Figure 3 illustrates the overview of the methodologies used in this study.
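The split of the simulation period can be written down explicitly, which is convenient when slicing observed and simulated series for the later analyses. This is a small sketch that uses the year boundaries stated above; the function name is hypothetical.

```python
def simulation_phase(year: int) -> str:
    """Map a simulation year to its role in this study's model run."""
    if 1995 <= year <= 1997:
        return "warm-up"
    if 1998 <= year <= 2007:
        return "calibration"
    if 2008 <= year <= 2012:
        return "validation"
    raise ValueError(f"{year} is outside the 1995-2012 simulation period")


# Count the number of years in each phase
phase_lengths = {}
for y in range(1995, 2013):
    phase = simulation_phase(y)
    phase_lengths[phase] = phase_lengths.get(phase, 0) + 1
```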
Model calibration, validation, and uncertainty analyses
Model calibration, validation, and uncertainty analyses are performed with the Sequential Uncertainty Fitting Version 2 (SUFI-2) algorithm (Figure 4) in the SWAT Calibration Uncertainties Program (SWAT-CUP) (Abbaspour et al. 2007). Outputs from the SWAT model are imported into SWAT-CUP for analysis. For each weather input, five iterations of 500 simulations each, 2,500 simulations in total, were performed in order to find the best fit between observed and simulated data. In each iteration, the SUFI-2 algorithm expresses the possible simulation outputs as a distribution or range, which is called the 95% prediction uncertainty (95PPU) range (Abbaspour et al. 2015). The 95PPU is bounded by the 2.5% and 97.5% levels of the cumulative distribution of the simulated values, which are obtained by Latin hypercube (LH) sampling, a statistical method used to reduce the number of samples needed from multidimensional distributions (McKay et al. 1979; Özdemir 2016). The aim is for the 95PPU range to bracket as many of the observed values as possible.
Furthermore, the SUFI-2 algorithm uses two main indicators, the P-factor and the R-factor, to measure the goodness of fit between measured and simulated data (Abbaspour et al. 2004, 2015). The first indicator, the P-factor, is the fraction of observed data bracketed by the 95PPU band. Its values range from 0 to 1, where a value of 1 means that 100% of the observed data are bracketed by the 95PPU band and thus indicates highly accurate simulation results. For river discharge, a P-factor higher than 0.7 or 0.75 is recommended, depending on the project scale and the quality of the input data used to run the model, as well as of the data used for calibration. The second indicator, the R-factor, represents the thickness of the 95PPU band and is calculated as the ratio between the average width of the 95PPU band and the standard deviation of the observed variable. Ideally, the R-factor should be close to zero. For river discharge, an R-factor smaller than 1.5 is recommended to indicate a highly accurate simulation result. This threshold also depends on the study conditions and the quality of the input data. Once acceptable values of the P-factor and R-factor are achieved in the last iteration, statistical parameters are calculated for the calibrated variables. The parameter ranges obtained in the last iteration are taken as the calibrated parameter ranges for the model. Table 4 lists the 18 model parameters, including their ranges for calibration and the best-fitted values after calibration. These model parameters are recommended as the sensitive ones for river discharge calibration (Abbaspour et al. 2007, 2015).
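The two building blocks described above, Latin hypercube sampling of parameter sets and the 95PPU band taken at the 2.5% and 97.5% levels, can be sketched as follows. This is an illustrative simplification and not the SWAT-CUP implementation: the function names, the linear-interpolation percentile, and the use of Python's `random` module are assumptions made here.

```python
import random


def latin_hypercube(ranges, n_samples, rng):
    """Draw n_samples parameter sets; each parameter range is split into
    n_samples equal strata and every stratum is sampled exactly once.

    ranges: dict mapping parameter name -> (minimum, maximum)
    rng:    a random.Random instance
    """
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n_samples
        values = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(values)          # decouple the strata of different parameters
        columns[name] = values
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


def percentile(sorted_values, q):
    """Percentile q (0-100) of an ascending list, with linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def ppu95(simulations):
    """Lower and upper 95PPU series from a list of simulated series
    (one series per parameter set, all of equal length)."""
    lower_band, upper_band = [], []
    for values_at_step in zip(*simulations):
        ordered = sorted(values_at_step)
        lower_band.append(percentile(ordered, 2.5))
        upper_band.append(percentile(ordered, 97.5))
    return lower_band, upper_band


# Example: 5 parameter sets within two of the ranges listed in Table 4
sets = latin_hypercube({"SFTMP": (-2.72, 0.99), "TIMP": (0.145, 0.309)}, 5, random.Random(0))
```

In an actual calibration, each sampled parameter set would drive one model run, and the resulting simulated series would be passed to `ppu95`.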
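The definitions of the two indicators translate directly into code. The sketch below assumes that the observed series and the 95PPU bounds are aligned lists without gaps, and it uses the sample standard deviation from Python's `statistics` module; whether SWAT-CUP uses the sample or the population form is not stated here, so that choice is an assumption. The numerical values are hypothetical.

```python
from statistics import stdev


def p_factor(observed, lower_band, upper_band):
    """Fraction of observations that fall inside the 95PPU band (0 to 1)."""
    inside = sum(1 for obs, lo, hi in zip(observed, lower_band, upper_band) if lo <= obs <= hi)
    return inside / len(observed)


def r_factor(observed, lower_band, upper_band):
    """Average width of the 95PPU band divided by the standard deviation
    of the observations (sample standard deviation assumed)."""
    mean_width = sum(hi - lo for lo, hi in zip(lower_band, upper_band)) / len(observed)
    return mean_width / stdev(observed)


# Hypothetical values, for illustration only
obs = [1.0, 2.0, 3.0, 4.0]
low = [0.0, 1.5, 3.5, 3.0]
high = [2.0, 2.5, 4.5, 5.0]
p = p_factor(obs, low, high)   # 3 of 4 observations are bracketed
r = r_factor(obs, low, high)
```

The two indicators pull in opposite directions: widening the band raises the P-factor but also raises the R-factor, so both thresholds have to be met together.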
Evaluation of model performance
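Table 5 provides the threshold values of the statistical coefficients used to rate model performance: the coefficient of determination (R2), the Nash–Sutcliffe efficiency (NSE), and the ratio of the root mean square error to the standard deviation of the observations (RSR) (Santhi et al. 2001; Van Liew et al. 2003; Moriasi et al. 2007; Premanand et al. 2018).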
Parameter | Description (unit) | Range: minimum | Range: maximum | Fitted value |
---|---|---|---|---|
r_CN2.mgt | Runoff CN (–) | −0.225 | 0.051 | −0.14 |
v_ESCO.hru | Soil evaporation compensation factor (–) | 0.067 | 0.202 | 0.14 |
r_SOL_AWC.sol | Available water capacity of the soil layer (mmH2O/mm soil) | −1 | −0.581 | −0.72 |
v_ALPHA_BF.gw | Baseflow alpha factor (days) | 0 | 0.12 | 0.09 |
v_GW_DELAY.gw | Groundwater delay (days) | 260.05 | 321.81 | 313.35 |
v_GW_REVAP.gw | Groundwater ‘revap’ coefficient (–) | 0.117 | 0.19 | 0.19 |
v_GWQMN.gw | Threshold depth of water in the shallow aquifer required for return flow to occur (mm) | 2,215 | 3,318 | 3,285 |
v_REVAPMN.gw | Threshold depth of water in the shallow aquifer for ‘revap’ to occur (mm) | 252.18 | 382.86 | 353.45 |
v_SFTMP.bsn | Snowfall temperature (°C) | −2.72 | 0.99 | −1.48 |
v_SMFMN.bsn | Minimum melt rate for snow during the year (occurs on winter solstice) (mmH2O °C−1 d−1) | 1.767 | 6.47 | 5.33 |
v_SMFMX.bsn | Maximum melt rate for snow during year (occurs on summer solstice) (mmH2O °C−1 d−1) | 1.914 | 5.744 | 2.91 |
v_SMTMP.bsn | Snow melt base temperature (°C) | −3.189 | 2.557 | −0.56 |
v_TIMP.bsn | Snowpack temperature lag factor (–) | 0.145 | 0.309 | 0.15 |
a_CH_N2.rte | Manning's ‘n’ value for the main channel (–) | 0.145 | 0.227 | 0.21 |
a_CH_K2.rte | Effective hydraulic conductivity in main channel alluvium (mm/h) | −0.01 | 70.781 | 22.01 |
r_SOL_K.sol | Saturated hydraulic conductivity (mm h−1) | 4.482 | 7.977 | 7.30 |
r_SOL_BD.sol | Moist bulk density (g cm−3) | 0.403 | 0.635 | 0.53 |
a_CANMX.hru | Maximum canopy storage (mmH2O) | 4.016 | 12.056 | 4.49 |
Note:
• The term ‘a_’ explains that a given value is added to the existing parameter value.
• The term ‘r_’ explains that an existing parameter value is multiplied by (1+ a given value).
• The term ‘v_’ explains that the existing parameter value is replaced by a given value.
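The three prefixes can be illustrated with a minimal Python sketch. The function name `apply_change` and the "existing" parameter values below are hypothetical and chosen only for illustration; the given values (−0.14, 0.14, and 4.49) are the fitted values listed in the table above.

```python
def apply_change(prefixed_name: str, existing: float, given: float) -> float:
    """Return the updated parameter value for an a_/r_/v_ prefixed name."""
    kind = prefixed_name.split("_", 1)[0]
    if kind == "a":  # absolute change: add the given value
        return existing + given
    if kind == "r":  # relative change: multiply by (1 + given value)
        return existing * (1.0 + given)
    if kind == "v":  # replacement: the given value becomes the new value
        return given
    raise ValueError(f"unknown prefix in {prefixed_name!r}")


# The existing values (70.0, 0.95, 0.0) are hypothetical placeholders.
print(apply_change("r_CN2.mgt", 70.0, -0.14))   # 70.0 scaled by (1 - 0.14)
print(apply_change("v_ESCO.hru", 0.95, 0.14))   # replaced by 0.14
print(apply_change("a_CANMX.hru", 0.0, 4.49))   # 0.0 increased by 4.49
```

Because the relative change scales the existing value, the same fitted value produces a different absolute change in each spatial unit, whereas a replacement assigns the same value everywhere.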
Model performance | R2 | NSE | RSR |
---|---|---|---|
Very good | 0.70 ≤ R2 ≤ 1.00 | 0.75 < NSE ≤ 1.00 | 0.00 ≤ RSR ≤ 0.50 |
Good | 0.60 ≤ R2 < 0.70 | 0.65 < NSE ≤ 0.75 | 0.50 < RSR ≤ 0.60 |
Satisfactory | 0.50 ≤ R2 < 0.60 | 0.50 < NSE ≤ 0.65 | 0.60 < RSR ≤ 0.70 |
Unsatisfactory | R2 < 0.50 | NSE ≤ 0.50 | RSR > 0.70 |
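As an illustration of how the rating thresholds above can be applied, the following Python sketch computes NSE and RSR from paired observed and simulated series and maps each value to a rating class. It assumes the commonly used definitions (NSE as one minus the ratio of the sum of squared errors to the variance sum of the observations, and RSR as the RMSE divided by the standard deviation of the observations); the function names and the sample series are hypothetical. Under these definitions RSR equals the square root of (1 − NSE), which is approximately the relationship seen between the NSE and RSR values reported in Tables 6 and 7.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency (assumed standard definition)."""
    mean_obs = sum(obs) / len(obs)
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sq_dev = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sq_err / sq_dev


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sq_err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sq_dev = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(sq_err) / math.sqrt(sq_dev)


def rate_nse(value):
    """Rating class for an NSE value, using the thresholds in the table."""
    if value > 0.75:
        return "Very good"
    if value > 0.65:
        return "Good"
    if value > 0.50:
        return "Satisfactory"
    return "Unsatisfactory"


def rate_rsr(value):
    """Rating class for an RSR value, using the thresholds in the table."""
    if value <= 0.50:
        return "Very good"
    if value <= 0.60:
        return "Good"
    if value <= 0.70:
        return "Satisfactory"
    return "Unsatisfactory"


# Hypothetical monthly discharge values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.1, 1.9, 3.2, 3.8, 5.1]
print(rate_nse(nse(observed, simulated)), rate_rsr(rsr(observed, simulated)))
```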
Moreover, to further evaluate the performance of the CFSR weather data and the ground-based weather data, the simulation results of two major hydrological outputs are considered in this study: (1) the annual average water balance components, namely the total areal rainfall (PCP), actual evapotranspiration (ET), surface runoff (SUR_Q), lateral flow (LAT_Q), groundwater recharge (PERC), groundwater contribution to streamflow (GW_Q), and water yield contributing to streamflow (WYLD = SUR_Q + LAT_Q + GW_Q − transmission losses); and (2) the long-term average monthly streamflow. The results are discussed in the following section.
RESULTS AND DISCUSSION
Comparison of precipitation input between ground-based weather data and CFSR weather data
Monthly precipitation for the period 1995–2012 from the ground-based dataset and the CFSR dataset is averaged over all stations across the whole watershed and plotted as boxplots, which display the long-term seasonal pattern of precipitation as well as the variation of precipitation within each month for both weather datasets (Figure 5).
Generally, precipitation from the ground-based dataset and the CFSR dataset follows similar seasonal trends. March, July, September, October, and November show higher variation in precipitation than the remaining months. Monthly precipitation from the CFSR dataset is approximately 11–46% higher than that from the ground-based dataset, except in September. The largest differences are observed in April–June, when precipitation from the CFSR data is approximately 45–46% higher than that from the ground-based data. Previous studies in the upper Awash catchment, Ethiopia (Tolera et al. 2018), and in a mountainous Black Sea catchment (Cuceloglu & Ozturk 2019) also demonstrated that CFSR data were able to capture the seasonal trend of precipitation in ground-based data. As in our study, those studies found higher monthly precipitation in the CFSR dataset than in the ground-based dataset. However, in the tropical region (the upper Awash catchment, Ethiopia), the significant differences in monthly precipitation between the CFSR data and the ground-based data were mostly observed in summer (July–August), whereas in the temperate climate zone of the Black Sea catchment they occurred in the wet season (December to April). In contrast, our study found the largest differences in monthly precipitation between the two weather data sources from mid-spring to early summer (April to June).
The seasonal variation of precipitation (1995–2012) is investigated locally at four co-located points, i.e., the pairs of ground-based weather stations and CFSR weather grid points that lie closest together (Figure 6). Two of these co-located points are inside the watershed and the other two are outside. As shown in Figure 6, the seasonal trends of precipitation in the CFSR data and the ground-based data are similar at all the co-located points. However, the magnitude of precipitation from the CFSR data is higher than that from the ground-based data. In particular, at one co-located point inside the watershed (Figure 6(a)), precipitation from the CFSR data exceeds precipitation from the ground-based data in 8 months of the year: January, February, April–June, September, October, and December. At the other co-located points, significant differences in precipitation between the CFSR data and the ground-based data are observed in January, April–June, and December at co-located point 2 (Figure 6(b)); in February, April, June, and December at co-located point 3 (Figure 6(c)); and in February, April–June, September, and December at co-located point 4 (Figure 6(d)). In brief, the significant differences in monthly precipitation between the CFSR data and the ground-based data at the co-located points mostly occur in winter, from mid-spring to early summer, and from early to mid-autumn.
Figure 7 shows boxplots of the variation of total annual precipitation at the four pairs of co-located points between ground-based weather stations and CFSR weather grid points. In general, at each pair of co-located points, annual rainfall at the CFSR weather grid point is higher than that at the ground-based weather station. The average annual rainfall from the CFSR data is approximately 49.70% (Figure 7(a)), 32.70% (Figure 7(b)), 31.60% (Figure 7(c)), and 36.90% (Figure 7(d)) higher than that from the ground-based data.
Precipitation from the high-resolution CFSR data is thus consistently higher than that from the scattered ground-based data. Therefore, simulation results such as streamflow or water balance components are expected to be higher with the CFSR weather input than with the ground-based weather input.
Comparison of model performance based on the statistical coefficients R2, NSE, and RSR
The model performances for the calibration period are shown in Table 6. Generally, the high-resolution CFSR dataset demonstrated higher performance than the existing limited ground-based dataset after calibration. However, model performance is heterogeneous among the five hydro-gauging stations within the watershed (Table 6). According to the performance rating from the three statistical coefficients, R2, NSE, and RSR, the ground-based weather data performed well at the Høgskarhus and Målselvfossen stations and satisfactorily at the three remaining stations: Lundberg, Lille Rostavatn, and Skogly. In contrast, the CFSR weather data performed very well at two stations, Skogly and Målselvfossen, and well at Lundberg, Lille Rostavatn, and Høgskarhus. Høgskarhus and Målselvfossen are the two stations where performance does not differ greatly between the ground-based data and the CFSR data; both datasets are rated good or better there. The very good values of R2 achieved at all five hydro-gauging stations with both weather datasets demonstrate a high correlation between observation and simulation (Table 6 and Supplementary Material, Figure S1). In addition, the R2 values indicate good agreement between measured data and estimated results in terms of the timing of the runoff process in the sub-basins, as well as the streamflow hydrograph (Malago et al. 2015).
Station | Sub-basin | Weather input | R2 | NSE | RSR | Performance rating |
---|---|---|---|---|---|---|
Lundberg | 381 | Ground-based | 0.71 | 0.55 | 0.67 | Satisfactory |
Lundberg | 381 | CFSR | 0.73 | 0.69 | 0.56 | Good |
Lille Rostavatn | 402 | Ground-based | 0.72 | 0.55 | 0.67 | Satisfactory |
Lille Rostavatn | 402 | CFSR | 0.79 | 0.67 | 0.58 | Good |
Høgskarhus | 408 | Ground-based | 0.73 | 0.71 | 0.54 | Good |
Høgskarhus | 408 | CFSR | 0.74 | 0.65 | 0.59 | Good |
Skogly | 412 | Ground-based | 0.77 | 0.60 | 0.63 | Satisfactory |
Skogly | 412 | CFSR | 0.77 | 0.77 | 0.48 | Very good |
Målselvfossen | 444 | Ground-based | 0.82 | 0.74 | 0.51 | Good |
Målselvfossen | 444 | CFSR | 0.85 | 0.82 | 0.42 | Very good |
According to the model validation results, the high-resolution CFSR data (Table 7 and Supplementary Material, Figure S6) also perform better than the scattered ground-based data (Table 7 and Supplementary Material, Figure S5). For example, the CFSR data performed very well at Lundberg and Skogly, and well at Lille Rostavatn, where the performance of the ground-based data is only satisfactory. Additionally, model performance at Målselvfossen is good with the ground-based data, whereas it is very good with the CFSR data. Notably, simulation results at the Høgskarhus station in the validation period are worse than those in the calibration period for both weather datasets. This could be partly because of gaps in the time series of river discharge used for validation (Supplementary Material, Figure S6c). However, the relatively good values of R2 (Table 7 and Supplementary Material, Figure S2) achieved in the validation period indicate that the simulated results are highly correlated with the observed data.
Station | Sub-basin | Weather input | R2 | NSE | RSR | Performance rating |
---|---|---|---|---|---|---|
Lundberg | 381 | Ground-based | 0.82 | 0.64 | 0.60 | Satisfactory |
Lundberg | 381 | CFSR | 0.81 | 0.77 | 0.48 | Very good |
Lille Rostavatn | 402 | Ground-based | 0.87 | 0.52 | 0.69 | Satisfactory |
Lille Rostavatn | 402 | CFSR | 0.91 | 0.66 | 0.58 | Good |
Høgskarhus | 408 | Ground-based | 0.66 | 0.46 | 0.73 | Unsatisfactory |
Høgskarhus | 408 | CFSR | 0.73 | 0.59 | 0.64 | Satisfactory |
Skogly | 412 | Ground-based | 0.78 | 0.55 | 0.67 | Satisfactory |
Skogly | 412 | CFSR | 0.87 | 0.82 | 0.42 | Very good |
Målselvfossen | 444 | Ground-based | 0.86 | 0.72 | 0.52 | Good |
Målselvfossen | 444 | CFSR | 0.88 | 0.83 | 0.41 | Very good |
The performance of the CFSR data in the present study, evaluated through the statistical coefficients, agrees with the performance of the CFSR data in previous studies, such as those conducted in the temperate climate zone (Najafi et al. 2012; Fuka et al. 2014), the tropical climate zone (Fuka et al. 2014), the Asian monsoon climate zone (Lu et al. 2019), and the semi-arid climate zone (Bressiani et al. 2015). Those studies concluded that the CFSR data were a potential source of weather inputs for running hydrological models in ungauged and large-scale catchments. The outcomes of the present study suggest that the CFSR data perform well not only in temperate, tropical, semi-arid, and Asian monsoon climate zones, but also in Arctic conditions. Findings from the present study do, however, contradict findings from other studies (Dile & Srinivasan 2014; Roth & Lemann 2016), which stated that CFSR could not replace high-quality ground-based data. Nevertheless, in data-sparse regions like the Arctic, reanalysis data such as the CFSR could be an alternative source, since there are not enough representative meteorological stations for a large catchment and observed data often contain gaps or errors.
Comparison of the simulated streamflow hydrograph
According to the simulation results of the streamflow hydrograph, good agreement between observed data and simulated results is achieved with both the ground-based weather data (Supplementary Material, Figure S3 for calibration and Figure S5 for validation) and the CFSR data (Supplementary Material, Figure S4 for calibration and Figure S6 for validation). A relatively high level of accuracy in the timing of the streamflow hydrograph is obtained, so no lag time is detected in the simulation. This finding is similar to that of the previous study in the Upper Awash Basin, Ethiopia (Tolera et al. 2018). In the calibration period, the magnitude of peak flow is nearly captured at Skogly and Målselvfossen with both weather datasets. However, at Høgskarhus, peak flow is captured with the ground-based data but slightly underestimated with the CFSR data. This could be explained by the fact that some sub-basins upstream of Høgskarhus receive higher areal precipitation from the ground-based data than from the CFSR data. In contrast, most peak flows at the Lundberg station are captured with the CFSR data but somewhat underestimated with the ground-based data. At the Lille Rostavatn station, both weather datasets slightly underestimate the magnitude of peak flow.
In the validation period, the peak flows are nearly captured at Skogly and Målselvfossen, but they are underestimated at Lille Rostavatn, with both weather datasets. Differences in model performance between the two weather datasets are observed at Høgskarhus and Lundberg. With the ground-based dataset, the model simulates peak flow well at Høgskarhus but worse at Lundberg, whereas with the CFSR weather data the behavior at those two stations is reversed.
In terms of low-flow simulation, a relatively good fit between simulation and observation is achieved in both the calibration and validation periods with both weather datasets. This finding is somewhat better than that of the study in the Upper Awash Basin, Ethiopia (Tolera et al. 2018), which concluded that low flow was underestimated or overestimated when using the CFSR data.
Comparison of the simulated water balance components
Rainfall is one of the major inputs of the water balance. In SWAT, areal rainfall is calculated separately for every sub-basin. Specifically, each sub-basin takes its rainfall from the station (a ground-based weather station or a CFSR grid point) that is closest to the centroid of the sub-basin, using the NNS method. The spatial variation of areal rainfall calculated for every sub-basin from the ground-based weather data and the CFSR weather data is displayed in Figure 8. Generally, the total rainfall amount calculated for the whole watershed from the CFSR data is approximately 24% higher than that from the ground-based data. Approximately 88% of the watershed area has a rainfall ratio between ground-based data and CFSR data smaller than 1.0, where the rainfall ratio (Figure 8(c)) is the rainfall amount from the ground-based data (Figure 8(a)) divided by the rainfall amount from the CFSR data (Figure 8(b)). Of this area, 42% of the watershed, in the downstream sections, has a rainfall ratio varying from 0.53 to 0.75, while 45.5%, in the middle sections, has a rainfall ratio varying from 0.75 to 1.0. As an exception, approximately 12% of the watershed, in the uppermost areas, has a rainfall ratio between 1.0 and 1.32. This indicates that rainfall calculated from the CFSR dataset is lower than that from the ground-based dataset in some upstream parts of the watershed.
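The station assignment and the rainfall ratio described above can be sketched as follows. This is a minimal illustration, not the SWAT implementation: it assumes that the nearest station is found by great-circle (haversine) distance to the sub-basin centroid, and the station names and coordinates are hypothetical. The rainfall totals used in the example are the watershed-average values reported in Table 8.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(centroid, stations):
    """Name of the station closest to a sub-basin centroid (lat, lon)."""
    lat, lon = centroid
    return min(stations, key=lambda name: haversine_km(lat, lon, *stations[name]))


def rainfall_ratio(ground_mm, cfsr_mm):
    """Ground-based rainfall divided by CFSR rainfall, as in Figure 8(c)."""
    return ground_mm / cfsr_mm


# Hypothetical station coordinates (lat, lon) and sub-basin centroid.
stations = {"station_A": (69.0, 18.5), "station_B": (68.7, 19.8)}
print(nearest_station((68.8, 19.6), stations))
# Watershed-average annual rainfall from Table 8 (mm).
print(round(rainfall_ratio(915.2, 1192.0), 2))
```

A ratio below 1.0 means the CFSR rainfall exceeds the ground-based rainfall for that sub-basin, and a ratio above 1.0 means the opposite.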
The higher rainfall amount from the CFSR dataset compared with the ground-based dataset results in higher simulated values of some water balance components (Table 8). This finding agrees with findings from previous studies in the tropical climate zone (Dile & Srinivasan 2014; Tolera et al. 2018). For example, in this study, water yield (WYLD) contributing to streamflow from the CFSR data is around 11% higher than that from the ground-based data. Actual ET, lateral flow (LAT_Q), and groundwater recharge (PERC) generated from the CFSR data are also higher than those from the ground-based weather data. However, the groundwater contribution to streamflow (GW_Q) produced from the ground-based data is higher than that from the CFSR data. Notably, the surface runoff component generated from the two weather datasets is nearly identical.
Weather dataset | Unit | Rainfall | ET | SUR_Q | LAT_Q | PERC | GW_Q | WYLD |
---|---|---|---|---|---|---|---|---|
Ground-based | mm | 915.2 | 144.8 | 286.7 | 92.5 | 282.2 | 255.3 | 740.8 |
Ground-based | % | 100 | 15.8 | 31.3 | 10.1 | 30.8 | 27.9 | 80.9 |
CFSR | mm | 1192 | 170.8 | 286.5 | 391.1 | 310.5 | 127.5 | 834.9 |
CFSR | % | 100 | 14.3 | 24.0 | 32.8 | 26.0 | 10.7 | 70.0 |
Ground-based/CFSR difference | mm | −276.7 | −26.0 | 0.2 | −298.6 | −28.3 | 127.9 | −94.1 |
Ground-based/CFSR difference | % | −23.2 | −15.2 | 0.1 | −76.4 | −9.1 | 100.3 | −11.3 |
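The arithmetic behind Table 8 can be reproduced with a short Python sketch. The function names are hypothetical. The sketch assumes that each component share is expressed relative to rainfall and that each percentage difference is expressed relative to the CFSR value, which is the convention that matches the tabulated percentages; small deviations from the table (for example, in the rainfall and GW_Q differences) are consistent with rounding of the published values.

```python
# Annual average water balance components from Table 8 (mm).
ground = {"Rainfall": 915.2, "ET": 144.8, "SUR_Q": 286.7, "LAT_Q": 92.5,
          "PERC": 282.2, "GW_Q": 255.3, "WYLD": 740.8}
cfsr = {"Rainfall": 1192.0, "ET": 170.8, "SUR_Q": 286.5, "LAT_Q": 391.1,
        "PERC": 310.5, "GW_Q": 127.5, "WYLD": 834.9}


def share_of_rainfall(components):
    """Each component as a percentage of the rainfall total."""
    rain = components["Rainfall"]
    return {name: 100.0 * value / rain for name, value in components.items()}


def difference(ground_values, cfsr_values):
    """Ground-based minus CFSR, in mm and as a percentage of the CFSR value."""
    result = {}
    for name in ground_values:
        delta = ground_values[name] - cfsr_values[name]
        result[name] = (delta, 100.0 * delta / cfsr_values[name])
    return result


shares = share_of_rainfall(ground)
diffs = difference(ground, cfsr)
for name in ground:
    delta_mm, delta_pct = diffs[name]
    print(f"{name}: {shares[name]:.1f}% of rainfall, "
          f"difference {delta_mm:.1f} mm ({delta_pct:.1f}%)")
```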
Comparison of the simulation results of long-term average monthly streamflow
The simulated monthly streamflows generated from the ground-based data and the CFSR data are averaged over the 10-year period 1998–2007, and the results are compared with the averaged observed data in Figure 9. According to Figure 9, both weather datasets simulate the low values of the average monthly flow quite well, except for slight overestimations in September at Høgskarhus (Figure 9(b)) and Skogly (Figure 9(c)) with the ground-based data. However, the simulated peak of the average monthly flow differs somewhat between the two weather datasets. For example, the CFSR data replicate the peak flow at Lundberg (Figure 9(a)) and Skogly better than the ground-based data. In contrast, the ground-based data replicate the peak flow at Høgskarhus better than the CFSR data. The ground-based data generated higher peak flows at Høgskarhus and Skogly than the CFSR data. This could be due to higher areal rainfall in the upstream sub-basins from the ground-based data than from the CFSR data. Interestingly, the long-term average monthly streamflows at Lille Rostavatn (Figure 9(d)) and Målselvfossen (Figure 9(e)) generated from the two weather datasets are very similar, apart from a slightly higher peak flow at Lille Rostavatn with the CFSR data than with the ground-based data. The graphs of long-term average monthly streamflow at the downstream station, Målselvfossen, demonstrate that fairly good model performance was achieved with both weather datasets.
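The averaging step described above can be sketched in Python as follows. This is an illustrative example only: the function name and the flow values are hypothetical, and the records are assumed to be (year, month, flow) tuples of monthly streamflow.

```python
from collections import defaultdict


def long_term_monthly_mean(records):
    """Average monthly flows across years; records are (year, month, flow)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _year, month, flow in records:
        totals[month] += flow
        counts[month] += 1
    return {month: totals[month] / counts[month] for month in sorted(totals)}


# Hypothetical monthly flows (m3/s) for two years, for illustration only.
records = [
    (1998, 5, 100.0), (1998, 6, 200.0),
    (1999, 5, 140.0), (1999, 6, 180.0),
]
print(long_term_monthly_mean(records))
```

Applying the same function to the observed series and to the simulated series from each weather dataset yields the three curves that are compared at each station.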
In brief, the relatively good simulation of the long-term average monthly streamflow and the consistency of modeling results between the ground-based dataset and the CFSR dataset at Lille Rostavatn and Målselvfossen, compared with the other hydro-gauging stations, demonstrate the influence of the representativeness of ground-based weather stations across the Målselv watershed. Since representative ground-based weather stations are missing for the sub-basins upstream of the hydro-gauging stations Lundberg, Skogly, and Høgskarhus, areal rainfall for those sub-basins is taken from ground-based weather stations located downstream and outside of the watershed. Such weather stations might not be representative of the upstream sub-basins. As a result, the simulated long-term average monthly streamflows at the Lundberg, Skogly, and Høgskarhus stations are not consistent between the two weather datasets. In contrast, the hydrographs of the long-term average monthly streamflow at Lille Rostavatn and Målselvfossen are nearly consistent between the two weather datasets, possibly because these sub-basins receive rainfall from representative weather stations.
Uncertainty analysis of the modeling results from the two weather inputs
In the calibration period, the P-factors calculated at all five hydro-gauging stations from both weather input datasets are ≥0.75, except that the P-factor at the Lille Rostavatn station calculated from the ground-based dataset is slightly under 0.70 (Figure 10(a)). In the validation period, the P-factors at most hydro-gauging stations, from both weather input datasets, are higher than 0.70, excluding the results at Skogly and Lille Rostavatn from the ground-based dataset (Figure 10(c)). The good P-factors achieved in the uncertainty analyses indicate that the measured river discharge is simulated well by the model, i.e., the modeling error is low. The accuracy of the modeling results is higher with the high-resolution CFSR dataset than with the existing scattered ground-based dataset.
The R-factors obtained from both weather input datasets are ≤1.50 for both the calibration and validation periods, except that the R-factors at Høgskarhus and Skogly obtained from the ground-based dataset are higher than 1.50 (Figure 10(b) and 10(d)). Based on the R-factors, it could therefore be concluded that using the high-resolution CFSR weather input to simulate river discharge in the Målselv watershed yields modeling results with high certainty. In contrast, using the available scattered ground-based data to simulate river discharge may produce uncertain results in the upstream sections of the watershed, particularly the areas close to the Høgskarhus and Skogly stations. This is because most of the available ground-based stations are located in the downstream part of the watershed, and representative stations are lacking in the middle and upstream sections.
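For readers unfamiliar with the two uncertainty measures, the following Python sketch shows how they are commonly computed from a 95% prediction uncertainty (95PPU) band. It assumes the usual SUFI-2 definitions: the P-factor is the fraction of observations bracketed by the band, and the R-factor is the average band width divided by the standard deviation of the observations. The function names, the sample data, and the use of the sample standard deviation are assumptions made for illustration.

```python
import statistics


def p_factor(obs, lower, upper):
    """Fraction of observations lying inside the 95PPU band."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)


def r_factor(obs, lower, upper):
    """Mean 95PPU band width divided by the std. dev. of the observations."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(obs)
    # Sample standard deviation is an assumption of this sketch.
    return mean_width / statistics.stdev(obs)


# Hypothetical discharge observations and 95PPU bounds (m3/s).
observed = [10.0, 20.0, 30.0, 40.0]
lower_95 = [8.0, 18.0, 31.0, 35.0]
upper_95 = [12.0, 25.0, 36.0, 45.0]
print(p_factor(observed, lower_95, upper_95))
print(round(r_factor(observed, lower_95, upper_95), 3))
```

A P-factor close to 1 together with a small R-factor indicates that a narrow band brackets most of the observations, which is the pattern the thresholds of 0.70 and 1.50 are meant to capture.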
In brief, based on the above analyses of the statistical coefficients of model performance (R2, NSE, and RSR), the uncertainty measures (P-factor and R-factor), the simulated water balance components, the monthly streamflow hydrograph, and the long-term average monthly streamflow, the present study demonstrates that running the SWAT model with the high-resolution CFSR weather input produces better modeling results than running it with the existing limited ground-based weather input in the Arctic watershed Målselv. One underlying reason for the lower model performance with the ground-based weather input in this study area is that most of the available meteorological stations are located in the downstream sections, while representative stations are lacking in the middle and upstream sections. The Målselv watershed has mountainous topography, where rainfall is highly variable in space and time. Therefore, the scattered ground-based network cannot represent the rainfall characteristics of the whole large watershed well, unlike the denser grid points of the global reanalysis weather data CFSR, which are distributed across the whole watershed. Furthermore, the SWAT model uses the NNS method to calculate the areal rainfall for every sub-basin. This approach could produce uncertain outputs when local meteorological data are assumed to be representative of much larger areas. To our knowledge, this study is the first to use the CFSR dataset to run the QSWAT model in the Arctic watershed Målselv. Since the available ground-based weather data are limited in this study area, the CFSR dataset is evaluated as a reliable alternative source. The performance and certainty of the CFSR data are also verified in this study through the evaluation of multiple factors and criteria. The CFSR dataset can therefore be applied with high reliability for running hydrological models in the Målselv watershed.
According to the performance of the CFSR input dataset in this case study, it is expected that CFSR weather data could be a potential source to be widely applied in other Arctic watersheds.
CONCLUSIONS
Collecting enough weather input data to run hydrological models in the data-sparse Arctic region is a challenge for all modelers. In this study, the possibility of using the high-resolution global reanalysis weather data, CFSR, as an alternative data input for hydrological models was investigated in an Arctic watershed, Målselv. The performance of the CFSR data is compared with that of the ground-based (gauged) data by running the hydrological model QSWAT. Based on the statistical coefficients, model performance with the high-resolution CFSR data is higher than that with the existing scattered ground-based data. The NSE coefficient is in the range of 0.65–0.82 (good to very good) with the CFSR weather input, whereas it is in the range of 0.55–0.74 (satisfactory to good) with the ground-based weather input. The simulation results also demonstrate the high capacity of the CFSR data to replicate the monthly average streamflow, in terms of the monthly average hydrograph and the peak and low-flow values, during the 10-year period 1998–2007. In contrast, the ground-based weather data showed lower performance than the CFSR data because the network of ground-based weather stations is scattered, with only two stations inside and two stations outside the watershed. In addition, most of the ground-based weather stations are located downstream, and representative weather stations are missing in the middle and upstream sections. The higher rainfall amount and greater spatial variation in the CFSR dataset compared with the ground-based dataset lead to higher simulated values of some water balance components, namely actual evapotranspiration, lateral flow, groundwater recharge, and water yield contributing to streamflow. Based on the uncertainty measures, P-factors (≥0.70) and R-factors (≤1.5), the CFSR data demonstrated their capacity to produce modeling results with high certainty in the Målselv watershed.
The promising results from this study open opportunities for hydrological applications of the CFSR data in other watersheds in the Arctic region.
ACKNOWLEDGEMENTS
The authors would like to acknowledge the Department of Technology and Safety, University of Tromsø – The Arctic University of Norway, for their financial support for this study. Additionally, we would like to acknowledge the support from The Research Council Project – Dissemination of climate change research outcomes in cold climate regions (project no. 321305).
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.