Abstract
Due to the large uncertainties of long-term precipitation prediction and reservoir operation, it is difficult to forecast long-term streamflow for large basins with cascade reservoirs. In this paper, a framework coupling the original Climate Forecasting System (CFS) precipitation with the Soil and Water Assessment Tool (SWAT) is proposed to forecast the nine-month streamflow for the Cascade Reservoir System of Han River (CRSHR), which includes the Shiquan, Ankang and Danjiangkou reservoirs. First, CFS precipitation was tested against observations and post-processed with two machine learning algorithms, random forest and support vector regression. The correlation coefficients between the post-processed monthly areal CFS precipitation and observations were 0.91–0.96, indicating that machine learning post-processing of CFS precipitation was not affected by the extended forecast period. Additionally, two precipitation spatio-temporal distribution models, original CFS and similar historical observation, were adopted to disaggregate the processed monthly areal CFS precipitation to daily subbasin-scale precipitation. Based on the restored reservoir flow, the regional SWAT was calibrated for the CRSHR. The Nash–Sutcliffe efficiencies for the flow simulation of the three reservoirs were 0.86, 0.88 and 0.84, respectively, meeting the accuracy requirement. The experimental forecast showed that, for all three reservoirs, the long-term streamflow forecast with the similar historical observed distribution was more accurate than that with the original CFS distribution.
INTRODUCTION
Long-term streamflow forecasting plays an important role in flood control, drought prediction, reservoir operation, efficient water use, etc. How to construct appropriate long-term hydrological forecasting models to meet accuracy requirements has always been one of the key issues researched by hydrologists. A long-term streamflow forecast is defined as a forecast at the monthly, seasonal, or yearly scale, and its lead time is greater than the maximum watershed confluence time (Liang et al. 2018a). There are many studies in long-term hydrological forecasting (Yang et al. 2005; Li et al. 2014a; Anghileri et al. 2016; Hong et al. 2016; Yaseen et al. 2016; Chu et al. 2017; Dariane et al. 2018). However, the research on long-term streamflow forecasting for basins with intensive cascade reservoir system influences is limited.
Currently, long-term hydrological forecasting models are mainly based on the statistical relationship between predictors and predictands or the time series changes in runoff. The predictors could be circulation characteristics, sea surface temperature and hydro-meteorological factors. However, the drawback of using this approach is the lack of a physical basis. Another physically based approach to forecast long-term runoff couples a hydrological model with numerical weather prediction (NWP) results (Wood et al. 2002; Yang et al. 2005). However, there are some limitations to applying this method. It is well acknowledged that the direct outputs from NWPs are inadequate as the input for hydrological models for long-term hydrological forecasting at regional scales. This inadequacy is primarily due to two reasons, as described below.
First, there are doubts about the reliability of some land surface variables output from NWPs, particularly variables such as basin precipitation and surface runoff, which critically depend on sub-grid-scale processes (Risbey & Stone 1996). Many studies demonstrate that the accuracy of NWP varies with the spatio-temporal scale and forecast period (Bauer et al. 2015). It is acknowledged that prediction accuracy increases as the spatial scale increases or the lead time decreases. Specifically, monthly areal precipitation prediction is more accurate than daily point precipitation forecasting (Hulme 1994). However, to achieve an accurate hydrological simulation, the NWP results still require processing.
Second, the spatial resolution of NWPs (such as the Climate Forecasting System (CFS) outputs, which typically have a spatial resolution of 10,000 km²) is coarser than that required for input into a hydrological model at smaller scales (on the catchment or basin scales of 10²–10³ km²) (Yuan et al. 2011). Therefore, in the coupled atmospheric-hydrological model, the difficulty in bridging the gap between the coarse resolution of NWPs and fine resolution of hydrological models, known as ‘downscaling for hydrological impact studies’, needs to be resolved (Wilby et al. 2000; Wood et al. 2004). There are mainly two downscaling approaches. Some researchers apply regional climate models, i.e. dynamic downscaling, to translate historical reanalysis or future prediction output into local meteorological forcing for hydrologic models (Kim et al. 2000; Wilby et al. 2000). Another widely used method is the statistical-based approach. The advantages and disadvantages of these two approaches have been thoroughly documented (Wilby & Wigley 1997; Fowler et al. 2007). Compared to the dynamic model-based alternative, the key advantage of the statistical approach is the lower computational requirement (Wilby et al. 1998).
The classical statistical framework, i.e. bias correction and spatial disaggregation (BCSD), uses quantile mapping and spatial interpolation to solve the bias and temporal-spatial mismatch (Wood et al. 2002; Ines & Hansen 2006; Piani et al. 2010; Jiang et al. 2013). Our study adopts this framework to couple the CFS precipitation with the hydrological model. First, the prerequisite of this correction is that the cumulative distribution functions (CDFs) of the measured and forecasted data need to be determined. However, CDFs are uncertain and have a significant impact on the correction results (Li et al. 2010; Chen et al. 2015). By contrast, the post-processing of the original forecasted prediction could be more direct if based on the regression relationship between the raw prediction and the observation. Currently, whether this regression takes on a linear or non-linear relationship, machine learning algorithms can characterize them well (Vandal et al. 2017). Therefore, machine learning algorithms, such as random forest (RF) (Liang et al. 2018b) and support vector regression (SVR) (Adnan et al. 2017), are introduced. In this study, we apply RF and SVR to process the forecasted precipitation from CFS outputs. Second, in the BCSD method, usually spatial interpolation or random sampling is used to execute spatio-temporal disaggregation (Wood et al. 2004). However, research on downscaling approaches considering the spatial-temporal distribution effect of precipitation on the hydrological response process is rare. This research involves the selection of a hydrological model. Currently, many hydrological models have been developed to simulate long-term runoff (Wood et al. 2004; Yang et al. 2008; Li et al. 2014b). 
For cascade reservoir systems, inflow to the downstream reservoir consists of two parts: one is the flow from the upstream reservoir, which is governed by the operation strategy of the reservoir and is unpredictable; the other is the natural runoff from the intermediate area between adjacent dams, which can be forecasted to assist in cascade reservoir operation. In this paper, to simulate the runoff of different intermediate areas, the Soil and Water Assessment Tool (SWAT), a physically based, semi-distributed watershed hydrological model, was used. The SWAT model has been widely applied to runoff simulation by many researchers (Kardhana et al. 2017). The two advantages of the SWAT model are that (a) the characteristics of runoff yield and concentration for different intermediate areas can be described with different parameter sets and (b) daily precipitation input at the subbasin scale can describe the spatio-temporal distribution of precipitation over the basin. However, it is difficult to predict daily precipitation in a subbasin. In this study, typical precipitation from two spatio-temporal distribution models, an original CFS forecast and the most similar historical observations, was scaled to forecast the daily precipitation in the subbasin.
This paper describes an exploratory long-term monthly streamflow forecasting model for the Cascade Reservoir System of Han River (CRSHR); this model combines the CFS outputs and SWAT model based on two machine learning algorithms, SVR and RF. The paper is organized as follows: the next two sections introduce the methodology and the study domain and data, respectively. In the following section, the forecasted monthly CFS precipitation is processed and disaggregated spatio-temporally, and the streamflow forecast of reservoirs for the CRSHR is obtained. The conclusions are presented in the final section.
METHODOLOGY
In our forecast scheme, after original CFS precipitation post-processing, the daily subbasin precipitation is determined by downscaling the monthly area precipitation with typical scaling. Other meteorological elements (temperature, wind speed, relative humidity and solar radiation) are produced with the SWAT weather generator (Neitsch et al. 2011). Then, the nine-month meteorological driving forcings for the SWAT model are obtained. Finally, nine-month streamflow forecast is produced by driving the initialized SWAT model coupled with the predicted meteorological elements.
CFS precipitation post-processing
The post-processing technology is based on the assumption that a stable linear or non-linear relationship can be developed between the CFS surface forecast fields and observed areal climatology. To reduce the prediction uncertainties, this study uses two machine learning algorithms (RF and SVR) to process the forecasted CFS precipitation and determine the monthly areal precipitation. Because temperature is positively correlated with precipitation (Trenberth & Shea 2005) and the temperature forecast is relatively stable, temperature was treated as a covariate in the processing procedure to reduce anomalies. Considering that the accuracy of precipitation forecasting decreases as lead time is extended (Li et al. 2017), separate regression relationships between the forecasts and the corresponding observations are constructed and investigated for each lead time. The principles of the two algorithms used in this study are described as follows:
RF is a machine learning algorithm that trains and determines the predictand with an ensemble of classification and regression trees (CART) built from resampled data (Breiman 1996a). First, this algorithm adopts a bootstrap re-sampling approach (Breiman 1996b) and out-of-bag (OOB) error estimation (Breiman 1996a) to sample the original data and calculate the generalization error. Second, it uses decision tree analysis, random subspace theory and the Gini impurity index to predict the result (Zhu & Pierskalla 2016). Finally, the optimal regression result is obtained by voting or by other methods.
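The bootstrap and out-of-bag steps described above can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: the function names, the sample size and the seed are hypothetical, and averaging is shown as one common way to combine tree outputs for regression.

```python
import random


def bootstrap_with_oob(n_samples, n_trees, seed=0):
    """Draw one bootstrap sample per tree and record its out-of-bag (OOB) indices."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_trees):
        # Sample n indices with replacement: the in-bag set used to grow one tree.
        in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
        # Indices never drawn are out-of-bag and can be used to estimate the error.
        oob = sorted(set(range(n_samples)) - set(in_bag))
        draws.append((in_bag, oob))
    return draws


def aggregate_regression(tree_predictions):
    """Combine per-tree regression outputs by simple averaging (an assumption here)."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical sizes, chosen only to keep the example small.
draws = bootstrap_with_oob(n_samples=20, n_trees=3, seed=1)
```

Each tree is trained on its in-bag indices and evaluated on its OOB indices, so the generalization error can be estimated without a separate validation set.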
Another machine learning algorithm is SVR, which is based on the support vector machine (SVM) algorithm (Vapnik 2000). In SVR, a small number of support vectors can represent the entire sample set (Vapnik 2000). SVM uses slack variables and the error penalty parameter (C) to balance model complexity against training error (Tripathi et al. 2006). This method can solve non-linear regression optimization problems well. Based on the Lagrange function and Karush–Kuhn–Tucker conditions (Kuhn & Tucker 1951), for a given training dataset (xi, yi), the objective function can be expressed in terms of a kernel function K(xi, xj) and an insensitive loss parameter ε. The optimal solution can be obtained by Equation (2), and the corresponding xi are the support vectors. Specifically, this paper selected a Gaussian radial basis function (RBF) as the kernel function (Scholkopf et al. 1997), where σ is the width factor of the RBF. σ and C are the key variables in the SVR algorithm.
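For reference, the ε-insensitive SVR dual problem and the Gaussian RBF kernel are commonly written as below. This is the standard textbook form and is given as an assumption about the notation: the Lagrange multipliers α_i and α_i*, the sample size n and the 2σ² scaling of the kernel are not taken from the source.

```latex
\begin{aligned}
\max_{\alpha,\alpha^{*}}\; W(\alpha,\alpha^{*})
  &= -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
     (\alpha_i-\alpha_i^{*})(\alpha_j-\alpha_j^{*})\,K(x_i,x_j)
   + \sum_{i=1}^{n} y_i\,(\alpha_i-\alpha_i^{*})
   - \varepsilon\sum_{i=1}^{n}(\alpha_i+\alpha_i^{*}) \\
\text{s.t.}\quad
  & \sum_{i=1}^{n}(\alpha_i-\alpha_i^{*}) = 0,
    \qquad 0 \le \alpha_i,\ \alpha_i^{*} \le C \\
K(x_i,x_j) &= \exp\!\left(-\frac{\lVert x_i-x_j\rVert^{2}}{2\sigma^{2}}\right)
\end{aligned}
```

In this form, the training samples whose multipliers satisfy α_i − α_i* ≠ 0 are the support vectors.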
Precipitation spatio-temporal disaggregation
SWAT hydrological model
The SWAT model can predict water and sediment transport in large basins over a long period (Arnold et al. 1998). It is a semi-distributed subbasin-scale hydrological model (Arnold et al. 2011). Based on various land use/land cover (LULC) conditions, which are a combination of land use, soil and slope type, SWAT divides the subbasin into hydrological response units (HRUs) to represent the runoff units (Neitsch et al. 2011). The runoff parameters of SWAT vary with LULC. Although an HRU is a runoff unit, an entire subbasin is the input unit of the daily meteorological forcings (precipitation, temperature, wind speed, relative humidity and solar radiation) and the output unit of hydrological factors (flow, sediment concentration, etc.) (Winchell et al. 2011). Therefore, SWAT can simulate the intermediate runoff under different LULC conditions in the cascade reservoir system (Nguyen-Tien et al. 2018).
Evaluation criteria
To perform an accuracy assessment (AA) of the precipitation or flow prediction, ±20% of the multi-year (1986–2016) amplitude of variation is taken as the allowable deviation. If the forecast error is within the allowable deviation, the forecast value is qualified. The qualified rate (QR) is the percentage of forecasts that are qualified, and the MAPE calculated from the relative errors of the qualified forecasts only is called the QR-MAPE. Together, the QR and QR-MAPE indicate the overall forecast accuracy.
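The accuracy assessment above can be sketched as follows. This is a hedged illustration: the function name and the example numbers are hypothetical, and the multi-year amplitude is assumed to be supplied per forecast (for example, the range of the historical values for the same calendar month), which the text does not spell out.

```python
def accuracy_assessment(obs, fcst, amplitude, tolerance=0.2):
    """Return (flags, QR in %, QR-MAPE in %) for paired observations and forecasts.

    amplitude[i] is the assumed multi-year amplitude of variation matching obs[i];
    the allowable deviation is tolerance * amplitude[i].
    """
    flags = []
    qualified_errors = []
    for o, f, a in zip(obs, fcst, amplitude):
        qualified = abs(f - o) <= tolerance * a
        flags.append(qualified)
        if qualified:
            # Absolute relative error (%) is accumulated for qualified forecasts only.
            qualified_errors.append(abs(f - o) / o * 100.0)
    qr = 100.0 * sum(flags) / len(flags)
    if qualified_errors:
        qr_mape = sum(qualified_errors) / len(qualified_errors)
    else:
        qr_mape = float("nan")
    return flags, qr, qr_mape


# Hypothetical values: with an amplitude of 100, the allowable deviation is 20.
flags, qr, qr_mape = accuracy_assessment(
    obs=[100.0, 50.0, 200.0],
    fcst=[110.0, 80.0, 190.0],
    amplitude=[100.0, 100.0, 100.0],
)
```

Because the QR-MAPE ignores unqualified forecasts, it should be read together with the QR: a low QR-MAPE with a low QR only describes the few forecasts that passed.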
STUDY DOMAIN AND DATA
Study domain
The Han River, with a length of approximately 1,577 km, is the largest tributary of the Yangtze River. The Han River originates from the southern edge of the Qinling Mountains in China and has a basin area of approximately 159,000 km². There are many large water conservancy projects in this basin, including the Shiquan reservoir, Ankang reservoir, and Danjiangkou reservoir, as shown in Figure 1. The storage capacities of these three reservoirs are 0.44, 2.58 and 29.05 billion cubic metres, respectively. This area is considered the CRSHR due to the reservoir interaction. The three areas of the CRSHR are the Shiquan watershed, Shiquan-Ankang intermediate area and Ankang-Danjiangkou intermediate area, which are shown in Figure 2. The spatio-temporal distribution of precipitation in the Han River basin is non-uniform, and the sum volume of precipitation during the flood season (roughly July to October) accounts for 65% of the overall annual flow.
Data
The second generation CFS is a fully coupled ocean-atmosphere-land model developed by the National Centers for Environmental Prediction (NCEP) (Saha et al. 2006, 2014). This CFS has a T126 spatial resolution (∼10,000 km²). Figure 1 shows all 25 mesh points over the Han River basin. Compared to other seasonal forecast models, CFS demonstrated a better seasonal climate forecasting performance (Yuan et al. 2011). This NCEP model became operational in March 2011 and runs at 0, 6, 12 and 18 UTC every day (Saha et al. 2014). The forecast period is nine months, and forecasts are provided at six-hour steps. In addition, the NCEP also provides a nine-month retrospective forecast every fifth day, beginning January 1st, over a 29-year period from 1982 to 2010. These data could ensure a larger sample size for robust evaluation.
In this study, daily observed meteorological variables (precipitation, temperature, wind speed, solar radiation and relative humidity) (1982–2016) were provided by the China National Meteorological Center. The monthly average restored inflows of the three reservoirs (2003–2016) were provided by the Yangtze River Waterway Bureau. Note that the restored monthly inflow equals the observed monthly inflow plus the change in upstream reservoir storage. The underlying surface data used as input into the SWAT model include the elevation, land use and soil type. NASA and NIMA provided the 90 m Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM). The land use data were extracted from the WESTDC_Land_Cover_V.1.0 dataset at a 1:100,000 scale provided by the Chinese Academy of Sciences. The soil data were provided by the second national land survey of the Institute of Soil Science and acquired in 1995.
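The inflow restoration mentioned above amounts to a simple water balance, sketched below. The function name and the units (flow in m³/s, storage in m³) are assumptions made for illustration; the source does not state how the storage change was converted to a flow rate.

```python
def restored_inflow(observed_inflow, storage_start, storage_end, days_in_month):
    """Monthly mean restored inflow (m^3/s).

    observed_inflow: observed monthly mean inflow (m^3/s)
    storage_start, storage_end: upstream reservoir storage at the start and
        end of the month (m^3)
    """
    seconds_in_month = days_in_month * 86400
    # Water retained upstream during the month is added back as an equivalent flow.
    storage_change_as_flow = (storage_end - storage_start) / seconds_in_month
    return observed_inflow + storage_change_as_flow


# Hypothetical example: 259.2 million m^3 stored over 30 days equals 100 m^3/s.
example = restored_inflow(500.0, 1.0e9, 1.0e9 + 2.592e8, 30)
```

A negative storage change (drawdown) lowers the restored inflow, since part of the observed flow came from storage rather than from runoff.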
RESULTS AND DISCUSSION
Nine-month precipitation forecasting
SVR and RF regression construction
To evaluate the accuracy of the original CFS, the monthly gridded CFS precipitation forecast with a lead time of one month from 1982 to 2016 was compared with the observed ground precipitation at both the grid and basin scales. The correlation coefficients are 0.66 and 0.72, respectively, for these two comparisons (shown in Figure 3). This result indicates that the relationship between the CFS forecast and observation may not be linear, and it is currently difficult to determine whether it is linear or non-linear. Under this premise, the advantage of RF and SVR is that both machine learning algorithms can describe the relationship well regardless of whether it is linear or non-linear. Moreover, comparing the correlation between the gridded CFS and on-site precipitation with that between the areal CFS and observed areal precipitation suggests that the agreement between the CFS precipitation and observed climatology is better at the areal scale than at the mesh-point scale. Therefore, RF and SVR were used to process the original areal CFS precipitation in the Han River basin. Considering that the precipitation at each mesh point has an impact on the basin areal precipitation, this study constructed the regression relationship between the CFS forecast at all mesh points over the Han River basin and basin areal precipitation.
Different regression relationships between the historical CFS forecast and observation were constructed for different lead times because the accuracy of CFS forecasting varies as the forecasting time increases. Note that in the regression relationships, the regression factors are the historical monthly CFS retrospective precipitation and temperature forecasts at the 25 mesh points, and the dependent variable is the monthly observed areal precipitation in the Han River basin. Then, the regression relationships were used to reprocess the historical monthly areal CFS precipitation retrospective forecast. The correlation coefficients between the processed areal CFS precipitation and observed precipitation are shown in Figures 4 and 5. Whether the original CFS was processed by RF or SVR, the R coefficient of the processed CFS is greater than 0.91 for various forecasting periods, indicating that both RF and SVR are effective. Furthermore, to represent the effects of calibration, the RE was calculated to reflect the accuracy of the processed CFS. Taking the processed CFS with a one-month lead time as an example, the RE was calculated from 1982 to 2016, as shown in Figure 6. Figure 6 shows that although there are several processing anomalies with significant errors, the median RE is nearly zero, especially during the rainy seasons from March to September. Therefore, the post-processing result meets the accuracy requirement. Furthermore, the two algorithms have a similar effect, and a better algorithm could not be determined. Therefore, when processing future CFS precipitation forecast, it is suggested that the results of these two algorithms are integrated.
To evaluate the performance of the two algorithms, this study calculated nine-month retrospective precipitation predictions based on historical data from December 31, 2010 to March 31, 2015. For each hindcasting, the regression relationship was reconstructed based on historical observations and forecasts produced before the period of the retrospective prediction. Tables 1 and 2 show the MAPE of the forecast from the validation with different lead times. The results suggest that the predicted MAPE increases with the lead time, which is reasonable. Moreover, it indicates that the processing performance is better for the rainy season than for the dry season.
Forecasting period | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1st month | 52 | 65 | 23 | 26 | 18 | 22 | 17 | 19 | 27 | 26 | 54 | 94 |
2nd month | 66 | 52 | 24 | 25 | 17 | 24 | 20 | 21 | 30 | 23 | 98 | 143 |
3rd month | 67 | 50 | 29 | 27 | 17 | 23 | 16 | 23 | 29 | 27 | 92 | 182 |
4th month | 61 | 48 | 25 | 24 | 17 | 22 | 15 | 26 | 27 | 28 | 80 | 183 |
5th month | 67 | 52 | 23 | 24 | 16 | 20 | 16 | 25 | 28 | 30 | 91 | 155 |
6th month | 77 | 47 | 22 | 27 | 18 | 25 | 15 | 25 | 30 | 29 | 93 | 181 |
7th month | 64 | 54 | 22 | 28 | 18 | 21 | 15 | 24 | 28 | 29 | 97 | 129 |
8th month | 76 | 47 | 27 | 23 | 17 | 23 | 15 | 24 | 26 | 24 | 74 | 187 |
9th month | 59 | 59 | 27 | 28 | 17 | 22 | 19 | 22 | 29 | 29 | 81 | 111 |
Forecasting period | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1st month | 106 | 95 | 45 | 37 | 18 | 11 | 6 | 12 | 21 | 37 | 94 | 153 |
2nd month | 128 | 107 | 44 | 42 | 20 | 22 | 13 | 20 | 18 | 31 | 110 | 272 |
3rd month | 135 | 87 | 50 | 45 | 15 | 17 | 11 | 23 | 37 | 56 | 154 | 268 |
4th month | 121 | 74 | 36 | 41 | 20 | 28 | 22 | 14 | 25 | 35 | 141 | 314 |
5th month | 120 | 87 | 42 | 41 | 24 | 30 | 16 | 21 | 24 | 36 | 100 | 268 |
6th month | 113 | 97 | 34 | 37 | 22 | 18 | 9 | 13 | 31 | 30 | 111 | 326 |
7th month | 109 | 86 | 48 | 43 | 18 | 21 | 12 | 17 | 29 | 36 | 81 | 223 |
8th month | 126 | 75 | 39 | 31 | 23 | 19 | 9 | 22 | 16 | 32 | 130 | 280 |
9th month | 100 | 93 | 45 | 36 | 23 | 25 | 15 | 23 | 30 | 38 | 131 | 323 |
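The hindcast validation described before the tables refits the regression before every retrospective prediction, using only data available at that time. A minimal sketch of this expanding-window scheme is given below. The single-factor `fit_ratio` model is a hypothetical stand-in for RF or SVR, used only to keep the example self-contained.

```python
def fit_ratio(train_raw, train_obs):
    """Hypothetical stand-in model: one multiplicative bias-correction factor."""
    return sum(train_obs) / sum(train_raw)


def expanding_window_mape(raw, obs, first_test_index):
    """Refit before every hindcast on earlier samples only and return the MAPE (%)."""
    errors = []
    for t in range(first_test_index, len(raw)):
        # Only samples strictly before t are used, mimicking real-time operation.
        factor = fit_ratio(raw[:t], obs[:t])
        corrected = factor * raw[t]
        errors.append(abs(corrected - obs[t]) / obs[t] * 100.0)
    return sum(errors) / len(errors)


# Hypothetical series: the last observation departs from the earlier 1:1 relation.
mape = expanding_window_mape(raw=[10.0, 10.0, 10.0], obs=[10.0, 10.0, 20.0],
                             first_test_index=2)
```

Keeping the training window strictly earlier than the forecast issue time avoids leaking future observations into the regression, which would make the reported MAPE look better than an operational forecast could achieve.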
Figures 4 and 5 show that for a nine-month prediction period, with the exception of the first month, the correlation coefficient from RF is equal to or slightly greater than that from SVR in the model calibration period. Therefore, it is plausible that the RF algorithm is superior to the SVR algorithm for CFS precipitation post-processing. However, when the two algorithms were tested on the nine-month (January to September 2017) monthly CFS precipitation forecast obtained on December 31, 2016, SVR generally gave a better post-processing result than RF, as shown in Table 3. Specifically, the QR from SVR is larger than that from RF (only the SVR forecast for September 2017 is not qualified), and the QR-MAPE from SVR is less than that from RF. However, for March, May, August and September, the errors from RF are smaller than those from SVR, whereas the opposite holds for the other months. Therefore, for the precipitation forecast of a given month, it is difficult to determine which method has a better post-processing effect. To reduce the uncertainty of forecasting, averaging the results of the two algorithms could be good practice.
Date (yyyy-mm) | Obs (mm) | RF corrected CFS (mm) | RF AA | RF RE (%) | SVR corrected CFS (mm) | SVR AA | SVR RE (%) |
---|---|---|---|---|---|---|---|
2017-01 | 17.7 | 19.5 | Y | 10.4 | 16.7 | Y | –5.9 |
2017-02 | 22 | 28.7 | Y | 30.4 | 25.7 | Y | 17.0 |
2017-03 | 53.9 | 45.7 | Y | –15.2 | 40.9 | Y | –24.1 |
2017-04 | 77.9 | 99.8 | N | | 78.5 | Y | 0.7 |
2017-05 | 80.4 | 94.0 | Y | 17.0 | 100.6 | Y | 25.2 |
2017-06 | 119.8 | 144.9 | Y | 20.9 | 134.0 | Y | 11.9 |
2017-07 | 109.1 | 157.8 | N | | 143.0 | Y | 31.0 |
2017-08 | 140.1 | 138.5 | Y | –1.2 | 143.7 | Y | 2.6 |
2017-09 | 211.6 | 136.1 | N | | 111.4 | N | |
QR | | 67% | | | 89% | | |
QR-MAPE | | 15.8 | | | 14.8 | | |
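The suggestion to average the two algorithms can be illustrated with the first three rows of Table 3. This is a sketch only: the helper names are hypothetical, equal weighting is an assumption, and relative errors recomputed from the rounded table entries may differ slightly from the printed RE values.

```python
def relative_error(forecast, observed):
    """Relative error in percent; positive values mean over-forecasting."""
    return (forecast - observed) / observed * 100.0


def average_forecast(rf, svr):
    """Equal-weight combination of the two post-processed forecasts (an assumption)."""
    return [(a + b) / 2.0 for a, b in zip(rf, svr)]


# January to March 2017 values from Table 3 (mm).
obs = [17.7, 22.0, 53.9]
rf = [19.5, 28.7, 45.7]
svr = [16.7, 25.7, 40.9]

combined = average_forecast(rf, svr)
combined_re = [relative_error(f, o) for f, o in zip(combined, obs)]
```

When the two algorithms err in opposite directions, as in January 2017, the combined forecast lies between them and its error partly cancels; when both err in the same direction, averaging cannot do better than the more accurate of the two.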
Typical monthly precipitation scaling
The daily precipitation forecasting on the subbasin scale was obtained by multiplying the weighted coefficient in the typical month by the processed monthly areal CFS precipitation according to Equations (3)–(5). A typical historical storm was selected by using the hydrological similarity measure method. Finally, the spatio-temporal distribution of daily precipitation from January to September in 2015 was determined as the historical similar precipitation spatio-temporal distribution. Figures 7 and 8 show the spatial and temporal distributions of the forecasted precipitation in June 2017 obtained based on the historical similar observed precipitation and original CFS precipitation forecast, respectively. These results indicate that there are considerable differences between the similar analysis and original CFS forecast method for both the temporal and spatial distribution.
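The typical-month scaling described above can be sketched as follows. Since Equations (3)–(5) are not reproduced here, the weighting used in this example is an assumption: each daily subbasin value of the typical month is divided by that month's area-weighted areal total, and the resulting weights are multiplied by the processed monthly areal forecast. All names are hypothetical.

```python
def disaggregate_monthly(monthly_areal_forecast, typical_daily, areas):
    """Scale a typical month's daily subbasin pattern to a forecast monthly areal total.

    typical_daily[d][s]: precipitation (mm) on day d in subbasin s of the typical month
    areas[s]: area of subbasin s, used for area-weighted averaging
    """
    total_area = sum(areas)
    # Area-weighted monthly areal precipitation of the typical month.
    typical_areal = sum(
        sum(p * a for p, a in zip(day, areas)) / total_area for day in typical_daily
    )
    if typical_areal <= 0.0:
        raise ValueError("the typical month must contain some precipitation")
    # Weight coefficient of each day and subbasin, then scaling by the forecast.
    return [
        [p / typical_areal * monthly_areal_forecast for p in day]
        for day in typical_daily
    ]


# Hypothetical two-day, two-subbasin pattern with equal subbasin areas.
daily = disaggregate_monthly(40.0, [[10.0, 20.0], [0.0, 10.0]], [1.0, 1.0])
```

With this choice of weights, the area-weighted monthly total of the disaggregated field reproduces the forecast areal precipitation, while the day-to-day and subbasin-to-subbasin pattern is inherited from the typical month.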
Monthly streamflow forecasting
SWAT model construction
In the SWAT model, the whole Han River basin was divided into 64 subbasins and further subdivided into 367 HRUs according to LULC. Note that the subbasin is the input unit for precipitation in the SWAT model, and the areal average precipitation of each subbasin is used as the input. The other meteorological elements of the subbasin were obtained by inverse distance interpolation of the data measured at the 12 observation gauges; these elements include temperature, wind speed, solar radiation and relative humidity.
The restored monthly reservoir streamflows for the period 2003–2012 were used to calibrate the SWAT model parameters, and data from 2013 to 2016 were used for validation. The Sequential Uncertainty Fitting optimization algorithm (version 2) (Arnold et al. 2012) in the SWAT-CUP program (Noori & Kalin 2016) was used to automatically calibrate the 19 targeted SWAT parameters (Table 4). Moreover, considering the physical meaning of some parameters, the automatically calibrated parameter values needed to be manually adjusted. Finally, the optimal parameter set of the SWAT model for all three regions (Shiquan watershed, Shiquan-Ankang intermediate area and Ankang-Danjiangkou intermediate area) was obtained, as shown in Table 5. Figure 9 shows the simulation of monthly discharge for the three reservoirs in the calibration and validation periods. To evaluate the accuracy of the model simulation, the NSE was calculated. The NSE values for the Shiquan, Ankang and Danjiangkou reservoirs are 0.86, 0.88 and 0.84 in the calibration period and 0.73, 0.73 and 0.66 in the validation period, respectively. Therefore, the SWAT model can provide an acceptable accuracy for the monthly streamflow simulation in the Han River basin in the calibration and validation periods.
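For reference, the NSE used to evaluate the simulation can be computed as in the short sketch below, which follows the standard Nash–Sutcliffe definition; the function name is illustrative.

```python
def nash_sutcliffe(observed, simulated):
    """Nash–Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    # Sum of squared simulation errors.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


perfect = nash_sutcliffe([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = nash_sutcliffe([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A simulation that is no better than always predicting the observed mean scores 0, so the validation values of 0.66–0.73 indicate skill well above that baseline, although lower than in calibration.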
Parameter | Input file | Units | Default value | Range | Description |
---|---|---|---|---|---|
CN2ᵃ | .mgt | – | HRU | ±20% | SCS runoff curve number |
ALPHA_BNK | .rte | day | 0.048 | 0–1 | Base flow factor for bank storage |
ESCO | .hru | – | 0.95 | 0.01–1 | Soil evaporation compensation factor |
SOL_BDᵃ | .sol | Mg/m³ | Soil data | ±50% | Soil bulk density |
SOL_Kᵃ | .sol | mm/h | Soil data | ±80% | Soil saturated infiltration coefficient |
REVAPMN | .gw | mm | 1 | 0–1000 | Threshold water level in shallow aquifer at which ‘revap’ occurs |
GW_DELAY | .gw | day | 31 | 0–500 | Groundwater delay |
CH_N2 | .rte | – | 0.014 | –0.01 to 0.3 | Channel Manning coefficient |
CH_K2 | .rte | mm/h | 0 | –0.01 to 500 | Channel effective hydraulic conductivity |
TIMP | .bsn | – | 1 | 0.01 to 1 | Snow pack temperature lag factor |
TLAPS | .sub | °C/km | 0 | –10 to 0 | Temperature lapse rate |
SURLAG | .bsn | day | 4 | 1–24 | Surface runoff lag time |
LAT_TTIME | .rte | day | 0 | 0–180 | Lateral flow time |
ALPHA_BF | .gw | day | 0.048 | 0–1 | Base flow recession constant |
SOL_AWCᵃ | .sol | – | Soil data | ±20% | Available water capacity of the soil layer |
EPCO | .hru | – | 1 | 0.01–1 | Plant evaporation compensation factor |
GW_REVAP | .gw | – | 0.02 | 0.02–0.2 | Groundwater ‘revap’ coefficient |
CANMX | .hru | mm | 0 | 0–100 | Maximum canopy storage |
GWQMN | .gw | mm | 0 | 0–5000 | Threshold water depth in the shallow aquifer for return flow to occur |
ᵃ These parameters are varied as a percentage of their default values to maintain their relative spatial variability.
Parameter | Shiquan watershed | Shiquan-Ankang intermediate area | Ankang-Danjiangkou intermediate area |
---|---|---|---|
SURLAG | 0.325425 | 9.594075 | 9.857525 |
CN2 | –0.0854 | 0.1038 | 0.1986 |
SOL_K | 0.5832 | 0.4376 | 0.7384 |
SOL_BD | 0.26615 | 0.40805 | 0.14625 |
SOL_AWC | 0.0979 | 0.3037 | 0.1501 |
GW_DELAY | 186.75 | 200.75 | 398.75 |
ALPHA_BF | 0.9575 | 0.6715 | 0.0785 |
GWQMN | 912.5 | 1307.5 | 987.5 |
GW_REVAP | 0.09137 | 0.06275 | 0.02675 |
REVAPMN | 852.5 | 350.5 | 866.5 |
ALPHA_BNK | 0.6685 | 0.2005 | 0.7705 |
CH_N2 | 0.249315 | 0.225755 | 0.001625 |
CH_K2 | 380.7476 | 23.24047 | 275.7455 |
CH_N1 | 26.56614 | 22.78741 | 24.82673 |
CH_K1 | 220.35 | 48.15 | 34.95 |
LAT_TTIME | 5.49 | 149.49 | 1.71 |
CANMX | 92.35 | 3.35 | 13.65 |
ESCO | 0.143155 | 0.060985 | 0.233245 |
EPCO | 0.598555 | 0.680725 | 0.454015 |
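As a minimal sketch of how such calibrated values might be applied, the snippet below contrasts a relative change with an absolute replacement. It assumes that the CN2 entry is a fractional change applied to spatially varying defaults (in line with the footnote on percentage-varied parameters) and that SURLAG is a direct replacement value. The HRU default values and the helper names `apply_relative` and `apply_replace` are hypothetical and not taken from the paper.

```python
# Hypothetical illustration: relative (percentage) change versus absolute
# replacement of a calibration parameter. Default CN2 values are made up.

def apply_relative(defaults, change):
    """Scale every spatially distributed default by (1 + change)."""
    return [d * (1.0 + change) for d in defaults]


def apply_replace(defaults, value):
    """Overwrite every default with a single calibrated value."""
    return [value for _ in defaults]


# Made-up CN2 defaults for three HRUs (not from the paper).
cn2_defaults = [60.0, 75.0, 85.0]

# CN2 entry for the Shiquan watershed, read here as a relative change.
cn2_shiquan = apply_relative(cn2_defaults, -0.0854)

# SURLAG entry for the Shiquan watershed, read here as an absolute value.
surlag_shiquan = apply_replace([4.0], 0.325425)

print(cn2_shiquan)
print(surlag_shiquan)
```

Under this reading, the relative change keeps the ratios between HRUs intact, which is the spatial variability the footnote refers to, whereas a replacement removes it.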
Experimental monthly streamflow forecasting for the CRSHR
Using the post-processed nine-month CFS monthly precipitation forecast for January to September 2017 (see 'Nine-month precipitation forecasting'), we evaluated the monthly streamflow forecast for the same period for the three areas of the CRSHR with the long-term forecast scheme proposed in this study. For comparison, the daily precipitation in each subbasin was obtained by scaling two spatio-temporal distribution models in turn: the original CFS daily precipitation and the similar historical observation. The other forecasted meteorological elements (temperature, wind speed, solar radiation and relative humidity) were obtained from the weather generator. All the forecasted meteorological elements were input into each subbasin to produce the monthly streamflow at the three outlets (Shiquan, Ankang and Danjiangkou), as shown in Figure 10. The values of QR and QR-MAPE are summarized in Table 6. For the Shiquan reservoir, compared with the streamflow from the original CFS spatio-temporal distribution, the QR of the streamflow from the similar historical observed spatio-temporal distribution was 11 percentage points higher, while the QR-MAPE was 15 percentage points lower; for the Ankang reservoir, the QR remained unchanged, but the QR-MAPE was 8 percentage points lower; for the Danjiangkou reservoir, the QR was 23 percentage points higher, while the QR-MAPE was 28 percentage points lower. Therefore, in this experimental forecast, the streamflow forecast based on the similar historical observed spatio-temporal distribution of precipitation was more accurate than that using the original CFS spatio-temporal distribution.
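The scaling step described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes a single multiplicative factor per month, chosen so that the area-weighted monthly total of the template pattern (original CFS or similar historical observation) matches the post-processed monthly areal precipitation. The function name `scale_daily_pattern`, the area weights and the sample numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def scale_daily_pattern(template, area_weights, target_monthly_areal):
    """Rescale a daily subbasin precipitation pattern to a monthly areal total.

    template: array (n_days, n_subbasins), daily precipitation in mm
    area_weights: array (n_subbasins,), relative subbasin areas
    target_monthly_areal: post-processed monthly areal precipitation in mm
    """
    template = np.asarray(template, dtype=float)
    weights = np.asarray(area_weights, dtype=float)
    weights = weights / weights.sum()  # normalise to area fractions

    # Area-weighted monthly total implied by the template pattern.
    template_total = float((template @ weights).sum())
    if template_total <= 0.0:
        raise ValueError("template has no precipitation to scale")

    # One factor for the whole month preserves the pattern in space and time.
    return template * (target_monthly_areal / template_total)


# Made-up example: 3 days, 2 equally sized subbasins, target of 22 mm.
template = np.array([[0.0, 2.0], [10.0, 6.0], [4.0, 0.0]])
weights = np.array([0.5, 0.5])
daily = scale_daily_pattern(template, weights, 22.0)
print(daily)
```

Because only one factor is applied, the relative distribution of precipitation across days and subbasins is inherited entirely from the chosen template, which is why the choice of template matters for the resulting streamflow.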
Reservoir | Original CFS spatio-temporal distribution: QR (%) | Original CFS spatio-temporal distribution: QR-MAPE (%) | Similar historical observed spatio-temporal distribution: QR (%) | Similar historical observed spatio-temporal distribution: QR-MAPE (%) |
---|---|---|---|---|
Shiquan reservoir | 67 | 27 | 78 | 12 |
Ankang reservoir | 78 | 23 | 78 | 15 |
Danjiangkou reservoir | 44 | 45 | 67 | 17 |
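The comparison between the two distribution models can be checked with simple arithmetic on the tabulated values. The short sketch below computes the differences in percentage points; the dictionary layout and the function name `point_differences` are illustrative choices, not part of the study.

```python
# Tabulated (QR, QR-MAPE) values in percent for each distribution model.
table6 = {
    "Shiquan": {"cfs": (67, 27), "historical": (78, 12)},
    "Ankang": {"cfs": (78, 23), "historical": (78, 15)},
    "Danjiangkou": {"cfs": (44, 45), "historical": (67, 17)},
}


def point_differences(table):
    """Return (QR gain, QR-MAPE reduction) in percentage points per reservoir."""
    out = {}
    for name, row in table.items():
        qr_cfs, mape_cfs = row["cfs"]
        qr_hist, mape_hist = row["historical"]
        out[name] = (qr_hist - qr_cfs, mape_cfs - mape_hist)
    return out


diffs = point_differences(table6)
for name, (qr_gain, mape_drop) in diffs.items():
    print(f"{name}: QR +{qr_gain} points, QR-MAPE -{mape_drop} points")
```

The results are differences in percentage points (for example, 78 minus 67 for the Shiquan QR), not relative percentage changes.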
CONCLUSIONS
In this study, a nine-month streamflow forecast for the CRSHR was obtained based on the SWAT model with CFS output. The main findings are summarized as follows:
The NCEP Climate Forecast System outputs were applied to long-term streamflow forecasting. Two machine learning approaches, RF and SVR, were proposed for post-processing the CFS precipitation forecast at different lead times. Regardless of the method used (RF or SVR) and of the forecast period (within nine months), the correlation coefficients between the processed CFS and observed precipitation were at least 0.91. In addition, the processing performance for the rainy season (from May to August) was better than that for the dry season. A further advantage of post-processing is that, within the nine-month period tested, an increased lead time had no apparent effect on the accuracy of the processed prediction. Furthermore, the QR and MAPE of the testing results suggested that both processing methods used for the CFS precipitation met the accuracy requirement. However, it is difficult to determine which method is better. In this paper, to reduce uncertainty, the final precipitation forecast is the average of the results from the two algorithms.
To forecast the streamflow of the studied cascade reservoir system, constructing SWAT separately for the three regions is suggested, so that the rainfall-runoff mechanism of each region is reflected. To account for the effect of the spatio-temporal distribution of precipitation, the daily precipitation in each subbasin was determined by scaling a typical monthly precipitation pattern. Different spatio-temporal distribution models have a significant impact on the streamflow forecast results. For the long-term streamflow forecast, the spatio-temporal distribution model based on a hydrological similarity analysis performed better than that of the original CFS prediction.
Finally, a long-term streamflow forecasting scheme for the CRSHR was achieved by combining CFS precipitation post-processing with the SWAT model. This scheme could offer a new approach to coupled atmospheric-hydrological streamflow forecasting. The framework proposed in this paper could be used to forecast long-term streamflow for similar cascade reservoir systems.
ACKNOWLEDGEMENTS
This paper was jointly supported by the National Key Research and Development Program of China (2016YFC0402706, 2016YFC0402707), the Fundamental Research Funds for the Central Universities (2017B611X14), the Key Program of the National Natural Science Foundation of China (41730750), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX17_0415).