Abstract
Reservoir operation is subject to many sources of uncertainty. These uncertainties can lead to operation risks, which directly affect the comprehensive benefit of reservoirs. This study developed a simple framework to quantify the uncertainty contribution arising from the inputs, model structures, model parameters, and their interaction in reservoir operation. We established a deterministic reservoir operation model with the aim of maximizing power generation, and the scheduling results, comprising the inputs and optimal outputs, were used as datasets for data-driven models, namely artificial neural networks (ANNs). The time period, inflow, storage, and inflow in the last period were chosen as inputs and combined with ANN models of different structures and parameters to produce an ensemble of 10-day forecasts of power generation. The analysis of variance (ANOVA) method was applied to quantify the contribution of the uncertainty sources. The results demonstrated that the inputs were the predominant source of uncertainty in the reservoir operation, especially from May to October. In addition, the uncertainty caused by the interactions between the three sources was larger than that of the model structure or parameters in November–April, and the uncertainty contributions of the model structure and parameters were relatively marginal.
HIGHLIGHTS
A framework is proposed to quantify the uncertainty contribution from the inputs, model structures, and parameters in the medium-term reservoir operation.
The inputs are the predominant contributor to uncertainty, especially in the flood season of May–October.
In the non-flood season of November–April, the uncertainty contribution of the interactions is larger than that of any individual source.
INTRODUCTION
Reservoir operations are significant for water resources management in ecosystem protection, water supply, flood control, and power generation. Scheduling models, a tool for describing the reservoir operation process, can be used to achieve the rational regulation and redistribution of water resources between these competing objectives (Wang et al. 2010; Ticlavilca & Mckee 2011). Many factors in the process of reservoir modeling are associated with uncertainty. Factors such as inputs or model characteristics contribute significantly to the total uncertainty in reservoir operation (Ahmadi et al. 2010; Ticlavilca & Mckee 2011; Ba et al. 2019), which adversely affects scheduling decisions and water utilization benefits. The main difficulty lies in evaluating each uncertainty independently across the entire scheduling process, since uncertainties of the inputs are bundled with those of the solution produced by the operational model. Therefore, quantification of individual sources of uncertainty in reservoir operation is required. The sources of uncertainty in reservoir operation include the inputs, model structures, and model parameters. Some researchers have considered part of the uncertainty sources in reservoir operation (Ahmadi et al. 2010; Ticlavilca & Mckee 2011; Liu et al. 2014; He et al. 2018). However, few studies have considered interactions between the various uncertainty sources or addressed the overall uncertainty that results from different sources in reservoir operation. This study aims to fill that gap.
Data-driven models have been widely applied to pattern recognition and prediction in various hydrologic problems (Shrestha & Nestmann 2009; Chitsazan et al. 2015; Fayaed et al. 2015; Niu et al. 2019). Unlike conceptual models based on physical mechanisms, data-driven models can derive the unknown relationship between variables from large amounts of data. In reservoir operation, Wang et al. (2010) adopted the radial basis function (RBF) neural network with the particle swarm optimization (PSO) algorithm to simulate the scheduling process and derived reservoir operation rules. Niu et al. (2019) used four data-driven methods to derive the operation rule of hydropower reservoirs: multiple linear regression (MLR), artificial neural network (ANN), extreme learning machine (ELM), and support vector machine (SVM). Each model has merits and drawbacks, and no model can achieve 100% accurate simulation results. The ANN, the most commonly used data-driven model, was therefore selected in this paper because of its simplicity, acceptable accuracy, and ease of extension.
The inputs, used as external factors of the ANN models, include storage capacity, inflow, recession flow, and time period. Inputs of reservoir operation vary with the different targets sought (Yang et al. 2020a). Many methods such as input variable selection (CIS; Giuliani et al. 2016) and heuristic input selection (HIS; Yang et al. 2020a) have been used to identify appropriate inputs for reservoir operation. Although these methods are conducive to obtaining better scheduling results, they focus on the final optimal input combination and cannot clarify the individual and interactional effects of each input on the simulation results. Therefore, this study intends to analyze the impact and uncertainty contribution of the individual inputs and their combinations in reservoir operation, which was not considered in previous studies.
In addition to inputs, previous studies also investigated the impacts of model structure and parameters on results of the data-driven model (Renard et al. 2011; Kasiviswanathan et al. 2013; Chitsazan et al. 2015; Cheng et al. 2016; Tongal & Booij 2017). Renard et al. (2011) used the Bayesian total error analysis (BATEA) methodology to analyze the uncertainty sources contribution of rainfall, runoff, and structure in runoff predictions and found that the structure is the dominant source of uncertainty. van den Tillaart et al. (2013) quantified the effect of errors in observed discharge on model parameters and its associated performance for discharge prediction. Chitsazan et al. (2015) adopted the hierarchical Bayesian model averaging (HBMA) method to analyze the uncertainty of structure and parameters in the ANNs, and found that uncertain inputs and ANN model parameters explain the most variance of the predicted results. Zhang et al. (2020) used a decomposition scheme based on the analysis of variance (ANOVA) to evaluate the contribution of the data-driven model and input set to the uncertainty of the inflow forecast. Compared with the Bayesian methods, the ANOVA method has a simpler calculation process. This study proposed to use the ANOVA method for evaluating contributions of model structure and parameters to uncertainties in reservoir operation, especially using the data-driven model.
For these reasons, this study examines the contribution of inputs, model structures, model parameters, and their interaction to uncertainty in reservoir operation for power generation by using ANN models. In the case study, taking the Huanren Reservoir as an example, an optimization model was first established to obtain the optimal trajectories, including the input and output, for deterministic reservoir operation. Based on these datasets, we subsequently performed a ten-day (10 d) forecast of power generation using ANN models with different input combinations, different numbers of hidden neurons (model structures), and perturbed parameters. Here, we simplified the uncertainty analysis by using observed runoff data in the reservoir operation. In other words, we used 'perfect' runoff forecast products for the ten-day forecast of power generation. Finally, we applied the ANOVA model to assess and quantify the contribution of the uncertainty sources. This work has the potential to fill a gap by providing a methodology for quantifying the impact of existing uncertainty sources in reservoir operations for power generation, which provides a fundamental basis for a comprehensive analysis of uncertainty in operating the Huanren Reservoir.
Study area and data
Items | Unit | Value |
---|---|---|
Control catchment area | km² | 10,364 |
Multi-year average streamflow | m³/s | 143.6 |
Normal water level | m | 300.0 |
Normal water storage capacity | 10⁶ m³ | 2,199 |
Dead water level | m | 290.0 |
Dead water storage capacity | 10⁶ m³ | 1,380 |
Installation capacity | MW | 222 |
Guaranteed capacity | MW | 33 |
Full-load flow | m³/s | 416 |
In this paper, the observed reservoir inflow from 1980 to 2018, along with reservoir characteristics namely elevation-area-volume curve, tail water level curve, and maximum flow through the turbine, were provided by the Hydrological Bureau of Liaoning Province and Reservoir Administration. We divided the data series into two parts: calibration period with 80% dataset (1980–2010) and verification period with 20% dataset (2011–2018), with 10 d as the data time-period for the forecast of power generation.
METHODOLOGY
The framework for quantifying reservoir multi-source uncertainty
In this study, we assumed that a perfect 10-day forecast of reservoir inflow was used for optimal power generation. Our intention is to focus on uncertainty quantification under these perfect forecast conditions. In reality, natural reservoir inflow varies with time and is highly uncertain, so optimizing power generation with a data-driven model such as an ANN is a very challenging task. Our study is intended to present a methodology that can quantify the main contribution of the uncertainty sources from input data, model structure, model parameters, and their interactions. The perfect reservoir inflow forecast enables us to better focus on the effect of the ANN modeling and the inflow characteristics.
The model for deterministic reservoir operation
With the water balance and various constraint conditions, the deterministic reservoir operation model was established to evaluate the maximum power generation benefits for the Huanren Reservoir.
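As a minimal sketch of the water-balance bookkeeping that such a model rests on (not the authors' formulation), a single 10-day step could be written as below. The storage bounds, full-load flow, and installed capacity are taken from the reservoir characteristics table; the output coefficient `K`, the constant net head `HEAD`, and the function name `step` are illustrative assumptions, and a real model would derive the head from the elevation-area-volume and tail water level curves.

```python
SECONDS_PER_PERIOD = 10 * 24 * 3600  # one 10-day decision period

S_MIN, S_MAX = 1380e6, 2199e6  # m^3: dead and normal storage (from the table)
Q_MAX = 416.0                  # m^3/s: full-load turbine flow (from the table)
N_MAX = 222.0                  # MW: installed capacity (from the table)
K = 8.5                        # ASSUMED output coefficient, kW per (m^3/s * m)
HEAD = 50.0                    # ASSUMED constant net head in m (hypothetical)


def step(storage, inflow, release):
    """Advance one period; return (new_storage, turbine_flow, spill, power_mw)."""
    # Turbine flow is limited to [0, Q_MAX].
    release = min(max(release, 0.0), Q_MAX)
    # The reservoir cannot be drawn below dead storage.
    available = (storage - S_MIN) / SECONDS_PER_PERIOD + inflow
    release = min(release, max(available, 0.0))
    # Water balance: change in storage = (inflow - release) * period length.
    new_storage = storage + (inflow - release) * SECONDS_PER_PERIOD
    # Anything above normal storage is spilled and generates no power.
    spill = max(new_storage - S_MAX, 0.0) / SECONDS_PER_PERIOD
    new_storage = min(new_storage, S_MAX)
    # Power output is capped at the installed capacity.
    power_mw = min(K * release * HEAD / 1000.0, N_MAX)
    return new_storage, release, spill, power_mw
```

An optimizer would then search over the sequence of releases to maximize total generation subject to these constraints; that search is not shown here.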
We established a single-objective model instead of a multi-objective model for reservoir operation. The reason is that a multi-objective problem yields a set of Pareto-optimal solutions rather than a single solution, which compounds operating uncertainties (Yang et al. 2020b). Using a single hydropower generation objective facilitates a straightforward analysis of the uncertainties in the Huanren Reservoir.
Sources of uncertainty in reservoir operation
Combining previous research and the actual physical meaning of the inputs in the Huanren Reservoir, we chose the input variables as comprehensively as possible. Four inputs are selected for power generation decision-making in this study: the time period (t), the observed inflow over time period t, the storage at the beginning of time period t, and the observed inflow in the previous time period.
The uncertainty from the ANN model structure was represented by changing the hidden neurons or hidden layers. Most researchers suggested using a single hidden layer, with a few exceptions (Shrestha & Nestmann 2009; Tongal & Booij 2017), as it in general gave more accurate results. With a single hidden layer, the structure uncertainty contribution can be quantified by adding or removing hidden neurons. In this paper, a single hidden layer is therefore considered. Since there is no systematic or standard procedure to determine the number of hidden neurons, it was determined by trials from 1 to 9 neurons in steps of 1. In Equation (7), the parameters in the ANN model are the connection weights and the biases. In this paper, the genetic algorithm (GA) was used to obtain the optimal values of the parameters in the first stage, with a population of 100 and a stopping criterion of 1,000 generations. Some investigators choose to obtain the prior parameter distributions, so that parameter sets varying within certain limits can be obtained by sampling (Tongal & Booij 2017; Godo-Pla et al. 2019). However, the distributions of parameters are not always available in advance. Thus, we adopted the method proposed by Dhanesh & Sudheer (2010) and Kasiviswanathan et al. (2013), in which the interval of each parameter is set to a range of ±10% of its optimal value. We then used Latin Hypercube Sampling to sample from the predefined intervals 100 times, generating the perturbed ensemble of parameters.
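The structure and parameter perturbation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the sigmoid hidden activation follows the paper, but the linear output unit, the hand-rolled Latin Hypercube sampler, and the names `ann_forward` and `lhs_perturb` are hypothetical, and GA training is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ann_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Single hidden layer with sigmoid units; the linear output is an assumption."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def lhs_perturb(optimal, n_samples=100, rel=0.10, seed=0):
    """Latin Hypercube samples within +/- rel of each optimal parameter value."""
    rng = random.Random(seed)
    columns = []
    for p in optimal:
        lo, hi = p - rel * abs(p), p + rel * abs(p)
        # One draw per equal-width stratum, visited in random order.
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns.append([lo + (s + rng.random()) / n_samples * (hi - lo)
                        for s in strata])
    # Transpose so that each row is one perturbed parameter set.
    return [list(row) for row in zip(*columns)]
```

Each row returned by `lhs_perturb` would be unpacked into the weights and biases of one ensemble member, so that 100 rows give the 100 perturbed parameter sets per input combination and structure.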
This study focuses on the forecast uncertainty caused by the input data, model structure, model parameters, and their interaction under the condition of a perfect inflow forecast with a 10-day forecast period. The framework proposed in this study can also be used for reservoir power generation scheduling under different forecast periods. Based on the decision period (10 days), we made the power generation flow decisions at the initial moment of every 10 d, employing input combinations drawn from the time period t, the inflow over period t, the storage at the beginning of period t, and the inflow in the previous period, for ANN models with different hidden nodes, connection weights, and biases. The inflow over period t is assumed to be a perfect 10-day-ahead forecast of reservoir inflow. In this way, the power generation flow for each 10-day period is forecast at the initial moment of that period.
Impact assessment indicator
ANOVA method
RESULTS
Summary impact assessment indicator
This study used the GA to train the ANNs with different input and hidden neurons in the first stage to obtain the optimal parameters, then used Latin Hypercube Sampling to sample a range of ±10% from the optimal parameters. The data series was divided into two parts: calibration period of 1980–2010 with 1116 samples for training and verification period of 2011–2018 with 252 samples for validation. The NSE value of the calibration period is higher (mean 0.77) than that in the verification period (mean 0.71). Meanwhile, the NSE varied widely in the range of 0.56–0.95 in the calibration and 0.45–0.94 in the verification, which indicated that different input combinations, model structures, and parameters lead to significantly diverse NSE value distributions.
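For reference, the Nash-Sutcliffe efficiency (NSE) reported here compares a simulated series with a reference series; in this study the reference would be the optimal power generation from the deterministic model. A minimal sketch, with the function name `nse` chosen for illustration:

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of `observed`."""
    mean_obs = sum(observed) / len(observed)
    # Squared error of the simulation against the reference series.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Variance of the reference series around its own mean (unnormalized).
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread
```

Values below zero indicate that the simulation performs worse than simply using the mean of the reference series.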
The impact of the uncertainty sources
Figure 3 shows how the NSE changes across the 11 input combinations. The NSE varied widely: the minimum average NSE is 0.47, for a two-input combination that includes t, and the maximum average NSE is 0.89, for the combination of all four inputs. The best simulated power generation was therefore obtained when all four inputs were used. However, the NSE does not always increase as further inputs are added: some two-input combinations improve the model performance more than some three-input combinations that include t. The range of 0.42 (0.89 − 0.47) in average NSE across the input combinations implies that the inputs have a very significant impact on the model performance. Figure 3 (bottom) illustrates the change in NSE when one input is added to the six combinations that do not include it. The x-axis represents the added input. The blue bins denote the NSE range without the corresponding input, and the pink bins the range with it. The pink bins are always a little taller than the blue, indicating that each added input helps to improve the model performance. Specifically, adding the inflow or the storage leads to a more significant improvement in NSE, while adding t or the previous-period inflow leads to a relatively marginal improvement. For example, in three cases of adding one of the more influential inputs (two from a two-input to a three-input combination, and one from three inputs to all four), the mean NSE increased by 0.40, 0.20, and 0.34, respectively. In the corresponding three cases for one of the less influential inputs, the mean NSE increased by only 0.08, 0.07, and 0.02.
Figure 4 shows how the NSE for the validation dataset changes as the number of hidden nodes in the ANN model increases from 1 to 9. We use 'ANN' followed by the number of hidden nodes to denote the different ANN models. As shown on the x-axis, 'ANN-1' represents an ANN model with one node in the hidden layer. The average NSE across the different numbers of hidden nodes lies between 0.60 and 0.74, a range of 0.14 that is significantly smaller than the 0.42 found across the input combinations. These results indicate that the quality of reservoir scheduling solutions is more sensitive to the inputs than to the ANN model structure. The best performance was obtained at ANN-6 rather than ANN-9, indicating that an increase in hidden nodes does not necessarily improve performance. A reasonable explanation is that, beyond a certain point, adding nodes increases the computational complexity without improving the results.
Uncertain sources quantification
We found that the inputs are the predominant contributor to uncertainty, with a variance fraction of 0.58, and that their contribution is especially dominant from May to October, with a variance fraction of 0.71. The contribution of the interactions to the total uncertainty is 0.29 over the entire annual cycle and is larger from November to April, with a variance fraction of 0.43, revealing that the interactions among different uncertainty sources cannot be ignored. May–October is the flood season in the Huanren Reservoir, when inflow is relatively large and variable and the reservoir operation depends more on the inputs. The variance fraction of the inputs is therefore larger than in other periods. In contrast, in the non-flood season from November to April, inflow is relatively small and stable, and the impact of the inputs is smaller. In addition, the uncertainty contribution of the parameters in reservoir operation is more prominent than that of the structure, and the contributions of these two sources do not change significantly over time.
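The variance fractions above come from an ANOVA decomposition whose equations are not reproduced in this section. As a hedged illustration of the general idea, the sketch below splits the total sum of squares of an ensemble indexed by input combination, structure, and parameter set into three main effects and a lumped interaction term. The function name `anova_fractions` and the choice to lump all interaction (and residual) terms together are assumptions, not the authors' exact scheme.

```python
from itertools import product


def anova_fractions(y):
    """y[i][j][k]: forecast for input combination i, structure j, parameter set k."""
    ni, nj, nk = len(y), len(y[0]), len(y[0][0])
    cells = list(product(range(ni), range(nj), range(nk)))
    grand = sum(y[i][j][k] for i, j, k in cells) / len(cells)
    # Marginal means over the other two factors.
    mean_i = [sum(y[i][j][k] for j in range(nj) for k in range(nk)) / (nj * nk)
              for i in range(ni)]
    mean_j = [sum(y[i][j][k] for i in range(ni) for k in range(nk)) / (ni * nk)
              for j in range(nj)]
    mean_k = [sum(y[i][j][k] for i in range(ni) for j in range(nj)) / (ni * nj)
              for k in range(nk)]
    # Total and main-effect sums of squares.
    sst = sum((y[i][j][k] - grand) ** 2 for i, j, k in cells)
    ss_i = nj * nk * sum((m - grand) ** 2 for m in mean_i)
    ss_j = ni * nk * sum((m - grand) ** 2 for m in mean_j)
    ss_k = ni * nj * sum((m - grand) ** 2 for m in mean_k)
    # Everything not explained by a main effect is treated as interaction.
    ss_int = sst - ss_i - ss_j - ss_k
    return {"inputs": ss_i / sst, "structure": ss_j / sst,
            "parameters": ss_k / sst, "interactions": ss_int / sst}
```

Applied separately to each 10-day period (or to each season), such a function would return fractions that sum to one, which is the form in which the contributions are reported here.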
We further analyzed how the uncertainty contribution changes with each input. For the discussion, we divided the inflow into seven intervals according to the frequencies of 15, 30, 45, 60, 75, and 90%. The water level was divided into five magnitudes from the dead water level of 290.0 m to the normal water level of 300.0 m, in steps of 2.0 m. The storage capacity corresponding to each water level was obtained using the elevation-area-volume curves of the Huanren Reservoir. The classification results are shown in Table 2. Figure 6 (bottom) depicts the variance decomposition for the different inflow and storage magnitudes. With increasing reservoir inflow, the uncertainty contribution of the inputs shows a significant upward trend from 0.56 to 0.79. Across the storage magnitudes, the upward trend is even more pronounced, from 0.48 to 0.80. These results are consistent with the analysis in Figure 3, i.e., the inflow and storage inputs have a more significant impact on reservoir operation results. In addition, the uncertainty contribution of the interaction is larger when the inflow and storage are at a low magnitude, and it decreases as the magnitude increases. The results demonstrate that the interaction plays a greater role when the reservoir water volume is low, which is consistent with the conclusion that the interactions are important in the non-flood season.
Variable | Magnitude 1 | Magnitude 2 | Magnitude 3 | Magnitude 4 | Magnitude 5 | Magnitude 6 | Magnitude 7 |
---|---|---|---|---|---|---|---|
Storage (million m³) | 1,380.3–1,521.4 | 1,521.4–1,671.7 | 1,671.7–1,832.7 | 1,832.7–2,008.4 | 2,008.4–2,199.6 | – | – |
Inflow (m³/s) | <10 | 10–25 | 25–45 | 45–80 | 80–140 | 140–280 | >280 |
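The classification in the table can be applied with a simple threshold lookup. The sketch below uses the interval edges listed in the table; the function names and the convention that a value exactly on an edge falls into the higher magnitude are assumptions.

```python
from bisect import bisect_right

# Upper edges of each magnitude except the last, taken from the table.
INFLOW_EDGES = [10.0, 25.0, 45.0, 80.0, 140.0, 280.0]  # m^3/s
STORAGE_EDGES = [1521.4, 1671.7, 1832.7, 2008.4]       # million m^3


def inflow_magnitude(inflow):
    """Return the inflow magnitude class, from 1 to 7."""
    return bisect_right(INFLOW_EDGES, inflow) + 1


def storage_magnitude(storage):
    """Return the storage magnitude class, from 1 to 5."""
    return bisect_right(STORAGE_EDGES, storage) + 1
```

Grouping the ensemble forecasts by these classes before the variance decomposition gives one set of variance fractions per magnitude.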
DISCUSSION
The framework for quantifying reservoir multi-source uncertainty
The results of reservoir operation are affected by a variety of factors (i.e., sources of uncertainty). The absence of multi-source uncertainty information in reservoir operation may produce misleading information and increase the decision-making risk in practical applications. Research on multi-source uncertainty quantification in reservoir operation is therefore necessary, and it provides the basis for further work on ways to reduce the predominant uncertainty source. This study has developed a framework for quantifying the contribution of multi-source uncertainties, rather than only describing uncertainty qualitatively through statistical indices as in previous studies. The framework combines a more complete set of uncertainty sources (inputs, model structures, model parameters, and their interaction) in reservoir operation when using ANN models. In addition, this paper quantifies uncertainty in reservoir scheduling using the ANOVA method, which is simple, straightforward, and practical. The results for the Huanren Reservoir indicate that the framework can effectively quantify the contribution of uncertainty sources in reservoir operation using ANOVA.
Practical applications
This study offers two lessons for practical application. (1) It identifies the best inputs and a reliable data-driven model with less uncertainty when using ANN models in the Huanren Reservoir operation. By comparing the simulated power generation from the proposed ANN models with the optimal power generation in terms of the NSE, a reliable ANN forecasting model with a higher NSE was obtained. The results indicated that, in the Huanren Reservoir, using all four inputs (the time period, the current inflow, the initial storage, and the previous-period inflow) with an ANN model of six hidden nodes may give relatively optimal scheduling results. (2) It provides the basis for further research on ways to reduce the predominant uncertainty source in different seasons. For example, the inputs are the foremost uncertainty source in the Huanren Reservoir during the flood season from May to October, and the interactions are the foremost source during the non-flood season from November to April. This implies that more attention should be paid to optimizing the selection of input variables in the flood season, while the combined effects of the inputs, model structures, and model parameters should be carefully considered in the non-flood season. In addition, the robustness of the quantification results must be further tested in other reservoirs, where they may be affected by the inflow, reservoir characteristics, and scheduling target. The framework proposed in this study for evaluating the contribution of multi-source uncertainty in reservoir operation may be extended to other single-purpose reservoirs.
Limitations and future developments
There are still some limitations and challenges where the framework could be further developed, and where the results of the uncertainty quantification in reservoir operation may be further improved. This paper used the sigmoid function as the activation function in the ANN models and the GA as the optimization algorithm. These choices are one way to obtain better simulation results, but whether they affect the uncertainty quantification results requires further research. In addition, the inflow used as an input comes from actual observations. These observed data were treated as exact values and used directly in this paper. However, some researchers have reported that measurement errors exist in observations due to accidental factors (Bhowmik et al. 2020). Therefore, the uncertainty from measurement errors in reservoir operation may need to be considered in future studies.
In addition, this study considered the perfect 10-day inflow conditions to quantify the main contribution of the involved uncertainty sources, which enables us to better focus on the effect of the ANN modeling and the inflow characteristics. Future studies could also be dedicated to determining the uncertainty from forecast precipitation and runoff forecast models and their impact on reservoir operation, when using the forecast precipitation information in reservoir scheduling.
CONCLUSION
We presented a framework using the ANOVA method to quantify the uncertainty contribution from the inputs, model structures, model parameters, and their interaction in reservoir operation. This study may help fill the gap in assessing the impact of the uncertainty sources and their contribution to the total uncertainty in reservoir operation. Taking the Huanren Reservoir as an example, we investigated the uncertainty contribution in the reservoir operation based on 11 input combinations, 9 ANN model structures, and 100 ANN parameter sets for the validation period of 2011–2018. The main conclusions are summarized below.
1. The predominant contributor to uncertainty in reservoir operation using the ANN model is the input during the flood season from May to October. This result indicates that inputs and their selection are critical in the ANN model, and more attention should be paid to the selection of input variables in the flood season.
2. During the non-flood season from November to April, the interactions between the inputs, model structure, and model parameters are the foremost uncertainty source. This reveals that the combined effects of the interactions should be carefully considered during the non-flood season in ANN modeling for reservoir operation.
3. The contribution of model structure or parameter uncertainty is smaller than that of the inputs or interactions. The uncertainty contribution of the parameters in reservoir operation is more prominent than that of the structure, and the uncertainty contributions of these two do not change significantly over the year.
4. The inflow and storage inputs contribute more to uncertainty than the time period and the last-period inflow, which implies that further research on reservoir operation should focus more on the inflow and storage.
ACKNOWLEDGEMENTS
This study was supported by the Second Tibetan Plateau Scientific Expedition and Research (STEP) program, Grant No. 2019QZKK0203.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.