ABSTRACT
As climate change intensifies and human activities escalate, hydrological data are becoming nonstationary. Current nonstationary flood design methods have limited practicality in engineering applications because of high uncertainty, a lack of causal mechanisms, or complex model structures. This study focuses on the upper Yellow River region in China, which houses cascade hydropower stations, and introduces the Mechanism-based Reconstruction (Me-RS) method to solve nonstationary flood design problems. The method is evaluated against the traditional stationary method, the time series decomposition synthesis (TS-DS) method, and generalized additive models for location, scale, and shape (GAMLSS). The design flood values calculated by the Me-RS method are markedly lower than those of the traditional stationary method, with the 100-year design flood estimates being 2435.53 m³/s for Me-RS, 2447.53 m³/s for GAMLSS, 3522.52 m³/s for TS-DS, and 4753.76 m³/s for the traditional stationary method. Moreover, Bootstrap uncertainty analysis demonstrates that Me-RS and GAMLSS, which consider physical factors, reduce uncertainty by about 40% compared to TS-DS. These results indicate that the Me-RS method provides a more accurate and less complicated alternative for calculating nonstationary design floods. This study advances the practical application of nonstationary flood frequency analysis methods in engineering hydrology.
HIGHLIGHTS
This study evaluates the performance of the Me-RS, GAMLSS, and TS-DS methods in nonstationary flood frequency analysis.
The Me-RS method provides a more accurate and less complicated alternative for nonstationary flood frequency analysis.
Physical cause mechanisms enhance the explanatory capacity for nonstationary flood processes and reduce the uncertainty in flood frequency analysis.
INTRODUCTION
As one of the most destructive natural disasters, floods cause unprecedented damage to the natural environment, socio-economy, and lives within the river basin (Wasko et al. 2021). In the field of hydrology, traditional stationary flood frequency analysis is the most widely used technique for characterizing flood risk and assessing engineering safety, which is of great significance for flood control management and water resource utilization in river basins (Barbhuiya et al. 2023). Traditional stationary flood frequency analysis requires that the distribution function and statistical parameters of flood series remain consistent over past, present, and future periods, meaning that the flood samples must satisfy the stationarity condition (Hesarkazzazi et al. 2020). However, under the combined influence of human activities and climate change, the flood conditions in most river basins have undergone nonstationary changes, meaning that the flood distribution has altered over time, which conflicts with traditional stationary frequency analysis that requires flood series to meet stationarity conditions (Serinaldi & Kilsby 2015). Therefore, many scholars have abandoned the equilibrium or stationary paradigm and innovatively developed a series of flood frequency analysis methods under nonstationary conditions (Xiong et al. 2024). Compared to traditional stationary flood frequency analysis, this study employs the mechanism-based reconstruction (Me-RS) method, focusing more attention on the reliability of nonstationary flood frequency analysis methods and their practical application in engineering hydrology, aiming to more effectively address potential flood disasters and thus reduce flood risk.
In general, flood frequency analysis methods under nonstationary conditions follow two main approaches. The first reconstructs the series, using the concept of reduction/recursion to eliminate the nonstationary characteristics of the hydrologic series, and then applies the traditional stationary frequency analysis method to the reconstructed stationary series to derive design values; an example is the time series decomposition synthesis (TS-DS) method (Xie et al. 2005). TS-DS decomposes hydrological series into nonstationary deterministic components and stationary random components based on statistical analysis. After calculating the random component using traditional stationary frequency analysis methods and estimating the deterministic component for the design period, combining these two parts allows for the estimation of design floods in historical, current, and future periods. Owing to its simple and clear theoretical framework, TS-DS has found widespread application in research on various nonstationary conditions such as flood design, navigable water level, and drought (Li et al. 2020; Wang et al. 2020). The second approach comprises hydrologic frequency analysis methods that are applied directly to nonstationary hydrologic series, such as the time-varying moment (TVM) method, which assumes that a nonstationary flood series is influenced by covariates, leading to distribution parameters that vary with time or other covariates (Strupczewski & Kaczmarek 2001). In particular, the generalized additive models for location, scale, and shape (GAMLSS), which are commonly used within the TVM framework to evaluate nonstationary flood series, can simulate changes in probability distribution parameters under different covariate conditions and thereby capture the nonstationary evolutionary trends within hydrological series (Rigby & Stasinopoulos 2005; Xiong et al. 2024). Chen et al. (2021) conducted an in-depth analysis of 158 hydrological measurement stations in the UK and found that the GAMLSS can always achieve the best simulation of nonstationary hydrological series by employing specific covariates. Owing to its superior versatility and flexibility, the GAMLSS has significantly expanded the options for model construction and is becoming the most popular model for conducting nonstationary frequency analysis (Zhou et al. 2022).
Despite extensive research on nonstationary flood frequency analysis, a controversy-free standard method has not yet been established, and there is a lack of widely accepted government guidelines to update existing nonstationary flood frequency analysis methods (Serago & Vogel 2018). This is because these nonstationary frequency analysis methods typically involve complex mathematical models and advanced statistical techniques, and their high computational requirements make these methods difficult to apply directly to practical engineering (Barbhuiya et al. 2023). In fact, facing variable research conditions, researchers must have a deep understanding of the complex theoretical frameworks behind these mathematical models to appropriately select and validate their performance, which is undoubtedly a formidable challenge for them (Singh & Chinnasamy 2021). Additionally, Debele et al. (2017) specifically point out that the current nonstationary hydrological frequency analysis methods primarily focus on the construction of mathematical formulas, attempting to describe the nonstationary changes from the statistical characteristics of the series distribution, which equates statistical significance with practical importance. Ignoring the typical hydrological inference process will lead to an insufficient ability to explain complex hydrological changes. Therefore, the complex model structure and the lack of hydrological inference process faced by the current nonstationary frequency analysis methods constitute the main obstacles in their practical application within engineering hydrology.
An innovative Me-RS method that focuses on the causal mechanisms of hydrological changes and has a simpler theoretical structure deserves more widespread attention. The Me-RS method fills the gap left by past nonstationary hydrological frequency analysis methodologies, which concentrate only on the distribution function or statistical characteristics of the hydrological series itself. The Me-RS method holds that the nonstationary change of hydrological variables inherently involves a causal mechanism between ‘cause’ and ‘effect’ (Qin & Li 2021). Based on the objective physical causal mechanism between hydrological variables and nonstationary influencing factors, the Me-RS method constructs a mechanism function. This mechanism function describes the intrinsic impact of nonstationary influencing factors on hydrological variables, and this impact remains constant over time. Through the mechanism function, the nonstationary impact of influencing factors on the hydrological series can be effectively removed, reconstructing the hydrological series from nonstationary to stationary, after which the well-established traditional stationary method can be used for frequency analysis. The Me-RS method has so far been applied at multiple hydrological stations in the Wei River Basin and the northern Shanxi region of China. These applications have demonstrated that the Me-RS method not only eliminates the nonstationary changes of the second moment of the series but also significantly reduces the uncertainty of the prediction (Li et al. 2021; Li & Qin 2022). The theoretical correctness and application reliability of the Me-RS method have thus been validated, providing an effective technique for frequency analysis under nonstationary conditions.
This study is dedicated to advancing the development of a nonstationary flood frequency analysis technique for the upper reaches of the Yellow River. By refining the Me-RS method to feature a more streamlined model structure and reduce uncertainty, we aim to enhance the practical utility of nonstationary approaches in engineering hydrology. The research primarily utilizes the Me-RS, which takes into account the causal mechanism between flood series and influencing factors, to estimate the design flood under nonstationary conditions. Additionally, a comparative analysis is performed between the Me-RS and other methods, namely the TS-DS and GAMLSS.
Flowchart for flood frequency analysis under nonstationary conditions.
STUDY AREA AND DATA
Study area
The location and runoff information of the Guide Basin. (a) Location of the hydrological station, topography, river networks, and reservoirs in the Guide Basin. (b) Average annual runoff and average runoff from July to September in the Guide Basin.
The upper reaches of the Yellow River, an important hydropower hub in China, have developed a cascade reservoir pattern over the past half-century. As shown in Figure 2(a), there are three reservoirs within the study area, namely Longyangxia, Nina, and Laxiwa. Among them, Longyangxia Reservoir, the closest reservoir to the source area of the Yellow River, was built in 1986. With multi-year regulation capacity, its total storage capacity of 274.4 × 10⁸ m³ controls 65% of the runoff of the Yellow River. The Nina and Laxiwa reservoirs were completed in 2004 and 2010, respectively. Because they have only daily regulation capacity, with total storage capacities of just 0.262 × 10⁸ and 10.79 × 10⁸ m³, respectively, Nina and Laxiwa have a weak ability to regulate runoff.
Figure 2(b) illustrates the decreasing trend in the annual runoff of the Guide Basin following the construction of the Longyangxia Reservoir, with a decline from 226.25 × 10⁸ m³ in the period 1960–1986 to 191.25 × 10⁸ m³ in the period 1987–2020. Liu et al. (2021) observed a comparable reduction in the annual runoff of the upper Yellow River. Notably, the average runoff during the flood season from July to September decreased significantly, by 50.21 × 10⁸ m³, and its proportion of the total annual runoff declined from 47.86% during 1960–1986 to 30.37% during 1987–2020. This indicates that during the flood season, Longyangxia maintains a high flood retention capacity to address potential flood disasters, and during the dry season, it stores sufficient water to meet the water requirements of the ecological environment and to keep hydropower generation stable. This operation alters the intra-annual distribution pattern of runoff in the Guide Basin.
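As a rough consistency check on the figures quoted above, the flood-season volumes implied by the stated proportions can be worked out directly. The symbol W is introduced here only for illustration, and because the inputs are the rounded values from the text, the last digit differs slightly from the reported 50.21 × 10⁸ m³.

```latex
\begin{aligned}
W_{1960\text{--}1986} &\approx 0.4786 \times 226.25 \times 10^{8}\ \mathrm{m^3} \approx 108.28 \times 10^{8}\ \mathrm{m^3},\\
W_{1987\text{--}2020} &\approx 0.3037 \times 191.25 \times 10^{8}\ \mathrm{m^3} \approx 58.08 \times 10^{8}\ \mathrm{m^3},\\
\Delta W &\approx (108.28 - 58.08) \times 10^{8}\ \mathrm{m^3} = 50.20 \times 10^{8}\ \mathrm{m^3}.
\end{aligned}
```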
Data
The Guide Hydrological Station, an important control station in the upper reaches of the Yellow River, provides complete and reliable measured data. The study utilizes daily runoff observations from 1960 to 2020 at the Guide Hydrological Station, extracted from the ‘Hydrologic Data Yearbook’ published by the Yellow River Conservancy Commission (YRCC).
The study also utilizes daily observed precipitation data from the National Climate Center of China Meteorological Administration (https://cdc.cma.gov.cn/) to simulate the influence of rainfall on flood peak changes. By using the Thiessen polygon, the meteorological station data within the Guide Basin are aggregated to the basin scale, thereby establishing a rainfall series for the Guide Basin from 1960 to 2020. In addition, based on the GCM output data provided by the CMIP6 website (https://pcmdi.llnl.gov/CMIP6/) under various emission scenarios, a statistical downscaling model was established to generate future precipitation series.
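The Thiessen polygon aggregation mentioned above amounts to an area-weighted average of station rainfall. The following is a minimal sketch of that calculation; the station names, rainfall values, and polygon areas are hypothetical and are not taken from the study.

```python
def thiessen_average(station_rain, polygon_area):
    """Area-weighted basin rainfall.

    station_rain: dict mapping station name -> rainfall (mm)
    polygon_area: dict mapping station name -> area of that station's
                  Thiessen polygon inside the basin (km2)
    """
    total_area = sum(polygon_area[name] for name in station_rain)
    weighted_sum = sum(station_rain[name] * polygon_area[name] for name in station_rain)
    return weighted_sum / total_area


# Hypothetical stations, rainfall depths, and polygon areas (illustration only).
rain = {"A": 4.0, "B": 10.0, "C": 1.0}
area = {"A": 5000.0, "B": 3000.0, "C": 2000.0}
basin_rain = thiessen_average(rain, area)
```

Applying the function day by day to the station records would give a basin-scale daily rainfall series.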
The hydrological data used in this study are all sourced from official channels. These data are of good quality and are widely accepted in the field of hydrology research. Therefore, no additional data quality control measures were implemented in the study.
METHODOLOGY
In this section, we first briefly introduce the traditional stationary frequency analysis method (Section 3.1). Second, the nonstationary flood frequency analysis methods are described, including Me-RS (Section 3.2), TS-DS (Section 3.3), and GAMLSS (Section 3.4). Third, the influencing factors that affect the nonstationary change of flood series are introduced (Section 3.5). Finally, a non-parametric Bootstrap method for calculating uncertainty is briefly described (Section 3.6).
Traditional stationary frequency analysis method
The traditional stationary frequency analysis method involves fitting probability distributions to historical extreme flow data to estimate the frequency of extreme events and assess their risks. This process includes two key steps: selecting the appropriate frequency distribution model and accurately determining its parameters. The World Meteorological Organization (WMO) has proposed 16 probability distributions commonly used in hydrology, which can be roughly categorized into four families: the normal family, the generalized extreme value (GEV) family, the Pearson type III family, and the generalized Pareto distribution (Mizuki & Kuzuha 2023). Many countries have established standard specifications for using a single distribution based on the characteristics of their own flood series. However, the consequence of adopting a single standard distribution is a ‘one-size-fits-nobody’ scenario, which may result in poor accuracy of flood prediction in specific river basins (Kidson & Richards 2005). The results of hydrological frequency analysis are significantly influenced by the choice of distribution, and the best way to quantify this influence is to compare multiple distributions. In this study, we selected two-parameter and three-parameter probability distributions as candidates (i.e., Lognormal, Gamma, Gumbel, Weibull, Logistic, and GEV) and used the L-moment method for parameter estimation (Table 1). These distributions are recommended for flood frequency analysis in various countries. Among them, the two-parameter distributions, such as Lognormal, Gumbel, Gamma, and Logistic, can be fitted analytically and are straightforward to apply. Their location and scale parameters represent the mean and variance of the sample population, respectively. The three-parameter GEV and Weibull distributions cannot be fitted analytically, but they offer greater flexibility to accommodate various datasets, with their location, scale, and shape parameters representing the mean, variance, and skewness of the sample population, respectively. Additionally, the Kolmogorov–Smirnov (K–S) test, root mean square error (RMSE), and Nash–Sutcliffe efficiency coefficient (NSEQQ) are employed as goodness-of-fit measures to identify the optimal frequency distribution and distribution parameters. Under stationary conditions, traditional hydrological frequency analysis methods have a mature computational framework, and detailed information can be found in various studies (Langat et al. 2019; Ul Hassan et al. 2019).
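To illustrate the L-moment estimation step, the sketch below fits a two-parameter Gumbel distribution from the first two sample L-moments and evaluates a T-year quantile. It uses the standard relations for the Gumbel distribution (scale = λ₂ / ln 2, location = λ₁ − γ·scale, with γ the Euler constant); the function names and the flood peak values are hypothetical, and the study's own software and distribution parameterizations may differ.

```python
import math

EULER_GAMMA = 0.5772156649015329


def sample_l_moments(data):
    """Return the first two sample L-moments (l1, l2) from unbiased PWMs."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    # b1 = (1/n) * sum over ranks of (rank - 1)/(n - 1) * x_(rank)
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def fit_gumbel_lmom(data):
    """Gumbel location and scale from the first two L-moments."""
    l1, l2 = sample_l_moments(data)
    scale = l2 / math.log(2.0)
    loc = l1 - EULER_GAMMA * scale
    return loc, scale


def gumbel_quantile(loc, scale, return_period):
    """Design value with non-exceedance probability 1 - 1/T."""
    p = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(p))


# Hypothetical annual flood peaks in m3/s (illustration only).
peaks = [2100.0, 1850.0, 2600.0, 1720.0, 3100.0, 1980.0, 2250.0, 1640.0, 2890.0, 2050.0]
loc, scale = fit_gumbel_lmom(peaks)
q100 = gumbel_quantile(loc, scale, 100)
```

The same pattern extends to the other candidate distributions by swapping in their L-moment relations and quantile functions.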
Summary of the two-parameter and three-parameter distributions for analyzing flood frequency in the study
Distributions . | Probability density function (PDF) . |
---|---|
Lognormal | ![]() ![]() |
Gamma | ![]() ![]() |
Gumbel | ![]() ![]() |
Logistic | ![]() ![]() |
Weibull | ![]() ![]() |
GEV | ![]() ![]() |
Note: μ is the location parameter, σ is the scale parameter, and ε is the shape parameter.
Me-RS method


The operational process of the Me-RS method is detailed in Figure 1. The Me-RS method describes the nonstationary changes of hydrological variables through the mechanism function alone, rather than through the distribution function or statistical characteristics of the hydrological variables. By using the multiplicative model and the mechanism function, the nonstationary changes in the first and second moments of the series can be effectively removed, resulting in a stationary reconstructed series with a length equal to that of the sample. The widely accepted traditional stationary method can then be applied to the reconstructed series for frequency analysis of past, present, and future periods, which gives the method promising application prospects (Li et al. 2021; Li & Qin 2022).
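The reconstruction idea can be sketched in a few lines. This is an illustrative sketch only and not the authors' formulation: it assumes a multiplicative model Q = RS · exp(b·x), where x is the influencing factor (for example the reservoir index) and the exponential mechanism factor is fitted by least squares on ln Q. All function names and numbers below are hypothetical.

```python
import math


def fit_exponential_mechanism(x, q):
    """Least-squares slope b of ln(q) = c + b * x (assumed mechanism form)."""
    n = len(x)
    ln_q = [math.log(v) for v in q]
    mean_x = sum(x) / n
    mean_y = sum(ln_q) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ln_q))
    return sxy / sxx


def reconstruct(x, q, b):
    """Remove the factor's influence: RS_t = Q_t / exp(b * x_t)."""
    return [qi / math.exp(b * xi) for xi, qi in zip(x, q)]


def design_flood(rs_quantile, b, x_design):
    """Put the influence back for the design condition x_design."""
    return rs_quantile * math.exp(b * x_design)


# Hypothetical factor values and flood peaks (illustration only).
factor = [0.0, 0.0, 0.5, 1.0]
peaks = [2000.0 * math.exp(-0.8 * xi) for xi in factor]
b_hat = fit_exponential_mechanism(factor, peaks)
rs_series = reconstruct(factor, peaks, b_hat)
```

Under this assumed form, a quantile estimated from the stationary reconstructed series is converted back to a design flood by multiplying with the mechanism factor evaluated at the current or future value of x.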
TS-DS method
Xie et al. (2005) proposed that hydrological series comprise deterministic and random components. Deterministic components, influenced by human activities and climate change, tend to be nonstationary, with potential abrupt shifts over short periods. In contrast, the random component remains relatively stationary over shorter engineering time scales because it is influenced by factors such as geology, which change only over long geological time periods. Using the TS-DS method, the deterministic components are separated from the nonstationary hydrological series to yield a stationary random series. After conducting traditional frequency analysis on the random series and estimating the deterministic components for the design period, recombining these components yields the design values of the hydrological series under nonstationary conditions. The TS-DS method identifies the deterministic components and establishes a fitting function. Based on this fitting function, the trend of the deterministic components can be predicted, which enables both a review of the frequency of past hydrological series and the prediction of the frequency of future hydrological series. With its simple and clear theoretical framework, the TS-DS method has been applied to nonstationary frequency analysis in various fields (Xie et al. 2005; Li et al. 2020; Wang et al. 2020).
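The decomposition and synthesis steps can be illustrated with a deliberately simplified sketch. It assumes an additive model with a single abrupt-change (mean shift) component at a known change point; the periodic component and the change-point detection used in practice are omitted, and the function names and values are hypothetical.

```python
def step_component(series, change_index):
    """Abrupt-change component: zero before the change point, mean shift after."""
    before = series[:change_index]
    after = series[change_index:]
    shift = sum(after) / len(after) - sum(before) / len(before)
    return [0.0] * len(before) + [shift] * len(after)


def remove_component(series, component):
    """Subtract a deterministic component to obtain the residual series."""
    return [s - c for s, c in zip(series, component)]


def synthesize_design_value(random_quantile, deterministic_value):
    """Recombine the random-series quantile with the design-year deterministic part."""
    return random_quantile + deterministic_value


# Hypothetical series with a downward shift after the third value.
series = [10.0, 12.0, 11.0, 5.0, 7.0, 6.0]
jump = step_component(series, 3)
random_part = remove_component(series, jump)
```

A design value for a given year is then the quantile of the random series plus the deterministic component evaluated for that year.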
GAMLSS model
Under nonstationary conditions, the GAMLSS models start from the perspective of the probability distribution followed by the flood series, describing the linear or nonlinear relationships between the statistical parameters of the response variables and covariates under various distribution assumptions (Rigby & Stasinopoulos 2005; López & Francés 2013; Yan et al. 2017). Describing the distribution characteristics of hydrological series through covariates better fits the nonstationary changes of hydrological series, thereby enhancing the rigor of flood analysis (Katz et al. 2002). The GAMLSS model provides researchers with more than 90 distribution options, including many with high skewness and kurtosis, allowing for the description of the variation characteristics of hydrological series through location parameters, scale parameters, and shape parameters (Debele et al. 2017). The great flexibility and generality of the GAMLSS model have made it the most widely applied method for nonstationary hydrological frequency analysis (Xiong et al. 2024).
To ensure fitting performance, the GAMLSS not only quantitatively evaluates predictive accuracy and goodness of fit through the Akaike Information Criterion (AIC) and the Schwarz Bayesian Criterion (SBC) but also qualitatively assesses the fitting quality through visual inspection of residual diagnostic plots (i.e., worm plots). All calculations were performed using the R platform and the freely available GAMLSS package.
Under nonstationary conditions, the return period associated with the design value changes every year, and the one-to-one correspondence between them no longer exists, making it difficult to apply in practice. To address this issue, Liang et al. (2016) proposed the concept of ‘equal reliability (ER)’. ER bridges the gap between stationary and nonstationary design criteria through its good adaptability, enabling the calculation of design floods under nonstationary conditions. For a complete description of ER, please refer to the studies by Liang et al. (2016) and Yan et al. (2017).
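As commonly stated, the ER idea requires that the reliability over the design life under nonstationary conditions, the product of the yearly non-exceedance probabilities F_t(z), equals the stationary reliability (1 − 1/m)^n for an m-year return period and an n-year design life. The sketch below solves this condition for z by bisection. It assumes, purely for illustration, a Gumbel distribution whose location and scale are given year by year; the parameter values are hypothetical and the return period is assumed to be at least 2 years.

```python
import math


def gumbel_cdf(z, loc, scale):
    """Non-exceedance probability of a Gumbel distribution."""
    return math.exp(-math.exp(-(z - loc) / scale))


def er_design_value(locs, scales, return_period, tol=1e-6):
    """Design value z with prod_t F_t(z) = (1 - 1/m)^n, found by bisection."""
    n_years = len(locs)
    target = (1.0 - 1.0 / return_period) ** n_years

    def reliability(z):
        r = 1.0
        for loc, scale in zip(locs, scales):
            r *= gumbel_cdf(z, loc, scale)
        return r

    # Bracket: reliability is below target at min(locs) and essentially 1 at hi.
    lo = min(locs)
    hi = max(locs) + 50.0 * max(scales)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reliability(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical 30-year design life with a declining location parameter.
life = 30
declining = er_design_value([1000.0 - 10.0 * t for t in range(life)], [300.0] * life, 100)
constant = er_design_value([1000.0] * life, [300.0] * life, 100)
```

When the parameters do not change over the design life, the ER design value reduces to the ordinary stationary m-year quantile, which is a convenient sanity check.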
Selection of influencing factors: human activities and rainfall
Extreme rainfall events and human intervention are the key factors leading to nonstationary changes in floods. Among these, reservoir operation, as one of the most influential human activities, has changed the hydrological conditions of more than half of the rivers in the world (Knott et al. 2024). To fully explain the nonstationary flood changes, this study focuses on reservoirs and rainfall as the two physical factors and explores their impact on design floods within river basins.
Utilizable capacity reservoir index




It should be noted that, in order to accurately reflect the nonstationary impact of Longyangxia's operation on runoff, this study does not make any assumptions about the operational rules of Longyangxia and instead uses the actual UCRI values based on the real operational rules. In nearly 40 years of operation, Longyangxia has strictly adhered to its design standards, and its substantial reservoir capacity has ensured multi-year regulation and stable continuous operation. Therefore, the study assumes that the current UCRI adequately reflects Longyangxia's operational pattern for the coming period, and the average UCRI value of the past 5 years is used as a reference for the current and future stages of reservoir operation.
Rainfall factor
Due to the predictability of rainfall, the study will employ both historical measured rainfall series and future predicted rainfall series to explain the nonstationary changes in the flood peak.
(1) Average rainfall 30 days before flood peak
Given the time lag between rainfall events and the occurrence of flood, this study extends the flood-causing rainfall process to 30 days before the flood peak (Jin et al. 2024) using the ‘average rainfall 30 days before the flood peak’ as the rainfall factor P.
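The rainfall factor P can be computed directly from a daily basin rainfall series once the date of the flood peak is known. The sketch below is a minimal illustration; whether the peak day itself belongs to the window is not stated in the text, so excluding it is an assumption, and the function name and data are hypothetical.

```python
def antecedent_mean_rainfall(daily_rain, peak_index, window=30):
    """Mean rainfall over the `window` days before the peak day.

    The peak day itself is excluded (an assumption). If fewer than
    `window` days are available, the available days are used.
    """
    start = max(0, peak_index - window)
    days = daily_rain[start:peak_index]
    if not days:
        raise ValueError("no rainfall record before the flood peak")
    return sum(days) / len(days)


# Hypothetical daily rainfall series (mm) with the flood peak on day index 35.
daily = [float(d) for d in range(40)]
p_factor = antecedent_mean_rainfall(daily, 35)
```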
(2) Predictions of future precipitation
In recent years, general circulation models (GCMs) have been favored by researchers in predicting future climate change and have become the most important and commonly used method in this field (Yan et al. 2017). In this paper, five GCMs (BCC-CSM2-MR, MRI-ESM2-0, CanESM5, FGOALS-g3, and MIROC6) under four shared socioeconomic pathways (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5) from the latest CMIP6 were used. The source, resolution, and other related information about these GCMs are summarized in Supplementary Material S1. Linear interpolation and the equidistant cumulative distribution function method were used to downscale the model data and correct bias, and the data were unified to a spatial resolution of 0.25° × 0.25°.
To mitigate uncertainty from spatial–temporal disparities across scenarios, the study first calculates the correlation coefficient (R), RMSE, Taylor skill score (Ts) (Taylor 2001), and spatial–temporal skill score (S) (Pierce et al. 2009) based on historical observation data. Secondly, the prediction capabilities of five GCMs are comprehensively ranked and weighted using the entropy weight Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method (Behzadian et al. 2012). Finally, the future precipitation of the river basin is estimated using the weighted average method.
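The ranking and weighting procedure can be sketched as follows. This is a simplified illustration of entropy-weight TOPSIS, not the study's implementation: it uses min-max normalization for both the entropy and the TOPSIS steps, treats RMSE-like criteria as cost criteria, and all scores and precipitation values are hypothetical.

```python
import math


def normalize(matrix, benefit):
    """Min-max normalize each criterion so that 1 is best and 0 is worst."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        col = [row[j] for row in matrix]
        lo, hi = min(col), max(col)
        for i in range(n_rows):
            if hi == lo:
                out[i][j] = 1.0  # criterion does not discriminate
            elif benefit[j]:
                out[i][j] = (col[i] - lo) / (hi - lo)
            else:
                out[i][j] = (hi - col[i]) / (hi - lo)
    return out


def entropy_weights(norm):
    """Criterion weights from information entropy (0 * ln 0 taken as 0)."""
    n_rows = len(norm)
    divergence = []
    for col in zip(*norm):
        total = sum(col)
        entropy = -sum((v / total) * math.log(v / total) for v in col if v > 0)
        divergence.append(1.0 - entropy / math.log(n_rows))
    total_div = sum(divergence)  # assumes at least one discriminating criterion
    return [d / total_div for d in divergence]


def topsis_closeness(norm, weights):
    """Relative closeness of each alternative to the ideal solution."""
    weighted = [[w * v for w, v in zip(weights, row)] for row in norm]
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_worst = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical scores for three models on two criteria: R (benefit), RMSE (cost).
scores = [[0.9, 1.0], [0.5, 2.0], [0.1, 3.0]]
norm = normalize(scores, benefit=[True, False])
weights = entropy_weights(norm)
closeness = topsis_closeness(norm, weights)
model_weights = [c / sum(closeness) for c in closeness]
# Weighted-average precipitation from hypothetical model outputs (mm).
ensemble = sum(w * p for w, p in zip(model_weights, [100.0, 130.0, 160.0]))
```

The closeness scores give the ranking of the models, and rescaling them to sum to one gives the weights for the weighted-average precipitation.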
It is important to note that the timing of future floods is difficult to predict. Therefore, for the rainfall factor in the future period, the study uses the ‘average rainfall in the flood season (1 July to 30 September)’ as the influencing factor.
Estimating confidence intervals using the Bootstrap method
Comparing the uncertainty of nonstationary design flood results obtained from different methods is a fairer means to assess the performance of these methods. Regarding uncertainty under nonstationary conditions, Obeysekera & Salas (2014) comprehensively reviewed the Delta, Bootstrap, and Profile likelihood methods. Among them, the Bootstrap method exhibits the highest computational efficiency and yields approximately symmetrical confidence intervals, with its high precision and simplicity being particularly beneficial for applications in nonstationary conditions. Serinaldi & Kilsby (2015) suggest using the Bootstrap method to quantify uncertainty, which strictly relies on the existing available information without making any asymptotic assumptions, and it does not depend on specific parameter estimation methods. Currently, the Bootstrap method proposed by Efron (1979) has been widely applied in research on quantifying uncertainty. This study will utilize the Bootstrap method to perform 5,000 replicate samplings on the sample, thereby establishing a 95% confidence interval for nonstationary design floods at a significance level of α = 0.05. This approach effectively quantifies the uncertainty of the various methods. Detailed information on the use of the Bootstrap method under nonstationary conditions can be found in numerous studies (Obeysekera & Salas 2014; Yan et al. 2017).
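A non-parametric Bootstrap interval for a design flood can be sketched as below: the sample is resampled with replacement, the design value is re-estimated on each resample, and the 2.5% and 97.5% percentiles of the estimates form the 95% interval. The Gumbel L-moment estimator, the function names, and the flood peaks are assumptions made for illustration; the study applies the procedure to each of its own methods.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def gumbel_quantile_lmom(sample, return_period):
    """T-year quantile of a Gumbel distribution fitted by L-moments."""
    x = sorted(sample)
    n = len(x)
    l1 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    l2 = 2.0 * b1 - l1
    scale = l2 / math.log(2.0)
    loc = l1 - EULER_GAMMA * scale
    return loc - scale * math.log(-math.log(1.0 - 1.0 / return_period))


def bootstrap_interval(sample, estimator, n_boot=5000, alpha=0.05, seed=42):
    """Percentile Bootstrap interval for estimator(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(
        estimator([rng.choice(sample) for _ in range(n)]) for _ in range(n_boot)
    )
    lower_idx = int(math.floor((alpha / 2.0) * (n_boot - 1)))
    upper_idx = int(math.ceil((1.0 - alpha / 2.0) * (n_boot - 1)))
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical annual flood peaks in m3/s (illustration only).
peaks = [2100.0, 1850.0, 2600.0, 1720.0, 3100.0, 1980.0,
         2250.0, 1640.0, 2890.0, 2050.0, 1500.0, 2400.0]
low, high = bootstrap_interval(peaks, lambda s: gumbel_quantile_lmom(s, 100))
```

The width of the interval, high minus low, is one simple way to compare the uncertainty of different methods for the same return period.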
RESULTS
Flood peak series change characteristics


Flood peak series of the Guide Hydrological Station in the study area.
Nonstationary flood frequency analysis based on the Me-RS method
Construction of reconstructed series based on the UCRI or P
Fitting of the mechanism function and the reconstructed series. (a), (c), and (e) use nonlinear exponential regression analysis to obtain the mechanism functions for each influencing factor. (b), (d), and (f) compare the reconstructed series with the flood peak series.
According to the mechanism function and Equation (6), the impacts of reservoirs and rainfall on the flood peak were separately removed, thereby obtaining the new reconstructed series RSUCRI (Figure 4(b)) and RSP (Figure 4(d)). After the removal of reservoir impacts, the RSUCRI series significantly increased from 1987 to 2020, with enhanced inter-annual volatility that successfully eliminated the significant downward trend present in the original flood peak series (ZM−K = 0.5663 < 1.96). However, the RSP series still maintains a significant downward trend similar to the original flood peak series (ZM−K = −2.3709 < −1.96), and the mutation point shifted to occur in 1996. Compared to the reconstructed series RSP, the reconstructed series RSUCRI, which has eliminated the influence of the reservoir, demonstrates ideal stationarity. This once again confirms that Longyangxia is a key factor causing the nonstationary changes in flood peaks in the Guide Basin, which aligns with the conclusions discussed in Section 4.1.
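The trend statistic quoted above can be reproduced with the standard Mann–Kendall (M–K) test. The sketch below computes the standardized statistic Z and omits the correction for tied values, which is an assumption that holds when the series contains no repeated values; |Z| > 1.96 indicates a significant trend at the 5% level.

```python
import math


def mann_kendall_z(series):
    """Standardized Mann-Kendall statistic (no tie correction)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


# A strictly increasing hypothetical series gives a clearly significant Z.
z_up = mann_kendall_z([float(v) for v in range(10)])
```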
Removing rainfall impact based on RSUCRI
According to Equation (8), the impact of rainfall is further removed from the RSUCRI, obtaining a new reconstructed series RSUCRI+P (Figure 4(f)). It is obvious that RSUCRI+P shows a further upward trend compared to RSUCRI, and the M–K test revealed a significant trend (ZM−K = 3.081 > 1.96). The RSUCRI+P will once again become nonstationary, which is similar to the phenomenon encountered by Li et al. (2021) in their study. This does not deny the close relationship between rainfall and runoff. Precisely because rainfall is the primary source of runoff, the annual average runoff (RT) in the UCRI has inherently included the impact of rainfall. Therefore, if the rainfall factor is additionally removed, it will lead to the problem of over-reconstruction, thereby transforming the originally stationary reconstruction series RSUCRI into a nonstationary series RSUCRI+P. Therefore, adhering to the principle of stationarity, the study will rely on the reconstructed series RSUCRI to further derive the design flood.
Calculated design flood based on reconstructed series
The optimal distribution of RSUCRI is determined through the K–S test, the Nash–Sutcliffe efficiency, and RMSE. As shown in Table 2, each candidate distribution has passed the K–S test (P > 0.05). According to the principle of the highest NSEQQ and the lowest RMSE, the Weibull distribution is confirmed as the optimal distribution, and the distribution parameters were estimated using the L-moment method. Therefore, the traditional stationary method can be used to perform frequency analysis on the reconstructed series RSUCRI. Ultimately, based on the UCRI representing the current and future conditions, the design flood RSdesign and frequency curves under nonstationary conditions for the Guide Basin can be estimated, with the 100-year design flood being 2435.53 m³/s (Table 6). The detailed results of the design flood calculation will be thoroughly presented in Section 4.5.
Goodness-of-fit tests for candidate distributions based on the RSUCRI series
Distribution | K–S statistic | K–S P-value | NSEQQ | RMSE | Parameters |
---|---|---|---|---|---|
Lognormal | 0.1144 | 0.4016 | 0.2184 | 1335.44 | \ |
Gamma | 0.1456 | 0.1506 | 0.8980 | 214.23 | \ |
Gumbel | 0.1133 | 0.4139 | 0.9482 | 156.12 | \ |
Logistic | 0.0912 | 0.6912 | 0.7211 | 323.52 | \ |
**Weibull** | 0.1381 | 0.1952 | 0.9606 | 141.17 | μ = 1670.53, σ = 894.18, ε = 1.16 |
GEV | 0.1046 | 0.5162 | 0.5529 | 345.65 | \ |
Note: The bold represents the optimal distribution selected by goodness-of-fit tests.
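The RMSE and NSEQQ criteria used in the table can be sketched as a comparison between the sorted observations and the theoretical quantiles of a fitted distribution. The Weibull plotting position i/(n + 1) used here is an assumption, since the text does not state which plotting position was applied, and the function names and the toy quantile function are hypothetical.

```python
import math


def qq_pairs(sample, quantile_fn):
    """Sorted observations paired with theoretical quantiles at i/(n + 1)."""
    obs = sorted(sample)
    n = len(obs)
    theo = [quantile_fn((i + 1) / (n + 1)) for i in range(n)]
    return obs, theo


def rmse(obs, sim):
    """Root mean square error between observed and theoretical quantiles."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(obs) / len(obs)
    numerator = sum((o - s) ** 2 for o, s in zip(obs, sim))
    denominator = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - numerator / denominator


# Toy check: a sample that lies exactly on a hypothetical quantile function.
toy_quantile = lambda p: 100.0 * p
toy_sample = [100.0 * (i + 1) / 6 for i in range(5)]
obs_q, theo_q = qq_pairs(toy_sample, toy_quantile)
```

The distribution with the highest NSEQQ and the lowest RMSE among those passing the K–S test would then be selected, as done in the table above.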
Nonstationary flood frequency analysis based on the TS-DS method
Decomposition of the flood peak series
To achieve a stationary random series (St), the study sequentially decomposes the periodicity (Pt) and abrupt changes (Yt) within the deterministic components. Wavelet analysis reveals that the flood peak series in the Guide Basin exhibits significant periodicity at 5-, 8-, and 16-year time scales. Utilizing the periodic wave superposition technique, the study successfully constructed a periodic function to accurately depict the periodic component (Table 3).
Deterministic components of flood peak series in the Guide Basin
Component . | Function . |
---|---|
Periodic component | ![]() |
Abrupt change component | ![]() |
Remove deterministic component series. (a) Remove the periodic component from the flood peak series and (b) remove the abrupt change component from the St,1 series.
Synthesis of design flood
As shown in Table 4, every candidate distribution fitted to the random series St,2 passes the K–S test (P > 0.05). Following the principle of the highest NSEQQ and the lowest RMSE, the Gamma distribution is selected as the optimal distribution, and the traditional stationary method is then used for the frequency calculation. Subsequently, using the last observed year (2020) as the design benchmark, the corresponding periodic and abrupt change components are computed and combined with the frequency results to derive the design flood and frequency curves under nonstationary conditions, with a 100-year design flood of 3522.52 m3/s (Table 6).
Table 4. Goodness-of-fit tests for candidate distributions based on the random series (St,2 series)
Candidate distribution | K–S statistic | K–S P-value | NSEQQ | RMSE | Parameters
---|---|---|---|---|---
Lognormal | 0.1109 | 0.4108 | 0.5944 | 524.08 | \
**Gamma** | 0.1388 | 0.1733 | 0.9654 | 121.66 | μ = 1.47; σ = 559.24
Gumbel | 0.1108 | 0.4124 | 0.9623 | 122.31 | \
Logistic | 0.0890 | 0.6856 | 0.8082 | 254.02 | \
Weibull | 0.1271 | 0.2556 | 0.9564 | 133.51 | \
GEV | 0.1053 | 0.4761 | 0.8335 | 217.17 | \
Note: The bold represents the optimal distribution selected by goodness-of-fit tests.
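The synthesis step can be illustrated with a short sketch. It assumes an additive decomposition, so that the design flood is the quantile of the stationary random component plus the deterministic components evaluated at the benchmark year. Only the 5-, 8-, and 16-year periods come from the text; the amplitudes, phases, change year, step size, and Gamma parameters below are hypothetical placeholders, not the fitted functions of Table 3 or the parameters of Table 4.

```python
import math
from scipy import stats

PERIODS = (5.0, 8.0, 16.0)         # years, from the wavelet analysis in the text
AMPLITUDES = (120.0, 90.0, 60.0)   # hypothetical amplitudes (m3/s)
PHASES = (0.0, 0.5, 1.0)           # hypothetical phases (rad)
CHANGE_YEAR = 1987                 # hypothetical change point
STEP = -400.0                      # hypothetical shift after the change point (m3/s)

def periodic_component(year):
    """Superposition of sinusoidal waves, one per detected period."""
    return sum(a * math.sin(2.0 * math.pi * year / p + ph)
               for a, p, ph in zip(AMPLITUDES, PERIODS, PHASES))

def abrupt_component(year):
    """Step change in the mean after the change point."""
    return STEP if year >= CHANGE_YEAR else 0.0

def design_flood(return_period, year, random_dist):
    """Quantile of the random component plus deterministic components (additive assumption)."""
    non_exceedance = 1.0 - 1.0 / return_period
    return (random_dist.ppf(non_exceedance)
            + periodic_component(year) + abrupt_component(year))

random_dist = stats.gamma(a=2.0, scale=500.0)   # hypothetical stationary component
for T in (10, 20, 50, 100):
    print(T, round(design_flood(T, 2020, random_dist), 2))
```

The sketch also makes the time dependence explicit: changing the benchmark year shifts every design value by the change in the deterministic components alone.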
Nonstationary flood frequency analysis based on the GAMLSS model
Construction of the optimal GAMLSS
Various variation types for location parameter μ and scale parameter σ.
The performance of optimal GAMLSS models with different covariates. (a), (b), (c) and (d) are worm plots. (e), (f), (g) and (h) are the AIC and SBC information, where the pentagram indicates the minimum values of AIC and SBC, representing the optimal model. (i), (j), (k) and (l) are the centile curve plots.
Notably, after the P and UCRI covariates are introduced in GAMLSS 2 and GAMLSS 3, the AIC and SBC values decrease significantly compared to GAMLSS 1. Figures 7(j) and 7(k) show that the centile curves are no longer smooth lines but fluctuate with the changes in the physical covariates. Compared to GAMLSS 2, GAMLSS 3, which considers the influence of reservoirs, has a narrower 25–75% quantile interval for the fitting results from 1987 to 2020 and aligns more closely with the trend of the observed flood series. However, from 1960 to 1987 the reservoir was not operational, so the UCRI value is 0 and the simulation results of GAMLSS 3 remain constant over that period. Clearly, physical factors play a significant role in enhancing the simulation performance of the model, but a GAMLSS model using only UCRI or only P as a covariate is not the best choice.
As anticipated by Chen et al. (2021), the simulation performance of the GAMLSS model would further improve with the addition of more covariates. GAMLSS 4 incorporates both UCRI and P as covariates, resulting in the lowest AIC and SBC values (Figure 7(h)). The centile curves clearly show that GAMLSS 4 possesses all the advantages of GAMLSS 2 and 3, while further enhancing its ability to capture flood trend changes (Figure 7(l)). Therefore, this study adopts GAMLSS 4 as the optimal model for calculating nonstationary design floods in the Guide Basin.
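The AIC/SBC comparison behind this model selection can be sketched in a simplified setting. The example below is not the GAMLSS fitting procedure: it assumes a lognormal response whose log-scale location is linear in the covariates with a constant scale parameter, for which the maximum likelihood fit reduces to least squares on the log-transformed peaks. All data are synthetic, and the covariate values and coefficients are hypothetical.

```python
import numpy as np

def fit_lognormal(y, covariates):
    """MLE of a lognormal model with mu = X @ beta and constant sigma; returns AIC and SBC."""
    n = y.size
    X = np.column_stack([np.ones(n)] + list(covariates))
    z = np.log(y)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    sigma2 = float(resid @ resid) / n                  # MLE of the variance
    # Lognormal log-likelihood evaluated at the MLE.
    loglik = -z.sum() - 0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    k = X.shape[1] + 1                                 # regression coefficients + sigma
    return {"beta": beta, "sigma": sigma2 ** 0.5, "k": k,
            "aic": -2.0 * loglik + 2.0 * k,
            "sbc": -2.0 * loglik + k * np.log(n)}

# Synthetic record for 1960-2020; reservoir index is zero before 1987.
rng = np.random.default_rng(0)
years = np.arange(1960, 2021)
ucri = np.where(years >= 1987, rng.uniform(0.3, 0.8, years.size), 0.0)
rain = rng.normal(300.0, 40.0, years.size)
log_q = 7.8 - 1.0 * ucri + 0.004 * (rain - 300.0) + rng.normal(0.0, 0.15, years.size)
peaks = np.exp(log_q)

models = {"time": [years - 1960], "P": [rain], "UCRI": [ucri], "UCRI+P": [ucri, rain]}
results = {name: fit_lognormal(peaks, covs) for name, covs in models.items()}
best = min(results, key=lambda name: results[name]["aic"])
for name, res in results.items():
    print(f'{name:<7} AIC={res["aic"]:.1f} SBC={res["sbc"]:.1f}')
print("lowest AIC:", best)
```

SBC penalizes each extra parameter by ln(n) instead of 2, so with 61 observations it is stricter than AIC about adding covariates.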
Projections of future rainfall
Across the spatial and temporal dimensions, the five GCMs differ in how well they reproduce rainfall, as reflected in their goodness-of-fit statistics. To mitigate the uncertainty introduced by the selection of GCMs, the study employed the entropy weight TOPSIS method to comprehensively evaluate the predictive performance of each model and assigned weights accordingly. Supplementary Material S2 summarizes the goodness-of-fit statistics for the five GCMs in both temporal and spatial dimensions, together with the resulting weights. The entropy weight TOPSIS evaluation indicates that the BCC-CSM2-MR model has the best overall simulation performance, followed by the FGOALS-g3 model, while the CanESM5 model performs worst. The predicted rainfall from the five GCMs is then combined as a weighted average using the assigned weights to obtain the future rainfall series for the Guide Basin. Supplementary Material S3 provides a detailed discussion of the average rainfall and distribution characteristics during the flood season from 2015 to 2100 in the Guide Basin under four emission scenarios of CMIP6. Under all four emission scenarios, future rainfall shows an upward trend. The study ultimately selects the predicted values under the SSP1-2.6 emission scenario, which better meet the requirements of normal distribution and socio-ecological sustainable development, to serve as rainfall covariates.
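A minimal sketch of an entropy weight TOPSIS evaluation is given below. The score matrix, the three criteria, and the model labels are hypothetical, and converting closeness scores into ensemble weights by simple normalization is an assumption; the actual statistics and weights are those reported in Supplementary Material S2.

```python
import numpy as np

def entropy_topsis(matrix, benefit):
    """Return (criterion weights, closeness scores) for an alternatives x criteria matrix.

    Assumes every criterion takes at least two distinct values across alternatives.
    """
    x = np.asarray(matrix, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    lo, hi = x.min(axis=0), x.max(axis=0)
    # Min-max scaling, oriented so that larger is always better.
    scaled = np.where(benefit, (x - lo) / (hi - lo), (hi - x) / (hi - lo))
    # Entropy weights: criteria that discriminate more get larger weights.
    share = (scaled + 1e-12) / (scaled + 1e-12).sum(axis=0)
    entropy = -(share * np.log(share)).sum(axis=0) / np.log(x.shape[0])
    weights = (1.0 - entropy) / (1.0 - entropy).sum()
    # TOPSIS: distances to the ideal best and ideal worst alternatives.
    v = scaled * weights
    d_best = np.sqrt(((v - v.max(axis=0)) ** 2).sum(axis=1))
    d_worst = np.sqrt(((v - v.min(axis=0)) ** 2).sum(axis=1))
    closeness = d_worst / (d_best + d_worst)
    return weights, closeness

# Hypothetical scores: correlation (higher is better), RMSE and bias (lower is better).
names = ["model_A", "model_B", "model_C", "model_D", "model_E"]
scores = [[0.90, 20.0, 2.0],
          [0.85, 25.0, 4.0],
          [0.80, 30.0, 6.0],
          [0.75, 35.0, 8.0],
          [0.60, 50.0, 15.0]]
weights, closeness = entropy_topsis(scores, benefit=[True, False, False])
ensemble_weights = closeness / closeness.sum()   # one possible weighting (assumed)
for name, c, w in zip(names, closeness, ensemble_weights):
    print(f"{name}: closeness={c:.3f} weight={w:.3f}")
```

With this normalization the worst-ranked model receives a weight of zero whenever it is worst on every criterion, so a study that wants all models to contribute would need a different mapping from closeness to weight.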
ER for design flood calculation
Assume that the Longyangxia Reservoir is planned to be in service from 1987 to 2086, giving it a lifespan of 100 years. After obtaining the UCRI and predicting the future P, the nonstationary distribution and parameters for the best model GAMLSS 4 can be ascertained (Table 5). The nonstationary design flood, calculated using the ER method, is 2447.53 m3/s for the 100-year return period (Table 6).
Summary of the best GAMLSS 4 with UCRI and P covariates
GAMLSS model . | Optimal distribution . | Parameters . |
---|---|---|
GAMLSS 4 UCRI and P as covariates | Lognormal:![]() ![]() | ![]() ![]() ![]() ![]() |
Table 6. Design floods at corresponding return periods estimated by different methods (m3/s)
Method | 10-year | 20-year | 50-year | 100-year
---|---|---|---|---
Stationary method | 3106.99 | 3641.51 | 4292.51 | 4753.76
Me-RS | 1704.51 (−45.14%) | 1932.12 (−46.94%) | 2222.27 (−48.23%) | 2435.53 (−48.77%)
TS-DS | 2106.55 (−32.20%) | 2540.71 (−30.23%) | 3103.07 (−27.71%) | 3522.52 (−25.90%)
GAMLSS | 1898.57 (−38.89%) | 2074.05 (−43.04%) | 2290.78 (−46.63%) | 2447.53 (−48.51%)
Note: The percentage in parentheses shows the deviation of the design flood value of each nonstationary method from that of the traditional stationary method at each return period.
Comparison of three nonstationary flood design methods
Every method of nonstationary flood frequency analysis inevitably has certain limitations in terms of fundamental definitions, concepts, and methodological principles (Salas et al. 2018). As Serago & Vogel (2018) point out, no single method of nonstationary flood frequency analysis satisfies everyone. To assess the scientific validity and rationality of the three nonstationary frequency analysis methods, the study will conduct a comparison from the two dimensions of design flood estimation and uncertainty.
Comparative analysis of the design flood value
The reliability of design flood estimates directly affects whether water resources can be utilized efficiently and whether risks can be effectively avoided (Barbhuiya et al. 2023). Table 6 presents the design flood results for the traditional stationary method, Me-RS, TS-DS, and GAMLSS. Because the flood peak series does not satisfy the stationarity requirement, the traditional stationary method significantly overestimates the design flood: the 100-year design floods obtained from Me-RS, GAMLSS, and TS-DS are 48.77, 48.51, and 25.90% lower than the stationary estimate, respectively. In particular, the Me-RS method and GAMLSS, which take physical factors into account, reduce the estimated design flood by more than 48% relative to the stationary method. The complex nonstationarity of floods in the Guide Basin has caused the design flood results from the traditional stationary method to lose referential significance, so they fail to provide effective support for water resource management and engineering construction in the basin. This compels hydrologists to adopt nonstationary flood frequency analysis methods to estimate current and future design floods.
The three nonstationary flood frequency analysis methods, each based on its own theoretical framework and considering different nonstationary characteristics, all reduce the estimated design floods, with 100-year design flood values of 2435.53 m3/s for Me-RS, 3522.52 m3/s for TS-DS, and 2447.53 m3/s for GAMLSS. The Me-RS and GAMLSS, by considering physical factors, yield lower design floods than the TS-DS. As the return period increases, the design flood results of GAMLSS and the Me-RS method converge, differing by only 12 m3/s (0.49%) for the 100-year design flood. In contrast, the TS-DS only considers the periodicity and abrupt change components of the flood peak series from a statistical perspective, which results in higher design flood values than the Me-RS method and GAMLSS. For the 100-year design flood, the TS-DS estimate exceeds the Me-RS estimate by 44.63% and the GAMLSS estimate by 43.92%, which may lead to unnecessary resource wastage.
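The percentages quoted in this subsection can be reproduced directly from the 100-year column of Table 6. The short check below uses only those four values; the helper name `relative_change` is introduced here for illustration.

```python
# 100-year design floods from Table 6 (m3/s).
DESIGN_FLOOD_100 = {
    "Stationary": 4753.76,
    "Me-RS": 2435.53,
    "TS-DS": 3522.52,
    "GAMLSS": 2447.53,
}

def relative_change(value, reference):
    """Percentage change of `value` relative to `reference`."""
    return 100.0 * (value - reference) / reference

stationary = DESIGN_FLOOD_100["Stationary"]
for method in ("Me-RS", "GAMLSS", "TS-DS"):
    change = relative_change(DESIGN_FLOOD_100[method], stationary)
    print(f"{method} vs stationary: {change:.2f}%")

# Differences between the nonstationary methods themselves.
print(f'GAMLSS vs Me-RS: {relative_change(DESIGN_FLOOD_100["GAMLSS"], DESIGN_FLOOD_100["Me-RS"]):.2f}%')
print(f'TS-DS vs Me-RS: {relative_change(DESIGN_FLOOD_100["TS-DS"], DESIGN_FLOOD_100["Me-RS"]):.2f}%')
print(f'TS-DS vs GAMLSS: {relative_change(DESIGN_FLOOD_100["TS-DS"], DESIGN_FLOOD_100["GAMLSS"]):.2f}%')
```

Note that the reference value matters: the 48.77% figure is a reduction relative to the stationary estimate, whereas the 44.63% and 43.92% figures express the TS-DS excess relative to the Me-RS and GAMLSS estimates.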
Comparison of Bootstrap confidence intervals
95% Bootstrap confidence intervals for Me-RS, GAMLSS, and TS-DS methods.
The uncertainty in design floods used for flood risk assessment can have a significant impact on the flood control planning of river basins, determining whether flood protection measures need to be reinforced and where reinforcement is required (Bomers et al. 2019). A narrower uncertainty interval is very important for promoting the reasonable application of results in water resource management, as an overly wide uncertainty interval may lead to unnecessary resource investment (Kasiviswanathan et al. 2017). The better robustness of the Me-RS method results in a narrower uncertainty interval for design floods, which is an essential prerequisite for decision-makers to conduct accurate flood risk assessments, thereby enabling the adoption of proper preventive measures.
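A percentile Bootstrap confidence interval for a design quantile can be sketched as follows. The example is illustrative only: the sample is synthetic, the Gumbel distribution fitted by maximum likelihood stands in for whichever distribution each method actually uses, and the number of resamples is an arbitrary choice.

```python
import numpy as np
from scipy import stats

def design_quantile(sample, return_period):
    """Fit a Gumbel distribution (assumed) and return the T-year quantile."""
    loc, scale = stats.gumbel_r.fit(sample)
    return float(stats.gumbel_r.ppf(1.0 - 1.0 / return_period, loc=loc, scale=scale))

def bootstrap_ci(sample, return_period, n_boot=200, level=0.95, seed=0):
    """Percentile Bootstrap interval: resample with replacement, refit, take quantiles."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        resample = rng.choice(sample, size=sample.size, replace=True)
        estimates[i] = design_quantile(resample, return_period)
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(estimates, [alpha, 1.0 - alpha])
    return float(lower), float(upper)

# Synthetic stand-in for a 61-year stationary series.
rng = np.random.default_rng(7)
sample = rng.gumbel(loc=1700.0, scale=400.0, size=61)
point = design_quantile(sample, 100)
lower, upper = bootstrap_ci(sample, 100)
print(f"100-year estimate {point:.1f}, 95% interval [{lower:.1f}, {upper:.1f}]")
print(f"relative width {(upper - lower) / point:.2%}")
```

Comparing the relative interval width across methods, as in the last line, is one simple way to express the uncertainty reduction that the comparison discusses.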
DISCUSSION
Comparison with existing nonstationary methods
Establishing the connection between influencing factors and hydrological series to explore the trend dynamics of hydrological events has become a crucial step in the study of nonstationary changes in hydrological series (Serinaldi & Kilsby 2015). In this case study, time, reservoir, and rainfall are considered as influencing factors to analyze the nonstationary changes of flood peaks in the Guide Basin. However, as Singh & Chinnasamy (2021) have pointed out, although time is regarded as a proxy for changes in all physical phenomena, it cannot effectively explain the nonstationary changes of flood peaks. GAMLSS 1, which only takes the time factor into account, failed to capture the abrupt change characteristics of the flood peak series, and its simulated trend of flood peaks decreasing indefinitely over time does not correspond to reality. When a hydrological series exhibits a significant trend, the GAMLSS model inevitably shows high statistical significance for time-related covariates. In this case, the trend in the fitting results may extend indefinitely over time, thereby compromising the reliability of the frequency analysis results. Similarly, the TS-DS method, which simplifies the flood peak series into a time series, overestimates the design flood and increases uncertainty. In fact, the TS-DS method merely fits functions of time to the deterministic components of hydrological series, such as periodicity, abrupt changes, or trends, which presumes that the nonstationary changes in hydrological series are purely time-varying. Therefore, the design flood results from the TS-DS method vary with the start and end times of the calculation period, and such time-dependent design flood predictions are likely to deviate from actual conditions.
Compared to the TS-DS method, the Me-RS and GAMLSS 2-4 have significantly improved the ability to simulate nonstationary changes in flood peaks after considering the impact of reservoirs and/or rainfall. The values of design floods are further decreased, accompanied by lower uncertainty.
It is worth noting that the selection of impact factors is not immutable. Jin et al. (2023) pointed out that researchers should make timely adjustments and optimizations to these factors based on environmental changes, as indiscriminately adding more impact factors may lead to worse results. The Me-RS employed in the study clearly indicates that introducing influencing factors without screening may lead to the deterioration of the analysis outcomes. When using Me-RS to further eliminate the rainfall impact on the RSUCRI series, which has had the reservoir impact removed, the newly reconstructed series RSUCRI+P loses the perfect stationarity of RSUCRI. However, the performance of the GAMLSS is quite different from that of the Me-RS. When the GAMLSS considers both the UCRI and P as covariates simultaneously, its simulation results surpass those obtained by considering the UCRI or P individually. Serinaldi & Kilsby (2015) and Chen et al. (2021) have also found that incorporating more influencing factors can make the GAMLSS more accurate in simulating the nonstationary changes of hydrological events. This reveals a common issue with the GAMLSS and other nonstationary methods such as TS-DS: these methods place excessive focus on the study of mathematical formulas and statistical characteristics, while overlooking the actual processes of hydrological change, thereby easily mistaking statistical significance for practical importance (Debele et al. 2017). Indeed, the core of the Me-RS is to deeply analyze the root causes of nonstationary changes in the flood peak series, achieving the reconstruction of the ‘evolutionary trajectory’ of flood peaks. This significantly enhances the ability to identify influencing factors, thereby ensuring the precision and efficiency of design flood calculation under the nonstationary conditions. 
This is why the Me-RS, which considers only a single impact factor (UCRI), yields comparable results to the GAMLSS that considers multiple impact factors (UCRI and P), with a difference of only 0.49% in the estimated design flood values and 3.29% in uncertainty under a 100-year return period.
Implications for engineering hydrology
The length limitation of hydrological series is a crucial factor in determining whether the statistical significance is reasonable. Debele et al. (2017) point out that for shorter hydrological series, even the addition of just 1 year of data may lead to a change in the optimal distribution. The Me-RS method eliminates the nonstationary influence of influencing factors through a mechanism function, converting nonstationary series into stationary reconstructed series with a constant sample length. A sufficiently long reconstructed series helps to reveal the statistical characteristics of hydrological variables and determines the stochastic rules of hydrological phenomena (Li et al. 2018). The Me-RS method employs the traditional stationary method that has been well mastered to calculate the design floods of the stationary reconstructed series. By considering the status of influencing factors during historical, current, and future periods, researchers and engineers can derive design floods under nonstationary conditions through mechanism functions, which enables the evaluation of historical flood frequency results and the estimation of flood frequencies for the present and future periods.
It is worth noting that the Me-RS method consistently chooses a multiplicative model for its construction, which endows it with unique innovativeness and applicability. Hydrological phenomena arise from the interaction of various influencing factors within the hydrological system. These factors are hardly completely independent of each other, and their intertwined interactions cause the hydrological system to display highly complex adaptive nonlinear characteristics. In the field of engineering hydrology, achieving dimensional harmony is the ideal goal. However, in practical operations, approximate solutions for hydrological variables are usually obtained through experience and fitting. This often results in a multitude of empirical and semi-empirical models that exhibit dimensional inharmony. To address these issues, the Me-RS method based on the multiplicative model can effectively solve them. The Me-RS method more readily achieves the goal of dimensional harmony in engineering hydrology and is more suitable for describing the highly complex, adaptive, and nonlinear characteristics of hydrological systems, showing great potential for application (Li & Qin 2022).
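The multiplicative reconstruction idea can be sketched in a few lines. Everything specific here is an assumption made for illustration: the exponential form of the mechanism function, its least-squares fit on the log scale, the Gumbel distribution for the reconstructed series, and the synthetic data. The mechanism function and distribution actually used in the study are those defined in its methodology and results sections.

```python
import numpy as np
from scipy import stats

def fit_mechanism(q, ucri):
    """Least-squares fit of ln(q) = a + b * ucri; returns the slope b (assumed form)."""
    b, _a = np.polyfit(ucri, np.log(q), 1)
    return float(b)

def mechanism(ucri, b):
    """Hypothetical multiplicative mechanism function f(UCRI) = exp(b * UCRI)."""
    return np.exp(b * np.asarray(ucri, dtype=float))

def reconstruct(q, ucri, b):
    """Divide out the reservoir effect to obtain the reconstructed series."""
    return np.asarray(q, dtype=float) / mechanism(ucri, b)

def design_flood(rs, return_period, ucri_design, b):
    """Stationary frequency analysis on the reconstructed series, then reapply f."""
    loc, scale = stats.gumbel_r.fit(rs)          # Gumbel is a stand-in distribution
    rs_p = stats.gumbel_r.ppf(1.0 - 1.0 / return_period, loc=loc, scale=scale)
    return float(rs_p * mechanism(ucri_design, b))

# Synthetic record: no reservoir influence for 27 years, then a fixed index of 0.5.
rng = np.random.default_rng(1)
ucri = np.concatenate([np.zeros(27), np.full(34, 0.5)])
natural = rng.gumbel(loc=2000.0, scale=500.0, size=ucri.size)
observed = natural * np.exp(-1.0 * ucri)

b = fit_mechanism(observed, ucri)
rs = reconstruct(observed, ucri, b)
q_regulated = design_flood(rs, 100, ucri_design=0.5, b=b)
q_natural = design_flood(rs, 100, ucri_design=0.0, b=b)
print(f"b = {b:.3f}; 100-year flood: regulated {q_regulated:.1f}, unregulated {q_natural:.1f}")
```

In a multiplicative form the mechanism function is dimensionless, so the reconstructed series and the design flood keep the units of the observed flood peaks.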
Limitations and future work
The ultimate goal of the Me-RS method is to accurately identify all influencing factors and their mechanism functions, thereby obtaining a stationary constant series. However, given the complexity of hydrological processes and the current limits of human knowledge and technology, we are still unable to identify all influencing factors, and it is almost impossible to determine an accurate mechanism function through theoretical derivation or experimental analysis (Qin & Li 2021). To address this issue, we exploit the fact that causal mechanisms are often implicit in statistical relationships and establish mechanism functions by regression analysis, which conveniently characterize the nonstationary changes in hydrological series (Salas et al. 2018). In the future, researchers should continue to deepen their understanding of complex hydrological systems. Through thorough experimental analysis, they should reveal the inherent causal mechanisms between hydrological phenomena and influencing factors and construct the mechanism functions through theoretical derivation. As the precision of the mechanism functions of explanatory variables continues to improve, the reconstructed series will gradually tend toward statistical stationarity.
CONCLUSION
This investigation has been undertaken with the objective of enhancing the field of nonstationary flood frequency analysis for the upper reaches of the Yellow River. By refining the Me-RS method to offer a more efficient model structure with minimized uncertainty, our goal has been to bolster the practical application of nonstationary methodologies within engineering hydrology. Furthermore, we have conducted a comparative analysis with the TS-DS and GAMLSS methods, focusing on both design flood estimation and uncertainty, to delineate the nuances between these methods. The complex nonstationary dynamics observed in the Guide Basin, a product of escalating climate change and human activity, provide a fertile ground for assessing the virtues and limitations of the methods under scrutiny. The following summary encapsulates the key findings and insights derived from this comprehensive analysis:
(1) The operation of the Longyangxia Reservoir has caused nonstationary changes in the flood peaks of the Guide Basin, rendering traditional stationary methods no longer suitable for design flood calculation. The traditional stationary method significantly overestimates the design flood: the 100-year design floods from the nonstationary methods are lower than the stationary estimate by 48.77% for Me-RS, 48.51% for GAMLSS, and 25.90% for TS-DS.
(2) In estimating design floods under nonstationary conditions, physical factors are of crucial importance. The Me-RS and GAMLSS, when incorporating the influence of Longyangxia, produce lower design estimates than the TS-DS, with the respective 100-year design flood of 2435.53 m3/s for Me-RS, 2447.53 m3/s for GAMLSS, and 3522.52 m3/s for TS-DS. Additionally, Me-RS and GAMLSS reduce uncertainty by about 40% compared to TS-DS, which emphasizes periodicity and abrupt changes but overlooks physical factors.
(3) The Me-RS, based on the causal mechanism between flood peak series and influencing factors, has a more streamlined theoretical framework and a stronger capability to explain the nonstationary changes of floods. The Me-RS, which considers only a single impact factor (UCRI), yields comparable results to the GAMLSS that considers multiple impact factors (UCRI and P). Under a 100-year return period, the difference is only 0.49% for design flood and 3.29% for uncertainty. Especially when the return period is short, the Me-RS is more effective in reducing design flood and has a lower uncertainty.
The Me-RS method provides a more accurate and simpler theoretical framework for nonstationary frequency analysis by exploring the causal mechanism between hydrological phenomena and impact factors. Nevertheless, this study only focuses on the Guide Basin in the upper reaches of the Yellow River, which is subject to the interacting effects of intensifying climate change and escalating human intervention. In the future, more in-depth research should be conducted on other river basins and under various impact factors for nonstationary flood frequency analysis. The aim is to validate the applicability of the Me-RS method across different river basins and under diverse nonstationary conditions, while further promoting the practical application of nonstationary frequency analysis in engineering hydrology.
ACKNOWLEDGEMENTS
We would like to express our sincere gratitude for the valuable data provided by the Yellow River Conservancy Commission. This assistance has been instrumental in our work.
FUNDING
The financial support provided by the Jiangxi Provincial Natural Science Foundation (20212BAB214066 and 20242BAB25311) in this research is sincerely appreciated.
AUTHOR CONTRIBUTIONS
All the authors of this paper contributed to the conception and design of the study. B.X. provided funding support for the entire study. Material preparation, data collection, and analysis were done by S.L. and F.L. L.C. and B.X. proposed the main structure of this study. The first draft and figures of the manuscript were written by F.L. and L.C., and the revised draft was jointly completed by F.L., M.Z., and B.X. All authors read and approved the final manuscript.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.