Theoretical derivation for the exceedance probability of corresponding flood volume of the equivalent frequency regional composition method in hydrology

The equivalent frequency regional composition (EFRC) method is an important and commonly used tool for determining the regional composition of the design flood at various sub-catchments under natural conditions. One case of the EFRC method assumes that the exceedance probabilities of the design flood volumes at the upstream and downstream sites are equal, and that the corresponding flood volume at the intermediate catchment equals the difference between the downstream and upstream flood volumes. However, the relationship between the exceedance probability of the upstream and downstream flood volumes, P, and that of the corresponding intermediate flood volume, C, has not been clarified, and whether P > C or P < C has not been theoretically proven. In this study, the relationship between C and P is deduced via theoretical derivations for the normal, extreme value type I, and logistic distributions, and is investigated using Monte Carlo experiments for the Pearson type III, two-parameter lognormal, and generalized extreme value distributions. The results show that C is larger than P in the context of the design flood, whereas P is larger than C in the context of low-flow runoff. Thus, the exceedance probability of the corresponding flood in the EFRC method is further clarified theoretically.


INTRODUCTION
The ability to estimate the design flood for a given return period is a fundamental issue in engineering design, as well as in water resource management and planning. Hydrological frequency analyses have been used worldwide as a standard approach for estimating the design flood (Kendall & Stuart ; Ponce ; Hu et al. ). The estimation of the design flood is generally of interest to hydrologists, engineers, and agriculturalists for the design of hydraulic structures at locations such as river sections and dam sites. When control engineering has not been implemented upstream of a future dam site, the design flood of the future dam can be directly calculated via a hydrological frequency analysis of the peak or duration of the flood volume series, and the flood regional composition does not need to be considered (Maidment ).
However, for cases in which one or more control engineering features have been implemented upstream of the future dam site, such as a cascade reservoir system, the impact of the outflows at the upstream site and the intermediate catchment (i.e., the flood regional composition) must be considered to estimate the design flood at the future dam site. Generally, the framework for flood regional composition (Lu et   Compared with on-site hydrological frequency analysis, a multi-site hydrological frequency analysis that considers the flood regional composition is more complicated and difficult. This process should consider not only the design flood of each site but also the influence of the flood regional composition on the design flood (flood discharges or volumes) of the future dam site. The number of possible regional compositions is unlimited, so the selection of an appropriate combination is important. In practice, several combinations, such as the best, the worst, and the most likely, are used to simulate the impact of upstream sites on the design flood at the downstream site (Nijssen et al. ). With respect to the design flood regional composition analysis, semi-theoretical and semi-empirical methods, such as the regional composition method, the frequency combination method, and the stochastic simulation method, have been widely practiced for decades. The regional composition method specifies that a flood occurs in one catchment with the same exceedance probability as in the design section, and a corresponding flood occurs in the other catchments (Ministry of Water Resources ).
Among the various possible compositions, the equivalent frequency regional composition (EFRC) method selects a specific one as the design regional composition model to ensure the safety of the calculated results (Lu et  Regardless of which flood regional composition method is used, the volume of the corresponding flood is known, while the exceedance probability (or return period) of the corresponding flood volume is unknown. Consider one case of the EFRC method as an example, in which the design flood volumes at the upstream site and the downstream site have the same exceedance probability and are used to calculate the corresponding flood at the intermediate catchment. In this case, the corresponding flood volume equals the difference between the downstream and upstream flood volumes.
Whether the exceedance probability of the corresponding flood volume is larger or smaller than that of the design flood volume at the downstream site has not been theoretically proven, which has introduced confusion into practical probabilistic applications.
Figure 1(a) shows a natural site C that does not involve the flood regional composition. The process of on-site hydrological frequency analyses includes only the selection of distribution functions and the estimation of parameters. As shown in Figure 1(b), site A was constructed upstream of site C. Therefore, the estimation of the design flood volume at site C should consider the impact of the upstream flood regional composition, which means that the inflow into site C is divided into two  Figure 1(c), can be divided into many small single intermediate sub-catchments, as shown in Figure 1(b). Every single sub-catchment has a similar design flood volume composition. As the number of sub-catchments (n) increases, the number of flood regional compositions (n) increases uniformly.

EFRC method
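The EFRC method is widely used to determine the design flood regional composition.
Form (1): Assuming that the exceedance probability of the design flood volume at the upstream site and the downstream site is P and the exceedance probability of the corresponding flood volume at the intermediate catchment is C, the corresponding flood volume at the intermediate catchment can be expressed as:
$$Y_C = Z_P - X_P$$
where $X_P$ and $Z_P$ are the design flood volumes at the upstream site and the downstream site with exceedance probability P, respectively, and $Y_C$ is the corresponding design flood volume at the intermediate catchment with exceedance probability C. However, the relationship between P and C, i.e., P > C or P < C, has not been theoretically proven.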
Form (2): Assuming that the exceedance probability of the design flood volume at the intermediate catchment and the downstream site is P and the exceedance probability of the corresponding flood volume at the upstream site is C, the corresponding flood volume at the upstream site can be expressed as:
$$X_C = Z_P - Y_P$$
where $Y_P$ and $Z_P$ are the design flood volumes with exceedance probability P at the intermediate catchment and the downstream site, respectively, and $X_C$ is the corresponding flood volume with exceedance probability C at the upstream site. Similarly, whether P > C or P < C has not been clarified.
In the following, we take the first form of the EFRC method as an example and determine whether P > C or P < C via theoretical derivations and Monte Carlo (MC) experiments.

Normal distribution
The probability density function (PDF) of a normally distributed random variable X is given by Equation (4):
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] \quad (4)$$
where μ and σ are the mean and the standard deviation of the random variable X, respectively. When x = μ, the PDF reaches its maximum value of $f(\mu) = 1/(\sqrt{2\pi}\,\sigma)$, and the exceedance probability is $P(X \ge \mu) = 50\%$.
The variable x takes values in the range $-\infty < x < +\infty$.
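
EV1(2) distribution
The distribution function of the two-parameter extreme value type I (EV1(2)) distribution is given by Equation (6):
$$F(x) = \exp\left[-\exp\left(-\frac{x-m}{a}\right)\right] \quad (6)$$
where a and m are the scale parameter and the location parameter of the random variable X, respectively, which are estimated by the method of moments (MOM) (Equation (7)):
$$a = \frac{\sqrt{6}}{\pi}\,\sigma, \qquad m = \mu - \gamma a \quad (7)$$
where μ and σ are the mean and the standard deviation of the random variable X, respectively, and γ = 0.57722 is the Euler-Mascheroni constant.
According to Equation (8), the exceedance probability $P(X \ge \mu) = 1 - \exp[-\exp(-\gamma)]$ is equal to 42.96%.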
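The 42.96% figure can be checked numerically. The following is a minimal sketch, assuming the standard Gumbel form of the EV1 distribution function and the MOM relations $a = \sqrt{6}\,\sigma/\pi$ and $m = \mu - \gamma a$; the function names and the values μ = 1000 and σ = 500 are illustrative and do not come from the study.

```python
import math

EULER_GAMMA = 0.57722  # Euler-Mascheroni constant, as quoted in the text


def ev1_params_mom(mu, sigma):
    """Return (a, m): EV1 scale and location from the mean and standard deviation."""
    a = math.sqrt(6.0) * sigma / math.pi
    m = mu - EULER_GAMMA * a
    return a, m


def ev1_exceedance(x, a, m):
    """P(X >= x) = 1 - F(x), with F the EV1 (Gumbel) distribution function."""
    return 1.0 - math.exp(-math.exp(-(x - m) / a))


# Illustrative statistics (hypothetical values, not from the study).
a, m = ev1_params_mom(mu=1000.0, sigma=500.0)
p_at_mean = ev1_exceedance(1000.0, a, m)
print(f"P(X >= mu) = {p_at_mean:.4f}")
```

The result does not depend on the chosen μ and σ, because $(\mu - m)/a = \gamma$ for any pair of moments.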

Logistic distribution
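The PDF and distribution function of the logistic distribution are expressed in Equations (9) and (10), respectively.
$$f(x) = \frac{\exp[-(x-m)/a]}{a\,\{1+\exp[-(x-m)/a]\}^2} \quad (9)$$
$$F(x) = \frac{1}{1+\exp[-(x-m)/a]} \quad (10)$$
where a and m are the scale parameter and the location parameter of the random variable X, respectively, which are estimated by the MOM in Equation (11).
$$a = \frac{\sqrt{3}}{\pi}\,\sigma, \qquad m = \mu \quad (11)$$
where μ and σ are the mean and the standard deviation of the random variable X, respectively.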

PE3 distribution
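The PDF of the PE3 distribution is expressed by Equation (13).
$$f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,(x-a_0)^{\alpha-1}\exp[-\beta(x-a_0)], \quad x > a_0 \quad (13)$$
where α, β, and $a_0$ are the shape, scale, and location parameters, respectively. These parameters are related to the expected value (EX), coefficient of variation (Cv), and coefficient of skewness (Cs) of the distribution through the following equations.
$$\alpha = \frac{4}{C_s^2}, \qquad \beta = \frac{2}{EX\,C_v\,C_s}, \qquad a_0 = EX\left(1-\frac{2C_v}{C_s}\right) \quad (14)$$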
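As an illustration of how EX, Cv, and Cs determine a PE3 variable, the sketch below converts the three statistics into gamma parameters and draws random variates with the standard library. It assumes Cs > 0 and the usual shifted-gamma parameterisation (α = 4/Cs², β = 2/(EX·Cv·Cs), location EX(1 − 2Cv/Cs)); the function names and the numerical values are hypothetical.

```python
import random
import statistics


def pe3_params(ex, cv, cs):
    """Return (alpha, beta, a0) of a PE3 distribution with Cs > 0 (assumed form)."""
    alpha = 4.0 / cs**2
    beta = 2.0 / (ex * cv * cs)
    a0 = ex * (1.0 - 2.0 * cv / cs)
    return alpha, beta, a0


def pe3_sample(ex, cv, cs, n, rng):
    """Draw n PE3 variates as a0 + Gamma(shape=alpha, scale=1/beta)."""
    alpha, beta, a0 = pe3_params(ex, cv, cs)
    # random.gammavariate takes (shape, scale), so the scale is 1/beta.
    return [a0 + rng.gammavariate(alpha, 1.0 / beta) for _ in range(n)]


rng = random.Random(0)
sample = pe3_sample(ex=1000.0, cv=0.5, cs=2.0, n=50_000, rng=rng)
sample_mean = statistics.fmean(sample)
sample_cv = statistics.pstdev(sample) / sample_mean
print(f"mean = {sample_mean:.1f}, Cv = {sample_cv:.3f}")
```

The sample mean and coefficient of variation should come out close to the requested EX and Cv, which is a quick way to confirm that the parameter conversion is self-consistent.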

LN(2) distribution
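The PDF of the LN(2) distribution is given by Equation (15),
$$f(x) = \frac{1}{x\,\sigma_y\sqrt{2\pi}}\exp\left[-\frac{(\ln x-\mu_y)^2}{2\sigma_y^2}\right], \quad x > 0 \quad (15)$$
where $\mu_y$ and $\sigma_y$ are the mean and standard deviation of the natural logarithms of x, respectively. The variable ln(x) can be standardized, as shown in Equation (16).
$$u = \frac{\ln x - \mu_y}{\sigma_y} \quad (16)$$
where the standard normal variable u follows the PDF given in Equation (4) with μ = 0 and σ = 1.
These parameters are related to the EX and Cv of the distribution through the following equations.
$$EX = \exp\left(\mu_y + \frac{\sigma_y^2}{2}\right), \qquad C_v = \sqrt{\exp(\sigma_y^2)-1}$$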
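The relations between (EX, Cv) and the log-space parameters can be inverted in closed form. The sketch below assumes the standard lognormal moment relations, $\sigma_y^2 = \ln(1 + C_v^2)$ and $\mu_y = \ln EX - \sigma_y^2/2$, and uses them to compute a design value for a given exceedance probability; the helper names and the values EX = 1000 and Cv = 0.7 are illustrative only.

```python
import math
from statistics import NormalDist


def ln2_params(ex, cv):
    """Return (mu_y, sigma_y) of ln(X) from the mean EX and coefficient of variation Cv."""
    sigma_y = math.sqrt(math.log(1.0 + cv**2))
    mu_y = math.log(ex) - 0.5 * sigma_y**2
    return mu_y, sigma_y


def ln2_quantile(p_exceed, ex, cv):
    """Design value x_P such that P(X >= x_P) = p_exceed."""
    mu_y, sigma_y = ln2_params(ex, cv)
    u = NormalDist().inv_cdf(1.0 - p_exceed)  # standard normal variate
    return math.exp(mu_y + sigma_y * u)


mu_y, sigma_y = ln2_params(ex=1000.0, cv=0.7)
print(f"mu_y = {mu_y:.4f}, sigma_y = {sigma_y:.4f}")
print(f"1% design value = {ln2_quantile(0.01, 1000.0, 0.7):.1f}")
```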

Derivation for normal distribution
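Assume that the flood volumes at the upstream site, intermediate catchment, and downstream site are subject to normal distributions, with $X \sim N(\mu_x, \sigma_x^2)$ and $Y \sim N(\mu_y, \sigma_y^2)$. Let $\mu_y = m\mu_x$ (m > 0) and $\sigma_y^2 = n^2\sigma_x^2$ (n > 0). As shown in Figure 1(b), the downstream flood volume Z equals the sum of the upstream flood volume X and the intermediate flood volume Y. Then $\mu_z = \mu_x + m\mu_x$ and $\sigma_z^2 = \sigma_x^2 + n^2\sigma_x^2$, which means that the distribution parameters of the random variables Y and Z can be represented by the statistics of X: $Y \sim N(m\mu_x, n^2\sigma_x^2)$ and $Z \sim N(\mu_x + m\mu_x, \sigma_x^2 + n^2\sigma_x^2)$. With additional calculations, we can identify the exceedance probability of the volumes of upstream, intermediate, and downstream floods. The calculation of upstream flood volume X is used as an example, and the exceedance probability in the context of $\{X \ge x\}$ is obtained as follows.
$$P(X \ge x) = \int_{(x-\mu_x)/\sigma_x}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}\,du$$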
In the same way, the exceedance probabilities of the volumes of intermediate flood and downstream flood are obtained.
For the EFRC method, the exceedance probabilities of X and Z are both P, and the exceedance probability of Y calculated by Equation (1) is C. Thus, Equation (22) is equal to Equation (24).
Similarly, we can compare Equations (22) and (23) to obtain the relationship between P and C, which indicates that Equations (22) and (23) are different in their integral lower bound. We define Δ as the difference between the lower bounds.
By plugging Equations (2) and (27) into Equation (28), Equation (28) can be easily transformed into the expression in Equation (29). If P is smaller than 50%, then P is smaller than C. If P is larger than 50%, then P is larger than C. If P equals 50%, then P equals C.
Overall, in the context of using the first EFRC method for normal distribution floods, the relationship between C and P depends on whether or not P is larger than 50%.
Thus, for a design flood (volume), whose exceedance probability is generally less than 50%, C is greater than P; however, for a low-flow condition, whose exceedance probability is generally larger than 50%, C is less than P. In the context of the other EFRC forms, similar conclusions can be drawn for normally distributed floods.
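The normal case can be checked numerically. Under the assumptions stated above (normal X and Y, with Z having mean $(1+m)\mu_x$ and variance $(1+n^2)\sigma_x^2$), a short calculation gives the standardized corresponding volume as $u_P(\sqrt{1+n^2}-1)/n$, where $u_P$ is the standard normal variate with exceedance probability P. Because this factor lies between 0 and 1 for n > 0, C moves toward 50% relative to P. The sketch below is illustrative only: the function name and the values of $\mu_x$, $\sigma_x$, m, and n are assumptions, not parameters from the study.

```python
import math
from statistics import NormalDist


def corresponding_prob_normal(p, mu_x, sigma_x, m, n):
    """Exceedance probability C of Y_C = Z_P - X_P for normal X, Y, and Z."""
    x_dist = NormalDist(mu_x, sigma_x)
    y_dist = NormalDist(m * mu_x, n * sigma_x)
    z_dist = NormalDist((1.0 + m) * mu_x, sigma_x * math.sqrt(1.0 + n**2))
    x_p = x_dist.inv_cdf(1.0 - p)   # upstream design volume
    z_p = z_dist.inv_cdf(1.0 - p)   # downstream design volume
    y_c = z_p - x_p                 # corresponding intermediate volume
    return 1.0 - y_dist.cdf(y_c)


for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    c = corresponding_prob_normal(p, mu_x=1000.0, sigma_x=300.0, m=1.0, n=1.0)
    print(f"P = {p:.2f}  C = {c:.4f}")
```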

Derivation for EV1(2) distribution
Assume that the floods at the upstream site, intermediate catchment, and downstream site are subject to the EV1(2) distribution. Let $\mu_y = m\mu_x$ (m > 0) and $\sigma_y^2 = n^2\sigma_x^2$ (n > 0). As shown in Figure 1(b), the downstream flood volume Z equals the sum of the upstream flood volume X and the intermediate flood volume Y. Then $\mu_z = \mu_x + m\mu_x$ and $\sigma_z^2 = \sigma_x^2 + n^2\sigma_x^2$; thus, the distribution parameters of the random variables Y and Z can be represented by the statistics of X.
With additional calculations, we can identify the exceedance probability of flood at upstream site, intermediate catchment, and downstream site. Using the calculation of upstream flood volume X as an example, the exceedance probability of upstream flood volume is obtained as follows.
Similarly, the exceedance probabilities of the volumes of intermediate flood and downstream flood are obtained.
For the EFRC method, the exceedance probabilities of the volumes of upstream and downstream floods are both P, and that of the corresponding flood volume calculated by Equation (2) is C. Thus, Equation (36) is equal to the corresponding expression for the downstream flood volume. Similarly, we can compare Equations (36) and (37) to obtain the relationship between P and C. After comparing the equations, we found that Equations (36) and (37) differ in their integral lower bounds. We define Δ as the difference between these bounds.

If P equals 42.96%, then P equals C.
Thus, when using the first EFRC method for the EV1 (2) distribution flood, the relationship between C and P depends on whether P is larger than 42.96%. For a design flood (volume) whose exceedance probability is generally less than 42.96%, C is larger than P, whereas for the design low flow whose exceedance probability is generally larger than 42.96%, C is less than P.
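The 42.96% threshold can be reproduced with a few lines of code. This is a sketch under the same assumptions as the derivation (X, Y, and Z each follow an EV1 distribution fitted by the MOM, with $\mu_z = (1+m)\mu_x$ and $\sigma_z^2 = (1+n^2)\sigma_x^2$); the function names and numerical values are hypothetical.

```python
import math

EULER_GAMMA = 0.57722  # Euler-Mascheroni constant


def ev1_scale_loc(mu, sigma):
    """MOM estimates of the EV1 scale a and location m."""
    a = math.sqrt(6.0) * sigma / math.pi
    return a, mu - EULER_GAMMA * a


def ev1_quantile(p, mu, sigma):
    """Volume with exceedance probability p."""
    a, loc = ev1_scale_loc(mu, sigma)
    return loc - a * math.log(-math.log(1.0 - p))


def ev1_exceedance(x, mu, sigma):
    """P(X >= x) for an EV1 variable with the given mean and standard deviation."""
    a, loc = ev1_scale_loc(mu, sigma)
    return 1.0 - math.exp(-math.exp(-(x - loc) / a))


def corresponding_prob_ev1(p, mu_x, sigma_x, m, n):
    """Exceedance probability C of Y_C = Z_P - X_P in the EV1 case."""
    x_p = ev1_quantile(p, mu_x, sigma_x)
    z_p = ev1_quantile(p, (1.0 + m) * mu_x, sigma_x * math.sqrt(1.0 + n**2))
    return ev1_exceedance(z_p - x_p, m * mu_x, n * sigma_x)


p_crit = 1.0 - math.exp(-math.exp(-EULER_GAMMA))  # about 0.4296
for p in (0.01, p_crit, 0.90):
    c = corresponding_prob_ev1(p, mu_x=1000.0, sigma_x=300.0, m=1.0, n=1.0)
    print(f"P = {p:.4f}  C = {c:.4f}")
```

At the critical probability the design volumes coincide with the means, so the corresponding volume equals $\mu_y$ and C equals P.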

Derivation for logistic distribution
Assume that the floods at upstream site, intermediate catchment, and downstream site are independently subject to a logistic distribution.
Let $\mu_y = m\mu_x$ (m > 0) and $\sigma_y^2 = n^2\sigma_x^2$ (n > 0). As shown in Figure 1(b), the downstream flood volume Z equals the sum of the upstream flood volume X and the intermediate flood volume Y. Then $\mu_z = \mu_x + m\mu_x$ and $\sigma_z^2 = \sigma_x^2 + n^2\sigma_x^2$; thus, the distribution parameters of the random variables Y and Z can be represented by the statistics of X.
With additional calculations, we can identify the exceedance probability of the volumes of upstream, intermediate, and downstream floods. The calculation of upstream flood volume X is used as an example, and the exceedance probability of upstream flood volume is obtained as follows.
Similarly, the exceedance probabilities of the volumes of intermediate and downstream floods are obtained.
Similarly, we can compare Equations (50) and (51) to obtain the relationship between P and C. After comparing the two equations, we found that Equations (50) and (51) are different in terms of their integral lower bounds. We defined Δ as the difference between the bounds.
If P is larger than 50%, then P is larger than C.
When using the first EFRC method for flood regional composition with logistic distributions, the relationship between C and P depends on whether P is larger than 50% or not. Thus, for a design flood (volume), C is greater than P, whereas for a design low flow, C is less than P.
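A numerical check of the logistic case follows the same pattern. The sketch assumes the MOM relations $a = \sqrt{3}\,\sigma/\pi$ and $m = \mu$, together with $\mu_z = (1+m)\mu_x$ and $\sigma_z^2 = (1+n^2)\sigma_x^2$; names and values are illustrative rather than taken from the study.

```python
import math


def logistic_quantile(p, mu, sigma):
    """Volume with exceedance probability p for a logistic variable (MOM fit)."""
    a = math.sqrt(3.0) * sigma / math.pi
    return mu + a * math.log((1.0 - p) / p)


def logistic_exceedance(x, mu, sigma):
    """P(X >= x) = 1 - F(x) for the logistic distribution function F."""
    a = math.sqrt(3.0) * sigma / math.pi
    return 1.0 - 1.0 / (1.0 + math.exp(-(x - mu) / a))


def corresponding_prob_logistic(p, mu_x, sigma_x, m, n):
    """Exceedance probability C of Y_C = Z_P - X_P in the logistic case."""
    x_p = logistic_quantile(p, mu_x, sigma_x)
    z_p = logistic_quantile(p, (1.0 + m) * mu_x, sigma_x * math.sqrt(1.0 + n**2))
    return logistic_exceedance(z_p - x_p, m * mu_x, n * sigma_x)


for p in (0.01, 0.50, 0.90):
    c = corresponding_prob_logistic(p, mu_x=1000.0, sigma_x=300.0, m=1.0, n=1.0)
    print(f"P = {p:.2f}  C = {c:.4f}")
```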

Experiment analysis for PE3 distribution
For a PE3 distribution as shown in Equation (13), the relationship between C and P is examined using MC experiments. Figure 2 shows a flow chart that illustrates the procedure of the MC experiment, the steps of which are as follows.
Step 1: Randomly generate the upstream flood volume Xrng and the downstream flood volume Zrng according to their numerical characteristics (EX, EZ, Cs, and Cv). Zrng minus Xrng is the intermediate flood volume Y.
Step 2: Repeat Step 1 100,000 times to obtain 100,000 random intermediate flood volumes Y.
Step 3: Plot the hydrologic frequency analysis curve of Y through 100,000 random numbers.
Step 4: Plot hydrologic frequency analysis curves of Z and X through their numerical characteristics.
Step 5: Based on design probability P, design floods Zp and Xp are obtained from the hydrologic frequency analysis curves of Z and X, respectively.
Step 6: Zp minus Xp is the corresponding flood volume Yc.
Step 7: Based on the Yc, exceedance probability C is obtained from the hydrological frequency analysis curve of Y.
Step 8: Find the critical point $P_0$.
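The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the procedure used in the study: it assumes that X and Z are drawn independently, that Cs > 0, and it replaces the fitted frequency curves of Steps 3 to 5 with empirical quantiles of the generated samples. All function names and parameter values are hypothetical.

```python
import random


def pe3_sample(ex, cv, cs, n, rng):
    """Draw n PE3 variates (Cs > 0) as a shifted gamma distribution."""
    alpha = 4.0 / cs**2
    scale = ex * cv * cs / 2.0            # equals 1 / beta
    a0 = ex * (1.0 - 2.0 * cv / cs)
    return [a0 + rng.gammavariate(alpha, scale) for _ in range(n)]


def empirical_quantile(sorted_vals, p_exceed):
    """Value exceeded by roughly a fraction p_exceed of the sorted sample."""
    n = len(sorted_vals)
    idx = min(n - 1, max(0, int(round((1.0 - p_exceed) * (n - 1)))))
    return sorted_vals[idx]


def mc_corresponding_prob(p, ex, ez, cv, cs, n=100_000, seed=1):
    """Estimate C for design probability P by Monte Carlo simulation."""
    rng = random.Random(seed)
    xs = pe3_sample(ex, cv, cs, n, rng)          # Step 1: upstream volumes
    zs = pe3_sample(ez, cv, cs, n, rng)          # Step 1: downstream volumes
    ys = [z - x for x, z in zip(xs, zs)]         # Steps 1-2: intermediate volumes
    x_p = empirical_quantile(sorted(xs), p)      # Step 5 (empirical stand-in)
    z_p = empirical_quantile(sorted(zs), p)
    y_c = z_p - x_p                              # Step 6: corresponding volume
    return sum(1 for y in ys if y >= y_c) / n    # Step 7: exceedance probability C


for p in (0.01, 0.90):
    c = mc_corresponding_prob(p, ex=1000.0, ez=2000.0, cv=0.5, cs=2.0, n=20_000)
    print(f"P = {p:.2f}  C = {c:.4f}")
```

Repeating the call over a grid of design probabilities gives the C versus P curve from which the critical point can be read off.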
For the PE3 distribution, we performed 100,000 random trials for the flood regional composition. In each of these trials, we produced the random upstream flood volume X and Cv = 1.5, Cs = 6.0. Therefore, six regional composition schemes are obtained in total. The design probabilities P of each regional composition scheme are 0.01, 0.1, 1, 2, 10, 20, 25, 30, 35, 40, 45, 50, 60, 75, 80, 90, 95, 97, 99, and 99.9%. A specific regional composition scheme is used as an  Table 1, and the relationship between the exceedance probabilities C and P is presented in Figure 3 (the same as regional composition scheme 2 in Figure 4).
As illustrated in Figure 3, there is a critical point $P_0$, which is the exceedance probability corresponding to the intersection of the two lines. When the exceedance probability of the design flood P is greater than the critical point $P_0$, the exceedance probability of the corresponding flood at the intermediate catchment C is less than P (C < P). When P is exactly equal to $P_0$, C = P. When P is smaller than $P_0$, C > P. These results show that, when the first EFRC method is used for the PE3 distribution, the relationship between C and P depends on the design probabilities of the volumes of upstream and downstream floods. This conclusion is consistent with that drawn from the theoretical derivations for the normal, EV1(2), and logistic distributions. More specifically, the $P_0$ values of the normal, EV1(2), and logistic distributions are 50, 42.96, and 50%, respectively, whereas the $P_0$ of the PE3 distribution is not fixed and may be greater than, less than, or equal to 50%.
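Locating the intersection of the C curve with the line C = P can be sketched as a sign-change search with linear interpolation. The example below is an assumption-laden illustration: the helper `critical_point` is hypothetical, and the C values are generated from a closed-form normal-case curve (with n = 1) as a stand-in for a simulated curve, so that the expected answer of 50% is known in advance.

```python
import math
from statistics import NormalDist


def critical_point(ps, cs):
    """Interpolate the P at which C - P changes sign; ps must be increasing."""
    diffs = [c - p for p, c in zip(ps, cs)]
    for i in range(len(ps) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 == 0.0:
            return ps[i]
        if d0 * d1 < 0.0:
            # Linear interpolation between the two bracketing grid points.
            return ps[i] + (ps[i + 1] - ps[i]) * d0 / (d0 - d1)
    return ps[-1] if diffs[-1] == 0.0 else None


# Design probabilities listed in the text, expressed as fractions.
DESIGN_P = [0.0001, 0.001, 0.01, 0.02, 0.10, 0.20, 0.25, 0.30, 0.35, 0.40,
            0.45, 0.50, 0.60, 0.75, 0.80, 0.90, 0.95, 0.97, 0.99, 0.999]


def normal_c(p, n=1.0):
    """Stand-in C(P) curve: normal case with sigma_y = n * sigma_x (assumed)."""
    u_p = NormalDist().inv_cdf(1.0 - p)
    return 1.0 - NormalDist().cdf(u_p * (math.sqrt(1.0 + n**2) - 1.0) / n)


p0 = critical_point(DESIGN_P, [normal_c(p) for p in DESIGN_P])
print(f"critical point P0 = {p0:.4f}")
```

With a coarse grid the interpolated value is only as accurate as the spacing around the crossing, which is one reason to sample the design probabilities densely between 30 and 60%.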
The $P_0$ of the typical regional composition in Figure 3 is 45%, which is less than 50%. Similarly, the $P_0$ of the other five regional compositions could be obtained. All of the results are shown in Figure 4 and 2.0, the range of critical point $P_0$ of the PE3 distribution will be between 45 and 50% (as shown in Table 2).
Taking into account the actual situation of design flood

Experiment analysis for LN(2) distribution
For the flood regional composition of LN(2) distribution (Equation (15)), we performed 100,000 randomized trials, which is similar to the experiment analysis of the PE3 distribution. Six regional composition schemes were constructed as representatives. For the downstream site, EZ is 2,000, and for upstream site, EX is 1,000. The Cv values of both upstream and downstream floods (volume) are 0.2, 0.4, 0.5, 0.7, 1.0, and 1.5.
We then used a specific regional composition scheme as an example, in which the parameters of the downstream flood (volume) are EZ = 2,000 and Cv = 0.7, and the parameters of the upstream flood (volume) are EX = 1,000 and Cv = 0.7. The results of the statistical experiment and the relationship between the exceedance probabilities C and P are shown in Figure 5 (regional composition scheme 4). For this scheme, the critical point $P_0$ of the LN(2) distribution is 45%, which is less than 50%.
Similarly, the critical point $P_0$ of the other five regional compositions could be obtained. All of the results are shown in Figure 5 and Table 2. The critical point $P_0$ of the LN(2) distribution is between 45 and 50%.
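The LN(2) experiment can be approximated with the sketch below, which uses the example scheme quoted above (EX = 1,000, EZ = 2,000, Cv = 0.7). It is a hedged illustration and not the authors' code: X and Z are assumed to be generated independently, the design volumes are taken from the analytical lognormal quantiles, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def ln2_params(ex, cv):
    """Return (mu_y, sigma_y) of ln(X) from EX and Cv."""
    sigma_y = math.sqrt(math.log(1.0 + cv**2))
    return math.log(ex) - 0.5 * sigma_y**2, sigma_y


def ln2_corresponding_prob(p, ex, ez, cv, n=100_000, seed=1):
    """Monte Carlo estimate of C for Y_C = Z_P - X_P with lognormal X and Z."""
    rng = random.Random(seed)
    mu_x, s_x = ln2_params(ex, cv)
    mu_z, s_z = ln2_params(ez, cv)
    u_p = NormalDist().inv_cdf(1.0 - p)
    x_p = math.exp(mu_x + s_x * u_p)     # upstream design volume
    z_p = math.exp(mu_z + s_z * u_p)     # downstream design volume
    y_c = z_p - x_p                      # corresponding intermediate volume
    exceed = 0
    for _ in range(n):
        y = rng.lognormvariate(mu_z, s_z) - rng.lognormvariate(mu_x, s_x)
        if y >= y_c:
            exceed += 1
    return exceed / n


for p in (0.01, 0.90):
    c = ln2_corresponding_prob(p, ex=1000.0, ez=2000.0, cv=0.7, n=20_000)
    print(f"P = {p:.2f}  C = {c:.4f}")
```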
Therefore, for the design flood regional composition of LN (2)

Experiment analysis for GEV distribution
A specific regional composition scheme is used as an example, in which the parameters of downstream floods (volume) are EZ = 2,000, Cs = 0.8, and Cv = 0.2, and the parameters of upstream floods (volume) are EX = 1,000, The results of the statistical experiment and the relationship between the exceedance probabilities C and P are shown in Figure 6 (regional composition scheme 1). For this scheme, the critical point $P_0$ of a typical GEV distribution is approximately 50%, which is similar to the values of the PE3 and LN(2) distributions.
In the same way, the critical point $P_0$ of the other five regional compositions could be obtained. All of the results are shown in Figure 6 and Table 2. The critical point $P_0$ of the GEV distribution is between 45 and 50%. In an actual design flood regional composition, the exceedance probability of the corresponding intermediate flood is greater than the design exceedance probability. For a design low-flow regional composition, the exceedance probability of the corresponding intermediate flow is less than the design exceedance probability (or design guarantee rate).

CONCLUSIONS
In the EFRC method, which is commonly used to resolve the regional composition of the design flood volume at various sub-catchments in natural conditions, the exceedance probability (or return period) of the estimated corresponding flood volume is unknown. This study performs theoretical derivations and MC experiments to investigate the relationship between the probability of the corresponding flood volume and that of the design flood volume at the dam site. The following conclusions are obtained.
1. A critical probability value $P_0$ exists in the EFRC method. When the exceedance probability of the downstream design flood volume P is greater than $P_0$, the exceedance probability of the corresponding flood volume C is less than P (C < P); when P is equal to $P_0$, C = P; and when P is less than $P_0$, C > P. The value of $P_0$ is related to the distribution function of the hydrological variables. For the normal, EV1(2), and logistic distributions, $P_0$ equals 50, 42.96, and 50%, respectively. For the PE3, LN(2), and GEV distributions, $P_0$ is not fixed. For the PE3 distribution, $P_0$ ranges from approximately 30 to 50%, whereas for both the LN(2) and GEV distributions, $P_0$ ranges from approximately 45 to 50% based on the MC experiments.
2. In terms of a design flood event, the design exceedance probability P is generally far less than 30% (e.g., P = 0.01, 0.1, or 1%); thus, the corresponding exceedance probability C is greater than P.
3. In terms of a design low-flow event, the design exceedance probability (or design guarantee rate) P is generally greater than 60% (e.g., P = 75, 90, or 99%); thus, the corresponding exceedance probability C is less than P. For example, if 90% guarantee-rate design flows occur in upstream site and downstream site, the guarantee rate of corresponding low-flow at intermediate

DATA AVAILABILITY
The data in this study were randomly generated through statistical experiments.