River flood season segmentation is an important measure for flood prevention. This study analyzes flood season segmentation methods theoretically and, by comparing different methods, puts forward a framework for proper flood season segmentation. The framework consists of a Fisher optimal partition method for determining the optimum number of sub-seasons, an ensemble approach for segmenting a defined flood season, and a nonparametric bootstrap combined with a fuzzy optimum selection method (NPB-FOS) for testing the rationality of the flood season staging schemes. The results show that different methods can yield different staging schemes. The rationality analysis indicates that the staging scheme obtained by the probability change-point (PCP) method is superior to the others. The flood season of the downstream reach of the Yellow River can be segmented into three sub-seasons, i.e. early flood season (01 June–20 July), main flood season (21 July–28 September), and late flood season (29 September–08 November). The segmentation results should play an active role in flood prevention.

  • The conjunctive use of Fisher optimal partition, ensemble approach, nonparametric bootstrap, and fuzzy optimum selection method applied to flood season segmentation.

  • The rational methods segment a flood season into several sub-seasons.

  • An optimal framework for reasonable flood season segmentation was proposed.

  • A case study in the Yellow River was carried out and reasonable results were obtained.

Graphical Abstract


With increasingly frequent extreme climate and extreme precipitation events, flood disasters seriously threaten citizens’ lives and restrict the sustainable development of the social economy. As estimated by The World Resources Institute, about 21 million people are affected by floods every year, and the number could reach 54 million in 2030 due to global climate change. However, a flood is not only a kind of disaster but also a precious resource. For important rivers, flood seasonality analyses and an appropriate scheme to segment an entire flood season into multiple sub-seasons help reduce flood risk and utilize water resources effectively. In some monsoon-influenced regions, floods often show significant seasonal changes due to the influence of monsoon or climate change. Therefore, how to scientifically study the seasonal characteristics of floods is a key issue in hydrology and water resource development (Cunderlik et al. 2004). Macdonald et al. (2010) analyzed the spatial and temporal variability of flood seasonality for 30 years in Wales and stated that noticeable regional variations in the timing and length of the flood season were evident. Ye et al. (2017) believed that understanding the causes of flood seasonality is helpful for better flood management, and they examined the seasonality of maximum annual floods and its variation at more than 250 basins across the contiguous USA. Ledingham et al. (2019) analyzed the seasonality of storm rainfall and flood runoff based on more than 500 rain gauge stations in Britain and gauged rainfall datasets, and found that the seasonal occurrences of annual daily rainfall and flood maxima are significantly different in dry lowland catchments. For a large river channel, the flood limiting water level (FLWL) is an important characteristic water level to control flood risk and should be determined based on the river's design flood criteria.
In the traditional regulation of river flood control, the FLWL tends to be fixed throughout the flood season (Jiang et al. 2015). However, a fixed FLWL ignores the seasonality of flooding and may result in unreasonable water transfer and management for the river systems. To solve this problem, multi-stage FLWL control has been proposed. Flood season segmentation is a prerequisite for multi-stage FLWL control, which means that the entire flood season must first be divided into sub-seasons.

Generally speaking, there are three main kinds of methods that can be applied to segment the entire flood season, i.e. the qualitative analysis method (Adhikari et al. 2016), statistical analysis method (Chen et al. 2013, 2015), and clustering analysis method (Mo et al. 2018; Al-Jawad et al. 2019). The flood season segmentation obtained by the qualitative analysis method relies on the researchers’ practical experience and subjective judgment, and only a small amount of quantitative calculation is used to verify the results. Besides, due to its strong subjectivity, segmentation results were often vague and rough and could not be applied to the practical operation (Bender & Simonovic 2000). Statistical analysis is a method that obtains the segmentation by statistically analyzing a large amount of hydrological data in flood season. Although river managers can easily use this kind of method for its explicit mathematical concept and clear calculation steps, there are still some shortcomings, such as the single index, the threshold selection subjectivity, and the sampling error (Esogbue et al. 1992). The clustering analysis, which can segment the flood season by calculating the degree of representative indicators, has an advantage due to sufficient mathematical foundation, clear and easy calculation, and more stable results. Therefore, it is a popular method at present. Representative methods of clustering analysis are the fractal analysis method (FAM) (Hajkowicz & Collins 2007), change-point analysis (Liu et al. 2010), and set pair analysis (Nicklow et al. 2010). 
In summary, even though many studies have been carried out to obtain the results of flood season staging, there are still some problems that remain to be solved: (1) During the studies for flood season staging, the researchers usually classify a flood season into several sub-flood seasons subjectively, but for a particular river channel, especially large river channels, how many sub-flood seasons should be segmented? This is still waiting to be answered. (2) Different methods or indexes will result in different flood season staging schemes, but which one is the most suitable for a river flood season segmentation? (3) How to test the rationality of flood segmentation is also a key problem.

Therefore, the objectives of this paper are to (1) determine the optimum numbers of segmentation by the Fisher optimal partition method; (2) segment a defined flood season into several sub-seasons by an ensemble approach combined with the FAM, mean change-point (MCP) analysis method, and probability change-point (PCP) analysis method; and (3) test the rationality of the flood season staging scheme and make an optimal selection by a nonparametric bootstrap combined with a fuzzy optimum selection (NPB-FOS) method. This study aims to overcome the existing problems in the previous studies and to realize the reasonable segmentation of river flood season.

Fisher optimal partition method

The Fisher optimal partition method is a kind of statistical clustering approach and it aims to determine whether each class has a significant difference in the optimal partition or not. The detailed steps are given as follows (Roach et al. 2018).

Step 1. Define the class diameter

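Class diameter is the index that reflects the degree of difference among the samples within a class. The greater the diameter, the greater the difference is. Assume that a class is $G = \{x_i, x_{i+1}, \ldots, x_j\}$ ($j > i$), and the main equations are as follows:
(1)  $\bar{x}_G = \dfrac{1}{j-i+1}\sum_{t=i}^{j} x_t$
(2)  $D(i,j) = \sum_{t=i}^{j} \left(x_t - \bar{x}_G\right)^2$
where $\bar{x}_G$ is the mean value and $D(i,j)$ is the diameter (the total sum of squares of the samples in the class).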

Step 2. Define loss function

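Suppose that n samples can be classified into k classes, $b(n,k)$: $\{x_{i_1}, \ldots, x_{i_2-1}\}, \{x_{i_2}, \ldots, x_{i_3-1}\}, \ldots, \{x_{i_k}, \ldots, x_n\}$, with $1 = i_1 < i_2 < \cdots < i_k \le n$,
and the loss function is as follows:
(3)  $L[b(n,k)] = \sum_{t=1}^{k} D(i_t, i_{t+1}-1)$, with $i_{k+1} = n+1$
where $D(i,j)$ is the diameter of the class $\{x_i, \ldots, x_j\}$.
When n and k are known, the optimal segmentation $P(n,k)$ is the one that minimizes the loss function:
(4)  $L[P(n,k)] = \min L[b(n,k)]$

The corresponding points are the optimal dividing points.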

Step 3. Recursive formula

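The optimal k segmentation of ordered samples is always based on an optimal k–1 segmentation, and the formulas are as follows:
(5)  $L[P(n,2)] = \min_{2 \le j \le n} \{ D(1,j-1) + D(j,n) \}$
(6)  $L[P(n,k)] = \min_{k \le j \le n} \{ L[P(j-1,k-1)] + D(j,n) \}$
where $D(i,j)$ is the diameter of the class $\{x_i, \ldots, x_j\}$ and $L[P(n,k)]$ is the minimum loss for dividing n samples into k classes.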

Step 4. Determine the optimum number of the sub-flood seasons

Two ways can be adopted to determine the optimum number of the sub-flood seasons:

  • (1)

    Plot the curve variation on the base of the number of the optimal partition k and the minimum loss function, and the k at the bend of the curve can be regarded as the optimum number of the sub-flood seasons.

  • (2)
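    Calculate the nonnegative slope as follows:
    (7)  $\beta(k) = \dfrac{L[P(n,k-1)] - L[P(n,k)]}{k-(k-1)}$
where $L[P(n,k)]$ is the minimum loss for k classes. If $\beta(k)$ is large, it shows that k classes are better than k–1 classes. If $\beta(k)$ approaches 0, it is not necessary to continue to classify. Generally, k is the number of the optimal partition when $\beta(k)$ is maximized.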
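
The four steps above can be sketched as a small dynamic program in Python. This is a minimal illustration rather than the authors' implementation: the function names `class_diameter` and `fisher_partition`, the toy series, and the use of the plain difference of minimum losses as the nonnegative slope are all assumptions made for the example.

```python
import numpy as np

def class_diameter(x):
    """D[i, j] = sum of squared deviations of x[i..j] (inclusive, 0-based) from its mean."""
    n = len(x)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            seg = x[i:j + 1]
            D[i, j] = np.sum((seg - seg.mean()) ** 2)
    return D

def fisher_partition(x, k_max):
    """Return {k: (minimum loss, 1-based start positions of classes 2..k)}."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    D = class_diameter(x)
    # loss[m, k]: minimum loss for splitting the first m samples into k classes
    loss = np.full((n + 1, k_max + 1), np.inf)
    cut = np.zeros((n + 1, k_max + 1), dtype=int)
    loss[1:, 1] = D[0, :]              # one class: diameter of the first m samples
    for k in range(2, k_max + 1):
        for m in range(k, n + 1):
            for j in range(k, m + 1):  # j = 1-based start of the last class
                cand = loss[j - 1, k - 1] + D[j - 1, m - 1]
                if cand < loss[m, k]:
                    loss[m, k], cut[m, k] = cand, j
    out = {}
    for k in range(1, k_max + 1):
        starts, m = [], n
        for kk in range(k, 1, -1):     # backtrack through the stored cut points
            j = int(cut[m, kk])
            starts.append(j)
            m = j - 1
        out[k] = (float(loss[n, k]), sorted(starts))
    return out

# Hypothetical ordered series with three obvious regimes (not data from this study)
series = [1, 1, 1, 5, 5, 5, 9, 9, 9]
result = fisher_partition(series, k_max=4)
# Assumed form of the nonnegative slope: drop in minimum loss from k-1 to k classes
slope = {k: result[k - 1][0] - result[k][0] for k in range(2, 5)}
for k in range(1, 5):
    print(k, result[k], slope.get(k))
```

For this toy series the minimum loss falls to zero at k = 3, with the second and third classes starting at samples 4 and 7, which is the bend of the loss curve that Step 4 looks for.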

Ensemble methods for flood season segmentation

In this study, an ensemble method combining several common methods (the FAM, the MCP analysis method, and the PCP analysis method) is used to segment the flood season. The main principles of these methods are introduced below.

Fractal analysis method

The FAM was proposed by Mandelbrot (1982). Assume that the number of days in the flood season period is T, and therefore the capacity dimension of this flood season period can be expressed as:
(8)
(9)
where is the number of samples whose values are larger than a given threshold in the flood season period; is the time scale, is the step size; b is the gradient of the straight line in diagram; and d is the topological dimension. In this study, d = 2 (the samples of runoff series are two-dimensional data).

Calculate the capacity dimension of different flood season periods. If the capacity dimension of one period is similar to another period's capacity dimension, it can be considered that the two periods belong to the same sub-flood season.

MCP analysis method

It is well known that the starting point of the flood season is usually the time when the rainfall changes from less to more, and that the end point of the flood season would be a time when the rainfall varies from more to less, which could be considered as change points. Therefore, the change-point analysis method can be used to determine the corresponding change points. The time series xi that follows a normal distribution and is mutually independent can be expressed as follows (Lin et al. 2020):
(10)
where ei is the random error, and if it is independent and equal variance, E(ei) = 0, assuming that

Obviously, will be an MCP when .

The least-square method is usually applied to determine the MCP, especially in the estimation of multiple change points. Moreover, an independence test needs to be done before using the MCP method.
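
As a rough illustration of the least-squares idea for a single change point, the sketch below scans every candidate split and keeps the one with the smallest within-segment sum of squared errors. The function name `mean_change_point` and the toy series are hypothetical, and the stepwise adjustment for multiple change points used later in the case study is not reproduced here.

```python
import numpy as np

def mean_change_point(x):
    """Return (m, sse): 0-based index m where the second segment starts,
    chosen to minimise the summed squared deviations from each segment mean."""
    x = np.asarray(x, dtype=float)
    best_m, best_sse = None, np.inf
    for m in range(1, len(x)):
        left, right = x[:m], x[m:]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if sse < best_sse:
            best_m, best_sse = m, float(sse)
    return best_m, best_sse

# Hypothetical series whose mean jumps after the fourth value
toy = [2, 2, 2, 2, 8, 8, 8]
m, sse = mean_change_point(toy)
print(m, sse)
```

For two change points, one possible extension is to apply the same scan over pairs of split positions, at a correspondingly higher computational cost.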

PCP analysis method

The PCP analysis method based on peak-over-threshold (POT) sampling is commonly used in flood season analysis due to its advantage of good unbiasedness and effectiveness. The cumulative time method is widely used to obtain the PCP (Efron 1979; Perreault et al. 2000), which defines a statistical variable :
(11)
where represents the number of the specified event that occurs during k tests; represents the number of the same event that occurs in all (N times) tests.

Obviously, is the frequency of specified event that occurs in k test, and is the frequency of the specified event that occurs in all tests.

Based on the definition of the expectation, the can be expressed as follows:
(12)
where p1 is the probability of the specified event that occurs before the PCP, p2 is the probability of the specified event that occurs after the PCP, and m is the assumed initial PCP.

NPB-FOS method and optimization procedure

The NPB-FOS method is mainly composed of four theoretical methods: the nonparametric bootstrap, mathematical statistics, relative membership degree (RMD), and two-stage fuzzy optimum selections. The purpose of nonparametric bootstrap is to build a variety of bootstrap samples with the distribution characteristics of original samples. The mathematical statistic is used to compute the relative frequency of annual maximum flood in each sub-flood season and the average peak flood during the main flood season in different staging schemes. The RMD method can calculate the RMD of the relative frequency in each sub-flood season and the average peak flood during the main flood season in different staging schemes. The two-stage fuzzy optimum selection method is applied to integrate the membership degrees to get the two-stage fuzzy membership degree that can test multiple staging schemes’ rationality and make an optimum selection. The detailed steps are described as follows.

Step 1. Nonparametric bootstrap

To obtain a variety of approximations that meet the overall distribution characteristics of the original samples, Efron (1979) proposed a ‘re-sampling’ approach called the ‘Bootstrap’. The nonparametric bootstrap is the basic form of the bootstrap: it draws samples randomly with replacement from the original samples. Therefore, in this study the nonparametric bootstrap is applied to obtain a large number of new samples without making any distributional hypothesis about the original samples.
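
A minimal sketch of resampling with replacement is given below, assuming NumPy is available. The function name `bootstrap_samples`, the seed, and the peak values are hypothetical and are not taken from the Huayuankou record.

```python
import numpy as np

def bootstrap_samples(data, n_boot, seed=None):
    """Draw n_boot resamples, each the same size as data, with replacement."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    # Each row of idx holds the positions drawn for one bootstrap replicate
    idx = rng.integers(0, len(data), size=(n_boot, len(data)))
    return data[idx]

# Hypothetical annual peak discharges (m3/s), for illustration only
peaks = np.array([5200.0, 7860.0, 4300.0, 12300.0, 6100.0, 9400.0])
reps = bootstrap_samples(peaks, n_boot=1000, seed=42)
boot_means = reps.mean(axis=1)  # one statistic per replicate
print(reps.shape, boot_means.mean())
```

Any statistic of interest, such as the relative frequency of annual maxima in a sub-season, can then be evaluated on each row of `reps` to approximate its sampling distribution.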

Step 2. Relative frequency and confidence intervals

Apply annual maximum sampling to obtain N flood events, record the number of annual maximum flood in each sub-flood season, and then calculate the initial relative frequency by the following equation:
(13)
where bi is the occurrence times of annual maximum flood during the ith sub-flood season, i = 1, 2, …, M, and M is the number of sub-flood seasons in a flood season.
Due to the numbers of days in each sub-flood season being different, the initial relative frequency needs to be adjusted and the adjusted relative frequency obtained with the following equation:
(14)
where a is the number of days in the whole flood season, and ai is the number of days in the ith sub-flood season.
To make the total relative frequency in each sub-flood season equal to 1 (assume that the sum is ), the relative frequency can be expressed as follows:
(15)
The confidence interval estimation formula of the normal distribution can be used to calculate the upper and lower limit of the confidence interval of relative frequency if it follows a normal distribution. The upper limit and lower limit of the confidence interval are estimated by the following formulas:
(16)
(17)
where is the sample mean; is the quantile of distribution, and S is the standard deviation of the samples.

Step 3. Relative membership degree

X is the matrix which is composed of the characteristic index value:
(18)
where k is the number of the sub-flood seasons; M is the number of the characteristic index in each sub-flood season.
The RMD of each characteristic value in each sub-flood season can be calculated by the following equations:
(19)
(20)
According to the complementary fuzzy set, the relative inferiority degree can be expressed as follows:
(21)

Step 4. Two-stage fuzzy membership degree

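Usually, we use ‘good’ or ‘bad’ to evaluate multiple things; the two-stage fuzzy membership combines the membership degree and the inferiority degree to evaluate the ‘good’ and ‘bad’ of a segmentation, and it can be calculated by the following equation:
(22)  $u_j = \dfrac{1}{1 + \dfrac{\sum_{i} \left[ w_i \left(1 - r_{ij}\right) \right]^2}{\sum_{i} \left( w_i r_{ij} \right)^2}}$
where $w_i$ is the weight of the ith characteristic index and $r_{ij}$ is the RMD of the ith characteristic index in the jth segmentation. When $u_j$ equals 1, the jth segmentation is the optimal one.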

Step 5. Optimization procedure

Assume that a flood season is divided into three sub-flood seasons, i.e. early flood season, main flood season, and late flood season. The guidelines for examining the rationality of the multiple flood season staging schemes and making an optimum selection by the proposed methods are as follows:

  • (1)

    Use the nonparametric bootstrap method to obtain N bootstrap samples from the daily inflow series, then record the number of annual maximum floods during each sub-flood season in each staging scheme, and calculate the relative frequency of each sub-flood season.

  • (2)

    Calculate the upper and lower limit of the relative frequency during each sub-flood season in each staging scheme and take the mean value of all staging schemes as the upper and lower limit of confidence intervals for each sub-flood season.

  • (3)
    Take the upper limit of main flood season and the lower limit , of early flood season and late flood season as the benchmark index (confidence level ). The discrepancy extent between the relative frequency during each sub-flood season in each staging scheme and the benchmark index is represented by (generalized distance), and it can be expressed as follows:
    (23)
  • (4)

    Calculate the RMD () in each staging scheme according to generalized distance, and figure out the of the mean peak flow during the main flood season in different staging schemes.

  • (5)

    Put the RMD of the three sub-flood seasons in the same staging scheme into Equation (22), and assign an equal weight wi, as suggested by Chen et al. (2015), to each sub-flood season to calculate the two-stage fuzzy membership degree of the relative occurrence frequency of the annual maximum flood (simplified as the membership degree of relative frequency).

  • (6)

    Under multiple combinations of the two weights, obtain the frequency–peak flood flow two-stage fuzzy membership degree by putting the membership degree of relative frequency and the RMD of the mean peak flow of the same staging scheme into Equation (22).

  • (7)

    Test the rationality and make an optimal selection according to the two-stage fuzzy membership degree. As suggested by Liu et al. (2010), when it is less than 0.8, the deviation between the jth staging scheme and the bootstrap samples is large, so the scheme can be regarded as unreasonable and should be abandoned. Among the staging schemes whose value is greater than 0.8, the one with the largest value is chosen as the optimal one.

In this study, the downstream reaches of the Yellow River were selected as the case study for evaluating the proposed framework.

The downstream reaches of the Yellow River refer to the reaches below the Huayuankou section, as shown in Figure 1, with a total length of 780 km.

Figure 1

The general situation and the locations of the different channel pattern reaches of the lower Yellow River.


The basin area upstream of the Huayuankou section is 730,000 km2, which accounts for 97.1% of the Yellow River basin area. The runoff above the Huayuankou section accounts for 96.6% of the Yellow River total runoff. The variation of runoff and sediment in the Huayuankou section plays a leading role in flood control both in the upper and lower reaches of the Yellow River. The annual sediment transport passing through the Huayuankou section amounts to 1.6 billion tons/year, making the flood control situation complex.

There are 110 dangerous works and control projects in the downstream reaches of the Yellow River; the length of the project is 305.2 km, and the length of protection is 261.3 km. There are 2,830 dam stacks that play an important role in controlling the river regime.

In recent years, a lot of research has been carried out on the topic of flood control in the downstream reaches of the Yellow River, but there are few studies on the rational segmentation of the flood season.

In this paper, the maximum flood peak discharge data from 1950 to 2015 at the Huayuankou station were used, as shown in Figure 2.

Figure 2

The maximum annual flood peak discharge at the Huayuankou station.


There are five main control hydrology survey stations in the downstream reaches of the Yellow River, and the design flood on each station section is shown in Table 1 (Cai et al. 2021).

Table 1

Design flood discharge on the downstream reaches of the Yellow River (m3/s)

| Station | Probability 0.1% | Probability 1% | Probability 2% | Probability 5% | Probability 10% |
|---|---|---|---|---|---|
| Huayuankou | 22,600 | 15,700 | 12,500 | 10,000 | 7,500 |
| Jiahetan | 21,000 | 15,070 | 12,056 | 9,645 | 7,200 |
| Gaocun | 20,300 | 14,400 | 11,520 | 9,200 | 7,000 |
| Sunkou | 18,100 | 13,000 | 10,400 | 8,300 | 6,200 |
| Aishan | 10,175 | 9,775 | 7,800 | 6,200 | 4,500 |

The flood prevention in the downstream reaches of the Yellow River is under the Yellow River Conservancy Commission (YRCC) supervision led by the Ministry of Water Resources of China. From the traditional rule, the start and end dates of the flood season are 1 June and 1 October without considering the period features of the entire flood season. The data used in this study are daily rainfall and daily flow. All the data are provided by the YRCC.

Optimum numbers of segmentation

As shown in Figure 3(a), the curve was plotted based on k (the number of partitions) and the minimum loss function. The inflection of the curve lay at k = 3, where the loss function changed most sharply. Also, as shown in Figure 3(b), the nonnegative slope reached its maximum of 3.1806 at k = 3. According to the Fisher optimal partition method principle, the optimal number of classes is 3. In summary, the flood season would be most appropriately segmented into three stages, i.e. early flood season, main flood season, and late flood season.

Figure 3

Analysis results of the optimal partition number.


Comparison of different flood season segmentation methods

Segmentation by the FAM

Early flood season

The maximum daily rainfall from 1960 to 2019 was used as the fractal index. The average annual precipitation is 652.9 mm. The longest continuous rainfall is 11 days (in 1957). The maximum annual precipitation is 1041.3 mm. The minimum annual rainfall is 375.9 mm. As suggested by Liu et al. (2010), the thresholds were 1.1 times the average value of the samples during each stage. The beginning time of the early flood season is 1 June. As shown in Table 2, T initially covered the period 01 June–30 June, that is to say, T = 30 days. Then, the capacity dimension was calculated for T = 30–50 days. According to Table 2, when T = 50 days (01 June–20 July), the slope b deviated markedly: the maximum deviation was 0.089, which was 5.18% (>5%) of the minimum capacity dimension (1.717), indicating that the capacity dimension mutated at T = 50 days. When T = 30–45 days, the maximum deviation was 0.015, which was 0.83% of the minimum capacity dimension. The capacity dimensions were close, and therefore these periods could be considered to belong to the same sub-flood season. Thus, the duration of the early flood season was from 1 June to 15 July (T = 45 days).

Table 2

Determination of early flood season by the FAM

| Time period | T (days) | Threshold level | Slope b | Capacity dimension |
|---|---|---|---|---|
| 01 Jun.–30 Jun. | 30 | 53.6 | 0.184 | 1.821 |
| 01 Jun.–10 Jul. | 40 | 56.7 | 0.180 | 1.836 |
| 01 Jun.–15 Jul. | 45 | 57.4 | 0.195 | 1.798 |
| 01 Jun.–20 Jul. | 50 | 63.5 | 0.269 | 1.717 |

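
The deviation check applied to Table 2 can be retraced with a few lines of Python. The sketch below assumes, as the text describes, that the deviation is the spread of the slope values over the periods considered, expressed as a fraction of the smallest capacity dimension among them; the helper name `deviation_ratio` is hypothetical.

```python
# Slope and capacity dimension by T (days), copied from Table 2
slopes = {30: 0.184, 40: 0.180, 45: 0.195, 50: 0.269}
dims = {30: 1.821, 40: 1.836, 45: 1.798, 50: 1.717}

def deviation_ratio(periods):
    """Return (max slope deviation, deviation / minimum capacity dimension)."""
    s = [slopes[t] for t in periods]
    d = [dims[t] for t in periods]
    max_dev = max(s) - min(s)
    return max_dev, max_dev / min(d)

dev_45, ratio_45 = deviation_ratio([30, 40, 45])
dev_50, ratio_50 = deviation_ratio([30, 40, 45, 50])
print(f"T <= 45: {dev_45:.3f} ({ratio_45:.2%})")
print(f"T <= 50: {dev_50:.3f} ({ratio_50:.2%})")
```

The ratio stays below the 5% limit up to T = 45 days and exceeds it once T = 50 days is included, which is the basis for ending the early flood season on 15 July.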
Main flood season

Suppose that T = 30 days (16 July–14 August); the capacity dimension for T = 30–70 days was then calculated with 10 days as the step. As shown in Table 3, for T = 30–70 days the maximum deviation of the slope b is 0.037, which is 2.19% of the minimum capacity dimension (1.687), so the capacity dimensions are close and these periods were considered to belong to the same sub-flood season. When the period was extended beyond T = 70 days with 5 days as the step size, the slope b deviated markedly: the maximum deviation was 0.109, which is 6.46% (>5%) of the minimum capacity dimension. Therefore, the duration of the main flood season is from 16 July to 23 September (T = 70 days).

Table 3

Determination of the main flood season by the FAM

| Time period | T (days) | Threshold level | Slope b | Capacity dimension |
|---|---|---|---|---|
| 16 Jul.–14 Aug. | 30 | 87.9 | 0.273 | 1.740 |
| 16 Jul.–24 Aug. | 40 | 85.0 | 0.267 | 1.725 |
| 16 Jul.–03 Sept. | 50 | 89.4 | 0.304 | 1.687 |
| 16 Jul.–13 Sept. | 60 | 86.3 | 0.293 | 1.704 |
| 16 Jul.–23 Sept. | 70 | 80.5 | 0.283 | 1.713 |

Late flood season

In the same way, the late flood season takes 24 September as the beginning date. The initial assumption is T = 30 days (24 September–23 October), and then the capacity dimension of T = 30–50 days was calculated with 10 days as the step. It could be seen from Table 4 that the slope of each is close. Moreover, each capacity dimension is approximately close, and hence, they could be considered to be of the same stage. In T = 30–50 days, the results showed that the maximum deviation of the slope of each is 0.026, which is 1.81% of the minimum capacity dimension (1.436). Therefore, the late flood season was from 24 September to 12 November (T = 50 days).

Table 4

Determination of late flood season by the FAM

| Time period | T (days) | Threshold level | Slope b | Capacity dimension |
|---|---|---|---|---|
| 24 Sept.–23 Oct. | 30 | 50.3 | 0.552 | 1.438 |
| 24 Sept.–03 Nov. | 40 | 50.5 | 0.563 | 1.436 |
| 24 Sept.–12 Nov. | 50 | 45.7 | 0.537 | 1.455 |

In summary, the FAM was applied to divide the flood season into the early flood season (01 June–15 July, T = 45 days), main flood season (16 July–23 September, T = 70 days), and late flood season (24 September–12 November, T = 50 days).
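
The durations quoted for the three FAM sub-seasons can be checked with inclusive day counts. The short sketch below uses Python's standard `datetime` module; the helper name `days_inclusive` and the choice of 2015 as the calendar year are arbitrary assumptions, since the June to November day counts do not depend on the year.

```python
from datetime import date

def days_inclusive(start, end):
    """Number of days from start to end, counting both end dates."""
    return (end - start).days + 1

YEAR = 2015  # arbitrary; the result is the same for any year
early = days_inclusive(date(YEAR, 6, 1), date(YEAR, 7, 15))
main = days_inclusive(date(YEAR, 7, 16), date(YEAR, 9, 23))
late = days_inclusive(date(YEAR, 9, 24), date(YEAR, 11, 12))
print(early, main, late)  # 45 70 50
```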

Segmentation by MCP analysis

SPSS software was applied to generate a series of the maximum daily flow and maximum 7-day flood volume during flood season based on the daily flow and 7-day flood volume at the Huayuankou hydrometric section.

According to the results of the Fisher optimal partition method, the flood season should be divided into three stages, namely, the early flood season, the main flood season, and the late flood season. We take 15 days as the calculation step length and extend the selected duration forward and backward. As a result, the ranges of variation of the two change points were 18 June–20 July and 18 August–30 September. The MCP analysis is carried out by the least-square method, and the stepwise adjustment method is used to gradually adjust the staging results. The results are shown in Table 5. It can be seen from Table 5 that, with the index of maximum daily flow, the flood season can be segmented as early flood season (01 June–15 July), main flood season (16 July–23 September), and late flood season (24 September–03 November); while with the index of maximum 7-day flood volume, the flood season can be segmented as early flood season (01 June–17 July), main flood season (18 July–25 September), and late flood season (26 September–05 November).

Table 5

Segmentation by the MCP method

| Index | Early flood season | Main flood season | Late flood season |
|---|---|---|---|
| Maximum daily flow | 01 Jun.–15 Jul. | 16 Jul.–23 Sept. | 24 Sept.–03 Nov. |
| Maximum 7-day flood volume | 01 Jun.–17 Jul. | 18 Jul.–25 Sept. | 26 Sept.–05 Nov. |

Segmentation by PCP analysis

According to the current situation of river flow and regulation rules, the flows 4,500, 5,500, 6,500, 7,500, and 8,500 m3/s were used as the thresholds for flow POT sampling. Through MCP analysis, the initial two change points were 1 August and 1 October. The results obtained by the PCP analysis method are shown in Table 6. With the threshold of 4,500 m3/s, the flood season can be segmented as early flood season (01 June–15 July), main flood season (16 July–23 September), and late flood season (24 September–03 November); with the threshold of 5,500 m3/s, as early flood season (01 June–17 July), main flood season (18 July–25 September), and late flood season (26 September–05 November); with the threshold of 6,500 m3/s, as early flood season (01 June–20 July), main flood season (21 July–28 September), and late flood season (29 September–08 November); with the threshold of 7,500 m3/s, as early flood season (01 June–20 July), main flood season (21 July–23 September), and late flood season (24 September–10 November); and with the threshold of 8,500 m3/s, as early flood season (01 June–25 July), main flood season (26 July–25 September), and late flood season (26 September–12 November).

Table 6

Segmentation by the PCP analysis method

| Threshold (m3/s) | Early flood season | Main flood season | Late flood season |
|---|---|---|---|
| 4,500 | 01 Jun.–15 Jul. | 16 Jul.–23 Sept. | 24 Sept.–03 Nov. |
| 5,500 | 01 Jun.–17 Jul. | 18 Jul.–25 Sept. | 26 Sept.–05 Nov. |
| 6,500 | 01 Jun.–20 Jul. | 21 Jul.–28 Sept. | 29 Sept.–08 Nov. |
| 7,500 | 01 Jun.–20 Jul. | 21 Jul.–23 Sept. | 24 Sept.–10 Nov. |
| 8,500 | 01 Jun.–25 Jul. | 26 Jul.–25 Sept. | 26 Sept.–12 Nov. |

Ensemble segmentation approach

The flood season segmentation in the downstream reaches of the Yellow River by an ensemble approach is shown in Table 7.

Table 7

The flood season segmentation with an ensemble method

| Method | Sample | Early flood season | Main flood season | Late flood season |
|---|---|---|---|---|
| FAM | Maximum daily rainfall | 01 Jun.–15 Jul. | 16 Jul.–23 Sept. | 24 Sept.–12 Nov. |
| MCP | Maximum daily flow | 01 Jun.–15 Jul. | 16 Jul.–23 Sept. | 24 Sept.–03 Nov. |
| MCP | Maximum 7-day flood volume | 01 Jun.–17 Jul. | 18 Jul.–25 Sept. | 26 Sept.–05 Nov. |
| PCP | Threshold 4,500 m3/s | 01 Jun.–15 Jul. | 16 Jul.–23 Sept. | 24 Sept.–03 Nov. |
| PCP | Threshold 5,500 m3/s | 01 Jun.–17 Jul. | 18 Jul.–25 Sept. | 26 Sept.–05 Nov. |
| PCP | Threshold 6,500 m3/s | 01 Jun.–20 Jul. | 21 Jul.–28 Sept. | 29 Sept.–08 Nov. |
| PCP | Threshold 7,500 m3/s | 01 Jun.–20 Jul. | 21 Jul.–23 Sept. | 24 Sept.–10 Nov. |
| PCP | Threshold 8,500 m3/s | 01 Jun.–25 Jul. | 26 Jul.–25 Sept. | 26 Sept.–12 Nov. |

Rationality examination and optimized selection

To ensure that the bootstrap sampling correctly reflects the distribution characteristics of the population and to reduce the sampling deviation, we extracted 10,000 samples each time with replacement and repeated this 10,000 times. According to the NPB-FOS method, the membership degree of relative frequency and the RMD of the mean peak flood flow in the main flood season were calculated, respectively, at a significance level of 0.05, and the results are shown in Table 8.

Table 8

RMD of the frequency and average flood peak in main flood season

| Methods and samples | Membership degree of relative frequency | RMD of mean peak flood flow in main flood season |
|---|---|---|
| FAM | 0.9968 | 0.9051 |
| MCP (max. daily) | 0.9980 | 0.9408 |
| MCP (max. 7 days) | 1.0000 | 0.9502 |
| PCP (4,500 m3/s) | 0.9996 | 0.9087 |
| PCP (5,500 m3/s) | 0.9975 | 0.9454 |
| PCP (6,500 m3/s) | 0.9982 | 0.9665 |
| PCP (7,500 m3/s) | 0.9990 | 0.9525 |
| PCP (8,500 m3/s) | 0.6550 | 1.0000 |

Table 8 shows that, if only the occurrence frequency of annual maximum floods in each sub-season is taken into account, the best flood season segmentation scheme is the one obtained by the MCP method with the maximum 7-day flood volume as the index. The reason is that this segmentation scheme spans the longest duration during the main flood season, and thus the annual maximum flood occurs most frequently within it.

As far as the mean peak flood flow in the main flood season is concerned, the scheme obtained by the PCP method with a threshold of 8,500 m³/s is the best one, and its main flood season is the shortest. The shorter the main flood season, the larger the mean peak flood flow in it tends to be. It was therefore difficult to select the optimum segmentation scheme from the RMD of the relative frequency or the RMD of the mean peak flood flow in the main flood season alone. To consider both the frequency and the magnitude of flood occurrence, the NPB-FOS method was applied. In the optimization process, we varied the weight of each index from 0.3 to 0.7 in steps of 0.1 while keeping the two weights summing to 1, so as to reduce the influence of any particular choice of weights. The results of the two-stage fuzzy optimum selection are shown in Table 9.

Table 9

Two-stage fuzzy comprehensive relative membership degree of different segmentations


| Main flood season | Weights (frequency : mean peak flow) = 0.3:0.7 | 0.4:0.6 | 0.5:0.5 | 0.6:0.4 | 0.7:0.3 |
|---|---|---|---|---|---|
| 16 Jul.–23 Sept. | 0.9911 | 0.9929 | 0.9950 | 0.9971 | 0.9986 |
| 16 Jul.–23 Sept. | 0.9967 | 0.9974 | 0.9981 | 0.9989 | 0.9994 |
| 18 Jul.–25 Sept. | 0.9977 | 0.9982 | 0.9987 | 0.9992 | 0.9996 |
| 16 Jul.–23 Sept. | 0.9919 | 0.9935 | 0.9954 | 0.9973 | 0.9987 |
| 18 Jul.–25 Sept. | 0.9970 | 0.9976 | 0.9983 | 0.9990 | 0.9995 |
| 21 Jul.–28 Sept. | 0.9983 | 0.9987 | 0.9990 | 0.9994 | 0.9997 |
| 21 Jul.–23 Sept. | 0.9979 | 0.9983 | 0.9988 | 0.9993 | 0.9996 |
| 26 Jul.–25 Sept. | 0.9800 | 0.9572 | 0.9226 | 0.8795 | 0.8362 |


As shown in Table 9, the comprehensive relative membership degree of the segmentation whose main flood season is 21 July–28 September is the largest for every weight combination. The segmentation obtained by the PCP method with a threshold of 6,500 m³/s is therefore the desirable scheme, as it takes both the relative frequency of occurrence and the mean peak flood flow into consideration.
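The weight sweep can be sketched in a few lines. The formula below is an assumption, not taken from this section: it is a commonly used form of the fuzzy optimum selection model with weighted Euclidean distances, u = 1 / (1 + (d_good / d_bad)^2), where d_good is the distance to the ideal scheme (all RMDs equal to 1) and d_bad the distance to the worst scheme (all RMDs equal to 0). The RMD pairs are copied from Table 8, and the function names are hypothetical. Under this assumption the computed values are close to those in Table 9 for most rows, although not identical in every cell, and the ranking of the best scheme is the same.

```python
from math import sqrt

# RMDs copied from Table 8: (occurrence frequency, mean peak flood flow).
TABLE_8 = {
    "FAM": (0.9968, 0.9051),
    "MCP (max. daily)": (0.9980, 0.9408),
    "MCP (max. 7 days)": (1.0000, 0.9502),
    "PCP (4,500 m3/s)": (0.9996, 0.9087),
    "PCP (5,500 m3/s)": (0.9975, 0.9454),
    "PCP (6,500 m3/s)": (0.9982, 0.9665),
    "PCP (7,500 m3/s)": (0.9990, 0.9525),
    "PCP (8,500 m3/s)": (0.6550, 1.0000),
}

# Weight pairs (frequency, mean peak flow), from 0.3 to 0.7 in steps of 0.1.
WEIGHTS = [(0.3, 0.7), (0.4, 0.6), (0.5, 0.5), (0.6, 0.4), (0.7, 0.3)]


def comprehensive_rmd(r, w):
    """Assumed fuzzy optimum model: u = 1 / (1 + (d_good / d_bad)^2)."""
    # Weighted Euclidean distance to the ideal scheme (every RMD equal to 1).
    d_good = sqrt(sum((wi * (1.0 - ri)) ** 2 for wi, ri in zip(w, r)))
    # Weighted Euclidean distance to the worst scheme (every RMD equal to 0).
    d_bad = sqrt(sum((wi * ri) ** 2 for wi, ri in zip(w, r)))
    return 1.0 / (1.0 + (d_good / d_bad) ** 2)


def rank_schemes(weights):
    """Return the best scheme name and all scores for one weight pair."""
    scores = {name: comprehensive_rmd(r, weights) for name, r in TABLE_8.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    for w in WEIGHTS:
        best, scores = rank_schemes(w)
        print(w, best, round(scores[best], 4))
```

Sweeping the weights rather than fixing one pair is what allows the conclusion to be stated independently of the weighting: a scheme that ranks first under every pair does not owe its position to a particular preference between frequency and magnitude.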

Determining the optimal number of sub-flood seasons, testing the rationality of segmentation schemes, and selecting the optimal scheme have long been major problems in flood prevention for large rivers and have not been effectively solved. In this study, a flood season segmentation framework was put forward through a comparison of different segmentation methods. The framework consists of a Fisher optimal partition method, an ensemble approach, and an NPB-FOS method. It can determine the optimum number of sub-flood seasons, divide a flood season into several sub-seasons, examine the rationality of the flood season segmentation schemes, and select the optimal scheme.

The case study of the Yellow River's downstream reach leads to the following findings: (1) The entire flood season can be segmented into three sub-flood seasons, i.e. early, main, and late flood seasons. (2) The comparative analysis shows that the segmentation obtained by the PCP method is better than that obtained by the other methods. (3) The best segmentation for the downstream reach of the Yellow River is early flood season (01 June–20 July), main flood season (21 July–28 September), and late flood season (29 September–08 November).

The results of this study show that September still belongs to the main flood season (21 July–28 September) and that the period from 29 September to 08 November belongs to the late flood season, which differs from the traditional flood season partition (in which 1 October is the end of the flood season). As is well known, the main flood season is prone to heavy rainfall and flood events; if the water level is raised at this stage, the flood risk for the downstream reach of the river increases. Therefore, it is suggested that the relevant flood prevention authorities prepare for large flood events in this stage so that downstream safety is ensured under scientific management.

The flood season segmentation framework proposed for the Yellow River could also be applied to flood season segmentation for other rivers and for reservoirs. However, some shortcomings remain to be addressed. In particular, the selection of indexes and thresholds inevitably involves some subjectivity, which may introduce errors into the resulting staging.

The study was supported by the Natural Science Fund of China (No. 50579020).

The author declares that there are no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

All relevant data are included in the paper or its Supplementary Information.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).