Abstract
Catchment classification strategies based on easily available physical characteristics are important for extrapolating hydrologic model parameters and improving hydrologic predictions in ungauged catchments. In this study, we conduct an experiment of catchment classification and explore the feasibility of characterizing hydrologically similar catchments using certain physical characteristics in upstream regions of the Huai River Basin. The similarity metrics of hydrologic response factors (high flow, low flow and average annual runoff) and physical factors (topography, shape, soil and vegetation) are fed into the K-means algorithm for catchment classification. All the catchments are classified into two classes regardless of the types of metrics used. By comparing the overlap coefficient (η) and Rand index (RI) between any two classification results, we found that the topography classification displays the highest concordance with the high flow classification (η = 79.2% and RI = 0.66) among all metrics. Including more metrics would not produce consistently better classification results. The optimal combination of metrics, with η = 87.5%, is the high flow metrics (Q10%, SFH and MAX90) with the topography metrics (AS and HI). The results indicate that the physical metrics adopted for hydrologic classification should be determined carefully in terms of specific hydrologic characteristics.
INTRODUCTION
‘Profound philosophy takes the simplest form’ is a quote from the Tao Te Ching by the Chinese philosopher Lao-tzu from ∼500 BC. Like other natural scientists, hydrologists have long sought a principle that produces concise, easily understood explanations of hydrologic behaviours (McDonnell & Woods 2004); classification based on hydrologic similarity is such a principle, providing a mapping of catchment form and forcing onto its function to improve predictability, especially for ungauged basins (Wagener et al. 2007).
Achieving a general method for identifying and categorizing dominant catchment functions, beyond individual catchments or a particular dataset, has been a particular struggle in hydrology (Beven 2000; Sawicz et al. 2011). However, our limits in predicting the runoff response in ungauged basins may be due in part to a lack of fundamental understanding of hydrologic similarity (McDonnell et al. 2007; Reichl et al. 2009). Generally, different strategies and methods of hydrologic similarity are determined by the similarity factors (also named similarity measures or metrics) adopted, including catchment forms, forcing and function (Wagener et al. 2007; Olden et al. 2012; Sivakumar et al. 2013).
Hydrologic similarity analysis with catchment form factors (e.g., geomorphologic, pedologic and geologic characteristics) and without river flow data is more appealing for ungauged catchments and thus allows data transfer between gauged and ungauged catchments (Olden et al. 2012; Chiverton et al. 2015). In fact, similarity defined by catchment form factors has a long history in hydrology. Rodríguez-Iturbe & Valdés (1979) demonstrated how the shape of a catchment unit hydrograph can be explained from the structure of the channel network. The topographic index ln(α/tanβ) of Beven & Kirkby (1979) has been used to relate local geomorphometric parameters to hydrologic behaviour at a specified location, and the distribution curve of this index describes similarity behaviours among different catchments.
In contrast with the above studies, dimensionless similarity numbers and dimensional analysis allow for the parsimonious development of relationships between catchment characteristics and hydrologic metrics (Dooge 1986; Aryal et al. 2002). In the past decade, Berne et al. (2005) derived a dimensionless hillslope number, the hillslope Péclet number, to relate hillslope forms to hydraulic properties and to subsurface flow response patterns. On the basis of the hillslope Péclet number, Lyon & Troch (2010) further developed a similarity parameter to describe the shallow surface hydrologic response of small catchments. Similarly, a dimensionless similarity framework was developed by Harman & Sivapalan (2009) for assessing the controls of hillslope forms on storage-release dynamics in an idealized hillslope system.
The difficulty in using the hydraulic dimensional analysis mentioned above is that it neglects the complex forms of real-world catchments. Since physical catchment characteristics are potentially valuable in describing catchments with respect to their hydrologic responses, some combinations of physical characteristics have been selected as similarity metrics for real world catchment classification (Oudin et al. 2010; Ali et al. 2012). For instance, Wolock et al. (2004) delineated hydrologic-landscape regions in the United States by using physical descriptors, such as the land-surface form, geologic texture and climate variables. Reichl et al. (2009) optimized a similarity measure through identification of numerous catchment attributes for estimating ungauged streamflow. Szolgayova et al. (2014) suggested that catchment properties, e.g., catchment area, can influence the temporal dependence of river flow, i.e., the dependence of runoff is strongly determined by catchment storage (larger storage in a larger catchment).
However, whether a hydrologic response can be indicated completely by physical factors is still contested. Merz & Blöschl (2009) found that land use, soil types and geology did not seem to exert a major control on the runoff coefficients of the catchments they studied. Ali et al. (2012) and Ley et al. (2011) showed a lack of correlation between flow-derived indicators and catchment characteristics. The difference is likely to be caused by the catchment characteristics, especially the lack of suitable subsurface descriptors, not adequately capturing the climatic effects (first-order control of flow) (Oudin et al. 2010; Sawicz et al. 2014).
It is obvious that a large uncertainty in hydrologic classification will inevitably arise due to arbitrary adoption of catchment metrics. Hydrologic similarity analysis based on physical catchment characteristics needs to be re-examined for real world catchments as they are highly complex; for example, explicit analytical relationships between physical catchment characteristics and hydrologic responses derived under ideal conditions may be inappropriate.
Therefore, extensive searches for suitable strategies to classify catchments in terms of hydrologic functions should be carried out. The main objective of our paper is to explore the applicability of different sets of easily available physical factors for classifying hydrologically similar catchments for a specific hydrologic response. In this study, to eliminate the effects of intensive human activities (e.g., land use changes), we limit our study to 24 upstream catchments in the humid mountainous and hilly areas in the Huai River Basin. A suite of catchment metrics, i.e., hydrologic response factors (high flow, low flow and average annual runoff) and physical factors (topography, shape, soil and vegetation), are provided for the catchment classification experiments. We therefore test various combinations of these factors to explore which physical attributes can be used as similarity factors for a specific hydrologic function.
DATA AND METHODS
Study catchments
The Huai River Basin (31°–36°N, 112°–121°E) is in eastern China. It is the sixth largest river basin in China, with a total drainage area of 270,000 km². Mountainous regions account for approximately one-third of the total basin area. The basin lies in a climate transition zone, with a temperate monsoon climate to the north and a subtropical monsoon climate to the south. The average annual temperature is 11°C–16°C, decreasing from south to north. The average annual precipitation, of which over 60% occurs in summer (June–July), is approximately 920 mm. Rainfall increases from north to south and from the plains to the mountainous areas. Funiu Mountain, Tongbai Mountain and Dabie Mountain are in the southwest of the Huai River Basin and are the main flood source areas of the basin. This area is prone to heavy floods due to torrential rain induced by the western Pacific subtropical high, typhoons, and other weather systems.
Twenty-four catchments in the upper and middle regions of the Huai River Basin (upstream of the Bengbu Floodgate) were studied, including the headwater areas of Funiu Mountain, Tongbai Mountain and Dabie Mountain (Figure 1). As shown in Table 1, the catchment areas range from 17.9 km² to 3,090 km². The average elevation of six of the selected catchments is higher than 500 m, and Huangweihe has the highest elevation (824 m).
| Region | ID | Station | Area (km²) | Average altitude (m) | Average slope (deg) | Average precipitation (mm/a) | Longitude | Latitude |
|---|---|---|---|---|---|---|---|---|
| Funiu Mountain | 1 | Hekou | 2,141 | 164 | 5.4 | 871 | 113°40′ | 33°31′ |
| | 2 | Ziluoshan | 1,880 | 818 | 16.1 | 915 | 112°31′ | 34°10′ |
| | 3 | Guanzhai | 1,030 | 174 | 4.8 | 855 | 113°19′ | 33°23′ |
| | 4 | Gaocheng | 620 | 484 | 8.5 | 895 | 113°08′ | 34°24′ |
| | 5 | Zhongtang | 501 | 674 | 16.7 | 1,021 | 112°34′ | 33°45′ |
| | 6 | Xiagushan | 375 | 456 | 11.2 | 930 | 112°43′ | 33°52′ |
| | 7 | Jizhong | 46 | 386 | 9.5 | 1,103 | 112°42′ | 33°39′ |
| Tongbai Mountain | 8 | Changtaiguan | 3,090 | 194 | 5.9 | 1,069 | 114°04′ | 32°19′ |
| | 9 | Huangchuan | 2,050 | 146 | 6.3 | 1,255 | 115°03′ | 32°09′ |
| | 10 | Dapoling | 1,770 | 226 | 6.8 | 1,067 | 113°45′ | 32°25′ |
| | 11 | Nanlidian | 1,500 | 166 | 7.1 | 1,019 | 114°36′ | 32°02′ |
| | 12 | Yangzhuang | 814 | 137 | 5.0 | 979 | 113°54′ | 33°23′ |
| | 13 | Luzhuang | 390 | 206 | 6.0 | 963 | 113°51′ | 32°43′ |
| | 14 | Baiqueyuan | 284 | 219 | 9.1 | 1,268 | 115°06′ | 31°47′ |
| | 15 | Xinxian | 256 | 259 | 11.6 | 1,301 | 114°52′ | 31°39′ |
| | 16 | Tanjiahe | 152 | 280 | 12.2 | 1,305 | 113°53′ | 31°54′ |
| | 17 | Zhumadian | 109 | 93 | 1.6 | 1,014 | 114°01′ | 32°58′ |
| | 18 | Lixin | 77.8 | 162 | 3.8 | 1,025 | 113°28′ | 32°57′ |
| | 19 | Peihe | 17.9 | 446 | 19.7 | 1,248 | 114°51′ | 31°37′ |
| Dabie Mountain | 20 | Huangnizhuang | 805 | 480 | 14.9 | 1,480 | 115°37′ | 31°28′ |
| | 21 | Bailianya | 737 | 663 | 16.4 | 1,504 | 116°10′ | 31°16′ |
| | 22 | Zhangchong | 493 | 667 | 16.1 | 1,512 | 116°01′ | 31°25′ |
| | 23 | Huangweihe | 292 | 824 | 18.4 | 1,560 | 116°19′ | 31°08′ |
| | 24 | Qilin | 178 | 540 | 15.5 | 1,559 | 115°45′ | 31°28′ |
In this paper, the hydrologic datasets, such as those of daily precipitation and daily runoff, for each selected catchment were obtained from the Hydrologic Yearbook of the Huai River Basin. The 30 m Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) and Landsat 4–5 TM land cover data were provided by the Geographical Spatial Data Cloud of the Computer Network Information Center of the Chinese Academy of Sciences (http://www.gscloud.cn/). Soil types were obtained from the China Soil Scientific Database provided by the Institute of Soil Science of the Chinese Academy of Sciences (http://www.soil.csdb.cn/).
Hydrologic and physical metrics
Six groups of metrics were used: three groups of hydrologic metrics and three groups of physical metrics. The 23 hydrologic metrics used for catchment classification comprise ten high flow metrics, ten low flow metrics and three average annual runoff metrics (Table 2).
| Combination | No. | Metrics | Symbols and formulas | Unit |
|---|---|---|---|---|
| High flow | 1–5 | Mean annual maximum 1, 3, 7, 30, 90-day flows | MAX1, MAX3, MAX7, MAX30, MAX90 | mm |
| | 6 | Average of high flow (>75th percentile) | HPT | mm |
| | 7 | Count of high flow (>75th percentile) | HPC | |
| | 8 | Duration of high flow (>75th percentile) | HPD | days |
| | 9 | High flow value (10th percentile of FDC) | Q10% | mm |
| | 10 | Slope of the FDC for the high flow range | SFH | |
| Low flow | 11–15 | Mean annual minimum 1, 3, 7, 30, 90-day flows | MIN1, MIN3, MIN7, MIN30, MIN90 | mm |
| | 16 | Average of low flow (<25th percentile) | LPT | mm |
| | 17 | Count of low flow (<25th percentile) | LPC | |
| | 18 | Duration of low flow (<25th percentile) | LPD | days |
| | 19 | Low flow value (90th percentile of FDC) | Q90% | mm |
| | 20 | Slope of the FDC for the low flow range | SFL | |
| Average annual runoff | 21 | Annual mean runoff coefficient | α | |
| | 22 | CV of annual mean runoff | CV | |
| | 23 | Baseflow index | BFI | |
Of the hydrologic metrics, 16 (No. 1–No. 8 and No. 11–No. 18) were calculated with IHA ver. 7.1 (see http://conserveonline.org/workspaces/iha), which was released by The Nature Conservancy (TNC). Here, all flows above the 75th percentile of the daily flows were defined as high flows, while flows below the 25th percentile of the daily flows were defined as low flows. Moreover, four metrics of the flow duration curve (FDC) (No. 9 and No. 10, and No. 19 and No. 20) were added as high and low flow metrics (Oudin et al. 2010). In addition, the annual mean runoff coefficient (α), the coefficient of variation of annual mean runoff (CV), and the baseflow index (BFI), i.e., the ratio of baseflow to total flow, were selected as the average annual runoff metrics.
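The FDC and average annual runoff metrics described above can be sketched as follows. This is a minimal illustration, not the IHA implementation: it assumes NumPy is available, reads Q10% and Q90% as the flows exceeded 10% and 90% of the time, takes the runoff coefficient as mean annual runoff divided by mean annual precipitation, and expects a baseflow series that has already been separated by some method the paper does not specify. All function names are hypothetical.

```python
import numpy as np

def fdc_flow(daily_flow, exceedance_pct):
    """Flow equalled or exceeded `exceedance_pct` percent of the time."""
    q = np.asarray(daily_flow, dtype=float)
    # A flow exceeded 10% of the time is the 90th percentile of the record.
    return float(np.percentile(q, 100.0 - exceedance_pct))

def flow_thresholds(daily_flow):
    """High flow (75th percentile) and low flow (25th percentile) thresholds."""
    q = np.asarray(daily_flow, dtype=float)
    return float(np.percentile(q, 75)), float(np.percentile(q, 25))

def annual_runoff_stats(annual_runoff_mm, annual_precip_mm):
    """Runoff coefficient (assumed: ratio of means) and CV of annual runoff."""
    r = np.asarray(annual_runoff_mm, dtype=float)
    p = np.asarray(annual_precip_mm, dtype=float)
    alpha = float(r.mean() / p.mean())
    cv = float(r.std(ddof=1) / r.mean())  # sample standard deviation
    return alpha, cv

def baseflow_index(baseflow, total_flow):
    """BFI: baseflow volume divided by total flow volume."""
    return float(np.sum(baseflow) / np.sum(total_flow))

# Synthetic series for illustration only (not data from the study).
flow = np.arange(1.0, 101.0)
q10 = fdc_flow(flow, 10)
q90 = fdc_flow(flow, 90)
high_thr, low_thr = flow_thresholds(flow)
alpha, cv = annual_runoff_stats([400, 500, 600], [1000, 1000, 1000])
bfi = baseflow_index([1.0, 1.0], [2.0, 2.0])
```

With this reading, Q10% is always at least as large as Q90%, which matches their use as high and low flow values.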
Simultaneously, to investigate the control factors behind the specific hydrologic response at different flows, 18 physical catchment metrics were analysed and subdivided into three groups: seven topographic metrics, seven shape metrics, and four soil and vegetation metrics. For each combination of these physical metrics, the formulas and descriptions are listed in Table 3.
| Combination | No. | Metrics | Symbols and formulas | Unit |
|---|---|---|---|---|
| Topography | 1 | Average elevation | H | m |
| | 2 | Integral of area-altitude curve | HI | |
| | 3 | Slope of area-altitude curve | AS | |
| | 4 | Average slope | β | deg |
| | 5 | Average topographic index (a) | Ti | |
| | 6 | Plan curvature (b) | CC | |
| | 7 | Profile curvature (b) | CP | |
| Shape | 8 | Length | L | km |
| | 9 | Width | B | km |
| | 10 | Form factor (c) | Rf | |
| | 11 | Elongation ratio (d) | Re | |
| | 12 | Circularity ratio (e) | Rc | |
| | 13 | Fractal dimension of river network (f) | D | |
| | 14 | Peak of GIUH | PGU | |
| Soil and vegetation | 15 | Sand content | Sand | % |
| | 16 | Clay content | Clay | % |
| | 17 | Silt content | Silt | % |
| | 18 | NDVI (g) | NDVI | |
(a) Ti = ln(α/tanβ), where α means the local upslope area draining through a certain point per unit contour length and β is the local slope in radians.
(b) Plan curvature and profile curvature (Schmidt et al. 2003): fx, fy, fxx, fyy and fxy are the first and second derivatives of the height function z = f(x, y).
(c) Rf = A/L², where A means the area of a study catchment.
(d) Re = d/L, where d means the diameter of the circle whose area is equal to the catchment area.
(e) Rc = A/A′, where A′ means the area of the circle whose perimeter is equal to the basin perimeter.
(f) The fractal dimension of the river network was calculated by the box-counting algorithm, where N(r) is the number of boxes of side length r required to cover the set.
(g) NDVI = (NIR − R)/(NIR + R), where NIR means the reflectivity in the near infrared band and R means the reflectivity of the red band.
First, for the seven topographic metrics, the area-altitude curve shows the horizontal cross-sectional area at the corresponding altitude (Strahler 1952). The frequency distribution of the topographic index (Ti) (Beven & Kirkby 1979) shows the spatial distribution of the saturated water deficit. Second, within the seven shape metrics, L is the longest straight-line distance from the outlet to the basin divide. The fractal dimension of a river network (D) reflects the development of a drainage network in its catchment. The geomorphologic instantaneous unit hydrograph (GIUH) represents the relationship between the flow concentration and the basin geomorphic factors. Finally, the metrics of soil and vegetation characteristics include the sand content (Sand), clay content (Clay), silt content (Silt) and the normalized difference vegetation index (NDVI).
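As an illustration of the shape and vegetation metrics, the sketch below uses the conventional definitions of the form factor (A/L²), elongation ratio (d/L), circularity ratio (A/A′) and NDVI. These formulas are assumed here because they match the quantities named in the table notes (A, d, A′, near infrared and red reflectivity); the function and argument names are hypothetical.

```python
import math

def shape_metrics(area_km2, length_km, perimeter_km):
    """Return (Rf, Re, Rc) for one catchment."""
    form_factor = area_km2 / length_km ** 2                # Rf = A / L^2
    equiv_diameter = 2.0 * math.sqrt(area_km2 / math.pi)   # d: circle with the same area
    elongation_ratio = equiv_diameter / length_km          # Re = d / L
    circle_area = perimeter_km ** 2 / (4.0 * math.pi)      # A': circle with the same perimeter
    circularity_ratio = area_km2 / circle_area             # Rc = A / A'
    return form_factor, elongation_ratio, circularity_ratio

def ndvi(nir, red):
    """Normalized difference vegetation index from band reflectivities."""
    return (nir - red) / (nir + red)

# Sanity check on an idealized circular catchment of radius 1 km:
# L is the diameter (2 km), so Re and Rc should both equal 1.
r_f, r_e, r_c = shape_metrics(math.pi, 2.0, 2.0 * math.pi)
example_ndvi = ndvi(0.5, 0.1)
```

For a perfect circle both Re and Rc reach 1, and they fall towards 0 as the catchment becomes more elongated or its boundary more irregular.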
Principal component analysis method and cluster analysis method
Unsupervised methods, i.e., principal component analysis (PCA) and cluster analysis (CLA), are used because real world catchments are highly complex and no catchments with an explicit analytical relationship between physical characteristics and hydrologic response are available to train a supervised classification method.
In recent years, PCA and CLA have been the most popular multivariate statistical tools for catchment classification and regionalization studies. The classification process is shown in Figure 2. First, assuming m metrics, an n × m similarity metric matrix can be formed, where n is the total number of catchments. All the metrics were standardized so that metrics with large numerical ranges do not dominate the classification results. Second, PCA is used to reduce the correlation of the metrics and to make them relatively independent. An eigenvalue equal to or greater than 1.0 is chosen as the criterion to determine the number of principal components. Third, the K-means algorithm of CLA is used for catchment classification. Euclidean distance is adopted as the evaluation measure of similarity, i.e., the closer the characteristics of two catchments are, the more similar the catchments are (Snelder et al. 2005). The classification results are obtained according to the minimum sum of the squared errors with a given number of classes (Nc). Finally, the silhouette coefficient (Sil) (Rousseeuw 1987) was calculated to determine the optimal Nc in terms of the maximum of the average Sil of all catchments. The Sil values lie between −1 and 1, and a Sil close to 1 means that the data are appropriately classified. The search range for the optimal Nc follows previous work (Zhou et al. 2010); in this study it is [2, 4]. Thus, the final classification results of catchment similarity are obtained.
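The four steps above (standardize, retain components with eigenvalue ≥ 1.0, K-means with Euclidean distance, pick Nc by the maximum average Sil) can be sketched with NumPy alone. This is an illustrative sketch under assumptions, not the authors' code: the K-means routine is a plain Lloyd iteration with random restarts, singleton clusters are given Sil = 0 by convention, and the 24 × 4 input matrix is synthetic.

```python
import numpy as np

def standardize(X):
    """Z-score each metric (column) so no metric dominates by scale."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca_kaiser(Z):
    """Scores on components with eigenvalue >= 1.0, plus those eigenvalues."""
    eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigval)[::-1]  # largest eigenvalue first
    eigval, eigvec = eigval[order], eigvec[:, order]
    keep = eigval >= 1.0
    return Z @ eigvec[:, keep], eigval[keep]

def kmeans(P, k, n_init=20, n_iter=100, seed=0):
    """Lloyd's algorithm with restarts; returns labels with the lowest SSE."""
    rng = np.random.default_rng(seed)
    best_labels, best_sse = None, np.inf
    for _ in range(n_init):
        centers = P[rng.choice(len(P), size=k, replace=False)]
        for _ in range(n_iter):
            dist = np.linalg.norm(P[:, None, :] - centers[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            # Keep the old centre if a cluster happens to become empty.
            new_centers = np.array([
                P[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        sse = ((P - centers[labels]) ** 2).sum()
        if sse < best_sse:
            best_labels, best_sse = labels, sse
    return best_labels, best_sse

def mean_silhouette(P, labels):
    """Average silhouette coefficient over all samples."""
    dist = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
    scores = []
    for i in range(len(P)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = dist[i, same].sum() / (same.sum() - 1)
        b = min(dist[i, labels == c].mean()
                for c in set(labels.tolist()) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def choose_nc(P, candidates=(2, 3, 4)):
    """Pick the number of classes with the largest average silhouette."""
    sil = {k: mean_silhouette(P, kmeans(P, k)[0]) for k in candidates}
    return max(sil, key=sil.get), sil

# Synthetic example: 24 "catchments", 4 metrics, two well-separated groups.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.3, size=(12, 4)),
               rng.normal(5.0, 0.3, size=(12, 4))])
scores, eigenvalues = pca_kaiser(standardize(X))
best_nc, sil_by_nc = choose_nc(scores)
labels, _ = kmeans(scores, best_nc)
```

Because the cluster labels returned by K-means are arbitrary integers, they still need to be mapped to Class 1 and Class 2 before two classifications can be compared.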
Results and evaluation of catchment classification
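The overlap coefficient (η), i.e., the percentage of catchments that are assigned to the same class by the hydrologic classification and the physical classification, is used as a simple measure of agreement between the two results. To verify the reliability of this simplified evaluation procedure, the classical efficiency measure Rand index (RI) (Rand 1971) was also employed to evaluate the agreement between the hydrologic classification and physical classification results. An RI closer to 1 means that the two classification results are more consistent.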
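Both agreement measures can be computed directly from two label vectors. The sketch below takes η as the share of catchments with identical class labels (consistent with the overlap counts reported in Table 5) and uses the standard pairwise definition of the Rand index; the label vectors are hypothetical and chosen only so that 19 of 24 catchments agree.

```python
from itertools import combinations

def overlap_coefficient(hydro, phys):
    """Percentage of catchments placed in the same class by both results."""
    same = sum(h == p for h, p in zip(hydro, phys))
    return 100.0 * same / len(hydro)

def rand_index(a, b):
    """Share of catchment pairs on which two partitions agree (Rand 1971)."""
    pairs = list(combinations(range(len(a)), 2))
    # A pair agrees if both partitions group it together or both separate it.
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# Hypothetical labels: 24 catchments, 5 of them classified differently.
hydro = [1] * 12 + [2] * 12
phys = [2] * 5 + [1] * 7 + [2] * 12

eta = overlap_coefficient(hydro, phys)
ri = rand_index(hydro, phys)
```

Note that η depends on how the classes are named, whereas RI does not, so η is only meaningful once Class 1 and Class 2 have been defined consistently across classifications.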
RESULTS AND DISCUSSION
Determination of the number of classes
To reduce the correlation of the metrics, the principal components of different combinations of metrics were obtained with the PCA method to replace the original metrics. As listed in Table 4, the principal components with an eigenvalue equal to or greater than 1.0 are retained for each combination of metrics. For example, three principal components were retained for the combination of the high flow metrics. Each principal component can explain specific metrics, as listed in Table 4.
| Combination | Components | Eigenvalue | Main metrics |
|---|---|---|---|
| High flow | PC1 | 6.96 | MAX1–MAX90, Q10%, HPT, SFH |
| | PC2 | 1.33 | HPC, HPD |
| | PC3 | 1.02 | – |
| Low flow | PC1 | 7.55 | MIN1–MIN90, Q90%, LPT, SFL |
| | PC2 | 1.52 | LPC, LPD |
| Average annual runoff | PC1 | 1.58 | CV, BFI |
| | PC2 | 1.14 | α |
| Topography | PC1 | 5.30 | H, HI, AS, β, Ti, CC, CP |
| Shape | PC1 | 3.75 | L, B, Rc, PGU |
| | PC2 | 2.31 | Rf, Re, D |
| Soil and vegetation | PC1 | 2.62 | Sand, Clay, Silt |
| | PC2 | 1.05 | NDVI |
PC1–PC3 represent the first, second and third principal components from PCA of each combination of metrics.
With the principal components identified by PCA, catchments can be further classified into different classes according to a given Nc in the K-means algorithm. However, Nc itself is also a predetermined parameter, and the assessment procedure, e.g., the overlap coefficient adopted in this paper, requires both the hydrologic classification and physical classification to have the same Nc. In this study, the search range for Nc is [2, 4]. According to the average Sil of six different classification results based on different combinations of the hydrologic and physical metrics, the optimal Nc is 2 (Figure 3). This finding is also consistent with the results of Yu & Wen (1996), who found that all the upstream catchments in the Huai River Basin can be subdivided into two classes: mountainous and hilly areas. In the upper regions of the Huai River Basin, mountainous areas are characterized by high elevation and steep terrain, and hilly areas are characterized by relatively lower elevation and relatively flat terrain.
Therefore, the 24 catchments can be classified into two classes regardless of the kinds of metrics adopted. In terms of the hydrologic metric values of each class, the class characterized by large water quantities (large values of Q10%, Q90%, and BFI) or fast hydrologic responses (large values of SFH and SFL) was defined as Class 1, and the class characterized by small water quantities (small values of Q10%, Q90%, and BFI) or slow hydrologic responses (small values of SFH and SFL) was defined as Class 2. As reported by previous studies, the water yield increased with elevation (Hunsaker et al. 2012), and the more circular the shape of the catchment, the more concentrated the flash flood was (Hognogi 2014). Additionally, an increase in vegetation can substantially reduce streamflow (Li et al. 2012). Therefore, for the topography classification, the class characterized by high elevation (large value of H) was defined as Class 1 to correspond to a large water quantity, and the class characterized by low elevation (small value of H) was defined as Class 2 to correspond to a small water quantity. For the shape classification, catchments characterized by circular shapes (large value of Rc) were defined as Class 1 to correspond to a fast hydrologic response, and were otherwise defined as Class 2 to correspond to a slow hydrologic response. For the soil and vegetation classification, the class characterized by poor vegetation cover (small value of NDVI) was defined as Class 1 to correspond to a large water quantity, and the class characterized by good vegetation cover (large value of NDVI) was defined as Class 2 to correspond to a small water quantity.
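The class naming rule above can be expressed as a small relabelling step applied after clustering. In this sketch, which uses hypothetical names and toy values, raw cluster labels are mapped so that Class 1 is the cluster with the larger mean of a chosen key metric (for example H or Rc), or with the smaller mean when the rule is reversed (for example NDVI).

```python
def relabel_by_metric(labels, metric_values, class1_is_high=True):
    """Map raw cluster labels to 1, 2, ... ordered by the cluster mean of a metric."""
    groups = {}
    for lab, val in zip(labels, metric_values):
        groups.setdefault(lab, []).append(val)
    means = {lab: sum(vals) / len(vals) for lab, vals in groups.items()}
    # Class 1 gets the largest mean when class1_is_high, else the smallest.
    ordered = sorted(means, key=means.get, reverse=class1_is_high)
    mapping = {lab: rank + 1 for rank, lab in enumerate(ordered)}
    return [mapping[lab] for lab in labels]

# Toy values for illustration only.
raw_labels = [0, 0, 1, 1]
by_elevation = relabel_by_metric(raw_labels, [100.0, 200.0, 600.0, 800.0])
by_ndvi = relabel_by_metric(raw_labels, [0.8, 0.7, 0.3, 0.4], class1_is_high=False)
```

Applying the same rule to every classification is what makes the class-by-class overlap between a hydrologic and a physical classification well defined.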
Classification results by hydrologic and physical metrics
The classification results are highly dependent upon the combination of catchment metrics fed into the K-means algorithm (Figures 4–6), and the overlap coefficient and RI were calculated to evaluate the consistency of the hydrologic classification and physical classification results (Table 5). If the value of the overlap coefficient and RI between the two classification results is large, then the physical characteristics can be used to indicate the catchment hydrologic response.
| Physical classification | Hydrologic classification | Number of overlap | Number of non-overlap | η (%) | Average η (%) | RI | Average RI |
|---|---|---|---|---|---|---|---|
| Topography | High flow | 19 | 5 | 79.2 | 68.1 | 0.66 | 0.56 |
| | Low flow | 16 | 8 | 66.7 | | 0.54 | |
| | Average annual runoff | 14 | 10 | 58.3 | | 0.49 | |
| Shape | High flow | 18 | 6 | 75.0 | 55.6 | 0.61 | 0.53 |
| | Low flow | 13 | 11 | 54.2 | | 0.48 | |
| | Average annual runoff | 9 | 15 | 37.5 | | 0.51 | |
| Soil and vegetation | High flow | 13 | 11 | 54.2 | 51.4 | 0.48 | 0.49 |
| | Low flow | 10 | 14 | 41.7 | | 0.49 | |
| | Average annual runoff | 14 | 10 | 58.3 | | 0.49 | |
The quantification of the agreement between different classification results revealed that the topographic metrics strongly indicate the hydrologic response at different ranges of flow, with the highest average overlap coefficient of 68.1% and highest average RI of 0.56. The overlap coefficient of 79.2% and RI of 0.66 between the topography classification and high flow classification were the highest values; therefore, high flow metrics and topographic metrics were the best partners among all the tested combinations of hydrologic and physical metrics. Topographic metrics played a key role in the hydrologic response at high flow. In the high flow classification and the topography classification, the catchments within Class 1 are mainly mountainous, while those within Class 2 are mainly hilly (Figure 5(a) and 5(d)).
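For two-class partitions, the values in Table 5 can be reproduced from the overlap counts alone. The sketch below relies on an observation that holds only when both classifications have exactly two classes: a pair of catchments is treated consistently by both classifications exactly when the two catchments are both overlapping or both non-overlapping, so RI = [C(k, 2) + C(n − k, 2)] / C(n, 2) for k overlaps among n catchments. The function name is hypothetical; the counts are those listed in Table 5.

```python
from math import comb

def eta_and_ri(n_overlap, n_total=24):
    """Overlap coefficient (%) and Rand index for two 2-class partitions."""
    n_non = n_total - n_overlap
    eta = 100.0 * n_overlap / n_total
    ri = (comb(n_overlap, 2) + comb(n_non, 2)) / comb(n_total, 2)
    return round(eta, 1), round(ri, 2)

# Overlap counts from Table 5, ordered as high flow, low flow, annual runoff.
counts = {
    "Topography": [19, 16, 14],
    "Shape": [18, 13, 9],
    "Soil and vegetation": [13, 10, 14],
}
table5 = {name: [eta_and_ri(k) for k in ks] for name, ks in counts.items()}
average_eta = {
    name: round(sum(100.0 * k / 24 for k in ks) / len(ks), 1)
    for name, ks in counts.items()
}
```

The expression is symmetric about k = n/2, which is why the shape and average annual runoff pair has a low η (37.5%) but an RI (0.51) slightly above that of pairs with η = 58.3%.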
The distributions of high flow and topographic metrics within each class are shown in Figure 6(a) and 6(d). It was found that the catchments within Class 1 were mostly convergent (negative CC) and concave (negative CP and larger absolute value of CP); furthermore, the elevations were higher (larger H), and the terrains were steeper (larger AS, HI, β, and smaller Ti). Thus, water storage capacities were weak for the catchments within Class 1 (in the topography classification, the average value of α for the Class 1 catchments was 0.51, while that for the Class 2 catchments was 0.36). In addition, the flood hydrograph was characterized by a rapid rise and recession (larger SFH and smaller HPD). In the mountainous catchments within Class 1, the convection was intensified due to terrain uplift, creating a heavy rain centre and consequently increasing the flood peak and volume. In contrast, most catchments within Class 2 were divergent (positive CC) and slightly concave (negative CP and smaller absolute value of CP), had relatively lower elevations, and had flatter terrains (smaller AS, HI, β, and larger Ti). Therefore, the storage capacity of Class 2 was larger than that of Class 1, and the flood hydrograph rose and fell slowly with lower peak discharge (smaller SFH and larger HPD).
The shape factors also had a strong effect on high flow (η = 75.0% and RI = 0.61), second only to the topography factors. The shape of a catchment was an indicator of the flood concentration. According to the distributions of shape metric characteristics (Figure 6(e)), the values of Rf, Re and Rc for the catchments within Class 1 were larger than those of the catchments within Class 2, indicating that the catchments within Class 1 were rounder. In a round catchment, the time that it takes a droplet falling on any point to reach the outlet is shorter, so the flood flow increases dramatically. In contrast, elongated catchments within Class 2 had lower peak flows with longer durations. The GIUH can reflect the relationship between the geomorphic factors and the flood process of a river network. If the GIUH was sharper and the PGU was larger, then the flow concentration was faster and the flood peak was higher. By comparison, the value of the PGU for the catchments within Class 1 was larger, and the value of D for the catchments within Class 1 was relatively smaller (Figure 6(e)). Thus, floods developed rapidly in the catchments within Class 1, and large floods were more common.
Although the shape factors have a strong effect on high flow, they played a weak role in baseflow regulation. The overlap coefficient and RI were low between the shape classification and the low flow (or average annual runoff) classification. The effect of soil and vegetation on runoff at all ranges of flow was also not significant.
Finally, the Huai River Basin is in the climate transition zone between northern and southern China, and precipitation decreases from south to north. The driving force factors, e.g., precipitation, may impact hydrologic behaviours significantly, especially for low flow and average annual runoff. For example, from the low flow classification and the average annual runoff classification (Figure 5(b) and 5(c)), catchments within Class 1 were mainly located in the south, and catchments within Class 2 were more to the north, except for the Hekou, Yangzhuang and Gaocheng catchments, which may be greatly impacted by human activities (e.g., reservoirs and dams) (Zhang et al. 2011). This can partly explain why the results of the hydrological classification and the physical classification will not completely coincide with each other.
How to explain high flow using topographic metrics
Because the high flow classification and the topography classification were the most consistent with each other, the topographic and high flow metrics were selected for further classification experiments. We aimed to determine whether adopting more topographic metrics for catchment classification yields results that are more consistent with the high flow classification. Because the overlap coefficient and RI showed consistent trends in the above section (Table 5), the overlap coefficient was adopted as the main assessment measure, as it is more intuitive and simpler to calculate.
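The two agreement measures can be sketched in Python as follows. This is a minimal sketch, assuming that the overlap coefficient is the fraction of catchments assigned to the same class by two classifications after the class labels have been matched, and that RI is the standard Rand index (the fraction of catchment pairs that both classifications treat consistently). The function names and the class labels below are hypothetical and are not the classifications obtained in this study.

```python
from itertools import combinations, permutations


def overlap_coefficient(labels_a, labels_b):
    """Fraction of items given the same class by both classifications,
    maximised over all relabellings of the classes in labels_b."""
    classes = sorted(set(labels_a) | set(labels_b))
    best = 0
    for perm in permutations(classes):
        mapping = dict(zip(classes, perm))  # candidate relabelling of labels_b
        matches = sum(a == mapping[b] for a, b in zip(labels_a, labels_b))
        best = max(best, matches)
    return best / len(labels_a)


def rand_index(labels_a, labels_b):
    """Fraction of item pairs that are grouped together in both
    classifications or separated in both."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical labels for 24 catchments; 5 of them change class.
high_flow = [1] * 12 + [2] * 12
topography = list(high_flow)
for idx in (0, 1, 2, 12, 13):
    topography[idx] = 2 if topography[idx] == 1 else 1

eta = overlap_coefficient(high_flow, topography)  # 19/24, about 0.792
ri = rand_index(high_flow, topography)            # 181/276, about 0.656
print(f"overlap = {eta:.3f}, RI = {ri:.3f}")
```

For two-class partitions, the Rand index under these definitions depends only on how many catchments agree, which is one reason the two measures tend to show the same trend.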
In the following analysis, the topographic and high flow metrics were screened to improve the overlap between the high flow classification and the topography classification. First, the average of the absolute correlation coefficients between each topographic metric and all the high flow metrics was calculated (Figure 7(a)), and the topographic metrics were removed one at a time, starting with the metric with the minimum average correlation coefficient. After each removal, the remaining topographic metrics were used for catchment classification with the PCA and CLA methods, as shown in Figure 8. The classification by all high flow metrics was adopted as a reference. As seen from Figure 7(b), as the topographic metrics were removed, the overlap coefficient tended to increase with the minimum average correlation coefficient. The overlap coefficient reached its highest value, 83.3%, when only two topographic metrics (AS and HI) remained. However, when only the one topographic metric (AS) that possessed the maximum average correlation coefficient with all high flow metrics was adopted to classify the 24 catchments, the overlap coefficient dropped to 79.2%. Thus, among all topographic metrics, AS and HI were the closest topographic cousins for similarity classification when applying the physical metrics to explain the high flow characteristics. As seen from Figure 9, the area-altitude curves of the two catchment classes differ clearly, which suggests that the characteristics of the area-altitude curve are a good indicator of hydrologic response at high flow.
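The ranking that drives this screening can be sketched in Python as follows. This is a minimal sketch, assuming Pearson correlation and a fixed set of reference (high flow) metrics; the metric names T1 to T3 and H1 to H2 and all values are hypothetical, and the re-classification of the catchments after each removal (PCA and CLA) is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screening_order(candidates, references):
    """Rank candidate metrics by their mean absolute correlation with all
    reference metrics; the first name in the list is removed first."""
    scores = {
        name: sum(abs(pearson(values, ref)) for ref in references.values())
        / len(references)
        for name, values in candidates.items()
    }
    order = sorted(scores, key=scores.get)
    return order, scores


# Hypothetical values for six catchments (not data from this study).
high_flow = {"H1": [1, 2, 3, 4, 5, 6], "H2": [1, 3, 2, 5, 4, 6]}
topo = {
    "T1": [2, 4, 6, 8, 10, 12],
    "T2": [1, -1, 1, -1, 1, -1],
    "T3": [1, 4, 9, 16, 25, 36],
}

order, scores = screening_order(topo, high_flow)
print(order)  # metrics listed from first removed to last retained
```

Because the reference set is fixed, the ranking does not change as candidates are removed, so the removal sequence can be read directly from the sorted list. Screening the high flow metrics against AS and HI would use the same function with the roles of the two dictionaries swapped.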
The comprehensive hydrologic characteristics of a catchment are made up of various aspects of its hydrologic behaviour. A comprehensive description of the hydrologic characteristics would have to include all the hydrologic metrics. In that case, all the physical metrics would also have to be used for the hydrologic similarity classification based on physical characteristics, which is difficult to achieve. For ungauged areas, however, it is possible to indicate a specific hydrologic response based on some easily available physical factors. Thus, the high flow metrics were also subjected to the screening experiment. With the same method described earlier, the high flow metrics with the minimum average correlation coefficients with AS and HI were eliminated one by one (Figure 7(c)), in an attempt to continuously improve the overlap coefficient and Rand index between the two catchment classifications. The classification results based on the different high flow metrics are shown in Figure 10. The classification result with the closest topographic cousins (AS and HI) was adopted as a reference. When the three high flow metrics (Q10%, SFH and MAX90) with the maximum average correlation coefficients with AS and HI were included in the classification, the overlap coefficient between the high flow classification and the topography classification peaked at 87.5% (Figure 7(d)). However, when further high flow metrics were excluded, the overlap coefficient dropped, as was the case for the topographic metrics. A comparison of the classifications based on the closest high flow cousins (Q10%, SFH and MAX90) and the closest topographic cousins (AS and HI) is shown in Figure 11.
The screening experiments, which sought the most suitable high flow and topographic metrics for similarity classification, show that the overlap coefficient does not improve as the number of metrics increases. A large number of metrics may obscure the internal laws of catchment properties and does not necessarily provide a comprehensive description of them. In this study, the best metrics for similarity classification based on specific hydrologic responses at high flow are the closest three high flow cousins (Q10%, SFH and MAX90) and the closest two topographic cousins (AS and HI). The most consistent results between the high flow classification and the topography classification were therefore found with a small number of carefully screened metrics rather than with as many metrics as possible.
CONCLUSIONS
This study examined 24 catchments in the upper regions of the Huai River Basin. With the PCA and CLA methods and various combinations of metrics, including hydrologic response factors (high flow, low flow and average annual runoff) and physical factors (topography, shape, soil and vegetation), all the catchments were classified into two classes. Comparing the consistency of the hydrological classifications and physical classifications showed that, among the physical factors, the topographic metrics were the best indicators of the hydrologic response for high flow, with the highest overlap coefficient (η = 79.2%) and the highest Rand index (RI = 0.66). However, the topographic metrics could only indicate the specific hydrologic behaviour at high flow rather than the flow characteristics at all levels. For low flow and average annual runoff, climatic driving factors (e.g., precipitation) may be effective measures for similarity classification.
For hydrologic similarity classification, it is necessary to select the physical control factors that match the specific hydrologic behaviour of interest. For instance, the best partners for the similarity classification based on the high flow and topographic factors are the three high flow metrics (Q10%, SFH and MAX90) and the two topographic metrics (AS and HI). As discussed, adding metrics to or removing metrics from this combination reduced the consistency of the classification results. This study thus provides a possible strategy for hydrologic similarity classification based on physical catchment metrics. However, the bootstrap approach, adopted from previous studies to determine the correspondence between one hydrologic similarity group and one physical similarity group, still has some uncertainties. Moreover, hydrologic classification and regionalization for ungauged basins remain a challenge. More objective and easy-to-operate automatic procedures can be developed in further studies. It is likely that a more theoretical interpretation of hydrological responses in terms of essential physical factors will have the most significant impact on this field.
ACKNOWLEDGEMENTS
This work was financially supported by the National Key R&D Program of China (2016YFC0401501), the National Natural Science Foundation of China (NSFC) (grants 41771025, 91647108, 41271040), and the Special Fund of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (no. 20145028012).