Catchment classification strategies based on easily available physical characteristics are important for extrapolating hydrologic model parameters and improving hydrologic predictions in ungauged catchments. In this study, we conduct an experiment of catchment classification and explore the feasibility of characterizing hydrologically similar catchments using certain physical characteristics in upstream regions of the Huai River Basin. The similarity metrics of hydrologic response factors (high flow, low flow and average annual runoff) and physical factors (topography, shape, soil and vegetation) are fed into the K-means algorithm for catchment classification. All the catchments are classified into two classes regardless of the types of metrics used. By comparing the overlap coefficient (η) and Rand index (RI) between any two classification results, we found that the topography classification displays the highest concordance with the high flow classification (η = 79.2% and RI = 0.66) among all metrics. Including more metrics would not produce consistently better classification results. The optimal combination of metrics, with η = 87.5%, is the high flow metrics (Q10%, SFH and MAX90) with the topography metrics (AS and HI). The results indicate that the physical metrics adopted for hydrologic classification should be determined carefully in terms of specific hydrologic characteristics.

‘Profound philosophy takes the simplest form’ is a quote from Tao Te Ching by the Chinese philosopher Lao-tzu from ∼500 BC. Similar to other natural scientists, hydrologists have long been seeking a principle to produce concise, easily understood explanations of hydrologic behaviours (McDonnell & Woods 2004); classification based on hydrologic similarity is a principle that can provide a mapping of catchment form and forcing on its function to improve predictability, especially for ungauged basins (Wagener et al. 2007).

Achieving a general method for identifying and categorizing dominant catchment functions, beyond individual catchments or a particular dataset, has been a particular struggle in hydrology (Beven 2000; Sawicz et al. 2011). However, our limits in predicting the runoff response in ungauged basins may be due in part to a lack of fundamental understanding of hydrologic similarity (McDonnell et al. 2007; Reichl et al. 2009). Generally, different strategies and methods of hydrologic similarity are determined by the similarity factors (also named similarity measures or metrics) adopted, including catchment forms, forcing and function (Wagener et al. 2007; Olden et al. 2012; Sivakumar et al. 2013).

Hydrologic similarity analysis with catchment form factors (e.g., geomorphologic, pedologic and geologic characteristics) and without river flow data is more appealing for ungauged catchments and thus allows data transfer between gauged and ungauged catchments (Olden et al. 2012; Chiverton et al. 2015). In fact, similarity defined by catchment form factors has a long history in hydrology. Rodríguez-Iturbe & Valdés (1979) demonstrated how the shape of a catchment unit hydrograph can be explained from the structure of the channel network. The topographic index ln(α/tanβ) of Beven & Kirkby (1979) has been used to relate local geomorphometric parameters to hydrologic behaviour at a specified location, and the distribution curve of this index describes similarity behaviours among different catchments.

In contrast with the above studies, dimensionless similarity numbers and dimensional analysis allow for the parsimonious development of relationships between catchment characteristics and hydrologic metrics (Dooge 1986; Aryal et al. 2002). In the past decade, Berne et al. (2005) derived a dimensionless hillslope number, the hillslope Péclet number, to relate hillslope forms to hydraulic properties and to subsurface flow response patterns. On the basis of the hillslope Péclet number, Lyon & Troch (2010) further developed a similarity parameter to describe the shallow surface hydrologic response of small catchments. Similarly, a dimensionless similarity framework was developed by Harman & Sivapalan (2009) for assessing the controls of hillslope forms on storage-release dynamics in an idealized hillslope system.

The difficulty in using the hydraulic dimensional analysis mentioned above is that it neglects the complex forms of real-world catchments. Since physical catchment characteristics are potentially valuable in describing catchments with respect to their hydrologic responses, some combinations of physical characteristics have been selected as similarity metrics for real world catchment classification (Oudin et al. 2010; Ali et al. 2012). For instance, Wolock et al. (2004) delineated hydrologic-landscape regions in the United States by using physical descriptors, such as the land-surface form, geologic texture and climate variables. Reichl et al. (2009) optimized a similarity measure through identification of numerous catchment attributes for estimating ungauged streamflow. Szolgayova et al. (2014) suggested that catchment properties, e.g., catchment area, can influence the temporal dependence of river flow, i.e., the dependence of runoff is strongly determined by catchment storage (larger storage in a larger catchment).

However, whether a hydrologic response can be indicated completely by physical factors is still contested. Merz & Blöschl (2009) found that land use, soil types and geology did not seem to exert a major control on the runoff coefficients of the catchments they studied. Ali et al. (2012) and Ley et al. (2011) showed a lack of correlation between flow-derived indicators and catchment characteristics. The discrepancy is likely caused by the catchment characteristics, especially given the lack of suitable subsurface descriptors, not adequately capturing the climatic effects (a first-order control on flow) (Oudin et al. 2010; Sawicz et al. 2014).

Arbitrary adoption of catchment metrics inevitably introduces a large uncertainty into hydrologic classification. Hydrologic similarity analysis based on physical catchment characteristics needs to be re-examined for real-world catchments because they are highly complex; for example, explicit analytical relationships between physical catchment characteristics and hydrologic responses derived under ideal conditions may be inappropriate.

Therefore, extensive searches for suitable strategies to classify catchments in terms of hydrologic functions should be carried out. The main objective of our paper is to explore the applicability of different sets of easily available physical factors for classifying hydrologically similar catchments for a specific hydrologic response. In this study, to eliminate the effects of intensive human activities (e.g., land use changes), we limit our study to 24 upstream catchments in the humid mountainous and hilly areas in the Huai River Basin. A suite of catchment metrics, i.e., hydrologic response factors (high flow, low flow and average annual runoff) and physical factors (topography, shape, soil and vegetation), are provided for the catchment classification experiments. We therefore test various combinations of these factors to explore which physical attributes can be used as similarity factors for a specific hydrologic function.

Study catchments

The Huai River Basin (31°–36°N, 112°–121°E) is in eastern China. It is the sixth largest river basin in China, with a total drainage area of 270,000 km². Mountainous regions account for approximately one-third of the total basin area. The basin lies in a climate transition zone, with a temperate monsoon climate to the north and a subtropical monsoon climate to the south. The average annual temperature is 11°C–16°C, decreasing from south to north. The average annual precipitation is approximately 920 mm, of which over 60% occurs in summer (June–July). Rainfall increases from north to south and from the plains to the mountainous areas. Funiu Mountain, Tongbai Mountain and Dabie Mountain are in the southwest of the Huai River Basin and serve as the main flood source areas of the basin. This area is particularly prone to heavy floods due to the torrential rain induced by the western Pacific subtropical high, typhoons, and other weather systems.

Twenty-four catchments were studied in the upper and middle regions of the Huai River Basin (upstream of the Bengbu Floodgate), including the headwater areas of Funiu Mountain, Tongbai Mountain and Dabie Mountain (Figure 1). As shown in Table 1, the catchment areas range from 17.9 km² to 3,090 km². The average elevation of six of the selected catchments is higher than 500 m, and Huangweihe has the highest elevation (824 m).

Figure 1

Locations of the 24 catchments in the Huai River Basin.

Table 1

Introduction of 24 catchments

| Region | ID | Station | Area (km²) | Average altitude (m) | Average slope (deg) | Average precipitation (mm/a) | Longitude | Latitude |
|---|---|---|---|---|---|---|---|---|
| Funiu Mountain | 1 | Hekou | 2,141 | 164 | 5.4 | 871 | 113°40′ | 33°31′ |
| | 2 | Ziluoshan | 1,880 | 818 | 16.1 | 915 | 112°31′ | 34°10′ |
| | 3 | Guanzhai | 1,030 | 174 | 4.8 | 855 | 113°19′ | 33°23′ |
| | 4 | Gaocheng | 620 | 484 | 8.5 | 895 | 113°08′ | 34°24′ |
| | 5 | Zhongtang | 501 | 674 | 16.7 | 1,021 | 112°34′ | 33°45′ |
| | 6 | Xiagushan | 375 | 456 | 11.2 | 930 | 112°43′ | 33°52′ |
| | 7 | Jizhong | 46 | 386 | 9.5 | 1,103 | 112°42′ | 33°39′ |
| Tongbai Mountain | 8 | Changtaiguan | 3,090 | 194 | 5.9 | 1,069 | 114°04′ | 32°19′ |
| | 9 | Huangchuan | 2,050 | 146 | 6.3 | 1,255 | 115°03′ | 32°09′ |
| | 10 | Dapoling | 1,770 | 226 | 6.8 | 1,067 | 113°45′ | 32°25′ |
| | 11 | Nanlidian | 1,500 | 166 | 7.1 | 1,019 | 114°36′ | 32°02′ |
| | 12 | Yangzhuang | 814 | 137 | 5.0 | 979 | 113°54′ | 33°23′ |
| | 13 | Luzhuang | 390 | 206 | 6.0 | 963 | 113°51′ | 32°43′ |
| | 14 | Baiqueyuan | 284 | 219 | 9.1 | 1,268 | 115°06′ | 31°47′ |
| | 15 | Xinxian | 256 | 259 | 11.6 | 1,301 | 114°52′ | 31°39′ |
| | 16 | Tanjiahe | 152 | 280 | 12.2 | 1,305 | 113°53′ | 31°54′ |
| | 17 | Zhumadian | 109 | 93 | 1.6 | 1,014 | 114°01′ | 32°58′ |
| | 18 | Lixin | 77.8 | 162 | 3.8 | 1,025 | 113°28′ | 32°57′ |
| | 19 | Peihe | 17.9 | 446 | 19.7 | 1,248 | 114°51′ | 31°37′ |
| Dabie Mountain | 20 | Huangnizhuang | 805 | 480 | 14.9 | 1,480 | 115°37′ | 31°28′ |
| | 21 | Bailianya | 737 | 663 | 16.4 | 1,504 | 116°10′ | 31°16′ |
| | 22 | Zhangchong | 493 | 667 | 16.1 | 1,512 | 116°01′ | 31°25′ |
| | 23 | Huangweihe | 292 | 824 | 18.4 | 1,560 | 116°19′ | 31°08′ |
| | 24 | Qilin | 178 | 540 | 15.5 | 1,559 | 115°45′ | 31°28′ |

In this paper, the hydrologic datasets, such as those of daily precipitation and daily runoff, for each selected catchment were obtained from the Hydrologic Yearbook of the Huai River Basin. The 30 m Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) and Landsat 4–5 TM land cover data were provided by the Geographical Spatial Data Cloud of the Computer Network Information Center of the Chinese Academy of Sciences (http://www.gscloud.cn/). Soil types were obtained from the China Soil Scientific Database provided by the Institute of Soil Science of the Chinese Academy of Sciences (http://www.soil.csdb.cn/).

Hydrologic and physical metrics

Six different types of metrics were used: three types of hydrologic metrics and three types of physical metrics. The 23 hydrologic metrics used for catchment classification comprise ten high flow metrics, ten low flow metrics and three average annual runoff metrics (Table 2).

Table 2

Hydrologic metrics used for catchment classification

| Combination | No. | Metrics | Symbols and formulas | Unit |
|---|---|---|---|---|
| High flow | 1–5 | Mean annual maximum 1, 3, 7, 30, 90-day flows | MAX1, MAX3, MAX7, MAX30, MAX90 | mm |
| | 6 | Average of high flow (>75th percentile) | HPT | mm |
| | 7 | Count of high flow (>75th percentile) | HPC | - |
| | 8 | Duration of high flow (>75th percentile) | HPD | days |
| | 9 | High flow value (10th percentile of FDC) | Q10% | mm |
| | 10 | Slope of the FDC for the high flow range | SFH | - |
| Low flow | 11–15 | Mean annual minimum 1, 3, 7, 30, 90-day flows | MIN1, MIN3, MIN7, MIN30, MIN90 | mm |
| | 16 | Average of low flow (<25th percentile) | LPT | mm |
| | 17 | Count of low flow (<25th percentile) | LPC | - |
| | 18 | Duration of low flow (<25th percentile) | LPD | days |
| | 19 | Low flow value (90th percentile of FDC) | Q90% | mm |
| | 20 | Slope of the FDC for the low flow range | SFL | - |
| Average annual runoff | 21 | Annual mean runoff coefficient | α | - |
| | 22 | CV annual mean runoff | CV | - |
| | 23 | Baseflow index | BFI | - |

Of the hydrologic metrics, 16 (No. 1–No. 8 and No. 11–No. 18) were calculated with IHA ver. 7.1 (see http://conserveonline.org/workspaces/iha), which was released by the Nature Conservancy (TNC). Here, all flows exceeding the 75th percentile of the daily flows were defined as high flows, while flows below the 25th percentile of the daily flows were defined as low flows. Moreover, four metrics of the flow duration curve (FDC) (No. 9 and No. 10, and No. 19 and No. 20) were also added as high and low flow metrics (Oudin et al. 2010). In addition, the annual mean runoff coefficient (α), the coefficient of variation of annual mean runoff (CV), and the baseflow index (BFI), i.e., the ratio of baseflow to total flow, were selected as the average annual runoff metrics.
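As an illustration of how some of these metrics can be derived from a runoff record, the following minimal Python sketch computes the percentile-based thresholds, the FDC values Q10% and Q90%, the runoff coefficient and CV. It is not the IHA procedure: the function names, the linear-interpolation percentile, the pooling of all days into one series, and the use of the sample standard deviation for CV are assumptions made for the example.

```python
from statistics import mean, stdev


def percentile(values, q):
    """Percentile (q in [0, 100]) by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def flow_metrics(daily_runoff_mm):
    """Threshold-based and FDC-based metrics from a daily runoff series (mm)."""
    high_threshold = percentile(daily_runoff_mm, 75)  # flows above this are high flows
    low_threshold = percentile(daily_runoff_mm, 25)   # flows below this are low flows
    high = [q for q in daily_runoff_mm if q > high_threshold]
    low = [q for q in daily_runoff_mm if q < low_threshold]
    return {
        "HPT": mean(high),  # average of high flow
        "LPT": mean(low),   # average of low flow
        # Assumption: Q10% is the flow exceeded 10% of the time on the FDC,
        # i.e. the 90th percentile of the values; Q90% is the 10th percentile.
        "Q10%": percentile(daily_runoff_mm, 90),
        "Q90%": percentile(daily_runoff_mm, 10),
    }


def annual_runoff_metrics(annual_runoff_mm, annual_precip_mm):
    """Runoff coefficient (alpha) and coefficient of variation of annual runoff."""
    alpha = sum(annual_runoff_mm) / sum(annual_precip_mm)
    cv = stdev(annual_runoff_mm) / mean(annual_runoff_mm)
    return alpha, cv


if __name__ == "__main__":
    synthetic_daily = [float(v) for v in range(1, 101)]  # synthetic data, not observations
    print(flow_metrics(synthetic_daily))
    print(annual_runoff_metrics([400.0, 500.0, 600.0], [1000.0, 1000.0, 1000.0]))
```

The slopes of the FDC (SFH and SFL) and the BFI are omitted because their exact formulas are not given in the text.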

Simultaneously, to investigate the control factors behind the specific hydrologic response at different flows, 18 physical catchment metrics were analysed and subdivided into three groups: seven topographic metrics, seven shape metrics, and four soil and vegetation metrics. For each combination of these physical metrics, the formulas and descriptions are listed in Table 3.

Table 3

Physical metrics used for catchment classification

| Combination | No. | Metrics | Symbol | Unit |
|---|---|---|---|---|
| Topography | 1 | Average elevation | H | - |
| | 2 | Integral of area-altitude curve | HI | - |
| | 3 | Slope of area-altitude curve | AS | - |
| | 4 | Average slope | β | deg |
| | 5 | Average topographic index (a) | Ti | - |
| | 6 | Plan curvature (b) | CC | - |
| | 7 | Profile curvature (b) | CP | - |
| Shape | 8 | Length | L | km |
| | 9 | Width | B | km |
| | 10 | Form factor (c) | Rf | - |
| | 11 | Elongation ratio (d) | Re | - |
| | 12 | Circularity ratio (e) | Rc | - |
| | 13 | Fractal dimension of river network (f) | D | - |
| | 14 | Peak of GIUH | PGU | - |
| Soil and vegetation | 15 | Sand content | Sand | - |
| | 16 | Clay content | Clay | - |
| | 17 | Silt content | Silt | - |
| | 18 | NDVI (g) | NDVI | - |

(a) In the topographic index ln(α/tanβ), α means the local upslope area draining through a certain point per unit contour length and β is the local slope in radians.

(b) Plan curvature and profile curvature (Schmidt et al. 2003): fx, fy, fxx, fyy and fxy are the first and second derivatives of the height function z = f(x, y).

(c) A means the area of a study catchment.

(d) d means the diameter of the circle whose area is equal to the catchment area.

(e) A′ means the area of the circle whose perimeter is equal to the basin perimeter.

(f) Fractal dimension of river network was calculated by the box-counting algorithm, supposing that N(r) is the number of boxes of side length r required to cover the set.

(g) NDVI is computed from the reflectivity in the near-infrared band and the reflectivity in the red band.

First, for the seven topographic metrics, the area-altitude curve shows the horizontal cross-sectional area at the corresponding altitude (Strahler 1952). The frequency of the topographic index (Ti) (Beven & Kirkby 1979) shows the spatial distribution of the saturated water deficit. Second, within the seven shape metrics, L is the longest straight-line distance from the outlet to the basin divide. The fractal dimension of a river network (D) reflects the development of a drainage network in its catchment. The geomorphologic instantaneous unit hydrograph (GIUH) represents the relationship between the flow concentration and the basin geomorphic factors. Finally, the metrics of soil and vegetation characteristics include the sand content (Sand), clay content (Clay), silt content (Silt) and the normalized difference vegetation index (NDVI).
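For orientation, the three shape ratios and the NDVI can be sketched in Python using their standard textbook definitions, which agree with the footnote definitions of A, d and A′ in Table 3. Whether the authors used exactly these forms is an assumption, and the function names and the perimeter input are hypothetical.

```python
import math


def shape_metrics(area_km2, length_km, perimeter_km):
    """Form factor, elongation ratio and circularity ratio of a catchment.

    Assumed standard definitions:
      Rf = A / L^2   (form factor)
      Re = d / L     (d = diameter of the circle with the catchment's area)
      Rc = A / A'    (A' = area of the circle with the catchment's perimeter)
    """
    form_factor = area_km2 / length_km ** 2
    equal_area_diameter = 2.0 * math.sqrt(area_km2 / math.pi)
    elongation_ratio = equal_area_diameter / length_km
    equal_perimeter_area = perimeter_km ** 2 / (4.0 * math.pi)
    circularity_ratio = area_km2 / equal_perimeter_area
    return {"Rf": form_factor, "Re": elongation_ratio, "Rc": circularity_ratio}


def ndvi(nir_reflectivity, red_reflectivity):
    """Normalized difference vegetation index from near-infrared and red reflectivity."""
    return (nir_reflectivity - red_reflectivity) / (nir_reflectivity + red_reflectivity)


if __name__ == "__main__":
    # A perfectly circular catchment of radius 1 km gives Re = Rc = 1.
    print(shape_metrics(math.pi, 2.0, 2.0 * math.pi))
    print(ndvi(0.5, 0.1))
```

Under these definitions, Re and Rc approach 1 for round catchments and fall towards 0 for elongated or irregular ones.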

Principal component analysis method and cluster analysis method

Unsupervised methods, i.e., principal component analysis (PCA) and cluster analysis (CLA), are used because real-world catchments are highly complex and no catchments with an explicit analytical relationship between physical characteristics and hydrologic response are available for training supervised methods.

In recent years, PCA and CLA have been among the most widely used multivariate statistical tools for catchment classification and regionalization studies. The classification process is shown in Figure 2. First, assuming m metrics, an n × m similarity metric matrix can be formed, where n is the total number of catchments. All the metrics were standardized so that metrics with large numerical ranges do not dominate the classification results. Second, PCA is used to reduce the correlation among the metrics and to make them relatively independent. An eigenvalue equal to or greater than 1.0 is chosen as the criterion to determine the number of principal components. Third, the K-means algorithm of CLA is used for catchment classification. Euclidean distance is adopted as the measure of similarity, i.e., the closer the characteristics of two catchments, the more similar the catchments are (Snelder et al. 2005). The classification results are obtained by minimizing the sum of the squared errors for a given number of classes (Nc). Finally, the silhouette coefficient (Sil) (Rousseeuw 1987) was calculated to determine the optimal Nc as the value that maximizes the average Sil of all catchments. The Sil values lie between −1 and 1, and a Sil close to 1 means that the data are appropriately classified. Judging from previous work, the search range for the optimal Nc is limited (Zhou et al. 2010); in this study it is [2, 4]. Thus, the final classification results of catchment similarity are obtained.
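The standardization and K-means steps described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function names, the deterministic farthest-point seeding and the toy two-metric data are assumptions, and the PCA step is omitted here for brevity.

```python
import math


def standardize(rows):
    """Z-score every column so that no metric dominates the Euclidean distance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, max_iter=100):
    """K-means with Euclidean distance; returns class labels and the final SSE."""
    # Assumed seeding: start from the first point, then repeatedly add the point
    # farthest from the centres chosen so far (deterministic, no random state).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, sse


if __name__ == "__main__":
    # Toy metric matrix (rows: catchments; columns: two hypothetical metrics).
    toy = [[100, 5.0], [110, 5.5], [105, 5.2], [800, 16.0], [820, 18.0], [790, 17.0]]
    labels, sse = kmeans(standardize(toy), k=2)
    print(labels, round(sse, 3))
```

Because K-means can converge to a local minimum, practical applications usually repeat the clustering from several starting configurations and keep the solution with the smallest sum of squared errors.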

Figure 2

The process of clustering for the 24 study catchments. In Equation (1), p is the sum of the number of catchments assigned to Class 1 in both the hydrologic and physical classification results and the number of catchments assigned to Class 2 in both results; n is the total number of catchments, i.e., the sum of the numbers of catchments in Class 1 and Class 2 of either the hydrologic or the physical classification.


Results and evaluation of catchment classification

According to the six types of hydrologic and physical metrics, six groups of classification results were obtained. Similar catchments were classified within each classification result. Thus, the overlap rate between results for two different classifications can be determined, i.e., one hydrologic similarity group vs. one physical similarity group. The overlap coefficient (η) was used as an intuitive index of the consistency between the two classification results, and it can determine whether ‘physically similar’ catchments are also ‘hydrologically similar’. A simplified procedure was used for determination of the overlap rate, requiring that both hydrologic classification and physical classification have the same Nc:

$$\eta = \frac{p}{n} \times 100\% \qquad (1)$$

where p is the number of catchments belonging to the same class in the two different classification results, counted as illustrated in Figure 2, and n is the total number of catchments. Greater values of η suggest that the classification results between hydrologic classification and physical classification are more consistent.

To verify the reliability of this simplified evaluation procedure, the classical efficiency measure Rand index (RI) (Rand 1971) was also employed to evaluate the agreement between the hydrologic classification and physical classification results. An RI closer to 1 means that the two classification results are more consistent.
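Both agreement measures are simple to compute from two label vectors. The sketch below assumes that η is the share of catchments carrying the same class label in both classifications and that RI is the classical pairwise Rand index; the label vectors are hypothetical and only reproduce the case of 19 overlapping catchments out of 24.

```python
from itertools import combinations


def overlap_coefficient(hydro_labels, phys_labels):
    """Overlap coefficient in percent: catchments with the same class in both results."""
    p = sum(h == q for h, q in zip(hydro_labels, phys_labels))
    return 100.0 * p / len(hydro_labels)


def rand_index(labels_a, labels_b):
    """Rand index: share of catchment pairs treated consistently by both results."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


if __name__ == "__main__":
    # Hypothetical labels: 24 catchments, 5 of which change class between results.
    hydro = [1] * 12 + [2] * 12
    phys = [2] * 3 + [1] * 9 + [1] * 2 + [2] * 10
    print(round(overlap_coefficient(hydro, phys), 1))  # 79.2
    print(round(rand_index(hydro, phys), 2))           # 0.66
```

Unlike η, the Rand index does not depend on how the classes are numbered, which is why it serves as an independent check on the simplified procedure.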

Determination of the number of classes

To reduce the correlation of the metrics, the principal components of different combinations of metrics were obtained with the PCA method to replace the original metrics. As listed in Table 4, the principal components with an eigenvalue equal to or greater than 1.0 are retained for each combination of metrics. For example, three principal components were retained for the combination of the high flow metrics. Each principal component can explain specific metrics, as listed in Table 4.

Table 4

Eigenvalues of principal components

| Combination | Components | Eigenvalue | Main metrics |
|---|---|---|---|
| High flow | PC1 | 6.96 | MAX1-MAX90, Q10%, HPT, SFH |
| | PC2 | 1.33 | HPC, HPD |
| | PC3 | 1.02 | - |
| Low flow | PC1 | 7.55 | MIN1-MIN90, Q90%, LPT, SFL |
| | PC2 | 1.52 | LPC, LPD |
| Average annual runoff | PC1 | 1.58 | CV, BFI |
| | PC2 | 1.14 | α |
| Topography | PC1 | 5.30 | H, HI, AS, β, Ti, CC, CP |
| Shape | PC1 | 3.75 | L, B, Rc, PGU |
| | PC2 | 2.31 | Rf, Re, D |
| Soil and vegetation | PC1 | 2.62 | Sand, Clay, Silt |
| | PC2 | 1.05 | NDVI |

PC1–PC3 represent the first, second and third principal components from PCA of each combination of metrics.
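The retention rule can be illustrated with a short sketch that computes the eigenvalues of the correlation matrix of a metric table and keeps those equal to or greater than 1.0. It is written in plain Python with a cyclic Jacobi routine so that it runs without dependencies; in practice a library routine such as `numpy.linalg.eigvalsh` would normally be used. The function names and the toy data are assumptions.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of a data table."""
    cols = [list(c) for c in zip(*rows)]
    n = len(rows)
    means = [sum(c) / n for c in cols]
    dev = [[v - m for v in c] for c, m in zip(cols, means)]
    norm = [math.sqrt(sum(v * v for v in d)) for d in dev]
    return [[sum(a * b for a, b in zip(dev[i], dev[j])) / (norm[i] * norm[j])
             for j in range(len(cols))] for i in range(len(cols))]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, in descending order."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


if __name__ == "__main__":
    # Toy table: 4 catchments x 3 hypothetical metrics (second column = 2 x first).
    toy = [[1.0, 2.0, 5.0], [2.0, 4.0, 1.0], [3.0, 6.0, 4.0], [4.0, 8.0, 2.0]]
    eigenvalues = jacobi_eigenvalues(correlation_matrix(toy))
    retained = [ev for ev in eigenvalues if ev >= 1.0]
    print(eigenvalues, len(retained))
```

Since the eigenvalues of a correlation matrix sum to the number of metrics, the retained eigenvalues in Table 4 also indicate how much of the total variance each combination keeps (for example, 6.96 + 1.33 + 1.02 out of 10 for high flow).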

With the principal components identified by PCA, catchments can be further classified into different classes according to a given Nc in the K-means algorithm. However, Nc itself is also a predetermined parameter, and the assessment procedure, e.g., the overlap coefficient adopted in this paper, requires both the hydrologic classification and physical classification to have the same Nc. In this study, the search range for Nc is [2, 4]. According to the average Sil of six different classification results based on different combinations of the hydrologic and physical metrics, the optimal Nc is 2 (Figure 3). This finding is also consistent with the results of Yu & Wen (1996), who found that all the upstream catchments in the Huai River Basin can be subdivided into two classes: mountainous and hilly areas. In the upper regions of the Huai River Basin, mountainous areas are characterized by high elevation and steep terrain, and hilly areas are characterized by relatively lower elevation and relatively flat terrain.
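The selection of Nc by the average silhouette coefficient can be sketched as below. The silhouette formula follows Rousseeuw (1987); the one-dimensional points and the candidate partitions are hypothetical and stand in for K-means results obtained with different Nc.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette coefficient of a partition (at least two classes required)."""
    n = len(points)
    scores = []
    for i in range(n):
        same = [math.dist(points[i], points[j])
                for j in range(n) if j != i and labels[j] == labels[i]]
        if not same:  # singleton class: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance to the own class
        b = min(                   # mean distance to the nearest other class
            sum(math.dist(points[i], points[j]) for j in range(n) if labels[j] == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / n


if __name__ == "__main__":
    points = [[0.0], [1.0], [10.0], [11.0]]
    # Hypothetical partitions, keyed by the number of classes Nc.
    candidates = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
    best_nc = max(candidates, key=lambda nc: mean_silhouette(points, candidates[nc]))
    print(best_nc)  # 2
```

In this toy case, splitting the second group into two singleton classes lowers the average Sil, so Nc = 2 is preferred.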

Figure 3

The silhouette coefficient based on different numbers of classes.


Therefore, the 24 catchments can be classified into two classes regardless of the kinds of metrics adopted. In terms of the hydrologic metric values of each class, the class characterized by large water quantities (large values of Q10%, Q90%, and BFI) or fast hydrologic responses (large values of SFH and SFL) was defined as Class 1, and the class characterized by small water quantities (small values of Q10%, Q90%, and BFI) or slow hydrologic responses (small values of SFH and SFL) was defined as Class 2. As reported by previous studies, the water yield increased with elevation (Hunsaker et al. 2012), and the more circular the shape of the catchment, the more concentrated the flash flood was (Hognogi 2014). Additionally, an increase in vegetation can substantially reduce streamflow (Li et al. 2012). Therefore, for the topography classification, the class characterized by high elevation (large value of H) was defined as Class 1 to correspond to a large water quantity, and the class characterized by low elevation (small value of H) was defined as Class 2 to correspond to a small water quantity. For the shape classification, catchments characterized by circular shapes (large value of Rc) were defined as Class 1 to correspond to a fast hydrologic response, and were otherwise defined as Class 2 to correspond to a slow hydrologic response. For the soil and vegetation classification, the class characterized by poor vegetation cover (small value of NDVI) was defined as Class 1 to correspond to a large water quantity, and the class characterized by good vegetation cover (large value of NDVI) was defined as Class 2 to correspond to a small water quantity.

Classification results by hydrologic and physical metrics

The classification results are highly dependent upon the combination of catchment metrics fed into the K-means algorithm (Figures 4–6), and the overlap coefficient and RI were calculated to evaluate the consistency of the hydrologic classification and physical classification results (Table 5). If the values of the overlap coefficient and RI between the two classification results are large, then the physical characteristics can be used to indicate the catchment hydrologic response.

Table 5

Comparison of classification results based on hydrologic and physical metrics

| Physical classification | Hydrologic classification | Number of overlap | Number of non-overlap | η (%) | Average η (%) | RI | Average RI |
|---|---|---|---|---|---|---|---|
| Topography | High flow | 19 | 5 | 79.2 | 68.1 | 0.66 | 0.56 |
| | Low flow | 16 | 8 | 66.7 | | 0.54 | |
| | Average annual runoff | 14 | 10 | 58.3 | | 0.49 | |
| Shape | High flow | 18 | 6 | 75.0 | 55.6 | 0.61 | 0.53 |
| | Low flow | 13 | 11 | 54.2 | | 0.48 | |
| | Average annual runoff | 9 | 15 | 37.5 | | 0.51 | |
| Soil and vegetation | High flow | 13 | 11 | 54.2 | 51.4 | 0.48 | 0.49 |
| | Low flow | 10 | 14 | 41.7 | | 0.49 | |
| | Average annual runoff | 14 | 10 | 58.3 | | 0.49 | |

Figure 4

Classifications of 24 catchments by six combinations of metrics. Catchments with large quantity of water or fast hydrologic response belong to Class 1. Catchments with small quantity of water or slow hydrologic response belong to Class 2. Catchment ID corresponds to the ID in Table 1, representing each catchment.

Figure 5

Results of the catchment classification: (a) high flow classification; (b) low flow classification; (c) average annual runoff classification; (d) topography classification; (e) shape classification; (f) soil and vegetation classification.

Figure 6

Boxplots of metric characteristics for different classes: (a) high flow classification; (b) low flow classification; (c) average annual runoff classification; (d) topography classification; (e) shape classification; (f) soil and vegetation classification. The boxplots show the 25th and 75th percentiles, the median, values inside the interquartile range from the box and outliers.


The quantification of the agreement between different classification results revealed that the topographic metrics strongly indicate the hydrologic response at different ranges of flow, with the highest average overlap coefficient of 68.1% and highest average RI of 0.56. The overlap coefficient of 79.2% and RI of 0.66 between the topography classification and high flow classification were the highest values; therefore, high flow metrics and topographic metrics were the best partners among all the tested combinations of hydrologic and physical metrics. Topographic metrics played a key role in the hydrologic response at high flow. In the high flow classification and the topography classification, the catchments within Class 1 are mainly mountainous, while those within Class 2 are mainly hilly (Figure 5(a) and 5(d)).

The distributions of high flow and topographic metrics within each class are shown in Figure 6(a) and 6(d). The catchments within Class 1 were mostly convergent (negative CC) and concave (negative CP and larger absolute value of CP); furthermore, the elevations were higher (larger H), and the terrains were steeper (larger AS, HI and β, and smaller Ti). Thus, water storage capacities were weak for the catchments within Class 1 (in the topography classification, the average value of α for the Class 1 catchments was 0.51, while that for the Class 2 catchments was 0.36). In addition, the flood hydrograph was characterized by a rapid rise and recession (larger SFH and smaller HPD). In the mountainous catchments within Class 1, the convection was intensified due to terrain uplift, creating a heavy rain centre and consequently increasing the flood peak and volume. In contrast, most catchments within Class 2 were divergent (positive CC) and slightly concave (negative CP and smaller absolute value of CP), had relatively lower elevations, and had flatter terrains (smaller AS, HI and β, and larger Ti). Therefore, the storage capacity of Class 2 was larger than that of Class 1, and the flood hydrograph rose and receded slowly with a lower peak discharge (smaller SFH and larger HPD).

The shape factors also had a strong effect on high flow ( and ), second only to the topography factors. The shape of a catchment is an indicator of flood concentration. According to the distributions of the shape metrics (Figure 6(e)), the values of Rf, Re and Rc for the catchments within Class 1 were larger than those for the catchments within Class 2, indicating that the Class 1 catchments were rounder. In a round catchment, a droplet falling at any point takes less time to reach the outlet, so the flood flow increases sharply. In contrast, the elongated catchments within Class 2 had lower peak flows with longer durations. The GIUH reflects the relationship between the geomorphic factors and the flood process of a river network: the sharper the GIUH and the larger the PGU, the faster the flow concentration and the higher the flood peak. Consistent with this, the PGU was larger and D was relatively smaller for the catchments within Class 1 (Figure 6(e)). Thus, floods developed rapidly in the catchments within Class 1, and large floods were more common.
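Why rounder catchments score higher on all three shape metrics can be seen in a short sketch. It assumes the common textbook definitions, which may differ in detail from those used in this study: form factor Rf = A/L², elongation ratio Re = (diameter of the circle with the same area)/L, and circularity ratio Rc = 4πA/P², where A is the area, P the perimeter and L the basin length. The two basins below are hypothetical geometric shapes, and taking L as the circle's diameter or the rectangle's long side is a simplification.

```python
import math


def shape_factors(area_km2, perimeter_km, length_km):
    """Return (Rf, Re, Rc) under the textbook definitions assumed above."""
    form = area_km2 / length_km**2                           # form factor Rf
    elong = 2.0 * math.sqrt(area_km2 / math.pi) / length_km  # elongation ratio Re
    circ = 4.0 * math.pi * area_km2 / perimeter_km**2        # circularity ratio Rc
    return form, elong, circ


# Two hypothetical basins with the same area (100 km^2):
# a circle, and a 5 km x 20 km rectangle.
radius = math.sqrt(100.0 / math.pi)
round_basin = shape_factors(100.0, 2.0 * math.pi * radius, 2.0 * radius)
long_basin = shape_factors(100.0, 2.0 * (5.0 + 20.0), 20.0)

for name, (form, elong, circ) in (("round", round_basin), ("elongated", long_basin)):
    print(f"{name:9s} Rf = {form:.2f}, Re = {elong:.2f}, Rc = {circ:.2f}")
```

For equal area, the round basin has the shorter length and perimeter, so all three ratios are larger, in line with the Class 1 versus Class 2 contrast described above.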

Although the shape factors had a strong effect on high flow, they played a weak role in baseflow regulation: the overlap coefficient and RI between the shape classification and the low flow (or average annual runoff) classification were low. The effect of soil and vegetation on runoff was also not significant at any range of flow.

Finally, the Huai River Basin lies in the climate transition zone between northern and southern China, and precipitation decreases from south to north. The driving force factors, e.g., precipitation, may significantly affect hydrologic behaviours, especially low flow and average annual runoff. For example, in the low flow classification and the average annual runoff classification (Figure 5(b) and 5(c)), the catchments within Class 1 were mainly located in the south and those within Class 2 mainly in the north, except for the Hekou, Yangzhuang and Gaocheng catchments, which may be strongly affected by human activities (e.g., reservoirs and dams) (Zhang et al. 2011). This partly explains why the results of the hydrological classification and the physical classification do not completely coincide.

How to explain high flow using topographic metrics

Because the high flow classification and the topography classification were the most consistent, the topographic and high flow metrics were selected for further classification experiments. We aimed to determine whether adopting more topographic metrics for catchment classification yields results that are more consistent with the high flow classification. Because the overlap coefficient and RI showed consistent trends in the above section (Table 5), the overlap coefficient was adopted as the main assessment measure, as it is more intuitive and simpler to calculate.
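The consistent trend of the two measures has a simple explanation when there are exactly two classes, under the assumption (not restated in this section) that the overlap coefficient η is the share of the n catchments on which the two classifications agree after label matching. With m = ηn agreeing catchments, a pair of catchments is treated consistently by both classifications exactly when both members agree or both disagree, which gives:

```latex
\[
\mathrm{RI} \;=\; \frac{\binom{m}{2} + \binom{n-m}{2}}{\binom{n}{2}},
\qquad m = \eta\, n .
\]
\[
n = 24,\; m = 19:\qquad
\mathrm{RI} = \frac{171 + 10}{276} \approx 0.66 .
\]
```

The numerator grows by 2m − n + 1 when m increases by one, so for m ≥ n/2 a higher overlap coefficient always gives a higher RI under these assumptions.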

In the following analysis, the topographic and high flow metrics were screened to improve the overlap between the high flow classification and the topography classification. First, the average correlation coefficient (using absolute values) between each topographic metric and all the high flow metrics was calculated (Figure 7(a)), and the topographic metrics with the minimum average correlation coefficient were removed one at a time. After each removal, the remaining topographic metrics were used for catchment classification with the PCA and CLA methods, as shown in Figure 8; the classification based on all high flow metrics served as the reference. As seen in Figure 7(b), the overlap coefficient tended to increase as the weakly correlated topographic metrics were removed, peaking at 83.3% when only two topographic metrics (AS and HI) remained. However, when only the single topographic metric with the maximum average correlation coefficient with all high flow metrics (AS) was used to classify the 24 catchments, the overlap coefficient dropped to 79.2%. Thus, among all topographic metrics, AS and HI were the closest topographic cousins for similarity classification when physical metrics are used to explain high flow characteristics. Figure 9 shows that the area-altitude curves of the two catchment classes clearly differ, which suggests that the characteristics of the area-altitude curve are a good indicator of hydrologic response at high flow.
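The screening loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's implementation: the PCA step is omitted, the clustering is a bare two-class K-means on standardized metrics with a deterministic start, the overlap coefficient is taken as the matched share of co-classified catchments, and the data and metric names (F1 to F3, T1 to T4) are synthetic placeholders.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_means(columns, n_iter=100):
    """Lloyd's K-means with k = 2 on z-scored metrics (sketch only)."""
    scaled = []
    for col in columns:
        mu, sd = statistics.mean(col), statistics.pstdev(col)
        scaled.append([(v - mu) / sd for v in col])
    points = list(zip(*scaled))                    # one tuple per catchment
    origin = (0.0,) * len(columns)
    # Deterministic start: the point farthest from the mean, then the
    # point farthest from that one.
    first = max(points, key=lambda p: math.dist(p, origin))
    second = max(points, key=lambda p: math.dist(p, first))
    centers, labels = [first, second], None
    for _ in range(n_iter):
        new = [min((0, 1), key=lambda k: math.dist(p, centers[k])) for p in points]
        if new == labels:
            break
        labels = new
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(statistics.mean(c) for c in zip(*members))
    return labels


def overlap(a, b):
    """Matched share of catchments placed in the same class (two classes)."""
    same = sum(x == y for x, y in zip(a, b))
    return max(same, len(a) - same) / len(a)


def screen(topo, flow):
    """Drop the topographic metric least correlated with high flow, one at a time."""
    reference = two_means(list(flow.values()))
    score = {name: statistics.mean([abs(pearson(col, f)) for f in flow.values()])
             for name, col in topo.items()}
    keep, history = list(topo), []
    while keep:
        labels = two_means([topo[name] for name in keep])
        history.append((tuple(keep), overlap(reference, labels)))
        keep.remove(min(keep, key=score.get))
    return history


# Synthetic example: 24 catchments, 3 high flow and 4 topographic metrics.
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(24)]      # hidden common driver
flow = {f"F{j}": [v + rng.gauss(0, 0.5) for v in latent] for j in range(1, 4)}
topo = {"T1": [v + rng.gauss(0, 0.3) for v in latent],
        "T2": [v + rng.gauss(0, 0.6) for v in latent],
        "T3": [rng.gauss(0, 1) for _ in range(24)],
        "T4": [rng.gauss(0, 1) for _ in range(24)]}

for kept, eta in screen(topo, flow):
    print(f"{len(kept)} metrics {kept}: overlap = {eta:.1%}")
```

Because the average correlation of a topographic metric with the high flow metrics does not depend on which other topographic metrics remain, the ranking is computed once and only the clustering is repeated after each removal.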

Figure 7

Process of searching for the highest overlap coefficients between two classification results based on two types of metrics regarding high flow and topography. (a) Average correlation coefficients between each topographic metric and all high flow metrics. (b) Overlap coefficients between the classification results based on high flow and topographic metrics with the removal of the topographic metrics. (c) Average correlation coefficients between each high flow metric and the closest topographic cousins (AS and HI). (d) Overlap coefficients between the classification results based on high flow and topographic metrics (AS and HI) with the removal of the high flow metrics.

Figure 8

Different classification results according to different combinations of topographic metrics. The topographic metrics with minimum average correlation coefficient with all high flow metrics were screened one by one.

Figure 9

Area-altitude curves of 24 study catchments. Solid lines represent the curves of the catchments within Class 1 in the topography classification and dotted lines represent the curves of the catchments within Class 2.


The comprehensive hydrologic characteristics of a catchment comprise various aspects of its hydrologic behaviour. A comprehensive description would require all the hydrologic metrics, and a hydrologic similarity classification based on physical characteristics would then require all the physical metrics as well, which is difficult to achieve. For ungauged areas, however, it is possible to indicate a specific hydrologic response from a few easily available physical factors. Thus, the high flow metrics were also subjected to the screening experiment. Using the same method as above, the high flow metrics with the minimum average correlation coefficients with AS and HI were eliminated one by one (Figure 7(c)) in an attempt to further improve the overlap coefficient and Rand index between the two catchment classifications. The classification results based on the different sets of high flow metrics are shown in Figure 10; the classification based on the closest topographic cousins (AS and HI) served as the reference. When only the three high flow metrics with the maximum average correlation coefficients with AS and HI (Q10%, SFH and MAX90) were retained, the overlap coefficient between the high flow classification and the topography classification peaked at 87.5% (Figure 7(d)). As with the topographic metrics, excluding further high flow metrics lowered the overlap coefficient. The classification results obtained with the closest high flow cousins (Q10%, SFH and MAX90) and the closest topographic cousins (AS and HI) are compared in Figure 11.

Figure 10

Different classification results according to different combinations of high flow metrics. The high flow metrics with minimum average correlation coefficient with the closest topographic metric cousins (AS and HI) were screened one by one.

Figure 11

Comparison of the classification results obtained with the closest high flow cousins (Q10%, SFH and MAX90) and the closest topographic cousins (AS and HI). (a) High flow classification by the closest high flow cousins (Q10%, SFH and MAX90). (b) Topography classification by the closest topographic cousins (AS and HI).


The screening experiments, which sought the most suitable high flow and topographic metrics for similarity classification, show that the overlap coefficient does not necessarily improve as the number of metrics increases. A large number of metrics may obscure the internal laws of catchment properties rather than provide a comprehensive description of them. In this study, the best metrics for similarity classification based on the specific hydrologic response at high flow are the closest three high flow cousins (Q10%, SFH and MAX90) and the closest two topographic cousins (AS and HI). The most consistent results between the high flow classification and the topography classification were thus obtained with a small, carefully screened set of metrics rather than with as many metrics as possible.

In this study, 24 catchments in the upper regions of the Huai River Basin were studied. Using the PCA and CLA methods and various combinations of metrics, including hydrologic response factors (high flow, low flow and average annual runoff) and physical factors (topography, shape, soil and vegetation), all the catchments were classified into two classes. Comparing the consistency of the hydrological classifications and the physical classifications showed that, among the physical factors, the topographic metrics were the best indicators of the hydrologic response at high flow, with the highest overlap coefficient (η = 79.2%) and the highest Rand index (RI = 0.66). However, the topographic metrics could only indicate the specific hydrologic behaviour at high flow rather than the flow characteristics at all levels. For low flow and average annual runoff, climate-driven factors (e.g., precipitation) may be an effective measure for similarity classification.

For hydrologic similarity classification, it is necessary to select the physical control factors that match the specific hydrologic behaviour of interest. For instance, the best partners for similarity classification based on the high flow and topographic factors are the three high flow metrics (Q10%, SFH and MAX90) and the two topographic metrics (AS and HI). As discussed, adding metrics to or removing metrics from this set reduced the consistency of the classification results. This study thus provides a possible strategy for hydrologic similarity classification based on physical catchment metrics. However, the bootstrap approach according to previous studies to determine the correspondence between one hydrologic similarity group and one physical similarity group still has some uncertainties. Moreover, hydrologic classification and regionalization for ungauged basins remain a challenge. More objective, easy-to-operate automatic procedures can be developed in further studies. A more theoretical interpretation of hydrological responses in terms of essential physical factors is likely to have the most significant impact on this field.

This work was financially supported by the National Key R&D Program of China (2016YFC0401501), the National Natural Science Foundation of China (NSFC) (grants 41771025, 91647108, 41271040), and the Special Fund of the State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (no. 20145028012).

Ali G., Tetzlaff D., Soulsby C., McDonnell J. J. & Capell R. 2012 A comparison of similarity indices for catchment classification using a cross-regional dataset. Advances in Water Resources 40, 11–22. doi: 10.1016/j.advwatres.2012.01.008.

Aryal S. K., O'Loughlin E. M. & Mein R. G. 2002 A similarity approach to predict landscape saturation in catchments. Water Resources Research 38 (10), 1208. doi: 10.1029/2001WR000864.

Berne A., Uijlenhoet R. & Troch P. A. 2005 Similarity analysis of subsurface flow response of hillslopes with complex geometry. Water Resources Research 41, W09410. doi: 10.1029/2004WR003629.

Beven K. J. 2000 Uniqueness of place and process representations in hydrological modeling. Hydrology and Earth System Sciences 4, 203–213. doi: 10.5194/hess-4-203-2000.

Beven K. J. & Kirkby M. J. 1979 A physically based, variable contributing area model of basin hydrology. Hydrological Sciences Bulletin 42 (1), 43–69.

Chiverton A., Hannaford J., Holman I., Corstanje R., Prudhomme C., Bloomfield J. & Hess T. M. 2015 Which catchment characteristics control the temporal dependence structure of daily river flows? Hydrological Processes 29, 1353–1369. doi: 10.1002/hyp.10252.

Dooge J. C. I. 1986 Looking for hydrologic laws. Water Resources Research 22 (9), 46S–56S.

Harman C. J. & Sivapalan M. 2009 A similarity framework to assess controls on shallow subsurface flow dynamics. Water Resources Research 45, W01417. doi: 10.1029/2008WR007067.

Hognogi G. G. 2014 The general hydrotechnical planning potential of Ţibleş Mountains. Geographia Napocensis VIII (1), 57–70.

Hunsaker C. T., Whitaker T. W. & Bales R. C. 2012 Snowmelt runoff and water yield along elevation and temperature gradients in California's southern Sierra Nevada. Journal of the American Water Resources Association 48 (4), 667–678.

Ley R., Casper M. C., Hellebrand H. & Merz R. 2011 Catchment classification by runoff behaviour with self-organizing maps (SOM). Hydrology and Earth System Sciences 15, 2947–2962. doi: 10.5194/hess-15-2947-2011.

Lyon S. W. & Troch P. A. 2010 Development and application of a catchment similarity index for subsurface flow. Water Resources Research 46, W03511. doi: 10.1029/2009WR008500.

McDonnell J. J. & Woods R. 2004 On the need for catchment classification. Journal of Hydrology 299, 2–3. doi: 10.1016/j.jhydrol.2004.09.003.

McDonnell J. J., Sivapalan M., Vaché K., Dunn S., Grant G., Haggerty R., Hinz C., Hooper R., Kirchner J., Roderick M. L., Selker J. & Weiler M. 2007 Moving beyond heterogeneity and process complexity: a new vision for catchment hydrology. Water Resources Research 43, W07301. doi: 10.1029/2006WR005467.

Merz R. & Blöschl G. 2009 A regional analysis of event runoff coefficients with respect to climate and catchment characteristics in Austria. Water Resources Research 45, W01405. doi: 10.1029/2008WR007163.

Oudin L., Kay A., Andréassian V. & Perrin C. 2010 Are seemingly physically similar catchments truly hydrologically similar? Water Resources Research 46, W11558. doi: 10.1029/2009WR008887.

Rand W. M. 1971 Objective criteria for the evaluation of methods clustering. Journal of the American Statistical Association 66 (336), 846–850.

Reichl J. P. C., Western A. W., McIntyre N. R. & Chiew F. H. S. 2009 Optimization of a similarity measure for estimating ungauged streamflow. Water Resources Research 45, W10423. doi: 10.1029/2008WR007248.

Rodríguez-Iturbe I. & Valdés J. B. 1979 The geomorphologic structure of hydrologic response. Water Resources Research 15 (6), 1409–1420. doi: 10.1029/WR015i006p01409.

Rousseeuw P. J. 1987 Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Computational and Applied Mathematics 20, 53–65. doi: 10.1016/0377-0427(87)90125-7.

Sawicz K. A., Wagener T., Sivapalan M., Troch P. A. & Carrillo G. 2011 Catchment classification: empirical analysis of hydrologic similarity based on catchment function. Hydrology and Earth System Sciences 15, 2895–2911.

Sawicz K. A., Kelleher C., Wagener T., Troch P. A., Sivapalan M. & Carrillo G. 2014 Characterizing hydrologic change through catchment classification. Hydrology and Earth System Sciences 18, 273–285.

Schmidt J., Evans I. S. & Brinkmann J. 2003 Comparison of polynomial models for land surface curvature calculation. International Journal of Geographical Information Science 17 (8), 797–814.

Sivakumar B., Singh V. P., Berndtsson R. & Khan S. K. 2013 Catchment classification framework in hydrology: challenges and directions. Journal of Hydrologic Engineering 20 (1), A4014002. doi: 10.1061/(ASCE)HE.1943-5584.0000837.

Snelder H. T., Biggs F. B. & Woods R. A. 2005 Improved eco-hydrological classification of rivers. River Research and Application 21, 609–628.

Strahler A. N. 1952 Hypsometric area-altitude analysis of erosional topography. Geological Society of America 63 (11), 1117–1142.

Szolgayova E., Laaha G., Blöschl G. & Bucher C. 2014 Factors influencing long range dependence in streamflow of European rivers. Hydrological Processes 15, 2895–2911. doi: 10.1002/hyp.9694.

Wagener T., Sivapalan M., Troch P. & Woods R. 2007 Catchment classification and hydrologic similarity. Geography Compass 1 (4), 901–931.

Yu G. Z. & Wen Z. 1996 Research on the eco-environmental problems of the mountain and hill areas in Huaihe River upper watershed and their control measures. Journal of Xinyang Teachers College 9 (2), 177–181 (in Chinese).

Zhang Y. Y., Arthington A. H., Bunn S. E., Mackay S., Xia J. & Kennard M. 2011 Classification of flow regimes for environmental flow assessment in regulated rivers: the Huai River Basin, China. River Research & Applications 28 (7), 989–1005.

Zhou S. B., Xu Z. Y. & Tang X. Q. 2010 Method for determining optimal number of clusters in K-means clustering algorithm. Journal of Computer Applications 30 (8), 1995–1998 (in Chinese).