Runoff has been greatly affected by climate change and human activities. Studying nonlinear controls on runoff response is of great significance for water resource management decision-making and ecological protection. However, there is limited understanding of what physical mechanisms dominate the runoff response and of their predictability over space. This study analyzed the spatial patterns of runoff response including runoff changes and its sensitivity to climate–landscape variations in 1,003 catchments of the contiguous United States (CONUS). Then, an interpretable machine learning method was used to investigate the nonlinear relationship between watershed attributes and runoff response, which enables the importance of influencing factors. Finally, the random forest model was employed to predict runoff response according to the predictors of catchment attributes. The results show that alteration of runoff is up to 56%/10 years due to climate change and human activities. Catchment attributes substantially altered runoff over CONUS (−60% to 56%/10 years). Climate, topography, and hydrology are the top three key factors which nonlinearly control runoff response patterns which cannot be captured by the linear correlation method. The random forest can predict runoff response well with the highest R2 of 0.96 over CONUS.

  • Spatial patterns of runoff response were explored over 1,000+ catchments of the contiguous United States.

  • Interpretable machine learning method was used to investigate the nonlinear relationship between runoff response and catchment attributes.

  • Climate, hydrology, and topography are the top three dominant factors that nonlinearly control the runoff response, which can be well predicted through a random forest based on 56 indices.

Global warming will bring about a series of meteorological problems, such as rising temperatures (Gampe et al. 2021), uneven spatiotemporal distribution of rainfall (Ghosh et al. 2011), glacier melting (Huss & Hock 2018), and drought risks (Pokhrel et al. 2021). The impact of such climate change on the hydrological cycle has been a hot research area for hydrologists (Zhang et al. 2021). Human activities are another major factor in runoff behavior. The impact of direct human activities, such as the construction of river engineering, industrial and agricultural water intake, land use/cover changes that cause evaporation and changes in groundwater level on runoff lead to spatiotemporal changes in water resources; human urbanization, farmland transformation, and other activities affect the land–atmosphere feedback mechanism, further affecting hydrological processes (Wang et al. 2020). Therefore, understanding the impact of climate change and human intensification on runoff behavior will help decision-makers in facing water resource shortage crises and drought risks.

Climate elasticity is a method that reflects the sensitivity of runoff to climate change. The larger its absolute value, the more sensitive is runoff variation to the influencing factors. Tang & Lettenmaier (2012) demonstrated that the sensitivity of runoff to climate change is relatively stable in scenarios of global temperature rise and increased emissions, but there are significant differences in high-latitude and arid or semi-arid regions. Gong et al. (2022) investigated the spatial patterns of sensitivity of runoff to climate and catchment properties over 188 catchments of the contiguous United States (CONUS). Their results demonstrated the critical role of catchment properties on runoff, which are consistent with ours; this indicates the robustness of our results. In addition to the natural factors, human activities have a nonnegligible role in runoff. For example, Li et al. (2020) demonstrated that future watershed hydrologic impacts of urbanization vary dramatically across the United States through use of a monthly water balance model across 81,900 12-digit hydrologic unit code (HUC) watersheds of CONUS. The direct impact of human activities on runoff has not been studied in this study. Our results also validated this impact on runoff. Total changes in runoff did not equal the contribution of natural factors, and we infer the difference between them is caused by human activities. However, there is little research on the relationship between climate elasticity, runoff changes, and controlling factors. The current understanding of climate elasticity is stable across time variations within the watershed. Understanding this unchanged pattern and the mechanism would help model the runoff–precipitation relationship.

Machine learning is a black-box model whose behavior and internal connections are often unobservable (Molnar et al. 2018). In order to understand the running process and internal connections of the model, model users need not only to pay attention to the accuracy of the model's results but also to understand the situation of the model itself. The interpretable role of machine learning is to make the running behavior of the black-box model visible. Accumulated local effects (ALE) is an effective interpretation method based on the completion of model operation, which is an extension of the partial dependency plots (PDP) method, through judging the average change in output results through the edge distribution changes of input variables and further evaluating the importance of input variables to the output results. However, the disadvantage of PDP is that it cannot accurately evaluate the importance of predictive variables with related relationships. The ALE method enables the evaluation of nonlinear relationships between attributes, and makes it possible to overcome the effect of correlated catchment attributes; it has not been used often in hydrology. Recently, the method has been widely used to investigate how catchment attributes and climate attributes influence the hydrological response. For example, Stein et al. (2021) applied the ALE method to evaluate how catchment attributes and climate attributes influence the distribution of flood processes across 671 catchments of CONUS. They found the climate in terms of the fraction of snow, aridity, precipitation seasonality, and mean precipitation to be most influential on flood process distribution. The ALE method, however, also has certain limitations: firstly, for example, ALE evaluates the model's response to changes in a certain predictive variable, and its results are not directly based on data (Anchang et al. 2020). When the simulation results of the model are poor, ALE results are generally not good. Secondly, when there is less predictive variable data, ALE results are unreliable.

The ability to accurately predict behavior patterns is a critical test of the adequacy of knowledge in any subject (Crawford & Linsley 1966). Previous studies on spatial prediction in ungauged basins have mainly focused on hydrological variables such as runoff and hydrological signatures (Hrachowitz et al. 2013; Kuentz et al. 2017). Studies on runoff changes caused by climate change and/or human activities and their relative importance have expanded over time (Kalugin 2019; Somorowska & Łaszewski 2019). There are three approaches to achieving spatial predictability of runoff changes based on predictors of climatic and physiographic attributes: (1) statistical models with predictors of catchment characteristics (Beck et al. 2013; Kuentz et al. 2017). For example, Stein et al. (2021) used a random forest model to predict the flood generation process. (2) Interpolation with similar catchments between adjacent gauged basins (Westerberg & McMillan 2015); Kuentz et al. (2017) analyzed the hydrological similarity and further explored the dominant controls on hydrological response over 30,000+catchments of Europe. (3) Hydrological models with parameters regionalized based on catchment attributes (Wagener & Wheater 2006; Ragettli et al. 2017); Shao et al. (2020) used a physics-based distributed hydrological model, embedded with a multi-purpose reservoir module, to quantify the effects of urbanization on floods.

Random forests can consider multiple predictors and nonlinear relationships. Catchment behavior is usually determined by multiple factors, but not a single influence. Beck et al. (2013) have explored global patterns in base flow, and they have shown weak relationships between flow patterns and single factors in catchments. The random forest model can deal with the above problems by coupling with multiple predictors and considering nonlinear relationships; it is not limited by physical principles or hypothesized hydrological processes and is a flexible statistical model without the problem of overfitting. The random forest model can measure the importance of predictors among all predictors using an indicator of the increase in the mean square error. Like other machine learning models, random forests can obtain the highest accuracy with the most convenient calculation procedure, and have been widely used to predict hydrological response. For example, as stated above, Stein et al. (2021) used the model to predict the variability of the flood generation process. Dogan (2023) employed a random forest model to predict streamflow over the Kızılırmak River, Turkey and obtained ideal results.

Our study aims to (1) investigate the controlling factors of runoff response, (2) explore their predictability over space, and (3) discuss how their predictability can be improved (the flowchart is shown in Figure 1). Our study tries to address three interrelated research questions:
  • (1)

    How well can runoff response be predicted by the statistical model (data-driven) with a consideration of nonlinear relationships? And what are the patterns of predictability of runoff response? We used random forests with a consideration of nonlinear relationships to explore the predictions of runoff response based on predictors of catchment attributes covering six types (topography, geology, climate, hydrology, land use/cover, and soil characteristics) indicated by 56 indicators over 1,003 catchments in the CONUS. We quantified how catchment attributes impact strongly the prediction of runoff response by means of ALE, which can rank the importance of catchment attributes to runoff response.

  • (2)

    How do key factors control the runoff response? What differences are there compared with linear relationships? After key factors are determined through step 1, we quantified the nonlinear relationship between runoff response and key factors using ALE.

Figure 1

Flowchart in this study.

Figure 1

Flowchart in this study.

Close modal

Catchment attributes

The United States Geological Survey (USGS) provided Geospatial Attributes of Gages for Evaluating Streamflow (GAGES-II), version II, for evaluating inland hydrological stations in the United States (Falcone et al. 2010) (https://www.usgs.gov/3d-elevation-program). GAGES-Ⅱ includes nearly 9,322 hydrological stations distributed in the United States (including Arras, Hawaii, and Puerto Rico), and the each flow stations are divided into ‘reference station’ and ‘disturbed station’ according to whether they are affected by human activities (http://esapubs.org/Archive/ecol/E091/045/default.htm). GAGES-II is updated from the original dataset GAGES, and has two main purposes: (1) to provide large and comprehensive geographic spatial data within the watershed, especially long-term runoff data; (2) to identify whether runoff sites are affected by human activities, which will help restore or establish ecological flow. The length of the runoff series included in this dataset is at least 20 years (since 1950), but the runoff series might not be continuous and these sites are all valid, and their data can be subject to real-time review. The reference stations are considered to be unaffected or minimally affected by human activities and are all distributed in 12 major ecological regions of the United States, with a total of 2,057 stations distributed in the dataset. There are 7,265 sites that are significantly affected by human activities (‘disturbed sites’), mainly distributed outside the ecological zone. Among the 2,057 reference stations, 1,633 have a runoff sequence length exceeding 20 years, and some stations have a runoff sequence length of 110 years (from 1900 to 2009). Geospatial data mainly include nearly 100 types of watershed attribute-characteristic data, including climate, hydrology, geomorphology, land use/land cover, geology, and human activities within each watershed. This study considered six types of watershed attribute data within the watershed, consisting of topography, geology, climate, hydrology, land use/land cover change, and soil characteristics. The specific descriptions and sources are shown in the Supplementary Information, Table A1.

The runoff data of the GAGES-II hydrological station are downloaded from the USGS Water Information System (USGS NWIS) of the United States Geological Survey (Harris 2002) (https://waterdata.usgs.gov/nwis/). When downloading runoff data, it is necessary to consider the length and continuity of the data, that is, at least 20 years of data, and at least 80% of the data are continuous (at least 8 years of data every decade is continuous on a daily basis). In the end, a total of 2,482 sites (1966–2015) met the above screening criteria, of which 203 were in 1916–2015 and 1,408 were in 1941–2015. This study selected the period 1941–2015 as the research period. In order to facilitate analysis, sites with a runoff coefficient greater than 1 were abandoned. In the end, 1,003 sites were selected for this study.

Climate data

The climate data used in this study mainly include daily rainfall, maximum daily temperature, and minimum daily temperature. The climate data come from the PRISM climate dataset. The parameter elevation regression on independent slopes model (PRISM) climate dataset is a parameter elevation regression model that is independent of slope (Daly et al. 2008). Theoretical-based systems are mainly used to insert climate elements into complex landforms and landscapes. Regression-based PRISM uses point data, digital elevation model (DEM), other spatial datasets, and encoded spatial climate theory to generate estimates of annual, monthly, daily and event-based climate elements. These estimated values are inserted into the regular grid to make them compatible with the geographic information system (GIS). The climate data in the PRISM dataset have undergone peer review, including the official rainfall and temperature map data of 50 states, Caribbean and Pacific islands released by the US Department of Agriculture, the new official climate map of the United States, with 112 consecutive years of monthly temperature, precipitation, and dew-point maps of 48 states, detailed precipitation and temperature maps of Canada, China, and Mongolia, and the first comprehensive precipitation map of the European Alps region (Daly et al. 2021). Its quality is reliably guaranteed (Strachan & Daly 2017). The PRISM Climate Group collects climate observation data from a wide range of monitoring networks, adopts complex quality-control measures, and develops spatial climate datasets to reveal short-term and long-term climate patterns. The generated dataset includes various modeling techniques and is available in various spatial/temporal resolutions, covering the period from 1895 to the present, and the data are freely available. The original grid resolution of the PRISM dataset is 800 m, but it has been filtered to a 4 km resolution for download and operation.

Runoff elasticity and runoff changes

This article mainly uses the Budyko framework combined with the climate elasticity method to calculate runoff changes and their sensitivity to rainfall, potential evaporation, temperature, and watershed characteristics. Budyko (1974) believes that the ratio of annual actual evaporation to annual rainfall is a function of the ratio of potential evaporation to rainfall and the characteristics of the watershed:
(1)
where E is the annual actual evaporation (mm); is the average annual rainfall (mm); is the average annual potential evaporation (mm). The characteristics of a watershed are generally represented by parameters. Since Budyko proposed this theory in 1974, many scholars have begun to study its mathematical form (see Table 1). We adopted the mathematical form of the Budyko theory proposed by Yang et al. (2008):
(2)
where n represents the characteristics of the watershed and describes the water–energy balance within the catchment. When the parameter n is given, the actual evaporation within the catchment is determined. A small value of n occurs where groundwater storage is low, where the catchment is generally rocky and mountainous, and where rainfall completely forms runoff. When the value of n is high, the catchment area is mostly plain, with deep quaternary soil and abundant groundwater content, and the actual evaporation energy within the catchment reaches its maximum (Yang et al. 2008).
Table 1

Expressions for annual actual evapotranspiration related to the Budyko framework

Equation (2) is defined as , and the differential form is taken:
(3)
For large-scale (>1,000 km2) basins and long-term (>11 years) periods, it can be considered that the mean annual water storage change within the basin is zero (Patterson et al. 2013). Therefore, substituting the water balance formula into the above equation:
(4)
where Q is the mean annual runoff depth (mm).
The relative variation of runoff can be obtained by dividing the left and right sides of Equation (4) by :
(5)
where , , represent the sensitivity of runoff to rainfall, potential evaporation, and watershed characteristics, respectively. Therefore, once the mathematical form of Budyko's theory is known, the corresponding partial differential can be taken to calculate the runoff elasticity. This means for every 1% change in influencing factors, the relative change in runoff changes. For example, for every 1% increase in rainfall and 0.5% change in runoff, . According to this definition, the main factors affecting runoff include rainfall, potential evaporation, and watershed characteristics.
Yang & Yang (2011) derived the sensitivity of runoff to rainfall, net radiation, average temperature, average wind speed, and relative humidity based on the first-order partial differential form of the Penman–Monteith equation, and proved that average temperature has an important impact on runoff. Considering the large scale of this study, it is difficult to fully obtain the climate data required by the Penman–Monteith formula. Therefore, this article calculates the elasticity of runoff to temperature using the parameter-free estimation method. In order to fully study the impact of temperature on runoff, this paper calculates the runoff elasticity according to minimum and maximum daily temperatures:
(6)
where is the elasticity of runoff according to climate factors, is the annual runoff (mm), is the mean annual runoff depth (mm), is the annual value of the climate variable, and is the average value of the climate variable. However, it can be observed that when , the above equation is meaningless. Therefore, Zheng et al. (2009) further improved the above equation gives:
(7)

Replacing the data of maximum and minimum daily temperatures with X in the above equation can thus obtain the elasticity of runoff according to maximum and minimum daily temperatures, respectively.

The Food and Agriculture Organization of the United Nations (FAO) has improved and upgraded the estimation method of potential evapotranspiration (ET) by introducing the concept of a reference crop (grass) described by the Penman–Monteith (PM-ET) equation (Allen et al. 1998), and this method has been well verified under different climate and timescale calculations and is now widely used worldwide (Almorox et al. 2015). However, the high cost of maintaining agricultural meteorological stations and estimating the large number of sensors required for ET makes it difficult to apply in practice, especially in rural areas (Moratiel et al. 2020). For this reason, using air temperature estimation ET in areas where wind speed, solar radiation, and air humidity data are not easily available is highly popular. The method of estimating potential evaporation based on temperature has been studied for a long time. Allen et al. (1998) recommended two methods for calculating potential evaporation based on the Penman–Monteith formula: (1) using the Hargreaves–Samani equation; (2) using the Penman–Monteith formula, estimating net radiation and water vapor pressure difference based on temperature. This study uses the second method to estimate potential evaporation; for details of the method, refer to Allen et al. (1998).

Random forest

The random forest is a machine learning model, which generates and combines multiple regression trees (Breiman 2001). The random forest has many advantages in regression and classification, such as combining nonlinear relationships, high flexibility, allowing multiple predictors, avoiding overfitting problems, having strong explanatory power, and having high computational efficiency (Booker & Woods 2014). These advantages make random forests particularly suitable for spatial prediction of hydrological variables (Addor et al. 2018). The random forest has applications in many fields, such as in the field of geographical science, and can be used to predict hydrological signatures and soil characteristics (Chaney et al. 2016). The random forest establishes the regression relationship between predictive factors and response variables based on regression tree clusters. In the regression tree, branching is performed based on the thresholds of a series of predictive factors. The prediction algorithm starts from the top of the tree to the root, first predicting from the top of the tree (by determining whether the prediction factor has passed the threshold, that is, whether the elevation in Figure 2 has passed 1.15 m), and then further branching is carried out on the next branch of the tree through a series of thresholds. The importance of predictive factors for response variables can be determined by the position of the tree in which the predictive factor branches: the earlier the predictive factor passes the threshold condition (or at a higher position in the tree), the more important the predictive factor is for the response variable (in Figure 2, the elevation is the most important predictive factor for the base flow). It should be noted that regression trees are not symmetrical because different variables are present in different parts of the tree.
Figure 2

Schematic random forests for determining the importance of factors.

Figure 2

Schematic random forests for determining the importance of factors.

Close modal

Regression trees generally grow in a recursive binary-splitting manner. Firstly, run from the top of the tree and split the nodes, with one variable and one threshold, in order to minimize the error of the prediction as much as possible. The final prediction result is the average of all prediction factors within each category. Therefore, the prediction of the decision tree is a discrete value (one prediction value per terminal node, for example, in Figure 2, the leftmost terminal node of the tree is 0.48). In order to prevent overfitting, the cross-validation method is used to minimize the error to make the tree grow and prune. Although regression trees are easy to explain and can handle nonlinear relationships between variables, they often lack robustness.

In order to overcome this shortcoming, this paper integrates 500 regression trees to form a random forest, and lets each tree grow freely, which gives the random forest a strong robustness. At each node, predictive factors are randomly selected, and the remaining predictive factors are used for other nodes. This means that predictive factors that are important for response variables may be removed. The number of trees and the number of predictive factors removed at each node can be freely determined by the user. In general, the number of trees at 500 has a good prediction effect, while the removal of prediction factors at 1/3 of the total number for each node has a small impact on the prediction results (Addor et al. 2018).

In this study, 18 hydrological characteristic variables are predicted, each regression tree has 180 parameters, and 500 trees have 90,000 parameters. It is impractical to analyze such a large number of parameters one by one, but we can evaluate the importance of predictive variables through the ALE.

The prediction results of the random forest pass the ten-fold cross-validation evaluation, that is, 90% of the catchment is used as the training set, 10% of the catchment is used as the test set, and the cycle is repeated ten times. During this process, the catchment is randomly sampled.

Spatial patterns of runoff response

Since the 1940s, runoff has been comprehensively affected by various factors showing a heterogeneous spatial distribution. It shows positive and negative changes, but the positive changes (up to 56%/10 years) are greater than the negative changes (up to 36%/10 years, Figure 3). The total changes in the western region show a decreasing trend, while most of the catchments in the central northern and southeastern regions show an increasing trend. The largest changes are mainly in the northern central region of the United States. The spatial pattern of runoff changes caused by rainfall is generally consistent with total changes, with an increasing trend in the east and a decreasing trend in the west. The changes in the South Atlantic Gulf and the California regions are basically decreasing. The maximum (17%/10 years) is in the northern and eastern and the westernmost. The difference is that the runoff changes caused by potential evaporation show an overall positive in the United States, with a maximum of 19.3% per decade. A large number of catchments are in the central and southern regions, with a maximum decrease of 3.7% per decade, mainly distributed in the central and western regions. The changes in runoff caused by average temperature are generally unevenly distributed in the central United States. Except for the catchments near the 100th meridian, which mainly show positive changes, there is no obvious pattern of positive or negative changes in other catchments.
Figure 3

Spatial pattern of nine ecoregions and runoff changes for (a) total changes, and changes that are (b) precipitation-induced, (c) potential-evaporation-induced, (d) average-temperature-induced, and (e) catchment-properties-induced, over the conterminous USA (R_total, R_P, R_ET, R_T, R_n represent total runoff changes, precipitation-, potential-evaporation-, average-temperature-, catchment characteristics n-induced changes, respectively). Unit is % per decade (%/10a in the legend).

Figure 3

Spatial pattern of nine ecoregions and runoff changes for (a) total changes, and changes that are (b) precipitation-induced, (c) potential-evaporation-induced, (d) average-temperature-induced, and (e) catchment-properties-induced, over the conterminous USA (R_total, R_P, R_ET, R_T, R_n represent total runoff changes, precipitation-, potential-evaporation-, average-temperature-, catchment characteristics n-induced changes, respectively). Unit is % per decade (%/10a in the legend).

Close modal
From Figure 4, it can be observed that the rainfall elasticity of most watersheds ranges from 1.0 to 2.0, and these watersheds are unevenly distributed in the United States. The rainfall elasticity of some watersheds in the central and southern regions can reach 11. The distribution of potential evaporation elasticity is close to rainfall elasticity. Many catchments have values around 0–1.08 and are unevenly distributed in the United States, with some catchments having values ranging from −3 to 10 (central and southern). The difference is that the response of runoff to potential evaporation is negative, that is, for every 1% increase (decrease) in potential evaporation, runoff will decrease (increase) by the corresponding amount of change. We also can see that the potential evaporation elasticity in the vast majority of the catchments on the east and west sides of the 100th meridian is within the range of 0–1, and the values of the catchments near the meridian are mainly around −1 to 10. The temperature elasticity around the 100th meridian has a range of −1 to 16, indicating that for every 1% increase (decrease) in average temperature, the amount of runoff decreases (increases) accordingly. In most other regions (northeast, southeast, and west), values are −1 to 3. The distribution of catchment attribute elasticity is relatively regular, with a range of −1 to 5 near the 100th meridian, and about −1.2 in other regions.
Figure 4

Spatial pattern of (a) precipitation elasticity, (b) potential evaporation elasticity, (c) average temperature elasticity, (d) watershed property elasticity, over 1,003 watersheds; and (e) catchment attribute n value in Budyko function of the contiguous USA.

Figure 4

Spatial pattern of (a) precipitation elasticity, (b) potential evaporation elasticity, (c) average temperature elasticity, (d) watershed property elasticity, over 1,003 watersheds; and (e) catchment attribute n value in Budyko function of the contiguous USA.

Close modal

Comparing the spatial distribution of runoff changes caused by rainfall, it can be found that in the central and central southern regions, the rainfall elasticity is in the range from 2 to 11, and the runoff changes are 2%–17%/10 years, indicating that the runoff in this region is linearly increasing (Figures 3 and 4). In the central northern and northeastern regions, the elasticity of rainfall is mainly in the range of 1–2, while the runoff changes are in the range of 5%–17%/10 years, indicating that the water supply in this region should also include snowmelt in addition to rainfall. In the southeastern and western regions, the rainfall elasticity is mainly in the range of 1 to 4, and the runoff changes show a negative change. This can prove that rainfall elasticity is not necessarily the only criterion for judging runoff variation but can be used as one of the means to determine the sensitivity of runoff to influencing factors. Compared with the amount of runoff caused by potential evaporation, the potential evaporation elasticity in the central southern and southeastern regions is −1 to 4, and the corresponding runoff changes are 6%–19%/10 years, indicating that in this type of region, the amount of runoff is about five times the potential evaporation variation (Figures 3 and 4). In most other regions (northeastern, central northern, and western), the average potential evaporation elasticity is −1.0, and the corresponding runoff changes are 1%/10 years. This indicates that in these regions, the changes in runoff and elasticity are consistent, and the corresponding runoff can be quantitatively evaluated using the potential evaporation elasticity. The runoff changes do not vary linearly with the catchment attribute elasticity. It is crucial to further explore the factors that affect this type of runoff change.

Cumulative local effects between catchment behavior and runoff response

The ALE of runoff changes is shown in Figure 5. To facilitate the comparison between different types of runoff changes, the ALE was normalized to the range of maximum–minimum. We can see that longitude has the most impact on the total runoff changes, followed by annual runoff, proportion of snowmelt precipitation, and dune slope flow. When considering the nonlinear relationship, the controls of catchment attributes on the total changes are different (Figure 5). When using a linear correlation relationship, the impact of longitude on total changes is insignificant. This indicates that the relationship between catchment attributes and runoff changes is not simple. The spatial distribution of total changes can explain this impact (Figure 3). There is a substantial difference in total changes in different locations of CONUS, with the largest differences in total changes being between the east and west. The importance of aridity is secondary. The results of Gong et al. (2022) showed a negative exponential relationship between the total changes and aridity. The ALE of rainfall-induced changes showed again the control of longitude. In addition to longitude, the proportion of snowmelt and snowfall is another important factor, which may be one of the reasons for the caused differences in total changes and rainfall-induced changes in the northern and southern regions. The annual herbaceous coverage impacts to some extent the potential evaporation-induced changes (Figure 5), while it is negligible for other types of runoff changes. In addition, the catchment area, base flow, and aridity have an impact on the potential-evaporation-induced changes. The ALE value can only explain the importance of the influencing factors, and cannot explore the specific mathematical form. The influence of catchment attributes on runoff changes caused by temperature is ranked by rainfall days, rainfall, low rainfall duration, and potential evaporation. The ALE does not have confidence as the accuracy of the random forest model simulation is low (R2 is 0.2). The annual runoff and catchment area are the most important for catchment-characteristics-induced changes. The catchment area has no impact on other components of runoff changes, and there is a strong correlation between the catchment area and human activities such as agricultural irrigation proportion and the amount of power-plant construction in the catchment. However, ALE showed that human activity factors have not shown a strong influence on runoff changes, indicating that the linear relationship between the influencing factors and runoff changes cannot capture reality. We can see that for changes caused by watershed attributes, in addition to climate and hydrological factors, the underlying surface conditions within the watershed are also important. The underlying surface conditions affect not only the runoff process but also the distribution of watershed attributes.
Figure 5

Accumulated local effects for catchment attribute influence on runoff changes: total changes (ALE_Rtotal), rainfall-induced changes (ALE_RP), potentialevaporation-induced changes (ALE_RET), temperature-induced changes (ALE_RT), catchment-attributes-induced changes(ALE_Rn). Details for catchment attributes are in Table A1 of the Supplementary Information. Point sizes represent cross-validation R2 prediction accuracy of runoff changes for the random forest model. Color bar is ALE.

Figure 5

Accumulated local effects for catchment attribute influence on runoff changes: total changes (ALE_Rtotal), rainfall-induced changes (ALE_RP), potentialevaporation-induced changes (ALE_RET), temperature-induced changes (ALE_RT), catchment-attributes-induced changes(ALE_Rn). Details for catchment attributes are in Table A1 of the Supplementary Information. Point sizes represent cross-validation R2 prediction accuracy of runoff changes for the random forest model. Color bar is ALE.

Close modal
The impact of catchment attributes on rainfall and potential evaporation elasticity is the same (Figure 6), and annual runoff is the most important factor among all factors. In addition, annual rainfall, longitude, catchment area, and aridity also have an impact on rainfall and potential evaporation elasticity, which is confirmed by Sankarasubramanian et al. (2001). The ALE result also showed that the annual runoff, the proportion of snowmelt precipitation, the average rainfall, and rainfall days have an important impact on temperature elasticity. The annual runoff is of importance for catchment attribute elasticity. In addition, the aridity index and dune slope flow are also nonnegligible. The n value of the watershed revealed the interaction between vegetation, soil, and landforms. From Figure 5, it can be seen that annual runoff is the most important for n value, followed by catchment area and annual maximum temperature. The linear correlation showed that there are many catchment attributes correlated with the n value. The ALE results revealed, however, that the importance of these factors is not clear.
Figure 6

Accumulated local effects distribution for catchment attribute influence on runoff elasticities: rainfall elasticity (elasticity_Pre), potential evaporation elasticity (elasticity_ET), elasticity of watershed attributes (elasticity_n), temperature elasticity (elasticity_T), catchment attribute elasticity (elasticity_nvalue). Details for catchment attributes in Table A1 of the Supplementary Information. Point sizes represent cross-validation R2 prediction accuracy of runoff elasticities for the random forest model. Color bar is ALE.

Figure 6

Accumulated local effects distribution for catchment attribute influence on runoff elasticities: rainfall elasticity (elasticity_Pre), potential evaporation elasticity (elasticity_ET), elasticity of watershed attributes (elasticity_n), temperature elasticity (elasticity_T), catchment attribute elasticity (elasticity_nvalue). Details for catchment attributes in Table A1 of the Supplementary Information. Point sizes represent cross-validation R2 prediction accuracy of runoff elasticities for the random forest model. Color bar is ALE.

Close modal

Predictions of runoff response through random forest

The predictions of the random forest based on predictors of catchment attributes are shown in Figure 5. When the catchment attributes are used to predict the total changes and catchment-attributes-induced changes, the accuracy of both reaches 0.6. The R2 of runoff changes caused by rainfall, potential evaporation, and temperature is 0.5, 0.3, and 0.1, respectively. The reliability of ALE results is closely related to the accuracy of model predictions; the higher the prediction accuracy, the more reliable the ALE results of the model. There is a significant difference in accuracy between different types of runoff changes. Moreover, the 56 watershed attributes adopted in this study cannot confidently explain the distribution of runoff changes caused by rainfall, potential evaporation, and average temperature, even though the nonlinear relationships are considered. The relationship between runoff changes and catchment behavior is not a simple linear relationship, and might even be offset by the role of various factors on the runoff changes. In the future, more influencing factors should be considered and more advanced mathematical methods should be adopted to explore the relationship between them.

The predictions of runoff elasticity of the random forest are shown in Figure 6. The accuracy of the elasticity of watershed attributes is the highest, 0.9, followed by rainfall elasticity (0.7) and potential evaporation elasticity (0.7). The prediction accuracy of watershed attributes n value and average temperature elasticity is relatively low, with 0.4 and 0.1, respectively. For runoff elasticity with high accuracy, an empirical relationship between runoff elasticity and influencing factors can be established based on ALE results to explore the patterns of runoff elasticity in areas without data. For runoff elasticity with lower accuracy, it is necessary to consider more and more comprehensive influencing factors or adopt more comprehensive methods to analyze the relationship between them.

Runoff changes and runoff elasticity present a heterogeneous pattern over the CONUS. The total changes show a positive trend in the east and negative in the west (56% to −36%/10 years); the changes caused by rainfall are generally consistent with the total changes; the changes caused by potential evaporation have shown an overall positive trend in the United States (19.3 to −3.7% per decade); temperature-induced changes show an increasing trend in the 100th meridian; catchment-attributes-induced changes are relatively consistent with total changes. The rainfall elasticity in most catchments is from 1.0 to 2.0; the potential evaporation elasticity is consistent with rainfall elasticity; temperature elasticity around the 100th meridian is 1–16; the elasticity of catchment attributes is relatively regular, with the value in the range of −1 to 5 near the 100th meridian, and an average of −1.2 in other regions. The distribution of the nine ecological regions in CONUS can explain the patterns of runoff response.

Compared with traditional linear correlation analysis, accumulative local effects can be used to analyze quantitatively the nonlinear relationship between watershed attributes and runoff response and determine the importance of influencing factors. Longitude and annual runoff are the top two in importance for total runoff changes and rainfall-runoff changes; the annual runoff is the most important for potential-evaporation-induced changes, and the annual average herbaceous coverage also plays an important role; the precipitation regime is the most important for runoff changes caused by average temperature; the annual runoff is the most important for catchment-attributes-induced changes. The runoff elasticity in arid areas is larger than that in humid areas, and the annual runoff has the greatest impact on the runoff elasticity. The impact of the catchment area cannot be ignored.

The random forest model was used to predict runoff response, and the accuracy was also analyzed. When using 56 indices of watershed attributes to predict the total changes, watershed attributes (n), and runoff changes caused by rainfall, potential evaporation, and average temperature, the accuracy R2 is 0.6, 0.6, 0.5, 0.3, and 0.1, respectively. The accuracy of rainfall and potential evaporation elasticity is 0.7, and 0.96 for catchment attribute elasticity.

This study is supported by the National Key R&D Program Project Topic (No. 2021YFD150080505).

All relevant data are available from an online repository or repositories: catchment properties data from http://ned.usgs.gov/ and https://waterdata.usgs.gov/nwis/.

The authors declare there is no conflict.

Addor
N.
,
Nearing
G.
,
Prieto
C.
, Newman, A. J., Le Vine, N. & Clark, M. P.
2018
A ranking of hydrological signatures based on their predictability in space
.
Water Resources Research
54
(
11
),
8792
8812
.
Allen
R. G.
,
Pereira
L. S.
,
Raes
D.
& Smith, M.
1998
Crop Evapotranspiration - Guidelines for Computing Crop Water Requirements. FAO Irrigation and Drainage Paper 56, FAO, Rome, Italy
.
Irrigations and Drainage Paper
56
.
Anchang
J. Y.
,
Prihodko
L.
,
Ji
W.
, Kumar, S. S., Ross, C. W., Yu, Q., Lind, B., Sarr, M. A., Diouf, A. A. & Hanan, N. P.
2020
Toward operational mapping of woody canopy cover in tropical savannas using Google Earth Engine
.
Frontiers in Environmental Science
8
, 4.
Beck
H. E.
,
van Dijk
A. I. J. M.
,
Miralles
D. G.
,
de Jeu
R. A. M.
,
Sampurno Bruijnzeel
L. A.
,
McVicar
T. R.
&
Schellekens
J.
2013
Global patterns in base flow index and recession based on streamflow observations from 3394 catchments
.
Water Resources Research
49
(
12
),
7843
7863
.
doi:10.1002/2013wr013918
.
Booker
D. J.
&
Woods
R. A.
2014
Comparing and combining physically-based and empirically-based approaches for estimating the hydrology of ungauged catchments
.
Journal of Hydrology
508
,
227
239
.
doi:10.1016/j.jhydrol.2013.11.007
.
Breiman
L.
2001
Random forests
.
Machine Learning
45
,
5
32
.
Budyko
M. I.
1974
Climate and Life
.
Academic Press, New York, USA
.
Chaney
N. W.
,
Wood
E. F.
,
McBratney
A. B.
, Hempel, J. W., Nauman, T. W., Brungard, C. W. & Odgers, N. P.
2016
POLARIS: a 30-meter probabilistic soil series map of the contiguous United States
.
Geoderma
274
,
54
67
.
Crawford
N. H.
&
Linsley
R. K.
1966
Digital Simulation in Hydrology: Stanford Watershed Model IV. Technical Report No. 39, Department of Civil and Environmental Engineering, Stanford University, Stanford, CA, USA
.
Daly
C.
,
Halbleib
M.
,
Smith
J. I.
, Gibson, W. P., Doggett, M. K., Taylor, G. H., Curtis, J. & Pasteris, P. P.
2008
Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States
.
International Journal of Climatology
28
(
15
),
2031
2064
.
Daly
C.
,
Doggett
M. K.
,
Smith
J. I.
, Olson, K. V., Halbleib, M. D., Dimcovic, Z., Keon, D., Loiselle, R. A., Steinberg, B., Ryan, A. D., Pancake, C. M. & Kaspar, E. M.
2021
Challenges in observation-based mapping of daily precipitation across the conterminous United States
.
Journal of Atmospheric and Oceanic Technology
38
(
11
),
1979
1992
.
Donohue
R. J.
,
Roderick
M. L.
&
McVicar
T. R.
2012
Roots, storms and soil pores: incorporating key ecohydrological processes into Budyko's hydrological model
.
Journal of Hydrology
436–437
,
35
50
.
doi:10.1016/j.jhydrol.2012.02.033
.
Fu
B. P.
1981
On the calculation of the evaporation from land surface
.
Scientia Atmospherica Sinica
5
(
1
),
23
31
.
Gampe
D.
,
Zscheischler
J.
,
Reichstein
M.
, O'Sullivan, M., Smith, W. K., Sitch, S. & Buermann, W.
2021
Increasing impact of warm droughts on northern ecosystem productivity over recent decades
.
Nature Climate Change
11
(
9
),
772
779
.
Ghosh
S.
,
Das
D.
,
Kao
S.-C.
& Ganguly, A. R.
2011
Lack of uniform trends but increasing spatial variability in observed Indian rainfall extremes
.
Nature Climate Change
2
(
2
),
86
91
.
Harris
S.
2002
National water information system web
.
Government Executive
34 (15).
Hrachowitz
M.
,
Savenije
H. H. G.
,
Blöschl
G.
,
McDonnell
J. J.
,
Sivapalan
M.
,
Pomeroy
J. W.
, Arheimer, B., Blume, T., Clark, M. P., Ehret, U., Fenicia, F., Freer, J. E., Gelfan, A., Gupta, H. V., Hughes, D. A., Hut, R. W., Montanari, A., Pande, S., Tetzlaff, D., Troch, P. A., Uhlenbrook, S., Wagener, T., Winsemius, H. C., Woods, R. A., Zehe, E. &
Cudennec
C.
2013
A decade of predictions in ungauged basins (PUB) – a review
.
Hydrological Sciences Journal
58
(
6
),
1198
1255
.
doi:10.1080/02626667.2013.803183
.
Huss
M.
&
Hock
R.
2018
Global-scale hydrological response to future glacier mass loss
.
Nature Climate Change
8
(
2
),
135
140
.
Kuentz
A.
,
Arheimer
B.
,
Hundecha
Y.
& Wagener, T.
2017
Understanding hydrologic variability across Europe through catchment classification
.
Hydrology and Earth System Sciences
21
,
2863
2879
.
Li
C.
,
Sun
G.
,
Caldwell
P. V.
, Cohen, E., Fang, Y., Zhang, Y., Oudin, L., Sanchez, G. M. & Meentemeyer, R. K.
2020
Impacts of urbanization on watershed water balances across the conterminous United States
.
Water Resources Research
56
(
7
), e2019WR026574.
doi:10.1029/2019wr026574
.
Molnar
C.
, Casalicchio, G. & Bischl, B.
2018
iml: an R package for interpretable machine learning
.
Journal of Open Source Software
3
(
26
),
786
.
Ol'dekop
E. M.
1911
On Evaporation from the Surface of River Basins in Russia. Collection of Works of Students of the Meteorological Observatory (Volume 4), University of Tartu, Tartu, Estonia
.
Patterson
L. A.
,
Lutz
B.
&
Doyle
M. W.
2013
Climate and direct human contributions to changes in mean annual streamflow in the South Atlantic, USA
.
Water Resources Research
49
(
11
),
7278
7291
.
Pokhrel
Y.
,
Felfelani
F.
,
Satoh
Y.
, Boulange, B., Burek, P., Gädeke, A., Gerten, D., Gosling, S. N., Grillakis, M., Gudmundsson, L., Hanasaki, N., Kim, H., Koutroulis, A., Liu, J., Papadimitriou, L., Schewe, J., Schmied, H. M., Stacke, T., Telteu, C.-E., Thiery, W., Veldkamp, T., Zhao, F. & Wada, Y.
2021
Global terrestrial water storage and drought severity under climate change
.
Nature Climate Change
11
(
3
),
226
233
.
Porporato
A.
,
Daly
E.
&
Rodriguez-Iturbe
I.
2004
Soil water balance and ecosystem response to climate change
.
American Naturalist
164
,
625
632
.
Ragettli
S.
,
Zhou
J.
,
Wang
H.
,
Liu
C.
&
Guo
L.
2017
Modeling flash floods in ungauged mountain catchments of China: a decision tree learning approach for parameter regionalization
.
Journal of Hydrology
555
,
330
346
.
doi:10.1016/j.jhydrol.2017.10.031
.
Sankarasubramanian
A.
,
Vogel
R. M.
&
Limbrunner
J. F.
2001
Climate elasticity of streamflow in the United States
.
Water Resources Research
37
(
6
),
1771
1781
.
Schreiber
P. J. M. Z.
1904
Über die Beziehungen zwischen dem Niederschlag und der Wasserführung der Flüsse in Mitteleuropa
. Zeitschrift für Meteorologie
21
(
10
), 441–452.
Shao
M.
,
Zhao
G.
,
Kao
S.-C.
, Cuo, L., Rankin, C. & Gao, H.
2020
Quantifying the effects of urbanization on floods in a changing environment to promote water security – a case study of two adjacent basins in Texas
.
Journal of Hydrology
589
, 125154.
doi:10.1016/j.jhydrol.2020.125154
.
Somorowska
U.
&
Łaszewski
M.
2019
Quantifying streamflow response to climate variability, wastewater inflow, and sprawling urbanization in a heavily modified river basin
.
Science of the Total Environment
656
,
458
467
.
doi:10.1016/j.scitotenv.2018.11.331
.
Stein
L.
,
Clark
M. P.
,
Knoben
W. J. M.
, Pianosi, F. & Woods, R. A.
2021
How do climate and catchment attributes influence flood generating processes? A large-sample study for 671 catchments across the contiguous USA
.
Water Resources Research
57
(
4
), e2020WR028300.
Strachan
S.
&
Daly
C.
2017
Testing the daily PRISM air temperature model on semiarid mountain slopes
.
Journal of Geophysical Research: Atmospheres
122
(
11
),
5697
5715
.
Tang
Q.
&
Lettenmaier
D. P.
2012
21st century runoff sensitivities of major global river basins
.
Geophysical Research Letters
39
(
6
),
L06403
.
Wagener
T.
&
Wheater
H. S.
2006
Parameter estimation and regionalization for continuous rainfall-runoff models including uncertainty
.
Journal of Hydrology
320
(
1–2
),
132
154
.
doi:10.1016/j.jhydrol.2005.07.015
.
Wang
G.
,
Xia
J.
,
Li
X.
, Yang, D., Hu, Z., Sun, S. & Sun, X.
2020
Critical advances in understanding ecohydrological processes of terrestrial vegetation: from leaf to watershed scale
.
Chinese Science Bulletin
66
(
28–29
),
3667
3683
.
Westerberg
I. K.
&
McMillan
H. K.
2015
Uncertainty in hydrological signatures
.
Hydrology and Earth System Sciences
19
(
9
),
3951
3968
.
doi:10.5194/hess-19-3951-2015
.
Yang
H.
,
Yang
D.
,
Lei
Z.
& Sun, F.
2008
New analytical derivation of the mean annual water-energy balance equation
.
Water Resources Research
44
(
3
),
W03410
.
Zhang
L.
,
Dawes
W. R.
&
Walker
G. R.
2001
Response of mean annual evapotranspiration to vegetation changes at catchment scale
.
Water Resources Research
37
(
3
),
701
708
.
Zheng
H.
,
Zhang
L.
,
Zhu
R.
, Liu, C., Sato, Y. & Fukushima, Y.
2009
Responses of streamflow to climate and land surface change in the headwaters of the Yellow River Basin
.
Water Resources Research
45
(
7
), W00A19.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data