An improved routing algorithm for a large-scale distributed hydrological model with consideration of underlying surface impact

Large-scale hydrological models are important tools for simulating the hydrological effect of climate change. As an indispensable part of the application of distributed hydrological models, large-scale flow routing methods can simulate not only the discharge at the outlet but also the temporal and spatial distribution of flow. The aggregated network-response function (NRF), as a scale-independent routing method, has been tested in many basins and demonstrated to have good runoff simulation performance. However, it had a poor performance and produced an unreasonable travel time when it was applied to certain basins due to a lack of consideration of the influence of the underlying surface. In this study, we improve the NRF routing method by combining it with a velocity function using a new routing parameter b to reflect the wave velocity’s sensitivity to slope. The proposed method was tested in 15 catchments at the Yangtze River basin. The results show that it can provide better daily runoff simulation performance than the original routing model and the calibrated travel times in all catchments are more reasonable. Therefore, our proposed routing method is effective and has great potential to be applied to other basins.


INTRODUCTION
Climate change has received widespread attention from the scientific community and the public, as it has caused a series of serious problems, including natural disasters and extreme climate events (Kharin et  cell-to-cell and source-to-sink. The cell-to-cell method calculates the water movements from each grid to its adjacent downstream grid until the outlet grid of the basin is reached.
The outflow of each cell is calculated based on the inflow and the river channel's storage. The source-to-sink method, under the assumption that the water is rigid, directly calculates the water's movement from the grid where runoff is generated to the outlet of the basin.
The first of these two kinds of routing methods is based on the mass conservation equation and the channel storage function equation. It is widely used due to its simplicity (Miller et  Huang & Lee ). In addition to its simplicity, the cellto-cell routing method has other advantages. It has the potential to include the nonlinear losses from the surface water to the groundwater because it considers the feedback between the flow and the water storage in each cell (Naden et al. ). Besides this, with the cell-to-cell routing method, the water storage in any cell of interest can be queried at each time step. If the ratio of land area covered by surface water to the total land area can be calculated based on the water storage in each cell, it will contribute to land-atmosphere interaction simulation (Olivera et al. ).
However, the choice of a proper spatial resolution is a dilemma for the cell-to-cell routing method. In distributed hydrological models, high resolution can reflect the heterogeneity of the hydrological characteristics of the basin and improve the accuracy of calibrated parameters. On the other hand, higher-resolution DEM will lead to a significant increase in the amount of required computations, especially when coupled with a meteorological model because the meteorological model has vast computational cost. However, a coarser resolution may lead to a decrease in model performance due to not considering routing within the cell (Arora et al. ; Gong et al. ). Besides this, there may be a great difference between the network that is generated at a coarse resolution and the true one, which is another disadvantage of the cell-to-cell routing method. Attempts have been made to bring a network that is generated at a coarse resolution closer to the real one. However, a gap still exists (Fekete et al. ; Guo et al. ).
Some of the source-to-sink routing methods have the same disadvantages as the cell-to-cell method, including the routing models by Ducharne et al. () and Guo et al. (), which will provide worse performance as the cell resolution becomes coarser. The routing model by Ducharne et al. () uses the mean travel time of each large routing cell, so the water in a cell will arrive at the outlet in the same day. This can be unreasonable, especially when the cell size is large. The routing model by Guo et al. () uses the mean flow path length for each cell, making the routing method scale-dependent. Therefore, it would benefit a hydrological modeler to find a routing method with the capability to be scale-independent, which will preserve the spatially distributed travel time information from a finer-resolution DEM for coarser resolution cells in the large-scale hydrological model. NRF, as a source-to-sink routing method, has been applied in basins of China, North America, and Africa (Gong et al. , ; Li et al. , , ; Ngongondo et al. ). The results showed that NRF provides good daily runoff simulation performance. However, in some cases, the calibrated wave velocity is too large to be physically realistic, and NRF will fail to provide good daily runoff simulation performance if the wave velocity becomes smaller (Gong et al. ; Li et al. ). In this method, the underlying factors that affect the routing process are not comprehensively considered. In this respect, the NRF routing method needs to be improved.
This study aims to improve NRF by considering the influence of the underlying surface in order to obtain a more reasonable wave velocity and travel time and better daily runoff simulation performance. To achieve this aim, the original NRF routing method is coupled with a velocity function proposed by Sircar et al. () in order to take the underlying surface's influence into consideration, in which a new routing parameter b is added to reflect how sensitive the wave velocity is to slope. The remainder of this paper is organized as follows. Following a description of the study area, we present the original and improved NRF routing methods. Then, the results of a daily runoff and peak flow simulation by the model before and after the improvement are compared. The travel time and routing parameters and the factors that influence them are analyzed and compared in different catchments to explore the relationship between catchment characteristics and model parameters. Finally, we present our conclusions.
As shown in Figure 1, the Yangtze River is the longest river in China and the third longest river in the world. It has a total length of 6,397 km.
The climate, terrain, land cover and land use, and runoff characteristics vary greatly among the catchments in the Yangtze River basin. Fifteen catchments in the Yangtze River basin were selected in this study in order to evaluate the performance of the improved routing method over a wide range of representative catchments. The characteristics of the terrain in those catchments are shown in Table 1.
'Elevation difference' means the height difference between the highest point and the lowest point in a catchment.
Table 1 also presents the shape factor and the river length of the catchments. The basin shape factor (Morisawa ) is the ratio of the actual basin length to the perimeter of a circle whose area is the same as that of the basin, and It is of great significance to simulate the runoff in the Yangtze River basin, as it accounts for 18% of the total area of China, and has more than 400 million people living in it. The Yangtze River basin is also the most important agricultural production base in China because of its warm and humid climate, which provides good conditions for crop growth (Tian et al. ).

Data and processing
Hydro1 K (USGS ) was used in this study to delineate the basin's boundary. It is a global DEM with a resolution of 1 × 1 km and was produced by the United States Geological Survey's Earth Resources Observing System.
The projection of HYDRO1 k is Lambert azimuthal equalarea projection, whose transformation equations (Weisstein) enable each grid to have an equal area.
Flow direction and flow accumulation databases based on Hydro1 k are also available, which can be used with the discharge stations to obtain the basin's boundary.
Then, the estimated basin boundary can be evaluated and corrected based on the basin boundary that was delineated using the finer DEM at the resolution of 90 × 90 m.
Daily precipitation, temperature, and relative humidity data for the period 1969-2008 were taken from the daily dataset of surface climatological data of China (V 3.0) in National Meteorological Information Center (http://data.cma.cn/ data/cdcdetail/dataCode/SURF_CLI_CHN_MUL_DAY_V3. 0.html). The meteorological data were interpolated using the inverse distance weighted method (Shepard ) to be

Water and Snow Balance Modeling
The Water and Snow Balance Modeling (WASMOD) system is a conceptual modeling system that was developed (), we set two more parameters in this study: c 5 is set to correct the precipitation bias, which results from the lack of snow measurements; c 4 is set in the potential evaporation equation. As shown in Table 2, there are five to seven parameters with or without the snow module. The snow modules are only applied to Jialing River, Jingsha River, and Mintuo River where snowmelting is one of the most important sources of runoff. If the snow module is not taken into consideration, parameters a 1 and a 2 will not be  calibrated in the hydrological model.
where p td is precipitation data described in the Study area and data section and p t is the corrected precipitation.
where pet t is the potential evaporation, tmp t is the temperature, and rh t is the relative humidity. For the wave velocity in grid, Beven & Kirkby () proposed that the overland flow velocity could be calculated based on the topography as follows:

The original routing method
where v i is the wave velocity of the grid, and c i is the slope of the grid. v 45 is the wave velocity in the grid whose slope is 45 .
In the study of Gong et al. (), the authors calculate the wave velocity as follows: For a given grid, the travel time can be calculated as follows: where c i is the slope, v i is the wave velocity, t i is the travel time of the grid, c 0 is the slope threshold, which is set to be a constant to prevent t i from being infinite, and k is the number of grids through the flow path. The slope threshold is 0.1 in this study.
Other details about the NRF routing methods can be seen in Gong et al. (). In this study, the computation cell size in each catchment was determined by its area to ensure a modest number of cells. The cell size in a larger catchment is larger. Conversely, the cell size is smaller in a smaller catchment. The cell size was set to be 10 0 at catchments whose area is greater than 50,000 km 2 . It was set to be 3 0 at catchments whose area is less than 10,000 km 2 and 6 0 otherwise.

The improved routing method
The where b is a parameter that reflects how sensitive the wave velocity is to slope. The value of the parameter b is related to the condition of the underlying surface of the catchment.
There are two parameters in the improved NRF routing method (referred to as 'after' in the following) and one parameter in the original NRF routing method (referred to as 'before' in the following). Table 3

Evaluation criteria
The model before and after the routing method improvement was calibrated and validated in all catchments. The warm-up period was 1 year for all catchments.
The Nash-Sutcliffe (NSE) efficiency (Nash & Sutcliffe ) was used as the criterion for evaluating overall runoff simulation performance in this study.
where O i is the observed runoff, S i is the simulated runoff, and n is the length of the time step.
To measure the peak flow simulation performance, the time lag and relative error of peak flow were also calculated.
ΔT ¼ T sim À T obs (10) where Q re is the relative error of the peak flow, Q sim max is the simulated maximum peak flow, Q obs max is the observed maximum peak flow, T sim is the day when Q sim max appears, and T obs is the day when Q obs max appears. ΔT is the time lag of the simulated peak flow relative to the observed peak flow.
Q re and ΔT of all selected flood events in a catchment were averaged to reflect the overall simulation performance of the peak flow in that catchment as follows: where ΔT is the averaged time lag of the peak flow in the catchment, Q re is the averaged relative error of the peak flow in the catchment, and l is the number of flood events in the catchment.

RESULTS AND DISCUSSION
Modeling performance assessment with the improved NRF Both routing methods were taken as the routing module of the WASMOD. To all catchments, the model was calibrated by using 7 years' data series and validated by using another 4 years' data series during the period of 1964-2008. Figure 2 shows the performance of the daily runoff simulation before and after the improvement.     That indicates that the simulated peak flow is often delayed     For most catchments, the travel time between grids that was calculated by the improved routing method was shorter than that calculated by the original routing method (Figure 6(a) and 6(b)). Besides this, the travel time between the specific upstream hydrological station and the outlet station was shorter and closer to the benchmark value for all the catchments (Figure 6(c)). The main reason why the travel time that was calculated by the improved routing method was longer than the benchmark value in some catchments is that the routing parameters v 45 and b are spatially constant over the whole catchment. However, routing refers to the process in which rainfall in a catchment evolves along a certain path into discharge at the outlet.
The rainfall is first routed by a hillslope and then routed by the river before being transferred into discharge. Hillslope routing and river routing are two forms of routing (Kirkby & Beven ). Usually, the runoff travels faster in river routing than in hillslope routing. However, the hill- Thus, the travel time in the river that is calculated by the improved routing method will be a little longer than the benchmark value. However, the travel time that was calculated in the Shigu catchment is much longer than the benchmark value, which is due to the Shigu catchment's underlying surface. The shape factor of the Shigu catchment is the largest of all of the studied catchments, which means that the shape of the Shigu catchment is narrow and long and the water system's development is immature. The above factors lead to much more hillslope routing and a longer travel time in the Shigu catchment. In addition, the snowmelt and groundwater are the main sources of runoff in the Shigu catchment. The snowmelt runoff moves more slowly down the hillslope than the rainfall runoff, so the optimal routing parameters that were calibrated using the observed runoff lead to a longer travel time in the Shigu catchment compared with the benchmark value, which only considers routing in the river.
The travel time that was calculated using the original routing method produces a late peak flow and a smaller amount of flow (Figure 4). An unreasonable travel time is one of the main causes of a hydrological model's poor performance in long-term runoff simulation. The improved routing method provides a more realistic travel time between the grids compared with the benchmark value.
When analyzed in conjunction with Figures 2 and 6, it can be found that the NSE and the travel time in all catchments have a close relationship. From Figure 2, it can be seen that the NSE that was calculated by the original NRF is lower in some catchments, i.e., the Waizhou, Shimen, Beibei, and Baihe catchments. As shown in Figure 6(c), the runoff travel times in these catchments, as calculated using the original routing method, were unrealistic and at least twice as long as the runoff travel times that were calculated using the improved routing model. On the other hand, for the Dufengkeng, Fushun, and Wulong catchments, we found no obvious change in the travel times that were calculated using the original and improved routing methods. The original and improved routing methods also provided a similar NSE in these catchments.
Analysis of the wave velocity calculated by the improved NRF Figure 7 shows the theoretical curves of wave velocity versus slope. Figure 8(a) and 8(c) plot the calibrated curves that were calculated by the original routing method and the improved routing method (Table 3), respectively. From the analysis in the section Analysis of the routing time calculated by the improved NRF, it is known that the improved routing method is able to achieve a more reasonable travel time. The reason for this is that the improved routing method has a more flexible wave velocity-slope curve ( Figure 7) that can better reflect the real world.
It can be seen from Figure 7 wave velocity is to slope. When b is equal to zero, the slope has no effect on wave velocity, which will be a con-

Factors that influence the wave velocity calculation
The wave velocity of a grid is affected not only by the grid's slope but also by many other factors, including the basin shape factor, rainfall intensity, vegetation cover, branching ratio, basin area, climate, and form of drainage. The factors that influence the travel time are so numerous that they may compensate for each other to some extent. It is difficult to calculate the wave velocity of each grid precisely. So, these influencing factors were generalized to be reflected by the parameter v 45 , the parameter b, and slope in this study.
The parameter b varies across basins due to differences in basin characteristics, and it represents the underlying surface.
The underlying surface influences the wave velocity and travel time in many ways. There is no strict and deterministic functional relationship between basin characteristic and mean wave velocity in a large-scale hydrological model.   (Table 1): mean discharge at the outlet station Q (m 3 /s); area A (km 2 ); shape factor F; mean slope S ( ); river length L r (m); and range of elevation ΔH (m).
The area, shape factor, and river length are basin geometric factors which affect the path and sequence of runoff and then influence the mean wave velocity. The slope and range of elevation influence the wave velocity, because potential energy will be converted into kinetic energy. In the process of runoff routes, runoff will continually merge and diverge, and the speed between water will affect each other. Thus, the mean discharge at the outlet station is also taken as an independent variable.
Based on data of the 15 studied catchments, multiple linear regression analysis was done using the above factors as independent variables and mean wave velocity as the dependent variable. The value of R 2 was found to be equal to 0.73 and the significance level of the F-test was 0.051, which indicates that the mean wave velocity is associated (at the 90% level) with the above factors. The final regression equation is shown as Equation (13) where v (m/s) is the mean wave velocity of the grids; Q (m 3 /s) is the mean discharge at the outlet station; A (km 2 ) is the area; F is the shape factor; S ( ) is the mean slope; L r (m) is the river's length; and ΔH (m) is the range of elevation.
There are other factors that are not included in Equation (13), and for R 2 it is not equal to 1. However, we have yet to find more accurate and detailed rules. The mean wave velocity that was predicted by Equation (13) was compared with the mean wave velocity that was calculated with the improved routing method as shown in Figure 9, which shows an overall good result.
In the future study, more data on the factors that influence wave velocity, such as soil, vegetation cover, and the branching ratio, should be inputted into the method in order to examine the relationship between routing parameters and basin characteristics. If appropriate, the improved routing model has the potential to be applied to data-sparse areas by a parameter transformation based on the distribution in a subzone of climate, soil, and vegetation, or even to calculate the travel time without parameters, which would enable this model to be used to simulate the hydrological effect of climate and land-use change.

CONCLUSIONS
In this paper, a flow routing algorithm was developed by combining an aggregation NRF with a velocity function.