ABSTRACT
In this study, a physics-informed machine learning-based surrogate model (SM) for the variable infiltration capacity (VIC) model was developed to improve simulation efficiency in the Yarlung Tsangpo River basin. The approach combines the empirical orthogonal function decomposition of low-fidelity VIC models to extract spatial and temporal features, with machine learning techniques applied to refine temporal feature series. This allows for accurate reconstruction of high-fidelity spatial simulations from the results of the low-fidelity model. Using the SM built from the 1.0°-resolution VIC model as an example, the study highlights the challenges and solutions associated with low-fidelity simulations. The SM significantly improves accuracy, achieving a Kling–Gupta efficiency of 0.88, an Nash–Sutcliffe efficiency of 0.97, and a PBIAS value of −6.21% with reduced computational demands. Additionally, different machine learning methods impact the performance of the SM, with the support vector machine regression model performing best in these methods. SMs from varying low-fidelity resolutions maintain similar accuracy, but higher resolutions notably enhance computational efficiency, reducing time by 86.31% when compared to the high-fidelity VIC model. These findings demonstrate the potential of the SM to enhance VIC model simulations while reducing computational requirements.
HIGHLIGHTS
Developed a physics-informed machine learning-based surrogate model to enhance the efficiency and accuracy of variable infiltration capacity (VIC) streamflow simulations.
Achieved significant reduction in computational time while retaining key spatial features of the original VIC model.
The model effectively balances accuracy and computational efficiency, making it suitable for large-scale hydrological studies.
INTRODUCTION
The study of hydrological systems is critical for understanding the impacts of climate change on water resources, assessing flood risks, and developing strategies for sustainable water management (Gu et al. 2020a). Distributed models have gained significant attention regarding these topics due to their ability to represent spatial variability in environmental processes, which are designed to simulate the spatial and temporal distribution of hydrological variables, such as runoff, evapotranspiration and soil moisture, across large geographical regions (Wang et al. 2024). For example, Vogeti et al. (2023) used the Soil Water Assessment Tool, Hydrologic Engineering Center–Hydrologic Modelling System, and Hydrologic Simulation Program Fortran to explore climate change impact on streamflow in the Lower Godavari Basin, India; Kovalets et al. (2015) forecasted the extreme floods in mountainous areas with the distributed hydrology soil vegetation model and the weather research and forecasting modeling system. The variable infiltration capacity (VIC) model is one of the popular distributed hydrological models, which represents various processes in the terrestrial water cycle, including runoff generation, soil moisture storage, and energy balance (Liang et al. 1994, 1996). The VIC has been widely adopted in water-related fields, such as climate change impacts on hydrology (Lu et al. 2013), runoff forecasts (Sinha & Sankarasubramanian 2013), and water resources assessment (Sridhar et al. 2019), because of its ability to simulate complex processes using a wide range of parameters, allowing for more accurate simulations. However, the computational demands of the VIC model pose substantial challenges (Tsai et al. 2021).
To alleviate this computational burden, surrogate models (SMs) have emerged as a promising solution. SMs approximate the behavior of more complex models, allowing for faster simulations while maintaining an acceptable accuracy (Maier et al. 2014). Among the different types of SMs, artificial neural networks (ANNs) (Behzadian et al. 2009; Zhou et al. 2021), Gaussian process regression (GPR) models (Gong et al. 2016; Zhang et al. 2016), and low-fidelity models (Razavi & Tolson 2013) have shown great promise in enhancing computational efficiency in hydrological studies. However, traditional SMs such as ANNs and GPRs often struggle to balance computational efficiency with the ability to capture the complex, nonlinear dynamics and spatial-temporal variability inherent in hydrological systems (Razavi et al. 2012). The effectiveness of SMs has been demonstrated in numerous hydrological studies, where they have been used for applications such as streamflow forecasting (Gu et al. 2024a) and flood risk modeling (Zhou et al. 2021; Mosimann et al. 2024). However, these studies typically focus on temporal variations while neglecting spatial correlations. This characteristic leads to a lumped representation in SMs, weakening the advantage of distributed hydrological models in spatial simulation (Gu et al. 2020b).
In recent years, physics-informed machine learning (PIML) has gained attention as a novel approach that can integrate physical knowledge into the training of data-driven models (Karniadakis et al. 2021). Generally speaking, traditional black box models perform well in both simulation efficiency and data fitting (Mishra et al. 2024; Vogeti et al. 2024). However, some studies have also pointed out that the predictions of black box models can be unrealistic or inconsistent with physical laws due to errors from extrapolating beyond the data or biases in the observations, which can lead to poor generalization (Razavi 2021; Yaseen 2023). To settle this issue, PIML teaches machine learning models the laws of physics by incorporating fundamental physical principles and domain knowledge, which in turn provides informative priors, offering strong theoretical constraints and inductive biases on top of the observational data (Karniadakis et al. 2021). This unique feature allows PIML to bridge the gap between high-fidelity physical models and purely data-driven approaches, offering a balance between accuracy and computational efficiency (Fraehr et al. 2023a). By integrating physical constraints, PIML can reduce the need for extensive training data, making them suitable for complex hydrological models like VIC. A hybrid approach involving a low-fidelity model, spatial analysis, and Gaussian processing regression is one kind of PIML, which uses simplified physical models to provide initial approximations and uses the machine learning algorithm to refine these approximations and complete high-fidelity simulation (Fraehr et al. 2024). The dimension reduction techniques, such as empirical orthogonal function (EOF) analysis, have been introduced to overcome these limitations of spatial simulation in the SM (Hu et al. 2019; Donnelly et al. 2022).
This study aims to enhance hydrological modeling by developing a PIML-based SM to boost the VIC model's efficiency in streamflow simulation. The specific objectives are: (1) to decrease computational demands of the VIC model through PIML techniques; (2) to improve the SM's accuracy while minimizing fidelity loss compared to the high-fidelity VIC outputs; and (3) to retain as much spatial detail from the high-fidelity VIC model as possible.
METHODS
Hydrological model
The VIC is a grid-based distributed hydrological model designed for large-scale catchments (Liang et al. 1994, 1996). In the VIC model, each grid cell operates independently without considering the lateral communication between cells. The VIC model does not have a built-in routing module. Hence, a source-to-sink routing model to solve linearized Saint-Venant equations (Lohmann et al. 1996, 1998) is utilized to simulate the movement of runoff and baseflow from each grid cell to a river channel network.
Surrogate model
Low-fidelity, spatiotemporal decomposition, and post-processing model
Framework of the LSP, where EOF is the abbreviation for EOF and PC is the abbreviation for PC.
Framework of the LSP, where EOF is the abbreviation for EOF and PC is the abbreviation for PC.
EOF analysis
The EOF analysis (Marukatat 2023), also known as principal component analysis in the context of spatial data, is a statistical method used to identify and analyze patterns in high-dimensional datasets. It is particularly effective in reducing dimensionality while preserving the essential characteristics of the data. The main steps of EOF are as follows:
(1) Standardization. Remove the trend and periodic components from the data, and then standardize the data into a mean value of 0 and a standard deviation of 1.
The percent of variance explained gives the percentage of the total variance that is explained by each mode. With the percent and cumulative percent of variance explained, the principal modes are selected.
PCk represents the temporal variation of the kth EOF mode, and νk represents the spatial variation.



Machine learning model
(1) Gaussian process regression
GPR is a Bayesian regression technique that provides a probabilistic approach to modeling complex relationships in data (Williams & Rasmussen 2006). GPR operates by defining a prior distribution over functions and updating this distribution based on observed data, which allows it to capture uncertainty in predictions. Unlike traditional regression methods, GPR can automatically account for noise in the data and provides confidence intervals for its predictions (Schölkopf et al. 2001). Key advantages of GPR include: (1) the ability to model complex, nonlinear relationships without the need for specific parametric forms; (2) a natural incorporation of uncertainty estimation, allowing for more informed decision-making; (3) flexibility in choosing kernel functions to tailor the model to specific datasets; and (4) suitability for small- to medium-sized datasets where the overhead of computation remains manageable.
(2) Backpropagation (BP) neural network
The BP neural network model is a widely used supervised learning algorithm in the field of ANNs (Rumelhart et al. 1986). The BP model is composed of the input layer, the hidden layers, and the output layer. The BP algorithm is used to train the weights in the network and correct model errors, which includes forward-pass and backward-pass phases. In the former phase, the input data are fed into the network, and the output is calculated. In the latter phase, the error calculated from the output is propagated back through the network to update the weights and biases. Unlike simpler algorithms, BP can learn complex nonlinear relationships between inputs and outputs by using multiple layers of neurons. When acquired sufficient data and appropriate network complexity, the BP model can approximate almost any continuous function. Its flexible structure to allow customizing internal layer types and sizes makes the model suitable for a wide array of applications.
(3) Support vector machine (SVM) regression model
SVM is a powerful supervised learning technique used to predict continuous outcomes. It builds on the concepts of SVM for classification but focuses on fitting a function that can predict values based on input data (Cortes & Vapnik 1995). The model works by finding the best hyperplane that fits the data, while allowing for a certain amount of error defined by a tolerance margin. SVM can handle both linear and nonlinear relationships thanks to the use of different kernel functions.
(4) Random forest (RF)
RF refers to an ensemble learning method that uses multiple trees to train and predict samples. RF utilizes bootstrapping to generate diverse subsets from the original dataset to build several models and integrate them rather than building a single global model from raw data, which helps mitigate overfitting to some extent (Mariano & Monica 2021). Additionally, decision trees of RF can be trained in parallel, accelerating the algorithm's training process through parallel computing, while also reducing data dimensionality by randomly selecting features.
Accuracy evaluation index

These three parameters are the most commonly used evaluation criteria in the assessment of hydrological models. NSE is sensitive to extreme values or outliers, which can be used to evaluate overall model efficiency and how well the model captures the observed data's variability (Moriasi et al. 2007). PBIAS is important for quantifying the overall tendency of the model to overestimate or underestimate the observed values, providing insight into the model's systematic bias. KGE is a more balanced metric used for a more comprehensive assessment by evaluating correlation, variability, and bias separately, helping to identify specific weaknesses in the model (Althoff & Rodrigues 2021). In this study, these three evaluation criteria were used simultaneously to conduct a comprehensive assessment of the model.
Model setting
In this study, we established four VIC models with different resolutions of 0.25, 0.50, 0.75, and 1.00°, which are denoted as VIC25, VIC50, VIC75, and VIC100, respectively. Among these, VIC25 is referred to as the high-fidelity model, while VIC50, VIC75, and VIC100 are considered low-fidelity models. The parameters of the VIC models are calibrated with the ε-Dominance Non-Dominated Sorted Genetic Algorithm II (ε-NSGA II) (Kollat & Reed 2006; Liu et al. 2017). The details of the algorithm can be found in Gu et al. (2020a). The LSP models constructed with the low-fidelity models, i.e. VIC50, VIC75, and VIC100, are referred to as SM50, SM75, and SM100, respectively. Due to the ability to model complex, nonlinear relationships without the need for specific parametric forms, the GPR model is employed by default for post-processing in the LSP models. In this study, the GPR model utilizes the squared exponential kernel function, while the SVM model uses the linear kernel function, which is found to be most effective in the post-processing for PC. In the RF model, the number of decision trees in the bagged ensemble and the minimum number of leaf node observations are set as 200 and 5, respectively. The three-layer BP model is employed. The number of nodes in the hidden layer is calculated in accordance with the empirical formula proposed by Li & Liu (2009), and set as 25 in this study. The default values for other parameters suggested by Beale et al. (2010) are employed in the training of these machine learning models.
STUDY AREA AND DATA
Study area
Location of the YTR basin, the hydrological station, and the glacier distribution.
Location of the YTR basin, the hydrological station, and the glacier distribution.
The YTR basin experiences an average annual precipitation of around 560 mm, which is relatively stable from year to year. Most of this rainfall occurs during the summer monsoon season, between June and September, influenced by the southwest monsoon (Su et al. 2016; Gu et al. 2023). The average temperature in the basin is about 6.27 °C, with peak temperatures observed in June and July and the lowest in January. This concentrated precipitation pattern, combined with seasonal temperature variations, plays a significant role in the hydrology of the basin, where a large amount of the glacier and snowmelt water is always accompanied by a large amount of precipitation. The hydrological behavior of the YTR basin is primarily driven by a combination of factors, including monsoon-controlled precipitation, snow and glacier meltwater, providing a complex system for the study of hydrological models (You et al. 2007; Chen et al. 2017).
Data
The China Meteorological Forcing Dataset (CMFD) is a gridded near-surface meteorological dataset (Yang & He 2019; He et al. 2020). In this study, the forcing data of the VIC model, including daily precipitation, temperature, wind speed, relative humidity, and radiation, is provided by CMFD. These forcing data range from 1999 to 2018. The daily streamflow of the NX station from 2000 to 2015 is utilized for model calibration and validation. Here, the calibration and validation periods for VIC are 2000–2010 and 2011–2015, respectively. As for the SM, an additional period from 2016 to 2018 is used for testing to show the robustness of the proposed SM. Meanwhile, the training period for the surrogate is from 2000 to 2015.
The terrain data of the YTR basin are from the Shuttle Radar Topography Mission digital elevation model (SRTM DEM). The SRTM DEM is a global elevation dataset that provides high-resolution topographic data, which is widely used in hydrological fields (Sharma & Tiwari 2014). The soil and land cover data for building the VIC model are from the Harmonized World Soil Database (HWSD) (Fischer et al. 2008) and the land cover dataset from the Environmental and Ecological Science Data Center for West China (WESTDC) (Ran et al. 2010). The soil data of HWSD in China are provided by the Nanjing Institute of Soil Science from the second national land survey. The distribution and area of glaciers in the YTR basin are derived from the Second Glacier Inventory of China (Guo et al. 2015).
RESULTS
Results of the VIC model
Comparison of performance across the models with different resolutions
Metric . | Model . | Calibration . | Validation . | Training . | Testing . |
---|---|---|---|---|---|
KGE | VIC25 | 0.91 | 0.88 | – | – |
VIC50 | 0.81 | 0.79 | 0.80 | 0.79 | |
VIC75 | 0.77 | 0.75 | 0.81 | 0.78 | |
VIC100 | 0.65 | 0.63 | 0.64 | 0.62 | |
NSE | VIC25 | 0.87 | 0.86 | – | – |
VIC50 | 0.85 | 0.84 | 0.96 | 0.96 | |
VIC75 | 0.81 | 0.79 | 0.96 | 0.95 | |
VIC100 | 0.75 | 0.73 | 0.88 | 0.86 | |
PBIAS | VIC25 | −1.11% | −0.27% | – | – |
VIC50 | 18.26% | 19.83% | 17.50% | 20.16% | |
VIC75 | 20.32% | 21.69% | 18.80% | 22.02% | |
VIC100 | 33.86% | 35.93% | 32.27% | 36.30% |
Metric . | Model . | Calibration . | Validation . | Training . | Testing . |
---|---|---|---|---|---|
KGE | VIC25 | 0.91 | 0.88 | – | – |
VIC50 | 0.81 | 0.79 | 0.80 | 0.79 | |
VIC75 | 0.77 | 0.75 | 0.81 | 0.78 | |
VIC100 | 0.65 | 0.63 | 0.64 | 0.62 | |
NSE | VIC25 | 0.87 | 0.86 | – | – |
VIC50 | 0.85 | 0.84 | 0.96 | 0.96 | |
VIC75 | 0.81 | 0.79 | 0.96 | 0.95 | |
VIC100 | 0.75 | 0.73 | 0.88 | 0.86 | |
PBIAS | VIC25 | −1.11% | −0.27% | – | – |
VIC50 | 18.26% | 19.83% | 17.50% | 20.16% | |
VIC75 | 20.32% | 21.69% | 18.80% | 22.02% | |
VIC100 | 33.86% | 35.93% | 32.27% | 36.30% |
Note: VIC25, VIC50, VIC75, and VIC100 represent the models with spatial resolutions of 0.25, 0.50, 0.75, and 1.00°. Items ‘Calibration’ and ‘Validation’ represent the performance between the VIC-simulated streamflow and the observations at NX station in the calibration and validation periods. Items ‘Training’ and ‘Testing’ represent the performance between the streamflow simulated by low-fidelity VIC models and the VIC25-simulated streamflow at NX station in the training and testing periods.
Daily streamflow hydrograph of VIC for the calibration and validation periods of the NX station, where VIC25, VIC50, VIC75, and VIC100 represent the models with spatial resolutions of 0.25, 0.50, 0.75, and 1.00°.
Daily streamflow hydrograph of VIC for the calibration and validation periods of the NX station, where VIC25, VIC50, VIC75, and VIC100 represent the models with spatial resolutions of 0.25, 0.50, 0.75, and 1.00°.
Spatial maps of KGE, NSE, and PBIAS between the runoff simulated by the high-fidelity model and the runoff simulated by the low-fidelity model during the training and testing period, where VIC50, VIC75, and VIC100 represent the metrics between the model with spatial resolutions of 0.25° and the models with spatial resolutions of 0.50, 0.75, and 1.00°, respectively.
Spatial maps of KGE, NSE, and PBIAS between the runoff simulated by the high-fidelity model and the runoff simulated by the low-fidelity model during the training and testing period, where VIC50, VIC75, and VIC100 represent the metrics between the model with spatial resolutions of 0.25° and the models with spatial resolutions of 0.50, 0.75, and 1.00°, respectively.
Selection of principal modes
KGE, NSE, and PBIAS between the streamflow simulated by the high-fidelity model at the NX station and the streamflow simulated by the SM with different modes in (a)–(c) the training period and (d)–(f) the testing period.
KGE, NSE, and PBIAS between the streamflow simulated by the high-fidelity model at the NX station and the streamflow simulated by the SM with different modes in (a)–(c) the training period and (d)–(f) the testing period.
(a) KGE, (b) NSE, and (c) PBIAS between the runoff of grid cells simulated by the high-fidelity model and the runoff simulated by the SM with different modes.
(a) KGE, (b) NSE, and (c) PBIAS between the runoff of grid cells simulated by the high-fidelity model and the runoff simulated by the SM with different modes.
Performance of SMs constructed with different spatial resolutions
(a) Daily streamflow hydrograph and (b), (c) empirical cumulative distribution function (CDF) plots of the simulated streamflow at the NX station between the SMs (SM50, SM75, and SM100) and the high-fidelity model (VIC25), where VIC25, SM50, SM75, and SM100 represent the VIC model with spatial resolutions of 0.25° and the SMs constructed with spatial resolutions of 0.50, 0.75, and 1.00°, respectively.
(a) Daily streamflow hydrograph and (b), (c) empirical cumulative distribution function (CDF) plots of the simulated streamflow at the NX station between the SMs (SM50, SM75, and SM100) and the high-fidelity model (VIC25), where VIC25, SM50, SM75, and SM100 represent the VIC model with spatial resolutions of 0.25° and the SMs constructed with spatial resolutions of 0.50, 0.75, and 1.00°, respectively.
Figure 8(b) further confirms that the streamflow time series during the training period are nearly identical across all models. However, Figure 8(a) and 8(c) reveal that the peak flows produced by the SMs are slightly lower than those of the high-fidelity model during the testing period. This suggests that while the SMs effectively replicate the overall streamflow dynamics, they may underpredict peak streamflow events. Additionally, the simulation capabilities of the three SMs are consistent with each other, indicating that they can reliably serve as surrogates for the high-fidelity model.
Spatial maps of KGE, NSE, and PBIAS between the runoff simulated by the high-fidelity model and the runoff simulated by the SMs during the training and testing period, where SM50, SM75, and SM100 represent the metrics between the model with spatial resolutions of 0.25° and the SMs constructed with spatial resolutions of 0.50, 0.75, and 1.00°, respectively.
Spatial maps of KGE, NSE, and PBIAS between the runoff simulated by the high-fidelity model and the runoff simulated by the SMs during the training and testing period, where SM50, SM75, and SM100 represent the metrics between the model with spatial resolutions of 0.25° and the SMs constructed with spatial resolutions of 0.50, 0.75, and 1.00°, respectively.
Furthermore, Figure 9 confirms again that the SMs provide nearly identical performance to the high-fidelity model during the training period. However, during the testing period, the SMs tend to underestimate runoff in the upper reaches of the YTR basin. This underestimation is likely a contributing factor to the observed lower streamflow estimates at the NX station. Despite this, the overall consistency between the SMs and the high-fidelity model supports their utility as computationally efficient surrogates for simulating hydrological processes.
Comparison of SMs with different machine learning methods
Performance between the streamflow simulated by the high-fidelity model and the streamflow simulated by the SM with machine learning methods at the NX station
Method . | Training . | Testing . | ||||
---|---|---|---|---|---|---|
KGE . | NSE . | PBIAS (%) . | KGE . | NSE . | PBIAS (%) . | |
GPR | 1.00 | 1.00 | 0.00 | 0.88 | 0.97 | −6.21 |
BP | 0.99 | 0.99 | 0.12 | 0.94 | 0.99 | −0.54 |
SVM | 0.99 | 0.99 | 0.37 | 0.98 | 0.99 | 0.14 |
RF | 1.00 | 1.00 | −0.02 | 0.92 | 0.98 | −3.44 |
Method . | Training . | Testing . | ||||
---|---|---|---|---|---|---|
KGE . | NSE . | PBIAS (%) . | KGE . | NSE . | PBIAS (%) . | |
GPR | 1.00 | 1.00 | 0.00 | 0.88 | 0.97 | −6.21 |
BP | 0.99 | 0.99 | 0.12 | 0.94 | 0.99 | −0.54 |
SVM | 0.99 | 0.99 | 0.37 | 0.98 | 0.99 | 0.14 |
RF | 1.00 | 1.00 | −0.02 | 0.92 | 0.98 | −3.44 |
Performance between the runoff of grid cells simulated by the high-fidelity model and the runoff simulated by the SM with machine learning methods in the testing period.
Performance between the runoff of grid cells simulated by the high-fidelity model and the runoff simulated by the SM with machine learning methods in the testing period.
Distribution of the glacier and six subwatersheds in the YTR, referring to Gu et al. (2020b).
Distribution of the glacier and six subwatersheds in the YTR, referring to Gu et al. (2020b).
DISCUSSION
Efficiency evaluation
Another key point of the SM is the efficiency. In this study, the SM and the VIC models are operated with one processor of an Intel i9-13900HX Central Processing Unit (CPU) on the Ubuntu 22.04 platform. Based on the practical application of the SMs, this study uses the testing period to evaluate the computational efficiency of the models. Table 3 presents the time required to compute runoff for the testing period using different resolution VIC models under the same operating environment. The computational saving index (Razavi et al. 2012) is used to measure the efficiency gains of surrogates compared with VIC. For VIC, the consumed time decreases remarkably with the simulated spatial resolution from 0.25 to 1.00°. For the LSP model, it requires only a few seconds to reconstruct the low-fidelity model into the high-fidelity model. As a result, the computational time of the LSP model and the routing process is almost negligible compared to the simulation time required by the VIC model. It is noteworthy that the time required for model spatial resolution decreases from 25 to 50, 75, and 100° does not reduce in a proportional manner. This is because, at very low spatial resolutions, more grid cells are needed to ensure complete coverage of the basin. Consequently, the computational time does not scale down linearly with the reduction in resolution. Overall, the simulation efficiency of the SMs shows a significant improvement compared to the high-fidelity model, particularly with SM100, which saves 86.31% of the computational time.
Comparison of the time consumed by VIC and the SM
Model . | Consumed time (s) . | Computational saving (%) . | |||
---|---|---|---|---|---|
VIC . | EOF . | Routing . | Total . | ||
VIC25 | 1,207 | – | 6 | 1,213 | – |
SM50 | 420 | 12 | 6 | 438 | 63.89 |
SM75 | 225 | 12 | 6 | 243 | 79.97 |
SM100 | 158 | 12 | 6 | 166 | 86.31 |
Model . | Consumed time (s) . | Computational saving (%) . | |||
---|---|---|---|---|---|
VIC . | EOF . | Routing . | Total . | ||
VIC25 | 1,207 | – | 6 | 1,213 | – |
SM50 | 420 | 12 | 6 | 438 | 63.89 |
SM75 | 225 | 12 | 6 | 243 | 79.97 |
SM100 | 158 | 12 | 6 | 166 | 86.31 |
Note. VIC’ is the time consumed with VIC; ‘EOF’ is the time consumed with the EOF model; ‘Routing’ is the time consumed for routing the streamflow at the outlet station; ‘Total’ is the total time consumed with SM (VIC + EOF + Routing); the computational saving index δ is calculated with the formula ‘δ = (tv–ts)/tv × 100%’, where tv is the time consumed by VIC and ts is the total time consumed by SMs.
Analysis of spatial results
Figures 4 and 9 illustrate a notable improvement in the spatial simulation accuracy of the SMs compared to the low-fidelity models. However, certain areas still exhibit lower accuracy in the simulations. Interestingly, these areas correspond to Cluster 2 and Cluster 4, as defined in the study by Gu et al. (2020b), which divided the YTR basin into six subbasins, as shown in Figure 11. Cluster 1 and Cluster 2 are located in transition zones between glacier-covered and non-glacier regions. These zones present unique modeling challenges due to the complex hydrological processes, such as snowmelt, glacier dynamics, and varying precipitation patterns. The low-fidelity models, for their larger spatial resolution, are unable to effectively simulate this information. This leads to the difficulty for the GPR model to fully capture the complex processes from the low-fidelity models. Improving model accuracy in these regions might require the integration of additional hydrogeological data. Additionally, we found that using more complex machine learning models, such as SVM and BP, can improve the spatial simulation accuracy of the SMs. This suggests that using more advanced machine learning methods can help address the challenges of simulating hydrological processes in complex regions.
Limitation and perspective
While the SM improves computational efficiency for low-fidelity models, it still faces challenges in accurately capturing fine-scale spatial details in the high-fidelity model. In some cases, the model may still struggle to preserve the full spatial heterogeneity of the hydrological processes, particularly for areas with highly dynamic or complex topography. The model is primarily tested in the YTR basin, and its performance may vary when applied to different geographical regions with distinct climatic and hydrological characteristics. The transferability of SM to other basins needs further validation to assess its robustness across different hydrological regimes. Future work should focus on applying SM to other basins with varying hydrological regimes and climate conditions to test its generalizability. Additionally, expanding the model to multi-scale applications (e.g. from regional to global scales) could further demonstrate its potential for large-scale hydrological assessments.
This study has not considered global climate models (GCMs) or shared socioeconomic pathways (SSPs) due to the focus on improving the computational efficiency and accuracy of the low-fidelity model. Including GCMs and SSPs will significantly increase the complexity and computational cost of the analysis. In future work, GCMs and SSPs will be considered in conjunction with SM for assessing the impacts of future climate scenarios on streamflow and water resources. This could enable the model to provide valuable insights into how hydrological behavior may change under different climate change trajectories, providing better guidance for long-term water resource management and climate adaptation planning. In addition, the ability of SM to significantly reduce computational time opens up opportunities for its integration with real-time hydrological forecasting systems. By integrating numerical weather forecast data, SM could provide rapid flood predictions and inform flood management decisions.
CONCLUSION
To improve the efficiency of the VIC model and alleviate the burden of computation, the VIC SM based on a PIML called the LSP model was proposed and explored in the YTR basin. By conducting EOF decomposition on low-fidelity models, the spatial and temporal features were extracted successfully, whereas the spatial features were from the high-fidelity model. The machine learning algorithms were then employed to correct the temporal feature series from the low-fidelity models. By utilizing these spatial and temporal features, high-fidelity spatial simulation was reconstructed. The main results we found are as follows:
(1) As the spatial resolution of the VIC model decreases, the capacity of both the spatial runoff simulation and the streamflow simulation declines. Specifically, at a spatial resolution of 1.00°, the model's performance is poorest, with a KGE value of 0.63, an NSE value of 0.73, and a PBIAS value of 35.93% in the validation period, using observation as a benchmark.
(2) The LSP model effectively enhances the simulation accuracy of low-fidelity models. SM100 developed from VIC100, benchmarked against SM25, shows notable improvement during the testing period, with a KGE value of 0.88, an NSE value of 0.97, and a PBIAS value of −6.21%. This represents a notable enhancement compared to VIC100.
(3) Different machine learning algorithms have varying effects on the LSP model's performance. Compared to GPR and RF, SVM, and BP show improvements in spatial simulation and peak flow simulation at the NX station. Overall, the SMs developed with different machine learning algorithms achieve acceptable simulation accuracy.
(4) SMs built from low-fidelity models with different spatial resolutions exhibit similar simulation performance. However, simulation efficiency increases as the spatial resolution of the base low-fidelity model increases. Notably, SM100 can reduce computational time by 86.31% compared to VIC25.
ACKNOWLEDGEMENT
This study is financially supported by the National Natural Science Foundation of China (52209036; 52309038; and 52479028), the National Key Research and Development Program of China (2021YFC3201105), and the Natural Science Foundation of Zhejiang Province (LY22E090010; LMS25E090004). We are grateful to the National Tibetan Plateau Data Center (http://data.tpdc.ac.cn) and the National Cryosphere Desert Data Center (http://www.ncdc.ac.cn) for providing the datasets for modeling.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories: CMFD data are downloaded from https://doi.org/10.11888/AtmosphericPhysics.tpe.249369.file. SRTM DEM data are downloaded from https://earthexplorer.usgs.gov/. Soil data are downloaded from https://www.fao.org/soils-portal/soil-survey/soil-maps-and-databases/harmonized-world-soil-database-v12/en/. WestDC landcover data are downloaded from https://cstr.escience.org.cn/CSTR:11738.11.ncdc.Westdc.2020.678. The Second Glacier Inventory of China can be accessed at https://doi.org/10.3972/glacier.001.2013.db.
CONFLICT OF INTEREST
The authors declare there is no conflict.