Abstract
Commencement of the Gravity Recovery and Climate Experiment (GRACE) provides an alternative way to monitor changes in terrestrial water storage (TWS) at large scales. However, GRACE dataset spans from 2002 to present, which greatly limits the application of GRACE data for long-term hydrological studies. Thus, the general linear model (GLM), random forest (RF), support vector machines (SVM), and artificial neural networks (ANN) methods were used to reconstruct the time series of terrestrial water storage anomalies (TWSA, i.e., remove the average value from the time series) in Northwest China (NWC) during 1948–2002 based on the GRACE TWSA during 2003–2015 and hydrological data from the Global Land Data Assimilation System (GLDAS) during 1948–2010. The results showed that soil moisture (SM) anomalies, or the combination of SM, canopy water (CW), and snow water equivalent (SWE) anomalies were better than the other anomalies of GLDAS in NWC. RF method can be regarded as the optimal method to reconstruct TWSA in NWC in the four models. A negative relationship was found between the reconstructed TWSAs and El Niño-Southern Oscillation (ENSO). The method could also offer an approach to reconstruct TWSA and drought events in large river basins during the past several decades.
INTRODUCTION
Hydrological state variables, including surface water, SM, and groundwater, play major roles in the water cycle and land surface processes (Rodell et al. 2009; Seneviratne et al. 2010). Terrestrial water storage (TWS) is typically defined as the sum of SM, surface water, groundwater, and snow. The snow, surface water, and groundwater components of TWS are core water resource variables, relevant to monitoring and predicting hydrological drought and to making informed long-term water management decisions (Kumar et al. 2016). With the development of remote sensing techniques, problems associated with point- and field-scale measurement techniques for hydrological studies can be partly resolved (Swenson & Wahr 2002; Cao et al. 2015), especially after the commencement of the gravity recovery and climate experiment (GRACE). On large spatial scales, GRACE provides an effective way to monitor changes in TWS, which includes both surface and subsurface water (Swenson & Wahr 2002; Rodell et al. 2009).
Two GRACE satellites were launched by the NASA and Digital Learning Sciences (DLS) in March 2002 (Rodell et al. 2004a; Thomas et al. 2014). The distance between the two satellites, which can be accurately measured by those satellites, is about 220 km (http://grace.jpl.nasa.gov/). The change in their distance is primarily caused by variations in the gravitational field (Rodell et al. 2004a). Therefore, based on the changes in the distance between the two satellites, the temporal variation in mass, particularly the change in TWS, can be derived using spherical harmonic coefficients (Tapley et al. 2004). The spherical harmonic coefficients of GRACE have been applied to assess changes in TWS or drought events from basin to continental scales (Chen et al. 2009; Joodaki et al. 2014). For example, Matsuo & Heki (2010) estimated that the glaciers on the Asian Alps decreased at a rate of 6.1 × 1010 tons/a by taking the variation of gravity field and glacial isostatic adjustment (GIA) into consideration for calculating changes in glacier storage. Feng et al. (2013) found that groundwater level displayed a decreasing trend in the North China Plain, at the speed of 1.1 cm/a, which was consistent with the GRACE data. Cao et al. (2015) proposed the method that depicts accurately the drought events during 2009 in NWC, based on the comparison of the results with the standard precipitation index (SPI). Oliveira et al. (2014) proposed that the TWS in the northeast of the Brazilian Cerrado had an upward trend during 2003–2010 by combining water balance variables with GRACE data.
The Global Land Data Assimilation System (GLDAS) was developed as a joint project of the Goddard Space Flight Center (GSFC) of NASA, and the National Center for Environmental Prediction (NCEP) of the National Oceanic and Atmospheric Administration (NOAA). Its land surface process models are driven using both surface observations and satellite remote sensing data. After the model simulations and data assimilation, global surface state variables (e.g., SM, and surface temperature) and flux data (e.g., evaporation and heat flux) are generated (Ghazanfari et al. 2013). Meanwhile, GLDAS products have been widely used in the study of global change and the water cycle, as well as for the comparative analysis of many satellite remote sensing products (e.g., MODIS and GRACE) (Proulx et al. 2013; Yang et al. 2013). The earliest GLDAS products are for 1948, which is valuable in researching changes in hydrological variables over the long term.
Although GRACE was proven to be a useful tool for measuring changes in TWS from regional to global scales, a shortcoming is that the time span of the data is just over a decade (2002 to the present), which greatly limits its applications for long-term hydrological studies. Several attempts have been made to reconstruct TWS time series before 2002 using GRACE data (Becker et al. 2011; Long et al. 2015; Nie et al. 2016; Zhang et al. 2016). For example, Becker et al. (2011) applied the interpolation method developed by Kaplan et al. (2000) to extend the TWS anomaly (TWSA) time series to cover the period from 1980 to 2008 using TWSA data from GRACE and observed river level data. Long et al. (2015) employed an artificial neural networks (ANN) model to reconstruct TWS (1979–2012) using commonly available hydro-meteorological data (e.g., precipitation and temperature). These reconstructed TWS data were then used to investigate drought and flood events in Southwest China. Nie et al. (2016) reconstructed the TWS of the Amazon Basin during 1948–2013 by combining the GRACE TWSA data with GLDAS products through a regression model, and the results were in good agreement with the variation of the ENSO.
In spite of previous efforts, there is no research about the extension of TWSA and the relationship between TWSA and atmosphere circulation in NWC over the past several decades. The objective of this study is to apply a general linear model to reconstruct the TWSA in NWC from 1948 to 2003 by applying GRACE and GLDAS data. The best reconstructed TWSA series is then selected by comparing to the different criteria, and the best hydrological variable or combination is determined. Finally, the best hydrological variable or combination is used to reconstruct TWSA in NWC through the advanced methods (i.e., random forest (RF), support vector machines (SVM), and ANN).
STUDY AREA AND DATA
The study area covers NWC (73 °E to 120 °E and 30 °N to 50 °N; Figure 1) with an area of more than 3.3 × 106 km2, which accounts for more than 25% of the total area of China. The main topographic features consist of the Eastern Tianshan Mountains, Southern Altay Mountains, Kunlun Mountains, Qianlian Mountains, Helan Mountains, and a series of inland basins, such as Junggar Basin, Tarim Basin, and Hexi Corridor (Chen et al. 2014). Due to the long distance between the ocean and the study area and its surrounding mountains, regional climates are of arid and semi-arid types, with potential evapotranspiration greatly exceeding precipitation in some parts of the region (Cao et al. 2015). The annual precipitation in NWC is less than 300 mm, except for the Yili River Basin where annual precipitation is about 700 mm. Temperatures warm up quickly in spring, and are highest in summer and lowest in winter (−2 °C to 19 °C) (Wang et al. 2013; Deng et al. 2014). As westerly winds prevail here in winter and spring, dust is distributed and deposited in NWC (Cao et al. 2015).
The study area. The lower right corner sub-plot stands for the location of the study area in China; the upper left corner sub-plot represents the specific condition of the study area in China. The black points represent the data grids of GRACE and GLDAS.
The study area. The lower right corner sub-plot stands for the location of the study area in China; the upper left corner sub-plot represents the specific condition of the study area in China. The black points represent the data grids of GRACE and GLDAS.
Under the impacts of human activities in the past several decades, especially the groundwater exploitation, the region is seriously threatened by diminishing water resources (Cao et al. 2015). For instance, Mi et al. (2016) found that the groundwater level in the middle of the Heihe River basin decreased at a rate of 0.22 m/a from 1985 to 2013; and Chen et al. (2016a) found that groundwater levels in irrigation areas of the Zhangye Oasis fell at a rate of 1 m/a, while those in non-irrigation areas decreased at a rate of 0.2 m/a. Therefore, sustainable development in this region requires an accurate evaluation of the regional water resources, including the temporal evolution of water storage.
GRACE data
There are three organizations (the University of Texas Center for Space Research (CSR), USA, the Jet Propulsion Laboratory (JPL), USA, and Geo Forschungs Zentrum (GFZ), Potsdam, Germany) that provided the GRACE data based on different algorithms for calculating TWSA (available at http://grace.jpl.nasa.gov/). Here, the GRACE Level-3 RL05 data, which are based on the GRACE Level-2 RL05 spherical harmonics coefficients with truncation at 60th degree and order, were applied. C2 0 coefficients (the spherical harmonics coefficients at second degree and zero order) were replaced with satellite laser ranging (SLR) estimates, and the monthly equivalent water height (EWH) anomalies were obtained by removing the monthly averaged EWH during 2004–2009 from that during 2003–2015. The non-atmospheric mass variation, estimated degree-1 coefficients, destriping filter, and Gaussian filter with 300 km smooth radius were all included in the data processing of the GRACE Level-3 RL05 (http://grace.jpl.nasa.gov/data/get-data/monthly-mass-grids-land/). The spatial resolutions of the GRACE data used here are 1° × 1°, and the temporal resolutions monthly. Detailed information on the data processing of the GRACE Level-3 RL05 was reported by Landerer & Swenson (2012). In addition, the missing data were estimated using linear interpolations. Finally, the GRACE Level-3 RL05 data on TWSA from January 2003 to June 2015 were obtained for the following analysis.
GLDAS data
GLDAS provides hourly and monthly state and flux variables with multiple spatial resolutions (e.g., 1° × 1°, 0.25° × 0.25°) (Rodell et al. 2004b). Except for the Noah model, which is with the spatial resolutions of 1° × 1° used in this study (http://mirador.gsfc.nasa.gov/), there are three further hydrological and land surface models (Mosaic, Community Land Model (CLM), and Variable Infiltration Capacity (VIC)) included in the GLDAS. The hydro-meteorological variables, including precipitation (P), evapotranspiration (E), surface runoff (QSr), subsurface runoff (QSb), SM, CW, and SWE from the Noah model, were used to calculate different terms of changes in storage based on water balance calculations (e.g., TWS=SM+CW+SWE; TWS=P−E; TWS=P−E−Qsr; TWS=P−E−Qsb) to compare the GRACE-based TWSA. In this study, the GLDAS product spans the period from 1948 to 2010. To allow comparability, the hydrological variable anomalies from GLDAS were processed in a similar way to the TWSA from GRACE. First, the anomalies of the hydrological variables were obtained from GLDAS by removing monthly averages of these variables during 2004–2009 from the datasets during the 1948–2010 datasets. In addition, the missing values in the south polar were replaced with 0. Then, the TWSA was expanded to spherical harmonic coefficients at the 60th degree and order, smoothed with a Gaussian filter of 300 km smooth radius, and summed the smoothed harmonic coefficients into area density data.
Sea surface temperature (El Niño 3.4 index)
The sea surface temperature (SST) in the eastern tropical Pacific, ranging from 5°N to 5°S and from 120°W to 170°W, is represented by the El Niño 3.4 index, which is an indication of ENSO activity (Kaplan et al. 1998). In this study, the monthly El Niño 3.4 index from 1950 to 2002 was obtained from the Climate Prediction Center (CPC) (http://www.cpc.ncep.noaa.gov). Although the ENSO phenomenon starts in the eastern tropical Pacific, it can affect both regional and global climates through teleconnections (Glantz et al. 1991; Liu et al. 2013). The warm condition of ENSO is called El Niño, while the cold condition is named La Niña.
METHODS
The Gaussian filter weight function
The general linear regression model




Random forest
RF regression model developed by Breiman (2001) is a non-parametric technique of improving the prediction performance, and it is as an extension of the classification and regression trees program. Moreover, it consists of a combination of many predictor trees, which are originated from the independent and similar distribution of the random vectors for all trees in the forest. Based on a subset of predictor variables chosen from all existing predictors randomly, the subdivisions within each tree are determined. The algorithms in detail can be obtained from the description of Breiman (2001).
Support vector machines
SVM can be applied to classification and regression analysis based on the learning algorithms (Vapnik 1998). They has shown many unique advantages in solving small sample, nonlinear, and high dimensional pattern recognition, and can be applied to the machine learning problems about function regression. In addition to performing linear classification, SVM can efficiently perform a nonlinear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. Ghorbani et al. (2017) have described the details of the SVM method.
Artificial neural networks
ANN are computing systems inspired by the biological neural networks that constitute animal brains. Such systems learn to do tasks by considering examples, generally without task-specific programming (McClelland & Rumelhart 1988). In general, an input layer, a hidden layer, and an output layer with Levenberg–Marquardt back propagation learning algorithm are included in the multi-layer feed-forward perceptron (MLP) (Ghorbani et al. 2013). The neurons are associated with the neurons in a subsequent layer by a weight during the training period. While the sigmoids are used in the hidden layer, the linear activation functions are used in the output layer, respectively (Ghorbani et al. 2013).
Performance criteria
Removal of seasonal variations
In this study, the advanced reconstruction models were applied to reconstruct TWSA after the determination of the optimal hydrological variables or combination based on the linear regression model at grid scale. Then, the optimal reconstruction method was analyzed and determined through the traditional assessment indexes and determined hydrological variables.
RESULTS AND DISCUSSION
Spatial and temporal distributions of TWSA during 2003–2015
Figure 2 shows the spatial and temporal features of the GRACE TWSA in NWC. The TWSA time series in NWC during January 2003–June 2015 exhibits a significant seasonal signal with −50 mm fluctuations (Figure 2(a)). The negative slope in Figure 2(a) indicates an insignificant decreasing trend at a rate of 1.6 mm/month. Overall, the TWSA time series can capture both drought years (e.g., 2008, 2009, 2014, and 2015) and wet years (e.g., 2005 and 2006) in the way of extreme values. Both the maximum and minimum EWH time series in the grids of TWSA exhibit the decreasing trend during 2003–2015. As for the reason, previous studies have done a lot of work. For instance, Chen et al. (2016b) revealed that the decreasing TWSA in Tienshan Mountain can be attributed to the increasing height of temperature level in the summer. Yang et al. (2015) reported that the subsurface water dominated the trend of TWS. In addition, the monthly mean TWSA is shown to have significant spatial variations across NWC (Figure 2(b)). For example, while a significant decreasing trend (P < 0.01) can be observed in the regions of the Tienshan Mountains, Kunlun Mountains, Qilian Mountains, and Helan Mountains, an increasing trend was evident in the Tarim River Basin (Figure 2(c) and 2(d)).
The spatial and temporal distribution of the TWSA of GRACE during 2003–2015. The error bars represent the mean square error of the monthly grids at spatial distribution.
The spatial and temporal distribution of the TWSA of GRACE during 2003–2015. The error bars represent the mean square error of the monthly grids at spatial distribution.
Wahr et al. (2006) stated that there were both measurement errors and leakage errors in the GRACE data. Additionally, it was difficult to assess the precision of the GRACE data, due to the lack of observation data on TWS in NWC. Fortunately, the measurement errors in the GRACE data could be successfully evaluated through the root mean square (RMS) of the resultant signal by removing the annual signal, semi-annual seasonal signal, 161-day tidal signal, and the linear trend from the GRACE data (Wahr et al. 2006). Chen et al. (2009) also proposed another method, whereby the changes in ocean mass at equivalent latitudes to the study area were zero or approximately zero, as the de-aliasing process for GRACE data was corrected for these changes, to calculate the measurement error in the GRACE data. In this study, we applied the method from Wahr et al. (2006) to evaluate the measurement error of the GARCE data for NWC. Based on Equation (5), the annual signal, semi-annual signal, 161-day signal, and the linear trend were removed from the GRACE data to obtain the RMS of the remaining signal. The assessed measurement error of the GRACE data, in terms of EWH in NWC during January 2003–June 2015, was 1.6 ± 0.76 mm/month.
The linear regression between the hydrologic variable anomalies and the GRACE-based TWSA
Based on Equation (2), the linear regression coefficient ‘A’ between anomalies of the hydrological variables from GLDAS and the GRACE-based TWSA in the NWC during 2003–2010 was obtained, as shown in Figure 3. To further investigate the linear relationship between the hydrologic variable anomalies and the GRACE-based TWSA during 2003–2010, the correlation coefficients were computed and the results are plotted in Figure 4. It was clear that the anomalies of P, E, Qsr, SM, and SM+SWE+CW exhibited more significant linear relationships with the GRACE-based TWSA than other hydrological anomalies. Meanwhile, the SM being the main constituent of the water resource results in higher correlations between the TWSA and SM/its combinations, which include SM.
The regression coefficient ‘A’ between the hydrologic variables anomalies from GLDAS and the TWSA from GRACE during 2003–2010. From (a) to (l), they represent the hydrological variables of P, E, CW, Runoff (QSr+QSb), SM, SWE, SM+SWE+CW, Qsr, Qsb, P−E, P–E−Qsr, and P–E−Qsb, respectively.
The regression coefficient ‘A’ between the hydrologic variables anomalies from GLDAS and the TWSA from GRACE during 2003–2010. From (a) to (l), they represent the hydrological variables of P, E, CW, Runoff (QSr+QSb), SM, SWE, SM+SWE+CW, Qsr, Qsb, P−E, P–E−Qsr, and P–E−Qsb, respectively.
The correlation coefficient between the hydrologic variables from GLDAS and the TWSA from GRACE during 2003–2010. From (a) to (l), they represent the hydrological variables of P, E, CW, Runoff (QSr+QSb), SM, SWE, SM+SWE+CW, Qsr, Qsb, P−E, P–E−Qsr, and P–E−Qsb, respectively.
The correlation coefficient between the hydrologic variables from GLDAS and the TWSA from GRACE during 2003–2010. From (a) to (l), they represent the hydrological variables of P, E, CW, Runoff (QSr+QSb), SM, SWE, SM+SWE+CW, Qsr, Qsb, P−E, P–E−Qsr, and P–E−Qsb, respectively.
Assessment of the reconstructed TWSA from GLDAS during 1948–2002
To assess the performances of the linear regression models, two criteria (NSE and MAE) were selected to evaluate the reconstructed TWSA from GLDAS during 1948–2002. As shown in Figures 5 and 6, the NSEs between the reconstructed anomalies of P, E, SM, SM+SWE+CW, P−E, P−E−Qsr, P−E−Qsb and the originals are closer to 1 than the others, meaning that these reconstructed anomalies were more reasonable than the other hydrological variable anomalies. The boxplots of the spatial distributions of the NSEs are also shown in Figure 7(a). Thus, the anomalies of SM, and SM+SWE+CW, could be regarded as the best reconstructed GLDAS-based TWSAs. However, the phases of fluctuations would affect the NSEs to some degree. Meanwhile, the MAEs between reconstructed anomalies of P, E, SM, SWE, SM+SWE+CW, P−E, P−E−Qsr, P−E−Qsb and the originals were closer to 0 than the others, suggesting that these reconstructed GLDAS-based anomalies were more suitable. Additionally, Figure 7(b) shows boxplots of the spatial distributions of the correlation coefficients. Notably, the anomalies of SM, and SM+SWE+CW, displayed more homogenous distributions and higher correlation coefficients than the other anomalies. It could be concluded that the anomalies of SM, and SM+SWE+CW, should be chosen as the reconstructed TWSAs based on the correlations. The boxplots of the ranges of the MAEs are shown in Figure 7(c). The anomalies of P, SWE, P−E, P−E−Qsr, and P−E−Qsb can be regarded as the five best reconstructed GLDAS-based TWSAs. However, to some degree, the level of the original GLDAS-based TWSAs would affect the MAEs between these and the reconstructed GLDAS-based TWSAs. Consequently, the hydrological variables and combinations are described in more detail in the following section.
The NSEs between the reconstructed hydrologic variables anomalies and the original hydrologic variables anomalies from GLDAS during 1948–2002.
The NSEs between the reconstructed hydrologic variables anomalies and the original hydrologic variables anomalies from GLDAS during 1948–2002.
The MAEs between the reconstructed hydrologic variables anomalies and the original hydrologic variables anomalies from GLDAS during 1948–2002.
The MAEs between the reconstructed hydrologic variables anomalies and the original hydrologic variables anomalies from GLDAS during 1948–2002.
The box plot regarding NSEs, R and MAEs between the reconstructed hydrologic variables anomalies and the original hydrologic variables anomalies from GLDAS during 1948–2002. TWS represents SM+SWE+CW. The boxplot represent the percentile plot. The crosses stand for the anomalies of the value.
The box plot regarding NSEs, R and MAEs between the reconstructed hydrologic variables anomalies and the original hydrologic variables anomalies from GLDAS during 1948–2002. TWS represents SM+SWE+CW. The boxplot represent the percentile plot. The crosses stand for the anomalies of the value.
Selection of the reconstructed TWSA during 1948–2002
Based on the NSE and MAE values, the reconstructed anomalies of SM+SWE+CW would be the better reconstructed TWSA time series during 1948–2002 (e.g., Figure 8). The fluctuations of the reconstructed time series exhibited significant seasonal fluctuations. Also, the reconstructed hydrological variable anomalies during 1948–2002 were close to the GRACE-based TWSAs during 2003–2015. While the R2 between SM and GRACE-based TWSA was 0.198 (P < 0.05) during 2003–2010, the R2 between SM+CW+SWE and GRACE-based TWSA was 0.198 (P < 0.05) during 2003–2010, as shown in Table 1. Therefore, these two series are considered better reconstructed than the other reconstructed TWSAs. Based on the previous detection, the combination of SM+SWE+CW was applied to reconstruct the TWSA in NWC through the advanced methods (i.e., RF, SVM, and ANN) (Figure 9), and it revealed that the RF was the most optimal reconstructed method.
The regression models between the hydrological variables from GLDAS and TWSA from GRACE during 2003–2010
Hydrological variable – GLDAS . | GRACE . | Equations . | R2 . |
---|---|---|---|
P | TWSA | y = 1.6905x + 0.0812 | 0.2345 |
P−E | y = 3.8675x + 0.1273 | 0.1749 | |
P–E−QSr | y = 3.8676x + 0.1276 | 0.1745 | |
P–E−QSb | y = 3.8799x + 0.1262 | 0.1751 | |
SW | y = 1.0095x + 0.1592 | 0.1984 | |
SW+CW+SWE | y = 1.1148x − 0.3296 | 0.1967 |
Hydrological variable – GLDAS . | GRACE . | Equations . | R2 . |
---|---|---|---|
P | TWSA | y = 1.6905x + 0.0812 | 0.2345 |
P−E | y = 3.8675x + 0.1273 | 0.1749 | |
P–E−QSr | y = 3.8676x + 0.1276 | 0.1745 | |
P–E−QSb | y = 3.8799x + 0.1262 | 0.1751 | |
SW | y = 1.0095x + 0.1592 | 0.1984 | |
SW+CW+SWE | y = 1.1148x − 0.3296 | 0.1967 |
The assessment of the reconstruction regarding TWSA based on GLDAS data (SM+SWE+CW) and advanced reconstruction methods.
The assessment of the reconstruction regarding TWSA based on GLDAS data (SM+SWE+CW) and advanced reconstruction methods.
Comparison between the best reconstructed TWSA from GLDAS and the SST during 1948–2002
As shown by previous studies, long-term hydrological processes at large spatial scales are usually associated with large-scale atmospheric circulations (Zeng et al. 2008). The climate and hydrological processes in NWC are also affected by many factors. First, the complex geomorphic and topographic characteristics directly affect the spatial patterns of moisture conditions in Qinghai Province (Liu et al. 2013). Another potential factor is atmospheric circulation, which could affect water vapor transport and hydrological cycles in the region. For instance, the Indian summer monsoon primarily affected the Tuotuo River (Liu et al. 2013). The western circulation at mid-latitude played a major role in the spring and winter precipitation in the Qaidam Basin (Qian & Qin 2008). The East Asian summer monsoon was associated with the moisture conditions in the Qilian Mountains (Wang 2001). The regional atmospheric circulations, such as the South China Sea subtropical high, West Pacific subtropical high, and North America subtropical high, were identified as the dominant drivers of precipitation in NWC (Li et al. 2016). Although, there are no other methods to validate the long-term reconstructed TWSA data, the variation in TWSA can be indirectly evaluated by SST (Zeng et al. 2008). Tang et al. (2014) assessed the droughts in Southwest China by comparing the TWS from GRACE with the SST anomaly. Nie et al. (2016) also reconstructed a TWSA series by combining hydrological variables from GLDAS, and validated its accuracy by comparing them with the Southern Oscillation Index (SOI). Thus, the results also indicated that there was a certain relationship between TWS and SST, and the SST was used to verify the reconstruction.
After obtaining the reconstructed TWSA time series, the seasonal signal was removed by using Equation (5). The non-seasonal reconstructed TWSA (e.g., SM, SM+CW+SWE) time series in NWC during 1948–2002 is shown in Figure 10(a) and the fluctuations of SST anomalies are shown in Figure 10(b). One could conclude that the significant increases of SST anomalies (≥0.5 °C), continuous for at least six months (El Niño events), were consistent with negative TWSAs; while the significant decreases continuous for at least six months (≤− 0.5 °C) (La Niña events) are in agreement with the positive TWSAs. This is in line with previous studies (Xavier et al. 2010; Becker et al. 2011).
The comparison between the best reconstructed TWSA from GLDAS and the surface temperature anomaly (SST) during 1948–2003: (a) represents the non-seasonal reconstructed TWSA; (b) indicates the SST anomaly.
The comparison between the best reconstructed TWSA from GLDAS and the surface temperature anomaly (SST) during 1948–2003: (a) represents the non-seasonal reconstructed TWSA; (b) indicates the SST anomaly.
Despite the uncertainties in the GRACE TWSAs, the good relationship between the reconstructed TWSAs and SOI was enough to demonstrate the utility of the data for researching the TWSA during 1948–2002. The reconstructed TWSA of hydrological variables from GLDAS provided the possibility of investigating the variations of TWSA in NWC over the past 55 years. While the reconstructed hydrological variables enhance our understanding of the TWSA, it could also help us to study the drought events in the large-scale basin. For instance, the depletion of TWSA in 1957, 1965, 1978, 1986, 1992, 1995, 1997, and 2009 was associated with drought events in NWC. The similarity between this study and that of Nie et al. (2016) was that the reconstructed TWSA was consistent with SST or SOI. On the contrary, the differences were that the reconstructed positive TWSA in this study was linked to La Niña events, while the reconstructed positive TWSA from Nie et al. (2016) was related to El Niño events. This implies that the impacts of atmospheric circulation at a large scale on different regions in the world were not the same. Common approaches to investigating droughts are usually based on the diversity of drought indexes, which depend on meteorological variables (e.g., rainfall, temperature, and runoff) (Bazrafshan et al. 2015; Vergni et al. 2015; Nie et al. 2016). Although the drought indexes can only monitor the variation of sectional water resources, the TWSA from GRACE and GLDAS indicates the variations in the whole water resource system (Nie et al. 2016). Even though the reconstructed TWSAs are associated with the errors discussed above, they offer a possible approach to monitoring the variations of TWSA in the large scale basin during the past decades.
CONCLUSIONS
TWS is an important hydrological variable that links to various hydrological and land surface processes. The TWSAs over Northwest China during 1948–2002 were reconstructed and the main conclusions of this study are as follows:
The TWSA from GRACE in NWC during 2003–2015 indicated a significant seasonal fluctuation with an annual amplitude of 50 mm for EWH, and exhibited a decreasing trend at a rate of 1.6 mm/month during January 2003–June 2015. In addition, a significant increasing trend was shown in the region of Southern Xinjiang Basin, and a significant decreasing trend was exhibited in the regions of the Tienshan Mountains, Kunlun Mountains, Qilian Mountains, and Helan Mountains.
Taking correlation coefficient, NSE, and MAE into account, SM and SM+SWE+CW can be applied to reconstruct TWSA in NWC well. Furthermore, random forest was the most optimal method to reconstruct TWSA in NWC based on the combination of SM+SWE+CW among the three advanced methods (i.e., RF, SVM, and ANN).
Despite the uncertainties of the GRACE data, significant positive TWSA was found during periods related to the periodicity of the ENSO. We concluded that the significant increases in SST anomalies of at least six continuous months (≥0.5 °C) (El Niño events) were associated with negative TWSAs, while the significant decreases of at least six continuous months (≤− 0.5 °C) (La Niña events) were associated with the positive TWSAs. In spite of the imperfect result shown in the study, it could solve the problem of the reconstruction of TWSA to some degree.
ACKNOWLEDGMENT
The research was supported by the National Basic Research Program of China (973 Program: 2012CB956204) and the National Natural Science Foundation of China (NSFC: 41571019).