Abstract
Actual evapotranspiration (ET) is perhaps the most difficult quantity to directly measure among the major water balance components. Because of the high cost and labor constraints associated with the direct measurement of ET, empirical data-driven modeling has frequently been used to estimate ET. Beyond the widely used traditional type regression that has the effect of producing ‘global’ parameter estimates, assumed to be uniform throughout an area, we utilized a more localized spatially non-stationary technique – the geographically weighted regression (GWR) – to estimate mean monthly ET in the Passaic River Basin (PRB). We identified the key environmental controls of ET and developed new sets of spatially varying empirical ET models based on variable combinations that produced the best-fit model. The analysis showed that temporal and spatial variabilities in ET over the PRB are driven by climatic and biophysical factors. We found that the key controlling factors were different from month to month, with wind speed being dominant throughout the year in the study basin. A monthly mean ET index map was further generated from the model to illustrate areas where ET exceeds precipitation. This will among others enable water loss due to evapotranspiration to be accounted for in future water supply plans for the basin.
HIGHLIGHTS
New set of empirical monthly actual ET models was developed.
Use of spatially varying regression technique removes frequently overlooked stationarity assumption of ordinary least square models (OLS).
The GWR technique reveals any rapid changes in environmental variables over an area, necessitating further investigation.
Readily available data can be used to reasonably quantify monthly ET under mean climatic conditions.
Graphical Abstract
INTRODUCTION
Actual evapotranspiration (hereafter referred to as ET), involving soil–water evaporation (E) and vegetation transpiration (T), plays an integral role in the transfer of water and energy within the hydrologic cycle (Gowda et al. 2007). It accounts for more than 60% of precipitation input on a global scale (Ma & Szilagyi 2019), and about 50–70% in the United States (Sun et al. 2002; Brooks et al. 2012). Apart from being used as an index for climate change, location-specific data on ET has practical significance in water resources management planning and monitoring of hydrologic systems (e.g. storage changes) in river basins (Verstraeten et al. 2008; Senay et al. 2016). However, accurate quantification of this important variable at the watershed scale is often costly and time-consuming (Gasca-Tucker et al. 2007), with underlying uncertainties due to the complex array of ET processes (Li et al. 2014), and data limitation. Because it occurs in the gaseous state, unlike precipitation and streamflow, ET is also the most difficult quantity to directly measure among the major water balance components.
The literature is replete with different techniques for quantifying ET at various spatial and temporal scales. Generally, direct measurement of ET includes the lysimeter method, water balance approach, sapflow, eddy covariance (EC), Bowen Ratio methods, and stable isotope techniques (Lu et al. 2003; Williams et al. 2004; Liu et al. 2013; Sanford & Selnick 2013; Gebler et al. 2015). While it is acknowledged that each method exhibits some inherent limitations, the eddy covariance technique is by far the most accurate (Fang et al. 2016). It continuously measures site-level fluxes to generate data series at high temporal resolution. On the other hand, recent advances in optical remote sensing (RS) have also achieved sufficient accuracy in estimating ET and are widely used at large spatial scales (e.g. Zhang et al. 2016; Reitz et al. 2017; Chen & Liu 2020; Ma & Zhang 2022). In estimating ET, remote sensing models employ surface energy balance formulations to partition incident solar radiation into soil heat flux, sensible heat flux, and latent heat flux. At a relatively high spatial resolution (∼4 km) covering broader temporal record, ET has been estimated with the TerraClimate dataset (Abatzoglou et al. 2018). The water balance-based ET (WBET) product from TerraClimate has received wide acceptance in the remote sensing community due to its demonstrated skill in capturing hydroclimatic variables across different regions (e.g. Hu & Hu 2019; Salhi et al. 2019; Xu et al. 2019; Zhao & Gao 2019). It was developed by combining high spatial resolution climatological normals of temperature and precipitation from the WorldClim dataset with time-varying datasets from the Climate Research Unit data (CRU Ts4.0) and Japanese 55-year Reanalysis (JRA-55) (Abatzoglou et al. 2018). The performance of resulting RS – WBET model products are often evaluated using EC flux observations (e.g. Ruhoff et al. 2013; Fang et al. 2016). However, Kalma et al. (2008) after reviewing 30 published validation studies underscored that more sophisticated physical and analytical methods do not necessarily perform better in estimating ET than empirical and statistical techniques.
Because of the high cost and labor constraints associated with the direct measurement of ET (e.g. eddy covariance method), coupled with its limited coverage, empirical data-driven modeling has largely been used to estimate ET (e.g. Lu et al. 2003; Sanford & Selnick 2013; Valipour 2015; Fang et al. 2016). These studies cover multiple scales, use different data types, and address various research questions. Data from such studies are typically drawn from specific geographical units where a single regression is estimated for ET based on a relationship between one or more controlling factors and a dependent variable. For example, Valipour (2015) used a linear regression technique to estimate evapotranspiration from 11 temperature-based models in 31 provinces of Iran and revealed that the best model for estimating evapotranspiration performed well (R2 values > 0.99) in only 11 out of the 31 provinces. While this traditional type of regression has seen wide utility, it has the effect of producing ‘average’ or ‘global’ parameter estimates which are always assumed to be uniform over the study area – an implicit assumption that has frequently been overlooked. This assumption is inherently deficient particularly when applied to environmental variables such as ET which are spatially non-stationary over a large area coverage. Fotheringham et al. (2003) noted that relationships which are not stationary, when applied in a conventional regression model, create problems for the interpretation of estimated parameters. As such, a more localized approach (i.e. geographically weighted regression (GWR)), may be an ideal alternative. GWR belongs to the family of local statistics, comprised of multi-valued estimates as opposed to global statistics. As location changes, local statistics can take on different values. The strength of GWR lies in its ability to explain spatially varying relationships by essentially allowing model parameters to vary over space. Within a highly heterogeneous space, this type of regression will provide an opportunity to make significant progress toward understanding and predicting patterns in response variables on the basis of influential environmental factors.
It is therefore crucial that in order to accurately assess water resources availability in the PRB, the factors that control the amount and timing of ET in the basin be fully explored, based on which ET can be modeled and reliably quantified. This has become even more relevant at this time in light of the pressing need to predict future water stress and risks in the context of climate change (e.g. NJDEP 2017). Given the widely diverse and heterogeneous character of the basin, it will be necessary to identify and analyze the role that internal (e.g. biophysical) characteristics and external (e.g. climatic) conditions play in the spatial distribution of actual evapotranspiration within a spatiotemporal framework.
In that regard, the objectives of this study are to (1) use the classical ordinary least square (OLS) method to identify and determine the major internal (i.e. leaf area index (LAI), elevation) and external (i.e. mean temperature, precipitation, dew point, mean vapor pressure deficit (VPD), solar radiation, wind speed) controls on ET at monthly time scale; and (2) employ GWR to develop spatially varying mathematical models from readily available datasets that can be used to estimate monthly ET within the PRB.
MATERIALS AND METHODS
Study area
The non-tidal PRB is an oval-shaped area of about 2,135 km2 (824 square miles), of which about 84% is located in New Jersey and the rest in New York State (Figure 1). The surface elevation in the basin ranges from below sea level at 0.2 m (0.66 ft) to 454 m (1,490 ft). As previously mentioned, physiographically, the basin is divided into three main regions: the series of parallel ridges that trend northeast/southwest forming the Highlands; the Central Basin, comprised of large areas of swamps and meadows; and the roughly flat Lower Valley.
According to Paulson et al. (1991), New Jersey is located in a modified continental climate zone (i.e. with hot summer and cold winter). Five different climate zones have been identified: North, Central, Southwest, Pine Barrens, and Coastal zones, with PRB located within the North and Central zones. Temperatures in the North are about 10 °F colder than the coastal zone with above 90 °F commonly observed within the Central zone during warm seasons. Annual precipitation typically ranges from 1,016 to 1,321 mm (40–52 inches), with peak values observed along the coast during summer and winter (Paulson et al. 1991). On average, approximately 1,245 mm (49 inches) of precipitation occurs in the PRB annually (Newcomb et al. 2000). Of this, about 50% is lost to the atmosphere through evapotranspiration, another 5% becomes surface runoff, and the remaining 45% becomes available as recharge to groundwater aquifers (Mitchell 1992 as cited in NJ Watershed Basins, n.d.).
Land use/land cover patterns in the basin are dominated by forest type vegetation. According to the 2011 National Land Cover Dataset (NLCD; https://www.mrlc.gov/viewer/), approximately 43% of the area is forested; 36% developed; 14% Woody Wetlands; 3% open water; and the remaining 4% comprising other land cover types. Land use decisions in the basin continue to encourage flood events, with a direct and indirect consequence on water resources (i.e. in terms of flow, water quality, and water quantity). Driven by socioeconomic and biophysical factors, many communities in the lower urbanized section of the basin are close to being built out. Meanwhile, the Highlands region is currently undergoing suburban and rural development (NY Rapid Assessment Profile 2011). Coupled with historically unprecedented warming projected over New Jersey (State Climate Summaries, https://statesummaries.ncics.org/chapter/nj/), it is likely that climate and land use change will pose a serious threat to the water supply systems in the basin. As climate change continues and is expected to accelerate the hydrological cycle, water resource availability and ecosystem services will be directly affected through alteration in evapotranspiration (ET) processes, and indirectly through vegetation water use.
Environmental factors and data sources
Gridded datasets from Parameter-elevation Regressions on Independent Slopes Model (PRISM, Oregon State University, http://prism.oregonstate.edu), TerraClimate (Abatzoglou et al. 2018) (http://www.climatologylab.org/terraclimate.html), and Moderate Resolution Imaging Spectroradiometer (MODIS) LAI product (LAADS/DAAC, https://ladsweb.modaps.eosdis.nasa.gov/search/) provided the environmental variables necessary to develop the ET models. PRISM has received wide acceptance because of its ability to reasonably reproduce weather patterns over areas with complex topography such as the study region and serves as the official spatial climate datasets of the United States Department of Agriculture (USDA; Daly et al. 2008). As actual observations of ET are not available for the basin, the water balance-based TerraClimate ET was used as a proxy. According to Abatzoglou et al. (2018), the accuracy of TerraClimate has been proven by its strong validation with station-based observations from meteorological networks including the Global Historical Climate Network (GHCN), Snow Telemetry (SNOTEL), and Remote Automatic Weather Stations (RAWS). TerraClimate fields of annual reference evapotranspiration also tracked well with reference evapotranspiration from FLUXNET stations (Abatzoglou et al. 2018). In the assessment of water balance through different sources of precipitation and actual ET datasets, Neto et al. (2022) found that against the other three sources, TerraClimate emerged as the most sensitive to variations in the spatial distribution of precipitation and actual ET variables. This demonstrated skill of the TerraClimate product in capturing both point and spatial observations is also corroborated by Soleimani-Motlagh et al. (2022), Wiwoho & Astuti (2022), and Lemenkova (2022), and therefore gives credence to its use as a proxy in this study.
On the other hand, biophysical variables suggested to sufficiently explain variability in ET included precipitation (ppt, mm), air temperature (temp, °C), dew point (dewpt, °C), solar radiation (srad, MJ m−2), vapor pressure deficit (vpd, 100 Pa), wind speed (ws, m/s), leaf area index (LAI), and surface elevation (elev, m) (Lu et al. 2003; Sanford & Selnick 2013; Reitz et al. 2017). Table 1 provides details of the data source, resolution, and time periods of each environmental variable. Further processing of the datasets was carried out using the raster calculator in ArcGIS 10.2.2 to compute mean monthly values. After converting each variable file from raster to feature class, a spatial join was performed to obtain attributes of all the environmental variables in a single layer. Rather than resampling which tends to compromise data values, a one-to-one join operation based on intersect matching was used in the spatial join tool. This ensured that spatial relationship was still established between records of coarse versus finer resolution datasets. In the end, a single feature layer for each month containing attributes of all dependent and independent variables at 800 m grid was prepared for further analysis in R (R 2020).
Data . | Abbreviation . | Spatial resolution . | Time frame . | Source . |
---|---|---|---|---|
Dependent variable | ||||
Actual evapotranspiration | ET | 4 km | 1981–2010 | TerraClimate |
Independent variables | ||||
Elevation | ELEV | 800 m | 1981–2010 | PRISM |
Leaf area index | LAI | 1 km | 2000–2010 | MODIS |
Precipitation | PPT | 800 m | 1981–2010 | PRISM |
Minimum temperature | TEMPmin | 800 m | 1981–2010 | PRISM |
Maximum temperature | TEMPmax | 800 m | 1981–2010 | PRISM |
Dew point | DEWPT | 800 m | 1981–2010 | PRISM |
Maximum vapor pressure deficit | VPDmax | 800 m | 1981–2010 | PRISM |
Minimum vapor pressure deficit | VPDmin | 800 m | 1981–2010 | PRISM |
Solar radiation | SRAD | 4 km | 1981–2010 | TerraClimate |
Wind speed | WS | 4 km | 1981–2010 | TerraClimate |
Data . | Abbreviation . | Spatial resolution . | Time frame . | Source . |
---|---|---|---|---|
Dependent variable | ||||
Actual evapotranspiration | ET | 4 km | 1981–2010 | TerraClimate |
Independent variables | ||||
Elevation | ELEV | 800 m | 1981–2010 | PRISM |
Leaf area index | LAI | 1 km | 2000–2010 | MODIS |
Precipitation | PPT | 800 m | 1981–2010 | PRISM |
Minimum temperature | TEMPmin | 800 m | 1981–2010 | PRISM |
Maximum temperature | TEMPmax | 800 m | 1981–2010 | PRISM |
Dew point | DEWPT | 800 m | 1981–2010 | PRISM |
Maximum vapor pressure deficit | VPDmax | 800 m | 1981–2010 | PRISM |
Minimum vapor pressure deficit | VPDmin | 800 m | 1981–2010 | PRISM |
Solar radiation | SRAD | 4 km | 1981–2010 | TerraClimate |
Wind speed | WS | 4 km | 1981–2010 | TerraClimate |
Regression diagnostics and evaluation
Diagnosing a regression model involves using a class of techniques for detecting problems raised either by the model or the datasets that may compromise the predictive ability of the model. In arriving at an optimal model for predicting ET, environmental variables were assessed using: (1) Pearson correlation metrics (r), (2) statistical significance, (3) Bayesian Information Criterion (BIC), (4) Mallow's Cp, and (5) adjusted R-square (R2). The presence of multi-collinearity that reveals near-redundancy among explanatory variables was also explored. Although multi-collinearity was detected, the exclusion or otherwise of explanatory variables from the model was guided by our understanding of the key processes that influenced the response variable (i.e. ET) as well as their statistical significance. Thus, because all variables were significant, and in order not to compromise the predictive ability of the model, we refrained from removing any variables just on the bases of multi-collinearity.
Further diagnostic analysis such as the Best Subsets regression technique was performed to provide a reasonable subset of variables to include in the model. Unlike other selection techniques such as the Stepwise, the Best Subsets approach models every possible combination of variables that exist. For this study, the 10 independent variables underwent this technique, and the best significant model was selected according to a reasonable adjusted r-square, a lower BIC, and a minimum Cp. Note that underpinning our objectives was to identify the model with the minimum number of variables (i.e. not more than 5) that potentially provided an optimal model accuracy and whose variables are readily available from standard meteorological monitoring stations and regional remote sensing products.
Regression modeling approach
Evaluation of model performance
For the internal variable evaluation, three metrics were used to assess the performance of the geographically weighted ET model. They are:
The local coefficient of determination (local R2), which denotes how much the model explained the variability in the independent variables.
The prediction error (PE), which expresses the difference between Terraclimate ET and modeled geographically weighted ET.
The corrected Akaike Information Criterion (AICc), which estimates the quality of models relative to each other for a given set of data.
RESULTS AND DISCUSSION
Model comparison analysis
Year . | Precip . | Discharge . | Grace storage . | Q/P . | (Q + S) > P . | WB ET . | GW ET . |
---|---|---|---|---|---|---|---|
2003 | 1,645.21 | 697.50 | 7.07 | 0.42 | 704.57 | 940.64 | 757.66 |
2004 | 1,366.83 | 455.57 | −1.73 | 0.33 | 453.84 | 913.00 | 796.89 |
2005 | 1,343.87 | 531.78 | 0.79 | 0.40 | 532.58 | 811.30 | 780.92 |
2006 | 1,392.75 | 531.44 | 0.91 | 0.38 | 532.36 | 860.39 | 815.67 |
2007 | 1,383.73 | 432.44 | −2.56 | 0.31 | 429.88 | 953.85 | 798.63 |
2008 | 1,344.85 | 477.88 | 2.31 | 0.36 | 480.19 | 864.66 | 768.27 |
2009 | 1,281.54 | 403.08 | −0.28 | 0.31 | 402.80 | 878.74 | 809.98 |
2010 | 1,327.27 | 491.78 | −2.14 | 0.37 | 489.64 | 837.63 | 770.69 |
Year . | Precip . | Discharge . | Grace storage . | Q/P . | (Q + S) > P . | WB ET . | GW ET . |
---|---|---|---|---|---|---|---|
2003 | 1,645.21 | 697.50 | 7.07 | 0.42 | 704.57 | 940.64 | 757.66 |
2004 | 1,366.83 | 455.57 | −1.73 | 0.33 | 453.84 | 913.00 | 796.89 |
2005 | 1,343.87 | 531.78 | 0.79 | 0.40 | 532.58 | 811.30 | 780.92 |
2006 | 1,392.75 | 531.44 | 0.91 | 0.38 | 532.36 | 860.39 | 815.67 |
2007 | 1,383.73 | 432.44 | −2.56 | 0.31 | 429.88 | 953.85 | 798.63 |
2008 | 1,344.85 | 477.88 | 2.31 | 0.36 | 480.19 | 864.66 | 768.27 |
2009 | 1,281.54 | 403.08 | −0.28 | 0.31 | 402.80 | 878.74 | 809.98 |
2010 | 1,327.27 | 491.78 | −2.14 | 0.37 | 489.64 | 837.63 | 770.69 |
. | Jan . | Feb . | Mar . | Apr . | May . | Jun . | Jul . | Aug . | Sep . | Oct . | Nov . | Dec . |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Local coefficient of determination | ||||||||||||
Minimum | 0.374 | −0.156 | 0.215 | 0.074 | −0.122 | −0.339 | 0.089 | −0.169 | 0.275 | −2.372 | −1.590 | −0.351 |
1st quantile | 0.809 | 0.729 | 0.717 | 0.739 | 0.715 | 0.662 | 0.669 | 0.709 | 0.767 | 0.694 | 0.759 | 0.770 |
Median | 0.883 | 0.858 | 0.815 | 0.815 | 0.846 | 0.824 | 0.791 | 0.864 | 0.822 | 0.879 | 0.911 | 0.880 |
Mean | 0.867 | 0.820 | 0.793 | 0.801 | 0.808 | 0.778 | 0.776 | 0.813 | 0.810 | 0.805 | 0.851 | 0.846 |
Standard Deviation | 0.098 | 0.148 | 0.124 | 0.113 | 0.157 | 0.180 | 0.141 | 0.171 | 0.100 | 0.225 | 0.162 | 0.143 |
3rd quantile | 0.947 | 0.941 | 0.883 | 0.883 | 0.936 | 0.926 | 0.900 | 0.954 | 0.879 | 0.966 | 0.973 | 0.960 |
Maximum | 1.000 | 1.000 | 0.996 | 0.998 | 0.999 | 1.000 | 0.999 | 1.000 | 0.983 | 1.000 | 1.000 | 1.000 |
Prediction error | ||||||||||||
Minimum | −0.980 | −1.443 | −4.550 | −1.021 | −1.622 | −1.087 | −1.654 | −0.651 | −1.959 | −1.069 | −1.161 | −1.444 |
1st quantile | −0.031 | −0.050 | −0.235 | −0.090 | −0.043 | −0.028 | −0.058 | −0.023 | −0.094 | −0.024 | −0.031 | −0.042 |
Median | 0.000 | 0.000 | 0.001 | −0.001 | −0.001 | 0.000 | −0.001 | 0.000 | −0.002 | 0.000 | 0.000 | 0.000 |
Mean | −0.001 | 0.000 | 0.008 | 0.004 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | −0.001 | 0.001 | −0.002 |
Standard Deviation | 0.112 | 0.148 | 0.803 | 0.220 | 0.205 | 0.121 | 0.203 | 0.087 | 0.238 | 0.124 | 0.154 | 0.224 |
3rd quantile | 0.034 | 0.044 | 0.223 | 0.093 | 0.041 | 0.027 | 0.056 | 0.022 | 0.092 | 0.020 | 0.032 | 0.044 |
Maximum | 0.903 | 1.715 | 6.410 | 1.376 | 2.329 | 1.134 | 2.117 | 0.614 | 1.646 | 1.367 | 1.495 | 1.768 |
. | Jan . | Feb . | Mar . | Apr . | May . | Jun . | Jul . | Aug . | Sep . | Oct . | Nov . | Dec . |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Local coefficient of determination | ||||||||||||
Minimum | 0.374 | −0.156 | 0.215 | 0.074 | −0.122 | −0.339 | 0.089 | −0.169 | 0.275 | −2.372 | −1.590 | −0.351 |
1st quantile | 0.809 | 0.729 | 0.717 | 0.739 | 0.715 | 0.662 | 0.669 | 0.709 | 0.767 | 0.694 | 0.759 | 0.770 |
Median | 0.883 | 0.858 | 0.815 | 0.815 | 0.846 | 0.824 | 0.791 | 0.864 | 0.822 | 0.879 | 0.911 | 0.880 |
Mean | 0.867 | 0.820 | 0.793 | 0.801 | 0.808 | 0.778 | 0.776 | 0.813 | 0.810 | 0.805 | 0.851 | 0.846 |
Standard Deviation | 0.098 | 0.148 | 0.124 | 0.113 | 0.157 | 0.180 | 0.141 | 0.171 | 0.100 | 0.225 | 0.162 | 0.143 |
3rd quantile | 0.947 | 0.941 | 0.883 | 0.883 | 0.936 | 0.926 | 0.900 | 0.954 | 0.879 | 0.966 | 0.973 | 0.960 |
Maximum | 1.000 | 1.000 | 0.996 | 0.998 | 0.999 | 1.000 | 0.999 | 1.000 | 0.983 | 1.000 | 1.000 | 1.000 |
Prediction error | ||||||||||||
Minimum | −0.980 | −1.443 | −4.550 | −1.021 | −1.622 | −1.087 | −1.654 | −0.651 | −1.959 | −1.069 | −1.161 | −1.444 |
1st quantile | −0.031 | −0.050 | −0.235 | −0.090 | −0.043 | −0.028 | −0.058 | −0.023 | −0.094 | −0.024 | −0.031 | −0.042 |
Median | 0.000 | 0.000 | 0.001 | −0.001 | −0.001 | 0.000 | −0.001 | 0.000 | −0.002 | 0.000 | 0.000 | 0.000 |
Mean | −0.001 | 0.000 | 0.008 | 0.004 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | −0.001 | 0.001 | −0.002 |
Standard Deviation | 0.112 | 0.148 | 0.803 | 0.220 | 0.205 | 0.121 | 0.203 | 0.087 | 0.238 | 0.124 | 0.154 | 0.224 |
3rd quantile | 0.034 | 0.044 | 0.223 | 0.093 | 0.041 | 0.027 | 0.056 | 0.022 | 0.092 | 0.020 | 0.032 | 0.044 |
Maximum | 0.903 | 1.715 | 6.410 | 1.376 | 2.329 | 1.134 | 2.117 | 0.614 | 1.646 | 1.367 | 1.495 | 1.768 |
Table 4 shows how the geographically weighted ET model compares with the traditional OLS model in terms of their Akaike Information Criterion (AIC) (Akaike 1974). The AIC determines which model performs best in associating explanatory variables to the dependent variable. A model with smaller AIC value is considered better as it is more likely to minimize information loss in contrast to the ‘true’ model that generates the observation data (Burnham & Anderson 2003). The smaller AIC values obtained by the geographically weighted ET model in this study further corroborate the point that spatially non-stationary local models are an ideal alternative to the global OLS models when explaining spatially varying relationships over the large scale.
. | OLS AICc . | GW AICc . |
---|---|---|
Jan | 4,507 | − 898 |
Feb | 8,642 | 2,490 |
Mar | 14,829 | 10,490 |
Apr | 6,920 | 1,545 |
May | 7,982 | 1,495 |
Jun | 5,849 | − 1,217 |
Jul | 8,172 | 1,307 |
Aug | 3,504 | − 4,059 |
Sep | 7,629 | 2,082 |
Oct | 6,269 | − 1,212 |
Nov | 5,991 | − 38 |
Dec | 9,088 | 2,776 |
. | OLS AICc . | GW AICc . |
---|---|---|
Jan | 4,507 | − 898 |
Feb | 8,642 | 2,490 |
Mar | 14,829 | 10,490 |
Apr | 6,920 | 1,545 |
May | 7,982 | 1,495 |
Jun | 5,849 | − 1,217 |
Jul | 8,172 | 1,307 |
Aug | 3,504 | − 4,059 |
Sep | 7,629 | 2,082 |
Oct | 6,269 | − 1,212 |
Nov | 5,991 | − 38 |
Dec | 9,088 | 2,776 |
Potential sources of uncertainty associated with the modeled ET can be linked to the corresponding uncertainties inherent in the TerraClimate data, remote sensing LAI data from MODIS, and other meteorological data from PRISM. For instance, any existing noise or errors in the MODIS LAI product will be propagated in the ET mathematical model. Additionally, relatively coarse grid cells from the MODIS LAI (1 km) as well as TerraClimate data (4 km) were applied to an 800 m grid for the study basin to derive the ET model. At such smaller resolution, the MODIS LAI, for instance, may not sufficiently capture sub-grid scale vegetation signals in the basin, especially where the topography and land cover are highly complex and spatially heterogeneous, and thus introduce some errors in the model. Likewise, the coarse grid size of the radiative and aerodynamic variables obtained from the TerraClimate products may present some degree of errors in the derived ET model. In the course of our analysis, it was observed that the accuracy of ET estimates was largely dependent on the meteorological datasets. For the most part, these datasets, obtained from PRISM at a resolution of 800 m, guaranteed a reasonably accurate ET model.
Spatiotemporal patterns of ET in the PRB
The geographically weighted ET model was carried out over the diverse physiographical and biophysical terrain of the PRB from 1981 to 2010 at 800 × 800 m grid resolution. Based on the ET maps (Figures 2 and 3), there is evidence of sharp basin-wide variation and longitudinal gradients of ET, with lower values trending from northwest to higher values in the southeast. Generally, the ET patterns reflect the climatic conditions at the time and varies depending upon land use/land cover type, topography, and availability of water. Lower values are typically observed in the north climate zone where the temperatures are relatively low, with high topographic relief and largely forested vegetation cover. Mean annual ET value of about 748 mm/year is estimated in the upper mountainous region. In contrast, the high ET values are seen in the central climate zone where temperatures are intermediate to high, and in highly developed urban communities. Estimated mean annual ET is approximately 793 mm/year. Although some uncertainties are present and methodologies may differ, the magnitude and spatial patterns of the estimated ET rates over the PRB are consistent with other investigations in the vicinity (e.g. Sumner et al. 2012).
In contrast, low ET rates are observed in the largely forested upper northwest half because of the precipitation deficit. Although vegetation and water bodies typically dominate the upper half, the winter months show significantly low ET values as a result of relatively low mountainous temperatures and sparse deciduous vegetation. Following the trajectory of ET rates in Figure 7 (monthly time series), it can be seen that ET values in all the months are higher in the lower SE half than in the upper NW portion of the basin. The lowest mean ET values are observed in December (Lower SE: 5.65 mm; Upper NW: 2.53 mm) and the highest values in July (Lower SE: 118 mm; Upper NW: 115 mm).
Dominant controls on monthly ET
Pearson correlation coefficient values show that elevation correlates strongly with ET across all seasons (Table 5). This is followed by energy inputs (DEWPT > TMAX > TMIN > SRAD), aerodynamic input (i.e. WS), atmospheric demand (VPD_MAX > VPD_MIN), vegetation biomass (i.e. LAI), and water input (i.e. PPT) in that order. After elevation, the two most important drivers on monthly ET were DEWPT and TMIN in the winter and spring, DEWPT and TMAX in the summer, and SRAD and TMAX in the fall. Interestingly, the strength of the correlation of explanatory variables with ET is relatively weak in the summer (i.e. < 0.5) across the study basin, although these variables are, to a large extent, strongly correlated among themselves. It was also noted that water availability associated with PPT correlated negatively with ET in all seasons except summer.
. | AET . | ELEV . | DEWPT . | LAI . | PPT . | SRAD . | TMAX . | TMIN . | VPD_MIN . | VPD_MAX . | WS . | AET . | ELEV . | DEWPT . | LAI . | PPT . | SRAD . | TMAX . | TMIN . | VPD_MIN . | VPD_MAX . | WS . |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AET | 1 | Winter | 1 | Spring | ||||||||||||||||||
ELEV | − 0.89 | 1 | − 0.88 | 1 | ||||||||||||||||||
DEWPT | 0.89 | − 0.93 | 1 | 0.8 | − 0.87 | 1 | ||||||||||||||||
LAI | − 0.49 | 0.43 | − 0.42 | 1 | − 0.5 | 0.45 | − 0.41 | 1 | ||||||||||||||
PPT | − 0.37 | 0.47 | − 0.4 | 0.26 | 1 | − 0.24 | 0.33 | − 0.37 | 0.1 | 1 | ||||||||||||
SRAD | 0.86 | − 0.77 | 0.8 | − 0.39 | − 0.17 | 1 | 0.44 | − 0.48 | 0.55 | − 0.13 | − 0.15 | 1 | ||||||||||
TMAX | 0.86 | − 0.91 | 0.88 | − 0.38 | − 0.3 | 0.84 | 1 | 0.77 | − 0.86 | 0.84 | − 0.37 | − 0.17 | 0.64 | 1 | ||||||||
TMIN | 0.88 | − 0.91 | 0.96 | − 0.45 | − 0.48 | 0.7 | 0.79 | 1 | 0.82 | − 0.82 | 0.88 | − 0.45 | − 0.37 | 0.28 | 0.7 | 1 | ||||||
VPD_MIN | 0.62 | − 0.56 | 0.6 | − 0.37 | − 0.48 | 0.3 | 0.38 | 0.79 | 1 | 0.39 | − 0.31 | 0.27 | − 0.32 | − 0.18 | − 0.15 | 0.13 | 0.66 | 1 | ||||
VPD_MAX | 0.77 | − 0.81 | 0.73 | − 0.33 | − 0.23 | 0.78 | 0.97 | 0.64 | 0.23 | 1 | 0.63 | − 0.7 | 0.61 | − 0.29 | − 0.01 | 0.65 | 0.93 | 0.45 | 0.03 | 1 | ||
WS | 0.7 | − 0.51 | 0.51 | − 0.4 | − 0.35 | 0.43 | 0.41 | 0.65 | 0.76 | 0.32 | 1 | 0.69 | − 0.54 | 0.44 | − 0.42 | − 0.07 | 0.16 | 0.42 | 0.63 | 0.58 | 0.32 | 1 |
AET | 1 | Summer | 1 | Fall | ||||||||||||||||||
ELEV | − 0.42 | 1 | − 0.83 | 1 | ||||||||||||||||||
DEWPT | 0.4 | − 0.85 | 1 | 0.74 | − 0.86 | 1 | ||||||||||||||||
LAI | − 0.32 | 0.5 | − 0.46 | 1 | − 0.55 | 0.5 | − 0.5 | 1 | ||||||||||||||
PPT | 0.34 | 0.22 | − 0.2 | 0.11 | 1 | − 0.28 | 0.42 | − 0.51 | 0.28 | 1 | ||||||||||||
SRAD | 0.03 | − 0.27 | 0.31 | 0.02 | − 0.22 | 1 | 0.75 | − 0.74 | 0.69 | − 0.39 | − 0.36 | 1 | ||||||||||
TMAX | 0.39 | − 0.91 | 0.85 | − 0.44 | − 0.24 | 0.32 | 1 | 0.8 | − 0.86 | 0.75 | − 0.42 | − 0.31 | 0.83 | 1 | ||||||||
TMIN | 0.36 | − 0.8 | 0.86 | − 0.49 | − 0.12 | 0.01 | 0.78 | 1 | 0.66 | − 0.74 | 0.88 | − 0.46 | − 0.38 | 0.4 | 0.53 | 1 | ||||||
VPD_MIN | 0.13 | − 0.36 | 0.29 | − 0.33 | 0.02 | − 0.4 | 0.33 | 0.73 | 1 | 0.18 | − 0.13 | 0.24 | − 0.18 | 0 | − 0.24 | − 0.09 | 0.66 | 1 | ||||
VPD_MAX | 0.34 | − 0.8 | 0.62 | − 0.37 | − 0.23 | 0.33 | 0.94 | 0.57 | 0.23 | 1 | 0.59 | − 0.58 | 0.33 | − 0.22 | − 0.07 | 0.69 | 0.87 | 0.1 | − 0.33 | 1 | ||
WS | 0.37 | − 0.66 | 0.54 | − 0.49 | − 0.05 | − 0.05 | 0.64 | 0.75 | 0.66 | 0.56 | 1 | 0.78 | − 0.59 | 0.57 | − 0.46 | − 0.08 | 0.43 | 0.48 | 0.71 | 0.54 | 0.24 | 1 |
. | AET . | ELEV . | DEWPT . | LAI . | PPT . | SRAD . | TMAX . | TMIN . | VPD_MIN . | VPD_MAX . | WS . | AET . | ELEV . | DEWPT . | LAI . | PPT . | SRAD . | TMAX . | TMIN . | VPD_MIN . | VPD_MAX . | WS . |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AET | 1 | Winter | 1 | Spring | ||||||||||||||||||
ELEV | − 0.89 | 1 | − 0.88 | 1 | ||||||||||||||||||
DEWPT | 0.89 | − 0.93 | 1 | 0.8 | − 0.87 | 1 | ||||||||||||||||
LAI | − 0.49 | 0.43 | − 0.42 | 1 | − 0.5 | 0.45 | − 0.41 | 1 | ||||||||||||||
PPT | − 0.37 | 0.47 | − 0.4 | 0.26 | 1 | − 0.24 | 0.33 | − 0.37 | 0.1 | 1 | ||||||||||||
SRAD | 0.86 | − 0.77 | 0.8 | − 0.39 | − 0.17 | 1 | 0.44 | − 0.48 | 0.55 | − 0.13 | − 0.15 | 1 | ||||||||||
TMAX | 0.86 | − 0.91 | 0.88 | − 0.38 | − 0.3 | 0.84 | 1 | 0.77 | − 0.86 | 0.84 | − 0.37 | − 0.17 | 0.64 | 1 | ||||||||
TMIN | 0.88 | − 0.91 | 0.96 | − 0.45 | − 0.48 | 0.7 | 0.79 | 1 | 0.82 | − 0.82 | 0.88 | − 0.45 | − 0.37 | 0.28 | 0.7 | 1 | ||||||
VPD_MIN | 0.62 | − 0.56 | 0.6 | − 0.37 | − 0.48 | 0.3 | 0.38 | 0.79 | 1 | 0.39 | − 0.31 | 0.27 | − 0.32 | − 0.18 | − 0.15 | 0.13 | 0.66 | 1 | ||||
VPD_MAX | 0.77 | − 0.81 | 0.73 | − 0.33 | − 0.23 | 0.78 | 0.97 | 0.64 | 0.23 | 1 | 0.63 | − 0.7 | 0.61 | − 0.29 | − 0.01 | 0.65 | 0.93 | 0.45 | 0.03 | 1 | ||
WS | 0.7 | − 0.51 | 0.51 | − 0.4 | − 0.35 | 0.43 | 0.41 | 0.65 | 0.76 | 0.32 | 1 | 0.69 | − 0.54 | 0.44 | − 0.42 | − 0.07 | 0.16 | 0.42 | 0.63 | 0.58 | 0.32 | 1 |
AET | 1 | Summer | 1 | Fall | ||||||||||||||||||
ELEV | − 0.42 | 1 | − 0.83 | 1 | ||||||||||||||||||
DEWPT | 0.4 | − 0.85 | 1 | 0.74 | − 0.86 | 1 | ||||||||||||||||
LAI | − 0.32 | 0.5 | − 0.46 | 1 | − 0.55 | 0.5 | − 0.5 | 1 | ||||||||||||||
PPT | 0.34 | 0.22 | − 0.2 | 0.11 | 1 | − 0.28 | 0.42 | − 0.51 | 0.28 | 1 | ||||||||||||
SRAD | 0.03 | − 0.27 | 0.31 | 0.02 | − 0.22 | 1 | 0.75 | − 0.74 | 0.69 | − 0.39 | − 0.36 | 1 | ||||||||||
TMAX | 0.39 | − 0.91 | 0.85 | − 0.44 | − 0.24 | 0.32 | 1 | 0.8 | − 0.86 | 0.75 | − 0.42 | − 0.31 | 0.83 | 1 | ||||||||
TMIN | 0.36 | − 0.8 | 0.86 | − 0.49 | − 0.12 | 0.01 | 0.78 | 1 | 0.66 | − 0.74 | 0.88 | − 0.46 | − 0.38 | 0.4 | 0.53 | 1 | ||||||
VPD_MIN | 0.13 | − 0.36 | 0.29 | − 0.33 | 0.02 | − 0.4 | 0.33 | 0.73 | 1 | 0.18 | − 0.13 | 0.24 | − 0.18 | 0 | − 0.24 | − 0.09 | 0.66 | 1 | ||||
VPD_MAX | 0.34 | − 0.8 | 0.62 | − 0.37 | − 0.23 | 0.33 | 0.94 | 0.57 | 0.23 | 1 | 0.59 | − 0.58 | 0.33 | − 0.22 | − 0.07 | 0.69 | 0.87 | 0.1 | − 0.33 | 1 | ||
WS | 0.37 | − 0.66 | 0.54 | − 0.49 | − 0.05 | − 0.05 | 0.64 | 0.75 | 0.66 | 0.56 | 1 | 0.78 | − 0.59 | 0.57 | − 0.46 | − 0.08 | 0.43 | 0.48 | 0.71 | 0.54 | 0.24 | 1 |
The Best Subset regression technique afforded the opportunity to select at most five out of the 10 variable combinations that produced the best-fit ET model for each month. As represented in Table 4, the estimated coefficients from OLS regression revealed the most influential variables that control monthly ET. For example, four variables (WS, TMAX, LAI, and ELEV) explained 93% of the variability in ET in November. Wind speed (WS) shows a strong positive influence on ET followed by TMAX, LAI, and ELEV in that order. Indeed, WS appears to be the most dominant ET controlling factor throughout the year, appearing in 10 out of 12 months. Quite notably, details of the relatively weak correlation observed between environmental variables and ET in the summer are revealed clearly in the month of June. While it may appear that only 39% of the variability in ET has been explained by the five variables, it is worth pointing out that all the 10 variables included in the model could only explain approximately 41% of variability in ET for June. This counters what may be construed as the model's inability to predict ET in the summer. It clearly indicates that the model is quite robust and largely represents the key processes that strongly influence monthly ET in the study basin. More so, corresponding Mallow's Cp and BIC were relatively low compared with other model sets. Lower Cp and BIC are indicative of relatively precise and best models, respectively. Thus, the surprisingly large unexplained variance in the relationship for June may be attributed to some inherent data error propagated from the source data.
Geographically weighted monthly ET models
Based on the intercept and coefficient values, a mathematical expression of the different sets of key independent variables controlling monthly ET could be formulated from Table 6. However, such an expression will only result in a basin-wide average value of ET, concealing the complex nuances of the spatially varying environmental factors that influence monthly ET in the basin.
. | . | . | . | Apr . | May . | Jun . | . | . | . | . | . | . |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Variable . | Jan . | Feb . | Mar . | Estimated Coefficient . | Jul . | Aug . | Sep . | Oct . | Nov . | Dec . | ||
(Intercept) | −6.0217 *** | −14.1613 *** | 20.7849 *** | 59.2585 *** | 85.7725 *** | 92.6907 *** | 291.3125 *** | 49.7324 *** | 25.6541 *** | 43.5208 *** | 9.0569 *** | −10.7300 *** |
TMIN | 0.3401 *** | – | – | – | 0.8166 *** | – | 7.4594 *** | – | – | – | – | −1.6440 *** |
TMAX | −7.6322 *** | −9.9541 *** | – | 0.5113 *** | – | – | −15.0457 *** | 4.4419 *** | 1.9057 *** | – | 1.0366 *** | – |
WS | 3.4407 *** | 5.7222 *** | 8.9909 *** | 5.5493 *** | 4.3073 *** | – | – | 3.9956 *** | 7.8044 *** | 5.7030 *** | 6.7321 *** | 8.0780 *** |
LAI | – | −0.9655 *** | – | – | – | −0.0591 *** | – | −0.0297 *** | −0.1012 *** | −0.0996 *** | −0.2760 *** | – |
ELEV | – | – | – | −0.0120 *** | −0.0079 *** | −0.0058 *** | – | – | – | −0.0057 *** | −0.0054 *** | −0.0096 *** |
VPD_MAX | 16.4220 *** | 20.4178 *** | 6.2951 *** | – | – | 0.0872 *** | 6.3268 *** | −2.1246 *** | – | 0.3785 *** | – | 2.6220 *** |
VPD_MIN | – | – | – | – | −1.1314 *** | −0.6552 *** | −9.3819 *** | – | −2.2326 *** | −1.8201 *** | – | – |
DEWPT | 5.0595 *** | 6.5406 *** | 4.0319 *** | – | – | – | – | −2.5923 *** | – | – | – | 4.1380 *** |
PPT | – | – | – | – | – | 0.1202 *** | 0.1485 *** | – | −0.0816 *** | – | – | – |
Cp | 78 | 157 | 365 | 204 | 258 | 206 | 650 | 71 | 382 | 325 | 248 | 255 |
BIC | −8,885 | −8,254 | −6,747 | −7,315 | −5,465 | −1,598 | −7,343 | −3,581 | −7,153 | −5,966 | −8,687 | −8,464 |
R-squared | 0.93 | 0.92 | 0.87 | 0.89 | 0.81 | 0.39 | 0.89 | 0.67 | 0.89 | 0.84 | 0.93 | 0.93 |
. | . | . | . | Apr . | May . | Jun . | . | . | . | . | . | . |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Variable . | Jan . | Feb . | Mar . | Estimated Coefficient . | Jul . | Aug . | Sep . | Oct . | Nov . | Dec . | ||
(Intercept) | −6.0217 *** | −14.1613 *** | 20.7849 *** | 59.2585 *** | 85.7725 *** | 92.6907 *** | 291.3125 *** | 49.7324 *** | 25.6541 *** | 43.5208 *** | 9.0569 *** | −10.7300 *** |
TMIN | 0.3401 *** | – | – | – | 0.8166 *** | – | 7.4594 *** | – | – | – | – | −1.6440 *** |
TMAX | −7.6322 *** | −9.9541 *** | – | 0.5113 *** | – | – | −15.0457 *** | 4.4419 *** | 1.9057 *** | – | 1.0366 *** | – |
WS | 3.4407 *** | 5.7222 *** | 8.9909 *** | 5.5493 *** | 4.3073 *** | – | – | 3.9956 *** | 7.8044 *** | 5.7030 *** | 6.7321 *** | 8.0780 *** |
LAI | – | −0.9655 *** | – | – | – | −0.0591 *** | – | −0.0297 *** | −0.1012 *** | −0.0996 *** | −0.2760 *** | – |
ELEV | – | – | – | −0.0120 *** | −0.0079 *** | −0.0058 *** | – | – | – | −0.0057 *** | −0.0054 *** | −0.0096 *** |
VPD_MAX | 16.4220 *** | 20.4178 *** | 6.2951 *** | – | – | 0.0872 *** | 6.3268 *** | −2.1246 *** | – | 0.3785 *** | – | 2.6220 *** |
VPD_MIN | – | – | – | – | −1.1314 *** | −0.6552 *** | −9.3819 *** | – | −2.2326 *** | −1.8201 *** | – | – |
DEWPT | 5.0595 *** | 6.5406 *** | 4.0319 *** | – | – | – | – | −2.5923 *** | – | – | – | 4.1380 *** |
PPT | – | – | – | – | – | 0.1202 *** | 0.1485 *** | – | −0.0816 *** | – | – | – |
Cp | 78 | 157 | 365 | 204 | 258 | 206 | 650 | 71 | 382 | 325 | 248 | 255 |
BIC | −8,885 | −8,254 | −6,747 | −7,315 | −5,465 | −1,598 | −7,343 | −3,581 | −7,153 | −5,966 | −8,687 | −8,464 |
R-squared | 0.93 | 0.92 | 0.87 | 0.89 | 0.81 | 0.39 | 0.89 | 0.67 | 0.89 | 0.84 | 0.93 | 0.93 |
Significant codes (p-values): 0, ‘***’ 0.001, ‘**’ 0.01, ‘*’ 0.05, ‘.’ 0.1, ‘ ’ 1.
A new set of mathematical models derived for monthly ET according to the GW regression analysis is presented in Table 7. Rather than the single, fixed value applied uniformly over space by the OLS technique, the GW regression model provides a range of values of intercepts and coefficients that reflects the local variations in environmental variables as they relate to ET in space (Table 7). By this, ET has been spatially mapped to key environmental variables across the PRB for each month, depicting the range of values in both intercept and coefficient. This will provide a reasonable first approximation of evapotranspiration rates and their spatial distribution to aid in the quantification of ET for water resource planning and decision making to serve communities in the largely diverse PRB.
Month . | Geographically weighted model . |
---|---|
January | |
February | |
March | |
April | |
May | |
June | |
July | |
August | |
September | |
October | |
November | |
December |
Month . | Geographically weighted model . |
---|---|
January | |
February | |
March | |
April | |
May | |
June | |
July | |
August | |
September | |
October | |
November | |
December |
While the GWR technique can be more appropriate than the global regression approaches, it is not without concerns. There are concerns raised over issues such as kernel and bandwidth selection including those shared with conventional regression techniques. However, the key issue here is that the local variation in the relationship between significant predictor variables and ET would have gone unnoticed in the commonly used global, OLS method. The monthly ET model outputs generated by the GW regression analysis offer new insights into the key external and internal controls on ET. Within the PRB, these models can assist in effectively quantifying ET under typical climatic conditions and to a large extent, accurately estimate mean monthly ET at similar locations. Practical models such as this have been widely used to estimate ET given the huge cost involved in measuring ET on large temporal and spatial scales (Sun et al. 2011a, 2011b). Here, we have established a reasonably accurate non-stationary monthly ET model that uses readily available input data over the PRB.
CONCLUSION
The 30-year continuous ET and other gridded hydroclimatic and biophysical observations from PRISM/TerraClimate/MODIS provided the means to carefully examine and identify the key environmental variables controlling ET at a fine temporal (i.e. monthly) and spatial (i.e. 800 m grid resolution) scale. From this, the best subset variables for each month were used to develop 12 empirical geographically weighted ET models that will potentially estimate monthly ET over the PRB with reasonable accuracy under mean climate conditions. These variables are readily available from any standard meteorological monitoring stations and regional remote sensing products. Key conclusions from our analyses may be summarized as follows:
- (1)
Temporal and spatial variabilities in mean monthly ET over the PRB are significantly controlled by climatic (i.e. TEMP, WS, DEWPT, VPD, PPT) and biophysical (i.e. LAI, ELEV) drivers. The analysis revealed that key controlling factors may be different from month to month, with wind speed taking dominance throughout the year in the study basin. Precipitation, while appearing insignificant in the course of the year, appears to be a limiting factor in the summer months.
- (2)
The ability to successfully map ET to key environmental variables in such a complex terrain in this study demonstrates the superiority of the GWR over the OLS approach in modeling spatially varying environmental relationships. Given the usually unquestioned limitation of the OLS technique, one further refinement that could be included in studies concerned with spatial analyses should be the use of a more appropriate localized (i.e. geographically weighted) method when attempting to explain spatial relationships.
- (3)
Modeled spatially varying monthly ET developed from this study offer convenient and cost effective means to empirically estimate monthly water loss from similar ecosystems. The ET index map generated for the PRB illustrates areas where ET exceeds precipitation, especially in the summer months, and hence is useful for water resource planning and decision making by water managers in the basin. Moreover, reliable quantification of ET has been made possible in the study basin. As such, the amount of water loss due to evapotranspiration can be accounted for in future water supply plans for the basin.
It is hoped that this work, being the first of its kind in the study basin to the best of our knowledge, will form the foundation for future climate impact studies in the basin and the region at large.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.