ABSTRACT
Evapotranspiration (ET) is an important component of the hydrological cycle, and its accurate estimation is essential for water resource management and precision irrigation in agriculture. Direct measurement of ET is difficult and observed data are very limited. Therefore, this study explores the potential of a back-propagation (BP) neural network for simulating ET at eight flux sites in China. The input variables consist of land surface temperature, the water vapor pressure of the land surface, net radiation, and photosynthetically active radiation. Four input combination categories are defined: one, two, three, and four input variables. The results show the following: (1) Adding more input variables generally improves model performance, with a significant gain from one to two variables but only slight improvements from two to three. (2) While some stations show better performance with four variables, others perform best with three. (3) Reliable daily and monthly ET estimates are achieved across all stations, with summer estimates consistently outperforming winter ones. (4) The variability in the best input combination among the stations indicates that factors such as climate zone and land cover influence ET accuracy.
HIGHLIGHTS
In this study, all possible combinations of input variables are evaluated with the back-propagation (BP) model, which helps identify the true best input combination.
The best-performing BP model at each station is reliable, with high KGE and R values and low RMSE and BIAS values.
More input variables do not necessarily produce better performance, as shown by the results at the XSBN station.
INTRODUCTION
Land evapotranspiration (ET), including soil evaporation and vegetation transpiration, plays a critical role in the water cycle, carbon cycle, and energy cycle, influencing water availability, agriculture, and ecosystem health (Chao et al. 2021; Ahmadi et al. 2022; Pan et al. 2022; Zhang et al. 2023). Accurate measurement or estimation of ET is essential for water resource management, drought forecasting, and precision irrigation in agriculture, especially in arid and semi-arid regions (Jaafar et al. 2022; Theng Hue et al. 2022; Lee et al. 2023).
ET is usually measured with eddy covariance flux towers or the energy balance Bowen ratio method (Maes et al. 2019; Walls et al. 2020; Markos & Radoglou 2021). These methods provide ET data directly, but they require expensive equipment, considerable time, and professional technicians (Saboori et al. 2022; Ezenne et al. 2023). Moreover, the resulting ET data are relatively limited and cannot meet the growing demand of modern hydrology and agriculture. Traditionally, ET estimation has relied on empirical formulas and physical models, such as the Penman–Monteith equation (FAO-PM) (Allen et al. 1994). The FAO-PM model became a standard for reference evapotranspiration estimation due to its accuracy under various climate conditions around the world (Pereira et al. 2021; Chen et al. 2022). Combining FAO-PM reference evapotranspiration with an ET coefficient then yields ET. A number of other ET models that need only air temperature or solar radiation have also been proposed, such as the Makkink equation, the Hargreaves and Samani equation, and the Priestley and Taylor equation (Makkink 1957; Priestley & Taylor 1972; Hargreaves & Samani 1985). While these approaches are widely accepted, they often require extensive input data, such as maximum air temperature, minimum air temperature, wind speed, solar radiation, and relative humidity, and can be limited by assumptions about the underlying processes (Allen et al. 1994, 1998; Pereira et al. 2021). Studies have shown that traditional models may struggle to accurately capture the spatial and temporal variability of ET (Sun et al. 2024) and may not produce acceptable results in diverse climate zones (Lee et al. 2023). Thus, new approaches are urgently needed to address this problem.
Machine learning offers a promising alternative for ET estimation by utilizing algorithms that can learn from data without explicit programming. Zhang et al. (2018) simulated reference evapotranspiration in the Shule River Basin in northwest China using support vector machines (SVMs), adaptive network-based fuzzy inference systems (ANFIS), and a back-propagation (BP) neural network, and found that these models were able to accurately simulate reference evapotranspiration based only on land surface temperature. Fan et al. (2021) used SVM, extreme gradient boosting (XGBoost), single-layer artificial neural network (ANN), and deep neural network (DNN) models to simulate the transpiration of summer maize in northwest China; their results revealed that the DNN model was more effective for maize transpiration estimation. Zhang et al. (2022) proposed six machine learning models for daily reference evapotranspiration estimation with incomplete meteorological data in eastern Inner Mongolia, north China. Their results demonstrated that all six models could estimate reference evapotranspiration with high accuracy. Wu et al. (2023) developed random forest (RF), extreme learning machine (ELM), SVM, and ANN models for maize ET estimation in northwest China and found that machine learning models can provide satisfactory simulation of maize ET. Hybrid machine learning models have also been used to simulate ET. For example, a wavelet-artificial neural network showed good accuracy in estimating daily ET (Araghi et al. 2018).
Sensitivity analysis (SA) is crucial for understanding the influence of different input variables on model outputs. In the context of ET simulation, variable sensitivity analysis helps identify key climatic and environmental factors affecting ET rates. For instance, DeJonge et al. (2015) utilized variance-based global sensitivity analysis to assess the impact of temperature, humidity, wind speed, and solar radiation on ET, providing insights that can enhance model accuracy. The integration of ML and sensitivity analysis has also been explored. Zhao et al. (2022) used the XGBoost algorithm to analyze the contribution rate of meteorological factors to ET and obtain the input combination of a few factors with a large impact on ET. Their findings highlighted that sunshine duration, temperature, and solar radiation are the most influential factors, demonstrating the potential of this combined approach to improve ET modeling.
Hence, in this study, machine learning is applied to estimate evapotranspiration and to identify the sensitivity of input variables at eight flux stations. These eight flux stations are evenly distributed and cover the most prevalent climates in China. The aim of this study is threefold: (1) to identify the best input variable combination for modeling ET with BP in each of the four input combination categories; (2) to analyze the differences in BP model performance among the best input combinations of each category and to characterize how performance under the best input combination varies with the number of input variables; (3) to select the best input combination among all input variable combinations at the eight flux stations and compare its performance at daily, monthly, and seasonal scales. The input variables include land surface temperature, the water vapor pressure of the land surface, net radiation, and photosynthetically active radiation.
DATA AND METHODOLOGY
ChinaFLUX observations and station description
Attributes of the eight EC flux stations used in this study
Station (Abbreviation) | Latitude (°N) | Longitude (°E) | Köppen-Geiger Climate | IGBP landcover | Elevation (m) | TL (°C) | WVPL (kPa) | Rn (W/m²) | FAR (μmol/m² s) | Mean annual precipitation (mm) | Mean annual ET (mm) |
---|---|---|---|---|---|---|---|---|---|---|---|
Changbaishan (CBS) | 42.60 | 128.08 | Temperate Continental Monsoon Climate | MF | 738 | 3.6 | 0.8 | 76.6 | 310.5 | 695 | 500 |
Dinghushan (DHS) | 23.17 | 112.57 | Monsoon humid climate | EBF | 300 | 20.1 | 2.0 | 86.6 | 275.6 | 1,956 | 661 |
Dangxiong (DX) | 30.85 | 91.42 | The plateau monsoon climate | GRA | 4,250 | 2.7 | 0.4 | 63.0 | 466.0 | 477 | 550 |
Haibei (HB) | 37.62 | 101.29 | Continental climate | GRA | 3,216 | −1.3 | 0.4 | 90.3 | 366.0 | 560 | 576 |
Inner Mongolia (NMG) | 44.50 | 117.17 | Continental climate | GRA | 1,189 | 1.4 | 0.5 | 67.9 | 369.0 | 336 | 386 |
Qianyanzhou (QYZ) | 26.73 | 115.05 | Subtropical monsoon climate | MF | 102 | 17.8 | 1.8 | 88.2 | 273.8 | 1,542 | 704 |
Xishuangbanna (XSBN) | 21.95 | 101.2 | Tropical monsoon climate | EBF | 756 | 19.1 | 2.2 | 103.6 | 328.6 | 1,493 | 622 |
Yucheng (YC) | 36.95 | 116.6 | Warm and semi-humid continental monsoon climate | CRO | 28 | 13.2 | 1.3 | 63.3 | 265.6 | 582 | 683 |
Note: The location, climate, and vegetation information of stations can be found on the ChinaFLUX website. The mean annual flux data are calculated from the ChinaFLUX observations from 2002/2003 to 2010.
MF, mixed forests; EBF, evergreen broadleaf forests; GRA, grasslands; CRO, croplands.
BP neural network
In this study, we adopt a very common machine learning algorithm, the BP neural network, to train the ET model. The BP model is a typical multilayer feed-forward neural network and is widely used to approximate complicated nonlinear functions (Zhao et al. 2023a). As shown in Figure 2, the common structure of BP consists of an input layer, a hidden layer, and an output layer (Zhang et al. 2018). Information in the BP model is transmitted in the forward direction, while the error gradient is propagated backward to update the weights (Gong et al. 2016).
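The forward transmission and backward error propagation described above can be illustrated with a minimal NumPy sketch of a three-layer network. The hidden-layer size, tanh activation, learning rate, and the function names `train_bp` and `predict_bp` are illustrative assumptions, not the configuration used in this study, and the demo data are synthetic.

```python
import numpy as np

def train_bp(X, y, n_hidden=8, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer network with plain gradient descent (assumed setup)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, float)
    y = np.asarray(y, float).reshape(-1, 1)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward pass: input -> hidden (tanh) -> output (linear)
        h = np.tanh(X @ W1 + b1)
        err = (h @ W2 + b2) - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: propagate the error gradient from output to input
        g_out = 2.0 * err / len(X)
        g_hid = (g_out @ W2.T) * (1.0 - h ** 2)
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return (W1, b1, W2, b2), losses

def predict_bp(params, X):
    """Run the forward pass only."""
    W1, b1, W2, b2 = params
    return (np.tanh(np.asarray(X, float) @ W1 + b1) @ W2 + b2).ravel()

# Toy usage on synthetic data (not ChinaFLUX observations)
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 2))
y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1]
params, losses = train_bp(X_demo, y_demo)
print(f"training MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In practice the inputs would usually be standardized before training, and a held-out testing period would be used to judge performance.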
Identification of variable contribution
Model evaluation


RESULTS
Input combination and model implementation
In this study, we select several meteorological variables as input variables to BP, based on the correlation coefficient (r > 0.5) between the meteorological variables and ETEC. These meteorological variables consist of land surface temperature (TL), temperature over canopy (Tc), the water vapor pressure of land surface (WVPL), water vapor pressure over canopy (WVPc), solar radiation (Rs), net radiation (Rn), and photosynthetically active radiation (FAR). The values of the correlation coefficient (r) for these variables are displayed in Table 2. In Table 2, the r values between TL and Tc, between WVPL and WVPc, and between Rs and FAR are very close to 1, which means that the two variables in each pair carry nearly the same information. The uncertainty of r between FAR and ETEC is slightly smaller than that between Rs and ETEC; hence, FAR is chosen. For temperature and water vapor pressure, the near-surface data are selected. Therefore, four variables are finally adopted as input variables (TL, WVPL, Rn, and FAR).
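The screening logic above (keep variables with r > 0.5 against ETEC, then keep only one variable from each nearly identical pair) can be sketched as follows. This is a simplified illustration: the redundancy threshold of 0.95 and the function name `screen_inputs` are assumptions, and ties between redundant variables are broken here by correlation with the target, whereas the study chose between Rs and FAR by the cross-station uncertainty of r.

```python
import numpy as np

def screen_inputs(data, target, r_min=0.5, r_redundant=0.95):
    """data: dict of name -> 1-D array; target: 1-D array of observed ET."""
    r_target = {k: float(np.corrcoef(v, target)[0, 1]) for k, v in data.items()}
    # Step 1: keep variables whose correlation with the target exceeds r_min
    candidates = [k for k in data if r_target[k] > r_min]
    # Step 2: visit candidates from most to least correlated with the target
    # and skip any that nearly duplicates an already selected variable
    candidates.sort(key=lambda k: r_target[k], reverse=True)
    selected = []
    for name in candidates:
        redundant = any(
            abs(np.corrcoef(data[name], data[s])[0, 1]) > r_redundant
            for s in selected
        )
        if not redundant:
            selected.append(name)
    return selected, r_target

# Toy usage with synthetic series (hypothetical variable names)
rng = np.random.default_rng(0)
et = rng.normal(size=500)
a = et + 0.3 * rng.normal(size=500)    # informative variable
b = a + 0.01 * rng.normal(size=500)    # near-duplicate of a
c = rng.normal(size=500)               # unrelated variable
selected, r_target = screen_inputs({"a": a, "b": b, "c": c}, et)
print(selected)
```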
Mean value of correlation coefficient (r) between different variables at eight stations
Note: The black numbers are the mean value of r between two variables of the eight stations and the red numbers are the uncertainty of r, which is the difference between the maximum and minimum values of r at the eight stations.
Table 3 shows 15 input combinations for the BP model. All the input combinations are targeted at one response, which is the ETEC. These input combinations include four categories, i.e., one input variable (Category I), two input variables (Category II), three input variables (Category III), and four input variables (Category IV). In this study, we aim to explore the best input combinations for each category. Comparing these best input combinations, the best input combinations for different sites are then obtained.
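The 15 input combinations and their four categories follow directly from enumerating all non-empty subsets of the four variables. A minimal sketch using the abbreviations T, W, R, and F from Table 3 (the function name `input_combinations` is hypothetical):

```python
from itertools import combinations

# Abbreviation -> variable name, as used in Table 3
VARIABLES = {"T": "TL", "W": "WVPL", "R": "Rn", "F": "FAR"}

def input_combinations(abbrevs=("T", "W", "R", "F")):
    """Return {number of inputs: [combination labels]} for all non-empty subsets."""
    return {
        k: ["".join(c) for c in combinations(abbrevs, k)]
        for k in range(1, len(abbrevs) + 1)
    }

combos = input_combinations()
for k, labels in combos.items():
    print(f"{k} input variable(s): {labels}")
print("total:", sum(len(v) for v in combos.values()))
```

Each label can then be mapped back to its variables through `VARIABLES`, and one BP model trained per label and station.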
The input combinations based on selected meteorological variables
Input combinations | Input variables | Abbreviations |
---|---|---|
Category I | TL | T |
Category I | WVPL | W |
Category I | Rn | R |
Category I | FAR | F |
Category II | TL, WVPL | TW |
Category II | TL, Rn | TR |
Category II | TL, FAR | TF |
Category II | WVPL, Rn | WR |
Category II | WVPL, FAR | WF |
Category II | Rn, FAR | RF |
Category III | TL, WVPL, Rn | TWR |
Category III | TL, WVPL, FAR | TWF |
Category III | TL, Rn, FAR | TRF |
Category III | WVPL, Rn, FAR | WRF |
Category IV | TL, WVPL, Rn, FAR | TWRF |
Influences of model input combination on evapotranspiration estimation
Schematic diagram of the three-layer back-propagation neural network.
Table 4 shows the best input combinations in Categories I to IV and the maximum KGE values for these best input combinations. In Category I, the best input combination is R at all stations except CBS, where it is T. Apart from the XSBN station, all stations show acceptable performance, with KGE values of at least 0.68. In Category II, the best input combination is WF at the CBS, HB, QYZ, and XSBN stations and WR at the DX, NMG, and YC stations. At these seven stations, WVPL is therefore a relatively important variable for simulating ET. At the DHS station, however, the best input combination is TF, which excludes WVPL. As presented in Table 4, the performance of the best input combinations in Category II is greatly improved compared with that in Category I. Except for the XSBN station, KGE at the other seven stations ranges from 0.73 to 0.92. The performance at the XSBN station is also greatly improved: its KGE value increases from 0.55 in Category I to 0.70 in Category II. The best input combinations in Category III differ substantially among the eight stations, which may reflect differences in climate and vegetation. The improvement in model performance from Category II to Category III is not as significant as that from Category I to Category II. Comparing the performance in Category III and Category IV shows that, from the perspective of KGE, more input variables do not necessarily lead to better performance. This is elaborated on in the discussion section.
The best input combinations for the BP model in categories I to IV in the eight stations
Stations | Category I combination | Category I KGE | Category II combination | Category II KGE | Category III combination | Category III KGE | Category IV combination | Category IV KGE |
---|---|---|---|---|---|---|---|---|
CBS | T | 0.83 | WF | 0.92 | TWR | 0.92 | TWRF | 0.91 |
DHS | R | 0.71 | TF | 0.79 | TRF | 0.80 | TWRF | 0.79 |
DX | R | 0.84 | WR | 0.89 | TWR | 0.88 | TWRF | 0.90 |
HB | R | 0.80 | WF | 0.88 | WRF | 0.89 | TWRF | 0.86 |
NMG | R | 0.68 | WR | 0.73 | TWR | 0.74 | TWRF | 0.74 |
QYZ | R | 0.88 | WF | 0.92 | TWF | 0.92 | TWRF | 0.92 |
XSBN | R | 0.55 | WF | 0.70 | WRF | 0.70 | TWRF | 0.68 |
YC | R | 0.73 | WR | 0.78 | TWF | 0.79 | TWRF | 0.79 |
Note: The bold text represents the best input combination and its corresponding KGE value.
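For readers who want to reproduce scores of the kind reported in Table 4, the sketch below computes KGE, RMSE, BIAS, and R from observed and simulated series. The definitions are assumptions, since the evaluation equations are not reproduced in this section: KGE is taken as 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), with α the ratio of standard deviations and β the ratio of means, and BIAS is taken as the mean difference between simulation and observation.

```python
import numpy as np

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    return float(np.corrcoef(obs, sim)[0, 1])

def kge(obs, sim):
    """Kling-Gupta efficiency (assumed formulation; 1 is a perfect score)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = pearson_r(obs, sim)
    alpha = sim.std() / obs.std()    # variability ratio
    beta = sim.mean() / obs.mean()   # mean ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))

def rmse(obs, sim):
    """Root mean square error."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def bias(obs, sim):
    """Mean error, simulated minus observed (assumed definition)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.mean(sim - obs))

# Toy usage with made-up daily ET values (mm/day)
et_obs = np.array([1.0, 2.0, 3.0, 4.0, 2.5])
et_sim = np.array([1.2, 1.9, 2.7, 4.3, 2.4])
print(kge(et_obs, et_sim), rmse(et_obs, et_sim), bias(et_obs, et_sim))
```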
Performance of the BP model for evapotranspiration estimation at different stations
Performance of the BP model with model input combinations in the eight sites, where (a) to (d) correspond to Categories I–IV.
Scatter plots of EC-based ET (ETEC) and BP-simulated ET (ETBP) for the training period of eight stations, including (a) CBS, (b) DHS, (c) DX, (d) HB, (e) NMG, (f) QYZ, (g) XSBN, and (h) YC.
Scatter plots of EC-based ET (ETEC) and BP-simulated ET (ETBP) for the testing period of eight stations, including (a) CBS, (b) DHS, (c) DX, (d) HB, (e) NMG, (f) QYZ, (g) XSBN, and (h) YC.
Daily time series of EC-based ET (ETEC) and BP-simulated ET (ETBP) for the testing period of eight stations, including (a) CBS, (b) DHS, (c) DX, (d) HB, (e) NMG, (f) QYZ, (g) XSBN, and (h) YC.
Monthly time series of EC-based ET (ETEC) and BP-simulated ET (ETBP) for the testing period of eight stations, including (a) CBS, (b) DHS, (c) DX, (d) HB, (e) NMG, (f) QYZ, (g) XSBN, and (h) YC.
Seasonal performance
Contribution of variables in the best input combination
Table 5 shows the contribution of each variable in the best input combination. As shown in Supplementary material, Tables S1–S8, the contribution rates are related to the correlation between the input variables and ETEC. These correlation matrices can be used to represent the hydrological and climatic characteristics of the stations. Generally speaking, variables with smaller correlation coefficients with ETEC have relatively smaller contribution rates in the model. For example, among the variables, FAR at the CBS station has the smallest correlation with ETEC. Correspondingly, FAR is not included in the best input combination at the CBS station. However, the correlation and the contribution rate do not always correspond exactly. For example, at the HB station, the correlation coefficient between TL and ETEC is second only to that between Rn and ETEC, but TL does not appear in the best input combination. At the same time, TL has high correlation coefficients with both Rn and WVPL. We therefore infer that the correlation between ETEC and TL might be attributed to Rn and WVPL. Similar adjustments in contribution rates also occur at the other stations. Furthermore, the correlation matrices can explain the poor performance of the BP model at the NMG and XSBN stations: the correlation between the input variables and ETEC is relatively low there, indicating that more variables need to be introduced at these stations to improve the simulation performance.
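The procedure used to derive the contribution rates is not described in this section. Purely as an illustration of how percentage contributions that sum to 100% can be obtained from a trained single-hidden-layer network, the sketch below applies a Garson-style weight-partitioning scheme. This is an assumed, swapped-in technique and not necessarily the method behind Table 5; the function name and weight shapes are hypothetical.

```python
import numpy as np

def garson_contributions(W1, W2):
    """Garson-style relative importance of each input, in percent.

    W1: input-to-hidden weights, shape (n_inputs, n_hidden)
    W2: hidden-to-output weights, shape (n_hidden,) or (n_hidden, 1)
    """
    W1 = np.abs(np.asarray(W1, float))
    w2 = np.abs(np.asarray(W2, float)).reshape(-1)
    # Share of each input among the incoming weights of every hidden neuron
    share = W1 / W1.sum(axis=0, keepdims=True)
    # Weight each hidden neuron by the magnitude of its output connection
    importance = (share * w2).sum(axis=1)
    return 100.0 * importance / importance.sum()

# Toy usage with made-up weights for three inputs and two hidden neurons
W1_demo = np.array([[0.8, -0.2], [0.1, 0.5], [-0.3, 0.3]])
W2_demo = np.array([1.0, -0.5])
print(garson_contributions(W1_demo, W2_demo).round(2))
```

Because the scheme uses only weight magnitudes, it ignores the sign of each effect and can be sensitive to weight initialization, so averaging over several training runs is a common precaution.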
Contribution of each variable in the best input combination
Station | TL | WVPL | Rn | FAR |
---|---|---|---|---|
CBS | 47.79% | 38.45% | 13.77% | – |
DHS | 43.76% | – | 31.74% | 24.51% |
DX | 21.48% | 25.49% | 30.44% | 22.59% |
HB | – | 38.92% | 27.18% | 33.90% |
NMG | 24.40% | 26.45% | 28.52% | 20.63% |
QYZ | 19.85% | 29.24% | 21.41% | 29.50% |
XSBN | – | 30.78% | 31.50% | 37.72% |
YC | 27.64% | 13.57% | 38.76% | 20.03% |
DISCUSSION
Comparison of different models
Seasonal BIAS of EC-based ET (ETEC) and BP-simulated ET (ETBP) at four seasons for the testing period of eight stations.
Performance of different ET models for the training and testing period of eight stations, including (a) CBS, (b) DHS, (c) DX, (d) HB, (e) NMG, (f) QYZ, (g) XSBN, and (h) YC.
As is widely recognized, ET consists of vegetation transpiration and soil evaporation. Thus, the characteristics of soil and vegetation must have an impact on ET. As in most previous studies, the input variables used in this study are only meteorological variables. Fan et al. (2021) used four meteorological variables, soil water content (a soil-related variable), and leaf area index (a vegetation-related variable) as input variables; their results illustrated that incorporating soil water content and/or leaf area index in the machine learning models improved the accuracy of modeling maize transpiration. Wu et al. (2023) used four machine learning models to simulate daily maize evapotranspiration at different growth stages in semi-humid regions of China, and their results demonstrated that the best-performing models and input variable combinations differ considerably. This conclusion partly reveals that vegetation-related variables help increase the accuracy of modeling ET. FAR, which is considered in this study, is certainly related to vegetation. In future studies, soil- and vegetation-related variables are recommended for more accurate ET estimation. Additionally, multiple machine learning models were used in the above-mentioned studies, and their results showed that the superior model differed considerably among stations and growth stages. Thus, it is worth exploring the most suitable model and the best input variable combinations for different growth stages, seasons, and regions.
Physical causes of the best input combination at different stations
Table 4 shows that the best input combination in Category I is R (Rn) at all stations except CBS. It is also evident from Figure 3 that the performances of the models with Rn and FAR as input are very close at the CBS station. The reason for the best input combination in Category I at the eight stations is that temperature and radiation are the main driving factors of the ET amount (Priestley & Taylor 1972; Hargreaves & Samani 1985). The slightly poorer performances at the NMG and XSBN stations arise mainly because the correlation coefficients between the input variables and ET are relatively small (Supplementary material, Tables S5 and S7). In other words, one input variable cannot simulate ET very well with the BP model at the NMG and XSBN stations.
It is observed that, except for the DHS station, the best combinations of Category II all contain WVPL. This is mainly due to the relevance of meteorological variables to ET and the interaction between meteorological variables (Xu et al. 2014). As stated by Xu et al. (2014), the interactions between meteorological variables are considerable and cannot be ignored. At the DHS station, the correlation coefficient between ET and WVPL is low, only 0.31. As a result, ET at the DHS station is not sensitive to WVPL. At the DHS station, it is also observed that Rn does not appear in the best input combination of Category II, despite its strong performance in Category I. This can be explained by the high correlation between Rn and FAR (0.84 in Supplementary material, Table S2), which means they provide similar information to the model, leading to comparable results for combinations like Rn and FAR, as well as TF and RF. In Category II, the TF combination outperformed TR because TF carries more diverse and complementary information than TR: the correlation coefficient between TL and FAR (0.44 in Supplementary material, Table S2) is lower than that between TL and Rn (0.60 in Supplementary material, Table S2). The same phenomenon also occurred at the CBS station (Supplementary material, Table S1).
According to Table 1, these eight stations are evenly distributed and cover the most prevalent climates in China. Hence, their meteorological variables, soil and vegetation types, evapotranspiration ratios, etc. differ substantially. These results clearly demonstrate that the performance of BP for simulating ET is station-specific and depends on the climate zone, which confirms the conclusions of previous studies (Diop et al. 2015; Bodian et al. 2016; Tao et al. 2018).
Table 4 illustrates that more input variables do not necessarily lead to better performance, which is not consistent with the result of Tao et al. (2018). Tao et al. (2018) used a new hybrid ANFIS–FA model to simulate reference evapotranspiration based on various input combinations; their results revealed that the best accuracy was obtained for the sixth input combination, which contained all the available meteorological information. However, Tao et al. (2018) used only six input combinations, each obtained by adding one meteorological variable to the previous combination. This construction does not cover all possible input combinations and may therefore omit the best one. From the performances of all input combinations in this study, it can be concluded that more input meteorological variables generally, but not always, achieve better performance. This result indicates that more input variable information can enhance model performance. Hence, there is no doubt about the importance of maintaining and expanding data collection.
CONCLUSIONS
This study simulates ET at eight flux stations through the BP model under four categories of input variable combinations (one input variable, two input variables, three input variables, and four input variables), including 15 different input combinations (all possible combinations of four meteorological variables, i.e., TL, WVPL, Rn, and FAR). The key findings of this study are summarized below.
(1) Overall, more input variables can improve the performance of the BP model at eight stations. From one input variable to two input variables, BP performs much better. This phenomenon also occurs from two to three input variables, but the improvement is not so significant.
(2) At some stations, the BP model with four input variables performs slightly better than that with three input variables. However, more input variables do not necessarily produce better performance: at the other stations, the BP model with three input variables outperforms that with four.
(3) Daily and monthly ETBP with the best input variable combinations show good performances at eight stations. At the seasonal scale, ETBP in summer outperforms that in winter.
(4) The different best input combinations of the BP model at the eight stations reveal that the accuracy of ET modeling is subject to many factors, including climate zone, soil type, and vegetation type. Introducing other variables, such as wind speed, soil moisture content, and leaf area index, or using more complex models may enhance the performance of ET simulation.
ACKNOWLEDGEMENTS
We acknowledge the financial support from the National Natural Science Foundation of China (52209036; 51909233) and the Zhejiang Natural Science Foundation (LY22E090010). We thank ChinaFLUX (http://www.chinaflux.org) for providing the EC-based observations and meteorological observations from the eight flux stations used in this study.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories. The observation data of the eight flux stations in this study are available at the Chinese Terrestrial Ecosystem Flux Research Network (http://www.chinaflux.org/) and Chinese Ecosystem Research Network Data Center (http://www.nesdc.org.cn/theme/index?projectId=612458897e28172cbed3d77a).
CONFLICT OF INTEREST
The authors declare there is no conflict.