Abstract
The estimation of reference evapotranspiration (ET0) is important in hydrology research, irrigation scheduling design and water resources management. This study explored the capability of eight machine learning models, i.e., Artificial Neural Network (ANN), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Extreme Gradient Boosting (XGBoost), Multivariate Adaptive Regression Spline (MARS), Support Vector Machine (SVM), Extreme Learning Machine (ELM) and a novel Kernel-based Nonlinear Extension of Arps Decline (KNEA) model, for modeling monthly mean daily ET0 using only temperature data from local or cross stations. These machine learning models were also compared with the temperature-based Hargreaves–Samani (HS) equation. The results indicated that the estimation accuracy of these machine learning models differed among scenarios. The tree-based models (RF, GBDT and XGBoost) exhibited higher estimation accuracy than the other models in the local application. When the station had only temperature data, the MARS and SVM models were slightly superior to the other models, while the ANN and HS models performed worse than the others. When there were no temperature data at the target station and data from adjacent stations were used instead, MARS, SVM and KNEA were the most suitable models. The results can provide a solution for ET0 estimation in the absence of complete meteorological data.
INTRODUCTION
Evapotranspiration (ET) is the combination of two separate water loss processes: evaporation from the soil and plant surfaces, and plant transpiration, by which water escapes from a plant's body to the ambient air as water vapor through its stomata (Ali Ghorbani et al. 2018; Moazenzadeh et al. 2018). ET is one of the important components of the hydrologic cycle. Reliable estimation of ET is the basis for developing precision irrigation systems and improving water use efficiency. Although ET can be measured using eddy covariance, Bowen ratio systems or lysimeters, these methods are expensive, time-consuming and require considerable expertise, which limits their use particularly in developing countries such as China. ET is therefore usually derived indirectly from the reference crop evapotranspiration (ET0) and a crop coefficient (Kc). According to the Food and Agriculture Organization of the United Nations (FAO) publication by Allen et al. (1998), ET0 represents the evapotranspiration from an actively growing virtual vegetated surface that is 0.12 m tall, completely shades the ground and has an adequate water supply; for a daily time step, the aerodynamic resistance is 208/u2 (where u2 is the wind speed at 2 m), the surface albedo is 0.23 and the bulk canopy resistance is 70 s m−1. The most widely accepted methodology for ET0 estimation is the FAO 56 Penman–Monteith (PMF 56) formula, which has become the standard against which other methods are tested (Fan et al. 2016).
The main drawback of the PMF 56 formula is that it needs a large amount of high-quality meteorological data, e.g. solar radiation or sunshine duration, air temperature, wind speed and relative humidity, whereas these data are often unavailable in developing countries. For instance, the costs of observing solar radiation are very high. There are more than 2,000 meteorological stations in China, but fewer than 130 of them record solar radiation (Fan et al. 2019a, 2019b). In addition, wind speed is affected by topography and land use, and it is difficult to obtain representative wind speed on a large scale. For this reason, temperature-based ET0 models are of great interest to researchers. Many temperature-based empirical models are available for ET0 estimation, e.g. the Thornthwaite model (Pereira & Pruitt 2004; Beguería et al. 2014), Hamon model (McCabe et al. 2015; Valipour 2015), Malmström model (Almorox et al. 2015; Quej et al. 2018), Hargreaves–Samani (HS) model (Luo et al. 2014; Pandey et al. 2014; Shiri et al. 2015; Xu et al. 2016; Cobaner et al. 2017; Morales-Salinas et al. 2017), Oudin model (Oudin et al. 2005; Zhao et al. 2013), Blaney–Criddle model (Heydari et al. 2015; Valipour 2015; Valipour et al. 2017), and Baier–Robertson model (Liu et al. 2016; Seiller & Anctil 2016; Martel et al. 2018).
Among these temperature-based models, the Hargreaves–Samani model has been widely used all over the world as a result of its simple structure and strong applicability. Hargreaves and Allen (2003) suggested that suitable ET0 estimates could be obtained by the HS model for periods of at least five days, since the daily value was easily influenced by wind speed and cloud cover. However, satisfactory results on a daily scale were also reported. Raziei and Pereira (2013) evaluated the performance of the HS and FAO-PM temperature (PMT) models for the estimation of ET0 at 40 weather stations in Iran. The results suggested that the HS and PMT models had similar estimation accuracy in modeling ET0 in various climatic zones of Iran. Almorox et al. (2015) assessed more than 10 temperature-based models for estimating ET0 at 4,362 stations worldwide. In that study, the HS model provided the best accuracy in many climates, e.g. arid, semiarid, temperate, cold and polar. On the other hand, the Thornthwaite and McCloud models gave the worst average estimates in all climates. Quej et al. (2018) evaluated seven temperature-based models for estimating ET0 in four cities of Mexico. They found that the HS model exhibited satisfactory accuracy (root mean square error (RMSE) = 0.74 mm d−1), which was slightly worse than that of the PMT model (RMSE = 0.70 mm d−1) in the Yucatán Peninsula. Although the HS model performs well worldwide, model parameter calibration is a crucial prerequisite for local applications (Samani 2000). Fourteen general parametric models were established based on geographical, temperature and wind speed information by Martí et al. (2015) in eastern Spain. Feng et al. (2017) calibrated the HS model based on the Bayesian method in the Sichuan basin of Southwest China. Calibration of the HS model parameters has also been reported for other regions (Gavilán et al. 2006; Ravazzani et al. 2011; Shahidian et al. 2013; Heydari & Heydari 2014; Almorox & Grieser 2016; Cobaner et al. 2017; Shiri 2017; Valiantzas 2017).
In recent years, a growing body of research has focused on the estimation and forecasting of natural phenomena (Yaseen et al. 2018, 2018c; Fan et al. 2018a; Ghorbani et al. 2018a, 2018b; Khosravi et al. 2018; Naganna et al. 2019; Xiao et al. 2019), including ET0 estimation by using machine learning models, e.g. Artificial Neural Network (ANN), Fuzzy Logic, Gene Expression Programming (GEP), Multivariate Adaptive Regression Splines (MARS), Decision Tree (DT), Random Forests (RFs), Support Vector Machine (SVM), Extreme Learning Machine (ELM) and Adaptive Neuro-fuzzy Inference System (ANFIS). Trajkovic (2005) compared the radial basis function neural network (RBFNN) model and three temperature-based empirical models (PMT, HS and Thornthwaite) for estimating ET0 at seven weather stations in Serbia. The results showed that the RBFNN model provided better ET0 estimates than the other models at most stations. Luo et al. (2015) evaluated four ANN models for ET0 prediction using forecasted temperature data. The results showed that the average values of RMSE ranged from 0.87 to 1.36 mm d−1, and the prediction accuracy of maximum temperature was lower than that of minimum temperature. Yassin et al. (2016) compared the ANN and GEP models for estimating ET0 in Saudi Arabia. The results indicated that the ANN model performed slightly better than the GEP model under the same input combination of meteorological data. Feng et al. (2016) compared three machine learning models and several empirical models for the estimation of ET0 in the humid region of Southwest China, and the ELM and GANN models were recommended. Similar work has also been done in Iran (Mehdizadeh et al. 2017), where the MARS and SVM models offered better ET0 estimates than the GEP and empirical models. Mattar (2018) developed a GEP model for estimating ET0 at 32 weather stations in Egypt and found that it had better estimation accuracy than the empirical models. Fan et al. (2018c) evaluated the M5 model tree (M5Tree), Gradient Boosting Decision Tree (GBDT), RF, XGBoost, SVM and ELM models for predicting daily ET0 in different climates of China. They found that the ELM and SVM models performed slightly better than the XGBoost model in terms of estimation accuracy, while the XGBoost model required much less computational time than the ELM and SVM models.
In addition, machine learning models can be coupled with preprocessing or parameter optimization algorithms, and the resulting hybrid models usually perform better than the traditional machine learning models (Feng et al. 2018; Yaseen et al. 2018a, 2018b; Wu et al. 2019). Tao et al. (2018a, 2018b) developed a coupled model based on the ANFIS model and firefly algorithm (FFA) for estimating ET0 in Burkina Faso. The new ANFIS–FFA model (R2 = 0.97, RMSE = 0.24 mm d−1 and mean absolute percent error (MAPE) = 0.035) was superior to the ANFIS model (R2 = 0.89, RMSE = 0.38 mm d−1 and MAPE = 0.037). Shiri (2018) introduced a new hybrid model based on the RF model and wavelet transform (WT) to estimate ET0 using air temperature and wind speed data. The results revealed that the new hybrid model improved the estimation accuracy of the RF model and was superior to the empirical models. The ANN and ELM models coupled with WT have also shown superiority to the ordinary ANN and ELM models (Kisi & Alizamir 2018).
After training and testing with a local dataset, the application of machine learning models to other regions with similar climatic conditions may still carry great uncertainty (Feng et al. 2019; Huang et al. 2019). To overcome this limitation, many scholars have tested the performance of machine learning models when using exogenous data (Martí et al. 2015; Landeras et al. 2018; Shiri 2019; Shiri et al. 2019). Martí and Gasque (2011) explored the use of a continentality index to evaluate a station's climate characteristics. The object station, which was selected based on this characteristic, was used to develop the ANN model and implement a cross-station strategy. In addition, another approach based on geographical inputs has been reported by Martí and Zarzo (2012). Karimi et al. (2017) evaluated the performance of the GEP and SVM models for ET0 estimation in the humid region of South Korea. In the first scenario, the models were developed and tested at each location, and the results showed that the machine learning models were superior to the empirical models. In the second scenario, ET0 was modeled using data from nearby stations and a generalized heuristic model was developed for the studied stations. They found that both the GEP and SVM models could fulfill these tasks, with the GEP model slightly outperforming the SVM model. Shiri et al. (2014) developed ANFIS models based on weather data from Spain and found that the model could successfully estimate ET0 in both the arid and humid regions of Iran. Kisi (2016) found that the estimation accuracy of the LSSVM, MARS and M5Tree models in ET0 modeling differed in various cross-station scenarios. The MARS model outperformed the other models when local input data were not available. However, the M5Tree model performed better than the others when both local input and output data were missing. Feng et al. (2017) applied the RF and GRNN models to estimate ET0 in both local and cross-station scenarios, and found that both models could estimate ET0 accurately in the Sichuan Province of China. Sanikhani et al. (2019) evaluated six temperature-based machine learning models (GRNN, RBFNN, ANFIS-GP, ANFIS-SC, GEP and MLP) and the HS model for the estimation of ET0 at two stations in Turkey. The results indicated that the machine learning models, except the MLP model, were superior to the HS model in the cross-station scenario.
Jiangxi Province is located in South China and experiences a subtropical monsoon climate. This region is a major producing area for double-cropping rice and citrus fruits in China. The seasonal distribution of precipitation varies markedly, resulting in frequent seasonal droughts in this region. In 2018, for instance, the region was hit by a severe drought, which affected more than 200,000 ha and more than 3 million people, and caused direct economic losses of 240 million US dollars. Therefore, the reliable estimation of ET0 is of crucial significance for the rational utilization of agricultural water resources in this region. To the best knowledge of the authors, comprehensive comparisons of various types of machine learning models for ET0 estimation remain limited, especially regarding their performance with limited temperature data in local and cross-station applications. The accuracy of machine learning models differs among regions. The most suitable model for Jiangxi Province has not been reported yet, and this is the first study to compare various types of models for the estimation of ET0 in this region. In addition, an improved kernel-based learning model, i.e., the Kernel-based Nonlinear Extension of Arps Decline (KNEA) model (Ma 2019), has been recently developed and successfully applied in many other fields (Ma & Liu 2018a). However, the KNEA model has not yet been tested in ET0 studies. Therefore, this study aims to evaluate and compare the performance of eight temperature-based machine learning models, i.e. the ANN, RF, GBDT, XGBoost, MARS, SVM, ELM and KNEA models, for: (1) locally estimating monthly mean daily ET0 at 15 stations in the Jiangxi Province of China using only temperature data, and comparing their performance with the empirical HS model; (2) evaluating the developed models for estimating monthly mean daily ET0 with data from four stations; and (3) evaluating the model performance for estimating monthly mean daily ET0 using a new synthetic dataset (local extraterrestrial radiation data and temperature data from other stations).
MATERIALS AND METHODS
Case study and data description
Jiangxi Province, covering an area of 1.67 × 10^5 km2, is a major producing area for double-cropping paddy in China and yields 20.4 billion kg y−1 of paddy rice. The study area has a subtropical humid climate with the mean annual rainfall ranging from 1,341 to 1,943 mm, which is largely influenced by the East Asian monsoon (Fan et al. 2018a). About 15 billion m3 y−1 of water resources are used for irrigation in this region. However, a water shortage of nearly 2 billion m3 y−1 exists as a result of unreasonable use of water resources and uneven distribution of seasonal rainfall. In this study, monthly maximum and minimum temperature data and extraterrestrial solar radiation from 15 meteorological stations in Jiangxi Province of China (Figure 1) were selected for testing the machine learning models and the empirical HS model in monthly ET0 modeling. The meteorological data were quality-checked and provided by the National Meteorological Information Center (NMIC) of the China Meteorological Administration. The extraterrestrial solar radiation (Ra) data were estimated on the basis of geographical, seasonal and solar information (Quej et al. 2017). It can be seen from Table 1 that there was no significant variation in the meteorological variables between the training and testing periods at any station. In addition, owing to its higher elevation, the temperature at Station 58506 was much lower than that at the other stations, whereas its average annual ET0 was only slightly lower. The values of the meteorological variables at the other stations (except Station 58506) are very similar, indicating that the air temperature and ET0 data in this area varied little. This makes it possible to develop general models for monthly ET0 estimation in the whole region.
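As an illustration of how Ra can be derived from latitude and day of year alone, the following minimal sketch uses the FAO 56 formulation (Allen et al. 1998). This is an assumption made for illustration; the exact procedure used for the study dataset is the one cited above. The function name and the example latitude and date are hypothetical.

```python
import math

GSC = 0.0820  # solar constant, MJ m-2 min-1 (FAO 56)

def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation Ra (MJ m-2 d-1), FAO 56 formulation."""
    phi = math.radians(lat_deg)                       # latitude in radians
    angle = 2.0 * math.pi * day_of_year / 365.0
    dr = 1.0 + 0.033 * math.cos(angle)                # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(angle - 1.39)            # solar declination (rad)
    ws = math.acos(-math.tan(phi) * math.tan(delta))  # sunset hour angle (rad)
    return (24.0 * 60.0 / math.pi) * GSC * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )

# Hypothetical example: 20 degrees south on day 246 of the year
ra = extraterrestrial_radiation(-20.0, 246)
print(round(ra, 1))
```

Because Ra depends only on location and date, it remains available at a site even when no meteorological observations exist there, which is what the cross-station scenario relies on.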
Machine learning models for estimating reference evapotranspiration
Gradient Boosting Decision Tree
The DT is one of the most widely used classification algorithms and can be represented as a set of if-else rules. A decision tree recursively partitions the feature space with hyperplanes: each split divides the current region into two parts, so that every leaf node corresponds to one region of the space. Once the tree has been learned, a new sample is passed down the tree according to its feature values until it reaches a leaf node, and the value associated with that leaf is returned as the result. There are many decision tree learning algorithms, among which the ID3 algorithm, C4.5 and the M5 model tree are the basic ones. The GBDT model is an ensemble algorithm built from many decision trees. A single decision tree is prone to over-fitting, and the GBDT model overcomes this problem by combining many weak decision trees. The GBDT model has many merits, such as the capability to identify nonlinear transformations, the capability to deal with categorical variables, computational robustness and high scalability. GBDT has been used in web search (Mohan et al. 2011), subway ridership (Ding et al. 2016), global solar radiation (Fan et al. 2018b), pan evaporation (Lu et al. 2018) and ET0 estimation (Fan et al. 2018a). More details can be found in Elith et al. (2008).
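The idea of combining many weak trees can be sketched as follows. This is a deliberately simplified illustration (one input feature, depth-one trees, squared-error loss), not the configuration used in this study; the data values and all function names are hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a depth-one regression tree (one split) by minimising squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda xi: lmean if xi <= threshold else rmean

def fit_gbdt(x, y, n_trees=50, learning_rate=0.1):
    """Gradient boosting for squared error: each tree fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    trees = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        pred = [pi + learning_rate * tree(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + learning_rate * sum(t(xi) for t in trees)

def rmse(obs, est):
    return (sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs)) ** 0.5

# Hypothetical monthly mean temperature (deg C) and ET0 (mm d-1)
temps = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
et0 = [1.0, 1.5, 2.2, 3.0, 3.9, 4.6]

model = fit_gbdt(temps, et0)
rmse_mean_only = rmse(et0, [sum(et0) / len(et0)] * len(et0))
rmse_boosted = rmse(et0, [model(t) for t in temps])
print(rmse_mean_only, rmse_boosted)
```

Each individual stump is a poor predictor, but the sum of many small corrections reduces the training error well below that of the constant baseline.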
Extreme Gradient Boosting
The XGBoost model was proposed by Chen & Guestrin (2016) as an improved implementation of Gradient Boosting Machines (GBMs) built on an ensemble of K Classification and Regression Trees (CART). The model originates from the idea of ‘boosting’, which integrates the predictions of a series of ‘weak’ learners to develop a ‘strong’ learner via an additive training process. The XGBoost model is designed to prevent over-fitting and to minimize computational time. This is achieved by simplifying the objective function so that the predictive and regularization terms can be combined, while maintaining computational efficiency. Parallel calculations are also automatically executed in the training stage. More information about the XGBoost model can be found in Chen & Guestrin (2016).
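The way the regularization term enters tree construction can be illustrated with the optimal leaf weight and split gain given by Chen & Guestrin (2016). The sketch below assumes the loss 0.5(y − ŷ)², for which the gradient is ŷ − y and the Hessian is 1; the data values and the settings lam (L2 penalty on leaf weights) and gamma (penalty per additional leaf) are hypothetical.

```python
def leaf_weight(grad_sum, hess_sum, lam):
    """Optimal leaf weight: w* = -G / (H + lambda)."""
    return -grad_sum / (hess_sum + lam)

def split_gain(g_left, h_left, g_right, h_right, lam, gamma):
    """Reduction in the regularised objective obtained by splitting a leaf."""
    def score(g, h):
        return g * g / (h + lam)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right)) - gamma

# Hypothetical targets with initial prediction 0, so gradient = 0 - y and Hessian = 1
y_left, y_right = [1.0, 1.2], [3.0, 3.4]
g_left, h_left = sum(0.0 - y for y in y_left), float(len(y_left))
g_right, h_right = sum(0.0 - y for y in y_right), float(len(y_right))

gain = split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0)
w_left = leaf_weight(g_left, h_left, lam=1.0)
w_right = leaf_weight(g_right, h_right, lam=1.0)
print(gain, w_left, w_right)
```

A larger lam shrinks the leaf weights towards zero and a larger gamma rejects splits with small gain, which is how the objective discourages over-fitting.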
Kernel-based Nonlinear Extension of Arps Decline Model
In addition, Artificial Neural Network (ANN), Support Vector Machine (SVM), RF, Multivariate Adaptive Regression Spline (MARS) and Extreme Learning Machine (ELM) were also used in this study, and the details of these models can be found in Friedman (1991), Breiman (2001), Huang et al. (2006) and Vapnik (2013).
FAO 56 Penman–Monteith
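For reference, the daily form of the FAO 56 Penman–Monteith equation given by Allen et al. (1998) can be sketched as below. The function and argument names are illustrative, the slope of the vapour pressure curve and the psychrometric constant are passed in directly rather than computed, and the input values are hypothetical.

```python
def penman_monteith_et0(rn, g, t_mean, u2, es, ea, delta, gamma):
    """FAO 56 Penman-Monteith reference evapotranspiration (mm d-1).

    rn, g  : net radiation and soil heat flux (MJ m-2 d-1)
    t_mean : mean air temperature at 2 m (deg C)
    u2     : wind speed at 2 m (m s-1)
    es, ea : saturation and actual vapour pressure (kPa)
    delta  : slope of the vapour pressure curve (kPa per deg C)
    gamma  : psychrometric constant (kPa per deg C)
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))

# Hypothetical inputs for a warm, humid month
et0_pm = penman_monteith_et0(rn=14.33, g=0.14, t_mean=30.2, u2=2.0,
                             es=4.42, ea=2.85, delta=0.246, gamma=0.0674)
print(round(et0_pm, 2))
```

The number of inputs required here (radiation, humidity and wind speed in addition to temperature) is the practical obstacle that motivates temperature-based alternatives.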
Hargreaves–Samani model
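The commonly cited form of the HS equation can be sketched as follows. The default coefficients (0.0023, 17.8 and 0.5) are the standard published values and are an assumption here, since locally calibrated coefficients may differ; the Ra value in the example is hypothetical, while the temperatures are the long-term means of Station 57598 in Table 1.

```python
def hargreaves_samani_et0(t_max, t_min, ra, c=0.0023, offset=17.8, exponent=0.5):
    """Hargreaves-Samani ET0 (mm d-1) from Tmax, Tmin (deg C) and Ra (MJ m-2 d-1)."""
    t_mean = (t_max + t_min) / 2.0
    ra_mm = 0.408 * ra  # convert MJ m-2 d-1 to equivalent evaporation in mm d-1
    return c * ra_mm * (t_mean + offset) * (t_max - t_min) ** exponent

# Tmax and Tmin from Table 1 (Station 57598); Ra = 30 MJ m-2 d-1 is hypothetical
et0_hs = hargreaves_samani_et0(t_max=23.4, t_min=13.1, ra=30.0)
print(round(et0_hs, 2))
```

Only the temperature range, the mean temperature and Ra enter the equation, so the same three inputs can be supplied to the machine learning models for a like-for-like comparison.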
Model scenarios
In the field of agricultural irrigation management, it is of great significance for decision-makers and planners to obtain information on ET0. In this study, eight machine learning models as well as the HS empirical model were developed as temperature-based general models for the estimation of monthly mean daily ET0 in the Poyang Lake Region of Jiangxi Province. The results of the models were compared with ET0 calculated by the standard PMF 56 equation. Firstly, a general model for estimating ET0 was established using data during 2001–2010 from 11 meteorological stations in the Poyang Lake Region (Figure 2). Secondly, the established model was tested in three cases: (1) comparing the eight machine learning models and the HS empirical model for the estimation of monthly mean daily ET0 at the 11 stations using data from 2011 to 2015; (2) investigating the same predictive models and comparing their performance with the HS model based on input and output data from the other four stations (ID: 57896, 58519, 58608 and 58715) in the same region; and (3) investigating the same predictive models and comparing their performance with the HS model based on temperature data from the four neighboring stations (ID: 577793, 58606, 58813 and 58527) of the four target stations (ID: 57896, 58519, 58608 and 58715) and extraterrestrial radiation data from the four target stations, respectively. The second and third cases will be useful for regions with only temperature data or with no local observations at all. The coefficients of the empirical model were obtained by least-squares fitting, while the parameters of the machine learning models were optimized by the grid search technique.
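The least-squares calibration step can be illustrated with a minimal sketch. It assumes that only the leading multiplicative coefficient of the HS equation is fitted, which is a simplification for illustration; the text does not state which coefficients were calibrated. All data values and function names are hypothetical.

```python
def hs_predictor(t_max, t_min, ra):
    """HS equation without its leading coefficient (Ra in MJ m-2 d-1)."""
    t_mean = (t_max + t_min) / 2.0
    return 0.408 * ra * (t_mean + 17.8) * (t_max - t_min) ** 0.5

def calibrate_hs_coefficient(records, et0_reference):
    """Closed-form least squares for ET0 = C * x: C = sum(x*y) / sum(x*x)."""
    xs = [hs_predictor(*rec) for rec in records]
    return sum(x * y for x, y in zip(xs, et0_reference)) / sum(x * x for x in xs)

# Hypothetical (Tmax, Tmin, Ra) records and reference ET0 generated with C = 0.0020
records = [(12.0, 4.0, 20.0), (22.0, 13.0, 32.0), (33.0, 25.0, 40.0)]
true_c = 0.0020
et0_reference = [true_c * hs_predictor(*rec) for rec in records]

c_hat = calibrate_hs_coefficient(records, et0_reference)
print(c_hat)
```

In practice the reference values would be the PMF 56 ET0 of the training period, and the fitted coefficient would then be applied unchanged to the testing period and to the other stations.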
Statistical indicators
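The four indicators reported in the results (RMSE, R2, NRMSE and MBE) can be computed as in the sketch below. The definitions are the conventional ones and are assumptions here: NRMSE is taken as RMSE divided by the mean of the reference values, MBE as the mean of estimate minus reference, and R2 as the squared Pearson correlation. The sample values are hypothetical.

```python
import math

def rmse(obs, est):
    """Root mean square error."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))

def nrmse(obs, est):
    """RMSE normalised by the mean of the reference values (assumed definition)."""
    return rmse(obs, est) / (sum(obs) / len(obs))

def mbe(obs, est):
    """Mean bias error; positive values indicate overestimation."""
    return sum(e - o for o, e in zip(obs, est)) / len(obs)

def r2(obs, est):
    """Squared Pearson correlation coefficient (assumed definition of R2)."""
    mo, me = sum(obs) / len(obs), sum(est) / len(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_e = sum((e - me) ** 2 for e in est)
    return cov * cov / (var_o * var_e)

# Hypothetical reference (PMF 56) and estimated ET0 values (mm d-1)
reference = [1.0, 2.0, 3.0, 4.0]
estimated = [1.1, 1.9, 3.2, 3.8]

print(rmse(reference, estimated), nrmse(reference, estimated),
      mbe(reference, estimated), r2(reference, estimated))
```

RMSE and NRMSE measure the spread of the errors, whereas MBE can be close to zero even for an inaccurate model if over- and underestimates cancel, so the indicators are best read together.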
RESULTS AND DISCUSSION
Case 1
The eight machine learning models and the HS model were compared for estimating monthly mean daily ET0 at the 11 stations in the Poyang Lake Region. The statistical summary for the training and testing periods is presented in Table 2. In general, the absolute MBE values were less than 0.05 mm d−1 during both periods, which means that none of the machine learning or empirical models systematically overestimated or underestimated ET0. The tree-based models (RF, GBDT and XGBoost) had the highest estimation accuracy during the testing stage, with RMSE values of 0.276, 0.281 and 0.269 mm d−1 and NRMSE values of 0.116, 0.119 and 0.113, respectively. The RMSE values of the kernel-based models (ELM, KNEA and SVM) and the MARS model were close to each other and were 2.5–9.3% higher than those of the tree-based models. The ANN model performed worst among the machine learning models during the testing period, with an RMSE 23% higher than that of the XGBoost model. However, the accuracy of the ANN model was still considerably higher than that of the HS model, which had RMSE and NRMSE values of 0.446 mm d−1 and 0.188 during testing, respectively. Even the worst model (HS) can still produce results that are suitable for estimating monthly mean daily ET0 in this region. Overall, high estimation accuracy can be obtained by the established models using only monthly mean daily maximum and minimum temperatures. This is because the global solar radiation in this area is closely related to daily maximum and minimum temperatures (Fan et al. 2018b). Also, the relative humidity is very high throughout the year and the influence of wind is not as pronounced as in arid areas. Thus, the information most closely related to ET0 can be described by temperature data alone. Similar results have also been reported for Southern China (Feng et al. 2017).
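The relative differences quoted above follow directly from the testing RMSE values in Table 2, as the short sketch below shows for the ANN and HS models relative to XGBoost. The helper name is hypothetical.

```python
# Testing-period RMSE values (mm d-1) taken from Table 2
rmse_test = {"XGBoost": 0.269, "ANN": 0.331, "HS": 0.446}

def relative_increase(value, baseline):
    """Percentage by which value exceeds baseline."""
    return 100.0 * (value - baseline) / baseline

ann_vs_xgb = relative_increase(rmse_test["ANN"], rmse_test["XGBoost"])
hs_vs_xgb = relative_increase(rmse_test["HS"], rmse_test["XGBoost"])
print(round(ann_vs_xgb), round(hs_vs_xgb))
```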
ID | Longitude (°) | Latitude (°) | Elevation (m) | n (h) | Tmax (°C) | Tmin (°C) | RH (%) | U2 (m s−1) | ET0 (mm d−1) |
---|---|---|---|---|---|---|---|---|---|
57598 | 114.35 | 29.02 | 146.8 | 4.8 (4.5) | 23.4 (23.1) | 13.1 (13.5) | 76.9 (80.4) | 1.1 (1.1) | 2.3 (2.2) |
57793 | 114.23 | 27.48 | 131.3 | 4.0 (3.9) | 22.8 (22.6) | 14.5 (14.8) | 79.1 (80.9) | 1.9 (1.8) | 2.3 (2.2) |
57799 | 114.55 | 27.03 | 71.2 | 4.3 (4.0) | 23.8 (23.5) | 15.8 (15.9) | 78.8 (78.5) | 1.6 (1.5) | 2.4 (2.3) |
57896 | 114.30 | 26.20 | 126.1 | 4.6 (4.4) | 24.3 (24.0) | 15.7 (15.9) | 75.1 (77.3) | 1.7 (1.7) | 2.6 (2.5) |
57993 | 115.00 | 25.52 | 137.5 | 4.8 (4.8) | 24.8 (24.8) | 16.4 (16.8) | 70.6 (74.5) | 1.4 (1.5) | 2.7 (2.7) |
58506 | 115.59 | 29.35 | 1164.5 | 4.7 (4.3) | 16.4 (16.4) | 9.6 (9.5) | 75.1 (78.2) | 3.6 (3.5) | 2.2 (2.0) |
58519 | 116.41 | 29.00 | 40.1 | 4.9 (4.8) | 22.7 (22.4) | 15.4 (15.4) | 73.0 (74.7) | 2.0 (1.9) | 2.6 (2.5) |
58527 | 117.12 | 29.18 | 61.5 | 4.8 (4.6) | 23.6 (23.3) | 14.7 (14.7) | 73.4 (74.2) | 1.3 (1.1) | 2.4 (2.3) |
58606 | 115.55 | 28.36 | 46.9 | 5.3 (4.9) | 22.6 (22.5) | 15.6 (15.7) | 71.8 (73.0) | 1.9 (1.8) | 2.7 (2.6) |
58608 | 115.33 | 28.04 | 30.4 | 4.6 (4.2) | 23.3 (23.1) | 15.6 (15.6) | 75.0 (73.4) | 1.2 (1.2) | 2.4 (2.4) |
58626 | 117.15 | 28.19 | 60.8 | 4.5 (4.3) | 23.9 (23.2) | 15.8 (15.3) | 74.4 (76.2) | 1.4 (2.1) | 2.5 (2.5) |
58634 | 118.15 | 28.41 | 116.3 | 4.7 (4.5) | 23.5 (23.1) | 14.6 (14.7) | 74.6 (75.6) | 2.0 (2.0) | 2.6 (2.5) |
58715 | 116.39 | 27.35 | 80.8 | 4.7 (4.5) | 23.4 (23.1) | 15.1 (15.4) | 78.2 (76.0) | 2.6 (2.5) | 2.6 (2.7) |
58813 | 116.20 | 26.51 | 143.8 | 4.5 (4.1) | 24.4 (24.1) | 15.2 (15.3) | 80.6 (79.2) | 1.3 (1.4) | 2.4 (2.3) |
59102 | 115.39 | 24.57 | 303.9 | 4.6 (4.4) | 25.0 (24.9) | 15.4 (15.8) | 77.3 (79.8) | 1.2 (1.0) | 2.4 (2.3) |
Model | Training RMSE (mm d−1) | Training R2 | Training NRMSE | Training MBE (mm d−1) | Testing RMSE (mm d−1) | Testing R2 | Testing NRMSE | Testing MBE (mm d−1) |
---|---|---|---|---|---|---|---|---|
ELM | 0.232 | 0.956 | 0.095 | 0.001 | 0.292 | 0.929 | 0.123 | −0.024 |
GBDT | 0.233 | 0.958 | 0.096 | 0.000 | 0.281 | 0.937 | 0.119 | −0.047 |
KNEA | 0.303 | 0.927 | 0.124 | 0.000 | 0.288 | 0.931 | 0.121 | −0.024 |
MARS | 0.308 | 0.924 | 0.126 | 0.000 | 0.295 | 0.929 | 0.124 | −0.044 |
ANN | 0.328 | 0.916 | 0.178 | 0.008 | 0.331 | 0.910 | 0.187 | −0.032 |
RF | 0.130 | 0.986 | 0.053 | 0.000 | 0.276 | 0.939 | 0.116 | −0.046 |
SVM | 0.308 | 0.925 | 0.126 | 0.033 | 0.294 | 0.929 | 0.124 | −0.010 |
XGBoost | 0.199 | 0.968 | 0.081 | 0.000 | 0.269 | 0.941 | 0.113 | −0.040 |
HS | 0.418 | 0.863 | 0.171 | 0.021 | 0.446 | 0.839 | 0.188 | −0.016 |
Note: The best statistical indicators among the models are marked in bold.
Figures 3 and 4 display the scatter plots of the PMF 56 ET0 against the values estimated by the machine learning models and the HS empirical model during the training and testing periods, respectively. The figures show that all nine models passed the significance test (P < 0.0001). However, the scatter plots of the different models showed different distributions. The RF model (R2 = 0.987) gave the least scattered points during the training period. The scatter distributions of the ELM (R2 = 0.987), GBDT (R2 = 0.987) and XGBoost (R2 = 0.987) models were very close to each other during training. The KNEA, MARS, SVM and ANN models had more scattered distributions during training, and these were close to each other. The HS model showed a serious underestimation when the PMF 56 ET0 exceeded 5 mm d−1 during the training period. During testing, the ANN model produced more scattered estimates than the other machine learning models, while the other seven machine learning models had similar distributions of scatter points. For the HS model, the scatter distribution was similar during testing and training. This was not because the extreme values of the data differed between the two periods, but because the model itself did not capture the useful information contained in temperature. In other words, diurnal temperature range and average temperature were not enough to describe the complex nonlinear relationship between temperature and ET0.
To evaluate how evenly the machine learning models and the empirical model performed across stations, the radar chart of RMSE at the 11 stations is presented in Figure 5. It can be clearly seen that the RMSE of the HS model was higher than that of the machine learning models at each site. The main reason for the low accuracy of the ANN model was that its errors at the stations in the south and west were larger than those of the other machine learning models. The accuracy of the SVM model was affected by large errors at three stations (ID: 57993, 58506 and 58527) in the north. The RF model ranked first at Stations 59102 and 58813, but exhibited moderate performance at the other stations. The GBDT and XGBoost models were very stable across stations and ranked in the middle. These results show that different datasets affected the models differently because the models are constructed on different principles. However, owing to the natural classification ability of tree-based models, different datasets can be converted into different decision trees, so the estimation accuracy of these models was higher.
Case 2
The models established in the previous section can be applied in areas where only temperature observations are available. The application potential of the different models in this case was further assessed using four stations with independent datasets (ID: 57896, 58519, 58608 and 58715) to evaluate the performance of the nine models. To show the ranking of the statistical results clearly, the top three models were highlighted in red, green and blue. The mean values of the statistical results of the four stations are also listed in Table 3. As seen from the table, the models behaved differently at different stations. Taking Station 57896 as an example, the models ranked as follows: ELM > SVM > XGBoost > MARS > KNEA > GBDT > RF > ANN > HS. However, the SVM, XGBoost, MARS and KNEA models were close to each other. The ANN and HS models were worse than the other models, with RMSE values 10.5–37.1% and 35.4–67.9% higher, respectively, than those of the other models at Station 57896. In terms of the average over the four stations, the MARS and SVM models were slightly superior to the other models, while the ANN and HS models performed worse than the others. Since the variations of temperature and ET0 among stations are small, it is feasible to develop general models for the estimation of ET0 in this region.
Note: the top three ranked models were highlighted in red, green and blue, respectively. Please refer to the online version of this paper to see this table in colour: https://dx.doi.org/10.2166/nh.2019.060.
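The RMSE comparison and the percentage increases quoted above can be reproduced with a few lines of code. The following is a minimal sketch, not the authors' implementation: the function names and the numerical values are hypothetical and serve only to illustrate how models may be ranked by RMSE and how the relative increase in RMSE of one model over another may be computed.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(estimated):
        raise ValueError("sequences must have the same length")
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_increase_percent(rmse_model, rmse_reference):
    """Percentage by which rmse_model exceeds rmse_reference."""
    return 100.0 * (rmse_model - rmse_reference) / rmse_reference


def rank_models(rmse_by_model):
    """Model names sorted from lowest (best) to highest RMSE."""
    return sorted(rmse_by_model, key=rmse_by_model.get)


# Hypothetical values for illustration only (mm/d); NOT taken from Table 3.
pm_et0 = [1.2, 2.5, 3.8, 4.6]
estimates = {
    "model_a": [1.3, 2.4, 3.9, 4.5],
    "model_b": [1.0, 2.9, 3.3, 5.1],
}

scores = {name: rmse(pm_et0, est) for name, est in estimates.items()}
ranking = rank_models(scores)
increase = rmse_increase_percent(scores["model_b"], scores["model_a"])
```

Here `ranking` lists the models from best to worst, and `increase` expresses how much larger the RMSE of the second model is relative to the first, which is the form of comparison used for the ANN and HS models at Station 57896.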
Figure 6 presents the scatter plots of the PMF 56 ET0 and those estimated by the same predictive machine learning models and the HS empirical model at the four stations. It is clear from the figure that all nine models passed the significance test (P < 0.0001). All the machine learning models except the ANN model displayed a relatively small scatter. The HS model underestimated monthly mean daily ET0 to some extent when ET0 < 1.5 mm d−1 or >5 mm d−1.
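For reference, the temperature-based HS estimate discussed above can be sketched as follows. This is a minimal sketch assuming the standard Hargreaves–Samani form and coefficients given in FAO-56 (0.0023 and 17.8, with 0.408 converting radiation from MJ m−2 d−1 to equivalent evaporation in mm d−1); the formulation and any calibration actually used in this study are those defined in its methods section and may differ. The function name and the input values are hypothetical.

```python
import math


def hargreaves_samani_et0(t_max, t_min, ra_mj):
    """Reference evapotranspiration (mm/d) from the standard HS equation.

    t_max, t_min: maximum and minimum air temperature (deg C)
    ra_mj: extraterrestrial radiation (MJ m-2 d-1)
    """
    if t_max < t_min:
        raise ValueError("t_max must not be smaller than t_min")
    t_mean = (t_max + t_min) / 2.0
    ra_mm = 0.408 * ra_mj  # convert MJ m-2 d-1 to mm/d of evaporation
    return 0.0023 * ra_mm * (t_mean + 17.8) * math.sqrt(t_max - t_min)


# Hypothetical inputs for illustration only.
et0_example = hargreaves_samani_et0(t_max=30.0, t_min=20.0, ra_mj=40.0)
```

Because the equation depends only on the temperature range, the mean temperature and extraterrestrial radiation, its fixed coefficients cannot adapt to local conditions, which is one plausible reason for the systematic deviations at low and high ET0 noted above.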
Case 3
When a site lacks basic temperature observations, these can be replaced with temperature data from other stations, which is commonly referred to as ‘cross-station application’. In this section, it was supposed that four stations had no maximum and minimum temperature data, but only calculated extraterrestrial radiation data. The temperature data from the nearest station were used to replace the missing temperature data for each of these stations. In short, temperature data from stations 57896, 58519, 58813 and 58715 were replaced with those from stations 57993, 58527, 58606 and 58608, respectively. The statistical results are shown in Table 4. The GBDT model performed best at Station 57896, with RMSE 11.8–65.8% lower than the other models. The KNEA, ELM, MARS, SVM and XGBoost models were close to each other, while the RF, ANN and HS models were clearly not as good as these five models. However, the MARS, KNEA and SVM models ranked as the top three models at Stations 58519, 58608 and 58715. On this basis, the MARS, SVM and KNEA models were superior overall to the other models across the four stations. Figure 7 presents the scatter plots of the PMF 56 ET0 and those estimated by the same predictive machine learning models and the HS empirical model at the four stations in the cross-station applications. It is clear from the figure that all nine models passed the significance test (P < 0.0001). The scatter distribution of each model did not differ noticeably from that in the previous section. This indicates that it is feasible to use the adjacent meteorological data when local data are missing.
Note: the top three ranked models were highlighted in red, green and blue, respectively. Please refer to the online version of this paper to see this table in colour: https://dx.doi.org/10.2166/nh.2019.060.
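The cross-station procedure described above can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' implementation: the station pairs are those listed in the text, extraterrestrial radiation is computed with the standard FAO-56 expressions from latitude and day of year, and the function names, the latitude and the temperature records are hypothetical.

```python
import math

SOLAR_CONSTANT = 0.0820  # MJ m-2 min-1, value used in FAO-56

# Target station -> nearest (donor) station, as listed in the text.
DONOR_STATION = {
    "57896": "57993",
    "58519": "58527",
    "58813": "58606",
    "58715": "58608",
}


def extraterrestrial_radiation(latitude_deg, day_of_year):
    """Daily extraterrestrial radiation Ra (MJ m-2 d-1), FAO-56 formulation."""
    phi = math.radians(latitude_deg)
    angle = 2.0 * math.pi * day_of_year / 365.0
    d_r = 1.0 + 0.033 * math.cos(angle)      # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(angle - 1.39)   # solar declination (rad)
    omega_s = math.acos(-math.tan(phi) * math.tan(delta))  # sunset hour angle (rad)
    return (24.0 * 60.0 / math.pi) * SOLAR_CONSTANT * d_r * (
        omega_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(omega_s)
    )


def cross_station_inputs(target_id, temperatures, target_ra):
    """Combine donor-station temperatures with the target station's own Ra.

    temperatures: {station_id: [(t_max, t_min), ...]} one pair per record
    target_ra: [ra, ...] computed for the target station, same length
    Returns a list of (t_max, t_min, ra) model input rows.
    """
    donor_temps = temperatures[DONOR_STATION[target_id]]
    if len(donor_temps) != len(target_ra):
        raise ValueError("temperature and Ra records must align")
    return [(t_max, t_min, ra) for (t_max, t_min), ra in zip(donor_temps, target_ra)]


# Hypothetical records and latitude for illustration only.
temperatures = {"57993": [(12.0, 5.0), (31.5, 24.0)]}
target_ra = [
    extraterrestrial_radiation(27.0, 15),
    extraterrestrial_radiation(27.0, 196),
]
rows = cross_station_inputs("57896", temperatures, target_ra)
```

Since extraterrestrial radiation depends only on latitude and date, it remains available at the target station even when no observations exist there; only the temperature inputs need to be borrowed from the donor station.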
Kisi (2016) found that the LSSVM model was superior to the MARS model when local data were available, but the MARS model performed better than the LSSVM model when cross-station data were used. Similar results have been reported by Karimi et al. (2017), who found that the GEP model outperformed the SVM model in cross-station scenarios. In this study, the tree-based models performed better than the other models in local applications, while the MARS, SVM and KNEA models offered better ET0 estimates than the others in the absence of local temperature data. This may be due to the differences in the datasets and the characteristics of the various models. The tree-based models use greedy algorithms to explain every point as far as possible, but the dataset inevitably contains noise, which results in over-fitting of the model to some extent. In addition, the tree-based models use many weak classifiers and establish a sub-model (one weak classifier) for small samples independently. The weight of this sub-model is much higher than that of the other sub-models, which can subtly capture some useful information when localizing the model. However, this sub-model may not be applicable in other regions, and it may also cause over-fitting of the model when cross-station data are applied. The inspiration for the MARS model also comes from the classification tree, but the largest difference between the MARS model and the decision tree is that the basis functions can be coupled, which gives the model the ability to describe interactions. This may be the reason why the MARS model is more adaptable. On the other hand, the SVM and KNEA models adopt structural risk minimization, and some noise can be deliberately ignored by tuning parameters, which may explain the high stability of the SVM and KNEA models.
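The coupling of basis functions mentioned for the MARS model can be illustrated with hinge functions. The following is a minimal sketch of the general idea only: the knots, coefficients and function names are hypothetical and are not the fitted model of this study.

```python
def hinge(x, knot, direction=1):
    """MARS hinge basis function.

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


def mars_like_prediction(t_max, ra, coef):
    """Evaluate a tiny hand-written MARS-style model with one interaction term.

    The knots (20.0 for t_max, 30.0 for ra) are hypothetical.
    """
    b1 = hinge(t_max, 20.0)  # active only when t_max exceeds its knot
    b2 = hinge(ra, 30.0)     # active only when ra exceeds its knot
    return (
        coef["intercept"]
        + coef["b1"] * b1
        + coef["b2"] * b2
        + coef["b1b2"] * b1 * b2  # product of two hinges: interaction term
    )


# Hypothetical coefficients for illustration only.
coef = {"intercept": 1.0, "b1": 0.1, "b2": 0.05, "b1b2": 0.01}
prediction = mars_like_prediction(t_max=25.0, ra=35.0, coef=coef)
```

Unlike a decision tree, which returns a constant within each region, the product term changes smoothly with both inputs and vanishes whenever either input is below its knot, so the interaction is only expressed where both variables are active.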
Overall, the selection of suitable alternative sites for ET0 estimation is a systematic task that depends not only on the distance between two sites but also on the similarity of their climates, rather than on the proximity of some individual values. In this study, only four groups of stations (eight stations) were selected to demonstrate the feasibility of switching stations for monthly mean daily ET0 estimation. However, how to establish a more suitable model still needs to be further explored. In addition, only temperature data were switched in this study, and the applicability of using more meteorological data from nearby stations for estimating monthly mean daily ET0 at a target station remains to be studied. Further study is also needed to assess the capability of the proposed models on various time scales (hourly or daily) or in different climatic zones.
CONCLUSIONS
This study compared the capability of eight machine learning models, i.e. ELM, GBDT, KNEA, MARS, ANN, RF, SVM and XGBoost, in modeling monthly mean daily ET0 using maximum and minimum air temperatures and extraterrestrial solar radiation data from 15 stations located in the Jiangxi Province of China. These machine learning models were also compared with the empirical Hargreaves–Samani model. The results showed that the tree-based models (RF, GBDT and XGBoost) had higher estimation accuracy than the other models in local applications. When only temperature data were available, the MARS and SVM models performed slightly better than the other models, while the ANN and HS models performed worse than the others. When there were no temperature data at the target station and the temperature data from the adjacent station were used instead, the MARS, SVM and KNEA models outperformed the other models. This study can provide a solution for the estimation of ET0 in the Jiangxi Province of China when complete meteorological data are lacking and may provide a reference for other regions around the world with similar meteorological conditions.
ACKNOWLEDGEMENTS
This study was supported by the National Natural Science Foundation of China (Nos 51879196, 51790533 and 51709143). The authors thank the National Meteorological Information Center of the China Meteorological Administration for providing the meteorological data.