Abstract
Surface temperature (ST) governs surface energy and terrestrial water balances and thereby affects urban ecosystems. To capture the nonlinear changes of climatological variables, this study proposes a hybrid LSTM-BiLSTM deep learning model that leverages the distinct advantages of Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (BiLSTM) and extracts features of the inputs in both the forward (past to future) and backward (future to past) directions to predict ST. The study assessed the impact of climatological variables, i.e., wind speed, wind direction, relative humidity, dew point temperature, and atmospheric pressure, on ST for five major coastal cities of India: Chennai, Mangalore, Visakhapatnam, Cuddalore, and Cochin. The Recurrent Neural Network (RNN) and hybrid LSTM-BiLSTM models predicted ST effectively and outperformed the standalone Artificial Neural Network (ANN), LSTM, and BiLSTM models. The RNN and LSTM-BiLSTM models performed best for Mangalore (Nash-Sutcliffe efficiency (NSE)=0.91), followed by Cochin (NSE=0.89), Chennai (NSE=0.88), Cuddalore (NSE=0.88), and Visakhapatnam (NSE=0.81). The hybrid data-driven modeling framework indicates that coupling the LSTM and BiLSTM models is effective in predicting the ST of coastal cities.
HIGHLIGHTS
Surface temperature prediction model based on hybrid machine learning algorithms.
Hybrid data-driven algorithm was more effective compared to the individual ML algorithms.
Surface temperature prediction of major coastal cities of India.
INTRODUCTION
It is well reported that global surface temperature (ST) has increased significantly, by 1.6 °C (Basha et al. 2017). The increase in ST results in global warming, which causes severe heat waves, reduction in snow and ice, rise in sea level, loss of biodiversity, and soil erosion (Intergovernmental Panel on Climate Change (IPCC) 2007). Climate model simulations make it possible to understand the long-term changes and the factors responsible for increasing ST at regional and global scales (Mishra et al. 2020). However, short-term forecasting of ST is crucial for many applications such as agriculture, industry, environment, and tourism (Patz et al. 2005). For example, power utilities use short-term forecasts during summer, when the load on the power system and its maintenance increases. ST also plays a significant role in the water-energy balance and the consequent intensification of the hydrological cycle, which affects water resources systems (Wang et al. 2021). Therefore, there is a need to predict ST precisely, in combination with an analysis of the related features, to help create a planning horizon for infrastructure upgrades, insurance, energy policy, etc.
Short-term forecasting of ST has become an important application area for Machine Learning (ML) techniques. The time series of ST at a particular station is known to have nontrivial long-range correlation and nonlinear behaviour. The advantage of a data-driven technique is that it does not need to derive the physical processes of a specific problem; it only requires a data set containing many samples to train the algorithm. Recent studies have applied ML to problems in various fields, such as hydrological and climatological applications (Samadianfard et al. 2019; Sankaranarayanan et al. 2019; Sattari et al. 2020; Shamshirband et al. 2020; Madhuri et al. 2021; Sadeghfam et al. 2021). ML techniques have accurately and efficiently addressed unresolved problems in climate science (Brenowitz & Bretherton 2018; O'Gorman & Dwyer 2018; Rasp et al. 2018; Bolton & Zanna 2019; Salehipour & Peltier 2019). Furthermore, ML techniques can significantly improve climate modeling and long-term climate projections in the coming years (Schneider et al. 2015; Gentine et al. 2018; Reichstein et al. 2019; Chattopadhyay et al. 2020). Several studies have reported Sea Surface Temperature (SST) and Land Surface Temperature (LST) predictions with different ML models (Caruso 2002; Mathew et al. 2016; Himika et al. 2018; Choe & Yom 2020; Mustafa et al. 2020; Yu et al. 2020). They showed good performance in predicting SST and LST from individual ML algorithms and from ensemble averages of several ML algorithms.
Several researchers have studied and presented solutions for predicting weather conditions in terms of temperature (Gocic & Trajkovic 2013; Baehr et al. 2015; Roesch & Günther 2019; Cifuentes et al. 2020; Hewage et al. 2021). Although some improved machine learning methods have already achieved high accuracy, they still have difficulty processing large volumes of input data and dealing with vanishing or exploding gradient problems (Hewage et al. 2021). Therefore, deep learning methods have been applied in forecasting to overcome these shortcomings. The most significant feature of Long Short-Term Memory (LSTM) is its capability to learn long-term dependencies, which is not possible with simple RNNs (Apaydin et al. 2020). Although LSTM is superior to traditional ML methods in processing large volumes of input data and has a relatively fast computational speed, it is not always the best choice in terms of prediction accuracy (Cai et al. 2019; Wang et al. 2019). Among the alternatives, the BiLSTM model has already been applied to the prediction of photovoltaic power output (Wang et al. 2019), short-term load forecasting (He 2017), wind speed and solar radiation forecasting (Díaz-Vico et al. 2017), water resources (Hu et al. 2019; Offiong et al. 2021), and stock-market predictions (Althelaya et al. 2018). However, there is limited research on the application of BiLSTM to ST forecasting; to the best of our knowledge, no previous study has investigated the potential of BiLSTM for ST prediction. No intelligent algorithm or model is competent for all problems, and deep learning models are no exception (Zhen et al. 2020). There is still room for improving deep learning models in ST forecasting.
The combination of ML models that capture spatial and temporal dependencies has a unique advantage in solving forecasting problems with large volumes of data characterized by temporal and spatial correlation (Zhen et al. 2020). To overcome the shortcomings of a single LSTM or BiLSTM model, to process the nonlinear changes of climatological variables, and to combine the advantages of both models for better prediction performance, this study proposes the LSTM-BiLSTM hybrid model. Given the recent development of ML models and the availability of advanced techniques, it is important to study the performance of various state-of-the-art ML models in the prediction of ST. Also, to the best of the authors' knowledge, no study has focused on predicting the ST of coastal cities in the Indian context. Prediction of ST over coastal cities is highly important because of their high population and socioeconomic vulnerability to increased heat stress under anthropogenic global warming (Rehnberg 2021). Coastal stations also show larger variability in surface temperature than inland stations. Therefore, the present study considers five coastal cities in the southern part of India, namely Chennai, Mangalore, Visakhapatnam, Cuddalore, and Cochin, for the prediction of ST. This article demonstrates the viability of different ML-based models in simulating ST over these coastal cities. The next section discusses the research domain and the different ML algorithms. Section 3 compares the performance of the different models along with the predictive accuracy of individual and ensemble averages. Finally, we present conclusions from the research and discuss future work.
DATA
For this study, we considered five major coastal cities of India, namely Chennai, Mangalore, Visakhapatnam, Cuddalore, and Cochin, to predict ST with various weather variables as features, as shown in Figure 1. Atmospheric pressure, dew point temperature, wind speed, wind direction, and relative humidity were used as predictor variables. Daily values of each surface meteorological variable from 1st January 1980 to 31st December 2019 were obtained from the National Climatic Data Center of the National Oceanic and Atmospheric Administration (NOAA) (https://www.ncdc.noaa.gov/cdo-web/datasets).
METHODOLOGY
The overview of the proposed modeling framework, i.e., the architectural flow implemented to predict ST using various ML algorithms, is shown in Figure 2. First, the study analyzed the dependence of ST on each meteorological feature variable (atmospheric pressure, dew point temperature, wind speed, wind direction, and relative humidity) for each coastal city. In most ST modeling studies, more than one model is used to assess model performance. Hence, for generality of the solution and a deeper understanding thereof, the study considered ANN, RNN, LSTM, and BiLSTM as benchmark models for predicting ST. These are state-of-the-art ML models that are widely applied because of their ability to capture hidden patterns in the data, which allows the study to predict ST while accounting for nonlinearity and dynamic behavior. As these models can capture the complex behavior of ST with the selected predictor variables, the adopted algorithms resulted in robust predictions of ST. The ANN, RNN, LSTM, BiLSTM, and hybrid LSTM-BiLSTM models were applied to each city to predict ST at a daily time scale. Several performance measures, namely Mean Square Error (MSE), Mean Absolute Error (MAE), Norm, Nash-Sutcliffe efficiency (NSE), the ratio of the root mean square error to the standard deviation of measured data (RSR), percent bias (PBIAS), and R2, were used to study the performance of each ML algorithm for a given coastal city of India.
Artificial neural network (ANN)
ANN has been identified as one of the robust ML algorithms for capturing nonlinear relationships, with several applications in modeling near-surface climatology in various fields of earth science (LeCun et al. 2015; Lekkas 2017; Reichstein et al. 2019; Qiu et al. 2020; Martin et al. 2021). ANNs are inspired by the human brain and its network of neurons, and they function in a way similar to how the human brain analyzes and processes information. Gupta & Singh (2011) describe an ANN as comprising several highly interconnected computing components (neurons) working in unison to solve a particular problem. ANNs have self-learning capabilities that help them produce more accurate results as more data become available. An ANN has input, hidden, and output layers (Figure 3). The data are fed to the input layer and passed through the hidden layers with weights, and the prediction is obtained from the output layer. The weights are calculated by minimizing the mean square error between the output and the actual values. ANNs have been applied effectively to numerous hydrological problems (ASCE Task Committee on Application of Artificial Neural Networks in Hydrology 2000), e.g., estimating temperature (Cifuentes et al. 2020) and precipitation (Lee et al. 2018), modeling stream flows (Uysal et al. 2016), forecasting river stages (Dazzi et al. 2021), rainfall-runoff modeling (Riad et al. 2004), water quality modeling (Rehana & Dhanya 2018; Zhu et al. 2018), groundwater modeling (Ebrahimi & Rajaee 2017), and many other applications. More details of the ANN model and the training algorithms can be found in El-Baroudy et al. (2010).
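The layer structure described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the layer sizes, the tanh activation, the random initial weights, and the two sample rows are all assumptions chosen for the example, and no training loop is included.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation=None):
    """Weighted sum of the inputs for every neuron, then an optional activation."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def predict(features, hidden, output):
    """Input layer -> one tanh hidden layer -> single linear output neuron."""
    h = dense(features, *hidden, activation=math.tanh)
    return dense(h, *output)[0]


def mean_square_error(targets, predictions):
    """Quantity that training would minimize by adjusting the weights."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


rng = random.Random(0)
hidden = init_layer(5, 4, rng)  # 5 predictors -> 4 hidden neurons (illustrative sizes)
output = init_layer(4, 1, rng)  # 4 hidden neurons -> 1 output (ST)

# Made-up, already-scaled samples in the order:
# pressure, dew point, wind speed, wind direction, relative humidity
samples = [[0.2, 0.7, 0.1, 0.5, 0.6], [0.4, 0.3, 0.8, 0.1, 0.9]]
targets = [0.65, 0.40]
predictions = [predict(s, hidden, output) for s in samples]
print(mean_square_error(targets, predictions))
```

In practice the weights would be updated iteratively (for example by gradient descent) until the mean square error stops decreasing.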
Recurrent neural network
The recurrent neural network (RNN) is a class of ANN in which the connections between nodes form a directed graph along a temporal sequence (Rumelhart et al. 1986). In a directed graph, each edge has a direction and represents a one-way relationship that can be traversed in a single direction only, whereas in an undirected graph, each edge represents a two-way relationship that can be traversed in both directions (Sarker et al. 2019). An RNN handles both current and past data, i.e., it considers current and past values to predict the next values. Training is iterative and ends when the required error limit is reached. In the architecture of a simple RNN (Figure 4), the hidden state h(t), computed with a nonlinear activation function such as tanh or ReLU, is fed back into the network at the next time step. A simple RNN updates only a single past state and is trained by the backpropagation-through-time algorithm, in which the loss is propagated backward to determine the updates to the weights (Werbos 1990).
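The recurrence described above can be illustrated with a short sketch. It assumes the common formulation h(t) = tanh(W_x x(t) + W_h h(t-1) + b); the weight values, the hidden size of two, and the input sequence are made up for the example and are not taken from the study.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent update: h(t) = tanh(W_x x(t) + W_h h(t-1) + b)."""
    h_t = []
    for i in range(len(b)):
        z = b[i]
        z += sum(w_x[i][j] * x_t[j] for j in range(len(x_t)))
        z += sum(w_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(z))
    return h_t


def rnn_forward(sequence, w_x, w_h, b):
    """Run the cell over a sequence, feeding each hidden state into the next step."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# Illustrative parameters: one input feature, two hidden units
w_x = [[0.5], [-0.3]]
w_h = [[0.1, 0.2], [0.0, 0.4]]
b = [0.0, 0.0]
sequence = [[1.0], [0.5], [-1.0]]

states = rnn_forward(sequence, w_x, w_h, b)
print(states[-1])  # final hidden state, which depends on the whole sequence
```

Because each state is computed from the previous one, changing the first input changes the final state, which is how the network carries past values forward.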
In theory, simple RNNs can make use of information in arbitrarily long sequences. In practice, however, backpropagation encounters the vanishing gradient problem, in which the training signal becomes exponentially small as it propagates back through the network, so simple RNNs are limited to looking back only a few steps (Shen 2018). To overcome the vanishing and exploding gradient problems, RNNs can be improved with gated architectures such as LSTM and the Gated Recurrent Unit (GRU). This study evaluated the LSTM to predict the ST for the five coastal cities.
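The vanishing gradient effect can be made concrete with a scalar example. The sketch below assumes a one-unit tanh RNN, h(t) = tanh(w_h h(t-1) + x(t)), and tracks the magnitude of d h(t) / d h(0) with the chain rule; the recurrent weight of 0.5 and the constant input are illustrative choices.

```python
import math


def gradient_through_time(w_h, inputs):
    """Return |d h(t) / d h(0)| after each step of a scalar tanh RNN."""
    h, grad, history = 0.0, 1.0, []
    for x_t in inputs:
        h = math.tanh(w_h * h + x_t)
        grad *= w_h * (1.0 - h * h)  # chain rule factor d h(t) / d h(t-1)
        history.append(abs(grad))
    return history


history = gradient_through_time(w_h=0.5, inputs=[0.1] * 30)
print(history[0], history[9], history[29])
```

Each factor is at most 0.5 in magnitude here, so the gradient shrinks at least geometrically with the number of steps; with a recurrent weight larger than one, the same product can instead grow, which corresponds to the exploding gradient case.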
Long short-term memory
Bidirectional LSTM (BiLSTM)
Bidirectional LSTM (BiLSTM) is an extension of the traditional LSTM that can improve model performance on sequence classification problems. It consists of two LSTMs: one is trained on the input sequence in the forward direction and the other on the same sequence in the backward direction (Figure 6), so that the model learns long-term bidirectional dependencies (Schuster & Paliwal 1997). This increases the amount of information available to the network and improves performance compared with a traditional LSTM. This type of processing is helpful for time series data when the model needs to learn from the data at each timestep (Salehinejad et al. 2018).
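The bidirectional idea can be sketched independently of the LSTM cell itself. To keep the example short, the sketch below swaps in a simple scalar tanh recurrence in place of a full LSTM cell (an explicit simplification); the function names and parameter values are hypothetical.

```python
import math


def run_direction(sequence, w_x, w_h):
    """Scalar tanh recurrence, used here as a stand-in for an LSTM cell."""
    h, states = 0.0, []
    for x_t in sequence:
        h = math.tanh(w_x * x_t + w_h * h)
        states.append(h)
    return states


def bidirectional(sequence, fwd_params, bwd_params):
    """Pair, for every timestep, a past-to-future and a future-to-past state."""
    forward = run_direction(sequence, *fwd_params)
    # Process the reversed sequence, then flip the states back into time order
    backward = run_direction(sequence[::-1], *bwd_params)[::-1]
    return list(zip(forward, backward))


seq = [0.1, 0.4, -0.2, 0.3]
out = bidirectional(seq, (0.8, 0.5), (0.8, 0.5))
for t, (fwd, bwd) in enumerate(out):
    print(t, round(fwd, 4), round(bwd, 4))
```

At the first timestep, the forward state has seen only the first value, while the backward state already summarizes the rest of the sequence; concatenating the two gives each timestep context from both directions.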
LSTM-BiLSTM hybrid model
In this study, to address the individual weaknesses and leverage the distinct advantages of LSTM and BiLSTM, we propose an LSTM-BiLSTM hybrid model. The hybrid model is implemented with LSTM and BiLSTM layers and has nine layers. It starts with one BiLSTM layer, followed by three LSTM layers. Every alternate layer is a dropout layer that drops 20% of the nodes of the previous layer at random to reduce overfitting. Overfitting is a phenomenon in which the parameters become overly fitted to the training and validation datasets: the model performs exceptionally well on these data but poorly on the test dataset or when predicting. To overcome this problem, we introduce a dropout layer after every layer. Possible alternatives for overcoming overfitting include cross-validation, feature selection, and regularization. Cross-validation is computationally expensive because the model is trained multiple times on different partitions of the data. Feature selection is appropriate when there are few training samples with many features; only the essential features are then selected for training, using methods such as the correlation coefficient or SelectKBest. Regularization adds a penalty to the error function, which constrains the coefficients so that the predictions do not take extreme values. In this case, we used dropout layers because there are few features, so feature selection is not helpful, and cross-validation is computationally expensive. Outliers were removed beforehand, so regularization is also not very useful. Therefore, the results should not be highly affected even if these methods are not applied.
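The effect of a 20% dropout layer can be sketched as follows. The sketch assumes the "inverted dropout" convention used by common deep learning libraries, in which the surviving activations are rescaled by 1/(1 - rate) during training and the layer is inactive at prediction time; the source does not state which convention was used, and the function name is hypothetical.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate`; rescale the survivors."""
    if not training or rate == 0.0:
        return list(activations)  # dropout is switched off when predicting
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(42)
layer_output = [1.0] * 10000  # made-up activations of the previous layer
dropped = dropout(layer_output, rate=0.2, rng=rng)

fraction_zero = dropped.count(0.0) / len(dropped)
print(round(fraction_zero, 3))  # close to the 20% dropout rate
```

Because a different random subset of nodes is removed at every training step, the network cannot rely on any single node, which is what discourages overfitting.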
MODEL EVALUATION
Performance rating | RSR | NSE | PBIAS (%) |
---|---|---|---|
Very good | | | PBIAS<±10 |
Good | | | ±10≤PBIAS<±15 |
Satisfactory | | | ±15≤PBIAS<±25 |
Unsatisfactory | | | PBIAS≥±25 |
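The performance measures named in this study can be sketched as below. The formulas are the commonly used definitions and are stated here as assumptions, since the study's own equations are not reproduced in this section: PBIAS uses the (observed - simulated) sign convention, and "Norm" is taken to be the Euclidean norm of the residuals. The PBIAS rating thresholds follow the table above; the sample values are made up.

```python
import math


def mse(obs, sim):
    return sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)


def mae(obs, sim):
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def _residual_and_spread(obs, sim):
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return residual, spread


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    residual, spread = _residual_and_spread(obs, sim)
    return 1.0 - residual / spread


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    residual, spread = _residual_and_spread(obs, sim)
    return math.sqrt(residual) / math.sqrt(spread)


def pbias(obs, sim):
    """Percent bias, assuming the (observed - simulated) convention."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def error_norm(obs, sim):
    """Euclidean norm of the residuals (assumed meaning of 'Norm')."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)))


def pbias_rating(value):
    """Rating by PBIAS magnitude, using the thresholds in the table above."""
    magnitude = abs(value)
    if magnitude < 10:
        return "Very good"
    if magnitude < 15:
        return "Good"
    if magnitude < 25:
        return "Satisfactory"
    return "Unsatisfactory"


observed = [30.0, 31.0, 29.0, 32.0]   # made-up daily ST values (°C)
simulated = [29.5, 31.5, 29.0, 31.0]
print(nse(observed, simulated), rsr(observed, simulated),
      pbias_rating(pbias(observed, simulated)))
```

With these definitions, RSR equals the square root of (1 - NSE), so the two measures rank models identically.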
RESULTS AND DISCUSSION
Historical trends of meteorological variables of coastal cities of India
Analysis of historical weather variables is important for planning and preparing for associated impacts. Statistical trend analysis of meteorological variables provides practical information for better management of water resources. Tests for detecting significant trends in climatological time series can be classified as parametric and non-parametric. Parametric trend tests require the data to be independent and normally distributed, while non-parametric trend tests require only that the data be independent (Gocic & Trajkovic 2013). The non-parametric Mann-Kendall (MK) test and Sen's slope method were used to determine the annual and seasonal trends of the meteorological variables, i.e., whether the weather data of the five major coastal cities in India show a positive or negative trend and whether it is statistically significant (Table 2). These two methods are the most commonly used to detect trends and estimate their magnitude in meteorological time series because they do not require normally distributed data and are insensitive to outliers (Mann 1945; Kendall 1975; Teegavarapu 2019). They are also less sensitive to extreme events and missing data points (Partal & Kahya 2006). The MK trend test was applied to find the trends in ST over the different cities. Considering the whole dataset, i.e., from 1980 to 2019, the ST and dew point temperature of all cities increased at a significance level of 0.05. The ST shows an increasing trend at a rate of 0.24, 0.07, 0.09, 0.01, and 0.03 °C/decade over Chennai, Cochin, Cuddalore, Mangalore, and Visakhapatnam, respectively, at a significance level of 0.05. Furthermore, the Sen's slopes of ST were estimated at a confidence level of 90% and found to be 0.022 °C/year for Chennai, 0.008 °C/year for Cochin, 0.009 °C/year for Cuddalore, 0.013 °C/year for Mangalore, and 0.027 °C/year for Visakhapatnam (Table 2).
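The two trend statistics can be sketched in plain Python. This is a minimal version under stated assumptions: the variance of the MK statistic omits the correction for tied values, the trend is classified with the two-sided critical value 1.96 (significance level 0.05), and the annual series is made up for illustration.

```python
import math
from statistics import median


def mann_kendall_z(series):
    """Standardized Mann-Kendall statistic Z (no correction for ties)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


def sens_slope(series):
    """Median of the slopes between all pairs of points (per time step)."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)


def classify_trend(z, z_crit=1.96):
    """Label the trend at the 0.05 significance level (two-sided)."""
    if z > z_crit:
        return "Increasing"
    if z < -z_crit:
        return "Decreasing"
    return "No trend"


# Hypothetical annual mean ST values (°C), for illustration only
annual_st = [28.1, 28.3, 28.2, 28.6, 28.5, 28.9, 29.0, 28.8, 29.2, 29.3]
z = mann_kendall_z(annual_st)
print(classify_trend(z), sens_slope(annual_st))
```

Because Sen's slope is a median of pairwise slopes, a single extreme value changes it very little, which is the robustness property noted above.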
Station | Test | Temperature | Pressure | Relative humidity | Dew point temperature | Wind speed | Wind direction |
---|---|---|---|---|---|---|---|
Chennai | Trend | Increasing | Increasing | Decreasing | Increasing | Increasing | Decreasing |
Chennai | Z | 11.837 | 5.073 | −4.824 | 14.401 | 31.412 | −15.468 |
Chennai | Sen's slope | 0.022 | 0.013 | −0.028 | 0.0175 | 0.017 | −0.737 |
Cochin | Trend | Increasing | Increasing | Increasing | Increasing | Increasing | Decreasing |
Cochin | Z | 5.604 | 6.964 | 5.734 | 13.456 | 15.543 | −3.031 |
Cochin | Sen's slope | 0.008 | 0.009 | 0.032 | 0.015 | 0.006 | −0.059 |
Cuddalore | Trend | Increasing | Decreasing | Increasing | Increasing | Decreasing | No trend |
Cuddalore | Z | 5.150 | −2.877 | 8.269 | 18.450 | −83.342 | −0.148 |
Cuddalore | Sen's slope | 0.009 | −0.004 | 0.067 | 0.026 | −0.039 | −0.13 |
Mangalore | Trend | Increasing | Decreasing | No trend | Increasing | Decreasing | Decreasing |
Mangalore | Z | 11.354 | −11.991 | 0.980 | 11.995 | −39.472 | −12.654 |
Mangalore | Sen's slope | 0.013 | −0.015 | 0.001 | 0.015 | −0.025 | −0.404 |
Visakhapatnam | Trend | Increasing | Increasing | Decreasing | Increasing | Decreasing | Increasing |
Visakhapatnam | Z | 14.947 | 3.149 | −11.083 | 6.363 | −35.132 | 2.072 |
Visakhapatnam | Sen's slope | 0.027 | 0.013 | −0.064 | 0.013 | −0.036 | 0.150 |
Trend: increasing or decreasing; Z = Mann-Kendall test statistic; Sen's slope – monthly average.
Station | Temperature | Atmospheric pressure | Relative humidity | Dew point temperature | Wind speed | Wind direction |
---|---|---|---|---|---|---|
Chennai | 2009 | 2006 | 1996 | 2009 | 1996 | 1997 |
Cochin | 2008 | 1991 | 1992 | 1995 | 2005 | 1989 |
Cuddalore | 2009 | 1998 | 2007 | 2007 | 2002 | 1988 |
Mangalore | 1995 | 1998 | 1990 | 1993 | 2006 | 2008 |
Visakhapatnam | 1997 | 2000 | 1997 | 1991 | 1994 | 1986 |
The relative humidity of Cochin and Cuddalore was found to be increasing, while that of Chennai and Visakhapatnam was decreasing, with no significant trend for Mangalore. The Sen's slopes of relative humidity were estimated to be −0.028%/year for Chennai, 0.032%/year for Cochin, 0.067%/year for Cuddalore, 0.001%/year for Mangalore, and −0.064%/year for Visakhapatnam at a significance level of 10% (Table 2). The wind speeds of Chennai (0.017 m/s/year) and Cochin (0.006 m/s/year) were found to be increasing, and those of Cuddalore (−0.039 m/s/year), Mangalore (−0.025 m/s/year), and Visakhapatnam (−0.036 m/s/year) were found to be decreasing. The atmospheric pressure of Chennai (0.013 hPa/year), Cochin (0.009 hPa/year), and Visakhapatnam (0.013 hPa/year) was found to be increasing, and that of Cuddalore (−0.004 hPa/year) and Mangalore (−0.015 hPa/year) was found to be decreasing. The wind direction trends of Chennai, Cochin, and Mangalore were found to be decreasing, while Visakhapatnam showed an increasing trend and Cuddalore showed no trend.
The results of the change point analysis of the meteorological variables during 1980–2019 are summarized in Table 3 for all stations. A change point was detected for all six variables in each of the coastal cities. A change from positive to negative direction was detected in the time series of atmospheric pressure, relative humidity, wind speed, and wind direction, while ST and dew point temperature changed from negative to positive. The ST has shown significant positive changes from 2008/2009 onwards for Chennai, Cochin, and Cuddalore, whereas the ST of Mangalore and Visakhapatnam has shown positive changes from 1995 and 1997 onwards, respectively. The change point years of atmospheric pressure, relative humidity, dew point temperature, wind speed, and wind direction for all major coastal cities of India are listed in Table 3. Most of the variables started showing positive changes in their trends from the 1990s and 2000s onwards. For example, following ST, the dew point temperature has also shown positive changes for all cities, after 1990 (Cochin, Mangalore, Visakhapatnam) and after 2000 (Chennai, Cuddalore).
Predictions of the surface temperature of coastal cities using various ML models
In this study, surface meteorological data from 1980 to 2007 (10,000 data points) were used for training and data from 2007 to 2019 (4,600 data points) for testing each of the ML models. For each coastal city, separate ML models were trained and tested with performance measures such as NSE, R2, MSE, MAE, and Norm, as discussed in the previous section. First, the statistical dependence between each variable and ST was studied using Pearson correlation coefficients (Table 4). The ST of Chennai depends positively on dew point temperature and negatively on relative humidity and atmospheric pressure. For Cochin, the most influential variable in predicting ST is relative humidity, with a correlation coefficient of −0.62. For Cuddalore, the most influential variables are dew point temperature, relative humidity, and atmospheric pressure, with correlation coefficients of 0.55, −0.61, and −0.75, respectively. For Mangalore, relative humidity is the most influential variable, with a correlation coefficient of −0.51. For Visakhapatnam, the most influential variables are dew point temperature and atmospheric pressure, with correlation coefficients of 0.78 and −0.53, respectively (Table 4).
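The dependence analysis can be sketched with a plain-Python Pearson coefficient and a helper that ranks predictors by the magnitude of their correlation with ST. The helper name and the short data series are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rank_predictors(st, predictors):
    """Sort predictor names by |r| with ST, strongest first."""
    scores = {name: pearson_r(values, st) for name, values in predictors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up daily values, for illustration only
st = [29.0, 30.5, 31.0, 28.5, 27.0, 30.0]
predictors = {
    "dew point temperature": [23.0, 24.0, 24.5, 23.5, 22.0, 24.0],
    "relative humidity": [70.0, 64.0, 62.0, 74.0, 80.0, 66.0],
    "wind speed": [3.0, 2.5, 4.0, 3.5, 3.0, 2.0],
}
for name, r in rank_predictors(st, predictors):
    print(f"{name}: {r:+.2f}")
```

Ranking by the absolute value matters because a strongly negative coefficient, such as the one between ST and relative humidity, indicates as much predictive dependence as a strongly positive one.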
City | ST–Dew point temperature | ST–Relative humidity | ST–Wind speed | ST–Wind direction | ST–Atmospheric pressure |
---|---|---|---|---|---|
Chennai | 0.52 | −0.64 | 0.47 | 0.42 | −0.73 |
Cochin | 0.26 | −0.62 | 0.12 | 0.096 | −0.05 |
Cuddalore | 0.55 | −0.61 | 0.12 | 0.35 | −0.75 |
Mangalore | 0.19 | −0.51 | 0.14 | 0.15 | −0.04 |
Visakhapatnam | 0.78 | 0.15 | 0.11 | 0.40 | −0.53 |
The models were trained and tested with many different parameter settings. Every model was tested with 100, 150, and 200 iterations; the results from 150 iterations were nearly identical to those from 200 and 250 iterations, so there is likely no benefit in using more than 150. We therefore set 150 iterations for every model and compared the performance for different numbers of hidden nodes. There is no fixed number of nodes that gives the best output, so multiple values were tested. For every parameter setting, the model was trained on 10,000 daily data points and tested on ∼4,500 data points. A careful selection of the hyperparameters is required for deep learning algorithms (Feigl et al. 2021). In this study, while training a model on a time series, all possible combinations of the hyperparameters (number of hidden layers: 1–2; total number of hidden nodes: 50–500; timesteps: 1; dropout ratio: 0–0.4; epochs: 50–150; batch size: 2–64) were evaluated, and the best-performing set was chosen. Table 5 shows the selection of the number of hidden nodes for ANN, RNN, LSTM, and BiLSTM for Cochin; bold values indicate the number of hidden nodes with the best performance. A similar experiment was conducted for the other four cities to identify the best-performing number of hidden nodes.
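An exhaustive search over hyperparameter combinations can be sketched as follows. Only the ranges come from the text; the specific grid values inside each range are assumptions, and `toy_score` is a purely illustrative placeholder for "train the model with this configuration and return its validation score".

```python
from itertools import product

# Grid values are illustrative picks inside the ranges given in the text
SEARCH_SPACE = {
    "hidden_layers": [1, 2],
    "hidden_nodes": [50, 100, 200, 300, 400, 500],
    "timesteps": [1],
    "dropout": [0.0, 0.2, 0.4],
    "epochs": [50, 100, 150],
    "batch_size": [2, 16, 32, 64],
}


def grid(space):
    """Yield every combination of the hyperparameter values as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def select_best(space, score_fn):
    """Return the configuration with the highest score."""
    return max(grid(space), key=score_fn)


def toy_score(config):
    """Placeholder score; a real run would train a model and return, e.g., NSE."""
    return -abs(config["hidden_nodes"] - 300) - 100 * abs(config["dropout"] - 0.2)


best = select_best(SEARCH_SPACE, toy_score)
print(len(list(grid(SEARCH_SPACE))), best)
```

The number of combinations grows multiplicatively with each added hyperparameter, which is why the ranges are kept narrow when every configuration requires a full training run.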
Model | Hidden nodes | MSE | MAE | R2 | NSE | Norm |
---|---|---|---|---|---|---|
ANN | 50 | 0.27 | 0.24 | 0.84 | 0.80 | 33.48 |
ANN | 100 | 0.28 | 0.27 | 0.83 | 0.79 | 34.28 |
ANN | 150 | 0.27 | 0.26 | 0.85 | 0.81 | 32.61 |
ANN | 170 | 0.21 | 0.21 | 0.87 | 0.85 | 30.15 |
ANN | 200 | 0.23 | 0.23 | 0.85 | 0.82 | 32.54 |
ANN | 220 | 0.30 | 0.29 | 0.82 | 0.78 | 35.34 |
RNN | 200 | 0.18 | 0.18 | 0.89 | 0.88 | 27.50 |
RNN | 240 | 0.17 | 0.18 | 0.89 | 0.88 | 27.17 |
RNN | 280 | 0.17 | 0.17 | 0.89 | 0.88 | 27.09 |
RNN | 300 | 0.16 | 0.16 | 0.90 | 0.89 | 26.50 |
RNN | 320 | 0.17 | 0.17 | 0.89 | 0.88 | 27.11 |
RNN | 360 | 0.17 | 0.17 | 0.90 | 0.89 | 27.00 |
RNN | 400 | 0.17 | 0.17 | 0.90 | 0.89 | 26.99 |
LSTM | 120 | 0.28 | 0.27 | 0.83 | 0.78 | 34.38 |
LSTM | 150 | 0.28 | 0.27 | 0.84 | 0.79 | 34.06 |
LSTM | 170 | 0.27 | 0.26 | 0.84 | 0.79 | 34.01 |
LSTM | 190 | 0.28 | 0.27 | 0.84 | 0.79 | 34.16 |
LSTM | 200 | 0.28 | 0.27 | 0.83 | 0.79 | 34.19 |
LSTM | 240 | 0.28 | 0.28 | 0.83 | 0.79 | 34.59 |
LSTM | 300 | 0.30 | 0.29 | 0.83 | 0.78 | 35.26 |
BiLSTM | 200 | 0.25 | 0.26 | 0.85 | 0.82 | 32.35 |
BiLSTM | 240 | 0.24 | 0.25 | 0.86 | 0.83 | 31.56 |
BiLSTM | 280 | 0.23 | 0.24 | 0.86 | 0.83 | 31.26 |
BiLSTM | 320 | 0.23 | 0.23 | 0.86 | 0.84 | 30.99 |
BiLSTM | 360 | 0.22 | 0.22 | 0.87 | 0.84 | 30.66 |
BiLSTM | 400 | 0.22 | 0.22 | 0.87 | 0.85 | 30.50 |
BiLSTM | 440 | 0.22 | 0.21 | 0.87 | 0.85 | 30.31 |
The next step in predicting ST is to use appropriate ML models that perform accurately in calibration and validation, as judged by acceptable performance measures (Figure 2). The results of the five ML techniques for predicting ST were evaluated using several goodness-of-fit statistics (MSE, MAE, NSE, Norm, RSR, PBIAS, and R2) and graphical tools (comparison plots). The experimental results showed good agreement between observed and predicted values, confirming the stable generalization capacity of the ANN, RNN, LSTM, BiLSTM, and LSTM-BiLSTM approaches. The developed models successfully predicted ST using atmospheric pressure, dew point temperature, wind speed, wind direction, and relative humidity as inputs.
The models' performance for daily data at the five coastal cities is provided in Table 6 and Figures 7–11. The seasonal variations of predicted ST are almost synchronous and comparable with the observed values (Figures 8–11), but the ANN model agreed poorly with the observed values for all coastal cities (Figure 7). The performance statistics (MSE, MAE, R2, NSE, Norm, RSR, and PBIAS) are given in Table 6.
City | Model | MSE | MAE | R2 | NSE | Norm | RSR | PBIAS |
---|---|---|---|---|---|---|---|---|
Chennai | ANN | 1.78 | 1.03 | 0.74 | 0.66 | 90.30 | 0.58 | 3.00 |
Chennai | RNN | 0.89 | 0.55 | 0.87 | 0.85 | 63.89 | 0.39 | 1.15 |
Chennai | LSTM | 1.06 | 0.64 | 0.84 | 0.80 | 69.77 | 0.45 | 0.82 |
Chennai | BiLSTM | 1.04 | 0.64 | 0.84 | 0.80 | 69.23 | 0.45 | 1.19 |
Chennai | LSTM-BiLSTM | 0.74 | 0.53 | 0.89 | 0.88 | 58.58 | 0.35 | −0.80 |
Cochin | ANN | 0.21 | 0.21 | 0.87 | 0.85 | 30.15 | 0.39 | −0.01 |
Cochin | RNN | 0.17 | 0.16 | 0.90 | 0.89 | 26.68 | 0.33 | 0.01 |
Cochin | LSTM | 0.27 | 0.26 | 0.84 | 0.79 | 34.02 | 0.46 | −0.24 |
Cochin | BiLSTM | 0.22 | 0.21 | 0.87 | 0.85 | 30.31 | 0.39 | −0.01 |
Cochin | LSTM-BiLSTM | 0.18 | 0.20 | 0.89 | 0.87 | 27.50 | 0.36 | −0.49 |
Cuddalore | ANN | 0.86 | 0.53 | 0.85 | 0.81 | 62.93 | 0.44 | −0.04 |
Cuddalore | RNN | 0.70 | 0.41 | 0.87 | 0.86 | 56.75 | 0.37 | 0.32 |
Cuddalore | LSTM | 0.89 | 0.56 | 0.84 | 0.81 | 63.79 | 0.44 | 0.12 |
Cuddalore | BiLSTM | 0.83 | 0.49 | 0.85 | 0.82 | 61.67 | 0.42 | 0.13 |
Cuddalore | LSTM-BiLSTM | 0.62 | 0.31 | 0.89 | 0.88 | 53.21 | 0.34 | 0.45 |
Mangalore | ANN | 0.31 | 0.32 | 0.86 | 0.81 | 37.62 | 0.43 | 7.67 |
Mangalore | RNN | 0.20 | 0.26 | 0.90 | 0.89 | 30.89 | 0.33 | 1.06 |
Mangalore | LSTM | 0.33 | 0.36 | 0.85 | 0.79 | 38.90 | 0.46 | 0.55 |
Mangalore | BiLSTM | 0.27 | 0.36 | 0.88 | 0.86 | 35.12 | 0.37 | 1.25 |
Mangalore | LSTM-BiLSTM | 0.16 | 0.24 | 0.92 | 0.91 | 27.78 | 0.30 | −0.58 |
Visakhapatnam | ANN | 1.46 | 0.53 | 0.73 | 0.66 | 69.65 | 0.58 | −0.34 |
Visakhapatnam | RNN | 0.88 | 0.37 | 0.83 | 0.81 | 54.28 | 0.43 | −0.12 |
Visakhapatnam | LSTM | 1.45 | 0.57 | 0.73 | 0.66 | 69.56 | 0.58 | 0.55 |
Visakhapatnam | BiLSTM | 1.42 | 0.56 | 0.74 | 0.68 | 68.79 | 0.56 | 0.28 |
Visakhapatnam | LSTM-BiLSTM | 0.97 | 0.51 | 0.82 | 0.77 | 56.72 | 0.48 | 0.52 |
All values shown refer to the testing period.
Table 6 shows the best performance of each model in each city after experimenting with different parameters for each model. In every city, the best-performing model was either the RNN or the LSTM-BiLSTM hybrid (Figures 8 and 11). For Chennai, the LSTM-BiLSTM hybrid model (MSE=0.74, MAE=0.53, R2=0.89, NSE=0.88, Norm=58.58, RSR=0.35, PBIAS=−0.80) performed better than the remaining models. For Cochin, the RNN model (MSE=0.17, MAE=0.16, R2=0.90, NSE=0.89, Norm=26.68, RSR=0.33, PBIAS=0.01) performed best. For Cuddalore, the LSTM-BiLSTM model (MSE=0.62, MAE=0.31, R2=0.89, NSE=0.88, Norm=53.21, RSR=0.34, PBIAS=0.45) performed best. For Mangalore, the LSTM-BiLSTM hybrid model (MSE=0.16, MAE=0.24, R2=0.92, NSE=0.91, Norm=27.78, RSR=0.30, PBIAS=−0.58) performed best, and for Visakhapatnam, the RNN model (MSE=0.88, MAE=0.37, R2=0.83, NSE=0.81, Norm=54.28, RSR=0.43, PBIAS=−0.12) outperformed the remaining models.
The R2 scores ranged between 0.73 and 0.92, NSE scores between 0.66 and 0.91, and RSR scores between 0.30 and 0.58. PBIAS scores were within the limit (<±10), and MSE scores were reasonably low (∼≤2 °C²) during the validation periods, revealing high model reliability. The MAE scores for all the coastal cities ranged between 0.16 and 1.03 °C across all models (Table 6), which is reasonable in comparison to earlier models: the Adaptive Neuro-Fuzzy Inference System (ANFIS) approach for predicting ST by Mustafa et al. (2020) (0.16 °C); the Support Vector Regression (SVR) approach for predicting Global Solar Radiation (GSR) by Samadianfard et al. (2019) (0.99 °C); the hybrid Decision Tree (DT) and Gradient Boosted Trees (GBT) (DT–GBT) approach for predicting soil temperature by Sattari et al. (2020) (0.52–0.97 °C); and the Multilayer Perceptron (MLP) algorithm and SVR approach for predicting soil temperature by Shamshirband et al. (2020) (0.72–5.17 °C). The R2 scores for all the coastal cities ranged between 0.73 and 0.92 across all models (Table 6), which is also reasonable in comparison to the same studies: ANFIS for ST by Mustafa et al. (2020) (0.99); SVR for GSR by Samadianfard et al. (2019) (0.96); DT–GBT for soil temperature by Sattari et al. (2020) (0.98); and MLP and SVR for soil temperature by Shamshirband et al. (2020) (0.72–0.98).
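To make the goodness-of-fit statistics above concrete, the following is a minimal sketch of how MSE, MAE, NSE, RSR, and PBIAS are commonly computed from paired observed and predicted series. The function names are illustrative, and the formulas follow widely used hydrological conventions rather than code from this study; in particular, the PBIAS sign convention (positive values meaning underestimation) is an assumption. Norm and R2 are omitted because their exact definitions are not restated in this section.

```python
import math


def mse(obs, sim):
    """Mean squared error between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)


def mae(obs, sim):
    """Mean absolute error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def rsr(obs, sim):
    """Ratio of the RMSE to the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(residual / spread)


def pbias(obs, sim):
    """Percent bias (assumed convention: positive means the model underestimates)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)
```

Under these definitions, RSR² = 1 − NSE, which agrees with the rounded values in Table 6 (for example, Mangalore LSTM-BiLSTM: NSE=0.91 and RSR=0.30).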
The LSTM-BiLSTM hybrid model was implemented with nine layers combining LSTM and BiLSTM. Every alternate layer is a dropout layer that randomly drops 20% of the nodes in the previous layer to reduce overfitting. The stack includes one Bidirectional LSTM layer and three simple LSTM layers. The data samples were given as input to the LSTM-BiLSTM hybrid model to predict ST for all five coastal cities. The results were better than those obtained from the LSTM and BiLSTM models, as shown in Table 6 and Figure 11. Comparing the hybrid model with the LSTM and BiLSTM models in terms of the match between observed and predicted values through the time series (Figures 9 and 10), it can be concluded that the combination of LSTM and BiLSTM predicted ST more accurately than either standalone model (Figure 11).
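One possible reading of this nine-layer description can be sketched as follows. The layer order (BiLSTM first), the final Dense output layer that brings the count to nine, the number of units, the optimizer, and the input window shape are all assumptions made for illustration; only the counts (one BiLSTM, three LSTM, alternating 20% dropout) come from the text. `hybrid_layer_plan` and `build_keras_model` are hypothetical helper names, and TensorFlow is imported lazily so the plan can be inspected without it.

```python
def hybrid_layer_plan(units=200, dropout=0.2):
    """Return an assumed nine-layer stack as (layer_type, params) tuples."""
    plan = [
        ("BiLSTM", {"units": units, "return_sequences": True}),
        ("Dropout", {"rate": dropout}),
    ]
    for i in range(3):
        # Only the last LSTM layer collapses the sequence to a single vector.
        plan.append(("LSTM", {"units": units, "return_sequences": i < 2}))
        plan.append(("Dropout", {"rate": dropout}))
    plan.append(("Dense", {"units": 1}))  # assumed single-value ST output
    return plan


def build_keras_model(timesteps, n_features, units=200, dropout=0.2):
    """Turn the plan into a Keras Sequential model (requires TensorFlow)."""
    import tensorflow as tf

    layers = tf.keras.layers
    model = tf.keras.Sequential([tf.keras.Input(shape=(timesteps, n_features))])
    for kind, params in hybrid_layer_plan(units, dropout):
        if kind == "BiLSTM":
            inner = layers.LSTM(
                params["units"], return_sequences=params["return_sequences"]
            )
            model.add(layers.Bidirectional(inner))
        elif kind == "LSTM":
            model.add(layers.LSTM(**params))
        elif kind == "Dropout":
            model.add(layers.Dropout(params["rate"]))
        else:
            model.add(layers.Dense(params["units"]))
    # Loss and optimizer are placeholders, not the study's reported settings.
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model
```

With the five predictors used in this study, a call such as `build_keras_model(timesteps=7, n_features=5)` would build the network for a hypothetical seven-day input window.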
Figure 12 is a radar plot of the NSE (Nash-Sutcliffe Efficiency) values of the different models for each city, giving an estimate of how each model performs in different cities. Each radar shows a pentagon inside a pentagon, with five vertices that each represent the NSE value of one model compared to the best model among them. The outer pentagon, the grey boundary of the graph, represents the best NSE value of all five models. The red inner polygon is formed by placing the original NSE values at the vertices. A red vertex that overlaps the grey vertex marks the best-performing model, and the closer a red vertex is to the grey vertex, the closer that model is to the performance of the best model. For Cuddalore, the red pentagon covers almost the whole of the grey pentagon, indicating that all the models perform close to the best NSE value. In most cities, the hybrid model performs better than the other models, with the RNN model close behind; in Cochin and Visakhapatnam the RNN performs best (Table 6). The ANN runs faster, giving results in the shortest amount of time, and matches or exceeds the standalone LSTM in some cities.
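The comparison underlying the radar plot can be reproduced from the NSE column of Table 6. The sketch below assumes that each vertex is the ratio of a model's NSE to the best NSE in that city; the exact scaling used in Figure 12 is not stated, so this is an illustration rather than the figure's actual construction. The helper names are hypothetical.

```python
# NSE values copied from Table 6 (testing period).
NSE_TABLE6 = {
    "Chennai": {"ANN": 0.66, "RNN": 0.85, "LSTM": 0.80, "BiLSTM": 0.80, "LSTM-BiLSTM": 0.88},
    "Cochin": {"ANN": 0.85, "RNN": 0.89, "LSTM": 0.79, "BiLSTM": 0.85, "LSTM-BiLSTM": 0.87},
    "Cuddalore": {"ANN": 0.81, "RNN": 0.86, "LSTM": 0.81, "BiLSTM": 0.82, "LSTM-BiLSTM": 0.88},
    "Mangalore": {"ANN": 0.81, "RNN": 0.89, "LSTM": 0.79, "BiLSTM": 0.86, "LSTM-BiLSTM": 0.91},
    "Visakhapatnam": {"ANN": 0.66, "RNN": 0.81, "LSTM": 0.66, "BiLSTM": 0.68, "LSTM-BiLSTM": 0.77},
}


def best_model(scores):
    """Name of the model with the highest NSE in one city."""
    return max(scores, key=scores.get)


def relative_to_best(scores):
    """Each model's NSE as a fraction of the city's best NSE (assumed scaling)."""
    best = max(scores.values())
    return {model: value / best for model, value in scores.items()}


def most_uniform_city(table):
    """City whose weakest model is closest to its best model."""
    return max(table, key=lambda city: min(relative_to_best(table[city]).values()))


if __name__ == "__main__":
    for city, scores in NSE_TABLE6.items():
        ratios = relative_to_best(scores)
        summary = ", ".join(f"{m}={r:.2f}" for m, r in ratios.items())
        print(f"{city}: best={best_model(scores)}; {summary}")
```

Under this scaling, the city whose weakest model sits closest to its best model is Cuddalore, in line with the description of its radar above.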
CONCLUSIONS
In this study, we discussed the performance of a suite of models, namely ANN, RNN, LSTM, BiLSTM, and an LSTM-BiLSTM hybrid model consisting of BiLSTM and LSTM layers, in ST prediction for five Indian coastal cities. The input parameters for these models are atmospheric pressure, dew point temperature, wind speed, wind direction, and relative humidity. The performance of these models was compared using NSE, R2, MSE, MAE, Norm, RSR, and PBIAS. We identified the best configuration for each city and model by varying parameters such as the number of hidden nodes. Although some missing data had to be replaced in some instances, the performance of these models is promising. As demonstrated in the present study, the proposed model for ST prediction can also be applied to other weather factors with appropriate data preprocessing and promising data-driven approaches.
The following major conclusions are derived from this study:
The five major coastal cities of India, Chennai, Cochin, Cuddalore, Mangalore, and Visakhapatnam, have shown an increasing trend in ST of 0.24, 0.07, 0.09, 0.01, and 0.03 °C/decade, respectively.
Dew point temperature, relative humidity, and atmospheric pressure are the most influential parameters for predicting ST over coastal cities in India.
The RNN and LSTM-BiLSTM models performed best in predicting ST for all five cities: Chennai (NSE=0.88), Cochin (NSE=0.89), Cuddalore (NSE=0.88), Mangalore (NSE=0.91), and Visakhapatnam (NSE=0.81).
The hybrid data-driven modeling framework showed that coupling the LSTM and BiLSTM models is effective in predicting the ST of coastal cities.
Overall, the hybrid data-driven modeling framework presented in the study indicated that coupling the LSTM and BiLSTM models is effective in ST prediction. The outcomes of the current study have significant implications for research on ST prediction, especially from the viewpoint of combining LSTM and BiLSTM methods. Although the hybrid LSTM-BiLSTM model performed well, there is still scope for improvement through additional studies, and despite the robustness of the modeling framework, the study has some caveats. The study period runs from 1 January 1980 to 31 December 2019. The data used in the analysis are at a daily time scale, and obtaining daily meteorological data for coastal cities of India is difficult. Further, this is the only long period of data available for major coastal cities of India with few missing and erroneous data points. Because a long record of temperature data with minimal missing points is required, the analysis was confined to the 1980–2019 period. The proposed methodology can be updated as new data sets with few missing values become available. Due to data limitations, some additional variables that directly impact ST, such as soil moisture and vegetation, are not considered in the present study. However, including such hydroclimatic variables, which have a physical and conceptual relation with ST, could improve ST prediction performance beyond the hybrid ML models demonstrated here. The present study considered five major, highly populated coastal cities of Southern India, but it can be extended to other coastal cities of India, given data availability. Depending on the available predictor variables, the proposed modeling framework can be applied with data-preprocessing approaches that select highly sensitive variables and optimal model parameters to obtain robust predictions of ST.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.