This study analyses the trend of sea level data along the Chennai coast and checks for structural change in the dataset using the Chow method. It also proposes a methodology for predicting the mean sea level with feed-forward neural network (FFNN) and wavelet transform neural network (WTNN) models. The data analysis reveals a breakpoint in the year 1994 and an overall increasing trend over the selected time period at the Chennai coast. Model development requires a good understanding of the parameters that influence the sea level. Hence, correlation analyses were performed, which showed that wind speed, sea surface salinity, and surface pressure are the influencing variables for modelling sea level data. Accordingly, these variables were considered as potential inputs for model development. To compare the performance of all the developed models, the Root Mean Square Error (RMSE), Correlation Coefficient, and Nash–Sutcliffe Efficiency (NSE) were used. The performance indices and the graphical indicators both show that WTNN Model 4 outperformed all the other developed models. The NSE of WTNN Model 4 was 29.52% higher when compared with the other developed models.

  • Detection of the breakpoint using the Chow method.

  • Identification of major climatic variables affecting the sea level rise.

  • Proposed a methodology for prediction of the sea level using climatic variables employing feed-forward neural network (FFNN) and wavelet transform neural network (WTNN).

Under higher emissions scenarios, oceanic sea level rise (SLR) by 2100 is expected to range between 0.61 and 1.10 m (https://www.ipcc.ch). According to Oppenheimer et al. (2019), due to instabilities in the Antarctic and Greenland ice sheets, the rise might be as much as 2 m (DeConto & Pollard 2016; Bamber et al. 2019). Many studies have confirmed that SLR is linked to climate change. Climate change is closely tied to global economic growth, particularly in the 70 years since World War II, when expanding energy demands fed a burgeoning global economy. Sea level data for the whole oceanic domain have been available since the early 1990s owing to precise satellite altimetry. This valuable data collection has verified that, on average, the pace of SLR over the last 20 years has been twice as fast as it was over the preceding, longer multidecadal time span (Nerem et al. 2010; Church & White 2011). According to the Intergovernmental Panel on Climate Change (IPCC), by the 2050s, mean summer temperatures will rise by 1.5–2.0 °C and mean winter temperatures by 2.5–3.0 °C, which could be severely damaging to the environment. Flooding threats, coastal erosion, saltwater intrusion into groundwater, ecosystem and land use changes, and the potential conversion of land into permanent open water are all effects of relative SLR (Nicholls et al. 2007). To address this problem, it is necessary to model mean SLR accurately, which can be done efficiently using soft computing techniques.

Artificial neural networks (ANNs) are one of the most extensively used neurocomputing technologies in the field of time series analysis. These methods are extremely useful for conducting analysis on datasets when minimal information about the influencing variables and their involvement in generating the time series is available (Shamshirband et al. 2020; Band et al. 2022). ANNs can approximate a continuous function to any required precision without making any implicit assumptions, and they have desirable properties including non-linearity, parallelism, and robustness (Basheer & Hajmeer 2000). A wavelet transform neural network (WTNN) is a hybrid model that combines ANN with wavelet transform and has been used in a number of studies. The convolution of a signal with each of the wavelets in the family is defined as a wavelet transform operation. Translation invariant representation can be obtained by introducing some non-linearity to the system. In hydrology, machine learning (ML) has proven to be effective. Traditional data-driven and physical hydrology models perform less accurately in flood prediction than ML approaches especially in short-term flood forecasting (Mosavi et al. 2018). ML is a field of artificial intelligence (AI) used to induce regularities and patterns, providing easier implementation with low computation cost, as well as fast training, validation, testing and evaluation, with high performance compared to physical models, and relatively less complexity. Furthermore, the ML approach aids in the estimation of precipitation using satellite records. These are widely used in a variety of applications, including navigational safety, agricultural optimization, and mechanical structure design among others (Guillou & Chapalain 2021). ML algorithms have also been used to aggregate ‘best-estimate’ forecasts from an ensemble for the predictions of ocean waves (Bruneau et al. 2020). 
Neural networks (NNs) have also successfully been used to bias-correct measurements leading to more homogeneous climate data records (Leahy et al. 2018). Characterized by remarkable learning ability, noise tolerance, and generalisability, these advanced approaches offer new horizons compared to traditional engineering methods, and are therefore recognized as one of the pillars of future economic and industrial developments. Thus, these powerful methods are also playing an increasing role in the study of coastal processes, and their importance is reinforced by a growing number of observational available datasets (Beuzen & Splinter 2020). In relation to its highly predictable characteristics, particular attention was also dedicated to tide forecasting with applications in harbours disseminated along the coastline by comparing traditional harmonic analysis techniques with ML methods (French et al. 2017; Liu et al. 2019).

The study region Chennai has been identified as one of the important coastlines of India. It is facing the problem of SLR and its impacts in the past have been devastating (Deepa & Gnanaseelan 2021). The increase in the rate of change of SLR creates an interest to study this area. An attempt has been made to study the trend of SLR and thereby detect the breakpoint of mean SLR along the Chennai coast during the period (1916–2015). In this paper, WTNN models and feedforward neural network (FFNN) models have been developed for sea level modelling along the Chennai coast. The correlation analysis between the local climatic variables and sea level was performed to identify the potential factors responsible for the changes. The input variables have been varied and the performance indices have been calculated for each model. Their comparison has been done and the best model has been recommended for carrying out forecasting in the future times. The majority of previous research work found solely the thermosteric influence on the sea level, which could lead to an overestimation of expected sea levels. Moreover, the number of inputs used for the previous work is very limited and this has been taken into consideration for the present work.

Study area and data

The study has been conducted on the tide gauge station of the Chennai coast (India) located at 13.0827°N and 80.2707°E as shown in Figure 1. The Coromandel Coast of the Bay of Bengal and the Indian Ocean stretches along the southeast coast of the Indian Peninsula and is part of Tamil Nadu's coastline. It stretches for 1,076 km and is the country's second-longest coastline, after that of Gujarat. The coastal corridor is divided into 14 districts, each featuring 15 significant ports and harbours, as well as sandy beaches, lakes, and river mouths. Chennai lies 6.7 m above sea level on average. The temperature ranges from 35 to 40 °C, with a low of 19 °C. The highest temperature ever recorded was 45 °C. The average annual rainfall of Chennai is 140 cm. The data for climatic variables such as sea surface temperature (SST), sea surface height (SSH), and sea surface salinity (SSS) were retrieved from the Copernicus Marine Environment Monitoring Service (CMEMS) web portal for the period 1993–2020 (https://marine.copernicus.eu/). Other climatological parameters collected include surface pressure (SP), wind speed (w), and precipitation (p) (https://power.larc.nasa.gov/data-access-viewer/). The data for the breakpoint analysis for Chennai from 1916 to 2015 were taken from the Permanent Service for Mean Sea Level (PSMSL) (https://www.psmsl.org).
Figure 1

Location of the study area.


Breakpoint analysis

Breakpoint analysis is a way of looking at data to determine when there are shifts or breaks in normal levels. This analysis is used for determining the structural change of a time series that changes the slope abruptly at some unknown point. This test determines whether a broken line fits the data significantly better than a single straight line. The position of the breakpoint can be approximated using a confidence interval. The breakpoint is a crucial, safe, or threshold value above or below which undesirable effects occur. The breakpoint is very important in making decisions. The method used in this work for breakpoint detection is the Chow test. It was devised by econometrician Gregory Chow in 1960. It is a test of whether the true coefficients in two linear regressions on separate data sets are similar. It is frequently used to see if the independent variables have distinct effects on different demographic subgroups (Hurtado et al. 2020).

The Chow test was used in this study to check for a structural break in the mean sea level time series. In this statistical test, the dataset is divided into two sub-periods, the regression parameters of each sub-period are estimated, and an F-statistic is then used to test whether the parameters of the two sub-periods are equal (Anjali & Roshni 2022). The F-statistic is computed as:
$$F = \frac{\left[S_T - (S_1 + S_2)\right]/k}{(S_1 + S_2)/(n - 2k)} \qquad (1)$$
where $S_T$ is the sum of squared residuals from the total data; $S_1$ and $S_2$ are the sums of squared residuals from each group; $n$ is the total number of observations; and $k$ is the number of degrees of freedom (the number of estimated regression parameters). The Chow test was performed in RStudio using the strucchange package.
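The F-statistic of the Chow test can be illustrated with a minimal Python sketch. It is not the strucchange workflow used in the study: the function names (`ssr_linear`, `chow_f_statistic`), the straight-line regression in each sub-period (so k = 2), and the synthetic series are all assumptions made for illustration.

```python
import numpy as np


def ssr_linear(t, y):
    """Sum of squared residuals of an ordinary least-squares line y = a + b*t."""
    X = np.column_stack([np.ones(t.size), t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def chow_f_statistic(t, y, break_value, k=2):
    """Chow F-statistic for a candidate break at t == break_value.

    k is the number of regression parameters (intercept and slope here).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    first = t <= break_value                 # first sub-period includes the break year
    s_total = ssr_linear(t, y)               # S_T: one line through all data
    s1 = ssr_linear(t[first], y[first])      # S_1: line through the first sub-period
    s2 = ssr_linear(t[~first], y[~first])    # S_2: line through the second sub-period
    n = y.size
    return ((s_total - (s1 + s2)) / k) / ((s1 + s2) / (n - 2 * k))


# Synthetic annual series (NOT the PSMSL record): a gentle slope up to 1994,
# a steeper slope afterwards, plus random noise.
rng = np.random.default_rng(0)
years = np.arange(1916, 2016)
level = np.where(years <= 1994,
                 0.2 * (years - 1916),
                 0.2 * (1994 - 1916) + 4.1 * (years - 1994))
level = level + rng.normal(0.0, 5.0, size=years.size)

f_stat = chow_f_statistic(years, level, 1994)
print(f"Chow F-statistic at 1994: {f_stat:.2f}")
```

The corresponding P-value would follow from the F distribution with (k, n − 2k) degrees of freedom; a large F-statistic indicates that two regression lines fit the data significantly better than one.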

Trend analysis

This is the method of collecting data and identifying its pattern (Wang et al. 2020) and is very important in predicting future events. This analysis has gained much attention during the recent 2–3 decades. The periodicities of fluctuations have been detected for Chennai with data ranging from 1916 to 2015.

Modelling of sea level variations

For the modelling of sea level data, potential input variables have to be selected for the development of more efficient models. The relation between the variables and sea level was found using RStudio 2021.09.1-372. The variables having a strong correlation with the sea level were chosen as potential inputs for modelling sea level variations. These potential inputs were used for the development of FFNN and WTNN.

Feed-forward neural network

ANNs, sometimes known as NNs, are computing systems that work like a human brain. They learn from experience rather than from explicit programming. The system is made up of artificial neurons, which are a collection of nodes. Each neuron sends a signal to the next neuron, and the output of each neuron is determined by a non-linear function of its inputs. The weights of the neurons change as they learn. Signals pass through hidden layers from the first (input) layer to the final (output) layer. This approach estimates the most likely value of an output variable by learning from the training data and then simulating the testing conditions. In this work, the feed-forward backpropagation method has been used to adjust the connection weights to compensate for each error found during learning. The multi-layered network trained by the backpropagation algorithm has been applied extensively to solve various engineering problems. The Levenberg–Marquardt algorithm (LMA) is used for training the network. The LM method is among the most widely used in NN training because it converges faster than plain gradient descent (Nourani et al. 2009). The LMA is a popular trust region algorithm that is used to find a minimum of a function (either linear or non-linear) over a space of parameters. Essentially, a trusted region of the objective function is internally modelled with some function such as a quadratic. When an adequate fit is found, the trust region is expanded. As with many numerical techniques, the Levenberg–Marquardt method can be sensitive to the initial starting parameters. In traditional Levenberg–Marquardt implementations, finite differences are used to approximate the Jacobian, which is the matrix of all first-order partial derivatives of the function being optimized. This approximation is convenient, as the user needs to supply only a single function to the library.
When the input is given, weights are allocated to the neurons based on the performance of the various inputs, and the weights are updated based on the error values. The development of FFNN models has been carried out using MATLAB 9.8 R2020a.
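The training procedure described above can be sketched in Python as a one-hidden-layer feed-forward network fitted with a basic Levenberg–Marquardt loop and a finite-difference Jacobian. This is an illustrative sketch, not the MATLAB implementation used in the study: the network size, the damping schedule (factor of 10), the synthetic data and all function names are assumptions.

```python
import numpy as np


def unpack(w, n_in, n_hidden):
    """Split the flat parameter vector into layer weights and biases."""
    sizes = [n_in * n_hidden, n_hidden, n_hidden]
    W1, b1, W2, b2 = np.split(w, np.cumsum(sizes))
    return W1.reshape(n_in, n_hidden), b1, W2, b2[0]


def predict(w, X, n_hidden):
    """Forward pass: tanh hidden layer followed by a linear output neuron."""
    W1, b1, W2, b2 = unpack(w, X.shape[1], n_hidden)
    return np.tanh(X @ W1 + b1) @ W2 + b2


def jacobian_fd(fn, w, eps=1e-6):
    """Forward finite-difference Jacobian of the residual vector fn(w)."""
    r0 = fn(w)
    J = np.empty((r0.size, w.size))
    for j in range(w.size):
        w_step = w.copy()
        w_step[j] += eps
        J[:, j] = (fn(w_step) - r0) / eps
    return J


def train_lm(X, y, n_hidden=4, n_iter=100, mu=1e-2, seed=0):
    """Minimise the sum of squared errors with a damped Gauss-Newton (LM) step."""
    rng = np.random.default_rng(seed)
    n_params = X.shape[1] * n_hidden + 2 * n_hidden + 1
    w = rng.normal(0.0, 0.5, n_params)

    def residual(v):
        return predict(v, X, n_hidden) - y

    r = residual(w)
    history = [float(r @ r)]
    for _ in range(n_iter):
        J = jacobian_fd(residual, w)
        step = np.linalg.solve(J.T @ J + mu * np.eye(n_params), -J.T @ r)
        r_new = residual(w + step)
        if r_new @ r_new < r @ r:
            # Step accepted: move and trust the local quadratic model more.
            w, r, mu = w + step, r_new, max(mu * 0.1, 1e-10)
        else:
            # Step rejected: increase damping (shorter, gradient-like step).
            mu *= 10.0
        history.append(float(r @ r))
    return w, history


# Synthetic example with three inputs (standing in for SSS, SP and wind speed).
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3))
y = np.tanh(X @ np.array([0.8, -0.5, 0.3])) + 0.05 * rng.normal(size=120)

weights, history = train_lm(X, y)
print(f"SSE before training: {history[0]:.3f}, after: {history[-1]:.3f}")
```

Because a step is accepted only when it lowers the error, the recorded sum of squared errors never increases; the damping term `mu` plays the role of the trust region size, shrinking when steps succeed and growing when they fail.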

Wavelet Transform Neural Network

A hybrid model was created by combining wavelet-transformed inputs with ANNs to improve the model's efficacy. This type of NN is called a wavelet transform neural network (WTNN). The wavelet transform has a wide range of applications in signal and image analysis and denoising. It is concerned with the expansion of functions in terms of a set of basis functions.

Discrete wavelet transform (DWT) takes less time and is easier to implement than the traditional continuous wavelet transform (CWT), which involves a large amount of processing work and data (Adamowski & Chan 2011). To obtain a time-scale signal in the DWT, digital filtering techniques are used. The wavelet algorithm is used to derive detailed coefficients and approximation series from the original time series after passing it through high-pass and low-pass filters (Gurley & Kareem 1999). There are varieties of mother wavelets and the authors compared four popular mother wavelets in their study: Daubechies (Db), Symlet (Sym), Discrete Meyer (dMey), and Haar (Nourani et al. 2009; Adamowski & Chan 2011). Daubechies wavelets were chosen as the study's mother wavelets because they yield the best results (Maheswaran & Khosa 2012).
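The filtering step of the DWT can be illustrated with a minimal Python sketch using the Haar wavelet (equivalent to Db1), chosen here only because its low-pass and high-pass filters are two taps long. The study itself adopted Db3; the function names and the synthetic monthly series below are assumptions.

```python
import numpy as np


def haar_dwt(signal):
    """One DWT level: low-pass output = approximation, high-pass output = detail."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass filter + downsampling
    detail = (even - odd) / np.sqrt(2.0)   # high-pass filter + downsampling
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar DWT level."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def haar_wavedec(signal, level):
    """Multi-level decomposition, returned as [a_L, d_L, ..., d_1]."""
    details = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(level):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Synthetic monthly series of 336 points (same length as 1993-2020 monthly data).
months = np.arange(336)
series = 0.01 * months + np.sin(2 * np.pi * months / 12)

a3, d3, d2, d1 = haar_wavedec(series, 3)
print([c.size for c in (a3, d3, d2, d1)])
```

Each level halves the length of the approximation series, and because the Haar filters are orthonormal the total energy of the coefficients equals that of the original signal.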

Decomposition process

Decomposition breaks the input parameters down into different frequency bands. It helps in finding the optimum form of the input to be used for model formation, considering the scaling and shifting factors. Wavelet decomposition provides a complete representation of the signal and performs decomposition according to both scale and orientation. When developing a wavelet-based ANN model, it is necessary to determine the most suitable decomposition level from 1 to M. Theoretically, the maximum decomposition level (M) can be calculated as M = log2(N), where N is the length of the series. Using this formula, the level of the decomposition was found to be 2, hence it has been used.

Using proper formulas, the minimum and the maximum number of decomposition levels were calculated. The lowest level of decomposition can be calculated using the equation (Reddy et al. 2022):
(2)
where L is the decomposition level and Ns is the signal length (number of data points).
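As a worked illustration of the two bounds, the short Python sketch below evaluates the theoretical maximum level, floor(log2(N)), and a rule that is commonly used in wavelet-ANN studies for the lowest level, L = int[log10(Ns)]. That this rule is the exact form of Equation (2) is an assumption; the series length of 336 months is the 270 training plus 66 testing months of the 1993–2020 record.

```python
import math


def max_decomposition_level(n):
    """Theoretical maximum level: largest integer M with 2**M <= n."""
    return int(math.floor(math.log2(n)))


def min_decomposition_level(n):
    """Commonly used empirical rule L = int[log10(n)] (assumed form of Eq. 2)."""
    return int(math.log10(n))


n_months = 336  # 270 training months + 66 testing months

print(max_decomposition_level(n_months))  # 8
print(min_decomposition_level(n_months))  # 2
```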
The wavelet transform produced two signals when applied to the original time series: approximations (A) and details (D). Approximations are signals with a large scale and a low frequency. Details are signals with a small scale and a high frequency. The approximation is the most important component because it carries the identity of the signal, whereas the details capture its finer nuances (Krishna et al. 2011; Kumar et al. 2019). The decomposed signals were fed into the ANN model, and the outputs were added together to obtain the final result. The methodology flowchart for model development with FFNN and WTNN is shown in Figure 2.
Figure 2

Flowchart showing the steps involved in modelling.


First, all the inputs that correlated well with the mean sea level were chosen. The potential input parameters were SSS, SP, and wind speed. Then, the datasets were divided into 80% training and 20% testing data. The model formulation was carried out using two different NNs: ANN and WTNN. The decomposition level was then calculated using the formula given above. The Daubechies mother wavelet was used in the further analysis. In the model formation, the network is first trained and then tested. The network is simulated and the output variable is estimated.

Performance indices

The model evaluation has been done using statistical parameters, i.e. Root Mean Square Error (RMSE), Nash–Sutcliffe Efficiency (NSE), and Correlation Coefficient (r). The ranking of each model was calculated using the compromise programming (CP) technique (Sireesha et al. 2020). It is a multiple criteria decision-making (MCDM) technique. The basic idea in CP is to identify an ideal solution. CP selects a non-dominated preferred solution from a feasible data set on the basis of the solution's closeness to an infeasible ideal point. Based on the criteria given in Table 1 (Sithara et al. 2020), each model has been classified as very good, good, or satisfactory. Formulas used for calculating the different performance indices are as follows:
(3)
(4)
(5)
where is the observed SSH; is the simulated SSH; is the mean SSH; p is the sample length; is the ideal value and is the original value.
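The three indices can be sketched in Python using their standard textbook definitions, which are assumed here to match Equations (3)–(5). The function names and the small example arrays are illustrative and are not data from the study.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    return float(np.corrcoef(obs, sim)[0, 1])


# Illustrative values only.
observed = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
simulated = np.array([1.1, 1.9, 3.2, 3.8, 5.0])

print(f"RMSE = {rmse(observed, simulated):.4f}")
print(f"NSE  = {nse(observed, simulated):.4f}")
print(f"r    = {pearson_r(observed, simulated):.4f}")
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model is no better than predicting the observed mean.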
Table 1

Performance evaluation criteria (Sithara et al. 2020)

Parameter | Criteria | Performance category
R2 | R2 > 0.5 | Acceptable
R2 | R2 > 0.75 | Very good
NSE | NSE ≤ 0.50 | Unsatisfactory
NSE | 0.50 < NSE ≤ 0.65 | Satisfactory
NSE | 0.65 < NSE ≤ 0.75 | Good
NSE | NSE > 0.75 | Very good

Along with the performance indices, graphical comparisons of the model performances have also been carried out. For that, a random walk test, heat map, violin plot and scatter diagram have been included. A heat map shows the best model using different colours based on the ranks given by the performance indices. The colour variation can be in hue or intensity, indicating whether the occurrence is clustered or varies over time. It is used to rank models according to their performance. For data visualization and quality control, heat maps are commonly employed in expression analysis investigations (Zhao et al. 2014). A violin plot is a type of quantitative data visualization. It is similar to a box plot, except that a rotated kernel density plot is drawn on each side. According to random walk theory, a variable or phenomenon does not follow a pre-existing trend; the theory originates in finance, where it implies that the market cannot be outperformed without taking on greater risk. A random walk is a set of discrete, fixed-length steps that move in random directions (Roshni et al. 2020). The scatter diagram displays one set of data on each axis and helps in determining the relation between the calculated and the ideal values. If the variables are well related, the points lie close to the 1:1 line. The scatter plot is one of the seven basic tools of quality.

Breakpoint analysis

It is evident from Figure 3 that the sea level along the coast of Chennai has fluctuated considerably over the past years. In order to identify where the sea level had a change point, a breakpoint detection test was carried out. A breakpoint in a time series can be characterized by its timing and intensity. The breakpoint detection test was carried out using the Chow method in RStudio. The Chow method indicates whether one regression line or two regression lines are required to fit the data appropriately. Based on the time period at which abnormal changes take place, it gives a major breakpoint. The Chow method takes a series of data and detects the changes in the mean. The results for the assumed breakpoints, with their F-statistics and P-values, are shown in Table 2. The results show multiple breakpoints, with a significant breakpoint in the year 1994 with an F-statistic of 39.6696 and a P-value of 8.55609E-09. The P-values are very low at the tested years, which implies that breakpoints exist at those dates (Khanam et al. 2017). The results of the time series analysis of the MSL at the Chennai coast (Figure 3) are also in accordance with the breakpoint analysis. The F-statistic (Equation (1)) for the year 1994 is the highest among all years, which shows that this year has the maximum chance of being a major breakpoint. It also has the lowest P-value (probability), indicating that the null hypothesis is to be rejected (see the 1994 row in Table 2).
Table 2

Breakpoint check by the Chow test at the Chennai coast

S. No. | Assumed breakpoint (year) | F-statistic | P-value (probability)
1 | 1927 | 12.99 | 0.000494342
2 | 1928 | 9.66 | 0.002464821
3 | 1993 | 38.26636 | 1.43067E-08
4 | 1994 | 39.6696 | 8.55609E-09
5 | 1995 | 39.04055 | 1.0766E-08
6 | 1997 | 33.71456 | 7.89438E-08
7 | 1998 | 32.0547 | 1.49566E-07
8 | 2001 | 27.84859 | 7.87232E-07
9 | 2002 | 21.53741 | 1.07605E-05
10 | 2005 | 25.79382 | 1.81321E-06
11 | 2006 | 26.88946 | 1.15982E-06
12 | 2007 | 35.42913 | 4.11783E-08
Figure 3

Plot showing the breakpoint of Chennai for the time period (1916–2015).


It can also be seen from Figure 3 that before the breakpoint, the rate of SLR was 0.2 mm/year and that it increased to 4.1 mm/year after the year 1994. This change in the slope created an interest in further study of this region. The rate of rise has drastically increased in the last 2–3 decades. The dotted line shows the breakpoint in 1994. This does not mean that the extreme hydrological events of that particular year alone led to the abrupt, long-lasting changes in the time series. A number of factors affect the SLR, such as melting of glaciers, SSS, wind speed, SP, etc. During the last 2–3 decades, many disasters have been witnessed along the coastal areas, but these are not the only reason for the sudden rise in the sea level. Such extreme events would have produced only a momentary surge over a shorter period of time. Keeping in mind the average increase in the mean sea level at the global scale, a threshold increase of 2–3 mm was adopted. When this limit is crossed, it is flagged as a breakpoint. The Chow test gave a number of breakpoints, but the significant one was in the year 1994.

Trend analysis

The trend analysis was performed to detect the pattern of the mean sea level along the Chennai coast. The yearly variations for the period 1916–2015 have been plotted along with their linear trends and are displayed in Figure 4. It was found that the sea level is increasing at a rate of 0.41 mm/year. The long-term sea level trend along the Chennai coast is calculated using the available tide gauge data. Since the observed data were not available for the whole time period, there were some gaps in the time series, which may lead to errors. The trendlines for the observed tide gauge data and for the satellite data have been analyzed and compared. The plot made from the tide gauge data shows moving averages of 5, 9, and 15 years.
Figure 4

Plot showing the variation of the MSL and the trendline for Chennai for the tidal gauge station data (1916–2015).
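The linear trend and the moving averages described above can be sketched in Python as follows. The series is synthetic (a 0.41 mm/year trend plus noise), not the tide gauge record, and the function names and the handling of gaps through a finite-value mask are assumptions.

```python
import numpy as np


def linear_trend(years, level):
    """Least-squares slope and intercept, ignoring gaps stored as NaN."""
    years = np.asarray(years, dtype=float)
    level = np.asarray(level, dtype=float)
    valid = np.isfinite(level)
    slope, intercept = np.polyfit(years[valid], level[valid], 1)
    return float(slope), float(intercept)


def moving_average(level, window):
    """Simple moving average over `window` consecutive years (no padding)."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(level, dtype=float), kernel, mode="valid")


# Synthetic annual mean sea level in mm (illustrative only).
rng = np.random.default_rng(42)
years = np.arange(1916, 2016)
level = 7000.0 + 0.41 * (years - 1916) + rng.normal(0.0, 5.0, size=years.size)

slope, intercept = linear_trend(years, level)
print(f"Estimated trend: {slope:.2f} mm/year")

for window in (5, 9, 15):
    smoothed = moving_average(level, window)
    print(window, smoothed.size)
```

A longer window suppresses more of the year-to-year variability but shortens the smoothed series by `window - 1` points.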

It is also evident from Figure 5 that the rate of change of the sea level with respect to 1993 has increased drastically based on the satellite data. The sea level has increased by 0.4 m when compared between 1993 and 2020.
Figure 5

Plot showing the rate of change of the MSL and the trendline of Chennai for the satellite data (1993–2020).


Modelling of sea level variations

Looking at the impacts of climate change on the sea level, predicting the rise in the sea level with better precision is critical. Correlation analysis should be done to find out the major variables affecting the SLR. In order to tackle these problems, modelling of SLR is the first and foremost step that needs to be taken. It would be possible to assess the rate of change in sea level and proper management steps to be taken to decrease the rapid rate.

Correlation among climatic variables

There are many local climatic variables affecting the sea level. Identifying them is of great importance for interpreting which variables are responsible for the large sea level rise in recent years. The yearly SSH, SSS, precipitation (p), SST, SP and wind speed were the variables considered for the correlation analysis to determine the potential predictors. Correlation analysis was performed in RStudio 2021.09.1-372 between the sea level and the other independent variables. The data distribution of the different climatic variables is shown in Figure 6. From Figure 7, it can be inferred that the sea level had a good relation with SP, wind speed, and salinity, which can be considered as potential inputs for the NN model. Data from 1916 onwards have been available for all the different parameters on a yearly basis, as they were observed manually. Monthly data were also available, but with a large data gap. Satellite imagery was not used for data collection until 1993. Therefore, in order to incorporate these data and find their relation, the correlation analysis was performed on a yearly basis. Since 1993, monthly data have been available for the study area, so these have been used in the model formation, as they give higher prediction accuracy than the yearly data.
Figure 6

Data distribution of different climatic variables.

Figure 7

Correlation plot between the climatic variables and sea level.
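The screening of predictors by correlation can be sketched in Python as below. The data are synthetic, constructed so that sea level depends on SSS, SP and wind speed but not on SST or precipitation; the coefficients, their signs, and the selection threshold of |r| ≥ 0.3 are assumptions chosen for illustration and are not values from the study.

```python
import numpy as np


def correlation_with_target(predictors, target):
    """Pearson correlation of each named predictor series with the target."""
    return {name: float(np.corrcoef(values, target)[0, 1])
            for name, values in predictors.items()}


def select_predictors(correlations, threshold=0.3):
    """Keep predictors whose absolute correlation reaches the threshold."""
    return [name for name, r in correlations.items() if abs(r) >= threshold]


# Synthetic standardized series (illustrative only).
rng = np.random.default_rng(7)
n = 336
predictors = {name: rng.normal(size=n)
              for name in ("SSS", "SST", "SP", "wind", "precipitation")}
sea_level = (-0.6 * predictors["SSS"]
             - 0.5 * predictors["SP"]
             + 0.5 * predictors["wind"]
             + 0.3 * rng.normal(size=n))

correlations = correlation_with_target(predictors, sea_level)
selected = select_predictors(correlations)

for name, r in correlations.items():
    print(f"{name:>13s}: r = {r:+.2f}")
print("Selected inputs:", selected)
```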


Sea level prediction

This study compares the observed sea level values with the values predicted by different NN models. It also takes into account the forcing factors that are responsible for the discrepancies between them. Modelling with FFNN and WTNN was adopted to obtain accurate predicted values. SSS, SP, and wind speed were used as potential input variables for the formulation of the models, and the sea level was taken as the target or output variable. The modelling was done for the period 1993–2020. The data were divided into two parts: a training dataset and a testing dataset. The training dataset covers 270 months and the testing dataset covers 66 months. Since the observed data were not available for the whole time period, there were some gaps in the time series, which may lead to errors. Hence, instead of using observed data from 1916, satellite data from 1993–2020 have been used for the modelling.
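The split into training and testing periods can be sketched in Python as follows. It is assumed here that the split is chronological, with the first 270 months used for training and the remaining 66 months for testing; the placeholder arrays and the function name are illustrative.

```python
import numpy as np


def chronological_split(inputs, target, n_train):
    """Use the first n_train samples for training and the rest for testing."""
    train = (inputs[:n_train], target[:n_train])
    test = (inputs[n_train:], target[n_train:])
    return train, test


n_months = 336                      # monthly data, 1993-2020
inputs = np.zeros((n_months, 3))    # placeholder columns: SSS, SP, wind speed
target = np.zeros(n_months)         # placeholder sea level series

(train_X, train_y), (test_X, test_y) = chronological_split(inputs, target, 270)
print(train_y.size, test_y.size, round(train_y.size / n_months, 2))
```

Keeping the testing months at the end of the record avoids leaking future information into the training period, which matters for time series.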

Feed forward neural network

In this approach, the LM algorithm was used with the feed-forward backpropagation method to find the optimum solution. The best model was selected based on the performance indices given by the respective models. The trained model was tested by forecasting sea level values using the remaining 20% of the data. Performance indices such as NSE, RMSE, and r were used to assess the competence of the model. The results of the performance indices are shown in Table 3. The indices showed that further modification using different methods was needed.

Wavelet transform neural network

WTNN models are obtained by combining two methods, DWT and ANN. The inputs of a WTNN are the approximations and details created by applying the DWT to the input data. The signal was decomposed with Db1–Db4, and Db3 was adopted for the present analysis. The inputs given were the combinations (a3 and d1, d2, d3) and (a3 and d1 + d2 + d3). These input datasets for ANN model development provide good agreement between observed and predicted values. Figure 8(a)–8(c) shows the decomposed approximations and details for the wind speed, salinity, and SP, respectively.
Figure 8

The decomposed approximations, details and signals of (a) wind speed (m/s), (b) salinity and (c) surface pressure (kPa) along time being shown for performing the model analysis.


The decomposed approximations and details have been shown on the y-axis for wind speed, pressures, and salinity.

A total of seven models were formed with FFNN and WTNN models with different input combinations and the performance indices have been calculated and shown in Table 3.

Table 3

Performance indices of different models for training and testing datasets

Model/Level/Mother wavelet/Neurons | Input combination | Training RMSE | Training R | Training NSE | Testing RMSE | Testing R | Testing NSE
FFNN/50 (Model 1) |  | 0.038 | 0.819 | 0.657 | 0.039 | 0.807 | 0.629
WTNN/3/Db1/37 (Model 2) | (a3 and d1, d2, d3) | 0.042 | 0.875 | 0.753 | 0.045 | 0.905 | 0.805
WTNN/3/Db2/37 (Model 3) | (a3 and d1, d2, d3) | 0.044 | 0.885 | 0.762 | 0.038 | 0.896 | 0.789
WTNN/3/Db3/39 (Model 4) | (a3 and d1, d2, d3) | 0.038 | 0.933 | 0.851 | 0.041 | 0.949 | 0.879
WTNN/3/Db4/37 (Model 5) | (a3 and d1, d2, d3) | 0.041 | 0.909 | 0.811 | 0.039 | 0.919 | 0.831
WTNN/3/Db3/45 (Model 6) | (a3 and d1 + d2 + d3) | 0.038 | 0.921 | 0.827 | 0.042 | 0.904 | 0.802
WTNN/3/Db4/38 (Model 7) | (a3 and d1 + d2 + d3) | 0.039 | 0.913 | 0.821 | 0.044 | 0.895 | 0.798

Performance analysis of ANN and WTNN models

In order to find the best model for predicting the values, the statistical parameters of both model types (ANN and WTNN) were computed and compared, as shown in Table 3. It was observed that Model 4 (highlighted in Table 3), which is a hybrid model, performed better than the other developed models. The ANN model showed NSE > 0.5 (acceptable) and R > 0.75 (acceptable). The WTNN model yielded very good results with NSE ≥ 0.851 and R ≥ 0.933 (Sithara et al. 2020). The very small RMSE value made the conclusion more reliable. In order to check the efficacy of the developed model, various plots such as the violin plot, heatmap, scatter plot, and random walk test were used.

A violin plot is a hybrid of a box plot and a kernel density plot, which shows peaks in the data. It is used to visualize the distribution of numerical data. Unlike a box plot, which can only show summary statistics, a violin plot depicts both the summary statistics and the density of each variable. Violin plots are used here to compare the observed data and the most relevant models with each other. From the plot, it is clearly visible that Model 4 and Model 6 most closely resemble the observed data. The plot gives the mean, median and outermost values of the data and is shown in Figure 9.
Figure 9

Violin plot showing the mean, median and outliers of the observed and predicted data of different models.

A heatmap was used as a further tool to justify the results. It is a type of graphical data representation that uses color-coding to represent distinct values. The models were ranked by compromise programming (CP) using RMSE, R, NSE and SD (standard deviation); a detailed description of CP can be found in Sireesha et al. (2020). Figure 10 clearly shows that Model 4 was ranked the best among the selected models.
Figure 10

Heatmap showing the ranking of different models.
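
The idea behind a CP ranking can be sketched as follows: each model is scored by its weighted distance from an ideal point made up of the best value of every index, and the model with the smallest distance ranks first. This is only an illustrative sketch, not the authors' procedure: it uses the training-period RMSE, R and NSE from Table 2, omits SD (which is not tabulated here), and assumes equal weights and an exponent p = 2. The function name `cp_distance` is hypothetical.

```python
# Training-period indices from Table 2: (RMSE, R, NSE).
scores = {
    "Model 1": (0.038, 0.819, 0.657),
    "Model 2": (0.042, 0.875, 0.753),
    "Model 3": (0.044, 0.885, 0.762),
    "Model 4": (0.038, 0.933, 0.851),
    "Model 5": (0.041, 0.909, 0.811),
    "Model 6": (0.038, 0.921, 0.827),
    "Model 7": (0.039, 0.913, 0.821),
}
# True where a larger value is better (R, NSE); False where smaller is better (RMSE).
larger_is_better = (False, True, True)


def cp_distance(values, ideal, worst, weights, p=2):
    """Weighted L_p distance from the ideal point, each index scaled to [0, 1]."""
    total = 0.0
    for value, best, bad, weight in zip(values, ideal, worst, weights):
        span = abs(best - bad)
        deviation = abs(best - value) / span if span else 0.0
        total += (weight * deviation) ** p
    return total ** (1.0 / p)


columns = list(zip(*scores.values()))
ideal = [max(c) if up else min(c) for c, up in zip(columns, larger_is_better)]
worst = [min(c) if up else max(c) for c, up in zip(columns, larger_is_better)]
weights = [1 / 3] * 3  # assumed equal weights

distances = {m: cp_distance(v, ideal, worst, weights) for m, v in scores.items()}
ranking = sorted(distances, key=distances.get)  # smallest distance first
print(ranking[0])  # Model 4
```

With these training-period values, Model 4 attains the best value of all three indices, so its distance from the ideal point is zero. Other weights, exponents or index sets could change the order of the remaining models.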

A random walk test is a non-statistical test used to show that variables do not follow a pre-existing trend and may move arbitrarily. It was performed using CP. Figure 11 shows that Model 4 outperforms all other models.
Figure 11

Random walk test analysis to find the best predicted model.

Based on the above performance indices, Model 4, with the input combination (a3 and d1, d2, d3), was found to be the best model. Hence, the time series plot and the scatter plot of the Model 4 results against the observed data set were prepared; the time series plot is shown in Figure 12.
Figure 12

Variation of the observed sea level and predicted sea level using the best model for training and testing periods.

A scatter plot shows the relation between two variables, here the observed and predicted sea levels, and indicates how strongly they are inter-related. The closer the points lie to the line, the better the association. The scatter plot for the best model (WTNN Model 4) is shown in Figure 13.
Figure 13

Scatter plot showing the relation between observed and predicted MSL.


The percentage change in the performance indices relative to the conventional model (FFNN) has been calculated and is shown in Table 4. The percentage gain in NSE of the WTNN models lies in the range (0–29.52)% for the training dataset and (0–39.74)% for the testing dataset, with Model 4 showing the largest gain in both. Model 4 shows no percentage change in RMSE for the training dataset; it matches the lowest training RMSE while improving R and NSE, which supports its selection as the best model for prediction. The percentage change in RMSE is reported as (0–10.52)% for the training dataset and (0–15.38)% for the testing dataset compared to FFNN. Similarly, considerable percentage improvement was observed in the indices during the training and testing periods.

Table 4

Percentage change in the performance indices compared to the FFNN model

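| Model no. | Training % RMSE | Training % R | Training % NSE | Testing % RMSE | Testing % R | Testing % NSE |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | – | – | – | – | – | – |
| 2 | 10.52 | 6.83 | 14.61 | 15.38 | 12.14 | 27.98 |
| 3 | 15.78 | 7.45 | 15.98 | 2.56 | 11.03 | 25.43 |
| 4 | 0 | 13.91 | 29.52 | 5.12 | 17.59 | 39.74 |
| 5 | 7.89 | 10.98 | 23.43 | 0 | 13.87 | 32.11 |
| 6 | 0 | 12.45 | 25.87 | 7.68 | 12.01 | 27.50 |
| 7 | 2.63 | 11.47 | 24.96 | 12.80 | 10.90 | 21.18 |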
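
The entries of Table 4 appear to follow from Table 2 as the absolute change of each index relative to the FFNN value, expressed as a percentage. The sketch below reproduces the two NSE gains quoted for Model 4 under that assumption; the function name `percent_change` is hypothetical.

```python
def percent_change(model_value, baseline_value):
    """Absolute change relative to the baseline, in percent."""
    return abs(model_value - baseline_value) / baseline_value * 100.0


# NSE values from Table 2: FFNN (Model 1) and WTNN Model 4.
ffnn_nse_train, wtnn4_nse_train = 0.657, 0.851
ffnn_nse_test, wtnn4_nse_test = 0.629, 0.879

train_gain = percent_change(wtnn4_nse_train, ffnn_nse_train)
test_gain = percent_change(wtnn4_nse_test, ffnn_nse_test)
print(train_gain, test_gain)
```

The results agree with the 29.52% and 39.74% reported in Table 4 to within rounding in the second decimal place.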

Sea level variations are complex in nature, resulting from different processes and settings. This research examined sea level variations at the Chennai coast through trend analysis and breakpoint analysis, followed by the selection of potential drivers of sea level change. It further assessed and compared the performance of ANN and WTNN models in predicting sea level variations at the Chennai coast.

Trend lines of the mean sea level show an overall increasing trend for the selected time period. The breakpoint analysis by the Chow method indicates a major breakpoint in the year 1994, with the rate of SLR changing from 0.2 to 4 mm/year after that year. Further research is needed to understand and verify the structural change in SLR around 1994; here, the year acts only as a breakpoint after which the sea level has risen continuously. The correlation analysis showed that salinity, wind speed, and surface pressure (SP) are among the major driving forces of sea level variations at the Chennai coast. Sea level was found to be positively correlated with salinity and wind speed and negatively correlated with SP. Seven models were developed to predict the sea level variations using ANN and WTNN. For the WTNN models, different input combinations of approximation and detail components were used. Statistical indices together with graphical indicators were used to compare the developed models, and the input combination a3 and d1, d2, d3 was found to be superior to the others in terms of prediction. It is evident from the graphical interpretations that Model 4 (a3 and d1, d2, d3) outperformed the other developed models, with an NSE 29.52% higher than that of the conventional FFNN model. This result indicates which combination of approximations and details should be used in prediction analysis, which makes future research easier because the best input combination no longer has to be found by trying every permutation and combination.
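
The breakpoint check summarized above can be illustrated with a minimal sketch of the Chow test for a linear trend. The statistic compares the residual sum of squares of one regression fitted to the whole record with that of two regressions fitted before and after the candidate break. The series below is synthetic: it only mimics a change in slope from 0.2 to 4 mm/year at 1994, and the 1980–2009 period, the noise pattern and the function names are assumptions, not the data or code of this study.

```python
def ols_ssr(x, y):
    """Sum of squared residuals of a least-squares straight line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def chow_f(x, y, break_x, k=2):
    """Chow F statistic for a break at break_x (k = parameters per regression)."""
    first = [(xi, yi) for xi, yi in zip(x, y) if xi < break_x]
    second = [(xi, yi) for xi, yi in zip(x, y) if xi >= break_x]
    ssr_pooled = ols_ssr(x, y)
    ssr_split = ols_ssr(*zip(*first)) + ols_ssr(*zip(*second))
    n = len(x)
    return ((ssr_pooled - ssr_split) / k) / (ssr_split / (n - 2 * k))


# Synthetic annual series (mm): slope 0.2 mm/year before 1994, 4 mm/year after,
# plus a small alternating perturbation so the split fits are not exact.
years = list(range(1980, 2010))
level = []
for i, year in enumerate(years):
    if year < 1994:
        trend = 0.2 * (year - 1980)
    else:
        trend = 0.2 * 14 + 4.0 * (year - 1994)
    level.append(trend + (1.0 if i % 2 == 0 else -1.0))

f_stat = chow_f(years, level, 1994)
print(f_stat)
```

The statistic is compared with the critical value of the F distribution with k and n − 2k degrees of freedom; a value well above it indicates a structural break at the candidate year.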

The applicability of different ML techniques to sea level prediction, and the extension of the regional-scale prediction to a global scale, can be set as the future scope of this study. The outcomes of the present study have important implications for research on forecasting the sea level, particularly from the viewpoint of wavelets and ANNs and, more broadly, of other time series and data-based methods. Though the hybrid WTNN models performed well, there is still scope for further improvement through additional studies. This study is region-specific and can be extended to the entire Coromandel coast. Many other factors that influence SLR (e.g. melting of glaciers, the gravitational pull of the moon and the earth) have not been considered in the present study. Nevertheless, the model showed good predictability in the calibration and validation periods using long-term observed data even without these factors.

The authors would like to appreciate the time and effort that the editor and the reviewers dedicated to providing feedback on our manuscript and are grateful for their insightful comments.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Adamowski J. & Chan H. F. 2011 A wavelet neural network conjunction model for groundwater level forecasting. Journal of Hydrology 407 (1–4), 28–40.
Anjali K. & Roshni T. 2022 Linking satellite-based forest cover change with rainfall and land surface temperature in Kerala, India. Environment, Development and Sustainability 24 (9), 11282–11300.
Bamber J. L., Oppenheimer M., Kopp R. E., Aspinall W. P. & Cooke R. M. 2019 Ice sheet contributions to future sea-level rise from structured expert judgment. Proceedings of the National Academy of Sciences 116 (23), 11195–11200.
Band S. S., Karami H., Jeong Y. W., Moslemzadeh M., Farzin S., Chau K. W. & Mosavi A. 2022 Evaluation of time series models in simulating different monthly scales of drought index for improving their forecast accuracy. Frontiers in Earth Science 10, 839527.
Basheer I. A. & Hajmeer M. 2000 Artificial neural networks: fundamentals, computing, design, and application. Journal of Microbiological Methods 43 (1), 3–31.
Beuzen T. & Splinter D. K. 2020 Machine learning and coastal processes. Computer Science. doi:10.1016/b978-0-08-102927-5.00028-x.
Bruneau N., Polton J., Williams J. & Holt J. 2020 Estimation of global coastal sea level extremes using neural networks. Environmental Research Letters 15 (7), 074030.
Church J. A. & White N. J. 2011 Sea-level rise from the late 19th to the early 21st century. Surveys in Geophysics 32 (4), 585–602.
DeConto R. M. & Pollard D. 2016 Contribution of Antarctica to past and future sea-level rise. Nature 531 (7596), 591–597.
Gurley K. & Kareem A. 1999 Applications of wavelet transforms in earthquake, wind and ocean engineering. Engineering Structures 21 (2), 149–167.
Khanam T., Rahman A., Mola-Yudego B. & Pykäläinen J. 2017 Identification of structural breaks in the forest product markets: how sensitive are to changes in the Nordic region? Mitigation and Adaptation Strategies for Global Change 22, 469–483.
Krishna B., Satyaji Rao Y. R. & Nayak P. C. 2011 Time series modelling of river flow using wavelet neural networks. Journal of Water Resources and Protection 03 (1), 50–59. https://doi.org/10.4236/jwarp.2011.31006.
Kumar K., Singh V. & Roshni T. 2019 Efficacy of the hybrid neural networks in statistical downscaling of precipitation of the Bagmati River basin. Journal of Water and Climate Change. https://doi.org/10.2166/wcc.2019.259.
Leahy T. P., Llopis F. P., Palmer M. D. & Robinson N. H. 2018 Using neural networks to correct historical climate observations. Journal of Atmospheric and Oceanic Technology 35 (10), 2053–2059.
Liu J., Shi G. & Zhu K. 2019 High-precision combined tidal forecasting model. Algorithms 12 (3), 65.
Maheswaran R. & Khosa R. 2012 Comparative study of different wavelets for hydrologic forecasting. Computers & Geosciences 46, 284–295.
Nerem R. S., Chambers D. P., Choe C. & Mitchum G. T. 2010 Estimating mean sea level change from the TOPEX and Jason altimeter missions. Marine Geodesy 33 (S1), 435–446.
Nicholls R. J., Wong P. P., Burkett V., Codignotto J., Hay J., McLean R., Ragoonaden S., Woodroffe C. D., Abuodha P. A. O., Arblaster J. & Brown B. 2007 Coastal Systems and Low-Lying Areas.
Nourani V., Alami M. T. & Aminfar M. H. 2009 A combined neural-wavelet model for prediction of Ligvanchai watershed precipitation. Engineering Applications of Artificial Intelligence 22 (3), 466–472.
Oppenheimer M., Glavovic B., Hinkel J., Van de Wal R., Magnan A. K., Abd-Elgawad A., Cai R., Cifuentes-Jara M., Deconto R. M., Ghosh T. & Hay J. 2019 Sea Level Rise and Implications for Low Lying Islands, Coasts and Communities. In: IPCC Special Report on the Ocean and Cryosphere in a Changing Climate. Cambridge University Press, Cambridge, UK and New York, NY, USA, pp. 321–445.
Reddy B. S. N., Pramada S. K. & Roshni T. 2022 Selection of level and type of decomposition in predicting suspended sediment load using wavelet neural network. Acta Geophysica 70 (2), 847–857.
Roshni T., Jha M. K. & Drisya J. 2020 Neural network modelling for groundwater-level forecasting in coastal aquifers. Neural Computing and Applications 32 (16), 12737–12754.
Shamshirband S., Hashemi S., Salimi H., Samadianfard S., Asadi E., Shadkani S. & Chau K. W. 2020 Predicting standardized streamflow index for hydrological drought using machine learning models. Engineering Applications of Computational Fluid Mechanics 14 (1), 339–350.
Sireesha C., Roshni T. & Jha M. K. 2020 Insight into the precipitation behaviour of gridded precipitation data in the Sina basin. Environmental Monitoring and Assessment 192 (11), 1–23.
Wang Y., Xu Y., Tabari H., Wang J., Wang Q., Song S. & Hu Z. 2020 Innovative trend analysis of annual and seasonal rainfall in the Yangtze River Delta, eastern China. Atmospheric Research 231, 104673.
Zhao S., Guo Y., Sheng Q. & Shyr Y. 2014 Advanced heat map and clustering analysis using heatmap3. BioMed Research International 2014. https://doi.org/10.1155/2014/986048.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).