## Abstract

Trinidad has undergone rapid urbanization over the past few decades, and this urbanization has been accompanied by an increase in the country's demand for water. Forecasting water demand can give a better understanding of water consumption behaviour across all sectors of the economy and therefore aid effective water demand management. This study compares seasonal ARIMA (SARIMA) models, exponential smoothing state space (ETS) models, artificial neural network (ANN) models and hybrid combinations of them in developing forecast models for all categories of water consumption in Trinidad. The best forecasting model was selected using three forecast assessment criteria: Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). Forecasts were produced up to the end of December 2021. The results show that hybrid model combinations are adequate for forecasting four of the five categories, while the single SARIMA model was found suitable for the domestic category. Forecast plots revealed an increase in water demand until the end of 2021. The study also demonstrates the suitability of hybrid models for forecasting water demand for the island of Trinidad.

## HIGHLIGHTS

Development of monthly water demand forecasts.

Development of water demand models for each category of water demand for an island in the Caribbean.

Development of the first hybrid combination forecasting model for water consumption in Trinidad.

## INTRODUCTION

Water is an essential part of life. Globally, water consumption has been increasing at twice the rate of population growth over the last century, which has presented several challenges in satisfying the demand for water (UN-Water 2021). According to a study by the United Nations, 1.42 billion people worldwide reside in areas of high water vulnerability (UNICEF 2021). Approximately two thirds of the world population face severe water shortages for at least one month per year (Mekonnen & Hoekstra 2016). In the Caribbean, the World Resources Institute has identified the following islands as having ‘extremely high’ levels of water stress: Dominica, St. Vincent and the Grenadines, Antigua and Barbuda, St. Kitts and Nevis, Barbados and Trinidad and Tobago. Of these, Antigua and Barbuda, Barbados and St. Kitts and Nevis have been categorized as water-scarce islands, with less than 1000 cubic meters of freshwater resources per capita (Chow 2019). Although Trinidad has an abundance of water resources, the country faces a water crisis as it has been unable to meet its demand in recent times. The demand for water was estimated at 393 MCM/year in 2015 with supply at 382 MCM/year, which represents a deficit of 11 MCM/year. This deficit is intensified during the dry season, when precipitation and reservoir levels are low. To address the increase in water demand and the reduction in water supply, various strategies need to be considered in water demand management. One important strategy is to develop a model for forecasting consumption.

There are many approaches to the modelling of water consumption. Worldwide, traditional approaches have utilized methods such as exponential smoothing, linear regression and ARIMA models. However, regression analysis assumes constant variance and normally distributed errors (Sen *et al.* 2003), and most time series methods assume a linear dependence of future values on historic data. ARIMA models, developed by Box & Jenkins (1976), are well suited to modelling the linear behaviour of time series. Exponential smoothing models are easy to apply and perform well in modelling the seasonal patterns in time series data (Gjika *et al.* 2019). More recently, Artificial Neural Networks (ANNs) have been used in forecasting because of advantages such as their capability to model non-linear data (Hornik *et al.* 1989). Several studies have compared traditional forecasting methods with neural network approaches and found that ANNs were better at forecasting water consumption than the traditional techniques. Bougadis *et al.* (2005) analyzed three models for water consumption: time series, ANN and regression. The study was carried out for the city of Ottawa in Canada and included data only for the summer months of 1993–2002. It concluded that the ANN model outperformed the other, conventional techniques. White & Safi (2016) reported that the ANN model performed considerably better than the ARIMA model when the linking function was non-linear, but that the ARIMA model was more effective when the function was linear.

Although ANNs have been found to model non-linear data quite effectively, ANNs alone are, in some situations, incapable of modelling data that exhibit both linear and non-linear characteristics (Zhang 2003). Hybrid models, which are formed by combining different methods, allow accurate modelling of data that comprise both linear and non-linear patterns. Ginzburg & Horn (1993) developed a hybrid model which combined several feedforward neural networks for time series forecasting. Their model was found to improve forecasting accuracy. Luxhoj *et al.* (1996) developed a hybrid model using an econometric and ANN approach for sales forecasting, which was also found to be very efficient.

The main objective of this study is to determine appropriate forecasting models that can adequately forecast water consumption for Trinidad. A study conducted by Ekwue (2009) showed that residential consumption accounted for approximately 40% of total consumption in Trinidad, with the industrial and agricultural sectors accounting for approximately 21% and 3% of total demand, respectively. In this study, ETS, ARIMA and ANN models and hybrid combinations of all three were considered in developing monthly water demand forecasting models for each category (domestic, industrial, commercial, agricultural and ‘other’) based on data from Trinidad for the period 2003–2018. According to Leon *et al.* (2020), water analysis can assist in answering questions related to water consumption problems and water scarcity. A model for forecasting the demand for water will therefore provide a technique for maintaining a supply-demand balance.

## METHODOLOGY

### Data

Historical monthly consumption data from October 2003 to August 2018 across Trinidad were obtained from the Water Resources Agency (WRA), a division of the Water and Sewerage Authority (WASA). The data were sorted into the following categories: domestic, industrial, commercial, agricultural and ‘other’. Figure 1 shows the spread for each category of water consumption for the island. The domestic class is defined as all premises that are used entirely as living quarters, or used either solely or partly for business, and that have not been registered for Value Added Tax. The industrial category refers to property that is used for production activities. The commercial category refers to premises that have been registered for Value Added Tax and are used to conduct commercial and business activities, such as malls and shopping centers. The agricultural category refers to premises used for agricultural operations such as crop and livestock farming, forestry and horticulture. The ‘other’ category comprises cottage and charitable organizations and incorporates non-domestic property that is used partly for business operations or partly as domestic dwellings, as well as domestic premises used for charitable purposes. Data from October 2003 to July 2017 were used as the training sample for model estimation and the remaining data from August 2017 to August 2018 were used as the testing sample. The training set is utilized solely for model development whereas the testing set is utilized primarily for model evaluation.
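The chronological split described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the helper names `month_range` and `chronological_split` are hypothetical, and the values are placeholders standing in for the consumption figures, which are not publicly available.

```python
from datetime import date


def month_range(start, end):
    """Return the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def chronological_split(months, values, first_test_month):
    """Split a monthly series so that every test point follows every training point."""
    cut = months.index(first_test_month)
    return values[:cut], values[cut:]


months = month_range(date(2003, 10, 1), date(2018, 8, 1))
# Placeholder values only; these are not real consumption data.
values = [float(i) for i in range(len(months))]
train, test = chronological_split(months, values, date(2017, 8, 1))
print(len(months), len(train), len(test))  # prints 179 166 13
```

Under these dates the series has 179 monthly observations, of which 166 are used for training and 13 for testing. The split is made by date rather than at random so that the test period is never seen during estimation.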

### Data cleaning

*et al.* 2006). In this study, the non-seasonal part of the data was smoothed using Friedman's Super Smoother, a non-parametric trend estimator based on localized least squares regression with adaptive bandwidths. The technique first calculates *m* different smooths using different bandwidths, each computed from the entire data set. The performance of each smooth at each data point is then measured, and these performance values are themselves smoothed with a fixed-span smoother to give an estimate of performance for every bandwidth at every point (Givens & Hoeting 2013).

The data used for each smooth come from the cross-validated residuals from alternate smooths. The optimal bandwidths are smoothed again and the two estimates with the closest bandwidths are chosen. Final smoothing is done by linear interpolation between both estimates.
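The idea of judging several bandwidths by their cross-validated residuals can be illustrated with a deliberately simplified sketch. It is not Friedman's full algorithm: it picks a single global span instead of a span per data point, and it omits the second smoothing and interpolation steps. The function names and the three candidate spans are assumptions made for illustration.

```python
def running_line(y, span, leave_one_out=False):
    """Fixed-span local linear smooth of y against its index.

    span is the fraction of the series used in each window. With
    leave_one_out=True the point being predicted is excluded from its
    own window, which gives cross-validated fits.
    """
    n = len(y)
    half = max(1, int(span * n / 2))
    fitted = []
    for i in range(n):
        idx = [k for k in range(max(0, i - half), min(n, i + half + 1))
               if not (leave_one_out and k == i)]
        mean_x = sum(idx) / len(idx)
        mean_y = sum(y[k] for k in idx) / len(idx)
        sxx = sum((k - mean_x) ** 2 for k in idx)
        sxy = sum((k - mean_x) * (y[k] - mean_y) for k in idx)
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(mean_y + slope * (i - mean_x))
    return fitted


def choose_span(y, spans=(0.05, 0.2, 0.5)):
    """Pick the span with the smallest mean absolute cross-validated residual."""
    def cv_error(span):
        fits = running_line(y, span, leave_one_out=True)
        return sum(abs(a - f) for a, f in zip(y, fits)) / len(y)
    return min(spans, key=cv_error)


# Toy series: a linear trend with alternating noise (not real data).
demo = [0.1 * t + (0.5 if t % 2 else -0.5) for t in range(60)]
best_span = choose_span(demo)
trend_estimate = running_line(demo, best_span)
```

A production analysis would normally rely on an established implementation of the Super Smoother; the sketch only shows why leave-one-out residuals are a natural performance measure for comparing bandwidths.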

### Data normalization

### Seasonal autoregressive integrated moving average (SARIMA) model

*t*, respectively (Wang & Meng 2012).

A seasonal autoregressive integrated moving average (SARIMA) model is expressed as ARIMA$(p,d,q)(P,D,Q)_{[m]}$, where $(p,d,q)$ represents the non-seasonal component, $(P,D,Q)$ represents the seasonal component of the model and $m$ represents the number of periods in each season. The values of $p$, $d$ and $q$ represent the order of the non-seasonal autoregressive term, the number of non-seasonal differences required to obtain stationarity and the order of the moving average component respectively, whereas the values of $P$, $D$ and $Q$ represent the corresponding values for the seasonal components. The parameters of the model are estimated so that the errors are minimized. Stationarity is required in building a SARIMA model and is characterized by the condition that the mean and autocorrelations are constant over time. Once the model has been estimated, diagnostic checks are implemented to verify that the error assumptions are satisfied.
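The role of the differencing orders can be made concrete with a small sketch. The helper names below are hypothetical and the toy series is invented; the example shows only how one seasonal difference with 12 periods per season removes a repeating monthly pattern and reduces a linear trend to a constant.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sarima_difference(series, d=0, D=0, m=12):
    """Apply D seasonal differences at lag m, then d ordinary differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=m)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Toy series: linear trend plus a pattern that repeats every 12 months.
pattern = [3, 1, 0, -2, -3, -1, 0, 2, 4, 1, -2, -3]
toy = [0.5 * t + pattern[t % 12] for t in range(48)]
stationary = sarima_difference(toy, d=0, D=1, m=12)
print(stationary[:3])  # prints [6.0, 6.0, 6.0]
```

Each seasonally differenced value equals 0.5 × 12 = 6.0, a constant. A constant of this kind after differencing is what a drift term captures in a model that includes one.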

### Exponential state space models

*et al.* (2002) and Taylor (2003). The methods were classified according to their seasonality and trend components. In 2008, Hyndman developed modifications to the method which took into consideration additive and multiplicative errors along with the seasonality and trend components (Holt-Winters method). The ETS model refers to the three components: Error, Trend and Seasonality. This model was selected as it takes into consideration all three components of the time series data. The general model comprises a state space vector $x_t = (\ell_t, b_t, s_t, s_{t-1}, \dots, s_{t-m+1})$ and the exponential smoothing models are written as follows:

$$
\begin{aligned}
\ell_t &= \alpha P_t + (1-\alpha)\, Q_t \\
b_t &= \beta R_t + (\phi - \beta)\, b_{t-1} \\
s_t &= \gamma T_t + (1-\gamma)\, s_{t-m}
\end{aligned}
$$

where $\ell_t$ represents the series level at time $t$, $b_t$ represents the slope at time $t$, $s_t$ represents the seasonal component of the time series at time $t$, and $m$ represents the number of seasons in a year. $P_t$, $Q_t$, $R_t$ and $T_t$ vary according to the seasonal and trend component, and $\alpha$, $\beta$, $\gamma$ and $\phi$ are constants (Gjika *et al.* 2019).
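For the additive damped-trend model with additive errors and additive seasonality, written ETS(A, A_d, A), the recursion can be sketched as follows. This is a minimal illustration using the standard innovations form of that model; the function name, the parameter values and the quarterly toy series are assumptions, not values from this study.

```python
def ets_aada_fitted(y, alpha, beta, gamma, phi, level, trend, seasonal):
    """One-step-ahead fitted values for an ETS(A, Ad, A) model.

    seasonal holds the m initial seasonal states, oldest first, so
    seasonal[0] is the state that applies to the first observation.
    """
    season = list(seasonal)
    fitted = []
    for obs in y:
        forecast = level + phi * trend + season[0]
        error = obs - forecast
        fitted.append(forecast)
        # State updates driven by the one-step forecast error.
        level = level + phi * trend + alpha * error
        trend = phi * trend + beta * error
        # Drop the state just used and queue its update for m steps ahead.
        season = season[1:] + [season[0] + gamma * error]
    return fitted


# Toy quarterly series (m = 4) with no trend; not real data.
quarterly_pattern = [1.0, -1.0, 2.0, -2.0]
toy = [10.0 + quarterly_pattern[t % 4] for t in range(12)]
fits = ets_aada_fitted(toy, alpha=0.3, beta=0.1, gamma=0.2, phi=0.9,
                       level=10.0, trend=0.0, seasonal=quarterly_pattern)
```

Because the toy series matches the initial states exactly, every forecast error is zero and the fitted values reproduce the data. With real data the smoothing parameters and initial states would be estimated, typically by maximum likelihood.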

### The artificial neural network model

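*et al.* 2010). The relationship between the output ($y_t$) and the inputs ($y_{t-1}, \dots, y_{t-p}$) takes the following form (Zhang 2003):

$$
y_t = \alpha_0 + \sum_{j=1}^{q} \alpha_j \, g\!\left(\beta_{0j} + \sum_{i=1}^{p} \beta_{ij}\, y_{t-i}\right) + \varepsilon_t
$$

where $\alpha_j$ ($j = 0, 1, \dots, q$) and $\beta_{ij}$ ($i = 0, 1, \dots, p$; $j = 1, \dots, q$) are called the connection weights, $p$ is the number of input nodes and $q$ is the number of hidden nodes. Data are transferred from the input layer to the output layer by a sigmoid function given by (Zhang 2003):

$$
g(x) = \frac{1}{1 + e^{-x}}
$$

This function is used as the hidden layer transfer function.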
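A single forward pass through such a network, together with a count of its connection weights, can be sketched as below. The function and argument names are illustrative assumptions. The weight count is useful as a check on the network sizes reported in the results, where a 3-10-1 network is stated to have 51 weights.

```python
import math


def logistic(x):
    """Sigmoid transfer function used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: p inputs -> q logistic hidden nodes -> 1 linear output."""
    hidden = [
        logistic(bias + sum(w * x for w, x in zip(weights, inputs)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))


def count_weights(p, q):
    """Connection weights in a p-q-1 network, bias terms included."""
    return (p + 1) * q + (q + 1)


print(count_weights(3, 10), count_weights(4, 10), count_weights(5, 10))  # prints 51 61 71
```

Each hidden node carries one weight per input plus a bias, and the output node carries one weight per hidden node plus a bias, which gives $(p+1)q + (q+1)$ weights in total.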

### Hybrid model

It is often difficult to determine whether water consumption data display linear or non-linear properties, and hence to select the most appropriate statistical method for a specific problem. Real world time series data are seldom purely linear or purely non-linear, so combining different statistical methods allows complex data structures to be modelled more precisely. Hybrid models therefore allow for the modelling of different underlying patterns (trend and seasonality) in the data. Combinations of all three models were utilized to establish whether any of the hybrid combinations would improve forecasting performance.

The proposed model is developed for forecasting water consumption across Trinidad using historical time series consumption data. The steps in model development are shown in Figure 2.
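One common way to form a hybrid forecast is a weighted average of the component forecasts, which is consistent with the per-model weights reported later in Table 6. The sketch below assumes that form; the function name and the three forecast paths are hypothetical, and only the weights (0.342, 0.325 and 0.333, reported for the industrial category) come from this study.

```python
def combine_forecasts(forecasts, weights):
    """Weighted average of several forecast paths of equal length."""
    if len(forecasts) != len(weights):
        raise ValueError("one weight per component model is required")
    total = sum(weights)
    horizon = len(forecasts[0])
    return [
        sum(w * f[h] for w, f in zip(weights, forecasts)) / total
        for h in range(horizon)
    ]


# Hypothetical three-step forecasts on the normalized scale (not real output).
arima_fc = [0.50, 0.52, 0.55]
ets_fc = [0.48, 0.50, 0.53]
nnar_fc = [0.53, 0.51, 0.54]
hybrid = combine_forecasts([arima_fc, ets_fc, nnar_fc], [0.342, 0.325, 0.333])
```

Dividing by the sum of the weights keeps the combination a true average even when rounded weights do not add up to exactly one. Equal weights reduce the hybrid to a simple mean of the component forecasts.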

### Model performance

The accuracy of models’ forecasts can be assessed by taking into consideration model performance using unseen data that have not been used during the training process. Various measures of accuracy have been developed and many authors have utilized these methods in assessing model performance.

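The RMSE, MAE and MAPE are defined as:

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|, \qquad
\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|
$$

where $y_i$ and $\hat{y}_i$ are the $i$th actual and forecast values respectively, and $n$ is the total number of predictions (Wang & Meng 2012). The RMSE has been popular in modelling but has been found to be more sensitive to outliers than the MAE. The RMSE and MAE can be utilized if all forecasts are measured on the same scale and on the same data set. The MAPE is expressed in percentage terms and hence the scale does not pose a problem. The MAPE can be used on different time series data sets on the same or different scales, provided the data set does not contain zeros or very small values (Hyndman & Koehler 2006). The models having the lowest error values are selected as the most accurate. Another method for model evaluation is to calculate the change in RMSE ($\Delta\mathrm{RMSE}$) and the change in MAE ($\Delta\mathrm{MAE}$) between the training and test data sets. The closer the values of $\Delta\mathrm{RMSE}$ and $\Delta\mathrm{MAE}$ are to zero, the more accurate the proposed model (Bamisile *et al.* 2020):

$$
\Delta\mathrm{RMSE} = \mathrm{RMSE}_{\mathrm{train}} - \mathrm{RMSE}_{\mathrm{test}}, \qquad
\Delta\mathrm{MAE} = \mathrm{MAE}_{\mathrm{train}} - \mathrm{MAE}_{\mathrm{test}}
$$

where $\mathrm{RMSE}_{\mathrm{train}}$ and $\mathrm{MAE}_{\mathrm{train}}$ are the training performance metrics and $\mathrm{RMSE}_{\mathrm{test}}$ and $\mathrm{MAE}_{\mathrm{test}}$ are the test performance metrics.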
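The accuracy measures can be computed directly from paired actual and forecast values. The sketch below uses the standard definitions of RMSE, MAE and MAPE and takes the change in error as the training value minus the test value, which matches the sign of the differences reported in Table 7. The function names and the three-point example are illustrative assumptions.

```python
import math


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def error_change(train_metric, test_metric):
    """Training metric minus test metric; values near zero suggest good generalization."""
    return train_metric - test_metric


# Invented values for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 380.0]
print(round(rmse(actual, forecast), 3),
      round(mae(actual, forecast), 3),
      round(mape(actual, forecast), 3))  # prints 14.142 13.333 6.667
```

Applied to the domestic training and test RMSE values in Table 7, `error_change(0.0727, 0.2543)` gives approximately −0.1816, the difference reported there.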

### Index of agreement

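The index of agreement, $d$, is given by:

$$
d = 1 - \frac{\sum_{i=1}^{n}\left(P_i - O_i\right)^2}{\sum_{i=1}^{n}\left(\left|P_i'\right| + \left|O_i'\right|\right)^2}
$$

where $n$ is the number of observations, $P_i$ is the $i$th predicted value, $O_i$ is the $i$th observed value, $O_i'$ is the difference between the $i$th observed value and the average observed value, and $P_i'$ is the difference between the $i$th predicted value and the average predicted value (Khedkar *et al.* 2015).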
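The index of agreement can be sketched as below, following the verbal definition in this section, in which predicted values are centred on the average predicted value. Note that Willmott's original formulation centres both terms on the observed mean, so the two variants can give slightly different values. The function name and example values are assumptions.

```python
def index_of_agreement(observed, predicted):
    """Index of agreement d; 1 indicates perfect agreement.

    Predicted values are centred on their own mean, as described in the
    text; this is an assumption about the variant that was used.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    numerator = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    denominator = sum(
        (abs(p - mean_pred) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - numerator / denominator


# Invented values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
close_fit = [1.1, 1.9, 3.2, 3.9]
d_value = index_of_agreement(observed, close_fit)
```

A fitted series that tracks the observations closely gives a value near one, while a constant prediction equal to the observed mean gives zero under this definition.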

### Trend detection

The Mann-Kendall (MK) test was used to assess whether there was a trend in water consumption over time. The MK test has been used extensively in the assessment of trends in hydrologic time series data (Hamed 2008). The following are the assumptions of the Mann-Kendall test (Tosunoglu & Kisi 2017):

1. The measurements that are observed over time are independent and identically distributed in the absence of a trend.

2. The observations are representative of the real states at the time of measurement.

3. The sampling methods, handling of data and measurement methods are unbiased.

The hypotheses of the MK test are as follows:

$H_0$: No trend is present

$H_1$: There is an upward trend

The null hypothesis is rejected if the calculated standard *Z* value is greater than the standard normal *Z* value at significance level $\alpha$ (Tosunoglu & Kisi 2017).
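The test statistic can be computed with a short routine. The sketch below uses the standard Mann-Kendall formulas for a series without tied values (no tie correction is applied to the variance), and the one-sided 5% critical value of 1.645 quoted later in the results. The function name and the example series are illustrative assumptions.

```python
import math


def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test, assuming no tied values."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    return s, z


# Invented series with a mostly rising pattern (not real data).
s, z = mann_kendall([1.0, 1.2, 1.1, 1.5, 1.7, 1.6, 2.0, 2.3, 2.2, 2.6])
upward_trend = z > 1.645  # one-sided test at the 5% significance level
```

A positive *S* means that later values tend to exceed earlier ones. The continuity correction of one unit in the numerator is part of the standard normal approximation.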

## RESULTS AND DISCUSSION

Trinidad is the most southerly island in the Caribbean with a population of approximately 1.3 million people. The island has two distinct seasons: a wet season which runs from June to November and a dry season which spans December to May. During the dry season, temperatures can reach as high as 36 °C during the day and about 28 °C at night. The wet season usually has slightly lower temperatures. The seasonal variation between the daytime and night time temperatures is approximately 0.9 °C.

Figure 3 shows plots of water consumption time series data after normalization. Figure 3(a), 3(b) and 3(d) show the plots for the domestic, commercial and agricultural categories, respectively. There are very clear seasonal patterns in water consumption for the domestic, commercial and agricultural categories. The highest consumption is during the wet season for the domestic category while the lowest are during the dry months. Agricultural consumption increases during the dry months and decreases during the wet season as farmers would require less water for watering their crops. This fluctuation in seasonal consumption can be amplified under changing climatic conditions. Climate change has the potential to alter meteorological conditions and hence have a significant impact on water resources (Nazari-Sharabian *et al.* 2018) as demonstrated in the case study conducted by Nazari-Sharabian *et al.* (2019) on the Mahabad Dam watershed in Iran. Water consumption data for the ‘other’ category are shown in Figure 3(e). An obvious linear upward trend is observed whereas the industrial category shown in Figure 3(c) exhibits an upward trend from 2003 to 2015 with a sharp downward trend for the two years that follow.

Tables 1–5 summarize the main forecast accuracy measures for each category and Table 6 provides the structure of the optimum model selected for each category. Models were selected based on the minimization of RMSE, MAE and MAPE.

For the domestic category, the highest consumption (1,431,712 m³) was observed in June 2018 and the lowest consumption (164,362.4 m³) in September 2017. The seasonal ARIMA model proved to be the best model based on all the accuracy measures (Table 1). This model was found to be an ARIMA(1,0,3)(0,1,1)$_{[12]}$ with drift, with parameters ar1 = 0.7892, ma1 = −0.902, ma2 = −0.0822, ma3 = 0.4151, sma1 = −0.5001 and drift = 0.0016. Residual diagnostics show that the residuals satisfied the white noise criterion according to the Box-Pierce test.

For the commercial category, the highest consumption (6,722,828 m³) was observed in July 2017 and the lowest consumption (709,516.4 m³) in February 2005. For this category, the ARIMA-ETS-NNAR model outperformed all single models and two-model hybrid combinations (Table 2). For the seasonal ARIMA component, the results indicate that the best-fit model was the ARIMA(0,1,2)(2,0,0)$_{[12]}$ with parameters ma1 = −0.7543, ma2 = 0.2730, sar1 = 0.2729 and sar2 = 0.2901. For the exponential smoothing component, the model was an ETS(A, A$_d$, A), which is an additive damped trend method with additive errors; it is specified by its smoothing parameters, initial level, initial trend and initial state vector. For the neural network component, the results indicate that the model was NNAR(2,1,10)$_{[12]}$. This three-layer neural network model is an average of 250 networks, each of which is a 3-10-1 network with 51 weights. Residual diagnostics for the hybrid model indicate that the residuals satisfy the white noise criterion according to the Box-Pierce test (0.3615).

For the industrial category, there was a sharp decrease in consumption in 2015–2017 followed by an increase thereafter. The lowest consumption (201,730 m³) was observed in October 2017 and the highest consumption (4,750,722 m³) was recorded in November 2017. The best-fitting model for this category was a hybrid ARIMA-ETS-NNAR model (Table 3) with weights of 0.342, 0.325 and 0.333 respectively (Table 6). The model for the SARIMA component was ARIMA(0,1,1)(1,0,0)$_{[12]}$ with parameters ma1 = −0.6449 and sar1 = 0.2942. The exponential smoothing component is specified by its smoothing parameters, initial level, initial trend and initial state vector. The model for the neural network component was NNAR(2,1,10)$_{[12]}$. This neural network model is an average of 250 networks, each of which is a 3-10-1 network with 51 weights. Residual diagnostics indicate that the residuals satisfy the white noise criterion according to the Box-Pierce test.

The agricultural category displays increased consumption during most of the dry season from January to May and lower consumption from June to December, as expected. The lowest consumption value was in February 2005 (16,424.62 m³) and the highest consumption was reported in June 2017 (388,933.5 m³). The most efficient model for the agricultural category was found to be a hybrid ARIMA-ETS-NNAR model (Table 4) with weights for the seasonal ARIMA, ETS and NNAR models of 0.325, 0.349 and 0.326 respectively (Table 6). The model for the SARIMA component was ARIMA(0,1,1)(2,0,0)$_{[12]}$ with parameters ma1 = −0.7107, sar1 = 0.1402 and sar2 = 0.2357. For the NNAR component, the model was NNAR(4,1,10)$_{[12]}$. This neural network model is an average of 250 networks, each of which is a 5-10-1 network with 71 weights. The ETS component was an ETS(A, A$_d$, A) model, specified by its smoothing parameters, initial level, initial trend and initial state vector. Model diagnostics indicate that the residuals satisfy the white noise criterion according to the Box-Pierce test.

The ‘other’ category displays a linear increasing trend over the years. The lowest consumption value was in March 2004 (22,480.06 m³) and the maximum consumption occurred in July 2017 (142,441 m³). From Table 5, although the ETS model had a lower value for RMSE, the ETS-NNAR model was found to be the best fit based on the values of MAE and MAPE. The fitted model was an ETS-NNAR model with equal weights. The exponential smoothing component was an ETS(A, A$_d$, A) model, which represents an additive damped trend method with additive errors; it is specified by its smoothing parameters, initial level, initial slope and initial state vector. For the NNAR component, the model was NNAR(3,1,10)$_{[12]}$. This neural network model is an average of 250 networks, each of which is a 4-10-1 network with 61 weights. The results from the Box-Pierce test indicate that the residuals are white noise.

The Anderson-Darling test for normality was carried out to evaluate whether the residuals of all selected models were normally distributed. The normality tests yielded the following *p*-values: <2.2 × 10$^{-16}$, 0.8038, 0.2353, 0.0039 and 0.0708 for the domestic, commercial, industrial, agricultural and ‘other’ categories respectively. These results indicate that the normality assumption was satisfied for the commercial, industrial and ‘other’ categories but not for the domestic and agricultural categories. However, normality is a useful but not necessary condition for residuals (Hyndman & Athanasopoulos 2021) and models can still yield satisfactory results.

Table 7 summarizes the training and test errors for all five categories. Even though all models had negative values for ΔRMSE and ΔMAE, which indicate overfitting of the models during the training phase, the models for the industrial, commercial, agricultural and ‘other’ categories proved to be adequate, as their values of ΔRMSE and ΔMAE were very close to zero. The SARIMA model, although the most suitable model for the domestic category, displayed larger changes between training and test errors than all other models and hence was not as effective as the others at generalizing to unseen data. It can also be deduced that the hybrid models proved to be the best models for forecasting water consumption, based on the accuracy measures, for four out of five categories.

Forecasts were performed for all categories up to December 2021 (a 40-month horizon). The observed values, fitted values, point forecasts and the 80% and 95% confidence intervals are shown in Figure 4. The plots show that the fitted values are close to the observed values, which validates the utilization of all models. This is confirmed by the values of the index of agreement (*d*) between the observed and fitted values of each model (Table 8). All values were close to one, which indicates generally good agreement between the actual and fitted values. The models for the industrial, commercial and ‘other’ categories demonstrated an even better agreement between the fitted model and observations (*d* > 0.900) than the models for the agricultural (*d* = 0.8652) and domestic (*d* = 0.8524) categories.
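The white noise check on the residuals relies on the Box-Pierce statistic, which sums the squared residual autocorrelations up to a chosen lag. The sketch below computes only the statistic *Q*; the function names, the toy residuals and the choice of 10 lags are assumptions, and the comparison value 18.307 is the 95th percentile of a chi-square distribution with 10 degrees of freedom. In practice the degrees of freedom are usually reduced by the number of fitted model parameters.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [e - mean for e in residuals]
    denominator = sum(c * c for c in centred)
    numerator = sum(centred[t] * centred[t - lag] for t in range(lag, n))
    return numerator / denominator


def box_pierce(residuals, max_lag):
    """Box-Pierce Q statistic: n times the sum of squared autocorrelations."""
    n = len(residuals)
    return n * sum(autocorrelation(residuals, k) ** 2 for k in range(1, max_lag + 1))


# Toy residuals that alternate in sign, so they are clearly not white noise.
alternating = [1.0, -1.0] * 20
q_stat = box_pierce(alternating, max_lag=10)
looks_like_white_noise = q_stat < 18.307  # chi-square critical value, 10 df, 5% level
```

For the alternating residuals *Q* is far above the critical value, so the white noise hypothesis would be rejected. Residuals from an adequate model should instead give a small *Q* and a large *p*-value.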

Table 1. Forecast accuracy measures for the domestic category

| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| SARIMA | 0.2543 | 0.2102 | – |
| ETS | 0.3081 | 0.2521 | – |
| NNAR | 0.2849 | 0.2376 | – |
| ARIMA-ETS | 0.2739 | 0.2228 | – |
| ARIMA-NNAR | 0.2575 | 0.2110 | – |
| ETS-NNAR | 0.2765 | 0.2279 | – |
| ARIMA-ETS-NNAR | 0.2652 | 0.2203 | – |

Table 2. Forecast accuracy measures for the commercial category

| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| SARIMA | 0.0338 | 0.0247 | 12.2500 |
| ETS | 0.0538 | 0.0499 | 23.4234 |
| NNAR | 0.1917 | 0.1817 | 92.2916 |
| ARIMA-ETS | 0.0422 | 0.0359 | 16.3573 |
| ARIMA-NNAR | 0.0538 | 0.0489 | 24.5694 |
| ETS-NNAR | 0.0428 | 0.0377 | 18.8122 |
| ARIMA-ETS-NNAR | 0.0315 | 0.0247 | 12.2500 |

Table 3. Forecast accuracy measures for the industrial category

| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| SARIMA | 0.0427 | 0.0325 | 6.3835 |
| ETS | 0.0453 | 0.0374 | 6.4284 |
| NNAR | 0.0716 | 0.0618 | 11.8761 |
| ARIMA-ETS | 0.0400 | 0.0330 | 6.5185 |
| ARIMA-NNAR | 0.0489 | 0.0427 | 8.3303 |
| ETS-NNAR | 0.0469 | 0.0418 | 8.1298 |
| ARIMA-ETS-NNAR | 0.0405 | 0.0330 | 6.5044 |

Table 4. Forecast accuracy measures for the agricultural category

| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| SARIMA | 0.0398 | 0.0339 | 33.2996 |
| ETS | 0.0328 | 0.0299 | 32.5427 |
| NNAR | 0.0770 | 0.0688 | 93.6701 |
| ARIMA-ETS | 0.0361 | 0.0317 | 32.6562 |
| ARIMA-NNAR | 0.0342 | 0.0262 | 40.0241 |
| ETS-NNAR | 0.0363 | 0.0313 | 32.5900 |
| ARIMA-ETS-NNAR | 0.0275 | 0.0203 | 30.3825 |

Table 5. Forecast accuracy measures for the ‘other’ category

| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| SARIMA | 0.0711 | 0.0635 | 21.1879 |
| ETS | 0.0698 | 0.0606 | 20.0727 |
| NNAR | 0.0816 | 0.0641 | 19.4683 |
| ARIMA-ETS | 0.0700 | 0.0619 | 20.5888 |
| ARIMA-NNAR | 0.0719 | 0.0598 | 18.8920 |
| ETS-NNAR | 0.0707 | 0.0583 | 18.3016 |
| ARIMA-ETS-NNAR | 0.0700 | 0.0600 | 19.2637 |

Table 6. Structure of the optimum model selected for each category

| Category | Model | Structure | Weights |
|---|---|---|---|
| Domestic | ARIMA | ARIMA(1,0,3)(0,1,1)$_{[12]}$ with drift | – |
| Industrial | ARIMA-ETS-NNAR | ARIMA(0,1,1)(1,0,0)$_{[12]}$ | 0.342 |
| | | ETS(A, A$_d$, A) | 0.325 |
| | | NNAR(2,1,10)$_{[12]}$ | 0.333 |
| Commercial | ARIMA-ETS-NNAR | ARIMA(0,1,2)(2,0,0)$_{[12]}$ | 0.333 |
| | | ETS(A, A$_d$, A) | 0.333 |
| | | NNAR(2,1,10)$_{[12]}$ | 0.333 |
| Agricultural | ARIMA-ETS-NNAR | ARIMA(0,1,1)(2,0,0)$_{[12]}$ | 0.325 |
| | | ETS(A, A$_d$, A) | 0.349 |
| | | NNAR(4,1,10)$_{[12]}$ | 0.326 |
| Other | ETS-NNAR | ETS(A, A$_d$, A) | 0.5 |
| | | NNAR(3,1,10)$_{[12]}$ | 0.5 |

Table 7. Training and testing errors for the selected model in each category

| Category | Training RMSE | Training MAE | Testing RMSE | Testing MAE | ΔRMSE | ΔMAE |
|---|---|---|---|---|---|---|
| Domestic | 0.0727 | 0.0408 | 0.2543 | 0.2102 | −0.1816 | −0.1694 |
| Industrial | 0.0404 | 0.0304 | 0.0405 | 0.0330 | −0.0001 | −0.0026 |
| Commercial | 0.0134 | 0.0105 | 0.0315 | 0.0347 | −0.0181 | −0.0242 |
| Agricultural | 0.0098 | 0.0076 | 0.0275 | 0.0203 | −0.0177 | −0.0127 |
| Other | 0.0366 | 0.0290 | 0.0698 | 0.0606 | −0.0332 | −0.0316 |

Table 8. Index of agreement between observed and fitted values for the selected models

| Category | Model structure | Index of agreement (*d*) |
|---|---|---|
| Domestic | ARIMA(1,0,3)(0,1,1)$_{[12]}$ with drift | 0.8524 |
| Industrial | ARIMA(0,1,1)(1,0,0)$_{[12]}$ | 0.9335 |
| | ETS(A, A$_d$, A) | |
| | NNAR(2,1,10)$_{[12]}$ | |
| Commercial | ARIMA(0,1,2)(2,0,0)$_{[12]}$ | 0.9677 |
| | ETS(A, A$_d$, A) | |
| | NNAR(2,1,10)$_{[12]}$ | |
| Agricultural | ARIMA(0,1,1)(2,0,0)$_{[12]}$ | 0.8652 |
| | ETS(A, A$_d$, A) | |
| | NNAR(4,1,10)$_{[12]}$ | |
| Other | ETS(A, A$_d$, A) | 0.9726 |
| | NNAR(3,1,10)$_{[12]}$ | |

All five forecast graphs display an increasing trend in consumption until the end of December 2021. This was confirmed by using the MK test to assess the trend for all categories. The MK test yielded values for *S* of 1.969 × 10$^{-3}$, 2.335 × 10$^{-3}$, 2.087 × 10$^{-3}$, 1.791 × 10$^{-3}$ and 1.743 × 10$^{-0}$, which indicate an increasing trend in the forecast time series (*S* > 0) for the domestic, commercial, industrial, agricultural and ‘other’ categories respectively until the end of 2021. The *z*-values were calculated as 4.3022, 5.1023, 4.5602, 3.9131 and 3.8082 for the domestic, commercial, industrial, agricultural and ‘other’ categories respectively, which leads to rejection of the null hypothesis that no trend is present at the 5% significance level (*z*-value > 1.645). These findings therefore indicate an increase in projected water consumption for both the wet and dry seasons.

The results obtained from this study demonstrate that the proposed method can be applied to forecast monthly water demand in any location, provided that sufficient data are available for the analysis. Past studies have utilized single models or two-model hybrid combinations of the ARIMA, ETS and NNAR models for forecasting water consumption. A study conducted by Ristow *et al.* (2021) for the city of Joinville in Brazil used only the ETS and ARIMA models to forecast water consumption. Another study, conducted by Mukhairez & El-Halees (2018) for the Khan Younis municipality in Palestine, utilized a hybrid ARIMA and neural network model. Our study used the single ARIMA, ETS and NNAR models, their two-model hybrid combinations and the three-model hybrid combination, and demonstrated the superiority of the hybrid models for water consumption forecasting. From this perspective, the approach may have the potential to be generalized to forecast water demand for different scenarios.

## CONCLUSION

In this study, water consumption data were analysed for the period October 2003–August 2018 across Trinidad, Trinidad and Tobago. The forecasting of water consumption can aid in developing a balance between the demand for and supply of water. This research utilized single models, namely the ARIMA, ETS and NNAR models, as well as hybrid combinations of all three models, in forecasting water consumption until the end of December 2021 for all categories of demand across Trinidad. The three-model hybrid combination ARIMA-ETS-NNAR was found to be adequate in modelling water consumption for the commercial, industrial and agricultural categories, the two-model hybrid combination ETS-NNAR was found to be suitable for the ‘other’ category, and the SARIMA model was found to be suitable for the domestic category. All models demonstrated a good agreement between the actual and fitted values. Hybrid models were thus found to be the optimum models for four out of five categories. The study also highlights the importance of a forecasting model, as it revealed an increase in the future demand for water.

This short-term forecasting of water demand can be used to support processes such as operational planning and management, and can aid in establishing benchmarks for water demand. The study can be further improved by utilizing a larger data set and by exploring other potential models for forecasting water consumption. A larger training data set can be utilized to reduce overfitting. Future work can explore long-term forecasting methods to forecast water consumption further into the future, and can incorporate other variables into the model, such as population, climatic variables and spatial attributes, contingent on the availability of water demand and supply data for the region.

## DISCLOSURE STATEMENT

No potential conflict of interest was reported by the authors and contributors.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.