## Abstract

In this study, to reflect the effect of large-scale climate signals on runoff, these indices are accompanied with rainfall (the most effective local factor in runoff) as the inputs of the hybrid model. Where one-year in advance forecasting of reservoir inflows can provide data to have an optimal reservoir operation, reports show we still need more accurate models which include all effective parameters to have more forecasting accuracy than traditional linear models (ARMA and ARIMA). Thus, hybridization of models was employed for improving the accuracy of flow forecasting. Moreover, various forecasters including large-scale climate signals were tested to promote forecasting. This paper focuses on testing MARMA-NARX hybrid model to enhance the accuracy of monthly inflow forecasts. Since the inflow in different periods of the year has in linear and non-linear trends, the hybrid model is proposed as a means of combining linear model, Monthly Autoregressive Moving Average (MARMA), and non-linear model, Nonlinear AutoRegressive model with eXogenous (NARX) inputs to upgrade the accuracy of flow forecasting. The results of the study showed enhanced forecasting accuracy through using the hybrid model.

## INTRODUCTION

Improving accuracy of hydrological forecasting, especially long-term forecasting, is an essential yet often difficult task facing modern water resources management. Much recent literature has focused on real-life case studies of contemporary soft computing techniques in hydrological engineering (Chau & Wu 2010; Wu *et al.* 2010; Chen *et al.* 2015; Gholami *et al.* 2015; Taormina *et al.* 2015; Wang *et al.* 2015). Long-term forecasting is of high importance in the effective management of water resources. Since using the existing linear models cannot lead to accurate forecasts (Khashei & Bijari 2011), it is proposed to employ hybrid models for flow forecasting to consider both linear and non-linear components in modeling. Autoregressive moving average (ARMA) and artificial neural network (ANN) have been used in flow forecasting models. Mohammadi *et al.* (2006) and Yurekli *et al.* (2005) applied these models for monthly inflow forecasting. Moreover, many researchers have accomplished forecasting using ANN models (Kilinç & Ciğizoğlu 2003; Banihabib *et al.* 2008; Valipour *et al.* 2013), and Bazartseren *et al.* (2003) compared neural network and neuro-fuzzy models with ARMA and autoregressive with exogenous terms (ARX) to evaluate the models' accuracy in forecasting water levels of two rivers. The most critical point of these studies is the clarification of the advantage and disadvantage of using linear time series and ANN models. For instance, Bazartseren *et al.* (2003) reported that ARMA and ARX models have performed better than neural network and neuro-fuzzy models in short-term forecasts (up to 5 hours) of water level; however, by increasing the forecasting horizon, neural networks and neuro-fuzzy models performed better. Therefore, it is necessary to develop a method which benefits from the advantages of both linear time series and ANN models.

In various research fields excluding water science, many researchers have employed hybrid models to use the benefits of the models simultaneously. For instance, the hybrid model of autoregressive integrated moving average (ARIMA) and ANN was developed by Koutroumanidis *et al.* (2009) in forecasting annual petrol price as well as in hourly and daily forecasting of air quality and meteorological data by Aladag *et al.* (2009) and Diaz-Robles *et al.* (2008). Khashei & Bijari (2011) applied hybrid ARIMA and ANNs to forecast annual sunspot series, and Chau & Wu (2010) developed a hybrid model integrating ANNs and support vector regression for daily rainfall prediction. However, rarely, have some researchers like Banihabib *et al.* (2014) used hybrid models for inflow forecasting. Therefore, there is the possibility of using a hybrid model to benefit from the strengths of ARMA model in forecasting the linear component and ANN model in forecasting the non-linear component to increase forecasting accuracy.

Since studies on large-scale climate signals in Iran are a relatively rare topic (Rivandi *et al.* 2013), and some reports in recent years have expressed the effect of these signals in long-term forecasting (Karamouz & Araghinejad 2005), employing these signals along with rainfall data (as the most important recognized local factor on flow) can enhance flow forecasting. In recent decades, the effect of large-scale climate signals on flow and hydrologic variables have been studied worldwide (Bradbury *et al.* 2002; Fatehi Marj & Borhani Darian 2005; Ashoori *et al.* 2008; Ryu *et al.* 2010). Some researchers like Chandimala & Zubair (2007), Azimi *et al.* (2010), and Rahimi Khoob (2011) have used correlation analysis, data mining method, and regression equations, and others like Fallah Ghalhari & Khoshhal (2010), Borhani & Fatehi (2008), and Wang *et al.* (2006) have forecasted rainfall and river flow. However, few studies have assessed the effect of large-scale climate signals and rainfall on stream flow. According to studies in Iran (Rivandi *et al.* 2013), it is clear that signals such as the El Niño–Southern Oscillation (ENSO), the North Atlantic Oscillation (NAO), and the Persian Gulf Sea-surface temperature (PGSST) and Red Sea-surface temperature (RSST) impacted on the hydrologic variables in some parts of Iran. However, these studies are generally accomplished for a specific season during a year or merely for a three- or six-month period, while no forecasting is accomplished for the horizon of the whole year and on a monthly basis.

Hybridization of ARMA and ANN models can be applied for increasing forecasting accuracy of the variables with linear and non-linear components. Thus, ARMA as a time series model for estimating the linear component of discharge and NARX ANN model for estimating the non-linear component of discharge were used for flow forecasting in this study. The previous investigations indicate advantages of the NARX compared with the non-linear autoregressive neural network (NARNN) and non-linear input-output (NIO) models (Valipour 2016). Moreover, choosing the proper forecasters was another purpose of this study. Note that using hybrid ARMA-ANN models, in recent investigations, represents a reliability aspect of this technique in various fields (Benmouiza & Cheknane 2016; Gairaa *et al.* 2016).

Banihabib *et al.* (2008) aimed to find the relation between some stations in order to forecast the flow, but it was used only from the ANN. The research goal of Valipour *et al.* (2013) is forecasting the inflow of Dez dam reservoir by using ARMA and ARIMA models while increasing the number of parameters to increase the forecast accuracy to four parameters and comparing them with the static and dynamic ANNs.

By investigating the results, it showed that the ARIMA model could be used for forecasting an appropriate monthly inflow for the next 12 months, and dynamic autoregressive ANN model with sigmoid activity function could be used for forecasting discharge of the next five years.

All of the studies which were done showed that ANN and time series models have excellent ability to forecast the flow, but each of the studies used the models singly and did not use the abilities of these models for increasing the ability of forecasting. The data which were used in these models were streamflow data while in this study we attemped to use rainfall and precipitation data and also large-scale signals to reduce the uncertainty of forecasting.

Accordingly, in this study, the efficiency of using some large-scale climate signals such as ENSO, NAO, SST-PG, and SST-RS as global forecaster along with rainfall as local forecaster were examined to forecast the inflow to Dez Reservoir accurately.

Selection of these signals is according to the results of research which have shown the impact of these signals and water surface temperature on the precipitation and streamflow in Iran, spatially in the studied area and its neighbor areas. These studies are such as those of Azimi *et al.* (2010) and Gheiby & Noorafshan (2013) which are mentioned in the sources or are the articles which have presented and published in conferences and journals of Iran.

There were other indices for investigating for this purpose, such as water temperature of the Caspian Sea, but we did not find any research about the impact of it on the study area. Moreover, the Pacific Decadal Oscillation (PDO) is another large-scale index, but we did not use it because past investigations did not show the impact of it in the study area. Indeed, we are trying to use indices that previous reports have declared their impact on in the study area on some hydrological factors, i.e., rainfall.

Over the past year, in advance forecasting of reservoir inflows has offered essential data to operate reservoirs optimally. Reviewing the studies indicates we still require more accurate models which can include active parameters such as large-scale climate signals to achieve more forecasting accuracy than traditional linear models (ARMA and ARIMA). Consequently, the effect of using large-scale climate signals was studied for improving the monthly inflow forecasts with ARMA-NARX hybrid model for the following one-year horizon.

## MATERIALS AND METHODS

### Case study

Dez basin is a part of the middle Zagros Mountains. Regarding the general hydrological classification of Iran, it is considered as a part of the Persian Gulf Basin. The basin is located between 32° 35′ to 34° 07′ northern latitude and 48° 20′ to 50° 20′ eastern longitude in the southwest of Iran.

This area was selected because Dez dam, which is one of the important dams of Iran, has an essential role in the control of the floods of Dez River, irrigating the extended area of the Khuzestan plain, and producing electricity, and so, forecasting the flow of this river can be useful in making management decisions. Also, much research on flow forecasting in this area has been carried out, and because of the importance of this region, we were trying to increase its accuracy. For example, Azimi *et al.* (2010) predicted seasonal flow, and we reduced its timescale to the monthly flow. Other studies such as those of Banihabib *et al.* (2008) and Valipour *et al.* (2013) presented the advantage of time series models such as ARMA or ANN comparing with regression models in order to forecast the flow. Valipour *et al.* (2013) compared ARMA, ARIMA, and ANN models in forecasting monthly streamflow but they did not use the advantages of these models in promoting the forecasting. Also, another reason that encouraged us to select this region was the data of this dam which were accessible for more than 30 years.

Previous investigations show NAO and Southern Oscillation Index (SOI) are useful methods for rainfall and flow forecasting at different sites in Iran (Fatehi Marj & Borhani Darian 2005; Ashoori *et al.* 2008; Borhani & Fatehi 2008) and Dez basin (Azimi *et al.* 2010). As well, the effect of sea surface temperature of the Persian Gulf and the Red Sea on rainfall near the basin was observed (Rahimi Khoob 2011). However, the most effective factor on the inflow had not been evaluated. Therefore, this paper applies these data as a forecaster and assesses the effect of them singularly or in sets of two, three, and four in improving the accuracy of flow forecasting.

In this study, to forecast the reservoir inflow, the flow data of hydrometric stations in the upstream of Dez Reservoir (Telezang station) and rainfall data of three stations (Keshvar Sorkhab, Sepiddasht-Sezar, and Gelerood) were used. The data were obtained from the Water Resources Management Organization (WRMO) of Iran. Unit of flow data is cubic meter per second and unit of precipitation is millimeter.

Figure 1 shows Dez dam basin, the location of the dam, Telezang hydrometric station, and the rainfall gauges. As well as rainfall data, SOI, NAO, PGSST, and RSST were used as forecasters of reservoir inflow. The information of large-scale climate indices was gained from the National Oceanic and Atmospheric Administration (NOAA; http://www.noaa.gov/). These data are dimensionless.

### Research method

In this study, first, the flow discharge is forecasted through ARMA time series model. Then, the residuals of the model and other input data (rainfall, SOI, NAO, PGSST, and RSST indices) were entered into the NARX neural network model to forecast in the final stage.

The flowchart shows the research procedures (Figure 2).

*k*delay and average . where is the values of the parameter in temporal stage

*i*and time delay

*k*(Cryer & Chan 2008).

The ARIMA modeling approach involves the following three steps: model identification, parameter estimation, diagnostic checking. Identification of the general form of a model includes two stages:

If it is necessary, appropriated differencing of the series is performed to achieve stationary and normalit.

The temporal correlation structure of the transformed data is identified by examining its autocorrelation (ACF) and partial autocorrelation (PACF) functions (Mishra & Desai 2005). For determining the number of ARMA and ARIMA models parameters perform by partial auto-correlation function (PACF) and auto-correlation function (ACF) curves (Balaguer

*et al.*2008; Cryer & Chan 2008; Valipour*et al.*2013).

ACF is a useful statistical tool that measures the relation between the earlier values in the series to following values. PACF is the amount of correlation between a variable and a lag of itself that is not explained by correlations at all low order lags.

*A*and

_{j}*B*are Fourier series ratios (Equations (4) and (5)).

_{j}*j*is harmony and

*h*is the sum of harmonies that equals alternatively, , based on

*w*being an odd or an even value. The variable

*w*is correlated to the type of temporal scale. In monthly scale,

*w*= 12 and

*h*= 6 (Salas 1980). The Fourier series formula was used for the average and also the standard deviation calculations. Forecasting was done using time series models, after preparing the data. Autoregressive and moving average models are among the most important models derived from the total linear process, and by combining these models, ARMA model is produced. ARMA is formed by combining regressive and moving average process and includes

*p*sentences of autoregressive average and

*q*sentences of moving average, as shown in Equation (7): where

*x*is the value of the variable in time

_{t}*t*,

*z*expresses the pure random process with an average of 0, and is variance, and is the moving average parameter in delay step

_{t}*q*and is the autoregressive coefficient in delay step

*p*. The maximum likelihood, moment or other methods are used to calculate the model parameters. In this paper, the ARMA model parameters were calculated using maximum likelihood method and log-likelihood, shown in Equation (9) (Wang 2006). When is the forecasting error in every temporal step, the variance of forecasting error using , the variance of model errors (residuals), and

*f*, the scale parameter can be calculated by Equation (8) (Wang 2006): Maximum likelihood or least square error were used for estimating the parameters of time series model and using each of them is permitted (Van Calster

_{t}*et al.*2017). The use of

*ML*is common in studies such as Amiri (2015).

*ML*is the maximum likelihood, and

*k*is the number of used parameters in the model.

*et al.*2017). Where

*k*is the number of delays,

*r*equals the

_{i}*i*th ACF of the residuals and

*n*is the number of samples. The null hypothesis is expressed as the ACF of all residuals equals zero and all data are random. Under these circumstances, if the

*p*-value is less than or equal to the level of the meaningfulness of the test (5%), the is rejected (Cryer & Chan 2008).

While goodness-of-fit criteria such as the sum-squared error (*SSE*) or coefficient of determination may be appropriate for these purposes, it is often better to use more parsimonious model selection criteria such as Akaike's information criterion (AIC) or the Bayesian information criterion (BIC), which penalize unnecessary model complexity (Humphrey *et al.* 2016).

For in-sample model comparison usually, three criteria, namely, loglikelihood, AIC, and BIC are calculated. AIC and BIC criteria are objective measures of model suitability that balance model fits and model complexity. In another word, AIC and BIC are fitness indexes for trading off the complexity of a model against how well the model fits the data. Since AIC and BIC attempt to find the model that best explains the data with a minimum number of parameters, they are considered an approach favoring simplicity (Amiri 2015). In finite sample sizes, the BIC can perform worse than the AIC, even when the true model is under consideration. The choice between the AIC and BIC depends on one's notion of the true model. If the true model is assumed to be complicated, with large, moderate, and small effects, and the candidate models oversimplifications, then the AIC may be preferred to the BIC (Vrieze 2012). Due to the reasons mentioned above, between AIC and BIC, we have used AIC.

*d*≥1 equals the number of delays in the output,

_{y}*u*(

*t*) is the value of the input to the network in time

*t*,

*y*(

*t*) is the forecasted output in time

*t*, which is dependent on previous values of output or input. Figure 3 shows the structure of NARX dynamic ANN with

*d*output delays and

_{y}*d*number of input delays. NARX ANN has two different series and series-parallel structures. In the parallel structure, the estimated output is back propagated with a specified number of delays to the input of the feed forward neural network, shown in Equation (13) (Mandal & Prabaharan 2006):

_{u}In this paper, the training stage had series-parallel structure and forecasting stage had parallel structure. Moreover, the transfer function was tangent-sigmoid, and learning function was Levenberg–Marquardt.

NARX ANN's training function is Levenberg–Marquardt by default, because the Levenberg–Marquardt algorithm has high efficiency and high speed in most problems and also its accuracy is noticeable. Since it has the lowest error in most of the cases, the Levenberg–Marquardt algorithm was selected in this study because of its faster convergence in network training with medium size and also it has less epoch than other algorithms. Use of this algorithm has been reported by several studies (Abdul Jaleel & Aparna 2015; Raj & Sivanandan 2017).

*et al.*2009). In hybrid modeling, a time series is considered as a combination of linear auto-correlated and a non-linear component structure, where

*L*is the linear component and

_{t}*N*is the non-linear component which should be estimated by the data. Thus, by first using the ARMA model, the linear component was modeled. Accordingly, the linear model residuals merely include non-linear parts. Then, the residuals are input to NARX neural network, which models the residuals and the hybrid model results were created by summing up the linear and non-linear components (Equation (15)):

_{t}### Indices for evaluating the models

*et al.*2005; Mohammadi

*et al.*2006; Wang 2006; Wang

*et al.*2006; Rahimi Khoob 2011; Rivandi

*et al.*2013). where

*Q*is the calculated discharge in the month

_{ci}*i*(m

^{3}/day),

*Q*is the observed discharge in the month

_{oi}*i*(m

^{3}/day) and

*n*is the number of data and

*MARE*is the mean absolute relative error. The

*MARE*is a suitable index for showing the trend in forecasting. However, similar to other indices, this index has deficiencies. For instance, it increases the error in peak flow, and it does not consider a delay in the forecasting peak flow. Therefore, in this paper, although considering the advantages of

*MARE*error index and considering the delay errors in estimating the time and the amount of the peak flow, a new index called Integrated Index for Flow Forecasting Evaluation (

*IIFFE*) is introduced. This index is composed of three indices of forecasting

*MARE*, temporal delay relative error (

*TOPRE*) and maximum relative error (

*MAXRE*) with equal weighting for all the three indices.

*TOPRE*is calculated by subtracting the number of the month of observed and forecasted peak flow, divided by the number of the month of observed peak flow. In other words, it shows a temporal delay in calculating peak flow. Moreover,

*MAXRE*is calculated by subtracting the observed and forecasted discharge values, divided by the observed peak flow. In other words, it is the relative error of calculating peak flow. In calculating

*IIFFE*, each of the indices above is normalized through being divided to its maximum value (Equation (20)): where

*MAX*is the observed peak flow (m

_{o}^{3}/day),

*MAX*is the calculated peak flow (m

_{c}^{3}/day),

*TOP*is the number of the month of observed peak flow,

_{o}*TOP*is the number of the month of calculating peak flow,

_{o}*MAXRE*is the value of relative peak flow error, and

*TOPRE*is the relative peak flow temporal delay.

## RESULTS AND DISCUSSION

In this paper, forecasting was primarily accomplished with ARMA model, then it enters the neural network and afterwards, the effect of adding rainfall, SOI, NAO, PGSST, and RSST indices on forecasting accuracy were evaluated. The impact of rainfall data, SOI, NAO, PGSST, and RSST on the accuracy of the model was measured in single, binary, and triplet and its result was the comparison of 80 models. In Table 1, ten choices of models are presented in order to show how the indicators are introduced into the model.

Model name | Model input | Error indices | |||||
---|---|---|---|---|---|---|---|

Discharge forecasted by ARMA | NAO, SOI/PGSST, RSSST | Rainfall station | MARE | TOPRE | MAXRE | IIFFE | |

MARMA-NARX(1) | Yes | – | – | 0.114 | 0.167 | 0.227 | 0.805 |

MARMA-NARX(2) | Yes | – | Keshvar Sorkhab | 0.132 | 0.167 | 0.005 | 0.599 |

MARMA-NARX(6) | Yes | NAO | 0.16 | 0.17 | 0.11 | 0.77 | |

MARMA-NARX(7) | Yes | NAO | Keshvar Sorkhab | 0.16 | 0.17 | 0.02 | 0.67 |

MARMA-NARX(30) | Yes | NAO,SOI | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.14 | 0.17 | 0.11 | 0.72 |

MARMA-NARX(62) | Yes | RSSST,NAO,SOI | Keshvar Sorkhab | 0.13 | 0.17 | 0.10 | 0.70 |

MARMA-NARX(60) | Yes | PGSST,NAO,SOI | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.16 | 0.17 | 0.24 | 0.92 |

MARMA-NARX(66) | Yes | PGSST,RSSST,NAO | – | 0.15 | 0.17 | 0.08 | 0.72 |

MARMA-NARX(79) | Yes | PGSST,RSSST,SOI,NAO | Gelerood | 0.16 | 0.17 | 0.07 | 0.72 |

MARMA-NARX(80) | Yes | PGSST,RSSST,SOI,NAO | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.15 | 0.17 | 0.03 | 0.67 |

Model name | Model input | Error indices | |||||
---|---|---|---|---|---|---|---|

Discharge forecasted by ARMA | NAO, SOI/PGSST, RSSST | Rainfall station | MARE | TOPRE | MAXRE | IIFFE | |

MARMA-NARX(1) | Yes | – | – | 0.114 | 0.167 | 0.227 | 0.805 |

MARMA-NARX(2) | Yes | – | Keshvar Sorkhab | 0.132 | 0.167 | 0.005 | 0.599 |

MARMA-NARX(6) | Yes | NAO | 0.16 | 0.17 | 0.11 | 0.77 | |

MARMA-NARX(7) | Yes | NAO | Keshvar Sorkhab | 0.16 | 0.17 | 0.02 | 0.67 |

MARMA-NARX(30) | Yes | NAO,SOI | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.14 | 0.17 | 0.11 | 0.72 |

MARMA-NARX(62) | Yes | RSSST,NAO,SOI | Keshvar Sorkhab | 0.13 | 0.17 | 0.10 | 0.70 |

MARMA-NARX(60) | Yes | PGSST,NAO,SOI | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.16 | 0.17 | 0.24 | 0.92 |

MARMA-NARX(66) | Yes | PGSST,RSSST,NAO | – | 0.15 | 0.17 | 0.08 | 0.72 |

MARMA-NARX(79) | Yes | PGSST,RSSST,SOI,NAO | Gelerood | 0.16 | 0.17 | 0.07 | 0.72 |

MARMA-NARX(80) | Yes | PGSST,RSSST,SOI,NAO | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.15 | 0.17 | 0.03 | 0.67 |

In the time series analysis, it is necessary to investigate the normality of the data. The importance of normalized data is because of the time series theory which is developed by normality and when the data are not standard it is necessary to normalize them by some methods like the Log of them. According to the qq-plot diagram, and also fitting unsuitable data on the normal index line, we decided to normalize these abnormal data, and because of the fluctuations of current data, logarithmic based on the number of Neper was used.

After normalizing the data and removing the seasonal trend, the ACF and PACF graphs were examined. Since ACF and PACF graphs were both attenuating within the allowed limits, ARMA (*p,q*) was the proper model. The ARMA model with 28-year flow discharge (1982–83 to 2009–10) for Telezang station was trained in monthly scale and forecasting for 12 months of 2010–11 was examined using various inputs. Using this period is due to lack of access to the newer data. However, the changes in the new data can be a component of uncertainty in results for future research.

Among forecasting models, ARMA(1,1) with the least amount of AIC and the number of parameters was selected. Then the auto-regressive parameter in delay step one is 0.8396 and the moving average parameter in delay step one is −0.2851 .

After selecting the best model, the goodness of fit test was tested to demonstrate independency and stationarity of residuals through investigating the Ljung–Box statistics. The results of the statistics for ARMA(1,1) are shown in Figure 4. Considering that the results were within the allowed limits, the residuals were steady, and since the residuals did not have any trends and *p*-value more significant than the test meaningfulness level (5%), the is accepted, and the residuals were confirmed to be random. Accordingly, ARMA (1,1) was an appropriate model.

Successful applications of time series models have been reported in hydrological forecasting (Tyralis & Papacharalampous 2017; Papacharalampous *et al.* 2018a). Papacharalampous *et al.* (2018a) examined how various handling of the seasonality and non-normality affects the outcomes of the automatic univariate time-series models in forecasting monthly temperature and precipitation, and they showed that monthly temperature and precipitation could be forecasted acceptably which could be enhanced applying other methods.

The result of comparing the forecasting with the observed hydrograph is shown in Figure 5. It is observed that the forecasted values in most points are significantly distant and peak flow has been estimated with an error in the amount and with a one-month delay. To find the amount of error, *MARE* index is calculated, that is 0.55 for the forecast when using ARMA.

In the following step, the ARMA model residuals were inputted to NARX neural network to forecast the inflow to Dez reservoir using hybrid ARMA-NARX model. As the forecasting was carried out on a monthly basis, the name of monthly ARMA-NARX model here has been abbreviated to MARMA-NARX. Figure 6 shows the comparison between ARMA and the hybrid model (in which the mere input is the ARMA forecasting residuals) forecasts. Accordingly, the results of the hybrid model have improved concerning the *MARE* from 0.55 to 0.114, for ARMA (1,1) and MARMA-NARX (1), respectively. There are a great many investigations for which successful applications of machine learning models have been reported in hydrological forecasting (Papacharalampous *et al.* 2017a, 2017b, 2017c, 2018b). Papacharalampous *et al.* (2017a) found that some forecasting models are more advantageous than others. Similarly, in this study, we found some models forecasted reservoir inflow better than others did. As Papacharalampous *et al.* (2017a) also indicated, the forecasting errors at each time step of a forecast horizon within a particular case study intensely depend on the tested case regardless of the forecasting model.

Similarly, other inputs, including rainfall data, SOI, NAO, PGSST, and RSST with 80 different compositions are added to the ARMA residuals. These compositions include the singular input of large-scale climate signals and rainfall data, then, the sets of two, three, and four of these indices and consequently considers the large-scale climate signals with rainfall data. Here, it is necessary to explain the architecture of the hybrid network. The numbers of output delays were selected from 1 to 20 months that each delay was fitted with 20 neurons and entirely for each input, 400 forecasts were produced in monthly scale. The number of neurons in the output layer for all models was 1. In this stage, the best model for each input is selected based on the least *MARE*. The operations were accomplished for different inputs to ultimately reach 80 selected models with different inputs. The hydrographs of the 80 models and the observed flow were compared. To compare the results of the 80 models, a new index, *IIFFE*, is defined in this paper that is calculated from the average of *MARE*, the error in forecasting peak and the error in estimating the amount of peak flow (shown in Equation (19)). *MARE* is a suitable index in calculating the amount of error. However, it has not been able to consider time delay error and peak flow value. The index *IIFFE* was calculated for the 80 models, and the eight top models with the least *IIFFE* are shown in Table 2, and they are compared with MARMA-NARX(1) with the mere input of ARMA(1,1) residuals. In this table, the four top models with large-scale climate signals input (without rainfall data) are shown (MARMA-NARX(31), MARMA-NARX(36), MARMA-NARX(56), and MARMA-NARX(76)). Accordingly, the number of input large-scale climate signals to the models is more than two and the models with singular large-scale climate signals input have not been able to produce less *IIFFE* to be among the top four models. Therefore, the most important point is the positive effect of considering multiple large-scale climate signals in improving forecasting. In this stage, none of the models have decreased forecasting peak flow time delay and *MARE*, and top models are selected based on decreasing the amount of peak flow. Considering the least *IIFFE* among the four models (0.59), MARMA-NARX(36) has the best forecast using NAO and PGSST. These four models have almost similar *IIFFE* and also they have small differences in their hydrographs, shown in Figures 7–10. The second four top models, MARMA-NARX(27), MARMA-NARX(48), MARMA-NARX(55), and MARMA-NARX(58) are the models with large-scale climate signals and rainfall data. Considering Table 2, the assemblage of large-scale climate signals and rainfall data had a positive effect on decreasing *IIFFE* and time delay error in forecasting peak flow, which can be observed in Figures 7–14 (the inputs which have been added to the hybrid model in addition to the ARMA residuals are shown in each graph). Moreover, similar to what was observed in the previous four models (MARMA-NARX(31), MARMA-NARX(36), MARMA-NARX(56), and MARMA-NARX(76)), these four models had the best results while using multiple large-scale climate signals and rainfall data, not the singular use of large-scale climate signals and rainfall data. Another point shown in the table is that the input of the rainfall data singularly and without using large-scale climate signals is not among the eight top models with the least *IIFFE*.

Model | Large-scale climate signals | Rainfall station | MARE | TOPRE | MAXRE | IIFFE |
---|---|---|---|---|---|---|

MARMA-NARX(1) | 0.11 | 0.17 | 0.23 | 0.80 | ||

MARMA-NARX(31) | PGSST, RSST | 0.11 | 0.17 | 0.14 | 0.71 | |

MARMA-NARX(36) | PGSST, NAO | 0.13 | 0.17 | 0.00 | 0.59 | |

MARMA-NARX(56) | PGSST, NAO, SOI | 0.15 | 0.17 | 0.03 | 0.65 | |

MARMA-NARX(76) | PGSST, RSST, NAO, SOI | 0.16 | 0.17 | 0.04 | 0.68 | |

MARMA-NARX(27) | NAO, SOI | Keshvar Sorkhab | 0.17 | 0.00 | 0.03 | 0.37 |

MARMA-NARX(48) | RSST, NAO | Sepid Dasht | 0.13 | 0.00 | 0.11 | 0.37 |

MARMA-NARX(55) | RSST, SOI | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.15 | 0.00 | 0.05 | 0.35 |

MARMA-NARX(58) | PGSST, NAO, SOI | Sepid Dasht | 0.14 | 0.00 | 0.03 | 0.32 |

Model | Large-scale climate signals | Rainfall station | MARE | TOPRE | MAXRE | IIFFE |
---|---|---|---|---|---|---|

MARMA-NARX(1) | 0.11 | 0.17 | 0.23 | 0.80 | ||

MARMA-NARX(31) | PGSST, RSST | 0.11 | 0.17 | 0.14 | 0.71 | |

MARMA-NARX(36) | PGSST, NAO | 0.13 | 0.17 | 0.00 | 0.59 | |

MARMA-NARX(56) | PGSST, NAO, SOI | 0.15 | 0.17 | 0.03 | 0.65 | |

MARMA-NARX(76) | PGSST, RSST, NAO, SOI | 0.16 | 0.17 | 0.04 | 0.68 | |

MARMA-NARX(27) | NAO, SOI | Keshvar Sorkhab | 0.17 | 0.00 | 0.03 | 0.37 |

MARMA-NARX(48) | RSST, NAO | Sepid Dasht | 0.13 | 0.00 | 0.11 | 0.37 |

MARMA-NARX(55) | RSST, SOI | Sepid Dasht, Keshvar Sorkhab, Gelerood | 0.15 | 0.00 | 0.05 | 0.35 |

MARMA-NARX(58) | PGSST, NAO, SOI | Sepid Dasht | 0.14 | 0.00 | 0.03 | 0.32 |

Considering the aforementioned points, the singular input of large-scale climate signals or rainfall data did not have the least *IIFFE*, and they are not among the eight top models. In contrast, multiple large-scale climate signals, especially with rainfall data as input, have led to decreasing the *IIFFE*. The most successful models in decreasing *IIFFE* were the models with multiple large-scale climate signals and rainfall data that have decreased *IIFFE* by decreasing the time delay of peak flow (*TOPRE*). Ultimately, MARMA-NARX(58) had the least *IIFFE* (0.32) with SOI, NAO, and PGSST along with rainfall data of Sepiddasht-Sezar station as input. The most effective climate index is PGSST and has been the input to five of eight selected models. It is necessary to mention that the PGSST is spatially the nearest index for the study area (Dez Dam). Therefore, the spatial proximity of the selected indices can cause a positive effect on increasing the forecasting accuracy. In what follows, the best forecast is discussed based on comparing ARMA and MARMA-NARX models.

As seen in Figure 15, the results of the best ARMA forecast have a significant difference with observed hydrograph, and considerable error is seen on forecasting the peak discharge and its time. Compared to it, the hybrid MARMA-NARX(58), the best MARMA-NARX model, has good forecasting except in April and the *IIFFE* of the best ARMA model (7.29) is improved to 0.32 in the best MARMA-NARX model.

Considering the results, ARMA forecasted a good trend for flow discharge, but it is considerably distant from the observed value, while the hybrid MARMA-NARX has increased the forecasting accuracy by 96%. Therefore, separating the linear and non-linear components of flow increases the forecasting accuracy. Applying ARMA to forecast the linear component of flow and using the capabilities of the ANN model to forecast the non-linear component of discharge led to the best forecasting results of the hybrid model MARMA-NARX(58) which have *IIFFE* of 0.32 compared to 7.29 of ARMA, representing 96% enhancement. It is possible to conclude that where the linearity or non-linearity of the existing patterns in the system is not clear or the studied system has both patterns, the hybrid model can yield better results.

There are many investigations in which the influence of the climatic indexes on the characterization of the flow regime have been analyzed (Hosseinzadeh Talaee *et al.* 2012; Tabari *et al.* 2013; Meidani & Araghinejad 2014). Time series and ANN models are the models which are used in flow forecasting. These models will work strongly in linear forecasting or non-linear forecasting, but not in both patterns simultaneously. The hybrid model is the suggested model which allowed using benefits of both linear and non-linear models simultaneously.

There are a great many linear pattern and non-linear models used for monthly flow forecasting. ARMA is an example for linear and NARX is a good example for non-linear. Each of these models has advantages and disadvantages which we mentioned in the Introduction. There are many studies which show these models' abilities for flow forecasting, but because the flow is one of the components of the hydrological cycle which follows linear and non-linear patterns during the different periods of the year, it is necessary to use the advantages of both models for flow forecasting. Thus, the hybrid model has been used to increase the accuracy of the flow.

In this article, we have intended to enhance the accuracy of flow forecasting in time series models and also ANN with a hybrid model.

In this research, we used all the data that were available for our research. However, to improve the accuracy of the estimations, adding more data is suggested for future works.

## CONCLUSION

This paper focused on forecasting the following year inflow to Dez Reservoir, using hybrid MARMA-NARX model and comparing the results with ARMA model. The significant difficulties of this work are providing data and information of runoff and precipitation after solving the programming problems. If the data are more accessible and more extended, accuracy for prediction will be increased, and uncertainty will be decreased.

Eighty models with different input including singular input of large-scale climate signals of SOI, NAO, PGSST, and RSST along with data of three rainfall stations, Keshvar Sorkhab, Sepid Dasht, and Gelerood, and sets of two, three, and four input of the large-scale climate signals were used, and then, these indices with singular and joint rainfall station were examined, and ultimately the most successful model was determined by using proposed IIFFE error index. The hybrid MARMA-NARX(58) with SOI, NAO, PGSST, and rainfall data as input has the best forecast among MARMA-NARX models, which has improved *IIFFE* from 7.29 (for the best ARMA model) to 0.32. Comparing the models shows that both ARMA and hybrid MARMA-NARX have been able to forecast the trend of long-term monthly flow for a year. In other words, both models have had the ability to forecast the flow trend in the following year with different accuracies. However, considering the 96% improvement in the results of the hybrid model, we can determine better flow forecasting when the linearity or non-linearity of the patterns in the system is unknown, or the system has both patterns.

According to the results above, the large-scale climate signals in monthly scale were useful in flow forecasting and using multiple indices yields better performance of the model compared to the model with singular input. Rainfall data as input had an important role in improving the forecasting results, especially in decreasing the forecasting delay error. The joint input of rainfall and large-scale climate signals shows better results than the separate input of the data to the model. The effect of the large-scale climate signals on Dez Dam basin is clear; however, since forecasting is solely accomplished in a year and the impact of these indices varies in different years and seasons, this paper is not able to determine the dominant index on the climate of the basin. Merely based on the results that the PGSST is the input of five models among the eight top models, it is possible to consider the impact of this index to be more effective.

Lack of more data is the limitation of this study. Hence, using more and recent data could improve the accuracy of the models. Moreover, employment of modern hybrid models such as ANN-Fuzzy-ARIMA and quantum models is suggested for future work.