Time series data such as monthly stream flows can be modelled using time series methods and then used to simulate or forecast flows for short term planning. Two methods of time series modelling were reviewed and compared: the well-known auto regressive moving average (ARMA) method and the state-space time-series (SSTS) method. ARMA has been used in hydrology to model and simulate flows with good results and is widely accepted for this purpose. SSTS modelling is a more recently developed method that is relatively unused for hydrologic modelling. This paper focuses on modelling the stream flows from basins of different sizes using these two time series modelling methods and comparing the results. Three rivers in Labrador and South-East Quebec were modelled: the Romaine, Ugjoktok and Alexis Rivers. Both models were compared for accuracy of prediction, ease of software use and simplicity of model to determine the preferred time series methodology approach for modelling these rivers. The SSTS was considered very easy to use but model diagnostics were found to require a high level of statistical understanding. Ultimately, the ARMA method was determined to be the better method for the typical engineer to use, considering the diagnostics were simple and the monthly flows could be easily simulated to verify results.

## INTRODUCTION

Developing or managing river systems is important in the field of water resources. Stream flow analysis is used to determine if flows are sufficient and reliable for a particular purpose. For developing a run-of-river hydroelectric project, for example, the engineer uses stream flow analysis to determine whether a stream can meet the energy demand throughout the year. Part of this design process includes developing a stream flow model based on historical flow records, simulating flows from the model to determine whether the model provides a good representation of the historical flows and then, in some cases, using the model for short-term forecasting.

There are a few industry accepted methods for forecasting flows. These include exponential smoothing, periodic autoregressive modelling and autoregressive moving average. A comparison of simulated flows using the above noted methods and the state-space time-series (SSTS) method was previously completed for Island Rivers in Newfoundland (Sidhu & Lye 1995); however, the same comparison has not been done for Labrador Rivers. This paper compares two methodologies for modelling stream flows in Labrador and South-East Quebec: one of the standard methods, auto regressive moving average (ARMA), and SSTS.

The ARMA or Box-Jenkins methodology (Box *et al.* 2008) is well-known and has been used in modelling of hydrometric time series since the early 1970s. The SSTS methodology (Harvey 1989; Commandeur & Koopman 2007) on the other hand is relatively new, and has rarely been used for hydrological modelling. It has been used primarily for economic time series model development and forecasting. The SSTS methodology is also referred to as a Structural Time Series approach. One of the principal purposes of the work described here is to assess the SSTS as an additional tool for hydrologic time series modelling.

Both modelling methodologies are used here to model and forecast monthly flows in selected Labrador and South East Quebec Rivers. The models are then fitted to each time series, excluding the last 5 years of flow data, with the last 5 years of data being used to compare the forecasting capability of the models.

## PRELIMINARY ANALYSIS OF THE DATA

The study area was divided into two subareas: North and South of Goose Bay. Geologic and surface conditions in general appear to be different between areas north and south of Goose Bay; the ground is rocky with less storage capacity north of Goose Bay compared to south of Goose Bay. As runoff can be affected by vegetation type and density as well as ground infiltration and storage, Goose Bay was chosen as the divide. As a result of varying ground conditions, at least one river was selected from each of the north and south regions. In addition to geographic location, drainage basin size as well as drainage aspect were taken into account.

Having records from 79 gauge locations seems promising; however, most gauges did not meet the initial screening criteria. Table 1 includes those gauge records having more than 15 years of data. Of the remaining 33 gauged records, the rivers highlighted in bold, as shown in Table 1, did not meet all of the initial screening criteria and were removed from the study. Of the 79 gauged rivers, only 12 remained on the list. These were then reviewed based on geographic location, drainage aspect and drainage size. It was desirable to include at least one river from each of the northern and southern regions, as well as at least one river draining to each of the Gulf of St. Lawrence and the Labrador Sea. After the initial screening, only three basins in northern Labrador remained viable: Ugjoktok, Naskaupi and Kanariktok. Ugjoktok was selected since it has the longer record of the three rivers: 32 years versus 18 and 16 years respectively. Ugjoktok River drains to the Labrador Sea and with a basin size of 7,570 km^{2} is a moderately sized basin for this region.

Gauge name . | Drainage area km^{2}
. | Years . | Regulated . | Continuous . | Missing data . | State . | North or south . |
---|---|---|---|---|---|---|---|

Aganus Riviere | 5,590 | 21 | No | Yes | None | Discontinued | South |

Alexis River near Port Hope Simpson | 2,310 | 34 | No | Yes | None | Active | South |

Ashuanipi River at Menihek | 19,000 | 59 | Yes | Discontinued | North | ||

Atikonak Lake at Gabbro lake | 21,400 | 38 | Yes | Discontinued | |||

Atikonak river above Panchia Lake | 15,100 | 25 | No | No | 1983–98 | Active | South |

Big Pond Brook below Big Pond | 71.4 | 17 | No | No | 6 months | Active | North |

Churchill River above Upper Muskrat | 92,500 | 60 | Yes | Active | South | ||

Churchill River at Flour Lake | 33,900 | 17 | Yes | Discontinued | North | ||

Churchill River at foot of Lower Muskrat | 16 | Yes | Discontinued | South | |||

Churchill River at powerhouse | 69,200 | 39 | Yes | Discontinued | South | ||

Churchill River between Up and Low Muskrat | 16 | Yes | Discontinued | South | |||

Eagle River above falls | 10,900 | 46 | No | Yes | 4 months | Active | South |

Etamamiou Riviere | 2,950 | 20 | No | Yes | None | Discontinued | South |

Joir Riviere near boundary | 2,060 | 17 | No | Active | South | ||

Kanairiktok River below Snegamook Lake | 8,930 | 18 | No | Yes | None | Discontinued | North |

Little Mecatina | 19,100 | 16 | Discontinued | South | |||

Little Mecatina below Breton Lake | 12,100 | 16 | Discontinued | South | |||

Little Mecatina River above Lac Fourmont | 4,540 | 34 | No | No | 7 months | Active | South |

Magpie Riviere | 7,230 | 24 | No | Yes | None | Active | South |

Minipi River below Minipi Lake | 2,330 | 32 | No | No | 29 months | Active | South |

Moisie Riviere | 19,000 | 37 | No | No | 18 months | Active | South |

Nabisipi Riviere | 2,060 | 27 | No | Yes | None | Discontinued | South |

Naskaupi River at Fremont Lake | 8,990 | 16 | Discontinued | North | |||

Naskaupi River below Naskaupi Lake | 4,480 | 32 | No | No | 28 months | Active | North |

Natashquan Riviere | 15,600 | 22 | No | Yes | None | Active | South |

Natashquan Riviere below East Natash River | 11,600 | 18 | No | Yes | None | Discontinued | South |

Romaine Riviere | 13,000 | 46 | No | Yes | None | Active | South |

Saint Augustin Riviere | 5,750 | 16 | Active | South | |||

Saint Marguerite Riviere | 6,140 | 51 | Yes | Active | South | ||

Saint Paul Riviere | 6,630 | 35 | No | No | 12 months | Active | South |

Tonnerre Riviere | 684 | 40 | No | >24 months | Discontinued | South | |

Ugjoktok River below Harp Lake | 7,570 | 32 | No | Yes | None | Active | North |

Unknown river at Lake 51 | 19,900 | 16 | Yes | Discontinued | South |

Gauge name . | Drainage area km^{2}
. | Years . | Regulated . | Continuous . | Missing data . | State . | North or south . |
---|---|---|---|---|---|---|---|

Aganus Riviere | 5,590 | 21 | No | Yes | None | Discontinued | South |

Alexis River near Port Hope Simpson | 2,310 | 34 | No | Yes | None | Active | South |

Ashuanipi River at Menihek | 19,000 | 59 | Yes | Discontinued | North | ||

Atikonak Lake at Gabbro lake | 21,400 | 38 | Yes | Discontinued | |||

Atikonak river above Panchia Lake | 15,100 | 25 | No | No | 1983–98 | Active | South |

Big Pond Brook below Big Pond | 71.4 | 17 | No | No | 6 months | Active | North |

Churchill River above Upper Muskrat | 92,500 | 60 | Yes | Active | South | ||

Churchill River at Flour Lake | 33,900 | 17 | Yes | Discontinued | North | ||

Churchill River at foot of Lower Muskrat | 16 | Yes | Discontinued | South | |||

Churchill River at powerhouse | 69,200 | 39 | Yes | Discontinued | South | ||

Churchill River between Up and Low Muskrat | 16 | Yes | Discontinued | South | |||

Eagle River above falls | 10,900 | 46 | No | Yes | 4 months | Active | South |

Etamamiou Riviere | 2,950 | 20 | No | Yes | None | Discontinued | South |

Joir Riviere near boundary | 2,060 | 17 | No | Active | South | ||

Kanairiktok River below Snegamook Lake | 8,930 | 18 | No | Yes | None | Discontinued | North |

Little Mecatina | 19,100 | 16 | Discontinued | South | |||

Little Mecatina below Breton Lake | 12,100 | 16 | Discontinued | South | |||

Little Mecatina River above Lac Fourmont | 4,540 | 34 | No | No | 7 months | Active | South |

Magpie Riviere | 7,230 | 24 | No | Yes | None | Active | South |

Minipi River below Minipi Lake | 2,330 | 32 | No | No | 29 months | Active | South |

Moisie Riviere | 19,000 | 37 | No | No | 18 months | Active | South |

Nabisipi Riviere | 2,060 | 27 | No | Yes | None | Discontinued | South |

Naskaupi River at Fremont Lake | 8,990 | 16 | Discontinued | North | |||

Naskaupi River below Naskaupi Lake | 4,480 | 32 | No | No | 28 months | Active | North |

Natashquan Riviere | 15,600 | 22 | No | Yes | None | Active | South |

Natashquan Riviere below East Natash River | 11,600 | 18 | No | Yes | None | Discontinued | South |

Romaine Riviere | 13,000 | 46 | No | Yes | None | Active | South |

Saint Augustin Riviere | 5,750 | 16 | Active | South | |||

Saint Marguerite Riviere | 6,140 | 51 | Yes | Active | South | ||

Saint Paul Riviere | 6,630 | 35 | No | No | 12 months | Active | South |

Tonnerre Riviere | 684 | 40 | No | >24 months | Discontinued | South | |

Ugjoktok River below Harp Lake | 7,570 | 32 | No | Yes | None | Active | North |

Unknown river at Lake 51 | 19,900 | 16 | Yes | Discontinued | South |

More gauged records were available to choose from south of Goose Bay, and it was desirable to select one river that drained toward the Gulf of St. Lawrence and one that drained toward the Labrador Sea. Of the rivers draining towards the Gulf of St. Lawrence, Magpie, Romaine, Nabasipi, Aganus, Natashquan and Etamamiou remained after the initial screening. Most of these rivers had 20–27 years of data; however, the Romaine had 46 years of data. As a result, the Romaine River was selected for further study. This river has a large basin size at 13,000 km^{2}. Of the remaining rivers draining toward the Labrador Sea, both the Eagle and Alexis are located south of Goose Bay. They both have more than 30 years of data, are not regulated and the data are mostly continuous. Since one of the criteria was to have a range of drainage sizes, the Alexis River was selected in preference to Eagle River because its drainage area of 2,310 km^{2} is small compared with that of Ugjoktok and Romaine. The Eagle River on the other hand has a large drainage basin; 10,900 km^{2}, similar in size to the Romaine River. The Alexis, Romaine and Ugjoktok Rivers were therefore selected for detailed study. Table 2 summarizes the information for these selected rivers and their locations are shown in Figure 1.

River . | Drainage area km^{2}
. | Years of data . | Location . | Drainage basin aspect . | Outlet location . |
---|---|---|---|---|---|

Ugjoktok | 7,570 (medium) | 32 | North | West/East | Labrador Sea |

Alexis | 2,310 (small) | 34 | South | West/East | Labrador Sea |

Romaine | 13,000 (large) | 56 | South | North/South | Gulf of St. Lawrence |

River . | Drainage area km^{2}
. | Years of data . | Location . | Drainage basin aspect . | Outlet location . |
---|---|---|---|---|---|

Ugjoktok | 7,570 (medium) | 32 | North | West/East | Labrador Sea |

Alexis | 2,310 (small) | 34 | South | West/East | Labrador Sea |

Romaine | 13,000 (large) | 56 | South | North/South | Gulf of St. Lawrence |

In summary, one relatively small, one medium and one large river were selected for the study. Each river has more than 32 years of almost continuous data. One river is located north of Goose Bay and two are located south, with the two southern rivers having differing basin sizes, aspects and drainage outlet locations. This diversity of location, basin orientation and outlet location was selected to cover a wide variety of features that might provide a more varied list of rivers to model.

## METHODOLOGY

Two methodologies for modelling monthly stream flows were used: ARMA as per the Box-Jenkins methodology and SSTS. The software Structural Time Series Modelling Program (STAMP Version 8.3) developed by Koopman *et al.* (2009) and sold by Timberlake Consultants Ltd was used to carry out the SSTS modelling. ARMA modelling was completed using the Minitab 16 software (Minitab Inc. 2013). The goal was to compare the two modelling methodologies to determine accuracy of the models for simulation and forecasting as well as ease of use of the software for the practising hydrotechnical engineer.

Prior to modelling, the monthly flow data for each of the three rivers were collected and reviewed for normality. A review of all three time series indicates that the data sets are not normally distributed. Each data set was transformed and normality tests indicate that the log transformed data are normally distributed. The results for Alexis River have been used for illustration purposes with some commentary included relating to the specific results of Ugjoktok and Romaine Rivers.

Upon completion of the general review of each data set, each of the three rivers were modelled using both the ARMA and SSTS methodologies. The log transformed data were used in both methodologies for all rivers.

### ARMA with harmonic analysis

When modelling monthly stream flows with seasonality, ARMA models provide better results when the data set is deseasonalized (Hipel & McLeod 1994). There are two ways to deseasonalize a time series: remove the seasonal mean and possibly the seasonal standard deviation from each data point in the time series or use Fourier Series (harmonic analysis) to deseasonalize (Hipel & McLeod 1994). Subtracting the mean and standard deviation from each observation point results in a large number of model parameters. Fourier series is a way to remove seasonality while using fewer model parameters (Hipel & McLeod 1994). For this reason, the seasonal component for each of the three rivers was modelled using harmonic analysis and the residuals (deseasonalized data) were then modelled using the ARMA methodology. The two models could then be used together to generate or simulate new data sets for comparison to the actual data set. The summary statistics for the actual and generated data sets could then be compared to confirm the model's ability to simulate flows that are within specified confidence intervals. In theory, the verified model can be used to forecast stream flows up to several months in advance. In order to facilitate verification of the model, each actual flow data set was divided into two differing types of sets: a fitting set and a verification set. The fitting set had the last 5 years of data removed from each actual flow data set, and this set was used to fit the model. This set was also used to verify the model simulation results. The verification set included the full actual data set, and the model was then used to forecast the last 5 years: the years that were removed from the fitting set. The forecasted data points were compared to the actual observations for the same time period, understanding that only short term forecasting is relevant, as forecasting beyond 12 months ahead loses significance (Sidhu & Lye 1995).

*et al.*2008). Since the largest peak relates to a period of 12 months, as is shown in Figure 7, a maximum of six sine and cosine pairs can be used in the equation and thus models using one to six pairs were checked to determine which regression equation best fit the seasonal component. In Figure 7, sdf is spectral density function, UC is upper 95% confidence limit, LC is lower 95% confidence limit and WN is white noise spectral density function.

*R*

^{2}, it was determined that the equation with six sine and cosine pairs was the best fit for the Alexis data. As with the regression analysis, the assumptions of analysis of variance (ANOVA) were checked and found to be acceptable, primarily that the residuals are normally distributed, have constant variance and are independent. The general harmonic analysis equation is as shown in Equation (1): where

*Z*is the time series,

_{t}*α*is the regression constant,

*β*is the coefficient for each sine or cosine term,

*t*is month (1 to 12), 2π

*f*is a constant with

*f*= 1/12 for monthly values, and

*y*are the residuals.

_{t}*et al.*2008). For the Alexis river, there is a significant lag 1 on the ACF plot shown in Figure 8 as well as a significant lag 1 on the PACF plot shown in Figure 9. This would suggest that an ARMA(1,0) model or possibly an ARMA(1,1) may be the best fitting models, so both were reviewed (Box

*et al.*2008). Since running an ARMA model is quick to do in Minitab, other ARMA models were tested as well. Only ARMA(1,0) and ARMA(1,1) are presented in this paper as review of the ACF and PACF plots suggest these are the most likely best fitting models. As indicated, interpretation of the ACF and PACF plots is an important tool to help guide the determination of the best ARMA model.

*et al.*2008), this model produces the simplest model. The residuals from the harmonic analysis (deseaonalized data) were modeled for Alexis using an ARMA(1,0) also referred to as an AR1 model. The model results indicate that the AR1 parameter is significant and the chi-squared statistics are not significant, thus confirming independence. Once the AR1 model was developed, the autocorrelation and partial autocorrelation were plotted for the AR1 model residuals, as shown in Figures 10 and 11. There are no significant lags shown in the plots, which indicate that the model has addressed the significant correlations; thus the AR1 model is suitable for modelling the residuals of the deseasonalized data.

To validate the prediction ability of the model, the predicted values for the last 5 years of data were graphically compared to the actual last 5 years of data; from 2007 to 2011. As shown in Figure 14, the model appears to do a good job of predicting the first 3 years (2007 and 2009) but does not fit the actual data well for the 4th and 5th years (2010–2011). A closer look at the actual data suggests that the peak patterns that exist in the data pre-2007 do not appear to be replicated between 2010 and 2011, even though the pattern is replicated between 2007 and 2009. Two methods were used to validate the predictions: Nash–Sutcliffe efficiency and median absolute percentage error (MAPE). Since the actual data for 2010–2011 did not appear to have a pattern consistent with previous years' data, the data set was segregated to specifically look at the data set with and without the 2010 and 2011 years. Table 3 shows the error calculation results.

Years in data set . | Nash–Sutcliffe efficiency coefficient . | MAPE . |
---|---|---|

Full set (1978–2011) | 0.755 | 9.86% |

2007–2011 | 0.614 | 9.49% |

2007–2009 | 0.823 | 8.00% |

2010–2011 | 0.309 | 17.78% |

1978–2009 | 0.777 | 9.67% |

Years in data set . | Nash–Sutcliffe efficiency coefficient . | MAPE . |
---|---|---|

Full set (1978–2011) | 0.755 | 9.86% |

2007–2011 | 0.614 | 9.49% |

2007–2009 | 0.823 | 8.00% |

2010–2011 | 0.309 | 17.78% |

1978–2009 | 0.777 | 9.67% |

Using the Nash–Sutcliffe efficiency coefficient, the best fitting model will have values close to 1. As shown in the table, the predicted flows between 2007 and 2011 show a value of 0.614, which is less efficient than the full data set at 0.755. As noted above, a graphical review of Figure 14 indicates that actual flows between the winter of 2010 and winter 2011 did not appear to display the same pattern as historical flows had shown; primarily there were higher actual flows during the winter period than observed historically. As a result, the Nash–Sutcliffe efficiency coefficient was reviewed separately for the 2007–2009 flows and the 2010–2011 flows. As noted in Table 3, the 2007–2009 predicted flows match well to the actual flow data with a coefficient of 0.823, which is close to 1; however, the 2010–2011 predicted flows do not match the actual flow data very well, with a coefficient of 0.309. These findings are re-iterated when the MAPE was reviewed for the same sets of flows. The table shows that there is a larger percentage error (17.78%) between the predicted and actual flows for 2010–2011 and a small percentage error (8.00%) between predicted and actual flows for 2007–2009. Based on these results, the selected model, generated using ARMA with the harmonic analysis method, has been used to predict short term with reasonable accuracy. The larger discrepancy between measured and predicted flows in 2010 and 2011 appears to result from a different distribution of seasonal flows, where the flows are higher in the winter for these 2 years. This might suggest a warmer winter and/or earlier snow melt.

The same process was used for the Ugjoktok and Romaine Rivers. The characteristics of these rivers were similar to those of the Alexis River, in that the data were not normally distributed and required log transformation, displayed seasonality with 12- and 6-month cycles and were best fit using an AR1 model. The models for these two rivers were simulated and were found to generate data sets of the same sample size that on average have similar historical statistics. Similar results were found when validating the prediction values using Nash–Sutcliffe efficiency and MAPE for the Ugjoktok and Romaine Rivers.

## SSTS ANALYSIS

*μ*is the level component and

_{t}*γ*is the seasonal component with (

_{t}*s*–1) state equations where

*s*is the periodicity of the seasonal component.

In the sample results page in Figure 15, note that very strong convergence is identified and a steady state was found. These are two key components of an acceptable model using this software. If a steady state and convergence are not found, then the model generated is not a good representation of the actual data set. The diagnostics listed under summary statistics in the sample output (Figure 15) include a test for Normality, Durbin Watson test, r(1) and r(q) correlation values and AIC value. These must all be compared to critical values to determine if the base assumptions, on which the model was developed, have been met.

Developing multiple models is easy when a software package is used, but the difficulty lies in diagnosing each model. Unlike the diagnostic tests for the illustrated ARMA modeling, which are well prescribed, the SSTS software diagnostic test results do not provide the acceptable critical value or the *P*-value for each of the test statistics. As a result, determining the suitability of a model was difficult and time consuming. A model could also be mistakenly accepted as a good one if the diagnostics were not properly understood and verified. It was also challenging to determine from the results how to modify the model to achieve the best fit. Reviewing an ACF plot helps direct the user to the appropriate model by quickly showing in graphical form if there are significant correlations remaining in the model residuals. Unlike the ARMA modelling process, where an ACF plot is easily generated from the model residuals, the SSTS software requires the user to store the residual values from the model and then separately plot the ACF plot. Reviews of the plots that are automatically generated for each model do not provide enough help on determining how to adjust the model, and as such the modeller is left to try various combinations of components and then review the diagnostic test results for each. As shown in Table 4 below, the X marked fields indicate the model run combinations that were completed to determine the best model for Alexis River.

Model run . | Level . | Slope . | Seasonal . | Cycle . | Irregular . | ||||
---|---|---|---|---|---|---|---|---|---|

Deterministic . | Stochastic . | Deterministic . | Stochastic . | Deterministic . | Stochastic . | Short . | Medium . | ||

1 | X | X | |||||||

2 | X | X | |||||||

3 | X | X | X | ||||||

4 | X | X | X | ||||||

5 | X | X | X | ||||||

6 | X | X | X | ||||||

7 | X | X | X | X | X | ||||

8 | X | X | X | X | X | ||||

9 | X | X | X | X |

Model run . | Level . | Slope . | Seasonal . | Cycle . | Irregular . | ||||
---|---|---|---|---|---|---|---|---|---|

Deterministic . | Stochastic . | Deterministic . | Stochastic . | Deterministic . | Stochastic . | Short . | Medium . | ||

1 | X | X | |||||||

2 | X | X | |||||||

3 | X | X | X | ||||||

4 | X | X | X | ||||||

5 | X | X | X | ||||||

6 | X | X | X | ||||||

7 | X | X | X | X | X | ||||

8 | X | X | X | X | X | ||||

9 | X | X | X | X |

There were nine models developed, each with diagnostic results that were separately reviewed in order to select the best model. The critical values were calculated in a separate spreadsheet and Table 5 gives a summary of the diagnostics and the critical value calculations that were completed to help select the best model.

. | r(1) and r(q) . | Q(q, q–p) . | H . | n
. | AIC . | Convergence . | Steady state . | ||||
---|---|---|---|---|---|---|---|---|---|---|---|

Model . | Critical value . | Result . | Chi square critical value . | Result . | F critical value . | Result . | Chi square critical value . | Result . | |||

1 | 0.099015 | Not met | 35.172 | Not met | 1.3286 | Met | 5.991 | Not met | 84.02 | Found | |

2 | 0.099015 | r1 met, r24 not | 35.172 | Not met | 1.3286 | Met | 5.991 | Not met | 39.73 | Very strong | |

3 | 0.099015 | Not met | 33.924 | Not met | 1.3286 | Met | 5.991 | Not met | 99.71 | Found w/o full | |

4 | 0.099015 | r1 met, r24 not | 33.924 | Not met | 1.3286 | Met | 5.991 | Not met | 46.64 | Very strong | Found w/o full |

5 | 0.099015 | r1 not met, r24 met | 33.924 | Not met | 1.33 | Met | 5.991 | Not met | –409.26 | Pattern in irregular | |

6 | 0.099015 | r1 not met, r24 met | 33.924 | Not met | 1.33 | Met | 5.991 | Not met | –431.22 | Very strong | Found |

7 | 0.099015 | Both met | 28.869 | Met | 1.33 | Met | 5.991 | Not met | –477.98 | Weak | Found w/o full |

8 | 0.099015 | Both met | 28.869 | Met | 1.33 | Met | 5.991 | Not met | –477.98 | Strong | Found w/o full |

9 | 0.099015 | Both met | 30.144 | Met | 1.33 | Met | 5.991 | Not met | –491.81 | Very strong | Found |

. | r(1) and r(q) . | Q(q, q–p) . | H . | n
. | AIC . | Convergence . | Steady state . | ||||
---|---|---|---|---|---|---|---|---|---|---|---|

Model . | Critical value . | Result . | Chi square critical value . | Result . | F critical value . | Result . | Chi square critical value . | Result . | |||

1 | 0.099015 | Not met | 35.172 | Not met | 1.3286 | Met | 5.991 | Not met | 84.02 | Found | |

2 | 0.099015 | r1 met, r24 not | 35.172 | Not met | 1.3286 | Met | 5.991 | Not met | 39.73 | Very strong | |

3 | 0.099015 | Not met | 33.924 | Not met | 1.3286 | Met | 5.991 | Not met | 99.71 | Found w/o full | |

4 | 0.099015 | r1 met, r24 not | 33.924 | Not met | 1.3286 | Met | 5.991 | Not met | 46.64 | Very strong | Found w/o full |

5 | 0.099015 | r1 not met, r24 met | 33.924 | Not met | 1.33 | Met | 5.991 | Not met | –409.26 | Pattern in irregular | |

6 | 0.099015 | r1 not met, r24 met | 33.924 | Not met | 1.33 | Met | 5.991 | Not met | –431.22 | Very strong | Found |

7 | 0.099015 | Both met | 28.869 | Met | 1.33 | Met | 5.991 | Not met | –477.98 | Weak | Found w/o full |

8 | 0.099015 | Both met | 28.869 | Met | 1.33 | Met | 5.991 | Not met | –477.98 | Strong | Found w/o full |

9 | 0.099015 | Both met | 30.144 | Met | 1.33 | Met | 5.991 | Not met | –491.81 | Very strong | Found |

Where r(1) is the serial correlation and r(q) is serial correlation lag at step q, Q(q, q–p) is the Box-Ljung Q-statistic, H is the test for heteroscedasticity, N is a test for normality (Koopman *et al.* 2009), and AIC is the Akaike Information Criterion (Akaike 1974).

Table 5 shows that the model highlighted in bold (model 9), which is comprised of a stochastic level term, a stochastic seasonal term, a short term cycle term and an irregular term as seen in Table 4, meets most of the criteria and has a very strong convergence. The assumption of normality is not met, but this is the best model of all model runs and as such, was ultimately selected as the preferred model for Alexis River. A similar process was completed for Ugjoktok and Romaine. For all of the three rivers, the model generation was very quick and easy, but the diagnostics associated with verifying and selecting the most appropriate model was time consuming, with none of the models meeting all of the diagnostic criteria.

The same validation method was used to validate the forecasted flows using SSTS for Alexis River. Both the Nash–Sutcliffe efficiency coefficient and the MAPE were used. As shown in Table 6, the predicted flows between 2007 and 2011 show a value of 0.622 for Nash which is similar to the Nash coefficient using ARMA and much less than the target value of 1. Like the ARMA analysis, the Nash–Sutcliffe efficiency coefficient was reviewed separately for the 2007–2009 flows and the 2010–2011 flows. As noted in Table 6, the 2007–2009 predicted flows match well to the actual flow data with a coefficient of 0.825, which is close to 1; however, the 2010–2011 predicted flows do not match the actual flow data very well with a coefficient of 0.334. These findings were reiterated when the MAPE was reviewed for the same sets of flows. Table 6 shows that there is a larger percentage error (17.73%) between the predicted and actual flows for 2010–2011 and a small percentage error (8.42%) between the predicted and actual flows for 2007–2009. This model, generated using SSTS methodology, can be used for short term forecasting with reasonable accuracy.

Years in data set . | Nash–Sutcliffe efficiency coefficient . | MAPE . |
---|---|---|

2007–2011 | 0.622 | 9.49% |

2007–2009 | 0.825 | 8.42% |

2010–2011 | 0.334 | 17.73% |

Years in data set . | Nash–Sutcliffe efficiency coefficient . | MAPE . |
---|---|---|

2007–2011 | 0.622 | 9.49% |

2007–2009 | 0.825 | 8.42% |

2010–2011 | 0.334 | 17.73% |

## DISCUSSION

Both methods produce models that reasonably simulate and forecast the actual monthly time series data for the three rivers selected in this study: Alexis, Romaine and Ugjoktok Rivers. The location, basin size and flow directions did not seem to greatly affect the choice of model. Both methods produce models that estimate the actual values fairly closely. In addition, the validation results for the forecasted values derived from SSTS are very similar to the results calculated using ARMA with harmonic analysis. When considering results, both models perform approximately equally well. There are, however, pros and cons relating to the software capabilities for each of these methodologies, as listed in Table 7.

ARMA with harmonic analysis model using Minitab . | SSTS model using STAMP . |
---|---|

PROS | |

Results indicate whether diagnostic tests are significant so user can quickly diagnose the model | Seasonality can be included in a single model |

ACF plots for the model residuals are produced with the model results so user can easily decide how to improve the model | Modelling is very fast and easy to do |

Simulation of the data can be completed to verify the model accuracy | |

CONS | |

Monthly data must be deseasonalized prior to ARMA modelling | Diagnosis of the model is very time consuming and requires knowledge of diagnostic tests, some of which are specific to this methodology |

Simulation is time consuming and requires macros to perform the simulation, but can be done | Multiple runs are required to verify diagnostics before the best model can be selected |

Harmonic analysis is not built into the software and a number of macros are required to complete the seasonal modelling component | No easy way to simulate the model to verify that the summary statistics for the model are close to the actual summary statistics. Simulation can be programmed using a separate software module |

ARMA with harmonic analysis model using Minitab . | SSTS model using STAMP . |
---|---|

PROS | |

Results indicate whether diagnostic tests are significant so user can quickly diagnose the model | Seasonality can be included in a single model |

ACF plots for the model residuals are produced with the model results so user can easily decide how to improve the model | Modelling is very fast and easy to do |

Simulation of the data can be completed to verify the model accuracy | |

CONS | |

Monthly data must be deseasonalized prior to ARMA modelling | Diagnosis of the model is very time consuming and requires knowledge of diagnostic tests, some of which are specific to this methodology |

Simulation is time consuming and requires macros to perform the simulation, but can be done | Multiple runs are required to verify diagnostics before the best model can be selected |

Harmonic analysis is not built into the software and a number of macros are required to complete the seasonal modelling component | No easy way to simulate the model to verify that the summary statistics for the model are close to the actual summary statistics. Simulation can be programmed using a separate software module |

## CONCLUSIONS

Both methods are effective ways to develop models of stream flows, however for a practising engineer, the tools make the difference. Although it is more time consuming to develop an ARMA model in Minitab than to develop an SSTS model in STAMP, it is relatively easy to simulate the flows using the ARMA model. In addition, the ARMA or Box-Jenkins methodology is well-known, widely used in hydrology, and the statistical principles are easier to follow. Furthermore, there are many software packages available for ARMA modelling. The main difficulty with developing an SSTS model is in the diagnostics, and the theory behind the state-space approach is considerably more complex. If features were added to STAMP such as automatic graphing of residual plots, provision of critical values for diagnostic tests, provision of guidance for the interpretation of diagnostics in the model results, and adding the capability of simulation, the program could be a very good tool for the practising hydrotechnical engineer wanting to develop a model to simulate and forecast stream flows.