Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) are widely used measures for evaluating the forecasting performance of time series models. Although these absolute measures can be used to compare the performance of competing models, they provide no reference against which to judge the goodness of the forecasts. In this paper, two relative measures, the coefficient of efficiency (E) and the index of agreement (d), and their modified versions (EM, EMP, dM and dMP) are presented; for all six, values closer to one are desirable. These measures are illustrated by comparing the modeling ability and validation forecasting performance of Nonlinear Additive Autoregressive with Exogenous variables (NAARX), Nested Threshold Autoregressive (NeTAR), and Multiple Nonlinear Inputs Transfer Function (MNITF) models developed for the Jökulsá eystri daily streamflow data. The results suggest that NeTAR describes the system best and gives better 1- and 2-day-ahead validation forecasts. MNITF gives better 3-day-ahead forecasts, and NeTAR and NAARX give comparable performance for 4- and 5-day-ahead forecasting. The values of E and d were larger than those of the modified versions, giving a false sense of model performance, and unlike the modified versions, they decreased as forecast lead times increased. Differences among the values of these six relative measures can reveal the sensitivity of competing models to outliers and their potential for long-term forecasting. Accordingly, NeTAR was the least sensitive to outliers and NAARX was the most sensitive, with MNITF in between; and NAARX showed the most potential for long-term streamflow forecasting.
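The abstract does not state the formulas behind these measures, so the following is only a minimal sketch under stated assumptions. It uses the commonly cited forms of the coefficient of efficiency and the index of agreement, and it assumes that the modified versions EM and dM replace squared differences with absolute differences (exponent 1 instead of 2). The measures EMP and dMP are not defined in the abstract and are left out. All function names and the sample data are hypothetical and are not taken from the paper.

```python
# Sketch of absolute and relative forecast accuracy measures.
# Assumed (not confirmed by the abstract) definitions:
#   E = 1 - sum|O - P|^j / sum|O - mean(O)|^j
#   d = 1 - sum|O - P|^j / sum(|P - mean(O)| + |O - mean(O)|)^j
# with j = 2 for E and d, and j = 1 for the modified versions EM and dM.

def _mean(values):
    return sum(values) / len(values)

def rmse(obs, pred):
    """Root mean squared error (absolute measure, in the units of the data)."""
    return _mean([(o - p) ** 2 for o, p in zip(obs, pred)]) ** 0.5

def mae(obs, pred):
    """Mean absolute error (absolute measure, in the units of the data)."""
    return _mean([abs(o - p) for o, p in zip(obs, pred)])

def coefficient_of_efficiency(obs, pred, power=2):
    """E for power=2; assumed modified version EM for power=1."""
    obar = _mean(obs)
    num = sum(abs(o - p) ** power for o, p in zip(obs, pred))
    den = sum(abs(o - obar) ** power for o in obs)
    return 1.0 - num / den

def index_of_agreement(obs, pred, power=2):
    """d for power=2; assumed modified version dM for power=1."""
    obar = _mean(obs)
    num = sum(abs(o - p) ** power for o, p in zip(obs, pred))
    den = sum((abs(p - obar) + abs(o - obar)) ** power
              for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical observed and forecast values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
forecast = [3.0, 4.0, 5.0, 10.0]

E = coefficient_of_efficiency(observed, forecast)            # about 0.70
EM = coefficient_of_efficiency(observed, forecast, power=1)  # about 0.50
d = index_of_agreement(observed, forecast)                   # about 0.936
dM = index_of_agreement(observed, forecast, power=1)         # about 0.75
```

On this toy series, E and d come out larger than EM and dM because squaring the terms changes how much weight the largest deviations carry in both numerator and denominator. This matches the direction of the difference reported in the abstract, although a four-point example says nothing about the streamflow models themselves.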
