Despite advances in statistical and mathematical modelling methods, little attention has been paid to improving how model quality is judged. The coefficient of determination (R2) is arguably the most widely applied ‘goodness-of-fit’ metric in modelling and prediction of environmental systems. However, R2 has known issues: (i) it can be low for an accurate model and high for an imperfect one; (ii) it yields the same value whether the observed series is regressed on the modelled series or vice versa; and (iii) it does not quantify a model's bias (B). A new model skill score E and a revised R-squared (RRS) are presented, each combining correlation, the bias term B and the capacity to capture variability. E and RRS differ in the forms of correlation and of the term B that each uses. The acceptability of E and RRS was demonstrated by comparing results from a large number of hydrological simulations. By applying E and RRS, the modeller can diagnose and expose systematic issues behind model optimizations based on other goodness-of-fit measures such as the Nash–Sutcliffe efficiency (NSE) and mean squared error. Unlike NSE, which varies from −∞ to 1, E and RRS range from 0 to 1. MATLAB codes for computing E and RRS are provided.

  • R2 is arguably the most widely applied goodness-of-fit measure.

  • R2 has known issues, e.g. it (i) does not quantify bias and (ii) can be low for an accurate model and high for an imperfect one.

  • Revised R2 (RRS) and a metric E are presented to address the issues of R2.

  • E & RRS allow diagnostic exposure of systematic issues behind model optimizations based on other goodness-of-fit measures such as mean squared error.
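The known issues of R2 listed above can be illustrated with a minimal Python sketch. It assumes R2 is computed as the squared Pearson correlation and NSE as one minus the ratio of the sum of squared errors to the sum of squared deviations of the observations from their mean. The series `obs` and `sim_biased` are hypothetical values chosen for illustration, and the helper names are not taken from the article. The sketch does not implement E or RRS, whose formulas are not given in this abstract.

```python
import numpy as np


def r_squared(x, y):
    """R2 taken here as the squared Pearson correlation between x and y."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def mean_bias(obs, sim):
    """Mean of (sim - obs); positive values indicate overestimation."""
    return float(np.mean(np.asarray(sim, dtype=float) - np.asarray(obs, dtype=float)))


# Hypothetical observed series (illustrative values only).
obs = np.array([2.0, 3.5, 5.0, 4.0, 6.5, 8.0, 7.0, 5.5])

# Hypothetical model output: the right pattern, but scaled and shifted,
# so every simulated value is far too large.
sim_biased = 2.0 * obs + 10.0

r2_forward = r_squared(obs, sim_biased)   # observed vs modelled
r2_reverse = r_squared(sim_biased, obs)   # modelled vs observed
nse_value = nse(obs, sim_biased)
bias_value = mean_bias(obs, sim_biased)

print(f"R2 (obs vs sim): {r2_forward:.3f}")
print(f"R2 (sim vs obs): {r2_reverse:.3f}")
print(f"NSE:             {nse_value:.3f}")
print(f"Mean bias:       {bias_value:.3f}")
```

In this example R2 is essentially 1 in both directions even though the simulated series is strongly biased, while NSE is negative. This shows why R2 alone cannot flag bias and why NSE has no lower bound.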

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
