Abstract for: All models are wrong, some are useful… but how do you know?

We present an operational, ex-ante definition of model fitness for purpose and a framework for evaluating model fitness in transparent, objective and intercomparable ways. As a team that develops and uses models for the integrated assessment of climate and sustainability challenges, we argue that basic guidelines for assessing model fitness against technical, structural and behavioral criteria need to be complemented by more formal ways of assessing the broader context of model development and use. Our framework is designed to be metrologically useful. It takes into account the use and user context, the problem context, and the project context, with criteria relating to model usefulness, reliability and feasibility, respectively. We combine construct theory with the Rasch measurement approach: constructs defined through expert consultation can be mapped onto a shared construct continuum, providing a basis for quantifying the various dimensions of model fitness. As a demonstration case study, we apply the framework to FRIDA 1.0, a global system dynamics model currently under development, where such a co-created evaluation of fitness for purpose can play an important role in informing priorities for future model development and application.