Redefining Rigor: Describing quality evaluation in complex, adaptive settings

This blog is co-authored by Dr. Jewlya Lynn, Spark Policy Institute, and Hallie Preskill, FSG. The blog is also posted on FSG’s website: www.fsg.org 

Traditionally, evaluation has focused on understanding whether a program is making progress against pre-determined indicators. In this context, the quality of the evaluation is often measured in part by the “rigor” of the methods and scientific inquiry. Experimental and quasi-experimental methods are highly valued and seen as the most rigorous designs, even when they may hamper the ability of the program to adapt and be responsive to its environment.

Evaluations of complex systems-change strategies or adaptive, innovative programs cannot use this same yardstick to measure quality. An experimental design is hard to apply when a strategy’s success is not fully defined upfront and depends on being responsive to the environment. As recognition of the need for these programs grows, and with it the number of complex programs, so does the need for a new yardstick. In response, we proposed a new definition of rigor at the 2015 American Evaluation Association annual conference, one that broadens how we think about quality in evaluation to encompass what is critical when the target of the evaluation is complex, adaptive, and emergent.

We propose that rigor be redefined to include a balance between four criteria:

  • Quality of the Thinking: The extent to which the evaluation’s design and implementation engages in deep analysis that focuses on patterns, themes and values (drawing on systems thinking); seeks alternative explanations and interpretations; is grounded in the research literature; and looks for outliers that offer different perspectives.
  • Credibility and Legitimacy of the Claims: The extent to which the data is trustworthy, including the confidence in the findings; the transferability of findings to other contexts; the consistency and repeatability of the findings; and the extent to which the findings are shaped by respondents, rather than evaluator bias, motivation, or interests.
  • Cultural Responsiveness and Context: The extent to which the evaluation questions, methods, and analysis respect and reflect the stakeholders’ values and context, their definitions of success, their experiences and perceptions, and their insights about what is happening.
  • Quality and Value of the Learning Process: The extent to which the learning process engages the people who most need the information, in a way that allows for reflection, dialogue, testing assumptions, and asking new questions, directly contributing to making decisions that help improve the process and outcomes.

The concept of balancing the four criteria is at the heart of this redefinition of rigor. Regardless of its other positive attributes, an evaluation of a complex, adaptive program that fails to take into account systems thinking will not be responsive to the needs of that program. Similarly, an evaluation that fails to provide timely information for making decisions lacks rigor even if the quality of the thinking and the legitimacy of the claims are high.

The implications of this redefinition are many.

  • From an evaluator’s point of view, it provides a new checklist of considerations when designing and implementing an evaluation. It suggests that specific, up-front work will be needed to understand the cultural context, the potential users of the evaluation and the decisions they need to make, and the level of complexity in the environment and the program itself. At the same time, it maintains the focus that the traditional definition of rigor has always had on leveraging learnings from previous research and seeking consistent and repeatable findings. Ultimately, it asks the evaluator to balance the desire for the highest-quality methods and design with the need for the evaluation to have value for the end-user, and for it to be contextually appropriate.
  • From an evaluation purchaser’s point of view, it provides criteria for considering the value of potential evaluators, evaluation plans, and reports. It can be a way of articulating up-front expectations or comparing the quality of different approaches to an evaluation.
  • From a programmatic point of view, it provides a yardstick for assessing not only evaluators but also the usefulness and value of their evaluation results. It can help program leaders and staff have confidence in the evaluation findings, or give them a way to talk about their concerns as they look at results.

Across evaluators, evaluation purchasers, and users of evaluation, this redefinition of rigor provides a new way of articulating expectations of evaluation and elevating the quality and value of evaluations. It is our hope that this balanced approach helps evaluators, evaluation purchasers, and evaluation users share ownership of the concept of rigor and find the right balance of the criteria for their evaluations.

How do you know if you’re getting the best quality in your evaluations?

Quality in evaluation used to be defined as rigor (and sometimes still is), with rigor meaning the competence of the evaluator, the legitimacy of the process and, of course, applying the best research methods to the collection and analysis of data. These are important, but they don’t count as an all-encompassing definition of quality, particularly in complex, adaptive settings where evaluation partners with strategy.

If we cannot count on these measures to define quality, what are alternative ways of understanding whether your evaluation is high quality? Hallie Preskill from FSG and I will be joining forces at the American Evaluation Association’s annual conference this Friday to explore this issue. We are proposing that the concept of “rigor” (and thus what you can look for in your evaluations) can – and should – be redefined as:

  • Balancing whether the evaluation is useful, inclusive of multiple perspectives, unbiased, accurate, and timely.
  • The quality of the learning process, including whether it engages the people who need the information when they need the information.
  • The quality of the thinking, including whether the evaluation engages in deep analysis, seeks alternative explanations, situates findings within the literature, and uses systems thinking.
  • The credibility and legitimacy of the findings, including whether people are confident in the ‘truth’ being presented.
  • Responsiveness to the cultural context, including the integration of stakeholders’ values and definitions of success, as well as who helps to interpret the findings.

Are you attending the annual conference? Come join us for an interactive discussion on how to reframe rigor and quality in your evaluations.