Our measurement and evaluation approaches are failing. Here’s what we can do to fix them

By Abram El-Sabagh

December 11, 2023

Measurement and evaluation tend to focus on hard numbers that are gameable and easy to manipulate. (SpicyTruffel/Adobe)

Measurement and evaluation as we know them have been around in one form or another for a long time. In the 1700s, sailors at sea for long periods were prone to a condition in which their teeth would fall out and their gums would swell: a disease we now know as scurvy.

James Lind, a doctor with the British Royal Navy, suspected the cause of the disease was a lack of access to fresh fruit, so he set up an experiment. The control group continued with their diet as normal, while the experimental group ate citrus fruit regularly. From this, Lind identified a likely remedy and prescribed citrus fruits for sailors (the underlying cause, we now know, was vitamin C deficiency).

Measurement and evaluation come down to choosing the best course of action. How can I improve?

Well, let's measure and evaluate our findings to figure out which course of action is best. This is typically where approaches like randomised controlled trials work well: a specific outcome is sought, and we evaluate the impact different approaches have on that outcome.

The trouble is, there is no ‘best course of action’ in complex systems. By their very nature, complex systems are unpredictable.

Why do we prefer hard numbers?

There’s a simple reason why we, as a society, leaders and organisations, prefer hard numbers. Numbers have patterns. They are our black and white in a world of greys. They are familiar. We encounter thousands of numbers every hour. Numbers are easy to grapple with. They give us a sense of certainty, and, to be fair (and as a software engineer), numbers are actually pretty helpful.

In the early days of computing, the only input you could provide was numerical. Compare that with recent advancements in AI, large language models and natural language processing, and it’s easy to see how far we’ve come.

Stories on the other hand are more complex. They stir emotion, they get us thinking. Ironically, we’re more likely to remember a number if it’s part of a story.

Aswath Damodaran sums it up nicely, “Stories create connections and get remembered, but numbers convince people. They give a sense of precision to even the most imprecise stories, and putting a number on a judgement call makes you feel more comfortable when dealing with uncertainty.”

What makes our traditional approach not fit for purpose?

So why do I, as a practitioner working towards systems change, have an issue with our current approaches? In short, they’re not fit for purpose.

The long story is that despite their allure of certainty, numbers can be gamed.

If I’m reporting on a project’s progress and impact, I can pick and choose which things to report on.

If I want to get help, I might share the budget report and show we’re close to the final budget. If I want to demonstrate progress, I might focus on how many workshops we ran. If I want to give a good representation of the project, I might do both.

Recently in our own organisation, we wanted to help people have targeted metrics of performance. We quickly learnt that it wasn’t as simple as ‘what’s your utilisation target?’ We realised that what we cared about was behaviours and attitudes towards work, helping each other, and focusing on the wellbeing of our team and our organisation’s sustainability.

We know organisations and governments pick and choose which numbers to report on. That’s the whole idea behind greenwashing – when an organisation spends more time and money on marketing than on actually reducing its footprint.

Anthropologist Marilyn Strathern put it well: "When a measure becomes a target, it ceases to be a good measure." Some organisations try to sidestep this trap; Webflow, for example, uses core behaviours (instead of values) to guide its team as to what good looks like.

Numbers don’t give us the whole picture

Numbers don’t account for the complexity of the world we live in.

For corporations, the focus was historically on profit and loss. In the 1990s, John Elkington introduced the triple bottom line — profit, people and planet. More recently, corporations have been pushed to report on sustainability metrics as well.

Although these frameworks are helpful and needed, I think they miss the point. The issue is not that numbers are inherently bad; it's that we use them while pretending they are unbiased and matter-of-fact. We keep trying to quantify, quantify, quantify instead of looking at the complex system as a whole and trying to change it.

Early last year, I became a parent and it has taught me a lot. I don’t sit next to my child and count how many times he says ‘mama’. Instead, we play, we explore, we go out, we learn, and we engage.

When an issue arises, it's obvious to us as parents and to our families, and we seek interventions to tackle it. Sometimes these interventions work, and other times they don't. So we try something else.

Where to next?

Thankfully, researchers and practitioners around Australia (and the world) have caught on, but the practices of effective measurement are not yet widespread.

We still have governments at all levels wanting to quantify the impact of different measures — almost an example of (hopefully accidental) greenwashing.

Australia's chief scientist, Dr Cathy Foley, recently highlighted the issue, saying: "The current system for assessing research careers for hiring, promotion and funding is not fit for purpose".

Specifically, she said: “Narrow research metrics create perverse incentives and a ‘publish or perish’ mentality. Researchers may be incentivised to publish iteratively, and to chase citations, rather than focusing on quality. The current practices do not incentivise innovative or multidisciplinary research, nor recognise the breadth of roles in a healthy science and research system.”

Much has been said about the social sciences' approach to measurement. Numbers are not going away anytime soon, and neither should they. Instead, we are weaving values, narratives and stories into numbers-based assessments of impact.

We’re reminding each other of the importance of human connection, and relationships, and rolling up our sleeves to work together towards a better future.

So my message to you is this: when thinking about the impact your work and projects are making, discuss these questions:

  1. What are we trying to achieve together? What signs would tell us whether we’re on the right track?
  2. What indicators are we choosing to use and what bias might that create? What behaviours might that incentivise?
  3. What are we trying to achieve with the indicators we collect and report on? How might they introduce unintended consequences?
