Measuring up: storytelling, evaluation and what if we admitted that it isn’t working part 1/2

The metrics of measurement, the power of stories and the lines they paint on the ground in hospital car parks.

Sophie and the BFG hatching a plan for social impact.

Part 1: The metrics of measurement, the power of stories and the lines they paint on the ground in hospital car parks

i. Social impact/ everything is sorted

In Roald Dahl’s The BFG, protagonist Sophie and the Big Friendly Giant of the title go on an important mission. Desperate to stop the other giants (these ones not so friendly) from plucking children from their beds at night and eating them, Sophie and the BFG decide they will seek help from the Queen of England. Their plan is daring. The BFG will craft an exquisitely vivid dream that portrays the current situation, and blow it into Her Majesty’s bedroom, so that she dreams it. He will deposit Sophie on the end of Her Majesty’s bed, and when the Queen wakes up, Sophie will reiterate everything Her Maj has seen in the dream. Surely this will get her attention and convince her to help them? Fortunately everything goes to plan and the Queen is so affected by what she has experienced that she calls the heads of the army and the air force immediately. Commanded by Her Majesty, they capture the murderous giants and deposit them in large pits where they can no longer do any harm. Sophie and the BFG have won.

Sophie and the BFG have also orchestrated a storytelling project with significant social impact - the immersive story event (the dream) and post-event discussion space (chat with Sophie) plus a clear call to action convinced a high-powered individual to make an unprecedented policy decision and commit significant resources to solving a problem. All this before the BFG has finished eating the voluminous breakfast provided by Buckingham Palace’s kitchens.

Except The BFG isn’t real life. It’s a story - and a story aimed at children. In the real world, this kind of thing doesn’t happen. So why do we insist on approaching social impact measurement and evaluation of arts projects like it’s a story for 8-year-olds, rather than the fraught, complicated and highly politicised reality?

Read on for a highly subjective whistle-stop tour through stories and how we use them to make sense of information (for better or worse); the political assumptions that underlie ideas that purport to be neutral; the centrality of (not enough) money; and the art that defies all these models.

IMPACT: Her Majesty the Queen commands the heads of the army and air force to capture the horrible child-eating giants.

ii. Once upon a time

Humans love stories. Our brains love stories. And by that, I mean that our brains are constantly turning confusing and unmanageable amounts of information into something we can deal with. In The Science of Storytelling, Will Storr describes how the brain imposes 'cause and effect' theories on pieces of information: this is happening because of that. We actually fit external events to these internal stories, rather than vice versa. As well as helping us process information, these stories give our existence meaning. It's terrifying to imagine that we might just be living in random chaos. We use stories to make sense of things and to structure otherwise meaningless information.

Part of the way we use stories to make sense of our existence is to categorise and measure. Once we know what kind of a thing a thing is, or how much of a thing there is today compared to how much there was yesterday, that’s already a story: connected events with a relationship to time. Categorisation and measurement are ways to impose meaning, to tell a story. 

But it also works the other way around. How you measure affects the story you can tell. Hospitals have targets for how quickly they process emergency patients. I heard a story about a local hospital which repainted the line on the road which indicated that an ambulance had ‘arrived’ at the hospital - and would start the clock on the response time. By moving the line 50 metres down the road, they were more likely to be able to tell a story of success: patients treated within the target window, rather than the clock ticking while ambulances queued to offload patients into the overburdened hospital. 

If how you measure affects the story you tell, then playing with where it starts and ends will give you remarkably different results. In Red Riding Hood, pause the story after Red Riding Hood declines to go into the forest with the wolf and skips off along the path to Granny’s, and she wins. Pause the story after the wolf eats Granny and he wins - or after he declares ‘all the better to eat you with’ but before the huntsman turns up, and the wolf is successful plus you’ve got a cliff-hanger for Red Riding Hood II: Just when you thought it was safe to go back into the forest….

This isn’t just a whimsical example drawing on fairytales. In the LRB podcast episode ‘Protest, what is it good for?’, host Thomas Jones and guest James Butler discuss Vincent Bevins’ book If We Burn: The Mass Protest Decade and the Missing Revolution. They talk about the history of protest in Brazil - and how the question of where you stop the story determines whether protest has failed or succeeded. Rise of Bolsonaro and his election to the presidency in 2018 - a dark day for leftwing protest. Bolsonaro defeated by Lula at the 2022 election - a moment of triumph: stop the story here to end on a high. 2024 and a growing sense of disappointment with Lula’s actions, especially on the environment - not quite such a triumphant ending to the story. The future - who’s to say?

You can secure the story you want by the metrics and scope of what you measure. Or if we don’t want to be that cynical, if we give everyone the benefit of the doubt and proceed on the assumption that everyone is trying to measure ‘well’, how do you know that the ‘right’ moment to measure has arrived?

Pause the story here for a happy ending.

iii. Measurement in the arts

The world of arts evaluation depends heavily on counting people, and on the vox pop - on pouncing on slightly startled people exiting a theatre or gallery and asking them their opinion on what they just experienced. It’s a great way to get quotes for funders, as people generally don’t like to say upsetting things directly to the faces of other people and don’t like to think they’ve just wasted their time and money, so they say nice and generic things. If that was all you needed, well done, you’ve evaluated your project by talking to audiences. But have you really learned anything?

There are instances of more longitudinal studies, but they are few and far between. A research project in 2014 explored how people perceived the value of attending theatre productions, including how this perception changes over time. The researcher Astrid Breel is currently working with playable theatre company Coney on a more long-term engagement with audiences.

But these examples are rare. It is costly to follow up three or six months after an experience: it will come as a surprise to no one working in the arts that there is rarely enough money for this kind of evaluation. In the development sector, evaluation is, as standard, 15% of the total budget. So on a 15 million pound programme, this means over 2 million pounds for evaluation. In the arts, where we’re trying to do things on a frayed shoestring, this kind of scope and potential rigour is just unimaginable.

There isn’t even a scaled-down version of these kinds of evaluation activities that could be fitted into an arts budget. Data collection costs are fixed: for example, a randomised controlled trial will cost the same amount whether your budget is 15 million pounds (a development programme) or 15 thousand pounds (an arts project). 15% of 15 thousand pounds will not buy you any sort of robust data collection. It will buy you a film team to collect vox pops after a show and make a nice video of them.

Even if you did have the resources to do a thorough, well-resourced evaluation of a storytelling project which explicitly aimed to change audiences’ thinking on a certain topic, how would you be able to isolate the impact of the storytelling project from everything else that person had experienced in the interim? It’s very rarely a single experience that changes people’s beliefs. It’s much more likely that the person is in a process, where the artwork may well have played a role.

In many ways, it’s the dynamic, organic way that culture transmits that makes it so hard to evaluate: someone watched a film and talked to their friend about it which made the friend notice an advert for a book where one of the same actors was in the film adaptation and they bought the book and lent it to their mum who left it in an Airbnb, where it was read by a guest 6 months later who had recently read an article about the same issue and talked to their brother about it, who was a documentary maker and looking for a project and unsuccessfully pitched it but the commissioner liked something in the idea and talked to a colleague about it and 2 years later someone pitched the colleague an idea and they recalled that conversation and commissioned a pilot, and someone watched the pilot and it changed how they thought about the world.

That incredibly long sentence of coincidences and unintended consequences is much more likely to have been the process of ‘impact’ than the BFG and Sophie making and executing a successful plan. So how on earth do you measure or evaluate that?

In the world of campaigning, there is a more nuanced vocabulary for this which we in the arts could think about adopting: contribution vs attribution. Contribution: what it says on the tin - this experience played a part in some kind of change. Attribution: everything happened as a result of this experience. But we don’t have this kind of vocabulary in the arts, so when we’re evaluating the impact of something, are we talking about contribution or attribution?

It strikes me that it’s both very easy and incredibly difficult to ‘prove’ the impact of something - depending on how carefully you look into the metrics of measurement.

How do we even start to measure unintended consequences?

iv. A brief digression into history

Before moving on, I want to spend a little moment with the idea of attribution, the singular transformative experience. It seems to me that this is kind of analogous to the Great Man theory of history. The Great Man theory of history was popular back in the day when the idea of Great Men was unproblematic and basically went like this: it was the unique genius of Bismarck/ Gandhi/ insert other leader here which caused x momentous political change. If you studied history at any time in the last 30 years, you will have written umpteen essays which set out the ‘accepted’ Great Man theory in the intro, then elaborate on 3-4 other themes (economic situation, some political infrastructure change, something social), before concluding that said Great Man was part of the change that happened but by no means caused it in the way that insert discredited historian here has suggested. Gets you a 2.1 every time, especially if you can top and tail with a nice quote.

The idea that one cultural or artistic experience can change everything feels very similar, and similarly naive. It wasn’t this one documentary or play or film any more than it was Mazzini or FDR or Patrice Lumumba. To ignore the social, political and economic context just feels at best intentionally obtuse and at worst really intellectually lazy. 

With campaigning, there is often a shiny press-friendly experience which leads the campaign - for example, a Virtual Reality experience which allows audiences to really understand what it means to have a certain medical condition or traumatic life experience. Even the name given to this - the hero piece - feels weirdly 20th century. It leaves me asking myself whether this work is really about the change we purport to want to make, or actually about centring ourselves. We did this. See that documentary? We made it. We made it and it changed everything. Give us awards and pay us better day rates on the next project.

Ask any A-level history student and they will tell you the Great Man theory is largely discredited - so why are we taking so long to catch up in the way we think and talk about evaluation and impact? Would we not be better off using the time we spend trying to prove our impact instead building solidarity with others, leaning into the complex network of forces necessary for change to take place? Or if that feels like an unnecessarily inflammatory question, let’s try again: if it is both very hard and kind of pointless to measure the impact of one experience, why are we doing it? Spinning these elaborate fabrications is in whose interests?

The answer, sadly, is our own.

You can read the second part of this piece here.