Why we need to go beyond the experiment
We need a hybrid approach that integrates experiments with market and social research methods
As we know, behavioural scientists are in the business of behaviour change through the design of interventions. But how do we know if the intervention we have designed will have the desired effect on behaviour? Easy, it seems: we can run an experiment, known in the industry as a Randomised Controlled Trial (RCT). Commonly considered the ‘gold standard’, the RCT might look like the end of the story. Well, maybe, but not quite so fast: there is a growing realisation that we need hybrid approaches to the evaluation of interventions, rather than always simply relying on RCTs.
Hybrid evaluation still uses an RCT – but recognises that it offers us only very specific information about the potential impact of an intervention. It places market and social research techniques alongside RCT approaches, reflecting the way the intervention needs to slot into a wider eco-system of behaviour. Without understanding that wider eco-system we cannot really assess the potential impact of the intervention. Let’s explore why.
What does an RCT give us?
RCTs typically examine the impact of one intervention (or at most a very small number) at one point in time, against a control or another intervention. Given the multi-faceted nature of most behaviour change challenges, this means there is no opportunity to assess the degree to which multiple interventions work alongside (and in combination with) each other, nor the cumulative impact they may have when they operate over time (for more discussion of this point see here).
For an experimental design to work, we need to strip out context and nullify these other influences. This is understandable – otherwise the trial would not be ‘Controlled’, and any effect we saw could be the result of all sorts of things we cannot unpick. The ‘R’ aspect mitigates a further set of influences: by randomly assigning individuals to conditions, it ensures that any differences seen across conditions can be attributed to the intervention itself rather than to differences between the people receiving it. At the same time this creates a challenge, because any one behaviour is embedded in a wider eco-system or context; an RCT does not see the full picture, it just looks at one element. This means that while RCTs are necessary, they are often not sufficient to help us understand the effect of an intervention on behaviour.
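To see why the ‘R’ does this work, the standard potential-outcomes framing is helpful (a textbook result, not anything specific to our approach): each person has an outcome Y(1) if they receive the intervention and Y(0) if they do not, and random assignment makes which one we observe independent of everything else about the person, so a simple difference in group means estimates the average treatment effect:

$$\text{ATE} = \mathbb{E}[Y(1) - Y(0)] = \mathbb{E}[Y \mid T = 1] - \mathbb{E}[Y \mid T = 0]$$

Notice what the equation does not contain: any term for the context in which the behaviour sits.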
Why we typically need to go beyond a controlled experiment
Let’s take an example to illustrate the point. Suppose an organisation wanted to encourage a health-related behaviour in the population, and developed an app that gives users a reading about their environment designed to prompt protective behaviour. Let’s say the reading is AQI, a measure of air pollution. We want to assess the impact of receiving this information on behaviour – in this case protective health behaviours such as use of an inhaler by specific populations, or mask wearing by the more general population. This can be done with a simple RCT: we give one group the app that provides AQI readings, and the control group an app that provides regular weather reports. We then measure the effect on inhaler usage and mask wearing.
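As a minimal sketch of what this trial delivers, here is a toy simulation in Python (the sample size, baseline rate and effect size are all invented for illustration, not figures from any real study):

```python
import numpy as np
from scipy import stats

# Toy simulation of the AQI-app RCT described above.
# All numbers below are made up for illustration.
rng = np.random.default_rng(42)

n = 2_000
aqi_app = rng.integers(0, 2, n).astype(bool)  # random assignment: AQI app vs weather app

# Hypothetical probabilities of taking protective action (e.g. wearing a mask)
p_control, p_treatment = 0.20, 0.27
acted = rng.random(n) < np.where(aqi_app, p_treatment, p_control)

# Under randomisation, the difference in group means estimates the average treatment effect
ate = acted[aqi_app].mean() - acted[~aqi_app].mean()

# Chi-squared test of independence on the 2x2 outcome table
table = np.array([
    [acted[aqi_app].sum(), (~acted[aqi_app]).sum()],
    [acted[~aqi_app].sum(), (~acted[~aqi_app]).sum()],
])
chi2, p_value, _, _ = stats.chi2_contingency(table)

print(f"Estimated effect on protective behaviour: {ate:+.3f} (p = {p_value:.4f})")
```

Notice how narrow the output is: a single effect estimate, at a single point in time, among people who were handed the app rather than choosing it.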
While this seems straightforward, we need to recognise what is missing. First, it does not help us understand the all-important way in which the intervention fits into the wider context or eco-system in which the behaviour is enacted: outside the trial, would people even be aware of the app, and be motivated to download it? Second, getting participants to download an app onto their phone tells us nothing about whether the information it provides changes people’s understanding of the environment and its link to the behaviour: without having been prompted via the RCT, would they understand what to do with the information and know to take protective action? Stripping away the context leaves these key questions unanswered.
Depending on the nature of the intervention, there may be a range of dimensions that are important for it to work. In this case the RCT tells us something about the effect of information on behaviour, but we have to make assumptions about the all-important wider contextual dimensions.
Moving to a hybrid approach
To identify the wider contextual dimensions through which an effective intervention operates, we need a framework of behaviour. We use our MAPPS system, which in this case suggests several dimensions worth exploring:

- Motivation: we may not be motivated to use the app (do we want to, and do we see ourselves as the type of people who would use it?)
- Ability: we may lack the abilities needed (do we know what the levels mean, or do we struggle to make it part of our routine?)
- Processing: we may not process the information in a way that is helpful (we may automatically use other cues to assess pollution, such as haze)
- Physical: there may be physical or environmental barriers (is it available when needed, and do we have a phone to download it onto?)
- Social: social norms and cultural values may reduce take-up (other people may simply discount it, influencing our behaviour)
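To make the idea of screening an intervention against these dimensions concrete, here is a deliberately simple, hypothetical sketch in Python (the dimension names follow the list above, but the scoring scheme, threshold and values are invented for illustration and are not how MAPPS is actually operationalised):

```python
from dataclasses import dataclass

@dataclass
class MappsAssessment:
    """Hypothetical 0-1 scores per dimension, e.g. from survey and qualitative research."""
    motivation: float  # do people want to use it, and see themselves as users?
    ability: float     # do they know what the levels mean, and fit it into routines?
    processing: float  # do other cues (e.g. visible haze) override the readings?
    physical: float    # is it available when needed, on a phone they own?
    social: float      # do norms and peers support it, or discount it?

    def barriers(self, threshold: float = 0.5) -> list[str]:
        # Dimensions scoring below the (invented) threshold flag likely failure points
        return [name for name, score in vars(self).items() if score < threshold]

# Illustrative scores for the AQI app (all values made up)
assessment = MappsAssessment(motivation=0.4, ability=0.7, processing=0.3,
                             physical=0.8, social=0.6)
print(assessment.barriers())  # ['motivation', 'processing']
```

The point of the sketch is not the numbers but the shape of the exercise: a strong effect in the RCT can coexist with low scores on dimensions the trial never touched.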
If we simply use the results of the RCT, we don’t see all these other elements that might affect effectiveness. But of course, simply because they are difficult to identify does not mean they are not there. Indeed, this is perhaps one of the reasons why so many innovations fail: looked at in isolation, an app or device that gives AQI readings can seem like a very sensible proposition. But looked at within the wider eco-system of behaviours, we can see there are many good reasons why, however good it is, it may still simply not have the desired effect.
Hopefully this example helps to illustrate that whilst RCTs are important, we must recognise they need to be part of a wider understanding of the behaviour change challenge. Behaviours do not exist independently of each other but sit in a wider system. If we isolate one element, then this means that the design (and subsequent evaluation) of the intervention is limited. Stripped of any context we cannot hope to see the true predictive validity of the intervention, as it is contingent on so many other parts of the eco-system.
This is why evaluation needs market and social research methods operating alongside experimental techniques: neither is sufficient by itself, but both are necessary. RCTs are often called the ‘gold standard’ for testing, and while there is some truth in that, the claim is not quite right. When RCTs are used alongside broader social and market research methods – capturing both how the intervention operates and its role within the broader context that supports the behaviour – we have a more comprehensive standard for testing. A new gold standard, in fact!
Final thought: the widespread use of the academic practice of experimental design has bled into practitioner work without recognising that the exam question is different. An academic investigation is typically trying to establish the effects of specific mechanisms on behaviour: a worthwhile and important endeavour, of course. But as practitioners we do not have the luxury of investigating individual mechanisms – we are looking at behaviour ‘in the wild’, where behaviours only ever operate in a broad eco-system, with different dimensions in play and inevitably working interdependently. We need experimental methods, but our approach to intervention testing must also reflect this wider behavioural context.
For more on this topic it is worth checking out the work of Angus Deaton and Nancy Cartwright including this article.