May 26, 2023•1,701 words
For a long time I accepted the widely held view that choice blindness experiments show that much post hoc rationalisation is actually confabulation. I now think that was a mistake. Understanding why it was a mistake, and in particular why it rested on a misunderstanding about the nature of rationalisation, will help us avoid making a much worse mistake with respect to calls for algorithmic transparency and accountability.
In choice blindness experiments subjects are asked to say which of two options they prefer and then to explain their choice. Between the two tasks, the experimenter - usually a trained prestidigitator - swaps the options. Around two-thirds of subjects, though in some experiments up to 90%, do not notice the switch, and around 70% of those proceed to give an explanation based on features of the option they didn't choose, but now have before them, which are not shared by the option they did choose.
The standard interpretation of these results is that the post hoc rationalisations are not in fact recollections of the reasons the subject had for choosing but confabulations, i.e. made up in response to the request without the subject consciously fictionalising. And it is very tempting to conclude that outside the experimental context, when there is no switch, the same process of confabulation will also be at work.
All this presupposes that when we ask someone to explain their choice, which is a request for a subjective justification for preferring one option, what we are asking for is historical information: what was it about this option which led to your choosing it? This may seem obvious, because the request for justification and the answer given are both in the past tense: 'Why did you choose that?'.
But before we take that at face value, we need to think a bit harder about the purpose of asking for and giving reasons for choices or decisions. Sometimes this is forensic: we are tracking back from a decision to how it was made in order to learn something, such as whether due process was followed, if mistakes were made, or if there was good practice. This may or may not be with the intention of praising or blaming. In such cases, it does matter that the post hoc rationalisation is in fact an accurate account of what went on prior to the decision, but we only tend to have this forensic interest in situations where there is an expectation of explicit deliberation, a weighing of pros and cons.
In contrast, the choice blindness experiments rely upon there being no such explicit deliberation, otherwise the switch would be noticed. If we ask for reasons, for a justification, in such cases, we are not asking a forensic question. I will come back to this later, but first we need to look at algorithmic decision-making.
'Algorithm' is one of those nice technical terms, like 'data', where everyone is happy with rough-and-ready definitions because nothing hangs on whether borderline cases are classified one way or the other. What is important is that all algorithms have input (data) and output and the process for getting from input to output is such that it can be followed by a machine, which we can call a 'computer' (because it computes the output from the input). Some algorithms, such as arithmetical operations on not too large numbers, can also be followed by humans. In fact, when Turing was writing his famous paper on artificial intelligence, the word 'computer' meant 'person who computes'. But no one will deny that computers can now follow algorithms of a complexity no individual human can understand let alone follow themselves.
Some algorithms are deterministic, in that the same input always produces the same output. Others are stochastic, relying on statistical generalisations over the data to calculate probabilities, and do not always produce the same output from the same input. Some algorithms are formal in the sense that they have been made explicit in a formal language. However, most of what goes by the name 'AI' at the moment is non-formal: the computer has been trained to produce the desired outputs from the available data, and the effect of that training is that the computer itself constructs the algorithm. But it does not do this explicitly and there is no formal representation of the algorithm for us to inspect. We just know that it is somehow producing the desired outputs from the data, and that it must be doing so algorithmically (though usually stochastically).
Incidentally, that is why generative AI like ChatGPT can be so bad at quite simple maths problems: it does not work out the answer, it predicts what answer the questioner is probably looking for on the basis of patterns in the text it was trained on, much of which came from the internet.
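To make the distinction between deterministic and stochastic algorithms concrete, here is a minimal sketch in Python. The function names, the threshold of 50 and the rule that a score of 70 means a 70% chance of acceptance are all invented for illustration; they are not taken from any real system.

```python
import random


def deterministic_decision(score: int, threshold: int = 50) -> str:
    """The same score always produces the same decision."""
    return "accept" if score >= threshold else "reject"


def stochastic_decision(score: int, rng: random.Random) -> str:
    """Accept with probability score/100 (an illustrative assumption),
    so the same score can produce different decisions on different calls."""
    probability_of_accept = max(0.0, min(1.0, score / 100))
    return "accept" if rng.random() < probability_of_accept else "reject"


# Ask each algorithm the same question ten times.
rng = random.Random()
print([deterministic_decision(60) for _ in range(10)])
print([stochastic_decision(60, rng) for _ in range(10)])
```

The first list will contain the same answer ten times over. The second will usually contain a mixture, even though the input never changed.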
If the output is a decision, then we have algorithmic decision-making. If the algorithm is formal, then this decision-making is very similar to explicit deliberation by humans and thus the forensic request for reasons makes perfect sense. But if the algorithm is non-formal, there is no way to answer that forensic request. This is widely thought to be a problem, often called the 'Explainability/Transparency Problem for AI'.
Setting aside the forensic question
A forensic request for justification is back-tracking: it focuses on the process which led to the decision. Sometimes it is really important to ask forensic questions about algorithmic decision-making in order to identify legal or moral responsibility for bad decisions and their effects. How much does it matter to this whether the algorithm is formal or non-formal?
Take as an example a formal algorithmic decision-making process such as a points-based immigration system. Such a system is transparent in the sense that someone who is rejected can see exactly why they were rejected: which criterion they didn't get enough points on. While that information explains the decision, it does not (yet) justify that decision. It is not a rationalisation, a presenting of the decision as based on good reasons, showing it to be a good decision.
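A toy version of such a formal algorithm might look like the sketch below. The criteria, the points required and the function name `assess` are hypothetical, made up purely to illustrate the kind of transparency at issue; no real immigration system is being described.

```python
from typing import Dict, List, Tuple

# Hypothetical criteria and required points, invented for illustration only.
REQUIRED_POINTS: Dict[str, int] = {"job_offer": 20, "language": 10, "salary": 20}


def assess(points_awarded: Dict[str, int]) -> Tuple[str, List[str]]:
    """Return the decision plus every criterion on which the applicant fell short."""
    shortfalls = [
        f"{criterion}: {points_awarded.get(criterion, 0)} of {required} required points"
        for criterion, required in REQUIRED_POINTS.items()
        if points_awarded.get(criterion, 0) < required
    ]
    decision = "reject" if shortfalls else "accept"
    return decision, shortfalls


print(assess({"job_offer": 20, "language": 0, "salary": 20}))
# ('reject', ['language: 0 of 10 required points'])
```

The output tells the applicant exactly which criterion sank them. What it cannot tell them is whether ten points for language was a good rule to have in the first place.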
Suppose we come to the conclusion that the points-based immigration algorithm is making bad decisions, rejecting some people who should be given visas (false negatives) and accepting others who shouldn't (false positives). Since the algorithm is explicit, we can tinker with it to reduce those cases. But that does not answer our forensic interest in asking for reasons, since it does not tell us whether we should regard this as a culpable error or just an unfortunate mistake. If we have such forensic interests and we ask for justifications, we don't want to know the details of the algorithm. We want to know why it was decided to deploy this particular algorithm and on the basis of what information about its performance.
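The performance information mentioned here can be as simple as a tally of the two kinds of error. The sketch below assumes, for illustration, that each case is recorded as a pair: the decision the algorithm made and a separate judgement about whether the person should have been accepted. The function name and the sample cases are hypothetical, and where that separate judgement comes from is exactly the sort of thing the code cannot settle.

```python
from typing import Iterable, Tuple


def count_errors(cases: Iterable[Tuple[str, bool]]) -> Tuple[int, int]:
    """Count (false negatives, false positives) over (decision, should_accept) pairs."""
    false_negatives = 0  # rejected, but should have been accepted
    false_positives = 0  # accepted, but should not have been
    for decision, should_accept in cases:
        if decision == "reject" and should_accept:
            false_negatives += 1
        elif decision == "accept" and not should_accept:
            false_positives += 1
    return false_negatives, false_positives


# Made-up cases for illustration.
cases = [("accept", True), ("reject", True), ("accept", False), ("reject", False), ("reject", True)]
print(count_errors(cases))
# (2, 1)
```

Numbers like these are what someone deciding to deploy the algorithm could have looked at. Whether they did, and what they made of them, is the forensic question.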
Exactly the same would apply were the algorithm non-formal. In fact, if we had an amazing tool which was able to extract a formal representation of the non-formal algorithm (or give us a similar level of understanding, e.g. heat-maps in AI-assisted radiology), that wouldn't help answer the forensic question. Making the algorithm explicit gives transparency of process: it tells us which aspects of the data had what sort of effect on the final decision, but it does not tell us anything at all about whether the decision was justified. (It might help us see that those who decided to implement the algorithm were culpable for the bad decisions because they did not make sufficient effort to understand the tool they were using, but there are other ways to see that, as the well-known cases of algorithmic injustice show.)
We can conclude that in some contexts of inquiry, the forensic question matters and we need to know the actual process which led to the decision. With non-formal algorithms we can at best create a separate additional technology to give some of that information. The research field of 'Explainable AI' or XAI tries to achieve this, but the main driver there is to increase end-user trust and thus take-up of the system by persuading the users that it makes decisions in similar ways to how they would make that decision.
Post hoc rationalisations can be good
Sometimes we ask someone for their reasons because we simply want to know that the decision was made on the basis of understanding rather than being a lucky guess. For example, when a student gives the correct answer to a question, we might ask them to explain the answer as well in order to show that they have understood the material taught. Suppose a student did merely guess the correct answer but, when asked to explain, gave a full and accurate justification. In such a case they were lucky but didn't need the luck, and they deserve the marks.
This draws attention to the fact that in many cases where we ask someone to explain or justify their decision, we are not interested in unpicking the processes that led them there. What we want is to locate that decision in the 'space of reasons', to make it a candidate for rational review and consequent praise or critique. In the choice blindness experiments, what the subject is doing is trying to show that the choice they (appear to) have made is a good choice to have made. They are not confabulating a story about how they came to choose that option; they are situating the choice of that option in the space of reasons so that it - the decision - can be evaluated. They are deceived, but not self-deceived because they are not trying to describe themselves in that sense.
Such human behaviour, placing the decisions we have already made into the space of reasons where they can be discussed as rational choices and potentially revised, is common and valuable. It allows us to engage with each other as rational agents, to discuss and develop, to accept diversity and challenge perversity. Human life would not be enhanced by a tool that could monitor how we make decisions and give 'Explainable Human Intelligence' by unpicking largely non-conscious processes. That may sometimes - though I suspect rarely - help with forensic questions, but it certainly wouldn't help our relationships with colleagues, friends and loved ones.