On 'Reasoning' AI Systems
April 3, 2025 • 1,884 words
There is a lot of attention being given to new AI models - primarily LLMs - which are doing something their owners like to describe as 'reasoning'.¹
What this means, at a very high level of generality, is that rather than simply giving an answer in response to a prompt which asks for a solution to some problem or task, they give a series of steps which lead to an answer. Where this is working well, what they produce looks very similar to what a human would produce if given the problem with the additional instruction 'Show your workings' (do maths exams still use that phrase?).
If we want to understand the significance of this, we need to think a bit about what humans are doing when they are reasoning.
Three kinds of human reasoning
There are at least three things humans might be doing, in the sense of what their purpose might be, when 'reasoning':
- They might be going through a conscious process of trying to work out a solution to a problem. Call this 'deliberation'.
Deliberation is in practice a lot more messy and associationist than we like to admit, involving loose analogies, hidden assumptions and lucky guesses.
- They might have made up their own mind on the question and be seeking to get others to agree with them by presenting an idealised but fictional deliberation. Call this 'persuasion'.
We have many formal versions of this kind of reasoning, from legal rulings to academic research papers. Effective persuasion is rarely intellectual autobiography: it is rarely an account of the actual deliberation, because most audiences would find that deliberation unpersuasive, since they are different people, with different background assumptions and different intuitions about what is analogous to what.
- They might be seeking not to change other people's minds (nor to get them to make up their minds in one specific way), but to show that their own judgements are rational, to show that they meet certain standards of epistemic evaluation. Call this 'rationalization'.
Like persuasion, rationalization is rarely, if ever, intellectual autobiography. Our epistemic evaluations are not of the person - of how well their deliberation met some ideal standard - but of the proposition they are asserting, the claim they are making. We want to know how that claim sits in the space of reasons: can it be supported by evidence and logic? This is probably the most common form of communicated human reasoning, and it serves a vital function in social contexts, allowing us to respect each other's opinions without necessarily agreeing with them. It is also fundamental to our ability to see others as 'making sense' - it is what makes human decisions and judgements 'explainable'.
Reasoning models
So is a 'reasoning model' doing any of these three things: deliberation, persuasion or rationalization? Here is the a priori answer, which has all the risks of being proved wrong by experience:
Reasoning models are not engaged in persuasion or rationalization, primarily because those are activities which require one to have thoughts about the minds of others - an interest in what others think, either about the proposition one is trying to persuade them of or about oneself - and LLMs don't have thoughts at all. Furthermore, at least in what I have seen, they are not even mimicking persuasion and rationalization, because they do not respond to follow-up prompts challenging their reasoning in the way that a human engaged in either of those activities would. To put it bluntly, they don't defend their reasoning against criticism, which is fundamental to the activities of persuasion and rationalization. (As is stepping back when, but only when, the presented reasoning is shown to be inadequate.) That said, there is no reason to think that LLMs couldn't produce outputs which are indistinguishable from the outputs of a human persuader or rationalizer, as their success in some examinations suggests.
What about deliberation? I suspect that this is what the owners of these LLMs are hoping to persuade us their models are up to. The key feature of deliberation, and precisely what associationist psychology fails to explain, is that a deliberator is self-critical, continually sensitive to the rational adequacy of each step of their reasoning. Of course, humans are often tired, or lazy, or distracted, and this results in a lapse of self-criticism which only shows up when they attempt to persuade or rationalize; but that is a lapse from a standard they still recognise. If they do not see their reasoning as subject to certain normative standards, as rationally criticisable, then they are not engaging in deliberation at all; they are simply mind-wandering. Without that recognition that they are trying to follow norms, and that it is their responsibility to monitor their own success in doing so, we would not describe them as engaged in reasoning at all.
So are 'reasoning models' meeting this condition? There is a large philosophical literature, sparked by Wittgenstein's remarks on rule-following, which attempts to give conditions for genuine sensitivity to a normative rule, as opposed to merely producing the sequence of steps which the rule dictates. While disagreements remain about what those conditions are, there is almost universal consensus that there is a distinction to be made. All the reasoning models I have seen so far clearly fall on the wrong side of this distinction, and I will try to explain why.
The Web of Belief
Human deliberation involves choice, and as such it is more than just inference - whether deductive or Bayesian or of any other kind - of the sort that might be implemented in a machine. The choices we make in day-to-day deliberation are often hard to see because they are easy choices, in the sense that one option is massively preferable to the other, so we do not even notice that we have made a choice. But the choice was still there.
To see this consider a simple example of deliberative reasoning: on waking I realise it is Saturday and conclude (with relief) that I don't have to go to work. Call the proposition 'It is Saturday' P and the proposition 'I don't have to go to work' C. We could reconstruct my deliberation from P to C deductively as a modus ponens:
If P, then C; P; Therefore C
Or as Bayesian:
Pr(P)=0.14; Pr(C)=0.39; Pr(P|C)=0.37; So Pr(C|P)=1.²
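For readers who want to see where that final figure comes from, here is the Bayes' theorem calculation written out. The exact fractions - 52/365 for P, 142/365 for C, 52/142 for P given C - are my reconstruction of what the rounded figures above abbreviate (a year with 52 Saturdays and 142 non-working days), not something stated in the post.

```latex
\[
\Pr(C \mid P)
  = \frac{\Pr(P \mid C)\,\Pr(C)}{\Pr(P)}
  = \frac{\dfrac{52}{142} \times \dfrac{142}{365}}{\dfrac{52}{365}}
  = 1
\]
% With the rounded figures: (0.37 x 0.39) / 0.14 is roughly 1.03;
% the small overshoot past 1 is purely a rounding artefact.
```

Since a Saturday guarantees a day off, the conditional probability comes out as exactly 1.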
These are inferences, and computers are perfectly capable of implementing them. But they do not capture what happens in deliberation, for in deliberation there is a change in what the reasoner believes, an addition to or subtraction from the total set of beliefs. These inferences don't tell us how to do that.
Take the deductive one for simplicity. When faced with that inference, the deliberator might come to believe C. But that is not the only possible response for a reasoning human. They might equally rationally conclude that it is not Saturday and they do have to go to work: i.e. use the same inference to go from not-C to not-P. Or they might conclude that the conditional is false and they do sometimes have to work on Saturdays. Or, if they are so minded, they may even choose to deny the validity of the inference rule (modus ponens) in this instance.³ There is a similar set of options for the human deliberator using Bayesian inference.
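To make the space of options vivid, here is a toy sketch in Python - nothing anyone has implemented, and the names and encoding are mine - which simply enumerates every combination of attitudes towards P, the conditional and C that is classically consistent. Accepting C turns out to be only one of several.

```python
# Toy illustration: an inference rule constrains which combinations of
# beliefs are jointly consistent, but it does not dictate WHICH belief
# to revise.
#   P    = "It is Saturday"
#   COND = "If it is Saturday, I don't have to go to work"
#   C    = "I don't have to go to work"
from itertools import product

consistent_options = []
for p, cond, c in product([True, False], repeat=3):
    # The only combination modus ponens rules out: P and COND true, C false.
    if not (p and cond and not c):
        consistent_options.append({"P": p, "COND": cond, "C": c})

# Seven consistent packages remain. Believing C (the first one printed)
# is only one of them; rejecting P, rejecting the conditional, or some
# mixture are all logically open choices.
for option in consistent_options:
    print(option)
```

The point of the sketch is only this: the logic alone leaves seven live options, and nothing in the inference itself picks one out.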
Most of the time a human deliberator is reasoning from premises in which they themselves have a very high degree of confidence, using rules they are strongly committed to.⁴ So the choice is easy: believe the conclusion because the other options involve quite radical revisions of one’s beliefs and commitments. However, we are also familiar with the occasional case where believing the conclusion has too many epistemic costs - in the form of other things which follow from it which we would also have to believe - and so a different choice is made.
The American philosopher Quine used the metaphor of the web of belief to describe the situation we find ourselves in when deliberating. Each node in the web is a proposition we believe and the links are the logical and evidential relations between them. Some propositions we believe are very close to the centre of the web, e.g. that contradictions cannot be true, and others on the periphery, e.g. that it is sunny here now. The more central a belief is, the more damage to the strength and stability of the web will be done by rejecting it. Generally we try to maintain strong and stable webs of belief, but sometimes we are faced with some new information so striking that we accept the damage to the web caused by revising a central belief and look at ways of slowly rebuilding a new strong and stable web. One example Quine was interested in was quantum superposition: this seems to conflict with some very central beliefs, so we could just reject it and keep our pre-quantum stable web, but that would require us to revise many of our beliefs about science, which might in turn make it impossible to rebuild a new web, at least to do so in ways we find rational. There is a clear choice to be made between revising logic and revising scientific epistemology.
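Purely as an illustration of the metaphor - none of what follows is Quine's, and the names and numbers are invented - one could picture the web as a graph whose nodes carry a rough 'centrality' score, with the cost of a revision growing with the centrality of the beliefs it touches:

```python
# Invented illustration of the web-of-belief metaphor: beliefs as nodes,
# a crude centrality score standing in for "distance from the centre of
# the web", and revision cost as the total centrality of what is given up.
centrality = {
    "contradictions cannot be true": 0.99,            # near the centre
    "classical logic holds without exception": 0.95,
    "our scientific epistemology is broadly right": 0.85,
    "quantum superposition is real": 0.40,
    "it is sunny here now": 0.05,                      # periphery
}

def revision_cost(beliefs_to_give_up):
    """More central beliefs do more damage to the web when rejected."""
    return sum(centrality[b] for b in beliefs_to_give_up)

# Faced with the evidence for superposition, the deliberator chooses
# between packages of revisions - the inference alone does not decide.
package_a = ["classical logic holds without exception"]       # accept superposition, revise logic
package_b = ["quantum superposition is real",
             "our scientific epistemology is broadly right"]  # reject the new physics, and with it
                                                              # much of how we trust science
for label, package in [("revise logic", package_a),
                       ("reject the new physics", package_b)]:
    print(label, revision_cost(package))
```

The numbers are arbitrary; the only point is that weighing up packages of revisions is a choice made over the whole web, not a step dictated by any single inference.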
Back to Reasoning Models
So one aspect of genuine rule-following is choice: choosing whether to accept the 'output' of the rule in this case or whether to revise the premises or inference rules themselves. While we seem happy to talk about our AI models, including those based on LLMs, as 'making decisions', they only do that within a fixed framework of option types, a framework which excludes the possibility of making such choices. Since their function is to predict a statistically likely human response, and nearly all humans in nearly all cases of deliberation ignore the more revisionary options, this limitation is not obvious to their owners and users. Their 'reasoning' looks very similar to human deliberation. But that doesn't mean it is the same:
not having a choice at all is not the same as not exercising the capacity to choose differently.
1. I suspect nothing I say here will come as news to (some of) the people developing these models, but the owners have a marketing strategy which misleads users, so it is the users I am primarily addressing here.
2. The astute reader will be able to work out my employer's annual leave policy.
3. There is also a fourth option, which is of particular relevance for LLMs because they conduct their inferences in natural language: some of the words may be ambiguous and shift their meanings between different sentences used in the inference. We can see this most easily when we use names. Consider the inference: Mr Blair wrote 1984. Mr Blair was elected Prime Minister in 1997. So there is someone who wrote 1984 and was elected Prime Minister. Most computers run on formal and mathematical languages which are - by a magical act of stipulation - guaranteed to be unambiguous. In contrast, LLMs face the messy linguistic reality of natural languages when they try to reason.
4. This is an important mark of deliberation compared to persuasion and rationalization, where it is not the reasoner's but their audience's credences and commitments that determine the choice.