A note on explanation and understanding concepts

One of the oft-touted revolutionary uses for LLMs is so-called "personalised education". An example often given by its apostles of how this will work goes roughly like this:

Suppose the curriculum requires pupils to grasp concept C. A teacher explaining C to a class will use one or two analogies for a whole class of 30. Those analogies may be with things some pupils are interested in and already understand, and those pupils will thus be helped to grasp C. But there will be some pupils who do not relate to the analogies the teacher uses, and they will struggle to understand C. An LLM, however, can produce a tailored explanation for each pupil that uses analogies drawing upon their personal interests.

Now, I have been teaching philosophy for a few decades, and teaching philosophy is pretty much nothing but explaining concepts and subtle distinctions. And of course I do use analogies and try to pick ones that will draw upon something the students already understand. But choosing the right analogy is not the real trick in explaining a hard concept. Let me try to explain :D

The task of teaching presupposes that there is one group of people, the experts, educators and textbook authors, who understand the concept C pretty well. Call that group E. Then there is another group, the students or learners more generally, who don't yet grasp C, or who have a weak and/or faulty grasp of it. Call that group S.

Now, some members of S will pick up C very easily from a textbook explication, a simple analogy, and a few examples. Other members of S will not. Call them S'. So the crucial question for a teacher becomes:

What do we need to tell or show S' in order to help them grasp C?

What an LLM does - all it can do, in fact - is reformulate the explanation which worked for the other members of S in the hope that it will eventually work for the members of S'. This approach presupposes that what made that explanation effective was its presentation, not its content.

However, in my experience, the main reason that S' do not get the concept is that all of E and the other members of S are taking something to be too obvious to need stating, something which is not obvious to S'. (Maybe they haven't thought of it, or maybe they take something else to be obvious which makes that crucial unarticulated assumption highly unlikely.) The real trick to personalised learning is identifying this mismatch of what is obvious, making it explicit, and trying to achieve convergence.

For example, I have been using the schematic letters C, E, S, and S' throughout this post. In doing so, I have been assuming that my readers understand this convention. But maybe some have never come across it before. Many of those will pick it up immediately, but a few will be puzzled and confused, especially by C, which names a concept (something which doesn't happen outside certain special contexts), and by S', where I use the apostrophe to indicate that this is a sub-group of S. This one is easy to resolve because I don't in fact take the convention to be obvious and in general do not use such formulations in non-academic writing - I am only using it here to make this point. However, there are probably other things I say here and elsewhere that I have taken to be obvious but which are not obvious to many readers. Personalised learning kicks in when I am confronted with that failure to explain and have to work out where I went wrong.

I am very doubtful that the current direction of AI development could ever lead to this capability, for one simple reason: doing this involves identifying what is not said by the people who understand the concept, group E. This is not even the conversational pragmatics of working out what is implied but unspoken in an exchange. It is a form of rational reconstruction that starts by noticing that an inference is enthymematic and then considers, from the range of premises which would make it valid, which are most likely to be held by someone making that inference. It requires a combination of logical skills and the ability to imagine the intellectual context of the person putting forward the inference, the background beliefs they are unlikely to have ever been challenged on and thus will find too obvious to state.

Maybe someone will invent - or maybe they already have but have not yet launched - a method of training in ML/DL which allows the AI to 'learn' what is not being said but is presupposed in a dataset such as the explanations given by group E. Maybe. But that seems a leap forward at least as big as transformer models.1


  1. Why? Because if you ask an LLM what is missing from this explanation, it can only answer that by comparing it to other explanations. So it can easily identify a subset of explanations given by E which do not make explicit something made explicit in the rest of the set, but it cannot identify something missing from the whole set, because it has nothing to compare against.

