How much of knowledge is just language?

We know from cognitive and psycholinguistic experiments that the way language encodes things can change what we remember and how we remember it.

One, mentioned in Guy Deutscher's book "Through the Language Glass", is (should my memory serve me right) an experiment done at the end of the 19th century with a tribe that lacked a separate word for "blue", instead using a single term to cover both blue and green (a phenomenon often referred to as "grue"). When shown a blue cup, the participants could point to its color on a chart, indicating that they clearly saw it as blue. But when the researchers returned a year later and asked them to point on the same chart to the color of the cup they had seen the year before, they pointed at green: they had seen it as blue, but mentally recorded it as "green", and so they recalled it as green, not as blue.

In another experiment, this one in the 21st century, Lera Boroditsky (then at Stanford) showed that Spanish speakers, whose language typically describes accidents with agentless constructions, were worse at remembering the perpetrators of accidents than English speakers, whose language names the agent directly in such scenarios.
(Compare: "Se rompió el florero." ("The vase broke itself.") versus "He broke the vase." ("Él rompió el florero.").)

Time and again, studies show effects of language on how and what we remember. Much of our ability to remember, then, lies in our ability to encode things in language and then recall them in language.

Nowadays, we have LLMs like ChatGPT -- the proverbial operator from Searle's famous Chinese Room thought experiment. They know nothing; they are naught but mathematical algorithms that transform text. Yet they appear to "know" things. Their "knowledge", then, is perhaps just whatever is part of the language itself; that is to say: simply by virtue of "knowing" the language via a connectionist model, they automatically "know" most things communicated in that language. This is fascinating, and it brings me to question just how much of our own knowledge is like that: knowledge that is part of language, rather than true knowledge independent of the language faculty.
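
To make that concrete, here is a deliberately toy sketch in Python (my own illustration, nothing like how ChatGPT or any real LLM actually works): a "model" whose only "memory" is word-to-word statistics, yet which appears to "know" a fact simply because the fact lives in the text it was fed.

```python
# Toy sketch: "knowledge" as nothing but next-word counts scraped from text.
from collections import Counter, defaultdict

corpus = (
    "the capital of france is paris . "
    "the capital of france is paris . "
    "everyone knows the capital of france is paris . "
).split()

# Count bigram transitions: a crude estimate of P(next word | current word).
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def complete(word: str) -> str:
    """Return the most probable next word: 'knowledge' as mere word statistics."""
    counts = transitions[word]
    return counts.most_common(1)[0][0] if counts else "?"

print(complete("is"))  # -> "paris": the model never learned any geography,
                       # only which word tends to follow "is" in this corpus.
```

The toy model answers a factual-sounding question correctly without containing anything we would normally call a fact: the "fact" is just a regularity in the language data.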

Obviously not all of our knowledge is linguistic, and that's true even if we don't count instincts as knowledge. I know what plants look like, what a cat's "meow" sounds like, what sugar tastes like, and so on. And putting qualia aside, I knew long before I had words for it that when I drop something, it falls; and I recognize most of the faces I've seen in my life, even if I can't visualize them at will. I also know the layout of my house, and can traverse it mentally without seeing it.

We also know that many non-human animals that lack language have memories and knowledge: dogs know which human feeds them, for example. From that, we surely must conclude that much of our own knowledge can and does exist independently of language; yet much of it still depends on language, if for no other reason than that it is far easier and cheaper to remember things in words than in mental images. Why invest the energy to build and store an eidetic image of a blue cup when you can just remember that it was "grue"?

As always, the answer is somewhere in the middle, and contingent on context. Some things can be stored in multiple ways, as when a cup is remembered in words or as an image. An LLM, though, has only the word option -- the linguistic option -- for its "memory", and even that isn't so much knowledge as trial-and-errored probabilities masquerading as knowledge. Then again, to what extent are our own brains just probabilities masquerading as knowledge?

I honestly don't know. But either way, we're still distinct from "AI"s in our multimodality, consciousness, senses, and embodiment. Should a series of connectionist models ever be built and jerry-rigged together, made to run autonomously, and given physical senses (such as through cameras)... how far off from a basic AGI would it really be? Such a being would still be quite different from a biological organism, but at what point would it be considered functionally comparable? And could it reasonably be called "conscious" in any non-functional sense, or would it be just a philosophical zombie?

I sure as heck don't know the answer (although I expect the zombie one, unless the devs do something really clever). But I do think it is interesting just how much of our knowledge seems to be stored in language, and just how much of it can be so adeptly mimicked by what is, in some sense, a fancy autocomplete algorithm. What I wonder is how much of our own knowledge is knowledge stored in language, versus "knowledge" that is part of a language. A question for the ages, perhaps, but a good one to think about as you go through your day.

