Tuesday, June 1, 2021

Some thoughts on why systems like GPT-3 will always have trouble with common sense knowledge

I develop an analogical argument about why natural language systems trained only on text will never be able to deal with common-sense reasoning. I begin by presenting Herbert Simon’s famous parable of the ant and follow it with some information about sensory deprivation. From those I conclude that our mental apparatus depends on access to the world to achieve stability. I then tap-dance my way to the assertion that common sense reasoning depends on those sensory-motor systems which are, in turn, dependent on the world.

AI language engines are enmeshed in language, with no access to the physical world. Consequently, common sense reasoning will forever elude them. Common sense grounds us in the physical world.

Simon’s ant

In Chapter 3, “The Psychology of Thinking: Embedding Artifice in Nature,” of The Sciences of the Artificial (2nd Ed., 1981), Herbert Simon gives us a parable, a story to think with. Simon asks us to imagine an ant moving about on a beach:

We watch an ant make his laborious way across a wind- and wave-molded beach. He moves ahead, angles to the right to ease his climb up a steep dunelet, detours around a pebble, stops for a moment to exchange information with a compatriot. Thus he makes his weaving, halting way back to his home. So as not to anthropomorphize about his purposes, I sketch the path on a piece of paper. It is a sequence of irregular, angular segments – not quite a random walk, for it has an underlying sense of direction, of aiming toward a goal.

After introducing a friend, to whom he shows the sketch and to whom he addresses a series of unanswered questions about the sketched path, Simon goes on to observe:

Viewed as a geometric figure, the ant’s path is irregular, complex, hard to describe. But its complexity is really a complexity in the surface of the beach, not a complexity in the ant. On that same beach another small creature with a home at the same place as the ant might well follow a very similar path.

That is, because the beach has a complex surface, the ant is able to walk a complex path on that surface using rather simple mechanisms. In posing this parable Simon is, of course, asking us to think of the beach as the world in full, and the ant is us. Relative to the world’s complexity, our conceptual apparatus is simple.

I would like to propose that the nervous system requires environmental support if it is to maintain its physical stability and coherence. Note that Simon was not at all interested in the physical requirements of the nervous system. Rather, he was interested in suggesting that we can get complex behavior from relatively simple devices, and simplicity translates into design requirements for a nervous system. That’s fine, but I’m suggesting that the nervous system actively seeks out the world and so is dependent upon finding it, in a more or less orderly fashion.

Our sensory systems don’t ‘represent’ (if that’s the right word, many reject it) the world in great detail. Their apprehension of the world is rough and ready. One doesn't need to represent apples and oranges in full detail in order to distinguish them, nor cats and dogs, cars and bicycles, and so forth. Our systems need only ‘grab on’ to the things and events in the world. The world itself will ‘fill out’ our perceptions in real time.

Now, consider this variation on Simon’s story. What would happen if we put the ant on an absolutely featureless surface and let it walk about? What kind of paths would it trace then? As that surface lacks any of the normal cues in the ant’s environment I would imagine the ant would either not move at all or move in a genuinely random or perhaps a rigidly stereotypic way (e.g. around and around in a circle). Or perhaps the ant would hallucinate.

Sensory deprivation

That is what seems to happen to humans when we are deprived of sensory input. Early on in The Ghost Dance, a classic anthropological study of the origins of religion, Weston La Barre considers what happens under various conditions of deprivation. Consider this passage about Captain Joshua Slocum, who sailed around the world alone at the turn of the 20th Century:

Once in a South Atlantic gale, he double-reefed his mainsail and left a whole jib instead of laying-to, then set the vessel on course and went below, because of a severe illness. Looking out, he suddenly saw a tall bearded man, he thought at first a pirate, take over the wheel. This man gently refused Slocum’s request to take down the sails and instead reassured the sick man he would pilot the boat safely through the storm. Next day Slocum found his boat ninety-three miles further along on a true course. That night the same red-capped and bearded man, who said he was the pilot of Columbus’ Pinta, came again in a dream and told Slocum he would reappear whenever needed.

La Barre goes on to cite similar experiences happening to other explorers and to people living in isolation, whether by choice, as in the case of religious meditation, or by force, as in the case of prisoners being brainwashed.

In the early 1950s Woodburn Heron, a psychologist in the laboratory of Donald Hebb, conducted some of the earliest research on the effects of sensorimotor deprivation [2]. The subjects were placed on a bed in a small cubicle. They wore translucent goggles that transmitted light, but no visual patterns. Sound was masked by the pillow on which they rested their heads and by the continuous hum of air-conditioning equipment. Their arms and hands were covered with cardboard cuffs and long cotton gloves to blunt tactile perception. They stayed in the cubicle as long as they could, 24 hours a day, with brief breaks for eating and going to the bathroom.

The results were simple and dramatic. Mental functioning, as measured by simple tests administered after 12, 24, and 48 hours of isolation, deteriorated. Subjects lost their ability to concentrate and to think coherently. Most dramatically, subjects began hallucinating. The hallucinations would begin with simple forms and designs and evolve into whole scenes. One subject saw dogs, another saw eyeglasses, and they had little control over what they saw; no matter how hard they tried, they couldn’t change what they were seeing. A few subjects had auditory and tactile hallucinations. Upon emerging from isolation the visual world appeared distorted, with some subjects reporting that the room appeared to be moving. Heron concluded, as have other investigators, that the waking brain requires a constant flux of sensory input in order to function properly.

Of course, one might object to this conclusion by pointing out that, in particular, these people were deprived interaction with other people and that is what causes the instability, not mere sensory deprivation. But, from our point of view, that is no objection at all. For other people are a major part of the environment in which human beings live. The rhythms of our intentional structures are stable only if they are supported by the rhythms of the external world. Similarly, one might object that, while these people were cut off from the external physical world, their brains, of course, were still operating in the interior milieu. Consequently the instabilities they experienced reflect “pressure” from the interior milieu that is not balanced by activity in the external world. This may well be true, I suspect that it is, but it is no objection to the idea that the waking brain requires constant input from the external world in order to remain stable. Rather, this is simply another aspect of that requirement.

Thus I suggest that detaching one’s attention from the immediate world to “think” may cause problems. And yet the capacity for such thought is one aspect of the mental agility that distinguishes us from our more primitive ancestors. How do we keep the nervous system stable enough to think coherently? The answer to that question depends, of course, on just what is causing the instability. Part of the answer may well be that we periodically “tune” our cortical circuits through music and dance. As long as the brain gets such tuning on a regular basis it can maintain its stability well enough during episodes of extended thinking, whether the merest daydreaming or concentrated intellectual activity of one sort or another. But without regular tuning, the brain begins to lose its stability.

What does that have to do with GPT-3?

GPT-3, like many other contemporary systems, is trained on large bodies of text. That text, of course, consists of words. Well, they are words to us, who can say them, spell them, offer definitions, and use them correctly in utterances and writing. To the computer they are merely word forms, symbols that are not attached to meanings in the way that word forms are attached to meanings in the human mind. The object of these systems is to approximate word meanings by calculating over the distribution of words in texts. The underlying assumption, which Warren Weaver articulated in his famous memo on machine translation back in 1949 [1], is that words that appear together share some aspect of meaning. Thus, if we can ‘examine’ words each in a sufficiently large number of contexts, we should be able to approximate their meanings.
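Weaver’s distributional idea can be sketched in a few lines of Python. This is only a toy illustration, not how GPT-3 itself works: the corpus, the window size, and the choice of words are all invented for demonstration, and the count vectors stand in for the far richer representations a real system learns.

```python
# Toy distributional semantics: approximate a word's "meaning" by
# counting which words co-occur with it inside a small window.
from collections import Counter, defaultdict
from math import sqrt

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
    "stocks fell on the market news",
    "the market rallied as stocks rose",
]

window = 2
vectors = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                vectors[w][words[j]] += 1

def cosine(u, v):
    """Similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in set(u) | set(v))
    nu = sqrt(sum(c * c for c in u.values()))
    nv = sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Words used in similar contexts end up with similar vectors:
# 'cat' sits much closer to 'dog' than to 'stocks'.
print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["stocks"]))
```

Note that nothing in those vectors touches an actual cat or dog; the procedure traffics entirely in word forms and their company.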

And indeed, GPT-3’s ability to generate coherent text of some length suggests it has managed a remarkable approximation. But it exhibits failings in commonsense reasoning [3], failings that the symbolic systems of four decades ago exhibited as well. I believe that much commonsense reasoning takes place ‘close to the ground’, as it were [4]. Because GPT-3 has access only to word forms, not to the sensory-motor schemas that directly support, and give meaning to, many of them, it lacks the basis on which common sense reasoning functions. It is, in effect, lost.

It is in the situation Slocum found himself in when isolated at sea. Lacking people to talk to, his mind began unraveling. And so it happens with people in sensory deprivation. Without sensory input to stabilize their perceptual systems, they begin hallucinating.

Let’s return to Simon’s ant. There is the world, and there is the ant with its mental apparatus. The human situation is more complex. There is the world, and there is our sensorimotor apparatus. And for the first two years of life, that’s pretty much it. But then language begins to develop, and language depends on both those sensorimotor systems and on interaction with others. GPT-3 lacks both sensorimotor apparatus and conversation with others. That is, during training GPT-3 has no contact with human interlocutors. Once trained, GPT-3 is given prompts to which it responds. But it is my understanding that it does not learn from those interactions.

How can GPT-3, and similar engines, possibly make sense of language that functions ‘close to the world’? I suppose one can hope that by considering a very, very large number of texts an AI engine can somehow ‘fill in’ the information it is missing because it lacks direct access to the world. That has not worked so far. What reason do we have to think that some day it will?

Consider Simon’s ant once again. By examining the paths it traces we can approximate the beach’s micro-geography. But how many paths must we examine and superimpose in order to fix the location of every pebble and dunelet – forget about grains of sand?
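The burden of that question can be felt in a toy simulation. Everything here is an invented stand-in: a square grid for the beach, a bounded random walk for the ant, and cell coverage for “fixing the micro-geography.” The point is only how slowly superimposed paths come to touch the whole terrain.

```python
# Sketch: how much of a gridded "beach" do superimposed ant paths touch?
import random

random.seed(42)
SIZE = 100   # the beach is a SIZE x SIZE grid of cells
STEPS = 200  # length of each ant path

def walk():
    """One bounded random walk from a random start; returns visited cells."""
    x, y = random.randrange(SIZE), random.randrange(SIZE)
    visited = {(x, y)}
    for _ in range(STEPS):
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x = max(0, min(SIZE - 1, x + dx))
        y = max(0, min(SIZE - 1, y + dy))
        visited.add((x, y))
    return visited

def coverage(n_paths):
    """Fraction of the grid touched by superimposing n_paths walks."""
    seen = set()
    for _ in range(n_paths):
        seen |= walk()
    return len(seen) / (SIZE * SIZE)

# Coverage grows sublinearly: paths overlap, and gaps persist.
for n in (1, 10, 100):
    print(n, round(coverage(n), 3))
```

Even with many paths, cells go untouched, and each cell here is a coarse patch, not a pebble, let alone a grain of sand.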

Until AI language systems have rich and flexible access to the world, and the capacity to develop a rich analog or quasi-analog account of it, common sense reasoning will remain elusive.


[1] Warren Weaver, “Translation”, Carlsbad, NM, July 15, 1949, 12 pp. Online: http://www.mt-archive.info/Weaver-1949.pdf.

[2] Heron, W. (1957). The pathology of boredom. Scientific American, 196, 52–56. https://doi.org/10.1038/scientificamerican0157-52.

[3] This has been much discussed in the literature. I have offered a modest example in a post where I had GPT-3 explain the punch line to a Jerry Seinfeld joke: Analyze This! Screaming on the flat part of the roller coaster ride [Does GPT-3 get the joke?], May 7, 2021, https://new-savanna.blogspot.com/2021/05/analyze-this-screaming-on-flat-part-of.html.

[4] I have argued this in a working paper, GPT-3: Waterloo or Rubicon? Here be Dragons, Version 2, August 20, 2020, 34 pp., https://www.academia.edu/43787279/GPT_3_Waterloo_or_Rubicon_Here_be_Dragons_Version_2. See pp. 21-25. See also my post, Computation, Mind, and the World [bounding AI], September 28, 2019, https://new-savanna.blogspot.com/2019/12/computation-mind-and-world-bounding-ai.html.
