For some time now I have been puzzled by the (astonishing) success of statistical models, especially artificial neural networks, in language processing, machine translation in particular – see, e.g., this post, Borges redux: Computing Babel – Is that what’s going on with these abstract spaces of high dimensionality? [#DH], which dates back to October 2017. Sure, it is statistics, but what are those statistics “grabbing on to”? There is no meaning there, just (naked) word forms. What is it about those word-forms-in-context that yields an approximation to, a simulacrum of, meaning? Better: what is there that makes us so readily read meaning, intention, into the results of these statistical techniques?
Semanticity and intention
My puzzlement reached a climax with the unveiling of GPT-3 in 2020, and I decided to take another run at the problem. I produced a working paper, GPT-3: Waterloo or Rubicon? Here be Dragons, which I liked very much. I made real progress. I now think I can nudge things forward another step. Look at this passage, where I discuss the Chinese Room thought-experiment (p. 28):
Yet if you would believe John Searle, no matter how rich and detailed those old school mental models, understanding would necessarily elude them. I am referring, of course, to his (in)famous Chinese Room argument. When I first encountered it years ago my reaction was something like: interesting, but irrelevant. Why irrelevant? Because it said absolutely nothing about the techniques AI or cognitive science investigators used and so would provide no guidance toward improving that work. He did, however, have a point: If the machine has no contact with the world, how can it possibly be said to understand anything at all? All it does is grind away on syntax.
What Searle misses, though, is the way in which meaning is a function of relations among concepts, as I pointed out earlier (pp. 18 ff.). It seems to me, however – and here I’m just making this up – that we can think of meaning as having both an intentional aspect, the connection of signs to the world, and a relational aspect, the relations of signs among themselves. Searle’s argument concentrated on the former and said nothing about the latter.
What of the intentional aspect when a person is writing or talking about things not immediately present, which is, after all, quite common? In this case the intentional aspect of meaning is not supported by the immediate world. Language use must then be driven entirely by the relations signifiers have among themselves, a point of Sydney Lamb’s which we have already investigated (p. 18).
Those statistics are grabbing onto the relational aspect of meaning. The question is: How much of that can these methods recover from texts? Let’s set that aside for the moment.
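To make “grabbing onto the relational aspect” concrete, here is a minimal sketch in Python of the distributional idea at its crudest: build a vector for each word purely from counts of its neighbors, then compare vectors. The toy corpus, the window size, and everything else below are my own illustrative assumptions (GPT-3’s machinery is vastly more sophisticated), but like GPT-3 this sketch has nothing to work with except the positions of word forms relative to one another.

```python
# A toy distributional model: a word's "meaning" here is nothing but
# its pattern of neighboring word forms. Corpus and window size are
# arbitrary illustrative choices.
from collections import Counter, defaultdict
import math

corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the mouse ate the cheese",
    "the dog ate the bone",
]

window = 2  # co-occurrence window, an arbitrary choice
counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if i != j:
                counts[w][words[j]] += 1

vocab = sorted({w for s in corpus for w in s.split()})

def vector(word):
    # the word's co-occurrence profile over the whole vocabulary
    return [counts[word][v] for v in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# 'cat' lands nearer 'dog' than 'cheese' simply because cat and dog
# keep the same company in these strings.
print(cosine(vector("cat"), vector("dog")))
print(cosine(vector("cat"), vector("cheese")))
```

Even on four sentences, “cat” ends up closer to “dog” than to “cheese,” without the model having any contact whatsoever with cats, dogs, or the world. That is relationality, recovered from word forms alone.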
That passage mentions intention and relation. Intention resides in the relationship between a person and the world. Relation resides in the relationships that signifiers have among themselves; it is a property of the cognitive system. I am now thinking that it must be paired with adhesion. Taken together they constitute semanticity. Thus we have semanticity and intention, where semanticity is a general capacity inherent in the cognitive system, in a person’s mind, and intention inheres in the relation between a person and the world in a particular perceptual and/or cognitive activity.
What do I mean by adhesion? Adhesion is how words ‘cling’ to the world, while relationality is the differential interaction of words among themselves within the linguistic system. Words whose meaning is defined directly over the physical world, and also, to some extent, over the interpersonal world of signals and feelings, adhere to the world through sensorimotor schemas. Words whose meaning is abstract are more problematic. Their adhesion operates through patterns of words and other signs and symbols (e.g. mathematics, data visualizations, illustrative diagrams of various kinds, and so forth). Teasing out these systems of adhesion has just barely begun.
The psychologist J.J. Gibson talked of the affordances an environment presents to an organism: affordances are the features of the world that an organism can readily pick up in the course of its life in the world. Adhesions are the organism’s complement to environmental affordances; they are the perceptual devices through which the organism relates to those affordances.
What this means for language models
Large language models built with deep neural networks, such as GPT-3, conflate three interacting phenomena: 1) the word-level relational aspect of semanticity, as captured in the locations of word forms (signifiers) in a string, 2) the conventions of discourse structure, and 3) the world itself. The world is present in the model because the texts over which the model was constructed were created by people interacting in the world; they were in an intentional relationship with the world when they wrote those texts. The conventions of discourse are present simply because they organize the placement of word forms in a text, with special emphasis on the long-distance relationships of words. As for relationality, that is all that can possibly be present in a text. Adhesions belong to the realm of signifieds, of concepts and ideas, and they aren’t in the text itself.
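For the record, here is the core of the mechanism that lets such models track those long-distance relationships: scaled dot-product attention, the heart of the Transformer architecture on which GPT-3 is built, stripped to a single layer. The vectors below are random stand-ins of toy size; the real model learns them, stacks many such layers, and adds causal masking and multiple heads, all omitted here.

```python
# A single layer of scaled dot-product attention over random stand-in
# token vectors, to show that every position relates directly to every
# other position, whatever the distance between them in the string.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 6, 8                    # six token positions, 8-dim vectors (toy sizes)
X = rng.normal(size=(seq_len, d))    # stand-ins for token representations

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

scores = Q @ K.T / np.sqrt(d)        # every position scores every other
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
output = weights @ V                 # each position mixes in the others

# Position 5 draws directly on position 0, however far apart they
# stand in the string; distance per se costs nothing.
print(weights[5, 0])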
Would it somehow be possible to factor a language model into these three aspects? I have no idea. The point of doing so would be to reduce the overall size of the model.
Putting that aside, let us ask: Given a sufficiently large database of texts and tokens and a high enough number of parameters for our model, is it possible for a language model to extract all the relationality from the texts? How much of that multidimensional relational semanticity can be recovered from strings of word forms? Given a deep enough understanding of how relational semantics is reflected in the structure of texts, can we calculate what is possible with various text bases and model parameterization?
To answer those questions we need some account of semantic relationality that we can examine. The models of Old School symbolic AI and computational linguistics provide such accounts. Many such models have been created; which ones would we choose as the basis for our analysis? The sort of question that interests me: how many word forms have their meanings given in adhesions to the physical world (that is, to physical objects and events) or to the interpersonal world (facial expressions, gestures, etc.), and how many word forms are defined abstractly?
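One crude way to operationalize such a study, sketched below with made-up placeholders: take word pairs whose relatedness some symbolic account specifies, take the same pairs’ cosine similarities in the language model’s embedding space, and ask how well the two rankings agree. Everything in the sketch (the pairs, the scores, the random “embeddings”) is a stand-in for illustration only; a real study would substitute an actual symbolic inventory and the actual model under examination.

```python
# Rank agreement between a symbolic account of relatedness and a
# learned embedding space. All data here are illustrative placeholders.
import numpy as np
from scipy.stats import spearmanr

# Toy symbolic relatedness scores -- pure stand-ins, not real data.
pairs = [("cat", "dog"), ("cat", "cheese"), ("dog", "bone"), ("mouse", "cheese")]
symbolic = [0.8, 0.2, 0.6, 0.7]

# Stand-in embeddings; in a real study these would come from the
# language model being examined.
rng = np.random.default_rng(1)
emb = {w: rng.normal(size=16) for w in {w for p in pairs for w in p}}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

learned = [cosine(emb[a], emb[b]) for a, b in pairs]

# With random stand-ins the correlation hovers near zero; with a real
# model and a real symbolic inventory it would measure how much of the
# relational structure the statistics actually recovered.
rho, _ = spearmanr(symbolic, learned)
print(rho)
```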
So many questions.
* * * * *
I have appended this to my GPT-3 working paper, which is now Version 3.