Friday, July 31, 2020

3. Interlude: GPT-3 as a model of the mind

Here are the key paragraphs from the previous post:
Let us notice, first of all, that language exists as strings of signifiers in the external world. In the case that interests us, those are strings of written characters that have been encoded into computer-readable form. Let us assume that the signifieds – which bear a major portion of meaning, no? – exist in some high dimensional network in mental space. This is, of course, an abstract space rather than the physical space of neurons, which is necessarily three dimensional. However many dimensions this mental space has, each signified exists at some point in that space and, as such, we can specify that point by a vector containing its value along each dimension.

What happens when one writes? Well, one produces a string of signifiers. The distance between signifiers on this string, and their ordering relative to one another, are a function of the relative distances and orientations of their associated signifieds in mental space. That’s where to look for Neubig’s isometric transform into meaning space. What GPT-3, and other NLP engines, does is to examine the distances and ordering of signifiers in the string and compute over them so as to reverse engineer the distances and orientations of the associated signifieds in high-dimensional mental space.
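To make that picture a bit more concrete before going on, here is a minimal sketch of my own – not anything GPT-3 literally computes – in which each signified is just a point in a toy three-dimensional space and closeness is measured by cosine similarity. The vectors are made up for illustration; real models learn hundreds or thousands of dimensions from the distributional statistics of signifiers in text.

import numpy as np

# Toy "mental space": each signified is a point in the space, specified by
# a vector of its values along each dimension. These three-dimensional
# vectors are invented for illustration only.
signifieds = {
    "king":    np.array([0.9, 0.1, 0.8]),
    "queen":   np.array([0.9, 0.9, 0.8]),
    "cabbage": np.array([0.1, 0.5, 0.2]),
}

def cosine(u, v):
    # Cosine similarity: a standard proxy for closeness in such a space.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(signifieds["king"], signifieds["queen"]))    # relatively close
print(cosine(signifieds["king"], signifieds["cabbage"]))  # relatively distant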
The purpose of this post is simply to underline the seriousness of my assertion that we should treat the mind as a high-dimensional space and that, therefore, we should treat the high-dimensional parameter space of GPT-3 as a model of the mind. If you aren't comfortable with the idea, well, it takes a bit of time for it to settle down (I've been here before). This post is a way of occupying some of that time.

If it’s not a model of the mind, then what IS it a model of? “The language”, you say? Where does the language come from, where does it reside? That’s right, the mind.

It is certainly not a complete model of the mind. The mind, for example, is quite fluid and is capable of autonomous action. GPT-3 seems static and is only reactive. It cannot initiate action. Nonetheless, it is still a rich model.

I built plastic models as a kid, models of rockets, of people, and of sailing ships. None of those models completely captured the things they modeled. I was quite clear on that. I have a cousin who builds museum-class ship models from wood of various kinds, metal, cloth, paper, thread and twine (and perhaps some plastic here and there). They are much more accurate and aesthetically pleasing than the models I assembled from plastic kits as a kid. But they are still only models.

So it is with GPT-3. It is a model of the mind. We need to get used to thinking of it in those terms, dangerous as they may be. But, really, can the field get more narcissistic and hubristic than it already is?

* * * * *

This is not the first time I’ve been through this drill. I’ve been thinking about this, that, and the other in the so-called digital humanities since 2014; call it computational criticism. Those investigations have been using various kinds of distributional semantics – topic modeling, vector space semantics – to examine literary texts and populations of texts. The people doing that work don’t think of their language models as models of the mind; they’re just, well, you know, language models, models of texts. There’s some kind of membrane, some kind of barrier, that keeps us – them, me, you – from thinking of these statistical models as models of the mind. They’re not the real thing, they’re stop-gaps, approximations. Yes. And they are also models, as much models of the mind as a plastic battleship is a model of the real thing.

Why am I saying this? As I said, to underline the seriousness of my assertion that we should treat the mind as a high-dimensional space. In a common formulation, the mind is what the brain does. The brain is a three-dimensional physical object.

It consists of roughly 86 billion neurons, each of which has roughly 10,000 connections with other neurons. The action at each of those synaptic junctions is mediated by upward of 100 neurochemicals. The number of states a system can take depends on 1) the number of elements it has, 2) the number of states each element can take, and 3) the dependencies among those elements. How many states can that system assume? We don't really know. Jillions, maybe zillions, maybe jillions of zillions. A lot.
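If you want a crude sense of scale, here is a back-of-the-envelope sketch. It treats each synapse as a simple on/off switch and ignores the neurochemicals and the dependencies entirely, so it is a toy lower bound, not an estimate:

import math

NEURONS = 86e9              # roughly 86 billion neurons
SYNAPSES_PER_NEURON = 1e4   # roughly 10,000 connections each
synapses = NEURONS * SYNAPSES_PER_NEURON   # about 8.6e14 synapses

# Crude lower bound: treat each synapse as binary. The number of possible
# configurations is then 2**synapses; report its order of magnitude.
log10_states = synapses * math.log10(2)
print(f"on the order of 10^{log10_states:.2e} states")   # ~10^(2.6e14)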

That is a state space of very high dimensionality. That state space is the mind. GPT-3 is a model of that.

* * * * *

I’ve written quite a bit about computational criticism, though nothing for formal academic publication. Here’s one paper to look at:

William Benzon, Virtual Reading: The Prospero Project Redux, Working Paper, Version 2, October 2018, 37 pp., https://www.academia.edu/34551243/Virtual_Reading_The_Prospero_Project_Redux.
Abstract: Virtual reading is proposed as a computational strategy for investigating the structure of literary texts. A computer ‘reads’ a text by moving a window N-words wide through the text from beginning to end and follows the trajectory that window traces through a high-dimensional semantic space computed for the language used in the text. That space is created by using contemporary corpus-based machine learning techniques. Virtual reading is compared and contrasted with a 40-year-old proposal grounded in the symbolic computation systems of the mid-1970s. High-dimensional mathematical spaces are contrasted with the standard spatial imagery employed in literary criticism (inside and outside the text, etc.). The “manual” descriptive skills of experienced literary critics, however, are essential to virtual reading, both for purposes of calibration and adjustment of the model, and for motivating low-dimensional projection of results. Examples considered: Augustine’s Confessions, Heart of Darkness, Much Ado About Nothing, Othello, The Winter’s Tale.
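For readers who want the gist of the windowing procedure in code, here is a minimal sketch. It assumes some pretrained word-embedding lookup and places each window at the centroid of its word vectors; the paper's own pipeline differs in its details.

import numpy as np

def virtual_reading(tokens, embed, window=200, step=50):
    # Slide an N-word window through the text and record where each window
    # sits in the semantic space, here as the centroid of the vectors of
    # the words it contains. `embed` is assumed to be any pretrained
    # word-embedding lookup (token -> vector).
    trajectory = []
    for start in range(0, max(1, len(tokens) - window + 1), step):
        vecs = [embed(t) for t in tokens[start:start + window]]
        trajectory.append(np.mean(vecs, axis=0))
    return np.vstack(trajectory)   # one row per window position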
* * * * *

Posts in this series are gathered under this link: Rubicon-Waterloo.
