I've finished a new working paper. Title above, links, abstract, table of contents, and introduction below.
Academia.edu: https://www.academia.edu/107318793/Discursive_Competence_in_ChatGPT_Part_2_Memory_for_Texts
SSRN: https://ssrn.com/abstract=4585825
ResearchGate: https://www.researchgate.net/publication/374229644_Discursive_Competence_in_ChatGPT_Part_2_Memory_for_Texts_2_Memory_for_Texts
Abstract: In a few cases ChatGPT responds to a prompt (e.g. “To be or not to be”) by returning a specific text word-for-word. More often (e.g. “Johnstown flood, 1889”) it returns with information, but the specific wording will vary from one occasion to the next. In some cases (e.g. “Miriam Yevick”) it doesn’t return anything, though the topic was (most likely) in the training corpus. When the prompt is the beginning of a line or a sentence in a famous text, ChatGPT always identifies the text. When the prompt is a phrase that is syntactically coherent, ChatGPT generally identifies the text, but may not properly locate the phrase within the text. When the prompt cuts across syntactic boundaries, ChatGPT almost never identifies the text. But when told it is from a “well-known speech” it is able to do so. ChatGPT’s response to these prompts is similar to associative memory in humans, possibly on a holographic model.
Contents
Introduction: What is memory? 2
What must be the case that ChatGPT would have memorized “To be or not to be”? – Three kinds of conceptual objects for LLMs 4
To be or not: Snippets from a soliloquy 16
Entry points into the memory stream: Lincoln’s Gettysburg Address 26
Notes on ChatGPT’s “memory” for strings and for events 36
Appendix: Table of prompts for soliloquy and Gettysburg Address 43
Introduction: What is memory?
In various discussions about large language models (LLMs), such as the one powering ChatGPT, I have seen assertions such as, “oh, it’s just memorized that.” What does that mean, “to memorize?”
I am a fairly talented and skilled musician. I can and have memorized a piece of music by practicing it over and over. There are the notes on the page. I start playing them until I am comfortable. Then I look away and see how far I can go. When I get lost, I look at the music, continue playing the notes on the page, and finish the piece – something like that. Then I start over from the beginning, again without the music. When I can play the whole piece without having to consult the written music, I have it memorized. At least for the moment.
But I don’t do that very often. More likely, I’ll hear a tune I like two, three, or five times and then I pick up my trumpet and play it, sometimes perfectly, sometimes with a glitch or two. I didn’t memorize it, and yet I’m playing it. From memory? No, by ear?
I’m speaking metaphorically of course. What does it mean to play by ear? I don’t really know, but I imagine it goes something like this: Music has an inner logic, a grammar, a set of rules through which it is structured. When I hear a tune I’m listening to it in terms of that inner logic, as, for that matter, anyone is – at least if they’re familiar with the musical idiom. It’s that logic that I’m registering as I listen to the tune. Once I’ve heard the tune a couple of times, I’ve “absorbed” that logic, without even thinking about it or working on it. It just happens as a side-effect of (ordinary) listening. When the absorption is complete, I am able to play the tune “by ear.”
Those are two very different processes, absorbing a tune through listening vs. repeating it over and over until you have it “memorized.” Which, if either, of those two is an LLM doing when it is chewing its way through a corpus of texts? When I prompt ChatGPT with “To be or not to be,” it responds with Hamlet’s complete soliloquy, word-for-word. When it does that, is the process more like what I do when playing music by ear or like what I do when memorizing music? Or is it something else?
That’s the kind of issue I had in mind when I undertook the investigations I report in this working paper. In the first piece – What must be the case that ChatGPT would have memorized “To be or not to be”? – I start out with Hamlet’s famous soliloquy, initially prompting ChatGPT with the first line, but then prompting it with other fragments from the soliloquy. Then I prompt it with the phrase, “Johnstown flood, 1889,” and it responds with information about that flood, but not a specific text word-for-word. Many prompts are like that, many more than elicit a specific text word-for-word. What leads to that difference? I conclude with two topics I have reason to believe were included in the training corpus, but which ChatGPT seems to know nothing about. Why not?
In the next section (To be or not: Snippets from a soliloquy) I create various prompts for the soliloquy. I do the same in the third section (Entry points into the memory stream: Lincoln’s Gettysburg Address), but more systematically. Finally, I do a bit of speculating about what’s going on: Notes on ChatGPT’s “memory” for strings and for events. I begin by quoting a passage from F. C. Bartlett’s classic 1932 study, Remembering, and conclude that ChatGPT may have an associative memory along the lines suggested by holography, which engendered a great deal of speculation in the 1970s and, in this millennium, specifically for word meaning and order.
If you are looking at a sheet of music without an instrument present, do you hear it and start 'playing it'?
It's what I do with the soliloquy without thought. Immediately resolving 'glitches', i.e. over-hitting and over-sibilance.
'To be or not to be, that iz the question.... tiz... slingz... arrowz... armz... troubles.
Listening to how it has to sound straight away, and working out how it shapes exactly on the tongue.
Procedural memory, that allows me to hear and work out what I am doing by ear I think.
Giving a first stage sense of pace and timing.
To your question: At best, only approximately.
That is, though I can read music, I can't read it as fluently as I can written language. And, while I can read fluently, including aloud, I'm not at all trained in so doing. Musicians who read music for a living can probably read notated music as fluently as they read language. In the universe of musicians, though, that's probably rare. Exceedingly rare at the top levels. There are, however, a fair percentage of musicians who read more fluently than I do, though perhaps not quite to the point of hearing the music in the mind's ear.
I just did a brief search on the subject and came across the terms 'eye-hand span' (music) or, with speech, 'eye-voice span.'
Along with 'subvocalization.'
Rather interesting, in relation to memory.
Wonder what the history is. Early monastic communities did not read or write silently; the scriptorium was filled with sound, commonly described as a hive buzzing with bees. I assume reading silently is a later stage development after the introduction of literacy.
P.S. Shakespeare: without the rhythm (in particular) and inflection there is no sense. Strongly connected to meaning and how you find it.
I'd think "eye-hand" span is most relevant for keyboard instruments where both the music and the hands are directly in front. The issue is how far do your eyes have to travel from the music to your hands. I'd guess that good keyboardists, whatever "good" means (but it doesn't mean international virtuoso), don't have to look at their hands in order to execute. It might apply to guitar and the like as well. But it's not very meaningful for wind instruments nor, I would think, string instruments.
FWIW, Glenn Gould is notorious for NOT subvocalising when he plays. His vocalizing is annoyingly audible.
As for silent reading, yes, that comes late. There's a passage in Augustine's Confessions where he remarks on being surprised at seeing his teacher, Ambrose, read without moving his lips.