
Sunday, January 15, 2023

A note about story grammars in ChatGPT

A new working paper, just a note really. Title above, links, abstract, and opening paragraphs below:

Abstract: Think of an artificial neural net as a platform on which to implement higher-level structures, much as one might implement a word processor or a database in C++. I investigate how the neural net underlying ChatGPT implements a simple grammar: stories with five components, in order: 1) Donné, 2) Disturb, 3) Plan, 4) Enact, 5) Celebrate. I present the results of four experiments in which ChatGPT transforms one story into another based on a single change in the nature of the protagonist or antagonist.

* * * * *

The opening paragraphs:

I’ve been playing around with the idea that artificial neural nets are platforms on which high-level processes can be implemented, like writing a word processor in C++. I just came across a memo by Chris Olah, Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases. His first paragraph:

Mechanistic interpretability seeks to reverse engineer neural networks, similar to how one might reverse engineer a compiled binary computer program. After all, neural network parameters are in some sense a binary computer program which runs on one of the exotic virtual machines we call a neural network architecture.

That seems to be pretty much the same idea, though he develops it in a different way than I am about to.

Let’s say you’ve got a word processor and the spell checker isn’t working. You decide to investigate and try to fix it. Are you going to start by examining assembly code? Not very likely, though you may end up there. You’re going to start by looking at the spell checker routine in the source code.

I’m not interested in a spell checker. I’m interested in ChatGPT and I’ve been investigating how it writes stories, among other things. I don’t have access to the underlying neural net, but I doubt that would help me much. It’s too large and complicated and the structures that interest me are probably not explicit at that level, any more than a spell checker is explicit at the assembly-language level. It’s latent there, but not explicit.

So how do I get at the story grammar, if you will, that ChatGPT is using? I am going to have to examine the text. Narratologists in various disciplines (folklore studies, anthropology, literary criticism, even classical symbolic AI) have been doing this for decades. The procedure I’ve been using is derived from the analytical method Claude Lévi-Strauss employed in his magnum opus, Mythologiques. He started with one myth, analyzed it, and then introduced another one, very much like the first. But not quite: they are systematically different. He characterized the difference as a transformation. He worked his way through hundreds of myths in this manner, each one derived from another by a transformation.
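For concreteness, here is a toy sketch, in Python, of what it means to characterize the difference between two closely related stories as a transformation. The component labels and fillers below are invented for illustration; they are not taken from Lévi-Strauss or from my tables.

```python
# Toy sketch: two analyses of closely related stories, each a mapping from
# labeled components to fillers. Labels and fillers are invented.
myth_one = {
    "protagonist": "a hunter",
    "helper": "a jaguar",
    "outcome": "fire is obtained",
}

myth_two = {
    "protagonist": "a hunter",
    "helper": "a vulture",
    "outcome": "fire is obtained",
}

def transformation(before: dict, after: dict) -> dict:
    """Characterize the difference between two analyses as the set of
    components whose fillers changed."""
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}

print(transformation(myth_one, myth_two))
# {'helper': ('a jaguar', 'a vulture')}
```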

I have appended tables containing "before and after" analyses of four experiments: two in which the hero, Aurora, is replaced by a different character, and two in which the antagonist faced by William the Lazy is changed.

Think of those tables as indicating a simple grammar consisting of frames, slots, and fillers. A slot can be filled by another frame or by a language string. In these experiments the basic frame has five slots, which I have characterized as follows:

1. Donné: a term from literary criticism for what is given at the beginning of a story,
2. Disturb(ance),
3. Plan: a response to the disturbance,
4. Enact,
5. Celebrate.

Each of those slots in turn has a frame specifying what goes in the slot. The highlighted sections thus indicate slots where the fillers are different in the corresponding sections of two stories. I assume that most of the slots in these second level frames can be filled in various ways depending on circumstances, but it would take quite a bit of experimentation to tease that out.
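As a rough illustration of the frame/slot/filler idea, here is a minimal Python sketch. The top-level slot names follow the grammar above; the second-level slot names and fillers are hypothetical placeholders for what the appended tables actually spell out.

```python
from copy import deepcopy

# Top-level story frame: five slots, in order. Each slot is filled by a
# second-level frame, shown here as a dict whose slots are filled by
# language strings. Second-level names and fillers are placeholders.
story_before = {
    "Donné":     {"protagonist": "Aurora", "setting": "a peaceful kingdom"},
    "Disturb":   {"disturbance": "a threat arrives"},
    "Plan":      {"response": "the protagonist resolves to act"},
    "Enact":     {"outcome": "the plan is carried out"},
    "Celebrate": {"closing": "the community celebrates"},
}

# One of the experiments, schematically: change the filler of a single slot
# (here, the protagonist) and leave the rest of the structure alone.
story_after = deepcopy(story_before)
story_after["Donné"]["protagonist"] = "a different character"
```

Read this way, the before-and-after tables track which second-level fillers change under a given transformation while the overall frame stays fixed.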

If I am correct in my belief that those tables indicate a structure that is there in the language model driving ChatGPT, then students of mechanistic interpretability have two closely related tasks: 1) find those structures in the patterns of weights among the network parameters, and, correlatively, 2) determine what structures in the network can be used to create story grammars.

There is more at the links, namely those tables that indicate the before-and-after of those stories in more detail than I have done in my posts.

1 comment:

  1. Fascinating. I look forward to more adventures in this area.
