Friday, April 26, 2024

How is an LLM like a mechanical clock?

The rest of the tweet stream:

A clock movement has no concept of "time." If you take the gears and levers apart, nowhere in there will you find any notion of what "time" is, nor, especially, the *correct* time of day. It just turns shafts, and even that function is not obvious to the non-expert.

If you attach some hands to the shafts, and position a "face" with numerals on it behind, and so forth, you can cause the combined machine to "tell the time" in the sense of making an assertion about what the time of day is.

The position of the hands over the face has a *meaning* for a suitably trained reader, in the same way that a sentence has *meaning*. If you speak the language, a sentence, as a 'signifier,' points to a 'signified,' the meaning of the sentence.

A book does not "know" anything, it does not "mean" anything by itself. If you, a language reader, read it, the meaning arises in that interaction.

In the same way, "it is 2:17" is a meaning which arises when you look at a clock face.

If the clock has not been wound, or if the clock has been set to the wrong time, the statement "it is 2:17" will still have meaning, but that meaning will likely be false. It is 10:17, not 2:17.

The clock is not *lying* in any reasonable sense. It itself does not even carry the meaning of the signs it displays. The meaning arises in the interaction of you with the hands and face attached to the clock.

The gears and levers don't even have "2:17" encoded in them.

In the same way, an LLM has no meaning encoded in it, although it produces signs which in their interaction with you often have a great deal of meaning.

LLMs do not "lie" or "hallucinate" any more than a clock does. It's simply a very, very complicated set of gears and levers which, if you wind it up, and if you set it to the correct time, can produce true statements like "it is 10:20" some of the time.

The signs on the screen are incapable of being true or false, hallucination or reality. They're just arrangements of letters.

Only meanings are true or false, and there are no meanings in the LLM. 

It is true that you will find echoes of meaning inside the LLM. When you train an LLM to play Othello, structures arise inside it which resemble an Othello board in interesting ways.

This is like the position of the shaft of a clock movement.

You can see artifacts in the clock's machinery which, if you know what's going on, you can map to the output statement about what time it is. At the same time, it's foolish to argue that the clock "knows" what time it is, and then to argue that "it's lying."

The clock just turns shafts. This is not changed by the fact that you can imagine the hands and face without actually putting them on, and thereby imaginatively replicate the ability of the clock to make temporal assertions.

Now, an electronic clock can extract the actual correct time from a number of over-the-air oracles (GPS, etc.), and in fact if you take one of these apart you *will* find a place in there which contains the actual time. Allegedly the "correct" time.

A sequence of ones and zeros in a memory location encodes the "correct" time in some sense, or at least is supposed to.

There is a meaning inside the thing, kinda! It can be wrong, now, and it can arguably lie! If the clock is damaged, say.

When you look at the display you see that it reads "2:17" but the correct time is now 10:27. The clock is making some mistake between the oracle it consults and the display it shows you. It is "lying" or "hallucinating" in some meaningful way.

There is a model of the world attached to the clock, which the clock consults. That model can be interpreted to "mean" that "the time is 10:27" and the display on the clock can be interpreted to "mean" that "the time is 2:17" and that mismatch is a falsehood.
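To make that mismatch concrete, here is a minimal sketch in Python. All the names are hypothetical, invented for illustration, not taken from any real clock: the stored number plays the role of the "world model," and a fault in the display code is what lets the two meanings come apart.

# A minimal sketch of the "electronic clock" described above.
# All names here are hypothetical, invented for illustration.

import time

class RadioClock:
    """A clock that keeps an internal copy of the 'correct' time."""

    def __init__(self):
        # The "world model": a number stored in memory that is
        # supposed to mean "the current time".
        self.internal_seconds = 0

    def sync_with_oracle(self):
        # Stand-in for consulting GPS or another over-the-air source.
        self.internal_seconds = int(time.time())

    def render_display(self):
        # A fault here produces the mismatch described above: the stored
        # model can be read to "mean" 10:27 while the display "means" 2:17.
        broken_offset = -8 * 3600  # hypothetical fault: eight hours slow
        shown = self.internal_seconds + broken_offset
        return time.strftime("%H:%M", time.localtime(shown))

clock = RadioClock()
clock.sync_with_oracle()
print(clock.render_display())  # shows a time that contradicts the stored model

The point of the sketch is only that there are two places a reader can assign meaning, the memory location and the display, and they can disagree; an LLM, on this argument, has only the display.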

LLMs have no world model. They are clock movements, just clock movements. Attach them to a screen and a keyboard, and you've given the clock hands and a face. You can now read meaning if you like, but to propose that the "meaning" is inside the LLM is simply false.

fin/
