This is a long podcast and I’ve not listened to all of it. I’m excerpting part of the conversation – which I’ve taken from this transcript – which speaks to two of my hobby horses: 1) the apparent lack of linguistic knowledge in the LLM community, and 2) the idea that LLMs are built on relationships between words. On the first point, I’m using Lex Fridman as a proxy for the LLM community, though he does not work on LLMs himself. But he is trained in computer science and machine learning, is generally familiar with LLMs, and has interviewed a number of experts in machine learning. I was a little surprised to hear that the idea of sentence structure being tree-like seemed new to him. I thought “everyone” knew that, where by “everyone” I mean anyone in the last 50 to 60 years with a technical interest in how the mind works.
The second point is where things get interesting. In the discussion of language, Gibson hammers home the distinction between form and meaning. And then, in discussing LLMs, he talks about them as being based on language forms, but not meaning. I think that’s right – keeping in mind that I use those terms a bit differently than they’re being used here (see this post, for example), but not in a way that’s inconsistent with what I believe Gibson is saying.
NOTE: The machine-generated transcript has the names reversed (as of today, Apr. 22), so I have corrected that. I have not checked it against the video for accuracy.
Linguistic knowledge and dependency theory
Let’s start with a bit from the beginning of the conversation:
LEX FRIDMAN: (00:03:23) Did you ever come across the philosophy angle of logic? If you think about the 80s with AI, the expert systems where you try to maybe sidestep the poetry of language and some of the syntax and the grammar and all that kinda’ stuff and go to the underlying meaning that language is trying to communicate and try to somehow compress that in a computer representable way? Did you ever come across that in your studies?
EDWARD GIBSON: (00:03:50) I probably did but I wasn’t as interested in it. I was trying to do the easier problems first, the ones I thought maybe were handleable, which seems like the syntax is easier, which is just the forms as opposed to the meaning. When you’re starting talking about the meaning, that’s a very hard problem and it still is a really, really hard problem. But the forms is easier. And so I thought at least figuring out the forms of human language, which sounds really hard but is actually maybe more attractable.
LEX FRIDMAN: (00:04:19) It’s interesting. You think there is a big divide, there’s a gap, there’s a distance between form and meaning, because that’s a question you have discussed a lot with LLMs because they’re damn good at form.
EDWARD GIBSON: (00:04:33) Yeah, I think that’s what they’re good at, is form. And that’s why they’re good, because they can do form, meanings are …
LEX FRIDMAN: (00:04:39) Do you think there’s … Oh, wow. It’s an open question.
EDWARD GIBSON: (00:04:42) Yeah.
LEX FRIDMAN: (00:04:43) How close form and meaning are. We’ll discuss it but to me studying form, maybe it’s a romantic notion it gives you. Form is the shadow of the bigger meaning thing underlying language. Language is how we communicate ideas. We communicate with each other using language. In understanding the structure of that communication, I think you start to understand the structure of thought and the structure of meaning behind those thoughts and communication, to me. But to you, big gap.
This is basic, very basic. It’s not that “form is the shadow of the bigger meaning” but that syntax is (based on) the form of meaning, the relationships between semantic elements.
Now we’re a bit later in the conversation where Gibson is talking about syntactic dependency.
LEX FRIDMAN: (00:10:59) [...] There’s so many things I want to ask you. Okay, let me just some basics. You mentioned dependencies a few times. What do you mean by dependencies?
EDWARD GIBSON: (00:11:12) Well, what I mean is in language, there’s three components to the structure of language. One is the sounds. Cat is C, A and T in English. I’m not talking about that part. Then there’s two meaning parts, and those are the words. And you were talking about meaning earlier. Words have a form and they have a meaning associated with them. And so cat is a full form in English and it has a meaning associated with whatever a cat is. And then the combinations of words, that’s what I’ll call grammar or syntax, that’s when I have a combination like the cat or two cats, okay, where I take two different words there and put together and I get a compositional meaning from putting those two different words together. And so that’s the syntax. And in any sentence or utterance, whatever, I’m talking to you, you’re talking to me, we have a bunch of words and we’re putting them together in a sequence, it turns out they are connected, so that every word is connected to just one other word in that sentence. And so you end up with what’s called technically a tree, it’s a tree structure, where there’s a root of that utterance, of that sentence. And then there’s a bunch of dependents, like branches from that root that go down to the words. The words are the leaves in this metaphor for a tree.
LEX FRIDMAN: (00:12:34) A tree is also a mathematical construct.
EDWARD GIBSON: (00:12:37) Yeah. It’s graph theoretical thing, exactly.
LEX FRIDMAN: (00:12:38) A graph theory thing. It’s fascinating that you can break down a sentence into a tree and then every word is hanging onto another, is depending on it. [...]
LEX FRIDMAN: (00:13:05) Can I pause on that?
EDWARD GIBSON: (00:13:06) Sure.
LEX FRIDMAN: (00:13:06) Because to me just as a layman, it is surprising that you can break down sentences in mostly all languages.
Again, this is fundamental. I suppose I had something of a preview of this sort of thing when I learned Reed-Kellogg sentence diagramming in the sixth grade. It’s still kicking around – there’s lots of stuff about it on the web – but I don’t know how routinely it’s taught these days. I learned about syntactic trees in my sophomore year in college when I took a course in psycholinguistics.
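Gibson’s description is easy to make concrete. Here’s a minimal sketch, in Python, of a dependency analysis for a sentence of my own choosing (“the cat chased a mouse” – not from the podcast): every word points to exactly one head, the verb is the root, and the whole thing prints out as a tree.

```python
# A minimal sketch of the dependency structure Gibson describes, for the
# sentence "the cat chased a mouse". The sentence and the attachments are
# my own illustrative choices, not taken from the podcast.

# Each word (except the root) points to exactly one head word.
# A head index of -1 marks the root of the sentence.
words = ["the", "cat", "chased", "a", "mouse"]
heads = [1, 2, -1, 4, 2]  # "the" -> "cat", "cat" -> "chased", "chased" = root, ...

def children(i):
    """Return the indices of the words that depend directly on word i."""
    return [j for j, h in enumerate(heads) if h == i]

def show(i, depth=0):
    """Print the tree from the root down, one word per line."""
    print("  " * depth + words[i])
    for j in children(i):
        show(j, depth + 1)

root = heads.index(-1)
show(root)
# chased
#   cat
#     the
#   mouse
#     a
```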
LEX FRIDMAN: (00:18:22) I love the terminology of agent and patient and the other ones you used. Those are linguistic terms, correct?
EDWARD GIBSON: (00:18:29) Those are for meaning, those are meaning. And subject and object are generally used for position. Subject is just the thing that comes before the verb and the object is the one that comes after the verb. The agent is the thing doing, that’s what that means. The subject is often the person doing the action, the thing.
LEX FRIDMAN: (00:18:48) Okay, this is fascinating. How hard is it to form a tree in general? Is there a procedure to it? If you look at different languages, is it supposed to be a very natural … Is it automatable or is there some human genius involved in construction …
EDWARD GIBSON: (00:19:01) I think it’s pretty automatable at this point. People can figure out what the words are. They can figure out the morphemes, technically morphemes are the minimal meaning units within a language, okay. And so when you say eats or drinks, it actually has two morphemes in English. There’s the root, which is the verb. And then there’s some ending on it which tells you that’s the third person singular.
I think that anyone working with LLMs should be conversant with the distinction between the meaning-bearing aspect of language and the positional aspect. They may not need that familiarity to work with transformers, but they should know that the distinction is basic to how language works. After all, the positionality of tokens is central to the transformer architecture. They should know that it’s central to language itself and not just an aspect of the transformer architecture.
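For what it’s worth, here’s a small sketch of the sinusoidal positional encoding from the original transformer paper (“Attention Is All You Need”), just to make concrete the point that position is explicitly built into the architecture. The model dimension and sequence length are arbitrary illustrative values.

```python
# A minimal sketch of sinusoidal positional encoding; the dimensions here
# are arbitrary illustrative values.
import numpy as np

def positional_encoding(seq_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(...)."""
    positions = np.arange(seq_len)[:, None]        # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]       # (1, d_model / 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# Each row is added to the embedding of the token at that position, so
# "the cat chased a mouse" and "a mouse chased the cat" get different
# representations even though they contain the same words.
pe = positional_encoding(seq_len=5, d_model=8)
print(pe.shape)  # (5, 8)
```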
LEX FRIDMAN: (00:35:43) So quick questions around all of this. So formal language theory is the big field of just studying language formally?
EDWARD GIBSON: (00:35:49) Yes. And it doesn’t have to be human language there. We can have a computer languages, any kind of system which is generating some set of expressions in a language. And those could be the statements in a computer language, for example. It could be that, or it could be human language.
LEX FRIDMAN: (00:36:10) So technically you can study programming languages?
EDWARD GIBSON: (00:36:12) Yes. And heavily studied using this formalism. There’s a big field of programming language within the formal language.
LEX FRIDMAN: (00:36:21) And then phrase structure grammar is this idea that you can break down language into this S NP VP type of thing?
EDWARD GIBSON: (00:36:28) Yeah. It’s a particular formalism for describing language. And Chomsky was the first one. He’s the one who figured that stuff out back in the ’50s. And that’s equivalent actually, the context-free grammar is actually, is equivalent in the sense that it generates the same sentences as a dependency grammar would. The dependency grammar is a little simpler in some way. You just have a root and it goes… We don’t have any of these, the rules are implicit, I guess. And we just have connections between words. The phrase structure grammar is a different way to think about the dependency grammar. It’s slightly more complicated, but it’s kind of the same in some ways.
Again, this strikes me as something that is absolutely basic. Programming languages and natural languages have something very deep in common – they're both languages! As such, they share common principles that can be studied formally.
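To make that concrete, here’s a toy context-free grammar of the S → NP VP sort Fridman mentions, written as ordinary Python data. The rules and vocabulary are my own illustrative choices, not a serious fragment of English; the same machinery, with different rules, describes the syntax of a programming language.

```python
# A toy context-free grammar; rules and vocabulary are illustrative only.
import random

grammar = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N":   [["cat"], ["mouse"]],
    "V":   [["chased"], ["slept"]],
}

def generate(symbol="S"):
    """Expand a symbol into a list of words by picking rules at random."""
    if symbol not in grammar:          # a terminal: already a word
        return [symbol]
    expansion = random.choice(grammar[symbol])
    return [word for s in expansion for word in generate(s)]

print(" ".join(generate()))  # e.g. "the cat chased a mouse"
```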
There’s more discussion of Chomsky, and in particular, of phrase structure grammar and the way it differs from dependency. Gibson points out (00:37:33):
So phrase structure grammar and dependency grammar aren’t that far apart. I like dependency grammar because it’s more perspicuous, it’s more transparent about representing the connections between the words. It’s just a little harder to see in phrase structure grammar.
In the context of LLMs that’s a very important difference. Phrase structure grammar obscures the relationships between syntax and semantics, form and meaning. Thinking in those terms makes it more difficult to see how LLMs, as models of forms, can track meaning well enough to function as effectively as they do.
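Here’s a rough sketch of what I mean. For a simple active sentence, labeled dependency edges line up almost directly with Gibson’s agent and patient roles. The sentence, the Universal Dependencies-style labels, and the crude mapping are all illustrative on my part; passives, among other things, would complicate the picture.

```python
# A rough, illustrative sketch of why dependency structure makes the
# form-meaning link easy to see: for a simple active clause, the labeled
# dependency edges track Gibson's agent/patient roles fairly directly.

# (head, label, dependent) triples for "the cat chased a mouse"
deps = [
    ("chased", "nsubj", "cat"),    # syntactic subject
    ("chased", "obj",   "mouse"),  # syntactic object
    ("cat",    "det",   "the"),
    ("mouse",  "det",   "a"),
]

# A crude form-to-meaning mapping, valid only for simple active clauses.
role_of = {"nsubj": "agent", "obj": "patient"}

for head, label, dep in deps:
    if label in role_of:
        print(f"{dep} is the {role_of[label]} of {head}")
# cat is the agent of chased
# mouse is the patient of chased
```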
LLMs as models of relational structure in language
This is the payoff.
LEX FRIDMAN: (01:30:35) Well, let’s take a stroll there. You wrote that the best current theories of human language are arguably large language models, so this has to do with form.
EDWARD GIBSON: (01:30:43) It’s a kind of a big theory, but the reason it’s arguably the best is that it does the best at predicting what’s English, for instance. It’s incredibly good, better than any other theory, but there’s not enough detail.
LEX FRIDMAN: (01:31:01) Well, it’s opaque. You don’t know what’s going on.
EDWARD GIBSON: (01:31:03) You don’t know what’s going on.
LEX FRIDMAN: (01:31:05) Black box.
EDWARD GIBSON: (01:31:06) It’s in a black box. But I think it is a theory.
LEX FRIDMAN: (01:31:08) What’s your definition of a theory? Because it’s a gigantic black box with a very large number of parameters controlling it. To me, theory usually requires a simplicity, right?
EDWARD GIBSON: (01:31:20) Well, I don’t know, maybe I’m just being loose there. I think it’s not a great theory, but it’s a theory. It’s a good theory in one sense in that it covers all the data. Anything you want to say in English, it does. And so that’s how it’s arguably the best, is that no other theory is as good as a large language model in predicting exactly what’s good and what’s bad in English. Now, you’re saying is it a good theory? Well, probably not because I want a smaller theory than that. It’s too big, I agree.
LEX FRIDMAN: (01:31:47) You could probably construct mechanism by which it can generate a simple explanation of a particular language, like a set of rules. Something like it could generate a dependency grammar for a language, right?
EDWARD GIBSON: (01:32:03) Yes.
LEX FRIDMAN: (01:32:03) You could probably just ask it about itself.
EDWARD GIBSON: (01:32:12) Well, that presumes, and there’s some evidence for this, that some large language models are implementing something like dependency grammar inside them. And so there’s work from a guy called Chris Manning and colleagues over at Stanford in natural language. And they looked at I don’t know how many large language model types, but certainly BERT and some others, where you do some kind of fancy math to figure out exactly what kind of abstractions of representations are going on, and they were saying it does look like dependency structure is what they’re constructing. It’s actually a very, very good map, so they are constructing something like that. Does it mean that they’re using that for meaning? Probably, but we don’t know.
LEX FRIDMAN: (01:33:01) You write that the kinds of theories of language that LLMs are closest to are called construction-based theories. Can you explain what construction-based theories are?
EDWARD GIBSON: (01:33:09) It’s just a general theory of language such that there’s a form and a meaning pair for lots of pieces of the language. And so it’s primarily usage-based is a construction grammar. It’s trying to deal with the things that people actually say and actually write, and so it’s a usage-based idea. What’s a construction?
Construction’s either a simple word, so a morpheme plus its meaning or a combination of words. It’s basically combinations of words, the rules, but it’s unspecified as to what the form of the grammar is underlyingly. And so I would argue that the dependency grammar is maybe the right form to use for the types of construction grammar. Construction grammar typically isn’t formalized quite, and so maybe a formalization of that might be in dependency grammar. I would think so, but it’s up to other researchers in that area if they agree or not.
LEX FRIDMAN: (01:34:16) Do you think that large language models understand language? Are they mimicking language? I guess the deeper question there is, are they just understanding the surface form or do they understand something deeper about the meaning that then generates the form?
EDWARD GIBSON: (01:34:33) I would argue they’re doing the form. They’re doing the form, they’re doing it really, really well. And are they doing the meaning? No, probably not. There’s lots of these examples from various groups showing that they can be tricked in all kinds of ways. They really don’t understand the meaning of what’s going on. And so there’s a lot of examples that he and other groups have given which show they don’t really understand what’s going on.
Right, LLMs are doing form. And if you have some understanding of the relationship between form and meaning, especially one grounded in dependency syntax, then it is not so difficult to see how a model of formal relations, like an LLM, can provide a convincing simulacrum of understanding. That in itself won’t tell you just how the model functions, but it gives you useful intuitions about what to look for when you pop the hood and start poking around. It’s an unsolved puzzle, but not an existential mystery.
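As a pointer to what that poking around can look like, here’s a minimal sketch of the kind of structural probe Gibson alludes to – I take him to mean the Hewitt and Manning work from Manning’s group – where you learn a linear map such that squared distances between transformed word vectors approximate distances in the gold dependency tree. The embeddings below are random placeholders standing in for a real model’s hidden states, and the probe here is only evaluated, not trained.

```python
# A sketch of a structural probe, assuming random stand-in embeddings.
import numpy as np

rng = np.random.default_rng(0)

n_words, d_model, probe_rank = 5, 16, 4
H = rng.normal(size=(n_words, d_model))     # stand-in for a model's hidden states
B = rng.normal(size=(probe_rank, d_model))  # the probe's linear map

def probe_distance(i, j):
    """Squared L2 distance between words i and j after the probe's map."""
    diff = B @ (H[i] - H[j])
    return float(diff @ diff)

# Gold tree distances (number of dependency edges between words) for
# "the cat chased a mouse", with "chased" as root, as in the sketch above.
gold = {(0, 1): 1, (1, 2): 1, (3, 4): 1, (4, 2): 1, (0, 2): 2, (3, 2): 2}

# Training would adjust B to shrink this gap, averaged over a treebank;
# a small gap means tree structure is linearly recoverable from the states.
loss = sum(abs(probe_distance(i, j) - d) for (i, j), d in gold.items()) / len(gold)
print(loss)
```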
* * * * *
Addendum: as an example of the kind of conceptual mischief that can result from ignorance about language and cognition, see this post: OpenAI Co-Founder Ilya Sutskever on the mystical powers of artificial neural nets.