Back in mid-March Scott Aaronson made a post entitled “On overexcitable children.” The post begins:
Wilbur and Orville are circumnavigating the Ohio cornfield in their Flyer. Children from the nearby farms have run over to watch, point, and gawk. But their parents know better.
An amusing toy, nothing more. Any talk of these small, brittle, crash-prone devices ferrying passengers across continents is obvious moonshine. One doesn’t know whether to laugh or cry that anyone could be so gullible.
Or if they were useful, then mostly for espionage and dropping bombs. They’re a negative contribution to the world, made by autistic nerds heedless of the dangers.

And so forth. He’s obviously using Wilbur and Orville as figures for the developers of AI and their toy as a figure for current devices.
The ensuing discussion was variously interesting, even exciting, mediocre and boring, and pointless, as such things are on the web tubes. But the good stuff makes it worthwhile for me to follow along and occasionally throw in my 2¢. Here’s a nickel’s worth.
General intelligence
@Scott #101: You observe:
On the other hand, with GPT, the world has just witnessed a striking empirical confirmation that, as you train up a language model, countless different abilities (coding, logic puzzles, math problems, language translation…) emerge around the same time, without having to be explicitly programmed. Doesn’t this support the idea that there is such a thing as “general intelligence” that continues to make sense for non-human entities?
I’m going to go all-out weasel and say, I don’t know what it does.
One problem is that “intelligence” isn’t just a word that means “can do a lot of cognitive stuff.” At some point in the last 100 years or so it has become surrounded by a quasi-mystical aura that gets in the way. Saying that “as we scale up GPTs they become more capable” doesn’t have quite the same oomph that “as we scale up GPTs they become more intelligent” does.
We know right now that GPT-3 and GPT-4 can do a lot more stuff at a pretty high level than any single human being can. We’ve got other, more specialized systems that can do a few things – play chess, Go, predict protein folding, design a control regime for plasma containment – better than any human can. For that matter, we’ve long had computers that can do arithmetic way better than any human. We think nothing of that because it’s merely routine and long has been. But things were trickier before the adoption of Arabic notation.
Getting back to the GPTs, are there any specialized cognitive tasks they can do better than the best human? I don’t know offhand; I’m just asking. But I suppose that’s what the discussion is about, better than the best human. What if GPT-X turns out to prove one of those theorems you’re interested in? What then? But what if that’s the only thing it does better than the best human, while in many other areas it’s better than GPT-4 but not up to merely superior (as opposed to the best) human performance? What then? I don’t know.
And I’m having trouble keeping track of my line of thought. Oh, OK, so GPT-4 knows lots more stuff than any one human. But it also messes up in simple ways. Given that it has some visual capabilities, I wonder if it would go “off the reservation” when confronted with The Towers of Warsaw* in the hilarious way that ChatGPT did? That’s a digression. Even within its range of capabilities, though, GPT-4 hallucinates. What are we to make of that?
What I make of it is that it’s a very difficult problem. I’m guessing that Gary Marcus would say that to solve the problem you need a world model. OK. But how do you keep the world model accurate and up-to-date? That, it seems to me, is a difficult problem for humans, very difficult. As far as I can tell, we deal with the problem by constantly communicating with one another on all sorts of things at all sorts of levels of sophistication.
Let me take another stab at it.
Given the interests of the people who comment here, examples of Ultimate Problems tend to be drawn from science and math. While I’ve got an educated person’s interest in those things, I’m driven by curiosity about other things. While I’ve got a general interest in language and the mind, I’m particularly interested in literature and, above all else, one poem in particular, Coleridge’s “Kubla Khan.” Why that poem?
Because I discovered that, by treating line-end punctuation like nested parentheses in a Lisp expression, the poem is structured like a pair of matryoshka dolls. The poem has two parts, the first twice as long as the second. Each of them is divided into three, the middle is in turn divided into three, and that middle once more into three. All other divisions are binary. And the last line of the first part turns up in the structural center of the second part. So: line 36, “A sunny pleasure-dome with caves of ice!”, and line 47, “That sunny dome! those caves of ice!”
The whole thing smelled like computation. But computation of what, and how? That’s what drove me to computational linguistics, which I found very interesting. But it didn’t solve my problem. So I’ve been working on that off and on ever since. Oh, I’ve spent a lot of time on other things, a lot of time, but I still check in with “Kubla Khan” every now and then.
I took another look last week (you’ll find diagrams at the link that make this a lot clearer). Between vector semantics and in-context learning I’ve made a bit more progress. Who knows, maybe GPT-X will be able to tell me what’s going on. And if it can tell me that, it’ll be able to tell us all a lot more about the human mind and about language.
Short of that, it would be nice to have a GPT, or some other LLM, that’s able to examine a literary text and tell me whether or not it exhibits ring-composition, which is generally depicted like this:
A, B, C...X...C’, B’, A’
It’s an obscure and all-but-forgotten topic in literary studies, more prominent among classicists and Biblical scholars. I learned about it from the late Mary Douglas, an important British anthropologist who got knighted, or whatever it is called for women, in recognition of her general work in anthropology. The two parts of “Kubla Khan” exhibit that form. But so do many other texts, like Conrad’s Heart of Darkness, Obama’s Eulogy for Clementa Pinckney, or, of all things, Pulp Fiction, perhaps Shakespeare’s Hamlet as well. Figuring that out is not rocket science. But it’s tricky and tedious.
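Just to make that scheme concrete, here’s a minimal sketch of my own (the function name and labels are hypothetical, nothing from the ring-composition literature) of what checking for the pattern might look like once a text has already been segmented and each segment given a thematic label. The hard part, of course, is producing those segments and labels in the first place; that’s the tricky and tedious bit.

    # A toy check for the ring pattern A, B, C ... X ... C', B', A'.
    # It assumes the analyst has already segmented the text and assigned each
    # segment a thematic label; a primed label is treated as matching its
    # unprimed counterpart.

    def is_ring(labels: list[str]) -> bool:
        """Return True if the label sequence is symmetric around a central X."""
        n = len(labels)
        if n % 2 == 0:                 # a ring needs a single central element
            return False
        base = [lab.rstrip("'") for lab in labels]
        return all(base[i] == base[n - 1 - i] for i in range(n // 2))

    # Hypothetical labelings of a text's segments:
    print(is_ring(["A", "B", "C", "X", "C'", "B'", "A'"]))   # True
    print(is_ring(["A", "B", "C", "X", "B'", "C'", "A'"]))   # False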
I’m afraid I’ve strayed rather far afield, Scott. But that’s more or less how I think about “intelligence” or whatever the heck it is. My interest in these matters seems to be dominated by a search for mechanisms, like those in literary texts. Thus, while I’m willing to take ChatGPT’s performance at face value – and believe there’s more coming down the pike – I really want to know how it works. That’s a far more compelling issue than whatever the heck intelligence is. [BTW, the Chatster can tell stories exhibiting ring-composition. That’s one kind of skill, but entirely different from being able to analyze and identify ring-composition in texts (or movies).]
*A variant of The Towers of Hanoi. The classic version is posed with three pegs and five graduated rings. Back in the early 70s some wise guys at Carnegie-Mellon posed a variant with five pegs and three rings.
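For readers who don’t know the puzzle, the classic three-peg version has a famously simple recursive solution; here’s a minimal sketch (the names and output format are just mine). The multi-peg variants are where things get genuinely tricky.

    # Recursive solution to the classic three-peg Towers of Hanoi: move n
    # graduated rings from peg `src` to peg `dst`, using `aux` as the spare.

    def hanoi(n: int, src: str = "A", dst: str = "C", aux: str = "B") -> None:
        if n == 0:
            return
        hanoi(n - 1, src, aux, dst)          # park the n-1 smaller rings on the spare
        print(f"move ring {n}: {src} -> {dst}")
        hanoi(n - 1, aux, dst, src)          # bring the smaller rings back on top

    hanoi(3)   # prints 2**3 - 1 = 7 moves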
Scale & intelligence
Scott, let me take another crack at the question you posed in #101, if I may paraphrase: What do we make of the fact that all sorts of capabilities just keep showing up in GPTs without any explicit programming? First, let’s put on the table the work the Anthropic people have done on In-context Learning and Induction Heads. They are circuits that emerge at a certain (relatively early) point during training and seem to be capable of copying a sequence of tokens, completing a sequence, even pattern matching. What can you do with general pattern matching? Lots of things.
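To get a feel for why that kind of circuit buys you so much, here’s a toy sketch, purely illustrative and nothing like the actual learned circuit, of the completion behavior the Anthropic work describes: if [A][B] occurred earlier in the sequence and [A] shows up again, predict [B].

    # Cartoon of induction-head-style completion: when the current token has
    # appeared earlier in the sequence, predict the token that followed it then.
    # This mimics the [A][B] ... [A] -> [B] behavior, not the circuit itself.

    def induction_guess(tokens: list[str]) -> str | None:
        current = tokens[-1]
        for i in range(len(tokens) - 2, -1, -1):   # scan earlier positions, most recent first
            if tokens[i] == current:
                return tokens[i + 1]
        return None                                # nothing earlier to copy from

    print(induction_guess(["the", "cat", "sat", "on", "the"]))   # 'cat'
    print(induction_guess(["D", "U", "R", "S", "L", "E", "D"]))  # 'U'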
What follows is hardly a rigorous argument, but it’s a place to start. Consider the idea of a tree, by which I mean a kind of plant, not a mathematical object. Though trees are physical objects, they are conceptually abstract. No one ever saw a generic tree. What you see are individual examples of maple trees, or palm trees, or pine trees. Maples, palms, and pines appear quite different. Why would anyone ever think of them as the same kind of thing? Well, consider them in the context of bushes and shrubs, grasses, and flowers. In that context, their size would bring them together in similarity space, just as bushes and shrubs would have their region, grasses would have theirs, flowers would have theirs, and we’ll have to have a region for vines as well. So now we have trees, and bushes, etc. and what are they? They’re all plants. As such, they are distinguished from animals.
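A crude way to make “similarity space” concrete: give each kind of thing a few rough feature scores and see what groups together by distance. The features and numbers below are made up purely for illustration.

    # A made-up "similarity space": each kind of plant gets rough scores for
    # (typical height in meters, woody stem: 1 = yes, 0 = no). Seen against
    # grasses and flowers, maples, palms, and pines land near one another.

    plants = {
        "maple":  (20.0, 1.0),
        "palm":   (15.0, 1.0),
        "pine":   (25.0, 1.0),
        "bush":   ( 2.0, 1.0),
        "grass":  ( 0.3, 0.0),
        "flower": ( 0.5, 0.0),
    }

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Nearest neighbor for each kind: the trees pick other trees.
    for name, feats in plants.items():
        nearest = min((other for other in plants if other != name),
                      key=lambda other: dist(feats, plants[other]))
        print(f"{name:6s} -> {nearest}")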
So now we have all these abstract or general categories for plants and animals. Let’s take the whole abstraction process up a level and talk about species and genera and biological taxonomy in general. Perhaps we go up another level of abstraction from there and arrive at graph theory.
But to keep climbing to these higher levels of abstraction, we need more and more examples to deal with, and more compute to make all the comparisons and sort things out. And GPTs, of course, aren’t dealing with patterns over physical objects. They’re dealing with patterns over tokens. But those tokens encode all kinds of statements about physical and other kinds of objects. So it is not deeply surprising to me that when you throw enough compute at enough texts, all sorts of interesting capabilities show up. I’m not saying or implying that I understand how this works. I don’t. But it doesn’t violate my sense of how the world works.
Now, Jean Piaget, the great developmental psychologist, had this idea of reflective abstraction. He’s the one who elaborated on the idea that cognitive development happens in stages during a child’s life. Children aren’t just learning more and more facts. They’re developing more sophisticated ways of thinking. Thinking processes at a higher level take as their objects processes at a lower level. He was unclear on just how this works, and I’m not sure anyone has tried to figure it out – but then I haven’t looked into the stuff in a while. The general idea is simply that more sophisticated levels of thinking are built on lower-level processes. And, while he was mostly interested in child development, he also applied the idea to the cultural development of ideas.
So maybe one thing we’re seeing as GPTs scale up is phase changes in capability as we use more compute and more examples. The basic architecture remains the same, but significant jumps in capacity allow for new capabilities. To return to the maple trees and cows, it’s one thing to encompass more plants and animals in the system. But maybe it takes a major leap in compute to be able to abstract over that whole system and come up with the ideas of family, genus, species, and so forth.
Setting that aside, Richard Hanania has just done a podcast with Robin Hanson. In the section on Intelligence and “Betterness” Hanson observes:
So the issue is the kind of meaning behind various abstractions we use. So abstractions are powerful. We use abstractions to organize the world, and abstractions embody similarities between the things out there. And we care about our abstractions, and which pool we use.
But for some abstractions, they well summarize our ambitions and our hopes, but they don’t necessarily correspond to a thing out there, where there’s a knob on them you can turn and change things. So it’s important to distinguish which of our abstractions correspond to things that we can have more direct influence over, and which abstractions are just abstractions about our view of the world and our desires about the world. So that’s the key distinction here. We could talk about a good world and a happy world and a nice world, but there isn’t a knob in the world to turn out and make the world nicer in some sense.
In the next two sections (Knowledge Hierarchy and Innovation, The History of Economic Growth) Hanson tosses out some ideas about abstraction that are useful in thinking about these matters. Later:
And now we have this parameter intelligence, and the question is, what’s that? How does that fit in with all these other parameters? We don’t usually use intelligence, say, as a measure of a country or a measure of a firm. We use wealth or other parameters. If it’s equivalent, then fine. If it’s something separate then we want to go, “Well, what is that exactly?”
For an individual, we have this measure of intelligence for an individual in the sense that there’s a correlation across mental tasks and which ones they can do better. And then the question is, what’s the cause of that correlation? One theory is that some people’s brains just trigger faster, and if I got a brain that triggers faster, it can just think faster and then overall it can do more.
There are other theories, but there are ways to cash out, what is it that makes one person smarter than another? Maybe they just have a bigger brain. That’s one of the stories, a brain that triggers faster. Maybe a brain with certain modules that are more emphasized than others. Then that’s a story of the particular features of that brain, that makes it be able to do many more tasks.
And so on.