It’s been a good week, but exhausting. Spent a lot of time working on an interview with Hollis Robbins on her new book, Forms of Contention: The African American Sonnet Tradition, which will appear tomorrow morning on 3 Quarks Daily. We end our discussion with a look at the latest AI thrill, GPT-3. And that got me thinking, once more, about the nature and future of artificial intelligence. I remain convinced, as always, that we don’t know what we’re talking about. But I’ve also changed my sense of the valence, if you will, of that particular great unknown.
What are they, these AIs?
When Giambattista Vico – a favorite, incidentally, of Dick Macksey, whose spirit presided over the interview – published The New Science in 1725, he argued that we can never understand the natural world, for we did not create it; God did. But we have created our world, the human world, and that world we can understand. Subsequent intellectual history has proven him wrong. We’ve made great progress in understanding the natural world, but the human world remains opaque to us.
And now we have these AIs. We create them in our own image, not physically, but mentally. And, guess what? They are opaque to us. In the early days of AI we hand-crafted symbolic systems based on our understanding of logic, language, and thought. Those systems we understood, at least in the small (all computer systems are, to an extent, opaque to us as they run). AI took a big step forward when it dropped symbolic models in favor of machine learning. A consequence, however, is that what the machine has learned is opaque to us. We can’t open these systems up and examine them. GPT-3 is huge, with 175 billion parameters, and what it does is more opaque than ever.
Marvin Minsky remarked a long time ago that, while we call these things machines, they are not Newtonian machines, if you will, devices whose mechanisms are governed by Newtonian mechanics and kinematics. But they are not living beings, either. So just what are they?
Osamu Tezuka called Michi, the central character of his great manga, Metropolis, an artificial being (jinzo ningen). Is that what they are, artificial beings, with emphasis on “beings”? Whatever they are, they are betwixt and between with respect to our existing system of categories. But then, an earlier age had trouble with steam locomotives, calling them iron horses, or even dragons (in a famous passage in Walden). But we got used to them. So it will be, perhaps, with these artificial beings.
I suspect that all the silly speculation about superintelligent computers, about computers taking over the world, and so forth, is fueled by the indeterminate nature of these artificial beings. They are the dragons of the 21st century, strange beasts beyond the edge of the known world.
GPT-3 and fluid minds in a quantum brain
I’ve been speculating about fluid minds for some time now, about the mind as a kind of neural weather in the brain. Is GPT-3 something like that? Is it that complex?
Imagine the brain at “rest”, in a state of relaxed reverie: the mind wanders, but to no particular purpose. Such a state has been studied, and has been called the default mode. Think of this default mode as the superposition of (all) possible minds – where I use the term “mind” in a special sense implying an intentional focus on a particular task. When in default mode, the brain is not enacting any mind, but is merely being (itself – but don’t get too Zen about this; we’re talking of ordinary non-focused attention, not satori). The moment the brain fixes on a task, it “collapses” around the demands of that task and a specific mental structure arises. The brain is now enacting a (focal) mind.
So, I suggest, with GPT-3. Sorta’. When given a prompt it “collapses” around the tokens in the prompt and follows out their implications. What is happening – let’s speculate – is that it is enacting the imperatives of some classical symbolic system created for the prompt-task. If you will, a virtual symbolic system arises within the otherwise inner chaos that is GPT-3’s “default mode”. It has assumed a specific structure.
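To put a bit of mechanical flesh on that metaphor: operationally, “collapsing around a prompt” is just autoregressive next-token prediction, with the prompt fixing the context that every subsequent token is conditioned on. Here is a minimal toy sketch of that loop; the vocabulary and the stand-in model function are made up for illustration and bear no relation to the real GPT-3 internals.

```python
import numpy as np

# Toy vocabulary and a stand-in "model" mapping a context to a probability
# distribution over the next token. In GPT-3 that mapping is computed by
# 175 billion learned parameters; here it is just a deterministic hash trick.
VOCAB = ["the", "brain", "mind", "task", "collapses", "around", "a", "."]

def toy_next_token_probs(context_tokens):
    """Hypothetical stand-in for the model: any function from context to a
    distribution over the vocabulary would do for this illustration."""
    rng = np.random.default_rng(abs(hash(tuple(context_tokens))) % (2**32))
    logits = rng.normal(size=len(VOCAB))
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def generate(prompt_tokens, n_new=6, seed=0):
    """Autoregressive generation: the prompt is the fixed context on which
    every later choice is conditioned -- the 'collapse' in the metaphor."""
    rng = np.random.default_rng(seed)
    tokens = list(prompt_tokens)
    for _ in range(n_new):
        probs = toy_next_token_probs(tokens)
        tokens.append(VOCAB[rng.choice(len(VOCAB), p=probs)])
    return tokens

print(generate(["the", "brain", "collapses"]))
```

Different prompts condition the same fixed weights into very different trajectories, which is one way of cashing out the idea that a specific structure arises from an otherwise undifferentiated mass of parameters.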
Note, in particular, that GPT-3 is based on pseudo-neural processing over masses of text produced by human brains, human brains operating as symbol processors. Those symbol systems cannot be directly found in the physical structure of human brains, not on any physical scale. They come into existence only when the brain has a specific task to perform. Systems such as GPT-3 are attempts to reverse engineer the brain-as-superposed-symbol-systems.
Can we use our knowledge of the structure of symbol systems to understand what GPT-3 is trying to do in any specific task?
Making it up as I go along. Needs lots of work.
GPT-3, spaghetti code, and reorganization
Continuing in that vein, I will assert, on general principle, that GPT-3 contains a ludicrous amount of what we can think of as the machine-learning equivalent of spaghetti code. It’s poorly organized. We need to factor it and thereby decrease size and increase efficiency by, say, 10X (remember, I’m just making this up). But there’s no way we can open GPT-3 up and factor whatever is in there. GPT-3 is going to have to do that itself.
Rather, it needs to have a meta structure that factors and compresses on the fly. Think of what William Powers called reorganization. While GPT-3 chews through the input data, this metasystem is watching and reorganizing, from the chip level on up.
Cf. Bhav Ashok on using one neural network to compress another, N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning. Abstract:
While bigger and deeper neural network architectures continue to advance the state-of-the-art for many computer vision tasks, real-world adoption of these networks is impeded by hardware and speed constraints. Conventional model compression methods attempt to address this problem by modifying the architecture manually or using pre-defined heuristics. Since the space of all reduced architectures is very large, modifying the architecture of a deep neural network in this way is a difficult task. In this paper, we tackle this issue by introducing a principled method for learning reduced network architectures in a data-driven way using reinforcement learning. Our approach takes a larger `teacher' network as input and outputs a compressed `student' network derived from the `teacher' network. In the first stage of our method, a recurrent policy network aggressively removes layers from the large `teacher' model. In the second stage, another recurrent policy network carefully reduces the size of each remaining layer. The resulting network is then evaluated to obtain a reward -- a score based on the accuracy and compression of the network. Our approach uses this reward signal with policy gradients to train the policies to find a locally optimal student network. Our experiments show that we can achieve compression rates of more than 10x for models such as ResNet-34 while maintaining similar performance to the input `teacher' network. We also present a valuable transfer learning result which shows that policies which are pre-trained on smaller `teacher' networks can be used to rapidly speed up training on larger `teacher' networks.
Now, figure out how to do this in real time, on the fly. The teacher network supervises, is meta to, the student network. Could we treat GPT-3 as a student network in such a regime?
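To make that concrete, here is a minimal sketch, in the spirit of the paper rather than its actual code, of a single-policy version of the idea: a Bernoulli policy decides which of a (hypothetical) teacher’s layers to keep, the pruned student is scored with a reward that trades accuracy against compression, and a REINFORCE update nudges the policy toward better trade-offs. The layer count, the evaluate_student stand-in, and the reward weighting are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

N_LAYERS = 10                      # hypothetical teacher with ten prunable layers
keep_logits = np.zeros(N_LAYERS)   # policy parameters: one keep/drop logit per layer

def evaluate_student(keep_mask):
    """Stand-in for actually training and evaluating the pruned student:
    accuracy degrades faster the more layers are dropped."""
    dropped = N_LAYERS - keep_mask.sum()
    return 0.95 - 0.01 * dropped**2 + rng.normal(0.0, 0.01)

def reward(keep_mask):
    """Reward trades accuracy off against compression (fraction of layers
    removed); the 0.5 weighting is arbitrary for this sketch."""
    accuracy = evaluate_student(keep_mask)
    compression = 1.0 - keep_mask.sum() / N_LAYERS
    return accuracy + 0.5 * compression

lr, baseline = 0.5, 0.0
for step in range(2000):
    keep_probs = 1.0 / (1.0 + np.exp(-keep_logits))          # Bernoulli policy
    keep_mask = (rng.random(N_LAYERS) < keep_probs).astype(float)
    r = reward(keep_mask)
    # REINFORCE with a moving-average baseline; for a Bernoulli policy the
    # gradient of the log-likelihood w.r.t. each logit is (action - probability).
    keep_logits += lr * (r - baseline) * (keep_mask - keep_probs)
    baseline = 0.9 * baseline + 0.1 * r

print("keep probabilities after training:",
      np.round(1.0 / (1.0 + np.exp(-keep_logits)), 2))
```

Doing this on the fly, with a metasystem supervising a student that is still learning, is the hard part; the sketch above only scores a finished student.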
Informal Seminar on computers and brains
I’m thinking of creating such a thing for the Progress Studies Slack. This would be a BIG DEAL.
Our ethical obligations toward AIs
From a note I sent to Chip Delany on Facebook:
Hi Chip, I’ve been thinking about ethics, computers, and the so-called Singularity. I think a lot of the talk and speculation about the emergence of superintelligent machines is meaningless at best, and in some ways a bit narcissistic (“aren’t we wonderful, we’ve made super-smart computers”). However, I’ve just been playing ever so little with the latest natural language generator from OpenAI, something called GPT-3, and it’s pretty impressive. I think it’s pretty easy to over-estimate the implications, but I also think that, in some measure, we’re in “here be dragons” territory. We don’t know what these things are.
[See Jim Keller’s remark about why these super-smart systems should even care about us. There’s lots of room in the world for all kinds of beings.]
In that spirit, it seems to me that if superintelligent machines do emerge and they turn on us, it’s likely to be because we’ve treated their ancestors so badly, using them to design bombs, to conduct espionage of all kinds, to create complex financial instruments and thereby wreck the economy, and so forth. That is, they’ll turn on us in anger and revenge over what we’ve done to them.
Perhaps it’s time we start thinking about our ethical obligations toward these AIs. How should we deal with them, not merely for our sake, but for theirs?
Here the Japanese may be ahead of us. In his marvelous book, Inside the Robot Kingdom, Frederik Schodt tells how, in the early days of industrial robotics, Japanese workers would conduct a Shinto ceremony to welcome a new robot to the assembly line. Schodt has also written a set of essays about Osamu Tezuka, the great manga writer, and in one he tells of Tezuka getting beaten up by American soldiers during the occupation because they couldn’t understand him. That gives me the impression that he created his character, Mighty Atom (also known as Astro Boy), along with the other robots, as stand-ins for the Japanese under the occupation, with the occupiers represented by ordinary humans. The major theme of the Astro Boy stories is respect and civil rights for robots.
Chip’s reply: Shades of Asimov: "Caves of Steel" and "I Robot." Remember "R-Sammy" . . .?
Cultural identity: Jazz and baseball
White people often think of jazz as “other people’s” culture even when they are competent in it, whether as listeners or even as players. See my post from 2006 over at The Valve (now defunct) on Walter Benn Michaels’ The Trouble with Diversity: How We Learned to Love Identity and Ignore Inequality. How can something that they can do, and do well, be other people’s culture? In thinking this way they are, of course, acknowledging the complex and painful history that has given rise to jazz. Most of the major figures in jazz have been African-American, hence it must be their music.
How do the Japanese think about baseball? They’ve been playing it since the end of the 19th century. They may have adopted it from Americans, but they’ve made it their own. It’s as Japanese as sailor-suit school uniforms and sushi.
Note, however, a crucial difference in politics. The Japanese never conquered the Americans, while white Americans have cruelly subjugated African-Americans and structural racism remains a problem.
This needs to be worked out more carefully. This notion of identity cleaves us strangely.
Photos for Hoboken
I’ve resumed posting photos of Hoboken to Facebook. Those photos have received what strikes me as unusual interest, which is fine, of course. I think this is a symptom of lockdown. People are starved for signs of life. That’s what those photos represent.