Friday, July 24, 2020

0. The road ahead, into "the Singularity" and beyond [GPT-3]

This is the first in a series of posts where I set out a vision for the evolution of artificial intelligence beyond GPT-3 (GPT = Generative Pre-trained Transformer). As I explain in the next post in the series, “No meaning, no how”, GPT-3 is both a remarkable achievement – we are now at sea in the Singularity and there is no turning back – and a remarkable temptation, hence my name for the overall series, GPT-3: Rubicon and Waterloo. No doubt some will yield to that temptation, but others are already resisting it, and have been for a while. What will happen?

Of course I don’t know how things will unfold, but I have preferences.

The purpose of this series is to lay out those preferences. In the next section of this post I quote extensively from an article David Hays and I published in 1990, The Evolution of Cognition [1], the first in a series of essays in which we outline a view of human cultural evolution over the longue durée. Then I reprise the sketch of my (current) vision that I tucked into a comment at Marginal Revolution [3]. I conclude with some observations on the value of being old (priors!).

Beyond AGI

In “The Evolution of Cognition” David Hays and I argued that the long-term evolution of human culture follows from the architectural foundations of thought and communication: first speech, then writing, followed by systematized calculation, and most recently, computation. In discussing the importance of the computer we remark:
One of the problems we have with the computer is deciding what kind of thing it is, and therefore what sorts of tasks are suitable to it. The computer is ontologically ambiguous. Can it think, or only calculate? Is it a brain or only a machine?

The steam locomotive, the so-called iron horse, posed a similar problem for people at Rank 3. It is obviously a mechanism and it is inherently inanimate. Yet it is capable of autonomous motion, something heretofore only within the capacity of animals and humans. So, is it animate or not? Perhaps the key to acceptance of the iron horse was the adoption of a system of thought that permits separation of autonomous motion from autonomous decision. The iron horse is fearsome only if it may, at any time, choose to leave the tracks and come after you like a charging rhinoceros. Once the system of thought had shaken down in such a way that autonomous motion did not imply the capacity for decision, people made peace with the locomotive.

The computer is similarly ambiguous. It is clearly an inanimate machine. Yet we interact with it through language, a medium heretofore restricted to communication with other people. To be sure, computer languages are very restricted, but they are languages. They have words, punctuation marks, and syntactic rules. To learn to program computers we must extend our mechanisms for natural language.

As a consequence it is easy for many people to think of computers as people. Thus Joseph Weizenbaum, with considerable dis-ease and guilt, tells of discovering that his secretary “consults” Eliza—a simple program which mimics the responses of a psychotherapist—as though she were interacting with a real person (Weizenbaum 1976). Beyond this, there are researchers who think it inevitable that computers will surpass human intelligence and some who think that, at some time, it will be possible for people to achieve a peculiar kind of immortality by “downloading” their minds to a computer. As far as we can tell such speculation has no ground in either current practice or theory. It is projective fantasy, projection made easy, perhaps inevitable, by the ontological ambiguity of the computer. We still do, and forever will, put souls into things we cannot understand, and project onto them our own hostility and sexuality, and so forth.

A game of chess between a computer program and a human master is just as profoundly silly as a race between a horse-drawn stagecoach and a train. But the silliness is hard to see at the time. At the time it seems necessary to establish a purpose for humankind by asserting that we have capacities that it does not. It is truly difficult to give up the notion that one has to add “because . . . “ to the assertion “I’m important.” But the evolution of technology will eventually invalidate any claim that follows “because.” Sooner or later we will create a technology capable of doing what, heretofore, only we could.

That is where we are now. The notion of an AGI (artificial general intelligence) that will bootstrap itself into superintelligence is fantasy; it arises because, even after three-quarters of a century, computers are still strange to us. We design, build, and operate them; but they challenge us; they’ve got bugs, they crash, they don’t come to heel when we command. We don’t know what they are. That is certainly the case with GPT-3. We’ve built it; its performance amazes (but puzzles and disappoints as well). And we do not understand how it works. It is almost as puzzling to us as we are to ourselves. Surely we can change that, no?

We conclude the essay with this paragraph:
We know that children can learn to program, that they enjoy doing so, and that a suitable programming environment helps them to learn (Kay 1977, Papert 1980). Seymour Papert argues that programming allows children to master abstract concepts at an earlier age. In general it seems obvious to us that a generation of 20-year-olds who have been programming computers since they were 4 or 5 years old are going to think differently than we do. Most of what they have learned they will have learned from us. But they will have learned it in a different way. Their ontology will be different from ours. Concepts which tax our abilities may be routine for them, just as the calculus, which taxed the abilities of Leibniz and Newton, is routine for us. These children will have learned to learn Rank 4 concepts.

Frankly, I think we are behind the curve on this one. Had Hays and I hazarded a prediction about the advance of computing into the lives of children – “The Child is father of the Man”, as Wordsworth observed – I fear the current situation would have disappointed us.

Yes, relatively young programmers have done remarkable things and Silicon Valley teems with young virtuosi. It is not the virtuosi I’m concerned about. It is the average, which is too low, by far.

Oddly enough, the current pandemic may help raise that average, though only marginally. With at-home schooling looming in the future, school districts are beginning to buy laptops for children whose families cannot afford them, for without those machines those children will not be able to participate in the only education available to them. No doubt most of the instruction they receive through those machines will train them to be only passive consumers of computation, as most of us are and have been conditioned to be.

But some of them surely will be curious. They’ll take a look under the virtual hood – though some of them will undoubtedly open up the physical machine itself (not that there’s much to see, with so much action integrated on a single chip) – and begin tinkering around. And before you know it, they’ll do interesting things, and Peter Thiel is going to be handing out more of those $100,000 fellowships [2] to teens living in institutionally impoverished neighborhoods plagued by substandard infrastructure.

We’ll see.

The road ahead

On July 19, 2020, Tyler Cowen made a post to Marginal Revolution entitled “GPT-3, etc.” [3]. It consisted of an email from a reader who asserted, “When future AI textbooks are written, I could easily imagine them citing 2020 or 2021 as years when preliminary AGI first emerged. This is very different than my own previous personal forecasts for AGI emerging in something like 20-50 years…” As I’ve already indicated, I have my doubts about the concept of AGI.

The post has attracted 52 comments so far, more than a few of them of acceptable or even high quality. Here is a slightly revised version of the comment I made:
Yes, GPT-3 [may] be a game changer. But to get there from here we need to rethink a lot of things. And where that's going (that is, where I think it should best go) is more than I can do in a comment.

Right now, we're doing it wrong, headed in the wrong direction. AGI, a really good one, isn't going to be what we're imagining it to be, e.g. the Star Trek computer.

Think AI as platform, not feature (Andreessen) [4]. Obvious implication: the basic computer will be an AI-as-platform. Every human will get their own as a very young child. They'll grow with it; it'll grow with them. The child will care for it as with a pet. Hence we have ethical obligations to them. As the child grows, so does the pet – the pet will likely have to migrate to other physical platforms from time to time.

Machine learning was the key breakthrough. Rodney Brooks' Genghis, with its subsumption architecture, was a key development as well, for it was directed at robots moving about in the world. FWIW Brooks has teamed up with Gary Marcus and they think we need to add some old-school symbolic computing into the mix. I think they're right.
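
[An aside added for this post: for readers unfamiliar with the idea, here's a minimal sketch in Python of a subsumption-style controller – my own toy illustration, not Brooks' code – in which higher-priority behaviors override the ones beneath them. The sensor readings and action names are invented placeholders.]

# A minimal, illustrative sketch of a subsumption-style controller.
# This is NOT Brooks' implementation; sensor keys and actions are
# invented placeholders meant only to show the layering idea.

def avoid_obstacle(sensors):
    """Highest-priority layer: turn away if something is too close."""
    if sensors.get("obstacle_distance", 1.0) < 0.2:
        return "turn_away"
    return None  # defer to the layers below

def seek_light(sensors):
    """Middle layer: head toward a bright spot if one is visible."""
    if sensors.get("light_level", 0.0) > 0.5:
        return "move_toward_light"
    return None

def wander(sensors):
    """Lowest layer: default behavior when nothing else applies."""
    return "walk_forward"

# Higher layers come first; the first layer that returns an action
# subsumes (overrides) everything beneath it.
LAYERS = [avoid_obstacle, seek_light, wander]

def control_step(sensors):
    for behavior in LAYERS:
        action = behavior(sensors)
        if action is not None:
            return action

if __name__ == "__main__":
    print(control_step({"obstacle_distance": 0.1}))   # turn_away
    print(control_step({"light_level": 0.8}))         # move_toward_light
    print(control_step({}))                           # walk_forward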

Machines, however, have a hard time learning the natural world as humans do. We're born primed to deal with that world with millions of years of evolutionary history behind us. Machines, alas, are a blank slate.

The native environment for computers is, of course, the computational environment. That's where to apply machine learning. Note that writing code is one of GPT-3's skills.
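
[Another aside added for this post: to make that concrete, here's a minimal sketch of how one might ask GPT-3 to complete a bit of Python through the OpenAI API as it stood in 2020. The prompt, engine name, and parameter settings are illustrative assumptions, not a recommendation.]

# A minimal sketch, assuming the 2020-era OpenAI Python client and an API key.
# The prompt and parameter values are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; supply your own key

prompt = (
    "# Python 3\n"
    "# Return a list of the squares of the integers from 1 to n.\n"
    "def squares_up_to(n):\n"
)

response = openai.Completion.create(
    engine="davinci",   # base GPT-3 engine available at the time
    prompt=prompt,
    max_tokens=64,
    temperature=0.2,    # low temperature for more predictable code
    stop=["\n\n"],      # stop at the first blank line
)

print(prompt + response.choices[0].text)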

So, the AGI of the future, let's call it GPT-42, will be looking in two directions, toward the world of computers and toward the human world. It will be learning in both, but in different styles and to different ends. In its interaction with other artificial computational entities GPT-42 is in its native milieu. In its interaction with us, well, we'll necessarily be in the driver's seat.

Where are we with respect to the hockey stick growth curve? For the last three-quarters of a century, since the end of WWII, we've been moving horizontally, along a plateau, developing tech. GPT-3 is one signal that we've reached the toe of the next curve. But to move up the curve, as I've said, we have to rethink the whole shebang.

We're IN the Singularity. Here be dragons.

[Super-intelligent computers emerging out of the FOOM is bullshit.]

* * * * *

ADDENDUM: A friend of mine, David Porush, has reminded me that Neal Stephenson has written of such a tutor in The Diamond Age: Or, A Young Lady's Illustrated Primer (1995) [5]. I then remembered that I have played the role of such a tutor in real life, as recounted in The Freedoniad: A Tale of Epic Adventure in which Two BFFs Travel the Universe and End up in Dunkirk, New York [6].

As currently envisioned, the next post in this series will elaborate on the material highlighted in yellow. I have no idea how many posts it will take me to work my way to the end. But if the number rises above ten, then I’m doing it wrong.

On the value of being old (priors!)

I normally do not pull rank, because I have no rank to pull. I am an independent scholar and have been so for years. The choice has not been freely made; I would have preferred to remain on the faculty of a good university. That did not happen. It has long been obvious that I could not have done my work at most universities. Perhaps here or there; but the number of suitable slots is very small, and not advertised; they’re accidents of capricious circumstance. In lieu of rank all I have is the fact that I’ve been around for a while, and that has value. Not automatic value, not inherent value, but if you live right and keep changing your mind, you learn something.

I was trained in computational semantics by the late David Hays, a first-generation researcher in machine translation and one of the founders of computational linguistics [7]. He saw the collapse of that enterprise in the mid-1960s because it over-promised and under-delivered. He learned from that collapse, but of course, I could not. For me it was just something that had happened in the past. I could listen to the lessons Hays had taken from those events, and believe them, but those lessons weren’t my lessons. I did not have to adjust my priors to accommodate those events.

Symbolic AI, roughly similar to the computational semantics I learned from Hays, collapsed in the mid-1980s. I had fully expected to see the development of symbolic systems capable of “reading” a Shakespeare play in an intellectually interesting way [8]. That was not to be. I have made other adjustments, in response to other events, since then. I have NOT kept to a straight and narrow path. My road has been a winding one.

But I have kept moving.

I have always believed that you should commit yourself to the strongest intellectual position you can, but not in the expectation that it will pan out or that it is your duty to make it pan out come hell or high water. No, you do it because it maximizes your ability to learn from what you got wrong. If you don’t establish firm priors, you can’t correct them effectively.

My intellectual career has thus been a long sequence of error-correcting maneuvers. Have I got it right at long last?

Are you crazy?

This post and the ones to follow are no more than my best assessment of the current situation, subject to the fact that I’m doing this quickly. I will surely be wrong in many particulars, and perhaps in overall direction as well. Consider this series to be a set of Bayesian priors subject to correction by later events.

* * * * *

Posts in this series are gathered under this link: Rubicon-Waterloo.

References

[1] William L. Benzon and David G. Hays, The Evolution of Cognition, Journal of Social and Biological Structures 13(4): 297-320, 1990, https://www.academia.edu/243486/The_Evolution_of_Cognition.

[2] Peter Thiel offers $100,000 fellowships to talented young people provided they drop out of college so they can do new things, https://thielfellowship.org/.

[3] "GPT-3, etc." Marginal Revolution, blog post, July 19, 2020, https://marginalrevolution.com/marginalrevolution/2020/07/gpt-3-etc.html

[4] "Is AI a feature or a platform? [machine learning, artificial neural nets]", New Savanna, blog post, December 13, 2019, https://new-savanna.blogspot.com/2019/12/is-ai-feature-or-platfrom-machine.html.

[5] The Diamond Age, Wikipedia: https://en.wikipedia.org/wiki/The_Diamond_Age.

[6] "The Freedoniad: A Tale of Epic Adventure in which Two BFFs Travel the Universe and End up in Dunkirk, New York," New Savanna, blog post, February 12, 2019, https://new-savanna.blogspot.com/2014/10/the-freedoniad-tale-of-epic-adventure.html.https://thielfellowship.org/

[7] David G. Hays, Wikipedia, https://en.wikipedia.org/wiki/David_G._Hays.

[8] See the discussion of the Prospero system on pages 271-273 of William Benzon and David G. Hays, Computational Linguistics and the Humanist, Computers and the Humanities, Vol. 10, 1976, pp. 265-274, https://www.academia.edu/1334653/Computational_Linguistics_and_the_Humanist.
