I know what the terms mean individually, and have some sense of what the phrase means. But, come to think of it, “level” seems to imply some linear scale, like IQ. Is that what intelligence means in this context, what IQ tests measure? That is to say, is that how the concept of human level intelligence is to be operationalized, as the methodologists say?
As I indicated a few weeks ago, I’m happy treating intelligence as simply a measure in the way that, say, acceleration is a measure of automobile performance. If I do that, however, I don’t think it makes much sense to reify that measure as a kind of device, or system that one can design and build. While the many components of an automobile have various effects on its acceleration, some more (the engine) than others (the fabric on the back seat), the automobile doesn’t have an acceleration system as such.
And yet discussions of artificial intelligence seem to use the term, intelligence, in that way. So, how is that term operationalized? What I’m seeing in current Twitter debates indicates that the most common, but not the only, operationalization is simply the intuitive judgments of people engaging in the discussion. What are those intuitions based on?
More often than not, they’re based on an informal version of Turing’s imitation game. Turing proposed the game in a very specific format: “... an interrogator asks questions of a man and a woman in another room in order to determine the correct sex of the two players.” The informal version is some version of: “Is this behavior human?” Well, we know it isn’t – whether it’s an image generated from a caption or some prose generated from a prompt – and so the question becomes something like, “If machines can do this now, is human level artificial intelligence just around the corner?” Some say yes, and some say no. And some present actual arguments, though my impression is that, in the Twitterverse, the arguments mostly come from the negative side of the question. That may well just be my local Twitterverse.
Many of those who enthusiastically believe that, yes, these remarkable exhibits betoken the arrival of AGI seem to be AI developers. They are experts. But experts in just what, exactly?
Being able to participate in the development of AI systems is one thing. Being able to judge whether or not some bit of behavior is human behavior, that’s something else entirely, is it not? As far as I can tell, no one is saying that these various models are pretty much like human mentation – though some folks do like to skate close to the line every once in a while. It’s the behavior that’s in question and, if this behavior really is human-like, then, so the reasoning goes, the engines that created it must be the way to go.
There is no doubt that, on the surface, these machines exhibit remarkable behavior. What’s in question is what we can infer about the possibilities for future behavior from a) present behavior in conjunction with b) our knowledge of how that behavior is produced. We know a great deal about how to produce the machines that produce the behavior. After all, we created them. But we created them in such a way – via machine learning – that we don’t have direct and convenient access to the mechanisms the machines use to generate their behavior. There’s the problem.
So, we have behavior B, which is remarkably human-like, but not, shall we say, complete. B was produced by machine Z, whose mechanisms are obscure. Machine Z was in turn produced by machine X, which we designed and constructed. Is it possible that X1, which is pretty much like X and which we know how to create, will produce Z1, and the behavior produced by Z1 will be complete? Some say yes, some say no.
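To make that schema concrete, here is a minimal sketch in Python. The names training_procedure_X, Z, and behavior_B are just my labels for the roles in the paragraph above, and the model is deliberately trivial; unlike a real Z, its single learned parameter is perfectly legible, but the division of labor is the same: we write X, X produces Z, and Z produces the behavior we judge.

```python
# Toy illustration of the X -> Z -> B schema (hypothetical names).
# X: a training procedure we design and fully understand.
# Z: the artifact X produces; we specified the process that found its
#    parameters, not the parameters themselves.
# B: the behavior Z exhibits when given an input.

import random

def training_procedure_X(examples, steps=5000, lr=0.01):
    """X: gradient descent on a one-parameter model. We wrote every line."""
    w = 0.0
    for _ in range(steps):
        x, y = random.choice(examples)
        pred = w * x
        w -= lr * (pred - y) * x  # update rule we chose deliberately
    return w  # Z: a learned parameter, discovered rather than hand-set

def behavior_B(z, prompt):
    """B: what the trained artifact does with an input."""
    return z * prompt

data = [(x, 3.0 * x) for x in range(1, 10)]  # the 'world' Z is trained on
Z = training_procedure_X(data)
print(behavior_B(Z, 7))  # roughly 21.0: competent behavior from a learned parameter
```

In this toy case Z is a single number we can inspect at a glance; the difficulty in the real case is only that Z consists of billions of such numbers, so the behavior is legible while the mechanism is not.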
But the enthusiasm of the Yessers seems largely driven by a combination of 1) the convincing nature of current behavior, and 2) some unknown factor, call it Ω. Now maybe Ω is something like keen intuitive insight into the operation of those Z devices. It may also be professional vanity or boyish enthusiasm – not mutually exclusive by any means. If keen intuitive insight, is that insight good enough to enable firm predictions about the relationship between design changes in new X devices and the subsequent behavior of the correlative Z devices?
How many of these Yessers know something about human cognition and behavior? I don’t know. I expect it varies from person to person, but such knowledge isn’t required in order to be expert in the creation of X devices. I’m sure that some of the Naysayers, such as Gary Marcus, know a great deal about human cognition and behavior. What’s the distribution of such knowledge among the Yessers and the Naysayers? I don’t know. I don’t believe, however, that the Naysayers claim to know what those Z devices are doing. So, even if, on average, they know more about the human mind than the Yessers, why should that count for anything in these debates?
What the Yessers do know is that, back in the days of symbolic AI, people used knowledge of human behavior to design AI systems, and those old systems don’t work as well as the new ones. So why should that knowledge count for anything?
Now, just to be clear, I’m among the Naysayers, I claim to know a great deal about the human mind, and I believe that knowledge is relevant. But how and why? I do note that my willingness to credit GPT-3 as a (possible) breakthrough is related to my knowledge of the human mind, which is the basis of my GPT-3 working paper.
Finally – for this has gone on long enough – I note that there is something to be learned from the fact that these X engines, as I have lapsed into calling them, require enormous amounts of data to generate plausible Z engines. Everyone acknowledges this and regards it as a problem. I mention it simply to point out that it is a fact. That fact points to something. What? Something about the world? About the mind? Perhaps it’s something about the lumpiness of the world, as I put it in, What economic growth and statistical semantics tell us about the structure of the world.
More later.