Kevin Harlan's Westwood One radio call of the cat on the field is, as you might expect, an all-time great call. How much of a pro is Harlan? He worked a sponsor read into it. pic.twitter.com/3x0MVNEHNY (Timothy Burke, @bubbaprog, November 5, 2019)
I'd like to hear a computer deliver commentary like that. What's required? It's got to make sense of the visual scene in real time. And it's got to be able to focus on the cat while noticing the actions of the people as well. It must also know, of course, that the people are responding to and acting in relation to what the cat is doing. While this perceptual and cognitive activity is going on, the system must improvise appropriate commentary and then utter it, with appropriate intonation.
As someone with no more than a casual interest in football, I can tell you that I simply do not see what's happening on the field with anything like the detail and comprehension of play-by-play announcers. To be sure, they have the advantage of being there in the stands watching while I'm only watching on the TV. But still, they've watched many more football games than I have, and have analyzed those games, and so know how to follow the action.
Let us, however, for the sake of argument, imagine that we've got a computer system specialized for delivering football play-by-play commentary. In this case Harlan, the announcer, isn't calling a football play. Would a game-calling computer program even know about the existence of cats? If not, then it wouldn't be able to describe what's going on. Would such a system know about state troopers, who and what they are, why they'd be at a game, and what they're doing on the field? Routine play-by-play doesn't call for such knowledge. Nor does it call for knowledge of the tunnel leading through the stands to and from the field itself. No, even if we had a very good play-by-play system, it would not likely be equipped to handle cat-on-the-field, not unless it also had knowledge of things that are, for the most part, peripheral to the game itself.
Umm, err, maybe it would be able to improvise something about an unidentified prowling animate object?
You think so, eh? What would THAT require? How'd it come up with that weird generalization, "unidentified prowling animate object"?
Imagine that you're the developer of a football play-by-play system. You realize that it may well be called on to provide chatter about non-football things, like cat-on-the-field, or maybe just about the weather. What non-football information and knowledge are you going to equip your system with? Non-football knowledge, that's a vast and unbounded category.
Setting aside the sensory-motor aspect of the problem, this is what we mean by the problem of commonsense knowledge, all those little bits of routine knowledge we have about the world. It's one of the problems that contributed to the implosion of symbolic AI back in the mid-1980s. Marvelous though current state-of-the-art AI systems can be, they've got problems with commonsense knowledge as well, and at least some commentators recognize it.