Friday, October 19, 2018

Cosma Shalizi on machine learning, AI


Machine learning is mostly nonparametric regression:
The second point is about models --- it's a machine learning point. The dirty secret of the field, and of the current hype, is that 90% of machine learning is a rebranding of nonparametric regression. (I've got appointments in both ML and statistics so I can say these things without hurting my students.) I realize that there are reasons why the overwhelming majority of the time you work with linear regression, but those reasons aren't really about your best economic models and theories. Those reasons are about what has, in the past, been statistically and computationally feasible to estimate and work with. (So they're "economic" reasons in a sense, but about your own economies as researchers, not about economics-as-a-science.) The data will never completely speak for itself, you will always need to bring some assumptions to draw inferences. But it's now possible to make those assumptions vastly weaker, and to let the data say a lot more. Maybe everything will turn out to be nice and linear, but even if that's so, wouldn't it be nice to know that, rather than to just hope?
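
Not anything of Shalizi's, but a minimal sketch of the contrast he is drawing, with invented data and two off-the-shelf scikit-learn estimators standing in for the linear and the nonparametric options:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(0)

    # Toy data: a nonlinear relationship plus noise (purely illustrative).
    x = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(2 * x).ravel() + 0.3 * rng.normal(size=200)

    # The linear model imposes the strong assumption y ~ a*x + b.
    linear = LinearRegression().fit(x, y)

    # k-nearest-neighbors imposes almost no functional form; it just
    # averages the outcomes of nearby observations.
    knn = KNeighborsRegressor(n_neighbors=10).fit(x, y)

    print("linear R^2:", round(linear.score(x, y), 3))
    print("k-NN R^2  :", round(knn.score(x, y), 3))
    # If the truth really were linear, the two would score about the same,
    # which is the point: the flexible model lets you check rather than hope.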

There is of course a limitation to using more flexible models, which impose fewer assumptions, which is that it makes it easier to "over-fit" the data, to create a really complicated model which basically memorizes every little accident and even error in what it was trained on. It may not, when you examine it, look like it's just memorizing, it may seem to give an "explanation" for every little wiggle. It will, in effect, say things like "oh, sure, normally the central bank raising interest rates would do X, but in this episode it was also liberalizing the capital account, so Y". But the way to guard against this, and to make sure your model, or the person selling you their model, isn't just BS-ing is to check that it can actually predict out-of-sample, on data it didn't get to see during fitting. This sort of cross-validation has become second nature for (honest and competent) machine learning practitioners.
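
A rough sketch of that check, again with made-up data (here the truth actually is linear): the over-flexible fit looks better in-sample and falls apart under cross-validation.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    x = rng.uniform(-1, 1, size=(40, 1))
    y = x.ravel() + 0.5 * rng.normal(size=40)   # the truth is linear plus noise

    for degree in (1, 15):
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        train_r2 = model.fit(x, y).score(x, y)
        cv_r2 = cross_val_score(model, x, y, cv=5).mean()
        print(f"degree {degree:2d}: train R^2 = {train_r2:.2f}, 5-fold CV R^2 = {cv_r2:.2f}")

    # The degree-15 model "explains" every wiggle in the training data,
    # but its cross-validated score collapses: it was memorizing noise.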

This is also where lots of ML projects die. I think I can mention an effort at a Very Big Data Indeed Company to predict employee satisfaction and turn-over based on e-mail activity, which seemed to work great on the training data, but turned out to be totally useless on the next year's data, so its creators never deployed it. Cross-validation should become second nature for economists, and you should be very suspicious of anyone offering you models who can't tell you about their out-of-sample performance. (If a model can't even predict well under a constant policy, why on Earth would you trust it to predict responses to policy changes?)
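
The simplest version of that discipline is to fit on one period and score on the next. A generic sketch, with nothing of the actual company's data in it; the "features" and "outcome" below are random noise by construction, roughly what the anecdote suggests the real signal was worth:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(2)

    # Stand-ins for "e-mail activity" features and a "satisfaction" outcome
    # in two consecutive years; here they are pure noise by construction.
    X_2017, y_2017 = rng.normal(size=(500, 8)), rng.normal(size=500)
    X_2018, y_2018 = rng.normal(size=(500, 8)), rng.normal(size=500)

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_2017, y_2017)

    print("fit-year R^2 :", round(model.score(X_2017, y_2017), 2))   # looks impressive
    print("next-year R^2:", round(model.score(X_2018, y_2018), 2))   # the real test
    # In-sample the forest appears to "work great"; on the following
    # year it is worthless, because there was never any signal to learn.
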
AI these days:
The third point, the most purely cautionary one, is the artificial intelligence point. This is that almost everything people are calling "AI" these days is just machine learning, which is to say, nonparametric regression. Where we have seen breakthroughs is in the results of applying huge quantities of data to flexible models to do very particular tasks in very particular environments. The systems we get from this are really good at that, but really fragile, in ways that don't mesh well with our intuition about human beings or even other animals. One of the great illustrations of this is what are called "adversarial examples", where you can take an image that a state-of-the-art classifier thinks is, say, a dog, and by tweaking it in tiny ways which are imperceptible to humans, you can make the classifier convinced it's, say, a car. On the other hand, you can distort that picture of a dog into something unrecognizable to any person while the classifier is still sure it's a dog.
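
The mechanism is easy to reproduce even without a deep network. A toy sketch, with a plain logistic regression standing in for the image classifier and synthetic data standing in for the images: nudging every feature by a tiny amount in the direction the model is most sensitive to is usually enough to flip its answer.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)

    # Synthetic stand-in for an image task: 784 "pixels", two classes.
    n, d = 1000, 784
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = (X @ w_true > 0).astype(int)

    clf = LogisticRegression(max_iter=1000).fit(X, y)

    x = X[0]
    pred = clf.predict([x])[0]
    # Move each feature a little (0.1) in whichever direction hurts the
    # current prediction most, i.e. against the sign of the model's weights.
    x_adv = x - 0.1 * np.sign(clf.coef_[0]) * (2 * pred - 1)

    print("original prediction :", pred)
    print("perturbed prediction:", clf.predict([x_adv])[0])
    print("largest single-feature change:", np.abs(x_adv - x).max().round(3))
    # With hundreds of features, many tiny changes add up to a large swing
    # in the model's score, so the label usually flips even though each
    # individual change is negligible.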

If we have to talk about our learning machines psychologically, we should describe them not as automating thought or (conscious) intelligence, but rather as automating unconscious perception or reflex action. What's now called "deep learning" used to be called "perceptrons", and it was very much about trying to do the same sort of thing that low-level perception in animals does, extracting features from the environment which work in that environment to make a behaviorally-relevant classification or prediction or immediate action.
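
The old name describes the old mechanism fairly directly. Here is a minimal perceptron in the classic Rosenblatt style, on invented data: weighted features, a threshold, and a weight nudge on every mistake, which is much closer to a trained reflex than to reasoning.

    import numpy as np

    rng = np.random.default_rng(4)

    # Invented "sensory" features and a binary behavioral label (+1 / -1).
    X = rng.normal(size=(200, 5))
    w_true = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
    y = np.sign(X @ w_true)

    # Rosenblatt's rule: whenever a point is misclassified, nudge the
    # weights toward it. No model of the world, just a trained reflex.
    w = np.zeros(5)
    for _ in range(20):                     # a few passes over the data
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:          # wrong (or undecided) response
                w += yi * xi

    print("training accuracy:", np.mean(np.sign(X @ w) == y))
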
Nota bene: "Rodney Brooks, one of the Revered Elders of artificial intelligence, recently put it nicely, saying that the performances of these systems give us a very misleading idea of their competences."

5 comments:

  1. Thank you for this post. I would never have seen Shalizi's remarks, and they nicely summarize some thoughts that had been bouncing around in my head, mixed with a bunch of insights that seem obvious now but hadn't occurred to me (how easy it is to overfit with fewer constraints being the big one).

    1. Thanks for your comment, Tom. Shalizi is very sharp. You should cruise by his site and look around (I've now fixed the link so it works).

  2. Dear Prof. Benzon, thank you for the post and for the excellent blog. I would be interested to see your opinion on the issue of nonparametric regression when it comes to language acquisition. In the Chomskyan tradition of linguistics we are used to arguing that plausible learnability can only be achieved by constraining the hypothesis space within some finite dimensions. But we probably need a change of perspective...

    1. I think language and language learning are grounded in semantics. We are born with certain perceptual-motor capacities. They develop during the first years of life, and it is THOSE developed capacities that constrain the "hypothesis space" for the acquisition of syntax.

    2. Thank you for your reply!
