ML researchers:
Late 1990s: "Method X is bad because the loss is non convex, there are no generalization bounds, and it's not properly regularized"
Early 2020s: "Method X is non convex, has no generalization bound, and is wildly over-parameterized. But it works great!"
— Yann LeCun (@ylecun) July 13, 2021
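The point of the tweet can be seen even in a toy setting. Below is a minimal illustrative sketch (not anything from the thread itself): plain gradient descent on a deliberately non-convex loss, a double well with minima at w = -1 and w = +1. Convexity is no prerequisite for gradient-based optimization to reach a good minimum in practice.

```python
def loss(w):
    # Double-well loss: non-convex, with two global minima at w = -1 and w = +1.
    return (w * w - 1.0) ** 2

def grad(w):
    # d/dw (w^2 - 1)^2 = 4w(w^2 - 1)
    return 4.0 * w * (w * w - 1.0)

w = 0.3    # arbitrary non-zero starting point
lr = 0.05  # step size
for _ in range(200):
    w -= lr * grad(w)

# Gradient descent settles into the basin it started in, here the minimum near w = 1,
# driving the loss close to zero despite the non-convexity.
print(w, loss(w))
```

No generalization bound, no convexity certificate, and yet the optimizer lands in a minimum just fine, which is the "but it works great!" in miniature.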
ML researchers:
Late 1990s: "Method X is worthless because the Matlab code takes more than 20 minutes to converge"
Early 2020s: "Method X is great because with <favorite_DL_framework>, I can train it on 10 billion samples using 1000 {GPU,TPU}s in less than a week."
— Yann LeCun (@ylecun) July 13, 2021
We do not have an answer to that question, and the gap to bridge is enormous (how can people learn to drive a car in 20h of practice?)
Decisive advances towards an answer will mark a new era in AI.
That's why I work on self-supervised learning.
It's our best shot at the moment.
— Yann LeCun (@ylecun) July 13, 2021
Now, by "new paradigms", I mean new learning paradigms.
And there is no doubt that they will involve some sort of gradient-based optimization applied to complex architectures (aka "deep learning").
The "new" part should focus on learning world models in a task independent way.
— Yann LeCun (@ylecun) July 13, 2021