Unbelievable results, feels like a dream—our R1 model is now #1 in the world (with style control)! Beyond words right now. All I know is we keep pushing forward to make open-source AGI a reality for everyone. ✨ #OpenSource #AI #AGI #DeepSeekR1 https://t.co/h0pT2Em14D
— Deli Chen (@victor207755822) January 24, 2025
Gary Marcus, The race for "AI Supremacy" is over — at least for now:
The race for "AI Supremacy" is over, at least for now, and the U.S. didn't win. Over the last few weeks, two companies in China released three impressive papers that annihilated any pretense that the US was decisively ahead. In late December, a company called DeepSeek, apparently initially built for quantitative trading rather than LLMs, produced a nearly state-of-the-art model that required only roughly 1/50th of the training costs of previous models, — instantly putting them in the big leagues with American companies like OpenAI, Google, and Anthropic, both in terms of performance and innovation. A couple weeks later, they followed up with a competitive (though not fully adequate) alternative to OpenAI's o1, called r1. Because it is more forthcoming in its internal process than o1, many researchers are already preferring it to OpenAI's o1 (which had been introduced to much fanfare in September 2024). And then ByteDance (parent company of TikTok) dropped a third bombshell, a new model that is even cheaper. Yesterday, a Hong Kong lab added yet a fourth advance, making a passable though less powerful version of r1 with even less training data.
None of this means however that China won the AI race or even took the lead. American companies will incorporate these new results and continue to produce new results of their own.
Instead, realistically, we are quickly converging on a tie — with some style points to China, for doing so much without hundreds of thousands of Nvidia H100s.
Others may catch up, too, because LLMs just got a lot cheaper, and consequently the requirement for vast arrays of special purpose hardware has somewhat diminished. There is almost no moat left whatsoever; new technical leads [are] very short lived, measured in months or even weeks, not years.
Later:
China caught up so quickly for many reasons. One that deserves Congressional investigation was Meta's decision to open source their LLMs. [...]
Putting so many eggs in Altman’s basket, as the White House did last week and others have before, may also prove to be a mistake in hindsight. [...] Altman may be a master salesman, but Musk is correct that the US should not be so reliant on him and should not have given him such an eminent seal of approval based on so little.
In a brutal, viral tweet that captures much of my own thinking here, the reporter Ryan Grim wrote yesterday about how the US government (with the notable exception of Lina Khan) has repeatedly screwed up by placating big companies and doing too little to foster independent innovation.
Marcus concludes:
The race to AGI will be won not by the country with the most chips but by the one that best fosters true innovation. That could be the US, China, or perhaps some other country less locked into LLM orthodoxy, and more willing to bet big on new ideas.
There's more at the link.
"It then thought for 20 paragraphs before outputting the joke! Here’s the full transcript and here’s the output at the end, after the </think> tag:
The Joke:
A pelican and a walrus decide to open a quaint little tea room together. The walrus, with its big size, struggles to find comfortable chairs, so it sits on the table by accident, knocking over the teapot. Meanwhile, the pelican, trying to help, uses its beak to place saucers on the table, causing a few spills.
After a series of comical mishaps, the walrus looks up and says with a grin, “This isn’t so fishy anymore.” The pelican smirks and remarks, “Maybe not, but we do have a lot of krill in our tea!”
Explanation:
Setup: Pelican and Walrus run an unlikely tea room. Their sizes lead to comedic chaos.
Punchline: “This isn’t so fishy anymore” is a pun on the situation becoming less absurd. The follow-up joke introduces “krill,” a play on words that ties into both marine life (“fish”) and tea, making it humorous while fitting the theme.
So yeah, the joke is awful. But the process of getting there was such an interesting insight into how these new models work."
...
"DeepSeek-R1 and exploring DeepSeek-R1-Distill-Llama-8B"
20th January 2025
https://simonwillison.net/2025/Jan/20/deepseek-r1/
I'd love to hear Trump yelling at the ever so smart centibillionaires!
Cheers, Dipity