Monday, February 3, 2025

TNSTAAFL, that goes for knowledge too. OR: Why there’s so much AI hype. [once more with the whales]

Why is there so much hype about AI? Sure, it’s new, it’s interesting, and it certainly has transformative potential. That’s one thing. But all this talk about AGI in five years, possibly followed by ASI, and then, who knows, perhaps DOOM! The machines will take over and humans will either be reduced to slavery or eliminated entirely. Where’d that come from?

[TNSTAAFL = there’s no such thing as a free lunch. AGI = artificial general intelligence. ASI = artificial superintelligence.]

Well, yeah, there’s fantasy. But I think something else is going on as well.

While there are other things going on, the excitement is centered on LLMs (large language models), the things that power chatbots such as ChatGPT, Gemini, Claude, and others. You don’t have to know much of anything about language, cognition, the imagination, or the human mind in order to create an LLM. You need to know something about programming computers, and you need to know a lot about engineering large-scale computer systems. If you have those skills, you can create an LLM. That’s where all your intellectual effort goes, into creating the LLM.

The LLM then goes on to crank out language, and really impressive language at that. That doesn’t require any intellectual effort from you. As far as you’re concerned, it’s free. It took some genuine insight to come up with the transformer architecture, the design all these LLMs are built on; that insight came from engineers at Google.
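
To make that architecture a little more concrete, here’s a minimal sketch of its core operation, scaled dot-product attention. This is an illustrative toy, not any production model’s code; the shapes and names are my own.

```python
# A minimal sketch of scaled dot-product attention, the core operation
# of the transformer architecture. Shapes and names are illustrative only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for numerical stability
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how strongly each token attends to each other token
    return softmax(scores) @ V       # weighted mix of the value vectors

# Toy example: 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = attention(x, x, x)  # self-attention: Q, K, V all derived from the same tokens
print(out.shape)          # (4, 8)
```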

OpenAI got ahold of the idea and built their GPT series. That’s all engineering. I saw some output from GPT-2. Not very impressive. GPT-3 was much more impressive. It was built on the same design as GPT-2, but just bigger. GPT-2 had 1.5 billion parameters; GPT-3 had 175 billion. I assume that the size difference required some very skillful engineering; but the underlying concept was the same.
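
To put “same design, just bigger” in concrete terms: a common back-of-envelope estimate for a transformer’s weight count is roughly 12 × layers × width². Plugging in the configurations reported in the GPT-2 and GPT-3 papers (48 layers at width 1600; 96 layers at width 12288) recovers the two figures above. The little script below is my own illustration, not anyone’s official code.

```python
# Rule-of-thumb transformer parameter count: ~12 * layers * width^2
# (attention + feed-forward weights; embedding tables ignored).
def approx_params(n_layers, d_model):
    return 12 * n_layers * d_model ** 2

print(f"GPT-2 XL: {approx_params(48, 1600) / 1e9:.2f}B parameters")   # ~1.47B  (reported: 1.5B)
print(f"GPT-3:    {approx_params(96, 12288) / 1e9:.1f}B parameters")  # ~173.9B (reported: 175B)
```

Same formula, bigger numbers: the leap in capability came from scale and engineering, not from a new idea about language.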

From an intellectual point of view, the dramatically increased performance from one model to the next was free. The same goes for the difference between GPT-3 and GPT-3.5 (which powered the original ChatGPT). And so it goes for GPT-4. (We’re still waiting for GPT-5.)

In that situation, where increased performance, even radically increased performance, demands no comparable increase in intellectual insight, in scientific understanding if you will, it’s easy to give in to one’s fantasies and generate hype by the bucketload. Forget the buckets. Let’s go for swimming pools, giant Olympic-sized swimming pools filled with hype.

And so, once again, I trot out my whaling analogy. Nineteenth-century whaling ships were three-masted, square-rigged vessels, just like the merchant ships that carried trade between ports in Europe and America. But the skills needed to work them were quite different. Now, take an expert captain and crew from a merchant vessel, put them on a whaler, and what happens? For one thing, the whaler has a try-works amidships, used to render whale oil from blubber. The merchant seamen have never seen one. But that’s a skill easily learned.

But sailing the treacherous seas around Cape Horn, that’s another matter. And once you’re through, you’ve got to hunt whales in the Pacific Ocean. If you’ve never done it before, how do you know where to look? And once you’ve spotted a whale, what then? How do whales behave? How do you go after them and kill them? No, I’m afraid the skills of a merchant seaman aren’t adequate to the task.

That’s what we’ve got in the case of deep learning, LLMs, and language. The people who’ve created the technology don’t know anything about language and cognition. They get the performance for free and don’t have the intellectual tools for thinking about what’s going on. So they throw hype into the void and hope it’ll make things right.

It won’t. They’re lost, and don’t know it.

Drinking Silicon Valley Joy Juice is not a formula for long-term success. 

* * * * *

NOTE: I used ChatGPT to create the image. If you look closely at the sign in the upper left, you’ll see that it elaborated TM (for trademark) into TMI (too much information), which is interesting, but not appropriate.
