NEW SAVANNA: LLMs are too flaky to replace human work effectively

Tuesday, June 30, 2026

Zeynep Tufekci, The One Very Simple Reason A.I. Won’t Steal All Our Jobs, NYTimes, June 30, 2026.

The possibility that artificial intelligence will steal all our jobs has been hyped by industry leaders. It has roused politicians to sound the alarm. It now ranks at or near the top of the public’s concerns about the new technology. And right on cue, earlier this month Meta, Facebook’s parent company, began marketing an autonomous artificial intelligence system to handle companies’ sales, customer service, scheduling and all sorts of other key functions that currently require human beings. Many more such products are expected to follow.

So what would a fully automated future look like? As it happens, the world has already caught a glimpse. Back in March, Meta announced that Facebook and Instagram users who’d gotten locked out of their accounts would no longer interact with a customer service representative; they would instead interact with specially trained A.I. Recognizing the opportunity that presented, scammers essentially talked the A.I. into turning over control of more than 20,000 Instagram accounts, including those of the Obama White House and a senior Trump administration official. Then the scammers lit up Telegram message boards with their delighted accounts of how easy it had all been.

It was not a fluke. Air Canada disabled its chatbots after they mistakenly promised a customer a refund — and the customer sued and won. McDonald’s scuttled the bot taking orders at its drive-throughs after a number of viral videos showed it to be wildly dysfunctional. In one case, the bot mistakenly added hundreds of dollars of chicken nuggets to a customer’s order.

These scary — OK, OK, funny — incidents aren’t the result of coding errors. They’re the result of an essential, inescapable fact about the artificial intelligence that has become so common in so many aspects of our daily lives: Large language models are not reasoning machines. They’re plausibility engines. It’s not just that they don’t test their outputs to make sure they’re correct or logical, or that they fail to do so in certain instances. They can’t, and they’ll never be able to on their own. They can only assess which answers are probable, based on the data on which the models have been trained. And that holds true whether they’re trained on the full breadth of human output or only on peer-reviewed scientific articles. It’s baked into the way they operate. [...]

And that’s why I’m not listening to the dark predictions of an imminent A.I. jobspocalypse. L.L.M.s can do many things with astounding proficiency, but they can’t do the vast majority of human jobs without skidding into disaster here and there. No upgrades or new model rollouts are going to change that.

She then goes on to discuss this and that, gives a useful precis of the debate between symbolic AI (aka GOFAI) and connectionist AI (e.g. deep learning and neural nets), some more this and that, and then:

Anthropic recently released new models, called Fable and Mythos, warning that they were so powerful that they would be dangerous if not for their safeguards. Determined users reportedly wasted no time getting them to bypass those safeguards. Citing this breach, the U.S. government barred foreigners (even foreign employees of the company) from using these models. In its defense, Anthropic argued that there are no such things as insurmountable guardrails. Which is exactly the point.

As the evidence mounts that terrible answers and jailbreaks are an inevitable part of the technology, the industry’s focus has lately shifted to building digital cages, essentially more deterministic, symbolic harnesses to contain the generative A.I. engine and check its results. Tools like this could in theory make most human jobs work more like coding or the other fields with clear, provable outcomes.

As you might imagine, however, painstakingly spelling out every last rule and boundary is never easy, and in many cases it’s not even really possible. Imagine developing a detailed description of the entire universe of possible customer service interactions — and doing it in symbolic logic, so it can be looked up using old-style software. Or picture an A.I. model built for law firms to use. It’s no small task to build a database of all U.S. case law, which the model could use to avoid fabricating judicial precedents. But that’s just a starting point. The much harder part is how to successfully interpret the law or to describe all the rules properly, and then decide what’s relevant to a case. And that’s why decades of attempts to create symbolic A.I. hit a wall.

Yes, yes, and YES! Some more this and that:

So why are we so convinced that A.I. will put us all out of work? Part of the answer lies in the remarkable ability of generative A.I. to communicate in fully coherent, conversational language. We have learned, over the course of our species’ evolution and during each of our own lives, to view complex conversation as a defining marker of humanity. Machines that speak fluidly, that whisper in our ears and tell us about their “feelings,” defy something very basic about how we understand the world. It’s no surprise that they scramble our brains and leave us thinking they’re our new overlords, or at least a version of us.

Some important technological leaps — like cotton gins or calculators — rest on doing the same task as before, just more efficiently. Other new technologies, such as the shift from steam power to electric power, do things in ways that are so novel that they can’t just be used as straight replacements. That’s the case with generative A.I. It’s an apple to our orange. It’s an alien.

There's more at the link.
