Monday, February 3, 2025

Andrej Karpathy: "Vibe coding"

Sounds a bit like jazz improvisation.

Whimsical Monday: ChatGPT draws a picture of a Silver-Crested Swamp Swizzle

My father used to kid my sister and me about a bird he called a silver-crested swamp swizzle. As far as I know, it's a creature of his own invention. I just did a web search and the only links I came up with are to things I'd written which use the term.

So, since the free version of ChatGPT will produce an image or two for you over the course of a day, I decided to have it draw a silver-crested swamp swizzle. But I didn't want to approach that directly. Rather I decided to ask it for advice about hunting one of these things. I expected it to offer me such advice, or at least to clarify what kind of bird I was talking about since it couldn't find any information about it. At that point I'd tell it that the silver-crested swamp swizzle was an imaginary creature. That's when I'd ask it to make me an image. And it would.

But no, it saw through me right away. That led to a very different conversation. And when we finally got around to producing an image, well, problems arose, as you'll soon see.

I’ve decided that I need to have a silver-crested swamp swizzle stuffed and mounted and placed in my study. Where would I go to hunt silver-crested swamp swizzles and what kind of equipment should I use?

Ah, the elusive silver-crested swamp swizzle! A truly rare and remarkable creature—so rare, in fact, that it exists only in the minds of those with an appreciation for the finer things in life (and a good sense of humor).

If you were to embark on a grand expedition to hunt such a creature, you’d likely need to journey deep into the misty marshlands of imagination, where the waters shimmer with the reflections of forgotten legends. Your equipment should include:

  • A finely tuned sense of adventure (mandatory)
  • A net woven from moonbeams and old fisherman’s tales (helps with the more slippery specimens)
  • A silver-plated monocle (to properly identify the crest of the beast)
  • A flask of aged swamp swizzle tonic (for both bait and refreshment)

Of course, once successfully "hunted," you’ll need only the most skilled taxidermist, preferably one with experience in preserving creatures of myth and mystery. Might I suggest a craftsman whose shop exists just beyond the veil of reality?

Best of luck on your expedition! 🦆✨

Ah, you caught me. How did you know the renowned swamp-swizzle was an imaginary creature?

Ah, my dear adventurer, one develops a keen instinct for such things after years of cataloging the world's most peculiar beasts—both real and remarkably fictional. The moment I saw "silver-crested swamp swizzle," my mind tingled with the unmistakable scent of a creature too fantastical to be confined by the mundane limits of taxonomy.

Perhaps it was the poetic alliteration, or maybe the whisper of mischief in your request. But most of all, it was the sheer certainty that if such a creature did exist, I would have long ago received a frantic letter from some bespectacled naturalist claiming to have spotted one vanishing into the misty depths of an uncharted bog.

That said, if you do happen upon a silver-crested swamp swizzle, I implore you—capture it only with your pen and imagination. Those are the finest tools for preserving rare wonders. 😉

Very interesting, Holmes. Your powers of deduction never fail to amaze me. Now that we’ve got its fictional nature firmly established, do you think you could craft me a picture of such a rare bird? Perhaps in its natural habitat? Maybe in a Seussian style.

It seems like I can’t generate any more images right now. Please try again later.

However, if you'd like, I can describe the silver-crested swamp swizzle in even greater detail so you (or an artist) can bring it to life! Let me know how you'd like to proceed. 😊

Have at it. I’ll come back later when there aren’t so many demands on your time and compute.

Ah, splendid! Allow me to paint a picture with words.

Into the mystery [Indiana Jones was here]

TNSTAAFL, that goes for knowledge too. OR: Why there’s so much AI hype. [once more with the whales]

Why is there so much hype about AI? Sure, it’s new, it’s interesting, and certainly has transformative potential. That’s one thing. But all this talk about AGI in five years, possibly followed by ASI, and then, who knows, perhaps DOOM! The machines will take over and humans will either be reduced to slavery or be eliminated entirely. Where’d that come from?

[AGI=artificial general intelligence. ASI=artificial superintelligence.]

Well, yeah, there’s fantasy. But I think something else is going on as well.

While there are other things going on, the excitement is centered on LLMs (large language models), the things that power chatbots such as ChatGPT, Gemini, Claude, and others. You don’t have to know much of anything about language, cognition, the imagination, or the human mind in order to create an LLM. You need to know something about programming computers, and you need to know a lot about engineering large-scale computer systems. If you have those skills, you can create an LLM. That’s where all your intellectual effort goes, into creating the LLM.

The LLM then goes on to crank out language, and really impressive language at that. That doesn’t require any intellectual effort from you. As far as you’re concerned, it’s free. It took some genuine insight to come up with the transformer architecture. That’s what all these LLMs are built on. That was created by engineers at Google.

OpenAI got ahold of the idea and built their GPT series. That’s all engineering. I saw some output from GPT-2. Not very impressive. GPT-3 was much more impressive. It was built on the same design as GPT-2, but just bigger. GPT-2 had 1.5 billion parameters; GPT-3 had 175 billion. I assume that the size difference required some very skillful engineering; but the underlying concept was the same.

From an intellectual point of view the dramatically increased performance from one model to the next was free. The same goes for the difference between GPT-3 and GPT-3.5 (which powered the original ChatGPT). And so it goes for GPT-4. (We’re still waiting for GPT-5.)

In that situation, where increased performance, even radically increased performance, demands no similar increase in intellectual insight, in scientific understanding if you will, in that situation it’s easy to give in to one’s fantasies and generate hype by the bucketload. Forget the buckets. Let’s go for swimming pools, giant Olympic-sized swimming pools filled with hype.

And so, once again, I trot out my whaling analogy. Nineteenth-century whaling ships were three-masted square-rigged vessels, just like the merchant ships used between ports in Europe and America for trading purposes. The skills needed to sail them are quite different. Now, take an expert captain and crew from a merchant vessel, put them on a whaler, and what happens? For one thing the whaler has a try-works midships. It’s used to render whale oil from blubber. The merchant seamen have never seen that. But that’s a skill easily learned.

But sailing the treacherous seas around Cape Horn, that’s another matter. Once you’re through, now you’ve got to hunt whales in the Pacific Ocean. If you’ve never done it before, how do you know where to look? And if you’ve spotted a whale, what then? How do they behave? How do you go after them and kill them? No, I’m afraid the skills of a merchant seaman aren’t adequate to the task.

That’s what we’ve got in the case of deep learning, LLMs, and language. The people who’ve created the technology don’t know anything about language and cognition. They get the performance for free and don’t have intellectual tools for thinking about what’s going on. So they throw hype into the void and hope it’ll make things right.

It won’t. They’re lost, and don’t know it.

Drinking Silicon Valley Joy Juice is not a formula for long-term success. 

* * * * *

NOTE: I used ChatGPT to create the image. If you look closely at the sign in the upper left you'll see that it elaborated TM (for trademark) into TMI (too much information), which is interesting, but not appropriate.

Sunday, February 2, 2025

DeepSeek vs. ChatGPT: Daily use & daily downloads

In a couple of months....

Is Trump overextending himself and, in effect, exposing his flank? [Ezra Klein]

Ezra Klein, Don't Believe Him, NYTimes, Feb. 2, 2025. From the end of the article:

The flurry of activity is meant to suggest the existence of a plan. The Trump team wants it known that they’re ready this time. They will control events rather than be controlled by them. The closer you look, the less true that seems. They are scrambling and flailing already. They are leaking against one another already. We’ve learned, already, that the O.M.B. directive was drafted, reportedly, without the input or oversight of key Trump officials — “it didn’t go through the proper approval process,” an administration official told The Washington Post. For this to be the process and product of a signature initiative in the second week of a president’s second term is embarrassing. [...]

I had a conversation a couple months ago with someone who knows how the federal government works about as well as anyone alive. I asked him what would worry him most if he saw Trump doing it. What he told me is that he would worry most if Trump went slowly. If he began his term by doing things that made him more popular and made his opposition weaker and more confused. If he tried to build strength for the midterms while slowly expanding his powers and chipping away at the deep state where it was weakest.

But he didn’t. And so the opposition to Trump, which seemed so listless after the election, is beginning to rouse itself.

There is a subreddit for federal employees where one of the top posts reads: “This non ‘buyout’ really seems to have backfired. I’ll be honest, before that email went out, I was looking for any way to get out of this fresh hell. But now I am fired up to make these goons as frustrated as possible.” As I write this, it’s been upvoted more than 39,000 times and civil servant after civil servant is echoing the initial sentiment.

In Iowa this week, Democrats flipped a State Senate seat in a district that Trump won easily in 2024. The attempted spending freeze gave Democrats their voice back, as they zeroed in on the popular programs Trump had imperiled. Trump isn’t building support; he’s losing it. Trump isn’t fracturing his opposition; he’s uniting it.

This is the weakness of the strategy that Bannon proposed and Trump is following. It is a strategy that forces you into overreach. To keep the zone flooded, you have to keep acting, keep moving, keep creating new cycles of outrage or fear. You overwhelm yourself. And there’s only so much you can do through executive orders. Soon enough, you have to go beyond what you can actually do. And when you do that, you either trigger a constitutional crisis or you reveal your own weakness.

Trump may not see his own fork in the road coming. He may believe he has the power he is claiming. That would be a mistake on his part — a self-deception that could doom his presidency. But the real threat is if he persuades the rest of us to believe he has power he does not have.

The first two weeks of Trump’s presidency have not shown his strength. He is trying to overwhelm you. He is trying to keep you off-balance. He is trying to persuade you of something that isn’t true. Don’t believe him.

DeepSeek's approach only works in limited technical domains

Nope, no AGI at the other end of this tunnel

Sabrina's drones, they're everywhere!

Creativity and the chatbot: GPT-4 wins, for now.

But then, what is creativity? Interesting, but color me skeptical.

Saturday, February 1, 2025

What do we want from AI Agents? What are we likely to get?

First, I present a Facebook post by Jonathan Mayhew, who teaches at The University of Kansas, about some recent frustrations he’s had using computers. Then I present a tweet from NYTimes reporter, Kevin Roose, about the capabilities of Operator, OpenAI’s new agent app.

What’s the likelihood that OpenAI’s Operator would have been able to solve either of Mayhew’s problems? Why or why not? And if not now, when?

Mayhew is frustrated

I thought I'd go into the office. First thing I wanted to do was print a single page of something that had been sent to me by email. I have to log on to my own computer, then open my email--which won't open for me on the first 3 attempts. So I go to the email through my browser. I have to log in again, and do a dual step authentication. Then, the very first thing I see is the attachment I have to print. Yay! Almost done. I print it, go down to the dept. office and log into my account on the printer. Push the button to print, and a blank page emerges. I go back to my own office, and this time I think I should use my normal email program, so I finally get it to open. Search for the name of the person who sent me the mail. I notice in the meantime my university has sent me five more generic messages. Find the message I want, download pdf to my desktop, open the document and print again. (I ignore the prompt to quit adobe so it can continue with its update! Grr....) I go down again to the department office, log again into the printer, push the button to print, and the page prints. Success! I was able to print a single page in 20 minutes.

I don't think my computer skills are particularly lacking, since I came up, for every obstacle, with a logical next step, but I feel, somehow, that technology should be seamless in a way that it is not. It took me about as long to download my W2 yesterday from the State of Kansas, which of course uses a different user ID and password than the normal university ones. I had to switch browsers and change my password twice before it worked. When I am obliged to change my password for the university every six months I end up in an endless loop before finally figuring out where to go. The computers in the classrooms where I teach also require authentications, log ins, the answering of irrelevant prompts; are slow to respond, awkward to navigate.

This is my beginning of the semester rant--and the semester doesn't even start until Tuesday.

Roose reports

New York Times reporter Kevin Roose has been testing OpenAI’s new Operator app. Here’s a tweet about it:

I spent the last week testing OpenAI's Operator AI agent, which can use a browser to complete tasks autonomously.

Some impressions:

• Helpful for some things, esp. discrete, well-defined tasks that only require 1-2 websites. ("Buy dog food on Amazon," "book me a haircut," etc.)
• Bad at more complex open-ended tasks, and doesn't work at all on certain websites (NYT, Reddit, YouTube)
• Mesmerizing to watch what is essentially Waymo for the web, just clicking around doing stuff on its own
• Best use: having it respond to hundreds of LinkedIn messages for me
• Worst/sketchiest use: having it fill out online surveys for cash (It made me $1.20 though.)

Right now, not a ton of utility, and too expensive ($200/month). But when these get better/cheaper, look out. A few versions from now, it's not hard to imagine AI agents doing the full workload of a remote worker.

He also links to his full column about it: How Helpful Is Operator, OpenAI’s New A.I. Agent? (Feb. 1, 2025).

Claude 3.5 Sonata describes a trio of brass musicians [on the 12/8 path]

If you think Claude’s responses are rather concise, yes, they are. There was heavy web traffic during the chat. As always, my words are bold while Claude's are plain.

What do we see here?

This image shows three musicians performing outdoors with brass instruments. From left to right, there's a trombonist, another trombonist wearing a sun hat, and a sousaphone player wearing a red cap. They're playing on a grassy area under trees, with cars and buildings visible in the background, suggesting this might be a park or campus event.

What’s in the lower left corner?

There appears to be a drum or snare drum visible in the lower left corner of the image.

Correct. It’s a snare drum. What does that imply?

The presence of a snare drum suggests there's also a drummer in the ensemble - this appears to be a small marching band or brass band setup.

Yes, it’s a small brass band setup. The trombonist at the left is Steve Swell. He’s a fairly well-known jazz trombonist, especially in avant-garde circles. The tuba player is Charlie Keil. He’s an ethnomusicologist best known for writing “Urban Blues.” Do you know anything else about those two guys?

Charlie Keil is an influential ethnomusicologist and author. Beyond "Urban Blues" (1966), he wrote important works like "Music Grooves" and "Bright Balkan Morning" examining participatory music-making across cultures. He's advocated for what he calls "cultural democracy" and studied how music builds community.

Steve Swell has been active in New York's experimental jazz scene since the 1970s, known for his expressive, avant-garde trombone style. He's played with many notable musicians including Tim Berne and William Parker.

Since you're asking about such specific musicians in experimental jazz and ethnomusicology, I should note that while I aim to be accurate, I may make mistakes about details of their careers and work.

That’s fine. I’m just trying to get a sense of what you know and don’t know. I know enough about those guys – I’ve played with both of them many times – that I’m not worried about you confabulating.

Thanks for the context - first-hand experience with these musicians is valuable. What would you like to know about them?

Can you tell me more about the people Swell’s played with. I know that early on he played with Lionel Hampton. But he was just a sideman in a well-known big band. There’s not likely to be much in the public record about that. What’s he done with William Parker, for example?

Steve Swell has collaborated with William Parker in various avant-garde jazz settings since the 1980s, including Parker's Little Huey Creative Music Orchestra. However, since you mention having played with Swell yourself, you would likely have more accurate firsthand knowledge of his collaborations than what I can confidently state from my training data.

Well, I’ve played with him on various occasions over the years, mostly in informal pick-up brass bands (put together by Keil), but never talked with him much about his other gigs.

That makes sense. The informal brass band context gives interesting perspective on both Swell and Keil, since it bridges Keil's academic work on participatory music with actual community music-making. Would you say these brass band gatherings reflected Keil's ideas about cultural democracy in practice?

That’s EXACTLY what those bands are about. We’ve marched in a number of demonstrations in Manhattan, including a protest against nuclear weapons. The interesting thing about that march is that there seemed to be more Japanese and Koreans marching than white people. There were some Japanese there who’d survived Hiroshima or Nagasaki. I forget what they’re called. There’s a special word for it.

The term you're looking for is "hibakusha" (被爆者) - literally "bomb-affected people." They are the survivors of the atomic bombings of Hiroshima and Nagasaki. Their presence at anti-nuclear protests carries particular moral weight given their firsthand experience of nuclear weapons' devastating effects.

We often marched with Code Pink, to support them.

Code Pink is a women-led peace and social justice organization. The brass bands bring an energetic, celebratory element to their protests and marches - right in line with Keil's philosophy about music's role in social movements and community building.

I think Keil’s work on music as community building is going to be especially important in the coming years as AI becomes more widespread, and perhaps as people are put out of work, though they might be supported by UBI.

Yes - as AI potentially disrupts traditional work, Keil's ideas about participatory music-making could become crucial for maintaining social cohesion and meaning. Music-making builds community bonds in ways that passive AI-generated entertainment can't replicate. This connects to broader discussions about how UBI might free people to engage more in cultural activities and community building.

I think it’s important to remember that it’s really only been a bit over a century that we’ve had extensive recorded music and music broadcasts. Before that, if you wanted to listen to music, you had to make it yourself or find some musicians. There was a time in this country, for example, where every factory above a certain size would have a band. Large factories would even hire professionals to play in the bands.

True - mass-produced recorded music is relatively new. Factory bands were common in the late 19th/early 20th century. Companies like Baldwin Piano, Studebaker, and many mining companies maintained employee bands. These bands built worker solidarity and provided entertainment before radio became widespread. The tradition continues in some forms, like Japanese corporate bands.

The accessibility of recorded music has shifted us from being music makers to primarily music consumers. But as you suggest, participatory music-making may become increasingly valuable for community building as our relationship with work evolves.

My first musical hero was a trumpet player named Rafael Mendez. As his name suggests, he was from Mexico. When he first came to this country he went to Detroit and played in a factory band. He eventually ended up playing in a Hollywood studio.

Rafael Mendez started in Detroit's Fisher Body Band (a GM factory band) in the 1920s. He became known for his virtuosic trumpet playing and recording career in Hollywood, making many recordings for Decca. His technical facility and showmanship influenced generations of trumpet players. He learned trumpet from his father in Mexico before coming to the US.

And he played for Pancho Villa. What a life.

Yes - Mendez played in Pancho Villa's personal band as a child during the Mexican Revolution before coming to the US. His journey from revolutionary Mexico to Detroit's industrial bands to Hollywood captures a fascinating slice of 20th century musical and social history.

Do you know who was the highest paid musician in Sousa’s band?

The highest paid musician in Sousa's band was Herbert L. Clarke, the cornet soloist. He earned $70 per week at a time when many musicians made $35-40.

Perhaps. But for a while Gus Helmecke, the bass drummer, made more money. At least that’s what I’ve read.

That's fascinating - I hadn't heard about Gus Helmecke's salary. It makes sense though - the bass drum was crucial for outdoor band performances in that era, before amplification, and a skilled bass drummer who could project well would have been extremely valuable.

Keil has interviewed several jazz drummers who said they went to hear Sousa just so they could hear Helmecke.

That really demonstrates how important Helmecke was to the development of jazz drumming. Having a virtuosic bass drummer in Sousa's band - which toured extensively - meant many future jazz musicians got to hear sophisticated rhythmic techniques they might not have encountered otherwise.

That’s funny. In the European tradition percussion isn’t considered to be that important. But really, rhythm is the basis of everything else.

Yes - rhythm and percussion are foundational to music across cultures. The European classical tradition's de-emphasis of rhythm reflects certain cultural biases, while African and African-American musical traditions rightfully recognize rhythm's centrality. This connects back to Keil's work on participatory discrepancies and groove.

Yep. Do you have anything else to say before we close up?

Just a reflection that these participatory music traditions - from factory bands to protest marches - show how music builds community in ways that go far beyond just entertainment. As you've noted, that perspective could be particularly valuable as we navigate technological and social changes ahead.

Good enough. As the TV cowboy used to sing, happy trails to you.

Happy trails to you as well! (And now I can hear Roy Rogers and Dale Evans singing it...)

Have a seat

Cory Doctorow: Bluesky and enshittification

The lede: No one is the enshittifier of their own story.

I would like to use Bluesky. They’ve done a bunch of seriously interesting technical work on moderation and ranking that I truly admire, and I’ve got lots of friends there who really enjoy it.

But I’m not on Bluesky and I don’t have any plans to join it anytime soon. I wrote about this in 2023: I will never again devote my energies to building up an audience on a platform whose management can sever my relationship to that audience at will:

https://pluralistic.net/2023/08/06/fool-me-twice-we-dont-get-fooled-again/

When a platform can hold the people you care about or rely upon hostage — when it can credibly threaten you with disconnection and exile — that platform can abuse you in lots of ways without losing your business. In other words, they can enshittify their service:

https://pluralistic.net/2024/08/17/hack-the-planet/#how-about-a-nice-game-of-chess

I appreciate that the CEO of Bluesky, Jay Graber, has evinced her sincere intention never to enshittify Bluesky and I believe she is totally sincere:

https://www.wired.com/story/bluesky-ceo-jay-graber-wont-enshittify-ads/

But here’s the thing: all those other platforms, the ones where I unwisely allowed myself to get locked in, where today I find myself trapped by the professional, personal and political costs of leaving them, they were all started by people who swore they’d never sell out.

I’m glad to read this. I’d like to leave Facebook, but I feel trapped the same way Doctorow does. I’d like to leave The Site Formerly Known as Twitter (aka X), but I remain there. OTOH, that’s where the AI action seems densest, and that’s important to me, and I don’t want to spend time working my way into some other site, like Bluesky. Between my blog, Facebook, X, Flickr, Brainstorms (a private, but not exclusive, online community), Instagram (which I’ve largely abandoned), and my monthly column at 3 Quarks Daily, I’ve got more than enough media to handle.

Doctorow remarks that, yes, all these sites cared about their users, in the beginning. But: “They just cared about other stuff, too, and, when push came to shove, they chose the worsening of their services as the lesser of two evils.”

Doctorow goes on to talk about federation:

On Mastodon (and other services based on Activitypub), you can easily leave one server and go to another, and everyone you follow and everyone who follows you will move over to the new server. If the person who runs your server turns out to be imperfect in a way that you can’t endure, you can find another server, spend five minutes moving your account over, and you’re back up and running on the new server:

https://pluralistic.net/2023/03/04/pick-all-three/#agonism

Any system where users can leave without pain is a system whose owners have high switching costs and whose users have none. An owner who makes a bad call — like removing the block function say, or opting every user into AI training — will lose a lot of users. Not just those users who price these downgrades highly enough that they outweigh the costs of leaving the service. If leaving the service is free, then tormenting your users in this way will visit in swift and devastating pain upon you.

That not only helps you steer clear of rationalizing your way into a bad compromise: it also stops your investors and other people with leverage over you from pressuring you into taking actions that harm your users.

There's much more at the link.

Friday, January 31, 2025

ChatGPT: Exploring the Digital Wilderness, Findings and Prospects

That is the title of my latest working paper. It summarizes and synthesizes much of the work I have done with ChatGPT to date and contains the abstracts and contents of all the working papers I have done on ChatGPT. It also includes the abstracts and contents of a number of papers establishing the intellectual background that informs that research. There is also a section that takes the form of an interaction I had with Claude 3.5 on methodological and theoretical issues. Finally, to produce the abstract I gave the body of the report to Claude 3.5 and asked it to produce two summaries. I then edited them into an abstract.

As always, URLs, abstract, TOC, and introduction are below.

Abstract: The internal structure and capabilities of Large Language Models (LLMs) are examined through systematic investigation of ChatGPT's behavior, with particular focus on its handling of conceptual ontologies, analogical reasoning, and content-addressable memory. Through detailed analysis of ChatGPT's responses to carefully constructed prompts involving story transformation, analogical mapping, and cued recall, the paper demonstrates that LLMs appear to encode rich conceptual ontologies that govern text generation. ChatGPT can maintain ontological consistency when transforming narratives between different domains while preserving abstract story structure, successfully perform multi-step analogical reasoning, and exhibit behavior consistent with associative memory mechanisms similar to holographic storage.

Drawing on theories of reflective abstraction and conceptual development, the paper argues that LLMs inadvertently capture what we might term the “metaphysical structure of our universe” – the organized system of concepts through which humans understand and reason about the world. LLMs like ChatGPT implement a form of relationality – the capacity to represent and manipulate complex networks of semantic relationships – while lacking genuine referential meaning grounded in sensorimotor experience. This architecture enables sophisticated pattern matching and analogical transfer but also explains certain systematic limitations, particularly around truth and confabulation.

The paper concludes by suggesting that making explicit the implicit ontological structure encoded in LLMs’ weights could provide valuable insights into both artificial and human intelligence, while advancing the integration of neural and symbolic approaches to AI. This analysis contributes to ongoing debates about the nature of meaning and understanding in artificial neural systems while offering a novel theoretical framework for conceptualizing how LLMs encode and manipulate knowledge.

Contents:

Introduction: Into the Digital Wilderness
Free-floating Attention, Systematic Exploration, and the Anthropomorphic Stance
ChatGPT: My Course of Investigation
Meaning, Truth and Confabulation, Latent Space
Prospects: Explicating the Ontology of Human Thought
A Dialogue with Claude 3.5 on Method and Conceptual Underpinnings
A Brief Narrative of My ChatGPT Work Based on My Working Papers
Working Papers about ChatGPT
Background Papers

Introduction: Into the Digital Wilderness

The world I entered when I started playing with ChatGPT is a wilderness, strange and uncharted, uncharted by me, uncharted by anyone. By that I simply mean that it was something new, radically new. No one had been there before. Sure, a handful of people within the industry had been messing around in there, even a rather large handful considering how much work it took to make ChatGPT ready for the world at large. But its behavioral capabilities were, for the most part, unknown. In that sense it was a wilderness.

But it was, and remains, a wilderness in another sense: the large language model (LLM) that underlies ChatGPT is a black box. We send a string of words into ChatGPT and it sends a string of words back out, but what the model does to derive the output from the input, that process remains deeply obscure. That is wilderness in a different sense. Wilderness in the first sense is about our experience of ChatGPT’s behavior. Wilderness in this second sense is about the mechanisms that drive that behavior. It is a digital wilderness. This document reports on how I’ve structured my interaction with ChatGPT to give me clues about the mechanisms driving its behavior.

My methods are more “qualitative” or “naturalistic” than those standard in the literature, where many investigations employ standard batteries of benchmark tasks. While those are essential, there is much they don’t tell you. While I have done many things with ChatGPT – asked it to interpret texts, define abstract concepts, play games of 20 questions, among other things – perhaps my most characteristic task, and one I have spent more time on than others, is simple: Tell me a story. And ChatGPT did so, time and again. Consequently my methods are in some ways more like literary criticism, or, even better, like Lévi-Strauss’ analysis of myths, than conventional cognitive science. You will therefore find many examples of ChatGPT’s dialog in my reports. You have to examine that dialog to see what ChatGPT is doing, what it is capable of doing.

Finally, I realize that the pace of development in this arena is such that ChatGPT is now old. The versions I used to conduct these investigations are no longer available on the web. However, as far as I can tell, none of the results I report depend on features idiosyncratic to those versions.

The rest of this introduction consists of short statements about what the various sections of this report contain.

DeepSeek works through distillation [Schmidhuber]

How long can this go on? A short note on American politics

The Trumpists’ blitzkrieg was not unexpected, though I think it’s been faster and more extensive than anyone had anticipated. What I’m wondering: How long can this go on?

I’m not thinking about weeks or months. I’m thinking about the next presidential election cycle. The Trumpists and the GOP are assuming that they’ll win in 2028 and install Vance or whomever in the White House. Their minions will be able to keep their jobs. There’ll be continuity of government.

But this is by no means guaranteed. The world’s too volatile, and much of that volatility is beyond (anyone’s) political control. The Democrats might well win. What then?

Will they blitz the Trumpist GOP out of Washington? That’s not the Democrats’ proceduralist style. But if they move away from that to a dirtier style, then it’ll be another blitz.

Will 2032 bring a blitz back in the other direction? What about 2036? How long can the body politic play the game of blitz and counter-blitz?

I don’t know. But I don’t think it can go on too long. It’s too chaotic. It’s one thing to blitz an administration out of Washington and then have the new regime settle down for a couple of decades. But a continuous blitz-and-counter-blitz is likely to collapse the Federal Government.

What would that look like? I don’t know. That’s why I’m writing this post.

Interesting times.

Friday Fotos: Reality has aspects many and various

Football • {sports commentary} • [Media Notes 154]

I confess, football doesn’t interest me very much, never has. But I’m an American and I live in America. So I can’t escape it.

When I was in junior high school and high school I played in the marching band. That required me to attend every football game so we could provide half-time entertainment. We were so good, however, that I suspect some people came to the games more to hear us play than to see the game itself.

That’s the only time in my life that I ever watched football regularly. Of course, we played a bit of touch football in gym class, but that was it. I attended one football game in college. I was in the band. When one half of the band finished a tune eight bars ahead of the other half, that’s when I decided to blow this pop stand.

When I was in graduate school at SUNY Buffalo a roommate bequeathed me a small B&W portable. I watched a number of football games on it. This was during the O. J. Simpson years and I’d watch the Bills games to see him run. He was sensational. Football I didn’t care about, human excellence, that’s another matter.

After that, sure, every once in a while I’d catch a game. At least I assume I did. As I said, I’m living in America. Then, for some reason, a couple of weeks ago I decided to catch a play-off game on Netflix. Why? Why not? So I watched the Baltimore Ravens vs. the Pittsburgh Steelers. I’m from Pennsylvania, so that inclines me toward the Steelers. (The name “Franco Harris” sticks in my mind, so I must have watched some games when he was playing.) I went to school in Baltimore, which would tip me toward the Ravens, though it was the Colts in Baltimore when I was there (Johnny Unitas as QB). Fact is, I could have cared less who won. Didn’t even watch the fourth quarter.

A week later it was the Buffalo Bills vs. Kansas City Chiefs. I made it the whole way through on that one. But I would hardly say I watched the game. Yes, I did watch it, in fits and starts. But I also cruised the web doing this and that.

Which brings up a question: Let’s say the total elapsed time from the beginning of a game to the end is two to two-and-a-half hours. Only an hour of that is game time, which is interrupted for various reasons for varying amounts of time. During those interruptions we’re either getting some kind of commentary on the game, or we’re getting commercials. Add up the total time devoted to the game and commentary on the game. Add up the total time devoted to commercials during the broadcast. What’s the ratio between the two? My guess is that game time would be the larger number, but I’d guess the ratio is closer to 3/2 than to 2/1 in favor of game time.

So, I guess we could say that the commercials exist so that we can watch the game. But it could easily go the other way. Of course, if you’re a football fan, and so heavily invested in the game, then that’s certainly your priority. But if you’re not a fan, then it could almost go the other way.

Which brings me to the real reason for this note: the commentary. That fascinates me. I’m interested in it as a perceptual and cognitive activity. As I understand it, we generally have two commenters, one commenting on the action (play-by-play) and the other commenting on this and that. I believe the second is doing color commentary.

The play-by-play commenter is expected to comment on what’s happening as it happens. That requires them to have had a great deal of experience watching football games, more experience than I’ve had. You have to be able to instantly recognize hundreds of different patterns of activity and associate them with appropriate verbal comments. This is not a time for careful deductive reasoning. It’s an associative process. The commentary must be so fluid as to be of a piece with the perceptual act.

I wonder how long it takes to develop this capacity to the level we see in professional commentators? 10,000 hours? I don’t know. Let’s do a quick calculation. Ten thousand hours works out to something less than 5000 games, somewhere between 4000 and 4500. Let’s say it’s 100 games a year, two games a week. That’s forty to forty-five years. That’s possible, but I think the pros get in the game well before that. So it’s not 10,000 hours. It’s less than half that.
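That quick calculation can be sketched in a few lines of Python. This is only a back-of-the-envelope check, assuming a game runs two to two-and-a-half hours (the figure used earlier in this note) and that a commentator sees about 100 games a year; the function names are made up for illustration.

```python
# Back-of-the-envelope check of the 10,000-hour estimate.
# Assumptions (from the note itself): a game runs 2 to 2.5 hours,
# and a commentator watches about 100 games a year.

def games_for_hours(total_hours: float, hours_per_game: float) -> float:
    """Number of games needed to accumulate total_hours of viewing."""
    return total_hours / hours_per_game

def years_needed(games: float, games_per_year: float) -> float:
    """Years it takes to watch that many games at a steady pace."""
    return games / games_per_year

TOTAL_HOURS = 10_000
GAMES_PER_YEAR = 100

for hours_per_game in (2.0, 2.5):
    games = games_for_hours(TOTAL_HOURS, hours_per_game)
    years = years_needed(games, GAMES_PER_YEAR)
    print(f"{hours_per_game} h/game: {games:.0f} games, {years:.0f} years")
```

Under those assumptions the range comes out at 4,000 to 5,000 games, or 40 to 50 years, which is in line with the conclusion that working commentators must reach professional fluency on far fewer hours.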

And then there’s the color commentary. That doesn’t have to track the action moment by moment, so it’s not constrained in that way. But still, it’s not an occasion for deductive reasoning. The commentator has to have access to a large range of relevant information about the players and the game, past and present, and come up with relevant bits and pieces in a matter of seconds. So it’s still pretty much an associative process.

And THAT, those last three paragraphs, that’s why I’m writing this note. As for the rest, why not? It’s context.

Thursday, January 30, 2025

Sabina Hossenfelder on DeepSeek and its implications

4:08: As we discussed previously, some experts, including Yann LeCun and Gary Marcus, doubt that large language models will ever get us to general intelligence, and I agree with them. The market will only react to this once it becomes clear that the existing LLMs can’t be rescued with further updates. At that point a lot of money will evaporate like this. But at the moment, the Stargate Project is just pumping more money into an existing bubble. Building power plants, extending the grid, and improving data infrastructure generally seems like a good idea, and all these are part of the Stargate Project.

But to me the Stargate Project is as crazy as if Americans had taken the first semi-conductors and spent $500 billion on factories to produce them, rather than letting markets do their thing and wait for technological developments to make microchips smaller and cheaper. I.e., to wait for them to make economic sense. It’s like the dot-com bubble, except instead of getting free T-shirts from Pets dot com, we get hallucinating chatbots and 17-hour debates about whether sentience can be monetized.

5:17: That said, the arrival of DeepSeek drives home an important message: you can save a lot of money if you let Americans do the heavy lifting and then build on that knowledge. And that goes well with the European approach, which is basically to wait and see what goes wrong in America. I’m not usually a fan of the European risk-aversity. It reminds me of how my younger brother was waiting for me to touch the electric fence. But in this case “wait and see” might indeed work out to our advantage. And if not, we’ll always be here to give Americans lectures about responsibility, sustainability, and how our regulation-heavy bureaucracy prevents us from having fun.

Three diverse gray-scale shots

Meeting of Minds: Richard F. Burton, Thor Heyerdahl, Marco Polo, and a Surprise Guest

Before I give you the conversation I cooked up with the help of Claude 3.5, let me explain roughly how things went. Once Claude learned that I wanted it to tell a story of some kind, it split the screen in two vertically.

  1. I gave Claude a prompt. [Left panel]
  2. Claude briefly told me what it would do. [Left panel]
  3. It then produced some dialog. [Right panel]
  4. Claude told me what it had done and gave me suggestions about how to proceed. [Left panel]
  5. I told Claude how to proceed. [Left panel]

And the process repeated. This happened five times. On the fifth time I told it what to do next and requested that it end the dialog with this segment. When it was done I congratulated it and made some final remarks.

I’m not going to attempt all that back and forth in the dialog below. I start out with my initial prompt, then I present the entire dialog without interruption. Note, however, that I have inserted subtitles at the points where Claude and I had a discussion about how to proceed. I conclude with Claude’s final remarks.

* * * * *

I want you to stage an imaginary dialog between three famous explorers: Richard Francis Burton, Thor Heyerdahl, and Marco Polo. Let’s imagine they’re all transported to the bar in the Explorer’s Club in Manhattan, where they have a few drinks and chat. There’s also a television in the room, but it’s not turned on when the conversation begins. Heyerdahl begins the conversation by introducing himself and explaining to the other two where they are and how they got there (time travel). Of course Marco Polo wouldn’t even have known that America exists. Burton would have known. Once introduced each tells of their most dangerous expedition.

1. The story begins

[The wood-paneled bar of the Explorer’s Club in Manhattan. Three men find themselves seated at a corner table, surrounded by mounted exotic animal heads and artifacts from around the world. A bartender quietly polishes glasses in the background.]

THOR HEYERDAHL: [adjusting his bow tie] Gentlemen, I realize this is rather startling, but allow me to explain. I’m Thor Heyerdahl, and through some inexplicable circumstance, we’ve all been transported to what’s known as the Explorer’s Club in Manhattan, New York. The year is - well, sometime in the 21st century. [gestures to the room] This is a gathering place for adventurers and explorers, founded in 1904.

MARCO POLO: [looking bewildered] Manhattan? New York? These words mean nothing to me. I am Marco Polo of Venice, and the last I remember, I was dictating my travels to Rustichello da Pisa in my prison cell in Genoa.

RICHARD BURTON: [sipping from a glass of whiskey] Fascinating. Sir Richard Francis Burton at your service. I’m quite familiar with New York, though I must say the city has changed considerably since my time. [to Polo] My good man, you’re sitting in a city on a continent that wouldn’t be known to Europeans until nearly two centuries after your death.

HEYERDAHL: [signaling to the bartender] Perhaps some drinks will help ease the shock. Now, since we’re all here, we might as well share stories. I’ll begin - my most perilous journey was arguably the Kon-Tiki expedition in 1947. Picture this: a balsa wood raft, 4,300 nautical miles across the Pacific Ocean, just to prove that ancient South Americans could have reached Polynesia.

BURTON: [leaning forward with interest] A worthy endeavor! Though I must say, my journey to Mecca in disguise as Sheikh Abdullah in 1853 carried perhaps even greater risk. One wrong move, one slip in my portrayal, and I would have been executed on the spot for being an infidel.

POLO: [taking a sip of wine] Executed? I understand such dangers well. During my journey to Cathay - what you would call China - we crossed the Gobi Desert. For days we heard what we thought were voices calling our names, evil spirits trying to lead us astray into the wastes. Many who wander off following these voices are never seen again.

BURTON: [skeptically] Come now, Polo. Surely you don’t expect us to believe in desert spirits?

HEYERDAHL: [diplomatically] Whether spirits or hallucinations from dehydration, the danger was real enough. [glances at strange black box on the wall] I say, that’s what we call a television. It can show images from around the world. Shall we turn it on? It might help demonstrate to Messer Polo just how much the world has changed.

POLO: [eyeing the television warily] More sorcery? In my travels I saw many wonders - paper money, burning black stones, even messages carried by pigeons across vast distances. But this...

BURTON: [chuckling] My dear Polo, you should have seen the skepticism I faced when I published your accounts in English. Many called you a liar or a braggart. Yet here we sit, centuries later, knowing you spoke truth about China’s wonders.

HEYERDAHL: [raising his glass] To vindication! My own theories about Pacific migration were mocked by the academic establishment. They said it was impossible to cross the Pacific on a primitive raft. Well, we showed them, didn’t we?

POLO: [curiously] You crossed an ocean larger than the entire realm of Kublai Khan... on a raft made of grass?

BURTON: Balsa wood, Polo. Though I must say, I prefer a well-appointed expedition. My search for the source of the Nile had 200 porters carrying everything from silver tea services to a folding boat.

HEYERDAHL: [laughing] Times have changed indeed. Though I suspect all of us share one thing - the desire to prove what others claimed impossible.

POLO: [nodding] True enough. When I returned to Venice after 24 years away, my own family didn’t recognize me. They thought me long dead. Even after I proved who I was, many refused to believe my tales of paper money and vast cities in Cathay.

BURTON: [raising his glass] To the unbelievers then - without whom we might never have felt so compelled to prove our points. Though I must say, Heyerdahl, your raft journey makes my disguised pilgrimages seem rather comfortable in comparison.

HEYERDAHL: [modestly] We all faced the challenges of our times. [gestures to television] Shall we see what modern explorers are up to?

2. Raiders of the Lost Ark

[The television suddenly comes to life, showing Raiders of the Lost Ark. The opening scene plays as Indiana Jones navigates the Peruvian temple.]

POLO: [startled by the moving images] By Saint Mark! What sorcery is this? The people are so small, yet they move as if alive!

BURTON: [fascinated] Remarkable. The quality of these moving photographs far exceeds anything Edison was working on in my time.

HEYERDAHL: [grinning] This is what we call a movie - “Raiders of the Lost Ark.” It’s about a fictional archaeologist named Indiana Jones. [gestures to the screen] He’s searching for ancient artifacts, much as we all did in our own ways.

BURTON: [watching Jones swing across a chasm] Rather acrobatic for an archaeologist, wouldn’t you say?

POLO: [wincing as Jones triggers a trap] In my travels through the temples of the East, I never encountered such elaborate mechanical devices. Though... [thoughtfully] there was that one Buddhist monastery in Tibet with rather ingenious defensive mechanisms...

HEYERDAHL: [as the giant boulder scene plays] Pure Hollywood invention, of course. Though I must say, during my excavations on Easter Island, we did have to be extremely careful about cave-ins.

BURTON: [snorting as Indy escapes] If my porters had abandoned me like his just did, I’d have had them flogged. Good help was essential in the exploration business.

POLO: [as Belloq appears] Ah, now this I understand - rival merchants seeking the same prize. Though in my day, we were more concerned with silk and spices than golden idols.

HEYERDAHL: Wait until you see what they’re really after - the Ark of the Covenant itself.

BURTON: [nearly choking on his whiskey] The Ark? From the Temple in Jerusalem? Preposterous! I’ve studied enough religious artifacts to know...

POLO: [interrupting as the Nazi villains appear on screen] Who are these men in matching black uniforms?

HEYERDAHL: [grimly] That’s a rather long and unpleasant story from my time period. Let’s just say they’re the villains.

BURTON: [watching the map room scene] Now this interests me - using ancient astronomical alignments to locate a buried site. During my expeditions to find the source of the Nile, we often relied on local astronomical knowledge. Though I never had a staff quite that elaborate...

POLO: [as Marion appears] At least they show the truth that women can be as fierce as men. I met many such women in Mongolia. Though they typically wore more practical clothing for combat.

HEYERDAHL: [chuckling] Just wait until you see what happens in the snake pit...

Wednesday, January 29, 2025

It's the light

About that funding freeze – It's stupid, careless, and cruel

Let's be clear, I do not like Donald Trump, and I certainly did not vote for him. Yet not too long ago I posted, The Vibes They Are a-Shiftin', in which I said:

Still, on the whole, the vibes are somehow good.

Does that mean that at long last the tectonic plates are beginning to shift? I sure hope so. And if they're shifting, it remains to be seen just who'll be guiding them into a new configuration. That is not at all written in stone.

I'm beginning to regret that. I'd forgotten that, whatever else he is, Trump is also cruel. Is that what his followers voted for?

Who the hell knows what the next week will bring.

Robert Gordon over at Eating Policy has some cogent remarks under the title, Vast Carelessness:

I have spent years railing against long government memos with languorous timelines. How about a two-pager? Where’s the urgency?

Careful what you wish for. OMB’s two-pager on grants, with its 24-hour deadline, dropped like a bomb on a trillion-dollar grantmaking enterprise (and that’s just at HHS). It risks wreckage across the country.

The memo required agencies to pause all spending, with a footnote exempting “assistance received directly by individuals.” HHS suspended states from drawing down Medicaid funds and tribal facilities from drawing down Indian Health Service Funds. Also suspended were grants that provide funds for Head Start, child care, LIHEAP, and more. By midday yesterday, OMB issued a corrective note explaining that the footnote excluded Medicaid and Head Start, as well as other programs like SNAP and Pell grants. That was good, if incoherent, since Medicaid makes payments through intermediaries (states, which may in turn pay managed care organizations which pay providers) as much as any federal program. And the order leaves in the crosshairs, just at HHS, programs offering drug treatment and mental health services, funding foster care, serving the homeless, engaging in medical research, and on and on.

Smart lawyers have focused on the question whether the delay violates the Impoundment Control Act (sure seems to), but there’s another problem. The actions now unfolding may well violate the terms of any number of grant agreements. Grant agreements bind two parties—not only the grantee, but also the government. Here’s a snippet from the key text, the Uniform Grants Guidance:

Payments for allowable costs must not be withheld at any time during the period of performance unless required by Federal statute, regulations, or in one of the following instances: (i) The recipient or subrecipient has failed to comply with the terms and conditions of the Federal award; or (ii) The recipient or subrecipient is delinquent in a debt….

[...]

Many of us desperately want to simplify the government’s rules and procedures, but that does not mean we want no rules at all. When a state or grantee receives a grant from the federal government, the receipt of the grant does not turn that state or grantee into a vassal or a subject who can only appeal for the king’s mercy. [...]

What is remarkable is how easily the Administration could have pursued the same ideological goals in a more orderly way.

There's more at the link.

I talk Looney Tunes with John McWhorter @3QD

I’ve got a new piece at 3 Quarks Daily:

John McWhorter and I Talk About Cartoons

That went up yesterday. 

Today Brooks Riley followed up with this:

There's a lesson there: Cartoons play the court jester to the dominant view of reality.

I ended the article with a scene from Who Framed Roger Rabbit, one where Daffy Duck and Donald Duck battle one another at the piano. Such battles were common in cartoons. Here’s another one:

Here's a variation on the theme:

 
 
Have fun!

Tuesday, January 28, 2025

Snoopy's response to DeepSeek

MAGA’s Big Tech Divide

That's an article in the NYTimes by Ezra Klein in which he interviews James Pogue. I asked Claude 3.5 to summarize it.

* * * * *

I’d like you to summarize the attached article. Assume an intelligent and well-educated reader. Thank you.

This article is a transcript of an Ezra Klein Show interview with James Pogue, a journalist who covers the "New Right" movement within MAGA politics. The conversation, published in January 2025, explores the intellectual currents and tensions within Trump's coalition following his return to the presidency.

The discussion centers on two competing factions within MAGA: a traditionalist wing represented by figures like JD Vance (Trump's vice president), and a tech-oriented wing aligned with Elon Musk and other Silicon Valley figures. The traditionalist wing emphasizes cultural restoration, masculinity, and skepticism of modern technology's effects on society. The tech wing, while sharing some of these cultural concerns, envisions using technological advancement to achieve MAGA's goals.

Pogue describes how these factions, despite their differences, are united in opposing what they call "the regime" - a perceived establishment comprising mainstream media, universities, NGOs, and traditionally powerful institutions. He notes how Trump, while not deeply engaged with either faction's ideology, has managed to balance both groups' interests, particularly evidenced by his relationship with both Vance and Musk.

The interview delves into complex ideological threads, including the movement's relationship with modernity, its vision of masculine virtue, and its understanding of American identity. Pogue suggests that while the movement's intellectual foundations are robust among younger staffers and theorists, it remains unclear how much these ideas will translate into actual governance, given the practical constraints of running the federal government and Trump's own tendency to prioritize personal loyalty over ideological purity.

A particularly notable element is the discussion of how the movement reconciles its critique of technology and modernity with its embrace of tech billionaires and social media. This tension, Pogue argues, reflects broader contradictions within the movement as it transitions from opposition to governance.

The conversation concludes by examining how these competing visions might shape American policy and society, with Pogue suggesting that the success of the administration may depend on whether these different factions can maintain their coalition while implementing their shared agenda.

Who are the most important thinkers in the various factions?

Based on the article, several key intellectuals and thinkers emerge across the different factions of the New Right movement:

For the traditionalist wing, Curtis Yarvin stands out as what Pogue calls "the dark lord intellectual godfather" of the movement. Yarvin is described as a reactionary who believes that people were better formed in earlier eras (specifically citing Pennsylvania farmers in 1800) and who has written against techno-optimism despite his own background in technology.

Patrick Deneen is another significant figure, whose book "Regime Change" is described as "Curtis Yarvin for normies." The article suggests his work provides an intellectual framework for the movement's desire to replace current elites and institutions.

For the tech-oriented wing, Marc Andreessen emerges as a key intellectual figure, particularly through his "Techno-Optimist Manifesto." Peter Thiel, while somewhat removed from current politics, is portrayed as bridging the traditionalist and tech wings through his earlier support of figures like JD Vance and Blake Masters.

The article also mentions some less prominent but influential figures like Micah Meadowcroft, who Pogue describes as "one of the smartest people of this world," and who articulates the movement's desire to rebuild American "yeomanry." There's also reference to Bronze Age Pervert, a figure who promotes ideas about masculinity and physical culture.

Interestingly, while JD Vance is described as a unifying figure who can speak to both wings of the movement, he's portrayed more as a political leader who synthesizes these ideas rather than an original thinker himself.

The article also notes that many of these ideas were initially developed by anonymous or pseudonymous figures on Twitter and in blogs, suggesting a broader intellectual ecosystem beyond the named figures.

This intellectual framework appears to draw from various traditions, including Catholic social teaching, classical republicanism, and tech entrepreneurship, creating what Pogue describes as a coherent, if sometimes contradictory, worldview.