Wednesday, April 3, 2024

The seasonal oddities of AI: Ethan Mollick talks with Ezra Klein [+prompting]

The New York Times, How Should I Be Using A.I. Right Now?, 4.03.24

Deep into the interview:

Ethan Mollick: “What the hell” is a good question. And we’re just scratching the surface, right? There’s a nice study actually showing that if you emotionally manipulate the A.I., you get better math results. So telling it your job depends on it gets you better results. Tipping, especially $20 or $100 — saying, I’m about to tip you if you do well, seems to work pretty well. It performs slightly worse in December than May, and we think it’s because it has internalized the idea of winter break.

Ezra Klein: I’m sorry, what?

Ethan Mollick: Well, we don’t know for sure, but —

Ezra Klein: I’m holding you up here.

Ethan Mollick: Yeah.

Ezra Klein: People have found the A.I. seems to be more accurate in May, and the going theory is that it has read enough of the internet to think that it might possibly be on vacation in December?

Ethan Mollick: So it produces more work with the same prompts, more output, in May than it does in December. I did a little experiment where I would show it pictures of outside. And I’m like, look at how nice it is outside? Let’s get to work. But yes, the going theory is that it has internalized the idea of winter break and therefore is lazier in December.

Ezra Klein: I want to just note to people that when ChatGPT came out last year, and we did our first set of episodes on this, the thing I told you was this was going to be a very weird world. What’s frustrating about that is that — I guess I can see the logic of why that might be. Also, it sounds probably completely wrong, but also, I’m certain we will never know. There’s no way to go into the thing and figure that out.

But it would have genuinely never occurred to me before this second that there would be a temporal difference in the amount of work that GPT-4 would do on a question held constant over time. Like, that would have never occurred to me as something that might change at all.

Ethan Mollick: And I think that that is, in some ways, both — as you said, the deep weirdness of these systems. But also, there are actually downside risks to this. So we know, for example, there is an early paper from Anthropic on sandbagging: if you ask the A.I. dumber questions, it gives you less accurate answers. And we don’t know the ways in which your grammar or the way you approach the A.I. — we know the number of spaces you put in gets different answers.

So it is very hard, because what it’s basically doing is math on everything you’ve written to figure out what would come next. And the fact that what comes next feels insightful and humane and original doesn’t change that that’s what the math is doing. So part of what I actually advise people to do is just not worry about it so much, because I think then it becomes magic spells that we’re incanting for the A.I. Like, I will pay you $20, you are wonderful at this. It is summer. Blue is your favorite color. Sam Altman loves you. And you go insane.

...the A.I. has no internal monologue, it’s not thinking. 

So acting with it conversationally tends to be the best approach. And personas and contexts help, but as soon as you start evoking spells, I think we kind of cross over the line into, “who knows what’s happening here?”

Ezra Klein: Well, I’m interested in the personas, although I just — I really find this part of the conversation interesting and strange. But I’m interested in the personalities you can give the A.I. for a different reason. I prompted you around this research on how a personality changes the accuracy rate of an A.I. But a lot of the reason to give it a personality, to answer you like it is a Starfleet Commander, is because you have to listen to the A.I. You are in relationship with it.

And different personas will be more or less hearable by you, interesting to you. So you have a piece on your newsletter which is about how you used the A.I. to critique your book. And one of the things you say in there, and give some examples of, is you had to do so in the voice of Ozymandias because you just found that to be more fun. And you could hear that a little bit more easily.

So could you talk about that dimension of it, too: not just prompting the A.I. to be more accurate, but giving it a personality to be more interesting to you?

Ethan Mollick: The great power of A.I. is as a kind of companion. It wants to make you happy. It wants to have a conversation. And that can be overt or covert.

So, to me, actively shaping what I want the A.I. to act like, telling it to be friendly or telling it to be pompous, is entertaining, right? But also, it does change the way I interact with it. When it has a pompous voice, I don’t take the criticism as seriously. So I can think about that kind of approach. I could get pure praise out of it, too, if I wanted to do it that way.
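In practice, the persona trick is usually just a system message that sits in front of the whole conversation. Here is a minimal sketch using the OpenAI Python client (assuming an API key is configured in the environment); the Ozymandias wording and the model choice are my own illustration, not something from the interview or Mollick’s newsletter:

```python
# Persona-as-system-message: the persona shapes every reply in the thread.
# The wording below is illustrative; swap in whatever voice you find most "hearable."
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model choice
    messages=[
        {
            "role": "system",
            "content": (
                "You are Ozymandias: grandiose, theatrical, and unsparing. "
                "Critique the writing the user sends, staying in character."
            ),
        },
        {"role": "user", "content": "Critique the opening chapter of my book on A.I. and work."},
    ],
)
print(response.choices[0].message.content)
```

Swap that system message for a friendly editor, or a pure cheerleader, and the same criticism lands very differently, which is Mollick’s point.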

The mysteries of prompting:

Ethan Mollick: Just to take a step back, A.I. prompting remains super weird. Again, strange to have a system where the companies making the systems are writing papers as they’re discovering how to use the systems, because nobody knows how to make them work better yet. And we found massive differences in our experiments on prompt types. So for example, we were able to get the A.I. to generate much more diverse ideas by using this chain of thought approach, which we’ll talk about.

But also, it turned out to generate a lot better ideas if you told it it was Steve Jobs than if you told it it was Madame Curie. And we don’t know why. So there’s all kinds of subtleties here. But the idea, basically, of chain of thought, that seems to work well in almost all cases, is that you’re going to have the A.I. work step by step through a problem. First, outline the problem, you know, the essay you’re going to write. Second, give me the first line of each paragraph. Third, go back and write the entire thing. Fourth, check it and make improvements.

And what that does is — because the A.I. has no internal monologue, it’s not thinking. When the A.I. isn’t writing something, there’s no thought process. All it can do is produce the next token, the next word or set of words. And it just keeps doing that step by step. Because there’s no internal monologue, this in some ways forces a monologue out in the paper. So it lets the A.I. think by writing before it produces the final result. And that’s one of the reasons why chain of thought works really well.

So just giving step-by-step instructions is a good first effort.
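Packed into a single prompt, those four steps might look something like the sketch below; the essay topic and exact wording are mine, added only to make the shape of a chain-of-thought prompt concrete:

```python
# A rough sketch of the four-step chain-of-thought prompt described above,
# sent as one instruction so the model "thinks by writing" before the final draft.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

cot_prompt = """Write a short essay on why prompting language models is still so unpredictable.
Work step by step and show each step:
1. First, outline the problem the essay addresses.
2. Second, give me the first line of each paragraph.
3. Third, go back and write the entire essay.
4. Fourth, check it, make improvements, and present the final version."""

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model choice
    messages=[{"role": "user", "content": cot_prompt}],
)
print(response.choices[0].message.content)
```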

Ezra Klein: Then you get an answer, and then what?

Ethan Mollick: And then — what you do in a conversational approach is you go back and forth. If you want work output, what you’re going to do is treat it like it is an intern who just turned in some work to you. Actually, could you punch up paragraph two a little bit? I don’t like the example in paragraph one. Could you make it a little more creative, give me a couple of variations? That’s a conversational approach trying to get work done.

If you’re trying to play, you just run from there and see what happens. You can always go back, especially with a model like GPT-4, to an earlier answer, and just pick up from there if you’re headed off in the wrong direction.
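Mechanically, the “intern who just turned in some work” loop is a growing message history, and going back to an earlier answer means truncating that history before you continue. A rough sketch, with a hypothetical ask() helper and made-up prompts:

```python
# Conversational revision as a growing message history; branching back to an
# earlier answer is just truncating that history. Prompts here are made up.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(messages):
    """Send the history, append the model's reply to it, and return the reply."""
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    content = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": content})
    return content

history = [{"role": "user", "content": "Draft a two-paragraph announcement for a note-taking app."}]
first_draft = ask(history)

# Treat it like an intern's draft: ask for targeted revisions in the same thread.
history.append({"role": "user", "content": "Punch up paragraph two, and give me two variations of the example in paragraph one."})
revision = ask(history)

# If that heads off in the wrong direction, branch again from the first draft instead.
branch = history[:2]  # the original request plus the first draft only
branch.append({"role": "user", "content": "Rewrite the first draft in a warmer, more playful voice."})
alternative = ask(branch)
print(alternative)
```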

There's more at the link, much more.
