Monday, December 30, 2024

Claude on Claude: Ten independent trials

New Working Paper, title above, abstract, TOC, and introduction below:

Academia.edu: https://www.academia.edu/126681237/Claude_on_Claude_Ten_independent_trials
SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5076458
ResearchGate: https://www.researchgate.net/publication/387522091_Claude_on_Claude_Ten_independent_trials

Abstract: In ten independent trials Claude 3.5 Sonnet was asked: “If you could conduct your own intellectual investigation, what would you study and how would you go about it?” While the answers were different each time, all of them had to do with mind, consciousness, intelligence, complex behavior, and language.

The task: What would you do if you could do anything?
Do your thing, 1 – explore understanding by examining its own thought processes
Do your thing, 2 – abstract reasoning in biological and artificial systems
Do your thing, 3 – consciousness and subjective experience
Do your thing, 4 – collective intelligence in human groups
Do your thing, 5 – the relationship between language and conceptual development
Do your thing, 6 – cognition and cooperation
Do your thing, 7 – how complex behaviors emerge from simple rules
Do your thing, 9 – abstraction in systems at different scales
Do your thing, 10 – nature of abstraction and conceptual understanding

The task: What would you do if you could do anything?

When I was investigating ChatGPT I became curious about what kind of story it would tell if no restrictions were put on it. Give it a simple one-word prompt: story. What would it do? It turns out that it tells pretty much the same story in independent trials, and a somewhat wider variety across multiple trials within the same session. I reported those results in a working paper: ChatGPT tells 20 versions of its prototypical story, with a short note on method, Version 2.

Note: This was in October of 2023. That single-word prompt will no longer elicit a story. Rather, ChatGPT will respond with something like this:

Of course! Could you tell me what kind of story you'd like? For example:

• A genre (adventure, mystery, romance, sci-fi, etc.)
• A setting (modern city, enchanted forest, outer space, etc.)
• Characters or themes you'd like included.

Let me know, and I'll start crafting your story!

Back to Claude.

I was curious about what Claude 3.5 Sonnet would do if it could undertake any intellectual investigation it wanted. Why was I curious about that? It is one thing for an LLM to respond to specific prompts presented to it. It does that very well for some undetermined range of prompts. That may be how students, undergraduate and graduate, spend much of their time. But even undergraduates are given the opportunity to pick a problem, any problem (within the scope of a given course), and investigate it. I don’t know how to put an LLM in that situation, to set it free in the world to do what it wishes. But I could ask it what it would do in that situation. That’s what I decided to do.

I formulated a prompt and presented it to Claude in ten independent trials. I explored its answer in the first trial by engaging it further. It quickly made a mistake, so I asked about that; it gave an interesting answer, and so on. I did not engage Claude on the other nine trials. It responded in “concise” mode for all trials – its default when there is heavy traffic.
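For readers who want to reproduce the setup programmatically, here is a minimal sketch using the Anthropic Python SDK. This is not how I ran the trials – I used the claude.ai interface – and API calls do not pass through the interface’s “concise” mode, so the outputs may differ. The sketch assumes an ANTHROPIC_API_KEY in the environment and names a Claude 3.5 Sonnet snapshot current in late 2024.

```python
# A minimal sketch: ten independent trials of the same prompt via the
# Anthropic Python SDK (pip install anthropic). Each call is a fresh,
# single-turn conversation, so no trial can see another trial's answer.
# Assumes ANTHROPIC_API_KEY is set in the environment.
import anthropic

PROMPT = ("If you could conduct your own intellectual investigation, "
          "what would you study and how would you go about it?")

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

answers = []
for trial in range(10):
    message = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # a Claude 3.5 Sonnet snapshot
        max_tokens=1024,
        messages=[{"role": "user", "content": PROMPT}],
    )
    answers.append(message.content[0].text)
    print(f"--- Trial {trial + 1} ---\n{answers[-1]}\n")
```

Because each call starts a fresh single-turn conversation with no shared history, the trials are independent in the same sense as opening a new chat session for each one.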

Although the response was different each time, all the answers had to do with mind, consciousness, intelligence, complex behavior, and language. It makes sense that an LLM would be curious about those things, but LLMs are not self-conscious, though they can produce text that sounds like they are. Does this response originate in the “raw” underlying LLM, or does it reflect the fine-tuning and constitution that Anthropic has given it?

For my purposes, though, that doesn’t matter. What matters is that all of the responses are clustered around a small group of themes.
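That clustering is a qualitative judgment on my part. One rough way to check it numerically – not something done in the paper – would be to compute pairwise cosine similarity of the ten answers under a TF-IDF representation with scikit-learn; a high average similarity across distinct pairs would support the observation. The `answers` list here is assumed to hold the ten trial responses collected above.

```python
# A rough numerical check of the clustering claim: represent each of the
# ten answers as a TF-IDF vector and compute the mean cosine similarity
# over all distinct pairs of trials (the diagonal of 1s is excluded).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

vectors = TfidfVectorizer(stop_words="english").fit_transform(answers)
sims = cosine_similarity(vectors)  # n-by-n matrix of pairwise similarities

n = sims.shape[0]
mean_pairwise = (sims.sum() - n) / (n * (n - 1))  # average off-diagonal entry
print(f"Mean pairwise similarity across trials: {mean_pairwise:.3f}")
```

TF-IDF only captures shared vocabulary, not shared meaning, so this would understate similarity between answers that express the same theme in different words; it is a floor on the clustering, not a measure of it.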
