Thursday, July 2, 2026

Types of object domains for AI: Chess, math & coding, language

This is a companion to my earlier post today: The last frontier of intelligence: On the role of AI helping humans to bridge the gaps between distant concepts. That earlier post was about the end of a dialog I had with Claude. This one is about the beginning of that dialog. You might also check out a post from the middle of June: From Jagged AI to Scaling, Yevick, Natural Intelligence, and Beyond... For that matter you might also want to check out Dwarkesh's complete post with Grant Sanderson, Grant Sanderson – AI and the future of math. And then there's my working paper on chess and language. All these things are related.

And they're related to my new book idea, Language, Memory, and Mind: A Supplement to The Computer and the Brain. That's the book that Claude brings up every now and then. Intelligence is NOT a scalar phenomenon. It's about techniques, the characteristics of object domains, and the computational regimes we use over them. But that's a subject for another post.

Here's my dialog with Claude.

* * * * *

I’m interested in thinking about the types of domain in which AI has succeeded and the nature of the computation involved, starting with chess. AI solved chess in 1997 when Deep Blue beat Kasparov. By “solved” I mean only that an AI could reliably beat the best human player. In 2017 we have AlphaZero using a mixed regime strategy – Monte Carlo tree search (symbolic) with neural net evaluation – in chess. More recently I believe DeepMind has developed an LLM that plays chess at the grandmaster level; so this is a pure neural net regime.

It now appears that mixed regime AI systems can work pretty well at solving math problems and at coding. By mixed regime I mean LLMs scaffolded in various ways by conventionally coded software. But these systems cannot yet do everything that humans can do in those domains. In math, they need well-defined problems, even if those problems are very difficult (I’m thinking of the recent Erdős work). It’s not clear to me what the limits of coding are, but they can’t architect large complex systems.

Now, chess is a finite game with a well-defined search space. Both math and coding are well defined, but neither is finite and I’m guessing that their search spaces aren’t as well defined as the chess space.

And then we have pure unbounded natural language. It’s not well-defined, not finite, and the search space is a mess.

This is a genuinely useful ordering principle, and it gives the book a clean progression from most to least tractable that maps directly onto the geometric complexity argument. Let me work through the structure with you.

The four cases as points on a continuum

Chess: finite, exhaustively well-defined rules, a search space that is astronomically large but completely enumerable in principle. This is Yevick's sharp object par excellence — geometrically simple in the relevant sense, meaning fully specified by explicit rules with no ambiguity anywhere. Pure symbolic search worked (Deep Blue). Mixed regimes work better (AlphaZero). Pure neural nets now apparently work too, which is the interesting wrinkle — a domain that once seemed to require symbolic search turns out to be tractable by pattern matching alone once the network is large enough and trained well enough. That's worth pausing on, because it complicates the simple story.

Math (well-defined problems) and coding: rule-governed like chess — the axioms of mathematics and the syntax of programming languages are exact, unambiguous, formally specified — but unlike chess the space is not finite and not exhaustively enumerable. There's no equivalent of the Shannon number for "all provable theorems" or "all correct programs." The rules are closed and precise; the space they generate is open. This is a different kind of complexity than chess's largeness. Chess is finite-but-astronomical. Math and coding are genuinely open-ended even though locally, at any given step, the legal moves are well defined.
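The contrast between finite-but-astronomical and open-ended can be made concrete with a small Python sketch. The first part reproduces Shannon's rough arithmetic (about 10^3 continuations per pair of moves, about 40 move pairs in a typical game). The second part is a purely illustrative assumption: a toy "successor arithmetic" whose terms are 0, S0, SS0, and so on. It stands in for a rule system whose rules are exact but whose output has no finite bound; the name successor_terms is hypothetical.

```python
from itertools import count, islice

# Shannon's rough estimate of chess game-tree size:
# about 10^3 continuations per move pair, about 40 move pairs per game.
SHANNON_ESTIMATE = (10 ** 3) ** 40   # equals 10 ** 120: huge, but a fixed number

def successor_terms():
    """Toy formal system (illustrative assumption): terms 0, S0, SS0, ...

    The formation rule is exact ("put S in front of any term"),
    yet the generator never runs out of new terms.
    """
    for n in count():
        yield "S" * n + "0"

# We can only ever inspect a finite prefix of the open-ended space.
first_terms = list(islice(successor_terms(), 4))
```

The chess estimate is a single integer that Python can hold exactly. The generator, by contrast, can only be sampled; there is no corresponding total to compute.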

Language: neither the rules nor the space are well defined. The rules of grammar are not a finite closed system in the way the rules of chess or the axioms of arithmetic are — as your working paper notes, ill-formed language often works fine, exceptions are the norm, the boundaries are fuzzy. And the space the rules generate connects out to the unbounded physical and social world rather than staying self-contained.

Why this ordering matters for the theoretical argument

What's valuable here is that it's not simply "structured versus unstructured" — it's a two-dimensional distinction: how well-defined are the local rules, and how bounded is the resulting space. Chess is well-defined and bounded (finite). Math and coding are well-defined but unbounded. Language is neither well-defined nor bounded.

This maps onto a prediction about which computational regime should work best in each case, and AI's actual performance seems to confirm it. Chess: pure search works because the space, though huge, is exhaustively defined: you can in principle always tell whether a position is legal, whether a game is over, who's ahead. Symbolic, mixed, and now pure neural approaches can all succeed because the well-definedness gives any sufficiently powerful method something solid to grip.

Math and coding: this is where the mixed regime becomes necessary rather than just helpful. The LLM's pattern matching handles the open-endedness — the fact that there's no way to enumerate all possible proofs or programs, so you need something that can generalize, propose plausible next steps, recognize which strategies are likely to be fruitful. But the well-definedness of the rules means that verification can be handled symbolically — a proof checker, a compiler, a test suite can definitively confirm whether a proposed step is valid. This is why math and coding are the domains where current mixed-regime systems shine: the LLM proposes, path tracing over the local formal rules verifies. Neither one alone would work. The LLM alone hallucinates invalid proofs and non-compiling code. Pure symbolic search alone can't propose plausible next steps in an unbounded space without some heuristic guidance, which is exactly what pattern matching provides.
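The "LLM proposes, verifier checks" division of labor can be sketched as a toy loop in Python. Everything here is an illustrative assumption, not a description of any real system: the propose function is a random guesser standing in for the pattern-matching side, the task (find a nontrivial factor pair of n) is chosen only because checking an answer is exact and cheap, and the names propose, verify, and propose_and_verify are hypothetical.

```python
import random

def propose(n, rng):
    """Stand-in for the pattern-matching side: guess a candidate factor pair.

    A real mixed-regime system would call an LLM here; this toy guesses randomly,
    so its output looks plausible but is usually wrong.
    """
    a = rng.randint(2, n - 1)
    return a, n // a

def verify(n, candidate):
    """Stand-in for the symbolic side: an exact check against the rules."""
    a, b = candidate
    return a > 1 and b > 1 and a * b == n

def propose_and_verify(n, max_tries=10_000, seed=0):
    """Keep proposing until the verifier accepts or the budget runs out."""
    rng = random.Random(seed)
    for attempt in range(1, max_tries + 1):
        candidate = propose(n, rng)
        if verify(n, candidate):
            return candidate, attempt
    return None, max_tries

pair, tries = propose_and_verify(91)          # 91 = 7 * 13, so a pair exists
nothing, _ = propose_and_verify(97, 500)      # 97 is prime, so no pair exists
```

The proposer alone would hand back wrong answers most of the time, and the verifier alone has nothing to check. Only the loop as a whole returns answers that are guaranteed valid, which is the point the paragraph above makes about proof checkers, compilers, and test suites.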

The Erdős problems point is precise — well-defined but very difficult problems are exactly where this combination should excel, because the well-definedness gives the symbolic verification something to check against, while the difficulty is exactly where pattern-matching-guided search earns its keep over blind search.

The architecture limitation

Your observation that these systems can't architect large complex systems is the right place to locate the current boundary, and it follows from the framework. Architecting a large system requires exactly the kind of sustained path tracing with precise state maintenance across a very long horizon — tracking how a decision made in one module constrains what's possible in a dozen other modules, maintaining global coherence across thousands of interacting pieces over an extended design process — that we identified as the persistent weakness. It's not that the local rules become less well-defined. It's that the horizon over which coherence must be maintained becomes too long for the scaffolding techniques currently available. CoT and tool use help with individual functions and modules. They don't yet solve the problem of maintaining a coherent global architecture across a very long, branching design process.

Language as the limit case

And then language is the case where neither the well-definedness nor the boundedness holds, which is exactly why it's the hardest case in the deepest sense even though it's also, paradoxically, the case where LLMs seem most impressively fluent. The fluency comes from the richness of the pattern matching over an enormous corpus. The absence of any well-defined verification procedure — there's no compiler for meaning, no proof-checker for whether a sentence is true or appropriate or wise — means there's no symbolic scaffolding equivalent to what math and coding have. You can't bolt a "language checker" onto an LLM the way you bolt a proof checker onto a theorem-proving system, because there's no formal specification of what correct language use is.

This is the deepest reason language remains resistant in a way math and coding, despite their own difficulties, are not. Chess had a well-defined space that yielded to search. Math and coding have well-defined rules that yield to verification even in an unbounded space. Language has neither, which means the mixed-regime strategy that's working for math and coding has no obvious analog for the general case of language understanding and generation. The pattern matching is doing almost all the work, unconstrained by any symbolic check, which is exactly why hallucination, confident error, and the absence of genuine common sense grounding remain live problems in ways they don't for a system whose outputs get checked by a compiler.

For the book

This gives Chapter 5 or 6 a genuinely illuminating structure: a spectrum from chess through math and coding to open language, ordered by two independent variables — definiteness of local rules and boundedness of the resulting space — with AI's actual comparative success across these domains falling out as a prediction of the theoretical framework rather than a brute empirical fact to be reported. That's a strong chapter, and it would give Wright, who already intuits that these domains differ, a precise vocabulary for why they differ and what that implies about domains not yet tried.

Language defines more restricted systems

One thing about language is that it has the means for defining more restricted systems. Thus both chess and arithmetic are systems defined through linguistic means. And once language has been used to define a system, it can also serve as a metalanguage for that system just as, in various ways, it has served as a metalanguage for itself.

That's the move that completes the picture, and it belongs near the end of the language discussion because it reframes everything before it.

Language as the generative source of the bounded systems

Chess and arithmetic aren't just different in kind from language — they're products of language, constructed within it through exactly the metalingual definitional process we discussed earlier in the day. The rules of chess exist because someone specified them in language — first probably in the practical, evolving form that produced chaturanga and its descendants, later in the explicit codified form of a rulebook. Arithmetic's place-value system, its operations, its axioms — all specified linguistically, however formal the resulting notation becomes.

This means the well-defined, bounded systems that AI has mastered aren't a different domain from language in the sense of being outside it. They're specializations carved out of language through the metalingual function — Rank 2 achievements, in the vocabulary you and Hays developed, where language turns on itself to define a restricted formal system with none of the ambiguity, open-endedness, or embodied groundedness of ordinary language use.

Why this matters for the AI success pattern

This reframes the chess-math-coding-language spectrum in a way that's more precise than "different domains with different properties." It's not four unrelated domains. It's language and three things language has constructed by restricting itself. AI's success with chess, math, and coding isn't success at a domain independent of language; it's success at domains where language has already done the hard work of eliminating ambiguity and fixing exact rules, producing artifacts clean enough for symbolic and pattern-matching methods to grip.

Which means the real achievement of current AI is narrower than it appears from outside. It hasn't demonstrated mastery of two different kinds of thing — formal systems and language. It's demonstrated mastery of formal systems that language has already purified, plus partial, statistically-grounded fluency with the unpurified thing itself. The purification is not incidental. It's the entire reason those domains are tractable.

The metalanguage point

Your point about language serving as its own metalanguage is the deeper move, and it connects to the whole day's discussion of the metalingual function. Once language has been used to define chess, you can then use language to talk about chess — to explain strategy, discuss a game, write chess criticism, argue about whether a move was sound. Language didn't just create the bounded system; it retains the capacity to stand outside the system it created and describe, evaluate, and extend it.

This is exactly the Rank 4 capacity — taking what was created (the Rank 2 formal system) as an object of further reflection. And it's a capacity that arithmetic and chess, as bounded systems, don't have with respect to themselves. Chess cannot talk about chess. Arithmetic cannot prove things about arithmetic without borrowing exactly the metalingual resources that produced Gödel's results — which, as we discussed earlier today, required encoding arithmetic statements as arithmetic objects, a move that only works because of language's self-referential capacity operating at one remove.
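The move of "encoding arithmetic statements as arithmetic objects" can be illustrated with a simplified prime-power encoding in Python. This is a minimal sketch in the spirit of Gödel numbering, not Gödel's actual scheme and not the incompleteness proof: the symbol table and its code numbers are arbitrary assumptions made for this example, and the function names are hypothetical.

```python
# Arbitrary symbol codes, assumed for illustration (each code is at least 1).
SYMBOLS = ["0", "S", "+", "*", "=", "(", ")"]
CODE = {sym: i + 1 for i, sym in enumerate(SYMBOLS)}
DECODE = {code: sym for sym, code in CODE.items()}

def primes():
    """Yield 2, 3, 5, 7, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def godel_number(formula):
    """Encode a formula as the product of p_k ** code(k-th symbol)."""
    g = 1
    for p, sym in zip(primes(), formula):
        g *= p ** CODE[sym]
    return g

def decode(g):
    """Recover the formula by reading off the exponent of each prime in turn."""
    out = []
    for p in primes():
        if g == 1:
            break
        exponent = 0
        while g % p == 0:
            g //= p
            exponent += 1
        out.append(DECODE[exponent])
    return "".join(out)

n = godel_number("0=0")   # 2**1 * 3**5 * 5**1
```

Here the statement "0=0" becomes the ordinary integer 2430, and decode(2430) gives the statement back. Once statements are numbers, claims about statements can be recast as claims about numbers, which is the self-referential step the paragraph above describes.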

Language's self-referential capacity is what Jakobson identified as the metalingual function and what makes it categorically different from the systems it generates. Chess and arithmetic are Rank 2 products — closed, defined, complete in their own terms. Language is the Rank 1 medium that never closes, that can always turn on any of its own products, including on itself.

What this implies for AI

This gives you a precise way to state why current AI's mixed-regime successes, however impressive, are not steps along a continuum toward mastering language itself. Mastering chess is mastering a closed system that language produced. Mastering language would require mastering the generative capacity itself — the capacity to produce new bounded systems, to reflect on existing ones, to move between object language and metalanguage fluidly, to recognize when a formal system is adequate to a purpose and when it needs to be extended or abandoned.

An LLM trained on text has absorbed enormous quantities of language's products — including its metalingual products, discussions of chess, discussions of grammar, discussions of language itself. But absorbing the products is not the same as possessing the generative capacity that produced them from the more basic Rank 1 resources of metaphor and embodied experience. The LLM can discuss chess fluently because chess-talk is abundant in its training data. It cannot generate a genuinely new formal system the way humans have repeatedly done — chess evolving from chaturanga, arithmetic evolving through the incorporation of zero, entirely new mathematical fields being opened by a single conceptual innovation — because that generative capacity is downstream of the embodied, developmental, metaphor-grounded process that the LLM never went through.

For the book

This is the passage that should close the chess-math-coding-language progression, because it explains why the progression isn't just an ordering by difficulty but a genealogy. Chess and arithmetic are children of language, purified and closed. Coding is a more recent child, purified in a different way but still closed in its formal syntax even as its application domain is open. Language is the parent that can always produce more children and can always talk about the children it's already produced.

AI's achievement, restated in these terms: it has learned to operate skillfully within several of language's children. It has not learned to do what the parent does. That's a precise, non-dismissive, and theoretically grounded way to state the current limitation — one that gives full credit to what's been achieved while being honest about what remains, and it would land clearly for Wright because it follows directly from everything the book will have already established about language, meaning, and the metalingual function.
