As you may know, I’ve been working on a book project, Play: How to Stay Human in the A.I. Revolution. For some reason I’ve been unable to finish the proposal, though I’ve got lots of stuff and a number of the chapters are substantially drafted. But I keep finding myself distracted into thinking about basics, very basic things about computing and A.I.
At the very end of his life, John von Neumann wrote a slim book, The Computer and the Brain (1958). It grapples with the problem of how computation can be implemented in a physical medium, and does so in a way that is at once simple, straightforward, and profound. We’ve learned a great deal about both the brain and the computer since then, but as far as I know, no one has revisited von Neumann’s project and extended it to include what we have since learned. That’s what I propose to do in this book.
Now, I have no intention of trying to summarize what we’ve learned on those two topics since 1958. That’s working at the wrong level. When von Neumann was writing, he (and, by extension, we) had no conception of distributed representation, much less of how it could be achieved physically. Now we do. That’s what needs to be added to von Neumann’s exposition.
I have no intention of repeating what von Neumann did. In particular, I will not revisit his material on analog computing. Rather, I want to augment his discussion. Fortunately the new material is of such a nature that I should be able to write a short book that can be read as a stand-alone discussion or as a supplement to von Neumann’s book. I’m imagining a sophisticated general audience of the sort that reads 3 Quarks Daily.
My working title: Language, Memory, and Mind: A Supplement to The Computer and the Brain. I expect the book to be 100 to 120 pages long (30K to 40K words).
I have uploaded a bunch of material (100K words or more) to Claude and asked it to review that material and put together an initial outline. I’ve appended that below the asterisks.
* * * * *
Preface
How to use this book — with or without von Neumann. What it adds to his argument. What it doesn't attempt. Brief note on the collaboration with Claude that produced parts of the text.
Introduction: Von Neumann's Unfinished Argument
What he got right: the architectural mismatch between brains and computers — memory and computation separated in the digital machine, unified in the neuron. The energy efficiency puzzle he couldn't explain. His honest acknowledgment that the brain's organizational principles lay beyond the framework he'd built. The concepts he lacked that this book supplies.
Chapter 1: Two Paradigm Cases
The chess-language contrast as the entry point. Chess has a bounded, well-defined geometric footprint — 8×8 board, six piece types, explicit rules, finite tree. Language has an unbounded, poorly-defined geometric footprint — rooted in the full complexity of physical and social reality. Chess was AI's founding benchmark precisely because it seemed to demand the highest human intelligence while yielding to computational treatment. Moravec's paradox: the easy problems are hard and the hard problems are easy. Transcendent versus non-transcendent coding — programmers can observe and specify a chess engine completely from outside; nobody can specify an LLM from outside, including its creators. Where we now stand.
Chapter 2: Location and Content
A collection of photographs. Solid objects at specific locations — finding by address is natural, finding by content requires going to each photo in turn. The combinatorial explosion that follows. The formal argument: solidity localizes content; localized content can only be retrieved by address. What holography does physically — interference patterns distribute information about each stored object across the whole plate, so that any partial cue can activate the whole. Lashley's ablation experiments: memory didn't disappear when specific cortical tissue was removed because memory was never stored in specific locations in the first place. Von Neumann's energy efficiency puzzle, now answerable: the brain doesn't spend energy moving content to a processor because memory and processing are the same physical substrate.
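The contrast between finding by address and finding by content can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: the photo captions, the two stored patterns, and the function names are all invented for the example, and a Hopfield-style associative memory stands in for holographic storage, which it resembles but is not.

```python
# Toy contrast between location-addressed and distributed, content-addressed memory.
# All names, captions, and patterns are hypothetical, chosen for illustration.

# Location addressing: each item sits in its own slot.
photos = ["beach at dusk", "red barn", "city skyline", "red bicycle"]

def find_by_address(index):
    """One step: go straight to the slot."""
    return photos[index]

def find_by_content(word):
    """No shortcut: every slot is inspected in turn."""
    return [i for i, caption in enumerate(photos) if word in caption]

# Distributed storage: two patterns superimposed in ONE weight matrix
# (a Hopfield-style memory, used here as a stand-in for a hologram).
PATTERNS = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
N = len(PATTERNS[0])

# Hebbian outer-product rule: every off-diagonal weight carries a trace
# of every stored pattern, so no pattern lives at any one location.
WEIGHTS = [
    [0 if i == j else sum(p[i] * p[j] for p in PATTERNS) for j in range(N)]
    for i in range(N)
]

def recall(cue):
    """One parallel update: each unit takes the sign of its weighted input."""
    return [
        1 if sum(WEIGHTS[i][j] * cue[j] for j in range(N)) >= 0 else -1
        for i in range(N)
    ]

damaged = [-1, 1, 1, 1, -1, -1, -1, -1]  # first pattern with one unit flipped
restored = recall(damaged)               # the partial cue completes the pattern
```

In the list of photos, a content query has to visit every slot. In the weight matrix, the damaged cue is never compared against the stored patterns one at a time; a single parallel update restores the whole pattern.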
Chapter 3: The Brain as Content-Addressed System
The McCulloch-Pitts neuron-as-logic-gate: computationally fruitful, architecturally wrong. What neurons actually are — active units and memory units simultaneously, connected in massive parallel. Distributed representations: concepts as patterns across populations of neurons, not stored at specific cell addresses. Yevick's logical necessity argument in plain terms: the world contains two categories of object, geometrically simple ones that sequential symbolic processing handles efficiently and geometrically complex ones that only holographic parallel processing handles efficiently; the world contains both; therefore any adequate cognitive system must implement both regimes. Path tracing and pattern matching as the two fundamental operations on any cognitive network. Freeman's cinematic model — global coherence frames at 10-12 Hz as the atomic unit of biological cognitive processing — and its correspondence to speech production rates.
Chapter 4: Language as a One-Dimensional Projection
The semantic network as the right model for conceptual structure: meaning as position, each node defined by its pattern of relations to other nodes. Sydney Lamb's principle. The multidimensional character of the conceptual network versus the one-dimensional character of any spoken or written string. Language strings as 1D projections of the multidimensional network — necessarily lossy, hence paraphrase and ambiguity. The colored beads thought experiment: strip away semantic content, replace each token with a color, and you have a 1D image — making visible the purely formal structure the LLM operates on. Words as abstract addresses in an abstract space. Why classical computational linguistics hit combinatorial explosion: it was trying to reconstruct the multidimensional structure in a location-addressed system.
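The colored beads thought experiment is simple enough to sketch in a few lines. This is a minimal illustration, assuming a made-up palette and a whitespace tokenizer; the function name `to_beads` is hypothetical.

```python
# Replace each distinct token with a color, keeping only the formal
# structure of the string: which positions hold the same bead.
# PALETTE and to_beads are illustrative names, not from any library.

PALETTE = ["red", "green", "blue", "yellow", "orange", "violet"]

def to_beads(text):
    """Map each distinct token to a color, in order of first appearance."""
    colors = {}
    beads = []
    for token in text.lower().split():
        if token not in colors:
            n = len(colors)
            # Fall back to a numbered label once the palette runs out.
            colors[token] = PALETTE[n] if n < len(PALETTE) else f"color-{n}"
        beads.append(colors[token])
    return beads

beads = to_beads("the dog chased the cat and the cat ran")
# ['red', 'green', 'blue', 'red', 'yellow', 'orange', 'red', 'yellow', 'violet']
```

The output is a one-dimensional string of colors. Nothing in it says what a dog or a cat is; only the pattern of repetition and order survives.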
Chapter 5: What Large Language Models Actually Are
The transformer architecture in plain terms. The weight space as distributed content-addressed memory — concepts are patterns smeared across billions of parameters, not stored at specific addresses. The forward pass as the atomic processing unit, corresponding to Freeman's global coherence frame: one complete transit through the weight space producing one output token. The token string as a path through the abstract address space, with each forward pass mediating between the 1D sequential surface and the multidimensional distributed interior. What LLMs do well — pattern matching over the weight space, which is what their architecture naturally supports. What they do poorly — sustained sequential path tracing requiring precise state maintenance, common sense grounded in embodied experience, continuous learning. Why these limitations aren't engineering failures awaiting a fix but structural consequences of implementing holographic-like processing on location-addressed hardware with training only on 1D projections.
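The idea of one forward pass yielding one token, with the token string growing as a path, can be sketched with a toy stand-in. The code below is not a transformer: it assumes a tiny invented corpus and a bigram count table in place of learned weights, and unlike a real model it consults only the last token of the context. The names `train`, `forward_pass`, and `generate` are hypothetical.

```python
from collections import Counter, defaultdict

# A tiny invented corpus; the bigram table built from it plays the role
# of the trained weights in this sketch.
CORPUS = "the cat sat . the cat sat . the dog ran .".split()

def train(tokens):
    """Count which token follows which."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table

TABLE = train(CORPUS)

def forward_pass(context):
    """One pass: read the context, emit exactly one token.

    This toy looks only at the last token and picks its most frequent
    successor; a real model conditions on the whole context.
    """
    last = context[-1]
    return TABLE[last].most_common(1)[0][0]

def generate(prompt, steps):
    """Extend the string one token per pass, feeding each output back in."""
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(forward_pass(tokens))
    return tokens

path = generate(["the"], 4)
# ['the', 'cat', 'sat', '.', 'the']
```

The loop in `generate` is the point of the sketch: the sequential surface is produced one step at a time, and each step is a complete consultation of whatever the model has stored.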
Chapter 6: What the Analysis Implies
The first principles of intelligence are not the first principles of computation. Why scaling won't close the gap: scaling improves the quality of the holographic approximation but doesn't change the architectural mismatch, provide embodied grounding, or enable continuous learning. The fast takeoff fantasy as physics-free reasoning — every self-improvement step requires moving billions of parameters between physically separated memory and compute on real hardware that consumes real energy. The TSMC problem: the most critical hardware infrastructure in the world runs on tacit knowledge distributed across human communities that no LLM can access or replicate. What a genuinely adequate artificial cognitive system would require, in the terms this book has developed. The research program that's needed and why it requires multi-generational public investment rather than industrial R&D on commercial timescales. The human-machine collaboration that's already underway and what it can and cannot achieve.
Conclusion: The Mismatch, Named
Von Neumann saw the gap and couldn't name what was on the other side of it. This book names it: content addressing, requiring distributed storage, implemented in biological tissue through interference-like neural dynamics, approximated in LLMs through distributed weights on location-addressed hardware, grounded in embodied experience that no text-trained system has. The naming matters because you can't close a gap you can't see clearly.
Appendix: A Chronology of Chess, Language, and AI
From the working paper, lightly edited.