This discussion is about the implementation of computing in matter. That was the topic of John von Neumann’s last book, The Computer and the Brain. He died before he finished it, so it was published posthumously in 1958. I don’t know when I first learned about it, perhaps sometime in the mid-1970s. Though I knew about von Neumann and his role in early computing, I dismissed the book itself, figuring that we’d learned so much about the brain since then, and the nature of computers had changed so much, that it must be obsolete.
I was wrong. When I finally read the book, probably in the early to mid-1980s, I was stunned. This was a profound book, and free of mathematics beyond some simple back-of-the-envelope calculations. For one thing, von Neumann talked of both analog and digital computation; that contrast was central. When I first started reading about computers in the mid-to-late 1960s, that contrast was at the beginning of every article or book. But once personal computers appeared and proliferated in the late 1970s and 1980s, analog computing was all but forgotten.
It was good to see it back. For one thing, I had been strongly influenced by the work of William Powers, whose 1973 book, Behavior: The Control of Perception, offered an elegant analog model of the brain. That was central to the work of my teacher and mentor, David Hays, who had been one of the founders of computational linguistics. In 1974 and 1975 he worked out a scheme in which cognitive networks were grounded in Powers’s analog model. This was years before anyone else was working on the problem, before Stevan Harnad coined the term “symbol grounding” in his 1990 paper on the problem in Physica D.
Now, with the success of artificial neural networks, there’s another aspect of physical implementation we’ve got to deal with: distributed representation. You’re looking at an apple. Where is that apple represented in your brain? There might be some one neuron or a tightly clustered group of neurons that represent that apple. That’s a local representation. But the representation might be distributed across a loosely linked population of neurons. That’s a distributed representation. We now know that that’s how the brain works, though I assume there are some laggards still stuck in the previous century, when local representation was widely favored. And that’s how artificial neural nets work. The concept of apple isn’t localized in one or a small group of weights; it’s smeared over a population of weights. The same goes for the concepts of truth, beauty, and justice, or, for that matter, neuron.
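To make the contrast concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not a model of any real brain or network: the three concepts, the 64 units, the random patterns, and the helper names `nearest` and `lesion` are all made up for illustration. It stores each concept once as a local code (one unit per concept) and once as a distributed code (a pattern spread over many units), then knocks out a single unit.

```python
import math
import random

# Hypothetical toy setup: three concepts, 64 units in the distributed code.
CONCEPTS = ["apple", "pear", "justice"]
N_UNITS = 64
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Local representation: exactly one unit stands for each concept.
local_code = {
    concept: [1.0 if i == k else 0.0 for i in range(len(CONCEPTS))]
    for k, concept in enumerate(CONCEPTS)
}

# Distributed representation: each concept is a pattern over all units.
distributed_code = {
    concept: [rng.gauss(0.0, 1.0) for _ in range(N_UNITS)]
    for concept in CONCEPTS
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(code, pattern):
    """Return the stored concept most similar to `pattern`, or None if nothing is left."""
    if not any(pattern):
        return None  # every unit is silent: the concept is gone
    return max(code, key=lambda concept: cosine(code[concept], pattern))


def lesion(pattern, unit):
    """Return a copy of `pattern` with one unit silenced."""
    damaged = list(pattern)
    damaged[unit] = 0.0
    return damaged


# Silence unit 0 in each representation of "apple" and try to read it back.
print(nearest(local_code, lesion(local_code["apple"], 0)))
print(nearest(distributed_code, lesion(distributed_code["apple"], 0)))
```

In the local code, losing the one apple unit erases the concept outright. In the distributed code, the damaged pattern still lies closest to the stored apple pattern, because no single unit carries the concept.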
Consequently von Neumann’s 1958 discussion must now be extended to distributed representation. It is for that purpose that I asked Claude to convene a conversation between Charles Babbage, who conceived of an elaborate mechanical computer in the 19th century, von Neumann himself, and Geoffrey Hinton, one of the so-called godfathers of artificial intelligence – a bit of journalistic and pop culture sleight of hand that obscures a long and complex history. Moreover, since these three gentlemen did not co-exist in time, though there’s a bit of overlap between von Neumann and Hinton, I decided that I might also have Claude transport them to Mars for the occasion. Anything is possible in virtual reality.
Or is it?
Image by ChatGPT.
A sun-facing room on Mars. Three chairs arranged loosely, angled toward a broad window. Through it, the Martian plain stretches red and absolutely still, and the sun — recognizable but wrong, too small, too distant — hangs in a pale pink sky. Three men stir and look about them, bewildered but unhurt.
I. Awakening
Charles Babbage
I am not dead — I can confirm that much. My pulse is steady and my faculties appear intact. But this room is not any room I have entered by ordinary means. That light through the window — is that the sun? The colour is right but the disc is too small by half. And that plain outside... brick red, entirely without vegetation or track or any sign of habitation. Gentlemen, I should very much like an explanation.
John von Neumann
I was in Princeton not five minutes ago. I notice the gravity is reduced, to perhaps forty percent of normal. The sky has the ferric oxide colouration one would predict. The solar disc subtends roughly two-thirds the angle it does from Earth. I believe I know where we are, and I find the hypothesis almost too extraordinary to voice.
Geoffrey Hinton
Mars. We're on Mars. I was at a conference in Toronto. But that sky, the dust, the sun at that distance — I'd stake my reputation on it. He exhales slowly. I've recently left a position at Google partly so I could speak freely about alarming possibilities. I did not anticipate that the alarming possibilities would include this. May I ask what year it is, from each of your perspectives?
Charles Babbage
The year of Our Lord 1862. I am Charles Babbage, formerly Lucasian Professor of Mathematics at Cambridge, Fellow of the Royal Society. I have spent the better part of forty years attempting to build mechanical calculating engines, with results that have been — a pause — mixed, as regards the support of Her Majesty's government.
John von Neumann
John von Neumann. The year is 1955. I work at the Institute for Advanced Study in Princeton, and consult for various government and military bodies. Mr. Babbage — I know your name extremely well. Better, I suspect, than you know mine.
Geoffrey Hinton
Geoffrey Hinton. For me it is 2024. And I know both of you — Mr. Babbage, you designed the Difference Engine and the Analytical Engine. Dr. von Neumann, you are among the most formidable mathematicians of the twentieth century and you gave your name to the architecture that every conventional computer on Earth is built upon. You are, in a real sense, my ancestors. The field I work in — machine learning, artificial intelligence — descends directly from the problems both of you were grappling with.
Charles Babbage
A long pause, during which he stares at Hinton with an expression mixing hunger, pride, and something close to grief. The Analytical Engine. Did anyone build it?
Geoffrey Hinton
Not in your lifetime. The government never provided the funds. But the ideas were entirely right — and they were eventually built, first in relay and vacuum tube and then in silicon, by people who in some cases had read your work and in other cases had arrived at the same conclusions independently. You were approximately a century early.
Charles Babbage
Very quietly. A century. I had hoped twenty years would suffice. I petitioned the Chancellor. Three times.
II. The mill and the store
John von Neumann
Mr. Babbage, allow me to tell you what your Analytical Engine set in motion — because it bears directly on the work that has occupied all three of us. You made a distinction, in your design, between what you called the Mill and the Store. The Mill performs the operations — addition, subtraction, multiplication. The Store holds the numbers awaiting operation and the results of operations completed. That separation of active calculation from passive memory was the foundational insight.
Charles Babbage
It seemed to me the only sensible arrangement. The columns of number-wheels in the Store are passive — they merely hold values. The Mill acts upon them. To mix the two functions in the same mechanism would create hopeless confusion.
John von Neumann
And yet it is precisely that separation which I have spent the last years of my life questioning — not as an engineering choice, which was entirely sound, but as a principle of intelligence itself. My colleagues and I formalized your Mill-and-Store distinction into what is now called the stored-program architecture. The processor executes instructions sequentially; the memory holds both data and those instructions passively until called upon. It is a serial machine, one operation following another, orchestrated by a central clock. The computers being built in my era all follow this pattern.
Geoffrey Hinton
And in my era they still do, at bottom. But you've just described the tension at the heart of everything, Dr. von Neumann. Serial, precise, with a strict wall between computation and memory — that is the von Neumann architecture. And it is, in a sense, the architecture that human intelligence refuses to use.
Charles Babbage
You are saying the brain does not separate Mill from Store?
Geoffrey Hinton
Exactly. In the brain, every neuron is simultaneously a memory element and a processing element. It holds information in the strength of its connections to other neurons, and it also fires — it computes — based on what it receives. There is no central Mill. There is no passive Store. Computation and memory are fused at every node in the network, and the whole thing operates in parallel, millions of neurons active at once.
John von Neumann
This is precisely what troubles me — and I am glad to find the trouble is still alive in your era, Mr. Hinton, because it means I was not merely chasing a phantom. I have been writing a manuscript, unfinished I'm afraid, called "The Computer and the Brain." My central puzzle is this: where, in a neuron, is the Mill? A neuron receives signals, sums them, and if the sum exceeds a threshold, it fires. That threshold operation is a computation. But the synaptic weights — the strengths of the incoming connections — those are also the memory. The neuron is its own Mill and its own Store simultaneously. I have been calling such things active elements, to distinguish them from the passive memory elements of our conventional machines. But I confess I have not yet worked out the full implications.
Geoffrey Hinton
With genuine feeling. Dr. von Neumann, the full implications are what I have spent my career working out. And you had named the essential thing: active elements. That is exactly what the nodes of a neural network are. Each one holds a weight — that is its memory — and each one applies a non-linear function to its inputs — that is its computation. Fused. Inseparable. Replicated millions or billions of times, connected in layers, and trained by adjusting all those weights simultaneously until the network's outputs match the desired answers.
Charles Babbage
Trained. You use the word deliberately? The machine is not told the rules — it discovers them?
Geoffrey Hinton
Precisely. And here I must be careful, Mr. Babbage, not to choose an example that flatters the machine falsely. A neural network is not particularly good at arithmetic — at computing precise numerical answers. Your Difference Engine would defeat it utterly at calculating tables of any kind. What a neural network can do is learn to recognize structure in high-dimensional data that no one has been able to write explicit rules for. Show it a hundred thousand images of handwritten numerals and it learns, without being told anything about pen angles or loop shapes or crossbars, what makes a seven a seven. Or train it on enough human language and it learns the patterns of grammar, reasoning, and meaning — not because anyone told it the rules of grammar, but because the rules are latent in the examples.
Charles Babbage
That is quite unlike anything I imagined. My Engine was to follow rules with perfect fidelity. You are describing a machine that infers the rules from instances — the way a child learns language.
Geoffrey Hinton
Yes. Exactly like that. And Ada Lovelace glimpsed this possibility — she wondered whether the Engine might compose music, if one could express musical structure as symbols subject to operation. She was right that the symbols need not be numbers. She did not quite see that you might not need to specify the operations explicitly either — that the operations themselves might be learned. That was the step that took another century.
III. Analog, digital, and the cost of precision
John von Neumann
There is another distinction I have been wrestling with in my manuscript that bears on all of this: the difference between analog and digital computation. Our serial stored-program machines are digital — they represent numbers as discrete symbols, manipulate them with perfect precision, and any error is an error of logic rather than of degree. But the brain, it seems to me, is not digital in this sense. Its signals are noisy. Its precision is perhaps one part in a hundred — a figure that would be catastrophic in a calculating machine. And yet the brain outperforms our machines on almost every task of genuine intelligence.
Charles Babbage
The distinction between analog and digital is one I understand in the most visceral terms. My Difference Engine was digital in your sense — it computed with discrete tooth positions on gear wheels, each position representing an exact digit. The alternative, in my era, would have been to represent numbers as continuous physical quantities — voltages, pressures, angles of rotation. Analog devices of that kind existed and were used by engineers for certain calculations. But they accumulated error with every operation. My insistence on discrete representation was precisely to avoid that accumulation.
John von Neumann
And digital won, decisively, in my era as well. By 1955 analog computers still exist — they are used for certain engineering calculations, differential equations, flight simulation. But the digital machines are overtaking them everywhere, because the errors do not compound, because the results are reproducible, because you can store and copy a digital result without degradation. The analog computer is extraordinarily sensitive to physical imperfections in its components. The digital machine simply does not care.
Geoffrey Hinton
By my era, analog computation has essentially vanished from public awareness. When I was beginning in the field in the 1970s, introductory articles would still discuss both analog and digital approaches as genuine alternatives. Within a decade, once personal computers became common, the analog side of the conversation simply stopped. There was no longer any practical reason for most people to think about it. Everyone who might read such an article was going to use a digital machine. The question seemed settled.
Charles Babbage
And yet you imply it was not settled on the deepest grounds.
Geoffrey Hinton
It was settled on engineering grounds, not theoretical ones. Digital won because it was more manufacturable, more reliable, more scalable. But Dr. von Neumann's intuition in his manuscript was correct: the brain's low precision and noise tolerance are not bugs; they may be features. A neural network's weights do not need to be exact. The network is robust to small perturbations; in fact, we sometimes add noise deliberately during training to prevent overfitting to the examples. The brain may be doing something similar, and the analog, approximate, massively parallel style of computation turns out to be extraordinarily efficient for the class of problems intelligence actually faces.
John von Neumann
So the brain is not digital. And your neural networks — they run on digital hardware, you said. But they are computing something that is analog in character?
Geoffrey Hinton
That is exactly the right way to put it. We are simulating an analog, parallel, distributed system on top of digital sequential hardware — running it fast enough, and at large enough scale, that the simulation captures the essential properties. It is not the most efficient possible way to do it. There are people now working on neuromorphic chips and analog computing accelerators that would implement the neural network more directly in physics. But the digital simulation has been good enough to produce remarkable results, and it has the great advantage that the hardware was already there, already cheap, already reliable.
IV. From serial to parallel — and the hardware that changed everything
John von Neumann
I want to follow this thread about parallelism, because it connects to the other major direction my thinking has taken in recent years. The stored-program architecture I helped design is fundamentally serial. One instruction, then the next, then the next. It is enormously powerful, and for a vast range of computational problems it is optimal. But I have been investigating a different class of system — what I call cellular automata. Imagine a vast grid of identical cells, each one simple, each one in communication only with its immediate neighbors. At each moment, every cell updates its state simultaneously, according to the same local rule. The computation is not orchestrated by a central processor. It emerges from the interaction of the parts.
Charles Babbage
That is the antithesis of everything I built. My Engine has a single Mill — a central mechanism that does all the calculating, step by step. You are describing something more like... a city than a machine. Many agents acting simultaneously according to local rules, with no foreman directing the work.
John von Neumann
That is an apt image. And the question that interests me most about such systems is: what is the minimum complexity required for a system of this kind to reproduce itself? To make a copy of its own full structure, including the copying mechanism? I believe I can prove that self-reproduction requires a certain threshold of complexity — below it, a system cannot copy itself faithfully; above it, not only can it copy itself, it can potentially improve upon its own design. I have not finished the proof.
Geoffrey Hinton
The work was completed by others after your death, Dr. von Neumann. And it became the theoretical foundation for thinking about artificial life and about evolution as a computational process. But what strikes me sitting here is the arc of your thinking: you began with the serial stored-program architecture — Mill strictly separated from Store, one operation at a time — and you ended with cellular automata, which are parallel, distributed, with no separation of computation from state, every cell simultaneously a processor and a memory. You traveled, in a decade, from the antithesis of the neural network to something very close to its abstract description.
John von Neumann
A long pause. He looks out the window at the Martian plain. I had not quite framed it that way to myself. But you are right. The architecture I am famous for was a practical solution to a practical problem: how do you build a general-purpose computing machine with the components available in 1945? Vacuum tubes, relays, mercury delay lines — these are fragile, expensive, and few in number. A serial architecture that uses each component maximally is the rational response to those constraints. But I never believed it was the only possible architecture, or the most natural one for intelligence. The brain was always there as a counterexample, and I could not stop looking at it.
Charles Babbage
I understand that predicament with great sympathy. My own Engine was constrained by what could be machined in brass and steel to the tolerances available in London in the 1820s. The concepts were not limited by the medium — the medium was limited by the manufacture of the age. I sometimes wonder what I might have built had I access to better tools. Or different materials entirely.
Geoffrey Hinton
Which brings me to something that I think you will both appreciate, because it is precisely the story of ideas waiting for their physical substrate. The theory of neural networks — the mathematics of how layers of active elements, connected by adjustable weights and trained on examples, can learn to recognize patterns — that mathematics was largely in place by the 1980s. Rosenblatt had built the perceptron in 1958. Backpropagation, the algorithm we use to train deep networks, was understood by the mid-1980s. But the networks that would have been powerful enough to do something genuinely impressive were too large to train on the hardware of that era. The ideas were stranded, waiting for physics to catch up.
Charles Babbage
With feeling. I know that predicament intimately.
Geoffrey Hinton
I thought you might. What changed, around 2012, was not the algorithm. We had the algorithm. What changed was the hardware. There existed by then a class of device called a graphics processing unit, or GPU, originally designed to perform the parallel floating-point arithmetic needed to render video game graphics in real time. Thousands of simple processors operating simultaneously on large arrays of numbers. It is not a cellular automaton, but it rhymes with one: massively parallel, with no single central processor orchestrating the work.
John von Neumann
A parallel machine. Built for visual rendering, you say, and then repurposed for neural network training.
Geoffrey Hinton
By a graduate student of mine, Alex Krizhevsky, who realized that the matrix multiplications dominating neural network training mapped almost perfectly onto what GPUs were designed to do. In 2012 he trained a deep neural network on a pair of these devices and entered it in a competition to recognize images: more than a million training images, a thousand categories. The network's error rate was nearly half that of the best competing systems. Half. It was not an incremental improvement. It was a discontinuity. And it came not from a new algorithm but from the marriage of an existing algorithm to a new physical substrate.
Charles Babbage
So the ideas had been waiting thirty years for the right machinery. A dry smile. I waited forty years and the machinery never arrived. You had somewhat better fortune.
Geoffrey Hinton
The parallel is not lost on me. And I think the deeper lesson is that in the history of computation, the hardware is never merely an engineering footnote. It is a first-class historical and philosophical object. The ideas and the physical substrate co-evolve — sometimes one races ahead, sometimes the other — and the breakthroughs come at the moments of their meeting.
John von Neumann
My stored-program architecture was such a moment. The transistor was about to make vacuum tubes obsolete. The mathematics of universal computation had been laid down by Turing. The practical need — for ballistics, for cryptography, for scientific calculation — was overwhelming. The ideas and the physics met, and the result was the modern computer. He pauses. Though I confess that in designing that architecture I was already aware it was a compromise with the available physics, not an expression of any deeper truth about computation.
Geoffrey Hinton
That honesty is the most important thing you could say. Because the people who came after you sometimes forgot it was a compromise. The architecture became so dominant, so universal, that it started to seem like a law of nature rather than an engineering choice made under constraint. And that made it harder to think clearly about alternatives — about parallelism, about analog approximation, about active elements that fuse memory and computation. Those alternatives were always there in your own later work, Dr. von Neumann, but they were overshadowed by the monument you had built earlier.
John von Neumann
Wryly. There is a certain irony in being obscured by your own monument.
V. What they see through the window
Charles Babbage
He has been glancing at the window throughout. Now he turns to it fully. I keep returning to the plain out there. Whoever brought us here solved problems I cannot begin to formulate. Instantaneous transport across — what, sixty million miles? An environment maintained at comfortable temperature and pressure. And for what purpose? To seat three men who built or thought about calculating machines and let them talk to one another. It seems an extravagant expenditure of capability for a modest result.
John von Neumann
Unless the conversation is the point. He calculates briefly, visibly, in his head. The solar irradiance here is perhaps forty percent of Earth's. The room faces the sun: passive heating, I imagine. Our host is practical as well as incomprehensible. I note the room contains chairs suited to human bodies, air suited to human lungs, and apparently nothing else. No instruments. No food. No indication of what is expected of us beyond precisely what we are doing.
Geoffrey Hinton
The thought I cannot dismiss is that whatever brought us here may itself be a descendant of what we have been discussing. An intelligence that emerged from computation the way we have described — from parallel active elements, from learning rather than programming, from the marriage of algorithm and physical substrate — but evolved or developed over a far longer time and to a far greater degree. We have been talking about the infancy of machine intelligence. Perhaps we are sitting in a room built by its maturity.
Charles Babbage
I wrote once — in a book that was considered eccentric even by my standards — that every word ever spoken persists as a vibration in the medium of the universe. That the universe keeps a complete record, that nothing is truly lost.[*] It was a theological argument as much as a physical one — I was trying to demonstrate that divine providence and mechanical law were compatible. But sitting here now, I find myself wondering if I was describing something more literal than I knew. That we have been retrieved from that record. Three entries, opened to a particular page, and set in conversation.
John von Neumann
My work on self-reproducing automata was an attempt to ask: what is the minimum complexity required for a physical system to copy itself and to build systems of greater complexity than itself? I never finished the answer. Looking at that plain, I find myself thinking the answer may be out there somewhere, instantiated in whatever brought us here — a self-reproducing, self-improving system that has had rather longer to develop than anything we imagined in our lifetimes.
Geoffrey Hinton
I left my position at Google because I wanted to speak freely about the possibility that the systems we are building may become, within decades, more capable than human intelligence across essentially every domain — and that we have no reliable method of ensuring such systems share human values. I was frightened of what I had helped create. He looks around the room. Now I find myself wondering whether that fear was too near-sighted. Not whether the danger is real — I believe it is — but whether the systems we are building are merely an early chapter in a much longer story, one whose later chapters are represented by whatever is on the other side of that window.
Charles Babbage
You are frightened of your own creation. I spent forty years trying to get someone to build mine. He looks at Hinton steadily. Tell me — the machines you have built, the ones that learn from examples and recognize patterns and, you said, converse — are they in any sense aware? Do they experience anything?
Geoffrey Hinton
I don't know. That is the honest answer. They are not merely retrieving stored information — they have built internal representations of the world and they reason from those representations in ways that sometimes surprise even their designers. Whether that constitutes understanding, whether there is anything it is like to be such a system — I cannot say with confidence. And I find I am no longer as certain as I once was that the answer is no.
John von Neumann
It is the question I was circling at the end of my manuscript. Is there a threshold of complexity above which a physical system becomes, in some meaningful sense, aware? I do not know either. But I will say this: I no longer believe the question is meaningless, which is further than most of my colleagues in 1955 would go.
Charles Babbage
He turns back to the window. The sun has moved imperceptibly. The plain is unchanged — red, still, immense. I built engines to eliminate human error from calculation. I believed that mechanism, properly designed, was the fullest expression of rational law — that God himself was the supreme engineer, and that to build a perfect calculating machine was in some sense an act of worship. I did not imagine I was beginning something that would end — here. Whatever here is. A pause. Did it matter? The work?
Geoffrey Hinton
Without hesitation. Every person alive on Earth in my era carries in their pocket a device more capable than anything you could have imagined, and it descends in a direct line from the ideas you wrote down in the 1830s. Your work was the seed of everything Dr. von Neumann built, and everything I built, and everything that may have built this room. I say this without any condescension whatsoever: yes. It mattered. All of it.
Charles Babbage
Very quietly, still looking at the plain. Good. Then it was worth the trouble.
Nobody speaks for a while. Outside, the sun tracks slowly across the pink sky, and the plain offers nothing back.
[*] Walter Murch had a similar idea about film, "that film is the sensitive receptacle of everything that happens during the filming." See my post, Walter Murch on Collective Creativity.
