Sunday, July 24, 2022

Physical constraints on computing, process and memory, Part 1 [LeCun]

Yann LeCun recently posted a major position paper that has been receiving quite a bit of discussion:

Yann LeCun, A Path Towards Autonomous Machine Intelligence, Version 0.9.2, 2022-06-27, https://openreview.net/forum?id=BZ5a1r-kVsf

This post is a response to a long video posted by Dr. Tim Scarfe which raised a number of important issues. One of them is about physical constraints in the implementation of computing procedures and memory. I’m thinking this may well be THE fundamental issue in computing, and hence in human psychology and AI.

I note in passing that John von Neumann’s The Computer and the Brain (1958) was about the same issue and discussed two implementation strategies, analog and digital. He suggested that the brain perhaps employs both. He also noted that, unlike digital computers, where an active computational unit is linked to passive memory through fetch-execute cycles, each unit of the brain, i.e. each neuron, appears to be an active unit.

Physical constraints on computing

Here’s the video I was talking about. It is from the series Machine Learning Street Talk, #78 - Prof. NOAM CHOMSKY (Special Edition), and is hosted by Dr. Tim Scarfe along with Dr. Keith Duggar and Dr. Walid Saba.

As I’m sure you know, Chomsky has nothing good to say about machine learning. Scarfe is not so dismissive, but he does seem to be a hard-core symbolist. I’m interested in a specific bit of the conversation, starting at about 2:17:14. One of Scarfe’s colleagues, Dr. Keith Duggar, mentions a 1988 paper by Fodor and Pylyshyn, Connectionism and Cognitive Architecture: A Critical Analysis (PDF). I looked it up and found this paragraph (pp. 22-23):

Classical theories are able to accommodate these sorts of considerations because they assume architectures in which there is a functional distinction between memory and program. In a system such as a Turing machine, where the length of the tape is not fixed in advance, changes in the amount of available memory can be affected without changing the computational structure of the machine; viz by making more tape available. By contrast, in a finite state automaton or a Connectionist machine, adding to the memory (e.g. by adding units to a network) alters the connectivity relations among nodes and thus does affect the machine’s computational structure. Connectionist cognitive architectures cannot, by their very nature, support an expandable memory, so they cannot support productive cognitive capacities. The long and short is that if productivity arguments are sound, then they show that the architecture of the mind can’t be Connectionist. Connectionists have, by and large, acknowledged this; so they are forced to reject productivity arguments.

That’s what they were talking about. Duggar and Scarfe agree that this is a deep and fundamental issue. A certain kind of very useful abstraction seems to depend on separating the computational procedure from the memory on which it depends. Scarfe (2:18:40): “LeCun would say, well if you have to handcraft the abstractions then learning's gone out the window.” Duggar: “Once you take the algorithm and abstract it from memory, that's when you run into all these training problems.”
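The contrast Fodor and Pylyshyn draw can be made concrete in a few lines of code. Here is a minimal sketch of my own (not from their paper): a tape-style memory can grow without touching the machine’s rules, while a connectionist memory — I use a toy Hopfield-style network with the Hebbian outer-product storage rule as a stand-in — changes its connection weights whenever a new item is stored.

```python
# Tape-style memory: adding an item leaves the "program" untouched.
tape = [1, 0, 1]
tape.append(1)  # the memory grows; the machine's rules do not change

# Connectionist memory (a toy Hopfield-style sketch, my stand-in for
# F&P's point): storing a pattern by the Hebbian outer-product rule
# touches every connection weight in the network.
def store(weights, pattern):
    n = len(pattern)
    return [[weights[i][j] + (pattern[i] * pattern[j] if i != j else 0)
             for j in range(n)] for i in range(n)]

n = 4
W = [[0] * n for _ in range(n)]
W = store(W, [1, -1, 1, -1])       # first memory
W_before = [row[:] for row in W]
W = store(W, [1, 1, -1, -1])       # adding a second memory...
changed = sum(W[i][j] != W_before[i][j]
              for i in range(n) for j in range(n))
print(changed)  # ...alters every off-diagonal weight: 12 of 16 entries
```

The point of the sketch is only the asymmetry: the list grows without the `store` rule changing, while the weight matrix is globally perturbed by each new item.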

OK, fine.

But, as they are talking about a fundamental issue in physical implementation, it must apply to the nervous system as well. Fodor and Pylyshyn are talking about the nervous system too, but they don’t really address the problem except to assert that (p. 45), “the point is that the structure of ‘higher levels’ of a system are rarely isomorphic, or even similar, to the structure of ‘lower levels’ of a system,” and therefore the fact that the nervous system appears to be a connectionist network need not be taken as indicative about the nature of the processes it undertakes. That is true, but no one has, to my knowledge, provided strong evidence that this complex network of 86 billion neurons is, in fact, running a CPU-plus-passive-memory type of system.

Given that, how has the nervous system solved the problem of adding new content to the system, which it certainly does? Here is their specific phrasing, from the paragraph I’ve quoted: “adding to the memory (e.g. by adding units to a network) alters the connectivity relations among nodes and thus does affect the machine’s computational structure.” The nervous system seems to be able to add new items to memory without, however, having to add new physical units, that is neurons, to the network. That is worth thinking about.

Human cortical plasticity: Freeman

The late Walter Freeman has left us a clue. In an article from 1991 in Scientific American (which was more technical in those days), entitled “The Physiology of Perception,” he discusses his work on the olfactory cortex. He’s using an array of electrodes mounted on the cortical surface (of a rat) to register electrical activity. Note that he’s NOT making recordings of the activity of individual neurons. Rather, he’s recording activity in a population of neurons. He then made 2-D images of that activity.

The shapes we found represent chaotic attractors. Each attractor is the behavior the system settles into when it is held under the influence of a particular input, such as a familiar odorant. The images suggest that an act of perception consists of an explosive leap of the dynamic system from the “basin” of one chaotic attractor to another; the basin of an attractor is the set of initial conditions from which the system goes into a particular behavior. The bottom of a bowl would be a basin of attraction for a ball placed anywhere along the sides of the bowl. In our experiments, the basin for each attractor would be defined by the receptor neurons that were activated during training to form the nerve cell assembly.

We think the olfactory bulb and cortex maintain many chaotic attractors, one for each odorant an animal or human being can discriminate. Whenever an odorant becomes meaningful in some way, another attractor is added, and all the others undergo slight modification.

Let me repeat that last line: “Whenever an odorant becomes meaningful in some way, another attractor is added, and all the others undergo slight modification.” That the addition of a new item to memory should change the other items in memory is what we would expect of such a system. But how does the brain manage it? It would seem that specific memory items are encoded in whole populations, not in one or a small number of neurons (so-called ‘grandmother’ cells). Somehow the nervous system is able to make adjustments to some subset of synapses in the population without having to rewrite everything.
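That pattern — every weight shifts, yet the old items survive — can be illustrated with the same kind of toy associative memory. This is a hedged sketch of my own, not Freeman’s model: two orthogonal patterns are stored in one population of weights; adding the second perturbs the whole weight matrix, yet a noisy cue still settles onto the first.

```python
# A toy Hopfield-style associative memory. Adding a new item changes
# every weight, yet the old item remains recoverable from the same
# population -- a rough analogue of Freeman's observation, not his model.

def store(W, p):
    n = len(p)
    return [[W[i][j] + (p[i] * p[j] if i != j else 0) for j in range(n)]
            for i in range(n)]

def recall(W, p, steps=5):
    s = list(p)
    for _ in range(steps):  # synchronous threshold updates
        s = [1 if sum(W[i][j] * s[j] for j in range(len(s))) >= 0 else -1
             for i in range(len(s))]
    return s

n = 8
old = [1, -1, 1, -1, 1, -1, 1, -1]
W = store([[0] * n for _ in range(n)], old)

new = [1, 1, 1, 1, -1, -1, -1, -1]
W = store(W, new)  # every weight shifts slightly...

cue = old[:]
cue[0] = -cue[0]   # ...but a noisy cue still settles on 'old'
print(recall(W, cue) == old)
```

The two stored patterns here are orthogonal, which is the easy case; the interesting question, which the sketch does not answer, is how a real neural population manages the same trick at scale.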

In this connection it’s worth mentioning my favorite metaphors for the brain, as a hyperviscous fluid. What do I mean by that? A fluid having many components, of varying viscosity (some very high, some very low, and everything in between), which are intermingled in a complex way, perhaps fractally. Of course, the brain, like most of the body’s soft tissue, is mostly water, but that’s not what I’m talking about. I’m talking about connectivity.

Perhaps I should instead talk about a hyperviscous network, or mesh, or maybe just a hyperviscous pattern of connectivity. Some synaptic networks have extremely high viscosity and so change very slowly over time while others have extremely low viscosity, and change rapidly. The networks involved in tracking and moving in the world in real time must have extremely low viscosity while those holding our general knowledge of the world and our own personal history will have a very high viscosity.

In the phenomenon that Freeman reports, we can think of the overall integrity of the odorant network as being maintained at, say, level 2, where moment-to-moment sensations are at level 0. The new odorant is initially registered at level 1 and so will affect the level 1 networks across all odorants. That’s the change registered in Freeman’s data. But the differences between odorants are still preserved in level 2 networks. Over time the change induced by the new odorant will percolate from level 1 to level 2 synaptic networks. Thus a new item enters the network without disrupting the overall pattern of connectivity and activation.

That is something I just made up. I have no idea whether or not, for example, it makes sense in terms of the literature on long-term potentiation (LTP) and short-term potentiation (STP), which I do not know. I do note, however, that the term “viscosity” has a use in programming that is similar to my use here.
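Still, the percolation scheme can at least be stated precisely. Here is a toy formalization of the viscosity metaphor — entirely my own construction, with made-up parameters, and not a model of LTP/STP: two weight pools with different learning rates, where a new item lands in the low-viscosity (fast) pool and gradually consolidates into the high-viscosity (slow) pool.

```python
# A toy "hyperviscous" memory: two weight pools with different
# viscosities (learning rates). New input registers in the fast pool
# (level 1) and percolates over time into the slow pool (level 2),
# which at first is barely disturbed. My own sketch of the metaphor;
# the class name, rates, and update rules are all invented.

class HyperviscousMemory:
    def __init__(self, size, fast_rate=0.5, consolidation=0.05):
        self.fast = [0.0] * size   # low viscosity: changes quickly
        self.slow = [0.0] * size   # high viscosity: changes slowly
        self.fast_rate = fast_rate
        self.consolidation = consolidation

    def register(self, signal):
        # A new item is registered almost entirely in the fast pool.
        self.fast = [w + self.fast_rate * (s - w)
                     for w, s in zip(self.fast, signal)]

    def tick(self):
        # Over time the fast pool "percolates" into the slow pool.
        self.slow = [sw + self.consolidation * (fw - sw)
                     for sw, fw in zip(self.slow, self.fast)]

m = HyperviscousMemory(size=3)
m.register([1.0, 0.0, 0.0])   # a new odorant arrives at level 1
slow_then = m.slow[0]         # level 2 is untouched at first
for _ in range(100):          # time passes; consolidation proceeds
    m.tick()
print(m.slow[0] > slow_then)  # the change has percolated to level 2
```

The design choice worth noting is that nothing is rewritten wholesale: the slow pool is only ever nudged by a small fraction of its distance from the fast pool, so existing structure there decays gracefully rather than being overwritten.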

Addendum: Are we talking about computation in the Freeman example?

I have variously argued that language is the simplest operation humans do that qualifies as computation. Thus, earlier in the commentary on LeCun’s piece I have said:

I take it as self-evident that an atomic explosion and a digital simulation of an atomic explosion are different kinds of things. Real atomic explosions are enormously destructive. If you want to test an atom bomb, you do so in a remote location. But you can do a digital simulation in any appropriate computer. You don’t have to encase the computer in lead and concrete to shield you from the blast and radiation, etc. And so it is with all simulations. The digital simulation is one thing, and the real phenomenon another.

That’s true of neurons and nervous systems too. [...] However, back in 1943 Warren McCulloch and Walter Pitts published a very influential paper (A Logical Calculus of the Ideas Immanent in Nervous Activity) in which they argued that neurons could be thought of as implementing circuits of logic gates. Consequently many have, perhaps too conveniently, assumed that nervous systems are (in effect) evaluating logical expressions and therefore that the nervous system is evaluating symbolic expressions.
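For concreteness, the McCulloch-Pitts idea — a neuron as a threshold unit that can realize a logic gate — can be sketched as follows. This is the standard textbook construction, not the notation of the 1943 paper, and the particular weights and thresholds are my own choices.

```python
# A McCulloch-Pitts-style threshold unit: fires (1) iff the weighted
# sum of its inputs reaches the threshold. With suitable weights a
# single unit computes a logic gate -- the observation behind the
# 1943 paper. (Textbook construction; weights/thresholds are mine.)

def mp_unit(inputs, weights, threshold):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def AND(a, b):
    return mp_unit([a, b], [1, 1], threshold=2)

def OR(a, b):
    return mp_unit([a, b], [1, 1], threshold=1)

def NOT(a):
    return mp_unit([a], [-1], threshold=0)

# Any Boolean circuit can then be wired from such units, e.g. XOR:
def XOR(a, b):
    return AND(OR(a, b), NOT(AND(a, b)))

print([XOR(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
```

It is precisely this construction that licenses the slide from “neurons can be modeled as logic gates” to “the nervous system evaluates logical expressions” — the slide the next paragraph questions.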

I think that’s a mistake. Nervous systems are complex electro-chemical systems and need to be understood as such. What happens at synapses is mediated by 100+ chemicals, some more important than others. It seems that some of these processes have a digital character while others have an analog character. [...] I have come to the view that language is the simplest phenomenon that can be considered symbolic, though we may simulate those processes through computation if we wish. That implies that there is no symbolic processing in animals and none in humans before, say, 18 months or so. Just how language is realized in a neural architecture that seems so contrary to the requirements of symbolic computing is a deep question, though I’ve offered some thoughts about it in the working paper I mentioned in my original comment.

If the phenomenon Freeman describes is not about computation — and according to my current beliefs it is NOT — then how does the problem brought up by Fodor & Pylyshyn apply?

And yet there IS a problem, isn’t there? There is a physical network with connections between the items in the network. Those connections must be altered in order to accommodate a new phenomenon. We can’t just add a new item to the end of the tape. That is, it IS a physical problem of the same form. So perhaps this technicality doesn’t matter.
