Monday, August 5, 2019

The design of design [How abstract is that?]

Turing and von Neumann both completely understood that the interesting place in computation is how computation becomes physical, how it becomes embodied and how you represent it.
MIT's Neil Gershenfeld has an interesting discussion at Edge: Morphogenesis for the Design of Design.
I’d like to end this interesting long day by explaining why I think computer science was one of the worst things ever to happen to computers or science, why I believe that, and what that leads me to. I believe that because it’s fundamentally unphysical. It’s based on maintaining a fiction that digital isn’t physical and happens in a disconnected virtual world. [...]

First of all, I’ve come to the conclusion that this is a historical accident. I could ask Marvin what John von Neumann was thinking, and I could ask Andy Gleason what Turing was thinking, and neither of them intended us to be living in these channels. Von Neumann wrote beautifully about many things, but computer architecture wasn’t one of them. We’ve been living with the legacy of the EDVAC and the machines around us, and much of the work of computers is not computationally useful because it’s just shuttling stuff. The Turing machine was never meant to be an architecture. In fact, I'd argue it has a very fundamental mistake, which is that the head is distinct from the tape. And the notion that the head is distinct from the tape—meaning, persistence of tape is different from interaction—has persisted. The computer in front of Rod Brooks here is spending about half of its work just shuttling from the tape to the head and back again.
That is, as currently designed, computers spend a lot of time just transferring "information" (my scare quotes) from one place to another, between memory and the CPU. This is interesting:
We did a study for DARPA of what would happen if you rewrote from scratch a computer software and hardware so that you represented space and time physically. So, if you zoom from a transistor up to an application, you change representations—completely unrelated ones—about five different times. If you zoom the building we’re in from city, state, country, it’s hierarchical, but you respect the geometry. It turns out you can do that to make computer architectures where software and hardware are aligned and not in disconnected worlds.
But, alas, I don't understand it; I can't imagine or visualize it. Here and there Gershenfeld does mention brains, and brains do not separate the "head" from the "tape", that is, processing from memory. Each neuron is both a memory unit and a processor, something von Neumann recognized in his small posthumous volume, The Computer and the Brain.

On deep learning:
There’s nothing magic about the deep-learning architectures. The magic is there’s more data with more memory with more cycles. It’s a cargo cult, the obsession with the acronym zoo of deep learning. It’s just an exercise in scaling that’s been making that possible.
This is followed by an interesting discussion of analog vs. digital involving "interior point methods, or relaxations, where you have a discrete answer you want—like routing an airplane or which way to turn a car—but the way you get through it is to relax the discrete constraints and use internal degrees of freedom." Interesting, but I don't quite understand. Relaxation, yes (simulated annealing?). Interior points – ?? And then:
The real meaning of digital is that scaling property. But the scaling property isn’t one and zero; it’s the states in the system. In the end, what these interior point and relaxation methods do is drive to an outcome that’s a discrete state, but you pass through continuous degrees of freedom. It’s very naïve to say digital is ones and zeroes. It’s state restoration, but you can use continuous degrees of freedom. In many different areas this is done to do the state restoration.
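One way to make "pass through continuous degrees of freedom, end in a discrete state" concrete is a toy log-barrier relaxation. This is only a sketch under stated assumptions, not Gershenfeld's method: the discrete problem is simply "pick the cheapest of a few options", the cost values are made up, and the function names are hypothetical. The choice is relaxed to a vector of positive weights that sum to 1, and a barrier term keeps the weights strictly inside that region. As the barrier weight mu shrinks, the interior point slides toward a corner, and the corner is the discrete answer.

```python
def central_path_point(costs, mu, iters=100):
    """Minimize sum(c_i * x_i) - mu * sum(log(x_i)) subject to sum(x_i) == 1.

    Setting the gradient of the Lagrangian to zero gives x_i = mu / (c_i + lam);
    bisection finds the multiplier lam that makes the x_i sum to 1.
    """
    c_min, n = min(costs), len(costs)
    # The sum of the x_i is >= 1 at lo and <= 1 at hi, so the root is bracketed.
    lo, hi = mu - c_min, n * mu - c_min
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if sum(mu / (c + lam) for c in costs) > 1.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    x = [mu / (c + lam) for c in costs]
    total = sum(x)
    return [xi / total for xi in x]  # renormalize away bisection round-off


def relax_to_choice(costs, mus=(1.0, 0.1, 0.01, 0.001)):
    """Follow the interior path as mu shrinks, then read off the discrete choice."""
    path = [central_path_point(costs, mu) for mu in mus]
    final = path[-1]
    choice = max(range(len(costs)), key=lambda i: final[i])
    return choice, path


costs = [3.0, 1.0, 2.0]  # hypothetical costs of three discrete options
choice, path = relax_to_choice(costs)
for mu, point in zip((1.0, 0.1, 0.01, 0.001), path):
    print(mu, [round(xi, 4) for xi in point])
print("discrete choice:", choice)
```

Every intermediate point is a blend of all three options; only the end of the path is (nearly) a single option. Real interior point solvers handle far richer constraints, but the shape of the computation is the same.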
And on to an interesting analogy:
Compare state of the art manufacturing with a Lego brick or a ribosome: When a kid plays with Lego, you don’t need a ruler because the metrology comes from the parts. It's the same thing for the amino acids. The Lego tower is more accurate than the motor control of the child because you detect and correct errors in their construction. It’s the same thing with the amino acid. There’s no trash with Lego because there’s information in the construction that lets you deconstruct it and use it again. It’s the same thing with the amino acids. It’s everything we understand as digital, but now the digital is in the construction. It’s digitizing the materials.
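The claim that the tower is more accurate than the child's hand can be illustrated with a small simulation. This is a sketch under assumptions of my own choosing: the stud pitch, the jitter size, and the `stack` function are all hypothetical, and "snapping" is modeled as rounding to the nearest stud position. Freehand placement lets each error carry into the next step; the snapped version restores the state after every placement.

```python
import random

PITCH = 1.0  # hypothetical stud spacing, in arbitrary units


def stack(n_bricks, jitter, snap, seed=0):
    """Place n_bricks one after another, each aimed one pitch beyond the last.

    Each placement misses by a random amount up to +/- jitter. With snap=True
    the brick clicks onto the nearest stud, which erases the miss as long as
    the jitter stays below half a pitch.
    """
    rng = random.Random(seed)
    position = 0.0
    for _ in range(n_bricks):
        target = position + PITCH + rng.uniform(-jitter, jitter)
        if snap:
            target = round(target / PITCH) * PITCH  # error detected and corrected
        position = target
    return position


n = 100
freehand_error = stack(n, jitter=0.3, snap=False) - n * PITCH
lego_error = stack(n, jitter=0.3, snap=True) - n * PITCH
print("freehand error:", freehand_error)
print("snapped error:", lego_error)
```

Both runs see exactly the same sequence of hand errors (same seed). Only the freehand stack accumulates them.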
Morphogenesis:
This is the thing I’m most excited about right now: the design of design. Your genome doesn’t store anywhere that you have five fingers. It stores a developmental program, and when you run it, you get five fingers. It’s one of the oldest parts of the genome. Hox genes are an example. It’s essentially the only part of the genome where the spatial order matters. It gets read off as a program, and the program never represents the physical thing it’s constructing. The morphogenes are a program that specifies morphogens that do things like climb gradients and symmetry break; it never represents the thing it’s constructing, but the morphogens then following the morphogenes give rise to you.

What’s going on in morphogenesis, in part, is compression. A billion bases can specify a trillion cells, but the more interesting thing that’s going on is almost anything you perturb in the genome is either inconsequential or fatal. The morphogenes are a curated search space where rearranging them is interesting—you go from gills to wings to flippers. The heart of success in machine learning, however you represent it, is function representation. The real progress in machine learning is learning representation. How you search hasn’t changed all that much, but how you represent search has. These morphogenes are a beautiful way to represent design. Technology today doesn’t do it. Technology today generally doesn’t distinguish genotype and phenotype in the sense that you explicitly represent what you’re designing. In morphogenesis, you never represent the thing you’re designing; it's done in a beautifully abstract way. For these self-reproducing assemblers, what we’re building is morphogenesis for the design of design. Rather than a combinatorial search over billions of degrees of freedom, you search over these developmental programs. This is one of the core research questions we’re looking at.
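A standard toy model of a developmental program is an L-system, and it may help in picturing what "never represents the thing it's constructing" means. This is an illustration only, not the representation Gershenfeld's group uses. The rewrite rules play the role of the genome, the string grown from them plays the role of the body, and the name `develop` and the mutant rule are assumptions made up for the example.

```python
def develop(axiom, rules, steps):
    """Grow a 'body' by applying the rewrite rules to every symbol, steps times."""
    body = axiom
    for _ in range(steps):
        body = "".join(rules.get(symbol, symbol) for symbol in body)
    return body


genome = {"A": "AB", "B": "A"}   # Lindenmayer's algae system: two short rules
mutant = {"A": "AB", "B": "AA"}  # hypothetical mutation: one symbol added

body = develop("A", genome, 10)
mutant_body = develop("A", mutant, 10)

print(develop("A", genome, 3))        # a small body, for inspection
print(len(body), len(mutant_body))    # sizes after ten developmental steps
```

Neither rule set stores the length or layout of the result anywhere; those only appear when the program is run. A one-symbol change to the rules changes the growth law of the whole body, which is the sense in which searching over programs differs from searching over the individual symbols of the finished string.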
And now fabrication:
I started with complaining that computer science was the worst thing to happen to computers or science because it’s unphysical, and pointed out that you can have a do-over of computer science that’s much more aligned with physics. It has all kinds of benefits ranging from computing with very different physical systems to limits of high-performance computing but, ultimately, reuniting computer science and physical science leads to merging the bits and atoms. Fabrication merges with communication and computation. Most fundamentally, it leads to things like morphogenesis and self-reproducing an assembler. Most practically, it leads to almost anybody can make almost anything, which is one of the most disruptive things I know happening right now. Think about this range I talked about as for computing the thousand, million, billion, trillion now happening for the physical world, it's all here today but coming out on many different link scales.
In the comments, from Danny Hillis:
I was going to give just a very specific small example that supports the abstraction that you’re saying. In modern ways of analyzing algorithms, and computers, and the computer science, we count the cost of moving a bit in time. We call that storage, and that’s very carefully measured in the algorithms and things like that. The cost of moving a bit in space is completely invisible, and it just doesn’t come up. There’s no measure of that in the way that we abstract it, but if you look at the megawatts that are dissipated in high-performance computers, it mostly comes from moving bits in space. That’s the big limitation, and that’s also where the errors are and where the cost is. So, our abstraction that we’re thinking about the algorithms in is completely out of sync with where our costs are.
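Hillis's point can be seen in miniature by counting two things for the same computation: how many operations it performs, and how far it reaches between consecutive memory accesses. The sketch below is a toy model with assumptions that are mine, not his: distance in a flat address space stands in for physical distance, and the function `traversal_cost` is hypothetical. It walks the same n x n array, stored row by row, in two different orders.

```python
def traversal_cost(n, order):
    """Return (operations, distance) for visiting every cell of an n x n array.

    The array is assumed to be stored row by row, so cell (r, c) lives at
    address r * n + c. 'distance' is the total jump in address between
    consecutive visits, a crude stand-in for moving bits in space.
    """
    if order == "row":
        cells = [(r, c) for r in range(n) for c in range(n)]
    else:  # column order
        cells = [(r, c) for c in range(n) for r in range(n)]
    addresses = [r * n + c for r, c in cells]
    operations = len(addresses)
    distance = sum(abs(b - a) for a, b in zip(addresses, addresses[1:]))
    return operations, distance


row_ops, row_dist = traversal_cost(4, "row")
col_ops, col_dist = traversal_cost(4, "column")
print("row order:   ", row_ops, "operations,", row_dist, "address steps")
print("column order:", col_ops, "operations,", col_dist, "address steps")
```

A textbook operation count treats the two traversals as identical, since each touches every cell once. The second number is the one that the abstraction leaves out.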
Gershenfeld in the comments:
Analog doesn’t mean analog. In other words, analog in this context means you have states, and you recover from errors, and you detect states. But states are outcomes of the system, they're not ones and zeroes. One of the things we’re stuck in is this idea that a state is one and a zero. This device in front of me keeps recurring the state not at the high-level thing I’m trying to do, but at the ones and zeroes.

These interior point relaxation methods I was describing both in software optimization and in emerging chips do digitize, but they’re digitizing on high-level outcomes but using the analog degrees of freedom. That was behind my comment that when the brain does a few moves a second, it’s moving through this very high-dimensional space, ending into a discrete outcome. So, the effective number of operations that are done this way is an enormous number.
And:
Gopnik: It seems to me like it’s an incredibly interesting understudied fact that what this all ends up driving is a bunch of fingers and your larynx. This tiny system with tiny degrees of freedom and very little complexity is the thing that’s doing the work that we think of as being a lot of the work of intelligence.

Gershenfeld: But again, these kinds of relaxation interior point methods that I keep alluding to, there’s something similar to them in that they’re moving through these billion-dimensional spaces, but what they’re putting outside is not the interior point but statistics of the states that they’re getting driven to. So, there are analogs between unpacking the huge number of internal degrees of freedom versus small numbers of observable degrees of freedom in these engineered systems.
