Saturday, April 30, 2022

Feed-forward layers in transformer models

While I'm finding it tough sledding – my fault – this is a very interesting article, well worth your time. I'm beginning to think we're going to figure out what's going on inside these engines.

White flower

About the Ukraine, the WEIRDEST war, Part 2: On cooperation in war

On Thursday I posted about the Ukraine war. The post had two parts. The first part was a standard kind of post on such matters, some links to news, a quote from a pundit, and some connecting-tissue prose. The second part was quite different. It was a fantasy fueled by conspiracy-theory: The world’s leaders, on all sides of the conflict, had sent their proxy bots to a hidden underground lair for the purpose of managing the war. They were playing a game. Games, as you know, are cooperative ventures.

The parties cooperate to establish a venue within which they can compete against one another. In the case of ordinary games, such as chess, poker, soccer, jai alai, and so forth, the rules are agreed to in advance. Such agreement is an act of cooperation. There is always the possibility of cheating, which is anti-cooperative. But there is a framework.

What’s the agreed framework for war in general, or for this particular war? We’ve got the Geneva Conventions, “that establish international legal standards for humanitarian treatment in war.” But that’s only one aspect of the conduct of war, and the conventions are easily breached. We’ve got treaties governing the use of biological and chemical weapons. There are implicit norms, there is international law, and so forth. There IS stuff, but it’s not like the rules of chess or soccer. It’s much looser.

For example, we’ve got understandings, let us say, about nuclear weapons. Consider this recent statement, from January 22, 2022: Joint Statement of the Leaders of the Five Nuclear-Weapon States on Preventing Nuclear War and Avoiding Arms Races. The five states are The People’s Republic of China, the French Republic, the Russian Federation, the United Kingdom of Great Britain and Northern Ireland, and the United States of America. Consider these paragraphs:

We affirm that a nuclear war cannot be won and must never be fought. As nuclear use would have far-reaching consequences, we also affirm that nuclear weapons—for as long as they continue to exist—should serve defensive purposes, deter aggression, and prevent war. We believe strongly that the further spread of such weapons must be prevented.

We reaffirm the importance of addressing nuclear threats and emphasize the importance of preserving and complying with our bilateral and multilateral non-proliferation, disarmament, and arms control agreements and commitments. We remain committed to our Nuclear Non-Proliferation Treaty (NPT) obligations, including our Article VI obligation “to pursue negotiations in good faith on effective measures relating to cessation of the nuclear arms race at an early date and to nuclear disarmament, and on a treaty on general and complete disarmament under strict and effective international control.”

It concludes, “We are resolved to pursue constructive dialogue with mutual respect and acknowledgment of each other’s security interests and concerns.” That’s a statement of good intentions, no more. It’s not a binding and enforceable agreement.

I think we can say that, in threatening the use of nuclear weapons, Russia has violated the spirit of those good intentions. And? Does the other side take them seriously? How does it affect their actions? That’s something being negotiated. And negotiations imply cooperation. The cooperation may be coerced, but it is still cooperation.

We’re trying to figure out how to fight this war. We’re negotiating the rules. What rules govern the use of sanctions or the seizure of assets? What about Russia’s gas? Does Russia deny it to Europe? Does Europe stop purchasing it? And cyberwar, what are the rules there?

The fact is, it’s all up for grabs. In one way or another everyone in the world has a stake in this war, which has become a proxy war between NATO and Russia, with implications everywhere. While everyone is affected, there are a relatively small number of effective agents, and they certainly are not neatly aligned on two sides. It’s a mess. No one really knows what’s going on. It does appear to me that the nature of the international order is up for grabs.

My conspiracy-theory fantasy, bots playing a war game in a Bond-villain lair, is much neater than what is actually happening. But then, that’s how conspiracy theories work: they reduce a complex situation to an intelligible conflict between comprehensible actors. That’s not what’s happening in the world today.

Friday, April 29, 2022

Input to the visual cortex

The first iris of the season

Bye, bye 9 to 5, hello 9 to 2, plus an evening shift

Emma Goldberg, Working 9 to 2, and Again After Dinner, NYTimes, April 29, 2022.

For many remote workers, 9 to 5 has changed to something more fragmented. A typical schedule might look more like 9 to 2, and then 7 to 10. Then sometimes another five minutes, wherever you can squeeze them in.

When the coronavirus upended the workplace in 2020, leaving roughly 50 million people working from home by that May, the workday as we knew it went through radical changes, too. Mornings became less harried. Afternoons became child care time. Some added a third shift to their evenings, what Microsoft researchers call the “third peak” of productivity, following the midmorning and after-lunch crunches. With 10 percent of Americans still working from home and some businesses embracing remote work permanently, companies are scrambling to adjust to a new understanding of working hours.

“What we used to think of as traditional work — very specific location, very specific ways of working together, very well-defined work metrics — those are changing,” said Javier Hernandez, a researcher in Microsoft’s human understanding and empathy group. “There’s the opportunity for flexibility. There’s also the opportunity to make us miserable.”

The more scattered approach to work scheduling has created enormous upsides for parents, along with some new sources of stress. What’s clear is the shift: The workday, when charted out, has started to look less like a single mountain to scale, and more like a mountain range.

There's more at the link, including example days from several people.

Two AI companies converging on (the mythical) AGI from different directions

On Wednesday I blogged about Adept, which is approaching general intelligence by way of creating natural language interfaces for software packages used by end users – in their terms, “a universal collaborator for every knowledge worker.” That makes sense because current systems acquire knowledge, not through hand-coding (like symbolic AI), but through learning. The software environment is ‘native’ for AI engines, whereas the physical world is not, so they should be able to learn it efficiently and act in it.

This morning I found out about Halodi Robotics. They’ve just hired Eric Jang as their VP for AI. Here’s what he says:

If your endgame is to build a Foundation Model that train on embodied real-world data, having a real robot that can visit every state and every affordance a human can visit is a tremendous advantage. Halodi has it already, and Tesla is working on theirs. My main priority at Halodi will be initially to train models to solve specific customer problems in mobile manipulation, but also to set the roadmap for AGI: how compressing large amounts of embodied, first-person data from a human-shaped form can give rise to things like general intelligence, theory of mind, and sense of self. [...]

Reality has a surprising amount of detail, and I believe that embodied humanoids can be used to index that all that untapped detail into data. Just as web crawlers index the world of bits, humanoid robots will index the world of atoms.

Jang is right. Reality does have a surprising amount of detail, and an AI engine can’t learn it by reading zillions and jillions of texts. It’s got to get out in the physical world and interact with it.

Here’s what I said two years ago:

So, the AGI of the future, let’s call it GPT-42, will be looking in two directions, toward the world of computers [that’s Adept] and toward the human world [that’s Halodi]. It will be learning in both, but in different styles and to different ends. In its interaction with other artificial computational entities GPT-42 is in its native milieu. In its interaction with us, well, we’ll necessarily be in the driver’s seat.

And, yes, I know, I’ve said that AGI is a chimera, the Philosopher’s Stone of alchemical AI. But if smart and imaginative people take a good run at it, they’ll come up with something interesting in the process. Who cares if they don’t make it there. Maybe they’ll make it to Mars instead. They can greet Elon when he lands.

Sign me up!

Friday Fotos: Some forsythia flicks I forgot

From Harold Arlen, to Judy Garland, through IZ by way of the theremin to the future – To infinity and beyond!

Somewhere over the rainbow" è la canzone che tutti noi cantiamo sotto la doccia, fischiettiamo quando siamo felici e associamo ai più spensierati sogni della nostra vita; mentre "Il mago di Oz" è il film che si guarda una volta da piccoli e che ci farà compagnia nelle serate più tristi d'inverno per tutta la nostra vita. Tra le versioni di Israel "IZ" Kamakawiwoʻole', Ella Fitzgerald, Frank Sinatra, Celine Dion, Ariana Grande e ogni cantante che abbia un canale Youtube con più di 30 iscritti al momento, volevamo anche noi dare un piccolo contributo ad un gioiello della musica e del cinema di altri tempi.

"Somewhere over the rainbow" is the song that we all sing in the shower, whistle when we are happy and join to the most carefree dreams of our lives; instead "The Wizard of Oz" is the film that is we watched in our childhood and that make us company in the most sad winter evenings of our lives. Among the versions of Israel "IZ" Kamakawiwo'ole ', Ella Fitzgerald, Frank Sinatra, Celine Dion, Ariana Grande and every singer who has a Youtube channel with more than 30 subscribers at the moment, we also wanted to make a small contribution to a jewel of music and of the cinema of other times.

Arrangement: Andrew Harvey


Four professional musicians combining an astute technical performance with the verve of cabaret. For both a refined and curious ear. Decostruttori Postmodernisti – or if you like, Postmodern Deconstructionists – perform with a piano, violin, cello and a theremin.

Quattro musicisti di professione capaci di unire un'attenta esecuzione tecnica con la verve da cabaret. I Decostruttori Postmodernisti si esibiscono con un pianoforte, un violino, un violoncello e un theremin.

My post on IZ: The Song of Israel Kamakawiwoʻole [stealth hit from out of nowhere]

Is AGI currently the Rome toward which all AI research is headed? [The Alchemical Age]

That graphic is from this blog post: All Roads Lead to Rome: The Machine Learning Job Market in 2022, by Eric Jang. From the post:

For instance, Alphabet has so much valuable search engine data capturing human thought and curiosity. Meta records a lot of social intelligence data and personality traits. If they so desired, they could harvest Oculus controller interactions to create trajectories of human behavior, then parlay that knowledge into robotics later on. TikTok has recommendation algorithms that probably understand our subconscious selves better than we understand ourselves. Even random-ass companies like Grammarly and Slack and Riot Games have a unique data moats for human intelligence. Each of these companies could use their business data as a wedge to creating general intelligence, by behavior-cloning human thought and desire itself.

The moat I am personally betting on (by joining Halodi) is a “humanoid robot that is 5 years ahead of what anyone else has”. If your endgame is to build a Foundation Model that train on embodied real-world data, having a real robot that can visit every state and every affordance a human can visit is a tremendous advantage. Halodi has it already, and Tesla is working on theirs. My main priority at Halodi will be initially to train models to solve specific customer problems in mobile manipulation, but also to set the roadmap for AGI: how compressing large amounts of embodied, first-person data from a human-shaped form can give rise to things like general intelligence, theory of mind, and sense of self.

Embodied AI and robotics research has lost some of its luster in recent years, given that large language models can now explain jokes while robots are still doing pick-and-place with unacceptable success rates. But it might be worth taking a contrarian bet that training on the world of bits is not enough, and that Moravec’s Paradox is not a paradox at all, but rather a consequence of us not having solved the “bulk of intelligence”.

Reality has a surprising amount of detail, and I believe that embodied humanoids can be used to index that all that untapped detail into data. Just as web crawlers index the world of bits, humanoid robots will index the world of atoms. If embodiment does end up being a bottleneck for Foundation Models to realize their potential, then humanoid robot companies will stand to win everything.

Yes, reality is (not so) surprisingly detailed, something I talked about in my GPT-3 working paper. I fear that AGI is the Philosopher's Stone of this Alchemical Age of AI that we are still living in.

Thursday, April 28, 2022

Peter Gärdenfors: Conceptual Spaces, Cognitive Semantics and Robotics

Peter Gärdenfors (Lund University, Sweden) gives a talk at the EROSS 2020 series entitled “Conceptual Spaces, Cognitive Semantics and Robotics”.

Peter Gärdenfors is probably the greatest living Swedish philosopher. He made seminal contributions to philosophy of science, decision theory, language evolution, and was already a key figure in the area of belief revision when he created one of the current central notions in cognitive semantics, namely, the notion of Conceptual Spaces. Conceptual Spaces have been applied in Robotics, Linguistics, Knowledge Representation, Conceptual Modeling, among many other areas. He is a professor of Cognitive Science at Lund University, in Sweden, a recipient of the Gad Rausing Prize, and an elected member of the Royal Swedish Academy of Letters, History and Antiquities, the Royal Swedish Academy of Sciences, Deutsche Akademie für Naturforscher, and Academia Europaea. He was a member of the Prize Committee for the Prize in Economic Sciences in Memory of Alfred Nobel 2011-2017 (aka the Nobel Prize in Economics). He is the editor and author of many books, including: “The Geometry of Meaning: Semantics Based on Conceptual Spaces” – MIT Press, 2014, “Conceptual Spaces: The Geometry of Thought” – Bradford Books, 2004, “The Dynamics of Thought” – Synthese Library, 2006, “How Homo Became Sapiens: On the Evolution of Thinking” – Oxford University Press, 2006.

You can find the slides and references for this talk HERE.

If you would like to know more about EROSS, please visit us at

Enrico Franconi,
Giancarlo Guizzardi,
Tiago Prince Sales,
Claudenir Morais Fonseca,

KRDB Research Centre for Knowledge and Data
Free University of Bozen-Bolzano
(Organization Team)

Pink and white

About the Ukraine, the WEIRDEST war I can remember [the bots are running the show]

I’ve been thinking for some time now that I should post something about the Ukraine war. But what? I know relatively little about Ukraine or its history with Russia over the past ten years. Though I’m certainly interested in foreign policy, international affairs, and war, I have no particular expertise in these areas. I’m just a concerned citizen, watching, helplessly, as people get slaughtered halfway around the world.

But I can say this: This is the weirdest war I can remember. I marched against the war in Vietnam and served as a conscientious objector. That war was certainly in the news. Back then, however, the news was three national broadcast TV networks, and hardcopy newspapers. It was much more tightly controlled than today’s 24/7 web-fueled media environment. The same was pretty much the case for the Balkan wars of the 1990s. Then the US invaded Iraq after 9/11, pretty much the same. The web was now in heavy use, but we didn’t have social media. The same for Afghanistan and, though that war continued up until yesterday, it had pretty much receded into the background.

Things are different now. Here’s what Thomas Friedman wrote on April 3, 2022:

Almost six weeks into the war between Russia and Ukraine, I’m beginning to wonder if this conflict isn’t our first true world war — much more than World War I or World War II ever was. In this war, which I think of as World War Wired, virtually everyone on the planet can either observe the fighting at a granular level, participate in some way or be affected economically — no matter where they live.

While the battle on the ground that triggered World War Wired is ostensibly over who should control Ukraine, do not be fooled. This has quickly turned into “the big battle” between the two most dominant political systems in the world today: free-market, “rule-of-law democracy versus authoritarian kleptocracy,” the Swedish expert on the Russian economy Anders Aslund remarked to me.

That sounds about right to me. We’re all watching, and thinking...what?

Russia is being isolated from the world economy. Japan is facing an energy shortage as prices of liquefied natural gas have gone up. Russia is cutting off natural gas supplies to Europe. Russia’s foreign minister, Sergey V. Lavrov, accuses the U.S. and its allies of pursuing a proxy war, and warned that their involvement could lead to nuclear war. No, Toto, we are not in Kansas anymore.

* * * * *

I’m imagining an underground lair like the Bad Guys in those Bond movies have. A big table in the middle and high-tech gizmos all around. In the middle of the table we see a map of the world. Ukraine and its surrounds are lit up. The Bad Guys are discussing what to do next.

Who are the Bad Guys? Proxies – android duplicates perhaps, whole-brain uploads (or is it downloads?) – for the heads of state of the world’s nations, all of them: Ukraine, Russia, France, South Africa, Sri Lanka, USA, Guatemala, Japan, Chile, and so forth. As I say, all of them.

They saw tensions rising in the wake of America’s withdrawal from Afghanistan. What are we gonna do? these bots asked, what are we gonna do? They decided we needed another war. The Putin bot offered Ukraine, said they could mop it up quickly, It’ll divert people’s attention while we plot a more viable long-term solution. We can do this! The Biden bot agreed, as did the Xi Jinping bot, the Félix Tshisekedi bot, the Abdel Fattah el-Sisi bot, the Andry Rajoelina bot, the Prithvirajsing Roopun bot, the Jair Bolsonaro bot, the Willem-Alexander bot, the Carl XVI Gustaf bot, the Boris Johnson bot, and on and on, all the bots agreed.

And then things got out of hand. More and more people are dying. More will die. The bots don’t care. It’s no skin off any of their noses. ‘Cause they don’t have skin and noses. They’re just bots, bots run amuck in a James Bond movie. As long as we watch and don’t interfere with their power supply the bots don’t care.

Intervene in the operations of a language model using an interactive debugger

Be sure to check out the video. It's very cool. Pay close attention, though. If you've never worked with such systems – I haven't – you may be puzzled on first viewing. It made more sense to me the second time around.

What I'm wondering is whether something like this could be used to "bootstrap" a symbolic component into a neural net. I'm thinking of some posts where I discuss Vygotsky's account of language acquisition: Vygotsky Tutorial (for Connected Courses) and this one, Dual-system mentation in humans and machines [updated]. The second one involves a hybrid AI system with symbolic and neural net components.

Lone bird flying low over the Hudson River in the early morning

Wednesday, April 27, 2022

Distinguishing between very similar musical lines during practice sessions

I’ve been practicing (the melody of) a 12-bar blues that I’ve created, “Skippy’s Blues.” Bars 3, 7, and 11 are identical. Bars 4, 8, and 12 are each almost like the preceding bar, but they differ in the final 3 notes, an eighth-note triplet leading in to the next measure. So I’ve got three 2-bar phrases – 3 & 4, 7 & 8, and 11 & 12 – which are identical except on the final beat. Let me recast that: I’ve got three 8-beat phrases that are identical except for the 8th beat.

Why am I expending so much energy describing this? Because I keep confusing those phrases in the last beat. I note that I’ve not written the melody down, so I’m practicing it “by ear,” as the saying goes. I must have played it 20, 30, 40, who knows how many times over the last three days or so. I’m getting better, but keeping those phrases distinguished from one another and executing the right phrase for the context, that’s very tricky. What makes it tricky is that the phrases are identical for 7/8ths of their span.

Every once in a while I get it right. But on those occasions I’m still thinking ahead, and there may be a bit of hesitation on that last beat of the phrase. How many times will I have to repeat the whole melody until I get those phrases right? The idea is to play it through without thinking about it.

I assume that the problem is in the motor system. I haven’t been keeping track, so I don’t know whether or not it’s trying to play the previous version. That is, in bar 8 it wants to play what it had done in bar 4, in bar 12 it wants to play what it had done in bar 8, and then, on the repeat, in bar 4 it wants to play what it had done in bar 12. It would be interesting to keep track, but also tedious beyond belief, so I won’t do it.

What’s going on? Is the motor system trying to ‘compile’ one phrase with a variable last beat, depending on context? Or is it trying to learn three different phrases which happen to be almost identical? I don’t know. Which would be the better way to do it? Do I have any conscious choice in the matter?
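For what it’s worth, the two candidate schemes can be sketched in code, purely as an analogy. The note values, context labels, and function names below are my invention, not the actual melody:

```python
# Two ways a motor program might store three near-identical phrases.
# Note values and context labels are placeholders, not the actual melody.

SHARED_BEATS = ["C", "E", "G", "A", "C", "A", "G"]   # beats 1-7, identical

# Scheme 1: one phrase 'compiled' with a variable 8th beat chosen by context.
ENDINGS = {"bar4": "E-triplet", "bar8": "G-triplet", "bar12": "C-triplet"}

def play_parameterized(context: str) -> list:
    return SHARED_BEATS + [ENDINGS[context]]

# Scheme 2: three separately stored phrases that happen to share 7/8ths.
PHRASES = {ctx: SHARED_BEATS + [end] for ctx, end in ENDINGS.items()}

def play_stored(context: str) -> list:
    return PHRASES[context]

# Both schemes produce identical output; they differ in where an error
# can arise. Scheme 1 can only fail at the branch point (the 8th beat);
# Scheme 2 can retrieve the wrong whole phrase, and the mistake is
# undetectable until the 8th beat because the first 7 beats coincide.
assert play_parameterized("bar8") == play_stored("bar8")
```

The interesting difference is diagnostic: under the first scheme an error is a wrong branch at the last beat, while under the second it is a wrong retrieval that only surfaces at the last beat.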

Yes, this is ultimately about the motor system, but also about the relationship between the motor system and auditory perception and attention. It seems to me that, since I don’t have it down at the moment, the attention system is trying to interrupt the motor system near the end of the phrase so as to steer it to the proper ending. That doesn’t seem to be working. Maybe the idea is to play it slowly enough that I can get it smooth without interruption. Once I’ve got that down, then I can start gradually speeding up.

This is the kind of thing that forces musicians to practice hours and hours and hours. Athletes, too.

Coda: And, you know, I suspect that getting really good at arithmetic calculation poses similar problems. It’s not so much about moving the pencil around – but remember all that time you spent practicing letter forms and word forms in learning to write? – as about keeping track of where you are. The brain, after all, is a physical mechanism. Keeping track is thus a physical process. See my earlier post, Why is simple arithmetic difficult for deep learning systems?

AI tutors of the future

I called it. Adept intends to make it so. [Neal Stephenson was there in ‘95]

Back in July of 2020, I posted my first reactions to GPT-3 in a comment on Tyler Cowen’s Marginal Revolution:

Machine learning was the key breakthrough. Rodney Brooks' Gengis, with its subsumption architecture, was a key development. FWIW Brooks has teamed up with Gary Marcus and they think we need to add some old school symbolic computing into the mix. I think they're right.

Machines, however, have a hard time learning the natural world as humans do. We're born primed to deal with that world with millions of years of evolutionary history behind us. Machines, alas, are a blank slate.

The native environment for computers is, of course, the computational environment. That's where to apply machine learning. Note that writing code is one of GPT-3's skills.

So, the AGI of the future, let's call it GPT-42, will be looking in two directions, toward the world of computers and toward the human world. It will be learning in both, but in different styles and to different ends. In its interaction with other artificial computational entities GPT-42 is in its native milieu. In its interaction with us, well, we'll necessarily be in the driver's seat.

Is the Star Trek computer heading our way? A new startup, Adept, aims to make it so.

From their blog:

In practice, we’re building a general system that helps people get things done in front of their computer: a universal collaborator for every knowledge worker. Think of it as an overlay within your computer that works hand-in-hand with you, using the same tools that you do. We all have parts of our job that energize us more than others – with Adept, you’ll be able to focus on the work you most enjoy and ask our model to take on other tasks. For example, you could ask our model to “generate our monthly compliance report” or “draw stairs between these two points in this blueprint” – all using existing software like Airtable, Photoshop, an ATS, Tableau, Twilio to get the job done together. We expect the collaborator to be a good student and highly coachable, becoming more helpful and aligned with every human interaction.

This product vision excites us not only because of how immediately useful it could be to everyone who works in front of a computer, but because we believe this is actually the most practical and safest path to general intelligence. Unlike giant models that generate language or make decisions on their own, ours are much narrower in scope–we’re an interface to existing software tools, making it easier to mitigate issues with bias. And critical to our company is how our product can be a vehicle to learn people’s preferences and integrate human feedback every step of the way.

Andreessen talks of AI as platform. It looks like Adept is heading toward AI as operating system – see my post, AI as platform [Andreessen]: PowerPoint Squared and beyond.

I’m thinking of a future in which each child is given their own AI companion at some suitably early age. The companion grows with them, remaining with them for the rest of their life. Is that where we’re going, toward Neal Stephenson's Diamond Age: Or, A Young Lady's Illustrated Primer (1995)? (Thanks, David.)

* * * * *

Addendum: Some more tweets out of Adept: 

* * * * *

Adept Video Demo! from Augustus Odena on Vimeo.

Newport Mall, Jersey City

Tuesday, April 26, 2022

Why is simple arithmetic difficult for deep learning systems?

Marcus points this out at two places in the video: c. 18:25 (multiplication of 2-digit numbers), c. 19:49 (3-digit addition). Why is this so difficult for deep learning models to grasp? It suggests a failure to distinguish between semantic and episodic memory, to use terms from Old School symbolic computation.

The question interests me because arithmetic calculation has well-understood procedures. We know how people do it. And by that I mean that there’s nothing important about the process that’s hidden, unlike our use of ordinary language. The mechanisms of both sentence-level grammar and discourse structure are unconscious.

It's pretty clear to me that arithmetic requires episodic structure, to introduce a term from old symbolic-systems AI and computational linguistics. That’s obvious from the fact that we don’t teach it to children until grammar school, which is roughly when episodic level cognition kicks in (see the paper Hays and I did, Principles and Development of Natural Intelligence).

I note that, while arithmetic is simple, it’s simple only in that there are no subtle conceptual issues involved. But fluency requires years of drill. First the child must learn to count; that gives numbers meaning. Once that is well in hand, children are drilled in arithmetic tables for the elementary operations: addition, subtraction, multiplication, and division. The learning of addition and subtraction tables proceeds along with exercises in counting, adding, and subtracting items in collections. Once this is going smoothly one learns the procedures for multiple-digit addition and subtraction, multiple-operand addition, and then multiplication and division. Multiple-digit division is the most difficult because it requires guessing, which is then checked by actual calculation (multiplication followed by subtraction).

Why do such intellectually simple procedures require so much drill? Because each individual step must be correct. One mistake anywhere, and the whole calculation is thrown off. You need to recall atomic facts (from the tables) many times in a given calculation and keep track of intermediate results. The human mind is not well-suited to that. It doesn’t come naturally. Drill is required. That drill is being managed by episodic cognition.
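The bookkeeping burden is easy to see if you write long addition out as an explicit procedure: every column requires one retrieved table fact plus the carry, an intermediate result that must be held across steps, so a single slip anywhere corrupts the total. A minimal sketch (the function and its trace format are my own illustration):

```python
def long_addition(a: int, b: int):
    """Add two non-negative integers column by column, recording each step.

    Each column needs (1) one retrieved single-digit table fact and
    (2) the carry -- an intermediate result held in working memory.
    A mistake in either, at any step, corrupts the final total.
    """
    da = [int(d) for d in str(a)][::-1]  # least-significant digit first
    db = [int(d) for d in str(b)][::-1]
    steps, out, carry = [], [], 0
    for i in range(max(len(da), len(db))):
        x = da[i] if i < len(da) else 0
        y = db[i] if i < len(db) else 0
        total = x + y + carry            # table fact plus the tracked carry
        out.append(total % 10)
        steps.append((x, y, carry, total % 10))
        carry = total // 10
    if carry:
        out.append(carry)
    result = int("".join(str(d) for d in reversed(out)))
    return result, steps

result, steps = long_addition(478, 96)
# 478 + 96 = 574: three columns, with the carry in play for two of them.
```

Nothing in the digit-level facts is hard; what the drill buys you is reliable retrieval plus reliable carry-tracking over many serial steps.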

Obviously machine learning cannot pick up that kind of episodic structure. The question is: Can it pick up any kind of episodic structure at all? I don’t know.

When humans produce the kind of coherent prose that these AI devices do, they are using episodic cognition. But that episodic cognition is unconscious. Do machine learning systems pick up episodic cognition of that kind? As I say, I don’t know. But I can imagine that they do not. If not, then what are they doing to produce such convincing simulacra of coherent prose? I am tempted to say they are doing it all with systemic-level cognition, but that may be a mistake as well. They’re doing it with some other mechanism, one that doesn’t differentiate between semantic and episodic level cognition – not to mention gnomonic (see Principles above).

The fact that these systems can only produce relatively short passages of coherent prose suggests a failure at the episodic level. The fact that they can produce nonsense and things that are not true suggests a failure at the gnomonic level.

* * * * *

Check out that tweet thread for more examples. 

Here's a more recent post on this subject.

Everything Everywhere All at Once [Media Notes 72]

I loved it! But first, I had to go to a theater to see it, a real, in-person movie theater. Which was OK. The theater had been revamped with recliner seats and they aren’t all that comfortable, especially for a film that clocks in at over two hours.

And what a movie! Zowie! Does it ever zip around. It is exactly that, Everything Everywhere All at Once. It begins in an apartment above a laundromat in the world you and I are familiar with, but with hints of something else. And then it moves into the multiverse, which is not to be confused with the metaverse. This is no Ready Player One, zipping back and forth between meat space and virtual space in a more or less coherent manner. No, the multiverse is different: all universes at once, with the characters living their lives simultaneously in all of them. But there is some leakage between them.

There’s plenty of martial arts tomfoolery, as you might expect in a film starring Michelle Yeoh. But for all the bling and fling across the galaxies, this isn’t a martial arts movie, nor a science fiction movie. It’s a family melodrama. Here’s what A.O. Scott says:

... it’s a bittersweet domestic drama, a marital comedy, a story of immigrant striving and a hurt-filled ballad of mother-daughter love.

At the center of it all is Evelyn Wang, played by the great Michelle Yeoh with grace, grit and perfect comic timing. Evelyn, who left China as a young woman, runs a laundromat somewhere in America with her husband, Waymond (Ke Huy Quan). Her life is its own small universe of stress and frustration. Evelyn’s father (James Hong), who all but disowned her when she married Waymond, is visiting to celebrate his birthday. An I.R.S. audit looms. Waymond is filing for divorce, which he says is the only way he can get his wife’s attention. Their daughter, Joy (Stephanie Hsu), has self-esteem issues and also a girlfriend named Becky (Tallie Medel), and Evelyn doesn’t know how to deal with Joy’s teenage angst or her sexuality.

But it takes a while for that to reveal itself amid all the world hopping.

As a big Ratatouille fan I especially appreciated the ongoing riff on that, a chef with a raccoon on his head puppet-mastering his actions. Who knows what else is being riffed on. There are also segments where two rocks sit high on a cliff and think to one another. And the animated child’s drawings. The piñatas.

You get the idea.

Go see for yourself – but avoid the recliner seats if you can.

It looks like things have gotten back to normal on my Academia page, Part 2

I'd previously reported that my view-count at Academia.edu was blowing up:

They're still back to normal. I don't expect any deviations from that in the near future.

But I do wonder what happened.

Neural connectivity during various activities [behavioral mode]

Here are the first two tweets in a stream of eight:

Here's the abstract and author's summary from the article, Latent functional connectivity underlying multiple brain states:


Functional connectivity (FC) studies have predominantly focused on resting state, where ongoing dynamics are thought to reflect the brain’s intrinsic network architecture, which is thought to be broadly relevant because it persists across brain states (i.e., is state-general). However, it is unknown whether resting state is the optimal state for measuring intrinsic FC. We propose that latent FC, reflecting shared connectivity patterns across many brain states, better captures state-general intrinsic FC relative to measures derived from resting state alone. We estimated latent FC independently for each connection using leave-one-task-out factor analysis in seven highly distinct task states (24 conditions) and resting state using fMRI data from the Human Connectome Project. Compared with resting-state connectivity, latent FC improves generalization to held-out brain states, better explaining patterns of connectivity and task-evoked activation. We also found that latent connectivity improved prediction of behavior outside the scanner, indexed by the general intelligence factor (g). Our results suggest that FC patterns shared across many brain states, rather than just resting state, better reflect state-general connectivity. This affirms the notion of “intrinsic” brain network architecture as a set of connectivity properties persistent across brain states, providing an updated conceptual and mathematical framework of intrinsic connectivity as a latent factor.

Author Summary

The initial promise of resting-state fMRI was that it would reflect “intrinsic” functional relationships in the brain free from any specific task context, yet this assumption has remained untested until recently. Here we propose a latent variable method for estimating intrinsic functional connectivity (FC) as an alternative to rest FC. We show that latent FC outperforms rest FC in predicting held-out FC and regional activation states in the brain. Additionally, latent FC better predicts a marker of general intelligence measured outside of the scanner. We demonstrate that the latent variable approach subsumes other approaches to combining data from multiple states (e.g., averaging) and that it outperforms rest FC alone in terms of generalizability and predictive validity.

This article speaks to an idea that was, I believe, first articulated by Warren McCulloch. He argued that for each specific behavioral mode (and here) – hunting, eating, sex, sleep, etc. – there is a particular pattern of brain activity, with some regions more active than others. That is, the brain doesn't have specific modules for each activity, but rather specific patterns of activation over the whole brain.

My visit to Popeye's

Monday, April 25, 2022

Semanticity: adhesion and relationality

For some time now I have been puzzled by the (astonishing) success of statistical models, especially artificial neural networks, in language processing, machine translation in particular – see, e.g., this post, Borges redux: Computing Babel – Is that what’s going on with these abstract spaces of high dimensionality? [#DH], which dates back to October 2017. Sure, it is statistics, but what are those statistics “grabbing on to?” There is no meaning there, just (naked) word forms. What is it about those word-forms-in-context that yields an approximation to, a simulacrum of, meaning? Better: what is there such that we readily read meaning, intention, into the results of these statistical techniques?

Semanticity and intention

My puzzlement reached a climax with the unveiling of GPT-3 in 2020 and I decided to take another run at the problem. I produced a working paper, GPT-3: Waterloo or Rubicon? Here be Dragons, which I liked very much. I made real progress. I now think I can nudge things forward another step. Look at this passage, where I discuss the Chinese Room thought-experiment (p. 28):

Yet if you would believe John Searle, no matter how rich and detailed those old school mental models, understanding would necessarily elude them. I am referring, of course, to his (in)famous Chinese Room argument. When I first encountered it years ago my reaction was something like: interesting, but irrelevant. Why irrelevant? Because it said absolutely nothing about the techniques AI or cognitive science investigators used and so would provide no guidance toward improving that work. He did, however, have a point: If the machine has no contact with the world, how can it possibly be said to understand anything at all? All it does is grind away on syntax.

What Searle misses, though, is the way in which meaning is a function of relations among concepts, as I pointed out earlier (pp. 18 ff.). It seems to me, however – and here I’m just making this up – we can think of meaning as having both an intentional aspect, the connection of signs to the world, and a relational aspect, the relations of signs among themselves. Searle’s argument concentrated on the former and said nothing about the latter.

What of the intentional aspect when a person is writing or talking about things not immediately present, which is, after all, quite common? In this case the intentional aspect of meaning is not supported by the immediate world. Language use thus must necessarily be driven entirely by the relations signifiers have among themselves, Sydney Lamb’s point which we have already investigated (p. 18).

Those statistics are grabbing onto the relational aspect of meaning. The question is: How much of that can these methods recover from texts? Let’s set that aside for the moment.
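A toy sketch may make the point concrete: purely distributional statistics, operating on naked word forms, recover relational similarity without any contact with the world. The corpus, the window size, and the similarity measure below are my own illustrative choices, not anything drawn from GPT-3 or any real model; nothing in this code knows what a cat is.

```python
from collections import Counter
from math import sqrt

# Tiny illustrative corpus: "cat" and "kitten" occur in similar contexts; "car" does not.
corpus = [
    "the cat chased the mouse",
    "the kitten chased the mouse",
    "the cat drank the milk",
    "the kitten drank the milk",
    "the car needs new tires",
    "the car burns too much fuel",
]

def cooccurrence(sentences, window=2):
    """Build, for each word, a count of the words appearing near it."""
    vectors = {}
    for sent in sentences:
        words = sent.split()
        for i, w in enumerate(words):
            vec = vectors.setdefault(w, Counter())
            for j in range(max(0, i - window), min(len(words), i + window + 1)):
                if j != i:
                    vec[words[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u)  # Counter returns 0 for missing keys
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vecs = cooccurrence(corpus)
print(cosine(vecs["cat"], vecs["kitten"]))  # high: shared contexts
print(cosine(vecs["cat"], vecs["car"]))     # lower: contexts differ
```

Words that occur in similar contexts end up with similar vectors; that is the relational aspect, and it is the only aspect a text-only model can see.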

That passage mentions intention and relation. Intention resides in the relationship between a person and the world. Relation resides in the relationships that signifiers have among themselves. It is a property of the cognitive system. I am now thinking that it must be paired with adhesion. Taken together they constitute semanticity. Thus we have semanticity and intention where semanticity is a general capacity inherent in the cognitive system, in a person’s mind, and intention inheres in the relation between a person and the world in a particular perceptual and/or cognitive activity.

What do I mean by adhesion? Adhesion is how words ‘cling’ to the world, while relationality is the differential interaction of words among themselves within the linguistic system. Words whose meaning is defined directly over the physical world, but also, to some extent, over the interpersonal world of signals and feeling, adhere to the world through sensorimotor schemas. Words whose meaning is abstract are more problematic. Their adhesion operates through patterns of words and other signs and symbols (e.g. mathematics, data visualizations, illustrative diagrams of various kinds, and so forth). Teasing out these systems of adhesion has just barely begun.

The psychologist J.J. Gibson talked of the affordances an environment presents to the organism. Affordances are the features of the world which an organism can readily pick up during its life in the world. Adhesions are the organism’s complement to environmental affordances; they are the perceptual devices through which the organism relates to those affordances.

What this means for language models

Large language models built through deep neural networks, such as GPT-3, conflate the interaction of three phenomena: 1) the word-level relational aspect of semanticity as captured in the locations of word forms (signifiers) in a string, 2) the conventions of discourse structure, and 3) the world itself. The world is present in the model because the texts over which the model was constructed were created by people interacting in the world. They were in an intentional relationship with the world when they wrote those texts. The conventions of discourse are present simply because they organize the placement of word forms in a text, with special emphasis on the long-distance relationships of words. As for relationality, that’s all that can possibly be present in a text. Adhesions belong to the realm of signifieds, of concepts and ideas, and they aren’t in the text itself.

Would it somehow be possible to factor a language model into these three aspects? I have no idea. The point of doing so would be to reduce the overall size of the model.

Putting that aside, let us ask: Given a sufficiently large database of texts and tokens and a high enough number of parameters for our model, is it possible for a language model to extract all the relationality from the texts? How much of that multidimensional relational semanticity can be recovered from strings of word forms? Given a deep enough understanding of how relational semantics is reflected in the structure of texts, can we calculate what is possible with various text bases and model parameterization?

To answer those questions we need to have some account of semantic relationality which we can examine. The models of Old School symbolic AI and computational linguistics provide such accounts. Many such models have been created. Which ones would we choose as the basis for our analysis? The sort of question that interests me is how many word forms have their meanings given in adhesions to the physical world (that is, physical objects and events), to the interpersonal world (facial expressions, gestures, etc.) and how many word forms are defined abstractly?

So many questions. 

* * * * *

I have appended this to my GPT-3 working paper, which is now Version 3.

Blues @ 3 Quarks Daily [East Asian style]

I’ve taken eight examples from my current series on the blues and written an article for 3 Quarks Daily:

Tell me about the blues, 3 Quarks Daily, April 25, 2022

* * * * *

The blues has long since made its way around the world. I’ve got two examples from East Asia. The first is a 2014 edition of the Big Friendly Jazz Orchestra (BFJO) from Takasago High School in Japan. They’re playing a Basie tune, by Frank Foster, which I examined here. That was the Basie Band in 1959. I’m guessing that the parents of most of these Japanese students (notice that most of them are young women) weren’t even alive then.

Blues In Hoss’ Flat (Frank Foster) / BFJO 2014 team Ota, with Guest Players

They’re not all high school students. There are some adult ringers. Listen to their ensemble work. How tight! I'd like to think Frank Foster is grinning from ear to ear while listening to this performance.

* * * * *

This one, from Korea, is a bit more recent, 2019. Kwak Dakyung is sitting in on a blues jam. She appears to be about 10 years old here. I devoted a post to her about a year ago.

[윤아트홀] Blues Jam - 곽다경 (재즈 트럼펫 / Jazz Trumpet)

The first tune is a minor key blues. It appears to me that her father (the bass player) just tossed her into the fray, leaving it up to her to figure out what to do. Listen to her testing things out during the guitar and piano solos, figuring out where things lay. Now she’s got it, 3:23. Now she dialogs with the guitar, 4:31. Now we pick up the tempo, 5:49. But you don’t need me to point out what’s going on, do you? She’s still figuring it out.

But then aren’t we all?

Here's a blues playlist I put together at YouTube, though some non-blues tunes managed to slip in.

Acute angles [Manhattan, Hoboken]

Sunday, April 24, 2022

On the Differences between Artificial and Natural Minds: Another version of my intellectual biography

A couple weeks ago Tyler Cowen’s Emergent Ventures announced an interest in funding work in artificial intelligence (AI). I decided to apply. The application was relatively short and straightforward: Tell us about yourself and tell us what you want to do. So that’s what I did. I ended up recounting my intellectual career from “Kubla Khan” to attractor nets.

So, I’ve reproduced that narrative below, except for the final paragraph where I ask for money. It joins the many pieces I’ve written about my intellectual life. I list most of them, with links, after the narrative.

* * * * *

In a recent interview with Karen Hao, Geoffrey Hinton proclaimed, “I do believe deep learning is going to be able to do everything” (MIT Technology Review, 11.3.2020). His faith is rooted in the remarkable success of deep learning in the past decade. This notion of AI omnipotence has deep cultural roots (e.g. Prospero and his magic) and is the source of both wild techno-optimism and apocalyptic fears about future relations between AI and humanity. Momentum seems to be on Hinton’s side. I believe, however, that by establishing a robust and realistic view of the actual difference between artificial and natural intelligence, we can speed progress by tamping down both the hyperbolic claims and the fears.

In the 2010s I employed a network notation developed by Sydney Lamb (computational linguistics) to sketch out how salient features in the high-dimensional geometry of complex neurodynamics could map into a classical symbolic system. (Gary Marcus argues that Old School symbolic computing is necessary to handle common sense reasoning and complex thought processes.) My hypothesis is that the highest-level processes of human intelligence are best conceived in symbolic terms and that Lamb’s notation provides a coherent way of showing how symbols can impose high-level organization on those “big vectors of neural activity” that Hinton talks about.

Here is a quick account of how I arrived at that hypothesis.

For my Master’s Thesis at Johns Hopkins in 1972 I demonstrated that Coleridge’s “Kubla Khan” was a poetic map of the mind, structured like a pair of matryoshka dolls, each nested three deep. It “smelled” of an underlying computational process, nested loops perhaps. Over a decade later I published that analysis in Language and Style (1985) – at the time perhaps the premier journal about language and literature.

In 1973 I started studying for a PhD in English at SUNY Buffalo. The department was in the forefront of postmodern theory and known for its encouragement of interdisciplinary boldness, with Rene Girard, Leslie Fiedler, Norman Holland and several prominent postmodern writers on the faculty. There I met David Hays in the linguistics department. He had led the RAND Corporation’s team on machine translation in the 1950s and 1960s and later coined the term “computational linguistics.” I joined his research group and used computational semantics to analyze a Shakespeare sonnet, “The Expense of Spirit.” I published that analysis in 1976 in the special 100th anniversary issue of MLN (Modern Language Notes) – an intellectual first. Much of my 1978 dissertation, “Cognitive Science and Literary Theory,” consisted of semi-technical work in knowledge representation, including the first iteration of an account of cultural evolution that Hays and I would publish in a series of essays in the 1990s.

Prior to meeting Hays I had been attracted by a 1969 Scientific American article in which Karl Pribram, a Stanford neuroscientist, argued that vision and the brain more generally operated on mathematical principles similar to those underlying optical holography, principles also used in current convolutional neural networks. Neural holography played a central role in a pair of papers Hays and I published in the 1980s, “Metaphor, Recognition, and Neural Process” (American Journal of Semiotics, 1987), and “The Principles and Development of Natural Intelligence” (Journal of Social and Biological Structures, 1988). Drawing on a mathematical formulation by Miriam Yevick, both papers developed a distinction between holographic semantics and compositional semantics (symbols) and argued that language and higher cognitive processes required interaction between the two.

I spent the summer of 1981 working on a NASA project, Computer Science: Key to a Space Program Renaissance, leading the information systems group. I left the academic world in 1985 – I’d been on the faculty of the Rensselaer Polytechnic Institute – and collaborated with Richard Friedhoff on a coffee-table book about computer graphics and image processing, Visualization: The Second Computer Revolution (Abrams 1989). During this period Hays and I began publishing our articles on cultural evolution, beginning with “The Evolution of Cognition” (Journal of Social and Biological Structures, 1990). We argued that the development of a major new conceptual instrument, such as writing across the ancient world, enabled a new cognitive architecture, and that new architecture in turn supported new modes of thought and invention. When Europe had fully absorbed positional decimal arithmetic from the Arabs, the result was a new conceptual architecture which enabled the scientific and industrial revolutions and indirectly, the novel. The twentieth century saw the development of the computer, first conceptually, and then implemented in electronic technology at mid-century. Another new cognitive architecture emerged, but also modernism in the arts.

At the end of the 1990s I entered into extensive correspondence with Berkeley’s Walter Freeman about complex neurodynamics. That work became central to the account of music I developed in Beethoven’s Anvil: Music in Mind and Culture (Basic Books, 2001). Meanwhile literary scholars were finally discovering cognitive science. I jumped back into the fray and published several articles, including a general theoretical and methodological piece, “Literary Morphology: Nine Propositions in a Naturalist Theory of Form” (PsyArt: An Online Journal for the Psychological Study of the Arts, 2006). I argued, among other things, that literary form could be expressed computationally in the way that, say, parentheses give form to LISP expressions. My early work on “Kubla Khan” and “The Expense of Spirit” exemplifies that notion of computational form, which I also discussed in “The Evolution of Narrative and the Self” (Journal of Social and Evolutionary Systems, 1993). Over the last two decades I have described and analyzed over 30 texts and films from this perspective, though most of that work is in informal working papers posted to Academia.edu, where I rank in the 99.9th percentile of publications viewed.

I am now who knows how many miles into my 1000-mile journey. The full range of the work I’ve done over a half century, all of it with computation in mind – language, literature, music, cultural evolution – remains open for further exploration. I am now ready to make significant progress on the problem that started my journey: the form and semantic structure of “Kubla Khan.” In so doing I intend to clarify the difference between natural and artificial intelligence.

“Kubla Khan” is one of the greatest English-language poems and has left its mark deep in popular culture. It has a rich formal structure and through that draws on the full range of human mental capacities. By explicating them I will propose a minimal, but explicit, set of capabilities for a truly general intelligence and show how they work together to produce a coherent object, a poem. I expect to show – though I can’t be sure of this – that some of those capacities are beyond the range of silicon.

I undertake to do so, not to save the human from the artificial, but to liberate the artificial from our narcissistic investment in it - the tendency to project our fears of the unknown and anxieties about the future onto our digital machines. Only when we have clarified the difference between natural and artificial intelligence will we be able to assess the potential dangers posed by powerful artificial mentalities. Artificial intelligence can blossom and flourish only if it follows a logic intrinsic and appropriate to it.

As futurist Roy Amara noted: We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run. So it is with AI. Fear of human-level AI is short-term while the transformative effects of other-than-human AI will be long term.

I don’t intend to craft code. I’m looking to define boundaries and mark trails. I have spent a career examining qualitative phenomena and characterizing them in terms making them more accessible to investigators with technical skills I lack. I seek to provide AI with ambitious and well-articulated goals that are richer rather than simply “bigger and still bigger.”

* * * * *

The break – How I ended up on Mars

Here’s one way I’ve come to think about my career: I set out to hitch rides from New York City to Los Angeles. I don’t get there. My hitch-hike adventure failed. But if I ended up on Mars, what kind of failure is that? Lost on Mars! Of course, it might not actually be Mars. It might be an abandoned set on a studio back lot. Ever since then I’ve been working my way back to earth.

This material is about how I ended up on Mars while on the way to LA. That is, it is about how I set out to analyze “Kubla Khan” within existing frameworks but ended up outside those frameworks.

Touchstones • Strange Encounters • Strange Poems • the beginning of an intellectual life

This is about my undergraduate years at Johns Hopkins and my years as a master’s student in the Humanities Center, where I wrote my thesis on “Kubla Khan.” This is how I became an independent thinker with my own intellectual agenda. Among other things, it talks about the role that some altered mental states played – two having nothing to do with drugs, one an LSD trip (that wasn't trippy in the standard sense) – in my early intellectual development. If you read only one of these pieces, this is the one.

Into Lévi-Strauss and Out Through “Kubla Khan”

This is a story told in diagrams, about how I went from Lévi-Strauss style structuralism to the computationally inspired semantic networks of cognitive science. Read this second.

One problem with deep learning, overfitting

Spatially distributed cortical circuits

Abstract of the linked article:

The traditional view of neural computation in the cerebral cortex holds that sensory neurons are specialized, i.e., selective for certain dimensions of sensory stimuli. This view was challenged by evidence of contextual interactions between stimulus dimensions in which a neuron’s response to one dimension strongly depends on other dimensions. Here, we use methods of mathematical modeling, psychophysics, and electrophysiology to address shortcomings of the traditional view. Using a model of a generic cortical circuit, we begin with the simple demonstration that cortical responses are always distributed among neurons, forming characteristic waveforms, which we call neural waves. When stimulated by patterned stimuli, circuit responses arise by interference of neural waves. Results of this process depend on interaction between stimulus dimensions. Comparison of modeled responses with responses of biological vision makes it clear that the framework of neural wave interference provides a useful alternative to the standard concept of neural computation.

I tip-toed among them, what are they? [tulips]

Saturday, April 23, 2022

Tell me about the blues: Trane, Ornette, Hannibal

Where have we been? We started with Old School jazz, Bessie Smith, Louis Armstrong, and Henry Allen. Then we zipped ahead a couple decades to hear the very different blues of Miles Davis, Wayne Shorter, and Frank Foster, blues shorn of much of the funk, but not the juice. The juice is always in the performer, not the specifications of the tune. Then we went back to fill in the interval. Duke Ellington and Count Basie aren’t so far from Louis Armstrong, but they work with the full force of a big band and so have more sonic resources and have to devote more attention to scheming out the course of a performance.

Then Charlie Parker showed us what happens when you fill the blues up with a zillion chord changes. Can you push that any further without the form collapsing in on itself? Monk increased melodic angularity to the point where the melody split into two streams. Mingus turned the form back on itself into an endless loop. How does it end? Miles washed all that away in the radical simplification of “All Blues.”

Now we’re going to continue on with John Coltrane’s “Mr. P.C.”; jump off the deep end – well, not quite – with Ornette Coleman’s “Broad Way Blues”; and become a snake swallowing its own tail with Hannibal Lokumbe.

John Coltrane: Mr. P.C.

John Coltrane is a tenor saxophonist who came up in the late 1940s on through the 1950s and apprenticed with Dizzy Gillespie, Thelonious Monk, and Miles Davis. That is to say, he came up as a bebopper, learning that style inside-out. Through his work with Davis he got ready to move on. That’s what we see him doing in “Mr. P.C.”, from his Giant Steps album, which was released early in 1960. He’s poised at the precipice.

Giant Steps came out at roughly the same time as Kind of Blue (Miles Davis), on which Coltrane played. The title tune, “Giant Steps,” is known for its difficult chord changes to be taken at a furious tempo. “Mr. P.C.” – named after Coltrane’s long-time bass player, Paul Chambers – is just as fast, perhaps faster, but the chord changes are more tractable. It’s a 12-bar blues in a minor key, which is common enough, but not so common as blues in a major key (though we should keep in mind that the blues tonality, if you will, is neither major nor minor, but something else). The fact that it is in a minor key puts it within the ambit of the modal jazz on Kind of Blue and which Coltrane would embrace whole-heartedly soon thereafter.
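For readers who want the grid on paper, here is a common minor-key 12-bar template, sketched as a list of chord symbols, one per bar. This is a generic template of my own devising, not a transcription of what the rhythm section actually plays on the record; I’ve written it in C minor, the key “Mr. P.C.” is usually played in.

```python
# A generic minor-key 12-bar blues grid, one chord symbol per bar.
# Illustrative template only -- not a transcription of "Mr. P.C."
minor_blues = [
    "Cm7", "Cm7", "Cm7", "Cm7",   # bars 1-4:  home on the minor i
    "Fm7", "Fm7", "Cm7", "Cm7",   # bars 5-8:  up to iv, back to i
    "Ab7", "G7",  "Cm7", "Cm7",   # bars 9-12: bVI-V turnaround, then home
]
assert len(minor_blues) == 12
```

Compare this with a major-key blues: the skeleton (I for four bars, IV back to I, a V-based turnaround) is the same; only the chord qualities change.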

Let’s listen:

Coltrane starts right on the melody, which takes a standard blues form:

1) The first 4-bar phrase consists of a 2-bar phrase that goes up and then comes back down (bars 1 & 2); it is followed by a 4-note phrase in the third bar, which alternates between two pitches.
2) The second 4-bar phrase repeats the opening 2 bars, but transposes them up to follow the shift in harmony (bars 5 & 6); it is followed by the same phrase we heard in bar 3, on the same pitches.
3) The phrase in bars 9 and 10 is different from those in bars 1 and 2 and 5 and 6, but returns to that same 4-note phrase in bar 11.


Coltrane plays the melody twice and then launches into a solo (0:22). Listen closely for phrases that he repeats at various points in the solo. Listen as well for the parts where he hangs on a dissonant note (e.g. 0:28, 0:43, 0:49, etc.) and where he plays little (and not so little) runs that go chromatic (e.g. 0:34). There’s a nice cry at 2:26. In the next chorus he plays rapid arpeggios (starting at 2:37), down and up and down and up, for eight bars or so; there are more rapid runs in the next chorus (starting at 2:53). He’s building up a head of steam. Trane finishes his last chorus with the melody (3:18). He’s signaling that he’s done. Now Tommy Flanagan gets a piano solo – forgive me if I don’t offer some play-by-play; I’m burnt out. At 4:54 Trane starts to trade fours (that is, four-bar phrases) with the drummer, Art Taylor. We return to the head at 6:25, for two choruses, and we’re out.

Ornette Coleman: Broad Way Blues

Ornette Coleman is an alto saxophonist and Coltrane’s contemporary, a couple of years younger. But, while Trane was based on the East Coast, Coleman matured out West, in Los Angeles. Nor did he apprentice in bebop. He created his own apprenticeship. Thus when he came East in 1959 no one knew what he was doing. He played a plastic alto sax and his compatriot, Don Cherry, played a funny-looking little pocket trumpet. Are these guys for real, playing toy instruments and music that don’t make no sense no how? Sheesh! Yes, they’re for real. Some people dug them straight off, others had to warm up, and some never did.

He recorded “Broad Way Blues” in 1968 on New York is Now. It’s not a blues in any conventional way of thinking about such matters. And yet, in a somewhat less conventional way, it’s as all blues as “All Blues,” if not more so. Why don’t you give it a listen and see if it makes sense to you:

Make sense? Yes? No? Maybe? Or perhaps you did not try to make sense of it and it just sounded wonderful, though perhaps a bit sideways.

It sounded that way to me. When I tried to count it out, I got lost. So I looked at the sheet music. 22 bars long. That’s not a blues form, it’s not a 32-bar standard, it’s not made of 4-bar phrases (adding up to, say, 16 bars). And there are some bars in there that don’t count out to four beats; they’re in six. When they solo – Dewey Redman joins Ornette on tenor – time undergoes a different discipline, courtesy of Elvin Jones.

Let’s take it easy. I’ve been doing this for a while and my brain is running low on the neurotransmitters that make for quick shifts of concentration. So I’m going to minimize my attempts to get timings for you. But I’ll do this much: The melody starts at the beginning. That descending line at about 0:13 – the first one in the tune – is in six. We play a quick cyclic flurry across the bar and land on two quarter notes to end the first part of the head. We return to the opening figure at about 0:17, at bar 14, to begin the second part of the head. The descending line at 0:21 is in six. We jump to come down in the next bar, up for a bar, and down for two. We’re at the end of the head. Starting at 0:28 we go around again, then into solos, where everything is up for grabs.

So listen and grab some why don’t you. Coleman is endlessly lyrical, no?