NEW SAVANNA: August 2022

Tuesday, August 30, 2022

Two boys and a photo of the Brooklyn Bridge [Hoboken]

A note on AGI as concept and as shibboleth [kissing cousin to singularity]

Artificial intelligence was ambitious from its beginnings in the mid-1950s; this or that practitioner would confidently predict that before long computers could perform any mental activity that humans could. As a practical matter, however, AI systems tended to be narrowly focused. The field’s grander ambitions seemed ever in retreat. Finally, at long last, one of those grand ambitions was realized. In 1997 IBM’s Deep Blue beat world champion Garry Kasparov in chess.

But that was only chess. Humans remained ahead of computers in all other spheres and AI kept cranking out narrowly focused systems. Why? Because cognitive and perceptual competence turned out to require detailed procedures. The only way to accumulate the necessary density of detail was to focus on a narrow domain.

Meanwhile, in 1993 Vernor Vinge delivered a paper, Technological Singularity, at a NASA Symposium and then published it in The Whole Earth Review.

Progress in hardware has followed an amazingly steady curve in the last few decades. Based on this trend, I believe that the creation of greater-than-human intelligence will occur during the next thirty years. (Charles Platt has pointed out that AI enthusiasts have been making claims like this for thirty years. Just so I'm not guilty of a relative-time ambiguity, let me be more specific: I'll be surprised if this event occurs before 2005 or after 2030.)

What are the consequences of this event? When greater-than-human intelligence drives progress, that progress will be much more rapid. In fact, there seems no reason why progress itself would not involve the creation of still more intelligent entities – on a still-shorter time scale. The best analogy I see is to the evolutionary past: Animals can adapt to problems and make inventions, but often no faster than natural selection can do its work – the world acts as its own simulator in the case of natural selection. We humans have the ability to internalize the world and conduct what-if’s in our heads; we can solve many problems thousands of times faster than natural selection could. Now, by creating the means to execute those simulations at much higher speeds, we are entering a regime as radically different from our human past as we humans are from the lower animals.

This change will be a throwing-away of all the human rules, perhaps in the blink of an eye – an exponential runaway beyond any hope of control. Developments that were thought might only happen in “a million years” (if ever) will likely happen in the next century.

That got people’s attention, at least in some tech-focused circles, and provided a new focal point for thinking about artificial intelligence and the future. It’s one thing to predict and hope for intelligent machines which will do this, that, and the other. But rewriting the nature of history, top to bottom, that’s something else again.

In the mid-2000s Ben Goertzel and others felt the need to rebrand AI in a way more suited to the grand possibilities that lay ahead. Goertzel noted:

In 2002 or so, Cassio Pennachin and I were editing a book on approaches to powerful AI, with broad capabilities at the human level and beyond, and we were struggling for a title. The provisional title was “Real AI” but I knew that was too controversial. So I emailed a bunch of friends asking for better suggestions. Shane Legg, an AI researcher who had worked for me previously, came up with Artificial General Intelligence. I didn’t love it tremendously but I fairly soon came to the conclusion it was better than any of the alternative suggestions. So Cassio and I used the term for the book title (the book “Artificial General Intelligence” was eventually published by Springer in 2005), and I began using it more broadly.

Goertzel realized the term had its limitations. “Intelligence” is a vague idea and, whatever it means, “no real-world intelligence will ever be totally general.” Still:

It seems to be catching on reasonably well, both in the scientific community and in the futurist media. Long live AGI!

My point, then, is that the term did not refer to a specific technology or set of mechanisms. It was an aspirational term, not a technical one.

And so it remains. As Jack Clark tweeted a few weeks ago:

Discussions about AGI tend to be pointless as no one has a precise definition of AGI, and most people have radically different definitions. In many ways, AGI feels more like a shibboleth used to understand if someone is in- or out-group wrt some issues.
— Jack Clark (@jackclarkSF) August 6, 2022

For that matter, “the Singularity” is a shibboleth as well. The two of them tend to travel together.

The impulse behind AGI is to rescue AI from its narrow concerns and focus our attention on grand possibilities for the future.

Cab Calloway and his band on the road [railroad]

Photographic print of Cab Calloway and his band in a sleeper car, 1933 #openaccess #museumarchive https://t.co/yzXR22JyPQ pic.twitter.com/B2dy5KkdW9
— SI: Museum of African American History (Bot) (@si_nmaahc) August 28, 2022

Audiences and critics are now disagreeing about blockbusters

Fans and critics agree more often than not. The average difference in their ratings is about 5 points.

This year? It is 19. Here are the 10 highest-grossing movies of 2022. https://t.co/y5OwrrQm9C pic.twitter.com/2d41uk1JY4
— Lucas Shaw (@Lucas_Shaw) August 28, 2022

FWIW there have been years where critics liked movies as much if not more than fans.

In 2005, the 10 biggest movies got higher scores from critics than fans. pic.twitter.com/JVtLPUuf9x
— Lucas Shaw (@Lucas_Shaw) August 28, 2022

Lots more in this week's newsletter, including methodology and the one year fans/critics were in near total agreement about blockbusters. Read it if you like movies, or just like arguing.

And if you dig it, you can always subscribe -->https://t.co/aqvwunFRqC
— Lucas Shaw (@Lucas_Shaw) August 28, 2022

Monday, August 29, 2022

Sun and shade [Hoboken]

My current thoughts on AI Doom as a cult(ish) phenomenon

I’ve been working on an article in which I argue that belief in the likelihood that future AI poses an existential risk to humanity – AI Doom – has become the focal point of cult behavior. I’m trying to figure out how to frame the argument. Which is to say, I’m trying to figure out what kind of an argument to make.

This belief has two aspects: 1) human level artificial intelligence, aka artificial general intelligence (AGI), is inevitable, and may well, very likely will, inevitably will lead to superintelligence, and 2) this intelligence will very likely turn on us, either deliberately or inadvertently. If both of those are true, then belief in AI Doom is rational. If neither is true, then belief in AI Doom is mistaken. Cult behavior, then, is undertaken to justify and support these mistaken beliefs.

That last sentence is the argument I want to make. I don’t want to actually argue that the underlying beliefs are mistaken. I wish to assume that. That is, I assume that on this issue our ability to predict the future is dominated by WE DON’T KNOW.

Is that a reasonable assumption? That is to say, I don’t believe that those ideas are a reasonable way to confront the challenges posed by AI. And I’m wondering what motivates such maladaptive beliefs.

On the face of it such cult behavior is an attempt to assert magical control over phenomena which are beyond our capacity to control. It is an admission of helplessness.

* * * * *

I drafted the previous paragraphs on Friday and Saturday (August 26th and 27th) and then dropped it because I didn’t quite know what I was up to. Now I think I’ve figured it out.

Whether or not AGI and AI Doom are reasonable expectations for the future, that’s one issue, and it has its own set of arguments. That a certain (somewhat diffuse) group of people have taken those ideas and created a cult around them, is a different issue. And that’s the argument I want to make. In particular, I’m not arguing that those people are engaging in cult behavior as a way of arguing against AGI/AI Doom. That is, I’m not trying to discredit believers in AGI/AI Doom as a way of discrediting AGI/AI Doom. I’m quite capable of arguing against AGI without saying anything at all about the people.

As far as I know, the term “artificial general intelligence” didn’t come into use until the mid-2000s, and focused concern about rogue AI began consolidating in the subsequent decade, well after the first two Terminator films (1984, 1991). It got a substantial boost with the publication of Nick Bostrom’s Superintelligence: Paths, Dangers, Strategies in 2014, which introduced the (in)famous paperclip scenario, in which an AI tasked with creating as many paperclips as possible proceeds to convert the earth's surface into paperclips.

One can believe in AGI without being a member of a cult and one can fear future AI without being a member of a cult. Just when these beliefs became cultish, that’s not clear to me. What does that mean, became cultish? It means, or implies, that people adopt those beliefs in order to join the group. But how do we tell that that has happened? That’s tricky and I’m not sure.

* * * * *

I note, however, that the people I’m thinking about – you’ll find many of them hanging out at places like LessWrong, Astral Codex Ten, and OpenPhilanthropy – tend to treat the arrival of AGI as a natural phenomenon, like the weather, over which they have little control. Yes, they know that the technology is created by humans, many of them are their friends, they may themselves be actively involved in AI research, and, yes, they want to slow AI research and influence it in specific ways, but they nonetheless regard the emergence of AGI as inevitable. It’s ‘out there’ and is happening. And once it emerges, well, it’s likely to go rogue and threaten humanity.

The fact is the idea of AGI is vague, and, whatever it is, no one knows how to construct it. There’s substantial fear that it will emerge through scaling up of current machine learning technology. But no one really knows that or can explain how it would happen.

And that’s what strikes me as so strange about the phenomenon, the sense of helplessness. All they can do is make predictions about when AGI will emerge and sound the alarm to awaken the rest of us, but if that doesn’t work, we’re doomed.

How did this come about? Why do these people believe that? Given that so many of these people are actively involved in creating the technology – see this NYTimes article from 2018 in which Elon Musk sounds the alarm while Mark Zuckerberg dismisses his fears – one can read it as a narcissistic over-estimation of their intellectual prowess, but I’m not sure that covers it. Or perhaps I know what the words mean, but what the assertion means, I don’t understand it very well. I mean, if it’s narcissism, it’s not at all obvious to me that Zuckerberg is less narcissistic than Musk. To be sure, that’s just two people, but many in Silicon Valley share their views.

Of course, I don’t have to explain why in order to make the argument that we’re looking at cult behavior. That’s the argument I want to make. How to make it?

More later.

Sunday, August 28, 2022

At the docks [Hoboken]

Elemental Cognition is ready to deploy hybrid AI technology in practical systems

Steve Lohr, One Man's Dream of Fusing A.I. With Common Sense, NYTimes, Aug. 28, 2022

David Ferrucci is best known as the researcher who led the team that developed IBM's Watson, which beat the best human players of Jeopardy in 2011. He left IBM a year later and formed his own company, Elemental Cognition, in 2015. Elemental Cognition is taking a hybrid approach, combining aspects of machine learning and symbolic computation.

Elemental Cognition has recently developed a system that helps people plan and book round-the-world airline tickets:

The round-the-world ticket is a project for oneworld, an alliance of 13 airlines including American Airlines, British Airways, Qantas, Cathay Pacific and Japan Airlines. Its round-the-world tickets can have up to 16 different flights with stops of varying lengths over the course of a year.

Elemental Cognition supplies the technology behind a trip-planning intelligent agent on oneworld’s website. It was developed over the past year and introduced in April.

The user sees a global route map on the left and a chatbot dialogue begins on the right. A traveler starting from New York types in the desired locations — say, London, Rome and Tokyo. “OK,” replies the chatbot, “I have added London, Rome and Tokyo to the itinerary.”

Then, the customer wants to make changes — “add Paris before London,” and “replace Rome with Berlin.” That goes smoothly, too, before the system moves on to travel times and lengths of stays in each city.

Rob Gurney, chief executive of oneworld, is a former Qantas and British Airways executive familiar with the challenges of online travel planning and booking. Most chatbots are rigid systems that often repeat canned answers or make irrelevant suggestions, a frustrating “spiral of misery.”

Instead, Mr. Gurney said, the Elemental Cognition technology delivers a problem-solving dialogue on the fly. The rates of completing an itinerary online are three to four times higher than without the company’s software.

Elemental Cognition has developed an approach that all but eliminates the hand-coding typical of symbolic A.I.:

For example, the rules and options for a global airline ticket are spelled out in many pages of documents, which are scanned.

Dr. Ferrucci and his team use machine learning algorithms to convert them into suggested statements in a form a computer can interpret. Those statements can be facts, concepts, rules or relationships: Qantas is an airline, for example. When a person says “go to” a city, that means add a flight to that city. If a traveler adds four more destinations, that adds a certain amount to the cost of the ticket.

In training the round-the-world ticket assistant, an airline expert reviews the computer-generated statements, as a final check. The process eliminates most of the need for hand coding knowledge into a computer, a crippling handicap of the old expert systems.

There's more at the link.
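
To make the quoted description a bit more concrete, here is a minimal sketch of what reviewable statements and a "go to" rule might look like. Everything in it (the `Statement` record, `expert_review`, `apply_go_to`, the deliberately bad suggestion) is a hypothetical illustration of the idea in the article, not Elemental Cognition's actual format or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """A machine-suggested statement awaiting expert review (hypothetical format)."""
    kind: str   # "fact" or "rule"
    text: str   # human-readable form shown to the reviewer


def expert_review(candidates, approved_texts):
    """Keep only the suggested statements that the human expert signed off on."""
    return [s for s in candidates if s.text in approved_texts]


def apply_go_to(itinerary, utterance):
    """Toy version of one rule: saying "go to <city>" adds a flight to that city."""
    prefix = "go to "
    if utterance.lower().startswith(prefix):
        return itinerary + [utterance[len(prefix):].strip()]
    return itinerary  # anything else leaves the itinerary unchanged


candidates = [
    Statement("fact", "Qantas is an airline"),
    Statement("rule", '"go to" a city means add a flight to that city'),
    Statement("fact", "Rome is an airline"),  # a bad suggestion the expert rejects
]
approved = expert_review(
    candidates,
    {"Qantas is an airline", '"go to" a city means add a flight to that city'},
)
itinerary = apply_go_to(["New York"], "go to London")
```

The point of the sketch is the division of labor: the statements are proposed automatically, a person only approves or rejects them, and the approved rules then drive the dialogue.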

* * * * *

Lex Fridman interviews David Ferrucci (2019).

0:00 - Introduction
1:06 - Biological vs computer systems
8:03 - What is intelligence?
31:49 - Knowledge frameworks
52:02 - IBM Watson winning Jeopardy
1:24:21 - Watson vs human difference in approach
1:27:52 - Q&A vs dialogue
1:35:22 - Humor
1:41:33 - Good test of intelligence
1:46:36 - AlphaZero, AlphaStar accomplishments
1:51:29 - Explainability, induction, deduction in medical diagnosis
1:59:34 - Grand challenges
2:04:03 - Consciousness
2:08:26 - Timeline for AGI
2:13:55 - Embodied AI
2:17:07 - Love and companionship
2:18:06 - Concerns about AI
2:21:56 - Discussion with AGI

Saturday, August 27, 2022

Johnny Carson interviews Marlon Brando about race in America in May of 1968

Jazz has been blamed for many things [Twitter]

Warts on the feet pic.twitter.com/OJj8SSHd49
— Paul Fairie (@paulisci) August 25, 2022

Bad teeth pic.twitter.com/I6jOuiWmTO
— Paul Fairie (@paulisci) August 25, 2022

There are more tweets in the stream. Here's the last one in the original thread, quoting Paul Whiteman, known as "The King of Jazz" in his time (the 1920s and 1930s).

Everything pic.twitter.com/Y7HOlU0LLr
— Paul Fairie (@paulisci) August 25, 2022

And then there's these in the comments, among many others:

The Jazz Problem, August 1924 pic.twitter.com/4GD7ITtkD3
— Mark W. (@DurhamWASP) August 26, 2022

Don't forget the anti-Semitic white supremacist Henry Ford tried to ban jazz claiming it caused unsociability, sensuality, infantilism, & was moral poison (& claimed Jews were behind it). Citation: Emery Warnock, "The Anti-Semitic Origins of Henry Ford's Arts Education Patronage" pic.twitter.com/gfgNItHAo1
— Brock Bahler (he/him) 🌻 (@brockbahler) August 26, 2022

H/t Tyler Cowen.

As a point of reference, see my post, Official Nazi Policy Regarding Jazz.

Tile madness [photoshoppery]

Friday, August 26, 2022

Sex among the Aka, hunter-gatherers of the Central African Republic [sex is work of the night]

2. The Aka report having sex about 3 times a night, with some days of rest between (all data are self-reported). Here, these frequencies are converted to weekly for comparison with neighboring Ngandu farmers and the US: pic.twitter.com/EVIprY0hkv
— Ed Hagen (@ed_hagen) August 26, 2022

3. The Aka and Ngandu both report that having sex is mainly to have children, which warms my sociobiological heart: pic.twitter.com/hwTAnO0jdZ
— Ed Hagen (@ed_hagen) August 26, 2022

There are seven more tweets in the stream.

What’s Stand-Up Like? [from the inside]

I'm bumping this to the top of the queue on general principle. It's about performing, stand-up comedy specifically. It's about craft, how it's done.

* * * * *

Thinking about Jerry Seinfeld, starting with his show with Obama, has gotten me to think more generally about stand-up. It turns out that that’s something he’s been talking about a lot. To some extent I’ve been understanding his remarks about stand-up through music.

I don’t do stand-up, never have, and have no particular desire to give it a try. I’ve told jokes – more so when I was young than later – and I can be witty, but none of that is stand-up. But I have quite a bit of experience in performing music – various kinds of music, different kinds of venues, from small bars to outdoor gigs with thousands in the audience. So I understand how musical performance works.

Right now I’m watching a clip on YouTube where Jerry Seinfeld, Chris Rock, Ricky Gervais and Louis C.K. are talking about the craft and the business:

If you’re interested in the craft, it’s worth a listen.

For example, one of the things they talk about is ‘cheap laughs’ (the term they use) versus, well, ‘honest’ laughs (my term). So Gervais was saying how he’d be perfectly happy getting a cheap laugh in performance, but he’s not going to put it on the DVD. Seinfeld counters that, if it killed in performance, it’s yours, put it on the DVD.

I can relate to that. But that’s not where I’m going in this post. I’m curious about what it’s like starting out in stand-up, something they talk about here and there. The impression I’ve got is that even when they were starting out, they were doing their own material. And, if I’d have to guess, I’d guess that even at local open-mics where anyone could get up and do five minutes, they’re going to do their own material–even if they have no aspirations to making a living at this. That’s just how stand-up is.

Thursday, August 25, 2022

Religion on the Ground, Sunday Service

I'm bumping this to the top of the queue on general principle. It's from October 2011 and I need to keep it in mind as I think about the future. See also, Religion in America, Going Forward, and Black Preaching, the Church, and Civic Life, both from 2015.

* * * * *

I learned something about religion this past Sunday. Or, if you will, I gained a richer and subtler appreciation for things I’ve known for some time, things I’ve known because I’ve read them in books and articles, many of them quite good. But even the best of them must necessarily abstract away from concrete reality, and concrete reality is what I experienced on Sunday.

I went to church for the first time in years and years. I had a specific reason for going to church, and to THAT particular church. I wanted to check out Rev. Smith—not his real name, BTW. While I wouldn’t be violating any confidence by using the man’s real name, nor by telling exactly which church I went to, the fact is that I didn’t go into that church telling people that, as a reporter, ethnographer, or some other kind of thinker-about-religion, I would be writing about the service on my blog. Thus I DO feel that it would be ever so slightly out of place for me to name names.

I’d met Rev. Smith a week and a half ago at an emergency meeting of three neighborhood associations. Two bus lines were about to be discontinued, leaving many in the neighborhood without access to the outside world. So the leaders of these three associations called a meeting. Rev. Smith spoke briefly during that meeting, saying that he was starting up a new organization for empowering people in various neighborhoods. He only spoke for a minute or two but, oratorically, he went from zero to sixty in about 4.3 seconds. Zooom!

I chatted with him after the meeting, as did several others, and gave him my card after expressing interest in his new venture. I also figured I ought to check him out on his home turf, which is why I went to his church this past Sunday.

Yes, to the extent that I had expectations about his preaching style, the sermon he preached satisfied those expectations. Rev. Smith didn’t deliver the sermon from a raised pulpit. He put a lectern front and center, level with the pews. That was his home base. He had a Bible on the lectern, seemed like another book as well, and perhaps some notes. But he mostly winged it, referring back to scripture every once in a while. He had a wireless mic so he could move freely, which he did.

PowerPoint Assistant: Augmenting End-User Software through Natural Language Interaction

This links to a set of notes that I first wrote up in 2003 or 2004, somewhere in there, but didn't post to the web until 2015. It calls for a natural language interface with end-user software that could be easily extended and modified through natural language. The abstract talks of hand-coding a basic language capability. I wrote these notes before machine learning had become so successful. These days hand-coding might be dispensed with – notice that I said "might." In any event, a young company called Adept has recently started up with something like that in mind:

True general intelligence requires models that can not only read and write, but act in a way that is helpful to users. That’s why we’re starting Adept: we’re training a neural network to use every software tool and API in the world, building on the vast amount of existing capabilities that people have already created.

In practice, we’re building a general system that helps people get things done in front of their computer: a universal collaborator for every knowledge worker. Think of it as an overlay within your computer that works hand-in-hand with you, using the same tools that you do.

* * * * *

Another working paper at Academia.edu: https://www.academia.edu/14329022/PowerPoint_Assistant_Augmenting_End_User_Software_through_Natural_Language_Interaction

Abstract, contents, and introduction below, as usual.

* * * * *

Abstract: This document sketches a natural language interface for end user software, such as PowerPoint. Such programs are basically worlds that exist entirely within a computer. Thus the interface is dealing with a world constructed with a finite number of primitive elements. You hand-code a basic language capability into the system, then give it the ability to ‘learn’ from its interactions with the user, and you have your basic PPA.

C O N T E N T S

Introduction: PowerPoint Assistant, 12 Years Later
Metagramming Revisited
PowerPoint Assistant
Plausibility
PPA In Action
The User Community
Generalization: A new Paradigm for Computing
Appendix: Time Line Calibrated against Space Exploration

Introduction: PowerPoint Assistant, 12 Years Later

I drafted this document over a decade ago, after I’d been through an intense period in which I re-visited work I’d done in the mid-to-late 1970s and early 1980s and reconstructed it in new terms derived jointly from Sydney Lamb’s stratificational linguistics and Walter Freeman’s neurodynamics. The point of that work was to provide a framework – I called it “attractor nets” – that would accommodate both the propositional operations of ‘good old-fashioned artificial intelligence’ (aka GOFAI) and the more recent statistical style. Whether or not or just how attractor nets would do that, well, I don’t really know.

But I was excited about it and wondered what the practical benefit might be. So I ran up a sketch of a natural language front-end for Microsoft PowerPoint, a program I use quite a bit. That sketch took the form of a set of hypothetical interactions between a user, named Jasmine, and the PowerPoint Assistant (PPA) along with some discussion.

The important point is that software programs like PowerPoint are basically closed worlds that exist entirely within a computer. Your ‘intelligent’ system is dealing with a world constructed with a finite number of primitive elements. So you hand-code a basic language capability into the system, then give it the ability to ‘learn’ from its interactions with the user, and you have your basic PPA.
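
As a rough illustration of that idea, here is a minimal sketch of a closed world with two hand-coded commands and a crude form of 'learning'. The class name `ToyAssistant`, the command phrases, and the slide list are all assumptions made up for this example; the working paper itself doesn't specify an implementation, and real language handling would be far richer than exact string matching.

```python
class ToyAssistant:
    """Hypothetical sketch: a closed world of slides plus a few hand-coded commands."""

    def __init__(self):
        self.slides = []    # the entire "world": a list of slide titles
        self.learned = {}   # user-taught phrase -> list of already-known commands

    def run(self, command):
        command = command.strip()
        if command in self.learned:
            # A phrase the user taught earlier: replay its stored steps.
            for step in self.learned[command]:
                self.run(step)
        elif command.startswith("add slide "):
            self.slides.append(command[len("add slide "):])
        elif command == "delete last slide":
            if self.slides:
                self.slides.pop()
        else:
            raise ValueError(f"unknown command: {command!r}")

    def teach(self, phrase, steps):
        """'Learning' here is just storing a named sequence of existing commands."""
        self.learned[phrase] = list(steps)


ppa = ToyAssistant()
ppa.teach("start a talk", ["add slide Title", "add slide Outline"])
ppa.run("start a talk")
ppa.run("add slide Results")
ppa.run("delete last slide")
```

Because the world has only a finite set of primitives, every taught phrase bottoms out in operations the system already knows how to perform, which is what makes the closed-world case tractable.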

That’s what’s in this sketch, along with some remarks about networked PPAs sharing capabilities with one another. And that’s as far as I took matters. That is to say, that is all I had the capability to do.

For what it’s worth, I showed the document to Syd Lamb in 2003 or 2004 and he agreed with me that something like that should be possible. We were stumped as to just why no one had done it. Perhaps it simply hadn’t occurred to anyone with the means to do the work. Attention was focused elsewhere.

Since then a lot has changed. IBM’s Watson won at Jeopardy and more importantly is being rolled out in commercial use. Siri ‘chats’ with you on your iPhone.

And some things haven’t changed. Still no PPA, nor a Photoshop Assistant either. Is it really that hard to do? Are the Big Boys and Girls just distracted by other things? It’s not as though programs like PowerPoint and Photoshop serve niche markets too small to support the recoupment of development costs.

Am I missing something, or are they?

So I’m putting this document out there on the web. Maybe someone will see it and do something about it. Gimme’ a call when you’re ready.

Wednesday, August 24, 2022

Beyond linear regression: mapping models in cognitive neuroscience should align with research goals

In short, if you plan to train an encoding/decoding model of the brain, you should determine which properties of your mapping are essential to your research question. Does it need to be simple? Biologically plausible? Explainable in terms of neuro/psych terms?
— Anna Ivanova (@neuranna) August 24, 2022

If you're going to #ccneuro22, stop by @leylaisi's poster presenting our work!

Many thanks to @CogCompNeuro GAC organizers for support with this project (@meganakpeters @GunnarBlohm)
— Anna Ivanova (@neuranna) August 24, 2022

Abstract of linked article:

Many cognitive neuroscience studies use large feature sets to predict and interpret brain activity patterns. Feature sets take many forms, from human stimulus annotations to representations in deep neural networks. Of crucial importance in all these studies is the mapping model, which defines the space of possible relationships between features and neural data. Until recently, most encoding and decoding studies have used linear mapping models. Increasing availability of large datasets and computing resources has recently allowed some researchers to employ more flexible nonlinear mapping models; however, the question of whether nonlinear mapping models can yield meaningful scientific insights remains debated. Here, we discuss the choice of a mapping model in the context of three overarching desiderata: predictive accuracy, interpretability, and biological plausibility. We show that these desiderata do not map cleanly onto the linear/nonlinear divide; instead, each desideratum can refer to multiple research goals, each of which imposes its own constraints on the mapping model. Moreover, we argue that, instead of categorically treating the mapping models as linear or nonlinear, researchers should report the complexity of these models. We show that, in many cases, complexity provides a more accurate reflection of restrictions imposed by various research goals and outline several complexity metrics that can be used to effectively evaluate mapping models.
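
For readers unfamiliar with the term, here is a minimal sketch of the simplest kind of mapping model: a one-feature linear fit from a stimulus feature to a measured response, reported together with a crude complexity figure. The data are invented, and using the parameter count as the complexity measure is my own stand-in; the abstract doesn't say which complexity metrics the paper proposes.

```python
def fit_linear_mapping(features, responses):
    """Ordinary least squares for one feature: response ~ slope * feature + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(responses, predicted):
    """Fraction of response variance explained by the predictions."""
    mean_y = sum(responses) / len(responses)
    ss_res = sum((y - p) ** 2 for y, p in zip(responses, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in responses)
    return 1.0 - ss_res / ss_tot


# Invented toy data: one stimulus feature and one "neural" response per trial.
features = [0.0, 1.0, 2.0, 3.0, 4.0]
responses = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_linear_mapping(features, responses)
predicted = [slope * x + intercept for x in features]
report = {
    "predictive_accuracy_r2": r_squared(responses, predicted),
    "n_parameters": 2,  # slope and intercept; a stand-in complexity measure
}
```

A more flexible nonlinear mapping would typically score at least as well on accuracy for the same data, which is why reporting some measure of complexity alongside accuracy helps a reader judge what the fit actually shows.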

Three views of a barge going up-river

Patterns as Epistemological Objects [2495]

Pattern-matching is much discussed these days in connection with deep learning in AI. Here's a post from July, 2014, where I discuss patterns more generally. [I was also counting down to my 2500th post. I'm now over 8600.]

* * * * *

When I posted From Quantification to Patterns in Digital Criticism I was thinking out loud. I’ve been thinking about patterns for years, and about pattern-matching as a computational process. I had this shoot-from-the-hip notion that patterns, as general as the concept is, deserve some kind of special standing in methodological thinking about so-called digital humanities – likely other things as well, but certainly digital humanities. And then I discovered that Rens Bod was thinking about patterns as well. And his thinking is independent of mine, as is Stephen Ramsay’s.

So now we have three independent lines of thought converging on the idea of patterns. Perhaps there’s something there.

But what? It’s not as though there’s anything new in the idea of patterns. It’s a perfectly ordinary idea. THAT’s not a disqualification, but I think we need something more if we want to use the idea of pattern as a fundamental epistemological concept.

From Niche to Pattern

In my previous patterns post, “Pattern” as a Term of Art, I argued that the biological niche is a pattern in the sense we need. It’s a pattern that arises between a species and its sustaining environment. Organisms define niches. While biologists sometimes talk of niches pre-existing the organisms that come to occupy them, that is just a rhetorical convenience.

That example is important because it puts patterns “out there” in the world rather than them being something that humans (only) perceive in the world. But now it’s the human case that interests me, patterns that humans do see in the world. But we don’t necessarily regard all the patterns we see as being “real”, that is, as existing independently of our perception.

When we look at a cloud and see an elephant we don’t conclude that an elephant is up there in the sky, or that the cloud decided to take on an elephant-like form. We know that the cloud has its own dynamics, whatever they might be, and we realize that the elephant form is something we are projecting onto the world.

But that is something we learn. It’s not given in the perception itself. And that learning is guided by cultural conventions.

We see all kinds of things in the world. Not only does the mind perceive patterns, it seeks them out. What happens when we start to interact with the phenomena we perceive? That’s when we learn whether or not the elephant we saw is real or a projection.

With this in mind, consider this provisional formulation:

An observer defines a pattern over objects.

The parallel formulation for ecological niche would be:

A species defines a niche over the environment.

The pattern, the niche, exists in the relationship between a supporting matrix (the environment, an array of objects) and the organizing vehicle (the species, the observer). Just as there’s no way of identifying an ecological niche independently of specifying an organism occupying the niche, so there’s no way of specifying a (perceptual or cognitive) pattern independently of specifying a mind that charts the pattern.

As a practical matter, of course, we often talk of patterns simply as being there, in the world, in the data. And our ability to understand how the mind captures patterns is still somewhat limited. But if we want to understand how patterns function as epistemological primitives, then we must somehow take the perceiving mind into account.

The point of this formulation is to finesse the question of just what characteristic of some collection of objects makes them a suitable candidate for bearing a pattern. We can understand how patterns function as epistemological primitives without having to specify, as part of our inquiry, what characteristics an ensemble must have to warrant treatment as a pattern. We as epistemologists are not in the business of making that determination. That’s the job of a perceptual-cognitive system.

Our job is to understand how such systems come to accept some patterns as real while rejecting others. How does that happen? Through interaction, and the nature of that interaction is specific to the patterns involved.

Two Simple Examples: Animals and Stars

Let us consider some simple examples. Consider the patterns a hunter must use to track an animal: footprints, disturbed vegetation, sounds of animal movement, and so forth. The causal relationship between the animal and the signs in the pattern is obvious enough; the signs are produced by animal motion. The hunter knows that the pattern is real when the animal is spotted. Of course, the animal may not always be spotted, yet the pattern is real. In the case of failure the hunter must make a judgment about whether the pattern was real, but the animal simply got away, or whether the perceived pattern was simply mistaken.

Constellations of stars in the sky are a somewhat more complex example. That a certain group of stars is seen as Ursa Major, or the Big Dipper, is certainly a projection of the human mind onto the sky. The set of stars in a given constellation does not form a group organized by internal causal forces in the way that a planetary system does. The planets in such a system are held there by mutual gravitational attraction. The gravitational force of the central star would be the largest component in the field, with the planets exerting lesser force in the system.

But the stars in the Big Dipper are not held in that pattern by their mutual gravitational forces. Whatever that pattern is, it is not evidence of a local gravitational system among the constituents of the pattern. Rather, that pattern depends on the relationship between the observer and those objects. An observer at a different place in the universe, near one of the stars, for example, wouldn’t be able to perceive that pattern. And yet the stars have the same positions relative to one another and to the rest of the (nearby) universe.

Our knowledge of constellations is quite different from the hunter’s knowledge of tracking lore. One cannot interact with constellations in the way one interacts with animals. While one can pursue and capture or kill animals, one can’t do anything to constellations. They are beyond our reach. But we can observe them and note their positions in the sky. And we can use them to orient ourselves in the world and thus discover that they serve as reliable indicators of our position in space.

These two patterns attain reality in a different way. The forces that make the animal’s trail a real pattern are local ones having to do with the interaction between the animal and its immediate surrounding. The forces “behind” the constellations are those of the large-scale dynamics of the universe as “projected” onto the point from which the pattern is viewed.

A Case from the Humanities

Now let’s consider an example that’s closer to the digital humanities. Look at the following figure:

The red triangle is the pattern and I am defining it over the vertical bars. That is, I examined the bars and decided that they’re approximating a triangle, which I then superimposed on those bars. The bars preexisted the triangle.

I also created those bars, but through a process that is different and separate from the informal and intuitive process through which I created the triangular pattern. Each bar represents a paragraph in Joseph Conrad’s Heart of Darkness; the length of the bar is proportional to the number of words in the paragraph. The leftmost bar represents the first paragraph in the text while the rightmost bar represents the last paragraph in the text. The other bars represent the other paragraphs, in textual order from left to right.

The bars vary quite a bit in length. The shortest paragraph in the text is only two words long while the longest is, I believe, 1502 words long. In any given run of, say, twenty paragraphs, paragraph lengths vary considerably, though there isn’t a single paragraph over 200 words long in the final 30 paragraphs or so.

But why, when the distribution of paragraph lengths is so irregular, am I asserting the overall distribution has the form of a triangle? What I’m asserting is that that is the envelope of the distribution. There are a few paragraphs outside the envelope, but the great majority are inside it.

The significant point, though, is that there is one longest paragraph and it is more or less in the middle. That paragraph is considerably longer (by over 300 words) than the next longest paragraphs, which are relatively close to it. The paragraphs toward the beginning and the end, the end especially, tend to be short.

What we’d like to know, though, is whether this distribution is an accident, and so of little interest, or whether it is an indicator of a real process. In the first place I observe that, in my experience, paragraphs over 500 words long are relatively rare – this is the kind of thing that can be easily checked with the large text databases we now have. Single paragraphs of over 1000 words must be very rare indeed.

And that longest paragraph is quite special. It is very strongly marked. If you know Conrad’s story, then you know it centers on two men, Kurtz, a trader in the Congo, and Marlow, the captain of a boat sent to retrieve him. Marlow narrates the story, but it isn’t until we’re well into the story that Kurtz is even mentioned. And then we don’t learn much about him, just that he’s a trader deep in the interior and he hasn’t been heard from in a long time.

That longest paragraph is the first time we learn much about Kurtz. It’s a précis of his story. The circumstances in which Marlow gives us this précis are extraordinary.

His narrative technique is simple; he tells events in the order in which they happened – his need for a job, how he got that particular job, his arrival at the mouth of the Congo River, and so forth. With that longest paragraph, however, Marlow deviates from chronological order.

He introduces this information about Kurtz as a digression from the story of his journey up the Congo River to Kurtz’s trading station. Some of what he tells us about Kurtz happened long before Marlow set sail; and some of what we learn happened after the point in Marlow’s journey where he introduces this paragraph as a digression.

What brought on this digression? Well, Marlow’s boat was about a day’s journey from Kurtz’s camp when they were attacked from the shore. The helmsman was speared through the chest and fell bleeding to the deck. It’s at THAT point that Marlow interrupts his narrative to tell us about Kurtz – whom he had yet to meet. Once he finishes this most important digression he returns to his bleeding helmsman and throws him overboard, dead. Just before he does so he tells us that he doesn’t think Kurtz’s life was worth that of the helmsman who died trying to retrieve him.

That paragraph – its length, content, and position in the text – is no accident. That statement, of course, is a judgement, based only on my experience and knowledge as a critic, which have been shaped by the discipline of academic literary criticism. But it’s not an unreasonable judgement; it is of a piece with the thousands of such judgements woven into the fabric of our discipline.

Conrad may not have consciously planned to convey that information in the longest paragraph in the text, and to position that paragraph in the middle of his text, but whatever unconscious cognitive and affective considerations were driving his craft, they put that information in that place in the text and at that length. The apex of that triangle is real, not merely in the sense that the paragraph is that long, but in the deeper sense that it is a clue about the psychodynamic forces shaping the text.

Just what are those psychodynamic forces? I don’t know. The hunter can tell us in great detail about how the animal left traces of its movement over the land. Astronomers and astrophysicists can tell us about constellations in great detail. But the pattern of paragraph lengths in Heart of Darkness is a mystery.

* * * * *

Why do I consider this example at such length? For one thing, I’m interested in texts. Patterns in text are thus what most interest me.

Secondly, that example makes the point that description is one thing, explanation another. I’ve described the pattern, but I’ve not explained it. Nor do I have any clear idea of how to go about explaining it.

There’s a lot of that going around in the digital humanities. Patterns have been found, but we don’t know how to explain them. We may not even know whether or not the pattern reflects something “real” about the world or is simply an artifact of data processing.

Third, whereas much of the work in digital humanities involves data mining procedures that are difficult to understand, this is not like that. Counting the number of words in a paragraph is simple and straightforward, if tedious (even with some crude computational help). And yet the result is strange and a bit mysterious. Who’d have thought?
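For readers curious about what that crude computational help might look like, here is a minimal sketch in Python. It is not the procedure I actually used. It assumes a plain-text file in which paragraphs are separated by blank lines and counts whitespace-delimited tokens as words; the function names and the toy sample text are hypothetical, chosen only for illustration.

```python
import re

def paragraph_word_counts(text):
    """Split text on blank lines and return the word count of each paragraph."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [len(p.split()) for p in paragraphs]

def longest_paragraph(counts):
    """Return (index, word count, relative position 0..1) of the longest paragraph."""
    index = max(range(len(counts)), key=counts.__getitem__)
    position = index / (len(counts) - 1) if len(counts) > 1 else 0.0
    return index, counts[index], position

def count_over(counts, threshold):
    """Number of paragraphs longer than `threshold` words."""
    return sum(1 for c in counts if c > threshold)

# Toy stand-in text, not Conrad's.
sample = "One two.\n\nOne two three four five six.\n\nOne two three."
counts = paragraph_word_counts(sample)
print(counts)                     # word count per paragraph, in textual order
print(longest_paragraph(counts))  # where the longest paragraph sits
print(count_over(counts, 5))      # how many paragraphs exceed 5 words
```

The list of counts is all the bar chart needs. Deciding that its envelope is a triangle, and that the apex means something, is not in the code.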

Note that I distinguish between the bar chart that displays the word counts and the pattern I, as analyst, impose on it. When I say that the envelope of paragraph length distribution is triangular, I’m making a judgement. That judgement didn’t come out of the word count itself. And when I say that that pattern is real, I’m also making a judgement, one that I’ve justified – if only partially – by discussing what happens in that longest paragraph and that paragraph's position in the text as a whole.

My sense of these matters is that, going forward, we’re going to have to get comfortable with identifying patterns we don’t know how to explain. We need to start thinking about, theorizing if you will, what patterns are and how to identify them.

* * * * *

I’ve written a good many posts on Heart of Darkness. I discuss paragraph length HERE and HERE. I’ve called that central sentence the nexus and discuss it HERE. Here’s a downloadable working paper that covers these and other aspects of the text.

* * * * *

I’m on a countdown to my 2500th post. This is number 2495.

Three diverse irises

The decline of the humanities in college majors

Here's the raw size of all the fields (just BAs). The downtick in cultural, ethnic, and gender studies is notable--those had been the only fields *not* to get pulled down by the collapse of humanities majors. Also sharper-than normal drops in English, Comp Lit, languages... pic.twitter.com/cbecOnl8Ls
— Benjamin Schmidt (@benmschmidt) August 23, 2022

Here's the slightly longer term shifts from 2011-2021. The total outlier of computer science's explosion is really clear here: so is the concentration of growth in fields that have clear career prospects. pic.twitter.com/Pq5nuZveqH
— Benjamin Schmidt (@benmschmidt) August 23, 2022

The downtick in humanities this year pushes up the ETA for when CS is larger than all humanities degrees together to one of the classes currently in college. pic.twitter.com/muaYNhDIe4
— Benjamin Schmidt (@benmschmidt) August 24, 2022

Tuesday, August 23, 2022

Coming and going on 11th Street [Hoboken]

Which comes first, AGI or new systems of thought? [Further thoughts on the Pinker/Aaronson debates]

Long-time readers of New Savanna know that David Hays and I have a model of cultural evolution built on the idea of cognitive rank, systems of thought embedded in large-scale cognitive architecture. Within that context I have argued that we are currently undergoing a large-scale transformation comparable to those that gave us the Industrial Revolution (Rank 3 in our model) and, more recently, the conceptual revolutions of the first half of the 20th century (Rank 4). Thus I have suggested that the REAL singularity is not the fabled tech singularity, but the consolidation of new conceptual architectures:

Redefining the Coming Singularity – It’s not what you think, Version 2, Working Paper, November 2015, https://www.academia.edu/8847096/Redefining_the_Coming_Singularity_It_s_not_what_you_think

I had occasion to introduce this idea into the recent AI debate between Steven Pinker and Scott Aaronson. Pinker was asking for specific mechanisms underpinning superintelligence while Aaronson was offering what Steve called “superpowers.” Thus Pinker remarked:

If you’ll forgive me one more analogy, I think “superintelligence” is like “superpower.” Anyone can define “superpower” as “flight, superhuman strength, X-ray vision, heat vision, cold breath, super-speed, enhance hearing, and nigh-invulnerability.” Anyone could imagine it, and recognize it when he or she sees it. But that does not mean that there exists a highly advanced physiology called “superpower” that is possessed by refugees from Krypton! It does not mean that anabolic steroids, because they increase speed and strength, can be “scaled” to yield superpowers. And a skeptic who makes these points is not quibbling over the meaning of the word superpower, nor would he or she balk at applying the word upon meeting a real-life Superman. Their point is that we almost certainly will never, in fact, meet a real-life Superman. That’s because he’s defined by human imagination, not by an understanding of how things work. We will, of course, encounter machines that are faster than humans, and that see X-rays, that fly, and so on, each exploiting the relevant technology, but “superpower” would be an utterly useless way of understanding them.

I’ve added my comment below the asterisks.

* * * * *

I’m sympathetic with Pinker and I think I know where he’s coming from. Thus he’s done a lot of work on verb forms, regular and irregular, that involves the details of (computational) mechanisms. I like mechanisms as well, though I’ve worried about different ones than he has. For example, I’m interested in (mostly) literary texts and movies that have the form: A, B, C...X...C’, B’, A’. Some examples: Gojira (1954), the original 1933 King Kong, Pulp Fiction, Obama’s eulogy for Clementa Pinckney, Joseph Conrad’s Heart of Darkness, Shakespeare’s Hamlet, and Osamu Tezuka’s Metropolis.

What kind of computational process produces such texts and what kind of computational process is involved in comprehending them? Whatever that process is, it’s running in the human brain, whose mechanisms are obscure. There was a time when I tried writing something like pseudo-code to generate one or two such texts, but that never got very far. So these days I’m satisfied identifying and describing such texts. It’s not rocket science, but it’s not trivial either. It involves a bit of luck and a lot of detail work.
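To make the form itself concrete, here is a minimal sketch in Python. It is an illustration only, not the pseudo-code I mentioned, and it says nothing about the process that produces or comprehends such texts. It assumes an analyst has already labeled the sections of a text by hand, marking each mirrored section with a trailing apostrophe; the function name and the labeling convention are hypothetical.

```python
def is_ring_form(sections):
    """True if labels read A, B, C ... X ... C', B', A': mirrored around one center."""
    if len(sections) % 2 == 0:
        return False  # a ring needs a single central element X
    # Strip the prime marks so that C' is compared with C, and so on.
    normalized = [s.rstrip("'") for s in sections]
    return normalized == normalized[::-1]

print(is_ring_form(["A", "B", "C", "X", "C'", "B'", "A'"]))  # mirrored around X
print(is_ring_form(["A", "B", "X", "A'", "B'"]))             # wrong order on the way out
```

The check is trivial once the labels exist. All of the detail work lies in deciding that one section of a text really does answer another.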

So, like Steve, I have trouble with mechanism-free definitions of AGI and superintelligence. When he contrasts defining intelligence as mechanism vs. magic, as he did earlier, I like that, as I like his current contrast between “intelligence as an undefined superpower rather than a[s] mechanisms with a makeup that determines what it can and can’t do.”

In contrast Gary Marcus has been arguing for the importance of symbolic systems in AI in addition to neural networks, often with Yann LeCun as his target. I’ve followed this debate fairly carefully, and even weighed in here and there. This debate is about mechanisms, mechanisms for computers, in the mind, for the near-term and far-term.

Whatever your current debate with Steve is about, it’s not about this kind of mechanism vs. that kind. It has a different flavor. It’s more about definitions, even, if you will, metaphysics. But, for the sake of argument I’ll grant that, sure, the concept of intellectual superpowers is coherent (even if we have little idea about how they’d work beyond MORE COMPUTE!).

With that in mind, you say:

Not only does the concept of “superpowers” seem coherent to me, but from the perspective of someone a few centuries ago, we arguably have superpowers—the ability to summon any of several billion people onto a handheld video screen at a moment’s notice, etc. etc. You’d probably reply that AI should be thought of the same way: just more tools that will enhance our capabilities, like airplanes or smartphones, not some terrifying science-fiction fantasy.

I like the way you’ve introduced cultural evolution into the conversation, as that’s something I’ve thought about a great deal. Mark Twain wrote a very amusing book, A Connecticut Yankee in King Arthur’s Court. From the Wikipedia description:

In the book, a Yankee engineer from Connecticut named Hank Morgan receives a severe blow to the head and is somehow transported in time and space to England during the reign of King Arthur. After some initial confusion and his capture by one of Arthur's knights, Hank realizes that he is actually in the past, and he uses his knowledge to make people believe that he is a powerful magician.

Is it possible that in the future there will be human beings as far beyond us as that Yankee engineer was beyond King Arthur and Merlin? Providing we avoid disasters like nuking ourselves back to the Stone Age, catastrophic climate change exacerbated by pandemics, and getting paperclipped by an absentminded Superintelligence, it seems to me almost inevitable that that will happen. Of course science fiction is filled with such people but, alas, has not a hint of the theories that give them such powers. But I’m not talking about science fiction futures. I’m talking about the real future. Over the long haul we have produced ever more powerful accounts of how the world works and ever more sophisticated technologies through which we have transformed the world. I see no reason why that should come to a stop.

So, at the moment various researchers are investigating the parameters of scale in LLMs. What are the effects of differing numbers of tokens in the training corpus and number of parameters in the model? Others are poking around inside the models to see what’s going on in various layers. Still others are comparing the response characteristics of individual units in artificial neural nets with the response characteristics of neurons in biological visual systems. And so on and so forth. We’re developing a lot of empirical knowledge about how these systems work, and models here and there.

I have no trouble at all imagining a future in which we will know a lot more about how these artificial models work internally and how natural brains work as well. Perhaps we’ll even be able to create new AI systems in the way we create new automobiles. We specify the desired performance characteristics and then use our accumulated engineering knowledge and scientific theory to craft a system that meets those specifications.

It seems to me that’s at least as likely as an AI system spontaneously tipping into the FOOM regime and then paperclipping us. Can I predict when this will happen? No. But then I regard various attempts to predict the arrival of AGI (whether through simple Moore’s Law type extrapolation or the more heroic efforts of Open Philanthropy’s biological anchors) as mostly epistemic theater.

Monday, August 22, 2022

Orange flower, light blue ribbon

What’s it mean, minds are built from the inside?

I'm bumping this post from September 2014 to the top because it's my oldest post on this topic.

In my recent post arguing that “superintelligent” computers are somewhere between very unlikely to impossible, I asserted: “This hypothetical device has to acquire and construct its superknowledge ‘from the inside’ since no one is going to program it into superintelligence ...” Just what does that mean: from the inside?

The only case of an intelligent mind that we know of is the human mind, and the human mind is built from the “inside.” It isn’t programmed by external agents. To be sure, we sometimes refer to people as being programmed to do this or that, and when we do so the implication is that the “programming” is somehow against the person’s best interests, that the behavior is in some way imposed on them.

And that, of course, is how computers are programmed. They are designed to be imposed upon by programmers. A programmer will survey the application domain, build a conceptual model of it, express that conceptual model in some design formalism, formulate computational processes in that formalism, and then produce code that implements those processes. To do this, of course, the programmer must also know something about how the computer works since it’s the computer’s operations that dictate the language in which the process design must be encoded.

To be a bit philosophical about this, the computer programmer has a “transcendental” relationship with the computer and the application domain. The programmer is outside and “above” both, surveying and commanding them from on high. All too frequently, this transcendence is flawed, the programmer’s knowledge of both domain and computer is faulty, and the resulting software is less than wonderful.

Things are a bit different with machine learning. Let us say that one uses a neural net to recognize speech sounds or recognize faces. The computer must be provided with a front end that transduces visual or sonic energy and presents the computer with some low-level representation of the sensory signal. The computer then undertakes a learning routine of some kind, the result of which is a bunch of weightings on features in the net. Those weightings determine how the computer will classify inputs, whether mapping speech sounds to letters or faces to identifiers.

Now, it is possible to examine those feature weightings, but for the most part they will be opaque to human inspection. There won’t be any obvious relationship between those weightings and the inputs and outputs of the program. They aren’t meaningful to the “outside.” They make sense only from the “inside.” The programmer no longer has transcendental knowledge of the inner operations of the program that he or she built.

If we want a computer to hold vast intellectual resources at its command, it’s going to have to learn them, and learn them from the inside, just like we do. And we’re not going to know, in detail, how it does it, any more than we know, in detail, what goes on in one another’s minds.

How do we do it? It starts in utero. When neurons first differentiate they are, of course, living cells and further differentiation is determined in part by the neurons themselves. Each neuron “seeks” nutrients and generates outputs to that end. When we analyze neural activity we tend to treat it, and its activities, as components of a complicated circuit in service of the whole organism. But that’s not how neurons “see” the world. Each neuron is just trying to survive.

Think of ants in a colony or bees in a swarm. There may be some mysterious coherence to the whole, but that coherence is the result of each individual pursuing its own purposes, however limited those purposes may be. So it is with brains and neurons.

The nervous system develops in a highly constrained environment in utero, but it is still a living and active system. And the prenatal auditory system can hear and respond to sounds from the external world. When the infant is born its world changes dramatically. But the brain is still learning and acting “from the inside.”

The structure of the brain is, of course, the result of millions of years of evolutionary history. The brain has been “designed” by evolution to operate in a certain world. It is not designed and built as a general purpose device, but yet becomes capable of many things, including designing and building general purpose computational devices.

But if we want those devices to be capable in an “intelligent” way we’re going to have to let them learn their way about in the world. We can design a machine to learn and provide it with an environment in which it can learn, an environment that most likely will entail interacting with us, but just what it will learn and how it will learn it, that’s taking place inside the machine outside of our purview. The details of that knowledge are not going to be transparent to external inspection.

We can imagine a machine that picks up a great deal of knowledge by reading books and articles. But that alone is not sufficient for deep knowledge of any domain. No human ever gained deep knowledge merely through reading. One must interact with the world through building things, talking with others, conducting experiments, and so forth. It may, in fact, have to be a highly capable robot, or at least have robotic appendages, so that it can move about in the world. I don’t see how our would-be intelligent computer can avoid doing this.

Just how much could a computer learn in this fashion? We don’t know. If, say, two different computers learned about more or less the same world in this fashion, would they be able to exchange knowledge simply by direct sharing of internal states? That’s a very interesting question, one for which I do not have an answer. I have some notes suggesting “why we'll never be able to build technology for Direct Brain-to-Brain Communication,” but that is a somewhat different situation since we didn’t design and construct our brains and they weren’t built for direct brain-to-brain communication. Perhaps things will go differently with computers.

By and large, we don’t know what future computing will bring. A computer with facilities roughly comparable to the computer in Star Trek’s Enterprise would be a marvelous thing to have. It wouldn’t be superintelligent, but its performance would, nonetheless, amaze us.