Thursday, July 9, 2015

Intention and Story-Telling: A Neural Explication

Originally published in The Valve, 10.24.07. I've tacked on some out-takes from the original post. Note that I had a long exchange with John Holbo over at The Valve that's worth reading and it's relevant to my current consideration of Dennett on intentionality. Glancing through it I think I've changed my views on this and that since I wrote this piece.

* * * * *
But in the night of thick darkness enveloping the earliest antiquity, so remote from ourselves, there shines the eternal and never failing light of a truth beyond all question: that the world of civil society has certainly been made by men, and that its principles are therefore to be found within the modifications of our own human mind.
- Giambattista Vico

Consciousnesses present themselves with the absurdity of a multiple solipsism, such is the situation which has to be understood.
- Maurice Merleau-Ponty
Though I find myself perplexed over all the wit, intellect, and energy expended in contemplation of peculiar hypotheticals that, so far as I can tell, have yet to materialize - you know, Wordsworth on the beach and such - I nonetheless find myself thinking about intention from time to time. Most recently I've been thinking about the notion that all those present at the telling of a story - teller and audience alike - share the same “intentional frame,” where intentional frame is defined with respect to the operations of the nervous system. This would be true of the actors and audience of a play or the audience at a movie as well. I also think it true of all those who read a given novel, though that situation differs sufficiently from those face-to-face situations that the generalization cannot be casually granted.

My object in this post is to lay this out. First I'll use Walter Freeman to establish the use of intentionality when theorizing about the nervous system. Then I'll argue that people engaged in face-to-face conversation share the same intentional frame. Then I'll consider oral story-telling and develop a restricted notion of intentional frame to cover that situation. The point of this exercise is to come up with a way of thinking about story-telling at the neural level.

A Neural View

Let's consider how a neuroscientist, Walter Freeman, talks about intention. I first encountered Freeman's treatment of intentionality in his Societies of Brains, but I'm going to quote from an essay on The Self-Organizing Subject of Psychoanalysis (PDF):
The basic Thomist premise is the unity and inviolability of the self that is inherent in the brain and body. This unity does not allow the entry of forms (we would say information) into the self. The impact of the world onto the senses gives rise to states of activity he called 'phantasms', which are ephemeral and unique to each impact and therefore cannot be known. The function of the brain is to exercise the faculty of the imagination, which is not present in the Aristotelian view, in order to abstract and generalize over the phantasms that are triggered by unique events. These processes of abstraction and generalization create information that assimilates the body and brain to the world. Assimilation is not adaptation by passive information processing, nor is it an accumulation of representations by resonances. It is the shaping of the self to bring it into optimal interaction with desired aspects of the world. The goal of an action is a state of competence that Maurice Merleau-Ponty (1945) called "maximum grip". It is the beginning for all knowledge. Sensory impacts that are attended by the brain are only those which can be assimilated on the basis of the pre-existing structure and capabilities of the body and brain, which have already been created through the prior experience.

Thus the manner of acquisition of knowledge is by thrusting the body into the world, from which our word 'intention' has come from the Latin "intendere" = 'stretching forth'. The thrust initiates the action-perception cycle, which is followed by the changes through which the self learns about the world, and ultimately about God, by assimilation (from the Latin "adequatio") of the self to the world. There is no transfer of information across the senses into the brain, but instead the creation of information within the brain under the existing constraints of the brain and body. In this respect cognition is related to digestion, which protects the integrity of the immunological self by breaking all forms of foodstuffs into elementary ions and molecules, that are absorbed and built into complex macromolecules, each now bearing the immunological signature of the individual self. Similarly, events and objects in the world are broken into sheets of action potentials like pinpoints of light, the 'raw sense data' of analytic philosophers and the phantasms of Thomists, and new forms emerge through constructions by the chaotic dynamics in sensory cortices. The explanation for this manner of function of both the neural and the digestive systems is essentially the same: the world is infinitely complex, and the self can only know and incorporate what it makes within itself. This is why neurobiologists using passive neural networks cannot solve the figure-ground problem, why linguists cannot do machine translation, why philosophers cannot solve the symbol grounding problem, why cognitive scientists cannot surmount the limitations of expert systems, and why engineers cannot yet build autonomous robots capable of operating in unstructured environments. The unbounded complexity of the world defeats those classic Platonic and Aristotelian approaches.
So, that's Freeman on intention. He's been investigating the nervous system considered as a dynamical system (he's been influenced by the physicist Hermann Haken among others). In particular, he's studied the olfactory system, looking at how the brain “stretches forth” to comprehend odors and how it assimilates its own structures to the activity patterns imposed upon it by odorants. We need not worry about the details of his models except to note that they are very much about the timing of impulse trains and how they propagate through the nervous system. [Note: FWIW, Piaget would talk of accommodation where Freeman talks of assimilation. Piaget uses assimilation for a different purpose.]


Now let's consider ordinary conversation between two people. Such conversations generally involve fluid turn-taking; many remarks are elliptical and-or grammatically slipshod. Conversation is dynamic and two-way and is thus unlike the prototypical situation in those strange hypotheticals beloved of philosophers and literary critics, where some observer is simply confronted with (mysteriously) written signs. It's not at all clear to me that intuitions “pumped” from such a situation (to use John Holbo's Valvological term) are of much use in understanding basic conversational interaction - but then, that's not what those strange tales are about, is it?

The situation of written signs allows us to slip into what the cognitive linguists call the conduit metaphor for communication. This is the notion that the author puts meaning into the text, which then serves as a conduit that conveys that meaning to the reader. That is what happens with the electrical signals in a telephone conversation, for example; those signals do travel through wires (and satellite links too) between people. But the meaning does not. Similarly, written marks pass from author to reader, but the meaning does not mysteriously tag along on those ink splotches, waiting to leap from the page into the mind of the attentive reader. Something else happens, something we don't understand very well. Hence the attraction of talking about communication as though it were sending meaning through a conduit: That's easy to understand. But wrong.

Getting back to conversation and its constant two-way interaction, I am going to say that, in conversation, two people (or more, as the case may be) share the same intentional framework. It sometimes happens, for example, that one person will finish a sentence begun by the other. This is not mind-reading in the sense of paranormal access to the thoughts of another, but it certainly implies that, in conversation, one can become highly attuned to what's on the other's mind.

To be useful, however, the notion of intentional frame needs to be more than a matter of mere definition. The definition needs to “pick out,” call our attention to, a significant range of observations. Here's a start:

Starting back in the 1960s and continuing on through the 1980s, a Boston psychiatrist named William Condon filmed and video-taped people interacting with one another. He found that in normal successful interactions the physical motions of the participants were entrained to one another so that they shared the same temporal framework to within tens of milliseconds. People with certain kinds of disabilities - e.g. schizophrenia, autism - were not able to synchronize with others. Condon further discovered that neonates could synchronize their body movements to the rhythms of adult speech within an hour after birth. This interactional synchrony, as he called it, is the physical correlate of sharing the same intentional framework. It is evidence that something physical is supporting intersubjective intentionality, something more tangible than being able to finish someone else's sentence for them.

Here's some more on interactional synchrony from my review of Steven Mithen's The Singing Neanderthals:
Using Freeman's work as a starting point, I have previously argued that, when individuals are musicking with one another, their nervous systems are physically coupled with one another for the duration of that musicking (Benzon 2001, 47-68). There is no need for any symbolic processing to interpret what one hears or so that one can generate a response that is tightly entrained to the actions of one's fellows.

My earlier arguments were developed using the concept of coupled oscillators. The phenomenon was first reported by the Dutch physicist Christian Huygens in the seventeenth century (Klarreich 2002). He noticed that pairs of pendulum clocks mounted to the same wall would, over time, become synchronized as they influenced one another through vibrations in the wall on which they were mounted. In this case we have a purely physical system in which the coupling is direct and completely mechanical.

In this century the concept of coupled oscillation was applied to the phenomenon of synchronized blinking by fireflies (Strogatz and Stewart 1993). Fireflies are, of course, living systems. Here we have energy transduction on input (detecting other blinks) and output (generating blinks) and some amplification in between. In this case we can say that the coupling is mediated by some process that operates on the input to generate output. In the human case both the transduction and amplification steps are considerably more complex. Coupling between humans is certainly mediated. In fact, I will go so far as to say that it is mediated in a particular way: each individual is comparing their perceptions of their own output with their perceptions of the output of others. Let us call this intentional synchrony.

Further, this is a completely voluntary activity (cf. Merker 2000, 319-319). Individuals give up considerable freedom of activity when they agree to synchronize with others. Such tightly synchronized activity, I argued (Benzon 2001), is a critical defining characteristic of human musicking. What musicking does is bring all participants into a temporal framework where the physical actions - whether dance or vocalization - of different individuals are synchronized on the same time scale as that of neural impulses, that of milliseconds. Within that shared intentional framework the group can develop and refine its culture. Everyone cooperates to create sounds and movements they hold in common.
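The coupled-oscillator idea in the passage above can be illustrated with a toy simulation. What follows is only a sketch under stated assumptions: it uses a generic Kuramoto-style phase model, not Huygens' actual pendulum mechanics and not anything drawn from the review or from Freeman's work. The function names, the two natural frequencies, and the coupling strength are all hypothetical choices made for illustration. Two oscillators with slightly different natural frequencies drift apart when uncoupled and settle into a fixed phase relationship when each nudges the other.

```python
import math

def simulate_pair(omega1, omega2, coupling, dt=0.001, steps=20000):
    """Euler-integrate two phase oscillators that nudge each other.

    Returns the phase difference (theta2 - theta1) after every step.
    This is a Kuramoto-style toy model, chosen for illustration only.
    """
    theta1, theta2 = 0.0, 2.0  # arbitrary, unsynchronized starting phases
    diffs = []
    for _ in range(steps):
        # Each oscillator runs at its own rate, plus a pull toward the other.
        d1 = omega1 + coupling * math.sin(theta2 - theta1)
        d2 = omega2 + coupling * math.sin(theta1 - theta2)
        theta1 += d1 * dt
        theta2 += d2 * dt
        diffs.append(theta2 - theta1)
    return diffs

def drift(diffs, window=1000):
    """How much the phase difference changed over the last `window` steps."""
    return abs(diffs[-1] - diffs[-window])

# Hypothetical parameters: slightly mismatched natural frequencies.
coupled = simulate_pair(omega1=6.0, omega2=6.5, coupling=1.0)
uncoupled = simulate_pair(omega1=6.0, omega2=6.5, coupling=0.0)

print("drift with coupling:   ", drift(coupled))
print("drift without coupling:", drift(uncoupled))
```

With coupling, the phase difference stops changing: the pair is entrained even though the two natural frequencies differ. Without coupling, the difference keeps growing. In this model the lock holds only while the frequency mismatch is small relative to the coupling strength, which is one crude way of picturing why synchrony can fail.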
That people “give up considerable freedom of activity” when they engage others in, e.g. conversation, is important. When you converse with someone you commit yourself to being intelligible to them; you tailor your remarks to (your best understanding of) their conceptual competence and interests (cf. that well-known Grice article that I've never read). I'll call on this “giving up” when talking about story-telling. Let's continue with the passage from my review:
There is no reason whatever to believe that one day fireflies will develop language. But we know that human beings have already done so. I believe that, given the way nervous systems operate, musicking is a necessary precursor to the development of language. A variety of evidence and reasoning suggests that talking individuals must be within the same intentional framework.

Consider an observation that Mithen offers early in his book (p. 17). He cites work by Peter Auer who, along with his colleagues, has analyzed the temporal structure of conversation. They discovered that, when a conversation starts, the first speaker establishes a rhythm to which the other speakers time their turn-taking. That is, even though they are only listening, other parties are actively attuned to the rhythm of the speaker's utterance (cf. Condon 1986). What if this were necessary to conversation, and not just an incidental feature of it?

Let us recall some passages from Eric Lenneberg's landmark review and synthesis, The Biological Foundations of Language (1967). While he does not address the issue of conversational turn-taking, he does devote the better part of chapter three to timing issues. He was particularly interested in problems arising from the fact that neural impulses travel relatively slowly and that the recurrent nerve, innervating the larynx, is over three times as long as the trigeminal branch innervating one of the jaw muscles. It also has a smaller diameter, which means that impulses travel more slowly in it than in the trigeminal. The upshot, observes Lenneberg, is that “innervation time for intrinsic laryngeal muscles may easily be up to 30 msec longer than innervation time for muscles in and around the oral cavity.” He goes on to observe: “Considering now that some articulatory events may last as short a period as 20 msec, it becomes a reasonable assumption that the firing order in the brain stem may at times be different from the order of events at the periphery” (96). It is on the basis of such considerations, which he discusses in some detail, that Lenneberg concludes: “rhythm is … the timing mechanism which should make the ordering phenomenon physically possible” (119).

It follows from this that, if you wish your utterances to smoothly intercalate with those of others, you need to share their rhythms; that is the only way your conversational entrances will be appropriately timed. Still, this might merely be a conversational convenience, not a necessity. So, let us consider the problem of speech perception.

We know that, while we tend to hear speech as a string of discrete sounds, that is something of an illusion. Sonograms do not show the segmentation that we hear so easily (Lenneberg 93-94). The brain is doing some sophisticated analysis of the sound stream. Though I am not aware that anyone has investigated this, I can imagine that it would be very useful if the listener operated within the same temporal framework as the speaker. This might help with the segmentation. If this is so, rhythmic synchronization is no longer simply a feature of how the nervous system happens to operate. It becomes essential to being able to treat the speech stream as a string of phonemes; it is necessary to linguistic communication.

Let us push the argument a step further. For the last decade or so there has been considerable interest in the notion that people acquire a so-called theory of mind (TOM) early in maturation and that this TOM is critical to interpersonal interaction (see e.g. Baron-Cohen 1995). Gaze following is one behavior implicated in TOM. Humans beyond a relatively early age will follow the direction of one another's gaze. I would like to suggest that we notice gaze direction in people with whom we synchronize, but not otherwise.

Think about the perceptual requirements of noticing and tracking gaze direction. Even at conversational distance, another person's eyes are small in relation to the whole visual scene; thus the visual cues for gaze direction will also be small. Further, people in conversation are likely to be in constant relative motion with respect to one another. The motions may not be large - head turns and gestures, trunk motion - but they will be compounded by the fact that one's eyes are in constant saccadic motion. Synchronization would eliminate one component of relative motion between people and therefore simplify the process of picking up the minute cues signaling gaze direction. But if one cannot properly synchronize with others, then those cues will be more difficult to notice and track. Thus the capacity for interpersonal synchrony may be a prerequisite for the proper functioning of TOM circuitry.

In this light let us now consider Paul Bloom's (2000) recent work on language acquisition. He has demonstrated that young children do more than merely associate the words they hear with the objects and events to which they refer. Such associations are not sufficient. Rather children make inferences about speakers' intentions when listening to them and learning the meanings of words they use. In the current parlance, children use a so-called theory of mind (TOM) to infer what, of many immediate possibilities, the speaker's words refer to. Inferring another's intentions also plays a large role in Quine's (1960, 26 ff.) classic argument about radical translation.
The general point, then, is that asserting that people in conversation share the same intentional framework is not a matter of mere definition. It's a conception that's tied to various empirical correlates. It's a statement about how human nervous systems interact as physical systems.

Given that conversation requires a shared intentional framework, we're now ready to think about story-telling.

Story Telling

One reason for taking oral story-telling as my paradigm case is that it is historically prior to written texts, which is where almost all literary discussion of intentionality begins. The fundamental point about oral performance is that teller and listeners are there, together, in one another's visible and audible presence. The teller can sense immediately whether or not the audience is enjoying the tale; and audience members can register their interest or boredom, their pleasure or their anxiety. To be sure, each person's subjective experience is private. But not totally so, for their posture, gestures, facial expressions, sighs, murmurs, groans, giggles and exclamations, all are apparent to everyone else and to the speaker as well. The living significance of these non-verbal expressions is obvious to all, as they are grounded in biological behaviors that evolved to communicate inner states to conspecifics; these behaviors may be modified by cultural convention, to be sure, but those present share the same conventions.

The storyteller can thus modulate his performance in response to audience reaction and individual audience members can modulate their reactions by taking into account the reactions of their friends and family. Here literature - sometimes called orature - exists in the interaction among people assembled together. To be sure, there is an asymmetry between the role of storyteller and that of audience members in that audience members do not converse with the story-teller in the normal fashion. The teller talks, everyone else listens. Still, audience reaction is directly available to the teller, and to others in the audience. The experiences of people in this situation are public and shared.

The details of just how this happens are something of a mystery. But I can employ the notion of an intentional frame without having to solve that mystery. In fact, the purpose of using the idea is to develop a way of thinking about literary communication at the neural level without having first to work out the details of verbal and non-verbal communication. The idea depends only on certain general characteristics of the situation, not on all the messy details of how the mechanisms work.

Now we need to take a very abstract view of the nervous system, the kind of view adopted by AI types when they think about computers and brains and just when it is that we'll have a computer as intelligent as a human brain. They make the argument in terms of raw computing capacity, arguing that when computers have the raw capacity of a human brain, they'll be as smart as we are - and when computers have more raw capacity, they'll outsmart us. Forget about whether or not they're right about that. It's the thumbnail capacity estimate that interests me.

The estimate is stated in terms of the number of states a physical system can assume. Both computers and brains are, after all, physical systems. The number of states depends on how many elements the system has, how many states each element can assume, and the way the elements constrain one another. If the elements are completely independent, then the last factor drops out. The elements in both brains and computers, however, do not operate independently of one another. Neurons signal one another, thereby influencing one another's states. So do elements in computers.

However the estimate works out for a human brain, the number is very large. That's all that matters right now. We don't need to know how many elements are in a typical brain (the problem here is that we don't know just what the appropriate element is - the neuron, the synapse, the molecule), nor how many states each element can assume, nor even just how they constrain one another. All we need to know is that that's what we're talking about and that number is quite large.

Now consider the system consisting of all those people present during the telling of a story - the members, say, of a single hunter-gatherer band. How many states are available to that “collective” brain? Since, say, 30 brains taken together have thirty times as many elements as one brain, you might think that that system has many more states available to it. If one brain has X states available to it, then 30 brains taken independently would have X^30 states available to them (the state counts of independent systems multiply) - a number that is vastly larger than X. And that would indeed be the case if we were considering the band, as it were, at some time when the people were going about their business either as lone individuals or in groups of two, three, or so. In such a situation, constraints on overall activity are loose. But that's not the situation we are considering.

We're thinking of these thirty brains as they are committed to the story-telling situation. Though only one brain is telling the story, all brains are intent on it. That places severe constraints on the ensemble. Let me suggest that, in the story-telling situation, the state-space of the collective is no larger than that of a single brain and may, in fact, be somewhat smaller, given the constraints imposed by the story itself.
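The counting argument can be made concrete with toy numbers. The sketch below is illustrative only: the element count and the number of states per element are hypothetical placeholders, not estimates for any real brain, and treating the elements within a brain as independent is a simplifying assumption (real constraints among elements would shrink every figure). It works in logarithms because the raw counts are too large to write out. The two cases are the loose one, where the brains are independent and their state counts multiply, and the limiting case of the story-telling claim, where every brain follows the same trajectory.

```python
import math

def log10_states(elements, states_per_element):
    """log10 of the state count for independent elements: k ** n."""
    return elements * math.log10(states_per_element)

# Toy numbers, chosen only for illustration (not estimates of a real brain).
ELEMENTS_PER_BRAIN = 1000
STATES_PER_ELEMENT = 2
BRAINS = 30

# One brain: X = k ** n states.
one_brain = log10_states(ELEMENTS_PER_BRAIN, STATES_PER_ELEMENT)

# Independent brains multiply their state counts, X * X * ... = X ** 30,
# so the logarithms add.
independent_band = BRAINS * one_brain

# Limiting case of the story-telling claim: all brains track the same
# trajectory, so the ensemble has no more states than a single brain.
locked_band = one_brain

print(f"one brain:        about 10^{one_brain:.0f} states")
print(f"independent band: about 10^{independent_band:.0f} states")
print(f"locked band:      about 10^{locked_band:.0f} states")
```

The gap between the last two figures is the size of the constraint that commitment to the story would have to impose. Nothing in the sketch shows that the constraint is in fact that strong; it only spells out what the claim amounts to numerically.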

Consider further that, during the story-telling interval, cells and synapses are dying, synaptic weights are being adjusted, and so forth. That is to say, each brain is modifying itself in modest ways as it “stretches forth” to comprehend and assimilate the unfolding story. Everyone is thus becoming attuned to the same set of (imaginary) events. And this happens time and again, with the same story, with related stories, and has been going on for generations. This is culture at work, one aspect of it.

* * * * *

That's the basic story. Now for some further comments. In the first place, it seems to me that traditional story-telling is far more restrictive than ordinary conversation. The stories are from a repertoire that's well known and relatively fixed. No new information is being conveyed. On the contrary, it's very important that the characters and incidents be the same from telling to telling, though the exact verbal formulations will differ. It is because of this restriction that I made an explicit argument about the size of the collective state space.

I made no such argument about general conversation. Yes, there are restrictions, and parties to a conversation do try to be mutually intelligible. But many conversations are intended to convey information from one person to another and, beyond that, some conversations are intended to change the way people think in significant ways. In such conversations it is common that one person will say things that another cannot understand. That is to say, one party will move into a region of their neural state space that has no correlate in the state space of some other party; hence, that other party cannot “follow” the argument. It may well be that all parties in the conversation will make such statements. These issues may or may not get resolved. If they are not resolved, then the collective state space still contains regions that at least one party cannot enter.

It thus seems to me that the collective state space of ordinary conversation might well be larger than that of either nervous system that is a party to the conversation. In the absence of mathematical analysis - which will have to be done by someone with more math than I've got - it's difficult to say much about this.

[Note, however, that this is a different issue from whether or not the parties share the same intentional frame. That doesn't necessarily entail mutual understanding; it only implies a certain mode of interaction.]

Finally, the world of written texts, which is the world that interests most literary critics. Obviously various issues must be addressed if we wish to generalize this story to written texts. I want to look at only one of the issues, the fact that stories are no longer drawn from a fixed repertoire known to all. But of course, while any given story is new, the situations and kinds of characters in it are not - hence the notion that there are only 36 plots, or whatever the number is. What interests me, though, is that some texts are very popular, while others are not. And the popularity of a text may well change over time.

What we see is a selective process in which authors and publishers put texts on the market and people, in effect, form themselves into a diffuse network of communities around those texts. [Whether or not these are Fish's interpretive communities is a question I'll leave to the reader.] Each of these communities shares an intentional framework that is regulated by a given text (or, as the case may be, body of texts).

I think that what's going on is that the gradual and continuous shaping of a fixed body of stories in oral cultures has become transformed into a somewhat different way of guaranteeing the “fit” between text and audience. The audience has a large number of texts at their disposal, more than any one individual can read, and each person chooses those which best fit their needs. In this they are influenced by family, friends, and peers (Duncan Watts has some web-based experiments on this, though I don't have any citations conveniently at hand).

Exit, pursued by a bear

There you have it, a neural explication of intention among groups of story-telling apes. I suppose that, for the literary critic who doesn't buy into intentionalism, this argument is strange enough that it may be safely ignored. For the critic who wishes to use intentionalism as a way of restricting interpretive play, there may or may not be something here. The state-space restriction I've stated cannot be taken to imply that a story means the same thing to everyone who hears it. It might well mean different things. To figure that one out we'd need a way of determining the meaning of a text.

That's a different discussion. On that issue, my opinion is that you can't determine the meaning of a text “from the outside.” Meaning is subjective in the way that color is subjective, and considerably more complex. And that means you can't determine what a text means to someone else any more than you can determine whether or not red is exactly the same for them as it is for you. So why bother?

In general, this is a different conceptual universe from that of standard-issue lit. crit. and its Theory-laden alternatives. The standard questions either do not exist in this universe, or they are transformed in sometimes strange ways.


Note specifically that I am not advancing this conception as a means of resolving the (by now) standard-issue problems of literary intention in a new and more definitive way. I’m uninterested in the Knapp-Michaels line, or alternatives, and consider it a dead end. That is, I see no resolution within . . . Rather, I’m explicating intentionality within a different conceptual domain, one where, at least in principle, we can deal with issues computationally and empirically.

* * * * *

Intention is a big topic in literary criticism and philosophy of language, and certainly here at The Valve. Yet I find myself perplexed over all the wit, intellect, and energy expended in contemplation of peculiar phenomena that, so far as I can tell, have yet to materialize – you know, Wordsworth on the beach and such.

I’m interested in the mechanisms of language. What happens in the mind and brain when it sees written words? That’s what I want to know. From that point of view, whatever that process is, should it go into action when it sees words inscribed on a deserted beach and ponder the meaning of those words, why then, it’s made a mistake. It’s been fooled. Any of our perceptual and cognitive mechanisms can be fooled. So what?

Experimental psychology is replete with odd and ingenious ways of fooling the nervous system and thereby learning something about how it works. I don’t see how this bit of philosophical foolery tells us much of anything about how the system works.

* * * * *

I note further along these lines that there is a literature on small group behavior that indicates that certain sizes of groups are appropriate for certain kinds of activities. As far as I know, no one really knows where these restrictions come from, though people are working on the problem. It’s not unreasonable to think that such restrictions have to do with the relationship between task requirements and
