Friday, August 15, 2014

Reading Macroanalysis 5: An Interlude on Scale: Micro, Meso, and Macro

Before moving on to the last two major chapters, “Theme” (8) and “Influence” (9), I want to pause a bit and think about scale as I discussed it in Toward a Computational Historicism. Part 1: Discourse and Conceptual Topology and a consequent Working Paper on Digital Historicism. While Jockers focuses on the macroscale, that of large populations of texts, he is also working at the mesoscale, that of the individual text, and his analytical work implies microscale phenomena.

From Micro to Meso: Paths in a Network

It has become common for cognitive scientists to think of the mind as a cognitive, or associative, network:


The nodes represent concepts, or even words, while the arcs or edges represent relations. We can now think of an utterance or a written statement as taking a path through the network:


An entire novel is simply a very long path, one that will pass through a large area of the network and that will go through various subnetworks many times, often along different paths and orientations.
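To make the path idea concrete, here is a minimal sketch in Python. It assumes a tiny hand-picked set of concept nodes (the `vocabulary` below is purely illustrative, not drawn from any real cognitive model) and treats a text as the ordered sequence of nodes it visits, counting how often each step from one node to the next is taken.

```python
from collections import Counter

def text_to_path(text, vocabulary):
    """Return the ordered sequence of network nodes (known words) the text visits."""
    tokens = [t.strip(".,;:!?\"'").lower() for t in text.split()]
    return [t for t in tokens if t in vocabulary]

def edge_counts(path):
    """Count how often each directed step between consecutive nodes is taken."""
    return Counter(zip(path, path[1:]))

# Hypothetical concept nodes, chosen only for illustration.
vocabulary = {"lust", "shame", "blame", "action"}

path = text_to_path("Lust in action brings shame, and shame brings blame.", vocabulary)
edges = edge_counts(path)
```

A novel would produce a very long `path`, and the same edges would show up many times, which is one rough way of picturing repeated passes through a subnetwork.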

Let us posit that the way an author moves through the net is the author’s style. The function words that are so very useful as features in identifying style can be thought of as regulating movement from one node (content) to another. That’s why they are so very useful in stylistic analysis.

No matter what you write about, you have to use those function words. Your choice of content words varies as your topic varies, but the set of function words is quite limited and you have to draw from that set regardless of topic. That is to say, while an author has to navigate many different subnetworks, that is, many different topics, the way they navigate each subnetwork is pretty much the same.
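As a rough illustration of why function words make handy style features, the sketch below computes a relative-frequency profile over a short list of function words and compares two profiles by cosine similarity. The word list and the similarity measure are my own assumptions for the example; this is not a description of the features or methods Jockers actually uses.

```python
from collections import Counter
import math

# Illustrative list only; real stylometric studies use much longer lists.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "that", "but", "with", "for")

def function_word_profile(text):
    """Relative frequency of each function word among all tokens in the text."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total if total else 0.0 for w in FUNCTION_WORDS]

def cosine_similarity(u, v):
    """Cosine of the angle between two profiles; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

profile = function_word_profile("The cat and the dog")
```

Because the profile ignores content words entirely, two texts on quite different topics by the same writer can still end up with similar profiles, which is the intuition behind the paragraph above.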

But why should that signal also be an individual one? Because, as the saying goes, there are many ways to skin a cat. That is, there are many ways to construct sentences and paragraphs about any given topic. Different writers will choose different ways of so doing.

Let’s now push that one step further. Following the psychoanalyst Heinz Lichtenstein, Norman Holland argues that each of us possesses an identity theme, which he has recently identified with brain systems, thus grounding it in biology. By the time a given individual has become a writer he or she will have, of course, been shaped by culture and society. And so we have this diagram:

writers mind brain

When Jockers is identifying an author’s style, he’s doing so at the level of the text, at the mesoscale. But the phenomenon he’s examining is the result of processes at the microscale.

From Micro to Macro: Patterns in Networks

Cognitive scientists have also identified patterned subnetworks, where each subnetwork is about a specific topic or general idea:


In this case the two highlighted areas represent two different subnetworks. And here, for example, is the network I designed for the concept of shame when I was working on Shakespeare’s Sonnet 129:

Shame s129 mln76

Now, in my work on computational historicism I’ve suggested that the topics recovered through mesoscale topic analysis may well reflect the presence of these microscale patterns in the cognitive networks of individual authors. It’s these networks that cause certain words to ‘travel’ or ‘hang’ together in a given text, where we can think of cause in the sense of formal cause. So, here’s the SIN SHAME AND REPENTENCE topic that Jockers identified in his collection of novels:

sin shame repent

And here’s the SHAME AND ANGER topic:

shame anger

[You can examine these topics online here.]
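One simple way to see what it means for words to ‘travel’ together is to count how often pairs of words turn up in the same stretch of text. The sketch below does only that; it is a deliberately crude stand-in for illustration, not the topic modeling behind Jockers’ topics, and the three sample segments are invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(segments):
    """Count, for each unordered word pair, how many segments contain both words."""
    counts = Counter()
    for segment in segments:
        words = {w.strip(".,;:!?\"'").lower() for w in segment.split()}
        words.discard("")
        # Sort so each pair has one canonical (alphabetical) order.
        counts.update(combinations(sorted(words), 2))
    return counts

# Invented segments, for illustration only.
segments = [
    "shame and sin and repentance",
    "sin brought shame",
    "anger and shame",
]
pairs = cooccurrence_counts(segments)
```

Pairs with high counts are the ones that ‘hang’ together across segments; a topic model finds such groupings in a far more principled way over an entire corpus.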

If you examine those clusters of words I think you’ll find yourself in the world of Shakespeare’s sonnet:
Th' expence of Spirit in a waste of shame
Is lust in action, and till action, lust
Is perjurd, murdrous, blouddy full of blame,
Savage, extreame, rude, cruell, not to trust,
Injoy'd no sooner but dispised straight,
Past reason hunted, and no sooner had
Past reason hated as a swollowed bayt,
On purpose layd to make the taker mad.
Made In pursut and in possession so,
Had, having, and in quest, to have extreame,
A blisse in proofe and provd and very wo,
Before a joy proposd behind a dreame,
       All this the world well knowes yet none knowes well,
       To shun the heaven that leads men to this hell.
Note, though, that not all of the topics Jockers identifies can be accounted for in this way. One of the topics in his corpus, for example, is Irish dialect:

Irish Dialect

There is little reason to think that all the terms in this topic are held together by the same kind of network structure that governs shame, or, for that matter, such a mundane activity as getting a meal at a restaurant. We’ve got a different kind of microscale process operating in the case of this Irish dialect topic.

I’ve got two points in mind. First, Jockers is examining a macroscale object, a corpus of thousands of texts written during the course of the 19th Century, but that object is the result of microscale processes massively distributed in time and space: the lives of those many authors and, by implication, of their readers. Second, though Jockers’ work comes out of the corpus linguistics and machine learning that arose in the last two decades (plus an older tradition of stylometrics), it has a natural affinity, if you will, for the kinds of hand-crafted cognitive models characteristic of an older tradition in computational linguistics and AI, namely those network models. They belong in the same larger conceptual universe, that of the human sciences in the 21st Century.

* * * * *

