Monday, August 25, 2014

From “Distant” Reading to Crowdsourcing, or, the Rise of Citizen Scholars

Distant reading of ALL texts, yes. But also, every text must be read by a human reader. This will require crowd-sourcing† from skilled amateur readers. This is the only way, for example, we’ll be able to deal with the mass of popular culture texts that must be accounted for.

Distant reading and dark matter

One of Moretti’s arguments in favor of “distant” reading is that most of the texts that have been written never receive consideration under the standard (now traditional) regime of literary criticism. Under that regime reading is restricted to a relatively small handful of canonical texts. What of all those unread texts, the “dark matter” of literary culture (to borrow a metaphor from astrophysics)? How do we take account of them? Yet they must play some role in literary culture. How are we going to understand that role if we don’t read them?

The problem, Moretti pointed out, is that no one has time to read all of the texts. One critic can read several hundred texts, perhaps a thousand or so, but not the tens of thousands of novels published in, say, the nineteenth century alone. Moretti’s solution? Use computers to scan through all those texts and analyze them in this way or that.

I have no problem with this computer-mediated “distant” reading, as Moretti calls it. No problem at all. But, as a result of taking a close look at Matthew Jockers’ Macroanalysis, I’ve reached the tentative conclusion that a much larger fraction, if not all, of the texts in that unread dark matter must in fact be read by humans. Perhaps not in the style of old-fashioned close reading, but they must be read in order to identify features and properties that computers cannot (yet) identify.

For example, ring composition

For example, I’ve become convinced that we need to know of a text whether or not it has ring-form or centerpoint construction, or whether it is of more standard “straight through” organization. We also need to know of any instances where the organization of episodes by plot differs from the ordering inherent in story. We don’t even know these things for the canonical texts, much less for a significant portion, if not all, of the non-canonical texts.

Determining whether or not a text is structured by ring-form is not rocket science. Nor is it like undertaking a Lacanian or a Feminist reading. But it is exacting work, work that cannot be done by a machine. It has to be done by a skilled human reader.

Crowd-sourcing the reading of dark matter texts

How is this to be done? Moretti implicitly assumes the standard model of academic literary critical practice, that of the lone practitioner. That’s how, for example, Love and Death in the American Novel got written (one of my macroanalysis posts centers on this book). Fiedler read through the American and much of the European canon and based his book on those readings.

How do we get beyond this problem? We’re going to have to crowd-source the reading of dark textual matter to citizen scholars, people with a strong interest in literature, but no interest in getting a doctorate in literature or holding an academic post. They can read heretofore “unread” texts with specific analytic or descriptive agendas in mind. They would then enter the results of their reading into a publicly accessible database. Depending on the specific features involved, we may want each text read by two or three readers to verify results.
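
To make the verification step a bit more concrete, here is a minimal sketch of how reports from two or three readers might be reconciled. Everything in it is an assumption made for illustration: the titles, the labels "ring" and "linear", the function name reconcile, and the simple majority rule are all hypothetical, and a real project would have to decide for itself what counts as agreement.

```python
from collections import Counter


def reconcile(readings, min_readers=2):
    """Reconcile independent readers' labels for one text and one feature.

    Hypothetical rule: a label counts as verified only when a strict
    majority of at least `min_readers` readers reported it.
    """
    if len(readings) < min_readers:
        return None, "needs more readers"
    label, votes = Counter(readings).most_common(1)[0]
    if votes * 2 > len(readings):
        return label, "verified"
    return None, "disputed"


# Hypothetical database entries: title -> labels entered by separate readers.
reports = {
    "Novel A": ["ring", "ring", "ring"],
    "Novel B": ["ring", "linear"],
    "Novel C": ["linear"],
}

results = {title: reconcile(labels) for title, labels in reports.items()}
for title, (label, status) in results.items():
    print(title, label, status)
```

A disputed text would simply go back into the pool for another reader, which is where the third reader comes in.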

I can imagine, for example, that undergraduate literature majors would participate in a one-semester or even a one-year workshop where they read through dark matter texts with specific objectives in mind. Graduate students would do this as well. But I don’t see why such reading couldn’t be done by people who are not directly affiliated with a college or university. At least some of the people who participate in book reading circles and contribute to fan sites on the web would be interested in participating in on-going research projects in this way.

Of course, for this to happen, the profession will have to reorganize how it does business. Just how that will be done is beyond the scope of this brief note.

Once the profession figures out how to do this, it will change its relationship with the public at large, and in ways beneficial both to that public and to academic literary research.

† From a recent email to some colleagues: Come to think of it "crowd sourcing" probably isn't a good term. What I've got in mind really is more like the Wikipedia – and, having been involved w/ doing some work on manga and anime for Wikipedia I'm aware of the problems. What I want of my citizen-scholars isn't "distant" reading, but "close" reading, albeit of a fairly circumscribed sort. Think of all those bird watchers who do real work tracking the activities of birds and reporting on them. That's more like what I've got in mind.

