This article makes a very different claim, one that is based on observing a great deal of instances in which individuals have engaged in fictional or non-fictional writing over the past two centuries. Seen from this perspective, fictionality emerges as a highly legible category at the level of linguistic content ("lexis" in Aristotelian terminology). Such legibility is what allows us to build predictive models that can identify works of fiction with greater than 95% accuracy, and it should be added, that allow human readers to do the same (as in my initial experiment above). Contrary to the beliefs of the philosophers of language or different schools of literary critics from poststructuralists to postclassical narratologists, truth claims in language (or their opposite fictionality) are a highly recognizable linguistic aspect of texts. What appeared to be the case at the level of the sentence or "utterance" (what Searle rather vaguely called a "stretch of discourse"), no longer holds when we observe writing at a different level of scale. Placing all of the emphasis on the reader's activity, whether as cognitive predisposition or interpretive freedom, overlooks the powerful and extensive ways that texts mark themselves for their readers according to their fictional nature.
Not only does the research here suggest that fictionality is a highly legible category, but it also appears to have been surprisingly stable for at least two hundred years. While there have undoubtedly been significant changes to the way we tell stories, when we use learning algorithms trained on nineteenth-century texts we can still recognize contemporary novels with an impressive degree of accuracy (about 91%), even if that performance does decrease (history still matters). Indeed, the very features that seem to indicate the uniqueness of novels in the nineteenth-century, for example, appear either to be increasing over time or largely holding steady, even among a diverse range of genres into the twentieth and twenty-first century. While it remains an open question as to the extent to which different genres exhibit these features of fictionality to a similar degree, my initial research suggests that there is a surprising degree of commonality across very diverse types of fictional writing. Such continuity has important and still largely unaddressed implications for how we think about both genre and literary periodization.
Note, I've not yet read the article. I'm just skimming it. And I spotted this:
Indeed, imagining people as people may be fiction's most important role. If we remove dialogue from the sets above, including the pronominal expressions that accompany them (she said, he cried, etc.), we see how third-person pronouns emerge as one of the strongest indicators of fictionality along with references to family members and bodies (Table 5). There is over a three-fold increase in the average number of she/he pronouns in fiction versus non-fiction outside of dialogue, with just these two words alone accounting for more than five-percent of all words in the text (or roughly 5,000 instances for a medium-length novel).
This speaks to my hobby horse about how we as critics have to learn to think and talk about characters as characters, not as people (who just happen not to be real).
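For anyone who wants to see what the pronoun measure in that passage amounts to, here is a minimal Python sketch. It is not the article's method: the tokenizer, the `pronoun_share` function, and the sample sentence are all made up for illustration, and unlike the article it makes no attempt to strip dialogue or dialogue tags first. The 100,000-word novel length is only an inference from the quoted figures (five percent working out to roughly 5,000 instances), not a number the article states.

```python
import re

# Hypothetical helper: the two pronouns the quoted passage singles out.
PRONOUNS = {"she", "he"}

def pronoun_share(text, pronouns=PRONOUNS):
    """Return the fraction of word tokens in `text` that are in `pronouns`.

    Uses a crude lowercase word tokenizer (an assumption for this sketch);
    dialogue is NOT removed, unlike in the quoted study.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in pronouns)
    return hits / len(tokens)

# Invented sample sentence, only to exercise the function.
sample = "She looked at the door. He did not move, and she said nothing."
share = pronoun_share(sample)

# Arithmetic behind the quoted figure: a 5% share of an assumed
# 100,000-word novel comes to about 5,000 pronoun instances.
assumed_novel_length = 100_000
implied_instances = round(0.05 * assumed_novel_length)
```

Run over a whole novel and a whole work of non-fiction, the same function would give the two shares whose ratio the passage describes as more than three-fold, though only after dialogue has been removed by some other means.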
And then we have:
Finally, we see how novels are marked by a much stronger use of mental states, captured in major verbs such as know, feel, think, remember, and believe, along with a second layer of less frequent, but similarly distinctive complex cognitive verbs such as admit, ponder, imagine, and forgive (the latter not shown). This is the ground of the novel's reflectiveness, that which binds together doubt and conditionality into a consistent mental state. Indeed, the combination of seem and feel, both of which appear 30% more often in the novel, give us a particular indication of what I am calling the novel's phenomenological orientation. Not the world itself, but a person's encounter with and reflection upon that world - the world's feltness - is what marks out the unique terrain of novelistic discourse when compared with other forms of classical fiction.* It is this combination of sense perception plus cognitive skepticism that seems to bring out the novel's contribution to fictional discourse. The novel professes its uniqueness in the way it offers extended reading experiences of the human assessment of the world's givenness.
That speaks to the Theory of Mind folks, though I wish they'd stop using that misleading and assumes-too-much phrase. However, "the mind-body distinction that underlies theory of mind models does not hold up well in light of the novel's strong emphasis on sensorial input and embodied entities."
Abstraction and concreteness:
The results suggest that the novel's relationship to its concreteness, when measured in this way, has indeed been changing over the course of the nineteenth century (Table 11). If we look at the first half of the nineteenth century, we see how there is a greater degree of abstraction relative to physical objects when compared to classical fiction and tales, but that this difference disappears by the second half of the century. As Ryan Heuser and Long Le Khac have argued, the British novel experiences a decline of valuation and a rise of concretization over the course of the century. And yet an important caveat to that finding is that while abstraction appears to be declining in the novel, it never drops below other kinds of fictional discourse from the period. Far from the Victorian novel being uniquely concrete, it is the early-nineteenth-century novel that looks uniquely abstract when compared to other kinds of fictional discourse* from the period. It suggests that we have been potentially telling this story in reverse: what matters in the nineteenth century is not the later rise of concreteness, which looks more like other types of fictional writing, but the earlier abstractness of the novel, which stands out relative to other types of fiction (not to mention abstraction remains considerably more important to these texts overall than their physicality).
The other part of the table, that which concerns the novel's specificity, tells this same story in the other direction. Where the first half of the century witnessed little significant difference in the degree of specificity between the novel and other kinds of fiction (about a 0.003% difference), by the second half of the century, the novel has about 0.5% more specific words than other kinds of fiction (or about 2 words/page). In other words, the novel approaches other kinds of fiction in its degree of abstraction while it departs from other kinds of fiction in its degree of specificity. It begins the century uniquely abstract and finishes uniquely specific.
In trying to distinguish fiction from non-fiction, in locating what makes fiction and the novel unique as types of writing, I have in the process been attempting to gain insights into their larger social function, to answer that perennial question of "why literature matters." According to the results presented here, if we focus on the quantitatively distinct qualities of novels in particular - of what separates them off from non-fictional or "true" writing - we can say that the novel's mattering since the nineteenth century appears to be less a matter of social realism and more one of phenomenological encounter, a kind of social imbedding in the world.
*Fictions that aren't novels:
Non-novelistic fiction in this case refers to a broad mixture of fictional writing that would have been very present to nineteenth-century readers, including classical epics translated into prose (The Iliad, Odyssey, Edda, Nibelungenlied), classic works of prose fiction (The Tale of Genji, The Decameron, King Arthur Tales, and Rabelais), fairy tale collections from around the world (drawn from Irish, German, Danish, Japanese and Indian sources), contemporary novella collections (novellas by Hoffmann, Tolstoy, Dickens, Maupassant, Hawthorne, and Washington Irving), as well as a variety of "tales" collections (Tales of Former Times, Tales of Domestic Life, Moral Tales). This data set is meant to represent a range of prose fiction that would have been widely read and known to nineteenth-century anglophone readers but would not have been considered a "novel." While the material dates from different epochs, the publications (and translations) are all contemporaneous with the period as a whole.