Sunday, July 20, 2014

Beyond Spindling and Mutilation: Shouldn't we recover machine translation for the digital humanities and thereby increase our imaginative scope?

Here's a nice standard-issue history of the digital humanities from NEH, tracing us back to Fr. Busa:
The story of digital humanities often begins with another theologian on a quest to make a concordance. In the mid 1940s, Father Roberto Busa, an Italian Jesuit priest, latched onto the idea of making a master index of works by Saint Thomas Aquinas and related authors. Busa had written his dissertation on “the metaphysics of presence” in Aquinas. Looking for the answer, he created 10,000 hand-written index cards. His work demonstrated the importance of how an author uses a particular word, especially prepositions. But making an index for all of Aquinas’s works required wrangling ten million words of Medieval Latin. It seemed an impossible task.

In 1949, Busa’s search for a solution led him to the United States and International Business Machines, better known as IBM, which had a patent on the resources Busa needed to realize his project. Without the company’s help, his vision for a master concordance would remain just a dream.
But what of machine translation, which is almost as old?

I know, I know, the genealogy may spring from the same root, but its branches went in a different, very different, direction. But still, isn't translation a characteristically humanistic activity, one of the most fundamental? All those biblical texts translated into Latin and Greek, all those classical texts translated into modern European tongues, not to mention translations from Sanskrit, Mandarin, and all those many other tongues.

Perhaps it's too dangerous to reclaim that aspect of our heritage? Because then we'd have to follow computing deep into the heart of language and the mind. 

But what of the soul, of the spirit? Do not fold, bend, mutilate, or spindle!* 

Wouldn't want to do that, would we? The computer is a tool of the Devil. Why? 

Because capitalism, because imperialism, because patriarchy, because racism! Because EVIL.

Still, computing is a child of the soul, an avatar of the spirit. What are we to do? Should we not reclaim it? Must we remain imprisoned in the 60s forever, an era when many of us were not even born?

Besides, if we're already using the computer to do back-room grunt work, then we've already entered the Devil's Workshop. We might as well look at our hands and see what they're doing to stave off idleness. Really, there's no way out. We're committed.

We need a properly revisionist history of our discipline.

*Some links: Free Speech Movement: Do Not Fold, Bend, Mutilate or Spindle, "Do Not Fold, Bend, Mutilate or Spindle": A Cultural History of the Punch Card.


  1. Hi Bill,

    I wrote my dissertation on this, though it looked at machine translation from the perspectives of media theory and translation theory.


    I'm refashioning the diss into a book now, portions of which aim for the revisionism you seek -- though my objective isn't to reclaim the history for DH, just for H.

    Other scholars working through this history at present focus on the American context -- perhaps less reclaimable for DH than machine translation histories outside the US.

    It's fascinating stuff, glad you're priming the pump for this!

  2. Hi Christine,

    I studied computational linguistics with David Hays while getting my degree in English. Have you seen the paper we wrote back in 1976, "Computational Linguistics and the Humanist"? You can download it here:

  3. I just downloaded your diss, Christine, and took a look. David Hays is the one who brought Martin Kay to the USA to work with him at the RAND Corporation's MT project. Hays was part of the committee that wrote the ALPAC report and, seeing the writing on the wall, coined the term "computational linguistics" as the new name for the discipline, which he knew would take a bad hit when that report came out.

  4. Thanks for the link and details! I realize I'd read about David Hays in Hutchins' book on MT pioneers (the bio was written by Martin Kay). And I've been trying to track down some of the MT reports from those years from RAND -- but now I realize many were written by Hays.

    1. The RAND reports should be available from RAND; they're listed online, though they probably charge an arm and a leg for them.

      I've written a Wikipedia entry on Hays that says a bit about what he did once he'd left RAND in 1969. While he remained in touch with the MT world, he no longer worked on MT. He worked on semantics into the 80s and then turned to broader interests in culture, including ballet and the evolution of technology.

      Hays also knew Margaret Masterman, and even had an off-white sweater she'd knitted for him. Here's a post on connections between Hays, Masterman, and Kay. Here's a post with a section about working with Hays on the American Journal of Computational Linguistics.

      Hays died before SMT really took off, but I suspect his views would be similar to those voiced by Kay in the passages quoted at the end of this Language Log post by Mark Liberman, which is generally about the statistics vs. structure issue.