15 December 2010

Zombie Editions: An Archaeology of POD Areopagiticas

I'm zipping along on Amazon, trying to find a lightweight edition of John Milton's Areopagitica -- a paperback Penguin, maybe -- when I stumble upon this:

Yes, there's an egregious typo in John Milton's name; but that isn't the only weirdness here. English Reprints Jhon Milton Areopagitica is machine-speak for "Does Not Compute" -- or, if I may be allowed some license with my translation, "Situation Normal, Metadata Categories All &*$%ed Up." Someone (or something) missed a few line breaks. And why is Edward Arber, a nineteenth-century editor and professor of English, tagged as the author of this hot mess?*

This is a zombie edition, one of many I found for early modern texts on Amazon. Produced as cheap print-on-demand editions from EEBO or Google Books scans, they're listed alongside reputable scholarly print editions published by university presses, indistinguishable at first glance except for a few glaring markers. Like a mismatched cover image --

-- or excessively expressive titles:

Closer examination reveals their undead status. In the case of English Reprints Jhon Milton Areopagitica, the publisher is the aptly named BiblioLife, a project of BiblioLabs, which designs software "to address the challenges of cost-effectively bringing old books back to life." (BiblioLabs takes the "bringing things back to life" shtick pretty seriously. Their website proudly boasts that their company is located in a "Renewal Community" -- a distressed urban zone where businesses are eligible for billions in tax incentives.) Another common publisher of zombie books is Nabu Press, taking its name from the Babylonian god of writing and patron of the scribes. The description for each Nabu Press book includes the same disclaimer:
This is an EXACT reproduction of a book published before 1923. This IS NOT an OCR'd book with strange characters, introduced typographical errors, and jumbled words. This book may have occasional imperfections such as missing or blurred pages, poor pictures, errant marks, etc. that were either part of the original artifact, or were introduced by the scanning process. We believe this work is culturally important, and despite the imperfections, have elected to bring it back into print as part of our continuing commitment to the preservation of printed works worldwide. We appreciate your understanding of the imperfections in the preservation process, and hope you enjoy this valuable book.
Whereas most "Product Descriptions" describe the content of the book, here the publisher draws attention to the physical artifact -- to its missing or blurred pages, its poor pictures and errant marks. In his latest book Piracy, Adrian Johns points out that during the eighteenth century, pirated reprints didn't simply impinge on the intellectual property rights of authors -- a concept still in construction -- but threatened the civility of scholarly discourse by disseminating corrupt, even ugly copies of important texts. In other words, piracy forced a split between "texts" and "books" -- between words and the things that circulate them.

Although POD books are not pirated, these zombie editions exert a similar pressure on our concept of "the book." On the one hand, so-called POD facsimile editions that reprint a Google Books scan of an original edition put a new batch of readers in touch with the visual aesthetics of early modern books, from the typography to the importance of title pages. In fact, one no longer needs to have institutional access to EEBO to download or own a facsimile edition of most early modern texts. As someone who geeks out over such things, I'm pretty excited by the possibilities POD holds; as a reader subjected to the general awfulness of these "facsimiles," though -- most of which are corrupted to the point of illegibility -- ... well, just take a look at the table of contents for English Reprints Jhon Milton Areopagitica:

On the other hand, POD publishers are also pulling from sites like Project Gutenberg, which offer "plain vanilla" texts stripped clean of accidentals. These editions usually come from OCRed scans, which simply means a machine has turned an image into copy-and-paste-able digital text. Depending on the sophistication of the OCR software used, anywhere from 1 to 20% of the words may be garbled, with older, non-standard typography being more difficult for the machine to read. While this typically isn't a problem, the occasional misrendering can be disruptive, especially when encountered in a printed book -- a medium in which we're not used to finding typos.

Areopagitica (Volume 1); 24 November 1644: Preceded by Illustrative Documents, shown above, is an excellent example. Pulled from an OCR of a scan of a bowdlerized reprinting in the Little Humanist Classics series, the text replaces "sexist pronouns" with "the humanist pronouns HU, HUS, and HUM," even going so far as to swap "manhood" for "adulthood." Several confused readers, expecting OCR corruptions but perhaps not humanist bowdlerizations, left disappointed reviews (although at least one female reader appreciated the edition's "inclusive wordage": "I am not a man," she writes, "mankind does NOT mean humankind, and during John Milton's time women were being burned on stakes, so his outlook especially towards women was dim, and very well could have been reflected in this book, if it wasn't for the publishers insights regarding this." She concludes, without a touch of irony, "This is an excellent account of one of the original ideas for free speech in this country.") So much for Milton's campaign against censorship.

Like the pirated reprints of eighteenth-century England, then, these POD editions put scholars, especially book historians, in a real bind (pun intended). Cheap facsimile editions could bring bibliographic concerns back to the forefront of research, indeed to the very site of reading; but, sadly, most publishers downright suck at producing them. Meanwhile, "plain vanilla" texts are flooding the market, ostensibly pushing us away from any material connection with the original book; yet it's hard to immerse oneself in the immaterial text when OCR errors, the artifacts of shoddy digital scans, continually draw attention back to their technologies of production. In each case, a tangle of texts, technologies and histories challenges us to reconsider the border between material technologies and immaterial texts -- in short, to rethink what makes each POD book a "book."

Interestingly, the reader, left to navigate this web herself, emerges as the primary agent of meaning. It's the reader who must carefully check metadata for clues of corruption and read reviews to be sure she isn't buying, for example, a bowdlerized "Little Humanist Classic." It's also the reader who must discern typographic clues. (If the typography in a "facsimile" edition of an early modern text looks nineteenth-century, chances are it's a Google Books scan of a censored Victorian edition. Because the term "censorship" is culturally relative, even nineteenth-century editions purporting to be "complete and full renditions" often are not.) When Amazon lumps POD books together with scholarly critical editions, it's up to the reader to become a literate surfer and sorter of information.

As, I suppose, it always has been.

*If you're curious, here's what probably happened: between 1868 and 1880, Edward Arber produced a series, "English Reprints," of cheap editions that disseminated literary works to a broader audience. At some point, both an 1868 and a 1903 reprinting were scanned and posted on Google Books and/or archive.org. In the translation to Amazon's system, the metadata for the title absorbed the series name, and the series editor, Edward Arber, was tagged as the author. So now Milton's Areopagitica is accessible through the frame of a nineteenth-century popular reprinting.


peacay said...

(it's hard to know whether to laugh or cry to be honest)

In the same ballpark, I guess, are the problems associated with google books. I'd be surprised if you weren't aware of this 2009 Chronicle article (which is, I think, but one of a series of commendable critiques - from a few people - from around the same time last year).

Simon J. James said...

I know this is not a new post- forgive me I've only just found it - but this is BRILLIANT! I'll be using it in my graduate class -thank you!