CHRISTOPHER ALLEN WALDROP: Google Print is the PATRIOT ACT on Steroids

This Week’s Column:

GOOGLIZATION AND YOU

... a MobyLives guest column

by Christopher Allen Waldrop

"Everybody gets so much information all day long that they lose their common sense." — Gertrude Stein

11 JULY 2005 — It must have seemed like a good idea at the time: Google, via its Google Print program, would digitize library collections, making them more accessible and, probably most important, more searchable.

Nothing lasts forever, but in spite of the possibility that today's Blackberry will become tomorrow's TRS-80, digitization seems to be here to stay. And since most books produced from around 1860 into the 1990s were printed on unstable acidic paper, digitization is one way, though not necessarily the only one, to give older books, including those still protected by copyright, a longer shelf life. Five libraries (the University of Michigan, Harvard University, Stanford University, Oxford University and the New York Public Library) are taking part in Google Print, although only the University of Michigan is allowing Google access to its entire collection. The others have limited the project to items that are in the public domain.

The problem is not the idea itself but Google's assumption that publishers would "opt in" to the program in exchange for visibility and revenue from "content-targeted ads". But as Business Week and the American Library Association have reported, publishers aren't buying it.

John Wilkin, associate librarian at the University of Michigan, has said that, for now, digitized versions of copyrighted materials will remain in a "dark archive", and won't be available.

If the material's not going to be available for the foreseeable future, why is Google wasting time and money digitizing it? And even if copyright laws eventually allow the release of material Google's digitized without the permission of publishers, who's responsible for keeping it in the dark until then? If the "dark archive" isn't secure and material leaks out, the University of Michigan or Google could be held liable for distributing protected material.

There are a lot of 'ifs' surrounding the future of the project, but there are serious questions about its present as well.

Daniel Brandt, founder of Google-watch, has written about his concerns with Google's lack of privacy protection. For example: In its privacy policy Google admits that it collects and keeps information on all users and their search terms.

Brandt's critics (like Google-watch-watch) claim that his gripe with Google is personal; that he feels his own business web site wasn't ranked highly enough.

But Brandt hasn't tried to hide his reasons for disliking Google, and the concerns he raises about privacy are valid. Google's privacy policy states, "We do not rent or sell your personally identifying information to other companies or individuals, unless we have your consent." However, it goes on to say that Google will share information if "We provide such information to trusted businesses or persons for the sole purpose of processing personally identifying information on our behalf."

What Google does on its own behalf doesn't necessarily serve libraries. If the participating libraries don't use Google to search their new digital archive, they'll have to go to the time and expense of creating a new interface, since most library catalogs aren't designed for full-text searching. If they use a Google interface, then patrons could have their reading habits put under surveillance. This is another 'if', but it's a possibility that should be considered.

After all the lobbying against the PATRIOT Act's provisions that prevent libraries from even saying whether law enforcement agents have been looking at their records, libraries can't seriously consider opening patrons' reading habits not only to law enforcement agencies but to any third party that Google chooses.

Aside from privacy there are still a lot of unanswered (and unasked) questions about the Google Library Project. How will it affect researchers? How much control will Google have? If they're trying to lure in publishers with the promise of ad revenues, how will that affect the results patrons get? Are the searches even reliably returning all the available information?

Google has paired with OCLC, a consortium of more than 50,000 libraries. Google users supposedly can now use OCLC's WorldCat database to find the library that's closest to them with a specific book. This is an amazing idea, but it's not working yet. I searched for the closest library to me with Harry Potter and the Sorcerer's Stone. The nearest copy, according to Google, was in Jackson, Tennessee, about a two-hour drive away.

And while the book-finder (if it ever works properly) will serve anyone who knows exactly what they're looking for, Google's one-stop-shopping approach isn't made for research. Its very nature is to rank results by popularity. If you're researching, say, the duckbill platypus, the most popular results aren't necessarily what you want. Even with an advanced search Google can't distinguish between peer-reviewed research papers and stuffed toys.

Organizing information is a job that's still best done by people, in spite of their inherent slowness, and in most places those people are called librarians. I admit librarians can't begin to sort all the available information, but at least for them preserving, categorizing, and creating access to the information that people need is a higher priority than content-targeted advertising. The insistence of librarians on continuing to use what might seem like arcane and antiquated systems, such as the Dewey Decimal or Library of Congress systems, is as much for the benefit of patrons as it is for the librarians who shelve the books. These systems were designed to keep the materials on platypuses in one place and the materials on stuffed toys in another.

This may seem like common sense, but in the rush to hand everything over to Google common sense is in short supply. The Association of American Publishers asked Google to wait six months before going ahead with the project. Google refused to accept even this short delay, even though they've said the whole project could take as much as ten years.

Assuming the law will go their way could be costly. The fight's just beginning and no one can say how long it will go on or how it will end. Google's partners need to get their common sense back and take this opportunity to start asking the hard questions about what the Google Library Project means for libraries, their patrons, and the future. It's the one area where the problem is not too much information but too little.

Librarian CHRISTOPHER ALLEN WALDROP is the Serials Coordinator at the Vanderbilt University Library in Nashville, Tennessee. His Googling Libraries appeared on MobyLives in January.

©2005 Christopher Allen Waldrop

Previous columns:

BOOKSELLER AT LARGE . . . Guest commentator Dan Bloom says he moved to Taiwan and wrote a book that sold thousands of copies — after he took to the streets yelling, "Buy my book!"

ENOUGH ALREADY WITH THE MFA BASHING . . . Regular contributor Steve Almond, an MFA grad who also teaches creative writing, responds to Elizabeth Clementson's column about the influence of MFA programs.

DOWN WITH MFAs . . . In a guest column, MFA dropout and publisher Elizabeth Clementson says MFA programs are ruining literature and the publishing business.

TELEVISION WITHOUT PITY . . . Tired of the short story writer's life, guest columnist Steve Almond explains why he's now writing television shows such as "Blog and Order."

READING TO CHAIRS . . . When Quinn Dalton showed up at a bookstore to read from her new book, she was greeted by . . . empty chairs. In a guest column, she asks herself, "Why bother?"
