I thought I’d write a few words on Google Print, Google’s new full-text book search service, since it has been a matter of considerable debate on the Web and elsewhere after the Authors Guild sued Google for copyright infringement. More recently, Yahoo announced a similar effort with contributors including the Internet Archive and O’Reilly Media. The Yahoo-led Open Content Alliance plans to focus more on books already in the public domain and on books whose rights holders explicitly give permission to scan them, whereas Google has only offered publishers the opportunity to opt out of its scheme: by default, it scans any work whose exclusion has not been explicitly requested. The Google Print Library includes scans of books from Oxford University Library and several major American university libraries, including Harvard and Stanford.
To put it bluntly, I think the Authors Guild are shooting themselves in the foot with a double-barreled shotgun. In the long run, authors can only benefit from people being able to find books they like. Moreover, something like Google Print should exist. The benefits of a full-text search service to researchers, students and, well, pretty much everybody are incalculable. If Google can provide a service like that within the boundaries of fair use (as, for instance, copyright scholar William Patry argues it can), then I’m all for it.
I do admit to being something of a Google fanboy, but bear with me even if you think Google’s slogan should really be “Don’t be evil, unless it’s necessary for the greater good…”
Much has been made of the fact that search results generated by Google Print contain adverts. I don’t find this particularly intrusive, perhaps because I’ve got a pretty powerful mental filter that tends to block adverts even when I’m reading newspapers and the like. And if Google Print is advertising, it’s really free advertising for the authors. Besides, the same business model applies to, say, book reviews: newspapers and magazines use book reviews (which may contain excerpts from books) to lure unsuspecting readers to view the adverts contained within their pages.
Some people claim that even though Google Print is only supposed to display a limited number of pages of each book, it’s not hard to view them all. Obviously, the measures Google has taken are not enough to stop industrious pirates from downloading full books (since the books must be scanned in full for indexing). Greg Duffy, a web cookie hacker, came up with a script that compiled full PDFs of books on Google Print until Google fixed a cookie vulnerability. (Duffy wasn’t interested in piracy: he just wanted to impress Google in order to get a job. Google sent him a thank-you email and a T-shirt.) Still, it could be argued that Google’s measures are enough to prevent 99 percent of users from doing the same. Books can already be pirated by anyone with a scanner, yet no one is that concerned about it. Why? Because reading ebooks is not fun. Until we get electronic paper that is as comfortable to use as the real thing, book piracy is not going to be a major problem. And no one will even think about pirating your book if they don’t know it exists. Obscurity is the enemy, not the pirates, as Tim O’Reilly puts it in his sobering New York Times op-ed piece.
The legal issues here are a bit murky as well. LawPundit argues that, technically, the Authors Guild should perhaps be suing the libraries, not Google, even if you accept the argument that scanning books and indexing them doesn’t fall within the realm of fair use. So what’s behind all this?
According to Tim O’Reilly, Google Print would actually rescue a lot of “orphaned” works:
Google is also solving a huge problem for the publishing industry. Because no one knows who owns many of the works in question, Google’s innovative deal with libraries is the only practical approach. It sweeps up all the loose ends of forgotten rights and ignored works. As the public discovers the value of these works, publishers and authors are incentivized to track down and assert their ownership in order to opt-in to the revenue sharing offered by the Google Print service.
And that — surprisingly — may be the problem, as mentioned by NigelJohnstone on Slashdot:
They obviously want an ‘opt-in’ system, because that reduces the number of books competing to just the current commercial books, and removes possible public domain, orphan works and smaller publishers authors. Joe public on the other hand, is best served by ‘opt-out’ because that includes orphaned work & possible public domain books.
In other words, publishers don’t want to see “forgotten” books competing with the latest best-sellers. Sinister, no?
I’m really hoping that Google Print will pull through, not least because of what could be built on top of it. In the past, Google has been very open about providing APIs (application programming interfaces) to its services, which allow able hackers to use Google’s search engines for novel purposes. This Wired piece mentions some cool hacks that have arisen from Google Maps, for example. There is no reason why similar things couldn’t be done with Google Print. I can easily envision a “find similar books” application that would make it easier to find books one would like based on a sample of books one has read: sort of like Last.fm for books. This would bring the Internet’s power to amplify fringe markets to the analog publishing world, something that I’d be quite keen to see for purely selfish reasons.
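To make the “find similar books” idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under some loud assumptions: it does not talk to Google Print at all, the tiny catalogue of text snippets is made up, and the function names (`term_counts`, `cosine`, `find_similar`) are my own inventions. It ranks unread books by how much vocabulary they share with the books you have already read, which is a much cruder, text-based approach than whatever Last.fm does with listening habits.

```python
import math
import re
from collections import Counter

# A deliberately tiny stop-word list (an assumption; a real one would be longer).
STOPWORDS = {"a", "an", "and", "at", "from", "in", "of", "the"}

def term_counts(text):
    """Lowercase the text and count its words, ignoring stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def find_similar(read_titles, catalogue, top_n=3):
    """Rank unread books by similarity to the combined text of the books read."""
    profile = Counter()
    for title in read_titles:
        profile.update(term_counts(catalogue[title]))
    scores = [
        (cosine(profile, term_counts(text)), title)
        for title, text in catalogue.items()
        if title not in read_titles
    ]
    return [title for _, title in sorted(scores, reverse=True)[:top_n]]

# Made-up catalogue mapping titles to text snippets, standing in for full texts.
CATALOGUE = {
    "Sailing Alone": "the sea the ship the storm and the long voyage home",
    "Deep Water": "a ship lost at sea in a storm far from harbour",
    "Garden Notes": "roses soil compost and the patient work of the garden",
    "The Orchard": "apple trees soil and pruning in the old garden",
}

print(find_similar(["Sailing Alone"], CATALOGUE, top_n=1))  # ['Deep Water']
```

With full texts instead of one-line snippets, plain word counts would be swamped by common words, so a real version would want some proper term weighting. The point is just that once the text is searchable, this sort of thing is a weekend hack.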
Come on, Authors Guild. Don’t be evil.