By ANICK JESDANUN, AP Internet Writer
Tony Sanfilippo is of two minds when it comes to Google Inc.'s
ambitious program to scan millions of books and make their text fully
searchable on the Internet.
On the one hand, Sanfilippo credits the program for boosting sales of
obscure titles at Penn State University Press, where he works. On the
other, he's worried that Google's plans to create digital copies of
books obtained directly from libraries could hurt his industry's
With Google's book-scanning program set to resume in earnest this
fall, copyright laws that long preceded the Internet look to be headed
for a digital-age test.
The outcome could determine how easy it will be for people with
Internet access to benefit from knowledge that's now mostly locked up
-- in books sitting on dusty library shelves, many of them out of
"More and more people are expecting access, and they are making do
with what they can get easy access to," said Brewster Kahle,
co-founder of the Internet Archive, which runs smaller book-scanning
projects, mostly for out-of-copyright works. "Let's make it so that
they find great works rather than whatever just happens to be on the
To prevent the wholesale file-sharing that is plaguing the
entertainment industry, Google has set some limits in its library
project: Users won't be able to easily print materials or read more
than small portions of copyright works online.
Google also says it will send readers hungry for more directly to
booksellers and libraries.
But many publishers remain wary.
To endorse Google's library initiative is to say "it's OK to break
into my house because you're going to clean my kitchen," said Sally
Morris, chief executive of the U.K.-based Association of Learned and
Professional Society Publishers. "Just because you do something that's
not harmful or (is) beneficial doesn't make it legal."
Morris and other publishers believe Google must get their permission
first, as it has under the Print Publisher Program it launched in
October 2004, two months before announcing the library initiative.
Under the publishers' program, Google has deals with most major
U.S. and U.K. publishers. It scans titles they submit, displays
digital images of selected pages triggered by search queries and gives
publishers a cut of revenues from accompanying ad displays.
But publishers aren't submitting all their titles under that program,
and many of the titles Google wants to scan are out of print and
belong to no publisher at all.
Jim Gerber, Google's director of content partnerships, says the
company would get no more than 15 percent of all books ever published
if it relied solely on publisher submissions.
That's why it has turned to libraries.
Under the Print Library Project, Google is scanning millions of
copyright books from libraries at Harvard, Michigan and Stanford along
with out-of-copyright materials there and at two other libraries.
Google has unilaterally set this rule: Publishers can tell it which
books not to scan at all, similar to how Web site owners can request
to be left out of search engine indexes. In August, the company halted
the scanning of copyright books until Nov. 1, saying it wanted to give
publishers time to compile their lists.
Richard Hull, executive director of the Text and Academic Authors
Association, called Google's approach backwards. Publishers shouldn't
have to bear the burden of record-keeping, agreed Sanfilippo, the Penn
State press's marketing and sales director.
"We're not aware of everything we've published," Sanfilippo
said. "Back in the 50s, 60s and 70s, there were no electronic files
for those books."
Google, which wouldn't say how many books it has scanned so far, says
it believes its initiative is protected under the "fair use"
provisions of copyright law.
Gerber argues that the initiative will "stimulate more people to
contribute to the arts and the sciences by making these books more
Washington lawyer Jonathan Band says Google's case is strong given the
limits on display -- a few sentences at a time for works scanned from
libraries, with technology making it difficult to recreate even a
"I don't see how making a few snippets of a work available to a user
could have any negative impact on the market," said Band, who has
advised library groups and Internet companies on copyright issues.
Under Google's strictures, readers can see just five pages at a time
of publisher-submitted titles -- and no more than 20 percent of an
entire book through multiple searches. For books in the public domain,
they can read the entire book online.
Not all publishers are opposed.
"For a typical author, obscurity is a far greater threat than piracy,"
said Tim O'Reilly, chief executive of O'Reilly Media and an adviser to
Google's project. "Google is offering publishers an amazing
opportunity for people to discover their content."
James Hilton, associate provost and interim librarian at the
University of Michigan, said his school is contributing 7 million
volumes over six years because one day, materials that aren't
searchable online simply won't get read.
"That doesn't mean it's going to be read online, but it's not going to
be found if it's not online," he said.
Hal Hallstein, a 2003 Colby College graduate, said Google's project
would have been useful for his studies in Buddhism. He typed the word
"shunyata" -- Sanskrit for emptiness -- and found several books he
didn't know existed.
"The card catalog in my experience is rather limited in terms of the
amount it really describes," he said.
Nonetheless, as e-media coordinator at Wisdom Publications, he
believes each publisher should be able to decide whether to join, as
his company has.
Many of the objections appear to stem from fears of setting a
precedent that could do future harm to publishing.
"If Google is seen as being permitted to do this without any response,
then probably others will do it," said Allan Adler, a vice president
at the Association of American Publishers. "You would have a
proliferation of databases of complete copies of these copyrighted
Publishers won't rule out a lawsuit against Google.
The technology juggernaut, whose name is synonymous with online
search, isn't just shaking up book publishing.
Google has a separate project to archive television programs but has
so far received limited permissions. The company also faces lawsuits
over facilitating access to news resources and porn images online.
Jonathan Zittrain, an Internet legal scholar affiliated with Oxford
and Harvard universities, says the book-scanning dispute comes down to
balancing commercial and social benefits.
"From the point of view of the publishers, you can't blame them for
playing their role, which is to maximize sales," he said. "But if fair
use wasn't found, (Google) would never be able to do the mass
importation of books required to make a database that is socially
Anick Jesdanun can be reached at netwriter(at)ap.org
Copyright 2005 The Associated Press.