Wednesday, December 15, 2004

Google To Scan Books From Big Libraries

SAN FRANCISCO (AP) _ Google Inc. is trying to establish an online reading
room for five major libraries by scanning stacks of hard-to-find books
into its widely used Internet search engine.

The ambitious initiative announced late Monday gives Mountain View,
Calif.-based Google the right to index material from the New York Public
Library as well as libraries at four universities _ Harvard, Stanford,
Michigan and Oxford in England.

The Michigan and Stanford libraries are the only two so far to agree to
submit all their material to Google's scanners.

The New York library is allowing Google to include a small portion of its
books no longer covered by copyright while Harvard is confining its
participation to 40,000 volumes so it can gauge how well the process
works. Oxford wants Google to scan all its books originally published
before 1901.

Scanning books so they can be read through computers isn't new. Both
Google and Amazon.com already have programs that offer online glimpses of
new books while an assortment of other sites for several years have
provided digital access to some material in libraries scattered around the
country.

But Google's latest commitment could have the biggest impact yet, given
the breadth of material that the company hopes to put into its search
engine, which has become renowned for its processing speed, ease of use
and accuracy.

``It's a significant opportunity to bring our material to the rest of the
world,'' said Paul LeClerc, president of the New York Public Library. ``It
could solve an old problem: If people can't get to us, how can we get to
them?''

Librarians are also excited about the prospect of creating a digital
record for the reams of valuable material written long before computers
were conceived.

``This is the day the world changes,'' said John Wilkin, a University of
Michigan librarian working with Google. ``It will be disruptive because
some people will worry that this is the beginning of the end of libraries.
But this is something we have to do to revitalize the profession and make
it more meaningful.''

The project gives Google's search engine another potential drawing card as
it faces stiffening competition from Yahoo Inc. and Microsoft Corp.'s MSN.
Attracting visitor traffic is crucial to Google's financial health because
the company depends on revenue generated by people clicking on advertising
links posted next to the main body of search results.

Scanning the library books figures to be a daunting task, even for a
cutting-edge company such as Google, whose online index of 8 billion Web
pages already has revolutionized the way people look for information.

Michigan's library alone contains 7 million volumes _ about
132 miles of books. Google hopes to get the job done at Michigan within
six years, Wilkin said.

Harvard's library is even larger with 15 million volumes. Virtually all of
that material will be off limits until Google shows it can scan the material
without losing or damaging anything, said Harvard professor Sidney Verba,
who also is director of the university's library.

``The librarians at Harvard are very punctilious about protecting their
great treasures,'' Verba said.

The project also poses other prickly issues, such as how to convert
material written in foreign languages, and the issue of protecting
copyrighted books.

As it does with new books already included in its search engine, Google
will only allow its users to view the bibliographies or other snippets of
copyrighted books scanned from the libraries. The search engine will
provide unrestricted access to all material in the public domain _ work no
longer covered by copyrights.

The books scanned from libraries will be included in the same Google index
that spans the Web. By throwing everything into the same pot, Google risks
burying the library book results far below the Web documents containing
the same search terms, reducing the usefulness of the feature, said
Danny Sullivan, editor of Search Engine Watch, an industry newsletter.
