InsideGoogle: Google Scholar: Stand On The Shoulders Of Giants
Thursday, November 18, 2004
Google Scholar: Stand On The Shoulders Of Giants
Last night, Google released Google Scholar, a search engine for academic data.

Google Scholar takes academic papers and applies a PageRank-style ranking to them. However, links are not the only signal; as in the academic world, it's all about citations. If other important researchers cite your paper, Google Scholar's search algorithms recognize this and use it to weight the relevance of the document. A link without a link, so to speak.
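To make the "citations as links" idea concrete, here is a minimal sketch of a PageRank-style power iteration over a tiny citation graph. Google has not published how Scholar actually ranks papers, so everything here is an assumption for illustration: the function name `citation_rank`, the damping factor of 0.85, and the made-up paper names.

```python
def citation_rank(citations, damping=0.85, iterations=50):
    """Rank papers by who cites them (illustrative, not Google's algorithm).

    `citations` maps each paper to the list of papers it cites.
    """
    # Collect every paper that appears as a citer or as a cited work.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)

    # Start with equal weight for every paper.
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Every paper gets a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            cited = citations.get(p, [])
            if cited:
                # A paper passes its weight on to the papers it cites.
                share = damping * rank[p] / len(cited)
                for q in cited:
                    new_rank[q] += share
            else:
                # A paper that cites nothing spreads its weight evenly.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical papers: three newer works all cite "classic".
    graph = {
        "survey": ["classic", "method"],
        "method": ["classic"],
        "followup": ["method", "classic"],
        "classic": [],
    }
    for paper, score in sorted(citation_rank(graph).items(), key=lambda kv: -kv[1]):
        print(f"{paper}: {score:.3f}")
```

In this toy graph, "classic" comes out on top because every other paper cites it, and "method" beats the two papers nobody cites. A citation from a highly ranked paper is worth more than one from an obscure paper, which is the point of the "important researchers" remark above.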

Google says:
With Google Scholar, researchers, students, professors and others can find relevant information drawn from a diverse collection of literature such as peer-reviewed papers, theses, books, preprints, abstracts, and technical reports.
For example, try a search for "search engine". The number one result? The anatomy of a large-scale hypertextual Web search engine by S Brin and L Page. Not only does the engine differ by including the author's name, the institution, and the year published in the search results, it also lets you look at the citations, much as you would backlinks. Even though most of the database is in PDF format, Google can convert the pages to HTML for you, if you like.

Google has said that, whenever possible, the Google Scholar crawler indexes not just the abstract but the full text. All indications are that this also means Google Scholar is not limited to the first 100 or so KB of a page, as the regular Google search engine is. Why is that so important? Because the citations are at the end of most papers, of course!
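A small sketch shows why a size cap would hurt here. It is only an illustration of the argument, not a description of how any Google crawler works: the function `visible_references`, the "References" heading it looks for, and the synthetic paper are all assumptions.

```python
def visible_references(text, limit_bytes=None):
    """Return the reference lines an indexer would see.

    If `limit_bytes` is set, only that many bytes of the document are read,
    mimicking an indexer that stops after a fixed amount of data.
    """
    data = text.encode("utf-8")
    if limit_bytes is not None:
        data = data[:limit_bytes]
    # Ignore a multi-byte character that the cut may have split in half.
    visible = data.decode("utf-8", errors="ignore")

    # Assume the reference list starts at a "References" heading.
    start = visible.find("References")
    if start == -1:
        return []
    lines = visible[start:].splitlines()[1:]
    return [line.strip() for line in lines if line.strip()]


if __name__ == "__main__":
    # Synthetic paper: roughly 150 KB of body text, then one reference.
    paper = (
        "Abstract\n"
        + "body " * 30000
        + "\nReferences\n"
        + "[1] S Brin, L Page. The anatomy of a large-scale hypertextual Web search engine.\n"
    )
    print(len(visible_references(paper, limit_bytes=100 * 1024)))
    print(len(visible_references(paper)))
```

With the 100 KB cap, the indexer never reaches the reference list of this synthetic paper, so it finds no citations at all; reading the full text recovers the one reference.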

Google Scholar returns not only online results but also offline papers. If the most cited paper is by Einstein, whose writings are both very well cited and barely online, Google will let you know and give you an idea of how to find it.

The service is very impressive from the get-go. Google is clearly making a statement that it is serious about providing tools that let users find the answer to anything. While other engines rely on the "dumb" workaround of hard-coding answers into the engine, Google is determined to create the most powerful engine and to create unique ways for users to leverage that power. Last week, when Google announced it had doubled the size of its index, many criticized the announcement, saying that what you do with data matters more than how much data you have. Well, Google is now showing us just how many things you can do when you have the world's information at your fingertips. Don't be surprised if we see more niche search engines in the future.

Other articles on this topic:
Google Offers Search Service For Researchers

Google Scholar
PR Weaver

We did a heuristic search using the "author:" search feature. Using this method, we found the most cited document to be:

[BOOK] Molecular Cloning: A Laboratory Manual
J Sambrooke, EF Fritsch, T Maniatis - Cited by 46350
Cold Spring Harbor Laboratory, NY, 1989

Nothing else we could find even comes close.

Lee Giles
Isaac Councill
Eren Manavoglu
Fascinating. I posted about it here. I'm curious as to how many citations you found, and if you kept any data on the papers that didn't come close. I'd love to see any data you have.