InsideGoogle: Is Google Broken? The Source Is Broken!

Monday, September 06, 2004

Is Google Broken? The Source Is Broken!

I've discovered that the originator of the "Google Is Broken" theory may be none other than Daniel Brandt, operator of Google-Watch. You can read his articles here and here. In fact, other than his articles, the rumor spread through this article, a press release submission at w3reports. Press releases are normally issued by companies announcing something; they are not anonymous articles touting the shortcomings of billion-dollar corporations. This one is credited to one "Anthony Federico", and Daniel Brandt himself comments on it further down the page. Anthony may very well be the vice president, Platform Development, Xerox Production Systems Group, as seen here. More likely, he is the person described here, who designed the ranking technology for ScrubTheWeb.com, a rival search engine to Google.
What we are dealing with here is a serious conflict of interest on a major story. The only two sources for the "Google is broken" theory (unless someone would like to point out one I missed) are a man who has dedicated his time to bringing down Google, and a competitor. Both are fully aware that such a story, if it hit the mainstream press, would hurt Google's stock price in an instant, well before the story could be verified.
I would encourage other blogs to bring this fact to light. Since we spread this story, it is only fitting that we be responsible for the disclosure as well. Since it seems that the story was spread only by two people with serious conflicts, with no evidence to back themselves up, I'm retracting the story. I've also asked Google for a quote. We'll see where this goes.

- posted by Nathan Weinberg @ 9/06/2004 09:44:00 AM

Comments:

What nonsense.

Anyone hanging out in the major forums will have seen plenty of posters observing that Google has been displaying inconsistent behaviour recently.

Anyone monitoring a number of sites will have seen the same.

Maybe Daniel's theory is not the answer, but to suggest that there is nobody else noticing problems with Google at the moment is ridiculous.

Large numbers of pages getting dropped and then reinstated without any rhyme or reason.

Large numbers of pages going in and out of 'supplemental' - again with no rhyme or reason.

Broken? Maybe, maybe not.

Working in a manner that is leading PLENTY of people to speculate that it is broken - definitely.

# posted by Anonymous : 7:21 PM

That's exactly the point, it is just speculation.

Perhaps I should have made more information about myself available. Besides writing this and other blogs, I am a reporter by trade. As a result, I have a different perspective on the type of news I post here and the way I post it. I meet rumors with far larger grains of salt than other bloggers might find necessary. My point was that whether or not Google is broken, these sources are completely the wrong sources for this topic.

As for my personal opinion, I do not believe Google is broken. I believe it is simply overhauling its database and trying new things.

# posted by Nathan Weinberg : 12:53 AM

Hello, Daniel Brandt here.

I don't know who Anthony Federico is, but the fact of the matter is that the gist of the points he raises was presented by me in June 2003 -- 15 whole months ago.

Your concern about Google's stock is touching, but somewhat suspicious. Do you own shares? Wall Street analysts are required to disclose this now, and I'm asking you to disclose it as well. I don't own any shares.

In any event, you have everything backwards. If Google is broken, and if the story "would hurt Google's stock price in an instant," then you still cannot complain that I am raising the issue "well before the story could be verified." I've sought verification for 15 months.

If Google is broken, SEC regulations required that Google disclose this in their prospectus as a substantial risk for investors. Google did not mention this in their prospectus. What am I supposed to do? Withdraw the questions that I raised 15 months ago out of courtesy to Google, because Google prefers not to disclose it to investors? Even though all the evidence I've collected since then reinforces my original position?

I wrote 15 months ago, about the 4-byte docID issue, that "if someone at Google goes on record on this topic, then we'll take down this page and that will be the end of it." No one did, so I didn't. I don't consider GoogleGuy to be "on record," since he is anonymous. I mean someone should go on record who has a name, so that if it turns out that he is lying, Google can be held accountable by the SEC. That's all I've ever asked.

For you to claim I'm asking too much is the same as claiming that Google is above the law, and has no responsibility to the public and to its investors.

# posted by Anonymous : 1:21 PM

I agree that your points are well-researched, and there is a definite chance that you are right, although that is one of many theories.

I am not a Google investor, and do not worry, I would absolutely disclose it. They drilled that into my head back in my journalism classes.

And you are absolutely right that Google should, and may very well be required to, disclose any problems. Of course, many companies keep many things secret, but that doesn't make it right. You have raised some serious questions that would make any investor think twice.

My main point is not on the content of what you or Anthony raise, but on the source. As a member of the working press, I understand that I need to be aware of who is saying what, and how much weight it carries because of that person's professional standing.

I, for example, may be a fan of Google, but I am primarily a journalist, and will occasionally criticize them and their practices. Not to disparage you or your opinions, but you are a well-known critic of Google, and Anthony appears to be one of their competitors, or at least an employee of one. Since I am reporting on these stories, I found it necessary to disclose who the people promoting this theory are. It is important that the reader be aware of any inherent biases in the sources of information, so they can make an informed decision on whether to believe what they are hearing.

In addition, when covering a story, I have to decide whether I believe what I am hearing. While many of your points are valid and well-researched, the main focus of the article is the idea that Google has not fixed any inherent flaws in its software. While I would defer to an impartial reporting body as an expert on the subject, I cannot in good conscience defer to known opponents of the company in question. Your research can only be presented as rumor and "of interest", but not as an "article".

If you have any more questions you wish to raise, I will be more than happy to explain.

# posted by Nathan Weinberg : 4:06 PM

As an added bonus, I have ported over some of the discussion from the LiveJournal InsideGoogle. I hope the formatting carries over okay.

--------------------------------------------------

(Anonymous)
2004-09-06 16:43 (from 4.240.78.148)
You say the article "Is Google Broken?" has no evidence, but I have to completely disagree with you on that. I don't know if you have been following the many posts in the article written at w3reports.com or not, but Anthony Federico has provided some compelling support written by the Google founders that can be downloaded directly from stanford.edu! Not only does he support this 4-byte limit, but he has shown and uncovered a huge amount of evidence on several other Google oddities that I myself have noticed over the past several months with my own website.

This guy has not only provided written support, but has also supplied examples of real search results from Google's website. I think Google should respond, don't you? Who cares if Google hit the limit or not? I want answers to all the other problems that Anthony Federico has so kindly provided us support for. I don't know about you, but I'll be following this post closely. I also don't care who Anthony Federico is. If he is a rival of Google then I don't know why he didn't plug his own website in the articles. It appears to me that he is not looking to promote anything.

-------------------------

Nathan Weinberg (montevino)
2004-09-06 19:21 (from 151.202.47.90)
Down the list:
The Stanford articles are very old, and describe a former build of the database. There is no reason to assume things haven't changed since then, something he acknowledges. The sites he lists as examples are in fact in Google's database, although he says they are not.
As for why a rival of Google would attack Google and not plug himself, read Google-Watch. Daniel Brandt has dedicated an enormous amount of time and effort to fighting Google, but never really plugs NameBase, the site he is angry at Google about. You don't need to plug yourself if your desire is simply to hurt Google.
And I didn't say this was some plot by Anthony or anybody else, but simply that this constitutes a conflict of interest. In the newspaper business, we would never run a story based on old information, spread around by rivals of the company. At the very least, some independent body would have to prove this.
And you can't call this an "article", since it is a press release posted to a website. It is no more official than a forum posting, even if it looks like an article.

----------------------

(Anonymous)
2004-09-06 21:51 (from 4.240.78.7)
Granted, the Google publications he points to are old, but when was Windows XP released? Has it ever worked right? And it's backed by one of the largest companies in the world with quite possibly the deepest pockets and most brilliant minds. Heck, I just installed another 75MB patch the other day.

Could everything have changed at Google while they still had the time to develop AdWords, AdSense and Gmail? I don't know about that one.

It's not easy to re-engineer something, even if it is as simple as Google's search engine. Plus, if you go to Google's job postings you will find that some jobs link to these same reports as a prerequisite. So if they are outdated I don't know why they would provide this information to future employees.

You say the sites he lists are in Google. I guess I don't know what you're reading, but maybe you missed the actual URLs he posted. If you do the searches like he shows you would see what he's talking about.

[google search term] site:www.liberty72.com

www.liberty72.com/L72_HT.html
Similar pages

www.liberty72.com/L72_contact.html
Similar pages

They are empty in the results at Google, just like he said. I did the same kind of search on my own site and found all kinds of empty pages at Google not showing my titles or any content. This is very alarming to me and might explain why my traffic has decreased in the past 8 months.

On Daniel's pages and in the post he made to the article he does plug NameBase. He gives a link to http://www.google-watch.org/dying.html which is all about NameBase.

I can't agree with you more about being careful what you post. You have to do your research, that is for sure. But like you said, Anthony Federico did say Google may have changed this 4-byte stuff, but that doesn't change the evidence he presents or the other Google flaws.

I didn't realize this was a press release site. The heading on the pages says "NEWS FOR WEBMASTERS". Even on their home page it says it is a news article site. I now wonder if we are visiting the same site?

Anyway, thanks for your time,
John

----------------------

Nathan Weinberg (montevino)
2004-09-06 22:58 (from 151.202.47.90)
I too have noticed the strange phenomenon of Google pointing to pages but having no information. However, I suspect it is not a problem with the 4-byte integer as suggested, but rather simply because of Google's own explanation:
Page Title
The first line of the result is the title of the web page found. Sometimes, instead of a title there will be a URL, meaning that either the page has no title, or Google has not indexed the full content of that page. We still know it's a good match because of other web pages – which we have indexed – that have links to this returned page. If the text associated with these links matches your query, we may return the page as a result even though its full text has not been indexed.
Other explanations, from http://www.ozzu.com/ftopic25079.html:
If the listing is coming up with only the URL, you are not alone. I have one client doing the same thing. I found out the reason for this client, anyway. Somehow our server's firewall was blocking the spider from reading the site. We lost a lot of positioning, and the titles and descriptions for most of the pages.
Google bans pages that have nearly the same content within the same URL.
It is simply a function of the way Google indexes and filters spam that this happens, or at least so the rumor goes. That is a far more plausible explanation than the idea that Google did not find a way to simply change the ID number system for its web pages.
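For readers unfamiliar with the 4-byte ID theory, the arithmetic behind it can be sketched roughly as follows. This is only an illustration: the function name max_ids is made up for this example, and whether such an ID would be signed or unsigned is an assumption, since nothing in this discussion establishes how Google actually stores its document IDs.

```python
# Hypothetical helper (not from any Google source): counts how many distinct
# non-negative IDs fit in a fixed-width integer of the given size.
def max_ids(num_bytes: int, signed: bool = False) -> int:
    bits = num_bytes * 8
    if signed:
        # One bit is spent on the sign, halving the non-negative range.
        bits -= 1
    return 2 ** bits


# The 4-byte case the theory is about, plus an 8-byte field for comparison.
print(f"4-byte unsigned: {max_ids(4):,}")
print(f"4-byte signed:   {max_ids(4, signed=True):,}")
print(f"8-byte unsigned: {max_ids(8):,}")
```

The comparison with an 8-byte field is there only to show how much headroom a wider ID would give; it says nothing about whether Google made such a change.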
You're right, I did miss Daniel's plugging of NameBase. I guess I focused on the fact that he rarely plugs it, despite getting so much press and PageRank for Google-Watch. It's a bit surprising, and it is a plus for his character, even if I still feel he goes too far.
And yes, w3reports is a "release site", in that most of the content is submitted from the outside, rather than written by site reporters. Some of it is press releases, others are freelance unsolicited articles, as this one is. An article such as that, on the totem pole of credibility and accountability, is too far down to measure. Blogs are much higher, because at least they falter without readers, while release sites can do well despite large amounts of lies. The difference between a "real" article and an article on a release site is the difference between a Time magazine article and an email touting the powers of generic Viagra. Believe it at your own risk.

# posted by Nathan Weinberg : 4:31 PM
