There's a new buzz about the way Internet watchers are trying to understand what's happening on the web: a sort of generalised hunt for the next Web 2.0. Tim O'Reilly, the one who trademarked "Web 2.0", has been posting about the links between financial markets and web services like Google and Wikipedia. He is particularly intrigued by the parallel between market makers trading on their own accounts (possibly in conflict with their clients) and Google entering content provision, in services like YouTube and Blogger (and now Knol, in a head-on challenge to Wikipedia). Cory Doctorow and Jimmy Wales are writing about the possibility of open-source, transparent search: the moment, and the reason, for the community to take the power back from Google. Indeed, "Jimbo" has announced the launch of Wikia in the hope that transparency and social aggregation will yield better results than Google.
All this is exciting stuff --- we are getting to the point at which we have digested how social information can be aggregated by networked computers, and we are wondering what comes next.
What I haven't seen fully laid out is the analogy between Web 2.0 services and economic mechanisms. Making this analogy explicit is useful because it lets us ask how all the results known about economic mechanisms translate to web mechanisms. By and large, I think it also shows that Web 2.0 represents the naïve phase of web service development, akin to economists' modelling of perfect competition. The reality is clearly some way from that, but the lessons from mechanism design are not all encouraging: from the point of view of quality of results, the best from the wisdom of the crowds is behind us.
Google: Attention Auction
Let's start with the PageRank algorithm. This mechanism "auctions" attention (the screen position in a search result), and it is paid for in links. At a wine auction, lots are ordered from most to least valuable, where value is measured by bidders' willingness to pay. In a "Google search auction", web sites are ordered from most relevant to least, where relevance is measured by the number of quality-weighted links pointing to a page. If I want to get "openDemocracy.net" to the number 1 slot in a "Democracy" search, I need to make sure that no one has better quality-weighted links relating Democracy to the domain "openDemocracy.net". Google is auctioning slots on the results pages: on the left-hand side, slots are auctioned for links; on the right-hand side, they are auctioned for money.
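To make the link-currency idea concrete, here is a toy version of the core of PageRank: a damped power iteration over a made-up three-page link graph. This is a sketch of the published algorithm's idea, not Google's production system, and the graph, damping factor, and page names are all invented for illustration:

```python
# Toy PageRank by power iteration: a sketch of the idea, not Google's
# production algorithm. The link graph and damping factor are illustrative.
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with equal rank
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue
            # each page splits its current rank among the pages it links to
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank

# Three pages: A and C both link to B, and B links back to A.
ranks = pagerank({"A": ["B"], "B": ["A"], "C": ["B"]})
# B collects quality-weighted "votes" from both A and C, so it outranks both.
```

Rank flows along links like money flows in an auction: a link from a highly ranked page is worth more than a link from an obscure one, which is exactly what makes links a currency worth gaming.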
So? What of this parallel?
First, to Cory's point about search not being neutral: the design of the algorithm selects what information is returned, and what meaning is given to "relevance". This is generally the case with all auction-like mechanisms. In the wine auction, you will end up allocating wine to different bidders depending on whether you use an ascending or a descending auction, an open outcry auction or a sealed-bid auction. There is a sort of gold standard, Cory's notion of "neutral search", which in the auction literature is called the "efficient" outcome: the one that allocates each good to the bidder who truly values it most. ("Relevance" is a trickier notion than efficiency because of the philosophical issues it raises, so I am not sure Cory's ideal of neutrality can survive for long.) The auction literature suggests that achieving efficiency is very hard and often requires unbelievably contorted mechanisms.
The auction design literature tells us that whatever mechanism you adopt, bidders will modify their behaviour to do best for themselves. So, in a "first price" auction (one in which you pay what you announce you are prepared to pay, as opposed to the eBay-style second-price auction), you think hard about what the next person below you is prepared to pay and bid close to that, rather than bidding your own maximum willingness to pay. The electricity markets that I worked on in the 1990s were, would you believe it, mostly designed as first-price auctions! This started years of very profitable manipulation by the power companies. Enron was particularly adept.
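The bid-shading incentive can be shown in a few lines, with invented numbers: in a first-price auction, a bidder who pays her own bid earns nothing by bidding her true value, and profits by bidding just above her rival.

```python
# Toy first-price auction with two bidders; all numbers are invented.
# The winner pays her own bid, so her profit is value minus bid when she wins.
def first_price_profit(my_bid, rival_bid, my_value):
    """My profit from bidding my_bid against rival_bid in a first-price auction."""
    return my_value - my_bid if my_bid > rival_bid else 0

# My true value for the lot is 100; my rival bids 80.
truthful_profit = first_price_profit(100, 80, my_value=100)  # I win, profit 0
shaded_profit = first_price_profit(81, 80, my_value=100)     # I win, profit 19
```

Bidding truthfully wins the lot but leaves no surplus; shading down towards the rival's 80 is what every bidder ends up doing, which is exactly the second-guessing described above.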
PageRank manipulation has also turned into an industry. In its simplest form, you buy a well-regarded web property and you then sell links from that property to other sites that are trying to rise in the ranks.
The mechanism literature is very keen on discovering implementable mechanisms that are non-manipulable, in the sense that it is in everyone's selfish interest to reveal the true information about their valuation. In the PageRank analogy, this would be an algorithm that would lead you to create your content without regard for its impact on its Google position, but only with regard to your readers' best interests. So, for example, the simple Search Engine Optimisation advice that all links should be made with descriptive, meaningful terms might lead one to make this sort of link in an article: "openDemocracy's "Democracy in Kenya" coverage suggests that ..." instead of "Peter Kimani suggests that ...". If I do the first rather than the second because that is what the SEO handbooks say will improve PageRank's recognition of openDemocracy's links to "Democracy" and "Kenya", I am gaming PageRank just as I am gaming the wine auction by second-guessing how low I can bid without losing the lot.
The essence of the efficient mechanism design results is that it is important to divorce what someone pays from the outcome of the mechanism. So, the beauty of the eBay-style "second price" auction is that what I pay is determined by the bid of the person just below me, not by my own bid. It is quite easy to see that it doesn't (usually) make sense for me to game the eBay system. (For the interested: the generalisation of the eBay auction to many goods, which a Google page of results is, since it has many slots, is tricky. See Ausubel.)
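The second-price rule can be sketched with invented bids: the winner's payment is set by the runner-up, so changing your own bid, as long as you still win, changes nothing about what you pay.

```python
# Sketch of the eBay-style second-price rule; bidders and bids are invented.
# The winner pays the runner-up's bid, so your own bid decides only
# whether you win, never what you pay.
def second_price_outcome(bids):
    """bids maps bidder -> bid; returns (winner, price paid)."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    return winner, bids[runner_up]

winner, price = second_price_outcome({"alice": 100, "bob": 80, "carol": 60})
# alice wins and pays bob's 80, not her own 100.

winner2, price2 = second_price_outcome({"alice": 90, "bob": 80, "carol": 60})
# shading from 100 down to 90 gains alice nothing: same winner, same price.
```

This is why it is (usually) pointless to game eBay: with your payment divorced from your bid, bidding your true maximum is the simple best strategy.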
What does this mean for search? I've thought for a while that the equivalent would be for Google to give you not your own PageRank as a score, but the PageRank of your next closest "competitor" web site. You could then SEO all you like: it won't affect your PageRank, except in so far as it affects your closest competitors'. The trick in this scheme will be working out who your "nearest neighbour" is for any web page.
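The proposal can be sketched with invented ranks and an assumed nearest-neighbour map; computing that map is the open problem just flagged, so this is a thought experiment rather than an implementation:

```python
# Rough sketch of the "second-price PageRank" idea: display each page's
# score as its nearest competitor's rank, not its own, so optimising your
# own page cannot move your displayed score. The ranks and the
# nearest_competitor map are invented; finding neighbours is the hard part.
def second_price_scores(ranks, nearest_competitor):
    return {page: ranks[nearest_competitor[page]] for page in ranks}

scores = second_price_scores(
    {"A": 0.5, "B": 0.4, "C": 0.1},    # each page's own PageRank
    {"A": "B", "B": "A", "C": "B"},    # assumed nearest competitors
)
# A's displayed score is B's 0.4: inflating A's own rank leaves it unchanged,
# just as shading your bid leaves a second-price payment unchanged.
```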
PageRank is a market mechanism. Implementing it---like all mechanisms---requires endless fixing around corner cases. An intriguing example is Google's trouble with Jewishness. This kind of "corner case fixing" might make one think that longevity in the market allows you to perfect the algorithm like no one else does, and so protects you from entry.
But if I were a Google shareholder, I would be worried by the analogy between Google search and a market mechanism. As every web content producer adjusts to Google, its results necessarily become less and less compelling. The joy of Google past was to think hard about the search query and get a first screen of results full of relevant but quirky, even obscure material. A Google result today is much less sensitive to the searcher, because every content maker is trying to "buy" space that it can't pay for in "genuine" links. SEO, even the unconscious SEO that is now so widely practised, will ossify Google, and a better solution will wipe it out with the speed of an epidemic. The web has become over-fitted to Google, like a strain of wheat over-bred for a specific ecology. The web is covered in content strategies tuned to Google, and a new mechanism will find a source of meaningful, un-manipulated information, just as the hyperlink was before PageRank made it a gameable commodity.
Google will disappear much faster than Wikipedia, because Google provides a flow of services, while the Wikipedia mechanism has been accumulating an asset in its millions of pages. But Wikipedia is not out of the woods yet. There is an auction analogy there too, from which I forecast that Wikipedia will be gradually locked down, its editing process more and more institutionalised. More of that in a future post.