Even the biggest party has to end one day! Here’s how to beat Google at its own game. This article draws inspiration from Wired magazine’s special edition ‘Googlemania’, published during Google’s IPO season. Also, I recently came across some research papers on the pitfalls in the search giant’s approach.
#1. Build a search engine which can crawl every webpage ever created on this planet.
Google, or any other search engine for that matter, is still scratching the surface of the web, and about 80% of the deep web remains unexplored. Some estimates say that the deep web (the part invisible to any search engine) is 550 times larger than the normal web space.
“Google has 3 billion pages in its database, with AlltheWeb and Inktomi close behind. But there may be a trillion more pages hiding in plain sight - in online databases such as WebMD and The New York Times' archive, and they can't be reached by hopping from one link to another”
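To see why "hopping from one link to another" misses these pages, here is a minimal sketch of a link-following crawl over a toy in-memory link graph. The page names and the `crawl` function are made up for illustration; a real crawler would fetch pages over HTTP and extract the links itself.

```python
from collections import deque

def crawl(seed, links):
    """Breadth-first crawl: visit every page reachable by following links.

    `links` maps a page name to the pages it links to (a hypothetical,
    hand-written graph standing in for the real web).
    """
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Toy site: the archive article exists, but it is only served in response
# to a search-form query, so no page links to it.
links = {
    "home": ["about", "archive-search"],
    "about": ["home"],
    "archive-search": [],
}
all_pages = {"home", "about", "archive-search", "archive/1999/article-17"}

found = crawl("home", links)
print(sorted(all_pages - found))  # pages the link-following crawl never sees
```

Any page that sits behind a query form rather than behind a link stays out of `found`, no matter how long the crawl runs.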
#2. Index and save a history of every webpage modification.
“Google lets you search only its most recently crawled version of the Web. Pages that were changed or deleted prior to the last crawl are lost forever. What if you could search every version of every page ever posted?”
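One way to picture "every version of every page" is an index that appends a new snapshot whenever a crawl finds changed content, instead of overwriting the old one. The sketch below is an assumption about how that could look; the `VersionedIndex` class and its methods are hypothetical names, and it keeps everything in memory with naive substring search.

```python
import hashlib
from collections import defaultdict

class VersionedIndex:
    """Keeps every distinct crawled version of a page, not just the latest."""

    def __init__(self):
        self.versions = defaultdict(list)  # url -> [(timestamp, text), ...]
        self._last_hash = {}               # url -> hash of the newest snapshot

    def record(self, url, timestamp, text):
        """Store a snapshot if the content changed since the last crawl."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self._last_hash.get(url) == digest:
            return False  # unchanged, nothing new to keep
        self._last_hash[url] = digest
        self.versions[url].append((timestamp, text))
        return True

    def search(self, term):
        """Return (url, timestamp) for every stored version containing term."""
        term = term.lower()
        return [
            (url, ts)
            for url, snapshots in self.versions.items()
            for ts, text in snapshots
            if term in text.lower()
        ]

index = VersionedIndex()
index.record("http://example.com/", 1, "Old headline")
index.record("http://example.com/", 2, "Old headline")  # no change, skipped
index.record("http://example.com/", 3, "New headline")
print(index.search("old"))  # still finds the version that was later replaced
```

The hard part at web scale is storage, which this sketch ignores entirely; storing diffs between versions rather than full copies would be one obvious refinement.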
#3. Follow and aggregate RSS feeds.
“News sites and blogs are supplementing their pages with RSS feeds - a service that pushes new content to subscribers as soon as it's published. Google doesn't track RSS feeds, and bloggers gripe that their posts take two to three days to show up in search results. An engine to which Web site owners could upload RSS would provide the latest version of every page.”
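As a rough illustration of what such an engine would do with uploaded feeds, the sketch below parses RSS 2.0 documents with Python's standard library and merges their items, newest first. The function names `parse_rss` and `aggregate` are my own, and the code assumes well-formed RSS 2.0 with `pubDate` values that carry a time zone; real feeds are messier.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

def parse_rss(xml_text):
    """Extract title, link and publication date from each <item> of a feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates use the RFC 822 format, e.g.
            # "Mon, 06 Sep 2004 10:00:00 +0000"
            "published": parsedate_to_datetime(pub) if pub else None,
        })
    return items

def aggregate(feeds):
    """Merge several feeds, drop duplicate links, and sort newest first."""
    merged = {}
    for xml_text in feeds:
        for entry in parse_rss(xml_text):
            merged[entry["link"]] = entry  # a later copy of a link wins
    dated = [e for e in merged.values() if e["published"] is not None]
    return sorted(dated, key=lambda e: e["published"], reverse=True)
```

Calling `aggregate` with the raw XML of each uploaded feed returns a single list of entries that an indexer could process as soon as a post is published, rather than waiting for the next crawl.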
#4. What about knowledge mining?
Yes, Google returns search results in milliseconds or less. So what? I still need to spend 30 minutes on Google to gather the information I need. Instead, information should be aggregated from various sources and presented as ‘information related to a context’. Something similar to Yahoo Mindset, the beta search where you can adjust results between ‘Shopping’ and ‘Researching’. If I type “Biography of Gandhi”, the search companion should return authentic extracts from various webpages.
#5. Google is not so good.
At Google, the rich get richer. That is, the web pages that appear at the top of the results page get linked to again by many people, which promotes them further. Google’s ‘backlink algorithm’ is already flawed and has been bombed by hackers. It’s time to move on to a better, expert algorithm. We need to build an algorithm that resembles an expert researcher who sifts through millions of junk pages to return golden nuggets of information and knowledge.
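The rich-get-richer effect is easy to reproduce with the published PageRank idea, in which a page's score is built from the scores of the pages linking to it. The sketch below is a plain power-iteration version over a toy graph; it is not Google's production ranking, and the damping factor of 0.85, the iteration count, and the page names are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of its links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Three pages link to "hub"; "hub" links back to only one of them.
links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph "hub" ends up on top because it collects the most backlinks, and "a" comes second purely because "hub" links to it. Every new link that a top-ranked page attracts feeds back into its score, which is the loop this point complains about.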
It’s not this easy. You need to build a complete, fast, free search companion that suits each and every user’s needs. More importantly, you should do it quickly!