
> Anyone can go build a crawler and scrape the web the way Google scrapes it so they can compete with Google.

I don't think that making a scraper will make you competitive with Google. If you can make a site ranking algorithm that competes with Google's, on the other hand, you might have a chance.



The site ranking algorithm is a solved problem.

The one reason Google is competitive is that it takes advantage of cheap labour to keep track of ranking manipulation.

Luckily most of the search problems have nothing to do with ranking manipulation.


Site ranking is not a “solved problem”: Google tries to solve it all the time, and yet finding anything other than trending or popular stuff still takes several attempts (and often doesn’t even surface the best results).


Google has a set of contradictory requirements for the interface on its website.

On one side it's a natural-language interface along the lines of Alexa or the like; on the other it's a search interface for people who simply need access to information.

If Google exposed interfaces similar to Elasticsearch, search would no longer be an issue, but it would not be easy for most users to use.
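
To make the contrast concrete, here is a rough sketch of what a structured query looks like next to a free-text box. It is a toy in-memory inverted index, not Elasticsearch's or Google's actual API; the corpus, `build_index`, and `search` with its `must`/`must_not` parameters are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus; the ids and text are invented for this example.
DOCS = {
    1: "cheap flights to paris",
    2: "paris hotel reviews",
    3: "history of paris france",
}

def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, must=(), must_not=()):
    """Return sorted ids of documents with every `must` term and no `must_not` term."""
    if not must:
        return []
    # Intersect the posting sets of all required terms.
    hits = set.intersection(*(index.get(term, set()) for term in must))
    # Drop any document that contains an excluded term.
    for term in must_not:
        hits -= index.get(term, set())
    return sorted(hits)

INDEX = build_index(DOCS)
```

With this, `search(INDEX, must=["paris"], must_not=["hotel"])` returns `[1, 3]`. The caller states exactly what they want and gets exactly that, but they have to know the query vocabulary up front, which is the usability cost.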


That’s precisely why they don’t want you cheating the hard part and just storing the results. It makes sense to me. Work on your own machine learning if you want good results.


There is no such thing as cheating, only staying within boundaries that don't land you in jail or sued in your own jurisdiction. If you can get an edge by using Google's own data, do so.


Bing bing bing!

Er, ughm. I mean,

Ding ding ding!


You could argue that Google should work on their own knowledge database instead of learning from other people's content and/or presenting other people's content in their own frontends (shopping etc)...


This is what Common Crawl does: http://commoncrawl.org/. I think more people should know about it.



