
So PageRank was an algorithm, and RankBrain is an AI? I'd love to understand a bit more of what makes them different from each other. I don't feel as though I've seen the search results become any better. In fact I've been frustrated by how many words it leaves out without telling me. Or how it says "here are the results to your search" when in all honesty it had 0 results to my search.



PageRank and RankBrain both seem to be (complicated) features of the actual core search algorithm, which is some unspecified machine-learned model [1].

PageRank, for example, can't be the whole search algorithm since it's not even query-dependent. It would just put the same most authoritative document at the top for every query.
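
To make the query-independence point concrete, here's a minimal sketch of textbook PageRank by power iteration, not Google's production code. The toy link graph, the `pagerank` function name, the 0.85 damping factor and the iteration count are all assumptions for illustration. Note that no query appears anywhere in the computation, so the resulting order is the same whatever you search for.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    `links` maps each page to the list of pages it links to. Every link
    target must also appear as a key (assumption of this sketch).
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's rank evenly among its outbound links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, made up for illustration.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
ranking = sorted(ranks, key=ranks.get, reverse=True)
```

On this toy graph, page "c" ends up first because three pages link to it, and it would be first for every query, which is why some query-dependent signal has to sit alongside it.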

Similarly, RankBrain doesn't sound like it could be the whole algorithm. It sounds like it is just a text understanding model (which wouldn't know anything about e.g. the global reputation of the document, the popularity of the document, etc.). In fact, the article explicitly confirms this:

>RankBrain has become the third-most important signal contributing to the result of a search query, he said.

I'd guess some kind of composite content quality signal and some kind of composite popularity signal sit at #1 and #2.

[1] - I don't have any insider knowledge about how Google works, but this article suggests that they were getting ready to switch from a hand-tuned model to a machine-learned one in 2008: http://anand.typepad.com/datawocky/2008/05/are-human-experts...


Hiya, author here - yes, you are correct. RankBrain is one of hundreds of distinct signals that go into the results page. It just happens to be one with a great deal of influence, which I think demonstrates the surprising generality/adaptability of this type of approach for natural language processing and interpretation.


Well, except that Google's results have gotten worse and worse over the past few years (I'd pinpoint the start at around 2013), to the point that often only one or two entries on the results page are relevant.

Even if I search for a very specific query, Google will give me results that ignore several parts of my query, with at best one or two about what I actually asked.

At this point, even DuckDuckGo often gives better results.

EDIT: Just found verbatim search; at least that gives results almost on par with DuckDuckGo. Still not as good, but it’s okay.


> Just found verbatim search

Even Verbatim search will happily lop off search words.

Tonight I was searching for "raf ballykelly vulcan" and after a few puzzling results realised that Ballykelly had been discarded. Since it was the airfield in question the results were therefore useless.


It is exactly this kind of issue that I am talking about. Even DuckDuckGo manages this better than Google. It’s seriously an issue.

And with Google Maps being already worse than Bing, OSM, or even fucking Apple Maps, all Google products I actively used are now useless for me.


Either they've fixed it or I'm not seeing the problem - I get a bunch of pages that include ballykelly and vulcan. Forcing "ballykelly" didn't seem to make any difference.


If you'd like to share the query I'd be really interested to debug.


I don’t have one specific query – it is EVERY query where Google will ignore 90%+ of the words I entered, and show me around 5 related results, and otherwise completely bullshit results, with no way to see more.


Use [Search Tools] => [All Results] => [Verbatim]. This is a known issue for people who have been using Google Search for years. There are many discussions about this topic, like: https://www.webmasterworld.com/google/4744658.htm


Even that doesn’t solve all issues – it still shows results that are irrelevant to my query and do not even match the query.

It should not be hard to convert a query into a regex for content, title and URL and apply it on the index.
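
For what it's worth, a naive version of that idea might look like the sketch below. It assumes each document is a plain dict with `title`, `url` and `content` keys and that every term must appear as a case-insensitive substring. `query_to_pattern`, `matches` and the sample documents are made up for illustration, and a linear regex scan like this says nothing about how it would perform over a real web-scale index.

```python
import re


def query_to_pattern(query):
    """Build a regex that requires every whitespace-separated term to appear."""
    terms = query.split()
    # One lookahead per term: each must match somewhere in the text,
    # in any order. re.escape keeps characters like "." literal.
    lookaheads = "".join(f"(?=.*{re.escape(term)})" for term in terms)
    return re.compile(lookaheads, re.IGNORECASE | re.DOTALL)


def matches(doc, pattern):
    """Check the pattern against title, URL and content together."""
    haystack = " ".join((doc["title"], doc["url"], doc["content"]))
    return pattern.search(haystack) is not None


# Made-up documents standing in for an index.
docs = [
    {
        "title": "Vulcan bombers at RAF Ballykelly",
        "url": "https://example.com/ballykelly",
        "content": "Notes on Vulcan visits to the airfield.",
    },
    {
        "title": "Avro Vulcan history",
        "url": "https://example.com/vulcan",
        "content": "The RAF operated the Vulcan from several airfields.",
    },
]

pattern = query_to_pattern("raf ballykelly vulcan")
hits = [doc["url"] for doc in docs if matches(doc, pattern)]
```

Here only the first document survives, because the second never mentions Ballykelly. Substring matching is deliberately crude: "raf" would also match inside "aircraft".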


That sounds really frustrating. Any examples you can give would really help to debug.


Often the issue occurred when searching for technical things. For example, I copy-pasted a Python exception, and Google would tell me where to download Python, which Python books to get, and that a python is a type of snake.

It’s incredibly frustrating, but if I, by accident, use Google again and see the issue, I’ll tell you.


Here, a recent case, where Google tries to completely ignore half of my query: https://www.google.de/search?q=TLD+.gov.sa


Verbatim search is what you want: On the search results page, choose "Search tools -> All results -> Verbatim"


Thank you for this. I felt like my search terms were being "conveniently" omitted for the longest time just to show me more results instead of more refined results.


Is there a way to set this permanently? I couldn't find anything.


Perform a search, then change the setting to Verbatim. Right-click the search bar and choose "Add as Keyword" or "Add as Search Engine" (depending on whether you are using Firefox or Chrome).

Then give it a keyword, I use "vg" for "Verbatim Google".

Then in the Navbar I can type "vg foo bar", which will search "foo bar" verbatim. Closest thing to permanent once you get used to using keywords ( which are awesome by the way :D )


I didn't see the Add as Search Engine option in Chrome Dev.

Specifically, the url you want to use is: https://www.google.com/search?q=%s&tbs=li:1
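
If you'd rather build that URL in a script than in a browser keyword, here's a small sketch. The only thing taken from the thread is the `tbs=li:1` parameter in the URL above. The `verbatim_search_url` helper name is made up, and whether Google keeps honoring that parameter is an assumption.

```python
from urllib.parse import urlencode


def verbatim_search_url(query):
    """Return a Google search URL with the Verbatim parameter appended."""
    params = {"q": query, "tbs": "li:1"}
    # safe=":" keeps the colon in "li:1" unescaped, matching the URL above.
    return "https://www.google.com/search?" + urlencode(params, safe=":")


url = verbatim_search_url("raf ballykelly vulcan")
```

`urlencode` takes care of spaces and other special characters in the query, which is the part the browser normally does for you when it fills in `%s`.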

Good tip. I use a similar one with site:en.wikipedia.org and I'm Feeling Lucky to quickly jump to Wikipedia articles.


Of course, thanks :)


RankBrain is at the front, query-interpretation end. PageRank is at the back end, for picking pages which reasonably match the query.

Does RankBrain have an intermediate form which shows its interpretation of the query? Wolfram Alpha does, and will show an explanation of how it interpreted the query. (It has to, because it may give you a numeric answer.) It would be useful for Google to tell you what question they think you are asking.


Marketing. AI is the new word for algorithm. I think most people probably feel that search quality is going down; unsurprising given Google's monopoly position.


That's going too far. AI always uses algorithms. What constituted it varied quite a bit. However, we usually allowed the term if it involved machine learning or decision-making based on heuristics. Especially if it was adaptive over time. The AIs were also usually more resource intensive (slower) than regular algorithms. That kept them out of use in many places until the AI field caught up with requirements.

PageRank was a simple, stupid algorithm that produced incredibly smart results. The exact kind of thing that sees widespread deployment with a startup. The description of this AI sounds more like an AI tool in general. It would've been much harder for Google to have started with this. The computers alone would've been prohibitive. So, we can call it an AI.


Are you saying that because Google is the leader, people perceive a drop in quality that may not be measurable? I suspect not but hope yes.


My guess is RankBrain is the personalization piece that operates on user data (location, history, etc.) while PageRank is the search index piece that operates on web data (web-pages, trends, etc.).



