Where it has fallen down (compared to its relative performance in relevant research) is public generative AI products [0]. It is trying very hard to catch up at that, and its disadvantage isn't technological, but that doesn't mean it isn't real and durable.
[0] I say "generative AI" because AI is a big and amorphous space, and lots of Google's products have some form of AI behind important features. I'm only talking about products where generative AI is the center of what the product offers, a category which has become a big deal recently and where Google has so far been delivering far below its general AI research weight class.
> Google is very good at AI research.
> Where it has fallen down (compared to its relative performance in relevant research) is public generative AI products
In such cases, I actually prefer Google over OpenAI. Monetization isn’t everything.
> In such cases, I actually prefer Google over OpenAI.
For what, moral kudos? (To be clear, I'm not saying that's a less important thing in some general sense; I'm saying that what is preferred always depends on what we're talking about preferences for.)
> Monetization isn’t everything
Providing a usable product (monetization is a different issue, though for a for-profit company the two tend to be closely connected) is ultimately what matters for people looking for a product to use.
> For the good of society? Performing and releasing bleeding edge research benefits everyone, because anyone can use it.
OK, but that only works if you actually do the part that lets people use the research for something socially beneficial. A research paper doesn't have social benefit in itself; the social benefit comes when someone does something with that research, as OpenAI has.
> There is nothing open about OpenAI and they wouldn't exist in their current form without years of research funded by Google.
True enough. But the fact remains that they're the ones delivering something we can actually use.
I personally think of it as open in the sense that they provide an API that allows anyone (who pays) to use it and take advantage of the training they did. This is in contrast to large companies like Google, which have lots of data and historically just use AI for their own products.
Edit:
I define it as having some level of openness beyond 'nothing'. The name hasn't scaled well over time as business considerations and the business environment changed, and it was a poor choice given that 'open source' is the common usage of 'open' within tech. They should have used AI products to help them name the company and to flag such potential controversies.
From ChatGPT today (which wasn't an option at the time, but they could maybe have gotten similar information or just thought about it more):
What are the drawbacks to calling an AI company 'open'?
...
"1. Expectations of Open Source: Using the term "open" might lead people to expect that the company's AI technology or software is open source. If this is not the case, it could create confusion and disappointment among users and developers who anticipate access to source code and the ability to modify and distribute the software freely.
2. Transparency Concerns: If an AI company claims to be "open," there may be heightened expectations regarding the transparency of their algorithms, decision-making processes, and data usage. Failure to meet these expectations could lead to skepticism or distrust among users and the broader public."
I mean, we do use that word to describe physical retail shops as being available to sell vs being closed to sell, so it's not an insane use... though I do think that in a tech context it's more misleading than not.
Compared to a curated video service like HBO Max, Hulu, or Netflix, that's an accurate way to describe the relative differences. We aren't used to using that terminology, though, so yes, it comes across as weird (and if the point is to communicate features, it's not particularly useful compared to other terminology that could be used).
It makes a bit less sense for search IMO, since that's the prevalent model as far as I'm aware, so there's not an easy and obvious comparison that is "closed" which allows us to view Google search as "open".
They publish but don't share. Who cares about your cool tech if we can't experience it ourselves? I don't care about your blog writeup or research paper.
Google is locked behind research bubbles, legal reviews and safety checks.
The researchers at all the other companies care about the blog write-ups and research papers. The Transformer architecture, for example, came from Google.
Sharing fundamental work is more impactful than sharing individual models.
To take an example from the past month, billions of users are now benefiting from more accurate weather forecasts from their new model. Is there another company making more money from AI-powered products than Google right now?
It's a very fuzzy question I posed. For pure customer-pays-for-AI-service it could be Microsoft. I'm kind of thinking of it as: Google's core products (search, ads, YouTube, Gmail) would not be possible without AI, and they are huge cash cows.
Only indirectly, but I wanted to point out that there are a lot of interesting research innovations that get implemented by Google and not some other company.
Or, well, like many companies: all the peons doing the actual work and creation, and the executives and investors profiting at the top. All it takes is the luck of being born into generational wealth, apparently.
I think the real underlying cause is the explosion of garbage that gets crawled. Google initially tried to use AI to find "quality" content in the pile. It feels like they gave up and decided to use the wrong proxies for quality. Proxies like "somehow related to a brand name". Good content that doesn't have some big name behind it gets thrown out with the trash.
I think the bottom line (profit) inversely correlates with the quality of search results. I've been using phind.com lately and it seems there can be search without junk even in this age.
Google has lots of people rating search results, which is very similar to RLHF raters ranking responses from LLMs. It's interesting that, using LLMs with RLHF, it's possible to de-junk search results. RLHF is great for this kind of task, as evidenced by its effect on LLMs.
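To make the analogy concrete: RLHF reward models are trained on pairwise preference labels ("response A is better than response B"), which is essentially what human search-quality raters produce for result pairs. A minimal sketch of that idea, with hypothetical toy features and hand-made preference data (Bradley-Terry pairwise loss, plain Python, not any actual Google or OpenAI pipeline):

```python
import math

# Hypothetical features for a search result: (keyword_match, spam_score).
# Raters give pairwise labels: the first result in each pair is "better".
# A Bradley-Terry model learns a scalar quality score from those labels,
# the same setup an RLHF reward model uses for response preferences.

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train(pairs, dim, lr=0.5, epochs=200):
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            # p = sigmoid(score(better) - score(worse)); push p toward 1
            margin = score(w, better) - score(w, worse)
            p = 1.0 / (1.0 + math.exp(-margin))
            g = 1.0 - p  # gradient of the log-likelihood wrt the margin
            for i in range(dim):
                w[i] += lr * g * (better[i] - worse[i])
    return w

# Toy preference data: raters prefer relevant, non-spammy results.
pairs = [
    ((0.9, 0.1), (0.8, 0.9)),  # relevant+clean beats relevant+spammy
    ((0.5, 0.0), (0.9, 1.0)),  # clean page beats keyword-stuffed spam
    ((0.7, 0.2), (0.2, 0.1)),  # more relevant beats less relevant
]
w = train(pairs, dim=2)
assert score(w, (0.9, 0.1)) > score(w, (0.8, 0.9))
```

The learned score can then re-rank candidates, which is exactly the "de-junking" step: the model ends up penalizing the spam feature because raters consistently preferred the cleaner result.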
Right. It’s less that their declining quality of search results is due to AI and more that the AI got really good at monetizing, and monetization and quality search results are sometimes in opposition.
This entire thread kinda ignores that they are also selling ad space on many sites, and their objective function in ordering search results is not just the best possible result. Case in point: the many sites that steal Stack Overflow content and fill it with adverts ranking higher than the source, which committed the cardinal sin of running its own ad network.
> I've been using phind.com lately and it seems there can be search without junk even in this age.
A few reasons partially (if not fully) responsible for it might be:
- Google is a hot target of SEO, not Phind.
- If Google stopped indexing certain low-quality sites without a strong justification, there would be lawsuits, or people complaining that "Google hasn't indexed my site" or whatever. How would you authoritatively define "low quality"?
- Having to provide search for the whole spectrum of users across various languages and countries, not just for "tech users".
The Web has grown by 1000x over the years. The overall signal-to-noise ratio has worsened by around 100x, and SEO has become much more sophisticated and optimized against Google. A large fraction of quality content has been moving toward walled gardens. The goalposts are moving (much) faster than the technology.
Yup, and we humans produce as much garbage as we can, too: "60 hours of black screen" type videos on YouTube that have to be stored on CDNs across the globe, Taboola's absolutely vile ads, endless scripted content made by content creators for short-term shock/wow value.
I recently Google-searched "80cm to inches" and it gave me the result for "80 meters to inches". I can't figure out how it would make this mistake, aside from some poorly conceived LLM usage.
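For reference, the two queries differ by exactly a factor of 100, so the wrong answer is easy to spot with a quick sanity check (hypothetical helper, just the standard 2.54 cm/inch definition):

```python
CM_PER_INCH = 2.54  # exact by definition of the international inch

def cm_to_inches(cm: float) -> float:
    return cm / CM_PER_INCH

# 80 cm is about 31.5 inches; 80 m (8000 cm) is about 3149.6 inches,
# so returning the meters result means being off by a factor of 100.
assert round(cm_to_inches(80), 1) == 31.5
assert round(cm_to_inches(8000), 1) == 3149.6
```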
I highly doubt that this is related to any LLM use. It would be breathtakingly uneconomical and completely unnecessary. It's not even interesting enough for an experiment.
This does highlight the gap between SOTA and business production. Google search is very often a low quality, even user hostile experience. If Google has all this fantastic technology, but when the rubber hits the road they have no constructive (business supporting) use cases for their search interface, we are a ways away from getting something broadly useful.
It will be interesting to see how this percolates through the existing systems.
I was at first just saying that search as PageRank in the early days was an ML marvel that changed the way people interact with the internet. Figuring out how to monetize and financially survive as a business has certainly changed the direction of its development and usability.
This is because their searches are so valuable that real intelligence, i.e. humans, has been fighting to defeat Google's AI over billions of dollars of potential revenue.