If this ran on Google's own cloud it amounts to internal bookkeeping. The only c...

rrr_oh_man · on July 30, 2024

> They chose to publish. So they are interested in seeing it reproduced or improved upon.

Call me cynical, but this is not what I have experienced to be the #1 reason for publishing AI papers.

ash-ali · on July 30, 2024

I hope someone could share their insight on this comment. I think the other comments are fragile and don't hold too strongly.

theptip · on July 30, 2024

Marketing of some sort. Either “come to Google and you’ll have access to H100s and freedom to publish and get to work with other people who publish good papers”, which appeals to the best researchers, or for smaller companies, benchmark pushing to help with brand awareness and securing VC funding.

pishpash · on July 31, 2024

Come be dishwashers in the fancy kitchen! You can only have one chef after all and the line cook positions are filled long ago too, but dishes don't wash themselves.

godelski · on July 30, 2024

It's commonly discussed in AI/ML groups that a paper at a top conference is "worth a million dollars." Not all papers, some papers are worth more. But it is in effect discussing the downstream revenues. As a student, it is your job and potential earnings. As a lab it is worth funding and getting connected to big tech labs (which creates a feedback loop). And to corporations, it is worth far more than that in advertising.

The unfortunate part of this is that it can have odd effects like people renaming well known things to make the work appear more impressive, obscure concepts, and drive up their citations.[0] The incentives do not align to make your paper as clear and concise as possible to communicate your work.

[0] https://youtu.be/Pl8BET_K1mc?t=2510

echoangle · on July 30, 2024

As someone not in the AI space, what do you think is the reason for publishing? Marketing and hype for your products?

simonw · on July 30, 2024

Retaining your researchers so they don't get frustrated and move to another company that lets them publish.

a_bonobo · on July 30, 2024

and attracting other researchers so your competitors can't pick them up to potentially harm your own business

stairlane · on July 30, 2024

> The only cost is then the electricity and used capacity. Not consumer pricing. So negligible.

I don’t think this is valid, as this point seems to ignore the fact that the data center that this compute took place in required a massive investment.

A paper like this is more akin to HEPP research. Nobody has the capability to reproduce the Higgs results outside the facility where the research was conducted (CERN).

I don’t think reproduction was a concern of the researchers.

morbia · on July 30, 2024

The Higgs results were reproduced because there are two independent detectors at CERN (ATLAS and CMS). Both collaborations are run almost entirely independently, and the press are only called in to announce a scientific discovery if both find the same result.

Obviously the 'best' result would be to have a separate collider as well, but no one is going to fund a new collider just to reaffirm the result for a third time.

stairlane · on July 30, 2024

Absolutely, and well stated.

The point I was trying to make was the fact that nobody (meaning govt bodies) was willing to make another collider capable of repeating the results. At least not yet ;).

Rastonbury · on Aug 1, 2024

Kinda, but Google sells compute, so it makes money off the data centre investment. Assuming they had spare capacity for this, it's negligible at Google scale.

rty32 · on July 30, 2024

Opportunity cost is cost. What you could have earned by selling the resources to customers instead of using them yourself is what the resources are worth.

g15jv2dp · on July 30, 2024

This assumes that you can sell 100% of the resources' availability 100% of the time. Whenever you have more capacity than you can sell, there's no opportunity cost in using it yourself.
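
To put the two positions in numbers, here is a minimal sketch. Everything in it is an assumption for illustration: the function name `opportunity_cost`, the `sellable_fraction` parameter, and the GPU-hour and price figures are made up, not taken from the paper or from Google.

```python
def opportunity_cost(gpu_hours_used, price_per_gpu_hour, sellable_fraction):
    """Revenue forgone by using capacity internally instead of selling it.

    sellable_fraction is a hypothetical share of those GPU-hours that a
    paying customer would otherwise have bought:
    1.0 means the fleet is sold out, 0.0 means the hardware would sit idle.
    """
    if not 0.0 <= sellable_fraction <= 1.0:
        raise ValueError("sellable_fraction must be between 0 and 1")
    return gpu_hours_used * price_per_gpu_hour * sellable_fraction


# Illustrative numbers only (assumptions, not real pricing or usage data).
hours = 1_000_000   # GPU-hours consumed by the internal job
price = 2.0         # external price per GPU-hour, in dollars

sold_out = opportunity_cost(hours, price, 1.0)  # every hour could have been sold
idle = opportunity_cost(hours, price, 0.0)      # nobody would have bought them
half = opportunity_cost(hours, price, 0.5)      # somewhere in between

print(sold_out, idle, half)
```

Under these assumptions the whole disagreement comes down to what `sellable_fraction` really is, which nobody outside the company can observe.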

michaelt · on July 30, 2024

A few months back, a lot of the most powerful GPU instances on GCP seemed to be sold out 24/7.

I suppose it's possible Google's own infrastructure is partitioned from GCP infrastructure, so they have a bunch of idle GPUs even while their cloud division can sell every H100 and A100 they can get their hands on?

dmurray · on July 30, 2024

I'd expect they have both: dedicated machines that they usually use and are sometimes idle, but also the ability to run a job on GCP if it makes sense.

(I doubt it's the other way round, that the Deepmind researchers could come in one day and find all their GPUs are being used by some cloud customer).

myworkinisgood · on July 30, 2024

As someone who worked for a compute time provider, I can tell you that the last people who can use the system for free are internal people. Because external people bring in cash revenue while internal people just bring in potential future revenue.

nkrisc · on July 30, 2024

Not if you’re only using the resources when they’re available because no customer has paid to use them.

K0balt · on July 30, 2024

I think Google produces their own power, so they don’t pay distribution cost which is at least one third of the price of power, even higher for large customers.

Cthulhu_ · on July 30, 2024

I'd argue it's not hard to reproduce per se, just expensive; thankfully there are at least half a dozen (cloud) computing providers that have the necessary resources to do so. Google Cloud, AWS and Azure are the big competitors in the west (it seems / from my perspective), but don't underestimate the likes of Alibaba, IBM, DigitalOcean, Rackspace, Salesforce, Tencent, Oracle, Huawei, Dell and Cisco.

pintxo · on July 30, 2024

> They chose to publish. So they are interested in seeing it reproduced or improved upon.

Not necessarily, publishing also ensures that the stuff is no longer patentable.

slashdave · on July 30, 2024

Forgive me if I am wrong, but all of the techniques explored are already well known. So, what is going to be patented?

fragmede · on July 30, 2024

the fundamental algorithms have been, sure, but there are innumerable enhancements upon those base techniques to be found and patented.

pintxo · on July 31, 2024

I merely listed another reason why someone would publish something. This did not imply they did it for that reason.

jfengel · on July 30, 2024

Is the electricity cost negligible? It's a pretty compute intensive application.

Of course it would be a tiny fraction of the $10m figure here, but even 1% would be $100,000. Negligible to Google, but for Google even $10 million is couch cushion money.

dekhn · on July 30, 2024

The electricity cost is not negligible. I ran a service that had multiples of $10M in marginal electricity spend (i.e., servers running at 100% utilization, consuming a significantly higher fraction than when idle or partly idle). Ultimately, the scientific discoveries weren't worth the cost, so we shut the service down.

$10M is about what Google would spend to get a publication in a top-tier journal. But Google's internal pricing and costs don't look anything like what people cite for external costs; it's more like a state-supported economy with some extremely rich oligarch-run profit centers that feed all the various cottage industries.

stavros · on July 30, 2024

I feel like your comment answers itself: If you have the money to be running a datacenter of thousands of A100 GPUs (or equivalent), the cost of the electricity is negligible to you, and definitely worth training a SOTA model with your spare compute.

dylan604 · on July 30, 2024

Is it really spare compute? Is the demand from others so low that these systems are truly idle? Does this also artificially make it look like demand is high because internal tasks are using it?

K0balt · on July 30, 2024

I’d imagine publishing is more oriented toward attracting and retaining talent. You need to scratch that itch or the academics will jump ship.

ape4 · on July 30, 2024

It's like them running SETI@home ;)

dekhn · on July 30, 2024

We ran Folding@Home at Google. We were effectively the largest single contributor of cycles for at least a year. It wasn't scientifically worthwhile, so we shut it down after a couple of years.

That was using idle cycles on Intel CPUs, not GPUs or TPUs though.