
>By one estimate, the training time for AlphaGo cost $35 million [0]

How about XLNet, which cost something like $30k-60k to train [1]? GPT-2 is estimated to have cost around the same [2], while thankfully BERT only costs about $7k [3], unless of course you're going to do any new hyperparameter tuning on their models, which you of course will do on your own model. Who cares about apples-to-apples comparisons?

We're not talking about spending an extra couple hours and a little money on updated replication. We're talking about an immediate overhead of tens to hundreds of thousands of dollars per new paper.
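
To make the scale concrete, here's a rough back-of-envelope sketch. All the numbers are hypothetical placeholders of mine, not figures from the links above; plug in your own device counts and cloud list prices:

    # Rough estimate of the cost of re-running baselines for a new paper.
    # Numbers below are hypothetical, not taken from the cited sources.

    def training_cost(num_accelerators: int, hours: float, price_per_hour: float) -> float:
        """On-demand training cost: devices x wall-clock hours x hourly rate per device."""
        return num_accelerators * hours * price_per_hour

    # Hypothetical single pre-training run: 128 accelerators for 5 days at $3/hour each.
    single_run = training_cost(num_accelerators=128, hours=5 * 24, price_per_hour=3.0)

    # Re-validating three prior baselines plus your own model on an updated benchmark.
    per_paper_overhead = 4 * single_run

    print(f"single run: ${single_run:,.0f}")          # ~$46,080
    print(f"per-paper overhead: ${per_paper_overhead:,.0f}")  # ~$184,320

Even with made-up but plausible inputs, you land squarely in the "tens to hundreds of thousands of dollars per paper" range, before any hyperparameter sweeps.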

Tasks are already updated over time to take issues into account, but not continuously, as far as I know.

[0] https://www.wired.com/story/deepminds-losses-future-artifici...

[1] https://twitter.com/jekbradbury/status/1143397614093651969

[2] https://news.ycombinator.com/item?id=19402666

[3] https://syncedreview.com/2019/06/27/the-staggering-cost-of-t...



BERT is pretrained on unlabeled data with a self-supervised objective. It's not the same kind of model the article talks about.


Yeah, it is by no means wasteful for AlphaGo to throw away all its training data and then retrain itself!

That kind of ruthless experimentation is how AlphaGo was able to exceed even itself. The willingness to say - all these human games we've fed the computer? All these terabytes of data? It's all meaningless! We're going to throw it all away! We will have AlphaGo determine what is good by playing games against itself!
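
For what it's worth, the self-play idea boils down to something like the sketch below. This is purely illustrative, not DeepMind's actual pipeline; Model, play_game, and update are hypothetical stand-ins for a real network, MCTS-guided self-play, and gradient updates:

    import random

    # Sketch of training purely from self-play: no human games, the only
    # signal is the outcome of games the current model plays against itself.

    class Model:
        def __init__(self):
            self.strength = 0.0  # placeholder for learned parameters

    def play_game(model: Model) -> list:
        """Play one self-play game, return (state, move, outcome) records."""
        # A real system would run search guided by the network; here we
        # fabricate a trivial trajectory just to keep the sketch runnable.
        outcome = random.choice([+1, -1])
        return [("state", "move", outcome)]

    def update(model: Model, games: list) -> None:
        """Fit the model to its own game outcomes (stand-in for SGD on policy/value losses)."""
        model.strength += 0.01 * len(games)

    model = Model()
    for iteration in range(10):
        games = [record for _ in range(32) for record in play_game(model)]
        update(model, games)  # learn only from self-generated data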

And I bet you that for the next iteration of AlphaGo, its creators will again delete their own data and retrain once they have a better approach.

If you don't "waste" your existing datasets (once you realize the flaws in them), you are being held back by the sunk-cost fallacy. You only have yourself to blame when someone else trains for the exact same purpose, but with cleaner data.

The person who has the cleanest source of training data will win in deep learning.

You're sabotaging yourself, in my opinion. $30k is nothing when the alternative is sabotaging your training with faulty data.


I'm actually glad it costs so much to train these models. Great incentive to find more efficient algorithms. That's how biological brains evolved.


As an investor, $35m to train just about the pinnacle of AI seems like a cheap, oh so cheap, cost. I can't even buy one freaking continental jet at that price, and there are thousands of these babies flying (not as we speak, but generally).

I don't think you are fully cognizant yet of the formidable scale of AI in the grander scheme of things: as an industry, it is nowadays comparable to transistors circa 1972 in terms of maturity. There's a long, long way to go before we settle on "reference" anything. Whether architectures, protocols, models, or test standards, it's the Wild West as we speak.

You make excellent points in principle, which are important to keep in mind in guiding us all along the way, but now is not the time to set things in stone. More like the opposite.

The fact of the matter is that someone will eventually grab both the old and new benchmarks, prove superiority on both, and by that point the new one is the one to beat, since it would presumably be error-free this time.



