
>By one estimate, the training time for AlphaGo cost $35 million [0]

How about XLNet, which cost something like $30k-60k to train [1]? GPT-2 is estimated to have cost around the same [2], while thankfully BERT only costs about $7k [3], unless of course you're going to do any new hyperparameter tuning on their models, which you of course will do on your own model. Who cares about apples-to-apples comparisons?

We're not talking about spending an extra couple hours and a little money on updated replication. We're talking about an immediate overhead of tens to hundreds of thousands of dollars per new paper.
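
To make the scale concrete, here's a rough back-of-envelope sketch. All the numbers are hypothetical placeholders of mine, not figures from the links above; plug in your own device counts and cloud list prices:

    # Rough estimate of the cost of re-running baselines for a new paper.
    # Numbers below are hypothetical, not taken from the cited sources.

    def training_cost(num_accelerators: int, hours: float, price_per_hour: float) -> float:
        """On-demand training cost: devices x wall-clock hours x hourly rate per device."""
        return num_accelerators * hours * price_per_hour

    # Hypothetical single pre-training run: 128 accelerators for 5 days at $3/hour each.
    single_run = training_cost(num_accelerators=128, hours=5 * 24, price_per_hour=3.0)

    # Re-validating three prior baselines plus your own model on an updated benchmark.
    per_paper_overhead = 4 * single_run

    print(f"single run: ${single_run:,.0f}")          # ~$46,080
    print(f"per-paper overhead: ${per_paper_overhead:,.0f}")  # ~$184,320

Even with made-up but plausible inputs, you land squarely in the "tens to hundreds of thousands of dollars per paper" range, before any hyperparameter sweeps.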

Tasks are already updated over time to take issues into account, but not continuously, as far as I know.

[0] https://www.wired.com/story/deepminds-losses-future-artifici...

[1] https://twitter.com/jekbradbury/status/1143397614093651969

[2] https://news.ycombinator.com/item?id=19402666

[3] https://syncedreview.com/2019/06/27/the-staggering-cost-of-t...



BERT is pretrained on unlabeled data with a self-supervised objective. It's not the same kind of model the article talks about.


Yeah, it is by no means wasteful for AlphaGo to throw away all its training data and then retrain itself!

That kind of ruthless experimentation is how AlphaGo was able to exceed even itself. The willingness to say - all these human games we've fed the computer? All these terabytes of data? It's all meaningless! We're going to throw it all away! We will have AlphaGo determine what is good by playing games against itself!
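
For what it's worth, the self-play idea boils down to something like the sketch below. This is purely illustrative, not DeepMind's actual pipeline; Model, play_game, and update are hypothetical stand-ins for a real network, MCTS-guided self-play, and gradient updates:

    import random

    # Sketch of training purely from self-play: no human games, the only
    # signal is the outcome of games the current model plays against itself.

    class Model:
        def __init__(self):
            self.strength = 0.0  # placeholder for learned parameters

    def play_game(model: Model) -> list:
        """Play one self-play game, return (state, move, outcome) records."""
        # A real system would run search guided by the network; here we
        # fabricate a trivial trajectory just to keep the sketch runnable.
        outcome = random.choice([+1, -1])
        return [("state", "move", outcome)]

    def update(model: Model, games: list) -> None:
        """Fit the model to its own game outcomes (stand-in for SGD on policy/value losses)."""
        model.strength += 0.01 * len(games)

    model = Model()
    for iteration in range(10):
        games = [record for _ in range(32) for record in play_game(model)]
        update(model, games)  # learn only from self-generated data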

And I bet you that for the next iteration of AlphaGo, its creators will again delete their own data and retrain once they have a better approach.

If you don't "waste" your existing datasets (once you realize the flaws in them), you are being held back by the sunk-cost fallacy. You only have yourself to blame when someone else trains for the exact same purpose, but with cleaner data.

The person who has the cleanest source of training data will win in deep learning.

You're sabotaging yourself, in my opinion. $30k is nothing when the alternative is sabotaging your training with faulty data.


I'm actually glad it costs so much to train these models. Great incentive to find more efficient algorithms. That's how biological brains evolved.


As an investor, $35m to train just about the pinnacle of AI seems like a cheap, oh so cheap, cost. I can't even buy one freaking continental jet at that price, and there are thousands of these babies flying (not as we speak, but generally).

I don't think you are fully cognizant yet of the formidable scale of AI in the grander scheme of things: as an industry, it is nowadays comparable to transistors circa 1972 in terms of maturity. There's a long, long way to go before we settle on "reference" anything. Whether architectures, protocols, models, or test standards, it's the Wild West as we speak.

You make excellent points in principle, which are important to keep in mind in guiding us all along the way, but now is not the time to set things in stone. More like the opposite.

The fact of the matter is that someone will eventually grab both the old and new benchmarks, prove superiority on both, and by that point the new one is the one to beat, since it would presumably be error-free this time.



