
Both. Companies are certainly building bigger and bigger clusters for training.

At the same time though, consumer GPUs have gotten significantly faster (compare e.g. an Nvidia 2080 Ti to a 980 Ti), and learning algorithms keep improving, with better ones becoming more widely used (e.g. Adam instead of plain stochastic gradient descent), as in the sketch below.
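For illustration, a minimal sketch (assuming PyTorch; the toy model and learning rates are made up) of swapping plain SGD for Adam. The training step itself is unchanged; only the optimizer object differs:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

    # Plain stochastic gradient descent: one global learning rate for all parameters.
    sgd = torch.optim.SGD(model.parameters(), lr=0.01)

    # Adam: keeps per-parameter running averages of gradients and squared gradients,
    # adapting the effective step size per parameter; it typically needs fewer steps
    # of tuning to train well on the same hardware.
    adam = torch.optim.Adam(model.parameters(), lr=1e-3)

    # The update loop is identical either way.
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    adam.step()
    adam.zero_grad()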




And also, neural architecture search has made it possible to build networks from more efficient building blocks, using far fewer parameters and reaching the same accuracy with smaller models (which also lowers training cost).
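As a rough sketch of the kind of efficient block meant here (assuming PyTorch; depthwise-separable convolutions are one well-known example, e.g. in MobileNet-style models, and the channel sizes below are illustrative rather than from any particular searched architecture), a separable block replaces a standard 3x3 convolution at a fraction of the parameter count:

    import torch.nn as nn

    in_ch, out_ch = 128, 128

    # Standard 3x3 convolution: every output channel mixes every input channel spatially.
    standard = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    separable = nn.Sequential(
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch),
        # Pointwise: 1x1 convolution to mix channels.
        nn.Conv2d(in_ch, out_ch, kernel_size=1),
    )

    count = lambda m: sum(p.numel() for p in m.parameters())
    print(count(standard), count(separable))  # roughly 147k vs 18k parameters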



