In the paper, it seemed to work quite well for LSTMs (language modeling), although it is best suited to tasks with very large neural networks, where parallelism is beneficial.

