Inference on GPU is already quite slow on the full-scale, non-distilled model (in the 1–2 second range, IIRC); on CPU it would be an order of magnitude slower.
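
For anyone wanting to check numbers like this themselves, here's a minimal PyTorch timing sketch. The model below is just a stand-in (the comment doesn't name the actual one), so swap in the real full-scale model; the key detail is the `torch.cuda.synchronize()` calls, since CUDA kernels launch asynchronously and naive wall-clock timing will under-report GPU latency without them.

    import time
    import torch
    import torch.nn as nn

    def time_inference(model: nn.Module, x: torch.Tensor, device: str, runs: int = 10) -> float:
        """Return mean per-call inference latency in seconds on `device`."""
        model = model.to(device).eval()
        x = x.to(device)
        with torch.no_grad():
            model(x)  # warm-up: triggers lazy CUDA init / kernel selection
            if device == "cuda":
                torch.cuda.synchronize()  # wait for async GPU work before timing
            start = time.perf_counter()
            for _ in range(runs):
                model(x)
            if device == "cuda":
                torch.cuda.synchronize()
            return (time.perf_counter() - start) / runs

    # Hypothetical stand-in workload; replace with the real model to reproduce.
    model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))
    x = torch.randn(1, 4096)

    cpu_t = time_inference(model, x, "cpu")
    print(f"CPU: {cpu_t:.3f} s/call")
    if torch.cuda.is_available():
        gpu_t = time_inference(model, x, "cuda")
        print(f"GPU: {gpu_t:.3f} s/call ({cpu_t / gpu_t:.1f}x speedup)")

The CPU-to-GPU ratio this prints is what the "order of magnitude" claim is about; the exact factor depends heavily on model size and batch size.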
