
With this card you can also run OpenAI's Whisper with the Large model (the multilingual one!), since it requires about 10 GB of VRAM.


Highly recommend quantizing the model (https://pytorch.org/tutorials/recipes/recipes/dynamic_quanti...). I converted the large model to int8, and I'm able to run it at 5x real-time on CPU with fairly low RAM requirements and still very good quality.
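A minimal sketch of the dynamic quantization the linked tutorial describes. The toy `nn.Sequential` model here is a stand-in for Whisper (which is built largely from `nn.Linear` layers, the type dynamic quantization targets); the call to `torch.quantization.quantize_dynamic` is the real API:

```python
import torch
import torch.nn as nn

# Stand-in for a transformer block; Whisper's encoder/decoder are
# mostly nn.Linear layers, which is what dynamic quantization converts.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# Weights are converted to int8 ahead of time; activations are
# quantized dynamically per batch at inference time.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
out = qmodel(x)  # runs entirely on CPU with int8 weights
```

The same call applied to a loaded Whisper model shrinks the weight memory roughly 4x versus fp32, which is where the low RAM requirement comes from.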


My implementation of Whisper uses slightly over 4 GB of VRAM when running their large multilingual model: https://github.com/Const-me/Whisper



