Hacker News

I'm a Python neophyte, but I've read that Python 3.11 is 10-60% faster than 3.10, so that may be a consideration.


In this particular case that doesn't matter, because the only time you run Python is for a one-off conversion of the model files.

That takes at most a minute, and once the model is converted you never need to run it again. Actual llama.cpp inference uses compiled C++ code, with no Python involved at all.
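To make the split concrete, here is a rough sketch of the workflow against a llama.cpp checkout. Script and binary names vary by version (older checkouts used convert.py, quantize, and main; newer ones use convert_hf_to_gguf.py, llama-quantize, and llama-cli), and the model path is a placeholder:

```shell
# One-off step: convert the original model weights into llama.cpp's file format.
# This is the only place Python runs, so the interpreter version barely matters.
python3 convert.py models/7B/ --outtype f16

# Optional one-off step: quantize the converted file to shrink it.
./quantize models/7B/ggml-model-f16.gguf models/7B/ggml-model-q4_0.gguf q4_0

# Everyday inference: compiled C++, no Python involved.
./main -m models/7B/ggml-model-q4_0.gguf -p "Hello"
```

After the first two steps you only ever touch the compiled binaries, which is why interpreter speed is irrelevant here.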




