
> With 5 finetunes you need to host 5 copies or load and unload them.

If you use LoRA, which many do when fine-tuning nowadays, you don't need five full copies. You keep one copy of the base model and store only the adapters, which can be in the tens of MB per finetune.
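
To put rough numbers on that, here is a back-of-the-envelope sketch. LoRA adds two low-rank factors per adapted weight matrix, so a `d_in x d_out` matrix at rank `r` costs `r * (d_in + d_out)` extra parameters. The model shape below (32 layers, 4096 hidden size, two adapted projections per layer, rank 8, fp16) is an assumption picked for illustration, not any specific model's config, and the function names are made up for this example.

```python
def lora_param_count(d_in, d_out, rank):
    # LoRA factors: A is (rank x d_in), B is (d_out x rank)
    return rank * (d_in + d_out)


def adapter_size_bytes(num_layers, matrices_per_layer, d_in, d_out, rank,
                       bytes_per_param=2):
    # Total adapter size if every adapted matrix has the same shape.
    per_matrix = lora_param_count(d_in, d_out, rank)
    return num_layers * matrices_per_layer * per_matrix * bytes_per_param


# Assumed shape: 32 layers, 2 adapted 4096x4096 projections each, rank 8, fp16.
size = adapter_size_bytes(num_layers=32, matrices_per_layer=2,
                          d_in=4096, d_out=4096, rank=8)
print(f"{size / 2**20:.1f} MiB")  # prints "8.0 MiB"
```

Higher ranks or adapting more matrices per layer scale this linearly, which is how you land in the tens of MB.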



You can also batch requests that use different LoRAs against the same base model. See "S-LoRA: Serving Thousands of Concurrent LoRA Adapters". https://arxiv.org/abs/2311.03285
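
A toy sketch of why mixing adapters in one batch works: the base weights `W` are shared by every request, and each request only adds its own low-rank correction `B(Ax)` on top. This is pure Python on tiny lists to show the idea, not how S-LoRA implements it; the function names and the adapter dictionary layout are assumptions for this example.

```python
def matvec(M, x):
    # Plain matrix-vector product on nested lists.
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def batched_lora_forward(W, adapters, batch):
    """batch is a list of (adapter_name, x); adapters maps name -> (A, B)."""
    outputs = []
    for name, x in batch:
        y = matvec(W, x)                 # shared base weights, same for everyone
        A, B = adapters[name]            # A: (r x d_in), B: (d_out x r)
        delta = matvec(B, matvec(A, x))  # per-request low-rank update
        outputs.append([yi + di for yi, di in zip(y, delta)])
    return outputs


W = [[1, 0], [0, 1]]  # toy base weight (identity)
adapters = {
    "a": ([[1, 0]], [[1], [0]]),  # rank-1 adapter
    "b": ([[0, 1]], [[0], [2]]),  # a different rank-1 adapter
}
batch = [("a", [1, 2]), ("b", [1, 2])]
print(batched_lora_forward(W, adapters, batch))  # prints [[2, 2], [1, 6]]
```

In a real server the base matmul would run once over the whole batch on the GPU and only the small per-adapter part would be gathered by adapter ID.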



