An interesting corollary of this is that if you want to reduce the model size, you can compensate by training for longer to achieve the same accuracy. Depending on your training:inference ratio, this may be the better choice globally, reducing your total compute cost or even just your frontend latency.
Yeah, though I have not seen a formula that takes the expected number of inference runs into account when calculating the optimal data/parameter balance.
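For what it's worth, a back-of-the-envelope version isn't hard to sketch using the usual ~6ND training and ~2ND inference FLOP approximations. The hard part such a formula would actually need (and which the numbers below just assume) is the set of (N, D) pairs that reach the same loss, which you'd have to get from a fitted scaling law (e.g. Hoffmann et al. 2022). All concrete numbers here are made up for illustration:

    # Rough lifetime compute model using the standard approximations:
    #   training  ~ 6 * N * D  FLOPs  (forward + backward pass)
    #   inference ~ 2 * N * T  FLOPs  (forward pass only)
    # N = parameter count, D = training tokens, T = lifetime inference tokens.

    def lifetime_flops(n_params: float, train_tokens: float,
                       lifetime_inference_tokens: float) -> float:
        """Total FLOPs spent on a model over training plus deployment."""
        training = 6.0 * n_params * train_tokens
        inference = 2.0 * n_params * lifetime_inference_tokens
        return training + inference

    # Hypothetical comparison: two (N, D) pairs *assumed* to reach the
    # same loss -- in practice these would come from a fitted scaling law,
    # which is exactly the piece the usual compute-optimal formulas don't
    # combine with expected inference volume.
    lifetime_tokens = 1e13  # assumed total tokens served over deployment

    big_model   = lifetime_flops(70e9, 1.4e12, lifetime_tokens)
    small_model = lifetime_flops(35e9, 3.5e12, lifetime_tokens)  # overtrained

    print(f"big:   {big_model:.2e} FLOPs")
    print(f"small: {small_model:.2e} FLOPs")
    # The smaller model costs more to train here (6*N*D is larger), but
    # half as much per inference token, so with enough lifetime traffic
    # it comes out ahead on total compute.

With those made-up numbers the smaller, longer-trained model wins overall (~1.4e24 vs ~2.0e24 FLOPs), and the break-even point shifts with the assumed lifetime token count, which is presumably why no single formula covers it.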