unstuck3958 on Dec 13, 2023 | on: MemoryCache: Augmenting local AI with browser data
While I agree a website/spreadsheet would be convenient, it's not that complicated. As long as the GPU is handling 50-75% of the LLM's layers, you should get a decent tok/sec speed (unless you're running really large models).
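A rough sketch of that arithmetic, in Python: model weights at a given quantization, scaled by the fraction of layers offloaded to the GPU, plus a fixed allowance for KV cache and buffers. The Mixtral 8x7B figures (~46.7B total parameters, 32 layers) are real; the 4-bit quantization and 1.5 GB overhead are illustrative assumptions, not measurements.

    # Back-of-the-envelope VRAM estimate for partial GPU offload.
    # Hard figures: Mixtral 8x7B has ~46.7B total params across 32 layers.
    # Assumptions: 4-bit quantization, ~1.5 GB for KV cache/buffers/context.
    def estimate_vram_gb(total_params_b=46.7,   # billions of parameters
                         num_layers=32,
                         gpu_layers=24,         # layers offloaded to the GPU
                         bits_per_weight=4,     # e.g. Q4 quantization
                         overhead_gb=1.5):      # KV cache, buffers, CUDA context
        # Size of all weights at the chosen quantization.
        total_weights_gb = total_params_b * bits_per_weight / 8
        # Only the offloaded fraction of layers lives in VRAM.
        gpu_weights_gb = total_weights_gb * gpu_layers / num_layers
        return gpu_weights_gb + overhead_gb

    print(f"~{estimate_vram_gb():.1f} GB VRAM for 24/32 layers")  # ~19 GB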
dr_kiszonka on Dec 14, 2023
Could you explain to me (in steps) how I would go about calculating how much VRAM I would need to run, say, Mistral 8x7B?