Hacker News
bitL on May 14, 2023 | on: Run Llama 13B with a 6GB graphics card
One can keep all tensors in RAM and push to GPU VRAM only what is needed at the moment, basically limited by PCIe speed. Or use some intelligent strategy with read-ahead from SSD if one's RAM is limited. There are even GPUs with their own SSDs.
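A rough back-of-envelope sketch of why PCIe bandwidth becomes the ceiling in this scheme. All figures here are illustrative assumptions (4-bit quantization, PCIe 4.0 x16), not numbers from the comment:

```python
# Upper bound on token rate when weights live in host RAM and are
# streamed over PCIe each step. All figures are assumptions.

model_params = 13e9      # Llama 13B, from the thread title
bytes_per_param = 0.5    # assumed 4-bit quantization
pcie_bw = 32e9           # assumed PCIe 4.0 x16, ~32 GB/s

weights_bytes = model_params * bytes_per_param  # ~6.5 GB of weights
transfer_s = weights_bytes / pcie_bw            # time to stream them all once
tokens_per_s = 1 / transfer_s                   # ceiling if every token touches all weights

print(f"{weights_bytes / 1e9:.1f} GB, {transfer_s:.3f} s/token, "
      f"~{tokens_per_s:.1f} tok/s ceiling")
```

Under these assumptions the transfer alone caps generation at roughly 5 tokens/s, which is why overlapping transfers with compute (read-ahead) matters.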