
I think there's one more axis: Frequency-of-use.

For occasional use, the major constraint isn't speed so much as which models fit at all. I tend to look at $/GB of VRAM as my main spec. Something like a 3060 12GB is an outlier: it fits sensible models while staying cheap.
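A quick back-of-the-envelope on that $/GB metric, as a Python sketch; the card prices here are my rough assumptions, not current quotes:

    # Rough $/GB-of-VRAM comparison; prices are ballpark assumptions, not quotes.
    cards = {
        "RTX 3060 12GB": (300, 12),    # (assumed price in USD, VRAM in GB)
        "RTX 3090 24GB": (1500, 24),
        "RTX 4090 24GB": (2000, 24),
    }
    for name, (price_usd, vram_gb) in cards.items():
        print(f"{name}: ${price_usd / vram_gb:.0f}/GB of VRAM")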

I don't mind waiting a minute instead of 15 seconds for some complex inference if I do it a few times per day. Or having training be slower if it comes up once every few months.



Hopefully the next generation of cards has high-VRAM variants.


"As for capacity, Samsung’s first GDDR7 chips are 16Gb, matching the existing density of today’s top GDDR6(X) chips. So memory capacities on final products will not be significantly different from today’s products, assuming identical memory bus widths. DRAM density growth as a whole has been slowing over the years due to scaling issues, and GDDR7 will not be immune to that."

Source: https://www.anandtech.com/show/18963/samsung-completes-initi...


I can buy a DDR5 64GB kit from Crucial for $160.

https://www.crucial.com/memory/ddr5/ct2k32g48c40u5

If a $1000 GPU came with that, it would blow everything else out of the water for model size. Speed? No. Model size? Yes.

If it came with 320GB, I could run ChatGPT-grade LLMs. That's $800 worth of DDR5.

Instead, I get 24GB on the 3090 or 4090 for $2k.
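The arithmetic, as a sketch (kit price from the Crucial link above; the GPU figure is the $2k / 24GB mentioned here):

    # Back-of-the-envelope DDR5 cost, assuming the $160 / 64GB kit price above.
    kit_price_usd, kit_capacity_gb = 160, 64
    per_gb = kit_price_usd / kit_capacity_gb          # $2.50/GB
    for target_gb in (64, 320):
        print(f"{target_gb} GB of DDR5 ~= ${target_gb * per_gb:.0f}")
    # vs. 24GB of VRAM on a $2k 3090/4090: ~$83/GB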

A $3k LLM-capable card would not be a hard expense to justify.



