Hacker News

What's the underlying hardware for this?


It's a system built from hundreds of GroqChips, a custom ASIC we designed. We call it the LPU (Language Processing Unit). Unlike graphics processors, which are still best in class for training, LPUs are best in class for low-latency, high-throughput inference. Our LLMs run on several racks of chips with fast interconnect between them.
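"Low latency" and "high throughput" for LLM inference are usually quantified as time-to-first-token and tokens per second. A minimal sketch of how those two metrics fall out of token arrival timestamps (the timestamps below are hypothetical, not Groq measurements):

```python
# Compute time-to-first-token (latency) and tokens/sec (throughput)
# from a request timestamp and per-token arrival timestamps.
# All values here are illustrative, not measured on Groq hardware.

def inference_metrics(request_time, token_times):
    """Return (time_to_first_token, tokens_per_second)."""
    ttft = token_times[0] - request_time          # latency to first token
    duration = token_times[-1] - request_time     # total generation time
    return ttft, len(token_times) / duration

# Example: request sent at t=0.0s, five tokens streamed back.
ttft, tps = inference_metrics(0.0, [0.05, 0.07, 0.09, 0.11, 0.13])
print(f"TTFT={ttft:.3f}s, throughput={tps:.1f} tok/s")
```

Note that the two metrics can move independently: batching improves throughput but typically worsens time-to-first-token, which is why inference-focused hardware optimizes them separately from training hardware.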


They have a paper [1] about their "tensor streaming multiprocessor."

[1] https://wow.groq.com/wp-content/uploads/2024/02/GroqISCAPape...



