Hacker News

TOPS != TFLOPS

The RTX 4090 is 1,321 Tensor TOPS according to the spec sheet, so roughly 35x.

The RTX 4090 is 191 Tensor TFLOPS vs. the M2's 5.6 TFLOPS (specs for the M3 are hard to find).

The RTX 4090 is also 1.5 years old.
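A quick sanity check of the "roughly 35x" claim, using only the two TFLOPS figures quoted in this comment (the 191 and 5.6 values; the exact precision and sparsity assumptions behind NVIDIA's spec-sheet numbers are not spelled out here):

```python
# Back-of-envelope ratio from the figures cited above.
rtx_4090_tflops = 191.0  # RTX 4090 Tensor TFLOPS (as quoted above)
m2_tflops = 5.6          # M2 Neural Engine TFLOPS (as quoted above)

ratio = rtx_4090_tflops / m2_tflops
print(f"~{ratio:.1f}x")  # ~34.1x, i.e. "roughly 35x"
```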

Yeah, where are the bfloat16 numbers for the Neural Engine? For AMD you can at least divide by four to get the real number. 16 TOPS -> 4 TFLOPS within a mobile power envelope is pretty good for assisting CPU-only inference on device. Not so good if you want to run an inference server, but that wasn't the goal in the first place.

What irritates me the most, though, is people comparing a mobile accelerator with an extreme high-end desktop GPU. Some models only run on a dual-GPU stack of those. Smaller GPUs are not worth the money; NPUs are primarily eating the lunch of low-end GPUs.
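The divide-by-four rule mentioned above can be sketched as a one-liner. This is a rough heuristic, not a vendor formula: it assumes the marketed TOPS figure is INT8 throughput, which typically runs at several times the bfloat16 rate, so dividing by four gives a ballpark bf16 TFLOPS estimate (the function name is made up for illustration):

```python
def tops_to_bf16_tflops(int8_tops: float) -> float:
    """Rough estimate: marketed INT8 TOPS -> bfloat16 TFLOPS.

    Assumption (from the comment above): the real bf16 number is
    about a quarter of the headline INT8 TOPS figure.
    """
    return int8_tops / 4.0

# The 16 TOPS mobile-NPU example from the comment above:
print(tops_to_bf16_tflops(16))  # 4.0
```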
