
Are numbers available for FP16 on P100 or FP32 on V100? It would make for a more direct comparison.

EDIT: Nvidia's advertised TFLOPS are:

         FP16  FP32  FP64
    V100 30    15    8.5
    P100 21.2  10.6  5.3
    K40  4.29  4.29  1.43
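
For what it's worth, the FP32 figures are roughly 2 FLOPs x CUDA cores x boost clock (assuming I have the specs right: 5120 cores at ~1.53 GHz for the V100, 3584 at ~1.48 GHz for the P100):

        V100: 5120 * 2 * 1.53e9 ≈ 15.7 TFLOPS
        P100: 3584 * 2 * 1.48e9 ≈ 10.6 TFLOPS

FP64 runs at half the FP32 rate and FP16 at twice it on these parts, which is where the other columns come from; the K40 has no fast FP16 path, hence the identical FP16/FP32 figures.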

Your table doesn't include the new "tensor core" TFLOPS of V100.

Each tensor core does a 4x4 FP16 matrix multiply and accumulates the result into a 4x4 FP32 matrix (D = A*B + C) in one go.

That's where the V100 gets its headline boost, up to 120 TFLOPS of mixed-precision throughput.
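
If you want to poke at them from CUDA, here's a rough sketch using the WMMA API from CUDA 9 (compile with -arch=sm_70; note the API exposes the operation at 16x16x16 warp granularity rather than the raw 4x4x4 hardware op, and this is just an illustrative single-tile kernel, not a tuned GEMM):

    #include <cuda_fp16.h>
    #include <mma.h>
    using namespace nvcuda;

    // One warp computes D = A*B + C for a single 16x16 tile:
    // A and B are FP16, the accumulator is FP32.
    __global__ void wmma_tile(const half *a, const half *b, float *d) {
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

        wmma::fill_fragment(acc, 0.0f);               // C = 0
        wmma::load_matrix_sync(a_frag, a, 16);        // leading dimension 16
        wmma::load_matrix_sync(b_frag, b, 16);
        wmma::mma_sync(acc, a_frag, b_frag, acc);     // tensor core MMA
        wmma::store_matrix_sync(d, acc, 16, wmma::mem_row_major);
    }

Launch it with a single warp (e.g. wmma_tile<<<1, 32>>>(a, b, d)) and the MMA shows up as HMMA instructions in the SASS.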

Tensor Cores: 120 TFLOP/s mixed-precision (peak). Typo in your table: V100 FP64 is 7.5 TFLOP/s.
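
Back-of-the-envelope for that peak, assuming 640 tensor cores (80 SMs x 8) at the ~1.53 GHz boost clock: each tensor core retires a 4x4x4 FMA per clock, i.e. 64 FMAs = 128 FLOPs, so

        640 * 128 * 1.53e9 ≈ 125 TFLOP/s

The advertised figure (112-125 depending on SKU, 120 in the launch material) is just that, scaled by whichever clock they assume.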
