mappu on May 12, 2017 | on: Caffe2 adds 16 bit floating point training support...

Are numbers available for FP16 on P100 or FP32 on V100? It would make for a more direct comparison.

EDIT: Nvidia's advertised TFLOPS are:

         FP16  FP32  FP64
    V100 30    15    8.5
    P100 21.2  10.6  5.3
    K40  4.29  4.29  1.43
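
For a more direct comparison, the advertised figures can be normalized against each card's FP32 number. The following is a minimal sketch, not anything from Nvidia: the values are copied from the table as posted, and `advertised_tflops` and `ratios_to_fp32` are hypothetical names chosen for illustration.

```python
# Advertised peak TFLOPS, copied from the table above exactly as posted.
advertised_tflops = {
    "V100": {"FP16": 30.0, "FP32": 15.0, "FP64": 8.5},
    "P100": {"FP16": 21.2, "FP32": 10.6, "FP64": 5.3},
    "K40": {"FP16": 4.29, "FP32": 4.29, "FP64": 1.43},
}


def ratios_to_fp32(row):
    """Return each precision's throughput relative to FP32 (hypothetical helper)."""
    return {precision: value / row["FP32"] for precision, value in row.items()}


for gpu, row in advertised_tflops.items():
    rel = ratios_to_fp32(row)
    print(f"{gpu}: FP16 = {rel['FP16']:.2f}x FP32, FP64 = {rel['FP64']:.2f}x FP32")
```

On these numbers, P100 and V100 advertise FP16 at twice their FP32 rate, while K40 shows no FP16 advantage at all.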

Tom1971 on May 12, 2017

Your table doesn't include the new "tensor core" TFLOPS of V100.

That's a core that does a 4x4 FP16 matrix multiplication plus a 4x4 FP32 accumulation (D = A*B + C) in one go.

That's where V100 gets its boost, up to 120 TFLOPS.
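
To make that concrete, here is a rough pure-Python emulation of one such operation, followed by the arithmetic behind the peak figure. It is a sketch under stated assumptions, not Nvidia's implementation: the rounding order inside real hardware may differ, the function names are hypothetical, and the tensor core count (640) and boost clock (1455 MHz) are announcement-era figures assumed here rather than taken from this thread.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def tensor_core_mma(a, b, c):
    """Emulate D = A*B + C on 4x4 matrices: FP16 inputs, FP32 accumulation.

    Hypothetical helper; real hardware may round intermediate sums differently.
    """
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                # The product of two FP16 values fits exactly in FP32.
                acc = to_fp32(acc + to_fp16(a[i][k]) * to_fp16(b[k][j]))
            d[i][j] = acc
    return d


# One 4x4x4 operation is 64 fused multiply-adds, i.e. 128 floating point ops.
FMAS_PER_OP = 4 * 4 * 4
TENSOR_CORES = 640        # assumption: not stated in this thread
BOOST_CLOCK_HZ = 1.455e9  # assumption: not stated in this thread

peak_tflops = TENSOR_CORES * FMAS_PER_OP * 2 * BOOST_CLOCK_HZ / 1e12
print(f"Estimated tensor core peak: {peak_tflops:.1f} TFLOPS")
```

Under those assumptions the estimate comes out just under the advertised 120 TFLOPS, which is a peak figure that only applies to workloads expressible as these small mixed-precision matrix multiplies.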

bsprings on May 12, 2017

Tensor Cores: 120 TFLOP/s mixed-precision (peak). Typo in your table: V100 FP64 is 7.5 TFLOP/s.
