Hacker News

Thanks for your answers.

True, to get crazy perf with FINN you need to quantize like crazy (at least that's the default strategy, though it might change if/when it can synthesize to DSP slices or the shiny Versal weird cores). Now I'll have to take a look at Tensil. How would it scale on large FPGAs, though? Would you leave the floor planning to a seasoned VHDL person, or does Tensil handle it (generating parallel pipelines, maxing out performance using all resources on chip)? Say, for someone doing 1D CNNs or 1D VAEs at (tens of) millions of inferences/second on a continuous stream (low batch size)? :-)
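To make "quantize like crazy" concrete: FINN-style flows trade numeric precision for throughput by pushing weights down to a handful of integer levels (e.g. 2-bit ternary weights) so that multiplies collapse into cheap LUT logic. Here is a minimal, hypothetical sketch of symmetric uniform quantization in plain NumPy; it is an illustration of the idea, not FINN's actual quantizer (FINN's training front end, Brevitas, does this with learned scales inside the network):

```python
import numpy as np

def quantize_symmetric(w, bits):
    """Quantize weights to symmetric signed integers with `bits` of precision.

    At bits=2 this yields ternary weights {-1, 0, 1}, the kind of extreme
    precision reduction that lets an FPGA dataflow accelerator replace
    multipliers with trivial logic.
    Returns (integer levels, scale) so that w ≈ q * scale.
    """
    qmax = 2 ** (bits - 1) - 1          # e.g. 1 for 2-bit, 127 for 8-bit
    scale = np.abs(w).max() / qmax      # map the largest weight onto qmax
    q = np.round(w / scale).clip(-qmax, qmax).astype(np.int8)
    return q, scale

# Example: 2-bit (ternary) quantization of a small weight vector
w = np.array([0.6, -1.0, 0.2, 0.0])
q, scale = quantize_symmetric(w, bits=2)
# q is now in {-1, 0, 1}; dequantized weights are q * scale
```

The accuracy hit from such coarse levels is why these flows rely on quantization-aware training rather than post-training rounding.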

I'm not sure what Intel offers nowadays on that front, given the abandonment of OpenVINO for FPGA. No idea how one could use the Stratix 10 NX with its 'AI cores' for actual neural networks. Tensil might be a gateway for all this (I sadly don't have much hope for FINN becoming cross-platform...).



So far we've been focused on edge devices like the Zynq, Artix and Zynq UltraScale+ families. Tensil certainly works on larger devices, but it's not yet as optimized there as we'd like. If that's interesting to you, I'd love to talk and understand your use case in more depth.

The Intel FPGA side is interesting; as you say, there are fewer projects targeting their technologies for ML use cases. We haven't tested support for their boards yet, but there is nothing in our generated RTL that is exclusive to Xilinx. The only thing we'd need to add is new drivers for their platforms.


Would love to take a look at this. We launched our FPGA-based cloud platform last year, and we currently offer all of the Alveo series and some Intel boards as well: vmaccel.com


VMAccel looks very interesting! Send me an email and we can explore how to collaborate.



