But, crucially, it has a larger feature set than the likes of CoreML. You don't get access to the NPU that way though, at least not on iOS. The only acceleration option there seems to be Metal, which isn't bad, but it's also not the most power-efficient thing the hardware supports.
Still, it's the only game in town if you don't want insane, undebuggable headaches everywhere you deploy to a device. Plus it supports embedded Linux boards, and pretty much all the current TPU-like accelerators available on them.