Doesn't CoreML abstract 90% of the actual model format/training source away? Last time I played with it in Xcode, it was painless to pull in a pretrained TF model and use it as-is.
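It really does hide the original framework once the model is converted. A minimal sketch of what the calling side looks like, assuming a hypothetical image classifier that's already been converted (e.g. with coremltools) and added to the Xcode project as MobileNet.mlmodel (the model name is a placeholder; Xcode generates the typed class for whatever model you drop in):

```swift
import CoreML
import Vision
import UIKit

// Sketch: run a converted model through Vision + Core ML.
// "MobileNet" is a hypothetical model name; from the caller's side it looks
// the same whether the weights originally came from TF, Keras, or PyTorch.
func classify(_ image: UIImage) throws {
    guard let cgImage = image.cgImage else { return }

    // Xcode compiles MobileNet.mlmodel and generates this class;
    // the original training framework is invisible at this point.
    let coreMLModel = try MobileNet(configuration: MLModelConfiguration()).model
    let vnModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: vnModel) { request, _ in
        guard let results = request.results as? [VNClassificationObservation] else { return }
        // Top-1 label and confidence.
        if let top = results.first {
            print("\(top.identifier): \(top.confidence)")
        }
    }

    try VNImageRequestHandler(cgImage: cgImage).perform([request])
}
```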
But, crucially, it has a larger feature set than the likes of CoreML. You don't get access to the NPU that way though, at least not on iOS; the only acceleration option there seems to be Metal, which isn't bad, but it's also not the most power-efficient path the hardware supports.
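Concretely, the Metal route is the TensorFlow Lite GPU delegate. A minimal sketch, assuming the TensorFlowLiteSwift pod with its Metal subspec and a bundled model.tflite (file name is a placeholder):

```swift
import TensorFlowLite

// Sketch: TFLite inference on iOS with the Metal (GPU) delegate.
// There's no delegate for the Apple Neural Engine here; Metal is the only
// hardware acceleration path TFLite exposes on iOS.
func runInference(input: Data) throws -> Data {
    guard let modelPath = Bundle.main.path(forResource: "model", ofType: "tflite") else {
        fatalError("model.tflite not bundled")  // placeholder model name
    }

    // MetalDelegate pushes supported ops to the GPU; anything unsupported
    // falls back to the CPU kernels.
    let metalDelegate = MetalDelegate()
    let interpreter = try Interpreter(modelPath: modelPath, delegates: [metalDelegate])

    try interpreter.allocateTensors()
    try interpreter.copy(input, toInputAt: 0)
    try interpreter.invoke()

    // Raw output tensor bytes; shape and dtype depend on the model.
    return try interpreter.output(at: 0).data
}
```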
Still, it's the only game in town if you don't want insane, un-debuggable headaches on every device you deploy to. Plus it supports embedded Linux boards, and pretty much every TPU-like accelerator currently available there.