FWIW, I bought a sleeve for my M.2 SSD that plugs into USB 3, and there's very little slowdown. It gets 500 MiB/s read and 1.2 GiB/s write. I haven’t measured it against native M.2 to compare, but it’s fast enough that I’d be surprised if the Pi spent noticeably more time blocked on I/O than it would with native M.2. And native M.2 comes with increased manufacturing complexity.
Oh, I didn’t test on a Pi, just a Mac. And yeah, write speed coming out at more than double the read speed was super weird. Thanks for answering my internal question about why ~500 MiB/s seemed to be the max.
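
If anyone wants to run the USB-vs-native comparison I skipped, here's a rough sketch of a sequential throughput check in Python. The function name `measure_throughput` and its default sizes are my own made-up choices, not from any tool. Big caveat: the read pass will mostly hit the OS page cache right after the write, and `fsync` doesn't always force the drive itself to flush, so treat the numbers as ballpark only.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mib=64, chunk_mib=4):
    """Return (write_mib_s, read_mib_s) for one sequential pass in `directory`.

    Hypothetical helper for illustration; not a substitute for a real
    benchmark tool, since OS caching can inflate both numbers.
    """
    chunk = os.urandom(chunk_mib * 1024 * 1024)  # incompressible test data
    n_chunks = size_mib // chunk_mib
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Sequential write, then ask the OS to push the data to the device.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        write_s = time.perf_counter() - start

        # Sequential read of the same file (likely served from cache).
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(chunk)):
                pass
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)

    total_mib = n_chunks * chunk_mib
    return total_mib / write_s, total_mib / read_s
```

Point `directory` at a folder on the drive under test (for example a mount point like `/Volumes/MySSD`, which is just a placeholder path) and use a `size_mib` much larger than RAM-cached amounts if you want the read figure to mean anything.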