The non-standard NVMe queue format exists because they support offloading encryption/decryption to the NVMe controller, so they had to increase the queue entry size to fit an encryption key in there.
The "queue" format itself is incredibly similar to the normal NVMe queue.
The normal queue is (more or less) a ringbuffer with N slots in memory plus head/tail pointers. You append the command to the next free slot and advance the tail by writing to a doorbell register. Once the controller is done, it advances the head the same way.
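A rough sketch of that standard flow in C (the names, fixed queue depth, and omitted memory barriers are my own simplifications, not code from any real driver):

    #include <stdint.h>
    #include <string.h>

    #define SQ_DEPTH 64

    struct nvme_cmd { uint8_t bytes[64]; };    /* standard 64-byte submission queue entry */

    struct nvme_sq {
        struct nvme_cmd   *slots;       /* ring of SQ_DEPTH entries in DMA-able memory */
        volatile uint32_t *doorbell;    /* MMIO register: tells the controller the new tail */
        uint32_t           tail;
    };

    static void nvme_submit(struct nvme_sq *sq, const struct nvme_cmd *cmd)
    {
        memcpy(&sq->slots[sq->tail], cmd, sizeof(*cmd));   /* put the command in the next slot */
        sq->tail = (sq->tail + 1) % SQ_DEPTH;              /* advance the tail, wrapping around */
        *sq->doorbell = sq->tail;                          /* ring the doorbell with the new tail */
    }

    /* Completion works the other way around: the controller reports progress and
       the host advances the head via a doorbell on the completion queue. */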
Apple's "queue" instead is just a memory region without those head/tail pointers. Command submission now works by again putting the request into a free slot followed by just writing the ID of that slot to a MMIO register. Once a command is done the CPU again gets an interrupt and can just take the command out of the buffer again.
This probably makes the driver a little bit easier to implement.
On top of that, a similar structure (which identifies the DMA buffers that need to be allowed through) also has to be put into their NVMe-IOMMU with a reference to the command buffer entry. The slightly weird thing about the encryption is that you put the key/IV into this buffer instead of the normal queue. My best guess is that this IOMMU design pushed them to also simplify the command queue to make the matching easier.
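To make that concrete, the per-command entry could look something like this; the field names and sizes are pure guesses to illustrate the idea, not Apple's actual NVMe-IOMMU format:

    #include <stdint.h>

    struct nvme_iommu_entry {
        uint32_t cmd_slot;        /* which command queue slot this entry belongs to */
        uint32_t nr_ranges;
        uint64_t dma_addr[8];     /* DMA ranges the controller is allowed to touch */
        uint64_t dma_len[8];
        uint8_t  key[32];         /* per-command encryption key lives here... */
        uint8_t  iv[16];          /* ...together with the IV, not in the queue entry */
    };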
Hiding the encryption part inside the IOMMU also makes sense for them, because all of the IOMMU management lives inside a highly protected, more privileged area of their kernel, while the NVMe driver itself is just a regular kernel module that possibly doesn't have access to the keys at all.
Ah, I was thinking of what they did for the T2 Macs (changing the queue entry size), not the new changes for the M1. But yeah, once they're doing proprietary variants of the NVMe spec they can do whatever they want, and they probably had a good reason to make this change too.
(For those following: sven has been working on the NVMe stuff for Asahi Linux recently, he knows about this more than me)