We ended up with several solutions, but all of them work the same way conceptually.
First off, separate the I/O layers. System calls into the FS stack should read and write only to the memory cache.
Next, a middle layer schedules, synchronizes and prioritizes process I/O. This layer fills the file system cache with cleartext and schedules writes back to disk using queues or journals.
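To make the layering concrete, here is a rough user-space sketch in Python (not kernel code). The names WriteBackCache and toy_cipher are made up for illustration, and the XOR "cipher" is only a placeholder, not real encryption. Callers only ever touch the cleartext cache; a separate flush step drains a queue of dirty blocks and encrypts them on the way to disk.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096


def toy_cipher(data: bytes, key: int = 0x5A) -> bytes:
    """Placeholder XOR transform (its own inverse). NOT real encryption."""
    return bytes(b ^ key for b in data)


class WriteBackCache:
    """Hypothetical middle layer: cleartext in memory, ciphertext on disk."""

    def __init__(self, disk: dict):
        self.disk = disk            # block number -> ciphertext (stands in for the device)
        self.cache = {}             # block number -> cleartext
        self.dirty = OrderedDict()  # write-back queue, oldest entry first

    def read(self, blkno: int) -> bytes:
        # Fill the cache with cleartext on a miss; unwritten blocks read as zeros.
        if blkno not in self.cache:
            if blkno in self.disk:
                self.cache[blkno] = toy_cipher(self.disk[blkno])
            else:
                self.cache[blkno] = bytes(BLOCK_SIZE)
        return self.cache[blkno]

    def write(self, blkno: int, data: bytes) -> None:
        # The "system call" path only touches memory and queues the block.
        if len(data) != BLOCK_SIZE:
            raise ValueError("expected exactly one block")
        self.cache[blkno] = data
        self.dirty[blkno] = True
        self.dirty.move_to_end(blkno)

    def flush(self, max_blocks=None) -> int:
        # Scheduler path: encrypt and write back queued blocks, oldest first.
        written = 0
        while self.dirty and (max_blocks is None or written < max_blocks):
            blkno, _ = self.dirty.popitem(last=False)
            self.disk[blkno] = toy_cipher(self.cache[blkno])
            written += 1
        return written
```

A real implementation would hang this off the page cache and a journal rather than a dict, but the split is the same: write() returns as soon as memory is updated, and flush() runs whenever the scheduler decides.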
You also need a way to convert existing data without downtime. A simple kernel thread that walks the data by block or by file and does lock, encrypt, mark and writeback on each unit works well.
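The lock, encrypt, mark, writeback loop could look roughly like this. It is a user-space Python sketch with an ordinary thread standing in for the kernel thread; Volume, convert_all and toy_cipher are hypothetical names, and the XOR transform is again only a placeholder for a real cipher. The per-block flag is what lets normal reads and writes carry on while the conversion is still running.

```python
import threading


def toy_cipher(data: bytes, key: int = 0x5A) -> bytes:
    """Placeholder XOR transform (its own inverse). NOT real encryption."""
    return bytes(b ^ key for b in data)


class Volume:
    """Hypothetical volume that is converted to ciphertext while in use."""

    def __init__(self, blocks):
        self.blocks = list(blocks)                    # on-disk contents, cleartext at first
        self.encrypted = [False] * len(self.blocks)   # per-block "converted" mark
        self.locks = [threading.Lock() for _ in self.blocks]

    def read(self, i: int) -> bytes:
        with self.locks[i]:
            raw = self.blocks[i]
            return toy_cipher(raw) if self.encrypted[i] else raw

    def write(self, i: int, data: bytes) -> None:
        # Store the block in whatever state the mark says it is in.
        with self.locks[i]:
            self.blocks[i] = toy_cipher(data) if self.encrypted[i] else data

    def convert_block(self, i: int) -> None:
        with self.locks[i]:                                  # lock
            if not self.encrypted[i]:
                self.blocks[i] = toy_cipher(self.blocks[i])  # encrypt + writeback
                self.encrypted[i] = True                     # mark


def convert_all(vol: Volume) -> None:
    """Background pass over every block; safe to run next to readers and writers."""
    for i in range(len(vol.blocks)):
        vol.convert_block(i)
```

Start it with threading.Thread(target=convert_all, args=(vol,)).start() and keep calling vol.read() and vol.write() as usual. A real version would also have to persist the marks so that a crash halfway through can resume where it left off.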
Another useful technique is to increase the block size on disk. User processes usually work in 4K blocks, but writing back such small blocks is expensive. It is better to schedule those writebacks later, in 64K blocks, so that the application is hopefully done with that stretch of data by then.
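As a back-of-the-envelope illustration of the 4K to 64K coalescing, the Python sketch below groups dirty 4K page numbers into the 64K extents that contain them. The function names are made up, and it assumes the rest of each extent is already in cache; otherwise you pay for a read before the write.

```python
PAGE = 4096
EXTENT = 64 * 1024
PAGES_PER_EXTENT = EXTENT // PAGE  # 16 pages of 4K per 64K extent


def group_by_extent(dirty_pages):
    """Map extent number -> sorted list of dirty 4K page numbers inside it."""
    extents = {}
    for page in sorted(set(dirty_pages)):
        extents.setdefault(page // PAGES_PER_EXTENT, []).append(page)
    return extents


def plan_writeback(dirty_pages):
    """Return one (byte_offset, length) write per 64K extent holding dirty pages."""
    return [(ext * EXTENT, EXTENT) for ext in sorted(group_by_extent(dirty_pages))]
```

For dirty pages 0, 1, 2, 15, 16 and 40 this plans three 64K writes instead of six 4K writes, and the longer the writeback is deferred, the more pages tend to land in the same extent.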
Anyway, my 2 pennies.