Hacker News

Sure, although the proper way to test it would be to write a lot of data to the drive, issue an fsync, and cut power in the middle of the operation. Rinse and repeat a (few) hundred times for each drive.

There's a guy on btrfs' LKML (also the author of [0]) who is diligent enough to do these tests on much of the hardware he gets, and his experience does not sound good for consumer drives.

[0]: https://github.com/Zygo/bees/



> although the proper way to test it would be to write a lot of data to the drive, issue an fsync, and cut power in the middle of the operation. Rinse and repeat a (few) hundred times for each drive.

This isn't quite right. You have to ensure that the drive returned completion of a flush command to the OS before the plug was pulled, or else the NVMe spec does allow the drive to return old data after power is restored. Without confirming receipt of a completion queue entry for a flush command (or equivalent), this test as described is mainly checking whether the drive has a volatile write cache—and there are much easier ways to check that.
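The distinction matters for how a test harness records what it expects to survive: a record may be counted as durable only after the fsync (which triggers the flush) has actually returned. A minimal sketch of that write-side rule in Python, assuming a harness that keeps a log of confirmed-durable records (`durable_append` and the log are hypothetical names for illustration):

```python
import os

def durable_append(path, payload, durable_log):
    """Append a record, and mark it durable only AFTER fsync returns.

    If power is cut before fsync returns, the harness must not expect
    this record to survive -- the drive is allowed to lose it.
    """
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()                 # push Python's buffer to the OS
        os.fsync(f.fileno())      # ask the OS/drive to flush caches to media
    # Only now is it valid to count the record as durable.
    durable_log.append(len(payload))
```

After power is restored, the harness would compare the on-disk file against `durable_log`: any record logged before the cut but missing on disk indicates a drive that acknowledged a flush it had not actually completed.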


Here is a post from him: https://lore.kernel.org/linux-btrfs/20190623204523.GC11831@h...

TL;DR: Very few drives fail to implement flush correctly. Note that he mainly tests hard disks, not SSDs/NVMe. Failures tend to occur when two individually rare events coincide, e.g. remapping an unreadable sector during a power cycle.


Does he share the results of his tests anywhere?




