I wrote a little copy program at my last job to copy files in a reasonable time frame on 5PB to 55PB filesystems.

https://github.com/hpc/dcp

We got an IEEE paper out of it:

http://conferences.computer.org/sc/2012/papers/1000a015.pdf

A few people are extending the concept to other tools -- those should be available at http://fileutils.io/ relatively soon.

We also had another tool written on top of https://github.com/hpc/libcircle that would gather metadata on a few hundred million files in a few hours (we had to limit the speed so it wouldn't take down the filesystem). For a slimmed-down version of that tool, take a look at https://github.com/hpc/libdftw
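
Roughly, a tool built on libcircle is just two callbacks over a distributed work queue. Here's a stripped-down sketch of a parallel tree walk, going from the README-style API (CIRCLE_init, CIRCLE_cb_create, CIRCLE_cb_process, handle->enqueue/dequeue) -- it's not the actual dcp/dwalk code, and ROOT_DIR is a placeholder:

    /* Sketch of a distributed tree walk on top of libcircle.
     * Follows the example API from the libcircle README; error
     * handling is minimal and ROOT_DIR is a made-up path. */
    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <libcircle.h>

    #define ROOT_DIR "/scratch/data"   /* hypothetical starting point */

    /* Runs on rank 0: seed the distributed queue with the root directory. */
    static void create_work(CIRCLE_handle *handle)
    {
        char root[CIRCLE_MAX_STRING_LEN];
        snprintf(root, sizeof(root), "%s", ROOT_DIR);
        handle->enqueue(root);
    }

    /* Runs on every rank: pop a path, stat it, enqueue children if it
     * is a directory. libcircle does the work stealing between ranks. */
    static void process_work(CIRCLE_handle *handle)
    {
        char path[CIRCLE_MAX_STRING_LEN];
        handle->dequeue(path);

        struct stat st;
        if (lstat(path, &st) != 0)
            return;

        if (S_ISDIR(st.st_mode)) {
            DIR *d = opendir(path);
            if (d == NULL)
                return;
            struct dirent *de;
            while ((de = readdir(d)) != NULL) {
                if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
                    continue;
                char child[CIRCLE_MAX_STRING_LEN];
                snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
                handle->enqueue(child);
            }
            closedir(d);
        }
        /* else: a real tool would copy the file or record its metadata here */
    }

    int main(int argc, char **argv)
    {
        CIRCLE_init(argc, argv, CIRCLE_DEFAULT_FLAGS);
        CIRCLE_cb_create(&create_work);
        CIRCLE_cb_process(&process_work);
        CIRCLE_begin();     /* runs until the global queue drains */
        CIRCLE_finalize();
        return 0;
    }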



And it's interesting and useful for scientific computing where you already have an MPI environment and distributed/parallel filesystems. However, it's not really applicable to this workload, as the paper itself says.

There is a provision in most file systems to use links (symlinks, hardlinks, etc.). Links can cause cycles in the file tree, which would result in a traversal algorithm going into an infinite loop. To prevent this from happening, we ignore links in the file tree during traversal. We note that the algorithms we propose in the paper will duplicate effort proportional to the number of hardlinks. However, in real world production systems, such as in LANL (and others), for simplicity, the parallel filesystems are generally not POSIX compliant, that is, they do not use hard links, inodes, and symlinks. So, our assumption holds.

The reason this cp took such a large amount of time was the need to preserve hardlinks and the resizing of the hashtable used to track the device and inode of the source and destination files.
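
Concretely, preserving hardlinks means remembering the (st_dev, st_ino) pair of every source file with st_nlink > 1 and, when the same pair shows up again, creating a link at the destination instead of copying the data again. A simplified sketch of that bookkeeping (not GNU cp's actual code -- cp uses a growable hash table, not the fixed-size linear table here):

    /* Illustrative sketch: remember (device, inode) pairs of multiply-
     * linked source files so later occurrences become link() calls. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    #define MAX_TRACKED 1024            /* arbitrary limit for the sketch */

    struct seen { dev_t dev; ino_t ino; char dst[4096]; };
    static struct seen table[MAX_TRACKED];
    static size_t nseen;

    /* Return the destination path already recorded for this (dev, ino),
     * or NULL if we have not seen it yet (in which case record it). */
    static const char *remember(dev_t dev, ino_t ino, const char *dst)
    {
        for (size_t i = 0; i < nseen; i++)
            if (table[i].dev == dev && table[i].ino == ino)
                return table[i].dst;
        if (nseen < MAX_TRACKED) {
            table[nseen].dev = dev;
            table[nseen].ino = ino;
            snprintf(table[nseen].dst, sizeof(table[nseen].dst), "%s", dst);
            nseen++;
        }
        return NULL;
    }

    /* copy_one(): hypothetical per-file step of a cp-like tool. */
    static int copy_one(const char *src, const char *dst)
    {
        struct stat st;
        if (lstat(src, &st) != 0)
            return -1;

        if (S_ISREG(st.st_mode) && st.st_nlink > 1) {
            const char *prev = remember(st.st_dev, st.st_ino, dst);
            if (prev != NULL)
                return link(prev, dst);  /* recreate the hardlink, skip the data */
        }
        /* ... otherwise copy the file contents as usual ... */
        return 0;
    }

    int main(int argc, char **argv)
    {
        if (argc != 3)
            return 1;
        return copy_one(argv[1], argv[2]) == 0 ? 0 : 1;
    }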


Sure, but if you read that article you walk away with the sense that that's a lot of files to copy. And the GP built a tool for jobs 2-3 orders of magnitude larger?! Clearly there are tradeoffs forced on you at that size...


Author of the paper here. The file operations are distributed strictly without links; otherwise we could make no guarantees that work wouldn't be duplicated, or even that the algorithm would terminate. We were lucky in that the parallel file system itself wasn't POSIX, so we didn't have to make our tools POSIX either.


Man! And here I am feeling like a champ for conquering NTFS's long file name limitations :/


Is your conquering public? We have issues occasionally, and a toolkit would be nice :)


A couple of nitpicks (quick illustration of both below):

Check the return value of malloc.

You don't need a \ when breaking function parameters across lines.
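
Something like this, for both points (a generic illustration, not taken from the dcp source):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* A function call or declaration can span lines freely in C;
     * trailing backslashes are only needed inside #define macros. */
    static char *duplicate_message(const char *text,
                                   size_t extra_padding)
    {
        size_t len = strlen(text) + extra_padding + 1;

        char *buf = malloc(len);
        if (buf == NULL) {             /* always check malloc's return value */
            perror("malloc");
            return NULL;
        }

        snprintf(buf, len, "%s", text);
        return buf;
    }

    int main(void)
    {
        char *msg = duplicate_message("hello", 16);
        if (msg == NULL)
            return 1;
        puts(msg);
        free(msg);
        return 0;
    }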


NB, that PDF seems to have a number of formatting glitches, e.g., the first sentence: "The amount of scienti c data". Numerous others as well, under both xpdf and evince.


There is an fi ligature between the i and the c; perhaps those PDF renderers don't support it? Could be some sort of font loading issue.


Perhaps. That's only one of many similar issues.



