
> Wanting the buffers to be flushed so that I had a complete logfile, I gave cp more than a day to finish disassembling its hash table, before giving up and killing the process....Disassembling data structures nicely can take much more time than just tearing them down brutally when the process exits.

Does anyone know what the 'tear down' part is about? If it's about erasing the hashtable from memory, what takes so long? I would expect that to be very fast: you don't have to write zeros to it all, you just tell your GC or memory manager to mark it as free.



Looking at the code, deallocating a hash table requires traversing the entire table, because there is malloc()'d memory associated with each hash entry, so each entry has to be visited and free()'d. From hash_free() in coreutils hash.c:

    for (bucket = table->bucket; bucket < table->bucket_limit; bucket++)
      {
        for (cursor = bucket->next; cursor; cursor = next)
          {
            next = cursor->next;
            free (cursor);
          }
      }
Whereas if you just don't bother to deallocate the table before the process exits, the OS will reclaim the whole memory block without having to walk a giant data structure. That's a fairly common situation in C programs that do explicit memory management of complex data structures in the traditional malloc()/free() style. Giant linked lists and graph structures are another common culprit, where you have to pointer-chase all over the place to free() them if you allocated them in the traditional way (vs. packing them into an array or using a userspace custom allocator for the bookkeeping).
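To make the contrast concrete, a minimal sketch (the flag is hypothetical; this is not the actual cp code):

    /* Sketch only -- 'free_before_exit' is hypothetical.                 */
    if (free_before_exit)
      hash_free (table);    /* walk every bucket, free() every entry      */
    exit (EXIT_SUCCESS);    /* either way, the kernel then reclaims the
                               whole address space in one sweep           */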


... and both malloc() and free() need to maintain (update) their own data structures, which in many implementations (don't know about the current glibc one) are stored interspersed with the data. Walking a hash table and calling free() on each and every item, even if it doesn't seem so, actually dirties a large number of memory pages (which could be truly randomly scattered because it is a hash table), which then need to be re-written into the swap space on swap-out.
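As an illustration of "interspersed with the data" (a toy layout, not glibc's real one): many classic allocators keep a small header directly in front of each block, so free() has to touch the page holding the block just to update that header.

    #include <stddef.h>

    /* Toy layout, not glibc's: a header sits just before each block. */
    struct chunk_header
      {
        size_t size;                  /* size of this block               */
        struct chunk_header *next;    /* free-list link when unallocated  */
      };

    void
    toy_free (void *p)
    {
      /* Reading/updating the header dirties the page containing the
         block, even though the caller never touches its contents again. */
      struct chunk_header *h = (struct chunk_header *) p - 1;
      /* ... push h onto a free list, possibly coalescing neighbors ...   */
      (void) h;
    }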


Isn't it a terribly inefficient design to malloc hash entries individually? It would make more sense to use a pool / arena for that.


You could, but it's fairly uncommon to do that kind of "custom allocator" style in C, especially in traditional Unix utility programming. It seems to be more of a C++ thing culturally in my experience, though you do find it in C in "big" programs, like scientific-computing code.
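For what it's worth, a rough sketch of what such an arena could look like (nothing coreutils actually does; the names are made up): all entries come out of one big block, and teardown is a single free() no matter how many entries were handed out.

    #include <stdlib.h>

    struct arena
      {
        char *block;          /* one big allocation        */
        size_t used, size;    /* bump-pointer bookkeeping  */
      };

    static int
    arena_init (struct arena *a, size_t size)
    {
      a->block = malloc (size);
      a->used = 0;
      a->size = a->block ? size : 0;
      return a->block != NULL;
    }

    static void *
    arena_alloc (struct arena *a, size_t n)
    {
      n = (n + sizeof (void *) - 1) & ~(sizeof (void *) - 1);  /* keep alignment */
      if (a->used + n > a->size)
        return NULL;          /* a real arena would chain in another block here */
      void *p = a->block + a->used;
      a->used += n;
      return p;
    }

    static void
    arena_free (struct arena *a)    /* one free(), regardless of entry count */
    {
      free (a->block);
      a->block = NULL;
      a->used = a->size = 0;
    }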


Why exactly is it necessary to free each hash entry instead of just exiting the process?


Others have already mentioned that it isn't really necessary, but for some historical context this wasn't always true. When I first started programming computers, apps (written in languages that aren't garbage collected) on your typical consumer level computer/OS had to be really careful with freeing up all allocated memory before exiting or that memory would be gone forever (as far as the OS was concerned when it came to reallocating it to other apps) until the user rebooted the computer.

Advances coming from the combination of more advanced operating systems and CPU features like MMUs have made this a non-issue in most cases (can still be an issue on embedded, etc).


This was a particular problem on the Amiga system, among others. There were attempts to retro-fit resource tracking to the OS, but this was made all-but-impossible by the fact that a process can malloc() a chunk of memory, write a message in it, and then hand a pointer to that message to another process, expecting the other process to free() it. This made for wonderfully cheap inter-process communication (for instance, the filesystem would load blocks from disc and pass them as messages to the process wanting them).


If it's the last thing you do before you exit the process, it isn't necessary, because the OS will reclaim your process's memory in one fell swoop. I believe that's what the linked post is advocating 'cp' should do. (At least on modern systems that's true; maybe there are some exotic old systems where not freeing your data structures before exit causes permanent memory leaks?)

It's seen as good C programming practice to free() your malloc()s, though, and it makes extending programs easier if you have that functionality, since what was previously the end of the program can be wrapped in a higher-level loop without leaking memory. But if you really are exiting for sure, you don't have to make the final free-memory call. It can also be faster to not do any intermediate deallocations either: just leave everything for the one big final deallocation, as a kind of poor-man's one-generation generational GC. Nonetheless many C programmers see it as somehow a bit unclean not to deallocate properly. Arguably it does make some kinds of errors more likely if you don't: e.g. if you have cleanup that needs to be done that the OS doesn't do automatically, you now have different kinds of cleanup routines for the end-of-process vs. not-end-of-process case.
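A hedged sketch of that "higher-level loop" point (hypothetical helper names wrapped around the hash_free() quoted upthread; not the real cp internals):

    /* Sketch only: make_inode_table() is a hypothetical helper. */
    static void
    copy_one_tree (const char *src, const char *dst)
    {
      struct hash_table *seen = make_inode_table ();
      /* ... do the copy, recording visited inodes in 'seen' ... */
      hash_free (seen);   /* skip this and looping over many trees leaks */
    }

    int
    main (int argc, char **argv)
    {
      for (int i = 1; i + 1 < argc; i += 2)
        copy_one_tree (argv[i], argv[i + 1]);
      return 0;   /* a one-shot tool could have skipped hash_free entirely */
    }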


I tend to do this in my C programs because in development I usually have malloc() wrapped so that if any block hasn't been free()'d, it's reported at exit() time. This kind of check for lost pointers is usually so cheap that you use it even if you never expect to run on a system without decent memory management.
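A toy version of that kind of wrapper, as a sketch (not the poster's actual code): count outstanding allocations and report whatever is left at exit().

    #include <stdio.h>
    #include <stdlib.h>

    static long outstanding_blocks;   /* allocations not yet freed */

    static void
    report_leaks (void)
    {
      if (outstanding_blocks != 0)
        fprintf (stderr, "leak check: %ld block(s) never freed\n",
                 outstanding_blocks);
    }

    void *
    xmalloc_dbg (size_t n)
    {
      static int registered;
      if (!registered)
        {
          atexit (report_leaks);   /* report once, when the process exits */
          registered = 1;
        }
      void *p = malloc (n);
      if (p)
        outstanding_blocks++;
      return p;
    }

    void
    xfree_dbg (void *p)
    {
      if (p)
        outstanding_blocks--;
      free (p);
    }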

As an aside, GNU libc keeps (or at least used to keep; I haven't checked in years) the pointers used by malloc()/free() next to the blocks themselves, which gives really bad behavior when freeing a large number of blocks that have been pushed out to swap--you wind up bringing pages back in just to free them, because the memory manager's working set is the size of all allocated memory. Years ago I wrote a replacement that avoided this, just to speed up Netscape's horrible performance when it resized the bdb1.85 databases it used to track browser history. The browser would just "go away", thrashing the disk for hours, and killing it just returned you to a state where it would decide to resize again an hour or so after a restart. Using LD_PRELOAD to load a malloc that kept its bookkeeping away from the allocated blocks changed hours to seconds.


Is cleaning up after yourself memory-wise considered part of the POSIX standard?


It's not necessary per se, but many programmers do it out of good habits. It becomes much harder to e.g. find memory leaks in your code if you never free the memory you haven't leaked. And normally the extra overhead it entails is imperceptible.

Obviously many programmers do not think to (or cannot easily) test their code with 40 TBs of data.


For one, not freeing makes automated leak detection tools like Valgrind useless.


As others have commented, it's not necessary.

If you wrote cp in such a way that it's a reusable C library that other programs might use, then I can understand why they do this. It still doesn't make much sense for the command-line version to call this. Of course, it might run on some weird platform where the OS doesn't reclaim the memory of a process on exit(?).


OP says the system was swapping.

Once your hash table and its linked lists have been swapped to disk, each pointer dereference causes a random disk access, which are orders of magnitude slower than RAM.


So it's a hash table that's about twice the size of physical memory. The entries in the hash table are usually pointers to memory that you need to deallocate (char* filenames in this case, maybe some other stuff too), so if you're a good citizen you iterate through them all and deallocate (mark as free) them as you go. It would seem there were some pathological swap issues doing that. It would be interesting to see if a tree behaved better in this type of situation. The allocator has tables of its own in memory too; if you have to swap those in to find the memory that you're then going to deallocate, then swap out all the user data structures to swap the allocator's structures back in to do the deallocation, or something wacky like that, it kind of makes sense.

His patch just skips that step and ends the program.


From reading the thread, what happens is that cp, before it exits, tries to return all memory to the operating system by explicitly freeing it (this isn't necessary on modern operating systems, as far as I'm aware). What I draw from it is that since the hash table is so big, parts of it are out on disk, so cp has to swap almost all of the table back in just to be able to explicitly free everything.


I don't understand it, but the OP says this, implying the author agrees with you for any modern system, but not on 'old systems without working memory management':

> And unless old systems without working memory management must be supported, I don't see any harm in simply removing the call to the forget_all function towards the end of cp.c.


How would such a system even work? Any abnormal process termination or bugs would mean eventually the system becomes unusable.


The system would slowly leak memory until you reboot.


Yep, I remember programming on such a system (Amiga) in the early 90's! Fun times.


This is C: the memory is manually managed with malloc(). You can't simply nil the reference and expect GC to come through and clean it up for you; you have to issue a free() for each malloc() that was called. While it might be possible to write a custom disposal routine that does that efficiently for the structure, the simpler way is to just use the normal functions you'd use for removing single items and call them against everything. Doing that won't be as efficient, because those functions keep the structure in a consistent state and don't perform bulk operations, but they have already been written and tested, and in normal use cases they won't be a bottleneck.


> You can't simply nil the reference and expect GC to come through and clean it for you

This is exactly what happens when your process exits and its private memory space is freed in its entirety. Hence the discussion. There's no need to clean up before exit, except if you want to (i) use a leak checker; or (ii) run on a machine with no MMU.


It's funny that (ii) is even an issue, because Linux does not support platforms without MMUs without some serious modifications.


But the program being discussed is (of course) GNU cp. That program is not in any way designed for, limited to, or intended only for use in Linux.

So what Linux supports and doesn't support in terms of memory management really has no bearing on what GNU cp can or cannot do, in general.



