
That reminds me that I had to reimplement basic hashmaps in C at least three times in my career. What a waste of time.


How much time did you waste?


I don't recall exactly, but at least a day each, I would say. They were all performance critical, so I had to put in some effort. Two of them were for internal nginx patches, but they stored data a bit differently, so the code couldn't quite be reused. In languages with generics, hashmaps are trivial to reuse with zero performance cost.
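
To illustrate the reuse point, here is a minimal Rust sketch (the function name `count_occurrences` is made up for this example). One generic function works with any hashable key type, and the compiler generates a specialized copy for each concrete type it is used with, so there is no per-call indirection of the kind a `void *`-based C map would typically need:

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Counts how often each item occurs.
/// Generic over any hashable key type `K`; the compiler emits a
/// specialized version per concrete `K` (monomorphization).
/// `count_occurrences` is a hypothetical helper for illustration only.
pub fn count_occurrences<K, I>(items: I) -> HashMap<K, usize>
where
    K: Hash + Eq,
    I: IntoIterator<Item = K>,
{
    let mut counts = HashMap::new();
    for item in items {
        // Insert 0 on first sight, then bump the counter.
        *counts.entry(item).or_insert(0) += 1;
    }
    counts
}
```

The same function can be called with `vec!["a", "b", "a"]` or with `vec![7u32, 7, 9]` without touching the map code.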


Is there one with enough parameterisation to choose between different hash collision / resizing / data layout choices? Or one where the various tree/trie/vector alternatives can be swapped in? It seems possible to write one, but std::unordered_map doesn't offer it. Maybe Rust?


Yes, Rust with const generics could probably do that, but I am not aware of an existing implementation.
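
As a rough sketch of what that could look like (not an existing library; `FixedMap`, `Probe`, `Linear` and `Triangular` are all names invented here), the capacity can be a const generic and the collision strategy a type parameter. This toy version has no resizing and no deletion:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::marker::PhantomData;

/// Hypothetical policy trait: which slot to try on the `attempt`-th probe.
pub trait Probe {
    fn slot(hash: u64, attempt: usize, cap: usize) -> usize;
}

/// Linear probing: visits every slot for any capacity.
pub struct Linear;
impl Probe for Linear {
    fn slot(hash: u64, attempt: usize, cap: usize) -> usize {
        (hash as usize).wrapping_add(attempt) % cap
    }
}

/// Triangular-number probing: only guaranteed to visit every slot
/// when the capacity is a power of two.
pub struct Triangular;
impl Probe for Triangular {
    fn slot(hash: u64, attempt: usize, cap: usize) -> usize {
        (hash as usize).wrapping_add(attempt * (attempt + 1) / 2) % cap
    }
}

/// Fixed-capacity open-addressing map; capacity and probing strategy
/// are both chosen at compile time.
pub struct FixedMap<K, V, P, const CAP: usize> {
    slots: [Option<(K, V)>; CAP],
    _probe: PhantomData<P>,
}

impl<K: Hash + Eq, V, P: Probe, const CAP: usize> FixedMap<K, V, P, CAP> {
    pub fn new() -> Self {
        Self { slots: std::array::from_fn(|_| None), _probe: PhantomData }
    }

    fn hash(key: &K) -> u64 {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        h.finish()
    }

    /// Inserts or overwrites. Hands the pair back if no usable slot
    /// was found within CAP probes.
    pub fn insert(&mut self, key: K, value: V) -> Result<(), (K, V)> {
        let hash = Self::hash(&key);
        for attempt in 0..CAP {
            let i = P::slot(hash, attempt, CAP);
            let taken_by_other = matches!(&self.slots[i], Some((k, _)) if *k != key);
            if !taken_by_other {
                self.slots[i] = Some((key, value));
                return Ok(());
            }
        }
        Err((key, value))
    }

    /// Follows the same probe sequence as `insert`; an empty slot
    /// means the key is absent (valid because nothing is ever removed).
    pub fn get(&self, key: &K) -> Option<&V> {
        let hash = Self::hash(key);
        for attempt in 0..CAP {
            match &self.slots[P::slot(hash, attempt, CAP)] {
                Some((k, v)) if k == key => return Some(v),
                Some(_) => continue,
                None => return None,
            }
        }
        None
    }
}
```

Swapping strategies is then just a type change, e.g. `FixedMap<&str, u32, Linear, 4>` versus `FixedMap<u32, u32, Triangular, 8>`. Making resizing or the data layout pluggable in the same way would need more policy traits, which this sketch does not attempt.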


One of the rare HN-appropriate occasions for me to say: user-name checks out.


Was there a good reason not to use Judy arrays?


Why would writing a Judy array three times have been better? Aren't they infamously difficult to implement?


There's a mature open source library for them packaged for Arch and other distros. Using that would be less work than trying to implement it from scratch.


Are you suggesting there aren't mature implementations of hash maps in C you can easily vendor?


What does “vendor” mean in this context? Is it just a synonym for “distribute”?


It means including a library in your codebase via manually copying the files in (but generally in their own subdirectory and without modifying them so that it is easy-ish to update to a new version of the library by simply copying in the files from the new version).

I believe the name comes from a tradition of putting such code in a directory named vendor/name-of-library or vendor/name-of-vendor/name-of-library, which distinguished the src directory (first-party code) from the vendor directory, which contained code from third-party vendors (often such code was proprietary and had to be bought/licensed).

Nowadays the term is used to differentiate between a manual approach and the use of a package manager.


It's basically the professionally sanctioned version of copy and paste. With all the good and bad that that entails.

Based on my long build times and erratic network behavior in CI, I'm beginning to think every system should just vendor.


I was only suggesting what I thought was the easiest option. If others are even easier, then I'm even more curious about what might rule them out. I hope to learn something.



