This is all pretty great stuff. Make sure you read through the docs; I'm unlikely to use Fb's C++ code, but I'm sure as hell going to look at making my C vector code do some of what FBVector does:
I pored over the FBVector code because of your comment. It has some really interesting tricks. For a normal array allocation, it always rounds up to the next-largest size in the jemalloc memory hierarchy. That is, it goes for multiples of 64 bytes, 256 bytes, 4 KB, or 4 MB. This makes the array cache-efficient.
push_back() uses the standard array-doubling technique, but only up to 4 KB; jemalloc can't grow anything smaller than this in place, so a copy is required anyway. Beyond that cut-off, push_back() will instead grow by 1.5 times the capacity to prevent too much "slack" memory from accumulating.
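For illustration, here is a rough sketch of the growth policy as described in this comment. It is not taken from the folly source: `next_capacity_bytes`, `kMinAlloc`, and `kInPlaceThreshold` are hypothetical names, and the exact behaviour at the 4 KB boundary is an assumption.

```cpp
#include <cstddef>

// Hypothetical constants; the 64-byte minimum and the 4 KB cut-off are the
// figures quoted in the comment above, not values read from folly.
constexpr std::size_t kMinAlloc = 64;
constexpr std::size_t kInPlaceThreshold = 4096;

// Hypothetical helper: returns the capacity (in bytes) to request next.
// Assumption: capacities below 4 KB double; from 4 KB upward they grow 1.5x.
constexpr std::size_t next_capacity_bytes(std::size_t current) {
    if (current == 0) {
        return kMinAlloc;  // first allocation
    }
    if (current < kInPlaceThreshold) {
        return current * 2;  // small blocks get copied anyway, so double
    }
    return current + current / 2;  // large blocks: grow by 1.5x to limit slack
}
```

Starting from an empty vector this gives 64, 128, ..., 4096 bytes, then 6144, 9216, and so on.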
I realise this is a dangerous subject, but I am genuinely curious: given the business you run and the work you do for the FreeBSD team, why would you object to using C++, but not C?
I could understand why you would object to both, but why only the higher-level language?
Personally, I can't keep all of how C++ works in my brain at once. Most people end up limiting themselves to a "sane subset" of C++ to compensate for spec sprawl. But everybody picks a different subset, so when working on code you didn't write yourself you can quickly stumble on things you think you know but that don't work quite the way you expect.
I think Tavis put it best: C is easier to audit because it's transparent. With C++ you can have innocuous-looking source code and have the compiler doing all sorts of crazy things behind your back.
I recently created a private memory allocator so the discussion on the page is somewhat interesting...
Only a tiny minority of objects are genuinely non-relocatable:
Hmm, I'm not exactly sure what is meant here. Moving a block of memory from here to there in the most general case will leave your pointers dangling and crash you in short order. Things that pointed to the data you moved just don't any more. If you are disciplined and don't have raw pointers into the block being moved, you're good. But as far as I know, that situation requires a very complete understanding of the data in the block you are moving. You can do that. It's just not easy, and it's not something that makes sense to stuff "anything" into.
If your object is in a vector, it's already possible for the vector to move the object without updating any pointers to it. The only question is whether it is safe for the vector to move the object with memmove instead of invoking a copy/move constructor, essentially skipping the step of telling the object that it is being moved. This will only fail if the object contains pointers to its own sub-objects (or the move constructor updates external pointers). This is fairly rare, especially for objects that you would store directly in a vector.
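To make the distinction concrete, here is a minimal sketch (C++17). `relocate_n` and `SelfRef` are hypothetical names, and `std::is_trivially_copyable` is used only as a conservative stand-in for whatever opt-in trait a library like folly uses to mark types as relocatable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical helper: moves n objects from src into uninitialized storage
// at dst. Byte-copies when that is known to be safe; otherwise "tells" each
// object it is being moved by running its move constructor.
template <typename T>
void relocate_n(T* src, std::size_t n, T* dst) {
    assert(n == 0 || (src != nullptr && dst != nullptr));
    if constexpr (std::is_trivially_copyable_v<T>) {
        std::memcpy(dst, src, n * sizeof(T));  // no constructors involved
    } else {
        for (std::size_t i = 0; i < n; ++i) {
            ::new (static_cast<void*>(dst + i)) T(std::move(src[i]));
            src[i].~T();  // the old location no longer holds an object
        }
    }
}

// A type that must NOT be moved as raw bytes: it points at its own member.
struct SelfRef {
    char buf[16];
    char* cursor;  // always points at this object's own buf
    SelfRef() : buf{}, cursor(buf) {}
    SelfRef(SelfRef&& other) noexcept : cursor(buf) {
        std::memcpy(buf, other.buf, sizeof buf);
    }
    bool points_to_self() const { return cursor == buf; }
};
```

If `SelfRef` were moved with a plain `memcpy`, `cursor` would still point into the old location. Because it has a user-provided move constructor it is not trivially copyable, so the sketch falls back to the constructor path and `cursor` is fixed up.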
std::string is a key exception to this, but fbstring is not.
Edit: I think the assumption is that an object itself is relocatable inside a container like vector if it marks itself as such. The document itself doesn't actually explain the inner workings.
fbstring seems to be exactly what the doctor ordered.
I like the emphasis on cooperating with the memory allocator in fbstring and fbvector. If the entire library does that, that's going to be a big win for long-running programs: memory fragmentation can slowly increase program footprint, requiring the use of fancy arena allocators etc.
Modern heaps typically have some low-fragmentation technique built-in, for example, Windows ships with Low Fragmentation Heap, which is turned on by default since Vista.
A low-fragmentation heap puts objects of similar size together, so once an object is freed, its memory can be reused for another object of similar size without fragmentation. Because of this it has more "slack": unused memory at the end of objects that are smaller than their buckets. On the other hand, an application in steady state is not going to slowly increase its memory use over time.
It also places consecutively allocated objects (of different sizes) far apart, which reduces cache locality and may in turn reduce performance in some scenarios, such as "allocate a lot of stuff at the beginning and then serve it", but this is a pretty esoteric problem.
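A toy model of the bucket and slack trade-off, for illustration only. The power-of-two buckets with a 16-byte minimum are an assumption made for brevity; real allocators such as the Windows Low Fragmentation Heap use finer-grained size classes. `bucket_size` and `slack` are hypothetical names.

```cpp
#include <cstddef>

// Hypothetical bucket scheme: round each request up to the next power of
// two, with a 16-byte minimum. Assumes request <= SIZE_MAX / 2.
constexpr std::size_t bucket_size(std::size_t request) {
    std::size_t bucket = 16;
    while (bucket < request) {
        bucket *= 2;
    }
    return bucket;
}

// Bytes wasted at the end of the bucket for a given request.
constexpr std::size_t slack(std::size_t request) {
    return bucket_size(request) - request;
}
```

Under this model a 100-byte object lands in a 128-byte bucket and wastes 28 bytes, but any later request of 65 to 128 bytes can reuse that slot exactly once it is freed.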
Benefits outweigh the concerns, so most apps benefit from the low-fragmentation heaps.
From the OP: "Our primary aim with this 'foolishness' is to create a solution that allows us to continue open sourcing parts of our stack without resorting to reinventing some of our internal wheels."
And: "we think C++ developers might find parts of this library interesting in their own right."
I think the key thing is that these are some basic pieces that a lot of FB's internal C++ code depends on. These pieces are being made available as a stepping stone to open sourcing more significant work.
I'm Tudor, one of the main folly authors. I wrote format, Arena / ThreadCachedArena, DiscriminatedPtr, GroupVarint, TimeoutQueue, and various other pieces (parts of String.h, ThreadLocal, etc), and I'm pretty well-versed with the entire library so I can reasonably answer questions (or poke the appropriate people to make a HN account and answer). Ask away.
Not necessarily specific to Folly, but I've wondered why vector classes don't have a "short vector optimization" like string classes do. That is, why don't they store 24 bytes or so in place? Is it because the required iterators would take up too much space anyway?
folly::small_vector does just that (and it lets you use one bit for a mutex, too! -- we have a lot of memory-constrained apps so we had to design data structures for them).
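For readers unfamiliar with the idea, here is a bare-bones sketch of inline storage that spills to the heap. It is not folly::small_vector and shares none of its code: `TinyVec` and its members are hypothetical, it requires a default-constructible, copy-assignable element type, and it leaves out copying, moving, erasing, and the mutex bit.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical sketch: up to N elements live inside the object itself;
// the heap is only touched once element N + 1 is pushed.
template <typename T, std::size_t N>
class TinyVec {
    static_assert(N > 0, "need at least one inline slot");

public:
    TinyVec() = default;
    TinyVec(const TinyVec&) = delete;             // copying/moving omitted
    TinyVec& operator=(const TinyVec&) = delete;  // to keep the sketch short

    void push_back(const T& value) {
        if (size_ == capacity_) {
            grow();
        }
        data()[size_++] = value;
    }

    T& operator[](std::size_t i) { return data()[i]; }
    std::size_t size() const { return size_; }
    bool is_inline() const { return heap_ == nullptr; }

private:
    T* data() { return heap_ ? heap_.get() : inline_; }

    // Doubles the capacity and copies the elements into a heap block.
    void grow() {
        std::size_t new_capacity = capacity_ * 2;
        std::unique_ptr<T[]> bigger(new T[new_capacity]);
        std::copy(data(), data() + size_, bigger.get());
        heap_ = std::move(bigger);
        capacity_ = new_capacity;
    }

    T inline_[N] = {};           // in-place storage, no allocation
    std::unique_ptr<T[]> heap_;  // null while the elements fit inline
    std::size_t size_ = 0;
    std::size_t capacity_ = N;
};
```

A `TinyVec<int, 4>` holds its first four elements without allocating; the fifth `push_back` moves everything to the heap.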
I'm surprised they used mixed-case file names when their class names are lowercase. You have to remember both capitalizations. Mixed-case file names can also be problematic when porting to case-insensitive file systems like those of Windows or Mac OS X.
"Folly" started as an internal codename loosely based on "F"acebook "O"pen source "LL"ibrar"Y". When the time to choose an official name arrived, we found "folly" too funny to not use.
This is a conspiracy theory, but what if there are bugs in it that they can't find, and so they're open sourcing it in the hope that someone else will fix them?
https://github.com/facebook/folly/blob/master/folly/docs/FBV...