LibPhenom, a high-performance C eventing framework from Facebook (facebook.github.io)
107 points by jamesgpearce on Sept 16, 2013 | 41 comments



Does anyone know how this compares to the standard libevent, or to the quirky, allegedly faster libev (http://software.schmorp.de/pkg/libev.html)?


We're a bit faster than libevent in terms of dispatch throughput; some benchmarks in this commit message: https://github.com/facebook/libphenom/commit/41b6106f04fe62c...

The `tests/bench/iopipes.t` "test" allows you to play with some concurrency parameters to try this for yourself on your hardware.

We haven't compared against libev.

We've added some more APIs (buffers and sockets) since those benchmarks were done and we don't have numbers to share around those yet.

One key difference from libevent, libev, and libuv is that libphenom is inherently multithreaded in its IO dispatcher and timeout implementation.

If you're dispatching purely CPU-bound jobs, we get very close to linear scaling with the number of cores: https://github.com/facebook/libphenom/commit/c2753c2154a0cff...


What do you mean by "inherently multithreaded"? I have written a network application handling 100-500k concurrent TCP connections using libev. It was multithreaded to distribute the connections evenly across threads (typically 10k to 100k connections per thread). This is a model that libev supports perfectly well. And I observed nice linear scaling of network throughput with the number of cores, since my jobs were also purely CPU-bound. Depending on how CPU-intensive my jobs were, my libev code needed anywhere from 2 to 10 threads to saturate a GbE link.


The principal difference between the libphenom event dispatcher and the other event libraries is that libphenom can wake up and dispatch IO events to any of the IO scheduling threads (no thread affinity).

Contrast with the libevent approach of using an application-specific scheme to assign descriptors to an event base associated with a thread (strong thread affinity).

This makes more of a difference if you have chatty protocols and/or long lived sessions and no way to rebalance your fd -> event_base mapping.
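
To make the contrast concrete, the libevent-style pattern I'm describing looks roughly like this; the thread count and the helper names (assign_fd, handle_read, start_loops) are placeholders of mine, not anything from libevent or libphenom:

    /* Rough sketch of the per-thread event_base pattern (libevent 2.x).
     * Call evthread_use_pthreads() at startup if descriptors are assigned
     * from a thread other than the one running that base's loop. */
    #include <event2/event.h>
    #include <pthread.h>

    #define NTHREADS 4
    static struct event_base *bases[NTHREADS];

    static void handle_read(evutil_socket_t fd, short what, void *arg) {
        /* read from fd; events for this fd always fire on the same thread */
    }

    static void *loop_thread(void *arg) {
        /* in a real server each base gets a wakeup/notification event
         * registered first so the loop doesn't exit while empty */
        event_base_dispatch((struct event_base *)arg);
        return NULL;
    }

    static void start_loops(void) {
        for (int i = 0; i < NTHREADS; i++) {
            pthread_t t;
            bases[i] = event_base_new();
            pthread_create(&t, NULL, loop_thread, bases[i]);
        }
    }

    /* Strong affinity: once added to a base, every event for this fd is
     * handled by that base's thread until the app re-adds it elsewhere. */
    static void assign_fd(evutil_socket_t fd) {
        static unsigned next = 0;
        struct event_base *base = bases[next++ % NTHREADS];
        struct event *ev = event_new(base, fd, EV_READ | EV_PERSIST,
                                     handle_read, NULL);
        event_add(ev, NULL);
    }

With libphenom there is no equivalent of that assign_fd step; any of the IO scheduling threads can pick up the readiness event.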


There's a lot more framework in place for building an application here... hash tables, configuration files, JSON, performance counters.

libevent/libev are easier to retrofit into existing applications. This looks like something you'd start a new application with. libuv is somewhere in the middle.


Exactly; I think it is more relevant to compare it to e.g. GLib.


what about libuv?


... and libev?


Why is it that every framework handling run loops or communication makes its own implementation of printf? While printf is ubiquitous, I'd hardly call its semantics or syntax perfect. Why does everyone have to copy the same mistakes over and over again?


I'm not hot for reimplementing printf, but I did need an interface that made it easy to print diagnostics for various objects; rendering them to the stack and then passing them to the underlying printf implementation makes for a lot of boilerplate code.

In addition to reducing boilerplate and aiding portability, having our own printf implementation aids in consistent behavior across platforms, and allows for a deeper integration with our streams and buffers so that we don't need to make a series of clunky calls to measure how much storage is needed before passing the formatted data into the lower layers.
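
For the curious, the clunky pattern I mean looks something like this when all you have is C99 vsnprintf and the destination is a custom buffer layer (mybuf_append is just a stand-in here, not a real API):

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* stand-in for the real buffer/stream layer */
    static int mybuf_append(void *buf, const char *data, size_t len) {
        (void)buf;
        return fwrite(data, 1, len, stdout) == len ? 0 : -1;
    }

    static int mybuf_printf(void *buf, const char *fmt, ...) {
        va_list ap, ap2;
        va_start(ap, fmt);
        va_copy(ap2, ap);

        /* pass 1: measure how much storage is needed */
        int need = vsnprintf(NULL, 0, fmt, ap);
        va_end(ap);
        if (need < 0) { va_end(ap2); return -1; }

        /* pass 2: format into a temporary, then copy into the real target */
        char *tmp = malloc((size_t)need + 1);
        if (!tmp) { va_end(ap2); return -1; }
        vsnprintf(tmp, (size_t)need + 1, fmt, ap2);
        va_end(ap2);

        int ret = mybuf_append(buf, tmp, (size_t)need);
        free(tmp);
        return ret;
    }

Formatting directly into our streams and buffers removes that extra allocation and copy in pass 2.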


I know the scope of your project isn't to break ground on formatted output. I've seen the same thing you've done in fastcgi and in nginx: a pattern that recurs because printf is a clunky interface. And of course each copy has its own little syntax variation. But people accept the printf approach because they were indoctrinated, starting from "Hello World". I know there's something better out there.


You know there's something better? What is it?


To make it more portable and consistent, something you can rely on. To avoid issues with locales (LC_*) and save some CPU in the meantime. To gain control. It would be wrong not to do that for such a tiny amount of code.


To back this up, some specific printf issues I've seen in the past:

- printf calls malloc

- printf calls FPU emulator

- platforms differ over whether %p's output includes a leading 0x

- platforms differ over how you print int64_t/uint64_t

- platforms differ over how you print size_t

- some platforms have almost-ISO-but-not-quite semantics

Additionally, on top of the difficulty of adding extra types in a cross-platform fashion, the printf system is tied to FILE. FILEs are usually not extensible, and on Windows they don't work with sockets.

This is daft, and surprisingly shortsighted (perhaps it's the word "FILE" that causes people to come over all unimaginative?), because you could provide some system like Mac OS X's funopen, and then use fprintf for everything - maybe even replacing snprintf with it! - but functionality like this isn't as widely available as it should be.

Anyway, if you write your own printf, you can fix all of this.
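
To sketch the funopen idea from above (BSD/Mac OS X; glibc has fopencookie as a rough equivalent) - file_from_socket and the hook names are made up for illustration:

    #include <stdio.h>
    #include <stdint.h>
    #include <unistd.h>

    static int sock_write(void *cookie, const char *buf, int len) {
        int fd = (int)(intptr_t)cookie;
        return (int)write(fd, buf, (size_t)len);
    }

    static int sock_close(void *cookie) {
        return close((int)(intptr_t)cookie);
    }

    /* Wrap a connected socket in a FILE* so plain fprintf can target it.
     * The read and seek hooks are left NULL: this is a write-only stream. */
    FILE *file_from_socket(int fd) {
        return funopen((void *)(intptr_t)fd, NULL, sock_write, NULL, sock_close);
    }

Then fprintf(file_from_socket(fd), ...) gives you formatted output over a socket without touching the printf core - if your platform has funopen at all, which is the catch.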


All of these were factors in our choice to write our own printf here.

Another fun one: FILE on Solaris can only be used with file descriptors whose value fits in 8 bits, due to an astonishing degree of backwards ABI compatibility. Also on Solaris, printf("%s", NULL) crashes, whereas other systems will print "(null)".

In our implementation we couldn't solve the frustrating size_t/uint64_t stuff without disabling the compile-time parameter checking that gcc provides; I value that more than the slight annoyance of PRIu64.
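
For anyone who hasn't hit this: keeping gcc's checking means sticking to the C99 macros and length modifiers, and marking your own wrappers with the format attribute. A minimal sketch (log_line is just an example wrapper, not a libphenom API):

    #include <inttypes.h>
    #include <stdarg.h>
    #include <stdio.h>

    /* gcc checks the arguments against the format string at compile time */
    __attribute__((format(printf, 1, 2)))
    static void log_line(const char *fmt, ...) {
        va_list ap;
        va_start(ap, fmt);
        vprintf(fmt, ap);
        va_end(ap);
    }

    static void example(uint64_t bytes, size_t chunks) {
        /* the portable-but-ugly spellings for 64-bit and size_t values */
        log_line("read %" PRIu64 " bytes in %zu chunks\n", bytes, chunks);
    }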


Most of these are just implementation or platform divergence issues, though. What was the FPU emulator issue? That one actually seems fundamental to formatted output strategies.


Well, regarding the emulation issue, you've got me there, slightly, because that item was going by what I remember of what my teammate told me in the pub about 12 years ago :)

The platform was the Playstation2 and the issue was (as I recall) that the system's FPU supported floats but not doubles, and the supplied libc wasn't fully compatible with the compiler flag that effectively did a typedef float double. (I assume printf was affected because of the traditional varargs promotion rules.)

I suspect it was easier to write a new printf than figure out how to rebuild libc, assuming you were even allowed to link the final game to your own libc in the first place...

(As for the rest being simply differences between one platform and the next, that's quite true. (And you can usually work around them to one degree or another - believe it or not, I've worked on a number of multi-platform projects that didn't rewrite printf, though funnily enough every single one had to wrap it.) But then, for what reason does one do this sort of thing, except to remove these differences? You might as well rail against #define stricmp strcasecmp and the like - writing your own printf is just a difference of degree.)
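
To spell out the promotion rule mentioned above, since it's easy to forget:

    #include <stdio.h>

    int main(void) {
        float f = 1.5f;
        /* Any float passed through "..." is promoted to double before the
         * callee sees it; that's why printf has no float-sized conversion
         * (%f consumes a double) and why a double-less FPU hurts even
         * "float-only" code. */
        printf("%f\n", f);
        return 0;
    }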


Thanks for the response. Hearing a real world example of a type promotion trap was illuminating.

I pick on printf in particular because most seasoned programmers consider copying and pasting code a smell, yet it is accepted for printf and friends.


Python also has this: pprint.pprint() and pprint.pformat().


Maybe because printf is blocking and will block the event loop.


I don't know if that's the reason, but there's no reason formatted output HAS to:

1) block at the caller, ever

2) have its data parameters pushed onto the stack

3) print all or nothing to a single buffer

4) identify the data type or modifier by the first letter of its English name (int, long, etc)

5) have its core modified to add functionality


Unlikely, unless the output stream's buffer is full, in which case you pretty much have no choice but to block. stdout/stderr are two channels one generally should not write to in a nonblocking fashion.


Can someone explain what eventing frameworks are used for? Are they related to event messaging or processing frameworks?


At a high level, they make it easier to write server (or client) software that deals with multiple concurrent connections (such as HTTP server software, or just about any network facing service these days). They do this by abstracting some of the details away so that you can focus on writing your application code; instead you declare callbacks that get invoked when you have data available.

Traditional eventing frameworks focused on non-blocking I/O on a single thread on the basis that you don't need so many resources to scale up to a large number of clients when compared to a simple one-thread-per-client model.
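
A stripped-down illustration of that single-threaded callback model, using raw epoll on Linux rather than any particular library (the watcher struct and names here are just for the sketch):

    #include <sys/epoll.h>
    #include <stdio.h>
    #include <unistd.h>

    typedef void (*io_callback)(int fd);

    struct watcher { int fd; io_callback cb; };

    /* register interest in "fd is readable" */
    static void watch(int epfd, struct watcher *w) {
        struct epoll_event ev = { .events = EPOLLIN, .data.ptr = w };
        epoll_ctl(epfd, EPOLL_CTL_ADD, w->fd, &ev);
    }

    /* the event loop: sleep until something is ready, run its callback */
    static void event_loop(int epfd) {
        struct epoll_event events[64];
        for (;;) {
            int n = epoll_wait(epfd, events, 64, -1);
            for (int i = 0; i < n; i++) {
                struct watcher *w = events[i].data.ptr;
                w->cb(w->fd);   /* application code lives in the callbacks */
            }
        }
    }

    static void on_stdin(int fd) {
        char buf[256];
        ssize_t got = read(fd, buf, sizeof(buf));
        printf("callback: %zd bytes ready on fd %d\n", got, fd);
    }

    int main(void) {
        int epfd = epoll_create1(0);
        struct watcher w = { .fd = 0, .cb = on_stdin };   /* watch stdin */
        watch(epfd, &w);
        event_loop(epfd);
    }

Real frameworks layer buffering, timers and (in libphenom's case) multiple scheduler threads on top of that basic readiness loop.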

There's a lot of good material discussing this at http://www.kegel.com/c10k.html

libphenom is a bit more than just an eventing framework, though; we have a number of APIs that help with putting together the whole application. And we blur the lines a bit: we also have thread pooling and dispatch support for cases where you can't build 100% of your application in a non-blocking fashion.


How does this compare to libdispatch (https://libdispatch.macosforge.org/) in functionality and performance? I'm very interested in this.


We haven't looked at libdispatch specifically; we'd welcome your feedback on how we compare.

We hope we do well here; we've had some nice results: https://github.com/facebook/libphenom/commit/c2753c2154a0cff...


That looks nice.

I used libdispatch to prototype a server, and it would be great to compare it with other libraries before starting the project.

libPhenom seems to have many more features, maybe even more than I need, but the documentation seems good. I will try to make another prototype with your lib.

Thanks.


This is the test code referenced by that commit: https://github.com/facebook/libphenom/blob/master/tests/tpoo...


Why do you use -fno-omit-frame-pointer? On x86_64 the ABI requires DWARF; or is there something I'm missing?


Frame pointers make it easier for tools to get backtraces without requiring complex DWARF unwinders. I value that observability more than being able to use that register for other things.


Functionality: Well, libdispatch doesn't really work outside of OS X, and this is primarily targeted at Linux server apps.

I can't comment on performance.


Also, it doesn't look very active: https://libdispatch.macosforge.org/trac/log/


libdispatch works on FreeBSD (https://wiki.freebsd.org/GCD) and on Linux (http://packages.debian.org/wheezy/libdispatch-dev), but I don't know if blocks work on Linux.



The library may work, but Grand Central Dispatch, which is the whole point, is really an OS X thing. BSD may have adopted it, I'm not sure. I'm pretty confident that Linux has not.


Grand Central Dispatch is just the marketing name for libdispatch. They do use pthread work queues if available, which they are on OS X & BSD. On Linux a thread pool is used.
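
For reference, the API in question is tiny; a minimal example (on Linux this needs clang with -fblocks plus the BlocksRuntime and dispatch libraries, as far as I know):

    #include <dispatch/dispatch.h>
    #include <stdio.h>

    int main(void) {
        dispatch_queue_t q =
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

        /* the block is queued and run by a worker thread from the pool */
        dispatch_async(q, ^{
            printf("ran on a dispatch worker thread\n");
        });

        dispatch_main();   /* hand the main thread to the dispatch runtime */
    }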


Yes, so there's absolutely no reason to use it if you're primarily targeting Linux.


Looks very promising! It would really help the adoption of new projects like this if there were a bit more effort to present the current status as well as a roadmap. libphenom strives to be a core component for server applications, and it has to compete with similar open source projects that have been around for a while, with well known strengths and limitations. A clearer statement of its current status (is it used in production at Facebook? is it in beta? are there known caveats?) as well as the future goals would help build confidence in the project.


Thanks; good feedback. It's not currently in production at Facebook, but like just about everything we do, we're quickly iterating.

Regarding the roadmap, we'll try to keep the issue tracker on GitHub synced up with the broad goals and milestones, and we welcome feedback there too; it's a bit more dynamic and easier to update in real time than the docs and website materials.


Nice to see Concurrency Kit here. I wonder if they should have extended/patched libuv though.


I hadn't heard of CK (http://concurrencykit.org/) before.

Do you know of anything else that's using it?



