That fan-out functionality is particularly interesting for anyone needing to implement the upstream part of a firehose API, which of course Bloomberg do.
Would love to hear any technical history you can share, what did this replace, and what shortcomings led to it being built? (Seems quite similar to Kafka but perhaps more emphasis on low latency rather than raw throughput?)
I have been working on a distributed engine for heavy computations, and my messaging choice has been gRPC + FlatBuffers + proxies. I was wondering if someone could name some reasons or use cases for choosing message brokers over a lower-level approach like gRPC. Is it mostly because of synchronous/asynchronous needs, or is there something else, such as latency or even ease of implementation?
Depends on your use case. Queueing messages enables the competing-consumer flow, which implicitly gives you load-leveling - great if your traffic is spiky and you need a little time to auto-scale, or other scenarios like that. If you're just looking to blast out data to all consumers, or plain load-balancing is fine, approaches like yours are probably better.
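To make the competing-consumer/load-leveling point concrete, here's a toy sketch with Python's standard library standing in for a real broker (names and numbers are made up): the producer bursts far faster than the workers drain, and the bounded queue absorbs the spike.

    import queue
    import threading
    import time

    tasks = queue.Queue(maxsize=100)       # bounded buffer that absorbs the burst

    def worker(name: str) -> None:
        while True:
            job = tasks.get()
            if job is None:                # sentinel: shut this worker down
                tasks.task_done()
                break
            time.sleep(0.05)               # simulate slow downstream work
            print(f"{name} handled {job}")
            tasks.task_done()

    workers = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(4)]
    for w in workers:
        w.start()

    for i in range(50):                    # spiky producer: enqueues near-instantly
        tasks.put(f"job-{i}")

    tasks.join()                           # wait for the backlog to drain
    for _ in workers:
        tasks.put(None)
    for w in workers:
        w.join()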
It is the asynchronous and decoupled nature of messaging that pulls architectures towards message brokers. In a "MOM Architecture", the producing of messages, and consuming of messages, is completely decoupled - both in "space" (HA, location transparency, replication etc) and in time (the async nature).
This leads to a whole heap of benefits. I've outlined a few of them here, in the "Rationale for Mats3", my Message-Oriented Async RPC library: https://mats3.io/background/rationale-for-mats/
If your workers can hop onto the next task without having to reach out for any additional state or report back to anything synchronously a distributed log/spool/queue/broker works well.
If you need light-weight reporting like a status code you can use RPC/reply mechanisms if the broker supports it (e.g. rabbitmq), build your own (custom headers), or use an API/gRPC call.
Everything else starts needing some sort of more sophisticated coordination mechanism and gRPC will look attractive for that although there are usually other options as well.
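For the broker-based RPC/reply mechanism mentioned above, the usual trick is a reply-to queue plus a correlation ID. A rough sketch with the pika RabbitMQ client (assumes a broker on localhost and a responder already consuming from a queue named "rpc_queue"; all names are made up):

    import uuid
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # Broker-named, exclusive queue where the responder sends its status reply.
    callback_queue = channel.queue_declare(queue="", exclusive=True).method.queue

    corr_id = str(uuid.uuid4())
    channel.basic_publish(
        exchange="",
        routing_key="rpc_queue",
        properties=pika.BasicProperties(reply_to=callback_queue,
                                        correlation_id=corr_id),
        body=b"do-the-thing",
    )

    # Wait for the reply carrying our correlation ID (the light-weight status code).
    for method, props, body in channel.consume(callback_queue, auto_ack=True,
                                               inactivity_timeout=1):
        if props is not None and props.correlation_id == corr_id:
            print("status:", body)
            break

    channel.cancel()
    connection.close()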
grpc client a -> grpc server a for client a
grpc client b -> grpc server a for client b
grpc client a -> message broker (implementation a) -->
grpc client b -> message broker (implementation b) -->
message broker -> your app for the broker
This makes sense if it doesn't make sense to manage the queue in your app. Abstracting the implementations is worth it if you have n-to-1 or n-to-n and they all work differently.
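If I read that right, the point is to hide the different broker implementations behind one seam so the app never deals with the queue directly. A hypothetical sketch of that kind of seam (all names invented):

    from abc import ABC, abstractmethod
    from typing import Callable, Dict, List

    class Broker(ABC):
        """The only surface the app sees; 'implementation a'/'b' live behind it."""
        @abstractmethod
        def publish(self, topic: str, payload: bytes) -> None: ...
        @abstractmethod
        def subscribe(self, topic: str, handler: Callable[[bytes], None]) -> None: ...

    class InMemoryBroker(Broker):
        """Stand-in implementation; a gRPC- or MQ-backed one would slot in here."""
        def __init__(self) -> None:
            self._handlers: Dict[str, List[Callable[[bytes], None]]] = {}
        def publish(self, topic: str, payload: bytes) -> None:
            for handler in self._handlers.get(topic, []):
                handler(payload)
        def subscribe(self, topic: str, handler: Callable[[bytes], None]) -> None:
            self._handlers.setdefault(topic, []).append(handler)

    broker: Broker = InMemoryBroker()      # swap in another implementation here
    broker.subscribe("orders", lambda payload: print("app got", payload))
    broker.publish("orders", b"order-42")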
> Supports a unique multi-hop network topology leading to a distribution tree for each queue, thereby leading to network bandwidth savings for certain use cases.
This is very interesting. One of the benefits of Kafka is partition routing so I'm curious to learn how this may compare.
No idea, but I am interested too. One of the reasons this was probably created was because of a network constrained environment where data is expensive in more than one way.
Bloomberg (who built this) operate a system that makes money by transmitting the same data to a huge number of clients. Having one server transmit the same piece of information to a million clients is approximately 500 times slower than having one server transmit it to 1000 proxies who each pass it on to 1000 end users.
The actual architecture of most of these systems in finance relies heavily on UDP multicast, which is a technology that big tech has basically forgotten because it can be tough to administer at large scale.
Multicast puts the entire load of distributing the data in its natural place: the network. The pub/sub queues that the rest of the world uses are more complicated and a lot worse.
This appears to be the beginning of a "cloud-native" bloomberg that can't lean on multicast.
UDP Multicast relies on constantly throwing out the entire state of your system in order to be reliable, whereas with TCP you only need to throw out changes. This is fine in finance because the state is changing constantly anyway so you might as well just continually throw out the multicast traffic.
Message queuing is a more complicated system, but it allows you to push the bulk of your work out to the distributed network instead of concentrating it on the producer.
That is definitely not true about state management - you can build very TCP-like systems on top of UDP (see QUIC), and the final adaptation you need is to figure out how to make that work with multicast (a non-trivial adaptation, but possible). Once you do, you have the foundations of a pub/sub system in the form of a one-to-many (or many-to-many if you are brave enough) network protocol, and now you basically only have to figure out how to do state recovery of new clients and clients who are behind, and that can basically just be a database. The system described in the OP is kind of a combination of a database and load balancer in order to do this over unicast connections.
You have to do this state management with a TCP-based system anyway - the use of TCP doesn't magically come with the full state of past messages.
This is not right, in finance at least. Message updates are typically incremental, just like in TCP. The reliability problem is dealt with through careful network stack tuning, core affinity, and out-of-band recovery.
Check out 0MQ (ZeroMQ), Informatica Ultra Messaging (formerly 29 West), and aeron.io
They create reliable uni/multicast protocols over UDP, including strategies for NAK storms etc.
Yes, UDP on top of IP multicast - UDP is a pretty thin layer on top of IP in general, and the IP protocol is deceptively powerful. You just can't run protocols like TCP over IP multicast.
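For anyone who hasn't touched it: plain UDP over IP multicast really is that thin. A minimal sketch (group, port, and payload are arbitrary, and there is no reliability here at all - which is exactly the point of this thread):

    import socket
    import struct

    GROUP, PORT = "224.1.1.1", 5007        # arbitrary multicast group and port

    # Receiver: join the group, then just read datagrams.
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    rx.bind(("", PORT))
    membership = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)

    # Sender: one sendto() reaches every host that joined the group.
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    tx.sendto(b"price tick", (GROUP, PORT))

    print(rx.recvfrom(1500))               # the locally looped-back datagram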
Specifically, egress from Cloud Providers (transmitting data from inside the cloud to a computer outside the cloud) is the usual culprit for being "very expensive". It's part of a pricing strategy by cloud providers that encourages folks to put all their infrastructure inside of the cloud and to have none of it outside the cloud. As one example of the ink spilled over this topic, see this discussion from 2 years ago about AWS's very high egress fees:
Ingress can also be costly, especially if there is a steady state of high-load traffic sent from on-prem machines to machines inside the cloud. (Not sure about AWS, but we experienced this with Azure, where we had to resort to buying their ExpressRoute, which was very costly and ultimately unsustainable for us.)
Something else I was caught out on recently is cross-availability-zone traffic, especially for multi-AZ Kubernetes where the nodes are very chatty. It's possible to get the services to consider network topology to limit cross-AZ traffic, but it's not the default.
Maybe if you're hosting replicas or proxies outside of a high-cost bandwidth area?
Low-cost replicas at the end. If the bandwidth usage is half the cost in their multi-hop topology, 14XXXX compared to 16XXXX? Depends on the number of proxies or replicas and on pricing.
If it's heavily used inside Bloomberg, that's good enough for me!
BTW, the original Comdb2 paper from 2016 is good [0]. It has features that even today seem awesome. It sounds like a really nice distributed MySQL/PostgreSQL.
What a phenomenal landing page. Thanks to the animations I have a good idea of what this does and how it does it. You barely need to know what message queuing is to understand it.
Does anyone recommend any particular piece of software for creating these types of visualizations with relative ease? I've always been a fan of animations for helping to understand some technical topics.
As erwinkle mentioned, these animations use anime.js
If you want to build something but don't really know how to do it, I would suggest the following:
1. Build the visual content in an SVG with Inkscape. It should be easy for you to build the layout with the assets/items to animate. Don't forget to group items and set names so you can reference them later.
2. Once exported, insert the SVG in an HTML page and add anime.js. This library will let you animate the content you just created (you should be able to reference your named items from Inkscape and animate them). The learning curve might be tough depending on your experience regarding animation (or CSS in general).
Lottie.js is a library for animations in JavaScript, and After Effects can export to it! You're obviously limited in what you can export, but if you do a basic After Effects animation with shapes and text, it works fine.
Pretty sure it was made by Airbnb, and used in tons of apps like Google Home
All the replies mention technologies to help you build one, but this chart looks like a modified, animated version of a Petri Net to me, which is a very useful chart for visually describing the flow of data between systems. I've built similar charts using PowerPoint and exporting the PowerPoint as a GIF... same effectiveness for internal use, but not something you're going to want to brag to your friends about.
Okay, this is going to sound weird, but I've used Keynote in the past. I'm sure PowerPoint can do this as well, but Keynote has Magic Move, and you can export this animation. It's very easy to work with, and do simple animations. I'm sure there are much better tools for this, but as someone who knew Keynote, this was always effective for me.
Keynote is surprisingly good for this, and I’ve used it for animated data flow visualizations in the past.
While I like the idea of something like anime.js and think it raises the ceiling on what’s possible, I’d still probably start with something like keynote just to get the idea out of my head and into a format I can validate before going into code.
I've heard Framer can make some impressive animations, but I don't know much about how well they export to web. What's on display here is really cool, it's all animated SVG.
The animations are very slick, but I can still grumpily pick at them: the "Clustering and Quorum-based Replication" one shows 4 total servers (suboptimal number because quorum requires 3/4 where I'd prefer to require 2/3 or 3/5) and all the followers acking before the primary proceeds (the animation could show one lagging instead).
4 might be useless compared to 3 from a "required quorum" PoV, but if you want to stay up with 1 failure and the load can't be handled on 3 servers but 4 is OK, it is optimal, isn't it?
I.e. selection of ideal server count is a multi-dimensional problem.
Worse than useless from a quorum perspective, because you've increased the number of servers that have to participate in quorum (quickly) without increasing the number that don't. Looking at it the other way, you go from having problems when 2/3 are slow/broken to having problems when 2/4 are slow/broken. (also note slow doesn't even mean the machine is having problems; in geographically distributed setups it may just be due to physical distance from the leader.)
Yes, 4 servers might handle more load than 3 servers [1], but so would 5 servers. I'd guess that if you've got enough message queue traffic that 3 servers can't handle the load, those servers aren't a significant fraction of your total cost. You might as well go to 5. Likewise, if 5 aren't enough, I'd skip over 6 and go to 7. If 7 aren't enough...I'd probably rethink my design entirely...but I'd prefer 9 to 8.
Alternatively, quorum systems can support "zero-weight replicas" which effectively don't vote but still can be useful for eventually consistent reads. Then you have this server that helps handle the load but doesn't increase the number of servers that have to participate in a round.
[1] if the load is overwhelmingly eventually consistent read operations and their multi-hop topology with proxies doesn't sufficiently reduce the load on the replicas. It's not obvious to me if this scenario is plausible or not.
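The quorum arithmetic above in plain numbers (a trivial sketch; majority quorum = n // 2 + 1):

    # Majority quorum size and tolerated failures for a few cluster sizes.
    for n in (3, 4, 5, 7):
        quorum = n // 2 + 1          # strict majority that must ack quickly
        tolerated = n - quorum       # replicas that can be slow or down
        print(f"{n} servers: quorum {quorum}, tolerates {tolerated} failure(s)")
    # 3 servers: quorum 2, tolerates 1 failure(s)
    # 4 servers: quorum 3, tolerates 1 failure(s)   <- extra server, no extra safety
    # 5 servers: quorum 3, tolerates 2 failure(s)
    # 7 servers: quorum 4, tolerates 3 failure(s)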
Same here! I don’t get it. It’s much more lightweight, fully FOSS and their custodian company (Synadia) is very friendly and open to self-hosting. Basically, they run a global super-cluster of Nats which is pretty much vanilla afaik.
My main gripe is lack of “best practices”. You’re pretty much on your own when it comes to important decisions like subject namespacing and some operational stuff can be a bit tricky, like changing options and migrating. They had “nats by example” but that seems to be quite out of date and not super helpful. They have started doing more videos which are great but not that much content overall.
Their auth story is very sophisticated but perhaps a bit over-engineered. Some use cases like having anonymous “self-signed” client keys are tricky to work with.
Other than that, it’s an amazing piece of tech. One that is scalable in a way that just makes sense and gets out of the way when multiple machines are involved.
Nats is simple because it doesn't bother addressing some of the scalability problems that Kafka was designed to address, like clustering with partitions:
You linked to NATS Streaming, which has been deprecated for years. It’s not relevant. It can still be used for legacy code that still needs it, as the URL even shows, but it is not recommended for new applications.
The current standard is NATS JetStream, which absolutely addresses scalability, arguably to an even larger degree than Kafka, since it can be used to create a globally interconnected supercluster of clusters where messages can be configured to flow between regions as needed. I doubt anyone would run a Kafka cluster that spans multiple regions.
> I doubt anyone would run a Kafka cluster that spans multiple regions.
Stretch clusters and clusters with brokers in different AZs are absolutely a thing in Kafka. I make no comment on the ease relative to Nats, which I don't know at all.
I was talking about different regions altogether, not just different AZs. My understanding is that it’s generally a bad idea in Kafka due to the latency, but I could be wrong.
EDIT:
Definitely sounds like there are some tight latency requirements[0]:
> A stretched 3-data center cluster architecture involves three data centers that are connected by a low latency (sub-100ms) and stable (very tight p99s) network, usually a “dark fiber” network that is owned or leased privately by the company.
100ms would theoretically be enough to span the contiguous United States, but the references to "very tight p99s" and "dark fiber" make me wonder if 100ms is actually acceptable, or just a theoretical maximum that can only be allowed under absolutely perfect conditions. Either way, suggesting the user should have access to dark fiber between their regions does not fill me with confidence about the robustness of this solution. I'm sure it would be fine for geographically close datacenters, as AZs are designed to be.
Garden-variety NATS (i.e. the non-"JetStream" case, hereafter) is "at most once" (messages may be lost). BlazingMQ is "at least once" (>1 delivery of a message may occur).
That difference comes down to persistence. NATS doesn't persist messages to stable storage. BlazingMQ must use persistent storage. From that, all sorts of application design, performance, and other implications are derived. And they all matter. A lot.
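One practical consequence of "at least once" for application design, whatever the broker: consumers have to tolerate duplicates, typically by deduplicating on a message ID. A minimal sketch (in-memory set for brevity; a real system would keep this in durable storage):

    processed_ids = set()                  # would live in durable storage in real life

    def do_work(payload: bytes) -> None:
        print("processing", payload)

    def handle(message_id: str, payload: bytes) -> None:
        if message_id in processed_ids:
            return                         # duplicate redelivery: safe to ignore
        do_work(payload)
        processed_ids.add(message_id)      # record only after the work succeeded

    handle("msg-1", b"hello")
    handle("msg-1", b"hello")              # redelivered: silently skipped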
There are a number of ways to wrap native code libraries as a managed library. I did it once (packaged an Obj-C library as a .NET DLL for a Xamarin app) back in 2018 but had lots of help from a teammate who knew that dance far better than me and I forgot all of the details. I googled and found at least four different blogs posts showing how to do it.
I know that's not an answer but if you're feeling adventurous you might be able to help the project.
It's not on the roadmap for the next 6 months. But priorities can change depending upon the demand. One thing to note is that BlazingMQ client libraries are very stateful, and implementing a batteries included library is a non-trivial effort.
The python client library is planned to be published. They say they need to remove some dependencies and update the workflow to something more suitable for OSS.
I'm the developer of the Message-Oriented Async RPC library Mats3: https://mats3.io/
Its current sole implementation is based on the Java Message Service (JMS) API. It is used in production for a rather large UCITS unit-holder registry on Apache ActiveMQ, and all tests run fine on Apache Artemis MQ.
Every time a new message broker comes along, I sit up in my chair and wonder a) about its performance (!), b) whether it has a JMS client implementation, and c) whether Mats3 works with it! When I tested RabbitMQ's JMS client library, I sadly found that there were rather many differences - things I thought were screamingly obvious were not available. E.g. as basic a function as redelivery: "normal" MQs try to redeliver N times, typically with a backoff between each attempt, and then, when all N attempts fail, they put the message on the DLQ. Rabbit instead tight-loops the delivery attempts until either the message goes through or the sun burns out. To be able to use Rabbit, I would have to use the native API and implement redelivery and DLQing myself, client side. Also, transactions.
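For what it's worth, the redeliver-with-backoff-then-DLQ behavior described above boils down to roughly this when you have to build it client side (a generic sketch, not any particular broker's API; names are made up):

    import time

    MAX_ATTEMPTS = 5

    def deliver(message, handler, send_to_dlq, base_backoff=0.5):
        """Try the handler a few times with exponential backoff, then dead-letter."""
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                handler(message)
                return True                                    # consumed successfully
            except Exception as exc:
                print(f"attempt {attempt} failed: {exc}")
                time.sleep(base_backoff * 2 ** (attempt - 1))  # back off before retrying
        send_to_dlq(message)                                   # give up: dead-letter it
        return False

    def flaky_handler(message):
        raise RuntimeError("downstream unavailable")

    deliver({"id": 1}, flaky_handler, send_to_dlq=lambda m: print("DLQ:", m))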
I now wonder whether I should make a lower-level abstraction, so that the JMS implementation is converted to a "base" impl, and the JMS, Rabbit, Bloomberg, NATS, ZeroMQ, Aeron, etc. implementations would be extensions, or "plugins", or "drivers", to that.
Sounds great, and you have lots of nice documentation on the page, but could you provide a TLDR? There's a lot of competition in this area: GRPC, Cap'n'proto (was posted on HN a day or two ago), NATS, etc.
I'm also having trouble figuring out if Mats3 is a library (with a JMS API) over a variety of messaging systems (WebSockets, NATS, etc.)?
I am truly finding it hard to explain it. I have tried along a dozen angles. I actually think it is a completely new concept - and that might be the problem. But there is actual value here, because it "merges" the amazingness of using messaging, with the simplicity of "linear thought" that synchronous, blocking code gives (i.e. ISC using e.g. REST or gRPC). #)
This page tries to directly explain the idea - but I guess it assumes that the reader is already extremely familiar with message queuing? https://mats3.io/docs/message-oriented-rpc/
Here's a way to code up Mats3 endpoints using JBang and a small toolkit which makes it extremely simple to explore the ideas - the article should also be skimmable even without a command line available: https://mats3.io/explore/jbang-mats/
If you read these and then get it, I would be extremely happy if you gave me a sentence or paragraph that would have led you to understanding faster!
The concept of messaging with queues and topics are essential to Mats3 - but the use of JMS is really just a transport. I could really have used anything, incl. any MQ over any protocol, or ZeroMQ, or plain TCP - or just a shared table in a database (but would then have had to implement the queue and topic semantics). As a developer, you code Mats3 Endpoints by using the Mats3 API, and you initiate Mats3 Flows using the API. You need an implementation of the MatsFactory to do this, and the sole existing is the JmsMatsFactory - which works on top of ActiveMQ and Artemis's JMS clients.
Wrt. WebSockets, that is a transport typically between a server and an end-user client, e.g. an iOS app. Actually, there's also a "sister project", MatsSockets, that brings the utter async-ness of Mats3 all the way out to the client, e.g. a webpage or an app. https://matssocket.io/
NATS is, AFAIU, just a message broker, with some ability to orchestrate. Fundamentally, I do not like this concept of orchestration as an external service - this is one of the founding ideas of Mats3: Do the orchestration within each service, as you would do if you employed REST as the ISC mechanism. I do however assume that one could make a Mats3 implementation on top of NATS client libs.
#) Java's Project Loom is extremely interesting to me, as its argument for using threads as the basis of concurrency instead of asynchronous styles of programming is exactly the same rationale for which I made Mats3: It is much simpler for the brain to grok linear, sequential, "straight-down" code than lots of pieces of code that are strung together in a callback-hell. Async/await is a way to make such callback-hell more readable, to "linearize" it (but you still have the "colored methods" problem, which Loom just obliterates in a perfect explosion). One could argue that this is what I have achieved with Mats3 when using messaging.
Does anyone here do large scale interface stuff with many millions of messages? I do this in healthcare and would like to try applying my skills somewhere else...
Very interesting. Thank you for the benchmarks, and thank you for sharing. In a zero-replication, no-leader-election (or all-in-one-node) scenario, what is the minimum latency observed? It appears the predominant delay is in ACKing to ensure durability, right?
C++ and Java client examples are very simple which is cool.
Would appreciate it if there were a hook to enable `SO_TIMESTAMPNS` on the socket and retrieve the `cmsg`, to get accurate timestamping and permit client-side benchmarking.
I find it curious that lots of people always drag in Kafka as a comparison to message brokers. The architectures themselves (log of messages vs. delivery of messages), but in particular the system architectures they lead to when used (event sourcing vs. "react to events and commands"), are extremely dissimilar.
Kafka has its uses, in particular for massive influx of events, e.g. in a large IoT system - I'd say the perfect example would be continuous measurements of tens of thousands of gauges and sensors on an oil rig or any other large production system, or e.g. the energy meters in every home.
But I would personally never use such a system as the inter-service communication layer for a multi-service architecture. It is WAY too heavy coupling. Event sourcing looks fantastic on the surface, but is a disaster for a decades-living system with tons of developers. REST/gRPC is better, but async messaging really rocks.
If you use Kafka in an event sourced fashion, whereby each service emits events that it puts on a bunch of Kafka topics, and each service that needs that information consumes those events (by reading and applying those events to its local understanding of the world), I argue that you have effectively made a massive distributed and shared database - shared between all the services.
This flies smack in the face of the idea of "one service, one database" 1), where there is a distinct boundary and each service owns its own data. How a given service stores its data is of no concern to any other service - the communication between them is done using e.g. REST or messages, with a clear contract.
Event sourcing / Kafka architectures are the exact opposite of a clear contract. You are exposing the absolute, deep-down innards of the storage solution of a service, by putting its microscopic events down for all to see and all to consume. You may do aggregations, emitting more coarse-grained "events" or state changes, thus kinda also exposing "views" of these inner tables, and maybe use a naming scheme to sort out which are "public" and which are not.
In the beginning, I really did find the concept of event sourcing to be extremely intriguing: Both the "you can get back to any point in history by replaying up-to", and "forget the databases, just emit your state changes as events" (I truly "hate" databases, as they are the embodiment of state, which is the single one thing that makes my field hard!), the ability to throw up a new projection for anything you'd need (a new "internal view") of the state, and that unit testing could be really smooth.
Upon delving deeper into the concepts, I found that the totality of such a system must quite quickly become extremely heavy: The amount of events, thus needing snapshots. Evolution of events, where you might have no idea of who consumes them (that is, the "shared database" problem). The necessary understanding, throwing a half-century or more of relational databases under the bus. The performance tweaking if things start to lag. Etc etc etc. It would become a distributed system in the worst way a distributed system could be distributed: All state laid out in minute details, little abstractions, and a massive diverse set of different implementations in the different services to get back to a relational view of the data. And this is even before the system gets a decade old, with lots of different developers coming and going, all of them needing to get up to speed, and the total system needing extremely good and hard steering to not devolve into absolute utter chaos.
That Kafka can be employed and viewed as a database has been argued hard by Confluent. Here's their former DevEx leader Tim Berglund explaining how databases are like onions, and that in the base of every database there is a commit log. And that this log is equivalent to a set of Kafka topics. 2) So why not just implement all the other layers of the database in application logic? Confluent even have a lot of libraries and whatnots to enable such architectures.
it never ceases to amaze me how the software engineering profession has very smart folks reimplement the same conceptual primitives over and over again. I'm curious if hospitals (if that's even analogous to a company) do similar things by having different processes to solve the same problem.
I imagine this is used for Bloomberg (the terminal) and not Bloomberg (the website)?
Going back to the article - fantastic animations. I'm just as curious about how those were made as about the queue itself.
I feel like, on the contrary, there's an enormous cost attached to using prefab components that only kinda do what you need them to do. Shops that have embraced this philosophy have been some of the most inefficient ones I've ever encountered. Screens of YAML were written to implement tasks that could have been performed with a few lines of programming code, and many man-years were spent implementing functions that copied fields from one object to another nearly-but-not-quite-identical object.
It's a big driver of both complex and slow software. Often the adaptation layer necessary to get it to actually satisfy your needs rivals the size and complexity of the piece of existing solution code you're trying to leverage.
Software that is built from the ground up to your exact needs is often faster, slimmer, and more maintainable, and it suffers from far less dependency churn.
There is a balance between Not In House syndrome, where you build everything yourself, and the opposite, where everything is vendored in. I am inclined to think that a healthy NIH syndrome is the better place to be. Companies which vendor everything move along at a snail's pace, and often getting an outside product to work with internal needs is a huge, years-long endeavor, to the point that just writing a decent version yourself that does exactly what you need is really useful.
I think I just got it wrong. Not invented here sounds better and more right. I just typically think of only the acronym, and decided to write it out in case someone hadn’t heard it (and thus made a mistake). Good catch.
In my experience, the problem with vendors is: what if a feature is missing? I see this often when I am forced to use a vendor product. And, on day one, no one (hands-on) usually knows anything about the vendor product, so there is a huge learning curve. If you follow NIH, then you organically build knowledge in the org. And features can be added with time + money. (Sure, some vendor products will add features, but usually for an insane price tag.) You cannot win either way: the hardest part is knowing which one to choose.
Yeah. This works both ways. At small scale or early stage you should try like hell to work backwards from pre-existing components so that everything does exactly what you need them to do by definition. Eventually you outgrow that and it makes sense to build things that do exactly what you want because the incremental value is worth it.
>it never ceases to amaze me how the software engineering profession has very smart folks reimplement the same conceptual primitives over and over again.
Well, no one cares, but I'm not saying that to be mean or dismissive; it's just that software production as an industry (esp. at a company like Bloomberg) can basically do whatever it wants because of low capital costs to produce something. Things like writing another queue system will have zero impact on the industry unless it's adopted widely (and cargo culted) and later made worse.
Maybe I'm about to eat my shoe, but if a capable team at Bloomberg wanted to reinvent something considered unnecessary in terms of industry need, they probably could, if for no other reason than that management allowing some indulgence in exchange for keeping talented people around is Worth It™.
Bloomberg had a history of inventing their own stuff going back to the first terminals. They had their own networking protocols and everything, when I was last there you could still see remnants in how machines were named.
They do adopt a lot of industry stuff, but it is always heavily modified. And of course they love “Invented Here” as a corporate tradition going all the way back to the beginning.
It’s actually nice from a developer perspective, the R&D group is highly valued by management.
I don't understand that phrase. Probably a large team of very well paid engineers developed this product. Bloomberg is well-known in the industry for paying well. "[L]ow capital costs"? This software probably cost millions in salaries to build, and millions more to maintain over the next 10+ years.
>You wrote: low capital costs to produce something
>I don't understand that phrase.
You don't need a big factory with large capital costs to make or distribute software. You're only paying people, and salaries are not capital investments.
Smart people solving "solved problems" over and over again is exactly how the tech industry progresses. Meanwhile other industries are stuck with processes from fifty years ago because "it has always been done this way", and anyone even suggesting the slightest bit of change is a heretic.
Yeah, pretty much any medium to large company is going to have a bunch of processes for doing things that any company that size needs to do and some more that are a bit more industry/domain specific but still basically the same as those at every other company in the space.
In the US, all hospitals are required to use electronic health records (EHRs) and most are implemented in Epic, the leading EHR system. But they are almost always electronic implementations of their old paper record systems, dating back to who-knows-when.
When I left Bloomberg around 7 years ago Kafka was being aggressively rolled out and there was a lot of interest in data streaming tech in general.
I am wondering if they ran into performance or resource usage issues with Kafka at scale, and decided to roll their own for use cases that better fit their workloads.
Bloomberg does have vast backend infrastructure for moving around / computing data which the Terminal is a consumer of.
What you say is true.
But...this is Bloomberg so I think we can give them a pass and take a deeper look.
I can't think of a more relevant company within which to develop messaging middleware.
Exactly. Handling message queuing and distribution is likely a core capability for Bloomberg, so it makes a lot of sense for them to develop something tailored for their needs. LinkedIn did the same with the original Kafka development, where distributed eventual state consistency was a core capability.
If anything, we should be thankful that they've decided to share it with the world.
1. It is easier to implement from scratch than understand and modify: we see this everyday at a way smaller scale with libraries/classes/functions. Not saying this is optimal but this is a driver for sure.
2. Recruiting: some devs love doing this. Fancy OSS project to put on your resume, some actually believe there's unfulfilled need, you get people like us discussing it (otherwise we wouldn't be thinking about Bloomberg).
It's a somewhat difficult argument to say that it's easier to build from scratch, but I hear what you're saying. At any rate, it is absolutely more fun to build from scratch.
I've polled developers about what part of software development they hate, and in general, it's deployment and maintenance (i.e. devops). Using an off-the-shelf component means that the developer only gets to do the un-fun part: installing, configuring, productionizing. If they code their own solution, they get to spend a few weeks coding!
Part of this evolution is from funded development to opensource development.
Then there are language variances, like flavors, so what ends up happening is a million different conceptual primitives that are fundamentally the same.
It isn't the same because of a few things. At the end of the day too it is also about how good the support is over the rest. If a new MQ can demonstrate better tooling nothing stops a company from adopting it.
Mainframe emulation is a thing, i.e. code change (à la swapping MQ libraries) at enterprise scale, and supporting production without a migration path can very much stop adoption.
That really depends on how the MQ integration is designed and whether or not the in-queue data is persistent or goes stale. In most cases an application doesn't like stale data, so a long-lived queue is rare. People who adopt something typically PoC it before adoption.
I've had to do this mostly because of batshit crazy rules many large companies have about how all software has to be supported. The issue there being the definition of supported in the eyes of clueless auditor types.
If software is written 100% externally, but the vendor only wrote 10% of it and is transparently taking 90% from an open source project so they can sell fraudulent support (assuming they don't have people contributing to the open source project here), that counts as supported.
If it is written 100% internally, even if it is built on top of some ancient internal codebase that nobody can figure out and has been in life support mode for 20 years, then it counts as supported.
But if it's written 50% internally and 50% of it came from an external open source library with a BSD license, and you are a major contributor to that project, then it is considered unsupported and you get in trouble.
This is how you end up with security issues: some sysadmin decides they have to roll their own crypto library or authentication system, because it's better to have an unknown implementation show up on a scanner than a known one that has a list of CVEs that can be checked.
Why wouldn’t they? This technology is not a competitive advantage for their company, wouldn’t make sense for them to sell as a standalone product, and it helps their engineering department attract good talent.
For the 200 repos that they have made open source, they had over 6000 people contribute feedback and/or code. So in addition to attracting talent, they also get more testers and feedback.
If it gains wide adoption, then they've got a global community of people providing free software engineering for their messaging platform. It's a win-win.
Because they're not making money off of their queueing system, but what they build with it; to them it's a cog in their machine, and open sourcing it means others may adopt it and help them maintain and improve it.
> This section explains the leader election algorithm at a high level. It is by no means exhaustive and deliberately avoids any formal specification or proof. Readers looking for an exhaustive explanation should refer to the Raft paper, which acts as a strong inspiration for BlazingMQ’s leader election algorithm.
So their own homegrown leader election algorithm?
> BlazingMQ’s leader election and state machine replication differs from that of Raft in one way: in Raft's leader election, only the node having the most up-to-date log can become the leader. If a follower receives an election proposal from a node with stale view of the log, it will not support it. This ensures that the elected leader has up-to-date messages in the replicated stream, and simply needs to sync up any followers which are not up to date. A good thing about this choice is that messages always flow from leader to follower nodes.
> BlazingMQ’s elector implementation relaxes this requirement. Any node in the cluster can become a leader, irrespective of its position in the log. This adds additional complexity in that a new leader needs to synchronize its state with the followers and that a follower node may need to send messages to the new leader if the latter is not up to date. However, this deviation from Raft and the custom synchronization protocol comes in handy because it allows BlazingMQ to avoid flushing (fsync) every message to disk. Readers familiar with Apache Kafka internals will see similarities between the two systems here.
"a new leader needs to synchronize its state with the followers and that a follower node may need to send messages to the new leader if the latter is not up to date". I thought a hallmark of HA systems was fast failover? If I come to your house and knock on the door, but it takes you 10mins to get off the couch to open the door, it's perfectly acceptable for me to claim you were "unavailable". Pedants will argue the opposite.
FWIW they mention this at the bottom of their document
> Just like BlazingMQ’s other subsystems, its leader election implementation (and general replicated state machinery) is tested with unit and integration tests. In addition, we periodically run chaos testing on BlazingMQ using our Jepsen chaos testing suite, which we will be publishing soon as open source. We have also tested our implementation with a TLA+ specification for BlazingMQ’s elector state machine.
"This section explains the leader election algorithm at a high level. It is by no means exhaustive and deliberately avoids any formal specification or proof."
It has a gorgeous landing page but I'm not really sure why this is better than any other MQ. Can someone more knowledgeable provide any insight into its comparative advantages?
Edit: I did see the comparisons page, but.. well, there's more to life than Rabbit and Kafka!
> Can someone more knowledgeable provide any insight into its comparative advantages?
Well, you know, there is the "Comparison With Alternative Systems" page[1]. :-)
Sadly it doesn't include NATS in the comparison. I would have been interested to see that. But the usual suspects (Rabbit & Kafka) are given a mention.
I'd also be interested in seeing how it compares to AMQP implementations (Apache Qpid for example) and any of the proprietary message queue systems (IBM and Microsoft) they refer to but don't name-check.
I mean, one unsubstantiated armchair take: it's newer than Rabbit / Kafka (RabbitMQ was built in 2007, Kafka in 2011); since then, the learnings from distributed systems, message queues, and programming languages have improved by leaps and bounds, and the existing systems will have backwards compatibility / legacy issues.
That said, there are indeed plenty of commercial and noncommercial alternatives, so now that it's been published I'm sure there will be some thorough people making comparisons soon!
My guess is that at the time when Bloomberg's infrastructure was considering supporting a message queue, they surveyed existing solutions (mind you that this was a long time ago) and found them lacking for one reason or another, rightfully or not.
RabbitMQ was probably avoided because nobody wanted to learn Erlang.
Also, R&D had a lot of experience building message oriented middleware "from scratch" in a low overhead high availability way, so first instinct was probably to start hacking in C++.
Nowadays it might be the case that some teams within Bloomberg need the performance or would rather have a bespoke solution instead of spending on migrating to something else off the shelf.
Keep in mind that this is a company that has its own implementation of most of the C++ standard library.
I worked on this team for 3 years alongside the OGs behind BMQ, specifically on a closed-source middleware with very similar architecture. I am very happy that they finally open sourced one of their main projects!
They have very strict standards which are necessary given that C++ really is a minefield. We did a great job avoiding UB by following those coding standards. There is some verbosity that can be avoided but hey these are details.
Personally, I have moved to Rust since then and I am not looking back.
Is that really your reaction to C++? These days when selecting a critical infrastructure component I'm very inclined to prefer memory-safe by default, like rust or (somewhat less so) golang. Sure, C/C++ projects that have been around and under attack for years (redis, postgres, etc) are fine but they've had those years of battle testing. For a new project I really feel a lot safer if they're built in something more failsafe.
From the "leadership election" snippet, this setup looks perilously close to the peer to peer propagation of transactions and blocks in a blockchain system such as Solana. I suppose the latency requirements are much tighter than 400ms though.
I know Solana does mutual TLS over QUIC authentication between peers and does bandwidth rate limiting via stake. Also, the topology is self organizing. Is it worth copying this tech into a message queue system?
Congrats on open sourcing this, but I am curious if anyone here has ever had an actual scale issue that required changing from some "slower" queuing system (like RabbitMQ, Kestrel, Kafka, etc) where you actually needed to change queuing systems?
Maybe I'm old school, but 99% of the use cases I've worked with in my 20-year career could scale using a MySQL database table acting as a queue and some scripts querying the table and doing work. I have worked with maybe a dozen queuing technologies (including the super-expensive cloud Amazon SQS), and unless your app needs sub-second latency while processing messages at greater than 5k per second, I just don't see what benefit these sophisticated systems provide.
Hope this comment doesn't come across as cynical or dismissive. I love seeing new tech like this come out, and I am not saying that there aren't use cases - I am sure there are (maybe HFT?) - but I'm curious if anyone has case studies to share of legit queue systems that couldn't scale.
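For the curious, the MySQL-table-as-a-queue pattern mentioned above usually looks something like this with competing workers (a sketch; assumes MySQL 8+ for SKIP LOCKED, any DB-API connection, and made-up table/column names):

    # Schema, for reference:
    #   CREATE TABLE jobs (
    #     id BIGINT AUTO_INCREMENT PRIMARY KEY,
    #     payload TEXT NOT NULL,
    #     status ENUM('pending', 'done') NOT NULL DEFAULT 'pending'
    #   );

    def claim_and_process_one(conn, handler) -> bool:
        """Claim one pending job, run it, mark it done; False if the queue is empty."""
        cur = conn.cursor()
        cur.execute("START TRANSACTION")
        cur.execute(
            "SELECT id, payload FROM jobs WHERE status = 'pending' "
            "ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED"   # other workers skip this row
        )
        row = cur.fetchone()
        if row is None:
            conn.rollback()
            return False
        job_id, payload = row
        handler(payload)                                   # the actual work
        cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
        conn.commit()
        return True

Run that in a loop from as many worker scripts as you like and you get competing consumers for free from row locking.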
You're right but "MySQL database table acting as a queue" doesn't stand out on resumes/CVs. Plus, new and shiny things attract devs like moths to a flame.
Very interesting, but looking up the supported standards, it just uses TCP. There is no support for AMQP or MQTT. How is interoperability, with just some client libraries for Python, Java, ...?
Interesting that there's a comparison (albeit low on details) to Rabbit, but not MQTT? Anyone had experience with both MQTT and BlazingMQ able to compare/contrast?
I would be interested in the differences between BlazingMQ, AMQP, and MQTT protocols and the invariants afforded to the clients, not the implementations per se.
Is this implemented with any sort of thought to security?
I don't see anything that implies as such, all FAQ and comparisons to other message queues say nothing of security (I skimmed and ctrl+f so I could have missed it). This doesn't bode well for a network-based application written in a memory unsafe language.
edit: I'm surprised so many people are interpreting this as a weird cargo cult thing. Application security includes a lot of mundane and critical things like:
Authentication, authorization, message signing / authentication (e.g. HMAC), encryption, secret/key management, how the system handles updates, etc.
I perused the code and it was written in a C++03 style. Pointer chasing and manual memory management in the few files I poked through (i.e., no unique_ptr, just new/delete).
This may be perfectly safe... however, I'd give it a few years of battle testing before touching it.
Lol. I'm sorry but my work has me directly interfacing with Bloomberg and other financial institutions. If anything, I trust the code LESS because of who wrote it.
Just because software is heavily used and deals with finance, doesn't mean it's secure or well written. Consider the Equifax breach of 2017, everyone's credit info leaked because Equifax was using a crusty old version of struts with known critical CVEs.
Security purely in terms of the language used, or security in terms of cluster access, authentication and so on?
I mean, I don't often see documentation for networked services that goes in-depth on how exactly they write memory safe code in C or C++, as if that would mean the application is therefore secure; it's always much higher level.
> Security purely in terms of the language used, or security in terms of cluster access, authentication and so on?
I'm personally interested in both, but from a docs perspective I'd mostly expect the latter, more "security for users of this system" than anything else. I'm a little embarrassed I didn't remember it until now, but I think the general term is "application security."
I'm concerned that I don't see any hints of either in the project. I'm not at a computer with access to my Github account so I can't easily search the source code for hints of obvious signs of care or neglect with respect to security.
Not sure where they are on the roadmap but authentication in particular is 'mid-term' but lacks detail.
I expect that if you want to deploy this then you're doing it on an internal network; its deployment examples are based on Docker so I expect it's relying on what you can do with k8s.
Thanks! That is the sort of stuff I was asking about.
It looks like it's an afterthought but at least on their mind now, which is very fair with respect to Bloomberg's wants/needs. It'd be nice if they had a bit of a warning about using this until it has some basic auth(n) and TLS, since they're releasing it to the public. I think it is, relatively speaking, rude to release insecure networked software without giving users a notice as to what sorts of insecurity are at least known/expected.
Adding a veneer of security isn't necessarily superior to leaving it out altogether. Systems of this sort are best secured at the network level, i.e. only trusted hosts should be able to connect to it. Redis is a good example of where this has been tried: it does support password based log in, but the password is stored and transmitted in plaintext, and a redis server will happily accept thousands of auth attempts per second making brute forcing a viable attack. Rather than improve the auth system Redis has instead doubled down on encouraging appropriate network level security by defaulting to only being accessible to the local host, so admins have to go through an explicit step (with warnings) before they can just expose it to the internet.
> Is this implemented with any sort of thought to security?
There's a hefty load of cargo cult mentality revolving around the topic of security. Security is not a product, but a process. This sort of talk is getting tiresome.
That is an obviously false dichotomy. The author of the comment never mentioned Rust, and wasn’t asking about Rust. Such a false dichotomy feels like flamebait, which is against HN guidelines.
Your false dichotomy also implies that “unsafe” blocks are as unsafe as C++, which is not true. “Unsafe” in Rust turns off very few checks[0], most of them are still active. No one would write a serious Rust program entirely in unsafe anyways.
Regardless, asking about security considerations is a valid thing to do, even if it were written in Rust. Security is not just about memory safety.
Furthermore, choice of language can have an effect as to the actual security of the networking layer.
Having parsers (and serializers) proved absent of runtime errors (e.g. with something like SPARK) is a form of guarantee I wish I'd see as the default in any 'serious' network-facing library or component. It's not even that hard to achieve, compared to the learning curve of the borrow checker.
Once rust gets plugged into Why3 and gets some industrial-grade proof capacity, the question of 'is it written in rust?' will be automatic (as in 'why would you do it any other way?').
Rust's guarantees are all about what's happening inside your own process with memory ownership. "Security" in the context of a middleware like this is almost certainly more about external validation/auth— what prevents some random node from injecting messages into the queue, or adding itself as a subscriber to the output?
Also, it's a super bad-faith argument to talk about an entire program being in an unsafe block. Rust's unsafe is about specifically and intentionally opting out of certain guardrails in a limited scope of code where it should be fairly easy for a human reviewer to understand and validate the more complex invariants that aren't enforceable by the compiler itself.
This is absolutely not what Rust unsafe block is about. It is about getting around the limitations of the compiler to gain efficiency. And it happens a lot.
> Unsafe Rust exists because, by nature, static analysis is conservative. When the compiler tries to determine whether or not code upholds the guarantees, it’s better for it to reject some valid programs than to accept some invalid programs. Although the code might be okay, if the Rust compiler doesn’t have enough information to be confident, it will reject the code. In these cases, you can use unsafe code to tell the compiler, “Trust me, I know what I’m doing.”
I care more about methodology than specific methods. I would like applications that handle I/O, especially network I/O, to be designed and built with security in mind. I asked my question because that doesn't appear to be the case here, and I think that is disappointing and concerning from a security perspective.
Unlikely, but they seem to be different things altogether. BlazingMQ appears to be a traditional message queue (think ActiveMQ), with message persistence. ZeroMQ is more of a network middleware (think Tibco Rendezvous), and does not include persistence.
BlazingMQ also appears to be more of a "platform" or "service" that an app can use (sort of like Oracle, say) -- ZeroMQ includes libraries that one can use to build an app, service or platform, but none is provided "out of the box".
Which makes it harder to get started with ZeroMQ, since by definition every ZeroMQ app is essentially built "from scratch".
If you're interested in ZeroMQ, you may want to check out OZ (https://github.com/nyfix/OZ), which is a Rendezvous-like platform that uses the OpenMAMA API (https://github.com/finos/OpenMAMA) and ZeroMQ (https://github.com/zeromq/libzmq) transport to provide a full-featured network middleware implementation. OZ has been used in our shop since 2020 handling approx 50MM high-value messages per day on our global FIX network.
It doesn't use UDP. So my tongue-in-cheek reply is, it doesn't qualify to have "blazing" in the name. Hopefully that's not just funny inside my own head. :)
Great. I also hope for .NET. Erlang (Elixir) would also be very useful. Hard to doubt many would appreciate Node.js. I would also cheer for Common Lisp, Guile, and Lazarus, but I understand these are probably too exotic for Bloomberg to invest in. Whatever the case, the common denominator is that clients for all sorts of runtimes are expected from such a product. In fact, I was very surprised to see such a short list of client libraries. Perhaps this is going to get fixed later.
Hi, internally, we have support for JS but through an enterprise layer which cannot be open sourced. The focus so far has been to prioritize client libraries in languages which are dominant internally. I am sure we will see libraries in other languages due to demand as well as external contributions!
I’ve been getting pretty deep in C++ recently and the name Bloomberg shows up all across the C++ standards community. They seem to be a strong C++ shop.
> Carefully architected and written in C++ from the ground up with no dependency on any external framework, BlazingMQ provides…
What a weird selling point.
There are pros and cons to having no dependencies. Not long ago, it was a common decision for C++ projects because dependency management was a mess. But that hasn’t been the case for a long time (arguably we have the opposite problem - there are so many ways of doing it: Conan, vcpkg, bazel, spack, raw cmake, nix, etc)
So what would the pros and cons be today and why is it a selling point?
For instance, a pro is that everything is bespoke to the task at hand, no hammering square pegs into round holes designed for a slightly different use case.
A con is that the entire attack surface is managed by one team. CVEs identified and solved on another project is a pretty good thing when you depend on that project.
The main reason I’m surprised, though, is that there are some no-brainer dependencies these days. Fmt, catch2/gtest, metaprogramming libraries, etc.
Hi, one of the authors here. BlazingMQ depends on two other open source C++ libraries: https://github.com/bloomberg/bde and https://github.com/bloomberg/ntf-core. I believe documentation writer wanted to highlight that BlazingMQ does not depend on frameworks like ZooKeeper, etc.
Ah, yes, that makes sense! I'm thinking about framework from the wrong angle.
I can definitely see why that's a bonus. So many data pipeline tools, but so many written in Java. It's for that reason that I've always reinvented the wheel. Perhaps I'll check this out...
Though my approach is to fork and maintain the fork until my PR is accepted, merged, and in a tagged release.
Another is to apply a diff to the dependency within the build system, which is what I do a lot in Nix (mainly to solve cmake issues on hermetic builds).
Either way, it isn't insurmountable. It doesn't really matter so much if it's used as an executable rather than as a library, though in the case of the latter it's really handy to be able to e.g. pass a custom spdlog sink for logging.
> So what would the pros and cons be today and why is it a selling point?
I'm not saying they pulled it off because I would need to evaluate it, but having your own dependency stack can result in a much more streamlined solution for extra performance since you don't have to compromise to someone else's data model. As this is a high speed message bus, it's entirely appropriate.
These are also basically application-level network routers which really don't need huge levels of dependencies so to be honest I'd be running away from anything that claims it's written using 1000s of the newest, latest and greatest whatevers. Churning out a faster string grinder is what I'm looking for.
It's a joke: some Rust projects used to repetitively claim to be "memory safe & blazing fast", which has since become a tongue-in-cheek phrase. I also anticipated the language to be Rust, but that would have been way too comical.
I’ll admit, I went into their repo expecting to find Rust, or at the very least Go. Pleasantly surprised to see C++ code, though I would need to look further to see if it’s “good” C++ code. The naming conventions of the source tree lead me to suspect it’s not.
i too determine if code is good based off its naming conventions and other things like
my linter is set to enforce MiXeD_SpOnGeBoB-KeBaB_CaSe as the default casing
Because the name is a technical claim. Is it fast today? Maybe but will it still be considered "blazing" in 20 years? This is equivalent to naming your thing VeryFastMQ. I can already imagine people saying "Oh yea BlazingMQ is craaaawling right now, I am thinking of replacing it with SuperiorMQ".
> I don't understand how you can make queueing high-performance. Queueing things is literally the opposite of making things fast.
You likely need to read about message queues, why they are used, and why they have performance constraints. Message queues are a system, and the system to maintain those queues and content has an overhead. The project is building a solution that is higher performing than comparable systems.
Queuing things often makes things faster (!) as you are usually dealing with limited resources (ex: web and job workers) and need to operate on other systems (ex: payments) that have their own distinct capacity.
If you have 1000 web workers, and your upstream payment provider supports 10 payments per minute, you'll quickly have timeouts and other issues with 1000 web workers trying to process payments. Your site won't process anything else while waiting for the payment provider if all web workers are doing this task.
A queue will let you process 1000 web requests, and trickle 10 requests to your payment provider. At some point that queue will fill, but that's a separate issue.
Meanwhile, your 1000 web workers are free to process the next requests (ex: your homepage news feed).
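The same scenario as a toy script, with Python's stdlib queue standing in for the broker (numbers and names are made up): web workers enqueue instantly, one consumer trickles requests to the rate-limited provider.

    import queue
    import threading
    import time

    payments = queue.Queue()
    DELAY = 0.1            # would be ~6 s for a provider that takes 10 requests/minute

    def call_payment_provider(req):
        print("charging", req)

    def payment_worker():
        while True:
            req = payments.get()
            if req is None:
                break
            call_payment_provider(req)    # the slow, rate-limited bit
            time.sleep(DELAY)             # respect the provider's rate limit

    worker = threading.Thread(target=payment_worker)
    worker.start()

    # The "web workers": enqueue and immediately return a "payment pending" page.
    for i in range(30):
        payments.put(f"order-{i}")

    payments.put(None)                    # shut down once the backlog is drained
    worker.join()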
A queue is a thing to be managed carefully when designing systems:
- you need to ensure that your producers don't produce at a faster rate than your consumers can keep up with
- if the consumer is unable to handle the request now, queueing it and handling it later is often not the right thing to do. Most likely by the time the resource is available the nature of the request will change.
TCP itself is already a queue. All these message queue systems also make the silly decision of layering their own queueing on top of TCP, leading to all sorts of problems.
Basically those things only sort of work if you have few messages and are happy with millisecond latency.
It really depends what you design your system to handle. A sudden burst of traffic that can be spread for minutes is fine (ex: Elasticsearch indexing requests can usually be delayed through background jobs).
> Basically those things only sort of work if you have few messages and are happy with millisecond latency.
Not really... queues are great for deferring processing or for running jobs longer than your HTTP and TCP timeouts will allow. Building a large data export won't happen within a single HTTP request.
It's a bit difficult to cover as it is highly dependent on the queue system you use.
You'd usually want your queue system to fail the enqueue if it is full, and you'd want monitoring to ensure the queue isn't growing at an unsustainable rate.
It also forces you to think a bit about your message payload (rich data or foreign keys the worker loads).
RabbitMQ, Redis-based queues (ex: ActiveJob or sidekiq), Gearman, and others will all offer different mechanisms to tackle full queues.
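"Fail the enqueue if it is full" in its simplest form is just a bounded queue plus a non-blocking put, so producers get immediate backpressure (stdlib sketch; a real broker would reject the publish or return an error instead):

    import queue

    jobs = queue.Queue(maxsize=1000)      # bound the backlog explicitly

    def try_enqueue(job) -> bool:
        try:
            jobs.put_nowait(job)
            return True
        except queue.Full:
            return False                  # shed load / surface an error to the caller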
Queueing things can result in better throughput because you don't have to wait for the other side to process your last message before you can begin work on the next one.
That's just buffering. A lot of people use Kafka this way, as an impedance adapter for a producer with an internal architecture that can't tolerate blocking on production. Of course this requires you to unrealistically assume that Kafka is a zero-impedance message sink.
But what I think the other post is alluding to is the fact that in-order delivery is often not a requirement, and can be an anti-feature. I know that in every use of Kafka I have personally encountered the relative order of messages was quite irrelevant but the in-order implementation of Kafka converts the inability to process a single record into an emergency.
You’re misunderstanding the discussion. If you think of queuing as “waiting to process” then yes it’s slower than not waiting. But that’s not at all the necessary implication of these components. In fact “synchronous” requests are buffered (queued?) internally and can be delayed before processing. So the distinction is irrelevant to performance. And the implementation is what matters.
UDP doesn't provide reliable delivery, so you would need to implement some kind of delivery assurance in your application.
UDP also has very small message (datagram) size limits, so you would also need to implement some way to fragment and re-combine messages in your application.
At this point you've built an ad-hoc re-implementation of 80% of TCP.
I'm not an expert here, but each message is pretty important, right? What happens if you put a UDP datagram on the wire and it never shows up? Wouldn't you have to come up with some kind of signal to indicate the message made it to the receiver? After that you'll probably start signaling delivery of groups of messages instead of individual ones, then ordering them on the receiver by when they were sent, then maybe some integrity validation and a sender backoff mechanism for network congestion... and then you've made TCP.
TCP means that all bytes must be copied to a local buffer in order to be re-sent in the case the other end didn't receive them (even though by that time they're probably old and irrelevant). Once the buffer is full you cannot send more data and must wait for the other end to ack them (potentially indefinitely, though most implementations will eventually time out after long enough). This means that if you wanted to send 10Gbps unthrottled to another continent, you'd need a buffer of 500MB per TCP session, assuming no data loss, and consumer keeping up at line rate.
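The 500MB figure is just the bandwidth-delay product; assuming a ~400 ms round trip (my number, chosen to match the figure above):

    link_bps = 10e9                        # 10 Gbps, unthrottled
    rtt_s = 0.4                            # assumed ~400 ms intercontinental round trip
    buffer_bytes = link_bps * rtt_s / 8    # bytes in flight that must stay buffered
    print(buffer_bytes / 1e6, "MB")        # -> 500.0 MB per TCP session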
TCP is just not a good fit for anything that needs high-performance or bounded sending times. What it's good for is giving you a simple high-level abstraction for streaming bytes when you'd rather not design your networked application around networking.
Building the right logic for your application is not difficult (most often, just let the requester resend, which you need to do anyway in the age of fault-tolerant REST services, or just wait for more recent data to be pushed) and easier than tweaking TCP to have the right performance characteristics.