> Supports a unique multi-hop network topology that builds a distribution tree for each queue, leading to network bandwidth savings for certain use cases.
This is very interesting. One of the benefits of Kafka is partition routing, so I'm curious to learn how this may compare.
No idea, but I'm interested too. This was probably created in part for network-constrained environments where data is expensive in more than one way.
Bloomberg (who built this) operates a system that makes money by transmitting the same data to a huge number of clients. Having one server transmit the same piece of information to a million clients is approximately 500 times slower than having one server transmit it to 1,000 proxies that each pass it on to 1,000 end users.
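That 500x figure presumably comes from counting sequential sends on the critical path; a quick back-of-the-envelope sketch using the numbers above:

```python
# Back-of-the-envelope check on the fan-out numbers above.
clients = 1_000_000
proxies = 1_000

flat_sends = clients                   # root unicasts to every client itself
tree_root_sends = proxies              # root sends one copy per proxy
tree_proxy_sends = clients // proxies  # each proxy forwards to its 1,000 clients

# Critical path with the tree: the root's 1,000 sends followed by one
# proxy's 1,000 sends, versus 1,000,000 sequential sends from one server.
print(flat_sends / (tree_root_sends + tree_proxy_sends))  # -> 500.0
```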
The actual architecture of most of these systems in finance relies heavily on UDP multicast, which is a technology that big tech has basically forgotten because it can be tough to administer at large scale.
Multicast puts the entire load of distributing the data in its natural place: the network. The pub/sub queues that the rest of the world uses are more complicated and a lot worse.
This appears to be the beginning of a "cloud-native" Bloomberg that can't lean on multicast.
UDP multicast relies on constantly rebroadcasting the entire state of your system in order to be reliable, whereas with TCP you only need to send changes. This is fine in finance because the state is changing constantly anyway, so you might as well just keep pumping out the multicast traffic.
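A minimal sketch of that snapshot-refresh pattern, assuming a JSON-serializable state dict and an illustrative multicast group (not any real feed handler):

```python
import json
import socket
import time

# Illustrative multicast group/port; any 239.x.x.x group would do.
GROUP, PORT = "239.1.1.1", 5007

def broadcast_snapshots(state: dict, interval: float = 0.5) -> None:
    """Rebroadcast the ENTIRE state every `interval` seconds.

    A receiver that dropped packets just waits for the next snapshot,
    so the publisher keeps no per-receiver retransmission state; that
    is the trade-off described above.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Keep the datagrams within a couple of router hops.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
    while True:
        sock.sendto(json.dumps(state).encode(), (GROUP, PORT))
        time.sleep(interval)
```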
Message queuing is a more complicated system, but it allows you to push the bulk of your work out to the distributed network instead of concentrating it on the producer.
That is definitely not true about state management. You can build very TCP-like reliability on top of UDP (see QUIC), and the final step is making that work with multicast (a non-trivial adaptation, but possible). Once you do, you have the foundations of a pub/sub system in the form of a one-to-many (or many-to-many, if you're brave enough) network protocol. At that point you mostly just have to figure out state recovery for new clients and clients who have fallen behind, and that can basically be a database. The system described in the OP is essentially a combination of a database and a load balancer doing this over unicast connections.
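Hand-waving the transport, the recovery logic might look roughly like this sketch; `snapshot_from_db` and `replay_from_db` are hypothetical hooks standing in for whatever state store backs recovery:

```python
class RecoveringSubscriber:
    """Sketch of a client on a sequenced one-to-many feed.

    Live messages carry monotonically increasing sequence numbers;
    gaps and late joins are repaired from a database rather than by
    the publisher, as suggested above.
    """

    def __init__(self, snapshot_from_db, replay_from_db):
        self.replay_from_db = replay_from_db  # (lo, hi) -> iterable of messages
        # A new client bootstraps from a snapshot, not from the live feed.
        self.state, self.next_seq = snapshot_from_db()

    def on_message(self, seq: int, msg: dict) -> None:
        if seq < self.next_seq:
            return  # duplicate or already recovered; drop it
        if seq > self.next_seq:
            # We missed something: backfill the gap out of band.
            for missed in self.replay_from_db(self.next_seq, seq):
                self.state.update(missed)
        self.state.update(msg)  # assumes dict-shaped incremental updates
        self.next_seq = seq + 1
```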
You have to do this state management with a TCP-based system anyway - the use of TCP doesn't magically come with the full state of past messages.
This is not right, in finance at least. Message updates are typically incremental, just like in TCP. The reliability problem is handled with careful network stack tuning, core affinity, and out-of-band recovery.
Check out 0MQ (ZeroMQ), Informatica Ultra Messaging (formerly 29 West), and aeron.io
They build reliable unicast/multicast protocols over UDP, including strategies for mitigating NAK storms, etc.
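For a flavor of the NAK-storm mitigation (a toy sketch, not any of those libraries' actual APIs): each receiver waits a random delay before NAKing and stays quiet if it overhears a peer request the same range.

```python
import random
import select
import socket
import struct
import time

# Illustrative group/port; NAKs are multicast too, so peers can overhear them.
GROUP, PORT = "239.1.1.2", 5008

def join_group() -> socket.socket:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Join the multicast group on all interfaces.
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock

def nak_with_suppression(sock, lo: int, hi: int, max_wait: float = 0.1) -> None:
    """Request retransmit of sequence range [lo, hi), unless a peer beats us.

    The randomized delay spreads NAKs out in time, and overhearing an
    identical NAK suppresses ours, so one dropped packet doesn't trigger
    a storm of identical requests from every receiver.
    """
    deadline = time.monotonic() + random.uniform(0, max_wait)
    while (remaining := deadline - time.monotonic()) > 0:
        ready, _, _ = select.select([sock], [], [], remaining)
        if ready:
            data, _ = sock.recvfrom(1500)
            if data == struct.pack("!2Q", lo, hi):
                return  # someone else already asked; stay quiet
    sock.sendto(struct.pack("!2Q", lo, hi), (GROUP, PORT))

# Usage: sock = join_group(); nak_with_suppression(sock, 41, 45)
```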
Yes, UDP on top of IP multicast - UDP is a pretty thin layer on top of IP in general, and the IP protocol is deceptively powerful. You just can't run protocols like TCP over IP multicast.
Specifically, egress from cloud providers (transmitting data from inside the cloud to a computer outside it) is the usual culprit for being "very expensive". It's part of a pricing strategy that encourages folks to put all their infrastructure inside the cloud and none of it outside. As one example of the ink spilled over this topic, see this discussion from two years ago about AWS's very high egress fees:
Ingress can also be costly, especially if there is a steady stream of high-volume traffic sent from on-prem machines to machines inside the cloud. (Not sure about AWS, but we experienced this with Azure, where we had to resort to buying their ExpressRoute, which was very costly and ultimately unsustainable for us.)
Something else I was caught out on recently is cross-availability-zone traffic, especially for multi-AZ Kubernetes, where the nodes are very chatty. It's possible to get services to consider network topology and limit cross-AZ traffic (Kubernetes' topology-aware routing hints, for example), but it's not the default.
Maybe if you're hosting replicas or proxies outside of a high-cost bandwidth area?
Low-cost replicas at the edge, maybe. If the bandwidth usage is half the cost in their multi-hop topology, 14XXXX compared to 16XXXX? It depends on the number of proxies or replicas and on pricing.