The latencies in the table are based on heuristics and averages we’ve observed. In practice, though, depending on the conversation, some of the larger latency components can be much lower.
High-level overview of how video is delivered to 100,000 participants using mesh/cascaded video routers (aka SFUs) at around 200 ms latency.
Servers are geographically distributed and coordinate with each other to route media, selectively forwarding streams based on who is watching whom in the viewport and at what quality.
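A toy sketch of that per-subscriber forwarding decision (all names and thresholds below are hypothetical, not any particular SFU's logic): forward nothing for publishers outside the viewport, and pick a simulcast layer from the rendered tile size otherwise.

```python
# Toy model of an SFU's per-subscriber forwarding decision.
# Names and thresholds are hypothetical, for illustration only.
from typing import Optional

def choose_layer(in_viewport: bool, tile_height_px: int) -> Optional[str]:
    """Pick which simulcast layer to forward, or None to forward nothing."""
    if not in_viewport:
        return None        # publisher not visible: drop entirely
    if tile_height_px <= 180:
        return "low"       # thumbnail tile: lowest spatial layer is enough
    if tile_height_px <= 360:
        return "medium"
    return "high"          # large tile / active speaker

def forward_plan(subscribers):
    """Map subscriber id -> layer, given {id: (in_viewport, tile_height_px)}."""
    return {sid: choose_layer(vis, h) for sid, (vis, h) in subscribers.items()}
```

At 100,000 participants the win comes from most entries resolving to None or "low": each cascaded server only pulls the layers its own subscribers actually need from its peers.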
Agreed, all of the early HTTP-based streaming was about HTTP progressive download. Microsoft’s Smooth Streaming, dominant in the early 2000s, used it too. Glad that Twitch has been using it.
For me, DASH and HLS are just manifests or playlist definitions describing how best to find the resource. It is not dissimilar to WebRTC signaling, i.e., you go to one resource to discover where to get the other parts of the resource.
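To make the "manifest as locator" point concrete, here's a deliberately simplified (not spec-complete) parse of an HLS master playlist: the playlist carries no media at all, only pointers to where the variant streams live.

```python
# Illustration that an HLS master playlist is just a locator:
# it lists variant-stream URIs, not media. Parser is simplified,
# not a spec-complete M3U8 implementation.

def variant_uris(m3u8_text: str) -> list:
    """Return the URI lines that follow #EXT-X-STREAM-INF tags."""
    lines = m3u8_text.strip().splitlines()
    uris = []
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF") and i + 1 < len(lines):
            uris.append(lines[i + 1].strip())
    return uris

MANIFEST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
high/index.m3u8
"""
```

The client then fetches one of those variant playlists, which in turn points at the actual segments, exactly the "go to a resource to discover the rest" pattern.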
The MBone was tunneled, and expanded to native IP multicast to save bandwidth in networks that supported it. Meanwhile, current networks use multicast (non-routed, granted) all the time, e.g., for service discovery and lower-level stuff like finding the MAC address corresponding to an IP address.
All of the listed protocols came after HTTP. RTSP and SIP borrowed heavily (albeit badly, in retrospect) from HTTP.
I do not have all the historical context (early 90s), but for WebRTC, the idea was not to define any new protocol(s) or do a clean-slate design, but rather to agree on the flavors of the various protocols and then universally implement those. We already had SDP, RTSP, RTP, SAP, etc., and the idea was to cobble the existing protocols together into something everyone could agree on (the young companies, the old companies, etc.).
We ended up defining variations on the flavors we already had, and for the most part everything turned out okay (maybe the SDP plan wars did not end up where we wanted them, but… it was a good-enough compromise).
For realtime media, if we are able to combine “locator” and “identifier”, we will be able to make media and signaling work in-band.
I know they came later, so I'm still confused why RTSP and SIP weren't implemented atop HTTP. I realize that RTSP and SIP can push from server to client, but there are ways around that, though perhaps long polling and WebSockets weren't conceivable when RTSP and SIP were invented. I mean, in a pinch, I have an HTTP server serving a folder where SDP files are generated, and I've written clients that just look for a well-known SDP file and use that to consume an RTP stream. It's a ghetto form of "signaling" that I love using when doing experiments (not suitable for production, for various reasons I imagine are obvious to you).
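That "folder of SDP files" trick can be sketched in a few lines (the SDP below is a made-up example; a real client would first fetch the text with a plain HTTP GET from the well-known URL before parsing it):

```python
# Sketch of the "well-known SDP over HTTP" signaling trick.
# SAMPLE_SDP is a hypothetical file; the parse is deliberately minimal.

SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=cam1
c=IN IP4 192.0.2.10
m=video 5004 RTP/AVP 96
a=rtpmap:96 H264/90000
"""

def parse_sdp(text: str) -> dict:
    """Pull out the connection address and the media descriptions."""
    info = {"address": None, "media": []}
    for line in text.strip().splitlines():
        if line.startswith("c=IN IP4 "):
            info["address"] = line.split()[-1]
        elif line.startswith("m="):
            kind, port, proto, *fmts = line[2:].split()
            info["media"].append({"kind": kind, "port": int(port), "proto": proto})
    return info
```

From there the client has everything it needs to open the RTP socket, which is exactly why this works as a poor man's signaling channel.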
I'm not saying WebRTC had poor design decisions or anything. I think it was very smart for WebRTC to reuse SDP, RTP, etc., so the same libraries and pipelines could keep working with minimal changes. It also means very little new learning for folks familiar with the rest of the stack.
> For realtime media, if we are able to combine “locator” and “identifier”, we will be able to make media and signaling work in-band.
+1000. I think RTSP+TCP is a decent way to do in-band signaling and media, and RTMP defines strict ways to send both anyway.
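The RTSP+TCP in-band trick being referred to is RFC 2326's interleaved framing: RTP/RTCP packets ride the same TCP connection as the signaling, each prefixed by a `$`, a channel id, and a length. A rough sketch:

```python
import struct

# RFC 2326, section 10.12: interleaved binary data on the RTSP TCP
# connection is framed as '$' (0x24), a one-byte channel id, a two-byte
# big-endian length, then the RTP/RTCP payload itself.

def frame_interleaved(channel: int, payload: bytes) -> bytes:
    """Wrap an RTP/RTCP packet for transmission on the RTSP connection."""
    return struct.pack("!cBH", b"$", channel, len(payload)) + payload

def parse_interleaved(data: bytes):
    """Unwrap one interleaved frame; returns (channel, payload)."""
    magic, channel, length = struct.unpack("!cBH", data[:4])
    assert magic == b"$", "not an interleaved frame"
    return channel, data[4:4 + length]
```

The receiver demultiplexes on the leading byte: `$` means media on some channel, anything else is an RTSP message, so one socket carries both signaling and media.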
To me, the whole typical IP Multimedia stack screams telco. They prefer to remove and reattach headers when crossing interfaces, separate the control and data planes, and rely on synchronization for session integrity. Great when there's a phone line to HQ and a heavily metered satellite link for doing a live broadcast, I guess...
Pavlov’s comment is correct. I came to add that soon the stream will be storable on the customer’s own S3. Ergo, you’d be able to do a call in real time, store it in your S3 account, and make it available for streaming.
If we use Vimeo for video hosting, would there be some way to automatically post to Vimeo, or some other way to get the file up there and available for immediate viewing? Any other approaches besides S3?
If Vimeo offers an ingest API using either RTMP(S) or HLS, that would be one way to get the stream from Daily directly to them without any extra processing step in between.
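Assuming such an RTMP(S) ingest endpoint exists (the URL and stream key below are placeholders, not real Vimeo values), pushing a recording into it is just a remux with ffmpeg; here it's sketched as the command a small relay script would run:

```python
# Hypothetical RTMP ingest push: copy the streams (no re-encode) and
# remux into FLV, which is what RTMP carries. URL and key are placeholders.

INGEST_URL = "rtmps://ingest.example.com/live"
STREAM_KEY = "my-stream-key"

def ffmpeg_push_cmd(source: str) -> list:
    """Build the ffmpeg command to remux `source` into the RTMP ingest."""
    return [
        "ffmpeg",
        "-re",                    # read input at native frame rate
        "-i", source,             # recording (or live) source
        "-c", "copy",             # no transcoding, just remux
        "-f", "flv",              # RTMP payload format
        f"{INGEST_URL}/{STREAM_KEY}",
    ]

# To actually run it:
#   import subprocess
#   subprocess.run(ffmpeg_push_cmd("call.mp4"), check=True)
```

Since it's stream copy rather than transcode, this is exactly the "no extra processing step in between" case: the ingest side receives the same encoded media the call produced.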
In the case of QUIC, the streaming would likely be over HTTP/3, i.e., HTTP over QUIC. Clients may fall back to H1 or H2, but typically, over a long enough time, firewall rules become more relaxed.