What I miss in this article are references to some of the great research on Starlink performance and characteristics that has come out in the past ~2 years. [1] and [2] come to mind. They also go into much more detail on the 15 s interval; quoting [1]:
> Interestingly, we observe that the Starlink OWD (one-way delay) often noticeably shifts at interval points that occur at 15 s increments. Further investigation reveals the cause to be the Starlink reconfiguration interval, which, as reported in FCC filings [71], is the time-step at which the satellite paths are reallocated to the users.
AFAIK it is not the dish itself that does the tracking but a central orchestration.
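If you want to look for that pattern in your own connection, a minimal sketch (assuming a Linux box with iputils ping and a placeholder target address) is to sample RTTs once a second and bucket them by offset within a 15 s window; a repeating step in the buckets would line up with the reconfiguration interval quoted above. The interval boundaries aren't necessarily aligned to wall-clock multiples of 15 s, so treat this as a rough first pass:

    # Rough sketch: one ping per second, bucketed by position within a 15 s window.
    # HOST is a placeholder; parsing assumes iputils "time=12.3 ms" output.
    import re, subprocess, time
    from collections import defaultdict

    HOST = "192.0.2.1"   # placeholder: first IP hop behind the earth station
    INTERVAL = 15.0      # reconfiguration interval from the quote above
    buckets = defaultdict(list)

    for _ in range(300):  # roughly five minutes of samples
        out = subprocess.run(["ping", "-c", "1", "-W", "1", HOST],
                             capture_output=True, text=True).stdout
        m = re.search(r"time=([\d.]+) ms", out)
        if m:
            buckets[int(time.time() % INTERVAL)].append(float(m.group(1)))
        time.sleep(1)

    for sec in sorted(buckets):
        rtts = sorted(buckets[sec])
        print(f"t+{sec:2d}s  n={len(rtts):3d}  min={rtts[0]:6.1f}ms  "
              f"median={rtts[len(rtts) // 2]:6.1f}ms  max={rtts[-1]:6.1f}ms")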
One comment about the analysis of ping times in the section "Low Earth Orbit Systems". Specifically, the analysis of ping times "within each 15s satellite tracking interval".
Most routers do not put ping-processing in the "fast path". That is, instead of having the ping be processed by an ASIC, the ping gets processed by the router's CPU. And ping-processing is typically a lower-priority task. Because of that, you can't assume that the high variation in latency is because of Starlink.
It's unclear to me. `traceroute` typically just returns router IPs, so I'm assuming that "the first IP end point behind the Starlink earth station" is the IP address of a router of some sort.
Hm, I understood "IP endpoint" to mean something other than a router (because I am very sure that Geoff is aware of the "ICMP is not handled on the ASIC" issue).
About a week ago there were significant routing issues between east Africa and Europe. Latency ballooned, and jitter was quite significant too, especially over the course of the day. I believe both EASSy and one of the SEA-ME-WE circuits were down. I had packets from the U.K. reaching Mozambique via Tokyo.
That was a temporary situation though, and I think it's the worst I've seen in east Africa in the last 10 years.
The argument was “Most routers do not put ping-processing in the "fast path". That is, instead of having the ping be processed by an ASIC, the ping gets processed by the router's CPU.”
That's meaningless. Control planes are often policed, sure, but on overload they simply drop. In my experience they drop ICMP TTL-expired before echo responses, but most will generate TTL-expired just fine.
Any router capable of processing a full bgp table, and to be honest any router made in the last 20 years, is perfectly capable of responding to icmp echos.
There was then a second argument that "3rd world" routers aren't as good as ones in Western countries. In the majority of cases they're exactly the same. That Western arrogance is somewhere between insulting and amusing.
The final argument is about path loss/jitter/etc., especially loss on the first hop (your "my crappy 3G provider" argument).
That’s exactly what this test of starlink is showing.
Starlink is a great tool in specific cases, but the fanboyishness often drowns out the actual benefits.
> That is, instead of having the ping be processed by an ASIC, the ping gets processed by the router's CPU.
I wonder how true that is anymore, with ICMPv6 processing being a mandatory part of IPv6. I could totally see ICMP processing being a low-priority task, but am far less certain that it would not be done by dedicated hardware these days.
On top of that, I've never, ever, ever noticed the behavior he's observing with either my cable ISP connection, or the terrestrial microwave link provided by my local WISP. I don't have enough data to say that I'd never see that behavior if I happened to ping some router powered by a potato... but I've pinged a whole bunch of systems over the years, and have never seen variation like what he describes.
> I wonder how true that is anymore, with ICMPv6 processing being a mandatory part of IPv6.
ICMPv6 may be mandatory according to specs, but you can still drop most of it with no ill effects. You probably shouldn't drop needs fragmentation packets, but everyone has adapted to those being dropped sometimes, so...
If you ignore neighbor discovery, you'll have trouble reaching your neighbors, though.
But neighbor discovery is low volume mostly, and can be handled by the cpu, not the asic. Needs frag likely won't be directed at the router ip, the asic can forward them just fine though.
I've certainly seen much more variable ping times for routers than for the hosts behind them. Even when the router's CPU isn't very busy, ping times are usually a bit higher than for a host that's right there, and as the router's CPU gets busier, the RTT increases or pings get dropped. It's not usually a function of how busy the links are either; routers tend to have a pretty limited amount of CPU, and if they're getting a lot of traffic it's easy to overload them.
I haven't seen it happen in bands quite like in the article, but sometimes it varies quite a bit. The results tend to be much more consistent if you can get an RTT from an actual host and not a router.
I’ve seen ping latencies of up to 30 seconds (yes, 30 000 milliseconds!) with a certain cable ISP, while at the same time VoIP (RDP over regular Internet IP, not “cable voice” or ISP-provided SIP, which often has its own QoS class) was borderline usable.
Could have been traffic shaping or prioritization (their network was in complete shambles after all), but ICMP was definitely taking some lower priority queue or path.
I did indeed mean RTP! Almost everybody uses RTP these days, whether they realize it or not. It’s the basis for WebRTC and most proprietary VoIP solutions.
Indeed. Surprising that such a technical article from a supremely technical and knowledgeable person would not at least have a disclaimer about it. I would also be interested to know how much a Starlink satellite is an ASIC router vs pure CPU.
I believe a better technique would be to ping _through_ the router to another endpoint under one's own control, on the same network as the origin client, and where processing latency can be controlled. Basically ping yourself but via Starlink and back.
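A rough sketch of that idea, assuming you control a host just behind the first hop and have plain iputils ping available (both addresses below are placeholders): measure the router hop and the end host back to back, so slow-path ICMP handling on the router shows up as extra variance in the first series only.

    # Placeholder addresses; requires the system ping binary (iputils).
    import re, subprocess, time

    ROUTER = "192.0.2.1"     # first IP hop (ICMP may be handled in its slow path)
    ENDHOST = "192.0.2.100"  # host you control behind that hop

    def rtt(host):
        out = subprocess.run(["ping", "-c", "1", "-W", "1", host],
                             capture_output=True, text=True).stdout
        m = re.search(r"time=([\d.]+) ms", out)
        return float(m.group(1)) if m else None

    for _ in range(60):
        print(f"router={rtt(ROUTER)} ms   endhost={rtt(ENDHOST)} ms")
        time.sleep(1)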
That said, in practical terms he's quite right about the jitter problem. I maintain 2 networks myself at my remote home. One my microwave WISP for tight latency and jitter, and then Starlink for much better bandwidth/throughput.
I interpreted that statement incorrectly as being an endpoint on the satellite itself. I guess the "earth station" is the round trip terrestrial endpoint via the satellite. I was misreading it as being the local terminal.
BBR may not be perfect but it sure is an improvement for Starlink over Cubic. I switched to it on a server I connect to frequently from Starlink and got a 2x performance improvement on TCP connections. Before, big file transfers would regularly bounce between 5 and 12 Mbytes/s. Afterwards it'd hold a steady 12. Which is what you'd expect given Starlink's high packet loss rate (0.5% for me) and how TCP congestion control traditionally works.
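For anyone wanting to try the same thing: on a Linux server this is a per-host setting on the sending side. A small check (standard procfs paths; switching requires root and, on some kernels, loading the tcp_bbr module first) looks something like:

    # Show the sender's current and available TCP congestion control algorithms.
    def read(path):
        with open(path) as f:
            return f.read().strip()

    print("available:", read("/proc/sys/net/ipv4/tcp_available_congestion_control"))
    print("current:  ", read("/proc/sys/net/ipv4/tcp_congestion_control"))
    # To switch (as root): sysctl -w net.ipv4.tcp_congestion_control=bbr
    # (add it to /etc/sysctl.conf or a file in /etc/sysctl.d/ to persist)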
For those not aware of how Starlink operates: The customer antenna is called the User Terminal (U.T.) a.k.a. "dish" although all production models are rectangular - only the pre-production beta model is round and dish-like.
The U.T. contains a phased array antenna that can electronically 'steer' the bore-sight (aim) of the transmitted (and received) signal at the current satellite that is in view. In ideal circumstances the U.T. antenna has approximately 110 degrees field of view (~ 35 degrees above each horizon).
The satellites pass from west to east, and the U.T. is assigned to a given satellite for approximately 15 seconds at a time (the reconfiguration interval quoted above). The satellite forms a beam aimed at a fixed location on the ground - this is called a 'cell'. All U.T.s within that area share the radio link, which has a fixed bandwidth, so contention is managed by the satellite.
The path length to a satellite directly overhead would be around 550km (in most cases the satellite is slightly north, or slightly south, of the U.T., but for round numbers' sake assume 550km).
The path length to a satellite appearing 35 degrees above the horizon (the slant range) is roughly 890km.
Satellites relay the packets from the U.T. to the (nearest) Earth ground station, so the path length and therefore travel-time will vary enormously over just 15 seconds.
The round trip is four legs (U.T. to satellite, satellite to ground station, and back again): the minimum case is 4 x 550km = 2200km and the maximal case is roughly 4 x 890km = 3560km. These equate to a travel time of between ~1.8 and ~3.0ms per leg, so that gives a hard physical minimum of somewhere between 4 x 1.8ms ≈ 7.3ms and 4 x 3.0ms ≈ 11.9ms.
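For what it's worth, here is the same geometry worked out explicitly (assuming a spherical earth, a 550 km shell, a 35-degree minimum elevation, and ignoring the extra distance from the ground station to the Internet connection point):

    # Slant range and per-leg light time for a 550 km shell.
    from math import radians, sin, sqrt

    R = 6371.0        # earth radius, km
    H = 550.0         # shell altitude, km
    C = 299_792.458   # speed of light, km/s

    def slant_range(elev_deg):
        """Distance from the U.T. to a satellite at the given elevation."""
        s = R * sin(radians(elev_deg))
        return sqrt(s * s + 2 * R * H + H * H) - s

    for elev in (90, 35):
        d = slant_range(elev)
        print(f"elevation {elev:2d} deg: slant {d:6.1f} km, "
              f"one leg {1000 * d / C:4.1f} ms, four legs {4000 * d / C:5.1f} ms")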
As more satellites are added to the constellation, the gap between satellites decreases and the angle above the horizon at which a satellite is acquired can increase, thus shortening the maximum path and lowering the latency.
Starlink has a publicly stated goal of less than 20ms round trip latency and published a report in March 2024 about the engineering efforts to achieve this [0]. Much of the effort that customers see focuses on two issues:
1. reducing latency between ground station and Internet connection point
2. scheduling the radio links between satellite and all U.T.s in its beam area
Starlink balances contention by sometimes restricting and sometimes promoting activation of new U.T.s in each area - this is why a fully subscribed cell will on occasion impose a waiting list on new activations. At other times Starlink can, and does, dynamically change the monthly subscription cost. Recently some areas had their residential price reduced from US$120 to US$90 while others in congested areas had an increase from US$90 to US$120 (in the USA).
FYI, User Terminal or UT is quite an established term. The oldest random Googlable example that comes to my mind is a thin-client server process named "SUNWut" (a ticker code + "ut") from the early 2000s. I'm pretty sure the usage traces back multiple decades before that point in the phone and software industries.
General usage stemmed from the Tactical User Terminals deployed in the 1980s by the U.S. Army in Germany as part of the codename TENCAP ELINT programme - and they were truck-sized!
> This orbital velocity at the surface of the earth is some 40,320 km/sec.
Huh? Is this right? That's really fast, like 24,800 miles per second, which is a significant fraction of c (the speed of light, 186,000 miles/sec). >10% of c, is this a typo?
Earth's orbit has a circumference of about 940 million km. Divide that by 365.25 days x 24 hrs/day x 3600 seconds/hr and I'm getting in the arena of 29.78 km/sec.
That’s the velocity of the earth around the sun. But “orbital velocity at the surface of the earth” is supposed to be the velocity a satellite in an orbit at ground level around the earth would have.
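Right. The two numbers being conflated are easy to check back-of-envelope (standard constants, circular-orbit approximation):

    # Circular orbit speed at the earth's surface vs. earth's speed around the sun.
    from math import pi, sqrt

    MU_EARTH = 3.986e5       # GM of the earth, km^3/s^2
    R_EARTH = 6371.0         # km
    AU = 1.496e8             # km
    YEAR = 365.25 * 86400    # seconds

    v_surface = sqrt(MU_EARTH / R_EARTH)   # ~7.9 km/s, about 28,400 km/h
    v_heliocentric = 2 * pi * AU / YEAR    # ~29.8 km/s

    print(f"circular orbit at the surface: {v_surface:.1f} km/s")
    print(f"earth around the sun:          {v_heliocentric:.1f} km/s")

Neither is anywhere near 40,320 km/sec; my guess is a unit slip, since ~40,300 km/hour works out to about 11.2 km/s, which is escape velocity rather than circular orbital velocity.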
> A “flooding” ping sends a new ping packet each time a packet is received from the remote end
Uhm, no:
-f
Flood ping. For every ECHO_REQUEST sent a period “.” is
printed, while for every ECHO_REPLY received a backspace is
printed. This provides a rapid display of how many packets
are being dropped. If interval is not given, it sets interval
to zero and outputs packets as fast as they come back or one
hundred times per second, whichever is more. Only the
super-user may use this option with zero interval.
[1]: https://arxiv.org/abs/2310.09242
[2]: https://arxiv.org/abs/2306.07469