
BunnyCDN don't run their own network; most of their servers are hosted at DataPacket(.com), though they use some other hosting companies too.

DataPacket has a very large network though, and is kind of, sort of EU-based. AFAIK most operations are in Czechia, but the company is registered in the UK. And there's also the Luxembourg-based Gcore.


> Worryingly, VNPT and Bunny Communications are home/mobile ISPs

VNPT is a residential / mobile ISP, but they also run datacentres (e.g. [1]) and offer VPS, dedicated server rentals, etc. Most companies would use separate ASes for residential vs hosting use, but I guess they don't, which would make them very attractive to someone deploying crawlers.

And Bunny Communications (AS5065) is a pretty obvious 'residential' VPN / proxy provider trying to trick IP geolocation / reputation providers. Just look at the website [2], it's very low effort. They have a page literally called 'Sample page' up and the 'Blog' is all placeholder text, e.g. 'The Art of Drawing Readers In: Your attractive post title goes here'.

Another hint is that some of their upstreams are server-hosting companies rather than transit providers that a consumer ISP would use [3].

[1] https://vnpt.vn/doanh-nghiep/tu-van/vnpt-idc-data-center-gia... [2] https://bunnycommunications.com/ [3] https://bgp.tools/as/5065#upstreams


These days RFC 8805 [0] is pretty widely supported. But as far as I understand, it's not entirely trusted, and geolocation providers will still override that data if it doesn't match traceroutes and whatever other sources they use.
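For reference, a self-published geofeed is just a CSV of prefix, country, region, city and an optional postal code. Here's a minimal sketch of reading one with Python's csv module; the prefixes and locations are made up for illustration:

    import csv

    # Hypothetical geofeed lines in the RFC 8805 CSV format:
    # ip_prefix,country,region,city,postal_code (trailing fields may be empty)
    feed = [
        "192.0.2.0/24,US,US-CA,Mountain View,",
        "2001:db8::/32,NL,NL-NH,Amsterdam,",
    ]

    for row in csv.reader(feed):
        prefix, country, region, city, postal = row
        print(prefix, country, region, city)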

[0] https://datatracker.ietf.org/doc/html/rfc8805


A bit late to reply, since it's been much longer (10h) since I posted my comment, but just for the record, here I go.

After reading RFC 8805, here's what it says about the situation at the time of publication (August 2020).

"8. Finding Self-Published IP Geolocation Feeds" and subsequent

The issue of finding, and later verifying, geolocation feeds is not formally specified in this document. At this time, only ad hoc feed discovery and verification has a modicum of established practice (see below); discussion of other mechanisms has been removed for clarity."

and subsequently

"8.1. Ad Hoc 'Well-Known' URIs

To date, geolocation feeds have been shared informally in the form of HTTPS URIs exchanged in email threads. Three example URIs ([GEO_IETF], [GEO_RIPE_NCC], and [GEO_ICANN]) describe networks that change locations periodically, the operators and operational practices of which are well known within their respective technical communities."

I also spent a moment trying to figure out what I could find about its adoption and use, and didn't find much: some blog posts, articles and comments asking whether Amazon AWS or Microsoft Azure support it, and the answers were pretty much no, they don't, at least not as of last year or this year.

Thus I'm concluding it's unlikely to be a major source of location information for GeoIP providers like MaxMind. It's too marginal a source for them to spend time on such a little-used spec yet.


If you're buying transit, you'll have a hard time getting away with less than a 10% commit, i.e. you'll have to pay for 10 Gbps of transit to have a 100 Gbps port, which will typically run into four digits USD / month. You'll need a few hundred Gbps of network and scrubbing capacity to handle common amplification DDoS attacks from script kids with a 10 Gbps uplink server at a host that allows spoofing, and probably on the order of 50+ Tbps to handle Aisuru.
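Rough back-of-the-envelope for that commit figure; the per-Mbps price range here is my assumption and varies a lot by region and provider:

    # Hypothetical numbers, just to show the shape of the math
    port_gbps = 100
    commit_ratio = 0.10                # ~10% minimum commit
    price_per_mbps = (0.10, 0.30)      # USD per Mbps per month, assumed range

    commit_mbps = port_gbps * 1000 * commit_ratio          # 10,000 Mbps
    low, high = (commit_mbps * p for p in price_per_mbps)
    print(f"~${low:,.0f} to ${high:,.0f} / month")         # ~$1,000 to $3,000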

If you're just renting servers instead, you have a few options that are effectively closer to a 1% commit, but better have a plan B for when your upstreams drop you if the incoming attack traffic starts disrupting other customers - see Neoprotect having to shut down their service last month.


From having worked on DDoS mitigation, there's pretty much no difference between CGNAT and IPv6. Block or rate limit an IPv4 address and you might block some legitimate traffic if it's a NAT address. Block a single IPv6 address... and you might discover that the user controls an entire /64 or whatever prefix. So if you're in a situation where you can't filter out attack traffic by stateless signature (which is pretty bad already), you'll probably err on the side of blocking larger prefixes anyway, which potentially affects other users, the same as with CGNAT.

Insofar as it makes a difference for DDoS mitigation, the scarcity of IPv4 is more of a feature than a bug.


(Having also worked on DDoS mitigation services) That "entire /64" is already a hell of a lot more granular than a single CG-NAT range serving everyone on an ISP, though. Most often in these types of attacks it's a single subnet of a single home connection. You'll need to block more total prefixes, sure, but only because you actually know you're only blocking actively attacking source subnets, not entire ISPs. You'll probably still want something signature-based to detect what to blackhole, but the combination does scale further on the same amount of DDoS mitigation hardware.


You can heuristically block IPv6 prefixes on a big enough attack by blocking a prefix once a probabilistic percentage of the nodes under it are themselves blocked. I think it should work fairly well, as long as the attacking traffic has a signature.

Consider simple counters, "IPs with non-malicious traffic" and "IPs with malicious traffic", to probabilistically estimate the cost/benefit of blocking a prefix.

You do need to be able to support huge block lists, but there isn't the same issue as with CGNAT, where many non-malicious users are definitely getting blocked.
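Something along these lines, as a rough IPv6-only sketch. It assumes you already classify individual sources as malicious by signature, and the /48 aggregation level and thresholds are arbitrary:

    import ipaddress
    from collections import defaultdict

    # Track which /64s under each aggregation prefix (here /48) have been
    # seen sending malicious vs. legitimate traffic.
    bad = defaultdict(set)
    good = defaultdict(set)

    def observe(src_ip, malicious):
        addr = ipaddress.ip_address(src_ip)
        subnet = ipaddress.ip_network(f"{addr}/64", strict=False)  # one "user"
        parent = ipaddress.ip_network(f"{addr}/48", strict=False)  # aggregation level
        (bad if malicious else good)[parent].add(subnet)

    def should_block(parent, min_bad=50, ratio=0.9):
        b, g = len(bad[parent]), len(good[parent])
        # Only escalate to blocking the whole parent prefix once nearly
        # everything seen under it has been attacking.
        return b >= min_bad and b / (b + g) >= ratio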


You should block the whole /64, at least. It's often, but not always, a single host; /64 as the smallest subnet size is what's standardized.


Usually a /64 is a "local network", so in the case of consumer ISPs that's all the devices belonging to a given client, not a single device.

Some ISPs provide multiple /64s, but in the default configuration the router only announces the first /64 to the local network.
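To put the granularity in numbers, a quick Python illustration; the /56 delegation size is just an example, what ISPs actually hand out varies:

    import ipaddress

    delegation = ipaddress.ip_network("2001:db8:abcd:100::/56")  # example delegated prefix
    lans = list(delegation.subnets(new_prefix=64))
    print(len(lans))               # 256 possible /64 local networks in one /56
    print(lans[0].num_addresses)   # 18446744073709551616 addresses per /64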


Presumably a compromised device can request arbitrarily many new IPv6 addresses from DHCP, so the entire block would be compromised. It would be interesting to see if standard DHCP could limit auto-leasing to guard the reputation of the network.


Generally, IPv6 uses autoconfiguration (I've never seen a home router with DHCPv6), so there's no need to ask for anything. Even for IPv4, I've never seen a home router enforce DHCP (even though it would force the public IP).

But the point stands: you can't selectively punish a single device; you have to cut off the whole block, which may include well-behaved devices.


In mobile networks it's usually a single device.


This DDoS is claimed to be the result of <300,000 compromised routers.

That would be really easy to block if we were on IPv6. And it would be pretty easy to propagate upstream. And you could probabilistically unblock in an automated way and see if a node was still compromised. etc.


> That would be really easy to block -- if we were on IPv6.

Make that: If the service being attacked was on IPv6-only, and the attacker had no way to fall back to IPv4.

As long as we are dual-stack and IPv6 is optional, no attacker is going to be stupid enough to select the stack which has the highest probability of being defeated. Don't be naive.


It'd be far more acceptable to block the CG-NAT IPv4 addresses if you knew that the other non-compromised hosts could utilize their own IPv6 addresses to connect to your service.


Better to rely on IP blocks than on NAT to bundle blocks.


This is very challenging; in about one year the biggest recorded DDoS attack has increased from 5 Tbps to almost 30.

Almost all of the DDoS mitigation providers have been struggling for a few weeks because they just don't have enough edge capacity.

And normal hosting companies that are not focused on DDoS mitigation also seem to have had issues, but with less impact to other customers as they'll just blackhole addresses under larger attacks. For example, I've seen all connections to / from some of my services at Hetzner time out way more frequently than usual, and some at OVH too. Then one of my smaller hosting providers got hit with an attack of at least 1 Tbps which saturated a bunch of their transit links.

Cloudflare and maybe a couple of the other enterprise providers (Gcore?) operate at a large enough scale to handle these attacks, but all the smaller ones (who tend to have more affordable rates and more application-specific filters for sensitive applications that can't deal with much leakage) seem to be in quite a bad spot right now. Cloudflare Magic Transit pricing supposedly starts at around $4k / month, and it would really suck if that became the floor for being able to run a non-HTTP service online.

Something like Team Cymru's UTRS service (with Flowspec support) could potentially help to mitigate attacks at the source, but residential ISPs and maybe the T1s would need to join it, and I don't see that happening anytime soon.


> has increased from 5 Tbps to almost 30

That's nearly a pint, or over 2 daL!


I'm surprised that the best response to DDoS is not blocking traffic, but just handling it.


It was taught in a first-year software ethics class on my Computer Science programme, back in 2010. I'm wondering if they still do.


I was taught Computer Ethics back in the early 2000s as part of my CS degree.


Over 22 hours of downtime for the one VPS I have in that region.

My infrastructure is redundant and spread out among hosting providers and DCs so there's no real impact, but I'm pretty sure this is the longest outage I've ever had with any provider. And the level of communication has been so disappointing. Four hours to say it's a power / HVAC issue? Since then, updates that basically just say they're still working on it.


We are approaching 24 hours of downtime. I'm one of those still affected, and I'm starting to wonder if the situation is worse than they are letting on.


Update: it's finally up again after around 25.5 hrs of downtime


Here's hoping mine comes up again soon.


On Firefox on Android on my pretty old phone, a blurry preview rendered in about 10 seconds, and it was fully rendered in twenty-something seconds. Smooth panning and zooming the entire time.


Firefox on a Samsung S23 Ultra did it a few seconds faster, but otherwise the same experience.


Following up: Firefox on an S24 Ultra loaded from blank to image in a second, and then I could zoom right in with no blurriness or stuttering at all!


> One crawler downloaded 73 TB of zipped HTML files in May 2024 [...] This cost us over $5,000 in bandwidth charges

I had to do a double take here. I run (mostly using dedicated servers) infrastructure that handles a few hundred TB of traffic per month, and my traffic costs are on the order of $0.50 to $3 per TB (mostly depending on the geographical location). AWS egress costs are just nuts.
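Back-of-the-envelope for that 73 TB, assuming AWS's standard ~$0.09/GB internet egress tier (tiered discounts bring large volumes down a bit) versus the kind of per-TB rates I'm paying:

    tb = 73
    aws_per_gb = 0.09           # USD, typical first-tier AWS internet egress price
    dedi_per_tb = (0.50, 3.00)  # USD per TB, roughly what I pay on dedicated servers

    print(f"Cloud egress: ~${tb * 1000 * aws_per_gb:,.0f}")                           # ~$6,570
    print(f"Dedicated:    ${tb * dedi_per_tb[0]:.2f} to ${tb * dedi_per_tb[1]:.2f}")  # $36.50 to $219.00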


I think the uncontrolled price of cloud traffic is a real fraud, and a way bigger problem than some AI companies ignoring robots.txt. One time we went over the limit on Netlify or something, and they charged over a thousand dollars for a couple of TB.


> I think the uncontrolled price of cloud traffic is a real fraud

Yes, it is.

> and a way bigger problem than some AI companies ignoring robots.txt.

No, it absolutely is not. I think you underestimate just how hard these AI companies hammer services - it is bringing down systems that have weathered significant past traffic spikes with no issues, and the traffic volumes are at the level where literally any other kind of company would've been banned by their upstream for "carrying out DDoS attacks" months ago.


> I think you underestimate just how hard these AI companies hammer services

Yeah, I completely don't understand this, and I don't understand comparing it with DDoS attacks. There's no difference from what search engines are doing, and yet in some way it's worse? How? It's simply scraping data; what significant problems can it cause? Cache pollution? And that's it? I mean, even when we're talking about ignoring robots.txt (which search engines often do too) and calling costly endpoints, what's the problem with adding a captcha or rate limiters to those endpoints?

