One of the arguments I've seen is 'what if your antivirus scans it?', to which I think: if your antivirus blows up on a zip bomb in 2024, you need a new antivirus that isn't total garbage.
This could be the result of a Google indexing error. Strangely enough, my blog (the one linked in this HN post) was itself de-listed from Google just a few weeks ago.
The blog is a static HTML page with no external dependencies, and I didn't even update it during the time Google thought there was phishing somewhere.
Webmaster Tools showed the error but didn't link to any specific URL (it even said null).
So I submitted it for re-evaluation and it was put back on Google (without any changes to the static files themselves). Very strange stuff.
People have been coming up with ideas like that regularly. I'm not a fan.
The title says that you can "defend" your webpage, but it is not clear how it "defends" against anything. The only thing you possibly achieve is that every now and then, someone with an automated scanner (which may be an attacker, or may be a security researcher or service) will see his tool crash or consume large amounts of resources.
You're spending time trying to annoy attackers that you should probably just ignore. If you really worry that someone running some automated scanner against your webpage causes you any harm, you probably should spend your time with something different than building zip bombs, and instead fix the security problems you have.
> you probably should spend your time with something different than building zip bombs, and instead fix the security problems you have.
Why not both? You can both create a well configured server to reduce its attack surface and add a booby-trap or two for really adamant scanners which hit very specific endpoints on your site.
I don't have to welcome every scanner with open arms. Maybe I'm doing some research, PoC||GTFO style, and your scanner found my research. It's not my problem.
Perhaps that used to be true, but I believe active countermeasures will grow in importance in the near future. We are on the cusp of seeing a far more sophisticated class of automated attacks (once malicious cyber actors get a grasp on how to use LLMs in their favour), and actively disrupting those capabilities will be important, not just building more shields around yourself. Before that, we need to fast-track the normalization of active countermeasures, since most people are in the same camp as you are.
Please elaborate: how are LLMs going to provide a "more sophisticated" class of automated attacks, and how are toy countermeasures like ZIP bombs going to defend against these "sophisticated" attacks?
I could believe you if you said that ZIP bombs are perhaps mildly effective against script kiddies using ChatGPT to generate naive, simplistic automated scanners that are susceptible to being "zip bombed". The other way around? Not so much.
These zip bombs are trivial to defend against as an attacker, e.g. by inspecting the payload while you're decompressing it to see if it matches some expected output (an <html> tag, for example) or by aborting after the decompressed size exceeds some limit (10 MB is way more HTML than you'd typically expect, for example).
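A minimal sketch of the size-limit approach in Python (the 10 MB cap and the gzip framing are just the assumptions from above):

```python
import zlib

MAX_DECOMPRESSED = 10 * 1024 * 1024  # 10 MB: far more HTML than a page should need

def inflate_with_limit(compressed_chunks):
    """Decompress a gzip response incrementally, refusing to go past the cap."""
    d = zlib.decompressobj(zlib.MAX_WBITS | 16)  # | 16: expect a gzip header
    total = 0
    pieces = []
    for chunk in compressed_chunks:
        piece = d.decompress(chunk, MAX_DECOMPRESSED - total + 1)
        total += len(piece)
        if total > MAX_DECOMPRESSED:
            raise ValueError("decompressed size limit exceeded -- likely a zip bomb")
        pieces.append(piece)
    return b"".join(pieces)
```

The expected-output check mentioned above (e.g. that the first chunk starts with something like `<html`) can be layered into the same loop.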
It really is only in the world of IT security that this attitude is so pervasive, where you constantly blame victims. While defense is a great offense, it can't hurt to be an annoyance to bad actors.
You are very insistent that people are doing "defending" wrong, and I can see how someone could read victim blaming into that; the industry does have a bad habit of loud hindsight bias. But it's also entirely beside the point: no one here is claiming that this is real security. It's just a small nostalgic hack, and it comes off a bit grumpy to be so adamantly against it.
>You are very insistent that people are doing "defending" wrong, and I can see how someone could read victim blaming into it
that's so absolutely ridiculous.
"Hey bud, you're wearing your helmet backwards."
"Oh, so it's MY fault when I run into something on my motorcycle, HUH?!"
"Oh ok, well, have fun. I'll be safely over here. "
a zip bomb will only serve to hinder teenage/kid 'hackers' -- next you're going to tell me if I don't implement a zip bomb somewhere that it'll deprive the next generation of hackers of a valuable life lesson -- please.
At what point can we point out a flawed methodology without being accused of being 'the bad guys' ourselves? It gets to the point that when you see someone making a mistake, you feel like just letting them dive in and do it; otherwise you'll be labeled the world's worst victim-blaming monster. Eh, it's easier to keep your mouth shut at some point... and that's a dangerous condition.
> a zip bomb will only serve to hinder teenage/kid 'hackers'
And that is literally what the blog post says: it will mess with script kiddies who don't change their user agent. The author acknowledges that it is not an actual methodology to protect their server, so pointing out that it's a flawed methodology is a weird flex. I have not seen anyone suggest using zip bombs instead of hardening the server.
From the article: "This script obviously is not - as we say in Austria - the yellow of the egg, but it can defend from script kiddies I mentioned earlier who have no idea that all these tools have parameters to change the user agent.".
Perhaps consider reading the article thoroughly before you start claiming it to be more than what it is.
>You're spending time trying to annoy attackers that you should probably just ignore.
s/ignore/block at firewall-level/
To me this article is relevant nonetheless, because it inspires me to mess with AI trainers by crafting an HTML-page ZIP bomb full of ZIP-bomb-like embedded resources (stylesheets, images, etc.).
WAFs that "block" "attacks" that ultimately should just cause a 404 error are part of the same mindset: That you think you "have to do something" about an attack that you should probably just ignore. They're also part of the mindset that security should mean adding more complexity, which is the opposite of what you should do.
I would like to compare this to dealing with unsolicited robocalls. Most people just ignore them, and it's indeed unwise to accept them. But some try to strategically waste the caller's time by picking up and then doing nothing else. I can't say whether it's beneficial or not, but I can say that it is easy enough to do without worrying too much.
No, block at the firewall level. I never implied you should have to "study" or "scan" the attacker's request: drop it.
You don't owe net-neutrality to a botnet instance.
That's just for when you need to be clever, as in when you are tasked to be a sysadmin for, say, a hosting service or a corporate network, so you're not really aware of everything coming and going through.
If you're just a self-hoster you don't need to be clever, neither at ingress nor at egress: you "just" have to minimize the attack surface, because you probably know what you want to offer. E.g. for a blog you can publish with an SSG, or serve HTML pages scraped from a dynamic CMS you'd like to use, like WordPress, instead of serving the CMS's content directly (and deal with comments using something else).
I use fail2ban for this. I know what my servers run (and it's never PHP, Python, Microsoft/AD, SharePoint, etc., because I don't have services in those languages or systems), so I simply fail2ban everything that matches these broad items.
Access "wp-admin"? Ban. Try "cron.php"? Ban. Looking for "phpmyadmin"? Ban.
It works reasonably well. These bots can switch to other IPs, or try again after the jail time is lifted (20 mins in my case). But instead of a bot attempting thousands of endpoints, I get only one attempt.
I initially configured this on a backend that, for reasons, had to handle every URL in the (Ruby, so heavy and slow) application layer: a bot trying common endpoints for popular CMSes and services would put severe load on the application. Fail2ban was already there, blocking SSH, FTP, SMTP and whatnot, so I configured it to also handle these HTTP endpoints.
Fail2ban can also work on ModSecurity logs, banning IPs that trigger a bunch of rules there.
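Not fail2ban itself, but a toy Python sketch of the same match-and-ban idea (the bait paths are the ones above; the combined log format, the log path, and the plain iptables DROP rule are assumptions for illustration):

```python
import re
import subprocess

BAIT = ("/wp-admin", "/cron.php", "/phpmyadmin")  # endpoints I never serve
# First field of a combined-format access log line is the client IP,
# the request path follows the HTTP method inside the quoted request.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+)')

def ban_probers(logfile="/var/log/nginx/access.log"):
    banned = set()
    with open(logfile) as fh:
        for line in fh:
            m = LINE.match(line)
            if not m:
                continue
            ip, path = m.groups()
            if ip not in banned and any(path.startswith(b) for b in BAIT):
                # Drop all further traffic from this address.
                subprocess.run(["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"])
                banned.add(ip)

if __name__ == "__main__":
    ban_probers()
```

Fail2ban adds the parts that matter in practice: following the log live, lifting the ban after the jail time, and well-tested filters.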
By the way, I have mine set to block for 12 hours, and some still come back. I suppose what I should look into is some sort of progressive extension to the delay.
No one is really worried, and no time is wasted when the outcome is occasional fun. There are no security holes to fix; there's just a bunch of gnats poking at your server and you shoo them away, that's all.
> The only thing you possibly achieve is that every now and then, someone with an automated scanner (which may be an attacker, or may be a security researcher or service) will see his tool crash or consume large amounts of resources.
True. But it's also a low-cost, low-effort thing. It may not change the world, but putting a small hiccup in someone's operation can bring a small bit of joy.
It reminds me of someone I read about from the early web who defended their website against email address harvesting crawlers by adding a dynamic page that threw up some random emails and links, where the links just led back to the dynamic page under a new path on the same host. The crawlers would get stuck downloading millions of fake email addresses until they eventually broke.
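A toy reconstruction of that kind of spider trap in Python (the port, paths, and example.com domain are made up; the original implementation isn't known):

```python
import random
import string
from http.server import BaseHTTPRequestHandler, HTTPServer

def fake_email():
    name = "".join(random.choices(string.ascii_lowercase, k=8))
    return f"{name}@example.com"

class TrapHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every page links back into the trap under fresh paths, so a naive
        # crawler never runs out of "new" URLs to follow.
        links = " ".join(
            f'<a href="/trap/{random.randrange(10**9)}">more</a>' for _ in range(5)
        )
        emails = "<br>".join(fake_email() for _ in range(20))
        body = f"<html><body>{emails}<br>{links}</body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), TrapHandler).serve_forever()
```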
Guilty, I did this on my first project site back in the early 00s. It only worked for a short time before the scanners got more sophisticated. It was a fun diversion, but the people who run these things adapt quickly.
This reminds me of the chatbot tool that was deployed to reply to scam emails and waste as much of the scammers' time as possible by seeming like a vulnerable, gullible target. The bot would drag out the interaction slowly, all the while posing as a real human.
There are a lot of comments here decrying this as a method to defend your website, but in some senses it's a very smart tactical weapon, especially if deployed widely. There's an often-quoted principle in secure password hashing: you're not making the hash impossible to crack, only prohibitively expensive to crack, such that the benefits no longer outweigh the costs. The same principle applies here: if the good guys collectively make scanning a very expensive endeavor, the juice is no longer worth the squeeze for the bad guys. Just blocking an IP is cheap for them. Make them bleed a little.
Another possible tactic is to trickle your packets back slowly, say one TCP packet per second. Sure, they probably have client-side timeouts, but again, if everyone did this, wouldn't it be a pain to scan for vulns? Every endpoint you hit would last the duration of your timeout.
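A crude sketch of that idea in Python: accept the connection, then dribble out one byte per second so each probe is stuck until the scanner's own timeout fires (the port and the fake response are arbitrary, and a real setup would serve many connections concurrently rather than one at a time):

```python
import socket
import time

def tarpit(port=2222):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(16)
    while True:
        conn, _addr = srv.accept()
        try:
            # Dribble out a plausible-looking response one byte per second.
            for byte in b"HTTP/1.1 200 OK\r\nContent-Length: 1000000\r\n\r\n" + b"\0" * 10**6:
                conn.sendall(bytes([byte]))
                time.sleep(1)
        except OSError:
            pass  # the client gave up or timed out
        finally:
            conn.close()

if __name__ == "__main__":
    tarpit()
```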
I've done this without compression but just sending infinite data. On some days I've sent a TB to a single IP address... Might be an idea to combine this. I'd assume the resulting gzip file here contains a repeating pattern that you can generate on the fly?
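For reference, a minimal sketch of generating such a stream on the fly (gzip framing via zlib; the sizes are arbitrary), so the repeating pattern never has to exist on disk:

```python
import zlib

def gzip_bomb_chunks(decompressed_size=10 * 1024**3, block=1024 * 1024):
    """Yield gzip-compressed chunks that inflate to `decompressed_size` zero bytes."""
    co = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)  # | 16: gzip framing
    zeros = b"\0" * block
    remaining = decompressed_size
    while remaining > 0:
        n = min(block, remaining)
        yield co.compress(zeros[:n])
        remaining -= n
    yield co.flush()
```

Each yielded chunk goes straight out with `Content-Encoding: gzip`, so memory use stays at roughly one block.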
A zip bomb is way more effective, as it will be transferred very quickly and saturate the attacker's storage as fast as possible, with a good chance of making the system inoperable (logging into a machine with 0 storage left can be challenging).
The other thing to do here is tarpitting: slowly feeding data to keep the bot busy for a long time. Not sure if modern bots are better at timing out than they used to be.
And if you need an explicit binary, it is very easy to make one. Prepare a large enough file filled with the same byte (or generate it on the fly), let gzip compress it, then observe where the zero bytes start to repeat in the output; cut everything after that point and repeat the zero byte from there on. On my system the compressed stream turns into nothing but zero bytes after the first 0x18 bytes, so you can keep those first 0x18 bytes and repeat zero bytes after that, preferably very slowly. (The non-zero bytes in the middle of the output are just an arbitrary block boundary, while the last few bytes are the CRC-32 and the uncompressed size, which the client will never see.) I have also used the `-c` option to avoid storing the original file name in the result, which would otherwise precede the compressed data.
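A rough Python sketch of that procedure, assuming (as observed above) that gzip's output for a run of identical bytes quickly turns into a long run of zero bytes:

```python
import gzip

def bomb_prefix(sample_size=100 * 1024 * 1024):
    """Compress a run of zeros once, then keep only the bytes before the
    compressed stream itself turns into a run of zero bytes."""
    compressed = gzip.compress(b"\0" * sample_size, 9)
    cut = compressed.find(b"\0" * 64)  # start of the first long zero run
    return compressed[:cut]

# Serve bomb_prefix() once, then keep writing b"\0" to the socket as slowly as
# you like: the client's decompressor keeps producing output the whole time.
```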
If I recall correctly, HTTP clients do not need to care about Content-Encoding at all and can choose to just not do anything with your ZIP bomb. To really hit them, you will want to do this at the Transfer-Encoding level.
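For example, with Python's requests (the URL is a placeholder), a client can read the wire bytes without ever inflating them:

```python
import requests

r = requests.get("https://example.com/suspected-bomb", stream=True)
raw = r.raw.read(1024 * 1024, decode_content=False)  # still-compressed wire bytes
print(r.headers.get("Content-Encoding"), len(raw), "bytes on the wire")
```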
I have something similar - except I don't send them ZIP bombs, but I engage in something akin to a Slowloris attack. I keep the connection open and send random bytes with random delays in between.
To my surprise, I got a few "champions" who spent >12h on a socket trying to get data that led nowhere. And since there is ultimately a limit on the number of sockets on a system, you can effectively DDoS that attacker.
This can also be done with nginx's gzip_static/brotli_static, somewhat more easily.
One similar technique against port scanners is to send ~50% rejects and ~50% drops for closed ports. Most of them will pollute their output or slow down tremendously because they assume packet loss.
Another method that stuck with me: in the early days of bitcoin someone built an "ssh paywall" – i.e. you would pay to enable ssh remote authentication for a minute or two.
In essence a hacker would have to pay before attempting to hack the ssh endpoint. Of course the admin would have to pay too, but the money would end up in their own wallet.
Not sure it works on scanners; sure, it crashes browsers, but I would guess most scanners use libcurl, which has a callback function for saving data, and would give up on loading the resource once it exceeds, say, 2 MiB.
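Roughly what such a guard could look like with pycurl (the 2 MiB figure is the one above; the URL is a placeholder):

```python
import pycurl
from io import BytesIO

MAX_BYTES = 2 * 1024 * 1024  # give up past 2 MiB
buf = BytesIO()

def write_cb(chunk):
    if buf.tell() + len(chunk) > MAX_BYTES:
        return 0  # returning a count != len(chunk) makes libcurl abort with a write error
    buf.write(chunk)
    return len(chunk)

c = pycurl.Curl()
c.setopt(pycurl.URL, "https://example.com/scanned-path")
c.setopt(pycurl.ACCEPT_ENCODING, "gzip")  # let libcurl inflate; the cap applies to inflated bytes
c.setopt(pycurl.WRITEFUNCTION, write_cb)
try:
    c.perform()
except pycurl.error:
    print("gave up: response larger than 2 MiB")
finally:
    c.close()
```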
Is it really malicious though? 10 gigs of zeros doesn’t seem that malicious to me. Microcontrollers often have a few megabytes of RAM if not less, does that make a few megabyte photo malicious?
Edit: spelling. I’m old school and used to typing on my computer. It’s getting repaired and all I’ve got is my phone. /rant
If the client supports Brotli, this attack can be made much more effective, because the maximum compression ratio of Brotli is much higher than gzip's. DEFLATE, used by gzip, has a maximum ratio of 1032:1, because each back-reference can emit at most 258 bytes and needs at least 2 bits, but Brotli allows zero bits per code when there is only one possible code. So Brotli's output size only depends on the underlying metablock overhead; sadly, each metablock can contain at most 2^24 - 1 uncompressed bytes (because the uncompressed size is encoded in each metablock, while DEFLATE needs only a separate end-of-block code). Still, this translates to a ratio of at least 1,000,000:1, so it's worthwhile.
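The 1032:1 figure is 258 bytes per back-reference at a minimum of 2 bits each (258 × 8 / 2 = 1032). A quick way to compare the two in practice (Python's gzip module plus the third-party brotli package; the 100 MiB figure is arbitrary):

```python
import gzip
import brotli  # pip install brotli

payload = b"\0" * (100 * 1024 * 1024)  # 100 MiB of zeros
gz = gzip.compress(payload, 9)
br = brotli.compress(payload, quality=11)
print(f"gzip:   {len(gz):8d} bytes  (~{len(payload) // len(gz)}:1)")
print(f"brotli: {len(br):8d} bytes  (~{len(payload) // len(br)}:1)")
```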
Nice, I didn’t even know that ZIP bombs existed! I have seen some huge spam traffic on servers I run in the past so this will be a good tool for me to use. Thanks
... or more like 'How to retaliate when you think too much of your [website's] importance'.
While the ability to (maaaaybe) crash the crawler sounds nice, it probably doesn't do what you think. At best you've snapped off one head of the Hydra; more will come.
Also using PHP instead of the web-server's path actions...