How to defend your website with ZIP bombs (2017) (blog.haschek.at)
122 points by BonoboIO on Jan 10, 2024 | 74 comments



related: https://news.ycombinator.com/item?id=14707674 (July 6, 2017 — 183 comments)


https://42.zip served just that (named after the 42.zip described on https://en.wikipedia.org/wiki/Zip_bomb) until some ****hole reported it to Google/etc. for... phishing? Kinda sad, lol.

One of the arguments I've seen is 'what if your antivirus scans it?', to which I think: if your antivirus blows up on a zip bomb in 2024, you need a new antivirus that isn't total garbage.


This could be the result of a Google indexing error. Strangely enough, my blog (the one linked in this HN post) itself was de-listed from Google just a few weeks ago.

The blog is a static HTML page with no external dependencies, and I didn't even update it during the time Google thought there was phishing somewhere.

The Webmaster Tools showed the error but didn't link to any specific site (it even said null).

So I submitted it for re-evaluation and it was put back on Google (without my changing anything in the static files themselves). Very strange stuff.


People have been coming up with ideas like that regularly. I'm not a fan.

The title says that you can "defend" your webpage, but it is not clear how it "defends" against anything. The only thing you possibly achieve is that every now and then, someone with an automated scanner (which may be an attacker, or may be a security researcher or service) will see his tool crash or consume large amounts of resources.

You're spending time trying to annoy attackers that you should probably just ignore. If you really worry that someone running some automated scanner against your webpage causes you any harm, you probably should spend your time with something different than building zip bombs, and instead fix the security problems you have.


> you probably should spend your time with something different than building zip bombs, and instead fix the security problems you have.

Why not both? You can create a well-configured server to reduce its attack surface and also add a booby-trap or two for really adamant scanners that hit very specific endpoints on your site.

I don't have to welcome every scanner with open arms. Maybe I'm doing some research, PoC||GTFO style, and your scanner found my research. It's not my problem.


> PoC||GTFO style

?


Oh. It's "International Journal of Proof-of-Concept or Get The Fuck Out (PoC||GTFO or PoC or GTFO)", available at https://www.alchemistowl.org/pocorgtfo/


Perhaps that used to be true; however, I believe active countermeasures will grow in importance in the near future. We are on the cusp of seeing a much more sophisticated class of automated attacks (once malicious cyber actors get a grasp on how to use LLMs in their favour), and actively disrupting these capabilities will be important, not just building more shields around yourself. Before that, we need to fast-track the normalization of using active countermeasures, as most people are in the same camp as you are.


Please elaborate, how are LLMs going to provide a "more sophisticated" class of automated attacks, and how are toy countermeasures like ZIP bombs going to defend against these "sophisticated" attacks?

I could believe you if you said that ZIP bombs are perhaps mildly effective against script kiddies using ChatGPT to generate naive, simplistic automated scanners that are susceptible to being "zip bombed". The other way around? Not so much.

These zip bombs are trivial to defend against as an attacker, e.g. by inspecting the payload while you're decompressing it to see if it matches some expected output (an <html> tag, for example) or by aborting after the decompressed size exceeds some limit (10 MB is way more HTML than you'd typically expect, for example).
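Something like this rough Python sketch (the helper name and the 10 MB limit are just illustrative, not taken from any particular scanner):

    import zlib

    MAX_OUT = 10 * 1024 * 1024  # illustrative cap: 10 MB of "HTML" is already suspicious

    def inflate_capped(compressed_chunks, limit=MAX_OUT):
        """Inflate a gzip/zlib-encoded body, aborting once output exceeds `limit` bytes."""
        d = zlib.decompressobj(wbits=47)  # 32 + 15: auto-detect gzip or zlib headers
        out = bytearray()
        for chunk in compressed_chunks:
            # max_length caps how much this call may emit; leftovers land in d.unconsumed_tail
            out += d.decompress(chunk, limit + 1 - len(out))
            if len(out) > limit or d.unconsumed_tail:
                raise ValueError("decompressed size exceeds limit; probably a zip bomb")
        return bytes(out)

Peeking at the first kilobyte for an <html> tag before parsing is the other cheap sanity check mentioned above.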


I didn't say a zip bomb is effective against LLMs; I said active countermeasures in general are.


Do you mind elaborating? How are LLMs going to help with sophisticated attacks, and what kind of countermeasures do you have in mind?


It really is only in the world of IT security that this attitude is so pervasive, where victims constantly get blamed. While a good defense is a great offense, it can't hurt to be an annoyance to bad actors.


Can you tell me where in my words you read that I blamed victims? I think you're reading something into it that isn't there.


You are very insistent that people are doing "defending" wrong, and I can see how someone could read victim blaming into that; the industry does have a bad habit of loud hindsight bias. But it's also entirely beside the point, since no one here is claiming that this is real security. It's just a small nostalgic hack, and it comes off a bit grumpy to be so adamantly against it.


>You are very insistent that people are doing "defending" wrong, and I can see how someone could read victim blaming into it

that's so absolutely ridiculous.

"Hey bud, you're wearing your helmet backwards." "Oh, so it's MY fault when I run into something on my motorcycle, HUH?!"

"Oh ok, well, have fun. I'll be safely over here. "

a zip bomb will only serve to hinder teenage/kid 'hackers' -- next you're going to tell me if I don't implement a zip bomb somewhere that it'll deprive the next generation of hackers of a valuable life lesson -- please.

At what point can we point out a flawed methodology without being accused of being 'the bad guys' ourselves? It gets to the point where, when you see someone making a mistake, you feel like just letting them dive in and do it; otherwise you'll be labeled the world's worst victim-blaming monster. Eh, easier to keep your mouth shut at some point... that's a dangerous condition.


> a zip bomb will only serve to hinder teenage/kid 'hackers'

And that is literally what the blog post says - it will mess with script kiddies who don't change their user agent. The author acknowledges that it is not an actual methodology to protect their server, so pointing out that it's a flawed methodology is a weird flex. I have not seen anyone suggest using zip bombs instead of hardening the server.


From the article: "This script obviously is not - as we say in Austria - the yellow of the egg, but it can defend from script kiddies I mentioned earlier who have no idea that all these tools have parameters to change the user agent.".

Perhaps consider reading the article thoroughly before you start claiming it to be more than what it is.


For real

I definitely blame the "oh, we should do nothing" approach for the current security situation.

Yes, if you want to be the lame duck and get scanned leisurely by the bad actors while diligently serving 404s etc., be my guest.

Then we wonder why email became useless outside of the big providers, why every site needs to be behind a CDN, etc.


unless it paints a target on your back


>You're spending time trying to annoy attackers that you should probably just ignore.

s/ignore/block at firewall-level/

To me this article is of relevance nonetheless because it inspires me to mess with AI trainers by crafting an HTML-page ZIP bomb full of ZIP-bomb-like embedded attachments (stylesheets, images, etc.).


No, ignore.

WAFs that "block" "attacks" that ultimately should just cause a 404 error are part of the same mindset: That you think you "have to do something" about an attack that you should probably just ignore. They're also part of the mindset that security should mean adding more complexity, which is the opposite of what you should do.


I would like to compare this to dealing with unsolicited robocalls. Most would just ignore them, and it's indeed an unwise choice to accept them. But some try to strategically waste the caller's time by only accepting and doing nothing else. I can't say whether it's beneficial or not, but I can say that it is easy enough to do without worrying too much.


Except that you are giving away your voice as training data


Doing nothing includes not saying anything, of course.


No, block at firewall-level. I never implied you should have to "study" or "scan" the attacker's request: just drop it.

You don't owe net-neutrality to a botnet instance.

That's just for when you need to be clever, as in when you are tasked with being a sysadmin for, like, a hosting service or a corporate network, so you're not really aware of everything coming and going through.

If you're just a self-hoster you should not be clever, neither at ingress nor at egress: you "just" have to minimize attack surface, because you probably know what you want to offer. E.g. for a blog you can publish with an SSG, or serve HTML pages scraped from a dynamic CMS you'd like to use, like WordPress, instead of serving the CMS's content directly (and deal with comments using something else).


Sometimes you’ve been aggravated for so long that doing something might be worth it for your own sense of balance.


I use fail2ban for this. I know what my servers run (and it's never PHP, Python, Microsoft/AD, SharePoint, etc., because I don't have services in these languages or systems), so I simply fail2ban everything that matches these broad items.

Access "wp-admin"? Ban. Try "cron.php"? Ban. Looking for "phpmyadmin"? Ban.

It works reasonably well. These bots can switch to other IPs, or try again after the jailtime is lifted (20 mins in my case). But instead of a bot attempting thousands of endpoints, I get only one attempt.

I initially configured this on a backend that, for reasons, had to handle every URL in the (Ruby, so heavy and slow) application layer: a bot trying common endpoints for popular CMSes and services would put severe load on this application. Fail2ban was already there, blocking SSH, FTP, SMTP and whatnot, so I configured it to also handle these HTTP endpoints.
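For illustration, a jail along these lines (filter name, paths and regex are made up for this comment, not my actual config, and syntax details vary between fail2ban versions):

    # /etc/fail2ban/filter.d/http-probes.conf  (hypothetical filter name)
    [Definition]
    failregex = ^<HOST> .* "(GET|POST) /(wp-admin|wp-login\.php|cron\.php|phpmyadmin)

    # /etc/fail2ban/jail.local (excerpt)
    [http-probes]
    enabled  = true
    port     = http,https
    filter   = http-probes
    logpath  = /var/log/nginx/access.log
    # one hit is enough, ban for 20 minutes
    maxretry = 1
    bantime  = 1200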


Fail2ban can also work on ModSecurity logs, banning IPs that trigger a bunch of rules there.

By the way, I have mine set to block for 12 hours, and some still come back. I suppose what I should look into is some sort of progressive extension to the delay.


A progressive jailtime is, indeed, a very good idea.


Slowing down and resisting unauthorised scanners does sound like one more layer of defence to me.


No one is really worried, and no time is wasted when the outcome is occasional fun. There are no security holes to fix; there's just a bunch of gnats poking at your server, and you shoo them away, that's all.


If your goal is "fun" then I guess that's fine. But don't pitch it as "defending" your website.


It is fun to pitch it as defending a website. Now it’s fine


> The only thing you possibly achieve is that every now and then, someone with an automated scanner (which may be an attacker, or may be a security researcher or service) will see his tool crash or consume large amounts of resources.

True. But it's also a low-cost, low-effort thing. It may not change the world, but putting a small hiccup in someone's operation can bring a small bit of joy.


It reminds me of someone I read about from the early web who defended their website against email address harvesting crawlers by adding a dynamic page that threw up some random emails and links, where the links just led back to the dynamic page under a new path on the same host. The crawlers would get stuck downloading millions of fake email addresses until they eventually broke.
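From memory, the trick looked roughly like this (a made-up, stdlib-only Python sketch, not the original code): every response is a page of fake mailto: links plus links back into the same handler under fresh random paths, so a naive harvester never runs out of pages.

    import random
    import string
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def rand(n=8):
        return "".join(random.choices(string.ascii_lowercase, k=n))

    class TrapHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # a pile of fake addresses for the harvester to hoard...
            emails = [f"{rand()}@{rand()}.example" for _ in range(50)]
            # ...and links that only lead deeper into the trap
            links = [f"/{rand(12)}" for _ in range(10)]
            body = "<html><body>"
            body += "".join(f'<a href="mailto:{e}">{e}</a> ' for e in emails)
            body += "".join(f'<a href="{path}">more</a> ' for path in links)
            body += "</body></html>"
            data = body.encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    if __name__ == "__main__":
        HTTPServer(("", 8080), TrapHandler).serve_forever()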



Guilty, I did this on my first project site back in the early 00s. It only worked for a short time before scanners got more sophisticated. It was a fun diversion, but the people who run these scanners adapt quickly.


What if the returned content is just ChatGPT-generated ad hoc nonsense, kept just under the threshold of any nonsense heuristics?


Heh, sounds like the stuff that's coming up on Google's front page for some results already.


Wait you mean that https://instalopter.com didn't give you accurate information on modern DOM parsing?


This reminds me of the chat bot tool that was deployed to reply to scam emails and waste maximal time of the scammers by seeming like a vulnerable, gullible target. The chat bot would drag out the interaction slowly, wasting as much time as possible for the scammer, all the while posing as a real human.

There are a lot of comments here decrying this as a method to defend your website, but in some senses it's a very smart tactical weapon, especially if deployed widely. There's an often-quoted idea about secure password hashes: you're not making them impossible to crack, only prohibitively expensive to crack, such that the benefits don't outweigh the costs. The same principle applies here -- if the good guys collectively make scanning a very expensive endeavor, the juice is no longer worth the squeeze for the bad guys. Just blocking an IP is cheap for them. Make them bleed a little.


Another possible tactic is to trickle your packets back slowly, say 1 TCP packet per second. Sure, they probably have client-side timeouts, but again, if everyone did this, wouldn't it be a pain to scan for vulns? Every endpoint you hit would last the duration of your timeout.
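A rough sketch of what I mean (illustrative Python only; the port and timings are arbitrary):

    import socket
    import threading
    import time

    def drip(conn):
        """Feed a connected client one byte per second until it gives up."""
        try:
            conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # push each byte out immediately
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n")
            while True:
                conn.sendall(b"x")   # one byte...
                time.sleep(1.0)      # ...per second
        except OSError:
            pass                     # client timed out or dropped the connection
        finally:
            conn.close()

    def main(port=8081):
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))
        srv.listen(64)
        while True:
            conn, _addr = srv.accept()
            threading.Thread(target=drip, args=(conn,), daemon=True).start()

    if __name__ == "__main__":
        main()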


Do modern bots deal with TARPITTING better? This was something that was pretty common years ago when dealing with bots.


This is a great reference. I hadn’t seen this before. Thanks. No idea about the effectiveness now vs then.


Did it have a marked effect on the bots? Dropping *.php calls helped a lot.


I've done this without compression but just sending infinite data. On some days I've sent a TB to a single IP address... Might be an idea to combine this. I'd assume the resulting gzip file here contains a repeating pattern that you can generate on the fly?


A zip bomb is way more effective, as it will be transferred very quickly and saturate attacker storage as fast as possible, with a good probability of making the system inoperable (logging into a machine with 0 storage left can be challenging).


Was your hosting OK with that? ZIP bombs are light on the "data transferred" front, a TB is pretty massive.


The other thing to do here is tarpitting/slow feeding data keeping the bot busy for a long time. Not sure if modern bots are better at timing out than in the past.


I've got a limit of 50TB which I never came even close to. I'm still with them, but the script got lost at some point.


Yes, it would be trivial to make an "endless" gzip stream.
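For example (a Python sketch; chunk size and flushing are arbitrary choices, nothing here is from the article):

    import zlib

    def endless_gzip(chunk_size=64 * 1024):
        """Yield gzip-compressed chunks of an infinite stream of zero bytes."""
        # wbits = 16 + MAX_WBITS selects the gzip container instead of raw zlib
        c = zlib.compressobj(9, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
        zeros = b"\0" * chunk_size
        while True:
            out = c.compress(zeros)
            if out:                           # zlib buffers internally, not every call emits data
                yield out
            out = c.flush(zlib.Z_SYNC_FLUSH)  # force a few bytes out so the client keeps reading
            if out:
                yield out

Each 64 KB of zeros only costs a handful of compressed bytes on the wire, so you can keep a client inflating for as long as it is willing to read.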


And if you need an explicit binary, it is very easy to make one. Prepare a large enough file filled with the same byte (or generate it on the fly), let gzip compress it, then observe where the zero bytes start to repeat. Cut everything after that point and repeat the zero byte from there on. On my system:

    $ head -c 10000000 /dev/zero | gzip -c -9 | hexdump
    0000000 8b1f 0008 530a 659e 0302 c1ec 0101 0000
    0000010 8000 fe90 eeaf 0a08 0000 0000 0000 0000
    0000020 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0002010 0000 0000 0000 db80 0383 0012 0000 4100
    0002020 5fff 23b7 0150 0000 0000 0000 0000 0000
    0002030 0000 0000 0000 0000 0000 0000 0000 0000
    *
    00025f0 0000 0000 0000 0000 0000 0000 0000 e000
    0002600 cb26 3ba5 803e 9896 0000
    0002609
So you can keep the first 0x18 bytes and repeat zero bytes after that, preferably very slowly. (The middle bit is an arbitrary block boundary, while the last bit is the CRC-32 and the uncompressed size, which the client will never see.) I have also used the `-c` option to avoid the original file name in the result, which would otherwise precede the compressed data.


Back in the day if you had a faster modem and typed faster than the other person’s modem could receive, you could kick them offline.

The more things change, the more they stay the same?


If I recall correctly, HTTP clients do not need to care about Content-Encoding at all and can choose to just not do anything with your ZIP bomb. To really hit them, you will want to do this at the Transfer-Encoding level.


If the client wants to parse HTML and do something accordingly, it at least has to honor some popular Content-Encoding.


TIL today: ZIP bombs are still a thing, 40 years after the first BBS.


Similar things were popular in the DC++ age too.

Some servers required you to have X gigabytes of stuff in your shares, so you just had a zip bomb in your share with the correct size.

It reported the size as 10GB to the server, but actually it took a few kilobytes on disk.


I have something similar - except I don't send them ZIP bombs, but I engage in something akin to a Slowloris attack. I keep the connection open and send random bytes with random delays in between.

To my surprise, I got a few "champions" who spent >12h on a socket trying to get data that leads nowhere. And since there is ultimately a limit on the number of sockets on a system, you can effectively DDoS that attacker.


This can also be done with nginx's gzip_static/brotli_static, somewhat more easily.

One similar technique against port scanners is to send ~50% rejects and do ~50% drops in the case of closed ports. Most of them will pollute their output or slow down tremendously assuming packet loss.


Another method that stuck with me: in the early days of bitcoin someone built an "ssh paywall" – i.e. you would pay to enable ssh remote authentication for a minute or two.

In essence a hacker would have to pay before attempting to hack the ssh endpoint. Of course the admin would have to pay too, but the money would end up in his/her wallet.

Quite ingenious if you ask me.


Stuff like this is going to be wild when digital transaction fees for tiny amounts are basically free and transactions can be truly anonymous.


Not sure it works on the scanners; sure, it crashes the browsers, but I would guess most scanners use libcurl, which has a callback function for saving data and would give up on loading the resource if it's over, let's say, 2 MiB.


I tried the example with FF on Ubuntu, and it had no effect whatsoever. It didn't even use up any memory. Are browsers protecting against this now?


In the spirit of a good offence is the best defence, etc., I wonder if there are any other ways to defend my website.


> This script obviously is not - as we say in Austria - the yellow of the egg, [...]

I think my pig is whistling!


Is it legal to purposefully distribute a malicious payload as a booby trap?


Is it really malicious though? 10 gigs of zeros doesn't seem that malicious to me. Microcontrollers often have a few megabytes of RAM, if not less; does that make a few-megabyte photo malicious?

Edit: spelling. I’m old school and used to typing on my computer. It’s getting repaired and all I’ve got is my phone. /rant


It's a joke answer at best; plus, if someone were to complain, their actions in discovering the booby trap were illegal anyway.



I don't think that a zip bomb counts as "malicious". I can't think of a law (in the US) that would prohibit doing this sort of thing.


never link to it and add it to robots.txt

digital castle doctrine or whatever.


If the client supports Brotli, this attack can be made much more effective, because the maximum compression ratio of Brotli is much higher than gzip's. The DEFLATE format used by gzip has a maximum ratio of 1032:1, because each run can emit at most 258 bytes and needs at least 2 bits, but Brotli allows zero bits per code when there is only one code possible. So Brotli's result depends only on the underlying metablock overhead; sadly, each metablock can contain at most 2^24 - 1 uncompressed bytes (because that uncompressed size is encoded per metablock, while DEFLATE needs a separate end-of-block code). Still, this translates to at least a 1,000,000:1 ratio, so it's worthwhile.
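A quick, unscientific way to see the difference (this assumes the third-party Python `brotli` package; exact numbers depend on library versions):

    import gzip
    import brotli  # pip install brotli

    payload = b"\0" * (100 * 1024 * 1024)   # 100 MB of zeros; quality 11 takes a few seconds

    gz = gzip.compress(payload, compresslevel=9)
    br = brotli.compress(payload, quality=11)

    print(f"gzip:   {len(gz):>9} bytes  (~{len(payload) // len(gz)}:1)")
    print(f"brotli: {len(br):>9} bytes  (~{len(payload) // len(br)}:1)")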


Nice, I didn’t even know that ZIP bombs existed! I have seen some huge spam traffic on servers I run in the past so this will be a good tool for me to use. Thanks


> How to defend your website with ZIP bombs

.. or more like 'How to retaliate when you think too much of your [website's] importance'.

While the ability to (maaaaybe) crash the crawler sounds nice, it probably doesn't do what you think it does. At best you just snapped off one head of the Hydra, with more to come.

Also using PHP instead of the web-server's path actions...



