In general, this theoretical problem is called the Kolmogorov Complexity of a string: the size of the smallest program that outputs a the input string, for some definition of "program", e.g., an initial input tape for a given universal turing machine. Unfortunately, Kolmogorov Complexity in general is incomputable, because of the halting problem.
But a gzip decompressor is not turing-complete, and there are no gzip streams that will expand to infinitely large outputs, so it is theoretically possible to find the pseudo-Kolmogorov-Complexity of a string for a given decompressor program by the following algorithm:
Let file.bin be a file containing the input byte sequence.
1. BOUNDS=$(gzip --best -c file.bin | wc -c)
2. LENGTH=1
3. If LENGTH==BOUNDS, run `gzip --best -o test.bin.gz file.bin` and HALT.
4. Generate a file `test.bin.gz` LENGTH bytes long containing all zero bits.
5. Run `gunzip -k test.bin.gz`.
6. If `test.bin` equals `file.bin`, halt.
7. If `test.bin.gz` contains only 1 bits, increment LENGTH and GOTO 3.
8. Replace test.bin.gz with its lexicographic successor by interpreting it as a LENGTH-byte unsigned integer and incrementing it by 1.
9. GOTO 5.
test.bin.gz contains your minimal gzip encoding.
There are "stronger" compressors for popular compression libraries like zlib that outperform the "best" options available, but none of them are this exhaustive because you can surely see how the problem rapidly becomes intractable.
For the purposes of generating an efficient zip bomb, though, it doesn't really matter what the exact contents of the output file are. If your goal is simply to get the best compression ratio, you could enumerate all possible files with that algorithm (up to the bounds established by compressing all zeroes to reach your target decompressed size, which makes a good starting point) and then just check for a decompressed length that meets or exceeds the target size.
I think I'll do that. I'll leave it running for a couple days and see if I can generate a neat zip bomb that beats compressing a stream of zeroes. I'm expecting the answer is "no, the search space is far too large."
I'm an idiot, of course the search space is too large. It outgrows what I can brute force by the heat death of the universe by the time it gets to 16 bytes, even if the "test" is a no-op.
I would need to selectively generate grammatically valid zstd streams for this to be tractable at all.
But a gzip decompressor is not turing-complete, and there are no gzip streams that will expand to infinitely large outputs, so it is theoretically possible to find the pseudo-Kolmogorov-Complexity of a string for a given decompressor program by the following algorithm:
Let file.bin be a file containing the input byte sequence.
1. BOUNDS=$(gzip --best -c file.bin | wc -c)
2. LENGTH=1
3. If LENGTH==BOUNDS, run `gzip --best -o test.bin.gz file.bin` and HALT.
4. Generate a file `test.bin.gz` LENGTH bytes long containing all zero bits.
5. Run `gunzip -k test.bin.gz`.
6. If `test.bin` equals `file.bin`, halt.
7. If `test.bin.gz` contains only 1 bits, increment LENGTH and GOTO 3.
8. Replace test.bin.gz with its lexicographic successor by interpreting it as a LENGTH-byte unsigned integer and incrementing it by 1.
9. GOTO 5.
test.bin.gz contains your minimal gzip encoding.
There are "stronger" compressors for popular compression libraries like zlib that outperform the "best" options available, but none of them are this exhaustive because you can surely see how the problem rapidly becomes intractable.
For the purposes of generating an efficient zip bomb, though, it doesn't really matter what the exact contents of the output file are. If your goal is simply to get the best compression ratio, you could enumerate all possible files with that algorithm (up to the bounds established by compressing all zeroes to reach your target decompressed size, which makes a good starting point) and then just check for a decompressed length that meets or exceeds the target size.
I think I'll do that. I'll leave it running for a couple days and see if I can generate a neat zip bomb that beats compressing a stream of zeroes. I'm expecting the answer is "no, the search space is far too large."