Hacker News | duskwuff's comments

> Not sure about legalities, but might be cool if Cloudflare or Amazon could help in keeping the resources themselves up

Definitely not happening. All of the commercial games in the collection are still copyrighted; some are still being sold as remasters/remakes.


> zstd does everything in frames and everything in those frames can be decompressed separately (so you can seek and decompress parts). Bzip2 doesn’t do that.

This isn't accurate.

1) Most zstd streams consist of a single frame. The compressor only creates multiple frames if specifically directed to do so.

2) bzip2 blocks, by contrast, are fully independent - by default, the compressor works on 900 kB blocks of input, and each one is stored with no interdependencies between blocks. (However, software support for seeking within the archive is practically nonexistent.)
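
To make point 2 concrete, here is a rough sketch that scans a bzip2 stream for the 48-bit block-header magic (0x314159265359). Blocks are bit-aligned rather than byte-aligned, so it checks all eight bit offsets. It only locates block starts and does not decode blocks individually; the helper name `find_bzip2_blocks` and the 2 MB random payload are made up for illustration.

```python
import bz2
import random

BLOCK_MAGIC = bytes.fromhex("314159265359")  # bzip2 block-header magic


def find_bzip2_blocks(stream: bytes) -> list[int]:
    """Return the bit offsets of every block-header magic found in `stream`."""
    as_int = int.from_bytes(stream, "big")
    offsets = []
    for shift in range(8):
        # Shift the whole stream left so bit-aligned headers become byte-aligned.
        shifted = (as_int << shift).to_bytes(len(stream) + 1, "big")
        pos = shifted.find(BLOCK_MAGIC)
        while pos != -1:
            # Undo the one-byte padding and the shift to get the original bit offset.
            offsets.append(8 * pos - 8 + shift)
            pos = shifted.find(BLOCK_MAGIC, pos + 1)
    return sorted(offsets)


# Incompressible input, so each block holds close to the full 900 kB.
payload = random.Random(0).getrandbits(8 * 2_000_000).to_bytes(2_000_000, "big")
stream = bz2.compress(payload, compresslevel=9)  # level 9 selects 900 kB blocks
block_offsets = find_bzip2_blocks(stream)
print(len(block_offsets), "block headers; first at bit", block_offsets[0])
```

A 48-bit pattern could in principle show up by chance inside compressed data, so treat the count as an estimate rather than a parser.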


The biggest savings for a service like GMail are going to be based around deduplication - e.g. if you can recognize that a newsletter went out to a thousand subscribers and store those all as deltas from a "canonical" copy - congratulations, that's >1000:1 compression, better than you could achieve with any general-purpose compression. Similarly, if you can recognize that an email is an Amazon shipping confirmation or a Facebook message notification or some other commonly repeated "form letter", you can achieve huge savings by factoring out all the common elements in them, like images or stylesheets.
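
As a small-scale illustration of the "delta from a canonical copy" idea, zlib's preset-dictionary support can stand in for a real delta encoder. This is only a sketch under assumptions: the newsletter text and addresses are invented, the helper names are hypothetical, and deflate can only reference the last 32 KB of the dictionary, so it says nothing about how a production mail store would do this.

```python
import zlib


def compress_against(canonical: bytes, message: bytes) -> bytes:
    """Deflate `message` using `canonical` as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=canonical)
    return compressor.compress(message) + compressor.flush()


def decompress_against(canonical: bytes, delta: bytes) -> bytes:
    """Recover the message; the same canonical copy must be supplied."""
    decompressor = zlib.decompressobj(zdict=canonical)
    return decompressor.decompress(delta) + decompressor.flush()


# Hypothetical newsletter: identical body, per-recipient header.
body = "\n".join(f"Item {i}: now {i * 37 % 1000} credits" for i in range(300)).encode()
canonical = b"To: subscriber@example.com\n\n" + body
message = b"To: alice@example.com\n\n" + body

delta = compress_against(canonical, message)
standalone = zlib.compress(message, 9)
restored = decompress_against(canonical, delta)
print(len(message), len(standalone), len(delta))
```

The three printed sizes are the raw message, the message compressed on its own, and the message compressed against the canonical copy.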

I kind of doubt they would do this to be honest. Every near-copy of a message is going to have small differences in at least the envelope (not sure if encoding differences are also possible depending on the server), and possibly going to be under different guarantees or jurisdictions. And it would just take one mistake to screw things up and leak data from one person to another. All for saving a few gigabytes over an account's lifetime. Doesn't really seem worth it, does it?

That's why you'd store a base and a delta. Whereas PP was talking about a general compression algorithm, my question was different.

In line with the original comment, I was asking about specialized "codecs" for gmail.

Humans do not read the same email many times. That makes it a good target for compression. I believe machines do read the same email many times, but that could be architected around.


Yes.

These and other email specific redundancies ought to be covered by any specialized compression scheme. Also note, a lot of standard compression is deduplication. Fundamentally they are not that different.

Given that one needs to support deletes, this will end up looking like a garbage collected deduplication file system.
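
A toy version of that idea, assuming plain reference counting as the "garbage collection" and SHA-256 as the content address. The class and method names are hypothetical, and a real system would also need persistence, concurrency control, and per-user isolation.

```python
import hashlib


class DedupStore:
    """Content-addressed blob store with reference counting (illustrative only)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}
        self._refs: dict[str, int] = {}

    def put(self, data: bytes) -> str:
        """Store `data` once per distinct content and bump its reference count."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = data
        self._refs[key] = self._refs.get(key, 0) + 1
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def delete(self, key: str) -> None:
        """Drop one reference; free the blob when nobody points at it anymore."""
        self._refs[key] -= 1
        if self._refs[key] == 0:
            del self._refs[key]
            del self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = DedupStore()
alice_key = store.put(b"shared newsletter body")
bob_key = store.put(b"shared newsletter body")  # same content, same key
store.delete(alice_key)                         # Bob's copy must survive this
still_there = store.get(bob_key)
store.delete(bob_key)                           # last reference: blob is freed
```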


Also potentially relevant: in the 00s, the performance gap between gzip and bzip2 wasn't quite as wide - gzip has benefited far more from modern CPU optimizations - and slow networks / small disks made a higher compression ratio more valuable.

One other redeeming quality that gzip/deflate does have is its low memory requirements (~32 KB per stream). If you're running on an embedded device, or if you're serving a ton of compressed streams at the same time, this can be a meaningful benefit.

bzip2 is particularly slow because the transform it depends on (the BWT, or Burrows-Wheeler transform) is "intrinsically slow" - it depends on cache-unfriendly operations with long dependency chains, preventing the CPU from extracting any parallelism:

https://cbloomrants.blogspot.com/2021/03/faster-inverse-bwt....
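
A toy sketch of the transform and its inverse shows where the dependency chain comes from. This is the naive textbook version (sorting full rotations), not bzip2's implementation; the function names and the `primary` index convention are choices made for the example.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive forward BWT: last column of the sorted rotations, plus the
    row index at which the original string appears."""
    n = len(data)
    if n == 0:
        return b"", 0
    rotations = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in rotations)
    return last, rotations.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Inverse BWT via the LF mapping, rebuilding the text back to front."""
    n = len(last)
    # A stable sort of the last column yields the first column; lf[i] is the
    # row in the first column that holds the same character occurrence.
    order = sorted(range(n), key=lambda i: last[i])
    lf = [0] * n
    for first_pos, last_idx in enumerate(order):
        lf[last_idx] = first_pos
    out = bytearray(n)
    i = primary
    for k in range(n - 1, -1, -1):
        out[k] = last[i]
        # Each step needs the result of the previous lookup, and the index
        # jumps around the array: the serial, cache-unfriendly chain.
        i = lf[i]
    return bytes(out)


transformed, primary = bwt(b"banana")
roundtrip = inverse_bwt(transformed, primary)
```

The `i = lf[i]` line is the part that resists parallelism: the next address is unknown until the current load completes.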


> "Gatcha" type games

Typically spelled "gacha", although I have to admit that "gotcha" seems apt.


From ガチャ; the "t" is not really there in the Japanese pronunciation, although it is used for transliteration of English words with T like チケット (chiketto, from ticket)

> Do Mac users check and report their SSD wear anywhere?

As a data point: I got a 14" MacBook Pro with a 512 GB SSD the first day it was available in 2021, and I've used it daily since then.

According to the SMART data ("smartctl -x /dev/disk0"), the SSD "percentage used" is 7%, with ~200 TBW. At this rate, the laptop will probably outlive me.
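
For anyone who wants to redo the arithmetic with their own numbers, a back-of-the-envelope sketch follows. It assumes wear grows linearly with time, which is a simplification, and the 4 years in service is an assumed figure, not something stated above; the function name is made up.

```python
def projected_ssd_lifetime_years(percent_used: float, years_in_service: float) -> float:
    """Linearly extrapolate total lifetime from the SMART 'percentage used' value."""
    if percent_used <= 0:
        raise ValueError("percent_used must be positive")
    return years_in_service * 100.0 / percent_used


# 7% is the reported figure; 4.0 years is an assumption for illustration.
estimate = projected_ssd_lifetime_years(7, 4.0)
print(round(estimate, 1), "years at the current rate")
```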


I mean, in principle you already can: https://getutm.app/

It doesn't work well - probably not at all for a modern version of Windows - but the tools exist.


You mean the "candles that drop metal pins every hour" mentioned in the subtitle?
