Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I'm not entirely understanding the motivation for this. The most naive method would be to just UCI move notation, that's simply square1-square2. Eg: e2e4 e7e5 g1f3, etc. That's 4 bytes per half move or ply (5 bytes for the once in a blue moon promotion). At an average of 80 ply per game, that's 320 bytes per game notation.

So if we have a billion games that'd just be 320GB. And as you mention converting this to binary yields 6 bits per ply, for a fraction of that. In other words you could naively encode every single game that's ever been played online, offline, or anywhere for a small fraction of the space on a regular large consumer HDD and still have plenty of space left for centuries of games to come. 20TB RAID kits go for less than $1k now a days. And you could load critical (likely to be accessed multiple times and/or concurrently) games (every game ever played by a very strong player, all games within the past year, etc) on a single SSD for rapid access.

Of course saving space is much better than not saving space, but the proposed method does not come for free. It involves some fairly complex heuristics. That means you're trading space for computational complexity. The former is practically free for things on this scale, but the latter very much is not. The article mentions that their decoder was only able to process around 1,111 games per second on average.

Lichess has been pushing around 40k players at some times and will presumably continue to grow. Performance this low on a critical and invokable task also seems to expose a clear denial-of-service vector. And adding new servers is vastly more expensive than adding some new hard disks. By contrast encoding and decoding games in UCI is practically free.

So what is the motivation here?



Agreed, I don't see the practical benefit here either, it's mostly interesting to me from an academic point of view. But we are after all in a thread about something with even less crowded practical value, hiding messages in constructed chess games.


Writing chess software as a hobby is relatively popular. And many people try to optimize as much as possible even when it's not strictly necessary. That's probably motivation enough - although lichess as a business probably doesn't have the same incentives as a hobbyist would.


Sure I completely get that, but what the core issue I was hitting on is that you end up trading an expensive resource (computing power) for a very cheap resource (disk space). In cases where disk space requirements become enormous this can be a necessary trade, but here that's not really an issue. I enjoy not strictly necessary optimization as much as anybody, but unless I'm missing something this optimization seems that it would ultimately result in a less performant system with very little to show for it so it wouldn't be particularly fair to call it an optimization.


Ah yes, but this way you get to write compression algorithms and do performance optimization. The journey is more important than the destination.

Edit: Also (in Lichess' case) you're not trading CPU usage for disk space, you're trading CPU usage for disk space and network traffic, which you're probably paying much more for.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: