Hacker News | buzzm's comments

Wonderful. I don’t particularly care if it is or is not a valid test. I like the “wrong” renderings better. Some are hilarious, some … inspired.


Of late I have grown fond of ... brace yourself ... RDF in turtle format for config files. Supports comments, multiline literals, built-in support for references, and rich typing via xsd casting when necessary e.g. ex:subject ex:createdOn "2002-01-24T12:00:00.000Z"^^xsd:dateTime ; You can have no namespace (:subject :name "foo") or one or more to help separate metadata and structure (the config structure) from actual config data (ex:myInstance config:logfile "pathname to file"). And of course, all the metadata itself is labelable and can carry comments and descriptions so the config is essentially self-documenting. Or at least there is a standard, straightforward way to extract and organize the labels and comments if time has been taken to add them.


Game in BGN (https://moschetti.org/rants/bgnchess.html)

{"moves":[{"p":"P","f":"d2","t":"d4"},{"p":"n","f":"g8","t":"f6"},{"p":"P","f":"c2","t":"c4"},{"p":"p","f":"e7","t":"e6"},{"p":"N","f":"g1","t":"f3"},{"p":"p","f":"d7","t":"d5"},{"p":"N","f":"b1","t":"c3"},{"p":"b","f":"f8","t":"b4"},{"p":"P","f":"e2","t":"e3"},{"p":"k","f":"e8","t":"g8","castle":"K"},{"p":"B","f":"f1","t":"d3"},{"p":"p","f":"c7","t":"c5"},{"p":"K","f":"e1","t":"g1","castle":"K"},{"p":"n","f":"b8","t":"c6"},{"p":"P","f":"a2","t":"a3"},{"p":"b","f":"b4","t":"a5"},{"p":"N","f":"c3","t":"e2"},{"p":"p","f":"d5","t":"c4","x":"P"},{"p":"B","f":"d3","t":"c4","x":"p"},{"p":"b","f":"a5","t":"b6"},{"p":"P","f":"d4","t":"c5","x":"p"},{"p":"q","f":"d8","t":"d1","x":"Q"},{"p":"R","f":"f1","t":"d1","x":"q"},{"p":"b","f":"b6","t":"c5","x":"P"},{"p":"P","f":"b2","t":"b4"},{"p":"b","f":"c5","t":"e7"},{"p":"B","f":"c1","t":"b2"},{"p":"b","f":"c8","t":"d7"},{"p":"R","f":"a1","t":"c1"},{"p":"r","f":"f8","t":"d8"},{"p":"N","f":"e2","t":"d4"},{"p":"n","f":"c6","t":"d4","x":"N"},{"p":"N","f":"f3","t":"d4","x":"n"},{"p":"b","f":"d7","t":"a4"},{"p":"B","f":"c4","t":"b3"},{"p":"b","f":"a4","t":"b3","x":"B"},{"p":"N","f":"d4","t":"b3","x":"b"},{"p":"r","f":"d8","t":"d1","x":"R","c":1},{"p":"R","f":"c1","t":"d1","x":"r"},{"p":"r","f":"a8","t":"c8"},{"p":"K","f":"g1","t":"f1"},{"p":"k","f":"g8","t":"f8"},{"p":"K","f":"f1","t":"e2"},{"p":"n","f":"f6","t":"e4"},{"p":"R","f":"d1","t":"c1"},{"p":"r","f":"c8","t":"c1","x":"R"},{"p":"B","f":"b2","t":"c1","x":"r"},{"p":"p","f":"f7","t":"f6"},{"p":"N","f":"b3","t":"a5"},{"p":"n","f":"e4","t":"d6"},{"p":"K","f":"e2","t":"d3"},{"p":"b","f":"e7","t":"d8"},{"p":"N","f":"a5","t":"c4"},{"p":"b","f":"d8","t":"c7"},{"p":"N","f":"c4","t":"d6","x":"n"},{"p":"b","f":"c7","t":"d6","x":"N"},{"p":"P","f":"b4","t":"b5"},{"p":"b","f":"d6","t":"h2","x":"P"},{"p":"P","f":"g2","t":"g3"},{"p":"p","f":"h7","t":"h5"},{"p":"K","f":"d3","t":"e2"},{"p":"p","f":"h5","t":"h4"},{"p":"K","f":"e2","t":"f3"},{"p":"k","f":"f8","t":"e7"},{"p":"K","f":"f3","t":"g2"},{"
p":"p","f":"h4","t":"g3","x":"P"},{"p":"P","f":"f2","t":"g3","x":"p"},{"p":"b","f":"h2","t":"g3","x":"P"},{"p":"K","f":"g2","t":"g3","x":"b"},{"p":"k","f":"e7","t":"d6"},{"p":"P","f":"a3","t":"a4"},{"p":"k","f":"d6","t":"d5"},{"p":"B","f":"c1","t":"a3"},{"p":"k","f":"d5","t":"e4"},{"p":"B","f":"a3","t":"c5"},{"p":"p","f":"a7","t":"a6"},{"p":"P","f":"b5","t":"b6"},{"p":"p","f":"f6","t":"f5"},{"p":"K","f":"g3","t":"h4"},{"p":"p","f":"f5","t":"f4"},{"p":"P","f":"e3","t":"f4","x":"p"},{"p":"k","f":"e4","t":"f4","x":"P"},{"p":"K","f":"h4","t":"h5"},{"p":"k","f":"f4","t":"f5"},{"p":"B","f":"c5","t":"e3"},{"p":"k","f":"f5","t":"e4"},{"p":"B","f":"e3","t":"f2"},{"p":"k","f":"e4","t":"f5"},{"p":"B","f":"f2","t":"h4"},{"p":"p","f":"e6","t":"e5"},{"p":"B","f":"h4","t":"g5"},{"p":"p","f":"e5","t":"e4"},{"p":"B","f":"g5","t":"e3"},{"p":"k","f":"f5","t":"f6"},{"p":"K","f":"h5","t":"g4"},{"p":"k","f":"f6","t":"e5"},{"p":"K","f":"g4","t":"g5"},{"p":"k","f":"e5","t":"d5"},{"p":"K","f":"g5","t":"f5"},{"p":"p","f":"a6","t":"a5"},{"p":"B","f":"e3","t":"f2"},{"p":"p","f":"g7","t":"g5"},{"p":"K","f":"f5","t":"g5","x":"p"},{"p":"k","f":"d5","t":"c4"},{"p":"K","f":"g5","t":"f5"},{"p":"k","f":"c4","t":"b4"},{"p":"K","f":"f5","t":"e4","x":"p"},{"p":"k","f":"b4","t":"a4","x":"P"},{"p":"K","f":"e4","t":"d5"},{"p":"k","f":"a4","t":"b5"},{"p":"K","f":"d5","t":"d6"}],"opening":{"ECO":"E56"},"site":"Reykjavik ISL","event":"Spassky-Fischer World Championship Match","result":"W","type":"C","players":[{"handle":{"domain":"UNK","value":"Boris Spassky"}},{"handle":{"domain":"UNK","value":"Robert James Fischer"}}]}


Why replace PGN, which any decent chess player can simply read and follow (1 e4 e5, 2 Nf3 Nf6…), or if needed play out on a chessboard, with such a huge, unparseable monstrosity?

PGN [1] already supports all the features of BGN. The one complaint, the lack of long algebraic notation, can be handled by adding it as a comment, which keeps the simple readability of PGN. And then it works with the extensive ecosystem of existing tools.

[1] https://en.wikipedia.org/wiki/Portable_Game_Notation


Well... it's actually more parseable. Human readability is not the primary goal of BGN. The goal is precise capture of information, esp. things that end up in PGN comments like %eval and %clk.


Again, if you need long algebraic, a single comment in a PGN stores it, is smaller than your format, and is still readable. And it could use a shorter, faster to parse, more cache friendly format instead of inserting a soup of useless braces, colons, quote marks, and other clutter that simply makes BGN a complete mess to parse.

There may be a reason for zero downloads.


:-) You visited the site so props for that. But I find it odd you call the soup of braces, colons, and quotes a complete mess. That's JSON. As in industry standard everyone-uses-it JSON. If you're not familiar with the space I can see why PGN has more human readable appeal.


I've written some of the high perf JSON code used in industry, so I am well aware of it. A main reason companies (and me, and many others) work so hard on making high speed readers is to get them within a fraction of the speed of reading many native formats.

For example, converting a 32 bit float to text for JSON, then parsing it back in later, is orders of magnitude slower than simply using the binary one. And in many languages it's not bit exact, so it creates a giant mess. I've worked on getting many projects to roundtrip floats correctly because the JSON layer was breaking them (hint: C++ has precisely one corner-case, rarely used format/parse that promises to roundtrip, and even that is not guaranteed across compilers or systems or versions - only within one compiler version. To do better you import your own Grisu or successor and implement tests to catch system errors.... or switch to hex format, which is much nicer). So yes, I have done tons of JSON, but I mostly get called to fix/remove/mitigate JSON, since groups and people know I am the guy that fixes idiot-added JSON where it breaks things.
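A minimal Python sketch of the roundtrip point above, assuming IEEE 754 binary32 values. It shows one value where a short decimal rendering loses bits, while 9 significant digits or the hex form comes back bit-exact. The helper names are made up for illustration, and nothing here describes any particular C++ or JSON library.

```python
import struct

def f32_bits(x: float) -> int:
    """Return the raw 32-bit pattern of x after rounding to binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = to_f32(1 / 3)              # a 32-bit float with a long decimal expansion

short_text = f"{value:.6g}"        # 6 significant digits, as a careless writer might emit
safe_text = f"{value:.9g}"         # 9 significant digits
hex_text = value.hex()             # exact hexadecimal representation

# Parse each text form back and compare bit patterns with the original.
short_ok = f32_bits(float(short_text)) == f32_bits(value)
safe_ok = f32_bits(float(safe_text)) == f32_bits(value)
hex_ok = f32_bits(float.fromhex(hex_text)) == f32_bits(value)
```

For this value the 6-digit text does not come back to the same bits, while the 9-digit and hex forms do.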

Next, JSON is also an order of magnitude bigger than many other formats. There's good reason AI models are not JSON, video is not JSON, etc. JSON slows down major pipelines because of the size increases and slowness of formatting and parsing compared to simply using a smarter format. Junior engineers love JSON because they don't understand the pros/cons and can jam it everywhere. Many projects I've been called in to speed up were slow due to "JSON all the things!".

Next, I've also written high performance chess engines (and all sorts of custom searching engines to do stuff like find proof games, do various things related to retrograde analysis, and construct richer data for endgame tables for my own analysis, some published...), with all the bells and whistles and bit tricks and used things like Z3 to invent better magic bitboard constants, which also improved state of the art there. So I also know that space.

Naming your format BGN for "Better Game" Notation (?) to replace PGN, when your format breaks pretty much everything PGN was designed to be used for, and is widely used for, is simply ludicrous. It's not a better game notation - it's an unusable game notation. You want it to be a searchable, queryable format, of which many already exist, yet again yours is going to be orders of magnitude slower since you chose JSON, replacing at most bytes of chess move info with variable length records of 25+ to ~40 bytes each (thus ensuring you cannot random access information but must parse the 1000+ characters before move 50 to see move 50...). Your format is simply a terrible data format, solving nothing.

And most big engines and certainly all big chess database software support all the major formats.

Did you look at either the motivation for PGN or at the existing chess database formats that support queries before you made this? Or in the intervening 5 years?

So I don't think you have improved anything over the previous technology. There's a reason you have zero downloads over 5 years.

Here's a hint: if you're going to make something claiming to be a better X, then you should reproduce what X was designed for, and then, and only then, add features.

> As in industry standard everyone-uses-it JSON

That's PGN. As in industry standard everyone-uses-it PGN. If you're not familiar with the space I can see why JSON has more human unreadable appeal.

Good luck getting to 1 download.


UGN -- unusable game notation. That has a ring to it.

UGN was designed with Spark and AWS S3 pbzip2-aware threading JSON readers in mind. For an analysis of a year's worth of lichess games -- over a billion, maybe more -- I believe UGN is a good solution. Try bucketing clock time vs. blunders or castling analysis or opening vs. ELO on 1 billion PGNs. Not simple search; aggregation. In the end, UGN is an information architecture; JSON is used as a convenient implementation/representation but it happens to work well with Spark and S3. And you can reliably use jq or even grep on the same UGN files. I didn't feel the need to create yet another binary format.
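To make the "aggregation, not search" point concrete, here is a small Python sketch that counts castling moves and captures per opening. It uses only field names visible in the sample game earlier in the thread ("moves", "castle", "x", "opening", "ECO"); the one-game-per-line layout, the tiny sample data, and the function name are assumptions for illustration, and a real run at this scale would use Spark rather than a single loop.

```python
import json
from collections import Counter

# Hypothetical newline-delimited input: one game object per line.
SAMPLE = "\n".join([
    json.dumps({"opening": {"ECO": "E56"}, "moves": [
        {"p": "P", "f": "d2", "t": "d4"},
        {"p": "k", "f": "e8", "t": "g8", "castle": "K"},
        {"p": "K", "f": "e1", "t": "g1", "castle": "K"},
        {"p": "p", "f": "d5", "t": "c4", "x": "P"},
    ]}),
    json.dumps({"opening": {"ECO": "B20"}, "moves": [
        {"p": "P", "f": "e2", "t": "e4"},
        {"p": "p", "f": "c7", "t": "c5"},
    ]}),
])

def aggregate(lines):
    """Count castling moves and captures, keyed by ECO code."""
    castles, captures = Counter(), Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        game = json.loads(line)
        eco = game.get("opening", {}).get("ECO", "unknown")
        for move in game.get("moves", []):
            if "castle" in move:
                castles[eco] += 1
            if "x" in move:
                captures[eco] += 1
    return castles, captures

castles, captures = aggregate(SAMPLE.splitlines())
```

The same per-game function could be handed to a distributed map step; only the outer loop would change.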

Anyhoo.... We're up to 60 downloads!


> Try bucketing clock time vs. blunders or castling analysis or opening vs. ELO on 1 billion PGNs

No one uses PGN for that. Strawmen are easy to defeat.

Heck, I can crush your example by simply using a proper binary format, fixed length records, and reading flat files directly into memory, all for about 1 hour of coding to transcode your format into something designed for analysis.
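A rough Python sketch of the fixed-length record idea: each move is packed into 2 bytes, so move n sits at byte offset 2n and can be read without parsing anything before it. The bit layout (6 bits from-square, 6 bits to-square, 4 bits unused) and all names are assumptions for illustration, not an existing chess database format.

```python
import struct

FILES = "abcdefgh"
RECORD = struct.Struct("<H")  # one move = one little-endian 16-bit word

def square_index(name: str) -> int:
    """Map 'a1'..'h8' to 0..63."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])

def square_name(index: int) -> str:
    """Map 0..63 back to 'a1'..'h8'."""
    return FILES[index % 8] + str(index // 8 + 1)

def pack_move(frm: str, to: str) -> bytes:
    """Low 6 bits: from-square. Next 6 bits: to-square. Top 4 bits unused."""
    return RECORD.pack(square_index(frm) | (square_index(to) << 6))

def read_move(buf: bytes, n: int) -> tuple:
    """Read move n (0-based) directly at its fixed offset."""
    (word,) = RECORD.unpack_from(buf, n * RECORD.size)
    return square_name(word & 0x3F), square_name((word >> 6) & 0x3F)

# First four moves of the sample game from earlier in the thread.
game = b"".join(pack_move(f, t) for f, t in
                [("d2", "d4"), ("g8", "f6"), ("c2", "c4"), ("e7", "e6")])
third = read_move(game, 2)
```

Promotions, castling flags, or clock data would need the spare bits or a wider record; the sketch only covers the random access argument.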

Better yet, compare your examples to any of the current best chess DB formats. They all support all your query examples and massively more. And they are properly designed, tested, and improved over decades of actual use, by professionals and researchers alike.

People have been doing billion+ chess game analyses for well over a decade. Chess.com added 1 billion games in Feb alone two years ago, and they just passed 100 billion games total in Jan. They regularly run all sorts of analyses on all the games, and provide datasets and services to researchers for similar work.

It's hard to improve on what you do not understand. If you spent some time understanding the space you may be able to improve on it. Until then a lone wolf, unaware of what is in the space, will simply de-invent a worse wheel.


Agreed. Multi-multi billion. It is a fundamental change in the global energy dynamic -- including the fact that D-T reactors still use heat exchange, which means heat is also a useful output product for industrial purposes. MSRs are really good for this but of course are fission, not fusion, reactors.

It borders on a national security issue.


Very good.


Exactly; see pollinators comment above. Although as tech progresses... perhaps we can make black plants as attractive to genetically modified pollinators. Do I sense a slippery slope here?


A fascinating piece that combines technology, precision in micromanufacturing, design aesthetics, economics, politics, and history -- centered on the now all-too-forgettable wristwatch in the smartphone age. I haven't worn a wristwatch since I got a Blackberry about 100 years ago (wink) but it is articles like this that rekindle my appreciation of them.


Adding to the chorus of woes, my legal first name is Paul and middle starts with A and often the ticket comes out as "Paula". I have had at least two heated encounters with gate attendants saying the ticket could not possibly be mine because I don't look like a "Paula".


I got business cards made that politely explain why my appearance may not match their expectations based on my name, and that has been quite effective at defusing such problems.


Agreed! Unexpected and made my morning.


Is there a "hello world" example available that is a highly simplified version of https://qrlew.readthedocs.io/en/latest/tutorials/getting_sta... ? Just 2 tables, 2 fields apiece, with the join bringing private data into the mix?


I wrote a minimal demo here: https://github.com/Qrlew/docs/blob/main/tutorials/minimal.ip...

There is an interactive playground as well at https://qrlew.github.io/dp where you can change things on the left and the rewritten query shows on the right.

I hope it helps.


It does; I got this running easily because fortunately I have current postgres and psycopg2 installed so no tangential problems to interfere with the main event.

So clearly the example shows how I can issue a SELECT statement against the raw data. Is there a concise statement about what the DP'd statement offers me? This isn't access control or raw data obfuscation in the traditional sense. What is the consumable use case here as opposed to, say, the statistically private use?


Imagine you want to open access to your DB to someone - Alice - you do not particularly trust. And you want to make sure Alice cannot learn anything about an individual in the database.

Maybe you will allow only the queries that you assume are safe, let's say aggregation queries. But to prevent GROUP BYs with singleton groups, you will enforce some more constraints. Let's say groups should have a minimum size of 10 elements. But then Alice could run a first aggregation over a group of 100 and the same aggregation over a group of 100 minus Bob; by comparing both responses, Alice can learn things about Bob. Then you will design more involved rules, grow a complex set of rules, eventually add human supervision, and your system will not be scalable.

With DP, you add some noise to a query result so that you have a guarantee that nothing substantial can be learned about Bob.

Imagine you want to open your DB to the outside world in a scalable way. You can design a service that:
- receives SQL queries
- uses Qrlew to rewrite them
- runs the rewritten query on any DB and sends back the result as a response
- accumulates privacy loss in a privacy accountant, so that the amount of private information leaked by the repeated disclosure of DP query results stays bounded.

That's what Sarus does as a company. Qrlew is the DP-SQL core we use.
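As a rough illustration of the noise-plus-accountant idea above, here is a small Python sketch of a noisy count using the Laplace mechanism with a simple additive budget. This is a textbook construction chosen for illustration, not Qrlew's actual rewriting; the class and function names, the epsilon values, and the budget are all assumptions.

```python
import random

class PrivacyAccountant:
    """Tracks cumulative epsilon under basic sequential composition."""
    def __init__(self, budget: float):
        self.budget = budget
        self.spent = 0.0

    def charge(self, epsilon: float) -> None:
        """Refuse the query if it would push total epsilon past the budget."""
        if self.spent + epsilon > self.budget + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(rows, epsilon, accountant, rng):
    """Noisy COUNT(*): adding or removing one row changes the count by at most 1."""
    accountant.charge(epsilon)
    return len(rows) + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)                 # seeded only to make the sketch repeatable
accountant = PrivacyAccountant(budget=1.0)
everyone = list(range(100))            # stand-in for a group of 100 rows
without_bob = everyone[:-1]            # the same group minus one individual

a = dp_count(everyone, 0.5, accountant, rng)
b = dp_count(without_bob, 0.5, accountant, rng)
# a - b is noisy, so the difference no longer pins down Bob's row exactly,
# and the accountant blocks further queries once the budget is spent.
```

Two queries at epsilon 0.5 use up the whole budget of 1.0 here, so a third call to `dp_count` raises instead of leaking more.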

