This makes me think it would be interesting to set LLMs up in a game of Diplomacy, an entirely text-based game in which a degree of backstabbing is a soft rather than a hard requirement for winning.
The finding that the "thinking" model never did any thinking in this game seems odd. Does the model not always show its thinking steps? It seems bizarre that it wouldn't reach for that tool even once, when it must be bombarded with seemingly contradictory information from other players.
Reading further, I'm a little disappointed that the write-up seems to have leaned so heavily on LLMs too, because that detracts from the credibility of the study itself.
Fair point. The core simulation and data collection were done programmatically: 162 games, raw logs, win rates. The analysis of gaslighting phrases and patterns was human-reviewed. I used LLMs to help with the landing page copy, which I should probably disclose more clearly. The underlying data and methodology are solid; you can check them here: https://github.com/lout33/so-long-sucker
There was one much more successful EV, although it too was niche: The UK had "perhaps 40,000 milk floats" in the 1970s and 1980s before supermarkets took over as primary milk distributors. ( https://zavanak.com/transport-topics/british-electric-cv-his... )
You jest, but "times around the Earth" is the actual origin of the meter. Kinda.
The history is quite interesting and well worth checking out.
I can't recommend a book on the subject, but I do heartily recommend "Longitude", which is about the challenges of inventing the first maritime chronometers for the purpose of accurately measuring longitude.
That's kind of my point: ISPs use that max speed in their advertising when it isn't really relevant, especially if you hit your cap in a minute or two.
It is relevant, though. I have 1.2 Gbps down with a 2 TB monthly cap. I've never hit the monthly cap even once, but by your standard I have "1.2 Gbps down for 3 hours, 42 minutes".
But that doesn't change the reality that it matters to me that a 20 GB video that a friend took at my wedding downloads in just 2 minutes rather than the ~30 minutes it would take if I had a 100 Mbps connection.
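For anyone who wants to check the arithmetic in the two comments above, here is a rough sketch. It assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and a link that sustains its full advertised rate with no protocol overhead, which real connections won't quite manage. The helper names `transfer_seconds` and `fmt` are made up for illustration.

```python
# Back-of-the-envelope transfer times, assuming decimal units and
# an ideal link running at its full advertised rate (no overhead).

GB = 10**9      # bytes, decimal gigabyte (assumption)
TB = 10**12     # bytes, decimal terabyte (assumption)
MBPS = 10**6    # bits per second
GBPS = 10**9    # bits per second


def transfer_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Seconds needed to move size_bytes over a link of the given bit rate."""
    return size_bytes * 8 / rate_bits_per_s


def fmt(seconds: float) -> str:
    """Format a duration as hours and minutes, rounded to the nearest minute."""
    hours, minutes = divmod(round(seconds / 60), 60)
    return f"{hours} h {minutes} min"


# Time to burn through a 2 TB monthly cap at 1.2 Gbps.
cap_seconds = transfer_seconds(2 * TB, 1.2 * GBPS)

# Time to download a 20 GB video at each speed.
video_fast = transfer_seconds(20 * GB, 1.2 * GBPS)
video_slow = transfer_seconds(20 * GB, 100 * MBPS)

print("2 TB cap at 1.2 Gbps:  ", fmt(cap_seconds))   # 3 h 42 min
print("20 GB at 1.2 Gbps:     ", fmt(video_fast))    # 0 h 2 min
print("20 GB at 100 Mbps:     ", fmt(video_slow))    # 0 h 27 min
```

So the "3 hours, 42 minutes" figure checks out, and the 100 Mbps download comes in at a little under the ~30 minutes quoted.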
I can't remember the artist, but there's a fun song about how they used to pick up second-hand LPs really cheap, then LPs got popular and too expensive, and then they discovered that second-hand CDs are really cheap now.
Frank Turner-ish vibes, but I don't think it was actually him.
It's completely un-googlable though, and even the LLMs aren't much help on this one.