Since the page didn't load for me several times, and the title is ambiguous, here's the Abstract:
> Large language models (LLMs) have recently made vast advances in both generating and analyzing textual data. Technical reports often compare LLMs’ outputs with “human” performance on various tests. Here, we ask, “Which humans?” Much of the existing literature largely ignores the fact that humans are a cultural species with substantial psychological diversity around the globe that is not fully captured by the textual data on which current LLMs have been trained. We show that LLMs’ responses to psychological measures are an outlier compared with large-scale cross-cultural data, and that their performance on cognitive psychological tasks most resembles that of people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies but declines rapidly as we move away from these populations (r = -.70). Ignoring cross-cultural diversity in both human and machine psychology raises numerous scientific and ethical issues. We close by discussing ways to mitigate the WEIRD bias in future generations of generative language models.
I was struck by explicit orders being used as suggestive of both commonness[1] and rareness[2]. But maybe it resolves as validating an interpretation of commonly reported events, vs amplifying the reported unusualness of one.
[1] "Heck, Tacitus has a Roman general lay out the sequence in an order to his men" (in a context of "examples [...] are not hard to come by").
[2] "‘engage at discretion’ order needed to be given as an order [...] If [...] this was the standard way of fighting, there would be no point in Plutarch having Aemilius order it" (in a context of "Livy noting the unusual nature").
Laptops sometimes have stickers. For a time, I instead had a transparent slip cover, to vary the sticker set, user-test alternatives, and throttle conversations. Science education topics (Boston/Cambridge subway). Anti-patriarchy stickers drew proto-MAGAs. Some backpacks now have low-res screens built into the back, suggesting new possibilities.
One Laptop Per Child, at its peak, generated fun continuous crowd conversations.
> a pair of glasses with a screen inside of them
I've no idea what current tech is like, but I used to proselytize aphysical UIs, where a small head motion results in larger screen motion, to reduce neck swiveling.[1]
> weirder
Laptop harness walking desks are a thing. And one can do hand and head tracking[2] (I had that setup at a meetup where the swag was little stick-on privacy shutters for laptop webcams :). Boston/Cambridge is perhaps culturally a best case for such games - I've not tried them in NYC... hmm.
> but something very complex, [...] instead sketch out a diagram on a piece of paper [...] keep a small notebook in my bag
Same. I've tried swapping in an iPad, but it hasn't stuck.
Fwiw, I've done a pinch-nail-hand-arms 1-10-100-1000 mm "body as size reference" a couple of times around 5ish. And a 1000x "micro view" "pinch is zoomed to arms size" "it's like a scale model or doll playset - everything zoomed together" world of "bacteria sprinkles, red blood cell candies (M&M minis or concave Smarties minis or Sweetarts - there's lots of cell candy analogs), hair poles, salt/sugar boxes". Stories of sitting on a grain of salt and eating... etc; pet eyelash mites. No idea if it actually worked.
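Fwiw, the arithmetic behind the 1000x "micro view" can be sketched in a few lines of Python. The real-world sizes below are rough textbook figures I'm assuming for illustration (they aren't from the user tests), and `nearest_body_ref` is just a hypothetical helper name.

```python
import math

# Body-as-size-reference ladder from the comment, in millimetres.
BODY_REFS_MM = {"pinch": 1, "nail": 10, "hand": 100, "arms": 1000}

# Approximate real sizes in millimetres (assumed, order-of-magnitude only).
REAL_SIZES_MM = {
    "bacterium": 0.002,
    "red blood cell": 0.008,
    "hair width": 0.1,
    "grain of salt": 0.3,
}

ZOOM = 1000  # the "micro view": everything zoomed together by 1000x


def nearest_body_ref(size_mm):
    """Return the body reference closest to size_mm on a log scale."""
    return min(
        BODY_REFS_MM,
        key=lambda name: abs(math.log10(size_mm) - math.log10(BODY_REFS_MM[name])),
    )


def micro_view(zoom=ZOOM):
    """Map each object to (zoomed size in mm, nearest body reference)."""
    return {
        name: (size * zoom, nearest_body_ref(size * zoom))
        for name, size in REAL_SIZES_MM.items()
    }


if __name__ == "__main__":
    for name, (zoomed_mm, ref) in micro_view().items():
        print(f"{name}: about {zoomed_mm:g} mm zoomed, roughly {ref}-sized")
```

So with these assumed sizes, a bacterium lands near pinch (sprinkle) size, a red blood cell near nail (mini candy) size, and a pinch itself zooms to arms size.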
I did some user-test videos, now only on archive.org.[1] Hmm... the "Arms, hands" video there now doesn't seem to play inline? - but it does play when wget'ed and opened in a browser. :/
Hmm, perhaps with flying? When stuck on the ground, people's feel for size gets poorer as things get bigger (tall buildings, clouds, map distances). I think of having 4ish orders of magnitude available for visual reference in a classroom (cm to 10 m), plus less robustly 100 m and km in AR. At that micrometer per meter, a grain of salt towers over a city skyline - "nano view" in [1] (eep - a decade ago now - I was about to take another pass at it as covid hit).
Hmm, err, that could be misleading... 4ish for visible lengths in a large class. But especially in a small group, one can use reference objects of sand (mm) and flour (fine 100 um, ultrafine 10 um). The difference between the 100 um and 10 being more behavioral and feel (eg mouth feel) than unmagnified visible size. Thus with an outdoor view (for 100 m), one can use less-abstract "it's like that there accessible length" concrete-ish analogues across like 8 orders of magnitude. Or drop to 6, or maybe push for 9, as multiples of 3 nicely detent across SI prefixes.
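To make the counting explicit, here's a minimal Python sketch. It assumes I'm counting decade landmarks inclusively (cm, 10 cm, m, 10 m is "4"), not the span between the endpoints; `decade_ladder` is a made-up helper name.

```python
def decade_ladder(lo_exp, hi_exp):
    """Powers of ten in metres from 10**lo_exp to 10**hi_exp, inclusive."""
    return [10.0 ** e for e in range(lo_exp, hi_exp + 1)]


# Visible lengths in a large class: 1 cm up to 10 m.
classroom = decade_ladder(-2, 1)

# Adding ultrafine flour (10 um), fine flour (100 um), sand (1 mm)
# at the small end and an outdoor view (100 m) at the large end.
extended = decade_ladder(-5, 2)

if __name__ == "__main__":
    print(len(classroom), "landmarks in a classroom")
    print(len(extended), "landmarks with sand, flour, and an outdoor view")
```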
Nice. Two quick UI thoughts. Upon loading, perhaps start with some unit selected, and a default amount of 1, so there's immediate content to be seen? And to extend the experience, maybe add a "dice roll" button, so users can "see more neat things" click-click-click without the cognitive overhead of pathing the option space.
"Years You Have Left to Live, Probably"[1], on Nathan Yau's FlowingData[2], reminded me of Lanes. I stumbled on it, selected my gender and age, and the animated distribution sampling began. And the first-ish sample was a 'dead this year', immediate ball plummet. That... left an impression.
"[O]ne outlier can dominate the average"; "We're used to living in this world of normal distributions and you act a certain way, but as soon as you switch to this realm that is governed by a power law, you need to start acting vastly different. It really pays to know what kind of world or what kind of game you are playing."
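A quick way to feel the "one outlier can dominate" point is to sample both kinds of world and ask how much of the total the single biggest draw contributes. This is only a sketch under assumptions of my own: the normal parameters (mean 100, sd 15), the Pareto shape (1.1), and the sample size are arbitrary illustrative choices, not anything from the quoted source.

```python
import random


def top_share(samples):
    """Fraction of the total contributed by the single largest sample."""
    return max(samples) / sum(samples)


def compare(n=10_000, seed=0):
    """Return (top share under a normal, top share under a Pareto)."""
    rng = random.Random(seed)
    # Thin-tailed world: values cluster around the mean (clamped at zero).
    normal = [max(rng.gauss(100, 15), 0.0) for _ in range(n)]
    # Heavy-tailed world: Pareto with a shape close to 1 (assumed value).
    heavy = [rng.paretovariate(1.1) for _ in range(n)]
    return top_share(normal), top_share(heavy)


if __name__ == "__main__":
    normal_share, heavy_share = compare()
    print(f"largest normal draw:  {normal_share:.4%} of the total")
    print(f"largest Pareto draw:  {heavy_share:.4%} of the total")
```

In the normal case the largest draw is a negligible slice of the sum; in the power-law case a single draw can be a visible chunk of it, which is why averages behave so differently there.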
First, context: a "life/not-life" distinction is far more "science" than science - widespread in "science" education, but rarely comes up in science research. (Might be interesting to create a list of similar?) Why the emphasis there... I don't know - perhaps because we teach by memorizing definitions and lists, not by learning design spaces and their landmarks? Or at least by giving exemplars without characterizing variance.
One of the few places I've seen it come up in science was ecosystem multi-scale simulation software, where virus was squarely in the heritable characteristics under selection pressure ("life") bucket, rather than abiotic or biogenic.
Informal "do you think of viruses as alive?" seems to vary by field. I've seen marine bio labs be overwhelmingly yes. I've been told medical immunology leans no. But it seems more a social-media engagement question than a research question or synthesis.