Hacker News: wizzwizz4's comments

PhasmaFelis and mikeash have all matches mutual for the top 20, 30, 50 and 100. Are there other users like this? If so, how many? What's the significance of this, in terms of the shape of the graph?

tablespoon is close, but has a missing top 50 mutual (mikeash). In some ways, this is an artefact of the "20, 30, 50, 100" scale. Is there a way to describe the degree to which a user has this "I'm a relatively closer neighbour to them than they are to me" property? Can we make the metric space smaller (e.g. reduce the number of Euclidean dimensions) while preserving this property for the points that have it?
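A rough sketch of the property I mean, on a toy embedding (hypothetical points, not the real HN user data): at each scale k, check whether every one of a point's top-k neighbours also counts that point among its own top-k.

```python
import math

def top_k(points, i, k):
    """Indices of the k nearest neighbours of point i (excluding i)."""
    dists = sorted((math.dist(points[i], points[j]), j)
                   for j in range(len(points)) if j != i)
    return {j for _, j in dists[:k]}

def fully_mutual(points, i, ks=(1, 2)):
    """True iff, at every scale k in ks, each of i's top-k
    neighbours also has i among its own top-k."""
    return all(i in top_k(points, j, k)
               for k in ks
               for j in top_k(points, i, k))

# Toy 1-D "user embeddings" (made up for illustration):
points = [(0,), (1,), (3,), (7,)]
print([i for i in range(len(points)) if fully_mutual(points, i)])
# → [0, 1]
```

Point 2 here is the "tablespoon" case: mutual at k=2 but not at k=1, so whether it "has the property" depends on which scales you sample, which is exactly the artefact-of-the-scale problem.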


In urgent, dangerous situations (e.g. sudden busy traffic)? Yes. But, you can explain the existence of those situations ahead of time, and practice things like "get off the road" or "let go of that cooking pot".

The next token is obviously "goes". (Any language model that disagrees is simply wrong.)

I'm not sure if my chain's bein' yanked right now, but surely you mean "gos"‽

The plural of mangoe is mangoes. https://en.wiktionary.org/wiki/mangoe

Of course you find it hard to distinguish the two! You don't have equipment for measuring tidal forces, and they are locally indistinguishable.

Of sure you find it hard to tell the two away! You lack the gear for tide pull test, and they feel the same here and local.

I hate this.


The new pull and the old pull both just feel like a pull, if you can only feel the pull at one spot. To see how the old pull is not like the new pull, you have to test the pull at a spot near you (but not the same spot), too. The new pull will be the same at each spot, but the old pull may not be the same (we call this the tide), and you test the sum of the new and old pull.

(This is hard.)


It's fair that it's hard to keep the two from becoming the same in your head, you need fancy stuff to test for the force of the tide, and they are more or less the same from a close-up (any which is much closer than, say, the moon) view!

(Verbosity is your friend)



I find legacy systems fun because you're looking at an artefact built over the years by people. I can get a lot of insight into how a system's design and requirements changed over time, by studying legacy code. All of that will be lost, drowned in machine-generated slop, if next decade's legacy code comes out the backside of a language model.

> "All of that will be lost, drowned in machine-generated slop, if next decade's legacy code comes out the backside of a language model."

The fun part, though, is that future coding LLMs will eventually be poisoned by ingesting past LLM-generated slop code if unrestricted. The most valuable codebases for improving LLM quality in the future will be the ones written by humans with high-quality coding skills who are not reliant, or only minimally reliant, on LLMs, making the humans who write them more valuable.

Think about it: a new, even better programming language is created, like Sapphire on Skates or whatever. How does an LLM know how to output high-quality, idiomatically correct code for that hot new language? The answer is that _it doesn't_. Not until 1) somebody writes good code in that language for the LLM to absorb and 2) in a large enough quantity for patterns to emerge that the LLM can reliably identify as idiomatic.

It'll be pretty much like the end of Asimov's "The Feeling of Power" (https://en.wikipedia.org/wiki/The_Feeling_of_Power) or his novella "Profession" (https://en.wikipedia.org/wiki/Profession_(novella)), which is almost exactly relevant to LLMs.


Thanks to git repositories stored away in Arctic tunnels, our common legacy-code heritage might outlast most other human artifacts... (unless ASI chooses to erase it, of course)

That’s fine if you find that fun, but legacy archeology is a means to an end, not an end itself.

Legacy archaeology in a 60 MiB codebase is far easier than digging through email archives, requirements docs, and old PowerPoint files that Microsoft Office won't even open properly any more (though LibreOffice can, if you're lucky). Handwritten code actually expresses something about the requirements and design decisions, whereas AI slop buries that signal in so much noise that "archaeology" becomes almost impossible.

When insight from a long-departed dev is needed right now to explain why these rules work in this precise order, but fail when the order is changed, do you have time to git bisect to get an approximate date, then start trawling through chat logs in the hopes you'll happen to find an explanation?


Code is code: yes, it can be more or less spaghetti, but if it compiles at all, it can be refactored.

Having to dig through all that other crap is unfortunate. Ideally you have tests that encapsulate the specs, which are then also code. And help with said refactors.


We had enough tests to know that no other rule configuration worked. Heck, we had mathematical proof (and a small pile of other documentation too obsolete or cryptic to be of use), and still, the only thing that saved the project was noticing different stylistic conventions in different parts of the source, allowing the minor monolith to be broken down into "this is the core logic" and "these are the parts of a separate feature that had to be weaved into the core logic to avoid a circular dependency somewhere else", and finally letting us see enough of the design to make some sense out of the cryptic documentation. (Turns out the XML held metadata auxiliary to the core logic, but vital to the higher-level interactive system, the proprietary binary encoding was largely a compression scheme to avoid slowing down the core logic, and the system was actually 8-bit-clean from the start – but used its own character encoding instead of UTF-8, because it used to talk to systems that weren't.)

Test-driven development doesn't actually work. No paradigm does. Fundamentally, it all boils down to communication: and generative AI systems essentially strip away all the "non-verbal" communication channels, replacing them with the subtext equivalent of line noise. I have yet to work with anyone good enough at communicating that I can do without the side-channels.


Makes me think that the actual (horrific) solution here is that every single prompt and output made during development must be logged and stored, as that might be the only documentation that exists for what was made.

Actually, thinking about it: if I were running a company allowing or promoting AI use, that would be the first priority. Whatever is prompted must be stored forever.
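A minimal sketch of what that might look like, assuming a hypothetical `call_model` callable standing in for whatever client library is actually in use:

```python
import json
import datetime
import pathlib

LOG = pathlib.Path("prompt_log.jsonl")  # append-only log of record

def logged_completion(prompt, call_model):
    """Wrap any LLM call so every prompt/response pair is appended
    to a JSONL log before the response is used for anything."""
    response = call_model(prompt)
    record = {
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "prompt": prompt,
        "response": response,
    }
    with LOG.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return response

# Usage with a stand-in model (a real client call would go here):
logged_completion("Why is rule order significant?", lambda p: "stub answer")
```

The point being that the log, not the resulting code, becomes the nearest thing to a design document.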


> generative AI systems essentially strip away all the "non-verbal" communication channels

This is a human problem, not a technological one.

You can still have all your aforementioned broken powerpoints etc and use AI to help write code you would’ve previously written simply by hand.

If your processes are broken enough to create unmaintainable software, they will do so regardless of how code pops into existence. AI just speeds it up either way.


The software wasn't unmaintainable. The PowerPoints etc were artefacts of a time when everyone involved understood some implicit context, within which the documentation was clear (not cryptic) and current (not obsolete). The only traces of that context we had, outside the documentation, were minor decisions made while writing the program: "what mindset makes this choice more likely?", "in what directions was this originally designed to extend?", etc.

Personally, I'm in the "you shouldn't leave vital context implicit" camp; but in this case, the software was originally written by "if I don't already have a doctorate, I need only request one" domain experts, and you would need an entire book to provide that context. We actually had a half-finished attempt – 12 names on the title page, a little over 200 pages long – and it helped, but chapter 3 was an introduction-for-people-who-already-know-the-topic (somehow more obscure than the context-free PowerPoints, though at least it helped us decode those), chapter 4 just had "TODO" on every chapter heading, and chapter 5 got almost to the bits we needed before trailing off with "TODO: this is hard to explain because" notes. (We're pretty sure they discussed this in more detail over email, but we didn't find it. Frankly, it's lucky we have the half-finished book at all.)

AI slop lacks this context. If the software had been written using genAI, there wouldn't have been the stylistic consistency to tell us we were on the right track. There wouldn't have been the conspicuous gap in naming, elevating "the current system didn't need that helper function, so they never wrote it" to a favoured hypothesis, allowing us to identify the only possible meaning of one of the words in chapter 3, and thereby learn why one of those rules we were investigating was chosen. (The helper function would've been meaningless at the time, although it does mean something in the context of a newer abstraction.) We wouldn't have been able to use a piece of debugging code from chapter 6 (modified to take advantage of the newer debug interface) to walk through the various data structures, guessing at which parts meant what using the abductive heuristic "we know it's designed deliberately, so any bits that appear redundant probably encode a meaning we don't yet understand".

I am very glad this system was written by humans. Sure, maybe the software would've been written faster (though I doubt it), but we wouldn't have been able to understand it after-the-fact. So we'd have had to throw it away, rediscover the basic principles, and then rewrite more-or-less the same software again – probably with errors. I would bet a large portion of my savings that that monstrosity is correct – that if it doesn't crash, it will produce the correct output – and I wouldn't be willing to bet that on anything we threw together as a replacement. (Yes, I want to rewrite the thing, but that's not a reasoned decision based on the software: it's a character trait.)


I guess I just categorically disagree that a codebase is impossible to understand without “sufficient” additional context. And I think you ascribe too much order to software written by humans that can exist in quite varied groups wrt ability, experience, style, and care.

It was easy to understand what the code was instructing the computer to do. It was harder to understand what that meant, why it was happening, and how to change it.

A program to calculate payroll might be easy to understand, but unless you understand enough about finance and tax law, you can't successfully modify it. Same with an audio processing pipeline: you know it's doing something with Fourier transforms, because that's what the variable names say, but try to tweak those numbers and you'll probably destroy the sound quality. Or a pseudo-random number generator: modify that without understanding how it works, and even if your change feels better, you might completely break it. (See https://roadrunnerwmc.github.io/blog/2020/05/08/nsmb-rng.htm..., or https://redirect.invidious.io/watch?v=NUPpvoFdiUQ if you want a few more clips.)
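To make the PRNG point concrete, here's a toy linear congruential generator (purely illustrative, nothing to do with the NSMB generator in the link): with textbook constants it has full period, but a harmless-looking tweak to the multiplier collapses it to a fixed point within a handful of steps.

```python
def lcg_cycle_length(a, c=1, m=2**16, seed=0):
    """Length of the cycle an LCG x -> (a*x + c) % m eventually enters."""
    seen = {}
    x, n = seed, 0
    while x not in seen:
        seen[x] = n
        x = (a * x + c) % m
        n += 1
    return n - seen[x]

# a=5, c=1, m=2**16 satisfies the Hull–Dobell conditions: full period.
print(lcg_cycle_length(5))   # 65536
# Tweak the multiplier to 6 ("feels about the same") and the
# generator degenerates to a single fixed point within 16 steps.
print(lcg_cycle_length(6))   # 1
```

The modified version still produces plausible-looking numbers for its first few outputs, which is exactly why "it feels better" is no defence.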

I've worked with codebases written by people with varying skillsets, and the only occasions where I've been confused by the subtext have been when the code was plagiarised.


You know "puréed orphan extract" is just salt, right? You can extract it from seawater in an expensive process that, nonetheless, is way cheaper than crushing orphans (not to mention the ethical implications). Sure, you have to live near the ocean, but plenty of people do, and we already have distribution networks to transport the resulting salt to your local market. Just one fist-sized container is the equivalent of, like, three or four dozen orphans; and you can get that without needing a fancy press or an expensive meat-sink.

I would expect it to be closer to 1KB, as well. 100KB is (at time of writing) about 5× the size of this webpage, and this doesn't load instantly for me.

There are other stable conditions: law is not the only possible system of justice. Is it in the best interests for everyone if the law steps in every time one person punches another? Law is helpful when things can't be resolved at an interpersonal level: there are situations where a single punch should be prosecuted, so we can't just make punching legal; but equally, if too many things are illegal, selective policing becomes possible, and that's an abuse we really don't want.

Institutions like the criminal justice system are tools. Some can wield the institutions skilfully (e.g. https://www.loweringthebar.net/2006/07/judge_tells_con.html, https://www.bbc.co.uk/news/av/uk-38021839/speeding-drivers-q...), but often, it's a blunt instrument.


I think "justice" is one of those words where people all think they're in agreement about it being good, but when you ask them what it means then suddenly they're all wildly divergent.

And that's the problem.

"Swinging one's fist" is more of a quote than an example here; for an example, consider that everyone agrees "murder is wrong", but we don't agree about abortion, euthanasia, deaths by police action, the death penalty, accidental civilian casualties during war, war crimes, or population-level liability when a large number of people each produce a small quantity of toxin that causes a statistically significant change in the life expectancy of an area. People protest these things, and some commit crimes in attempts to force change on these topics.

Some say it's acceptable to use lethal force to prevent a homicide. Is it acceptable for anti-pollution protestors to vandalise gasoline supplies to reduce NOx emissions? Was it acceptable 20 years ago when we didn't have any obvious rapid path to electrification of road traffic, given that our economies are dependent on road transport?

A while before the 9/11 attacks, I saw a chain-email demanding action against the Taliban for their mis-treatment of women. When Afghanistan was invaded, I saw people upset about that, too (though in different ways, e.g. because the invading forces accidentally killed people by dropping food on their heads or bombing weddings because of the celebratory machine gun fire). Nobody was a fan of Saddam Hussein, but the second Iraq war was even more heavily criticised, despite UK/US leadership insisting Iraq had WMDs.

The boundaries here seem clean: crime vs. justice, peace vs. war, protest vs. terrorism, self-defence vs. attack; but the closer I look, the more I see these things as continuums.


The world is deep and hard to categorise, people disagree on the nature of justice, and many (all?) people mistake their moral heuristics for moral truth. But there's one thing that everybody agrees on: "justice is obeying the law" is wrong. https://existentialcomics.com/comic/196 (Or https://plato.stanford.edu/entries/legal-obligation/, if you're one of those boring types who wants factual understanding.)

In the olden days, when law enforcement wanted to intercept a letter, they would locate the sender, nab the letter before it got whisked away, and read it. (If the letter was sealed, they would copy the seal, so they could convincingly re-seal the letter after reading.) Law enforcement wasn't able to do this with whispered conversations, nor easily identify disguised people without following or arresting them. Things still got done.

I don't understand why computer-mediated communication means we have to choose between a panopticon, or the end of law enforcement. It seems to me that good old-fashioned detective work is still perfectly possible. Sure, there are cyber-enabled crimes, and new classes of cyber-dependent crimes, but each of those is a crime because of an interaction with the physical, human world. Those interactions haven't gone away, and are still amenable to investigation. (At a basic level: how do you know a crime has happened in the first place?)


Yes, detective work is possible. So are technological extensions to it: for example, investigators being allowed (perhaps requiring a warrant, or other appropriate controls) to crack the devices of people under investigation.

In fact, things like forcing Apple to backdoor its encryption will not be effective against any but stupid criminals (I admit many criminals are stupid, but the stupid ones are not the most dangerous ones). Once it is known that this can be done, smart criminals will just use other means of communication.

The aim of this is not to help investigate serious crime, it is mass surveillance to deal with things like what the British government has called "legal but harmful speech", or things like "non-crime hate incidents" or minor offences that would not justify putting money into investigations, or civil matters.

I have in mind the way the Regulation of Investigatory Powers Act was used to catch people doing things such as not picking up their dog's poo, or lying about where they lived to get their kids into a better school.



