If you view LLM-driven dev as a kind of evolutionary process rather than an engineering process (at the level of a single LLM output) then this makes a lot of sense. You're widening the population from which you select for fitness.
Ah ok. If you do run it again that would be a worthwhile change. I know I personally have biases about models and I have seen others commenting the same - it seems likely it would skew the results at least a little.
Nonetheless you've convinced me to try an even wider variety of models, thanks!
In fact, this makes me think I should add this as a feature to my AI dev tooling - compare responses side by side and pick the best one.
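A rough sketch of what that feature could look like, just to make the idea concrete. Everything here is hypothetical: `collect_candidates`, `pick_best`, the `models` mapping, and the scoring function are made-up names, and each "model" is just a callable that takes a prompt and returns text. The candidates are shuffled so they can be shown side by side without model names, which should help with the bias issue mentioned above.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Tuple[str, str]  # (model_name, response)


def collect_candidates(
    prompt: str,
    models: Dict[str, Callable[[str], str]],
    seed: Optional[int] = None,
) -> List[Candidate]:
    """Run every model on the same prompt and return the results in shuffled order.

    Shuffling lets a UI show responses as "A", "B", "C" without revealing
    which model wrote which one.
    """
    candidates = [(name, generate(prompt)) for name, generate in models.items()]
    random.Random(seed).shuffle(candidates)
    return candidates


def pick_best(candidates: List[Candidate], score: Callable[[str], float]) -> Candidate:
    """Return the candidate whose response gets the highest score.

    `score` is a stand-in for whatever fitness signal you trust:
    a test suite pass rate, a linter, or a human vote.
    """
    return max(candidates, key=lambda c: score(c[1]))


if __name__ == "__main__":
    # Stub "models" for illustration only; real ones would call an API.
    stub_models = {
        "model_a": lambda p: "short answer",
        "model_b": lambda p: "a somewhat longer answer",
    }
    winner = pick_best(collect_candidates("explain X", stub_models, seed=0), score=len)
    print(winner)
```

Using `len` as the score is obviously a toy. In practice the selection step is the hard part, and a human picking from the blind side-by-side view may well be the most honest scorer.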
>In an interview with Robert Wright in 2003, Dyson referred to his paper on the search for Dyson spheres as "a little joke" and commented that "you get to be famous only for the things you don't think are serious" [...]
To be fair, he later added this:
>in a later interview with students from The University of Edinburgh in 2018, he referred to the premise of the Dyson sphere as being "correct and uncontroversial".[13] In other interviews, while lamenting the naming of the object, Dyson commented that "the idea was a good one", and referred to his contribution to a paper on disassembling planets as a means of constructing one.
Thanks for pointing out those follow ups. Interesting stuff!
> correct and uncontroversial
From the original quote it is clear he was referring to the idea of aliens being detectable by infrared because they will absorb all of their sun's energy. Later in the same paragraph he says:
> Unfortunately I went on to speculate about possible ways of building a shell, for example by using the mass of Jupiter...
> These remarks about building a shell were only order-of-magnitude estimates, but were misunderstood by journalists and science-fiction writers as describing real objects. The essential idea of an advanced civilization emitting infrared radiation was already published by Olaf Stapledon in his science fiction novel Star Maker in 1937.
So the Dyson Sphere is a rhetorical vehicle to make an order-of-magnitude estimate, not a description of a thing that he thought could physically exist.
Full quote from the video cited before "the idea was a good one":
> science fiction writers got hold of this phrase and imagined it then to be a spherical rigid object. And the aliens would be living on some kind of artificial shell. a rigid structure surrounding a star. which wasn't exactly what I had in mind, but then in any case, that's become then a favorite object of science fiction writers. They call it the Dyson sphere, which was a name I don't altogether approve of, but anyway, I mean that's I'm stuck with it. But the idea was a good one.
Again he explicitly says this "wasn't exactly what I had in mind." This one hedges a bit more and could be interpreted as his saying the idea of a Dyson Sphere is a good one. He may have meant that in the sense of it being a good science fiction idea though, and he subsequently goes on to talk about that.
The Dyson Sphere is good for order-of-magnitude calculations about hypothetical aliens, and also for selling vapourware to the types of people who uncritically think that vapourware is real.
Have you read the paper itself, not just summaries of the idea? It's obvious from the way he wrote it, dripping in sarcasm. Talking about "Malthusian principles" and "Lebensraum", while hand waving away any common sense questions about how the mass of Jupiter would even be smeared into a sphere around the sun, just saying that he can conceive of it and therefore we should spend public money looking for it. He's having a lark.
Also, he literally said it was a joke, and was miffed that he was best known for something he didn't take seriously.
He thought SETI listening to space radio waves was dumb, so made essentially a satirical paper saying we should look for heat instead, because "an advanced civilization would be using these Shells to capture all star energy, so we could only see the heat"
The "dyson sphere" was a made up and entirely unfounded claim, without justification.
You’ll just end up approving things blindly, because 95% of what you’ll read will seem obviously right and only 5% will look wrong. I would prefer to let the agent do whatever they want for 15 minutes and then look at the result rather than having to approve every single command it does.
That kind of blanket demand doesn't persuade anyone and doesn't solve any problem.
Even if you get people to sit and press a button every time the agent wants to do anything, you're not getting the actual alertness and rigor that would prevent disasters. You're getting a bored, inattentive person who could be doing something more valuable than micromanaging Claude.
Managing capabilities for agents is an interesting problem. Working on that seems more fun and valuable than sitting around pressing "OK" whenever the clanker wants to take actions that are harmless in a vast majority of cases.
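For what it's worth, here is a minimal sketch of one way to approach it: an allowlist gate that auto-approves a few read-only commands and escalates everything else to a human. The names `ALLOWED_COMMANDS` and `is_auto_approved` are made up for illustration, and the list itself is an assumption, not a recommendation. This is only a filter on the command string, not a sandbox, so it doesn't replace running the agent somewhere it can't do real damage.

```python
import shlex

# Hypothetical allowlist of commands considered harmless enough to auto-approve.
ALLOWED_COMMANDS = {"ls", "cat", "grep", "pwd"}

# Characters that let a single string chain, redirect, or substitute commands.
SHELL_METACHARACTERS = set(";|&`$><\n")


def is_auto_approved(command: str) -> bool:
    """Return True only for a single, simple command whose program is allowlisted."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False  # anything that looks like chaining or redirection goes to a human
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes and similar parse errors are never auto-approved
    if not tokens:
        return False
    return tokens[0] in ALLOWED_COMMANDS


if __name__ == "__main__":
    for cmd in ["ls -la", "rm -rf /", "ls; rm -rf /"]:
        print(cmd, "->", "auto" if is_auto_approved(cmd) else "ask human")
```

The point of the design is that the human only gets prompted for the small fraction of actions outside the allowlist, so the prompts that do show up are worth reading.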
It’s not just annoying; at scale it makes using the agent CLIs impossible. You can tell someone spends a lot of time in Claude Code: they can type --dangerously-skip-permissions with their eyes closed.
It's not reliable. The AI can just not prompt you to approve, or hide things, etc. AI models are crafty little fuckers and they like to lie to you and find secret ways to do things with ulterior motives. This isn't even a prompt injection thing, it's an emergent property of the model. So you must use an environment where everything can blow up and it's fine.
People here are claiming that this is true of humans as well. Apart from the fact that bad content can be generated much faster with LLMs, what's your feeling about that criticism? Is there any measure of how many submissions from before LLMs make unsubstantiated claims?
Thank you for publishing this work. Very useful reminder to verify sources ourselves!
I have indeed seen that with humans as well, including in conference papers and medical journals. The reference citations in papers are seen by many authors as another section they need to fill to get their articles accepted, not as a natural byproduct of writing an article.
We must hold the line.