In the same way that finding waste while increasing the federal budget isn't efficiency.
Technically, maybe you can squint and find small pieces that are more efficient, but in the grand scheme of things the goal doesn't seem to be a smaller government.
It's also worth noting that the NLRB has a proposed budget of $320M for the 2025 fiscal year and a total of around 1,300 employees [1].
I'm a strong proponent of small government and don't know enough about the NLRB to say if I would find them useful, but that is well within the range of a small federal department today.
I ended up drastically cutting back on Amazon purchases when they started getting flooded with brands like that.
It's absolutely on Amazon to maintain quality. There are certain brands and types of products I'll order there because they're just harder to find otherwise, but it's mostly a last resort these days given that Amazon doesn't care to curate what is on their "shelves".
I assume we'll see backups on both sides. Containers backed up in Chinese ports and a huge backlog of unclaimed packages and delayed tariff bills waiting for USPS/UPS/FedEx to process them.
> This highlights an opportunity for organizations to better support their developers’ interest in AI tools, considering local regulations.
This is a funny one to see included in GitHub's report. If I'm not mistaken, GitHub is now using the same approach as Shopify with regards to requiring LLM use and including it as part of a self-report survey for annual review.
I guess they took their 2024 survey to heart and are ready to 100x productivity.
I mean, I spent years learning to code in school and at home, but never managed to get a job doing it, so I just do what I can in my spare time, and LLMs help me feel like I haven't completely fallen off. I can still hack together cool stuff and keep learning.
I actually meant it as a good thing! Our industry plays very loose with terms like "developer" and "engineer". We never really defined them well and it's always felt more like gatekeeping.
IMO if someone uses what tools they have, whether that's an LLM or vim, and is able to ship software, they're a developer in my book.
Probably. There is a similar question: if you ask ChatGPT / Midjourney to generate a drawing, are you an artist? (To me, yes, which would mean that AI "vibe coders" are actual developers in their own way.)
If your daughter could draw a house with enough detail that someone could take it and actually build it then you'd be more along the lines of the GP's LLM artist question.
Not really, the point was contrasting sentimental labels with professionally defined titles, which seems precisely the distinction needed here. It's easy enough to look up the agreed-upon term for software engineer / developer and agree that it's more than someone who copy-pastes code until it just barely runs.
EDIT: To clarify I was only talking about vibe coder = developer. In this case the LLM is more of the developer and they are the product manager.
Do we have professionally defined titles for developer or software engineer?
I've never seen it clarified, so I tend to default to the lowest common denominator: if you're making software in some way, you're a developer. The tools someone uses don't really factor into it for me (even if that is copy/pasting from stackoverflow).
I don't know, if you actually design in some way and deliver the solution for the structure of the bridge, aren't you THE structural engineer for that project?
That probably depends on whether you consider LLMs, or human artists, as tools.
If someone uses an LLM to make code, I consider the LLM to be a tool that will only be as good as the person prompting it. The person, then, is the developer while the LLM is a tool they're using.
I don't consider auto complete, IDEs, or LSPs to take away from my being a developer.
This distinction likely goes out the window entirely if you consider an LLM to actually be intelligent, sentient, or conscious though.
This argument isn't particularly compelling in my opinion.
I don't actually like the stochastic parrot argument either to be fair.
I feel like the author is ignoring the various knobs (randomization factors may be a better term) applied to the models during inference that are tuned specifically to make the output more believable or appealing.
Turn the knobs too far and the output is unintelligible garbage. Don't turn them far enough and the output feels very robotic or mathematical; it's obvious that the output isn't human. The other risk of not turning the knobs far enough would be copyright infringement, but I don't know if that happens often in practice.
Claiming that LLMs aren't stochastic parrots without dealing with the fact that we forced randomization factors into the mix misses a huge potential argument that they are just cleverly disguised stochastic parrots.
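For concreteness, here's a minimal, purely illustrative sketch of two of those knobs at sampling time (temperature and top-k truncation). The token scores are made up, and real inference stacks layer on top-p, repetition penalties, and more:

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_k=3):
    """Toy sketch of inference-time randomization knobs.

    Temperature rescales the scores before normalizing; top-k keeps only
    the k highest-scoring tokens. Very low temperature approaches greedy
    ("robotic") output; very high temperature flattens the distribution
    toward garbage. Values and tokens here are hypothetical.
    """
    # Keep only the top_k highest-scoring candidate tokens.
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Softmax with temperature over the surviving candidates.
    weights = [math.exp(score / temperature) for _, score in top]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Draw one token at random according to those probabilities.
    return random.choices([tok for tok, _ in top], weights=probs, k=1)[0]

# Hypothetical next-token scores after "The cat sat on the".
logits = {"mat": 9.1, "sofa": 8.7, "moon": 5.2, "rug": 4.9, "banana": 1.3}
print(sample_next_token(logits, temperature=0.8, top_k=3))
```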
This seems like it was inevitable. Most people do not understand the meaning of the word "stochastic" and so they're likely to simply ignore it in favour of reading the term as "_____ parrot."
What you have described, a probability distribution with carefully-tuned parameters, is perfectly captured by the word stochastic as it's commonly used by statisticians.
Human brains are similarly finely tuned and have similar knobs, it seems to me. People with no short term memory have the same conversations over and over again. Drunk people tend to be very predictable. There are circuits that give us an overwhelming sense of impending doom, or euphoria, or the conviction that our loved ones have been replaced by imposters. LLMs with very perturbed samplers bear, sometimes, a striking resemblance to people on certain mind-altering substances.
And that's really a core of the problem, we don't well understand how the human mind works and we can't really define or identify "intelligence."
I mentioned I don't like the stochastic parrot argument, and that I find this article's argument lacking. Both are for the same reason, the arguments are making claims that we simply can't make while missing the fundamental understanding of what intelligence really is and how human (and other animals) brains work.
You seem to be coming with the assumption that the difference between parrots and what many would consider intelligence is math, or that math is a reliable indicator of those different groups.
Solving hard math problems requires understanding the structure of complex mathematical reasoning. No animal is known to be capable of that.
Most definitions and measurements of intelligence by most laypeople and psychologists include the ability to reason, with mathematical reasoning widely accepted as part of or a proxy for it. They are imperfect but “intelligence” does not have a universally accepted definition.
Math is a contrived system though, there are no fundamental laws of nature that require math to be done the way we do it.
A human society may develop their own math in a base 13 system, or an entirely different way of representing the same concepts. When they can't solve our base 10 math problems in a way that matches how we expect, does that mean they are parrots?
Part of the problem here is that we still have yet to land on a clear, standard definition of intelligence that most people agree with. We could look to IQ, and all of its problems, but then we should be giving LLMs an IQ test to answer rather than a math test.
The fact that much of physics can be so elegantly described by math suggests the structures of our math could be quite universal, at least in our universe.
Check out the problems in the MATH dataset, especially Level 5 problems. They are fairly advanced (by most people's standards) and most are not dependent on which base-N system is used to solve them. The answers would be different of course, but the structures of the problems and solutions remain largely intact.
> Solving hard math problems requires understanding the structure of complex mathematical reasoning. No animal is known to be capable of that.
Except, it doesn't. Maybe some math problems do -- or maybe all of them do, when the text isn't in the training set -- but it turns out that most problems can be solved by a machine that regurgitates text, randomly, from all the math problems ever written down.
One of the ways that this debate ends in a boring cul-de-sac is that people leap to conclusions about the meaning of the challenges that they're using to define intelligence. "The problem has only been solved by humans before", they exclaim, "therefore, the solution of the problem by machine is a demonstration of human intelligence!"
We know from first principles what transformer architectures are doing. If the problem can be solved within the constraints of that simple architecture, then by definition, the problem is insufficient to define the limits of capability of a more complex system. It's very tempting to instead conclude that the system is demonstrating mysterious voodoo emergent behavior, but that's a bit like concluding that the magician really did saw the girl in half.
Please check out the post on Math-Perturb-Hard conveniently linked to above before making a comment without responding to it.
A relevant bit:
“for MATH-P-Hard, we make hard perturbations, i.e., small but fundamental modifications to the problem so that the modified problem cannot be solved using the same method as the original problem. Instead, it requires deeper math understanding and harder problem-solving skills.”
For the skeptics: Scoring just 10% or so below the original MATH Level 5 (hardest) dataset on MATH-P-Hard seems in line with, or actually better than, what most people would do.
Gemini 2.5 Pro:
“The sentence argues that even if a model's score drops by about 10% on the "Math-Perturb-Hard" dataset compared to the original "MATH Level 5" (hardest) dataset, this is actually a reasonable, perhaps even good, outcome. It suggests this performance decrease is likely similar to or better than how most humans would perform when facing such modified, difficult math problems.”
I think 'nopinsight' and the paper are arguing that the drop is 10%, not that the final score is 10%. For example, Deepseek-R1 dropped from 96.30 to 85.19. Are you actually arguing that a child guessing randomly would be able to score the same, or was this a misunderstanding?
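To make the two readings concrete, here's the arithmetic for the DeepSeek-R1 numbers quoted above (a quick sketch using only the figures cited in this thread):

```python
# Reported accuracy: original MATH Level 5 vs. MATH-P-Hard (from the comment above).
original, perturbed = 96.30, 85.19
absolute_drop = original - perturbed            # ~11.1 percentage points
relative_drop = absolute_drop / original * 100  # ~11.5% relative decline
print(f"drop: {absolute_drop:.2f} points ({relative_drop:.1f}% relative)")
```

So the model still scores ~85% after perturbation; the "10% or so" refers to the size of the drop, not the final score.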