>> LLMs never provide code that passes my sniff test
> This is ego speaking.
Consider this: 100% of AI training data is human-generated content.
Generally speaking, we apply the 90/10 rule to human-generated content: 90% of it (books, movies, TV shows, software applications, products available on Amazon) is not very good; 10% shines.
In software development, I would say it's more like 99 to 1 after working in the industry professionally for over 25 years.
How do I divorce this from my personal ego? It's easy to apply objective criteria (a quick sketch after the list makes them concrete):
- Is the intent of the code easy to understand?
- Are the "moving pieces" isolated, such that you can change the implementation of one with minimal risk of altering the others by mistake?
- Is the solution in code a simple one relative to alternatives?
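To make those criteria concrete, here is a minimal, hypothetical sketch (the names and data shapes are invented for illustration, not taken from any real codebase): the delivery mechanism is one isolated "moving piece" behind a small interface, so its implementation can change without touching the caller, and the caller's intent reads straight off the function.

```python
from typing import Protocol


class Notifier(Protocol):
    """One isolated 'moving piece': how a message gets delivered."""
    def send(self, recipient: str, message: str) -> None: ...


class EmailNotifier:
    """A concrete implementation; swapping it for SMS touches nothing else."""
    def send(self, recipient: str, message: str) -> None:
        print(f"emailing {recipient}: {message}")


def notify_overdue_invoices(invoices: list[dict], notifier: Notifier) -> None:
    """Intent is readable from the name and body; delivery is swappable."""
    for invoice in invoices:
        if invoice["days_overdue"] > 30:
            notifier.send(invoice["customer_email"], "Your invoice is overdue.")
```

Nothing clever is going on, which is the point: the simple version is usually the one that passes the test.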
The majority of human-produced code does not pass the above sniff test. Most of my job, as a Principal on a platform team, is cleaning up other people's messes and training them to make less of a mess in the future.
If the majority of human-generated content fails to follow the basic engineering practices employed in other engineering disciplines (e.g. it never ceases to amaze me how much of an uphill battle it is just to get some SWEs to break their work down into small, single-responsibility, easily testable, reusable "modules"), then we can't logically expect any better from LLMs, because this is what they're being trained on.
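As a rough sketch of that kind of decomposition (hypothetical names, not anyone's production code), each step below does one thing, can be unit-tested without any I/O, and can be reused on its own; only the thin composition layer knows about the whole flow:

```python
def parse_orders(raw_rows: list[str]) -> list[dict]:
    """Turn raw 'id,amount' rows into order dicts."""
    orders = []
    for row in raw_rows:
        order_id, amount = row.split(",")
        orders.append({"id": order_id, "amount": float(amount)})
    return orders


def filter_large_orders(orders: list[dict], threshold: float) -> list[dict]:
    """Keep only the orders above the threshold."""
    return [o for o in orders if o["amount"] > threshold]


def format_report(orders: list[dict]) -> str:
    """Render a plain-text report; knows nothing about where orders came from."""
    return "\n".join(f"{o['id']}: {o['amount']:.2f}" for o in orders)


def build_report(raw_rows: list[str], threshold: float = 1000.0) -> str:
    """Thin composition layer; the pieces above stay independently reusable."""
    return format_report(filter_large_orders(parse_orders(raw_rows), threshold))
```

Each piece needs only a few lines of test code, and replacing any one of them carries minimal risk of breaking the others.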
And we are VERY far off from LLMs that can weigh the merits of different approaches within the context of the overall business requirements and choose which one makes the most sense for the problem at hand, as opposed to just "what's the most common answer to this question?"
LLMs today are a type of magic trick. You feed one a whole bunch of 1s and 0s so that you can input some new 1s and 0s, and it uses some fancy probability maths to predict: "based on the previous 1s and 0s, what are the statistically most likely next 1s and 0s to follow from the input?"
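Stripped of the neural-network machinery, that description is next-token prediction. Here is a toy sketch of the idea using a bigram counting table over words; a real LLM operates on subword tokens with a transformer rather than a lookup table, but the statistical objective, "most likely continuation of what came before", is the same:

```python
from collections import Counter, defaultdict

# Toy "training data": the model only ever sees sequences of tokens.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which token tends to follow which (a bigram table).
following: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def most_likely_next(token: str) -> str:
    """Return the statistically most likely next token given the previous one."""
    return following[token].most_common(1)[0][0]

print(most_likely_next("the"))  # -> "cat", the most common continuation seen
```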
That is useful, and the result can be shockingly impressive depending on what you're trying to do. But the limitations are severe enough that the prospect of replacing an entire high-skilled profession with that magic trick is kind of a joke.
Your customers don't care how your code smells, as long as it solves their problem and doesn't cost an arm and a leg.
A ton of huge businesses full of Sr. Principal Architect SCRUM masters are about to get disrupted by 80-line ChatGPT wrappers hacked together by a few kids in their dorm rooms.
> Your customers don't care how your code smells, as long as it solves their problem and doesn't cost an arm and a leg.
Software is interesting because if you buy a refrigerator, even an inexpensive one, you have certain expectations as to its basic functions. If the compressor were to cut out periodically in unexpected ways, affecting your food safety, you would return it.
But in software customers seem to be conditioned to just accept bugs and poor performance as a fact of life.
You're correct that customers don't care about "code quality", because they don't understand code or how to evaluate it.
But you're assuming that customers don't care about the quality of the product they are paying for, and you're divorcing that quality from the quality of the code as if the code doesn't represent THE implementation of the final product. The hardware matters too, but to assume that code quality doesn't directly affect product quality is to pretend that food quality is not directly impacted by its ingredients.
Code quality does not affect final product quality IMHO.
I worked in companies with terrible code, deployed to an over-engineered cloud provider using custom containers hacked together with a nail and a screwdriver, and yet the product was excellent. It had bugs here and there, but it worked and delivered what needed to be delivered.
SWEs need to realize that code doesn't really matter. For 70 years we have been debating the best architecture patterns, and yet the biggest fear of every developer is working on legacy code, because it's an unmaintainable piece of ... written by humans.
> Code quality does not affect final product quality IMHO.
What we need, admittedly, is more research and study around this. I know of one study which supports my position, but I'm happy to admit that the data is sparse.
The parent's point isn't that shitty code doesn't have defects, but rather that there's usually a big gap between the code (and any defects in that code) and the actual service or product that's being provided.
Most companies have no relation between their code and their products at all - a major food conglomerate will have hundreds or thousands of IT personnel and no direct link between defects in their business process automation code (which is the #1 employment of developers) and the quality of their products.
For companies where the product does have some tech component (e.g. the refrigerators mentioned above), I'd bet that most of that company's developers don't work on any code that's intended to ship in the product; in such a company there is simply far more programming work outside of the product than inside it. The companies making a software-first product (like the startups on Hacker News), where a software defect implies a product defect, are the exception, not the mainstream.
Having poor-quality code makes refactoring for new features harder; it increases the time to ship and means bugs are harder to fix without side effects.
It also means changes have more side effects and are more likely to contain bugs.
For an MVP or a startup just running off seed funding? Go ham with LLMs and get something in front of your customers, but then when more money is available you need to prioritise making that early code better.
Much like science in general, these topics are never -- and can never be -- considered settled. We still experiment with and iterate on architectural patterns because reality is ever-changing. The real world from which we get our inputs and produce the desired outputs is always changing and evolving, and so are the software requirements.
The day there is no need to debate systems architecture anymore is the heat death of the universe. Maybe before that AGI will be debating it for us, but it will be debated.
> That is useful, and the result can be shockingly impressive depending on what you're trying to do. But the limitations are severe enough that the prospect of replacing an entire high-skilled profession with that magic trick is kind of a joke.
The possible outcome space is not binary (at least in the near term), i.e. either AI replaces devs or it doesn't.
What I'm getting at is this: there's a pervasive attitude among some developers (generally older developers, in my experience) that LLMs are effectively useless. If we're being objective, that is quite plainly not true.
These conversations tend to start out with something like: "Well _my_ work in particular is so complex that LLMs couldn't possibly assist."
As the conversation grows, the tone gradually changes to admitting: "Yes, there are some portions of a codebase where LLMs can be helpful, but they can't do _everything_ that an experienced dev does."
It should not even be controversial to say that AI will only improve at this task. That's what technology does over the long run.
Fundamentally, there's ego involved whenever someone says "LLMs have _never_ produced usable code." That statement is provably false.