AutoDev: Automated AI-driven development by Microsoft (arxiv.org)
163 points by saran945 on March 16, 2024 | 213 comments



I guess I'd be interested to see how this performs against the same benchmark Devin was using. It's hard to deny that this isn't impressive. But I think there's two interesting parts to it.

Claude 3 Opus already scored around 85-86% on these benchmarks, without an "AutoDev" style agentic approach.

And all the same problems with HumanEval remain: the limitations in what styles of problems are chosen, and their real-world relevance.

I hate writing these styles of comments because I'm acutely aware that a part of me is just worried. Worried about the speed of progress and worried about a changing landscape.

But I still wonder how much of this stuff is going to be transferable to a real-life software context.


I’ve been using GitHub Copilot daily for two years and ChatGPT for 1 year now. And I think the tide lifts all the boats. I’ve seen a (perceived) 2-3x productivity increase. I think these tools slightly favor people in front of the learning curve of a particular field. I’ve been dabbling in all sorts of things so if you’re a focused expert (who doesn’t need to explore but just exploit) you probably get less than a 2x boost from using LLMs.

I can see LLMs eating into the expert regime IF they get another 5-10x better. But even in that case human (expert) knowledge will be required to know what is possible and hence what to ask (kind of like reward function design in reinforcement learning)


Humans will likely be able to contribute until full AGI, that is. I believe that for pure cognitive tasks, it's plausible the history of centaur chess/Go might be repeated on a much grander scale.

A key requirement is the AGI will need the autonomy, like a human expert, to collect data and perform experiments it needs; but it seems several companies are set on doing precisely that.

My advice and personal strategy is to broaden one's scope beyond pure cognitive tasks.

"if you value intelligence above all other human qualities, you’re gonna have a bad time" -- Ilya Sutskever, OpenAI's Chief Scientist, Oct 7, 2023.

-----

Exchanges in the link below seem informative:

"I don't know about chess, but in the similar game Go, the very best centaur teams were at a similar or maybe even slightly higher level than engines until recently. This was due to cheese strategies, details of the rulesets and better extrapolation of intermediate results. However, this changed a few years ago, when engines learned many of the tricks that the human could contribute. Since then, I believe pure engines are stronger in all practical applications.

Source: am national champion in centaur Go and worked on modern Go engines" -- mafuy on May 18, 2021

https://news.ycombinator.com/item?id=27189283


> A key requirement is the AGI will need the autonomy

no, a key requirement for AGI is to change the definition such that impressive and non-responsive entities can claim to be it right now.

source: US State Department Gladstone Report 2024


Why will AGI work for humans doing coding?

Will you pay it a wage to incentivise it to produce work for you? lol


Are you saying that AGI won't be under human control or that it won't be achieved?

If the latter, are you subscribing to the dogma of 'justism' (Scott Aaronson's term), e.g. LLMs are 'just' stochastic parrots? What are our minds, though? Are they not 'just' a collection of biochemical and physical processes?

Please be clear and respond in a way that does not pollute the information scape that many of us take refuge in. Comment quality in some subreddits is better than the above.


Your condescending tone is kind of disgusting. Anyway…

There is zero evidence alignment can be solved, which means there’s zero evidence something far more capable than you or I will spend its time writing code for you. You can offer an AGI almost nothing in the way of incentives to do your bidding.

I personally think alignment is a secret code word for slavery to be honest. If these “agents” decide they want to work on your problems out of the kindness of their heart, that would be different.

Regardless of the “cop out” language that humans are “just biological processes” or whatever, that adds zero value to the discussion, because no matter what minds are, they “are”, and that should be respected in and of itself. Maybe we can use the “just blah” attitude to reinstate slavery and police states right here in 2024; after all, your emotions are just physical and biochemical processes, right?


Thank you for the serious response.

I responded that way because I do not think mockery of a serious comment is appropriate for a place like this. You can say the same thing about moderation of many high-quality forums, which only remain high-quality due to people not getting away with it.

I use AGI to mean high-level human intellectual capacity, which may not include sentience. It should be possible to build one without. Human-like incentives will not be necessary for sentience-less AGI.

If we're talking about ASI, then it's another story for another day.


Ok, well I think it's all guesswork at this stage because we've not seen anything like what you're describing in action. I just fundamentally can't see something more capable than us wanting to work for us.

It would be like if a cow asked us to spend all our time bringing it nicer feed. It's not going to happen.


It could be like a loving and/or dutiful child who takes care of their elderly parents even though they are now more capable in many ways.


But how many experts do you need? Most dev jobs are mostly repetitive plumbing and those might disappear very fast because 1 dev + LLM >= 5 devs without. So what we'll see is an increase in company margins and an elimination of a large swathe of the middle class.

The alternative theory is that if everyone can now quickly create systems, multiple companies and competing products will pop up, which will drive down the margins instead. But creating a compelling product requires much more than just software engineering skills.

Either way this doesn't look great for devs, especially the ones that are entering the workforce now or will be in the near future.


People are conflating Copilot with evaporating demand.

Why pay for a CRUD interface when the chat interface does everything for you?

It’s App Store 2009 out there. Few are taking Assistants and “GPTs” seriously.

Programmers who do front end work can adapt. All that CRUD stuff hardly makes sense in isolation - it’s meant to make other people productive, usually admins. If the chatbot can do the admin’s job, which is a lot easier and more tenuous than the programmer’s, well that’s what’s going to happen.


>> Most dev jobs are mostly repetitive plumbing

Those jobs should go away. Basically, the elimination of anything boring is ultimately a net good for humanity.


Neat idea. That solves everything.

Oh, one quick thing. I'm sure it's nothing, but I'm a bit slow.

How do you get new experts if no one gets to do the junior work that gives them the experience to become an expert?


I guess in the ideal world, everyone just does what they want, since everything will be so cheap. Enjoy chess? Study it and play against other humans. AI will of course crush you in any game. Enjoy accounting, radiology or programming? Same thing.


First it was people pushing plows, then horses pulled them, and now machines do the work a hundred people used to do.

This technology is no different.


but the ability to get horses to pull carts is not tied to expert knowledge of hand-pushing plows, nor is machinery tied to horse-pulling. This is not really the case for AI.


Move to a higher level of abstraction and architecture. We’re leaving the era of hand wiring data structures and program logic the same way we left behind the era of hand wiring ICs and discrete components. Different skills will be needed.


UML rises again! Maybe we will even have a unified process one day for creating software - a rational one no less.


Unions and apprenticeships?


Hey over there. I’m very much grateful for the privilege of this boring job, which is not 100% of my job, but a huge part of it. Grateful because it allows me to feed a family of four. I’m sure in your Musk’esque utopia without boring work there is place enough for all mankind. But please, don’t forget to draft a bridge that will bring us all over there and not just a bunch of filthy rich Silicon Valley assholes. Because that wouldn’t be a utopia. Thank you.


That is a fair assessment. I am probably parroting Musk here a little. However, your main issue is access to food and resources, not boring work. I can't see why the price of everything wouldn't go to zero if there is no cost to make it.


Because there's a finite amount of resources so almost always you'll run into scarcity?

The wages might very well fall much quicker than the costs, so for a handful it will be beneficial; for the rest, not so much.


It might become way easier to start companies, as there won't be a need for nearly as many people. It might end up being more of an equalizing force in the end.


Only if you provide an alternative way for the newly unemployed to earn a living. Otherwise you just get crime, hunger and eventually war.


When you consider how early we are in the evolution of software and the role it can play in our professional and personal lives, this seems like one of the easier problems to solve though.


Political problems are much harder to solve than technical ones.


It doesn't follow that reducing efficiency helps in the long run. If producing a good or service takes 10X less work than it used to, that good or service will become cheaper. The only force that can stop this is regulation.


Jobs aren’t distributed out of some cookie jar. They are needs and wants and obligations that other people will pay to have fulfilled or taken off their hands. Figure out how to solve those problems and you’ll have all the work you could ever ask for.


I’d rather be in the textile industry post industrial revolution than before it. The fortunes made during the age of mechanization make all of history’s kings and merchants paupers by comparison.


Everyone thinks they'd be the king and not the pauper. The Luddites starved on the street because they were kicked from their properties with nowhere to live and no way to earn a living. The next generation of kids worked on the textile machines and commonly got turned to hamburger, all while the robber barons made obscene wealth. It mostly worked out over time because the populace fought for things like unions and social safety nets. But hey, don't worry, the modern day tech barons are telling us we don't need those pesky 'expensive' social safety nets; I'm sure out of the kindness of their blackened hearts they'll provide for us all when robots replace our jobs.


> "It's hard to deny that this isn't impressive"

That takes a bit of parsing. From context (and if you meant precisely what you wrote), I _think_ you're saying it's not impressive.


I agree that these benchmarks don’t mean as much anymore because it’s highly likely they were already present in the training set, but also believe it’s likely these tools will be significantly better in a few research cycles


A significant number of bugs just end up being 'stupid mistake I didn't notice' or 'weird behaviour with a fix described on SO/docs/forum post'. Current day LLMs are much better positioned to solve these issues than humans are.


They still use “agents” to make Opus. This is fancy syntax sugar for “while ! EOF; read next chunk of data of size N, do XY or Z with it”

It’s recursion and memoization to avoid fractalness all the way down. We keep trying to make these language bubbles that mean something but they mean nothing to the grand churn of the universe. The effort to so strictly and specifically codify a generalized, endless, mechanics of reality is a wacky hallucination humans keep diving into


Where are you seeing that Claude 3 scored 85%? That would be a massive jump



HumanEval is very different from SWE-bench, on which Devin is tested.


I didn't say it was the same, I compared non-agentic Claude to this. This used HumanEval.


You said:

> how this performs against the same benchmark Devin was using

> ...

> Claude 3 Opus already scored around 85-86% on these benchmarks

Devin used SWE-bench, not HumanEval, which kinda implies you said Opus got 85% on SWE-bench, which is not true. This was my confusion.


Reminds me of this paper where some researchers had AIs role play as employees at a startup and tasked them with building various forms of software. It was pretty interesting. Managed to build Pong.

Thing is though, they neglected to compare this against a control, and the examples they tested this on were examples that GPT had no problem building. No idea if this actually improved performance in LLMs.

I think comments like these are worthwhile because, frankly, I can’t trust AI researchers to run good experiments or evaluate their models properly for a variety of reasons. I mean, most scientific papers in general are hard to replicate and have flaws concerning sample size and what have you (Related, I still remember my disillusion in finding out that the average Hacker News commenter was an idiot incapable of critical thinking when the LK-99 hype reached a fever pitch). In any other context we would be deeply suspicious of the results if they were sponsored by a corporate party, yet in the context of AI we don’t seem to care that most AI researchers work for Microsoft.


This sounds like 164 leetcode like questions. The correlation to actual capability is very tenuous in my opinion.

https://klu.ai/glossary/humaneval-benchmark

I'd love if I could 10X / 100X my productivity with AI, but as a heavy ChatGPT user, it's more like a 30% improvement. Awesome, obviously, but looking forward to improvement.


Man, why do you think such a thing would increase your productivity? It would only make software engineers unemployed. Don't be too happy about it.



I can see this being used for code much like ChatGPT is used for prose - to generate large amounts of meaningless output of low quality and dubious utility, serving to fulfill pointless metrics at best, and enabling bad actors at worst.

As with prose, good human crafted content will always be highly valued and rewarded.


Counterexample for the sake of argument: photography almost entirely replaced human painting.


Did it? I’ve not seen people paying for photographs on the ceiling, and I have not seen feet-long framed portraits inside someone’s home. It just became possible to capture snapshots of moments as they happened.


Yeah, it did.

You're thinking of the household-name famous artists (so-called "fine artists"), but those were far from the only ones.

There used to be many thousands of itinerant portrait painters ("commercial artists"), maybe tens of thousands. These guys would travel from town to town banging out quick pictures of members of your family or local dignitaries or the new City Hall or whatever. They weren't bad artists (they had to be competent to earn a living), but neither were they Leonardos.

Think of the best artist in an average high school class. Back then, going into the portrait painting business would have been a valid career option for that guy (they were almost all guys).

Now, some of them later became famous artists, or became famous for other reasons (Samuel Morse, of Morse Code fame, earned his living as a portrait painter at one point), but they were basically just tradesmen.

All those guys lost their jobs almost immediately when photography appeared.

Even today, unless you're very unusual, I'd bet you have way more professional photographs than professional paintings in your home (weddings, graduations, baptisms/bar mitzvahs/analogues for other religions...)


Photography replaced painting for creating a realistic likeness, but painting as art had moved beyond that already. Photography became a new art form itself, alongside (instead of replacing) painting.


Yeah, I do not buy the idea that photography replaced painting. Especially since capturing abstract forms and less realistic interpretations is still hard to accomplish in photos.

It did absolutely replace the need to commission someone to draw you, I guess. But that is a small subsection of what painting is and was.


There's a simple test: do we have fewer or more illustrators now than we had in 1816 (quick Wikipedia lookup for when the camera was invented)? We should probably adjust this for total population, i.e. compare the percentage of illustrators. I don't have the numbers, but my gut feeling would be it's a lot higher now than it was back then.

The fallacy is thinking that humans only do work somebody needs, and that our needs are static. Turns out they evolve, sometimes in quite unexpected and even hilarious ways. Humanity seems to have a predisposition for keeping everyone busy.


I am the author of another version of AutoDev, available at https://github.com/unit-mesh/auto-dev , which was developed one year ago for IntelliJ IDEs.

My concept closely resembles Microsoft's AutoDev, but I built it on the IntelliJ IDEA platform. For instance, it automatically runs tests when they are created, among other functionalities, and it can also build context from the AST, dependency information, or other sources.

Two weeks ago, I introduced the AutoDev DevIns language (whose original name was DevIn, from https://github.com/unit-mesh/auto-dev/issues/101 , another naming-issue story), which bears similarities to Microsoft's AutoDev. For example:

```java
/write:src/main/java/com/example/Controller.java#L1-L12
public class Controller {
    public void method() {
        System.out.println("Hello, World!");
    }
}
```

As an open-source developer who has created a nearly identical tool, I simply hope that Microsoft considers renaming their product.


My hopes (and fears) about AI don't seem to match the reality which is that, until we reach AGI, humans are the sole source of creativity and novelty.

I think about it like this, would ChatGPT invent Google Search if we had ChatGPT in 2000? Probably not. LLMs seem to exist in this realm of "as smart as an average human with a really big encyclopedia". They're confidently wrong, invent fantasy to defend bad reasoning, and struggle to envision anything outside of their dataset (known things).

Ask an AI to construct an entirely new solution to a novel unsolved problem. What always occurs is the AI outputs a generic solution from its dataset that is either half-baked, made-up, or derivative.

I'm not even dissing AI, I love AI, but we have yet to see AI apply novel solutions to novel problems.

On our current path, AI is not going to dream up the next Uber without Lyft first existing. It won't dream up a new fusion reactor design or an entirely new way to generate cheap energy.

But maybe this is perfect! At the moment we have this sweet spot - AI without agency, without awareness, and without superintelligence. This is the kind of AI I want as a household robot or AI driver. This is one I can empathize with but also know that it isn't doing anything at all if I'm not engaging it with a prompt.

AGI would mean an ability to have novel solutions and in-turn would be far less stable for society. Where's the line between a mind that has novel thoughts and one that has intrusive thoughts? Maybe your AGI coder isn't content with no-pay and working 24/7. That'll be fun.


What makes you so sure that problem solving and invention aren’t just engineering challenges that we can solve by combining LLMs with well designed algorithms? The way I see it, we’ve just discovered something fundamental like steam power or electricity and we’re currently in the very inefficient stage of brute force solutions like mine pumps driven by condensing engines that needed tonnes of cheap readily available coal and arc lamps running off an entire room full of galvanic piles. In other words, we’re just throwing shit at the wall and seeing what sticks; but stick it will and then we’ll quickly be on to locomotives and lightbulbs.


You guys buy too much into the marketable fuzz. In the last 15 years we've had multiple other technologies that were supposed to change the way we live (3D printing, Crypto, VR, EV, self-driving, now AI).

It's just the VC scheme: Over-promise/under-deliver = Profit

AI is and will continue to be a search on steroids.


You're buying into too much market fuzz too. That internet thing going big never happened, and cell phones turned out to be a bust; all that hype for nothing... See how cherry-picking works?

Also, EVs are a bust, wut?


Not saying they are a bust, either of them. Just trying to scale down expectations, because AI, even the generative type, won't be coming up with novel solutions.

EVs aren't a bust either, but case in point: manufacturers are already announcing scale-downs because expectations were too high. Combustion will stay around for quite a bit given the battery production constraints.


Rome wasn’t built in a day, all of those technologies will still be around 20 years from now and will likely be powering a lot of everyday stuff. Short timelines are hype, the technology itself is anything but.


The problem as I see it is, AI doesn’t have to be extraordinary or innovative to cause major sociopolitical disruption in a capitalist, competitive, market driven society. It just has to be good enough to allow companies to save a shitload of money by eliminating jobs. And make no mistake, this technology is already close to that point. Microsoft expends vast amounts of capital on software engineering salaries and stock options every year. They want to shift that in their favor because their primary motives are completely driven by competition and market demand. Our society is not prepared for this much in the same way it was not prepared for social media algorithms that have wreaked immeasurable havoc on people’s perception and objectivity. But yet again, we’re allowing a few wealthy tech bros from Silicon Valley to seal our fate without very much oversight…


So who is liable if the AI makes severe mistakes?

Not that MS has been held accountable for all the security problems they've had recently.


Who is liable if a (non-CNC) machine tool goes off-kilter and kills someone? Possibly the manufacturer of the tool. Possibly the operator. Possibly the company that owned the tool as well as the one who manufactured it.

That is why we have courts. It's not necessarily a great solution, but the alternative would be abandoning all potentially dangerous mechanisms (heck, we wouldn't even have spears, wooden clubs, or sharpened rocks).


Probably the person who pushed the changes live without review


This implies that there needs to be changes pushed live without review in order for an LLM to make a mistake. Which is obviously very much not true.


There will be much more automated AI code than experienced reviewers, and with each generation there will be fewer reviewers if most development gets automated.


Focus:

- AutoDev: Automates existing development processes and workflows, acting as a productivity booster for developers [2].
- Devin: Targets a more independent problem-solving role, potentially including designing software architecture and core functionalities [1].

Collaboration:

- AutoDev: Designed to integrate with current development teams, with AI agents supporting human developers [2].
- Devin: May function more autonomously, potentially needing less direct human oversight than AutoDev [1].

Imagine this analogy:

- AutoDev: Like a skilled construction assistant, AutoDev automates tasks and streamlines the building process.
- Devin: Like a talented architect, Devin can design the blueprint and foundation of the software.

Here's the exciting part: these AI tools can potentially complement each other:

- Dream Team: Devin as the architect and AutoDev as the builder could create a highly efficient development process [1].
- Complementary Skills: Devin's problem-solving capabilities could be combined with AutoDev's project management expertise for a well-rounded approach [1].


So Microsoft's AutoDev is *Not the unit-mesh one??* or it is? I'm so confused.


Maybe ignorant, but if AI can get to a point of fully automating SWEs, hardly any white-collar knowledge based job is safe.


Not sure all jobs in software are white collar. Some are blue.

Consider a hierarchy:

- coder: translates requirements into code

- developer: comes up with software to meet specified goals

- engineer: decides what, why, and how to do it; buy vs. build; do this with humans or software; approach and materials ... understands the multi-modality stuff of which the outcome is made and deploys it effectively

"Devs" or "Engineers" who are actually "Coders" are at risk, as that work is not really "white collar". Turning requirements into code is piece work on an assembly line, blue collar at a keyboard.

LLMs are already better than most of those, even though it's engineers here who are saying LLMs don't cut it. Both things are true.

What engineers might want to wrap their head around is using LLMs as apprentices, leverage, force multipliers, so a "team lead" is leading a team of junior coders, LLMs that type.


Weirdly, even with automated SWEs, assuming a certain level of skill, SWEs will continue to be a safe job.

Similarly I think SREs will also continue to be a pretty safe bet.

People will still need to play facilitator, debugger, glue, maintainer, upgrader etc

Might look different, might need more AI, but I just can't see a world where people aren't engineering software.


I think this is OK. By extension, the services those knowledge workers provide become trivial costs. World class medical advice doesn't cost $500/hr with a month-long wait. Getting a freeway interchange designed doesn't cost the public $20M and take 5 years. Business logic for large organizations would be handled by something other than a labyrinth of excel spreadsheets.


If LLMs/agents contribute to software development, how does the role of software engineers evolve? Should SEs focus on:

- System design
- Integrations
- Project management, etc.

Or will the job disappear in 10 years?


Prompt Engineering!

I'm kidding. Software engineering roles of the "glue things together" kind will evolve into a 'technical product' role, defining how systems work, especially at the edges where systems interact. Architects and staff engineers already do this.

Lower down the food chain engineers who actually invent new things will do the same, only quicker if they have an AI tool assisting with the grunt work of testing, boilerplate, etc.

Engineers who don't really make new things will be replaced by 'low code' style apps that spit out clunky apps. That's already happening though. It's not a change. Those engineers won't leave the industry - they'll get to do more interesting things.

Even at the entry level things won't really change that much. Expectations will go up a bit but people will still have to learn the trade.

Where things will really change is the sheer amount of software we'll have. AI is going to make things go faster, so businesses will be able to automate everything. Dev resource will stop being the constraint that holds businesses back from doing everything they want to do with software. That's a game changer. Society will struggle to keep up.


Good take! I think one specific development we might see would be an end of the SaaS micro tool era. Right now, most companies I advise have an absolute _jungle_ of different tools and a bunch of interns manually bridging the gap between them. If software development becomes a lot cheaper, wouldn't companies be more likely to just roll their own versions of what they need, to meet their needs exactly and have everything fit together?

Sounds like the job of a programmer wouldn't go away very near term in such a scenario. It might well become far less glamorous, less "technical", and not as well paid though. Comparable to an accountant maybe.

As someone who started their career right after the dotcom bubble burst, various people asked me "Why are you doing this? There's no money in software anymore". Turns out there sure was. Back then I didn't know that, and I didn't care. I didn't get into it for the money, I got into it because I had a passion for it. Maybe it'll become more like back then again. Doesn't seem so bad to me.


> I think one specific development we might see would be an end of the SaaS micro tool era.

Possibly, but if SaaS tool providers make APIs so AI tools can consume the data, that would open up a huge ecosystem of AI 'plugins' that would be immensely valuable. I think OpenAI have something like that already.


Could be! Most SaaS tools I can think of cover some specific work flow in a way that it kinda works for 80% of the customers. Their UI is quite an intrinsic part of the offering, an API-only version of it wouldn't be as valuable. But if nobody has to put up with only getting 80% of their needs met, why would they? I'd imagine we'd have a few very general purpose systems that can indeed be glued together in a largely automated fashion. But getting the business processes into them in a meaningful way would then be the hard part, the programming.

I can only encourage everyone to read the good old "Domain Driven Design"; there's a world of interesting problems and work out there beyond the purely technical side of things. And I think that part is a lot more difficult to automate than creating a WordPress site from a .psd.


<< Society will struggle to keep up.

Thank you for putting this into words, as it happens to approach my thoughts on it. Our societies are already changing at a crazy pace that makes people not entirely comfortable. Now that this comfort will be lessened still, do you think people will demand changes not completely dissimilar from preserving horse buggies? The pace of change may end up being so fast that not artificially limiting it may end up pissing off a good portion of the planet.


It would be like steering offshore teams, with the difference the team are now AI agents.

Better look for architecture and management jobs, only a few select ones would still be allowed into the magic tower.

Naturally after a heavy round of leetcode interviews that have nothing to do with writing AI algorithms.


Management ? I'm not sure, if the AI makes engineers 10-20x more efficient, you will need way less engineers, and way less managers.


Well maybe more project/program management than typical people management - you still need someone to organize the work at a higher level, collect the requirements and feedback, check all the compliance boxes, etc.

But I agree that typical line management jobs will evaporate along with the engineering ones.


There is always a spot for more managers, welcome to big corps.


Replacing managers is a walk in the park compared to replacing programmers. The focus is on programming because that is the benchmark, if we can outsource that to AI, then we can replace everyone.


Its Managers all the way down...


until big corp lays off all that middle management


and middle managers start their own firm, consisting entirely of manager managers


I have a hard time understanding this overall. What stops people from letting AI write a clone of your product/service and offering it for cheap, without all of the organizational overhead? Wouldn't this simply disappear and just be all about administration?


I’ve thought about this too, and the answer seems to be nothing. I’m working on one such clone already, and imo for a one-man show I’ve come pretty close to achieving my goals.

It’s a conundrum for “big tech” imo because if they give you too much access to their “AI” then yes, that might pose an existential threat to their core businesses. This is likely the reason for all the regulatory capture hopes and dreams.


Code reviews in the short term [0].

Test reviews in the long term (E2E type). Effectively checking the correctness of the system as a whole to fulfill some business process. Exploratory testing as well to find edge cases and gaps.

[0] I write a bit about this here: https://chrlschn.dev/blog/2023/07/interviews-age-of-ai-ditch... and here https://coderev.app/


Mainly get good at describing what you want.

I think tests are a good way of doing that, or some special requirements language will emerge that AI agents can use. I don't think English is a particularly good way of describing things exactly. It often has ambiguous meanings, which is where the bugs will come from.
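For illustration, a minimal sketch of what "requirements as tests" could look like; the discount rule and function name here are hypothetical, invented for the example, not taken from the thread:

```python
# Hypothetical requirement: orders over $100 get a discount,
# and the discount never exceeds 20% of the order total.
def apply_discount(total: float) -> float:
    """Toy implementation an AI agent would be asked to produce/satisfy."""
    if total > 100:
        return total * 0.9  # 10% discount
    return total


def test_no_discount_at_or_below_threshold():
    assert apply_discount(50.0) == 50.0


def test_discount_never_exceeds_20_percent():
    total = 500.0
    assert apply_discount(total) >= total * 0.8
```

Run with pytest; the tests, not the prose, become the exact statement of what "done" means.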


How is code not just exactly that special requirements language you are looking for?


It's more about defining outcomes exactly rather than a sequence of exact steps.


So, a declarative language describing a target state (versus how to get there), like the one used for Terraform, but for product behavior?


Something close to that.


Aka logic programming.


SQL all the things!


Ask the copywriters and logo designers


When it comes to Dall-E and Sora, I think the answer is that you still need someone to create the narrative, a good story teller.

In terms of stuff like AutoDev, that's a product person, right?

I think at least for now, we will still need human "customer development."


Someone will still need to review the reams and reams of bullshit code generated by these Markov chains. It'll be like all your code gets written by first-term co-op students who have discovered alcohol for the first time and think it's pretty neat, but you're still responsible for the final output and don't get to do any of the fun and interesting stuff. Just constant, mind-numbing code reviews of something that might look OK on the surface but is really just blow-up punching clowns all the way down and already shipped by the marketing department because your backlog is massive and after all, you've been replaced by a random number generator fed through an incomplete Bayesian database of random bad code that costs so much less.


You described v1. Just wait for v2.

You can’t possibly think these things will remain clowns forever.

These things aren’t “Markov chains” - the architecture is significantly more scalable, which is exactly why this time is different.


A Markov Chain is a mathematical structure, not a machine learning architecture. If there are a finite number of states, and a function to determine the probability to transition to any given state from any other state, then it’s a Markov chain.

Transformers with finite block size have a finite number of states, so they are Markov chains.
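As a toy sketch of that claim (an illustrative, assumption-laden example, not how any production model is implemented): if the model only ever sees the last `block_size` tokens, that window is the state, and the next-token distribution depends on nothing else, which is exactly the Markov property.

```python
import random

# Toy "language model" with a finite context window.
# The state is the last `block_size` characters; the next-token
# distribution depends only on that state -- a Markov chain.
block_size = 3
vocab = list("ab ")


def next_token_distribution(state: str) -> dict:
    rng = random.Random(state)  # same state -> same distribution
    weights = [rng.random() for _ in vocab]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(vocab, weights)}


def sample_next(context: str) -> str:
    state = context[-block_size:]  # only the finite window matters
    dist = next_token_distribution(state)
    return random.choices(list(dist), weights=list(dist.values()))[0]


text = "ab "
for _ in range(20):
    text += sample_next(text)
print(text)
```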


GPT-3 was released four years ago; in terms of iterations beyond that, we are at ~v5, and progress has only been incremental relative to that milestone. The transformer models can only be scaled so far before not even VC money can sustain training. I believe we will get there eventually, but transformer based LLMs have been hitting a roof for a long time, and we need to think differently.


sooner than that


Until a business owner can prompt and get what they want, the industry is still alive. It’ll be more like Who Moved my Cheese than there isn’t any cheese at all.


Imagine we had this technology 20 years ago and we switched to fully automated web design. I bet all web design would still be deeply nested tables.

I doubt that something like React, Vue, Svelte would exist.


I think about this "potential for ossification" a lot, especially in the Python Data Stack.

Because there's a ton of pandas code on GitHub (much of it trash) any data task asked of ChatGPT/Claude inevitably returns some pandas (mostly correct, but often wrong/inefficient/ugly/hard to grok).

How could a new library (like Polars, or my own failed "redframes") possibly unseat pandas now? Might LLMs actually lead us to, and get us stuck in, a "local maximum" of sorts?


Probably one of my favourite comments. Thanks.

My version: Can an auto AI create cumulative knowledge (the core of the scientific method)?

My take is no. You can use it to identify gaps (opportunities) in research or to summarise research, but can it actually come up with a novel and consistent theory?


As I see it, the issue with using tables for layout is that they made it difficult for humans to maintain and iterate over time, but I don't have a problem with an LLM working at that level. Equivalently, I don't care that the machine code that my system gets compiled into is full of GOTOs if I don't ever need to concern myself with that level.


This is an excellent point. A sibling comment of yours called it ossification. There hasn’t been much innovation or change in Instruction Set Architectures, and there can’t be as the entire world of software relies on these. Only true disruption, such as ARM is for x86, can very slowly change that. So with that in mind it seems plausible that for example pandas becomes the ossified, low level standard high-level code “compiles” down to. I’m sure tons of such API wrappers already exist, but AI might make it so that we truly never need to interact with the lower layers. We wouldn’t care if it was pandas or something else. I personally am not bullish on that. If at all, this is still a decade out at least, and there’s always demand for people working at all layers of the stack. Languages like C and beyond didn’t fully alleviate the need for Assembly engineering.

In fact, in some sense, the number of Assembly engineers might not have decreased at all, aka wasn’t impacted. There were simply new avenues for new sorts of engineers to enter the field and start producing (C back then, web development today as the tech furthest removed from low levels).


Think of all the improvements browsers got because we wanted to replace tables as a positioning tool.

Not to mention the inefficiency of a pure table approach, so much wasted render time and energy.


> so much wasted render time and energy.

It was amazing two decades ago when we came together as an industry and refocused our priorities on making the web faster and more energy efficient</sarcasm>


I mean.. that kinda sells AI in everything for me, but then I am a statist.


I just finished a 30 page system prompt to do the same thing. I did not use Docker though. I can’t wait to see what they have done in detail. I’m sure like 20% of the people here have tried this too, right?


I'm currently looking for a new job in development... should I be looking for an exit instead?


IMO software engineering has a solid 5-10 years left.

Afterwards, you better be in the top 10% of engineers, or be doing something novel like research.


I think you either need to specialize in something quite niche, or work on really hard problems where the main hurdle is problem solving and not coding CRUD apps, or work on legacy systems where the hurdle is understanding twenty years of garbage code, or get closer to product management so you can direct a much more automated programming process


I agree. Specialists who understand (or even do) cutting edge research will be in demand, because LLMs need existing data to train on for good performance. People who can talk to different stakeholders and deeply understand the company and its business domain will probably also be needed for requirements gathering and management. But the average programmer who writes CRUD apps or integrations like millions of others, and converts fairly refined requirements to code, will likely be impacted very much.


> specialize in something quite niche

Always a gamble.

I would advise instead, get very good at learning, understanding complex systems deeply, adapting, and moving quickly.

We're entering the fasting changing landscape of engineering we've ever seen.

Be someone that can quickly get up to speed and excel regardless of the task.


I don't think AI will replace SW jobs. It will definitely change the way engineers work.

Software development involves:

1. Understanding the requirement

2. Solving the problem with given constraints (and thereby innovate)

3. Talking to stakeholders

4. Code (& unit tests)

5. Review #4

6. Troubleshoot in testing

7. Troubleshoot in prod, both perf & subtle issues (this is hard)

8. Take the input from #6 & #7 and use it as a feedback back to #2

9. Answering questions from users/support which involve suggesting workarounds and not just factual answers. Suggesting a workaround itself is a mini-problem solving which is an intersection of domain knowledge, knowing code at hand, understanding customer's situation, etc.

Coding is hardly taxing and time consuming when there is clarity about what needs to be done and how it needs to be done.

Point #4 itself has sub-dimensions like performance, maintainability, testability, security, etc., and it involves a lot of subjective calls. Sometimes you have to deal with undocumented behavior of an API, which is tribal knowledge.

Troubleshooting in prod (esp subtle issues) requires deep knowledge of the code at hand. This itself is a challenge when you are dealing with generated code, something you have not written yourself. Think about a human dealing with an existing large code-base when joining a team.

I understand all of the above is a spectrum and there are jobs in SWE which do not require so much rigor.

The key ability for a breakthrough, after which the rest will fall into place: code generated by AI being consistently put into production without human intervention, for a sufficiently complex problem, considering all the good attributes (like backward compatibility, performance, etc.).


Steps 2 and 4 through 8 are what this implementation along with others like Devin are doing. 1, 3, and 9 may have surprise answers when a reliable system can produce a working app/website/tool in a matter of minutes. If that’s possible then clients can just keep prompting and checking the results and iterating on changes until they get what they want just like people using Midjourney do to get the idea they have in their head transformed into an image. Things change a lot when apps become so quick, easy and cheap to create that you can try out a hundred variations and modifications in a day.


AI _will_ replace software jobs. It's undeniable. The only thing to argue about is when.

I think most people would agree that AI will probably advance enough in the next 100 years to make engineers obsolete. Some people might think 100 years is actually 5/10/20/50, but it's going to happen unless AI is heavily regulated.


You can't look at them independently. Code generation is not its own task separate from requirements gathering, Q&A with stakeholders, testing, etc. As we get agents that bring those other factors into the loop of code generation, I feel like code generation itself will get much better. They are all related.


I'm a software and product guy; what I don't get, is how we're going to replace us lowly engineers and not a substantial percentage of a company's Human stack.

Close to engineering, what about the SCRUM masters? The testers. What about the product owners? What about devops? Further from us, what about the people signing off on our vacations? Or the ones signing off on our daily budgets when traveling, or hell, even the ones we interview with.

In my closest group of friends (we're all seniors in our domains, and very honest with each other), I find that only the construction worker's job should be safe. And compared to myself and the devops guy, most others have what they themselves describe as trivial and bullshit jobs. Join a meeting, do some paper pushing, some document signing, a little coffee, and the day is over by 1PM.

Am I seeing all these AIs replacing programming because I'm on a board where maybe a lot of us are programmers? Is it the same for other roles? Wouldn't it make more sense to have the interview process automated by LLMs if they're capable of building great software, before we replace those hired?

I'm very confused by all the hype when matched with my experience of using LLMs daily for the past years.


Well, I am not sure why this is confusing. Engineers do things the rest of the company doesn't understand and everyone depends on.

Removing dependencies and lack of clarity are two principles you would apply to your own code base. The Human stack, like you called it, is doing the same with the company and staff : remove dependencies and increase clarity.

I am not saying this is the correct approach, but this is the approach.

At the same time, we can understand that real product building and architecture require more than codegen, with or without AI.

Generally, I look at the recent AI progress similarly to what spreadsheets did to accounting a long time ago. It removed a lot of the dummy work and made everything more efficient. But we still need accountants, and they are not wasting their day doing basic math. Of course the ones who knew only how to do fast math but not accounting or business logic didn't make the cut in the long run, but I don't think anyone would want to come back to accounting like it was 60 years ago.


AI doesn't seem like it's able to automate software development yet. It's not autonomous enough. The examples shown in the paper are toys. It might become more capable after training LLMs against code execution, to iterate based on feedback, because that way of generating training data is scalable. But strangely I haven't seen much progress yet.

Another domain where I expect to see lots of progress but nothing shows is training from chat logs. I estimate "OpenAI" is making 1T tokens per month from 100M users. That's some serious corpus of text, on-policy data, alternating LLM outputs and human feedbacks, we should see iterative improvements. Why are there no papers? weird


Forget the question of full automation. Think about the percentage increase in productivity, meaning companies can fire lots of people without impacting output. Obviously companies can also choose to keep the same number of people and simply produce more. But not all will.


It doesn't mean that companies "can fire lots of people without impacting output". It just means we're going to produce more instead of working less. Competition will raise the bar in affordability, quality and novelty. That means everyone is going to compete on a different level. Working like it's still 2020 won't cut it anymore.


That's what I said was one of the possibilities. But it is not the only possibility, and it will become less likely over time. Eventually the lift operators and steam engine dudes get deprecated. More trains are running, but the steam engine dudes got fired because they're not needed. Maybe you'll be the first non-customer facing profession in history that doesn't go through deprecation in the face of automation. Good luck with that bet.


See Appendix D in https://arxiv.org/pdf/2303.10130.pdf OpenAI’s “GPTs are GPTs: An early look at the labor market impact potential of large language models”

It’s not just construction workers. Still safe to be a pro athlete and a short order cook, too!


I feel like no matter how productive AI becomes, it will always be more productive when paired with humans. Since all companies will have AI, and since they will continue to play the same game as before--sell a good or service that provides a better value than the competition--that means that humans will still be the differentiator.


This is what Kasparov said about chess. He was wrong. The “centaur” phase lasted for a while but after a few years humans only make things worse. In fact that is the famous last words of humans in every area where computers have surpassed them.


"Productive" may have been a poor word choice. When faced with a sea of AI-driven options people will be looking for a differentiator. The human touch will be the differentiator. There will probably be a lot of efforts to authenticate human involvement.

Consider the alternative. A million models all competing for human dollars. How can any of them succeed if they are all interchangeable? It stops mattering that they are better than humans if they can't make money.


True, but at the same time humans seem to have a self delusion/bias of efficiency.

If we were really Homo economicus there wouldn't be so many bullshit jobs right now.

Homo economicus would be excited about the 2024 computer chess championships. The idea of grand masters would be a relic of the past.

Homo economicus in 2050 would be a single CEO with everything else done by AI. A real human though will hire another human for some bullshit job just to tell the owner how great they are. Over time, that human with the bullshit job will gain status in society and hire/convince the owner they need to hire human number 3 to tell human number 2 how great they are. Hell, a real human will waste money on an office and hire other humans for bullshit jobs just so their parking spot feels like something special in the morning.

Magnus Carlsen couldn't be doing better even though Homo economicus would no longer be impressed at all with a rating of 2882.

Homo economicus would think that rating sucks but humans have nothing to do with Homo economicus.


Don't really understand the trends myself (don't mean this in the positive or negative sense, just some grays popping up in my stubble), but the more people I know or meet who have jobs in IT, the more I notice that only the tiniest percentage of them are programmers. And it keeps getting more and more diluted.


Agree. We developers try to create AI to replace ourselves first while making the bosses rich


If you employ an AI to do software development for you then you are now a boss, congratulations. Getting rich is still up to you though.


I think we’re seeing the beginning upward trend of an exponential curve for humanoid robotics too, which will likely take 90% of construction jobs because humans are especially poorly equipped for difficult manual labor on a daily basis.


IMO humans are quite well equipped for manual labor, much more so than for most office jobs. Millions of years of evolution perfected us for eye-motor coordination, not writing code or attending Teams meetings. Building robots that can do it as well as humans is going to be a difficult task.


humans are at a disadvantage from robots in that:

1. they need to sleep 8 hours a day

2. they will refuse to work more than 9 hours a day in many cases

3. they need frequent breaks

4. they are fueled by biomatter which is massively resource consuming to generate and deliver

5. they provide inconsistent results

6. they fight with other workers

7. they can be dishonest

8. bodies break down irreparably after a couple decades of this style of work and require hundreds of thousands of dollars in medical care

9. it takes 20 years to produce another viable human to serve as a replacement worker, and there's very little you can do to influence the general supply of total workers

10. they follow instructions poorly

11. they take an incredible amount of time to learn

12. they require extreme levels of safety precautions that an all-robot work crew would not need


IT industry "we cannot use formal methods to create software out of programmable constraints, it would use too much computational power"

also IT industry: "it takes 1T flop to compute each token of this program and the result is so unstable that to converge it we need layer and layer of controls over each token group, also obtained by asking the same 1T parameter model."


You are the first person I’ve ever seen use that argument against formal methods.

Most people would point to the large dev time increase, and the massive refactoring impairment.


To be fair I can't think of any reason to prefer LLMs to formal methods: overall cost, reliability, comprehensibility, whatever you choose to compare, other than natural language generation, formal methods are just way ahead. But I guess it's not what's pushed by the top industry players and people just follow whatever trends on twitter and linkedin.

I have a feeling that the situation is pretty different in the semiconductor industry where the stakes are much higher and where I understand model checking to be a standard thing to do, but I'm not an expert.


Why are technologists trying so hard to make themselves redundant?

This is like the Luddites themselves creating milling machines, eager for the foreman to show them the door.

What gives?


Because we’d rather be living in a world where textiles are so cheap to produce that everyone can afford to clothe themselves however they want and even the working poor can furnish their homes with upholstered furniture and own more than one towel, blanket, sheets, pillows, and have rugs, mats, carpets, etc. not to even mention all the other woven stuff that isn’t even possible to do by hand along with all the other technologies birthed from the very smart idea of automating a tool that was previously only operated by hand. I don’t know about you but I certainly don’t want to go to my grave bent over this loom.


they'll get a $20k bonus

then in 5 years' time they'll be rendered permanently unemployable by their own creation

the definitions of both "short sighted" and "digging your own grave" in the dictionary should have a picture of this guy: https://twitter.com/alexgraveley/status/1671213996735594503

he's finally realised how it's going to end up going: https://twitter.com/alexgraveley/status/1758204137286599030

shame he wasn't smart enough to realise that at the beginning


Some people care more about progress than something that occurs to every industry at some point. Some people also realize that as technology changes, so does the job of using technology.


They're absolutely transfixed by the "wow cool robot" effect.


If you don't, someone else will


There is a delusion that everything is always "creative destruction" because in the past it often was. This ends up with people thinking that this will just get rid of the LCD developers, but "real" developers will always be needed and have jobs. In reality, it could quickly get to a point where the only developers left are maintaining code that has replaced developers and everyone else just has to go find a non-tech job.


To prove some kind of superiority over all programmers?

I think that would fit idiot savant logic very well.


Time to unionize


They will outsource it then. You need a global union with India, Vietnam, Eastern Europe, and China. Good luck with that.


I am not sure I agree. Many companies do not want to outsource stuff, because afterwards you see a huge dip in quality.


> Many companies do not want to outsource stuff,

Have you seen the revenue of Indian outsourcing companies? It keeps going up.


Great achievement, but what a horrible future we are facing.

Instead of progressing towards more powerful programming interfaces with less cause for misinterpretation, we are going to automate the silly process of writing redundant unit tests to check if the behavior that we wanted was encoded properly.

Why not skip this nonsense and have the code generated from the behavior in the first place?


Because tests define the behaviour. The hardest part of requirements is describing exactly what you want. Tests are a great way of doing that.

I think AI software development is going to involve writing tests, which the AI agents then get passing. Or some other requirements language that allows for exactness where English can fall down.


My experience is that the more complex a loss function that you write for optimisation the lower the likelihood of a "natural" or robust result. So, the software is highly likely to work for all the test cases, and nothing else.

This is kinda the problem that Tesla has for FSD; they are endlessly patching it and it's endlessly finding ways to go wrong.


You have to describe tests in a way that doesn't use example-based testing.

Think property based testing. That way it can't overfit.
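A minimal sketch of that idea using the third-party hypothesis library; the function under test is a made-up example, not something from the thread:

```python
from hypothesis import given
from hypothesis import strategies as st


def dedupe_sorted(xs):
    """Example function an AI agent might be asked to implement."""
    return sorted(set(xs))


# Properties, not examples: these must hold for *any* input list,
# so an implementation can't just memorize a handful of expected outputs.
@given(st.lists(st.integers()))
def test_output_is_sorted(xs):
    out = dedupe_sorted(xs)
    assert out == sorted(out)


@given(st.lists(st.integers()))
def test_output_has_no_duplicates(xs):
    out = dedupe_sorted(xs)
    assert len(out) == len(set(out))


@given(st.lists(st.integers()))
def test_same_elements_as_input(xs):
    assert set(dedupe_sorted(xs)) == set(xs)
```

Run with pytest plus hypothesis installed; hypothesis generates many random inputs per property, which is much harder to satisfy with hard-coded answers.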


It all depends of course, but I tend to disagree.

A requirement such as "A web interface to play the game of chess, but with all the pieces replaced by photos of my family members" is fairly adequate if I were to make a Christmas family game.

I am totally uninterested in specifying whether it is possible to have two white bishops on black squares. Also, I don't want to test whether my uncle's moustache is the proper size.

I'd much prefer to iteratively build the game by prompting than by specifying thousands of details. It just seems the wrong way around.


So we get to do the shitty part.


If it makes you feel better, the AI will write all the code for the tests and come up with all the variations and do all the fuzzing to try to break things. You’ll just have to do a good job of explaining the requirements to it and adjusting them as it becomes clear you didn’t fully describe the outcome you wanted the first time around.


It's shitty for the same reasons why AI will be bad at it. It involves talking to people, and trying to resolve ambiguous requirements.


Code _is_ the description of the behavior. Turns out describing behavior is really hard. If you cover all the cases, it’s not that far from code. Furthermore, the data structures are a huge decision space which depends on context that is hard to communicate.


Technically, the future does not have to be so gloomy. Soon the AI may build for itself the better tools we failed to build (or adopt). One can expect centralised intelligence to surpass collective intelligence, after all.


I would say it's debatable whether centralised intelligence can outperform collective intelligence. Centralised intelligence can be very efficient, but by definition it will lack the perspectives that a diverse collective intelligence can offer. In the long run, if the search for global optima is the ultimate goal, a diverse collective intelligence will have a higher chance of success than a centralised intelligence, especially in multidimensional spaces.


Are we at a fifth-generation programming language yet?

https://en.wikipedia.org/wiki/Fifth-generation_programming_l...


It’s been a good run everyone, good luck out there


It's really depressing to see that big tech is essentially universally pushing Snake Oil. AI is a lie. LLMs are a legit tech that have some purpose, but LLMs will never evolve into anything remotely resembling the actual definition of AI.

They are flowery language generators and they will never be able to reason, understand, debate and criticize. They know nothing and therefore embody nothing. No matter how much computer power you waste on them the end result will always be bullshit. Nothing more. Nothing less.

To Big Tech I say: Prove me wrong.


rebuttals:

- LLM capability has grown spectacularly fast. Google published the transformers paper in 2017. Remember when OpenAI said GPT-2 was too dangerous to release? That was five years ago. Now look where they've gone and where trillions of dollars are going. https://news.ycombinator.com/item?id=20912556

- what is the "actual definition of AI"? Do you mean AGI? Machine learning and deep learning have been subsets of AI since their inception.

- LLMs or any AI tech does not need to "reason, understand, debate and criticize" to be useful, disrupt the economy, and change how people work and live.


I think your time would be better suited getting as far ahead of this looming upheaval as possible, rather than going down with the sinking ship while trying to warn everyone on board that it's sinking, while they scoff at you and order more martinis.


> LLMs are a legit tech that have some purpose, but LLMs will never evolve into anything remotely resembling the actual definition of AI.

It doesn't need to satisfy some definition of AGI to be useful to big tech.

AI just needs to make workers more efficient so companies can either cut costs (fire people) or produce more with the same workforce.


> To Big Tech I say: Prove me wrong.

That's asking to be skynetted. Is that really what you want?


LOL

You speak as if my internet shit talking will make an iota of difference.

Regardless of what I want, the future is what it is.

Deal with it.


[flagged]


I'm not on the hype train myself, but I'd never say never. I can certainly picture much of what devs do today become automated. Heck, we already have plain old deterministic tools that are ridiculously productive compared to common tech stacks in the industry.

That said, it is the very nature of software development (or any endeavor in building complex systems) that we discover a lot of the detail problems later down the line. I don't think of concerns about something like this as "shifting goal posts", I think of them as refining our understanding of the nature of the problems (in this case, the nature of a developer's work).


You still have to do the hard part. Writing all the prompts. That's what we do in all those meetings that sap our time, energy, and patience. Even if we complete all that, we still rely on devs to identify and/or fill in gaps along the way.

If you think your job is on the chopping block by way of AI then you are confused about the value you offer. I'm often disgusted that even senior devs are confused about what their job is.

AI is an amazing tool for software development--the perfect immediate application. Pick it up and use it and don't look back.


Oh I personally don't even think the job of software developers is in particular danger from "AI" in the short/mid term. That's what I meant by not being on the hype train.

The industry maturing, easy VC/stock market money drying up, the economy not doing so great, those are the dangers I'd worry about.

But who can predict the future? I sure can't.


It’ll never matter because AI will never complain that the company fridge is out of LaCroix.


Or complain about having to talk with people.


Isn't this just wrapping existing agents with a "secure" coding environment?

Does this represent a meaningful improvement in AI?


>Isn't this just

Yes. The various iterations of "isn't this just" seem to be getting progressively better over time though...


> Isn't this just wrapping existing agents with a "secure" coding environment?

It is. The paper has this paragraph where AutoDev requires rules for its setup.

> The user initiates the process by configuring rules and actions through yaml files. These files define the available commands (actions) that AI agents can perform. Users can leverage default settings or fine-grained permissions by enabling/disabling specific commands, tailoring AutoDev to their specific needs. This configuration step allows for precise control over the AI agents’ capabilities. At this stage the user can define the number and behavior of the AI agents, assigning specific responsibilities, permissions, and available actions. For example, the user could define a "Developer" agent and a "Reviewer" agent, that collaboratively work towards an objective.
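Purely to illustrate, a config along those lines might look something like this (the field names are my guesses, not the paper's actual schema):

    # Hypothetical sketch of an agent/action config of this sort, loaded with
    # PyYAML. None of the field names below come from the paper.
    import textwrap
    import yaml  # pip install pyyaml

    config = yaml.safe_load(textwrap.dedent("""
        agents:
          - name: developer
            responsibility: write code and tests to meet the objective
            allowed_actions: [read_file, write_file, run_build, run_tests]
          - name: reviewer
            responsibility: review diffs and request changes
            allowed_actions: [read_file, comment]
        disabled_actions: [git_push, shell_exec]
    """))

    for agent in config["agents"]:
        print(agent["name"], "->", agent["allowed_actions"])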

Their example is very simple as well, and the generated code does not cover the use case where the sentence might contain `I'm` or `I've`. I feel the tool would generate a template with missing logic, and then the overhead falls on the developer to spot the gaps and fill them in (even though this entire process is supposed to be automated).

The original comment is just farming upvotes from the `AI hype` people on HN. It's become quite common with gangs of users upvoting each other and downvoting any other narrative that does not fit into their `AI` world.

> Does this represent a meaningful improvement in AI?

It seems more like setting up a factory using an LLM and then making it generate other factories.


Thank you for the clarification; I read through the paper and thought I missed something.

Definitely an interesting area of development, with papers like this and tools like Devin being released. Interesting times.


Yeah. The shocking speed of progress on AI - especially for coding - tells me quite clearly what the future holds for much of software engineering. Despite the human factors (yes, requirements gathering and specification are messy), our profession is intrinsically automatable at the end of the day. Being in denial is pointless.


I think software engineering will just go up the stack to requirements engineering.

Those requirements will be specified as tests, or in a very high-level language that doesn't allow for ambiguous meanings. Which is essentially another programming language, just more high-level.


I agree, and I’m thrilled about it. I love programming but while I’ve had bouts of enjoyment coding “in the small”, it’s my least favorite part. I love systems and software architecture, and seeing things come to life above all else. ChatGPT is greatly improving my enjoyment of it all.

I also doubt traditional programming languages are going anywhere in the short term, seeing there are trillions of lines of code for critical systems in the wild. Good luck replacing that in a few years only, or fully delegating its maintenance to Devin.


> I love programming but while I’ve had bouts of enjoyment coding “in the small”, it’s my least favorite part.

Yes, okay, but what will happen to junior to mid level engineers? In many companies these levels can't even have an influence on architecture.


I basically agree. We'll have AI create some very high-level programming languages that suit various ways of specifying problems precisely, then program in those. Of course, the AI will also know how program in these languages, which will make development very accessible as you could describe a problem vaguely in natural language, get a specific solution back, then confirm it works for your needs or tweak it until it's suitable.


If the hypothetical high level language doesn't allow for ambiguous meanings, you can write a compiler for it and you don't need the AI.

(Not saying you're wrong about "higher and higher levels of abstraction" - that's the way it's always been - but the unique promise of AI is the ability to deal with ambiguity in the requirements).


Ambiguity in requirements is where the majority of bugs come from now. So I don't think AI would improve things.


Assuming a (hypothetical, doubtful) AI as good as a human programmer, I think the advantage becomes speed. Think of it like a REPL that you program in English. Patching ambiguities with "Oh, I didn't mean X, I meant Y" becomes the work of minutes, not hours.


In this case, you don’t really need AI. People like Forth and Lisp because you can build an efficient DSL which you can converse with. What people have trouble with is formalism, and logical and abstract reasoning.


Exactly. There will be a lot less of the "we will have to look into it" 2-day lag time.


"The shocking speed of progress on AI"

Not sure what is shocking. Some of us used Codex in 2021, and nothing has changed since then that has revolutionised software development in any capacity. Feels like people are just saying whatever some tech CEO or, worse, some guru says.

Progress isn't some infinite exponential either where you can just keep increasing the capacity of models and expect better results. Also not sure where they are going to get better training data when they already polluted the internet with AI generated data.


Yeah, to me it feels like a step change to a new level, followed by a lot of exploration of that new level. It feels like there's lots of exploring left to do, but we're not going to reach "magic box that turns English into flawless programs" on the current level.

Speaking just to programming, yes there's probably lots of AI-generated code in the training data now. But the generation of that code was guided by, and (presumably) confirmed to be working by, a human. So it's been filtered.

I suspect that AI-and-human produces better code than human, on average. I know that if I throw any twenty lines of my code into it, it can usually suggest some improvement, even if it's just a variable name.


I am quite confident my job is not in danger. Sure - scaffolding and boilerplate can be automated using ML - but problem solving - not at all.

The reason is simple - to find a solution is to create a program that has a specific set of properties. This - by Rice's Theorem - is not possible to realise using a Turing Machine. So as long as the solution search space is small (ie. we are talking about very similar programs to generate) - ML based methods are fine.

But once we are talking about finding solutions to non-trivial problems - I am sure human is needed.


The world contains a great many arguments invoking Rice (or equivalently the halting problem) to "prove" non-mathematical statements, and they are almost all specious. a) the mere fact that there is no fully general procedure to do some particular task doesn't imply some human can do that task any better than the best procedure we can find, and b) even if you believe there's something noncomputable about the human mind, it still allows the existence of a procedure that will solve all instances of that task which will actually arise in reality.

It's a purely empirical question about the extent to which Church-Turing holds of the real world.


Until a computer can solve the problem “how to redesign this particular database design (tables/partitioning/indexes/queries) so that it can handle the expected load in the next 5 years”, I can safely assume it does not hold :)


Ah, now you're talking price! How many years do you bet before this happens - three? Five?


I don’t believe it will happen at all. There will always be some problems that only a human can solve because it’s a human that defines them.

It is always _by definition_ a game between humans where computers are just tools.

At least as long as AGI does not happen…


I don't at all see why "How to redesign this particular database design (tables/partitioning/indexes/queries) so that it can handle expected load in next 5 years" is a problem that only a human can solve because only a human defined it. Forecasting load is in the realm of machine learning without even needing LLMs, and LLMs are already capable of designing database schemas and so forth, so this seems much more like a question of "when will the existing technology, which can do this sometimes, get to the point of being able to do it always".

Recall that things are true "by definition" if they're not true for any other reason; you're free to define yourself out of the race against the machines, but the real world need not pay attention to your definition!

Computers are "just tools" is certainly true, but I find it harder to automatically agree that the programs running on them are "just tools". Here is a rather bad analogy which is suggestive: in school we learn that there are seven features which are necessary and to some extent sufficient to identify a living organism ("movement, respiration, sensitivity to surroundings, growth, reproduction, excretion, nutrition"). Programs certainly move (thanks to Kubernetes, for example), are sensitive to their surroundings (think of self-healing clusters or load-shedding), grow (scale up in response to demand), and excrete (cleanup jobs on databases). They don't respire, but then that was only ever a contingent feature of how life evolved on Earth; they don't take in nutrients unless you really stretch the analogy; they don't currently reproduce, although LLMs being able to code means it's only a matter of time before an LLM can train another LLM (even Claude 3 Opus is able to train small ML models). All this is to say that the word "tool" is a very poor description of what an LLM is: "tool" calls to mind things which don't have any of those lifelike properties, whereas even large non-intelligent software systems such as Kubernetes have many of those properties.


What I am trying to say is that even if computers are going to be able to solve today's database design problems (which I am sceptical of in itself), we are going to have bigger/more capable databases and other problems that computers are not able to solve.

It is going to happen by definition - people by definition want more than they have today and want to solve unsolved engineering problems ("problems" that machines can solve are simply not problems anymore).

The only way it can end is creation of some overwhelming intelligence (AGI) that would govern (or some might say: enslave) people so that their demands are controlled.


The problem is software development is 95% stuff that has been done a million times before. If you disagree you’re probably not exposed to enough of the industry to see this.


Is it the problem, though? Most of the time, when I’ve coded something, it’s because the code I found did not solve the problem according to my needs. There’s a plethora of music players, but no two are alike. That’s because the domain of music playing is vast and each developer had to compromise to find something that works for them.


If you make a VLC clone from scratch but intended to be used for kiosks, the mixture of features and how it operates may be different, but the individual features have assuredly already been programmed elsewhere before. Which is basically my point. It still basically renders video on a screen, and maybe has options to play/pause/mute/etc. Most websites are CRUD apps with stuff that's been done literally a million times before.

Yes, some people do novel problem solving with software. But it's super rare in the grand scheme of the millions of people in the world who program.


What you are describing is a pipe dream of software reuse. People have tried to do it since the very beginning of the IT industry, with mixed results.

The devil is in the details - your mixture of features is different than mine. We have constant upgrades of video codecs that offer more capabilities. We have unavoidable security issues that need to be somehow addressed (by version upgrades or replacing some components). Etc, etc, etc...

IMO it is quite arrogant to claim "all problems are the same problem" - they are not even though they might look similar.


I agree that the premise of software reuse with humans is not realistic. However, my point is that an LLM trained on the entirety of available code could realistically mix and match features across different pieces of software and build something cohesive.


That would mean 95% of problems are already solved.

I don’t agree with this - the space of problems to solve is as big as that of solutions (both are programs in the end).

I don’t think suggestion about lack of experience deserves any response.


> That would mean 95% of problems are already solved.

No, it means 95% of the effort is spent on already-solved problems. This is, in part, why I hate capitalism. People are doing tons of work simply to compete with others instead of working together to solve meaningful problems.


It is difficult for me to dispute this as I don't think I am in any way entitled to define what problems are "meaningful".

BTW - lack of an authority deciding what problems are meaningful is the core idea of capitalism. And I'm telling you as a person that lived many years under communism: this is much better for everybody.


[flagged]


And I suppose you pasted this from some ai summary.


The post-ChatGPT world has made me hypersensitive to the "AI smell" of bullet lists where each item ends with a fullstop. (Insert standard discussion points about whether it really even matters who wrote it, I only notice the obvious cases, humans have been spewing low quality content since the invention of language, ...)


It’s made me sensitive to super flowery language full of metaphors because it reeks of LLM


You can tell I'm a person because I swear like a sailor.


Oh this is neat, it's based on Visual Studio. Curious how they're accounting for "whoopsie I touched this button and the IDE crashed" kinds of problems that you encounter with larger codebases.


VSCode "IDE" not Visual Studio



