An example of LLM prompting for programming (martinfowler.com)
546 points by mpweiher on April 18, 2023 | 261 comments



The article shows everything that works for this approach. But it's a bit disingenuous. At the end:

> Once this is working, Xu Hao can repeat the process for the rest of the tasks in the master plan.

No, he can't. After that much back and forth, with it responding with the full code listing again every time it fixes some little thing, he would have easily hit the token limit (at least with any chat LLM capable of code and conversation of this quality - ChatGPT). The LLM will start hallucinating the task list, the names of functions it wrote earlier, etc., and the responses get less and less useful, with more and more "this doesn't work, can you fix X".

So anyone following this approach will hit a footgun after task 1.

For anyone who really wants to follow this approach, the next step is to start a new chat, copy/paste the initial requirement prompt, put the task list in there along with any relevant code, adjust the instruction (i.e. "help me with task 2") and go from there.
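Something like this skeleton works (everything in brackets is a placeholder for your own content):

    [initial requirement / architecture prompt, pasted verbatim]
    [master plan / task list, with task 1 marked as done]
    [any code from task 1 that the next task depends on]
    Help me with task 2.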

It is of limited utility though. By step 3 (or even 2) you end up with so much code that you're at the token limit anyway and it can't write code that fits together.

Where I've found ChatGPT 4 useful is getting me going on something, providing boilerplate, and unblocking me.

If you don't know how to approach a problem like the "awareness layer" (like I didn't before reading the post), you can get a great breakdown and starting point from ChatGPT. Similarly, if you're not sure how to approach that view model, or write tests etc. And if you want a first draft of code or tests.

All that said, I'm looking forward to much larger and affordable token limits in future.


Your experience matches mine closely. I've had ChatGPT-4 do great and then it just gets confused after a while. I can literally tell it "task X is done" and it'll apologise and show me a list of tasks where X is still not done - this is clearly not just a context window issue, as I have repeated variations of my statement over and over in the same session and the issue persists.

I have ended up using it the same way you have - it's honestly the best anti-procrastination tool I've ever used because I can tell it my intentions, what I've thought of so far... and it'll spit out a list of bite-sized chunks that get me going. I find myself looking forward to telling the AI I've completed a task.

Similarly, if I'm facing a tricky design decision, I find that just writing it out for ChatGPT is extremely helpful for clarifying my thought process. I actually used to do this conversational decision making process in a text editor long before ChatGPT, but when I know there's an AI on the other end my thinking becomes clearer and more goal-oriented. And unlike talking to myself or a human friend, it's happy to just say "well if these are your concerns, let's start HERE and then see what happens".


Good rule of thumb with ChatGPT: you can’t exit loops. Once you’ve gone A > B > A, your best move is to start a new chat. Even then the loop may reproduce itself, so you should give it a similar but different task. Remember that it’s a prediction engine, weighing heavily on the existing prompt. So you say B again, or B1, and it’s like: I know what to do! A! Because last time it was A > B, so let’s do it again.

In your case this would be “[]Task1”, “Task1 is done”, “[]Task1”, [here is where you start a new chat or fix it yourself if possible].


Instead of starting a new chat, why not change the prompt higher up in the conversation with the relevant detail that you have gained through the responses?


I think that would work and be better in some cases. The thing is, you want to change things up and lose some context, and I may not be able to go far enough back to do that without losing progress. But I could copy just the current state of the task (the last response) to a new chat.


Ooh! That's a really good point - ChatGPT is effectively rubber-ducky as a service =)


This is exactly how I've been explaining LLM tech to my "non-geek" friends and family. I start by explaining rubber ducking, and how I now use chatgpt as a more advanced version of the process.


Hmm, I also use ChatGPT as an anti-procrastination tool and task manager, and it's never made a mistake with keeping track of my task list (except that when it sums the estimated times of subgroups of tasks, sometimes those sums are wrong).

Note that it outputs my updated task list every time I add or remove a task (I only asked it to do that one time), so even if old messages go outside of the context window, it's not a big deal because the full updated state of the list is output basically every other message.


Interesting. What's your prompt?



Here's a web app I found recently that should work way better for you. Idk what model it uses (it's also free and feels like ChatGPT 3.5, so I guess they're funding it out of pocket?)

https://goblin.tools/


You iterate on your plan step by step after it is generated. Go back and edit the prompt chain you started for step 1, and modify it to start working on step 2 (including any ideas or fixes you identified while implementing step 1). Repeat until complete.

You can still absolutely hit the context limit, but you are far less likely to do so if you go back and start a new prompt chain for each different thought process you are going through with it.


Great idea. But does it get hard to navigate back to something in older chat histories though?

I find a new separate chat with the revised initial prompt to be easier.


I’ve been using another call to an LLM to write or rewrite code that is separate from the main “conversation”.

What I mean is that I’ve got a dialog going with an LLM, and I’ve trained it to call a build() function with instructions; that call returns the function, and the text of the function is kept out of the main thread’s dialog.
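Roughly, the shape of it, sketched in Python with the openai package (the model name, prompts, and file name are just placeholders here):

    import openai  # assumes the 0.x openai package and OPENAI_API_KEY in the environment

    # main conversation: the generated code never gets appended to these messages
    main_messages = [
        {"role": "system",
         "content": "You are pair-programming. When code is needed, reply with a single "
                     "line: build(<instructions>). Do not write the code yourself."},
    ]

    def build(instructions: str, context: str = "") -> str:
        # separate, single-shot call whose output stays out of main_messages,
        # so the full function text doesn't eat the main thread's context window
        response = openai.ChatCompletion.create(
            model="gpt-4",  # placeholder model name
            messages=[
                {"role": "system", "content": "Return one complete function. Code only, no prose."},
                {"role": "user", "content": f"{context}\n\n{instructions}"},
            ],
        )
        return response.choices[0].message.content

    # when the main model replies with e.g. build("parse an ISO-8601 timestamp"),
    # the harness runs it, and only a short note goes back into the dialog:
    code = build("write a function that parses an ISO-8601 timestamp into a datetime")
    main_messages.append({"role": "assistant", "content": "build() done; function saved to utils.py"})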


It's great to see that there's now a term for this type of prompting: “generated knowledge”. I've been experimenting with this technique since the beginning, and I've noticed a significant improvement in version 4. The process involves outlining the project, creating tasks, and feeding them back to ChatGPT as you progress. This approach has helped me complete projects that would otherwise have taken me much longer to finish.

It's also useful for creating practical tutorials. While there are plenty of tutorials available online, sometimes you need guidance on a specific set of technologies. By using generated knowledge prompts, you can get a good outline and tasks to help you understand how these technologies interact.

One thing to keep in mind is to avoid derailing the conversation with questions that are not relevant to the core tasks. If you get stuck on something and need to debug, it's best to use a separate conversation, to avoid derailing the project's progress and the hallucinations and forgetfulness that come with it.


Something must be wrong with me. I could never get anything useful from Martin Fowler's writings, and coincidentally I cannot get any functional code out of ChatGPT. Even the boilerplate it produces for me needs to be corrected. I still use ChatGPT to produce examples of abstract things, but I haven't been able to get any working code that matches concrete problems or even compiles.


Are you using the GPT4 model? There's a very significant improvement between 3.5 (the free one) and 4.


I am supposedly on GPT4 via GPT+. I try using it for boilerplatey things, like terraform, and the results are simply incorrect. It seems more helpful in providing examples, even for some far more complex tech - like rust code.


Does it say GPT-4 at the top of the screen?


It does. One example of incorrect TF it produced: splitting the DynamoDB table and its GSI into two distinct resources.


Absolutely, and same here. I've built multiple tools in 2-3 hours each that would have taken 2-3 days each.

> One thing to keep in mind is to avoid derailing the conversation with questions that are not relevant to the core tasks. If you get stuck on something and need to debug, it's best to use a separate conversation, to avoid derailing the project's progress and the hallucinations and forgetfulness that come with it.

Definitely. Great advice.

Another tip: don't bother asking it to fix small things. Just mention you fixed it in the next reply and move on.


> he would have easily hit the token limit

That was my first question. Do all these tasks fit within a 4K or 8K buffer?

Wouldn't be surprised, though, if it works within a 32K GPT-4 token limit. Amazing things are possible.


This may be a dumb question, but do you know if this is something that LLM frameworks like LangChain (or others) can help with? Aren't they designed to help with more complex prompts/logic/outputs? Or will they run into the same token limits?


I completely agree with this take.


If somebody thinks an LLM is coming for everybody's coding job, I'd say this article is a great counterpoint just for existing.

You could tell someone from decades ago that we now use a very high level language for complex tasks in complex code ecosystems, never even mention AI, explain that the parser is really generalist-biased, and this article would make perfect sense as an example of exemplary code by a modern coder working for a living.

That's code in there, the stuff Xu Hao is writing.

And also, that's not even getting into the debugging part... Which will be about other code, that looks different.


Yeah, I think there's a "stone soup" effect going on with AI.

It's the same sort of thing you see happening with the customers of psychics. People often have poor awareness of how much they're putting in to a conversation. Or it's a bit like the way Tom Sawyer tricks other kids into painting the fence for him. For me a lot of the magic here is in knowing what questions to ask and when the answers aren't right. If you have those skills, is pounding out the code that hard?

The interesting part for me is not generating new bits of code, but the long-term maintenance of a whole thing. A while back there was a fashion for coding "wizards", things that would ask some questions and then generate code for you. People were very excited, as they saw it as lowering the barrier to entry. But the fashion died out because it just pushed all the problems a bit further down the road. Now you had novice developers trying to understand and improve code they weren't competent to write.

I suspect that in practice, anything a person can get a LLM to wholly write is also something that could be turned into a library or framework or service or no-code tool that they can just use. That, basically, if the novelty is low enough that an LLM can produce it, the novelty is low enough that there are better options than writing the code from scratch over and over.


I mostly agree except one critical detail: LLMs are the low code/no code service. You literally tell them what you want and if they’re fine-tuned on the problem domain, you’re all set. Microsoft demo’d the Office 365 integration and if it works half as well in practice they’ll own the space as much as they did in 1997.


Maybe they will be, but that's not proven yet. We'll see! If anything, the article we're looking at suggests that the "tell them what you want" step is not obviously much less rigorous or effortful than coding. Tuning could make the difference, or it could be one of those things that produces better demos than results.


One strange coincidence with the emergence of ChatGPT is that at almost the exact same time, Google became practically unusable as a search engine. Like at least an order of magnitude worse.

People used to use Google the same way they use ChatGPT. They would ask a question in plain English, and get sent back a list of relevant links to blog posts, articles, stack overflow, or whatever that had answers to their questions, including example code.

Sometimes that code or information was outdated or completely wrong and sometimes it was too basic to be useful, or just the code-generated docs.

Google has been getting gradually worse over the years due to spam, algorithm gaming, and ads, but circa late November 2022 it became practically worthless.


Great points (and after checking your user name, I’ve been nodding my head to posts of yours for about a decade now).

This is a bit tangential - your reference to stone soup is a wonderful example of the information density possible with natural language. And all the meaning and story behind the phrase is accessible to LLMs.

I’ll have to start experimenting with idiom driven development, especially when prompt golfing.


I believe the Model Driven Architecture fad (https://en.wikipedia.org/wiki/Model-driven_architecture) is a better analogy than wizards. Back then the holy grail of complete round trip UML->code->UML didn't get practical enough to justify the effort.


Definitely also a good analogy. But MDA at least was about supporting users in understanding what's going on. Wizards and "AI" both are about users avoiding the work of understanding.


The problem is that it's not quite code. It's almost code, but without the precision, which puts it into a sort of Uncanny Valley of code-ness. It's detailed instructions for someone to write code, but the someone in this case is an alien or insane or on drugs so they might interpret it the way you meant it or they might go off on some weird tangent. You never know, and that means you'll need to check it with almost as much care as you'd take writing it.

Also, having it write its own tests doesn't mean those tests will themselves be correct let alone complete. This is a problem we already have with humans, because any blind spot they had while writing the code will still be present for writing the tests. Who hasn't found a bug in tests, leading to acceptance of broken code and/or rejection of correct alternatives? There's no reason to believe this problem won't also exist with an AI, and they have more blind spots to begin with.


I think often of the adage "it's harder to read code than write it". GPT gives you a lot to read. Definitely a better consultant than coder, IMO. I've also had GPT write entirely false things; then I say "isn't that false?" and it says "yes, sorry about that". Very uncanny.


And the code that GPT does write, if it is even close to correct, must be code that exists in many places already, and usually (like the case of so much react code) doesn’t need to exist at all.


The opposite might be true, and here’s why - 1) by using English as spec, the barrier of entry has gone lower, 2) LLMs can also write prompts and self introspect to debug.


I think English as a spec actually makes the barrier of entry higher, not lower. Code itself is far easier to understand than an English description of the code.

To understand an English description of code you already have to have a deeper understanding of what the code is doing. For code itself you can reference the syntax to understand what's going on.

The prompt in this case is using very technical language that a beginner will have no idea about. But if you gave them the code they could at least struggle along and figure it out by looking things up.


Exactly. So much that often with a tricky problem or discussion, I ask the person to sit down with me and just code the most relevant parts without implementation, just the type signatures.

That is so much more productive because it immediately removes or highlights all the ambiguity which you can sugar coat with English but not in a programming language.
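As a toy example (all names made up), even a signatures-only sketch in Python forces decisions that an English description lets you gloss over:

    class Account: ...
    class MergeResult: ...

    # no bodies, just the contract - and already you have to decide:
    # can a merge fail? which account wins on conflict? is it reversible?
    def merge_accounts(primary: Account, duplicate: Account) -> MergeResult: ...
    def undo_merge(result: MergeResult) -> tuple[Account, Account]: ...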


This reminds me of rubber ducking[0] in how it necessitates a certain understanding. If one is able to explain it in plain English it's because it is understood.

[0] https://en.wikipedia.org/wiki/Rubber_duck_debugging


Yes, but LLMs can also be used by laypeople to explain the issue in plain English too. That’s the problem. Not that LLMs would need a human to guide the debugging process anyway (at least in a few years).


You still have the same problem... You cannot describe a technical field with plain English. If you did so the semantics would be incorrect. There is a reason jargon exists.

The first two paragraphs alone are absolutely chock-full of terms that would not be easily explained to a layperson:

"The current system is an online whiteboard system. Tech stack: typescript, react, redux, konvajs and react-konva. And vitest, react testing library for model, view model and related hooks, cypress component tests for view.

All codes should be written in the tech stack mentioned above. Requirements should be implemented as react components in the MVVM architecture pattern."

What is every library in that list? What is a model? What is a view model? What is a hook, component test, view, MVVM, etc?

If a layperson could understand explanations for all these things then they would not be a layperson.


> by using English as spec, the barrier of entry has gone lower,

I'm not sure that is true. The level of back and forth and refinement needed indicates to me that the "English" used is not the normal language I use when talking to people.

It's almost like a refined version of cucumber with syntax that is slightly more forgiving.

Maybe I'm being a codger, but LLMs seem (at least for now) far better for summarizing and giving high level overviews of concepts rather than nailing precise code requirements.


> It's almost like a refined version of cucumber with syntax that is slightly more forgiving.

I don't know if "cucumber" is autocorrupt or an actual non-vegetable thing; can you clarify?


Not a typo.[0]

In the 00s/early 10s, software went through a fad phase where people earnestly thought that by implementing Gherkin frameworks like Cucumber, you'd be able to hand off writing tests to "business people" in "plain English." It went about as well as you'd expect.

[0] https://cucumber.io/docs/gherkin/


Thanks!

Despite that period being when I finished my Software Engineering degree, got my first job, and then attempted self-employment, I'd never heard of it before.

Looking at the book titles — "Cucumber Recipes" in particular — even if I had encountered it, I might have assumed the whole thing was a joke.


I still think the basic idea of Cucumber and similar tools is sound. It just doesn't match how 99% of companies operate.

On the other hand, many developers write shitty tests where it's absolutely unclear what the test is even trying to achieve, so trying to find some sort of framework which tries to forcibly decouple the what of the test from the how maybe isn't the worst idea.


That's exactly what I tried to do with this framework:

https://github.com/hitchdev/examples

Rather than trying to force your testers and stakeholders to adapt to the DSL, the templated story->documentation generation lets the dev or tester adapt the DSL to whatever the stakeholders want to see while keeping the story strictly about behavior.


https://cucumber.io/

That "did they actually mean that or was it autowrong?" feeling is going to get worse I fear.



But you can't determine if a statement is true by simply reading more words.

It's also not efficient for doing higher level work. There was a time before we had algebra when people were still expressing the same ideas, but the notation wasn't there. Mathematics was expressed in "plain language." It's extremely difficult for us to read. For mathematicians of the time there was no other way to explain algorithms or expressions.

For simple programs I have no doubt that these tools enable more people to generate code.

However it's not going to be helpful for people working on hypervisors, networking stacks, operating systems, distributed databases, cryptography, and the like yet. For that you need a more precise language and an LLM that can reason about semantics and generate understandable proofs: not boilerplate proofs either -- they have to be elegant so that a human reading them can understand the problem as well. We're still a ways from being able to do that.


Arguably reading code can’t lead to definitive conclusions about its bug-free-ness


Reading and proving a spec can though. LLMs are in principle capable of doing that. (If your objection is that the spec might have bugs then "bug free" is subjective and nothing at all can ever lead to definitive conclusions about it)


Precisely! And neither can generating a handful of unit tests. As EWD would say, they can only prove the existence of errors, not that there are no errors.

If we want more programs that are correct with respect to their specifications we need to write better, precise specifications… not wave our hands around.

However for a lot of line-of-business tasks we’re generally fine with ambiguous, informal specifications. We’re not certain our programs are correct with respect to the specifications, if we had written them out formally, but it’s good enough.

I think most businesses that are writing software that needs to be reliable and precise are not going to benefit from these kinds of tools for some time.


This is true in aerospace software. Lots of process, lots of specification, lots of verification. I wouldn't want to say that GPT-esque tools would be useless here, but I really don't see them offering the same kind of magic leverage that they might offer on some other projects.

And vice-versa! Most software projects do not benefit from the rigor used in aerospace, because it's just not needed, and would be a waste of time.

I am definitely seeing ways that GPT tools could speed up some aerospace work, but we need to be really really sure that things are being done correctly... not just mostly correct, or seemingly correct.


> LLMs can also write prompts and self introspect to debug.

Why should we assume that won't lead to a rabbit hole of misunderstanding or outright hallucination? If it doesn't know what "correct" really is, even infinite levels of supervision and reinforcement might still be toward an incorrect goal.


To which the normal response[0] is: that's just like humans.

Of course, it's still bad that humans do it; but despite the scientific method etc., even successful humans often work towards an incorrect goal.

[0] I am cultured, you're quoting memes, that AI is just a stochastic parrot: https://en.wikipedia.org/wiki/Emotive_conjugation


But it's not just like humans. For one thing it's built differently, with a different relationship between training and execution. It doesn't learn from its mistakes until it gets the equivalent of a brain transplant, and in fact extant AIs are notorious for doubling down instead of accepting correction. Even more importantly, the AI doesn't have real-world context, which is often helpful to notice when "correct" (to the spec) behavior is not useful, acceptable, or even safe in practice. This is why the idea of an AI controlling a physical system is so terrifying. Whatever requirement the prompter forgot to include will not be recognized by the AI either, whereas a human who knows about physical properties like mass or velocity or rigidity will intuitively honor requirements related to those. Adding layers is as likely to magnify errors as to correct them.


> But it's not just like humans. For one thing it's built differently

I'm referring to the behaviour, not the inner nature.

> in fact extant AIs are notorious for doubling down instead of accepting correction.

My experience suggests ChatGPT is better than, say, humans on Twitter.

I've had the misfortune of several IRL humans who were also much, much worse; but the problem was much rarer outside social media.

> Even more importantly, the AI doesn't have real-world context, which is often helpful to notice when "correct" (to the spec) behavior is not useful, acceptable, or even safe in practice.

Absolutely a problem. Not only for AI, though.

When I was a kid, my mum had a kneeling stool she couldn't use, because the woodworker she'd asked to reinforce it didn't understand it and put a rod where your legs should go.

I've made the mistake of trying to use RegEx for what I thought was a limited-by-the-server subset of HTML, despite the infamous StackOverflow post, because I incorrectly thought it didn't apply to the situation.

There's an ongoing two-way "real-world context" miss-match between those who want the state to be able to pierce encryption and those who consider that to be an existential threat to all digital services.

> a human who knows about physical properties like mass or velocity or rigidity will intuitively honor requirements related to those

Yeah, kinda, but also no.

We can intuit within the range of our experience, but we had to invent counter-intuitive maths to make most of our modern technological wonders.

--

All that said, with this:

> It doesn't learn from its mistakes until it gets the equivalent of a brain transplant

You've boosted my optimism that an ASI probably won't succeed if it decided it preferred our atoms to be rearranged to our detriment.


> I'm referring to the behaviour, not the inner nature.

Since the inner nature does affect behavior, that's a non sequitur.

> we had to invent counter-intuitive maths to make most of our modern technological wonders.

Indeed, and that's worth considering, but we shouldn't pretend it's the common case. In the common case, the machine's lack of real-world context is a disadvantage. Ditto for the absence of any actual understanding beyond "word X often follows word Y" which would allow it to predict consequences it hasn't seen yet. Because of these deficits, any "intuitive leaps" the AI might make are less likely to yield useful results than the same in a human. The ability to form a coherent - even if novel - theory and an experiment to test it is key to that kind of progress, and it's something these models are fundamentally incapable of doing.


> Since the inner nature does affect behavior, that's a non sequitur.

I would say the reverse: we humans exhibit diverse behaviour despite similar inner nature, and likewise clusters of AI with similar nature to each other display diverse behaviour.

So from my point of view, that I can draw clusters — based on similarities of failures — that encompasses both humans and AI, makes it a non sequitur to point to the internal differences.

> The ability to form a coherent - even if novel - theory and an experiment to test it is key to that kind of progress, and it's something these models are fundamentally incapable of doing.

Sure.

But, again, this is something most humans demonstrate they can't get right.

IMO, most people act like science is a list of facts, not a method, and also most people mix up correlation and causation.


It’s like when you continually refine a Midjourney image. At first refining it gets better results, but if you keep going the pictures start coming out…really weird. It’s up to the human to figure out when to stop using some sort of external measure of aesthetics.


I mean, sure, if the world were to run on basic code. Perhaps WordPress developers might feel slightly threatened, but even that is well above all examples of "AI" code I've seen.


English as a spec is incredibly "fuzzy", there are many valid interpretations of intent. I don't think that can be avoided?


It can't. Legalese is an attempt to do so, and it's impenetrable by non experts and still frequently ambiguous.


But there's still going to have to be a human who has the ability to form a mental model of the thing that's needing to be implemented. Functionally and technically. The results of the LLM will vary depending on the level of know-how the human instructor has.


Exactly, I actually liked the systematic approach in the article, but it seemed pretty labor-intensive and ... not that much different from other types of programming


To me, that's the whole point of this. I think it is directly analogous to the jump between assembly and higher level compiled languages. You could have said about that, "it still seems pretty labor intensive and not that much different than writing assembly", and that's true, but it was still a big improvement. Similarly, AI-assisted tools haven't solved the "creating software requires work" problem. But I think they're in the process of further shifting the cost curve, making more software possible to make.


‘Artists' jobs are safe because AI is bad at hands.’


Artists' jobs are safe in part because they can also use AI, and most already use relevant ecosystems that now incorporate AI.

Consumers who can operate AI for clip art purposes are simply still part of the same non-artist-paying demographic they always were.

Same with code


As farmers' jobs were safe because farmers can use farming tools.

These arguments don't track even vaguely. You are doing the equivalent of analyzing the future of solar power by assuming solar will cost the same in 10 years as it does today, and that each new watt of solar is matched 1:1 with new units of demand. Neither of these are sensible.

It may be that ML code tools never displace many people, or even that they supercharge demand, but you don't get to justified conclusions by assuming the future is just the present but with a bigger UNIX timestamp.


Industrialization has made farming tools incredibly complex, so I believe the statement "farmers' jobs were safe because farmers can use farming tools" is correct. You still need a farmer to farm, but you now need less manpower to farm. The specialist is secure while the untrained laborer is at risk.


And yet, despite there being more complex machinery in farming there is also much more untrained labor. People just don’t notice it because the untrained labor (millions of people) that plant and harvest are an underclass that does not speak their language or associate with them.


Sadly I don't think this is true for art:

https://restofworld.org/2023/ai-image-china-video-game-layof...

I really hope it doesn't end up being the same with code :|


Yeah, even if ChatGPT could perfectly understand the prompts you'd still run into major issues with token limits. I tried to get it to rebuild a single page for me (to move from one UI framework to another) and I couldn't fit the existing code in the token limit. I might be able to get it to do a chunk of the initial work for a greenfield project if I perfected the prompts, but it's structurally incapable of maintaining existing code.


I was thinking about that draft of the master plan ... you can't really just write it up that clearly and easily.

Overall, I don't think a 95% autopilot GPT model would provide more efficiency than an 80% one.


Except you now have a way “upwards” from an abstraction POV. Regular code is severely limited and highly surgical, by design. This is not.

All these abstraction layers were invented to serve old style manual coders. Why bother explaining in great detail about “Konva” layers and react anymore? Give it a few years and let it finetune on IT tech and I see this being reduced to “I want whiteboard app with X general characteristics” at which point I’d no longer speak about “programming”.


That "upwards" excludes a lot of relevant systems design logic that won't go away though, insofar as it is abstraction ad infinitum in the direction of fewer-relevant-details.

What'll happen is, details will continue to be relevant as tastes adjust to the new normal.

Like for my work, today, React is enterprise-ready, which is not good for me. It means it will likely dip my projects in unnecessary maintenance costs as compared to another widget of its type that does what I want in a lightweight manner. When I troubleshoot something of React's complexity, even my prompts will likely need to be longer.

But also, that's just one component of one component. And you have to experience this stuff in the first place, to know that you should pay attention to these details and not those other ones, for a given job, for a given client, in a given industry, with given specs.

So, if I was able to wave my hands I'd simply have all the problems I had back when I was a beginner. Ergo, it comes back to the clip art problem: Being able to buy clip art never made anyone a designer. But it made a lot of designers' jobs way easier.

We are simply regressing toward the mean with regard to programming. It was never about computers in the first place, never so concerned with syntax.

Anyway, back to browsing my theater program...


Fair enough, but don’t we abstract “upwards” all the time? Assembly won’t go away, but do you deal with it?


For one, assembly ceases to be a relevant detail and is replaced by other relevant details.

So, I can't code fast games in a 1984 workplace, currently, being too out of touch with assembly on a given chipset. But I also can't wave my hands at an LLM and expect a modern, fast game of the desired quality to code itself. (Even though a clip art-style result is possible, the requirements are always going to be special details)

The upwards direction example is also interesting because it's foundational to the cognitive functionality of one of the Jungian personality types. But other personality perspectives also apply to coding, which means in part that the directional, metaphorical-abstraction view can effectively be a blind spot if we map it as the preferred view on outcomes.

The most common blind spot for this personality involves questions of relevant details, and their intersection with planning for yet-unknowns. There is a tendency to hand-wave which ends up being similar to prophetic behavior. Jung called this the "voice in the wilderness" noting that it can easily detach from sensibility (rationality) by departing from life details. Kind of interesting stuff.

(Ni-dominant type)


Now you got me on the edge of my seat. What is this personality type?


Ni-dominant. It exists nowadays in various post-Jungian models, many of which are really fascinating, having fleshed it out a lot.

The opposing function to Ni is Se, which creates a dichotomy of planning/foreseeing vs. doing/performing. The functions oscillate as a kind of duty cycle, so a lot of sages out there have hobbies as musicians, stage magicians, etc.

This dichotomy also effectively shuts out detail memory for context, dealing mostly with present vs. future. Even nostalgia is often ignored on the daily. So a Ni-dom will usually describe their memory as pattern-based, gestalt, more vague or general, etc.


I couldn't quite tell if you found a beautiful way to insult me, but it is fascinating indeed. I am hand wavey and I understand its failure modes quite well, unfortunately. It's cool to talk about it at this level of abstraction.


No insult intended... I don't really know how much it applies in your case, but since you really took on that viewpoint, that's when the personality theory side of me goes, "well if this is a favored viewpoint then there IS this idea about the population that favors this viewpoint" :-) And thoughts about GPT are generally crafted from general personality positions, in the absence of other relevant self-development experience.

I agree, it's cool stuff


I would like to subscribe to your newsletter.

Even if approximately 75% of that sailed right over my head.


Best I can do is RSS!


There's an unfortunately common take on AI that goes basically like this:

"I tried it and it didn't do what I wanted, not impressed."

My suggestion is to tune out the noise and really try experimenting with these tools – and know that they're rapidly improving. Even if ultimately you have criticisms or decide one way or another, at least really investigate them for your own use-cases rather than jumping on a bandwagon that's either "AI is bad" or the breathless hype-machine at the other end.


I agree it's a good idea to take a moderate approach. The hype that LLMs are going to replace SWEs is clearly just that, hype, if you've done any real work trying to get GPT4 to give you the code you want. But it's also clearly a very useful tool. I think it'll absolutely destroy Stack Overflow.


I am very critical of the LLM hype, but the threat to stackoverflow is evident. Like stackoverflow, I never write code verbatim that comes from even GPT4. I frequently find issues in the output, as the code I write is generally very context-specific. However, I find the back-and-forth with interesting tidbits of info dropped here-and-there amounts to something like rubber duck debugging on steroids.


> destroy Stack Overflow

It'll be interesting to see how future training data is sourced.


You simply need the system to train itself on its own interactions, like how search engines improve results by counting clicks.


I'm not wondering about how the system will determine what's most helpful but instead determining what's even "correct". A model will learn what's "correct" from Stack Overflow by finding accepted or highly-voted answers but when it can't find such content anymore (in this case because Stack Overflow is hypothetically gone) then what would even exist to generate these discussions to be used as training data?

Github, per the sibling comment, is a good example because projects will have issues (tied to the individual repository of source code to be seen as a working implementation of the idea) which will be where such discussions happen.


When Google search became important, people structured their information so that Google could best index it. When AIs become important in the same way, people will start to structure their information so that a particular class of AI can best index it. If that involves API documentation, perhaps there will be a standard format that AIs understand the best.


The difference is that folks had an incentive to make their pages easily indexable: drive page views.

With becoming training data, the incentive for the creators is a little less clear.


Yea right now sure. Later, there will be a clear incentive. Or not, if AI fails to become useful.


Look at the condition of artists, haha :(


Those topics that AI replaces the forums for won't need discussion. People won't be confused about that thing because the coding AI knows the details of it. Soon that'll be most syntax questions, soon simple to mid-level algorithms, etc.

People will move on to higher-level questions.


Github would be my first guess.


That does seem like a likely option. Discussions on issues alongside the actual working (and not working) code.


Nobody who professionally designs and writes software AND has used LLM code generation tools sees this as a drop-in replacement for developers, generally, anytime soon. That stance is for overeager, credulous enthusiasts and doomsdayers jumping to conclusions.

Similarly, nobody who professionally designs and creates complex art products sees this as a drop-in replacement for commercial artists anytime soon. That stance is for people dazzled by their new image-generation superpower who don't know how little they know about professional creative work.

I doubt the markets for utility-grade code work (e.g. customizing existing WordPress themes) or low-effort, high-volume creative assets (template-based logos, lightly customized game sprites) will survive. The people doing that work are still real people with lives and families and medical bills and mortgages, and we really ought to get serious about worker protections in this country. Seriously.


In a few years, as mid-level cognitive tasks get automated by LLMs, resulting in elimination of some percent of well-paying white-collar jobs, there will be economic dislocation and social disruption.

Oligarchic capitalist societies with a hypocritical philosophy of free market economics (such as the USA) will experience social unrest and civil strife.

In the meantime, social-democratic societies that have effective governance and can grow their safety net with universal basic income will be advantaged in this new economic order. Think Scandinavian and some Asian economies.

The geopolitical balance of power will shift toward stable societies that are able to make the conceptual leap to UBI. Others who follow the primitive fantasy of free market economics will crumble and get left behind.

At least that is how it looks right now.


Maybe. I don't see any reason to assume the US won't continue to successfully a) escalate police power to keep citizens under control, b) continue giving people just enough crumbs to keep them from getting violent, and c) propagandize the hell out of the idea that societal mismanagement is a matter of personal responsibility. China has done pretty well while clamping down on civil liberties, and the US has always been better at obfuscating and spinning similar tactics to be more palatable to its citizens. And you only need to look at the rust belt to see what happens when US business leadership decides to follow the cheaper option.

Though, regardless of ensuing social unrest, the fact that the people involved are human beings should be enough to not treat them like used condoms when computers figure out how to do their jobs. Should be.


> The hype that LLMs are going to replace SWEs is clearly just that, hype

LLMs cannot replace anyone, but it is clear that engineers who master LLM usage might multiply their productivity by a lot.

The question is: If one LLM assisted engineer can work 10x faster, will companies reduce their engineer staff by 90%?


I've worked at far more companies with miles of product idea backlog we never get to than ones with nothing for engineering to do.

Now product will be able to use an LLM to come up with feature proposals and design docs even faster! :o

So: are you working at a company where engineering is a cost center or a revenue center? The latter wants to get more done at the same cost much more than it wants to just cut spend.


I'm working at a company where we perpetually don't have enough engineers and we've learned (the hard way) that adding more tends to make the problem worse (too many cooks).

Copilot helps a lot with that. I can write `// check if em` and it will finish the comment (email is valid) and write the actual check.
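Roughly what that looks like (sketched in Python just for illustration; the exact suggestion varies):

    import re

    # check if email is valid   <- I type the start of this comment, Copilot finishes it...
    def is_valid_email(email: str) -> bool:
        # ...and then suggests something like this, in the style of the surrounding file
        return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email) is not None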

It's a massive productivity boost. I'm spending less time looking stuff up (what was that function name again?) and less time fixing typos/basic syntax errors.

Copilot doesn't always know the function name, but it does more often than me. And it makes errors too, but again less often than me (also, because I'm reading the code, not typing it, I tend to see the error immediately without needing to test it).


To answer your question with a question if I may -- when did productivity increase in software ever result in headcount reduction? The competition also will have similar productivity gain.


>when did productivity increase in software ever result in headcount reduction? The competition also will have similar productivity gain.

The average AI company has like 1 employee per $25M valuation. That's around 25x fewer employees than the typical tech company.


There used to be a profession called "typist" where a skilled person would type up your dictation quickly and without wasting sheets of paper from typewriter error. Now it's all software

There used to be mailrooms filled with people sorting and delivering physical envelopes. Now it's all software.


You are right. Sorry, I meant headcount of software staff.


> The question is: If one LLM assisted engineer can work 10x faster, will companies reduce their engineer staff by 90%?

What did you think “replacing SWEs” means?


Yet, the whole movement of getting blue collar workers to code seems to have lost its steam.


Probably because “graduating” from a bootcamp doesn’t make one a SWE and people figured out it’s a scam?


I'm a fullstack dev of 10+ years experience. We have people on our team who've done almost nothing more than a react bootcamp. They can't do what I do, even with chatGPT, but they can do a lot of things I don't want to do.


Also it requires an inherent level of talent which gets turned into skill.

Not everyone who takes piano and music harmony lessons can become a jazz composer, especially in 6 weeks or 6 months.


Sadly I don't think this can happen. There is a load of trash answers on SO, and you bet ChatGPT is trained on that.

So you get not only the good of SO, you also get the worst of SO, and there's no way to tell which is which.

Just a downgrade for me; plus, for most things I do, you are better off reading the source code or the documentation (however lackluster) than fumbling with ChatGPT and getting an answer that may or may not be right.

I might as well ask someone else who doesn't know any programming to search for the answer for me - they won't be able to tell a trustworthy answer from another one.

There are so many SO answers (esp. on C++) which look good, but one of the comments points out some edge case in which it breaks.

Remember, not everyone does copy-paste programming; some of us have to sit there and think of a solution and work it out over hours, because it's not been done before publicly.


Of course, there's the issue that a lot of the info for useful LLMs probably comes from places like Stack Overflow


People also forget that the model is trained on older data. At first, it will default to referencing out of date frameworks and solutions, but if you tell it that its code isn't working, it will usually correct itself.


I was very impressed when it showed me the different techniques for deep reinforcement learning. However, where it struggles is in building an agent, because you need a high number of tokens to template a prompt, as in the case of LangChain or AutoGPT.


You may be underestimating how much meaning people derive from jumping on bandwagons and having a simple to understand group identity.

Your suggestion would make many people unhappy. They can't win the competence game and hence 'really investigating' is a losing proposition for them. What they can do is jump on bandwagons very quickly, hoping to score a first mover advantage.

How much of an advantage would one get from taking a couple of years to really investigate Bitcoin and the algorithms involved, vs buying some as early as possible and telling everyone else how great it is? :)


For me, ChatGPT or Phind (which is based on GPT-4, if I understood right) are great documentation tools and also general productivity tools, nothing to say about it.

The main issue is that sometimes they really f** it up badly. They make you rethink your knowledge quite deeply (do I remember wrong? did I maybe understand this wrong? is ChatGPT wrong?), and for me this can be worse than doing it myself, because it creates a sort of insecurity: you always have to challenge your own thinking, and that is not how we work in our daily job, is it? At least this doesn't happen so frequently to me - from time to time we have arguments in the team, but this kind of "wrong information" feels more like a "hidden" trap than someone else arguing (with valid arguments, of course).


One thing that really bothers me is that I want it to use best practices and it doesn't really know which ones I'm talking about, and then I realize they are _my_ set of best practices, made from others' nameless best practices.

So I have to decide if it's just a matter of manually converting the 5-10 little things like using `env bash` in the header, etc. Or do I ask it to remember that and proceed to the next layer of the project, and feel like Katamari Coder, which is quite a feeling of what-is-this-fresh-encumbrance at times.

There is a nascent sense that the interface is not even close to where it needs to be to efficiently support that kind of recall for working memory on the coder's end.

I can definitely see a new LLM relativistic-symbolic instruction code & IDE-equivalent (with yet-unseen presentational and let's even say modal editing factors) being extremely useful, which is a bit funny but also that's what those things are good for... Right now I can scroll up through my prompts to supplement my working memory, but that's another place where the whole activity starts to seem very tedious.

(Is the LLM coming for the coders, or are coders coming for the LLM?)


I think that Copilot is much better/more promising for this kind of thing because it's looking at the code you've already written without you having to constantly prompt it.

I had a lot of the same hangups as you when I had played around with ChatGPT. How do I get it to handle the monotonous stuff without me having to spend all my time teaching it?

I finally tried Copilot the other day and it was stunning. I had a half-written golang client that was a wrapper around an undocumented and poorly structured API for a tool we use. I had written the get and create methods. Then I added a comment with an example URL for delete and Copilot auto-completed the entire method in the same style as the two methods I had already written. In some cases, like formatting & error handling, it was exactly the same as what I'd written, but other cases, like variable naming, string templating, etc., it replicated the spirit of my style but adapted for this new "delete" method.

I think ChatGPT is just the wrong interface for this kind of thing (at least right now).


They’re complementary, I’d say. GPT-4 handles greenfield development better; you can tell it to write a quick script, and usually it more or less works. Copilot doesn’t do much when you’re looking at a blank page.

This would make copilot the better tool in 90% of cases, but I’ve been using GPT-4 to script a lot of things I previously would never have scripted at all. It reduces the cost to where even one-off scripts for a twenty minute job are usually worth writing.


> Or do I ask it to remember that and proceed to the next layer of the project

I think this could be solved with a good browser extension. Something that provides an easy to access (e.g., keyboard-only) way to paste customized prompt preludes that enforce your style (or styles if, say, you're using multiple languages).

It looks like Maccy could do the job, albeit not as an extension. I haven't tried it yet.


I tried one kinda like this. Setting aside the extension feel of it, what I'd like to see is a move from prompt-helper to pattern language for visually reporting the process of working with the LLM, to which the LLM has parsing access.

So, let's say you can see your conversation as normal, but you can also see your actual code project as a node-based procedural design layout in an editable window. The relevant conversation details are used to draw the nodes.

You go to one node representing a bash script and click its Patterns tab and search-to-type for the community pattern, "Joe's Best Bash Practices". It's added to your quick palette and LLM offers to add similar patterns to other nodes in Nim and Pascal and ABS, but actually for ABS there's a "concept" symbol that indicates it's only going to be able to guess what you would want based on the others.

Then it offers to gradually teach you node-shorthand as you edit the project, so eventually you don't need to write any prompts, just basic shorthand syntax. Where the syntax gets clunky, or when you buy a custom keyboard just for this syntax but with a few gotchas, you can work together and change syntax to fit.

Nbdy hus lrnd shrtnd nos knda whr m gng wths.


One thing ChatGPT (specifically, the GPT4 version) keeps doing to me is confidently lying, and when I call it out, apologizing and spitting out another response. Sometimes the right answer, sometimes another wrong one (after a couple tries it then says something like "well, I guess I don't have the right answer after all, but here is a general description of the problem")

Part of me laughs out loud (literally, out loud for once) when it does that. But the other part of me is irritated at the overconfidence. It is a potentially handy tool but keep the real documentation handy because you'll need it.


Honestly, for me it happens more often than not - but maybe that's because I've tried it in cases where I've already used traditional approaches to come up with the answer, and then gone to GPT and Phind to benchmark their viability.

I've mentioned it in another thread, but Phind's "google-fu" is weak; it does a shallow pass, and the Bing index (I'm assuming) is worse than Google's. It's also slow as hell with GPT-4, which makes digging deeper slower than just manually going in.


To me, this is a great illustration of why chat is a terrible interface for a coding tool. I've gone down this path as well, learning that you need to have a detailed prompt that establishes a lot of context, and iteratively improve it to generate better code. And yup, generating a task list and working from that is definitely a key strategy for getting GPT to do anything bigger than a few paragraphs.

But compare that to Copilot: Copilot doesn't help much when you're starting from scratch, and there's nothing for it to work with. But once you have a bit of structure, it starts to make recommendations. Rather than generating large chunks of code, the recommendations are small, chunks of a few lines or maybe even one line at a time. And it's sooooo good at picking up on patterns. As soon as you start something with built-in symmetries, it'll quickly generate all the permutations. It's sort of prompting by pointing.

This is so. much. better. than writing prompt for the chat interface. I'm really excited to see where these kinds of tools lead.


I've noticed that after using copilot on a code base for a while, you can effectively prompt the AI just by creating a descriptive comment.

// This function ends the call by sending a disconnection message to all connected peers

Bam, copilot will recommend at least the first line, with subsequent lines usually being pretty good, and more and more frequently, it will recommend the whole function.

I still use GPT-4 a lot, especially for troubleshooting errors, but I'm always pleasantly surprised at how good copilot can be.


Copilot is a game-changer and very underrated IMO. GPT4 is smart but not really used in production yet. Copilot is reportedly generating 50% of new code and I can't imagine going without it.


Where do you get that 50% number? Do you mean 50% of all new code in the industry? That seems beyond extremely unlikely.


The number is 40%, and it's 40% of code written by Copilot users. It's also just for Python:

> In files where it’s enabled, nearly 40% of code is being written by GitHub Copilot in popular coding languages, like Python—and we expect that to increase.

https://github.blog/2022-06-21-github-copilot-is-generally-a...


I wonder if this properly counts cases where copilot writes a bunch of code and then I delete it all and rewrite it manually.


From what I remember they check in at a few intervals after the suggestion is made and use string matching to check how much of the Copilot-written code remains.


It's all about the denominator!


There was some discussion by the copilot team that x% of new code in enabled IDEs was generated by copilot.

It varies, but here's one post with x=46 from last month. So, very close to half.

https://github.blog/2023-02-14-github-copilot-for-business-i...


Measuring output by LOC is not a very useful metric. The sort of code that’s most suited to ai is closer to data than code.


(I read it as 50% of their code)


I would really love to see that. So far, all I've seen is cookie cutter code to reduce a bit of typing time. Everything else was more or less hot garbage that just stood in the way of typing. Maybe in a few iterations or years. So far, personally, I haven't seen anything useful. Not saying there isn't anything, just that I haven't seen any use and code offered by it stank. Is there a demo of someone using it to showcase this game-changing power?


Copilot only writes boilerplate, it can't really handle anything non-trivial. But I write a lot of boilerplate, even using abstraction and in decent programming languages. A surprising amount of code is just boilerplate, even just keywords and punctuation; and there's a lot of small, similar code snippets that you could abstract, but it would actually produce more code and/or make your code harder to understand, so it isn't worth the effort.

Plus, tests and documentation (Copilot doubles as a good sentence/"idea" completer when writing).


It surprises me to hear this. Have you used it as I described by writing a descriptive comment first then waiting to see its response?

I only noticed it getting good at this after I was somewhat far along on a project, so I assume it requires an overall knowledge of what you're trying to do first.


For my side projects, copilot easily generates 80% of the code. It snoops around the local filesystem and picks up my naming schemes and style to help recommend better. It makes me so much more productive.

For work projects, I tried it on some throwaway work because we're still not allowed to use it for IP reasons. It is very good at finding small utility functions to help with DRY and can help with step-by-step work, but it can't generate helpful code quite as easily, since parts of our API and codebase don't follow common norms or conventions, and it seems to me that Copilot makes a lot of guesses based on the conventions it has detected.


> It snoops around the local filesystem and picks up my naming schemes and style to help recommend better.

Are you sure about this? It doesn't seem to work on my machine. I think it will infer things that might be in other modules, but only based on the name. I'm basing this on the fact it assumes my code has an API shape that's popular but that I don't write (eg free functions vs methods).


It looks at your recently-viewed files in your IDE. I don't think it looks at anything outside your open workspace but maybe...


Absolutely. People will quickly realize that for coding, the natural language part of LLMs is a distraction. Copilot is much better for someone actually writing code, but unfortunately doesn't get as emphasized due to the narrative surrounding LLMs right now.


> Copilot is much better for someone actually writing code

I haven't used Copilot yet, but I occasionally use ChatGPT with prompts such as "write a bash/python script that takes these parameters and performs these tasks". Then I iterate if needed, and usually I can get what I want faster than without ChatGPT. It's not a game changer, but it's a performance boost.

How is natural language a distraction here? And how would Copilot do much better for the same task?


Try not using natural language and just type what you'd type into Google. You'll get the same results and realize that all of the natural language fluff is totally unnecessary. I just typed in "bash script recursive chmod 777 all files" (as a dumb toy example) and got a resulting script back. It was surrounded by two natural language GPT comments:

> It's generally not recommended to give all files and directories the 777 permission as it can pose a security risk. However, if you still want to proceed with this, here's a bash script that recursively changes the permission of all files and directories to 777: [...] Make sure to replace "/path/to/target/directory" with the path of the directory you want to modify. To run the script, save it as a file (e.g., "chmod_all.sh"), make it executable with the command "chmod +x chmod_all.sh", and then run it with "./chmod_all.sh".

It's up to the reader to decide if those are necessary, but I'd lean towards no.


I tried this with the following:

"Bash script to add a string I specify to the beginning of every file in a directory, unless the file begins with “archive”"

I tried looking for this on Google and didn't find anything that did this -- although I could cobble together a solution with a couple of queries.

The interesting thing is that I wanted ChatGPT to prepend the string to the filename -- that's what I meant. But it actually prepended the string to the file contents. That's what I said, so I give it credit for doing what I said rather than what I meant. And honestly my intent isn't necessarily obvious.

I definitely see this as a value add over just searching with Google.
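For what it's worth, the rename interpretation I intended is only a few lines. Here's a rough sketch, in Python rather than bash just to show the logic (the directory and prefix are placeholders):

  from pathlib import Path

  def prefix_filenames(directory, prefix):
      # Rename every regular file in `directory`, skipping names that start with "archive"
      for path in Path(directory).iterdir():
          if path.is_file() and not path.name.startswith("archive"):
              path.rename(path.with_name(prefix + path.name))

  prefix_filenames("/tmp/example", "draft-")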


> Try not using natural language and just type what you'd type into Google. You'll get the same results and realize that all of the natural language fluff is totally unnecessary.

I can get similar results with Google sometimes and I can put together what I learned from different places.

But I can get scripts that meet my exact requirements with ChatGPT. Most of my ChatGPT related code is scripting AWS related code and CloudFormation templates.

I’ve asked it to translate AWS-related Python code to Node for a different project, and to a bash shell script. It’s well trained on AWS-related code.

I don’t know PowerShell from a hole in the wall, but I needed to write PS scripts and it did it. I’ve also used it to convert CloudFormation to Terraform.


I think you (and kenjackson above) are misinterpreting what I was saying. I'm not saying use Google instead of ChatGPT; I'm saying pretend ChatGPT is Google and interact with the ChatGPT text prompt the same way. You don't need fully formed coherent sentences like you would when talking to a person; just drop in relevant keywords and ChatGPT will get you what you want.


Isn’t that the game changer, though: that you can use natural language, treat it like the “world’s smartest intern”, and just give it a list of your requirements?

It’s the difference between:

“Python script to return all of the roles with a given policy AWS” (answer found on StackOverflow with Google)

And with ChatGPT

“Write a Python script that returns AWS IAM roles that contain one or more policies specified by one or more -p arguments. Use argparse to accept parameters and output the found roles as a comma separated list”


> “Write a Python script that returns AWS IAM roles that contain one or more policies specified by one or more -p arguments. Use argparse to accept parameters and output the found roles as a comma separated list”

Again, this is completely unnecessary. This is like in the old days when technically illiterate people would quite literally Ask Jeeves[0] and search for full questions because they didn't know how to interface with a search engine.

A prompt that does exactly what you're asking: "python script get AWS IAM roles that contain a policy, policy as -p command line argument, output csv"

We'll see more of that terse, efficient style as people get more comfortable, similar to how people have (mostly) stopped using full questions to search on Google. The "talk to ChatGPT like a human" part is entirely a distraction from taking advantage of the LLM for coding purposes. Perhaps more importantly, the responses being humanized is a distraction, too.

[0] https://en.wikipedia.org/wiki/Ask.com


At first, when I didn’t specify “use argparse”, it would use raw argument parsing.

It also thought I actually wanted a file called “output.csv” based on your text, and gave me an argument to specify the output file, which I didn’t want.

There is a lot of nuance to my requirements that ChatGPT missed with your keywords.

Sidenote: there is a bug in both versions and also when I did this for real. Most AWS list APIs use pagination. You have to tell it that “this won’t work with more than 50 roles” and it will fix it.
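For reference, here's a rough sketch of what the fully specified prompt is aiming at, pagination included. This is my own illustration with boto3 (not ChatGPT output), and it matches a role if any of the given policy names are attached:

  import argparse
  import boto3

  parser = argparse.ArgumentParser()
  parser.add_argument("-p", "--policy", action="append", required=True,
                      help="policy name; may be repeated")
  args = parser.parse_args()

  iam = boto3.client("iam")
  wanted = set(args.policy)
  matches = []
  # list_roles is paginated, so iterate over every page instead of a single call
  # (list_attached_role_policies can be paginated too; omitted here for brevity)
  for page in iam.get_paginator("list_roles").paginate():
      for role in page["Roles"]:
          attached = iam.list_attached_role_policies(RoleName=role["RoleName"])
          names = {p["PolicyName"] for p in attached["AttachedPolicies"]}
          if wanted & names:
              matches.append(role["RoleName"])
  print(",".join(matches))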


you can always include the instruction to only return the code and no other text


Sure, but I want a system built for coding that does that by default... like Copilot.


... and it'll describe the code anyway, at least to me.


No script needed:

  chmod ugo+rwX . -R
(This is for GNU chmod like in Linux, BSD will be slightly different)

Of course, that's not exactly what you asked for (it's better; read the chmod man page: X applies execute only to directories and to files that already have an execute bit), but you could just replace ugo+rwX with 777 or 0777.


> It's not a game changer, but it's a performance boost.

The story of all AI in 2023: maybe a 2x performance improvement, maybe a bit less. The big problem is that you can't trust it on its own, so it doesn't improve productivity 100x. Not even a receipt reader is good enough to reach 100%; you've got to check the total, in case it missed the decimal point and you get a 100x "boost" after all.


Has the Copilot backend been updated to use anything more advanced yet? I tried it out when it was new and free for a while and it really struggled with anything that wasn't incredibly common. GPT 4 in its chat form works a whole lot better for niche stuff than that one did.


It's definitely far better than when it was free, but it's not GPT-4 yet for most people.

It's the opposite of ChatGPT: it takes more time to produce useful output, but it gets much better in more complex programs, while ChatGPT gets worse.


Copilot's original underlying model is currently deprecated, if I remember correctly


Idk, sure, autocomplete is a great interface for someone coding in the IDE, but an LLM can understand the requirements as a whole, spit out full classes, and validate that the output from the server matches the specs. They work great from outside an IDE.


Either way, you’re sending your company’s biggest asset to another company, aren’t you? I’ll try these tools when they start being able to run locally.


No, or no company would be able to use it. As you type, fragments of code are sent and discarded after use. You need to trust Microsoft to actually do the discarding, but contractually they do, and you can sue them if they accidentally or deliberately keep your code around or otherwise mismanage it.


They are obligated to give data to the government, and the government took part in spying in Brazil for Boeing in the past, but I guess they are using this capability only for a few strategic companies, and most companies are not that.


> government took part of spying in Brazil for Boeing in the past

Do you have more details? Please elaborate.


But that is naive, isn't it? Who has the money and time in their life, to actually sue MS? Even if "you" is a business, few will have the resources for that.


Individuals do not (although a class action would be feasible), but large companies that use Github and other Microsoft products, of course they have both the means to sue Microsoft and the motivation should their business be impacted.


Exactly


I sort of disagree that code is the biggest asset. Take the Yandex leak. What can you do with it? Outcompete them?


> Take the Yandex leak. What can you do with it?

Obviously, add it to the big training set of the next code model.


I surely hope they use my copyrighted code and make millions out of it. Ideal case for me to sue them for lots of money.


How would you ever know? It will come in chunks of a dozen or less lines at a time and it will be written into your competitor's proprietary codebase (that you don't have access to).


> GitHub Copilot [for business] transmits snippets of your code from your IDE to GitHub to provide Suggestions to you. Code snippets data is only transmitted in real-time to return Suggestions, and is discarded once a Suggestion is returned. Copilot for Business does not retain any Code Snippets Data.

Likely, some employee would whistleblow that they're not complying with their privacy policy, and either government litigation or a class action lawsuit would ensue. That legal process would involve subpoenas and third-party auditors being granted access to GitHub/Microsoft's internal code and communications history, which makes it pretty hard to hide something as big as collecting, storing, and then training from a huge amount of uploaded code snippets they promised not to.

It's not inconceivable that they're noncompliant, but my bet would be that if they are collecting data they explicitly promise not to it's an accidental or malicious action by an individual employee, and they will freak out when they discover it and delete everything as soon as they can. If they intended to collect that data, it would be much easier to write that into the policy than deal with all the risk.

Notably, this applies to Copilot for Business, which is presumably what you're using if you are at work.


Couldn't it happen more subtly, without the code lying around for long? The model could be doing online learning (in the ML sense) and only then discard the code it gets sent. This means your code could appear in other people's completions/suggestions without it having to be stored anywhere; it is basically learned into the model. The code could appear almost or even completely verbatim on someone else's machine, possibly someone working for a competitor. Even the fact that it is your code would not be obvious, because MS could claim that Copilot merely constructed the same code by accident from other learned code.

Not sure that this is how the model works, but it is conceivable.


Right.

If you are building something truly valuable locally, and it is innovative or otherwise disruptive and relies on being a first mover, centrally hosted LLMs are a non-starter.

Most software corps have countless millions of lines of code. You'd be spending lifetimes tracing where someone ripped your "copyrighted" techniques and methods.

The complete lack of security awareness and willingness to compromise privacy for convenience in people deeply saddens me.


> willingness to compromise privacy for convenience

I have to ask: do you carry a cellphone?


This is not really a valid comparison.

The cellphone is not a compromise for convenience. It allows me to make a living, providing internet connectivity and lets me keep in contact with friends and family. Without it, my freedoms would be drastically diminished.

With software we develop with, we have choices. We can use OSS. We can try to use open hardware. If we are working on sensitive things, we can use an airgapped system with vim.

When you practice these kinds of routines, they are not a burden. Actually, using vim instead of something like vscode increases productivity eventually. It does take a little bit of time.

When we couple our productivity with centrally hosted services, we greatly diminish our freedom to be productive on a wide range of problem areas. I don't say this to brag, it is to maximize all of our freedom.

In my view, most of us SHOULD be working on "sensitive" things. There is so, so much work to be done for the cause of freedom and liberty in software. We need to reserve that capability in us, we cannot let nameless people have an inside access to our expression.


A cellphone literally tracks your every move. If that's not a privacy concern then I don't know what is. Maybe a device with a microphone that's constantly on you. Oh no wait, that's also a cellphone.

I was born in the 70's, and I can tell you, you can survive just fine without a cellphone.

All of what you describe can be done on a desktop. But hey, if you want to compromise your privacy for some convenience, that's your choice.


Are you going to carry your desktop into the forest on a hammock and work? How about on a plane to other countries?

Will you carry your desktop in your car while living on the road? In the middle of forests and on top of mountains?

Will you work from a campsite with your desktop while not connected to the internet?

Can you have a meeting via your desktop from a rocky beach and no internet service?

A cellphone can't track what I type on my laptop, and it can't read encrypted comms my laptop makes to remote systems. I can put a cellphone in a distant location and use a portable, open source router with a VPN on the router, with encrypted, private DNS.

Not everyone lives inside a comfortable little box. There are all kinds of ways to do life.


Sure, if you are willing to compromise your privacy, which you clearly are.


In these companies, people are not permitted to carry their cellphones into the workspace.


I was talking about someone stealing my codebase.

As for the code being integrated into the LLM: others get the same benefit that you're getting, so I don't really see the issue.

So you can either develop everything on your own, or you can leverage LLMs, helping both yourself and others.


As a hobbyist developer with no formal training, I wish Copilot had a 'teaching' or "Senior Dev" mode, where I can play the role of the Junior Dev. I'd like it to pick up on what I'm trying to write, and then prompt me with questions or hints, but not straight up give me the code.

Or, if that's too Clippy-like annoying, let me prompt it when I'm stuck, and only then have it suggest hints or ask leading questions that guide me to a solution.

I agree, very exciting to see where all this goes.


The Github Copilot Labs extension has "codebrushes" that can transform and explain existing code instead of generating new code, but none of them only give "hints". Maybe one of the codebrushes can take a custom prompt.


You can create custom brushes, or open the "CoPilot Labs" panel and "explain" with a custom prompt.


One thing you might try with Copilot is to ask it to explain the code. It can often give insight, even on code that you yourself wrote a few minutes ago.


Exactly this. I've tried to work ChatGPT into my daily workflow, but you have to give it an excruciating level of detail to get something that remotely resembles real code I'd use, and even then you have to hold its hand to guide it in the correct direction, and still make some manual final touches at the end.

This is why I'm looking forward to Copilot X so much. It will hold much more context than the current implementation, and integrate the Chat interface that's so natural to us.


People have different preferences and habits. Having tried both models I much prefer having a conversation in one window and constructing my code from that in another. Although copilot is about to add some interesting features that may win me back.


How to overengineer with an LLM: don't state the requirements clearly, shove your pet patterns in first, treat following the slice/Redux/awareness-hook pattern as more important than having a working solution, never trust your developers to make decisions, and worry more about how it is built than about building a solution.

My way of working with an LLM is to start with a good, clear requirement, have the LLM propose a file organization, then query it for the contents of each file (just the code, no comments) and assemble a working prototype fast. Then you can iterate over the requirements and evolve from there.


Generally, I agree that approach works well. It’s going to perform better if it’s not trying to fulfill your team’s existing patterns. On the other hand, allowing lots of inconsistencies in style in your large code base seems like a quick way to create a hot mess. Chat prompts seem like a really difficult way to communicate code style and conventions though. A sibling comment to yours mentions that a copilot autocomplete seems like a much better pattern for working in an existing code base, and I tend to agree that’s much more promising. Read the existing code, and recommend small pieces as you type.


How often do you get working code that way? Unless it's something trivial that fits in its scope, I'd say that's going to produce garbage. I've seen it steer into garbage on longer prompt chains about a single class (of medium complexity), and I doubt it would work at project level. Mind sharing the projects?


I work only with closed-source codebases and use this approach for prototypes, but using the same example as the blog, I prompt: "The current system is an online whiteboard system. Tech stack: react, use some test framework, use konva for the canvas, propose a file organization, print the file layout tree (without explanations)." The trick is that for every chat the context is the requirement + the file system + the specific file, so you don't have the entire codebase in the context, only the current file. Also, use GPT-4; GPT-3 is not good enough.
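A tiny sketch of how that per-file context might be assembled (the requirement text and file names here are placeholders, not my real project):

  REQUIREMENT = ("The current system is an online whiteboard system. "
                 "Tech stack: react, konva for the canvas, with tests.")
  FILE_TREE = "src/components/Whiteboard.jsx\nsrc/state/boardSlice.js\nsrc/tests/Whiteboard.test.js"

  def build_prompt(target_file):
      # Every chat gets the requirement + the layout + one target file,
      # so the whole codebase never has to fit into the context window.
      return (f"{REQUIREMENT}\n\nFile layout:\n{FILE_TREE}\n\n"
              f"Write the full contents of {target_file}. Output only code, no comments.")

  print(build_prompt("src/components/Whiteboard.jsx"))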

My main point is that the blog post's final output is mocks, tests, awareness hooks and Redux, where an architect feels good seeing his patterns; with my approach you have a prototype online whiteboard system.


I feel like this is a bunch of ceremony and back and forth, and, considering GPT-4's speed, I feel like I would fly past this approach just using Copilot and coding.

I look forward to offloading these kinds of tasks to LLMs, but I'm not seeing the value right now. Using them feels slow and unsatisfying; you need to triple-check everything and specify everything relevant for context.

Also, maybe it's just me, but verbalizing requirements unambiguously can often be harder than writing the code itself. And it's not fun. If GPT-4 were as fast as GPT-3.5, it would probably be a completely different story.


The article stresses never to put anything that may be confidential into the prompt. Yet ChatGPT offers an opt-out from using your data for training.

For most purposes that seems to be sufficient, doesn't it? Or are there reasons not to trust OpenAI on this one?


I will never have full trust in an assertion unless (a) it's included in a contract that binds all parties, (b) the same contract includes a penalty for breaking the assertion that's severe enough to discourage it, and (c) I know the financial and other costs of litigation won't be severe for me.

In short, unless my large employer will likely win in punishing OpenAI should they break a promise, that promise is just aspirational marketing speak.

For data retention and usage, I'd also need a similar contractual agreement to tie the hands of any company that would acquire them in the future.


Copilot for individuals stores code snippets by default, according to their TOS. Sure, you can probably find a way to opt out of that somewhere as well, but you'd have to read the TOS for every plugin and service you use, find the opt-out links, and make sure you don't opt in again via some other route (such as ChatGPT proper rather than Copilot, or some other GitHub or VS Code plugin, service, button, or knob).


> Or are there reasons not to trust OpenAi on this one?

Yes, more related to general tech history and not a dig on OpenAI though.


There was a bug where the chat history of some users was visible to others.


From a GDPR or commercial confidentiality perspective, it doesn't matter what OpenAI says it will do with your data; you can't share it with them.

Let's say your doctor enters sensitive info about you, and despite having told OpenAI not to train on it, they use it anyway due to a bug. A year from now, ChatGPT is telling anyone and everyone about your sensitive info.

Would you exclusively blame ChatGPT?


> are there reasons not to trust OpenAi on this one?

Yes, the fact that they are closed, not open, for one. And that they switched from open to closed the moment it benefited them to do so.


Maybe this will get people to finally sit down and do some thinking, planning, pseudo-code, etc. before diving in and starting to code.


I guess this is neat but I’d rather write code myself.


I'd rather farm all my own food, build my own house, and teach my own kids, but I don't have infinite time each day.


The prompt was almost as much work as the code, and there was no way to write that prompt without a CS education and/or years of development experience.


I have found that this applies to many Crafts.

Cutting wood is easy. Simple really. Crafting an attractive and functional chair requires discipline. Designing it? Brilliance.


Presumably you have time to write your own code as a developer, since you're not being paid to be a farmer or carpenter?


If you are optimizing for an end goal rather than enjoying the process... What is that goal? Why does it matter?


Yes. I would rather write the code myself too. But it's a good idea to use it to explore solutions or alternate implementations


It's like talking to a student or an intern. Which is not bad normally because we are also educating them.


I feel like from an information theory perspective there is a lower bound on how little we can write to get a sufficiently specific spec for the AI to generate correct code.

This example seems like almost as much work as just writing the code myself. I think English is just too fuzzy; maybe eventually we will get a language tailored to AI that puts more specific limits on the meanings of words. But then how is it all that different from Python?


> a lower bound on how little we can write to get a sufficiently specific spec for the AI to generate correct code.

Interesting thought. I believe a lower bound on the number of bits is the negative log (base 2) of the probability of such code appearing "in the wild", and larger if the training set is biased and/or the model is not fully trained. For example, if a given snippet has roughly a one-in-a-million chance of appearing, you need at least ~20 bits of prompt to pin it down.


A useful approach, but this is a tiny green field project. I'm not so sure it would work in a large existing proprietary system, where you shouldn't describe too much of the "NDA protected context"...


To me, this is likely an area where we'll see future coders tested:

Interviewer: Here is a very specific project. And this part here is NDA covered. We have provided a context prompt with all the generals. Let's say you are new here and we need you effective today. Show us how you'll cover the last mile with the LLM by writing prompts that do not violate NDA but get the needed work done. Then whiteboard for your team a prompt schema & policy that you think will work for this project.

I.e. a creativity exercise at the very least. You want someone who can code _for a prompt, to solve coverage problems_, and this is still coding.

For now I think a lot of people will hoard this kind of prompt info/leverage-pattern stuff when they discover it. It's not about the individual prompts.


Is it me, or does this just create a bunch of extra steps and gratuitous complexity? These tools don't seem that efficient, nor do they make anything easier. I'm sorry to the enthusiasts here - I am usually excited about AI and a student of Computational Linguistics, but I think this emperor is naked.


I have been trying to find a use case for these LLMs, and I continue to keep an eye out in case someone figures out a way to use them that I find useful in my workflow. My only use for them so far is as an exploratory tool for tasks I'm not familiar with, such as when I have to work with programming languages I never use. For such things it's great, as not only do I not have to go digging through the documentation, I also don't have to then search the web for examples of how it's actually used.

This is taking into account that I have reduced the cost of using it as much as possible, since I do not have to switch to a browser tab, ask my question, wait for a reply, and then copy any useful text to my editor. I have it set up as a function call inside my REPL, with history saved to a local file in case I need it.

Even with this convenient way of using it, I notice that pretty much the only time I use it when working on my actual projects is to save me the trouble of doing a Google search for trivial things, such as looking up word definitions/synonyms for naming things, or anything else where I would expect to find the answer with just a bit of googling. I can quickly make my request, continue with whatever I was doing, and then return for my answer later.


What I want is a prompt that continuously copies whatever I'm doing, so I can ask it to complete the task.

For example, say I'm converting all identifiers in a file from lowercase to CamelCase. Then after doing like 3 of them, I can ask the LLM to take over and do the remainder.


I mean, that kind of task is more than easy to do today. You could probably just create a VS Code extension where you type "convert all identifiers in this file that match this pattern from lowercase to camel case" and it pipes that, along with the file, to the GPT API to do it instantly (without even needing to give it the first 3 examples).
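A minimal sketch of that idea, minus the VS Code plumbing, using the openai Python package as it looked in early 2023 (the model name and prompt wording are placeholders; it assumes OPENAI_API_KEY is set in the environment):

  import openai

  def transform_selection(selected_code, instruction):
      # Send the editor selection plus a plain-English instruction to the model
      # and return the rewritten code from the first choice.
      response = openai.ChatCompletion.create(
          model="gpt-4",
          messages=[
              {"role": "system", "content": "Return only code, with no explanations."},
              {"role": "user", "content": f"{instruction}\n\n{selected_code}"},
          ],
      )
      return response["choices"][0]["message"]["content"]

  print(transform_selection("user_name = get_user_name()",
                            "Convert all identifiers in this code from snake_case to CamelCase."))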


Sometimes just doing stuff takes less energy than thinking about how it can be automated.


Great example of how a GPT can reason on your behalf and dramatically improve your performance. For instance, it could watch for inconsistent approaches to design, or even continue a complex implementation you’ve started just from examining context signals.


For some reason, this reminds me of how we used to give instructions to Indian coders in the 90s and early 2000s. You would have to spell out everything. What you got back was nearly there, but some back-and-forth was involved.

This brings back some terrible memories.


The big difference is that you get the results immediately and iterations take minutes not days


Yes, you can get a ton more code that you have to check over with a fine-toothed comb in much less time! Is that a win?


And no time zone differences!


One initial reaction to the prompting style is how similar it is to a human-to-human interaction. For example, a team lead communicating requirements to a wider team composed of less experienced engineers may also follow this type of iterative exchange, continuing until he or she is satisfied that the team understands the work to be done and has the guide rails to be successful.

I recently heard a description about the way this technology will change technical work that resonated: we will become more like the movie director, and less like the actors.


> He's using a generic application example in here: one thing to be wary of when interacting with ChatGPT and the like is that we should never put anything that may be confidential into the prompt, as that would be a security risk. Business rules, any code from a real project - all these must not enter the interaction with ChatGPT.

Remember, when storing your business code on Github servers hosted by Microsoft, it is important to not place real code from a project into OpenAI servers hosted by Microsoft. That would be a security risk.


The hosting is not the issue. Github would have different security requirements for code hosted in a private repo for a paying org than OpenAI would for free users sending prompts to an LLM. It can and should be assumed anything you type into ChatGPT is being logged to be potentially read by a human.


I can’t help thinking: isn’t this way more work than just coding it myself? Anybody else have the same thought?


It depends how you use it. I've been using it to skip boilerplate coding and get straight to the meaty bits. It took me a few days to sketch out an application using ChatGPT to handle the boilerplate, including dependency management (python, poetry, etc.).

I've had to handle the specific pieces of implementation myself. Especially unit testing new pieces of code. When asked to generate unit tests, it does ok, but it doesn't get the spirit of the code (my intended purpose) and so I'm left filling in a bunch of blanks.


> It depends how you use it.

How about specifically how they use it in this article? It doesn't seem faster than writing it themselves?


This is interesting. As a developer I haven't thought about using GPT to give me a task list instead of code right away.

Right now it still seems hit and miss. The examples can be impressive, but they also have to be very generic, since GPT doesn't have access to your codebase.

The real game changer will be a few years from now when something like GPT is an addon in VSCode... and it will know about your entire codebase.

In fact, a wrapper around GPT could tell it about all the packages you use, the language, the framework, etc., so much of the prep work this article documents would be done in the background.

Then you could straight up ask the bot to give suggestions to refactor your code, or make wide-sweeping changes. Or ask it to upgrade, e.g., your Symfony PHP app from version X to version Y, including major upgrade changes... and even if it gets only 90% of the way, you can review everything in git.

LLMs will never be embodied, but we are, so it's like teaching a robot and having it do a lot of grunt work for us.

If only we didn't have to listen to all the hubris in Big Tech right now about the metaverse and robots taking over humanity /smh


Useful as a form of learning and experimentation. Not applicable at all, in my view, due to the lack of ownership of the generated code. There is no ability to copyright and protect the intellectual output of generative AI processes.

Even when your prompts are clearly the pseudocode that defines the scope of the generated response. Until this situation is legally cleared up, I will be very cautious about using LLMs outside rapid prototyping and the conceptual phase. Not to mention the madness of AutoGPT, or the more realistic approach of LangChain.

It is early in the game, and the hype train is moving faster than crypto and web3 combined.

I see a lot of AI startups introducing the same capabilities through OpenAI API and prompts, without consideration of prompt injection risk. So we will see who will survive.


What I would really love is if we had a broader linting tool built on this sort of tech that could go the other way.

So often we are halfway through refactoring the code from a bad pattern that has a proven track record of issues, to one that at least prevents the worst ravages of the old one. There are never any guarantees that you will get everyone on board for this. Someone will defect, and they will keep copying and pasting the old pattern and if they code faster than you then you never get to the end.

Give me a way to mark a bunch of code as 'the old way' and hook that information into autocomplete or even just a linter that runs at code review time.


I started a bit of an exploration around prompts and code a week or three back. I want to figure out the down/up-sides and create tools for myself around it.

So, for this project (a game), I decided "for fun" to try to not write any code myself, and avoid narrow prompts that would just feed me single functions for a very specific purpose. The LLM should be responsible for this, not me! It's pretty painful since I still have to debug and understand the potential garbage I was given and after understanding what is wrong, get rid of it, and change/add to the prompt to get new code. Very often completely new code[1]. Rinse and repeat until I have what I need.

The above is a contrived scenario, but it does give some interesting insights. A nice one is that since there are one or more prompts connected to all the code (and its commits), the intention of the code is very well documented in natural language. The commit history creates a rather nice story that I would not normally get in a repository.

Another thing is, getting an LLM (ChatGPT mostly) to fix a bug is really hit and miss and mostly miss for me. Say, a buggy piece comes from the LLM and I feel that this could almost be what I need. I feed that back in with a hint or two and it's very rare that it actually fixes something unless I am very very specific (again, needing to read/understand the intention of the solution). In many cases I, again, get completely new code back. This, more than once, forced my hand to "cheat" and do human changes or additions.

Due to the nature of the contrived scenario, the code quality is obviously suffering but I am looking forward to making the LLM refactor/clean things up eventually.

On occasion ChatGPT tells me it can't help me with my homework. Which is interesting in itself. They are actually trying (but failing) to prevent that. I am really curious how gimped their models will be going forward.

I've been programming for quite a long time. I've come to realize that I don't need to be programming in the traditional sense. What I like is creating. If that means I can massage an LLM to do a bit of grunt work, I'm good with that.

That said, it still often feels very much like programming, though.

[1] The completely new code issue can likely be alleviated by tweaking transformers settings

Edit: For the curious, the repo is here: https://github.com/romland/llemmings and an example of a commit from the other day: https://github.com/romland/llemmings/commit/466babf420f617dd... - I will push through and make it a playable game, after that, I'll see.


That is really interesting experiment! I have so many questions.

- do you feel like this could be a viable work model for real projects? I recognize it will most likely be more effective to balance LLM code with hand written code in the real world.

- some of your prompts are really long. Do you feel like the code you get out of the LLM is worth the effort you put in?

- given that the code returned is often wrong, do you feel like this could be feasible for someone who knows little to no code?

- it seems like you already know well all the technology behind what you are building (I.e. you know how to write a game in js). Do you think you could do this without already having that background knowledge?

- how many times do you have to refine a prompt before you get something that is worth committing?


I think it could be viable, even right now, with a big caveat: you will want to do some "human" fixes in the code (not just the glue between prompts). The downside of that is you might miss out on parts of the nice natural-language story in the commit history. But the upside is you will save a lot of time.

Down the line you will be able to (cheaply) have LLMs know about your entire code-base and at that point, it will definitely become a pretty good option.

On prompt-length, yeah, some of those prompts took a long time to craft. The longer I spend on a prompt, the more variations of the same code I have seen -- I probably get impatient and biased and home in on the exact solution I want to see instead of explaining myself better. When it's gone that far, it's probably not worth it. Very often I should probably also start over on the prompt as it probably can be described differently. That said, if it was in the real world and I was fine with going in and massaging the code fully, quite some time could be saved.

If you don't know how to code, I think it will be very hard. You would at the very least need a lot more patience. But on the flip side, you can ask for explanations of the code that is returned and I must actually say that that is often pretty good -- albeit very verbose in ChatGPT's case. I find it hard to throw a real conclusion out there, but I can say that domain knowledge will always help you. A lot.

I think if you know javascript, you could easily make a game even though you had never ever thought about making a game before. The nice thing about that is that you will probably not do any premature optimization at least :-)

All in all, some prompts were nailed down on the first try; the simple particle system was one such example. Some other prompts -- for instance the map generation with Perlin noise -- might take 50 attempts.

A lot of small decisions are helpful, such as deciding against any external dependencies. It's pretty dodgy to ask for code around something (e.g. some noise library) that you need to fit into your project. I decided pretty early that there should be no external dependencies at all and that all graphics would be procedurally generated. It has helped me, as I don't need to understand any libraries I have never used before.

Another note related to the above: an upside and downside of a high-ish temperature is that you get varying results. I think I should probably change my behaviour around that and possibly tweak it depending on how exact I feel my prompt is.

I often find myself wondering where the cap of today's LLMs is, even if we go in the direction of multiple models with a base that does the reasoning -- and I have to say I keep getting surprised. I think there is a good possibility that this will be the way some kinds of development are done. But, well, we'd need good local models for that if we work on projects that might be of a sensitive nature.

Related to amount of prompt attempts: I think the game has cost me around $6 in OpenAI fees so far.

One particularly irritating (time consuming) prompt was getting animated legs and feet: https://github.com/romland/llemmings/commit/e9852a353f89c217...


That's a beautiful readme, starred!

Out of curiosity, right now would you say you have saved time by (almost) exclusively prompting instead of typing the code up yourself? Do you see that trending in another direction as the project progresses?


It was far easier to get big chunks of work done in the beginning, but that is pretty much how it works for a human too (at least for me). The thing that limits you is the context-length limit of the LLM, so you have to be rather picky about what existing code you feed back in. With this then comes the issue of all the glue between the prompts, so I can see that the more polished things need to become, the more human intervention is needed -- this is a trend I already very much see.

If there is time saved, it is mostly because I don't fear some upcoming grunt work. Say, for instance, creating the "Builder" lemming. You know pretty much exactly how to do it but you know there will be a lot of one-off errors and subtle issues. It's easier to go at it by throwing together some prompt a bit half-heartedly and see where it goes.

On some prompts, several hours were spent, mostly reading and debugging outputs from the LLM. This is where it eventually gets a bit dubious -- I now know pretty much exactly how I want the code to look since I have seen so many variants. I might find myself massaging the prompt to narrow in on my exact solution instead of making the LLM "understand the problem".

Much of this is due to the contrived situation (human should write little code) -- in the real world you would just fix the code instead of the prompt and save a lot of time.

Thank you, by the way! I always find it scary to share links to projects! :-)


No worries, going to check out some of the commits when I get a bit more free time as well. The concept is intriguing!

The usefulness of LLMs for engineering things is very hard to gauge, and your project is going to be quite interesting as you progress. No doubt they help with writing new things, but I spend maybe ~15% of my time working on something new, vs maintenance and extensions. The more common activities are very infrequently demonstrated; either the usefulness diminishes as the context required grows, or they simply make for less exciting examples. Though someone in my org has brought up an LLM tool that tries to remedy bugs on the fly (at runtime), which sounds absolutely horrific to me...

It sounds similar to my experience with Copilot then. In small, self-contained bits of code -- much more common in new projects or microservices for example -- it can save a lot of cookie cutter work. Sometimes it will get me 80% of the way there, and I have to manually tweak it. Quite often it produces complete garbage that I ignore. All that to say, if I wasn't an SE, Copilot brings me no closer to tackling anything beyond hello world.

One big benefit though is with the simpler test cases. If I start them with a "GIVEN ... WHEN ... THEN ..." comment, the autocompletes for those can be terrific, requiring maybe some alterations to suit my taste. I get positive feedback in PRs and from people debugging the test cases too, because the intention behind them is clear without needing to guess the rationale for the test. Win win!
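A made-up example of the pattern: in practice only the comment and the test name are typed by hand, and Copilot tends to propose the body (the Cart class here exists only to make the snippet self-contained and runnable):

  from dataclasses import dataclass, field

  @dataclass
  class Cart:
      prices: list = field(default_factory=list)
      discount: float = 0.0
      def apply_discount(self, fraction):
          self.discount = fraction
      def total(self):
          return sum(self.prices) * (1 - self.discount)

  # GIVEN a cart with two items
  # WHEN a 25% discount is applied
  # THEN the total reflects the discounted price
  def test_discount_reduces_total():
      cart = Cart(prices=[10.0, 30.0])
      cart.apply_discount(0.25)
      assert cart.total() == 30.0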


Just curious, you’re using which version?


I have experimented quite a bit with various flavours of LLaMa, but have had little success in actually getting not-narrow outputs out of them.

Most of the code in there now is generated by gpt-3.5-turbo. Some commits are by GPT-4, and that is mostly due to context length limitations. I have tried to put which LLM was used in every non-human commit, but I might have missed it in some.


If you look at what the prompter had to know in order to get a useful output you can see how far away we are from replacing that individual with a business stakeholder.

That’s why I view these tools as “productivity enhancements” rather than a straight replacement of a job. In some cases maybe, but not for coders just yet…

I think the most underrated and useful part of this process is the ability to get going.

For me the starting energy of a project is the thing that blocks me. With ChatGPT, it's a simple prompt to get the conversation going. Once in motion, I can put the puzzle pieces together while ChatGPT helps me keep momentum.


Asking an LLM to write complex code can approach the effort of writing the code yourself. Having it plan things out can, however, kick-start a nice direction. LLMs are great for code that is a single, clear function.


do people enjoy working this way? wasting time verbalizing your thoughts, stating the obvious, wordsmithing to get the thing to "understand" what you _actually_ want?


Sometimes I don't know what I want, or I don't know if the way I want to do it is possible. For these paths, using ChatGPT is very useful. So far my use cases are 'taking care of boilerplate' and 'finding ways to do things'.


Public service announcement that myself and others are actively trying to poison the training data used for code generation systems. https://codegencodepoisoningcontest.cargo.site/

See previous discussion here: https://news.ycombinator.com/item?id=35545442


If an AI could write your tests, doesn’t that suggest the tests are just checking implementation details? Unless you describe your business rules in detail... as a test, for example! But then how could the AI write those rules as tests?

Perhaps you could automate the boring parts I guess. But maybe that suggests we are working with a poor abstraction that doesn’t allow us to be as terse and precise as we want to be?


This got me wondering about best techniques for integrating LLM code assistants into day-to-day software development, and hence Ask HN: What is your GitHub Copilot (code LLM assistant) workflow?

Please share your experience here: https://news.ycombinator.com/item?id=35613576

I'd like to learn what is working and useful.


I haven't used ChatGPT myself, so it's hard for me to deduce from the article by itself, but what is the advantage supposed to be here? At first sight it doesn't appear to be faster or require less knowledge than writing it yourself does - is that wrong, or is there another advantage that I'm missing?


A bit off topic, but this article made me think of a weird future.

LLMs writing ad-hoc code, handling message passing, and self-contained LLM functions in software systems, ushering in a whole era of non-deterministic computing. I thought our infrastructure was shaky now...


That's refinement-style programming from a novel angle, but still clearly refinement-style.


Last month I had the task of writing lots of different regexes to extract pieces of information. I tried ChatGPT for many of them, but I found the results really weak. Somehow it's not able to generate even a simple regex; there's always something wrong.


Is there a tool that lets you do this within a text editor (for instance VS Code)? Using a selection instead of copy-pasting, having the LLM store its output directly in local files, maybe giving it access to a shell to run the tests on its own?


If there isn't a tool currently...then give it time, and eventually there will be tools similar to what you described. I'd guess they'll be called something like prompt editors (like text editors, etc.). ...Or, maybe they'll be called chatditors...no, no, prompt editors is better. ;-)


Copilot X will include that capability.


Isn’t the point of code to express what we want in a succinct and expressive way?

If we need all this software to help us, maybe we should look at the languages we’re using and make better, more intuitive ones.


The time spent on the prompt and the really not sophisticated outcome don’t balance each other out, in my opinion.


I've tried using ChatGPT for writing Vitest tests, and it can't do it, full stop.

If you look at the end, it parroted out some tests for jest. True, the APIs are mostly compatible and you can probably change that to Vitest with a couple of lines of code changed, but for more advanced tests, that won't necessarily work.

Really disappointed to see this so highly upvoted, when it's pure garbage


That library doesn’t even appear to have a stable release yet, and was at v0.0.x as of a year or so ago… you also may be using ChatGPT 3.5, which may predate this library. As a dev with 15 years of experience I haven’t even switched over from jest (but plan to)… all this to say, maybe we can give the bot some slack here. It should be possible to include vitest docs and examples in your prompts to teach it in context. Did you try that?


Sure, I realize it's unsuccessful at using vitest because it's (relatively) new.

I'm just saying, this was a really telling example of how to use it for prompting.

A very large chunk of the tools I use in Javascript-land are "too new" for ChatGPT to work with properly.

Giving context unfortunately doesn't really work as ChatGPT usually prioritizes what it's absorbed through the corpus over anything you tell it.

To be clear, it does fine with new information if the things you ask it for don't match token sequences it's already been trained on. So if you give it a fictional library and ask it to perform some task with it that doesn't look too much like what it might do with another library that accomplishes a similar thing with a similar API, it will actually use the custom code more successfully.

But for Vitest, it can't accept enough of the docs you might provide for it to be useful to you (though admittedly, sometimes it will show how to do something with jest that at least makes finding the right thing in vitest easier).

By the way, if you are planning to switch over in the future, the path for doing that is seemingly well documented by vitest and seems to be pretty straightforward as well, though I haven't meaningfully used Jest for comparison

edit: to be clear, I'm very impressed with ChatGPT's capabilities, and I think there are good examples of prompting where it does meaningful work in tandem with the human driver exercising their own judgment.

This was an example of a person asking it for things while not pointing out its limitations, which downplays the extent to which one needs to exercise one's judgment when using it. If they failed to point out the things ChatGPT got wrong which I know about, why would I trust that the things I can't judge are accurate?


This is an amazing demonstration, but I'm worried that when this goes mainstream, we'll inherit a ton of baggage from today's programming. Specifically:

* The tests are written in BDD style "it('should xyz')", which programmers do in code like this for convenience. But if we're automating their creation, then actual human-readable Cucumber clauses would be more useful. Maybe the tests can be transpiled. This isn't the AI's fault, but more of a symptom of how the original spirit of BDD as a means for nonprogrammers to test business logic seems to have been lost.

* React hooks and Redux syntax are somewhat contrived/derivative. The underlying concepts like functional reactive programming and reducers are great, but the syntax is often repetitive or verbose, with a lot of boilerplate to accomplish things that might be one-liners in other languages/frameworks. This is more of a critique of the state of web programming than of the AI's performance.

* MVVM is a fine pattern, but at the end of the day, it's an awful lot of handwaving to accomplish limited functionality. What do I mean by that? Mainly that I question whether the frontend needs models, routes, controllers (which I realize are MVC), etc. I mourn that we lost the idempotent #nocode HTML of the 90s and are back to manually writing app interfaces by hand in Javascript (like we did for native desktop apps in the C++ OOP days) when custom elements/components would have been so much easier. HTMX combined with some kind of distributed serverless lambda functions (that are actually as simple as they should be) would reduce pages of code to a WYSIWYG document that nonprogrammers could edit.

What I'm really getting at is that I envisioned programming going a different direction back in the late 90s. We got GPUs/TensorFlow and Docker and WebAssembly and Rust etc etc etc. And these things are all fine, but they're contrived/derivative too. More formal systems might look like multicore/multimemory transputers (or Lisp machines), native virtual machines with full sandboxing built in so anything can run anywhere, immutable and auto-parallelized languages like HigherOrderCO/HVM or true vector processing with GNU Octave (MATLAB) so that we don't have to manually manage vertex buffers or free memory, etc.

I've had architectures in mind for better hardware and programming languages for about 25 years (that's why I got my computer engineering degree) but I will simply never have time to implement them. All I do is work and cope. I just keep watching as everyone reinvents the same imperative programming wheel over and over again. And honestly it's gone on so long that I almost don't even care anymore. It feels more appealing in middle age to maybe just go be a hermit, get out of tech. I've always known that someday I'd have to choose between programming and my life.

Anyway, now that I'm way too old to begin the training, I wonder if AI might help to rapidly prototype truly innovative tools. Maybe more like J.A.R.V.I.S., where it's just on all of the time and can iterate on ideas at a superhuman rate to assist humans in their self-actualization.

Then again, once we have that, it becomes trivial to implement the stuff that I rant about. Maybe we only have about 5-10 years until all of the problems are solved. I mean all of them, everywhere, in physics/chemistry/biology/etc. Rather than automating creative acts and play as AI is doing now. If the Singularity arrives in 2030 instead of 2040, that also seems like a strong incentive to go be a hermit.

Does any of this resonate with anyone? That somehow everything has gone terribly wrong, but it's more of a hiccup than a crisis? That maybe the most impactful thing that any of us can do is.. wait for things to get better?


This is deeply resonant with me for the following reasons:

1) Age

2) BDD-style or what I call a madlib proxy for playing cucumber on TV. Not a fan having used it in an RoR context I can only call hipster-engineering, not what DHH described.

3) I just had the discussion on redux vs. datomic vs. riak with friends yesterday.

4) Ditto the conversation on MVVM and the implied constraint complexity of putting nodejs and chromium in the same deployment package and calling it electron while carrying on how simple it is relative to... a world where everything is actually native all the way down?

5) Me too on the CASE era.

6) Cue Donald Knuth on literate programming. One thing that cucumber is not, but I think taking another iteration at literate programming in light of GPT or LLMs is a good idea, since Knuth is never wrong, just 50 years ahead of his time. But we'd need a collaboration of human-computer agents patterned on a sensemaking protocol that can resolve subjective truth by consensus of man and machine. How else could you possibly resolve the fact that the SOTA lies to me on a daily basis while defending itself and its lack of veracity with force, in what can only be seen as emulating the culture of one's parents.

7) Yes, AI should help with the iterations. Those short sketch-to-demo cycles we used to do at the design studio, sketch on Monday and demo on Friday, should be much easier today, going from breakfast sketch to dinner demo, but I don't think they are. The tooling is radically better, but that better has come at the cost of complexity and going sideways, neither of which is being fully felt and accounted for reflectively, i.e. they're not how you get to typing less and having the tools do the work, because when they break, the debugging is mind-crushing.

8) I think the thing that's missing in the trivial part is that it's not actually trivial, but particularly because the software is the message and that insight stems from the fact that software has emergent properties such as extensibility, composability, and a resultant rate of change that make it very difficult to compare from decade to decade because software's fundamental disequilibrium stems from the fact that the full stack is in constant flux from a mad hatter's pop culture where we never sing the same song twice. There's value in theme and variations if it can be modeled as improvisational human-computer design pairing rather than yet another orchestration. Joe Beda was as right about improvisation as Knuth is about the art of computer programming.

9) I guess the t-shirt is: I'm not waiting...

10) In the immortal words of Raymond Loewy: Never leave well enough alone.

If there's a set of artifacts in software that achieve what I hope for with AI, it's somewhere between Bret Victor and https://iolanguage.org/



