I feel very left out of all this LLM hype. It's helped me with a couple of things, but usually by the time I'm at a point where I don't know what I'm doing, the model doesn't know any better than I do. Otherwise, I have a hard time formulating prompts faster than I can just write the damn code myself.

Am I just bad at using these tools?



I'll give you an example. I took advantage of some free time these days to finally implement some small services on my home server. ChatGPT (3.5 in my case) has read the documentation of every language, framework, and API out there. I asked it to start with Python 3's http.server (because I know it's already on my little server) and write some code that would respond to a couple of HTTP calls and do this and that. It created an example that customized the do_GET and do_POST methods of http.server, which I didn't even know existed (the methods). It also did well when I asked it to write a simple web form. It did not do so well when things got more complicated, but at that point I already knew how to proceed. I finished everything in three hours.
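
For anyone curious, a minimal sketch of that kind of handler (the /status route and the response bodies here are invented for illustration, not what ChatGPT actually produced):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Called for every GET request; self.path holds the URL path.
            if self.path == "/status":
                self.send_response(200)
                self.send_header("Content-Type", "text/plain")
                self.end_headers()
                self.wfile.write(b"ok\n")
            else:
                self.send_error(404)

        def do_POST(self):
            # Read the request body and echo back how long it was.
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"received %d bytes\n" % len(body))

    if __name__ == "__main__":
        HTTPServer(("", 8000), Handler).serve_forever()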

What did it save me?

First of all, the time to discover the do_GET and do_POST methods. I know I should have read the docs, but it's like asking a colleague "how do I do that in Python" and getting the correct answer. It happens all the time; sometimes I'm the one asking, sometimes I'm the one answering.

Second, the time to write the first working code. It was by no means complete but it worked and it was good enough to be the first prototype. It's easier to build on that code.

What didn't it save me? All the years spent learning to recognize what the code written by ChatGPT did and how to go on from there. Without those years on my own I would have been lost anyway, and maybe I wouldn't have been able to ask it the right questions to get the code.


Regarding your last point about what it didn't save you, it reminded me of this blog post: https://overbring.com/articles/2023-06-23-on-using-llm-to-ge...


I've been learning boring old SQL over the last few months, and I've found the AIs quite helpful at pointing out some things that are perhaps too obvious for the tutorials to call out.

I don't mind taking suggestions about code from an AI because I can immediately verify the AI's suggestion by running the code, making small edits, and testing it.
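
That "run it and see" loop is easy to do with SQL from Python. A sketch using the stdlib sqlite3 module and an in-memory database (the table, data, and suggested query are all made up for illustration):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
        INSERT INTO orders (customer, total) VALUES ('alice', 20.0), ('bob', 35.5), ('alice', 12.5);
    """)

    # The AI-suggested query to check: spend per customer, largest first.
    suggested = """
        SELECT customer, SUM(total) AS spent
        FROM orders
        GROUP BY customer
        ORDER BY spent DESC
    """
    for row in conn.execute(suggested):
        print(row)  # expect ('bob', 35.5) then ('alice', 32.5)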


But this is my pet peeve when people claim this: the only reason it holds is that the AI code is small and constrained in scope. Otherwise the very claim that humans can easily and quickly verify AI code would, like, violate Rice's Theorem.


That's underutilizing complexity theory. Since not every problem has to be framed in terms of arbitrary (Turing-complete) programs, there are better theorems to apply than Rice's Theorem, such as:

* IP=PSPACE (you can interactively verify the correctness of any PSPACE computation with a polynomial-time verifier)

* MIP=NEXPTIME (you can verify the correctness of any NEXPTIME computation with two provers who cannot communicate with each other)

* NP=PCP(log n, 1) (you can verify any NP statement using O(log n) bits of randomness while reading just O(1) bits of the proof)

What these mean is that a human is indeed able to verify the correctness of output from a machine with stronger computational abilities than the human's own.
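
None of that machinery is needed to see the basic asymmetry, though. Here is a toy sketch of the plain NP case: checking a claimed solution is cheap even when finding one may be hard (the formula and assignment below are invented):

    def verify_sat(clauses, assignment):
        # clauses: list of clauses, each a list of ints (+v means variable v, -v means NOT v).
        # assignment: dict mapping variable -> bool. True iff every clause has a satisfied literal.
        return all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        )

    # (x1 OR NOT x2) AND (x2 OR x3), checked in time linear in the formula size
    clauses = [[1, -2], [2, 3]]
    print(verify_sat(clauses, {1: True, 2: True, 3: False}))  # True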


I'd reframe that slightly: it's not that you are bad at using these tools, it's that these tools are deceptively difficult to use effectively and you haven't yet achieved the level of mastery required to get great results out of them.

The only way to get there is to spend a ton of time playing with them, trying out new things and building an intuition for what they can do and how best to prompt them.

Here's my most recent example of how I use them for code: https://til.simonwillison.net/github-actions/daily-planner - specifically this transcript: https://gist.github.com/simonw/d189b737911317c2b9f970342e9fa...


I've developed a workflow that's working pretty well for me. I treat the LLM as a junior developer that I'm pair programming with and mentoring. I explain to it what I plan to work on, run ideas by it, show it code snippets I'm working on and ask it to explain what I'm doing, and whether it sees any bugs, flaws, or limitations. When I ask it to generate code, I read carefully and try to correct its mistakes. Sometimes it has good advice and helps me figure out something that's better than what I would have done on my own. What I end up with is like a living lab notebook that documents the thought processes I go through as I develop something. Like you, for individual tasks, a lot of times I could do it faster if I just wrote the damn code myself, and sometimes I fall back to that. In the longer term I feel like this pair programming approach gives me a higher average velocity. Like others are sharing, it also lowers the activation energy needed for me to get started on something, and has generally been a pretty fun way to work.


Here's what I find extremely useful:

1 - Very hit or miss -- I need to fiddle with the AWS API in some way. I use this roughly every other month and never remember anything about it between sessions. ChatGPT is very confused by the multiple versions of the APIs that exist, but you can normally talk it into giving you a basic working example that is then much easier to modify into exactly what I want than starting from scratch (see the sketch after this list). Because of the multiple versions of the AWS API, it is extremely prone to hallucinating endpoints. But if you persist, it will eventually get it right enough.

2 - I have a ton of bash automations to do various things. just like aws, I touch these infrequently enough that I can never remember the syntax. chatgpt is amazing and replaces piles of time googling and swearing.

3 - snippets of utility python to do various tasks. I could write these, but chatgpt just speeds this up.

4 - working first draft examples of various js libs, rails gems, etc.
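
For item 1, the "basic working example" I'm after usually looks something like this, assuming the boto3 Python SDK (the bucket name and region are placeholders):

    import boto3

    s3 = boto3.client("s3", region_name="us-east-1")

    # List buckets, then the first few objects in one of them.
    for bucket in s3.list_buckets()["Buckets"]:
        print(bucket["Name"])

    resp = s3.list_objects_v2(Bucket="my-example-bucket", MaxKeys=5)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])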

What I've found has extremely poor coverage in chatgpt is stuff where there are basically no stackoverflow articles explaining it / github code using it. You're likely to be disappointed by the chatgpt results.


As the article says, it helps to develop an intuition for what the models are good or bad at answering. I can often copy-paste some logs, tracebacks, and images of the issue and demand a solution without a long manual prompt - but it takes some time to learn when it will likely work and when it's doomed to fail.


This is likely the biggest disconnect between people that enjoy using them and those that don’t. Recognizing when GPT-4’s about to output nonsense and stopping it in the first few sentences before it wastes your time is a skill that won’t develop until you stop using them as if they’re intended to be infallible.

At least for now, you have to treat them like cheap metal detectors and not heat-seeking missiles.


I had good results by writing my requirements like they were very high-level code. I told it specifically what to do, like formal specifications but with no math or logic. I usually defined the classes or data structures, too. I'd also tell it what libraries to use after getting their names from a previous, exploratory question.

From there, I'd ask it to make one modification at a time to the code. I'd be very precise. I'd give it only my definitions and just the function I wanted it to modify. It would screw things up, and I'd tell it so. It would fix its errors, break working code with hallucinations, and so on. You need to be able to spot these problems to know when to stop asking it about a given function.
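
Roughly, the scaffold I'd hand it looked something like this (the names here are invented for illustration, not from my actual project): pin down the data structures first, then ask for one function at a time against them.

    from dataclasses import dataclass

    @dataclass
    class Story:
        id: int
        title: str
        url: str
        score: int

    def render_story(story: Story) -> str:
        # Requested from the model as its own step, against the definitions above.
        return f"{story.title} ({story.score} points) - {story.url}"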

I was able to use ChatGPT 3.5 for most development. GPT-4 was better for work needing high creativity or lower hallucination rates. I wrote whole programs with it that were immensely useful, including a HN proxy for mobile. Eventually, ChatGPT got really dumb while outputting less and less code. It even told me to hire someone several times (?!). That GPT-3 Davinci helped a lot suggests it's their fine-tuning and system prompt causing problems (e.g. for safety).

The original methods I suggested should still work, though. You want to use a huge, code-optimized model for creativity or the hard stuff; the models for iteration, review, etc. can be cheaper.


It's useful for writing generic code/templates/boilerplate, then customizing the result by inserting your own code. For something you already know better, there isn't a magic prompt to express it, since the code is not generic enough for the LLM to understand as a prompt.

Its best use case is when you're not a domain expert and need to quickly use some unknown API/library inside your program, inserting code like "write a function for loading X with Y in language Z" when you barely have an idea what X, Y, and Z are. It's possible in theory to break everything down into "write me a function for N", but the quality of such functions is not worth the prompting in most situations, and you're better off asking it to explain how to write the X, Y, Z function step by step.
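
One concrete filling-in of that pattern, with X = a CSV file, Y = pandas, and Z = Python chosen purely for illustration (the function and column names are made up):

    import pandas as pd

    def load_events(path: str) -> pd.DataFrame:
        # Load the events CSV, parsing the timestamp column and dropping fully empty rows.
        df = pd.read_csv(path, parse_dates=["timestamp"])
        return df.dropna(how="all")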


This is exactly where I get the most value. For example, I now write a bunch of custom Chrome plugins. I'm not much of a JavaScript guy, but I can get by and validate the code. Usually what I want to do is simple, but requires making an API call and basic parsing. All stuff I could probably figure out myself in an hour or two. But instead I can get an initial version done in 2 minutes and spend 5-10 debugging. I probably wouldn't even try otherwise.


I felt the same every time I tried to get some help in a subject matter where my knowledge was quite advanced and/or when the subject matter was obscure/niche.

Whenever I tried it on something more common, and/or on stuff I had absolutely zero familiarity with, it did help me bootstrap quicker than reading some documentation would have.

That says a lot about how hard it is to write/find documentation that is tailored exactly to you and your needs.


You can use LLMs as documentation lookups for widely used libraries, eg: the Python stdlib. Just place a one line comment of what you want the AI to do and let it autocomplete the next line. It’s much better than previous documentation tools because it will interpolate your variables and match your function’s return type.
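
For example, something like this with the Python stdlib, where you write the comment and the model fills in the line below it (the completion shown here is the sort of thing it produces, not an actual model output):

    from datetime import datetime, timezone
    from pathlib import Path

    config_path = Path("config.toml")

    # get the modification time of config_path as a timezone-aware UTC datetime
    mtime = datetime.fromtimestamp(config_path.stat().st_mtime, tz=timezone.utc)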


Yeah, I’m not sure how often these tools will really help me with the things that end up destroying my time when programming, which are stuff like:

1) Shit’s broken. Officially supported thing kinda isn’t and should be regarded as alpha-quality, bugs in libraries, server responses not conforming to spec and I can’t change it, major programming language tooling and/or whatever CI we’re using is simply bad. About the only thing here I can think of that it might help with is generating tooling config files for the bog standard simplest use case, which can sometimes be weirdly hard to track down.

2) Our codebase is bad and trying to do things the “right” way will actually break it.


One thing you might find useful is using it to write unit tests for legacy code, which then make it easier to refactor the crappy codebase.
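
A sketch of what that can look like with pytest (legacy_pricing.apply_discount is a hypothetical legacy function; the expected values are captured from the code's current behavior, not from a spec):

    import pytest
    from legacy_pricing import apply_discount  # hypothetical legacy module

    @pytest.mark.parametrize(
        "price, code, expected",
        [
            (100.0, "SAVE10", 90.0),  # values recorded by running the existing code
            (100.0, "", 100.0),
            (0.0, "SAVE10", 0.0),
        ],
    )
    def test_apply_discount_characterization(price, code, expected):
        assert apply_discount(price, code) == pytest.approx(expected)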


No, you're likely just a better programmer than those relying on these tools.


That could be the case, and likely is in areas where they are strongest, just like the article's example of how it's not as useful for systems programming because he is an expert.

If you ask it about things you don't know, for which it was likely trained on high-quality data, and you get bad answers, you likely need to improve your writing/prompting.


Except it's most dangerous when used in areas that you're weakest because it will confidently spit out subtly wrong answers to everything. It is not a fact engine.


That only matters if you use it for things where failure actually causes damage.



