Yeah, that is hard to argue with because I just go to OpenRouter and play around with a lot of models before I decide which ones I like. But there's something special about running it locally in your basement
Yup, I have downloaded probably a terabyte in the last week, especially with the Step 3.5 model being released and Minimax quants. I wonder what my ISP thinks. I hope they don't cut me off. They gave me a fast lane, they better let me use it, lol
Is that with a WISP by chance? Or in a developing country? Or are there really wired providers with such low caps in the western world in this day and age?
AT&T once told me that if I didn't pay for their TV service, my home gigabit fiber would have a 1TB cap. They had an agreement with the apartment building, so I had no other choice of provider.
We said the same thing when 3D printing came out. Any sort of cool tech, we think everybody's going to do it. Most people are not capable of doing it. In college everybody was going to be an engineer, and then they dropped out after the first intro to physics or calculus class.
A bunch of my non-tech friends were vibe coding some tools with Replit and Lovable, and I looked at their stuff and yeah, it was neat, but it wasn't gonna go anywhere - and if it did go somewhere, they'd need to find somebody who actually knows what they're doing. To actually execute on these things takes a different kind of thinking. Unless we get to the stage where it's just like a magic genie, lol. Maybe then everybody's going to vibe their own software.
The difference is that 3D printing still requires someone, somewhere to do the mechanical design work. It democratises printing but it doesn't democratise invention. I can't use words to ask a 3D printer to make something. You can't really do that with claude code yet either. But every few months it gets better at this.
The question is: How good will claude get at turning open-ended problem statements into useful software? Right now a skilled human + computer combo is the most efficient way to write a lot of software. Left on its own, claude will make mistakes and suffer from a slow accumulation of bad architectural decisions. But, will that remain the case indefinitely? I'm not convinced.
This pattern has already played out in chess and Go. For a few years, a skilled Go player working in collaboration with a Go AI could outcompete both computers and humans. But that era didn't last. Now computers play Go at superhuman levels. Our skills are no longer required. I predict programming will follow the same trajectory.
There are already some companies using fine tuned AI models for "red team" infosec audits. Apparently they're already pretty good at finding a lot of creative bugs that humans miss. (And apparently they find an extraordinary number of security bugs in code written by AI models). It seems like a pretty obvious leap to imagine claude code implementing something similar before long. Then claude will be able to do security audits on its own output. Throw that in a reinforcement learning loop, and claude will probably become better at producing secure code than I am.
> This pattern has already played out in chess and Go. For a few years, a skilled Go player working in collaboration with a Go AI could outcompete both computers and humans. But that era didn't last. Now computers play Go at superhuman levels. Our skills are no longer required. I predict programming will follow the same trajectory.
Both of those are fixed, unchanging, closed, full information games. The real world is very much not that.
Though geeks absolutely like raving about go and especially chess.
This is like timing the stock market. Sure, share prices seem to go up over time, but we don't really know when they go up, down, and how long they stay at certain levels.
I don't buy the whole "LLMs will be magic in 6 months, look at how much they've progressed in the past 6 months". Maybe they will progress as fast, maybe they won't.
I’m not claiming I know the exact timing. I’m just seeing a trend line: GPT-3 to 3.5 to 4 to 5, Codex and now Claude. The models are getting better at programming much faster than I am. Their skill at programming doesn’t seem to be levelling out yet - at least not as far as I can see.
If this trend continues, the models will be better than me in less than a decade. Unless progress stops, but I don’t see any reason to think that would happen.
I’m not a fan of analogies, but here goes: Apple doesn’t manufacture iPhones. Yet they employ an enormous number of people to work on iPhone hardware they never physically make.
If you think AI can replace everyone at Apple, then I think you’re arguing for AGI/superintelligence, and that’s the end of capitalism. So far we don’t have that.
I think validation is already much easier using LLMs. Arguably this is one of the best use cases for coding LLMs right now: you can get claude to throw together a working demo of whatever wild idea you have without needing to write any code or write a spec. You don't even need to be a developer.
I don't know about you, but I'd much rather be shown a demo made by our end users (with claude) than get sent a 100 page spec. Especially since most specs - if you build to them - don't solve anyone's real problems.
Hm, how much real life experience do you have in delivering production SW systems?
Demo for the main flow is easy. The hard part is thinking through all the corner cases and their interactions, so your system robustly works in real world, interacting with the everyday chaos in a non-brittle fashion.
Well he said - anyone can (or will soon) vibe-program their own MS Word - there is no way he is a programmer, sorry. The complexity of these systems is crazy. Unless he meant an HTML text area with a "save" button - then sure, why not.
> The complexity of these systems is crazy. Unless he meant an HTML text area with a "save" button - then sure, why not.
What do you see as the difference between an LLM making an HTML text area with a save button and an LLM making MS Word? It just sounds like a scaling problem to me. We've been scaling computers since long before I was born. My first computer was a 386 with 4 MB of RAM. You needed a special add-in chip for floating point calculations. Now look at what we have.
As far as I can tell, the only difference between Opus 4.6 and some future AI model that could code up MS Word is a difference in scale. Are you betting that the entire computing industry (software and hardware) will be unable to scale LLMs past their current point? That seems like a really bad bet to me, especially seeing how far they've come in the last few years. Claude code can already do some quite complex tasks. I got it to write a simple web based email client for me yesterday. It took about an hour in total. It has some bugs, but the email client works.
We scaled hard drives. We scaled down silicon chips. We scaled digital camera sensors. And display resolutions. And networking bandwidth. We went from the palm pilot to the first iphone to modern phones. Do you really think we'll be unable to scale AI models?
>> industry being unable to scale LLMs past their current point
100% bet - no way any "AI" will be able to generate anything close to a complex piece of software like MS Word within a reasonable time and budget. Given infinite time and money - sure, anything is possible, just like a trillion monkeys randomly typing out "War and Peace" once in a trillion years in some remote galaxy. I don't even understand your confidence given how much guidance and hand-holding LLMs need at the moment to produce anything useful.
Yep. Claude today? No way can it achieve this. It can barely write a working C compiler.
I'm looking at the trend line. A few years ago it couldn't make a simple webpage. Now it can make a bad C compiler for thousands of dollars in tokens. What does it look like in another few years? Or another 2 decades?
Hard disagree, clients/users often don't know what the best/right solution is, simply because they don't know what's possible or they haven't seen any prior art.
I'd much rather have a conversation with them to discuss their current problems and workflow, then offer my ideas and solutions.
> The second part is going to be the hard part for complex software and systems.
Not going to be. Is. Actually, it always has been; it isn't that coding solutions wasn't hard before, but verification and validation cannot be made arbitrarily cheap. This is the new moat: if your solutions require QA (in the widest sense) that is time-consuming and expensive in dollar terms, that becomes the single barrier to entry.
There was a recent discussion about how having the AI write the validation for the code is a good approach. If you have formal proofs for your code, your QA needs go down.
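The idea can be sketched without full formal proofs: pair generated code with a cheap, machine-checkable oracle. In this hypothetical sketch, `merge_sorted` stands in for an AI-written function, and the validator checks it against a trusted reference (`sorted`) - all names here are illustrative assumptions, not any particular tool's API.

```python
# Stand-in for an AI-generated function we want to validate.
def merge_sorted(a, b):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def validate(fn, cases):
    """Cheap oracle: the result must equal sorting the concatenation."""
    for a, b in cases:
        got = fn(a, b)
        want = sorted(a + b)
        assert got == want, f"counterexample: {a} {b} -> {got}"
    return True

cases = [([], []), ([1, 3], [2]), ([5, 5], [1, 9]), ([0], [])]
print(validate(merge_sorted, cases))  # True
```

The validator is much simpler than the implementation, which is the whole point: even if the code is machine-written, the check stays cheap to audit.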
> I can't use words to ask a 3d printer to make something.
You can: the words are in the G-code language.
I mean: you learned foreign languages in school, so you are already used to formulating your request in a different language to make yourself understood. In this case, that language is G-code.
This is a strange take; no one is hand-writing the g-code for their 3d print. There are ways to model objects using code (eg openscad), but that still doesn't replace the actual mechanical design work involved in studying a problem and figuring out what sort of part is required to solve it.
I spent years writing a geometry and G-code generator in Grasshopper. I wasn’t generating every line of G-code by hand (my typical programs are about 500k lines), but I wrote the entire generator to go from curves to movements and extrusions.
I used opus to rewrite the entire thing, more cleanly, with fewer bugs and more features, in an afternoon. Admittedly it would have taken a lot longer without the domain expertise from years of staring at geometry and gcode side by side.
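For readers unfamiliar with what "curves to movements and extrusions" means, here is a deliberately minimal sketch of the idea - not the commenter's actual Grasshopper pipeline, and with a grossly simplified extrusion model. All names and constants are hypothetical.

```python
import math

def polyline_to_gcode(points, feed=1800, extrude_per_mm=0.05):
    """Turn a list of (x, y) points into G-code travel + extrusion moves."""
    lines = [f"G1 F{feed}"]                    # set feed rate
    x0, y0 = points[0]
    lines.append(f"G0 X{x0:.3f} Y{y0:.3f}")    # rapid move to start, no extrusion
    e = 0.0
    for x, y in points[1:]:
        # extrusion amount tracks path length travelled
        e += math.hypot(x - x0, y - y0) * extrude_per_mm
        lines.append(f"G1 X{x:.3f} Y{y:.3f} E{e:.4f}")
        x0, y0 = x, y
    return lines

gcode = polyline_to_gcode([(0, 0), (10, 0), (10, 10)])
print("\n".join(gcode))
```

A real generator layers this with travel moves, retraction, layer changes, and machine-specific dialect quirks, which is where the years of domain expertise go.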
You can basically hand it a design - one that might take a FE engineer anywhere from a day to a week to complete - and Codex/Claude will have it coded up in 30 seconds. It might need some tweaks, but it's 80% complete on that first try. I remember stumbling over graphing and charting libraries; it could take weeks to become familiar with all the different components and APIs, but now you can just tell Codex to use this data with this charting library and it'll make it. All you have to do is look at the code. Things have certainly changed.
I figure it takes me a week to turn the output of AI into acceptable code. Sure, there is a lot of code in 30 seconds, but it shouldn't pass code review (even the AI's own review).
For now. Claude is worse than we are at programming. But it's improving much faster than I am. Opus 4.6 is incredible compared to previous models.
How long before those lines cross? Intuitively it feels like we have about 2-3 years before claude is better at writing code than most - or all - humans.
I keep seeing this. The "for now" comments, and how much better it's getting with each model.
I don't see it in practice though.
The fundamental problem hasn't changed: these things are not reasoning. They aren't problem solving.
They're pattern matching. That gives the illusion of usefulness for coding when your problem is very similar to others, but falls apart as soon as you need any sort of depth or novelty.
I haven't seen any research or theories on how to address this fundamental limitation.
The pattern matching thing turns out to be very useful for many classes of problems, such as translating speech to a structured JSON format, or OCR, etc... but isn't particularly useful for reasoning problems like math or coding (non-trivial problems, of course).
I'm pretty excited about the applications for AI overall and its potential to reduce human drudgery across many fields, I just think generating code in response to prompts is a poor choice of an LLM application.
Have you actually tried the latest agentic coding models?
Yesterday I asked claude to implement a working web based email client from scratch in rust which can interact with a JMAP based mail server. It did. It took about 20 minutes. The first version had a few bugs - like it was polling for mail instead of streaming emails in. But after prompting it to fix some obvious bugs, I now have a working email client.
It's missing lots of important features - like, it doesn't render HTML emails correctly. And the UI looks incredibly basic. But it wrote the whole thing in 2.5k lines of rust from scratch and it works.
This wasn't possible at all a couple of years ago. A couple of years ago I couldn't get chatgpt to port a single source file from rust to typescript without it running out of context space and introducing subtle bugs in my code. And it was rubbish at rust - it would introduce borrow checker problems and then get stuck, trying and failing to get it to compile. Now claude can write a whole web based email client in rust from scratch, no worries. I did need to manually point out some bugs in the program - claude didn't test its email client on its own. There's room for improvement for sure. But the progress is shocking.
I don't know how anyone who's actually pushed these models can claim they haven't improved much. They're lightyears ahead of where they were a few years ago. Have you actually tried them?
Honestly, I really did do this for a while, mostly in response to comments like this, with some degree of excitement.
I've been disappointed every time.
I do use the LLMs for summarization and "a better google" and am constantly confronted with how inaccurate they are.
I haven't tried with code in the past couple months because to be completely honest, I just don't care.
I enjoy my craft, I enjoy puzzling and thinking through better ways of doing things, I like being confronted with a tedious task because it pushes me towards finding more optimal approaches.
I haven't seen any research that justifies the use of LLMs for code generation, even in the short term, and plenty that supports my concerns about mid to long term impact on quality and skills.
It is certainly already better than most humans, even better than most humans who occasionally code. The bar is already quite high, I'd say. You have to be decent in your niche to outcompete frontier LLM Agents in a meaningful way.
I'm only allowed 4.5 at work where I do this (likely to change soon but bureaucracy...). Still the resulting code is not at a level I expect.
I told my boss (not fully serious) we should ban anyone with less than 5 years' experience from using the AI, so they learn to write and recognize good code.
Honestly, you could just come up with a basic wireframe in any design software (MS Paint would work) plus a screenshot of a website with a design you like, tell it "apply the aesthetic from the website in this screenshot to the wireframe", and it would probably get 80% (probably more) of the way there. Something that would have taken me more than a day in the past.
I've been in web design since images were first introduced to browsers and modern designs for the majority of sites are more templated than ever. AI can already generate inspiration, prototypes and designs that go a long way to matching these, then juice them with transitions/animations or whatever else you might want.
The other day I tested an AI by giving it a folder of images, each named to describe the content/use/proportions (e.g., drone-overview-hero-landscape.jpg), told it the site it was redesigning, and it did a very serviceable job that would match at least a cheap designer. On the first run, in a few seconds and with a very basic prompt. Obviously with a different AI, it could understand the image contents and skip that step easily enough.
I have never once seen this actually work in a way that produces a product I would use. People keep claiming these one-shot (or nearly one-shot) successes, but in the meantime I ask it to modify a simple CSS rule and it rewrites the entire file, breaks the site, and then can't seem to figure out what it did wrong.
It's kind of telling that the number of apps on Apple's app store has been decreasing in recent years. Same thing on the Android store too. Where are the successful insta-apps? I really don't believe it's happening.
I've recently tried using all of the popular LLMs to generate DSP code in C++ and it's utterly terrible at it, to the point that it almost never even makes it through compilation and linking.
Can you show me the library of apps you've launched in the last few years? Surely you've made at least a few million in revenue with the ease with which you are able to launch products.
There's a really painful Dunning-Kruger process with LLMs, coupled with brutal confirmation bias that seems to have the industry and many intelligent developers totally hoodwinked.
I went through it too. I'm pretty embarrassed at the AI slop I dumped on my team, thinking the whole time how amazingly productive I was being.
I'm back to writing code by hand now. Of course I use tools to accelerate development, but it's classic stuff like macros and good code completion.
Sure, a LLM can vomit up a form faster than I can type (well, sometimes, the devil is always the details), but it completely falls apart when trying to do something the least bit interesting or novel.
Absolutely. I also think there's a huge number of wannabe developers who don't have the patience to actually learn development. Those people desperately want this AI development dream to be true so they pretend and convince themselves that it is. They talk about how well it works on internet forums, but you ask for the product and it's crickets. It's all wishful thinking.
The number of non-technical people in my orbit that could successfully pull up Claude code and one shot a basic todo app is zero. They couldn’t do it before and won’t be able to now.
You go to chatGPT and say "produce a detailed prompt that will create a functioning todo app" and then put that output into Claude Code and you now have a TODO app.
This is still a stumbling block for a lot of people. Plenty of people could've found an answer to a problem they had if they had just googled it, but they never did. Or they did, but they googled something weird and gave up. AI use is absolutely going to be similar to that.
Maybe I’m biased working in insurance software, but I don’t get the feeling much programming happens where the code can be completely stochastically generated, never have its code reviewed, and that will be okay with users/customers/governments/etc.
Even if all sandboxing is done right, programs will be depended on to store data correctly and to show correct outputs.
Insurance is complicated, not frequently discussed online, and all code depends on a ton of domain knowledge and proprietary information.
I'm in a similar domain; the AI is like a very energetic intern. For me to get a good result requires a prompt so clear and detailed that I could probably turn it into code myself with little extra effort. Even then, after a little back and forth it loses the plot and starts producing gibberish.
But in simpler domains or ones with lots of examples online (for instance, I had an image recognition problem that looked a lot like a typical machine learning contest) it really can rattle stuff off in seconds that would take weeks/months for a mid level engineer to do and often be higher quality.
You don't need to draw the line between tech experts and the tech-naive. Plenty of people have the capability but not the time or discipline to execute such a thing by hand.
Not really. What the FE engineer will produce in a week will be vastly different from what the AI will produce. That's like saying restaurants are dead because it takes a minute to heat up a microwave meal.
It does make the lowest common denominator easier to reach though. By which I mean your local takeaway shop can have a professional looking website for next to nothing, where before they just wouldn't have had one at all.
I think exceptional work, AI tools or not, still takes exceptional people with experience and skill. But I do feel like a certain level of access to technology has been unlocked for people smart enough, but without the time or tools to dive into the real industry's tools (figma, code, data tools etc).
The local takeaway shop could have had a professional looking website for years with Wix, Squarespace, etc. There are restaurant specific solutions as well. Any of these would be better than vibe coding for a non-tech person. No-code has existed for years and there hasn't been a flood of bespoke software coming from end users. I find it hard to believe that vibe-coding is easier or more intuitive than GUI tooling designed for non-experts...
I think the idea that LLM's will usher in some new era where everyone and their mom are building software is a fantasy.
I more or less agree, specifically on the angle that no-code has existed for years, yet non-technical people still aren't executing on technical products. But I don't think vibe-coding is where we'll see this happening; it will be in chat interfaces or GUIs. As the "scaffolding" or "harnesses" mature, someone will be able to just type what they want and get a deployed product within the day after some back and forth.
I am usually a bit of an AI skeptic but I can already see that this is within the realm of possibility, even if models stopped improving today. I think we underestimate how technical things like WIX or Squarespace are, to a non-technical person, but many are skilled business people who could probably work with an LLM agent to get a simple product together.
People keep saying code was never the real skill of an engineer, but rather solving business logic issues and codifying them. Well people running a business can probably do that too, and it would be interesting to see them work with an LLM to produce a product.
> I think we underestimate how technical things like WIX or Squarespace are, to a non-technical person, but many are skilled business people who could probably work with an LLM agent to get a simple product together.
In the same vein, I think you underestimate how much "hidden" technical knowledge must be there to actually build software that works most of the time (not asking for a bug-free program). To design such a program with current LLM coding agents you need to be at the very least a power user - probably a very advanced one - both in the domain of the program you want to build and in the domain of general software.
Maybe things will improve with LLMs and agents, and "make it work" will be enough for the agent to create tests, exercise the program extensively, find and squash bugs, and do all the other extra work needed - who knows. But we are definitely not there today.
Yeah I've thought for a while that the ideal interface for non-tech users would be these no-code tools but with an AI interface. Kinda dumb to generate code that they can't make sense of, with no guard rails etc.
Wouldn’t we have more restaurants if there were no microwave ovens? But the microwave oven also gave rise to the frozen food industry. Overall, more industrialization.
It's not our current location, but our trajectory, that is scary.
The walls and plateaus consistently predicted in these "comments of reassurance" have not materialized. If this pace holds for another year and a half, things are going to be very different. And the pipeline is absolutely overflowing with specialized compute coming online by the gigawatt for the foreseeable future.
So far the most accurate predictions in the AI space have been from the most optimistic forecasters.
There is a distribution of optimism, some people in 2023 were predicting AGI by 2025.
No such thing as trajectory when it comes to mass behavior, because it can turn on a dime if people find reason to. That's what makes civilization so fun.
I don't know if they will ever get there, but LLMs are a long ways away from having decent creative taste.
Which means they are just another tool in the artist's toolbox, not a tool that will replace the artist. Same as every other tool before it: amazing in capable hands, boring in the hands of the average person.
Also, if you are a human who has taste, it's very difficult to get an AI to create exactly what you want. You can nudge it, and little by little get closer to what you're imagining, but you're never really in control.
This matters less for text (including code) because you can always directly edit what the AI outputs. I think it's a lot harder for video.
> Also, if you are a human who has taste, it's very difficult to get an AI to create exactly what you want.
I wonder if it would be possible to fine-tune an AI model on my own code. I've probably got about 100k lines of code on github. If I fed all that code into a model, it would probably get much better at programming like me. Including matching my commenting style and all of my little obsessions.
Talking about a "taste gap" sounds good. But LLMs seem like they'd be spectacularly good at learning to mimic someone's "taste" through fine-tuning.
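The data-prep side of that idea is fairly mechanical. A hedged sketch: walk a source tree and emit one JSONL training record per file. The prompt/completion split and field names here are assumptions for illustration - real fine-tuning APIs each have their own required format.

```python
import json
import pathlib
import tempfile

def build_dataset(src_dir, out_path):
    """Write one JSONL training record per .py file found under src_dir."""
    records = []
    for path in sorted(pathlib.Path(src_dir).rglob("*.py")):
        code = path.read_text()
        # Hypothetical prompt/completion framing; adjust to the target API.
        records.append({"prompt": f"Write {path.name} in my style:",
                        "completion": code})
    with open(out_path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
    return len(records)

# Tiny self-contained demo using a temporary "repo" with one file.
with tempfile.TemporaryDirectory() as d:
    pathlib.Path(d, "util.py").write_text("def add(a, b):\n    return a + b\n")
    n = build_dataset(d, pathlib.Path(d) / "train.jsonl")
    print(n)  # 1
```

Whether 100k lines is enough signal to capture someone's style is an open question, but the pipeline itself is not the hard part.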
> LLMs always seem to be inarguably worse than the “original”.
True. But quantity has a quality of its own.
I'm personally delighted at the idea of outsourcing all the boring cookie cutter programming work to an AI. Things like writing CSS, plumbing between my database, backend server and web UI. Writing and maintaining tests. All the stuff that I've done 100 times before and I just hate doing by hand over and over again.
There's lots of areas where it doesn't really matter that the code it produces isn't beautifully terse and performant. Sometimes you just need to get something working. AIs can do weeks of work in an afternoon. The quality isn't as good. But for some tasks, that's an excellent trade.
Taste is both driven by tools and independent of it.
It's driven by it in the sense that better tools and the democratization of them changes people's baseline expectations.
It's independent of it in that doing the baseline will not stand out. Jurassic Park's VFX stood out in 1993. They wouldn't have in 2003. They largely would've looked amateurish and derivative in 2013 (though many aspects of shot framing/tracking and such held up, the effects themselves are noticeably primitive).
Art will survive AI tools for that reason.
But commerce and "productivity" could be quite different because those are rarely about taste.
100% correct. Taste is the correct term - I avoid using it because I'm not sure many people here actually get what it truly means.
How can I proclaim what I said in the comment above? Because I've spent the past week producing something very high quality with Grok. Has it been easy? Hell no. Could anyone just pick up and do what I've done? Hell no. It requires things like patience, artistry, taste, etc.
The current tech is soulless in most people's hands and should remain used in a narrow range in this context. The last thing I want to see is low quality slop infesting the web. But hey, that is not what the model producers want - they want to maximize tokens.
The job of a coder is far from obsolete, as you're saying. It's definitely changed to almost entirely code review though.
With Opus 4.6 I'm seeing that it copies my code style, which makes code review incredibly easy, too.
At this point, I've come around to seeing that writing code is really just for education so that you can learn the gotchas of architecture and support. And maybe just to set up the beginnings of an app, so that the LLM can mimic something that makes sense to you, for easy reading.
And all that does mean fewer jobs, to me. Two guys instead of six or more.
All that said, there's still plenty to do in infrastructure and distributed systems, optimizations, network engineering, etc. For now, anyway.
This matches what I see with all my non-tech and even tech co-workers. Honestly, the value generation leverage I have now is 10x or more than it was before, compared to other people.
HN is an echo chamber of a very small subgroup. The majority of people can’t utilize it and need to have this further dumbed down and specialized.
That’s why marketing and conversion rate optimization work; it’s not all about the technical stuff, it’s about knowing what people need.
For VC-funded companies the game was often not much different; software was just part of the expenses - sometimes a large part, sometimes smaller. Eventually you could just buy the software you needed, but that didn’t guarantee success. There were dramatic failures and outstanding successes, and I wish it weren’t so, but most of the time the codebase was not the deciding factor. (Sometimes it was - Airtable, Twitch, etc., bless the engineers - but I don’t believe AI would have solved those problems.)
Tbh, depending on the field, even this crowd will need further dumbing down. Just look at the blog illustration slop - 99% of it is just terrible, even when the text is actually valuable. That's because people's judgement of value, outside their field of expertise, is typically really bad. A trained cook can look at some chatgpt recipe and go "this is stupid and it will taste horrible", whereas the average HN techbro/nerd (like yours truly) will think it's great -- until they actually taste it, that is.
Agreed. This place amazes me with how overly confident some people feel stepping outside of their domains. The mistakes I see here in subject areas like corporate finance, valuation, etc. are hilarious. Truly hilarious.
The example is a bad one imo, because chatgpt can be really great for cooking if you use it correctly. Like in coding, you already need some skill and shouldn't believe everything it says.
> whereas the average HN techbro/nerd (like yours truly) will think it's great -- until they actually taste it, that is.
This is the schtick though, most people wouldn't even be able to tell when they taste it. This is typically how it works, the average person simply lacks the knowledge so they don't even know what is possible.
> To actually execute on these things takes a different kind of thinking
Agreed. Honestly, and I hate to use the tired phrase, but some people are literally just built different. Those who'd be entrepreneurs would have been so in any time period with any technology.
1) I don’t disagree with the spirit of your argument
2) 3D printing has higher startup costs than code (you need to buy the damn printer)
3) YOU are making a distinction when it comes to vibe coding from non-tech people. The way these tools are being sold, the way investments are being made, is based on non-domain people developing domain specific taste.
This last part “reasonable” argument ends up serving as a bait and switch, shielding these investments. I might be wrong, but your comment doesn’t indicate that you believe the hype.
100%, it's like with Suno - everyone can create a good quality song in basically 2-3 minutes (while vibe programming can do... nothing in a few minutes) - and how many new great bands and musicians did we get?
You might not get great musicians from using Suno, but an ad company might decide to just generate a jingle rather than hire a musician to do it. Same with images/videos. The result might not be great, but the company does it in 3 minutes at close-to-zero cost. Similarly, you can vibe-code a website for a restaurant (one that does very basic things like display a menu, opening hours, maybe a google map location). It might not be the best, but you would be surprised at the number of people willing to sacrifice quality for cheap prices.
Sure, but it's nothing new - CMSes, website builders, no-code platforms, code generators - AI on top of a website builder can really help and it does help. But not building operating systems or Word from scratch for every person or company - this is just nuts.
I heard a stat on the Economist podcast the other day about AI music production. They said Spotify estimates 40% of songs on their platform are now AI generated. The AI generated songs make up 0.5% of total listening time.
Low quality music made in bulk seems much less useful than low quality code made in bulk.
Yeah but did he actually try to use the repurposed "verbose" mode?
I did, and it's way more verbose than I need, while the regular mode is now basically mute mode. On top of that, it recently started running a lot of stuff in the background, which causes some crazy flicker, and Claude has become stubbornly autonomous: it runs things in fly-by mode, asks me a question, waits a couple of seconds, and then proceeds with a default choice while I'm still reading and considering the options. I'm left mashing Esc, and even that sometimes doesn't stop it. The last couple of updates have really annoyed me, tbh.
Yeah, I agree with you. I'm so tired of people complaining about AI-generated text without focusing on the content. Just don't read it if you don't like it.
It's on the level of people complaining that a website isn't readable for them or that some CSS renders wrong or whatever. How does that add to the discussion?
The problem is that there’s infinite “content” out there.
The amount of work the author puts in is correlated with the value of the piece (insight/novelty/etc). AI-written text is a signal that there's less effort, and therefore less value, there.
It’s not a perfect correlation and there are lots of exceptions like foreign language speakers, but it is a signal.
This is really just a meme. People don't know how to use these tools.
Here is the response from GPT-5.2, using my default custom instructions, in the Mac desktop app.
OBJECTIVE: Decide whether to drive or walk to a car wash ~50 meters from home, given typical constraints (car must be present for wash).
APPROACH: Use common car-wash workflows + short-distance driving considerations (warm engine, time, parking/queue). No web sources needed (stable, local decision).
OUTPUT:
• Default choice: Drive.
A car wash generally requires the car on-site, and 50 meters of driving is trivial.
• When walking makes sense:
• You’re going to drop the car off (hand-wash/detail) and pick it up later.
• You want to avoid moving the car now (ice, tight parking, you’ve been drinking, etc.) and someone else will handle the car.
• Micro-guidance:
• If it’s an automatic wash you stay in, drive.
• If it’s detail/hand wash with a wait, drive there, then walk home (or just wait).
CONFIDENCE: 92/100
Assumptions: It’s a normal car wash that requires your car to be there; conditions are safe to drive 50 meters.
Your OBJECTIVE line explicitly states that the car has to be present for the wash. Quite a difference from the original phrasing, where the model has to figure that out itself.
I did get it, and in my view my point still stands: if I need special prompting to ask such a simple question, then what are we doing here? An LLM should be able to spot a simple contradiction in the question the same way we humans do.
Not really a special prompt - it's basically my standing custom instruction to ChatGPT, and its purpose is to disambiguate my ramblings. It's pretty effective. I always use speech-to-text, so my input is messy, and this cleanup really helps.
For the last three or four months, anytime I have Claude write a comment on an issue, it adds a session ID, the file path, and the VM it is on. That way, whenever something comes up later, we search through the issues and can retrace the session that produced the work - it's all traceable.

In general I just work through Gitea issues, and sometimes beads. I couldn't stand having all those MD files in my repo - I was drowning in documentation - so keeping it in issues has been working really nicely, and agents know how to work with issues. I did have Claude write a Gitea utility, and the agents are pretty happy using (and abusing) it. Anytime I see them call it in a way that generates errors, I have them improve the utility, and by this point it pretty much always works. It's been really nice.
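To make the idea concrete, here's a minimal sketch of what such a traceability footer could look like. This is not the commenter's actual utility - the function name, the session ID, and the transcript path are all made-up illustrations; only `socket.gethostname()` is real stdlib:

```python
import socket


def traceability_footer(session_id: str, transcript_path: str) -> str:
    """Build a footer to append to an issue comment so the agent
    session that produced the work can be retraced later."""
    return (
        "\n\n---\n"
        f"session: {session_id}\n"
        f"transcript: {transcript_path}\n"
        f"vm: {socket.gethostname()}"  # which machine the agent ran on
    )


# Hypothetical usage: append the footer to whatever the agent wrote,
# then post the combined text via the Gitea issue-comment API.
comment = "Fixed the retry logic in the upload worker." + traceability_footer(
    "9f3c2a1e", "~/.claude/projects/demo/9f3c2a1e.jsonl"
)
print(comment)
```

Searching the issue tracker for a session ID then immediately gives you the host and transcript needed to replay how a change was made.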