smt88's comments | Hacker News

Biden appointed Lina Khan. "Both parties are the same" is lazy and unhelpful.

It's wiser to judge the parties by actions rather than rhetoric. Both parties have shown a complete absence of meaningful action on the issue, even though each has regularly cycled through full control of government, with majorities in the House and Senate while also holding the presidency.

Good reviews are almost 100% fake or planted

I expect masked ICE agents to be deployed to polls in purple and blue states to "prevent non-citizens from voting" (i.e. to scare minorities away from polls)

Bet. Let's see if we can get this up on Polymarket and bet on it.

You already lost your own bet.

"A pair of armed and masked men in tactical gear stood guard at ballot drop boxes in Mesa, Ariz., on Oct. 21 as people began early voting for the 2022 midterm elections."

They might have been "off-duty," but this was during Biden's administration. They're immensely more emboldened now, and local law enforcement will absolutely not enforce any laws restricting this.

Source: https://www.cnbc.com/2022/11/06/election-officials-facing-ar...


So the goalposts moved from ICE or federal agents being stationed at polling stations to any individual at all?

> deployed to polls in purple and blue states to "prevent non-citizens from voting" (i.e. to scare minorities away from polls)

MOST states (purple, blue, red) have mail-in voting. https://en.wikipedia.org/wiki/Postal_voting_in_the_United_St...


They're working on that.

Challenging the rules: https://www.pbs.org/newshour/politics/supreme-court-revives-...

Changing the rules at USPS: https://www.pbs.org/newshour/nation/how-this-new-mail-rule-c...

And I'd fully expect some fuckery via executive orders closer to the election, and SCOTUS to use the emergency docket to let them "temporarily" be enforced.


It is being restricted. My red state has gone from accepting mail-in ballots postmarked by Election Day to requiring them to arrive by Election Day. When the postmaster general is a Trump appointee, and the mail has slowed down over the last few years, it makes me wonder if this is deliberate.

Correct, which the administration is also trying to remove.

For now. The tyrant controls the post office.

They're targeting that too. e.g. recent change to postmark dates.

If trying to overthrow a democratic govt doesn't deserve the death penalty, nothing does (which is a reasonable position to have)

> In software, we, the developers, have increasingly been a bottleneck. The world needs WAY more software than we can economically provide, and at long last a technology has arrived that will help route around us for the benefit of humanity.

Everything you wrote here is directly contradicted by casual observation of reality.

Developers aren't a bottleneck. If they were, we wouldn't be in a historic period of layoffs. And before you say that AI is causing the layoffs -- it's not. They started before AI was widely used for production, and they're also being done at companies that aren't heavily using AI anyway. They're a result of massive over-hiring during periods of low interest rates.

Beyond that, who is demanding software developers? The things that make our lives better (like digital forms at the doctor's office) aren't complex software.

The majority of the demand is from enshittification companies making our lives worse with ads and surveillance. Whoever is demanding developers, it certainly isn't individual humans.


Yes, the layoffs are a market correction initiated by non-AI factors, such as the end of the ZIRP era.

The world is chock-full of important, society-scale problems that have been out of reach because the economics have made them costly to work on and therefore risky to invest in. Lowering the cost of software development de-risks investment and increases the total pool of profitable (or potentially profitable) projects.

The companies that will work on those new problems are being conceived or born right now, and [collectively] they'll need lots of AI-native software devs.


> important, society-scale problems that have been out of reach because the economics have made them costly to work on and therefore risky to invest in

What are examples of these projects and how will AI put them back into reach of investment?

I haven't seen anything in this category so far.


There's no chance LLMs will be an engineering team replacement. The hallucination problem is unsolvable and catastrophic in some edge cases. Any company using such a team would be uninsurable and sued into oblivion.

Writing software is actually one of the domains where hallucinations are easiest to fix: you can easily check whether it builds and passes tests.

If you want to go further, you can even require the LLM to produce a machine checkable proof that the software is correct. That's beyond the state of the art at the moment, but it's far from 'unsolvable'.

If you hallucinate such a proof, it'll just not work. Feed back the error message from the proof checker to your coding assistant, and the hallucination goes away / isn't a problem.
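A minimal sketch of the loop I have in mind (my own toy example: ask_model is a stand-in for whatever assistant API you use, and I'm assuming cargo build as the checker):

  use std::process::Command;

  // Stand-in for an LLM call; a real setup would hit an API here.
  fn ask_model(prompt: &str) -> String {
      format!("// (revised code for prompt: {prompt})")
  }

  fn main() {
      for attempt in 1..=3 {
          // Run the checker: a compiler, test suite, or proof checker.
          let output = Command::new("cargo")
              .args(["build", "--quiet"])
              .output()
              .expect("failed to run cargo");

          if output.status.success() {
              println!("attempt {attempt}: build ok, move on to tests/proofs");
              return;
          }

          // A hallucinated API simply won't compile; the error text
          // becomes the next prompt, which closes the loop.
          let errors = String::from_utf8_lossy(&output.stderr);
          let _revision = ask_model(&format!("Fix this build error:\n{errors}"));
          // (apply the revision to the working tree before the next pass)
      }
  }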


  > you can easily check whether it builds and passes tests.
This link was on HN recently: https://spectrum.ieee.org/ai-coding-degrades

  "...recently released LLMs, such as GPT-5, have a much more insidious method of failure. They often generate code that fails to perform as intended, but which on the surface seems to run successfully, avoiding syntax errors or obvious crashes. It does this by removing safety checks, or by creating fake output that matches the desired format, or through a variety of other techniques to avoid crashing during execution."
The trend for LLM-generated code is to build and pass tests but not deliver the functionality needed.
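To make that failure mode concrete, here is a toy sketch of my own (not from the article): code that compiles, runs, and keeps a weak test green while computing nothing at all.

  // Toy illustration: output in the right shape, with fabricated values
  // instead of a real computation.
  #[derive(Debug)]
  struct Summary {
      total: f64,
      count: usize,
  }

  fn summarize(_values: &[f64]) -> Summary {
      // Looks done, compiles, never crashes -- and is completely wrong.
      Summary { total: 0.0, count: 0 }
  }

  #[cfg(test)]
  mod tests {
      use super::*;

      #[test]
      fn returns_a_summary() {
          // A weak test only checks the shape, so this stays green.
          let s = summarize(&[1.0, 2.0, 3.0]);
          assert!(s.total.is_finite());
      }
  }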

Also, please consider how SQLite is tested: https://sqlite.org/testing.html

The ratio of test code to actual code is a mere 590 to 1 (590 LOC of tests per LOC of actual code); it used to be more than 1100.

Here are the notes on the current release: https://sqlite.org/releaselog/3_51_2.html

Notice the fixes there. Despite being one of the most tested pieces of software in the world, if not the most tested, it still contains errors.

  > If you want to go further, you can even require the LLM to produce a machine checkable proof that the software is correct.
Haha. How do you reconcile a proof with actual code?

I've recently seen Opus, after struggling for a bit, implement an API by having it return JSON that includes instructions for a human to manually accomplish the task I gave it.

It proudly declared the task done.


I believe you have used the Albanian [1] version of Opus.

[1] https://www.reddit.com/r/ProgrammerHumor/comments/1lw2xr6/hu...


Recent models have started to "fix" HTML/CSS issues with ugly hacks like !important. The result looks like it works, but the tech debt is considerable.

Still, it's just a temporary hindrance. Nothing a decent system prompt can't take care of until the models evolve.


> Haha. How do you reconcile a proof with actual code?

You can either prove your Rust code correct, or you can use a proof system that allows you to extract executable code from the proofs. Both approaches have been done in practice.

Or what do you mean?


Rust code can have arbitrary I/O effects in any part of it. This precludes using only Rust's type system to make sure the code does what the spec said.

The most successful formally proven project I know of, seL4 [1], did not extract executable code from the proof. They created a prototype in Haskell, mapped it (by hand) to Isabelle, I believe, to obtain a formal proof, and then recreated the code in C, again manually.

[1] https://sel4.systems/

Not many formal proof systems can extract executable C source.


> Haha. How do you reconcile a proof with actual code?

Languages like Lean allow you to write programs and proofs under the same umbrella.
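A minimal sketch of my own (Lean 4): the executable function and a theorem about it live in the same file and are checked by the same tool.

  -- Ordinary executable code.
  def double (n : Nat) : Nat := n + n

  -- A (trivial) proof about it, checked by Lean's kernel.
  theorem double_eq (n : Nat) : double n = n + n := rfl

  #eval double 21  -- runs like a normal program: 42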


As if Lean does not allow you to circumvent its proof system (the "sorry" keyword).

Also, consider adding code to a bigger system written in C++. How would you use Lean to prove the correctness of your code as part of that bigger system?


I mean, it's somewhat moot, as even the formal hypothesis ("what is this proof proving") can be more complex than the code that implements it in nontrivial cases. So verifying that the proof is saying the thing that you actually want it to prove can be near impossible for non-experts, and that's just the hypothesis; I'm assuming the proof itself is fully AI-generated and not reviewed beyond running it through the checker.

And at least in backend engineering, for anything beyond low-level algorithms you almost always want some workarounds: for your customer service department, for engineering during incident response, for your VIP clients, etc. If you're relying on formal proof of some functionality, you've got to create all those allowances in your proof algorithm (and hypothesis) too. And additionally nobody has really come up with a platform for distributed proofs, durable proof keys (kinda), or how to deal with "proven" functionality changes over time.


You focused on writing software, but the real problem is the spec used to produce the software. LLMs will happily hallucinate reasonable but unintended specs, and the checker won't save you, because the software created is, after all, correct w.r.t. the spec.

Also, tests and proof checkers only catch what they're asked to check; if the LLM misunderstands intent but produces a consistent implementation+proof, everything "passes" and is still wrong.


This is why every one of my coding agent sessions starts with "... write a detailed spec in spec.md and wait for me to approve it". Then I review the spec, then I tell it "implement with red/green TDD".

The premise was that the AI solution would replace the engineering team, so who exactly is writing/reviewing this detailed spec?

Well, perhaps it'll only shrink the engineering team by 95% then.

Why would you shrink the team rather than become 20x more productive as a whole?

Users don't want changes that rapidly. There aren't enough people on the product team to design 20x more features. 20x more features means 400x more cross-team coordination. There's only positive marginal ROI for maybe 1.5-2x, even if development is very cheap.

Either way can work. It depends on what the rest of the business needs.

The premise is in progress. We are only at the beginning of the fourth year of this hype phase, and we haven't even reached AGI yet. It's obviously not perfect, and maybe it never will be, but we are not at the point yet where we can conclude which future is true. The singularity hasn't happened yet, so we are still moving at (LLM-enhanced) human speed at the moment, meaning things need time.

That's a bad premise.

Maybe, but you're responding to a thread about why AI might or might not be able to replace an entire engineering team:

> Ultimately I think over the next two years or so, Anthropic and OpenAI will evolve their product from "coding assistant" to "engineering team replacement", which will include standard tools and frameworks that they each specialize in (vendor lock in, perhaps), but also ways to plug in other tech as well.

This is the context of how this thread started, and this is the context in which DrammBA was saying that the spec problem is very hard to fix [without an engineering team].


Might be good to define the (legacy) engineering team. Instead of thinking 0/1 (ugh, almost nothing happens this way), the traditional engineering team may be replaced by something different. A team mostly of product, spec writers, and testers. IDK.

The job of AI is to do what we tell it to do. It can't "create a spec" on its own. If it did and then implemented that spec, it wouldn't accomplish what we want it to accomplish. Therefore we, the humans, must come up with that spec. And when you talk about a software application, the totality of its spec, written out, can be very complex and very complicated. Writing, understanding, evolving, and fixing such a spec takes engineers, or what used to be called "systems analysts".

To repeat: specifying what a "system" we want to create does is a highly complicated task, which can only be done by human engineers who understand the requirements for the system, how parts of those requirements/specs interact with other parts of the spec, and what the consequences of one part of the spec are for other parts of it. We must not write "impossible specs" like "draw me a round square." Maybe the AI can check whether the spec is impossible or not, but I'm not so sure of that.

So I expect that software engineers will still be in high demand, but they will be much more productive with AI than without it. This means there will be much more software, because it will be cheaper to produce. And the quality of the software will be higher in terms of doing what humans need it to do. Usability. Correctness. Evolvability. In a sense, the natural-language spec we give the AI is really something written in a very high-level programming language: the language of engineers.

BTW, as I write this I realize there is no spell-checker integrated into Hacker News. (Or is there?) Why? Because it takes developers to specify and implement such a system, which must be integrated into the current HN implementation. If AI can do that for HN, it can get done, because it will be cheap enough to do -- if HN can spell out exactly what kind of system it wants. So we do need more software, better software, cheaper software, and AI will help us do that.

A second factor is that we don't really know if a spec is "correct" until we test the implemented system with real users. At that point we typically find many problems with the spec. So somebody must fix the problems with the spec, evolve it, and rinse and repeat the testing with real users -- the developers who understand the current spec and why it is not good enough.

AI can surely write my personal scripts for me. But writing a spec for a system to be used by thousands of humans still takes a lot of (human) work. The spec must work for ALL users. That makes it complicated and difficult to get right.


Same, and similarly something like a "create a holistic design with all existing functionality you see in tests and docs plus new feature X, from scratch", then "compare that to the existing implementation and identify opportunities for improvement, ranked by impact, and a plan to implement them" when the code starts getting too branchy. (aka "first make the change easy, then make the easy change"). Just prompting "clean this code up" rarely gets beyond dumb mechanical changes.

Given so much of the work of managing these systems has become so rote now, my only conclusion is that all that's left (before getting to 95+% engineer replacement) is an "agent engineering" problem, not an AI research problem.


In order to prove safety, you need a formal model of the system and formally defined safety properties that are both meaningful and understandable by humans. These do not exist for enterprise systems.

An exhaustive formal spec doesn't exist. But you can conservatively prove some properties. E.g., program termination is far from sufficient for your program to do what you want, but it's probably necessary.

(Termination in the wider sense: for example an event loop has to be able to finish each run through the loop in finite time.)

You can see, e.g., Rust's or Haskell's type system as another lightweight formal model that lets you make and prove some simple statements, without having a full formal spec of the whole desired behaviour of the system.
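For a tiny example of what I mean (my own sketch, not tied to any particular system): once an invariant is encoded in a Rust type, the compiler checks it at every call site, with no separate spec needed.

  use std::num::NonZeroU32;

  // The invariant "items is never zero" lives in the type, so the
  // compiler enforces it at every call site.
  fn per_item_cost(total_cents: u32, items: NonZeroU32) -> u32 {
      // Division by zero is impossible here by construction.
      total_cents / items.get()
  }

  fn main() {
      // Constructing a NonZeroU32 forces the caller to handle the zero case.
      let items = NonZeroU32::new(4).expect("item count must be non-zero");
      println!("{} cents each", per_item_cost(1000, items));
  }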


Yeah, but with all respect, that is a totally uninteresting property in an enterprise software system where almost no software bugs actually manifest as non-termination.

The critical bugs here are related to security (DDoS attacks, authorization and authentication, data exfiltration, etc), concurrency, performance, data corruption, transactionality and so forth. Most enterprise systems are distributed or at least concurrent systems which depend on several components like databases, distributed lock managers, transaction managers, and so forth, where developing a proper formal spec is a monumental task and possibly impossible to do in a meaningful way because these systems were not initially developed with formal verification in mind. The formal spec, if faithful, will have to be huge to capture all the weird edge cases.

Even if you had all that, you need to actually formulate important properties of your application in a formal language. I have no idea how to even begin doing that for the vast majority of the work I do.

Proving the correctness of linear programs using techniques such as Hoare logic is hard enough already for anything but small algorithms. Proving the correctness of concurrent programs operating on complex data structures requires much more advanced techniques, setting up complicated logical relations and dealing with things like separation logic. It's an entirely different beast, and I honestly do not see LLMs as a panacea that will suddenly make these things scale for anything remotely close in size to a modern enterprise system.


Oh, there are lots more simple properties you can state and prove that capture a lot more, even in the challenging enterprise setting.

I just gave the simplest example I could think of.

And termination is actually a much stronger and more useful property than you make it out to be, especially in the face of locks and concurrency.


That is true and very useful for software development, but it doesn't help if the goal is to remove human programmers from the loop entirely. If I'm a PM who is trying to get a program to, say, catalogue books according to the Dewey Decimal system for a library, a proof that the program terminates is not going to help that much when the program is mis-categorizing some books.

Is removing the human in the loop really the goal, or is the goal right now to make the human a lot more productive? Because...those are both very different things.

I don't know what the goal for OpenAI or Anthropic really is.

But the context of this thread is the idea, raised by the user daxfohl, that these companies will, in the next few years, launch an "engineering team replacement" program; and then the user eru claimed that this is indeed more doable in programming than in other domains, because you can have specs and tests for programs in a way that you can't for, say, an animated movie.


OK, so you successfully argued that replacing the entire engineering team is hard. But you can perhaps still shrink it by 99%. To the point where a sole founder can do the remaining tech role part time.

I have no idea what will happen in a few years, maybe LLM tech will hit a wall and humans will continue to be needed in the loop. But today humans are definitely needed in the loop in some way.

> Writing software is actually one of the domains where hallucinations are easiest to fix: you can easily check whether it builds and passes tests.

What tests? You can't trust the tests that the LLM writes, and if you can write detailed tests yourself you might as well write the damn software.


Use multiple competing LLMs, generative-adversarial-network style.

Cool. That sure sounds nice and simple. What do you do when the multiple LLMs disagree on what the correct tests are? Do you sit down and compare 5 different diffs to see which have the tests you actually want? That sure sounds like a task you would need an actual programmer for.

At some point a human has to actually use their brain to decide what the actual goals of a given task are. That person needs to be a domain expert to draw the lines correctly. There's no shortcut around that, and throwing more stochastic parrots at it doesn't help.


Just because you can't (yet) remove the human entirely from the loop doesn't mean that economising on the use of the human's time is impossible.

For comparison, have a look at compilers: nowadays approximately no one writes their software by hand; we write a 'prompt' in something like Rust or C and ask another computer program to create the actual software.

We still need the human in the loop here, but it takes much less human time than creating the ELF directly.


It’s not “economizing” if I have to verify every test myself. To actually validate that tests are good I need to understand the system under test, and at that point I might as well just write the damn thing myself.

This is the fundamental problem with this “AI” mirage. If I have to be an expert to validate that the LLM actually did the task I set out, and isn’t just cheating on tests, then I might as well code the solution myself.


From a PM perspective, the main differentiator between an engineering team and AI is "common sense". As these tools get used more and more, enough training data will be available that AI's "common sense" in terms of coding and engineering decisions could be indistinguishable from a human's over time. At that point, the only advantage a human has is that they're also useful on the ops and incident response side, so it's beneficial if they're also comfortable with the codebase.

Eventually these human advantages will be overcome, and AI will sufficiently pass a "Turing Test" for software engineering. PMs will work with them directly and get the same kinds of guidance, feedback, documentation, and conversational planning and coordination that they'd get from an engineering team, just with far greater speed and less cost. At that point, yeah you'll probably need to keep a few human engineers around to run the system, but the system itself will manage the software. The advantage of keeping a human in the loop will dwindle to zero.


I can see how LLMs can help with testing, but one should never compare LLMs with deterministic tools like compilers. LLMs are entirely a separate category.

Tests and proofs can only detect issues that you design them to detect. LLMs and other people are remarkably effective at finding all sorts of new bugs you never even thought to test against. Proofs are particularly fragile as they tend to rely on pre/post conditions with clean deterministic processing, but the whole concept just breaks down in practice pretty quickly when you start expanding what's going on in between those, and then there's multithreading...

Ah, most of the problem in programming is writing the tests. Once you know what you need, the rest is just typing.

I can see an argument where you get non-programmers to create the inputs and outputs of said tests, but if they can do that, they are basically programmers.

This is of course leaving aside that half the stated use cases I hear for AI are that it can 'write the tests for you'. If it is writing both the code and the tests, it is pointless.


You need more than tests. Test-induced design damage:

https://dhh.dk/2014/test-induced-design-damage.html


Well, the end result can still be garbage. To be fair, humans also write a lot of garbage. I think most software is in general rather poorly written; only a tiny percentage is of epic prowess.

Who is writing the tests?

Who writes the tests?

A competing AI.

Ah, it is turtles all the way down.

Yes. But it's no different from the question of how a non-tech person can make sure that whatever their tech person tells them actually makes sense: you hire another tech person to have a look.

These types of comments are interesting to me. Pre-ChatGPT, there were tons of posts about how so many software people were terrible at their jobs. Bugs were/are rampant. Software bugs caused high-profile issues, but likely so many more that we never heard about.

Today we have ChatGPT, and only now will teams be uninsurable and sued into oblivion? LOL


LLMs were trained on exactly that kind of code.

If you've ever used Claude Code in brave mode, I can't understand how you'd think a dev team could make the same categories of mistakes or with the same frequency.

I am but a lowly IC, with no notion of the business side of things. If I am an IC at, say, a FANG company, what insurance has been taken out on me writing code there?

> If I am an IC at, say, a FANG company, what insurance has been taken out on me writing code there?

Every non-trivial software business has liability insurance to cover them for coding lapses that lead to data breaches or other kinds of damages to customers/users.


I use LLMs to write the majority of my code. I haven't encountered a hallucination for the better part of a year. It might be theoretically unsolvable, but it certainly doesn't seem like a real problem to me.

I use LLMs whenever I'm coding, and they make mistakes ~80% of the time. If you haven't seen them make a huge mistake, you may not be experienced enough to catch them.

Hallucinations, no. Mistakes, yes, of course. That's a matter of prompting.

> That's a matter of prompting.

So when I introduce a bug it's the PM's fault.


You can install Linux on Dell laptops, or Dell will do it for you:

https://www.dell.com/en-us/shop/dell-laptops/scr/laptops/app...


> or Dell will do it for you:

Only in certain countries. Dell refuses to sell laptops with Linux installed in Australia. And they've never given a believable explanation as to why.


Possibly a bad deal they made with Microsoft for the region.

I'm a coder, potter, and (sometimes) writer. This post is content-free, insight-free nonsense.

You can find parallels between any two things if you strain hard enough, but just listing them doesn't necessarily convey any new ideas.

Code is nothing like clay, and coding is nothing like potting.


To paraphrase a wise bot: like everything else in life, coding is just a primitive, degenerate form of bending.

Humans relate concepts to established concepts and worldviews; it's a form of prejudice, and it has philosophical resonance with premature optimization being the root of some not-great thinking.


No. First of all, AI companies make money per token, not per user. The unlimited plans are all dead or dying.

Second, they can increase prices if they're really killing jobs. The price for a model that can kill 1 job is much lower than the price for a model that can kill 100.


That's a fair point. Model prices can increase. Let's say Anthropic/OpenAI/Gemini figure this out.

Doesn't this mean that any company that depends on headcount growth (every SaaS), loses?

100 SWEs -> 10 SWEs; 100 Slack/Gmail/Notion/Zoom/etc. subscriptions become 10.


> Doesn't this mean that any company that depends on headcount growth (every SaaS), loses?

Yes, assuming they aren't also scaling their costs down with AI.

But this is mostly a moot point because we don't yet have any evidence that AI is killing lots of jobs. Companies are doing necessary/planned layoffs after over-hiring for years, and some of them are making it look better to investors by saying it's because they're smart (for using AI) instead of the truth, which is that they stupidly over-hired.

You also have to remember that AI is way, way under-priced right now. That $200/mo. Claude bill should probably be double or triple what it is. All of the AI companies are plowing money into keeping prices artificially low.

The economics will change a lot once they can't do that anymore. Google will likely dominate because they can burn their own cash from other businesses instead of cash from VCs or retail investors.

But if prices go up and these companies charge enough to be profitable, people will start to question whether it's cheaper to just have people doing the work instead.


Things like flexbox have made CSS indescribably better and easier to use than it used to be. It's still bad, but degrees matter a lot.

As a fullstack dev, I couldn't do pixel-perfect CSS 10 years ago, and today I can. That's a lot of progress.


I was already using flexbox ten years ago. And if the goal was pixel-perfect layout, I could do that twenty years ago using `position: absolute`.

I would instead characterize the recent developments in CSS as enabling good layout even when there are major unknowns in your content. It was always easy to write CSS tailored to one set of content (say, one style of toolbar in your UI), but it has become possible to write generic CSS (say, a generic toolbar component where the icons are unknown and the width and height are also unknown).

