ramraj07's comments | Hacker News

This exactly matches our findings: if we mold the repo to be "AI native" (whatever that means), add the right tooling, and still demand all engineers take full responsibility for their output, this system is a true multiplier.

I also have Copilot and Cursor Bugbot reviews and run them in a Ralph Wiggum loop with Claude Code. A few rounds overnight and the PR is perfect and ready for a final review before merging.
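For anyone unfamiliar: a "Ralph Wiggum loop" just means re-running the same fix-everything prompt against the branch until the review bots come back clean. A minimal sketch in Python, assuming the Claude Code CLI is installed and that `claude -p` runs a single non-interactive turn (check `claude --help` on your version; the prompt text and round count here are made up for illustration):

  import subprocess
  import time

  PROMPT = (
      "Address every open review comment from Copilot and Bugbot on this "
      "branch, run the test suite, and commit the fixes."
  )

  # A few unattended rounds overnight. Each round starts a fresh session,
  # so the only state carried over is whatever got committed to the repo.
  for round_no in range(1, 6):
      print(f"--- round {round_no} ---")
      subprocess.run(["claude", "-p", PROMPT], check=False)
      time.sleep(300)  # let CI and the review bots catch up between rounds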

I do run 4 CC sessions in parallel, but that's just one day a week. The rest of the week is spent figuring out the next set of features and fixes needed, operational things, meetings, feedback, etc.


Why did your company get Claude Code with token billing instead of getting everyone Max plans?

I think they need the enterprise plan to get the advanced security and data-handling guarantees. They also set up pretty strict org-level controls on which tools the agents can use, which we cannot override; I'm not sure that's an option with the subscription plans.

ZDR (zero data retention) is in place at the API level, but you need an enterprise contract if you're on a plan. Vendor lock-in and IP are the drivers.

Not that guy, but here token billing was chosen to get the Enterprise monitoring shit. I think the C-suite is expected to report productivity increases and needs all of the data that Anthropic can scrape to justify how much money is being set on fire right now.

I used to think that the people who keep saying (in March 2026) that AI does not generate good code are just not smart and write stupid prompts.

I think I've amended that thought. They are not necessarily lacking in intelligence. I hypothesize that LLMs pick up on optimism and pessimism, among other sentiments, in the incoming prompt: someone prompting with no hope that the result will be useful ends up with useless garbage output, and vice versa.


Exactly. You have to manifest at a high vibrational frequency.

Thanks for the laugh.

This is kinda like that thing about how psychic mediums supposedly can't medium if there's a skeptic in the room. Goes to show that AI really is a modern-day ouija board.

The accurate inferences that can be drawn from subtle linguistic attributes should freak you out more than they do.

Switching one good synonym can send the model off in an entirely different direction in response, or so I've observed.

That sounds a lot more like confirmation bias than any real effect on the AI's output.

Gung-ho AI advocates overlook problems and seem to focus more on the potential they see for the future, giving everything a nice rose tint.

Pessimists will focus on the problems they encounter and likely not put in as much effort to get the results they want, so they likely see worse results than they might have otherwise achieved and worse than what the optimist saw.


That's a valid-sounding argument. However, many people with no strong view either way are producing functional, good code with AI daily, and the original context of this thread is about someone who has never been able to produce anything committable. Many, many real-world experiences show something excellent and ready to go from a simple one-shot prompt.

It probably has more to do with the intelligence required to know when a specific type of code will yield poor future integrations and hamper large-scale implementation.

It's pretty clear that people think greenfield projects can constantly be slopified and that AI will always be able to dig up another logical connection for them, so it doesn't matter which abstraction the AI chose this time; it can always be better.

This is akin to people who think we can just keep using oil to fuel technological growth because it'll somehow improve the ability of technology to solve climate problems.

It's akin to the techno-capitalist cult of "effective altruism" that assumes there's no way you could f'up the world that you can't fix with "good deeds".

There's a lot of hidden context in evaluating the output of LLMs, and if you're just looking at today's successes, you'll come away with a much different view than if you're looking at next year's.

Optimism, in this case, is just the belief that the AI will keep getting more powerful, so it'll always clean up today's mess.

I call this techno-magic, indistinguishable from religious 'optimism'.


Don’t know why you’re getting downvoted, this is a fascinating hypothesis and honestly super believable. It makes way more sense than the intuitive belief that there’s actually something under the human skin suit understanding any of this code.

If you don't take a stand and refuse to clean their mess, aren't you part of the problem? No self-respecting proponent of AI-enabled development should suggest that the engineers generating the code are not still personally responsible for its quality.

Ultimately that's only an option if you can sustain the impact to your career (not getting promoted, or getting fired). My org (publicly traded, household name, <5k employees) is all-in on AI with the goal of having 100% of our code AI generated within the next year. We have all the same successes and failures as everyone else, there's nothing special about our case, but our technical leadership is fundamentally convinced that this is both viable and necessary, and will not be told otherwise.

People who disagree at all levels of seniority have been made to leave the organization.

Practically speaking, there's no sexy pitch you can make about doing quality grunt work. I've made that mistake virtually every time I've joined a company: I make performance improvements, I stabilize CI, I improve code readability, remove compiler warnings, you name it. But if you're not shipping features, if you're not driving the income needle, you have a much more difficult time framing your value to a non-engineering audience, who ultimately sign the paychecks.

Obviously this varies wildly by organization, but it's been true everywhere I've worked to varying degrees. Some companies (and bosses) are more self-aware than others, which can help for framing the conversation (and retaining one's sanity), but at the end of the day if I'm making a stand about how bad AI quality is, but my AI-using coworker has shipped six medium sized features, I'm not winning that argument.

It doesn't help that I think non-engineers view code quality as a technical boogeyman and an internal issue to their engineering divisions. Our technical leadership's attitude towards our incidents has been "just write better code," which... Well. I don't need to explain the ridiculousness of that statement in this forum, but it undermines most people's criticism of AI. Sure, it writes crap code and misses business requirements; but in the eyes of my product team? That's just dealing with engineers in general. It's not like they can tell the difference.


Hi, thanks for this brilliant feature. It will really improve the product. However, it needs a little more work before we can merge it into our main product.

1) The new feature does not follow the existing API guidelines found here: see examples a and b.

2) The new feature does not use our existing input validation and security checking code, see example.

Once these points have been addressed, we will be happy to integrate it.

All the best.

The ball is now in their court, and the feature should come back better.

This is a politics problem. Engineers were sending each other crap long before AI.


Engineers also wrote good code before AI. We don't get to pretend that the speed increase of AI only increases the output of quality code - it also allows engineers to send much more crap!

...so they copy/paste your message into Claude and send you back a +2000, -1500 version 3 minutes later. And now you get to go hunting for issues again.

If that happens then there’s an issue.

In the past I've hopped on a call with them and asked them to show me it running. When it falls over, I say: here are the things the system should do; send me a video of the new system doing all of them.

The embarrassment usually shames them into actually checking that the code works.

If it doesn’t then you might have to go to the senior stakeholder and quietly demonstrate that they said it works, but it does not actually work.

You don’t want to get into a situation where “integrate” means write the feature while others get credit.


There is an alternative way to make the necessary point here. Let it go through with comments to the effect that you cannot attest to the quality or efficacy of the code, and let the organization suffer the consequences of this foray into LLM usage. If they can't use these tools responsibly and are unwilling to listen to the people who can, then they deserve to hit the inevitable quality wall, where endless passes through the AI still can't deliver working software and their token budget goes through the ceiling attempting to make it work.

I think you're falling victim to the just-world fallacy.

I am absolutely certain the world isn't just. I'm also absolutely certain the world can't become just unless you let people suffer consequences for their decisions. It's the only way people learn.

IME that simply doesn't work in professional environments. People will either misrepresent the failure as a success or find someone else to pin the blame on. Others won't bother taking the time to understand what actually happened because they're too busy and often simply don't care. And if it's nominally your responsibility to keep something up, running, and stable then you're a very likely scapegoat if it fails. Which is probably why people are throwing stuff that doesn't work at you in the first place. Trying to solve the problem through politics is highly unlikely to work because if you were any good at politics you wouldn't have been in that situation in the first place.

I understand how people can get into these fatalist outlooks from experience. I just refuse to lock myself into them. And because I've refused to do so, every once in a while I have success and make the work environment just that little bit better. So I'll keep doing it.

> My org [...] is all-in on AI with the goal of having 100% of our code AI generated within the next year.

> People who disagree at all levels of seniority have been made to leave the organization.

So either they're right (100% AI-generated code soon) and you'll be out of a job or they'll be wrong, but by then the smart people will have been gone for a while. Do you see a third future where next year you'll still have a job and the company will still have a future?


"100% AI-generated code soon" doesn't mean no humans, just that the code itself is generated by AI. Generating code is a relatively small part of software engineering. And if AI can do the whole job, then white collar work will largely be gone.

I agree, but it seems like if we can tell the AI "follow these requirements and use this architecture to make these features", we're a small step away from letting the AI choose the requirements, the architecture and the features. And even if it's not 100% autonomous, I don't see how companies will still need the same number of employees. If you're the lead $role, you'll likely stay, but what would be the use of anyone else?

And then we all go into trades and uhhh no one will be able to pay for it lol

> ... I make performance improvements, I stabilize CI, I improve code readability, remove compiler warnings, you name it ...

These are exactly the kind of tasks that I ask an AI tool to perform.

Claude, Codex, et al. are terrible at innovation. What they are good at is regurgitating patterns they've seen before, which often means refactoring something into a more stable/common format. You can paste compiler warnings and errors into an agentic tool's input box and have it fix them for you, with a good chance of success.
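That workflow can also be scripted rather than pasted by hand. A rough sketch, assuming a Rust project built with cargo and a CLI agent that accepts a one-shot prompt (the `claude -p` flag is an assumption; substitute your own build command and agent):

  import subprocess

  # Capture the compiler's diagnostics (cargo writes them to stderr).
  build = subprocess.run(["cargo", "build"], capture_output=True, text=True)

  if build.returncode != 0 or build.stderr.strip():
      # Hand the diagnostics to the agent instead of pasting them manually.
      prompt = "Fix these compiler warnings and errors:\n" + build.stderr
      subprocess.run(["claude", "-p", prompt], check=False)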

I feel for your position within your org, but these tools are definitely shaking things up. Some tasks will be given over entirely to agentic tools.


> These are exactly the kind of tasks that I ask an AI tool to perform.

Very reasonable nowadays, but those were things I was doing back in 2018 as a junior engineer.

> Some tasks will be given over entirely to agentic tools.

Absolutely, and I've found tremendous value in using agents to clean up old tech debt with one-line prompts. They run off, make the changes, modify tests, then put up a PR. It's brilliant and has fully reshaped my approach... but in a lot of ways the expectations of my efficiency are much worse now, because leadership thinks I can rewrite our tech stack in another language over a weekend. It almost doesn't matter that I can pass all this tidying off onto an LLM, because I'm expected to have 3x the output that I had a year ago.


Unfortunately, not many companies seem to require engineers to cycle between "feature" and "maintainability" work; hence those who chase the low-hanging fruit and know how to virtue-signal get to build their careers on "features", while engineers passionate about correct solutions are left to pay for it and get labelled "inefficient" by management. It's all a clown show, especially now with vibe-coding; no wonder big companies have had multiple incidents since vibing started taking off.

Culture and accountability problems aren't limited to software.

It's best to sniff out values mismatches ASAP and then decide whether you can tolerate some discomfort to achieve your personal goals.


Shipping “quality only” work for a long time can be stressful for your colleagues and the product teams.

You’re much better off mixing both (quality work and product features).


> Shipping “quality only” work for a long time can be stressful for your colleagues and the product teams.

I buried the lede a bit, but my frustration has been feeling like _nobody_ on my team prioritizes quality and instead optimizes for feature velocity, which then leaves some poor sod (me) to pick up the pieces to keep everything ticking over... but then I'm not shipping features.

At the end of the day if my value system is a mismatch from my employer's that's going to be a problem for me, it just baffles me that I keep ending up in what feels like an unsustainable situation that nobody else blinks at.


That's a different situation than the one I had in mind. I was assuming a sane culture that balances shipping features and quality work. What you're describing sounds like a serious value function mismatch.

"aren't you part of the problem?"

Yes? In the same way any victim of shoddy practices is "part of the problem"?


Employees, especially ones as well-leveraged and overpaid as software engineers, are not victims. They can leave. They _should_ leave. Great engineers are still able to get better-paying jobs all the time.

> Great engineers are still able to get better-paying jobs all the time

I know a lot of people who tried playing this game frequently during COVID, then found themselves stuck in a bad place when the 0% money ran out and companies weren't eager to hire someone whose resume had a dozen jobs in the past 6 years.


You obviously haven't gone job hunting in 2026

I hope you get the privilege soon


Employees are not victims. Sounds like a universal principle.

Came here to say this. The right solution to this is still the same as it always was: teach the juniors what good code looks like, and how to write it. Over time, they will learn to clean up the LLM's messes on their own, getting better at both jobs.

> and refuse to clean their mess

You can and should speak up when tasks are poorly defined, underestimated, or miscommunicated.

Try to flat out “refuse” assigned work and you’ll be swept away in the next round of layoffs, replaced by someone who knows how to communicate and behave diplomatically.


ramraj07 went on to clarify that they were advocating for putting the onus for cleanup back on mess generators.

They clearly were not advocating for flat out refusing.


Do you want issues of Nature and Cell to be replication studies? As a reader, even from within the field, I'm not interested in browsing through negative studies. It'd be great if I could look them up when needed, but I'm not looking forward to email ToC alerts filled with them.

Also, who's funding you for replication work? Do you know the pressure you face on the tenure track to have a consistent thesis for what you work on?

Literally every single knob that shapes academia is tuned to not incentivize what you complain about. It's not just journals being picky.

Also, the people committing fraud aren't the ones who will say "gosh, I will replicate things now!" Replicating work is far more difficult than a lot of original work.


> Do you want issues of Nature and Cell to be replication studies?

Of course I do! Not all of them, of course, and taking (subjectively measured) impact into account. "We tried to replicate the study published in the same journal 3 years ago using a larger sample size and failed to achieve similar results..." OR "after successfully replicating the study we can confirm the therapeutic mechanism proposed by X actually works" - these are extremely important results that are taken into account in meta-studies and, e.g., form the basis of policies worldwide.


Honestly even if they didn't publish the whole paper, if there was just a page that was a table of all the replication studies that were done recently, that would be pretty cool.

> Do you want issues of Nature and Cell to be replication studies?

More than anything. That might legitimately be enough to save science on its own.


Maybe Nature and Cell and a few other journals should be exceptions: they should be the place where the most advanced scientists publish interesting ideas early for consumption by their competitors. At that level of science, all the competitors can reproduce each other's experiments if necessary; the real value is in expanding the knowledge of what seems possible quickly.

(I am not seriously proposing this, but it's interesting to think about distinguishing between the very small amount of truly innovative discovery versus the very long tail of more routine methods development and filling out gaps in knowledge)


> that level of science, all the competitors can reproduce each other's experiments if necessary

But they don't, and that's the problem!


The problem is bigger. It even blocks research!

In my own experience, I was unable to publish a few works because I was unable to outperform a "competitor" (technically we're all on the same side, right?). So I dig deeper and deeper into their work and really try to replicate it. I can't! Emailing the authors, I get no further and only more questions. I submit the papers anyway, adding a section about replication efforts. You guessed it: rejected. With explicit comments from reviewers about lack of impact due to the "competitor's" results.

It's an experience I've found a lot of colleagues share. And I don't understand it. Every failed replication should teach us something new. Something about the bounds of where a method works.

It's odd. In our striving for novelty, we sure do turn down a lot of novel results. In our striving to reduce redundancy, we sure do create a lot of redundancy.


I've seen this from both sides.

Sometimes the result is wrong, or it's not as big or as general as claimed. Or maybe the provided instructions are insufficient to replicate the work. But sometimes the attempt to replicate a result fails, because the person doing it does not understand the topic well enough.

Maybe they are just doing the wrong things, because their general understanding of the situation is incorrect. Maybe they fail to follow the instructions correctly, because they have subtle misunderstandings. Or maybe they are trying to replicate the result with data they consider similar, but which is actually different in an important way.

The last one is often a particularly difficult situation to resolve. If you understand the topic well enough, you may be able to figure out how the data is different and what should be changed to replicate the result. But that requires access to the data. Very often, one side has the data and another side the understanding, but neither side has both.

Then there is the question of time. Very often, the person trying to replicate the result has a deadline. If they haven't succeeded by then, they will abandon the attempt and move on. But the deadline may be so tight that the authors can't be reasonably expected to figure out the situation by then. Maybe if there is a simple answer, the authors can be expected to provide it. But if the issue looks complex, it may take months before they have sufficient time to investigate it. Or if the initial request is badly worded or shows a lack of understanding, it may not be worth dealing with. (Consider all the bad bug reports and support requests you have seen.)


I definitely think all these are important, even if in different ways. For the subtle (and even not so subtle) misunderstandings it matters who misunderstands. For the most part, I don't think we should concern ourselves with non-experts. We do need science communicators, but this is a different job (I'm quite annoyed at those on HN who critique arxiv papers for being too complex while admitting they aren't researchers themselves). We write papers to communicate to peers, not the public. If we were to write to the latter, each publication would have to be prepended by several textbooks worth of material. But if it is another expert misunderstanding, then I think there's something quite valuable there. IFF the other expert is acting in good faith (i.e. they are doing more than a quick read and actually taking their time with the work) then I think it highlights ambiguity. I think the best way to approach this is to distinguish by how widespread the misunderstanding is. If it is uncommon, well... we're human and no matter how smart you are you'll produce mountains of evidence to the contrary (we all do stupid shit). But if the misunderstanding is widespread, then we can be certain that ambiguity exists, and it is worth resolving. I've seen exactly what you've seen as well as misunderstandings leading to discoveries. Sometimes our idiocracy can be helpful lol.

But in any case, I don't know how we figure out which category of failures it is without it being published. If no one else reads it, it substantially reduces the odds of finding the problem.

FWIW, I'm highly in favor of a low bar to publishing. The goal of publishing is to communicate with our peers. I'm not sure why we get so fixated on things like journal prestige. That's missing the point. My bar is: 1) it is not obviously wrong, 2) it is not plagiarized (obviously or not), 3) it is useful to someone. We do need some filters, but there are already natural filters beyond the journals and conferences. I mean, we're all frequently reading "preprints" already, right? I think one of the biggest mistakes we make is conflating publication with correctness. We can't prove correctness anywhere; science is more about the process of elimination. It's silly to think that the review process could provide correctness. It can (imperfectly) invalidate works, but not validate them. It isn't just the public that seems to have this misunderstanding...


Things are easier when you are writing to your peers within an established academic field. But all too often, the target audience includes people in neighboring fields. Then it can easily be that most people trying to replicate the work are non-experts.

For example, most of my work is in algorithmic bioinformatics, which is a small field. Computer scientists developing similar methods may want to replicate my work, but they often lack the practical familiarity with bioinformatics. Bioinformaticians trying to be early adopters may also try to replicate the work, but they are often not familiar with the theoretical aspects. Such a variety of backgrounds can be a fertile ground for misunderstandings.


Sure. You can't write to everyone, and there are tradeoffs to broadening your audience. But I'm also not sure what your point is. That people are arrogant? Such a variety of backgrounds can also be fertile ground for collaboration. Something that should happen more often.

As a gross simplification, there are two kinds of fields. Some are defined by the methods they use, and some by the topics they study.

The latter will use any methods that may yield results. That creates a problem. The people who are in the target audience for a paper and may try to replicate the results often fail to do so, because they lack the expertise. Because their background is too different.


I think that because we don't agree, you think I have some grave misunderstanding of some, to be frank, basic facts. I assure you, I perfectly understand what you're bringing up here and in the last comment.

But I think you still haven't understood my point about trade-offs. At least you aren't responding as if these exist.

Our disagreement isn't due to lack of understanding the conditions, it is due to a difference in acceptable limitations. After all, perfection doesn't exist.

So you can't just solve problems like this by bringing up limitations in an opposing viewpoint. I assure you, I was already well aware of every single one you've mentioned...


My original point was that replication attempts often fail, because the person trying to replicate the result is not an expert in the field, and they do not have enough time to devote to the effort. This is a common situation in fields that use results from other fields. If they don't have the time for proper replication, they probably don't have the time for publishing the attempt.

As for your point, I don't really understand what you are trying to say.


Advanced groups usually replicate their competitor's results in their own hands shortly after publication (or they just trust their competitor's competence). But they don't spend any time publishing it unless they fail to replicate and can explain why they can't. From their perspective, it's a waste of time. I think this has been shown to be a naive approach (given the high rate of image fraud in molecular biology), but people at the top of the field have strong incentives to focus on moving the state of the art forward without expending energy on improving the field as a whole.

"strong incentives to focus on moving the state of the art forward without expending energy on improving the field as a whole"

That sort of Orwellian doublethink is exactly the problem. They need to move it forward without improving it, contribute without adding anything, challenge accepted dogma without rocking the boat, and...blech!


  > challenge accepted dogma without rocking the boat
I think the funniest part is how we have all these heroes of science who faced scrutiny from their peers but triumphed in the end. They struggled because they challenged the status quo. We celebrate their anti-authoritarian nature. We congratulate them for their pursuit of truth! And then we get mad when it happens. We pretend this is a thing of the past, but it's as common as ever [0,1].

You must create paradigm shifts without challenging the current paradigm!

[0] https://www.scientificamerican.com/article/katalin-karikos-n...

[1] https://www.globalperformanceinsights.com/post/how-a-rejecte...


"Science is the belief in the ignorance of experts" - Richard Feynman

All that makes it more important for top journals to reward replication, not less!

Top journals are not inherently prestigious. They are prestigious because they try to publish only the most interesting and most significant results. If they started publishing successful replication studies, they would lose prestige, and more interesting journals would eventually rise to the top. (Replication studies that fail to replicate a major result in a spectacular way are another matter.)

Are you explaining this from experience or from speculation?

I can tell you that it doesn't match my own experience. I also think it doesn't match your example. Those cases of verified image fraud are typically part of replication efforts. The reason the fraud is able to persist is due to the lack of replication, not the abundance of it.


Mostly experience (based on being a PhD scientist, a postdoc, a National Lab scientist, and engineer at several bigtech companies), partly speculation (none of the groups/labs I worked in operated at "the highest level", but I worked adjacent to many of those).

I'm pretty sure most image fraud went completely unrealized even in the case of replication failure. It looks like (pre AI) it was mostly a few folks who did it as a hobby, unrelated to their regular jobs/replication work.


In most of the labs I've worked in, replication is not a common task.[0]

  > I'm pretty sure most image fraud went completely unrealized even in the case of replication failure
Part of my point is that being unable to publish replication efforts means we don't reduce ambiguity in the original experiments. I was taught that I should write a paper well enough that a PhD student (rather than candidate) should be able to reproduce the work. IME replication failures are often explained with "well I must be doing something wrong." A reasonable conclusion, but even if true the conclusion is that the original explanation was insufficiently clear.

  > It looks like (pre AI) it was mostly a few folks who did it as a hobby
I'm sorry, didn't you say

  >>> Advanced groups usually replicate their competitor's results in their own hands shortly after publication 
Because your current statement seems to completely contradict your previous one.

Or are you suggesting that the groups you didn't work with (and are thus speculating about) are the ones who replicate works, and the ones you did work with "just trust their competitor's competence"? Because if this is what you're saying, then I do not think this "mostly" matches your experience. That your experience more closely matches my own.

[0] I should take that back. I started in physics (undergrad) and went to CS for grad. Replication could often be de facto in physics, as it was a necessary step towards progress. You often couldn't improve an idea without understanding/replicating it (both theoretical and experimental). But my experience in CS, including at national labs, was that people didn't even run the code. Even when code was provided as part of reviewing artifacts I found that my fellow reviewers often didn't even look at it, let alone run it... This was common at tier 1 conferences mind you... I only knew one other person that consistently ran code.


Note that my field is biophysics (quantitative biology) while yours is physics and CS. Those are done completely differently from biology; with the exception of some truly enormous/complex/delicate experiments that require unique hardware, physics tends to be much more reproducible than biology, and CS doubly-so.

Replication of an experiment and finding image fraud are kind of done as two different things. If somebody publishes a paper with image fraud, it's still entirely possible to replicate their results(!) and if somebody publishes a paper without any image fraud, it's still entirely possible that others could fail to replicate. Also, most image errors in papers are, imho, due to sloppy handling/individual errors, rather than intentional fraud (it's one of the reasons I worked so hard on automating my papers- if I did make an error, there should be an audit log demonstrating the problem, and the error should be rectified easily/quickly in the same way we fix bugs in production at big tech).

This came up a bunch when I was at LBL because of work done by Mina Bissell there on extracellular matrix. She is actively rewriting the paradigm but many people can't reproduce her results- complex molecular biology is notoriously fickle. Usually the answer is, "if you're a good researcher and can't reproduce my work, you come to my lab and reproduce it there" because the variables that affect this are usually things in the lab- the temperature, the reagents, the handling.

See https://www.nature.com/articles/503333a (written by Dr. Bissell).


  > physics tends to be much more reproducible than biology, and CS doubly-so.
With physics I think there is a better culture of reproduction, but that is, I believe, mostly cultural: it is acceptable to "be slow". There's a high stress on being methodical and extremely precise. The prestige is built on making your work bulletproof, so you're really encouraged to help others reproduce your work, as it strengthens it. You're also encouraged to analyze in detail and to faithfully reproduce, because finding cracks also yields prestige. I don't know if it's the money, but no one is in it for the money. Physics sure is a lot harder than anything else I've done and it pays like shit.

For CS the problem is wildly different. It should be easy to reproduce, as code is trivial to copy. Ignoring the issue of not publishing code alongside results, there are also often subtle things that can make or break works. I've found many times in replication efforts that the success can rely on a single line that essentially comes from a work that was the reference to a reference of the work I'm trying to reproduce. The problem here is honestly more one of laziness. In contrast to physics there's an extreme need for speed. In physics (like everyone else I knew) I often felt like I was not smart enough, and that encouraged people to dive deeper and keep improving or to give up. In CS (like everyone else I knew) I often felt like I was not fast enough, and that encouraged people to chase sponsorships from labs that provided more compute, it encouraged a "shotgun" approach (try everything), or for people to give up (aka "GPU poor").

The reason I'm saying this is because I think it is important to understand the different cultures and how replication efforts differ. In physics a replication failure was often assumed to be due to a lack of intelligence. In CS a replication effort is seen as a waste of time. Both are failures of the scientific process. Science is intended to be self-correcting. Replication is one means of this, but at its heart is the pursuit of counterfactual models. This gives us ways to validate, or invalidate, models through means other than direct replication. You can pursue the consequences of the results if you are unable to pursue the replication itself. This is almost always a good path to follow as it is the same one that leads to the extension and improvement of understanding.

There's a lot I agree and disagree with from Dr Bissell's article. Our perspectives may differ due to our different fields, but I do think it also serves as a point of collaboration, if not on the subject of meta-science. Biology is not unique in having expensive experiments. I want to point out two famous and large physics projects: the LHC's discovery of the Higgs Boson[0] and LIGO's Observation of a Gravitational Wave[1]. The former has 9 full pages of authors (IIRC over 200) while the latter has about 3. These works are both too expensive to replicate while also demonstrating replication. Certainly we aren't going to take another 2 decades to build another CERN and replicate the experiments. But there's an easy-to-miss question that might also make apparent the existence of replication: who is qualified to review the paper and is not already an author of it? There are definitely some, but it really isn't that many. In these mega projects (and there are plenty more examples) the replication is done through collaboration. Independent teams examine the instruments that make the measurements. Independent teams make measurements, using the same device or different devices (ATLAS isn't the only detector at CERN), different teams independently analyze and process the information, and different teams model and simulate them. With LIGO this is also true. It would be impossible to locate those black holes without at least 2 facilities: one in Hanford (Washington) and the other in Livingston (Louisiana) (and now there are even more facilities). Astrophysics has a long history of this type of replication/collaboration, as one team will announce an observation and it is a request for other observations. Observations that often were already made! In HEP (high energy particle physics) this may be less direct, but you'll notice other particle physics labs are in the author list of [0]. That's because despite the exact experiment not being replicable in other facilities, there are still other experiments done. In the effort to find the Higgs there were many collisions performed at Fermilab.

I don't think it's the same in biophysics, but I think there are nuggets that may be fruitful. Bissell mentions at the end of her argument that she believes replication might have higher success were labs to send scientists to the original labs. I fully agree! That would follow the practice we see in these mega experiments in physics. But I also do think she's brushing off an important factor: it is far quicker and cheaper to replicate works than it is to produce them. You're a scientist, you know how the vast majority of time (and usually the vast majority of money) is "wasted" in failures (it'd be naive to call it waste). Much of this goes away with replication efforts. The greater the collaboration the greater the reduction in time and money.

And I do agree with Bissell in that we probably shouldn't replicate everything[2]. At least if we want to optimize our progress. But also I want to stress that there is no perfect system and there are many roadblocks to progress. Frankly, I'd argue that we waste far more time in things like grant writing and publication revisions. I don't know a single scientist who hasn't had a work rejected due to reviewers either not giving the work enough care or simply being unqualified (often working in a different niche and so not understanding the minutiae of the problem). As for grant writing, I think it's a necessary evil, but I'm also a firm believer in what Mervin Kelly (former director of Bell Labs) said when asked how you manage a bunch of geniuses: "you don't"[3]. You're a scientist, an expert in your domain. You already know what directions to look in. You've only gotten this far because you've been honing that skill. We don't have infinite money, so of course we have to have some bar, but we can already sniff out promising directions and we're much better at sniffing out fraud. Science has been designed to be self-correcting.

[More of a side note]

  > Usually the answer is, "if you're a good researcher and can't reproduce my work, you come to my lab and reproduce it there" because the variables that affect this are usually things in the lab- the temperature, the reagents, the handling.
And we should not underestimate the importance of these variables. Failures based on them are still informative. They still inform us about the underlying causal structure that leads to success. If these variables were not specified in the paper, then a replication failure exposes a flaw in the write-up. Alternatively, a failure can bound these variables by making them more explicit. I'm no expert in biophysics, but I'm fairly certain that understanding the bounds of the solution space is important for understanding how the processes actually work.

[0] https://arxiv.org/abs/1207.7214

[1] https://arxiv.org/abs/1602.03837

[2] I would also be very cautious about paid replication efforts. I am strongly against them, as well as paywalls on publishing (both for publishing work and for accessing it).

[3] https://1517.substack.com/p/why-bell-labs-worked


Replication studies cannot save science and might make the fraud problem worse.

https://blog.plan99.net/replication-studies-cant-fix-science...


> Do you want issues of Nature and Cell to be replication studies? As a reader, even from within the field, I'm not interested in browsing through negative studies.

Actually, yes, I do. The marginal cost for publishing a study online at this point is essentially nil.


I think archives with pretty low standards for notability are a good idea. At some point though you have to pick what actually counts as interesting enough to go in the curated list that is actually suggested reading, where the prestige is attached. If there's no curation by Nature then it falls to bloggers or another journal to sift through the fire-hose and make best-of lists. Most of the value is in the curation, not the publishing. Without exclusivity there's very little signal.

> The marginal cost for publishing a study online at this point is essentially nil.

The marginal cost for doing a study remains the same, which is quite a bit. Society doesn't have unlimited scientific talent or hours. Every year someone spends replicating is a year lost to creating something new and valuable.


I know you got a ton of responses already, but not caring about replicability just invalidates science as a method. If we care only about being first to publish, we end up in the current situation where we don't even know whether what we know is actually even remotely correct.

All because journals prefer novelty over confirmation. It's like a house of cards: it looks cool, but it's not stable or long-term at all.


> Do you want issues of Nature and Cell to be replication studies?

Hell yeah. We’re all trying to get that Nature paper. Imagine if you could accomplish that by setting the record straight.


If you're thoroughly debunking a previous Nature paper they just might publish that. But the expectation is that you'll succeed. Publishing that sort of mundane article would reduce the prestige of getting something into the journal. Publishing in a high impact journal is only seen as an achievement in the first place because of what it implies about the content of your paper.

If you're a reader within the field, then you are the one person in the world who should be most interested in negative replication studies.

Realistically, everyone will say “yes” to the “do you want” question because if you’re not a reader or a subscriber you benefit from the readers reading replication studies.

I believe people will enthusiastically say yes but that they do not routinely read that journal.


Suggesting that people would stop reading Nature if it also included replication studies seems like an incredible leap.

It would directly undermine the reason that people read Nature in the first place.

Not really.

"It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so."

Knowing that something I thought was true was actually false would have saved me years in several situations.


I didn't understand us to only be talking about failed replication studies of previous Nature papers, which would hopefully be few and far between and thus noteworthy indeed. Rather, replication studies in general, which on average are arguably less interesting to the reader than even the content of the typical archival journal.

They certainly will be few and far between when the system is structured to repress them. But there's reason to believe they wouldn't be as rare as you seem to think:

https://www.nature.com/nature/articles?type=retraction


Are you seriously attempting to imply that Nature retractions aren't few and far between?

What's even your point here? Hopefully we are at least in agreement that Nature is seen as prestigious and worth looking through precisely because of the sort of content that they publish. Diluting that would dilute their very nature. (Bad pun very much intended sorry I just couldn't resist.)


"Are you seriously attempting to imply that Nature retractions aren't few and far between?"

No. I'm explicitly stating that they are few and far between, but perhaps (not certainly, but conceivably) they shouldn't be.

"What's even your point here?"

My point is that focusing on positive findings and neglecting negative findings perverts the mechanism that makes science work. Science isn't about proving things correct, it's about rooting out errors.


I'm not sure I agree. The system certainly isn't optimal but results aren't just dumped into a vacuum. Something is only useful if people can build on it. Even if negative results don't get published, even if it isn't optimal, by virtue of future positive results building on past things that did reproduce you get forward progress.

Regardless, I don't think that's at odds with my original assertion that becoming a venue for publishing negative results would undermine the "point" of Nature.

The missing link isn't a venue in which to publish. It's funding to do the work in the first place. Also funding to spend the time writing it up when you find that you've inadvertently been tricked into doing the work while trying to get something that builds on it to work.


"Also funding to spend the time writing it up when you find that you've inadvertently been tricked into doing the work while trying to get something that builds on it to work."

Oh, there have been times I would have loved to be able to apply for one of those!


That is a novel interpretation of my comment certainly.

Tagging seems like an option here.

"Original research" isn't worth much unless replicated, which is the entire problem being discussed in this thread. Replicating studies are great though because they tell you if the original research actually stands and is valid.

> Replicating work is far more difficult than a lot of original work.

Only if the original work was BS. And what, just because it's harder, we shouldn't do it?


Why blame just the journals when every other system also disincentivizes the same?

I must be missing something, surely the argument isn't "other systems also disincentivize solving the problem, therefore we shouldn't work to fix this one"

Even if that negative study could save you one, two, three+ years of work for the same outcome (which you then also can't really do anything with)? Shouldn't there BE funding for replication studies? Shouldn't that count towards tenure? Part of the problem is that publications play such a heavy role in getting tenure in the first place.

I'm sure you can more narrowly tune your email alerts FFS.


> Do you want issues of Nature and Cell to be replication studies?

There are journals dedicated to replication, like ReScience C[1]. But they are niche. Perhaps we should have more of these.

[1] http://rescience.github.io/


> Replicating work is far more difficult than a lot of original work.

I don’t regularly read scientific studies but I’ve read a few of them.

How is it possible that a serious study is harder to replicate than it was to do originally? Are papers no longer including their process? Are we at the point where they are just saying "trust me bro" for how they achieved their results?

> Do you want issues of Nature and Cell to be replication studies?

Not issues of Nature, but I've long thought that universities or the government should fund a department of "I don't believe you" entirely focused on reproducing scientific results and seeing if they are real.


> How is it possible that a serious study is harder to replicate than it is to do originally.

They aren't. GP was on point until that last sentence. Just pretend that wasn't there. It's pretty much always much easier to do something when all the key details have been figured out for you in advance.

There is some difficulty if something doesn't work to distinguish user error from ambiguity of original publication from outright fraud. That can be daunting. But the vast majority of the time it isn't fraud and simply emailing the original author will get you on track. Most authors are overjoyed to learn about someone using their work. If you want to be cynical about it, how else would you get your citation count up?


>Also who's funding you for replication work? Do you know the pressure you have in tenure track to have a consistent thesis on what you work on?

This is partly why much of today's science is bs, pure and simple.


Not everyone knows who geohot is, and even if they do, they may not see the URL handle. They may (like me) wonder why a glorified shower-thought tweet is at the top of HN.

They may not know that this dude was an anti-masker (with nuance) for example. This could really make them decide not to spend much time thinking about the passage, which in theory is profound for 10 seconds but no further.

As much as ad hominem attacks are not great approaches, the one scenario I feel it's justified is in cases like this.


> They may not know that this dude was an anti-masker (with nuance) for example.

Why are we supposed to care about that? There was a time when "masks do not work" was very much the conventional wisdom.


Masks didn't work for people needing an immediate cure, but they were never meant to be that; they were always a multiplier, and even a multiplier with only 30% efficiency would translate to a 4x reduction in spread through 4 levels.

And that reduction was there to give healthcare workers a chance to not be overwhelmed as they were for a large part of the initial pandemic.
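To make the arithmetic concrete: if masks block 30% of transmissions at each link and an infection has to pass through 4 links of a chain, the surviving fraction is 0.7^4 ≈ 0.24, i.e. roughly a 4x reduction. A quick sketch (the numbers are the ones claimed above, not epidemiological data):

  # Each link in a 4-step chain of infection survives with
  # probability (1 - efficiency); multiply across the chain.
  efficiency = 0.30
  levels = 4
  surviving = (1 - efficiency) ** levels  # 0.7**4 = 0.2401
  print(f"{surviving:.4f} of spread remains (~{1 / surviving:.1f}x reduction)")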


There was literally never a time where mainstream medical advice was "masks do nothing".

Public health recommendations aren’t medical advice though. The advice agencies give is given to everyone and so has to take things like supply chains and the economy into consideration before making recommendations.

Conventional for whom?

I might be misremembering, but I think the WHO claimed this at some point?

It was obvious nonsense, and did not comfort me as I watched an avoidable catastrophe become, day by day, an unavoidable one; politicians caring more about pacifying the populace with platitudes than about taking measures to render SARS-CoV-2 extinct in the wild – measures which would have been several orders of magnitude cheaper than the extended pandemic lockdowns, disabilities and trauma, loss of life, and now a new disabling endemic disease we're going to have to fight the hard way, for centuries, until it can finally go the way of smallpox.


There was an extremely brief period where public health advice discouraged the general population from masking. This was because there was a huge undersupply of masks for medical workers and because, mere weeks into the pandemic, we hadn't fully figured out whether covid aerosolized.

Once we had a bit more information in a rapidly evolving situation public health advice switched to recommending masks and stayed that way for years.

We cannot possibly expect public health advice to get everything right immediately during a once-in-a-century pandemic, and this error should definitely not be used as a general "wow, public health officials are dumb idiots or engaged in a malicious conspiracy," as it so often is.


There was indeed a huge undersupply of masks for medical workers. The appropriate response from public health officials should've been something like "surgical masks help protect others, and right now, we need to protect hospital patients", not "we want to keep masks available in hospitals where they're needed, because they're important, so let's tell people they're not important to make sure they're available". This is the kind of plan that would be used as an example of Adults Are Useless in a children's novel (ref: https://tvtropes.org/pmwiki/pmwiki.php/Main/AdultsAreUseless): it was never going to not backfire. (Does it even count as backfiring if you point the gun at your foot, and lean down to watch it go bang?) Institutional dysfunction can produce decisions that no individual would ever author, but there's a reason the Evil Overlord List has #12:

> One of my advisors will be an average five-year-old child. Any flaws in my plan that he is able to spot will be corrected before implementation.

And… it's SARS-CoV-2. Of course it aerosolises. The "we hadn't fully figured out" was sheer incompetence (see doi:10.1080/02786826.2024.2387985); just like the "we aren't sure whether it's reached our jurisdiction yet, so be vigilant", "oh lots people are showing symptoms", "actually turns out it reached us 3 weeks ago and is now endemic, haha oops" pattern we saw playing out in country after country, region after region, while open source intelligence collated on LessWrong obviously showed what the governments apparently could not see until tens of days too late. "Paranoid" early lockdowns could've lasted two weeks, allowed us to identify who was affected, and then allowed us to give them top-of-the-line isolated care while they recovered, and while the rest of the region got back to doing an economy. Instead, COVID-19 is still claiming new victims today.

Australia managed it, and everyone could've copied that, but they didn't. China could have managed it right at the start, and saved the world, if their accountability culture hadn't favoured a cover-up. (Their belated attempts to pursue a zero-COVID strategy were not particularly effective once it was a global pandemic and multiple strains were circulating, because it turns out you can't persuade people into not being ill; and sealing residential buildings is neither necessary nor sufficient contact tracing / isolation.) Zero-COVID would've been feasible as a global strategy, even starting as late as February 2020, if not for all the politicised bullshit. (No, "don't kill your neighbours by giving them a novel respiratory virus while all the hospitals are full" isn't authoritarianism. Sensible precautions are not setting a bad precedent, because it's a conditional precedent: if we were to wipe out the disease, there would no longer be a reason for those measures. In fact, the cryptographers figured out how to do privacy-preserving contact tracing, and shipped the protocol very quickly, so that the best available system was inherently anti-authoritarian.)

While I wouldn't use the phrase "dumb idiots", public health officials are, largely, responsible for long-term policy decisions, not rapid response. (If a response to a seasonal respiratory disease fails, you've usually got 8 months to put together a better one.) They had had little practice making snap decisions, and they almost invariably made bad ones when it mattered. Replacing soap with diluted alcohol gel at handwashing stations, for example, is stupidity. Soapy water is one of the most effective anti-coronavirus disinfectants there is. A roughly 70% alcohol solution is a close second. (There are viruses that are not denatured by soap, for which alcohol is a much better disinfectant: I can only imagine that people got confused.) 30% alcohol solution is basically useless. Alcohol gel is useful because it's portable, but it's not as good for removing SARS-CoV-2 as washing your hands with soap and water.

So many resources were used (and entire supply chains were established!) ensuring that every surface is wiped, even though it's a respiratory disease and surfaces were not a major transmission vector; but very few resources were employed to ensure sufficient ventilation which, again, respiratory disease. It does not take a genius to make the link between "it infects your air holes" and "we should ventilate or filter the air". (Yes, I witnessed many people closing windows using the disinfectant cloth as a glove, to avoid touching the handle, right before filling the room with occupants.)

We were not prepared for the pandemic that happened, and I will condemn that, because are we going to do any better next time? Have we put measures in place to ensure that accurate information about the disease, including appropriate disease-specific hygiene measures, is rapidly disseminated? Have we ensured that international authorities and the populaces will calmly overreact, because overreaction is cheap and allows each patient to receive specialist treatment and maybe lets us wipe out the disease entirely? From where I'm standing, the WHO is weaker than ever, we've traumatised a generation to the point they'll resist any attempt to impose another lockdown (even though acting early means it might only last a few weeks, or even a few days!), and we've gained a political playbook for profiting from pandemic denialism.

A once-in-a-century pandemic should be expected to happen once in a century. Allocate a hundred people around the world whose jobs it is to know how to deal with that, and then listen to them if the rare event shows up. It's not conceptually difficult. That this is difficult for us in practice is damning.


>are we going to do any better next time

The US is completely fucked if another pandemic hits within the next ~20 years, or at least until a large percentage of the anti-vaxxers created during the COVID pandemic age out of the population. Having a decent response when half the country is on team virus is not possible.

> Have we put measures in place to ensure that accurate information about the disease, including appropriate disease-specific hygiene measures, is rapidly disseminated?

This is irrelevant if half the country has the thought process of "the government says it, therefore it is fake news Bill Gates microchip conspiracy".


The title is: > Create value for others and don't worry about the returns.

Isn't being an anti-masker the opposite of this viewpoint? It's literally saying: I only care about the returns for myself, even if it creates negative value for others.


It's so we can definitively identify this person as a Nazi, as persona non grata, so we can feel better about ourselves while we break quarantine and contravene public health orders to get clandestine haircuts and attend illegal cross-household parties.

    So you must be careful to do everything they tell you.
    But do not do what they do, for they do not practice
    what they preach. They tie up heavy, cumbersome loads
    and put them on other people’s shoulders, but they
    themselves are not willing to lift a finger to move
    them.

    Everything they do is done for people to see: They make
    their phylacteries wide and the tassels on their
    garments long; [...]

Full stop, disagree. This is not what HN is for, and should never be for. Mind you, I spent hours on IRC with geohot back in the late 2000s / early 2010s, and I never liked him; but this is not what HN is for, and not what it should ever become. You can do all of that on reddit. Let's not ruin a rare good slice of the internet with meaningless bickering.

Ad hominem attacks are never good approaches. They're irrational in nature. Ad hominem is one of the first fallacies taught in a critical thinking class.

Ad hominems are informal fallacies. They are not valid deductive reasoning.

But people basically never use valid deductive reasoning for anything. Using available evidence to make predictions, and acting on those predictions, is fine. If somebody has a history of poor thought or writing, and then I encounter more of their thoughts or writing, it is not unreasonable to say "this new material is likely to be poor and I don't need to spend time on it."

If somebody says "hey do you want to see Transformers 7", responding "I did not like Transformers 1-6 so I'll pass" is fine even if it is not deductive proof that you won't like Transformers 7.
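In probabilistic terms, the prior track record carries real evidential weight even though it proves nothing. A toy Bayes update makes the point; all the numbers below are invented purely for illustration:

    # Toy Bayesian update: how six bad sequels shift belief that the
    # franchise produces good movies. All probabilities are made up.
    p_good = 0.5              # prior: the franchise makes good movies
    p_bad_if_good = 0.2       # even good franchises ship some duds
    p_bad_if_bad = 0.9        # bad franchises mostly ship duds
    for _ in range(6):        # observe Transformers 1-6, all bad
        num = p_bad_if_good * p_good
        p_good = num / (num + p_bad_if_bad * (1 - p_good))
    print(f"P(franchise is good | evidence) = {p_good:.4f}")  # ~0.0001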


if ad hominem attacks were of no value, humans wouldn't have evolved the strong tendency to engage in them.

they are not proofs in logic, hence the fallacy, but that does not mean they are irrational. it's irrational to think that human discourse can be captured by logic.


Isn’t rational a synonym for logical, though? People can subjectively rationalize their behavior, but that doesn’t make it objectively rational.

>They may not know that this dude was an anti-masker (with nuance) for example

If you're going to ad hominem, at least give a citation.

>As much as ad hominem attacks are not great approaches, the one scenario I feel it's justified

Because reasons?


> Not everyone knows who geohot is

But I guess people can get pretty much to the same conclusion by reading any of the blog posts; I had the same idea just from reading the title here.


what's an anti-masker?

Some people thought that surgical masks wouldn’t stop you from getting Covid

Surgical masks don't stop you from getting Covid. That was never what they were for: they were to reduce the viral load you exposed others to, between when you got infected and when you noticed that you were infected. See https://en.wikipedia.org/wiki/Surgical_mask#Function.

Some cloth masks can (when dry) also trap small particulates through electrostatic interactions, although they are less effective as a mechanical filter than surgical masks; and many washing methods destroy this effect.


Right, but that still makes the people who refused to wear one selfish assholes.

They didn't. The point is that they stop you from giving covid to other people.

It's a distinction without a difference. Masks served to reduce the spread of COVID through the herd, including other people giving it to you.

The distinction is important! The mechanism by which surgical masks prevent you from getting COVID-19 is peer pressure: it's important for people to know this, so they know how to protect themselves. (And there are fitted masks that protect the wearer: there was just a shortage of them, because despite all the warnings we were not prepared for a pandemic.)

The distinction does matter, because by not wearing a mask, instead of indicating that you don't care about your own safety, you're indicating that you don't care about anybody else's.

You do realise that surgeons don't wear those masks to stop themselves from catching something, right?

> Some people thought that surgical masks wouldn’t stop you from getting Covid

You do realize that masks would help prevent you from getting covid if other people are wearing the masks, right?

The comment just talked about masks, not whether you are the one wearing the mask.


It was more nuanced than that. Importantly, "anti-masker" as a derogatory label meant "this person is literally unwilling to do anything at all that doesn't pay off for them personally", because a goddamn mask is such a simple thing to do, and they just couldn't handle it, because it was someone else telling them to do something, and some people simply cannot function when that happens.

People lost their damn minds because "Hey could you maybe take a single small step to ensure you don't sneeze on produce" was protest worthy.

I do not believe someone saying "masks made out of T-shirts don't work well" or "surgical masks aren't as effective as real N95 masks" is an "anti-masker".

It was about vice-signalling. All the people who get pissy about masks do plenty of things for their "health" that have zero science behind them.

These people often did other things like hiding or downplaying any symptoms, and choosing to go to events while sick, and were often basically superspreaders.

This was all well understood about 20 years ago, when WoW had its Corrupted Blood plague and people put effort into spreading it as much as possible. We've seen this with other diseases, including ones with no real controversy or political angle. Some people are just that insistent on doing everything they can to make the world a worse place.

It just sucks so much.


Thank God I lived long enough to forget COVID-era terms.

I'm guessing: COVID-era anti-masker(?)

Hey, at least it's not ATLAS!


Anyone who's used both Claude and ChatGPT will instantly agree which is better, by a large margin. There's maybe a brand-recognition long tail, but those are more likely the rare occasional users on the free tier. Thus ChatGPT is becoming the shitty free AI app, while Claude is what you use to get real work done. Time (in months) will tell how this goes.


Avoiding doing something that could cause job loss has never been, and will never be, a productive ideal in any non-conservative, non-regressive society. What should we do? Not innovate on AI, and let other countries make the models that will kill the jobs two months later instead?


It's definitely against the terms. The Claude Code OAuth token is only supposed to be used with Claude Code. I hope no one gets their Claude account banned trying to use this.


That's literally what it does :D I.e. it uses the auth token to run Claude Code (the CLI binary). Check the code here: https://github.com/desplega-ai/agent-swarm/blob/main/src/com...
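For anyone curious about the shape of that pattern, here's a minimal sketch. It assumes the standard Claude Code setup (`CLAUDE_CODE_OAUTH_TOKEN` holds a long-lived token from `claude setup-token`, and `-p` is the CLI's non-interactive print mode); the prompt and this little harness are invented for illustration, so check the repo for the real invocation:

    import os, subprocess

    # The orchestrator never calls Anthropic's API directly: it shells
    # out to the `claude` CLI binary, which handles auth itself using
    # the OAuth token it finds in the environment.
    token = os.environ["CLAUDE_CODE_OAUTH_TOKEN"]  # from `claude setup-token`
    result = subprocess.run(
        ["claude", "-p", "summarize the open TODOs in this repo"],
        env={**os.environ, "CLAUDE_CODE_OAUTH_TOKEN": token},
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)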

