33a's comments | Hacker News

We also caught this right away at Socket,

https://socket.dev/blog/npm-author-qix-compromised-in-major-...

While it sucks that this happened, the good thing is that the ecosystem mobilized quickly. I think these sorts of incidents really show why package scanning is essential for securing open source package repositories.


So how do you detect these attacks?

We use a mix of static analysis and AI. Flagged packages are escalated to a human review team. If we catch a malicious package, we notify our users, block installation and report them to the upstream package registries. Suspected malicious packages that have not yet been reviewed by a human are blocked for our users, but we don't try to get them removed until after they have been triaged by a human.
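
At a very high level, the triage flow looks something like this (an illustrative sketch only; the names and thresholds here are made up, not our actual system):

    // Illustrative sketch of the triage flow; names/thresholds are
    // hypothetical, not Socket's actual implementation.
    type Verdict = "clean" | "suspected" | "confirmed";

    interface Scan {
      staticRuleHits: string[]; // e.g. ["eval-of-decoded-string"]
      aiRiskScore: number;      // 0..1, tuned to prefer false positives
    }

    function triage(scan: Scan): Verdict {
      const flagged =
        scan.staticRuleHits.length > 0 || scan.aiRiskScore > 0.5;
      if (!flagged) return "clean";
      // Suspected packages are blocked for our users right away;
      // upstream takedown reports wait for human review.
      return "suspected";
    }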

In this incident, we detected the packages quickly, reported them, and they were taken down shortly after. Given how high-profile the attack was, we also published an analysis soon after, as did others in the ecosystem.

We try to be transparent about how Socket works. We've published the details of our systems in several papers, and I've also given a few talks at various conferences on how our malware scanner works:

* https://arxiv.org/html/2403.12196v2

* https://www.youtube.com/watch?v=cxJPiMwoIyY


So, from what I understand from your paper, you're using ChatGPT with careful prompts?

You rely on LLMs riddled with hallucinations for malware detection?

I'm not exactly pro-AI, but even I can see that their system clearly works well in this case. If you tune the model to favour false positives, with a quick human review step, I can imagine your response time being cut from days to hours (and your customers getting their updates that much faster).

You are assuming that they build their own models.

He literally said "Flagged packages are escalated to a human review team." in the second sentence. Wtf is the problem here?

What about packages that are not "flagged"? There could be hallucinations when deciding to (or not) "flag packages".

>What about packages that are not "flagged"?

You can't catch everything with normal static analysis either. The LLM just produces some additional signal in this case; false negatives can be tolerated.


static analysis DOES NOT hallucinate.

So what? They're not replacing standard tooling like static analysis with it. As they mention, it's being used as additional signal alongside static analysis.

There are cases an LLM may be able to catch that their static analysis can't currently catch. Should they just completely ignore those scenarios, thereby doing their customers a disservice, just to stay purist?

What is the worst-case scenario that you're envisioning from an LLM hallucinating in this use case? To me the worst case is that it might incorrectly flag a package as malicious, which given they do a human review anyway isn't the end of the world. On the flip side, you've got the LLM catching cases not yet recognised by static analysis, which can then be accounted for in the future.

If they were just using an LLM, I might share similar concerns, but they're not.


well, you've never had a non-spam email end up in your spam folder? or the other way around?

when static analysis does it, it's called a "misclassification"


> We use a mix of static analysis and AI. Flagged packages are escalated to a human review team.

“Chat, I have reading comprehension problems. How do I fix it?”


Reading comprehension problems can often be caught with some static analysis combined with AI.

"LLM bad"

Very insightful.


AI-based code review with escalation to a human

I'm curious :)

Does the AI detect the obfuscation?


It's actually pretty easy to detect that something is obfuscated, but it's harder to prove that the obfuscated code is actually harmful. This is why we still have a team of humans review flagged packages before we try to get them taken down, otherwise you would end up with way too many false positives.
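
One cheap signal, just to give the flavor (a toy heuristic, not our real detector): packed payloads tend to show up as long, high-entropy string literals.

    // Toy heuristic, not our real detector: obfuscated packages often
    // carry long, high-entropy string literals (packed/encoded payloads).
    function shannonEntropy(s: string): number {
      const counts = new Map<string, number>();
      for (const ch of s) counts.set(ch, (counts.get(ch) ?? 0) + 1);
      let h = 0;
      for (const n of counts.values()) {
        const p = n / s.length;
        h -= p * Math.log2(p);
      }
      return h;
    }

    // English-ish source sits near 4 bits/char; random base64 approaches 6.
    // Note this also fires on minified bundles, hence the human review step.
    const looksPacked = (lit: string) =>
      lit.length > 200 && shannonEntropy(lit) > 5;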

Yeah, what I meant is that obfuscation is a strong sign that something needs to be flagged for review. Sadly, there's only a thin line between obfuscation and minification, so I was wondering how many false positives you get.

Thanks for the links in your other comment, I'll take a look!


I think that would be static analysis. After processing the source code normally (looking for net & sys calls), you decode base64, concatenate all strings, and process again (until decoding makes no change).
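
As a toy sketch of that fixpoint loop (real pipelines handle many more encodings than base64):

    // Keep decoding embedded base64 until the source stops changing,
    // then rescan the result. Toy version; regex and checks are crude.
    const B64 = /["']([A-Za-z0-9+/]{24,}={0,2})["']/g;

    function decodePass(src: string): string {
      return src.replace(B64, (m, b64) => {
        const decoded = Buffer.from(b64, "base64").toString("utf8");
        // Only substitute when the payload decodes to printable text.
        return /^[\x20-\x7E\s]+$/.test(decoded) ? JSON.stringify(decoded) : m;
      });
    }

    function deobfuscate(src: string, maxPasses = 10): string {
      for (let i = 0; i < maxPasses; i++) {
        const next = decodePass(src);
        if (next === src) break; // fixpoint: decoding changed nothing
        src = next;
      }
      return src;
    }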

Probably. It’s trivial to plug some obfuscated code into an LLM and ask it what it does.

Yeah, but just imagine how many false positives and false negatives there would be...

[flagged]


Apparently it found this attack more or less immediately.

It seems strange to attack a service like this right after it actively helped keep people safe from malware. I'm sure it's not perfect, but it sounds like they deserve to take a victory lap.


I don’t think celebrating a company that has a distinct interest in prolonging a problem while they profit off it is a good thing, no.

They're profiting off helping to solve the problem through early warning and detection. And by keeping their customers safe from stuff like this.

Seems good to me. I want more attention and more tooling around this problem. You seem mad at them for helping solve a real problem?


You could at least offer some kind of substantive criticism of the tool (“socket”).

Do I need any? Automated tools cannot prevent malicious code from being injected. While they can attempt to evaluate common heuristics and will catch low-hanging malware, they are not foolproof against highly targeted attacks.

Either way, the parent post is clearly ambulance chasing rather than having a productive conversation, which should really be about whether or not automatically downloading and executing huge hierarchical trees of code is absolutely fucking crazy, rather than a blatant attempt to make money off an ongoing problem without actually solving anything.


When we find malware on any registry (npm, rubygems, pypi or otherwise), we immediately report it to the upstream registry and try to get it taken down. This helps reduce the blast radius from incidents like this and mitigates the damage done to the entire ecosystem.

You can call it ambulance chasing, but I think this is a good thing for the whole software ecosystem if people aren't accidentally bundling cryptostealers in their web apps.

And regarding not copying massive trees of untrusted dependencies: I am actually all for this! It's better to have fewer dependencies, but this is also not how software works today. Given the imperfect world we have, I think it's better to at least try to do something to detect and block malware than just complain about npm.


So instead you prolong the problem while making money? Nice!

I’m all for thinking about second, or third, or fourth order effects of behavior, but unless you have proof that Socket is doing something like lobbying that developers keep using NPM against their own best interests, frankly, I don’t know what your point here is.

> Do I need any? Automated tools cannot prevent malicious code from being injected. While they can attempt to evaluate common heuristics and will catch low-hanging malware, they are not foolproof against highly targeted attacks.

So just because a lock isn't 100% effective at keeping out criminals we shouldn't lock our doors?


I'm not sure how that relates to the company ambulance chasing on what should be a public service announcement without a shade of advertising.

That’s like a lock company parading around when their customer's neighbour is murdered during a burglary but the customer wasn't, because they bought a Foobar(tm) lock.


The more tools that exist to help find vulnerabilities, the better, as long as they're not used in a fully automated fashion. Human vetting is vital, but using tools to alert humans to such issues is a boon.

For those interested, points associated with this post spiked to at least 4 then dropped back to one. Take of that what you will.

Signing doesn't protect against maintainer sabotage, but it could theoretically help if the registry were ever compromised. It mainly works to prevent MITM-type attacks on the package distribution itself.

In the case of central package managers like rubygems/npm/cargo/etc., these benefits are very speculative, but there is probably some merit to adopting this approach in distributed ecosystems like Go.
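
To be concrete about the mechanism: verification of a detached signature is the cheap part. A minimal Ed25519 sketch using Node's built-in crypto (file names are hypothetical; the genuinely hard part is key distribution and trust):

    // Minimal sketch: verify a detached Ed25519 signature over a tarball.
    // Signing doesn't address maintainer sabotage, only tampering between
    // the signer and you. All file names here are hypothetical.
    import { verify, createPublicKey } from "node:crypto";
    import { readFileSync } from "node:fs";

    const tarball = readFileSync("pkg-1.0.0.tgz");
    const signature = readFileSync("pkg-1.0.0.tgz.sig");
    const publicKey = createPublicKey(readFileSync("maintainer.pub"));

    // For Ed25519 the digest argument is null (no separate hash step).
    const ok = verify(null, tarball, publicKey, signature);
    if (!ok) throw new Error("signature mismatch: refuse to install");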



BlueSky is still working great.


It had a pretty massive outage on Feb 26 FWIW. And Feb 19.

I don't recall seeing it on HN though.


Unfortunately I got banned on BS.


Why?


If the self-evaluation makes it better, then why not do the self-evaluation as part of the normal RAG workflow?


For what it's worth, my kids really like this project.


This seems really badly argued. The second version seems much worse and harder to extend. It looks like classic ORM-style database abstraction wrapped with hand-written types. This type of code usually leads to inflexible data models and inefficient n+1 query patterns. Relational algebra is inherently more flexible than OOP/ML-style type systems, and in practice it's usually better to put as little clutter as possible between your code and the db queries.
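
To make the n+1 point concrete (db.query here is a hypothetical client, purely for illustration):

    // The n+1 pattern the typed-wrapper style tends to produce.
    async function getPostsNPlusOne(
      db: { query: (sql: string, args?: unknown[]) => Promise<any[]> }
    ) {
      const users = await db.query("SELECT id, name FROM users"); // 1 query
      for (const u of users) {
        // n more queries, one round trip per user
        u.posts = await db.query(
          "SELECT * FROM posts WHERE user_id = $1", [u.id]);
      }
      return users;
    }

    // Staying close to the relational model: one round trip.
    const onePass = `
      SELECT u.id, u.name, p.*
      FROM users u LEFT JOIN posts p ON p.user_id = u.id`;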


Collisions are violations of the pairwise non-intersection constraint between bodies. Collision forces are Lagrange multipliers of these constraints. Collision normals are the (normalized) partial derivatives of the constraint function with respect to one of the bodies' configurations.
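
In symbols, sketching the standard formulation (my notation):

    % Non-intersection constraint between bodies with configurations q_1, q_2:
    C(q_1, q_2) \ge 0
    % Contact force on body i is the Lagrange multiplier times the gradient:
    F_i = \lambda \, \frac{\partial C}{\partial q_i}, \qquad \lambda \ge 0
    % Contact normal: normalized gradient w.r.t. one body's configuration:
    n = \frac{\partial C / \partial q_1}{\left\lVert \partial C / \partial q_1 \right\rVert}
    % Complementarity: force acts only while the bodies are touching:
    0 \le \lambda \perp C(q_1, q_2) \ge 0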


That sounds like the kind of thing that works if you're doing physics at 1kHz+ with an integration algorithm that has good numeric stability and honours conservation of energy. But in games, we're often running physics at down to 30Hz using some ad-hoc Euler-Cromer, which requires very different approaches.
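
For reference, by Euler-Cromer I mean the usual semi-implicit step:

    // Semi-implicit (Euler-Cromer) step: update velocity first, then
    // position using the *new* velocity. Much better behaved for
    // oscillatory systems at coarse timesteps (e.g. dt = 1/30) than
    // explicit Euler, but it doesn't exactly conserve energy.
    function step(x: number, v: number, a: number, dt: number): [number, number] {
      const vNext = v + a * dt;
      const xNext = x + vNext * dt;
      return [xNext, vNext];
    }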


It's still the same principle even in games. If you are trying to explain where forces come from and how resolution works, you need to ground it in something. Otherwise you are just adding extra assumptions onto assumptions.


In proper physics simulations, everything's about forces, most things are springs, and you never teleport stuff. In games, modelling the ground as a spring really doesn't work and teleporting entities when they collide with parts of the world often makes a lot of sense. It's just not the same.

EDIT: If I'm incorrect, please explain how. I've written some game physics systems and seen some proper physics sim systems and these comments reflect my understanding of the situation, and if I've said something wrong, please correct me instead of just downvoting.


The principle of least constraint is the basis for contact forces in rigid body mechanics. This has been known since the days of Gauss and Hamilton, and it is fundamentally how restitution and collision forces are derived in Lagrangian mechanics. There's a long literature on this going back more than a hundred years.

It's true that some commercial solvers like Ansys use spring/penalty methods, but this is because spring forces are easier to couple to other solvers. It's harder in the Ansys force/velocity formulation to combine things like elasticity and fluids with their rigid body solver. To deal with the instability of systems of many stiff springs, they have to take many small timesteps to avoid convergence issues.

More recently, techniques like XPBD, which use purely positional constraints and variational methods to combine many different types of physics simulation, have been gaining popularity, particularly in film. There's a really great and approachable series of videos by Matthias Muller on YouTube that goes through how to implement all this in JS: https://matthias-research.github.io/pages/

Finally, it's funny you should mention games, since many older games used spring methods for physics. It was only when constraint-based solvers became popular after Havok/Half-Life 2 that we started to see games with real rigid body dynamics and stable stacking of boxes. Older physics games like Trespasser ( https://en.wikipedia.org/wiki/Trespasser_(video_game) ) had many bugs due to their hacky spring physics. For a good explanation of how games do it today, look at Erin Catto's work on Box2D: https://box2d.org/publications/
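
The kernel of the Box2D-style approach is small. Here's a toy 1D version of a single contact solve, heavily simplified from Catto's sequential impulses:

    // Toy 1D sequential-impulse contact solve (after Catto, simplified).
    // An impulse lambda >= 0 along the contact normal removes the
    // approaching velocity, with restitution e in [0, 1].
    interface Body { invMass: number; v: number; }

    function solveContact(a: Body, b: Body, e: number): void {
      const vRel = b.v - a.v;   // relative velocity along the normal
      if (vRel >= 0) return;    // already separating: no force
      const effMass = 1 / (a.invMass + b.invMass);
      const lambda = -(1 + e) * vRel * effMass; // impulse magnitude (>= 0)
      a.v -= lambda * a.invMass;
      b.v += lambda * b.invMass;
    }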


Neat! Do you have a resource that explains this perspective further?


Posted a reply here https://news.ycombinator.com/item?id=40466855

This is a specific reference on how constraints model contact between rigid bodies https://box2d.org/files/ErinCatto_UnderstandingConstraints_G...

Most games since Half-Life 2 use constraint forces like this to solve collisions. Springs/penalty forces are still sometimes used in commercial physics solvers since they're easier to couple with other simulations, but they require many small timesteps to ensure convergence.


I second this. I would love to learn more.


A TI-92 graphing calculator and the instruction manual. Hundreds of pages of weird math functions to try out and understand. It triggered some kind of weird Pokemon-esque collect-'em-all OCD instinct in me, and I ended up learning programming as a side effect.

I think one problem with a lot of current programming education is too much focus on the grammar and basic constructs and not enough focus on the nouns and verbs. Environments like MATLAB, QBasic, and even JS are great because they have lots of little bits of random stuff you can just start poking at and plugging together. I wish there were more beginner references that start with lots of random facts about assorted bits and bobs instead of trying to preach some boring grand unified theory of coding.


In China, an employer is required to pay something like 30% of your salary if it elects to enforce a non-compete. If the company doesn't pay you, then the non-compete is unenforceable. Assuming you were at some decent compensation, this can actually be quite a bit of money. So if these workers took that pay cheque, they get to enjoy the time off during the non-compete and go party or start a family or whatever. The downside of non-competes in China, though, is that because you are being paid NOT to work, the penalties are way more severe if you start moonlighting or doing something sketchy. I actually kind of like this approach to non-competes, and in some ways it is better than how things work in many US states.

https://goglobalgeo.com/blog/non-compete-clauses-in-china-re...

The big problem with this system is not the penalties for breaking it, but the fact that it can sabotage you if you get hit with a big gap too early in your career. On the other hand, it gives you a nice opportunity to take a year or two off to start a family, spin up a new business, or go back to school.


30% of an entry-level salary is not a living wage, let alone a lot of money.

This guy was getting about $6k per year for his non-compete, and has been sued for $60k. This has effectively destroyed the prospects of this guy’s entire life. He couldn’t mentally endure what they were demanding, and then they gave him a non-living wage while simultaneously forbidding him from finding employment.


FTA:

“Yao’s agreement prohibited him from working for rivals for nine months, during which time he would receive Rmb3,700 ($513) a month. It was too little to live on, Yao said.”


Yeah, and it's very weird to bring a non-compete down on an entry-level worker like this. Under what circumstances is it even worth it for PDD to spend the resources to enforce and monitor this kind of agreement? I wonder if there's more to this story than what's in the article.

Also keep in mind this guy could have easily gotten a different job at a non-rival company while still collecting the non-compete pay, even still doing programming. Something about it doesn't quite add up to me.


> Under what circumstances is it even worth it for PDD to spend the resources to enforce and monitor this kind of agreement?

Do you know that the marginal cost of monitoring an extra ex-employee is that large? If they catch someone like him, it seems like it has to pay for itself.


> they get to enjoy the time off during the non-compete and go party or start a family

On one hand, you have a guaranteed income for a few months, but I wouldn't be starting a family or partying if I was living somewhere where I expected to make over 3x more. Here, rent alone is expected to be a third of your income.


> 30%

> Assuming you were at some decent compensation

Assuming your compensation was well above average and you lived quite frugally and well below your means.

There are countries in Europe where you can claim unemployment even if you left on your own, and that would pay more, so it doesn’t seem like a very good system at all. Of course, nobody would really expect a “communist” country like China to not have garbage-tier workers’ rights.

