Another reminder of how annoying it is for a package system to have unqualified package names.
Having to ask someone to gift a `purescript` package shouldn't even be a thing. It should've been `@shinn/purescript`, and the compiler developers would just create their own `@whatever/purescript`.
This is something Elm and many others got right. https://package.elm-lang.org/ It's just infinitely, obviously better.
You see all sorts of problems because of this, like people "giving packages away" when they quit. Or buying package names. Or coming up with annoying name hacks because the obvious, best name is simply taken. Or people thinking/guessing that `npm install mysql` is the correct/best/canonical package because it's the simplest name, and anyone who publishes a better library has to name it mysql2 or better-mysql, etc. These just shouldn't even be things.
Whenever a large number of skilled people do something for which an alternative is "infinitely, obviously better", there's a good chance that there is more going on than you know.
RubyGems used to be namespaced this way and moved away from it. They didn't do so lightly.
The problem is that ownership, and even names of owners change all the time. In the very very large majority of cases, this change of ownership is an implementation detail that doesn't need to impact package consumers. If you enshrine the owner's name in the package, it means any change of ownership is effectively a breaking change to the package. When you have very large transitive dependency graphs, the result is constant, pointless churn.
It seems like a reasonable fix to this is to prevent individual-name ownership (requiring a group instead) -- this has the benefit that package maintainers who plan to maintain their packages forever can keep the same names, and those that don't can essentially fork their project, stop fixing the older version (@ <maintainer>/<project>), force all changes to go to a new one (@ <group>/<project>), and hand off ownership as necessary.
This doesn't break old consumers and it allows for a pretty graceful migration path (if you want new updates, change your dependency) -- it can even be helped along by marking the old package as deprecated or the repo as archived and what not.
Assume Rubygems chose to operate in that way, where all gems must be owned by a group rather than an individual user.
Then, assume that the most flexible option is for each gem to be owned by a unique group: that way even if two gems are maintained by the same users right now, they use two distinct groups in case that ownership changes in the future.
We might as well just name the “group” the same as the gem name, since only one gem is managed by each group. So now the “purescript” group maintains “purescript”, the “pry” group maintains “pry”, etc.
As syntactic sugar for users, since the group name and gem name will always match, why make them type both? Let’s have all the commands support just referencing the gem name. If somebody wants to fork a gem and release it, their group and gem get a new name.
I think there’s a pretty compelling case that package managers should support group ACLing on publishing (giving multiple humans the first-class right to publish using individual creds to a group namespace, with the ability to add/remove users from the group over time). But once you’ve done that, the distinction between explicit group-name-in-package-path and changing-name-to-fork (so the difference between fork-group/orig-name and orig-name_fork-group) seems to shrink.
For what it’s worth, I think I actually like the idea of mandating a 2-part namespace, as the way to force “long” names (where “long” means “with enough context to make forking easier and more obvious”). I just wanted to call out that “force namespacing” isn’t a silver bullet for the issue.
A better feature-add might be supporting dependency replacements. For example, pre-modules, Go had an issue where if you had a dependency and I forked it, I had to go through all my code and replace references to your import path with mine. If I depended on something that depended on you, I was out of luck (or had to vendor and regex or a variety of other hacks). Now, with go modules, I can do a “replace” in my go.mod and sub in my fork for yours. That has enabled me to be much more flexible with my use of forked repositories, and most languages don’t have a direct parallel.
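A minimal sketch of what that looks like in a go.mod (module paths and versions invented):

    module example.com/myapp

    go 1.12

    require github.com/upstream/somelib v1.2.3

    // build against my fork everywhere, including inside transitive dependencies
    replace github.com/upstream/somelib => github.com/myfork/somelib v1.2.4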
> Assume Rubygems chose to operate in that way, where all gems must be owned by a group rather than an individual user.
Sorry, this wasn't my premise -- I meant to have this as an option. As in <user>/<project> or <group>/<project>.
> Then, assume that the most flexible option is for each gem to be owned by a unique group: that way even if two gems are maintained by the same users right now, they use two distinct groups in case that ownership changes in the future.
> We might as well just name the “group” the same as the gem name, since only one gem is managed by each group. So now the “purescript” group maintains “purescript”, the “pry” group maintains “pry”, etc.
I don't agree -- even if it's always group-a/purescript I think the distinction is still important, because in this case previous-group/purescript still exists, but is frozen/archived. Here's how I'm understanding the scenario you laid out:
1. group-a/purescript is created
2. purescript changes ownership, group-b is going to be publishing it going forward
The fact that "group-b" is the "right" purescript is arbitrary/subjective to some degree.
> I think there’s a pretty compelling case that package managers should support group ACLing on publishing (giving multiple humans the first-class right to publish using individual creds to a group namespace, with the ability to add/remove users from the group over time). But once you’ve done that, the distinction between explicit group-name-in-package-path and changing-name-to-fork (so the difference between fork-group/orig-name and orig-name_fork-group) seems to shrink.
I think these two issues are a bit separate. Letting people dynamically change who owns/can publish to a repository is one way to solve this problem, but I think it's more complex than the fork-and-move approach.
IMO if some user wants to give up/transfer their repo, they:
1. find someone else to take over if they want
2. freeze/archive/whatever their repo
3. let the person fork & continue their work
An ownership change should be opt in, unless it was known @ package creation time that ownership would be a shared/rotated/changing/nebulous thing (which would be demonstrated by a group owning the package from the beginning).
I wasn’t implying you claimed that all gems needed a group, I was proposing it as part of the thought experiment. My apologies if that was unclear.
To your list of examples: my point parallels your own, I think. I’m saying that given the “right” version is arbitrary and subjective, the difference between “group-b/purescript” and “purescript-group-b” is effectively nil. More concretely: if namespacing existed, you could fork “group-a/purescript” to “group-b/purescript”, but if namespacing didn’t, you could fork “purescript” to “purescript-group-b”. In either case, dependent projects need to update where they source their dependencies from.
Namespacing, in my experience, tends to make the forking process slightly “cleaner”, because you avoid having a potentially non-“right” “original” (for example, “purescript” tends to look more legitimate than “purescript-group-b”). But some comments in this thread seem to paint namespacing as a hard requirement, or claim that package managers without namespacing are missing a core, mandatory feature. The case I’m presenting is that this isn’t the case: namespacing is a useful feature for several workflows, but adding namespacing doesn’t fundamentally alter the issue.
> I wasn’t implying you claimed that all gems needed a group, I was proposing it as part of the thought experiment. My apologies if that was unclear.
My apologies, I certainly misread your comment.
> To your list of examples: my point parallels your own, I think. I’m saying that given the “right” version is arbitrary and subjective, the difference between “group-b/purescript” and “purescript-group-b” is effectively nil. More concretely: if namespacing existed, you could fork “group-a/purescript” to “group-b/purescript”, but if namespacing didn’t, you could fork “purescript” to “purescript-group-b”. In either case, dependent projects need to update where they source their dependencies from.
I agree -- the effects are definitely similar and almost equivalent. However, does requiring a group/author change things at all? It seems like it could introduce an abstraction layer.
> Namespacing, in my experience, tends to make the forking process slightly “cleaner”, because you avoid having a potentially non-“right” “original” (for example, “purescript” tends to look more legitimate than “purescript-group-b”). But some comments in this thread seem to paint namespacing as a hard requirement, or claim that package managers without namespacing are missing a core, mandatory feature. The case I’m presenting is that this isn’t the case: namespacing is a useful feature for several workflows, but adding namespacing doesn’t fundamentally alter the issue.
I'm on the fence -- I'm not sure if this is a good counter case, but what about the layer of abstraction introduced by the implied/required existence of <group>? You could write code that imports "purescript", but then resolve it later (as some others mentioned, via go.mod or some other modules file that clarifies mappings) to determine which "purescript" that is. Or you could solve this by "alias"ing "project/purescript" to "purescript" (with some similar extra configuration that says "purescript" -> "project/purescript"). I'm not sure if either is better (so basically, whether this indirection should be a "module resolution feature" or a "module aliasing feature"), whether there's any value in forcing one (requiring the existence of <group> would almost certainly force the module resolution approach, but would also break builds the second similarly named packages were published...), or whether they really are just the same.
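For what it's worth, npm can already express the aliasing flavour of this via the npm: protocol; a rough sketch, with the scoped name and version invented:

    // package.json: the bare name "purescript" becomes an alias for a scoped package
    "dependencies": {
      "purescript": "npm:@some-group/purescript@^1.0.0"
    }

    // application code keeps importing the short name
    const ps = require("purescript");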
I also found the page on this by the rust team pretty convincing[0].
For clarity, I think the comment referencing “go.mod” that you’re describing, at least in this thread, is from me :D
I think I agree that the core feature that impacts this issue is what go.mod solves, and what you’re describing: it should be easy and language-supported to sub in one fork of a dependency for another fork, so that users can flip between “group-a”’s purescript and “group-b”’s purescript, regardless of how the namespacing works on the module registry (notably, golang dispenses with a registry entirely: there’s no central system, except insofar as github is used for lots of people’s packages).
OK, let's walk through how the scenario you lay out works in the context of a package graph.
I have my_app, which depends on foo and bar. Both of those use purescript. My app calls into foo which gets some object created from the purescript library and returns it. I then pass that object to bar. For this to work gracefully, they need to have a shared dependency on the same purescript.
(purescript is a weird example to use here, but imagine the shared dependency is a library that provides something like a reusable data structure.)
What happens to my_app when purescript gets passed from group-a to group-b? If foo wants to be on the latest, they need to move over to group-b. But if they do that and bar doesn't, then my_app can't get the latest version of foo. The foo and bar maintainers know that, which means they know they have a disincentive to move over to group-b. Better to stay on group-a and keep things moving smoothly with their existing users.
You end up in a situation where the choice that is better for a single package in isolation (move to the latest version of a dependency) is harmful to the package in context (it breaks shared dependencies and prevents users from upgrading to your latest version).
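To make the split concrete (manifests invented): the two dependents end up pointing at what the tooling considers two entirely different packages, so my_app gets two copies and objects produced by one aren't interchangeable with the other's.

    // foo/package.json -- moved to the new namespace
    "dependencies": { "@group-b/purescript": "^2.0.0" }

    // bar/package.json -- still on the old name
    "dependencies": { "@group-a/purescript": "^1.9.0" }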
This is one of the most important situations to avoid when designing a package manager. As much as possible, you want to give package maintainers the freedom to evolve their package without it destabilizing the ecosystem. There's an argument that this is fundamentally what a package manager is — a tool to let you reuse changing code. If you don't need to evolve the code being reused, then FTP is a perfectly sufficient package manager.
Enshrining ownership in the package name directly confounds that. And, like another comment suggests, if you do that, maintainers will just route around it by creating an "organization" for each package, putting you right back where you started.
You could go farther and use a DNS name as a group name, then publish packages by signing with the SSL key. Anyone who doesn't want to shell out for a domain name could use a registry service that gives subdomains out. Why reinvent the governance wheel?
Their stance seems pretty reasonable though I'm not sure I would have done the same (and it's obviously very likely I would be wrong to do the opposite of what they did):
> Namespacing
> In the first month with crates.io, a number of people have asked us about the possibility of introducing namespaced packages.
> While namespaced packages allow multiple authors to use a single, generic name, they add complexity to how packages are referenced in Rust code and in human communication about packages. At first glance, they allow multiple authors to claim names like http, but that simply means that people will need to refer to those packages as wycats' http or reem's http, offering little benefit over package names like wycats-http or reem-http.
> When we looked at package ecosystems without namespacing, we found that people tended to go with more creative names (like nokogiri instead of “tenderlove’s libxml2”). These creative names tend to be short and memorable, in part because of the lack of any hierarchy. They make it easier to communicate concisely and unambiguously about packages. They create exciting brands. And we’ve seen the success of several 10,000+ package ecosystems like NPM and RubyGems whose communities are prospering within a single namespace.
> In short, we don’t think the Cargo ecosystem would be better off if Piston chose a name like bvssvni/game-engine (allowing other users to choose wycats/game-engine) instead of simply piston.
> Because namespaces are strictly more complicated in a number of ways, and because they can be added compatibly in the future should they become necessary, we’re going to stick with a single shared namespace.
> this change of ownership is an implementation detail that doesn't need to impact package consumers
it does, because having a dependency means having a trusted path from the developer to the packager to the repository into your software and to your customers
frankly you'd want to do what you call "constant pointless churn" at every single version update, because god knows what comes through your package manager, as this and many other articles have pointed out repeatedly. The fact that you are advocating to skip it, not just for version changes but for whole ownership changes, is the opposite of a security-oriented mindset.
Every change is, period. There's no guarantee that the same owner (individual or group) won't spontaneously go rogue and introduce harmful changes.
This is why modern package managers have lockfiles and give you control over when you upgrade any package version, regardless of ownership change. Ultimately, you are responsible for the code you reuse.
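For reference, this is roughly what a lockfile pins today (package picked arbitrarily, version and hash invented/elided); nothing changes under you until you choose to re-resolve:

    "rate-map": {
      "version": "1.0.3",
      "resolved": "https://registry.npmjs.org/rate-map/-/rate-map-1.0.3.tgz",
      "integrity": "sha512-...elided..."
    }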
> The problem is that ownership, and even names of owners change all the time.
If ownership changes, I want to know. It's perfectly fine to cause a little bit of breakage there that is easily fixed in a semi-automated, supervised way.
If a name changes, there could just be an alias, unless there's literally a trademark dispute underway, in which case the ownership-change process can be applied.
> When you have very large transitive dependency graphs, the result is constant, pointless churn.
A very large transitive dependency graph is a terrible thing to have. It shouldn't be made convenient to have it. Your package manager should scold you for it!
In all seriousness, if you have such a huge graph, then the amount of churn caused by package updates will likely be the dominating factor, not the change of ownership.
> is easily fixed in a semi-automated, supervised way.
It's not. Once you have shared, transitive dependencies, the application author consuming a package is no longer the one who authored or is in control of the dependency on that transferred package.
> A very large transitive dependency graph is a terrible thing to have.
This is a fair subjective preference. Unfortunately, it flies in the face of reality. Anyone who maintains a package manager will tell you real-world package graphs are typically quite large and deep. If users didn't want that, they wouldn't do it.
> Your package manager should scold you for it!
Users don't generally like or use tools that scold them for doing what they want to do.
Why not resolve the owner name to some kind of uuid at the time of package require/install and save it in the lockfile? That way, a package upgrade can detect if the owner name has changed and fetch the correct package.
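Something like this hypothetical lockfile entry (the "ownerId" field doesn't exist in npm today, it's purely illustrative); on upgrade, the client could warn or refuse if the name now resolves to a different owner id:

    "some-package": {
      "version": "1.2.3",
      "resolved": "https://registry.npmjs.org/some-package/-/some-package-1.2.3.tgz",
      "ownerId": "4f7c2a9e-0000-0000-0000-hypothetical"
    }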
When package owners change, or change names, would the original package namespace still stick around for all the existing versions? Also, would there be a pointer built in that let you know about the new owner namespace when you try to upgrade automatically?
If so, I really don't see the practical problem. Actually it seems like useful information. If I'm upgrading and the package owner changed, that's definitely something I want to at least be told so I can look into whether that's important depending on the usecase. In that sense, it is a potentially breaking change. Particularly from a security standpoint.
If it's not the case, and everything including the version history is ported over to the new namespace and you're simply forced to change stuff just to get it working again, I agree this is just pointless churn.
"You're on the latest version 1.2.3 of @package-a/some-package, however @person-a has officially transferred ownership of some-package to @person-b and there is a newer version 1.2.4 available at @person-b/some-package. If you'd like to upgrade, please update your dependency to @person-b/some-package"
That's the mechanism; I was asking about the benefit of namespaces in that case. With this kind of ownership transfer you lose all the alleged security benefit of namespaces.
I think we're asking the same thing, or I've not explained well. I'm saying when ownership transfers, it'd be good to keep all the existing version history on the original namespace, exactly so you're not forced to update all your dependencies in case you don't need to upgrade and/or don't want the updates of the new owner.
> Whenever a large number of skilled people do something for which an alternative is "infinitely, obviously better", there's a good chance that there is more going on than you know.
The Maven ecosystem has been doing things better than NPM for at least 10 years. The onus is on the NPM team to justify their pathologically bad solution; it's also on the JS community to back off their "everybody can contribute" pipe dream. Maven works because in practice, we have a few organizations contributing packages, not thousands of individual, anonymous developers.
I think the problem is that Java people are considered "old and lame", whereas the Javascript people are considered "young and immature", respectively. While this assessment is somewhat accurate, it inhibits the process of learning from each other.
On further reading, it's debatable whether the "maliciousness" is not the community attempting to take away a project from its founder. Similar to how that toxic Twisted dev tried to push responsibility for maintaining their broken branch of Python 2 to the Python core lib dev.
You can fork, but then you're responsible for more than the "fun parts". The project founder might introduce measures to prevent fragmentation due to your bad decisions as part of damage control.
Java has used namespaced packages since 1996, and it's been a roaring success. There have certainly been problems - for example, there is currently some absolute nonsense going on about the handover of big chunk of stuff from Oracle to the Eclipse Foundation [1] - but they have always been manageable, and don't come close to outweighing the benefits.
Java was indeed a success, but its dependency management is archaic at best. Yes it's better than FORTRAN or C, but I wouldn't use it as a reference in 2019…
it's signed, hierarchical, decentralized, supports addressing multiple artifacts within the same module by type, includes a standardized way to retrieve sources or docs, supports pinning the version of transitive dependencies so you can manually resolve conflicts and on top of that can be extended to use completely alien package sources (which is unholy if you ask me, but it helped me survive the great osgi catastrophe of the 2010s thanks to tycho)
How would that have helped things in this case? If we go with the hypothesis of an angry maintainer getting revenge, would they not be just as angry at their project being forked by the community, and just as able to sabotage it via other libraries still in their namespace? (And perhaps more willing, since their original name is still around.) If we go with the hypothesis of a compromised account, people will still be installing the package from the original maintainer's namespace, so they'll still get the malicious code.
And in all likelihood, because the story here is that the maintainer intentionally (though begrudgingly) transferred ownership, they would have intentionally (though begrudgingly) given other people access to the package in their namespace, simply because people value the namespaced name. (If they didn't, and everyone was immediately happy to install anyone/purescript, then namespacing doesn't solve any problems and also creates some!) And the situation would have played out as given.
Why would anyone who owns a namespace be willing to leave a package that's moving to another maintainer within their own namespace? A namespace tends to come with a reputation, and if you give an outsider access to the namespace, the reputation can change without the owner's consent. No, I don't think I'd be allowing @delinka/ExcellentPackage to be maintained by someone else. They can fork it to @fredralphbob/ExcellentPackage and I'll turn off @delinka/ExcellentPackage when I'm done maintaining it. Yep, it'll break dependent installs, but that's the point: get dependents to move to the proper version. I think of a namespace like a domain. If I host a project at project.delinka.engineer, I'm definitely not transferring access to the subdomain to a new maintainer.
Yep, we still have people who would eschew best practice and go against my method above, but that happens everywhere. Just because a solution isn't perfect doesn't mean it's not an improvement.
> Why would anyone who owns a namespace be willing to leave a package that's moving to another maintainer within their own namespace? A namespace tends to come with a reputation, and if you give an outsider access to the namespace, the reputation can change without the owner's consent. No, I don't think....
Good first-principles argument, but in practice, this happens in ecosystems that do have namespaced packages. Off the top of my head:
- Until recently, kennethreitz was a GitHub organization so other people could manage kennethreitz/requests etc.
- Foursquare's Android app is still com.joelapenna.foursquared, which originally was a third-party app that got adopted by Foursquare and turned into their official app. Joe never worked for Foursquare.
> Yep, it'll break dependent installs, but that's the point: get dependents to move to the proper version.
Probably more realistically, you'd put a note in your @delinka/ExcellentPackage repo readme saying that the package is deprecated and to use @fredralphbob/ExcellentPackage instead going forward, and simply not publish any more new versions. In fact I'm pretty sure npm has a built-in system to warn users of a package that its canonical name has changed.
Imo your solution is just a band-aid. The real solution is having a distribution of packages which are maintained and supervised by a group of people, like the Linux distribution maintainers or the group that develops the language.
Community-contributed packages should be declared "install at own risk", like in the Arch Linux AUR.
All this is already solved. But people want to reinvent the wheel and ride the user generated content train.
> That's just shifting the trust to a different (smaller) group of people.
Practically speaking, shifting trust from a large, anonymous group of people, to a small group of people who are known and trusted by the community is a pretty good solution.
You may be correct for javascript, whose ecosystem is notoriously unstable and the equivalent of "building on sand".
eg write some code in a JS framework, then go off and do something for 18 months. Come back and there's a very good chance the entire framework has been obsoleted and replaced, perhaps even several times.
This is a problem for _any_ active development package, to the point where most language communities have (previously had) instructions on how to fix breakage of the language introduced by, for example, arbitrary Debian packaging restrictions without working with the language maintainers.
I've seen this in _any_ environment that has a relatively active package management system of its own (e.g., Ruby, Perl, and Python included)—and it _still_ happens in some Java cases. The only _sane_ thing to do is to completely ignore the OS package management system and to package your applications as relocatable install packages complete with all dependencies you need, modulo the minimal OS stuff required.
It does work, in practice, and has for years. Every package in the Debian (or Ubuntu, or FreeBSD, etc.) package repos depends only on other packages in those repos and on the base OS. It works fine.
Packages being months behind the latest version is a feature, not a bug — it means things will only be randomly changing under your feet rarely, with the exception of security fixes.
> Packages being months behind the latest version is a feature, not a bug
I've struggled to convince devs and management about this in the past few years. Everyone's gone crazy about the latest and greatest set of features with no respect for stability, maintainability, etc.
To be fair, upgrading from very old libraries to very new ones because you hit an already-fixed bug can be a long process... but I have never worked on a project that stayed on new libraries continuously, so... not sure how it works the other way.
> Packages being months behind the latest version is a feature, not a bug — it means things will only be randomly changing under your feet rarely, with the exception of security fixes.
If things are "randomly changing" by updates, that means the upstream package isn't following semantic versioning practices, and more importantly, isn't preserving backwards compatibility with their releases. That's a mark of bad software development practices, and it also isn't solved by an arbitrary wait period: that just means months after the breaking change is done, you finally notice and complain, but most maintainers are going to (rightfully) ignore you by that point.
I'd make the same argument in general software quality: with new features, often come new bugs, so by delaying, you can delay introduction of those bugs. However, bugs are easier to fix the sooner they're found, so again, quick turn-around improves things, even if it sometimes causes short-term pain.
This is one thing where NPM is definitely miles ahead of APT. With APT, you get a single version of each package, so it's all or nothing. With NPM, you can specify `1.1.x` so even if a version 1.2 or 2.0 comes out, you're on the stable old one. The closest thing that seems to happen with linux packaging is on a major version (with backwards-incompatible changes) a new package name is created with a "2" on the end, to signal this incompatibility -- how is that anything but a hacky workaround to not having proper versioning support?
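For example, in package.json (names invented): "1.1.x" takes any 1.1 patch release, "~2.3.0" allows only patch-level updates, and "^4.1.0" allows anything up to (but not including) 5.0.0.

    "dependencies": {
      "some-db-driver": "1.1.x",
      "another-lib": "~2.3.0",
      "yet-another-lib": "^4.1.0"
    }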
>With APT, you get a single version of each package, so it's all or nothing.
It's definitely not used as often as in npm packages, but you can use =version after your package name to apt install a particular version, or apt pinning for more complex setups.
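For example (version string illustrative):

    # install a specific version of a package
    sudo apt install nginx=1.14.2-2+deb10u3

    # or keep whatever is currently installed from being upgraded at all
    sudo apt-mark hold nginx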
The point your parent is making is that with npm, you can install one version of a package per project you’re working on, instead of one version on the system. Especially if you’re working on multiple projects or multiple versions of the same project, this is required.
But it doesn't work for libraries people will be using in development. Often you are waiting for a handful of packages to introduce specific features or bug fixes and need them the moment they are available. NPM isn't user space, it's dev space. Timeliness is the maxim.
I’m not a JS developer so maybe I’m missing something, but how do you use a library during development and not use it in production?
In the C++ world, I can’t imagine a situation where you would need to depend on, for example, libjpg while developing, but not need to read JPEG files in prod/end-user-space.
There are a slew of dev-only dependencies in JS-land. Packages that run local servers for hot-reloading during development. Test runners, linters, TypeScript compilers, SCSS compilers, and so on. None of those things need to be included with the bundled product.
Check out Electron or React starter apps for an example, their boilerplates should have dozens of examples.
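In package.json terms the split looks like this (versions are just examples); `npm install --production` skips the devDependencies block entirely, and none of it ends up in the shipped bundle:

    {
      "dependencies": {
        "react": "^16.8.0"
      },
      "devDependencies": {
        "webpack-dev-server": "^3.7.0",
        "jest": "^24.8.0",
        "eslint": "^6.0.0"
      }
    }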
Since another user explained that point, I would also add that in the "move fast and break things" environment of the web, for better or for worse, responsive and automated library updates are often desirable or even required functionality which can be tied to substantial real-world profits if something suddenly goes wrong somewhere in the stack.
Web development is one of my tasks and thankfully we don't move fast and break things; rather, we rely on proven and established development stacks like JEE, Spring, ASP.NET and VanillaJS.
There are some reasonably stable packages in the npm ecosystem, it's just pretty much the Wild West out there. It really depends on what you're building. It's pretty hard to get a new SPA started these days without a development environment which uses npm, especially if you're running lean.
The problem is keeping your dependency tree reasonable. Even pulling in a couple of packages might lead to hundreds of dependencies. And even if you have the sense not to use an external package for something as simple as left-padding a string, someone somewhere in that dependency tree might not feel the same way. And that's all it takes to bring a chunk of the web down.
All of these problems of course are due to JS not having a mature ecosystem or standard library. People shouldn't need to reinvent the wheel every hour, nor should they be pulling in modules less than a dozen or two lines of code. That, and the constant misguided financial incentive to deploy ever more complex functionality over http are what lead to this aggressive push for cutting edge tools.
Why do you think this? OSes like Debian don’t just pull packages from upstream automatically. Packages have actual maintainers affiliated with the OS, not the upstream community, and it’s those maintainers who build packages for the OS repos.
My software is packaged by Debian, and the update process is me notifying the Debian Developer (DD) responsible for the package of the new version => said DD pulling the new tarball from GitHub. Pretty sure no one’s gonna notice until after it’s pushed to Debian FTP if I introduce some subtle malicious code. Point is DDs don’t review version deltas for the most part, so when the upstream is compromised, they add little to your defense (other than security by outdatedness, I suppose).
Sure, but a very long time will pass between it being uploaded and being merged into stable, so there is a lot of time for people to discover your malicious code.
It is not like npm or crates.io where you can just upload whatever random code you like and people will start picking it up immediately.
This doesn’t contradict my point. I never said that Debian maintainers are more trustworthy than upstream 100% of the time.
I merely said that Debian packages are built, uploaded, and vended by Debian package maintainers, not by upstream. Whether that makes them more trustworthy or less is a different question.
Can't speak for parent, but because they have historically shown themselves to be trustworthy.
I don't know much about the "NPM community," but judging from the bits I do hear, I'm not sure they have done the same. OTOH, I understand that as an outsider that only pay attention when shit blows up, I'm only seeing the bad parts.
Those packages provide pretty much every software capability computing has to offer. It's a pretty vast set of capabilities, though npm/javascript itself has gotten way out of hand over the last few years too. ;)
You're being downvoted because that is mean-spirited, but the reason we needed these packages is to find out what the right size of package is. We now have a lower bound.
I'm not sure what's mean-spirited about the simple statement that the size/complexity of the npm-eco-system arises only from personal/corporate business-decisions.
In my opinion there should be different namespaces, similar to what the parent mentioned. There can still be the public namespace as there is now, but there can also be ones registered, such as google/mysql-golang. The other alternative is that it's just linked to Github/Gitlab, but then you run into potential naming collision issues.
Java got that part right in the 1990s, when it was the cool Web-savvy language. At the time, I especially liked how they piggybacked onto the existing DNS domain name control, avoiding having to create a new centralized registry to keep names unique. (Of course, more could be done beyond that, today.)
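For reference, the reverse-DNS convention shows up directly in Maven coordinates (version here is just an example):

    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>28.0-jre</version>
    </dependency>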
But it's easy to connect to domain name control. Allow uploading to maven central only after domain verification. Java did not do that, AFAIK, but other languages can do that.
I should've been more clear about what I was saying with "control". The recommendation is connected directly to domain names you control. What was not done at the time was enforcing that, or using that as a basis of authentication or distribution, which is part of why I said more could be done, today.
The purpose of the scheme was to make namespace collisions less likely and that's about it, though. And people regularly deviate from it, both then and now. Not using it as a basis for authentication or distribution probably remains a really excellent idea.
This is not a package namespacing issue. This is not a technical issue. It is mechanically no more difficult to change from "@shinn/purescript" to "@whatever/purescript" than it is to change from "purescript" to "purescript-whatever", except that in the common cases (where no disputed community moves have occurred without a corresponding name change) everyone has to include an author. The author can still "give away" or sell package names; a lot of the value is not in holding the name, but in the existing installed base and mindshare.
These are the social issues associated with a hostile fork.
I don't actually feel like this solves anything, but instead just adds one more avenue for name hijacking. Now instead of trying to register an angular package first, I'll try to register a google namespace, or similar. If google is taken, I'll make an official_google, or g00gle etc.
Mostly because the vast majority of JS developers don't seem to be aware of the rest of the software universe, and so seem to reinvent the wheel, rediscover the worst of software's history, and discard the most useful of software findings with shocking regularity.
NPM tends to reinforce the worst of the JS world's tendencies.
Primarily, I end up basing this off of the types of libraries being developed for Javascript, and what kinds of articles and thought leaders JS developers tout as innovative.
So by looking at a tiny fraction of the 1 million+ npm libraries, and articles by a few dozen people on the internet, you are able to conclude that the
> vast majority of JS developers don't seem to be aware of the rest of the software universe
Forgive me if I dismiss this as business-as-usual JS bashing.
I beg to differ; witness the fibur[0] Ruby gem. Written by Aaron Patterson (@tenderlove) to show that using threads in Ruby is very easy. The gem consists of this single line:
Fibur = Thread
Of course, he did it as a joke, and it doesn't excuse the serious packages that NPM contains, but it does prove that other languages have single line packages.
I cannot talk for all other languages but at least in Java
1)All packages that are published can never be unpublished or re-released from a different contributor
2)Packages are namespaced
3)Nobody downloads packages directly from the internet. You always use a proxy which in most companies has security scans.
4)There are no "local packages" (like the node_modules dir), so it is impossible for the checked out source code to override your own vetted and secure package.
Not directly related to the incident of the original post, but I was mindblown when I realized that you can unpublish npm packages
1) Same applies for npm (granted, this was only fixed after the left-pad incident, and npm was not the only language's registry to have that issue).
2) As mentioned elsewhere in this thread, npm supports namespaced packages, but they are not mandatory. There are other major languages' registries in same situation.
3) Can you back up 'nobody'? I would suspect a lot of companies don't use a proxy. Some JS teams also use an internal proxy for npm, but it is obviously additional infrastructure to set up/maintain, which has a cost.
4) Never heard anyone raise this as a problem before.
> Not directly related to the incident of the original post, but I was mindblown when I realized that you can unpublish npm packages
You can't, with the exception of a 72 hour window, to allow for accidental publishing [1].
1) The fact that an incident was needed to force something the Maven registry has done since inception actually reinforces the original argument, doesn't it? (that JS developers did not look at what other languages were already doing)
2) Again, whoever thought that namespaces should be optional instead of required "doesn't seem to be aware of the rest of the software universe". Who took this decision? Why?
3) Do a survey on your own. Ask Java developers you know if they use Artifactory/Nexus in their job and note down the percentage. Then ask the same question to JS teams
4) Just because something hasn't been exploited yet doesn't mean it shouldn't be fixed. By that logic, if left-pad hadn't happened, would you say that unpublishing packages had not been raised as a problem yet?
1) As I said, it was not just JS in this situation at the time, it also applied to other major registries like PyPi. So your point does not reinforce the original attack on JS developers. Congrats to Maven for getting this right.
2) Namespaces were added later. It wasn't "a decision to make them optional". Also, check out the discussion here as to how namespaces don't solve this issue; this point is largely moot.
3) I'm not the person blanket attacking a community. Or making unlikely assertions that "nobody" in the Java world installs direct from the internet.
4) You work for a Linux distribution. You have several global npm modules already installed that are safe and secure. You download source code of a killer app in order to package it. You check the source code itself and it is safe. However you didn't realize that there was a local node_modules directory in the git repo that contains package foo-1.2.3 with replaced code that does bad things. That package overrides your global one. You ship a compromised app.
The above scenario is impossible with maven, because there is no concept of local modules. Only the "global" ones will be used when you package an app. So if you check just the source code and it is safe then everything is fine.
Sounds like your argument boils down to "check the source carefully", not "local modules are evil".
If the source was checked carefully, you'd notice a checked-in node_modules dir.
If you didn't check the source properly, you could install a module that seems like it's using a known package, but really is using its own malicious version of the global package.
> Perl (cpan), Python (pip or conda), Ruby (gem), and Rust (cargo) all behave as NPM does
Wrong.
None of them have chosen the "micro-package" way of NPM.
None of them have an average size of 3-10 lines of code per package, which NPM has for many MANY packages. Cargo and pip have many packages, but reasonably sized packages (> 100 lines).
The micro-package philosophy that NPM chose, meaning every single line of code can have its own package, increases the number of dependencies required to create anything in JS by several factors.
A simple hello world in React already has ~100 dependencies.
Nobody is closing their eyes and singing "everything is fine." A large number of small packages is a good thing, and a technically strong ecosystem supports it. Having spent the last week in the internals of glibc chasing bugs in code that has no reason to be jammed into the same library that handles initial program loading, I can attest that there are good, justifiable, technical reasons to do things the NPM way, and I'm glad that there are smart, qualified, talented people implementing that.
I know it's hard for you to imagine, but perhaps the JavaScript ecosystem has some good things about it.
> I know it's hard for you to imagine, but perhaps the JavaScript ecosystem has some good things about it.
There is good in every ecosystem. But a simple search for "node_modules" on Twitter should probably convince you that the good of JS is not in its package system.
To be fair, even the Node.js author agrees on that.
Modularity does not mean "split your code at the atomic level".
Including a package named "one-time", bundled several times in two different versions, to do something highly relevant and technical like "call a function once".
I have no doubt that it is highly complex code that indeed requires two packages..... Irony.
Little question: what would have been the probability of purescript getting malicious code if its dependency tree were something reasonable... let's say 20 packages instead of the current ~200?
The crab-grass-like dependencies of many/most NPM packages are scary enough, and then they (or you?), I guess because of lazy loading to improve responsiveness, update as you watch... It's like a scene out of an alien monster movie, where the creature keeps growing more limbs.
conda uses namespaces (called channels). If you want a package from a random person, you have to explicitly add their channel (either globally or when installing the package).
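For example (package name invented):

    # pull a package from a specific channel for this one install
    conda install -c conda-forge some-package

    # or trust the channel globally
    conda config --add channels conda-forge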
That is why I cited the behaviour and culture of JavaScript-only devs as the primary explanation, with NPM's model reinforcing those issues - issues that do not exist in the same manner in those other languages' ecosystems and cultures.
cargo, at least, does this for the reason Fellshard gives - the Rust team essentially commissioned a copy of Ruby's bundler, without considering whether there was anything to learn from any other language's ecosystem.
Well, okay, if your point of view is that behaving like multiple other major languages and specifically taking the lessons (good and bad) of a specific existing language ecosystem into account before doing your own thing is the same as disregarding other languages and reinventing the wheel, I'm not sure what words mean anymore.
> Well, okay, if your point of view is that behaving like multiple other major languages and specifically taking the lessons (good and bad) of a specific existing language ecosystem into account before doing your own thing is the same as disregarding other languages
If you've only looked at two or three languages that not coincidentally happen to do things pretty much the same as each other, and not looked elsewhere, then you are indeed disregarding other languages.
> and reinventing the wheel
I didn't mention reinventing the wheel.
> I'm not sure what words mean anymore.
Agreed, but i'm not sure i can help you with that.
Cargo is more like npm than Bundler in this regard, as Bundler does not let you have multiple versions of a package at the same time. That lesson was learned from npm, though implemented in a different way.
There's a few different elements of the design space. But, while Cargo can use crates.io, it's distinct from it, so on first principle, this would be a crates.io feature.
That being said, Cargo would also have to understand it, because the Rust language does not understand namespaced external packages, so you'd either have to change the language, or change Cargo to do something to paper over that somehow.
> When we looked at package ecosystems without namespacing, we found that people tended to go with more creative names (like nokogiri instead of “tenderlove’s libxml2”). These creative names tend to be short and memorable, in part because of the lack of any hierarchy. They make it easier to communicate concisely and unambiguously about packages. They create exciting brands.
I will never stop admiring your ability to see the bright side of things.
PHP had the luxury of coming out with a package manager (Composer) later, and learning from others before it. (2012 vs 2010 for npm).
Npm similarly improved on a lot of package managers that came before (e.g it's superior to Pip, which doesn't resolve dependencies [1]).
> It makes me wonder why npm hasn't already moved to namespaced package names.
Also, npm does have namespaced package names, the problem is that it was introduced later, so isn't mandatory. I'm not sure how they could make it mandatory without breaking everyone?
Apache Maven was released in 2004... PHP just looked at what Java was doing. Might be a wrong idea for some things but it definitely helps for stuff such as package management.
There's nothing preventing whatever_purescript.
Even if you have namespacing, anyone who owns a spot in it can sell or rent their spot to a malicious actor.
Java got this right with reverse domain names over a decade ago. To use a namespace you have to own the corresponding domain. Simple and effective, abuse-resistant package naming.
I honestly don't understand why people use packages like this. If I need this functionality, I will simply write my own. Plus, I would never be able to find this specific package. I guess PureScript uses this because its author is also the author of rate-map.
I’ll give you one: it’s code already written and tested by > 1 person, edge cases already figured out. Saves you time. The gains are small but quickly add up.
This is why lately I’ve been a fan of very extensive standard libraries (like Crystal has) - it’s like having a huge repository but vetted by the same team and without any of the package management drawbacks.
> it’s code already written and tested by > 1 person, edge cases already figured out.
It is tested? Edge cases figured out? Essentially for this code?:
return start + val * (end - start);
Sure, if it's running in a hostile environment, you might need to do those sanity checks on your parameters - but I have a hard time imagining such a situation. If you actually need to "Map a number in the range of 0-1 to a new value with a given range" in your own code, can't you guarantee that the variables are all numbers? It's your responsibility as a developer to know your code, and the data your code is handling.
> Saves you time.
There is something really wrong if finding a package to do this niche thing is faster and more optimal than just writing out that one line of code.
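If you do want it as a named helper, it's still a one-liner you can own yourself (a sketch, not necessarily the package's actual API):

    // map a value in [0, 1] onto the range [start, end]
    const rateMap = (val, start, end) => start + val * (end - start);

    rateMap(0.5, 10, 20); // 15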
Also: the way the library handles those edge cases isn't necessarily the way you want. Case in point: rate-map throws exceptions in situations where you might expect it to fail more gracefully.
What gains? That line wouldn't take much resources or time to figure out and test. Now you have yet another dependency that could be injected with bad code in the future...
[C]ode already written and tested by > 1 person, edge cases already figured out wastes time when that person was not you. This costs you time. The losses are small but they add up.
> This is why lately I’ve been a fan of very extensive standard libraries (like Crystal has) - it’s like having a huge repository but vetted by the same team and without any of the package management drawbacks.
Having had some involvement in the early days of node, I had imagined there being something like the Python stdlib. When the npm world grew, I thought "oh, that's pretty cool. It's neat how npm can handle multiple versions of the same package."
Now, I'm in absolute agreement with you. There are definitely downsides to a large standard library, but I think the upsides are worth it if that library is maintained.
There is not even a link to the source code on the npm page for it. I installed it and inspected the source code, but I doubt everyone does this when installing a dependency.
Funnily enough, in my opinion, the JS ecosystem fully embraced the Unix way: have small programs/libs that do only one thing and do it (somewhat) well.
I fear your analogy does not go far enough if you are comparing the JS ecosystem to the UNIX philosophy.
Javascript would offer a library for every single option, variant and logical operator for a UNIX command. Combinatorial explosion will devour the web. It already destroys dev laptops anyways when downloading something via NPM.
The problem is clearly due to vanity metrics like number of packages motivating people to publish an insane number of useless packages to fluff their contributions.
Agreed - I’ve found and been astonished by Github users who maintain hundreds of these little npm packages, all of which have usually under 10 lines of actual code, and sometimes even have chains of dependencies on the user’s other packages.
It seems like the only reason these packages get any significant downloads is when one of them gets depended on by a big package, causing the entire dependency chain of the user’s little packages to be downloaded.
and if you look closely, 70 of those dependents are just small-scale/bs packages and the only sensible real dependency is cross-spawn _via_ the shebang package (which applies this regex to a string). The whole ecosystem could use a purge of all those useless 1-line-requires (which were all introduced by helpful commits from the "i has 1337 downloads"-community); currently this is madness.
Don't know whether to laugh or cry at this. Truth is stranger than fiction in the land of JavaScript. Could any developer 20 years ago have predicted this is what software engineering on the Web would devolve to?
Do not hire people who do this. Most of them will say "over 200m downloads of npm packages". Go and take a look. If you see things like 'is-not-foo, checks if a given string is not equal to the string "foo"', and 0 meaningful contributions, either pay no attention to these claims or pass.
It's an ecosystem full of reinventing the wheel. The most popular library, lodash, includes a reimplementation of a foreach loop for Pete's sake, for reasons passing understanding since it's part of the ecma spec.
JavaScript is just amateur hour, and these things are going to keep happening. It's pathological.
Which foreach are you talking about? Array.prototype.forEach, for...in loops, for...of loops?
The first only works with arrays and array-like objects.
The second works on objects and arrays, but it iterates over all enumerable properties, so you don't really want to use it for arrays. It's also made a lot less useful because it only gives you the keys, not the values.
The third finally provides some sanity, but it's only been around since ES6. Before that, lodash's each method was the most reliable way to iterate over a collection, be it an object or an array.
Just because you don't know the reason for something doesn't mean there isn't one.
`for ... of` only iterates over objects that implement `Symbol.iterator`. Plain objects don’t do that by default, so `_.forEach` is more useful than `for ... of` even if you are only targeting modern browsers and not compiling the code down to an earlier version of the spec.
That said, you can use `Object.keys`, `Object.values`, or `Object.entries` if you want to iterate over objects that don’t implement `Symbol.iterator`, so if you only need `_.forEach` there is no reason to pull in any libraries.
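A quick plain-JS illustration of the above, no library needed on a modern runtime:

    const obj = { a: 1, b: 2 };

    // plain objects don't implement Symbol.iterator, so this would throw:
    // for (const v of obj) {}   // TypeError: obj is not iterable

    // ...but their entries are iterable:
    for (const [key, value] of Object.entries(obj)) {
      console.log(key, value);   // "a 1", then "b 2"
    }

    // arrays are iterable, and also have forEach:
    [10, 20].forEach((v, i) => console.log(i, v));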
Yes, this one. If the object is an Array. According to whatever test they are using for that.
Lodash includes a reimplementation of Array.prototype.forEach because mistakes were made. It also works on other objects because other mistakes were made.
We all make mistakes. But just because there is a reason for something does not mean there is a good reason.
Before "tree shaking" I stored all npm modules in SCM and reviwed all updates as I had to commit after "npm update". I also put ton of files in .ignore as 90% of files in some packages are not required. I also used to include npm modules in distribution/deployment. So my request to npm is to add an option in the main package.json to disable tree shaking.
I wish there was a way to "bless" packages when they were reviewed.
I want a network of trust, such that a Google reviewed package is worth 10 points, a package fuzzed by foobar is worth 2 points, something skimmed by a dependant user is worth 1 point etc.
I can then choose a compromise between highly rated/reviewed dependencies and functionality/risk/cost-to-review.
My own blessing of a package I have reviewed might become a very small signal in a web of trust.
I assume the idea is that, even if Google violates their users’ privacy, they take care not to let others violate their users’ privacy through their dependencies.
Take the most popular npm module lodash for example. It has over one thousand files! But you probably only need one (lodash.js) and that's the one I would commit to SCM.
I think they mean code - just remove all files that your program doesn't touch. With something like JS, this should be doable in cases because of the way libraries tend to be designed.
The equivalent in a compiled language would be to strip unused symbols from the binary.
I wouldn't use the words "malicious" or "exploit" wrt this... It's more like, I dunno, trolling on planet JavaScript? I feel like there should be a big Twitter fight about it...
That was my first thought. But then I realized some guy basically broke something so his stuff would work and someone else's wouldn't. He didn't destroy files, but that was malicious as hell.
mean spirited and dramatic as hell, yes... also, a bad place where real "malicious" things could be done. but "malicious" has a specific meaning and this didn't affect users.
more like dramaticious if you ask me... but also uncovers actual dangerous weaknesses in the npm delivery pipeline...
it's kinda like a cat-fight in the one hundred acre javascript wood... pretty harmless, nobody's shit got pwned, but holy shit, kind of a vulnerable vector they found...
shinnn is claiming his account was hacked, and the hacker added the code. Generally if a hacker hacks someone's npm account and adds harmful code, that would be considered malicious and an exploit.
Yes, but it sounds like Harry Garrood doesn't have enough evidence to outright accuse shinnn of lying, and the NPM team also is acting as if shinnn isn't lying.
So instead of going on a lone and not-fully-backed-up crusade accusing shinnn of lying that could wind up in a he-said-she-said situation with negative fallout, Harry instead decided to use shinnn's words for his own benefit, and hype it up as a serious NPM account hacker inserting malicious code.
The word “exploit” is used several times but none of the code seems to exploit anything. Also, “malicious code” usually has a different meaning than something that intentionally makes the program crash during the installation process.
I feel like this is only part of a wider attack - like by causing this not to download, it meant that users do some other action which opens them up to the real attack.
I think the blog author is implying as much as he can, without directly accusing, that he believes that https://github.com/shinnn was responsible for the bad code, not a random hack.
To quote the article: "As far as we are aware, the only purpose of the malicious code was to sabotage the purescript npm installer to prevent it from running successfully... the purpose of this condition [in the code, hardcoded to include the word 'cli'] seems to be to ensure that the malicious code only runs when our installer is being used (and not @shinnn’s)."
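To make the targeting pattern concrete, here is a purely hypothetical sketch; this is not the actual code from the affected package, just an illustration of a condition keyed on the consumer's name:

```js
// Purely hypothetical illustration -- NOT the real code from load-from-cwd-or-npm.
// It shows how a dependency could sabotage only one specific consumer:
// misbehave only when the requesting module's name matches a hardcoded pattern.
function loadFromCwdOrNpm(moduleId, requestedBy) {
  if (requestedBy.includes('cli')) {
    // Only the targeted installer hits this path; everyone else is unaffected,
    // which makes the breakage look like a bug in that one project.
    throw new Error(`Cannot find module '${moduleId}'`);
  }
  return require(moduleId);
}
```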
>>[[
9 July, around 0100 UTC: @doolse identifies that load-from-cwd-or-npm@3.0.2 is the cause.
See purescript/npm-installer#12 (comment)
@doolse opens an issue on the load-from-cwd-or-npm repo pointing out that the package is breaking the purescript npm installer (although at this stage, none of us spot that the code is malicious). This issue is later deleted by @shinnn.
]]
Hmm indeed. A hack is possible but the timeline of events is dubious.
Unfortunately, based on preexisting cases it really doesn't make much of a difference, the liars will still deflect the blame and the logs will always be "wrong".
I mean, do you really think some hacker compromised @shinnn's account, solely for the purpose of sabotaging a new installer that had only been published for 8 hours?
I mean, I'm all for benefit of the doubt and such, but it's pretty obvious what happened here.
This was my gut reaction, but on further thought, this whole thing seems so needlessly petty that it could have easily been the author attempting to make the other person look bad.
I don't think encryption keys actually are useful for the average case here.
Currently, developers store npm tokens which may be stolen because they're often stored on disk or as environment variables.
Requiring developers to store encryption keys makes almost no difference: the private key will still be stored on disk and will still be vulnerable to effectively the same attacks.
There are some differences of course. Security-conscious users could use hardware tokens to store their encryption keys, and they could password protect the private key in either case. This is not the large majority of users though, so in the average case, it won't matter.
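To make that concrete, a small sketch with Node's built-in crypto module: the private key either sits unprotected on disk (the same exposure as an npm token) or gets a passphrase/hardware token, which most users won't bother with. The file paths and passphrase are placeholders.

```js
const crypto = require('crypto');
const fs = require('fs');

// Generate a signing keypair. Without the cipher/passphrase options the
// private key is written to disk in the clear -- stealable exactly like
// the npm token that already lives in ~/.npmrc or an environment variable.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519', {
  publicKeyEncoding: { type: 'spki', format: 'pem' },
  privateKeyEncoding: {
    type: 'pkcs8',
    format: 'pem',
    cipher: 'aes-256-cbc',       // only this passphrase protection (or a hardware
    passphrase: 'correct horse', // token) meaningfully changes the threat model
  },
});

fs.writeFileSync('publish-key.pem', privateKey); // hypothetical paths
fs.writeFileSync('publish-key.pub', publicKey);
```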
What JS needs is a standard library (or libraries), developed and maintained by MS/Google, or preferably a foundation. There's no reason every library should have dependencies 10 levels deep.
High-velocity ecosystems like this make it way too easy to optimise your code for getting pats on the head. It's good that people are pointing out things like this (relatively speaking) not long after the fact, and that a lot of people on here are annoyed about it. The fix is cultural as much as any 2FA implementation.
Part of the problem is the bounty for attacking NPM packages is high. You get a high profile exploit and lots of people talking about it, or you can even get some of your evil JS code running on thousands of sites on the back end or the front end.
Compounded by the fact there is no decent base class library for JS like you'd get for .NET [0]. Want to do anything you could do by default with the .NET BCL? Like open a URL, save a file (with a nice API) or parse some XML?
Then npm i ... it is. And hope it doesn't pull in an exploit.
As a mitigation I recommend people consider writing their own code (NIH) for simple stuff, not npm i all the things (a small sketch follows below).
[0] I'm comparing to .NET but same could be said of Java/Python/Ruby etc.
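For what it's worth, the built-ins have gotten a lot better for the simple cases. A sketch of "open a URL and save a file" with zero npm dependencies (assumes Node 18+, where fetch is global; the URL and filename are placeholders):

```js
const { writeFile } = require('fs/promises');

async function download(url, dest) {
  // fetch is built into Node 18+; no request/axios/node-fetch dependency needed.
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  const body = Buffer.from(await res.arrayBuffer());
  await writeFile(dest, body);
}

// Placeholder URL and filename, purely for illustration.
download('https://example.com/data.json', 'data.json')
  .then(() => console.log('saved'))
  .catch(console.error);
```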
This is not the first time this year we see an npm issue, and it could have been much worse than this. All package managers in general create risks, but how the community etiquette evolves around package managers is just as important. Something is wrong with the latter here.
One of the things I like about Purescript is that you can use it without needing any javascript package manager, and without running any javascript outside of a browser. Nix works well for installing the Purescript compiler as well as psc-package. https://nixos.org/nixos/packages.html#purescript
NPM gets a lot of hate for its dependency management, but I'm not sure what a solution to this problem would be.
- They can't curate packages, or else that friction will drastically slow down the ecosystem (thousands of packages get published every day).
- They can't remove/disable packages (most of the time), or dependencies will no longer be strictly immutable.
- They can't disable sub-dependencies, or else this would greatly reduce code reuse and increase redundancy and complexity of packages (every package may have to roll their own X, or compile their package dependencies into bundled JS with no dependencies).
I think the problem is simply this: it's a low-friction dependency management solution -> which made it so popular -> which is making it a target for malicious actors.
After 10 years of nodejs I can honestly say I wouldn’t mind the friction. NPM is a wasteland of abandoned packages and reinventing the wheel. They never solved discovery so you have 500x implementations of the exact same thing. Around 2015 we passed the point where looking for the “right” package takes longer than writing your own.
See, .NET gets a lot of hate from the open source community, but this is one of the big reasons why I prefer it. Even when you compare it to Java, I feel like it's way better in this regard.
If I want to work with JSON, I use Newtonsoft.JSON. If I want an ORM I'll use Dapper (lightweight) or Entity Framework. So many libraries that you would need if you're using Java or JS are just built in to the standard library.
I know it's not a 100% fair comparison, since I am biased. Part of it also might be because .NET is younger and less widely used, but I feel it's pretty true. Though, Python has a ton of open source support and is better in this regard.
I'm not sure that you understand what .NET is. .NET is a platform/ecosystem. I was talking about packages. Packages are managed by nuget, which works across all .NET languages, which is why I didn't specify a language.
Framework, platform, ecosystem, SDK, etc: you're rather neatly sidestepping the point that you're comparing not a programming language to programming languages. A fair comparison would be C# to Java, or Javascript.
What Microsoft chooses to call the .NET Standard Libraries is not at all the same thing as a language's standard library.
So my point is invalid because I wrote .NET instead of C#/VB/F#? You might want to contact nuget to tell them their site is wrong too then.
My point is that finding quality packages is easier with nuget than the JS or Java package managers. And using them is usually easier too. It's a fair comparison.
Uniqueness and trust of package naming is easily solved: use a URI. It doesn't even have to resolve. URI is a naming convention that invokes both uniqueness and universality.
So what you are saying is, all the code I see on blogs, demoing that cool little JS thingy, is actually just demo code, not prod ready. To get prod ready, you need to break the NPM dep and vet everything on your own. So you're telling me npm run serve isn't good enough for prod either?!
The events the article refers to happened two weeks ago and have since been resolved, with further steps taken to reduce the chance of this happening again in the future (e.g. by vendoring a lot of code).
You know a platform doesn't care about security if either:
a. They don't do end-to-end integrity and non-repudiation (not just signed hashes of files, not just https, not just hashes, but signed archives/files that can be verified as coming from the developer, either with gpg, s/mime or x509 certs; a sketch of this follows below)
b. They allow packages to execute code or scripts on download or installation
And, they don't care about your time if they don't automatically offer a prebuilt, reproducible binary mechanism with a build-from-source install/verification option.
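A sketch of the "signed archive" half of (a), using Node's crypto primitives: the publisher signs the tarball, and the installer verifies against a key it already trusts before running anything. Key distribution is hand-waved here and the filenames are placeholders.

```js
const crypto = require('crypto');
const fs = require('fs');

// Publisher side: sign the exact bytes of the release tarball.
function signTarball(tarballPath, privateKeyPem, passphrase) {
  const data = fs.readFileSync(tarballPath);
  // Ed25519 in Node takes `null` as the algorithm argument.
  return crypto.sign(null, data, { key: privateKeyPem, passphrase });
}

// Consumer side: verify before unpacking -- and certainly before letting any
// install script run (point (b) above).
function verifyTarball(tarballPath, signature, publicKeyPem) {
  const data = fs.readFileSync(tarballPath);
  return crypto.verify(null, data, publicKeyPem, signature);
}

// Placeholder filenames; how the trusted public key reaches the consumer is
// exactly the hard part this sketch skips.
```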
Not actually malicious. It doesn't steal user data, drop malware, or damage a computer. Just crashes the library. Looks like another developer-developer slap fight.
I beg your pardon, but if I am using this library as part of a shipping piece of software-as-a-service, and I am in the middle of shipping a new feature when suddenly things mysteriously crash...
If I later discover that the crash was put there deliberately, I am going to call that malice, and malice that has directly impacted a functioning business and its customers.
It's no different than a disgruntled person putting tacks on the road outside of a supplier. If my truck goes there, gets a flat, and crashes into the ditch as a result, I would call that malice as well.
Deliberately crashing software that other people depend upon is malice.
That being said, I always get pushback when I mention this but I think SaaS projects should often be vendoring dependencies. It's safer, it's more secure, it gives you more consistent installs -- and it prevents `leftpad` scenarios. It makes source control slightly more complicated, but the other benefits (often) greatly outweigh that.
This is something that used to be more commonplace in the Javascript community, and it's something that `node_modules` makes very easy, but it's fallen out of style in modern web development.
To the best of my knowledge, this was also commonplace in the original design of Go, since it was coming out of Google, which does vendor all of its dependencies. I'm not sure which way the current Go community leans.
Vendoring as practice isn’t common in some ecosystems (ruby, JavaScript, python, Clojure), but I do see more people using caching proxies and services like JFrog’s to ensure they always have access to particular versions of a dependency.
Sadly, this still doesn’t fix non-repudiation problems we see in ecosystems that don’t enforce things like package signing.
The refund for the amount you paid for the library is on its way.
Once again I'm reminded of that saying someone once said. With random open-source libraries you're dealing with something someone else put out there just because they wanted to; having any kind of expectation that someone will or won't do something is seriously short-sighted and even pretentious. Do you go around running random .exe-s you find on the internet? Why do you do so with the dependencies for your projects and expect a better end result? You may not like to hear this but it's true.
There are two solutions here: either you start reviewing the libraries you use, every release, or you sign a support contract that obliges the maintainer to do what you want.
You are not a lawyer, because if you were, you would be aware that there is case law establishing that just because you don't charge for it, doesn't mean you aren't providing an implied warranty and aren't taking implied liability.
There is absolutely zero chance that you can put malware into an open source project, give it away, and then when sued, stand up and say, "It was free, what do people expect?"
You can call me pretentious until night turns back into day, and maybe I am, but the thing we're discussing is a matter of law, and there are nuänces above and beyond what random people on the Internet would like to believe about how giving software away works.
> just because you don't charge for it, doesn't mean you aren't providing an implied warranty and aren't taking implied liability.
Open source software is almost always distributed with a license that explicitly disavows any such warranty or liability. This is pretty widely understood...
Just because you put it in a license, doesn't mean it will hold up in court.
Example: I distribute a flashlight app. It contains an obfuscated bitcoin miner and a MITM that collects your login credentials. My license does not mention either of these things, but it does say there is no warranty or liability.
What do you think will happen if I am sued in court and/or charged with a crime?
You're talking about an app. I'm talking about open source software freely posted online. Let's apply some common sense here. What you said:
> if I am using this library as part of a shipping piece of software-as-a-service, and I am in the middle of shipping a new feature when suddenly things mysteriously crash...
> If I later discover that the crash was put there deliberately, I am going to call that malice, and malice that has directly impacted a functioning business and its customers.
Now what will happen if you take this library author to court? Let's ask some basic questions that the court might touch on:
* What was the harm caused by the software breakage? You were unable to ship new versions of your software to customers, resulting in reduced revenues
* What general arrangement or expectation did you have with the library author? None, the library author distributed the library as open source and explicitly disavowed (in writing) any obligations to the library's users
* What specific arrangement did you have with the library author? None, you don't know the author personally and you never transacted with them, offered them any compensation, or any other kind of business arrangement to provide you with the library
* What evidence do you have that the author acted maliciously? Almost none–they acted erratically but did try to offer a reasonable non-malicious explanation
I don't think any court in its right mind would find any substance in this case. If it did, every Tom, Dick, and Harry would start crawling out of the woodwork claiming some OSS had maliciously broken their code. It would quickly kill OSS. And not just that, the same principle would apply to any general publication, academic or industrial research, talks and lectures, etc. Society can't function that way.
What if the dev just publishes an update that removes the flashlight functionality? It isn't malware, it just doesn't work. I don't think you could sue the dev and win.
I don't know how you believe laws work or what you hope to discuss, but the reality is that in the case of software, laws offer deterrence and recourse against malicious actions, not prevention. It's absolutely stupid to take a repository by an anonymous person, execute it, and hope it's not malicious or doesn't have any bugs. Not to mention there's nothing obliging a piece of software to be bug-free or maintained - go and try to determine whether a bug that deleted your production data was malicious and whether you have any recourse. I'd love to see any actual cases about software distribution causing damage that don't have anything to do with malware distribution.
You are arguing a strawman. We are not discussing bugs or maintenance, we are discussing a person acting maliciously. Furthermore, you are talking about people being "stupid," which has no place in a discussion of whether a person giving away code has an obligation not to act maliciously.
Never in the history of the courts has a defendant's lawyer gotten up on his hind legs and intoned, "But your honour, the plaintiff was stupid," and had the case summarily dismissed.
Naturally, one can make arguments about what precautions the user of some software ought to reasonably be expected to perform to avoid harm.
I agree it may be prudent to assume that every maintainer is malicious and sits up all night trying to think of ways to put malware in your compiler, but I do not agree that this is going to be an effective defence in a court of law if you actually put malware in a piece of software that you give away.
Now please excuse me, I am about to audit every last line of code in Unix. I have no more time for exchanging pleasantries with you.
You started the thread by saying that if you used a library that is broken by the maintainer you would call that malice. Things being broken is directly related to bugs and maintenance - detecting if and how a breakage is malice is the first problem in your arguments.
I'm also trying to tell you that your whole base premise is wrong, that even expecting some library to work or to keep working is too much (unless you apply one of the solutions I offered). Calling certain behaviors stupid absolutely has a place in a discussion about when people play with fire and then are surprised they get burnt, I think you deliberately missed my point that if you put yourself in danger you only have yourself to blame and most laws do care about that nuance. In the end the job and obligation of keeping the software you write secure is just as much on the person writing some libraries.
We can argue about whether x or y is an effective defense in court, but as I said, that hasn't been tried in the case of open-source software being broken. I also have to repeat that when you look at malicious software and changes in practice, the law only comes into play after the fact, and you have to handle preemptive defense yourself - going back to my first point(s), you have to change the way you develop software instead of hoping that what you randomly execute is good.
Hopefully you now understand what I'm trying to say to you better, English isn't my first language, sorry.
In this specific case, the code was written such that it deliberately broke installation for users. I consider that malicious. The “deliberately” is the important word here.
People make mistakes. Nobody wants this to happen, but my colleagues and I have sometimes pushed a bad deploy that broke our product, and we rushed to revert to a known good state.
That’s not malice, that’s (temporary) incompetence.
But if we deliberately broke something for our users, I would consider that malice.
it would also be some contributory negligence if someone is shipping a service that just grabs and includes unsigned/unvetted/untested code from the internet as a component...
> If I later discover that the crash was put there deliberately, I am going to call that malice, and malice that has directly impacted a functioning business and its customers.
Hey, it's an open source party! Where is your patch? /s
edit: oh, bummer. Not really open source. Still: THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, etc.
How are we supposed to have code free of malice in the current javascript ecosystem?
You touch on an interesting point there: if the typical disclaimer is included, does that actually protect against willfully damaging code?
It's certainly (and reasonably so) very hard to hold a developer accountable for a bug in an open source library they published. But what if they put rm -rf / in the installer with the intent to delete the files of anybody running it? Does "you should've looked" really work there, or is there an assumption of good faith such that malicious behavior voids the disclaimer?
Obviously everything depends on what court has jurisdiction, but aside from the fact that such disclaimers already have little weight, even signed waivers will not protect you from malice or neglect in many places.
It's childish and amounts to an indirect attempt at damaging the reputation of the compiler maintainers. It's playing fast and loose with everyone who needs an install up and working for reasons. I'd say there was a heap of malice and it reflects very badly on mister "someone gained access to my account".
It's truly mind-blowing how this @shinnn character chose to handle this. It's so petty that I would actually have more respect for them had they snuck in something more devastating, like scanning for bitcoin privkeys.
While people give reasonable counter-arguments here to your point, it gets me thinking - I wonder if we could find a truly benign form of "malicious code" that "good guys" could use to find attack vectors before the "bad guys" do.
Perhaps there could be a website set up to get pinged by "malicious" installers? Perhaps it displays stats of some sort? It could turn into a friendly game.
I actually built something like this a few months ago. I called it “OSSassin” (like the game[1]) so devs could sign up & get a unique ID/url to ping with the idea that everyone participating would then try to secretly, and not maliciously, “assassinate” the packages that have agreed to participate by sneaking in code to ping their endpoint. Just as a fun/friendly game to help identify potential vulnerabilities. It has a leaderboard for “kills” but it’s currently empty because I never released it. While I’ve got some experience using packages, I’m still pretty new at software development & have never really been involved maintaining any packages. So I guess what I’m saying is I’m not entirely confident the way I built it made sense until I just saw your comment. I also wanted to add GitHub OAuth as a way to verify/limit who was participating but I never got around to it.
I’m on my phone at the moment but if you (or anyone else) has the knowledge/interest in taking a look at what I’ve done I’d love to release it with a partner.
That sounds like a really fun gamification of an aspect of security research that maybe doesn't get much practical exploration.
I'm not sure, though, how the game would distinguish between vulnerable code pinging the endpoint, as opposed to a player simulating a successful attack by performing the ping themselves.
I feel like the attacker should be able to provide a link to a package on a software repo somewhere and the game should download the package and check for some unique "Player 123456 attacked this package" string.
This reminds me of a similar game someone created where they challenged people to add a certain string to any repo that the challenge creator maintained. An ingenious social engineer then pointed out that the rules of this game weren't in an easy to find place, so the challenge creator added a page to their blog, which was under version control... Game over. Unfortunately I can't remember any detail that would allow me to find a citation for this story.
> I'm not sure, though, how the game would distinguish between vulnerable code pinging the endpoint, as opposed to a player simulating a successful attack by performing the ping themselves.
Yup, you're definitely right. Every time it's pinged I have it download the GitHub repo and then search for the string, but your suggestion makes way more sense: if someone managed to get their ID into a popular package, the repo would be downloaded & searched again and again and again. I was also playing around with it only using one repo, so I hadn't really given any thought as to how it would know which repo to download if there were more than one participating.
I'll dust it off, make some changes you've suggested and then put it on GitHub so @9dev (and anyone else) can take a look and tell me what else needs improvement.
If you're still up for this, publish the source on GitHub. That's an excellent idea and I'd like to participate :)
@dane-pgp's suggestion of verifying the hack by checking the source repository is great too. Maybe it'd be easiest to just name the endpoints something like `/ping/github/<handle>/<package>`.
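A rough sketch of how that endpoint could verify a "kill" instead of trusting the ping, using the route shape suggested above. The player-marker format, query parameter, and use of GitHub's tarball API are assumptions, and it assumes Node 18+ for global fetch.

```js
const express = require('express');
const zlib = require('zlib');

const app = express();

// GET /ping/github/<handle>/<package>?player=<id> -- verify the claimed "kill"
// by fetching the repo tarball and looking for the player's marker string.
app.get('/ping/github/:handle/:pkg', async (req, res) => {
  const { handle, pkg } = req.params;
  const playerId = req.query.player;                 // hypothetical: ?player=123456
  const marker = `Player ${playerId} attacked this package`;

  try {
    // GitHub serves a gzipped tar of the default branch at this API URL.
    const tarRes = await fetch(
      `https://api.github.com/repos/${handle}/${pkg}/tarball`,
      { headers: { 'User-Agent': 'ossassin-verifier' } }
    );
    if (!tarRes.ok) return res.status(502).send('could not fetch repo');

    // Entries inside a .tar.gz are not individually compressed, so a plain
    // substring search over the gunzipped archive is enough for this game.
    const tar = zlib.gunzipSync(Buffer.from(await tarRes.arrayBuffer()));
    const verified = tar.includes(marker);

    res.json({ verified });                          // only record the kill if true
  } catch (err) {
    res.status(500).send(String(err));
  }
});

app.listen(3000);
```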
Awesome! Thanks, a ton! I haven't touched it in a while but I'll take a look at it now. I'm going to take a stab at fixing it up a bit and then I'll put it on my (currently empty!) github. I'm a bit leery of including my username in an HN comment - I once made the mistake of mentioning my twitter handle here and got some weird selfies in my DMs - so I'll put my github username in my HN profile for a day so you'll know where to look.