Title could be improved, as I was wondering if the command line tool npm had itself been rewritten.
It’s actually one of npm’s web services that was rewritten.
FTA: “Java was excluded from consideration because of the requirement of deploying the JVM and associated libraries along with any program to their production servers. This was an amount of operational complexity and resource overhead that was as undesirable as the unsafety of C or C++.”
I find this comparison a bit odd. Even if you're not using containers, the JVM isn't hard to deploy, as distro package managers include it. Unless a team is managing servers manually rather than with an automated tool, this doesn't seem that complex. Am I missing something here?
Well, speaking from experience with a JBoss application layer in a recent software project I worked on:
New java versions do break existing libraries or apps, and need to be tested thoroughly. When the company hasn't budgeted for that expense, it becomes difficult to update.
Often an architect or software team will insist on using the Oracle JVM rather than the included OpenJDK. That adds extra steps to download, store as an artifact, distribute, verify, etc etc.
The people who wrote the build pipeline have since been laid off, and an updated set of libraries requires a lot of work to trace back through poorly documented and understood code to make changes.
(Not to disagree with you here, it's more that I'm trying to illustrate how, with poor foresight, Java dependencies can get difficult to manage)
> New java versions do break existing libraries or apps
Have you used the Rust compiler? My experience tells me that any Github Rust project not updated in the last two years doesn't work with my Rust compiler. Whereas Java apps written a decade ago still compile and run on OpenJDK/Oracle often with zero or near zero changes.
> Often an architect or software team will insist on using the Oracle JVM rather than the included OpenJDK.
Yes, you could choose this. But if you have the ability to choose an entirely different language, I guarantee you have the ability to choose OpenJDK if that's what you really want.
> build pipeline ... an updated set of libraries
Doesn't Rust have libraries?
In fact, if anything, Java libraries have a longer shelf life and greater compatibility because they don't require integration of build systems. Java libraries can (and do) use 5-year-old compilers, postprocess bytecode for perf (e.g. Hikari) or size (e.g. Proguard), and even use entire other languages like Scala and Kotlin. But the only thing you as the library user need is the JVM bytecode, which is high-level enough to maintain runtime interoperability but low-level enough to achieve strong build-time interoperability.
---
Much of the "organizational overhead" seems to come down to "I don't like managing the runtime". And that's not wrong; static binaries are nice.
But how much harder is it really to manage/isolate/version your JVM runtime for your server deployment, than manage/isolate/version your Rust compiler for your CI pipeline?
It's definitely a thing. It's easier to see with applications that use a lock file rather than libraries that always try to fetch the latest dependencies. That's because there have been incompatible changes made, but within the "allowed breakage" policy. For example, try checking out 0.2.0 or 0.3.0 of ripgrep and building it with Rust 1.33. Both fail for different reasons. The former fails because rustc-serialize 0.3.19 was using an irrefutable pattern in a method without a body, and this was prohibited: https://github.com/rust-lang/rust/pull/37378 However, if you run `cargo update` first, then the build succeeds.
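For anyone curious what that looks like in practice, here's a minimal sketch (not the actual rustc-serialize code, just a hypothetical trait of the same shape) of the kind of pattern that stopped being accepted:

    // Hedged sketch: rust-lang/rust#37378 made patterns in trait methods
    // *without bodies* an error (E0642). This is not rustc-serialize's code.
    trait Encode {
        // Older compilers accepted a destructuring pattern here:
        //     fn encode_pair((a, b): (u32, u32));   // now: error[E0642]
        // The accepted form uses a plain identifier in the declaration:
        fn encode_pair(pair: (u32, u32));
    }

    struct Sink;

    impl Encode for Sink {
        // Patterns remain fine in methods *with* bodies.
        fn encode_pair((a, b): (u32, u32)) {
            println!("{} {}", a, b);
        }
    }

    fn main() {
        Sink::encode_pair((1, 2));
    }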
The 0.3.0 checkout fails because of a change in Cargo. Specifically, I think it used to define OUT_DIR while compiling the build script, but this was not intentional. It was only supposed to be set when running the build script. ripgrep 0.3.0 got caught here in the time after when the bug was introduced but before it was fixed. See: https://github.com/rust-lang/cargo/issues/3368 In this case, cargo update doesn't solve this. You cannot build ripgrep 0.3.0 with any recent version of the compiler. You need to go back to a version of the Rust distribution which ships a Cargo version that sets OUT_DIR when compiling build scripts. (Or, more likely, patch build.rs since it's a trivial fix.)
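To make the OUT_DIR distinction concrete, here's a hedged, hypothetical build.rs sketch (my guess at the shape of that trivial fix, based on the linked Cargo issue; not ripgrep's actual build script):

    // Hypothetical build.rs illustrating the OUT_DIR distinction.
    use std::env;
    use std::fs;
    use std::path::Path;

    fn main() {
        // Correct: read OUT_DIR when the compiled build script *runs*;
        // Cargo guarantees the variable is set at that point.
        let out_dir = env::var("OUT_DIR").expect("OUT_DIR set by Cargo at run time");

        // The fragile pattern relied on `env!("OUT_DIR")`, which expands while
        // the build script is being *compiled*, and only worked while Cargo
        // accidentally set the variable at that stage (rust-lang/cargo#3368).

        fs::write(
            Path::new(&out_dir).join("generated.rs"),
            "pub const GENERATED: &str = \"hello\";",
        )
        .unwrap();
    }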
Of course, both of these things are fine on their own. But strictly speaking, from a user experience point of view, code does break from time to time. With that said, ripgrep 0.4.0 and newer all compile on Rust 1.33, and ripgrep 0.4.0 was released over two years ago. So, strictly speaking, it does show the GP is wrong in at least some circumstances. :-)
This has less to do with Rust itself than with the ecosystem, but: at one point I updated my system and obtained a newer version of OpenSSL on my OS.
Sadly, some of my projects still used an older version of the openssl crate somewhere inside their crate tree.
The openssl-sys crate author chose to check that the native library version it was compiled against was below a certain version, so the build broke. All user requests to fix the bug in the legacy versions were deflected with the mantra "go update the openssl crate, the fixed version has been out for a year already"...
But moreover, you stumbled upon one of the worst cases, a soundness problem that has yielded a lot of discussion because of that crate specifically: https://github.com/rust-lang/rust/issues/50781
The issue here is that the typemap crate was found to be relying on a compiler bug. The compiler bug could be exploited to write transmute in safe code: in other words, all the safety guarantees of Rust go out the window unless we break that crate. It's a bad situation, but I don't know what we could have done differently. I think everyone agrees it's not worth sacrificing all the safety properties of Rust to keep typemap building.
My project stopped building on compilers w/ the new module system. I did file an issue (56247, see also 56317) and was told it won't be fixed. The fix to my dependency was trivial, but it did stop my code from compiling with no change to my Cargo.toml/Cargo.lock, and ultimately it did require me to unpin that crate's version and update it.
My understanding was that by virtue of my project not having the edition 2018 key I would have been isolated from such changes to the language.
> any Github Rust project not updated in the last two years doesn't work with the latest Rust compiler.
I don't think this is correct? Rust introduced an "editions" system (the first edition being Rust 2015), with the very aim of ensuring forward compatibility on a crate-by-crate basis. The aim is definitely that any crate written to be compatible with some "edition of Rust", whether 2015, 2018 or whatever, can be made to compile on a future version of the compiler.
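As a concrete (made-up) illustration of how that isolation works: the snippet below is only valid in a crate that declares edition = "2015" in its Cargo.toml, because `async` became a reserved keyword in the 2018 edition, yet the same current compiler still builds it for 2015-edition crates.

    // Hedged sketch: valid only in a crate with edition = "2015" in Cargo.toml.
    // `async` is a reserved keyword from the 2018 edition onward, but editions
    // are opted into per crate, so this keeps compiling on new toolchains.
    fn async(x: u32) -> u32 {
        x + 1
    }

    fn main() {
        println!("{}", async(41));
    }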
Maybe, but you have to specifically opt-in to the nightly toolchain in order to compile something that uses it. It's not something that's included by default, or appropriate for use in production scenarios (unless you're willing to deal with the resulting breakage, of course).
Exactly. I think it's getting better now that some of the cool things people were waiting on have made it into the 2018 edition. Off-hand I know that Remacs switched from the nightlies to 2018.
It's probably just the lack of emote via text, but your reply comes off as confrontational. It's also seemingly replying to me about general Rust issues when I'm just sharing my experience of a recent Java project. I don't have much experience of Rust beyond reading some code every now and then when an interesting blog post pops up here.
I'm also not condoning the decisions made in the project I describe. In fact, those ancient decisions are responsible for a significant amount of stress and work for me right now.
I've absolutely had experience of good JVM-based projects that had loose dependencies on libraries and JDK versions, such that they ran just fine under OpenJDK, and ran just fine when that package was updated.
It's always possible to improve. But Java is pretty much king when it comes to compatibility and maintainability. (Not so much in other areas, like verbosity, language features, type system, memory overhead, etc.)
Not so ironic—just because a practice is common among the median engineer in an ecosystem, doesn’t mean it will be common among the most experienced engineers in that ecosystem.
Node gets a bad rap mostly for the fact that it has tons and tons of inexperienced engineers using it (probably as one of their first programming languages.) Same reason PHP got a bad rap back when. You can build solid software in both, by finding and absorbing engineering best-practices from people who have already had hard-fought battles to learn them; but an engineer will have to be burned at least once on building/scaling+SLAing something before they start looking for those. You might say that the Node ecosystem has a lot of programmers that are “engineering virgins”—they’ve never been forced to contend with the real problems of engineering software. But they’d be “engineering virgins” no matter what language they’re using; that’s not an indictment of the language, just a consequence of its popularity and approachability.
I think the majority of good practices are the same in PHP as in other languages like Java. Many PHP issues were caused by building SQL statements from user input (you can have the same issue with beginner devs in most languages), so the best practice is to find a framework that does all the ugly parts for you (logins, session management, routing). There are large and small frameworks; I suggest using a popular one even if it's not perfect, to avoid the risk of hitting bugs in something that isn't used much, or of it getting abandoned or deprecated.
Don't get me wrong: if you need to do one small thing, a single PHP file can be enough. For example, I had a desktop app where the user could submit feedback directly from the app; one PHP file (no dependencies, no frameworks) was enough for that case. You get the submitted data, clean it, and put it in the database or submit it to a third-party API that can handle it.
A fair amount of the pitfalls are already gone. register_globals is off by default now, placeholders for SQL queries are the default examples, etc.
You will probably hear that you should use Laravel or similar. I'd argue it's a pretty big hammer, so don't reach for it if you don't need it.
The biggest issue is probably still the breadth and inconsistency of the standard library. Too many ways to do the same thing. Also, the general issues of a dynamically typed language, sprinkled in with things like == vs ===.
> I just like that someone at npm would avoid something because it has lots of dependencies and overhead. The irony is strong with this one.
For this to be "ironic", having lots and lots of dependencies would actually have to be a Node.js "best practice." But it's not.
It seems like a best practice to outsiders, for the same reason that setting `register_globals` seemed like a "best practice" in PHP back in PHP3/4. Because it was extremely common, one might assume that it was endorsed as a canonical approach. And so you do it yourself, and write sophomoric tutorials suggesting others do the same, perpetuating the problem.
In reality, the "best practice" followed by experienced software engineers (for Node.js or any other language) is to carefully consider your dependencies, and to try to avoid dependencies that cause an explosion of sub-dependencies. The NPM maintainers are experienced engineers, and so they follow this best practice.
There is no irony here. It is not "the Node.js way" to use tons and tons of dependencies, such that the NPM maintainers are going against the grain somehow. It's just the way of programmers inexperienced in engineering to not care about dependency proliferation; and then, further, to make a large set of their own tiny libraries (with already-exploded deps trees) because they aren't yet at a stage of programming expertise where they see that code as trivial to bang out whenever they need it (see: the left-pad package) that then further encourages others to depend on them. It's the "copying and pasting a solution together from bad code in five StackOverflow posts" phase of one's programming career, except instead of having to copy-and-paste, all the snippets are symbolically linked together into a big tree and you refer to them by name. (Again, that's not an indictment of Node.js—there's nothing you can do to stop a bunch of inexperienced engineers from doing this to your package ecosystem as well, if they happen to be drawn into your community.)
On the other hand, NPM itself kind of paved the way for ecosystems with tons of dependencies, managed and versioned with a single tool. Installing system software like the JVM is more old school.
If I have a choice between dealing with it and not I’ll choose not every time. It’s an annoying dependency that got even more annoying when Oracle decided to be a pain in the ass. Just because something is “easy” doesn’t mean it’s easier than the alternative.
I just unpack the JDK; I have no idea what's so complex about it. It was complex on Windows because I used a VM to install Java and then copied the folder, but they made it easier with 11.
Not to mention that the Web is now full of discussions around this, with official posts from Oracle, Red Hat, Amazon, IBM, Azul, and Microsoft explaining how to go forward.
None of that helps you figure out what the charge is when you're elastic, or when you're using multiples of hyperthreads that don't add up to an integer number of cores. I agree that OpenJDK is a better idea.
If you want to be fully sure and insist on using the Oracle JDK, then do like any enterprise shop and call the sales team to get their written word; what is so hard about that?
Why would you use only Oracle documentation? Call the sales guys and ask them for a license. They would be happy to sell you something, and there's even some chance of some kind of discount.
But having options is always good; isn't it the same with compiled languages? You keep the default options, but if you want extra performance you try different compiler flags, or a different compiler if the language has more than one.
Honestly I’d rather have good defaults than lots of options. If one tool meets requirements out of the box, then it’s preferable to a tool that requires a lot of tooling.
But if you do not have options, then what is a "good default"? You only have a default option if you have more than one option. So do you prefer software with no options?
Can you give examples of languages (compilers or VMs) with bad defaults, in your opinion?
Java by default has a hard memory limit which, when hit, causes services to hang in a semi-responsive state rather than exiting cleanly, requiring every user to hand-tune limits and carefully set up health checks and monitoring. Two decades in, they added a non-standard option to instead exit.
If they had instead made the default to act like almost everything else it would have worked with standard process monitors and limits with no effort required and a substantial fraction of the downtime I’ve seen for Java applications would never have happened.
How is it better for something to stop working but not exit? It’s exceedingly uncommon for anyone to have correct handling for OOM exceptions so the most likely effect is that stuff partially works - servers accept requests but never respond, apps have menus/buttons which don’t work, etc.
Similarly, if developers who deployed apps knew enough to avoid this, we’d know by now because it wouldn’t happen so frequently. It does highlight who failed to do real health-checks (e.g. years ago most Hadoop services needed huge amounts of RAM to start without crashing but they’d be listed as okay) but it’s the kind of thing sysadmins have been cleaning up for decades.
You are probably right for this case. My initial comment was about performance-tuning configuration; the comment I replied to sounded like "the JVM should have had X set to my preferred value", or "I prefer software that has no options to confuse me with", aka the GNOME mentality.
You can have options and have good default values for those options. Good default values are those that are reasonable for most people most of the time such that they don't need to be (or hire) domain experts to properly configure their tool.
My opinion of a bad default would include a language toolchain which defaults to dynamic linking and requires one to opt into static linking (Java, Python, JS, etc, etc, etc). Worse than that is an anorexic toolchain that has no defaults whatsoever and requires you to pass every little detail directly to the compiler--bonus points if your language has a massive ecosystem of competing tools which are meant to manage these sorts of details for you but utterly and uniformly fail to do so (looking at you, C/C++).
So the JVM runtime is installed and used by average people, and the defaults should be set for those people who install Java to run a desktop app, IMO. The developers who want extra performance should read the manual and configure things.
I’ve never heard of a runtime that forces a dichotomy between end-user- and developer-friendliness (putting aside for the moment that end users are famously annoyed by the Java runtime). Rust, Go, etc don’t have runtimes which force a choice between end-users and developers...
The JVM can be used for desktop, server, and (in the past) applets; it would be impossible to find a configuration that is optimal for all applications, so competent developers tweak the defaults or use different deployment methods like AOT compilation or bundling your own JVM. Is your problem that the JVM is used in so many different places that different algorithms, runtimes, and optimizations were created for it? Your examples of Go and Rust are languages that have so far not been used in such diverse places, and there are no good alternative compilers for them, while something like Python has a similar diversity of runtimes.
How much is that actually necessary these days? I remember spending ages tweaking flags in the 1.4 - 1.6 years. But there have increasingly been sensible defaults with broad applicability. Now that G1 is both default and usable, even more so.
As I've moved onto 11, I've slashed our apps' JAVA_OPTS down to almost nothing: max heap size, some GC logging flags, that's it.
At scale? Very necessary. The default GC as of Java 8 (the last version I used in production) suffered significant performance issues above 40 GB of working set, and required careful tuning thereafter.
Hardly any different than having to use VTune or perf to optimise C and C++, which require the added steps of recompiling with different flags and redeploying.
I wouldn’t recommend C or C++, but in their defense, you don’t need to worry about tuning as soon as you do on the JVM. Many apps won’t need to bother with it at all (which is good because C/C++ programmers have all manner of other things to worry about that aren’t a concern for modern languages).
There's also a notion of failure mode for improperly tuned apps. This is certainly a personal preference, but I prefer true failures over constraint failures. I'd rather OOM in golang than run out of threads in my Java execution context while having plenty of free memory. At least when starting out.
Me neither, I do like Rust and see its adoption at NPM as good PR for the language, however I also think that the way the decision process went wasn't free of bias.
I'm not a fan of relying on distro package managers for installation of runtime dependencies on servers. Too many opportunities for variables to creep in if the version isn't locked, and then you have to manage all the package manager's own dependencies and config as well. Even if you automate, you're still at the mercy of the repo to have the version you need, etc., and often you need to customize the install for a highly-available and/or virtualized environment. Bad times all around.
The alternative is vendoring the dependencies. That includes the JDK or Node runtime. With "modern" deployments you'll see this with container-based packaging that includes the runtimes either explicitly or via system packages (in the container). The classic approach is to copy the runtimes into your final application tarball.
Either way the runtime is baked into the app and gets deployed and tested with it as a core component. Runtime upgrades then become vanilla deployments.
Lots of options: containerization, downloading prebuilt binaries, downloading and building from source, or downloading, building, then packaging the artifacts and leaving somewhere centralized for deploy.
Deploying a JVM app/service is now even more complex than figuring out which version of Node to use to run a service. Is it supposed to be Oracle Java? OpenJDK? AdoptOpenJDK, or one of a half dozen more? Which version of it? Anything to tweak in GC or startup settings for it/the version? Do we need to regression-test the service on a minor JDK upgrade? Is that JVM compatible with some OS version we are running that has some security patches and other settings?
> Deploying a JVM app/service is now even more complex than figuring out which version of Node to use to run a service.
> Is it supposed to be Oracle Java? OpenJDK? AdoptOpenJDK, or one of a half dozen more? Which version of it?
All of these must have been tested against the Java TCK. So, the answer is "whichever you/your company decided is okay" for the first part and for the second part "newest update of the Java version your software runs on". Doesn't sound very complex to me.
Hey folks! This is part of our whitepaper series. This means that the audience is CTOs of larger organizations, and so the tone and content are geared for that, more than HN. Please keep that in mind!
Are CTOs of largish organizations still not part of the Hacker News audience...? That seems a bit of a damning statement to make about the population of CTOs...
I would be shocked if any CTOs of large organizations browse HN. Large orgs put people in those positions that are more about theory and future thinking over current knowledge.
It's expected that high-level executives of large organizations deal with planning and strategy rather than low-level technical details. They're not following language features or implementations.
However, in that case, this whitepaper (and many others like it) is damning in how little it actually states, and it shows why so many technical decisions go wrong.
Good overall whitepaper and I like to see efforts like these.
One area I felt was missing is some data here:
"It keeps resource usage low without the possibility of compromising memory safety. "
How did the resource usage compare with the Go and node rewrites? What metrics were used under which workload? Benchmarks are never perfect but I think a CTO-level person would like to see a table of results like that.
> Java was excluded from consideration because of the requirement of deploying the JVM and associated libraries along with any program to their production servers. This was an amount of operational complexity and resource overhead that was as undesirable as the unsafety of C or C++.
Just so everyone here is aware, this is by now an outdated complaint against Java.
I'm choosing Vert.x as an example since it already competes with Rust- and C-based applications over at https://www.techempower.com/benchmarks, but you ought to be able to compile general programs ahead of time.
This is not at all outdated. GraalVM and SubstrateVM are still very new and only support a subset of JVM features. The Vert.x article itself mentions that, as well as this:
Before I respond, please don't misinterpret me. I think it is perfectly acceptable to choose Rust over Java saying nothing more than, "We felt Rust would be a better fit for the team," or "We were more excited by Rust." However, if somehow you have imposed on you the constraint that you need to compile ahead of time, this is not enough to toss Java out of the running.
Vert.x encountered some issues with reflection, while Rust simply does not support the sort of dynamic reflection that Java running on the JVM can achieve. SubstrateVM forces you to have compile-time reflection, and Rust can also support this to some extent. If Rust is a feasible alternative to Java for you, then you are not going to encounter this limitation if you choose to go with Java. Plus, if you ever decide you do need this power in your application, going with Java lets you pay the cost of the increased operational expense and install a JVM.
Shared this in this same thread for another comment.
The videos below show a GraalVM-based app loading up the Spring framework and the Flowable process engine, and making a REST call to an external service, all in 13 ms!!!
Twitter uses GraalVM in production, you can find several talks from them.
Beyond that, enterprises using PTC, Aicas, IBM, ExcelsiorJET JDKs have had the option to deploy AOT native code in production for the last 20 years or so.
I'd like to point out that even though it took them a week compared to an hour, a week is actually incredibly fast to learn Rust and build something useful with it. Learning C++ can take more than a month of training, and it's only because most people learn it over an entire semester at school that they learn it at all. This is also the time it takes to learn Rust, and presumably now that they've written one program, writing the next program will take them a fraction of a week.
The amount of time it takes to learn something is often indicative of its power. Anyone who has learned a foreign language or a musical instrument knows that the time spent investing up front pays huge dividends down the road when you have the skills and tools to richly express yourself. The reason that Go takes two days to learn is because it artificially limits the amount of up front investment at the cost of limiting expressiveness over the lifetime of your use of the language.
> I'd like to point out that even though it took them a week compared to an hour, a week is actually incredibly fast to learn Rust and build something useful with it.
I would argue that something that takes even an experienced engineer a mere hour to write is very small and has little complexity (especially if unit tests are counted towards the hour it took them to re-write it). This means it's difficult to gauge how much Rust was 'learned' during that week.
Learning any of C, C++, Rust or other systems languages takes way, way more than a week or a month. For C, the simplest of the three, it is usually rated at 3 months.
What happens is that if you are already proficient in one of them, one of the others takes way less time (especially in the case of C++, given it forces you to learn almost all paradigms).
In addition, writing a small program does not mean you have learnt C, C++ or Rust.
I’ve been playing around with rust since it came out but only recently did I decide to use it for a part of a project. It’s a very pleasant language. I didn’t fight the borrow checker much (maybe due to prior experience).
The language is nuts. It's true what they say: Cargo is even better than the language. It's just so easy to add packages to your project or to split your project into packages.
Cargo is an amazing investment, as it will help people write non-duplicated code. Like, how many string implementations are there across C code bases? Each C project has so much code that's the most boring, repetitive shit you can imagine. Cargo lets you concentrate on writing your code without hassle.
I have experience with a lot of package managers, gems, go, cocoapods, sbt, cabal, pip, spm, npm, you name it but cargo is on a different plane of existence. Cargo makes the whole internet your standard library.
I also like cargo workspaces. Modern development needs a workflow where you pull in a dependency, and work on it in tandem with your code. Achieving a good workflow for this is surprisingly hard.
> write non-duplicated code ... Like, how many string implementations are there across C code bases?
In one of my projects I needed a foo::string that can share data with foo::variant without copying the data each time. So foo::string was implemented as a COW string, a smart pointer to foo::string_data. std::string simply does not work for such requirements.
So I am not sure I understand how Cargo will help in this case. Either the Cargo source repository will contain implementations of every possible permutation of string requirements, or people will just use the standard std::string.
The only feature that I need in C/C++ is a unified ability to include libraries in code.
I am perfectly fine with downloading png.tar.gz manually and putting it in place where I need it.
In any case, the decision to include a library in a product requires quite a lot of reasoning and architectural investigation.
For typical web front-end projects, NPM or Cargo probably makes sense. But for, say, Node.js or Cargo itself, they should not use any such automatic downloader.
Your problem isn't hard to solve in Rust. Reasoning about allocations is a foundational idea of Rust and is solved using lifetimes.
Basically, lifetimes are allocators as a first-class language construct. You can reason about what happens if values are stack or heap allocated and specialize your code based on that. Lifetimes are definitely an advanced feature and I think you can do some crazy optimizations with them. But your use case is not hard to implement. If you show me the C++, I'll show you how to achieve the same semantics.
Cargo helps because Rust lets you build cleaner abstractions and abstractions that compose nicer. So integration is super easy.
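Not the poster's foo::string, but a minimal sketch of the borrow-or-own idea using the standard library's Cow type (hypothetical function, for illustration only):

    use std::borrow::Cow;

    // A Cow<str> shares the caller's data and only allocates when a change
    // is actually needed.
    fn normalize(input: &str) -> Cow<'_, str> {
        if input.contains(' ') {
            // Owned copy only when we must modify the data.
            Cow::Owned(input.replace(' ', "_"))
        } else {
            // Otherwise borrow: no copy, and the lifetime ties the result
            // to the caller's buffer.
            Cow::Borrowed(input)
        }
    }

    fn main() {
        assert!(matches!(normalize("no_spaces_here"), Cow::Borrowed(_)));
        assert!(matches!(normalize("has some spaces"), Cow::Owned(_)));
    }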
> For typical web front-end projects NPM or Cargo probably make sense.
I might misunderstand your sentence, but neither Rust nor cargo are for front-end?
> Modern development needs a workflow where you pull in a dependency, and work on it in tandem with your code. Achieving a good workflow for this is surprisingly hard.
Well put! I've struggled with this exact situation, and although I figured out a setup that works for me, it's still not ideal. In my experience, npm the package manager doesn't enable such workflows reliably (yet).
While I'm a big fan of Rust, excluding Java because "JVM" is kinda laughable. It's not hard to run at all. You package everything into a jar then run a single command. As easy to get working as a JS backend.
If their complaints are about GC tuning, is it not the same thing as tuning the GC in JS/Go? Java still has arguably the most mature GC of any language.
Anyone knows if they considered .NET Core? It has dependency management, can be deployed as a self-contained binary and is memory safe. Seems to match the requirements for me.
> This stuff happened before, or at least around the time, that .NET Core was released. So it either didn’t exist or was an extremely new option, at least.
This entire article is a pretty damning report on JavaScript in general, but this sentence takes the cake (emphasis mine):
> The process of deploying the new Rust service was straight-forward, and soon they were able to forget about the Rust service because it caused so few operational issues. At npm, the usual experience of deploying a JavaScript service to production was that the service would need extensive monitoring for errors and excessive resource usage necessitating debugging and restarts.
They also state that writing the service in Node took them an hour, two days for Go, and a week for Rust. Even taking into account their unfamiliarity with the language, it's probably fair to say that when switching to Rust, you'll usually spend more time writing and less time debugging. Whether that trade-off is worth it depends on the project.
It depends. I am over a year into Rust and it doesn't take me that much longer to write something in it than in, say, Python. The confidence I have that the thing I wrote actually works is way higher in Rust, and it is usually much faster.
And: it is incredibly easy to build and deploy.
I think whether Rust is useful or not depends entirely on the application. If you need high confidence in what the thing is doing, it should run parallel and fast and you are familiar with the concepts Rust is using – it isn't a bad choice. For me it replaced Python in nearly every domain except one-use scripts and scientific stuff.
It can be hard for advanced programmers to abandon certain patterns they bring from other languages though. In the first months I tried too much to use OOP, which doesn't make any sense and leads to convoluted code. If you work more in a compositional and data oriented way while making use of Types and Traits, you will end up with much simpler solutions that work incredibly well.
Catherine West's RustConf 2018 talk on ECS systems describes this incredibly well and might even be good to watch if you are never intending to use Rust at all, because the patterns discussed are quite universal: https://www.youtube.com/watch?v=aKLntZcp27M
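To make the "composition with types and traits" point concrete, a tiny made-up sketch (hypothetical types, not from any real project): behaviour is attached to plain data via trait impls rather than a class hierarchy.

    trait Render {
        fn render(&self) -> String;
    }

    // Entities are just composed data...
    struct Position { x: f32, y: f32 }
    struct Sprite { name: &'static str }

    struct Player {
        position: Position,
        sprite: Sprite,
    }

    // ...and shared behaviour comes from trait implementations, not base classes.
    impl Render for Player {
        fn render(&self) -> String {
            format!("{} at ({}, {})", self.sprite.name, self.position.x, self.position.y)
        }
    }

    fn main() {
        let p = Player {
            position: Position { x: 1.0, y: 2.0 },
            sprite: Sprite { name: "hero" },
        };
        println!("{}", p.render());
    }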
Which is ironic, given that people tend to forget that it was C++ that made OOP mainstream; there wasn't any Java or C#, two fully OOP-based languages, back then.
The other part is that component-oriented programming is actually a branch of OOP, from a CS point of view, with books published on the subject at the beginning of the century.
Once a programmer achieves a certain competency level with Rust, writing familiar workflows requires little more effort than what is expended with a dynamic language. However, lower-level Rust will demand more, regardless of proficiency.
> Whether that trade-off is worth it depends on the project.
Sure, but when you consider the drastically reduced operational cost that they're talking about there... that week is absolutely peanuts in comparison, and that was also a week including getting to grips with the language sufficient to produce the component. You really don't want to have to pay attention to production. You want to be able to concentrate on getting stuff done, not losing time keeping what you've already got just ticking along.
I'm really skeptical of this unless it's just a wrapper for a thing that happens to already exist. It would be interesting to have comparative LOC numbers.
> At npm, the usual experience of deploying a JavaScript service to production was that the service would need extensive monitoring for errors and excessive resource usage necessitating debugging and restarts
So, they deployed it after an hour, but it wasn't finished until they stopped having to debug it in production?
The point is to measure at a common level of proficiency across different languages. Once you are proficient and familiar with a language, then you can measure how long it takes you compared to another language.
>They also state that writing the service in Node took them an hour, two days for Go, and a week for Rust.
>At npm, the usual experience of deploying a JavaScript service to production was that the service would need extensive monitoring for errors and excessive resource usage necessitating debugging and restarts.
But if you factor in the lifetime of the program, where Node saves you a week at the initial implementation and you pay it back in extensive monitoring, it is probably safe to say Rust's TCO is much lower.
Not sure how Rust will fare against Go. But I think there is a high probability that Rust is better in the longer run.
I guess it will be domain dependent. Go uses a highly-developed concurrent GC, which is going to make it a lot more convenient for certain specialized workloads that involve graph-like or network-like structures. (That's the actual use case for tracing GC, after all. It's not a coincidence that garbage collection was first developed in connection with LISP. And yes, you could do the same kind of thing in Rust by using an ECS pattern, but it's not really idiomatic to the language.)
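For reference, a minimal sketch (hypothetical names) of that index-based, ECS/arena-style approach: nodes live in a Vec owned by the graph and edges are plain indices, so neither tracing GC nor reference counting is needed, even for cyclic structures.

    struct Node {
        value: u32,
        neighbors: Vec<usize>, // indices into `Graph::nodes`
    }

    struct Graph {
        nodes: Vec<Node>,
    }

    impl Graph {
        fn add_node(&mut self, value: u32) -> usize {
            self.nodes.push(Node { value, neighbors: Vec::new() });
            self.nodes.len() - 1
        }

        fn add_edge(&mut self, from: usize, to: usize) {
            self.nodes[from].neighbors.push(to);
        }
    }

    fn main() {
        let mut g = Graph { nodes: Vec::new() };
        let a = g.add_node(1);
        let b = g.add_node(2);
        g.add_edge(a, b);
        g.add_edge(b, a); // a cycle is just two indices, no ownership loop
        println!("node {} has {} neighbor(s)", g.nodes[a].value, g.nodes[a].neighbors.len());
    }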
Rust's package management (Cargo) is the best thing I have ever seen of its kind. The very basic thing you can do is:
cargo new funkyproject
Which creates a new barebones Rust project called "funkyproject". Every dependency specified in its Cargo.toml will be automatically downloaded at build (if there is a new version).
When a build is successful, the versions of said dependencies will be saved into a Cargo.lock file. This means if it compiles for you, it should compile on every other machine too.
A Cargo.toml also allows you to use (public or private) repositories as a source for a library, specify wildcard version numbers to only select, e.g., versions newer than 1.0.3 and older than 1.0.7, etc.
Because the compiler will show you unused dependencies, you never really end up including anything you don't use. In practice this system not only works incredibly well, but is also very comfortable to use and isolates itself from the system it is running on quite well.
I really wish Python also had something like this. Pipenv is sort of going into that direction, but it is nowhere near cargo in functional terms.
> Every dependency specified in its Cargo.toml will be automatically downloaded at build (if there is a new version).
Why do people want this? The builds are no longer reproducible, security and edge case issues can come out of nowhere, api changes from an irresponsible maintainer can break things, network and resource failure can break the build, it's just a terrible idea.
The proper use of a semver system is entirely optional and unenforceable, and I've seen people bitten countless times by some developer breaking their package, with everyone complaining... If the tool didn't do the stupid thing of just randomly downloading crap from the internet, none of this would be a problem.
I presume all my dependencies are buggy... I just know that the current ones don't have bugs that I have to deal with right now. You swap in new code and, who the heck knows, it becomes my job again. It's more work because of a policy that doesn't make sense.
Newer code isn't always better. People try new ideas that have greater potential but for a while the product is worse. That's fine, I do it all the time. But I sure as hell don't want software automatically force updating dependency code to the latest lab experiment.
Cities, power plants, defence systems, satellites, and airplanes run on software from the 80s; they don't break because a new version of some library had bugs and it automatically updated, no. They fucking work.
There's a giant, inherent, irreplaceable value in predictability, and this approach ignores all those lessons.
Reproducibility was a core concern for cargo. Your parent is incorrect. A lock file means that your dependencies are never updated unless you explicitly ask for an update.
You're misreading your parent. The download only happens for the first build using a new dependency. As they mention, once the version is written into the Cargo.lock file, that is the exact version that is used until there is an explicit update step run.
Sorry, English is not my first language. I meant this:
When you build initially, the dependencies you use get downloaded. New things are only downloaded if you [A] update, [B] add a new dependency, or [C] clean your project and build it again.
If you update, the versions in your Cargo.lock are ignored and then updated if the build is successful.
If you add a dependency, only that dependency is downloaded; the rest is kept as you had it.
If you clean, it is as if you cloned that project fresh with git, and you will have to download all dependencies. If there is a lockfile, the exact versions from it will be used.
To me this is extremely flexible and works very well, AND you get precise control over versions if you want it. By the way, it is also possible to clone all dependencies and keep a local copy of them, so you are really 100% sure that nothing could ever change with them. Although I am quite sure crates.io doesn't allow changes without a version-number change, which means you should be safe as long as you rely on the version number.
Yes, I suppose that's rather misleading, and that sentence contradicts the actual behaviour described later in the original comment. For a fixed set of dependencies, versions are only checked and changed on an explicit 'cargo update' run.
The current version of npm does this and this is "correct behaviour". I got bitten by this a few weeks ago.
For the passers-by, the only way to make npm behave as expected in this specific case is to use "npm ci" instead of "npm install". If you do not do this, npm will assume you want to update the packages to the latest version at all times, at all costs, even if you have a lock file in place, and even if you have your package file and lock file pinned to exact versions (i.e. 2.0.0 exact, not ^2.0.0).
This is a new addition; it was added a couple of months ago. Before that, you had to check your dependencies into your source control. That might still be the best practice, and likely the only trustworthy way to get reproducible builds consistently over a longer time horizon.
Yes - to be more specific, you can lock your own package's dependencies to an exact version, but you cannot lock dependencies of your package's dependencies. You can't do anything about them. They will get updated because their package lock specifies the lock in the form of ^2.0.0. The fact that a package lock can resolve to multiple versions is counterintuitive. One would think the whole point of a package lock is to lock packages.
As a result, when you do a npm install in your oblivious and happy life, npm naturally assumes you want to summon Cthulhu. If you didn't want to summon Cthulhu, why did you call the command that summons Cthulhu? Yes, the default command summons Cthulhu because we believe in agile Cthulhu. If you don't want to summon Cthulhu, try this undocumented command with a misleading name we've added silently a few weeks ago for weird people like you who don't want to summon Cthulhu when they want to do a npm install. But seriously, why do you not want to summon Cthulhu?
Unfortunately, this was the impression I've gotten of the position of npm folks when I read a few threads about this. I've moved to npm ci for now and moved on. Npm's package lock is many things, however, none of the things it is, is a package lock.
No, you guys don't understand: npm updates the package lock even when not adding a new package, i.e. on the initial `npm install`. It's insane. I'm thinking of going back to yarn again..
Why is that insane? What else is supposed to happen when you install a package?
EDIT: I misunderstood and thought you were talking about installing a package. If you're running `npm install` to just reinstall dependencies then yes the lockfile should not be modified. However it seems like that is indeed the case and you may be talking about a prior bug with NPM.
`npm install` is what you the developer would run when you first clone a project; it should install exactly what's in the package-lock.json file. Unfortunately, it sometimes doesn't do that.
Well just like many other languages with sane environment (dependencies, building, etc.) management. I think this is the norm nowadays (D, Clojure, and so on).
That sounds pretty similar to NPM, as well as NuGet and Paket for .NET. TBH, it's the 'obvious' way for a package manager to work, so I'd be a little surprised if they didn't all work more or less the same?
You have to run npm ci instead of npm install to get npm to respect the lock file. I don’t consider that remotely obvious. And this feature was just added to npm last year, 8 years after npm was invented!
That is incorrect. Both `npm install` and `npm ci` respect the lock file, and if a lock file is present, will make the `node_modules` tree match the lock file exactly.
`npm ci` is optimized for a cold start, like on a CI server, where it's expected that `node_modules` will not be present. So, it doesn't bother looking in `node_modules` to see what's already installed. So, _in that cold start case_, it's faster, but if you have a mostly-full and up to date `node_modules` folder, then `npm install` may be faster, because it won't download things unnecessarily.
Another difference is that `npm ci` also won't work _without_ a `package-lock.json` file, which means it doesn't even bother to look at your `package.json` dependencies.
Thanks for the reply Isaac! This doesn’t match my first-hand experience unfortunately. Are there any circumstances under which npm install with a lockfile present deviates from the lockfile where npm ci does not?
If you run `npm install` with an argument, then you're saying "get me this thing, and update the lock file", so it'll do that. `npm install` with no argument will only add new things if they're required by package.json, and not already satisfied, or if they don't match the package-lock.json.
In the bug linked, they wanted to install a specific package (not matching what was in the lockfile), without updating the lockfile. That's what `--no-save` will do.
The SO link is from almost 2 years ago, and a whole major version back. So I honestly don't know. Maybe a bug that was fixed? If this is still a problem for you on the latest release, maybe take it up on https://npm.community or a GitHub issue?
> Both `npm install` and `npm ci` respect the lock file
This is not correct. `npm install` will update your dependencies, not install them, disregarding the package versions defined in the lock file.
It feels like you are not getting the point of having a lock file in the first place. It should be obvious that you can't do an install (which npm calls ci) if you don't have a lock file.
The lock file represents your actual dependencies. Package.json should only be used to explicitly update said dependencies.
If you run `npm install` with no arguments, and you have a lockfile, it will make the node_modules folder match the lockfile. Try it.
$ json dependencies.esm < package.json
^3.2.5
# package.json would allow any esm 3.x >=3.2.5
$ npm ls esm
tap@12.5.3 /Users/isaacs/dev/js/tap
└── esm@3.2.5
# currently have 3.2.5 installed
$ npm view esm version
3.2.10
# latest version on the registry is 3.2.10
$ npm install
audited 590 packages in 1.515s
found 0 vulnerabilities
# npm install runs the audit, but updates nothing
# already matches package-lock.json
$ npm ls esm
tap@12.5.3 /Users/isaacs/dev/js/tap
└── esm@3.2.5
# esm is still 3.2.5
$ rm -rf node_modules/esm/
# remove it from node_modules
$ npm i
added 1 package from 1 contributor and audited 590 packages in 1.647s
found 0 vulnerabilities
# it updated one package this time
$ npm ls esm
tap@12.5.3 /Users/isaacs/dev/js/tap
└── esm@3.2.5
# oh look, matches package-lock.json! what do you know.
Now, if you do `npm install esm` or some other _explicit choice to pull in a package by name_, then yes, it'll update it, and update the package-lock.json as well. But that's not what we're talking about.
I often don't know what I'm talking about in general, but I do usually know what I'm talking about re npm.
Everything has evolved to get to that point. I suppose if you start with something modern like npm then it's not obvious how bad the earlier ones are. Compare the good ones with composer, dpkg, rpm, apt or dnf, to name a few examples.
Dub definitely does this (pretty much exactly the same, dub.json = Cargo.toml, dub.selections.json = Cargo.lock), and afaik cpan does something similar.
Python does have something like this, which is conda [0].
It allows specifying dependencies with much of the same freedom you mentioned, in an environment.yaml file and other config files. You can provide arbitrary build instructions via shell scripts, use a community-led repository of feeds for stable and repeatable cross-platform builds of all libraries [1], generate the boilerplate automatically for many types of packages (not just Python) [2], handle compiled-version specifics with build variants / "features", and use pip as the package installer inside a pip section in the conda yaml config file.
I wrote and deployed a production service written in pre-1.0 Rust. In over three years of being deployed I never once had to touch that code. The infrastructure around it evolved several times, we even moved cloud providers in that time, but that particular service didn't need any changes or maintenance. It just kept chugging along.
Perhaps Rust's name is apropos: your code will be so reliable that you won't need to look at it again until it has collected rust on its thick iron framework.
Deploying a service written in any language into production environment at scale of npmjs is far from straightforward.
I think the satire here is that the internet has become so centralized lately that even a simple piece of code in JavaScript requires such a huge behemoth of an org running and maintaining all this monstrous infrastructure.
You've identified the problem, but I think you're wrong about the cause. It's not internet centralization that's the problem, it's the mundane fact that JavaScript does not have a large standard library. And JavaScript absolutely should have a large standard library, but it's not clear what organization would have the motivation to implement, support, and promote one. It's not impossible that a community-driven effort could accomplish the same thing, but nobody seems to be working on it.
I think the biggest hurdle is the fact that there is no stdlib today, so nothing unused has to be shaken out; introducing one would make tree-shaking necessary. I think that's a low barrier now, though, with more than adequate tooling. Another comment mentions a different stdlib for server vs browser, but that's also not a terribly hard problem.
I think a good first pass would come from studying analytics from npm. What are the most used packages? The most stable? I know lodash makes a lot of sense, but there's also underscore. I think the biggest hurdles are really political rather than technical, as everyone is so entrenched now that a one-size-fits-all stdlib would be hard. Not impossible, just hard. I do wish someone were working on it, and I hate to say it, but Google probably has the most skin in the game with V8 and Chrome, yet I don't really trust them not to abandon it. So who else is there? It wouldn't be a very 'sexy project' either, but it still seems worth it to at least try.
Which is roughly the equivalent of every single human being downloading two npm packages per week. To me, this suggests that the real problem is that too many packages are being downloaded.
I think this is a natural result of two things which should be appealing to fans of old-school UNIX philosophy:
- NPM is intentionally suited to lots of small libraries that do one thing and do it (hopefully) well, and composing those libraries in useful ways. Whereas systems like Debian have scaling limits with large numbers of packages, NPM tries hard to avoid this so that one hundred ten-line packages are as reasonable as a single thousand-line package.
- CI systems aim for reproducibility by deploying from source and having declarative configurations, in much the way that most distro package builds happen in a clean-room environment.
Probably a lot of these downloads are from bots. Continuous integration is very common in Node.js/JavaScript projects, so on each git commit, anyone with CI (and no dependency caching) will download lots of packages.
> Which is roughly the equivalent of every single human being downloading two npm packages per week
The current human population of earth is about 7.7 billion, so that number should probably be closer to 1.17 npm packages per week per human being. That is still quite a lot, though
This highlights the problem of averages. Most (99.87% or so) humans download zero npm packages. But those that do, often download them in the thousands at a time. And yes, clean-room CI servers are a big part of that.
You might be surprised (or maybe not) to learn that many service providers are far more willing to spend money on predictably large bandwidth bills than on less predictable changes in their infrastructure which require human time and attention to implement.
The idea of a scripting language is that it does not have a std; it will be different in each environment. You, for example, don't want the same std in Node.js and the browser. Each runtime can choose what APIs it wants to expose.
That’s not a definition for scripting language I’ve ever heard before and it’s neither true nor desirable. Even JavaScript has a standard library - think about things like Set, Map, Regexp, Promise, etc. – because they’re universally useful, as opposed to the runtime environment where things like the DOM are less relevant to many use cases. JSON is a great example of something crossing over as an increasingly high percentage of projects will use it.
Not having a standard library on par with other scripting languages just added overhead and incompatibility for years as people invented ad hoc alternatives, often buggy. The accelerated core language growth has been hugely helpful for that but you still have issues with things as basic as the module system which exist for understandable reasons but are just a waste of developer time.
I would expect that to be par for the course for most languages. The more dynamic the more problematic, but it stands to reason that the less you can check for and enforce statically the more will eventually blow up at runtime.
Resource usage is similar though not exactly aligned e.g. Haskell has significant ability to statically enforce invariants and handle error conditions, but the complex runtime and default laziness can make resource usage difficult to predict.
I'd guess OCaml would also have done well in the comparison, as it too combines an extensive type system which is difficult to bypass with an eager execution model.
Default laziness in Haskell is not as big a problem as is made out to be. For someone like NPM though, Haskell's current GC would probably be too much of a bottleneck. Haskell's GC is tuned for lots of small garbage and does not like a large persistent working-set. But this has nothing to do with laziness.
Honestly just using strict data structures (one StrictData pragma will do) and strict accumulators in recursive functions will get you there, it's not that hard.
I wrote trumped.com and deployed it prior to the last presidential election. The frontend and assets have been redeployed, but the core rust service for speech generation hasn't been touched. I've never had a service this reliable, and it took so little effort!
Rust is the best language I've ever used, bar none, period. And I've used a countless many of them.
The only places where I won't write Rust are for small one-off scripts and frontend web code. (Even with wasm, Typescript would be tough to beat.)
Does every language have to be suitable for every task? “Operating a business-logic-embedding CDN at scale” isn’t ever something Node claimed to be capable of. I wouldn’t expect any other dynamic runtime-garbage-collected language, e.g. Ruby or Python or PHP, to be suited to that use-case either.
Use the right tool for the job. The PHP website is running PHP, but it isn’t running a web server written in PHP. Web servers are system software; PHP isn’t for writing system software. Same thing applies here. Node does fine with business logic, but isn’t really architecturally suited to serving the “data layer” when you have control/data separation.
> The PHP website is running PHP, but it isn’t running a web server written in PHP.
Are you sure? I'm not familiar with the PHP.net architecture, and there may be less gains from how PHP has traditionally tied itself as a module to web servers in the past, but Rails (and any number of other dynamic language frameworks) are actually web servers implemented in that language, with an optional separate web server such as NGINX or Apache you can run in front to handle the stuff they aren't as good at (static file serving, etc).
Now, that is a framework, and not the language proper, but I wouldn't be all that surprised to find python.org running on top of a Python framework.
Their mirroring page suggests they most likely use Apache.
"NOTE: Some of our maintainers prefer to use web servers other than Apache, such as Nginx. While this is permitted (as long as everything ultimately works as directed), we do not officially support these setups at this time"
From the paper it appears to be an authorization service that decides what rights a particular user has. Not a webserver or CDN. It mentions it being CPU bound, though it isn't clear to me why it would be, or why JS wouldn't work well enough for that.
The paper makes it clear they were evaluating languages based upon efficiency.
The rust implementation was more efficient than the JS one. A CPU bound service of course is bottlenecked at the CPU, and this benefits from efficiency.
At scale, it makes sense to replace this with Rust. Javascript did the job, but did not provide the same efficiency as Rust.
I'm not clear on why this is particularly CPU heavy, though: "the authorization service that determines whether a user is allowed to, say, publish a particular package"
> Anything will be CPU-heavy if you give it enough work.
Not in a relative sense. If authorization is 5% of the work, scaling it leaves it at 5% of the work, and it's never a bottleneck. Authorization was being a significant bottleneck, not a tiny percent, and that is somewhat surprising.
No I'm talking about private NPM right. The perms on the file system are not equal to (or as costly as) the auth I need to have to access my private NPM repo.
Couldn't you implement the authorization logic in, say, Redis? Then the npm service is I/O-bound again, and everyone is doing the job they're optimized for.
It's not clear to me either. It makes sense to use Rust for CPU-heavy tasks, but a CRUD service that does authentication would be fine in Node.js, since all the low-level crypto is implemented in C.
So I'm not sure exactly what they mean; to be honest, the paper is very light on details.
Some CPU-heavy operations like crypto are not put in a thread pool. While it may be running C code, it will block the main thread while executing. See https://github.com/nodejs/node/issues/678
Perhaps the new worker threads will alleviate this, but I'm not sure (it's still an experimental API).
My gut feeling is that the user keys are not randomly generated, but are actually encrypted strings that contain the permission values. Decrypting them with modern encryption algorithms (like ChaCha) is pretty CPU intensive.
"Javascript did the job", but not very well apparently. They specifically say that they "were able to forget about the Rust service because it caused so few operational issues". Add to that the increased CPU- and RAM-efficiency that generally comes with a rust implementation, and that rust rewrite looks like a no-brainer.
Users who suggest cryptography are getting downvoted (myself included). I am curious as to why.
Is it perceived as spam or an attack on Node.js? Encrypted cookies and JWTs (JSON Web Tokens) rely on a similar strategy, so this is pretty standard. It would be pretty secure as long as the encryption key is not unique, so this is definitely not criticism of the npm team, just theories as to why what one would assume is a database or memory bottleneck is being presented as a CPU bottleneck in this scenario.
I guess I just don't see it. NPM is a massive infrastructure where one little piece was rewritten in Rust and then the Rust team wrote a promotional paper about it. The paper isn't bad, but it also isn't a technical white paper.
I don't see how any of this is critical of JavaScript, which isn't even really discussed in the paper and still runs the rest of the infrastructure. If anything the paper is more damning of C, C++, and Java but I still think damning is far too extreme to describe what the paper said.
I don't see it as damning of either of these. On C++ it says "we didn't want to learn it." Which is fine. Maybe after learning it they would have decided differently, or not. On Java they said "we didn't want to learn how to operate it", as they feared the complexity of a Java application server for a single small service, which they can instead create in a way that hooks into their monitoring infrastructure. No damning there either.
However, their company's purpose is to push JavaScript, and they are saying "operating JavaScript is hard, doing this in Rust is easy", which goes directly against their business.
I mean, we still write the overwhelming majority of our code in JavaScript. We port to Rust when a CPU-heavy task becomes a bottleneck to the rest of the system. It's not as if this paper is saying (nor is it the case) that we've ported the whole registry to Rust. JS has lots of advantages we appreciate.
I'm curious why this authorization service is CPU intensive. The article says it's basically deciding if you're authorized to publish a package. It sounds like the sort of thing that would talk to a database, or a cache like redis, and therefore mostly be IO bound itself.
It's been a few years since I was directly involved in engineering, but my fairly educated understanding is that it's more around reading of possibly-private packages than publishing.
Publishing is a relatively rare event compared with reading, but in a world of private packages, orgs, and teams, the "can {user} read {object}" gets more complicated. It probably wouldn't be CPU bound if not for the sheer scale we're dealing with, but once all the IO bottlenecks are resolved, you still have to check to make sure that a login token is valid, then get the user associated with it, then check the teams/orgs/users with access to a thing (which might be public, in which case, the problem is a lot simpler, but you still have to get that info and check it), and then whether the user is in any of those groups. So there's a straightforward but relevant bit of CPU work to be done, and that's where Rust shines.
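To make that flow concrete, here is a rough sketch of what such a check could look like in Rust. This is purely my illustration, not npm's actual code; all of the types and lookups (Token, UserId, validate_token, orgs_for_user, and so on) are hypothetical stand-ins.

    // Hypothetical "can {user} read {package}" check, roughly following the
    // steps described above. Not npm's real implementation.
    struct Token(String);
    struct UserId(u64);

    enum Visibility {
        Public,
        Restricted { allowed_users: Vec<u64>, allowed_orgs: Vec<u64> },
    }

    struct Package {
        visibility: Visibility,
    }

    fn validate_token(token: &Token) -> Option<UserId> {
        // A real service would verify a signature or consult a session store.
        if token.0.is_empty() { None } else { Some(UserId(42)) }
    }

    fn orgs_for_user(_user: &UserId) -> Vec<u64> {
        // Stand-in for looking up the user's org/team memberships.
        vec![1, 7]
    }

    fn can_read(token: &Token, pkg: &Package) -> bool {
        // 1. Is the login token valid, and which user does it belong to?
        let Some(user) = validate_token(token) else { return false };
        match &pkg.visibility {
            // 2. Public packages make the problem a lot simpler.
            Visibility::Public => true,
            // 3. Otherwise: is the user, or one of their orgs, on the access list?
            Visibility::Restricted { allowed_users, allowed_orgs } => {
                allowed_users.contains(&user.0)
                    || orgs_for_user(&user).iter().any(|o| allowed_orgs.contains(o))
            }
        }
    }

Each of those steps is cheap in isolation; as the comment notes, it's the sheer request volume, once the IO bottlenecks are gone, that turns this into real CPU work.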
One wonders if there'll be a similar talk in a year...
> The audience walks away feeling empathetic that they aren’t alone in their journey to writing idiomatic Go and is now equipped with strong refactoring techniques developed by some of the world’s top engineers for the Kubernetes project.
As an occasional user of Kubernetes, minikube, etc., it's not something I would have guessed to have been developed by the world's top engineers.
I mean, Kubernetes tries to, and probably manages to, provide a useful abstraction, but at a few million LOC, plus a few man-months of full-time senior engineering effort to run anything in production, it's not exactly the epitome of elegant and efficient engineering.
You would be surprised at the quality of Android tooling's stable releases, to the point that there is now Project Marble in place to try to improve its image.
Even people working for Google on Android tell me the tooling is pretty garbage, so I'm less surprised than you might think ;)
The interesting thing with Kubernetes is that it's basically a re-imagining of Borg, which one assumes was not a few million lines of code when it was already running all of Google's infra more than a decade ago. It's obviously not solving exactly the same problem (e.g. Google correctly recognized that DNS isn't so hot and wrote their own replacement protocol, BNS, which wouldn't fly for external adoption, etc.). But I'd be curious to know how Borg's code quality and size back when it became the standard way to run stuff at Google, maybe 12 years ago, compares to Kubernetes today.
That’s an ongoing annoyance with using NPM in a security-conscious environment. It’s really easy to end up with thousands of submodules and the amount of time you’ll have an audit showing a vulnerable package can be many months while layers of dependencies slowly update. You can usually show that it’s not exploitable but the number of modules on the average project means you’ll be doing that all the time.
Yes - which is great for surfacing this, along with GitHub’s alerts, but unless it’s a direct dependency I find I’m usually just stuck researching the vector and waiting months for numerous layers of dependencies to update in sequence.
Oh, and to be clear: I think this is a problem with OSS sustainability – shipping updates takes real work – more than NPM, mildly exacerbated by the sparse JS stdlib leading to more modules being used instead.
The distinction in the vernacular between statically typed and strongly typed languages is so narrow at this point that pointing it out is a bit pedantic.
I would say Rust is both a strongly typed language and a statically typed language. The static type checking happens at compile time, and in general the types in use are strict and strongly typed at runtime.
But, even Rust allows you to cast types from one to another and use dynamic types determined at runtime.
Yes, most people would say that static typing is the primary advantage you get from the compiler in Rust.
It wasn't intended as a drive-by pedantic swipe - I was genuinely curious whether OP meant strong or static. Conversations about type systems and application correctness are exactly the place where precise definitions are welcomed, but I understand that the distinction between strong and static is often conflated. It can be relevant if we're discussing static typing for example, as then something like TypeScript becomes useful.
There's a great writeup by one of the C# people (I want to say it was Erik Meijer, but I'm having a hard time finding it atm) about the distinctions we're discussing here, their relevance to correctness, and the impact on ergonomics. My takeaway from it was that occasionally you will encounter problems that are easier to solve with some freedom and that's why strong/static languages like C#/Rust include pragmatic escape hatches like the dynamic object and the Any trait (respectively).
> The distinction in the vernacular between statically typed and strongly typed languages is so narrow at this point that pointing it out is a bit pedantic.
I'm not sure why you think this has changed today, or what you mean. It appears to me that many programmers don't realize that the two are orthogonal, so I find it an important, not pedantic, distinction (it just happens that languages generally improve on both fronts over time, hence asking for one also gives you the other, but it's because of correlation, not causation). I'll make an attempt at describing it here, please tell if I'm missing something.
Strong typing means that types describe data in a way that the data won't accidentally be mistreated as something other than what it represents. E.g.
(a) take bytes representing data of one type and interpret it as data of another type (weak: C; strong: most other languages)
(b) take a string and interpret it as a number without explicitly requesting it (weak: shells, Perl; strong: Python, JavaScript, Ruby)
(c) structs / objects / other kinds of buckets (strong: C if the type wasn't cast; weak: using arrays or hash maps without also using a separate type on them that is enforced, the norm in many scripting languages, although usually strengthened by using accessor methods which are automatically dispatched via some kind of type tag; also, duck typing is weaker than explicit interfaces)
(d) describe data not just as bare strings or numbers, but wrap (or tag / typedef etc.) those in a type that describes what it represents (this depends on the programmer, not the language)
(e) a request of an element that is not part of an array / list / map etc. is treated as an error (similar to or same as a type error (e.g. length can be treated as being part of the type)) instead of returning wrong data (unrelated memory, or a null value which can be conflated with valid value)
These type (or data) checks can happen at runtime ("dynamically") or at compile time ("statically"). The better a static type system is, the more of these checks can be done at compile time.
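To make that split concrete, here's a tiny Rust illustration (mine, not the parent's): some of the checks above are rejected at compile time, while others surface as explicit runtime values you have to handle.

    fn main() {
        // A (b)-style confusion is a compile error: a string is never
        // silently treated as a number.
        // let n: i64 = "42";            // rejected at compile time

        // Conversion must be requested explicitly, and its failure is a
        // value that has to be handled.
        let n: Result<i64, _> = "42".parse();
        println!("{:?}", n); // Ok(42)

        // An (e)-style out-of-range access is checked at runtime: you get
        // None (or a panic with indexing), never silently wrong data.
        let xs = vec![1, 2, 3];
        println!("{:?}", xs.get(10)); // None
    }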
For security and correctness, having strong typing is enough in principle: enforcing type checks at runtime just means getting a failure (denial of service), and systems should be designed not to become insecure or incorrect when such failures happen (fail closed), which of course might be done incorrectly [1]. Testing can make the potential for such failures obvious early (especially randomized tests in the style of quickcheck).
Static typing makes the potential for such failures obvious at compile time. It's thus a feature that ensures freedom from denial of service even in the absence of exhaustive testing. It can also be a productivity feature (static inspection/changes via IDE), and it can enable more extensive use of typing as there is no cost at run time.
[1] note that given that static type systems usually still allow out-of-memory failures at run time, there's usually really no way around designing systems to fail closed anyway.
It's not that I disagree with any of your points. I think they're all valid. I do think most people use the term strongly typed where they mean statically -and- strongly typed.
I personally don't fret about it unless we get into specific details about these notions. I do especially like your (d), which many people often overlook when designing programs. An example would be to use a String as the Id in a DB, but not wrap the String in a stronger type to represent the Id, thus not getting the advantage of static type checking by the compiler. So there are definitely areas where this conversation can lead to better advantages of different languages.
For example, in Rust, declaring a type to be a `struct Id(String);` adds no memory-allocation overhead for the Id compared to just a String. Not all languages can say that, so we could also get into a fun conversation about the overhead associated with the type system itself. All fun topics.
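A quick sketch of that point, purely as an illustration (the `Id` and `find_user` names are made up):

    use std::mem::size_of;

    struct Id(String); // newtype wrapper around String

    fn find_user(_id: &Id) { /* ... */ }

    fn main() {
        // Same in-memory size as a plain String...
        assert_eq!(size_of::<Id>(), size_of::<String>());

        let id = Id(String::from("user-123"));
        find_user(&id);

        // ...but the compiler now rejects a bare String where an Id is expected:
        // find_user(&String::from("user-123")); // type error
    }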
Do you actually believe it and why? To me it looks like a fairly pathetic attempt to damn JavaScript but I'm not sure why anyone who has actually used JS for any length of time would be convinced of its accuracy.
The article is hosted at rust-lang.org so one would be wise to take their words with a grain of salt. And Rust isn't even a tiny fraction as popular as JavaScript and so when you use Rust you're choosing from a small set of packages written by experts and the kind of people who use languages that nobody else really uses. Meanwhile, JS has millions of packages written by everybody for various platforms (since JS can run in all sorts of environments where nobody would ever want to run Rust.)
Also there's the anecdotal, yet easily empirical evidence that just about any developer who uses JS can tell you about: I deploy new JavaScript services all the time without any of those problems.
So, I wonder if you're actually asking this question or if you have some other agenda.
This paper is written about the experiences of the organization that distributes those millions of packages. These are the same people who have written so much important JavaScript that their tool gets distributed with Node itself.
What kind of load do your node services get? I’d be willing to bet the npm registry has more. That plays into this kind of thing.
The article may have been hosted at rust-lang.org but it was written by people at npm which is very much a JavaScript boosting organization.
Rust definitely has some benefits as well as tradeoffs when compared to JavaScript, which they discuss. The learning curve is higher, but the end product is probably free of a number of errors and operational issues over the lifetime of the service. While it is in theory possible to get similar results with JavaScript, the level of consistent discipline that requires is in practice impossible.
The idea that some programming languages can solve scalability issues is a myth. A language cannot solve scalability issues; all they can do is push the needle a tiny little bit further in terms of performance but this is completely meaningless.
Scalability is an architectural concern which cannot be ignored by system developers. This is because scalability is not about speed or performance, it's all about figuring out which workloads can be split up and executed in parallel; in order to do this, you need to understand the real-world problem which the software is trying to solve; this is not something that you can delegate to a compiler.
The best that a language can offer in terms of scalability is to make it easier to reason about parallel workloads and make the difference between serial and parallel workloads as explicit as possible. Whenever a language tries to hide the complexity of parallelization behind thread pools, they're not solving any real scalability issue; they're just delaying them some more.
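As one way of picturing that explicitness, here is a small std-only Rust sketch (my own example, nothing to do with npm's service): the decision about which part of the workload runs in parallel is written out in the code rather than hidden behind a pool.

    use std::thread;

    fn main() {
        let work: Vec<u64> = (0..1_000_000).collect();

        // The parallel split is explicit: we divide the input ourselves and
        // hand each half to its own scoped thread.
        let (left, right) = work.split_at(work.len() / 2);

        let total: u64 = thread::scope(|s| {
            let a = s.spawn(|| left.iter().sum::<u64>());
            let b = s.spawn(|| right.iter().sum::<u64>());
            a.join().unwrap() + b.join().unwrap()
        });

        println!("{total}");
    }

Whether the split is worth it for a given workload is still an architectural judgement; the language only makes the choice visible.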
> The idea that some programming languages can solve scalability issues is a myth.
True.
> A language cannot solve scalability issues; all they can do is push the needle a tiny little bit further in terms of performance but this is completely meaningless.
> Scalability is an architectural concern which cannot be ignored by system developers.
A language can prevent or delay such architectural concerns from being addressed by not offering sufficient capabilities.
I don't understand the criticism of thread pools. Their only purpose is to avoid the expensive creation of threads. They don't do anything by themselves.
This stuff happened before, or at least around, the time that .NET Core was released. So it either didn’t exist or was an extremely new option, at least.
According to the article they have been using Rust in production for 1½ years; .NET Core 2 was released in 2017, so it definitely existed. Whether .NET Core was mature enough for their liking is, however, another matter.
.NET Core 2 was released in August 2017. That’s basically one and a half years ago. Depending on how fuzzy the one-and-a-half-year figure is, it may not have been released yet; it may not have literally been 18 months exactly. That’s all I’m saying.
For all this talk about performance, are there any benchmarks anywhere? Also, is there any blog post or anything by npm itself that this is sourced from?
I have my doubts and confusion about this problem statement
> Most of the operations npm performs are network-bound and JavaScript is able to underpin an implementation that meets the performance goals. However, looking at the authorization service that determines whether a user is allowed to, say, publish a particular package, they saw a CPU-bound task that was projected to become a performance bottleneck.
Oh, Really ???
So essentially it's the authorization service, and I doubt that the security algorithm computations are the main cause.
What I don't understand here is why it is not possible to write lower-level JS or asm code to craft well-optimized code which V8 can totally nail down to the minimum CPU instructions required?
My guess is that this uses public key cryptography. Generating a signature is rather expensive. We found this out the hard way a few years ago, when we tried to verify OAuth tokens using RSA rather than HMAC.
The server was hammered at maybe 500 signatures a second, greatly slowing down token generation.
I'm guessing they dropped to Rust so they could use native C libraries for signature generation.
The statement about Go using global dependencies being the standard is just the opinion of someone on the Go team. I’ve written Go for a few years now and never once shared a dep across projects. Create a new folder, set your GOPATH (use direnv), and pull your deps. I very much doubt they’d be more productive in Rust vs. Go had they actually given Go a chance.
You shouldn't be afraid to make independent microservices. Using a different language makes it more likely that a service can be deleted, and more easily rewritten. Being forced into a monolith usually means the architecture has a complexity problem.
Can someone please give this outsider noob an ELI5 explanation of why Golang does not adopt the same policy/attitude/implementation towards package management?
Go is being developed by Google and as such is designed to meet their needs. Google's view of package management is to bring every package in-house and manage it themselves. So, if a Go program needs package "A" then Google will directly import that package into their workflow.
This is actually one of the reasons that I don't think Go is a very good language for most people/organizations. It was conceived with a specific set of guidelines that Google needed; easy to learn, performant, etc. Go is also designed to be used by teams of thousands, so it's much easier to adopt misc packages into the fold and maintain them than it is to manage links to outside requirements.
It's heartening to see the community being mentioned as a positive factor.
> npm called out the Rust community as a positive factor in the decision-making process. Particular aspects they find valuable are the Rust community’s inclusivity, friendliness, and solid processes for making difficult technical decisions. These aspects made learning Rust and developing the Rust solution easier, and assured them that the language will continue to improve in a healthy, sustainable fashion.
We've updated the title from “Rust at NPM”. It's not so obvious how to extract a decent 80-character title from this one so we can cut the OP some slack.
> Referring to deploying the JVM as “operational complexity” or referring to C or C++ as “unsafe” both strike me as insecure, defensive attitudes.
Calling their attitudes "insecure" sounds petty to me. The engineering rationale was laid out and makes sense, especially if you apply a little charity and read that the JVM operational complexity was undesirable TO THEM.
If your org already has JVM deployments and experience managing them at scale, then this argument against Java is not valid.
> If not, then you might use a different language
That's what they decided to do and then got slammed for.
> Regardless though, these feel like non-sequitur, pejorative pot shots to characterize Java, C or C++ in some type generic bad light, as if those tools are a priori conceptually bad, which is a silly point of view.
Maybe try reading their reasons again, and assume they are talking about their org, use case, and engineers and not everyone else.
In the post it is specifically stated that Java was ruled out a priori because deploying the JVM and associated libraries is unqualified “operational complexity” that they don’t desire. They have not said why that is undesirable in a use-case specific way, only that they view it as generically undesirable.
Similarly they brush past usage of C or C++ because of “unsafety” without explaining why you could not achieve a parsimonious, safe solution in either of those languages.
Here is the quote again where all three languages are disingenuously dismissed in a manner that is clearly meant to be pejorative,
> “Java was excluded from consideration because of the requirement of deploying the JVM and associated libraries along with any program to their production servers. This was an amount of operational complexity and resource overhead that was as undesirable as the unsafety of C or C++.”
Upon re-reading the article, I stand firm that my analysis is fair, that I’m not reading anything into it, and that the article needlessly tosses in malignant comments about these other languages, and that this is a clear indicator of insecurity.
Nothing in the article says no one should consider Java, just that they rejected it.
You know nothing about their team expertise (probably very little Java background) and their deployment setup.
The concerns are perfectly valid in my opinion:
* The JVM brings an increase in deployment complexity (version upgrades, etc.)
* Deploying a high-volume Java application requires experience and usually also tuning of the VM parameters, which means trial and error and an increased monitoring burden for the ops team
* Java is an old language with plenty of quirks and oddities, which is fine if you work with it a lot, but a consideration if you want to bring it in as a new stack
Personally, I also always reach for Go or Rust first because of the mentioned benefits:
- single, static binary means trivial deployment
- performance and memory usage are pretty predictable, often good enough without much adaptation, and require no parameter tuning
I think you are only proving my point here. They have not provided any specific details about any of this, and did not elect to make a prototype implementation in Java to actually measure the costs and compare to the other solutions.
So you’re precisely right that I don’t know why they chose to dismiss Java as a candidate language. That’s my whole point.
If they aren’t going to provide evidence or more detailed explanation of why Java wouldn’t solve it well, then it seems very disingenuous that they do throw in a pejorative comment like saying Java adds unqualified operational complexity.
And the same thing for unqualified “unsafety” of C or C++. I’m willing to cut them more slack about C/C++, but still there is zero need to toss in these pot shots as if those languages are always inferior in an absolute sense (that is the only thing we can possibly think they mean, literally because they dismiss those choices out of the box with little discussion and no experimentation).
They could have just said something like this:
“We elected to create candidate implementations in node, go and rust. We did not elect to consider Java, C or C++, but would encourage others to evaluate those alternatives if they seem applicable in other use cases.”
> It’s the same with C or C++. These languages support a type of low-level memory and data model that is permissive in ways that can lead to errors if not used properly.
To be honest, I don't think I know of a single piece of C/C++ code that doesn't suffer from security exploits from "improper" memory use.
Chrome does, Firefox does, Linux does. Hell, car and pacemaker software suffers from exploits, most likely caused by improper memory handling.
This type of thinking is exactly why I stay away from the developer communities of Rust and Julia. Both communities seem to have this weird perception that their respective language categorically supersedes other tools in all use cases.
Don’t get me wrong. Rust and Julia are very cool languages with a lot of use cases where they are great choices. But the communities come off as disproportionately full of zealots, which makes me less interested to put effort into yoking real projects to either of those communities.
For what it's worth, I'm not a Rust dev. I did write some code in Rust, but by that definition I'm also a Fantom dev.
I've just seen too many memory exploits in C/C++ to consider statements like "you can achieve memory safety in C/C++ if you're a True Scotsman" as anything other than a logical fallacy.
> "I've just seen too many memory exploits in C/C++ to consider statements like "you can achieve memory safety in C/C++ if you're a True Scotsman" as anything other as logical fallacies."
The problem is that for each person who offers your anecdata, there's another person who offers the antithetical anecdata. For example, in my experience with two very large defense lab projects written in C and C++ spanning the 80s to the early 00s, both projects had exceptionally nice tooling to prevent memory issues, and it was pleasant to work with the code. Debugging time was usually spent on application logic bugs and very rarely, if ever, spent on memory leaks, segfaults, corrupt data, alignment issues, etc.
While I know this isn't directly applicable to Rust, I have also worked in a separate codebase that was a huge Haskell project and it was a nightmare of spaghetti code, performance bugs that were impossible to reason about, some segfaults, and it was extremely painful to descend into unsafe code for the occasional situations when it was required. Having the compiler to help us and leveraging the type system as part of the application design didn't really offer any significant benefits.
This doesn't prove anything except that, like usual, it's situation-specific and depends on relevant trade-offs at hand. It almost never has anything to do with the generic types of safety guarantees that a framework or language allows you to formally verify.
I did not mean to reply to a different comment. I did address the parent's point by explaining that the premise of the point (the claim that no piece of C/C++ code is ever safe) should be categorically rejected and not engaged with.
> "Replying to everybody who replied to your comment is pointless if you don't take the time to read and process the replies."
This feels like you are being needlessly antagonistic and assuming incorrectly that I failed to read other comments or replies. If you did not understand the manner in which my comment was responding to the parent, that's fine, but it was and I don't appreciate your tone.
Maybe it is possible that some languages are actually generally worse for some domains? There isn't any physical law that prevents one from making a language that ends up not being a useful tool in any toolbox. Java might fit that category today, now that the JVM can be leveraged with Kotlin, Clojure, and Scala... and that .NET Core usually performs better than the JVM.