Speaking of concurrent programming and parallelism: if you're not into functional programming, check out Apple's Grand Central Dispatch[1] and Objective-C Blocks[2].
Unless you write your own Objective-C http server and run it on Mac OS X Server (it's not that hard, I've done it), this isn't very useful for Web programming. However, if you're comparing the languages / frameworks themselves (you can use all three to code command line tools, for example), GCD becomes a very seductive option.
GCD works by throwing code blocks (obj-c closures) into queues, and letting the runtime do its magic. You can have it execute code synchronously, asynchronously, or in parallel.
GCD will optimize and distribute your blocks across the available CPU cores. You can even enumerate using blocks: instead of running loop iterations one by one, it will distribute them across the cores in parallel.
People tend to lump Node and Erlang together because they both avoid shared-state concurrency. But they're completely opposite approaches: Erlang has concurrency but no shared state. Node has shared state but no concurrency.
Well, that's debatable. I suppose if you squint a bit and mentally perform a reverse CPS transform on a Node program, you end up with a set of tasks that are executing concurrently. In that case, I'll rephrase: Erlang has parallelism (on adequate hardware) but no shared state; Node has shared state but no parallelism.
Still, given the way Node forces programmers to manually unravel tasks and write everything as callbacks, I'm not inclined to call it "concurrency" even if, e.g., the processing of a group of web requests overlaps in wall-clock time.
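To make that concrete, here's a toy sketch of why wall-clock overlap isn't parallelism in Node: a CPU-bound callback monopolizes the single JavaScript thread, so every other pending callback just waits.

    // Two timers are scheduled "concurrently", but the busy-wait in the
    // first callback holds the one JS thread, so the second callback,
    // due at ~10ms, can't run until roughly the 3-second mark.
    setTimeout(function () {
      var start = Date.now();
      while (Date.now() - start < 3000) { /* simulate CPU-bound work */ }
      console.log('heavy callback finished');
    }, 0);

    setTimeout(function () {
      console.log('light callback ran at +' + process.uptime() + 's');
    }, 10);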
The article is not "gobbledygook". A well-known limitation of NodeJS is its lack of support for parallelism. You have to exploit the parallelism of the OS itself by pre-forking the Node server.
NodeJS is great for applications with a lot of clients, but not for CPU intensive apps. That's why I predict similar technologies built on Erlang, Scala, and Go will have more longevity than NodeJS.
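For reference, the pre-forking approach is only a few lines with Node's cluster module (a minimal sketch, assuming a Node build that ships cluster):

    var cluster = require('cluster');
    var http = require('http');
    var os = require('os');

    if (cluster.isMaster) {
      // One worker per core; the OS provides the parallelism Node lacks.
      for (var i = 0; i < os.cpus().length; i++) cluster.fork();
    } else {
      // All workers share the same listening socket.
      http.createServer(function (req, res) {
        res.end('handled by pid ' + process.pid + '\n');
      }).listen(8000);
    }

Each worker is still single-threaded internally, so a CPU-heavy request will still stall whatever other requests happen to land on the same worker.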
That works really well for handling requests for HTML pages, because they tend to render independently of one another. However, you run into trouble when you try to make "Nodes" communicate, and the comment that the article is addressing specifically mentions interprocess communication.
Clearly it's not that easy unless you plan for it from the beginning. One benefit of Erlang is that you have to structure your code like a distributed application. You can mess that up, but it's harder.
I used the word "gobbledygook" because the article is poorly written, not because I think he's wrong to criticize Node. Reread the paragraph on Node and tell me it doesn't meet the Wikipedia definition: "text containing jargon or especially convoluted English that results in it being excessively hard to understand."
Since the early days of node there has been a proposal to dispatch tasks via the WebWorker API, with a callback, as is currently done for calls to OS subsystems. That sounds like it would be a great way of dispatching CPU-intensive tasks without breaking the semantics of Node. What happened to this?
commit 9d7895c567e8f38abfff35da1b6d6d6a0a06f9aa
Author: Ryan <ry@tinyclouds.org>
Date: Mon Feb 16 01:02:00 2009 +0100
add dependencies
How old is Erlang? 25 years or so?
> What happened to this?
There's been some preliminary stuff on giving spawned node processes a more slick API, with the intent of then being able to optimize them in some way.
Whether or not it'll end up at the WebWorker API is yet to be seen, but that'd certainly fit with node's "don't reinvent BOM conventions where they fit" pattern.
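In the meantime, child_process.fork already gives you the WebWorker shape (dispatch a task, get a callback back), just with a process instead of a thread. A minimal sketch (the fib task is invented for illustration):

    // parent.js - hand a CPU-intensive task to a child process.
    var fork = require('child_process').fork;
    var worker = fork(__dirname + '/worker.js');

    worker.on('message', function (result) {
      console.log('fib(35) =', result);  // callback semantics preserved
      worker.kill();
    });
    worker.send(35);

    // worker.js - runs in its own process, so it can burn CPU freely.
    process.on('message', function (n) {
      function fib(x) { return x < 2 ? x : fib(x - 1) + fib(x - 2); }
      process.send(fib(n));
    });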
That seems too nitpicky to me. You can use those definitions if you like, but it's not the common usage. To most programmers those terms are synonyms. In my experience, people trying to be precise about architectures like node.js use the term "asynchronous" and not "concurrent".
Calling node.js concurrent obscures the important fact under discussion: namely that it won't scale beyond one CPU in a world where 8-core servers are routine.
The distinction between parallelism and concurrency is extremely important. The guys who wrote Real World Haskell did a good job of explaining it here http://book.realworldhaskell.org/read/concurrent-and-multico... (explanation has nothing to do with Haskell).
In essence, concurrency has to do with systemsy stuff: how to do things that might overlap without causing problems (race conditions). Parallelism, on the other hand, is about breaking a problem into smaller parts and attacking it in pieces. The problem with most languages is that they require the programmer to worry about both at the same time; languages like Erlang alleviate most of these problems, the biggest of which is shared state.
You're arguing semantics: about words, not meaning. My point wasn't that this isn't interesting, but that the jargon you are using (and that book is using, for that matter) is revisionist and confusing. That's just not what "concurrency" means to most working programmers, who have used it for decades to talk about (ahem) "systemy stuff".
Rewriting language via blog posts doesn't work (cf. "hacker"). Doing so as a way to, frankly, cover up a huge design flaw in your favorite library just seems dumb to me.
Seems too fine a point for me too. The number of CPUs doesn't matter; that's up to the scheduler. What limits the scheduler is the coordination of shared resources: CPU, locks, memory, disk, network, IO, etc. The multi-node issue doesn't seem that much of a problem in that you can start a node.js process per core. There's very good performance doing this for MySQL, for example.
My node.js project spreads all node.js i/o across multiple node processes (in machine, network, or browser) and creates a distributed EventEmitter across all nodes.
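Not to speak for the parent, but the core trick can be sketched in a few lines: mirror every local emit across an IPC channel and re-emit whatever arrives from the other side. A toy version (the {event, args} wire format is invented for illustration):

    var EventEmitter = require('events').EventEmitter;

    // peer is anything with send() and a 'message' event, e.g. a forked
    // child process. Events emitted on either side fire listeners on both.
    function distributedEmitter(peer) {
      var emitter = new EventEmitter();
      var localEmit = emitter.emit.bind(emitter);

      emitter.emit = function (event) {
        var args = Array.prototype.slice.call(arguments, 1);
        peer.send({ event: event, args: args });   // mirror to the peer
        return localEmit.apply(null, arguments);   // fire local listeners
      };
      peer.on('message', function (msg) {
        // Inbound events use the plain emit, so they don't bounce back.
        localEmit.apply(null, [msg.event].concat(msg.args));
      });
      return emitter;
    }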
It would be pretty simple conceptually (maybe not practically) to make node.js work in an actor-like way, here's a piece of toy code I wrote that does it for JS (not node, but no reason the same couldn't be done for node):
http://blog.ometer.com/2010/11/28/a-sequential-actor-like-ap...
By "actor-like way" here I just mean a code module ("actor") sees one thread (at a time), and the runtime takes care of the details of scheduling threads when a module has an event/message/request to process. Also I guess avoiding callbacks. But you could be more Erlang-ish/Akka-ish in more details if you wanted.
node.js punts this to the app developer, who instead runs a herd of processes. In most cases that's probably fine, but in theory, with one process and many threads, the runtime can do a better job saturating the CPU cores, because it can move actors among the threads rather than waiting for the single-threaded process an actor happens to be in to become free. The practical situations where this comes up, I admit, are probably not that numerous as long as you never use blocking IO and are basically IO-bound. (Only CPU-intensive stuff would cause a problem.)
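For what it's worth, the single-threaded version of that mechanic is tiny: give each actor a mailbox and drain it one message at a time, yielding to the event loop between messages. A toy sketch (not the linked code):

    // Each actor processes one message at a time; ordering is preserved
    // and the handler never sees interleaved execution for the same actor.
    function spawn(handler) {
      var mailbox = [];
      var busy = false;

      function drain() {
        if (busy || mailbox.length === 0) return;
        busy = true;
        handler(mailbox.shift(), function done() {
          busy = false;
          setTimeout(drain, 0);  // yield so other actors and IO get a turn
        });
      }

      return { send: function (msg) { mailbox.push(msg); drain(); } };
    }

    var counter = spawn(function (msg, done) { console.log('got', msg); done(); });
    counter.send('hello');
    counter.send('world');

The multi-threaded win described above comes from letting a runtime schedule many such mailboxes over a thread pool, which Node doesn't expose to JavaScript.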
Does anyone know how well Scala can handle the requirements of a typical Node.js project? I.e., thousands of network connections and a rather light CPU load overall? Can Scala be a Node.js or Erlang replacement?
Erlang has a slightly different but largely equivalent model, but the other alternatives you mention lack Go's channels and/or the select construct, which is one of the greatest things about the language.
When I found that Stackless Python didn't have a way to read/write on multiple channels at once I was quite shocked.
It also looks as if the "select" statement could be done as a combinator library.
The power of Haskell (or of any proper modern language) isn't in the language itself, it's in the number of things you can express as a library, on top of the language.
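To illustrate that point with the case above: a select-style wait over several event sources doesn't need language support in an evented setting; it's a small combinator (a sketch over Node's EventEmitter):

    // select(): race several (emitter, event) pairs; exactly one of the
    // callbacks fires, for whichever event arrives first.
    function select(cases) {
      var done = false;
      cases.forEach(function (c) {
        c.emitter.once(c.event, function () {
          if (done) return;  // first arrival wins, the rest are ignored
          done = true;
          c.fn.apply(null, arguments);
        });
      });
    }

    // Usage: select([{ emitter: socket, event: 'data',    fn: onData },
    //                 { emitter: socket, event: 'timeout', fn: onTimeout }]);

What a library version can't easily replicate is Go's blocking semantics inside an otherwise sequential function, which is where the language-level construct earns its keep.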
Edit: it looks as if Erlang does indeed have its own, separate definition of "lightweight process". See the disambiguation page at http://en.wikipedia.org/wiki/Light-weight_process. How very poor of whoever started misusing an existing concurrency term to refer to something else, as if discussing these matters weren't already difficult enough.
Could someone familiar with Erlang please clarify:
"To understand why this is misleading, we need to go over some background information. Erlang popularized the concept of lightweight processes (Actors) and provides a runtime that beautifully abstracts the concurrency details away from the programmer. You can spawn as many Erlang processes as you need and focus on the code that functionally declares their communication. Behind the scenes, the VM launches enough kernel threads to match your system (usually one per CPU) "
In common Unix tools like 'ps' and 'top', the term 'Lightweight Process' is used as a synonym for an OS thread; e.g., the LWP column in 'ps -eLf' shows the thread ID.
In this article, LWPs seem to be different from threads? Is this correct? If they're not threads, what are they?
> Node.js’s concurrency mechanisms are simply an approximation of Erlang’s.
Lulz? Here's a much simpler explanation: it's a polling server. It's not an intentional approximation of this or that (Erlang); this is just how event loops using select/poll/epoll/kqueue have always worked. Unless you want to do a bunch of extra work, throw in per-core preforking/threading, and scrap the libev dependency Node is built upon.
Erlang, and other similar efforts like Haskell's forkIO and Python's eventlet/gevent, are also built upon the same fundamentals as libev (and some actually just use libev). But using libev in the way that Node.js does boils down to user-space threads with cooperative scheduling (you yield control every time you make a blocking I/O call). The abstractions that Erlang/Haskell/Python provide let you program in the familiar synchronous, threaded style while retaining the performance advantages of an explicit (e)polling server.
>(you yield control every time you make a blocking I/O call)
What you're referring to is coroutines vs. continuation passing (callbacks). This is unrelated to whether you are also forking into a small number of processes, one per core. You can do both. Node simply doesn't yet. Nginx with workers is an example of an asynchronous server that does.
In regards to your link and the coroutine yielding approach: being able to write pseudo-blocking, monkey-patched code is not necessarily a good thing! It encourages you to keep making subrequests serially, rather than in parallel as comes naturally with callbacks. It also discourages custom continuation logic such as quorums (see the sketch after this list). Examples of when the yielding approach fails:
- You want to both send and read independently on the same connection without creating multiple coroutines/green threads/user threads per connection.
- You want to continue once 2 of 3 data services have called back that the information was successfully stored.
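For example, a 2-of-3 quorum is a handful of lines in callback style (replica.store() here is a hypothetical async write):

    // Start all three writes in parallel; continue once any two ack.
    function quorumWrite(replicas, data, callback) {
      var acks = 0, fired = false;
      replicas.forEach(function (replica) {
        replica.store(data, function (err) {  // hypothetical async store
          if (err || fired) return;
          if (++acks === 2) {
            fired = true;
            callback(null);  // quorum reached; the third ack is ignored
          }
        });
      });
    }

A coroutine-per-request style tends to force either three sequential yields or extra machinery to join the green threads back together.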
I've written a fast single-core asynchronous server in Lua, without user-space thread yielding, that you may be interested in: https://github.com/davidhollander/ox
I think the opinions expressed in this article are valid.
However, I don't think the inability of current JavaScript to do async I/O without callbacks is Node's biggest problem. As others have said, it works for smaller projects (and even has some geek appeal). And as Havoc Pennington and Dave Herman have explained, generators (which are coming with ECMAScript Harmony) and promises will eventually provide a very nice solution. So Node has a path to grow out of the callback model without giving up its single threaded paradigm.
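To sketch that growth path: a small driver can resume a generator whenever a yielded promise settles, so the code reads top-to-bottom while still running on the single-threaded event loop. (A minimal sketch of the pattern, error handling omitted; fetchUser/fetchPosts are hypothetical promise-returning calls.)

    // run(): drive a generator that yields promises; each yield suspends
    // the function until the promise settles, then resumes with the value.
    function run(genFn) {
      var gen = genFn();
      (function step(value) {
        var next = gen.next(value);
        if (!next.done) next.value.then(step);
      })();
    }

    run(function* () {
      var user = yield fetchUser(42);
      var posts = yield fetchPosts(user);
      console.log(posts.length + ' posts');
    });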
The bigger problem (which I don't see getting solved anywhere down the road) is the lack of preemptive scheduling, which is available in Erlang or on the JVM. What you see under high load with Node is that latency is spread almost linearly over a very wide spectrum, from very fast to very slow, whereas response times on a preemptively scheduled platform are much more uniform.
And no, this is something that can't be solved by distributing load over multiple CPU cores. This problem really manifests itself within each core, and it is a direct consequence of Node's single-threaded execution model. If anybody knows how to solve this without resorting to some kind of preemptive threading, I'd be very curious to hear about it.
I've read a lot about Node and watched Ryan Dahl's introduction to Node, and maybe the author of this post should have done that too. At no point does Ryan talk about Erlang or claim anything to do with Erlang's way of thinking. Node was built to use JavaScript's awesome V8 engine and the event loop that many people already know and love, to provide evented IO. I love Node and I think it is a great project, which has nothing to envy Erlang for.
This is like saying "This Honda Civic clearly sucks compared to my helicopter. Let me write you an article about everything that my helicopter does better than your Civic." Clearly the Civic was built for another purpose, so the comparison is void.
It looks to me like all this debate has no observable consequence on the respective programming communities. Node folks will program the Node way and love it and Erlang folks will program the Erlang way and love it. The creators of neither are trying to woo the other and they perfectly well understand operationally where each system stands.
I don't see how it does. Making note of the entrenchment bonus granted by a language already known by much of the target audience isn't necessarily making a value statement about the bonus or secondary effects thereof.
I love Node for many things, but I agree with this sentiment.
At work, every time we undertake a project in Node, it just doesn't work at scale, and it has to get re-implemented in Erlang.
A lot of this is for personnel reasons. To all of us Node is a neat new toy, whereas a few engineers are Erlang wizards. If Node crashes, we don't have any experience debugging it. If it locks up, there's little intuition as to why.
Can people point to examples of large production Node deployments?
It's doing what Erlang would normally have been used for, which answers the original question. It's a node.js deployment, and a very early version at that. I'm not really sure what your point is.
I know both and I take Node seriously. Using the same language at both the client and server end has some serious benefits. And web servers are mostly shared-nothing, so not having first-class support for communication between Node processes is not that big of a deal. Node has its niche.
> Using the same language at both the client and server end has some serious benefits
Like what? I hear this argument all the time, but it's always unsubstantiated.
In my experience, the choice of language is less important than the programmer's understanding of facts that have little to do with the language itself.
If you are developing client-side browser code, for instance, then you have little choice but to use Javascript.
But learning JavaScript is not the hard part: mainstream programming languages, in and of themselves, are rarely difficult to master.
On the contrary, the skill that separates the superior engineers from the inferior ones is the mastery of the environment that the program operates in.
For front-end developers, it is an intricate understanding of how browsers work (i.e., the DOM and supported events) and some good understanding of how to optimize requests back to the server. (There is probably much more they know than this, but I'm not a front-end developer.)
Similarly, back-end developers have much to learn about how to efficiently and reliably serve (hopefully) many thousands of requests per second. They need to understand the limitations of their servers, how to manage memory and storage effectively, and build reliable, operable services.
For both types, these skills often take many years to master, and they are hardly interchangeable. Consider that developers themselves often describe themselves as "front-end" and "back-end" people. The fact that you can now use JavaScript on the server side does little to disturb this reality; and things wouldn't change much if you could run Perl or Python or Ruby in the browser tomorrow.
> Like what? I hear this argument all the time, but it's always unsubstantiated.
Have you ever done it? It's very comfortable. It reduces one element of friction in the daily thought-work of the programmer.
By the same token, a comfortable chair or a nice text editor won't turn an amateur into an all-star programmer in the absence of the other important stuff (smarts, work, etc.). It can be a huge waste of time to fret over your editor or your chair adjustments. But that's not going to make me less likely to sit in a nice ergonomic chair and use vim rather than notepad.
Comfort isn't everything, but it's certainly not nothing.
It's a distributed pub-sub engine, so one client can publish an event, and the others on the server will receive it. Probably about half the code is shared between the server and client. That means fewer bugs and easier maintenance.
Technically there are tools that will convert, say, Haskell to Javascript so that you can have shared code between a client and server. In practice, I don't know anyone who does that. I'm sure for most it feels like a bit of a hack. So for practical purposes, the only way to share code between the client and server is by using javascript (or coffeescript) on the server-side as well. Node.js IMO is the best server-side javascript engine.
I looked through your shared/util.js and, while I don't want to take anything away from your project, it doesn't seem to me that there is much of a case for significant and useful code sharing in Node. When I think of code sharing, I expect something like Luna, from Asana http://asana.com/luna/
Instead, the examples of shared code in Node are always simple utility functions, validators and the like. While it helps not to have to rewrite those, it's not groundbreaking. Facilitating the sharing of state between client and server -- hopefully irrespective of the server-side language -- would be a much better goal, IMO.
The code itself might not be significant, but the implications are. It defines standard interfaces for interacting with the library. Thus people can write plugins on top that work at both the client and server end.
- No mental context switch when working on both ends
- Easier serialization (though JSON is pretty portable...)
- Sharing code
They aren't exclusive to Node and JS, but browsers run JS and will continue to for the foreseeable future. Since the front end can't budge its language at the moment, the back end has to.
I see the "sharing code" claim a lot but I haven't really seen it substantiated. Seems like you write a few little functions but it's not like you sharing the same framework which would be the real benefit.
Seriously? A very common problem in web apps is that validation logic is duplicated in JavaScript and in the server-side language.
Node allows the same logic to be used in both cases. These are not a 'few little functions'. Validation is a central part of most business applications.
Given the potential to template HTML with JavaScript and to code exclusively in JavaScript, you could end up with web apps written entirely in one language.
Whether that is a good thing or not remains to be seen, but it is something that we should pay attention to.
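Concretely, the shared piece is usually a module written to load under both CommonJS and a plain script tag (a minimal sketch; the username rule is invented for illustration):

    // validators.js - the same file ships to the browser and to Node.
    (function (exports) {
      exports.validUsername = function (name) {
        return typeof name === 'string' && /^[a-z0-9_]{3,16}$/.test(name);
      };
    })(typeof module !== 'undefined' ? module.exports : (window.validators = {}));

    // Server:  var validators = require('./validators');
    // Browser: <script src="validators.js"></script>, then window.validators.

The server still has to re-check everything (never trust the client), but at least the rules live in one file instead of in two languages.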
In my experience, most validation on the server is a superset of what's required on the client (or disjoint).
For example, validating a user's name on the client merely checks if the name's length is in a certain range and doesn't contain invalid characters. On the server it requires a query to the database -- possibly more.
I also reject the notion that there is a significant context switch involved in going from one language to the other.
Validation is not a central part of most biz applications. Validation on the client side is a UX convenience, but not strictly necessary, because anyone with two brain cells to rub together is going to ignore any client-side validation and run the real input validation on the server.
> The functionality implemented is mutually exclusive
Sorry, but what is that supposed to mean?
The functionality on either end of a client/server application is usually not "mutually exclusive". They're two sides of the same coin and could indeed benefit tremendously from code sharing. Just because the potential has barely been realized so far doesn't mean it doesn't exist.
I'm working on a distributed app that shares code between the client and server, and there are already a multitude of open-source libraries that are designed to run in any JS environment.
Math (significant math), encoding, and hashing being done on the client sounds like a poorly designed application. Same for templating on the server, by which I assume you mean manually compiling templates using a library and sending the result to a client, perhaps as a property of a JSON object.
[1]: http://en.wikipedia.org/wiki/Grand_Central_Dispatch [2]: http://developer.apple.com/library/ios/#documentation/cocoa/...