I am pretty sure that is a typo in the blog post, since Spark's performance improves as more cores are used, as does Julia's, and they both show similar scaling characteristics.
The performance plot would probably be more readable if the y-axis were on a log scale.
But the "--master local[1]" setting they're using for Spark will run it on a single thread.
And in the article they state: "The algorithm took around 500 seconds to train on the NETFLIX dataset on a SINGLE processor, which is good for data as large as 1 billion ratings."
Being the one who conducted these experiments, I can confirm that the number of threads was varied across runs (the graph shows performance scaling). I am sorry for the confusion caused; this was a typo and should have been "--master local[N]".
Edit: local[1] has been updated to local[N]. Thank you for the update!
Ok thanks, I didn't know that's what "local[1]" did. So would the more relevant comparison be with --master local[30]?
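For reference, a rough sketch of what that invocation would look like (the script name is hypothetical; the local[N] master URL is standard Spark and runs everything in one JVM with N worker threads, while local[*] uses one thread per core):

```sh
# Only the --master flag matters here; als_benchmark.py is a made-up name.
spark-submit --master local[30] als_benchmark.py   # 30 worker threads
spark-submit --master local[*]  als_benchmark.py   # one thread per available core
```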
"The algorithm took around 500 seconds to train on the NETFLIX dataset on a SINGLE processor, which is good for data as large as 1 billion ratings."
This is from the sequential portion of the test; the parallel portion is covered in the next section.
We wanted to do this on a true distributed setup. However, all the largest datasets we could find on which people commonly run ALS just fit on a single machine (even one with less RAM than this one).
The Spark comparison, given the April 2016 posting of the article, was likely done with Spark 1.6. Spark 2.0, released in July, added significant performance improvements (https://docs.cloud.databricks.com/docs/latest/sample_applica...), so the performance gap may look different nowadays.
Quite possible, and it would be interesting to see how this stacks up today. I was just glad to see that Julia's parallel computing could, out of the box, give results comparable to Spark's, with the ALS algorithm written entirely in Julia without crazily optimized code.
What's going on with multithreading support? I was trying a project a while back to build a pure-Julia MapReduce-like engine on a distributed file system, but it was hard to get off the ground due to poor multithreading support.
For the uninitiated, Julia has two types of concurrency built in: Tasks, which are coroutines on the same thread, and Clusters, which are "separate machines".
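A minimal sketch of both, using current (1.x) names; the 0.5-era spellings differed slightly:

```julia
# Tasks: cooperative coroutines multiplexed on one thread.
t = @async begin
    sleep(1)          # yields to other tasks instead of blocking
    "task done"
end
println(fetch(t))

# Clusters: separate worker processes, potentially on other machines.
using Distributed
addprocs(2)                       # spawn two local worker processes
println(pmap(x -> x^2, 1:10))     # the map runs on the workers
```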
The multi-threading in Julia is really new and limited. The plan is first to get the whole codebase to be thread-safe and provide some simple parallelism models and then figure out what a good composable multi-threading model could be.
For now, since the GC effectively runs on only one thread, you get good speedups from multi-threading if you avoid allocation, and thus GC, in the parallel code sections. In some cases this is possible, but in many cases it is unnatural. Of course, all of this is under heavy development.
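As a rough illustration of that allocation-free pattern (assuming Julia is started with several threads, e.g. via the JULIA_NUM_THREADS environment variable):

```julia
# In-place a*x + y: nothing is allocated inside the hot loop,
# so the single-threaded GC never needs to run.
function saxpy!(y, a, x)
    Threads.@threads for i in eachindex(x, y)
        @inbounds y[i] = a * x[i] + y[i]
    end
    return y
end

x = rand(10^7); y = rand(10^7)
saxpy!(y, 2.0, x)
```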
To build a Julia mapreduce engine on a distributed filesystem, Julia's multi-processing should be pretty good, though. That has been our experience with the simple problems we attempted using packages like Elly.jl.
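For instance, a rough sketch of the multi-processing route using only the standard Distributed library (this is not Elly.jl's actual API; the file names and the word_count function are made up):

```julia
using Distributed
addprocs(4)                                    # local workers; could be remote machines

# Define the map function on every worker.
@everywhere word_count(path) = length(split(read(path, String)))

files = ["part-00001.txt", "part-00002.txt"]   # hypothetical input shards
total = reduce(+, pmap(word_count, files))     # map on the workers, reduce on the driver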
In particular, note that if you were doing this experiment more than a month ago, there was no threading support (except on master); there has been support for distributed computing since the first release. The new 0.5 release has support for multithreading, but it's still, as Viral said, experimental.
6 months is a long time ago in Julia land. Since then, the standard Julia install is a minor version higher, which makes -O3 optimization standard; there is dot syntax for automatically fused broadcasts (simple loop fusion for MATLAB-style vectorization); anonymous functions are orders of magnitude faster; Base has been slimmed down; and many new organizations like JuliaMath and JuliaDiffEq have formed with new packages that enhance performance for these kinds of workloads. In Julia you can program really fast, so the language (written in Julia) and the package ecosystem evolve fast as well. Code from 6 months ago is recognizably different (at least right now).
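For example, the dot syntax fuses every elementwise operation on the right-hand side into a single loop, and .= writes the result in place:

```julia
x = rand(1000)
y = similar(x)
y .= 2 .* sin.(x) .+ 1   # one fused loop, zero intermediate arrays
```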
Regarding minutiae such as these, maybe. But I've been following Julia on and off for close to 5 years, and it's mostly crickets even between "major" versions (e.g. 0.2 to 0.3 to 0.4, etc.).
That said, things are a little better/faster/picking up this year.
This means running on a single core, which is not really a fair comparison with the multithreaded or multiprocess Julia version.