But the "--master local[1]" setting they're using for Spark will run it on a single thread.
And in the article they state: "The algorithm took around 500 seconds to train on the NETFLIX dataset on a SINGLE processor, which is good for data as large as 1 billion ratings."
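(For context, and not the article's actual code: a minimal sketch of what training Spark MLlib's ALS under a "local[1]" master looks like. The file path, column names, and hyperparameters below are placeholders.)

    import org.apache.spark.ml.recommendation.ALS
    import org.apache.spark.sql.SparkSession

    // Local Spark session; "local[1]" is the single-thread setting under discussion.
    val spark = SparkSession.builder()
      .appName("als-netflix")        // hypothetical app name
      .master("local[1]")
      .getOrCreate()

    // Assume (user, item, rating) triples; the path and schema are placeholders.
    val ratings = spark.read
      .option("inferSchema", "true")
      .csv("netflix_ratings.csv")    // hypothetical path
      .toDF("user", "item", "rating")

    // Illustrative hyperparameters, not the ones benchmarked in the article.
    val model = new ALS()
      .setUserCol("user")
      .setItemCol("item")
      .setRatingCol("rating")
      .setRank(50)
      .setMaxIter(10)
      .fit(ratings)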
Being the one who conducted these experiments, I can confirm that the number of threads was varied (the graph shows performance scaling). I am sorry for the confusion caused; this was a typo and should have been "--master local[N]".
edit: local[1] has been updated to local[N], thank you for the update!
Ok, thanks, I didn't know that's what "local[1]" did. So the more relevant comparison would be with "--master local[30]"?
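(For reference, "local[N]" runs everything in a single JVM with N worker threads, so "--master local[30]" would use 30 of them. The same master URL can also be set in code; a minimal sketch, with the app name as a placeholder:)

    import org.apache.spark.sql.SparkSession

    // "local[N]" = single JVM, N worker threads; "local[*]" = one per logical core.
    // Setting it here has the same effect as "--master local[30]" on spark-submit.
    val spark = SparkSession.builder()
      .appName("als-benchmark")    // hypothetical app name
      .master("local[30]")
      .getOrCreate()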
The algorithm took around 500 seconds to train on the NETFLIX dataset on a SINGLE processor, which is good for data as large as 1 billion ratings.
- this is from the sequential portion of the test; the parallel portion is covered in the next section.
We wanted to do this on a truly distributed setup. However, all of the largest datasets we could find on which people have run ALS fit on a single machine (even on machines with less RAM than this one).