As someone "fully fluent" in both: for many workflows that can be properly implemented in SAS, you would expect the SAS program to be faster on a technical level. It's a fully compiled language, it has a "simple" compilation model (compared to R), and the interaction between incremental compilation and the macro system lets you blur the line between run time and compile time when performance matters. Add to that SQL and data step views to further minimise disk reads/writes, database pass-through on certain procedures, and in-memory operations (like R) via the SASFILE statement, and from a purely technical point of view an experienced user of both should be able to beat R with SAS.
But... and here's the big but... these days I almost never meet anyone who can put all these steps together in SAS and who actually understands the SAS computation model end to end.
And SAS's strength, a computation model that isn't limited by memory by default, becomes a performance weakness when everyone reads/writes every step out to disk and programs without understanding all those little intricacies. SAS hasn't helped any of this by trying to move its ecosystem away from "programmers" and towards "application users", so now "programmers" can pick up an interpreted language like R, with vectorised operations that run in memory by default, and beat SAS.
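To make the disk point concrete, here is a rough analogy in Python rather than SAS. The function names and the two toy steps (double a value, then filter) are made up for illustration. The first function writes every intermediate step out and reads it back, the way a chain of separate steps with physical datasets would; the second chains lazy generators, which is loosely the idea behind views: nothing is written until the final result is consumed.

```python
import csv
import os
import tempfile


def materialised_pipeline(rows, workdir):
    """Each step writes its full output to disk; the next step reads it back."""
    step1 = os.path.join(workdir, "step1.csv")
    with open(step1, "w", newline="") as f:
        csv.writer(f).writerows((k, v * 2) for k, v in rows)

    step2 = os.path.join(workdir, "step2.csv")
    with open(step1, newline="") as src, open(step2, "w", newline="") as dst:
        # csv.reader returns strings, so the value is converted before comparing
        csv.writer(dst).writerows(r for r in csv.reader(src) if int(r[1]) > 4)

    with open(step2, newline="") as f:
        return sum(int(r[1]) for r in csv.reader(f))


def view_pipeline(rows):
    """Same logic as lazy generators: one pass, no intermediate files."""
    doubled = ((k, v * 2) for k, v in rows)
    kept = (r for r in doubled if r[1] > 4)
    return sum(v for _, v in kept)


rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
with tempfile.TemporaryDirectory() as workdir:
    total_disk = materialised_pipeline(rows, workdir)
total_view = view_pipeline(rows)
```

Both versions give the same answer; the difference is only in how much I/O happens along the way, which is where the time tends to go on real data.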
Course, I'd still recommend places move to Python/R these days because of the broader ecosystems, the university talent pool, and avoiding the extensive lock-in of proprietary software, but I still feel I have to reflexively respond to "R faster than SAS" claims :p
Believe me, I know. The code just becomes unreadable when you put all execution inside the same data step and use hash tables to do fast small-to-big merging. Not to mention debugging that mess when you have a macro layer on top of it. No access to function source code, the installation process being what it was... I do not miss it.
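For anyone who hasn't seen the pattern, the small-to-big hash merge boils down to this. It's a minimal sketch in Python, not SAS, and `hash_merge` plus the toy tables are hypothetical names for illustration: load the small table into an in-memory lookup once, then stream the big table past it in a single pass with no sorting.

```python
def hash_merge(big_rows, small_rows, key):
    """Inner-join a large row stream against a small table held in memory."""
    # Build the lookup once from the small table (assumes unique keys there).
    lookup = {row[key]: row for row in small_rows}
    # Single pass over the big table; each row is one dictionary lookup.
    for row in big_rows:
        match = lookup.get(row[key])
        if match is not None:
            yield {**row, **match}


small = [{"id": 1, "region": "north"}, {"id": 2, "region": "south"}]
big = [
    {"id": 2, "amount": 10},
    {"id": 3, "amount": 5},   # no match in the small table, so it is dropped
    {"id": 1, "amount": 7},
    {"id": 2, "amount": 1},
]
merged = list(hash_merge(big, small, "id"))
```

Written like this it's perfectly readable; the pain described above comes from doing the same thing inside one giant data step with a macro layer generating the code.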
And yes, technically SAS is faster than R, but part of the equation is how many people can actually write SAS code that is faster than R/Python. I had maybe 1-2 people who could write efficient SAS code.
One version we had was a bunch of macros generating hash merges, plus the whole "how can I do this without having to leave the data step" approach. Just horrible. Number of characters in a line of code? Forget a quote somewhere and now you have to run the magic line.
I hope I'm not too emotional when I say I hope SAS disappears from my industry and we embrace less adversarial licensing.
I'm being emotional when I say I have a soft spot for it because of some nostalgia and occasionally dropping in to do some "rock star" programming moments with it. But that's the opposite of what I'd want if/when I was running my own ship.
I too almost always try to steer myself and others away from it now because of the licensing/customer hostility. It's absolutely ridiculous...
Do you have any resources that help explain these SAS performance measures? A book perhaps?
I have been trying to help with exactly this (and your breadcrumbs help), but it is tricky for me since I am used to an open source/*nix environment, where you can use very different tools and where information and tutorials are much more widely available.
Unfortunately not. With SAS I never used books and relied solely on having access to the fully licensed system at a previous job and all of the SAS PDFs floating around the internet and findable with specific searches.
Combine that with a general computer science background and you can start to put the whole thing together.
I'd be lying if I said I hadn't considered writing one, but at my age I'd honestly ask why write a book for an old proprietary system and make business for someone else when, if I ever go back long term, they can pay me an exorbitant amount as a consultant. Might as well start writing 'the dark arts of COBOL' :p