I switched from Python to R about 3 years ago. I missed IPython (now Jupyter) for a long time. Then I just got attached to RStudio.

I have tried R in Jupyter a few times and it was nice, but the advantages of R Notebooks are just awesome. Playing nicely with Git is the biggest one.

I'm still clueless about the religious Python vs. R debate and the smack talk that "serious" work is only done in Python. R works best for me.



Why do many say "serious" work gets done in Python? R is great for linear models, but I find it tedious for many other things, such as machine learning. I wouldn't call either one "serious", though; I just find each performs better for different tasks.

It's already been said, but I do a lot of NLP. R handles text poorly, and humans use a lot of text.

TensorFlow, neural networks, etc. are better supported in Python.

Between pandas, list comprehensions, the Python collections library, sklearn, and Spyder, I feel I have a lot of power at my fingertips, and it's easy to do most of the machine learning I want.

Importing a package takes a meaningful amount of time in R. Several seconds is just unacceptable.

It's a personal matter, but R has syntax that gets on my nerves. Python list: a = [1,2,3]; R: a = c(1,2,3). Perhaps it's because I used other languages before, but my fingers are more adept at hitting [, which requires no Shift, compared to (. Some people love curly braces and lots of parentheses in if/for statements; I appreciate them not being there.

I have to fight with R over scientific notation, always copy-pasting options(scipen=999) into my code.
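
For context, a quick illustration of what that option changes; this is just the standard scipen penalty, nothing project-specific:

    # scipen biases R's printed output away from scientific notation
    x <- 0.0000001
    print(x)                 # 1e-07 with default options
    options(scipen = 999)
    print(x)                 # 0.0000001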

That said, Spyder is buggy, and RStudio is fantastic. I still haven't come across a Python IDE that is on par with RStudio.

Edit: I forgot to say, I feel PySpark is far superior to SparkR. Last I saw, SparkR only works with a VERY old version of Spark; I don't even think that version is supported anymore. That's a bit of a big deal to me.



Yes, sparklyr is very good for using Spark in R (I have another post detailing that: http://minimaxir.com/2017/01/amazon-spark/ )
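
For anyone curious, a minimal sparklyr session looks roughly like this (a sketch only, using local mode and an arbitrary table name for illustration):

    library(sparklyr)
    library(dplyr)
    sc <- spark_connect(master = "local")        # local Spark for testing
    cars_tbl <- copy_to(sc, mtcars, "mtcars_tbl")
    cars_tbl %>%
      group_by(cyl) %>%
      summarise(avg_mpg = mean(mpg)) %>%
      collect()                                  # pull the result back into R
    spark_disconnect(sc)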


> That said, Spyder is buggy, and RStudio is fantastic. I still haven't come across a Python IDE that is on par with RStudio.

It's certainly taken some time investment, but after bouncing around all the editors for both, with some config, Emacs (with ESS for R and anaconda-mode for Python) is the best environment I've found for both languages.


I'm not a Python or R dev (Scala dev, actually, whose day job consists substantially of refactoring Python and R code written by data scientists and data analysts to run on the JVM on prod servers). Sure, Python is easier to grok for upstream ETL/data processing, but that's commodity work (or it should be, anyway) and not a solid basis to compare R vs. Python. R has far more packages than the "scientific" portion of PyPI, and for certain domains the quality (and quantity) of the packages makes R the better choice; examples: signal processing (or any more-than-routine time-series analysis), seismic interpretation, finance, experimental design, chemometrics, etc. And with strict use of the data.table package (coercing data.frames to data.tables and using that package's syntax to manipulate your data), R is very fast. Ignore the smack talk and leave those folks to their "serious" work.
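
A rough sketch of the data.table idiom described above, using columns from the built-in mtcars dataset purely for illustration:

    library(data.table)
    dt <- as.data.table(mtcars)      # coerce the data.frame to a data.table
    # data.table's [i, j, by] syntax: filter, aggregate, and group in one call
    dt[mpg > 20, .(avg_hp = mean(hp)), by = cyl]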


Out of curiosity, what was the motivation for switching from Python to R for analysis? Was there a particular R package you were looking to use?


I can't speak for the top comment, but I started learning R and realized that a lot of the primitives which are exposed in Python as libraries are just primitives in R, and are thus more natural to use in subtle ways. Once you start thinking in R, you think in data and statistics rather than in how you deal with data and statistics within a language. That doesn't mean one is actually better than the other; if you want to do generic programming alongside your math-oriented code, Python is probably better. But I find this change of mental state useful when focusing on the problems R is supposed to solve.
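
As a rough illustration of what "primitives" means here: vectors, random number generation, and model fitting are language built-ins in base R, where Python typically reaches for numpy/pandas/statsmodels imports:

    # no imports needed: vectors, RNG, and linear models are base R
    x <- rnorm(100)                  # 100 standard-normal draws
    y <- 2 * x + rnorm(100)          # a noisy linear relationship
    fit <- lm(y ~ x)                 # fit a linear model
    summary(fit)                     # coefficients, R-squared, etc.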


R has some really crazy metaprogramming facilities. This might sound strange coming from Python, which is already very dynamic - but R adds arbitrary infix operators, code-as-data, and environments (as in, collections of bindings, as used by variables and closures) as first class objects.

On top of that, in R, argument passing in function calls is call-by-name-and-lazy-value - meaning that for every argument, the function can either just treat it as a simple value (same semantics as normal pass-by-value, except evaluation is deferred until the first use), or it can obtain the entire expression used at the point of the call, and try to creatively interpret it.

This all makes it possible to do really impressive things with syntax that are implemented as pure libraries, with no changes to the main language.
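
A minimal sketch of those three features in base R (the operator and function names below are made up for illustration):

    # 1. Arbitrary infix operators are just functions with %...% names
    `%+%` <- function(a, b) paste(a, b)
    "hello" %+% "world"              # "hello world"

    # 2. Lazy, call-by-name arguments: substitute() recovers the caller's expression
    show_expr <- function(x) {
      expr <- substitute(x)          # the unevaluated expression, e.g. a + b
      list(code = deparse(expr), value = eval(expr, parent.frame()))
    }
    a <- 1; b <- 2
    show_expr(a + b)                 # code: "a + b", value: 3

    # 3. Environments are first-class objects
    e <- new.env()
    assign("answer", 42, envir = e)
    get("answer", envir = e)         # 42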


Quite true. R was originally built on top of Lisp, and it shows in many places, such as in the metaprogramming stuff that you mention.

Overall, R seems a little weird at first, but the more you get to know the language, the more you realise it's actually pretty well thought out.


It's true, the more I do in R, the more I wish that it had remained Scheme-compatible (originally, R was built on a Scheme, if I remember correctly).

My someday project is `#lang arcket` for Racket, which would allow people to use existing R code, and mix with Racket, with appropriate data.frame data structures and whatnot.


The problem will likely be similar to that with alternative Python implementations - because so many existing libraries in the ecosystem are written in C, an implementation that is not ABI-compatible with them is not attractive to most existing users.


No fundamental reason I know of that the C libraries for R could not be used with Racket as well. After all, both R and Racket are able to do the FFI dance. Could the hypothetical R-in-Racket implementation sufficiently mimic the R FFI as to allow all R code to run without modification? Probably, maybe?


That's exactly the problem. R has a very specific API for its extensions - it's not just vanilla C, it's stuff like SEXP.

Although now that I think more about it, it's not quite as bad as Python, because the data structures are mostly opaque pointers (at least until someone uses USE_RINTERNALS - which people do sometimes, even though they're not supposed to), so all operations have to be done via functions, where you can map them accordingly.

You'd also need to emulate R's object lifetime management scheme with Rf_protect etc.; but that shouldn't be too difficult, either.

Some more reading on all this:

https://cran.r-project.org/doc/manuals/r-release/R-exts.html...

http://adv-r.had.co.nz/C-interface.html


Oh, yeah, now that you mention it I have seen the SEXP and protect / unprotect stuff before. Maybe a hybrid approach of porting some of the core stuff / popular libraries to Racket's FFI would be more ideal if one were to do this for real.

Maybe aiming for "mostly compatible, with some porting work for a handful of the more popular non-R (C, Rcpp) packages" would yield a better result in the end.




You should see Julia. There's a lot of syntactic sugar that borrows from the best of other languages, for example the pipe |> operator from Elixir and do...end block syntax from Ruby.

Full Unicode is supported. Unicode pi (π) is implemented to mean the pure mathematical entity, so at compile time it is turned into a memory reference to the most exact value possible.

The metaprogramming in Julia is so good that I wrote a Verilog DSL that transpiles specially written Julia into compilable and verifiable Verilog - in 3 days.


Yep, I'm aware of Julia. I hope it takes off - it certainly looks a lot better thought out than R, which is very idiosyncratic. So far, though, it's still a fledgling.


I had a similar story. I used R for statistics at college, but only base R, and it is verbose even for basic data manipulation. The scripts I made for my older data blog posts are effectively incomprehensible to me now.

I ended up learning how to use Python/pandas/IPython because I had had enough and wanted a second option for how to do data analysis.

Then the R package dplyr was released in 2013, alleviating most annoyances I had with R. dplyr/ggplot2 alone are strong reasons to stick with the R ecosystem. (Not that Python is bad/worse; as I mention in the post, both ecosystems are worth knowing.)
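
For anyone who hasn't seen it, a sketch of the kind of pipeline dplyr makes possible (using the built-in mtcars data just as an example):

    library(dplyr)
    mtcars %>%
      filter(cyl == 4) %>%
      group_by(gear) %>%
      summarise(mean_mpg = mean(mpg))
    # roughly the base R equivalent, which gets verbose quickly for longer chains:
    aggregate(mpg ~ gear, data = subset(mtcars, cyl == 4), FUN = mean)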


Same story. I use data.table and ggplot2, with a couple of dplyr functions, for pretty much all of my plotting and analysis now.


I use both, but R for interactive analysis and reporting, Python for data transformations (ETL).

While the syntax of Python is "cleaner" for backend scripts, R feels more straightforward when working with dataframes (dplyr) that result in things to report on. The syntax of ggplot2 falls into the same category.
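
To illustrate what that ggplot2 syntax looks like (a sketch using mtcars, purely as an example):

    library(ggplot2)
    ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
      geom_point() +
      labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")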

As much as having one language for both categories would be nice, using both today seems like the better option.


The thing that makes me sad about R is text mining. tm makes me sad, stringsAsFactors makes me very sad. But maybe I'll try tidytext...
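
For context, the stringsAsFactors pain being referred to (this was the default behaviour before R 4.0):

    # character columns silently become factors, which is awkward for text work
    df <- data.frame(text = c("good movie", "bad movie"), stringsAsFactors = TRUE)
    class(df$text)       # "factor"
    # the usual workaround: ask for plain character vectors
    df <- data.frame(text = c("good movie", "bad movie"), stringsAsFactors = FALSE)
    class(df$text)       # "character"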


Yeah, Python is way, way better for text. And I say that as a long-time R user. R really doesn't like things that can't be represented as datasets.



