Thanks for the "tidy data" reference; I'm getting that now. Have you explored dplyr yet, since you mentioned "split-apply-combine" and you like data.table?
I'm familiar with dplyr but haven't really used it. I prefer using stuff that's mature, and I'm sure that dplyr is going to undergo a lot of evolution in the early days.
That is, "use everything from Hadley, except use data.table instead of plyr". This was before dplyr came out, so maybe that could change. But I kind of like the syntax of data.table, although I don't understand all of it.
Of course plyr is ridiculously slow and can't be used for even modest-sized data sets.
> Of course plyr is ridiculously slow and can't be used for even modest-sized data sets.
Right, and that's what dplyr is supposed to fix. The benchmarks so far put it at roughly the same performance as data.table. But, as you said, it's early days for dplyr ;) Thanks for your comments!