Just curious: in what ways is data.table superior to pandas? I'm really interested! From my personal experience, pandas is sometimes just a bit slow.
I'm more a dplyr man myself, but data.table is much faster than pandas, most noticeably IMO when reading large files. It's also extremely succinct if you're into that sort of thing (though I find it a bit obfuscated). pandas is a lot of things, but "fast" and "concise" are not two of them.
Got it. Regarding speed, you have something like Vaex on the Python side (but I'm not sure how fast it really is). For me, most of my issues with pandas came from using its MultiIndex.
> For me, most of my issues with pandas came from using its MultiIndex.
Yessss. I loathe indices, and have never been in a situation where I was better off with them than without them.
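For anyone who hasn't run into this, here's a rough sketch of the kind of thing I mean. The toy data and column names are made up for illustration, and this is just one way to sidestep a MultiIndex, not the only one:

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "year": [2020, 2021, 2020, 2021],
    "sales": [10, 12, 7, 9],
})

# Default groupby: the two group keys become a two-level MultiIndex.
indexed = df.groupby(["city", "year"])["sales"].sum()

# as_index=False keeps the keys as ordinary columns instead.
flat = df.groupby(["city", "year"], as_index=False)["sales"].sum()

# reset_index() flattens an existing MultiIndex after the fact.
flat_again = indexed.reset_index()
```

Both `flat` and `flat_again` end up as plain frames with `city`, `year`, and `sales` columns, which I find much easier to filter and join than the indexed version.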
> Regarding speed, you have something like Vaex on the Python side
I've never used Vaex, but I've used datatable (https://github.com/h2oai/datatable) and polars (https://github.com/pola-rs/polars). Polars is my favorite API, but datatable was faster at reading data (Polars was faster in execution). I'll have to give Vaex a try at some point.
Pandas is the PHP of data science. Pretty badly designed, but immensely popular because it got there first and had no real competition (in Python) for years.