I’ve seen devs just run select \* from table then filter it and sort it in their...

SJetKaran · on Nov 4, 2019

From what I encountered, this is generally the case when someone is in the "analysis/reports" mode. Rather than writing a SQL query to get summary statistics on each column, count the nulls, etc., they pull the data into a Python/R instance and use general-purpose functions and utilities. The "programmers are expensive" argument probably applies here as well. I'm not trying to be defensive here, just saying that this might be one reason.
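For comparison, the SQL side of that trade-off is not much code either. A minimal sketch using Python's built-in `sqlite3`; the `orders` table, its `amount` column, and the `column_summary` helper are made up for illustration:

```python
import sqlite3


def column_summary(conn):
    """Row count, null count, and basic stats for one column, in one query."""
    # COUNT(*) counts every row, COUNT(amount) skips NULLs,
    # so the difference between the two is the number of NULLs.
    row = conn.execute(
        """
        SELECT COUNT(*)                 AS n_rows,
               COUNT(*) - COUNT(amount) AS amount_nulls,
               MIN(amount)              AS amount_min,
               MAX(amount)              AS amount_max,
               AVG(amount)              AS amount_avg
        FROM orders
        """
    ).fetchone()
    keys = ("n_rows", "amount_nulls", "amount_min", "amount_max", "amount_avg")
    return dict(zip(keys, row))


# Hypothetical sample data: five rows, two of them with a NULL amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(10.0,), (None,), (30.0,), (None,), (20.0,)],
)

summary = column_summary(conn)
print(summary)
```

Only one summary row leaves the database, however large the table is. The catch is that you have to write this per column, which is where the general-purpose dataframe helpers are more convenient.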

mrbungie · on Nov 5, 2019

If you believe "Programmers are expensive", then you should do as much as you can with a declarative data manipulation language (usually SQL, though you can also consider sequences of text manipulation tools chained with pipes) and leave the last 5-15% of high-value work to a more powerful but also more verbose imperative language (usually Python, but any will do).

Asking for what you want is considerably faster than saying how you want it done.
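A rough illustration of the difference, again with the built-in `sqlite3`. The `users` table and both helper functions are hypothetical; the first spells out how to filter and sort in application code, the second just states the result it wants:

```python
import sqlite3

# Hypothetical sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("ana", 34), ("bo", 17), ("cy", 52), ("di", 29)],
)


def adults_imperative(conn):
    """Fetch every row and column, then filter and sort in Python."""
    rows = conn.execute("SELECT * FROM users").fetchall()
    adults = [(name, age) for (_id, name, age) in rows if age >= 18]
    adults.sort(key=lambda r: r[1], reverse=True)
    return adults


def adults_declarative(conn):
    """Describe the result and let the database decide how to produce it."""
    return conn.execute(
        "SELECT name, age FROM users WHERE age >= ? ORDER BY age DESC",
        (18,),
    ).fetchall()


print(adults_imperative(conn))
print(adults_declarative(conn))
```

Both return the same rows here. The declarative version moves less data and lets the database use an index if one exists.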

goatinaboat · on Nov 5, 2019

> From what I encountered, this is generally the case when someone is in the "analysis/reports" mode

I understand this use case, but this is in actual application code!

starpilot · on Nov 4, 2019

Is anyone working on a translator for pandas dataframe syntax to SQL?

alexhutcheson · on Nov 4, 2019

In the R world, dbplyr[1] does this amazingly well.

[1] https://dbplyr.tidyverse.org/

airstrike · on Nov 5, 2019

tidyverse is just a mind-bogglingly amazing ecosystem of packages

Forget Da Vinci, the first man to be cloned should be Hadley Wickham

faizshah · on Nov 4, 2019

Ibis: https://docs.ibis-project.org/

https://docs.ibis-project.org/notebooks/tutorial/2-Basics-Ag...

starpilot · on Nov 5, 2019

Dang, Ibis doesn't support Redshift or SQL Server. I'm also having trouble understanding what it really is: it seems to be an entire framework for big data and not just a translator. What I'd really like is just that, something that turns pandas dataframe operations into ANSI SQL. So input pandas2sql('tablename["col"]') -> "select col from tablename". Something really simple to use.

goatinaboat · on Nov 5, 2019

Pandas has read_sql and to_sql, which are compatible with SQLAlchemy if you insist on using an ORM. That gets you most of the way there...

philipov · on Nov 6, 2019

SQLAlchemy is more than just an ORM. It also has a SQL Expression Language for writing queries in Python without using any ORM features.

medecau · on Nov 4, 2019

[Blaze](https://blaze.readthedocs.io/en/latest/what-blaze-isnt.html#...)

shrikant · on Nov 4, 2019

There's also this: https://pypi.org/project/pandasql/

pojzon · on Nov 4, 2019

I'm surprised people don't use ORM libs for this instead.

bshipp · on Nov 4, 2019

ORMs are just as capable as any developer of generating bad SQL queries that crush databases and plug network connections.

pojzon · on Nov 5, 2019

Yes, but in the example used, all the ORM libs I know would fetch only a column or two (often for a single row), while "select *" would fetch everything.
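To make the size difference concrete, here is a sketch with plain `sqlite3` rather than any particular ORM; the exact SQL an ORM emits varies by library and configuration. The `accounts` table and its columns are invented for the example:

```python
import sqlite3

# Hypothetical table with a couple of wide columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT, bio TEXT, avatar BLOB)"
)
conn.executemany(
    "INSERT INTO accounts (email, bio, avatar) VALUES (?, ?, ?)",
    [(f"user{i}@example.com", "x" * 1000, b"\x00" * 10_000) for i in range(100)],
)

# "select *": every column of every row crosses the connection.
everything = conn.execute("SELECT * FROM accounts").fetchall()

# Targeted query: one column of one row.
narrow = conn.execute(
    "SELECT email FROM accounts WHERE id = ?", (1,)
).fetchall()

print(len(everything), "rows x", len(everything[0]), "columns")
print(narrow)
```

The first query pulls all 100 rows including the blob column. The second returns a single short string.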