
Please give me an example. I can't think of any transform that can't be done with SQL, built-in functions, or a new UDF.



Train a set of sklearn models, one per random partition of the data (computed in a distributed fashion). Then combine all those models by averaging and evaluate them all against an even larger dataset. How do you do that in SQL?
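For readers who want the workflow spelled out, here is a minimal single-process sketch of it. Everything in it is an assumption made for illustration: a hand-rolled one-variable least-squares fit stands in for the sklearn models, a plain loop stands in for the distributed computation, and the data is synthetic. The helper names (`fit_line`, `random_partitions`, `average_models`, `mse`, `make_rows`) are hypothetical.

```python
import random
from statistics import mean


def fit_line(points):
    """Least-squares fit of y = a*x + b (stand-in for an sklearn model)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    a = sxy / sxx
    return a, y_bar - a * x_bar


def random_partitions(rows, k, rng):
    """Shuffle a copy of the rows and deal them into k partitions."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def average_models(models):
    """Combine models by averaging their parameters."""
    return mean([a for a, _ in models]), mean([b for _, b in models])


def mse(model, rows):
    """Mean squared error of one model on a set of rows."""
    a, b = model
    return mean([(a * x + b - y) ** 2 for x, y in rows])


def make_rows(n, rng):
    """Synthetic data (assumption): y = 3x + 1 plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.1)) for x in xs]


rng = random.Random(0)
train = make_rows(400, rng)
holdout = make_rows(2000, rng)  # the "even larger dataset"

# One model per random partition; in a real cluster each fit would run on a worker.
models = [fit_line(part) for part in random_partitions(train, 4, rng)]
ensemble = average_models(models)

# Evaluate every per-partition model and the averaged one on the larger set.
per_model_mse = [mse(m, holdout) for m in models]
ensemble_mse = mse(ensemble, holdout)
```

The point of the sketch is the shape of the pipeline: the intermediate values are model objects, not rows, which is what makes the question awkward to phrase as a query.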


Sharding the table can help scale the problem across many machines, and as I mentioned earlier, you can use the PL/R or PL/Python language extensions to lift all sorts of ML functions into SQL functions.


I'm unfamiliar with PL/Python. Can a Python object be the return value of a SQL query? Because that's a requirement of my example.
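One possible workaround, offered as an assumption and not as something stated in the thread: SQL results are typed values, not live Python objects, but an object can be serialized to bytes (for example into a PostgreSQL `bytea` value) and rebuilt on the client. The sketch below shows only the round trip, with a plain dict standing in for a fitted model.

```python
import pickle

# Hypothetical stand-in for a fitted model object.
model = {"kind": "linear", "slope": 3.0, "intercept": 1.0}

# Serialize to bytes: this is the form a SQL function could return.
payload = pickle.dumps(model)

# Client side: rebuild the object from the returned bytes.
restored = pickle.loads(payload)
```

Unpickling executes code chosen by whoever produced the bytes, so this only makes sense when the database contents are trusted.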


It’s also possible to do a lot in Excel; it's just not always the best tool for the job.


Spark != SQL

It's also graph analysis and ML models.


Graph analysis -> Recursive common table expression (https://www.postgresql.org/docs/current/queries-with.html)
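As a small illustration of the recursive CTE approach, the sketch below finds every node reachable from a start node. It makes several assumptions: it uses Python's built-in `sqlite3` module in place of PostgreSQL so that it runs without a server (the `WITH RECURSIVE` form shown here is accepted by both), and the `edges` table and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("x", "y")],
)

# UNION (not UNION ALL) discards rows already seen, so the a -> b -> c -> a
# cycle does not make the recursion run forever.
query = """
    WITH RECURSIVE reach(node) AS (
        SELECT 'a'
        UNION
        SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
    )
    SELECT node FROM reach ORDER BY node
"""
reachable = [row[0] for row in conn.execute(query)]
print(reachable)  # ['a', 'b', 'c', 'd']
```

Nodes `x` and `y` are not connected to `a`, so they do not appear in the result.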

ML models - I already mentioned how to uplift R and Python functions to SQL functions. Even if you are not using PostgreSQL, many other databases help you uplift and interface with existing ML libraries through FFI.



