
This is the #1 thing VC backed startups are trying to automate away



They've been trying to automate away the "grunge" work of building and managing complex software systems since the 80s. Trust me, we'll be fine.


To be fair, it's not as funny as automating data cleaning, on the principle that data scientists don't want to do it.

And yeah, lots of people dislike it, but you can't build models without an understanding of the data, so even if automated data cleaning became possible (unlikely) you'd still need to spend a load of time doing work on the dataset before building anything useful.


Some people seem to think they will be able to type "clean my data" into ChatGPT or similar and get a beautiful clean dataset. They are probably descendants of the people who said "COBOL means we don't need programmers any more".

Data cleaning requires a lot of judgement and domain knowledge. Imagine if an AI did clean your dataset. Are you just going to trust it? (Hell no!) Or are you going to spend ages trying to work out what it did? That doesn't seem like much of an improvement.

I write data cleaning/ETL software and I'm confident that the need for my product is going to go up between now and when I retire.




