
I am confused...

It's a cool example, but really an antipattern. Nowadays everyone gets that analysts want access to raw data, since they know best which aggregations they need, whereas data engineers stay away from pre-aggregating and focus on building self-service data access tooling. Win-win this way.

How about building a DuckDB-accessible catalog on top of S3? Instead of calling read_parquet, you would select from tables that are themselves mapped to S3 paths, aka external tables.
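
For what it's worth, one rough way to approximate this is to keep a table-name to S3-path mapping and generate a DuckDB view per entry, so queries select from a name rather than a path. This is only a sketch, not an actual catalog: the bucket and table names are made up, and the helpers (`quote_ident`, `quote_literal`, `view_ddl`) are hypothetical names for illustration. It only builds the SQL text; running it assumes a DuckDB session with the httpfs extension loaded and S3 credentials configured.

```python
# Sketch: expose S3 Parquet paths as named DuckDB views, so queries can say
# "SELECT ... FROM events" instead of read_parquet('s3://...').
# The bucket and table names below are hypothetical.

CATALOG = {
    "events": "s3://example-bucket/events/*.parquet",
    "users": "s3://example-bucket/users/*.parquet",
}


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def view_ddl(catalog: dict[str, str]) -> list[str]:
    """Build one CREATE VIEW statement per table-name -> S3-path entry."""
    return [
        f"CREATE OR REPLACE VIEW {quote_ident(table)} AS "
        f"SELECT * FROM read_parquet({quote_literal(path)});"
        for table, path in catalog.items()
    ]


if __name__ == "__main__":
    # Print the statements; paste them into a DuckDB session to create the views.
    for stmt in view_ddl(CATALOG):
        print(stmt)
```

If the views are created in a persistent DuckDB database file, analysts who open that file can query `events` or `users` by name. The paths still live in the mapping, so someone has to maintain it.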




Not a catalog though; you still need to input the S3 path...



