As I commented on a recent similar discussion, these tools can't be used for upd...

mgradowski · on Sept 24, 2022

What you're really saying is that the database presented in OP is not useful because it only handles DQL.

1. SQL can be thought of as being composed of several smaller languages: DDL, DQL, DML, DCL.

2. columnq-cli is only a CLI to a query engine, not a database. As such, it only supports DQL by design.

3. I have the impression that outside of data engineering/DBA, people are rarely taught the distinction between OLTP and OLAP workloads [1]. The latter often utilizes immutable data structures (e.g. columnar storage with column compression), or provides limited DML support, see e.g. the limitations of the DELETE statement in ClickHouse [2], or the list of supported DML statements in Amazon Athena [3]. My point -- as much as this tool is useless for transactional workloads, it is perfectly capable of some analytical workloads.

[1] Opinion, not a fact.

[2] https://clickhouse.com/docs/en/sql-reference/statements/dele...

[3] https://docs.aws.amazon.com/athena/latest/ug/functions-opera...
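
To make the DDL/DQL/DML/DCL split in point 1 concrete, here is a minimal Python sketch that buckets a statement by its leading keyword and rejects anything that is not DQL. The keyword table and the `classify` and `read_only_guard` names are illustrative assumptions, not how columnq-cli or ROAPI actually decide what to accept, and the mapping is deliberately simplified (a query starting with WITH, for example, comes back as "unknown").

```python
# Leading keyword -> SQL sublanguage. Simplified, hypothetical mapping
# for illustration only; real parsers look at far more than one keyword.
SUBLANGUAGE = {
    "CREATE": "DDL", "ALTER": "DDL", "DROP": "DDL",    # data definition
    "SELECT": "DQL",                                   # data query
    "INSERT": "DML", "UPDATE": "DML", "DELETE": "DML", # data manipulation
    "GRANT": "DCL", "REVOKE": "DCL",                   # data control
}


def classify(statement: str) -> str:
    """Return the sublanguage of a statement based on its first keyword."""
    words = statement.split()
    if not words:
        return "unknown"
    return SUBLANGUAGE.get(words[0].upper(), "unknown")


def read_only_guard(statement: str) -> str:
    """Pass DQL statements through; refuse everything else."""
    kind = classify(statement)
    if kind != "DQL":
        raise PermissionError(f"{kind} statements are not supported")
    return statement


if __name__ == "__main__":
    for sql in ("SELECT * FROM trips", "DELETE FROM trips", "GRANT ALL ON trips TO bob"):
        print(f"{classify(sql):>7}  {sql}")
```

A query-only engine in this sketch is simply one that lets `read_only_guard` stand in front of every statement.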

TAForObvReasons · on Sept 24, 2022

The title is an editorialization. The project is very careful to emphasize that it is for reading data:

> Create full-fledged APIs for slowly moving datasets without writing a single line of code.

Even the name of the project "ROAPI" has "read only" in the name.

gavinray · on Sept 24, 2022

Question: I've built something that supports full CRUD, plus queries that span multiple data sources with optimization and pushdown.

What kind of headline would make you want to read/try such a thing?

(I'm planning on announcing it + releasing code on HN but have never done so before)

fshr · on Sept 25, 2022

Hi Gavin; that sounds interesting! I saw @eirikbakke make a comment about https://www.ultorg.com earlier. It appears to also support editing the underlying data. I'm curious to see how you've each tackled these tricky topics.

porker · on Sept 24, 2022

Show HN: Read and update Arrow, Parquet and xxxx files using SQL

gavinray · on Sept 24, 2022

It works on databases and arbitrary data sources too though

andygrove · on Sept 24, 2022

I think it is worth pointing out that this tool does support querying Delta Lake (the author of ROAPI is also a major contributor to the native Rust implementation of Delta Lake). Delta Lake certainly supports transactions, so ROAPI can query transactional data, although the writes would not go through ROAPI.

tomrod · on Sept 24, 2022

90% or more of SQL usage is SELECT against slowly changing data.

jaxn · on Sept 24, 2022

Maybe in your database. Do you have any validation of that claim in a larger context?

tomrod · on Sept 25, 2022

Purely the power law. That would be an interesting thing to figure out, though. Maybe a GitHub crawl.

EDIT: I stand corrected based on GitHub code search results for .sql files (which might better represent application CRUD queries than use by analysts; more thought required!)

SELECT: 7.3M code results [0]

INSERT: 8.9M code results [1]

UPDATE: 5.5M code results [2]

DELETE: 5.0M code results [3]

[0] https://github.com/search?q=select++extension%3Asql&type=Cod...

[1] https://github.com/search?q=insert++extension%3Asql&type=Cod...

[2] https://github.com/search?q=update++extension%3Asql&type=Cod...

[3] https://github.com/search?q=delete++extension%3Asql&type=Cod...
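
For a rough sense of proportion, the four counts above can be turned into shares of the total. This is only arithmetic on the quoted numbers: the counts are keyword hits in .sql files and can overlap (one file may contain all four keywords), so the percentages are a sketch, not a measurement of real-world SQL usage.

```python
# Code-search hit counts quoted above, in millions of results.
counts = {"SELECT": 7.3, "INSERT": 8.9, "UPDATE": 5.5, "DELETE": 5.0}

total = sum(counts.values())

# Share of each keyword among the four, as a percentage rounded to 0.1.
shares = {kw: round(100 * n / total, 1) for kw, n in counts.items()}

if __name__ == "__main__":
    for kw, pct in shares.items():
        print(f"{kw:<6} {pct:>5}%")
```

On these numbers SELECT comes to a bit over a quarter of the hits, well short of 90%.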

jaxn · on Sept 27, 2022

Reads are easy to cache at various layers (query cache, application, web, etc.). Inserts and updates must go to the database.

So even a read-heavy application could have more writes than reads due to caching.
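
A toy simulation of that effect, as a hedged sketch: the `CountingDatabase` and `CachedStore` classes, the write-through policy, and the 10:1 read/write ratio are all assumptions made up for illustration. The cache is idealized (unbounded, never expires), so it overstates how many reads get absorbed.

```python
class CountingDatabase:
    """In-memory stand-in for a database that counts the queries it serves."""

    def __init__(self):
        self.rows = {}
        self.reads = 0
        self.writes = 0

    def select(self, key):
        self.reads += 1
        return self.rows.get(key)

    def upsert(self, key, value):
        self.writes += 1
        self.rows[key] = value


class CachedStore:
    """Application-level write-through cache in front of the database."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, key):
        # Only a cache miss reaches the database.
        if key not in self.cache:
            self.cache[key] = self.db.select(key)
        return self.cache[key]

    def put(self, key, value):
        # Every write must reach the database; the cache is refreshed too.
        self.db.upsert(key, value)
        self.cache[key] = value


def simulate(reads_per_write=10, writes=100, keys=5):
    """Return (reads issued by the app, reads seen by the DB, writes seen by the DB)."""
    db = CountingDatabase()
    store = CachedStore(db)
    app_reads = 0
    for i in range(writes):
        key = i % keys
        store.put(key, i)
        for _ in range(reads_per_write):
            store.get(key)
            app_reads += 1
    return app_reads, db.reads, db.writes


if __name__ == "__main__":
    app_reads, db_reads, db_writes = simulate()
    print(f"app reads={app_reads}, db reads={db_reads}, db writes={db_writes}")
```

With these defaults the application reads ten times as often as it writes, yet the database itself sees more writes than reads.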

cerved · on Sept 24, 2022

I disagree.