Document Stores: Please Give Me A Standard API (kirkwylie.blogspot.com)
9 points by lecha on Dec 11, 2009 | 15 comments


Yes. SQL should be that standard API.

Most document and KV stores only support a subset of what can be expressed with SQL. In fact, I have yet to see a feature that cannot be expressed in SQL.

Thus, the advantages of using SQL are obvious:

  * SQL is well understood and mature

  * People already know SQL

  * SQL parsers and clients are a dime a dozen

  * SQL lends itself reasonably well to interactive 
    use by humans. Much better than typing raw 
    javascript into a console (MongoDB) or having to write
    actual code for every little data-manipulation task
    (most others).

  * Existing SQL-infrastructure can be leveraged.
    For example your favorite ORM could easily grow an
    adapter for a "NoSQL"-Db when it's just a slightly 
    different SQL dialect instead of a completely
    distinct API.


> SQL should be that standard API.

Unlikely. How will SQL deal with low-level semantics that do not provide transactions in the way SQL users understand them, but instead require the user to understand the CAP principle? How well does SQL deal with conflicting data being returned from a retrieval operation? If SQL is being used as nothing more than a small set of verbs (a la memcache), then why bother?

I certainly hope that document-store developers do not take the lazy route and prematurely converge on a standard like SQL. Fortunately, it seems that few of them are particularly interested in this route. The more likely path seems to be providing multiple interfaces: RESTful HTTP, memcache, and a programmable/functional interface using JavaScript and/or the language the DB is coded in (primarily Erlang and Java). Other possible candidates here are things like LINQ, Hive's QL, and Pig.


> Unlikely. How will SQL deal with low-level semantics that do not provide transactions in the way SQL users understand them, but instead require the user to understand the CAP principle?

I'm not sure what you mean. SQL databases already have varying feature sets, and SQL copes just fine. A "NoSQL" backend would simply be yet another type of database, lining up with e.g. MySQL and Postgres.

It's just a matter of applying the verbs (SELECT, CREATE, INSERT, UPDATE) in a meaningful way.

The individual stores could even extend the language to a degree, just like the RDBMS flavours do today. The important wins would be a common baseline (select foo from bar), a standardized way for human interaction (ad hoc queries in a console are very useful) and a much easier migration path from traditional RDBMSs.

> How well does SQL deal with conflicting data being returned from a retrieval operation?

Again, I wonder what you mean.

If you're referring to versioned stores, or stores with particular semantics, then I'd imagine things like that would simply be embedded into the response. That is, every result set returned by a SELECT statement could carry an additional column containing the tuple version. The user can then deal with that metadata using the usual, time-tested SQL machinery (select .. where version=, group by, etc.)
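To make the version-column idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 purely as a stand-in for a versioned store that speaks SQL; the `docs` table, its columns, and the helper names are all made up for illustration, and "highest version wins" is just one possible resolution policy.

```python
import sqlite3

# sqlite3 stands in for a hypothetical versioned store that speaks SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (doc_id TEXT, version INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        ("user:1", 1, "alice"),
        ("user:1", 2, "alice b."),  # second, conflicting version of the same id
        ("user:2", 1, "bob"),
    ],
)


def conflicting_ids(conn):
    """Return ids that have more than one stored version."""
    rows = conn.execute(
        "SELECT doc_id FROM docs GROUP BY doc_id HAVING COUNT(*) > 1 ORDER BY doc_id"
    )
    return [doc_id for (doc_id,) in rows]


def latest(conn, doc_id):
    """Resolve a conflict client-side by picking the highest version (assumed policy)."""
    row = conn.execute(
        "SELECT body FROM docs WHERE doc_id = ? ORDER BY version DESC LIMIT 1",
        (doc_id,),
    ).fetchone()
    return row[0] if row else None
```

The point is only that the version shows up as ordinary data, so GROUP BY and ORDER BY are enough to find and resolve conflicts on the client side.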

Yes, this means every store needs slightly different treatment. But that's no different from how we treat RDBMSs today - a common baseline with variance in the more specific features (column types, indexes, triggers etc.).

> I certainly hope that document-store developers do not take the lazy route and prematurely converge on a standard like SQL.

That's a strange wish to have as it only makes life harder for everyone. I don't see anything bad about SQL (the language) that would prevent it from taking this role.

Moreover the query language doesn't need to have ties into the underlying implementation - it's merely a common vocabulary.

SQL seems like a natural choice due to its ubiquity.

LINQ, in my book, is an ORM and operates at a different level of abstraction. Pig and HiveQL otoh are exactly what I have in mind - SQL dialects.


> [Not converging on SQL] only makes life harder for everyone.

No, it makes life harder for people trying to use off-the-shelf ORMs and other RDBMS tools. I would much rather make life harder for them than make it harder for the developers of these databases.

> I don't see anything bad about SQL (the language) that would prevent it from taking this role.

I, for one, do not think that it is expressive enough to cover all of the different paradigms being explored. It is based on a row/column view of the data that may not be appropriate, and it may require additional hoops to be jumped through to make the data fit the assumptions that SQL makes. Who would be responsible for these transformations? IMHO it should be the end-user, but if SQL gains traction any time soon, the db developers will be repeatedly browbeaten by DBAs who can't comprehend why anyone would not share their viewpoints regarding data structuring and could you please make things look like the RDBMS systems we are used to kthxbye...

Adding SQL seems to offer very little at this point except future headaches.


> I, for one, do not think that it is expressive enough to cover all of the different paradigms being explored.

You are not alone in that opinion; I've seen it a couple of times in similar discussions. Yet I have still not seen a concrete example of a piece of functionality that can't be reasonably wrapped.

Browbeating is irrelevant in either direction. What matters for adoption is a sane interface for day-to-day use and that includes proper tooling for interactive use - an area where many of the contenders are still sorely lacking.


Good points, but SQL is 'only' a language, not a protocol. Servers still need to agree on how to expose an HTTP API that could take SQL. Ideally such an API would be RESTful.


I don't see that as a big problem. The queries will have to be parsed and processed inside the database anyway. Thus the servers could simply accept SQL strings over HTTP - or any other transport protocol.
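A rough sketch of what "accept SQL strings over any transport" could look like. The transport itself is deliberately left out: `handle_request` is a hypothetical function that takes the raw request body and returns the raw reply, so it could sit behind HTTP or anything else. sqlite3 again only stands in for the actual store, and the JSON reply shape is an assumption, not any existing protocol.

```python
import json
import sqlite3


def handle_request(conn, body: bytes) -> bytes:
    """Treat the raw request body as one SQL statement; reply with JSON rows."""
    sql = body.decode("utf-8")
    try:
        cursor = conn.execute(sql)
        # description is None for statements that return no rows (e.g. INSERT).
        columns = [d[0] for d in cursor.description or []]
        rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
        reply = {"ok": True, "rows": rows}
    except sqlite3.Error as exc:
        # Report parse/execution errors to the client instead of crashing.
        reply = {"ok": False, "error": str(exc)}
    return json.dumps(reply).encode("utf-8")
```

An HTTP server would then only need to pass the POST body in and write the returned bytes back out.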


A standard might lead to slower innovation at this point.


The author focuses on document stores, but the same argument can be made for key/value stores.

It is time for the NoSQL community to come up with a standard key/value API. Is anyone working on it?

As one commenter said, "JDBC came out ~25 years after SQL was created." By today's standards, 2.5 years should be about the right amount of time to expect a standard :)


Can't speak for the other NoSQL players, but for Redis this is more or less impossible, since the key-value business is something like 10% of the current API. How would a standard cover the other 90%?

Maybe in the future Redis will be able to listen on another port and speak the memcached protocol, but this is not a high-priority feature for now.

What I think people should do is abstract the interaction with the DB in their code, so that supporting a new KV store is just a matter of writing an adapter, and these adapters can take advantage of the peculiarities of specific stores.

For instance, when using Redis, the adapter can use Redis Sets to hold the list of friends. When using something different, it will serialize the data as JSON, and so forth.
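A small sketch of the adapter idea, with two interchangeable friend-list adapters. `SetStore` and `PlainStore` are in-memory stand-ins invented for this example (the `sadd`/`smembers` method names merely mirror the Redis SADD and SMEMBERS commands); a real adapter would wrap an actual client library instead.

```python
import json


class SetStore:
    """Stand-in for a store with native sets (names mirror Redis SADD/SMEMBERS)."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        self._sets.setdefault(key, set()).add(member)

    def smembers(self, key):
        return set(self._sets.get(key, set()))


class PlainStore:
    """Stand-in for a store that only offers string get/set."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class SetFriends:
    """Adapter that relies on the store's native set type."""

    def __init__(self, store):
        self.store = store

    def add_friend(self, user, friend):
        self.store.sadd(f"friends:{user}", friend)

    def friends(self, user):
        return self.store.smembers(f"friends:{user}")


class JsonFriends:
    """Adapter that serializes the whole friend list as one JSON value."""

    def __init__(self, store):
        self.store = store

    def add_friend(self, user, friend):
        current = self.friends(user)
        current.add(friend)
        self.store.set(f"friends:{user}", json.dumps(sorted(current)))

    def friends(self, user):
        raw = self.store.get(f"friends:{user}")
        return set(json.loads(raw)) if raw else set()
```

Application code only ever calls `add_friend` and `friends`, so swapping the store means swapping the adapter and nothing else.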


Since a lot of these different data stores have fairly different semantics, I'm not sure how much value there is in trying to create a common set of abstractions in your code.


There is a lot of value if the DB API is abstract enough, like:

    (id) addBookmark(title,url,taglist)
    (bookmark) getBookmark(id)
    (array) searchBookmarksByTag(taglist)
And so forth. You can switch from SQL to a NoSQL solution without touching the app, changing just the single file that holds the whole DB API.

Even at a very low level, like DB.get(), DB.set(), ..., it is still useful if the application uses just a strict common subset of features (get/set/exists/expire/incr).
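For illustration, the bookmark API above could be sketched like this, with one key-value style backend and one SQL backend behind the same interface. Everything beyond the three method names is an assumption: the snake_case spelling, the dict-shaped bookmark, returning ids from the search, and treating a search as "matches any of the given tags". sqlite3 stands in for whatever SQL database is actually used.

```python
import sqlite3
from abc import ABC, abstractmethod


class BookmarkStore(ABC):
    """The only DB interface the application sees."""

    @abstractmethod
    def add_bookmark(self, title, url, taglist): ...

    @abstractmethod
    def get_bookmark(self, bookmark_id): ...

    @abstractmethod
    def search_bookmarks_by_tag(self, taglist): ...


class DictBookmarks(BookmarkStore):
    """Key-value style backend: a counter plus get/set by id."""

    def __init__(self):
        self._kv = {}
        self._next_id = 0

    def add_bookmark(self, title, url, taglist):
        self._next_id += 1
        self._kv[self._next_id] = {"title": title, "url": url, "tags": list(taglist)}
        return self._next_id

    def get_bookmark(self, bookmark_id):
        return self._kv.get(bookmark_id)

    def search_bookmarks_by_tag(self, taglist):
        wanted = set(taglist)
        return [bid for bid, b in self._kv.items() if wanted & set(b["tags"])]


class SqlBookmarks(BookmarkStore):
    """SQL backend with a separate tags table."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, title TEXT, url TEXT)"
        )
        self._db.execute("CREATE TABLE tags (bookmark_id INTEGER, tag TEXT)")

    def add_bookmark(self, title, url, taglist):
        cur = self._db.execute(
            "INSERT INTO bookmarks (title, url) VALUES (?, ?)", (title, url)
        )
        self._db.executemany(
            "INSERT INTO tags VALUES (?, ?)", [(cur.lastrowid, t) for t in taglist]
        )
        return cur.lastrowid

    def get_bookmark(self, bookmark_id):
        row = self._db.execute(
            "SELECT title, url FROM bookmarks WHERE id = ?", (bookmark_id,)
        ).fetchone()
        if row is None:
            return None
        tags = [t for (t,) in self._db.execute(
            "SELECT tag FROM tags WHERE bookmark_id = ? ORDER BY rowid", (bookmark_id,)
        )]
        return {"title": row[0], "url": row[1], "tags": tags}

    def search_bookmarks_by_tag(self, taglist):
        if not taglist:
            return []
        marks = ",".join("?" * len(taglist))  # one placeholder per tag
        rows = self._db.execute(
            f"SELECT DISTINCT bookmark_id FROM tags WHERE tag IN ({marks}) "
            "ORDER BY bookmark_id",
            list(taglist),
        )
        return [bid for (bid,) in rows]
```

The app would pick one of the two classes in a single place and never touch SQL or key names directly.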


> It is time for the NoSQL community to come up with a standard key/value API. Is anyone working on it?

I think if people want a standard API, it's going to be the memcached API.


For key-value stores, probably. For document stores with richer functionality, probably not.


Well, yes. But the question was specifically about key-value APIs.



