
Wow, the reasons why the Redis command API sucks in Andy's video (linked in the post) are the weakest ever. It is possible to make a case against the Redis API (I would not agree of course but... it's totally legitimate), but you gotta have stronger arguments than those, particularly if you are a teacher of some kind. Especially: you need to be somewhat fluent in Redis and how developers use Redis in order to understand why so many people like it, and then elaborate on what is wrong with it (if you believe there is something wrong). The video shows a general feeling of "I don't really use / know this, but I don't like how NON-SQL it is".


> Wow, the reasons why the Redis command API sucks in Andy's video (linked in the post) are the weakest ever.

In my example, the API on a key changes based on its value type. And the same collection can have different value types mixed together. You've recreated the worst parts of IBM IMS from the 1960s. However, the original version of IMS only changed the API when a collection's backing data structure changed. Redis can change it on every key!

We didn't get into the semantics of Redis' MULTI...EXEC, which the documentation mischaracterizes as "Transactions". I'm happy that at least you didn't use BEGIN...COMMIT.


You totally miss that Redis is more like a remote interpreter with a DSL that manipulates data structures stored at global variables (keys): you (hopefully) would never complain about languages having this semantics.

I don't think you understood how Redis collections work. The items are just strings: they can't be a mix of integers, strings, or whatever, nor can collections be nested.

The Redis commands do type checking to ensure the application is performing the right operation.

In your example, GET against a list does not make sense because:

1. GET is the retrieve-the-key-of-string-type operation.

2. Having GET do something like LRANGE 0 -1 would have many side effects: fetching a huge list by mistake and returning a huge data set without any reason, creating latency issues. GET would also need options to provide ranges (the horror story of SQL-like query languages). And so forth.

So each "verb" should do a specific action on a given data type. Are you absolutely sure you were exposed enough to the Redis API, how it works, and so forth?
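To make the type-checking point concrete, here is a minimal sketch in Python of a toy in-memory store. It is not Redis and not a real client library: `ToyStore`, `WrongTypeError`, and the method names are hypothetical, chosen only to mirror the commands discussed above. It assumes two value types (strings and lists) and shows a command refusing a key that holds the other type, while a SET-like write replaces whatever was there.

```python
class WrongTypeError(Exception):
    """Raised when a command is applied to a key holding another type."""


class ToyStore:
    """Toy model of type-checked commands; not the real Redis implementation."""

    def __init__(self):
        self._data = {}  # key -> str or list of str

    def set(self, key, value):
        # Like SET: replaces whatever the key held, regardless of its type.
        self._data[key] = str(value)

    def get(self, key):
        # Like GET: only defined for string values; missing keys give None.
        value = self._data.get(key)
        if value is not None and not isinstance(value, str):
            raise WrongTypeError(f"{key!r} does not hold a string")
        return value

    def rpush(self, key, *items):
        # Like RPUSH: creates the list if missing, refuses non-list keys.
        value = self._data.setdefault(key, [])
        if not isinstance(value, list):
            raise WrongTypeError(f"{key!r} does not hold a list")
        value.extend(str(item) for item in items)
        return len(value)

    def lrange(self, key, start, stop):
        # Like LRANGE: inclusive stop, negative indexes count from the end.
        value = self._data.get(key, [])
        if not isinstance(value, list):
            raise WrongTypeError(f"{key!r} does not hold a list")
        n = len(value)
        if start < 0:
            start = max(n + start, 0)
        if stop < 0:
            stop = n + stop
        if stop < 0:
            return []
        return value[start:stop + 1]


store = ToyStore()
store.rpush("mylist", "a", "b", "c")
print(store.lrange("mylist", 0, -1))  # ['a', 'b', 'c']
try:
    store.get("mylist")
except WrongTypeError as exc:
    print("error:", exc)
```

The mismatch surfaces at the first wrong command rather than as a silently odd result, which is the "errors evident ASAP" behavior being argued for.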

About MULTI/EXEC: when AOF with fsync configured correctly is used, MULTI/EXEC provides some of the transactional guarantees you think of when you hear "transaction", but in general the concept refers to the fact that commands inside MULTI/EXEC have an atomic effect from the point of view of an external observer AND of point-in-time RDB files (and AOF as well). MULTI / INCR a / INCR a / EXEC will always result in the observer seeing either 2, 4, 6, 8, and so forth, and never 3 or 5.
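The observer guarantee can be illustrated with a small simulation. This is only a sketch under stated assumptions: `ToyServer` is a hypothetical class, and a single lock stands in for the server executing one command (or one queued batch) at a time. It says nothing about persistence, and nothing about what happens when a command inside the batch fails.

```python
import threading


class ToyServer:
    """Toy model: one lock stands in for one-at-a-time command execution."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}

    def get(self, key):
        with self._lock:
            return self._counters.get(key, 0)

    def exec_batch(self, commands):
        # Like MULTI ... EXEC: the queued commands run back to back and no
        # other client's command is served in between.
        with self._lock:
            for name, key in commands:
                if name != "INCR":
                    raise ValueError(f"unsupported command: {name}")
                self._counters[key] = self._counters.get(key, 0) + 1


def run_demo(transactions=2000):
    server = ToyServer()
    seen = set()
    done = threading.Event()

    def observer():
        # Keep sampling the counter while the writer is running.
        while not done.is_set():
            seen.add(server.get("a"))
        seen.add(server.get("a"))  # one final read after the writer stops

    watcher = threading.Thread(target=observer)
    watcher.start()
    for _ in range(transactions):
        server.exec_batch([("INCR", "a"), ("INCR", "a")])
    done.set()
    watcher.join()
    return seen


observed = run_demo()
print(all(value % 2 == 0 for value in observed))  # True
```

The observer may also see 0 before the first batch runs, but never an odd value, because it can only read between batches.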

Anyway, I believe you didn't put enough effort into understanding how Redis really works. Yet you criticized it with weak arguments in front of the most precious good we have: students. This is the sole reason why I wrote my first comment: I believe this to be a wrong teaching approach.


> 1. GET is the retrieve-the-key-of-string-type operation.

That's a tautological argument. The question isn't what the definition of GET is, but whether the design is good.

> 2. Having GET do something like LRANGE 0 -1 would have many side effects: fetching a huge list by mistake and returning a huge data set without any reason, creating latency issues.

If this really were the reason, you'd have separate operations for tiny strings and huge strings. After all, by analogy having GET return a huge string "without any reason" would create latency issues.

But that's not how Redis works, right?


The examples I made are just a subset of the protection that this provides. Similarly you can't LRANGE a set type, and so forth. So this in general makes certain errors evident ASAP (command mismatch with the key type).

This does not mean that Redis would not work with generic LEN, INSERT, RANGE commands. But such commands would also end up having type-specific options, which I feel is not very clean. Anyway these are design tastes, but I don't think they dramatically change what Redis is or isn't. The interesting part is the data model, the idea of commands operating on abstract data structures, the memory-disk duality, and so forth. If one wants to analyze Redis, and understand merits and issues, a serious analysis should hardly focus on these kinds of small choices.


Eh. What people are really arguing about here is redis’s type system. Redis’s approach has some pros and some cons. I think dismissing redis’s approach out of hand for its choices is too simple a treatment.

Most sql databases (like Postgres) require all types to be declared once, and then they do type checking on mutation. In that sense, sql is like a static language like C. But weirdly, the results returned from a sql query are always dynamically typed values, expressed in a table. Applications reading data from sql will still typically need to know what kind of data they expect back - but they usually do that type checking at runtime.

Redis flips both of those choices. It’s dynamically typed - so it won’t check your mutations. But also, you don’t need schema migrations and all the complexity they bring. And rather than having a single “table” type, redis queries can return scalar values, lists or maps. What kind of return value you get back depends on the query function. (Eg GET vs LRANGE).

If you think of a database as the foundation underneath your house, static typing & type checking is a wonderful way to make that foundation more stable. There’s a reason Postgres is the default, after all. But redis isn’t best used like that. Instead, it’s a Swiss Army knife which is best used in small, specific situations in which a real database would be complex overkill. Stuff like caching, event processing, data processing, task queues, configuration, and on and on. Places where you want some of the advantages of a database (fast, concurrent network-accessible storage) but you don’t want to stress about tables and schema migrations.

If you really hate redis, maybe say the same thing I say about Java when I teach it to my students. “I hate this, and I’ll tell you why. But there are smart people out there who disagree with me.”

If you ask me, I wish sql looked more like redis in some ways. I think it’s quite awkward that every sql query returns exactly one “table”. I’d much rather if queries could return scalar values or even multiple tables, depending on your query.


Minor nit. Some SQL databases allow you to return multiple tables. IIRC, SQL Server stored procedures can do that. Agreed it's not a language feature of SQL.


> I’d much rather if queries could return scalar values

Since when can't they?


I mean, they can - but they’re always wrapped up as pseudo-tables.

Not everything is best described as a table, y’know?


> they’re always wrapped up as pseudo-tables.

Are they??? Not as I understand it.


If you call “select 1;”, you get back a table with 1 row and 1 column.
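As a quick illustration of what a client sees, here is a minimal sketch using Python's built-in `sqlite3` module as the client. It only shows the shape of the result on the client side; other databases and drivers may present it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
rows = conn.execute("select 1").fetchall()
print(rows)  # [(1,)] : a list of rows, each row a tuple of columns

# The caller unwraps the scalar by taking the first column of the first row.
scalar = rows[0][0]
print(scalar)  # 1
conn.close()
```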


That's just because SQL clients present their results that way, AFAICS. If you use a sub-query like in, say,

   select * from orders where custno = (select custno from customers where name = 'John Doe');

you'll get the same result as if you'd put that scalar in your query, like

   select * from orders where custno = 123456; -- John Doe's customer number

Or maybe you're right, that to SQL databases scalar values are single-row single-column tables. But so what? In mathematics, isn't any number also the single-member set of numbers that contains only that number? Where's the harm in that? (And, hey, RDBMSes are founded on set theory...)

So I don't really see what the big problem is either way. Hoping I'm not being stupid AF, maybe you could explain further?


Mathematically, they're equivalently expressive. But returning everything as a table has bad ergonomics.

Imagine the programming language equivalent. We could make a programming language where every function call returns a table. If you expect 1 return value from your function, the caller grabs the first row out of the return array, and the first column out of that row. It would absolutely work, and it's mathematically equivalent in some sense. But it would be confusing, computationally inefficient and error prone. What happens if there's more than 1 row in the table? Or more than 1 column? What happens if the type of the columns doesn't match what you expect? What happens if the table is empty? Or you want a function which returns two lists instead of one? We could write that programming language. But it would be pretty weird and frustrating to use.

This is the situation today with SQL. Every query returns a dynamically typed table. It's up to the caller to parse that table.

With redis, the caller expresses to the database what kind of value they expect in the query function name. (At least, list or scalar). The database guarantees that a GET request always returns a scalar value, and LRANGE always returns a list. I think this has better ergonomics because the types are more explicit.


> You totally miss that Redis is more like a remote interpreter with a DSL that manipulates data structures stored at global variables (keys):

I think he makes the point that these "global variables" are dynamically typed; you can have "listX" and then write a non-list into that same name; statically typed systems would not allow this. He makes the fairly non-controversial point that a statically typed system (SQL, other than that of SQLite) adds a level of type safety that can guard against software bugs.


> you can have "listX" and then write a non-list into that same name; statically typed systems would not allow this

Well, that depends. In most SQL databases there are many cases where supplying the wrong type of value will implicitly convert to the expected type, often in unexpected ways that can result in subtle bugs.


PostgreSQL is very, very good about really never doing this, and also a scalar vs. list is pretty much a PostgreSQL case since most other relational DBs don't have a native ARRAY type. I think you're mostly thinking of MySQL, which has some int/string coercion cases which are, to be clear, bad, but not as egregious as "any arbitrary type goes right in with no checking whatsoever".

As mentioned, SQLite breaks all these rules and I think SQLite is very wrong on this.


>> stored at global variables

This is an interesting (and correct) perspective. Global variables scare us in software but we are ok with it when it comes to application state stored in a db.


There have been proposals to have global state in programming languages which function like databases with the advantages of monitoring/persistence/naming etc, but also retain the modularity of local state.

https://www.scattered-thoughts.net/writing/local-state-is-ha...

https://awelonblue.wordpress.com/2012/10/21/local-state-is-p...


Global variables can definitely be overused, but in the right situations, they’re generally fine. After all, the filesystem is a big global variable too. So is any database. But people don’t complain too much about that.

The strongest argument against global variables is that they don’t show up in the parameter lists of functions. In that way, they’re sort of “spooky action at a distance”. And they encourage functions to be impure. But if this bothers you, you can always pass your database connection as an explicit parameter to any function which interacts with it.


They aren’t scary. They are useful and you probably use them in many forms (lambdas capturing locals, logging, singletons).

This is yet another reason why single-threaded should be the default assumption and multi-threaded should require special consideration.


Just because there are reasons for why Redis sucks doesn’t mean it doesn’t suck


> We didn't get into the semantics of Redis' MULTI...EXEC, which the documentation mischaracterizes as "Transactions". I'm happy that at least you didn't use BEGIN...COMMIT.

Hmmm, this is a subtler issue than you make it out to be, I think, though I generally agree with you. The quality issues with Redis's technical design here interrelate substantially with user expectations/perceptions/squishier stuff.

The term "transaction" is anchored in most users' minds to a typical RDBMS transactional model assuming: a) some amount of state capture (e.g. snapshot) at the beginning of the transaction and b) "atomicity" of commit being interpreted as "all requested changes are atomically performed (or none are)" rather than "all requested changes are atomically attempted".

Redis has issues with both of those, so I'm sympathetic to your statement that what they call "transactions" is mis-characterized and would be better described as "best-effort command batching".

It's poor naming/branding to call it "transactions", and I don't think it had to be this way: MULTI/EXEC "transactions" should have been deprecated long ago--in favor of Redis scripts and other changes that should have been made in the Redis engine.

First, a defense of scripts: Redis scripts are, to a certain variety of user who wants transaction-esque functionality, not ideal. Those users may be reluctant to engage with a full procedural programming language rather than the database's query language. However, there's substantial overlap between those users and the ones who will be extremely confused by and unhappy with the existing MULTI/EXEC model--they're the folks with the most specific (wrong, in Redis) assumptions of how transactions should work, and suffer the most from them not working that way. Lua scripts, unfamiliar or not, are likely less troublesome in the long run for this cohort. Specifically, requiring users to be explicit about failure behavior of specific commands via call() vs. pcall() would remove one of the worst sharp edges of the MULTI/EXEC world.

Scripts can't answer other transaction-related needs, though. Ideally, I would have preferred that Redis go in the direction of a uniform set of conditions that can be applied to most write commands. There already are conditions in Redis, but they're not uniformly available: SET + NX/XX conditions single-key writes; WATCH semantically/implicitly conditions later EXEC commands with "if version of $key matches the version retrieved by the WATCH statement", etc. If that type of functionality were made explicit and uniformly available to all or most write operations, a further chunk of transaction-related needs could be addressed. When making single commands conditional isn't enough, scripts used to atomically batch-attempt commands could be invoked with parameters used to conditionalize those scripts' internal commands, and so on.
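The idea of explicit, uniformly available write conditions can be sketched as follows. This is a hypothetical toy model in Python, not an existing Redis feature or client API: `ToyVersionedStore`, `set_if_version`, and the per-key version counter are assumptions made for illustration. It shows NX/XX-style conditions next to an explicit version check, which is the framing used above for what WATCH does implicitly.

```python
class ToyVersionedStore:
    """Toy model of conditional writes; all names here are hypothetical."""

    def __init__(self):
        self._values = {}
        self._versions = {}  # key -> number of writes applied so far

    def get(self, key):
        return self._values.get(key)

    def version(self, key):
        return self._versions.get(key, 0)

    def set(self, key, value, nx=False, xx=False):
        # nx: write only if the key is absent; xx: only if it is present.
        exists = key in self._values
        if (nx and exists) or (xx and not exists):
            return False
        self._values[key] = value
        self._versions[key] = self.version(key) + 1
        return True

    def set_if_version(self, key, value, expected_version):
        # Explicit form of an optimistic check: apply the write only if
        # nobody has written the key since the caller read its version.
        if self.version(key) != expected_version:
            return False
        return self.set(key, value)


store = ToyVersionedStore()
store.set("balance", 10, nx=True)
seen = store.version("balance")

store.set("balance", 99)  # a competing write by "another client"

print(store.set_if_version("balance", 11, seen))  # False: version moved on
print(store.set_if_version("balance", 100, store.version("balance")))  # True
print(store.get("balance"))  # 100
```

Deletion, expiry, and multi-key conditions are left out; a real design would have to decide how each of those interacts with the version check.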

A final simple affordance in support of transaction-ish behavior would be a connection-scoped value type: either a modifier for arbitrary commands to have them operate on an empty database scoped to the connection, or a simple list-like value for connections to "stash" arbitrary data. This wouldn't fundamentally change any semantics, but would, at the cost of some indirection, marginally reduce the need for client complexity when buffering conditions/commands for later flush via a pseudo-"commit"-script. This is somewhat hair-splitting, though: MULTI/EXEC is already such a connection-scoped buffer, just one that stages commands and not data. My hunch is that a data-only buffer to be consumed by scripts instead of "EXEC" would be an improvement here, but I may well be wrong.

Now, the system that results from these changes is still not as ergonomic/low friction as traditional transactions, and is especially unergonomic when users have to manually capture undo state and decide on rollback semantics during the failure of script execution. As Antirez mentioned in an adjacent comment, AOF can help ensure appropriate consistency in the face of database crashes during script execution, but database level reconciliation--aka "what is the equivalent of 'rollback' for a given script"--is still on the user to work out.

But that's what we're really talking about here, isn't it? That lack of undo (that is: the ability to capture and discard transactional state a la MVCC) is at the root of most of the weird and not-quite-transactional capabilities of Redis in this era.

Antirez is totally right that adding those capabilities would have substantially complicated the Redis engine, and I believe him when he says that made it not worth it to do so. Given that, I'd have vastly preferred a Redis which embraced providing tools that work in spite of/with full acknowledgement of that lack, rather than concealing it/confusing users by mis-branding MULTI/EXEC as "transactions".


With all due respect, the linked video was pretty fair. It didn't imply not to use Redis, just not as a primary datastore.

I don't think folks work with Redis out of fondness for the model, but because it's the least worst datastore for caching, lightweight message broker, and simple realtime things like counters.


Talking about the broken API argument here. Also, Redis is particularly useful in exactly the situations OP leaves out. Leaderboard-style use cases with sorted sets are killer applications (super hard to model with SQL) of the data structure server thing. Apparently OP does not understand this and says "simple GET/SET" is what you should use Redis for.

Redis has probabilistic data structures, the ability to implement complex queueing patterns, and so forth. That's where the value is. Otherwise we would all still be on Memcached, not caring about Redis. Another killer app was Twitter's initial use case (then they used it for pretty much everything): caching the latest N Tweets using capped lists. I could continue forever.
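For readers who have not met the capped-list pattern, here is a rough Python analogy using `collections.deque` with a `maxlen`. It is a sketch of the idea only (in Redis the usual form is pushing to the head of a list and trimming it to N items); `make_timeline` and `push_tweet` are hypothetical helper names.

```python
from collections import deque


def make_timeline(max_items):
    # A bounded deque keeps at most max_items entries.
    return deque(maxlen=max_items)


def push_tweet(timeline, tweet):
    # appendleft on a full deque drops the oldest item from the right end,
    # similar in spirit to LPUSH followed by LTRIM 0 N-1.
    timeline.appendleft(tweet)


timeline = make_timeline(3)
for i in range(1, 6):
    push_tweet(timeline, f"tweet {i}")
print(list(timeline))  # ['tweet 5', 'tweet 4', 'tweet 3']
```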

So OP's argument is flawed IMHO and, for the reasons above, not fair. When you talk to students you need to do your homework: really understand the system you are talking about and provide a realistic picture of it. Then, yes, if you want, criticize it as much as you want, with grounded arguments.

You know what? I re-read this comment and it's embarrassing I even have to write this, because after 15 years of Redis history at such scale and popularity, pretty much everybody that was seriously exposed to Redis knows this stuff. Is tech culture really degraded so much that we have to restate the obvious? Do I really need to explain that GET/SET is not exactly where Redis shines, after 15 years of half the Internet using all kinds of Redis patterns?


> Is tech culture really degraded so much that we have to restate the obvious?

Maybe, though the author of the article is known to be a little bit too opinionated, and is unfortunately in the habit of expressing himself in a bombastic manner. The piece reads like a dramatic recap of the past year's sporting events, littered with irrelevant and disconnected references to lyrics and drama in the world of rap and hip hop. A "quirky and fun" journalistic abortion.


What are your thoughts about Rails switching to SQLite from Redis? I've only used Redis to store session data and cache app data. So my opinions are pretty limited and mostly positive.

https://rubyonrails.org/2024/11/7/rails-8-no-paas-required


My feeling is that for their use case, it makes sense to have something vertical that just covers the needs of Rails. AFAIK SQLite has a RAM backend, so you are still not going to hit disk. Seems like a good idea to reduce system complexity, to me.


Now, I have never used redis in this capacity, but these sorted sets (a data structure that maintains its sort as data is entered, I assume): how is that different from "create index on player_score (score)"? That is, an index on the score column, which will create a data structure that maintains its sorted nature as data is entered.

My naive view is that you create a sorted set every time you define an index. That is, the opposite of "super hard to model with SQL".
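To ground the comparison, here is a minimal sketch of the index-based version using Python's built-in `sqlite3` module. The `player_score` table and `score` column come from the question above; the `player` column, the index name, and the sample rows are made up for illustration. It shows what a top-N query and a rank query look like, without claiming anything about how they perform at scale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table player_score (player text primary key, score real)")
conn.execute("create index idx_score on player_score (score)")
conn.executemany(
    "insert into player_score values (?, ?)",
    [("ann", 310.0), ("bob", 120.0), ("cat", 250.0), ("dan", 90.0)],
)

# Top 2 players: an ordered read, which an index on score is suited to.
top = conn.execute(
    "select player from player_score order by score desc limit 2"
).fetchall()
print(top)  # [('ann',), ('cat',)]

# Rank of one player, expressed as "how many players score higher".
rank = conn.execute(
    "select count(*) from player_score "
    "where score > (select score from player_score where player = ?)",
    ("cat",),
).fetchone()[0]
print(rank)  # 1 (zero-based: only ann is ahead of cat)
conn.close()
```

The ordered read maps naturally onto the index. The rank is written as a count over a range, so how cheap it is depends on the database engine.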


Maybe it was a reference to two or three or N level deep nested tree structures. Those don't map as naturally using simple SQL relations.


> I re-read this comment and it's embarrassing I even have to write this, because after 15 years of Redis history at such scale and popularity, pretty much everybody that was seriously exposed to Redis knows this stuff. Is tech culture really degraded so much that we have to restate the obvious?

I know you wrote OG and did a lot of Redis but...

Yes, Tech culture is that fucked.

In my past job I was hired semi-specifically to deal with a concern, namely that their use case 'fit well with Kafka' but the latency for their case sucked at least as much as the API pain. (Yes I did something better. No, IDK if folks will ever see it).

Now?

Now I spend my days trying to 'paper-over' patterns that drive me to insanity just trying to make it work from a 'people need to learn why things work on a starship' level [0].

On a real level you didn't fail, Redis has lots of great patterns. On a -practical- level it's a shitshow because you now have lots of folks 'glue-gunning' the Redis API on use cases that probably need tweaking or aren't the right fit at all, alas they all worked off the same example on GH/SO/etc and then did their own "this wasn't even the right way to do this so I'm adding glue, what could possibly go wrong" case.

(That said, Nats has decent stuff for this in form of KV CompareExchange style APIs, and I see the inspiration there, so that's something to feel good about.)

[0] - Namely, if anyone has a good prompt for taking a photo of someone and doing img2img of 'Person in astronaut uniform preaching from an ivory tower', that would be a plus


I am grateful for Redis and I agree you pioneered a lot of data access patterns in production for a lot of people, myself included. I've used Redis for 10 years, at times for use cases as you mention, for real time feature engineering for ML as well.

The API is just different compared to SQL, which is a downside for many. There are modern advancements in the space with IVM, and more databases are supporting probabilistic data structures.


Maybe this is a weird question but, knowing only some math and not redis, what is a sorted set and how is it different than a list/tuple?


Sorted sets are abstract data structures where you insert elements into a set, but every element is associated with a floating point score. Elements are kept ordered inside the sorted set, so you can ask for ranges, or a specific element's rank (position), and so forth. This is one of the (many) cases where Redis is the best idea to get started and deliver (see for instance the Instagram case, that used Redis for years while becoming bigger and bigger). Then as you understand you are at scale and need just XYZ, you may choose to implement XYZ inside your system in other ways and that's it.
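A toy version of the data structure may help with the comparison to a plain list or tuple. The sketch below is a hypothetical `ToySortedSet` in Python built on the standard `bisect` module. It is far simpler and slower than what Redis does internally, and the method names are illustrative rather than Redis commands; it assumes non-negative rank arguments.

```python
import bisect


class ToySortedSet:
    """Toy sorted set kept as a sorted list of (score, member) pairs."""

    def __init__(self):
        self._scores = {}  # member -> score, so each member appears once
        self._items = []   # sorted list of (score, member)

    def add(self, member, score):
        # Re-adding an existing member moves it to its new score.
        if member in self._scores:
            old = (self._scores[member], member)
            self._items.pop(bisect.bisect_left(self._items, old))
        self._scores[member] = score
        bisect.insort(self._items, (score, member))

    def rank(self, member):
        # Zero-based position in ascending score order, or None if absent.
        if member not in self._scores:
            return None
        return bisect.bisect_left(self._items, (self._scores[member], member))

    def range(self, start, stop):
        # Members between two ranks, inclusive, lowest score first.
        return [member for _, member in self._items[start:stop + 1]]


board = ToySortedSet()
board.add("ann", 310.0)
board.add("bob", 120.0)
board.add("cat", 250.0)
board.add("bob", 400.0)      # bob's score changes, he is not duplicated
print(board.range(0, 1))     # ['cat', 'ann']
print(board.rank("bob"))     # 2
```

Unlike a list, membership is unique and the order is derived from the scores, so updating a score repositions the element automatically.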


Thank you for explaining it! I appreciate that.


Redis is stable, powerful, widely supported, and has been running strong... over a decade now? I've never heard it recommended as a primary datastore... why would someone do that? I've seen it used at scale for numerous businesses now and it's caused problems exactly never. People understand how to use it because it's relatively simple and provides the first things you need beyond the database. Do people complain about redis commonly? News to me.


Adtech/ML


Yea, I always use Redis for very specialized purposes.

Like offloading a shared data structure between threads / processes / machines so that I don’t have to deal with thread safety issues.


I understand machines but threads?! Why introduce IPC overhead on the fastest/easiest way to share data? This is beyond a solved problem and your language probably has multiple ready-made battle tested solutions.

In Python you don't even need a lib, dict is thread safe even in nogil.


Depends on the use case. Often if I’m sharing data between threads there’s a good chance I’ll want to scale it to another instance at some point so just going straight to Redis is pretty common.

Especially if there’s a chance I’d want that data to persist across restarts.

It’s one thing if I’m using a BEAM language, but otherwise I’ll usually reach for Redis.


> In Python you don't even need a lib, dict is thread safe even in nogil.

Is it? https://google.github.io/styleguide/pyguide.html#218-threadi...


Yep! Not every operation you can do on a dict is thread safe but if you find a situation where it isn't it's a bug.

https://github.com/python/cpython/issues/112075

Google is recommending people not rely on it because it does make dict subclasses not substitutable. It's easy enough to avoid the issue completely so in most cases you might as well do that.
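A small sketch of the distinction being drawn, assuming CPython: a single item assignment on a shared dict is treated as one safe operation, while a read-modify-write such as `d[k] += 1` spans several steps and is guarded with a lock here. The names `shared`, `counter`, and `worker` are illustrative.

```python
import threading

shared = {}
counter = {"hits": 0}
counter_lock = threading.Lock()


def worker(worker_id, n):
    for i in range(n):
        # A single item assignment: each thread writes its own keys.
        shared[(worker_id, i)] = i
        # A read-modify-write spans several steps, so it takes a lock.
        with counter_lock:
            counter["hits"] += 1


threads = [threading.Thread(target=worker, args=(w, 1000)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared), counter["hits"])  # 4000 4000
```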


> Andy's video (linked in the post)

Is there a "too long; didn't watch" summary anyone knows of? I hate videos, but am curious lol


As far as I can tell the two main criticisms in the video are that:

1. The Redis API requires the developer to use different commands to retrieve/manipulate data depending on the type of data being stored. To retrieve a string you use GET, but if you want to retrieve a list it's LRANGE, for a set it's SMEMBERS, for a hash it's HGETALL. (As opposed to an API design which would allow you to call GET on all of the different data types and have it return the right thing.)

2. The lack of a predefined schema means you can overwrite values with different types. So you can create a list named "foo" and then overwrite it with a string named "foo" and then overwrite that with a hash named "foo" and Redis will happily do it, meaning the developer needs to keep track on their end what actual type any given key is holding onto.

To me these criticisms come across as essentially saying "Redis doesn't behave like a RDBMS" to which I suppose antirez's point is "well, yeah, it's not supposed to".


thanks for saving me some time!


Right there with you. The trend towards video content instead of written sucks so much.


same same


I've been working on something Redis-y over the holidays, and it has reinforced my view that it's the epitome of a 20%-80% tool. I've always used the 20%, but anything beyond that sounds useless unless you've encountered the requirement in a production environment. The challenges Redis has been solving for years never really touched the research/academic community (even the 20%).

Even in the various taxonomies of DBS in the research literature, Redis was mentioned with a wave of the hand as an "in-memory" database, which undersells the important (for me) part of the "data structure" server.

Putting the "database" after Redis could be a marketing misstep. Because it puts you in the is-it-sql territory.

TL;DR: Redis is mostly appreciated by practitioner (web) developers. Academics find it lacking a theoretical foundation, so... meh.


Developers know its limits. Or you have developers with vague "scaling issues" or "buggy caching" who don't understand why they have them or suddenly start suffering from them at inconvenient moments.


SQL is king, and history has shown non-SQL languages are not good, which causes many non-SQL DBMSs to adopt SQL eventually.


Many non-SQL DBs had query languages that were broken Javascript-ish versions of SQL. Of course, this is wrong, and people will eventually adopt SQL instead. But if your data model isn't anything like relational DBs, non-SQL makes a ton of sense. OP seems to miss exactly this, that the Redis query language is shaped on the Redis data model, that is basically alien to the relational model.

The idea behind the Redis data model is that "describe data" then "query those data in random ways" is conceptually nice but in practice will not model many use cases very well. SQL databases plagued tech with performance issues for decades because of that. So Redis says instead: you need to lay out your data thinking about fundamental things like data structures and access times and the way you'll need those data back. And the API reflects this.

You don't have to automatically agree with that. But you have to understand that, then provide your "I'm against" arguments. Especially if you are in front of young people listening to you.


Agreed. Many noSQL-boom-era databases eventually bolted on a SQL-esque layer, but that was also because they were mostly also all targeting "enterprise database" use cases and customers who both expected that and whose use cases largely fit with it. So, there was a lot of pressure to conform to norms when the advantage of not doing so wasn't immediately self-evident.

We have a database [1] and query language [2] that's tailored to storing & querying trace/telemetry data produced by different layers and components of cyber-physical systems for systems engineers to analyze, verify, and validate what a complex system is doing. It's not quite a traditional relational problem. It's not quite a traditional time series problem. It's not quite a traditional graph problem.

Addressing the way that systems engineers think about their domain in an effective way required coming up with something different. Are there caveats and rough edges? Sure. But, they're a lot less pernicious and onerous than the alternative of trying to leverage a bunch of ill-fitting menageries of different solutions.

Redis is fit-for-purpose. So, it makes sense that its query interface would also express that.

[1] https://docs.auxon.io/modality/

[2] https://docs.auxon.io/speqtr/


> But if your data model isn't anything like relational DBs, non-SQL makes a ton of sense. OP seems to miss exactly this, that the Redis query language is shaped on the Redis data model, that is basically alien to the relational model.

Sure...but all roads lead back to SQL eventually. Another recent example also mentioned in the OP is BigTable adopting SQL.


> but all roads lead back to SQL eventually

No they don't. SQL is designed for relational databases.

Other forms, e.g. JSON, Graph, Key/Value, all use other query languages.


If you write code that uses a hash map, would you insist on using SQL to query it? This makes no sense.


For what it's worth, SQL kind of sucks. It's just the de facto choice because it's extremely widespread and good enough for 80% of the use cases out there and what's missing can be kludged on top of it, either by specific DB vendors, or by various extensions.

It's not too hard to come up with alternatives that improve upon individual aspects of SQL like https://prql-lang.org/ but the barrier to entry is about as high as trying to make a huge social media network, so most attempts will remain niche.

Then again, most software kind of sucks, it's just that some of it also works. For example, the Linux FHS reads like an overcomplicated structure that is the way it is for historical reasons, but works in practice.


SQL (and RDBMS in general) has its limitations, particularly with regards to recursive operations.

An extended Datalog[1] can provide performance optimizations not available to RDBMS.

[1]: https://dl.acm.org/doi/10.1145/3639271



