"almost always" atomicity means "not atomic". (if i guess correctly, there is no stage/commit phase for data mutations in the aof persistence, so i believe incremental changes are written to the AOF as the script runs so you can have partially-applied updates on restore if your redis process dies in the middle of executing a watch/multi/exec or lua script.) also, instances die all the time.
Hello, I'm not sure of the setup Redis Labs is using, but vanilla Redis AOF does not allow partially-applied updates. Not for Redis transactions (MULTI/EXEC), nor with scripting. The same holds for the replication channel. Redis goes a very long way to avoid partial application of Lua scripts; more info is on the EVAL page and in the description of the -BUSY error returned while a script that has already made writes has not yet returned.
Ah, thank you. I should have checked the docs! It is nice to know that you are writing the script and MULTI/EXEC semantics into the AOF log (at least in the default config) and that my guess is wrong.
I still wonder what the details are around the "almost" in "almost always" and stand by the conclusion that "almost atomic" is not the same as "atomic".
My best guess is that the author is referring to the fact that there are no rollbacks in Redis transactions, but I'm not sure. I'll try to ask internally. Thanks!
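If that's what it is, the behavior is easy to demonstrate (a minimal redis-py sketch; key names are invented): a runtime error inside MULTI/EXEC does not undo the commands that already ran.

    import redis

    r = redis.Redis()
    r.set("some_string", "hello")            # a plain string, so INCR on it will fail

    pipe = r.pipeline(transaction=True)      # wraps the queued commands in MULTI/EXEC
    pipe.set("k1", "v1")
    pipe.incr("some_string")                 # runtime error inside the transaction
    try:
        pipe.execute()
    except redis.ResponseError:
        pass
    print(r.get("k1"))                       # b'v1' - the earlier SET was not rolled back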
This is where the durability argument has a problem as well.
At a first glance, this looks like an un-replicated system, which means that the loss of an instance is an availability nightmare.
The worrying quote from the article, for me, is this one:
>> With Redis Enterprise, it is easy to create a master-only cluster and have all shards run on the same node
Another node has to be brought in and attached to the same network disk to restore access to that key range?
500k+ ops/sec is nothing to laugh at on a single node with a 1:1 read-write ratio; however, the fragility of this system is concerning.
Half a decade ago, I was working with row-level atomic ops in ZBase/Membase (set-with-cas[1]), which got away with using replication, instead of an SSD plus an fsync, to back the durability of operations. The 99th-percentile latencies were at 3-4ms, but scalability and availability were baked in.
I'm not sure how the append-only mode works, but I tried to run a script (with the appendfsync directive set to always) that sets a new key in an endless loop, and killed the server in the middle of the execution.
No changes were written to the AOF, so I guess the operation is atomic.
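Roughly what I ran, for reference (a sketch; key names are illustrative, and redis-server gets killed from another shell while the EVAL is still running):

    import redis

    r = redis.Redis()
    # endless write loop inside a single EVAL; kill redis-server from another
    # shell mid-run, then restart and check whether any k:* keys are in the AOF
    script = """
    local i = 0
    while true do
      i = i + 1
      redis.call('SET', 'k:' .. i, i)
    end
    """
    r.eval(script, 0)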
Re the first question - enterprise Redis includes, besides the management utilities you can expect from such a product, a different cluster architecture (that actually predates Redis Cluster) and cluster manager; it doesn't require a cluster-aware client as it uses a proxy tier (that also does implicit pipelining and other stuff to increase throughput). And the next version of it will include some extra features like flash support, extra modules, etc.
Re the second and third questions - I'm afraid I cannot answer. I hope the author of the post will reply tomorrow morning (it's the middle of the night here).
To add to what dvirsky said [and I'm also from Redis Labs]
- Redis Enterprise adds some enhancements to the Redis storage layer (details are in the blog).
- This benchmark only tested Redis Enterprise. The idea was to show how fast Redis (Enterprise) can run on a single node with ACID transactions while still keeping sub-millisecond latency. Note the hidden point - at the moment you cannot achieve sub-millisecond latency over cloud storage (any cloud), meaning persistent storage that is attached to an instance rather than local storage, which is ephemeral by design. So in order to see how far we could go, we decided to test over Dell-EMC's VMAX, which doesn't have these limitations.
- In theory, adding CPUs/cores can of course help, as you can add more shards to the Redis cluster and increase the parallelism when accessing the storage. That said, we haven't tested it on an AMD Threadripper.
To add to what dvirsky and Yiftach said, this video [1] from RedisConf17 gave me the best understanding of the storage layer of Enterprise Redis. Before that, it was hard for me to break through the marketing fluff. Amazing what it can do with NVMe and/or Optane.
I'm curious about all of these as well. Also, is this a first-party (i.e. antirez) piece of software? If not, has he blessed calling something "Redis Enterprise"?
Yes. Full disclosure: I work for Redis Labs, but prior to that I designed the back-end of a server-heavy mobile app (the EverythingMe launcher) with Redis as its "front-end database". Meaning it wasn't the only source of truth, but it wasn't a cache either - we served data to users from Redis only, while some of it was migrated (live or in batch) from MySQL, and some was the output of machine learning jobs that filled up data in Redis.
Some of it was done with raw Redis commands, and for some we wrote a data-store engine with automatic indexing and optional mapping to models - https://github.com/EverythingMe/meduza
We also used Redis for geo tagging, queuing, caching, and probably other stuff I've forgotten. It is very flexible, but requires some effort when not used as just a cache.
I use it for a specialized time-series storage / messaging layer. We are receiving stock market data directly, normalizing it into JSON and also PUBLISHing these objects via Redis to consumers (generally connected through a custom WebSocket gateway). We basically turn the whole US stock market into an in-memory sea of JSON, optimized for browser-based visualization.
Redis is great because of its multiple data structures.
Depending on their "kind", these JSON objects are either `APPEND`ed onto Redis Strings (e.g. for time & sales or order history), `HSET` (e.g. opening/closing trade), or `ZSET` (e.g. open order book).
Sometimes an object transitions from a SortedSet to a String. We used to handle this with `MULTI` but now we use custom modules to do this with much better performance (e.g. one command to `ZREM`, `APPEND`, `PUBLISH`).
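For comparison, the old `MULTI`-based version looked roughly like this (a redis-py sketch; key and channel names are made up):

    import json
    import redis

    r = redis.Redis()
    payload = json.dumps({"sym": "AAPL", "px": 189.5})

    # move the object out of the sorted set, append it to the string-based
    # history, and notify subscribers - one MULTI/EXEC round trip
    pipe = r.pipeline(transaction=True)
    pipe.zrem("book:AAPL", payload)
    pipe.append("hist:AAPL", payload + "\n")
    pipe.publish("ch:AAPL", payload)
    pipe.execute()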
We run these Redis/feed-processor pairs in containers pinned to cores and sharing NUMA nodes using kernel bypass technology (OpenOnload) so they talk over shared-memory queues. This setup can sustain very high throughput (>100k of these multi-ops per second) with low, consistent latency. [If you search HN, you'll see that I've approached 1M insert ops/sec using this kind of setup.]
We have a hybrid between this high-performance ingestion and long-term storage. To reduce memory pressure (and since we don't have 20 TB of memory), we harvest these Redis Strings into object storage (both NAS and S3 endpoints), with Postgres storing the metadata to facilitate querying.
We also do mundane things like auto-complete, ticker database, caching, etc.
I love this tech! It's extremely easy to hack Redis itself and now with modules you don't even need to do that anymore.
I'm using Redis as the main data store for my semi-serious chat app project. For better or worse, all tables, including the user DB, are stored in Redis.
In general, an extra layer on top of Redis is a must to make it even a simple searchable database. Indices especially can't be implemented ad hoc with separate Redis commands; only transactions or Lua snippets that update both the data and the related indices atomically can avoid data corruption. For my own use, I wrote https://github.com/ilkkao/rigidDB - it supports indices and a strict schema for the data.
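The core trick is small; a minimal sketch of the idea (illustrative names, not rigidDB's actual API):

    import redis

    r = redis.Redis()
    # write the record and its secondary index in one atomic Lua script, so
    # nothing can interleave between the data write and the index write
    script = """
    redis.call('HSET', KEYS[1], 'email', ARGV[2])
    redis.call('HSET', KEYS[1], 'name', ARGV[3])
    redis.call('HSET', KEYS[2], ARGV[2], ARGV[1])
    return 1
    """
    r.eval(script, 2, "user:42", "index:user:email",
           "42", "ada@example.com", "Ada")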
I think somebody has described Redis at some point as a database-SDK. Makes sense to me.
I used Redis' Lua scripting to implement a per-user/account cache; I wanted to provide a memcached instance per account, but also enforce a limit on cache size so a single account couldn't cache GBs of data and degrade the service for everyone. I used hashes to track items and their sizes per account, and simply calculated the total size of all live objects on insert.
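Roughly like this (a reconstruction of the idea with invented key names and a made-up quota, not the production code):

    import redis

    r = redis.Redis()
    # per-account item sizes live in a hash; refuse the insert when the
    # account would exceed its quota - all inside one atomic script
    script = """
    local used = 0
    for _, s in ipairs(redis.call('HVALS', KEYS[1])) do
      used = used + tonumber(s)
    end
    if used + #ARGV[2] > tonumber(ARGV[3]) then
      return 0  -- over quota; caller decides what to evict
    end
    redis.call('HSET', KEYS[1], ARGV[1], #ARGV[2])
    redis.call('SET', KEYS[2], ARGV[2])
    return 1
    """
    ok = r.eval(script, 2, "acct:1:sizes", "acct:1:item:k1",
                "k1", "some cached blob", 1024 * 1024)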
Hyperdex is no longer maintained [1]. While the technology is impressive, the author seems to have lost interest, and is now working on something called Consus [2].
Hyperdex's problem all along was that the author - a very talented developer from what I can tell - seems more invested in his projects from the perspective of academic research (he's at Cornell) than in delivering a practical, living open source project. He tried to form a company around Hyperdex (the transactional "Warp" add-on was commercial) even though nobody seemed to be using it, and he was the sole developer. Unfortunately, as interesting as Consus is, history seems to be repeating itself there.
But yeah, Hyperdex seemed to have real potential at one point. It was the only NoSQL K/V store (at the time) that had transactions.
Good point, but at least you could try out Hyperdex and consider whether you wanted transactions. But this is pretty moot at this point, unless someone picks up Hyperdex development again.
I guess I don't understand who requires less than 1ms latency on ACID writes in a system accessible only through socket interfaces. Even so - isn't this benchmark simply pushing the requirement of fast disk syncs onto a fast flash drive designed for DMA? I mean, okay... I guess... did customers actually think the networking wasn't the latency culprit?
FWIW, I just get a blank page in FF with uBlock running. I believe it's the "naked-social-share" element, which various third-party lists (Fanboy's) block.
Redis won't function until it has loaded all data from disk into memory. So if your webapp doesn't have too much data, then possibly.
Also, since Redis is memory-backed, you need more RAM than data. This can get very costly.
Another annoyance is that Redis is single-process and single-threaded, so you really have to avoid long-running queries unless you do extensive manual sharding.
(Disclaimer: it's been a few years since I had to think about these constraints so maybe some are removed in more recent versions of Redis)
Sort of - the model it supports is not really multi-threaded. Modules can spawn threads and acquire a "GIL" when they want to touch actual Redis data - thus only one thread at a time actually "owns" the entire Redis instance.
This allows long-running queries to do primitive cooperative multi-tasking, releasing the GIL and letting other queries have a chance; but there is no real parallel data access. You will only gain real parallelism if you have actual work to do that does not touch the data directly while a thread is not holding the GIL. There aren't many cases where this applies - usually copying the data aside to work on it is not worth the gain in parallelism.
[disclosure I'm from Redis Labs] - around 50% of our 7K+ paying customers use Redis as a database.
In general, you can setup an environment in which Redis is HA (replication + auto-failover) and persistent.
Redis Enterprise does it by default + backup and DR + some enhancements to the Redis storage layer (as mentioned in this blog). These enhancements, together with the high-end storage device by Dell-EMC, allowed us to reach this throughput & latency. BTW, this test ran on a single node; you can scale it by just adding node(s) to the cluster.
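On the open-source side, the replication + auto-failover piece mentioned above is typically done with Redis Sentinel; a minimal redis-py sketch (host names, ports, and the master name are illustrative):

    from redis.sentinel import Sentinel

    # connect through Sentinel so a failover is transparent to the client
    sentinel = Sentinel([("sentinel-1", 26379), ("sentinel-2", 26379)],
                        socket_timeout=0.5)
    master = sentinel.master_for("mymaster", socket_timeout=0.5)
    master.set("k", "v")
    replica = sentinel.slave_for("mymaster", socket_timeout=0.5)
    print(replica.get("k"))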
As far as I know, AOF is still not the default setting. It's important to point this out, or otherwise Redis will suffer the same fate as early MongoDB, which had a similar default persistence model which many users didn't fully understand.
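Easy to check on a stock install (a quick redis-py probe; the values shown are the shipped defaults as far as I know):

    import redis

    r = redis.Redis()
    print(r.config_get("appendonly"))   # {'appendonly': 'no'}  - AOF is off by default
    print(r.config_get("appendfsync"))  # {'appendfsync': 'everysec'} once AOF is enabled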
As long as your data will fit in RAM in a cost-effective way, and you can scale your cluster when it exhausts its memory. AFAIK Netflix are using it as a "real database", but of course not for everything - https://medium.com/netflix-techblog/introducing-dynomite-mak...
All I see with uBlock is a black screen. I have to whitelist over a dozen ad trackers to see the content. Several ad scripts pull in even more scripts from external domains. Try it yourself.