> ¹Basic Tier instances experience downtime and a full cache flush during scaling. Standard Tier instances experience very minimal downtime and loss of some unreplicated data during scaling operations. ²Applicable for GA release only.
Does that mean that right now there are no persistence options with this managed Redis cluster? I've never tried Redis, but I've been eyeing it for a while, and the most confusing part is how it handles persistence.
Apparently there are ways to have Redis configured to be pretty durable as outlined here:
Maybe this is just for the initial period. Redis has orchestration commands that allow you to stop clients on one side, migrate, shut down the first instance, and resume on the other, without the possibility of writes landing in the middle of the transition. Other providers do that to ensure operations are safe.
Nope, no vendor does (fortunately). AFAIK people get the "Redis way": grab the source and do whatever you want with the code, within the limits of the license / law. I also get almost no contact from big corps using Redis at scale. This is a sane model; otherwise my work would be dealing with communications instead of writing code :-) When technical people from big companies do ping me about bugs / issues / improvements, it's great feedback, but it's rare.
It sounds like they're scaling by adding larger instances, replicating, and then failing over. So they'll lose any data that hasn't been synced to the new instance when the failover occurs. I'd bet it's rare, but they're being explicit about it.
The Basic Tier probably only uses one instance, so they just replace it in place.
Their post isn't very clear on persistence, here's how I read it.
1) Yes, they're using Redis persistence (RDB, AOF, or a Google-written persistence layer, it's not specified).
2) Basic Tier scales by tearing down the smaller Redis instance and migrating the data (???) to a larger instance. This causes downtime, but no data loss.
3) Standard Tier migrates using replication during scaling, so there's little downtime but if data is in flight at cutover, it could be lost.
4) And "coming soon" you can migrate data in and out of Cloud Memorystore by importing and exporting RDB files.
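For reference, the persistence being speculated about in point 1 is driven in stock Redis by a few redis.conf directives, per the persistence docs (the thresholds below are illustrative defaults, not anything Google has announced):

```
# RDB: snapshot to dump.rdb if >= 1 key changed in 900s, or >= 10 in 300s
save 900 1
save 300 10
# AOF: log every write command, fsync once per second
appendonly yes
appendfsync everysec
```

Whether the managed service exposes any of these knobs is exactly what the announcement leaves unclear.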
Exactly. It will still be useful to some people: it allows them to get something working end-to-end, and then iterate on it to provide the features for more people.
This is a great looking product, thanks for following up here.
Can you speak to what kind of durability you expect to achieve once persistence is in place? Is it reasonable to treat this as a durable datastore and simply have a disaster recovery plan in place? Or should applications be prepared to handle occasional minor data loss?
Transient caches are still a useful type of data store for better performance... That said, they list persistence as a goal and note its absence as a beta limitation, so they're not satisfied with pure transience either.
They use the term “data store” multiple times, not to mention:
> and features like persistence, replication and pub-sub.
(Customer testimonial)
> We have used Redis on everything from storing asynchronous task queues for tens of thousands of CPUs to a centralized persisted key-value pair store for the feature vectors output by our ML models
Additionally, “persistence” isn’t mentioned at all in the “Coming soon to Cloud Memorystore” section.
I’m not trying to be a pedant, I just think it’s dangerous that it’s not made abundantly clear that there is no persistence.
On the surface it looks like this isn't a true-blue Redis server (or even a fork of Redis) but support for the Redis protocol bolted onto an in-house memory-based caching product.
That's not really an answer to the question. You may not agree with it, but there are many orgs out there using Redis as a persistent datastore. If Google wants their money, they're going to need persistence options.
I wanted to see if Google's implementation sacrifices any speed.
snip
EDIT: I deleted the benchmark results from this post because they're meaningless. I used the "intrinsic latency" tool [0] that Redis provides, but it must be run on the server.
So my results only reflected the intrinsic latency of the client VM, not the new Cloud Memorystore.
redis-cli -h 10.0.0.3 --intrinsic-latency 100
677411416 total runs in 100 seconds
avg latency: 0.1476 microseconds / 147.62 nanoseconds per run
Open-source Redis on a Google Cloud Platform micro instance, 0.6 GB:
redis-cli -h localhost --intrinsic-latency 100
353427208 total runs in 100 seconds
avg latency: 0.2829 microseconds / 282.94 nanoseconds per run
Open-source Redis on an AWS EC2 micro instance, 1.0 GB:
redis-cli -h localhost --intrinsic-latency 100
21681751 total runs in 100 seconds
avg latency: 4.6122 microseconds / 4612.17 nanoseconds per run
My benchmark:
Open-source Redis on a Hetzner Cloud VPS, CX11 (92% cheaper than Google's Cloud Memorystore):
redisbench-client:~# redis-cli -h 88.99.124.195 --intrinsic-latency 100
Max latency so far: 1 microseconds.
Max latency so far: 77 microseconds.
Max latency so far: 113 microseconds.
Max latency so far: 130 microseconds.
Max latency so far: 2562 microseconds.
Max latency so far: 2835 microseconds.
Max latency so far: 4165 microseconds.
Max latency so far: 5497 microseconds.
757281326 total runs (avg latency: 0.1321 microseconds / 132.05 nanoseconds per run).
Worst run took 41628x longer than the average latency.
Unfortunately, intrinsic latency does not measure the latency of the Redis instance but that of the whole host, i.e., the kernel scheduler's max latency.
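What does include the network and the server is redis-cli's client-side latency mode, which times actual PINGs from wherever the client runs (the host below is a placeholder):

```
redis-cli -h 10.0.0.3 --latency
```

Run from a client VM against the Memorystore endpoint, that would give the round-trip numbers the original benchmark was after.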
Why is it that GCP can provide an internal IP address for Cloud Memorystore but not for Google Cloud SQL? It would be beneficial for these teams to work together; the lack of an internal IP in Cloud SQL makes it much less appealing than AWS RDS.
Private IP support for Cloud SQL is on the roadmap.
The reality of development is that some things are easier to do on a new product. It's not a matter of communication.
> authorized networks ensure that the Redis instance is accessible only when connected to the authorized VPC network
It's good that this is handled smoothly, as it's the way Redis wants auth to work.
I've always seen the security story to be one of the big weaknesses of Redis. No TLS, opt-in password with no user and only one shared password, no per-database privileges, etc.
Does anyone know the status of TLS in Redis? I heard somewhere that Amazon has a patch to add that directly, rather than having to use stunnel.
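In the meantime, the usual workaround is TLS termination in front of Redis via stunnel. A minimal client-side stunnel.conf sketch (hostnames and ports are placeholders):

```
; accept plaintext locally, speak TLS to the remote Redis endpoint
[redis-tls]
client  = yes
accept  = 127.0.0.1:6379
connect = redis.example.com:6380
```

The application then connects to 127.0.0.1:6379 as if it were a plain Redis server.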
This is exciting, but I really hope that Terraform will add support for it in reasonable time. These days I'd prefer to not manually manage cloud resources if I don't have to.
Hey jchw, Dana from GCP here. No promises on the timeline, but I can tell you that adding a request in the issue tracker (https://github.com/terraform-providers/terraform-provider-go...) will at least put it on our radar so we start working on it sooner.
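For anyone filing that request, a sketch of what such a Terraform resource might look like once the provider supports it (the resource and attribute names here are a guess, not a shipped API):

```
resource "google_redis_instance" "cache" {
  name           = "my-cache"
  tier           = "STANDARD_HA"
  memory_size_gb = 1
  region         = "us-central1"
}
```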
Hashicorp doesn't seem to like Google Cloud too much. I've been moving to Google Cloud Builder and its Jinja and Python templates, which I happen to be a little familiar with...
I’m not sure why you’d say that Hashicorp doesn’t like Google Cloud. I can think of recent partnerships between Hashicorp and Google on Vault and Terraform that are both substantial integrations. Google Cloud and Hashicorp both have people that work on the GCP Terraform Provider, and they do an excellent job keeping it up to date. By all external measures this looks like a great relationship.
Google Cloud Deployment Manager, you mean? Yeah, that is linked tightly with Google's API system which means it has the fastest support for new things.
Google does (last I checked) put some paid time into systems like Terraform so that they support GCP well, but there's always going to be more of a lag, especially before GA.
I don't know Terraform well enough to know how big the lag is for new AWS features, but some lag probably exists even there despite AWS's dominant market share.
There's a bunch of undocumented GCB features by the way. Pop into their Slack and they'll often give you tips that haven't made their way into the docs yet. We've been mostly happy with it, but still use Jenkins as the kickoff point.
If postgres on cloud sql is any indication, then it will only be another year and a half! :) They seem to take the Beta/GA process very seriously and don't rush things to GA.
It's fully managed, so you don't need to worry about OS security patches, rotating logs and all the other administration that comes with running your own server. Basically the same benefits as using Cloud SQL/Aurora RDS vs maintaining your own MySQL server.
Disclaimer: I work at GCP, but not on Memorystore.
Is it managed Redis, or something Google made themselves that has the same API? I wonder if they will keep up with all the new stuff in Redis, like streams.
I was pretty excited when Memorystore came out, as there is very little I enjoy less than managing Redis servers. However, I've had nothing but trouble since switching. Applications that ran smoothly on Redis instances deployed on Kubernetes Engine are giving me all sorts of problems now.
"NOREPLICAS Not enough good slaves to write" errors have been extremely common, sometimes followed by problems connecting. Then operations on the memorystore instance start pending (I mean "repairing"), taking 20+ minutes. During that time I obviously get connection refused errors. This usually comes with huge spikes in network in/out that are completely unexplained by the app pointing to the service.
I thought the pitch for MemoryStore was that it was a managed service; y'know, less time on devops and all of that. I've found the exact opposite to be the case. Pay more to Google, spend more time on devops, get a redis service that doesn't work.
If you're already accessing it from within GCP, their network is excellent: latency not much higher than a high-quality internal physical network's (within an order of magnitude), and it's highly flexible and configurable. In that use case it should be comparable to any solution you can host within GCP, except hosting on the same instance as each client and using loopback access.
For accessing from outside GCP, yeah the broader internet latency would figure into this as any other external solution.
Double digits of microseconds of difference as of some time last year, if I remember correctly - it might have gotten even better since then since I know Google wanted to narrow that gap.
(Disclaimer: While I have in the past worked for the GCP team, nothing in this comment relates to my time at Google or to info I learned then.)
If Redis Cluster support is coming soon, how are they currently handling cross-region replication and failover? Is Redis Sentinel (painful, and requires client support) deprecated now?
It seems like they might be running Redis Sentinels (for handling failover and replication), but exposing it outside as a single IP, instead of giving a list of sentinels to connect to.
Applications might use that endpoint like a standalone Redis instance. Requests to that endpoint will be routed to the current master.
Effectively abstracting the details about the Sentinels and their configuration from the clients.
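If that's the design, the proxy is answering the question clients would otherwise ask Sentinel directly with the standard discovery command (the host and master name below are placeholders):

```
redis-cli -h sentinel.internal -p 26379 SENTINEL get-master-addr-by-name mymaster
```

Exposing a single stable IP means clients never have to issue that lookup or re-resolve the master after a failover.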
I was hoping to see more flexible pricing, e.g. per 100 MB provisioned, based on the far-fetched assumption that it's not just a managed Redis(-like?) offering but a fully abstracted managed Redis service where you're simply charged by GB/s used, as with lots of other services.
This just boils down to the same cost as self-hosting, minus the self-hosting hassle. So not too bad: it's just easier to use, plus no vendor lock-in since it's Redis-compatible at a similar price point.
Redis Labs (the official sponsor of Redis now) uses a custom proxy layer on top of Redis, giving you the latest 4.0 version with support for modules, no downtime and features like storing the values of keys on SSDs. They also support cross-region multi-master using CRDTs.
If you need VPC/internal access only, or are fine with the lower level of features, or just want to consolidate on GCP services, then Cloud Memorystore can work, but otherwise I'd recommend Redis Labs.
how can an instance with 1 GB cost 35 USD per month...
also I would've hoped that Google wouldn't add tiers, more like "limits", and that if I only host 1 MB on my instance I would only pay for 1 MB... besides that, I could scale to X GB (depending on my limit)...
$35 is, what, 30 minutes of an engineer's time from a company's perspective? If the management of the service would take any more than that every month (and it easily could!), you're ahead.
Even then, I think it'd be unlikely that the time-cost wouldn't make it worthwhile. People generally underestimate how much time gets soaked up by managing a computer service.
$35 seems to be about 3x the cost of AWS's smallest tier (no further comparison of features/value beyond price), so it does seem expensive for an introductory project, considering a single ~$35 DigitalOcean VPS might be all some companies need their first year. On the other hand, the number of times I've seen an engineer suggest spending $500-$2000 in labor/hours (when already behind on mission-critical work) as an alternative to spending $5/month on some SaaS is just staggering.
In a datacenter, a network request to a server that has the information in DRAM can be faster than pulling it locally off of an SSD. It goes back and forth with perf gains on either side, but they're on the same order with different tradeoffs.
Especially if that "local SSD" is really a virtualized block device over a network anyway, which is how most people are deploying their cloud stuff for convenience.
Is this still the case? We don’t use much cloud from the big boys but I thought even they were moving towards local storage. All of our OVH instances are local SSD backed.
Both are available and AWS and GCP are pretty up-front about what you should use depending on the use case. The virtualized devices do have advantages in terms of being able to be moved around and not tied to the life of a single instance.
E.g., if you have multiple web servers and a lot to keep cached, it's more economical to have one machine with a lot of memory than to size up each web server.
Another would be keeping the cache consistent: a single instance of Redis offers atomic operations and strong consistency, e.g. ensuring rate limiting is correctly applied across a service's instances.
Another would be decoupling state from apps, which is especially important for serverless/FaaS, where app memory is frequently cleared.
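The rate-limiting point is usually built on INCR plus EXPIRE against a per-window key. A minimal sketch of that pattern in plain Python (no Redis server involved; the class just models the same counter-per-window semantics):

```python
import time

class FixedWindowRateLimiter:
    """Models the classic Redis INCR + EXPIRE rate-limiting pattern:
    one counter per (key, time window), reset when the window rolls over."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # (key, window_index) -> count

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        window_index = int(now // self.window)   # which window 'now' falls in
        bucket = (key, window_index)             # like INCR on "rl:{key}:{window}"
        count = self.counters.get(bucket, 0) + 1
        self.counters[bucket] = count
        return count <= self.limit               # allow only up to 'limit' per window

limiter = FixedWindowRateLimiter(limit=3, window_seconds=60)
results = [limiter.allow("client-1", now=100) for _ in range(5)]
```

With Redis doing the counting, every instance of the service shares the same counters, which is what makes the limit consistent across instances.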
If latency is an issue, it can mostly be resolved by using a two-tier cache: an in-memory cache backed by a Redis cache. Redis pub/sub can then be used to keep the in-memory caches in sync. Stack Overflow is a good example of this architecture.
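A sketch of that two-tier idea in plain Python (the dict stands in for the shared Redis tier, and invalidate() stands in for the pub/sub message that would evict stale local copies):

```python
class TwoTierCache:
    """Two-tier cache: a fast in-process dict in front of a slower shared
    store. In the real architecture the shared store is Redis and
    invalidate() is triggered by a Redis pub/sub message."""

    def __init__(self, backing_store):
        self.local = {}               # tier 1: in-process memory
        self.backing = backing_store  # tier 2: shared cache (Redis in practice)

    def get(self, key):
        if key in self.local:              # hit in the fast local tier
            return self.local[key]
        value = self.backing.get(key)      # fall through to the shared tier
        if value is not None:
            self.local[key] = value        # populate the local tier on the way back
        return value

    def set(self, key, value):
        self.backing[key] = value
        self.local[key] = value

    def invalidate(self, key):
        # Triggered on every app instance by a pub/sub broadcast,
        # so stale local copies get dropped.
        self.local.pop(key, None)

shared = {"greeting": "hello"}   # stands in for the Redis tier
cache = TwoTierCache(shared)
```

Without the invalidation hook, each app instance would keep serving its stale local copy after another instance updates the shared tier.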
https://redis.io/topics/persistence
But it doesn't look like Google is supporting either RDB or AOF with this managed service; they say RDB is coming, though.
Doesn't that heavily limit the use cases for this product and relegate it to a cooler memcache?
This is not criticism, it's just curiosity on good use cases for this product.