> That is, you aren't treating the object store as the source of truth. If your caching layer goes down after a write is ack'ed but before it's "replicated" to S3, people lose their data, right?
This is exactly why we're building our caching layer to be highly durable, like S3 itself. We will make sure that the data in the cache is safe, even if servers go down. This is what gives us the confidence to respond to the client before the data is in S3. The big difference between the data living in our cache and the data living in S3 is cost and performance, not necessarily durability.
> I think you'll end up wanting to offer customers the ability to do strongly-consistent writes (and cache invalidation). You'll also likely end up wanting to add operator control for "oh and don't cache these, just pass through to the backing store" (e.g., some final output that isn't intended to get reused anytime soon).
I think this is exactly right. Storage systems are too often hands-off about the data ("give us the bytes and we will store them for you"). I believe there are gains to be had by asking users to tell you more about what they're doing. If you have one directory that's only used to read files and another that's only used to write files, you probably want different cache strategies for each. I believe we can deliver this with good enough UX for most people to use.
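As a sketch of what such hints could look like -- purely hypothetical, since Regatta hasn't published an interface like this, and every name below is invented -- a client could attach cache policies to directory prefixes:

```python
# Hypothetical per-directory cache hints. Regatta has not published an API
# like this; the policy names and fields are invented for illustration.
CACHE_POLICIES = {
    "/datasets/train": {"mode": "read-heavy", "prefetch": True},    # read-mostly inputs
    "/results/final": {"mode": "pass-through", "prefetch": False},  # write-once outputs, straight to S3
}

def policy_for(path: str) -> dict:
    """Return the most specific policy whose directory prefix matches `path`."""
    best = ""
    for prefix in CACHE_POLICIES:
        if path.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    return CACHE_POLICIES.get(best, {"mode": "default", "prefetch": False})
```

The longest-prefix match lets a read-heavy dataset directory and a write-only results directory coexist under one mount with different behavior.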
> Finally, don't sleep on NFSv4.1! It ticks a bunch of compliance boxes for various industries, and then they will pay you :). Supporting FUSE is great for folks who can do it, but you'd want them to start by just pointing their NFS client at you, then "upgrading" to FUSE for better performance.
I certainly don't, and this is why we are supporting NFSv3 right now. That's not going away any time soon. We want to offer something that's highly compatible with the industry at large today (NFS-based, we can talk specifics about whether or not that should be v3 or v4) and then something that is high-performance for the early adopters who can use something like FUSE. I think that both things are required to get the breadth of customers that we're looking for.
Thanks for the feedback. If price is the single blocker for teams to try the product, I'd love to discuss more. Please send me an email at hleath [at] regattastorage.com.
> If I'm on one cloud provider, the traffic between their S3-compatible solution and my infrastructure is most of the time in the same cloud provider
This is exactly right, and it's why we're working to deploy our infrastructure to every major cloud. We don't want customers paying egress costs or cross-cloud latency to use Regatta.
> I also don't get your calculator at all.
This could probably use a bit more explanation on the website. We're comparing against the usage of local devices. We find that, most often, teams only use 15% of the EBS volumes they've purchased (over a monthly time period). This means that instead of paying $0.125/GiB-mo of storage (as io2 offers), they're effectively paying $0.833/GiB-mo for the bytes they actually store ($0.125 / 15%). Whereas on Regatta, they only pay for what they use -- a combination of our caching layer ($0.20/GiB-mo) and S3 ($0.025/GiB-mo). That averages out closer to $0.10/GiB-mo stored, depending on how much of your data stays hot.
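The arithmetic can be reproduced directly. The 40% hot fraction below is my own assumption to make the blended figure concrete; it is not a number from the calculator:

```python
# Reproducing the cost comparison from the comment above.
ebs_price = 0.125        # $/GiB-month for io2
utilization = 0.15       # fraction of provisioned EBS capacity actually used
effective_ebs = ebs_price / utilization
print(f"effective EBS cost: ${effective_ebs:.3f}/GiB-mo")  # -> $0.833/GiB-mo

cache_price = 0.20       # $/GiB-month for the caching layer (from the comment)
s3_price = 0.025         # $/GiB-month for S3 (from the comment)
hot_fraction = 0.40      # assumed share of data kept hot in the cache
blended = hot_fraction * cache_price + (1 - hot_fraction) * s3_price
print(f"blended Regatta cost: ${blended:.3f}/GiB-mo")      # -> $0.095/GiB-mo
```

Any hot fraction below roughly 45% keeps the blended price near or under the $0.10/GiB-mo figure quoted above.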
What is then your initial latency if I start an AI job 'fresh'? You still need to hit the backend, right? How long do you then keep this data in your cache?
Btw. while your experience works well for Netflix, in my company (also very big), we have LoBs and while different teams utilize their storage in a different way, none of us are aligned on a level that we would benefit directly from your solution.
From a pure curiosity point of view: Do you have already enough customers which have savings? What are their use cases? The size of their setups?
> What is then your initial latency if I start an AI job 'fresh'? You still need to hit the backend, right? How long do you then keep this data in your cache?
That's correct, and it's something that we can tune if there's a specific need. For AI use cases specifically, we're working on adding functionality to "pre-load" the cache with your data. For example, you would be able to call an API that says "I'm about to start a job and I need this directory on the cache". We would then be able to fan out our infrastructure to download that data very quickly (think hundreds of GiB/s) -- much faster than any individual instance could download the data. Then your job would be able to access the data set at low-latency. Does that sound like it would make sense for you?
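A sketch of what calling such a pre-load API might look like. Since this functionality is still being built, the endpoint URL and payload fields here are invented:

```python
# Hypothetical client for the cache pre-load API described above; the
# endpoint and payload shape are assumptions, not a published interface.
import json
import urllib.request

def build_preload_payload(directory: str) -> dict:
    # Tell the cache layer which directory to fan out and fetch before the job starts.
    return {"path": directory, "recursive": True}

def preload(directory: str, endpoint: str) -> int:
    """POST the pre-load request and return the HTTP status code."""
    data = json.dumps(build_preload_payload(directory)).encode()
    req = urllib.request.Request(endpoint, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A job scheduler would call `preload(...)` at submit time, then launch the GPU instances once the cache reports the directory warm.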
> Btw. while your experience works well for Netflix, in my company (also very big), we have LoBs and while different teams utilize their storage in a different way, none of us are aligned on a level that we would benefit directly from your solution.
I'm not totally sure what you mean here. I don't anticipate that a large organization would have to 100% buy-in to Regatta in order to get benefits. In fact, this is the reason why we are so intent on having a serverless product that "scales to 0". That would allow each of your teams to independently try Regatta without needing to spend hundreds of thousands of dollars on something Day 1 for the entire company.
> From a pure curiosity point of view: Do you have already enough customers which have savings? What are their use cases? The size of their setups?
These are pretty intimate details about the business, and I don't think I can share very specific data. However, yes -- we do have customers who are realizing massive savings (50%+) over their existing setups.
I know that I've answered this question a couple times in the thread, so I don't know if my words add extra value here. But, I agree that it would be interesting to hear what Davies is thinking.
Well, I think this is the benefit that our customers are looking for. They aren’t interested in becoming storage administrators, and running Regatta as a service means they don’t have to be. There are, of course, other teams who do want to do that. It’s great that both kinds of products can exist.
I think we’re moving in that direction. I’m really interested to do more in the API space than traditional storage has allowed. Tell me a bit more what you mean by “transactional file system”?
Currently when we use filesystems we actually rely on kernel functionality to have persistence. Relying on syncing[1] can introduce interesting bugs. I can imagine a scenario where the FS has an API that is actually transactional, and we could use that to transactionally mutate the contents of files instead of relying on fsync.

1. fsync, fdatasync - synchronize a file's in-core state with storage device
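For context, this is the fsync dance that a durable, atomic file update requires on POSIX today: write a temp file, fsync it, rename it over the original, then fsync the directory. A transactional API like the one imagined above would collapse all of this into a single begin/commit:

```python
# The standard POSIX pattern for a durable, atomic file update -- the
# complexity a transactional filesystem API could hide.
import os

def atomic_write(path: str, data: bytes) -> None:
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # push file contents to the storage device
    os.rename(tmp, path)          # atomically replace the old file
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)          # persist the directory entry for the rename
    finally:
        os.close(dir_fd)
```

Miss any one of these steps and a crash can leave a torn or missing file; that fragility is exactly the class of "interesting bugs" the comment refers to.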
All data cached in Regatta is also encrypted with AES-256.
Re: bring your own compute: It’s certainly something we’re thinking about. We are in discussions with a lot of customers running GPU clusters with orphaned NVMe resources that they would like to install Regatta on. We’d love to get more details on who’s out there looking for this, so please shoot me an email at hleath [at] regattastorage.com
I heard you about the "limited hands, infinite wishlist" but nowadays when I see someone making bold claims about transactions and consistency over the network, I grab my popcorn bucket and eagerly await the Jepsen report about it
The good news is that you, personally, don't have to spend the time to create the Jepsen test harness, you can pay them to run the test but I have no idea what kind of O($) we're talking here. Still, it could be worth it to inspire confidence, and is almost an imperative if you're going to (ahem) roll your own protocol for network file access :-/
> I grab my popcorn bucket and eagerly await the Jepsen report about it
I am the same, as distributed consensus is notoriously hard, especially when it fronts distributed storage.
However, it is not impossible. Hunter and I were both on the EFS team at AWS (I am still there), and he was deeply involved in all aspects of our consensus and replication layers. So if anyone can do it, Hunter is!