More

lxpz · 2026-02-14T09:37:39 1771061859

Garage does actually support CORS (via PutBucketCORS). If there is anything specific missing, would you be wiling to open an issue on our issue tracker so that we can track it as a feature request?

egorfine · 2026-02-14T09:47:53 1771062473

Oh, indeed!

Although the doc only mentions CORS [1] on an "exposing website page" which is not exactly related; also the mention strongly suggests using reverse proxy for CORS [1] which is an overkill and perhaps not needed if supported natively?

Also googling the question only points out to the same reverse proxy page.

Now that I know about PutBucketCORS it's perfectly clear but perhaps it's not easily discoverable.

I am willing to write a cookbook article on signed browser uploads once I figure out all the details.

[1] https://imgur.com/a/r3KoNkM

lxpz · 2025-12-19T20:01:04 1766174464

They are not solving the same problem.

Syncthing will synchronize a full folder between an arbitrary number of machines, but you still have to access this folder one way or another.

Garage provides an HTTP API for your data, and handles internally the placement of this data among a set of possible replica nodes. But the data is not in the form of files on disk like the ones you upload to the API.

Syncthing is good for, e.g., synchronizing your documents or music collection between computers. Garage is good as a storage service for back-ups with e.g. Restic, for media files stored by a web application, for serving personal (static) web sites to the Internet. Of course, you can always run something like Nextcloud in front of Garage and get folder synchronization between computers somewhat like what you would get with Syncthing.

But to answer your question, yes, Garage only provides a S3-compatible API specifically.

lxpz · 2025-12-19T19:52:23 1766173943

If you have replication, you can lose one of the replica, that's the point. This is what Garage was designed for, and it works.

Erasure coding is another debate, for now we have chosen not to implement it, but I would personally be open to have it supported by Garage if someone codes it up.

hathawsh · 2025-12-19T21:00:30 1766178030

Erasure coding is an interesting topic for me. I've run some calculations on the theoretical longevity of digital storage. If you assume that today's technology is close to what we'll be using for a long time, then cross-device erasure coding wins, statistically. However, if you factor in the current exponential rate of technological development, simply making lots of copies and hoping for price reductions over the next few years turns out to be a winning strategy, as long as you don't have vendor lock-in. In other words, I think you're making great choices.

Dylan16807 · 2025-12-19T22:59:32 1766185172

I question that math. Erasure coding needs less than half as much space as replication, and imposes pretty small costs itself. Maybe we can say the difference is irrelevant if storage prices will drop 4x over the next five years? But looking at pricing trends right now... that's not likely. Hard drives and SSDs are about the same price they were 5 years ago. The 5 years before that SSDs were seeing good advancements, but hard drive prices only advanced 2x.

lxpz · 2025-12-19T19:43:40 1766173420

It does not, at least not for a small local dev server. I believe RAM usage should be around 50-100MB, increasing if you have many requests with large objects.

lxpz · 2025-12-19T19:42:00 1766173320

If you know of an embedded key-value store that supports transactions, is fast, has good Rust bindings, and does checksumming/integrity verification by default such that it almost never corrupts upon power loss (or at least, is always able to recover to a valid state), please tell me, and we will integrate it into Garage immediately.

agavra · 2025-12-19T20:13:55 1766175235

Sounds like a perfect fit for https://slatedb.io/ -- it's just that (an embedded, rust, KV store that supports transactions).

It's built specifically to run on object storage, currently relies on the `object_store` crate but we're consdering OpenDAL instead so if Garage works with those crates (I assume it does if its S3 compatible) it should just work OOTB.

evil-olive · 2025-12-20T19:42:32 1766259752

for Garage's particular use case I think SlateDB's "backed by object storage" would be an anti-feature. their usage of LMDB/SQLite is for the metadata of the object store itself - trying to host that metadata within the object store runs into a circular dependency problem.

johncolanduoni · 2025-12-20T05:43:14 1766209394

I’ve used RocksDB for this kind of thing in the past with good results. It’s very thorough from a data corruption detection/rollback perspective (this is naturally much easier to get right with LSMs than B+ trees). The Rust bindings are fine.

It’s worth noting too that B+ tree databases are not a fantastic match for ZFS - they usually require extra tuning (block sizes, other stuff like how WAL commits work) to get performance comparable to XFS/ext4. LSMs on the other hand naturally fit ZFS’s CoW internals like a glove.

fabian2k · 2025-12-19T20:50:06 1766177406

I don't really know enough about the specifics here. But my main points isn't about checksums, but more something like WAL in Postgres. For an embedded KV store this is probably not the solution, but my understanding is that there are data structures like LSM that would result in similar robustness. But I don't actually understand this topic well enough.

Checksumming detects corruption after it happened. A database like Postgres will simply notice it was not cleanly shut down and put the DB into a consistent state by replaying the write ahead log on startup. So that is kind of my default expectation for any DB that handles data that isn't ephemeral or easily regenerated.

But I also likely have the wrong mental model of what Garage does with the metadata, as I wouldn't have expected that to be ever limited by Sqlite.

lxpz · 2025-12-19T21:04:52 1766178292

So the thing is, different KV stores have different trade-offs, and for now we haven't yet found one that has the best of all worlds.

We do recommend SQLite in our quick-start guide to setup a single-node deployment for small/moderate workloads, and it works fine. The "real world deployment" guide recommends LMDB because it gives much better performance (with the current status of Garage, not to say that this couldn't be improved), and the risk of critical data loss is mitigated by the fact that such a deployment would use multi-node replication, meaning that the data can always be recovered from another replica if one node is corrupted and no snapshot is available. Maybe this should be worded better, I can see that the alarmist wording of the deployment guide is creating quite a debate so we probably need to make these facts clearer.

We are also experimenting Fjall as an alternate KV engine based on LSM, as it theoretically has good speed and crash resilience, which would make it the best option. We are just not recommending it by default yet, as we don't have much data to confirm that it works up to these expectations.

BeefySwain · 2025-12-19T19:47:39 1766173659

(genuinely asking) why not SQLite by default?

lxpz · 2025-12-19T19:54:36 1766174076

We were not able to get good enough performance compared to LMDB. We will work on this more though, there are probably many ways performance can be increased by reducing load on the KV store.

srcreigh · 2025-12-19T21:49:44 1766180984

Did you try WITHOUT ROWID? Your sqlite implementation[1] uses a BLOB primary key. In SQLite, this means each operation requires 2 b-tree traversals: The BLOB->rowid tree and the rowid->data tree.

If you use WITHOUT ROWID, you traverse only the BLOB->data tree.

Looking up lexicographically similar keys gets a huge performance boost since sqlite can scan a B-Tree node and the data is contiguous. Your current implementation is chasing pointers to random locations in a different b-tree.

I'm not sure exactly whether on disk size would get smaller or larger. It probably depends on the key size and value size compared to the 64 bit rowids. This is probably a well studied question you could find the answer to.

[1]: https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4efc8...

lxpz · 2025-12-19T22:09:59 1766182199

Very interesting, thank you. It would probably make sense for most tables but not all of them because some are holding large CRDT values.

asa400 · 2025-12-20T22:19:36 1766269176

Other than knowing this about SQLite beforehand, is there any way one could discover that this is happening through tracing?

rapnie · 2025-12-20T00:10:51 1766189451

I learned that Turso apparently have plans for a rewrite of libsql [0] in Rust, and create a more 'hackable' SQLite alternative altogether. It was apparently discussed in this Developer Voices [1] video, which I haven't yet watched.

[0] https://github.com/tursodatabase/libsql

[1] https://www.youtube.com/watch?v=1JHOY0zqNBY

tensor · 2025-12-19T20:46:58 1766177218

Keep in mind that write safety comes with performance penalties. You can turn off write protections and many databases will be super fast, but easily corrupt.

skrtskrt · 2025-12-19T20:22:35 1766175755

Could you use something like Fly's Corrosion to shard and distribute the SQLite data? It uses a CRDT reconciliation, which is familiar for Garage.

lxpz · 2025-12-19T20:44:12 1766177052

Garage already shards data by itself if you add more nodes, and it is indeed a viable path to increasing throughput.

__padding · 2025-12-21T00:29:32 1766276972

I’ve not looked at it in a while but sled/rio were interesting up and coming options https://github.com/spacejam/sled

ndyg · 2025-12-21T05:20:48 1766294448

Fjall

https://github.com/fjall-rs/fjall

__turbobrew__ · 2025-12-19T21:38:20 1766180300

RocksDB possibly. Used in high throughput systems like Ceph OSDs.

patmorgan23 · 2025-12-19T21:11:43 1766178703

Valkey?

VerifiedReports · 2025-12-20T06:06:06 1766210766

It's "key/value store", FYI

kqr · 2025-12-20T06:26:37 1766211997

It's not a store of "keys or values", no. It's a store of key-value pairs.

VerifiedReports · 2025-12-20T17:59:14 1766253554

A key-value store would be a store of one thing: key values. A hyphen combines two words to make an adjective, which describes the word that follows:

  A used-car lot

  A value-added tax

  A key-based access system

When you have two exclusive options, two sides to a situation, or separate things; you separate them with a slash:

  An on/off switch

  A win/win situation

  A master/slave arrangement

Therefore a key-value store and a key/value store are quite different.

kqr · 2025-12-20T19:02:15 1766257335

All of your slash examples represent either–or situations. A swich turns it on or off, the situation is a win in the first outcome or a win in the second outcome, etc.

It's true that key–value store shouldn't be written with a hyphen. It should be written with an en dash, which is used "to contrast values or illustrate a relationship between two things [... e.g.] Mother–daughter relationship"

https://en.wikipedia.org/wiki/Dash#En_dash

I just didn't want to bother with typography at that level of pedanticism.

VerifiedReports · 2025-12-20T20:01:08 1766260868

No, they don't. A master/slave configuration (of hard drives, for example) involves two things. I specifically included it to head off the exact objection you're raising.

"...the slash is now used to represent division and fractions, as a date separator, in between multiple alternative or related terms"

-Wikipedia

And what is a key/value store? A store of related terms.

And if you had a system that only allowed a finite collection of key values, where might you put them? A key-value store.

kqr · 2025-12-20T20:03:38 1766261018

The hard drives are either master or slave. A hard drive is not a master-and-slave.

VerifiedReports · 2025-12-21T05:43:31 1766295811

Exactly. And an entry in a key/value store is either a key or a value. Not both.

kqr · 2025-12-21T06:38:10 1766299090

No, an entry is a key-and-value pair. Are you deriously suggesting it is possible to add only keys without corresponding values, or vice versa?

abustamam · 2025-12-20T06:16:34 1766211394

Wikipedia seems to find "key-value store" an appropriate term.

https://en.wikipedia.org/wiki/Key%E2%80%93value_database

VerifiedReports · 2025-12-20T17:59:33 1766253573

See above.

abustamam · 2025-12-22T16:35:41 1766421341

Still not sure what point you're trying to make. You attempted to correct GP's usage of "key-value store" and I merely pointed out that it is the widely accepted term for what is being discussed.

Whether or not it's semantically "correct" because of usage of hyphen vs slash is irrelevant to that point.

DonHopkins · 2025-12-20T15:48:00 1766245680

Which is infinite of value is zero.

lxpz · 2025-12-19T19:39:49 1766173189

Thank you for your feedback, we will take it into account.

topspin · 2025-12-19T21:50:19 1766181019

Great, and thank you.

I really, really appreciate that Garage accommodates running as a single node without work-arounds and special configuration to yield some kind of degraded state. Despite the single minded focus on distributed operation you no doubt hear endlessly (as seen among some comments here,) there are, in fact, traditional use cases where someone will be attracted to Garage only for the API compatibility, and where they will achieve availability in production sufficient to their needs by means other than clustering.

lxpz · 2025-12-19T19:36:59 1766173019

I talked about the meaning of the Jepsen test and the results we obtained in the FOSDEM'24 talk:

https://archive.fosdem.org/2024/schedule/event/fosdem-2024-3...

Slides are available here:

https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/4efc8...

lxpz · 2025-12-19T19:35:08 1766172908

Read-after-write consistency : yes (after PutObject has finished, the object will be immediately visible in all subsequent requests, including GetObject and ListObjects)

Conditionnal writes : no, we can't do it with CRDTs, which are the core of Garage's design.

skrtskrt · 2025-12-19T20:28:27 1766176107

Does RAMP or CURE offer any possibility of conditional writes with CRDTs? I have had these papers on my list to read for months, specifically wondering if it could be applied to Garage

https://dd.thekkedam.org/assets/documents/publications/Repor... http://www.bailis.org/papers/ramp-sigmod2014.pdf

lxpz · 2025-12-19T21:34:29 1766180069

I had a very rapid look at these two papers, it looks like none of them allow the implementation of compare-and-swap, which is required for if-match / if-none-match support. They have a weaker definition of a "transaction". Which is to be expected as they only implement causal consistency at best and not consensus, whereas consensus is required for compare-and-swap.

skrtskrt · 2025-12-20T00:34:14 1766190854

ack - makes sense, thank you for looking!

lxpz · 2025-12-19T19:32:45 1766172765

Losing a node is a regular occurrence, and a scenario for which Garage has been designed.

The assumption Garage makes, which is well-documented, is that of 3 replica nodes, only 1 will be in a crash-like situation at any time. With 1 crashed node, the cluster is still fully functional. With 2 crashed nodes, the cluster is unavailable until at least one additional node is recovered, but no data is lost.

In other words, Garage makes a very precise promise to its users, which is fully respected. Database corruption upon power loss enters in the definition of a "crash state", similarly to a node just being offline due to an internet connection loss. We recommend making metadata snapshots so that recovery of a crashed node is faster and simpler, but it's not required per se: Garage can always start over from an empty database and recover data from the remaining copies in the cluster.

To talk more about concrete scenarios: if you have 3 replicas in 3 different physical locations, the assumption of at-most one crashed node is pretty reasonable, it's quite unlikely that 2 of the 3 locations will be offline at the same time. Concerning data corruption on a power loss, the probability to lose power at 3 distant sites at the exact same time with the same data in the write buffers is extremely low, so I'd say in practice it's not a problem.

Of course, this all implies a Garage cluster running with 3-way replication, which everyone should do.

JonChesterfield · 2025-12-19T20:53:22 1766177602

That is a much stronger guarantee than your documentation currently claims. One site falling over and being rebuilt without loss is great. One site losing power, corrupting the local state, then propagating that corruption to the rest of the cluster would not be fine. Different behaviours.

lxpz · 2025-12-19T21:19:38 1766179178

Fair enough, we will work on making the documentation clearer.

JonChesterfield · 2025-12-24T04:29:09 1766550549

I think this is one where the behaviour is obvious to you but not to people first running across the project. In particular, whether power loss could do any of:

- you lose whatever writes to s3 haven't finished yet, if any

- the local node will need to repair itself a bit after rebooting

- the local node is now trashed and will have to copy all data back over

- all the nodes are now trashed and it's restore from backup time

I've been kicking the tyres for a bit and I think it's the happy case in the above, but lots of software out there completely falls apart on crashes so it's not generally a safe assumption. I think the behaviour is sqlite on zfs doesn't care about pulling the power cable out, lmdb is a bit further down the list.

jiggawatts · 2025-12-19T20:13:07 1766175187

So if you put a 3-way cluster in the same building and they lose power together, then what? Is your data toast?

lxpz · 2025-12-19T20:17:27 1766175447

If I make certain assumptions and you respect them, I will give you certain guarantees. If you don't respect them, I won't guarantee anything. I won't guarantee that your data will be toast either.

Dylan16807 · 2025-12-20T00:12:53 1766189573

If you can't guarantee anything for all the nodes losing power at the same time, that's really bad.

If it's just the write buffer at risk, that's fine. But the chance of overlapping power loss across multiple sites isn't low enough to risk all the existing data.

rakoo · 2025-12-21T09:37:25 1766309845

I disagree that it's bad, it's a choice. You can't protect against everything. The team made calculations and decided that the cost to protect against this very low probability is not worth it. If all the nodes lose power you may have a bigger problem than that

Dylan16807 · 2025-12-21T17:30:23 1766338223

Power outages across big areas are common enough.

It's downright stupid if you build a system that loses all existing data when all nodes go down uncleanly, not even simultaneously but just overlapping. What if you just happen to input a shutdown command the wrong way?

I really hope they meant to just say the write buffer gets lost.

rakoo · 2025-12-23T09:00:55 1766480455

That's why you need to go to other regions, not remain in the same area. Putting all your eggs in one basket (single area) _is_ stupid. Having a single shutdown command for the whole cluster _is_ stupid. Still accepting writes when the system is in a degraded state _is_ stupid. Don't make it sound worse than it actually is just to prove your point.

Dylan16807 · 2025-12-23T10:58:28 1766487508

> Still accepting writes when the system is in a degraded state _is_ stupid.

Again, I'm not concerned for new writes, I'm concerned for all existing data from the previous months and years.

And getting in this situation only takes one out of a wide outage or a bad push that takes down the cluster. Even if that's stupid, it's a common enough stupid that you should never risk your data on the certainty you won't make that mistake.

You can't protect against everything, but you should definitely protect against unclean shutdown.

rakoo · 2025-12-23T17:15:01 1766510101

If it's a common enough occurrence to have _all_ your nodes down at the same time maybe you should reevaluate your deployment choices. The whole point of multi-nodes clustering is that _some_ of the nodes will always be up and running otherwise what you're doing is useless.

Also, garage gives you the possibility to automatically snapshot the metadata, advices on how to do the snapshotting at the filesystem level and to restore that.

Dylan16807 · 2025-12-24T03:59:21 1766548761

All nodes going down doesn't have to be common to make that much data loss a terrible design. It just has to be reasonably possible. And it is. Thinking your nodes will never go down together is hubris. Admitting the risk is being realistic, not something that makes the system useless.

How do filesystem level snapshots work if nodes might get corrupted by power loss? Booting from a snapshot looks exactly the same to a node as booting from a power loss event. Are you implying that it does always recover from power loss and you're defending a flaw it doesn't even have?

rakoo · 2025-12-24T20:11:36 1766607096

No, the snapshotting and restore is manual

InitialBP · 2025-12-19T20:59:29 1766177969

It sounds like that's a possibility, but why on earth would you take the time to setup a 3 node cluster of object storage for reliability and ignore one of the key tenants of what makes it reliable?

lxpz · 2025-10-05T17:57:51 1759687071

If I can reassure you about Garage, it's not at all abandoned. We have active work going on to make a GUI for cluster administration, and we have applied for a new round of funding for more low-level work on performance, which should keep us going for the next year or so. Expect some more activity in the near future.

I manage several Garage clusters and will keep maintaining the software to keep these clusters running. But concerning the "low level of activity in the git repo": we originally built Garage for some specific needs, and it fits these needs quite well in its current form. So I'd argue that "low activity" doesn't mean it's not reliable, in fact it's the contrary: low activity means that it works well for us and there isn't a need to change anything.

Of course implementing new features is another deal, I personally have only limited time to spend on implementing features that I don't need myself. But we would always welcome outside contributions of new features from people with specific needs.

LTL_FTC · 2025-10-05T19:23:02 1759692182

I appreciate the response! Thanks for the update. I will continue keeping an eye on the project then and possibly giving it a try. I have read the docs and was considering setting it up across two sites. The implementation seemed address this pain point with distributed storage solutions and latency.