Some things seem to have changed since 2018, but MongoDB was by far the worst database I've ever had the displeasure of using (and Amazon DocumentDB was even worse).
Posting old Jepsen analyses is like pointing at old bug reports. Every time Jepsen finds a bug, we fix it lickety-split. I know it's not cool to focus on that fact, but it is a fact. The Jepsen tests are part of the MongoDB test suite, so when we fix those problems they stay fixed.
I would love to hear about your personal experience with MongoDB, as opposed to reposting old Jepsen reports. Perhaps there is something we can address in 5.1 that is still a problem?
The latest "old Jepsen report" is barely a year old. It's not like digging up dirt from years ago.
It also seems like there was quite a lot wrong even a year ago. Quoting from that report:
> Roughly 10% of transactions exhibited anomalies during normal operation, without faults.
It's just not a very reassuring response to say "when someone digs a bit and finds a lot of show-stopping bugs, we address those specific bugs quickly".
To me it sounds like the architecture and care just isn't there for a robust data storage layer?
Something that was drilled into me decades ago is that there is no such thing as fixing multi-threaded (or distributed) code via debugging or patching it "until it works".
You either mathematically prove that it is correct, or it is wrong for certain.
This sounds like an oddly strong claim, but the guy who wrote the textbook containing that statement went on to dig up trivial-looking examples from other textbooks that were subtly wrong. His more qualified version is that if a professor writing simplified cases for a textbook can't get it right, then an overworked developer under time pressure writing something far more complex has effectively zero chance.
The MongoDB guys just don't understand this. They're convinced that if they plug just one more hole in the wire mesh, then it'll be good enough for their submarine.
PS: The professor I was referring to is Doug Lea, who wrote the "EDU.oswego.cs.dl.util.concurrent" library for Java. This was then used as the basis for the official "java.util.concurrent".
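To make the "subtly wrong" point concrete, here's a minimal sketch (an illustrative Python snippet, not taken from anything above) of the classic check-then-act race: code that reads as obviously fine and passes single-threaded testing, yet is simply wrong under concurrency.

```python
import threading
import time

balance = 100

def withdraw(amount: int) -> bool:
    """Check-then-act: looks reasonable, tests fine single-threaded, and is wrong."""
    global balance
    if balance >= amount:      # check
        time.sleep(0.001)      # deliberately widened race window (stands in for real work)
        balance -= amount      # act on a possibly stale check
        return True
    return False

# Two concurrent withdrawals of the full balance: both checks can pass
# before either subtraction happens, and the invariant balance >= 0 breaks.
threads = [threading.Thread(target=withdraw, args=(100,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # frequently -100, not 0
```

No amount of running this and patching it "until it works" proves anything; the fix comes from reasoning about the interleavings (a lock around the check-and-act, or an atomic operation), which is exactly the point above.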
If you use MongoDB as a document store, arguably its core functionality, you're not exposed to any of the shortcomings Jepsen rightly identified and exploited.
Transactions are new to MongoDB and they are not necessary for most use cases. Structure your data model so that you only perform single-document atomic updates ($inc, $push, $pull) rather than relying on multi-document ACID transactions. It's possible; we're doing it for our ERP.
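As a hedged sketch of what that looks like in practice (pymongo, with made-up connection string, database, and collection names): a single update_one call that combines update operators is atomic at the document level, so no multi-document transaction is needed.

```python
from pymongo import MongoClient

# Hypothetical connection and collection names, for illustration only.
client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]

# One update_one call is atomic on one document, even when it combines
# several update operators. No multi-document transaction required.
orders.update_one(
    {"_id": "order-123"},
    {
        "$inc": {"total_cents": 1999, "item_count": 1},   # adjust running totals
        "$push": {"items": {"sku": "SKU-42", "qty": 1}},  # append a line item
    },
)

# Removing a line item later is also a single-document atomic update.
orders.update_one(
    {"_id": "order-123"},
    {"$pull": {"items": {"sku": "SKU-42"}}},
)
```

The design consequence is that data which must change together lives in one document, rather than being spread across documents that would need a transaction to stay consistent.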
Sharding is something we've intentionally avoided, opting instead for application-layer regional clusters. Shards bring complexities that simply aren't a concern with replica sets; durability and maximum recovery time during emergency maintenance were what pushed us away from them.
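For illustration, a minimal sketch of the durability side of that trade-off (again pymongo, with hypothetical host names and collection): a plain replica set gives you writes acknowledged only once they are durable on a majority of members, no shards involved.

```python
from pymongo import MongoClient, WriteConcern

# Hypothetical replica-set connection string; host names are placeholders.
client = MongoClient(
    "mongodb://db1.example,db2.example,db3.example/?replicaSet=rs0"
)

# Majority write concern + journaling: the insert is acknowledged only after
# it is durable on a majority of replica-set members.
events = client["app"].get_collection(
    "events",
    write_concern=WriteConcern(w="majority", j=True),
)

events.insert_one({"type": "audit", "detail": "example durable write"})
```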
https://jepsen.io/analyses/mongodb-3-6-4
https://jepsen.io/analyses/mongodb-4.2.6