Hey HN,
I'm the creator of OrioleDB, a PostgreSQL extension that serves as a drop-in replacement for the default heap storage engine. It is designed to address scalability bottlenecks in PostgreSQL's buffer manager and to reduce WAL volume, enabling better utilization of modern multi-core CPUs and high-performance storage systems.
We are getting closer to GA.
This blog post describes a new OrioleDB optimization: fastpath search, an inter-page search that bypasses page copying and tuple deformation for certain fixed-length data types.
We would love for more people to test and benchmark OrioleDB and check how fastpath search helps their workloads. The fastest way to do that is to use the provided Docker image:
docker run -d --name orioledb -p 5432:5432 orioledb/orioledb
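Once it's up, something like this should get you going (assuming the container name above and the default postgres superuser; adjust credentials to your setup):
docker exec -it orioledb psql -U postgres -c "CREATE EXTENSION IF NOT EXISTS orioledb;"
docker exec -it orioledb psql -U postgres -c "CREATE TABLE demo (id int PRIMARY KEY, v text) USING orioledb;"
Tables created with USING orioledb go through the OrioleDB storage engine, while everything else stays on heap, so you can compare the two side by side.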
Additionally, OrioleDB beta12 features a new fastpath tree search, which can accelerate workloads with intensive key-value lookups by up to 20%. Stay tuned for a new blog post about this later this week.
Neon is indeed unrelated to OrioleDB, but Neon does also provide the separation of storage and compute in Postgres that the GP asked about ("Postgres needs options for open-source separation of storage and compute"). A mention of Neon (which is Apache 2 licensed) therefore isn't totally unwarranted.
I understand Neon is open source and I think it's an awesome product, but apart from the risks associated with the longevity of open source once a company gets acquired, there's another issue: although the storage engine is open source, the control plane isn't, and it is non-trivial to implement oneself. OrioleDB is positioned as a Postgres extension, which is much easier to set up (even on an existing, operational database) than migrating to the completely different architecture that Neon provides.
Thank you for your feedback!
We tried to enable think time with go-tpc, thanks to @pashkinelfe.
That leaves us with 1 tpmC per connection, growing linearly up to ~300 connections for both heap and OrioleDB. So, in order to hit a storage bottleneck, we would need tens of thousands of connections. Given that PostgreSQL runs a process per connection, that would be more of a stress test for the Linux kernel. Additionally, some of PostgreSQL's memory usage grows quadratically with the number of connections, and that becomes noticeable at this scale. All of this could be resolved by migrating PostgreSQL to coroutines and fixing the memory requirements, but that is currently out of scope for us.
Could you recommend another benchmark to reveal storage bottlenecks, given that TPC-B and YCSB are too trivial?
> Additionally, some of PostgreSQL's memory usage grows quadratically with the number of connections,
For sure, not all of PostgreSQL's memory usage is quadratic. AFAIR, just a couple of components, including deadlock detection, require a quadratic amount of memory. Normally they are insignificant, but they grow fast if you raise max_connections.
One approach to mitigating the connection problem for TPC-C would be to use a connection pooler like pgbouncer or yandex/odyssey. It certainly adds complexity, though.
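A rough sketch of what that could look like (illustrative settings, not tuned; transaction pooling with thousands of clients funneled into a few hundred server connections):
cat > pgbouncer.ini <<'EOF'
[databases]
tpcc = host=127.0.0.1 port=5432 dbname=tpcc
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = userlist.txt
pool_mode = transaction
max_client_conn = 20000
default_pool_size = 300
EOF
pgbouncer -d pgbouncer.ini   # benchmark clients then connect to port 6432 instead of 5432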
Another suite to look at is sysbench. It's very flexible, for better or worse, and it lets you create an interesting mix of queries at different scale factors. For something like this, where you're going head to head with Postgres, having more dimensions and more benchmarks isn't going to hurt. Ideally you'll see a nice win across the board and get an understanding of the shape of the differences.
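Something along these lines, for example (flag names from the stock oltp_* Lua scripts shipped with sysbench 1.0; adjust host, credentials, and sizes for your setup):
sysbench oltp_read_write --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-port=5432 \
  --pgsql-user=postgres --pgsql-db=bench --tables=16 --table-size=1000000 prepare
sysbench oltp_read_write --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-port=5432 \
  --pgsql-user=postgres --pgsql-db=bench --tables=16 --table-size=1000000 \
  --threads=64 --time=300 run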
The important design issue in building active-active multi-master on top of the Raft protocol is being able to apply changes locally without immediately putting them into a log, without sacrificing durability. MySQL implements a binlog, separate from the storage engine's log, to ensure durability. OrioleDB implements copy-on-write checkpoints and row-level WAL, which gives us a chance to implement multi-master and durability using a single log.
Some notable benchmarks from the OrioleDB beta7 release:
* 5.5x Faster at 500 Warehouses: In TPC-C benchmarks with 500 warehouses, OrioleDB outperformed PostgreSQL's default heap tables by 5.5 times. This highlights significant gains in workloads that stress shared memory cache bottlenecks.
* 2.7x Faster at 1000 Warehouses: Even when the data doesn't fit into the OS memory cache (at 1000 warehouses), OrioleDB was 2.7 times faster. Its index-organized tables improve data locality, reducing disk I/O and boosting performance.
Run your own workloads or existing benchmarks like go-tpc or HammerDB to see the performance differences firsthand. We would love to hear about others' experiences with OrioleDB, especially in production-like environments or with different workloads.
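For example, a go-tpc run against a local server might look roughly like this (flags as I remember them from the go-tpc README; double-check with go-tpc tpcc --help for your version and adjust the connection settings):
go-tpc tpcc -d postgres -H 127.0.0.1 -P 5432 -U postgres -D tpcc --warehouses 500 prepare
go-tpc tpcc -d postgres -H 127.0.0.1 -P 5432 -U postgres -D tpcc --warehouses 500 --threads 64 --time 30m run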
This was driven by Andres Freund for PostgreSQL 12. Please check this talk:
https://anarazel.de/talks/2018-10-25-pgconfeu-pluggable-stor...
However, the current table AM API in many respects assumes heap-like storage. This is why OrioleDB comes with a patchset we intend to upstream to PostgreSQL.
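From the SQL side the hook already exists; for example, on stock PostgreSQL 12+ you can re-register the built-in heap handler under another name and create a table with it:
psql -c "CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;"
psql -c "CREATE TABLE am_demo (id int) USING heap2;"
The heap-like assumptions the patchset addresses live below that surface, in the C-level TableAmRoutine callbacks.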