The high-level idea is: you enable a setting and every CREATE SCHEMA creates a new shard. All the tables in the schema will be co-located, so you can have efficient joins & foreign keys between the tables.
On top of that, you can also have reference tables that are replicated to all nodes, again for fast joins & foreign keys with all schemas.
Everything else is about making every PostgreSQL feature work as seamlessly as if there was no sharding. You can still do things like transactions across schemas, create and use custom types, access controls, work with other extensions, use procedures, etc.
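Roughly, assuming the citus.enable_schema_based_sharding setting from recent Citus releases (schema and table names here are made up), the flow looks like:

    -- turn on schema-based sharding (setting name per recent Citus docs)
    SET citus.enable_schema_based_sharding TO on;

    -- each new schema becomes its own shard; its tables are co-located
    CREATE SCHEMA tenant_a;
    CREATE TABLE tenant_a.users (id bigserial PRIMARY KEY, name text);
    CREATE TABLE tenant_a.orders (
        id bigserial PRIMARY KEY,
        user_id bigint REFERENCES tenant_a.users (id)  -- FK works: same shard
    );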
Historically, there has been, but we have found physical replication is ultimately a lot more performant, robust, and tweakable. Citus users & platforms generally use physical replication of each node for high availability.
For instance, the ability to quickly spin up a new replica using a disk snapshot is very useful and only feasible at the server level.
It is still possible to replicate shards (via the citus.shard_replication_factor setting), but it only helps for scaling read throughput, at the cost of lower write throughput.
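For illustration, the setting applies to tables distributed afterwards (table and column names made up):

    -- keep two copies of every shard: reads scale out, writes go to both
    SET citus.shard_replication_factor = 2;
    SELECT create_distributed_table('events', 'device_id');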
Depends on the definition of widespread use, but e.g. CockroachDB and YugabyteDB offer scaling for a subset of PostgreSQL features (while extending the cluster capabilities).
Nevertheless, Citus is now a promising open source alternative.
Hm, question for people a bit more familiar with Postgres -- what is meant by "schema" here?
My definition is "the columns and column types of a table", but that doesn't seem to make sense with what they're talking about here ("large" and "small" schemas probably aren't referring to wide and narrow tables, for example, and I don't see how sharding by my definition of "schema" could even make sense anyways)
Schemas are namespaces (actually called that internally in Postgres).
The SQL standard defines a two-level namespace hierarchy. A single "instance" contains multiple catalogs, and each catalog contains multiple schemas (and each schema then contains objects like tables, views, types, functions, etc).
Many database products, e.g. Postgres and SQL Server, use the term "database" instead of "catalog". But "schema" is used quite uniformly. MySQL's "databases" are in fact "schemas", though.
I always vaguely wonder why no one allowed a full hierarchical schema path. I expect it is probably because "the standard sez one level of schema only"
Schemas are groupings of tables and other entities that can be defined within a database. You can think of them like namespaces in programming languages. You can have the same table definition, within the same database, defined multiple times (each in a different schema), with each holding different data.
By large and small we are referring to the amount of data each schema holds currently. They can grow over time and some of them may become very big while others will remain small (storage wise).
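A quick plain-PostgreSQL illustration, nothing Citus-specific:

    CREATE SCHEMA tenant_a;
    CREATE SCHEMA tenant_b;

    -- same table name, independent definitions and independent data
    CREATE TABLE tenant_a.messages (id bigint, body text);
    CREATE TABLE tenant_b.messages (id bigint, body text, read_at timestamptz);

    SELECT count(*) FROM tenant_a.messages;  -- sees only tenant_a's rows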
> You can have the same table definition, within the same database, defined multiple times (each in a different schema) and each holding different data.
So in this respect, each table within a schema indeed already acts like a "shard" of the overall table
Is this enforced? Like, if I create a table "messages" in schema A and a table "messages" in schema B, must they have the same columns/column types, or is that just convention?
Ah I see! Yeah I guess it's just a concept I've never run into before; unfortunate that it shares a name with the "other" concept of schema used by e.g. https://json-schema.org
Having used schema-based sharding in Postgres before, I hope Citus has a way around connection pooling with pgbouncer, as if you use pgbouncer transaction pooling your search_path could disappear at any time.
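For what it's worth, a workaround sketch that isn't Citus-specific: pgbouncer's transaction pooling pins a server connection for the duration of a transaction, so SET LOCAL (or fully qualified names) sidesteps the disappearing search_path:

    BEGIN;
    -- SET LOCAL only lasts until COMMIT, which is exactly the unit
    -- that pgbouncer's transaction pooling pins to one server connection
    SET LOCAL search_path TO tenant_42;
    SELECT * FROM messages;
    COMMIT;

    -- or skip search_path entirely with qualified names
    SELECT * FROM tenant_42.messages;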
What happens if one node loses its shards due to an external event (e.g. disk corruption, or being physically destroyed as with OVH Cloud)? I understand we still have to actively back up, but I'd rather serve no data than serve wrong data during a severe outage. I'd also like to see forward error correction (FEC) implemented in Citus, so we can handle this on the fly rather than relying on RAID, e.g. RAID10, RAIDZ1, RAIDZ2.
The answer is mostly the same as for PostgreSQL, since Citus stores data in regular PostgreSQL tables.
It's a good idea to set up archival into blob storage / S3 using a tool like WAL-G for disaster recovery purposes, and streaming replication using a tool like Patroni.
(Or use a managed service like Azure Cosmos DB for PostgreSQL)
If a node is down and cannot be quickly recovered then the remaining shards are still available for reads and writes, except for DDLs.
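As a minimal sketch of the WAL-G part (bucket and paths are placeholders; check the WAL-G docs for the full setup):

    # postgresql.conf: continuously ship WAL to object storage
    archive_mode = on
    archive_command = 'wal-g wal-push %p'
    restore_command = 'wal-g wal-fetch %f %p'   # used during recovery

    # WAL-G reads its target from the environment, e.g.:
    #   WALG_S3_PREFIX=s3://my-backup-bucket/pg   (placeholder bucket)

    # take a base backup of the data directory
    wal-g backup-push /var/lib/postgresql/data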
One gotcha: schemas are a weird old thing in SQL that are kinda the worst of all worlds, basically more like prefixing your SQL object names than a real level of indirection. Schema objects can't be manipulated as a batch, they don't provide the isolation / access control of databases, and schema names can't be parameterized in DML, so dynamic access requires stored procedures.
True, though I'd phrase it more as a middle ground. Which model is best depends on your needs.
A big advantage of schemas over databases is that you can access different schemas over the same connection, which allows much higher density, since connections are a scarce resource in PostgreSQL. You can also have shared tables.
A big advantage over row-based sharding is that you can have variations of the table definitions or a different set of tables altogether in each schema. Plus for the purpose of sharding it's easier to use, since normalized data models complicate row-based sharding.
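For example (hypothetical schema names), one connection can serve several tenants plus shared data:

    -- one connection, several namespaces
    SELECT count(*) FROM tenant_a.orders;
    SELECT count(*) FROM tenant_b.orders;

    -- a shared table joined against one tenant's data
    SELECT o.id, p.name
    FROM tenant_a.orders o
    JOIN shared.plans p ON p.id = o.plan_id;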
That's exactly the reason I went for YugabyteDB's tablespaces implementation on one of my past projects. A dedicated database locked down to selected nodes for full isolation.
It is worth noting that Citus was acquired by Microsoft a few years ago (not a secret, but may not be obvious to everyone), and they have since shifted heavily towards the Azure side of the world.
Back when it happened we tried to buy their product, and they were not sure if the standalone Citus product was even going to keep existing; they even refused to demo it. Possibly odd timing, but it's a data point.
On the positive side of things, the shard rebalancing was not open source back then iirc, which made the open source version pretty useless. Now it seems to be open source: https://www.citusdata.com/product/comparison -- pretty cool.
I'd still be careful to bank on it as a Citus only customer or open source user.
There is no longer an enterprise version; what runs on Azure is the exact same Citus that you can run yourself. We even invested in Patroni to make it easier for the community to self-host Citus with HA setups.
While we obviously want people using Citus on Azure, having Citus as a viable open-source choice is our path to achieving that. I wasn't part of the company when the acquisition happened so can't speak to that, but I can imagine how that could have made sales at the transition time unclear.
Personally I would also like to add that the team is full of long-term open-source contributors. We contribute both to PostgreSQL and to projects around it (like pgbouncer). I understand and respect your reservation, but wanted to share my perspective on it.
I'm having trouble getting Citus to work with RDS, even though I'd really like to use it with AWS. Whenever I try to research how to make it happen, I get stuck on a lot of challenges and end up concluding that it might not be possible. While I could use Azure instead, I'm hesitant because I have a lot of resources and infrastructure in AWS, and it wouldn't make sense to move just the database layer. The idea of having to do DBA and ops work on my own EC2 instances is overwhelming. I would appreciate any guidance in their documentation on how to use Citus with RDS. It's concerning that Citus might be withholding information about the possibility of using RDS, given that Microsoft is its partner.
Microsoft isn't its "partner", they were entirely bought by Microsoft. The goal is to make the technology an advantage for Azure, as I understand it. It's very unlikely you'll see them offer first-party support for RDS, or CloudSQL.
Yeah it's a bit of a shame they were bought by one of the cloud providers.
Would've been great if something like Crunchy Bridge, pganalyze, and Citus were merged into one company, so that you'd have managed Postgres across the 3 major providers with the best pg tech out of the box.
Having your DB and apps in different providers is a stuff-up: the latency and data transfer costs killed us, so Citus was a non-starter (we don't want to manage it ourselves on EC2).
What's the advantage over having a tenant ID as the distribution column? Seems like you make the schema name the distribution column. Maybe it helps with gross setups where the same-named function definition varies between schemas (been there, done that, don't want to do it again).
Seems like the article only offers ease of use. Guess I've never used microservices enough to consider that use case.
Couldn't the microservice case be handled by having distributed tables with no distribution column? I.e. today I'd create a distribution column and fill it with the same value on every row.
Can one have a reference schema which can be efficiently used alongside every other schema? Guess that's public schema with create_reference_table/create_distributed_function
> What's advantage over having tenant id as distribution column?
Ease of use is definitely the main one. If you're willing to put in the work required to use tenant ID as a distribution column (add it to tables, primary keys, foreign keys, filters, joins), then it's a more scalable approach.
A challenge with sharding by tenant ID is that many applications use a normalized data model, meaning not all tables obviously have a tenant ID column. When you use a schema per tenant on vanilla PostgreSQL, no additional steps are typically required to enable schema-based sharding.
There are some other benefits of schema-based sharding such as custom table definitions, simpler & more versatile access control, and longer term we expect it will be easier to pin a large tenant to a node using schemas, or distribute the tables of ultra-large tenants (by some other dimension). Of course, row-based sharding has other benefits like parallel cross-tenant queries and global DDL.
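For contrast, the row-based approach looks roughly like this (create_distributed_table is the usual Citus call; the table is made up):

    -- every table carries the tenant ID and is distributed by it
    CREATE TABLE orders (
        tenant_id bigint NOT NULL,
        id bigint NOT NULL,
        total numeric,
        PRIMARY KEY (tenant_id, id)   -- distribution column must be in the key
    );
    SELECT create_distributed_table('orders', 'tenant_id');

    -- and queries need the tenant filter to stay on a single shard
    SELECT * FROM orders WHERE tenant_id = 42;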
> Couldn't the microservice case be handled by having distributed tables with no distribution column?
Absolutely. We first implemented the notion of single shard distributed tables with a NULL distribution column, and then built schema-based sharding on top as a convenience layer.
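If I read the current docs right, that looks something like this (table name made up):

    -- a single-shard distributed table: no distribution column at all
    CREATE TABLE settings (key text PRIMARY KEY, value jsonb);
    SELECT create_distributed_table('settings', null);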
> Can one have a reference schema which can be efficiently used alongside every other schema? Guess that's public schema with create_reference_table/create_distributed_function
Yes, the public schema (or another non-distributed schema) can have reference tables & distributed tables as usual, and tables in distributed schemas can have foreign keys / local joins with reference tables.
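A sketch of that setup (names illustrative):

    -- reference table in public, replicated to all nodes
    CREATE TABLE public.countries (code char(2) PRIMARY KEY, name text);
    SELECT create_reference_table('public.countries');

    -- a table in a distributed schema can have a foreign key to it
    CREATE TABLE tenant_a.customers (
        id bigint PRIMARY KEY,
        country_code char(2) REFERENCES public.countries (code)
    );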
One reason why some prefer multi-tenancy via schemas instead of a tenant ID column: reading rows with the wrong tenant ID happens easily; just forget the WHERE clause in a query. Across schemas, this is much harder to get wrong, as a schema has to be explicitly named in the query.
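To make the failure mode concrete (hypothetical tables):

    -- tenant ID column: forget the filter and you silently read everyone
    SELECT * FROM messages;                      -- oops: all tenants
    SELECT * FROM messages WHERE tenant_id = 7;  -- intended

    -- schema per tenant: the tenant is part of the name, hard to omit
    SELECT * FROM tenant_7.messages;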
Doesn't sharding imply horizontal partitioning? Some of their examples show vertical partitioning use cases. Not sure how I feel about that. I can't imagine why you would do microservices and host their databases as schemas on the same instance.
Because with schema-based sharding in Citus, the schemas are the sharding unit and the system can move them to new nodes being added to the cluster.
You can start with all your microservices sharing a single node, add nodes and have the storage layer distributed horizontally in a way transparent to the services themselves.
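Schematically, with the usual cluster-management functions (host name is a placeholder):

    -- register a new worker, then move shard groups (schemas) onto it
    SELECT citus_add_node('worker-2.internal', 5432);
    SELECT citus_rebalance_start();          -- rebalances in the background
    SELECT * FROM citus_rebalance_status();  -- check progress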