This is such a killer feature in PG, my new job uses it and it makes audits of our tenancy model dead simple.
Coming from a SaaS company that used MySQL, we would get asked by some customers how we guaranteed we segmented their data, and the answer always ended at the app layer. One customer (a Fortune 10 company) asked if we could switch to SQL Server to get this feature...
Our largest customers ask how we do database multi-tenancy and we point to our SDLC + PG docs and they go 'K'.
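For anyone who hasn't set it up, the moving parts are small. A minimal sketch with psycopg2 (the table, column, and setting names are just illustrative, not anyone's actual schema):

    import psycopg2

    # One-time DDL: turn on RLS and add a policy keyed off a session setting.
    RLS_DDL = """
    ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
    ALTER TABLE invoices FORCE ROW LEVEL SECURITY;  -- apply to the table owner too
    CREATE POLICY tenant_isolation ON invoices
        USING (tenant_id = current_setting('app.current_tenant')::uuid);
    """

    def fetch_invoices(conn, tenant_id):
        with conn, conn.cursor() as cur:
            # Scope this transaction to one tenant; the policy does the filtering,
            # so a forgotten WHERE clause in app code can't leak other tenants' rows.
            cur.execute("SELECT set_config('app.current_tenant', %s, true)", (str(tenant_id),))
            cur.execute("SELECT id, total FROM invoices")
            return cur.fetchall()

The nice part for audits is that the isolation rule is a single CREATE POLICY statement you can point at, instead of a WHERE clause you have to verify in every query.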
Every B2B client who asked us how we handle multi-tenancy also asked how we ensure their data is erased at the end of the contract. Using a shared database with RLS means you have to go through all DB backups, delete individual rows for that tenant, then re-generate the backup. That’s a non-starter, so we opted for having one DB per tenant which also makes sharding, scaling, balancing, and handling data-residency challenges easier.
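To illustrate the erasure path that buys you: offboarding becomes roughly "drop the tenant's database, delete that tenant's backups". A sketch, assuming RDS and a per-tenant naming scheme (both of which are assumptions here, not a prescription):

    import boto3
    import psycopg2
    from psycopg2 import sql

    def offboard_tenant(tenant: str, admin_dsn: str):
        # Drop the tenant's database (DROP DATABASE cannot run inside a transaction).
        conn = psycopg2.connect(admin_dsn)
        conn.autocommit = True
        with conn.cursor() as cur:
            cur.execute(sql.SQL("DROP DATABASE {}").format(sql.Identifier(f"tenant_{tenant}")))
        conn.close()

        # If each tenant also has its own instance, its backups can be deleted wholesale
        # instead of scrubbing individual rows out of shared backups. (Automated snapshots
        # go away when the instance itself is deleted; manual ones are deleted here.)
        rds = boto3.client("rds")
        snaps = rds.describe_db_snapshots(
            DBInstanceIdentifier=f"tenant-{tenant}", SnapshotType="manual"
        )["DBSnapshots"]
        for snap in snaps:
            rds.delete_db_snapshot(DBSnapshotIdentifier=snap["DBSnapshotIdentifier"])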
I filled out a ton of enterprise questionnaires on this stuff before and we just told people that it would be deleted when the backups expired after X days because we didn't have the capability to delete specific rows from our backups. Nobody ever argued.
There's not a single customer I've ever run across who's going to halt a contract because you can't purge their data from your backups fast enough. They're signing up because of what you offer, not the termination clause.
A recent example from the other side: a client contacts me and says they will have to exit from our existing contract unless we can update our (AWS) infrastructure to use their (AWS) encryption keys for servers and databases handling their tenancy. In Enterprise, some tenants are very opinionated about what cloud you use and how their data lives/flows within it. I run all our infosec, including SOC2 & ISO27001 programs, and I know that using their encryption keys is nothing but security theater. But with $500k p.a. on the line, I also know when it's showtime.
Fwiw, we also priced in dedicated database instances for people who wanted it where we had that capability. For the extra costs, nobody ever took us up on it.
If you give me access to use your KMS key to encrypt/decrypt an EBS volume, and grant me access to that EBS volume/mount it to an AWS EC2 instance I manage, I can read/write data on that EBS volume all day long.
The fact that you own the KMS key doesn't stop me from reading/writing that EBS volume. It doesn't offer any additional security guarantees, especially if I was already encrypting the EBS volumes with a KMS key.
If I am a SaaS offering, whether I use a KMS key I own or one you own doesn't change the fact that I still have access to all of the unencrypted data that is silently being encrypted by those KMS keys; I've got access to the layer above it.
Sure, if the contract ended they could revoke the KMS key and now the data on the EBS volumes is no longer readable by me, but any backups I have of that data are still within my purview.
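For anyone who hasn't seen the mechanics: to let the provider encrypt EBS volumes with the customer's key at all, the customer's key policy has to grant the provider's account something like the statement below (account ID and action list are illustrative). That grant is exactly what lets the provider's instances read the plaintext, which is the point being made here:

    # Illustrative statement from the *customer's* KMS key policy; account ID is made up.
    provider_account = "111122223333"

    allow_provider_ebs_use = {
        "Sid": "AllowProviderAccountUseOfTheKeyForEBS",
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{provider_account}:root"},
        "Action": [
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:GenerateDataKeyWithoutPlaintext",
            "kms:ReEncrypt*",
            "kms:CreateGrant",  # EC2/EBS in the provider account creates grants to attach volumes
        ],
        "Resource": "*",
    }
    # Revoking this (or the grants under it) stops *future* decrypts -- new attachments,
    # snapshot restores -- but not a provider who already has the data mounted or copied.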
Human systems (contracts/legal) step in when technological systems can't. Until there's homomorphic encryption, enforcing that you're using their key can give them peace of mind that they can revoke it. While it's true that you could be doing anything with that data, a contract in good standing and normal human ethics probably add a high degree of likelihood that you're not. However, if the relationship sours, they want the freedom to revoke quickly without needing your goodwill. If your backups aren't using their keys, I think you'd be violating the contract.
But yes, from a purely technological perspective it's security theater. They could also be misunderstanding what's happening, and it's not worth it for you to try to explain at the risk of losing the contract.
I don't know how this is for others, but in the environments I am in we use a different KMS key for our backups so that if something were to happen, like a mis-click in a web interface or an accidental terraform destroy, we can recover the data.
It is also stored in a different location than the original (different AWS account).
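A rough sketch of that pattern (snapshot ID, region, and the backup account's key ARN are all hypothetical): copy the snapshot re-encrypted under the backup key, then move the copy into the separate account.

    import boto3

    # Run in the destination region; re-encrypts the copy under the backup account's key.
    ec2 = boto3.client("ec2", region_name="eu-west-1")

    copy = ec2.copy_snapshot(
        SourceRegion="eu-west-1",
        SourceSnapshotId="snap-0123456789abcdef0",
        Encrypted=True,
        KmsKeyId="arn:aws:kms:eu-west-1:999988887777:key/11111111-2222-3333-4444-555555555555",
        Description="nightly copy under the backup KMS key",
    )
    # The copy then gets shared with / copied into the separate backup account, so a
    # mis-click or terraform destroy in the primary account can't take the backups with it.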
The only real extra protection I can think of is that they can revoke access at any time (e.g. right after terminating the contract). If someone malicious gets into the provider's AWS account, it's highly likely they can get access to the running machines and extract the data. I guess if the attacker is stupid and just tries to restore the encrypted backup, it could alert the customer instead of just the provider, assuming that's a non-standard operation.
If there were a very deep integration, essentially column-level, it could make a lot more sense, but that could come with quite a big performance & cost hit, as well as limit what you can do (essentially you wouldn't be able to do any matching at the content level).
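For what it's worth, that kind of deep, column-level integration would look roughly like envelope-encrypting each value under the customer's key before it ever reaches the database (the key ARN and helper below are invented for illustration), which is also why content-level matching stops working:

    import os
    import boto3
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    kms = boto3.client("kms")

    def encrypt_column_value(plaintext: bytes, customer_key_arn: str) -> dict:
        # Ask KMS (the customer's key) for a fresh data key, then envelope-encrypt locally.
        dk = kms.generate_data_key(KeyId=customer_key_arn, KeySpec="AES_256")
        nonce = os.urandom(12)
        ciphertext = AESGCM(dk["Plaintext"]).encrypt(nonce, plaintext, None)
        # Only ciphertext and the wrapped data key are stored, so the database can't
        # do equality/LIKE matching or indexing on the column's contents anymore.
        return {
            "nonce": nonce,
            "ciphertext": ciphertext,
            "wrapped_key": dk["CiphertextBlob"],
        }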
Other comments from X-Istence, vlovich123, and buzer are correct. If our apps/admins can get to client data, using their keys does not help to mitigate any plausible attack vector. If we become hostile or compromised, their data is toast anyway. Adding this technical control increases friction and makes the overall system more rigid/brittle but does nothing to increase security. Hence "security theater".
Other examples of such meaningless-and-impotent controls in infosec include "must run a firewall on your Linux server on a private subnet", and any policy requiring password complexity/rotation. But as a business, it's more productive for us to just tick the checkboxes than to spend resources on educating the market about the actual best practice (unless you're in the business of education).
Interesting that they are so privy to how you administer stuff. Would they have terminated on the spot if you had been using azure or google or digital ocean? Or was using AWS in the initial contract?
We had cases when a prospect said we must use AWS (because that's what they use, and are comfortable with how our data bridges would work/integrate), and when a prospect said we must not use AWS (because they are in retail and want to avoid feeding the 1,000-pound gorilla that's mauling them).
I don't want to imply this is common - it's not. Such cases come from the "arbitrary enterprise rules" side rather than the "SaaS/tech" side.
We usually write a "reasonable best effort" clause into our deletion terms: data will 100% be deleted from production within 30 days and automatically fall out of backups 60 days after that. This also helps since we can't control our downstream vendors such as Twilio, AWS SES, etc., who all have their own legal obligations and time frames.
Even large health systems have been okay with it.
I think the TTL features provided by some DB vendors are actually orthogonal to multi-tenancy: the former deal with data lifecycle policy, while the original problem of deleting a certain user's data is more about data privacy policy, though there may be overlap.
> we opted for having one DB per tenant which also makes sharding, scaling, balancing, and handling data-residency challenges easier
When you say one DB, I suspect you mean you have a single DB server and multiple DBs on that server. Then I don't think this really solves the data-residency problem, as the client's data is just in a different DB but still on the same instance. It also creates other problems for you: you now have two DBs to run maintenance, upgrades, and data migrations on. My current company uses a similar model for multiple types of systems and it makes upgrading the software very difficult.
It also makes scaling more difficult: instead of having a single DB cluster that you can tweak for everyone, you'll need to tweak each cluster individually depending on the tenants that are on those clusters. You also have a practical limit to how many DBs you can have on any physical instance, so your load balancing will become very tricky.
There are other problems it causes, like federation, which enterprise customers often want.
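To make the maintenance/migration point concrete: with one DB per tenant, every schema change is a fan-out over the fleet rather than a single statement. A rough sketch (the tenant catalog and DSNs here are hypothetical):

    import psycopg2

    # In practice this catalog lives in a small control-plane DB, not in code.
    TENANT_DSNS = {
        "acme": "dbname=tenant_acme host=db-1",
        "globex": "dbname=tenant_globex host=db-2",
    }

    MIGRATION = "ALTER TABLE invoices ADD COLUMN IF NOT EXISTS po_number text"

    def migrate_all() -> dict:
        failures = {}
        for tenant, dsn in TENANT_DSNS.items():
            try:
                conn = psycopg2.connect(dsn)
                try:
                    with conn, conn.cursor() as cur:  # commits on success, rolls back on error
                        cur.execute(MIGRATION)
                finally:
                    conn.close()
            except Exception as exc:  # one broken tenant shouldn't stop the rest
                failures[tenant] = exc
        return failures

Multiply that by version upgrades, extension changes, and per-tenant tuning and you get the overhead described above.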
Just make the application collect the statistics you need. Put them into a separate shared DB or use one of the many existing off the shelf SaaS collection frameworks.
> we opted for having one DB per tenant which also makes sharding, scaling, balancing, and handling data-residency challenges easier
I'm surprised this simplified balancing for you. When I worked somewhere with per-customer DBs, we had constant problems with rebalancing load. Some customers grew too big for where we put them, some nodes usually performed fine until the wrong subset of customers ran batch jobs simultaneously, etc.
How do you manage your backend in this case? Do you have an instance of the backend for each customer, or do you allow the backend to make connections to all the DBs?
I'm interested in doing similar and wondering about the best way to handle the routing between the databases from a single backend.
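For concreteness, the kind of shape I have in mind is a tenant-to-DSN lookup plus per-tenant connection pools, roughly like this minimal sketch (all names hypothetical):

    import psycopg2
    from psycopg2 import pool

    # In reality this mapping would live in a control-plane DB or secrets manager.
    TENANT_DSNS = {
        "acme": "dbname=tenant_acme host=db-1",
        "globex": "dbname=tenant_globex host=db-2",
    }

    _pools: dict[str, pool.SimpleConnectionPool] = {}

    def get_conn(tenant: str):
        """Resolve the request's tenant (from subdomain, JWT, etc.) to a pooled connection."""
        if tenant not in TENANT_DSNS:
            raise KeyError(f"unknown tenant: {tenant}")
        if tenant not in _pools:
            _pools[tenant] = pool.SimpleConnectionPool(minconn=1, maxconn=5, dsn=TENANT_DSNS[tenant])
        return _pools[tenant].getconn()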
It really depends on your requirements, both functional and cost. Having a full stack per customer can be great for a lot of reasons. It's probably the safest because you never have to worry about something getting messed up in the code and creating cross-customer access. Once you're sure the environment is set up correctly you can sleep well at night. You also don't have to worry about one customer doing something to impact the performance of other environments (unless you're using shared infra, like putting all your DBs on a single cluster). And it can make maintenance easier; for example, data migrations can start with small and/or less important customers for practice. It also gives you more flexibility if you need to make special snowflakes (e.g. some big customer wants weird IP whitelisting rules that can't work with your normal env setup).
Downsides are that it's probably more expensive and more work. Even if your infra spin up is totally automated, you still need to keep track of all the environments, you still need to keep your Infrastructure-as-Code (e.g. your terraform scripts) up to date, more can go wrong when you make changes, there's more chance for environments to drift.
So, in short, separate stacks usually means more safety & simpler application architecture in exchange for more cost and more effort to manage the fleet.
Also second this: we even split our AWS org into an AWS account per tenant. Although this will maybe be a problem if we get into the hundreds of clients, it makes onboarding and offboarding simple.
It depends on the annual contract value (ACV), doesn't it? You can't give an AWS account to every $99/month plan, but you can for enterprise $50-100k+ deals.
I know there are some resources where you can only have one per region (I think you can only have one AWS::EC2::VPCEndpoint per... type and service per region) but I don't know if letting multiple tenants use the same VPC endpoint is a risk or not.
Oh yes that makes complete sense - I'm living in a world where our internal AWS management team deploys our VPCs for us (one per account unless you have very special needs).
I think the way to handle this (based on how many companies handle GDPR compliance) is to not keep backups older than X months (usually 3 months) and have a clause that all data past that time is deleted.
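If you're on AWS and using AWS Backup (an assumption; the plan and vault names below are made up), one way to enforce that is to bake the retention into the backup plan itself so old recovery points age out automatically:

    import boto3

    backup = boto3.client("backup")

    backup.create_backup_plan(
        BackupPlan={
            "BackupPlanName": "retain-90-days",
            "Rules": [
                {
                    "RuleName": "daily",
                    "TargetBackupVaultName": "Default",
                    "ScheduleExpression": "cron(0 5 * * ? *)",  # daily at 05:00 UTC
                    "Lifecycle": {"DeleteAfterDays": 90},       # backups age out automatically
                }
            ],
        }
    )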
Ignoring the cost, there's the risk/reward alignment you see in large enterprises.
Imagine you're a new CIO. You know you're probably looking at a 3-5 year tenure at this new company and you want to lead with some big wins to set the tone and show your value.
You're reviewing proposals from your senior leadership. One of the options is an Oracle migration. It could cost a million dollars to migrate, but you'd save a million dollars a year going forward. Oracle runs your mission-critical internal systems, so any issues with the migration or with the system you migrate to are going to cause significant financial and reputational damage. You'll have to defend this decision if anything goes wrong, i.e. you've absorbed a lot of risk with not much personal upside.
What do you do? You put the proposal to the side and look for something that has a lot better upside.
Lift-and-shift migrations don't really make sense. It would make more sense to do an architectural rewrite, from an on-prem Oracle monolith to a cloud-native serverless stack, for example. Digital transformation, yeah.
I worked in a bank previously and we migrated all our databases from Oracle to MS SQL Server. I think it took us something like 7-8 years, so I can understand people who are hesitant to convert. The advantages at the time (this was 10 years ago) were a lower price for the DB servers, but also more people being familiar with SQL Server compared to Oracle.
I work with a few former-Oracle DBAs in a PostgreSQL-flavored consultancy now and they are aces. All the root-cause analysis and organization skills transfer handily.
Postgres is functionally and conceptually extremely similar to Oracle. There are a few oddities (in particular, oracle's "nulls are never in indexes" is kinda weird) but the redo log is similar to the WAL, etc. In most cases, similar approaches will perform similarly and experience pretty much transfers over with a few months of experience.
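A tiny illustration of that NULL-indexing quirk, for anyone mapping concepts across (schema invented; the comments state the behavioural difference):

    # Postgres stores NULLs in btree indexes; Oracle's btree simply omits rows whose
    # indexed columns are all NULL, so the same predicate can't use the index there.
    STATEMENTS = [
        "CREATE TABLE orders (id bigint PRIMARY KEY, closed_at timestamptz)",
        "CREATE INDEX orders_closed_at_idx ON orders (closed_at)",
        # On Postgres this can be an index scan; Oracle needs a workaround
        # (e.g. a function-based index or an extra constant column).
        "EXPLAIN SELECT id FROM orders WHERE closed_at IS NULL",
        # A Postgres partial index is roughly what Oracle's behaviour gives you implicitly:
        "CREATE INDEX orders_open_idx ON orders (closed_at) WHERE closed_at IS NOT NULL",
    ]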
Oracle has had the ability to do this for decades ("virtual private database"), so whatever is keeping them alive, it's nothing to do with this particular nifty Postgres feature.
PostgreSQL is a great product, but Oracle has so many more features, even just in the query language. There's so much power embedded in just the MODEL clause that would have to be custom software in Postgres.
If you buy Oracle, you should use Oracle. Like really lean into it. If you really need it, it will be worth the money. I don't like dealing with Oracle sales, but the product is killer.
They're like a tick. Very good at burrowing in and hard to remove. They have a lot of clients for whom a dozen million dollars is a drop in the bucket, and moving away is a decade-long millions-of-dollars project.
And PG supports layer 3 shut-down of link listeners and inbound fw ports. So you can combine the L7 tenancy with a secure networking architecture which eliminates the problems of managing firewalls and ACLs. One of the open source examples:
https://youtu.be/s-skpw7bUfI