I agree a data diff would be challenging, especially at scale, but one tool I th...

evanelias · on Dec 9, 2019

I'm the author of a schema management tool, Skeema [1], designed to solve this problem for MySQL and MariaDB.

There are a number of other existing tools that can compare/diff schemas on two live databases, but Skeema is also designed to manage your database structure through a declarative repo of CREATE statements. It natively supports sharding and external OSC tools, so it works at scale, and it is trusted by several large users, including GitHub [2] and Twilio SendGrid [3].

[1] https://www.skeema.io

[2] https://fosdem.org/2020/schedule/event/mysql_github_schema/

[3] https://sendgrid.com/blog/schema-management-with-skeema/
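
To make the declarative idea above concrete, here is a minimal Python sketch of diffing a desired schema against a live one. It is not how Skeema works internally: the `diff_schema` function and the dict-based schema representation are hypothetical, and a real tool parses actual CREATE statements and handles indexes, constraints, column order and much more.

```python
def diff_schema(desired, live):
    """Return DDL statements that would move `live` toward `desired`.

    Both arguments are hypothetical toy schemas of the form
    {table_name: {column_name: column_type}}.
    """
    statements = []
    for table, cols in desired.items():
        if table not in live:
            # Table exists only in the desired state: create it.
            col_defs = ", ".join(f"{name} {ctype}" for name, ctype in cols.items())
            statements.append(f"CREATE TABLE {table} ({col_defs})")
            continue
        live_cols = live[table]
        for name, ctype in cols.items():
            if name not in live_cols:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype}")
            elif live_cols[name] != ctype:
                statements.append(f"ALTER TABLE {table} MODIFY COLUMN {name} {ctype}")
        for name in live_cols:
            if name not in cols:
                statements.append(f"ALTER TABLE {table} DROP COLUMN {name}")
    for table in live:
        if table not in desired:
            # Table exists only in the live database: drop it.
            statements.append(f"DROP TABLE {table}")
    return statements


desired = {"users": {"id": "BIGINT", "email": "VARCHAR(255)"}}
live = {"users": {"id": "INT", "name": "VARCHAR(100)"}, "old": {"id": "INT"}}
for statement in diff_schema(desired, live):
    print(statement)
```

The point of the declarative style is that the repo only describes the end state, and the tool works out the ALTERs.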

knutwannheden · on Dec 9, 2019

jOOQ is an SQL library which, in addition to an internal SQL DSL for Java, includes other goodies such as an SQL parser and, very recently (it is still under active development), a schema diff tool that is also available as a CLI: https://www.jooq.org/diff/.

One thing that sets jOOQ apart from most other tools out there is that it supports many different SQL dialects. The schema diff tool can therefore, for instance, parse DDL in one dialect and render the diff in another. For certain applications this could be of interest.

Disclaimer: I am an active committer on the jOOQ project.

grey-area · on Dec 9, 2019

I run migrations locally and on dev on sanitised snapshots of live data, and have easy access to those, so I just use the db to view the schema if required. Regular snapshots of the data are useful too.

If migrations are kept small there isn't usually much confusion over what changed (see migration), or what exists (see db).

lichtenberger · on Dec 9, 2019

For sure you'd have to add rolling hashes, but that's probably more natural in a tree-based data store.

I've added these to https://sirix.io, which speeds up diffing considerably, especially with deep trees. Otherwise, indexing the changes would be an option :-)
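
As a rough illustration of why per-node hashes speed up diffing of deep trees, here is a small Python sketch. It is a simplification and not Sirix's implementation: the hashes are computed bottom-up when a node is built (Merkle style) rather than updated incrementally, and the `Node` class and `diff` function are hypothetical names. It also assumes sibling names are unique.

```python
import hashlib


class Node:
    """Toy tree node whose digest covers its own data and all descendants."""

    def __init__(self, name, value="", children=None):
        self.name = name
        self.value = value
        self.children = children or []
        h = hashlib.sha256(f"{name}={value}".encode())
        for child in self.children:
            h.update(child.digest)
        self.digest = h.digest()


def diff(old, new, path=""):
    """Return paths that differ, skipping subtrees whose digests match."""
    here = f"{path}/{new.name}"
    if old.digest == new.digest:
        return []  # identical subtree: no need to descend
    changed = []
    if old.name != new.name or old.value != new.value:
        changed.append(here)
    old_by_name = {c.name: c for c in old.children}
    new_names = {c.name for c in new.children}
    for child in new.children:
        if child.name in old_by_name:
            changed += diff(old_by_name[child.name], child, here)
        else:
            changed.append(f"{here}/{child.name} (added)")
    for name in old_by_name:
        if name not in new_names:
            changed.append(f"{here}/{name} (removed)")
    return changed


old = Node("root", children=[
    Node("a", "1"),
    Node("b", children=[Node("c", "2"), Node("d", "3")]),
])
new = Node("root", children=[
    Node("a", "1"),
    Node("b", children=[Node("c", "2"), Node("d", "4")]),
    Node("e", "5"),
])
print(diff(old, new))
```

Unchanged subtrees are ruled out with a single digest comparison, so the work is proportional to the changed paths rather than the size of the tree.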