I work remotely and I've decided that ON is the right choice for me. I made this choice when I started joining a lot of calls where they had to be in English because of me. I figured camera on was a good way to show I was paying attention.
However, one of the perks of being male is that society only ever expects me to take a shower to be presentable. So I'm totally cool if a colleague wants to leave the camera off, even if it's a 1 on 1.
You are correct, it is a disk-format issue, and MySQL officially supports in-place upgrade between versions.
It's actually quite hard to fix bugs in charsets/collations, because any change to the sort order could implicitly affect the on-disk format (indexes are stored sorted).
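To illustrate why a sort-order change effectively changes the on-disk format, here is a toy sketch in Python. It is not how InnoDB works internally: the "index" is just a sorted list, and `binary_key` and `case_insensitive_key` are made-up stand-ins for two collations. The point is only that a binary search is valid when the data was sorted with the same comparison it is searched with.

```python
import bisect

def binary_key(s):
    """Toy stand-in for a binary collation: compare raw UTF-8 bytes."""
    return s.encode("utf-8")

def case_insensitive_key(s):
    """Toy stand-in for a case-insensitive collation."""
    return s.lower()

def build_index(values, collation_key):
    """Sort values under a collation, like the leaf order of an index."""
    return sorted(values, key=collation_key)

def index_lookup(index, value, collation_key):
    """Binary-search the index; assumes it is sorted under collation_key."""
    keys = [collation_key(v) for v in index]
    target = collation_key(value)
    pos = bisect.bisect_left(keys, target)
    return pos < len(keys) and keys[pos] == target

names = ["bob", "Alice", "alice", "Bob", "Zed"]

# Index written to "disk" under the old sort order.
old_index = build_index(names, case_insensitive_key)

# Same comparison as at build time: the row is found.
found_before = index_lookup(old_index, "Zed", case_insensitive_key)

# Sort order changed without rebuilding the index: the row is missed.
found_after = index_lookup(old_index, "Zed", binary_key)
```

Here "Zed" is still in the list after the change, but the search no longer finds it, which is why a collation fix usually implies rebuilding every affected index.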
Skeema author here -- you're correct that a live DB is involved, but the notion of "truth" in Skeema actually depends on what operation you're running :)
In Skeema, you have directories of *.sql files, where each directory defines the desired state of a single "logical schema". A config file in each dir allows you to map that logical schema to one or more live databases, possibly in different environments (prod, stage, etc) and/or possibly different shards in the same environment.
Most commonly, users will want to "push" the filesystem state to the databases, i.e. generate and run DDL that brings the database up to the desired state expressed by the filesystem. But users can also "pull" from a live database, which does the reverse: it modifies the filesystem to look like the current state of a live database, essentially doing a schema dump. So the source of truth depends on the direction of the operation.
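The push/pull idea can be sketched in a few lines of Python. To be clear, this is a toy illustration of the concept and not Skeema's implementation: the dict-based schema representation and the `push_ddl` and `pull` functions are hypothetical, and a real tool has to handle far more than added and dropped columns.

```python
def push_ddl(desired, live):
    """Generate DDL that would bring `live` up to the `desired` state.

    Both arguments map table name -> {column name: column type}.
    """
    statements = []
    for table, cols in desired.items():
        if table not in live:
            col_defs = ", ".join(f"{name} {ctype}" for name, ctype in cols.items())
            statements.append(f"CREATE TABLE {table} ({col_defs})")
            continue
        for name, ctype in cols.items():
            if name not in live[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype}")
        for name in live[table]:
            if name not in cols:
                statements.append(f"ALTER TABLE {table} DROP COLUMN {name}")
    return statements

def pull(live):
    """Reverse direction: the live state becomes the new filesystem state."""
    return {table: dict(cols) for table, cols in live.items()}

# Hypothetical example data.
desired = {"users": {"id": "INT", "email": "VARCHAR(255)"}}
live = {"users": {"id": "INT", "legacy": "TEXT"}}

ddl = push_ddl(desired, live)   # filesystem is the source of truth
dumped = pull(live)             # live database is the source of truth
```

In the push direction the filesystem wins and the database gets altered; in the pull direction nothing is altered and the files are simply overwritten with what the database currently has.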
Skeema also uses the live database as a SQL parser. Instead of needing to parse every type of CREATE statement accurately across many different versions/dialects of MySQL, Percona Server, and MariaDB (and maybe someday Aurora, etc.), Skeema runs the statements in a temporary location and then introspects the result from information_schema. This avoids an entire class of possible bugs around parser inaccuracies.
That all said, I do wholeheartedly agree that generating DDL from git alone is both useful and really cool! I actually built a "database CI" product around this concept as a GitHub app last June: https://www.skeema.io/ci/
And agreed re parsing -- it turned out to be a huge pain. I think someone bundled the postgres parser for python; I wish this were true for every dialect, but even so, there are versions to consider.
One detail that is not always obvious is how much work goes into limiting regressions. The work to switch to utf8mb4 really started in MySQL 5.6 by not allocating the sort buffer in full (and then further improved in 5.7). 8.0 then added a new temptable storage engine for variable length temp tables.
These are not small cases either: compared to latin1, the _profile_ of queries could change from all in-memory to on-disk, so we could be talking about 10x regressions. In MySQL 8.0 it is more like 11% https://www.percona.com/blog/2019/02/27/charset-and-collatio...
Edit: Also forgot to mention, switching the default character set broke over 600 tests. It's not as easy as it sounds!
While I appreciate that it's the default now (utf8mb4)... If someone specified (by mistake) "utf8" as the character set, is that real utf8 or some other implementation currently?
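For context on the question: in MySQL, "utf8" has historically been an alias for utf8mb3, which stores at most 3 bytes per character, while utf8mb4 allows up to 4. A rough way to see which strings are affected is to count UTF-8 bytes per character. The helper names below are made up for illustration, and this only checks encoded length, not what any particular server version will actually do with the value.

```python
def max_utf8_bytes(text):
    """Largest number of UTF-8 bytes needed by any single character in text."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)

def fits_in_utf8mb3(text):
    """True if every character encodes to at most 3 bytes in UTF-8."""
    return max_utf8_bytes(text) <= 3

ascii_ok = fits_in_utf8mb3("hello")         # 1 byte per character
cjk_ok = fits_in_utf8mb3("日本語")           # 3 bytes per character
emoji_ok = fits_in_utf8mb3("hello \U0001F600")  # the emoji needs 4 bytes
```

Anything outside the Basic Multilingual Plane (most emoji, some rarer CJK characters) needs 4 bytes, which is the case the 3-byte variant cannot represent.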
I agree that column and row store have very different characteristics, but what I think is worth mentioning is that some hybrid solutions actually store as both row and columnar and have a query optimizer that can pick between them. For example: Oracle DB In-Memory, SQL Server Columnstore index.
At the same event as this announcement, we also announced that we are working on TiFlash which will do similar. Stay tuned for a blog post with more details :-)
As described in that paper, it’s not sufficient to simply store a second, columnar projection of the data to get good performance. You also need a block-oriented execution engine, which means you effectively have two separate databases operating side-by-side. This is a huge challenge and it’s not clear if it’s really worth it, since for logistical reasons you will nearly always operate a separate data warehouse doing mostly OLAP and a production database doing mostly OLTP.
Morgan from the TiDB team here. We are working on encryption at rest now - stay tuned.
W.r.t. nested transactions, this is not something that MySQL currently offers (TiDB is MySQL 5.7 compatible). It can sometimes be emulated via savepoints, which is a feature we plan to add in the future.
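For anyone unfamiliar with the savepoint approach, here is a minimal sketch. It uses Python's built-in sqlite3 module purely because it runs anywhere; it is SQLite, not MySQL or TiDB, and the table and savepoint names are made up. The SAVEPOINT / ROLLBACK TO SAVEPOINT / RELEASE SAVEPOINT statements follow the same general pattern in MySQL.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")                                   # outer transaction
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")

conn.execute("SAVEPOINT inner_tx")                      # start of the "nested" part
conn.execute("INSERT INTO accounts VALUES ('bob', 50)")
conn.execute("ROLLBACK TO SAVEPOINT inner_tx")          # undo only the inner work
conn.execute("RELEASE SAVEPOINT inner_tx")

conn.execute("COMMIT")                                  # outer work is kept

rows = conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall()
conn.close()
```

The inner insert is discarded while the outer one survives the commit, which is the behavior people usually want from a nested transaction.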
Morgan from the TiDB team here. Thank you for the feedback, and I agree with you. We actually took this line out from the same copy in the docs: https://pingcap.com/docs/
(We must have missed a spot, and I will follow up and make sure it is addressed).
We try to be transparent about the differences from MySQL. On the compatibility page, there are a few cases described such as large transactions, small transactions and single threaded workloads: