> This wasn’t caught by the existing test suite (even though it runs almost 200 end-to-end tests), because it always starts from an empty database, applies all migrations and only then runs the test code.
Isn't that where the test coverage has a hole?
I somehow expected the blog post to extend the testing for this: a pre-populated database which is then migrated. That seems to catch a wider class of issues than parsing SQL and guarding against just the one case of adding a NOT NULL column without a default.
Just to clarify a bit, the test of course isn't a fully general solution to database migration issues (I hinted at what that might be in the blog post), although it's still useful for providing a nice error message even if a more general solution were implemented.
That was not at all the goal of the post. I just wanted to appreciate how easy it was to achieve this specific task in Rust. In any other systems programming language that I've used (even most other languages, except maybe Python), I would never even imagine something like this being feasible, and so easy to do. That's it :)
That didn't even occur to me, tbh :) But it doesn't have to be SQL linting, I just wanted to appreciate the mindset of not being lazy/afraid to write an unorthodox test.
Just look at the test for this one-liner DDL. The test is way more complex than the thing it tests in the first place. Such confusing, hard-to-write, hard-to-read tests are the reason people avoid writing tests in the first place. Make tests great again!
(Great = simple, short, easy to write, read, and maintain. At the very least, no more complex than the thing it's testing!)
Totally with you on the merits of simplicity, writability, readability. But I'm getting strong "rest of the owl" vibes. How would you have prevented this defect instead, in a way that you find simple?
> How would you have prevented this defect instead, in a way that you find simple?
I currently prevent this sort of thing using a populated database. This database has actual real data (user info overwritten with random data).
Using a pre-populated database catches many more errors than this article's approach does.
Using a pre-populated database that has actual data generated by the users over a year or so catches even more errors.
This approach is fragile and misses the actual error, fixing only the symptom. The actual error is "there's a hole in my tests big enough to fly a passenger jet through".
The correct approach is to take a dump of the production database, scrub PII from it, and then perform your migration.
> Apart from parsing the SQL query, I also considered an alternative testing approach that I might implement in the future: go through each migration one by one, and insert some dummy data into the database before applying it, to make sure that we test each migration being applied on a non-empty database. The data would either have to be generated automatically based on the current database schema, or we could commit some example DB dataset together with each migration, to make sure that we have some representative data sample available.
That suggests this may also be a fairly complicated direction. Although it's not entirely clear to me why he can't just put one record in the DB before any migrations and then pull it all through. Plus it has the added drawback of removing your coverage for the (not unimportant) zero case.
It takes more than a few minutes, yes. But this one-time investment will prevent the next 10 migration-related bugs that he'll otherwise blog about.
Grab some representative data from production and keep feeding that into your migration tests. Keep updating those. Worth each minute if you care about quality.
If I only insert data into the DB once, I could miss important states. For example, I could add non-NULL data to a NOT NULL column, then make the column nullable, and then make it NOT NULL again. If I don't insert a NULL into the column between the last two migrations, I won't trigger the issue.
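Concretely, a hypothetical sequence like this would do it (not from the post, just to illustrate):

```rust
// Hypothetical migrations, applied in order by the test harness.
const MIGRATIONS: &[&str] = &[
    // 1. The column is introduced as NOT NULL, so any rows seeded up front
    //    never contain a NULL in it.
    "ALTER TABLE users ADD COLUMN nickname TEXT NOT NULL DEFAULT ''",
    // 2. The constraint is relaxed; from here on, rows may contain NULLs.
    "ALTER TABLE users ALTER COLUMN nickname DROP NOT NULL",
    // 3. Tightened again: SET NOT NULL only fails if a NULL was inserted
    //    after step 2, which data seeded once at the start never exercises.
    "ALTER TABLE users ALTER COLUMN nickname SET NOT NULL",
];
```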
In my previous job, we implemented something similar to Neon branching. Each MR would start by cloning the "public" schema into a schema scoped to the merge request, and only then run the migrations and the integration tests.
There's a range of errors you just won't catch until you run your code against the real thing. Especially if you write SQL directly, which feels like a lost art even amongst experienced developers.
This is the wrong approach. He should be putting data in the database schema before trying to run migrations on it.
Doing that is simple and can potentially catch all sorts of bugs, including bugs you haven't thought of yet. His solution is complicated but only catches one very specific type of bug.
The author mentions that approach at the end of the post:
> Apart from parsing the SQL query, I also considered an alternative testing approach that I might implement in the future: go through each migration one by one, and insert some dummy data into the database before applying it, to make sure that we test each migration being applied on a non-empty database. The data would either have to be generated automatically based on the current database schema, or we could commit some example DB dataset together with each migration, to make sure that we have some representative data sample available.
That would certainly catch nearly all migration issues, but it doesn't provide as helpful an error message as a test for this specific mistake does.
Ideally both approaches would be used, with the general case being used to detect and inform more targeted tests.
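For what it's worth, the quoted "insert dummy data before each migration" idea could look roughly like this. A minimal sketch, assuming sqlx/Postgres, a MIGRATIONS list, and a hypothetical sample_rows_for_current_schema helper; none of these names come from the article:

```rust
use sqlx::PgPool;

// Hypothetical list of migrations, applied in order.
const MIGRATIONS: &[&str] = &["CREATE TABLE users (id BIGINT PRIMARY KEY)"];

// Placeholder: build INSERT statements valid for the schema as it stands now,
// either generated from information_schema or loaded from a committed fixture.
async fn sample_rows_for_current_schema(pool: &PgPool) -> sqlx::Result<Vec<String>> {
    let _ = pool;
    Ok(Vec::new())
}

#[tokio::test]
async fn migrations_apply_to_a_non_empty_database() -> sqlx::Result<()> {
    let pool = PgPool::connect(&std::env::var("TEST_DATABASE_URL").unwrap()).await?;
    for migration in MIGRATIONS {
        // Seed representative rows before applying the next migration, so a
        // migration that can't cope with existing data fails here, not in prod.
        for insert in sample_rows_for_current_schema(&pool).await? {
            sqlx::query(&insert).execute(&pool).await?;
        }
        sqlx::query(migration).execute(&pool).await?;
    }
    Ok(())
}
```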
You should solve the problem you need to solve, not the problem you have a cool solution for.
The problem to solve is "given my database in an arbitrary but valid state, applying a migration should succeed and leave the database in an equally valid state". Not "how do I stop people adding NOT NULL columns to tables".
The author had an off-the-shelf crate they could trivially add to their project to prevent instances of this specific failure in 5 minutes. Solving the general case is possible, but probably significantly harder, and it probably still contains various corner cases that are hard to completely eliminate.
I'd characterize it less as the cool solution and more as the quick & dirty solution that gets you a lot of value for little effort.
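For context, the check in question is roughly this shape. A sketch using the sqlparser crate; it is not the author's actual code, and the AST field names differ a bit between sqlparser versions:

```rust
use sqlparser::ast::{AlterTableOperation, ColumnOption, Statement};
use sqlparser::dialect::PostgreSqlDialect;
use sqlparser::parser::Parser;

/// Returns true if the migration adds a NOT NULL column without a DEFAULT.
fn adds_not_null_without_default(sql: &str) -> bool {
    let statements =
        Parser::parse_sql(&PostgreSqlDialect {}, sql).expect("failed to parse migration SQL");
    statements.iter().any(|stmt| match stmt {
        Statement::AlterTable { operations, .. } => operations.iter().any(|op| match op {
            AlterTableOperation::AddColumn { column_def, .. } => {
                let not_null = column_def
                    .options
                    .iter()
                    .any(|o| matches!(o.option, ColumnOption::NotNull));
                let has_default = column_def
                    .options
                    .iter()
                    .any(|o| matches!(o.option, ColumnOption::Default(_)));
                not_null && !has_default
            }
            _ => false,
        }),
        _ => false,
    })
}

#[test]
fn rejects_not_null_without_default() {
    assert!(adds_not_null_without_default(
        "ALTER TABLE users ADD COLUMN nickname TEXT NOT NULL"
    ));
}
```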
You're just over-generalizing the problem into something you will not be able to solve by proving that a migration works for a _specific_ arbitrary state.
Of course, but then let's not knock the OP article for covering a different subset of testable states, in a way that will need no adjusting for future schema changes, while still preventing the actual problem.
Does it need to be done at each step, though? Couldn't the test data be added at just one point (where the schema is known) and just let it run through all the subsequent migrations to see if it goes boom?
I have also found this kind of approach very valuable. With defensive code, where the code itself does a bit of the kind of correctness checking that one might conventionally put into tests, it works even better. Pre-/post-conditions in functions, for example.
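Something along these lines (names made up, just a sketch of the style):

```rust
/// Applies a percentage discount, with the invariants checked in the code
/// itself rather than only in tests.
fn apply_discount(price_cents: u64, percent: u8) -> u64 {
    // Precondition: a discount above 100% makes no sense.
    debug_assert!(percent <= 100, "percent must be in 0..=100");
    let discounted = price_cents - price_cents * u64::from(percent) / 100;
    // Postcondition: a discount never increases the price.
    debug_assert!(discounted <= price_cents);
    discounted
}
```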
Not if you have data that could only have been inserted at some stage to begin with, and a subsequent migration only breaks that case (e.g. a table is created in step 10 and step 11 breaks that new table - if you only had the data you inserted in step 1, it would still pass all migrations). In other words, you need sample data that exercises the entire DDL that is added, or you risk missing something.
Sounds like this might make for an excellent general SQL lint tool. I thought sqlint[0] was this, but it’s more for syntax than semantics, it looks like.
This is a whole ̶l̶o̶n̶g̶ blog post about someone who was surprised that he could use a crate to parse SQL queries in Rust. A bit of an underwhelming read to me personally.
If someone figured out how to do a neat thing, learned some tricks along the way and wrote about it just in case it was useful to someone else - that person is doing useful work. More people should do that.
There's a difference between intellectually "knowing" that you can parse SQL, and actually going out and trying it and finding that it's practical to do in a test suite. I found the article valuable.
And that's great. Note that I specifically said that, for me personally, I found the article less valuable. But I'm certainly not claiming that everyone else should feel the same.
As a matter of fact, I'm sure a lot of people find my comment to be of little value.
But sometimes some innocent blog posts get criticised as if they claimed that they solved world hunger. They don't. They are often just some random thought in a rarely-read blog. Nobody intended to go convince a gang of seasoned hackers that they have undisputable wisdom.
Serverless databases with branching support like Neon make this kind of thing trivial to do. You can just have a test that branches off your prod DB, runs the migrations, and then deletes the branch. This is lightweight enough to easily run on every pull request change.
No mocking or anything. This tests that your migrations are safe to run against real data.
I'm obviously biased by being an employee, but this is where Neon's branching[0] functionality can come in useful. We hope to expand on it one day and build more first-party migration tooling but you can already get a good enough system with the features we have.
Neon Branches are zero-copy snapshots of the database, with all the same data, on an isolated postgres instance. You can run migrations on that data without risking any modifications to data or performance in production. You can set up scripts to run it in CI[1].
This isn't perfect. If performance matters, some migrations might hide table locks which can cause major slowdowns. I'm not sure how you might detect this currently, I had some discussions recently about whether we can add "explain analyze" to DDL queries in postgres.
> If performance matters, some migrations might hide table locks which can cause major slowdowns.
Do you mean that the migration might work on a side branch, but then it might not work on the main branch, because there's no other activity running on the side branch?
Enjoyed the article.
Am I out of touch, or does the article link to a PR in the rust-lang ecosystem that didn't go through code review? Is this really the standard for such a major language's standard library?
If you look at https://github.com/rust-lang/bors it's not a standard library package, it's a tool to support the Rust development process. And the person who opened and landed that PR is the lead developer of that project.
Skipping a code review from someone else feels OK to me for that.
And now you've pulled in a full SQL parser as a dependency (admittedly a dev/build-time dependency, but a dependency nonetheless) in a project that has no business parsing SQL.
In this day and age of increasingly rampant supply chain attacks and dependency vulnerabilities, I'd definitely be second-guessing the approach of "just write a test for it" if that test involves blowing up your attack/vulnerability surface.
Sidenote: I found this code quite readable. I'm usually reading Rust code here on HN that's full of weird lifetime annotations, `dyn`, or super long generic types. As a non-Rust dev, that freaked me out.
If you're running something like this and it isn't performance sensitive, you don't really need to deal with lifetimes at all. Lifetimes mostly come into play if you want to work with zero-copy parsing, or structs that borrow some things and act like a sub-view into the data you're working with.
I'm sure there are plenty of other use cases for lifetimes, but they don't come up very often when writing standard "application" code.
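For instance, the "sub-view" case might look like this (a made-up example, not from the article):

```rust
/// A zero-copy view into one line of input: the fields borrow slices of the
/// original string instead of allocating copies, hence the lifetime.
struct Record<'a> {
    name: &'a str,
    value: &'a str,
}

fn parse(line: &str) -> Option<Record<'_>> {
    let (name, value) = line.split_once('=')?;
    Some(Record { name, value })
}

fn main() {
    let record = parse("color=blue").unwrap();
    println!("{} is {}", record.name, record.value);
}
```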
If you're writing applications or tests, it's mostly the simpler kind of code. If you're writing reusable (or performance-critical) code, you will start seeing generics and lifetimes much more often.