> Microservices are a design pattern for organisations as opposed
> to technology ... breakout into multiple teams
I agree, but just saying "multiple teams" has led many eng directors to think "I have two squads now --> omg they cannot both be in the same monolith".
When both squads are 5 people each.
And the squads re-org (or "right size") every 9 months to re-prioritize on the latest features.
Five years go by, 7 team/re-org changes, all of which made sense, but thank god we didn't microservice on the 2nd/3rd/4th/5th/6th team boundaries. :grimmacing:
We should stay "stable, long-lived teams" -- like you need to have a team that exists with the same ownership and mandate for ~18 months to prove its a stable entity worth forming your architecture around.
Well, we're not the "go to" yet :-) but if you want an entity-based ORM that isn't just a query builder, Joist has several amazing features (no N+1s) and great ergonomics https://joist-orm.io/
Do queue operations (enqueue a job & mark this job as complete) happen in the same transaction as my business logic?
Imo that's the killer feature of database-based queues, because it dramatically simplifies reasoning about retries, i.e. "did my endpoint logic commit _and_ my background operation enqueue both atomically commit, or atomically fail"?
Same thing for performing jobs, if my worker's business logic commits, but the job later retries (b/c marking the job as committed is a separate transaction), then oof, that's annoying.
My understanding is that hatchet isn’t just a queue, it’s a workflow orchestrator: you can use it as a queue but it’s kind of like using a computer as a calculator: it works but indeed it’d likely be simpler to use a calculator.
On your point of using transactions for idempotency: you’re right that it’s a great advantage of a db-based queue, but I’d be wary about taking it as a holy grail for a few reasons:
- it locks you into using a db-based queue. If for any reason you don’t want to anymore (eg you’re reaching scalability issues) it’ll be very difficult to switch to another queue system as you’re relying on transactions for idempotency.
- you only get transactional idempotency for db operations. Any other side effect won’t be automatically idempotent: external API calls, sending messages to other queues, writing files…
- if you decide to move some of your domain to another service, you lose transactional idempotency (it’s now two databases)
- relying on transactionality means you’re not resilient to having duplicate tasks in the queue (duplicate publishing). That can easily happen: bug of the publisher, two users triggering an action concurrently… it’s quite often a very normal thing to trigger the same action multiple times
So I’d avoid having my tasks rely on transactionality for idempotency, your system is much more resilient if you don’t
They’re not talking about distributed transactions: it’s not about a task being published and consumed atomically, it’s about it being consumed and executed atomically.
Working on v2 of our n+1-proof/reactive TypeScript ORM, Joist (https://joist-orm.io/), that moves to using the new-ish postgres.js driver (instead of knex/node-pg), so that we can leverage postgres.js's statement pipelining within transactions.
I'm anticipating a really sweet perf increase (as shown by some proof-of-concepts), but now that everything is actually working on the v2 branch, I'm putting together benchmarks that show the benefit in practice.
Love to have anyone poke around/ask questions/hang out on discord.
Same! We've also been using the polyfill in production for a ~year or so.
We recently configured the node-pg driver to leave the postgres DATE / TIMESTAMP / TIMESTAMPTZ dates as "just strings" (instead of its default parsing to JS dates), and then we go straight from strings -> temporals at our ORM layer, without the jank/hop of it first being a JS Date.
But it's been great, having `DATE` in the db => `PlainDate` in the TS backend => `Date` in GraphQL API => `PlainDate` again in the TS/React FE. Very clean.
I'm building "ActiveRecord but with type(script)" at https://joist-orm.io/ and have our own internal "kinda Rails w/GraphQL" going internally--so far we're at ~450 tables and still "feeling good" (no n+1s ever!), but definitely a long way to go before AirBNB-sized problems.
Using raw SQL directly is doable, but it means you're responsible for maintaining the business logic & validation rules of every single hand-written INSERT, UPDATE, and DELETE query in your codebase.
Personally I don't trust myself to remember to do that :-) hence preferring entity-based ORMs:
(That said, I definitely "know SQL" and use raw SQL queries for the ~5% of queries in a CRUD/SaaS app that are actually unique/non-boilerplate, instead of forcing them to go through an obtuse query builder DSL.)
I mean, of course they are a good idea, what we need is more examples of actually doing them in practice. :-)
I.e. quoting from the post:
- monolithic databases need to be broken up
- Tables must be grouped by module and isolated from other modules
- Tables must then be migrated to separate schemas
- I am not aware of any tools that help detect such boundaries
Exactly.
For as much press as "modular monoliths" have gotten, breaking up a large codebase is cool/fine/whatever--breaking up a large domain model is imo the "killer app" of modular monoliths, and what we're missing (basically the Rails of modular monoliths).
Any "query in a loop", whether a manually-written `for` loop, or an "emergent for loop" from 100 `after_update` lifecycle hooks firing at once, are always auto-batched, every time.
Joist is also likely why we actually like GraphQL, vs. it generally having a reputation for being a pita to implement (N+1s everywhere w/o tedious, manual batching).
Maybe its Stockholm syndrome after ~4-5 years of TypeScript, but I like knowing "this method call is going to do I/O somewhere" (that its red).
To the point where I consider "colorless functions" to be a leaky abstraction; i.e. I do a lot of ORM stuff, and "I'll just call author.getBooks().get(0) b/c that is a cheap, in-memory, synchronous collection access ... oh wait its actually a colorless SQL call that blocks (sometimes)" imo led to ~majority of ORM backlash/N+1s/etc.
Maybe my preference for "expressing IO in the type system" means in another ~4-5 years, I'll be a Haskell convert, or using Effect.ts to "fix Promise not being a true monad" but so far I feel like the JS Promise/async/await really is just fine.
I agree, but just saying "multiple teams" has led many eng directors to think "I have two squads now --> omg they cannot both be in the same monolith".
When both squads are 5 people each.
And the squads re-org (or "right size") every 9 months to re-prioritize on the latest features.
Five years go by, 7 team/re-org changes, all of which made sense, but thank god we didn't microservice on the 2nd/3rd/4th/5th/6th team boundaries. :grimmacing:
We should stay "stable, long-lived teams" -- like you need to have a team that exists with the same ownership and mandate for ~18 months to prove its a stable entity worth forming your architecture around.