
As someone who built quite a bit of tech / product on Workers / Pages over the last year and a half or so, this (and some of the other announcements from this week) really excites me; I just wish our timing had been better.

Not being able to meaningfully use any external services that didn’t support an HTTP / fetch API was one of the biggest consistent pain points.

Arguably it was the one with the biggest negative architectural ramifications. Given how long (understandably) it has taken to move D1 forward in the ways that matter most (e.g. transaction support), this is a huge step towards production viability for a more diverse range of products.

When I left my company in April, I had Cloudflare filed under "glad I tried it, but not ready for production use / that was a mistake" - this week put it back on my list to evaluate for whatever I do next.

Congrats to the Cloudflare team! I admire your intuition for what customers need and your willingness to compete with yourself on stuff like this (actively supporting other DB providers while building D1 - respect).



Sane assessment - the transaction API for D1 will be just as important. I haven't been that excited about their approaches so far, but I also don't know of any good alternative.

Something I quite like doing is a thread-local (or async-local) transaction context, and that seems quite hard, if not impossible, to do with either batching or stored procedures, from what I've seen.
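
For concreteness, here's a rough sketch of the pattern I mean, using Node's AsyncLocalStorage and better-sqlite3 as stand-ins (neither is something D1 gives you today, and the helper names are made up):

    import { AsyncLocalStorage } from "node:async_hooks";
    import Database from "better-sqlite3";

    const db = new Database("app.db");
    // Ambient store holding the connection for the current async call chain.
    const txContext = new AsyncLocalStorage<Database.Database>();

    // Run `fn` inside a transaction; anything it calls (or awaits into) can
    // pick the same connection out of the async-local store instead of
    // threading it through every function signature.
    async function withTransaction<T>(fn: () => Promise<T>): Promise<T> {
      return txContext.run(db, async () => {
        db.exec("BEGIN");
        try {
          const result = await fn();
          db.exec("COMMIT");
          return result;
        } catch (err) {
          db.exec("ROLLBACK");
          throw err;
        }
      });
    }

    // Repositories grab the ambient connection without knowing a transaction exists.
    function currentDb(): Database.Database {
      return txContext.getStore() ?? db;
    }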

What I really wish for is to drop in any old query builder or ORM and use it identically to how I would with SQLite. I'm not sure if that's feasible, however.


So, a challenge here is that SQLite is designed for single-writer scenarios. One writer performing a transaction necessarily has to block any other writer from proceeding in the meantime. (There are some experimental approaches in the works to solve this, like "BEGIN CONCURRENT", but it's still limited compared to a typical multi-client database.)

This is all fine when the application is using SQLite as a local library, since any particular transaction can finish up pretty quickly and unlock the database for the next writer. But D1 allows queries to be submitted to the database from Workers located around the world. Any sort of multi-step transaction driven from the client is necessarily going to lock the database for at least one network round trip, maybe more if you are doing many rounds of queries. Since D1 clients could be located anywhere in the world, you could be looking at the database being write-locked for tens or hundreds of milliseconds. And if the client Worker disappears for some reason (machine failure, network connectivity, etc.), then presumably the database has to wait some number of seconds for a timeout, remaining locked in the meantime. Yikes!
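
To make that concrete, here's a toy sketch (plain better-sqlite3 against local files, not D1) of a second writer getting locked out while the first writer sits on an open transaction for the length of a pretend network round trip:

    import Database from "better-sqlite3";

    const a = new Database("shared.db");
    const b = new Database("shared.db", { timeout: 100 }); // give up on the lock quickly
    a.exec("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)");

    async function main() {
      a.exec("BEGIN IMMEDIATE"); // writer A takes the write lock
      a.prepare("INSERT OR REPLACE INTO kv VALUES ('x', '1')").run();

      // Pretend the client is on another continent deciding what to write next.
      await new Promise((resolve) => setTimeout(resolve, 150));

      try {
        b.exec("BEGIN IMMEDIATE"); // writer B can't start...
      } catch (err) {
        console.log("second writer blocked:", (err as Error).message); // SQLITE_BUSY
      }

      a.exec("COMMIT"); // ...until A finishes

      b.exec("BEGIN IMMEDIATE");
      b.prepare("INSERT OR REPLACE INTO kv VALUES ('y', '2')").run();
      b.exec("COMMIT");
    }

    main();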

So, the initial D1 API doesn't allow remote transactions, only query batches. But we know that's not good enough.
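
For reference, a batch looks roughly like this from a Worker (the binding name and table names here are made up): the statements go over the wire together and run back-to-back at the database, so no lock is held across a round trip.

    export default {
      async fetch(req: Request, env: { DB: D1Database }): Promise<Response> {
        // All statements are submitted in one shot; no interactive transaction
        // is held open while the Worker waits on the network.
        const results = await env.DB.batch([
          env.DB.prepare("INSERT INTO orders (id, total) VALUES (?, ?)").bind("o-1", 42),
          env.DB.prepare("UPDATE inventory SET qty = qty - 1 WHERE sku = ?").bind("sku-123"),
        ]);
        return Response.json(results.map((r) => r.success));
      },
    };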

To actually enable transactions, we need to make sure the code is running next to the database, so that write locks aren't held for long. That's complicated but we're attacking it on a few different fronts.

The new D1 storage engine announced a couple weeks ago (which has been my main project lately) is actually a new storage engine for Durable Objects itself. When it's ready, this will mean that every Durable Object has a SQLite database attached, backed by actual local files. In a DO, since the database is local, there's no problem at all with transactions and they'll be allowed immediately when this feature is launched.
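
As a sketch of why that helps (treat the API names here as illustrative, not a spec): with the database local to the Durable Object, a multi-statement transaction runs synchronously and holds the write lock for microseconds rather than network round trips. Something along these lines:

    import { DurableObject } from "cloudflare:workers";

    export class Counter extends DurableObject {
      increment(key: string): number {
        // Everything below runs against local files, in-process with the DO,
        // so the transaction never waits on a client across the network.
        return this.ctx.storage.transactionSync(() => {
          this.ctx.storage.sql.exec(
            "CREATE TABLE IF NOT EXISTS counters (k TEXT PRIMARY KEY, n INTEGER)"
          );
          this.ctx.storage.sql.exec(
            "INSERT INTO counters (k, n) VALUES (?, 1) ON CONFLICT (k) DO UPDATE SET n = n + 1",
            key
          );
          const row = this.ctx.storage.sql
            .exec("SELECT n FROM counters WHERE k = ?", key)
            .one();
          return Number(row.n);
        });
      }
    }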

But DO is a lower-level primitive that requires some extra distributed-systems thinking on the part of the developer. For people who don't want to think about it, D1 needs to offer something that "just works". The good news is that the Workers architecture makes it pretty easy for us to automatically move code around, so in principle we should be able to make a Worker run close to its D1 database if it needs to perform transactions against it. (We launched a similar feature recently, Smart Placement, which auto-detects when a Worker makes lots of round trips to a single back-end and moves the Worker to run close to it.)
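
(For anyone who hasn't seen it, Smart Placement is a one-line opt-in in the Worker's configuration; everything else in this snippet is a placeholder.)

    # wrangler.toml (name/entry point/date are placeholders)
    name = "my-worker"
    main = "src/index.ts"
    compatibility_date = "2023-09-25"

    [placement]
    mode = "smart"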

Sorry it's not all there yet, but we're working on it...


I really appreciate you taking the time to provide this context!

Intuitively I had a rough idea of why this was both a) a blocking issue ("why can't we just YOLO-try some version of it anyway?" came up at least a couple of times internally during our alpha evaluation) and b) a hard issue, but I didn't know the details about SQLite being single-writer by design (at least for the time being).

That's valuable for my own knowledge, and also useful for understanding the steps Cloudflare will need to take to enable this in the future.

I was already excited about Smart Placement regardless, but now doubly so knowing it is adjacent to what will enable a more generic solution for D1. We used DOs very heavily on the project I mentioned in my previous post, but as you call out, they are more complex to reason about, and in practice that limited who on our team could work on them effectively.

I always appreciate your comments in Worker / DO-related threads here on HN and have found them very insightful / helpful in learning more about what's under the hood! Thanks for taking the time to continue posting here - I know there's lots to do elsewhere.


This was a far more comprehensive answer to my skepticism than I expected. If nothing else, I'm very excited to see how it pans out in practice and what the ergonomics are like.



