I generally agree that one database instance is ideal, but there are other reasons why Postgres everywhere is advantageous, even across multiple instances:
- Expertise: it's just SQL for the most part
- Ecosystem: same ORM, same connection pooler
- Portability: all major clouds have managed Postgres
I'd gladly take multiple Postgres instances even if I lose cross-database joins.
Yep. If performance becomes a concern but we still want to exploit joins, it's easy to set up read replicas and "shard" read-only use cases across them.
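To make that concrete, here's a minimal sketch of read/write routing, assuming one primary and a list of replica DSNs (all names and DSNs here are hypothetical, and in practice you'd hand the chosen DSN to a driver like psycopg or a pooler like PgBouncer):

```python
import itertools

class PostgresRouter:
    """Route read-only queries to replicas, everything else to the primary."""

    def __init__(self, primary_dsn, replica_dsns):
        self.primary_dsn = primary_dsn
        # Round-robin over replicas to spread read load evenly.
        self._replicas = itertools.cycle(replica_dsns)

    def dsn_for(self, sql):
        # Naive heuristic: treat anything that isn't a plain SELECT as a write.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.primary_dsn

router = PostgresRouter(
    "postgres://primary/app",
    ["postgres://replica-1/app", "postgres://replica-2/app"],
)
print(router.dsn_for("SELECT * FROM users"))    # → postgres://replica-1/app
print(router.dsn_for("INSERT INTO users ..."))  # → postgres://primary/app
```

Real setups also have to account for replication lag (read-your-writes), but the basic routing really is this simple.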
Hard disagree. I've used Docker predominantly in monoliths, and it has served me well. Before that I used VMs (via Vagrant). Docker certainly makes microservices more tenable because of the lower overhead, but the core tenets of reproducibility and isolation are useful regardless of architecture.
There's some truth to this too honestly. At $JOB we prototyped one of our projects in Rust to evaluate the language for use, and only started using Docker once we chose to move to .NET, since the Rust deployment story was so seamless.
Haven't deployed production Java in years, so I won't speak to it. However, even with Go's static binaries, I'd like to leverage the same build and deploy process as other stacks. With Docker, a Go service is no different from a Python service: I use the same build tool, instrument health checks the same way, and so on.
Standardization is a major benefit. Every major cloud has one (and often several) container orchestration services, so standardization naturally leads to portability. No lock-in, from my local machine to the cloud.
Even when running services on their own box, I likely want to isolate them from one another.
For example, different Python apps using different Python versions. venvs are nice but incomplete: you may end up depending on libraries with system-level dependencies that a venv can't capture.
I'm super interested in a Postgres-only task queue, but I'm still unclear from your post whether the only broker dependency is PostgreSQL. You mention working towards getting rid of the RabbitMQ dependency but the existence of RabbitMQ in your stack is dissonant with the statement 'a conviction that PostgreSQL is the right choice for a task queue'. In my mind, if you are using Postgres as a queue, I'm not sure why you'd also have RabbitMQ.
We're using RabbitMQ for pub/sub between different components of our engine. The actual task queue is entirely backed by Postgres, but things like streaming events between different workers are done through RabbitMQ at the moment, as well as sending a message from one component to another when you distribute the engine components. I've written a little more about this here: https://news.ycombinator.com/item?id=39643940.
We're eventually going to support a lightweight Postgres-backed messaging table, but the number of pub/sub messages sent through RabbitMQ is typically an order of magnitude higher than the number of tasks sent.
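For readers wondering what a Postgres-backed task queue looks like under the hood, the usual trick is `SELECT ... FOR UPDATE SKIP LOCKED`, which lets many workers pull from one table without blocking on each other's in-flight claims. A rough sketch (the table and column names are made up for illustration, not Hatchet's actual schema):

```python
CLAIM_SQL = """
UPDATE tasks
   SET status = 'running', started_at = now()
 WHERE id = (
       SELECT id FROM tasks
        WHERE status = 'queued'
        ORDER BY created_at
          FOR UPDATE SKIP LOCKED
        LIMIT 1
 )
RETURNING id, payload;
"""

def claim_next_task(cursor):
    """Atomically claim one queued task, or return None if the queue is empty.

    `cursor` is any DB-API cursor (e.g. from psycopg). SKIP LOCKED means
    rows already locked by another worker are skipped rather than waited on,
    so concurrent workers never contend for the same task.
    """
    cursor.execute(CLAIM_SQL)
    return cursor.fetchone()
```

The claim and the status update happen in one statement, so a crashed worker's transaction rolls back and the task becomes claimable again.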
While I understand the sentiment, we see it very differently. We're interested in creating the best product possible, and being open source helps with that. The users who are self-hosting in our Discord give extremely high quality feedback and post feature ideas and discussions which shape the direction of the product. There's plenty of room for Hatchet the OSS repo and Hatchet the cloud version to coexist.
> develop all the functionality of RabbitMQ as a Postgres extension with the most permissive license
That's fair - we're not going to develop all the functionality of RabbitMQ on Postgres (if we were, we probably would have started with an AMQP-compatible broker). We're building the orchestration layer that sits on top of the underlying message queue and database to manage the lifecycle of a remotely-invoked function.
The advice of "commoditize your complements" is working out great for Amazon. Ironically, AWS is almost a commodity itself, and the OSS community could flip the table, but we haven't figured out how to do it.
That makes sense, though a bit disappointing. One hope of using Postgres as a task queue is simplifying your overall stack. Having to host RabbitMQ partially defeats that. I'll stay tuned for the Postgres-backed messaging!
Then Procrastinate (https://procrastinate.readthedocs.io/en/main/) might be something for you (I just contributed some features to it). It has very good documentation, an MIT license, and some nice features like job scheduling, priorities, and cancellation.
I'm curious whether this supports coroutines as tasks in Python. That's especially useful for genAI workloads, and legacy queues (namely Celery) are lacking in this regard.
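To illustrate why coroutine support matters here: genAI work is mostly I/O-bound (waiting on model APIs), so a worker that runs tasks as coroutines can overlap many in-flight requests in a single process. A toy sketch using only asyncio, where `call_model` is a stand-in for a real LLM API call:

```python
import asyncio

async def call_model(prompt):
    # Stand-in for an LLM API call: almost all of the time
    # is spent waiting on the network, not the CPU.
    await asyncio.sleep(0.01)
    return f"response to {prompt!r}"

async def run_tasks(prompts):
    # A coroutine-native worker overlaps all of these waits concurrently;
    # a worker that dedicates a whole process to each task (as in classic
    # Celery prefork) can't pack them nearly as densely.
    return await asyncio.gather(*(call_model(p) for p in prompts))

results = asyncio.run(run_tasks(["a", "b", "c"]))
print(results)
```

With real API latencies of a second or more, the difference between one in-flight request per worker process and hundreds per process is substantial.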
It would help to see a mapping of Celery to Hatchet as examples. The current examples require you to understand (and buy into) Hatchet's model, but that's hard to do without understanding how it compares to existing solutions.
Touché. Almost all of these I can concede a smartwatch might do better. I do just want to mention: Garmin has offline maps of your whole region (e.g. North America) and the ability to provide directions. Obviously not as good as real-time traffic data, but offline has been helpful occasionally. My Garmin can control podcasts as well. Remote shutter is unfortunately a limit of phone APIs; I tried making an app for this. My Garmin has synced well to third-party apps, and I can't think of many reasons I'd want third-party fitness apps like AllTrails on it?
More relevant to my original point, however, is that none of the things you mentioned requires a beefy processor: they could all easily be implemented in a context like Garmin's.
That's fair: a lot of these features don't require a beefy processor. I think the value lies in the AW being a full-featured platform. Its processor is overkill for the lightest use cases like remote shutter, but for others it's nice to have a smartphone-like experience.
Case in point: maps are full-featured. It's not just traffic; it's also points of interest, support for different modalities (crucially, public transit), and generally feeling like the full maps experience.
Like any other platform, third-party apps are nice for niches that the manufacturer doesn't serve, or doesn't serve well. For example, lifting apps make it super easy to log sets, reps, rest, etc.: useful data the core experience doesn't offer.
As mentioned earlier, I think there's room for both. I'm actually not opposed to sporting a Garmin for endurance sports, where battery life is king :)
Actually, my Garmin does do points of interest, and supports driving, walking, or cycling directions, all fully offline. For two entire continents! All on device.
I concede that the experience looks a bit less polished than the nice UI of an Apple Watch, but the level of features in this thing is genuinely impressive.
The only one I miss is voice recognition; it would be nice to just say "navigate home" and have it go. And I'm not sure that's possible on such weak processors.
Yeah. It's surely rough around the edges UX-wise, but people consistently underestimate what you can do with such a low-powered processor. If Garmin hired UI designers and worked on getting more companies to make apps for it, it could easily be the best smartwatch.
Garmin will also log weight training bouts and auto-detect reps and sets. You can program the entire bout into the phone app and track it as you go with the watch.
Offtopic: I find a lot of the LangChain hate misplaced and misguided. Sure, if you want to ship a weekend project, using the LLM APIs directly is probably fine. But I've gotten a lot of use out of LangChain's abstractions on real-world projects that evolve quickly over time. At a minimum, it provides a common language for building LLM applications.
Some valid criticisms: (1) the learning curve is steep; (2) the APIs and docs are volatile (though that's to be expected).
This reminds me of the old Django vs Flask debate. Sure, Flask is easy to get started with, but over time you end up building an undocumented, untested Django.