Hacker News new | past | comments | ask | show | jobs | submit login
Migrating to SQLAlchemy 2.0 (sqlalchemy.org)
198 points by brainless on Feb 18, 2021 | hide | past | favorite | 106 comments



I’ve seen quite a few shops which effectively hit every python db/networking/pooling foot gun you could possibly encounter while using SQLAlchemy.

People cargo cult the flask intro tutorial and have no clue how session binding works, where the transactions are, how committing works, etc because it’s all tucked away in magical middleware and singletons. As the code base grows so does the mess of blocking txns, accidental cross joins, pool exhaustion, and so on.

It’s a great tool from a technical expressiveness perspective but terribly full of operational foot guns. Beware and use Django until you’re sure you AND your team know what you’re doing.


SQLAlchemy author here.

Your comment states the problem space addressed by SQLAlchemy 2.0, which is what the above document is about, really well! So for your comment to have value, what did you think of SQLAlchemy 2.0's direction in how it seeks to vastly improve the very issues you speak of, "where the transactions are", "how committing works", etc.?

For example: "accidental cross joins" - SQLAlchemy 1.4 now warns/ raises for these: https://docs.sqlalchemy.org/en/14/changelog/migration_14.htm... cool huh? The new docs are worth a read before commenting.

It sounds like you were relying on flask-sqlalchemy in any case which unfortunately makes some poor decisions in these areas, but in SQLAlchemy 2.0 these things are brought forward unconditionally in all cases so you really can't talk to the database without having your transaction and its scope front and center.


I'm sorry if my comment struck negatively with you. The library has been very useful to me and teams Ive worked with in the past. In fact, I've lobbied for corporate sponsorship of your work a few times. Thank you.

The problem I'm describing isn't really the ORM libraries' fault as its outside the scope (hah). Your library is great. Most of the trouble I see is where the ORM meets the web framework. I think that gets tricky for lower experience developers because it's fundamentally tricky in Python. What the scheduler (e.g. gunicorn) does and how it forks and how sessions are handled at the wsgi layer are where you have to be very careful. Django has brought that in scope and you haven't -- which is why thats a better wholistic webdev experience and SQLAlchemy is the more powerful ORM.

With that said I stand by my comment having read the full changelog (commendably thorough btw). Unless you can afford to figure those things out as a team, it would behoove most web devs to reach for Django first.


no worries, I'm pretty known for being reactive to comments of all kinds, it's both a feature and a bug. SQLAlchemy was never meant to be as "simple" as django but I just dont want people to be scared off of it; that said, I'm super glad everyone who uses django sticks with django because they would be very grumpy SQLAlchemy users. the idea is supposed to be, make a larger investment upfront, get a bigger payoff. it's a tough balloon to float which is why I'm always nosing around people's twitter threads.


If you've got the best design in the long run there's nothing to worry about. That said, it takes a while to find it! And the target is always moving.

2.0 definitely looks like a step in the right direction. Can't wait to use it with psycopg3 and the latest Python. Thanks. ;-)


This exactly mirrors the situation where I work. They picked flask/sqlalchemy early on, the actual glue code is hard to get right, people were confused over the session api vs core api and problems ensue. As much as we want to tell people you should read docs carefully, libraries should make it hard to misuse and introduce footguns, and this is an aspect sqlalchemy maybe struggles with.


> Beware and use Django until you’re sure you AND your team know what you’re doing.

That's one of my pet peeve in the Python world: people thinks Flask is for beginners and Django for advanced users. The Flask API is simple, the hello world is 5 lines. By the time you render your first Django ViewModel, you have read and tried out the full Flask doc.

So it's easy to say, "I'll start with Flask, and check out this Django thing later".

But in reality, it's exactly the opposite. Flask requires you to know a lot of things and make a lot of decisions, not to mention select/write additional libraries for features Django packs.

So if you use Flask for anything non trivial, you should show your trade or you are going to mess up, big time.

If you are a beginner, you should always start with Django. Yes, it's more work up front, but it will give you an idea of how important concepts work, plug in to each others, and can be organized in a code base.

Django is not perfect, and its ORM is really far from SQLAlachemy, but it's good enough for a loads of thing. And it will be mostly secure and extensible by default.


100%

The upfront cost of Django makes people thing it's much more complicated than a micro-framework like Flask.

I like Flask, but not because it's a simple easy to use framework, especially once you chuck SQLAlchemy into the mix. Shit, even now if I don't touch it for a while I end up trying to remember how things work


If you don’t use Django’s ORM or SQLAlchemy’s ORM, then do you really need to use Django?

Like an adjacent commenter said, ORM is ok for simple blog websites. But I can’t imagine using an ORM for complex data tables and relationships with normalized data.


I used the Django ORM for surprisingly complex applications

I created a video streaming website, with 600k visitor days, that features tagging, categories, related vidéos, user upload, encoding queue, search with auto-completion and so on. It relies heavily on the ORM.

Django migration system and db model generation also make it a very nice tool for data export/migration/exploration.

The ORM provides a nice API, but above all, a standardized one, which lead the Django ecosystem to become so reach because of the assumptions the libraries could do, making you very productive.

Now, one has to remember:

- you can, and should by pass the ORM if you need it, writing raw SQL. Django let you do that.

- plenty of things can end up be in a cache, using varnish, redis, etc. Plus a lot of things should be done in task queues anyway, not in the django views. So the cost of the ORM is rarely an issue.

- complex data tables are sometimes better presented using SQL views, making the ORM model simple.

As usual with coding, it's a matter of gain and cost, and the right answer is "it depends". But you can go very far, and very comfortably with this tools, as long as you don't fight against its nature.

So yes, definitely use Django with the ORM. I am not advocating to use Django + SQLA, but rather contrast Django with Flask + SQLA.


>But I can’t imagine using an ORM for complex data tables and relationships with normalized data.

Whenever I read a statement like this from someone, I always feel that they actually haven't thoroughly read the docs for Django's ORM. Can you give me one concrete complex example that you can't imagine doing with the ORM? Because 95% of the time, these "complex" models are trivially easy to pull off with the ORM (from my experience).


Django orm doesn't support composite primary keys.


The beauty of the Django ecosystem, is that you have packages for everything: djangopackages.org

E.G: https://github.com/Arisophy/django-compositepk-model


Fair enough. There is `unique_together` though, which you may look into.


In the context of database transactions, any APIs serve only as backend endpoints. For this purpose I question whether flask is the best choice. For rest API, I always go for Falcon.


While I do keep my eyes on FastAPI, even in this case I would recommend Django to a beginner, solely because of the amazing Django Rest Framework.

Not only it makes creating an API from ORM models quick and easy (although the documentation doesn't let you believe so), but it also let you do custom views seaminglessly, including goodies such as standardized permissions, authentication and pagination with several levels of granularity. Having that being able to be integrated with all the rest of the Django ecosystem transparently (your website, a wagtail CRM, etc.) is really sweet.

But the biggest advantage for a beginner is again that it will prevent them from having to make decisions they are likely to get wrong. Especially on the serialization, format, identifier policy and homogeneity of the API: people crafting that by hand often make a big mess and don't even know about it.


We are running sqlalchemy in a giant environment - hundreds of millions of users. SQLAlchemy has been a blessing for us. Blame users that use flask-alchemy or whatever else is cool. This is exactly the same type of comment that blames ORM-s for ineffective SQL being used because people don't know how to write the queries in the first place.


You put it better than I could. There are certainly naive examples of how to use it out in the wild but that's not SQLAlchemy's fault. It's a sophisticated tool that accurately reflects how a database works.

I caution people against using Django because the ORM makes so many weak, simplified assumptions about how a database works that don't bear out in practice. It's fine for a blog. It falls apart when you're working with Real Data.


I've used Django with "real data", as in accountancy software for multiple years of transactions across 200+ fairly large clients and it's absolutely fine.

Whereas I've used Sqlalchemy and getting it to do a basic join, turns out there's 3 different ways of doing it all with an insane amount of crap to find the right way of doing things. I actually left my old job in part due to Sqlalchemy being too painful to work with.


There is absolutely a lack of a decent SQLA tutorial for IRL code.

The doc is very formal, and if you want to make a website, a quick script or a data migration, you are left on your own to find the proper setup.


I use Django with "real data" as well and, generally, it's pretty good once you learn how to wield it. My biggest issues are lack of multi-column primary keys and inefficient queries when doing .count(), that often introduce unnecessary group by statements in the subquery. If either of those could be addressed, I'd enjoy it so much more.



Django is optimizing for the 99.99% of people rather than the 0.01% of people who manage billions of rows on a single table. I like SQLAchemy, but for most people the added cost of managing connections, transactions, have to decide whether to use the 'ORM' or not, manually having to 'commit' changes, eager-loading vs. lazy-loading, is a complete waste of time if all they want is a website that will be used by - at best - a couple hundred people concurrently.


What weak assumptions does it make?


Yeah our org should really move away from flask-sqlalchemy and be more declarative about what session changes should actually be made on.

Being more hands on with sqlalchemy is probably the right approach.


I passed on flask-sqlalchemy because it was not obvious to me how it worked together, and I did not see the value over just using Flask and SQLAlchemy with no special integration. Has worked now a couple of years in, will see how it is in a few years more.


Not having touched this stack in a while: what is best current practice for wiring together sqlalchemy and flask (including unit tests)?


Potentially don't wire them together? That is, isolate them in separate parts of the code.

I always found Flask-SQLAlchemy a strange approach because it ties opposite and somewhat orthogonal parts of the request lifecycle together. When requests-to-data-transactions is a 1-1 relationship it's a convenience, but once you step out of that I think it's better to do things by hand.

My main app has a simple layer structure of "request > business logic > data layer". All of the business logic functions start a SQLAlchemy transaction using a Python decorator that is ~10 lines of SQLAlchemy code.


Check out how things are wired with in this Pyramid cookie cutter.

https://github.com/Pylons/pyramid-cookiecutter-starter/blob/...

I think same approach should work well for flask - just tie the session to the request object on first access.

Side note: I think we met on IRC few years ago ;-)


There is a lot going on in that cookie cutter. The “transaction” library for transaction management, wired into Pyramid with pyramid_tm, wired into SQLAlchemy with zope.sqlalchemy. Works nicely for Pyramid, but I don’t know how that translates to Flask.


Think on it like a middleware that wraps a transaction. It's fine, till you start consuming other Io based services (elastic/solar/...) And you find that this blocks one connection per request..


Doesn't, I wanted to show how it can be wired to request.


> hundreds of millions of users

Just curious, but how many employees? How many specifically are responsible for writing code that interacts with SQLAlchemy and/or the objects it creates. While I don't doubt it is _possible_ to make SQLAlchemy work great, I think it's mandatory to overstaff with engineers/programmers in order to do so.


> the same type of comment that blames ORM-s for ineffective SQL being used because people don't know how to write the queries in the first place.

I don't think this blame is as unfair as you imply, even if if were the primary cause of ORM induced performance disasters. In my experience the main selling point of ORMs has always been the promise for people who fancy themselves programmers to leverage DB technology without having to sully themselves or their cherished OO model by learning the relational model, databases or SQL. The other, equally false promise has been that the ORM would somehow allow you to "abstract" over your concrete DB so you could easily swap out one for another. Both these ideas now seem utterly misguided to me (although many years ago I probably fell for the same nonsense). For all of SQL's warts, I find the relational model conceptually much superior to OO. Moreover since the way data moves in and out of the DB (and the consistency/transactional and performance constraints around this) is often more architecturally central than the python code that orchestrates it, what ORMs are trying to do, shoe-horning the well thought out relational model into the badly thought out OO model seems utterly backwards to me -- spanning the cart before the horse so to speak. Now I will readily admit that SQLAlechemy is about the least bad ORM I have encountered, but I still absolutely fail to see what the point of it's ORM part is (I can see a bit of the appeal of being able to dynamically construct or compose queries with core, although I'd try hard to avoid the necessity).

Clearly you know what you are doing with hundreds of millions of users and you are not the sole large user of SQLAlchemy either, so I am evidently missing something. Can you maybe give one or two examples of things that are much easier/robust/... because of SQLAlchemy, compared to just writing SQL, in separate files, and executing that from you python code?

To be clear, there are various things that I think are broken about how databases work, and I'd really like to see them fixed but I don't see how SQLAlchemy helps much with most of the below (with the notable exception of query parameterization).

- no high level support for migrations: it should be possible to get a readable aggregate schema defintion out for example and it isn't

- parameterization and composition of queries is awkward and limited

- (for most DBs) lack of in-process support

- bad testing framework support


> Blame users that use...

I am, and I'm warning them, whats wrong with that?


Example: a team I was on was using SQLAlchemy and we had a couple of services, one that did read only queries, and another that did writes and occasionally DDL (rebuilding the table on the fly, (which is unwise but that's a topic for another post.)) Because we didn't understand SQLAlchemy's default semantics, the read side would effectively open a transaction and leave it open, meaning that subsequent DDL statements would block, then causing the entire table to lock. (I think our solution was to use `autocommit=True`, which iirc is now deprecated.)

That said, I really like SQLAlchemy.


The trick is to always auto close transactions at the end of the request.


I feel you are mixing totally different libraries and blaming SQLAlchemy for it. I use SQLAlchemy a lot and go through its documentation. I use it with FastAPI or other micro-frameworks.

But I need SQLAlchemy to really work with ... SQL! And it is solid at that. The glue libraries between bigger ones like SQLAlchemy and Flask usually are less maintained and have less eyeballs on them. These are weaker points IMHO, not SQLAlchemy.


Do you mind elaborating on these what these foot guns are and how to avoid them?


The flipside of this is that SQLAlchemy teaches people about pooling, threading, processes etc. .... but yes, by painful trial in some cases.


Please describe footguns and how to avoid. Thanks!


One of the most interesting 1.4/2.0 changes is first-class asyncio support, not just for core (the query builder) but for the ORM layer as well: https://docs.sqlalchemy.org/en/14/changelog/migration_14.htm...

As this notes, there's several changes you have to make to your assumptions around the ORM interface. SQLAlchemy, for better or worse, supports "lazy loading" of relationships on attribute access - that is, simply accessing `user.friends` would trigger a query to select a user's friends. This kind of magic is at odds with async/await execution models, where you would instead need to run something like `await user.get_friends()` for non-blocking i/o.

It looks like they've done some good work in making the ORM layer work reasonably well with these limitations (https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.htm...), but I wonder if removing "helpful magic" like this will push more people to stick with the query-builder, rather than the ORM.


It was really interesting. I saw the discussions about it where zzzeek was like, “oh, what, I could kinda just make this wrapper thing. Wait, what am I missing here? Nope, it works”

It was a late entry into the transition rather than a planned thing from what I could tell.

Edit: post is here https://gist.github.com/zzzeek/2a8d94b03e46b8676a063a32f7814...


So the secret sauce was just greenlet?


Thanks for the hardwork zzzeek! I've been using sqlalchemy for 5+yrs now and it's great to work with!

I like SQL and I like the ORM. When things go tough like doing complicated JOINs or OLAP. I just go raw sql. If it's OLTP, updates and simple lookups, ORM makes the code readable.

The alembic migration is also great! I can craft the models to design with postgres features (index, multi key index, pk uuid) and etc..

I'd would love sqlalchemy to invest more on scaling. Although scaling is an entire book of discussion. Not much resources are there for handling multiple db, sharding (maybe too much to ask).

My opinion is, love SQL and Love the ORM. You'll need both to appreciate it's power. It's like learning VIM, a long term investment tool and hard to :wq


Lots of major changes, so I hope it doesn't create a schism.

Do SQLAlchemy users appreciate how lucky they are? I generally prefer to use c#/.net but the 3 Microsoft ORMs (linq-to-sql, ef, ef.core) are all half baked. I don't know much about ActiveRecord, Django or other ORMs.

I wish I could have this sort of feature set and dynamic abilities that I get in sqlalchemy on the .net side. I say that as someone who loves SQL but appreciates the conveniences of a powerful ORM.


There have been so many times where I started a new project in a new shiny language and ended up coming back to Python because I love SQLAlchemy (and Pyramid) way too much. It's one of the few ORMs out there that doesn't hate you for liking SQL.


> doesn't hate you for liking SQL

This, just 100% this. I'm not a major fan of SQL or anything, however I get very hesitant to use any ORM that tries to imply that you don't need to understand how SQL or the underlying database actually works.

Any time I hear someone say that "SQL won't scale for my app", I assume that some rudimentary query analysis would solve 99% of problems.


Same here, I'd like to dive into golang. But there is nothing as good as SQLAlchemy. Maybe with generics we will start seeing some projects that can fill the gap.


Mmm yeah I had the same experience. My end conclusion for Golang was "don't use an ORM".


This exactly, I use github.com/doug-martin/goqu for some of my projects though. Only the query builder part though.


Django ORM has grown on me. I think it isn't quite as powerful as SQLAlchemy, but it is really quite decent. SQLAlchemy lets me think closer to SQL if I want to, and that is nice as well. I tend to use the Django ORM the most lately... but all 3 of those are rather full featured and accomplish the average user's needs. SQLA is still my favorite ORM anywhere.


same! i like that SQLAlchemy made me actually learn SQL.

but for like, boilerplate stuff, Django's ORM works so well. also, because it's all within the same "framework", every single library can adapt super well to it and more a shitton of dumb code from your hands.


I love reading about the thumping and bumping of modern frameworks. Meanwhile tons of great software gets written in rails and Django while startups cargo cult along. Maybe I am being curmudgeonly here, but, I think it’s true.


The description of the new direction for 2.0 sounds great, but as someone who isn't using SQLAlchemy currently (I prefer peewee) I'd be curious to see what it looks like without the assumption that I'm migrating from a previous version. I guess that's not yet fully settled?


that's what the tutorial is for - assumes 2.0 style usage and nothing else:

https://docs.sqlalchemy.org/en/14/tutorial/index.html


OT: but I’m seeing a weird layout issue on the new docs on mobile (iOS) where the main content is dropping below the bottom of the sidebar position.

https://imgur.com/a/U4AbEOZ


i would LOVE if someone could help us get the site to work on mobile. I can point folks to our scss and all of that and get it all going if someone can help. agree mobile is mostly unusable and new layout has probably some more problems since I started using flex layout for which I am unqualified to be touching.


Please drop me an email (you can find it on my profile here). SQLA is my go-to library for lots of stuff, I would love to help :)


I'm not seeing an email address can you tweet me on twitter?

Basically if someone can show me how to make the sidebar vanish on a mobile browser i think that's the main thing. i might have looked at this some time ago and given up.


Here's[1] the CSS I use on my blog to treat the table of contents differently depending on screen width. I'm probably also "unqualified to be touching" this stuff so take it with a grain of salt.

[1] https://gitlab.com/kevinjfoley/assorted-array/-/blob/master/...


Indeed, I must removed it a while ago. Please ping me at me[at]cpoul.dev and I'll check it asap


Happy to help with this. Will take a look tomorrow.


This is documentation for 1.4:

https://docs.sqlalchemy.org/en/14/contents.html

If I understand correctly, 2.0 will basically be 1.4 but without the DeprecationWarnings, and without the deprecated APIs - at least, that's how I've been coding for 2.0 so far, initially to benefit from asyncio support.


My company is migrating from peewee to SQLAlchemy because the missing async support and we faced many bugs related to multi-threading/multi-processing.


I've been building data analysis tools on top of SQLAlchemy's declarative system over the past couple years. It's got to be the most well documented, carefully designed library I've ever interacted it :).

It looks like most of the changes in 2.0 are aimed at the ORM system, which makes sense. I think a lot of complaints that come up have more to do with the complexity of interacting with a SQL database, so appreciate the effort in the docs not just laying out an API, but essentially educating around the problem domain.


I got my start in IT as a DBA. SQL and good table design come naturally. So when it came time to join a company as a developer I sprang for raw SQL only to find this SQLalchemy ORM. I couldn’t wrap my head around it at first. All the ceremony to get to what I wanted to do just got in the way. I felt trapped. But it’s ubiquitous and I have to adapt. So I’m learning. And there’s a whole lot of benefit being able to define a model and have it render on any database. Paired with Alembic migrations can be pretty simple.

I miss having 100% control over the queries, knowing exactly how they looked and analyzing each before committing them to main. But nobody has the time to hand craft artisanal queries and leverage every intricate detail of a database when they’re trying to move ever faster and ship features.


Got my start in classic ASP and we authored so many queries using string concatenation. When ASP.NET came around, we shifted our database thinking to stored procedures, so I could continue to write "raw" SQL. I'm no DBA, but I learned so much about authoring efficient and complex queries - I genuinely love working with SQL.

These days, though, I use Django's ORM. I can often get it to do what I want, bit it sure makes me miss raw queries. Thankfully, we still write the occasional view for complex joins and then just map that to a read-only model in Django - so I sometimes get the best of both worlds.


Yeah the shop I got my start in shipped .Net on the server and client side with stored procedures tied to that code on the DB side. So it was a sort of API abstraction that slowed devs to call update_users(...) from the app side and get raw sql performance and such. To a newbie like me at the time it may have been an old school approach but it made total sense.

Pretty cool to see someone get a similar start to me. I’ve not played with the Django ORM yet. Still getting used to SQLalchemy.


Exactly! It made total sense at the time, and still does to some degree.

I'm sure SQLAlchemy is lovely, but after spending so much time with Django ORM, I'm finding it hard to shift my thinking any time I try to look at SA. I'm sure if I had started using SA first, the opposite would also be true, so we just took different paths. I'm guessing you'll enjoy SA if you love SQL, based on all the other comments I'm seeing.


Good to see the removal of autocommit, with the detailed discussion of the design decision. I've always felt a little uneasy when going with autocommit=False in my old projects, thinking that the default of autocommit=True was the "blessed" way to use SQLAlchemy.

EDIT: Looks like autocommit=True has never been the default, must have been some possibly 3rd party documentation.


I'm particularly interested in the support for dataclasses. It's going to make modeling the application while decoupling from the data layer itself easier, I think.


Same here. This is something that I believe that Sqlalchemy should fundamentally take a bet on.

Dataclasses are now inherently part of python. They are also used across the ecosystem (e.g. pydantic). It makes sense to use them for model declaration.

Hope Sqlalchemy becomes dataclasses first...and not just as a compatibility feature.


zzzeek is an absolute beast and the SQLAlchemy codebase is a gem to read.


I used their ORM library in a medium sized project. As the project grew, it turned a nightmare scenario. After that experience, I stick to core and raw SQL queries. I hope 2.0 brings some meaningful changes.


Agreed. I will never recommend an ORM, things simply spiraled out of control for medium to large-ish projects that had more than 2 developers. Even with "best practices", code ended up having a mix of raw SQL and ORM-style queries, and it was hard to reason about the code.

Since switching to asyncpg [0] these problems have vanished. It commands a deeper knowledge of actual SQL, but I would argue this knowledge is absolutely necessary and one of the disadvantages of an ORM is that it makes the SQL that is eventually run opaque.

Not sure if there are equivalents to asyncpg for other RDBMS's.

[0]: https://github.com/MagicStack/asyncpg


Why is nobody ever writing raw SQL? I've never understood why ORMs are sine qua min.


Because you won't use raw SQL on any serious project forever. You'll end up building an ORM yourself or at least a query builder.


Do you mean "sine qua non"?

People don't generally like writing raw SQL because you have to map the results to and from your programming language. So at a minimum, you need a query builder that does some minimal and flexible mapping.


Ha. Yes. I apologise for the auto correct.

Building basic CRUD apps as a hobbyist, I've just never had that problem. My app needs some data. I fire of a query af psocopg2 gives me back my data.

I know I'm the least experienced. So I'm not arguing. I just don't understand it

(I work mostly be with data / BI, so I'm familiar with SQL)


Imagine be you build a "search users" page with many filters, you will end up inventing query builder at one point. The more dynamic data you need, you will end up reinventing those systems. The worst that can happen is when you start concatenating sql queries together.ORM/query builders save you that headache.


Yes..i have concatenated SQL i must admit.


How do people who prefer to write raw SQL usually deal with:

1. Query building, particularly when the query needs to be dynamic based on user input? Do you end up concatenating strings together or do you use a separate query builder?

2. Coalescing result sets produced by JOINs back into object form? Example: if you want to fetch users along with all their posts your query will return multiple rows per user, but when working with objects in your app you want each user to have a list of posts so you can simply say users.posts.

3. Property change tracking? Example: different parts of your app might update different properties for each user. If the user's email and last_login changes you need to write one query. If the user's password changes you need to write a different query. If the user's email, name, and location changes you need to write another query. An ORM with change tracking will figure out exactly which properties have been modified and issue the correct SQL to update only the changed properties. When working with raw SQL do you simply end up writing different queries for each possible permutation of changes?


1) concatenating. Not always possible in complex cases. 2) I don't think it's that hard to write 'select posts from user where user = ?' 3) I prefer functional to oop. In my small apps writing a function with a transparent SQL have worked fine.


Re: 2) For a single user, that's fine. What if you are trying to fetch posts for N users at a time? Issue N separate queries?


Usually by writing their own lightweight ORM as part of their application. ¯\_(ツ)_/¯


It's easy to write raw SQL queries. It's a lot harder to generate SQL programmatically, correctly, based on a variety of requirements your app may have. ORMs do this well.


You can write direct SQL queries using SQLAlchemy Core without engaging the ORM layer at all. It protects you from a few of the footguns associated with raw SQL but still provides most of the power and features you would want from raw SQL. If you decide that some part of your system could use the ORM then it is easy to integrate it into your existing Core work or visa versa.


If you want to use an SQL database with an OOP language like Python, you will end up using an ORM library, or writing one.


For record/object-at-a-time, an ORM saves tons of code.

SQL's design is optimized for processing unordered sets of records.


Raw sql is not composable.


There is a middle-ground between writing SQL statement strings in your code, and a full-blown ORM: query builders. At least in my experience with small to medium projects, these have far fewer footguns while keeping the code composable and readable. Here's one for Python: https://github.com/kayak/pypika


In SQLAlchemy, you can choose to use only the query builder without using the ORM part at all. The query builder in SQLAlchemy is very well documented[1] and its use without ORM has the same level of support as its ORM counterpart. SQLAlchemy call their query builder a SQL Expression Language[2], e.g.

    # Boilerplate
    from sqlalchemy import create_engine, MetaData, Table
    engine = create_engine('sqlite:///:memory:', echo=True)
    metadata = MetaData()

    customers = Table('customers', metadata, autoload_with=engine)
    query = customers.select([customers.c.id, customers.c.fname, customers.c.lname, customers.c.phone])
    # => SELECT customers.id, customers.fname, customers.lname, customers.phone FROM customers

    conn = engine.connect()
    conn.execute(query)
    # => [(1, "Foo", "Bar", "12345678"), ...]
ActiveRecord also has Arel which can be use as a standalone. Documentation is a bit more sparse compared to SQLAlchemy Core or its ActiveRecord ORM counterpart, though.

[1]: https://docs.sqlalchemy.org/en/13/index.html

[1]: https://docs.sqlalchemy.org/en/13/core/tutorial.html


This is where I tend to gravitate. Even using ActiveRecord, what people consider the “ultimate” ORM, I still find it gets in my way. None of my colleagues write SQL queries, it’s all completely abstracted away behind ActiveRecord, and we are constantly running into issues where someone forgets an `includes` and we end up with an N+1 error. N+1 errors are so easy to slip into with an ORM, it’s pretty much one of our biggest performance issues.

This seems like the biggest trend the more I use RAILs: it’s great to iterate and prototype, but as soon as you hit scale the maintainability becomes a big issue. I’m not saying other web frameworks are immune or that this is a problem that cannot be solved, it’s just all of these abstractions and cascade of configuration objects to make RAILs do what you want end up getting in the way.

A query builder is the best of both worlds: semantics that resemble raw SQL with some ability for composition.


I'll just say that SQLAlchemy can also be used effectively as just a query builder in places that you need it.


Thanks


And the next day on HN people lament ORM's: https://news.ycombinator.com/item?id=26188765


I kind of wish some of the SQLAlchemy core devs spent a bit of time using ActiveRecord to appreciate how an ORM can make defining and querying relations straightforward and easy.

Right now, using SQLAlchemy creates a "now you have two problems" kind of workflow: first you figure out the SQL need, then you spend at least that long figuring out how to write it with the ORM. I never felt this way about ActiveRecord.


> I kind of wish some of the SQLAlchemy core devs spent a bit of time using ActiveRecord to appreciate how an ORM can make defining and querying relations straightforward and easy.

The main SQLA developer, and the whole team, has been doing this now for almost 3 decades, has presented on and had thousands of serious detailed technical discussions on the subject with a diverse range of industry participants, and I can assure you is WELL aware of how ActiveRecord works and all of the patterns around it.


And? Is it a documentation problem then, that the SQL alchemy devs don't think it's worth the time to explain to devs familiar with active record what they gain using SQLA?


you get to think in terms of SQL and relational algebra is the basic idea. Here's one of my talks that discusses this: https://www.sqlalchemy.org/library.html#handcodedapplication...

as for "it's hard to translate from SQL to ORM" that's a huge part of what 1.4/2.0 is trying to make more obvious. But to be fair I get very few "how do I write this in SQL" questions these days as things are pretty 1-1 in any case now; the remaining weak spots (awkwardness with unions, support for table-valued expressions) are addressed in 1.4/2.0 and the relatively awkward "session.query()" model is now legacy.


Search "sqlalchemy activerecord" and you will find tutorials, compare and contrast posts, pros and cons, and several implementations for sqlalchemy and many other languages and libraries using the activerecord pattern.

SQLAlchemy, and Python in general, is highly extensible, it can do the ActiveRecord pattern and many other patterns depending on the data, not just the needs of a content publication system.

Here's a couple random AR/SQLA implementations I plucked from DDG:

https://pypi.org/project/sqlalchemy-mixins/

https://pypi.org/project/Flask-ActiveRecord/


Honestly that’s my favorite part about SQLAlchemy—you need to think in SQL. It doesn’t try to hide anything behind leaky abstractions, it’s WYSIWYG. Just gives us the good parts of an ORM while sidestepping the whole “ORMs are the Vietnam of Computer Science” problem.


I think this is fundamental design difference between SQLAlchemy and ActiveRecord in general. SQLAlchemy uses Data Mapper pattern which does not necessary map 1:1 to the row in the database, so you have to deal with the whole (Object, Mapper, Data) tuple instead of just Object in case of Active Record pattern (of which Rails' ActiveRecord were based on).

This means SQLAlchemy does not try to hide away SQL, which is beneficial in dealing with complex queries. In Rails ActiveRecord you could use arel in such case but being a query builder it lose the benefit of ActiveRecord, whereas in SQLAlchemy it could be done relatively easily within the ORM layer (arel equivalent in SQLAlchemy would be its Expression Language). On the other hand, some things that are complicated in ActiveRecord can also be trivial to implement in SQLAlchemy (e.g., column_property[1])

[1]: https://docs.sqlalchemy.org/en/13/orm/mapped_sql_expr.html


How does it compare to the JPA and active records?


Modeling their migration off the Python 2 -> 3 migration. Bold move, let's see how that works out for them.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: