-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
How does SQLAlchemy keep track of the right columns?
The main way used by SQLAlchemy to figure out what columns are what is using a thing called "column correspondence", which was first introduced in a blog post many years ago: https://techspot.zzzeek.org/2008/01/23/expression-transformations/
In this script, we can see this as follows. First we have the Table
object that we've created a mapping towards:
>>> A.__table__
Table('a', MetaData(), Column('id', Integer(), table=<a>, primary_key=True, nullable=False), Column('x', Integer(), table=<a>), Column('y', Integer(), table=<a>), Column('z', Integer(), table=<a>), schema=None)
# the "x" column
>>> A.__table__.c.x
Column('x', Integer(), table=<a>)
then we have the subquery/union thing we've made:
>>> u1 = select(literal(True), A).filter(A.id != None)
>>> u2 = select(literal(False), A).filter(A.id != None)
>>> subq = union_all(u1, u2).subquery()
when we build out a construct like that, every time the column named "x" gets accessed to create a new SQL construct that can be SELECTed from, we make a copy of that column in terms of the new table; above it's generated from the subquery that we made based on the two unions:
>>> subq.c.x
Column('x', Integer(), table=<anon_1>)
when this column was generated, it keeps track of where it came from:
>>> subq.c.x._proxies
[Column('x', Integer(), table=<a>), Column('x', Integer(), table=<a>)]
(the column repeated twice has to do with the fact that the subquery is against a union that includes this column in two separate SELECT statements)
so by setting up this "proxies" relationship, we can track the "x" column between the A.__table__
and the subquery in both directions; this is not based on the "name" of the column at all, it's based on how the subquery was generated from the table object and is all done structurally:
# the subquery gives us the column that came from the "x" column on the table
>>> subq.c.corresponding_column(A.__table__.c.x)
Column('x', Integer(), table=<anon_1>)
# the table can give us which of its columns relate to this column in the subquery
>>> A.__table__.c.corresponding_column(subq.c.x)
Column('x', Integer(), table=<a>)
using that general mechanical feature, the ORM builds out a huge bunch of logic that I've been fixing bugs in for over a decade that knows how to relate the "selectable" to which A
is mapped, i.e. the Table
, to the columns that happen to be in this subquery , so that it knows which columns to put in the SELECT statement it renders and it knows how to get them back from a result set and populate them into a new A
object.
Wiki post extracted from this discussion