When implementing relationships in SQLAlchemy models, you will generally need
to cross-reference one model from another, ideally in a bidirectional manner.
Historically, the mechanism to do this has been the relationship.backref
keyword. However, as noted in the SQLAlchemy
documentation, this keyword should now be considered legacy
and the relationship.back_populates migration preferred instead. There are a
couple of reasons for this. Quoting from the docs:
The
relationship.backrefkeyword should be considered legacy, and use ofrelationship.back_populateswith explicitrelationship()constructs should be preferred. Using individualrelationship()constructs provides advantages including that both ORM mapped classes will include their attributes up front as the class is constructed, rather than as a deferred step, and configuration is more straightforward as all arguments are explicit. New PEP 484 features in SQLAlchemy 2.0 also take advantage of attributes being explicitly present in source code rather than using dynamic attribute generation.
The SQLAlchemy docs include a guide on migrating tables, (as well as a great overview on relationships in general), however, this guide assumes you are using foreign key constraints in your models and using the auto-incrementing primary key field as your reference. To be honest, you probably should be using foreign keys since without these you have no way to enforce referential integrity but say you can’t use them or don’t want to, how does one migrate? The below examples should hopefully guide you in migrating these.
One-to-Many
The one-to-many pattern is the simpler one to migrate. Consider the following example:
import uuid
import sqlalchemy as sa
from sqlalchemy import orm
BASE = orm.declarative_base()
class User(BASE):
__tablename__ = 'user'
id = sa.Column(sa.Integer, primary_key=True)
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
class Address(BASE):
__tablename__ = 'addresses'
__tableargs__ = (
sa.Index('addresses_user_uuid_idx', 'user_uuid'),
)
id = sa.Column(sa.Integer, primary_key=True)
user_uuid = sa.Column(sa.String(36), nullable=False)
user = orm.relationship(
'User',
backref='addresses',
primaryjoin='Address.user_uuid == User.uuid',
foreign_keys=user_uuid,
)
def main():
engine = sa.create_engine('sqlite://')
session = orm.sessionmaker(bind=engine)()
BASE.metadata.create_all(engine)
user = User(name='John Doe', uuid=str(uuid.uuid4()))
session.add(user)
session.commit()
address = Address(user_uuid=user.uuid)
session.add(address)
session.commit()
print('# Users')
for user in session.query(User).all():
print(f'User: name={user.name}')
print()
print('# Addresses')
for address in session.query(Address).all():
print(f'Address: user={address.user_uuid}')
print()
if __name__ == '__main__':
main()
Before looking at migrating from backref to back_populates, it’s worth
understanding what’s going on here. You’ll note that we have a relationship
from the Address model back to the User model. This has the backref
keyword set on it, allowing you to reference related addresses from instances
of User, however it also has primaryjoin and foreign_keys keywords set.
These are necessary because the user_uuid column does not have a foreign key
constraint present. You can confirm this by inspecting the schemas generated by
SQLAlchemy.
❯ python
>>> import test_o2m # name of module saved locally
>>> from sqlalchemy import schema
>>> print(schema.CreateTable(test_o2m.User.__table__))
CREATE TABLE "user" (
id INTEGER NOT NULL,
uuid VARCHAR(36) NOT NULL,
name VARCHAR,
PRIMARY KEY (id)
)
>>> print(schema.CreateTable(test_o2m.Address.__table__))
CREATE TABLE addresses (
id INTEGER NOT NULL,
user_uuid VARCHAR(36) NOT NULL,
PRIMARY KEY (id)
)
Because we don’t have a foreign key present, SQLAlchemy isn’t able to determine
the join condition. This necessitates the addition of the primaryjoin and
foreign_keys keywords to guide SQLAlchemy on resolving these relations.
Removing these keywords will yield an error message.
sqlalchemy.exc.NoForeignKeysError: Can't find any foreign key relationships between 'addresses' and 'user'.
The only other unusual thing here is that we’re using a field other than id
to reference things, namely a uuid field on both models. There’s no real
reason to do this here but this was the case in the models that I had to
migrate and it makes things a little more complicated and “real-world’y”.
Now that we understand what’s going, it’s time to look at switching from
backref to back_populates. It turns out to be pretty simple for one-to-many
models like this.
import uuid
import sqlalchemy as sa
from sqlalchemy import orm
BASE = orm.declarative_base()
class User(BASE):
__tablename__ = 'user'
id = sa.Column(sa.Integer, primary_key=True)
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
addresses = orm.relationship(
'Address',
primaryjoin='User.uuid == Address.user_uuid',
back_populates='user',
foreign_keys='Address.user_uuid',
)
class Address(BASE):
__tablename__ = 'addresses'
__tableargs__ = (
sa.Index('addresses_user_uuid_idx', 'user_uuid'),
)
id = sa.Column(sa.Integer, primary_key=True)
user_uuid = sa.Column(sa.String(36), nullable=False)
user = orm.relationship(
'User',
primaryjoin='Address.user_uuid == User.uuid',
back_populates='addresses',
foreign_keys=user_uuid,
)
def main():
engine = sa.create_engine('sqlite://')
session = orm.sessionmaker(bind=engine)()
BASE.metadata.create_all(engine)
user = User(name='John Doe', uuid=str(uuid.uuid4()))
session.add(user)
session.commit()
address = Address(user_uuid=user.uuid)
session.add(address)
session.commit()
print('# Users')
for user in session.query(User).all():
print(f'User: name={user.name}')
print()
print('# Addresses')
for address in session.query(Address).all():
print(f'Address: user={address.user_uuid}')
print()
if __name__ == '__main__':
main()
Looking at the diff for these two files, we see the following:
diff --git test_o2m.py test_o2m.py
index 4f23221..f865045 100644
--- test_o2m.py
+++ test_o2m.py
@@ -14,6 +14,13 @@ class User(BASE):
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
+ addresses = orm.relationship(
+ 'Address',
+ primaryjoin='User.uuid == Address.user_uuid',
+ back_populates='user',
+ foreign_keys='Address.user_uuid',
+ )
+
class Address(BASE):
__tablename__ = 'addresses'
@@ -26,8 +33,8 @@ class Address(BASE):
user = orm.relationship(
'User',
- backref='addresses',
primaryjoin='Address.user_uuid == User.uuid',
+ back_populates='addresses',
foreign_keys=user_uuid,
)
So for the Address model, we have simply replaced the backref keyword with
back_populates. For the User model, we have added a new relationship. This
is mostly the inverse of the relationship on the Address model and only the
foreign_key argument points to the same field (a foreign field in the
Address model case).
Overall, this isn’t a difficult migration. Migrating many-to-many models could be a little trickier though. Let’s see.
Many-to-Many
Now consider the following example:
import uuid
import sqlalchemy as sa
from sqlalchemy import orm
BASE = orm.declarative_base()
association_table = sa.Table(
'association',
BASE.metadata,
sa.Column('parent_uuid', sa.String(36), primary_key=True),
sa.Column('child_uuid', sa.String(36), primary_key=True),
)
class Parent(BASE):
__tablename__ = 'parents'
__table_args__ = (
sa.Index('parents_uuid_idx', 'uuid', unique=True),
)
id = sa.Column(sa.Integer, primary_key=True)
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
class Child(BASE):
__tablename__ = 'children'
__table_args__ = (
sa.Index('children_uuid_idx', 'uuid', unique=True),
)
id = sa.Column(sa.Integer, primary_key=True)
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
parents = orm.relationship(
'Parent',
secondary='association',
primaryjoin='Child.uuid == association.c.child_uuid',
secondaryjoin='association.c.parent_uuid == Parent.uuid',
backref='children',
)
def main():
engine = sa.create_engine('sqlite://')
session = orm.sessionmaker(bind=engine)()
BASE.metadata.create_all(engine)
parent = Parent(name='John Doe', uuid=str(uuid.uuid4()))
session.add(parent)
session.commit()
child = Child(name='Jimmy Doe', uuid=str(uuid.uuid4()))
session.add(child)
session.commit()
parent.children.append(child)
print('# Parents')
for parent in session.query(Parent).all():
children = [x.name for x in parent.children]
print(f'Parent: name={parent.name}, children={children}')
print()
print('# Children')
for child in session.query(Child).all():
parents = [x.name for x in child.parents]
print(f'Child: name={child.name}, parents={parents}')
print()
if __name__ == '__main__':
main()
Again, there are a couple of things to consider here. Firstly, since this is a
many-to-many relationship, we need a so-called join table (also known as a
association table in SQLAlchemy or a through table in Django). The
necessitates the use of the secondary keyword to indicate the association
table. Next up, once again we do not have a foreign key constraint present
meaning we need to inform SQLAlchemy how the join should be managed. The use of
the association table means we must specify not only primaryjoin but also
secondaryjoin: the former describes the relationship from Child to the
association table, while the latter describes the relationship from the
association table to Parent. Finally, once again we’re using a uuid field
for referencing to makes things a little more like the real world.
Now to switch from backref to back_populates once again.
import uuid
import sqlalchemy as sa
from sqlalchemy import orm
BASE = orm.declarative_base()
association_table = sa.Table(
'association',
BASE.metadata,
sa.Column('parent_uuid', sa.String(36), primary_key=True),
sa.Column('child_uuid', sa.String(36), primary_key=True),
)
class Parent(BASE):
__tablename__ = 'parents'
__table_args__ = (
sa.Index('parents_uuid_idx', 'uuid', unique=True),
)
id = sa.Column(sa.Integer, primary_key=True)
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
children = orm.relationship(
'Child',
secondary='association',
primaryjoin='Parent.uuid == association.c.parent_uuid',
secondaryjoin='association.c.child_uuid == Child.uuid',
back_populates='parents',
)
class Child(BASE):
__tablename__ = 'children'
__table_args__ = (
sa.Index('children_uuid_idx', 'uuid', unique=True),
)
id = sa.Column(sa.Integer, primary_key=True)
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
parents = orm.relationship(
'Parent',
secondary='association',
primaryjoin='Child.uuid == association.c.child_uuid',
secondaryjoin='association.c.parent_uuid == Parent.uuid',
back_populates='children',
)
def main():
engine = sa.create_engine('sqlite://')
session = orm.sessionmaker(bind=engine)()
BASE.metadata.create_all(engine)
parent = Parent(name='John Doe', uuid=str(uuid.uuid4()))
session.add(parent)
session.commit()
child = Child(name='Jimmy Doe', uuid=str(uuid.uuid4()))
session.add(child)
session.commit()
parent.children.append(child)
print('# Parents')
for parent in session.query(Parent).all():
children = [x.name for x in parent.children]
print(f'Parent: name={parent.name}, children={children}')
print()
print('# Children')
for child in session.query(Child).all():
parents = [x.name for x in child.parents]
print(f'Child: name={child.name}, parents={parents}')
print()
if __name__ == '__main__':
main()
Looking at the diff for these two files, we see the following:
diff --git test_m2m.py test_m2m.py
index 948942f..d1dc0d4 100644
--- test_m2m.py
+++ test_m2m.py
@@ -25,6 +25,14 @@ class Parent(BASE):
uuid = sa.Column(sa.String(36), nullable=False)
name = sa.Column(sa.String)
+ children = orm.relationship(
+ 'Child',
+ secondary='association',
+ primaryjoin='Parent.uuid == association.c.parent_uuid',
+ secondaryjoin='association.c.child_uuid == Child.uuid',
+ back_populates='parents',
+ )
+
class Child(BASE):
__tablename__ = 'children'
@@ -41,7 +49,7 @@ class Child(BASE):
secondary='association',
primaryjoin='Child.uuid == association.c.child_uuid',
secondaryjoin='association.c.parent_uuid == Parent.uuid',
- backref='children',
+ back_populates='children',
)
This is nearly identical to the changes required for migrating the one-to-many
models. Once again, we simply replace backref with back_populates on the
relationship that is defined. We then define a new relationship on the
Parent model which has an inverse of the primary and secondary joins given
for the Child model relationship.
Summary
So that’s how you migrate from backref to back_populates on models without
foreign key constraints present. This likely isn’t broadly useful but hopefully
it will help someone that encounters this.
The code for this blog post can be found on GitHub.