14

Suppose we have a PostgreSQL database with two tables A, B.

table A columns: id, name
table B columns: id, name, array_a

The column array_a in table B contains a variable length array of ids from table A. In SQLAlchemy we have two classes that model those tables, say class A and B.

The following works fine to get all the objects A that are referenced in an object B:

session.query(A).join(B, A.id == func.any(B.array_a)).filter(B.id == <id>).all()

How can we create a relationship in B referencing the objects A corresponding to the array? Tried column comparators using the func.any above but it complains that ANY(array_a) is not a column in the model. Specifying the primaryjoin conditions as above doesn't seem to cut it either.

2 Answers 2

13

This anti-pattern is called "Jaywalking"; and PostgreSQL's powerful type system makes it very tempting. you should be using another table:

CREATE TABLE table_a (
    id SERIAL PRIMARY KEY,
    name VARCHAR
);

CREATE TABLE table_b (
    id SERIAL PRIMARY KEY,
    name VARCHAR
);

CREATE TABLE a_b (
    a_id INTEGER PRIMARY KEY REFERENCES table_a(id),
    b_id INTEGER PRIMARY KEY REFERENCES table_b(id)
)

Which is mapped:

from sqlalchemy import *
from sqlalchemy.dialects import postgresql
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import *

Base = declarative_base()

a_b_table = Table("a_b", Base.metadata,
    Column("a_id", Integer, ForeignKey("table_a.id"), primary_key=True),
    Column("b_id", Integer, ForeignKey("table_b.id"), primary_key=True))

class A(Base):
    __tablename__ = "table_a"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class B(Base):
    __tablename__ = "table_b"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    a_set = relationship(A, secondary=a_b_table, backref="b_set")

example:

>>> print Query(A).filter(A.b_set.any(B.name == "foo"))
SELECT table_a.id AS table_a_id, table_a.name AS table_a_name 
FROM table_a 
WHERE EXISTS (SELECT 1 
FROM a_b, table_b 
WHERE table_a.id = a_b.a_id AND table_b.id = a_b.b_id AND table_b.name = :name_1)

If you are stuck with the ARRAY column, your best bet is to use an alternate selectable that "looks" like a proper association table.

from sqlalchemy import *
from sqlalchemy.dialects import postgresql
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import *

Base = declarative_base()


class A(Base):
    __tablename__ = "table_a"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class B(Base):
    __tablename__ = "table_b"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    array_a = Column(postgresql.ARRAY(Integer))

a_b_selectable = select([func.unnest(B.array_a).label("a_id"),
                         B.id.label("b_id")]).alias()

A.b_set = relationship(B, secondary=a_b_selectable,
                          primaryjoin=A.id == a_b_selectable.c.a_id,
                          secondaryjoin=a_b_selectable.c.b_id == B.id,
                          viewonly=True,)

B.a_set = relationship(A, secondary=a_b_selectable,
                          primaryjoin=A.id == a_b_selectable.c.a_id,
                          secondaryjoin=a_b_selectable.c.b_id == B.id,
                          viewonly=True)

which gives you:

>>> print Query(A).filter(A.b_set.any(B.name == "foo"))
SELECT table_a.id AS table_a_id, table_a.name AS table_a_name 
FROM table_a 
WHERE EXISTS (SELECT 1 
FROM (SELECT unnest(table_b.array_a) AS a_id, table_b.id AS b_id 
FROM table_b) AS anon_1, table_b 
WHERE table_a.id = anon_1.a_id AND anon_1.b_id = table_b.id AND table_b.name = :name_1)

And obviously, since there's no real table there, viewonly=True is neccesary and you can't get the nice, dynamic objecty goodness you would if you had avoided jaywalking.

Sign up to request clarification or add additional context in comments.

5 Comments

I think relation is a deprecated alias for relationships since a few versions.
The selectable does what we need, yes.
I'm not sure it's jaywalking when PostgreSQL arrays are queryable and have a well defined type. It seems more an SQLAlchemy limitation with how it handles PostgreSQL arrays.
Thanks for the SQL Antipatterns reference. I picked up the book today and it's fabulous.
@Kiran Jonnalagadda: there's nothing wrong with using the array type; it's certainly very useful, even via sqlalchemy's types. The limitation is not because the Array type is poor, it isn't; but rather, enforcing data integrity through composite types is both difficult to express and difficult to get right. With a simple, single cell-per-referent arrangement, the consistency constraint can be expressed as just references footable(foocolumn), that's what makes it jaywalking.
5

Or else simply you can join explicitly like below:

class A(Base):
    __tablename__ = "table_a"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class B(Base):
    __tablename__ = "table_b"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    array_a = Column(postgresql.ARRAY(Integer))

    a_ids= relationship('A',primaryjoin='A.id == any_(foreign(B.array_a))',uselist=True)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.