0

Following on from this question: Django Postgresql ArrayField aggregation

I have an ArrayField of Categories and I would like to retrieve all unique values it has - however the results should be filtered so that only values starting with the supplied string are returned.

What's the "most Django" way of doing this?

Given an Animal model that looks like this:

class Animal(models.Model):
    # ...
    categories = ArrayField(
        models.CharField(max_length=255, blank=True),
        default=list,
    )
    # ...

Then, as per the other question's answer, this works for finding all categories, unfiltered.

all_categories = (
    Animal.objects
    .annotate(categories_element=Func(F('categories'), function='unnest'))
    .values_list('categories_element', flat=True)
    .distinct()
)

However, now, when I attempt to filter the result I get failure, not just with __startswith but all types of filter:

all_categories.filter(categories_element__startswith('ga'))
all_categories.filter(categories_element='dog')

Bottom of stacktrace is:

DataError: malformed array literal: "dog"
...
DETAIL:  Array value must start with "{" or dimension information.

... and it appears that it's because Django tries to do a second UNNEST - this is the SQL it generates:

...) WHERE unnest("animal"."categories") = dog::text[]

If I write the query in PSQL then it appears to require a subquery as a result of the UNNEST:

SELECT categories_element
FROM (
    SELECT UNNEST(animal.categories) as categories_element
) ul
WHERE ul.categories_element like 'Ga%';

Is there a way to get Django ORM to make a working query? Or should I just give up on the ORM and use raw SQL?

3
  • can maybe use docs.djangoproject.com/en/1.9/ref/models/querysets/#extra to get the subquery in there. annotate seems kind of wrong (it's usually for aggregation type functions) Commented Apr 20, 2016 at 17:13
  • Thanks @Anentropic - I did have a look at extra but couldn't see a way to add the subquery. The answer below is probably the "truest" - the schema is not fit for purpose really. I've implemented a mini query builder that selects and sorts from the ArrayFields and appears to be working nicely. Commented Apr 21, 2016 at 10:45
  • I wondered if you could put the subquery in tables parameter Commented Apr 21, 2016 at 12:17

1 Answer 1

2

You probably have the wrong database design.

Tip: Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.

http://www.postgresql.org/docs/9.1/static/arrays.html

Sign up to request clarification or add additional context in comments.

3 Comments

Yes I completely agree that the design is not optimal. But it's not my design and I'm stuck with it for now :( Poor Animals!
In that case there is no reason why shouldn't use a raw query. Raw queries are not evil.
Thanks - I'll give it a go!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.