Django: How can I select objects with the same field values?

Question

For example, I have a model like this:

class Doggy(models.Model):
    name = models.CharField(u'Name', max_length = 40)
    color = models.CharField(u'Color', max_length = 20)

How can I select doggies with the same color? Or with the same name :)

UPD. Of course, I don't know the name or the color in advance. I want to, kind of, group by their values.

UPD2. I'm trying to do something like this, but using Django:

SELECT * 
FROM table 
WHERE tablefield IN ( 
 SELECT tablefield
 FROM table 
 GROUP BY tablefield  
 HAVING (COUNT(tablefield ) > 1) 
)

UPD3. I'd like to do it via Django ORM, without having to iterate over the objects. I just want to get rows with duplicate values for one particular field.
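For reference, here is a minimal sketch of what the SQL in UPD2 returns, run with Python's built-in sqlite3 module. The doggy table, its sample rows, and the rows_sharing helper are assumptions made up for illustration; they are not part of the real schema.

```python
import sqlite3

# Hypothetical in-memory table standing in for the Doggy model's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doggy (id INTEGER PRIMARY KEY, name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO doggy (name, color) VALUES (?, ?)",
    [("Rex", "brown"), ("Fido", "brown"), ("Spot", "white"), ("Rex", "black")],
)


def rows_sharing(column):
    """Return rows whose value in `column` occurs more than once."""
    # Column names cannot be bound as parameters, so only allow known ones.
    if column not in ("name", "color"):
        raise ValueError("unexpected column: %r" % (column,))
    sql = (
        "SELECT id, name, color FROM doggy "
        "WHERE {col} IN ("
        "  SELECT {col} FROM doggy GROUP BY {col} HAVING COUNT({col}) > 1"
        ") ORDER BY id"
    ).format(col=column)
    return conn.execute(sql).fetchall()


same_color = rows_sharing("color")  # the two brown dogs
same_name = rows_sharing("name")    # the two dogs named Rex
```

The inner SELECT finds the values that occur more than once, and the outer SELECT returns every row carrying one of those values.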

Brad Martsberger · Accepted Answer · 2014-08-27 16:40:43Z

8

I'm late to the party, but here you go:

from django.db.models import Count

Doggy.objects.values('color', 'name').annotate(Count('pk'))

This returns one dict per distinct (color, name) combination, with a pk__count key holding the number of Doggy rows in that group.
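To get the actual rows that share a value, as in the SQL from the question, the grouped query can probably be filtered on the count and then fed into an __in lookup. This is a sketch, not a tested recipe for every Django version: the helper name rows_with_duplicate_values and the annotation name num are made up for illustration.

```python
def rows_with_duplicate_values(model, field_name):
    """Return a queryset of `model` rows whose `field_name` value is shared."""
    # Local import keeps the sketch importable outside a configured project.
    from django.db.models import Count

    duplicated_values = (
        model.objects.values(field_name)      # GROUP BY field_name
        .order_by()                           # drop any ordering from the grouping
        .annotate(num=Count("pk"))            # count rows per group
        .filter(num__gt=1)                    # HAVING count > 1
        .values_list(field_name, flat=True)   # keep only the duplicated values
    )
    # Passed to __in, the inner queryset is used as a subquery.
    return model.objects.filter(**{field_name + "__in": duplicated_values})
```

Usage would look like rows_with_duplicate_values(Doggy, 'color'), which is evaluated lazily like any other queryset.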

answered Aug 27, 2014 at 16:40


1 Comment

DataGreed Over a year ago

:) Yeah, that's how it should be done, though there was no annotate back in 2011 ;)

Sam Dolan · Accepted Answer · 2011-01-18 17:30:34Z

3

You can use itertools.groupby() for this:

import operator
import itertools
from django.db import models

def group_model_by_attr(model_class, attr_name):
    # Validate the arguments before hitting the database.
    assert issubclass(model_class, models.Model), \
        "%s is not a Django model." % (model_class,)
    assert attr_name in [field.name for field in model_class._meta.fields], \
        "The %s field doesn't exist on model %s" % (attr_name, model_class)

    # groupby() only merges adjacent items, so sort by the attribute first.
    all_instances = model_class.objects.all().order_by(attr_name)
    keyfunc = operator.attrgetter(attr_name)
    return [{k: list(g)} for k, g in itertools.groupby(all_instances, keyfunc)]

grouped_by_color = group_model_by_attr(Doggy, 'color')
grouped_by_name = group_model_by_attr(Doggy, 'name')

grouped_by_color (for example) will be a list of dicts like [{'purple': [doggy1, doggy2]}, {'pink': [doggy3]}], where doggy1, doggy2, etc. are Doggy instances.
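The order_by() call matters because itertools.groupby() only merges adjacent items. A small database-free sketch of that behaviour, using a made-up Dog namedtuple in place of Doggy instances:

```python
import itertools
import operator
from collections import namedtuple

# Hypothetical stand-in for Doggy instances, so no database is needed.
Dog = namedtuple("Dog", ["name", "color"])
dogs = [Dog("Rex", "brown"), Dog("Spot", "white"), Dog("Fido", "brown")]

by_color = operator.attrgetter("color")

# Unsorted input: "brown" shows up as two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(dogs, by_color)]

# Sorted input: one group per color, which is what order_by() ensures.
grouped = {
    key: [dog.name for dog in group]
    for key, group in itertools.groupby(sorted(dogs, key=by_color), by_color)
}
```

Building a single dict, as in this sketch, is often easier to work with than the list of one-key dicts returned by group_model_by_attr.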

UPDATE:

From your update it looks like you just want a list of ids for each color. I tested this with 250k records in PostgreSQL on my Ubuntu laptop (Core 2 Duo, 3 GB of RAM), and it took 0.35 seconds to generate the dict (the itertools.groupby version took 0.72 seconds, by the way). You mention that you have 900k records, so this should be fast enough. If it's not, the result should be easy to cache and update as the records change.

from collections import defaultdict

doggies = Doggy.objects.values_list('color', 'id').order_by('color').iterator()
grouped_doggies_by_color = defaultdict(list)
for color, doggy_id in doggies:  # avoid shadowing the built-in id()
    grouped_doggies_by_color[color].append(doggy_id)
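Since the question is about duplicates, the resulting dict can then be narrowed down to the colors that more than one row shares. A minimal sketch with hard-coded (color, id) pairs standing in for the values_list() rows; the sample data is invented:

```python
from collections import defaultdict

# Hypothetical (color, id) pairs, shaped like the rows from values_list('color', 'id').
rows = [("brown", 1), ("brown", 2), ("white", 3), ("black", 4), ("black", 5)]

ids_by_color = defaultdict(list)
for color, doggy_id in rows:
    ids_by_color[color].append(doggy_id)

# Keep only the colors shared by more than one row.
duplicates = {color: ids for color, ids in ids_by_color.items() if len(ids) > 1}
```

This still pulls every (color, id) pair into Python, so it trades database work for memory.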

edited Jan 18, 2011 at 17:30

answered Jan 18, 2011 at 15:22


5 Comments

DataGreed Over a year ago

Thanks for trying to help. Actually, I'd like to do that using the ORM. Of course, I can iterate over all of the objects, but that's not that great if you have more than 900k of them...

Sam Dolan Over a year ago

No problem. I definitely would have mentioned the data size in your question. I've updated my answer with what I think will work for you.

DataGreed Over a year ago

Thank you for your reply, I've upvoted it :) But I will leave the question open. I really want to know if the mentioned query could be made via the ORM :)

Sam Dolan Over a year ago

The second snippet is basically done through the ORM, with a bit of massaging the data. FYI: I just tried with 750k records; groupby took 48 seconds and values_list took 22 seconds.

DataGreed Over a year ago

Yeah, I understand that, thank you, but I'm still interested in constructing a query like the SQL query above.

Mez · Accepted Answer · 2011-01-18 13:27:24Z

2

If you're looking for Doggys of a certain colour, you'd do something like:

Doggy.objects.filter(color='blue')

If you want to find Doggys based on the colour of the current Doggy:

def GetSimilarColoredDoggys(self):
    return Doggy.objects.filter(color=self.color)

The same would go for names:

def GetDoggysWithSameName(self):
    return Doggy.objects.filter(name=self.name)

answered Jan 18, 2011 at 13:27


4 Comments

DataGreed Over a year ago

Omg, was I really so unclear in my question? Sorry, I will update it to show that the color/name is not known.

DataGreed Over a year ago

Btw, your naming doesn't follow the Python naming convention. Seems you like C :)

Matthew Rankin Over a year ago

@DataGreed: Your comment to Mez is true that using CamelCase for function names instead of lowercase with underscores isn't the preferred way. But then again, spaces between the '=' sign in keyword arguments—as you've used in your question—aren't PEP8 compliant either. python.org/dev/peps/pep-0008

DataGreed Over a year ago

Fair enough, but I was the one who was asking for advice. Remember that this is a knowledge base, and if a novice accepted this advice, it could lead him down the dark way of code formatting havoc. BTW, using camelCase with a lowercase first letter would not be so confusing. But, again, that's not the point.

Matthew Rankin · Accepted Answer · 2011-01-18 14:36:08Z

-2

I would change your data model so that the color and name are a one-to-many relationship with Doggy as follows:

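from django.db import models

class Doggy(models.Model):
    # on_delete is a required argument since Django 2.0
    name = models.ForeignKey('DoggyName', on_delete=models.CASCADE)
    color = models.ForeignKey('DoggyColor', on_delete=models.CASCADE)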

class DoggyName(models.Model):
    name = models.CharField(max_length=40, unique=True)

class DoggyColor(models.Model):
    color = models.CharField(max_length=20, unique=True)

Now DoggyName and DoggyColor do not contain duplicate names or colors, and you can use them to select dogs with the same name or color.

edited Jan 18, 2011 at 14:36

answered Jan 18, 2011 at 14:26


6 Comments

DataGreed Over a year ago

Omg, I was not asking that. This model was a dummy one, provided as an example. The real issue is finding duplicate messages.

Matthew Rankin Over a year ago

@DataGreed: Why isn't changing your data model a valid option?

DataGreed Over a year ago

Because I'm not interested in advice about DB architecture this time. If you want to know the real picture: I've got something like a forum, where I'd like to find the duplicate messages and create a report about them for the moderator. So, the question itself is about making a particular type of query using the Django ORM (if it is possible, of course; I've tried a lot of ways without using extra() and didn't get it working).

Matthew Rankin Over a year ago

@DataGreed: Your example Doggy model with name and color as CharFields will result in redundant data and a data model that is not in second normal form. Violating 2NF results in wasted storage space and reduced query performance. If your "real issue is finding duplicate messages", then you should ask your real question instead of downvoting people who answer the question you actually asked.

DataGreed Over a year ago

I've downvoted your answer because it didn't contain an answer to the actual question. You could also answer something like "Just don't do it, find yourself another hobby", but I wasn't asking for advice on finding a hobby other than writing Django apps. I was asking an exact question about making an exact query using the ORM.

DataGreed · Accepted Answer · 2011-02-12 22:14:11Z

-3

Okay, apparently there's no way to do this with the ORM alone.

If you have to do it, use .extra() to execute the SQL statement you need (if you are using an SQL database, of course).

answered Feb 12, 2011 at 22:14


1 Comment

Carl Meyer Over a year ago

I would use .raw() rather than .extra() -- it's simpler, and you can use any SQL you want and get back Django model objects. docs.djangoproject.com/en/dev/topics/db/sql/…
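A minimal sketch of the .raw() route mentioned in this comment, reusing the SQL from the question. The helper name duplicates_via_raw is made up, and the table and column names are interpolated directly into the SQL, so the sketch assumes they come from trusted code and never from user input.

```python
def duplicates_via_raw(model, column):
    """Return a RawQuerySet of `model` rows whose `column` value is shared."""
    table = model._meta.db_table
    # Identifiers are formatted into the SQL string, which is only safe
    # because they come from the model definition, not from user input.
    sql = (
        "SELECT * FROM {table} WHERE {col} IN ("
        "SELECT {col} FROM {table} GROUP BY {col} HAVING COUNT(*) > 1)"
    ).format(table=table, col=column)
    return model.objects.raw(sql)
```

Calling duplicates_via_raw(Doggy, 'color') should yield Doggy instances, since SELECT * includes the primary key that raw() needs.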
