0

So take the following example list

l = [
    {
        'post':1,
        'user':1,
        'other_stuff':'something',
        'more':'you get the point'
    },
    {
        'post':1,
        'user':2,
        'other_stuff':'something',
        'more':'you get the point'
    },
    {
        'post':2,
        'user':1,
        'other_stuff':'something',
        'more':'you get the point'
    },
]

I need to be able to check if a 'user' is already connected to a 'post', and I could do it with looping:

user = 1
post = 1
response = False
for connection in l:
    if connection['post'] == post and connection['user'] == user:
        response = True
        break

and that works very well. The issue is that in the actual situation, l will be populated 1.5 million times, and this iteration will run every time it populates since it needs to check to see if something already exists. so the last 500k iterations will be iterating through a list of over 1 million dictionaries. There is no way that this is the most efficient method for this!! My question is: what would be an optimal method that would not require such exhaust?

note: I do not necessarily know the values of the other keys in the dictionaries so I cannot do if x is in l to check

3
  • Could you put this into a database as these l values come in? Then you can run queries against the database without needing to iterate over all the rows yourself. Commented Jan 10, 2014 at 0:00
  • I could, but I was hoping for a pythonic solution Commented Jan 10, 2014 at 0:03
  • in reality, I will most likely use a database since python will always be slower in this regard, but in other projects I work on I may not be able to use a database and an optimal search_like method for comparing a list of dictionaries would be useful Commented Jan 10, 2014 at 0:04

1 Answer 1

4

I'd reconsider how you are laying out your data structure. If you need to have efficient access on a post and user pair, I'd consider storing it in a format like the following:

l = { (1, 1) : {'other stuff':'something', ...}, 
      (1, 2) : {'other stuff':'something', ...},
      (2, 1) : {'other stuff':'something', ...} }

Then this becomes an O(1) lookup:

user_post_pair = (1, 1)
if user_post_pair in l:
    # Stuff...
Sign up to request clarification or add additional context in comments.

8 Comments

Using l for variable names. Gotta love it :)
And if for some reason you can't control the input to begin with, you can convert it into this format as it comes in.
I have considered that, the only issue I have with it is that this list is being accessed in a couple other places, and the way it is set up for those makes it much simpler...I may as well make a copy and store two separate objects, a list and dictionary.
@Keeler Yeah, I agree. I've kept it since the OP used it, and I'm assuming it's not actually used in the code. At least I hope not...
@Yuushi don't worry it's not
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.