1

I have a list of objects, each of the same kind.

Each object has its own list of objects (usually just 5-10 items)

What I used to do was:

for o in main_object_list:
    obj_list = o.get_this_object_list()
    for i in obj_list:
        if i in main_object_list:
            //do something with i

While this approach works, when main_object_list has, say, 100.000 elements, it goes horribly slow.

My workaround has been this:

for o in main_object_list:
    o.flag = True 

for o in main_object_list:
    obj_list = o.get_this_object_list()
    for i in obj_list:
        if i.flag:
            //do something with i

It goes several orders of magnitude faster (from 22 minutes to as little as 17 secs) but I suspect there may be a different, and better, approach. Moreover, this example works just becaue each object has a flag property, and by the way it is not so elegant to use a flag that may well have been setted/unsetted in other functions (if this function is called in the body of a parent function which uses the same flag mechanisms, this would mess everything up, setting every objects flag)

Is there a more correct pythonesque way to quickly check if an object is in main_object_list?

4
  • 4
    Better approach is to use a set if all the objects are unique and hashable Commented Apr 19, 2012 at 4:19
  • You need to call each object in main_object_list many times and with correct order, aren't you? You really need do something for quite each pair of main_object_list, say near 10.000.000.000 times? How that can be faster? Commented Apr 19, 2012 at 4:21
  • set in Python 2.6 and up, sets.Set down to Python 2.3. Commented Apr 19, 2012 at 4:23
  • every object is hashable and unique, the list is actually a set under this point of view. I tried using set (and dict). It GOES faster, but I gain just 10-15% in speed. I do not need to process each list in a prticular order btw Commented Apr 19, 2012 at 4:29

1 Answer 1

2

If you want to use your own flag, you could do:

for o in main_object_list:
    o.my_special_flag = True 

for o in main_object_list:
    obj_list = o.get_this_object_list()
    for i in obj_list:
        if hasattr(i, 'my_special_flag'):

Otherwise set.intersection is as fast as it gets:

main_object_set = set(main_object_list)

for o in main_object_list:
    obj_list = o.get_this_object_list()
    objs_in_main_list = main_object_set.intersection(obj_list)
    for i in objs_in_main_list:
        //do something with i

Or:

main_object_set = set(main_object_list)

objs_in_main_list = set().update(
                           *(o.get_this_object_list() for o in main_object_list))
objs_in_main_list.intersection_update(main_object_set)
Sign up to request clarification or add additional context in comments.

3 Comments

I tried set, but didnt try set.intersection (or set & set), just didnt think it could be faster than 'in'
@user815129 It does lots of ins at once, in C, without returning to the Python level. It should be faster.
@user815129 The other possibility is to use your own flag; see my edit.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.