Optimizing a nested for loop with two lists

Question

I have a program that searches through two separate lists, lets call them list1 and list2.

I only want to print the instances where list1 and list2 have matching items. The thing is, not all items in both lists match eachother, but the first, third and fourth items should.

If they match, I want the complete lists (including the mismatching items) to be appended to two corresponding lists.

I have written the follow code:

for item in list1:
    for item2 in list2:
        if (item[0] and item[2:4])==(item[0] and item2[2:4]):
            newlist1.append(item)
            newlist2.append(item2)
            break

This works, but it's quite inefficient. For some of the larger files I'm looking through it can take more than 10 seconds to complete the match, and it should ideally be at most half of that.

What I'm thinking is that it shouldn't have to start over from the beginning in list2 each time the code is run, it should be enough to continue from the last point where there was a match. But I don't know how to write it in code.

Are you aware that item[0] and item[2:4] will equal to item[0] if item[0] is not false? — utdemir
– utdemir, Commented Sep 1, 2014 at 8:56
@utdemir Actually, it will equal to item[0] if item[0] evaluates to False; otherwise it will equal to item[2:4] — tobias_k
– tobias_k, Commented Sep 1, 2014 at 9:10
You're right; confused, sorry. But still, IMHO it doesn't look like what OP is trying to do. — utdemir
– utdemir, Commented Sep 1, 2014 at 9:38

tobias_k · Accepted Answer · 2014-09-01 09:30:15Z

1

Your condition (item[0] and item[2:4])==(item[0] and item2[2:4]) is wrong.

Besides that the second item[0] should probably be item2[0], what (item[0] and item[2:4]) does is the following (analogously for (item2[0] and item2[2:4])):

if item[0] is 0, it returns item[0] itself, i.e. 0
if item[0] is not 0, it returns whatever item[2:4] is

And this is then compared to the result of the second term. Thus, [0,1,1,1] would "equal" [0,2,2,2], and [1,1,1,1] would "equal" [2,1,1,1].

Try using tuples instead:

if (item[0], item[2:4]) == (item2[0], item2[2:4]):

Or use operator.itemgetter as suggested in the other answer.

To speed up the pairwise matching of items from both lists, put the items from the first list into a dictionary, using those tuples as key, and then iterating over the other list and looking up the matching items in the dictionary. Complexity will be O(n+m) instead of O(n*m) (n and m being the length of the lists).

key = operator.itemgetter(0, 2, 3)

list1_dict = {}
for item in list1:
    list1_dict.setdefault(key(item), []).append(item)

for item2 in list2:
    for item in list1_dict.get(key(item2), []):
        newlist1.append(item)
        newlist2.append(item2)

edited Sep 1, 2014 at 9:30

answered Sep 1, 2014 at 9:05

tobias_k

83.1k12 gold badges130 silver badges186 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Halldinz0r Over a year ago

Thanks a lot, that's very helpful. I didn't see the typo of item and item2 there. While trying out the loop I actually did use tuples at one point and it seemed to work either way, but it's good to get an explanation of the robustness of using tuples. Also, it's really fast now that I changed to using dictionaries. Now I just need to figure out what you did there (I'm not a programmer)...

Anvesh · Accepted Answer · 2014-09-01 08:59:24Z

0

from operator import itemgetter

getter = itemgetter(0, 2, 3)
for item,item2 in zip(list1, list2):
    if getter(item) == getter(item2):
        newlist1.append(item)
        newlist2.append(item2)
        break

This may reduce bit of time complexity though...

answered Sep 1, 2014 at 8:59

Anvesh

6251 gold badge5 silver badges21 bronze badges

1 Comment

tobias_k Over a year ago

This compares only items in the same positions of the lists; it seems what OP wants is to compare each item from list1 with each item from list2.

Collectives™ on Stack Overflow

Optimizing a nested for loop with two lists

2 Answers 2

1 Comment

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related