0

I have a program that searches through two separate lists, lets call them list1 and list2.

I only want to print the instances where list1 and list2 have matching items. The thing is, not all items in both lists match eachother, but the first, third and fourth items should.

If they match, I want the complete lists (including the mismatching items) to be appended to two corresponding lists.

I have written the follow code:

for item in list1:
    for item2 in list2:
        if (item[0] and item[2:4])==(item[0] and item2[2:4]):
            newlist1.append(item)
            newlist2.append(item2)
            break

This works, but it's quite inefficient. For some of the larger files I'm looking through it can take more than 10 seconds to complete the match, and it should ideally be at most half of that.

What I'm thinking is that it shouldn't have to start over from the beginning in list2 each time the code is run, it should be enough to continue from the last point where there was a match. But I don't know how to write it in code.

3
  • 2
    Are you aware that item[0] and item[2:4] will equal to item[0] if item[0] is not false? Commented Sep 1, 2014 at 8:56
  • @utdemir Actually, it will equal to item[0] if item[0] evaluates to False; otherwise it will equal to item[2:4] Commented Sep 1, 2014 at 9:10
  • You're right; confused, sorry. But still, IMHO it doesn't look like what OP is trying to do. Commented Sep 1, 2014 at 9:38

2 Answers 2

1

Your condition (item[0] and item[2:4])==(item[0] and item2[2:4]) is wrong.

Besides that the second item[0] should probably be item2[0], what (item[0] and item[2:4]) does is the following (analogously for (item2[0] and item2[2:4])):

  • if item[0] is 0, it returns item[0] itself, i.e. 0
  • if item[0] is not 0, it returns whatever item[2:4] is

And this is then compared to the result of the second term. Thus, [0,1,1,1] would "equal" [0,2,2,2], and [1,1,1,1] would "equal" [2,1,1,1].

Try using tuples instead:

if (item[0], item[2:4]) == (item2[0], item2[2:4]):

Or use operator.itemgetter as suggested in the other answer.


To speed up the pairwise matching of items from both lists, put the items from the first list into a dictionary, using those tuples as key, and then iterating over the other list and looking up the matching items in the dictionary. Complexity will be O(n+m) instead of O(n*m) (n and m being the length of the lists).

key = operator.itemgetter(0, 2, 3)

list1_dict = {}
for item in list1:
    list1_dict.setdefault(key(item), []).append(item)

for item2 in list2:
    for item in list1_dict.get(key(item2), []):
        newlist1.append(item)
        newlist2.append(item2)
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks a lot, that's very helpful. I didn't see the typo of item and item2 there. While trying out the loop I actually did use tuples at one point and it seemed to work either way, but it's good to get an explanation of the robustness of using tuples. Also, it's really fast now that I changed to using dictionaries. Now I just need to figure out what you did there (I'm not a programmer)...
0
from operator import itemgetter

getter = itemgetter(0, 2, 3)
for item,item2 in zip(list1, list2):
    if getter(item) == getter(item2):
        newlist1.append(item)
        newlist2.append(item2)
        break

This may reduce bit of time complexity though...

1 Comment

This compares only items in the same positions of the lists; it seems what OP wants is to compare each item from list1 with each item from list2.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.