I'm dealing with a nested list that looks something like this.

mylist = [
    ["First", "Second", "Third"],
    ["First", "Second", "Third"],
    ...
]

The goal is to remove duplicate elements of mylist based on the following definition: an element is equal to another element if element1[0] == element2[0] and element1[1] == element2[1]. In other words, only the first two items of each inner list count; the rest are ignored.

This doesn't seem terribly hard, but I'm probably overcomplicating it and am having trouble with it. I think I am close to a solution, which I'll post if I finish it and nobody has answered.

My main problems:

I really wish I could turn the list into a set, as in more conventional cases. Is there any way to give set a custom definition of equivalence? A lot of built-in methods don't work because of that, and rewriting them is a bit painful, as the indexing always gets screwed up somewhere.

  • If you have the list [[1,2,4],[1,2,3]], do you care which of the two survives the cull? Commented Jun 26, 2015 at 3:52
  • I should clarify that: nope, either one surviving is fine. Commented Jun 26, 2015 at 3:58

2 Answers

You can make a class that stores the data and override __eq__:

class MyListThingy(object):
    def __init__(self, data):
        self.data = data
    def __eq__(self, other):
        # Only the first two items take part in the comparison.
        return self.data[0] == other.data[0] and self.data[1] == other.data[1]

Of course, this alone won't do any good for sets, which use hashing. For that you also have to override __hash__:

def __hash__(self):
    return hash((self.data[0], self.data[1]))
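
Putting the two methods together, a minimal sketch of the whole approach might look like the following. The sample rows and the variable name unique are made up for illustration. Note that in Python 3 a class that defines __eq__ without __hash__ becomes unhashable, which is why both methods have to live in the same class.

```python
class MyListThingy(object):
    """Wraps an inner list; equality and hashing use only its first two items."""

    def __init__(self, data):
        self.data = data

    def __eq__(self, other):
        if not isinstance(other, MyListThingy):
            return NotImplemented
        return self.data[0] == other.data[0] and self.data[1] == other.data[1]

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash((self.data[0], self.data[1]))


# Illustrative data (not from the question): the first two rows count as duplicates.
mylist = [
    ["First", "Second", "Third"],
    ["First", "Second", "Fourth"],
    ["Other", "Second", "Third"],
]

# Wrap each row, let the set drop duplicates, then unwrap again.
unique = [wrapped.data for wrapped in set(MyListThingy(row) for row in mylist)]
```

The result holds two rows, one starting with "First" and one starting with "Other". A set is unordered, so the order of unique is not guaranteed to match the order of mylist.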

3 Comments

This sounds good. However, giving set() a list of MyListThingy objects raises an unhashable instance error (with the __hash function in the class).
Oops, I meant __hash__.
Ah, works perfectly now. I thought you wanted to write hash as a private method (I think __ is used to denote that?). This is a very nice solution that I'll keep in mind, thank you.

You can create a tuple of the first and second items of each inner list and use it as a dictionary key. Adding every inner list to the dictionary under that key removes the duplicates, because a later list with the same key overwrites the earlier one.

d = dict()
l = [["First", "Second", "Third"], ["First", "Second", "Fourth"]]
for item in l:
    d[(item[0], item[1])] = item

Output (d.values()):

[['First', 'Second', 'Fourth']]
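
If you would rather keep the first occurrence instead of the last, a small variation using dict.setdefault should work. This is only a sketch: the function name dedupe_by_first_two is made up here, and it assumes every inner list has at least two hashable items. Keeping the original order relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
def dedupe_by_first_two(rows):
    """Return rows with duplicates removed, comparing only the first two items.

    The first row seen for each (row[0], row[1]) pair is the one that is kept.
    """
    first_seen = {}
    for row in rows:
        # setdefault stores the row only if the key is not already present.
        first_seen.setdefault((row[0], row[1]), row)
    return list(first_seen.values())


l = [["First", "Second", "Third"], ["First", "Second", "Fourth"]]
result = dedupe_by_first_two(l)
```

Here result is [['First', 'Second', 'Third']], the opposite survivor from the plain assignment version above.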
