1

In python, given a list mylist of lists el of integers, I would like to remove duplicates that are equivalent under specific permutations of the indices.

The question is more general but I have in mind "decorated" McKay graphs where each node is given an integer defining el whose sum is equal to a certain number. Generating mylist is easy enough, but there is a lot of redundancy depending on the symmetry of the graph. For instance, for the "D4" diagram, el represents the following decorated graph:

                                       a
                                       |
    el = [a, b, c, d, e]     <-->    b-c―d
                                       |
                                       e

The two lists [1,0,4,0,0] and [0,0,4,0,1] are duplicates because there are the same up to a reflection of the graph around the horizontal axis. More generally, the lists el duplicates if there are equivalent under any permutation of a,b,d,e. Is there a simple way of removing duplicates given an arbitrary symmetry?

The way I did it for this particular graph was to sort the indices 0,1,3,4 and compare:

from itertools import product

def check_duplicate(f, li):

    for l in li:
        if l[2] == f[2] and sorted(l[:2] + l[3:]) == sorted(f[:2] + f[3:]):
            return True

    return False

mylist = [list(i) for i in product(range(5), repeat=5) if sum(i) == 5]

newlist = []
for f in my_list:
    if check_duplicate(f, newlist) == False:
        newlist.append(f)

Here tmp is generated brute force, but the condition is more involved in my real case.

This works well enough for this particular example, but is a bit clunky, and the implementation is harder to generalize to more involved cases. Is there a way to remove the duplicates in a more optimized way, in particular one that can easily implement removing duplicates given a particular symmetry of the indices of el?

1 Answer 1

1

It will probably be better to add "seen" cases to a set, and then continue to check if new elements have already been "seen", rather than comparing every element to every other element in the list iteratively (an operation with O(n^2) time complexity). Hopefully I understand what you're trying to accomplish, but here's how I'd do it:

from itertools import product

def to_key(f):
    # Creates a hashable key, keeping order-insensitive values sorted, and noting the middle value"""
    return (f[2], tuple(sorted(f[:2] + f[3:])))

mylist = [list(i) for i in product(range(5), repeat=5) if sum(i) == 5]

# Init an empty set. We'll add to it if a key isn't there
seen = set()
# If we add to the seen set, we'll add the element to newList
newlist = []

for f in mylist:
    key = to_key(f) # Create the hashable key
    if key not in seen: # Check if key in seen
        seen.add(key)
        newlist.append(f)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.