I have been trying to remove duplicate arrays from my list of arrays (i.e. [[2,1],[1,2]]). I want to get rid of only one of the two arrays. I tried reversing each array and removing the duplicates, but that does not work.

def grf_to_edge_list(file):
    edgelist = []
    for line in file:
        y = line.split()
        for i in range(3,len(y)):
            edgelist.append([int(y[0]),int(y[i])])
    for i in range(len(edgelist)-1):
        temp = edgelist[i]
        temp.reverse()
        if temp in edgelist:
            edgelist.remove(temp)
            i = i - 1
    return edgelist    

Here is the exact data:

1 2.0 1.0 2 3
2 1.0 0.0 1 3
3 0.0 2.0 1 2 4
4 3.0 3.0 3
5 3.0 0.0 
5
  • [1, 2] is not the same as [2, 1], but if you don't care about order and just want unique sets, you could just create sets instead of lists and put the sets in a set to remove duplicates among them. Commented Oct 7, 2020 at 1:08
  • I am creating an edgelist, so [2,1] and [1,2] are the same thing just in different directions. I will try that though. Commented Oct 7, 2020 at 1:10
  • I think you can convert each array to a set before comparison. That way, there will be no difference between [1,2] and [2,1]. Commented Oct 7, 2020 at 1:12
  • Welcome to SO. Before your next question, please take the tour, read How to Ask and provide a minimal reproducible example. This current question is good, but a complete set of test data would have been better, allowing SO users to test their solution with your exact data set. Commented Oct 7, 2020 at 1:16
  • From the data and your code, I now understand that you consider the arrays as the first number of each line as [0] and the numbers after the third one as [1]. This should have been explained in the question as well. The less we have to work to figure it out, the better your chances of getting answers. Commented Oct 7, 2020 at 1:45
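The set-based idea suggested in the comments above could look roughly like the following. This is only a sketch: the helper name `unique_undirected_edges` and the sample `edges` list are made up for illustration, and it assumes every edge is a pair of integers.

```python
def unique_undirected_edges(edges):
    """Keep the first occurrence of each edge, ignoring direction."""
    seen = set()      # direction-independent keys already encountered
    unique = []       # surviving edges, in first-seen order
    for u, v in edges:
        key = frozenset((u, v))  # frozenset is hashable, so it can be stored in a set
        if key not in seen:
            seen.add(key)
            unique.append([u, v])
    return unique


# Hypothetical sample: every edge listed in both directions
edges = [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2], [3, 4], [4, 3]]
print(unique_undirected_edges(edges))  # [[1, 2], [1, 3], [2, 3], [3, 4]]
```

Because membership tests on a set take roughly constant time, this avoids rescanning the whole list for every edge.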

2 Answers

1

If you are going to remove them at a later stage anyway, you might as well not add them to the list in the first place.

def grf_to_edge_list(file):
    edgelist = []
    for line in file:
        y = line.split()
        for i in range(3,len(y)):
            if [int(y[i]),int(y[0])] not in edgelist:    #My change is here.
                edgelist.append([int(y[0]),int(y[i])])
    return edgelist  
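As a quick check, the function can be run against the five data lines from the question. The sketch below repeats the function so it runs on its own, and it uses `io.StringIO` as a stand-in for the opened file, which is an assumption about how the data is read.

```python
import io


def grf_to_edge_list(file):
    edgelist = []
    for line in file:
        y = line.split()
        # Columns 0..2 are the node id and two coordinates; neighbours start at index 3
        for i in range(3, len(y)):
            # Skip the edge if its reverse has already been recorded
            if [int(y[i]), int(y[0])] not in edgelist:
                edgelist.append([int(y[0]), int(y[i])])
    return edgelist


# The data from the question, wrapped in a file-like object
data = io.StringIO(
    "1 2.0 1.0 2 3\n"
    "2 1.0 0.0 1 3\n"
    "3 0.0 2.0 1 2 4\n"
    "4 3.0 3.0 3\n"
    "5 3.0 0.0 \n"
)
result = grf_to_edge_list(data)
print(result)  # [[1, 2], [1, 3], [2, 3], [3, 4]]
```

Node 5 has no neighbours listed, so it contributes no edges.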

2 Comments

In the line if [int(y[i]),int(y[0])] not in edgeList:, the L should be a small l. The code does not work otherwise.
Updated answer.
0
import copy

def new_grf_to_edge_list(file):
    edgelist = []
    for line in file:
        y = line.split()
        for i in range(3,len(y)):
            edgelist.append([int(y[0]),int(y[i])])
    for i in range(len(edgelist)-1, 0, -1): # start from the last item
        temp = copy.deepcopy(edgelist[i]) # copy so that reverse() does not change the original
        temp.reverse()
        if temp in edgelist:
            edgelist.pop(i) # remove the item at i, not the match that was found
    return edgelist

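Please note the two changes. First, temp is now a copy, so reverse() no longer modifies the item stored in edgelist. Second, the loop runs from the last item and removes the item at index i rather than the match that was found. You cannot safely remove items while looping forward through the list, because len(edgelist) changes as you go. Looping from the last item avoids this: once an item is removed, the loop never accesses that position again.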
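To see why the copy matters, here is a small sketch of what happens without it. The variable names and the two-edge list are made up for illustration.

```python
# Without a copy: the name refers to the same list object stored in edgelist
edgelist = [[1, 2], [2, 1]]
alias = edgelist[0]
alias.reverse()            # reverses the stored item in place
aliased_result = [list(e) for e in edgelist]
print(aliased_result)      # [[2, 1], [2, 1]]

# With a copy: the stored item is left untouched
edgelist = [[1, 2], [2, 1]]
flipped = edgelist[0][::-1]   # slicing builds a new list
print(edgelist)               # [[1, 2], [2, 1]]
print(flipped)                # [2, 1]
```

For a flat list of integers, a slice such as `[::-1]` or `list(reversed(...))` gives the same protection as `copy.deepcopy`.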
