I have the below nested list:

sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'], ['Kiw','App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

I need to keep only those sub-lists of the nested list, sample, whose items don't appear in any sub-list that has already been kept.

For example, my output list needs to contain the first element of sample, ['Ban', 'App']. I then need to compare ['Ban', 'App'] with the rest of the list. As "Ban" in element 2 and "App" in element 3 are already present in ['Ban', 'App'], we do not consider those elements. My next output element is ['Gra', 'Ora'], as neither of its items is in ['Ban', 'App'].

Now my output is [['Ban', 'App'], ['Gra', 'Ora']] and I have to compare the rest of the nested list with these two elements. My next elements are ['Kiw','App'] and ['Kiw', 'Ora']. As 'App' is in ['Ban', 'App'] and 'Ora' is in ['Gra', 'Ora'], neither of them will be in the output list.

My output list is still [['Ban', 'App'], ['Gra', 'Ora']]. My next element is ['Man', 'Blu']. Both of its items are brand new, so it will be added to my output list.

My new output list is [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]. The last element is ['Pin', 'App'] and as "App" is in ['Ban', 'App'], we don't consider this item even though "Pin" is a new item.

My final output should be [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']].

final_output = [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]

I started with the below code but this doesn't do exactly what I need it to do:

j =0
for i in range(len(sample)):
    #print ("I:", str(i))
    #print ("J" ,str(j))
    i = j
    for j in range(1, len(sample)):
        if sample[i][0] == sample[j][0] or sample[i][0] == sample[j][1] or sample[i][1] == sample[j][0] or sample[i][1] == sample[j][1]:
            pass
        else:
            print (sample[i], sample[j])
            #print (j)
            i = j
            break
  • To be clear: each time we look at one of the inner lists, do we want to check whether it contains any string that already appears in one of the previous outputs? Commented Jan 12, 2023 at 5:08

2 Answers

I would keep a set that keeps track of items already seen and only add the pair to the final list if there is no intersection with that set.

st = set()
final_output = []
for pair in sample:
    if not st.intersection(pair):
        final_output.append(pair)
        st.update(pair)

print(final_output)
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]
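
The same idea can be wrapped in a reusable function. The following is only a sketch: the name filter_disjoint is made up for illustration, and it swaps `not st.intersection(pair)` for `set.isdisjoint`, which asks the same question without building an intermediate set. Because `isdisjoint` and `update` accept any iterable, the sublists are not assumed to have exactly two items.

```python
def filter_disjoint(sublists):
    """Keep a sublist only if it shares no item with any previously kept sublist.

    filter_disjoint is a hypothetical helper name, not part of the original answer.
    """
    seen = set()   # every item that belongs to a kept sublist
    kept = []
    for sublist in sublists:
        # isdisjoint accepts any iterable, so sublists of any length work
        if seen.isdisjoint(sublist):
            kept.append(sublist)
            seen.update(sublist)
    return kept


sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'],
          ['Kiw', 'App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

print(filter_disjoint(sample))
# [['Ban', 'App'], ['Gra', 'Ora'], ['Man', 'Blu']]
```

Input order matters here: an earlier sublist always wins over a later one that shares an item with it.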

2 Comments

I see we had the same idea. I like your use of intersection.
@Woody1193 I like your solution too. The explanation is very nice

You should use a set to hold the values you've already looked at. You can then iterate over each item in each sub-list and check if they're in the set:

seen = set()
filtered = []
for sublist in sample:
    if sublist[0] in seen or sublist[1] in seen:
        continue

    filtered.append(sublist)
    seen.add(sublist[0])
    seen.add(sublist[1])

This code works by iterating over sample and checking whether either item of each sublist is already in the set. If one is, we skip that sublist and continue on. Otherwise, we add sublist to the filtered list and add its items to the set. Because set membership tests take constant time on average, this runs in roughly O(n) time, compared with O(n^2) for the nested loops you have.
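
To make the complexity comparison concrete, here is a rough sketch that puts a quadratic version next to the set-based one. Both function names are invented for this illustration. The quadratic version is not your original code; it is a corrected nested-scan equivalent that rechecks every kept sublist for each input sublist, while the set-based version does one average constant-time lookup per item.

```python
def filter_quadratic(sublists):
    """Nested scan: compare each sublist against every sublist kept so far."""
    kept = []
    for sublist in sublists:
        # Up to len(kept) comparisons per input sublist, hence O(n^2) overall
        clash = any(item in earlier for earlier in kept for item in sublist)
        if not clash:
            kept.append(sublist)
    return kept


def filter_linear(sublists):
    """Set-based scan: one membership test per item."""
    seen = set()
    kept = []
    for sublist in sublists:
        if any(item in seen for item in sublist):
            continue
        kept.append(sublist)
        seen.update(sublist)
    return kept


sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'],
          ['Kiw', 'App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

# Both approaches agree on the result; only the amount of work differs
print(filter_quadratic(sample) == filter_linear(sample))
# True
```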

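One thing to be aware of is the case where your sublist has one item that has been seen and one that hasn't. This code skips the whole sublist and does not add the unseen item to the set, which matches your expected output (['Pin', 'App'] is dropped). If you need that case handled differently, you will need to modify the code.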
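
For the mixed case, one possible modification is to record the items of every sublist, including the ones that get skipped. This is a hypothetical variant (the name filter_recording_all is made up), and it is not what the question asks for. It is shown only to make clear how much this choice changes the result.

```python
def filter_recording_all(sublists):
    """Variant: items of skipped sublists also count as seen."""
    seen = set()
    kept = []
    for sublist in sublists:
        if not any(item in seen for item in sublist):
            kept.append(sublist)
        # Record the items whether or not the sublist was kept
        seen.update(sublist)
    return kept


sample = [['Ban', 'App'], ['Ban', 'Ora'], ['Gra', 'App'], ['Gra', 'Ora'],
          ['Kiw', 'App'], ['Kiw', 'Ora'], ['Man', 'Blu'], ['Pin', 'App']]

print(filter_recording_all(sample))
# [['Ban', 'App'], ['Man', 'Blu']]
```

With this variant ['Gra', 'Ora'] is lost, because 'Ora' and 'Gra' were already recorded from the skipped sublists ['Ban', 'Ora'] and ['Gra', 'App'].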
