I am trying to filter a list of strings based on a second list of substrings. I want to remove every string in list1 that contains any substring from list2.

list1 = ['lunch time', 'sandwich shop', 'starts at noon','grocery store']

list2 = ['lunch','noon']

The output I want:

output = ['sandwich shop','grocery store']

3 Answers

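Use a regex. Join the substrings into a single alternation pattern, escaping each one with re.escape so that characters such as "." or "+" are matched literally:

import re

list1 = ['lunch time', 'sandwich shop', 'starts at noon', 'grocery store']
list2 = ['lunch', 'noon']

# Build the pattern "lunch|noon"; re.escape keeps regex metacharacters literal
pattern = re.compile("|".join(map(re.escape, list2)))

# Keep only the strings in which none of the substrings is found
print([i for i in list1 if not pattern.search(i)])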

Output:

['sandwich shop', 'grocery store']
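
If the substrings can contain regex metacharacters, an unescaped join may match more than intended. The following is a minimal sketch of that difference, assuming made-up file-like names and a hypothetical helper called remove_matching (neither comes from the question):

```python
import re


def remove_matching(strings, substrings):
    """Return the strings that contain none of the substrings (hypothetical helper)."""
    if not substrings:
        # An empty pattern would match every string, so handle this case explicitly
        return list(strings)
    # re.escape makes characters such as "." match literally
    pattern = re.compile("|".join(map(re.escape, substrings)))
    return [s for s in strings if not pattern.search(s)]


# Made-up data: 'a.b' should only match a literal dot
names = ['a.b report', 'axb report', 'summary']
print(remove_matching(names, ['a.b']))  # ['axb report', 'summary']

# Without escaping, "." matches any character, so 'axb report' is dropped too
unescaped = re.compile('a.b')
print([s for s in names if not unescaped.search(s)])  # ['summary']
```

The guard for an empty list of substrings is a design choice for this sketch: joining an empty list yields an empty pattern, which matches every string and would otherwise remove everything.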
1 Comment

This was so fast. Thank you.
One approach is to iterate over a copy of list1 and remove each string that contains a substring from list2:

list1 = ['lunch time', 'sandwich shop', 'starts at noon','grocery store']

list2 = ['lunch','noon']

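# Iterate over a copy of list1 so that removing items from list1 is safe
for item1 in list1[:]:
    for item2 in list2:
        # If a substring is present, remove the string and stop checking it;
        # without the break, a string matching two substrings is removed twice
        if item2 in item1:
            list1.remove(item1)
            break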

print(list1)

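Another approach is to collect the strings that contain a substring, and then subtract that result from the original list. Note that the set difference does not preserve the original order and drops duplicate strings: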

list1 = ['lunch time', 'sandwich shop', 'starts at noon','grocery store']

list2 = ['lunch','noon']

# Strings from list1 that contain at least one of the substrings
result = [item1 for item1 in list1 for item2 in list2 if item2 in item1]

# Strings that contain no substring: set difference between the original list and the list above
print(list(set(list1) - set(result)))

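Both approaches keep the same two strings. The first prints them in their original order; the set-based version may print them in any order, for example: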

['grocery store', 'sandwich shop']
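
If the original order or duplicate entries matter, a list comprehension with any() may be a simpler alternative to both versions above. This is only a sketch and not part of the original answer; the name filtered is illustrative:

```python
list1 = ['lunch time', 'sandwich shop', 'starts at noon', 'grocery store']
list2 = ['lunch', 'noon']

# Keep a string only if none of the substrings occurs in it;
# any() stops at the first match, and list1 itself is left unchanged
filtered = [s for s in list1 if not any(sub in s for sub in list2)]

print(filtered)  # ['sandwich shop', 'grocery store']
```

Because it builds a new list instead of calling remove() on the original, it avoids modifying a list while working through it.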

There are many ways to do this. Here is my approach (it may not be the best). Note that it compares whole words, so 'lunch' would not match a string such as 'lunchtime'.

list1 = ['lunch time', 'sandwich shop', 'starts at noon','grocery store']
list2 = ['lunch','noon']


list3 = [x for x in list1 if len(set(list2) & set(x.split())) == 0]


print(list3)

Gives you:

['sandwich shop', 'grocery store']

What is happening?

  1. Loop through the items of the first list.
  2. Split each item into a list of words using split().
  3. Convert that list of words and list2 to sets.
  4. Take the set intersection (&) to find the words they have in common.
  5. Count the common words using len().
  6. If there are none, add the item from list1 to list3.
  7. Repeat until there are no more items.
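
Because split() breaks each string into whole words, this version behaves differently from a substring test when a search term sits inside a longer word. Here is a small sketch with made-up data (the strings and the names by_words and by_substring are assumptions for illustration, not from the question):

```python
strings = ['lunchtime menu', 'lunch time', 'grocery store']
terms = ['lunch']

# Whole-word test: 'lunchtime' is a different word from 'lunch', so it is kept
by_words = [s for s in strings if not set(terms) & set(s.split())]

# Substring test: 'lunch' occurs inside 'lunchtime', so it is removed
by_substring = [s for s in strings if not any(t in s for t in terms)]

print(by_words)      # ['lunchtime menu', 'grocery store']
print(by_substring)  # ['grocery store']
```

Which behavior is wanted depends on the data; the question asks about substrings, so the whole-word version only matches it when every search term appears as a separate word.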
