1

I have a program that creates lists like this one:

["abc a","hello","abc","hello z"]

My goal is to walk through the list and, for each element, remove every other string that contains it.

first iteration:

# abc a can't be found in any other element:

["abc a","hello","abc","hello z"]

second one:

# hello is present in element 4:

["abc a","hello","abc"]

third one:

# abc can be found in element one:

["hello","abc"]

I have tried using the filter() function, without success.

I want every element to be passed through the function. The problem is that the list shrinks as elements are removed, and I don't know how to handle that.

Thank you

2 Comments
  • Which element do you want to remove, "abc" or "abc a"? The shortest? Commented Jul 18, 2019 at 7:22
  • I want to remove the one that contains the other. For example, when it is "abc a"'s turn, no other element in the list contains "abc a", so nothing is removed. But when it is "abc"'s turn, "abc a" contains "abc", so "abc a" must be removed. Commented Jul 18, 2019 at 7:24

2 Answers

1

When you first get the list, turn it into pairs of the form [[element1, status], [element2, status]], where the status is either "present" or "deleted". Initially every status is "present". While traversing, instead of removing an element, just set its status to "deleted", and only consider an element for matching if its status is still "present". That way the list size stays the same. At the end, pick only the elements whose status is "present".

init_list = ["abc a", "hello", "abc", "hello z"]

# Pair every string with a status flag so the list length never changes.
new_list = [[s, "present"] for s in init_list]

for i in new_list:
    if i[1] == "present":
        for j in new_list:
            # Skip entries already deleted and entries equal to i itself.
            if j[1] == "present" and i != j:
                # Mark j as deleted when it contains i as a substring.
                if i[0] in j[0]:
                    j[1] = "deleted"

# Keep only the strings still marked as present.
fin_list = [s for s, status in new_list if status == "present"]

print(fin_list)
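
The same status-flag idea can also be wrapped in a function. This is only a minimal sketch: the name `remove_containing` and the boolean `alive` list are my own illustrative choices, not part of the answer above. It also assumes that positions, not values, identify an element, so of two identical strings only the first would survive.

```python
def remove_containing(strings):
    """Drop every string that contains another surviving string.

    Hypothetical helper: the name and the `alive` flags are illustrative.
    """
    # alive[k] plays the role of the "present"/"deleted" status.
    alive = [True] * len(strings)
    for i, needle in enumerate(strings):
        if not alive[i]:
            continue  # a deleted string no longer removes others
        for j, haystack in enumerate(strings):
            # Compare positions so a string never removes itself.
            if i != j and alive[j] and needle in haystack:
                alive[j] = False
    # The input order is preserved because nothing is sorted or removed.
    return [s for s, keep in zip(strings, alive) if keep]


print(remove_containing(["abc a", "hello", "abc", "hello z"]))
```

Because the input list is never mutated, the loop indices stay valid for the whole traversal, which is the shrinking-list problem the question describes.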

1

One approach would be to:

  • create a list of sets of words (by splitting each string on spaces)
  • sort the list so that the smallest sets come first
  • rebuild a list, adding a set only if it shares no word with the sets already added

Like this:

lst = ["abc a","hello","abc","hello z"]

words = sorted([set(x.split()) for x in lst],key=len)

result = []
for l in words:
    if not result or all(l.isdisjoint(x) for x in result):
        result.append(l)

print(result)

prints the list of sets:

[{'hello'}, {'abc'}]

This approach loses the order of the words but won't have issues with word delimiters. The substring approach would look like this:

lst = ["abc a","hello","abc","hello z"]

words = sorted(lst,key=len)

result = []
for l in words:
    if not result or all(x not in l for x in result):
        result.append(l)

print(result)

prints:

['abc', 'hello']

(This approach can be problematic with word delimiters, but the all() condition can easily be adapted with a split to avoid this.) For example, a condition like:

if not result or all(set(x.split()).isdisjoint(l.split()) for x in result):

would turn:

lst = ["abc a","hello","abc","abcd","hello z"]

into

['abc', 'abcd', 'hello']
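
Put together, the whole-word variant might look like the sketch below. The function name `filter_by_words` is hypothetical, and the sketch assumes words are separated by whitespace only.

```python
def filter_by_words(strings):
    """Keep strings that share no whole word with a shorter kept string.

    Hypothetical helper name; shortest strings are considered first.
    """
    result = []
    for candidate in sorted(strings, key=len):
        candidate_words = set(candidate.split())
        # all() over an empty result is True, so the first string is always kept.
        if all(candidate_words.isdisjoint(kept.split()) for kept in result):
            result.append(candidate)
    return result


lst = ["abc a", "hello", "abc", "abcd", "hello z"]
print(filter_by_words(lst))
```

Comparing sets of words rather than raw substrings is what lets "abcd" survive next to "abc".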

3 Comments

Why could approach 2 be problematic?
If you use ["abc a","hello","abc","abcd","hello z"] as input, abcd is filtered out. Maybe not what you want.
That's what I thought. I first tried adding spaces in mine to identify 'abc' as a block, but it was not working.
