I'm trying to learn Python by developing some CLI tools for my job.

I have two lists of domains: "deduplicated" holds the full domains I've loaded from a text file, and "poison" contains strings that partially match some of those domains.

deduplicated = ['facebook.com','google.com','en.wikipedia.org','youtube.com','it.wikipedia.org']

poison = ['youtube','wikipedia']

I'm trying to match the strings in "poison" against the domains to obtain two new lists: "clean" (the domains not matched by the poison list) and "dirty" (the domains that were partially matched).

This is my attempt but it's not working...

clean = []

dirty = []

for item in deduplicated:
    if (any(poison in word for word in deduplicated)):
    print ("useless domain %s" % item)
    dirty.append(item)
else:
    print ("nice domain %s" % item)
    clean.append(item)

Update:

Edited the code because the formatting was ugly.

For future reference, the error I was getting was:

TypeError: 'in <string>' requires string as left operand, not list

  • What is your output and what is your desired output? Commented Aug 29, 2015 at 21:47
  • @mikeb I've updated the question with the error I was getting. Commented Aug 29, 2015 at 22:13

2 Answers

Since the outer loop already loops over deduplicated, you need the inner loop to loop over poison:

if any(search in item for search in poison):
    print("Useless domain", item)
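
For context, here is a minimal sketch of the whole loop with that check in place. It reuses the lists and messages from the question; the else branch is filled in on the assumption that it should stay as the asker wrote it.

```python
deduplicated = ['facebook.com', 'google.com', 'en.wikipedia.org',
                'youtube.com', 'it.wikipedia.org']
poison = ['youtube', 'wikipedia']

clean = []
dirty = []

for item in deduplicated:
    # True if at least one poison string is a substring of this domain
    if any(search in item for search in poison):
        print("Useless domain", item)
        dirty.append(item)
    else:
        print("Nice domain", item)
        clean.append(item)
```

Both lists are filled in a single pass, and each keeps the original order of deduplicated.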
If I got you correctly, what you wanted to do was:

dirty = [word for word in deduplicated if any(unwanted in word for unwanted in poison)]
clean = [word for word in deduplicated if word not in dirty]

print(dirty) # => ['en.wikipedia.org', 'youtube.com', 'it.wikipedia.org']
print(clean) # => ['facebook.com', 'google.com']

There are two main problems with your code right now:

  • You iterate over the items, but the check never uses item: it tests poison in word, which puts the whole poison list on the left of the in operator
  • Your indentation is messed up. Python is sensitive to whitespace
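
The first point is what triggers the TypeError from the question. A small sketch of why, using a single sample domain (the names error and matches are only illustrative):

```python
poison = ['youtube', 'wikipedia']
word = 'youtube.com'

try:
    poison in word  # a list on the left of `in`, a str on the right
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# Testing each poison string on its own is a valid str-in-str check
matches = [unwanted for unwanted in poison if unwanted in word]
print(matches)
```

When the right-hand side of in is a string, the left-hand side has to be a string too, which is why the check needs to loop over poison rather than pass the list itself.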

1 Comment

You are right about the iteration! The indentation was my inability to properly format the code in the Stack Overflow editor! I'm using Sublime Text and it manages the indentation quite well.
