I'm trying to learn Python by developing some CLI tools for my job.
I have two lists of domains: one, "deduplicated", holds the full domains I've loaded from a text file; the other, "poison", contains strings that partially match some of those domains.
deduplicated = ['facebook.com','google.com','en.wikipedia.org','youtube.com','it.wikipedia.org']
poison = ['youtube','wikipedia']
I'm trying to match the "poison" list of strings against the domains in order to obtain two new lists: one "clean" (the domains which are not matched by the poison list) and one "dirty" (the domains which have been partially matched).
This is my attempt but it's not working...
clean = []
dirty = []
for item in deduplicated:
    if (any(poison in word for word in deduplicated)):
        print ("useless domain %s" % item)
        dirty.append(item)
    else:
        print ("nice domain %s" % item)
        clean.append(item)
Update:
Edited the code because the formatting was ugly.
For future reference, the error I was getting was:
TypeError: 'in <string>' requires string as left operand, not list
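The TypeError above seems to come from `poison in word`: there `poison` is the whole list, not one of its strings, and the generator loops over every domain instead of testing the current `item`. Here is a minimal sketch of one possible fix, assuming plain substring matching is what's wanted. The function name `split_domains` is hypothetical and not part of the original code.

```python
def split_domains(domains, poison):
    """Split domains into (clean, dirty) by substring match against poison."""
    clean, dirty = [], []
    for domain in domains:
        # Test each individual poison string against the current domain only.
        if any(p in domain for p in poison):
            dirty.append(domain)
        else:
            clean.append(domain)
    return clean, dirty


# Sample data taken from the lists shown earlier in the post.
deduplicated = ['facebook.com', 'google.com', 'en.wikipedia.org',
                'youtube.com', 'it.wikipedia.org']
poison = ['youtube', 'wikipedia']

clean, dirty = split_domains(deduplicated, poison)
print(clean)  # ['facebook.com', 'google.com']
print(dirty)  # ['en.wikipedia.org', 'youtube.com', 'it.wikipedia.org']
```

Note that a substring test is deliberately loose: 'youtube' would also flag a domain like 'notyoutube.com', so a stricter comparison may be needed if that matters.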