I have a CSV file which has the following content:
Apple,Bat
Apple,Cat
Apple,Dry
Apple,East
Apple,Fun
Apple,Gravy
Apple,Hot
Bat,Cat
Bat,Dry
Bat,Fun
...
I also have a list as follows:
to_remove = ['Fun', 'Gravy', ...]
I would like an efficient way to delete every line of the CSV file that contains any of the words in to_remove.
One way I know is to read the CSV file line by line, loop through to_remove to check whether any of its words appears in the line, and write the line to another file if nothing matched.
However, both the CSV file and the to_remove list have many entries (approximately 21,000 and 300 respectively), so I want an efficient way of doing this in Python.
I do not have access to a cluster, so map-reduce based approaches are not an option.
grep -Ev '(Fun|Gravy)' filename
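For a Python version, one possible sketch is to turn to_remove into a set, so each field is checked with a single hash lookup instead of a loop over all 300 words. The function names filter_rows and filter_csv and the path arguments are made up for illustration, and the code assumes every word to remove appears as a whole comma-separated field. Note that this differs from the grep pattern above, which also drops lines where a word only appears as a substring (for example "Funny").

```python
import csv


def filter_rows(rows, to_remove):
    """Yield only the rows in which no field is one of the words in to_remove."""
    blocked = set(to_remove)  # set membership tests are O(1) on average
    for row in rows:
        # isdisjoint is True when the row shares no field with the blocked set
        if blocked.isdisjoint(row):
            yield row


def filter_csv(src_path, dst_path, to_remove):
    """Copy src_path to dst_path, skipping rows that contain a blocked word.

    src_path and dst_path are hypothetical file names supplied by the caller.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(filter_rows(csv.reader(src), to_remove))
```

Usage would look like filter_csv("input.csv", "output.csv", to_remove). With about 21,000 short lines this should be a single quick pass over the file, though I have not timed it on the real data.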