If I have a collection of strings, is there a data structure or function that could speed up checking whether any element of the collection is a substring of my main string?
Right now I'm looping through my array of strings and using the in operator. Is there a faster way?
import timing
## string match in first do_not_scan
## 0:00:00.029332
## string not in do_not_scan
## 0:00:00.035179
def check_if_substring():
    for x in do_not_scan:
        if x in string:
            return True
    return False
## string match in first do_not_scan
## 0:00:00.046530
## string not in do_not_scan
## 0:00:00.067439
def index_of():
    for x in do_not_scan:
        try:
            string.index(x)
            return True
        except ValueError:
            # index() raises ValueError when x is absent; keep checking the rest
            continue
    return False
## string match in first do_not_scan
## 0:00:00.047654
## string not in do_not_scan
## 0:00:00.070596
def find_def():
    for x in do_not_scan:
        if string.find(x) != -1:
            return True
    return False
string = '/usr/documents/apps/components/login'
do_not_scan = ['node_modules', 'bower_components']
for x in range(100000):
    find_def()
    index_of()
    check_if_substring()
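The `timing` module imported above is not part of the standard library, and the loop as shown calls all three functions in one pass, which makes it hard to attribute a timing to a single approach. Here is a minimal sketch of measuring each one separately with the standard `timeit` module. The `benchmark` helper and the `(haystack, needles)` signatures are illustrative choices, not from the original post, and `index_of` is written so that it checks every entry before returning `False`.

```python
import timeit

string = '/usr/documents/apps/components/login'
do_not_scan = ['node_modules', 'bower_components']


def check_if_substring(haystack, needles):
    # Membership test with the `in` operator.
    for needle in needles:
        if needle in haystack:
            return True
    return False


def index_of(haystack, needles):
    # str.index raises ValueError when the needle is absent.
    for needle in needles:
        try:
            haystack.index(needle)
            return True
        except ValueError:
            continue
    return False


def find_def(haystack, needles):
    # str.find returns -1 when the needle is absent.
    for needle in needles:
        if haystack.find(needle) != -1:
            return True
    return False


def benchmark(number=100000):
    """Hypothetical helper: time each candidate on its own.

    Returns a dict mapping function name to total seconds for `number` calls.
    """
    results = {}
    for func in (check_if_substring, index_of, find_def):
        results[func.__name__] = timeit.timeit(
            lambda: func(string, do_not_scan), number=number
        )
    return results


for name, seconds in benchmark(number=10000).items():
    print(f"{name}: {seconds:.6f}s")
```

Absolute numbers will vary by machine and Python version, so only the relative ordering on your own setup is meaningful.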
Is `string = 'a'` just a sample? Because `node_modules` will never be in `string`. With that said, can you use a map, where the keys are the items of `do_not_scan`? Then the search is O(1).

`string` may not contain any elements of `do_not_scan`. I've never used a map before. How would you go about doing that?

`grep -l -Ff collections_of_strings main_string`? Here the `collections_of_strings` file contains the collection of strings (one per line) and the `main_string` file contains the main string (as is).
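On the map suggestion: a hash-based lookup (a `dict` or `set` in Python) is O(1) only for exact matches, not for substrings. It can still apply here if every entry in `do_not_scan` is meant to match a whole path component, which is an assumption about the use case and not something the question states. For true substring matching, one alternative is a single precompiled regular expression. The sketch below shows both; the helper names `has_blocked_component` and `has_blocked_substring` are made up for illustration.

```python
import re

do_not_scan = ['node_modules', 'bower_components']

# Option 1: set lookup. Only valid if an entry must equal a whole
# '/'-separated path component (an assumption about the use case).
DO_NOT_SCAN_SET = frozenset(do_not_scan)


def has_blocked_component(path):
    """Hypothetical helper: True if any path component is in the set."""
    return not DO_NOT_SCAN_SET.isdisjoint(path.split('/'))


# Option 2: regex alternation. Matches an entry anywhere in the text.
# re.escape keeps any special characters in the entries literal.
DO_NOT_SCAN_RE = re.compile('|'.join(map(re.escape, do_not_scan)))


def has_blocked_substring(text):
    """Hypothetical helper: True if any entry occurs anywhere in text."""
    return DO_NOT_SCAN_RE.search(text) is not None
```

The two differ on inputs such as `/app/my_node_modules_backup/x`: the set version treats it as clean because no component equals `node_modules` exactly, while the regex version flags it. With only two short entries, the plain `in` loop may well remain the fastest option, so it is worth measuring before switching.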