I have a list of filenames called data. I want to read each file and check whether a given text (for example, "orange") appears in it. The filenames are appended to the list in sequential order: if the text "orange" first appears in pi.txt (index 2), it is also present in every file after index 2. I want to get the index or filename where "orange" first appears.
The list holds more than a thousand files, so I want to use binary search.
data = ['ae.txt', 'ac.txt', 'pi.txt', 'ad.txt', 'mm.txt', 'ab.txt']
target = "orange"
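def binary_search(a, x):
    """Return the index of the first file in a that contains x, or -1."""
    lo = 0
    hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        # Read each probed file once and close it promptly.
        with open(a[mid]) as f:
            found = x in f.read()
        if found:
            hi = mid      # first match is at mid or earlier
        else:
            lo = mid + 1  # first match, if any, is after mid
    # lo == hi is now the first matching index, or len(a) if there is none.
    return lo if lo < len(a) else -1
print(binary_search(data, target))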
$ cat ae.txt
papaya
guava
$ cat ac.txt
mango
durian
papaya
guava
$ cat pi.txt
orange
papaya
guava
$ cat ad.txt
orange
papaya
guava
$ cat mm.txt
orange
papaya
guava
$ cat ab.txt
orange
papaya
guava
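Binary search is only valid if the files really do follow the "once present, always present" ordering. The sketch below recreates the sample files listed above in a temporary directory and checks that ordering with a plain linear scan. The helper names `hit_flags` and `is_searchable` are hypothetical, and the check reads every file, so it is meant as a one-off sanity test rather than part of the search itself.

```python
import tempfile
from pathlib import Path

# Sample contents copied from the listings above.
SAMPLES = {
    "ae.txt": "papaya\nguava\n",
    "ac.txt": "mango\ndurian\npapaya\nguava\n",
    "pi.txt": "orange\npapaya\nguava\n",
    "ad.txt": "orange\npapaya\nguava\n",
    "mm.txt": "orange\npapaya\nguava\n",
    "ab.txt": "orange\npapaya\nguava\n",
}
ORDER = ["ae.txt", "ac.txt", "pi.txt", "ad.txt", "mm.txt", "ab.txt"]


def hit_flags(paths, text):
    """Linear scan: one True/False per file, in list order."""
    return [text in Path(p).read_text() for p in paths]


def is_searchable(flags):
    """True if flags look like False..., True... (no True followed by False)."""
    return flags == sorted(flags)


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name in ORDER:
        p = Path(tmp) / name
        p.write_text(SAMPLES[name])
        paths.append(p)
    flags = hit_flags(paths, "orange")

print(flags)                 # [False, False, True, True, True, True]
print(is_searchable(flags))  # True
```

The first True sits at index 2, which matches pi.txt in the question.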