I have implemented the naive string-search algorithm below ('find'). It's as simple as it can get. However, I found another version on GeeksforGeeks ('search') that looked to have better complexity. When I tested both on large strings, the results were drastically different, and the opposite of what I expected.
1st: Slice the string into pattern-length pieces and compare. Move the slice forward one character and compare again. What should be the complexity of this?
def find(pat, txt):
    size = len(pat)
    for i in range(len(txt) - size + 1):
        if txt[i : i + size] == pat:
            print 'Pattern found at index %s' % (i)
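(For anyone on Python 3, where print is a function, here is a minimal sketch of the same slice-and-compare approach; find_indices is a name I've introduced, and it returns the match positions instead of printing them:)

```python
def find_indices(pat, txt):
    # Naive search: slide a pattern-sized window and compare it as one slice.
    size = len(pat)
    return [i for i in range(len(txt) - size + 1)
            if txt[i:i + size] == pat]

# Example: "aba" occurs at indices 0 and 2 in "ababa" (overlaps included).
print(find_indices("aba", "ababa"))  # → [0, 2]
```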
2nd: Compare character by character. If a character doesn't match, break; otherwise continue. At the end, if all characters matched, print the result. Then move forward one character. What should be the complexity of this?
def search(pat, txt):
    M = len(pat)
    N = len(txt)
    for i in xrange(N - M + 1):
        status = 1
        for j in xrange(M):
            if txt[i + j] != pat[j]:
                status = 0
                break
        if j == M - 1 and status != 0:
            print "Pattern found at index " + str(i)
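(As an aside, the status flag can be avoided with Python's for/else clause, whose else branch runs only when the inner loop finishes without a break. A Python 3 sketch of the same character-by-character search, with search_indices being a name I've introduced:)

```python
def search_indices(pat, txt):
    # Naive search comparing one character at a time, no status flag needed.
    M, N = len(pat), len(txt)
    hits = []
    for i in range(N - M + 1):
        for j in range(M):
            if txt[i + j] != pat[j]:
                break          # mismatch: slide the window forward by one
        else:
            hits.append(i)     # inner loop completed: full match at i
    return hits

print(search_indices("aba", "ababa"))  # → [0, 2]
```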
Timing test cases:
testString = ''.join([ 'a' for _ in range(1000*100)] ) + 'b'
testPattern = ''.join([ 'a' for _ in range(100*100) ]) + 'b'
import cProfile
cProfile.run('find(testPattern, testString)')
cProfile.run('search(testPattern, testString)')
For find:

    Pattern found at index 90000
    90007 function calls in 0.160 seconds
For search:

    Pattern found at index 90000
    5 function calls in 135.951 seconds
In my algorithm find I do a slice and a comparison. The time complexity of slicing is O(k), and the comparison should likewise take another O(k), though I'm not sure.
(Reference: Python Time Complexity)
Whereas in search the inner loop runs at most k times per position. So shouldn't search have the better time complexity?
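(One way to probe this empirically, just as a measurement sketch rather than an answer, is to time a single window comparison both ways with timeit. The setup below mirrors the test case, aligned on the true match at index 90000 so both versions do the full k character comparisons; all names here are my own:)

```python
import timeit

# txt and pat mirror the test case above; i = 90000 is the true match position.
setup = "txt = 'a' * 100000 + 'b'; pat = 'a' * 10000 + 'b'; i = 90000"

# One slice-and-compare window check: the slice and the == both run in C.
slice_time = timeit.timeit("txt[i:i + len(pat)] == pat",
                           setup=setup, number=1000)

# One character-by-character window check: a Python-level loop per character.
loop_stmt = """
for j in range(len(pat)):
    if txt[i + j] != pat[j]:
        break
"""
loop_time = timeit.timeit(loop_stmt, setup=setup, number=1000)

print(slice_time, loop_time)
```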