Certainly the second: I don't see any gain in doing many searches in small strings instead of one search in a big string. You may skip some characters thanks to the shorter lines, but the split operation has its costs too (searching for '\n', creating n different strings, building the list), and the loop over the lines is done in Python.
The string __contains__ method is implemented in C and is therefore noticeably faster.
Also consider that the second method aborts as soon as the first match is found, whereas the first one splits the whole string before even starting to search inside it.
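For reference, here is a minimal sketch of the two approaches under discussion (the function names are illustrative, not taken from the original question):

# First method: split the text into lines and search each one in a
# Python-level loop; the whole string is split before any search runs.
def search_in_lines(text, needle):
    for line in text.split('\n'):
        if needle in line:
            return True
    return False

# Second method: a single containment test on the whole string; the
# scan runs entirely in C and stops at the first match.
def search_in_big_string(text, needle):
    return needle in text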
This is easily demonstrated with a simple benchmark:
import timeit

# Setup: load the whole file into one big string.
prepare = """
with open('bible.txt') as fh:
    text = fh.read()
"""

# Setup: load the file and split it into lines ahead of time.
presplit_prepare = """
with open('bible.txt') as fh:
    text = fh.read()
lines = text.split('\\n')
"""

# One containment test on the whole string.
longsearch = """
'hello' in text
"""

# Split on every run, then search line by line.
splitsearch = """
for line in text.split('\\n'):
    if 'hello' in line:
        break
"""

# Search line by line on the pre-split list.
presplitsearch = """
for line in lines:
    if 'hello' in line:
        break
"""

benchmark = timeit.Timer(longsearch, prepare)
print "IN on big string takes:", benchmark.timeit(1000), "seconds"

benchmark = timeit.Timer(splitsearch, prepare)
print "IN on split string takes:", benchmark.timeit(1000), "seconds"

benchmark = timeit.Timer(presplitsearch, presplit_prepare)
print "IN on pre-split string takes:", benchmark.timeit(1000), "seconds"
The results are:
IN on big string takes: 4.27126097679 seconds
IN on split string takes: 35.9622690678 seconds
IN on pre-split string takes: 11.815297842 seconds
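Note that even with the split cost paid in advance, the pre-split loop is still almost three times slower than the single containment test; that difference is the Python-level loop overhead alone. If you actually need to know which line matched, one way to at least avoid building the whole list up front is to iterate the file lazily; a rough sketch, not benchmarked above:

# Read the file line by line: no big list is materialized, and the
# loop stops as soon as the first matching line is found.
def find_matching_line(path, needle):
    with open(path) as fh:
        for line in fh:
            if needle in line:
                return line
    return None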
The bible.txt file is actually the Bible; I found it here: http://patriot.net/~bmcgin/kjvpage.html (text version).