Use case
A simple function which checks if a specific string is in another string at a position which is a multiple of 3 (see here for a real world example, finding stop codons in a DNA sequence).
Functions
sliding_window: takes a string of length 3 compares it to the search string, if they are identical moves 3 characters forward.
incremental_start: tries to find the search string, if the found position is not a multiple of 3, it tries to find the next position after the found position.
Please note: The sample data is just to make sure that each function has to go through the complete string, the performance is similar with real data or random data.
Results
- Python 2.7: The initial
sliding_windowfunction could be improved by a factor of ~39 by using the functionincremental_startin Python2.7 on Windows 10. There was slight drop in the performance improvement on Ubuntu, ~34x, ~37x, ~18x (VM, AWS, native), but still in the same range. - Python 3.4:
sliding_windowbecame slower than in Python2.7 (1.8x on Windows, 1.4x resp. 1.5x on all Ubuntus), but theincremental_startperformance dropped on all Ubuntus by a factor of 4, 5, 1.7 (VM, AWS, native), while it hardly changed on Windows. - Windows vs Ubuntu
Python2.7: virtualized Ubuntus needed less time for both functions(~20-30%), native Ubuntu was about 25% slower for theincremental_start, whilesliding_windowwas 40% faster.
Python3: thesliding_windowfunction needed less time to finish (~50%), while theincremental_startbecame slower by a factor of ~2-3.
Questions
- What causes the performance difference in Python 2 vs. Python 3 on Linux vs. Windows?
- How is it possible to anticipate such a behavior and adjust the code for optimal performance?
Code
import timeit
text = 'ATG' * 10**6
word = 'TAG'
def sliding_window(text, word):
for pos in range(0, len(text), 3):
if text[pos:pos + 3] == word:
return False
return True
def incremental_start(text, word):
start = 0
while start != -1:
start = text.find(word, start + 1)
if start % 3 == 0:
return False
return True
#sliding window
time = timeit.Timer(lambda: sliding_window(text, word), setup='from __main__ import text, word').timeit(number=10)
print('%3.3f' % time)
#incremental start
time = timeit.Timer(lambda: incremental_start(text, word), setup='from __main__ import text, word').timeit(number=500)
print('%3.3f' % time)
Tables
Ubuntu vs Windows VM AWS Native
Python2.7-Increment 79% 73% 126%
Python2.7-Sliding 70% 70% 60%
Python3.4-Increment 307% 346% 201%
Python3.4-Sliding 54% 59% 48%
Py2 vs 3 Windows VM AWS Native
Increment 105% 409% 501% 168%
Sliding 184% 143% 155% 147%
Absolute times in seconds
Win10 Ubuntu AWS Native
Py2.7-Increment 1.759 1.391 1.279 2.215
Py2.7-Sliding 1.361 0.955 0.958 0.823
Py3.4-Increment 1.853 5.692 6.406 3.722
Py3.4-Sliding 2.507 1.365 1.482 1.214
Details
Windows 10: Native Windows, 32bit Python 3.4.3 or 2.7.9, i5-2500, 16GB RAM
Ubuntu virtual machine: 14.04, Running on the Windows host, 64bit Python 3.4.3, Python 2.7.6, 4 cores, 4GB RAM
AWS: 14.04, AWS micro instance, 64bit Python 3.4.3, Python 2.7.6
Native Ubuntu: 14.04, 64bit Python 3.4.3, Python 2.7.6, i5-2500, 16GB ram [identical to Win10 machine]
Update
As suggested by Ingaz xrange and bytes were used, slight improvement in performance but still massive drop in performance on Ubuntu with Python3.4. The culprit seems to be find which is much slower when Ubuntu and Py3.4 are combined (same with Py3.5 which compiled was from source). This seems to Linux flavor dependent, on Debian Py2.7 and Py3.4 performed identical, on RedHat Py2.7 was considerably faster than Py3.4.
For better comparison Py3.4 is now used in 64bit on Windows10 and Ubuntu. Py27 is still used on Win10.
import timeit, sys
if sys.version_info >= (3,0):
from builtins import range as xrange
def sliding_window(text, word):
for pos in range(0, len(text), 3):
if text[pos:pos + 3] == word:
return False
return True
def xsliding_window(text, word):
for pos in xrange(0, len(text), 3):
if text[pos:pos + 3] == word:
return False
return True
def incremental_start(text, word):
start = 0
while start != -1:
start = text.find(word, start + 1)
if start % 3 == 0:
return False
return True
text = 'aaa' * 10**6
word = 'aaA'
byte_text = b'aaa' * 10**6
byte_word = b'aaA'
time = timeit.Timer(lambda: sliding_window(text, word), setup='from __main__ import text, word').timeit(number=10)
print('Sliding, regular: %3.3f' % time)
time = timeit.Timer(lambda: incremental_start(text, word), setup='from __main__ import text, word').timeit(number=500)
print('Incremental, regular: %3.3f' % time)
time = timeit.Timer(lambda: sliding_window(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=10)
print('Sliding, byte string: %3.3f' % time)
time = timeit.Timer(lambda: incremental_start(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=500)
print('Incremental, bytes: %3.3f' % time)
time = timeit.Timer(lambda: xsliding_window(byte_text, byte_word), setup='from __main__ import byte_text, byte_word').timeit(number=10)
print('Sliding, xrange&bytes: %3.3f' % time)
time = timeit.Timer(lambda: text.find(word), setup='from __main__ import text, word').timeit(number=1000)
print('simple find in string: %3.3f' % time)
Win10-py27 Wi10-py35 VM-py27 VM-py34
1.440 2.674 0.993 1.368
1.864 1.425 1.436 5.711
1.439 2.388 1.048 1.219
1.887 1.405 1.429 5.750
1.332 2.356 0.772 1.224
3.756 2.811 2.818 11.361