3

In the code below concatenation is the bottleneck. As you can see i've tried some sophisticated methods to speed this up, but its bloody slow anyway. I would like to know if there is anything i can do to make it faste.

BTW both plain and secret are data read from binary file and they are quite big (around 1mb)

x = b''
if len(plain) < len(secret*8):
    return False
i = 0

for secByte in secret:
    for j in range(8):
        z = setBit(plain[i],0,getBit(secByte,j))
        #x += bytes([z])
        x = x.join([b"", bytes([z])])
        #x = array.array("B",(int(z) for z in x.join([b"", bytes([z])]))).tostring()
        i = i+1
3

2 Answers 2

7

Python's lists have O(1) append, at least in the amortized sense. Instead of doing the join in the innermost list, you could build one big list and then join them at the end. This will turn your algorithm from O(N^2) to O(N). It's tough to give you working code without knowing exactly what your setBit() and getBit() functions are doing, but something like:

L = []
for secByte in secret:
    for j in range(8):
         z = ...
         L.append(z)
x = b"".join(L)
Sign up to request clarification or add additional context in comments.

6 Comments

That has not been true since Python 2.3. Time some code with concatenation and with joins.
@nate c, Python strings are an array of contiguous bytes and associated length. His code with x.join(...) creates a new, slightly longer string every time before it deallocates the old string when he assigns to it. This is O(N^2) behavior. You're wrong, and you shouldn't have modded me down.
The suggestion that one build strings as lists and only join them as necessary (preferably after the entire list is built) is still valid so far as I know. It's been a recommended Python practice for a very long time.
@Jim: I'm not saying its not. I'm saying Big O analysis of Strings is incorrect.
@xscott: Heres the discussion on the patch -- we are not talking about 2x or even 10x faster but roughly Nx faster where N is the size of the input data set. mail.python.org/pipermail/python-dev/2004-August/046686.html
|
5

I don't think you should be using string concatenation at all for this. It would be better to create a mutable bytearray of the full size of the final data and then set each byte. This is very much O(N) and for what you are doing using a bytearray is much more natural than string manipulation:

x = bytearray(len(secret)*8)   # creates an array of zero bytes
i = 0
for secByte in secret:
    for j in range(8):
        x[i] = setBit(plain[i], 0, getBit(secByte, j))
        i += 1

1 Comment

tyvm, it's exactly what i needed (it works like 100x faster) xscott solution is propably as fast as this, but this one is a litle better suiting to my problem

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.