4

I have an array of byte-strings in Python 3 (they're audio chunks). I want to make one big byte-string out of them. A simple implementation is kind of slow. How can I do it better?

chunks = []
while not audio.ends():
    chunks.append(bytes(audio.next_buffer()))
    do_some_chunk_processing()

all_audio = b''
for ch in chunks:
    all_audio += ch

How to do it faster?

2
  • Do you mean to run the processing on each loop? Commented Mar 4, 2021 at 13:40
  • Are you sure that piecing together the chunks is what's taking the time? Your main while loop looks like it has the potential of being very slow. Commented Mar 4, 2021 at 13:46

3 Answers

8

Use bytearray()

from time import time

c = b'\x02\x03\x05\x07' * 500 # test data

# Method-1 with bytes-string

bytes_string = b''

st = time()
for _ in range(10**4):
    bytes_string += c

print("string concat -> took {} sec".format(time()-st))

# Method-2 with bytes-array

bytes_arr = bytearray()

st = time()
for _ in range(10**4):
    bytes_arr.extend(c)
# convert byte_arr to bytes_string via
bytes_string = bytes(bytes_arr)

print("bytearray extend/concat -> took {} sec".format(time()-st))

The benchmark on my Win10 / Core i7 (7th gen) machine shows:

string concat -> took 67.28 sec
bytearray extend/concat -> took 0.089 sec

The code is pretty self-explanatory: instead of string += next_block, use bytearray.extend(next_block). After building the bytearray, you can use bytes(bytearray) to get the byte-string. Repeated += on an immutable bytes object copies the whole accumulated result every iteration (quadratic time overall), while a bytearray grows in place with amortized constant-time appends.
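Applied to the question's loop shape, a minimal sketch (using a list of synthetic chunks as a stand-in for the question's audio object, which isn't available here):

```python
# Build the audio incrementally into a mutable bytearray, convert once at the end.
# simulated_buffers stands in for the question's audio.next_buffer() calls.
simulated_buffers = [b'\x02\x03\x05\x07' * 4 for _ in range(1000)]

buf = bytearray()
for chunk in simulated_buffers:   # stands in for "while not audio.ends()"
    buf.extend(chunk)             # amortized O(1) growth, no full copy per chunk
    # ... per-chunk processing would go here ...

all_audio = bytes(buf)            # one final copy to an immutable bytes object
print(len(all_audio))             # 16000
```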


2 Comments

Finally a fast solution, I was adding >50,000 chunks of bytes on the fly and I got a 140x speed up by using bytearray.
This was also much faster for me than b''.join() - but I too was adding many chunks on the fly (not finding all chunks and then concatenating them at the end). My script run time went from 393s to 13s, with the bulk of the time shifting from a[i] = a[i] + b to a regex elsewhere in the code.
5

One approach you could try and measure would be to use bytes.join:

all_audio = b''.join(chunks)

The reason this might be faster is that this does a pre-pass over the chunks to find out how big all_audio needs to be, allocates exactly the right size once, then concatenates it in one go.
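As a quick sanity check (with synthetic chunks, since the question's audio source isn't available), join produces the same bytes as repeated concatenation, in a single pass:

```python
# b''.join() sizes the result once, then copies each chunk in.
chunks = [b'\x00\x01' * 256 for _ in range(100)]

joined = b''.join(chunks)
print(len(joined))   # 51200 bytes (512 per chunk x 100 chunks)

# The question's quadratic += loop gives the same bytes, just slower at scale:
slow = b''
for ch in chunks:
    slow += ch
assert slow == joined
```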



0

One approach is to use an f-string:

all_audio = b''
for ch in chunks:
    all_audio = f'{all_audio}{ch}'

This seems to be faster for small strings, according to this comparison. Note, however, that f-strings produce str, not bytes: formatting a bytes object embeds its textual repr, so this approach does not actually work for the byte-strings in the question.
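A caution worth verifying before using this for audio data: f-strings format values via str(), so applying one to a bytes chunk embeds the textual repr rather than the raw bytes:

```python
chunk = b'\x02\x03'
text = f'{chunk}'

# The result is a str containing the repr of the bytes, not the audio data:
print(text)                   # b'\x02\x03'
print(isinstance(text, str))  # True -- not bytes, unusable as audio
```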

