I've got a binary string like '1100011101'. I would like to split it into a list where each run of consecutive 1s or 0s is a separate element.

For example, '1100011101' becomes ['11', '000', '111', '0', '1'].

4 Answers

3

You can squeeze a (minor) bit of extra performance out of this by using a regex instead of groupby() + join(). The pattern simply matches each run of one or more 0s or one or more 1s:

import re

s = '1100011101'
l = re.findall(r"0+|1+", s)
# ['11', '000', '111', '0', '1']
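
If the string might contain characters other than 0 and 1 (an assumption on my part, since the question only mentions binary strings), a backreference pattern should generalize the same idea. This is only a sketch; split_runs is a hypothetical helper name, not something from the answer above:

```python
import re

def split_runs(s):
    # (.) captures one character; \1* then matches any further repeats of
    # that same character. re.DOTALL lets "." match newlines as well.
    return [m.group(0) for m in re.finditer(r"(.)\1*", s, flags=re.DOTALL)]

print(split_runs('1100011101'))  # ['11', '000', '111', '0', '1']
print(split_runs('aabccc'))      # ['aa', 'b', 'ccc']
```

re.finditer is used rather than re.findall because findall would return only the captured group (one character per run) instead of the whole match.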

Timings:

from itertools import groupby

s = '1100011101' * 1000

%timeit l = [''.join(g) for _, g in groupby(s)]
# 1.16 ms ± 9.79 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit re.findall(r"0+|1+", s)
# 723 µs ± 5.32 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
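
%timeit only works in IPython/Jupyter. To repeat the comparison in a plain script, something like the following should work with the standard timeit module. The helper names with_groupby and with_regex and the repeat counts are my own choices, and the numbers will differ from machine to machine:

```python
import re
import timeit
from itertools import groupby

s = '1100011101' * 1000

def with_groupby(s):
    return [''.join(g) for _, g in groupby(s)]

def with_regex(s):
    return re.findall(r"0+|1+", s)

for fn in (with_groupby, with_regex):
    # Take the best of 3 repeats of 20 calls each and report the time per call.
    best = min(timeit.repeat(lambda: fn(s), number=20, repeat=3)) / 20
    print(f"{fn.__name__}: {best * 1e6:.0f} µs per call")
```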

2

Use itertools.groupby:

from itertools import groupby

binary = "1100011101"

result = ["".join(repeat) for _, repeat in groupby(binary)]
print(result)

Output

['11', '000', '111', '0', '1']

2

Insert a space at every 01 and 10 transition using .replace(), then split the resulting string on whitespace:

'1100011101'.replace("01","0 1").replace("10","1 0").split()

['11', '000', '111', '0', '1']
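
As a sanity check, here is a small sketch that compares the groupby, regex and replace/split approaches on every binary string up to length 8, including the empty string. The wrapper names by_groupby, by_regex and by_replace are just for illustration:

```python
import re
from itertools import groupby, product

def by_groupby(s):
    return [''.join(g) for _, g in groupby(s)]

def by_regex(s):
    return re.findall(r"0+|1+", s)

def by_replace(s):
    return s.replace("01", "0 1").replace("10", "1 0").split()

# Every string over the alphabet {0, 1} with length 0 through 8.
for n in range(9):
    for bits in product('01', repeat=n):
        s = ''.join(bits)
        assert by_groupby(s) == by_regex(s) == by_replace(s), s

print("all three approaches agree")
```

They only agree on strings made up purely of 0 and 1. With other characters in the input, the regex skips them and split() also breaks on any existing whitespace, while groupby keeps every character.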

1

With groupby:

>>> from itertools import groupby as f
>>> x = str(1100011101)
>>> sol = [''.join(v) for k, v in f(x)]
>>> print(sol)
['11', '000', '111', '0', '1']

Without groupby, using a plain loop, which was slightly faster in the timings below:

def func(string):
    if not string:
        return []
    def get_data(string):
        if not string:
            return
        count = 0
        target = string[0]
        for i in string:
            if i == target:
                count += 1
            else:
                # the current run has ended: emit it and start the next one
                yield target * count
                count = 1
                target = i
        # emit the final run
        if count > 0:
            yield target * count
    return list(get_data(string))

x = '1100011101'
sol = func(x)
print(sol)

output

['11', '000', '111', '0', '1']
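
A variant of the same loop, as a sketch only: instead of counting and rebuilding each run with target*count, it remembers where the current run started and slices the original string. The name split_runs_by_slicing is hypothetical, and I have not timed it against the versions above:

```python
def split_runs_by_slicing(s):
    runs = []
    start = 0  # index where the current run begins
    for i in range(1, len(s)):
        if s[i] != s[i - 1]:
            # the character changed, so the run s[start:i] is complete
            runs.append(s[start:i])
            start = i
    if s:
        # the last run extends to the end of the string
        runs.append(s[start:])
    return runs

print(split_runs_by_slicing('1100011101'))  # ['11', '000', '111', '0', '1']
```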

Timings on my machine

from itertools import groupby

s = '11000111010101' * 100000

%timeit l = [''.join(g) for _, g in groupby(s)]
318 ms ± 2.75 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


import re

s = '11000111010101' * 100000

%timeit l = re.findall(r"0+|1+", s)
216 ms ± 2.01 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


# func is the same function defined above

s = '11000111010101' * 100000

%timeit func(s)
205 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
####################################################################

from itertools import groupby

s = '11000111010101' * 1000

%timeit l = [''.join(g) for _, g in groupby(s)]
3.28 ms ± 178 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


import re

s = '11000111010101' * 1000

%timeit l = re.findall(r"0+|1+", s)
2.06 ms ± 57.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


# func is the same function defined above

s = '11000111010101' * 1000

%timeit func(s)
1.91 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
