How to find null byte in a string in Python?

Question

I'm having an issue parsing data after reading a file. I'm reading in a binary file and need to create a list of attributes from it. All of the data in the file is terminated with a null byte, and I'm trying to find every instance of a null-byte-terminated attribute.

Essentially taking a string like

Health\x00experience\x00charactername\x00

and storing it in a list.

The real issue is that I need to keep the null bytes intact; I just need to be able to find each instance of a null byte and store the data that precedes it.

abarnert · Accepted Answer · 2019-01-28 20:35:32Z

11

Python doesn't treat NUL bytes as anything special; they're no different from spaces or commas. So, split() works fine:

>>> my_string = "Health\x00experience\x00charactername\x00"
>>> my_string.split('\x00')
['Health', 'experience', 'charactername', '']

Note that split is treating \x00 as a separator, not a terminator, so we get an extra empty string at the end. If that's a problem, you can just slice it off:

>>> my_string.split('\x00')[:-1]
['Health', 'experience', 'charactername']

edited Jan 28, 2019 at 20:35

user3064538

answered Sep 24, 2013 at 0:04

abarnert


2 Comments

JohnF1NK Over a year ago

I forgot to say in my initial question that I need to keep all of the null bytes in place. I just need to be able to take the input and find the null bytes. Sorry I didn't clarify that initially.

abarnert Over a year ago

@user2806298: As justhalf implies, Python's str.split method doesn't have any way to keep the separators, but it's easy to just add them back on to each one. For example: [s+'\x00' for s in my_string.split('\x00')[:-1]].
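As an alternative to splitting and re-appending, a regular expression can match each record together with its terminator. This is a sketch of a swapped-in technique rather than the method described in the comment above; it assumes the input is a bytes object, and the names RECORD and records_with_terminators are hypothetical:

```python
import re

# Each match is a run of non-NUL bytes followed by exactly one NUL terminator.
RECORD = re.compile(rb"[^\x00]*\x00")


def records_with_terminators(data: bytes) -> list:
    """Return every NUL-terminated record with its terminator still attached."""
    return RECORD.findall(data)


data = b"Health\x00experience\x00charactername\x00"
print(records_with_terminators(data))
```

Any trailing bytes that are not followed by a NUL are not matched, so an unterminated tail is silently left out of the result.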

101 · Accepted Answer · 2015-04-24 04:32:01Z

10

While it boils down to using split('\x00'), a convenience wrapper might be nice.

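def readlines(f, bufsize):
    # Yield each NUL-terminated record from a file opened in binary mode,
    # with the terminator kept on the end of every record.
    buf = b""
    while True:
        data = f.read(bufsize)
        if not data:  # read() returns b"" at end of file
            break
        buf += data
        lines = buf.split(b'\x00')
        buf = lines.pop()  # incomplete tail; finished by the next read
        for line in lines:
            yield line + b'\x00'
    if buf:
        # Leftover data that had no terminator in the file; one is appended.
        yield buf + b'\x00'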

Then you can do something like:

with open('myfile', 'rb') as f:
    mylist = list(readlines(f, 524288))

This has the added benefit of not needing to load the entire contents into memory before splitting the text.

edited Apr 24, 2015 at 4:32

101


answered Sep 24, 2013 at 0:12

kalhartt


2 Comments

JohnF1NK Over a year ago

Thanks for the help. The issue, though, is that I forgot to say in my initial question that I need to keep all of the null bytes in place. I just need to be able to take the input and find the null bytes. Sorry I didn't clarify that initially.

kalhartt Over a year ago

@user2806298 Edited to keep the nullbytes in place

kenorb · Accepted Answer · 2015-06-01 15:47:22Z

6

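To check whether the data contains a NUL byte, use the in operator, for example:

if b'\x00' in data:
    ...  # at least one NUL byte is present

To find its position, use find(), which returns the lowest index at which the substring is found, or -1 if it is absent. The optional start and end arguments restrict the search to that slice of the data, so passing the previous index plus one as start finds the next occurrence.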
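The find() approach can be sketched as a loop that keeps searching from just past the previous hit. This assumes the data is a bytes object already in memory; the helper name nul_positions is hypothetical:

```python
def nul_positions(data: bytes) -> list:
    """Return the index of every NUL byte in data, in ascending order."""
    positions = []
    start = 0
    while True:
        idx = data.find(b"\x00", start)  # -1 means there is no further NUL
        if idx == -1:
            break
        positions.append(idx)
        start = idx + 1  # resume the search just past this NUL
    return positions


data = b"Health\x00experience\x00charactername\x00"
positions = nul_positions(data)
print(positions)  # [6, 17, 31]

# Slice from the end of the previous record up to and including each NUL.
starts = [0] + [p + 1 for p in positions]
records = [data[begin:end + 1] for begin, end in zip(starts, positions)]
print(records)
```

Because each slice ends at end + 1, every record keeps its terminator, which is what the question asks for.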

answered Jun 1, 2015 at 15:47

kenorb


Tim Peters · Accepted Answer · 2019-01-28 19:42:02Z

1

Split on null bytes; .split() returns a list:

>>> print("Health\x00experience\x00charactername\x00".split("\x00"))
['Health', 'experience', 'charactername', '']

If you know the data always ends with a null byte, you can slice the list to chop off the last empty string (like result_list[:-1]).

edited Jan 28, 2019 at 19:42

user3064538

answered Sep 24, 2013 at 0:07

Tim Peters


1 Comment

JohnF1NK Over a year ago

Yeah, the extra slash was present in error. I forgot to say in my initial question that I need to keep all of the null bytes in place. I just need to be able to take the input and find the null bytes. Sorry I didn't clarify that initially.
