
Is there an easy way to work in binary with Python?

I have a file of data that I am receiving (as 1's and 0's) and would like to scan through it for certain binary patterns. It has to be done at the bit level because my system may be off by a bit or so, which would throw everything off when converting to hex or ASCII.

For example, I would like to open the file, then search for '0001101010111100110' or some string of binary and have it tell me whether or not it exists in the file, where it is, etc.

Is this doable or would I be better off working with another language?

1
  • If you don't mind expanding your data by 8x or so you could convert it to a string, then use the usual string search facilities. Commented Apr 17, 2013 at 22:28

2 Answers

4

To convert a byte string into a string of '0' and '1', you can use this one-liner:

# bytearray() yields integers on both Python 2 and Python 3, so ord() is not needed
bin_str = ''.join(bin(0x100 + b)[-8:] for b in bytearray(byte_str))

Combine that with opening and reading the file:

with open(filename, 'rb') as f:
    byte_str = f.read()

Now it's just a simple string search:

if '0001101010111100110' in bin_str:
    print('pattern found')
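
To also report where the pattern occurs, the same idea can be extended with str.find. This is only a minimal sketch: the helper names bytes_to_bits and find_all are hypothetical (they are not part of the original answer), and the three sample bytes were chosen so that the question's pattern starts 3 bits into the data, i.e. not on a byte boundary.

```python
def bytes_to_bits(byte_str):
    """Return a string of '0'/'1' characters, 8 per input byte."""
    # bytearray() yields integers on both Python 2 and Python 3
    return ''.join(format(b, '08b') for b in bytearray(byte_str))


def find_all(bin_str, pattern):
    """Return the bit offsets of every (possibly overlapping) match."""
    positions = []
    start = bin_str.find(pattern)
    while start != -1:
        positions.append(start)
        # resume one bit later so overlapping matches are not skipped
        start = bin_str.find(pattern, start + 1)
    return positions


# Hypothetical sample data: '101' + the question's pattern + '00' padding
bits = bytes_to_bits(b'\xa3\x57\x98')
hits = find_all(bits, '0001101010111100110')
print(hits)  # [3]
```

Each offset is a bit position, so hit // 8 gives the byte the match starts in and hit % 8 gives the bit within that byte.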

4 Comments

I think something like ''.join('{:08b}'.format(ord(b)) for b in byte_str) looks a little cleaner.
@DSM, you might be right. I just went with the first thing I thought of.
This works well. Just out of curiosity, what is the purpose of the 0x100 in the first code? Is that a mask? And what about the 08b in the second? I expect my file to be ~10-1000 kB, so could reading the entire file at once cause problems?
@tdfoster, it turns out bin doesn't give you leading zeros so I had to add 0x100 to make sure they were there. That leading 1 and the 0b before it get stripped off with the [-8:]. DSM's suggestion should result in the same string.
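
The padding discussed in these comments can be checked on a single value. A small sketch, assuming nothing beyond the built-ins already mentioned above (the variable names are illustrative only):

```python
value = 5

plain = bin(value)                  # '0b101': bin() drops leading zeros
padded_a = bin(0x100 + value)[-8:]  # bin(261) is '0b100000101'; keep the last 8 characters
padded_b = '{:08b}'.format(value)   # zero-padded to width 8, base 2

print(plain, padded_a, padded_b)

# The two padded forms agree for every possible byte value
same = all(bin(0x100 + v)[-8:] == '{:08b}'.format(v) for v in range(256))
print(same)
```

On file size: the bit string holds one character per bit, so a 1000 kB file expands to roughly 8 million characters.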
0

You would be better off working in another language. Python can do it (for example, open the file with file = open("file", "rb"), where appending the b opens it in binary mode, and then use a simple search), but to be honest, it is much easier and faster to do it in a lower-level language such as C.

3 Comments

I'm not so sure of that. Could you give an example implementation in C to compare with the four-line Python version above?
I am using SWIG and a good amount of C, so I think I will give both options a try. Thanks!
Python is simpler, but I meant that C would be faster.
