Building a list with binary numbers in binary representation in Python

Question

I have a very big list of 0's and 1's that are represented as integers - by default - by python, I think: [randint(0, 1) for i in range(50*98)]

And I want to optimize the code so that it utilizes much less memory. The obvious way is to use just 1 bit to represent each of these numbers.

Is it possible to build a list of real binary numbers in python?

Regards, Bruno

EDIT: Thank you all.
From the answers I found out Python doesn't do this by default, so I found this library (that is installed by Macports on OSX so it saves me some trouble) that does bit operations: python-bitstring

What would the difference between a "real binary number" and a "normal number" be? — Ignacio Vazquez-Abrams
– Ignacio Vazquez-Abrams, Commented Sep 5, 2011 at 11:48
Do you mean a bitset/bitstring? stackoverflow.com/questions/3946086/… — Fred Foo
– Fred Foo, Commented Sep 5, 2011 at 11:50
A real binary number uses just 1 bit of memory. A normal number, so an integer, utilizes at least 16 bits in Python as far as I know. — jbssm
– jbssm, Commented Sep 5, 2011 at 11:51
You can't adress a single bit of memory anyway, regardless of the language used. If you want an N-bit number, N has to be a multiple of 8, even if you ignore them. — user395760
– user395760, Commented Sep 5, 2011 at 11:56
Is memory really a problem, or do you just think it is? 50*98 digits is a pretty small amount of memory. — Bryan Oakley
– Bryan Oakley, Commented Sep 5, 2011 at 12:29

Scott Griffiths · Accepted Answer · 2011-09-05 12:37:30Z

4

This uses the bitstring module and constructs a BitArray object from your list:

from bitstring import BitArray
b = BitArray([randint(0, 1) for i in range(50*98)])

Internally this is now stored packed as bytes so will take considerably less memory. You can slice, index, check and set bits etc. with the usual notation and there additional methods such as set, all and any to modify the bits.

To get the data back as a binary string just use b.bin and to get out the byte packed data use b.tobytes() which will pad with zero bits up to a byte boundary.

answered Sep 5, 2011 at 12:37

Scott Griffiths

22k8 gold badges58 silver badges86 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

jbssm Over a year ago

Thank you a lot, you saved me going trough the documentation of bitstring. This is exactly what I'll use.

Torsten Marek · Accepted Answer · 2011-09-05 12:42:47Z

As delnan already said in a comment, you will not be able to use real binary numbers if you mean bit-for-bit equivalent memory usage.

Integers (or longs) are of course real binary numbers in the meaning that you can address individual bits (using bit-wise operators, but that is easily hidden in a class). Also, long objects can become arbitrarily large, i.e. you can use them to simulate arbitrarily large bitsets. It is not going to be very fast if you do it in Python, but not very difficult either and a good start.

Using your binary generation scheme from above, you can do the following:

reduce(
    lambda (a, p), b: (b << p | a, p + 1), 
    (random.randint(0, 1) for i in range(50*98)),
    (0, 0)
)[0]

Of course, random supports arbitrarily large upper boundaries, so you can do just that:

r = random.randint(0, 2**(50*98))

This is not entirely the same, since the individual binary digits in the are not independent in the same way they are independent when you create each digit for itself. Then again, knowing you pRNGs work, they are not really independent in the other case, either. If this is a concern for you, you probably should not use the random module at all, but a hardware RNG.

Cat Plus Plus · Accepted Answer · 2011-09-05 11:51:58Z

1

It's called a bit vector, or a bitmap. Try e.g. BitVector. If you want to implement it yourself, you need to use a numeric object, not a list, and use bitwise operations to toggle bits, e.g.

 bitmap = 0
 bit = (1 << 24)
 bitmap |= bit  # enable bit
 bitmap &= ~bit # disable bit

answered Sep 5, 2011 at 11:51

Cat Plus Plus

131k27 gold badges205 silver badges226 bronze badges

1 Comment

jbssm Over a year ago

Thanks, it was something like that I was searching for. After understanding from the answers that Python doesn't provide that in the standard install, I did find a library (that is installed by Macports on OSX so it saves me some trouble) that does bit operations. python-bitstring

Constantinius · Accepted Answer · 2011-09-05 11:52:16Z

0

It seems you need some kind of bit-set. I'm not sure if this example entirely suits your needs, but it is worth a try.

answered Sep 5, 2011 at 11:52

Constantinius

35.2k9 gold badges79 silver badges85 bronze badges

1 Comment

jbssm Over a year ago

Thank you. I found a library (that is installed by Macports on OSX so it saves me some trouble) that does bit operations. python-bitstring

Alex Reynolds · Accepted Answer · 2011-09-05 11:52:46Z

0

Perhaps you could look into a lossless compression scheme for your data? Presumably there will be a lot of redundancy in such a list.

answered Sep 5, 2011 at 11:52

Alex Reynolds

97.3k59 gold badges251 silver badges356 bronze badges

4 Comments

jbssm Over a year ago

That's true, but altough the strings are a bit big, they are not so big that would compensate for the overhead of using compression.

vhallac Over a year ago

If the bit array is generated randomly, there wouldn't be any redundancy to compress away in any case.

jbssm Over a year ago

@vhallac: True if I do it like I want and only use bits. But Alex was also right, because if I have to represent the 0's and 1's in the date with integers, then, even being random compression would help. But even so, it looks simpler to use bit operations. Thanks.

vhallac Over a year ago

@jbbsm Good point. It never occured to me to compress the integers holding bits. The obvious bit packing "compression" would be almost trivial compared to any compression method you can think of - and would perform better. :)

Collectives™ on Stack Overflow

Building a list with binary numbers in binary representation in Python

5 Answers 5

1 Comment

Comments

1 Comment

1 Comment

4 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

1 Comment

Comments

1 Comment

1 Comment

4 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related