How to efficiently encode a bit string in MATLAB?

Question

I want to store a vector of integers (uint8) as (space-)efficiently as possible in MATLAB. So far, I am using arithenco to encode the vector:

bits = arithenco(z, counts);

The good thing is that it returns a vector of bits. The bad thing is that the bits are stored as doubles. This means that the returned vector is about 64 times as large as the original uint8 vector, while the whole idea was to make the thing smaller.

So is there an easy (and runtime-efficient) way to encode those pseudo-bits so that I actually get a space improvement?

The only solution I've come up with so far is to use bitset to pack all those bits into a vector of, say, uint32, but this seems cumbersome and not very fast, as I would have to loop over the whole bits vector explicitly.

Note: I cannot use the Java API for this, otherwise this would have been relatively easy.

Amro · Accepted Answer · 2012-06-13 15:15:06Z

3

Similar to your solution, but using core MATLAB functions only:

%# some random sequence of bits
bits = rand(123,1) > 0.5;

%# reshape each 8 bits as a column (with zero-padding if necessary)
numBits = numel(bits);
bits8 = false(8, ceil(numBits/8));
bits8(1:numBits) = bits(:);

%# convert each column to uint8
bits_packed = uint8( bin2dec(char(bits8'+'0')) );

Compare the sizes:

>> whos bits bits_packed
  Name               Size            Bytes  Class      Attributes

  bits             123x1               123  logical              
  bits_packed       16x1                16  uint8

To unpack/recover the original bits:

%# unpack (request 8 digits per byte; plain dec2bin only pads to the widest value)
b = logical(dec2bin(bits_packed, 8)' - '0');
b = b(:);

%# sanity check
isequal(bits, b(1:numBits))
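
As a possible variation on the packing above, the char round trip through bin2dec/dec2bin can be replaced by plain arithmetic. This is only a sketch, not part of the original answer: the names weights, packed and q are made up for illustration, and bsxfun is used so that it does not rely on implicit expansion in newer MATLAB releases. Whether it is actually faster would need to be measured.

```matlab
% some random sequence of bits, padded into 8 rows as in the answer above
bits = rand(123,1) > 0.5;
numBits = numel(bits);
bits8 = false(8, ceil(numBits/8));
bits8(1:numBits) = bits(:);

% pack: weight each row by a power of two (row 1 is the most significant bit)
weights = 2.^(7:-1:0);                        % 1x8
packed = uint8(weights * double(bits8)).';    % one uint8 per column of bits8

% unpack: shift each byte right by dividing by the weights, then keep the lowest bit
q = floor(bsxfun(@rdivide, double(packed).', weights.'));   % 8xN
b = logical(mod(q, 2));
b = b(:);

% sanity check
isequal(bits, b(1:numBits))
```

The bit order matches the bin2dec version: the first bit of every group of eight ends up in the most significant position of its byte.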

edited Jun 13, 2012 at 15:15

answered Jun 13, 2012 at 15:09

Amro


1 Comment

rolve Over a year ago

Thank you! I didn't think about character conversion. I guess this could even be faster than my solution.

rolve · Accepted Answer · 2012-06-12 21:42:52Z

1

After some searching and trying, I've finally come up with this solution:

bitCount = size(bits, 2);
bits8 = zeros(ceil(bitCount/8), 8);
bits8(1:bitCount) = bits;           % Reshape to (pseudo-)8-bit representation
comp = uint8(bi2de(bits8));         % Convert to efficient byte representation

The key part here is the bi2de function which "converts a binary row vector b to a nonnegative decimal integer". To get the bits again, the de2bi function can be used.
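
For completeness, a round trip with de2bi might look like the sketch below. It assumes the Communications Toolbox is available (bi2de and de2bi live there, as does arithenco), uses random bits in place of the arithenco output, and introduces the hypothetical names rec8 and rec. The second argument of de2bi forces 8 columns per byte, and the linear indexing on the way back mirrors the column-wise fill used when packing.

```matlab
% stand-in for the arithenco output: a row vector of 0/1 doubles
bits = double(rand(1, 123) > 0.5);

% pack, exactly as in the answer above
bitCount = size(bits, 2);
bits8 = zeros(ceil(bitCount/8), 8);
bits8(1:bitCount) = bits;
comp = uint8(bi2de(bits8));

% unpack: de2bi with n = 8 returns one row of 8 bits per byte,
% in the same (default) bit order that bi2de consumed
rec8 = de2bi(double(comp), 8);
rec = rec8(1:bitCount);          % undo the column-wise fill

% sanity check
isequal(bits, rec)
```

bitCount has to be stored alongside comp, since the zero padding in the last positions cannot otherwise be told apart from real bits.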

answered Jun 12, 2012 at 21:42

rolve


4 Comments

tmpearce Over a year ago

If this is the complete answer you were looking for, you can mark it as accepted even if you're the one who answered it.

Amro Over a year ago

@rolve: shouldn't bits8 be defined as: bits8 = zeros(8,ceil(bitCount/8)); so that it gets filled in the correct order (by columns). Just remember to transpose before passing it to bi2de

rolve Over a year ago

@Amro yeah, you're probably right. However, since I just want to store the bits efficiently, it does not matter whether they're in the rows or spread across the columns. I just have to do the right reshaping when converting them back into the ?-bit representation.

Amro Over a year ago

@rolve: I guess you're right, you got my +1 vote anyways.. By the way I posted a solution similar to yours but without using any special toolboxes.

Andrey Rubshtein · Accepted Answer · 2012-06-12 08:41:00Z

0

You can convert them to logical:

     bitsLogical = logical(bits);

This should be more efficient in memory. But you will still have the conversion step. So the best thing is to dive into arithenco and change it to return logical in the first place.

Edit: As the OP correctly points out, this will not be packed as bits, but as bytes. Still, it is an improvement over double.
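
The storage cost per element can be checked with whos, as in the rough sketch below. The vector of 1000 random bits and the names infoDouble and infoLogical are made up for illustration; the last line only computes how many bytes a fully bit-packed vector would need, it does not perform the packing.

```matlab
% 1000 pseudo-bits stored as doubles, then converted to logical
bits = double(rand(1, 1000) > 0.5);
bitsLogical = logical(bits);

% the function form of whos returns a struct with a 'bytes' field
infoDouble  = whos('bits');
infoLogical = whos('bitsLogical');

fprintf('double:  %d bytes\n', infoDouble.bytes);
fprintf('logical: %d bytes\n', infoLogical.bytes);
fprintf('packed:  %d bytes would suffice\n', ceil(numel(bits)/8));
```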

edited Jun 12, 2012 at 8:41

answered Jun 12, 2012 at 8:29

Andrey Rubshtein


3 Comments

rolve Over a year ago

Thank you for the idea but logicals are stored as bytes, so there are still 7 out of 8 bits wasted.

Andrey Rubshtein Over a year ago

@rolve, I did not know it. Do you have any reference for this?

rolve Over a year ago

You can check the size of your data structures with the whos (mathworks.ch/help/techdoc/ref/whos.html) command. Using that you will see that a single logical has a size of 1 byte.
