Byte array to int in C++

Question

Would the following be the most efficient way to get an int16 (short) value from a byte array?

inline __int16* ReadINT16(unsigned char* ByteArray,__int32 Offset){
    return (__int16*)&ByteArray[Offset];
};

If the byte array contains a dump of the bytes in the same endian format as the machine, this code is being called on. Alternatives are welcome.

int types are generally aligned with the machine architecture for better memory access. If the ByteArray[Offset] is not respecting the proper alignement (say Offset is an odd number) then retrieving __int16 may not be a good idea. Though, we can wait for some better answers. — iammilind
– iammilind, Commented Jul 28, 2011 at 7:00
for those talking about alignment the byte array may contain single bytes as it is a file format. so the offset may be odd. Is this a problem? — user826228
– user826228, Commented Jul 28, 2011 at 7:03
@user826228: yes, it's a potential problem if you want your code to be portable. If you know it will only ever run on say x86, and performance is not critical, then you can get away with it -- but I'd advise against it. — Paul R
– Paul R, Commented Jul 28, 2011 at 9:52

Paul R · Accepted Answer · 2011-07-28 07:06:53Z

9

It depends on what you mean by "efficient", but note that in some architectures this method will fail if Offset is odd, since the resulting 16 bit int will be misaligned and you will get an exception when you subsequently try to access it. You should only use this method if you can guarantee that Offset is even, e.g.

inline int16_t ReadINT16(uint8_t *ByteArray, int32_t Offset){
    assert((Offset & 1) == 0); // Offset must be multiple of 2
    return *(int16_t*)&ByteArray[Offset];
};

Note also that I've changed this slightly so that it returns a 16 bit value directly, since returning a pointer and then subsequently de-referencing it will most likely less "efficient" than just returning a 16 bit value directly. I've also switched to standard Posix types for integers - I recommend you do the same.

edited Jul 28, 2011 at 7:06

answered Jul 28, 2011 at 6:59

Paul R

214k38 gold badges402 silver badges579 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

Jesus Ramos Over a year ago

+1 for alignment, didn't think anyone would pick up on that. Most architectures do support it now but it's slower than aligned access.

selbie Over a year ago

What if ByteArray doesn't start on an even address?

Paul R Over a year ago

@selbie: that should not be possible, so long as ByteArray is either declared statically or allocated via new or malloc. If you wanted to be absolutely certain though then you could test the alignment of &ByteArray[Offset] instead of just testing Offset.

selbie Over a year ago

Right - the assert should be "assert(((ByteArray+offset)&1)==0);"

Paul R Over a year ago

@selbie: yes, except you probably need a cast in there.

Peter Mortensen · Accepted Answer · 2016-01-28 17:33:04Z

4

I'm surprised no one has suggested this yet for a solution that is both alignment safe and correct across all architectures. (well, any architecture where there are 8 bits to a byte).

inline int16_t ReadINT16(uint8_t *ByteArray, int32_t Offset)
{
    int16_t result;
    memcpy(&result, ByteArray+Offset, sizeof(int16_t));
    return result;
};

And I suppose the overhead of memcpy could be avoided:

inline int16_t ReadINT16(uint8_t *ByteArray, int32_t Offset)
{
    int16_t result;
    uint8_t* ptr1=(uint8_t*)&result;
    uint8_t* ptr2 = ptr1+1;
    *ptr1 = *ByteArray;
    *ptr2 = *(ByteArray+1);
    return result;
};

I believe alignment issues don't generate exceptions on x86. And if I recall, Windows (when it ran on Dec Alpha and others) would trap the alignment exception and fix it up (at a modest perf hit). And I do remember learning the hard way that Sparc on SunOS just flat out crashes when you have an alignment issue.

edited Jan 28, 2016 at 17:33

Peter Mortensen

31.4k22 gold badges110 silver badges134 bronze badges

answered Jul 28, 2011 at 7:33

selbie

105k15 gold badges109 silver badges187 bronze badges

1 Comment

user826228 Over a year ago

i was trying to avoid the overhead of copying data. albeit a small overhead i think its bad practice to use redundant data.

Peter Mortensen · Accepted Answer · 2016-01-28 17:28:57Z

1

inline __int16* ReadINT16(unsigned char* ByteArray,__int32 Offset)
{     
    return (__int16*)&ByteArray[Offset]; 
};

Unfortunately this has undefined behavour in C++, because you are accessing storage using two different types which is not allowed under the strict aliasing rules. You can access the storage of a type using a char*, but not the other way around.

From previous questions I asked, the only safe way really is to use memcpy to copy the bytes into an int and then use that. (Which will likely be optimised to the same code you'd hope anyway, so just looks horribly inefficient).

Your code will probably work, and most people seem to do this... But the point is that you can't go crying to your compiler vendor when one day it generates code that doesn't do what you'd hope.

edited Jan 28, 2016 at 17:28

Peter Mortensen

31.4k22 gold badges110 silver badges134 bronze badges

answered Jul 28, 2011 at 7:14

jcoder

30.2k21 gold badges94 silver badges133 bronze badges

Comments

Jesus Ramos · Accepted Answer · 2011-07-28 06:59:47Z

0

I see no problem with this, that's exactly what I'd do. As long as the byte array is safe to access and you make sure that the offset is correct (shorts are 2 bytes so you may want to make sure that they can't do odd offsets or something like that)

answered Jul 28, 2011 at 6:59

Jesus Ramos

23.3k10 gold badges62 silver badges90 bronze badges

Collectives™ on Stack Overflow

Byte array to int in C++

4 Answers 4

5 Comments

1 Comment

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

5 Comments

1 Comment

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related