1

I am trying to find a better title so I apologize if its confusing.

I have a binary message that I am reading over a serial port. The message contains a constant 8 byte header, and a dynamic message length (depends on the message type which is defined in the 8 byte header).

I am trying to create a function that returns this message with the array allocated using malloc.

I changed some of the variable/member names just to make it clearer Example:

#include <stdlib.h>
#include <stdio.h>

typedef struct MyMessageHeader {
  uint8_t byte1, byte2, byte3, byte4, byte5, byte6, byte7, 
} MyMessageHeader;

// I could have up to 100 different message types, but they all contain 
// the same header, so trying to make a message type with a header and array pointer
typedef struct MyMessage {
  MyMessageHeader header;
  uint8_t* data;
} MyMessage;

// Function to copy a raw byte array that is read from the serial port into
// a MyMessage object
MyMessage* FunctionThatIsNotWorking(uint8_t* rawBytes) {
  // Somehow we have determined in rawBytes that this message contains 12 bytes of data
  // plus the 8 bytes which is the header
  MyMessage* ret_msg = malloc(20);
  // This is where things go wrong and I get segmentation faults. Likely due to
  // my misunderstanding of malloc.
  // Copy the rawBytes into MyMessage. Assuming the first 8 bytes of rawBytes goes
  // into MyMessageHeader, and the last 12 bytes go into data
  memcpy(ret_msg, rawBytes, 20);
}

int main() {
uint8_t raw_bytes[20] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20};
MyMessage* my_msg = FunctionThatIsNotWorking(raw_bytes);
// Expecting now but this doesnt work
my_msg->header.byte1 == 1;
my_msg->header.byte2 == 2;
my_msg->header.byte3 == 3;
// ...
// first byte of data should be the 9th byte in raw_bytes 
my_msg->data[0] == 9;
my_msg->data[1] == 10;
}

What am I missing here, or what is my mis-understanding?

9
  • 2
    Pointers are not arrays, arrays are not pointers. Mixing them up (particularly by reinterpreting the raw byte representation) will not work. Commented Feb 9, 2020 at 22:20
  • 1
    True, but the OP's example (uint8_t raw_bytes[20] = {...}) IS a legitimate "mock" for a "real" input stream of bytes. I believe the actual problem is that his struct layout doesn't necessarily match the raw byte stream. Because of structure padding. Hence a naive "memcpy" will fail :( Commented Feb 9, 2020 at 22:28
  • 2
    @FoggyDay No. Unless you have a very specific memory layout, the contents of raw_bytes[]will absolutely not be valid bytewise representation for a struct MyMessage. Look again. Commented Feb 9, 2020 at 22:41
  • You're not listening. raw_bytes[] can be a perfectly valid representation for a byte stream. Which, AFAIK, is what the OP has. The problem is that the "steam," won't necessarily match the "struct". That's what you're trying to say ... and that's the point I suspect the OP might be unaware of. Too bad I can't mark down a comment :( Commented Feb 9, 2020 at 22:55
  • 1
    @FoggyDay I have no idea why you're trying to argue with me, but I'll try to express this in a simple way: the problem is not structure padding, so your contribution here is not helpful. Commented Feb 9, 2020 at 23:04

1 Answer 1

2

I suspect what's byteing you (pun intended) is structure padding

SUGGESTION:

  1. Read the data directly from your device into a local buffer. Make the buffer at least as long as the largest anticipated message.

  2. Read the header

  3. Perform your "malloc", and finally

  4. Unpack the data from your buffer into you


OK: Here's what I'd suggest:

/*
 * SAMPLE OUTPUT:
 * sizeof(*my_msg)= 16
 * header: 01 02 03 04 05 06 07 08
 * data: 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14
 */
#include <stdio.h>
#include <stdint.h>   // unit8_t
#include <stdlib.h>   // malloc()
#include <string.h>   // memcpy ()

typedef struct MyMessage {
  uint8_t header[8];  // 8 bytes (you said) for header; message length unknown.
  uint8_t* data;      // Here, you're only allocating space for a pointer (e.g. 4 bytes)
} MyMessage_t;

MyMessage_t* FunctionThatIsNotWorking(uint8_t* rawBytes) {
  // Let's assume rawBytes contains header + data
  // Parse the header, determine message type, and determine message length
  // ... TBD ...

  // Allocate your struct
  MyMessage_t* ret_msg = malloc(sizeof (struct MyMessage));

  // Now allocate space for your *data* (let's assume this particular message has 12 bytes)
  ret_msg->data = malloc(12);

  // Copy your header (from rawBytes)
  memcpy(&ret_msg->header, rawBytes, 8);

  // Copy the data (starting on the ninth byte; assume the data is contiguous)
  memcpy(ret_msg->data, &rawBytes[8], 12);

  // Return the completed record
  return ret_msg;
}

int main() {
  int i;
  // Create some dummy data
  uint8_t raw_bytes[20] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20};

  MyMessage_t* my_msg = FunctionThatIsNotWorking(raw_bytes);

  printf ("sizeof(*my_msg)= %ld\n", sizeof(*my_msg));
  printf ("header: ");
  for (i=0; i<sizeof(my_msg->header); i++) {
    printf("%02x ", my_msg->header[i]);
  }
  printf ("\ndata: ");
  for (i=0; i<12; i++) {
    printf("%02x ", my_msg->data[i]);
  }
  printf ("\n");

  return 0;
}

The point I was trying to make is that the byte layout the compiler gives your struct is NOT necessarily what you might expect. There's often "extra spacing" in fields, to ensure alignment.

If you're using structs on the sending end (when constructing your message), the data layout in the receiver won't necessarily match the layout in the sender.

Consequently, you usually need to explicitly "pack" and "unpack" messages you send back and forth between different hosts/different platforms.

I think the point was emphasizing is that you never allocated space for your data. This is true :)

Finally, in most cases, the length of the message will be unknown until you actually read the header. My example handles this case, too.

In case you only want to pass the data back to the caller (my example copies both header and data), you'll probably want to add a "message length" field in your struct.

Always try to use the "sizeof" operator, instead of hard-coding "magic numbers". The next best thing is to declare constant (e.g. #define DATA_LENGTH 12).


One more example.

Let's say you know your message data is always going to be exactly 12 bytes. and let's still assume you want the header field to be 8 bytes. You can do this:

...
typedef struct MyMessage {
  uint8_t header[8]; // 8 bytes (you said) for header
  uint8_t data[12];  // This allocates 12 bytes
} MyMessage_t;
...
MyMessage_t* FunctionThatIsNotWorking(uint8_t* rawBytes) {
  // Let's assume rawBytes contains header + data
  // Parse the header, determine message type, and determine message length
  // ... TBD ...

  // Allocate your struct
  MyMessage_t* ret_msg = malloc(sizeof (*ret_msg));

  // Copy directly from rawBytes
  memcpy(ret_msg, rawBytes, sizeof (*ret_msg));

  // Return the completed record
  return ret_msg;
}
Sign up to request clarification or add additional context in comments.

5 Comments

Hmmm. OP says 8 bytes for the header, but codes 7 bytes. With 7 bytes I would expect padding; perhaps not with 8 bytes (but there are no such guarantees in C). The advice to mind possible padding bytes is good, but I think OP's primary problem was failure to allocate for the actual data (which you point out above). Aside: avoiding magic numbers in malloc() is good, but avoiding explicit types too is often better (less error-prone, more maintainable): MyMessage_t* ret_msg = malloc(sizeof (struct MyMessage)) --> MyMessage_t* ret_msg = malloc(sizeof *ret_msg).
I initially took the OP at his code (which was 7 bytes: "byte1", "byte2", ..., "byte7"), but then decided to change it to 8 bytes (his comment and, I believe, probably his intent). It doesn't matter - I hope the example - and the explanation - was helpful.
@FoggyDay That makes sense, and I appreciate the time you put in to show the example. I was hoping to "cheat" and just malloc() the full message (header + data). And I agree on most of your suggestions, I did remove some stuff for clarity.
Actually, you can. Assuming you have a fixed-length message. Let me give you another example.
@FoggyDay thank you for the 2nd example. Two questions. 1) In the // allocate struct you sizeof the return message, is that correct? 2) If, lets say, I passed in rawBytes, and the size of the message(header of 8 bytes, plus data) could this 2nd example still work? So its not always 12 bytes, but its known before malloc() and memcpy

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.