
I have created an array of integers, and I would like it to be interpreted using the structure definition I have created:

from ctypes import *
import array

class MyStruct(Structure):
    _fields_ = [("init", c_uint),
                ("state", c_char),
                ("constant", c_int),
                ("address", c_uint),
                ("size", c_uint),
                ("sizeMax", c_uint),
                ("start", c_uint),
                ("end", c_uint),
                ("timestamp", c_uint),
                ("location", c_uint),
                ("nStrings", c_uint),
                ("nStringsMax", c_uint),
                ("maxWords", c_uint),
                ("sizeFree", c_uint),
                ("stringSizeMax", c_uint),
                ("stringSizeFree", c_uint),
                ("recordCount", c_uint),
                ("categories", c_uint),
                ("events", c_uint),
                ("wraps", c_uint),
                ("consumed", c_uint),
                ("resolution", c_uint),
                ("previousStamp", c_uint),
                ("maxTimeStamp", c_uint),
                ("threshold", c_uint),
                ("notification", c_uint),
                ("version", c_ubyte)]

# arr = array.array('I', [1])
# How can I do this?
# mystr = MyStruct(arr)  # some magic
# (mystr.init == 1) == True

I can do the following:

mystr = MyStruct()
rest = array.array('I')
with open('myfile.bin', 'rb') as binaryFile:
    binaryFile.readinto(mystr)
    rest.fromstring(binaryFile.read())  # Python 2 spelling; Python 3 uses rest.frombytes(...)

# Now create another struct with rest
rest.readinto(mystr) # Does not work

How can I avoid using a file to convert an array of Ints to a struct if the data is contained in an array.array('I')? I am not sure what the Structure constructor accepts or how the readinto works.

2 Answers


Solution #1: Star unpacking for one-line initialization

Star-unpacking will work, but only if all the fields in your structure are integer types. In Python 2.x, c_char cannot be initialized from an int (it works fine in 3.5). If you change the type of state to c_byte, then you can just do:

mystr = MyStruct(*myarr)

This doesn't benefit from any array-specific magic: the values are briefly converted to Python ints in the unpacking step, so you're not reducing peak memory usage. You'd only bother with an array if initializing that array were easier than reading directly into the structure for whatever reason.
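As a minimal sketch of the star-unpacking route, assuming a cut-down three-field struct (`SmallStruct` and the sample values are hypothetical, not the asker's real layout), with `state` declared as `c_byte` so it accepts a plain int:

```python
import array
import ctypes


class SmallStruct(ctypes.Structure):
    # Hypothetical three-field struct; "state" is c_byte rather than c_char
    # so that it can be initialized from an int on Python 2 as well.
    _fields_ = [("init", ctypes.c_uint),
                ("state", ctypes.c_byte),
                ("constant", ctypes.c_int)]


values = array.array('I', [1, 65, 7])

# Star unpacking passes each array element as a positional argument,
# matched to the fields in _fields_ order.
small = SmallStruct(*values)

print(small.init, small.state, small.constant)
```

The array must not hold more elements than the struct has fields, otherwise the constructor raises a TypeError.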

If you go the star unpacking route, reading .state will now give you int values instead of length-1 str values. If you want to initialize with an int but read back a one-character str, you can use a protected name wrapped in a property:

class MyStruct(Structure):
    _fields_ = [...
                ("_state", c_byte),  # "Protected" name int-like; constructor expects int
                ...]

    @property
    def state(self):
        return chr(self._state)

    @state.setter
    def state(self, x):
        if isinstance(x, basestring):
            x = ord(x)
        self._state = x

A similar technique could be used without properties by defining your own __init__ that converts the state argument passed in:

class MyStruct(Structure):
    _fields_ = [("init", c_uint),
                ("state", c_char),
                ...]

    def __init__(self, init=0, state=b'\0', *args, **kwargs):
        if not isinstance(state, basestring):
            state = chr(state)
        super(MyStruct, self).__init__(init, state, *args, **kwargs)
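Both snippets above rely on basestring, which exists only in Python 2. A rough Python 3 sketch of the property variant might look like the following; the class name `StateStruct` and its three fields are assumptions made for illustration, and the getter returns a length-1 bytes object because that is what a c_char field yields on Python 3:

```python
import ctypes


class StateStruct(ctypes.Structure):
    # Hypothetical cut-down struct; "_state" is stored as an int-like c_byte.
    _fields_ = [("init", ctypes.c_uint),
                ("_state", ctypes.c_byte),
                ("constant", ctypes.c_int)]

    @property
    def state(self):
        # c_byte is signed, so mask to 0..255 before building the bytes object.
        return bytes([self._state & 0xFF])

    @state.setter
    def state(self, value):
        if isinstance(value, (bytes, bytearray)):
            value = value[0]      # indexing bytes gives an int on Python 3
        elif isinstance(value, str):
            value = ord(value)
        self._state = value


s = StateStruct(1, 65, 7)   # positional ints, as produced by star unpacking
print(s.state)
```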

Solution #2: Direct memcpy-like solutions to reduce temporaries

You can, however, use some array-specific magic to avoid the temporary Python-level ints (and the need to change state to c_byte) without touching a real file, by using a fake (in-memory) file-like object:

import io

mystr = MyStruct()  # Default initialize

# Use BytesIO to gain the ability to write the raw bytes to the struct
# because BytesIO's readinto isn't finicky about exact buffer formats
io.BytesIO(myarr.tostring()).readinto(mystr)

# In Python 3, where array implements the buffer protocol, you can simplify to:
io.BytesIO(myarr).readinto(mystr)
# This still performs two memcpys (one occurs internally in BytesIO), but
# it's faster by avoiding a Python level method call

This only works because your non-c_int width attributes are followed by c_int width attributes (so they're padded out to four bytes anyway); if you had two c_ubyte/c_char/etc. types back to back, then you'd have problems (because one value of the array would initialize two fields in the struct, which does not appear to be what you want).
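One way to check that condition before copying raw bytes is to compare each field's offset against the array's element boundaries. The sketch below is only illustrative: the helper name `one_slot_per_field` and the two example structs are hypothetical, and it assumes the usual platforms where c_uint and array type code 'I' are both four bytes wide.

```python
import array
import ctypes


class Padded(ctypes.Structure):
    # A one-byte field followed by a four-byte field: ctypes pads "state"
    # out to the next four-byte boundary.
    _fields_ = [("init", ctypes.c_uint),
                ("state", ctypes.c_char),
                ("constant", ctypes.c_int)]


class BackToBack(ctypes.Structure):
    # Two one-byte fields back to back end up inside the same four-byte slot.
    _fields_ = [("init", ctypes.c_uint),
                ("state", ctypes.c_char),
                ("flag", ctypes.c_ubyte),
                ("constant", ctypes.c_int)]


def one_slot_per_field(struct_type, typecode='I'):
    """Return True if every field starts on its own array-element boundary."""
    itemsize = array.array(typecode).itemsize
    offsets = [getattr(struct_type, field[0]).offset
               for field in struct_type._fields_]
    expected = [i * itemsize for i in range(len(offsets))]
    return (offsets == expected
            and ctypes.sizeof(struct_type) == len(offsets) * itemsize)


print(one_slot_per_field(Padded), one_slot_per_field(BackToBack))
```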

If you were using Python 3, you could benefit from array specific magic to avoid the cost of both unpacking and the two step memcpy of the BytesIO technique (from array -> bytes -> struct). It works in Py3 because Py3's array type supports the buffer protocol (it didn't in Py2), and because Py3's memoryview features a cast method that lets you change the format of the memoryview to make it directly compatible with array:

mystr = MyStruct()  # Default initialize

# Make a view on mystr's underlying memory that behaves like a C array of
# unsigned ints in native format (matching array's type code)
# then perform a "memcpy" like operation using empty slice assignment
# to avoid creating any Python level values.
memoryview(mystr).cast('B').cast('I')[:] = myarr

Like the BytesIO solution, this only works because your fields all happen to pad to four bytes in size.
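A self-contained Python 3 sketch of the memoryview copy, again using a hypothetical three-field `SmallStruct` rather than the full struct from the question. The last line uses `from_buffer_copy`, a ctypes class method that is not part of the answer above; it is included only as a cross-check that builds a new instance from the same bytes.

```python
import array
import ctypes


class SmallStruct(ctypes.Structure):
    # Hypothetical struct in which every field occupies one four-byte slot.
    _fields_ = [("init", ctypes.c_uint),
                ("state", ctypes.c_char),
                ("constant", ctypes.c_int)]


values = array.array('I', [1, ord('A'), 7])
small = SmallStruct()

# The slice assignment requires both sides to have the same size and format.
assert ctypes.sizeof(small) == len(values) * values.itemsize

# View the struct's memory as unsigned ints and copy the array into it.
memoryview(small).cast('B').cast('I')[:] = values

# Alternative (not from the answer): build a new instance from the raw bytes.
copy = SmallStruct.from_buffer_copy(values)

print(small.init, small.constant, copy.init, copy.constant)
```

On a little-endian machine `small.state` reads back as `b'A'`; on a big-endian machine the low byte lands at the other end of the four-byte slot, so the one-byte field would see a zero byte instead.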

Performance

Performance-wise, star unpacking wins for small numbers of fields, but for large numbers of fields (your case has a couple dozen), direct memcpy based approaches win out; in tests for a 23 field class, the BytesIO solution won over star unpacking on my Python 2.7 install by a factor of 2.5x (star unpacking was 2.5 microseconds, BytesIO was 1 microsecond).

The memoryview solution scales similarly to the BytesIO solution, though as of 3.5, it's slightly slower than the BytesIO approach (likely a result of the need to construct several temporary memoryviews to perform the necessary casting operations and/or the memoryview slice assignment code being general purpose for many possible formats, so it's not simple memcpy in implementation). memoryview might scale better for much larger copies (if the losses are due to the fixed cast overhead), but it's rare that you'd have a struct large enough to matter; it would only be in more general purpose copying scenarios (to and from ctypes arrays or the like) that memoryview would potentially win.
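To reproduce this kind of comparison on your own interpreter, a rough timing harness might look like the one below (Python 3). The `Wide` struct with 23 identical c_uint fields and the `via_*` function names are assumptions for illustration; it is not the original test code, and the numbers will vary by machine and Python version.

```python
import array
import ctypes
import io
import timeit


class Wide(ctypes.Structure):
    # Hypothetical 23-field struct made only of four-byte unsigned ints.
    _fields_ = [("f%d" % i, ctypes.c_uint) for i in range(23)]


values = array.array('I', range(23))


def via_star():
    return Wide(*values)


def via_bytesio():
    s = Wide()
    io.BytesIO(values).readinto(s)
    return s


def via_memoryview():
    s = Wide()
    memoryview(s).cast('B').cast('I')[:] = values
    return s


N = 20000
for fn in (via_star, via_bytesio, via_memoryview):
    # Take the best of three runs to reduce noise from other processes.
    best = min(timeit.repeat(fn, number=N, repeat=3)) / N
    print("%-15s %.0f ns per call" % (fn.__name__, best * 1e9))
```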

1 Comment

Side-note: Normally, for file-like objects, I'd use them with a with statement; in this specific case of BytesIO I omitted it because no real file handles are involved (so even on non-CPython interpreters, it's not a problem if the memory is cleaned up lazily), and using the with statement involves significant overhead; on Py 3.5, changing io.BytesIO(myarr).readinto(mystr) to with io.BytesIO(myarr) as bio: bio.readinto(mystr) roughly doubles the time per initialization in my test case of a 23 field class (314 ns -> 541 ns; trivial cost for real I/O, but not for in-memory fake I/O).

Does this have to be an array? Could you use a list instead? You can unpack a list into a function call with the * operator:

mystr = MyStruct(*arr)

or a dict with:

mystr = MyStruct(**arr)
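As a small sketch of what the two forms look like with a ctypes Structure on Python 3 (the three-field `SmallStruct` and the sample values are hypothetical): positional arguments are matched to `_fields_` in order, keyword arguments are matched by field name, and the c_char field is given a length-1 bytes object.

```python
import ctypes


class SmallStruct(ctypes.Structure):
    _fields_ = [("init", ctypes.c_uint),
                ("state", ctypes.c_char),
                ("constant", ctypes.c_int)]


as_list = [1, b'A', 7]
as_dict = {"init": 1, "state": b'A', "constant": 7}

from_list = SmallStruct(*as_list)    # positional, in _fields_ order
from_dict = SmallStruct(**as_dict)   # keyword, by field name

print(from_list.state, from_dict.constant)
```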

6 Comments

I could use array.tolist(); I'll try that. Nope, it doesn't work. I get: TypeError: one character string expected
Is there a reason to be using array in your code? My solution meant reengineering the code to use a native data structure like a list or dictionary.
Well, an array is the same as a list if I call .tolist(). What you are describing here is unpacking a list or a dictionary for a generic function, which is not what I am trying to do. I need an array since I am representing binary data and I am trying to convert it to a ctypes struct.
I think the issue is that the data is coming from a file, and I am not sure how what you have described relates to the readinto call.
@Har: That error you get doesn't make sense given the definitions from your example code. Put something closer to your real code in the question.