Correct or Most Efficient Sharing of Numpy Array of Objects to C++ Vector

Question

I'm passing NumPy data to a C++ extension function using PyArg_ParseTuple. One of the arguments is a NumPy array of objects, i.e. dtype='O'. These objects are themselves one-dimensional NumPy arrays, each with a different length.

The following code works, but as a novice at building Python extensions, I wonder whether there is a better way to do this.

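PyArrayObject *arr_neighbors = NULL;
if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &arr_neighbors))
  return NULL;  // PyArg_ParseTuple has already set the exception

// Assumes n_polys (defined elsewhere) equals the length of arr_neighbors and
// that every element is a contiguous 1-D array whose items are C longs.
std::vector<long*> neighbors(n_polys);
std::vector<int> neighbor_lengths(n_polys);
for (npy_intp i = 0; i < n_polys; i++) {
  // PyArray_GETITEM returns a new reference for object arrays.
  PyObject *array = PyArray_GETITEM(arr_neighbors, (char*) PyArray_GetPtr(arr_neighbors, &i));
  neighbor_lengths[i] = (int) PyArray_DIM((PyArrayObject*) array, 0);
  neighbors[i] = (long*) PyArray_DATA((PyArrayObject*) array);
  // Safe to release: arr_neighbors still holds a reference that keeps the data alive.
  Py_DECREF(array);
}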
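
The two parallel vectors in the loop above hold a pointer and a length for each sub-array. Here is a minimal sketch of the same idea with the pair kept in one struct. It assumes plain `std::vector<long>` buffers as stand-ins for the NumPy sub-arrays so that it compiles without the Python headers. `NeighborView`, `collect_views`, and `at` are hypothetical names, not part of any NumPy API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view: a pointer into someone else's buffer plus its length.
struct NeighborView {
    const long* data;
    std::size_t size;
};

// Build one view per sub-array without copying any elements.
// `rows` stands in for the NumPy sub-arrays; the views are valid only while
// `rows` is alive and its inner vectors are not resized.
std::vector<NeighborView> collect_views(const std::vector<std::vector<long>>& rows) {
    std::vector<NeighborView> views;
    views.reserve(rows.size());
    for (const std::vector<long>& row : rows) {
        views.push_back(NeighborView{row.data(), row.size()});
    }
    return views;
}

// Bounds-checked element access through a view (checked in debug builds only).
long at(const NeighborView& view, std::size_t i) {
    assert(i < view.size);
    return view.data[i];
}
```

Keeping the pointer and the length together means a consumer cannot index one vector with a length taken from another entry. As in the original loop, nothing is copied, so the views are only as long-lived as the buffers they point into.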

There's nothing 'efficient' about object dtype arrays. It's basically a list of references to other objects, in your case, more arrays.
– hpaulj, Commented Mar 27, 2020 at 20:57
I understand that the array of objs is an array of references - I'm asking for ways to make the copy more efficient, not the array of objs itself. E.g. since I know the objs are actually arrays, is there a method in the arrayobject.h api that might replace the inner loop here. Someone who knows the arrayobject api well may know of more efficient ways to accomplish what I'm trying to do.
– NLi10Me, Commented Mar 27, 2020 at 21:11

John Zwinck · Accepted Answer · 2020-03-28 00:12:24Z

You can share the data between Python and C++, avoiding a copy, if you allocate it in C++. Here's how to do it for one vector<long>, which you'll need to repeat for each one within your outer vector:

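std::vector<long> vec; // TODO: populate

PyObject* dtype = PyUnicode_FromString("i8"); // PyString_FromString on Python 2
PyArray_Descr* descr = nullptr;
int rc = PyArray_DescrAlignConverter2(dtype, &descr);
Py_DECREF(dtype);
assert(rc == NPY_SUCCEED);

npy_intp dimension = vec.size();
// NPY_ARRAY_CARRAY marks the buffer writeable so that Python can fill it in.
// The array does not own the data: vec must outlive it and must not reallocate.
PyObject* arr = PyArray_NewFromDescr(&PyArray_Type, descr, 1, &dimension,
        nullptr, vec.data(), NPY_ARRAY_CARRAY, nullptr/*obj*/);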

Since your data comes from Python, you can call vec.resize(N) before wrapping the vector as above, which emulates numpy.zeros(N, 'i8'), and then fill in the values from Python (which writes directly into the C++ vector).
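
The sharing described here can be illustrated without NumPy. The following is only a sketch under stated assumptions: `BorrowedArray`, `wrap`, `fill_with_squares`, and `still_valid` are hypothetical stand-ins for the array object and for the Python side, and the point is that writes through the borrowed pointer land in the vector, while changing the vector's size afterwards leaves the wrapper stale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an array object that wraps memory it does not own.
struct BorrowedArray {
    long* data;
    std::size_t size;
};

// Wrap the vector's current storage without copying it.
BorrowedArray wrap(std::vector<long>& vec) {
    return BorrowedArray{vec.data(), vec.size()};
}

// Write through the borrowed pointer, as the Python side would.
void fill_with_squares(BorrowedArray arr) {
    assert(arr.data != nullptr || arr.size == 0);
    for (std::size_t i = 0; i < arr.size; ++i) {
        arr.data[i] = static_cast<long>(i * i);
    }
}

// True only while the wrapper matches the vector's current size and storage.
// The size is compared first so that a stale pointer is never inspected
// after the vector has grown or shrunk.
bool still_valid(const BorrowedArray& arr, const std::vector<long>& vec) {
    return arr.size == vec.size() && arr.data == vec.data();
}
```

For example, after `vec.resize(4)` and `fill_with_squares(wrap(vec))`, the vector itself holds 0, 1, 4, 9. A later `vec.push_back(...)` changes the size and may move the storage, so any wrapper created earlier should be discarded and rebuilt.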

edited Mar 28, 2020 at 0:12

answered Mar 27, 2020 at 23:46

1 Comment

NLi10Me Over a year ago

Thanks John. Unfortunately, in my case what I'm passing from Python to C++ is the result of a function defined in Python code that I do not want to edit, and I do not know the size of each sub-array in advance, i.e. the memory must be allocated inside that Python function call. See my edit above for an incremental improvement on what I had before, but maybe it can be improved even more?
