
I'm passing NumPy data to a C++ extension function using PyArg_ParseTuple. One of the arguments is a NumPy array of objects, i.e. dtype='O'. These objects are in fact one-dimensional NumPy arrays themselves, each with a different length.

I've succeeded with the following code, but as a novice at building Python extensions, I wonder whether there is a better way to do this.

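PyArrayObject *arr_neighbors = NULL;
if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &arr_neighbors))
  return NULL;  // parsing failed; the Python exception is already set

// n_polys is assumed to equal the length of arr_neighbors.
std::vector<long*> neighbors(n_polys);
std::vector<npy_intp> neighbor_lengths(n_polys);
for (npy_intp i = 0; i < n_polys; i++) {
  // For dtype='O', PyArray_GETITEM returns a new reference.
  PyObject *item = PyArray_GETITEM(arr_neighbors, (char*) PyArray_GetPtr(arr_neighbors, &i));
  if (item == NULL)
    return NULL;
  PyArrayObject *array = (PyArrayObject*) item;  // assumes each element is a 1-D array of C long
  neighbor_lengths[i] = PyArray_DIM(array, 0);
  neighbors[i] = (long*) PyArray_DATA(array);
  // arr_neighbors still holds its own reference, so the data stays alive.
  Py_DECREF(item);
}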
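
One small variation on the bookkeeping above, sketched without NumPy so that it compiles on its own: instead of two parallel vectors, each sub-array's pointer and length can be kept together in one non-owning struct. NeighborView and sum_all are hypothetical names used only for illustration, not part of the NumPy API. In the extension, the pointer and length would come from PyArray_DATA and PyArray_DIM as in the loop above, and the views would only be valid while the outer array is alive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a non-owning view of one sub-array.
struct NeighborView {
    const long* data;    // borrowed pointer, e.g. from PyArray_DATA
    std::size_t length;  // element count, e.g. from PyArray_DIM(array, 0)
};

// Example consumer: sums every element of a ragged collection of views.
long sum_all(const std::vector<NeighborView>& views) {
    long total = 0;
    for (const NeighborView& v : views) {
        for (std::size_t j = 0; j < v.length; ++j) {
            total += v.data[j];
        }
    }
    return total;
}
```

Keeping the pointer and its length in one element makes it harder for the two to get out of sync, at no extra copying cost.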
  • There's nothing 'efficient' about object dtype arrays. It's basically a list of references to other objects, in your case, more arrays. Commented Mar 27, 2020 at 20:57
  • I understand that the array of objs is an array of references - I'm asking for ways to make the copy more efficient, not the array of objs itself. E.g. since I know the objs are actually arrays, is there a method in the arrayobject.h api that might replace the inner loop here. Someone who knows the arrayobject api well may know of more efficient ways to accomplish what I'm trying to do. Commented Mar 27, 2020 at 21:11

1 Answer

You can share the data between Python and C++, avoiding a copy, if you allocate it in C++. Here's how to do it for one vector<long>, which you'll need to repeat for each one within your outer vector:

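std::vector<long> vec; // TODO: populate; must outlive the array created below

// "i8" is a 64-bit integer, so this assumes sizeof(long) == 8.
PyObject* dtype = PyUnicode_FromString("i8");
PyArray_Descr* descr;
int rc = PyArray_DescrAlignConverter2(dtype, &descr);
Py_DECREF(dtype);
assert(rc == 1);

npy_intp dimension = vec.size();
// PyArray_NewFromDescr steals the reference to descr.
// NPY_ARRAY_CARRAY marks the borrowed buffer as writeable from Python.
PyObject* arr = PyArray_NewFromDescr(&PyArray_Type, descr, 1, &dimension,
        nullptr, vec.data(), NPY_ARRAY_CARRAY/*flags*/, nullptr/*obj*/);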

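Since your data comes from Python, you can call vec.resize(N) before creating the array as above to emulate numpy.zeros(N, 'i8'), then fill in the values in Python (which will modify the vector in C++). Don't resize the vector after the array has been created, because a reallocation would leave the array pointing at freed memory.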
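
To make the sharing concrete, here is a minimal sketch in plain C++ with no NumPy involved. BorrowedBuffer, borrow and fill_with_squares are hypothetical stand-ins: the struct plays the role of the array wrapping vec.data(), and the fill function plays the role of Python code assigning into that array. It assumes the vector is sized before the pointer is taken and is not resized afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an array that wraps vec.data() without owning it.
struct BorrowedBuffer {
    long long* data;
    std::size_t size;
};

// Takes the pointer and length of an already-sized vector.
BorrowedBuffer borrow(std::vector<long long>& vec) {
    return BorrowedBuffer{vec.data(), vec.size()};
}

// Plays the role of Python code writing into the wrapping array.
void fill_with_squares(BorrowedBuffer buf) {
    for (std::size_t i = 0; i < buf.size; ++i) {
        buf.data[i] = static_cast<long long>(i * i);
    }
}
```

Writes made through the borrowed pointer show up in the vector because both refer to the same memory; nothing is copied.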


1 Comment

Thanks John, unfortunately in my case what I'm passing from python to c++ is the result of a function call defined in python code that I do not want to edit, and I do not know the size of each sub-array in advance, i.e. the memory must be allocated from within that python function call. See my edit above for an incremental improvement on what I had before - but maybe it can be improved even more?
