
I need some help passing a C array to Python (NumPy). I have a 2D array of doubles, NumRows x NumInputs, and it seems that PyArray_SimpleNewFromData does not convert it correctly. It is hard to tell because the debugger shows little beyond pointers.

What is the right way to pass a two-dimensional array?

int NumRows = X_test.size();
int NumInputs = X_test_row.size();

double **X_test2 = new double*[NumRows];
for(int i = 0; i < NumRows; ++i) 
{
    X_test2[i] = new double[NumInputs];
}


for(int r = 0; r < NumRows; ++r) 
{
    for(int c = 0; c < NumInputs; ++c) 
    {
        X_test2[r][c] = X_test[r][c];
    }
}




const char *ScriptFName = "100-ABN-PREDICT";
char *FunctionName=NULL;

FunctionName="PredictGBC_DBG"; 

npy_intp Dims[2];
Dims[0]= NumRows;
Dims[1] = NumInputs;

PyObject *ArgsArray;
PyObject *pName, *pModule, *pDict, *pFunc, *pValue, *pArgs;

int row, col, rows, cols, size, type;

const double* outArray;
double ArrayItem;

//===================

Py_Initialize();

pName = PyBytes_FromString(ScriptFName);

pModule = PyImport_ImportModule(ScriptFName);

if (pModule != NULL)
{
    import_array(); // Required for the C-API

    ArgsArray = PyArray_SimpleNewFromData (2, Dims, NPY_DOUBLE, X_test2);//SOMETHING WRONG 

    pDict = PyModule_GetDict(pModule);

    pArgs = PyTuple_New (1);
    PyTuple_SetItem (pArgs, 0, ArgsArray);

    pFunc = PyDict_GetItemString(pDict, FunctionName);

    if (pFunc && PyCallable_Check(pFunc))
    {

        pValue = PyObject_CallObject(pFunc, pArgs);//CRASHING HERE

        if (pValue != NULL) 
        {
            rows = PyArray_DIM(pValue, 0);
            cols = PyArray_DIM(pValue, 1);
            size = PyArray_SIZE(pValue);
            type = PyArray_TYPE(pValue);


            // get direct access to the array data
            //PyObject* m_obj;
            outArray = static_cast<const double*>(PyArray_DATA(pValue));


            for (row=0; row < rows; row++) 
            {
                ArrayItem = outArray[row];
                y_pred.push_back(ArrayItem);
            }

        }
        else 
        {
            y_pred.push_back(EMPTY_VAL);
        }
    }
    else 
    {
        PyErr_Print();
    }//pFunc && PyCallable_Check(pFunc)



}//(pModule!=NULL
else
{
    PyErr_SetString(PyExc_TypeError, "Cannot call function ?!");
    PyErr_Print();
}




Py_DECREF(pValue);
Py_DECREF(pFunc);

Py_DECREF(ArgsArray);  
Py_DECREF(pModule);
Py_DECREF(pName);


Py_Finalize (); 
  • Firstly, I see new, so I guess the better tag is C++, even if what you're doing is largely C-like. Secondly, I would argue X_test2 is not a two-dimensional array but an array of arrays. Each subarray happens to be the same size (NumInputs), but it doesn't have to be. Commented Jan 14, 2015 at 10:45
  • If you don't mind using Cython, which is very much an accepted standard for interfacing numpy and C, you can make it a lot easier. Though in that case, it is probably easier (recommended?) to allocate the array in Python/numpy, and then pass that to your C routine to do the computations (so your second for-loop, I guess). There are some examples at the Cython wiki to help you out. Note how that numpy array is 2D, but is then passed as a single pointer and used as a 1D array inside the C code. Hence (partly) my previous comment. Commented Jan 14, 2015 at 10:47
  • It is a bit more complicated: the C++ part is a DLL used by some other software. It should only get the data, convert it to a numpy array, and pass it to Python, where all the calculations are done (scikit-learn). Commented Jan 14, 2015 at 10:52
  • If it's a DLL, can't you use ctypes? Commented Jan 14, 2015 at 11:22
  • Let's focus on passing a 2D array to Python. Commented Jan 14, 2015 at 11:43

1 Answer


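You'll have to copy your data to a contiguous block of memory. Numpy does not represent a 2d array as an array of pointers to 1d arrays; it expects the data to be stored in a single contiguous block of memory, in (by default) row-major order. Passing X_test2 to PyArray_SimpleNewFromData therefore makes numpy treat the table of row pointers as if it were the array data, and read past its end.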
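To make the layout concrete, here is a minimal sketch of row-major addressing in plain C++, with no NumPy headers involved. The helper names row_major_offset and make_demo_block are made up for illustration; they are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: offset of element (r, c) inside one contiguous,
// row-major block that has `cols` columns per row.
inline std::size_t row_major_offset(std::size_t r, std::size_t c, std::size_t cols) {
    assert(c < cols);  // column index must be inside the row
    return r * cols + c;
}

// Hypothetical helper: build a rows x cols block in which element (r, c)
// holds 10*r + c, so the layout is easy to inspect in a debugger.
std::vector<double> make_demo_block(std::size_t rows, std::size_t cols) {
    std::vector<double> block(rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            block[row_major_offset(r, c, cols)] = 10.0 * r + c;
        }
    }
    return block;
}
```

A block laid out like this is what a C-contiguous array of shape (rows, cols) looks like in memory: row 0 first, then row 1, and so on, with no pointer table in between.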

If you create your array using PyArray_SimpleNew(...), numpy allocates the memory for you. You have to copy X_test2 to this array, using, say, std::memcpy or std::copy in a loop over the rows.

That is, change this:

ArgsArray = PyArray_SimpleNewFromData (2, Dims, NPY_DOUBLE, X_test2);//SOMETHING WRONG 

to something like this:

// PyArray_SimpleNew allocates the memory needed for the array.
ArgsArray = PyArray_SimpleNew(2, Dims, NPY_DOUBLE);

// The pointer to the array data is accessed using PyArray_DATA().
// Recent NumPy headers expect a PyArrayObject*, hence the cast.
double *p = static_cast<double *>(
    PyArray_DATA(reinterpret_cast<PyArrayObject *>(ArgsArray)));

// Copy the data from the "array of arrays" to the contiguous numpy array,
// one row at a time (memcpy is declared in <cstring>).
for (int k = 0; k < NumRows; ++k) {
    memcpy(p, X_test2[k], sizeof(double) * NumInputs);
    p += NumInputs;
}

(It looks like X_test2 is a copy of X_test, so you might want to modify the above code to copy directly from X_test to the numpy array.)
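A rough sketch of that direct copy, assuming X_test is a std::vector<std::vector<double>> (the question does not show its type, so that is a guess). The function name copy_rows_to_contiguous is hypothetical; dst stands in for the pointer obtained from PyArray_DATA().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copy every row of `rows` into `dst`, back to back.
// `dst` must point at rows.size() * num_cols contiguous doubles, for
// example the buffer of an array created with PyArray_SimpleNew.
void copy_rows_to_contiguous(const std::vector<std::vector<double>>& rows,
                             std::size_t num_cols, double* dst) {
    assert(dst != nullptr || rows.empty());
    for (const auto& row : rows) {
        // A ragged input cannot be represented as a rectangular array.
        if (row.size() != num_cols) {
            throw std::invalid_argument("row length differs from num_cols");
        }
        // std::copy returns the position just past the last element written,
        // which is where the next row starts.
        dst = std::copy(row.begin(), row.end(), dst);
    }
}
```

With something like this, the intermediate X_test2 allocation and its two loops would no longer be needed.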


5 Comments

Thanks, I just checked quickly and it seems to work (I will investigate further later). By the way, do you know why calling cols = PyArray_DIM(pValue, 1); does not return the number of columns, i.e. array.shape[1]? It returns 8 when the numpy array holds doubles and 4 when it holds int32.
What is pValue? The first argument of PyArray_DIM() must be the python object holding the numpy array, e.g. ArgsArray.
It is in the code attached to the question: pValue = PyObject_CallObject(pFunc, pArgs). This is the numpy array returned from Python.
I just figured it out (in a way): this happens when the numpy array is 1d.
Ah, that makes sense. PyArray_DIM(arr, k) indexes into an array of length PyArray_NDIM(arr), which is 1 for a 1d array, so k = 1 reads past its end.
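One way to guard against this is to look at the number of dimensions before reading any of them. The sketch below is plain C++ so it can be tried without the NumPy headers: ndim and dims stand in for what PyArray_NDIM(arr) and PyArray_DIMS(arr) return, std::ptrdiff_t stands in for npy_intp, and Shape2D and shape_as_2d are hypothetical names. The 8 and 4 seen above are consistent with the out-of-bounds read landing on the array's stride, but that is an implementation detail and only a guess here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: a shape normalised to rows x cols.
struct Shape2D {
    std::ptrdiff_t rows;
    std::ptrdiff_t cols;
};

// Hypothetical helper: read a shape without indexing past `ndim` entries.
// A 1d array of length n is reported as n rows and 1 column.
// Returns false for any other number of dimensions and leaves `out` untouched.
inline bool shape_as_2d(int ndim, const std::ptrdiff_t* dims, Shape2D& out) {
    assert(ndim <= 0 || dims != nullptr);
    if (ndim == 1) {
        out = {dims[0], 1};
        return true;
    }
    if (ndim == 2) {
        out = {dims[0], dims[1]};
        return true;
    }
    return false;
}
```

In the question's code, the loop over rows reads outArray[row], so treating a 1d result as rows x 1 matches how the values are consumed there.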
