I am trying to understand the meaning of the ndarray.data field in NumPy (see the memory layout section of the reference page on N-dimensional arrays), especially for views into arrays. To quote the documentation:

ndarray.data -- Python buffer object pointing to the start of the array’s data

According to this description, I was expecting this to be a pointer to the C-array underlying the instance of ndarray.

Consider x = np.arange(5, dtype=np.float64).

Form y as a view into x using a slice: y = x[3:1:-1].

I was expecting x.data to point at the location of 0. and y.data to point at the location of 3. (the first element of y). I therefore expected the memory pointer printed by y.data to be offset by 3*x.itemsize bytes from the one printed by x.data, but this does not appear to be the case:

>>> import numpy as np
>>> x = np.arange(5, dtype=np.float64)
>>> y = x[3:1:-1]
>>> x.data
<memory at 0x000000F2F5150348>
>>> y.data
<memory at 0x000000F2F5150408>
>>> int('0x000000F2F5150408', 16) - int('0x000000F2F5150348', 16)
192
>>> 3*x.itemsize
24

The 'data' key in the __array_interface__ dictionary associated with the ndarray instance behaves more like I expect, although it may not itself be a pointer:

>>> y.__array_interface__['data'][0] - x.__array_interface__['data'][0]
24

So this raises the question: what does ndarray.data actually give?

Thanks in advance.

  • Since y is non-contiguous, it doesn't expose data for me (>>> y.data raises AttributeError: cannot get single-segment buffer for discontiguous array), so I can't see how you are going to compare x.data and y.data. (numpy 1.11.1 and python 2.7.12 win32 here.) Commented Sep 14, 2016 at 22:11
  • 192 = 3*64, just saying. Commented Sep 14, 2016 at 22:13
  • @ivan_pozdeev I am not getting this error from evaluation of y.data using numpy 1.11 on Windows and Linux using Python 3.5.2 from Anaconda distribution. What is your configuration? Commented Sep 14, 2016 at 22:16
  • 24 bytes, 192 bits? Commented Sep 14, 2016 at 22:21
  • @user40314 I only said that it returns an error for me, I couldn't know about the cause of discrepancy. Since the question couldn't be answered without finding it out, I required additional info. Commented Sep 14, 2016 at 22:38

2 Answers

3

<memory at 0x000000F2F5150348> is a memoryview object located at address 0x000000F2F5150348; the buffer it provides access to is located somewhere else.

Memoryviews provide a number of operations described in the relevant official documentation, but, at least in the Python-side API, they do not provide any way to access the raw address of the memory they expose. In particular, the number after "at" in the repr is not what you're looking for.
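As a minimal sketch of this point (assuming CPython, where id() returns an object's address and the default repr prints that same address), the number shown by x.data can be compared with id() of the memoryview, and the actual data addresses can be read from ndarray.ctypes.data instead. The variable names are only illustrative.

```python
import numpy as np

x = np.arange(5, dtype=np.float64)
y = x[3:1:-1]  # view whose first element is x[3]

m = x.data  # each access to .data builds a new memoryview object

# Assumption (CPython detail): the hex number in repr(m) is the address of
# the memoryview object itself, which is what id(m) returns.
repr_address = int(repr(m).rstrip(">").split()[-1], 16)
same_as_id = repr_address == id(m)

# The address of an array's first element is exposed by ndarray.ctypes.data
# (the same integer as __array_interface__['data'][0]).
offset_bytes = y.ctypes.data - x.ctypes.data

print(same_as_id, offset_bytes, 3 * x.itemsize)
```

On CPython this should report that the repr number matches id(m), and that the data addresses differ by 3*x.itemsize = 24 bytes, which is the offset the question expected.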

2

Generally the number displayed by x.data isn't meant to be used by you. x.data is the buffer, which can be used in other contexts that expect a buffer.

np.frombuffer(x.data,dtype=float)

replicates your x.

np.frombuffer(x[3:].data,dtype=float)

This replicates x[3:]. But from Python you can't take x.data, add 192 bits (3*8*8, i.e. 24 bytes) to it, and expect to get x[3:].
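Although the memoryview cannot be advanced with pointer arithmetic, np.frombuffer accepts an offset argument in bytes, which may be the closest Python-level equivalent. A small sketch, with tail as an illustrative name:

```python
import numpy as np

x = np.arange(5, dtype=np.float64)

# offset is counted in bytes, so skipping three float64 items is 3 * 8 bytes
tail = np.frombuffer(x.data, dtype=np.float64, offset=3 * x.itemsize)

print(tail)                       # the same values as x[3:]
print(np.shares_memory(tail, x))  # the new array reuses x's buffer, no copy
```

Here tail should hold the same values as x[3:] and share memory with x.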

I often use the __array_interface__['data'] value to check whether two variables share a data buffer, but I don't use that number for anything else. These are informative numbers, not working values.
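A rough sketch of that kind of check follows. The helper data_address is hypothetical, introduced only for this example; np.shares_memory is NumPy's own function for the same question and is shown for comparison.

```python
import numpy as np

x = np.arange(5, dtype=np.float64)
y = x[3:1:-1]  # view into x's buffer
z = x.copy()   # separate buffer with the same values


def data_address(a):
    """Hypothetical helper: address of the first element of array a."""
    return a.__array_interface__["data"][0]


# Does the first element of the other array lie inside x's buffer?
start = data_address(x)
end = start + x.nbytes
y_inside_x = start <= data_address(y) < end
z_inside_x = start <= data_address(z) < end

print(y_inside_x, z_inside_x)
print(np.shares_memory(x, y), np.shares_memory(x, z))
```

The address comparison is only an informal diagnostic; np.shares_memory is the more reliable choice when strides or overlapping views are involved.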

I recently explored this in

Creating a NumPy array directly from __array_interface__

1 Comment

Thank you for your response and for the link. My confusion came from conflating PyArray_DATA in the C-API with ndarray.data.
