I have a question regarding NumPy's memory views:

Suppose we have two arrays with memory:

import numpy as np
import gc
x = np.arange(4*3).reshape(4,3).astype(float)
y = (np.arange(5) - 5).astype(float)
y_ref = y

We use these arrays (x, y) in a framework, so we cannot simply redefine them: the user may hold their own references to them (as with y_ref). Now we want to combine their memory in one view, so that a single array, say p, shares its memory with both arrays.

I did it in the following way, but do not know if this causes a memory leak:

p = np.empty(x.size+y.size, dtype=float) # create new memory block with right size
c = 0 # current point in memory

# x
p[c:c+x.size].flat = x.flat # set the memory for combined array p
x.data = p[c:c+x.size].data # now set the buffer of x to be the right length buffer of p

c += x.size

# y
p[c:c+y.size].flat = y.flat # set the memory for combined array p
y.data = p[c:c+y.size].data # and set the buffer of y to be the right length buffer of p

Thus, we can now operate on the single view p or on either of the arrays, without having to redefine every single reference to them:

x[3] = 10
print(p[3*3:4*3])
# [ 10.  10.  10.]

Even y_ref has got the update:

print(y[0]) # -5.0
y_ref[0] = 100
print(p[x.size]) # 100.0

Is this the correct way of setting the memory of an array to be a view into another array?

Is there an obvious way of unifying the memory of arrays, which I am blatantly missing?

I am not sure what happens to the old data buffers of x and y now that nothing refers to them. Will they be deallocated?

Update thanks @Jaime:

p.size can get very large (into the billions) on the datasets I am working with (microbiology). Also, this scheme is used in a framework with potentially deep structures, so updating all local versions can get expensive. All parameters need to be updated inside an optimization loop, so it is crucial to have everything in memory.

Actually, your approach is where I started. I moved away from it because traversing the Python hierarchy to update all local copies was inefficient.

    Why don't you simply make yourself a copy of the two arrays? p = np.concatenate((a, b)); local_a = p[:len(a)]; local_b = p[len(a):]. If you need a and b to reflect the changes you make to their local versions, finish your manipulations with a[:] = local_a; b[:] = local_b. Commented May 14, 2014 at 14:26
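For reference, the copy-and-write-back idea from this comment might look roughly like the sketch below. It assumes small example arrays named a and b (not the asker's real data) and flattens the 2-D array with ravel() before concatenating, which the comment does not spell out. Note that it costs one extra copy of all the data, which is the asker's concern for very large p.

```python
import numpy as np

# Hypothetical stand-ins for the framework's arrays.
a = np.arange(4 * 3, dtype=float).reshape(4, 3)
b = np.arange(5, dtype=float) - 5

# One contiguous copy of both arrays (a is flattened first).
p = np.concatenate((a.ravel(), b))

# Views into p; reshaping a contiguous slice does not copy.
local_a = p[:a.size].reshape(a.shape)
local_b = p[a.size:]

# Work on the combined block or on the local views.
local_a[3] = 10
local_b[0] = 100

# Write the results back into the original arrays in place,
# so existing references to a and b see the changes.
a[:] = local_a
b[:] = local_b
```

The write-back step is what keeps user-held references valid, since a and b keep their own buffers throughout.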

1 Answer

According to the source code, the old data buffer will be freed.

https://github.com/numpy/numpy/blob/6c6ddaf62e0556919a57d510e13ccb2e6cd6e043/numpy/core/src/multiarray/getset.c#L329

However, if another array still references the old buffer, this causes a problem:

import numpy as np

a = np.zeros(10)
b = np.zeros(10)
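c = a[:]    # c is a view on a's original buffer
a.data = b  # a's old buffer is released while c still points at it
print(c)    # c now reads memory that a no longer owns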
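To see why c is affected, it may help to inspect the view relationship before any buffer is swapped. The following is only an illustrative sketch using the same names as above; it does not perform the unsafe assignment.

```python
import numpy as np

a = np.zeros(10)
c = a[:]  # basic slicing returns a view, not a copy

# c does not own its memory; it borrows a's buffer.
print(c.base is a)             # True
print(c.flags.owndata)         # False
print(np.shares_memory(a, c))  # True
```

Because c only borrows the buffer that a allocated, replacing that buffer behind a's back leaves c without valid backing memory.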