Numpy set array memory

Question

I have a question regarding NumPy's memory views:

Suppose we have two arrays, each with its own memory:

import numpy as np
import gc
x = np.arange(4*3).reshape(4,3).astype(float)
y = (np.arange(5) - 5).astype(float)
y_ref = y

We use these (x, y) in a framework, so we cannot simply redefine them: the user may have kept their own references to them (as with y_ref). Now we want to combine their memory in one view, so that a single array, say p, shares its memory with both arrays.

I did it in the following way, but do not know if this causes a memory leak:

p = np.empty(x.size+y.size, dtype=float) # create new memory block with right size
c = 0 # current point in memory

# x
p[c:c+x.size].flat = x.flat # set the memory for combined array p
x.data = p[c:c+x.size].data # now set the buffer of x to be the right length buffer of p

c += x.size

# y
p[c:c+y.size].flat = y.flat # set the memory for combined array p
y.data = p[c:c+y.size].data # and set the buffer of y to be the right length buffer of p

Thus, we can now operate on the single view p or on either of the arrays, without having to redefine every single reference to them:

x[3] = 10
print(p[3*3:4*3])
# [ 10.  10.  10.]

Even y_ref has got the update:

print(y[0]) # -5
y_ref[0] = 100
print(p[x.size]) # 100

Is this the correct way of setting the memory of an array to be a view into another array?

Is there an obvious way of unifying the memory of arrays, which I am blatantly missing?

I am not sure what will happen with the old data buffers of x and y as they are out of scope now. Will they get deallocated?

Update (thanks @Jaime):

p.size can get very large (into the billions) on the datasets I work with (microbiology). Also, this pattern is used in a framework with potentially deep structures, so updating all local versions can get expensive. All parameters need to be updated inside an optimization loop, so it is crucial to have everything in memory.

Actually, your approach is where I started in the first place; I moved away from it because traversing the Python hierarchy to update all local copies was inefficient.

Why don't you simply make yourself a copy of the two arrays? p = np.concatenate((a, b)); local_a = p[:len(a)]; local_b = p[len(a):] If you need a and b to reflect the changes you make to their local versions, finish your manipulations with a[:] = local_a; b[:] = local_b.
– Jaime, Commented May 14, 2014 at 14:26
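
The copy-and-write-back approach from this comment can be sketched as follows. This is only an illustration: the arrays a and b and their values are stand-ins for whatever the framework owns, and both are 1-D here (a 2-D array such as x above would need a.ravel() before concatenating).

```python
import numpy as np

# Hypothetical stand-ins for the arrays owned by the framework.
a = np.arange(3, dtype=float)      # [0., 1., 2.]
b = np.arange(3, 5, dtype=float)   # [3., 4.]

# One contiguous copy; local_a and local_b are views into p.
p = np.concatenate((a, b))
local_a = p[:len(a)]
local_b = p[len(a):]

# Work on the combined vector. The views see the change; a and b do not yet.
p *= 2

# Write the results back in place, so existing references to a and b stay valid.
a[:] = local_a
b[:] = local_b
```

The write-back step copies every element, which is the cost the update above objects to when p is very large.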

HYRY · Accepted Answer · 2014-05-14 11:22:08Z

3

According to the source code, the old data buffer will be freed.

https://github.com/numpy/numpy/blob/6c6ddaf62e0556919a57d510e13ccb2e6cd6e043/numpy/core/src/multiarray/getset.c#L329

However, if the old buffer is still referenced by another array, this will cause problems:

import numpy as np

a = np.zeros(10)
b = np.zeros(10)
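c = a[:]      # c is a view that still refers to a's original buffer
a.data = b    # per the linked source, a's original buffer is freed here
print(c)      # c now reads from memory that has been freed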
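
The reason c is affected is that basic slicing returns a view rather than a copy. A minimal sketch of how to check this for arrays you already hold references to, without touching .data, is shown below; the name d is introduced only for illustration. Note that NumPy offers no way to list every view of a buffer, so this only helps for arrays you know about.

```python
import numpy as np

a = np.zeros(10)
c = a[:]        # basic slicing: a view onto a's buffer
d = a.copy()    # an independent copy with its own buffer

print(c.base is a)             # True: c keeps a reference to a
print(np.shares_memory(a, c))  # True
print(np.shares_memory(a, d))  # False

c[0] = 1.0
print(a[0])                    # 1.0, the write went through the shared buffer
```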
