Appending rows to numpy array using less memory

Question

I have the following problem. I need to change the shape of one Numpy array to match the shape of another Numpy array by adding rows and columns.

Let's say this is the array that needs to be changed:

change_array = np.random.rand(150, 120)

And this is the reference array:

reference_array = np.random.rand(200, 170)

To match the shapes I'm adding rows and columns containing zeros, using the following function:

def match_arrays(change_array, reference_array):

    cols = np.zeros((change_array.shape[0], (reference_array.shape[1] - change_array.shape[1])), dtype=np.int8)
    change_array = np.append(change_array, cols, axis=1)
    rows = np.zeros(((reference_array.shape[0] - change_array.shape[0]), reference_array.shape[1]), dtype=np.int8)
    change_array = np.append(change_array, rows, axis=0)
    return change_array

This works perfectly and changes the shape of change_array to the shape of reference_array. However, with this method the array is copied twice in memory, once for each np.append call. I understand that NumPy has to copy the array in order to have space to append the rows and columns.

As my arrays can get very large, I am looking for another, more memory-efficient method that achieves the same result. Thanks!

Warren Weckesser · Accepted Answer · 2016-12-15 09:06:55Z


Here are a couple of ways. In the code examples, I'll use the following arrays:

In [190]: a
Out[190]: 
array([[12, 11, 15],
       [16, 15, 10],
       [16, 12, 13],
       [11, 19, 10],
       [12, 12, 11]])

In [191]: b
Out[191]: 
array([[70, 82, 83, 93, 97, 55],
       [50, 86, 53, 75, 75, 69],
       [60, 50, 76, 52, 72, 88],
       [72, 79, 66, 93, 58, 58],
       [57, 92, 71, 97, 91, 50],
       [60, 77, 67, 91, 91, 63],
       [60, 90, 91, 50, 86, 71]])

Use numpy.pad:

In [192]: np.pad(a, [(0, b.shape[0] - a.shape[0]), (0, b.shape[1] - a.shape[1])], 'constant')
Out[192]: 
array([[12, 11, 15,  0,  0,  0],
       [16, 15, 10,  0,  0,  0],
       [16, 12, 13,  0,  0,  0],
       [11, 19, 10,  0,  0,  0],
       [12, 12, 11,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0]])
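The same np.pad call can be wrapped in a small helper that works for any number of dimensions. This is only a sketch: the name pad_to_shape and the shape checks are assumptions added for illustration, not part of the original answer.

```python
import numpy as np

def pad_to_shape(arr, target_shape):
    """Zero-pad `arr` at the end of each axis until it has `target_shape`.

    Hypothetical helper: the name and the validation are illustrative only.
    """
    if arr.ndim != len(target_shape):
        raise ValueError("arr and target_shape must have the same number of dimensions")
    pad_width = []
    for have, want in zip(arr.shape, target_shape):
        if want < have:
            raise ValueError("target_shape must not be smaller than arr.shape")
        # (before, after): pad only at the end of each axis
        pad_width.append((0, want - have))
    return np.pad(arr, pad_width, mode="constant", constant_values=0)

a = np.arange(6).reshape(2, 3)
padded = pad_to_shape(a, (4, 5))
```

Here padded has shape (4, 5), with the original values in its top-left 2x3 corner and zeros everywhere else.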

Or, use a more efficient version of your function, in which the result is preallocated as an array of zeros with the same shape as reference_array, and then the values in change_array are copied into the result:

In [193]: def match_arrays(change_array, reference_array):
     ...:     result = np.zeros(reference_array.shape, dtype=change_array.dtype)
     ...:     nrows, ncols = change_array.shape
     ...:     result[:nrows, :ncols] = change_array
     ...:     return result
     ...: 

In [194]: match_arrays(a, b)
Out[194]: 
array([[12, 11, 15,  0,  0,  0],
       [16, 15, 10,  0,  0,  0],
       [16, 12, 13,  0,  0,  0],
       [11, 19, 10,  0,  0,  0],
       [12, 12, 11,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0]])
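The preallocated version still holds change_array and the result in memory at the same time. If you control the code that produces the smaller array, one possible variation is to allocate the full-size zero array first and write the data straight into a slice of it, so that no separate smaller array is needed. The sketch below assumes the data comes from a ufunc that accepts an out argument; the shapes and the np.multiply call are stand-ins for whatever actually produces your data.

```python
import numpy as np

reference_shape = (200, 170)   # shape of the reference array
nrows, ncols = 150, 120        # shape the smaller array would have had

# Allocate the final array once, filled with zeros.
result = np.zeros(reference_shape, dtype=np.float64)

# Basic slicing returns a view into `result`, not a copy.
view = result[:nrows, :ncols]

# Stand-in computation: write its output directly into the view.
source = np.ones((nrows, ncols))
np.multiply(source, 3.0, out=view)
```

After this, result already has the reference shape, with the computed values in its top-left block and zeros in the padding.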

edited Dec 15, 2016 at 9:06

answered Dec 15, 2016 at 8:58

Warren Weckesser


1 Comment

hpaulj Over a year ago

np.pad does 2 concatenates. It does a prepend and postpend for each axis (a total of 4 operations), but is smart enough to skip the concatenate if the pad has size 0. It's a very general purpose function written in Python.
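How np.pad is implemented may differ between NumPy versions, so it can be worth measuring rather than guessing. The following is a rough sketch that uses the standard-library tracemalloc module to compare the peak traced memory of the two approaches. The helper names (peak_bytes, pad_version, prealloc_version) are made up for this example, and the numbers will depend on your NumPy version and array sizes.

```python
import tracemalloc
import numpy as np

def peak_bytes(func, *args):
    """Run func(*args) and return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

def pad_version(arr, shape):
    # Pad only at the end of each axis, as in the np.pad call above.
    widths = [(0, want - have) for have, want in zip(arr.shape, shape)]
    return np.pad(arr, widths, mode="constant")

def prealloc_version(arr, shape):
    # Preallocate zeros, then copy the data into the top-left block.
    out = np.zeros(shape, dtype=arr.dtype)
    out[:arr.shape[0], :arr.shape[1]] = arr
    return out

change_array = np.ones((150, 120))
target_shape = (200, 170)

padded, pad_peak = peak_bytes(pad_version, change_array, target_shape)
prealloc, prealloc_peak = peak_bytes(prealloc_version, change_array, target_shape)

print("np.pad peak bytes:     ", pad_peak)
print("preallocate peak bytes:", prealloc_peak)
```

Both functions return the same array; only the peak memory reported for each call is expected to differ.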
