I need to change the shape of one NumPy array to match the shape of another NumPy array by adding rows and columns.

Let's say this is the array that needs to be changed:

change_array = np.random.rand(150, 120)

And this is the reference array:

reference_array = np.random.rand(200, 170)

To match the shapes I'm adding rows and columns containing zeros, using the following function:

def match_arrays(change_array, reference_array):

    cols = np.zeros((change_array.shape[0], (reference_array.shape[1] - change_array.shape[1])), dtype=np.int8)
    change_array = np.append(change_array, cols, axis=1)
    rows = np.zeros(((reference_array.shape[0] - change_array.shape[0]), reference_array.shape[1]), dtype=np.int8)
    change_array = np.append(change_array, rows, axis=0)
    return change_array

This works perfectly and changes the shape of change_array to the shape of reference_array. However, with this method the array is copied twice in memory, once per np.append call. I understand that NumPy has to make a copy of the array in order to have space to append the rows and columns.

As my arrays can get very large, I am looking for another, more memory-efficient method that achieves the same result. Thanks!

1 Answer

Here are a couple of ways. In the code examples, I'll use the following arrays:

In [190]: a
Out[190]: 
array([[12, 11, 15],
       [16, 15, 10],
       [16, 12, 13],
       [11, 19, 10],
       [12, 12, 11]])

In [191]: b
Out[191]: 
array([[70, 82, 83, 93, 97, 55],
       [50, 86, 53, 75, 75, 69],
       [60, 50, 76, 52, 72, 88],
       [72, 79, 66, 93, 58, 58],
       [57, 92, 71, 97, 91, 50],
       [60, 77, 67, 91, 91, 63],
       [60, 90, 91, 50, 86, 71]])

Use numpy.pad:

In [192]: np.pad(a, [(0, b.shape[0] - a.shape[0]), (0, b.shape[1] - a.shape[1])], 'constant')
Out[192]: 
array([[12, 11, 15,  0,  0,  0],
       [16, 15, 10,  0,  0,  0],
       [16, 12, 13,  0,  0,  0],
       [11, 19, 10,  0,  0,  0],
       [12, 12, 11,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0]])
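
If you need this in more than one place, the np.pad call can be wrapped in a small helper. The following is only a sketch: the name pad_to_shape and its error handling are my own additions for illustration, and it assumes the target shape is at least as large as the input on every axis.

```python
import numpy as np

def pad_to_shape(arr, target_shape):
    """Zero-pad `arr` at the end of each axis so that it has `target_shape`.

    Hypothetical helper, not part of NumPy.
    """
    if arr.ndim != len(target_shape):
        raise ValueError("arr and target_shape must have the same number of dimensions")
    # One (before, after) pair per axis; nothing is added in front.
    pad_width = [(0, t - s) for s, t in zip(arr.shape, target_shape)]
    if any(after < 0 for _, after in pad_width):
        raise ValueError("target_shape must not be smaller than arr.shape on any axis")
    return np.pad(arr, pad_width, mode="constant", constant_values=0)

a = np.arange(15).reshape(5, 3)
b = np.zeros((7, 6))
padded = pad_to_shape(a, b.shape)
print(padded.shape)  # (7, 6)
```

Building pad_width from the two shapes means the same helper also works for 1-D or 3-D arrays, as long as both shapes have the same number of dimensions.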

Or, use a more efficient version of your function, in which the result is preallocated as an array of zeros with the same shape as reference_array, and then the values in change_array are copied into the result:

In [193]: def match_arrays(change_array, reference_array):
     ...:     result = np.zeros(reference_array.shape, dtype=change_array.dtype)
     ...:     nrows, ncols = change_array.shape
     ...:     result[:nrows, :ncols] = change_array
     ...:     return result
     ...: 

In [194]: match_arrays(a, b)
Out[194]: 
array([[12, 11, 15,  0,  0,  0],
       [16, 15, 10,  0,  0,  0],
       [16, 12, 13,  0,  0,  0],
       [11, 19, 10,  0,  0,  0],
       [12, 12, 11,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0]])
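
As a quick sanity check, here is a sketch that runs the preallocating function on arrays with the shapes from the question and compares the result with np.pad. The seed and the random data are arbitrary choices made for this example.

```python
import numpy as np

def match_arrays(change_array, reference_array):
    # Allocate the final array once, then copy the data into its top-left corner.
    result = np.zeros(reference_array.shape, dtype=change_array.dtype)
    nrows, ncols = change_array.shape
    result[:nrows, :ncols] = change_array
    return result

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
change_array = rng.random((150, 120))
reference_array = rng.random((200, 170))

out = match_arrays(change_array, reference_array)
via_pad = np.pad(change_array, [(0, 50), (0, 50)], mode="constant")

print(out.shape)                     # (200, 170)
print(np.array_equal(out, via_pad))  # True
print(out.nbytes)                    # 272000, i.e. 200 * 170 * 8 bytes
```

The only new allocation inside match_arrays is the result itself, so the extra memory needed is the size of one array with the reference shape and the dtype of change_array.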

1 Comment

In this case np.pad does 2 concatenates. In general it does a prepend and an append for each axis (a total of 4 operations for a 2-D array), but it is smart enough to skip a concatenate if the pad has size 0. It's a very general-purpose function written in Python.
