I want to implement the following routine without using loops, just using Numpy functions or ndarray methods. Here is the code:
def split_array_into_blocks( the_array, block_dim, total_blocks_per_row ):
n_grid = the_array.shape[0]
res = np.empty( (n_grid, total_blocks_per_row, total_blocks_per_row, block_dim,
block_dim ) )
for i in range( total_blocks_per_row ):
for j in range( total_blocks_per_row ):
subblock = the_array[ :, block_dim*i:block_dim*(i+1), block_dim*j:block_dim*(j+1) ]
res[ :, i,j,:,: ] = subblock
return res
I have tried with the "reshape" method so that:
the_array = the_array.reshape( ( n_grid, total_blocks_per_row, total_blocks_per_row, block_dim, block_dim) )
but this seems to change the order of the elements in some way, and the blocks needs to be stored exactly as in the routine. Can anyone provide a way to do this, with a short explanation of why the reshape method gives a different result here? (maybe I am missing using the np.transpose() in addition?)
EDIT: I came up with this alternative implementation, but I am still not sure if this is the most efficient way (maybe someone can shed some light here):
def split_array_in_blocks( the_array, block_dim, total_blocks_per_row ):
indx = [ block_dim*j for j in range( 1, total_blocks_per_row ) ]
the_array = np.array( [ np.split( np.split( the_array, indx, axis=1 )[j], indx, axis=-1 ) for j in range( total_blocks_per_row ) ] )
the_array = np.transpose( the_array, axes=( 2,0,1,3,4 ) )
return the_array
EXAMPLE: Here is a minimal working example for the two implementations. What we want is, from an initial "cube" of dimensions Nx3MX3M, decompose into blocks NxMxMx3x3, which are the chunked version of the original block. With the two implementations above, one can check that they give the same result; the question is on how to achieve this in an efficient way (i.e., no loops)
import numpy as np
def split_array_in_blocks_2( the_array, block_dim, total_blocks_per_row ):
n_grid = the_array.shape[0]
res = np.zeros( (n_grid, total_blocks_per_row, total_blocks_per_row, block_dim, block_dim ), dtype=the_array.dtype )
for i in range( total_blocks_per_row ):
for j in range( total_blocks_per_row ):
subblock = the_array[ :, block_dim*i:block_dim*(i+1), block_dim*j:block_dim*(j+1) ]
res[ :, i,j,:,: ] = subblock
return res
def split_array_in_blocks( the_array, block_dim, total_blocks_per_row ):
indx = [ block_dim*j for j in range( 1, total_blocks_per_row ) ]
the_array = np.array( [ np.split( np.split( the_array, indx, axis=1 )[j], indx, axis=-1 ) for j in range( total_blocks_per_row ) ] )
the_array = np.transpose( the_array, axes=( 2,0,1,3,4 ) )
return the_array
A = np.random.rand( 1001, 63, 63 )
n = 3
D = 21
from time import time
ts = time()
An = split_array_in_blocks( A, n, D )
t2 = time()
Bn = split_array_in_blocks_2( A, n, D )
t3 = time()
print( t2-ts )
print(t3-t2)
print(np.allclose( An, Bn ))