NumPy 2D array: partitioning into blocks using given indices and tracking the indices

Question

I have the indices of a 2D array. I want to partition the indices such that the corresponding entries form blocks (block size is given as input m and n). I want to track the indices of the blocks too.

For example, if the indices are as given below

(array([0, 0, 0, 0, 1, 1, 1, 1, 6, 6, 6, 6, 7, 7, 7, 7]), array([0, 1, 7, 8, 0, 1, 7, 8, 0, 1, 7, 8, 0, 1, 7, 8]))

for the original matrix (from which the indices are generated)

array([[3, 4, 2, 0, 1, 1, 0, 2, 4],
       [1, 3, 2, 0, 0, 1, 0, 4, 0],
       [1, 0, 0, 1, 1, 0, 1, 1, 3],
       [0, 0, 0, 3, 3, 0, 4, 0, 4],
       [4, 3, 4, 2, 1, 1, 0, 0, 4],
       [0, 1, 0, 4, 4, 2, 2, 2, 1],
       [2, 4, 0, 1, 1, 0, 0, 2, 1],
       [0, 4, 1, 3, 3, 2, 3, 2, 4]])

and if the block size is (2,2), then the blocks should be

[[3, 4],
 [1, 3]]

[[2, 4],
 [4, 0]]

[[2, 4],
 [0, 4]]

[[2, 1],
 [2, 4]]

I tried reshape, as A[inds].reshape(4,2,2), but it does not work. I also tried transposing the axes, with no success. I am also not sure how I can track the indices in each block.
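To make the failure concrete, here is a minimal reproduction using the matrix and indices from above (the names A and inds are the ones used in the reshape attempt). Indexing with the pair of index arrays returns a flat array in the order the index pairs are listed, so a plain reshape groups consecutive entries of one row instead of a 2x2 block:

```python
import numpy as np

A = np.array([[3, 4, 2, 0, 1, 1, 0, 2, 4],
              [1, 3, 2, 0, 0, 1, 0, 4, 0],
              [1, 0, 0, 1, 1, 0, 1, 1, 3],
              [0, 0, 0, 3, 3, 0, 4, 0, 4],
              [4, 3, 4, 2, 1, 1, 0, 0, 4],
              [0, 1, 0, 4, 4, 2, 2, 2, 1],
              [2, 4, 0, 1, 1, 0, 0, 2, 1],
              [0, 4, 1, 3, 3, 2, 3, 2, 4]])
inds = (np.array([0, 0, 0, 0, 1, 1, 1, 1, 6, 6, 6, 6, 7, 7, 7, 7]),
        np.array([0, 1, 7, 8, 0, 1, 7, 8, 0, 1, 7, 8, 0, 1, 7, 8]))

# Fancy indexing yields a 1D array: all of row 0 first, then row 1, and so on.
flat = A[inds]

# Each group of four consecutive values becomes one "block".
naive = flat.reshape(4, 2, 2)
print(naive[0].tolist())  # [[3, 4], [2, 4]] instead of the wanted [[3, 4], [1, 3]]
```

The first "block" mixes columns 0, 1, 7 and 8 of row 0, which is why the result does not match the blocks listed above.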

Edit: the code in the answer below does not work in the general case.

For the below array

array([[(1., 1.), (1., 2.), (1., 3.), (1., 4.), (1., 5.), (1., 6.),
        (1., 7.), (1., 8.)],
       [(2., 1.), (2., 2.), (2., 3.), (2., 4.), (2., 5.), (2., 6.),
        (2., 7.), (2., 8.)],
       [(3., 1.), (3., 2.), (3., 3.), (3., 4.), (3., 5.), (3., 6.),
        (3., 7.), (3., 8.)],
       [(4., 1.), (4., 2.), (4., 3.), (4., 4.), (4., 5.), (4., 6.),
        (4., 7.), (4., 8.)],
       [(5., 1.), (5., 2.), (5., 3.), (5., 4.), (5., 5.), (5., 6.),
        (5., 7.), (5., 8.)],
       [(6., 1.), (6., 2.), (6., 3.), (6., 4.), (6., 5.), (6., 6.),
        (6., 7.), (6., 8.)],
       [(7., 1.), (7., 2.), (7., 3.), (7., 4.), (7., 5.), (7., 6.),
        (7., 7.), (7., 8.)],
       [(8., 1.), (8., 2.), (8., 3.), (8., 4.), (8., 5.), (8., 6.),
        (8., 7.), (8., 8.)]], dtype=[('f0', '<f2'), ('f1', '<f2')])

with indices

(array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
       2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,
       5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7]), array([0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5,
       6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3,
       4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7]))

and with a block size of (4,4) it returns the below result

array([[[(1., 1.), (1., 2.), (1., 3.), (1., 4.)],
        [(2., 1.), (2., 2.), (2., 3.), (2., 4.)],
        [(1., 5.), (1., 6.), (1., 7.), (1., 8.)],
        [(2., 5.), (2., 6.), (2., 7.), (2., 8.)]],

       [[(3., 1.), (3., 2.), (3., 3.), (3., 4.)],
        [(3., 5.), (3., 6.), (3., 7.), (3., 8.)],
        [(4., 1.), (4., 2.), (4., 3.), (4., 4.)],
        [(4., 5.), (4., 6.), (4., 7.), (4., 8.)]],

       [[(5., 1.), (5., 2.), (5., 3.), (5., 4.)],
        [(5., 5.), (5., 6.), (5., 7.), (5., 8.)],
        [(6., 1.), (6., 2.), (6., 3.), (6., 4.)],
        [(6., 5.), (6., 6.), (6., 7.), (6., 8.)]],

       [[(7., 1.), (7., 2.), (7., 3.), (7., 4.)],
        [(7., 5.), (7., 6.), (7., 7.), (7., 8.)],
        [(8., 1.), (8., 2.), (8., 3.), (8., 4.)],
        [(8., 5.), (8., 6.), (8., 7.), (8., 8.)]]],
      dtype=[('f0', '<f2'), ('f1', '<f2')])

The correct result should be

array([[[(1., 1.), (1., 2.), (1., 3.), (1., 4.)],
        [(2., 1.), (2., 2.), (2., 3.), (2., 4.)],
        [(3., 1.), (3., 2.), (3., 3.), (3., 4.)],
        [(4., 1.), (4., 2.), (4., 3.), (4., 4.)]],

       [[(1., 5.), (1., 6.), (1., 7.), (1., 8.)],
        [(2., 5.), (2., 6.), (2., 7.), (2., 8.)],
        [(3., 5.), (3., 6.), (3., 7.), (3., 8.)],
        [(4., 5.), (4., 6.), (4., 7.), (4., 8.)]],

       [[(5., 1.), (5., 2.), (5., 3.), (5., 4.)],
        [(6., 1.), (6., 2.), (6., 3.), (6., 4.)],
        [(7., 1.), (7., 2.), (7., 3.), (7., 4.)],
        [(8., 1.), (8., 2.), (8., 3.), (8., 4.)]],

       [[(5., 5.), (5., 6.), (5., 7.), (5., 8.)],
        [(6., 5.), (6., 6.), (6., 7.), (6., 8.)],
        [(7., 5.), (7., 6.), (7., 7.), (7., 8.)],
        [(8., 5.), (8., 6.), (8., 7.), (8., 8.)]]],
      dtype=[('f0', '<f2'), ('f1', '<f2')])

Romain Simon · Accepted Answer · 2022-08-03 20:18:08Z

The following is intended to work in the general case. It handles 2D arrays only, and only when the length of your index arrays is a multiple of the product of the two block dimensions (otherwise the indices cannot form whole blocks).

def block(arr, ind1, ind2, block_shape):
    """
    :param arr: 2D numpy array.
    :param ind1: 1D numpy array of row indices.
    :param ind2: 1D numpy array of column indices.
    :param block_shape: tuple of length two giving the block shape.
    """
    block_shape0, block_shape1 = block_shape
    step = block_shape0 * block_shape1
   
    # This condition has to be verified to have entire blocks
    if len(ind1) % step == 0 and len(ind2) % step == 0:
        len_array = len(ind1) // block_shape1
        new_shape = (len_array, block_shape1)
        a = arr[ind1, ind2].reshape(new_shape)
   
        # Here the swap is necessary to keep the rows of each block together
        no_swap = [(i, i+1) for i in range(1, len_array, step)]
        swap = [(i+1, i) for i in range(1, len_array, step)]
        a[no_swap, :] = a[swap, :]
        a = a.reshape((len_array//block_shape0, block_shape0, block_shape1))

    else:
        a = []

    return a

With your example inputs:

>>> block(arr, ind1, ind2, (2, 2))
[[[3 4]
  [1 3]]

 [[2 4]
  [4 0]]

 [[2 4]
  [0 4]]

 [[2 1]
  [2 4]]]
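For block shapes other than (2, 2), one alternative is to avoid row swapping altogether and let reshape and swapaxes do the regrouping. The following is only a sketch, not the function above: the name block_grid is hypothetical, and it assumes that the index pairs enumerate a full rows-by-columns grid in row-major order (as in both examples in the question).

```python
import numpy as np

def block_grid(arr, ind1, ind2, block_shape):
    """Hypothetical helper: split arr[ind1, ind2] into blocks of block_shape.

    Assumes (ind1, ind2) lists every (row, column) pair of a grid in
    row-major order; this is an assumption, it is not verified in full.
    """
    m, n = block_shape
    n_cols = len(np.unique(ind2))   # number of selected columns
    n_rows = len(ind1) // n_cols    # number of selected rows
    if n_rows * n_cols != len(ind1) or n_rows % m or n_cols % n:
        raise ValueError("indices do not tile into whole blocks")

    # Rebuild the selected sub-grid, then cut it into (m, n) tiles.
    grid = arr[ind1, ind2].reshape(n_rows, n_cols)
    return (grid.reshape(n_rows // m, m, n_cols // n, n)
                .swapaxes(1, 2)     # bring the two block axes next to each other
                .reshape(-1, m, n))

# Usage on an 8x8 grid with (4, 4) blocks (illustrative data, not from the question).
arr = np.arange(64).reshape(8, 8)
ind1, ind2 = np.indices(arr.shape).reshape(2, -1)
blocks = block_grid(arr, ind1, ind2, (4, 4))
print(blocks[1])  # rows 0-3, columns 4-7 of arr
```

The blocks come out in row-major block order (top-left, top-right, bottom-left, bottom-right), which is the order requested in the question.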

To keep track of the indices, you can also use the block function by passing a different arr:

>>> import numpy as np
>>> row_arr, col_arr = np.mgrid[0:arr.shape[0], 0:arr.shape[1]]
>>> row_arr
[[0 0 0 0 0 0 0 0 0]
 [1 1 1 1 1 1 1 1 1]
 [2 2 2 2 2 2 2 2 2]
 [3 3 3 3 3 3 3 3 3]
 [4 4 4 4 4 4 4 4 4]
 [5 5 5 5 5 5 5 5 5]
 [6 6 6 6 6 6 6 6 6]
 [7 7 7 7 7 7 7 7 7]]

>>> col_arr
[[0 1 2 3 4 5 6 7 8]
 [0 1 2 3 4 5 6 7 8]
 [0 1 2 3 4 5 6 7 8]
 [0 1 2 3 4 5 6 7 8]
 [0 1 2 3 4 5 6 7 8]
 [0 1 2 3 4 5 6 7 8]
 [0 1 2 3 4 5 6 7 8]
 [0 1 2 3 4 5 6 7 8]]

row_arr and col_arr have the same shape as arr and hold the row indices and column indices, respectively.

To keep track of the indices, do the following:

>>> row_ind = block(row_arr, ind1, ind2, (2, 2))
>>> row_ind
[[[0 0]
  [1 1]]

 [[0 0]
  [1 1]]

 [[6 6]
  [7 7]]

 [[6 6]
  [7 7]]]

And you can do the same for col_arr to get col_ind!
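Since row_arr[ind1, ind2] is just ind1 again (and likewise for the columns), the index blocks can also be built from ind1 and ind2 directly, without the mgrid arrays. Here is a rough sketch of that idea; the helper name to_blocks is hypothetical, and it assumes the index pairs cover a full grid listed in row-major order:

```python
import numpy as np

def to_blocks(flat, n_cols, block_shape):
    """Hypothetical helper: regroup a row-major flattened grid into blocks."""
    m, n = block_shape
    return (np.asarray(flat)
              .reshape(-1, m, n_cols // n, n)  # (block row, m, block col, n)
              .swapaxes(1, 2)                  # (block row, block col, m, n)
              .reshape(-1, m, n))

ind1 = np.array([0, 0, 0, 0, 1, 1, 1, 1, 6, 6, 6, 6, 7, 7, 7, 7])
ind2 = np.array([0, 1, 7, 8, 0, 1, 7, 8, 0, 1, 7, 8, 0, 1, 7, 8])
n_cols = len(np.unique(ind2))  # 4 selected columns

row_ind = to_blocks(ind1, n_cols, (2, 2))
col_ind = to_blocks(ind2, n_cols, (2, 2))
print(row_ind[2].tolist(), col_ind[2].tolist())  # [[6, 6], [7, 7]] [[0, 1], [0, 1]]
```

Indexing the original array as arr[row_ind, col_ind] then returns the value blocks in the same order, so the values and their original positions stay aligned.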

edited Aug 3, 2022 at 20:18

answered Aug 3, 2022 at 15:16

Romain Simon

3 Comments

Shew Over a year ago

How can I track the indices of the new blocks (the original indices)? Can you check that too, please?

Romain Simon Over a year ago

I edited my post with the new answer for the indices.

Shew Over a year ago

The code works only for (2,2), not in the general case. I added an example. Can you check it, please?
