Create block diagonal numpy array from a given numpy array

Question

I have a 2-dimensional numpy array with an equal number of columns and rows. I would like to arrange them into a bigger array having the smaller ones on the diagonal. It should be possible to specify how often the starting matrix should be on the diagonal. For example:

a = numpy.array([[5, 7], 
                 [6, 3]])

So if I wanted this array 2 times on the diagonal the desired output would be:

array([[5, 7, 0, 0], 
       [6, 3, 0, 0], 
       [0, 0, 5, 7], 
       [0, 0, 6, 3]])

For 3 times:

array([[5, 7, 0, 0, 0, 0], 
       [6, 3, 0, 0, 0, 0], 
       [0, 0, 5, 7, 0, 0], 
       [0, 0, 6, 3, 0, 0],
       [0, 0, 0, 0, 5, 7],
       [0, 0, 0, 0, 6, 3]])

Is there a fast way to implement this with numpy methods and for arbitrary sizes of the starting array (still considering the starting array to have the same number of rows and columns)?

Divakar · Accepted Answer · 2019-09-19 18:33:13Z

30

Approach #1

Classic case of numpy.kron -

np.kron(np.eye(r,dtype=int),a) # r is number of repeats

Sample run -

In [184]: a
Out[184]: 
array([[1, 2, 3],
       [3, 4, 5]])

In [185]: r = 3 # number of repeats

In [186]: np.kron(np.eye(r,dtype=int),a)
Out[186]: 
array([[1, 2, 3, 0, 0, 0, 0, 0, 0],
       [3, 4, 5, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 2, 3, 0, 0, 0],
       [0, 0, 0, 3, 4, 5, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 1, 2, 3],
       [0, 0, 0, 0, 0, 0, 3, 4, 5]])

Approach #2

Another efficient one with diagonal-viewed-array-assignment -

def repeat_along_diag(a, r):
    m,n = a.shape
    out = np.zeros((r,m,r,n), dtype=a.dtype)
    diag = np.einsum('ijik->ijk',out)
    diag[:] = a
    return out.reshape(-1,n*r)

Sample run -

In [188]: repeat_along_diag(a,3)
Out[188]: 
array([[1, 2, 3, 0, 0, 0, 0, 0, 0],
       [3, 4, 5, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 2, 3, 0, 0, 0],
       [0, 0, 0, 3, 4, 5, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 1, 2, 3],
       [0, 0, 0, 0, 0, 0, 3, 4, 5]])

edited Sep 19, 2019 at 18:33

answered Nov 3, 2015 at 20:23

Divakar

222k19 gold badges273 silver badges374 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Will.Evo Over a year ago

How would you do this if you need to insert x different matrices into the diagonal? I have 80 different matrices, that need to be made into a block diagonal matrix.

Divakar Over a year ago

@Will.Evo All 80 of the same shapes?

Will.Evo Over a year ago

Yes all the same shape

Divakar Over a year ago

@Will.Evo Hmm that would suit better as a new question. I am thinking some masking based method would be needed, different from the simplistic case here.

kuzand Over a year ago

If you have many matrices a, b, c... , then use scipy.linalg.block_diag(a,b,c,...), see also Prokhozhii answer.

|

Prokhozhii · Accepted Answer · 2018-05-28 14:43:45Z

18

import numpy as np
from scipy.linalg import block_diag

a = np.array([[5, 7], 
              [6, 3]])

n = 3

d = block_diag(*([a] * n))

d

array([[5, 7, 0, 0, 0, 0],
       [6, 3, 0, 0, 0, 0],
       [0, 0, 5, 7, 0, 0],
       [0, 0, 6, 3, 0, 0],
       [0, 0, 0, 0, 5, 7],
       [0, 0, 0, 0, 6, 3]], dtype=int32)

But it looks like np.kron solution is a little bit faster for small n.

%timeit np.kron(np.eye(n), a)
12.4 µs ± 95.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit block_diag(*([a] * n))
19.2 µs ± 34.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

However for n = 300, for example, the block_diag is much faster:

%timeit block_diag(*([a] * n))
1.11 ms ± 32.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.kron(np.eye(n), a)
4.87 ms ± 31 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

edited May 28, 2018 at 14:43

answered May 28, 2018 at 12:40

Prokhozhii

7301 gold badge8 silver badges14 bronze badges

1 Comment

Julek Over a year ago

This seems the most viable solution, but I checked the code for block_diag and all it is doing is just filling up the bigger matrix in a loop.

norok2 · Accepted Answer · 2019-09-19 20:26:19Z

For the specialized case of matrices, a simple slicing is WAY faster then numpy.kron() (the slowest) and mostly on par with numpy.einsum()-based approach (from @Divakar answer). Compared to scipy.linalg.block_diag(), it performs better for smaller arr, somewhat independently of number of block repetitions.

Note that the performances of block_diag_view() on smaller inputs can be easily further improved with Numba, but one would get worse performances for larger inputs.

With Numba, full explicit looping and parallelization (block_diag_loop_jit()) one would get again similar results as block_diag_einsum() if the number of repetitions is small.

Overall, the most performing solutions are block_diag_einsum() and block_diag_view().

import numpy as np
import scipy as sp
import numba as nb

import scipy.linalg


NUM = 4
M = 9


def block_diag_kron(arr, num=NUM):
    return np.kron(np.eye(num), arr)


def block_diag_einsum(arr, num=NUM):
    rows, cols = arr.shape
    result = np.zeros((num, rows, num, cols), dtype=arr.dtype)
    diag = np.einsum('ijik->ijk', result)
    diag[:] = arr
    return result.reshape(rows * num, cols * num)


def block_diag_scipy(arr, num=NUM):
    return sp.linalg.block_diag(*([arr] * num))


def block_diag_view(arr, num=NUM):
    rows, cols = arr.shape
    result = np.zeros((num * rows, num * cols), dtype=arr.dtype)
    for k in range(num):
        result[k * rows:(k + 1) * rows, k * cols:(k + 1) * cols] = arr
    return result


@nb.jit
def block_diag_view_jit(arr, num=NUM):
    rows, cols = arr.shape
    result = np.zeros((num * rows, num * cols), dtype=arr.dtype)
    for k in range(num):
        result[k * rows:(k + 1) * rows, k * cols:(k + 1) * cols] = arr
    return result


@nb.jit(parallel=True)
def block_diag_loop_jit(arr, num=NUM):
    rows, cols = arr.shape
    result = np.zeros((num * rows, num * cols), dtype=arr.dtype)
    for k in nb.prange(num):
        for i in nb.prange(rows):
            for j in nb.prange(cols):
                result[i + (rows * k), j + (cols * k)] = arr[i, j]
    return result

Benchmarks for NUM = 4:

Benchmarks for NUM = 400:

Plots were produced from this template using the following additional code:

def gen_input(n):
    return np.random.randint(1, M, (n, n))


def equal_output(a, b):
    return np.all(a == b)


funcs = block_diag_kron, block_diag_scipy, block_diag_view, block_diag_jit


input_sizes = tuple(int(2 ** (2 + (3 * i) / 4)) for i in range(13))
print('Input Sizes:\n', input_sizes, '\n')


runtimes, input_sizes, labels, results = benchmark(
    funcs, gen_input=gen_input, equal_output=equal_output,
    input_sizes=input_sizes)


plot_benchmarks(runtimes, input_sizes, labels, units='ms')

(EDITED to include np.einsum()-based approach and another Numba version with explicit looping.)

Collectives™ on Stack Overflow

Create block diagonal numpy array from a given numpy array

3 Answers 3

6 Comments

1 Comment

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

6 Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related