Here are a few variants that use mgrid, use ogrid, or manually create the same ranges that ogrid generates.
Observations:
- for an array of size 1000, the fastest method is more than three times faster than mgrid
- with ogrid or the manual ranges, it is a bit better to add the two ranges separately, thereby avoiding a full-size temporary
- conveniences such as mgrid or ogrid tend to come at a cost in numpy, and indeed the manual method is more than twice as fast as ogrid
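To make the "full-size temporary" point concrete, here is a minimal sketch that only inspects shapes. It assumes the same 20 by 50 layout as the benchmark; the variable names (I_full, I_open, tmp, rows, cols) are made up for illustration.

```python
import numpy as np

m, n = 20, 50  # same shape as the benchmark array

# mgrid returns dense index arrays: each one is already full size.
I_full, J_full = np.mgrid[:m*1000:1000, :n*500:500]
print(I_full.shape, J_full.shape)

# ogrid returns "open" arrays: a column and a row that broadcast together.
I_open, J_open = np.ogrid[:m*1000:1000, :n*500:500]
print(I_open.shape, J_open.shape)

# Adding the two open arrays first materialises a full (m, n) temporary,
# which is what the one-step ogrid variant pays for.
tmp = I_open + J_open
print(tmp.shape)

# The manual ranges are the same values that ogrid produces.
rows = np.arange(0, 1000*m, 1000)[:, None]
cols = np.arange(0, 500*n, 500)
assert np.array_equal(I_open, rows)
assert np.array_equal(J_open, cols[None, :])
```

Adding `rows` and `cols` to the target in two separate in-place steps lets broadcasting do the work without ever building `tmp`.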
Code:
import numpy as np
from timeit import timeit

A = np.arange(1000).reshape(20, 50)

def f():
    # mgrid: builds two full-size (m, n) index arrays
    B = A.copy()
    m, n = B.shape
    I, J = np.mgrid[:m*1000:1000, :n*500:500]
    B += I+J
    return B

def g():
    # ogrid, one step: I+J broadcasts to a full-size temporary
    B = A.copy()
    m, n = B.shape
    I, J = np.ogrid[:m*1000:1000, :n*500:500]
    B += I+J
    return B

def h():
    # ogrid, two steps: add each range in place, no full-size temporary
    B = A.copy()
    m, n = B.shape
    I, J = np.ogrid[:m*1000:1000, :n*500:500]
    B += I
    B += J
    return B

def i():
    # manual ranges; the row offsets are added through a transposed view of B
    B = A.copy()
    m, n = B.shape
    BT = B.T
    BT += np.arange(0, 1000*m, 1000)
    B += np.arange(0, 500*n, 500)
    return B

def j():
    # manual ranges; the row offsets get an extra axis via [:, None]
    B = A.copy()
    m, n = B.shape
    B += np.arange(0, 1000*m, 1000)[:, None]
    B += np.arange(0, 500*n, 500)
    return B

# all variants must produce the same result
assert np.all(f()==h())
assert np.all(g()==h())
assert np.all(i()==h())
assert np.all(j()==h())

print(timeit(f, number=10000))
print(timeit(g, number=10000))
print(timeit(h, number=10000))
print(timeit(i, number=10000))
print(timeit(j, number=10000))
Sample run:
0.289166528998976 # mgrid
0.25259370900130307 # ogrid 1 step
0.24528862700026366 # ogrid 2 steps
0.09056068700010655 # manual transpose
0.08238107499892067 # manual add dim
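
As a quick check of the observations against the sample run, the timings can be turned into ratios relative to the fastest variant. This is only a sketch: the numbers are copied from the run above, and the dictionary keys are labels chosen here for illustration.

```python
# Timings copied from the sample run above (seconds for 10000 calls).
timings = {
    "mgrid": 0.289166528998976,
    "ogrid 1 step": 0.25259370900130307,
    "ogrid 2 steps": 0.24528862700026366,
    "manual transpose": 0.09056068700010655,
    "manual add dim": 0.08238107499892067,
}

fastest = min(timings.values())

# Ratio of each variant to the fastest one (1.0 means "the fastest").
ratios = {name: t / fastest for name, t in timings.items()}

for name, ratio in ratios.items():
    print(f"{name:>16}: {ratio:.2f}x")
```

Absolute numbers will differ between machines and numpy versions, so the ratios are the more useful thing to compare.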