
I have compared the timing of the built-in np.gradient function against a hand-coded function that computes the gradient using the same central difference scheme. Profiling both with the cProfile module, I found that np.gradient is significantly faster. This suggests that my Python implementation of the gradient computation is extremely naive. I would like to understand how np.gradient is implemented, so that I can learn how to properly implement math operations in Python. (I am aware that the hand-coded function does not compute exactly the same result as np.gradient, since the boundary values are not included, but this should not affect the comparison.)

Here is the Python code I used for the test:

import numpy as np
import cProfile

# Create a test array
N = 1000
T = np.empty((N, N))

# Function to test the built-in gradient function of numpy
def test_grad_np(T, n):
    for s in range(n):
        dTdx, dTdy = np.gradient(T)

# Hand-coded gradient function (central differences; boundary rows/columns left at zero)
def hc_grad(T):
    Ny, Nx = np.shape(T)
    dTdx = np.zeros((Ny, Nx))
    dTdy = np.zeros((Ny, Nx))
    for j in range(1, Ny-1):        # interior rows
        for i in range(1, Nx-1):    # interior columns
            dTdx[j,i] = (T[j,i+1] - T[j,i-1])/2.
            dTdy[j,i] = (T[j+1,i] - T[j-1,i])/2.
    return dTdx, dTdy

# Function to test the hand-coded gradient function
def test_hc_grad(T, n):
    for s in range(n):
        dTdx, dTdy = hc_grad(T)

cProfile.run('test_hc_grad(T, 20)')
cProfile.run('test_grad_np(T, 20)')

and this is the output on my machine:

        144 function calls in 16.818 seconds

  Ordered by: standard name

  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      20    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(shape)
       1    0.001    0.001   16.818   16.818 <string>:1(<module>)
      20    0.000    0.000    0.000    0.000 fromnumeric.py:1922(_shape_dispatcher)
      20    0.000    0.000    0.000    0.000 fromnumeric.py:1926(shape)
      20   16.803    0.840   16.804    0.840 test_speed.py:20(hc_grad)
       1    0.013    0.013   16.817   16.817 test_speed.py:30(test_hc_grad)
       1    0.000    0.000   16.818   16.818 {built-in method builtins.exec}
      20    0.000    0.000    0.000    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
      40    0.001    0.000    0.001    0.000 {built-in method numpy.zeros}
       1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


        704 function calls (624 primitive calls) in 0.262 seconds

  Ordered by: standard name

  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      40    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(empty_like)
      20    0.000    0.000    0.248    0.012 <__array_function__ internals>:2(gradient)
      40    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(ndim)
       1    0.001    0.001    0.262    0.262 <string>:1(<module>)
      20    0.000    0.000    0.000    0.000 _asarray.py:110(asanyarray)
      40    0.000    0.000    0.000    0.000 _asarray.py:23(asarray)
      40    0.000    0.000    0.000    0.000 fromnumeric.py:3102(_ndim_dispatcher)
      40    0.000    0.000    0.001    0.000 fromnumeric.py:3106(ndim)
      40    0.000    0.000    0.000    0.000 function_base.py:798(_gradient_dispatcher)
      20    0.245    0.012    0.247    0.012 function_base.py:803(gradient)
      40    0.000    0.000    0.000    0.000 multiarray.py:75(empty_like)
      40    0.000    0.000    0.000    0.000 numerictypes.py:285(issubclass_)
      20    0.000    0.000    0.000    0.000 numerictypes.py:359(issubdtype)
       1    0.014    0.014    0.261    0.261 test_speed.py:15(test_grad_np)
       1    0.000    0.000    0.262    0.262 {built-in method builtins.exec}
      60    0.000    0.000    0.000    0.000 {built-in method builtins.issubclass}
      40    0.000    0.000    0.000    0.000 {built-in method builtins.len}
      60    0.000    0.000    0.000    0.000 {built-in method numpy.array}
  100/20    0.001    0.000    0.248    0.012 {built-in method numpy.core._multiarray_umath.implement_array_function}
      40    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
       1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
3 Comments
  • Just a quick remark: The whole point of using numpy is to not loop in Python to avoid the interpreter and instead use vectorization. Commented Jan 28, 2023 at 19:35
  • At its simplest, gradient is np.diff, that is, taking successive differences. For a 1-d array that is just arr[1:] - arr[:-1]. There are more details when working with more dimensions, higher-order differences, and edges; see the sketch after these comments. Commented Jan 28, 2023 at 19:57
  • @Homer512 I have read about vectorization and I think I now understand that the key point is to avoid looping so that NumPy can use its low-level implementation. Commented Jan 28, 2023 at 21:21
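
For instance, the 1-d building blocks described in the second comment look like this (a quick sketch of the idea, not NumPy's internals):

import numpy as np

arr = np.arange(5.0) ** 2

# Successive differences: length N-1, equivalent to np.diff(arr)
diffs = arr[1:] - arr[:-1]

# Central differences on the interior: length N-2, as np.gradient uses away from the edges
central = (arr[2:] - arr[:-2]) / 2.0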

1 Answer


If you're really curious, why not have a look at the np.gradient source code?
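
The gist is that the result is assembled with slice arithmetic instead of Python loops: central differences on the interior and one-sided differences at the edges. A minimal 1-d sketch of that strategy (assuming unit spacing and the default edge_order=1; the real implementation also handles multiple axes, nonuniform spacing, and second-order edges) would be:

import numpy as np

def gradient_1d(f):
    # Central differences on the interior, one-sided differences at the two edges
    out = np.empty_like(f, dtype=float)
    out[1:-1] = (f[2:] - f[:-2]) / 2.0
    out[0] = f[1] - f[0]
    out[-1] = f[-1] - f[-2]
    return out

x = np.arange(6.0) ** 2
assert np.allclose(gradient_1d(x), np.gradient(x))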

I'm not an expert on np.gradient, but just from playing around with your current implementation, a lot can be improved by using vectorized code. I believe the following should be equivalent:

def hc_grad(T):
    dTdx = np.zeros_like(T)
    dTdy = np.zeros_like(T)

    # Central differences on the interior, computed with array slices instead of loops
    dTdx[1:-1, 1:-1] = (T[1:-1, 2:] - T[1:-1, :-2]) / 2
    dTdy[1:-1, 1:-1] = (T[2:, 1:-1] - T[:-2, 1:-1]) / 2
    return dTdx, dTdy

which already runs a lot faster on my computer.
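
For a quick sanity check you can time both directly (a sketch using timeit rather than cProfile, reusing the vectorized hc_grad above on a random 1000×1000 array):

import timeit
import numpy as np

T = np.random.rand(1000, 1000)

print(timeit.timeit(lambda: hc_grad(T), number=20))       # vectorized version above
print(timeit.timeit(lambda: np.gradient(T), number=20))   # NumPy built-in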


1 Comment

I have tested your implementation and the timing is about the same as for the np.gradient test. I will have a look at the source code linked in the answer.
