2

I have a mXn numpy array called a: I would like to write a function which returns an array with size (3, mxn) which contains for each couple (x,y) in the first array the correspondant value.

import numpy as np

m=5
n=10
a = np.random.random((m, n))    
x = np.random.random((m, 1))                          # x coordinates
y = np.random.random((1, n))                          # y coordinates


b = np.empty((3, m*n))                # array to store coordinates
k=0
for i in range (0,m):
    for j in range (0,n):
        b[0,k] = a[i,0]
        b[1,k] = a[0,j]
        b[2,k] = a[i,j]
        k=k+1

This seems to run ok, but is there a faster or better coded way to do this?

5
  • 1
    It looks like you're trying to build a meshgrid in your own way. Take a look at this: docs.scipy.org/doc/numpy/reference/generated/… Commented Jun 8, 2017 at 8:13
  • This is useful, but it only returns x,y coordinates, it doesn't return the third coordinate, from what I understand Commented Jun 8, 2017 at 8:33
  • 1
    I think you need to use range (0,m) and range (0,n) to cover all elements. Commented Jun 8, 2017 at 8:34
  • thanks @Divakar, i had just noticed some strange values Commented Jun 8, 2017 at 8:35
  • you did not use x,y at all? why bother put it in your code here? Commented Jun 8, 2017 at 8:48

1 Answer 1

3

Steps :

  • Initialize a 3D array, such that m and n are separate ones. This lets us broadcast values.

  • Index into the first three elements along the first axis of output with appropriate elements off a and make sure that those shapes are broadcastable.

  • Reshape the output back to 2D.

That's all the play is about here! Here's the vectorized implementation -

b_out = np.empty((3, m,n),dtype=a.dtype)  # 1. Initialize 
b_out[0] = a[:,0,None]                    # 2. Assign
b_out[1] = a[0]
b_out[2] = a
b_out.shape = (3,m*n)                     # 3. Reshape back to 2D

Runtime test

Approaches -

def loopy_app(a):
    m,n = a.shape
    b = np.empty((3, m*n),dtype=a.dtype)
    k=0
    for i in range (0,m):
        for j in range (0,n):
            b[0,k] = a[i,0]
            b[1,k] = a[0,j]
            b[2,k] = a[i,j]
            k=k+1
    return b

def vectorized_app(a):
    b_out = np.empty((3, m,n),dtype=a.dtype)  
    b_out[0] = a[:,0,None]
    b_out[1] = a[0]
    b_out[2] = a
    b_out.shape = (3,m*n)
    return b_out

Timings -

In [194]: m=5
     ...: n=10
     ...: a = np.random.random((m, n))
     ...: 

In [195]: %timeit loopy_app(a)
     ...: %timeit vectorized_app(a)
     ...: 
10000 loops, best of 3: 28.2 µs per loop
100000 loops, best of 3: 2.48 µs per loop

In [196]: m=50
     ...: n=100
     ...: a = np.random.random((m, n))
     ...: 

In [197]: %timeit loopy_app(a)
     ...: %timeit vectorized_app(a)
     ...: 
100 loops, best of 3: 2.56 ms per loop
100000 loops, best of 3: 6.31 µs per loop

In [198]: 2560/6.31
Out[198]: 405.7052297939778

400x+ speedup on large datasets and more on larger ones!

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.