So, I have a snippet of a script:

lol = []
latv1 = 0
latv2 = 0
latv3 = 0

#Loop a
for a in range(100):

    #Refresh latv2 after each iteration of loop a
    latv2 = 0

    #Loop b
    for b in range(100):

        #Refresh latv3 after each iteration of loop b
        latv3 = 0

        #Loop c        
        for c in range(100):

            #Make 4 value list according to iteration and append to lol
            midl2 = [latv1,latv2,latv3,0]
            lol.append(midl2)

            #Iterate after loop
            latv3 = latv3 + 1
        latv2 = latv2 + 1
    latv1 = latv1 + 1

This does what I want, but very slowly. It gives:

[[0,0,0,0]
 [0,0,1,0]
 ...
 [0,1,0,0]
 [0,1,1,0]
 ...
 [99,99,98,0]
 [99,99,99,0]]

I've read about numpy and its speed and optimization, but I cannot figure out how to implement the above with numpy. From the manuals, I've learned how to make an array of zeros:

numArr = np.zeros((100,4))

To give:

[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 ..., 
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]

and can change the values of each column with:

numpA  = np.arange(0,100,1)
numpB  = np.arange(0,100,1)
numpC  = np.arange(0,100,1)
numArr[:,0] = numpA
numArr[:,1] = numpB
numArr[:,2] = numpC

giving:

[[   0.    0.    0.    0.]
 [   1.    1.    1.    0.]
 [   2.    2.    2.    0.]
 ..., 
 [ 997.  997.  997.    0.]
 [ 998.  998.  998.    0.]
 [ 999.  999.  999.    0.]]

but I cannot create a numpy array 1000000 lines long whose columns increment the way the original example did. If I create the zero array with 1000000 rows instead of 100, the column substitution does not work, which makes sense because the array and the substituted columns have different lengths, but I am not sure how to build the substitution arrays so that it works.

How can I replicate the original script's output via numpy arrays?

Note: This is a Python 2.7 machine, but it's 64-bit at least. I know RAM use is an issue, but I should be able to change the dtype of the array to fit my needs.

1 Answer

Approach #1

To reproduce the posted code with a NumPy array as output, you could use itertools to generate the rows, like so -

from itertools import product
import numpy as np

N = 100  # matches range(100) in the question
out = np.zeros((N**3,4),dtype=int)
out[:,:3] = list(product(np.arange(N), repeat=3))

Note that N = 100 makes this equivalent to the posted code.
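
As a quick sanity check, the sketch below compares this approach against the nested loops from the question for a small N. The helper names `loops_reference` and `itertools_based` are made up for illustration, and the loops are condensed into a list comprehension that should produce the same rows in the same order.

```python
from itertools import product

import numpy as np


def loops_reference(N):
    # Same rows, in the same order, as the three nested loops in the question.
    return [[a, b, c, 0] for a in range(N) for b in range(N) for c in range(N)]


def itertools_based(N):
    # Fill the first three columns from itertools.product; the fourth stays 0.
    out = np.zeros((N**3, 4), dtype=int)
    out[:, :3] = list(product(np.arange(N), repeat=3))
    return out


N = 3  # small N keeps the comparison cheap
result = itertools_based(N)
reference = np.array(loops_reference(N))

print(result[:4])
print(np.array_equal(result, reference))
```

The same comparison works for N = 100, it just takes longer because the reference list is built in pure Python.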

Approach #2

Another, potentially faster, approach uses only NumPy and its vectorized broadcasting capabilities, like so -

out = np.zeros((N**3,4),dtype=int)
# // is floor division on both Python 2 and 3; plain / only floors ints on Python 2
out[:,:3] = (np.arange(N**3)[:,None]//[N**2,N,1])%N

I would expect this to be faster than the itertools-based approach, because that one first creates a list of tuples that then has to be set into a NumPy array. We will test this theory in the next section.
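
One way to read the broadcasting expression is that each row number is treated as a three-digit number in base N, and the three columns are its digits. The sketch below splits the expression into steps for a small N; the intermediate names `idx`, `divisors` and `digits` are only for illustration, and `//` is used so the division floors on Python 3 as well.

```python
import numpy as np

N = 3

# Row numbers 0 .. N**3 - 1 as a column vector, shape (N**3, 1).
idx = np.arange(N**3)[:, None]

# Place values of a three-digit base-N number, shape (3,).
divisors = np.array([N**2, N, 1])

# Broadcasting (N**3, 1) against (3,) gives shape (N**3, 3):
# floor-divide by each place value, then keep the remainder modulo N.
digits = (idx // divisors) % N

print(digits.shape)
print(digits[:5])
```

For example, row 5 with N = 3 is 0*9 + 1*3 + 2, so its digits are 0, 1, 2.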


Runtime test

In [111]: def itertools_based(N):
     ...:     out = np.zeros((N**3,4),dtype=int)
     ...:     out[:,:3] = list(product(np.arange(N), repeat=3))
     ...:     return out
     ...: 
     ...: def broadcasting_based(N):
     ...:     out = np.zeros((N**3,4),dtype=int)
     ...:     out[:,:3] = (np.arange(N**3)[:,None]/[N**2,N,1])%N
     ...:     return out


In [112]: N = 20

In [113]: np.allclose(itertools_based(N),broadcasting_based(N)) # Verify results
Out[113]: True

In [114]: %timeit itertools_based(N)
100 loops, best of 3: 7.42 ms per loop

In [115]: %timeit broadcasting_based(N)
1000 loops, best of 3: 1.23 ms per loop

Now, let's time just the creation of the list of tuples and compare it against the NumPy-based computation -

In [116]: %timeit list(product(np.arange(N), repeat=3))
1000 loops, best of 3: 746 µs per loop

In [117]: %timeit (np.arange(N**3)[:,None]/[N**2,N,1])%N
1000 loops, best of 3: 1.09 ms per loop

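So the creation step on its own is actually faster with itertools. The extra time in itertools_based comes from setting that list of tuples into the NumPy array, as suspected earlier. If you are happy with just the first three columns as a list of tuples, go with itertools. If you need a NumPy array as output, the broadcasting-based approach is the faster of the two.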
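
Since the question mentions that RAM use is a concern, here is a minimal sketch of the broadcasting approach with a narrower dtype. The function name `broadcasting_based_small` and the choice of `np.uint8` are assumptions for illustration; `uint8` only works while N - 1 fits in one byte, that is for N up to 256.

```python
import numpy as np


def broadcasting_based_small(N, dtype=np.uint8):
    # Hypothetical variant of broadcasting_based with a configurable dtype.
    # The values written are 0 .. N - 1, so the dtype must be able to hold N - 1.
    out = np.zeros((N**3, 4), dtype=dtype)
    out[:, :3] = (np.arange(N**3)[:, None] // [N**2, N, 1]) % N
    return out


N = 100
small = broadcasting_based_small(N)

print(small.dtype)
print(small.nbytes)  # one byte per element with uint8
print(small[-1])
```

The intermediate array on the right-hand side is still computed at NumPy's default integer width before being cast, so peak memory during construction is higher than the size of the final array.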
