Numpy array creation with patterns

Question

Is it possible, in a fast way, to create a (large) 2D numpy array which

1. contains a value n times per row (randomly placed), e.g., for n = 3
```
1 0 1 0 1
0 0 1 1 1
1 1 1 0 0
...
```
2. same as 1., but places groups of that size n randomly per row, e.g.
```
1 1 1 0 0
0 0 1 1 1
1 1 1 0 0
...
```

Of course, I could enumerate all rows, but I am wondering if there's a way to create the array using np.fromfunction or some faster way?

Do you want a specific probability distribution for a row to have 1, 2 or 3 ones?
– Eric O. Lebigot, Commented Feb 7, 2014 at 20:24
This question appears to be off-topic because it shows no attempt to solve the problem.
– user663031, Commented Feb 8, 2014 at 5:03
@EOL: within a row, there is no requirement for a probability distribution.
– HTTPeter, Commented Feb 8, 2014 at 9:15

Eelco Hoogendoorn · Accepted Answer · 2014-02-08 14:40:32Z


The answer to your first question has a simple one-line solution, which I imagine is pretty efficient. Functions like np.random.shuffle or np.random.permutation must be doing something similar under the hood, but they require a python loop over the rows, which might become a problem if you have very many short rows.

The second question also has a pure numpy solution which should be quite efficient, although it is a little less elegant.

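import numpy as np

rows = 20
cols = 10
n = 3

# fixed number of ones per row in random places:
# argsort of random noise yields a random permutation of 0..cols-1 in each row,
# so exactly n entries per row are smaller than n
print((np.argsort(np.random.rand(rows, cols)) < n).view(np.uint8))

# fixed number of ones per row in a random contiguous place
data = np.zeros((rows, cols), np.uint8)
I = np.arange(rows*n) // n  # row index, repeated n times (integer division)
J = (np.random.randint(0, cols-n+1, (rows, 1)) + np.arange(n)).flatten()
data[I, J] = 1
print(data)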
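
As a rough check of the two snippets above, the following sketch wraps them in functions and asserts the row sums. The function names `random_ones` and `random_runs` are made up for illustration, the seed is arbitrary, and `np.random.default_rng` assumes NumPy 1.17 or newer. The contiguous version swaps the index arithmetic for a broadcast comparison, which is a different technique that should give the same kind of result.

```python
import numpy as np

def random_ones(rows, cols, n, rng):
    """Exactly n ones per row, at random positions (hypothetical helper)."""
    # Each row of ranks is a random permutation of 0..cols-1,
    # so exactly n of its entries are smaller than n.
    ranks = np.argsort(rng.random((rows, cols)), axis=1)
    return (ranks < n).astype(np.uint8)

def random_runs(rows, cols, n, rng):
    """One contiguous run of n ones per row (hypothetical helper)."""
    start = rng.integers(0, cols - n + 1, size=(rows, 1))  # first column of each run
    col = np.arange(cols)                                   # broadcasts against start
    return ((col >= start) & (col < start + n)).astype(np.uint8)

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
a = random_ones(20, 10, 3, rng)
b = random_runs(20, 10, 3, rng)

assert (a.sum(axis=1) == 3).all()
assert (b.sum(axis=1) == 3).all()
print(a)
print(b)
```

The broadcast comparison builds a full boolean (rows, cols) array, so it touches every element once; for very wide rows with small n the index-based assignment may be cheaper.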

Edit: here is a slightly longer, but more elegant and more performant solution to your second question:

import numpy as np

rows = 20
cols = 10
n = 3

def running_view(arr, window, axis=-1):
    """
    return a running view of length 'window' over 'axis'
    the returned array has an extra last dimension, which spans the window
    """
    shape = list(arr.shape)
    shape[axis] -= (window-1)
    assert(shape[axis]>0)
    return np.lib.stride_tricks.as_strided(
        arr,
        shape + [window],
        arr.strides + (arr.strides[axis],))


#fixed number of ones per row in random contiguous place
data = np.zeros((rows, cols), np.uint8)

I = np.arange(rows)
J = np.random.randint(0,cols-n+1, rows)

running_view(data, n)[I,J,:] = 1
print(data)
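
On NumPy 1.20 or newer, the library ships `np.lib.stride_tricks.sliding_window_view`, which can stand in for the hand-written `running_view` helper. A minimal sketch, assuming that version is available; the view has to be requested with `writeable=True` because it is read-only by default, and the seed is arbitrary:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rows, cols, n = 20, 10, 3
rng = np.random.default_rng(0)  # arbitrary seed

data = np.zeros((rows, cols), np.uint8)

# windows[i, j] is a view of data[i, j:j+n]; shape is (rows, cols-n+1, n)
windows = sliding_window_view(data, n, axis=1, writeable=True)

I = np.arange(rows)
J = rng.integers(0, cols - n + 1, size=rows)  # start column of each run
windows[I, J, :] = 1                           # writes through to data
print(data)
```

Because the windows overlap in memory, writing through such a view is only safe for simple assignments like this one, where every write stores the same constant.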

edited Feb 8, 2014 at 14:40

answered Feb 8, 2014 at 10:19

Eelco Hoogendoorn


2 Comments

HTTPeter Over a year ago

For the fixed number of ones per row solution, I also get [1, 2, 1, 0, 0]. Could the trick then be to use >0 masking?

Eelco Hoogendoorn Over a year ago

Good point; the only reason I used division is to obtain the desired int result in a single pass over the array, but indeed that solution only works if n < cols/2. We can get the same result in a single pass with a comparison and a view; I'll edit the code.

CCP · Accepted Answer · 2014-02-07 19:54:00Z

First of all you need to import some functions of numpy:

from numpy.random import rand, randint
from numpy import array, argsort

Case 1:

a = rand(10,5)
n = 3  # number of 1's per row
b = []
for i in range(len(a)):
    # argsort(a[i]) is a random permutation of 0..4; mark its n largest values
    b.append((argsort(a[i]) >= (len(a[i]) - n)) * 1)
b = array(b)

Result:

print(repr(b))
array([[ 1,  0,  0,  1,  1],
       [ 1,  0,  0,  1,  1],
       [ 0,  1,  0,  1,  1],
       [ 1,  0,  1,  0,  1],
       [ 1,  0,  0,  1,  1],
       [ 1,  1,  0,  0,  1],
       [ 0,  1,  1,  1,  0],
       [ 0,  1,  1,  0,  1],
       [ 1,  0,  1,  0,  1],
       [ 0,  1,  1,  1,  0]])

Case 2:

a = rand(10,5)
n_max = 3  # maximum number of 1's per row
b = []
for i in range(len(a)):
    n = randint(0, n_max + 1)  # random count between 0 and n_max, inclusive
    b.append((argsort(a[i]) >= (len(a[i]) - n)) * 1)
b = array(b)

Result:

print(repr(b))
array([[ 0,  0,  1,  0,  0],
       [ 0,  1,  0,  1,  0],
       [ 1,  0,  1,  0,  1],
       [ 0,  1,  1,  0,  0],
       [ 1,  0,  1,  0,  0],
       [ 1,  0,  0,  1,  1],
       [ 0,  1,  1,  0,  1],
       [ 1,  0,  1,  0,  0],
       [ 1,  1,  0,  1,  0],
       [ 1,  0,  1,  1,  0]])

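I think that could work. To get the result, I generate rows of random floats and use argsort to turn each row into a random permutation of the column indices. Comparing that permutation against a threshold marks n random positions, and multiplying the boolean mask by 1 converts it to ints (boolean*1 -> int).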
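
The per-row loop above can usually be avoided. The sketch below applies the same argsort-and-threshold idea to the whole array at once, with a separate random count per row for Case 2. It assumes NumPy 1.17 or newer for `default_rng`; the variable names and the seed are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
rows, cols, n = 10, 5, 3

# Each row of ranks is a random permutation of 0..cols-1.
ranks = np.argsort(rng.random((rows, cols)), axis=1)

# Case 1: exactly n ones per row (the n largest permutation values).
case1 = (ranks >= cols - n).astype(int)

# Case 2: between 0 and n ones per row; counts has shape (rows, 1)
# and broadcasts across the columns of ranks.
counts = rng.integers(0, n + 1, size=(rows, 1))
case2 = (ranks >= cols - counts).astype(int)

print(case1)
print(case2)
```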

m_power · Accepted Answer · 2014-02-07 20:45:22Z

Just for the fun of it, I tried to find a solution for your first question, even though I'm quite new to Python. Here is what I have so far:

np.vstack([np.hstack(np.random.permutation([np.random.randint(0,2),
 np.random.randint(0,2), np.random.randint(0,2), 0, 0, 0])),
   np.hstack(np.random.permutation([np.random.randint(0,2),
 np.random.randint(0,2), np.random.randint(0,2), 0, 0, 0])),
   np.hstack(np.random.permutation([np.random.randint(0,2),
 np.random.randint(0,2), np.random.randint(0,2), 0, 0, 0])),
   np.hstack(np.random.permutation([np.random.randint(0,2),
 np.random.randint(0,2), np.random.randint(0,2), 0, 0, 0])),
   np.hstack(np.random.permutation([np.random.randint(0,2),
 np.random.randint(0,2), np.random.randint(0,2), 0, 0, 0])),
   np.hstack(np.random.permutation([np.random.randint(0,2),
 np.random.randint(0,2), np.random.randint(0,2), 0, 0, 0]))])
array([[1, 0, 0, 0, 0, 0],
       [0, 1, 0, 1, 0, 0],
       [0, 1, 0, 1, 0, 1],
       [0, 1, 0, 1, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 1]])

It is not the final answer, but maybe it can help you find an alternate solution using random numbers and permutation.
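
If the goal is exactly n ones per row, one way to follow the permutation idea without writing out each row by hand is to shuffle a template array row by row. This is a sketch assuming NumPy 1.20 or newer, where `Generator.permuted` is available; the sizes and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
rows, cols, n = 6, 6, 3

template = np.zeros((rows, cols), dtype=np.uint8)
template[:, :n] = 1  # n ones at the start of every row

# permuted shuffles each row independently along axis=1 and returns a copy
result = rng.permuted(template, axis=1)
print(result)
```

Unlike `np.random.permutation`, which on a 2D array only reorders whole rows, `permuted` with `axis=1` reorders the entries within each row separately.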
