I am trying to create a rather big array in Python, filled with zeros and ones. In the end it should have around 1.2 billion entries. I fill it as in the example below. The idea behind it is that 400 entries make up one time slot, and for each time slot there is a probability p that it is one. If that is the case, the array is filled with ones for slot_duration time slots; otherwise it is filled with 400 entries (one time slot) of zeros.
import numpy as np

p = 0.01
slot_duration = 10
test_duration = 60
timeslots_left = test_duration * 1000 * 1000 // 20

transmission_array = []
while timeslots_left >= 0:
    # Bernoulli draw: one with probability p, zero otherwise.
    rand_num = np.random.choice((0, 1), p=[1 - p, p])
    if rand_num == 1:
        # A transmission occupies slot_duration time slots of 400 entries each.
        for i in range(0, slot_duration):
            for j in range(0, 400):
                transmission_array.append(1)
        timeslots_left -= slot_duration
    else:
        # An idle time slot: 400 zeros.
        for j in range(0, 400):
            transmission_array.append(0)
        timeslots_left -= 1
The performance is, of course, horrible: for a test_duration of 10 it takes around 45 seconds to generate the array, and another 45 seconds just to iterate over it.
My question is whether there is a more performant way to do this. Would it be better to initialise a fixed-length array of zeros and then reassign values to one (a rough sketch of that idea follows)? Or would that not help if iterating over it takes the same time?
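Here is a sketch of that pre-allocation idea (untested, reusing the parameters from above and assuming int8 entries are acceptable for zeros and ones):

import numpy as np

p = 0.01
slot_duration = 10
test_duration = 60
total_timeslots = test_duration * 1000 * 1000 // 20

# Allocate the whole array of zeros up front; int8 keeps it at
# roughly 1.2 GB instead of roughly 9.6 GB with the default int64.
transmission_array = np.zeros(total_timeslots * 400, dtype=np.int8)

slot = 0
while slot < total_timeslots:
    if np.random.random() < p:
        # Write a whole burst of ones with one slice assignment
        # (NumPy clips the slice at the end of the array).
        start = slot * 400
        transmission_array[start:start + slot_duration * 400] = 1
        slot += slot_duration
    else:
        slot += 1

That would remove the per-entry appends and the per-slot call to np.random.choice, though it still loops over three million slots in Python.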
I'm open to any suggestions.
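For example, I imagine the Python loop could be removed entirely by drawing all the Bernoulli decisions up front and expanding them with np.repeat. Again just a sketch, and like my loop it may overshoot the target length by a few slots:

import numpy as np

p = 0.01
slot_duration = 10
test_duration = 60
total_timeslots = test_duration * 1000 * 1000 // 20

# Every decision consumes at least one timeslot, so total_timeslots
# draws are always enough.
draws = np.random.random(total_timeslots) < p

# A one-decision covers slot_duration timeslots, a zero-decision one.
slots_per_draw = np.where(draws, slot_duration, 1)

# Keep only as many draws as are needed to reach total_timeslots.
consumed = np.cumsum(slots_per_draw)
n_draws = np.searchsorted(consumed, total_timeslots) + 1
draws = draws[:n_draws]
slots_per_draw = slots_per_draw[:n_draws]

# Expand each decision into slots_per_draw * 400 identical entries.
transmission_array = np.repeat(draws.astype(np.int8), slots_per_draw * 400)

If something like this is sound, the iteration afterwards should presumably also be replaced with vectorised operations on the array rather than a Python loop.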
From the comments: timeslots_left - slot_duration should be timeslots_left -= slot_duration, and xrange should be used because it evaluates lazily.