
I am trying to create a rather big array in Python, filled with zeros and ones. In the end it should have around 1.2 billion entries. I fill it as in the example below. The idea is that 400 entries make up one time slot, and for each time slot there is a probability p that it is one. If that is the case, it is filled with ones for slot_duration time slots; otherwise it is filled with 400 zeros, i.e. one time slot.

import numpy as np

p = 0.01                # probability that a time slot is a "one" slot
slot_duration = 10      # a "one" lasts for slot_duration time slots
test_duration = 60
timeslots_left = test_duration * 1000 * 1000 / 20
transmission_array = []
while timeslots_left >= 0:
    # draw 1 with probability p, otherwise 0
    rand_num = np.random.choice((0, 1), p=[1 - p, p])
    if rand_num == 1:
        # fill slot_duration time slots (400 entries each) with ones
        for i in range(0, slot_duration):
            for j in range(0, 400):
                transmission_array.append(1)
        timeslots_left -= slot_duration
    else:
        # fill a single time slot with 400 zeros
        for j in range(0, 400):
            transmission_array.append(0)
        timeslots_left -= 1

The performance is of course horrible. For a test_duration of 10 it takes around 45 seconds to generate the array, and another 45 seconds just to iterate over it.

My question is whether there is a more performant way to do this. Would it be better to initialise a fixed-length array filled with zeros and then reassign values to one? Or would that not help, since iterating over it takes the same time?

I'm open to any suggestions.

  • Based on my experience with APL, another interpreted language, in situations like this it's better to create one large array and minimize the looping required to set the ones. Does your system have enough memory to handle such large arrays? What I don't know is which functions or operators Python has to help with your program. Commented Jan 31, 2017 at 9:05
  • I think that you have an error in the code: timeslots_left - slot_duration should be timeslots_left -= slot_duration. Commented Jan 31, 2017 at 9:19
  • Also, as a general piece of advice: whenever you want to use less memory while generating sequences in for loops, it is better to use xrange because it evaluates lazily. Commented Jan 31, 2017 at 9:36
  • Once you have filled the array, what are you going to do with the data? Depending on the use case you might just need to store every index where the value flips. Commented Jan 31, 2017 at 9:43
  • @saloua Thanks! I didn't see that. Maybe xrange might help. I will check it. Commented Jan 31, 2017 at 9:43

2 Answers


If you have enough memory, you could replace this loop:

    for i in range(0, slot_duration):
        for j in range(0, 400):
            transmission_array.append(1)

by

transmission_array.extend([1]*400*slot_duration)

That is a single C-compiled instruction that extends the list in one go, without all the intermediate resizing. You avoid the double loop and perform far fewer resizes/memory copies under the hood.

And if slot_duration is constant, you could declare:

chunk = [1]*400*slot_duration

at startup, so you can do transmission_array.extend(chunk) and avoid allocating the chunk on each iteration.

Same problem, same fix here:

    for j in range(0, 400):
        transmission_array.append(0)

becomes

    transmission_array.extend(zero_array)

with zero_array = [0]*400
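
Putting both fixes together, a minimal sketch of the rewritten generation loop could look like this (it reuses the names chunk and zero_array from above and keeps the question's random draw unchanged; the // just keeps timeslots_left an integer under Python 3):

    import numpy as np

    p = 0.01
    slot_duration = 10
    test_duration = 60
    timeslots_left = test_duration * 1000 * 1000 // 20

    # pre-allocate both chunks once, outside the loop
    chunk = [1] * 400 * slot_duration   # slot_duration time slots of ones
    zero_array = [0] * 400              # a single time slot of zeros

    transmission_array = []
    while timeslots_left >= 0:
        rand_num = np.random.choice((0, 1), p=[1 - p, p])
        if rand_num == 1:
            transmission_array.extend(chunk)
            timeslots_left -= slot_duration
        else:
            transmission_array.extend(zero_array)
            timeslots_left -= 1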


5 Comments

That already makes it around 5 times faster, which is great! Memory might be an issue though; I reached around 9GB of memory usage, but I think that should work. The machine it will run on has 12GB or 16GB. If you have any more suggestions, I'm open to them!
the memory shouldn't be an issue with a pre-allocated chunk. Another suggestion? upvote if it's useful :)
Well, the 9GB was with a pre-allocated chunk. ;) So I need to be careful. But I shouldn't get over that either way. On the other hand, the computation time is a bit over a minute, which has implications on what I want to do it with it. And it's already upvoted. ;)
thanks for the upvote. But chunk should be 4000 ones (400*10), so no issue with the memory, right?
4000 or higher. The memory usage was for the whole program. But it seems it doesn't change when I use larger chunks. And the larger the chunk, the faster, which is good.

I would suggest the following more pythonic code.

It is better to avoid explicit loops whose only job is to append values to the list one at a time.

import numpy as np

p = 0.01
slot_duration = 10
test_duration = 60
timeslots_left = test_duration * 1000 * 1000 / 20
transmission_array = []
while timeslots_left >= 0:
    rand_num = np.random.choice((0, 1), p=[1 - p, p])
    # a "one" covers slot_duration time slots, a "zero" covers a single one
    duration = slot_duration if rand_num == 1 else 1
    transmission_array.extend([rand_num] * 400 * duration)
    timeslots_left -= duration

And as you are only storing zeros and ones in the array, I would suggest using a sparse array. It consumes even less memory.
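
For illustration only, one way such a sparse representation could look is to record just the runs of ones (start index and length) instead of all 1.2 billion individual entries. This is a sketch, not part of the code above, and the names one_runs and position are made up for the example:

import numpy as np

p = 0.01
slot_duration = 10
test_duration = 60
timeslots_left = test_duration * 1000 * 1000 // 20

one_runs = []   # (start_index, length) of every run of ones in the virtual array
position = 0    # current index in the virtual transmission array
while timeslots_left >= 0:
    rand_num = np.random.choice((0, 1), p=[1 - p, p])
    duration = slot_duration if rand_num == 1 else 1
    if rand_num == 1:
        one_runs.append((position, 400 * duration))
    position += 400 * duration
    timeslots_left -= duration
# position is the total length of the virtual array; every index not covered
# by a run in one_runs is zero, so nothing needs to be stored for the zeros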

4 Comments

Strangely enough, this is a bit slower than @Jean-François Fabre's answer, although only by a few seconds. But it seems that if I introduced pre-allocated chunks, I would end up with his answer. Or do you have another suggestion?
@Patrick It depends on what you want to do with the array. But it is worth trying to use the sparse array. Upvote or accept the answer if useful :)
definitely more pythonic, although the factorization loses the possibility to pre-allocate the chunks, which is one of the major factors for the speedup. It's still the same dilemma between efficiency and readability.
Losing the possibility to pre-allocate is why I didn't choose that answer; for me the major point is the speedup. The mention of a sparse array brought me to the bitarray module, which already reduces memory usage immensely. I only need to convince the application that follows to accept it...
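
For context, a minimal sketch of what the bitarray-based variant mentioned in the last comment could look like, assuming the third-party bitarray package (one bit per entry, so ~1.2 billion entries take roughly 150 MB instead of several GB); the pre-allocated chunks and the chunked extend work just like with a plain list:

from bitarray import bitarray

slot_duration = 10
chunk = bitarray([1] * 400 * slot_duration)   # pre-built burst of ones
zero_array = bitarray([0] * 400)              # a single time slot of zeros

transmission_array = bitarray()
# inside the generation loop, exactly as with the list version:
transmission_array.extend(chunk)
transmission_array.extend(zero_array)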
