I have a large NumPy array, my_array, that I copy into a temporary array, temp_my_array, so that I can use it in calculations inside a loop, as follows:

my_array = [10.1, 20.3, ..., 11.2] # a large numpy array
temp_my_array = np.copy(my_array)

for i in range(200000):
    for item in np.where(my_array > 5):
        temp_my_array[item] = f(my_array[some other items])
    my_array = np.copy(temp_my_array)

I get a memory error from np.copy when my_array is very large. Besides, profiling showed that np.copy is the slowest part of my code. Any ideas, please?

  • I need a deep copy, so that when I change an item in temp_my_array it doesn't change the same item in my_array. Commented Nov 30, 2017 at 9:06
  • Copy the array ONCE (what you did on line 2); that will copy the entire array, i.e. all values, all of which you can change at will. Commented Nov 30, 2017 at 9:07
  • Sorry, I didn't realize you were using the values from my_array in your function call. Commented Nov 30, 2017 at 9:07
  • In the second for loop can't you just do my_array[item] = f(my_array[item]) and not bother with your temp array? The iterator you've created with np.where will not reference the changing my_array inside the loop. Commented Nov 30, 2017 at 9:08
  • @Lærne Can you please elaborate a bit more? Commented Nov 30, 2017 at 9:09

2 Answers

I suggest you only copy the values that actually changed. With your code this is only a slight change:

my_array = [10.1, 20.3, ..., 11.2] # a large numpy array
temp_my_array = np.copy(my_array)

for i in range(200000):
    inds = np.where(my_array > 5)
    for item in inds: 
        temp_my_array[item] = f(my_array[some other items]) 
    my_array[inds] = temp_my_array[inds]

Alternatively, you can vectorize your function, but that may be awkward if [some other items] depends on your current index, and impossible if it depends on the previous my_array result.
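As a rough, self-contained sketch of the "copy back only what changed" idea: the question never says what f or [some other items] are, so the f below and the choice to read my_array[inds] are purely hypothetical stand-ins. np.flatnonzero is used here instead of np.where so that inds is a plain 1-D index array.

```python
import numpy as np

def f(values):
    # Hypothetical stand-in for the asker's f: halves each input value.
    return 0.5 * values

rng = np.random.default_rng(0)
my_array = rng.uniform(0.0, 20.0, size=1_000)
temp_my_array = my_array.copy()  # the only full copy, made once

for _ in range(10):
    inds = np.flatnonzero(my_array > 5)  # 1-D array of positions to update
    if inds.size == 0:
        break  # nothing left above the threshold
    # Read old values from my_array, write new ones into the scratch array.
    # (my_array[inds] is an assumption; the source only says "some other items".)
    temp_my_array[inds] = f(my_array[inds])
    # Copy back only the entries that changed, not the whole array.
    my_array[inds] = temp_my_array[inds]
```

The two arrays stay in sync after every pass, and the per-iteration cost scales with the number of changed entries rather than with the full array size.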

Does this sound reasonable? Mass-assign to the original array without making copies:

my_array = [10.1, 20.3, ..., 11.2] # a large numpy array

for i in range(200000):
    my_array[np.where(my_array > 5)] = f(my_array[some other items])  # Mass assign instead of for-loop

You'll need to make sure f() returns an array now.
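To make the "f() returns an array" requirement concrete, here is a minimal sketch. The f below is a hypothetical elementwise function, and reading my_array[mask] is an assumption standing in for the unspecified [some other items]. The right-hand side is evaluated completely before the assignment happens, so every read sees the old values.

```python
import numpy as np

def f(values):
    # Hypothetical elementwise f; returns an array shaped like its input.
    return np.sqrt(values)

my_array = np.array([10.1, 20.3, 3.0, 11.2])

mask = my_array > 5  # boolean mask; np.where is not required for assignment
# f must return one value per selected element (or something broadcastable
# to that shape), otherwise the assignment raises a ValueError.
my_array[mask] = f(my_array[mask])
```

Entries that fail the condition (3.0 here) are left untouched.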

5 Comments

This is an interesting solution! Is this a vectorised way of doing the calculations?
If some other items depend on the prior result of my_array then you cannot vectorize in this fashion, so it really depends what the other items are.
@BehzadJamali Yeah assignment with index-array is vectorized with numpy, so this should be very fast.
@AlexanderReynolds Yeah that's true, I assumed that it doesn't because it's "some other items" as in from someplace else.
I do depend on the prior value of my_array. Is it true to say that vectorization will speed up my code by removing the second loop, but that I still need to do my_array = np.copy(temp_my_array)?
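One possible way to keep the two-array scheme without allocating a new array on every pass is np.copyto, which writes into an existing buffer. This is only a sketch under assumptions not stated in the thread: the neighbour-averaging f and the wrap-around indexing are hypothetical and merely illustrate an update that depends on other, prior values.

```python
import numpy as np

def f(left, right):
    # Hypothetical update rule: average of the two neighbouring values.
    return 0.5 * (left + right)

my_array = np.array([10.0, 2.0, 8.0, 4.0, 12.0])
temp_my_array = my_array.copy()  # allocated once, reused on every pass
n = my_array.size

for _ in range(3):
    inds = np.flatnonzero(my_array > 5)
    # All reads come from my_array, which still holds the prior values.
    left = my_array[(inds - 1) % n]
    right = my_array[(inds + 1) % n]
    temp_my_array[inds] = f(left, right)
    # In-place copy into the existing buffer; unlike np.copy, no new array
    # is allocated here.
    np.copyto(my_array, temp_my_array)
```

Because np.copy builds a fresh array each time while np.copyto reuses the destination, this may help with the memory error, though the full-array copy still costs time proportional to the array size.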
