Python avoiding large array allocation multiple times

Question

I have to evaluate a function many times. Each evaluation requires computing the elements of an array, and the array is quite large.

How can I avoid allocating the array on every function call?

The code I have tried goes something like this:

import numpy

class FunctionCalculator(object):
    def __init__(self, data):
        """
        Get the data and do some small handling of it
        """
        self.data = data
    def function(self, point):
        # A new list and a new array are built on each call
        return numpy.sum(numpy.array([somecomputations(item) for item in self.data]))

Maybe my concern is unfounded, so first I have this question.

Question: Is it true that the array [somecomputations(item) for item in data] is being allocated and deallocated for every call to function?

Assuming that is the case, I tried:

import numpy

class FunctionCalculator(object):
    def __init__(self, data):
        """
        Get the data and do some small handling of it
        """
        self.data = data
        self.number_of_data = range(len(data))
        # Buffer allocated once and reused on every call
        self.my_array = numpy.zeros(len(data))
    def function(self, point):
        for i in self.number_of_data:
            self.my_array[i] = somecomputations(self.data[i])
        return numpy.sum(self.my_array)

This is slower than the previous version. I assume that the list comprehension in the first version can be run entirely in C, while in the second version only smaller parts of the script can be translated into optimized C code.

I have very little idea of how Python works inside.

Question: Is there a good way to skip the array allocation in every function call and at the same time take advantage of a well optimized loop on the array?

I am using Python 3.5.

kabanus · Accepted Answer · 2016-10-22 15:51:42Z

Looping over the array in Python is unnecessary and crosses from Python into C many times, hence the slowdown. The beauty of numpy arrays is that functions work on them element by element. I think the fastest would be:

return numpy.sum(somecomputations(self.data))

somecomputations may need a bit of modification, but often it will work right off the bat. Also, you're not using point, among other things.
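
A minimal sketch of this approach, assuming somecomputations is plain arithmetic. The body of somecomputations, the way point is used, and the function_no_alloc method are hypothetical stand-ins, not the asker's actual code. The second method uses the out= argument of numpy ufuncs to write into a buffer allocated once, which speaks to the allocation concern directly.

```python
import numpy


def somecomputations(values, point):
    # Hypothetical stand-in: squared distance of each value from `point`.
    return (values - point) ** 2


class FunctionCalculator(object):
    def __init__(self, data):
        self.data = numpy.asarray(data, dtype=float)
        # Scratch buffer allocated once and reused on every call.
        self.my_array = numpy.empty_like(self.data)

    def function(self, point):
        # Simple version: temporaries are still allocated, but the loop runs in C.
        return numpy.sum(somecomputations(self.data, point))

    def function_no_alloc(self, point):
        # Same arithmetic, with each ufunc writing into the preallocated buffer.
        numpy.subtract(self.data, point, out=self.my_array)
        numpy.multiply(self.my_array, self.my_array, out=self.my_array)
        return numpy.sum(self.my_array)


calc = FunctionCalculator([1.0, 2.0, 3.0])
result = calc.function(2.0)
result_no_alloc = calc.function_no_alloc(2.0)
```

Whether the out= variant is measurably faster than the simple one likely depends on the array size, so it is worth timing both on the real data.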


3 Comments

myfirsttime1 Over a year ago

In order to pass the numpy array to my function, does the function have to consist of operations numpy knows about? I didn't quite understand.

myfirsttime1 Over a year ago

Oh! I just learned there is a numpy.vectorize, to which you pass a function. That might help.
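
For reference, a small sketch of numpy.vectorize wrapping a scalar-only function. The body of somecomputations here is a hypothetical example chosen because math.sqrt does not accept multi-element arrays. Note that the numpy documentation describes vectorize as a convenience rather than a performance tool: its implementation is essentially a for loop, so it may not be faster than the list comprehension.

```python
import math

import numpy


def somecomputations(item):
    # Hypothetical scalar-only function: math.sqrt rejects multi-element arrays.
    return math.sqrt(item) + 1.0


# Wrap the scalar function so it can be called on a whole array.
vectorized = numpy.vectorize(somecomputations)

data = numpy.array([0.0, 1.0, 4.0, 9.0])
total = numpy.sum(vectorized(data))
```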

kabanus Over a year ago

It's not a must. Functions which are simple arithmetic expressions work as is.
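
To illustrate the distinction, here is a sketch with hypothetical functions (none of them are from the question). A function built only from arithmetic operators accepts an array unchanged, while one with an if on its argument fails on a multi-element array and needs an array-aware rewrite such as numpy.where.

```python
import numpy


def arithmetic_only(x):
    # Only arithmetic operators: works unchanged on scalars and arrays.
    return 3.0 * x ** 2 + 1.0


def with_branch_scalar(x):
    # Scalar-only: `if` on a multi-element array raises ValueError.
    if x > 0:
        return x
    return 0.0


def with_branch_array(x):
    # Array-friendly rewrite of the branch above.
    return numpy.where(x > 0, x, 0.0)


data = numpy.array([-1.0, 0.0, 2.0])
arithmetic_total = numpy.sum(arithmetic_only(data))
branch_total = numpy.sum(with_branch_array(data))
```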
