I have a python function (call it myFunction) that gets as input a list of numbers, and, following a complex calculation, returns back the result of the calculation (which is a number).
The function looks like this:
def myFunction( listNumbers ):
# initialize the result of the calculation
calcResult = 0
# looping through all indices, from 0 to the last one
for i in xrange(0, len(listNumbers), 1):
# some complex calculation goes here, changing the value of 'calcResult'
# let us now return the result of the calculation
return calcResult
I tested the function, and it works as expected.
Normally, myFunction is provided a listNumbers argument that contains 5,000,000 elements in it. As you may expect, the calculation takes time. I need this function to run as fast as possible
Here comes the challenge: assume that the time now is 5am, and that listNumbers contains just 4,999,999 values in it. Meaning, its LAST VALUE is not yet available. This value will only be available at 6am.
Obviously, we can do the following (1st mode): wait until 6am. Then, append the last value into listNumbers, and then, run myFunction. This solution works, BUT it will take a while before myFunction returns our calculated result (as we need to process the entire list of numbers, from the first element on). Remember, our goal is to get the results as soon as possible past 6am.
I was thinking about a more efficient way to solve this (2nd mode): since (at 5am) we have listNumbers with 4,999,999 values in it, let us immediately start running myFunction. Let us process whatever we can (remember, we don't have the last piece of data yet), and then -- exactly at 6am -- 'plug in' the new data piece -- and generate the computed result. This should be significantly faster, as most of the processing will be done BEFORE 6am, hence, we will only have to deal with the new data -- which means the computed result should be available immediately after 6am.
Let's suppose that there's no way for us to inspect the code of myFunction or modify it. Is there ANY programming technique / design idea that will allow us to take myFunction AS IS, and do something with it (without changing its code) so that we can have it operate in the 2nd mode, rather than the 1st one?
Please do not suggest using c++ / numpy + cython / parallel computing etc to solve this problem. The goal here is to see if there's any programming technique or design pattern that can be easily used to solve such problems.
listNumbersis currently a numpy array. Could you show how it's getting populated, and what causes it to wait until 6am to receive the last element? Is there threading involved?listNumbersloads its first 4,999,999 values from disk (at 5am). At 6pm, the main program gets a message with the most recent value, and calls a method that updates the last value oflistNumberswith the new value. No threading is involved at present.myFunctiondoes, and then you just do the same thing for the last value ex-post, no fancy hacks needed. Or you don't, and then there is no way for us to know how you access the list object inside the function: list[index]? multiple iterations over the list? what other methods on the list are called, apart from the unsightlylen()? some completely different magickery?