
I am trying to speed up my code. The biggest problem is a few nested loops I have (they have to iterate over 25000 cells). However, when I try to get rid of these nested loops, I get a different result and I don't seem to get why.

This is one of the nested loops:

for i in range(N):
    for j in range(N):
        # value added in sector i (month k+1)
        VA[i,k+1]= VA[i,k+1] - IO[j,i]*(Produc[i,k+1]/Produc[i,0])

This is what I did to get rid of the inner loop:

for i in range(N):
    VA[i,k+1]=VA[i,k+1] - np.sum(IO[:,i])*(Produc[i,k+1]/Produc[i,0])

Thank you very much for your help.

  • use xrange instead of range (for Python version < 3); otherwise it creates a list of size N every time. Commented Sep 20, 2013 at 12:12
  • What is the size of IO? In the first case (inner loop) you sum over j from 0 to N-1, and in the second case from 0 to len(IO)-1. Commented Sep 20, 2013 at 12:14
  • @GrijeshChauhan: trying to make your single codebase futureproof by making it very inefficient on the interpreter you're already using seems like a Bad Idea. Commented Sep 20, 2013 at 12:16
  • @GrijeshChauhan, not exactly what I would call "more portable". The meaning of the code and the performance is very different. It just has the same end result in a lot of cases. 2to3 will do this transformation better. Commented Sep 20, 2013 at 12:18
  • Can you give an example of two differing results? Is it possible that the difference is due to truncation of floating point values? Commented Sep 20, 2013 at 12:34

1 Answer

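The problem is that assigning to VA casts the result to VA.dtype, so you can lose accuracy if VA.dtype is less precise than the result of VA[i,k+1] - IO[j,i]*(Produc[i,k+1]/Produc[i,0]). In the nested loop that cast happens on every inner iteration; in the summed version it happens only once per i.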

To keep this rounding you'd want:

for i in range(N):
    # value added in sector i (month k+1)
    VA[i,k+1] -= (IO[:,i]*(Produc[i,k+1]/Produc[i,0])).astype(VA.dtype).sum()

...assuming you're not happier with the more accurate version!

Some more painstaking research has shown that if the subtractions take the data through 0, the behaviour isn't perfectly emulated. I wouldn't bother though, because emulating subtle bugs is a waste of time ;).
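To make the dtype effect concrete, here is a small sketch. The data is invented for illustration: I'm assuming VA is an integer array and IO and Produc are floats, which may not match your actual arrays. The nested loop truncates after every subtraction, while the summed version truncates only once, so the two disagree.

```python
import numpy as np

# Hypothetical data: integer VA, float IO and Produc (an assumption, not the asker's data).
N, k = 3, 0
VA = np.full((N, 2), 10, dtype=np.int64)
IO = np.full((N, N), 0.4)
Produc = np.ones((N, 2))

# Nested loop: each assignment casts the float result back to int64.
VA_loop = VA.copy()
for i in range(N):
    for j in range(N):
        VA_loop[i, k+1] = VA_loop[i, k+1] - IO[j, i] * (Produc[i, k+1] / Produc[i, 0])

# Summed version: only one cast per i.
VA_sum = VA.copy()
for i in range(N):
    VA_sum[i, k+1] = VA_sum[i, k+1] - np.sum(IO[:, i]) * (Produc[i, k+1] / Produc[i, 0])

print(VA_loop[:, k+1])  # [7 7 7]: 10 -> 9.6 -> 9 -> 8.6 -> 8 -> 7.6 -> 7
print(VA_sum[:, k+1])   # [8 8 8]: 10 - 1.2 = 8.8 -> 8
```

If all of your arrays are already float64, this particular effect shouldn't apply and the cause is more likely something else.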


Note that if you're happy with

for i in range(N):
    VA[i,k+1]=VA[i,k+1] - np.sum(IO[:,i])*(Produc[i,k+1]/Produc[i,0])

you can also do

VA[:,k+1] -= IO.sum(axis=0) * Produc[:,k+1] / Produc[:,0]

which I think is equivalent.
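A quick way to check that claim on your own data is to run both forms on copies and compare. The sketch below uses random float arrays as a stand-in (the shapes and values are assumptions) and np.allclose rather than exact equality, since the operations are grouped slightly differently.

```python
import numpy as np

# Made-up float data standing in for the real arrays.
rng = np.random.default_rng(0)
N, k = 5, 0
VA = rng.random((N, 3))
IO = rng.random((N, N))
Produc = rng.random((N, 3)) + 1.0  # keep the denominator away from zero

# Single loop with np.sum over the column.
VA_loop = VA.copy()
for i in range(N):
    VA_loop[i, k+1] = VA_loop[i, k+1] - np.sum(IO[:, i]) * (Produc[i, k+1] / Produc[i, 0])

# Fully vectorized form.
VA_vec = VA.copy()
VA_vec[:, k+1] -= IO.sum(axis=0) * Produc[:, k+1] / Produc[:, 0]

print(np.allclose(VA_loop, VA_vec))
```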


Note that all of this assumes N matches the dimensions of these arrays. If the loops only cover a subset of the data (say VA[:N] and IO[:N, :N]), then IO[:,i] sums over more rows than the inner loop does. In that case that's the problem, and you should crop everything to N within the calculations.
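Here is a sketch of what cropping would look like, assuming (hypothetically) that the arrays have M rows with M larger than N. The uncropped IO[:,i] picks up the extra rows and disagrees with the nested loop, while the cropped version matches it.

```python
import numpy as np

# Hypothetical case: arrays are larger (M) than the loop bound (N).
rng = np.random.default_rng(0)
M, N, k = 6, 4, 0
VA = rng.random((M, 3))
IO = rng.random((M, M)) + 0.1
Produc = rng.random((M, 3)) + 1.0

# Original nested loop: j only runs over the first N rows of IO.
VA_loop = VA.copy()
for i in range(N):
    for j in range(N):
        VA_loop[i, k+1] -= IO[j, i] * (Produc[i, k+1] / Produc[i, 0])

# Uncropped sum: IO[:, i] covers all M rows, so the result differs.
VA_full = VA.copy()
for i in range(N):
    VA_full[i, k+1] -= np.sum(IO[:, i]) * (Produc[i, k+1] / Produc[i, 0])

# Cropped, vectorized version: restrict every array to the first N entries.
VA_crop = VA.copy()
VA_crop[:N, k+1] -= IO[:N, :N].sum(axis=0) * Produc[:N, k+1] / Produc[:N, 0]

print(np.allclose(VA_loop, VA_crop))  # cropped matches the loop
print(np.allclose(VA_loop, VA_full))  # uncropped does not
```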


10 Comments

Am I allowed to ask what the disagreement is? I have a sample data set where my answer works, so it's at least a reasonable guess.
VA[:,k+1] -= IO.sum(axis=0) * Produc[:,k+1] / Produc[:,0] looks right to me.
Your first statement is incorrect: you don't change the type by assigning to one item of an array. And you especially don't when you use -= or +=.
Ah, no, it's the type of the object you're adding, not of the array you're adding to. So [32] + 1.4 would become [33], and that's a rounding error that the original change didn't consider. It's an easy misunderstanding, I'll clear up the language.
Even if the dtypes of all the arrays are identical, the results will likely be different due to truncation since you are now adding/subtracting values of different magnitudes.
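A deliberately extreme sketch of that last point, with everything in float64. The values are chosen only to make the effect visible: each small term is lost when subtracted on its own, but their sum is large enough to register.

```python
import numpy as np

# Contrived magnitudes: at 1e16 the spacing between float64 values is 2.
big = np.array([1e16])
small = np.full(16, 0.25)

# Subtract one term at a time: each 0.25 is rounded away.
loop_result = big.copy()
for s in small:
    loop_result[0] = loop_result[0] - s

# Subtract the sum in one go: 16 * 0.25 = 4.0 is large enough to survive.
sum_result = big.copy()
sum_result[0] = sum_result[0] - small.sum()

print(loop_result[0] - sum_result[0])  # 4.0
```

With realistic data the gap is usually far smaller, but it is enough to make an exact == comparison fail; np.allclose is the safer check.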