I tried to optimize the code below, but I cannot figure out how to improve its speed. It takes almost 30 seconds to run, and I believe most of that time goes into accessing the bootsam and filedata matrices. Is it possible to improve the performance?

import numpy as np
filedata=np.genfromtxt('monthlydata1970to2010.txt',dtype='str') # this will create a 980 * 7 matrix
nboot=5000
results=np.zeros((11,nboot))   # this will create an 11*5000 matrix
results[0,:]=600  
horizon=360  
balance=200  
bootsam=np.random.randint(984, size=(984, nboot)) # this will create 984*5000 matrix
for bs in range(0,nboot):  
   for mn in range(1,horizon+1):  
        if mn%12 ==1:  
            bondbal = 24*balance  
            sp500bal=34*balance  
            russbal = 44*balance  
            eafebal=55*balance  
            cashbal =66*balance  
            bondbal=bondbal*(1+float(filedata[bootsam[mn-1,bs]-1,2]))  
            sp500bal=sp500bal*(1+float(filedata[bootsam[mn-1,bs]-1,3]))  
            russbal=russbal*(1+float(filedata[bootsam[mn-1,bs]-1,4]))  
            eafebal=eafebal*(1+float(filedata[bootsam[mn-1,bs]-1,5]))  
            cashbal=cashbal*(1+float(filedata[bootsam[mn-1,bs]-1,6]))  
            balance=bondbal + sp500bal + russbal + eafebal + cashbal  
        else:  
            bondbal=bondbal*(1+float(filedata[bootsam[mn-1,bs]-1,2]))
            sp500bal=sp500bal*(1+float(filedata[bootsam[mn-1,bs]-1,3]))
            russbal=russbal*(1+float(filedata[bootsam[mn-1,bs]-1,4]))
            eafebal=eafebal*(1+float(filedata[bootsam[mn-1,bs]-1,5]))
            cashbal=cashbal*(1+float(filedata[bootsam[mn-1,bs]-1,6]))
            balance=bondbal + sp500bal + russbal + eafebal + cashbal
            if mn == 60:
               results[1,bs]=balance
            if mn == 120: 
               results[2,bs]=balance
            if mn == 180:
               results[3,bs]=balance
            if mn == 240:
               results[4,bs]=balance
            if mn == 300: 
               results[5,bs]=balance  
  • 1+float(100) is the same thing as 101. Commented Mar 1, 2013 at 5:05
  • It would probably help if you said what you were trying to do with the code instead of asking how to improve it. Commented Mar 1, 2013 at 5:07
  • Use the timeit module if you want to check how much time it is taking. Commented Mar 1, 2013 at 5:08
  • While timeit is the best way to time something, if you're getting times on the order of 7s using datetime.now(), timeit will likely say the same thing. Commented Mar 1, 2013 at 5:11
  • @HunterMcMillen: Basically, I am converting some MATLAB code to Python. I cannot post the actual code because it is confidential. In (1+float(100)), the 100 comes from a two-dimensional string matrix, which is why I used float to convert the string value. Commented Mar 1, 2013 at 5:13

2 Answers

Basic algebra: executing x = x * 1.23 360 times can easily be replaced by a single execution of

x = x * (1.23 ** 360)

Refactor your code and you'll see that the loops are not really needed.
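To make the algebra concrete, here is a minimal sketch. The function names and the rate 1.01 are made up for illustration and are not from the question. Note that the question's growth factors change every month, so the closed form there is a product of the sampled factors rather than a single power.

```python
import numpy as np

def compound_loop(balance, factor, months):
    # Repeated multiplication, as in the original nested loop.
    for _ in range(months):
        balance = balance * factor
    return balance

def compound_closed_form(balance, factor, months):
    # Same result in one step when the factor is constant.
    return balance * factor ** months

def compound_varying(balance, factors):
    # When the factor differs each month, multiply by the product of all factors.
    return balance * np.prod(factors)

print(compound_loop(200.0, 1.01, 360))
print(compound_closed_form(200.0, 1.01, 360))
```

The two printed values agree up to floating-point rounding. For per-month factors, np.prod (or np.cumprod, if intermediate balances are needed) plays the role of the power.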


It is difficult to answer without seeing the real code. I can't get your sample working because balance becomes inf early in the run, as noted in the comments on the question. Still, one obvious optimization is to stop reading the bootsam[mn-1,bs] element five times per iteration when computing the xxbal variables. All of those variables use the same bootsam element, so read it once and reuse it:

for bs in range(0,nboot):   # on Python 2, use xrange here to avoid building a list
   for mn in range(1,horizon+1):
        row = bootsam[mn-1,bs]-1   # look up the sampled row once per month
        if (mn % 12) == 1:  
            bondbal = 24*balance
            sp500bal=34*balance
            russbal = 44*balance
            eafebal=55*balance
            cashbal =66*balance

            bondbal=bondbal*(1+float(filedata[row,2]))  
            sp500bal=sp500bal*(1+float(filedata[row,3]))  
            russbal=russbal*(1+float(filedata[row,4]))  
            eafebal=eafebal*(1+float(filedata[row,5]))  
            cashbal=cashbal*(1+float(filedata[row,6]))  
            balance=bondbal + sp500bal + russbal + eafebal + cashbal
        else:  
            bondbal=bondbal*(1+float(filedata[row,2]))  
            sp500bal=sp500bal*(1+float(filedata[row,3]))  
            russbal=russbal*(1+float(filedata[row,4]))  
            eafebal=eafebal*(1+float(filedata[row,5]))  
            cashbal=cashbal*(1+float(filedata[row,6]))  
            balance=bondbal + sp500bal + russbal + eafebal + cashbal

The optimized code (which uses a fake value for balance) runs nearly twice as fast as the original on my old Acer Aspire.

Update

If you need further optimizations you can do at least two more things:

  • Do not add 1 and convert to float every time an element of filedata is accessed. Instead, add 1 to the whole array once, at creation time, and give it a float datatype.
  • Do not use arithmetic expressions that mix NumPy scalars with built-in numbers, because arithmetic on NumPy scalars is slower than arithmetic on plain Python floats (you can read more on this problem in this SO thread).

The following code applies both suggestions:

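import numpy as np
filedata=np.genfromtxt('monthlydata1970to2010.txt',dtype='str') # this will create a 980 * 7 matrix
# np.float was removed from recent NumPy releases; the built-in float works the same way here
my_list = (1.0 + filedata.astype(float)).tolist() # tolist() converts NumPy floats to Python floats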
nboot=5000
results=np.zeros((11,nboot))   #this will create 11*5000 matrix
results[0,:]=600  
horizon=360
balance=200
bootsam=np.random.randint(5, size=(984, nboot)) # this will create 984*5000 matrix
for bs in range(0,nboot):   # on Python 2, use xrange here to avoid building a list
   for mn in range(1,horizon+1):
        row = int(bootsam[mn-1,bs]-1)
        if (mn % 12) == 1:
            bondbal = 24*balance
            sp500bal=34*balance
            russbal = 44*balance
            eafebal=55*balance
            cashbal =66*balance

            bondbal=bondbal*(my_list[row][2])  
            sp500bal=sp500bal*(my_list[row][3])  
            russbal=russbal*(my_list[row][4])  
            eafebal=eafebal*(my_list[row][5])  
            cashbal=cashbal*(my_list[row][6])  
            balance=bondbal + sp500bal + russbal + eafebal + cashbal
        else:  
            bondbal=bondbal*(my_list[row][2])  
            sp500bal=sp500bal*(my_list[row][3])  
            russbal=russbal*(my_list[row][4])  
            eafebal=eafebal*(my_list[row][5])  
            cashbal=cashbal*(my_list[row][6])  
            balance=bondbal + sp500bal + russbal + eafebal + cashbal  

With those changes the code runs nearly twice as fast as the previously optimized code.
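If that is still too slow, the loops can be removed entirely with NumPy. The following is only a sketch of a different technique (vectorization), not the code timed above, and it rests on several assumptions that are not in the question: returns is already a float array with one column per asset, bootsam holds 0-based row indices, the weights are hypothetical and sum to 1, horizon is a multiple of 12, and the balance restarts from start_balance for every bootstrap sample. The name yearly_balances is made up for illustration.

```python
import numpy as np

def yearly_balances(returns, bootsam, weights, start_balance, horizon):
    """Return end-of-year balances, shape (years, nboot), without Python loops."""
    nboot = bootsam.shape[1]
    years = horizon // 12
    # Monthly growth factors for every sampled row: (horizon, nboot, n_assets).
    factors = 1.0 + returns[bootsam[:horizon]]
    # Group the months by year and compound each asset within the year.
    yearly = factors.reshape(years, 12, nboot, -1).prod(axis=1)
    # Rebalancing at the start of each year means the yearly growth of the
    # whole portfolio is the weighted sum of the per-asset yearly growth.
    growth = yearly @ weights  # (years, nboot)
    # Compound the yearly growth across years.
    return start_balance * np.cumprod(growth, axis=0)

# Usage with synthetic data (all values here are invented for the demo).
rng = np.random.default_rng(0)
returns = rng.normal(0.005, 0.03, size=(480, 5))
bootsam = rng.integers(0, returns.shape[0], size=(360, 1000))
weights = np.array([0.2, 0.3, 0.2, 0.2, 0.1])
balances = yearly_balances(returns, bootsam, weights, 200.0, 360)
print(balances.shape)
```

Row 4 of the result holds the balance after month 60, row 9 after month 120, and so on, which corresponds to the values the question stores in results.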

3 Comments

Thanks for looking into the code. Your comments really saved me time. I have now replaced the repeated lookup with the row variable, and the code runs in 15 seconds instead of 30.
I have run the code on my local machine and it executes fine. Can you please optimize it further?
You are awesome. We started at 28 seconds and have now reached 4 seconds. Thanks once again.
