I'm trying to speed up a loop that separates data into two categories. Normally I wouldn't care much about speed, but this loop slows down dramatically as the iterations accumulate. Here is the code:

plane1Data = []
plane2Data = []
plane1Times = []
plane2Times = []
plane1Dets = []
plane2Dets = []
t1 = time.time()
for i in range(0,len(adcBoardVals)):#10000):
    tic = time.time()
    if adcBoardVals[i] == 5:
        if adcChannel[i] == 0:
            #detectorVal = detectorVal + [0]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [0]
        elif adcChannel[i] == 1:
            #detectorVal = detectorVal + [1]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [1]
        elif adcChannel[i] == 2:
            #detectorVal = detectorVal + [2]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [2]
        elif adcChannel[i] == 3:
            #detectorVal = detectorVal + [3]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [3]
        elif adcChannel[i] == 4:
            #detectorVal = detectorVal + [4]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            #plane1Dets = plane1Dets + [4]
        elif adcChannel[i] == 5:
            #detectorVal = detectorVal + [5]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [5]
        elif adcChannel[i] == 6:
            #detectorVal = detectorVal + [6]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [6]
        elif adcChannel[i] == 7:
            #detectorVal = detectorVal + [7]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [7]
    elif adcBoardVals[i] == 7:
        if adcChannel[i] == 0:
            #detectorVal = detectorVal + [16]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [16]
        elif adcChannel[i] == 1:
            #detectorVal = detectorVal + [17]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [17]
        elif adcChannel[i] == 2:
            #detectorVal = detectorVal + [18]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [18]
        elif adcChannel[i] == 3:
            #detectorVal = detectorVal + [19]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [19]
        elif adcChannel[i] == 4:
            #detectorVal = detectorVal + [20]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [20]
        elif adcChannel[i] == 5:
            #detectorVal = detectorVal + [21]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [21]
        elif adcChannel[i] == 6:
            #detectorVal = detectorVal + [22]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [22]
        elif adcChannel[i] == 7:
            #detectorVal = detectorVal + [23]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [23]
    elif adcBoardVals[i] == 6:
        if adcChannel[i] == 0:
            #detectorVal = detectorVal + [8]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [8]
        elif adcChannel[i] == 1:
            #detectorVal = detectorVal + [9]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [9]
        elif adcChannel[i] == 2:
            #detectorVal = detectorVal + [10]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [10]
        elif adcChannel[i] == 3:
            #detectorVal = detectorVal + [11]
            plane1Data = plane1Data + [rawDataMat[i,:]]
            plane1Times = plane1Times + [timeVals[i]]
            plane1Dets = plane1Dets + [11]
        elif adcChannel[i] == 4:
            #detectorVal = detectorVal + [12]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [12]
        elif adcChannel[i] == 5:
            #detectorVal = detectorVal + [13]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [13]
        elif adcChannel[i] == 6:
            #detectorVal = detectorVal + [14]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [14]
        elif adcChannel[i] == 7:
            #detectorVal = detectorVal + [15]
            plane2Data = plane2Data + [rawDataMat[i,:]]
            plane2Times = plane2Times + [timeVals[i]]
            plane2Dets = plane2Dets + [15]
    if i%100000 == 0:
        print('k = ',i)   
        toc = time.time()
        print('tictoc = ',toc-tic)
        print('elapsed = ',toc-t1)
    elif i>900000:
        if i%1000 == 0:
            print('k = ',i)
            toc = time.time()
            print('tictoc = ',toc-tic)
            print('elapsed = ',toc-t1)

#detectorVal = np.array(detectorVal,dtype='float')
plane1Data = np.array(plane1Data,dtype='float')
plane2Data = np.array(plane2Data,dtype='float')
plane1Times = np.array(plane1Times,dtype='float')
plane2Times = np.array(plane2Times,dtype='float')
plane1Dets = np.array(plane1Dets,dtype='int')
plane2Dets = np.array(plane2Dets,dtype='int')

I vaguely remember from a C++ course I took a while ago that you can build lists that run faster than nested 'if' statements. Is this correct, and if so, can I do this in Python? I am running Python 3.5 right now. Thank you for your help.

  • This question may belong on the CodeReview SE. Commented Aug 30, 2017 at 14:55
  • 1) Use append instead of +, which is much slower, to add elements to a list. 2) If you are going to create NumPy arrays anyway, consider preallocating them (e.g. with np.empty) first and then setting the values in the loop. Commented Aug 30, 2017 at 15:00
  • 1) When I use the append function, the first value is always set to 'None' for some reason. I'm using, for instance, plane2Data = np.append(plane2Data,rawDataMat[i,:]) 2) can I use np.empty to generate an array of unknown length? Thanks. Commented Aug 30, 2017 at 15:11
  • @jdehesa is spot-on on both counts. If you'll be using numpy, you could use its high-performance data structures instead of ordinary lists. That might speed things up a little, but it is unrelated to your real performance problem, which is the list copies. Commented Aug 30, 2017 at 16:35
  • Independently, you can greatly simplify your code by using the value of adcChannel[i] in the last assignment. You'll be able to drop the entire set of embedded if-else blocks. Commented Aug 30, 2017 at 16:36
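
To illustrate the last comment, here is a minimal sketch of the loop with the nested if-else blocks replaced by arithmetic. It rests on assumptions read off the code in the question: the detector number follows the pattern (board - 5) * 8 + channel, detectors 0 to 11 go to plane 1 and 12 to 23 go to plane 2, and the commented-out plane1Dets line for board 5, channel 4 is unintentional. The function name split_planes and the dictionary layout are hypothetical.

```python
import numpy as np

def split_planes(adcBoardVals, adcChannel, rawDataMat, timeVals):
    """Hypothetical helper: split rows into plane 1 / plane 2 by detector number."""
    planes = {1: {"data": [], "times": [], "dets": []},
              2: {"data": [], "times": [], "dets": []}}
    for i in range(len(adcBoardVals)):
        board, channel = adcBoardVals[i], adcChannel[i]
        # Rows from other boards or channels are skipped, as in the original loop.
        if board not in (5, 6, 7) or channel not in (0, 1, 2, 3, 4, 5, 6, 7):
            continue
        # Assumed pattern from the question: board 5 -> 0..7, 6 -> 8..15, 7 -> 16..23.
        det = int((board - 5) * 8 + channel)
        plane = planes[1] if det < 12 else planes[2]
        # append() grows the list in place instead of copying it.
        plane["data"].append(rawDataMat[i, :])
        plane["times"].append(timeVals[i])
        plane["dets"].append(det)
    result = []
    for key in (1, 2):
        p = planes[key]
        result.append({"data": np.array(p["data"], dtype=float),
                       "times": np.array(p["times"], dtype=float),
                       "dets": np.array(p["dets"], dtype=int)})
    return result[0], result[1]
```

Called as plane1, plane2 = split_planes(adcBoardVals, adcChannel, rawDataMat, timeVals), this should give the same arrays as the original loop under the assumptions above, with plane1["dets"] now the same length as plane1["data"].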

1 Answer

Your problem, and it is a major time waster, is the statements of the form

list_variable = list_variable + [ new_value ]

You execute three of them on each loop iteration, e.g.:

plane1Data = plane1Data + [rawDataMat[i,:]]

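The expression list_variable + [ new_value ] always builds a brand-new list, copying every existing element, and the assignment then discards the original. Python cannot reuse the old list, because you might have additional references to it. Each such statement therefore takes time proportional to the current length of the list, which is why your loop gets slower the longer it runs. Use the following form for all your list extensions, which extends the existing list in place, and you'll see a dramatic improvement: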

list_variable += [ new_value ]

Here's the proof that this really happens:

>>> from timeit import timeit
>>> x=list(range(100000))
>>> timeit("x += [99]", "from __main__ import x", number=1000)
0.00023529794998466969
>>> x=list(range(100000))
>>> timeit("x = x + [99]", "from __main__ import x", number=1000)
0.7576854809885845
>>> 0.7576854809885845 / 0.00023529794998466969
3220.110846855868

There you have it. For this 100000-element list, appending in place is more than three thousand times faster than copy-and-assign. You can profile a subset of your own data if you want to measure your gains.
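
If you want to see the difference in behaviour and not just in timing, a small sketch like the following may help. The variable names are made up for illustration. It checks object identity to show that x = x + [...] rebinds the name to a new list, while += and append() modify the existing one.

```python
# Illustrative names only; none of these come from the question's data.
a = [1, 2, 3]
alias_a = a                  # a second reference to the same list object

a = a + [4]                  # builds a new list, copying every element of the old one
rebound = a is not alias_a   # `a` now names the copy; alias_a still names the original

b = [1, 2, 3]
alias_b = b
b += [4]                     # extends the existing list in place
same_after_iadd = b is alias_b

c = [1, 2, 3]
alias_c = c
c.append(4)                  # also in place, and avoids building the temporary [4]
same_after_append = c is alias_c

print(rebound, same_after_iadd, same_after_append)
print(alias_a, alias_b, alias_c)
```

The list behind alias_a is left untouched at three elements, which is exactly the copy the loop in the question pays for on every iteration.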

7 Comments

Would this method of using 'x += [value]' be faster than using the 'x = np.append(x,[value])' method? Thanks for the help!
I don't know; measure it.
My guess is that either method is plenty fast enough. But I could be wrong.
Hi Alexis, I did a check and using += is about 100 times faster than using np.append().
Good to know. To be fair, @jdehesa's suggestion was not that you use np.append(), but that you allocate a numpy array of the needed size (say, with all zeros) and then assign values to pre-allocated cells. But if you have adequate performance now, you should move on to more important things.
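
For reference, the preallocation idea could look roughly like the sketch below. np.append() copies the whole array on every call, so it has the same problem as x = x + [value]. When the final length is unknown, one option is to allocate for the worst case with np.empty, fill cells as matches are found, and trim at the end. The helper name collect_board is hypothetical, and it filters on the board number alone to keep the example short.

```python
import numpy as np

def collect_board(adcBoardVals, timeVals, board):
    """Hypothetical helper: copy the timeVals entries whose board number equals `board`."""
    n = len(adcBoardVals)
    out = np.empty(n, dtype=float)    # worst case: every row matches
    count = 0
    for i in range(n):
        if adcBoardVals[i] == board:
            out[count] = timeVals[i]  # assign into a preallocated cell, no copying
            count += 1
    return out[:count].copy()         # drop the unused (uninitialised) tail
```

The same pattern extends to the 2-D data by allocating np.empty((n, rawDataMat.shape[1])) and assigning whole rows.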