Conditioned index in numpy array

Question

I've been going through an online tutorial:

from sklearn.decomposition import PCA
from sklearn import datasets
import matplotlib.pyplot as plt
import time

digits=datasets.load_digits()

randomized_pca = PCA(n_components=2,svd_solver='randomized')

# a numpy array with shape (1797, 2): one row per digit sample
reduced_data_rpca = randomized_pca.fit_transform(digits.data)

# make a scatter plot

colors = ['black', 'blue', 'purple', 'yellow', 'pink', 'red', 'lime', 'cyan', 
'orange', 'gray']

start=time.time()

#   Time Taken for this loop = 9.5 seconds

# for i in range(len(reduced_data_rpca)):
#         x = reduced_data_rpca[i][0]
#         y = reduced_data_rpca[i][1]
#         plt.scatter(x,y,c=colors[digits.target[i]])

# Alternative way  TimeTaken = 0.2 sec

# plots all the points (x,y) with color[i] in ith iteration

for i in range(len(colors)):
    # Selects all x and y values whose label (0-9) equals i. (Am I correct?
    # Does this mean it scans the whole array again on each iteration to
    # check for equality?)
    x = reduced_data_rpca[:, 0][digits.target == i]  
    y = reduced_data_rpca[:, 1][digits.target == i]
    plt.scatter(x, y, c=colors[i])

end=time.time()

print("Time taken",end-start," Secs")

My question: although the commented-out loop and the active loop perform the same operation, I cannot understand how the second loop works or why it is so much faster than the first.

plan · Accepted Answer · 2017-10-08 14:37:07Z

1

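Your first loop (commented out) iterates in Python over every row of a roughly 1800-element array (the digits dataset has 1797 samples) and calls plt.scatter once per point. The second one uses NumPy's indexing for the "inner loop" and only needs a regular for loop over your 10 colors, so plt.scatter is called just 10 times. NumPy's vectorized operations run in compiled code, which is much faster than an equivalent Python-level loop.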
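To make the comparison concrete, here is a minimal sketch showing that a Python-level loop and a boolean mask select the same rows. The points and labels arrays and both helper functions are made-up stand-ins for illustration, not the tutorial's data; plotting is left out so the example only needs NumPy.

```python
import numpy as np

# Hypothetical stand-in data: 12 points in 2-D with labels 0-2
rng = np.random.default_rng(0)
points = rng.normal(size=(12, 2))
labels = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2])


def select_with_loop(points, labels, wanted):
    """Pick matching rows one at a time in Python, like the per-point loop."""
    picked = [points[i] for i in range(len(points)) if labels[i] == wanted]
    return np.array(picked)


def select_with_mask(points, labels, wanted):
    """Pick the same rows in one vectorized step using a boolean mask."""
    return points[labels == wanted]


loop_rows = select_with_loop(points, labels, 1)
mask_rows = select_with_mask(points, labels, 1)

# Both approaches return the same rows in the same order
print(np.array_equal(loop_rows, mask_rows))
```

The mask version hands the whole selection to NumPy in a single call, which is why the outer loop only has to run once per color.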

But what does digits.target == i do? It seems to me like it is not picking out a boolean array from reduced_data_rpca but doing a comparison between a dictionary and the array index over and over. Isn't the result of that comparison always False?

Also see: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html
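
To check what digits.target == i evaluates to, here is a small sketch on made-up data. The arrays target and data are hypothetical stand-ins for digits.target and reduced_data_rpca. Comparing a NumPy array with a scalar is done element by element, so the result is a boolean array of the same length rather than a single False.

```python
import numpy as np

# Hypothetical stand-ins for digits.target and reduced_data_rpca
target = np.array([0, 3, 1, 3, 2, 3])
data = np.array([
    [10.0, 0.5],
    [11.0, 1.5],
    [12.0, 2.5],
    [13.0, 3.5],
    [14.0, 4.5],
    [15.0, 5.5],
])

# Element-wise comparison: one boolean per label
mask = target == 3
print(mask.tolist())  # [False, True, False, True, False, True]

# Indexing with the mask keeps only the rows where it is True
x = data[:, 0][mask]
y = data[:, 1][mask]
print(x.tolist())  # [11.0, 13.0, 15.0]
print(y.tolist())  # [1.5, 3.5, 5.5]
```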


2 Comments

Yashwanth Over a year ago

Actually, digits.target (the class labels of the digits in reduced_data_rpca) contains about 1800 values in the range 0-9, and reduced_data_rpca contains 2 columns. What I think happens is that it loops over this array and checks whether each corresponding digits.target value equals the variable i. Am I wrong?

plan Over a year ago

You are right I think! Makes sense once I tried it out.
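
For anyone who wants to try it out as well, here is a short sketch under the assumption of randomly generated labels (the target array below is synthetic, not the real digits.target). It suggests that every iteration compares the full label array against i, so each mask has the same length as the labels, and that the ten masks together pick every point exactly once.

```python
import numpy as np

# Synthetic labels in the range 0-9, same length as the digits dataset
rng = np.random.default_rng(42)
target = rng.integers(0, 10, size=1797)

# One full-length boolean mask per label value
masks = [target == i for i in range(10)]

# Each comparison covers the whole array, so every mask has full length
all_full_length = all(m.shape == target.shape for m in masks)

# Every point carries exactly one label, so the counts add up to the total
total_selected = sum(int(m.sum()) for m in masks)

print(all_full_length, total_selected == len(target))
```

So the whole label array is rescanned on each of the 10 iterations, but that scan happens inside NumPy rather than in a Python loop.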
