Python: subsetting data frame using a list

Question

I am new to Python and have a simple question. I have a list of indices that correspond to some of the row indices of a data frame. What is the best way to use this list (preserving the order of its items) to subset the data frame?

gmds · Accepted Answer · 2019-04-09 11:18:27Z

1

Use iloc:

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randint(100, 200, (10, 2)), columns=['a', 'b'])
print(df, end='\n\n')
print(df.iloc[[7, 2, 3, 1, 6]])

Output:

     a    b
0  144  147
1  164  167
2  167  109
3  183  121
4  136  187
5  170  188
6  188  112
7  158  165
8  139  187
9  146  188

     a    b
7  158  165
2  167  109
3  183  121
1  164  167
6  188  112
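Note that iloc selects by position, which only coincides with the index labels when the frame has the default 0..n-1 index, as df does here. The sketch below uses a hypothetical frame with a shuffled index (not from the question) to show how iloc and loc differ in that case:

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions.
frame = pd.DataFrame({"a": [10, 20, 30, 40]}, index=[3, 2, 1, 0])

wanted = [2, 0]

# iloc treats the list as positions: third row, then first row.
by_position = frame.iloc[wanted]

# loc treats the list as index labels: the rows labelled 2 and 0.
by_label = frame.loc[wanted]

print(by_position)
print(by_label)
```

Both calls return the rows in the order given by the list; which one is appropriate depends on whether the list holds positions or labels.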

If your list instead holds values from a column, build a one-column frame from the list and merge it with df. A left merge keeps the rows in the order of the list:

values = [158, 167, 183, 164, 188]
print(pd.merge(pd.DataFrame(values, columns=['a']), df, on='a', how='left'))

Output:

     a    b
0  158  165
1  167  109
2  183  121
3  164  167
4  188  112
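As an alternative to merging, the column can be turned into the index temporarily and then selected by label. This is only a sketch under assumptions: the small frame and the name lookup are made up for illustration, and every entry of lookup is assumed to occur in column 'a' (loc raises a KeyError for a missing label, whereas the left merge would give a row of NaN).

```python
import pandas as pd

# Illustrative data, not the random frame from the answer.
frame = pd.DataFrame({"a": [158, 167, 183, 164, 188],
                      "b": [165, 109, 121, 167, 112]})
lookup = [183, 158, 188]

# Use column 'a' as the index, pick the labels in list order,
# then move 'a' back into an ordinary column.
subset = frame.set_index("a").loc[lookup].reset_index()
print(subset)
```

If a value appears more than once in column 'a', loc returns every matching row, so the result can be longer than the list.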

edited Apr 9, 2019 at 11:18

answered Apr 9, 2019 at 11:04


3 Comments

user5054 Over a year ago

Thank you, very helpful! How can I do a similar subsetting where my list corresponds to the elements in the first column instead of the data frame indices?

gmds Over a year ago

@user5054 That would require a fairly different method. Is that in fact your problem?

user5054 Over a year ago

Yes, this is my actual problem @gmds . I got the indices of the first column elements using indices = [data_fr[colname] == g for g in listitems] thinking I could use what you suggested after this step, and then I realized these were all logicals (and weirdly, all False), not indices. I think it would be more efficient to directly index based on the elements in column colname.
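To illustrate what the comment describes: comparing a column with a single value yields a boolean Series, not a position. The sketch below uses a made-up frame (the names data_fr, colname and listitems are borrowed from the comment, the data is hypothetical) and converts each mask into the position of its first match. It assumes every item matches at least one row.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the variable names come from the comment.
data_fr = pd.DataFrame({"name": ["x", "y", "z", "w"], "score": [1, 2, 3, 4]})
colname = "name"
listitems = ["z", "x"]

# Each comparison produces a boolean Series with one entry per row.
masks = [data_fr[colname] == g for g in listitems]

# Position of the first True in each mask (assumes a match exists).
positions = [int(np.flatnonzero(m.to_numpy())[0]) for m in masks]

print(positions)
print(data_fr.iloc[positions])
```

If every mask comes out all False, one possible cause is a type mismatch, for example comparing a column of strings with integers; checking data_fr[colname].dtype against the type of the list items may help.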

Dirk van Eck · Accepted Answer · 2019-04-09 11:18:16Z

0

"how can I do a similar subsetting where the list I have corresponds to the elements in the first column instead of the data frame indices?"

-->

[x for x in df['a'] if x in list_of_elements]
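Note that the comprehension above returns the matching values of column 'a' as a plain list rather than rows of the data frame. If the rows are wanted, a boolean mask built with isin is one common option. The sketch below uses a small made-up frame; the result follows the row order of the frame, not the order of list_of_elements.

```python
import pandas as pd

# Illustrative data; list_of_elements is the name used in the answer.
df = pd.DataFrame({"a": [158, 167, 183, 164], "b": [165, 109, 121, 167]})
list_of_elements = [183, 158]

# The comprehension yields values only, in frame order.
values_only = [x for x in df["a"] if x in list_of_elements]

# isin builds a boolean mask; indexing with it keeps whole rows.
rows = df[df["a"].isin(list_of_elements)]

print(values_only)
print(rows)
```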

answered Apr 9, 2019 at 11:18

