Let's say I have the following multi-index DataFrame:

import pandas as pd
df = pd.DataFrame({'Index0':[0,1,2,3,4,5],'Index1':[100,200,300,400,500,600],'A':[5,2,5,8,1,2]})
df = df.set_index(['Index0','Index1'])  # added per the author's follow-up comment below

Now I want to select all the rows where Index1 is less than 400. If Index1 were a regular column, this would be straightforward:

df[df['Index1'] < 400]

So one method would be to reset_index, perform the selection, then set the index again. This seems quite redundant.
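
For reference, the round trip described above might look like the following sketch. It assumes the index has been set with set_index (as noted in the comments below); the names flat and roundtrip are only illustrative.

```python
import pandas as pd

# Rebuild the example frame with the two index levels set.
df = pd.DataFrame({'Index0': [0, 1, 2, 3, 4, 5],
                   'Index1': [100, 200, 300, 400, 500, 600],
                   'A': [5, 2, 5, 8, 1, 2]}).set_index(['Index0', 'Index1'])

# Move the index levels back to columns, filter, then restore the index.
flat = df.reset_index()
roundtrip = flat[flat['Index1'] < 400].set_index(['Index0', 'Index1'])
print(roundtrip)
```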

My question is: is there a way to do this directly on an index level, and how does it work when the DataFrame has a row MultiIndex?

2 Comments
  • Oops. Forgot df.set_index(['Index0','Index1']) in the code. Commented Jun 10, 2018 at 15:58
  • Ahem, that should have been `df.set_index(['Index0','Index1'],inplace=True)`. Commented Jun 10, 2018 at 16:08

1 Answer

The simplest approach here is to use query:

df1 = df.query('Index1 < 400')
print (df1)
               A
Index0 Index1   
0      100     5
1      200     2
2      300     5
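
Since query can refer to named index levels just like columns, conditions on several levels can probably be combined in one expression. A minimal sketch, assuming the same example frame with both levels named (the variable name both is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Index0': [0, 1, 2, 3, 4, 5],
                   'Index1': [100, 200, 300, 400, 500, 600],
                   'A': [5, 2, 5, 8, 1, 2]}).set_index(['Index0', 'Index1'])

# Filter on both index levels at once; the MultiIndex is kept in the result.
both = df.query('Index0 >= 1 and Index1 < 400')
print(both)
```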

Or use get_level_values to select a level of the MultiIndex and apply boolean indexing:

df1 = df[df.index.get_level_values('Index1') < 400]

Detail:

print (df.index.get_level_values('Index1'))
Int64Index([100, 200, 300, 400, 500, 600], dtype='int64', name='Index1')
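
Comparing the level values with a scalar gives a plain boolean array, so it can be combined with other conditions before indexing. The sketch below illustrates this under the assumption that the example frame is used; the extra condition on column A is only for demonstration.

```python
import pandas as pd

df = pd.DataFrame({'Index0': [0, 1, 2, 3, 4, 5],
                   'Index1': [100, 200, 300, 400, 500, 600],
                   'A': [5, 2, 5, 8, 1, 2]}).set_index(['Index0', 'Index1'])

# Boolean mask built from one index level (one entry per row).
mask = df.index.get_level_values('Index1') < 400

# Combine the index-level mask with a column condition, position by position.
result = df[mask & (df['A'] > 2).to_numpy()]
print(result)
```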

If the levels have no names, select them by position; in query, use the special keyword ilevel_ followed by the position:

df.index.names = [None, None]
print (df)
       A
0 100  5
1 200  2
2 300  5
3 400  8
4 500  1
5 600  2

df1 = df.query('ilevel_1 < 400')

df1 = df[df.index.get_level_values(1) < 400]
print (df1)
       A
0 100  5
1 200  2
2 300  5
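
As a quick check, the two positional approaches should give the same result on an unnamed MultiIndex. This is a small sketch under that assumption; by_query and by_mask are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'Index0': [0, 1, 2, 3, 4, 5],
                   'Index1': [100, 200, 300, 400, 500, 600],
                   'A': [5, 2, 5, 8, 1, 2]}).set_index(['Index0', 'Index1'])
df.index.names = [None, None]  # drop the level names

# Select on the second level (position 1) in two ways.
by_query = df.query('ilevel_1 < 400')
by_mask = df[df.index.get_level_values(1) < 400]

print(by_query.equals(by_mask))
```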
1 Comment

Many thanks @jezrael for the prompt and complete answer. I can see now why query is convenient. I should have known the second method.