9

I'm trying to use query on a MultiIndex column. It works on a MultiIndex row, but not the column. Is there a reason for this? The documentation shows examples like the first one below, but it doesn't indicate that it won't work for a MultiIndex column.

I know there are other ways to do this, but I'm specifically trying to do it with the query function

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((4,4)))
df.index = pd.MultiIndex.from_product([[1,2],['A','B']])
df.index.names = ['RowInd1', 'RowInd2']
# This works
print(df.query('RowInd2 in ["A"]'))

df = pd.DataFrame(np.random.random((4,4)))
df.columns = pd.MultiIndex.from_product([[1,2],['A','B']])
df.columns.names = ['ColInd1', 'ColInd2']
# query on index works, but not on the multiindexed column
print(df.query('index < 2'))
print(df.query('ColInd2 in ["A"]'))
3
  • Have you read this answer? Commented Jul 24, 2018 at 16:20
  • 3
    Yes, I know there is xs and other methods, as I indicated in my question in bold. I'm specifically looking for a way to do this with the query function. The answer might be that it isn't possible. I thought that maybe there is a slightly different syntax that can be used to query with multiindex columns. Commented Jul 24, 2018 at 16:44
  • I completely missed the sentence in bold. My bad. Commented Jul 24, 2018 at 17:55

2 Answers 2

1

To answer my own question, it looks like query shouldn't be used at all (regardless of using MultiIndex columns) for selecting certain columns, based on the answer(s) here:

Select columns using pandas dataframe.query()

Sign up to request clarification or add additional context in comments.

Comments

0

You can using IndexSlice

df.query('ilevel_0>2')
Out[327]: 
ColInd1         1                  2          
ColInd2         A         B        A         B
3        0.652576  0.639522  0.52087  0.446931
df.loc[:,pd.IndexSlice[:,'A']]
Out[328]: 
ColInd1         1         2
ColInd2         A         A
0        0.092394  0.427668
1        0.326748  0.383632
2        0.717328  0.354294
3        0.652576  0.520870

1 Comment

I am aware of this, but I'm specifically looking for a way to do this with the query function. The answer might be that it isn't possible, but I thought that maybe there is a slightly different syntax that can be used to query with multiindex columns.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.