I have a MultiIndex DataFrame as follows:

import numpy as np
import pandas as pd

tuples = list(zip(['a', 'a', 'b', 'b'], ['c', 'd', 'c', 'd']))
index = pd.MultiIndex.from_tuples(tuples, names=['i1', 'i2'])
df = pd.DataFrame([5, 6, 7, 8], index=index, columns=['col'])

       col
i1 i2     
a  c     5
   d     6
b  c     7
   d     8

I would like to keep only the rows whose level-0 index value is in

idx_to_keep = ['a']

Should be a straightforward task, but I can't think of any other way than

idx_to_drop = np.setdiff1d(pd.unique(df.index.levels[0]), idx_to_keep)
df.drop(idx_to_drop, inplace=True)

       col
i1 i2     
a  c     5
   d     6

Can I do better?

3 Answers

One way is to use the index method get_level_values():

df
       col
i1 i2     
a  c     5
   d     6
b  c     7
   d     8

df[df.index.get_level_values(0).isin(idx_to_keep)]
       col
i1 i2     
a  c     5
   d     6

1 Comment

Found a cleaner solution using the level parameter of isin: df = df[df.index.isin(idx_to_keep, level=0)]
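
For anyone comparing the two spellings, here is a minimal sketch that rebuilds the frame from the question and checks that the get_level_values() mask and the isin(..., level=0) mask agree. The variable names mask_values, mask_level and kept are just for illustration.

```python
import pandas as pd

# Same frame as in the question
index = pd.MultiIndex.from_tuples(
    [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")], names=["i1", "i2"]
)
df = pd.DataFrame({"col": [5, 6, 7, 8]}, index=index)
idx_to_keep = ["a"]

# Variant from the answer: pull out level 0, then test membership
mask_values = df.index.get_level_values(0).isin(idx_to_keep)

# Variant from the comment: let MultiIndex.isin look at level 0 directly
mask_level = df.index.isin(idx_to_keep, level=0)

# Either boolean mask can be used to filter the rows
kept = df[mask_level]
print(kept)
```

Both masks are plain boolean arrays, so they can also be combined with other conditions using & and | before indexing.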

You can just use loc:

df.loc[['a']]

The resulting output:

       col
i1 i2     
a  c     5
   d     6


You are looking for .xs:

df.xs('a', axis=0, level=0, drop_level=False)

Which gives:

       col
i1 i2     
a  c     5
   d     6
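
To make the effect of drop_level concrete, here is a small sketch on the frame from the question. The names dropped and kept are illustrative only.

```python
import pandas as pd

# Same frame as in the question
index = pd.MultiIndex.from_tuples(
    [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")], names=["i1", "i2"]
)
df = pd.DataFrame({"col": [5, 6, 7, 8]}, index=index)

# Default (drop_level=True): the selected level i1 is removed from the result
dropped = df.xs("a", level=0)

# drop_level=False: both index levels are kept, matching the desired output
kept = df.xs("a", level=0, drop_level=False)

print(list(dropped.index.names))
print(list(kept.index.names))
```

The rows selected are the same in both cases; only the shape of the resulting index differs.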

2 Comments

Also, to preserve index level 0, you can specify drop_level=False.
What if I want to keep more than just 'a' (both 'a' and 'b', for example)?
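
On keeping several labels: xs selects a single label per level, so one option is to fall back on the approaches from the other answers, which accept a list. A minimal sketch, assuming a hypothetical third level-0 label 'e' (not in the original question) so that the filter visibly drops something:

```python
import pandas as pd

# Hypothetical extension of the question's frame with an extra level-0 label 'e'
index = pd.MultiIndex.from_product(
    [["a", "b", "e"], ["c", "d"]], names=["i1", "i2"]
)
df = pd.DataFrame({"col": [5, 6, 7, 8, 9, 10]}, index=index)
idx_to_keep = ["a", "b"]

# Boolean mask on level i1 (the level can be given by name or by position)
by_mask = df[df.index.isin(idx_to_keep, level="i1")]

# Label-based selection: a list passed to loc is matched against level 0
by_loc = df.loc[idx_to_keep]

print(by_mask)
```

The mask version keeps the rows in their original order, while loc returns them in the order of the labels in the list, so the two can differ if idx_to_keep is not in index order.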
