4

I cannot figure out how to drop a list of partially specified multi-level rows from a pandas DataFrame whose MultiIndex has three or more levels, without resorting to a for loop.

This works fine when every level of each index value is given explicitly, as in the answer to "Pandas Multiindex dataframe remove rows".

e.g.

mask = dfmi.index.isin(( ('A0','B0', 'C0'), ('A2','B3', 'C4') ))
dfmi.loc[~mask,:]

However, when I want to match every value of the third level:

dfmi.index.isin(( ('A0','B0', slice(None)), ('A2','B3', slice(None)) ))

the result is TypeError: unhashable type: 'slice'

Currently I am achieving this with the following code:

import numpy as np
import pandas as pd

def mklbl(prefix, n):
    return ["%s%s" % (prefix, i) for i in range(n)]

miindex = pd.MultiIndex.from_product([mklbl('A', 4),
                                   mklbl('B', 4),
                                   mklbl('C', 10)])

dfmi = pd.DataFrame(np.arange(len(miindex) * 2)
               .reshape((len(miindex), 2)),
                index=miindex).sort_index().sort_index(axis=1)

As = ['A0', 'A2']
Bs = ['B1', 'B3']

for a,b in zip(As, Bs):
    dfmi_drop_idx = dfmi.loc[(a, b, slice(None)), :].index
    dfmi.drop(dfmi_drop_idx, inplace=True, errors='ignore')

2 Comments

  • Are you looking to drop all rows with these particular index values, or just the rows corresponding to (A0, B1, :) and (A2, B3, :)? Because that's a bit unclear from your question. Commented Jun 20, 2019 at 13:56
  • I am looking to drop only the unique combination of rows (A0, B1, :) and (A2, B3, :). Obviously my full list of row combinations is very large, hence the need to avoid a loop. Commented Jun 20, 2019 at 14:02

2 Answers

3

Create a two-level MultiIndex from the two lists, then drop it:

dfmi.drop(pd.MultiIndex.from_arrays([As,Bs]))
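
For anyone who wants to check this on the question's data, here is a minimal sketch. It rebuilds the frame with f-strings instead of mklbl, and the names index, pairs and result are mine, not from the question. It assumes drop treats each two-level key as a partial key that matches every third-level value beneath it.

```python
import numpy as np
import pandas as pd

# Rebuild the 4 x 4 x 10 frame from the question.
index = pd.MultiIndex.from_product(
    [[f"A{i}" for i in range(4)],
     [f"B{i}" for i in range(4)],
     [f"C{i}" for i in range(10)]]
)
dfmi = pd.DataFrame(
    np.arange(len(index) * 2).reshape(len(index), 2), index=index
)

As = ["A0", "A2"]
Bs = ["B1", "B3"]

# Two-level keys: each (A, B) pair is a partial key on the 3-level index.
pairs = pd.MultiIndex.from_arrays([As, Bs])
result = dfmi.drop(pairs)  # returns a new frame; dfmi itself is unchanged

print(len(dfmi), len(result))
```

With two pairs and ten C values under each, the result should have 20 fewer rows than the original.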

3

Calling drop with a list of tuples should do the trick:

dfmi.drop([*zip(As, Bs)])

To verify, here is a modified version of your code. We'll compare the outputs to assert equality.

from functools import reduce
didx = reduce(
    pd.MultiIndex.union,
    [dfmi.loc[pd.IndexSlice[a, b, :], :].index
     for a, b in zip(As, Bs)]
)

assert dfmi.drop(didx).equals(dfmi.drop([*zip(As, Bs)]))
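
If you would rather keep the boolean-mask style from the question, one possible variant (not part of the approach above, and only a sketch) is to drop the third level from the index before calling isin, so that only the (A, B) part is compared. The frame is rebuilt here so the snippet runs on its own. The names pairs, mask, kept and dropped are mine.

```python
import numpy as np
import pandas as pd

# Same 4 x 4 x 10 frame as in the question.
index = pd.MultiIndex.from_product(
    [[f"A{i}" for i in range(4)],
     [f"B{i}" for i in range(4)],
     [f"C{i}" for i in range(10)]]
)
dfmi = pd.DataFrame(
    np.arange(len(index) * 2).reshape(len(index), 2), index=index
)

As = ["A0", "A2"]
Bs = ["B1", "B3"]
pairs = list(zip(As, Bs))

# Compare only the first two levels against the (A, B) pairs.
mask = dfmi.index.droplevel(2).isin(pairs)
kept = dfmi.loc[~mask, :]

# The drop-based version, for comparison.
dropped = dfmi.drop(pairs)

print(len(kept), len(dropped))
```

Both should keep the same rows. The mask form has the side benefit that you can combine it with other conditions before indexing.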

1 Comment

I might've misinterpreted the question.
