Suppose one has a dataframe created as such:

import pandas as pd

tdata = {('A', 50): [1, 2, 3, 4],
         ('A', 55): [5, 6, 7, 8],
         ('B', 10): [10, 20, 30, 40],
         ('B', 20): [50, 60, 70, 80],
         ('B', 50): [2, 4, 6, 8],
         ('B', 55): [10, 12, 14, 16]}
tdf = pd.DataFrame(tdata, index=range(0, 4))

      A      B
     50 55  10  20 50  55
   0  1  5  10  50  2  10
   1  2  6  20  60  4  12
   2  3  7  30  70  6  14
   3  4  8  40  80  8  16
  1. How would one drop specific columns, say ('B', 10) and ('B', 20) from the dataframe?
  2. Is there a way to drop the columns in one command such as tdf.drop(['B', [10,20]])? Note, I know that my example of the command is by no means close to what it should be, but I hope that it gets the gist across.
  3. Is there a way to drop the columns through some logical expression? For example, say I want to drop columns having the sublevel indices less than 50 (again, the 10, 20 columns). Can I do some general command that would encompass column 'A', even though the 10,20 sublevel indices don't exist or must I specifically reference column 'B'?
  • Can you explain more about what you mean by "Can I do some general command that would encompass column 'A', even though the 10,20 sublevel indices don't exist or must I specifically reference column 'B'?" Commented Mar 16, 2017 at 14:35
  • @jezrael Thanks for asking. I was wondering if I could do something like wildcarding the top levels 'A' and 'B' and go after the sublevels that I don't want, something like tdf.drop([:, [10,20]]). Commented Mar 16, 2017 at 15:38
  • I think not; it is not possible. Slicers can only select, not drop. Commented Mar 16, 2017 at 15:41

1 Answer

You can use drop with a list of tuples:

print (tdf.drop([('B',10), ('B',20)], axis=1))
   A     B    
  50 55 50  55
0  1  5  2  10
1  2  6  4  12
2  3  7  6  14
3  4  8  8  16

To remove columns by a condition on one level, build a boolean mask from that level's values and select with loc:

mask = tdf.columns.get_level_values(1) >= 50
print (mask)
[ True  True False False  True  True]

print (tdf.loc[:, mask])
   A     B    
  50 55 50  55
0  1  5  2  10
1  2  6  4  12
2  3  7  6  14
3  4  8  8  16
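
The same condition can also be phrased as a drop rather than a selection. The following is a minimal sketch, assuming the tdf from the question; the names to_drop and result are illustrative only. It inverts the mask to pick the columns whose second-level label is below 50 and passes them to drop as a list of tuples:

```python
import pandas as pd

tdata = {('A', 50): [1, 2, 3, 4],
         ('A', 55): [5, 6, 7, 8],
         ('B', 10): [10, 20, 30, 40],
         ('B', 20): [50, 60, 70, 80],
         ('B', 50): [2, 4, 6, 8],
         ('B', 55): [10, 12, 14, 16]}
tdf = pd.DataFrame(tdata, index=range(0, 4))

# Boolean mask over the second column level: True where the label is < 50.
below_50 = tdf.columns.get_level_values(1) < 50

# Column labels (as tuples) that match the condition.
to_drop = tdf.columns[below_50].tolist()

# Drop those columns; the top-level label is never named explicitly.
result = tdf.drop(columns=to_drop)
print(result)
```

Because the mask is computed over every column, 'A' is covered as well, even though none of its sublevels match.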

If you need to remove columns by the labels of a single level, you can specify just that level with the level parameter:

print (tdf.drop([50,55], axis=1, level=1))
    B    
   10  20
0  10  50
1  20  60
2  30  70
3  40  80
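
Applied to the labels from the question, the same call might look like the sketch below. It assumes the tdf from the question; the name dropped is illustrative only. Passing level=1 matches the labels 10 and 20 under every top-level column, so neither 'A' nor 'B' has to be referenced:

```python
import pandas as pd

tdata = {('A', 50): [1, 2, 3, 4],
         ('A', 55): [5, 6, 7, 8],
         ('B', 10): [10, 20, 30, 40],
         ('B', 20): [50, 60, 70, 80],
         ('B', 50): [2, 4, 6, 8],
         ('B', 55): [10, 12, 14, 16]}
tdf = pd.DataFrame(tdata, index=range(0, 4))

# Drop every column whose second-level label is 10 or 20,
# regardless of which top-level label it sits under.
dropped = tdf.drop([10, 20], axis=1, level=1)
print(dropped)
```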