python dataframe pandas drop column using int

Question

I understand that to drop a column you use df.drop('column name', axis=1). Is there a way to drop a column using a numerical index instead of the column name?

I figure this will not work for the reasons shown here: stackoverflow.com/questions/13411544/…
– John, Commented Nov 30, 2013 at 7:37

frederikf · Accepted Answer · 2016-02-16 10:48:10Z

276

You can delete the column at integer position i like this (drop returns a new DataFrame, so assign the result back or pass inplace=True):

df.drop(df.columns[i], axis=1)

This can behave unexpectedly if you have duplicate column names, because every column sharing that name is dropped. To avoid that, you can first rename the column you want to delete to a unique name. Or you can reassign the DataFrame like this:

df = df.iloc[:, [j for j, c in enumerate(df.columns) if j != i]]
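
As a minimal sketch, the iloc approach above can be wrapped in a small helper. The name drop_by_position and the sample frame are assumptions made for illustration; they are not part of pandas or of the original answer.

```python
import pandas as pd


def drop_by_position(df, i):
    """Return a new DataFrame without the column at integer position i."""
    # Keep every positional index except i; labels are never consulted.
    keep = [j for j in range(df.shape[1]) if j != i]
    return df.iloc[:, keep]


# Hypothetical sample data
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "c"])
result = drop_by_position(df, 1)
print(list(result.columns))
```

Because the selection is purely positional, this should behave the same whether or not the column names are unique.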

edited Feb 16, 2016 at 10:48

frederikf

33 bronze badges

answered Nov 30, 2013 at 15:06

roman

118k30 gold badges205 silver badges209 bronze badges

4 Comments

Darren Over a year ago

I think you missed the point - they want to drop by index, not by label. Converting index into a label is just dropping by label :(

mArk Over a year ago

How to index cols, if I have to drop 100 columns that are continuous in the middle of the data frame

Nick Over a year ago

The second technique using iloc works well given duplicate column names and is very performant. Thanks.

zkhan122 Over a year ago

In the first technique, you also need to either reassign the result or add inplace=True.

Vince · Accepted Answer · 2019-07-09 19:17:09Z

178

Drop multiple columns like this:

cols = [1, 2, 4, 5, 12]
df.drop(df.columns[cols], axis=1, inplace=True)

inplace=True modifies the DataFrame itself instead of returning a modified copy. If you need to keep the original intact, use:

df_after_dropping = df.drop(df.columns[cols], axis=1)
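
For a contiguous block of columns (the case asked about in the comments), df.columns can be sliced instead of listing every index. This is a sketch with hypothetical start and stop values, and it assumes the column names are unique, since drop still works by label.

```python
import pandas as pd

# Hypothetical frame with ten uniquely named columns
df = pd.DataFrame([list(range(10))], columns=[f"c{k}" for k in range(10)])

start, stop = 3, 7  # drop positions 3, 4, 5 and 6 (stop is exclusive)
trimmed = df.drop(columns=df.columns[start:stop])
print(list(trimmed.columns))
```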

edited Jul 9, 2019 at 19:17

Vince

3,4052 gold badges25 silver badges41 bronze badges

answered Oct 2, 2015 at 14:10

muon

14.2k13 gold badges74 silver badges94 bronze badges

5 Comments

sidpat Over a year ago

What is inplace argument for?

muon Over a year ago

If you do not use inplace=True, you have to write df = df.drop(...) to see the change in df itself.

mArk Over a year ago

How to index cols, if I have to drop 100 columns that are continuous in the middle of the data frame.

muon Over a year ago

you could do something like col_indices = [df.columns.tolist().index(c) for c in list_of_colnames]

Silidrone Over a year ago

@muon I can't believe you got +15 on that comment. It just goes to show how much noobs there are using python. I can't believe that someone gets to use pandas without knowing what an in place assignment is. Anyway, it is a useful comment, I am just very surprised.

Saeed · Accepted Answer · 2018-12-16 22:03:07Z

79

If there are multiple columns with identical names, the drop-based solutions given here so far will remove every column with that name, which may not be what you want. This matters, for example, when you are trying to remove duplicate columns while keeping one instance. The example below illustrates the situation:

# make a df with duplicate columns 'x'
df = pd.DataFrame({'x': range(5) , 'x':range(5), 'y':range(6, 11)}, columns = ['x', 'x', 'y']) 


df
Out[495]: 
   x  x   y
0  0  0   6
1  1  1   7
2  2  2   8
3  3  3   9
4  4  4  10

# attempting to drop the first column according to the solution offered so far     
df.drop(df.columns[0], axis = 1) 
   y
0  6
1  7
2  8
3  9
4  10

As you can see, both 'x' columns were dropped. An alternative solution:

column_numbers = list(range(df.shape[1]))  # list of the columns' integer indices

column_numbers.remove(0)  # remove column integer index 0
df.iloc[:, column_numbers]  # return all columns except the 0th column

   x  y
0  0  6
1  1  7
2  2  8
3  3  9
4  4  10

As you can see, this truly removed only the 0th column (first 'x').
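
The same idea can also be written with a boolean mask, which iloc accepts when its length matches the number of columns. The sketch below uses a small hypothetical frame built from a list of rows, and contrasts the mask with the label-based drop.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a duplicated column name 'x'
df = pd.DataFrame([[0, 0, 6], [1, 1, 7]], columns=["x", "x", "y"])

# Label-based drop: df.columns[0] is 'x', so both 'x' columns go
by_label = df.drop(columns=df.columns[0])

# Position-based mask: True means keep; only position 0 is switched off
mask = np.ones(df.shape[1], dtype=bool)
mask[0] = False
by_position = df.iloc[:, mask]

print(list(by_label.columns))
print(list(by_position.columns))
```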

edited Dec 16, 2018 at 22:03

answered Feb 7, 2018 at 19:29

Saeed

2,1511 gold badge22 silver badges30 bronze badges

5 Comments

ATK7474 Over a year ago

You're my hero. Was trying to think of a clever way to do this for way too long.

JDenman6 Over a year ago

This iloc solution is exactly what I was looking for. Dropping the first x columns becomes df = df.iloc[:, x:]. If you want to drop columns x through y, you could do something like this (sorting the kept indices so the column order is preserved):

all_cols = set(range(len(df.columns)))
keep_cols = all_cols - set(range(x, y + 1))
df = df.iloc[:, sorted(keep_cols)]

u-phoria Over a year ago

This answer deserves more upvotes as it handles duplicate column names correctly.

Saeed Over a year ago

@AlexandreHuat a CS Lord with less than 1500 points! ;) Thanks you, anyways

Alexandre Huat Over a year ago

Haha, I just wanted to brighten the day of someone

Cam · Accepted Answer · 2020-02-24 22:36:42Z

14

If you have two columns with the same name, one simple way is to rename the columns manually, like this:

df.columns = ['column1', 'column2', 'column3']

Then you can drop by column index as you requested, like this:

df.drop(df.columns[1], axis=1, inplace=True)

df.columns[1] is the label of the column at position 1, so this drops that column.

Remember axis 1 = columns and axis 0 = rows.

edited Feb 24, 2020 at 22:36

answered Jan 20, 2020 at 18:05

Cam

1,8651 gold badge23 silver badges34 bronze badges

sargupta · Accepted Answer · 2019-04-18 15:24:09Z

10

You need to identify the columns by their position in the DataFrame. For example, to drop the columns at positions 2, 3 and 5:

df.drop(df.columns[[2,3,5]], axis = 1)

answered Apr 18, 2019 at 15:24

sargupta

1,03316 silver badges28 bronze badges

ranaalisaeed · Accepted Answer · 2020-08-28 22:26:45Z

8

You can simply pass the columns parameter to df.drop, so you don't need to specify axis, like so:

columns_list = [1, 2, 4] # index numbers of columns you want to delete
df = df.drop(columns=df.columns[columns_list])

For reference see columns parameter here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html?highlight=drop#pandas.DataFrame.drop

answered Aug 28, 2020 at 22:26

ranaalisaeed

3534 silver badges16 bronze badges

mkln · Accepted Answer · 2013-11-30 09:17:38Z

4

If you really want to do it with integers (but why?), then you could build a dictionary.

col_dict = {x: col for x, col in enumerate(df.columns)}

then df = df.drop(col_dict[0], axis=1) will work as desired (axis must be passed as a keyword in recent pandas versions)

Edit: you can put it in a function that does this for you, though it rebuilds the dictionary every time you call it.

def drop_col_n(df, col_n_to_drop):
    # map integer position -> column label
    col_dict = {x: col for x, col in enumerate(df.columns)}
    return df.drop(col_dict[col_n_to_drop], axis=1)

df = drop_col_n(df, 2)

answered Nov 30, 2013 at 9:17

mkln

15.1k4 gold badges21 silver badges25 bronze badges

Suraj Kumar · Accepted Answer · 2019-02-26 04:48:21Z

3

You can use the following line to drop the first two columns (or any column you don't need):

df.drop([df.columns[0], df.columns[1]], axis=1)

edited Feb 26, 2019 at 4:48

Suraj Kumar

5,6748 gold badges24 silver badges45 bronze badges

answered Feb 26, 2019 at 4:44

Mojtaba Peyrovi

411 bronze badge

voldr · Accepted Answer · 2021-06-24 12:10:27Z

2

A good way to select the columns you want, regardless of duplicate names.

For example, suppose the column indices you want to drop are stored in a list-like variable:

unnecessary_cols = [1, 4, 5, 6]

then

import numpy as np
df.iloc[:, np.setdiff1d(np.arange(len(df.columns)), unnecessary_cols)]
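
A short, self-contained run of the line above, using a hypothetical eight-column frame. np.setdiff1d returns the sorted unique values of its first argument that are not in the second, so the kept columns stay in their original order.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with eight columns
df = pd.DataFrame([list(range(8))], columns=list("abcdefgh"))

unnecessary_cols = [1, 4, 5, 6]
# Positions to keep: all positions minus the unwanted ones
keep = np.setdiff1d(np.arange(len(df.columns)), unnecessary_cols)
result = df.iloc[:, keep]
print(list(result.columns))
```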

edited Jun 24, 2021 at 12:10

answered Jun 24, 2021 at 9:56

voldr

3831 silver badge11 bronze badges

Elias Mi · Accepted Answer · 2022-04-20 14:56:05Z

Appreciate I'm very late to the party, but I had the same issue with a DataFrame that has a MultiIndex. Pandas really doesn't like non-unique multi indices, to a degree that most of the solutions above don't work in that setting (e.g. the .drop function just errors with a ValueError: cannot handle a non-unique multi-index!)

The solution I arrived at was to use .iloc instead. According to the documentation, you can use iloc with a mask (a list of True/False values marking which columns you want to keep):

With a boolean array whose length matches the columns.

df.iloc[:, [True, False, True, False]]

Combined with df.columns.duplicated() to identify duplicated columns, you can do this in an efficient, pandas-native way:

df = df.iloc[:, ~df.columns.duplicated()]
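
A minimal sketch of that one-liner on a hypothetical frame with a flat (non-MultiIndex) column index, kept simple for illustration. duplicated() marks the second and later occurrences of each label, so negating it keeps the first occurrence of every name.

```python
import pandas as pd

# Hypothetical frame where the label 'a' appears twice
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "b", "a", "c"])

mask = ~df.columns.duplicated()  # True for the first occurrence of each label
deduped = df.iloc[:, mask]
print(list(deduped.columns))
```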

Thunder · Accepted Answer · 2019-01-08 09:17:17Z

-2

Since there can be multiple columns with the same name, we should first rename the columns. Here is the code for the solution.

df.columns = list(range(len(df.columns)))  # rename columns to their integer positions
df = df.drop(columns=[1, 2])  # drop the second and third columns

answered Jan 8, 2019 at 9:17

Thunder

11.1k27 gold badges90 silver badges119 bronze badges
