python dataframe pandas drop multiple column using column name

Question

I understand that to drop a column you use df.drop('column name', axis=1, inplace=True)

The data comes from a .csv file.

I want to use the above syntax on large data sets, in a more robust way.

Suppose I have 500 columns and I want to keep columns 100 to 140, selecting them by column name rather than by index, and drop the rest. How would I write the above syntax to achieve this? Also, within columns 100 to 140, I want to drop columns 105, 108 and 110 by column name.

Why are you referring to the columns' positions if you want to drop them by name?
– Celius Stingher, Commented Aug 11, 2022 at 8:07

Vladimir Fokow · Accepted Answer · 2022-08-11 11:42:55Z

4

df = df.loc[:, 'col_100_name' : 'col_140_name']

Label-based slicing with .loc includes both endpoints. Here I am selecting all rows, and only the columns that you want to keep (by name).

After this (or before - it doesn't matter) you can drop the other columns by names as usual:

df.drop(['col_105_name', 'col_108_name', 'col_110_name'], axis=1, inplace=True)
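As a rough illustration of the two steps above, here is a minimal sketch on a small hypothetical frame (ten columns named col_1 to col_10 stand in for the 500 real ones; these names are made up for the example):

```python
import pandas as pd

# Hypothetical frame: one row, columns col_1 ... col_10.
df = pd.DataFrame([range(10)], columns=[f"col_{i}" for i in range(1, 11)])

# A label slice with .loc includes BOTH endpoints: col_3 through col_7.
kept = df.loc[:, "col_3":"col_7"]

# Drop a few columns inside the kept range, by name.
kept = kept.drop(columns=["col_4", "col_6"])

print(list(kept.columns))  # ['col_3', 'col_5', 'col_7']
```

Here drop(columns=...) is used instead of axis=1 with inplace=True; both spellings remove the same columns, this one just returns a new frame.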

If you wish to select columns using a combination of slices and explicit column names:

cols_in_the_slice = df.loc[:, 'col_100_name' : 'col_140_name'].columns
other_cols = pd.Index(['col_02_name', 'col_04_name'])
all_cols = other_cols.union(cols_in_the_slice, sort=False)

df = df[all_cols]

Union appends the new (not yet encountered) elements of cols_in_the_slice to the end of other_cols. By default it tries to sort the result, so I pass sort=False to keep the existing order. Then we select all of these columns.
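To see how the ordering works out, here is a small sketch with hypothetical column names (col_1 to col_10). One of the explicitly listed names, col_7, also falls inside the slice, to show that it is not added a second time:

```python
import pandas as pd

# Hypothetical frame: one row, columns col_1 ... col_10.
df = pd.DataFrame([range(10)], columns=[f"col_{i}" for i in range(1, 11)])

cols_in_the_slice = df.loc[:, "col_6":"col_8"].columns  # col_6, col_7, col_8
other_cols = pd.Index(["col_2", "col_7"])               # col_7 overlaps the slice

# sort=False keeps other_cols first, then appends only the names not seen yet.
all_cols = other_cols.union(cols_in_the_slice, sort=False)

selected = df[all_cols]
print(list(selected.columns))
```

Note that the resulting column order follows other_cols first, so it can differ from the order in the original frame.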

By the way, at this step you can also remove the column names you don't want to keep.

Use .drop if you know the column names, or .delete if you know their positions in this index:

cols_in_the_slice = cols_in_the_slice.drop(['col_105_name', 'col_108_name', 'col_110_name'])
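A short sketch of the difference between the two Index methods, using made-up names: Index.drop removes entries by label, while Index.delete removes them by integer position.

```python
import pandas as pd

# Hypothetical column index.
cols = pd.Index(["col_a", "col_b", "col_c", "col_d"])

by_name = cols.drop(["col_b", "col_d"])  # remove by label
by_position = cols.delete([1, 3])        # remove by integer position

print(list(by_name))      # ['col_a', 'col_c']
print(list(by_position))  # ['col_a', 'col_c']
```

Both return a new Index and leave cols unchanged.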

I also recommend taking a look at Pandas User Guide on Indexing and selecting data.

edited Aug 11, 2022 at 11:42

answered Aug 11, 2022 at 8:07

Vladimir Fokow


3 Comments

Praveen Rai Over a year ago

When I use df = df.loc[:, 'col_02_name', 'col_04_name', 'col_100_name' : 'col_140_name'] it says "too many indexes". Basically, I also want to keep 2 more individual columns along with columns 100 to 140.

Vladimir Fokow Over a year ago

Yes - too many indices, because there can be only one 'from' and one 'to' endpoint.

Praveen Rai Over a year ago

Very impressed with your way of answering, thanks.

Benjamin Rio · Accepted Answer · 2022-08-11 08:02:04Z

0

Instead of passing a single string for the column name, pass a list of strings naming the columns you want to delete.
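One possible way to build such a list when it is easier to say which columns to keep, sketched on a small hypothetical frame (the names a to e and the keep list are assumptions for the example):

```python
import pandas as pd

# Hypothetical frame with five columns.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4], "e": [5]})

keep = ["b", "d"]  # names to keep; everything else is dropped

# Build the list of names to delete, preserving the frame's column order.
to_drop = [name for name in df.columns if name not in keep]

result = df.drop(columns=to_drop)
print(list(result.columns))  # ['b', 'd']
```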

answered Aug 11, 2022 at 8:02

Benjamin Rio
