I understand that to drop a column you use df.drop('column name', axis=1, inplace=True)

The data comes from a .csv file.

I want to apply this syntax to large data sets in a more robust way.

Suppose I have 500 columns and want to keep only columns 100 to 140, selecting them by column name rather than by index, and drop the rest. How would I write that? In addition, within columns 100 to 140, I want to drop columns 105, 108 and 110, again by column name.

Why are you referring to the columns' positions if you want to drop them by name? Commented Aug 11, 2022 at 8:07

2 Answers

4
df = df.loc[:, 'col_100_name' : 'col_140_name']

.loc slices by label and includes both endpoints. Here I am selecting all rows and only the columns that you want to keep, by name.

After this (or before, it doesn't matter) you can drop the other columns by name as usual:

df.drop(['col_105_name', 'col_108_name', 'col_110_name'], axis=1, inplace=True)
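To make the two steps above concrete, here is a minimal sketch on a synthetic frame. The column names col_001 ... col_500 are assumptions standing in for your real names, and the single row of data is made up purely for illustration.

```python
import pandas as pd

# Hypothetical frame: 500 columns named col_001 ... col_500 (assumed names).
names = [f"col_{i:03d}" for i in range(1, 501)]
df = pd.DataFrame([list(range(500))], columns=names)

# Label-based slice: both 'col_100' and 'col_140' are included.
kept = df.loc[:, "col_100":"col_140"]

# Drop three columns by name; without inplace=True, drop returns a new frame.
kept = kept.drop(columns=["col_105", "col_108", "col_110"])

print(kept.shape)  # (1, 38): 41 columns in the slice minus the 3 dropped
```

Assigning the result of drop instead of using inplace=True is just a style choice in this sketch; either form removes the same columns.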

If you wish to select columns using a combination of slices and explicit column names:

cols_in_the_slice = df.loc[:, 'col_100_name' : 'col_140_name'].columns
other_cols = pd.Index(['col_02_name', 'col_04_name'])
all_cols = other_cols.union(cols_in_the_slice , sort=False)

df = df[all_cols]

union appends the elements of cols_in_the_slice that are not already in other_cols to the end of other_cols. It sorts by default, so I pass sort=False to keep the existing order. Then we select all of these columns.

By the way, at this point you can also remove any column names you don't wish to keep, before selecting.

You can use .drop if you know the column names, or .delete if you know their positions in this index:
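A small sketch of how the union behaves when one of the explicit names also falls inside the slice. The ten-column frame and its names are assumptions chosen only to keep the example short.

```python
import pandas as pd

# Hypothetical frame with columns col_01 ... col_10 (assumed names).
names = [f"col_{i:02d}" for i in range(1, 11)]
df = pd.DataFrame([list(range(10))], columns=names)

cols_in_the_slice = df.loc[:, "col_04":"col_06"].columns
other_cols = pd.Index(["col_02", "col_04"])  # col_04 is also in the slice

# sort=False keeps other_cols first, then the slice columns not seen yet.
all_cols = other_cols.union(cols_in_the_slice, sort=False)
subset = df[all_cols]

print(list(subset.columns))  # ['col_02', 'col_04', 'col_05', 'col_06']
```

Note that col_04 appears only once in the result, because union keeps only the elements it has not already encountered.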

cols_in_the_slice = cols_in_the_slice.drop(['col_105_name', 'col_108_name', 'col_110_name'])
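As a quick illustration of the difference between the two methods, the sketch below removes the same entries from an index once by label and once by position. The four column names are assumptions.

```python
import pandas as pd

# Hypothetical column index (assumed names).
cols = pd.Index(["col_100", "col_101", "col_102", "col_103"])

# Index.drop removes entries by label; a missing label raises KeyError
# unless errors="ignore" is passed.
by_name = cols.drop(["col_101", "col_103"])

# Index.delete removes entries by integer position within this index.
by_position = cols.delete([1, 3])

print(list(by_name))      # ['col_100', 'col_102']
print(list(by_position))  # ['col_100', 'col_102']
```

Both calls return a new Index and leave cols unchanged, so the result has to be assigned back, as in the line above.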

I also recommend taking a look at the Pandas User Guide on Indexing and selecting data.

3 Comments

When I use df = df.loc[:, 'col_02_name', 'col_04_name', 'col_100_name' : 'col_140_name'] it fails with a "too many indexers" error. Basically, I also want to keep two more individual columns along with columns 100 to 140.
Yes, too many indexers: .loc on a DataFrame accepts one row indexer and one column indexer, and a slice can have only one 'from' and one 'to' endpoint.
Very impressed with your way of answering, thanks.
0

Instead of passing a single string as the column name, pass a list of strings referring to the columns you want to delete.
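A minimal sketch of the list form, on a made-up five-column frame (the names a to e are assumptions). The second half shows one possible way to build the drop list from a list of columns to keep, using Index.difference; that is an illustration, not something stated in this answer.

```python
import pandas as pd

# Hypothetical frame (assumed column names).
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["a", "b", "c", "d", "e"])

# Pass a list of names to drop several columns at once.
to_drop = ["b", "d"]
trimmed = df.drop(columns=to_drop)

# Inverse case: start from the names to keep and drop everything else.
keep = ["a", "c"]
only_kept = df.drop(columns=df.columns.difference(keep, sort=False))

print(list(trimmed.columns))    # ['a', 'c', 'e']
print(list(only_kept.columns))  # ['a', 'c']
```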
