In Python (Pandas-Numpy), how to modify column names (strings), using a condition and iteration?

Question

I'm trying to modify the column names of a dataframe with a lot of columns. The column names are strings like:

'0000', '0005'...'0100'...'2355'

Since there is a large number of columns, I need to do this with iteration. The gist of the modification is that if a column name (string) starts with '0', the new name should be only the last 3 digits of the string (all the strings have 4 digits).

So what I did was:

Put the column names in a list

df_cols = df.columns.tolist()

Then do the changes in the list through iteration

for i in range(len(df_cols)):
    if df_cols[i][0] == '0':
        df_cols[i] = df_cols[i][1:4]

When I check the list, the modifications have been made. However, when I try to use the modified list of column names (df_cols) to index the dataframe:

df = df[df_cols]

I get an error msg:

File "c:\users\hernan\anaconda\lib\site-packages\pandas\core\frame.py", line 1774, in __getitem__
return self._getitem_array(key)

File "c:\users\hernan\anaconda\lib\site-packages\pandas\core\frame.py", line 1818, in _getitem_array
indexer = self.ix._convert_to_indexer(key, axis=1)

File "c:\users\hernan\anaconda\lib\site-packages\pandas\core\indexing.py", line 1143, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])

KeyError: "['000' '001' '002' '003' '004' '005' '006' '007'....] not in index"

Thanks for the help

You can use transformed_cols = ["{:03}".format(int(i)) for i in cols] for a fancy way to delete one leading zero :)
– cel, Commented Feb 23, 2015 at 17:07
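
A minimal sketch of the one-liner from the comment above, run on a plain list so it needs no dataframe. The sample names in cols are made up for illustration. Note that int() drops every leading zero and the {:03} format then pads back to at least three digits, so this only matches the loop in the question because all names are exactly four digits long.

```python
# Hypothetical sample of column names in the 'HHMM' style from the question.
cols = ['0000', '0005', '0100', '2355']

# int() removes all leading zeros; "{:03}" pads the number back to a
# minimum width of three characters with zeros.
transformed_cols = ["{:03}".format(int(i)) for i in cols]

print(transformed_cols)
```

The resulting list still has to be assigned to df.columns before the dataframe sees the new names.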

Carsten · Accepted Answer · 2015-02-23 17:03:49Z

2

You have just changed the values of df_cols. You have to update your DataFrame's column names first before you can use them:

df.columns = df_cols
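
To illustrate the point, here is a small sketch with a made-up one-row dataframe (the data and the lookup_failed flag are assumptions for the example, not part of the question). Indexing with df[df_cols] looks columns up by label, so it fails while the dataframe still carries the old names; assigning the list to df.columns is what actually renames them.

```python
import pandas as pd

# Hypothetical sample frame with four of the 'HHMM'-style column names.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['0000', '0005', '0100', '2355'])

# Same loop as in the question: drop one leading zero from each name.
df_cols = df.columns.tolist()
for i in range(len(df_cols)):
    if df_cols[i][0] == '0':
        df_cols[i] = df_cols[i][1:4]

# Selecting by the new labels fails, because the dataframe
# does not have columns with those names yet.
try:
    df[df_cols]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Assigning the list renames the columns in place (same order, same length).
df.columns = df_cols
print(df.columns.tolist())
```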


Tarun Gaba · Accepted Answer · 2015-02-23 17:03:59Z

2

You are modifying a copy of the column names, not the DataFrame's actual column names. This should do it:

df_cols = df.columns.tolist()
for i in range(len(df_cols)):
    if df_cols[i][0] == '0':
        df_cols[i] = df_cols[i][1:4]

df.columns = df_cols  # Assign the modified column names back to the dataframe

Hope it helps .. :)
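
As a possible alternative to building the list by hand, pandas' DataFrame.rename accepts a function for its columns argument and applies it to every column name. This is a sketch rather than what either answer proposes; the sample data and the helper name strip_one_leading_zero are made up for illustration.

```python
import pandas as pd

# Hypothetical sample frame with four of the 'HHMM'-style column names.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['0000', '0005', '0100', '2355'])


def strip_one_leading_zero(name):
    """Return the name without its first character if that character is '0'."""
    return name[1:] if name.startswith('0') else name


# rename() returns a new dataframe by default; df itself keeps the old names.
renamed = df.rename(columns=strip_one_leading_zero)
print(renamed.columns.tolist())
```

Because rename returns a new dataframe, the result has to be assigned back (df = df.rename(...)) if the original variable should carry the new names.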
