I'm trying to modify the column names of a dataframe with a lot of columns. The column names are strings like:
'0000', '0005'...'0100'...'2355'
Since there are a large number of columns, I need to do this with iteration. The gist of the modification is that if a column name (string) starts with '0', it should be replaced by only its last 3 digits (all the strings have 4 digits).
So what I did was:
Put the column names on a list
df_cols = df.columns.tolist()
Then do the changes in the list through iteration
for i in range(len(df_cols)):
    if df_cols[i][0] == '0':
        df_cols[i] = df_cols[i][1:4]
When I check the list, the modifications have been made. However, when I try to use the modified list of column names (df_cols) in the dataframe:
df = df[df_cols]
I get an error message:
File "c:\users\hernan\anaconda\lib\site-packages\pandas\core\frame.py", line 1774, in __getitem__
return self._getitem_array(key)
File "c:\users\hernan\anaconda\lib\site-packages\pandas\core\frame.py", line 1818, in _getitem_array
indexer = self.ix._convert_to_indexer(key, axis=1)
File "c:\users\hernan\anaconda\lib\site-packages\pandas\core\indexing.py", line 1143, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])
KeyError: "['000' '001' '002' '003' '004' '005' '006' '007'....] not in index"
Thanks for the help
transformed_cols = ["{:03}".format(int(i)) for i in cols] for a fancy way to delete one leading zero :)
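On the KeyError itself: df[df_cols] selects columns by label, so pandas looks for existing columns named '000', '001', and so on, and there are none yet. To rename rather than select, the new labels can be assigned to df.columns. Here is a minimal sketch of that, assuming a small made-up dataframe (the sample data and the name new_cols are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical sample frame; the real one has many more columns.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['0000', '0005', '0100', '2355'])

# Drop one leading zero from every name that starts with '0'.
new_cols = [c[1:] if c.startswith('0') else c for c in df.columns]

# Assign the new labels to the frame instead of indexing with them.
df.columns = new_cols

print(df.columns.tolist())  # ['000', '005', '100', '2355']
```

The list built in the question's loop (df_cols) should work the same way with df.columns = df_cols. Passing the same rule as a function to df.rename(columns=...) is another option if you prefer not to overwrite the labels in place.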