Index columns disappeared after lambda function in Pandas

Question

I wanted to calculate the percentage of an hour ('time') taken up by some object, so I wrote a lambda function. I think it does the job, but the index columns (the columns the DataFrame is grouped by) disappeared.

df = df.groupby(['id', 'name', 'time', 'object', 'type'], as_index=True, sort=False)['col1', 'col2', 'col3', 'col4', 'col5'].apply(lambda x: x * 100 / 3600).reset_index()

After running that code I printed df.columns and got this:

Index([u'index', u'col1', u'col2', u'col3',
       u'col4', u'col5'],
      dtype='object')

If needed, I can add a table with sample values for each column. Thanks in advance.

Please provide the data you are using, or a sample of it, so that we can test it.
– piRSquared, Commented May 11, 2018 at 13:17

Ami Tavory · Accepted Answer · 2018-05-11 13:31:42Z

3

Moving the loop outward, so that it runs over the columns rather than over the groups, will make the code run significantly faster:

for c in ['col1', 'col2', 'col3', 'col4', 'col5']:
    df[c] *= 100. / 3600

This is because each iteration's calculation is performed on a whole column at once, in a vectorized way.

This also won't modify the index in any way.
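
As a quick check of that last point, here is a minimal sketch on made-up data. The values and the choice to put id and name in the index are assumptions for illustration, not the asker's actual frame:

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are invented.
df = pd.DataFrame({
    'id': [1, 2, 1],
    'name': ['A', 'B', 'A'],
    'col1': [5.0, 6.0, 8.0],
    'col2': [9.0, 4.0, 5.0],
}).set_index(['id', 'name'])

# Keep a copy of the index so it can be compared after the scaling.
index_before = df.index.copy()

# Scale each value column as a whole; no groupby or apply is involved.
for c in ['col1', 'col2']:
    df[c] *= 100. / 3600

# True if the (id, name) index came through the loop untouched.
index_unchanged = df.index.equals(index_before)
print(index_unchanged)
print(df)
```

Since only the value columns are reassigned, whatever index the frame had before the loop is still there afterwards.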

jpp · Accepted Answer · 2018-05-11 13:27:40Z

2

pd.DataFrame.groupby is used to aggregate data, not to apply a function to multiple columns.

For simple functions, you should look for a vectorised solution. For example:

import pandas as pd

# set up simple dataframe
df = pd.DataFrame({'id': [1, 2, 1], 'name': ['A', 'B', 'A'],
                   'col1': [5, 6, 8], 'col2': [9, 4, 5]})

# apply logic in a vectorised way on multiple columns
df[['col1', 'col2']] = df[['col1', 'col2']].values * 100 / 3600

If you wish to use multiple columns as your index, and are keen to use pd.DataFrame.apply, this is possible as two separate steps. For example:

df = df.set_index(['id', 'name'])
df[['col1', 'col2']] = df[['col1', 'col2']].apply(lambda x: x * 100 / 3600)
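
To see where the key columns end up with this two-step approach, the following sketch rebuilds the small frame from above and inspects the result. The variable names indexed and flat are introduced here purely for illustration:

```python
import pandas as pd

# Same small, made-up frame as in the example above.
df = pd.DataFrame({'id': [1, 2, 1], 'name': ['A', 'B', 'A'],
                   'col1': [5, 6, 8], 'col2': [9, 4, 5]})

# Step 1: move the key columns into the index.
indexed = df.set_index(['id', 'name'])

# Step 2: scale only the value columns.
indexed[['col1', 'col2']] = indexed[['col1', 'col2']].apply(lambda x: x * 100 / 3600)

index_names = list(indexed.index.names)   # the keys now live in the index
value_columns = list(indexed.columns)     # only the value columns remain as columns

# reset_index() moves the named index levels back into ordinary columns.
flat = indexed.reset_index()
flat_columns = list(flat.columns)

print(index_names, value_columns, flat_columns)
```

Because the index levels are named id and name, calling reset_index() afterwards should bring them back under those names rather than as a generic index column.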

JE_Muc · Accepted Answer · 2018-05-11 13:27:50Z

1

You call .reset_index(), which resets the index. Take a look at the pandas documentation and you'll see that .reset_index() moves the index into the columns.
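
A small sketch of the two cases, on made-up data, may help. The column named index in the question's output suggests the frame had a plain, unnamed index when .reset_index() ran; that is an inference from the printed columns, not something checked against the asker's pandas version:

```python
import pandas as pd

# Hypothetical sample data; the values are invented.
df = pd.DataFrame({'id': [1, 2], 'name': ['A', 'B'], 'col1': [5.0, 6.0]})

# Case 1: the frame still has its default, unnamed index.
# reset_index() turns that index into a new column called 'index'.
from_default = df[['col1']].reset_index()
default_columns = list(from_default.columns)

# Case 2: the key columns were moved into the index first.
# reset_index() restores them as columns under their own names.
from_keys = df.set_index(['id', 'name']).reset_index()
key_columns = list(from_keys.columns)

print(default_columns)
print(key_columns)
```

In the first case the id and name columns are simply not part of the selected frame, so resetting the index cannot bring them back.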

BENY · Accepted Answer · 2018-05-11 13:33:42Z

1

Using the data from jpp's answer:

df[['col1','col2']]*=100/3600
df
Out[110]: 
       col1      col2  id name
0  0.138889  0.250000   1    A
1  0.166667  0.111111   2    B
2  0.222222  0.138889   1    A
