arithmetic on pandas dataframe row-wise

Question

The DataFrame I have:

    A   B  C
a   1   2  3
b   2   1  4
c   1   1  1

The DataFrame I want:

    A   B  C
a   1   2  3
b   2   1  4
c   1   1  1
d   1  -1  1

I can get the desired DataFrame with:

df.loc['d']=df.loc['b']-df.loc['a']

However, my actual df has 'a','b','c' rows for multiple IDs 'X', 'Y' etc.

        A   B  C
  X a   1   2  3
    b   2   1  4
    c   1   1  1
  Y a   1   2  3
    b   2   2  4
    c   1   1  1

How can I create the same output with multiple IDs? My original method:

df.loc['d']=df.loc['b']-df.loc['a']

fails with KeyError: 'b'.

Desired output:

        A   B  C
  X a   1   2  3
    b   2   1  4
    c   1   1  1
    d   1  -1  1
  Y a   1   2  3
    b   2   2  4
    c   1   1  1
    d   1   0  1

It's a MultiIndex. You need to provide both the higher level and the lower level of the index to resolve the row.
– ifly6, Commented Aug 9, 2019 at 17:17
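
The comment above can be illustrated with a minimal sketch. The frame is rebuilt from the question's tables (with Y/b taken as 2 2 4, following the desired output), and the level names "ID" and "row" are assumptions added only for readability; the original frame may not name its levels.

```python
import pandas as pd

# Rebuild the question's frame; the level names "ID" and "row" are assumed.
index = pd.MultiIndex.from_product(
    [["X", "Y"], ["a", "b", "c"]], names=["ID", "row"]
)
df = pd.DataFrame(
    [[1, 2, 3], [2, 1, 4], [1, 1, 1], [1, 2, 3], [2, 2, 4], [1, 1, 1]],
    index=index,
    columns=["A", "B", "C"],
)

# A bare 'b' is looked up in the first level (X, Y), hence the KeyError.
try:
    df.loc["b"]
    raised = False
except KeyError:
    raised = True

# Supplying both levels as a tuple resolves exactly one row.
row_xb = df.loc[("X", "b")]

# xs() selects on the inner level alone and returns one row per ID.
all_b = df.xs("b", level="row")
```

Here `row_xb` is a single Series, while `all_b` is a frame indexed by ID, which is the shape needed to compute b - a for every ID at once.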

rafaelc · Accepted Answer · 2019-08-09 17:51:05Z

1

IIUC,

for i, sub in df.groupby(df.index.get_level_values(0)):
  df.loc[(i, 'd'), :] = sub.loc[(i,'b')] - sub.loc[(i, 'a')]

print(df.sort_index())

Or maybe

k = (df.groupby(df.index.get_level_values(0), as_index=False)
       .apply(lambda s: pd.DataFrame(
           [s.loc[(s.name, 'b')].values - s.loc[(s.name, 'a')].values],
           columns=s.columns,
           index=pd.MultiIndex(levels=[[s.name], ['d']], codes=[[0], [0]])))
       .reset_index(drop=True, level=0))

pd.concat([k, df]).sort_index()

answered Aug 9, 2019 at 17:51

Mark Wang · Accepted Answer · 2019-08-09 20:08:58Z

1

Reshaping the data is a useful trick when you want to operate on one level of a MultiIndex. See the code below:

result = (df.unstack(0).T
            .assign(d=lambda x:x.b-x.a)
            .stack()
            .unstack(0))
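
To see why the reshape works, the chain can be split into named steps. This is only a sketch: the frame is rebuilt from the question's tables (Y/b taken as 2 2 4, following the desired output), and the intermediate name `wide` is introduced here for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_product([["X", "Y"], ["a", "b", "c"]])
df = pd.DataFrame(
    [[1, 2, 3], [2, 1, 4], [1, 1, 1], [1, 2, 3], [2, 2, 4], [1, 1, 1]],
    index=index,
    columns=["A", "B", "C"],
)

# Step 1: move the ID level into the columns, then transpose so that
# a, b, c become ordinary columns and each row is a (column, ID) pair.
wide = df.unstack(0).T

# Step 2: with a and b as columns, the new row is plain column arithmetic.
wide = wide.assign(d=wide["b"] - wide["a"])

# Step 3: undo the reshape. stack() pushes a..d back into the index and
# unstack(0) turns the A/B/C level back into columns.
result = wide.stack().unstack(0)

print(result)
```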

edited Aug 9, 2019 at 20:08

answered Aug 9, 2019 at 20:02

Andy L. · Accepted Answer · 2019-08-09 23:35:35Z

0

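Use pd.IndexSlice to select the a and b rows, call diff, keep the b rows (which now hold b - a), and rename them to d. Finally, append the result to the original df. This relies on each a row sitting directly above its b row.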

idx = pd.IndexSlice
df1 = df.loc[idx[:,['a','b']],:].diff().loc[idx[:,'b'],:].rename({'b': 'd'})
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df2 = pd.concat([df, df1]).sort_index().astype(int)

Out[106]:
     A  B  C
X a  1  2  3
  b  2  1  4
  c  1  1  1
  d  1 -1  1
Y a  1  2  3
  b  2  2  4
  c  1  1  1
  d  1  0  1
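
A variant that is not taken from any of the answers: select the b and a rows with xs, subtract them, and concatenate. This is a sketch under the assumption that every ID has both an a and a b row; the frame is rebuilt from the question's tables (Y/b taken as 2 2 4, following the desired output).

```python
import pandas as pd

index = pd.MultiIndex.from_product([["X", "Y"], ["a", "b", "c"]])
df = pd.DataFrame(
    [[1, 2, 3], [2, 1, 4], [1, 1, 1], [1, 2, 3], [2, 2, 4], [1, 1, 1]],
    index=index,
    columns=["A", "B", "C"],
)

# b - a for every ID at once; both operands are indexed by ID only,
# so pandas aligns them row by row.
d = df.xs("b", level=1) - df.xs("a", level=1)

# Re-attach an inner level that carries the new label 'd'.
d.index = pd.MultiIndex.from_product([d.index, ["d"]])

# Add the new rows and restore the a, b, c, d order within each ID.
out = pd.concat([df, d]).sort_index()

print(out)
```

Because the subtraction aligns on ID rather than on row position, this does not depend on a and b being adjacent in the frame.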

answered Aug 9, 2019 at 23:35
