I am trying to find a more efficient way to add specific values in a pandas df.

For the df below, I want to add the integers in Value for each X and Y pair in Area. That is, each X value should be added to the Y value that follows it.

import pandas as pd

d = {
    'Area': ['X', 'Y', 'Z', 'X', 'Y', 'Z'],
    'Value': [10, 11, 20, 21, 30, 31],
}

df = pd.DataFrame(data=d)

If there aren't many values, I can add them manually like this:

x = df.iloc[0] + df.iloc[1]

But if the df is quite large, this becomes inefficient.

Intended Output:

21
51

2 Answers

Use boolean indexing to filter the X rows and the Y rows into separate Series, reset each to a default index so they align by position, then add them with Series.add:

s1 = df.loc[df['Area'].eq('X'), 'Value'].reset_index(drop=True)
s2 = df.loc[df['Area'].eq('Y'), 'Value'].reset_index(drop=True)

s = s1.add(s2)
print(s)
0    21
1    51
dtype: int64

An advantage of this solution is that the ordering of the X and Y rows does not matter: the n-th X value is always added to the n-th Y value.
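As a rough illustration of the ordering point, the sketch below reuses the question's data with the first Y row moved ahead of the first X row. It also shows one possible way to handle unequal counts via the fill_value argument of Series.add; the extra X row (value 5) and the names s1_extra and padded are hypothetical and not part of the original question.

```python
import pandas as pd

# Same values as the question, but the first Y row now precedes the first X row.
df = pd.DataFrame({
    'Area': ['Y', 'X', 'Z', 'X', 'Y', 'Z'],
    'Value': [11, 10, 20, 21, 30, 31],
})

s1 = df.loc[df['Area'].eq('X'), 'Value'].reset_index(drop=True)
s2 = df.loc[df['Area'].eq('Y'), 'Value'].reset_index(drop=True)

# The n-th X is added to the n-th Y regardless of which row comes first.
s = s1.add(s2)
print(s.tolist())  # [21, 51]

# Hypothetical edge case: one extra X row (value 5) with no matching Y.
s1_extra = pd.concat([s1, pd.Series([5])], ignore_index=True)

# A plain add leaves NaN at the unmatched position;
# fill_value=0 treats the missing Y as 0 instead.
padded = s1_extra.add(s2, fill_value=0)
print(padded.tolist())  # [21.0, 51.0, 5.0]
```

Note that the padded result is float, because the alignment step introduces a missing value before it is filled.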

Keep only the X and Y rows, reset the index, then group every two consecutive rows by integer-dividing the index by 2 and sum each group:

m = df[df.Area.isin(['X', 'Y'])].reset_index(drop=True)
print(m.groupby(m.index//2)['Value'].sum())

Output

0    21
1    51
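To make the grouping step more concrete, here is a small sketch that prints the intermediate group keys produced by the integer division. The variable names keys and result are introduced here for illustration only. It assumes, as in the question's data, that the filtered rows come in consecutive pairs.

```python
import pandas as pd

df = pd.DataFrame({
    'Area': ['X', 'Y', 'Z', 'X', 'Y', 'Z'],
    'Value': [10, 11, 20, 21, 30, 31],
})

# Keep only X and Y rows and renumber them 0, 1, 2, 3.
m = df[df['Area'].isin(['X', 'Y'])].reset_index(drop=True)

# Floor division by 2 maps the index 0, 1, 2, 3 to group keys 0, 0, 1, 1,
# so each consecutive pair of rows lands in the same group.
keys = m.index // 2
print(keys.tolist())  # [0, 0, 1, 1]

result = m.groupby(keys)['Value'].sum()
print(result.tolist())  # [21, 51]
```

Because the pairing is purely positional, a missing or repeated X or Y row would shift every later pair, so this approach is probably best suited to data that is known to be regular.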
