Adding rows based on condition - Dataframe

Question

I have a dataframe as shown below:

I want to add a new row based on the following logic:

Add a new row with "location" as "Stage Area"
This row is a sum of the entries where 'location' is "Reply's Area - New Commercial Area" and entries where 'location' is "Cultural Hub".
Drop the rows with 'location' as "Reply's Area - New Commercial Area" and "Cultural Hub"

So for 11th November 2020 I should have the below entry:

Please don't post images of code/data (or links to them)

– jezrael, Commented Nov 24, 2020 at 11:26

Fraf · Accepted Answer · 2020-11-24 11:48:20Z

1

Jezrael's answer looks close, but the aggregation on football may not be correct. I am only judging from reading his code, so I might be wrong.

A version that works looks like this, and it matches the figures you gave in your example. I made a smaller version of your example table for testing. Here "data" is your dataframe.

mask = data["location"].isin(["Repley's Area - New Commercial Area", "Cultural Hub"])
data[mask].groupby(["day", "locationTypes"], as_index=False)[["dwell", "football"]].sum().assign(location="Stage Area")

The output:

          day locationTypes  dwell  football    location
0  2020-11-11          Zone    145      2307  Stage Area
1  2020-11-12          Zone     95      2905  Stage Area

jezrael · Accepted Answer · 2020-11-24 11:57:05Z

1

Use Series.isin to filter by multiple values, aggregate with sum, add the location column, and finally concatenate the result with the rows of the original DataFrame that the mask did not match:

mask = df['location'].isin(["Reply's Area - New Commercial Area", 'Cultural Hub'])

df1 = (df[mask].groupby(['day','locationTypes'],as_index=False)[['dwell', 'football']]
              .sum()
              .assign(location = 'Stage Area')
              .reindex(df.columns, axis=1))

df = pd.concat([df[~mask], df1], ignore_index=True)
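As a minimal end-to-end sketch of the steps above, the following runs the same mask, groupby and concat on a small made-up table. The input values, the extra "Car Park" row and the exact column set are assumptions for illustration only, not the asker's data.

```python
import pandas as pd

# Hypothetical sample data; every value here is invented for illustration.
df = pd.DataFrame({
    "day": ["2020-11-11", "2020-11-11", "2020-11-11", "2020-11-12", "2020-11-12"],
    "location": ["Reply's Area - New Commercial Area", "Cultural Hub", "Car Park",
                 "Reply's Area - New Commercial Area", "Cultural Hub"],
    "locationTypes": ["Zone", "Zone", "Zone", "Zone", "Zone"],
    "dwell": [100, 45, 10, 60, 35],
    "football": [2000, 307, 50, 2500, 405],
})

# True for the rows that should be merged into "Stage Area".
mask = df["location"].isin(["Reply's Area - New Commercial Area", "Cultural Hub"])

# Sum the matched rows per day and type, label them, and restore column order.
stage = (
    df[mask]
    .groupby(["day", "locationTypes"], as_index=False)[["dwell", "football"]]
    .sum()
    .assign(location="Stage Area")
    .reindex(df.columns, axis=1)
)

# Keep the unmatched rows and append the aggregated "Stage Area" rows.
result = pd.concat([df[~mask], stage], ignore_index=True)
print(result)
```

With this input, the "Car Park" row is kept unchanged and one "Stage Area" row per day replaces the two merged locations.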

edited Nov 24, 2020 at 11:57

answered Nov 24, 2020 at 11:26

Aastha Jha · Accepted Answer · 2020-11-24 12:00:52Z

0

Thanks for the responses! The following worked:

# rows that will be merged into "Stage Area"
mask = df[df['location'].isin(["Repley's Area - New Commercial Area", 'Cultural Hub'])]

df1 = mask.groupby(['day', 'locationTypes'], as_index=False)[['footfall', 'dwell (minutes)']].sum().assign(location='Stage Area')

# reordering the columns for pd.concat
df1 = df1[df.columns]

df_final = pd.concat([df[~df['location'].isin(["Repley's Area - New Commercial Area", 'Cultural Hub'])], df1])

#checking the result
df_final[(df_final['day']=='2020-11-11') & (df_final['location']=='Stage Area')]

#which gives

1 Comment

jezrael Over a year ago

OK, I added reindex to change the order of the columns.
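
For context on the two ways of reordering columns used in this thread, here is a small sketch assuming pandas 1.0 or later. The frames and the "notes" column are hypothetical. It shows that reindex fills a column missing from the aggregated frame with NaN, whereas plain column selection raises a KeyError.

```python
import pandas as pd

# Hypothetical frames: 'agg' lacks the 'notes' column that 'template' has.
template = pd.DataFrame(columns=["day", "location", "dwell", "notes"])
agg = pd.DataFrame({"dwell": [145], "day": ["2020-11-11"], "location": ["Stage Area"]})

# reindex reorders the columns and creates the missing 'notes' column as NaN.
reordered = agg.reindex(template.columns, axis=1)
print(list(reordered.columns))

# Plain selection fails because 'notes' is not a column of 'agg'.
try:
    agg[template.columns]
    selection_failed = False
except KeyError:
    selection_failed = True
print(selection_failed)
```

If the aggregated frame is known to contain every column of the original, both approaches give the same column order.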
