0

I have a dataframe as shown below:

enter image description here

I want to add a new row based on the following logic:

  1. Add a new row with "location" as "Stage Area"
  2. This row is a sum of the entries where 'location' is "Reply's Area - New Commercial Area" and entries where 'location' is "Cultural Hub".
  3. Drop the rows with 'location' as "Reply's Area - New Commercial Area" and "Cultural Hub"

So for 11th November 2020 I should have the below entry:

enter image description here

1

3 Answers 3

1

Jezrael looks like he was close with the answer, but maybe the aggregation on football won't be correct... just from looking at his code, so I might be wrong.

The correct version would look like this, and this matching the figures you suggested in your example. I made a smaller version of your example table for testing. Here "data" is your dataframe.

mask = data["location"].isin(["Repley's Area - New Commercial Area", "Cultural Hub"])
data[mask].groupby(["day","locationTypes"], as_index=False)['dwell', 'football'].sum().assign(location="Stage Area")

The output:

          day locationTypes  dwell  football    location
0  2020-11-11          Zone    145      2307  Stage Area
1  2020-11-12          Zone     95      2905  Stage Area
Sign up to request clarification or add additional context in comments.

Comments

1

Use Series.isin for filter by multiple values, aggregate sum add column location and last add to original DataFrame without matched rows by mask:

mask = df['location'].isin(["Reply's Area - New Commercial Area", 'Cultural Hub'])

df1 = (df[mask].groupby(['day','locationTypes'],as_index=False)[['dwell', 'football']]
              .sum()
              .assign(location = 'Stage Area')
              .reindex(df.columns, axis=1))

df = pd.concat([df[~mask], df1], ignore_index=True)

Comments

0

Thanks for the responses! The following worked:

mask=df[df['location'].isin(["Repley's Area - New Commercial Area",'Cultural Hub'])]

df1=mask.groupby(['day','locationTypes'],as_index=False)['footfall','dwell (minutes)'].sum().assign(location='Stage Area')

#reordering the columns for pd.concat
df1= df1[df.columns]

df_final=pd.concat([df[~df['location'].isin(["Repley's Area - New Commercial Area",'Cultural Hub'])],df1]) 

#checking the result
df_final[(df_final['day']=='2020-11-11') & (df_final['location']=='Stage Area')]

#which gives

enter image description here

1 Comment

Oki, added reindex for change order of column

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.