I'm creating an additional column "Total_Count" to store the cumulative count per Site, based on the Count_Record column. My code for the cumulative total is almost done. However, the Total_Count values are shifted for a specific Card, as shown below. Could someone help me fix the code? Thank you!
Expected Output:
Current Output:
My Code:
import pandas as pd

df1 = pd.DataFrame(columns=['site', 'card', 'date', 'count_record'],
                   data=[['A', 'C1', '12-Oct', 5],
                         ['A', 'C1', '13-Oct', 10],
                         ['A', 'C1', '14-Oct', 18],
                         ['A', 'C1', '15-Oct', 21],
                         ['A', 'C1', '16-Oct', 29],
                         ['B', 'C2', '12-Oct', 11],
                         ['A', 'C2', '13-Oct', 2],
                         ['A', 'C2', '14-Oct', 7],
                         ['A', 'C2', '15-Oct', 13],
                         ['B', 'C2', '16-Oct', 4]])

df_append_temp = []
total = 0
preCard = ''
preSite = ''
preCount = 0

for pc in df1['card'].unique():
    df2 = df1[df1['card'] == pc].sort_values(['date'])
    total = 0
    for i in range(0, len(df2)):
        site = df2.iloc[i]['site']
        count = df2.iloc[i]['count_record']
        if site == preSite:
            total += (count - preCount)
        else:
            total += count
        preCount = count
        preSite = site
        df2.loc[i, 'Total_Count'] = total  # something wrong using loc here
    df_append_temp.append(df2)

df3 = pd.DataFrame(pd.concat(df_append_temp), columns=df2.columns)
df3


`i` is an integer location. `loc` is label-based location. You should use `iloc` as above: `df2.iloc[i, df2.columns.get_loc('Total_Count')] = total`. You'd need to initialize the `Total_Count` column first though with that approach.

After filtering with `df1[df1['card'] == pc]`, you're not guaranteed to get a contiguous default range index from 0 to length - 1.
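
Here is a minimal sketch of what the comments above suggest, keeping the question's loop logic unchanged and only switching the write to positional indexing. The `.copy()` call, the initial value `0` for `Total_Count`, and the renamed variables (`frames`, `pre_site`, `pre_count`, `col`) are my own assumptions, not part of the original code. Since the expected output isn't shown, this only demonstrates that the values land on the intended rows, not that the running-total logic itself is what you want.

```python
import pandas as pd

df1 = pd.DataFrame(columns=['site', 'card', 'date', 'count_record'],
                   data=[['A', 'C1', '12-Oct', 5],
                         ['A', 'C1', '13-Oct', 10],
                         ['A', 'C1', '14-Oct', 18],
                         ['A', 'C1', '15-Oct', 21],
                         ['A', 'C1', '16-Oct', 29],
                         ['B', 'C2', '12-Oct', 11],
                         ['A', 'C2', '13-Oct', 2],
                         ['A', 'C2', '14-Oct', 7],
                         ['A', 'C2', '15-Oct', 13],
                         ['B', 'C2', '16-Oct', 4]])

frames = []
pre_site = ''
pre_count = 0

for pc in df1['card'].unique():
    # The filtered frame keeps the original labels (5..9 for card C2),
    # so label-based df2.loc[i, ...] with i in 0..4 would add new rows.
    # .copy() (an assumption) avoids writing into a view of df1.
    df2 = df1[df1['card'] == pc].sort_values(['date']).copy()

    # Initialize the column first so it has a position for iloc.
    df2['Total_Count'] = 0
    col = df2.columns.get_loc('Total_Count')

    total = 0
    for i in range(len(df2)):
        site = df2.iloc[i]['site']
        count = df2.iloc[i]['count_record']
        if site == pre_site:
            total += count - pre_count
        else:
            total += count
        pre_count = count
        pre_site = site
        # Positional write: row i, column position col.
        df2.iloc[i, col] = total
    frames.append(df2)

df3 = pd.concat(frames)
print(df3)
```

With this change `df3` keeps exactly the ten original rows with index labels 0 to 9, and no extra rows are appended for the second card. Another option along the same lines would be calling `reset_index(drop=True)` on `df2` so that labels and positions coincide, at the cost of losing the original index.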