My dataframe:
ordercode quantity
PMC21-AA1U1FBWBJA 1
PMP23-GR1M1FB3CJ 1
PMC11-AA1U1FJWWJA 1
PMC11-AA1U1FBWWJA+I7 2
PMC11-AA1U1FJWWJA 3
PMC11-AA1L1FJWWJA 3
My desired output:
Group ordercode quantity
0 PMC21-AA1U1FBWBJA 1
PMP23-GR1M1FB3CJ 1
PMC11-AA1U1FJWWJA 1
PMC11-AA1U1FBWWJA+I7 1
1 PMC11-AA1U1FBWWJA+I7 1
PMC11-AA1U1FJWWJA 3
2 PMC11-AA1L1FJWWJA 3
My desired result is based on the quantity column: the quantities in each group should sum to at most 4.
In group 0 and group 1 the totals are 1+1+1+1=4 and 1+3=4, so each reaches the maximum of 4. For group 2 there is nothing left to add, so it is formed from the leftover (here, 3). Note that the quantity of PMC11-AA1U1FBWWJA+I7 (2) is split between group 0 and group 1.
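To make the rule concrete, here is a rough plain-Python sketch of the splitting I have in mind. The function name split_into_groups and the cap parameter are placeholders of my own, not anything from pandas, and it works on (ordercode, quantity) pairs rather than on the dataframe itself:

```python
def split_into_groups(pairs, cap=4):
    """Greedily fill groups up to `cap`, splitting a row when it overflows."""
    rows = []
    group, room = 0, cap          # current group label and its remaining capacity
    for code, qty in pairs:
        qty = int(qty)
        while qty > 0:
            take = min(qty, room)  # as much of this row as still fits
            rows.append((group, code, take))
            qty -= take
            room -= take
            if room == 0:          # group is full, start the next one
                group += 1
                room = cap
    return rows


pairs = [
    ("PMC21-AA1U1FBWBJA", 1),
    ("PMP23-GR1M1FB3CJ", 1),
    ("PMC11-AA1U1FJWWJA", 1),
    ("PMC11-AA1U1FBWWJA+I7", 2),
    ("PMC11-AA1U1FJWWJA", 3),
    ("PMC11-AA1L1FJWWJA", 3),
]
result = split_into_groups(pairs)
for row in result:
    print(row)
```

With a dataframe, the pairs could presumably be built with zip(df['ordercode'], df['quantity']). What I am really after is a pandas way to get the same grouping.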
With some help from the forum, I wrote the following code:
df = pd.DataFrame(np.concatenate(df.apply(lambda x: [x[0]] * x[1], 1).as_matrix()),
columns=['ordercode'])
df['quantity'] = 1
df['group'] = sorted(range(0, len(df)/3, 1) * 4)[0:len(df)]
df.groupby(['group', 'ordercode']).sum()
but I am getting an error:
TypeError: 'float' object cannot be interpreted as an integer
If I wrap the division in int():
df['group'] = sorted(range(0, int(len(df)/3), 1) * 4)[0:len(df)]
I get a TypeError again. Can anyone tell me why?
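For what it's worth, both errors seem to be reproducible on Python 3 without pandas, which makes me suspect the snippet I was given was written for Python 2. I have not confirmed that this is the whole story. A minimal sketch, where n is just a stand-in for len(df) after the rows are expanded (11 for my data):

```python
n = 11  # stand-in for len(df) after expanding each row by its quantity

errors = []

try:
    range(0, n / 3, 1)            # "/" returns a float on Python 3
except TypeError as exc:
    errors.append(str(exc))

try:
    range(0, int(n / 3), 1) * 4   # a range object cannot be repeated with "*"
except TypeError as exc:
    errors.append(str(exc))

for message in errors:
    print(message)

# Converting the range to a list first avoids both errors here.
groups = sorted(list(range(0, n // 3)) * 4)[:n]
print(groups)
```

Even when this runs, it only labels the expanded rows in blocks of four, so I am not sure it is the right approach for the general case.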