
My dataframe:

  ordercode             quantity
PMC21-AA1U1FBWBJA           1
PMP23-GR1M1FB3CJ            1
PMC11-AA1U1FJWWJA           1
PMC11-AA1U1FBWWJA+I7        2
PMC11-AA1U1FJWWJA           3
PMC11-AA1L1FJWWJA           3

My desired output:

  Group    ordercode                quantity
    0      PMC21-AA1U1FBWBJA           1
           PMP23-GR1M1FB3CJ            1
           PMC11-AA1U1FJWWJA           1
           PMC11-AA1U1FBWWJA+I7        1
    1      PMC11-AA1U1FBWWJA+I7        1
           PMC11-AA1U1FJWWJA           3
    2      PMC11-AA1L1FJWWJA           3

The grouping is based on the quantity column: the quantities in each group may add up to at most 4.

Groups 0 and 1 each total 4 (1+1+1+1 = 4 and 1+3 = 4). Group 2 holds whatever is left over (here 3), because there is nothing more to add. Note that the quantity of 2 for PMC11-AA1U1FBWWJA+I7 is split between groups 0 and 1.

I got some help from the forum and wrote the following code:

df = pd.DataFrame(np.concatenate(df.apply(lambda x: [x[0]] * x[1], 1).as_matrix()), 
              columns=['ordercode'])
df['quantity'] = 1
df['group'] = sorted(range(0, len(df)/3, 1) * 4)[0:len(df)]
df.groupby(['group', 'ordercode']).sum()

but I get an error:

TypeError: 'float' object cannot be interpreted as an integer

If I wrap the division in int():

df['group'] = sorted(range(0, int(len(df)/3), 1) * 4)[0:len(df)]

I get a TypeError again. Can anyone tell me why?

1 Answer


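Assuming you use Python 3, there are two problems. The / operator always returns a float, which range() rejects, so use the floor-division operator // instead. In addition, a range object cannot be multiplied by an integer, so convert it to a list first.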

df['group'] = sorted(list(range(0, len(df) // 3, 1)) * 4)[0:len(df)]

Your second attempt, with int(), fixes the float problem but still needs the list() conversion:

df['group'] = sorted(list(range(0, int(len(df) / 3), 1)) * 4)[0:len(df)]
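To see both errors in isolation, here is a minimal sketch that leaves pandas out entirely. The variable n is a stand-in (not from the original code) for len(df) after the rows have been expanded to one row per unit, which is 11 for the example data.

```python
n = 11  # stand-in for len(df): 1 + 1 + 1 + 2 + 3 + 3 one-unit rows

# In Python 3, / always returns a float, and range() rejects floats.
try:
    range(0, n / 3, 1)
except TypeError as exc:
    print("float problem:", exc)

# // is floor division and returns an int, so range() accepts it.
groups = range(0, n // 3, 1)

# A range object cannot be repeated with *, unlike a list.
try:
    groups * 4
except TypeError as exc:
    print("range problem:", exc)

# Converting to a list first makes the repetition work.
labels = sorted(list(groups) * 4)[0:n]
print(labels)
```

The two messages printed should match the two TypeErrors reported in the question and in the comments.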

So the full code runs like this. I copied your example to the clipboard before running.

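import numpy as np
import pandas as pd

df = pd.read_clipboard()
# Expand each row into one entry per unit of quantity.
expanded = df.apply(lambda x: [x['ordercode']] * x['quantity'], axis=1).values
df = pd.DataFrame(np.concatenate(expanded), columns=['ordercode'])
df['quantity'] = 1
# Assign group labels 0, 0, 0, 0, 1, 1, 1, 1, ... to the expanded rows.
df['group'] = sorted(list(range(0, len(df) // 3, 1)) * 4)[0:len(df)]
# Sum the units per (group, ordercode) pair and keep the result.
df = df.groupby(['group', 'ordercode']).sum()
print(df)

The .as_matrix() method you used only raised a FutureWarning when this was written, but it was removed in pandas 1.0, so the code above uses .values instead, as the warning suggests.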
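As an alternative, here is a rough sketch that builds the group labels directly from the row position instead of from sorted(range(...) * 4). It assumes groups are filled in row order with at most 4 units each, as in the desired output. The function name pack_into_groups and the parameter max_units are made up for illustration, and the DataFrame is built inline rather than read from the clipboard.

```python
import numpy as np
import pandas as pd


def pack_into_groups(df, max_units=4):
    """Fill consecutive groups with at most max_units units each (hypothetical helper)."""
    # One row per unit: repeat each ordercode 'quantity' times.
    codes = df["ordercode"].repeat(df["quantity"]).reset_index(drop=True)
    units = pd.DataFrame({"ordercode": codes, "quantity": 1})
    # Units 0..3 land in group 0, units 4..7 in group 1, and so on.
    units["group"] = np.arange(len(units)) // max_units
    # sort=False keeps the order in which the codes first appear.
    return units.groupby(["group", "ordercode"], sort=False)["quantity"].sum()


df = pd.DataFrame({
    "ordercode": [
        "PMC21-AA1U1FBWBJA", "PMP23-GR1M1FB3CJ", "PMC11-AA1U1FJWWJA",
        "PMC11-AA1U1FBWWJA+I7", "PMC11-AA1U1FJWWJA", "PMC11-AA1L1FJWWJA",
    ],
    "quantity": [1, 1, 1, 2, 3, 3],
})

result = pack_into_groups(df)
print(result)
```

One reason to prefer the position-based labels: the expression with len(df) // 3 produces too few labels for some small inputs (with 5 units it yields only 4 labels), whereas np.arange(len(units)) // max_units always yields exactly one label per row.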


12 Comments

Yes, I'm using Python 3. I'm getting TypeError: unsupported operand type(s) for *: 'range' and 'int'
I updated the reply. There was a bracket problem in the first line.
When I use the list function, it gives me a warning on the line df = pd.DataFrame(np.concatenate(df.apply(lambda x: [x[0]] * x[1], 1).as_matrix()), columns=['ordercode']): FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead. And I didn't get any result.
I ran your example in a Jupyter notebook and got the requested result. I will update the answer to include the running example.
Hey, the line df['group'] = sorted(list(range(0, len(df) // 3, 1)) * 4)[0:len(df)] now works, but df.groupby(['group', 'ordercode']).sum() didn't group anything.