Python Pandas: Transforming dataframe according to a specific column

Question

I have a dataframe with offer data that I want to transform into portion data. The column quantity tells how many portions each offer has. If an offer has three portions out of which two were sold, that row should appear three times in the new dataframe, so that the sold column has value 1 on two rows and value 0 on one row.

The actual dataframe contains more columns that should stay intact in the transformed dataframe. Date column works as an example.

My input:

    Offer name     Quantity   Sold   Date
  0 A              3          2      2022-05 
  1 B              2          1      2022-01 
  2 C              1          1      2022-04 
  3 D              1          1      2022-04 
  4 E              1          1      2022-05

The desired output:

    Offer name     Quantity   Sold   Date
  0 A              1          1      2022-05 
  1 A              1          1      2022-05 
  2 A              1          0      2022-05 
  3 B              1          1      2022-01 
  4 B              1          0      2022-01
  and so on...

Thank you!!

What if Sold is greater than Quantity?

Ynjxsjmh
– Ynjxsjmh

2022-05-30 17:32:48 +00:00
Commented May 30, 2022 at 17:32 — Ynjxsjmh
– Ynjxsjmh, Commented May 30, 2022 at 17:32
It's impossible in this dataframe.

Konsta
– Konsta

2022-05-31 07:07:27 +00:00
Commented May 31, 2022 at 7:07 — Konsta
– Konsta, Commented May 31, 2022 at 7:07

Shubham Sharma · Accepted Answer · 2022-05-30 17:31:52Z

Solution

s = df.reindex(df.index.repeat(df['Quantity']))
s['Quantity'] = 1
s['Sold'] = s.groupby(level=0).cumcount().lt(s['Sold']).astype(int)

Explained

Repeat the index of dataframe exactly Quantity times

>>> s = df.reindex(df.index.repeat(df['Quantity']))
>>> s

  Offer name  Quantity  Sold     Date
0          A         3     2  2022-05
0          A         3     2  2022-05
0          A         3     2  2022-05
1          B         2     1  2022-01
1          B         2     1  2022-01
2          C         1     1  2022-04
3          D         1     1  2022-04
4          E         1     1  2022-05

Group the above dataframe on index and use cumcount to create a sequential counter,

>>> s.groupby(level=0).cumcount()
0    0
0    1
0    2
1    0
1    1
2    0
3    0
4    0
dtype: int64

Flag the rows where the sequential counter is less than Sold quantity then change the dtype to int and assign the result to Sold column

>>> s['Sold'] = s.groupby(level=0).cumcount().lt(s['Sold']).astype(int)
>>> s

  Offer name  Quantity  Sold     Date
0          A         1     1  2022-05
0          A         1     1  2022-05
0          A         1     0  2022-05
1          B         1     1  2022-01
1          B         1     0  2022-01
2          C         1     1  2022-04
3          D         1     1  2022-04
4          E         1     1  2022-05

Ynjxsjmh · Accepted Answer · 2022-05-30 17:52:41Z

1

You can try apply on rows

def repeat(row):
    offer = [row['Offer name']] * row['Quantity']
    sold = [1] * row['Sold'] + [0] * (row['Quantity']-row['Sold'])
    return offer, sold, 1

df[['Offer name', 'Sold', 'Quantity']] = df.apply(repeat, axis=1, result_type='expand')
df = df.explode(['Offer name', 'Sold'])

print(df)

  Offer name  Quantity Sold     Date
0          A         1    1  2022-05
0          A         1    1  2022-05
0          A         1    0  2022-05
1          B         1    1  2022-01
1          B         1    0  2022-01
2          C         1    1  2022-04
3          D         1    1  2022-04
4          E         1    1  2022-05

answered May 30, 2022 at 17:52

Ynjxsjmh

30.3k7 gold badges43 silver badges64 bronze badges

1 Comment

Konsta Over a year ago

Thanks! This was also a clear way to solve the problem.

Collectives™ on Stack Overflow

Python Pandas: Transforming dataframe according to a specific column

2 Answers 2

Solution

Explained

Comments

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Solution

Explained

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related