0

I am trying to expand a two-column dataset in Python.

Basket        | Times 
______________|_______
Bread         | 5     
Orange, Bread | 3     

I would like each row to be repeated as many times as the number in its Times column, with a running counter appended to the Basket value. So for the example above:

Newcolumn  
_______ 
Bread1
Bread2
Bread3
Bread4
Bread5   
Orange, Bread1
Orange, Bread2
Orange, Bread3  

2 Answers

1

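You can use apply on each row to build the list of labels, then explode that column into one row per label. Note that this version puts an underscore before the counter; drop the "_" from the f-string to get Bread1 rather than Bread_1.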

df['Newcolumn'] = df.apply(lambda row: [f"{row['Basket']}_{i+1}" for i in range(row['Times'])], axis=1)
df = df.explode('Newcolumn', ignore_index=True)
print(df)

          Basket  Times        Newcolumn
0          Bread      5          Bread_1
1          Bread      5          Bread_2
2          Bread      5          Bread_3
3          Bread      5          Bread_4
4          Bread      5          Bread_5
5  Orange, Bread      3  Orange, Bread_1
6  Orange, Bread      3  Orange, Bread_2
7  Orange, Bread      3  Orange, Bread_3
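
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same apply/explode idea. The DataFrame construction is an assumption reconstructed from the table in the question, the name `result` is only illustrative, and the separator is omitted so the labels match the requested format.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about the real input)
df = pd.DataFrame({"Basket": ["Bread", "Orange, Bread"], "Times": [5, 3]})

# Build one label per repetition for each row, with no separator before the counter
df["Newcolumn"] = df.apply(
    lambda row: [f"{row['Basket']}{i + 1}" for i in range(row["Times"])],
    axis=1,
)

# Turn each list element into its own row and renumber the index
result = df.explode("Newcolumn", ignore_index=True)
print(result["Newcolumn"].tolist())
```

The ignore_index argument to explode requires pandas 1.1 or newer; on older versions, call reset_index(drop=True) after explode instead.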

1

Use np.repeat to repeat each Basket value the required number of times. Then use groupby on the original index with cumcount to number the copies and append those numbers as suffixes:

import numpy as np
srs = np.repeat(df["Basket"],df["Times"])

output = (srs+srs.groupby(level=0).cumcount().add(1).astype(str)).reset_index(drop=True)

>>> output
0            Bread1
1            Bread2
2            Bread3
3            Bread4
4            Bread5
5    Orange, Bread1
6    Orange, Bread2
7    Orange, Bread3
dtype: object
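
If you want to keep the other columns alongside the new one, a variation on the same repeat-and-cumcount idea is to repeat the index rather than the column. This is only a sketch: the sample DataFrame is reconstructed from the question, and the names `expanded` and `counter` are illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about the real input)
df = pd.DataFrame({"Basket": ["Bread", "Orange, Bread"], "Times": [5, 3]})

# Repeat each index label Times times and select those rows,
# so every original row appears once per repetition
expanded = df.loc[df.index.repeat(df["Times"])]

# Number the copies of each original row, starting at 1
counter = expanded.groupby(level=0).cumcount() + 1

# Append the counter to the basket name and renumber the index
expanded = expanded.assign(
    Newcolumn=expanded["Basket"] + counter.astype(str)
).reset_index(drop=True)
print(expanded)
```

Both approaches avoid a Python-level loop over rows, which tends to matter once the frame gets large.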
