I am having trouble writing the code for the transformation described below. I have a dataframe with the following values:
| Date | Name | UserId | task | Client | duration |
|---|---|---|---|---|---|
| 1/2/2022 | 'Alex, J' | 101 | 'C' | QAT | 8 |
| 2/2/2022 | 'Alex, J' | 101 | 'C' | QAT | 8 |
| 1/2/2022 | 'Marc, B' 'Marc, B' | 102 102 | 'A' 'B' | App Dev | 8 |
| 2/2/2022 | 'Marc, B' 'Marc, B' | 102 102 | 'A' 'B' | App Dev | 8 |
I want to convert it to the following dataframe:
| Date | Name | UserId | task | Client | duration |
|---|---|---|---|---|---|
| 1/2/2022 | 'Alex, J' | 101 | 'C' | QAT | 8 |
| 2/2/2022 | 'Alex, J' | 101 | 'C' | QAT | 8 |
| 1/2/2022 | 'Marc, B' | 102 | 'A' | App | 4 |
| 1/2/2022 | 'Marc, B' | 102 | 'B' | Dev | 4 |
| 2/2/2022 | 'Marc, B' | 102 | 'A' | App | 4 |
| 2/2/2022 | 'Marc, B' | 102 | 'B' | Dev | 4 |
I want to split the values in the Name, UserId, task and Client columns into separate rows, and divide the duration by the number of tasks for that day.
For example, there are 2 tasks here, A and B, on the same day (1/2/2022), so I divided the duration of 8 by 2 and got 4 for each of A and B.
Could you please help me with this? Thanks a lot.
df = pd.DataFrame(...). Besides that, look at the explode() method in pandas.
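A minimal sketch of how explode() could be applied here. The question does not say how the multi-value cells are stored, so this assumes each cell in Name, UserId, task and Client holds a Python list (one-element lists for single-task rows); the sample frame below is rebuilt from the tables in the question under that assumption. Exploding several columns at once needs pandas 1.3 or newer.

```python
import pandas as pd

# Sample data rebuilt from the question; storing the multi-value cells
# as Python lists is an assumption, not something the question confirms.
df = pd.DataFrame({
    "Date": ["1/2/2022", "2/2/2022", "1/2/2022", "2/2/2022"],
    "Name": [["Alex, J"], ["Alex, J"],
             ["Marc, B", "Marc, B"], ["Marc, B", "Marc, B"]],
    "UserId": [[101], [101], [102, 102], [102, 102]],
    "task": [["C"], ["C"], ["A", "B"], ["A", "B"]],
    "Client": [["QAT"], ["QAT"], ["App", "Dev"], ["App", "Dev"]],
    "duration": [8, 8, 8, 8],
})

list_cols = ["Name", "UserId", "task", "Client"]

# Split each row's duration evenly across its tasks *before* exploding,
# so every resulting row carries its own share.
df["duration"] = df["duration"] / df["task"].map(len)

# Explode all list columns together; the lists in one row must have
# equal lengths, otherwise pandas raises a ValueError.
out = df.explode(list_cols, ignore_index=True)

print(out)
```

If the cells are actually delimited strings rather than lists, they would need to be split into lists first; since the names themselves contain a comma, a plain split on "," would not be safe for the Name column.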