
I have sequential campaign data in a pandas DataFrame.

# sample data code
import pandas as pd

user_id = [9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,4705,4705,4705,4705,4705,223,223,223,223,223,223,223,223]
transaction_Value= [50,125,0,100,0,1000,473,0,47,110,0,44,93,0,49,92,0,242,0,75,0,47,122,0,50,100,200,0,35,85,0,50]
Campaign = ['M1','M1','Used','M1','Used','W1','Used','Used','W2','W2','Used','W2','W2','Used','W2','W2','Used','O1','Used','W3','Used','W2','S1','Lost','M1','M1','M1','Used','W2','S2','Lost','S2',]
transaction_c= [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,1,2,3,4,5,1,2,3,4,5,6,7,8]
 
df = pd.DataFrame(list(zip(user_id,transaction_Value,Campaign,transaction_c)), columns =['user_id','transaction_Value', 'Campaign','transaction_c'])

So far I have used the following code to reshape the data:

df2 = (df.set_index(['user_id',df.groupby('user_id').cumcount()])[('transaction_Value')]
         .unstack(fill_value='')
         .reset_index())

This transposes the values into one row per user, ordered by transaction number:

| user_id | 0  | 1   | 2   | 3   | 4  | 5    | 6   | 7  | 8  | 9   | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17  | 18 |
|---------|----|-----|-----|-----|----|------|-----|----|----|-----|----|----|----|----|----|----|----|-----|----|
| 9       | 50 | 125 | 0   | 100 | 0  | 1000 | 473 | 0  | 47 | 110 | 0  | 44 | 93 | 0  | 49 | 92 | 0  | 242 | 0  |
| 223     | 50 | 100 | 200 | 0   | 35 | 85   | 0   | 50 |    |     |    |    |    |    |    |    |    |     |    |
| 4705    | 75 | 0   | 47  | 122 | 0  |      |     |    |    |     |    |    |    |    |    |    |    |     |    |

How can I change this so that a new row starts after every row where the Campaign value is Used or Lost?

I could then do the same for the Campaign values and stack the two DataFrames together.

Ideal output

| user_id | Type        | 1    | 2    | 3    | 4    |
|---------|-------------|------|------|------|------|
| 9       | Campaign    | M1   | M1   | Used |      |
| 9       | Campaign    | M1   | Used |      |      |
| 9       | Campaign    | W1   | Used |      |      |
| 9       | Campaign    | Used |      |      |      |
| 9       | Campaign    | W2   | W2   | Used |      |
| 9       | Campaign    | W2   | W2   | Used |      |
| 9       | Campaign    | W2   | W2   | Used |      |
| 9       | Campaign    | O1   | Used |      |      |
| 223     | Campaign    | M1   | M1   | M1   | Used |
| 223     | Campaign    | W2   | S2   | Lost |      |
| 223     | Campaign    | S2   |      |      |      |
| 9       | Transaction | 50   | 125  | 0    |      |
| 9       | Transaction | 100  | 0    |      |      |
| 9       | Transaction | 1000 | 473  |      |      |
| 9       | Transaction | 0    |      |      |      |
| 9       | Transaction | 47   | 110  | 0    |      |
| 9       | Transaction | 44   | 93   | 0    |      |
| 9       | Transaction | 49   | 92   | 0    |      |
| 9       | Transaction | 242  | 0    |      |      |
| 223     | Transaction | 50   | 100  | 200  | 0    |
| 223     | Transaction | 35   | 85   | 0    |      |
| 223     | Transaction | 50   |      |      |      |

I'd appreciate any help in resolving this. Thanks :)

  • how is the Type computed? Commented Mar 16, 2021 at 7:11
  • @JoeFerndz If the transposed column is Campaign, then it's Campaign; if it is transaction_Value, then it's Transaction. Commented Mar 16, 2021 at 7:15
  • what transpose is Campaign? Your original dataset does not have Campaign as a value. Are you referring to the Column with Campaign and Transaction? Commented Mar 16, 2021 at 7:17
  • Yes, exactly: the Campaign and transaction columns. Commented Mar 16, 2021 at 7:18

1 Answer


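Flag the rows where Campaign is Used or Lost with Series.isin, reverse the order with iloc[::-1], and apply Series.cumsum, so that every run of rows ending in Used or Lost gets its own group id once the order is reversed back. Pass these groups to set_index and groupby, then use DataFrame.stack and sort by the third index level. Finally, remove the second level and convert the MultiIndex to columns: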

g = df['Campaign'].isin(['Used','Lost']).iloc[::-1].cumsum().iloc[::-1]
g = pd.factorize(g)[0]

df2 = (df.set_index(['user_id',g, df.groupby(['user_id', g]).cumcount()])[['Campaign','transaction_Value']]
          .unstack(fill_value='')
          .stack(0)
          .sort_index(level=[2])
          .rename_axis(['user_id','Campaign','Type'])
          .reset_index(level=1, drop=True)
          .reset_index())

print(df2)
    user_id               Type     0     1     2     3
0         9           Campaign    M1    M1  Used      
1         9           Campaign    M1  Used            
2         9           Campaign    W1  Used            
3         9           Campaign  Used                  
4         9           Campaign    W2    W2  Used      
5         9           Campaign    W2    W2  Used      
6         9           Campaign    W2    W2  Used      
7         9           Campaign    O1  Used            
8       223           Campaign    M1    M1    M1  Used
9       223           Campaign    W2    S2  Lost      
10      223           Campaign    S2                  
11     4705           Campaign    W3  Used            
12     4705           Campaign    W2    S1  Lost      
13        9  transaction_Value    50   125     0      
14        9  transaction_Value   100     0            
15        9  transaction_Value  1000   473            
16        9  transaction_Value     0                  
17        9  transaction_Value    47   110     0      
18        9  transaction_Value    44    93     0      
19        9  transaction_Value    49    92     0      
20        9  transaction_Value   242     0            
21      223  transaction_Value    50   100   200     0
22      223  transaction_Value    35    85     0      
23      223  transaction_Value    50                  
24     4705  transaction_Value    75     0            
25     4705  transaction_Value    47   122     0      
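
To make the grouping step above easier to follow, here is a minimal sketch on a shortened, made-up Campaign list (not the question's data). The names is_end, rev and g_forward are only illustrative, and the forward variant with shift is an alternative I am assuming is equivalent here, not part of the original answer:

```python
import pandas as pd

# Shortened, made-up sample: runs ending in 'Used' or 'Lost', plus one open run at the end
campaign = pd.Series(['M1', 'M1', 'Used', 'W1', 'Used', 'W2', 'S2', 'Lost', 'S2'])

# True on the rows that close a run
is_end = campaign.isin(['Used', 'Lost'])

# Reverse, take the cumulative sum, reverse back:
# every row now shares a number with the terminator that follows it
rev = is_end.iloc[::-1].cumsum().iloc[::-1]

# factorize relabels the groups 0, 1, 2, ... in order of first appearance
g = pd.factorize(rev)[0]

# Assumed alternative: shift the flags down one row and cumsum forward
g_forward = is_end.shift(fill_value=False).cumsum().to_numpy()

print(rev.tolist())        # [3, 3, 3, 2, 2, 1, 1, 1, 0]
print(g.tolist())          # [0, 0, 0, 1, 1, 2, 2, 2, 3]
print(g_forward.tolist())  # [0, 0, 0, 1, 1, 2, 2, 2, 3]
```

The raw reversed cumsum counts down from top to bottom, so sorting on it would list the last run first; relabelling with pd.factorize restores the original order of the runs.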

2 Comments

How do I keep the transaction_c column in the data? In this case, for user_id 223 the first Campaign was M1 and not S2.
@AniruddhaDas - You are right, I added pd.factorize() to correct the ordering of the groups g.
