
I have a dataframe like this:

Time       User   Route
11:03:01   1234   home
11:03:04   1234   category
11:03:10   1234   product
11:03:21   1234   cart
11:04:01   4321   home
11:04:04   4321   category
11:04:10   4321   product
11:04:21   4321   cart

I want to add a Journey column that lists, for each user, every route visited up to and including that row:

Time       User   Route        Journey
11:03:01   1234   home         home
11:03:04   1234   category     home, category
11:03:10   1234   product      home, category, product
11:03:21   1234   cart         home, category, product, cart
11:04:01   4321   home         home
11:04:04   4321   category     home, category
11:04:10   4321   product      home, category, product
11:04:21   4321   cart         home, category, product, cart

How can I do this in a dataframe?

  • Do you need e.g. row 0-2 as well or is row 3 the important part? Commented Dec 6, 2019 at 18:05
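
If only the complete journey per user matters (the case this comment asks about), the running column is not needed. A minimal sketch, assuming the rows are already in chronological order within each user; the name `final_journey` is illustrative and not from the original post:

```python
import pandas as pd

# Sample data mirroring the question (Time omitted for brevity).
df = pd.DataFrame({
    "User": [1234] * 4 + [4321] * 4,
    "Route": ["home", "category", "product", "cart"] * 2,
})

# One row per user: all of that user's routes joined in their original order.
final_journey = df.groupby("User")["Route"].agg(", ".join)
print(final_journey)
```

This returns a Series indexed by User rather than a new column on the original frame.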

1 Answer


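Append a separator to each route, take the cumulative sum within each user (for strings, cumsum concatenates), then strip the trailing separator: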

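# Assumes df is already sorted chronologically within each user.
df['Journey'] = (df.Route.add(', ')      # 'home' -> 'home, '
                   .groupby(df['User'])  # accumulate separately per user
                   # cumsum concatenates the strings; str[:-2] drops the trailing ', '
                   .transform(lambda x: x.cumsum().str[:-2])
                )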

Output:

       Time  User     Route                        Journey
0  11:03:01  1234      home                           home
1  11:03:04  1234  category                 home, category
2  11:03:10  1234   product        home, category, product
3  11:03:21  1234      cart  home, category, product, cart
4  11:04:01  4321      home                           home
5  11:04:04  4321  category                 home, category
6  11:04:10  4321   product        home, category, product
7  11:04:21  4321      cart  home, category, product, cart
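
As an alternative that does not rely on cumsum concatenating strings, the same running column can be sketched with `itertools.accumulate`. This is only a sketch under stated assumptions: the rows are already in chronological order within each user, and the helper name `running_journey` is hypothetical, not part of the original answer.

```python
from itertools import accumulate

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Time": ["11:03:01", "11:03:04", "11:03:10", "11:03:21",
             "11:04:01", "11:04:04", "11:04:10", "11:04:21"],
    "User": [1234] * 4 + [4321] * 4,
    "Route": ["home", "category", "product", "cart"] * 2,
})


def running_journey(routes: pd.Series) -> pd.Series:
    """Return the running ', '-joined path for one user's routes (hypothetical helper)."""
    # accumulate yields 'home', then 'home, category', and so on.
    joined = accumulate(routes, lambda so_far, route: f"{so_far}, {route}")
    # Keep the group's index so the result aligns with the original rows.
    return pd.Series(list(joined), index=routes.index)


df["Journey"] = df.groupby("User")["Route"].transform(running_journey)
print(df)
```

Because the separator is only inserted between items, there is no trailing ", " to strip afterwards.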