How to merge multiple rows based on two columns in pandas

Question

I have a data frame like this:

    Name1  Name2   Start End
    aaa    bbb     1     2
    aaa    bbb     2     22
    aaa    bbb     30    42
    ccc    ddd     100   141
    ccc    ddd     145   160
    ccc    ddd     160   178

How do I merge rows where the End of one row equals the Start of the next row, and otherwise keep the row as is? The expected result looks like this:

    Name1  Name2   Start End
    aaa    bbb     1     22
    aaa    bbb     30    42
    ccc    ddd     100   141
    ccc    ddd     145   178

I can do this using iterrows, but I am wondering if there is a better way, such as apply or groupby.

akuiper · Accepted Answer · 2021-08-24 03:12:00Z


To rephrase the problem: you need to merge intervals that touch or overlap. Sort by Start in ascending order; then, within each (Name1, Name2) group, a new interval begins whenever the cumulative maximum of End over the previous rows is smaller than the current Start. Based on this observation, you can create a group variable and aggregate Start and End over each merged interval:

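    # Sort by Start so each row can be compared with the running maximum End.
    df.sort_values('Start', inplace=True)
    df.groupby(['Name1', 'Name2']).apply(
        # Inner key: a counter that increases each time Start exceeds the largest End seen so far.
        lambda g: g.groupby((g.End.cummax().shift() < g.Start).cumsum()).agg({'Start': 'min', 'End': 'max'})
    ).reset_index(level=[0, 1])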

      Name1 Name2  Start  End
    0   aaa   bbb      1   22
    1   aaa   bbb     30   42
    0   ccc   ddd    100  141
    1   ccc   ddd    145  178
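The nested groupby can be hard to follow, so here is a minimal sketch of the same idea unrolled into separate steps with a single, flat groupby. It is an illustration rather than the only way to do this: the helper columns `is_new` and `interval_id` and the variables `keys`, `prev_max_end` and `merged` are names made up for this example, and the data frame is rebuilt from the sample in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Name1': ['aaa', 'aaa', 'aaa', 'ccc', 'ccc', 'ccc'],
    'Name2': ['bbb', 'bbb', 'bbb', 'ddd', 'ddd', 'ddd'],
    'Start': [1, 2, 30, 100, 145, 160],
    'End':   [2, 22, 42, 141, 160, 178],
})

keys = ['Name1', 'Name2']  # hypothetical helper name for the grouping columns
df = df.sort_values(keys + ['Start']).reset_index(drop=True)

# Step 1: largest End seen on the earlier rows of the same (Name1, Name2) pair.
prev_max_end = df.groupby(keys)['End'].transform(lambda s: s.cummax().shift())

# Step 2: a row opens a new interval when its Start lies past that value.
# The first row of each pair compares against NaN, which evaluates to False.
df['is_new'] = (prev_max_end < df['Start']).astype(int)

# Step 3: a running count of the flags within each pair labels every merged interval.
df['interval_id'] = df.groupby(keys)['is_new'].cumsum()

# Step 4: collapse each labelled interval to its smallest Start and largest End.
merged = (
    df.groupby(keys + ['interval_id'], as_index=False)
      .agg(Start=('Start', 'min'), End=('End', 'max'))
      .drop(columns='interval_id')
)
print(merged)
```

Printing `df[['Start', 'End', 'is_new', 'interval_id']]` before step 4 shows which rows share a label, which is usually the quickest way to check the grouping logic on your own data.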

edited Aug 24, 2021 at 3:12

answered Aug 24, 2021 at 3:06

akuiper


1 Comment

Travis Huang Over a year ago

A groupby within another groupby always confuses me. Could you elaborate on how this works? I want to learn the thinking process behind this approach so I can use it in the future. Thanks!
