Python. Merge repeated columns

Question

I have to create a DataFrame from a file in which some columns are repeated, with their values split across the copies.

For example, c1 is split into 3 parts and c2 into 2.

What I want is a DataFrame in which each repeated column appears only once, with its parts merged.

I know that I can merge the columns with:

df.sum(axis=1) or df.max(axis=1)

but I don't know how to restrict this to specific columns.
Another possibility would be to create DataFrames containing only the repeated columns, apply either sum or max to each, and then merge everything.

But I was wondering if there is something less "ugly".
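
For reference, the manual approach described above could look roughly like the sketch below. The sample values and the names `c1_merged`, `c2_merged`, and `result` are made up for illustration. A boolean mask on `df.columns` is used so that the selection is always a DataFrame, whether or not the label is duplicated.

```python
import pandas as pd

# Hypothetical sample data: c1 is split into 3 columns, c2 into 2.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5],
     [6, 7, 8, 9, 10]],
    columns=['c1', 'c2', 'c1', 'c2', 'c1'],
)

# Select every column carrying a given label, then reduce row-wise.
c1_merged = df.loc[:, df.columns == 'c1'].sum(axis=1)
c2_merged = df.loc[:, df.columns == 'c2'].max(axis=1)

# Put the merged columns back together.
result = pd.DataFrame({'c1': c1_merged, 'c2': c2_merged})
print(result)
```

This works, but it needs one line per repeated label, which is the part the question calls "ugly".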

Julien Marrec · Accepted Answer · 2015-07-16 09:43:26Z


More simply, you can use groupby for that.

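In [1]: import numpy as np
   ...: import pandas as pd
   ...: # Random integers from 0 to 10 inclusive (randint's upper bound is exclusive)
   ...: df = pd.DataFrame(np.random.randint(0, 11, (3, 8)), columns=['C1','C2','C3','C1','C4','C1','C5','C2'])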

In [2]: df
Out[2]:
    C1  C2  C3  C1  C4  C1  C5  C2
0   5   0   9   1   7   3   3   8
1   3   1   10  7   1   2   3   8
2   1   0   0   0   4   10  6   10

In [3]: # Groupby level 0 on axis 1 (columns) and apply a sum
   ...: df.groupby(level=0, axis=1).sum()

Out[3]:
    C1  C2  C3  C4  C5
0   9   8   9   7   3
1   12  9   10  1   3
2   11  10  0   4   6
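
Recent pandas releases deprecate the `axis=1` argument to `groupby`, so the call above may warn or fail depending on the installed version. A minimal sketch of an equivalent that avoids `axis=1`, assuming the frame holds a single dtype so transposing is cheap and lossless: transpose, group on the index, aggregate, and transpose back. The frame below hard-codes the values shown in Out[2]; the names `merged_sum` and `merged_max` are illustrative only.

```python
import pandas as pd

# Same values as the example frame shown in Out[2].
df = pd.DataFrame(
    [[5, 0, 9, 1, 7, 3, 3, 8],
     [3, 1, 10, 7, 1, 2, 3, 8],
     [1, 0, 0, 0, 4, 10, 6, 10]],
    columns=['C1', 'C2', 'C3', 'C1', 'C4', 'C1', 'C5', 'C2'],
)

# After transposing, the duplicated column labels become index labels,
# so a plain row-wise groupby on level 0 merges them.
merged_sum = df.T.groupby(level=0).sum().T
merged_max = df.T.groupby(level=0).max().T

print(merged_sum)
print(merged_max)
```

Swapping `sum()` for `max()` gives the other merge the question mentions. With mixed dtypes the transpose upcasts to `object`, so this variant is best suited to homogeneous numeric data.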

edited Jul 16, 2015 at 9:43

answered Jul 16, 2015 at 9:33

