Merge multiple csv columns into one while repeating the header

Question

I am trying to merge multiple columns within a csv into a single column with each original column's header being repeated as shown below.

userA   userB
A1  B1
A2  B2
A2  B3
A2  B4

Into this:

userA   A1
userA   A2
userA   A2
userA   A2
userB   B1
userB   B2
userB   B3
userB   B4

Does anyone have any suggestions on how to do this? I have some experience with pandas, but I'm currently at a loss.

UPDATE: I found how to merge the columns

df = pd.read_csv(filename, sep='\t')
df = df.combine_first(pd.Series(df.values.ravel('F')).to_frame('merged'))

FINAL UPDATE: Solved using melt()

df = pd.melt(df)

With a dataframe of just those two columns, you could do df.stack().reset_index(level=1)
– cmaher, Commented Apr 19, 2018 at 0:35
@cmaher This is great, but the entries are not ordered properly: the result now alternates between userA and userB. Any idea how to produce the order shown above?
– testac1234, Commented Apr 19, 2018 at 0:49
That's not what the output in your question indicates. Can you update the expected output in your question first?
– cmaher, Commented Apr 19, 2018 at 0:51
Good work finding a solution. Can you post that as the answer and accept it so we can close this question?
– Ahmed Fasih, Commented Apr 19, 2018 at 0:55
@cmaher I solved it! Your first comment was all I needed. I then used df.sort_values(by=[0]) to sort properly. Thanks!
– testac1234, Commented Apr 19, 2018 at 0:55

BENY · Accepted Answer · 2018-04-19 01:09:56Z

2

You can use melt:

df.melt()
Out[702]: 
  variable value
0    userA    A1
1    userA    A2
2    userA    A2
3    userA    A2
4    userB    B1
5    userB    B2
6    userB    B3
7    userB    B4
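
For readers who want to reproduce this without the asker's file, here is a minimal self-contained sketch. It assumes the sample data from the question is rebuilt inline, and the output column names `user` and `entry` are illustrative choices that do not appear in the question.

```python
import pandas as pd

# Rebuild the sample frame from the question (values as posted there).
df = pd.DataFrame({
    "userA": ["A1", "A2", "A2", "A2"],
    "userB": ["B1", "B2", "B3", "B4"],
})

# melt() emits the columns one after another: every row of userA first,
# then every row of userB, with the source header repeated beside each value.
# var_name / value_name only rename the two output columns (hypothetical names).
long_df = df.melt(var_name="user", value_name="entry")

print(long_df)
```

Without `var_name` and `value_name`, the two columns get the default names `variable` and `value`, as in the output above.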

answered Apr 19, 2018 at 1:09

2 Comments

testac1234 Over a year ago

df=pd.melt(df) is exactly what I needed! Nothing more is required. Thank you.

BENY Over a year ago

@testac1234 yw :-) happy coding

piRSquared · Answer · 2018-04-19 01:03:04Z

2

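Construct with `ravel` and `repeat`. The values must be flattened column by column (`order='F'`) so that they line up with the repeated headers; the default row-major `ravel()` would pair B1 with userA.

pd.Series(df.values.ravel('F'), df.columns.repeat(len(df)))

userA    A1
userA    A2
userA    A2
userA    A2
userB    B1
userB    B2
userB    B3
userB    B4
dtype: object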
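
Whether each value ends up under the right header depends on the order in which the array is flattened. The following is a small sketch, assuming the question's sample data rebuilt inline, that compares NumPy's default row-major flattening with column-major flattening; only the column-major result matches the labels produced by `df.columns.repeat(len(df))`.

```python
import pandas as pd

df = pd.DataFrame({
    "userA": ["A1", "A2", "A2", "A2"],
    "userB": ["B1", "B2", "B3", "B4"],
})

# Labels: each header repeated once per row -> userA x4, then userB x4.
labels = df.columns.repeat(len(df))

# Default ravel() is row-major: values come out A1, B1, A2, B2, ...
row_major = df.values.ravel()

# order="F" is column-major: all of userA first, then all of userB.
col_major = df.values.ravel(order="F")

# Pair the column-major values with the repeated headers.
merged = pd.Series(col_major, index=labels)
print(merged)
```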

answered Apr 19, 2018 at 1:03

testac1234 · Answer · 2018-04-19 01:20:46Z

1

Solved first using this suggestion:

With a dataframe of just those two columns, you could do df.stack().reset_index(level=1) – cmaher

Followed by a simple sort to order the rows properly:

df.sort_values(by=[0])

See pd.melt(df) above for a better answer.
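
A runnable sketch of the stack-then-sort route, assuming the question's sample data rebuilt inline. Note one swapped-in detail: sorting by the value column (`by=[0]`) only works here because every userA value happens to sort before every userB value, so this sketch sorts on the header column with a stable sort instead. The column names `user` and `entry` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "userA": ["A1", "A2", "A2", "A2"],
    "userB": ["B1", "B2", "B3", "B4"],
})

# stack() walks the frame row by row, so headers alternate userA, userB, ...
stacked = df.stack().reset_index(level=1)

# The default column names are "level_1" and 0; rename for readability.
stacked.columns = ["user", "entry"]

# A stable sort on the header column groups the rows per user while
# preserving the original row order inside each group.
ordered = stacked.sort_values(by="user", kind="stable").reset_index(drop=True)
print(ordered)
```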

edited Apr 19, 2018 at 1:20

answered Apr 19, 2018 at 0:59
