1

I have dynamic number of columns in my dataframe for each row and a single record can go for more than 1 row. First 2 columns are the key columns. If the key columns are matching, i have to append each row of data into a single row and create as much columns as required for appending.

Input is below (dataframe) c1 in a column c2 in a column etc...

row 1: A 1 c1 c2 c3.. c20
row 2: A 1 c21....c25
row 3. A 1 c26.... c35
row 4: A 2 d1 d2... d21
row 5: A 2 d22....d27

I tried using df.groupby(___first 2 column names____).first().reset_index() which returns only first row as we are using first(). is there any function to do this in python

output required: (dataframe)

row 1: A 1 c1 c2...c35 (each value in 1 column)
row 2: A 2 d1...d27 (each value in 1 column)

1 Answer 1

3

Use GroupBy.cumcount for series of counter, then DataFrame.set_index, DataFrame.sort_index and last flatten MultiIndex in list comprehension:

print (df)
       a  b    c    d    e    f
row1:  A  1   c1   c2   c3  c20
row2:  A  1  c21  c22  c23  c24
row3.  A  1  c26  c27  c28  c29
row4:  A  2   d1   d2  d21  d22
row5:  A  2  d22  d27  d28  d29

s = df.groupby(['a','b']).cumcount()

df1 = df.set_index(['a', 'b', s]).unstack().sort_index(level=1, axis=1)
df1.columns = [f'{x}{y}' for x, y in df1.columns]
df1 = df1.reset_index()
print (df1)
   a  b  c0  d0   e0   f0   c1   d1   e1   f1   c2   d2   e2   f2
0  A  1  c1  c2   c3  c20  c21  c22  c23  c24  c26  c27  c28  c29
1  A  2  d1  d2  d21  d22  d22  d27  d28  d29  NaN  NaN  NaN  NaN
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.