7

I have the following pandas DataFrame:

import pandas as pd

df = pd.DataFrame({0: ["a", "b", "c", "d"], 1: ["e", "f", "g", None], 2: ["h", None, None, None]})

   0     1     2
0  a     e     h
1  b     f  None
2  c     g  None
3  d  None  None

I would like to create a new DataFrame with one column, where each row is the concatenation of that row's non-null values, separated by ",":

       0
0  a,e,h
1    b,f
2    c,g
3      d

For a single row I could use

df.iloc[0,:].str.cat(sep=",")

but how can I apply this to the whole DataFrame without using a for-loop (if possible)?

3 Answers

5

Stacking drops nulls by default. Follow up with a groupby on level=0 and join each group:

df.stack().groupby(level=0).apply(','.join)

0    a,e,h
1      b,f
2      c,g
3        d
dtype: object

To reproduce the OP's output as a DataFrame, use to_frame:

df.stack().groupby(level=0).apply(','.join).to_frame(0)

       0
0  a,e,h
1    b,f
2    c,g
3      d
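
Whether stack drops nulls depends on the pandas version: the newer implementation (enabled with future_stack=True, and planned as the default in pandas 3.0) keeps them, in which case ','.join would fail on the missing values. A minimal sketch that does not rely on that default is below; the explicit dropna() call is my addition, not part of the answer above, and it is harmless when stack has already dropped the nulls.

```python
import pandas as pd

df = pd.DataFrame({0: ["a", "b", "c", "d"],
                   1: ["e", "f", "g", None],
                   2: ["h", None, None, None]})

# Drop nulls explicitly instead of relying on stack() doing it implicitly.
stacked = df.stack().dropna()

# Level 0 of the stacked index is the original row label, so grouping on it
# collects each row's remaining values, which are then joined with ",".
result = stacked.groupby(level=0).apply(",".join).to_frame(0)
print(result)
```

Note that a row containing only nulls has no entries left after stacking, so it would be absent from the result rather than appearing as an empty string.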

4
for i, r in df.iterrows():
    print(r.str.cat(sep=","))

As a new DataFrame:

ndf = pd.DataFrame([r.str.cat(sep=",") for i, r in df.iterrows()])
print(ndf)

       0
0  a,e,h
1    b,f
2    c,g
3      d
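
A variation on the same row-by-row idea, sketched here as an alternative rather than as the answer's own method: iterate over plain tuples with itertuples and filter nulls with pd.notna, which avoids building a Series per row. Passing index=df.index is an assumption about wanting to keep the original row labels.

```python
import pandas as pd

df = pd.DataFrame({0: ["a", "b", "c", "d"],
                   1: ["e", "f", "g", None],
                   2: ["h", None, None, None]})

# name=None yields plain tuples; index=False leaves the row label out of them.
rows = [
    ",".join(v for v in row if pd.notna(v))
    for row in df.itertuples(index=False, name=None)
]

# Rebuild a one-column DataFrame that keeps the original row labels.
ndf = pd.DataFrame(rows, index=df.index)
print(ndf)
```

This is still a Python-level loop, so it does not remove the iteration the question asks about; it only makes each step cheaper.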

4

You could use:

df.apply(lambda x: ','.join(x.dropna()), axis=1)

Output:

0    a,e,h
1      b,f
2      c,g
3        d
dtype: object
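
To get the one-column DataFrame from the question rather than a Series, the same idea can be wrapped in a small helper. This is a sketch: join_non_null is a hypothetical name, and the astype(str) cast is an added assumption to guard against non-string values such as numbers.

```python
import pandas as pd

df = pd.DataFrame({0: ["a", "b", "c", "d"],
                   1: ["e", "f", "g", None],
                   2: ["h", None, None, None]})

def join_non_null(row, sep=","):
    """Join the non-null values of one row into a single string."""
    return sep.join(row.dropna().astype(str))

# axis=1 passes each row to the helper as a Series.
ndf = df.apply(join_non_null, axis=1).to_frame(0)
print(ndf)
```

With this approach a row that contains only nulls is kept and becomes an empty string, since joining an empty sequence returns "".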
