I have a dataframe which looks like this:

time text
01.01.1970 abc
01.01.1970 cde
01.01.1970 fgh
01.01.1980 abc
01.01.1980 xyz

I would like to join the values in text for each value of time, separated by \n. How can I do this to get the following dataframe?

time text
01.01.1970 abc\ncde\nfgh
01.01.1980 abc\nxyz

I tried the following, but I do not get the expected result. Instead, for every row in text I get: text\ntime.

out = (df.groupby('time', as_index=False)
       ['text'].agg(lambda x: '\n'.join(x.dropna())))
  • Remove as_index=False. Commented Jul 11, 2022 at 14:49
  • Why doesn't your groupby work as expected? Commented Jul 11, 2022 at 15:04
  • Because as_index=False was included. Commented Jul 11, 2022 at 15:06

3 Answers

df.groupby('time')['text'].apply(lambda x: x.str.cat(sep='\n'))

output:

time    text
01.01.1970  "abc\ncde\nfgh"
01.01.1980  "abc\nxyz"
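
The result of the line above is a Series indexed by time, not a two-column dataframe. A minimal sketch of one way to get the dataframe shape asked for in the question, assuming the sample data from the question and that a trailing reset_index() is acceptable (the variable names joined and out are only illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "time": ["01.01.1970", "01.01.1970", "01.01.1970", "01.01.1980", "01.01.1980"],
    "text": ["abc", "cde", "fgh", "abc", "xyz"],
})

# Join the strings within each group; the result is a Series indexed by "time".
joined = df.groupby("time")["text"].apply(lambda s: s.str.cat(sep="\n"))

# reset_index() turns the "time" index back into a regular column.
out = joined.reset_index()
print(out)
```

Series.str.cat skips missing values by default, so no separate dropna() step is needed in this variant.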

It's easier to drop NaNs before grouping:

df.dropna().groupby('time')['text'].agg('\n'.join)

1 Comment

This might not work as expected if the data has columns other than the two shown here. Plus, the solution is really something else.
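
To illustrate the concern in this comment: a bare dropna() removes a row when any column is NaN, so text values can be lost if another column has gaps. A hedged sketch, assuming a hypothetical extra column named other that is not part of the question, restricts the check to text with the subset parameter:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "other" is an extra column that is not in the question.
df = pd.DataFrame({
    "time": ["01.01.1970", "01.01.1970", "01.01.1980", "01.01.1980"],
    "text": ["abc", "cde", np.nan, "xyz"],
    "other": [np.nan, 1.0, 2.0, 3.0],
})

# A bare dropna() would also discard the first row, because "other" is NaN there.
# Restricting the check to "text" keeps that row.
out = (
    df.dropna(subset=["text"])
      .groupby("time")["text"]
      .agg("\n".join)
      .reset_index()
)
print(out)
```

With a bare df.dropna() the value abc would be missing from the 01.01.1970 group in this example.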

This answer is longer/uglier than the others but it at least gives you back a dataframe similar to your starting one.

rows = []
# Build one [time, joined text] row per unique time value.
for t in df["time"].unique():
    rows.append([t, "\n".join(df.loc[df["time"] == t, "text"])])
pd.DataFrame(rows, columns=df.columns)
