Is there an efficient way to concatenate strings column-wise across multiple rows of a DataFrame, so that the result is a single row in which each column holds the concatenation of that column's values from all the given rows?

Example

Combine the first four rows as explained above.

>>> df = pd.DataFrame([["this", "this"], ["is", "is"], ["a", "a"], ["test", "test"], ["ignore", "ignore"]])
>>> df
        0       1
0    this    this
1      is      is
2       a       a
3    test    test
4  ignore  ignore

Either of these results is acceptable:

                0               1
0  this is a test  this is a test

                0
1  this is a test
2  this is a test

1 Answer

To join all rows except the last, use DataFrame.iloc with DataFrame.agg:

s = df.iloc[:-1].agg(' '.join)
print(s)
0    this is a test
1    this is a test
dtype: object

For a one-row DataFrame, add Series.to_frame and transpose:

# assign to a new name so the original df stays available
df1 = df.iloc[:-1].agg(' '.join).to_frame().T
print(df1)
                0               1
0  this is a test  this is a test

For all rows:

s = df.agg(' '.join)
print(s)
0    this is a test ignore
1    this is a test ignore
dtype: object


# assign to a new name so the original df stays available
df2 = df.agg(' '.join).to_frame().T
print(df2)
                       0                      1
0  this is a test ignore  this is a test ignore
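
For reference, here is a self-contained sketch that puts the snippets above together on the DataFrame from the question. The names joined and one_row are only illustrative, and the code assumes every selected cell is a string, since str.join raises a TypeError on other types.

```python
import pandas as pd

df = pd.DataFrame(
    [["this", "this"], ["is", "is"], ["a", "a"], ["test", "test"], ["ignore", "ignore"]]
)

# Join each column of all rows except the last one.
# The result is a Series indexed by the original column labels.
joined = df.iloc[:-1].agg(" ".join)

# Convert the Series into a one-row DataFrame; df itself is left unchanged.
one_row = joined.to_frame().T

print(joined.tolist())
print(one_row.shape)
```

Keeping the results under separate names means the original df can still be reused for the "all rows" variant afterwards.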

2 Comments

Thank you very much for your answer, jezrael. I tested it against my solution df.iloc[rows, :].apply(lambda column: " ".join(column), axis=0) (rows is a list consisting of row indices) and both performed equally well. Maybe we will find an even better solution!
@cspecial - Yes, it is a good solution, similar to my answer. axis=0 is the default value, so it can be omitted.
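
A rough sketch of the variants discussed in these comments, assuming rows is a list of integer row positions (the value [0, 1, 2, 3] below is only an example). It checks that the lambda version, the shortened apply call, and agg give the same Series; it does not measure performance.

```python
import pandas as pd

df = pd.DataFrame(
    [["this", "this"], ["is", "is"], ["a", "a"], ["test", "test"], ["ignore", "ignore"]]
)

# Hypothetical list of row positions to combine.
rows = [0, 1, 2, 3]

# Version from the comment, with an explicit lambda and axis=0.
via_apply_lambda = df.iloc[rows, :].apply(lambda column: " ".join(column), axis=0)

# Same call with the lambda and the default axis=0 dropped.
via_apply = df.iloc[rows].apply(" ".join)

# Version from the answer, using agg.
via_agg = df.iloc[rows].agg(" ".join)

all_equal = via_apply_lambda.equals(via_apply) and via_apply.equals(via_agg)
print(all_equal)
```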
