Given this example dataframe in Pandas

df2 = pd.DataFrame({'a' : ['one', 'two', 'three', 'four', 'five', 'six', 'seven'],
                    'b' : ['x', 'y', 'y', 'x', 'y', 'x', 'x'],
                    'c' : ['abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'stu']})

looking like

       a  b    c
0    one  x  abc
1    two  y  def
2  three  y  ghi
3   four  x  jkl
4   five  y  mno
5    six  x  pqr
6  seven  x  stu

I would like to build a new header by concatenating e.g. rows 2 & 3 (positions 1 and 2) column by column, to get something like

   two_three y_y def_ghi
0    one     x   abc
1    two     y   def
2  three     y   ghi
3   four     x   jkl
4   five     y   mno
5    six     x   pqr
6  seven     x   stu

Any idea for a vectorized way to do this?

Thanks a lot, Sascha

1 Answer

You can get the desired result by applying str.join along axis 0 (i.e. column by column) to a slice of the dataframe, called df here. For example:

>>> df.iloc[[1,2]].apply('_'.join, axis=0)
a    two_three
b          y_y
c      def_ghi
dtype: object

If you want to use these joined strings as your column names, just assign them to df.columns:

>>> df.columns = df.iloc[[1,2]].apply('_'.join, axis=0)
>>> df
  two_three y_y def_ghi
0       one   x     abc
1       two   y     def
2     three   y     ghi
3      four   x     jkl
4      five   y     mno
5       six   x     pqr
6     seven   x     stu

[7 rows x 3 columns]
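
For reference, here is a minimal self-contained sketch of the same idea, run against the df2 from the question. The helper name header_from_rows and its parameters are hypothetical (not part of pandas); it only wraps the iloc/apply call shown above, and it assumes every selected cell can be converted to a string.

```python
import pandas as pd

df2 = pd.DataFrame({'a': ['one', 'two', 'three', 'four', 'five', 'six', 'seven'],
                    'b': ['x', 'y', 'y', 'x', 'y', 'x', 'x'],
                    'c': ['abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'stu']})


def header_from_rows(frame, positions, sep='_'):
    """Hypothetical helper: join the cells of the given row positions per column."""
    # astype(str) is an assumption so that str.join also works on non-string cells
    joined = frame.iloc[positions].astype(str).apply(sep.join, axis=0)
    return joined.tolist()


# Work on a copy so that the original column names of df2 stay untouched
renamed = df2.copy()
renamed.columns = header_from_rows(df2, [1, 2])
print(renamed.columns.tolist())
```

Renaming a copy rather than df2 itself keeps the original labels a, b, c available, which is handy if you want to try different row combinations.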
1 Comment

Hmm, another problem popped up now. Since my data comes from a file, I set all columns to "dtype=object" via read_csv and replaced all NaNs with df = df.fillna(''). But now the .join results in unicode strings for the header (u'two_three ' etc.). So how do I get rid of this?
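
A possible way to handle this, sketched under assumptions: in Python 2 the u prefix is only part of how a unicode string is displayed, not part of the label itself, and in Python 3 every str is unicode anyway. The trailing space in u'two_three ' most likely comes from whitespace in the file's cells, so the sketch below strips each cell before joining. The small frame here is made up for illustration and stands in for the data read via read_csv.

```python
import pandas as pd

# Made-up stand-in for data read with read_csv(..., dtype=object)
df = pd.DataFrame({'a': ['one', 'two', 'three ', None],
                   'b': ['x', 'y ', 'y', 'x']}, dtype=object)
df = df.fillna('')

# Strip surrounding whitespace from every cell before joining the two rows
header = df.iloc[[1, 2]].apply(lambda col: '_'.join(s.strip() for s in col), axis=0)

# Assign plain str labels (on Python 3 this conversion changes nothing)
df.columns = [str(name) for name in header]
print(df.columns.tolist())
```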