I have something like this:

   XY UV  BC   Val
0  y  u    c    11
1  y  u    b    22
2  y  v    c    33
3  y  v    b    44
4  x  u    c    111
5  x  u    b    222
6  x  v    c    333
7  x  v    b    444

I'd like to get

   XY  UV  B_Val  C_Val
0  y   u   22     11
1  y   v   44     33
2  x   u   222    111
3  x   v   444    333

In general, the BC columns above can contain a number of different items, so I need a solution that works in the general case, not only for 2 different values.

I tried writing some code that splits the dataframe, then re-joins the separate parts, but it started looking too complicated, and it wasn't going anywhere.

3 Answers

2

This is where I like to use multi-level indexes and stack/unstack.

So here, I'd do:

from io import StringIO
import pandas

datacsv = StringIO("""\
XY UV  BC   Val
y  u    c    11
y  u    b    22
y  v    c    33
y  v    b    44
x  u    c    111
x  u    b    222
x  v    c    333
x  v    b    444
""")
df = pandas.read_csv(datacsv, sep=r'\s+')  # raw string avoids an invalid-escape warning
df.set_index(['XY', 'UV', 'BC']).unstack(level='BC')

Which gives us:

       Val     
BC       b    c
XY UV          
x  u   222  111
   v   444  333
y  u    22   11
   v    44   33

So we have MultiIndexes on both the rows and columns. Assuming you don't want that, I would just do:

xtab = (df.set_index(['XY', 'UV', 'BC'])
          .unstack(level='BC')['Val']
          .reset_index())

And that'll give you:

BC XY UV    b    c
0   x  u  222  111
1   x  v  444  333
2   y  u   22   11
3   y  v   44   33
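
To get the `B_Val` / `C_Val` style column names asked for in the question, one option is to flatten the column labels after unstacking. This is a minimal sketch rather than part of the original answer: the DataFrame is rebuilt inline from the question's data, and the name `wide` and the upper-case-plus-`_Val` naming scheme are illustrative assumptions.

```python
import pandas as pd

# Same data as in the question, built inline so the snippet is self-contained.
df = pd.DataFrame({
    "XY": list("yyyyxxxx"),
    "UV": list("uuvvuuvv"),
    "BC": list("cbcbcbcb"),
    "Val": [11, 22, 33, 44, 111, 222, 333, 444],
})

# Select the single value column first so unstack yields flat column labels
# (one column per distinct BC value, however many there are).
wide = df.set_index(["XY", "UV", "BC"])["Val"].unstack("BC")

# Rename each BC value to e.g. "B_Val"; assigning a plain list also drops
# the "BC" name from the columns index.
wide.columns = [f"{str(col).upper()}_Val" for col in wide.columns]

# Turn the XY/UV index levels back into ordinary columns.
wide = wide.reset_index()
print(wide)
```

Because the new names are derived from whatever values appear in `BC`, this should carry over to more than two distinct values. Note that the rows come back sorted by `XY` and `UV` rather than in their original order.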

4 Comments

How do I rename the column index from 'BC' to just 'index'?
That's actually the name of the columns index level. Try xtab.columns.names = [] or maybe it's xtab.columns.index.names = []
I got ValueError: Length of new names must be 1, got 0
Try filling the list with None or an empty string.
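
For the leftover 'BC' label discussed in these comments, a sketch that should work is to clear the name of the columns index directly. The DataFrame is rebuilt inline here so the snippet stands alone; `xtab` matches the name used in the answer.

```python
import pandas as pd

# Same data as in the question, built inline so the snippet is self-contained.
df = pd.DataFrame({
    "XY": list("yyyyxxxx"),
    "UV": list("uuvvuuvv"),
    "BC": list("cbcbcbcb"),
    "Val": [11, 22, 33, 44, 111, 222, 333, 444],
})

xtab = (df.set_index(["XY", "UV", "BC"])
          .unstack(level="BC")["Val"]
          .reset_index())

# After reset_index the columns form a single-level Index whose name is "BC".
# Setting that name to None removes the label from the printed header.
xtab.columns.name = None
print(xtab)
```

Setting the name to a string instead of `None` (for example `'index'`) would relabel it rather than remove it.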
2

IIUC you want to pivot:

In [110]:
df.pivot(index='XY',columns='BC', values='Val')

Out[110]:
BC   b   c
XY        
x   10  20
y   33  44

EDIT

pivot doesn't support multi-index DataFrames, which was one approach I was considering. Instead, you could add a new column that is a composite of the two key columns and use it as the index to pivot on:

In [120]:
df['composite'] = df['XY']+df['UV']
df

Out[120]:
  XY UV BC  Val composite
0  y  u  c   11        yu
1  y  u  b   22        yu
2  y  v  c   33        yv
3  y  v  b   44        yv
4  x  u  c  111        xu
5  x  u  b  222        xu
6  x  v  c  333        xv
7  x  v  b  444        xv

In [121]:
df.pivot(index='composite', columns='BC', values='Val')

Out[121]:
BC           b    c
composite          
xu         222  111
xv         444  333
yu          22   11
yv          44   33
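
As an alternative to the composite column, `pivot_table` accepts a list of index columns, which keeps `XY` and `UV` separate. This is a sketch under one assumption: each (`XY`, `UV`, `BC`) combination occurs at most once, so `aggfunc="first"` simply passes the single value through instead of averaging. The name `wide` is illustrative.

```python
import pandas as pd

# Same data as in the question, built inline so the snippet is self-contained.
df = pd.DataFrame({
    "XY": list("yyyyxxxx"),
    "UV": list("uuvvuuvv"),
    "BC": list("cbcbcbcb"),
    "Val": [11, 22, 33, 44, 111, 222, 333, 444],
})

# pivot_table takes several index columns; aggfunc="first" keeps the one
# value per group as-is (the default aggregation would compute a mean).
wide = (df.pivot_table(index=["XY", "UV"], columns="BC",
                       values="Val", aggfunc="first")
          .reset_index())
print(wide)
```

If duplicates of a key combination can occur, a different `aggfunc` (for example `"sum"`) would need to be chosen deliberately.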

3 Comments

What does the index= clause look like if I have more than 1 column analogous to XY? I'm sorry for adding a question like this.
I really don't understand what you're asking. As I've stated before, unless you post a question with representative data and desired output, it becomes difficult to answer speculative questions.
Please take a look at my improved example. Thanks
1

You can also use a MultiIndex and unstack like this:

df=df.set_index(['XY','UV','BC'])
df=df.unstack('BC')
