I have a key-value dataframe:

df = pd.DataFrame(columns=['X','Y','val'], data=[['a','z',5],['b','g',3],['b','y',6],['e','r',9]])
     X Y val
   0 a z   5
   1 b g   3
   2 b y   6
   3 e r   9

Which I'd like to convert into a denser dataframe:

     X z g y r
   0 a 5 0 0 0
   1 b 0 3 6 0
   2 e 0 0 0 9

Before I resort to a pure-Python solution, I was wondering whether there is a simple way to do this with pandas.

  • It's easy to pivot to get this without the empty line of b 0 0 0 0; is that important? Commented Sep 5, 2013 at 17:30
  • Should the 6 be on row 2 rather than row 1? Commented Sep 5, 2013 at 17:59
  • fixed row 2, it was a typo! thanks for pointing this out! Commented Sep 5, 2013 at 18:14

2 Answers

You can use get_dummies:

In [11]: dummies = pd.get_dummies(df['Y'])

In [12]: dummies
Out[12]: 
   g  r  y  z
0  0  0  0  1
1  1  0  0  0
2  0  0  1  0
3  0  1  0  0

and then multiply by the val column:

In [13]: res = dummies.mul(df['val'], axis=0)

In [14]: res
Out[14]: 
   g  r  y  z
0  0  0  0  5
1  3  0  0  0
2  0  0  6  0
3  0  9  0  0

To keep X as part of the index, first apply set_index with append=True:

In [21]: df1 = df.set_index('X', append=True)

In [22]: df1
Out[22]: 
     Y  val
  X        
0 a  z    5
1 b  g    3
2 b  y    6
3 e  r    9

In [23]: dummies = pd.get_dummies(df1['Y'])

In [24]: dummies.mul(df1['val'], axis=0)
Out[24]: 
     g  r  y  z
  X            
0 a  0  0  0  5
1 b  3  0  0  0
2 b  0  0  6  0
3 e  0  9  0  0
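
Putting the get_dummies steps together as a standalone script might look like the sketch below. Two things here are assumptions rather than part of the answer above: dtype=int is passed so the indicator columns are integers (newer pandas versions return booleans by default), and the final groupby, which collapses rows sharing the same X, is only one possible way to get a single row per X.

```python
import pandas as pd

df = pd.DataFrame(columns=['X', 'Y', 'val'],
                  data=[['a', 'z', 5], ['b', 'g', 3], ['b', 'y', 6], ['e', 'r', 9]])

# One indicator column per distinct Y value (columns come out sorted: g, r, y, z).
dummies = pd.get_dummies(df['Y'], dtype=int)

# Scale each row of indicators by that row's val.
res = dummies.mul(df['val'], axis=0)

# Keep one row per original record, with X appended to the index.
per_row = res.set_index(df['X'], append=True)

# Assumption: collapse rows that share the same X by summing them.
dense = res.groupby(df['X']).sum()
print(dense)
```

The per_row frame matches the get_dummies output above, while dense has the same shape as the pivot results.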

If you want a true pivot, with one row per X value (you can also use pivot_table):

In [31]: df.pivot(index='X', columns='Y').fillna(0)
Out[31]: 
   val         
Y    g  r  y  z
X              
a    0  0  0  5
b    3  0  6  0
e    0  9  0  0

Perhaps you want to reset_index, to make X a column (I'm not sure whether that makes sense):

In [32]: df.pivot(index='X', columns='Y').fillna(0).reset_index()
Out[32]: 
   X  val         
Y       g  r  y  z
0  a    0  0  0  5
1  b    3  0  6  0
2  e    0  9  0  0

For completeness, the pivot_table:

In [33]: df.pivot_table('val', 'X', 'Y', fill_value=0)
Out[33]: 
Y  g  r  y  z
X            
a  0  0  0  5
b  3  0  6  0
e  0  9  0  0

In [34]: df.pivot_table('val', 'X', 'Y', fill_value=0).reset_index()
Out[34]: 
Y  X  g  r  y  z
0  a  0  0  0  5
1  b  3  0  6  0
2  e  0  9  0  0

Note: after resetting the index, the columns axis is still named Y. I'm not sure that makes sense, but it is easy to fix via res.columns.name = None.
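
As a rough end-to-end sketch of the pivot_table route, the snippet below uses keyword arguments and clears the leftover columns name. The last step, which reorders the columns to follow the order in which Y values first appear (as in the question's desired output), is an addition not present in the answer. Keep in mind that pivot_table aggregates duplicate X/Y pairs with the mean by default.

```python
import pandas as pd

df = pd.DataFrame(columns=['X', 'Y', 'val'],
                  data=[['a', 'z', 5], ['b', 'g', 3], ['b', 'y', 6], ['e', 'r', 9]])

# Pivot Y values into columns; missing X/Y combinations become 0.
wide = df.pivot_table(values='val', index='X', columns='Y', fill_value=0).reset_index()

# Drop the leftover "Y" label on the columns axis.
wide.columns.name = None

# Assumption: order the value columns by first appearance of each Y in df.
wide = wide[['X', *df['Y'].unique()]]
print(wide)
```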

5 Comments

Hmm. Using get_dummies preserves all the rows the OP wants, but doesn't put the 3 and 6 in the same row; .pivot("X", "Y").fillna(0) puts the 3 and 6 in the same row but loses the 0 row. I'm not sure which is closer to what the OP is after.
Hmmm, that positioning looks wrong. The thing I'm missing atm is the df['X'] col being part of the index
Yeah, I guess it could be an error on the OP's part. +1 anyway. :^)
:) I see what you're saying. Yeah, depends what they are after. If it's the thing OP wrote they should throw away the first index (as that doesn't make much sense)...
Yeah, sorry about not being clear - the pivot tables were all I was looking for... forgot about those. However, after testing out get_dummies, this works out better for what I need to work with. Thank you!

If you want something that feels more direct, something akin to DataFrame.lookup, but for np.put, might make sense.

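import numpy as np

def lookup_index(self, row_labels, col_labels):
    # Translate (row label, column label) pairs into flat, row-major positions.
    ridx = self.index.get_indexer(row_labels)
    cidx = self.columns.get_indexer(col_labels)
    # get_indexer returns -1 for any label it cannot find
    if (ridx == -1).any():
        raise ValueError('One or more row labels was not found')
    if (cidx == -1).any():
        raise ValueError('One or more column labels was not found')
    return ridx * len(self.columns) + cidx

# df is the preallocated dense frame; vals is the key-value frame (X, Y, val)
flat_index = lookup_index(df, vals.X, vals.Y)
# this only updates df if df.values is a writable view (single-dtype frame)
np.put(df.values, flat_index, vals.val.values)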

This assumes that df already has the appropriate columns and index to hold the X/Y values. Here's an IPython notebook: http://nbviewer.ipython.org/6454120
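
A self-contained variant of the same idea could look like the sketch below. It differs from the snippet above in one deliberate way: instead of writing through df.values (which may be a copy or read-only, depending on the pandas version and dtypes), it fills a plain NumPy array and wraps it in a DataFrame afterwards. The names rows, cols, dense and result are illustrative only.

```python
import numpy as np
import pandas as pd

vals = pd.DataFrame(columns=['X', 'Y', 'val'],
                    data=[['a', 'z', 5], ['b', 'g', 3], ['b', 'y', 6], ['e', 'r', 9]])

# Row and column labels, in order of first appearance.
rows = pd.Index(vals['X'].unique())
cols = pd.Index(vals['Y'].unique())

# Positions of each record's labels, then the flat row-major offset.
ridx = rows.get_indexer(vals['X'])
cidx = cols.get_indexer(vals['Y'])
flat_index = ridx * len(cols) + cidx

# Fill a zero array in place, then attach the labels.
dense = np.zeros((len(rows), len(cols)), dtype=vals['val'].dtype)
np.put(dense, flat_index, vals['val'].to_numpy())
result = pd.DataFrame(dense, index=rows, columns=cols)
print(result)
```

If the same X/Y pair occurs more than once, the later value simply overwrites the earlier one, so this only suits data with unique pairs.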
