2

Given data

import pandas as pd

df = pd.DataFrame(
    {
        'c': ['p1', 'p2', 'p3'],
        'v': [ 2  ,  8  ,  3],
    }
)

This outputs

    c  v  
0  p1  2   
1  p2  8   
2  p3  3   

I'm wondering how to create the following using pandas:

    c  v  p1  p2  p3
0  p1  2   2   0   0
1  p2  8   0   8   0
2  p3  3   0   0   3

The approach needs to scale to 1000 rows rather than 3, so nothing can be hard-coded.

edit

my current approach is as follows :

df = pd.DataFrame(
    {
        'c': ['p1', 'p2', 'p3'],
        'v': [ 2  ,  8  ,  3],
    }
)

# create columns with zero 
for p in df['c']:
    df[p] = 0
# iterate over columns, set values 
for p in df['c']:
    # get value
    value = df.loc[ df.loc[:,'c']==p, 'v']
    # get the location of the element to set
    idx=df.loc[:,'c']==p
    df.loc[idx,p]=value

This outputs the correct result, but it feels like a very clunky approach.

Edit two

The solution must work for the following data :

df = pd.DataFrame(
    {
        'c': ['p1', 'p2', 'p3', 'p1'],
        'v': [ 2  ,  8  ,  3, 4],
    }
)

returning

    c  v  p1  p2  p3
0  p1  2   2   0   0
1  p2  8   0   8   0
2  p3  3   0   0   3
3  p1  4   4   0   0

Meaning that the approach of using a pivot table as

piv = df.pivot_table(index='c', columns='c', values='v', fill_value=0)
df = df.join(piv.reset_index(drop=True))

wouldn't work, although for the original data set it was fine.

4 Answers

2

Multiply the indicator DataFrame created by get_dummies by column v, then use DataFrame.join to add the result to the original:

df1 = df.join(pd.get_dummies(df["c"]).mul(df['v'], axis=0))
print (df1)
    c  v  p1  p2  p3
0  p1  2   2   0   0
1  p2  8   0   8   0
2  p3  3   0   0   3

EDIT: the same code also works for the updated data, where p1 appears twice:

df1 = df.join(pd.get_dummies(df["c"]).mul(df['v'], axis=0))
print (df1)
    c  v  p1  p2  p3
0  p1  2   2   0   0
1  p2  8   0   8   0
2  p3  3   0   0   3
3  p1  4   4   0   0

Details:

# indicator columns (dtype=int keeps 0/1 instead of booleans on pandas 2.x)
print (pd.get_dummies(df["c"], dtype=int))
   p1  p2  p3
0   1   0   0
1   0   1   0
2   0   0   1
3   1   0   0

# each indicator row is multiplied by the matching value in column v
print (pd.get_dummies(df["c"]).mul(df['v'], axis=0))
   p1  p2  p3
0   2   0   0
1   0   8   0
2   0   0   3
3   4   0   0
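
For reuse, the two steps above can be wrapped in a small helper. This is only a sketch: the function name `spread_values` and its parameters are made up for illustration, and it assumes the DataFrame has a unique index and that no label in c collides with an existing column name (otherwise `join` raises).

```python
import pandas as pd


def spread_values(df, label_col="c", value_col="v"):
    """Hypothetical helper: one column per label, holding the row's value."""
    # dtype=int gives 0/1 indicators on every pandas version
    indicators = pd.get_dummies(df[label_col], dtype=int)
    # axis=0 aligns the values with the rows of the indicator frame
    return df.join(indicators.mul(df[value_col], axis=0))


df = pd.DataFrame(
    {
        'c': ['p1', 'p2', 'p3', 'p1'],
        'v': [2, 8, 3, 4],
    }
)
out = spread_values(df)
print(out)
```

Nothing here depends on the number of rows or on the labels being unique, so it should scale to the larger data without hard-coded column names.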

7 Comments

Does this depend on there being as many p_i values as there are rows in the data? Because that's not necessarily true (see update).
@baxx - My solution works in that case too; check the edited answer.
Yeah, this works well, and is the only one that works with the data I have locally as well, so I've translated something poorly it seems. Thanks though.
@baxx - You tested the other solution, with pivot_table from Erfan's answer; it works only if the values in c are unique.
yes, the one just with get_dummies and join didn't work with the data i have (~3400 rows) for some reason, I'm not too sure. I should try to get a subset of the data, obfuscate it, and include it into the OP so that you (and others) can see any mistakes, and why your solution was (in this case) the most ideal
2

Use

  • pd.get_dummies() - Convert categorical variable into dummy/indicator variables.

  • df.join() - Join columns of another DataFrame.

Example:

import pandas as pd
df = pd.DataFrame(
    {
        'c': ['p1', 'p2', 'p3'],
        'v': [ 2  ,  8  ,  3],
    }
)
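# dtype=int keeps the indicators numeric (pandas 2.x defaults to booleans)
s = pd.get_dummies(df["c"], dtype=int)
# work on an explicit copy; writing through s.values is not guaranteed to reach s
arr = s.to_numpy(copy=True)
arr[arr != 0] = df['v'].to_numpy()
s = pd.DataFrame(arr, index=s.index, columns=s.columns)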
df = df.join(s)
print(df)

Output:

    c  v  p1  p2  p3
0  p1  2   2   0   0
1  p2  8   0   8   0
2  p3  3   0   0   3


1

You can use a NumPy matrix. This builds one column per row, so it assumes every value in c is unique.

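import numpy as np

n = df['c'].shape[0]
# np.int was removed in NumPy 1.24; the builtin int does the same job
t = np.zeros((n, n), dtype=int)
# put the values of v on the main diagonal
np.fill_diagonal(t, df['v'].to_numpy())
t = pd.DataFrame(t, columns = df['c'])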

df = pd.concat([df,t], axis=1)

df:

    c   v   p1  p2  p3
0   p1  2   2   0   0
1   p2  8   0   8   0
2   p3  3   0   0   3
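
If the labels in c can repeat, the diagonal no longer lines up with the columns. A possible variation, sketched here rather than taken from the answer above, swaps the diagonal fill for integer codes from `pd.factorize` and writes each value at (row, code) with NumPy indexing. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'c': ['p1', 'p2', 'p3', 'p1'],
        'v': [2, 8, 3, 4],
    }
)

# codes[i] is the column position of row i's label; labels holds the distinct values
codes, labels = pd.factorize(df['c'])

# one row per input row, one column per distinct label
t = np.zeros((len(df), len(labels)), dtype=df['v'].dtype)
t[np.arange(len(df)), codes] = df['v'].to_numpy()

out = pd.concat([df, pd.DataFrame(t, columns=labels, index=df.index)], axis=1)
print(out)
```

`pd.factorize` numbers the labels in order of first appearance, so the new columns come out as p1, p2, p3 for this data.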

1 Comment

nice to see an alternative approach (although the question was around pandas - still good to see a different take though)
1

Using pivot_table:

piv = df.pivot_table(index='c', columns='c', values='v', fill_value=0)
df = df.join(piv.reset_index(drop=True))

Output

    c  v  p1  p2  p3
0  p1  2   2   0   0
1  p2  8   0   8   0
2  p3  3   0   0   3
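
When c contains repeated labels, `pivot_table(index='c', ...)` collapses those rows into one, so the join no longer matches the original rows. One way to stay with a reshaping approach, sketched here as an alternative and not as the method above, is `DataFrame.pivot` keyed on the existing row index. It assumes that index is unique.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'c': ['p1', 'p2', 'p3', 'p1'],
        'v': [2, 8, 3, 4],
    }
)

# with no index argument, pivot keeps the existing row index,
# so every input row stays a separate row in the wide frame
wide = df.pivot(columns='c', values='v')

# cells without a value come back as NaN; turn them into integer zeros
wide = wide.fillna(0).astype(int)

out = df.join(wide)
print(out)
```

Because the wide frame keeps the original index, `join` lines it up with df directly and no `reset_index` is needed.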
