I have this DataFrame

import pandas as pd
df = pd.DataFrame({'store': [1, 1, 1, 2], 'upc': [11, 22, 33, 11], 'sales': [14, 16, 11, 29]})

which gives this output

   store  upc  sales
0      1   11     14
1      1   22     16
2      1   33     11
3      2   11     29

I want something like this

store upc_11  upc_22  upc_33
    1   14.0    16.0    11.0
    2   29.0    NaN     NaN

I tried this

newdf = df.pivot(index='store', columns='upc')
newdf.columns = newdf.columns.droplevel(0)

and the output still has two header rows, one for upc and one for store:

upc      11    22    33
store                  
1      14.0  16.0  11.0
2      29.0   NaN   NaN

I also tried

newdf = df.pivot(index='store', columns='upc').reset_index()

This also gives a two-level header:

    store sales            
upc          11    22    33
0       1  14.0  16.0  11.0
1       2  29.0   NaN   NaN

2 Answers

Try an f-string with the columns attribute and a list comprehension:

newdf = df.pivot(index='store', columns='upc')
newdf.columns=[f"upc_{y}" for x,y in newdf.columns]
newdf=newdf.reset_index()

Or call reset_index() first and rename afterwards, leaving the store column name untouched:

newdf = df.pivot(index='store', columns='upc').reset_index()
newdf.columns=[f"upc_{y}" if y!='' else f"{x}" for x,y in newdf.columns]
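
Both versions work because, without a values argument, pivot returns the columns as a two-level MultiIndex whose entries are tuples such as ('sales', 11). A small sketch to inspect them (the names wide, column_tuples and reset_tuples are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'store': [1, 1, 1, 2],
                   'upc': [11, 22, 33, 11],
                   'sales': [14, 16, 11, 29]})

# Without values=, the leftover column ('sales') becomes the outer column level
wide = df.pivot(index='store', columns='upc')
column_tuples = wide.columns.tolist()
print(column_tuples)

# After reset_index(), 'store' joins the columns with an empty second level,
# which is why the second comprehension checks for y != ''
reset_tuples = wide.reset_index().columns.tolist()
print(reset_tuples)
```

Unpacking each tuple as x, y in the list comprehension then gives access to the upc value in y.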

Another option, which is longer than @Anurag's:

(df.pivot(index='store', columns='upc')
   .droplevel(axis=1, level=0)
   .rename(columns=lambda col: f"upc_{col}")
   .rename_axis(index=None, columns=None)
)
 
   upc_11  upc_22  upc_33
1    14.0    16.0    11.0
2    29.0     NaN     NaN
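
If only the sales values are needed, passing values='sales' to pivot should avoid the extra column level in the first place, and add_prefix can then do the renaming. A minimal sketch, assuming store should stay a regular column as in the question's desired output:

```python
import pandas as pd

df = pd.DataFrame({'store': [1, 1, 1, 2],
                   'upc': [11, 22, 33, 11],
                   'sales': [14, 16, 11, 29]})

newdf = (
    df.pivot(index='store', columns='upc', values='sales')
      .add_prefix('upc_')          # column labels 11, 22, 33 -> 'upc_11', ...
      .rename_axis(columns=None)   # drop the 'upc' label on the column axis
      .reset_index()               # turn the 'store' index back into a column
)
print(newdf)
```

Unlike the chain above, this keeps store as a column instead of an unnamed index.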
