0

Using Python pandas, how can we reshape the following data frame?

  • input

    id1 AAA 12
    id1 BBB 2
    id2 DDD 3
    id2 AAA 23
    id3 FFF 34
    id3 AAA 5
    id3 BBB 65
    
  • output

        id1 id2 id3
    AAA  12  23   5
    BBB   2   0  65
    DDD   0   3   0
    FFF   0   0  34
    

2 Answers

4

I think the pivot_table function is what you are looking for.

import pandas as pd

# Sample data in long format: one row per (id, letters) pair
rows = [["id1", "AAA", 12], ["id2", "BBB", 2], ["id3", "CCC", 1],
        ["id1", "BBB", 4], ["id2", "AAA", 1], ["id3", "AAA", 3]]
df = pd.DataFrame(rows, columns=["id", "letters", "numbers"])
df.pivot_table(values="numbers", index="letters", columns="id").reset_index()

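It works like a pivot table in Excel. If the same index/column pair occurs more than once, the values are aggregated; the default aggregation is the mean, but you can pass aggfunc="sum" to sum them instead. Combinations that do not occur in the data come out as NaN unless you pass fill_value=0.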
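Applied to the data from the question, a minimal sketch might look like the following. The column names `id`, `symbol` and `value` are assumptions, since the question's input has no header row; `aggfunc="sum"` and `fill_value=0` are passed explicitly so that duplicate pairs are summed and missing combinations show up as 0.

```python
import pandas as pd

# Data from the question; the column names are assumed for illustration
rows = [
    ("id1", "AAA", 12),
    ("id1", "BBB", 2),
    ("id2", "DDD", 3),
    ("id2", "AAA", 23),
    ("id3", "FFF", 34),
    ("id3", "AAA", 5),
    ("id3", "BBB", 65),
]
df = pd.DataFrame(rows, columns=["id", "symbol", "value"])

# One row per symbol, one column per id; absent combinations become 0
wide = df.pivot_table(
    values="value",
    index="symbol",
    columns="id",
    aggfunc="sum",
    fill_value=0,
)
print(wide)
```

The result is indexed by `symbol`; call `wide.reset_index()` if you would rather have it as an ordinary column.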


1

You can use unstack() and fillna() to get your expected output.

from io import StringIO

import pandas as pd

new_data = StringIO("""id Symbol Value
id1 AAA 12
id1 BBB 2
id2 DDD 3
id2 AAA 23
id3 FFF 34
id3 AAA 5
id3 BBB 65""")

# Read the whitespace-separated text, using id and Symbol as a MultiIndex
df = pd.read_csv(new_data, sep=r"\s+", index_col=[0, 1], skipinitialspace=True)
# Move the id level into the columns, then fill missing combinations with 0
df_soln = df.unstack(level=0).fillna(0)
print(df_soln)

giving you

       Value            
id       id1   id2   id3
Symbol                  
AAA     12.0  23.0   5.0
BBB      2.0   0.0  65.0
DDD      0.0   3.0   0.0
FFF      0.0   0.0  34.0

If you don't want the Value top-level showing, just do the following.

df_soln.columns = [c[-1] for c in df_soln.columns]
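As an alternative sketch, you could select the Value column before unstacking, so the result never gets the extra column level in the first place. This assumes the same whitespace-separated input as above; passing `fill_value=0` to `unstack()` replaces the separate `fillna(0)` step and keeps the values as integers.

```python
from io import StringIO

import pandas as pd

text = """id Symbol Value
id1 AAA 12
id1 BBB 2
id2 DDD 3
id2 AAA 23
id3 FFF 34
id3 AAA 5
id3 BBB 65"""

df = pd.read_csv(StringIO(text), sep=r"\s+")

# Index by (Symbol, id), take the Value Series, and move id into the columns
wide = df.set_index(["Symbol", "id"])["Value"].unstack(fill_value=0)
print(wide)
```

Because a Series is unstacked here rather than a DataFrame, the columns are simply id1, id2 and id3.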
