1

Current Dataframe:

CountryName      IndicatorCode    Year         Value  
Arab World     TX.VAL.MRCH.RS.ZS  1960  1.646954e+01  
Arab World     TX.VAL.MRCH.R1.ZS  1960  2.260207e+00
Arab World     TX.VAL.MRCH.RS.ZS  1961  1.244584e+01
Arab World     TX.VAL.MRCH.R1.ZS  1961  1.860104e+00  
Zimbabwe       DT.DIS.OFFT.CD     2015  8.377700e+07
Zimbabwe       DT.INT.OFFT.CD     2015  2.321300e+07
Zimbabwe       DT.AMT.PROP.CD     2015  6.250000e+05

I want to convert each value of IndicatorCode column as different columns and these columns should contain data from the respective rows of Value column.
For example, after doing reshape:

CountryName Year TX.VAL.MRCH.RS.ZS TX.VAL.MRCH.R1.ZS  
Arab World  1960 1.646954e+01      2.260207e+00
Arab World  1961 1.244584e+01      1.860104e+00

Final Dataframe columns should be:

[CountryName, Year, TX.VAL.MRCH.RS.ZS, TX.VAL.MRCH.R1.ZS, DT.DIS.OFFT.CD,DT.INT.OFFT.CD, DT.AMT.PROP.CD]  

I tried using pivot, but not success. I cannot take Country name as Index also since its not unique.

temp = indicators_df.pivot(columns='IndicatorCode',  values='Value')

Got ValueError: negative dimensions are not allowed

2 Answers 2

4

You can use pivot_table which accepts multiple columns as index, values and columns:

df.pivot_table("Value", ["CountryName", "Year"], "IndicatorCode").reset_index()

enter image description here

Some explanation:

The parameters passed here are by positions, i.e, they are in the order of values, index, and columns or:

df.pivot_table(values = "Value", index = ["CountryName", "Year"], columns = "IndicatorCode").reset_index()

The values are what fill the cells of the final data frame, the index are the columns that get deduplicated and remain as columns in the result, the columns variables are ones that get pivoted to column headers in the result.

Sign up to request clarification or add additional context in comments.

1 Comment

If possible, can you tell what each arguments stand for(A little explanation). I was reading about it now, though not getting it. Thanks.
1

set_index + unstack

s = df.set_index(['CountryName', 'Year', 'IndicatorCode']).Value
s.unstack().reset_index().rename_axis([None], 1)

enter image description here

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.