I have an Excel sheet that contains data like the following.

How can I handle this in Python using pandas?

Mainly I want to plot this data in a graph. I also want to find the percentage of people who registered for ANC out of the Estimated Number of Annual Pregnancies, year-wise across the states.

Any ideas would be very helpful.

PS: I am using IPython in an IPython notebook on Linux Mint.

enter image description here

I need the data to be indexed like this:

enter image description here

1 Answer

I would recommend you read in the data frame by skipping rows, then create a dictionary to rename your columns.

Something like the following:

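import pandas as pd

df = pd.read_excel(path, skiprows=8)  # skip the rows above the actual table
mydict = {"Original Col1": "New Col Name1", "Original Col2": "New Col Name2"}
df = df.rename(columns=mydict)  # columns= is required; a bare dict renames index labels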

2 Comments

But this won't let me sort the data by year, right? I want to use a two-level index (or group by year). That is, if I want to compare the estimated pregnancies with those registered for ANC by year, I may not be able to do it this way, right?
You could split them into two separate data frames using the iloc method. Try assigning the first two columns to one df using df.iloc[:, 0:2] and the next two to another using df.iloc[:, 2:4]. Once done, you can use the pivot_table method and index by region and year on each of the dfs you have. You must pass a Python list containing the index columns for your pivot table. Something like the following should work: df.pivot_table(values="Estimated Number Preg", index=["Region", "Year"], aggfunc="sum").
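
To make the pivot_table suggestion concrete, here is a minimal sketch of how the year-wise percentage could be computed. The sample rows and the column names "Region", "Year" and "Registered for ANC" are made up for illustration, since the real sheet's layout is only visible in the screenshots; it also assumes the data has already been reshaped so that each row holds one region and one year.

```python
import pandas as pd

# Hypothetical sample data; the real column names and values will differ.
df = pd.DataFrame({
    "Region": ["State A", "State A", "State B", "State B"],
    "Year": [2010, 2011, 2010, 2011],
    "Estimated Number Preg": [1000, 1200, 500, 400],
    "Registered for ANC": [800, 900, 250, 300],
})

# Aggregate both measures per (Region, Year); the result has a two-level index.
summary = df.pivot_table(
    values=["Estimated Number Preg", "Registered for ANC"],
    index=["Region", "Year"],
    aggfunc="sum",
)

# Percentage of estimated pregnancies that were registered for ANC.
summary["Pct Registered"] = (
    100 * summary["Registered for ANC"] / summary["Estimated Number Preg"]
)

# One row per year and one column per region, a convenient shape for plotting.
pct_by_year = summary["Pct Registered"].unstack("Region")
print(pct_by_year)
```

In a notebook with matplotlib installed, calling pct_by_year.plot() should then draw one line per region against the year.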
