
Imagine this is my input data:

    data = [("France",    "Paris",      "Male",   "1"),
            ("France",    "Paris",      "Female", "6"),
            ("France",    "Nice",       "Male",   "2"),
            ("France",    "Nice",       "Female", "7"),
            ("Germany",   "Berlin",     "Male",   "3"),
            ("Germany",   "Berlin",     "Female", "8"),
            ("Germany",   "Munchen",    "Male",   "4"),
            ("Germany",   "Munchen",    "Female", "9"),
            ("Germany",   "Koln",       "Male",   "5"),
            ("Germany",   "Koln",       "Female", "10")]

I'd like to put it into a dataframe like this:

Country City       Sex
                   Male     Female
France  Paris       1         6
        Nice        2         7
Germany Berlin      3         8
        Munchen     4         9
        Koln        5         10

The first part is easy:

import pandas as pd

df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])
df = df.set_index(["country", "city"])

Gives me output:

                   sex  count
country city                 
France  Paris      Male     1
        Paris    Female     6
        Nice       Male     2
        Nice     Female     7
Germany Berlin     Male     3
        Berlin   Female     8
        Munchen    Male     4
        Munchen  Female     9
        Koln       Male     5
        Koln     Female    10

So the rows are fine, but now I'd like to move the values of the 'sex' column into a column MultiIndex. Is that possible, and if so, how?

2 Answers


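Starting from the flat DataFrame (before the set_index call in the question), add the sex column to the list passed to set_index and call unstack. If country and city are already in the index, use df.set_index('sex', append=True) instead.

# move all three keys into the index, then pivot the last level ('sex') into the columns
df = df.set_index(["country", "city", "sex"]).unstack()
# data cleaning: drop the column level names and rename the top-level column 'count' to 'Sex'
df = df.rename_axis((None, None), axis=1).rename(columns={'count': 'Sex'})
print(df)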
                   Sex     
                Female Male
country city               
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
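
For anyone who wants to run this end to end, here is a minimal sketch of the same set_index + unstack approach. The names wide, indexed, wide_from_indexed, row_order and ordered are only illustrative, and converting count to int is my assumption (the sample data stores the counts as strings). The last step restores the row and column order from the question, since the output above comes back sorted.

```python
import pandas as pd

data = [("France",  "Paris",   "Male", "1"), ("France",  "Paris",   "Female", "6"),
        ("France",  "Nice",    "Male", "2"), ("France",  "Nice",    "Female", "7"),
        ("Germany", "Berlin",  "Male", "3"), ("Germany", "Berlin",  "Female", "8"),
        ("Germany", "Munchen", "Male", "4"), ("Germany", "Munchen", "Female", "9"),
        ("Germany", "Koln",    "Male", "5"), ("Germany", "Koln",    "Female", "10")]

df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])
df["count"] = df["count"].astype(int)  # assumption: numeric counts are wanted

# Flat frame: put all three keys in the index, then move 'sex' to the columns.
wide = df.set_index(["country", "city", "sex"]).unstack("sex")

# Frame already indexed by country/city (as in the question): append 'sex' first.
indexed = df.set_index(["country", "city"])
wide_from_indexed = indexed.set_index("sex", append=True).unstack("sex")

# Restore the order of first appearance for rows, and Male/Female for columns.
row_order = pd.MultiIndex.from_frame(df[["country", "city"]].drop_duplicates())
ordered = wide.reindex(row_order)[[("count", "Male"), ("count", "Female")]]
print(ordered)
```

Both routes should give the same frame; the only difference is whether sex is added to the index together with the other keys or appended afterwards.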

Another method uses pivot in place of unstack (here the two do almost the same thing), again starting from the flat DataFrame:

df.set_index(['country','city']).pivot(columns='sex')
               
                   count     
sex             Female Male
country city               
France  Nice         7    2
        Paris        6    1
Germany Berlin       8    3
        Koln        10    5
        Munchen      9    4
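
As a variation, the set_index step can probably be skipped by passing the row keys to pivot directly. This is a sketch that assumes pandas 1.1 or later, where index accepts a list of column names; pivoted is an illustrative name and the int conversion is my assumption. Leaving out values keeps count as the top column level, so the result still has MultiIndex columns.

```python
import pandas as pd

data = [("France",  "Paris",   "Male", "1"), ("France",  "Paris",   "Female", "6"),
        ("France",  "Nice",    "Male", "2"), ("France",  "Nice",    "Female", "7"),
        ("Germany", "Berlin",  "Male", "3"), ("Germany", "Berlin",  "Female", "8"),
        ("Germany", "Munchen", "Male", "4"), ("Germany", "Munchen", "Female", "9"),
        ("Germany", "Koln",    "Male", "5"), ("Germany", "Koln",    "Female", "10")]

df = pd.DataFrame(data, columns=["country", "city", "sex", "count"])
df["count"] = df["count"].astype(int)  # assumption: numeric counts are wanted

# Assumes pandas >= 1.1: `index` may be a list of column names.
# `values` is omitted on purpose so that 'count' stays as the top column level.
pivoted = df.pivot(index=["country", "city"], columns="sex")
print(pivoted)
```

Note that pivot raises an error if a (country, city, sex) combination appears more than once, so this only fits data with unique keys like the sample above.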
