I have a dataframe containing a categorical variable in one column and a continuous variable in another column like so:

    gender  contVar
    Male     22379
    Female   24523
    Female   23421
    Male     23831
    Male     29234

I want to get a table like so:

    Male    Female
    22379   24523
    23831   23421
    29234

Is this possible in pandas? When I do:

    df.pivot(index = df.index.tolist(), columns='gender', values='contVar') 

I get an error saying the index is out of bounds (obviously because there aren't as many rows as there are indices, but I also presume it's because the number of rows in each column is not equal). Any ideas are appreciated.

1 Answer

You can do:

    pd.concat([pd.DataFrame({g: d.contVar.tolist()}) for g, d in df.groupby('gender')], axis=1)

    Out[416]:
       Female   Male
    0   24523  22379
    1   23421  23831
    2     NaN  29234
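
For comparison, the pivot attempted in the question can also be made to work. This is an alternative sketch, not the approach used above: pivot needs a unique row label for every value, so the idea is to number the rows within each gender first and use that counter as the index. The column name pos is a hypothetical helper introduced only for this example, and the DataFrame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female", "Male", "Male"],
    "contVar": [22379, 24523, 23421, 23831, 29234],
})

# Number the rows within each gender (0, 1, 2, ...), so every value gets a
# unique (row, column) position. "pos" is a hypothetical helper column.
pos = df.groupby("gender").cumcount()

# Pivot on that counter; genders with fewer rows are padded with NaN.
wide = df.assign(pos=pos).pivot(index="pos", columns="gender", values="contVar")
print(wide)
```

As with the concat version, the shorter column is padded with NaN, so the values come back as floats.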

7 Comments

Or is it possible to get two separate lists? I don't really need the pivoted data in a table (with empty cells filled with NaN); even separate lists would do.
I do not use pivot here, and I actually construct a list of dataframes (one for female, the other for male) so you can access 'separate' dataframes from this list.
Doing [d.contVar.tolist() for g,d in df.groupby('gender')] will give you a list of two lists, like this: [[24523, 23421], [22379, 23831, 29234]]
Doing [{g:d.contVar.tolist()} for g,d in df.groupby('gender')] will give you a list of two dicts, like this: [{'Female': [24523, 23421]}, {'Male': [22379, 23831, 29234]}]
It is indeed a list of dictionaries; it is not clear what the OP means by 'two separate lists'.
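
If the goal is simply one list per gender, a single dict keyed by gender may be easier to work with than a list of lists or a list of one-key dicts. A minimal sketch, assuming the same sample data as in the question (the variable name lists is made up for this example):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female", "Male", "Male"],
    "contVar": [22379, 24523, 23421, 23831, 29234],
})

# Collect the contVar values of each gender into a plain Python list,
# then turn the resulting Series into a dict keyed by gender.
lists = df.groupby("gender")["contVar"].apply(list).to_dict()

male = lists["Male"]      # [22379, 23831, 29234]
female = lists["Female"]  # [24523, 23421]
```

No NaN padding is needed here, since the two lists are allowed to have different lengths.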