Creating new dataframes using groupby

Question

I read "How to create multiple dataframes from pandas groupby object", but I still do not understand how to create a dataframe for each person after creating my grouped_persons group with groupby.

What should I change in this code? I think this is part of my problem: 'df_'+ name +'1'

grouped_persons = df.groupby('Person')
for name, group in grouped_persons
    'df_'+ name +'1' = df.loc[(df.Person == name) & (df.ExpNum == 1)]

File "", line 2
    for name, group in grouped_persons
                                      ^
SyntaxError: invalid syntax

Kay Wittig · Accepted Answer · 2018-07-10 06:47:17Z

Suppose your DataFrame looks like this

import pandas as pd

df = pd.DataFrame([['Tim', 1, 2],
                   ['Tim', 0, 2],
                   ['Claes', 1, 3],
                   ['Claes', 0, 1],
                   ['Emma', 1, 1],
                   ['Emma', 1, 2]], columns=['Person', 'ExpNum', 'Data'])

giving

>>> df
  Person  ExpNum  Data
0    Tim       1     2
1    Tim       0     2
2  Claes       1     3
3  Claes       0     1
4   Emma       1     1
5   Emma       1     2

Then you can get each group's dataframe directly from the pandas groupby object

grouped_persons = df.groupby('Person')

by calling get_group:

>>> grouped_persons.get_group('Emma')
  Person  ExpNum  Data
4   Emma       1     1
5   Emma       1     2

and there is no need to store those separately.
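As a minimal sketch of the same idea, the loop from the question can be written so that it runs: the for line needs a trailing colon, and the group variable is already the sub-dataframe for that person. The row_counts dictionary is only an illustrative name, not part of the original code.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

grouped_persons = df.groupby('Person')

# Note the colon at the end of the for line; omitting it raises a SyntaxError.
row_counts = {}  # hypothetical helper, just to show what the loop yields
for name, group in grouped_persons:
    # `group` is the sub-dataframe holding all rows for `name`
    row_counts[name] = len(group)

# A single group can be fetched on demand instead of being stored up front.
emma = grouped_persons.get_group('Emma')

print(row_counts)
print(emma)
```

Because each group can be looked up by name whenever it is needed, there is no need to build variable names such as 'df_' + name + '1' at all.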

Note: Pandas version used was '0.23.1' but this feature might be available in some earlier versions as well.

Edit: If you are interested in those entries with ExpNum == 1 only, I suggest applying this before the groupby, e.g.

grouped_persons_1 = df[df['ExpNum'] == 1].groupby('Person')
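A short sketch of how the filtered groupby object might then be used, with the same sample data. The names tim_1 and available are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

# Keep only the rows with ExpNum == 1, then group what is left by person.
grouped_persons_1 = df[df['ExpNum'] == 1].groupby('Person')

# Only Tim's ExpNum == 1 row remains in his group.
tim_1 = grouped_persons_1.get_group('Tim')

# Persons that still have at least one row after filtering.
available = sorted(grouped_persons_1.groups)

print(tim_1)
print(available)
```

One thing to keep in mind: a person with no ExpNum == 1 rows would not appear in the filtered groups, so get_group would raise a KeyError for that name.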

jpp · Accepted Answer · 2018-07-10 09:01:03Z

Use a dictionary for a variable number of variables.

One straightforward solution is to use tuple keys representing ('Person', 'ExpNum') combinations. You can achieve this by feeding a groupby object to tuple and then the result to dict.

Data from @KayWittig.

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

df_dict = dict(tuple(df.groupby(['Person', 'ExpNum'])))

print(df_dict)

{('Claes', 0):   Person  ExpNum  Data
               3  Claes       0     1,
 ('Claes', 1):   Person  ExpNum  Data
               2  Claes       1     3,
 ('Emma', 1):   Person  ExpNum  Data
               4   Emma       1     1
               5   Emma       1     2,
 ('Tim', 0):   Person  ExpNum  Data
               1    Tim       0     2,
 ('Tim', 1):   Person  ExpNum  Data
               0    Tim       1     2}
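For illustration, individual dataframes can then be pulled out of the dictionary by their tuple key. This is a sketch with the same sample data; emma_1 and missing are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

df_dict = dict(tuple(df.groupby(['Person', 'ExpNum'])))

# Look up one (Person, ExpNum) combination by its tuple key.
emma_1 = df_dict[('Emma', 1)]

# Combinations that do not occur in the data are simply absent from the dict,
# so .get() returns None here instead of raising a KeyError.
missing = df_dict.get(('Emma', 0))

print(emma_1)
print(missing)
```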

Kavitha Madhavaraj · Accepted Answer · 2018-06-29 06:05:20Z

You can store it in a dictionary like this. I have corrected some syntax errors in your code as well.

    grouped_persons = df.groupby('Person')
    multi_df = {}
    for name, group in grouped_persons:
        multi_df['df_' + name + '1'] = df[(df.Person == name) & (df.ExpNum == 1)]

Now you can get a stored dataframe back with its key, which is 'df_' + name + '1', e.g. multi_df['df_Emma1'] for a person named Emma.
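A possible variant, sketched here under the assumption that the same key scheme is wanted: since the loop already yields each person's rows as group, the ExpNum filter can be applied to group directly instead of re-filtering the whole df.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

# Build the dictionary in one pass; each value is that person's ExpNum == 1 rows.
multi_df = {'df_' + name + '1': group[group['ExpNum'] == 1]
            for name, group in df.groupby('Person')}

print(sorted(multi_df))
print(multi_df['df_Emma1'])
```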
