Create several DataFrames from an existing DataFrame based on unique values

Question

My DataFrame looks like this:

x   y   z   b
1   2   3   Max
12  32  8   Max
1   2   3   Jon
12  32  8   Max
1   25  3   Jon
12  32  81  Anna

Based on column b, I need to take the unique values (in this case: Max, Jon, Anna) and create 3 new DataFrames like this:

df_1:

x  y  z  b
1  2  3  Max
12 32 8  Max
12 32 8  Max

df_2:

x  y   z  b
1  2   3  Jon
1  25  3  Jon

df_3:

x  y  z   b
12 32 81  Anna

I searched for an answer, but I don't know how to create the new DataFrames. Do you have any ideas? Of course, the original DataFrame has more unique values.

Regards, Tomasz

df[df['b'] == 'Max'] etc.

– RJ Adriaansen, Commented Sep 1, 2021 at 9:44
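
The boolean-mask idea from this comment generalizes to every unique value in the column. A minimal sketch, assuming the question's sample data rebuilt by hand; the name `frames` is illustrative and does not come from the original post:

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'x': [1, 12, 1, 12, 1, 12],
    'y': [2, 32, 2, 32, 25, 32],
    'z': [3, 8, 3, 8, 3, 81],
    'b': ['Max', 'Max', 'Jon', 'Max', 'Jon', 'Anna'],
})

# One filtered DataFrame per unique value of 'b', in order of first appearance.
frames = [df[df['b'] == value] for value in df['b'].unique()]

for frame in frames:
    print(frame, end='\n\n')
```

Each mask scans the whole column, so for many unique values a single `groupby` pass, as in the answers, is likely the cheaper option.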

Corralien · Accepted Answer · 2021-09-01 11:09:31Z

1

Use locals() to create variables dynamically (this works at module level, where locals() is the same dictionary as globals()):

Update

Do you have any idea how to use the unique names instead of calling the DataFrames DF_1, DF_2, DF_3? I mean DF_Max, DF_Jon, DF_Anna, and then save every DataFrame to Excel?

for name, subdf in df.groupby('b', sort=False):
    locals()[f'df_{name}'] = subdf
    subdf.to_excel(f'{name}.xlsx', index=False)

>>> df_Max
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max


>>> df_Jon
   x   y  z    b
2  1   2  3  Jon
4  1  25  3  Jon


>>> df_Anna
    x   y   z     b
5  12  32  81  Anna

Old answer

for i, (_, subdf) in enumerate(df.groupby('b', sort=False), 1):
    locals()[f'df_{i}'] = subdf

>>> df_1
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max

>>> df_2
   x   y  z    b
2  1   2  3  Jon
4  1  25  3  Jon

>>> df_3
    x   y   z     b
5  12  32  81  Anna

https://stackoverflow.com/a/68969956/15239951

https://stackoverflow.com/a/68268034/15239951

edited Sep 1, 2021 at 11:09

answered Sep 1, 2021 at 9:45

Corralien


6 Comments

mozway Over a year ago

IMHO, this is a very bad practice to suggest; at a minimum I would add a warning. It can have unexpected consequences, such as overwriting existing variables.
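
One concrete pitfall related to this warning: in CPython, assigning into `locals()` only appears to work at module level. Inside a function it does not create a real local variable. A minimal sketch, using a hypothetical helper `split_in_function` and a shortened version of the sample data (neither is from the original answer):

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 12, 1], 'b': ['Max', 'Max', 'Jon']})


def split_in_function(frame):
    """Try the locals() trick inside a function and report whether it worked."""
    for name, subdf in frame.groupby('b', sort=False):
        # This writes into a dictionary, not into a real local variable.
        locals()[f'df_{name}'] = subdf
    try:
        # df_Max is never assigned in this function, so Python looks it up
        # as a global name and does not find it.
        return df_Max is not None
    except NameError:
        return False


created = split_in_function(df)
print(created)
```

Here `created` should come out as `False`, which is one reason a dictionary of DataFrames tends to be the safer choice.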

Tmiskiewicz Over a year ago

Thanks mate! Do you maybe have an idea how to use the unique names instead of calling the DataFrames DF_1, DF_2, DF_3? I mean DF_Max, DF_Jon, DF_Anna, and then save every DataFrame to Excel?

Corralien Over a year ago

I updated my answer according to your comment.

Tmiskiewicz Over a year ago

You are my today's hero bro :D THANKS A LOT!

mozway Over a year ago

@Tmiskiewicz keep in mind that setting variables like this is a very bad practice, especially since here you really do not need to do this just to save a file
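
As this comment notes, saving one file per group does not require any new variables. A minimal sketch, assuming CSV output into a temporary directory so that it runs without an Excel engine installed; `out_dir` and `written` are illustrative names, and `to_excel` could be swapped in when a writer such as openpyxl is available:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'x': [1, 12, 1, 12, 1, 12],
    'y': [2, 32, 2, 32, 25, 32],
    'z': [3, 8, 3, 8, 3, 81],
    'b': ['Max', 'Max', 'Jon', 'Max', 'Jon', 'Anna'],
})

out_dir = Path(tempfile.mkdtemp())  # assumption: any writable directory works
written = []

# Write each group straight from the groupby iterator, one file per name.
for name, subdf in df.groupby('b', sort=False):
    path = out_dir / f'{name}.csv'
    subdf.to_csv(path, index=False)
    written.append(path.name)

print(written)
```

Group names that contain characters not allowed in file names would need sanitizing first; that step is left out here.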


mozway · Accepted Answer · 2021-09-01 09:45:33Z

1

You can groupby('b') and make a dictionary:

dfs = {k:v for k,v in df.groupby('b')}

A dictionary is a convenient structure for storing arbitrary keys, especially when you do not know the number of groups in advance.

You can then access the dataframes by key:

>>> dfs['Max']
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max
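
Once the groups are in a dictionary, its keys double as the list of unique values, and a single group can also be fetched directly from the groupby object with `get_group`. A short sketch, assuming the question's sample data; `max_rows` is an illustrative name:

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'x': [1, 12, 1, 12, 1, 12],
    'y': [2, 32, 2, 32, 25, 32],
    'z': [3, 8, 3, 8, 3, 81],
    'b': ['Max', 'Max', 'Jon', 'Max', 'Jon', 'Anna'],
})

dfs = {key: group for key, group in df.groupby('b')}

# Keys come out sorted because groupby sorts group keys by default.
print(list(dfs))

# Each value is an ordinary DataFrame that keeps the original row index.
for key, group in dfs.items():
    print(key, len(group))

# Fetch one group without building the dictionary.
max_rows = df.groupby('b').get_group('Max')
```

Passing `sort=False` to `groupby` would keep the keys in order of first appearance instead.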

answered Sep 1, 2021 at 9:45

mozway


1 Comment

Corralien Over a year ago

@mozway. I did the same.

U13-Forward · Accepted Answer · 2021-09-01 09:44:05Z

0

Try this:

>>> Anna, Jon, Max = list(zip(*df.groupby('b')))[1]

Or:

>>> Anna, Jon, Max = [x for _, x in df.groupby('b')]
>>> Anna
    x   y   z     b
5  12  32  81  Anna
>>> Jon
   x   y  z    b
2  1   2  3  Jon
4  1  25  3  Jon
>>> Max
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max
>>>
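
Note that this unpacking relies on two things: `groupby` sorts the group keys by default, so the names on the left must be written in sorted order, and the number of names must equal the number of groups. A small sketch, assuming the question's data, that makes both points visible; `keys` and `error` are illustrative names:

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'x': [1, 12, 1, 12, 1, 12],
    'y': [2, 32, 2, 32, 25, 32],
    'z': [3, 8, 3, 8, 3, 81],
    'b': ['Max', 'Max', 'Jon', 'Max', 'Jon', 'Anna'],
})

# The order in which groups are yielded, which the unpacking depends on.
keys = [key for key, _ in df.groupby('b')]
print(keys)

# Unpacking into the wrong number of names raises ValueError.
error = None
try:
    first, second = [group for _, group in df.groupby('b')]
except ValueError as exc:
    error = str(exc)
    print(error)
```

Because the real data has more unique values than the sample, a dictionary keyed by group name avoids having to maintain this list of names by hand.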

answered Sep 1, 2021 at 9:44

U13-Forward



Rinshan Kolayil · Accepted Answer · 2021-09-01 13:20:59Z

0

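import pandas as pd

# Rebuild the sample data (column names are upper-case in this answer).
df = pd.DataFrame({'X': [1, 12, 1, 12, 1, 12],
                   'Y': [2, 32, 2, 32, 25, 32],
                   'Z': [3, 8, 3, 8, 3, 81],
                   'B': ["Max", "Max", "Jon", "Max", "Jon", "Anna"]})
gb = df.groupby('B')
out = {}
# Store each group's DataFrame under its group name.
for name, group in gb:
    out[name] = group
print(out['Max'])
# Output
#     X   Y  Z    B
# 0   1   2  3  Max
# 1  12  32  8  Max
# 3  12  32  8  Max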

Method 2

out = dict(tuple(df.groupby('B')))

EDIT

You can try vars() or globals() as well

import pandas as pd
df = pd.DataFrame({'X': [1,12,1,12,1,12],
                   'Y': [2,32,2,32,25,32],
                   'Z': [3,8,3,8,3,81],
                   'B': ["Max","Max","Jon","Max","Jon","Anna"]})
for name, group in df.groupby('B'):
    vars()[f"DF_{name}"] = group

edited Sep 1, 2021 at 13:20

answered Sep 1, 2021 at 9:55

Rinshan Kolayil

