My DataFrame looks like this:

x  y  z  b
1  2  3  Max
12 32 8  Max
1  2  3  Jon
12 32 8  Max
1  25  3  Jon
12 32 81  Anna

Based on column b, I need to take the unique values (in this case: Max, Jon, Anna) and create 3 new DataFrames like this:

df_1:

x  y  z  b
1  2  3  Max
12 32 8  Max
12 32 8  Max

df_2:

x  y  z   b
1  2  3   Jon
1  25  3  Jon

df_3:

x  y  z   b
12 32 81  Anna

I was looking for an answer, but I don't know how to create the new DataFrames. Do you have any ideas? Of course, the original DataFrame has more unique values.

Regards Tomasz

  • df[df['b'] == 'Max'] etc. Commented Sep 1, 2021 at 9:44

4 Answers

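Use locals() to create variables dynamically. Note that this only works at module level or in an interactive session; inside a function, writing to locals() does not create local variables.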

Update

Do you maybe have an idea how to use the unique names instead of calling the DataFrames DF_1, DF_2, DF_3? I mean DF_Max, DF_Jon, DF_Anna, and saving every DataFrame to Excel?

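# sort=False keeps the groups in order of first appearance
for name, subdf in df.groupby('b', sort=False):
    locals()[f'df_{name}'] = subdf  # creates df_Max, df_Jon, df_Anna
    subdf.to_excel(f'{name}.xlsx', index=False)  # one workbook per group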
>>> df_Max
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max


>>> df_Jon
   x   y  z    b
2  1   2  3  Jon
4  1  25  3  Jon


>>> df_Anna
    x   y   z     b
5  12  32  81  Anna

Old answer

# enumerate(..., 1) numbers the groups from 1: df_1, df_2, df_3
for i, (_, subdf) in enumerate(df.groupby('b', sort=False), 1):
    locals()[f'df_{i}'] = subdf
>>> df_1
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max

>>> df_2
   x   y  z    b
2  1   2  3  Jon
4  1  25  3  Jon

>>> df_3
    x   y   z     b
5  12  32  81  Anna

https://stackoverflow.com/a/68969956/15239951

https://stackoverflow.com/a/68268034/15239951
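
As the comments below point out, the locals() approach has limits. One of them can be checked directly: inside a function, assigning into locals() does not create a usable variable. The following is a minimal sketch; the function name split_inside_function is hypothetical and only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'x': [1, 12, 1, 12, 1, 12],
    'y': [2, 32, 2, 32, 25, 32],
    'z': [3, 8, 3, 8, 3, 81],
    'b': ['Max', 'Max', 'Jon', 'Max', 'Jon', 'Anna'],
})


def split_inside_function(frame):
    # Hypothetical helper: tries the locals() trick inside a function.
    for name, subdf in frame.groupby('b', sort=False):
        locals()[f'df_{name}'] = subdf  # does not create a real local variable
    try:
        return df_Max  # resolved as a global name, which was never defined
    except NameError:
        return None


result = split_inside_function(df)
```

Here result ends up as None because df_Max is never bound, so the trick should be treated as a module-level or interactive-session convenience only.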

6 Comments

IMHO, this is a very bad practice to suggest; at a minimum, I would add a warning. It can have unexpected consequences, such as overwriting variables.
Thanks mate! And do you maybe have an idea how to use the unique names instead of calling the DataFrames DF_1, DF_2, DF_3? I mean DF_Max, DF_Jon, DF_Anna, and saving every DataFrame to Excel?
I updated my answer according to your comment.
You are my today's hero bro :D THANKS A LOT!
@Tmiskiewicz keep in mind that setting variables like this is a very bad practice, especially since here you really do not need to do this just to save a file.

You can use groupby('b') and build a dictionary:

dfs = {k: v for k, v in df.groupby('b')}

A dictionary is an efficient structure for storing the DataFrames under arbitrary keys, especially if you do not know the number of groups in advance.

You can then access the dataframes by key:

>>> dfs['Max']
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max
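
The dictionary is also enough to write one file per group, without creating any variables. Below is a minimal sketch under a few assumptions: it rebuilds the sample data from the question, writes CSV files into a temporary directory (out_dir is a name chosen for this example), and uses sort=False to keep the groups in order of first appearance. Replacing to_csv with to_excel and a .xlsx file name should work the same way, provided an Excel engine such as openpyxl is installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({
    'x': [1, 12, 1, 12, 1, 12],
    'y': [2, 32, 2, 32, 25, 32],
    'z': [3, 8, 3, 8, 3, 81],
    'b': ['Max', 'Max', 'Jon', 'Max', 'Jon', 'Anna'],
})

# sort=False keeps the keys in order of first appearance: Max, Jon, Anna
dfs = {name: group for name, group in df.groupby('b', sort=False)}

# Assumption for this sketch: write into a throwaway temporary directory.
out_dir = Path(tempfile.mkdtemp())
for name, group in dfs.items():
    group.to_csv(out_dir / f'{name}.csv', index=False)

# Names of the files that were written, sorted for a stable listing
written = sorted(p.name for p in out_dir.iterdir())
```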

1 Comment

@mozway. I did the same.

Try this:

>>> Anna, Jon, Max = list(zip(*df.groupby('b')))[1]

Or:

>>> Anna, Jon, Max = [x for _, x in df.groupby('b')]
>>> Anna
    x   y   z     b
5  12  32  81  Anna
>>> Jon
   x   y  z    b
2  1   2  3  Jon
4  1  25  3  Jon
>>> Max
    x   y  z    b
0   1   2  3  Max
1  12  32  8  Max
3  12  32  8  Max
>>> 
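
Note that this unpacking depends on two things: groupby sorts the group keys by default, so the names on the left must be in alphabetical order, and the number of names must match the number of groups exactly. A small sketch of both points follows; the extra row with the name 'Zoe' is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'x': [1, 12, 1, 12, 1, 12],
    'y': [2, 32, 2, 32, 25, 32],
    'z': [3, 8, 3, 8, 3, 81],
    'b': ['Max', 'Max', 'Jon', 'Max', 'Jon', 'Anna'],
})

# With the default sort=True, groups come out in sorted key order,
# not in the order they appear in the data.
names = [name for name, _ in df.groupby('b')]

# Hypothetical extra row: a fourth unique value in column b.
extra_row = pd.DataFrame({'x': [0], 'y': [0], 'z': [0], 'b': ['Zoe']})
extended = pd.concat([df, extra_row], ignore_index=True)

try:
    Anna, Jon, Max = [g for _, g in extended.groupby('b')]
    unpack_ok = True
except ValueError:
    # Four groups cannot be unpacked into three names.
    unpack_ok = False
```

For data where the unique values are not known in advance, a dictionary keyed by group name avoids both problems.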

import pandas as pd

df = pd.DataFrame({'X': [1,12,1,12,1,12],
                   'Y': [2,32,2,32,25,32],
                   'Z': [3,8,3,8,3,81],
                   'B': ["Max","Max","Jon","Max","Jon","Anna"]})
# Collect each group in a dictionary keyed by its value in column B
gb = df.groupby('B')
out = {}
for name, group in gb:
    out[name] = group
print(out['Max'])
# Output
    X   Y   Z   B
0   1   2   3   Max
1   12  32  8   Max
3   12  32  8   Max

Method 2

out = dict(tuple(df.groupby('B')))

EDIT

You can also try vars() or globals():

import pandas as pd
df = pd.DataFrame({'X': [1,12,1,12,1,12],
                   'Y': [2,32,2,32,25,32],
                   'Z': [3,8,3,8,3,81],
                   'B': ["Max","Max","Jon","Max","Jon","Anna"]})
for name, group in df.groupby('B'):
    vars()[f"DF_{name}"] = group
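
One caveat with generated variable names: a group value is not always a valid Python identifier, in which case the variable exists in the namespace but cannot be referred to by name in normal code. A quick check, as a sketch only; the helper is_usable_variable_name and the sample values 'Mary Ann' and 'O-Neil' are made up for illustration.

```python
import keyword


def is_usable_variable_name(name):
    # Hypothetical helper: True if `name` can be written as a plain variable.
    return name.isidentifier() and not keyword.iskeyword(name)


# 'Max' comes from the question; the other two values are invented examples.
candidates = [f'DF_{n}' for n in ['Max', 'Mary Ann', 'O-Neil']]
usable = {c: is_usable_variable_name(c) for c in candidates}
```

Dictionary keys have no such restriction, which is another reason the out dictionary above is the safer container.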
