I have a DataFrame (df) that I'd like to split into 5 DataFrames (named df1 - df5) based on the value of one column (origin). I've tried groupby and a few other things (like this and this), with no success.

My df looks like this:

     origin t_id    Group   ids            ...
0    g2     300     group2  23, 54, 24     ...
1    g      300     group2  1, 89          ...
2    g3     300     group10 155, 4, 90     ...
3    g5     300     group11 38, 13, 45.    ...
4    g4     300     group2  2.             ...

Right now I have it broken up into multiple .loc statements for each unique value of origin, but there must be a cleaner, more concise way to do this.

  • It's hard to help with no illustration. Have a look at "How to make good reproducible pandas examples" and then provide a sample of your data with the expected output. Commented Aug 15, 2019 at 22:55
  • @AlexandreB. done Commented Aug 15, 2019 at 23:12
  • What is your expected output from the table above? Commented Aug 16, 2019 at 0:00

1 Answer

This should do it:


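a = []  # will hold one DataFrame per unique value of 'origin'

for value in df['origin'].unique():
    # the boolean mask keeps only the rows whose origin equals this value
    a.append(df[df['origin'] == value])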

The list a will contain one DataFrame for each unique value of origin, in order of first appearance. Let me know if I misunderstood anything.
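Since the question mentions trying groupby, here is a minimal sketch of the same split done with groupby and collected into a dict keyed by the origin value. The sample data is hypothetical, loosely modelled on the table in the question (with one extra g2 row added so that a group has more than one row), and the name frames is just an illustration.

```python
import pandas as pd

# Hypothetical sample data loosely modelled on the table in the question
df = pd.DataFrame({
    "origin": ["g2", "g", "g3", "g5", "g4", "g2"],
    "t_id": [300, 300, 300, 300, 300, 300],
    "Group": ["group2", "group2", "group10", "group11", "group2", "group2"],
})

# Iterating over a groupby object yields (key, sub-DataFrame) pairs.
# sort=False keeps the groups in order of first appearance.
frames = {origin: sub for origin, sub in df.groupby("origin", sort=False)}

print(list(frames))   # the origin values found in the column
print(frames["g2"])   # all rows whose origin is "g2"
```

Looking frames up by origin value (frames["g2"]) avoids having to remember which position in a list corresponds to which group.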


6 Comments

This returns TypeError: 'method' object is not iterable
Please try again. unique was supposed to be called as a function, i.e. unique().
That seems to run, but then a just prints all the data, not a list of df names.
To access one particular dataframe, you will have to do a[i].
Is there a way to automatically name them in the same block of code, for example df1-df5?
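On the naming question, one possible approach (a sketch, not something from the answer above) is to store the pieces in a dict whose keys are the strings "df1", "df2", and so on, rather than creating separate variables. The sample data and the name named are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with the five origin values from the question
df = pd.DataFrame({
    "origin": ["g2", "g", "g3", "g5", "g4"],
    "t_id": [300, 300, 300, 300, 300],
})

# Number the unique origin values from 1 and use "df1", "df2", ... as keys
named = {
    f"df{i}": df[df["origin"] == value]
    for i, value in enumerate(df["origin"].unique(), start=1)
}

print(list(named))
print(named["df1"])  # rows for the first origin value encountered
```

The numbering follows the order in which origin values first appear, so named["df1"] here holds the g2 rows.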