Selecting a dataframe using another dataframe?

Question

I regularly run into a scenario where I have a dataframe with a three-level MultiIndex. I then reduce that dataframe to two levels (for instance, to get the mean or the size per group) and make a subselection of the result (of those means, for instance), which I then want to apply back to the original dataframe.

I just can't get this to work. I have tried slicing, loc (which raises an error), and so on, without success.

How do you do this? Example:

import pandas as pd
import numpy as np

df1 = pd.DataFrame.from_dict({'Alpha': 'a a b b c'.split(), 
                    'Word': 'one one three two three'.split(),
                    'AnotherWord':'alpha alpa beta bèta gamma'.split(),
                    'Random1': list(np.random.randint(0,20,5)),
                    'Random2':list(np.random.randint(0,200,5)),
                    'Random3':list(np.random.randint(0,100,5))}
                    )
df1.set_index(['Alpha', 'Word', 'AnotherWord'], inplace=True)

>>> df1
                         Random1  Random2  Random3
Alpha Word  AnotherWord                           
a     one   alpha              9      123       34
            alpa              18        9       77
b     three beta              10      110       33
      two   bèta              11      153       88
c     three gamma              9      130        6

filtered = df1.groupby(['Alpha', 'Word']).size()
>>> filtered
Alpha  Word 
a      one      2
b      three    1
       two      1
c      three    1
dtype: int64

Now I want to filter on filtered == 1:

Result should be:

                         Random1  Random2  Random3
Alpha Word  AnotherWord                               
b     three beta              10      110       33
      two   bèta              11      153       88
c     three gamma              9      130        6

In this case I have not performed any filtering, but I do want to add the data to df1.

BENY · Accepted Answer · 2018-07-12 14:42:18Z


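You can use transform:

# transform('size') broadcasts each group's size back onto the rows of df1
s = df1.groupby(['Alpha', 'Word']).Random1.transform('size')
# s shares df1's index, so s == 1 works directly as a boolean mask
df1[s == 1]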
Out[58]: 
                         Random1  Random2  Random3
Alpha Word  AnotherWord                           
b     three beta              15       68       79
      two   bèta              15       87       85
c     three gamma              8       14       26
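
To illustrate why this works, here is a minimal sketch. It assumes fixed values in place of the question's random columns so that the result is reproducible. The `isin` variant at the end is not part of the answer above; it is one possible way to reuse the two-level `filtered` Series from the question, and the names `keep`, `mask`, `by_transform` and `by_isin` are only illustrative.

```python
import pandas as pd

# Fixed values stand in for the question's random columns (an assumption,
# made only so the example is reproducible).
df1 = pd.DataFrame({
    'Alpha': 'a a b b c'.split(),
    'Word': 'one one three two three'.split(),
    'AnotherWord': 'alpha alpa beta bèta gamma'.split(),
    'Random1': [9, 18, 10, 11, 9],
    'Random2': [123, 9, 110, 153, 130],
    'Random3': [34, 77, 33, 88, 6],
}).set_index(['Alpha', 'Word', 'AnotherWord'])

# transform('size') returns one value per original row (the size of the
# group that row belongs to), indexed like df1, so the comparison below
# yields a boolean mask aligned with df1.
s = df1.groupby(['Alpha', 'Word'])['Random1'].transform('size')
by_transform = df1[s == 1]

# Alternative sketch: start from the reduced two-level Series.
filtered = df1.groupby(['Alpha', 'Word']).size()
keep = filtered[filtered == 1].index  # (Alpha, Word) pairs to keep

# Drop the third level so df1's index can be compared against those pairs.
mask = df1.index.droplevel('AnotherWord').isin(keep)
by_isin = df1[mask]

print(by_transform)
print(by_transform.equals(by_isin))
```

The transform route avoids the alignment problem entirely because the mask already has one entry per row of df1. The `isin` route may be handier when the reduced result (means, sizes, and so on) has already been computed and filtered elsewhere.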

