How to count rows in a dataframe where column values match column values in another dataframe

Question

I have the following sample data

df1 = [[52, '1', '10'], [54, '1', '4'],
       [55, '2', '3'], [52, '1', '10'],
       [55, '2', '10'], [52, '1', '4']]
    
df = pd.DataFrame(df1, columns =['Cow', 'Lact', 'Procedure'])

df2 = [['52', '1'], ['53', '9'],
       ['54', '2'], ['55', '2']]
    
df2 = pd.DataFrame(df2, columns =['Cow', 'Lact'])

The tables look like:

df:

    Cow Lact    Procedure
0   52  1        10
1   54  1        4
2   55  2        3
3   52  1        10
4   55  2        10
5   52  1        4

df2:

I would like to count the number of procedure = 10 for each Cow-Lact combination in df2 and then add a column to df2 called Tproc that includes the count.

The output I am looking for is

    Cow Lact Tproc
0   52  1     2
1   53  9     0
2   54  2     0
3   55  2     1

The following filter does not work:

filt = [(df['Cow']==df2['Cow'])&(df['Lact']==df2['Lact'])&(df['Procedure']==10)]

My plan was then to use .len to get a count

df2['Tproc'] = df2.loc[filt].len

How can you filter a DataFrame based on values in another DataFrame to count the number of rows that satisfy the condition?

Your data types are not the same between columns of same name — Corralien
– Corralien, Commented Mar 3, 2022 at 6:16

Corralien · Accepted Answer · 2022-03-03 07:54:32Z

1

You can use value_counts and reindex your first dataframe before append the values to Tproc column:

df2['Tproc'] = (
    df1[df1['Procedure'] == '10'].value_counts(['Cow', 'Lact'])
       .reindex(pd.MultiIndex.from_frame(df2[['Cow', 'Lact']]), fill_value=0).values
)
print(df2)

# Output
  Cow Lact  Tproc
0  52    1      2
1  53    9      0
2  54    2      0
3  55    2      1

Setup:

df1 = pd.DataFrame({'Cow': [52, 54, 55, 52, 55, 52],
                    'Lact': [1, 1, 2, 1, 2, 1],
                    'Procedure': ['10', '4', '3', '10', '10', '4']})

df2 = pd.DataFrame({'Cow': [52, 53, 54, 55], 'Lact': [1, 9, 2, 2]})

edited Mar 3, 2022 at 7:54

answered Mar 3, 2022 at 6:20

Corralien

121k8 gold badges44 silver badges69 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

JohnH Over a year ago

Corralien I like your code. Thanks for the heads up re the datatypes. In the real dataset the cow and lact variables are integers. When I run the code you wrote with this example it works great. When I run it with the integer variables and remove the df1.astype({'Cow'}:str) like below df2['Tproc'] = ( [df1['Procedure'] == '10'].value_counts(['Cow', 'Lact']) .reindex(df2[['Cow', 'Lact']], fill_value=0).values ) I get the following, TypeError: list indices must be integers or slices, not str also error if I leave in df['Cow']

Corralien Over a year ago

@JohnH. I updated my answer according to your comment. Take care, it's not .value_counts(['Cow', 'Lact']) but .value_counts([['Cow', 'Lact']]).

score 1 · Accepted Answer · 2022-03-03 06:25:37Z

1

You could merge + groupby + sum:

tmp = df2.merge(df.astype(str), on=['Cow','Lact'], how='left')
out = tmp['Procedure'].eq('10').groupby([tmp['Cow'], tmp['Lact']]).sum().reset_index(name='Tproc')

Output:

  Cow Lact  Tproc
0  52    1      2
1  53    9      0
2  54    2      0
3  55    2      1

edited Mar 3, 2022 at 6:25

answered Mar 3, 2022 at 6:20

user7864386

Comments

JayPeerachai · Accepted Answer · 2022-03-03 06:28:42Z

1

use groupby() + size() and then merge()

out = df2.merge(
    df[df['Procedure'] == '10'].groupby(['Cow', 'Lact']).size().reset_index(name='Tproc').astype(str), 
    how='left', 
    on=['Cow','Lact']
).fillna(0)

output:

  Cow Lact Tproc
0 52  1   2
1 53  9   0
2 54  2   0
3 55  2   1

answered Mar 3, 2022 at 6:28

JayPeerachai

3,8523 gold badges19 silver badges34 bronze badges

Collectives™ on Stack Overflow

How to count rows in a dataframe where column values match column values in another dataframe

3 Answers 3

2 Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

2 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related