I want to count how often each value in a column of one dataframe occurs in a column of another dataframe. Right now, I have the following code:

df2["freq"] = df1[["col1"]].groupby(df2["col2"])["col1"].transform('count')

But it gives a freq of 1.0 for every value in df2["col2"], even for values that don't exist in df1["col1"].

df1:

            col1
0            636  
1            636  
2            801  
3            802  

df2:

            col2
0            636  
1            734  
2            801  
3            803  

df2 after adding freq column:

            col2    freq
0            636    1.0
1            734    1.0
2            801    1.0
3            803    1.0

What I actually want:

            col2    freq
0            636     2
1            734     0
2            801     1
3            803     0

I am new to pandas, so I don't understand what I am doing wrong. Any help is appreciated! Thanks!

  • Can you add some data samples, 3-5 rows for both? Commented Aug 21, 2019 at 11:30
  • I don't understand what you mean by "in a column from another dataframe", because you can get a frequency distribution easily: your_column.value_counts() Commented Aug 21, 2019 at 11:34
  • @jezrael I have updated the question to include dataframes. Hopefully my question is clear now. Commented Aug 21, 2019 at 11:47

1 Answer

Use Series.map with the Series returned by Series.value_counts, then replace the missing values with 0 and cast the result back to int:

df2["freq"] = df2["col2"].map(df1["col1"].value_counts()).fillna(0).astype(int)
print(df2)
   col2  freq
0   636     2
1   734     0
2   801     1
3   803     0
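
A likely reason the original groupby returned the same count everywhere: a Series passed as the grouping key is aligned with df1 by index label, so each row of df1 is grouped under the df2 value at the same position rather than under its own value. With four distinct keys, every group holds exactly one row. The sketch below rebuilds the sample frames from the question (the construction code is an assumption, since the question only shows the printed frames) and contrasts the two approaches.

```python
import pandas as pd

# Sample frames rebuilt from the question's printed output (assumed construction).
df1 = pd.DataFrame({"col1": [636, 636, 801, 802]})
df2 = pd.DataFrame({"col2": [636, 734, 801, 803]})

# Original attempt: the key df2["col2"] is aligned to df1 by index, so the
# four rows of df1 fall into four one-row groups and every count is 1.
attempt = df1[["col1"]].groupby(df2["col2"])["col1"].transform("count")
print(attempt.tolist())

# Lookup approach: count each value once in df1, then map df2's values
# onto those counts; values absent from df1 map to NaN and become 0.
counts = df1["col1"].value_counts()
df2["freq"] = df2["col2"].map(counts).fillna(0).astype(int)
print(df2)
```

The fillna(0).astype(int) step matters because map produces NaN for values missing from df1, which turns the column into floats until it is cast back.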