I have a dataframe 'xyz' and I want to create a new column based on a simple calculation, but when I run the code below, the result is NaN.

xyz =

   account_id    date    
0    123        2016-01-01
1    124        2016-01-01
2    125        2016-01-01
3    126        2016-01-01
4    123        2016-01-02
5    124        2016-01-02
6    125        2016-01-02
7    126        2016-01-02

New column I want to create: number of days where I have data per account_id.

Code I'm executing:

xyz['new_column'] = xyz.groupby('account_id').date.nunique()

Result I get:

   account_id    date         new_column
0    123        2016-01-01      NaN
1    124        2016-01-01      NaN
2    125        2016-01-01      NaN
3    126        2016-01-01      NaN
4    123        2016-01-02      NaN
5    124        2016-01-02      NaN
6    125        2016-01-02      NaN
7    126        2016-01-02      NaN

Thanks in advance!

2 Answers

You can use transform:

xyz['new_column'] = xyz.groupby('account_id').date.transform('nunique')
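For context on why the original assignment gives NaN: groupby(...).nunique() returns one value per group, indexed by account_id (123, 124, ...), while xyz is indexed 0 to 7. pandas aligns on index labels when assigning a Series to a column, and none of the labels match, so every row ends up NaN. transform returns a result with the same index as the original frame. A minimal sketch, rebuilding the question's data by hand (the DataFrame construction and the per_account / misaligned names are my own, not from the question):

```python
import pandas as pd

# Rebuild the example frame from the question (construction is an assumption).
xyz = pd.DataFrame({
    "account_id": [123, 124, 125, 126, 123, 124, 125, 126],
    "date": pd.to_datetime(["2016-01-01"] * 4 + ["2016-01-02"] * 4),
})

# One row per group, indexed by account_id rather than by 0..7.
per_account = xyz.groupby("account_id").date.nunique()

# Assigning it aligns on index labels; 0..7 never matches 123..126,
# so the new column is all NaN.
misaligned = xyz.assign(new_column=per_account)

# transform broadcasts each group's result back onto the original rows.
xyz["new_column"] = xyz.groupby("account_id").date.transform("nunique")
print(xyz)
```

With this data every account has rows on two distinct dates, so new_column is 2 for all eight rows.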

1 Comment

This is exactly what I was looking for! Thanks, @Julien Spronck!

Here is an alternate solution:

xyz['new_column'] = xyz.date.map(dict(xyz.date.value_counts()))

2 Comments

Thanks, @AlexG. I ran this, but it counted all instances of the date. Apologies if my question was worded confusingly. My desired outcome is to have counts of account_id per date, but I switched out 'date' with 'account_id' and got the outcome I was looking for. Thanks!
Oops, that is my fault. Cheers.
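
For anyone following the comments, here is a rough sketch of the swapped version (mapping on account_id instead of date). One caveat worth hedging: value_counts counts rows per account_id, which equals the number of distinct dates only if each (account_id, date) pair appears once. The extra duplicate row below is hypothetical, added just to show where the two approaches diverge; the rows_per_account and days_per_account names are mine.

```python
import pandas as pd

# The question's data plus one hypothetical duplicate row (123, 2016-01-02).
xyz = pd.DataFrame({
    "account_id": [123, 124, 125, 126, 123, 124, 125, 126, 123],
    "date": pd.to_datetime(["2016-01-01"] * 4 + ["2016-01-02"] * 5),
})

# Swapped version of the map approach: counts rows per account_id.
rows_per_account = xyz.account_id.map(xyz.account_id.value_counts())

# transform('nunique') counts distinct dates per account_id.
days_per_account = xyz.groupby("account_id").date.transform("nunique")

print(rows_per_account.tolist())  # account 123 shows 3 because of the duplicate
print(days_per_account.tolist())  # account 123 still shows 2 distinct dates
```

If the data has no repeated (account_id, date) rows, both give the same column; otherwise transform('nunique') is the one that matches "number of days per account_id".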
