Suppose I have a dataframe:

DF1:
Class | Age | City        | Color
  A   | 20  | Los Angeles | Blue
  A   | 20  | Los Angeles | Blue
  A   | 20  | Los Angeles | Red
  B   | 25  | Phoenix     | Yellow

I'd like to count how many times each distinct row occurs, with duplicates collapsed into a single row, so the output looks like this:

DF2:
Class | Age | City        | Color   | Count
  A   | 20  | Los Angeles | Blue    |  2
  A   | 20  | Los Angeles | Red     |  1
  B   | 25  | Phoenix     | Yellow  |  1

In this case, the combination of Class A, Age 20, City Los Angeles, and Color Blue shows up twice. I've tried using nunique, but my output did not collapse the duplicate values together.

df = df.groupby(['Class', 'Age', 'City', 'Color']).nunique()
  • What did df.groupby(['Class', 'Age', 'City', 'Color']).nunique() return? Commented Jan 16, 2019 at 18:19

1 Answer

You could use size, which counts the rows in each group (nunique instead counts distinct values within a column):

result = df.groupby(['Class', 'Age', 'City', 'Color']).size().reset_index(name='Count')
print(result)

Output

  Class  Age         City   Color  Count
0     A   20  Los Angeles    Blue      2
1     A   20  Los Angeles     Red      1
2     B   25      Phoenix  Yellow      1
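
For reference, here is a self-contained sketch that rebuilds the sample data from the question and runs the same call. The variable names (`keys`, `by_size`, `by_value_counts`, `colors_per_class`) are only illustrative, and the `DataFrame.value_counts` variant assumes pandas 1.1 or later. The last lines show what nunique measures, which is why it does not give a per-row count.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Class": ["A", "A", "A", "B"],
    "Age": [20, 20, 20, 25],
    "City": ["Los Angeles", "Los Angeles", "Los Angeles", "Phoenix"],
    "Color": ["Blue", "Blue", "Red", "Yellow"],
})

keys = ["Class", "Age", "City", "Color"]

# size() counts the rows in each group; reset_index turns the
# group keys back into columns and names the count column.
by_size = df.groupby(keys).size().reset_index(name="Count")
print(by_size)

# Alternative (assumes pandas >= 1.1): DataFrame.value_counts counts
# distinct rows directly. Sorting by the keys makes the row order
# explicit instead of relying on the default ordering.
by_value_counts = (
    df.value_counts(subset=keys)
    .reset_index(name="Count")
    .sort_values(keys)
    .reset_index(drop=True)
)
print(by_value_counts)

# For contrast: nunique counts distinct values of a column per group,
# not the number of rows. Class A has two distinct colors (Blue, Red).
colors_per_class = df.groupby("Class")["Color"].nunique()
print(colors_per_class)
```

Both `by_size` and `by_value_counts` should contain the same three rows as the output above.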