0

I have a data frame, df, and I'd like to get every column name together with the count of unique values in that column, and save the result as another data frame. I can't find a way to do that. I can, however, print what I want to the console. Here's what I mean:

def counting_unique_values_in_df(df):
    for evry_colm in df:
        print(evry_colm, "-", df[evry_colm].value_counts().count())

That prints what I want just fine. But if, instead of printing, I do something like newdf = pd.DataFrame(evry_colm, df[evry_colm].value_counts().count(), columns = ('a', 'b')), it throws an error that reads "TypeError: object of type 'numpy.int32' has no len()". Obviously, that isn't the right way to build it.

So, how can I make a data frame with columns like columnName and UniqueCounts?

2 Answers

1

To count the unique values in each column, you can use apply together with the nunique function on the data frame. Something like:

import pandas as pd

df = pd.DataFrame([
    {'a': 1, 'b': 2},
    {'a': 2, 'b': 2},
])

count_series = df.apply(lambda col: col.nunique())

# The returned object is a pandas Series:
#   a    2
#   b    1
# To turn it into a DataFrame, try:

pd.DataFrame(count_series).T
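
If you want one row per column instead (the columnName / UniqueCounts layout asked for in the question), a minimal sketch could look like the following. It assumes DataFrame.nunique() is called directly rather than through apply, and the names columnName and UniqueCounts are simply taken from the question:

```python
import pandas as pd

df = pd.DataFrame([
    {'a': 1, 'b': 2},
    {'a': 2, 'b': 2},
])

# DataFrame.nunique() counts distinct non-null values per column
# and returns a Series indexed by column name.
counts = df.nunique()

# Name the index, then move it into a regular column so the result
# is a two-column DataFrame.
result = counts.rename_axis('columnName').reset_index(name='UniqueCounts')
print(result)
```

Note that nunique() ignores NaN by default; pass dropna=False if missing values should count as a distinct value.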

0
import pandas as pd
df = pd.DataFrame({'A': [1, 1, 2, 2], 'B': [1, 2, 3, 4]})
print(df)
print()
df = pd.DataFrame({col: [df[col].nunique()] for col in df})
print(df)

Output:

   A  B
0  1  1
1  1  2
2  2  3
3  2  4

   A  B
0  2  4
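
If you'd rather keep the loop from the question, here is a rough sketch that collects one (name, count) pair per column and builds the data frame once at the end. The column names columnName and UniqueCounts come from the question. The original call most likely failed because the second positional argument of pd.DataFrame is index, so the count was being treated as an index rather than as data:

```python
import pandas as pd

def counting_unique_values_in_df(df):
    # Build one (column name, unique count) tuple per column.
    rows = [(col, df[col].nunique()) for col in df]
    # Each tuple becomes a row; columns= names the two fields.
    return pd.DataFrame(rows, columns=['columnName', 'UniqueCounts'])

df = pd.DataFrame({'A': [1, 1, 2, 2], 'B': [1, 2, 3, 4]})
newdf = counting_unique_values_in_df(df)
print(newdf)
```

This gives one row per column of the input (here A with 2 unique values and B with 4), rather than the single wide row shown above.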
