
I am trying to find the frequency of unique values in a column of a pandas DataFrame. I know how to get the unique values like this:

data_file.visiting_states.unique()

returns:

array(['CA', 'VA', 'MT', nan, 'CO', 'CT'],    dtype=object)

I want to return the count of each of those unique values, and I know I can't call .value_counts() on this result because it's a numpy array.

  • "I can't do .value_counts() because it's a numpy array": just cast it to a Series! pandas.Series(my_array).value_counts() Commented Jan 8, 2017 at 17:20

1 Answer

You can use nunique:

import numpy as np
import pandas as pd

data_file = pd.DataFrame({'visiting_states':['CA', 'VA', 'MT', np.nan, 'CO', 'CT','CA',
                                             'VA', 'MT', np.nan, 'CO', 'CT']})
print (data_file)
   visiting_states
0               CA
1               VA
2               MT
3              NaN
4               CO
5               CT
6               CA
7               VA
8               MT
9              NaN
10              CO
11              CT

print (data_file.visiting_states.nunique())
5

print (data_file.visiting_states.nunique(dropna=False))
6

arr = np.array(['CA', 'VA', 'MT', np.nan, 'CO', 'CT'],    dtype=object)
print (arr)
['CA' 'VA' 'MT' nan 'CO' 'CT']

print (len(arr))
6
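
Note that nunique and len only give the number of distinct values, not how often each one occurs. As a minimal sketch of per-value frequencies for a numpy array like the one in the question, the array can be wrapped in a Series first. The array contents here are made up for illustration (two repeated states were added so the counts differ), and the names counts and counts_with_nan are hypothetical:

```python
import numpy as np
import pandas as pd

# Illustrative data only: the question's values plus two repeats.
arr = np.array(['CA', 'VA', 'MT', np.nan, 'CO', 'CT', 'CA', 'VA'], dtype=object)

# A numpy array has no value_counts method, but a Series built from it does.
counts = pd.Series(arr).value_counts()

# By default NaN is excluded; dropna=False keeps it as its own entry.
counts_with_nan = pd.Series(arr).value_counts(dropna=False)

print(counts)
print(counts_with_nan)
```

The result is itself a Series indexed by the unique values, so a single frequency can be looked up with, for example, counts['CA'].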

4 Comments

I want to know the count per state, i.e. 5 for CA, 4 for VA, etc.
Then you need value_counts: print (data_file.visiting_states.value_counts())
This did not work; it said value counts is not defined.
OK, what does print (type(data_file.visiting_states)) return, and what does print (type(data_file.visiting_states.iloc[0])) return?
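
The type checks requested in the last comment, together with the value_counts call, might look like the sketch below. It reuses the sample DataFrame from the answer rather than the asker's real data, so the actual types and counts may differ. The variable names col and state_counts are hypothetical. One possible cause of a "not defined" error, offered only as a guess, is calling value_counts as a standalone function instead of as a method on the column.

```python
import numpy as np
import pandas as pd

# Sample data copied from the answer; the asker's real file is not available.
data_file = pd.DataFrame({'visiting_states': ['CA', 'VA', 'MT', np.nan, 'CO', 'CT',
                                              'CA', 'VA', 'MT', np.nan, 'CO', 'CT']})

col = data_file['visiting_states']

# Check what the column and its first element actually are.
print(type(col))          # expected to be a pandas Series for this sample
print(type(col.iloc[0]))  # expected to be str for this sample

# value_counts is a method on the Series, so it is called through the column.
state_counts = col.value_counts()
print(state_counts)
```

If type(col) turns out to be something other than a Series (for example a numpy array), wrapping it with pd.Series(...) before calling value_counts should make the method available.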
