0

For example, I have this simple pandas DataFrame:

>> print(file)
       Name
    0  ['junior','senior']
    1  freshgrad

When I check the length of the first entry:

>> len(file[0])
2

But for the second one:

>> len(file[1])
9

But I want the second one to count as 1. How can I differentiate between these two cases?

  1. I have tried using .join(), but it still counts as 9
  2. I have tried list.count, but I got an error
  • If you want to calculate the number of values, you need to make sure that single values are stored as length-1 lists. Commented May 26, 2016 at 1:42
  • freshgrad is a string and not an array. If row 1 held ['freshgrad'] instead, you would get len(file[1]) = 1. Commented May 26, 2016 at 1:42
  • @EliSadoff So, for the 'freshgrad' I need to change it from array to string? Commented May 26, 2016 at 1:47
  • No, freshgrad is currently a string, and you want it to be an array. Commented May 26, 2016 at 1:48
  • @EliSadoff oh yes! I was a bit confused. thank you. Commented May 26, 2016 at 1:50

4 Answers

2

The best way to do what you want is to check the data type of the item in question. You can use:

if isinstance(item, list):
    ...

And:

if isinstance(item, str):
    ...

In the case of a string, you can then use 1 for the length if you wish. Note that it's better to use isinstance(item, dtype) than type(item) == dtype because it will automatically work on subclassed types.
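
As a minimal sketch of how these two checks could be combined, the function below returns the list length for lists and 1 for strings. The name count_values and the choice to raise TypeError for any other type are assumptions made for illustration, not part of the original question.

```python
def count_values(item):
    """Count the values held in one cell (hypothetical helper)."""
    if isinstance(item, list):
        # A list reports how many elements it holds.
        return len(item)
    if isinstance(item, str):
        # A bare string is treated as a single value, not as characters.
        return 1
    # Assumption: any other type is unexpected, so fail loudly.
    raise TypeError(f"unsupported type: {type(item).__name__}")


values = [['junior', 'senior'], 'freshgrad']
counts = [count_values(v) for v in values]
print(counts)  # [2, 1]
```

Raising on unknown types is only one option; returning 1 for everything that is not a list would also work if the data can contain numbers or None.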


0

freshgrad is a string, so len(file[1]) calculates the length of that string, which is 9.
If file[1] were a list containing freshgrad, like ['freshgrad'], len(file[1]) would be 1.

0

You could use np.size:

In [301]: file = pd.Series([['junior','senior'], 'freshgrad'])

In [302]: file.apply(np.size)
Out[302]: 
0    2
1    1
dtype: int64

In [327]: np.size(file[0])
Out[327]: 2

In [328]: np.size(file[1])
Out[328]: 1

But to some extent this might just be delaying your agony. When the objects in a Series (or any kind of sequence) have different types, the code tends to require type-checking or try..excepts to handle the various types differently. (In fact, this is what np.size is doing. Under the hood np.size is using try..except to handle the exceptional case.)

Life is usually simpler (and therefore better) when all the objects in a sequence have the same type. So it is preferable to build a Series whose elements are all lists:

In [301]: file = pd.Series([['junior','senior'], ['freshgrad']])
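
If the Series already exists with mixed types, one possible way to get to the all-lists form is to wrap the non-list entries first. This is only a sketch: as_list is a hypothetical helper name, and it assumes every non-list entry should become a one-element list.

```python
import pandas as pd


def as_list(value):
    # Hypothetical helper: leave lists alone, wrap anything else in a list.
    return value if isinstance(value, list) else [value]


file = pd.Series([['junior', 'senior'], 'freshgrad'])

# Every element is now a list, so plain len works uniformly.
normalized = file.apply(as_list)
counts = normalized.apply(len)
print(counts.tolist())  # [2, 1]
```

After this step no further type-checking is needed downstream, which is the point of keeping the element types consistent.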

0

You could define your own length function, which uses the type to determine how to calculate the length:

def mylen(data):
    # Lists report their element count; any other value counts as one item.
    return len(data) if isinstance(data, list) else 1
