
I am writing a function that consumes a list, counts how many items of each data type it contains, and returns a list of natural numbers in the order integer, float, string, Boolean, other:

[integer, float, string, Boolean, others]

For example, let

L = [1, "123", !, #, 2.0, True, False, [1,2]] 

Then function(L) should return

[1, 1, 1, 2, 3]

I chose the built-in function type() to determine the data type, for example:

type(L[0]) == type(9) 

If this is true, then I know the first element of the list is an integer. But when the list contains symbols like "!@#$%^", type() no longer works and Python displays

Syntax Error: invalid syntax: <string>

So I wonder if there is another way to classify symbols and put them into the "other" type.

2
  • But ! and # are not valid objects to begin with... Commented Jun 14, 2017 at 9:51
  • Not an answer to your problem, but if you want to count types, why not just do collections.Counter(map(type, L))? Commented Jun 14, 2017 at 10:00

3 Answers 3

3

It is not the type(..) function that does not work: the error occurs when constructing the list.

When a compiler/interpreter reads code, it will aim to generate an abstract syntax tree according to the grammar of the language. ! and # are simply invalid grammar. The # is interpreted as the start of a comment, so the interpreter will read:

L = [1, "123", !, 

But that does not make any sense either since ! is not valid, and even if you would remove the !, then Python will still complain it cannot find a matching ] (to close the list).
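As a minimal sketch of this point (assuming Python 3; the helper name parses_ok is made up for illustration), handing the offending line to ast.parse shows that the failure happens while the source is being parsed, before any type() call could run:

```python
import ast

def parses_ok(source):
    """Return True if `source` is syntactically valid Python (hypothetical helper)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# The list from the question never gets past the parser ...
print(parses_ok('L = [1, "123", !, #, 2.0, True, False, [1,2]]'))      # False
# ... whereas quoting the symbols turns them into ordinary strings.
print(parses_ok('L = [1, "123", "!", "#", 2.0, True, False, [1,2]]'))  # True
```

Note that once the symbols are quoted they are plain str objects, so they would be counted as strings rather than as "other".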

Now let's assume you sort that out; there is another problem: what about subclassing? One can, for instance, subclass a string. Do you count an instance of that subclass as a str or as other? The problem is more troublesome because Python supports multiple inheritance: something can be a string and a bool at the same time. Let us assume we only count real strings and not subclasses; then we can write the following code:

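def count_types(data):
    # Types to count, in output order; the extra last slot is "other".
    the_types = [int, float, str, bool]
    result = [0] * (len(the_types) + 1)
    for x in data:
        tyx = type(x)
        for idx, ty in enumerate(the_types):
            # Exact type match only, so True counts as bool and not as int.
            if tyx is ty:
                result[idx] += 1
                break
        else:
            # No listed type matched: count the item as "other".
            result[-1] += 1
    return result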

This would produce:

>>> count_types([1, "123", 2.0, True, False, [1,2]])
[1, 1, 1, 2, 1]

Nevertheless, it is odd to return a list without context, and odd to count only specific types. A more elegant, and probably faster, solution is to build a Counter keyed by type:

from collections import Counter

result = Counter(type(x) for x in data)

This generates:

>>> Counter(type(x) for x in data)
Counter({<class 'bool'>: 2, <class 'list'>: 1, <class 'str'>: 1, <class 'float'>: 1, <class 'int'>: 1})
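If the fixed-order list from the question is still wanted, the Counter can be folded back into it. The following is only a sketch; the function name counts_in_order and its order parameter are assumptions for illustration, not part of the answer above:

```python
from collections import Counter

def counts_in_order(data, order=(int, float, str, bool)):
    """Return the count for each type in `order`, followed by an 'other' count."""
    counter = Counter(type(x) for x in data)
    listed = [counter[t] for t in order]          # a Counter yields 0 for missing keys
    other = sum(counter.values()) - sum(listed)   # everything not in `order`
    return listed + [other]

print(counts_in_order([1, "123", 2.0, True, False, [1, 2]]))  # [1, 1, 1, 2, 1]
```

Like count_types, this matches exact types only, so instances of subclasses end up in the "other" slot.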

1 Comment

Pedantic note on "something can be a string and a bool at the same time": You picked a bad example; bool is not subclassable (at the C level, you have to opt in to subclassability, and bool, slice, NoneType and a handful of other core classes, usually ones you don't create directly, often ones with a restricted set of singleton values, don't set that flag). And while str is subclassable, the subclasses can't use multiple inheritance with most (all?) other built-in types or __slots__ed types (you get an instance layout conflict; they expand PyObject in incompatible ways).
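Both claims in this comment can be checked with a short sketch (assuming CPython 3; the helper name can_subclass is made up for illustration). It tries to create a class with the given bases and reports whether that raised TypeError:

```python
def can_subclass(*bases):
    """Return True if a class with these bases can be created (hypothetical helper)."""
    try:
        type("Probe", bases, {})   # same effect as: class Probe(*bases): pass
        return True
    except TypeError:
        return False

print(can_subclass(str))        # True: str can be subclassed
print(can_subclass(bool))       # False: bool is not an acceptable base type
print(can_subclass(str, int))   # False: instance layout conflict
```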
2

If you want to compare types (even though that is not very Pythonic), you can use isinstance():

 isinstance(L[0], type(9))

or

 isinstance(L[0], str)
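One caveat when counting this way: isinstance(True, int) is True, because bool is a subclass of int, so bool has to be tested before int. A minimal sketch, assuming subclasses should be counted under their base type (the function name count_with_isinstance is hypothetical):

```python
def count_with_isinstance(data):
    """Count [int, float, str, bool, other], accepting subclasses (sketch)."""
    # bool must come before int, because isinstance(True, int) is True.
    checks = [(bool, 3), (int, 0), (float, 1), (str, 2)]
    result = [0] * 5
    for x in data:
        for ty, idx in checks:
            if isinstance(x, ty):
                result[idx] += 1
                break
        else:
            result[4] += 1   # nothing matched: count as "other"
    return result

print(count_with_isinstance([1, "123", 2.0, True, False, [1, 2]]))  # [1, 1, 1, 2, 1]
```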

2 Comments

This will classify True as int. I think in OP's scenario, comparing types directly is the better option. And in any case, it is not related to OP's problem.
Then OP has to compare not with the concrete types directly but with abstract classes.
1

You can use a dictionary to count your types, then use operator.itemgetter() to get the desired result in your expected order:

In [32]: order = [int, float, str, bool, 'other']

In [33]: l = [1, 3, 5.6, 'st', True, 'gg', False, None, lambda x: 3, sum]

In [34]: from collections import defaultdict

In [35]: from operator import itemgetter

In [36]: d = defaultdict(int)                    

In [37]: for i in l:                             
             t = type(i)
             if t in order:
               d[t] += 1
             else:
               d['other'] += 1
   ....:             

In [38]: itemgetter(*order)(d)                    
Out[38]: (2, 1, 2, 2, 3)
