17

Background:
I mostly run Python scripts from the command line in pipelines, so my arguments are always strings that need to be cast to the appropriate type. I write a lot of little scripts each day, and casting each parameter for every script takes more time than it should.

Question:
Is there a canonical way to automatically type cast parameters for a function?

My Way:
I've developed a decorator that does what I want, in case there isn't a better way. The decorator is the autocast function below, and the decorated function is fxn2 in the example. Note that at the end of the code block I pass 1 and 2 as strings; if you run the script, it converts them to integers and adds them. Is this a good way to do this?

def estimateType(var):
    #first test bools
    if var == 'True':
        return True
    elif var == 'False':
        return False
    else:
        #int
        try:
            return int(var)
        except ValueError:
            pass
        #float
        try:
            return float(var)
        except ValueError:
            pass
        #string
        try:
            return str(var)
        except ValueError:
            raise NameError('Something Messed Up Autocasting var %s (%s)'
                            % (var, type(var)))

def autocast(dFxn):
    '''Still need to figure out if you pass a variable with kw args!!!
    I guess I can just pass the dictionary to the fxn **args?'''
    def wrapped(*c, **d):
        print c, d
        t = [estimateType(x) for x in c]
        return dFxn(*t)
    return wrapped

@autocast
def fxn2(one, two):
    print one + two

fxn2('1', '2')

EDIT: For anyone who comes here and wants the updated, concise working version, go here:

https://github.com/sequenceGeek/cgAutoCast

And here is a quick working version based on the code above:

def boolify(s):
    if s == 'True' or s == 'true':
        return True
    if s == 'False' or s == 'false':
        return False
    raise ValueError('Not Boolean Value!')

def estimateType(var):
    '''guesses the str representation of the variables type'''
    var = str(var) #important if the parameters aren't strings...
    for caster in (boolify, int, float):
        try:
            return caster(var)
        except ValueError:
            pass
    return var

def autocast(dFxn):
    def wrapped(*c, **d):
        cp = [estimateType(x) for x in c]
        dp = dict( (i, estimateType(j)) for (i,j) in d.items())
        return dFxn(*cp, **dp)

    return wrapped

######usage######
@autocast
def randomFunction(firstVar, secondVar):
    print firstVar + secondVar

randomFunction('1', '2')

5 Comments

  • It's "cast," not "caste" Commented Aug 10, 2011 at 23:44
  • Why are you trying to cast it to string? You're assuming it's a string, as var == "True" and var == "False" will give unexpected results for any other type. Commented Aug 10, 2011 at 23:50
  • @agf That won't cast it to string; Python is not statically typed (the OP wants to automatically convert a string to some other value based on its contents) Commented Aug 10, 2011 at 23:52
  • The line str(var) tries to cast it to a string, but this is after he's already assumed it's a string. For example 1 == True or 0 == False will evaluate to True, which isn't what he wants. Therefore, he should either check if it's a string before making those comparisons, or he should assume it's a string and just return var if it's not an int or float. Commented Aug 10, 2011 at 23:56
  • FWIW estimateType, at first glance, appears to only coerce types if it can detect var's type. But you end up always getting a string. In my implementation I substituted that first internal assignment of var to use a temp variable so if no type can be estimated I'm at least given back the original. Commented Jan 10, 2014 at 8:11

6 Answers

21

If you want to auto-convert values:

def boolify(s):
    if s == 'True':
        return True
    if s == 'False':
        return False
    raise ValueError("huh?")

def autoconvert(s):
    for fn in (boolify, int, float):
        try:
            return fn(s)
        except ValueError:
            pass
    return s

You can adjust boolify to accept other boolean values if you like.
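
As a minimal sketch of that adjustment, boolify can look the string up in a table instead of chaining comparisons. The table name `_BOOL_STRINGS` and the particular spellings it accepts are assumptions made for illustration, not part of the answer above.

```python
# Hypothetical table of accepted spellings; extend or shrink it to taste.
_BOOL_STRINGS = {'true': True, 'false': False, 'yes': True, 'no': False}

def boolify(s):
    """Return the boolean named by s, or raise ValueError."""
    try:
        return _BOOL_STRINGS[s.strip().lower()]
    except KeyError:
        raise ValueError('not a boolean: %r' % (s,))

def autoconvert(s):
    """Try boolify, int, then float; fall back to the original string."""
    for fn in (boolify, int, float):
        try:
            return fn(s)
        except ValueError:
            pass
    return s
```

Because '0' and '1' are deliberately left out of the table, autoconvert('0') falls through boolify and comes back as the integer 0 rather than False.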


3 Comments

+1. I think you could also use a map for boolify to simplify that code a bit. E.g., replace the body of boolify with return {'True': True, 'False': False}[s].
Yes, after I answered, I thought about switching to a map, but got distracted and wandered away.. Squirrel!
What if someone does autoconvert('0'), hoping it is interpreted as an int? Won't it convert it to False?
9

You could just use plain eval to input string if you trust the source:

>>> eval("3.2", {}, {})
3.2
>>> eval("True", {}, {})
True

But if you don't trust the source, you could use literal_eval from ast module.

>>> ast.literal_eval("'hi'")
'hi'
>>> ast.literal_eval("(5, 3, ['a', 'b'])")
(5, 3, ['a', 'b'])

Edit: As Ned Batchelder's comment points out, literal_eval won't accept unquoted strings, so I added a workaround, along with an example of an autocaste decorator that handles keyword arguments.

import ast

def my_eval(s):
    try:
        return ast.literal_eval(s)
    except (ValueError, SyntaxError): #not a literal (e.g. an unquoted string), return it unchanged
        return s                      #thanks gnibbler

def autocaste(func):
    def wrapped(*c, **d):
        cp = [my_eval(x) for x in c]
        dp = {i: my_eval(j) for i,j in d.items()} #dict comprehension, needs Python 2.7+
        #you can use dict((i, my_eval(j)) for i,j in d.items()) for older versions
        return func(*cp, **dp)

    return wrapped

@autocaste
def f(a, b):
    return a + b

print(f("3.4", "1")) # 4.4
print(f("s", "sd"))  # ssd
print(my_eval("True")) # True
print(my_eval("None")) # None
print(my_eval("[1, 2, (3, 4)]")) # [1, 2, (3, 4)]

5 Comments

This won't accept simple strings on the command line: they'd all have to be quoted.
@Ned Batchelder, you're right. I added a workaround, but it doesn't seem very good.
Instead of quoting the string, you could just try doing a literal_eval and return the original string if there is an exception
I'm curious about two things: speed and being able to evaluate lists. The power of being able passing a list from the command line begets worry in my mind, but I guess I have to accept that if I'm going to be autocasting in the first place. The other question is whether if I use a fxn that has been decorated and I run that function a lot, will it significantly slow down? I wonder how fast eval is compared to a fxn like Ned's below that only handles basic types?
Yes, it really depends on the speed you're expecting, but a direct conversion like int(s) is a lot faster than eval (maybe 20-30x). If you only need basic types and speed is an important criterion, I think Ned Batchelder's answer fits best.
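
The speed question raised in these comments is easy to check locally. The following is a rough sketch using the standard timeit module; the helper name `time_caster`, the sample value, and the iteration count are arbitrary choices for illustration, and the ratio will vary by machine and Python version.

```python
import ast
import timeit

def time_caster(caster, value='12345', number=100000):
    """Return the seconds taken to call caster(value) `number` times."""
    return timeit.timeit(lambda: caster(value), number=number)

if __name__ == '__main__':
    t_int = time_caster(int)
    t_lit = time_caster(ast.literal_eval)
    print('int: %.3fs  literal_eval: %.3fs  ratio: %.1fx'
          % (t_int, t_lit, t_lit / t_int))
```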
6

I'd imagine you can make a type signature system with a function decorator, much like you have, only one that takes arguments. For example:

@signature(int, str, int)
def func(x, y, z):
    ...

Such a decorator can be built rather easily. Something like this (EDIT -- works!):

def signature(*args, **kwargs):
    def decorator(fn):
        def wrapped(*fn_args, **fn_kwargs):
            new_args = [t(raw) for t, raw in zip(args, fn_args)]
            new_kwargs = dict([(k, kwargs[k](v)) for k, v in fn_kwargs.items()])

            return fn(*new_args, **new_kwargs)

        return wrapped

    return decorator

And just like that, you can now imbue functions with type signatures!

@signature(int, int)
def foo(x, y):
    print type(x)
    print type(y)
    print x+y

>>> foo('3','4')
<type 'int'>
<type 'int'>
7

Basically, this is a type-explicit version of @utdemir's method.
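
A variation on the same idea, sketched here for Python 3 only (the code above is Python 2), reads the converters from the function's own annotations instead of from decorator arguments. The decorator name `cast_by_annotation` is hypothetical. Binding the call with inspect.signature means unannotated parameters pass through untouched, so no arguments are dropped when only some of them have a converter.

```python
import functools
import inspect

def cast_by_annotation(fn):
    """Convert each argument with the callable given as its annotation."""
    sig = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapped(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        # Copy the items so the mapping can be updated while looping.
        for name, value in list(bound.arguments.items()):
            caster = sig.parameters[name].annotation
            # Unannotated parameters are passed through untouched.
            if caster is not inspect.Parameter.empty and callable(caster):
                bound.arguments[name] = caster(value)
        return fn(*bound.args, **bound.kwargs)

    return wrapped

@cast_by_annotation
def foo(x: int, y: int, label='sum'):
    return '%s=%d' % (label, x + y)

print(foo('3', '4'))
print(foo('3', y='4', label='total'))
```

This sketch assumes plain positional and keyword parameters; annotated *args or **kwargs parameters would need extra handling.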

4 Comments

Now that's an interesting idea too! I would rather automatically cast if I can because it will save time, but your way would be much "safer"
@sequenceGeek -- that's right, in fact looking at the other answers this is basically an explicit version of utdemir's method.
Nice idea! Just note that current implementation might result in error/less args being sent to 'func' in the case of wanting to evaluate less arguments. e.g. "@signature(int) func(x,y,z)".
Nice idea! Would like to point out that this can also be used on an existing function by "manually" applying the decorator to it: i.e. existing_func = signature(int, str)(existing_func).
2

If you're parsing arguments from the command line, you should use the argparse module (in the standard library since Python 2.7).

Each argument can have an expected type so knowing what to do with it should be relatively straightforward. You can even define your own types.

...quite often the command-line string should instead be interpreted as another type, like a float or int. The type keyword argument of add_argument() allows any necessary type-checking and type conversions to be performed. Common built-in types and functions can be used directly as the value of the type argument:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('foo', type=int)
parser.add_argument('bar', type=file)  # file is a Python 2 builtin
parser.parse_args('2 temp.txt'.split())
# Namespace(bar=<open file 'temp.txt', mode 'r' at 0x...>, foo=2)
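
To illustrate the "define your own types" point, here is a small Python 3 sketch. The converter `percentage`, the helper `build_parser`, and the argument names `count` and `--ratio` are made up for the example; the only requirement argparse places on a type is that it be a callable taking the raw string and returning the converted value.

```python
import argparse

def percentage(text):
    """Hypothetical custom type: parse '0'..'100' into a float fraction."""
    value = float(text)
    if not 0 <= value <= 100:
        raise argparse.ArgumentTypeError('%r is not between 0 and 100' % text)
    return value / 100.0

def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('count', type=int)
    parser.add_argument('--ratio', type=percentage, default=0.5)
    return parser

# Parse an explicit list here; omit the argument to read sys.argv instead.
args = build_parser().parse_args(['3', '--ratio', '25'])
print(args.count, args.ratio)
```

Here args.count is the integer 3 and args.ratio is the float 0.25, with no manual casting in the script body.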

2 Comments

Can the downvoter explain themselves? How is this not better than hacking with eval? The question clearly says "I mostly run python scripts from the command line in pipelines and so my arguments are always strings that need to be type casted.", and this answer will do that exactly.
Not the downvoter, but probably because writing an argument parser is about as much effort as manually casting arguments. Argparse is good for stuff you write once and use often, but for one off uses it's overkill.
0

There are a couple of problems in your snippet.

#first test bools
if var == 'True':
        return True
elif var == 'False':
        return False

This would always check for True because you are testing against the strings 'True' and 'False'.

Python has no automatic type coercion. The arguments you receive via *args and **kwargs can be anything: the first collects the positional values (each of which can be of any type, primitive or complex) and the second collects the keyword arguments as a mapping. So if you write a decorator, you will end up with a long list of error checks.

Normally, if you wish to pass a str, just convert the value with str() when you invoke the function.


-1

I know I arrived late at this game, but how about eval?

def my_cast(a):
    try:
        return eval(a)
    except Exception:
        # not a valid expression, so return the input unchanged
        return a

or alternatively (and more safely):

from ast import literal_eval

def mycast(a):
  try:
    return literal_eval(a)
  except:
    return a
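
A quick way to see the difference in safety, as a sketch: literal_eval only accepts Python literals, so an expression containing a function call is rejected and comes back unchanged instead of being executed. The `payload` string is just an illustrative example, and this version of mycast narrows the bare except to the two exceptions literal_eval normally raises for such input.

```python
from ast import literal_eval

def mycast(a):
    """Return the literal value of a, or a itself if it is not a literal."""
    try:
        return literal_eval(a)
    except (ValueError, SyntaxError):
        return a

# A function call is not a literal, so it is never evaluated.
payload = "__import__('os').getcwd()"
print(mycast(payload) == payload)
print(mycast('[1, 2]'))
print(mycast('hello world'))
```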

1 Comment

Using eval might not be a good idea-- If you try to do my_cast on user-supplied input, they could potentially get remote code execution by inputting the right payload.
