I'm using pandas to read data from a CSV file. It contains three columns: offer_id, sms_limit, sms_price. I want to add validation:

  • offer_id - only positive integers
  • sms_limit - only positive integers
  • sms_price - positive float number.

I've tried to write my own validator, something like this:

def int_validator(x):
    if str(x).isdigit():
        return x
    raise ValueError('Invalid choice please use positive integer number')


pd.read_csv(
            converters={'offer_id': int_validator, 'sms_limit': int, 'sms_price': int},
            encoding='utf-8',
            engine='python',
        )

but it doesn't work at all :(

It only works if I use int:

pd.read_csv(
            converters={'offer_id': int, 'sms_limit': int, 'sms_price': int},
            encoding='utf-8',
            engine='python',
        )

but it's not what I'm looking for. Also, this only works for the offer_id column: if I type a string into sms_limit or sms_price, there is no validation. Can somebody explain how to write my own validators, and why only the first column accepts the int conversion?

  • Could you post an example array, please? Something we can put through your function to figure out what exactly is going wrong? Also, it's helpful if you tell us the exact error instead of just saying "it doesn't work." Detail is key in helping us help you! Commented Oct 24, 2022 at 19:06

1 Answer

Here's a solution that checks that the first two columns contain non-negative integers and that the last column contains non-negative floats. Switching to strictly positive values is a one-character change, noted in the code.

import pandas as pd

# read_csv passes each cell to its converter as a string. This validator tries
# to parse the value as a number, then checks that it is whole and >= 0.
# Change "< 0" to "<= 0" if you want strictly positive values.
def int_validator(x):
    try:
        # A quirk of Python: int("7.0") raises ValueError even though
        # int(float("7.0")) does not, so parse as a float first.
        value = float(x)
    except (TypeError, ValueError):
        value = None

    # is_integer() rejects fractional input such as "7.5" instead of silently
    # truncating it to 7; it is also False for inf and nan.
    if value is None or not value.is_integer() or value < 0:
        raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))
    return int(value)

# This does something similar to int_validator, but accepts any float >= 0.
def float_validator(x):
    try:
        value = float(x)
    except (TypeError, ValueError):
        value = None

    # "not value >= 0" also rejects nan, which compares False with everything.
    if value is None or not value >= 0:
        raise ValueError('Invalid choice for {}. Please use positive float number'.format(x))
    return value

# Now we apply the validators to all the columns.
pd.read_csv("example.csv",
            converters={'offer_id': int_validator, 'sms_limit': int_validator, 'sms_price': float_validator},
            encoding='utf-8',
            engine='python',
        )
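
To try the validators without a file on disk, here is a minimal sketch that feeds read_csv an in-memory CSV through io.StringIO. The sample rows, the CONVERTERS dict, and the load helper are made up for illustration, and the validators are compact variants that raise on fractional input such as 7.5 rather than truncating it, so treat this as one possible arrangement rather than the definitive one.

```python
import io

import pandas as pd


def int_validator(x):
    # Converters receive the raw cell text; accept only whole numbers >= 0.
    try:
        value = float(x)
    except (TypeError, ValueError):
        value = None
    if value is None or not value.is_integer() or value < 0:
        raise ValueError('Invalid choice for {}. Please use positive integer number'.format(x))
    return int(value)


def float_validator(x):
    # Accept any float >= 0; "not value >= 0" also rejects nan.
    try:
        value = float(x)
    except (TypeError, ValueError):
        value = None
    if value is None or not value >= 0:
        raise ValueError('Invalid choice for {}. Please use positive float number'.format(x))
    return value


# Hypothetical name: one converter per column, keyed by the CSV header.
CONVERTERS = {
    'offer_id': int_validator,
    'sms_limit': int_validator,
    'sms_price': float_validator,
}


def load(csv_text):
    # Hypothetical helper: parse CSV text held in memory instead of a file.
    return pd.read_csv(io.StringIO(csv_text), converters=CONVERTERS, engine='python')


# Made-up rows that satisfy every validator.
good = load("offer_id,sms_limit,sms_price\n1,100,0.05\n2,250,0.10\n")

# A non-numeric sms_limit makes read_csv fail with the validator's ValueError.
try:
    load("offer_id,sms_limit,sms_price\n3,abc,0.07\n")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Because the validators raise, a single bad cell aborts the whole read. If you would rather collect the bad rows, one option is to read the columns as strings first and validate afterwards.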

Let me know if you have questions!

1 Comment

Thanks @AJH, everything is working now. My problem was an imported function that overrode my code, which is why validation only worked in one column. Thanks a lot for your help.
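
The cause described in this comment, an imported function replacing a locally defined one, can be reproduced in a few lines. This is only a sketch: the module and name actually involved are not given in the thread, so the standard-library import below stands in for them as an assumption.

```python
def int_validator(x):
    # Local validator: accepts only strings made up entirely of digits.
    if str(x).isdigit():
        return int(x)
    raise ValueError('Invalid choice please use positive integer number')


# Keep a reference to the local definition so it can be compared later.
local_validator = int_validator

# Stand-in for the conflicting import (assumption: the real module is unknown).
# An import is a name binding like any other, so the later binding wins.
from builtins import int as int_validator

# The name now points at the imported object, not the function defined above.
shadowed = int_validator is not local_validator

# Plain int() accepts input that the local validator would have rejected.
accepted_negative = int_validator('-5')
```

Printing int_validator.__module__ right before the read_csv call is a quick way to see which definition is actually in effect.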