2

I am using pandas to read a CSV file. The data are numbers but are stored in the CSV file as text. Some of the values are non-numeric when they are bad or missing. How do I filter out these values and convert the remaining data to integers?

I assume there is a better/faster way than looping over all the values and using isdigit() to test for them being numeric.

Does pandas or NumPy have a way of recognizing bad values in the reader? If not, what is the easiest way to do it? Do I have to specify the dtypes to make this work?

3 Answers

4

pandas.read_csv has the parameter na_values:

na_values : list-like, default None
    List of additional strings to recognize as NA/NaN

where you can define these bad values.
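
A minimal sketch of this approach, assuming a small in-memory CSV in which the string "bad" marks an invalid reading (the column names and the marker string are made up for illustration). Empty fields are already treated as missing by default, so only the extra marker needs to be listed:

```python
import io

import pandas as pd

# Hypothetical sample data: "bad" and an empty field stand in for invalid values.
csv_text = "id,count\n1,10\n2,bad\n3,\n4,40\n"

# Strings listed in na_values are parsed as NaN, in addition to the defaults
# (such as the empty string), so the column is read as float rather than text.
df = pd.read_csv(io.StringIO(csv_text), na_values=["bad"])

# Drop the rows whose "count" is missing, then convert what remains to integers.
clean = df.dropna(subset=["count"]).astype({"count": "int64"})
print(clean)
```

Dropping the rows first matters because a column that still contains NaN cannot be cast to a plain integer dtype.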


3 Comments

Great. This seems to be what I was looking for.
Is there a way to use na_values if the string is column dependent? For instance, I have some columns where negative values are bad, but others where they are fine.
No @Shawn. na_values matches specific strings, not conditions such as "negative", so it cannot express that rule, although a dict can give each column its own list of strings. Handling negative values should be done during data pre-processing / cleaning.
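
One possible way to handle the column-dependent case raised in these comments, sketched under assumptions: the column names, the marker strings "missing" and "err", and the rule that negatives are invalid only in the depth column are all hypothetical. Passing a dict to na_values gives per-column marker strings, and the sign rule is applied as a cleaning step after reading:

```python
import io

import pandas as pd

# Hypothetical data: negatives are invalid in "depth" but valid in "offset".
csv_text = "depth,offset\n5,-2\n-1,3\nmissing,err\n7,-4\n"

# A dict maps each column to its own list of strings to treat as NaN.
df = pd.read_csv(
    io.StringIO(csv_text),
    na_values={"depth": ["missing"], "offset": ["err"]},
)

# The sign rule is not a string match, so apply it after parsing:
# mask() replaces values with NaN wherever the condition is True.
df["depth"] = df["depth"].mask(df["depth"] < 0)

# Keep only complete rows and convert them to integers.
clean = df.dropna().astype("int64")
print(clean)
```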
3

You can pass a custom list of values to be treated as missing via the na_values argument of pandas.read_csv. Alternatively, you can pass functions to the converters argument.
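
A rough sketch of the converters route, assuming the same kind of made-up data. The helper to_int_or_none is a hypothetical name, not part of pandas; read_csv calls it with the raw string of each cell in the named column:

```python
import io

import pandas as pd

# Hypothetical sample data with one non-numeric and one empty field.
csv_text = "id,count\n1,10\n2,bad\n3,\n4,40\n"


def to_int_or_none(text):
    """Hypothetical helper: parse a cell as int, or return None if it is not one."""
    try:
        return int(text.strip())
    except ValueError:
        return None


# converters maps a column name to a function applied to each raw string value.
df = pd.read_csv(io.StringIO(csv_text), converters={"count": to_int_or_none})

# Remove the rows that failed to parse, then fix the dtype explicitly.
clean = df.dropna(subset=["count"]).astype({"count": "int64"})
print(clean)
```

Compared with na_values, a converter does not require listing every bad string in advance, at the cost of running a Python function per cell.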

1

NumPy provides the function genfromtxt() specifically for this purpose. The first sentence from the linked documentation:

Load data from a text file, with missing values handled as specified.
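
As an illustration of that behaviour, here is a small sketch with made-up data. It relies on genfromtxt's default float dtype, under which fields that are empty or cannot be parsed as numbers come back as NaN:

```python
import io

import numpy as np

# Hypothetical sample data: a header row, one non-numeric field, one empty field.
csv_text = "id,count\n1,10\n2,bad\n3,\n4,40\n"

# skip_header=1 ignores the header line; unparseable or empty fields become NaN.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# Keep only the rows without any NaN, then convert them to integers.
rows = data[~np.isnan(data).any(axis=1)].astype(int)
print(rows)
```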

3 Comments

Ok, thanks. I thought Pandas was supposed to be a higher level add-on. I was expecting this functionality there. So just use that and convert it to a data frame?
@Dave31415: I don't know exactly what your data looks like, but this is the approach I'd try first.
If pandas.read_csv does not do what you need, please create an issue on GitHub: github.com/pydata/pandas/issues
