
Say I import a CSV into pandas and realize there are some non-numeric values in a column that I expect to be all numeric.

This is how I would find those values (in a dataframe called df, in a column called should_be_numbers):

df[pd.to_numeric(df['should_be_numbers'], errors='coerce').isnull()]['should_be_numbers']

My question: Is there a cleaner/more pythonic/less clunky way to do this?
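For reference, a self-contained sketch of the approach above, using made-up sample data (the "-.--" value stands in for whatever the bad entries look like):

```python
import pandas as pd

# Hypothetical sample frame; '-.--' is a placeholder for a bad value
df = pd.DataFrame({'should_be_numbers': ['1', '2.5', '-.--', '3']})

# errors='coerce' turns anything non-convertible into NaN,
# so the NaN mask picks out exactly the non-numeric entries
mask = pd.to_numeric(df['should_be_numbers'], errors='coerce').isna()
bad = df.loc[mask, 'should_be_numbers']
print(bad.tolist())
```

Note that coercion also turns values that were already null into NaN, so if the column can contain genuine NaNs you may want to combine the mask with `df['should_be_numbers'].notna()` to see only the truly malformed entries.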

  • pd.read_csv has a dtype arg where you can specify the data type of the column. I'm assuming you have a column being stored as a string and you're getting scientific-notation values. Commented Aug 1, 2022 at 18:21
  • Nope, in this case it's floats where the null values are all something like "-.--", but I don't know the data well enough to assume all the null values are filled in like that, and I don't want to blindly coerce in case it's some weird formatting on valid data! Commented Aug 1, 2022 at 18:23

1 Answer

import numpy as np
import pandas as pd

df = pd.DataFrame({'should_be_numbers': [1, 22, 'A', 'BB', [1, 22], ['A', 'BB'], 'A1BB22', np.nan, 3.13]})
df[[not isinstance(value, (int, float)) for value in df.should_be_numbers]]

Input:

  should_be_numbers
0                 1
1                22
2                 A
3                BB
4           [1, 22]
5           [A, BB]
6            A1BB22
7               NaN
8              3.13

Output:

  should_be_numbers
2                 A
3                BB
4           [1, 22]
5           [A, BB]
6            A1BB22
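One caveat worth noting (a sketch, not part of the answer above): a plain isinstance check against int and float can miss NumPy integer scalars, because np.int64 is not a subclass of Python's int. If the column might hold NumPy scalars, the numbers.Number ABC covers both the builtin and the NumPy numeric types:

```python
import numbers

import numpy as np
import pandas as pd

# Hypothetical mixed column containing a NumPy scalar
df = pd.DataFrame({'should_be_numbers': [np.int64(1), 'A', 2.5, 'BB']})

# np.int64 is not an instance of int, so isinstance(v, (int, float))
# would wrongly flag it; numbers.Number matches it correctly
mask = [not isinstance(v, numbers.Number) for v in df['should_be_numbers']]
bad = df.loc[mask, 'should_be_numbers']
print(bad.tolist())
```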
