
I have the following dataframe:

import numpy as np
import pandas as pd

d_test = {
    'c1' : ['31', '421', 'sgdsgd', '523.3'],
    'c2' : ['41', np.nan, '412', '412'],
    'test': [1, 2, 3, 4],
}
df_test = pd.DataFrame(d_test)

I want to replace all values with np.nan if they cannot be parsed as floats. Expected result:

0   31      41   1
1   421     NaN  2
2   NaN     412  3
3   523.3   412  4

Here is what I do:

df_test[['c1', 'c2']] = df_test[['c1', 'c2']].replace(to_replace=r'^[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)$', value=np.nan, regex=True)

But the result is not what I am looking for:

0   NaN     NaN  1
1   NaN     NaN  2
2   sgdsgd  NaN  3
3   NaN     NaN  4
  • You have the sense inverted. That .replace() will turn numbers into NaN, but you explained that you want to turn non-numbers into NaN. Just use pd.to_numeric() and be done with it. Commented Dec 9, 2022 at 1:12
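As a side note, if you did want to keep the .replace() approach, the sense can be inverted with a negative lookahead so that the pattern matches the non-numeric strings instead (a sketch for illustration; pd.to_numeric is simpler):

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({
    'c1': ['31', '421', 'sgdsgd', '523.3'],
    'c2': ['41', np.nan, '412', '412'],
    'test': [1, 2, 3, 4],
})

# Negative lookahead: match any string that is NOT a valid float literal,
# then replace the whole match with NaN. Valid numbers and existing NaNs
# are left untouched (but note the surviving values stay strings).
pattern = r'^(?![+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)$).*$'
df_test[['c1', 'c2']] = df_test[['c1', 'c2']].replace(
    to_replace=pattern, value=np.nan, regex=True)
```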

1 Answer


IIUC, you can use pandas.to_numeric with errors="coerce":

errors : {'ignore', 'raise', 'coerce'}, default 'raise'

  • If 'raise', then invalid parsing will raise an exception.

  • If 'coerce', then invalid parsing will be set as NaN.

  • If 'ignore', then invalid parsing will return the input.
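The coerce behaviour on a single column looks like this (using the question's 'c1' values):

```python
import pandas as pd

# Any value that cannot be parsed as a number becomes NaN;
# the result is a float64 Series.
s = pd.Series(['31', '421', 'sgdsgd', '523.3'])
out = pd.to_numeric(s, errors='coerce')
print(out)
# 0     31.0
# 1    421.0
# 2      NaN
# 3    523.3
# dtype: float64
```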

df_test = df_test.apply(pd.to_numeric, errors="coerce")

# Output:

print(df_test)
      c1     c2  test
0   31.0   41.0     1
1  421.0    NaN     2
2    NaN  412.0     3
3  523.3  412.0     4
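If you want to leave the already-numeric test column untouched (so it keeps its integer dtype), you can apply the conversion only to the two string columns, as in the question:

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({
    'c1': ['31', '421', 'sgdsgd', '523.3'],
    'c2': ['41', np.nan, '412', '412'],
    'test': [1, 2, 3, 4],
})

# Coerce only the string columns; 'test' is never touched.
df_test[['c1', 'c2']] = df_test[['c1', 'c2']].apply(pd.to_numeric,
                                                    errors='coerce')
```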

2 Comments

Nice solution! You can make it even more concise with df_test.apply(pd.to_numeric, errors="coerce") since keyword args passed to apply get passed to the function.
Thanks fsimonjetz, I updated my answer ;)
