I am trying to remove all non-numeric characters from two columns and convert the result to float. But whenever a value containing string characters comes up, it fails with "ValueError: could not convert string to float".

Please suggest how to resolve it.

Input File

col1           col2  

122.45         NaN
Ninety Five    3585/-
9987           178@#?
225 Nine       1983.86
Twelve         7363*

Output File

col1           col2  

122.45         NaN
NaN            3585
9987           178
225            1983.86
NaN            7363

Code I am using:

df[['col1','col2']] = df[['col1','col2']].replace('([^\d/.])', '', regex=True).astype(float)

Error I am getting:

ValueError: could not convert string to float

  • Try with the pattern '[^\d\.]' and add .replace('', np.nan) like so: df[['col1','col2']].replace('([^\d\.])', '', regex=True).replace('',np.nan).astype(float) Commented Jun 18, 2021 at 15:25
  • @WholeBrain - It gives the same error when it reaches the row containing the value "One Lakh Two Thousand Three hundred & Twenty Paise". Commented Jun 18, 2021 at 15:32
  • It works on my side. Did you add the second replace? Commented Jun 18, 2021 at 15:36

1 Answer

You need to use a raw string (with the r prefix) for regex patterns, or double-backslash (\\) escapes. You also need \. to match a literal . character, not /.:

df[['col1', 'col2']] = df[['col1', 'col2']].replace(r'(-?[^\d\.])', '', regex=True).replace('', float('NaN')).astype(float)
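If some rows still raise the ValueError after stripping (for example, a value that reduces to a lone "." or to something like "12.5."), one option is to swap the final astype(float) for pd.to_numeric with errors="coerce", which turns anything unparseable into NaN instead of raising. This is only a sketch: the helper name strip_and_coerce and the sample frame (built from the question's input) are made up for illustration, and note that this pattern drops minus signs as well.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's input file.
df = pd.DataFrame({
    "col1": ["122.45", "Ninety Five", "9987", "225 Nine", "Twelve"],
    "col2": [np.nan, "3585/-", "178@#?", "1983.86", "7363*"],
})


def strip_and_coerce(frame, cols):
    """Hypothetical helper: strip non-digit/non-dot characters, then coerce.

    Values that are not valid numbers after stripping ("", ".", "1.2.3")
    become NaN instead of raising ValueError.
    """
    out = frame.copy()
    # Raw string so \d reaches the regex engine unchanged.
    stripped = out[cols].replace(r"[^\d.]", "", regex=True)
    out[cols] = stripped.apply(pd.to_numeric, errors="coerce")
    return out


cleaned = strip_and_coerce(df, ["col1", "col2"])
print(cleaned)
```

With this variant an all-text value such as "One Lakh Two Thousand Three hundred & Twenty Paise" ends up as NaN, because nothing numeric is left after stripping.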

7 Comments

@Willdasilva - Using the above line also gives the same error when it reaches the row containing the value "One Lakh Two Thousand Three hundred & Twenty Paise".
@AnuragDabas - When a row value is entirely alphabetic, e.g. "One Lakh Two Thousand Three hundred & Twenty Paise", it still gives the same error.
@Manz so df[['col1','col2']]=df[['col1','col2']].replace('([^\d\.])', '', regex=True).replace('',float("NaN"),regex=True).astype(float) doesn't work?
@willdasilva - Any suggestion for a row value such as "-122.45"? It's not working in this case; how can I resolve that?
@AnuragDabas - Thanks for the answer. It's working, but when a row value is "-122.45" it doesn't work and gives the same value as output.
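For the "-122.45" case raised in the comments, a different approach that may help is to extract the first signed number from each value rather than deleting everything else. The following is a rough sketch, not the answerer's method: the pattern SIGNED_NUMBER and the helper extract_first_number are hypothetical names, and it assumes only the first number in a cell matters and that values have no thousands separators.

```python
import numpy as np
import pandas as pd

# Optional leading minus, digits, optional decimal part (assumed format).
SIGNED_NUMBER = r"(-?\d+(?:\.\d+)?)"


def extract_first_number(series):
    """Hypothetical helper: pull the first signed number out of each value."""
    text = series.astype(str)
    # expand=False returns a Series; rows with no match become NaN.
    found = text.str.extract(SIGNED_NUMBER, expand=False)
    return pd.to_numeric(found, errors="coerce")


df = pd.DataFrame({
    "col1": ["-122.45", "Ninety Five", "225 Nine"],
    "col2": [np.nan, "3585/-", "178@#?"],
})

for col in ["col1", "col2"]:
    df[col] = extract_first_number(df[col])

print(df)
```

Here "-122.45" keeps its sign, "3585/-" yields 3585 because the trailing "-" is not followed by a digit, and rows with no digits at all come out as NaN.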