I have a CSV file whose data was originally extracted from a PDF. While analysing it, I found rows where the extracted values contain letters in place of digits.
I need the digits instead of those letters, so I am trying to replace each letter with the digit it stands for.
For example:
col_state
2567i
28981
2534s
0123o
In the table above I want to replace the letters with digits (i=1, s=5, o=0).
Expected Output:
col_state
25671
28981
25345
01230
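One way the mapping above could be applied is a character-for-character translation. The following is only a minimal sketch: it assumes the column holds strings (so leading zeros such as in 0123o are kept) and that i, s and o are the only letters that need fixing. The name letter_to_digit and the hand-built DataFrame are illustrative, not taken from the real file.

```python
import pandas as pd

# Sample data mirroring the example column; values are kept as strings
# so that leading zeros such as "0123o" survive.
df = pd.DataFrame({"col_state": ["2567i", "28981", "2534s", "0123o"]})

# Assumed letter-to-digit mapping from the example: i=1, s=5, o=0.
letter_to_digit = str.maketrans({"i": "1", "s": "5", "o": "0"})

# Translate every character through the table and store the result back.
df["col_state"] = df["col_state"].str.translate(letter_to_digit)
print(df["col_state"].tolist())
```

When the data comes from a CSV file, reading it with pd.read_csv(..., dtype=str) is one way to keep such a column as text instead of having it parsed as integers.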
What I have tried so far:
import re
chars_to_remove = ['i', '1', 's', '5', '']
regular_expression = '[' + re.escape(''.join(chars_to_remove)) + ']'
df['col_state'].str.replace(regular_expression, '', regex=True)
print(df['col_state'])
So I have no clue how to handle this problem :(
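For comparison, here is a hedged sketch of how the regex attempt above might be turned into a replacement instead of a deletion. The attempt builds a character class that also matches the digits 1 and 5, replaces every match with an empty string, and never assigns the result back to the DataFrame. The sketch below matches only the letters, looks each one up in a dictionary, and stores the result. The names mapping and pattern are illustrative, and the mapping is assumed to cover all misread letters.

```python
import re
import pandas as pd

# Sample data mirroring the example column, kept as strings.
df = pd.DataFrame({"col_state": ["2567i", "28981", "2534s", "0123o"]})

# Assumed mapping from misread letters to the intended digits.
mapping = {"i": "1", "s": "5", "o": "0"}

# Character class matching only the letters to fix, here "[iso]".
pattern = "[" + re.escape("".join(mapping)) + "]"

# Replace each matched letter with its digit and assign the result back;
# str.replace returns a new Series rather than modifying in place.
df["col_state"] = df["col_state"].str.replace(
    pattern, lambda m: mapping[m.group(0)], regex=True
)
print(df["col_state"].tolist())
```

The callable passed as the replacement receives each regex match, so every letter can be given its own digit in a single pass.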