I have a CSV file with two formatted columns that currently read in as objects:

  1. contains percentage values which read in as strings like '0.01%'. The % is always at the end.

  2. contains currency values which read in as strings like '$1234.5'.

I have tried using the split function to remove the % or $ inside the dataframe, then calling float on the result of the split. This prints the correct result but does not assign the value back to the column. It also raises a type error saying that float does not have a split function, even though I call split before float.

  • Thanks to all who helped. Commented Aug 26, 2018 at 20:26

2 Answers

Try this:

import pandas as pd

df = pd.read_csv('data.csv')

"""
The example df looks like this:
    col1     col2
0  3.04%  $100.25
1  0.15%    $1250
2  0.22%     $322
3  1.30%     $956
4  0.49%     $621
"""

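# Keep the text before '%' and the text after '$'.
# n is passed by keyword because recent pandas versions no longer accept it positionally.
df['col1'] = df['col1'].str.split('%', expand=True)[0]
df['col2'] = df['col2'].str.split('$', n=1, expand=True)[1]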

df[['col1', 'col2']] = df[['col1', 'col2']].apply(pd.to_numeric)
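
As an alternative sketch, the symbols can be stripped rather than split off. This assumes the same col1 and col2 names as the example above and builds a small stand-in DataFrame instead of reading data.csv, so the values here are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data, using values from the example above.
df = pd.DataFrame({
    "col1": ["3.04%", "0.15%", "0.22%"],
    "col2": ["$100.25", "$1250", "$322"],
})

# Strip the trailing '%' and the leading '$', then convert the text to numbers.
df["col1"] = pd.to_numeric(df["col1"].str.rstrip("%"))
df["col2"] = pd.to_numeric(df["col2"].str.lstrip("$"))

print(df.dtypes)
```

Note that col1 stays in percent units (3.04, not 0.0304); divide by 100 if a fraction is wanted.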

You are probably looking for the apply method, which calls a function on every value in the column:

df['first_col'] = df['first_col'].apply(lambda x: float(x.strip('%')))
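
One plausible cause of the type error in the question, which I am assuming rather than confirming from the data, is that some cells are already floats: either missing values (NaN) or values converted by an earlier run. Floats have no string methods, so the lambda above would fail on them. A minimal sketch of a more tolerant version, where to_number is a hypothetical helper name:

```python
import math
import pandas as pd

def to_number(value):
    """Convert text like '0.01%' or '$1234.5' to float; pass non-strings through."""
    if isinstance(value, str):
        # Remove surrounding whitespace, then any '%' or '$' at either end.
        return float(value.strip().strip("%$"))
    return value  # already numeric, or NaN for a missing cell

# Hypothetical column containing a missing value, stored as a float NaN.
s = pd.Series(["0.01%", float("nan"), "1.30%"], dtype=object)
result = s.apply(to_number)
print(result.tolist())
```

Assigning the result back, as in df['first_col'] = df['first_col'].apply(to_number), is what makes the change stick. Calling apply without assigning only shows the converted values.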
