I had an XLSX file with two columns, months and revenue, and saved it as a CSV file. When I read the CSV file with pandas, the revenue column comes back with dtype object. How can I convert this column to float?

import pandas as pd

data = pd.read_csv('revenue.csv')  # placeholder file name
data['revenue']

7980.79
NaN
1000.25
17800.85
.....
NaN
2457.85
6789.33

This is the column I want to convert, but each attempt gives me a different error.

I tried astype and to_numeric, but without success.

One of the errors I got is:

Cannot parse a string '798.79'

2 Answers

2

Using nucsit026's answer as a starting point, create a slightly different DataFrame in which the values are strings:

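import numpy as np
import pandas as pd

dic = {'revenue':['7980.79',np.nan,'1000.25','17800.85','None','2457.85','6789.33']}
df = pd.DataFrame(dic)  # build the DataFrame before printing it
print(df)
print(df['revenue'].dtypes)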

Output:

    revenue
0   7980.79
1   NaN
2   1000.25
3   17800.85
4   None
5   2457.85
6   6789.33

object

Try this:

df['revenue'] = pd.to_numeric(df['revenue'], errors='coerce').fillna(0)

errors='coerce' turns every value that cannot be parsed (such as the string 'None') into NaN, and fillna(0) then replaces all NaN values with 0.

Output:

0     7980.79
1        0.00
2     1000.25
3    17800.85
4        0.00
5     2457.85
6     6789.33
Name: revenue, dtype: float64
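
If the conversion turns every value into NaN (or into 0 after fillna), it can help to look at exactly which raw strings failed to parse. The following is only a minimal sketch: the sample values and the names `converted` and `bad` are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two clean values, one missing value,
# and two strings that to_numeric cannot parse
df = pd.DataFrame(
    {'revenue': ['7980.79', np.nan, "'1000.25'", '16 004 258.24', '2457.85']}
)

converted = pd.to_numeric(df['revenue'], errors='coerce')

# Values that were present originally but became NaN during conversion
bad = df.loc[converted.isna() & df['revenue'].notna(), 'revenue']

# repr() makes stray quotes and spaces visible
print(bad.map(repr).tolist())
```

Printing the offending values with repr usually shows whether quotes, spaces, or some other character is getting in the way.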

EDIT:

Judging from the error you shared, if literal quote characters in the values are the problem, you can remove them with

df['revenue'] = df['revenue'].str.strip("'")

and then convert to float with the to_numeric line shown earlier in this answer.
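
Put together, the two steps might look like the sketch below. It assumes the values really do carry literal single quotes, which the error message only hints at; the sample data is invented.

```python
import pandas as pd

# Hypothetical sample with literal single quotes inside the strings
df = pd.DataFrame({'revenue': ["'798.79'", "'1000.25'", None]})

# Strip leading/trailing single quotes, then convert.
# Missing values pass through and end up as NaN.
df['revenue'] = pd.to_numeric(df['revenue'].str.strip("'"), errors='coerce')
print(df['revenue'].dtype)
```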

EDIT 2:

It turned out that the OP's column values contain spaces as thousands separators, like this:

Month  Revenue
Apr-13 16 004 258.24
May-13
Jun-13 16 469 157.71
Jul-13 19 054 861.01
Aug-13 20 021 803.71
Sep-13 21 285 537.45
Oct-13 22 193 453.80
Nov-13 21 862 298.20
Dec-13 10 053 557.64
Jan-14 17 358 063.34
Feb-14 19 469 161.04
Mar-14 22 567 078.21
Apr-14 20 401 188.64

In this case, use the following code to remove the spaces:

df['revenue'] = df['revenue'].replace(' ', '', regex=True)

and then perform the conversion to float.
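
As a rough end-to-end sketch of that cleanup, using a few rows from the table above: the last value deliberately uses a non-breaking space (`\u00a0`) as the separator. That is an assumption for illustration, since spreadsheet exports sometimes contain it, and nothing in the question confirms it. The name `cleaned` is also just illustrative.

```python
import pandas as pd

# A few rows from the table above; the last value uses non-breaking
# spaces (assumed for illustration, not confirmed by the question)
df = pd.DataFrame({
    'Month': ['Apr-13', 'May-13', 'Jun-13', 'Jul-13'],
    'Revenue': ['16 004 258.24', None, '16 469 157.71', '19\u00a0054\u00a0861.01'],
})

cleaned = (
    df['Revenue']
    .str.replace(' ', '', regex=False)        # ordinary spaces
    .str.replace('\u00a0', '', regex=False)   # non-breaking spaces
)

# The empty May-13 entry stays NaN instead of raising an error
df['Revenue'] = pd.to_numeric(cleaned, errors='coerce')
print(df)
```

Leaving the missing month as NaN rather than filling it with 0 keeps it out of sums and means; whether that is preferable depends on what the column is used for.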

7 Comments

I tried your code and now my revenue column is all 0s. What happened to my numbers, and how do I take it from there?
What's the output if you remove the fillna part?
Without the fillna part, everything is now NaN.
What does your data look like? I tried the data you gave and it worked fine for me. Can you share some of the data set in its original form?
0

From the link above:

import pandas as pd

dic = {'revenue':[7980.79,None,1000.25,17800.85,None,2457.85,6789.33]}
df = pd.DataFrame(dic)
df['revenue'] = df.revenue.astype(float)  # None becomes NaN
df

Output:

    revenue
0   7980.79
1   NaN
2   1000.25
3   17800.85
4   NaN
5   2457.85
6   6789.33

2 Comments

This is perfect and it worked. However, my revenue column has 3730 values. What will my dic look like? Should I use a for loop and then append?
df['revenue'] = df.revenue.astype(float) converts all of the column's values (all 3730) to float in one step. I only used dic as an example for the explanation; you don't need it.
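
To illustrate that last comment, here is a small sketch that reads CSV text and converts the whole column in one call, with no dictionary and no loop. The CSV content is invented, `io.StringIO` stands in for the real file, and `dtype=str` is only there to reproduce a column of strings like the one in the question.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file
csv_text = "months,revenue\nJan,7980.79\nFeb,\nMar,1000.25\n"

# Force the column to be read as strings to mimic the question's situation
data = pd.read_csv(io.StringIO(csv_text), dtype={'revenue': str})

# astype converts every row of the column at once
data['revenue'] = data['revenue'].astype(float)
print(data['revenue'].dtype)
```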