
I am trying to convert all the data in the column Yield_pct of my dataframe df (shown below) to float, but I am running into problems. I get the error ValueError: could not convert string to float: ' na', so I added the line if row1['Yield_pct'] != 'na': to my code below, yet I still get the same error after adding it.

              Date      Maturity      Yield_pct Currency
0       1986-01-01          0.25             na      CAD
1       1986-01-02          0.25   0.0948511020      CAD
2       1986-01-03          0.25   0.0972953210      CAD
3       1986-01-06          0.25   0.0965403640      CAD
4       1986-01-07          0.25   0.0953292440      CAD


for (i1, row1) in (df.iterrows()):
    if row1['Yield_pct'] != 'na':
        row1['Yield_pct'] = float(row1['Yield_pct'])
        if isinstance(row1['Yield_pct'], float)==1:
            print('SUCCESS')
        else:
            print('FAILURE')

Thank You

Edit: This is the lower part of the dataframe df:

920538  2015-01-19  empty string                     CAD
920539  2015-01-20  empty string                     CAD
920540  2015-01-21  empty string                     CAD
920541  2015-01-22  empty string                     CAD
920542  2015-01-23  empty string                     CAD
920543  2015-01-26  empty string                     CAD

Code that I am now using:

df = update('CAD')[0]
for (i1, row1) in (df.iterrows()):
    df = df.convert_objects(convert_numeric=True)
    if isinstance(row1['Yield_pct'], float)==1:
        print('SUCCESS')
    else:
        print('FAILURE')
  • What's your question? Notice that your error says you cannot convert string ' na' to a float. You're missing the space before the 'na' in your code. That's a starting place. Commented Jun 29, 2015 at 15:17
  • convert_objects can handle empty strings Commented Jun 29, 2015 at 15:56
  • @EdChum Thank You. However, when I try this FAILURE still keeps getting printed out. Commented Jun 29, 2015 at 15:59
  • Please edit your question to include the code you tried that now doesn't work Commented Jun 29, 2015 at 16:33
  • I don't know why you repeatedly call convert_objects; you only need to call it once. Also, when you say it fails, what does df.info() show after calling convert_objects? You may be seeing what you think is an error because the dtype is actually np.float64 Commented Jun 29, 2015 at 16:37
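
The last comment suggests converting once and then inspecting the column dtype rather than testing each row inside the loop. A minimal sketch of that check follows, assuming a small made-up frame that mirrors the question's data; it uses pd.to_numeric in place of convert_objects (a substitution, since convert_objects is not available in newer pandas releases), and is_float_dtype is just one way to test the result.

```python
import pandas as pd
from pandas.api.types import is_float_dtype

# Hypothetical sample: strings, including ' na' and an empty string.
df = pd.DataFrame({"Yield_pct": [" na", "0.0948511020", ""]})

# Convert the whole column once, outside any loop.
df["Yield_pct"] = pd.to_numeric(df["Yield_pct"], errors="coerce")

# Inspect the column dtype instead of checking each row with isinstance.
df.info()
print("SUCCESS" if is_float_dtype(df["Yield_pct"]) else "FAILURE")
```

Note that rows produced by an iterrows() call made before the conversion still hold the old string values, which is one possible reason the loop in the question keeps printing FAILURE.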

1 Answer


Just use convert_objects, it will coerce any duff values into NaN:

In [75]:
df = df.convert_objects(convert_numeric=True)
df

Out[75]:
         Date  Maturity  Yield_pct Currency
0  1986-01-01      0.25        NaN      CAD
1  1986-01-02      0.25   0.094851      CAD
2  1986-01-03      0.25   0.097295      CAD
3  1986-01-06      0.25   0.096540      CAD
4  1986-01-07      0.25   0.095329      CAD
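
For readers on newer pandas releases, where convert_objects has been deprecated and later removed, a similar coercion can be sketched with pd.to_numeric and errors="coerce". The sample frame below is hypothetical and only mirrors the shape of the data in the question; this is an alternative to the call above, not the method used in this answer.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data,
# including ' na' (with a leading space) and an empty string.
df = pd.DataFrame({
    "Date": ["1986-01-01", "1986-01-02", "1986-01-03", "2015-01-19"],
    "Maturity": [0.25, 0.25, 0.25, 0.25],
    "Yield_pct": [" na", "0.0948511020", "0.0972953210", ""],
    "Currency": ["CAD"] * 4,
})

# errors="coerce" replaces anything that cannot be parsed with NaN
# instead of raising ValueError.
df["Yield_pct"] = pd.to_numeric(df["Yield_pct"], errors="coerce")

print(df.dtypes)
print(df)
```

Because the conversion is applied to the whole column at once, no row-by-row float() call is needed afterwards.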

6 Comments

I didn't know this was a thing. Beautiful.
@EdChum Thank You. But when I try the lines df = df.convert_objects(convert_numeric=True) row1['Yield_pct'] = float(row1['Yield_pct']) I don't know why this is.
I don't understand your comment, using convert_objects you don't need any further casting
@EdChum Unfortunately all the data in the Column are Strings, so I still need to convert those into float and so I use row1['Yield_pct'] = float(row1['Yield_pct']). If I use df = df.convert_objects(convert_numeric=True) alone, I won't be able to satisfy this requirement.
Does df['Yield_pct'] = df['Yield_pct'].astype(np.float64) work? If it doesn't work it's because you have some strange values that you've not shown in your sample data, can you post this
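
If astype fails, one way to surface the "strange values" asked about above is to coerce a copy of the column and list the entries that did not parse. This is only a sketch under assumptions: the Series s is a made-up stand-in for df['Yield_pct'], and pd.to_numeric is used purely as a diagnostic.

```python
import pandas as pd

# Hypothetical stand-in for df['Yield_pct'].
s = pd.Series([" na", "0.0948511020", "", "0.0965403640"], name="Yield_pct")

# Coerce a copy; unparseable entries become NaN.
converted = pd.to_numeric(s, errors="coerce")

# Entries that were present in the original but failed to parse.
bad_values = s[converted.isna() & s.notna()].unique().tolist()
print(bad_values)
```

Printing the distinct offenders makes leading spaces (as in ' na') and empty strings easy to spot.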