
I'm using a pandas DataFrame containing columns like the ones below:

import numpy as np
import pandas as pd

data = {'chindice': ['-1', '5.89 e-06', '6.76 e-06', '6.31 e-06', '1', '4', np.nan],
        'target': ['classe1', 'classe2', 'classe3', np.nan, 'classe5', 'classe4', 'classe5']}
df = pd.DataFrame(data)

I need to use the column "chindice" as float, but I couldn't because the column's dtype is 'object'. Any help would be appreciated. I am a newbie to pandas. Thanks

2 Answers

5

You can use to_numeric after stripping the problematic space in your scientific notation entries using str.replace:

In [15]:
df['chindice'] = pd.to_numeric(df['chindice'].str.replace(' ',''), errors='coerce')
df

Out[15]:
   chindice   target
0 -1.000000  classe1
1  0.000006  classe2
2  0.000007  classe3
3  0.000006      NaN
4  1.000000  classe5
5  4.000000  classe4
6       NaN  classe5

Don't worry about the display, the real value is still there:

In [17]:
df['chindice'].iloc[1]

Out[17]:
5.8900000000000004e-06
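
For reference, the steps above can be put together as a small self-contained sketch. It assumes a recent pandas version, where `errors='coerce'` is the option that turns unparseable entries into NaN; the intermediate name `cleaned` and the explicit `regex=False` are illustrative additions, not part of the original answer.

```python
import numpy as np
import pandas as pd

data = {'chindice': ['-1', '5.89 e-06', '6.76 e-06', '6.31 e-06', '1', '4', np.nan]}
df = pd.DataFrame(data)

# Remove the literal space inside entries such as '5.89 e-06'
# (regex=False makes the pattern a plain string, not a regular expression)
cleaned = df['chindice'].str.replace(' ', '', regex=False)

# errors='coerce' converts anything that still cannot be parsed into NaN
df['chindice'] = pd.to_numeric(cleaned, errors='coerce')

print(df['chindice'].dtype)    # the column is now a float dtype
print(df['chindice'].iloc[1])  # the stored value keeps its full precision
```

Missing values stay missing: the original NaN passes through both steps unchanged.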

0

You can replace arbitrary whitespace (\s+) with str.replace and then cast to float with astype:

df['chindice'] = df['chindice'].str.replace(r'\s+', '', regex=True).astype(float)
print(df)
   chindice   target
0 -1.000000  classe1
1  0.000006  classe2
2  0.000007  classe3
3  0.000006      NaN
4  1.000000  classe5
5  4.000000  classe4
6       NaN  classe5

# temporarily display with precision 8
with pd.option_context('display.precision', 8):
    print(df)
     chindice   target
0 -1.00000000  classe1
1  0.00000589  classe2
2  0.00000676  classe3
3  0.00000631      NaN
4  1.00000000  classe5
5  4.00000000  classe4
6         NaN  classe5
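
If scientific notation is easier to read for values this small, the same temporary-option pattern should also work with `display.float_format`. This is a minimal sketch under that assumption, not part of the original answer; the Series `s` and the `'{:.2e}'` format string are illustrative choices.

```python
import pandas as pd

# Illustrative values taken from the question's column
s = pd.Series([-1.0, 5.89e-06, 4.0], name='chindice')

# Render floats in scientific notation only inside this block;
# the stored values are not modified
with pd.option_context('display.float_format', '{:.2e}'.format):
    text = s.to_string()

print(text)
```

As with `display.precision`, the option reverts to its previous setting once the `with` block exits.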
