I'm trying to convert multiple columns in my DataFrame from string to int and float. I used the fillna method to replace the NaN values, but I'm still getting an error. Here's what I did:

df = [['column1', 'column2', 'column3', 'column4']]

df = df[['column1', 'column2', 'column3', 'column4']].fillna(0)
convert_dict = {'column1': int,
                'column2': float,
                'column3': int,
                'column4': float}
df = df.astype(convert_dict)

The error says ValueError: cannot convert float NaN to integer

Edit: Removed inplace=True

  • You have np.nan in the column you want to convert to int. Commented Sep 3, 2020 at 17:19
  • df = [[...]].fillna(0, inplace=True) doesn't make sense. I'm assuming you are missing a df in there, and you don't need inplace if you are assigning the result to df. Commented Sep 3, 2020 at 17:20
  • @BEN_YO yes. Does np.nan work with fillna? Commented Sep 3, 2020 at 17:20
  • Should you be using df = df[['column1'... instead of df = [['column1'... ? Commented Sep 3, 2020 at 17:32
  • df = df.astype(convert_dict, errors = 'ignore') should work. Commented Sep 3, 2020 at 17:44

1 Answer

You could use Int64 (note the capital I), the pandas nullable integer dtype, which stores missing values as <NA> instead of failing on NaN:

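import pandas as pd

# Sample data with one missing value in each column
df = pd.DataFrame({'A': [1, 2, None, 4],
                   'B': [1.0, 2.0, 3.0, None]})

# 'Int64' (capital I) is the nullable integer dtype; plain int cannot hold NaN
convert_dict = {'A': 'Int64', 'B': float}

# Convert each column to its target dtype
for field, new_type in convert_dict.items():
    df[field] = df[field].astype(new_type)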

print(df)
print(df.dtypes)

      A    B
0     1  1.0
1     2  2.0
2  <NA>  3.0
3     4  NaN

A      Int64
B    float64
dtype: object
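
Since the question says the columns start out as strings, here is a minimal sketch of both routes applied to that case. The sample values are made up for illustration, and only two of the four columns are shown. It assumes the strings are parsed with pd.to_numeric first. After that, you can either fill the gaps with 0 and cast to plain int, or keep the gaps by casting to Int64.

```python
import pandas as pd

# Hypothetical data: numeric values stored as strings, with gaps.
df = pd.DataFrame({
    "column1": ["1", "2", None, "4"],
    "column2": ["1.5", None, "3.5", "4.0"],
})

convert_dict = {"column1": int, "column2": float}

# Parse the strings first; missing or unparseable entries become NaN.
numeric = df[list(convert_dict)].apply(pd.to_numeric, errors="coerce")

# Option 1: replace NaN with 0 (keeping the returned frame), then cast.
filled = numeric.fillna(0).astype(convert_dict)

# Option 2: keep the gaps by using the nullable Int64 dtype for integer columns.
nullable = numeric.astype({"column1": "Int64", "column2": float})

print(filled.dtypes)
print(nullable.dtypes)
```

Option 1 changes the data, because a missing value becomes a real 0. Option 2 preserves the distinction at the cost of using the extension dtype.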