1

I have a pandas DataFrame column that contains float values along with the string "NA". I need to replace these NA values with the mean, using the following code.

trainTestJoin["col1"] = trainTestJoin.groupby("col2")["col1"].transform(lambda x: x.fillna(x.median()))

I am getting

TypeError: could not convert string to float: NA

I tried converting the column to float before filling it.

trainTestJoin["LotFrontage"].astype(float)

But it raises the same error. How can I solve this?

1
  • What does the data frame look like? What's the desired output? Commented Dec 26, 2018 at 1:12

2 Answers

1

Convert the column to numeric with pd.to_numeric. With errors='coerce', any value that cannot be converted becomes float NaN:

df['col1'] = pd.to_numeric(df['col1'], errors='coerce')

Then use groupby + transform directly:

df['col1'] = df['col1'].fillna(df.groupby('col2')['col1'].transform('mean'))
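To make the two steps above concrete, here is a minimal sketch on a small made-up frame. The sample data is hypothetical (it is not from the question); only the column names col1 and col2 follow the post.

```python
import pandas as pd

# Hypothetical sample data: floats mixed with the string "NA" (object dtype).
df = pd.DataFrame({
    "col2": ["a", "a", "a", "b", "b", "b"],
    "col1": [1.0, "NA", 3.0, 10.0, 20.0, "NA"],
})

# Step 1: coerce to numeric; anything non-convertible (here "NA") becomes NaN.
df["col1"] = pd.to_numeric(df["col1"], errors="coerce")

# Step 2: compute the per-group mean aligned to the original rows,
# then use it to fill only the missing entries.
group_mean = df.groupby("col2")["col1"].transform("mean")
df["col1"] = df["col1"].fillna(group_mean)

print(df)
```

If the median is wanted, as in the question's original code, passing "median" to transform instead of "mean" should work the same way.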

0

Or simply replace:

trainTestJoin['col1'] = trainTestJoin['col1'].replace('NA', np.nan)

And then simply:

trainTestJoin['col1'] = trainTestJoin['col1'].fillna(trainTestJoin.groupby('col2')['col1'].transform('mean'))

After both steps:

print(trainTestJoin)

shows the expected output.
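A self-contained sketch of the replace approach, run on hypothetical sample data, might look like this. The .astype(float) call is an addition that is not part of the answer above: it is an assumption that the column can still be object dtype after replace, depending on the pandas version, so the cast makes the numeric dtype explicit.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: floats mixed with the string 'NA'.
trainTestJoin = pd.DataFrame({
    "col2": ["x", "x", "x", "y", "y"],
    "col1": [2.0, "NA", 4.0, "NA", 8.0],
})

# Swap the literal string 'NA' for a real missing value, then cast to float
# (the cast is an added safeguard, not part of the original answer).
trainTestJoin["col1"] = trainTestJoin["col1"].replace("NA", np.nan).astype(float)

# Fill each missing value with the mean of its col2 group.
trainTestJoin["col1"] = trainTestJoin["col1"].fillna(
    trainTestJoin.groupby("col2")["col1"].transform("mean")
)

print(trainTestJoin)
```

Note that replace only catches the exact string 'NA'; other non-numeric strings would still need pd.to_numeric with errors='coerce'.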
