Sorry, I've seen many questions related to this error, but even with all that information I can't solve it.

I have a dataframe df with a column named int_rate. The column's dtype is O (object). It holds percentages, so each value looks like: 10.95 % I need to remove the % sign so that I can then convert the column to a numeric type. I've tried the following code:

df['int_rate']=df['int_rate'].apply(lambda x: x[:-1])

I get the following error:

TypeError: 'float' object is not subscriptable.

The first thing I don't understand is why the error mentions a float object when my column's type is not float. And given that, how can I get rid of the % sign?

  • If I am not mistaken, the error comes from the lambda expression and not from the data frame: you are effectively doing float[:-1], and a float is not subscriptable. Please provide a data sample. Commented May 11, 2021 at 18:36
  • Can you give a sample of the dataframe? And did you check the dtypes of the columns? Commented May 11, 2021 at 18:37
  • It may well be that your data frame contains floats and your display options are set to show them as percentages. In that case you see a string like 10.95% because of pd.options.display.float_format, but the underlying data is actually a float. Commented May 11, 2021 at 19:00
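
The display-option scenario from the last comment can be sketched as follows. The sample values are made up for illustration, and this is only one way to tell the two cases apart: a column that merely displays as a percentage reports a float64 dtype, whereas the question reports dtype O.

```python
import pandas as pd

# Made-up sample: the column really holds floats, not strings.
df = pd.DataFrame({"int_rate": [10.95, 7.5]})

# Temporarily render floats with a trailing percent sign.
with pd.option_context("display.float_format", "{:.2f} %".format):
    rendered = df.to_string()

print(rendered)              # the values are displayed with a % sign
print(df["int_rate"].dtype)  # the stored data is still float64
```

If the dtype check prints object instead, the percent signs are part of the stored values and need to be stripped.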

3 Answers

You have an 'object' column, so it may hold a mix of types. First cast each value to a string, then drop the last character (you can also replace the percent sign instead, as shivam suggested).

df['int_rate'] = df['int_rate'].apply(lambda x: str(x)[:-1]  if str(x).endswith('%') else x)
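
A variant of the same idea that also converts the result to numbers might look like the sketch below. The helper name strip_percent and the sample values are hypothetical, and it assumes every non-string entry is already numeric (including NaN).

```python
import pandas as pd

def strip_percent(value):
    """Hypothetical helper: return the value as a float, dropping a trailing '%'."""
    if isinstance(value, str):
        # Remove surrounding whitespace, then the percent sign, then any space left before it.
        return float(value.strip().rstrip("%").strip())
    # Assumed to be numeric already; NaN passes through unchanged.
    return float(value)

# Made-up sample mixing strings, a missing value and a plain float.
df = pd.DataFrame(
    {"int_rate": pd.Series(["10.95 %", float("nan"), 7.5, "13%"], dtype=object)}
)
df["int_rate"] = df["int_rate"].apply(strip_percent)
print(df["int_rate"])
```

An entry such as None or a string that is not a number would raise an error here, which may be preferable to silently keeping bad values.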

Your data probably contains some NaN values; NaN is a float in pandas, which is why the error mentions a float.

You can run df.info() and compare the Non-Null Count for int_rate with the total number of rows: a lower count means the column contains NaN values.

If there are NaN values, you can avoid the error by applying your code only to the non-NaN entries, as follows:
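
As a complement to df.info(), the Python types actually stored in the column can be counted directly. This is a small sketch on made-up data, not the asker's dataframe:

```python
import numpy as np
import pandas as pd

# Made-up sample: an object column with one missing value.
df = pd.DataFrame(
    {"int_rate": pd.Series(["10.95 %", np.nan, "7.50 %"], dtype=object)}
)

# Count the Python type of each element; NaN shows up as 'float'.
type_counts = df["int_rate"].map(lambda x: type(x).__name__).value_counts()
print(type_counts)

# Number of missing entries in the column.
n_missing = int(df["int_rate"].isna().sum())
print(n_missing)
```

If float appears in the counts while the dtype is object, those entries are the ones that make x[:-1] fail.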

df['int_rate'] = df.loc[df['int_rate'].notna(), 'int_rate'].apply(lambda x: x[:-1])

Here the boolean mask df['int_rate'].notna() selects only the non-NaN rows for processing; the excluded rows remain NaN after the assignment.
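
The masking approach can be combined with a numeric conversion, as in this sketch on made-up data. It assumes every non-missing entry is a string such as '10.95 %'; the names mask and cleaned are only illustrative.

```python
import numpy as np
import pandas as pd

# Made-up sample: strings with a percent sign plus one missing value.
df = pd.DataFrame(
    {"int_rate": pd.Series(["10.95 %", np.nan, "7.50 %"], dtype=object)}
)

mask = df["int_rate"].notna()

# Strip trailing spaces and percent signs on the non-missing rows only.
cleaned = df.loc[mask, "int_rate"].str.rstrip(" %")

# Assignment aligns on the index, so rows left out of `cleaned` become NaN.
df["int_rate"] = pd.to_numeric(cleaned)
print(df["int_rate"])
```

Because the assignment aligns on the index, the column ends up as float64 with NaN in the rows that were skipped.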

1 Comment

@CristinaDominguezFernandez Please consider also to upvote the solution if you like it :-)

Presumably not all values are of the form 'nnn.mm %'; some are floats. Could you try this?

df['int_rate']=df['int_rate'].apply(lambda x: x if isinstance(x, float) else float(x[:-1]))
