0

My DataFrame is named reviews.

I get the following error when I try to convert the Rating column to integer:

"Cannot convert non-finite values (NA or inf) to integer"

How can I fix it?

reviews.replace([np.inf, -np.inf], np.nan)
reviews.dropna() 

reviews['Rating'].astype('int')
3
  • It's hard for us to tell what the issue is if we don't know what the Dataframe looks like. Commented Dec 23, 2018 at 12:11
  • Determine what the non-numeric values are, and where they come from. Determine what integer representation would be appropriate. Code that! Commented Dec 23, 2018 at 12:11
  • @Gokce, you should accept the answer: that marks the question as answered and removes it from the unanswered queue. You can also upvote. Commented Dec 23, 2018 at 13:24

2 Answers

1

The simplest way is to first replace the infs with NaN and then use dropna:

Example DataFrame:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'col1':[1, 2, 3, 4, 5, np.inf, -np.inf], 'col2':[6, 7, 8, 9, 10, np.inf, -np.inf]})

>>> df
       col1       col2
0  1.000000   6.000000
1  2.000000   7.000000
2  3.000000   8.000000
3  4.000000   9.000000
4  5.000000  10.000000
5       inf        inf
6      -inf       -inf

Solution 1:

Create a new DataFrame, df_new, so that the original DataFrame is preserved and the cleaned result is kept separately.

>>> df_new = df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all").astype(int)
>>> df_new
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10
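
One caveat with how="all": it drops a row only when every listed column is missing. The example above works because inf appears in both columns of the same rows. A minimal sketch, using a made-up frame named mixed (not from the question), of how a row with a single inf behaves under each setting:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 has inf in col1 only, row 2 has inf in both columns.
mixed = pd.DataFrame({"col1": [1.0, np.inf, np.inf], "col2": [6.0, 7.0, -np.inf]})
cleaned = mixed.replace([np.inf, -np.inf], np.nan)

# how="all" keeps row 1 because col2 still holds a real value there.
kept_all = cleaned.dropna(subset=["col1", "col2"], how="all")

# how="any" (the default) drops every row with at least one missing value.
kept_any = cleaned.dropna(subset=["col1", "col2"], how="any")

print(len(kept_all))  # 2
print(len(kept_any))  # 1

# Only the how="any" result is free of NaN, so only it can be cast to int.
print(kept_any.astype(int))
```

If your data can have a non-finite value in just one column, how="any" is probably the safer choice before calling astype(int).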

Solution 2:

Using isin and ~:

>>> ff = df.isin([np.inf, -np.inf, np.nan]).all(axis='columns')
>>> df[~ff].astype(int)
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10

Or apply it directly to the original DataFrame: use pd.DataFrame.isin to flag the non-finite values, then pd.DataFrame.any to find the rows that contain at least one of them. Finally, use the negated boolean mask to slice the DataFrame.

>>> df = df[~df.isin([np.nan, np.inf, -np.inf]).any(axis=1)].astype(int)
>>> df
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10

The approach above is taken from an answer by @piRSquared.

Solution 3:

You can also use DataFrame.mask with numpy.isinf, followed by dropna():

>>> df = df.mask(np.isinf(df)).dropna().astype(int)
>>> df
   col1  col2
0     1     6
1     2     7
2     3     8
3     4     9
4     5    10
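
If only one column matters, as with Rating in the question, the same mask idea can be limited to that column. This is a sketch that assumes Rating is a float column; the sample values and the App column are invented for illustration:

```python
import numpy as np
import pandas as pd

# Invented sample data standing in for the asker's reviews frame.
reviews = pd.DataFrame({
    "App": ["a", "b", "c", "d"],
    "Rating": [4.0, np.inf, np.nan, 3.0],
})

# Turn +/-inf in Rating into NaN, leaving the other columns untouched.
reviews["Rating"] = reviews["Rating"].mask(np.isinf(reviews["Rating"]))

# Drop only the rows whose Rating is missing, then cast just that column.
reviews = reviews.dropna(subset=["Rating"]).astype({"Rating": int})

print(reviews)
```

Restricting dropna to subset=["Rating"] avoids discarding rows that merely have gaps in unrelated columns.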

0

Neither .replace() nor .dropna() acts in place; that is, they do not modify the existing DataFrame unless you tell them to. If you do specify in-place operation, your code works:

reviews.replace([np.inf, -np.inf], np.nan, inplace=True)
reviews.dropna(inplace=True) 

reviews['Rating'].astype('int')

Or:

reviews = reviews.replace([np.inf, -np.inf], np.nan)
reviews = reviews.dropna() 

reviews['Rating'].astype('int')
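
Note that .astype() also returns a new object, so its result has to be assigned if you want to keep the integer column. A small self-contained sketch of the whole sequence, with invented sample values since the real reviews frame is not shown:

```python
import numpy as np
import pandas as pd

# Invented sample data; the real reviews frame is not shown in the question.
reviews = pd.DataFrame({"Rating": [5.0, np.inf, np.nan, 2.0]})

# Calling the methods without assignment leaves reviews unchanged.
reviews.replace([np.inf, -np.inf], np.nan)
reviews.dropna()
print(len(reviews))  # 4

# Reassigning keeps each result; astype is assigned back as well.
reviews = reviews.replace([np.inf, -np.inf], np.nan).dropna()
reviews = reviews.astype({"Rating": "int"})

print(reviews["Rating"].tolist())  # [5, 2]
```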
