I need to convert an 18-digit float64 pandas column to an integer or string so that it is readable without exponential notation, but I have not been successful so far.

df=pd.DataFrame(data={'col1':[915235514180670190,915235514180670208]},dtype='float64')
print(df)
       col1
0  9.152355e+17
1  9.152355e+17

Then I tried converting it to int64, but the last three digits come out wrong.

df.col1.astype('int64')
0    915235514180670208
1    915235514180670208
Name: col1, dtype: int64

As you can see, the value is wrong, and I am not sure why. I read in the documentation that int64 should be able to hold an 18-digit number.

 int64  Integer (-9223372036854775808 to 9223372036854775807)

Any idea what I am doing wrong? How can I achieve this?

Adding further information based on Eric Postpischil's comment: if float64 can't hold 18 digits, I might be in trouble. The thing is that I get this data from a database through a pandas read_sql function call, and it is automatically cast to float64. I don't see an option to specify the datatype in pandas read_sql().

Any thoughts on what I can do to overcome this problem?

  • A Float64 cannot represent 915235514180670190. When that decimal numeral is converted to Float64, the result is the nearest representable value, 915235514180670208. Converting the Float64 to decimal cannot reproduce the original value because it is gone. Commented Jul 2, 2021 at 11:18
  • The problem is that I get this data as float64 through read_sql from a DB. So what can I do? Commented Jul 2, 2021 at 12:14
  • You probably want to first figure out whether the database itself holds the information you need, or whether that information was lost when the values were entered into the database. What's the column type of the relevant column of the DB table? If the DB is already using IEEE 754 floating-point for that column then this is an impossible task. If it's using some other floating-point type with higher precision or an integer type then there may be something you can do. Commented Jul 4, 2021 at 8:12

3 Answers


The problem is that a float64 has a mantissa of 53 bits, which can represent only 15 or 16 decimal digits (ref).

That means an 18-digit float64 pandas column is an illusion. There is no need to involve pandas, or even numpy types, to see this:

>>> n = 915235514180670190
>>> d = float(n)
>>> print(n, d, int(d))
915235514180670190 9.152355141806702e+17 915235514180670208
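
To see why the result lands on ...208 specifically, the following minimal sketch (standard library only, assuming Python 3.9+ for `math.ulp`) prints the gap between neighbouring float64 values at this magnitude. The variable names are illustrative and not part of the original answer.

```python
import math

n = 915235514180670190

# Integers are exactly representable in float64 only up to 2**53.
exact_limit = 2 ** 53
print(n > exact_limit)            # True: n is beyond the exact-integer range

# Above that limit, representable values are spaced further apart.
# math.ulp gives the gap between float(n) and the next float up.
spacing = math.ulp(float(n))
print(spacing)                    # 128.0

# float(n) therefore snaps to the nearest multiple of that spacing.
nearest = int(float(n))
print(nearest, nearest - n)       # 915235514180670208 18
```

At this magnitude only every 128th integer has a float64 representation, so both values from the question collapse onto the same float.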

read_sql in Pandas has a coerce_float parameter that might help. It's on by default, and is documented as:

Attempts to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets.

Setting this to False helps, e.g. with the following schema/data:

import psycopg2

con = psycopg2.connect()

with con, con.cursor() as cur:
    cur.execute("CREATE TABLE foo ( id SERIAL PRIMARY KEY, num DECIMAL(30,0) )")
    cur.execute("INSERT INTO foo (num) VALUES (123456789012345678901234567890)")

I can run:

import pandas as pd

print(pd.read_sql("SELECT * FROM foo", con))

print(pd.read_sql("SELECT * FROM foo", con, coerce_float=False))

which gives me the following output:

   id           num
0   1  1.234568e+29

   id                             num
0   1  123456789012345678901234567890

preserving the precision of the value I inserted.
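
With coerce_float=False the column would normally hold Python objects rather than float64. The sketch below assumes the driver hands back `decimal.Decimal` values for a DECIMAL/NUMERIC column (psycopg2 does so by default; other drivers may differ) and shows, with the standard library only, that converting such a value straight to `int` keeps every digit, while a detour through `float` does not. The value is the one from the question, not from the table above.

```python
from decimal import Decimal

# Stand-in for one value as the driver might return it (assumption).
raw = Decimal("915235514180670190")

exact = int(raw)          # Decimal -> int is exact
lossy = int(float(raw))   # Decimal -> float -> int rounds to the nearest float64

print(exact)   # 915235514180670190
print(lossy)   # 915235514180670208
```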

You've not given many details of the database you're using, but hopefully the above is helpful to somebody!


I used a workaround to deal with this problem, and thought I would share it in case it helps someone else.

    # Prepare SQL that extracts all rows, plus col1 cast to text.
    # The alias is lowercase because Postgres folds unquoted identifiers to lowercase.
    sql = 'SELECT *, CAST(col1 AS CHAR(18)) AS dummy_col FROM table1;'

    # Get the data from Postgres.
    df = pd.read_sql(sql, self.conn)

    # Convert the text column to an integer.
    df['dummy_col'] = df['dummy_col'].astype('int64')

    # Replace the original col1 with the int64 version, then drop the helper column.
    df['col1'] = df['dummy_col']
    df.drop('dummy_col', axis=1, inplace=True)
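
The conversion step can be checked without a database. This is a minimal sketch that assumes the cast in SQL delivers the values as plain digit strings with no NULLs; the Series below is a hand-made stand-in for that column.

```python
import pandas as pd

# Stand-in for the text column returned by the CAST (assumption: no NULLs).
text_col = pd.Series(["915235514180670190", "915235514180670208"], name="col1")

# Parsing the strings directly as int64 never passes through float64,
# so no digits are lost.
as_int = text_col.astype("int64")

print(as_int.tolist())   # [915235514180670190, 915235514180670208]
print(as_int.dtype)      # int64
```

If the column can contain NULLs, a plain `astype('int64')` will raise, so those rows would need to be filtered or filled first.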
