Pandas dataframe read large number as string

Question

I am creating a DataFrame from a CSV file like this:

topcells=pd.DataFrame.from_csv("url/output_topcell.txt", header=0, sep=', ', parse_dates=True, encoding=None, tupleize_cols=False)

The column I am interested in (cell) contains long numbers (e.g. 6468716846847), which I need to be cast as strings.

After creating the DataFrame, the column's datatype seems to be numpy.float64 by default (the column includes some NaN values).

When I use:

topcells.cell=topcells.cell.astype(str)

or:

topcells['cell']=topcells['cell'].apply(lambda x: str(x))

the string I get is not actually "6468716846847" but something like "6.468716846847e+12".

How can I avoid this scientific notation and get the full number as a string?

TomAugspurger · Accepted Answer · 2014-01-08 20:47:25Z

5

You should use the read_csv function from the top-level namespace. It has more options for reading, including a dtype parameter.

For example, with tst.csv:

c1,c2,c3,c4,c5
a,b,6468716846847,12,13
d,e,6468716846848,13,14

you get:

In [11]: pd.read_csv('tst.csv', dtype={'c3': 'str'})
Out[11]: 
  c1 c2             c3  c4  c5
0  a  b  6468716846847  12  13
1  d  e  6468716846848  13  14

[2 rows x 5 columns]
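Since the column in the question also contains missing values, here is a minimal, self-contained sketch of the same idea with a gap in the data. The sample rows and the use of io.StringIO in place of a file on disk are assumptions made for illustration; only the dtype argument comes from the answer above.

```python
import io

import pandas as pd

# Hypothetical sample data; the second row has an empty c3 field.
csv_text = """c1,c2,c3,c4,c5
a,b,6468716846847,12,13
d,e,,13,14
g,h,6468716846848,14,15
"""

# Ask the parser to keep c3 as text instead of inferring a numeric type.
df = pd.read_csv(io.StringIO(csv_text), dtype={"c3": str})

# The digits are preserved as written; the empty field is still read as missing.
print(df["c3"].tolist())
print(df.dtypes)
```

Because the values never pass through a float, there is nothing to reformat afterwards. The other columns are still inferred as usual.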


1 Comment

Jeff Over a year ago

Assuming there are no NaNs in that column, you could also read it in as int64.
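A rough sketch of the integer route from this comment, again with made-up sample data read through io.StringIO. The second half uses pandas' nullable "Int64" dtype for a column that does have gaps; that part is not from the comment and assumes a reasonably recent pandas version.

```python
import io

import pandas as pd

# Hypothetical data with no missing values in c3.
csv_text = """c1,c2,c3,c4,c5
a,b,6468716846847,12,13
d,e,6468716846848,13,14
"""

# With no NaNs, c3 can be held as a plain 64-bit integer column.
df = pd.read_csv(io.StringIO(csv_text), dtype={"c3": "int64"})

# Integers convert to strings without any exponent notation.
cells = df["c3"].astype(str)
print(cells.tolist())

# Hypothetical data with a missing c3 value in the second row.
csv_with_gap = """c1,c3
a,6468716846847
d,
"""

# Assumption: the nullable "Int64" extension dtype keeps integers exact
# while still allowing missing entries.
df_gap = pd.read_csv(io.StringIO(csv_with_gap), dtype={"c3": "Int64"})
print(df_gap["c3"])
```

If the end goal is strings, reading the column with dtype str directly is simpler; the integer route is mainly useful when the values are also needed as numbers.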
