This question follows the question Problem in Pandas : impossible to do sum of int with arbitrary precision, where I used the accepted answer: df["my_int"].apply(int).sum()
But it does not work in all cases.
For example, with this file:
my_int
9220426963983292163
5657924282683240
the output is -9220659185443576213.
After looking at the apply(int) output, I understand the problem: in this case, apply(int) returns a Series of dtype int64, so the sum is computed in 64-bit arithmetic and silently wraps around.
0 9220426963983292163
1 5657924282683240
Name: my_int, dtype: int64
But with numbers too large for int64, it returns dtype: object, and the sum is done with Python's arbitrary-precision ints:
0 1111111111111111111111111111111111111111111111...
1 2222222222222222222222222222222222222222222222...
Name: my_int, dtype: object
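To illustrate the wraparound (a minimal sketch, using the two values from the file above): each value fits in int64 on its own, but their true total exceeds 2**63 - 1, so the int64 sum wraps modulo 2**64.

```python
import pandas as pd

# Both values individually fit in a signed 64-bit integer.
s = pd.Series(["9220426963983292163", "5657924282683240"], name="my_int")
converted = s.apply(int)
print(converted.dtype)  # int64

# The true total exceeds 2**63 - 1 = 9223372036854775807 ...
true_total = 9220426963983292163 + 5657924282683240
print(true_total)  # 9226084888265975403

# ... so the int64 sum wraps around modulo 2**64:
print(true_total - 2**64)  # -9220659185443576213
```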
Is it possible to solve it with pandas? Or should I follow Tim Robert's answer from the previous question?
Edit 1:
An awful workaround: a line with a large integer is appended to the end of the file,
my_int
9220426963983292163
5657924282683240
11111111111111111111111111111111111111111111111111111111111111111111111111
and then the sum is done on all lines except the last one:
data['my_int'].apply(int).iloc[:-1].sum()
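A cleaner workaround I am considering (a sketch, not from the original thread): read the column as strings so pandas never coerces it to int64, then sum with Python's built-in sum, which uses arbitrary-precision ints.

```python
import pandas as pd
from io import StringIO

# Stand-in for the real file, using the two values from the question.
csv = "my_int\n9220426963983292163\n5657924282683240\n"

# dtype=str keeps the column as object/str, so no int64 coercion happens.
df = pd.read_csv(StringIO(csv), dtype={"my_int": str})

# Python's sum over Python ints cannot overflow.
total = sum(int(x) for x in df["my_int"])
print(total)  # 9226084888265975403
```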