Python float silent overflow precision error

Question

StackOverflow has quite a few topics on the floating point representations, about exceptions, truncation, precision issues. I have been trying to explain this one but still didn't figured it out.

from operator import add, sub, mul, div

fun = 'add(add(sub(sub(safeDiv(xL1, add(D, mul(sub(add(sub(A, 0), mul(D, rv)), mul(xL1, add(D, 3))), xL1))), add(0, rv)), safeDiv(mul(sub(D, sigma2), safeDiv(sub(safeDiv(xL1, A), 1), add(safeDiv(safeDiv(B1, xL1), sub(4, xL2)), add(sigma1, xL1)))), sigma1)), add(4, B1)), add(add(A, A), sub(add(xL1, xL1), mul(xL2, safeDiv(xL1, add(sub(add(mul(D, -4), add(add(safeDiv(mul(sigma2, sigma2), safeDiv(B1, sigma1)), sub(add(D, safeDiv(xL2, B1)), D)), sub(4, B1))), A), add(mul(sigma2, xL1), mul(xL1, mul(rv, xL2)))))))))'

d = [(
51.696521954140991,
31.156112806482234,
54.629863633907163,
27.491618858013698,
26.223584534107289,
77.10005191617563,
2708.4145268939428,
0.20952943771134946,
15.558278150405643,
102.0,
225.0)]

arglabels = ['xL1', 'sigma1', 'xL2', 'sigma2', 'A', 'B1', 'D', 'rv']
other = {'add': add, 'sub':sub, 'mul':mul,'safeDiv':div}
inputs = dict(zip(arglabels, d[0][: -4] + (d[0][-3]*d[0][-4],)))
inputs.update(other)
print eval(fun, inputs)

This code should produce a result somewhere between 225 and 240, but instead returns a negative number. And that's it, no exception, no warning, nothing. So it must be a precision error somewhere that turns the result completely off.

By rounding the maximum I can use to get a reasonable result is 1 decimal place (And that gets me close to 207...), longdoubles from numpy help in some cases but is not enough. I have done it by hand (so considerable precision loss, and obtained 240).

Another detail, running in the notebook together with the main script I have this behaviour:

When I add the locals dictionary for the first time it returns a very plausible result, but the following time it goes back to the negative value. There must be some import affecting this but I cannot find it either.

What should I do to avoid this? How can I get some kind of warning generated? How can I track where it goes wrong?

EDIT: The accepted answer identifies the issue properly, take a look at the comments below the answer for more details. However, it does not discuss how to avoid it or correct the function(s). Maybe that should be a discussion for MathOverflow...

You can track where it goes wrong by separating the complex function into smaller steps and evaluating each step. Perhaps split "fun" into 4-6 pieces, depending on what's easy for you to check. When you find the faulty one, split that further until you eventually find the erroneous operation. — Prune
– Prune, Commented Oct 27, 2015 at 23:48

Orest Hera · Accepted Answer · 2015-11-13 13:08:15Z

3

+50

If the expected result is in range 225...240 the following issues are possible:

an error during generation of fun
incorrect values in d
the functions add, sub, mul and safeDiv should do something more complex than just floating point addition, subtraction, multiplication and division.

The input provided in the question cannot give anything other than -2786.17215265, since it is a perfect numerical result. There is no floating point rounding errors or overflows. Below it is shown by output of the test script with verbose version of the arithmetic functions that all floating point operations are well defined. There is nothing that can give noticeable rounding errors. There are some risky operations when close values are subtracted:

add: -10626.8589858 + 10627.794501 = 0.935515251547

However, it is far to rounding errors.

The same result can obtained by a C program or in mathematics tools (MATLAB/Octave).

The different outputs in the screenshot are due to values of the local variables that are not displayed there. Since Out[108] is the same as Out[110] I assume that ds is the same as data[17]. The local variables are used for both outputs Out[109] and Out[110], so the difference is in values of rv, since only that variable is changed in In[110]. If the values of all other variables are fixed it can be seen that the Out[109] (230.62977145178198) can be obtained if rv is equal to one of the following values: 4.075164147, 4.485709922, 51.72476610. That is also illustrated by the last output line in the test script below.

Note that if fun is analyzed as a function of rv it has two poles (around 3.3 and 51.7). So, technically the function can give results from minus infinity to plus infinity.

Test

from operator import add, sub, mul, div

fun = 'add(add(sub(sub(safeDiv(xL1, add(D, mul(sub(add(sub(A, 0), mul(D, rv)), mul(xL1, add(D, 3))), xL1))), add(0, rv)), safeDiv(mul(sub(D, sigma2), safeDiv(sub(safeDiv(xL1, A), 1), add(safeDiv(safeDiv(B1, xL1), sub(4, xL2)), add(sigma1, xL1)))), sigma1)), add(4, B1)), add(add(A, A), sub(add(xL1, xL1), mul(xL2, safeDiv(xL1, add(sub(add(mul(D, -4), add(add(safeDiv(mul(sigma2, sigma2), safeDiv(B1, sigma1)), sub(add(D, safeDiv(xL2, B1)), D)), sub(4, B1))), A), add(mul(sigma2, xL1), mul(xL1, mul(rv, xL2)))))))))'

d = [(
51.696521954140991,
31.156112806482234,
54.629863633907163,
27.491618858013698,
26.223584534107289,
77.10005191617563,
2708.4145268939428,
0.20952943771134946,
15.558278150405643,
102.0,
225.0)]

def add_verbose(a,b):
    res = a + b;
    print "add: {0} + {1} = {2}".format(a,b,res);
    return res;

def sub_verbose(a,b):
    res = a - b;
    print "sub: {0} - {1} = {2}".format(a,b,res);
    return res;

def div_verbose(a,b):
    res = a / b;
    print "div: {0} / {1} = {2}".format(a,b,res);
    return res;

def mul_verbose(a,b):
    res = a * b;
    print "mul: {0} * {1} = {2}".format(a,b,res);
    return res;

arglabels = ['xL1', 'sigma1', 'xL2', 'sigma2', 'A', 'B1', 'D', 'rv']
other = {'add': add_verbose, 'sub':sub_verbose, 'mul':mul_verbose,'safeDiv':div_verbose}
inputs = dict(zip(arglabels, d[0][: -4] + (d[0][-3]*d[0][-4],)))
inputs.update(other)

# out 108
print eval(fun, inputs)

# set locals that can give out 109
safeDiv = div;
rv = 4.0751641470166795256;
inputs.update(locals());

# out 109
print eval(fun, inputs)

Output

sub: 26.2235845341 - 0 = 26.2235845341
mul: 2708.41452689 * 3.25991727261 = 8829.20729761
add: 26.2235845341 + 8829.20729761 = 8855.43088215
add: 2708.41452689 + 3 = 2711.41452689
mul: 51.6965219541 * 2711.41452689 = 140170.700616
sub: 8855.43088215 - 140170.700616 = -131315.269734
mul: -131315.269734 * 51.6965219541 = -6788542.72473
add: 2708.41452689 + -6788542.72473 = -6785834.3102
div: 51.6965219541 / -6785834.3102 = -7.61830006318e-06
add: 0 + 3.25991727261 = 3.25991727261
sub: -7.61830006318e-06 - 3.25991727261 = -3.25992489091
sub: 2708.41452689 - 27.491618858 = 2680.92290804
div: 51.6965219541 / 26.2235845341 = 1.97137511414
sub: 1.97137511414 - 1 = 0.971375114142
div: 77.1000519162 / 51.6965219541 = 1.49139727397
sub: 4 - 54.6298636339 = -50.6298636339
div: 1.49139727397 / -50.6298636339 = -0.029456869265
add: 31.1561128065 + 51.6965219541 = 82.8526347606
add: -0.029456869265 + 82.8526347606 = 82.8231778914
div: 0.971375114142 / 82.8231778914 = 0.0117283004453
mul: 2680.92290804 * 0.0117283004453 = 31.4426693361
div: 31.4426693361 / 31.1561128065 = 1.00919744165
sub: -3.25992489091 - 1.00919744165 = -4.26912233256
add: 4 + 77.1000519162 = 81.1000519162
add: -4.26912233256 + 81.1000519162 = 76.8309295836
add: 26.2235845341 + 26.2235845341 = 52.4471690682
add: 51.6965219541 + 51.6965219541 = 103.393043908
mul: 2708.41452689 * -4 = -10833.6581076
mul: 27.491618858 * 27.491618858 = 755.789107434
div: 77.1000519162 / 31.1561128065 = 2.47463643475
div: 755.789107434 / 2.47463643475 = 305.414200171
div: 54.6298636339 / 77.1000519162 = 0.708558065477
add: 2708.41452689 + 0.708558065477 = 2709.12308496
sub: 2709.12308496 - 2708.41452689 = 0.708558065477
add: 305.414200171 + 0.708558065477 = 306.122758237
sub: 4 - 77.1000519162 = -73.1000519162
add: 306.122758237 + -73.1000519162 = 233.02270632
add: -10833.6581076 + 233.02270632 = -10600.6354013
sub: -10600.6354013 - 26.2235845341 = -10626.8589858
mul: 27.491618858 * 51.6965219541 = 1421.22107785
mul: 3.25991727261 * 54.6298636339 = 178.088836061
mul: 51.6965219541 * 178.088836061 = 9206.57342319
add: 1421.22107785 + 9206.57342319 = 10627.794501
add: -10626.8589858 + 10627.794501 = 0.935515251547
div: 51.6965219541 / 0.935515251547 = 55.2599456488
mul: 54.6298636339 * 55.2599456488 = 3018.84329521
sub: 103.393043908 - 3018.84329521 = -2915.4502513
add: 52.4471690682 + -2915.4502513 = -2863.00308224
add: 76.8309295836 + -2863.00308224 = -2786.17215265
-2786.17215265
230.629771452

edited Nov 13, 2015 at 13:08

answered Nov 13, 2015 at 3:32

Orest Hera

6,7962 gold badges26 silver badges38 bronze badges

Sign up to request clarification or add additional context in comments.

7 Comments

rll Over a year ago

Thanks @OrestHera, it did not occur to me to simply print what values the operators are geting in order to decompose it. ds is the same as data[17]. inputs['rv'] is getting changed somewhere when there is the update with the locals(), most probably because I have an rv variable set somewhere in the main module.

rll Over a year ago

Anyway, the problem is then that the function has two poles wrt rv. Can you focus your answer on how one should deal with this? (just forget about the notebook nonsense) right now I am rounding d[-3] to int to try to avoid the issue (it is an assumption I can make) but I still find some solutions with the same issue.

rll Over a year ago

For the sake of completeness, this function is basically an approximation of a method where we are using a taylor series for a complex derivation/integration of the cost function.

Orest Hera Over a year ago

@rll Why do you think that the function should give a result in range 225...240? The function has two major fractions where denominator is combined from input variables. So, the function can give any result basing on input. Is it possible that values of input arglabels are calculated incorrectly? BTW, you do not use the last two values in d[0] tuple: the 7 items are directly assigned to the variable and 8th and 9th are multiplied for rv. Do you consider print of inputs (actual values of arglabels) as a reasonable input?

rll Over a year ago

Well, it should give something in that range for this specific input set, not for any inputs. It is based on the optimum being approximated (225), and on the result I obtained by solving the very same fun by hand (app. 241), so maybe the result is probably something in 205...245. The values of the inputs are randomly generated based on variation from a specific case. The last two values are the targets, so they are not part of the inputs (they are just there because d is coming from a larger dataset). I did not get the last question though...

|

casevh · Accepted Answer · 2015-10-28 03:51:00Z

1

When I try your example, I get an answer of:

>>> print eval(fun, inputs)
-2786.17215265

If I use gmpy2 and set the precision to 200 bits and the exponent range to ~1E9, I get an answer of:

>>> print eval(fun,inputs)
-2786.1721526580839894614784542009831125135156833413835128962432

It looks like the function is returning a stable result. So there is probably something wrong with the function.

I'd follow @Prune's advice and split the complex function into smaller steps.

answered Oct 28, 2015 at 3:51

casevh

11.5k1 gold badge29 silver badges37 bronze badges

Collectives™ on Stack Overflow

Python float silent overflow precision error

2 Answers 2

7 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

7 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related