Dubious 'Int' vs 'Integer' handling in Haskell?

Question

Just for kicks, I wanted to see what would happen if I defined a function in Haskell as Int -> Int, knowing that it would overflow and have to return an Integer. Consider the following:

factorial :: Int -> Int
factorial n = product [1..n]

Now I know that if I run factorial 50, I'm going to get a number outside of the 'codomain' of factorial. Since Haskell is so strongly typed, I was hoping it would return an error. Instead GHCi returns a weird negative Int:

ghci> factorial 50
-3258495067890909184

And note that

ghci> maxBound :: Int
9223372036854775808

since I'm running on 64 bit.

I find this behavior kind of scary: why doesn't Haskell raise an error? And why does factorial 50 return a random negative number? Any clarification would be greatly appreciated.

Checking if an int went out of range every time a calculation is done on it would be completely absurd and destroy performance. If you have a function that may go out of range use Integer. Factorial is a appropriate time to do so because it creates large values so quickly. — DiegoNolan
– DiegoNolan, Commented Jul 20, 2013 at 20:37
I obviously knew that it was going to go out of range beforehand (did you read the post?). The question is why did I get a dubious negative number instead of, say, the upperbound of Int. — ntdef
– ntdef, Commented Jul 20, 2013 at 20:47
why would you expect it to return an Integer? It's a valid Int -> Int -> Int and your answer, with product operating on Int, is Int. Your answer is a very legal Int, by all definitions of Int. — Justin L.
– Justin L., Commented Jul 20, 2013 at 21:15
It is, by the way, not a random negative number. It is precisely what one would expect (for 64-bit Ints on GHC), the remainder of 50! modulo 2^64 in the range -2^63 to 2^63-1. — Daniel Fischer
– Daniel Fischer, Commented Jul 20, 2013 at 21:19
@JustinL. I didn't necessarily expect it to return an Int. All I knew is that it would overflow. — ntdef
– ntdef, Commented Jul 20, 2013 at 21:24

amindfv · Accepted Answer · 2013-07-20 22:03:24Z

20

The Int type is defined as "A fixed-precision integer type with at least the range [-2^29 .. 2^29-1]." [0] The range varies by machine but you can find it with minBound and maxBound

The reason it overflows is that it's got a fixed amount of memory allocated for it. Imagine a simpler example - a positive integer in memory whose maximum value was 7 (stored as 3 bits in memory)

0 is represented as 000, in binary
1 is represented as 001
2 is represented as 010, etc.

Note how the bit math works: when you add 1, you either turn the smallest digit from 0 to 1, or you turn it from 1 to 0 and do the same operation on the next-most-significant digit.

So for example, 011 + 1 is 100.

Now if you naievely (as Haskell does) perform this when there's no next-most-significant digit, then it just increments as usual, but "gets its head cut off." E.g. 111 + 1 becomes 000 instead of 1000. This is what's happening with Haskell, except that ~~its lowest value (represented as a series of 0s) is its lowest negative number.~~ it uses its "leftmost" bit to represent +/-. You'll need to do (maxBound :: Int) + (maxBound :: Int) + 2 to get 0.

So similarly:

>  maxBound :: Int  
9223372036854775807  
>  (maxBound :: Int) + 1  
-9223372036854775808  
>  (maxBound :: Int) + 2  
-9223372036854775807  
>  let x = (maxBound :: Int) + 1 in x + x
0

Why "allow" this to happen? Simple - efficiency. It's much faster to not check for if there'll be an integer overflow. This is the reason that Integer exists - it's unbounded, for when you think you might overflow, but you pay a price in efficiency.

[0] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Int.html

edited Jul 20, 2013 at 22:03

answered Jul 20, 2013 at 20:53

amindfv

8,4486 gold badges38 silver badges58 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Justin L. Over a year ago

A minor point -- the system you are describing for signed ints (offset binary) is actually very uncommon in most CPU architectures. The lowest negative number is actually usually 100... on most modern architectures. See en.wikipedia.org/wiki/Signed_number_representations and en.wikipedia.org/wiki/Two%27s_complement

Justin L. Over a year ago

The neat thing about two's complement is that it allows this "ring" structure to still exist. Take the highest positive number, 0111, and add one, and you get 1000, the lowest negative number. From then on you're counting up; at 1111, you're at -1, and add one more and you get 0000 (zero).

Ingo Over a year ago

@JustinL. Absolutly, I never heard that the lowest number is all null bits. This is probably a misconception of the poster. To check shiftR (maxBound :: Int) 63 should give 0, not minBound+1.

Collectives™ on Stack Overflow

Dubious 'Int' vs 'Integer' handling in Haskell?

1 Answer 1

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related