138

Even trivially small Haskell programs turn into gigantic executables.

I've written a small program that, when compiled with GHC, produced a binary exceeding 7 MB!

What causes even a small Haskell program to compile to such a huge binary?

What, if anything, can I do to reduce this?

  • Have you tried just stripping it? Commented May 24, 2011 at 19:11
  • Run the program strip on the binary to remove the symbol table. Commented May 24, 2011 at 19:20
  • @tm1rbt: Run strip test. This command removes some debug information from the program and makes it smaller. Commented May 24, 2011 at 19:20
  • As an aside your data types in the 3D math library should be stricter for performance reasons: data M3 = M3 !V3 !V3 !V3 and data V3 = V3 !Float !Float !Float. Compile with ghc -O2 -funbox-strict-fields. Commented May 24, 2011 at 19:24
  • This post is discussed on meta. Commented Sep 7, 2014 at 21:32

2 Answers

227

Let's see what's going on. Try:

  $ du -hs A
  13M   A

  $ file A
  A: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), 
     dynamically linked (uses shared libs), for GNU/Linux 2.6.27, not stripped

  $ ldd A
    linux-vdso.so.1 =>  (0x00007fff1b9ff000)
    libXrandr.so.2 => /usr/lib/libXrandr.so.2 (0x00007fb21f418000)
    libX11.so.6 => /usr/lib/libX11.so.6 (0x00007fb21f0d9000)
    libGLU.so.1 => /usr/lib/libGLU.so.1 (0x00007fb21ee6d000)
    libGL.so.1 => /usr/lib/libGL.so.1 (0x00007fb21ebf4000)
    libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007fb21e988000)
    libm.so.6 => /lib/libm.so.6 (0x00007fb21e706000)
    ...      

You see from the ldd output that GHC has produced a dynamically linked executable, but only the C libraries are dynamically linked! All the Haskell libraries are copied in verbatim.
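To check the same thing on your own binary, here is a minimal sketch that counts how many entries in ldd output are Haskell shared libraries. It assumes GHC's shared libraries follow the libHS naming prefix; count_hs_libs is a hypothetical helper name, not part of GHC or binutils.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints the number of
# entries whose library name starts with "libHS" (GHC's shared-library prefix).
count_hs_libs() {
    # grep -c exits non-zero when the count is 0, so mask that exit status.
    grep -c '^[[:space:]]*libHS' || true
}

# Usage: ldd ./A | count_hs_libs
```

A result of 0 means no Haskell library is resolved at run time, so they must all have been linked in statically.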

Aside: since this is a graphics-intensive app, I'd definitely compile with ghc -O2.

There are two things you can do.

Stripping symbols

An easy solution: strip the binary:

$ strip A
$ du -hs A
5.8M    A

Strip discards symbols from the object file. They are generally only needed for debugging.

Dynamically linked Haskell libraries

More recently, GHC has gained support for dynamic linking of both C and Haskell libraries. Most distros now distribute a version of GHC built to support dynamic linking of Haskell libraries. Shared Haskell libraries may be shared amongst many Haskell programs, without copying them into the executable each time.

At the time of writing, Linux and Windows are supported.

To allow the Haskell libraries to be dynamically linked, you need to compile your program with -dynamic, like so:

 $ ghc -O2 --make -dynamic A.hs

Also, any libraries you want to be shared should be built with --enable-shared:

 $ cabal install opengl --enable-shared --reinstall     
 $ cabal install glfw   --enable-shared --reinstall

And you'll end up with a much smaller executable that has both its C and Haskell dependencies resolved dynamically.

$ ghc -O2 -dynamic A.hs                         
[1 of 4] Compiling S3DM.V3          ( S3DM/V3.hs, S3DM/V3.o )
[2 of 4] Compiling S3DM.M3          ( S3DM/M3.hs, S3DM/M3.o )
[3 of 4] Compiling S3DM.X4          ( S3DM/X4.hs, S3DM/X4.o )
[4 of 4] Compiling Main             ( A.hs, A.o )
Linking A...

And, voilà!

$ du -hs A
124K    A

which you can strip to make even smaller:

$ strip A
$ du -hs A
84K A

An eensy weensy executable, built up from many dynamically linked C and Haskell pieces:

$ ldd A
    libHSOpenGL-2.4.0.1-ghc7.0.3.so => ...
    libHSTensor-1.0.0.1-ghc7.0.3.so => ...
    libHSStateVar-1.0.0.0-ghc7.0.3.so =>...
    libHSObjectName-1.0.0.0-ghc7.0.3.so => ...
    libHSGLURaw-1.1.0.0-ghc7.0.3.so => ...
    libHSOpenGLRaw-1.1.0.1-ghc7.0.3.so => ...
    libHSbase-4.3.1.0-ghc7.0.3.so => ...
    libHSinteger-gmp-0.2.0.3-ghc7.0.3.so => ...
    libHSghc-prim-0.2.0.0-ghc7.0.3.so => ...
    libHSrts-ghc7.0.3.so => ...
    libm.so.6 => /lib/libm.so.6 (0x00007ffa4ffd6000)
    librt.so.1 => /lib/librt.so.1 (0x00007ffa4fdce000)
    libdl.so.2 => /lib/libdl.so.2 (0x00007ffa4fbca000)
    libHSffi-ghc7.0.3.so => ...
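Because the executable now depends on these shared objects at run time, they have to be present on any machine that runs it. As a rough sketch, the following filters ldd output for entries the dynamic loader could not resolve; missing_libs is a hypothetical helper name, and it assumes ldd reports such entries as "=> not found".

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints the names of
# libraries that the dynamic loader could not resolve.
missing_libs() {
    # Unresolved entries look like "libfoo.so => not found".
    awk '/=> not found/ { print $1 }'
}

# Usage: ldd ./A | missing_libs
```

Empty output means every listed dependency was found on that machine.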

One final point: even on systems with static linking only, you can use -split-objs to get one .o file per top-level function, which can further reduce the size of statically linked libraries. It requires GHC itself to have been built with -split-objs on, which some systems forget to do.


10 Comments

When is dynamic linking due to arrive for GHC on the Mac?
...doesn't cabal install strip the installed binary by default?
Doing so on Windows seems to make the resulting file un-runnable; it complains about a missing libHSrts-ghc7.0.3.dll.
Will this binary work on other Linux machines after these procedures?
Hi OP from 2011! I'm from the future and can tell you that the pandoc executable on Ubuntu 16.04 is 50 MB, and that's not going to change, based on packages.ubuntu.com/zesty/pandoc . Message to near-future self and others: contact the package maintainer and ask if enable-shared was considered. launchpad.net/ubuntu/+source/pandoc/+bugs
13

GHC links Haskell libraries statically by default. That is, the entire OpenGL bindings are copied into your program. As they are quite big, your program gets unnecessarily inflated. You can work around this by using dynamic linking, although it isn't enabled by default.

2 Comments

You can dynamically link libraries to work around this. Not sure why it matters what is default, the flag is simple enough.
The problem is that "any libraries you want to be shared should be built with --enable-shared", so if your Haskell Platform comes with libraries built without --enable-shared, you have to recompile the base libraries, which can be quite painful.
