
This question has been bothering me for two weeks; I've searched online and asked around but couldn't get an answer.

By default, Python builds the library libpythonMAJOR.MINOR.a and statically links it into the interpreter. It also has an --enable-shared configure flag, which builds a shared library libpythonMAJOR.MINOR.so.1.0 instead and dynamically links it to the interpreter.
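For context, this is roughly how the two variants are built from source (the prefixes below are just example paths on my machine):

# static (default) build
./configure --prefix=/home/tian/py3.9.13_static
make && make install

# shared build
./configure --prefix=/home/tian/py3.9.13_share --enable-shared
make && make install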

Based on my poor CS knowledge, the first thought that came to my mind when I saw "shared library" was: "the shared build must save a lot of memory compared to the static build!".

Then I had this assumption:

# shared build
34K Jun 29 11:32 python3.9
21M Jun 29 11:32 libpython3.9.so.1.0

10 shared python processes, mem usage = 0.034M * 10 + 21M ≈ 21M

# static build
22M Jun 27 23:45 python3.9

10 static python processes, mem usage = 10*22M = 220M

shared python wins!
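(The sizes above are simply what ls -lh reports; a sketch of how to check them, assuming the example prefixes from above:)

ls -lh /home/tian/py3.9.13_share/bin/python3.9 /home/tian/py3.9.13_share/lib/libpython3.9.so.1.0
ls -lh /home/tian/py3.9.13_static/bin/python3.9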

Later I ran a toy test on my machine and found that this assumption is wrong.

test.py

import time
i = 0
while i < 20:
    time.sleep(1)
    i += 1

print('done')

mem_test.sh

#!/bin/bash
for i in {1..1000}
do
    ./python3.9 test.py &
done

To run the shared-build python I set export LD_LIBRARY_PATH=/home/tian/py3.9.13_share/lib.
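To double-check which build is actually being exercised, ldd can be run on the two executables (paths are the example prefixes from above; output is illustrative):

ldd /home/tian/py3.9.13_share/bin/python3.9 | grep libpython
#   libpython3.9.so.1.0 => /home/tian/py3.9.13_share/lib/libpython3.9.so.1.0 (0x...)

ldd /home/tian/py3.9.13_static/bin/python3.9 | grep libpython || echo "no libpython dependency (statically linked)"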

I ran mem_test.sh separately (one build at a time) with the two pythons and simply monitored the total memory usage via htop in another console. It turned out that both ate almost the same amount of memory.
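A more careful comparison than eyeballing htop (which double-counts pages shared between processes) is to sum PSS, the "proportional set size", over all interpreter processes. A sketch for a reasonably recent Linux kernel (/proc/<pid>/smaps_rollup needs 4.14+):

# Rss double-counts shared pages; Pss splits each shared page evenly among the
# processes sharing it, so summing Pss over all python3.9 processes is a fairer total
total=0
for pid in $(pgrep -x python3.9); do
    pss=$(awk '/^Pss:/ {print $2}' /proc/$pid/smaps_rollup)
    total=$((total + pss))
done
echo "total Pss: ${total} kB"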

Later on people taught me that there's something called "demand paging":

Is an entire static program loaded into memory when launched?

How does an executable get loaded into RAM, does the whole file get loaded into RAM even when the whole file won't be needed, or does it get loaded in "chunks"?

so my previous calculation of the static python's memory usage is completely wrong.
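Demand paging is easy to see directly: the static binary is ~22 MB on disk, but an idle interpreter only has the pages it actually touched resident. A sketch (the numbers in the comment are illustrative, not measured output):

/home/tian/py3.9.13_static/bin/python3.9 -c 'import time; time.sleep(60)' &
ps -o rss,vsz,comm -p $!
#   RSS    VSZ COMMAND
#  9500  26000 python3.9      <- resident set is far below the 22M file size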

Now I am confused. Doesn't a shared-build python use less memory by loading a shared library at runtime?

Question:

What's the benefit of a shared-build python vs a static-build python? Or does the shared-build python indeed save some memory through the shared-library mechanism, but my test is too trivial to show it?

P.S.

Checking some official python Dockerfiles, e.g. this one, you will see they all set --enable-shared.

There's also a related issue on pyenv, https://github.com/pyenv/pyenv/issues/2294; it seems they haven't figured it out either.

  • Shared vs static is more about their disk storage needs. Dynamic linking uses shared libraries and, because they're shared, there only needs to be one instance on disk. Statically linked libraries are part of the executable and, if another application also needs that library, there would be more than one copy of it on disk. Commented Jul 11, 2022 at 18:27
  • Sharing saves disk space, as you aren't including the library code in multiple executable files. Operating systems don't tend to allow processes to share memory, even read-only memory. Commented Jul 11, 2022 at 18:27
  • I believe there is a growing trend towards static linking, as disk space is not nearly at the same premium as it was 25-30 years ago. Docker has a different set of priorities, attempting to keep images small to decrease network bandwidth usage. (Also keeping shared libraries in a deeper layer could speed up image rebuild time if all you need to rebuild is the executable, not its dependencies.) Commented Jul 11, 2022 at 18:30
  • @Ouroborus disk space? Such a simple and trivial reason? Per your comment, "Dynamic linking uses shared libraries and, because they're shared, there only needs to be one instance on disk", different python build versions have different libpythonMAJOR.MINOR.so.1.0 files, each used solely by its corresponding python executable. I don't see how that saves space. A system-wide python shared library might do what you say. Commented Jul 11, 2022 at 18:40
  • That's not the only .so that's linked. I'm not sure if you can choose dynamic vs static linking on a per-object basis, I've always seen the choice made as an all or nothing kind of thing. On Windows, these are usually DLLs (pythonMAJORMINOR.dll, I think, in this case) and there's at least two executables that use them: python.exe and pythonw.exe. Commented Jul 11, 2022 at 19:17

2 Answers


It turns out that others are talking about the scenario of "Embedding Python in Another Application" (https://docs.python.org/3/extending/embedding.html).

If that's the case, then "saving disk space" and the other reasons mentioned make sense, because when embedding python in another application you need to either statically link libpythonMAJOR.MINOR.a or dynamically link libpythonMAJOR.MINOR.so.1.0.
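For example, assuming a hypothetical embedding program main.c that just calls Py_Initialize() and Py_RunSimpleString(), it could be compiled and linked like this (using the python3.9-config installed under the prefix whose libpython you want):

gcc main.c $(python3.9-config --cflags) $(python3.9-config --embed --ldflags) -o embed_demo

With a shared build, -lpython3.9 resolves to libpython3.9.so.1.0 and embed_demo stays small but needs the .so at run time; with a static build it resolves to libpython3.9.a and the whole interpreter is copied into the executable.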

So my current conclusion is that whether python is built shared or static only matters for the "Embedding Python in Another Application" scenario. For normal use cases, e.g. running the python interpreter, it doesn't make much difference.

Update:

Disk usage comparison, see the comments in the makefile:

https://stackoverflow.com/a/73099136/5983841




Apart from the disk usage purpose mentioned by @Ouroborus, I think there's also a 'convenience of updating' benefit: suppose you have installed a version of python that turns out to have a critical security problem. Then all software using python might be exposed. To fix the problem, you need to update all the pythons on your computer. If it is a shared python, you only need to update a few files; but if a piece of software uses a statically built python, then you have to update that entire software to get the python update.

However, this benefit may also be regarded as a downside: updating python may introduce breaking changes, certain software may break, and during the update process we are unable to detect these breakages.

