Different memory size of dict() and {} in Python

Question

There is this problem I cannot get my head around to solve. So in order to create a dictionary, we have two ways: (1) a function call dict() and (2) literal syntax {}.

When I check the memory size of creation of two empty dictionaries, sys.getsizeof() return different memory size:

import sys

my_dict1 = {}
sys.getsizeof(my_dict1)
# 64

For syntax {}, by running this example you can see that the basic occupation of a Python dictionary is 64 bytes (8 buckets, each is 8 bytes, so 8 x 8 = 64). That makes sense to me. Since the {} syntax is handled in byte-code it can use this optimization mentioned here.

and

import sys

my_dict2 = dict()
sys.getsizeof(my_dict2)
# 232

But for dict(), it is 232 bytes. I think this huge difference is due to the different implementations. dict() is handled like a regular class constructor and Python uses the generic memory allocator, which does not follow an easily predictable pattern like the free list (again, this answer is great).

However, what I do not understand starts here, even if I start my empty dictionary using {} and begin to add elements to this empty dictionary, memory size jumps to 232 from 64 in the first iteration.

import sys

my_dict = {}
for i in range(20):
    my_dict[i] = 100
    print(f"elements = {i+1} size = {sys.getsizeof(my_dict)}")
    
# elements = 1 size = 232
# elements = 2 size = 232
# elements = 3 size = 232
# elements = 4 size = 232
# elements = 5 size = 232
# elements = 6 size = 360
# elements = 7 size = 360
# elements = 8 size = 360
# elements = 9 size = 360
# elements = 10 size = 360
# elements = 11 size = 640
# elements = 12 size = 640
# elements = 13 size = 640
# elements = 14 size = 640
# elements = 15 size = 640
# elements = 16 size = 640
# elements = 17 size = 640
# elements = 18 size = 640
# elements = 19 size = 640
# elements = 20 size = 640

In order to make Python hash table fast and reduce collisions, the interpreter keeps resizing the dictionary when it becomes full for two-third. That I know.

What I do not understand, why in the VERY FIRST insertion, the memory size goes from 64 to 232, even though empty dictionary started by {} is 64 bytes. Because when I create an empty dictionary using {}, I use special opcodes for the containers, instead of performing function calls.

EDIT: Python version I am using is 3.8.12.

I can't reproduce this in my Python 3.10.6 REPL. (Sorry for the formatting; this is a comment so I can't do much better.) >>> sys.getsizeof({}): 64; >>> sys.getsizeof(dict()): 64 — AKX
– AKX, Commented Sep 7, 2022 at 7:48
3.9.13 also shows this behavior, actually. So there we go, it's an implementation detail that has changed between 3.9 and 3.10... — AKX
– AKX, Commented Sep 7, 2022 at 7:52
Somebody call Raymond Hettinger for explanation. That's really an implementation detail. — S.B
– S.B, Commented Sep 7, 2022 at 7:53
As far as I'm concerned, the answer is because implementation details and how much does it really matter to userland code…? How deep do you want to dig here…? — deceze
– deceze ♦, Commented Sep 7, 2022 at 7:56
As a stab into the dark, I'd guess the parser might allocate some minimal struct at parse time when it encounters {} in the source, whereas dict() is a runtime call which initialises the object differently, going through a more complete initialisation phase. Something like that. Maybe it's a purposeful optimisation, maybe it just happens to fall out of the implementation details. — deceze
– deceze ♦, Commented Sep 7, 2022 at 8:31

user2357112 · Accepted Answer · 2022-09-07 08:40:21Z

4

On Python 3.8.12, the version you're on, dict() allocates a new "keys object" (the part of the dict that stores the hash table, the keys, and in non-"split" dicts, the values), while {} uses a common "keys object" shared by many empty dicts. The shared keys object isn't counted towards the size of the {} dict.

Note that as soon as you add entries to the {} dict, it will need to allocate a non-shared keys object for its own use, so using {} won't actually save any memory in most cases.

You can see the code for dict() by looking at dict_new and dict_init, the C functions corresponding to dict.__new__ and dict.__init__.

You can see the code for {} in the bytecode evaluation loop, under the BUILD_MAP opcode, and follow that down to _PyDict_NewPresized.

You can see the code that sys.getsizeof delegates to for a dict in dict_sizeof, the C implementation of dict.__sizeof__. The code that ignores the size of the shared keys object for {} is if (mp->ma_keys->dk_refcnt == 1), in _PyDict_SizeOf.

edited Sep 7, 2022 at 8:40

answered Sep 7, 2022 at 8:34

user2357112

286k32 gold badges490 silver badges571 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

ARAT Over a year ago

Okay, now I see why {} and dict() giving different memory size. I am sorry but I still cannot see why initializing an empty object with {} and adding elements in a loop, when I insert the first element, it jumps from 64 to 232. Should not it still be 64 bytes until 5th element added?

AKX Over a year ago

For reference, d = {} calls BUILD_MAP with oparg = 0.

user2357112 Over a year ago

@ARAT: The dict can't share a keys object with other empty dicts if it's not empty any more.

user2357112 Over a year ago

(Well, technically, there are cases where empty and non-empty dicts can share keys objects, but those would be a few very restricted cases with instance __dict__s. The empty keys object used for {} can't be shared by nonempty dicts.)

Collectives™ on Stack Overflow

Different memory size of dict() and {} in Python

1 Answer 1

4 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

4 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related