If I have a Python list, say ['aaa', 'bbb'], is it stored in 2 × 8 bytes (for 64-bit addressing), that is, the list holds only pointers to the strings? Or is it stored in [len('aaa') + len('bbb')] * size_of_char bytes, that is, the characters of each string are stored contiguously in the list?
- Use id(lst) to access the address. Yes, it is, by value. – Marcus.Aurelianus, Jul 13, 2018 at 3:41
- stackoverflow.com/questions/1090104/… – Marcus.Aurelianus, Jul 13, 2018 at 3:43
- Thanks @Marcus.Aurelianus for the link! As Mad Physicist mentioned, I think we have only pointers to string objects in the list. Anyway, the id function you mentioned helped me learn things I didn't know about Python. I mentioned one of them in my comment on Mad Physicist's answer. – Ayoub Omari, Jul 14, 2018 at 5:05
- No problem, I am also a new learner, just trying to help. – Marcus.Aurelianus, Jul 14, 2018 at 11:12
2 Answers
One way to get at an object's address is id(): in CPython it returns the memory address of the object.
>>> a=['aaa', 'bbb']
>>> id(a)
62954056
>>> id(a[0])
62748912
>>> id(a[1])
61749544
Further reading is here [understanding-python-variables and memory management].
2 Comments
Under the hood in CPython, every object is accessed through a pointer to a PyObject. The subtype PyListObject has, among its structure fields, a pointer to an array of pointers to PyObjects.
Strings are also a subtype of PyObject, generally implemented in PyUnicodeObject. Similarly to a list, a string object refers to the buffer containing its elements (in recent CPython versions the character data of most strings sits directly after the object header rather than in a separate allocation).
So the sequence of pointers actually looks like this:
- Pointer to list object
- Pointer to list buffer
- Pointer to string object
- Pointer to string data
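A minimal sketch of what that chain implies, observable from pure Python: copying a list creates a new list object with its own buffer, but the buffer cells still refer to the same string objects. The variable names are only illustrative.

```python
a = ['aaa', 'bbb']
b = list(a)  # shallow copy: a new list object with its own buffer of references

# The two lists are distinct objects...
lists_are_distinct = a is not b

# ...but each cell in both buffers refers to the very same string object.
shares_strings = all(x is y for x, y in zip(a, b))

print(lists_are_distinct, shares_strings)
```

If the characters were stored inline in the list, the copy would have had to duplicate them, and the identity check could not succeed.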
You can deduce that your list buffer can't have [len('aaa') + len('bbb')] * size_of_char elements for a number of reasons:
- Everything in Python is an object, so at the very least you need to have space for the additional metadata.
- Lists can hold any kind of object, not just fixed length strings. How do you index into a list where elements have different sizes?
- Characters can have different sizes in Unicode. The number of bytes in a string and the number of characters are not directly related. This brings us back to the first two points.
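These points can be checked with a rough sketch, assuming CPython. The helper `pair` is hypothetical and only ensures both lists are built the same way; UTF-8 is used purely to illustrate that byte counts and character counts differ, and is not a claim about CPython's internal string layout.

```python
import sys

def pair(x, y):
    # Hypothetical helper: build a two-element list the same way each time.
    return [x, y]

short = pair('aaa', 'bbb')
long_ = pair('a' * 1000, 'b' * 1000)

# sys.getsizeof reports the list object and its buffer of references,
# not the objects the references point to.
size_short = sys.getsizeof(short)
size_long = sys.getsizeof(long_)
print(size_short == size_long)

# A list can mix element types, so cells cannot be sized per element.
mixed = pair('aaa', 12345)

# Character count and encoded byte count are not the same thing.
text = 'na\u00efve'                 # 5 characters, one of them non-ASCII
n_chars = len(text)
n_bytes = len(text.encode('utf-8'))
print(n_chars, n_bytes)
```

The list size depends only on how many references it holds, while the strings themselves account for their own storage separately.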
In general, if you are curious about the internal workings of CPython, look into the API docs and the source code.
2 Comments
For l = [1, 2, 'aa'], the id (= address) of l[2] is not necessarily greater than the address of l[1]. In fact, these are not the addresses of cells in the list buffer but the addresses of the elements that the buffer cells reference.
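A small sketch of the same observation: if id() returned the address of a buffer cell, two different cells could never report the same value, yet they do when both refer to one object. The names here are only illustrative.

```python
s = 'aa'
l = [s, s, 1]

# Two distinct cells of the list buffer, one referenced object.
same_target = id(l[0]) == id(l[1])

# The id is that of the string object itself, not of a slot in the list.
matches_original = id(l[0]) == id(s)

print(same_target, matches_original)
```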