1

If I have a Python list, say ['aaa', 'bbb'], is it stored in 2 x 8 bytes (with 64-bit addressing), meaning the list holds only pointers to the strings? Or is it stored in [len('aaa') + len('bbb')] * size_of_char bytes, meaning the characters of each string are stored contiguously in the list?

4
  • Use id(lst) to access the address. Yes, it is, by value. Commented Jul 13, 2018 at 3:41
  • stackoverflow.com/questions/1090104/… Commented Jul 13, 2018 at 3:43
  • Thanks @Marcus.Aurelianus for the link! As Mad Physicist mentioned, I think we have only pointers to string objects in the list. Anyway, the id function that you mentioned helped me learn things I didn't know about Python. I mentioned one of them in my comment on Mad Physicist's answer. Commented Jul 14, 2018 at 5:05
  • No problem, I am also a new learner, just trying to help. Commented Jul 14, 2018 at 11:12

2 Answers

3

One way to get the address of an object in CPython is to use id().

>>> a = ['aaa', 'bbb']
>>> id(a)
62954056
>>> id(a[0])
62748912
>>> id(a[1])
61749544

Further reading is here [understanding-python-variables and memory management].
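The ids above suggest that the list and its strings are separate objects. As a minimal sketch of the same point (the names s1, s2, a and b are only illustrative), the check below shows that a list element is the very same object as the string that was put into it, so the list holds references and does not copy characters:

```python
s1 = 'aaa'
s2 = 'bbb'
a = [s1, s2]

# Each list slot refers to the original string object, not to a copy.
same_first = a[0] is s1
same_second = a[1] is s2

# A second list built from the same strings shares those objects too.
b = [s1, s2]
shared = all(x is y for x, y in zip(a, b))

print(same_first, same_second, shared)  # True True True
```

Note that id() returning a memory address is specific to CPython; the language only guarantees that it is unique for the lifetime of the object.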


2 Comments

Technically an implementation detail, but nice reasoning
@Mad Physicist, thanks, sir.
2

Under the hood in CPython, every object is referenced through a pointer to a PyObject. The subtype PyListObject has, among its structure fields, a pointer to an array of pointers to PyObjects.

Strings are also a subtype of PyObject, generally implemented in PyUnicodeObject. Similarly to a list, a string refers to a buffer containing its elements (in CPython 3.3 and later, that buffer usually sits directly after the object header in the same allocation).

So the sequence of pointers actually looks like this:

  1. Pointer to list object
  2. Pointer to list buffer
  3. Pointer to string object
  4. Pointer to string data
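One rough way to observe this layout from Python, assuming CPython, is sys.getsizeof, which reports an object's own footprint and does not follow the pointers it holds. In the sketch below (make_pair is a hypothetical helper, used only so both lists are built the same way), a list of two long strings takes exactly as much memory as a list of two short ones:

```python
import struct
import sys


def make_pair(x, y):
    """Build a two-element list; both test lists go through this same path."""
    return [x, y]


short_list = make_pair('a', 'b')
long_list = make_pair('a' * 1000, 'b' * 1000)

# The list's own size does not depend on how long its strings are,
# because the list buffer only stores one pointer per element.
print(sys.getsizeof(short_list), sys.getsizeof(long_list))

# The string objects themselves are where the character data lives.
print(sys.getsizeof('a'), sys.getsizeof('a' * 1000))

# Size of one pointer on this platform (8 on a typical 64-bit build).
pointer_size = struct.calcsize('P')
print(pointer_size)
```

The exact byte counts vary between CPython versions and platforms, so only the comparisons are meaningful here.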

You can deduce that your list buffer can't have [len('aaa') + len('bbb')] * size_of_char elements for a number of reasons:

  1. Everything in Python is an object, so at the very least you need to have space for the additional metadata.
  2. Lists can hold any kind of object, not just fixed-length strings. How would you index into a list whose elements have different sizes?
  3. Characters can have different sizes in Unicode. The number of bytes in a string and the number of characters are not directly related. This brings us back to both #1 and #2.
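Points 2 and 3 can be illustrated with a short sketch. UTF-8 is used here only as a convenient example encoding (CPython's internal string representation is different), and the sample strings are arbitrary:

```python
# Three strings, each three characters long.
samples = ['aaa', '\u00e9\u00e9\u00e9', '\u65e5\u672c\u8a9e']

char_counts = [len(s) for s in samples]
byte_counts = [len(s.encode('utf-8')) for s in samples]

print(char_counts)  # [3, 3, 3]
print(byte_counts)  # [3, 6, 9]

# A list can mix element types of very different sizes, yet indexing
# stays simple because every slot is a same-sized reference.
mixed = [1, 'aa', 3.5, samples]
print(type(mixed[1]).__name__, type(mixed[3]).__name__)  # str list
```

Since the same number of characters can need a different number of bytes, a buffer of packed characters could not be indexed by position without extra bookkeeping.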

In general, if you are curious about the internal workings of CPython, look into the C API docs and the source code.

2 Comments

Thanks @Mad Physicist, I've just understood why in l = [1, 2, 'aa'] the id (= address) of l[2] is not necessarily greater than that of l[1]. These are not the addresses of cells in the list buffer but the addresses of the elements that the buffer cells reference.
@AyoubOm. Your understanding seems to be correct. You should select an answer by clicking on the check mark next to it if you feel that your question has been answered. That will remove it from the unanswered queue and hand out points all around.
