
This is an attempt to better understand how reference count works in Python.

Let's create a class and instantiate it. The instance's reference count is 1 (getrefcount displays 2 because the argument passed to getrefcount is itself a temporary reference to the instance, which raises the count by 1):

>>> from sys import getrefcount as grc
>>> class A():
    def __init__(self):
        self.x = 100000


>>> a = A()
>>> grc(a)
2
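The extra reference can be factored out by comparing counts rather than reading them as absolute values. The following is a minimal sketch, assuming CPython; the names obj, alias, base, with_alias and after_del are illustrative only. Absolute values returned by sys.getrefcount can differ between interpreter versions, so only the differences are meaningful here.

```python
import sys

obj = object()

# Baseline: includes whatever temporary references getrefcount itself adds.
base = sys.getrefcount(obj)

# Binding a second name to the same object adds exactly one reference.
alias = obj
with_alias = sys.getrefcount(obj)

# Removing that name takes the reference away again.
del alias
after_del = sys.getrefcount(obj)

print(with_alias - base)  # difference caused by the extra name
print(after_del - base)   # back to the baseline
```

Working with differences like this sidesteps the question of how many temporary references a particular interpreter version adds during the call.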

The instance attribute a.x has 2 references (getrefcount reports 3):

>>> grc(a.x)
3

I expected it to be referenced by a and by A's __init__ method. Then I decided to check.

So I created a temporary variable b in the __main__ namespace just to be able to access the value of x. This increased the reference count by 1, to 3 (as expected):

>>> b = a.x
>>> grc(a.x)
4

Then I deleted the class instance and the ref count decreased by 1:

>>> del a
>>> grc(b)
3

So now there are 2 references: one is by b and one is by A (as I expected).

By deleting A from the __main__ namespace, I expected the count to decrease by 1 again.

>>> del A
>>> grc(b)
3

But it doesn't happen. There is no class A or its instances that may reference 100000, but still it's referenced by something other than b in __main__ namespace.

So, my question is, what is 100000 referenced by apart from b?


BrenBarn suggested that I should use object() instead of a number which may be stored somewhere internally.

>>> class A():
    def __init__(self):
        self.x = object()


>>> a = A()
>>> b = a.x
>>> grc(a.x)
3
>>> del a
>>> grc(b)
2

After deleting the instance a, there was only one reference left (held by b), which is what I would expect.

The only thing left to understand is why the number 100000 behaves differently.

3
  • My guess: is it A.__dict__? Commented Jun 3, 2012 at 22:46
  • @JakobBowyer But when I delete A, then A.__dict__ should be garbage collected because it's not referenced by anything (as I understand). Commented Jun 3, 2012 at 22:47
  • 1
    See this answer: stackoverflow.com/questions/759740/… Commented Jun 3, 2012 at 22:47

2 Answers

3

a.x is the integer 100000. This constant is referenced by the code object corresponding to the __init__() method of A. Code objects always include references to all literal constants in the code:

>>> def f(): return 10000
>>> f.__code__.co_consts
(None, 10000)
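The same check can be applied to the class from the question. This is a small sketch, assuming CPython; consts and same_object are names introduced here for illustration. It looks up the constants of __init__'s code object and checks whether the instance attribute is the very same object as one of them, rather than an equal copy.

```python
class A:
    def __init__(self):
        self.x = 100000

# Constants embedded in the compiled body of __init__.
consts = A.__init__.__code__.co_consts

a = A()

# Identity check: is a.x the constant object held by the code object?
same_object = any(c is a.x for c in consts)

print(100000 in consts)
print(same_object)
```

If the identity check holds, the code object accounts for one of the references to 100000 for as long as the function (and therefore the class) is alive.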

The line

del A

only deletes the name A and decreases the reference count of A. In Python 3.x (but not in 2.x), classes often include some cyclic references, and hence are only garbage collected when you explicitly run the garbage collector. And indeed, using

import gc
gc.collect()

after del A does lead to the reduction of the reference count of b.
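One way to observe this without relying on reference counts is to watch the class through a weak reference. The sketch below assumes CPython 3; probe, alive_after_del and alive_after_collect are illustrative names. Automatic collection is switched off during the experiment so that only the explicit gc.collect() call can break the cycle.

```python
import gc
import weakref

class A:
    def __init__(self):
        self.x = 100000

gc.disable()  # keep the automatic collector from interfering

# A weak reference does not keep the class alive.
probe = weakref.ref(A)

del A  # removes the name only
alive_after_del = probe() is not None

gc.collect()  # breaks the reference cycles inside the class
alive_after_collect = probe() is not None

gc.enable()

print(alive_after_del)      # expected True on CPython 3
print(alive_after_collect)  # expected False on CPython 3
```

Note that in a script the literal 100000 may also stay referenced by the enclosing module's own code object, so getrefcount results can differ from those seen at the interactive prompt.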


11 Comments

Shouldn't deleting A lead to deleting __init__ as it's only referenced by A (as I understand)?
Oh. I see now. The reference count of A is 4 before deleting it from __main__, so deleting it only reduces the count by 1. What are the other objects that reference A?
@ovgolovin: I just noticed I cannot reproduce your results. For me, the class A actually does get deleted, and the reference count of b does decrease. The only other thing I can think of: Be careful with the _ of the interactive interpreter. When in doubt, better do experiments with reference counts in a script rather than in the interactive interpreter.
I use Python 3.2, if it matters.
@ovgolovin: Code objects also reference constants in Python 2.x. The difference is that in Python 3.x there appear to be back-references to the class object somewhere (probably in the MRO) that do not exist in 2.x, and those back-references prevent immediate garbage collection.

It's likely that this is an artifact of your using an integer as your test value. Python sometimes stores integer objects for later re-use, because they are immutable. When I run your code using self.x = object() instead (which will always create a brand-new object for x) I do get grc(b)==2 at the end.

4 Comments

You are right! Changing 100000 to object() altered the numbers returned to getrefcount. I'll update the question.
@ovgolovin: Of course it does, because the code object can only reference constants appearing in the code. This is completely unrelated to any integer object reuse, though. Integers are not randomly cached for later reuse. Python only holds a cache of small integers (usually from -5 to 256), which are created at interpreter start and used whenever necessary. All other integers are created on demand and never reused.
Downvoting: While the observation is accurate, the reason given in this answer is wrong.
The difference is that object() is only called when an instance of A is initialized, whereas 100000 is referenced by the code object of A's __init__. So the reference counts are different. See Sven Marnach's answer for a detailed explanation.
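The small-integer cache mentioned in the comments above can be observed directly. This is a minimal sketch, assuming CPython, where the cached range is an implementation detail and may differ elsewhere. The values are built with int() from strings at run time so that compile-time constant sharing does not affect the result; the variable names are illustrative.

```python
# Values inside the small-integer cache come back as the same object.
small_a = int("256")
small_b = int("256")

# Larger values are created on demand, one new object per call.
large_a = int("100000")
large_b = int("100000")

print(small_a is small_b)
print(large_a is large_b)
print(large_a == large_b)  # equal in value regardless of identity
```

This is why the cache does not explain the extra reference to 100000: that value lies outside the cached range, and the reference comes from the code object instead.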
