I'm stumbling over a weird effect when initializing a Python class. Not sure if I'm overlooking something obvious or not.
First things first, I'm aware that apparently lists passed to classes are passed by reference while integers are passed by value as shown in this example:
class Test:
    def __init__(self,x,y):
        self.X = x
        self.Y = y
        self.X += 1
        self.Y.append(1)

x = 0
y = []
Test(x,y)
Test(x,y)
Test(x,y)
print x, y
Yielding the result:
0 [1, 1, 1]
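One way to read this result without the by-value/by-reference distinction: both arguments arrive as references to objects, and the difference lies in what is done to them afterwards. The sketch below is a variation on the class above, written in Python 3 syntax (an assumption; the original uses Python 2), with a hypothetical same_int_before attribute added only for illustration.

```python
class Test(object):
    def __init__(self, x, y):
        self.X = x
        self.Y = y
        # Right after assignment, self.X and x name the very same int object.
        self.same_int_before = self.X is x
        self.X += 1        # ints are immutable: this rebinds self.X to a new int
        self.Y.append(1)   # lists are mutable: this changes the shared list in place


x = 0
y = []
t = Test(x, y)

print(x, y)        # 0 [1]
print(t.Y is y)    # True: the instance and the caller share one list
```

The caller's x is untouched because only the attribute self.X was pointed at a new object; the caller's y changes because no new list was ever created.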
So far so good. Now look at this example:
class DataSheet:
    MISSINGKEYS = {u'Item': ["Missing"]}

    def __init__(self,stuff,dataSheet):
        self.dataSheet = dataSheet
        if self.dataSheet.has_key(u'Item'):
            self.dataSheet[u'Item'].append(stuff[u'Item'])
        else:
            self.dataSheet[u'Item'] = self.MISSINGKEYS[u'Item']
Calling it like this
stuff = {u'Item':['Test']}
ds = {}
DataSheet(stuff,ds)
print ds
DataSheet(stuff,ds)
print ds
DataSheet(stuff,ds)
print ds
yields:
{u'Item': ['Missing']}
{u'Item': ['Missing', ['Test']]}
{u'Item': ['Missing', ['Test'], ['Test']]}
Now let's print MISSINGKEYS instead:
stuff = {u'Item':['Test']}
ds = {}
DataSheet(stuff,ds)
print DataSheet.MISSINGKEYS
DataSheet(stuff,ds)
print DataSheet.MISSINGKEYS
DataSheet(stuff,ds)
print DataSheet.MISSINGKEYS
This yields:
{u'Item': ['Missing']}
{u'Item': ['Missing', ['Test']]}
{u'Item': ['Missing', ['Test'], ['Test']]}
The exact same output. Why?
MISSINGKEYS is a class variable but at no point is it deliberately altered.
In the first call the class goes into this line:
self.dataSheet[u'Item'] = self.MISSINGKEYS[u'Item']
Which apparently starts it all. Obviously I only want self.dataSheet[u'Item'] to take the value of self.MISSINGKEYS[u'Item'], not to become a reference to it or something like that.
In the following two calls the line
self.dataSheet[u'Item'].append(stuff[u'Item'])
is called instead and the append works on self.dataSheet[u'Item'] AND on self.MISSINGKEYS[u'Item'] which it should not.
This leads to the assumption that after the first call both variables now reference the same object.
However, although the two dictionaries are equal, they are not the same object:
ds == DataSheet.MISSINGKEYS
Out[170]: True
ds is DataSheet.MISSINGKEYS
Out[171]: False
Can someone explain to me what is going on here and how I can avoid it?
EDIT: I tried this:
ds[u'Item'] is DataSheet.MISSINGKEYS[u'Item']
Out[172]: True
So okay, this one entry in both dictionaries references the same object. How can I just assign the value instead?
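The two checks above can be reproduced in isolation. This is a minimal sketch in Python 3 syntax; missing_keys is a hypothetical stand-in for DataSheet.MISSINGKEYS, and same_dicts and same_lists are names introduced only for illustration.

```python
# Hypothetical stand-ins for the class attribute and the caller's dict.
missing_keys = {'Item': ['Missing']}
ds = {}

# Plain assignment binds the existing list object; nothing is copied.
ds['Item'] = missing_keys['Item']

same_dicts = ds is missing_keys                   # False: two distinct dict objects
same_lists = ds['Item'] is missing_keys['Item']   # True: one shared list object

# Appending through one dict is therefore visible through the other.
ds['Item'].append(['Test'])
print(same_dicts, same_lists, missing_keys)
```

The dictionaries compare equal because they hold the same key and the same list, even though the dictionaries themselves are separate objects.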
The line
self.dataSheet[u'Item'] = self.MISSINGKEYS[u'Item']
creates a reference to the same list, so when you change it through one name you change it everywhere. You would need to create a copy:
self.dataSheet[u'Item'] = list(self.MISSINGKEYS[u'Item'])
Compare this with integers:
i = 12; b = i; i += 12
b will still be 12, because ints are immutable and i += 12 rebinds i to a new object. With a mutable structure the change is done in place, so no new object is created. Basically, if you want to use a mutable value/object and don't just want a reference, you need to copy or maybe deepcopy, depending on the object. Nested mutables are the case to watch:
t = (1,[2]); t2 = t; t[1].append(2)
Here t2[1] changes as well, because the inner list is shared. That is where deepcopy comes in.
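Putting the suggestion together, here is a sketch of the class with the copy applied. It assumes Python 3 syntax, so has_key is swapped for the in operator and print is a function; apart from that it follows the question's code. The tuple at the end is the deepcopy case, with t3 as a hypothetical extra name.

```python
import copy


class DataSheet(object):
    MISSINGKEYS = {u'Item': [u'Missing']}

    def __init__(self, stuff, dataSheet):
        self.dataSheet = dataSheet
        if u'Item' in self.dataSheet:
            self.dataSheet[u'Item'].append(stuff[u'Item'])
        else:
            # list(...) builds a new list, so the class attribute is not shared.
            self.dataSheet[u'Item'] = list(self.MISSINGKEYS[u'Item'])


stuff = {u'Item': [u'Test']}
ds = {}
for _ in range(3):
    DataSheet(stuff, ds)

print(ds)                     # the caller's dict grows with every call
print(DataSheet.MISSINGKEYS)  # the class attribute keeps its original content

# Nested mutables: plain assignment shares the inner list, deepcopy does not.
t = (1, [2])
t2 = t
t3 = copy.deepcopy(t)
t[1].append(2)
print(t2, t3)                 # (1, [2, 2]) (1, [2])
```

list(...) is enough here because the default list holds only strings. If the defaults contained nested lists or dicts, copy.deepcopy would be the safer choice.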