Why does the following assignment..
d = deque('abc')
a = d
d.clear()
print a
deque([])
print an empty deque? I expected a to keep its data despite clearing the old deque.
a and d reference the same object, so if you clear it through either name, it is cleared for "both variables".
You can check that by printing the identities of the objects:
>>> id(a)
44988624L
>>> id(d)
44988624L
Copying values by assignment never really happens in Python; it only looks that way for immutable fundamental data types like int.
If you deal with mutable objects you have to copy them explicitly, because the variables themselves just hold a reference to the object.
You could do that with
d = deque('abc')
a = deque('abc')
or with
>>> import copy
>>> d = copy.copy(a)
which results in
>>> id(a)
44988624L
>>> id(d)
44989352L
but then a and d hold two different objects, which will diverge once you start modifying them.
The line:
a = d
does not create a copy - it just creates another name for the same object.
To create a copy, do this:
d = deque('abc')
a = deque(d)
>>> from copy import deepcopy
>>> d = deque('abc')
>>> a = deepcopy(d)
>>> d.clear()
>>> a
deque(['a', 'b', 'c'])
Or you can use deque's built-in __copy__ method (note that it has to be called, not just referenced).
>>> d = deque('abc')
>>> a = d.__copy__
>>> a
<built-in method __copy__ of collections.deque object at 0x02437C70>
>>> a = d.__copy__()
>>> a
deque(['a', 'b', 'c'])
>>> d.clear()
>>> a
deque(['a', 'b', 'c'])
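For what it's worth, on Python 3.5 and later deque also exposes a plain copy() method, which reads more naturally than calling __copy__ directly; a small sketch:
>>> from collections import deque
>>> d = deque('abc')
>>> a = d.copy()   # shallow copy; Python 3.5+
>>> d.clear()
>>> a
deque(['a', 'b', 'c'])
On older versions, deque(d) as shown earlier does the same job.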
You were assigning a reference to the same object; that's why clearing d also cleared a. You need to copy the object d into a, for example with deepcopy, which copies the object instead of just referencing it.
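A minimal sketch of that deepcopy step (the id values printed below will of course differ on your machine):
>>> from collections import deque
>>> import copy
>>> d = deque('abc')
>>> a = copy.deepcopy(d)   # a now refers to an independent deque
>>> d.clear()
>>> a
deque(['a', 'b', 'c'])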
>>> id(a)
37976360
>>> id(d)
37976248
I would like to create a copy of an object. I want the new object to possess all properties of the old object (values of the fields). But I want to have independent objects. So, if I change values of the fields of the new object, the old object should not be affected by that.
To get a fully independent copy of an object you can use the copy.deepcopy() function.
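As a minimal sketch (the Point class here is just a made-up example):
>>> import copy
>>> class Point:
...     def __init__(self, x, y):
...         self.x, self.y = x, y
...
>>> p = Point(1, 2)
>>> q = copy.deepcopy(p)
>>> q.x = 99       # changing the copy...
>>> p.x            # ...leaves the original untouched
1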
For more details about shallow and deep copying please refer to the other answers to this question and the nice explanation in this answer to a related question.
How can I create a copy of an object in Python?
So, if I change values of the fields of the new object, the old object should not be affected by that.
You mean a mutable object then.
In Python 3, lists get a copy method (in 2, you'd use a slice to make a copy):
>>> a_list = list('abc')
>>> a_copy_of_a_list = a_list.copy()
>>> a_copy_of_a_list is a_list
False
>>> a_copy_of_a_list == a_list
True
Shallow Copies
Shallow copies are just copies of the outermost container.
list.copy is a shallow copy:
>>> list_of_dict_of_set = [{'foo': set('abc')}]
>>> lodos_copy = list_of_dict_of_set.copy()
>>> lodos_copy[0]['foo'].pop()
'c'
>>> lodos_copy
[{'foo': {'b', 'a'}}]
>>> list_of_dict_of_set
[{'foo': {'b', 'a'}}]
You don't get a copy of the interior objects. They're the same object - so when they're mutated, the change shows up in both containers.
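An identity check on the interior objects makes this concrete (continuing the session above):
>>> lodos_copy is list_of_dict_of_set
False
>>> lodos_copy[0] is list_of_dict_of_set[0]
True
>>> lodos_copy[0]['foo'] is list_of_dict_of_set[0]['foo']
True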
Deep copies
Deep copies are recursive copies of each interior object.
>>> import copy
>>> lodos_deep_copy = copy.deepcopy(list_of_dict_of_set)
>>> lodos_deep_copy[0]['foo'].add('c')
>>> lodos_deep_copy
[{'foo': {'c', 'b', 'a'}}]
>>> list_of_dict_of_set
[{'foo': {'b', 'a'}}]
Changes are not reflected in the original, only in the copy.
Immutable objects
Immutable objects do not usually need to be copied. In fact, if you try to, Python will just give you the original object:
>>> a_tuple = tuple('abc')
>>> tuple_copy_attempt = a_tuple.copy()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'copy'
Tuples don't even have a copy method, so let's try it with a slice:
>>> tuple_copy_attempt = a_tuple[:]
But we see it's the same object:
>>> tuple_copy_attempt is a_tuple
True
Similarly for strings:
>>> s = 'abc'
>>> s0 = s[:]
>>> s == s0
True
>>> s is s0
True
and for frozensets, even though they have a copy method:
>>> a_frozenset = frozenset('abc')
>>> frozenset_copy_attempt = a_frozenset.copy()
>>> frozenset_copy_attempt is a_frozenset
True
When to copy immutable objects
Immutable objects should be copied if you need a mutable interior object copied.
>>> tuple_of_list = [],
>>> copy_of_tuple_of_list = tuple_of_list[:]
>>> copy_of_tuple_of_list[0].append('a')
>>> copy_of_tuple_of_list
(['a'],)
>>> tuple_of_list
(['a'],)
>>> deepcopy_of_tuple_of_list = copy.deepcopy(tuple_of_list)
>>> deepcopy_of_tuple_of_list[0].append('b')
>>> deepcopy_of_tuple_of_list
(['a', 'b'],)
>>> tuple_of_list
(['a'],)
As we can see, when the interior object of the copy is mutated, the original does not change.
Custom Objects
Custom objects usually store data in a __dict__ attribute or in __slots__ (a tuple-like memory structure.)
To make a copyable object, define __copy__ (for shallow copies) and/or __deepcopy__ (for deep copies).
from copy import copy, deepcopy

class Copyable:
    __slots__ = 'a', '__dict__'
    def __init__(self, a, b):
        self.a, self.b = a, b
    def __copy__(self):
        return type(self)(self.a, self.b)
    def __deepcopy__(self, memo): # memo is a dict of id's to copies
        id_self = id(self)        # memoization avoids unnecessary recursion
        _copy = memo.get(id_self)
        if _copy is None:
            _copy = type(self)(
                deepcopy(self.a, memo),
                deepcopy(self.b, memo))
            memo[id_self] = _copy
        return _copy
Note that deepcopy keeps a memoization dictionary of id(original) (or identity numbers) to copies. To enjoy good behavior with recursive data structures, make sure you haven't already made a copy, and if you have, return that.
So let's make an object:
>>> c1 = Copyable(1, [2])
And copy makes a shallow copy:
>>> c2 = copy(c1)
>>> c1 is c2
False
>>> c2.b.append(3)
>>> c1.b
[2, 3]
And deepcopy now makes a deep copy:
>>> c3 = deepcopy(c1)
>>> c3.b.append(4)
>>> c1.b
[2, 3]
Shallow copy with copy.copy()
#!/usr/bin/env python3
import copy

class C():
    def __init__(self):
        self.x = [1]
        self.y = [2]

# It copies.
c = C()
d = copy.copy(c)
d.x = [3]
assert c.x == [1]
assert d.x == [3]

# It's shallow.
c = C()
d = copy.copy(c)
d.x[0] = 3
assert c.x == [3]
assert d.x == [3]
Deep copy with copy.deepcopy()
#!/usr/bin/env python3
import copy

class C():
    def __init__(self):
        self.x = [1]
        self.y = [2]

c = C()
d = copy.deepcopy(c)
d.x[0] = 3
assert c.x == [1]
assert d.x == [3]
Documentation: https://docs.python.org/3/library/copy.html
Tested on Python 3.6.5.
I believe the following should work with many well-behaved classes in Python:
def copy(obj):
    return type(obj)(obj)
(Of course, I am not talking here about "deep copies," which is a different story, and which may not be a very clear concept -- how deep is deep enough?)
According to my tests with Python 3, for immutable objects, like tuples or strings, it returns the same object (because there is no need to make a shallow copy of an immutable object), but for lists or dictionaries it creates an independent shallow copy.
Of course this method only works for classes whose constructors behave accordingly. Possible use cases: making a shallow copy of a standard Python container class.
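A quick sketch of how this behaves, consistent with the observations above (restating the helper so the session is self-contained):
>>> def copy(obj):
...     return type(obj)(obj)
...
>>> l1 = [1, [2]]
>>> l2 = copy(l1)
>>> l2 is l1                # a new list...
False
>>> l2[1] is l1[1]          # ...but still a shallow copy: the inner list is shared
True
>>> t = (1, 2)
>>> copy(t) is t            # immutable built-ins come back as the same object
True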
I imagine this is one of a very long list of questions from people who have inadvertently created references in Python, but I have the following situation. I'm using scipy's minimize to set the sum of the top row of an array to 5 (as an example).
import scipy.optimize as spo

class problem_test:
    def __init__(self):
        test_array = [[1, 2, 3, 4, 5, 6, 7],
                      [4, 5, 6, 7, 8, 9, 10]]

        def set_top_row_to_five(x, array):
            array[0] = array[0] + x
            return abs(sum(array[0]) - 5)

        adjustment = spo.minimize(set_top_row_to_five, 0, args=(test_array))
        print(test_array)
        print(adjustment.x)

ptest = problem_test()
However, the optimization is altering the original array (test_array):
[array([-2.03, -1.03, -0.03, 0.97, 1.97, 2.97, 3.97]), [4, 5, 6, 7, 8, 9, 10]]
[-0.00000001]
I realize I can solve this using, for example, deepcopy, but I'm keen to learn why this is happening so I don't do the same in future by accident.
Thanks in advance!
Names are references to objects. What matters is whether an object (including one passed in as an argument) is modified in place or whether a new object is created. An example would be:
>>> l1 = list()
>>> l2 = l1
>>> l2.append(0) # this modifies object currently reference to by l1 and l2
>>> print(l1)
[0]
Whereas:
>>> l1 = list()
>>> l2 = list(l1) # New list object has been created with initial values from l1
>>> l2.append(0)
>>> print(l1)
[]
Or:
>>> l1 = list()
>>> l2 = l1
>>> l2 = [0] # New list object has been created and assigned to l2
>>> l2.append(0)
>>> print(l1)
[]
Similarly, assuming l = [1, 2, 3]:
>>> def f1(list_arg):
...     return list_arg.reverse()
>>> print(f1(l), l)
None [3, 2, 1]
We have just passed through the None returned by the list.reverse method, and reversed l in place. However:
>>> def f2(list_arg):
...     ret_list = list(list_arg)
...     ret_list.reverse()
...     return ret_list
>>> print(f2(l), l)
[3, 2, 1] [1, 2, 3]
The function returns a new reversed object initialized from l, which itself remained unchanged. (NOTE: in this example the built-in reversed or slicing would of course make more sense.)
With nested data, one must not forget that, for instance:
>>> l = [1, 2, 3]
>>> d1 = {'k': l}
>>> d2 = dict(d1)
>>> d1 is d2
False
>>> d1['k'] is d2['k']
True
Dictionaries d1 and d2 are two different objects, but their k item is only one (and shared) instance. This is the case when copy.deepcopy might come in handy.
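For instance, deepcopy gives each dictionary its own copy of the nested list (using a new name d3 to keep it separate from d2 above):
>>> import copy
>>> d3 = copy.deepcopy(d1)
>>> d1['k'] is d3['k']
False
>>> d3['k'].append(4)
>>> d1['k']
[1, 2, 3]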
Care needs to be taken when passing objects around, to make sure they are modified in place or copied as intended. It can be helpful to return None (or a similarly generic value) when making in-place changes, and to return the resulting object when working on a copy, so that the function or method interface itself hints at what is actually going on.
When immutable objects are "modified" (as far as the name allows), a new object is actually created and assigned either to a new name or back to the original one:
>>> s = 'abc'
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9dbbfa78 abc
>>> s = s.upper()
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9c989490 ABC
Note, though, that even an immutable type can hold a reference to a mutable object. For instance, with l = [1, 2, 3]; t1 = (l,); t2 = t1, one can call t1[0].append(4). This change is also seen in t2[0] (for the same reason as d1['k'] and d2['k'] above), while both tuples themselves remain unmodified.
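Spelled out as a session:
>>> l = [1, 2, 3]
>>> t1 = (l,)
>>> t2 = t1
>>> t1[0].append(4)   # mutates the list held inside the tuple
>>> t2[0]             # the change is visible through t2 as well
[1, 2, 3, 4]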
One extra caveat (a possible gotcha): when a default argument value is a mutable object, that default behaves like a "static" variable across all calls that do not pass anything for it:
>>> def f3(arg_list=[]):
...     arg_list.append('x')
...     print(arg_list)
>>> f3()
['x']
>>> f3()
['x', 'x']
Since this is often not the behavior people expect at first glance, using mutable objects as default argument values is usually better avoided.
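The usual way around it is a None default plus a check inside the function; a sketch (f4 just continues the numbering above):
>>> def f4(arg_list=None):
...     if arg_list is None:
...         arg_list = []   # a fresh list on every call
...     arg_list.append('x')
...     print(arg_list)
...
>>> f4()
['x']
>>> f4()
['x']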
Something similar is true for class attributes, where one object is shared between all instances:
>>> class C(object):
...     a = []
...     def m(self):
...         self.a.append('x') # We actually modify the value of an attribute of C
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
['x']
>>> c2.m()
['x', 'x']
>>> c1.m()
['x', 'x', 'x']
Note what the behavior would be in a similar example where the class attribute is of an immutable type:
>>> class C(object):
...     a = 0
...     def m(self):
...         self.a += 1 # We assign a new object to an attribute of self
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
1
>>> c2.m()
1
>>> c1.m()
2
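If the intent in the first (mutable) example is for every instance to have its own independent list, the usual fix is to create the mutable object in __init__ instead of in the class body; a sketch:
>>> class C(object):
...     def __init__(self):
...         self.a = []          # one list per instance
...     def m(self):
...         self.a.append('x')
...         print(self.a)
...
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
['x']
>>> c2.m()
['x']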
All the fun details can be found in the documentation: https://docs.python.org/3.6/reference/datamodel.html
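Coming back to the scipy example that prompted the question, one possible non-mutating version of the objective, written as a rough sketch (it assumes numpy imported as np and scipy.optimize as spo, which the question's code implies but does not show):
import numpy as np
import scipy.optimize as spo

test_array = [[1, 2, 3, 4, 5, 6, 7],
              [4, 5, 6, 7, 8, 9, 10]]

def set_top_row_to_five(x, array):
    # Build a new row locally instead of assigning back into the caller's list
    row = np.asarray(array[0], dtype=float) + x
    return float(abs(row.sum() - 5))

adjustment = spo.minimize(set_top_row_to_five, 0, args=(test_array,))
print(test_array)      # the original nested list is left unchanged
print(adjustment.x)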
Why is it that:
>>> a = 1
>>> b = a
>>> a = 2
>>> print(a)
2
>>> print(b)
1
...but:
>>> a = [3, 2, 1]
>>> b = a
>>> a.sort()
>>> print(b)
[1, 2, 3]
I mean, why do the values seem to be copied in the first case, while the list is just referenced in the second?
Variables are not "really copied". Variables are names for objects, and the assignment operator binds a name to the object on the right hand side of the operator. More verbosely:
>>> a = 1 means "make a a name referring to the object 1".
>>> b = a means "make b a name referring to the object currently referred to by a", which is 1.
>>> a = 2 means "make a a name referring to the object 2". This has no effect on anything else that happened to refer to 1, such as b.
In your second example, both a and b are names referring to the same list object. a.sort() mutates that object in place, and because both variables refer to the same object the effects of the mutation are visible under both names.
Think of the assigned variables as pointers to the memory location where the values are held. You can actually get the memory location using id.
a = 1
b = a
>>> id(a)
4298171608
>>> id(b)
4298171608 # points to the same memory location
a = 2
>>> id(a)
4298171584 # memory location has changed
Doing the same with your list example, you can see that both are in fact operating on the same object, but with different variables both pointing to the same memory location.
a = [3, 2, 1]
b = a
a.sort()
>>> id(a)
4774033312
>>> id(b)
4774033312 # Same object
In your first example, you reassigned a to a new value after making b refer to the same object as a, so a and b end up carrying different values.
The same would have occurred in your second example if you had reassigned a to a new sorted list instead of just sorting it in place.
a = [3,2,1]
b = a
a.sort()
print b
[1,2,3]
but...
a = [3,2,1]
b = a
a = sorted(a)
print b
[3,2,1]
I don't know if the heading makes sense... but this is what I am trying to do, using a list:
>>> x = 5
>>> l = [x]
>>> l
[5]
>>> x = 6
>>> l
[5] # I want l to automatically get updated and wish to see [6]
>>>
The same happens with dict and tuple. Is there a Python object that can store the dynamic value of a variable?
Thanks,
There's no way to get this to work, due to how the assignment operator works in Python. x = WHATEVER will always rebind the name x to WHATEVER, without modifying whatever x was previously bound to. (*)
You can work around this by replacing the integers with a container data type, such as single-element lists:
>>> x = [5]
>>> l = [x]
>>> l
[[5]]
>>> x[0] = 6
>>> l
[[6]]
but that's really a hack, and I wouldn't recommend it for anything but experimentation.
(*) Rebinding may have side effects on the previously bound object when its reference count drops to zero, e.g. it may cause a file to be closed. You shouldn't rely on that, though.
A variable is a place to store data. A data structure is a place to store data. Pick the one which meets your needs.
You can do it with the numpy module.
>>> from numpy import array
>>> a = array(5)
>>> a
array(5)
>>> l = [a]
>>> l
[array(5)]
>>> a.itemset(6)
>>> a
array(6)
>>> l
[array(6)]
Generally, a 0-D numpy array can be treated like any regular value, as shown below:
>>> a + 3
9
However, if you need to, you can access the underlying object as such:
>>> a.item()
6
Here's a kind of hacky method of dynamic access that isn't very extensible/flexible in its given form, but could be used as a basis for something better.
>>> a = 7
>>> class l:
...     def a_get(self):
...         global a
...         return a
...     def a_set(self, value):
...         global a
...         a = value
...     a = property(a_get, a_set)
>>> c = l()
>>> c.a
7
>>> a = 4
>>> c.a
4
>>> c.a = 6
>>> a
6
Dictionary d works fine, as expected,
In [335]: d={1:[], 2:[]}
In [336]: d[1].append('word')
In [337]: d
Out[337]: {1: ['word'], 2: []}
But dz, which looks identical to d, doesn't work correctly.
In [339]: dz=dict(zip([1,2],[[]]*2))
In [340]: dz
Out[340]: {1: [], 2: []}
In [341]: dz[1].append('word')
In [342]: dz
Out[342]: {1: ['word'], 2: ['word']}
Am I doing something wrong? Python 2.6.5
This is a common Python gotcha: [[]] * 2 creates two references to the same empty list. You want ([], []) or [[] for _ in xrange(2)] (the latter is suitable for long sequences of empty lists).
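A quick check that the comprehension form really does give two independent lists (the question is on Python 2.6, hence xrange; on Python 3 use range):
>>> x = [[] for _ in xrange(2)]
>>> x[0] is x[1]
False
>>> x[0].append(1)
>>> x
[[1], []]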
A simpler example that reproduces what your zip-based code does:
In [1]: dupl = [[]] * 2
In [2]: dupl[0].append(1)
In [3]: dupl
Out[3]: [[1], [1]]
>>> a = []
>>> b = []
>>> c = a
>>> a is b
False
>>> a is a
True
>>> a is c
True
The is checks above demonstrate that your unexpected case is due to having two ways of referencing the same object.
>>> d={1:[], 2:[]}
>>> d[1] is d[2]
False
>>> dz=dict(zip([1,2],[[]]*2))
>>> dz[1] is dz[2]
True
If this is not intended behavior, I would probably write
>>> dz = dict( (k, []) for k in [1, 2] )
>>> dz[1] is dz[2]
False
or (assuming a new enough Python, and that this is what you want)
>>> import collections
>>> dz = collections.defaultdict(list)
In d, the values of the dict are initially two separate empty lists. In dz, they are the same empty list.
[[]]*2 creates a list which contains the same list twice, not a list containing two different lists.
x=[[]]*2
print x[0] is x[1]
# True
The * operator does not really copy the element; it duplicates the reference to the same empty list, so a change made through either reference changes the contents of the single, shared list.
I'm not sure if this is relevant to what you're eventually going to do with your dict, but if you want the default content of a dict entry to be a certain value, you can use collections.defaultdict:
d = collections.defaultdict(list)
d['a'].append(3)
print d
defaultdict(<type 'list'>, {'a': [3]})
OK, thanks for the answers. Using copy() solves the problem, indeed.
In [1]: import copy
In [2]: udict = lambda a,e: dict(zip(a,[copy.copy(e) for _ in xrange(len(a))]))
In [3]: dzx = udict([1,2],[])
In [4]: dzx[1].append('word')
In [5]: dzx
Out[5]: {1: ['word'], 2: []}