Strange variable manipulation behaviour - python

When I have a function like
def foo(A):
    tmp=A
    tmp=tmp+1
    *rest of the function*
    return
or
def foo(A):
    global tmp
    tmp=A
    tmp=tmp+1
    *rest of the function*
    return
1 is added to both "A" and "tmp" instead of only to "tmp". What am I doing wrong and how can I fix it?

As nneonneo's comment under Blake's answer (which describes a significant part of the problem) states, the code in the original post doesn't actually do what he states. For example:
def foo(A):
    tmp = A
    tmp = tmp + 1
    return (A, tmp)
foo(3)
returns (3,4), meaning A has been left unchanged.
Where this would not be true is where A is a mutable type. Mutable types include lists, dictionaries, and derived types of those, while integers, floats, and tuples are not mutable.
For example:
def foo(A):
    tmp = A
    tmp[0] = tmp[0] + 1
    return (A, tmp)
foo([1, 2])
which returns ([2, 2], [2, 2]), and in this case A has changed.
That foo changes the value of A in the list case but not the integer case is because lists are mutable and integers are not. Assigning A to tmp when A is a mutable type assigns a reference to the mutable object, and changing one of its elements (as in tmp[0] = tmp[0] + 1) doesn't make a new object.
If you do not want your function to have this side-effect-like behavior for a list, for example, a common Python idiom is to use slice notation to duplicate the list. This will make a new list object when you assign it to tmp that is a copy of the list object in A:
def foo(A):
    tmp = A[:]
    # this slice makes a new list, a copy of A
    tmp[0] = tmp[0] + 1
    return (A, tmp)
foo([1, 2])
This returns ([1, 2], [2, 2]), so A is unchanged and tmp is changed.
There are other ways to copy lists or other mutable objects which are subtly different from each other. How to clone or copy a list? has a great description of your choices.
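As a rough illustration of how those copying choices can differ, the sketch below compares a slice copy, list(), copy.copy and copy.deepcopy on a nested list. The variable names are made up for this example; the point is only that the first three are shallow copies, so the inner lists are still shared.

```python
import copy

original = [[1, 2], [3, 4]]

by_slice = original[:]                 # shallow: new outer list, same inner lists
by_list = list(original)               # shallow, same effect as the slice
by_copy = copy.copy(original)          # shallow, via the copy module
by_deepcopy = copy.deepcopy(original)  # inner lists are copied as well

original[0][0] = 99                    # mutate one of the inner lists

print(by_slice[0][0])     # 99, the inner list is shared
print(by_deepcopy[0][0])  # 1, the deep copy is fully independent
```

For a flat list of numbers or strings the shallow variants are usually enough; the difference only shows up once the elements themselves are mutable.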

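That's because Python passes arguments by object reference (sometimes called "call by sharing"), not by copying the value: the parameter name is bound to the very object the caller passed. The two variables therefore refer to the same place in memory, as the identical id() values show. Note that rebinding one of the names with = does not affect the other; only mutating the shared object does.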
>>> def foo(a):
...     tmp = a
...     print(tmp, a, id(a), id(tmp))
>>> foo(5)
5 5 505910928 505910928
>>> b = 5
>>> foo(b)
5 5 505910928 505910928
>>> id(b)
505910928
And with the global example:
>>> def foo(a):
...     global tmp
...     a = tmp
...     print(a, tmp, id(a), id(tmp))
>>> foo(5)
7 7 505910960 505910960
>>> foo('s')
7 7 505910960 505910960
>>> tmp
7
>>> tmp = 6
>>> foo('a')
6 6 505910944 505910944
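To make the distinction between rebinding a name and mutating an object concrete, here is a minimal sketch. The function names rebind and mutate are hypothetical and exist only for this illustration.

```python
def rebind(items):
    # Assignment only points the local name at a brand-new list;
    # the caller's list is left alone.
    items = items + [99]
    return items


def mutate(items):
    # append changes the one list object that the caller also sees.
    items.append(99)
    return items


data = [1, 2]
rebound = rebind(data)
after_rebind = list(data)   # snapshot of data: still [1, 2]
mutated = mutate(data)      # data is now [1, 2, 99]
```

After running this, rebound is a different object from data, while mutated is the same object as data.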

Related

Python: accidentally created a reference but not sure how

I imagine this is one in a very long list of questions from people who have inadvertently created references in Python, but I've got the following situation. I'm using scipy minimize to set the sum of the top row of an array to 5 (as an example).
import scipy.optimize as spo  # import assumed; the original snippet omits it
class problem_test:
    def __init__(self):
        test_array = [[1,2,3,4,5,6,7],
                      [4,5,6,7,8,9,10]]
        def set_top_row_to_five(x, array):
            array[0] = array[0] + x
            return abs(sum(array[0]) - 5)
        adjustment = spo.minimize(set_top_row_to_five,0,args=(test_array))
        print(test_array)
        print(adjustment.x)
ptest = problem_test()
However, the optimization is altering the original array (test_array):
[array([-2.03, -1.03, -0.03, 0.97, 1.97, 2.97, 3.97]), [4, 5, 6, 7, 8, 9, 10]]
[-0.00000001]
I realize I can solve this using, for example, deepcopy, but I'm keen to learn why this is happening so I don't do the same in future by accident.
Thanks in advance!
Names are references to objects. What matters is whether an object (including one passed as an argument) is itself modified or a new object is created. For example:
>>> l1 = list()
>>> l2 = l1
>>> l2.append(0) # this modifies the object currently referred to by both l1 and l2
>>> print(l1)
[0]
Whereas:
>>> l1 = list()
>>> l2 = list(l1) # New list object has been created with initial values from l1
>>> l2.append(0)
>>> print(l1)
[]
Or:
>>> l1 = list()
>>> l2 = l1
>>> l2 = [0] # New list object has been created and assigned to l2
>>> l2.append(0)
>>> print(l1)
[]
Similarly assuming l = [1, 2, 3]:
>>> def f1(list_arg):
...     return list_arg.reverse()
>>> print(f1(l), l)
None [3, 2, 1]
We have just printed the None returned by the list.reverse method and reversed l in place. However, starting again from l = [1, 2, 3]:
>>> def f2(list_arg):
...     ret_list = list(list_arg)
...     ret_list.reverse()
...     return ret_list
>>> print(f2(l), l)
[3, 2, 1] [1, 2, 3]
The function returns a new, reversed list initialized from l, which itself remains unchanged. (NOTE: in this example the built-in reversed or slicing would of course make more sense.)
When nested, one must not forget that for instance:
>>> l = [1, 2, 3]
>>> d1 = {'k': l}
>>> d2 = dict(d1)
>>> d1 is d2
False
>>> d1['k'] is d2['k']
True
Dictionaries d1 and d2 are two different objects, but their k item is only one (and shared) instance. This is the case when copy.deepcopy might come in handy.
Care needs to be taken when passing objects around to make sure they are modified or copy is used as wanted and expected. It might be helpful to return None or similar generic value when making in place changes and return the resulting object when working with a copy so that the function/method interface itself hints what the intention was and what is actually going on here.
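The interface convention described above might look like the following sketch. Both function names are invented for this illustration: one mutates its argument and returns None, the other leaves its argument alone and returns a new list.

```python
def add_in_place(row, x):
    """Mutate row; return None to signal that the change happened in place."""
    for i in range(len(row)):
        row[i] += x


def added(row, x):
    """Leave row untouched and return a new list."""
    return [value + x for value in row]


top_row = [1, 2, 3]
shifted = added(top_row, 10)       # top_row is still [1, 2, 3]
result = add_in_place(top_row, 1)  # top_row is now [2, 3, 4]; result is None
```

This mirrors the built-in pair list.sort() (in place, returns None) and sorted() (returns a new list).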
When immutable objects (as the name suggests) are being "modified" a new object would actually be created and assigned to a new or back to the original name/reference:
>>> s = 'abc'
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9dbbfa78 abc
>>> s = s.upper()
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9c989490 ABC
Note, though, that even an immutable type can hold a reference to a mutable object. For instance, given l = [1, 2, 3]; t1 = (l,); t2 = t1, one can call t1[0].append(4). This change is also visible through t2[0] (for the same reason as with d1['k'] and d2['k'] above), while both tuples themselves remain unmodified.
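The tuple case can be written out as a short sketch. The flag rebinding_failed is a name introduced here purely for demonstration.

```python
l = [1, 2, 3]
t1 = (l,)
t2 = t1

t1[0].append(4)   # mutates the list held by the tuple, not the tuple itself
print(t2[0])      # [1, 2, 3, 4]

try:
    t1[0] = [0]   # rebinding a slot of the tuple is what immutability forbids
except TypeError:
    rebinding_failed = True
else:
    rebinding_failed = False
```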
One extra caveat (possible gotcha). When defining default argument values (using mutable types), that default argument, when function is called without passing an object, behaves like a "static" variable:
>>> def f3(arg_list=[]):
...     arg_list.append('x')
...     print(arg_list)
>>> f3()
['x']
>>> f3()
['x', 'x']
Since this is often not a behavior people assume at first glance, using mutable objects as default argument value is usually better avoided.
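A common way to avoid the shared default is to use None as a sentinel and create the list inside the function. The sketch below assumes a hypothetical f3_fixed that returns the list instead of printing it.

```python
def f3_fixed(arg_list=None):
    # A fresh list is created on every call that omits the argument.
    if arg_list is None:
        arg_list = []
    arg_list.append('x')
    return arg_list


first = f3_fixed()
second = f3_fixed()
print(first)   # ['x']
print(second)  # ['x']
```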
Similar would be true for class attributes where one object would be shared between all instances:
>>> class C(object):
...     a = []
...     def m(self):
...         self.a.append('x') # We actually modify value of an attribute of C
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
['x']
>>> c2.m()
['x', 'x']
>>> c1.m()
['x', 'x', 'x']
Note how the behavior differs when the class attribute is of an immutable type in a similar example:
>>> class C(object):
...     a = 0
...     def m(self):
...         self.a += 1 # We assign new object to an attribute of self
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
1
>>> c2.m()
1
>>> c1.m()
2
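One way to see what happened in the immutable case is to inspect the instance dictionaries. In this sketch the class is renamed Counter (a name chosen here) and m no longer prints, so that the state can be examined afterwards.

```python
class Counter(object):
    a = 0

    def m(self):
        # Reads Counter.a through self, then stores the result on the instance.
        self.a += 1


c1 = Counter()
c2 = Counter()
c1.m()

print(vars(c1))    # {'a': 1}  (c1 now has its own attribute)
print(vars(c2))    # {}        (c2 still falls back to the class attribute)
print(Counter.a)   # 0         (the class attribute itself is untouched)
```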
All the fun details can be found in the documentation: https://docs.python.org/3.6/reference/datamodel.html

Copy values from one list to another without altering the reference in python

In python objects such as lists are passed by reference. Assignment with the = operator assigns by reference. So this function:
def modify_list(A):
    A = [1,2,3,4]
Takes a reference to list and labels it A, but then sets the local variable A to a new reference; the list passed by the calling scope is not modified.
test = []
modify_list(test)
print(test)
prints []
However I could do this:
def modify_list(A):
    A += [1,2,3,4]
test = []
modify_list(test)
print(test)
Prints [1,2,3,4]
How can I assign a list passed by reference to contain the values of another list? What I am looking for is something functionally equivalent to the following, but simpler:
def modify_list(A):
    list_values = [1,2,3,4]
    for i in range(min(len(A), len(list_values))):
        A[i] = list_values[i]
    for i in range(len(list_values), len(A)):
        del A[i]
    for i in range(len(A), len(list_values)):
        A += [list_values[i]]
And yes, I know that this is not a good way to do <whatever I want to do>, I am just asking out of curiosity not necessity.
You can do a slice assignment:
>>> def mod_list(A, new_A):
...     A[:]=new_A
...
>>> liA=[1,2,3]
>>> new=[3,4,5,6,7]
>>> mod_list(liA, new)
>>> liA
[3, 4, 5, 6, 7]
The simplest solution is to use:
def modify_list(A):
    A[::] = [1, 2, 3, 4]
To overwrite the contents of a list with another list (or an arbitrary iterable), you can use the slice-assignment syntax:
A = B = [1,2,3]
A[:] = [4,5,6,7]
print(A) # [4,5,6,7]
print(A is B) # True
Slice assignment is implemented on most of the mutable built-in sequence types. The above assignment is essentially the same as the following:
A.__setitem__(slice(None, None, None), [4,5,6,7])
So the same magic function (__setitem__) is called when a regular item assignment happens, only that the item index is now a slice object, which represents the item range to be overwritten. Based on this example you can even support slice assignment in your own types.
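Supporting slice assignment in your own type could look roughly like this. Playlist is a made-up class used only for illustration; it delegates to an internal list and checks whether the index it receives is a slice object.

```python
class Playlist(object):
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __setitem__(self, index, value):
        # index is an int for obj[i] = v and a slice object for obj[a:b] = v
        if isinstance(index, slice):
            self._tracks[index] = list(value)
        else:
            self._tracks[index] = value

    def __getitem__(self, index):
        return self._tracks[index]

    def __len__(self):
        return len(self._tracks)


p = Playlist(['a', 'b', 'c'])
alias = p
p[:] = ['x', 'y']    # replaces the contents, keeps the same Playlist object
p[0] = 'z'           # ordinary item assignment goes through the same method
print(p[:])          # ['z', 'y']
print(alias is p)    # True
```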

Store reference to primitive type in Python?

Code:
>>> a = 1
>>> b = 2
>>> l = [a, b]
>>> l[1] = 4
>>> l
[1, 4]
>>> l[1]
4
>>> b
2
What I want to instead see happen is that when I set l[1] equal to 4, that the variable b is changed to 4.
I'm guessing that when dealing with primitives, they are copied by value, not by reference. Often I see people having problems with objects and needing to understand deep copies and such. I basically want the opposite. I want to be able to store a reference to the primitive in the list, then be able to assign new values to that variable either by using its actual variable name b or its reference in the list l[1].
Is this possible?
There are no 'primitives' in Python. Everything is an object, even numbers. Numbers in Python are immutable objects. So, to have a reference to a number such that 'changes' to the 'number' are 'seen' through multiple references, the reference must be through e.g. a single element list or an object with one property.
(This works because lists and objects are mutable and a change to what number they hold is seen through all references to it)
e.g.
>>> a = [1]
>>> b = a
>>> a
[1]
>>> b
[1]
>>> a[0] = 2
>>> a
[2]
>>> b
[2]
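The "object with one property" variant mentioned above might be sketched as follows. Ref is a hypothetical helper class, not something from the standard library.

```python
class Ref(object):
    """A tiny mutable box holding one value."""

    def __init__(self, value):
        self.value = value


a = Ref(1)
b = Ref(2)
l = [a, b]

l[1].value = 4    # mutate the box that both b and l[1] refer to
print(b.value)    # 4
```

The catch is that every use site has to go through .value; assigning l[1] = 4 directly would still just replace the reference stored in the list.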
You can't really do that in Python, but you can come close by making the variables a and b refer to mutable container objects instead of immutable numbers:
>>> a = [1]
>>> b = [2]
>>> lst = [a, b]
>>> lst
[[1], [2]]
>>> lst[1][0] = 4 # changes contents of second mutable container in lst
>>> lst
[[1], [4]]
>>> a
[1]
>>> b
[4]
I don't think this is possible:
>>> lst = [1, 2]
>>> a = lst[1] # a is bound to the object lst[1] refers to, not to the list slot
>>> a
2
>>> lst[1] = 3
>>> lst
[1, 3] # list is changed
>>> a # value is not changed
2
a refers to the object that lst[1] originally referred to, not to the list slot lst[1] itself.
Think of l[0] as a name referring to the same object as a, and of a as a name referring to an integer.
Integers are immutable: you can make names refer to different integers, but integers themselves can't be changed.
There was a relevant discussion earlier:
Storing elements of one list, in another list - by reference - in Python?
According to @mgilson, doing l[1] = 4 simply replaces the reference rather than trying to mutate the object. In any case, objects of type int are immutable.

Strange behavior with a list as a default function argument [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
“Least Astonishment” in Python: The Mutable Default Argument
List extending strange behaviour
Pyramid traversal view lookup using method names
Let's say I have this function:
def a(b=[]):
    b += [1]
    print b
Calling it yields this result:
>>> a()
[1]
>>> a()
[1, 1]
>>> a()
[1, 1, 1]
When I change b += [1] to b = b + [1], the behavior of the function changes:
>>> a()
[1]
>>> a()
[1]
>>> a()
[1]
How does b = b + [1] differ from b += [1]? Why does this happen?
In Python there is no guarantee that a += b does the same thing as a = a + b.
For lists, someList += otherList modifies someList in place, basically equivalent to someList.extend(otherList), and then rebinds the name someList to that same list. someList = someList + otherList, on the other hand, constructs a new list by concatenating the two lists, and binds the name someList to that new list.
This means that, with +=, the name winds up pointing to the same object it already was pointing to, while with +, it points to a new object. Since function defaults are only evaluated once (see this much-cited question), this means that with += the operations pile up because they all modify the same original object (the default argument).
b += [1] alters the function default (leading to the least astonishment FAQ). b = b + [1] takes the default argument b - creates a new list with the + [1], and binds that to b. One mutates the list - the other creates a new one.
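The difference between the two operators can be checked directly with the is operator. This is a small sketch with variable names picked for the example.

```python
original = [1]

alias = original
alias += [2]                 # extends the existing list in place
same_after_iadd = alias is original

other = original
other = other + [3]          # builds a new list and rebinds the name
same_after_add = other is original

print(same_after_iadd)   # True
print(same_after_add)    # False
print(original)          # [1, 2]
```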
When you define a function
>>> def a(b=[]):
...     b += [1]
...     return b
it saves all the default values in a special place. They can be accessed with:
>>> a.func_defaults
([],)
The first default value is a list with the ID:
>>> id(a.func_defaults[0])
15182184
Let's try to invoke the function:
>>> print a()
[1]
>>> print a()
[1, 1]
and see the ID of the value returned:
>>> print id(a())
15182184
>>> print id(a())
15182184
As you may see, it's the same as the ID of the list of the first default value.
The differing output of the function is explained by the fact that b += ... modifies b in place and doesn't create a new list, and b is the list kept in the tuple of default values. So all your changes to the list are saved there, and each invocation of the function works with a different value of b.
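The session above uses Python 2 syntax (func_defaults and the print statement). On Python 3 the same tuple is exposed as __defaults__, so an equivalent check might look like this sketch, where default_list is a name introduced only for the example.

```python
def a(b=[]):
    b += [1]
    return b


default_list = a.__defaults__[0]   # the single list created at definition time
a()
a()
print(default_list)                # [1, 1]
print(a() is default_list)         # True
```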

Python why would you use [:] over =

I am just learning Python and I am going through the tutorials on https://developers.google.com/edu/python/strings
Under the String Slices section
s[:] is 'Hello' -- omitting both always gives us a copy of the whole
thing (this is the pythonic way to copy a sequence like a string or
list)
Out of curiosity why wouldn't you just use an = operator?
s = 'hello';
bar = s[:]
foo = s
As far as I can tell both bar and foo have the same value.
= makes a reference, by using [:] you create a copy. For strings, which are immutable, this doesn't really matter, but for lists etc. it is crucial.
>>> s = 'hello'
>>> t1 = s
>>> t2 = s[:]
>>> print s, t1, t2
hello hello hello
>>> s = 'good bye'
>>> print s, t1, t2
good bye hello hello
but:
>>> li1 = [1,2]
>>> li = [1,2]
>>> li1 = li
>>> li2 = li[:]
>>> print li, li1, li2
[1, 2] [1, 2] [1, 2]
>>> li[0] = 0
>>> print li, li1, li2
[0, 2] [0, 2] [1, 2]
So why use it when dealing with strings? The built-in strings are immutable, but whenever you write a library function expecting a string, a user might give you something that "looks like a string" and "behaves like a string", but is a custom type. This type might be mutable, so it's better to take care of that.
Such a type might look like:
class MutableString(object):
    def __init__(self, s):
        self._characters = [c for c in s]
    def __str__(self):
        return "".join(self._characters)
    def __repr__(self):
        return "MutableString(\"%s\")" % str(self)
    def __getattr__(self, name):
        return str(self).__getattribute__(name)
    def __len__(self):
        return len(self._characters)
    def __getitem__(self, index):
        return self._characters[index]
    def __setitem__(self, index, value):
        self._characters[index] = value
    def __getslice__(self, start, end=-1, stride=1):
        return str(self)[start:end:stride]
if __name__ == "__main__":
    m = MutableString("Hello")
    print m
    print len(m)
    print m.find("o")
    print m.find("x")
    print m.replace("e", "a") #translate to german ;-)
    print m
    print m[3]
    m[1] = "a"
    print m
    print m[:]
    copy1 = m
    copy2 = m[:]
    print m, copy1, copy2
    m[1] = "X"
    print m, copy1, copy2
Disclaimer: This is just a sample to show how it could work and to motivate the use of [:]. It is untested, incomplete and probably performs horribly.
They have the same value, but there is a fundamental difference when dealing with mutable objects.
Say foo = [1, 2, 3]. You assign bar = foo, and baz = foo[:]. Now let's say you want to change bar - bar.append(4). You check the value of foo, and...
print foo
# [1, 2, 3, 4]
Now where did that extra 4 come from? It's because you assigned bar to the identity of foo, so when you change one you change the other. You change baz - baz.append(5), but nothing has happened to the other two - that's because you assigned a copy of foo to baz.
Note however that because strings are immutable, it doesn't matter.
If you have a list the result is different:
l = [1,2,3]
l1 = l
l2 = l[:]
l2 is a copy of l (a different object), while l1 is an alias of l. This means that l1[0]=7 will also modify l, while l2[1]=7 will not modify l.
Referencing an object and referencing a copy of it make no practical difference for an immutable object like a string, but they do for mutable objects (and their mutating methods), for instance a list.
Same thing on mutable objects:
a = [1,2,3,4]
b = a
c = a[:]
a[0] = -1
print a # will print [-1,2,3,4]
print b # will print [-1,2,3,4]
print c # will print [1,2,3,4]
A visualization on pythontutor of the above example - http://goo.gl/Aswnl.
