How does variable copying in Python exactly work? - python

Why is it that:
>>> a = 1
>>> b = a
>>> a = 2
>>> print(a)
2
>>> print(b)
1
...but:
>>> a = [3, 2, 1]
>>> b = a
>>> a.sort()
>>> print(b)
[1, 2, 3]
I mean, why are variables really copied and iterators just referenced?

Variables are not "really copied". Variables are names for objects, and the assignment operator binds a name to the object on the right hand side of the operator. More verbosely:
>>> a = 1 means "make a a name referring to the object 1".
>>> b = a means "make b a name referring to the object currently referred to by a", which is 1.
>>> a = 2 means "make a a name referring to the object 2". This has no effect on any other name that happened to refer to 1, such as b.
In your second example, both a and b are names referring to the same list object. a.sort() mutates that object in place, and because both variables refer to the same object the effects of the mutation are visible under both names.
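The difference between rebinding a name and mutating an object can be checked with the is operator. The following is only a small sketch; the helper names (rebound_same, mutated_same, b_after_rebind) are illustrative and not part of the question.

```python
# Rebinding: `a` is pointed at a new object; `b` still names the old one.
a = [3, 2, 1]
b = a
a = [9, 9, 9]             # rebinding a; the original list is untouched
rebound_same = a is b
b_after_rebind = list(b)  # snapshot of what b sees at this point

# Mutation: both names still refer to the single list that is sorted in place.
a = [3, 2, 1]
b = a
a.sort()
mutated_same = a is b

print(rebound_same, b_after_rebind)  # False [3, 2, 1]
print(mutated_same, b)               # True [1, 2, 3]
```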

Think of the assigned variables as pointers to the objects that hold the values. You can inspect an object's identity using id; in CPython this happens to be the object's memory address.
>>> a = 1
>>> b = a
>>> id(a)
4298171608
>>> id(b)  # points to the same object
4298171608
>>> a = 2
>>> id(a)  # a now refers to a different object
4298171584
Doing the same with your list example, you can see that a and b are two different names for one and the same object.
>>> a = [3, 2, 1]
>>> b = a
>>> a.sort()
>>> id(a)
4774033312
>>> id(b)  # same object
4774033312

In your first example you reassigned a after binding b to a's value, so a and b now refer to different values.
The same would have occurred in your second example if you had reassigned a to a new sorted list instead of just sorting it in place.
>>> a = [3, 2, 1]
>>> b = a
>>> a.sort()
>>> print(b)
[1, 2, 3]
but...
>>> a = [3, 2, 1]
>>> b = a
>>> a = sorted(a)
>>> print(b)
[3, 2, 1]
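To make the contrast explicit, here is a short sketch (the names result and ret are illustrative) showing that sorted builds a new list and leaves the original alone, while list.sort works in place and returns None.

```python
a = [3, 2, 1]
b = a

result = sorted(a)          # builds a new list; a and b are unchanged
print(result, a, b)         # [1, 2, 3] [3, 2, 1] [3, 2, 1]
print(result is a)          # False

ret = a.sort()              # sorts the shared list in place
print(ret, b)               # None [1, 2, 3]
```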

Related

Why do variables containing lists in Python act differently from say variable containing integers in terms of storing/pointing towards values? [duplicate]

List reference append code
a = [1,2,3,4,5]
b = a
b.append(6)
print(a)
print(b)
#ans:
[1,2,3,4,5,6]
[1,2,3,4,5,6]
Integer reference in int
a = 1
b = a
b +=1
print(a)
print(b)
#ans:
1
2
How do references work in Python for integers vs. lists? In the list case both values are the same, so why is a not 2 in the integer case?
In Python, everything is an object, and every name is a reference (pointer) to an object, per the docs.
On that page you can scroll down and find the following:
Numeric objects are immutable; once created their value never changes
Under that you'll see the int type defined, so it makes perfect sense your second example works.
On the top of the same page, you'll find the following:
Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory.
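As in Java, a name holds a reference to an object, and assigning to the name rebinds it without touching the object it used to refer to. Python passes arguments the same way (the reference is passed by value, often called "pass by assignment"); it has no pass-by-reference semantic that would let a callee rebind the caller's name.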
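A small sketch of what this means for function calls; the functions rebind and mutate are hypothetical helpers written only for this illustration.

```python
def rebind(x):
    # Rebinds the local name x only; the caller's object is not touched.
    x = [0]


def mutate(x):
    # Mutates the object that both the caller and x refer to.
    x.append(0)


data = [1, 2]
rebind(data)
after_rebind = list(data)   # snapshot after the rebinding call
mutate(data)

print(after_rebind)         # [1, 2]
print(data)                 # [1, 2, 0]
```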
Looking at your first example:
>>> a = 1
>>> hex(id(a))
'0x7ffdc64cd420'
>>> b = a + 1
>>> hex(id(b))
'0x7ffdc64cd440'
>>> print(a)
1
>>> print(b)
2
Here the operation b = a + 1 leaves a at 1, and b is now 2. Because int is immutable, the addition cannot change the object 1; it yields a different object. In CPython, names bound to the value 1 also all refer to the same cached object:
>>> a = 1
>>> b = 2
>>> c = 1
>>> hex(id(a))
'0x7ffdc64cd420'
>>> hex(id(b))
'0x7ffdc64cd440'
>>> hex(id(c))
'0x7ffdc64cd420'
Now this sharing only holds for the values -5 to 256 in the C implementation (CPython); beyond that range you may get new addresses, but the immutability shown above still holds. I've shown you the sharing of memory addresses for a reason. On the same page you'll find the following:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
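The guarantee for mutable objects in that passage can be checked directly. This sketch uses the names c and d from the quote plus two extra names, e and f, introduced here for the chained assignment.

```python
c = []
d = []
print(c == d)   # True: equal values
print(c is d)   # False: two distinct, newly created lists

e = f = []      # chained assignment binds both names to one list
print(e is f)   # True
e.append(1)
print(f)        # [1]
```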
So your example:
>>> a = [1, 2, 3, 4, 5]
>>> hex(id(a))
'0x17292e1cbc8'
>>> b = a
>>> hex(id(b))
'0x17292e1cbc8'
I should be able to stop right here; it's obvious that both a and b refer to the same object in memory at address 0x17292e1cbc8. That's because the above is like saying:
# Let's assume that `[1, 2, 3, 4, 5]` lives at 0x17292e1cbc8 in memory
>>> a = 0x17292e1cbc8
>>> b = a
>>> hex(b)
'0x17292e1cbc8'
Long story short: you're simply assigning a pointer to a new name, but both names point to the same object in memory! Note: this is not the same as a shallow copy, because no new compound object is made.
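For comparison, a sketch of the difference between a second name for the same list and an actual shallow copy. The names alias and shallow are illustrative; copy.copy is the standard-library shallow copy, and a[:] or list(a) would behave the same way here.

```python
import copy

a = [1, 2, 3, 4, 5]
alias = a                 # second name for the same list
shallow = copy.copy(a)    # new list object with the same elements

alias.append(6)

print(a)                          # [1, 2, 3, 4, 5, 6]
print(shallow)                    # [1, 2, 3, 4, 5]
print(alias is a, shallow is a)   # True False
```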

Python address allocation to variables

Initially variables a, b and c all have the value 1 and the same address. When a is incremented by 1 its address changes, while the addresses of b and c remain the same. Can someone elaborate on this address allotment?
Also, when b is then incremented by 1, the address of b becomes equal to the address of a. Can someone please elaborate on this as well?
>>> a = 1
>>> b = a
>>> c = b
>>> a += 1
>>> print a,b,c
2 1 1
>>> id(a)
26976576
>>> id(b)
26976600
>>> id(c)
26976600
>>> b += 1
>>> print a,b,c
2 2 1
>>> id(c)
26976600
>>> id(b)
26976576
>>> id(a)
26976576
Values and memory addresses are all misleading terms. Think of objects, names and IDs. First the object 1 is assigned to the names a, b and c. So the ID of this object can be reached by all the names.
In the second step, you assign a new object, the integer 2, which has a different ID, to the name a.
In the third step, you assign the integer object 2 to b as well. It is an implementation detail of CPython that small integers are held only once in memory, so the object reached through the name b, and therefore its ID, is the same as the one reached through a.
https://docs.python.org/2/c-api/int.html#c.PyInt_FromLong
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object.
Also, in Python an integer is an immutable object (a PyIntObject in CPython 2). Once a PyIntObject is created, its value never changes; names merely hold references to it.
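Because the caching is an implementation detail, identity should not be used to compare integers. A hedged sketch (the names x, y, values_equal and same_object are illustrative): == is reliable, while the result of is depends on the interpreter.

```python
x = 1000
y = int("1000")            # built at run time, so it is not merged with x by the compiler

values_equal = (x == y)    # compares values: always True here
same_object = (x is y)     # compares identities: implementation-dependent

print(values_equal)        # True
print(same_object)         # may be True or False depending on the implementation
```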
The reason you are seeing this is because you are thinking in terms of the variables, a, b, and c being the objects, when in fact it is the integers that are the objects of your example. The memory location that you are looking at when you type id() is the location for the int object 1 and the int object 2, and not the variable names a, b, and c.
In Python, names are bound to objects. This can be problematic with mutable types such as lists when you think you are copying them. If two names refer to the same object, a change made through one name is visible through the other. That isn't always what you intend, and unless you are aware of it, it can cause a lot of debugging trouble.
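When an independent copy is really wanted, nested lists need extra care. This is only a sketch with made-up data (grid, shallow, deep), using the standard copy module: a shallow copy still shares the inner lists, while a deep copy does not.

```python
import copy

grid = [[1, 2], [3, 4]]
shallow = copy.copy(grid)      # new outer list, same inner lists
deep = copy.deepcopy(grid)     # new outer list and new inner lists

grid[0].append(99)             # mutate an inner list through the original

print(shallow[0])              # [1, 2, 99]
print(deep[0])                 # [1, 2]
```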
Back to your example, I added two instances of id() at the beginning to show the id's of the integer objects 1 and 2. This should clarify things for you.
>>> a = 1
>>> b = a
>>> c = a
>>> id(a)
4298174296
>>> id(b)
4298174296
>>> id(c)
4298174296
>>> id(1)
4298174296
>>> id(2)
4298174272
>>> a += 1
>>> id(a)
4298174272
>>> id(b)
4298174296
>>> id(c)
4298174296
>>> b += 1
>>> print a, b, c
2 2 1
>>> id(c)
4298174296
>>> id(b)
4298174272
>>> id(a)
4298174272
As you can see, the location for 1 and for a, b and c is initially the same, and the location for 2 is different. Then when you change the assignment for a, it points to the location of 2, while b and c stay pointed to 1. Then when you reassign b, it points to the location for 2 as well, leaving only c pointing at 1.
Hope that clarifies it for you.

How can I set a variable in an array in Python?

I have a problem where I set a variable but it creates a new one instead, and I am not quite sure what is going on here. I have tried using global, setting the variables first, and using a tuple, but I just can't get it working. This is the problem:
>>> variable = 1
>>> variableList = [variable]
>>> variableList[0] = 2
>>> print(variable)
1
As you can see, variable stays 1 although I set it to 2. Is there an easy way to fix this?
Doing variableList = [variable] actually created a new reference (variableList[0]) to the object 1. When you then did variableList[0] = 2, it removed that reference from 1 and made variableList[0] refer to 2. So an assignment can never modify what other references point to.
>>> import sys
>>> variable = 1000
>>> sys.getrefcount(variable)
2
>>> variableList = [variable]
>>> sys.getrefcount(variable) # Reference count increased by 1
3
>>> variableList[0] = 2
>>> sys.getrefcount(variable) #Reference count decreased by 1
2
In fact, even if you had used +=, that wouldn't have affected variable either, because you cannot modify an immutable object; you simply assign a new object to that variable name.
>>> a = 100
>>> b = a
>>> b += 10 #This too only affects b
>>> a
100
>>> b
110
But, if variable points to a mutable object and you perform some in-place operation on that object from either variable or variableList[0], then you'll see that both of them have changed.
>>> a = []
>>> b = [a]
>>> b[0].append(1) #in-place operation on a mutable object affects all references
>>> a
[1]
>>> b
[[1]]
You don't set variable. You just change the content of variableList.
variableList[0] is a reference of its own, just like variable, so the statement variableList = [variable] only stores a reference to the current value of variable.
It is just like this:
>>> a = 1
>>> b = a
>>> b = 2
>>> print(a)
1
You're reassigning the 0th element of the list (not an array) to the integer 2; you are not overwriting the variable named variable.
>>> variable = 1
>>> variableList = [variable]
>>> variableList[0] = 2
>>> print(variable)
1
>>> print(variableList)
[2]
I suggest you also look up mutability, as it's important to note that integers are immutable:
The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable.
When you did: variableList = [variable]
you created the first element of variableList and made it refer to the value of variable, but that does not make this first element the variable itself. You only copied a reference to its value, so changing variableList[0] has nothing to do with variable.

Store reference to primitive type in Python?

Code:
>>> a = 1
>>> b = 2
>>> l = [a, b]
>>> l[1] = 4
>>> l
[1, 4]
>>> l[1]
4
>>> b
2
What I want to instead see happen is that when I set l[1] equal to 4, that the variable b is changed to 4.
I'm guessing that when dealing with primitives, they are copied by value, not by reference. Often I see people having problems with objects and needing to understand deep copies and such. I basically want the opposite. I want to be able to store a reference to the primitive in the list, then be able to assign new values to that variable either by using its actual variable name b or its reference in the list l[1].
Is this possible?
There are no 'primitives' in Python. Everything is an object, even numbers. Numbers in Python are immutable objects. So, to have a reference to a number such that 'changes' to the 'number' are 'seen' through multiple references, the reference must be through e.g. a single element list or an object with one property.
(This works because lists and objects are mutable and a change to what number they hold is seen through all references to it)
e.g.
>>> a = [1]
>>> b = a
>>> a
[1]
>>> b
[1]
>>> a[0] = 2
>>> a
[2]
>>> b
[2]
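The "object with one property" variant mentioned above could look roughly like this. Box is a hypothetical class invented for this sketch, not something from the standard library; the names l and b follow the question.

```python
class Box:
    """Hypothetical mutable holder for a single value."""

    def __init__(self, value):
        self.value = value


a = Box(1)
b = Box(2)
l = [a, b]

l[1].value = 4      # mutate the shared Box instead of replacing the list slot
print(b.value)      # 4
print(l[1] is b)    # True
```

Note that assigning l[1] = Box(4) would again only replace the list slot and leave b unchanged.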
You can't really do that in Python, but you can come close by making the variables a and b refer to mutable container objects instead of immutable numbers:
>>> a = [1]
>>> b = [2]
>>> lst = [a, b]
>>> lst
[[1], [2]]
>>> lst[1][0] = 4 # changes contents of second mutable container in lst
>>> lst
[[1], [4]]
>>> a
[1]
>>> b
[4]
I don't think this is possible:
>>> lst = [1, 2]
>>> a = lst[1] # a is bound to the object 2, not to the slot lst[1]
>>> a
2
>>> lst[1] = 3
>>> lst
[1, 3] # list is changed
>>> a # value is not changed
2
a refers to the object that lst[1] originally held; it does not refer to the list slot lst[1] itself.
Think of l[0] as a name referring to the same object as a, and a as a name referring to an integer.
Integers are immutable, you can make names refer to different integers, but integers themselves can't be changed.
There was a relevant discussion earlier:
Storing elements of one list, in another list - by reference - in Python?
According to @mgilson, doing l[1] = 4 simply replaces the reference rather than trying to mutate the object. In any case, objects of type int are immutable.

Strange behavior with a list as a default function argument [duplicate]

Possible Duplicate:
“Least Astonishment” in Python: The Mutable Default Argument
List extending strange behaviour
Pyramid traversal view lookup using method names
Let's say I have this function:
def a(b=[]):
    b += [1]
    print(b)
Calling it yields this result:
>>> a()
[1]
>>> a()
[1, 1]
>>> a()
[1, 1, 1]
When I change b += [1] to b = b + [1], the behavior of the function changes:
>>> a()
[1]
>>> a()
[1]
>>> a()
[1]
How does b = b + [1] differ from b += [1]? Why does this happen?
In Python there is no guarantee that a += b does the same thing as a = a + b.
For lists, someList += otherList modifies someList in place, basically equivalent to someList.extend(otherList), and then rebinds the name someList to that same list. someList = someList + otherList, on the other hand, constructs a new list by concatenating the two lists, and binds the name someList to that new list.
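The identity check below is a small sketch of that difference; some_list and alias are illustrative names.

```python
some_list = [1]
alias = some_list

some_list += [2]                    # in place: same object, extended
print(alias, some_list is alias)    # [1, 2] True

some_list = some_list + [3]         # new list bound to the name some_list
print(alias, some_list)             # [1, 2] [1, 2, 3]
print(some_list is alias)           # False
```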
This means that, with +=, the name winds up pointing to the same object it already was pointing to, while with +, it points to a new object. Since function defaults are only evaluated once (see this much-cited question), this means that with += the operations pile up because they all modify the same original object (the default argument).
b += [1] alters the function default (leading to the least astonishment FAQ). b = b + [1] takes the default argument b, creates a new list with + [1], and binds that to b. One mutates the list; the other creates a new one.
When you define a function
>>> def a(b=[]):
...     b += [1]
...     return b
it saves all the default arguments in a special place. They can be accessed through func_defaults (Python 2; the attribute is called __defaults__ in Python 3):
>>> a.func_defaults
([],)
The first default value is a list with the ID:
>>> id(a.func_defaults[0])
15182184
Let's try to invoke the function:
>>> print a()
[1]
>>> print a()
[1, 1]
and see the ID of the value returned:
>>> print id(a())
15182184
>>> print id(a())
15182184
As you may see, it's the same as the ID of the list of the first default value.
The different output of the function is explained by the fact that b += ... modifies b in place and doesn't create a new list. And b is the list kept in the tuple of default values. So all your changes to the list are saved there, and each invocation of the function works with a different value of b.
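If a fresh list per call is what you want, a common idiom is to use None as the default and create the list inside the function. A minimal sketch, with append_one as a hypothetical replacement for the function a above:

```python
def append_one(b=None):
    # A new list is created on every call that omits b,
    # so nothing accumulates in the function's defaults.
    if b is None:
        b = []
    b += [1]
    return b


first = append_one()
second = append_one()

print(first)                     # [1]
print(second)                    # [1]
print(first is second)           # False
print(append_one.__defaults__)   # (None,)
```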
