Since Python doesn't have pointers, I am wondering how I can pass a reference to an object through to a function instead of copying the entire object. This is a very contrived example, but say I am writing a function like this:
def some_function(x):
    c = x/2 + 47
    return c
y = 4
z = 12
print(some_function(y))
print(some_function(z))
From my understanding, when I call some_function(y), Python allocates new space to store the argument value, then erases this data once the function has returned c and it's no longer needed. Since I am not actually altering the argument within some_function, how can I simply reference y from within the function instead of copying y when I pass it through? In this case it doesn't matter much, but if y was very large (say a giant matrix), copying it could eat up some significant time and space.
Your understanding is, unfortunately, completely wrong. Python does not copy the value, nor does it allocate space for a new one. It passes a value which is itself a reference to the object. If you modify that object (rather than rebinding its name), then the original will be modified.
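One way to check this is to compare object identity inside and outside a function. The sketch below is only an illustration: the function name `describe` and the list `big` are made up for this example, and `is` tests whether two names refer to the same object.

```python
def describe(data):
    # 'data' is a new local name bound to the caller's object; nothing is copied.
    return data is big, id(data)

big = list(range(1000000))  # stand-in for a "giant matrix"
same_object, seen_id = describe(big)
print(same_object)          # True
print(seen_id == id(big))   # True
```

Because only a reference is passed, the cost of the call does not depend on the size of `big`.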
Edit
I wish you would stop worrying about memory allocation: Python is not C++, and almost all of the time you don't need to think about memory.
It's easier to demonstrate rebinding via the use of something like a list:
def my_func(foo):
    foo.append(3)  # now the source list also has the number 3
    foo = [3]      # we've re-bound 'foo' to something else, severing the relationship
    foo.append(4)  # the source list is unaffected
    return foo
original = [1, 2]
new = my_func(original)
print(original)  # [1, 2, 3]
print(new)       # [3, 4]
It might help if you think in terms of names rather than variables: inside the function, the name "foo" starts off being a reference to the original list, but then we change that name to point to a new, different list.
Python parameters are always "references".
The way parameters work in Python, and the way the docs explain them, can be confusing and misleading to newcomers to the language, especially if you have a background in other languages that let you choose between "pass by value" and "pass by reference".
In Python terms, a "reference" is just a pointer with some more metadata to help the garbage collector do its job. Every variable and every parameter is a "reference".
So, internally, Python passes a "pointer" for each parameter. You can easily see this in the following example:
>>> def f(L):
...     L.append(3)
...
>>> X = []
>>> f(X)
>>> X
[3]
The variable X points to a list, and the parameter L is a copy of the "pointer" of the list, and not a copy of the list itself.
Take care to note that this is not the same as "pass-by-reference" in C++ with the & qualifier, or in Pascal with the var qualifier.
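The difference from C++ `&` parameters shows up as soon as a function rebinds its parameters. A minimal sketch, using a hypothetical `swap` function that is not part of the original answer:

```python
def swap(a, b):
    # Rebinds only the local names a and b; the caller's names are untouched.
    a, b = b, a
    return a, b

x, y = 1, 2
swapped = swap(x, y)
print(x, y)      # 1 2
print(swapped)   # (2, 1)
```

With true pass-by-reference, `x` and `y` would have been exchanged in the caller; here only the copies of the "pointers" were exchanged.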
I'm an absolute Python newbie and I have some trouble with the following function. I hope you can help me. Thank you very much for your help in advance!
I have created a list of zip-files in a directory via a list-comprehension:
zips_in_folder = [file for file in os.listdir(my_path) if file.endswith('.zip')]
I then wanted to define a function that replaces the character at a certain index in every element of the list with "-":
print(zips_in_folder)
def replacer_zip_names(r_index, replacer, zips_in_folder=zips_in_folder):
    for index, element in enumerate(zips_in_folder):
        x = list(element)
        x[r_index] = replacer
        zips_in_folder[index]=''.join(x)
replacer_zip_names(5,"-")
print(zips_in_folder)
Output:
['12345#6', '22345#6']
['12345-6', '22345-6']
The function worked, but here is what I cannot wrap my head around: why does my function update the actual list "zips_in_folder"? I thought the "zips_in_folder" list within the function would only be a "shadow" of the actual list outside the function. Is the scope of the for-loop global instead of local in this case?
In other functions I wrote the scope of the variables was always local...
I was searching for an answer for hours now, I hope my question isn't too obvious!
Thanks again!
Best
Felix
This is a rather intermediate topic. In one line: Python is pass-by-object-reference.
What this means
zips_in_folder is an object. An object has a reference (think of it like an address) that points to its location in memory. To access an object, you need to use its reference.
Now, here's the key part:
For objects Python passes their reference as value
This means that a copy of the reference of the object is created but again, the new reference is pointing to the same location in the memory.
As a consequence, if you use the reference's copy to access the object, then the original object will be modified.
In your function, zips_in_folder is a variable storing a new copy of the reference.
The following line is using the new copy to access the original object:
zips_in_folder[index]=''.join(x)
However, if you decide to reassign the variable that is storing the reference, nothing will be done to the object, or its original reference, because you just reassigned the variable storing the copy of the reference, you did not modify the original object. Meaning that:
def reassign(a):
    a = []
a = [1,0]
reassign(a)
print(a) # output: [1, 0]
A simple way to think about it is that lists are mutable, this means that the following will be true:
a = [1, 2, 3]
b = a # a, b are referring to the same object
a[1] = 20 # b now is [1, 20, 3]
That is because lists are objects in Python, and names only hold references to them, so the function changes the "original" list, i.e. it doesn't make a local copy of it.
The same is true for any class, user-defined or otherwise: a function manipulating an object will not make a copy of the object, it will change the "original" object passed to it.
If you know C++ or another lower-level language, this is close to passing a pointer by value: the function can modify the object it was given, but rebinding the parameter inside the function does not affect the caller's variable, so it is not quite the same as C++ pass-by-reference.
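To illustrate the point about user-defined classes, here is a small sketch. The `Account` class and the functions `deposit` and `replace` are invented for this example only.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

def deposit(account, amount):
    # Mutates the object that the caller also refers to.
    account.balance += amount

def replace(account):
    # Rebinds the local name only; the caller's object is not affected.
    account = Account(0)
    return account

acct = Account(100)
deposit(acct, 50)
fresh = replace(acct)
print(acct.balance)   # 150
print(fresh is acct)  # False
```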
I've found this statement in one of the answers to this question.
What does it mean? I would have no problem if the statement were "Python never implicitly copies dictionary objects". I believe tuples, lists, sets etc. are considered "objects" in Python, but the problem with dictionaries as described in the question doesn't arise with them.
The statement in the linked answer is broader than it should be. Implicit copies are rare in Python, and in the cases where they happen, it is arguable whether Python is performing the implicit copy, but they happen.
What is definitely true is that the default rules of name assignment do not involve a copy. By default,
a = b
will not copy the object being assigned to a. This default can be overridden by a custom local namespace object, which can happen when using exec or a metaclass with a __prepare__ method, but doing so is extremely rare.
As for cases where implicit copies do happen, the first that comes to mind is that the multiprocessing standard library module performs implicit copies all over the place, which is one of the reasons that multiprocessing causes a lot of confusion. Assignments other than name assignment may also involve copies; a.b = c, a[b] = c, and a[b:c] = d may all involve copies, depending on what a is. a[b:c] = d is particularly likely to involve copying d's data, although it will usually not involve producing an object that is a copy of d.
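A small sketch of the difference between name assignment and slice assignment on a plain list (the variable names are arbitrary):

```python
a = [0, 1, 2, 3]
d = [8, 9]

b = a          # name assignment: no copy, b is the same list as a
a[1:3] = d     # slice assignment: d's items are copied into a
d.append(10)   # later changes to d do not show up in a

print(a)       # [0, 8, 9, 3]
print(b is a)  # True
print(d)       # [8, 9, 10]
```

The slice assignment copied the contents of `d` into `a` without ever creating a separate list object that is a copy of `d`.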
Python has many different types. They fall into two groups:
1) immutable (cannot be changed in place) - int, str, tuple
2) mutable (can be changed in place) - list, dict
For example:
- immutable
x = 10
For this 'x', Python creates a new object of type 'int' at some memory address, say 0x0001f0a.
x += 1 # x = x + 1
Python binds 'x' to a different 'int' object at another address, say 0x1003c00; the original object is not modified.
- mutable
x = [1, 2, 'spam']
For this 'x', Python creates a new object of type 'list' at some memory address, say 0x0001f0a.
y = x
Python copies the reference from 'x' to 'y', so both names refer to the same list.
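The two cases can be checked with the built-in `id()` and the `is` operator. This is only a sketch; the actual identifiers differ from run to run and will not match the example addresses above.

```python
x = 10
before = id(x)
x += 1                      # int is immutable: x is rebound to a different object
changed = id(x) != before
print(changed)              # True

x = [1, 2, 'spam']
y = x                       # y now refers to the same list object
y.append('eggs')
print(x)                    # [1, 2, 'spam', 'eggs']
print(x is y)               # True
```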
I am quite new to programming, Python and object-oriented programming in general. For a school assignment, I had to write a code that defines a "polynomial" class and, with this, find the root of a polynomial. As the code did not behave as expected, I started analysing it and realised that a global variable, namely the list of coefficients representing the polynomial, which is the "input" for the polynomial class, was being modified. The problem is that I just can't seem to figure out what (part of the code) is causing this manipulation. Below is (the relevant part of) the code I use:
#set input list
il=[1,1,-1]
#define B!/(O-1)!, note that O is an "oh":
def pr(O,B):
    p=1
    for i in range(O,B+1):
        p*=i
    return p
#polynomial
class pol:
    #init:
    def __init__(self,L=[0]):
        self.l=L
        self.d=len(L)
        self.n=self.d-1
    #evaluate:
    def ev(self,X=0):
        if X==0:
            return self.l[0]
        else:
            s=self.l[0]
            for i in range(1,self.d):
                s+=self.l[i]*X**i
            return s
    #N-th derivative:
    def der(self,N=1):
        if self.n < N:
            return pol([0])
        else:
            lwork=self.l
            for i in range(N,self.d):
                lwork[i]*=pr(i-N+1,i)
            return pol(lwork[N:])
#define specific polynomial and take derivative:
#---here I have put some extra prints to make clear what the problem is---
f=pol(il)
print(il)
fd=f.der()
print(il)
fd2=f.der(2)
print(il)
Now this should evaluate to (at least it does so on my machine)
[1,1,-1]
[1,1,-2]
[1,1,-4]
while I expect it to be just the first list three times, since in the definition of the method "der", I do not manipulate the input list, or so it seems to me.
Can someone explain what's going on? Am I missing a (simple) detail, or am I misusing (some aspect of) classes here?
To execute my code, I use an online compiler (repl.it), running on Python 3.5.2.
One: Never use a mutable default argument for a function/method of any kind.
Two: Assigning from one name to another, as in:
lwork=self.l
is just aliasing: lwork becomes a reference to the same list as self.l. If you don't want to change self.l, (shallow) copy it. For simple sequences like list, for example:
lwork = self.l[:]
That will make a new list with the same values as self.l. Since the values are all immutable, the shallow copy was enough; if the values might be mutable, you'd want to use the copy module's copy.deepcopy to ensure the copy has no ties to the original list.
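A short sketch of when a shallow copy is enough and when `copy.deepcopy` is needed. The lists `flat` and `nested` are made-up examples.

```python
import copy

flat = [1, 1, -1]
flat_copy = flat[:]            # shallow copy is enough for immutable items
flat_copy[2] *= 2
print(flat)                    # [1, 1, -1]

nested = [[1, 2], [3, 4]]
shallow = nested[:]            # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists
nested[0].append(99)
print(shallow[0])              # [1, 2, 99]
print(deep[0])                 # [1, 2]
```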
Similarly, if you don't want to preserve a tie between the list passed to the pol initializer and the list stored on the instance, make a copy of it, e.g.:
self.l = list(L)
In this case, I used list(L) instead of L[:] because it gets us a guaranteed type (list), from any input iterable type. This actually makes the mutable default argument safe (because you always shallow copy it, so no one is ever actually mutating it), but even so, mutable defaults are usually considered code smell, so it's best to avoid them.
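For readers who have not run into the mutable-default pitfall before, here is a minimal sketch. The functions `append_bad` and `append_good` are hypothetical and not part of the question's code.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # None as a sentinel: a fresh list is created on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_bad(1)
second = append_bad(2)
print(second)           # [1, 2]
print(first is second)  # True
print(append_good(1))   # [1]
print(append_good(2))   # [2]
```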
Fixing up the whole __init__ method, you'd end up with:
# Mostly to avoid code smell, use immutable default (list constructor converts to list)
def __init__(self, L=(0,)):
    self.l = list(L)      # Create list from arbitrary input iterable
    self.d = len(self.l)  # Get length of now guaranteed list (so iterator inputs work)
    self.n = self.d-1
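Putting both fixes together, a trimmed-down version of the class might look like the sketch below. It keeps the names from the question (`pr`, `pol`, `der`, `il`) but omits `ev`, and it is one possible arrangement rather than the only correct one.

```python
def pr(O, B):
    # Product O * (O+1) * ... * B, i.e. B!/(O-1)!
    p = 1
    for i in range(O, B + 1):
        p *= i
    return p

class pol:
    def __init__(self, L=(0,)):
        self.l = list(L)        # private copy of the coefficients
        self.d = len(self.l)
        self.n = self.d - 1

    def der(self, N=1):
        if self.n < N:
            return pol([0])
        lwork = self.l[:]       # work on a copy, not on self.l
        for i in range(N, self.d):
            lwork[i] *= pr(i - N + 1, i)
        return pol(lwork[N:])

il = [1, 1, -1]
f = pol(il)
fd = f.der()
fd2 = f.der(2)
print(il)     # [1, 1, -1]
print(fd.l)   # [1, -2]
print(fd2.l)  # [-2]
```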
We all know the dogma that global variables are bad. When I began to learn Python, I read that parameters passed to functions are treated as local variables inside the function. This seems to be only half of the truth:
def f(i):
    print("Calling f(i)...")
    print("id(i): {}\n".format(id(i)))
    print("Inside f(): i += 1")
    i += 1
    print("id(i): {}".format(id(i)))
    return
i = 1
print("\nBefore function call...")
print("id(i): {}\n".format(id(i)))
f(i)
This evaluates to:
Before function call...
id(i): 507107200
Calling f(i)...
id(i): 507107200
Inside f(): i += 1
id(i): 507107232
As I have now read, the calling mechanism of functions in Python is "call by object reference". This means an argument is initially passed by its object reference, but if it is modified inside the function, a new object is created. This seems reasonable to me, as it avoids a design in which functions unintentionally modify global variables.
But what happens if we pass a list as an argument?
def g(l):
    print("Calling f(l)...")
    print("id(l): {}\n".format(id(l)))
    print("Inside f(): l[0] += 1")
    l[0] += 1
    print("id(l): {}".format(id(l)))
    return
l = [1, 2, 3]
print("\nBefore function call...")
print("id(l): {}\n".format(id(l)))
g(l)
This results in:
Before function call...
id(l): 120724616
Calling f(l)...
id(l): 120724616
Inside f(): l[0] += 1
id(l): 120724616
As we can see, the object reference remains the same! So we work on a global variable, don't we?
I know we can easily overcome this by passing a copy of the list to the function with:
g(l[:])
But my question is: What is the reason to implement two different behaviors for function parameters in Python? If we intend to manipulate a global variable, we could also use the "global" keyword for lists like we would for integers, couldn't we? How is this behavior consistent with the Zen of Python's "explicit is better than implicit"?
Python has two kinds of objects: mutable and immutable. Most built-in types, like int, str or float, are immutable. This means they cannot change. Types like list, dict or array are mutable, which means that their state can be changed. Almost all user-defined objects are mutable too.
When you do i += 1 on an int, you assign a new value to i, which is i + 1. This doesn't mutate the object in any way; it just tells the name i to forget its old value and refer to the value of i + 1. So i now refers to a completely new object.
But when you do i[0] += 1 on a list, you tell the list to replace element 0 with i[0] + 1. This means that id(i[0]) will change, because element 0 is now a new object, and the state of the list i will change, but its identity remains the same: it's the same object it was, only changed.
Note that this is not possible for strings: they are immutable, so item assignment such as s[0] = 'x' raises a TypeError, and "changing" one character means building a new string object.
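The behaviour of strings can be verified directly. A minimal sketch with arbitrary variable names:

```python
s = "spam"
error = None
try:
    s[0] = "S"            # item assignment is not supported by str
except TypeError as exc:
    error = type(exc).__name__
print(error)              # TypeError

t = "S" + s[1:]           # build a new string instead
print(t)                  # Spam
print(s)                  # spam
```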
Why are int & list function parameters differently treated?
They are not. All parameters are treated the same, regardless of type.
You are seeing different behavior between the two cases because you are doing different things to l.
First, let's simplify the += into an = and a +: l = l + 1 in the first case, and l[0] = l[0] + 1 in the second. (+= doesn't always equal an assignment and +; it depends on the runtime class of the object on the left side, which can override it; but here, for ints, it is equivalent to an assignment and +.) Also, the right side of the assignment just reads stuff and is not interesting, so let's just ignore it for now; so you have:
l = something (in the first case)
l[0] = something (in the second case)
The second one is "assigning to an element", which is actually syntactic sugar for a call to the method .__setitem__():
l.__setitem__(0, something)
So now you can see the difference between the two --
In the first case, you are assigning to the variable l. Python is pass-by-value, so this has no effect on outside code. Assigning to the variable simply makes it point to a new object; it has no effect on the object that it used to point to. If you had assigned something to l in the second case, it would also have had no effect on the original object.
In the second case, you are calling a method on the object pointed to by l. This method happens to be a mutating method on lists, and so it modifies the contents of the list object, which is the same list object whose pointer was passed in to the function. It is true that int (the runtime class of l in the first case) happens to have no methods that are mutating, but that is beside the point.
If you had done the same thing to l in both cases (if that were possible), then you can expect the same semantics.
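The remark that += is not always the same as an assignment plus + can be seen with lists, where += is overridden to extend the list in place. A sketch with two hypothetical functions, `rebind` and `extend_in_place`:

```python
def rebind(l):
    l = l + [4]    # builds a new list and rebinds the local name
    return l

def extend_in_place(l):
    l += [4]       # list overrides +=, so the caller's list is mutated
    return l

nums = [1, 2, 3]
r = rebind(nums)
print(nums)        # [1, 2, 3]
print(r is nums)   # False

e = extend_in_place(nums)
print(nums)        # [1, 2, 3, 4]
print(e is nums)   # True
```

The parameter is handled identically in both functions; only the operation applied to it differs.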
This is pretty common across a bunch of languages (Ruby, for example).
The variable itself is scoped to the function. But that variable is just a pointer to an object floating around in memory somewhere -- and that object can be changed.
In Python everything is an object, and hence everything is represented by reference. The most notable thing about variables in Python is that they contain references to objects, not the objects themselves. When arguments are passed to a function, it is these references that are passed. Consequently, inside the scope of a function, every parameter is bound to the reference of its argument and then treated as a local variable. When you assign a new value to a parameter, you change which object the parameter refers to; the parameter now names a different object, and any changes to that object (even if it is mutable) will not be seen outside the function and have nothing to do with the passed argument. That said, as long as you don't rebind the parameter, it keeps holding the reference of the argument, and any changes made through it (if and only if the object is mutable) will be seen outside the scope of the function.
There is this code:
# assignment behaviour for integer
a = b = 0
print(a, b)  # prints 0 0
a = 4
print(a, b)  # prints 4 0 - different!
# assignment behaviour for class object
class Klasa:
    def __init__(self, num):
        self.num = num
a = Klasa(2)
b = a
print(a.num, b.num)  # prints 2 2
a.num = 3
print(a.num, b.num)  # prints 3 3 - the same!
Questions:
1) Why does the assignment operator work differently for a fundamental type and a class object (for fundamental types it copies by value, for class objects it copies by reference)?
2) How can I copy class objects only by value?
3) How can I make references for fundamental types, like int& b = a in C++?
This is a stumbling block for many Python users. The object reference semantics are different from what C programmers are used to.
Let's take the first case. When you say a = b = 0, a new int object is created with value 0 and two references to it are created (one is a and another is b). These two variables point to the same object (the integer which we created). Now, we run a = 4. A new int object of value 4 is created and a is made to point to that. This means, that the number of references to 4 is one and the number of references to 0 has been reduced by one.
Compare this with a = 4 in C where the area of memory which a "points" to is written to. a = b = 4 in C means that 4 is written to two pieces of memory - one for a and another for b.
Now the second case: a = Klasa(2) creates an object of type Klasa, increments its reference count by one and makes a point to it. b = a simply takes what a points to, makes b point to the same thing and increments the reference count of that thing by one. It's the same as what would happen if you did a = b = Klasa(2). Printing a.num and b.num gives the same value, since you're dereferencing the same object and printing an attribute value. You can use the id builtin function to see that the object is the same (id(a) and id(b) will return the same identifier). Now, you change the object by assigning a value to one of its attributes. Since a and b point to the same object, you'd expect the change in value to be visible when the object is accessed via a or b. And that's exactly how it is.
Now, for the answers to your questions.
1) The assignment operator doesn't work differently for these two. All it does is add a reference to the RValue and make the LValue point to it. It's always "by reference" (although this term makes more sense in the context of parameter passing than simple assignments).
2) If you want copies of objects, use the copy module.
3) As I said in point 1, when you do an assignment, you always shift references. Copying is never done unless you ask for it.
Quoting from the Data Model documentation:
Objects are Python’s abstraction for data. All data in a Python program is represented by objects or by relations between objects. (In a sense, and in conformance to Von Neumann’s model of a “stored program computer,” code is also represented by objects.)
From Python's point of view, a fundamental data type is fundamentally different from one in C/C++; the term is used to map C/C++ data types to Python. So let's leave it out of the discussion for the time being and consider the fact that all data are objects and are instances of some class. Every object has an ID (somewhat like an address), a value, and a type.
Assignment copies the reference, not the object. For example:
>>> x=20
>>> y=x
>>> id(x)==id(y)
True
>>>
The only way to have a new instance is by creating one.
>>> x=3
>>> id(x)==id(y)
False
>>> x==y
False
This may sound complicated at first instance but to simplify a bit, Python made some types immutable. For example you can't change a string. You have to slice it and create a new string object.
Copying by reference often gives unexpected results. For example,
x=[[0]*8]*8 may give you the impression that it creates a two-dimensional list of 0s. In fact it creates a list of eight references to the same inner list [0]*8. So assigning to x[1][1] would change that position in every row at the same time.
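The aliasing can be made visible with a short sketch, together with a list comprehension that builds independent rows instead (the name `y` is just for illustration):

```python
x = [[0] * 8] * 8                   # eight references to one inner list
x[1][1] = 5
print(x[0][1], x[7][1])             # 5 5
print(x[0] is x[1])                 # True

y = [[0] * 8 for _ in range(8)]     # eight separate inner lists
y[1][1] = 5
print(y[0][1], y[1][1])             # 0 5
```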
The copy module provides a function called deepcopy that creates a fully independent new instance of the object, rather than a shallow copy. This is useful when you intend to have two distinct objects and manipulate them separately, as you intended in your second example.
To extend your example
>>> import copy
>>> class Klasa:
...     def __init__(self, num):
...         self.num = num
...
>>> a = Klasa(2)
>>> b = copy.deepcopy(a)
>>> print(a.num, b.num)
2 2
>>> a.num = 3
>>> print(a.num, b.num)  # now different!
3 2
It doesn't work differently. In your first example, you changed a so that a and b reference different objects. In your second example, you did not, so a and b still reference the same object.
Integers, by the way, are immutable. You can't modify their value. All you can do is make a new integer and rebind your reference. (like you did in your first example)
Suppose you and I have a common friend. If I decide that I no longer like her, she is still your friend. On the other hand, if I give her a gift, your friend received a gift.
Assignment doesn't copy anything in Python, and "copy by reference" is somewhere between awkward and meaningless (as you actually point out in one of your comments). Assignment causes a variable to begin referring to a value. There aren't separate "fundamental types" in Python; while some of them are built-in, int is still a class.
In both cases, assignment causes the variable to refer to whatever it is that the right-hand-side evaluates to. The behaviour you're seeing is exactly what you should expect in that environment, per the metaphor. Whether your "friend" is an int or a Klasa, assigning to an attribute is fundamentally different from reassigning the variable to a completely other instance, with the correspondingly different behaviour.
The only real difference is that the int doesn't happen to have any attributes you can assign to. (That's the part where the implementation actually has to do a little magic to restrict you.)
You are confusing two different concepts of a "reference". The C++ T& is a magical thing that, when assigned to, updates the referred-to object in-place, and not the reference itself; that can never be "reseated" once the reference is initialized. This is useful in a language where most things are values. In Python, everything is a reference to begin with. The Pythonic reference is more like an always-valid, never-null, not-usable-for-arithmetic, automatically-dereferenced pointer. Assignment causes the reference to start referring to a different thing completely. You can't "update the referred-to object in-place" by replacing it wholesale, because Python's objects just don't work like that. You can, of course, update its internal state by playing with its attributes (if there are any accessible ones), but those attributes are, themselves, also all references.
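If something resembling `int& b = a` is really wanted, one common workaround is to share a mutable holder and update its state rather than rebinding the name. The `Box` class below is a hypothetical helper written for this sketch, not a standard library type.

```python
class Box:
    # Hypothetical mutable holder for a single value.
    def __init__(self, value):
        self.value = value

a = Box(0)
b = a            # b and a refer to the same Box
a.value = 4      # updates internal state, visible through both names
shared = b.value
print(shared)    # 4

a = Box(7)       # rebinding a does not affect b
print(b.value)   # 4
```

Note that this still follows the rules described above: the sharing comes from both names referring to one object, not from any special kind of reference.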