Is it true that "Python never implicitly copies objects"? - python

I've found this statement in one of the answers to this question.
What does it mean? I would have no problem if the statement were "Python never implicitly copies dictionary objects". I believe tuples, lists, sets etc are considered "object" in python but the problem with dictionary as described in the question doesn't arise with them.

The statement in the linked answer is broader than it should be. Implicit copies are rare in Python, and in the cases where they happen, it is arguable whether Python is performing the implicit copy, but they happen.
What is definitely true is that the default rules of name assignment do not involve a copy. By default,
a = b
will not copy the object being assigned to a. This default can be overridden by a custom local namespace object, which can happen when using exec or a metaclass with a __prepare__ method, but doing so is extremely rare.
As for cases where implicit copies do happen, the first that comes to mind is that the multiprocessing standard library module performs implicit copies all over the place, which is one of the reasons that multiprocessing causes a lot of confusion. Assignments other than name assignment may also involve copies; a.b = c, a[b] = c, and a[b:c] = d may all involve copies, depending on what a is. a[b:c] = d is particularly likely to involve copying d's data, although it will usually not involve producing an object that is a copy of d.
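To make the distinction concrete, here is a minimal sketch using plain lists (written for Python 3; the variable names are illustrative only). It assumes nothing beyond built-in list behaviour: name assignment binds a second name to the same object, while slice assignment copies the item references of the right-hand side into the target without turning the target into an alias of it.

```python
b = [1, 2, 3]
a = b                      # name assignment: no copy, both names share one list
shares_object = a is b

d = [10, 20]
target = [0, 0, 0, 0]
target[1:3] = d            # slice assignment: d's items are copied into target

d.append(30)               # mutating d afterwards does not touch target
```

After this runs, target still holds the two items it received from d, and target and d remain separate objects.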

Python has many built-in types. They fall into two groups:
1) immutable (cannot be changed in place) - integer, string, tuple
2) mutable (can be changed in place) - list, dictionary
For example:
- immutable
x = 10
For this 'x', Python creates a new object of type 'int' at some memory address, e.g. 0x0001f0a, and binds 'x' to it.
x += 1 # x = x + 1
Python binds 'x' to a different object at a different address, e.g. 0x1003c00.
- mutable
x = [1, 2, 'spam']
For this 'x', Python creates a new object of type 'list' at some memory address, e.g. 0x0001f0a.
y = x
Python copies the reference from 'x' to 'y'; both names now refer to the same list.
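The two cases above can be checked with id() and the is operator. This is only a sketch (Python 3 syntax; the names items and alias are made up for the illustration), and the actual addresses will differ from the ones quoted in the text.

```python
x = 10
id_before = id(x)
x += 1                       # int is immutable: x is rebound to a different object
int_rebound = id(x) != id_before

items = [1, 2, 'spam']
alias = items                # copies the reference, not the list
alias.append('eggs')         # the mutation is visible through both names
```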

Related

Are numbers considered objects in python?

I am aware that numeric values are immutable in python. I have also read how everything is an object in python. I just want to know if numeric types are also objects in python. Because if they are objects, then the variables are actually reference variables right? Does it mean that if I pass a number to a function and modify it inside a function, then two number objects with two references are created? Is there a concept of primitive data types in python?
Note: I too thought of them as objects, but visualizing the code in Python Tutor suggests otherwise:
http://www.pythontutor.com/visualize.html#mode=edit
def test(a):
    a += 10

b = 100
test(b)
Or is it a defect in the visualization tool?
Are numeric types objects?
>>> isinstance(1, object)
True
Apparently they are. :-).
Note that you might need to adjust your mental model of an object a little. It seems to me that you're thinking of object as something that is "mutable" -- that isn't the case. In reality, we need to think of python names as a reference to an object. That object may hold references to other objects.
name = something
Here, the right hand side is evaluated -- All the names are resolved into objects and the result of the expression (an object) is referenced by "name".
Ok, now lets consider what happens when you pass something to a function.
def foo(x):
    x = 2

z = 3
foo(z)
print(z)
What do we expect to happen here? Well, first we create the function foo. Next, we create the object 3 and reference it by the name z. After that, we look up the value that z references and pass that value to foo. Upon entering foo, that value gets referenced by the (local) name x. We then create the object 2 and reference it by the local name x. Note, x has nothing to do with the global z -- they're independent references. Just because they were referencing the same object when you enter the function doesn't mean that they have to reference the same object for all time. We can change what a name references at any point by using an assignment statement.
Note, your example with += may seem to complicate things, but you can think of a += 10 as a = a + 10 if it helps in this context. For more information on += check out: When is "i += x" different from "i = i + x" in Python?
Everything in Python is an object, and that includes the numbers. There are no "primitive" types, only built-in types.
Numbers, however, are immutable. When you perform an operation with a number, you are creating a new number object.
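As a rough illustration of the points above, the sketch below contrasts += on an int with += on a list inside a function. The function names add_ten and extend_in_place are hypothetical and exist only for this example.

```python
def add_ten(a):
    a += 10            # for an int this rebinds the local name a to a new object
    return a

b = 100
result = add_ten(b)    # b still refers to the object 100


def extend_in_place(seq):
    seq += [10]        # for a list, += mutates the existing object in place
    return seq

numbers = [1, 2]
same = extend_in_place(numbers)   # the caller's list has grown
```

In both calls the function receives a reference to the caller's object; the only difference is whether the object's type supports in-place modification.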

Python: Create List of Object References

I want to clean up some code I've written, in order to scale the magnitude of what I'm trying to do. In order to do so, I'd like to ideally create a list of references to objects, so that I can systematically set the objects, using a loop, without actually have to put the objects in list. I've read about the way Python handles references and pass-by, but haven't quite found a way to do this effectively.
To better demonstrate what I'm trying to do:
I'm using bokeh, and would like to set up a large number of select boxes. Each box looks like this
select_one_name = Select(
title = 'test',
value = 'first_value',
options = ['first_value', 'second_value', 'etc']
)
Setting up each select is fine, when I only have a few, but when I have 20, my code gets very long and unwieldy. What I'd like to be able to do, is have a list of sample_list = [select_one_name, select_two_name, etc] that I can then loop through, to set the values of each select_one_name, select_two_name, etc. However, I want to have my reference select_one_name still point to the correct value, rather than necessarily refer to the value by calling sample_list[0].
I'm not sure if this is do-able--if there's a better way to do this, than creating a list of references, please let me know. I know that I could just create a list of objects, but I'm trying to avoid that.
For reference, I'm on Python 2.7, Anaconda distribution, Windows 7. Thanks!
To follow up on @Alex Martelli's post below:
The reason why I thought this might not work, is because when I tried a mini-test with a list of lists, I didn't get the results I wanted. To demonstrate
x = [1, 2, 3]
y = [4, 5, 6]
test = [x, y]
test[0].append(1)
Results in x = [1, 2, 3, 1] but if instead, I use test[0] = [1, 2], then x remains [1, 2, 3], although test itself reflects the change.
Drawing a parallel back to my original example, I thought that I would see the same results as from setting to equal. Is this not true?
Every Python list always is internally an array of references (in CPython, which is no doubt what you're using, at the C level it's an array of PyObject* -- "pointers to Python objects").
No copies of the objects get made implicitly: rather (again, in CPython) each object's reference count gets incremented when you add "the object" (actually a reference to it) to the list. In fact when you do want an object's copy you need to specifically ask for one (with the copy module in general, or sometimes with type-specific copy methods).
Multiple references to the same object are internally pointers to exactly the same memory. If an object is mutable, then mutating it gets reflected through all the references to it. Of course, there are immutable objects (strings, numbers, tuples, ...) to which such mutation cannot apply.
So when you do, e.g,
sample_list = [select_one_name, select_two_name, etc]
each of the names (as long as it's in scope) still refers to exactly the same object as the corresponding item in sample_list.
In other words, using sample_list[0] and select_one_name is totally equivalent as long as both references to the same object exist.
IOW squared, your stated purpose is already accomplished by Python's most fundamental semantics. Now, please edit the Q to clarify which behavior you're observing that seems to contradict this, versus which behavior you think you should be observing (and desire), and we may be able to help further -- because to this point all the above observations amount to "you're getting exactly the semantics you ask for" so "steady as she goes" is all I can practically suggest!-)
Added (better here in the answer than just below in comments:-): note the focus on mutating operation. The OP tried test[0] = somelist followed by test[0].append and saw somelist mutated accordingly; then tried test[0] = [1, 2] and was surprised to see somelist not changed. But that's because assignment to a reference is not a mutating operation on the object that said reference used to indicate! It just re-seats the reference, decrements the previously-referred-to object's reference count, and that's it.
If you want to mutate an existing object (which needs to be a mutable one in the first place, but, a list satisfies that), you need to perform mutating operations on it (through whatever reference, doesn't matter). For example, besides append and many other named methods, one mutating operation on a list is assignment to a slice, including the whole-list slice denoted as [:]. So, test[0][:] = [1,2] would in fact mutate somelist -- very different from test[0] = [1,2] which assigns to a reference, not to a slice.
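The difference between mutating through a reference and re-seating it can be sketched as follows. The lists mirror the ones in the question, except that the outer list is called nested here (an arbitrary name chosen for the example).

```python
x = [1, 2, 3]
y = [4, 5, 6]
nested = [x, y]

nested[0].append(1)       # mutates the object that x and nested[0] both refer to
after_append = list(x)    # snapshot of x at this point

nested[0][:] = [1, 2]     # slice assignment also mutates that same object
after_slice = list(x)

nested[0] = [7, 8]        # re-seats the slot only; x is left untouched
```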
This is not recommended, but it works.
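sample_list = ["select_one_name", "select_two_name", "select_three_name"]
for select in sample_list:
    locals()[select] = Select(
        title='test',
        value='first_value',
        options=['first_value', 'second_value', 'etc']
    )
You can then use select_one_name, select_two_name, etc. directly, because they are set in the local scope through the special locals() dictionary. Note that this only works reliably at module level, where locals() is the module's global namespace; inside a function, writing to locals() does not create local variables.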
A cleaner approach is to use a dictionary, e.g.
selects = {
    'select_one_name': Select(...),
    'select_two_name': Select(...),
    'select_three_name': Select(...)
}
Then reference selects['select_one_name'] in your code; you can iterate over selects.keys() or selects.items().
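A sketch of the dictionary approach with a loop is shown below. FakeSelect is a hypothetical stand-in written for this example so that it runs without bokeh; it is not the real Select widget, and only the three keyword arguments from the question are assumed.

```python
class FakeSelect:
    """Hypothetical stand-in for bokeh's Select; not the real widget."""
    def __init__(self, title, value, options):
        self.title = title
        self.value = value
        self.options = options


names = ['select_one_name', 'select_two_name', 'select_three_name']
options = ['first_value', 'second_value', 'etc']

# Build every widget in one place instead of writing each one out by hand.
selects = {
    name: FakeSelect(title='test', value='first_value', options=list(options))
    for name in names
}

# Set all of them in a loop.
for widget in selects.values():
    widget.value = 'second_value'

# An ordinary name can still refer to the same object as the dictionary entry.
select_one_name = selects['select_one_name']
```

Because the dictionary stores references, select_one_name and selects['select_one_name'] are the same object, so a change made through either is visible through the other.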

Without pointers, can I pass references as arguments in Python? [duplicate]

This question already has answers here:
How do I pass a variable by reference?
(39 answers)
Closed 8 years ago.
Since Python doesn't have pointers, I am wondering how I can pass a reference to an object through to a function instead of copying the entire object. This is a very contrived example, but say I am writing a function like this:
def some_function(x):
    c = x/2 + 47
    return c

y = 4
z = 12
print some_function(y)
print some_function(z)
From my understanding, when I call some_function(y), Python allocates new space to store the argument value, then erases this data once the function has returned c and it's no longer needed. Since I am not actually altering the argument within some_function, how can I simply reference y from within the function instead of copying y when I pass it through? In this case it doesn't matter much, but if y was very large (say a giant matrix), copying it could eat up some significant time and space.
Your understanding is, unfortunately, completely wrong. Python does not copy the value, nor does it allocate space for a new one. It passes a value which is itself a reference to the object. If you modify that object (rather than rebinding its name), then the original will be modified.
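One way to convince yourself that no copy is made is to compare identities inside and outside the function. This is only a sketch; describe and big are names invented for the example, and a list stands in for the "giant matrix".

```python
def describe(matrix):
    # matrix is just another name for the caller's object; nothing was copied
    return id(matrix), len(matrix)

big = list(range(100000))
seen_id, seen_len = describe(big)
same_object = seen_id == id(big)
```

The cost of the call does not depend on the size of the object, because only a reference is passed.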
Edit
I wish you would stop worrying about memory allocation: Python is not C++, almost all of the time you don't need to think about memory.
It's easier to demonstrate rebinding via the use of something like a list:
def my_func(foo):
    foo.append(3)  # now the source list also has the number 3
    foo = [3]      # we've re-bound 'foo' to something else, severing the relationship
    foo.append(4)  # the source list is unaffected
    return foo

original = [1, 2]
new = my_func(original)
print original  # [1, 2, 3]
print new       # [3, 4]
It might help if you think in terms of names rather than variables: inside the function, the name "foo" starts off being a reference to the original list, but then we change that name to point to a new, different list.
Python parameters are always "references".
The way parameters work in Python, and the way they are explained in the docs, can be confusing and misleading to newcomers to the language, especially if you have a background in other languages that let you choose between "pass by value" and "pass by reference".
In Python terms, a "reference" is just a pointer with some more metadata to help the garbage collector do its job. And every variable and every parameter is always a "reference".
So, internally, Python passes a "pointer" for each parameter. You can easily see this in this example:
>>> def f(L):
...     L.append(3)
...
>>> X = []
>>> f(X)
>>> X
[3]
The variable X points to a list, and the parameter L is a copy of the "pointer" of the list, and not a copy of the list itself.
Take care to note that this is not the same as "pass-by-reference" as in C++ with the & qualifier, or Pascal with the var qualifier.
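The difference from C++-style pass-by-reference shows up when a function tries to rebind its parameters. The sketch below uses two hypothetical functions, swap and clear_and_fill, to contrast rebinding (invisible to the caller) with mutation (visible to the caller).

```python
def swap(left, right):
    left, right = right, left     # rebinds only the local names
    return left, right

a = [1]
b = [2]
swapped = swap(a, b)              # a and b in the caller are unchanged


def clear_and_fill(target, values):
    target[:] = values            # mutates the caller's list in place

c = [1, 2]
clear_and_fill(c, [9])            # c itself now holds the new contents
```

With a C++ reference parameter the swap would be visible to the caller; in Python only the mutation is.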

Assignment of objects and fundamental types [duplicate]

This question already has answers here:
Why variable = object doesn't work like variable = number
(10 answers)
Closed 4 years ago.
There is this code:
# assignment behaviour for integer
a = b = 0
print a, b # prints 0 0
a = 4
print a, b # prints 4 0 - different!
# assignment behaviour for class object
class Klasa:
    def __init__(self, num):
        self.num = num
a = Klasa(2)
b = a
print a.num, b.num # prints 2 2
a.num = 3
print a.num, b.num # prints 3 3 - the same!
Questions:
1. Why does the assignment operator work differently for fundamental types and class objects (for fundamental types it copies by value, for class objects it copies by reference)?
2. How to copy class objects only by value?
3. How to make references for fundamental types, like int& b = a in C++?
This is a stumbling block for many Python users. The object reference semantics are different from what C programmers are used to.
Let's take the first case. When you say a = b = 0, a new int object is created with value 0 and two references to it are created (one is a and another is b). These two variables point to the same object (the integer which we created). Now, we run a = 4. A new int object of value 4 is created and a is made to point to that. This means, that the number of references to 4 is one and the number of references to 0 has been reduced by one.
Compare this with a = 4 in C where the area of memory which a "points" to is written to. a = b = 4 in C means that 4 is written to two pieces of memory - one for a and another for b.
Now the second case. a = Klasa(2) creates an object of type Klasa, increments its reference count by one and makes a point to it. b = a simply takes what a points to, makes b point to the same thing and increments the reference count of the thing by one. It's the same as what would happen if you did a = b = Klasa(2). Printing a.num and b.num gives the same result, since you're dereferencing the same object and printing an attribute value. You can use the id builtin function to see that the object is the same (id(a) and id(b) will return the same identifier). Now, you change the object by assigning a value to one of its attributes. Since a and b point to the same object, you'd expect the change in value to be visible when the object is accessed via a or b. And that's exactly how it is.
Now, for the answers to your questions.
1. The assignment operator doesn't work differently for these two. All it does is add a reference to the RValue and make the LValue point to it. It's always "by reference" (although this term makes more sense in the context of parameter passing than simple assignments).
2. If you want copies of objects, use the copy module.
3. As I said in point 1, when you do an assignment, you always shift references. Copying is never done unless you ask for it.
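For the second point, here is a minimal sketch of asking for a copy explicitly with the copy module. The extra tags attribute is an assumption added to the question's Klasa class purely to show the difference between a shallow and a deep copy.

```python
import copy

class Klasa:
    def __init__(self, num, tags):
        self.num = num
        self.tags = tags          # hypothetical extra attribute holding a list


a = Klasa(2, ['x'])
shallow = copy.copy(a)            # new Klasa, but it shares the tags list with a
deep = copy.deepcopy(a)           # new Klasa with its own copy of tags

a.num = 3                         # rebinds an attribute on a only
a.tags.append('y')                # mutates the list that shallow still shares
```

After this, shallow.num and deep.num are both still 2, shallow.tags has picked up 'y', and deep.tags has not.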
Quoting from Data Model
Objects are Python’s abstraction for data. All data in a Python
program is represented by objects or by relations between objects. (In
a sense, and in conformance to Von Neumann’s model of a “stored
program computer,” code is also represented by objects.)
From Python's point of view, the notion of a fundamental data type is fundamentally different from that in C/C++; it is used only to map C/C++ data types to Python. So let's leave it out of the discussion for the time being and consider the fact that all data are objects and are instances of some class. Every object has an ID (somewhat like an address), a value, and a type.
All objects are copied by reference. For example:
>>> x=20
>>> y=x
>>> id(x)==id(y)
True
>>>
The only way to have a new instance is by creating one.
>>> x=3
>>> id(x)==id(y)
False
>>> x==y
False
This may sound complicated at first, but to simplify things a bit, Python made some types immutable. For example, you can't change a string; you have to slice it and create a new string object.
Copying by reference often gives unexpected results. For example,
x=[[0]*8]*8 may give you the feeling that it creates a two-dimensional list of 0s. In fact it creates a list of eight references to the same inner list object. So an assignment such as x[1][1] = 1 would end up changing every row at the same time.
The copy module provides a function called deepcopy that creates a fully independent new instance of the object, rather than a shallow copy that still shares nested objects. This is useful when you intend to have two distinct objects and manipulate them separately, just as you intended in your second example.
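The nested-list pitfall can be reproduced on a smaller grid. This sketch uses 3x3 instead of 8x8 to keep it readable; the names shared and independent are illustrative, and the list comprehension is one common way to get distinct rows.

```python
shared = [[0] * 3] * 3                       # three references to ONE inner list
shared[1][1] = 5                             # shows up in every row

independent = [[0] * 3 for _ in range(3)]    # a fresh inner list per row
independent[1][1] = 5                        # affects the middle row only
```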
To extend your example
>>> import copy
>>> class Klasa:
...     def __init__(self, num):
...         self.num = num
...
>>> a = Klasa(2)
>>> b = copy.deepcopy(a)
>>> print a.num, b.num # prints 2 2
2 2
>>> a.num = 3
>>> print a.num, b.num # prints 3 2 - different!
3 2
It doesn't work differently. In your first example, you changed a so that a and b reference different objects. In your second example, you did not, so a and b still reference the same object.
Integers, by the way, are immutable. You can't modify their value. All you can do is make a new integer and rebind your reference. (like you did in your first example)
Suppose you and I have a common friend. If I decide that I no longer like her, she is still your friend. On the other hand, if I give her a gift, your friend received a gift.
Assignment doesn't copy anything in Python, and "copy by reference" is somewhere between awkward and meaningless (as you actually point out in one of your comments). Assignment causes a variable to begin referring to a value. There aren't separate "fundamental types" in Python; while some of them are built-in, int is still a class.
In both cases, assignment causes the variable to refer to whatever it is that the right-hand-side evaluates to. The behaviour you're seeing is exactly what you should expect in that environment, per the metaphor. Whether your "friend" is an int or a Klasa, assigning to an attribute is fundamentally different from reassigning the variable to a completely other instance, with the correspondingly different behaviour.
The only real difference is that the int doesn't happen to have any attributes you can assign to. (That's the part where the implementation actually has to do a little magic to restrict you.)
You are confusing two different concepts of a "reference". The C++ T& is a magical thing that, when assigned to, updates the referred-to object in-place, and not the reference itself; that can never be "reseated" once the reference is initialized. This is useful in a language where most things are values. In Python, everything is a reference to begin with. The Pythonic reference is more like an always-valid, never-null, not-usable-for-arithmetic, automatically-dereferenced pointer. Assignment causes the reference to start referring to a different thing completely. You can't "update the referred-to object in-place" by replacing it wholesale, because Python's objects just don't work like that. You can, of course, update its internal state by playing with its attributes (if there are any accessible ones), but those attributes are, themselves, also all references.

assignment in python

I know that "variable assignment" in Python is in fact a binding / re-binding of a name (the variable) to an object.
This brings the question: is it possible to have proper assignment in python, eg make an object equal to another object?
I guess there is no need for that in python:
Immutable objects cannot be 'assigned to' since they can't be changed
Mutable objects could potentially be assigned to, since they can change, and this could be useful, since you may want to manipulate a copy of a dictionary separately from the original one. However, in these cases the Python philosophy is to offer a cloning method on the mutable object, so you can bind a copy rather than the original.
So I guess the answer is that there is no assignment in python, the best way to mimic it would be binding to a cloned object
I simply wanted to share the question in case I'm missing something important here
Thanks
EDIT:
Both Lie Ryan and Sven Marnach answers are good, I guess the overall answer is a mix of both:
For user defined types, use the idiom:
a.__dict__ = dict(b.__dict__)
(I guess this has problems as well if the assigned class has redefined attribute access methods, but let's not be fussy :))
For mutable built-ins (lists and dicts) use the cloning / copying methods they provide (e.g. slices, update)
Finally, immutable built-ins can't be changed so can't be assigned
I'll choose Lie Ryan because it's an elegant idiom that I hadn't thought of.
Thanks!
I think you are right with your characterization of assignment in Python -- I just would like to add a different method of cloning and ways of assignment in special cases.
"Copy-constructing" a mutable built-in Python object will yield a (shallow) copy of that object:
l = [2, 3]
m = list(l)
l is m
--> False
[Edit: As pointed out by Paul McGuire in the comments, the behaviour of a "copy constructor" (forgive me the C++ terminology) for an immutable built-in Python object is implementation dependent -- you might get a copy or just the same object. But because the object is immutable anyway, you shouldn't care.]
The copy constructor could be called generically by y = type(x)(x), but this seems a bit cryptic. And of course, there is the copy module which allows for shallow and deep copies.
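A small sketch of both options follows. The helper name clone is hypothetical; it simply wraps the type(x)(x) idiom from the text. It is applied only to mutable built-ins here, since for immutable ones the result may be the very same object.

```python
import copy

def clone(x):
    """Hypothetical helper wrapping the generic 'copy constructor' idiom."""
    return type(x)(x)

originals = [[1, 2], {'a': 1}, {1, 2}]            # a list, a dict and a set
clones = [clone(obj) for obj in originals]        # via type(x)(x)
copies = [copy.copy(obj) for obj in originals]    # via the copy module
```

Both approaches produce shallow copies: the new containers are distinct objects, but any nested objects would still be shared.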
Some Python objects allow assignment. For example, you can assign to a list without creating a new object:
l = [2, 3]
m = l
l[:] = [3, 4, 5]
m
--> [3, 4, 5]
For dictionaries, you could use the clear() method followed by update(otherdict) to assign to a dictionary without creating a new object. For a set s, you can use
s.clear()
s |= otherset
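The three in-place techniques above can be gathered into one function. This is a sketch under the assumption that only lists, dicts and sets need handling; assign_in_place is a hypothetical name, not a standard library function.

```python
def assign_in_place(target, source):
    """Hypothetical helper: make target equal to source without rebinding it."""
    if isinstance(target, list):
        target[:] = source            # whole-list slice assignment
    elif isinstance(target, dict):
        target.clear()
        target.update(source)
    elif isinstance(target, set):
        target.clear()
        target |= source              # in-place union, mutates the same set object
    else:
        raise TypeError("unsupported type: %s" % type(target).__name__)


l = [2, 3]
m = l
assign_in_place(l, [3, 4, 5])         # m sees the new contents

d = {'a': 1}
d_alias = d
assign_in_place(d, {'b': 2})

s = {1}
s_alias = s
assign_in_place(s, {2, 3})
```

In each case the alias still refers to the original object, which now has the new contents.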
"This brings the question: is it possible to have proper assignment in python, eg make an object equal to another object?"
Yes you can:
a.__dict__ = dict(b.__dict__)
will do the default assignment semantic in C/C++ (i.e. do a shallow assignment).
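To see what "shallow assignment" means here, consider the sketch below. Point and its attributes are made up for the example; the only thing taken from the answer is the a.__dict__ = dict(b.__dict__) idiom, which assumes instances that store their attributes in __dict__ (no __slots__).

```python
class Point:
    def __init__(self, x, tags):
        self.x = x
        self.tags = tags


a = Point(1, ['a'])
b = Point(2, ['b'])
alias = a                          # a second name for the object a refers to

a.__dict__ = dict(b.__dict__)      # a keeps its identity but takes on b's attributes

b.x = 99                           # rebinding on b does not affect a
b.tags.append('shared')            # shallow: the tags list is shared by a and b
```

The object behind a is unchanged in identity, so alias sees the new attributes too, but mutable attribute values are shared with b rather than duplicated.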
The problem with such generalized assignment is that it never works for everybody. In C++, you can override the assignment operator since you always have to pick whether you want a fully shallow assignment, fully deep assignment, or any shade between fully deep copy and fully shallow copy.
I don't think you are missing anything.
I like to picture variables in Python as names written on 'labels' that are attached to boxes and can be moved to another box by assignment, whereas in other languages, assignment changes the box's contents (and the assignment operator can be overloaded).
Beginners can write quite complex applications without being aware of that, but they are usually messy programs.
