Does tuple() copy the elements of the argument? - python

In python, does the built-in function tuple([iterable]) create a tuple object and fill it with copies of the elements of "iterable", or does it create a tuple containing references to the already existing objects of "iterable"?

tuple iterates over its argument and stores a reference to each item in a new tuple; the items themselves are not copied. The original iterable is not kept around to hold the values: the tuple is its own container. So yes, the conversion to a tuple is actual work and not just a wrapper around another type.
You can see this happening when converting a generator:
>>> def gen():
...     for i in range(5):
...         print(i)
...         yield i
...
>>> g = gen()
>>> g
<generator object gen at 0x00000000030A9B88>
>>> tuple(g)
0
1
2
3
4
(0, 1, 2, 3, 4)
As you can see, the generator is consumed immediately, producing all of its values. Afterwards, the tuple is self-contained, and no reference to the original source is kept. For reference, list() behaves in exactly the same way but creates a list instead.
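A minimal sketch of the "no reference to the original source is kept" point: once tuple() has consumed a generator, the generator is exhausted, so a second call has nothing left to collect. The names first and second are only illustrative.

```python
def gen():
    # Yield the integers 0..4 one at a time.
    for i in range(5):
        yield i

g = gen()
first = tuple(g)   # consumes every value the generator produces
second = tuple(g)  # the generator is already exhausted

print(first)   # (0, 1, 2, 3, 4)
print(second)  # ()
```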
The behaviour that 275365 pointed out (in the now deleted answer) is the standard behaviour of Python values. Because everything in Python is an object, you are essentially only working with references. So when references are copied, the underlying object is not copied. The important bit is that immutable objects cannot be changed in place: "changing" one creates a new object and rebinds only the reference you are currently assigning to, leaving all other existing references pointing at the old object. Mutable objects, by contrast, can be changed in place, and that change is visible through every reference. That’s why it works like this:
>>> source = [[1], [2], [3]]
>>> tpl = tuple(source)
>>> tpl
([1], [2], [3])
>>> tpl[0].append(4)
>>> tpl
([1, 4], [2], [3])
>>> source
[[1, 4], [2], [3]]
tpl still contains a reference to the original objects within the source list. As those are lists, they are mutable. Changing a mutable list anywhere will not invalidate the references that exist to that list, so the change will appear in both source and tpl. The actual source list however is only stored in source, and tpl has no reference to it:
>>> source.append(5)
>>> source
[[1, 4], [2], [3], 5]
>>> tpl
([1, 4], [2], [3])
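The same behaviour can be checked directly with the is operator. This is only a sketch of the example above, with same_items as an illustrative name: it confirms that the tuple's slots refer to the very same inner lists, and that rebinding a slot in source leaves the tuple alone while mutating a shared inner list does not.

```python
source = [[1], [2], [3]]
tpl = tuple(source)

# Each slot of the tuple refers to the very same inner list as the source.
same_items = all(s is t for s, t in zip(source, tpl))
print(same_items)  # True

# Rebinding a slot in the source list does not touch the tuple ...
source[0] = ["replaced"]
print(tpl[0])  # [1]

# ... but mutating a shared inner list is visible through both.
source[1].append(99)
print(tpl[1])  # [2, 99]
```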

Yes. I think the answer to both of your questions is "yes". When you create a new tuple from an existing iterable, it takes each item in turn and adds it to the new tuple object you are creating. Since variables in Python are actually names referencing objects, what gets copied is the reference, so the new tuple holds references to the same objects as the iterable.
I think this question on variable passing will be helpful.

tuple([iterable]) creates a tuple object holding references to the items of the iterable. However, if the argument is already a tuple, it returns that same object (this is what CPython does for exact tuple instances); otherwise it creates a new tuple object initialized from the iterable's items.
>>> a = (1,2)
>>> b = tuple(a)
>>> a is b
True
>>> c = [1,2]
>>> d = tuple(c)
>>> c is d
False
>>> c[0] is d[0]
True
>>> c[1] is d[1]
True
>>> type(c), type(d)
(<type 'list'>, <type 'tuple'>)
>>>

It will not copy or deep-copy the elements:
a = [{"key": "value"}]
x = tuple(a)
print(x)  #=> ({'key': 'value'},)
a[0]["key"] = "fish"
print(x)  #=> ({'key': 'fish'},)
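If an independent copy of the elements is actually wanted, one option is copy.deepcopy from the standard library. The sketch below assumes the elements are deep-copyable (plain dicts are); shared and independent are illustrative names.

```python
import copy

a = [{"key": "value"}]
shared = tuple(a)                      # holds the same dict object as a[0]
independent = tuple(copy.deepcopy(a))  # holds a new dict with equal contents

a[0]["key"] = "fish"

print(shared)       # ({'key': 'fish'},)
print(independent)  # ({'key': 'value'},)
```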

Related

Usage of "__add__" method in Tuple class

python noob here and playing around with the limitations of Tuples and Lists.
I don't have a problem, just a general query about usage of the __methodname__ methods in the Tuple class and/or in general.
I am aware that you cannot modify Tuples; to do so, you have to convert the Tuple to a list, modify said list, then convert back to a Tuple. But I had a play around with the __add__ method and found that it works. What are the issues and limitations with using this to create new Tuples with modifications to an existing one?
CODE:
myTuple = ('item1', 2, 'item3', ['list1', 'list2'])
tupleModification = myTuple.__add__(('newTupleItem1','newTupleItem2'))
This outputs the following:
('item1', 2, 'item3', ['list1', 'list2'], 'newTupleItem1', 'newTupleItem2')
which is correct, but I'm wondering if I'm playing with fire because I haven't seen this solution posted anywhere in relation to modifying Tuples.
EDIT: I am aware that you cannot modify existing Tuples and that this will create a new instance of one. I think I may have confused people with my naming conventions.
__add__ is the method which is called when you do:
myTuple + ("newTupleItem1", "newTupleItem2")
So this does not modify myTuple but creates a new tuple whose content is the content of myTuple concatenated with ("newTupleItem1", "newTupleItem2").
You can print myTuple to see that it has not been modified:
>>> myTuple
('item1', 2, 'item3', ['list1', 'list2'])
And you can check that myTuple and tupleModification are not the same object:
>>> myTuple is tupleModification
False
You cannot modify tuples, that's right. However, you can concatenate two existing tuples into a new tuple. This is done with the + operator, which in turn calls the __add__ method. The resulting tuple will not be a "modification" of any of the original ones, but a new distinct tuple. This is what the code you posted does. More concisely, you can just do:
myTuple = ('item1', 2, 'item3', ['list1', 'list2'])
tupleModification = myTuple + ('newTupleItem1','newTupleItem2')
print(tupleModification)
# ('item1', 2, 'item3', ['list1', 'list2'], 'newTupleItem1', 'newTupleItem2')
EDIT: Just as a clarification, you cannot "edit" a tuple anyhow, that is, add or remove elements from it, or change its contents. However, if your tuple contains a mutable object, such as a list, then that inner object can be modified:
myTuple = (1, [2, 3])
myTuple[1].append(4)
print(myTuple)
# (1, [2, 3, 4])
I think fundamentally you're confused about the difference between creating a new object with modifications (__add__ in this case) and modifying an existing object (extend for example).
__add__
As other answers already mentioned, the __add__ method implements the + operator, and it returns a new object. Tuples and lists each have one. For example:
>>> tuple_0 = (1,)
>>> tuple_1 = tuple_0.__add__((2,))
>>> tuple_1 is tuple_0
False
>>>
>>> list_0 = [1]
>>> list_1 = list_0.__add__([2])
>>> list_1 is list_0
False
extend
Lists, which are mutable, have an extend method, which modifies the existing object and returns None. Tuples, which are immutable, don't. For example:
>>> list_2 = [4, 5, 6]
>>> id_save = id(list_2)
>>> list_2.extend([7])
>>> id(list_2) == id_save
True
>>>
>>> tuple_2 = (4, 5, 6)
>>> tuple_2.extend([7])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'extend'
Lists have other methods which change the existing object, like append, sort, pop, etc, but extend is the most similar to __add__.
There is a difference between 1) modifying an existing tuple (in-place), and 2) leaving the original alone, but creating a new modified copy. These two different paradigms can be seen throughout computer programming, not just python tuples. For example, consider incrementing a number by one. You could modify the original, or you could leave the original alone, create a copy and then modify the copy.
MODIFYING IN-PLACE
BEFORE:
x == 5
AFTER:
x == 6
CREATING A MODIFIED COPY
BEFORE:
x == 5
AFTER:
x == 5 (unchanged)
y == 6
tuple.__add__ concatenates tuples. For example, (x, y) + (a, b, c) returns (x, y, a, b, c)
tuple.__add__ does not modify the original tuples. It leaves the original tuples alone, and creates a new tuple which is the concatenation of the original two. This is in contrast to something like list.append or list.extend, which modifies the original list instead of returning a modified copy.
Tuple methods generally do the following:
copy the original
modify the copy in some way
leave the original alone.
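One limitation worth keeping in mind with the concatenation approach: the new tuple is a distinct object, but it holds references to the same elements as the original. A short sketch using the question's data (my_tuple and new_tuple are renamed for illustration):

```python
my_tuple = ('item1', 2, 'item3', ['list1', 'list2'])
new_tuple = my_tuple + ('newTupleItem1', 'newTupleItem2')

print(new_tuple is my_tuple)        # False: a distinct tuple object
print(new_tuple[3] is my_tuple[3])  # True: the inner list is shared

# Mutating the shared inner list shows up in the original tuple too.
new_tuple[3].append('list3')
print(my_tuple[3])  # ['list1', 'list2', 'list3']
```

So the concatenation itself is safe, but mutable elements remain shared between the old and new tuples.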

What is the difference between del a[:] and a = [] when I want to empty a list called a in python?

Please, what is the most efficient way of emptying a list?
I have a list a = [1,2,3]. To delete the content of the list I usually write a = []. I came across the del statement in Python. I want to know if there is a difference between del a[:] and what I use.
There is a difference, and it has to do with whether that list is referenced from multiple places/names.
>>> a = [1, 2, 3]
>>> b = a
>>> del a[:]
>>> print(b)
[]
>>> a = [1, 2, 3]
>>> b = a
>>> a = []
>>> print(b)
[1, 2, 3]
Using del a[:] clears the existing list, which means anywhere it's referenced will become an empty list.
Using a = [] sets a to point to a new empty list, which means that other places the original list is referenced will remain non-empty.
The key to understanding here is to realize that when you assign something to a variable, it just makes that name point to a thing. Things can have multiple names, and changing what a name points to doesn't change the thing itself.
This can probably best be shown:
>>> a = [1, 2, 3]
>>> id(a)
45556280
>>> del a[:]
>>> id(a)
45556280
>>> b = [4, 5, 6]
>>> id(b)
45556680
>>> b = []
>>> id(b)
45556320
When you write a[:] you are referring to all elements within the list "assigned" to a. The del statement removes references to objects. So, del a[:] says "remove all references to objects from within the list assigned to a". The list has been emptied, but it is still the same list object. We can see this with the id function, which gives us a number identifying an object. The id of the list before and after del remains the same, indicating the same list object is still assigned to a.
On the other hand, when we assign a non-empty list to b and then assign a new empty list to b, the id changes. This is because we have actually moved the b reference from the existing [4, 5, 6] list to the new [] list.
Beyond just the identity of the objects you are dealing with, there are other things to be aware of:
>>> a = [1, 2, 3]
>>> b = a
>>> del a[:]
>>> print(a)
[]
>>> print(b)
[]
Both b and a refer to the same list. Removing the elements from the a list without changing the list itself mutates the list in place. As b references the same object, we see the same result there. If you did a = [] instead, then a will refer to a new empty list while b continues to reference the [1, 2, 3] list.
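For comparison, here is a sketch that puts the in-place ways of emptying a list next to plain rebinding. Besides del a[:], it uses slice assignment and list.clear() (available since Python 3.3); the results dictionary is just an illustrative way to collect what b sees in each case.

```python
results = {}

a = [1, 2, 3]
b = a
del a[:]              # empties the shared list in place
results["del a[:]"] = b

a = [1, 2, 3]
b = a
a[:] = []             # slice assignment, also in place
results["a[:] = []"] = b

a = [1, 2, 3]
b = a
a.clear()             # list.clear(), also in place
results["a.clear()"] = b

a = [1, 2, 3]
b = a
a = []                # rebinds a only; b keeps the old list
results["a = []"] = b

for label, value in results.items():
    print(label, "->", value)
```

Only the last variant leaves b holding [1, 2, 3]; the other three empty the one list that both names refer to.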
>>> list1 = [1,2,3,4,5]
>>> list2 = list1
To get a better understanding, let us see what happens internally.
>>> list1 = [1,2,3,4,5]
This creates a list object and assigns it to list1.
>>> list2 = list1
The list object which list1 was referring to is also assigned to list2.
Now, let's look at the methods to empty a list and what actually happens internally.
METHOD-1: Set to empty list []:
>>> list1 = []
>>> list2
[1,2,3,4,5]
This does not delete the elements of the list; it only rebinds the name list1. So, list1 now points to a new empty list, but all other references still have access to the old list.
This method just creates a new list object and assigns it to list1. Any other references will remain.
METHOD-2: Delete using the slice operator [:] (starting again from list1 = [1,2,3,4,5] and list2 = list1):
>>> del list1[:]
>>> list2
[]
When we use the slice operator to delete all the elements of the list, the list object itself is emptied, so it appears empty everywhere it is referenced. So list2 also becomes an empty list.
Well, del a[:] uses just a little less space, as the person above me implied, because it keeps the same list object instead of creating a new one. The variable still refers to the same object, only with different contents. However, when your variable is assigned something else, it refers to a completely different object with a different id.

Why are lists linked in Python in a persistent way?

A variable is set. Another variable is set to the first. The first changes value. The second does not. This has been the nature of programming since the dawn of time.
>>> a = 1
>>> b = a
>>> b = b - 1
>>> b
0
>>> a
1
I then extend this to Python lists. A list is declared and appended. Another list is declared to be equal to the first. The values in the second list change. Mysteriously, the values in the first list, though not acted upon directly, also change.
>>> alist = list()
>>> blist = list()
>>> alist.append(1)
>>> alist.append(2)
>>> alist
[1, 2]
>>> blist
[]
>>> blist = alist
>>> alist.remove(1)
>>> alist
[2]
>>> blist
[2]
>>>
Why is this?
And how do I prevent this from happening -- I want alist to be unfazed by changes to blist (immutable, if you will)?
Python variables are actually not variables but references to objects (similar to pointers in C). There is a very good explanation of that for beginners in http://foobarnbaz.com/2012/07/08/understanding-python-variables/
One way to convince yourself about this is to try this:
>>> a = [1, 2, 3]
>>> b = a
>>> id(a)
68617320
>>> id(b)
68617320
id returns an integer that identifies the given object (in CPython, its memory address). Since it is the same for both names, a and b refer to the same list, so a change made through one is visible through the other.
Variable binding in Python works this way: you assign an object to a variable.
a = 4
b = a
Both point to 4.
b = 9
Now b points to somewhere else.
Exactly the same happens with lists:
a = []
b = a
b = [9]
Now, b has a new value, while a has the old one.
Till now, everything is clear and you have the same behaviour with mutable and immutable objects.
Now comes your misunderstanding: it is about modifying objects.
lists are mutable, so if you mutate a list, the modifications are visible via all variables ("name bindings") which exist:
a = []
b = a # the same list
c = [] # another empty one
a.append(3)
print(a, b, c) # a as well as b = [3], c = [] as it is a different one
d = a[:] # copy it completely
b.append(9)
# now a = b = [3, 9], c = [], d = [3], a copy of the old a resp. b
What is happening is that you create another reference to the same list when you do:
blist = alist
Thus, blist refers to the same list that alist does, and any modification to that single list will affect both alist and blist.
If you want a (shallow) copy of the list, and not just another reference to it, you can do this:
blist = alist[:]
In fact, you can check the references yourself using id():
>>> alist = [1,2]
>>> blist = []
>>> id(alist)
411260888
>>> id(blist)
413871960
>>> blist = alist
>>> id(blist)
411260888
>>> blist = alist[:]
>>> id(blist)
407838672
This is a relevant quote from the Python docs:
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
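As a rough illustration of that quote, the sketch below builds a copy three common ways: slicing, list(), and copy.copy. All three are shallow, meaning the outer list is new but nested items are still shared; copy.deepcopy is the usual tool when nested items must be independent as well. The copies dictionary is only an illustrative container.

```python
import copy

alist = [[1], [2]]

copies = {
    "alist[:]": alist[:],
    "list(alist)": list(alist),
    "copy.copy(alist)": copy.copy(alist),
}

for label, c in copies.items():
    # New outer list each time, but the inner lists are the same objects.
    print(label, c is alist, c[0] is alist[0])
```

Each line prints False followed by True: a different outer list, the same inner list.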
Based on this post:
Python passes references-to-objects by value (like Java), and
everything in Python is an object. This sounds simple, but then you
will notice that some data types seem to exhibit pass-by-value
characteristics, while others seem to act like pass-by-reference...
what's the deal?
It is important to understand mutable and immutable objects. Some
objects, like strings, tuples, and numbers, are immutable. Altering
them inside a function/method will create a new instance and the
original instance outside the function/method is not changed. Other
objects, like lists and dictionaries are mutable, which means you can
change the object in-place. Therefore, altering an object inside a
function/method will also change the original object outside.
So in your example you are making the variables blist and alist point to the same object. Therefore when you remove an element through either blist or alist, the change is reflected in the object that they both point to.
The short answer to your question "Why is this?": Because in Python integers are immutable, while lists are mutable.
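The function behaviour described in the quoted post can be sketched as follows. The functions rebind and mutate are hypothetical names chosen for illustration: the first builds a new list and rebinds its local name, the second changes the list object it was given.

```python
def rebind(items):
    # Builds a new list and rebinds the local name only;
    # the caller's list is untouched.
    items = items + [4]
    return items

def mutate(items):
    # Changes the list object the caller passed in.
    items.append(4)

data = [1, 2, 3]

rebind(data)
print(data)  # [1, 2, 3]

mutate(data)
print(data)  # [1, 2, 3, 4]
```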
You were looking for an official reference in the Python docs. Have a look at this section:
http://docs.python.org/2/reference/simple_stmts.html#assignment-statements
Quote from the latter:
Assignment statements are used to (re)bind names to values and to
modify attributes or items of mutable objects
I really like this sentence, have never seen it before. It answers your question precisely.
A good recent write-up about this topic is http://nedbatchelder.com/text/names.html, which has already been mentioned in one of the comments.

Why do copied dictionaries point to the same directory but lists don't?

Can someone tell me why when you copy dictionaries they both point to the same directory, so that a change to one affects the other, but this is not the case for lists?
I am interested in the logic behind why they would set up the dictionary one way, and lists another. It's confusing and if I know the reason behind it I will probably remember.
dict = {'Dog' : 'der Hund' , 'Cat' : 'die Katze' , 'Bird' : 'der Vogel'}
otherdict = dict
dict.clear()
print otherdict
Which results in otherdict = {}. So both dicts are pointing to the same directory. But this isn't the case for lists.
list = ['one' , 'two' , 'three']
newlist = list
list = list + ['four']
print newlist
newlist still holds on to the old list. So they are not pointing to the same directory. I am wanting to know the rationale behind the reasons why they are different?
Some code with similar intent to yours will show that changes to one list do affect other references.
>>> list = ['one' , 'two' , 'three']
>>> newlist = list
>>> list.append('four')
>>> print newlist
['one', 'two', 'three', 'four']
That is the closest analogy to your dictionary code. You call a method on the original object.
The difference is that with your code you used a separate plus and assignment operator
list = list + ['four']
This is two separate operations. First the interpreter evaluates the expression list + ['four']. It must put the result of that computation in a new list object, because it does not anticipate that you will assign the result back to list. If you had said other_list = list + ['four'], you would have been very annoyed if list were modified.
Now there is a new object, containing the result of list + ['four']. That new object is assigned to list. list is now a reference to the new object, whereas newlist remains a reference to the old object.
Even this is different
list += ['four']
For a mutable object, += modifies the object in place instead of creating a new one.
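A small sketch contrasting the two forms on a list, using items and alias as illustrative names (chosen to avoid shadowing the built-in list):

```python
items = ['one', 'two', 'three']
alias = items

items += ['four']          # in place: alias sees the change
print(alias)               # ['one', 'two', 'three', 'four']
print(items is alias)      # True

items = items + ['five']   # new list object; items is rebound
print(alias)               # ['one', 'two', 'three', 'four']
print(items is alias)      # False
```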
Your two cases are doing different things to the objects you're copying, that's why you're seeing different results.
First off, you're not really copying them. You're simply making new "references" or (in more Pythonic terms) binding new names to the same objects.
With the dictionary, you're calling dict.clear, which discards all the contents. This modifies the existing object, so you see the results through both of the references you have to it.
With the list, you're rebinding one of the names to a new list. This new list is not the same as the old list, which remains unmodified.
You could recreate the behavior of your dictionary code with the lists if you want. A slice assignment is one way to modify a whole list at once:
old_list[:] = [] # empties the list in place
One addendum, unrelated to the main issue above: It's a very bad idea to use names like dict and list as variables in your own code. That's because those are the names of the builtin Python dictionary and list types. By using the same names, you shadow the built in ones, which can lead to confusing bugs.
In your dictionary example, you've created a dictionary and stored it in dict. You then store the same reference in otherdict. Now both dict and otherdict point to the same dictionary. Then you call dict.clear(). This clears the dictionary that both dict and otherdict point to.
In your list example, you've created a list and stored it in list. You then store the same reference in newlist. Then you create a new list consisting of the elements of list and another element and store the new list in list. You did not modify the original list you created. You created a new list and changed what list pointed to.
You can get your list example to show the same behavior as the dictionary example by using list.append('four') rather than list = list + ['four'].
Do you mean this?
>>> d = {'test1': 1, 'test2': 2}
>>> new_d = d
>>> new_d['test3'] = 3
>>> new_d
{'test1': 1, 'test3': 3, 'test2': 2}
>>> d # same object as new_d, so the change is visible here
{'test1': 1, 'test3': 3, 'test2': 2}
>>> lst = [1, 2, 3]
>>> new_lst = lst
>>> new_lst.append(5)
>>> new_lst
[1, 2, 3, 5]
>>> lst # same object as new_lst
[1, 2, 3, 5]
>>> new_lst += [5]
>>> lst # += mutates the list in place
[1, 2, 3, 5, 5]
>>> my_tuple = (1, 2, 3)
>>> new_my_tuple = my_tuple
>>> new_my_tuple += (5,)
>>> new_my_tuple
(1, 2, 3, 5)
>>> my_tuple # immutable, so it is not affected by new_my_tuple
(1, 2, 3)
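Lists DO hold references, not the objects themselves. In fact every name in Python is a reference, whether the object is mutable (can be changed in place, such as lists and dictionaries) or immutable (cannot be changed, such as tuples). The difference is that an operation like += mutates a list in place, so every name sees the change, whereas on a tuple it builds a new object and rebinds only the name being assigned.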

Store reference to primitive type in Python?

Code:
>>> a = 1
>>> b = 2
>>> l = [a, b]
>>> l[1] = 4
>>> l
[1, 4]
>>> l[1]
4
>>> b
2
What I want to instead see happen is that when I set l[1] equal to 4, that the variable b is changed to 4.
I'm guessing that when dealing with primitives, they are copied by value, not by reference. Often I see people having problems with objects and needing to understand deep copies and such. I basically want the opposite. I want to be able to store a reference to the primitive in the list, then be able to assign new values to that variable either by using its actual variable name b or its reference in the list l[1].
Is this possible?
There are no 'primitives' in Python. Everything is an object, even numbers. Numbers in Python are immutable objects. So, to have a reference to a number such that 'changes' to the 'number' are 'seen' through multiple references, the reference must be through e.g. a single element list or an object with one property.
(This works because lists and objects are mutable and a change to what number they hold is seen through all references to it)
e.g.
>>> a = [1]
>>> b = a
>>> a
[1]
>>> b
[1]
>>> a[0] = 2
>>> a
[2]
>>> b
[2]
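The "object with one property" variant mentioned above could look like the sketch below. Ref is a hypothetical class written for this illustration, not something from the standard library. Note that the value has to be changed through the holder (items[1].value = 4); assigning items[1] = 4 would only replace the reference stored in the list.

```python
class Ref:
    """Hypothetical one-attribute holder for a value."""

    def __init__(self, value):
        self.value = value

a = Ref(1)
b = Ref(2)
items = [a, b]

# Mutate the holder in place; b and items[1] are the same object.
items[1].value = 4

print(b.value)        # 4
print(items[1] is b)  # True
```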
You can't really do that in Python, but you can come close by making the variables a and b refer to mutable container objects instead of immutable numbers:
>>> a = [1]
>>> b = [2]
>>> lst = [a, b]
>>> lst
[[1], [2]]
>>> lst[1][0] = 4 # changes contents of second mutable container in lst
>>> lst
[[1], [4]]
>>> a
[1]
>>> b
[4]
I don't think this is possible:
>>> lst = [1, 2]
>>> a = lst[1] # a is bound to the object lst[1] refers to, not to the list slot
>>> a
2
>>> lst[1] = 3
>>> lst
[1, 3] # list is changed
>>> a # a still refers to 2
2
a refers to the object that lst[1] originally referred to, not to the list slot itself.
Think of l[0] and a as two names that refer to the same integer object.
Integers are immutable: you can make names refer to different integers, but integers themselves can't be changed.
There was a relevant discussion earlier:
Storing elements of one list, in another list - by reference - in Python?
According to @mgilson, doing l[1] = 4 simply replaces the reference stored in the list, rather than trying to mutate the object. In any case, objects of type int are immutable.
