Python basic data references, list of same reference

Say I have two lists:
>>> l1=[1,2,3,4]
>>> l2=[11,12,13,14]
I can put those lists in a tuple, or dictionary, and it appears that they are all references back to the original list:
>>> t=(l1,l2)
>>> d={'l1':l1, 'l2':l2}
>>> id(l1)==id(d['l1'])==id(t[0])
True
>>> l1 is d['l1'] is t[0]
True
Since they are references, I can change l1 and the referred data in the tuple and dictionary change accordingly:
>>> l1.append(5)
>>> l1
[1, 2, 3, 4, 5]
>>> t
([1, 2, 3, 4, 5], [11, 12, 13, 14])
>>> d
{'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5]}
The same holds if I append through the reference in the dictionary d or through the reference in the tuple t:
>>> d['l1'].append(6)
>>> t[0].append(7)
>>> d
{'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5, 6, 7]}
>>> l1
[1, 2, 3, 4, 5, 6, 7]
If I now set l1 to a new list, the reference count for the original list decreases:
>>> import sys
>>> sys.getrefcount(l1)
4
>>> sys.getrefcount(t[0])
4
>>> l1=['new','list']
>>> l1 is d['l1'] is t[0]
False
>>> sys.getrefcount(l1)
2
>>> sys.getrefcount(t[0])
3
And appending to or changing l1 does not change d['l1'] or t[0], since l1 is now a reference to a different object. The notion of indirect references is covered fairly well in the Python documents but not completely.
My questions:
Is a mutable object always a reference? Can you always assume that modifying it modifies the original (Unless you specifically make a copy with l2=l1[:] kind of idiom)?
Can I assemble a list of all the same references in Python? ie, Some function f(l1) that returns ['l1', 'd', 't'] if all of those are referring to the same list?
It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.
ie:
l=[1,2,3] # create an object of three integers and create a ref to it
l2=l # create a reference to the same object
l=[4,5,6] # create a new object of 3 ints; the original now referenced
# by l2 is unchanged and unmoved

1) Modifying a mutable object through a reference will always modify the "original". Honestly, this is betraying a misunderstanding of references. The newer reference is just as much the "original" as is any other reference. So long as both names point to the same object, modifying the object through either name will be reflected when accessed through the other name.
2) Not exactly like what you want. gc.get_referrers returns all objects that directly refer to the object.
>>> l = [1, 2]
>>> d = {0: l}
>>> t = (l, )
>>> import gc
>>> import pprint
>>> pprint.pprint(gc.get_referrers(l))
[{'__builtins__': <module '__builtin__' (built-in)>,
'__doc__': None,
'__name__': '__main__',
'__package__': None,
'd': {0: [1, 2]},
'gc': <module 'gc' (built-in)>,
'l': [1, 2],
'pprint': <module 'pprint' from '/usr/lib/python2.6/pprint.pyc'>,
't': ([1, 2],)}, # This is globals()
{0: [1, 2]}, # This is d
([1, 2],)] # this is t
Note that the actual object referenced by l is not included in the returned list because it does not contain a reference to itself. globals() is returned because that does contain a reference to the original list.
3) If by valid, you mean "will not be garbage collected" then this is correct barring a highly unlikely bug. It would be a pretty sorry garbage collector that "stole" your data.
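To make point 3 concrete, here is a minimal sketch that watches an object through a weak reference while names are removed one at a time. The Payload class and the variable names are made up for illustration (plain lists cannot be weakly referenced), and the immediate cleanup shown is what CPython's reference counting does; other implementations may free the object later.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in object; plain lists cannot be weakly referenced."""

    def __init__(self, values):
        self.values = values


first_name = Payload([1, 2, 3])
second_name = first_name              # a second reference to the same object
watcher = weakref.ref(first_name)     # observes the object without keeping it alive

del first_name                        # removes one name; another reference remains
alive_after_first_del = watcher() is not None

del second_name                       # removes the last strong reference
gc.collect()                          # not needed on CPython, harmless elsewhere
alive_after_second_del = watcher() is not None

print(alive_after_first_del, alive_after_second_del)
```

The object survives the first del because second_name still refers to it, and only goes away once no strong reference is left.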

Every variable in Python is a reference.
For lists, you are focusing on the results of the append() method and losing sight of the bigger picture of Python data structures. There are other methods on lists, and there are advantages and consequences to how a list is constructed. It is helpful to think of a list as a view onto the other objects referred to in the list. Lists do not "contain" anything other than the rules and ways of accessing the data referred to by the objects within them.
The list.append(x) method specifically is equivalent to l[len(l):]=[x]
So:
>>> l1=range(3)
>>> l2=range(20,23)
>>> l3=range(30,33)
>>> l1[len(l1):]=[l2] # equivalent to 'append' for subscriptable sequences
>>> l1[len(l1):]=l3 # same as 'extend'
>>> l1
[0, 1, 2, [20, 21, 22], 30, 31, 32]
>>> len(l1)
7
>>> l1.index(30)
4
>>> l1.index(20)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: list.index(x): x not in list
>>> 20 in l1
False
>>> 30 in l1
True
By wrapping l2 in a list display, as in l1[len(l1):]=[l2], or by calling l1.append(l2), you store a single reference to the list bound to l2. If you change l2, the change shows through that reference as well. It adds just one element to the length of l1: the reference to the appended sequence.
Without the surrounding brackets, as in l1[len(l1):]=l3, each element of the sequence is added individually (as a reference to the same element object).
Other common list operations, such as l.index(something) or in, will not find elements inside the nested list, and l.sort() will not sort by the nested values. They are "shallow" operations on the object, and by using l1[len(l1):]=[l2] you are now creating a nested data structure.
If you use l1[len(l1):]=l3, you are making a true (shallow) copy of the elements in l3.
These are fairly fundamental Python idioms, and most of the time they 'do the right thing.' You can, however, get surprising results, such as:
>>> m=[[None]*2]*3
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=33
>>> m
[[None, 33], [None, 33], [None, 33]] # probably not what was intended...
>>> m[0] is m[1] is m[2] # same object, that's why they all changed
True
Some Python newbies try to create a multi-dimensional list by doing something like m=[[None]*2]*3. The first sequence replication works as expected; it creates a list of two references to None. It is the second that is the issue: it creates three references to that same inner list. So entering m[0][1]=33 modifies the one inner list, and every reference to it in m shows that change.
Compare to:
>>> m=[[None]*2,[None]*2,[None]*2]
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=33
>>> m
[[None, 33], [None, None], [None, None]]
You can also use nested list comprehensions to do the same like so:
>>> m=[[ None for i in range(2)] for j in range(3)]
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=44
>>> m
[[None, 44], [None, None], [None, None]]
>>> m[0] is m[1] is m[2] # three different lists....
False
For lists and references, Fredrik Lundh has this text for a good intro.
As to your specific questions:
1) In Python, Everything is a label or a reference to an object. There is no 'original' (a C++ concept) and there is no distinction between 'reference', pointer, or actual data (a C / Perl concept)
2) Fredrik Lundh has a great analogy in response to a question similar to this:
The same way as you get the name of that cat you found on your porch: the cat (object) itself cannot tell you its name, and it doesn't really care -- so the only way to find out what it's called is to ask all your neighbours (namespaces) if it's their cat (object)...
....and don't be surprised if you'll find that it's known by many names, or no name at all!
You can find this list with some effort, but why? Just call it what you are going to call it -- like a found cat.
3) True.

1- Is a mutable object always a reference? Can you always assume that modifying it modifies the original (Unless you specifically make a copy with l2=l1[:] kind of idiom)?
Yes. Actually, immutable objects are always accessed through references as well. You just can't change them, so you never get to observe it.
2 - Can I assemble a list of all the same references in Python? ie, Some function f(l1) that returns ['l1', 'd', 't'] if all of those are referring to the same list?
That is odd, but it can be done.
You can compare objects for "sameness" with the is operator, as in l1 is t[0].
And you can get all the objects that refer to a given object with the function gc.get_referrers in the garbage collector module (gc). You can then check which entries of these referrers point to your object with the is operator. So yes, it can be done.
I just don't think it would be a good idea. It is more likely that the is operator alone offers a way for you to do what you need.
3- It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.
Yes.

Is a mutable object always a reference? Can you always assume that modifying it modifies the original (Unless you specifically make a copy with l2=l1[:] kind of idiom)?
Python has reference semantics: variables do not store values as in C++, but instead label them. The concept of "the original" is flawed: if two variables label the same value, it is totally irrelevant which one "came first". It doesn't matter if the object is mutable or not (except that immutable objects won't make it so easy to tell what's going on behind the scenes). To make copies in a more general-purpose way, try the copy module.
Can I assemble a list of all the same references in Python? ie, Some function f(l1) that returns ['l1', 'd', 't'] if those all those are referring to the same list?
Not easily. Refer to aaronasterling's answer for details. You could also try something like k for k, v in locals().items() if v is the_object, but you'll also have to search globals(), you'll miss some stuff and it might cause some kind of problems due to recursing with the names 'k' and 'v' (I haven't tested).
It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.
Absolutely.
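For the second question, the namespace scan suggested above can be sketched roughly as follows. The helper name names_bound_to and the demo function are hypothetical, and the scan only finds names bound directly to the object in the one namespace you hand it; it does not look inside containers (gc.get_referrers is the tool for that).

```python
def names_bound_to(target, namespace):
    """Return the sorted names in `namespace` whose value is `target` itself."""
    return sorted(name for name, value in namespace.items() if value is target)


def demo():
    l1 = [1, 2, 3, 4]
    alias = l1                    # second name for the same list
    lookalike = [1, 2, 3, 4]      # equal contents, but a different object
    d = {'l1': l1}                # holds the list as an item; the name d is not a match
    return names_bound_to(l1, locals())


found = demo()
print(found)
```

Because the comparison uses is rather than ==, the equal-looking list is not reported, and the dictionary is skipped because the name d is bound to the dict, not to the list.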

"... object is a reference ..." is nonsense. References aren't objects. Variables, member fields, slots in lists and sets, etc. hold references, and these references point to objects. There can be any number of references to an object (in non-refcounting implementations, even none, temporarily, i.e. until the GC kicks in). Everyone who has a reference to an object can invoke its methods, access its members, etc. - this is true for all objects. Of course only mutable objects can be changed this way, so you usually don't care for immutable ones.
Yes, as others have shown. But this shouldn't be necessary unless you're either debugging the GC or tracking down a serious memory leak in your code - why do you think you need this?
Python has automatic memory management, so yes. As long as there is a reference to an object, it won't be deleted (however, it may stay alive for a while after it became unreachable, due to cyclic references and the fact that GCs only run once in a while).
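The remark about cyclic references can be observed with a small experiment. This is only a sketch under the assumption that you run it on CPython, where reference counting frees most objects immediately and a separate cycle detector handles the rest; Node is a made-up class, and the automatic collector is switched off so it cannot interfere with the demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.partner = None


gc.disable()                      # keep the automatic collector out of the way
a = Node()
b = Node()
a.partner = b                     # a refers to b ...
b.partner = a                     # ... and b refers back to a: a reference cycle
watcher = weakref.ref(a)

del a, b                          # no names left, but the cycle keeps both alive
alive_before_collect = watcher() is not None

gc.collect()                      # the cycle detector frees the unreachable pair
alive_after_collect = watcher() is not None
gc.enable()

print(alive_before_collect, alive_after_collect)
```

The pair is unreachable after the del, yet it lingers until the collector runs, which is the "stay alive for a while" case described above.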

1a. Is a mutable object always a reference?
There is no difference between mutable and non-mutable objects. Seeing the variable names as references is helpful for people with a C-background (but implies they can be dereferenced, which they can not).
1b. Can you always assume that modifying it modifies the original
Please, it's not "the original". It's the same object. b = a means b and a now are the same object.
1c. (Unless you specifically make a copy with l2=l1[:] kind of idiom)?
Right, because then it is not the same object anymore. (Although the entries in the new list will be the same objects as in the original list).
2. Can I assemble a list of all the same references in Python?
Yes, probably, but you will never ever ever need it, so that would be a waste of energy. :)
3. It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.
Yes, an object will not be garbage collected as long as you have a reference to it.
(Using the word "valid" here seems incorrect, but I assume this is what you mean).
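The point under 1c, that a slice copy is a new list whose entries are still the same objects, is easy to check. A small sketch with made-up variable names, comparing a plain alias, a slice copy, and copy.deepcopy from the standard library:

```python
import copy

inner = [1, 2]
original = [inner, 'text']

alias = original                   # same object, no copy at all
shallow = original[:]              # new outer list, same entries
deep = copy.deepcopy(original)     # new outer list and a new inner list

inner.append(3)                    # mutate the shared inner list

print(alias is original)           # True
print(shallow is original)         # False
print(shallow[0] is inner)         # True
print(deep[0] is inner)            # False
print(shallow[0], deep[0])         # [1, 2, 3] [1, 2]
```

Only the deep copy is insulated from the change to inner; the slice copy still shares it.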

Related

Modifying Python lists via slices and for-loops?

I was trying to modify the values in lists via slices and for-loops, and ran into some pretty interesting behavior. I would appreciate if someone could explain what's happening internally here.
>>> x = [1,2,3,4,5]
>>> x[:2] = [6,7] #slices can be modified
>>> x
[6, 7, 3, 4, 5]
>>> x[:2][0] = 8 #indices of slices cannot be modified
>>> x
[6, 7, 3, 4, 5]
>>> x[:2][:1] = [8] #slices of slices cannot be modified
>>> x
[6, 7, 3, 4, 5]
>>> for z in x: #this version of a for-loop cannot modify lists
... z += 1
...
>>> x
[6, 7, 3, 4, 5]
>>> for i in range(len(x)): #this version of a for-loop can modify lists
... x[i] += 1
...
>>> x
[7, 8, 4, 5, 6]
>>> y = x[:2] #if I assign a slice to a var, it can be modified...
>>> y[0] = 1
>>> y
[1, 8]
>>> x #...but it has no impact on the original list
[7, 8, 4, 5, 6]

Let's break down your comments 1 by 1:
1.) x[:2] = [6, 7] slices can be modified:
See these answers here. It's calling the __setitem__ method of the list object with a slice object, so the assignment changes x itself. By contrast, each time you evaluate x[:2] as an expression, a new list is created; it is a distinct object from x (id(x[:2]) may still repeat between calls, because each temporary list is discarded immediately and CPython can reuse its memory).
2.) indices of slices cannot be modified:
That's not quite the reason. The list wasn't modified because you're performing the assignment on the new list produced by x[:2], not on x, so it doesn't trigger __setitem__ on x. The temporary list is changed and then thrown away. Also, ints are immutable, so the int object itself cannot be changed either way.
3.) slices of slices cannot be modified:
See above. Same reason: you are assigning to a temporary list produced by slicing, not modifying x directly.
4.) this version of a for-loop cannot modify lists:
The name z is bound in turn to each of the actual objects stored in x. If you printed id(z) inside the for loop, you'd note that the values are identical to id(6), id(7), id(3), id(4), id(5). Even though the list holds references to those same five objects, when you do z = ... (or z += 1) you are only rebinding the name z to a new value, not replacing the reference stored in the list. If you want to modify the list, you'll need to assign by index, for the same reason you can't expect 1 = 6 to turn x into [6, 2, 3, 4, 5].
5.) this version of a for-loop can modify lists:
See my answer above. Now you are directly performing item assignment on the list instead of its representation.
6.) if I assign a slice to a var, it can be modified:
If you've been following so far, you'll realize that you are now binding the name y to the new list produced by x[:2]. The story follows: you perform an item assignment by index on y, so of course y is updated.
7.) ...but it has no impact on the original list:
Of course. x and y are two different objects. id(x) != id(y), therefore any operation performed on x will not affect y whatsoever. If, however, you assigned y = x and then made a change to y, then yes, x would be affected as well.
To expand a bit on for z in x:, say you have a class foo() and assigned two instances of such to the list f:
f1 = foo()
f2 = foo()
f = [f1, f2]
f
# [<__main__.foo at 0x15f4b898>, <__main__.foo at 0x15f4d3c8>]
Note that the list holds references to the actual foo instances, not to the names f1 and f2. So even if I did the following:
f1 = 'hello'
f
# [<__main__.foo at 0x15f4b898>, <__main__.foo at 0x15f4d3c8>]
f remains unchanged, since the foo instances remain the same even though the name f1 is now bound to a different value. For the same reason, whenever you assign to z in for z in x:, you are only affecting the name z; nothing in the list is changed until you update x by index.
If, however, the object has attributes or is mutable, you can directly update the referenced object in the loop:
x = ['foo']
y = ['foo']
lst = [x,y]
lst
# [['foo'], ['foo']]
for z in lst:
    z.append('bar')
lst
# [['foo', 'bar'], ['foo', 'bar']]
x.append('something')
lst
# [['foo', 'bar', 'something'], ['foo', 'bar']]
That is because you are directly updating the referenced object instead of assigning to the name z. If, however, you assigned x or y to a new object, lst would not be affected.

There is nothing odd happening here. Any slice that you obtain from a list is a new list object holding references to the same elements as the original (a shallow copy). The same is true for tuples.
When you iterate through your list, you get the object which the iteration yields. Since ints are immutable in Python you can't change the state of int objects. Each time you add two ints a new int object is created. So your "version of a for-loop [which] cannot modify lists" is not really trying to modify anything because it will not assign the result of the addition back to the list.
Maybe you can guess now why your second approach is different. It uses the special subscript-assignment syntax, which does not create a slice of your list but instead assigns into the list (documentation). The new object created by the addition operation is stored in the list through this method.
For understanding your last (and your first) examples, it is important to know that slicing creates (at least for lists and tuples, technically you could override this in your own classes) a partial copy of your list. Any change to this new object will, as you already found out, not change anything in your original list.
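Putting the cases from the question side by side may help. This is just an illustrative sketch in Python 3 syntax; the helper variable names are made up, and enumerate is one idiomatic way to get the index needed for assigning back into the list.

```python
x = [1, 2, 3, 4, 5]
x_id = id(x)

x[:2][0] = 8            # writes into a temporary copy, which is then discarded
after_temp_write = list(x)

x[:2] = [6, 7]          # slice assignment: changes x itself
same_object = id(x) == x_id

for z in x:             # rebinding the loop name z leaves the list alone
    z += 1
after_rebinding = list(x)

for i, z in enumerate(x):   # assigning by index stores the new int in the list
    x[i] = z + 1

print(after_temp_write, same_object, after_rebinding, x)
```

Only the slice assignment and the index assignment go through the list itself; the other two operate on a temporary list or on a loop name.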

Why does concatenating a list to another one creates another object in memory, whereas other manipulations induce mutation? [duplicate]

Case A:
list1=[0, 1, 2, 3]
list2=list1
list1=list1+[4]
print(list1)
print(list2)
Output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
(This non-mutating behavior also happens when concatenating a list of more than a single entry, and when 'multiplying' the list, e.g. list1=list1*2, in fact any type of "re-assignment" that performs an operation with an infix operator to the list and then assigns the result of that operation to the same list name using "=" )
In this case the original list object that list1 pointed to has not been altered in memory and list2 still points to it, another object has simply been created in memory for the result of the concatenation that list1 now points to (there are now two distinct, different list objects in memory)
Case B:
list1=[0, 1, 2, 3]
list2=list1
list1.append(4)
print(list1)
print(list2)
Output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
Case C:
list1=[0, 1, 2, 3]
list2=list1
list1[-1]="foo"
print(list1)
print(list2)
Outputs:
[0, 1, 2, 'foo']
[0, 1, 2, 'foo']
In case B and C the original list object that list1 points to has mutated, list2 still points to that same object, and as a result the value of list2 has changed. (there is still one single list object in memory and it has mutated).
This behavior seems inconsistent to me as a noob. Is there a good reason/utility for this?
EDIT :
I changed the list variables names from "list" and "list_copy" to "list1" and "list2" as this was clearly a very poor and confusing choice of names.
I chose Kabanus' answer as I liked how he pointed out that mutating operations are always(?) explicit in python.
In fact a short and simple answer to my question can be done summarizing Kabanus' answer into two of his statements :
-"In python mutating operations are explicit"
-"The addition[or multiplication] operator [performed on list objects] creates a new object, and doesn't change x[the list object] implicitly."
I could also add:
-"Every time you use square brackets to describe a list, that creates a new list object"
[this is from : http://www-inst.eecs.berkeley.edu/~selfpace/cs9honline/Q2/mutation.html , great explanations there on this topic]
Also I realized after Kabanus' answer how careful one must be tracking mutations in a program :
l=[1,2]
z=l
l+=[3]
z=z+[3]
and
l=[1,2]
z=l
z=z+[3]
l+=[3]
Will yield completely different values for z. This must be a frequent source of errors isn't it?
I'm only in the beginning of my learning and haven't delved deeply into OOP concepts just yet, but I think I'm already starting to understand what the fuss around functional paradigm is about...

l += [4]
is equivalent to:
l.extend([4])
(which here has the same effect as l.append(4)) and won't create a copy.
l = l + [4]
is an assignment to l, it first evaluates the right side of the assignment, then assigns the resulting object to name l. There is no way for this operation to be mutating l.
Update: I guess I haven't made myself clear enough. Of course operations on the RHS of the assignment may involve mutating the object that is the current value of LHS; but finally, the result of computing RHS is assigned to the LHS, thus overwriting any previous mutations. Example:
def increment_first(x):
    x[0] += 1
    return []
l = [ 1 ]
l = increment_first(l)
While the call to increment_first will increment l[0] as its side effect, the mutated list object will be lost anyway as soon as the value of RHS (in this case - an empty list) is assigned to l.

This is by design. The point is python does not like non explicit side effects. Suppose this valid line in your file:
x=[1,2]
print(x+[3,4])
Note there is no assignment, but it's still a valid line. Do you expect x to have changed after that second line? For me it doesn't make sense.
That's what you're seeing - the addition operator creates a new object, and doesn't change x implicitly. If you feel it should, then what about:
[3,4]+x
Of course, addition does not change behavior in assignment, to avoid confusion.
In python mutating operations are explicit:
x+=[3,4]
Or your example:
x[0]=1
Here you are explicitly asking to change a cell, i.e. explicit mutation. These things are consistent - an operation is always a mutation or it isn't, and it won't be both. Usually it makes sense as well, such as concatenating lists on the fly.
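The two orderings from the question's edit can be traced with identity checks. A minimal sketch, reusing the question's l and z; the tuples first_order and second_order are only there to capture the results:

```python
l = [1, 2]
z = l                 # z and l name the same list
l += [3]              # in-place: extends the existing object
shared_after_iadd = z is l
z = z + [3]           # builds a new list and rebinds z
first_order = (list(l), list(z), z is l)

l = [1, 2]
z = l
z = z + [3]           # z is rebound to a new list first
l += [3]              # so this in-place change is not visible through z
second_order = (list(l), list(z), z is l)

print(first_order)
print(second_order)
```

In the first ordering z sees the in-place extension before it is rebound, so it ends up with two 3s; in the second it is rebound first and ends up with one.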

Two variables with the same list have different IDs.....why is that?

Trying to understand the following
Why is it that the ID's assigned by Python are different for the same lists?
x = [1, 2, 3]
y = [1, 2, 3]
id(x) != id(y)
True
id(x)
11428848
id(y)
12943768

Every distinct object in Python has its own ID. It's not related to the contents -- it's related to the location where the information that describes the object is stored. Any distinct object stored in a distinct place will have a distinct id. (It's sometimes, but not always, the memory address of the object.)
This is especially important to understand for mutable objects -- that is, objects that can be changed, like lists. If an object can be changed, then you can create two different objects with the same contents. They will have different IDs, and if you change one later, the second will not change.
For immutable objects like integers and strings, this is less important, because the contents can never change. Even if two immutable objects have different IDs, they are essentially identical if they have identical contents.
This set of ideas goes pretty deep. You can think of a variable name as a tag assigned to an ID number, which in turn uniquely identifies an object. Multiple variable names can be used to tag the same object. Observe:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> id(a)
4532949432
>>> id(b)
4533024888
That, you've already discovered. Now let's create a new variable name:
>>> c = b
>>> id(c)
4533024888
No new object has been created. The object tagged with b is now tagged with c as well. What happens when we change a?
>>> a[1] = 1000
>>> a
[1, 1000, 3]
>>> b
[1, 2, 3]
a and b are different, as we know because they have different IDs. So a change to one doesn't affect the other. But b and c are the same object -- remember? So...
>>> b[1] = 2000
>>> b
[1, 2000, 3]
>>> c
[1, 2000, 3]
Now, if I assign a new value to b, it doesn't change anything about the objects themselves -- just the way they're tagged:
>>> b = a
>>> a
[1, 1000, 3]
>>> b
[1, 1000, 3]
>>> c
[1, 2000, 3]

The reason becomes clear if you do this:
l = [1, 2, 3]
m = [1, 2, 3]
l.append(4)
The ids must not be the same, since l and m now have different contents, and the id of an object must never change during its lifetime, since it is what identifies the object.
All mutable objects work this way. It can also be the case for tuples (which are immutable).
Edit:
As commented below, the ids may refer to memory address in some python implementation but not in all.

Those aren't the same lists. They may contain identical information, but they are not the same. If you made y = x, you'd find that actually the id is the same.
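The difference between "equal contents" and "same object" is what == and is test, respectively. A short sketch using the question's x and y (the result variable names are made up):

```python
x = [1, 2, 3]
y = [1, 2, 3]
same_value = x == y          # compares contents
same_object = x is y         # compares identity, i.e. id(x) == id(y)

y = x                        # rebind y to the object x already names
now_same_object = y is x
y.append(4)                  # visible through x as well, since it is one list

print(same_value, same_object, now_same_object, x)
```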

Python gives every distinct object its own ID, and two separately created mutable objects are always distinct; that's why.
You can check the ids of immutable objects too; a tuple, for example.

for n in a list: del n

When I loop over a list, the name that I give within the loop to the elements of the list apparently refers directly to each element in turn, as evidenced by:
>>> a = [1, 2, 3]
>>> for n in a:
...     print n is a[a.index(n)]
True
True
True
So why doesn't this seem to do anything?
>>> for n in a: del n
>>> a
[1, 2, 3]
If I try del a[a.index(n)], I get wonky behavior, but at least it's behavior I can understand - every time I delete an element, I shorten the list, changing the indices of the other elements, so I end up deleting every other element of the list:
>>> a = range(10)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> for n in a: del a[a.index(n)]
>>> a
[1, 3, 5, 7, 9]
Clearly I'm allowed to delete from the list while iterating. So what's going on when I try to del n inside the loop? Is anything being deleted?

Inside the block of any for n in X statement, n refers to the variable named n itself, not to any notion of "the place in the list of the last value you iterated over". Therefore, what your loop is doing is repeatedly binding a variable n to a value fetched from the list and then immediately unbinding that same variable again. The del n statement only affects your local variable bindings, rather than the list.

Because you're doing this:
some_reference = a[0]
del some_reference
#do you expect this to delete a[0]? it doesn't.
You're operating on a variable bound to the value of a[0] (then one bound to a[1], then...). You can delete it, but it won't do anything to a.
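To see that del n only removes the name, you can check both the list and the name afterwards. A small sketch in Python 3 syntax; the flag n_is_bound is made up for the demonstration:

```python
a = [1, 2, 3]

for n in a:
    del n                    # unbinds the name n; the list still holds the int

a_after_loop = list(a)

try:
    n                        # the last del left n unbound
    n_is_bound = True
except NameError:
    n_is_bound = False

del a[0]                     # this form targets a slot in the list, so the list shrinks
print(a_after_loop, n_is_bound, a)
```

The list is untouched by the loop, the name n no longer exists after it, and only the subscript form of del actually changes the list.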

Others have explained the idea of deleting a reference pretty well, I just wanted to touch on item deletion for completeness. Even though you use similar syntax, del a[1], each type will handle item deletion a little differently. As expected, deleting items from most containers just removes them, and some types do not support item deletion at all, tuples for example. Just as a fun exercise:
class A(object):
    def __delitem__(self, index):
        print 'I will NOT delete item {}!'.format(index)
a = A()
del a[3]
# I will NOT delete item 3!

Dolda2000's answer covers the main question very nicely, but there are three other issues here.
Using index(n) within a for n in a loop is almost always a bad idea.
It's incorrect:
a = [1, 2, 1, 2]
for n in a:
    print(a.index(n))
This will print 0, then 1, then 0 again, then 1 again. Why? Well, the third value is 1. a.index(1) is the index of the first 1 in the list. And that's 0, not 2.
It's also slow: To find the ith value, you have to check the first i elements in the list. This turns a simple linear (fast) algorithm into a quadratic (slow) one.
Fortunately, Python has a nice tool to do exactly what you want, enumerate:
for i, n in enumerate(a):
    print(i)
Or, if you don't need the values at all, just the indices:
for i in range(len(a)):
    print(i)
(This appears as a hint in the docs at least twice, conveniently buried in places no novice would ever look. But it's also spelled out early in the tutorial.)
Next, it looks like you were attempting to test for exactly the case Dolda2000 explained was happening, with this:
n is a[a.index(n)]
Why didn't that work? You proved that they are the same object, so why didn't deleting it do anything?
Unlike C-family languages, where variables are addresses where values (including references to other addresses) get stored, Python variables are names that you bind to values that exist on their own somewhere else. So variables can't reference other variables, but they can be names for the same value. The is expression tests whether two expressions name the same value. So, you proved that you had two names for the same value, you deleted one of those names, but the other name, and the value, are still there.
An example is worth 1000 words, so run this:
a = object() # this guarantees us a completely unique value
b = a
print a, b, id(a), id(b), a is b
del b
print a, id(a)
(Of course if you also del a, then at some point Python will delete the value, but you can't see that, because by definition you no longer have any names to look at it with.)
Clearly I'm allowed to delete from the list while iterating.
Well, sort of. Python leaves it undefined what happens when you mutate an iterable while iterating over it, but it does have special language that describes what happens for builtin mutable sequences (which means list) in the for docs, and for dicts (I can't remember where, but it says somewhere that it can't guarantee to raise a RuntimeError, which implies that it should raise a RuntimeError).
So, if you know that a is a list, rather than some subclass of list or third-party sequence class, you are allowed to delete from it while iterating, and to expect the "skipping" behavior if the elements you're deleting are at or to the left of the iterator. But it's pretty hard to think of a realistic good use for that knowledge.

This is what happens when you do
for n in a: del a[a.index(n)]
Try this:
a = [0, 1, 2, 3, 4]
for n in a: print(n, a); del a[a.index(n)]
This is what you get:
(0, [0, 1, 2, 3, 4])
(2, [1, 2, 3, 4])
(4, [1, 3, 4])
Thus the for loop keeps track of an index into the list, and you can think of it this way: every time the loop iterates, it moves on to the next position in the list as it currently is, and binds n to the item there.
In this case, n refers to a[0] the first time, a[1] the second time, and a[2] the third time, each taken from the already-shortened list. After that there is no next item in the list, so the iteration stops.
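If the goal is actually to remove items while looping, two common ways to avoid the skipping are to iterate over a copy or to rebuild the list in place. A sketch in Python 3 syntax with made-up list names; the first loop reproduces the skipping for comparison:

```python
a = list(range(10))
for n in a:
    a.remove(n)          # same effect as del a[a.index(n)]: later items shift left
skipped = list(a)        # every other element survives

b = list(range(10))
for n in b[:]:           # iterate over a copy, mutate the original
    b.remove(n)

c = list(range(10))
c[:] = [n for n in c if n % 2]   # filter in place, keeping the odd numbers

print(skipped, b, c)
```

Iterating over b[:] works because the loop's internal index walks the copy, which never changes length; the slice assignment to c[:] keeps the same list object while replacing its contents.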

How to make sense of this result?

I am new to Python. Here is a question I have about lists:
It is said that lists are mutable and tuples are immutable. But when I write the following:
L1 = [1, 2, 3]
L2 = (L1, L1)
L1[1] = 5
print L2
the result is
([1, 5, 3], [1, 5, 3])
instead of
([1, 2, 3], [1, 2, 3])
But L2 is a tuple and tuples are immutable. Why is it that when I change the value of L1, the value of L2 is also changed?

From the Python documentation (http://docs.python.org/reference/datamodel.html), note:
The value of an immutable container object that contains a reference to a mutable
object can change when the latter’s value is changed; however the container is
still considered immutable, because the collection of objects it contains cannot
be changed. So, immutability is not strictly the same as having an unchangeable
value, it is more subtle.

The tuple is immutable, but the list inside the tuple is mutable. You changed L1 (the list), not the tuple. The tuple contains two references to L1, so they both show the change, since they are actually the same list.
If an object is "immutable", that doesn't automatically mean everything it touches is also immutable. You can put mutable objects inside immutable objects, and that won't stop you from continuing to mutate the mutable objects.
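One way to convince yourself that the tuple really did not change is to compare the identities of its slots before and after. A small sketch in Python 3 syntax, reusing the question's L1 and L2; the other names are made up:

```python
L1 = [1, 2, 3]
L2 = (L1, L1)
slot_ids_before = [id(item) for item in L2]

L1[1] = 5                        # mutates the list, not the tuple

try:
    L2[0] = [9, 9, 9]            # rebinding a tuple slot is what immutability forbids
    slot_assignment_allowed = True
except TypeError:
    slot_assignment_allowed = False

slot_ids_after = [id(item) for item in L2]
print(L2, slot_assignment_allowed, slot_ids_before == slot_ids_after)
```

The tuple prints differently after the change, yet it holds exactly the same two references as before, and trying to replace one of them raises TypeError.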

The tuple didn't get modified; it still contains the same duplicate references to the list you gave it.
You modified a list (L1), not the tuple (or more precisely, not the reference to the list in the tuple).
For instance you would not have been able to do
L2[1] = 5
because tuples are immutable as you correctly state.
So the tuple wasn't changed, but the list that the tuple contained a reference to was modified (since both entries were references to the same list, both values in the output changed to 5). No value in the tuple was changed.
It may help if you think of reference as a "pointer" in this context.
EDIT (based on question by OP in comments below):
About references, lists and copies, maybe these examples will be helpful:
L=range(5)
s = (L, L[:]) # a reference to the original list and a copy
s
([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
then changing L[2]
L[2] = 'a'
gives:
s
([0, 1, 'a', 3, 4], [0, 1, 2, 3, 4]) # copy is not changed
Notice that the "2nd" list didn't change, since it contains a copy.
Now,
L=range(5)
we are creating two copies of the list and giving the references to the tuple
s = (L[:], L[:])
now
L[2] = 'a'
doesn't affect anything but the original list L
s
([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
Hope this is helpful.

You're right that tuples are immutable: L2 is an immutable tuple of two references to L1 (not, as it might first appear, a tuple of two lists), and L1 is not immutable. When you alter L1, you aren't altering L2, just the objects that L2 references.

Use deepcopy instead of = :
from copy import deepcopy
L2 = deepcopy(L1)

The tuple contains two references, each to the same list (not copies of the list, as you might have expected). Hence, changes in the list will still show up in the tuple (since the tuple contains only the references), but the tuple itself is not altered. Therefore, its immutability is not violated.

Tuples being immutable means only one thing: once you construct a tuple, it's impossible to modify it. Lists, on the other hand, can have elements added and removed. But both tuples and lists are concerned only with which elements they contain, not with what those elements are.
In Python, and this has nothing to do with tuples or lists, every value is stored in a container as a reference, whether it is a simple value like an int or a complex value like a list, a tuple, or any other class-type object. You only notice the difference with mutable values, because only those can change behind the reference.
If you were to convert your tuple to a set(), you'd get an error message that might surprise you, but given the above it should make sense:
>>> L=range(5)
>>> s = (L, L[:]) # a reference to the original list and a copy
>>> set((1, 2, s))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
As values of a set must never change once they are added to the set, set members must be hashable. A tuple's hash is computed from its elements, so any mutable (unhashable) value contained inside the immutable tuple s raises TypeError.
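The same check can be made directly with hash(), without going through set. A minimal sketch in Python 3 syntax; is_hashable is a made-up helper, not a standard library function:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


L = list(range(5))
with_list = (L, L[:])                  # tuple holding two lists
only_immutables = (1, 'two', (3, 4))   # tuple holding only hashable values

print(is_hashable(with_list), is_hashable(only_immutables))
members = {1, 2, only_immutables}      # fine: every member is hashable
```

A tuple is only as hashable as its contents, which is why the tuple of lists is rejected while the tuple of ints, strings, and tuples can be a set member.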
