When I loop over a list, the name that I give within the loop to the elements of the list apparently refers directly to each element in turn, as evidenced by:
>>> a = [1, 2, 3]
>>> for n in a:
...     print n is a[a.index(n)]
...
True
True
True
So why doesn't this seem to do anything?
>>> for n in a: del n
>>> a
[1, 2, 3]
If I try del a[a.index(n)], I get wonky behavior, but at least it's behavior I can understand - every time I delete an element, I shorten the list, changing the indices of the other elements, so I end up deleting every other element of the list:
>>> a = range(10)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> for n in a: del a[a.index(n)]
>>> a
[1, 3, 5, 7, 9]
Clearly I'm allowed to delete from the list while iterating. So what's going on when I try to del n inside the loop? Is anything being deleted?
Inside the block of any for n in X statement, n refers to the variable named n itself, not to any notion of "the place in the list of the last value you iterated over". Therefore, what your loop is doing is repeatedly binding a variable n to a value fetched from the list and then immediately unbinding that same variable again. The del n statement only affects your local variable bindings, rather than the list.
Because you're doing this:
some_reference = a[0]
del some_reference
#do you expect this to delete a[0]? it doesn't.
You're operating on a variable bound to the value of a[0] (then one bound to a[1], then...). You can delete it, but it won't do anything to a.
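A minimal sketch of the point made above, written for Python 3 (the flag name n_is_bound is only for illustration): del n removes the name, and the list keeps its own references to the values.

```python
a = [1, 2, 3]

for n in a:
    del n  # unbinds the name n only; the list still holds its own reference

print(a)  # [1, 2, 3]

# The last iteration unbound n, so the name no longer exists after the loop.
try:
    n
except NameError:
    n_is_bound = False
else:
    n_is_bound = True

print(n_is_bound)  # False
```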
Others have explained the idea of deleting a reference pretty well; I just wanted to touch on item deletion for completeness. Even though you use similar syntax, del a[1], each type will handle item deletion a little differently. As expected, deleting items from most containers just removes them, and some types, such as tuples, do not support item deletion at all. Just as a fun exercise:
class A(object):
    def __delitem__(self, index):
        print 'I will NOT delete item {}!'.format(index)

a = A()
del a[3]
# I will NOT delete item 3!
Dolda2000's answer covers the main question very nicely, but there are three other issues here.
Using index(n) within a for n in a loop is almost always a bad idea.
It's incorrect:
a = [1, 2, 1, 2]
for n in a:
    print(a.index(n))
This will print 0, then 1, then 0 again. Why? Well, the third value is 1. a.index(1) is the index of the first 1 in the list. And that's 0, not 2.
It's also slow: To find the ith value, you have to check the first i elements in the list. This turns a simple linear (fast) algorithm into a quadratic (slow) one.
Fortunately, Python has a nice tool to do exactly what you want, enumerate:
for i, n in enumerate(a):
    print(i)
Or, if you don't need the values at all, just the indices:
for i in range(len(a)):
    print(i)
(This appears as a hint in the docs at least twice, conveniently buried in places no novice would ever look. But it's also spelled out early in the tutorial.)
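To make the difference concrete, here is a small sketch (the variable names via_index and via_enumerate are made up for this example) that compares the positions reported by index with those reported by enumerate on a list containing duplicates:

```python
a = [1, 2, 1, 2]

# a.index(n) always finds the first match, so duplicates report the wrong position.
via_index = [a.index(n) for n in a]

# enumerate yields the real position alongside each value.
via_enumerate = [i for i, n in enumerate(a)]

print(via_index)      # [0, 1, 0, 1]
print(via_enumerate)  # [0, 1, 2, 3]
```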
Next, it looks like you were attempting to test for exactly the case Dolda2000 explained was happening, with this:
n is a[a.index(n)]
Why didn't that work? You proved that they are the same object, so why didn't deleting it do anything?
Unlike C-family languages, where variables are addresses where values (including references to other addresses) get stored, Python variables are names that you bind to values that exist on their own somewhere else. So variables can't reference other variables, but they can be names for the same value. The is expression tests whether two expressions name the same value. So, you proved that you had two names for the same value, you deleted one of those names, but the other name, and the value, are still there.
An example is worth 1000 words, so run this:
a = object() # this guarantees us a completely unique value
b = a
print a, b, id(a), id(b), a is b
del b
print a, id(a)
(Of course if you also del a, then at some point Python will delete the value, but you can't see that, because by definition you no longer have any names to look at it with.)
Clearly I'm allowed to delete from the list while iterating.
Well, sort of. Python leaves it undefined what happens when you mutate an iterable while iterating over it, but it does have special language that describes what happens for builtin mutable sequences (which means list) in the for docs, and for dicts (I can't remember where, but it says somewhere that it can't guarantee to raise a RuntimeError, which implies that it should raise a RuntimeError).
So, if you know that a is a list, rather than some subclass of list or third-party sequence class, you are allowed to delete from it while iterating, and to expect the "skipping" behavior if the elements you're deleting are at or to the left of the iterator. But it's pretty hard to think of a realistic good use for that knowledge.
This is what happens when you do
for n in a: del a[a.index(n)]
Try this:
a = [0, 1, 2, 3, 4]
for n in a: print(n, a); del a[a.index(n)]
This is what you get:
(0, [0, 1, 2, 3, 4])
(2, [1, 2, 3, 4])
(4, [1, 3, 4])
Thus n effectively tracks a position in the list, and you can think of it this way: every time the loop iterates, the iterator moves on to the next index of the (now shorter) list, and n is bound to whatever value sits there.
In this case, n refers to a[0] the first time, a[1] the second time, and a[2] the third time. After that there is no next item in the list, so the iteration stops.
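The skipping described above can be reproduced, and avoided, in a few lines. This is only a sketch for Python 3. The two workarounds (iterating over a snapshot, and deleting by index from the right) are common idioms rather than anything taken from the answers here.

```python
# Deleting while iterating: the iterator's index advances while the list shrinks.
a = list(range(10))
for n in a:
    del a[a.index(n)]
print(a)  # [1, 3, 5, 7, 9]

# Workaround 1: iterate over a snapshot, mutate the original.
b = list(range(10))
for n in list(b):
    b.remove(n)
print(b)  # []

# Workaround 2: walk the indices from the right, so a deletion never
# shifts an item that has not been visited yet.
c = list(range(10))
for i in reversed(range(len(c))):
    if c[i] % 2 == 0:
        del c[i]
print(c)  # [1, 3, 5, 7, 9]
```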
Case A:
list1=[0, 1, 2, 3]
list2=list1
list1=list1+[4]
print(list1)
print(list2)
Output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
(This non-mutating behavior also happens when concatenating a list of more than a single entry, and when 'multiplying' the list, e.g. list1=list1*2, in fact any type of "re-assignment" that performs an operation with an infix operator to the list and then assigns the result of that operation to the same list name using "=" )
In this case the original list object that list1 pointed to has not been altered in memory and list2 still points to it, another object has simply been created in memory for the result of the concatenation that list1 now points to (there are now two distinct, different list objects in memory)
Case B:
list1=[0, 1, 2, 3]
list2=list1
list1.append(4)
print(list1)
print(list2)
Output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
Case C:
list1=[0, 1, 2, 3]
list2=list1
list1[-1]="foo"
print(list1)
print(list2)
Outputs:
[0, 1, 2, 'foo']
[0, 1, 2, 'foo']
In case B and C the original list object that list1 points to has mutated, list2 still points to that same object, and as a result the value of list2 has changed. (there is still one single list object in memory and it has mutated).
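One way to check the "one object or two" claim for each case is the is operator. The following is a small sketch with the three cases wrapped in functions (the function names are invented for this example):

```python
def case_a():
    list1 = [0, 1, 2, 3]
    list2 = list1
    list1 = list1 + [4]      # builds a new list and rebinds the name list1
    return list1 is list2, list2

def case_b():
    list1 = [0, 1, 2, 3]
    list2 = list1
    list1.append(4)          # mutates the one shared list
    return list1 is list2, list2

def case_c():
    list1 = [0, 1, 2, 3]
    list2 = list1
    list1[-1] = "foo"        # item assignment also mutates the shared list
    return list1 is list2, list2

print(case_a())  # (False, [0, 1, 2, 3])
print(case_b())  # (True, [0, 1, 2, 3, 4])
print(case_c())  # (True, [0, 1, 2, 'foo'])
```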
This behavior seems inconsistent to me as a noob. Is there a good reason/utility for this?
EDIT :
I changed the list variables names from "list" and "list_copy" to "list1" and "list2" as this was clearly a very poor and confusing choice of names.
I chose Kabanus' answer as I liked how he pointed out that mutating operations are always(?) explicit in python.
In fact a short and simple answer to my question can be done summarizing Kabanus' answer into two of his statements :
-"In python mutating operations are explicit"
-"The addition[or multiplication] operator [performed on list objects] creates a new object, and doesn't change x[the list object] implicitly."
I could also add:
-"Every time you use square brackets to describe a list, that creates a new list object"
[this is from : http://www-inst.eecs.berkeley.edu/~selfpace/cs9honline/Q2/mutation.html , great explanations there on this topic]
Also I realized after Kabanus' answer how careful one must be tracking mutations in a program :
l=[1,2]
z=l
l+=[3]
z=z+[3]
and
l=[1,2]
z=l
z=z+[3]
l+=[3]
Will yield completely different values for z. This must be a frequent source of errors isn't it?
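For reference, here are the two orderings above wrapped in functions so the results can be compared side by side (the function names are hypothetical):

```python
def plus_equals_first():
    l = [1, 2]
    z = l
    l += [3]       # mutates the shared list, so z sees [1, 2, 3] too
    z = z + [3]    # new list built from the already extended one
    return l, z

def concat_first():
    l = [1, 2]
    z = l
    z = z + [3]    # z now names a new list; l is still [1, 2]
    l += [3]       # extends only the original list
    return l, z

print(plus_equals_first())  # ([1, 2, 3], [1, 2, 3, 3])
print(concat_first())       # ([1, 2, 3], [1, 2, 3])
```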
I'm only in the beginning of my learning and haven't delved deeply into OOP concepts just yet, but I think I'm already starting to understand what the fuss around functional paradigm is about...
l += [4]
has the same effect as:
l.append(4)
(strictly speaking it behaves like l.extend([4])): it mutates the list in place and won't create a copy.
l = l + [4]
is an assignment to l, it first evaluates the right side of the assignment, then assigns the resulting object to name l. There is no way for this operation to be mutating l.
Update: I guess I haven't made myself clear enough. Of course operations on the RHS of the assignment may involve mutating the object that is the current value of LHS; but finally, the result of computing RHS is assigned to the LHS, thus overwriting any previous mutations. Example:
def increment_first(x):
    x[0] += 1
    return []

l = [ 1 ]
l = increment_first(l)
While the call to increment_first will increment l[0] as its side effect, the mutated list object will be lost anyway as soon as the value of RHS (in this case - an empty list) is assigned to l.
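The difference between the two forms can be observed with a second name for the same list. This is a sketch; alias and m are names introduced only for this illustration.

```python
l = [1, 2, 3]
alias = l

l += [4]             # in place: the existing list object is extended
print(l is alias)    # True
print(alias)         # [1, 2, 3, 4]

l = l + [5]          # builds a new list, then rebinds the name l
print(l is alias)    # False
print(alias)         # [1, 2, 3, 4]
print(l)             # [1, 2, 3, 4, 5]

# Like extend, += accepts any iterable, whereas list + tuple raises TypeError.
m = [1]
m += (2, 3)
print(m)             # [1, 2, 3]
```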
This is by design. The point is python does not like non explicit side effects. Suppose this valid line in your file:
x=[1,2]
print(x+[3,4])
Note there is no assignment, but it's still a valid line. Do you expect x to have changed after that second line? For me it doesn't make sense.
That's what you're seeing - the addition operator creates a new object, and doesn't change x implicitly. If you feel it should, then what about:
[3,4]+x
Of course, addition does not change behavior in assignment, to avoid confusion.
In python mutating operations are explicit:
x+=[3,4]
Or your example:
x[0]=1
Here you are explicitly asking to change a cell, i.e. explicit mutation. These things are consistent - an operation is always a mutation or it isn't, and it won't be both. Usually it makes sense as well, such as concatenating lists on the fly.
I would like to ask what the following does in Python.
It was taken from http://danieljlewis.org/files/2010/06/Jenks.pdf
I have entered comments telling what I think is happening there.
# Seems to be a function that returns a float vector
# dataList seems to be a vector of float.
# numClass seems to be an int
def getJenksBreaks( dataList, numClass ):
    # dataList seems to be a vector of float. "Sort" seems to sort it ascendingly
    dataList.sort()
    # create a 1-dimensional vector
    mat1 = []
    # "in range" seems to be something like "for i = 0 to len(dataList)+1"
    for i in range(0,len(dataList)+1):
        # create a 1-dimensional-vector?
        temp = []
        for j in range(0,numClass+1):
            # append a zero to the vector?
            temp.append(0)
        # append the vector to a vector??
        mat1.append(temp)
(...)
I am a little confused because in the pdf there are no explicit variable declarations. However I think and hope I could guess the variables.
Yes, the method append() adds elements to the end of the list. I think your interpretation of the code is correct.
But note the following:
x =[1,2,3,4]
x.append(5)
print(x)
[1, 2, 3, 4, 5]
while
x.append([6,7])
print(x)
[1, 2, 3, 4, 5, [6, 7]]
If you want something like
[1, 2, 3, 4, 5, 6, 7]
you may use extend()
x.extend([6,7])
print(x)
[1, 2, 3, 4, 5, 6, 7]
Python doesn't have explicit variable declarations. It's dynamically typed, variables are whatever type they get assigned to.
Your assessment of the code is pretty much correct.
One detail: The range function goes up to, but does not include, its second argument. So the +1 in the second argument to range causes the last iterated values to be len(dataList) and numClass, respectively. Because range starts at zero, the loops perform a total of len(dataList) + 1 and numClass + 1 iterations, which looks suspicious.
Presumably dataList.sort() modifies the original value of dataList, which is the traditional behavior of the .sort() method.
It is indeed appending the new vector to the initial one, if you look at the full source code there are several blocks that continue to concatenate more vectors to mat1.
append is a list method used to add a value to the end of the list.
mat1 and temp together create a 2D array (e.g. [[], [], []]), i.e. an m x n matrix,
where m = len(dataList)+1 and n = numClass+1.
The resulting matrix is a zero matrix, as all of its values are 0.
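As a sketch of what those nested loops build, the same zero matrix can be written as a list comprehension. The helper name zero_matrix is made up for this example and does not appear in the Jenks code. The second half shows a common pitfall that the loop version avoids.

```python
def zero_matrix(rows, cols):
    # One fresh inner list per row, as the nested loops in the question do.
    return [[0] * cols for _ in range(rows)]

mat1 = zero_matrix(3, 2)
mat1[0][0] = 9
print(mat1)     # [[9, 0], [0, 0], [0, 0]]

# Pitfall: multiplying the outer list repeats a reference to ONE inner list.
aliased = [[0] * 2] * 3
aliased[0][0] = 9
print(aliased)  # [[9, 0], [9, 0], [9, 0]]
```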
In Python, variables are implicitly declared. When you type this:
i = 1
i is set to a value of 1, which happens to be an integer. So we will talk of i as being an integer, although i is only a reference to an integer value. The consequence of that is that you don't need type declarations as in C++ or Java.
Your understanding is mostly correct, as for the comments. [] refers to a list. You can think of it as a linked-list (although its actual implementation is closer to std::vectors for instance).
As Python variables are only references to objects in general, lists are effectively lists of references, and can potentially hold any kind of values. This is valid Python:
# A vector of numbers
vect = [1.0, 2.0, 3.0, 4.0]
But this is perfectly valid code as well:
# The list of my objects:
list = [1, [2,"a"], True, 'foo', object()]
This list contains an integer, another list, a boolean... In Python, you usually rely on duck typing for your variable types, so this is not a problem.
Finally, one of the methods of list is sort, which sorts it in-place, as you correctly guessed, and the range function generates a range of numbers.
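A short sketch of the in-place behaviour of sort, next to the built-in sorted, which returns a new list instead. It is written for Python 3, and the sample values are arbitrary.

```python
data_list = [3.5, 1.0, 2.25]

ordered_copy = sorted(data_list)   # new list; data_list is untouched so far
print(data_list)      # [3.5, 1.0, 2.25]
print(ordered_copy)   # [1.0, 2.25, 3.5]

result = data_list.sort()          # sorts in place and returns None
print(result)         # None
print(data_list)      # [1.0, 2.25, 3.5]

# range stops before its second argument, so the +1 yields len(data_list) + 1 values.
indices = list(range(0, len(data_list) + 1))
print(indices)        # [0, 1, 2, 3]
```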
The syntax for x in L: ... iterates over the content of L (assuming it is iterable) and sets the variable x to each of the successive values in that context. For example:
>>> for x in ['a', 'b', 'c']:
...     print x
...
a
b
c
Since range generates a range of numbers, this is effectively the idiomatic way to generate a for i = 0; i < N; i += 1 type of loop:
>>> for i in range(4): # range(4) == [0,1,2,3]
...     print i
...
0
1
2
3
Tried deleting items in a list, no success.
>>> r = [1,2,3,4,5]
>>> for i in r:
...     if i<3:
...         del i
...
>>> print r
[1, 2, 3, 4, 5]
I even tried filtering it,
>>> def f(i):
...     True if i>2 else False
...
>>> print list(filter(f,r))
[]
I do not understand why the first one is not working. And I dont understand the result at all, when I use filter(function,iterable).
EDIT:
Seeing Paulo's comment below, now I do not understand why this works.
>>> for i in r:
...     if i<3:
...         r.remove(i)
...
>>> print r
[3, 4, 5]
Shouldn't the iterator problem be still there, and shouldn't the code end up removing only the first element (r[0])
Use a list comprehension instead:
[i for i in r if i >= 3]
and retain instead of delete.
Your filter never returned the test; so you always return None instead and that's false in a boolean context. The following works just fine:
def f(i):
return i > 2
Your initial attempt failed because del i unbinds i, but the list remains unaffected. Only the local name i is cleared.
If you want to delete an item from a list, you need to delete the index:
del r[0]
deletes the first element from the list.
Even if you did manage to delete indices, the loop would have held some surprises:
>>> for i, element in enumerate(r):
... if element < 3:
... del r[i]
...
>>> r
[2, 3, 4, 5]
This version fails because the list iterator used by the for loop doesn't know you deleted elements from the list; deleting the value at index 0 shifts up the rest of the list, but the loop iterator looks at item 1 regardless:
first iteration, r = [1, 2, 3, 4, 5], iterator index 0 -> element = 1
second iteration, r = [2, 3, 4, 5], iterator index 1 -> element = 3
I do not understand why the first one is not working.
It is not working because the statement del i undefines the variable i - that is, it deletes it from the scope (global or local) which contains it.
And I dont understand the result at all, when I use filter(function,iterable)
Your function, f does not contain a return statement. Accordingly, it always returns None, which has the boolean equivalent value of False. Thus, filter excludes all values.
What you should probably be doing is filtering using a comprehension, and replacing the list, like so:
r = [i for i in r if i >= 3]
Or, if you really do want to delete part of the original list and modify it, use del on a slice of the list (here the first two elements, 1 and 2):
del r[:2]
Seeing Paulo's comment below, now I do not understand why [using remove] works.
Because remove(i) searches for the value i in the list, and deletes the first instance of it. Accordingly, repeated modification of the list does not affect the search that happens inside remove. However, note that the loop is still susceptible to the same error: removing an item can cause the next item to be skipped in the iteration of the list.
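A quick check of that caveat, written for Python 3 and starting from a fresh list each time. In this sketch the loop with remove skips the value 2, so it does not produce [3, 4, 5]. The slice assignment is one possible in-place alternative and is not something proposed in the answers.

```python
r = [1, 2, 3, 4, 5]
for i in r:
    if i < 3:
        r.remove(i)   # removing 1 shifts 2 into index 0, which was already visited
print(r)  # [2, 3, 4, 5]

# Slice assignment replaces the contents of the same list object.
r = [1, 2, 3, 4, 5]
same_list = r
r[:] = [i for i in r if i >= 3]
print(r)               # [3, 4, 5]
print(r is same_list)  # True
```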
If I have the following Python code
>>> x = []
>>> x = x + [1]
>>> x = x + [2]
>>> x = x + [3]
>>> x
[1, 2, 3]
Will x be guaranteed to always be [1,2,3], or are other orderings of the interim elements possible?
Yes, the order of elements in a python list is persistent.
In short, yes, the order is preserved. In long:
In general the following definitions will always apply to objects like lists:
A list is a collection of elements that can contain duplicate elements and has a defined order that generally does not change unless explicitly made to do so. stacks and queues are both types of lists that provide specific (often limited) behavior for adding and removing elements (stacks being LIFO, queues being FIFO). Lists are practical representations of, well, lists of things. A string can be thought of as a list of characters, as the order is important ("abc" != "bca") and duplicates in the content of the string are certainly permitted ("aaa" can exist and != "a").
A set is a collection of elements that cannot contain duplicates and has a non-definite order that may or may not change over time. Sets do not represent lists of things so much as they describe the extent of a certain selection of things. The internal structure of set, how its elements are stored relative to each other, is usually not meant to convey useful information. In some implementations, sets are always internally sorted; in others the ordering is simply undefined (usually depending on a hash function).
Collection is a generic term referring to any object used to store a (usually variable) number of other objects. Both lists and sets are a type of collection. Tuples and Arrays are normally not considered to be collections. Some languages consider maps (containers that describe associations between different objects) to be a type of collection as well.
This naming scheme holds true for all programming languages that I know of, including Python, C++, Java, C#, and Lisp (in which lists not keeping their order would be particularly catastrophic). If anyone knows of any where this is not the case, please just say so and I'll edit my answer. Note that specific implementations may use other names for these objects, such as vector in C++ and flex in ALGOL 68 (both lists; flex is technically just a re-sizable array).
If there is any confusion left in your case due to the specifics of how the + sign works here, just know that order is important for lists and unless there is very good reason to believe otherwise you can pretty much always safely assume that list operations preserve order. In this case, the + sign behaves much like it does for strings (which are really just lists of characters anyway): it takes the content of a list and places it behind the content of another.
If we have
list1 = [0, 1, 2, 3, 4]
list2 = [5, 6, 7, 8, 9]
Then
list1 + list2
Is the same as
[0, 1, 2, 3, 4] + [5, 6, 7, 8, 9]
Which evaluates to
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Much like
"abdcde" + "fghijk"
Produces
"abdcdefghijk"
You are confusing 'sets' and 'lists'. A set does not guarantee order, but lists do.
Sets are declared using curly brackets: {} (with at least one element inside; an empty {} creates a dict, so use set() for an empty set). In contrast, lists are declared using square brackets: [].
mySet = {a, b, c, c}
Does not guarantee order, but list does:
myList = [a, b, c]
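A small sketch of the practical difference, using string elements in place of the undefined names a, b, c above (an assumption made so the example runs):

```python
my_list = ["a", "b", "c", "c"]
my_set = {"a", "b", "c", "c"}

print(len(my_list))   # 4: duplicates are kept, in insertion order
print(len(my_set))    # 3: the duplicate "c" is dropped
print(my_list[0])     # a

# Sets have no positions, so indexing is not supported.
try:
    my_set[0]
except TypeError:
    set_is_indexable = False
else:
    set_is_indexable = True
print(set_is_indexable)  # False
```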
I suppose one thing that may be concerning you is whether or not the entries could change, so that the 2 becomes a different number, for instance. You can put your mind at ease here, because in Python, integers are immutable, meaning they cannot change after they are created.
Not everything in Python is immutable, though. For example, lists are mutable---they can change after being created. So for example, if you had a list of lists
>>> a = [[1], [2], [3]]
>>> a[0].append(7)
>>> a
[[1, 7], [2], [3]]
Here, I changed the first entry of a (I added 7 to it). One could imagine shuffling things around, and getting unexpected things here if you are not careful (and indeed, this does happen to everyone when they start programming in Python in some way or another; just search this site for "modifying a list while looping through it" to see dozens of examples).
It's also worth pointing out that x = x + [a] and x.append(a) are not the same thing. The second one mutates x, and the first one creates a new list and assigns it to x. To see the difference, try setting y = x before adding anything to x and trying each one, and look at the difference the two make to y.
Yes the list will remain as [1,2,3] unless you perform some other operation on it.
aList=[1,2,3]
i=0
for item in aList:
    if i<2:
        aList.remove(item)
    i+=1
aList
[2]
The moral is that modifying a list in a loop driven by that same list takes two steps: mark the items first, then delete them afterwards:
aList=[1,2,3]
i=0
for item in aList:
    if i<2:
        aList[i]="del"
    i+=1
aList
['del', 'del', 3]
for i in range(2):
    del aList[0]
aList
[3]
Yes, lists and tuples are always ordered, while dictionaries are not (at least before Python 3.7; since 3.7, dicts preserve insertion order).
Today I spent about 20 minutes trying to figure out why
this worked as expected:
users_stories_dict[a] = s + [b]
but this would have a None value:
users_stories_dict[a] = s.append(b)
Anyone know why the append function does not return the new list? I'm looking for some sort of sensible reason this decision was made; it looks like a Python novice gotcha to me right now.
append works by actually modifying a list, and so all the magic is in side-effects. Accordingly, the result returned by append is None. In other words, what one wants is:
s.append(b)
and then:
users_stories_dict[a] = s
But, you've already figured that much out. As to why it was done this way, while I don't really know, my guess is that it might have something to do with a 0 (or false) exit value indicating that an operation proceeded normally, and by returning None for functions whose role is to modify their arguments in-place you report that the modification succeeded.
But I agree that it would be nice if it returned the modified list back. At least, Python's behavior is consistent across all such functions.
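To illustrate, here is a sketch with made-up keys and story names. The first assignment reproduces the None gotcha, and the other two are possible ways around it (setdefault is a standard dict method, though nobody in this thread suggested it).

```python
users_stories_dict = {}
s = ["story1"]
b = "story2"

# append mutates s and returns None, so None is what gets stored.
users_stories_dict["alice"] = s.append(b)
print(users_stories_dict["alice"])  # None
print(s)                            # ['story1', 'story2']

# Option 1: mutate first, then store the list itself.
t = ["story1"]
t.append(b)
users_stories_dict["bob"] = t

# Option 2: let setdefault create the list on first use, then append to it.
users_stories_dict.setdefault("carol", []).append(b)

print(users_stories_dict["bob"])    # ['story1', 'story2']
print(users_stories_dict["carol"])  # ['story2']
```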
The append() method returns None because it modifies the list itself, adding the appended object as an element, while the + operator concatenates the two lists and returns the resulting list.
eg:
a = [1,2,3,4,5]
b = [6,7,8,9,0]
print a+b # returns a list made by concatenating the lists a and b
[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
print a.append(b) # Adds the list b as element at the end of the list a and returns None
None
print a # the list a was modified during the last append call and has the list b as last element
[1, 2, 3, 4, 5, [6, 7, 8, 9, 0]]
So, as you can see, the easiest way is just to add the two lists together: even if you append the list b to a using append(), you will not get the result you want without additional work.