Python for loop skipping every other loop? [duplicate] - python

This question already has answers here:
Strange result when removing item from a list while iterating over it
(8 answers)
Closed 4 months ago.
I have a weird problem. Does anyone see anything wrong with my code?
for x in questions:
    forms.append((SectionForm(request.POST, prefix=str(x.id)),x))
    print "Appended " + str(x)
for (form, question) in forms:
    print "Testing " + str(question)
    if form.is_valid():
        forms.remove((form,question))
        print "Deleted " + str(question)
        a = form.save(commit=False)
        a.audit = audit
        a.save()
    else:
        flag_error = True
Results in:
Appended Question 50
Appended Question 51
Appended Question 52
Testing Question 50
Deleted Question 50
Testing Question 52
Deleted Question 52
It seems to skip question 51. It gets appended to the list, but the for loop skips it. Any ideas?

You are modifying the contents of the object forms that you are iterating over, when you say:
forms.remove((form,question))
According to the Python documentation of the for statement, this is not safe (the emphasis is mine):
The for statement in Python differs a bit from what you may be used to in C or Pascal. Rather than always iterating over an arithmetic progression of numbers (like in Pascal), or giving the user the ability to define both the iteration step and halting condition (as C), Python’s for statement iterates over the items of any sequence (a list or a string), in the order that they appear in the sequence.
It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy. The slice notation makes this particularly convenient:
>>> for x in a[:]: # make a slice copy of the entire list
...     if len(x) > 6: a.insert(0, x)
See also this paragraph from the Python Language Reference which explains exactly what is going on:
There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop.
There are a lot of solutions. You can follow their advice and create a copy. Another possibility is to create a new list as a result of your second for loop, instead of modifying forms directly. The choice is up to you...
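A minimal sketch of both options, written in Python 3. Plain tuples stand in for the (form, question) pairs, and the boolean `is_valid` is a hypothetical stand-in for calling form.is_valid(); neither comes from the original code.

```python
# Hypothetical stand-in data: (is_valid, question_label) pairs instead of real forms.
forms = [(True, "Question 50"), (True, "Question 51"), (False, "Question 52")]

# Option 1: iterate over a slice copy, so removing from the real list
# cannot shift the positions the loop is walking over.
remaining = list(forms)
for is_valid, question in remaining[:]:
    if is_valid:
        remaining.remove((is_valid, question))

# Option 2: build a new list of the entries to keep instead of mutating in place.
invalid_only = [(is_valid, question) for is_valid, question in forms if not is_valid]

print(remaining)
print(invalid_only)
```

With either option, Question 51 is tested like the others, because the sequence being looped over never changes underneath the loop.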

You are removing objects from forms while iterating over it. This is documented to produce exactly the behaviour you are seeing (http://docs.python.org/reference/compound_stmts.html#the-for-statement).
The solution is to either iterate over a copy of that list, or add the forms for removal to a separate collection, then perform the removal afterwards.
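The second suggestion (collect first, remove afterwards) might look like the sketch below. The data and the "even numbers are removed" rule are made up purely for illustration.

```python
numbers = [50, 51, 52, 53]   # hypothetical data

# Pass 1: only read the list, and remember what should go.
to_remove = []
for n in numbers:
    if n % 2 == 0:           # illustrative condition, not from the question
        to_remove.append(n)

# Pass 2: mutate the list now that nothing is iterating over it.
for n in to_remove:
    numbers.remove(n)

print(numbers)
```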

Using the remove method on forms (which I assume is a list) changes the size of the list. Think of it this way:
[50, 51, 52]
is your initial list, and you ask for the first item. You then remove that item from the list, so it looks like
[51, 52]
But now you ask for the second item, so you get 52.
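The same steps can be replayed by hand with iter() and next(), which is roughly what a for loop does behind the scenes. This is a sketch in Python 3, relying on CPython's list iterator behaviour.

```python
items = [50, 51, 52]
it = iter(items)

first = next(it)       # the iterator hands out position 0
items.remove(first)    # the list shrinks; 51 slides into position 0

second = next(it)      # the iterator now asks for position 1
leftover = list(it)    # nothing is left at position 2

print(first, second, leftover, items)
```

51 is still in the list afterwards; it was simply never handed to the loop.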

Related

why isn't len returning correct values?

def dbl_linear(n):
    u=[1]
    i=0
    for a in u:
        u.append((2*a+1))
        u.append((3*a+1))
        u=set(u)
        u=list(u)
        if len(u)>=n:
            print(len(u))
            break
    return len(u)
I want this code to return n elements in list u, but that isn't happening. Can someone help? I gave input n=20. len(u) comes out as 15 or 7, with different answers on every run.
Modifying an object you're iterating over is basically undefined behaviour: you can't assume whether the iteration will or will not take the new items into account, especially in the face of resizing (list append is O(1) amortized, because it's O(1) on reserved space, but the list regularly needs to reallocate the entire thing to make more room for new elements). Not to mention that here you're only modifying the initial list during the first iteration; afterwards you're updating a different, unrelated list, because u = list(u) rebinds u to a new object while the loop keeps iterating over the original one.
There's no reason to even use for a in u; just use an infinite loop. You should probably also remember the last element, as your uniquification via set will scramble the list. Alternatively, just check before inserting whether the element is already present: in is O(n) on a list, but so are set(a) and list(a).
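One way to follow that advice, sketched in Python 3. The name `grow_until` is hypothetical, and the sketch assumes "n elements" means "at least n unique values"; it walks the list by index and uses a set for the membership check, which is a swapped-in choice rather than the O(n) `in` on the list mentioned above.

```python
def grow_until(n):
    """Grow u from [1] with 2*a+1 and 3*a+1 until it holds at least n unique values."""
    u = [1]
    seen = {1}          # O(1) membership checks, keeps u free of duplicates
    i = 0               # position of the next element to expand
    while len(u) < n:
        a = u[i]
        for candidate in (2 * a + 1, 3 * a + 1):
            if candidate not in seen:
                seen.add(candidate)
                u.append(candidate)
        i += 1
    return u


result = grow_until(20)
print(len(result), result[:5])
```

Because u is never rebound or reordered, the index i always refers to the same list, and the insertion order is preserved.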

How to efficiently manage a list of elements that can either have one of its elements removed or swapped with its next one?

I have to build a program having two inputs (eventList, a list composed of strings that hold the type of operation and the id of the element that will undergo it, and idList, a list composed of ints, each one being the id of the element).
The two possible events are the deletion of the corresponding id, or having the id swap its position in the idList with the following one (i.e. if the selected id is located in idList[2], it will swap value with idList[3]).
It has to pass strict tests with a set timeout and has to use dictionaries.
This is for a programming assignment. I've already built this program, but I can't find a way to get a decent time and pass the tester's timeouts.
I've also tried using lists instead of dicts, but I still can't pass some timeouts because of the time it takes to use .pop() and .index(), and I've been told the only way to pass all of them is to use dicts.
How I currently handle swaps:
def overtake(dictElement, elementId):
    elementIndex = dictElement[elementId]
    overtakerId = dictSearchOvertaker(dictElement, elementIndex)
    dictElement[elementId], dictElement[overtakerId] = dictElement[overtakerId], dictElement[elementId]
    return dictElement
How I currently handle deletions:
def eliminate(dictElement, elementId):
    #elementIndex = dictElement[elementId]
    del dictElement[elementId]
    return dictUpdate(dictElement, elementId)
How I update the dictionary after an element is deleted:
def dictUpdate(dictElement, elementIndex):
    listedDict = dictElement.items()
    i = 0
    for item in listedDict:
        i += 1
        if item[1] > elementIndex:
            dictElement[item[0]] -= 1
    return dictElement
I'm expected to handle a list of 200k elements where every element gets deleted one by one in 1.5 seconds, but it takes me more than 5 minutes, and even longer for a test where I get an idList with 1500 elements and every element gets swapped with the following one until, in the end, idList is reversed.
One thing that strikes me about this problem is that you're given a single list of operations and expected to return the result of doing all of them. That means you don't necessarily need to do them all one by one, and can instead do operations in a single batch that would otherwise be individually time-consuming.
Swapping two items is O(1) as long as you already know where they are. That's where a dict would come in -- a dict can help you associate one piece of information with another in such a way that you can find it in O(1) time. In this case, you want a way to find the index of an item given its id.
Deleting an item from the middle of a Python list is O(N), even if you already know its index, because internally it's an array and you need to shift everything over to take up the empty space every time you delete something that's not at the end. A naive solution is going to therefore be O(K*N), which is probably the thing the assignment is trying to get you to avoid. But nothing in the problem requires that you actually delete each item from the list one by one, just that the final result you return does not contain those items.
So, my approach would be:
1. Build a dict of id -> index. (This is just a single O(n) iteration over the list.)
2. Create an empty set to track deletions.
3. For each operation:
   - If it's a swap:
     - If the id is in your set, raise an exception.
     - Use your dict to find the indices of the two ids.
     - Swap the two items in the list.
     - Update your dict so it continues to match the list.
   - If it's a delete:
     - Add the id to your set.
4. Create a new list to return as the result.
5. For each item in the original list:
   - Check to see if it's in your set.
   - If it's in the set, skip it (it got deleted).
   - If not, append it to the result.
6. Return the result.
Where N is the list size and K is the number of operations, this ends up being O(N+K), because you iterated over the entire list of IDs exactly twice, and the entire list of operations exactly once, and everything you did inside those iterations was O(1).
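The steps above could be sketched as follows in Python 3. Several things here are assumptions, not part of the assignment: the function name `apply_events`, events given as (kind, id) tuples with kind "swap" or "delete", and unique ids. The sketch also deviates from the steps in one place: when the slot right after an id holds an already-deleted id, it scans forward to the next live one, which keeps the result correct but makes that particular swap slower than O(1).

```python
def apply_events(id_list, events):
    """Apply ("swap", id) / ("delete", id) events lazily and return the final list."""
    ids = list(id_list)                              # work on a copy
    index_of = {v: i for i, v in enumerate(ids)}     # id -> current index
    deleted = set()                                  # ids removed so far

    for kind, element_id in events:
        if element_id in deleted:
            raise ValueError(f"id {element_id} was already deleted")
        if kind == "delete":
            deleted.add(element_id)                  # O(1), no list shifting
        elif kind == "swap":
            i = index_of[element_id]
            j = i + 1
            while j < len(ids) and ids[j] in deleted:
                j += 1                               # skip slots of deleted ids
            if j == len(ids):
                continue                             # assumption: no live successor, nothing to do
            other = ids[j]
            ids[i], ids[j] = other, element_id
            index_of[element_id], index_of[other] = j, i
        else:
            raise ValueError(f"unknown event kind: {kind!r}")

    # Single pass at the end to drop everything that was deleted.
    return [v for v in ids if v not in deleted]


print(apply_events([1, 2, 3, 4], [("swap", 1), ("delete", 3), ("swap", 1)]))
```

The deletions are never applied to the list itself until the final filtering pass, which is what avoids the O(N) shift per delete.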

Duplicate element being added through for loop

NOTE: I do not want to use del
I am trying to understand algorithms better which is why I want to avoid the built-in del statement.
I am populating a list of 10 randomly generated numbers. I then am trying to remove an item from the list by index, using a for loop:
if remove_index < lst_size:
    for value in range(remove_index, lst_size-1):
        lst[value] = lst[value+1]
    lst_size -= 1
Everything works fine, except that the loop is adding the last item twice. Meaning, if the 8th item has the value 4, it will add a 9th item also valued 4. I am not sure why it is doing this. I still am able to move the value at the selected index (while moving everything up), but it adds on the duplicate.
Nothing is being added to your list. It starts out with lst_size elements, and, since you don't delete any, it retains the same number by the time you're done.
If, once you've copied all the items from remove_index onwards to the previous index in the list, you want to remove the last item, then you can do so either using del or lst.pop().
At the risk of sounding flippant, this is a general rule: if you want to do something, you have to do it. Saying "I don't want to use del" won't alter that fact.
Merely decrementing lst_size will have no effect on the list - while you may be using it to store the size of your list, they are not connected, and changing one has no effect on the other.
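A minimal sketch of the shift-then-shrink idea in Python 3, using lst.pop() for the final step. The function name `remove_at` is hypothetical.

```python
def remove_at(lst, remove_index):
    """Remove the item at remove_index by shifting later items left, then dropping the tail."""
    if 0 <= remove_index < len(lst):
        for i in range(remove_index, len(lst) - 1):
            lst[i] = lst[i + 1]     # copy each later item one slot to the left
        lst.pop()                   # actually shrink the list: drops the duplicated last item
    return lst


print(remove_at([5, 6, 7, 4], 1))
```

Without the pop(), the list would keep its original length and the last value would appear twice, which is the symptom described in the question.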

In Python, how to iterate more than once for an iterative object? [duplicate]

This question already has answers here:
Why can't I iterate twice over the same iterator? How can I "reset" the iterator or reuse the data?
(5 answers)
Closed 4 years ago.
I came across some code that gets back an iterable object from the Dynamo database, and I can do:
print [en["student_id"] for en in enrollments]
However, when I do similar things again:
print [en["course_id"] for en in enrollments]
Then the second iteration prints nothing, because the iterable can only be iterated over once and has already reached its end.
The question is: how can we iterate over it more than once, (1) when it is known to contain only a few items, and (2) when we know it will contain lots of items (say a million) and we don't want to spend a lot of additional memory?
Relatedly, I looked up rewind, and it seems to exist for PHP and Ruby, but not for Python?
enrollments is a generator. Either recreate the generator if you need to iterate again, or convert it to a list first:
enrollments = list(enrollments)
Take into account that APIs often use generators to avoid memory bloat; a list must have references to all objects it contains, so all those objects have to exist at the same time. A generator can produce the elements one by one, as needed; your list comprehension discards those objects again once the 'student_id' key has been extracted.
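A small Python 3 sketch of the exhaustion and of the list() fix. The generator function `fetch_enrollments` is a hypothetical stand-in for whatever call returns the real enrollments.

```python
def fetch_enrollments():
    """Hypothetical stand-in for the database call; yields dicts lazily."""
    yield {"student_id": 1, "course_id": "math"}
    yield {"student_id": 2, "course_id": "physics"}


enrollments = fetch_enrollments()
first_pass = [en["student_id"] for en in enrollments]
second_pass = [en["course_id"] for en in enrollments]    # generator is already exhausted

enrollments = list(fetch_enrollments())                  # materialize once
student_ids = [en["student_id"] for en in enrollments]
course_ids = [en["course_id"] for en in enrollments]     # a list can be re-iterated

print(first_pass, second_pass, student_ids, course_ids)
```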
The alternative is to iterate just once, and do all the things with each object you want to do. So instead of running two list comprehensions, run one regular for loop and extract all the data you need in one place, appending to separate lists as you go along:
courses = []
students = []
for enrollment in enrollments:
courses.append(enrollment['course_id'])
students.append(enrollment['student_id'])
rewind in PHP is unrelated to this; Python has fileobj.seek(0) to do the same, but file objects are not generators.
import itertools
it1, it2 = itertools.tee(enrollments, 2)
This appears to come from an answer to: Why can't I iterate twice over the same data?
But it is only worthwhile if you are not going to iterate too many times.
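For completeness, a runnable Python 3 sketch of the tee approach. `fake_enrollments` is a hypothetical generator standing in for the real data source. Note that in CPython the count has to be passed positionally to itertools.tee.

```python
import itertools


def fake_enrollments():
    """Hypothetical stand-in for the one-shot iterable returned by the database."""
    yield {"student_id": 1, "course_id": "A"}
    yield {"student_id": 2, "course_id": "B"}


# tee splits one iterator into independent iterators over the same items.
first, second = itertools.tee(fake_enrollments(), 2)
student_ids = [en["student_id"] for en in first]
course_ids = [en["course_id"] for en in second]

print(student_ids, course_ids)
```

When one copy is fully consumed before the other starts, as here, tee has to buffer every item anyway, so for this access pattern it does not save memory compared with list().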

why doesn't following Python removing duplicates function work? [duplicate]

This question already has answers here:
Strange result when removing item from a list while iterating over it
(8 answers)
Closed last month.
As an experiment, I did this:
letters=['a','b','c','d','e','f','g','h','i','j','k','l']
for i in letters:
    letters.remove(i)
print letters
The last print shows that not all items were removed (every other one was).
IDLE 2.6.2
>>> ================================ RESTART ================================
>>>
['b', 'd', 'f', 'h', 'j', 'l']
>>>
What's the explanation for this? How could this be rewritten to remove every item?
Some answers explain why this happens and some explain what you should've done. I'll shamelessly put the pieces together.
What's the reason for this?
Because the Python language is designed to handle this use case differently. The documentation makes it clear:
It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.
Emphasis mine. See the linked page for more -- the documentation is copyrighted and all rights are reserved.
You could easily understand why you got what you got, but it's basically undefined behavior that can easily change with no warning from build to build. Just don't do it.
It's like wondering why i += i++ + ++i does whatever the hell it is that line does on your architecture on your specific build of your compiler for your language -- including but not limited to trashing your computer and making demons fly out of your nose :)
How could this be rewritten to remove every item?
del letters[:] (if you need to change all references to this object)
letters[:] = [] (if you need to change all references to this object)
letters = [] (if you just want to work with a new object)
Maybe you just want to remove some items based on a condition? In that case, you should iterate over a copy of the list. The easiest way to make a copy is to make a slice containing the whole list with the [:] syntax, like so:
#remove unsafe commands
commands = ["ls", "cd", "rm -rf /"]
for cmd in commands[:]:
    if "rm " in cmd:
        commands.remove(cmd)
If your check is not particularly complicated, you can (and probably should) filter instead:
commands = [cmd for cmd in commands if not is_malicious(cmd)]
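A self-contained Python 3 sketch of the filtering approach. `is_malicious` is not defined in the answer, so the version here is a hypothetical placeholder that mirrors the "rm " check used earlier. Assigning to commands[:] is an optional variation that updates the existing list object instead of rebinding the name.

```python
def is_malicious(cmd):
    """Hypothetical check, mirroring the "rm " test from the loop version."""
    return "rm " in cmd


commands = ["ls", "cd", "rm -rf /"]
alias = commands    # a second reference to the same list

# Slice assignment replaces the contents in place, so `alias` sees the change too.
commands[:] = [cmd for cmd in commands if not is_malicious(cmd)]

print(commands, alias is commands)
```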
You cannot iterate over a list and mutate it at the same time; instead, iterate over a slice:
letters=['a','b','c','d','e','f','g','h','i','j','k','l']
for i in letters[:]: # note the [:] creates a slice
    letters.remove(i)
print letters
That said, for a simple operation such as this, you should simply use:
letters = []
You cannot modify the list you are iterating, otherwise you get this weird type of result. To do this, you must iterate over a copy of the list:
for i in letters[:]:
    letters.remove(i)
It removes the first occurrence, and then moves on to the next position in the sequence. Since the sequence has changed, it takes the next odd-positioned item, and so on:
take "a"
remove "a" -> the first item is now "b"
take the next item, "c"
...
what you want to do is:
letters[:] = []
or
del letters[:]
This will preserve the original object that letters was pointing to. Other options, like letters = [], would create a new object and point letters to it; the old object would typically be garbage-collected after a while.
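The difference only shows up when something else still refers to the list. A short Python 3 sketch, where `other_ref` is a made-up second reference:

```python
# Clearing in place: every reference sees the now-empty list.
letters = ["a", "b", "c"]
other_ref = letters
letters[:] = []
cleared_in_place = (other_ref == [] and other_ref is letters)

# Rebinding: only the name `letters` changes; the old list lives on via other_ref.
letters = ["a", "b", "c"]
other_ref = letters
letters = []
rebound = (other_ref == ["a", "b", "c"] and other_ref is not letters)

print(cleared_in_place, rebound)
```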
The reason not all values were removed is that you're changing list while iterating over it.
ETA: if you want to filter values from a list you could use list comprehensions like this:
>>> letters=['a','b','c','d','e','f','g','h','i','j','k','l']
>>> [l for l in letters if ord(l) % 2]
['a', 'c', 'e', 'g', 'i', 'k']
Python probably tracks a position internally, and the removal starts at the front. The list "letters" in the second line partially has a different value than the list "letters" in the third line. On the first iteration, a is removed; on the second iteration, b has already moved to the first position, so c is the one being removed. You can try to use "while" instead.
#!/usr/bin/env python
import random
a=range(10)
while len(a):
    print a
    for i in a[:]:
        if random.random() > 0.5:
            print "removing: %d" % i
            a.remove(i)
        else:
            print "keeping: %d" % i
print "done!"
a=range(10)
while len(a):
    print a
    for i in a:
        if random.random() > 0.5:
            print "removing: %d" % i
            a.remove(i)
        else:
            print "keeping: %d" % i
print "done!"
I think this explains the problem a little better: the top block of code works, whereas the bottom one doesn't.
Some items in the bottom loop never get printed out at all, because you are modifying the list you are iterating over, which is a recipe for disaster.
OK, I'm a little late to the party here, but I've been thinking about this and after looking at Python's (CPython) implementation code, have an explanation I like. If anyone knows why it's silly or wrong, I'd appreciate hearing why.
The issue is moving through a list using an iterator, while allowing that list to change.
All the iterator is obliged to do is tell you which item in the (in this case) list comes after the current item (i.e. with the next() function).
I believe the way iterators are currently implemented, they only keep track of the index of the last element they iterated over. Looking in iterobject.c one can see what appears to be a definition of an iterator:
typedef struct {
    PyObject_HEAD
    Py_ssize_t it_index;
    PyObject *it_seq; /* Set to NULL when iterator is exhausted */
} seqiterobject;
where it_seq points to the sequence being iterated over and it_index gives the index of the last item supplied by the iterator.
When the iterator has just supplied the nth item and one deletes that item from the sequence, the correspondence between subsequent list elements and their indices changes. The former (n+1)st item becomes the nth item as far as the iterator is concerned. In other words, the iterator now thinks that what was the 'next' item in the sequence is actually the 'current' item.
So, when asked to give the next item, it will give the former (n+2)nd item (i.e. the new (n+1)st item).
As a result, for the code in question, the iterator's next() method is going to give only the n+0, n+2, n+4, ... elements from the original list. The n+1, n+3, n+5, ... items will never be exposed to the remove statement.
Although the intended activity of the code in question is clear (at least for a person), it would probably require much more introspection for an iterator to monitor changes in the sequence it iterates over and, then, to act in a 'human' fashion.
If iterators could return prior or current elements of a sequence, there might be a general work-around, but as it is, you need to iterate over a copy of the list, and be certain not to delete any items before the iterator gets to them.
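To make the mechanism concrete, here is a simplified pure-Python 3 model of such an index-based iterator. It is only a sketch, not CPython's actual implementation: the class name `IndexIterator` is made up, the attribute names are borrowed from the struct above, and in this model it_index holds the position of the next item to hand out.

```python
class IndexIterator:
    """Simplified model of an iterator that only remembers a sequence and an index."""

    def __init__(self, seq):
        self.it_seq = seq       # the sequence being iterated over
        self.it_index = 0       # position of the next item to hand out

    def __iter__(self):
        return self

    def __next__(self):
        if self.it_seq is None or self.it_index >= len(self.it_seq):
            self.it_seq = None  # mark the iterator as exhausted
            raise StopIteration
        item = self.it_seq[self.it_index]
        self.it_index += 1
        return item


letters = list("abcdefghijkl")
visited = []
for letter in IndexIterator(letters):
    visited.append(letter)
    letters.remove(letter)      # shifts every later item one position to the left

print(visited)
print(letters)
```

The model reproduces the result from the question: only every other letter is visited, and the rest remain in the list.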
Initially, i refers to a. As the loop runs, the element in the first position is removed and the element from the second position takes its place, but the pointer still moves on to the second position. This goes on for the whole loop, and that's the reason b, d, f, h, j and l are never deleted.