Number Filtration Algorithm bug

I wrote an algorithm that, given a list of integers, removes every integer except 0 and 7, then checks whether the remaining integers appear in a certain order and returns a boolean. Code below:
def spy_game(nums):
    for i in nums:
        if i != 0:
            if i == 7:
                continue
            else:
                nums.remove(i)
        else:
            continue
    stringlist = [str(o) for o in nums]
    mystring = ''.join(stringlist)
    return '007' in mystring
spy_game([1,0,2,4,0,7,5])
The problem is that if I run, for example, spy_game([1,0,2,4,0,7,5]), it does not return True even though the sequence of interest is present. When I returned the list itself after the filtering step, I found that some unwanted numbers in the middle had survived: in this example, returning nums gives [0, 4, 0, 7], although the 4 should have been removed. I am aware that there are better alternatives to this algorithm, but I just want to understand why it doesn't work. Thank you.

Instead of modifying the list, use another list to keep track of the wanted numbers.
You should not modify the list while iterating on it.
Here's a cleaned-up version:
def spy_game(nums):
    ans = []
    for i in nums:
        if i == 0 or i == 7:
            ans.append(i)
    stringlist = [str(o) for o in ans]
    mystring = ''.join(stringlist)
    return '007' in mystring

zenwraight's comment says what the problem is: in Python, you can't modify a list while iterating over it.
As for why, the Python documentation discusses this in a note on the for statement's section:
An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. … This means that if the [loop body] deletes the current … item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated).
The documentation also describes what happens when you insert an element during a loop, and suggests one possible solution (using a slice to copy the list: for i in nums[:]: ...). In your use case, that solution is likely to work fine, but it is considerably less efficient than options that don't copy the entire list.
A better solution might be to use another list comprehension:
nums = [i for i in nums if i == 0 or i == 7]
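To see the skipping described in the documentation note, here is a minimal sketch that records which elements the for loop actually visits while items are being removed. The helper name visited_while_removing is made up for this illustration and is not part of the original code.

```python
def visited_while_removing(nums):
    """Return (elements the loop visited, what is left) for the buggy pattern."""
    nums = list(nums)      # work on a copy so the caller's list is untouched
    visited = []
    for i in nums:
        visited.append(i)
        if i != 0 and i != 7:
            # Removing shifts later items one place left, but the loop's
            # internal counter still advances, so the next item is skipped.
            nums.remove(i)
    return visited, nums


visited, remaining = visited_while_removing([1, 0, 2, 4, 0, 7, 5])
print(visited)    # [1, 2, 0, 7, 5]
print(remaining)  # [0, 4, 0, 7]
```

The first 0 and the 4 never appear in visited: each one slid into the slot of an element that had just been removed, so the loop stepped over it. That is why the 4 survives.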

Related

Built in (remove) function not working with function variable

Good day, everyone. Pardon my lack of understanding, but I can't figure out why Python's built-in remove doesn't do what I want when I call it on a variable inside my function. Here is the code:
def ignoreten(h):
    ignoring = False
    for i in range (1,len(h)-2):
        if ignoring == True and h[i]==10:
            h.remove(10)
        if ignoring == False and h[i] ==10:
            ignoring = True
The basic idea is to find the first 10 in a list, keep it, continue iterating until another 10 turns up, and then remove that 10 to avoid duplication. I searched around but couldn't find a solution, which is why I'm asking here. Thank you.

The code you listed
def ignoreten(h):
    ignoring = False
    for i in range (1,len(h)-2):
        if ignoring == True and h[i]==10:
            h.remove(10)
        if ignoring == False and h[i] ==10:
            ignoring = True
Will actually do almost the exact opposite of what you want. It'll iterate over h (sort of, see [1]), and if it finds 10 twice, it'll remove the first occurrence from the list. (And, if it finds 10 three times, it'll remove the first two occurrences from the list.)
Note that list.remove will:
Remove the first item from the list whose value is equal to x. It raises a ValueError if there is no such item.
Also note that you're mutating the list you're iterating over, so there's some additional weirdness here which may be confusing you, depending on your input.
From your follow-up comment to my question, it looks like you want to remove only the second occurrence of 10, not the first and not any subsequent occurrences.
Here are a few ways:
Iterate, store index, use del
def ignoreten(h):
    index = None
    found_first = False
    for i,v in enumerate(h):
        if v == 10:
            if not found_first:
                found_first = True
            else:
                index = i
                break
    if index is not None:
        del h[index]
A little more verbose than necessary, but explicit, safe, and modifiable without much fear.
Alternatively, you could delete inside the loop but you want to make sure you immediately break:
def ignoreten(h):
    found_first = False
    for i,v in enumerate(h):
        if v == 10:
            if not found_first:
                found_first = True
            else:
                del h[i]
                break
Collect indices of 10s, remove second
def ignoreten(h):
    indices = [i for (i,v) in enumerate(h) if v == 10]
    if len(indices) > 1:
        del h[indices[1]] # The second index of 10 is at indices[1]
Clean, but it will unnecessarily iterate past the second 10 and collect as many indices of 10s as there are. Not likely a huge issue, but worth pointing out.
Collect indices of 10s, remove second (v2, from comments)
def ignoreten(h):
    indices = [i for (i,v) in enumerate(h) if v == 10]
    for i in reversed(indices[1:]):
        del h[i]
From your comment asking about removing all non-initial occurrences of 10, if you're looking for in-place modification of h, then you almost definitely want something like this.
The first line collects all the indices of 10 into a list.
The second line is a bit tricky, but working inside-out it:
[1:] "throws out" the first element of that list (since you want to keep the first occurrence of 10)
reversed iterates over that list backwards
del h[i] removes the values at those indices.
The reason we iterate backwards is because doing so won't invalidate the rest of our indices that we've yet to delete.
In other words, if the list h was [1, 10, 2, 10, 3, 10], our indices list would be [1, 3, 5].
In both cases we skip 1, fine.
But if we iterate forwards, once we delete 3, and our list shrinks to 5 elements, when we go to delete 5 we get an IndexError.
Even if we didn't go out of bounds to cause an IndexError, our elements would shift and we'd be deleting incorrect values.
So instead, we iterate backwards over our indices, delete 5, the list shrinks to 5 elements, and index 3 is still valid (and still 10).
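The forward-versus-backward argument above can be checked with a small sketch. The two helper functions are hypothetical names introduced only for this comparison, and both work on a copy so the inputs stay reusable.

```python
def delete_indices_forward(h, indices):
    """Delete by ascending index: earlier deletions shift the later targets."""
    h = list(h)
    for i in indices:
        del h[i]
    return h


def delete_indices_backward(h, indices):
    """Delete by descending index: the indices still to be used stay valid."""
    h = list(h)
    for i in reversed(indices):
        del h[i]
    return h


h = [1, 10, 2, 10, 3, 10]
indices = [3, 5]  # positions of the second and third 10

print(delete_indices_backward(h, indices))  # [1, 10, 2, 3]

try:
    delete_indices_forward(h, indices)
except IndexError as exc:
    print("forward deletion failed:", exc)

# With one more trailing element there is no error, just a wrong result:
longer = [1, 10, 2, 10, 3, 10, 4]
print(delete_indices_forward(longer, indices))   # [1, 10, 2, 3, 10]
print(delete_indices_backward(longer, indices))  # [1, 10, 2, 3, 4]
```

In the longer list the forward version silently deletes the 4 instead of the last 10, which is the "shifted elements" case and arguably the nastier failure because nothing is raised.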
With list.index
def ignoreten(h):
    try:
        second_ten = h.index(10, h.index(10)+1)
        del h[second_ten]
    except ValueError:
        pass
The inner .index call finds the first occurrence; the outer call uses the optional start parameter to begin searching just after it. The whole thing is wrapped in try/except in case there are fewer than two occurrences.
⇒ Personally, I'd prefer these in the opposite order of how they're listed.
[1] You're iterating over a weird subset of the list with your arguments to range. You're skipping (not applying your "is 10" logic to) the first and last two elements this way.
Bonus: Walrus abuse
(don't do this)
def ignoreten(h):
    x = 0
    # x counts the 10s seen so far; only the second one (x == 2) is dropped
    return [v for v in h if v != 10 or (x := x + 1) != 2]
(unlike the previous versions that operated on h in-place, this creates a new list without the second occurrence of 10)
But the walrus operator is contentious enough already, please don't let this code out in the wild. Really.

Why does cycle 'if' delete not all odd numbers from the list (only each second odd number)?

Here is the code:
list = [1, 2]
while list[-1]+list[-2] <= 4000000:
    list.append(list[-1] + list[-2])
for i in list:
    if i % 2 == 1:
        print(i)
        list.remove(i)
print(list)
print(sum(list))

You shouldn't modify a list (or any container) while iterating through it.
One way to get around it is to use another container:
in_list = [1, 2]
while in_list[-1]+in_list[-2] <= 20:
    in_list.append(in_list[-1] + in_list[-2])
print(in_list)
out_list = []
for i in in_list:
    if i % 2 != 1:
        print(i)
        out_list.append(i)
print(out_list)
print(sum(out_list))
This code uses a different approach than yours: it creates the input list, then while iterating it adds the even elements to a new, output list. This has the same effect as removing the odd elements from the input list, however, it doesn't break the iteration by modifying the input list.
As mentioned in the comments, you shouldn't use built-in names ("list") as variable names, because doing so shadows them. Also, when you develop and debug your code, it's best to stick to smaller examples. Here I use 20 instead of 4,000,000: it's much easier to track and doesn't lose the meaning.
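As for why only every second odd number disappears: in this sequence the odd numbers come in adjacent pairs (3 and 5, 13 and 21), so removing the first of a pair makes the loop skip the second. The sketch below shows this with a small limit of 40; fib_up_to is a helper name invented for the example.

```python
def fib_up_to(limit):
    """Build the 1, 2, 3, 5, ... sequence while the next term stays <= limit."""
    seq = [1, 2]
    while seq[-1] + seq[-2] <= limit:
        seq.append(seq[-1] + seq[-2])
    return seq


fibs = fib_up_to(40)
print(fibs)  # [1, 2, 3, 5, 8, 13, 21, 34]

# The buggy pattern: remove from the list being iterated over.
buggy = list(fibs)
for n in buggy:
    if n % 2 == 1:
        buggy.remove(n)
print(buggy)  # [2, 5, 8, 21, 34] -> 5 and 21 were skipped

# Building a new list avoids the problem entirely.
evens = [n for n in fibs if n % 2 == 0]
print(evens, sum(evens))  # [2, 8, 34] 44
```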

How to return a list that is made up of extracted elements from another list in python?

I am building a function to extract all negatives from a list called xs, and I need it to add those extracted numbers to another list called new_home. I have come up with code that I believe should work; however, it only returns an empty list.
Example input/output:
xs=[1,2,3,4,0,-1,-2,-3,-4] ---> new_home=[1,2,3,4,0]
Here is my code that returns an empty list:
def extract_negatives(xs):
    new_home=[]
    for num in range(len(xs)):
        if num <0:
            new_home= new_home+ xs.pop(num)
            return
    return new_home

Why not use
[v for v in xs if v >= 0]

Fixing your code:
def extract_negatives(xs):
    new_home=[]
    for num in range(len(xs)):
        if xs[num] < 0:
            new_home.append(xs[num])
    return new_home
But the Chuancong Gao solution is better:
def extract_negative(xs):
    return [v for v in xs if v >= 0]

The built-in filter function could also help. Your function is essentially:
new_home = list(filter(lambda x: x >= 0, xs))

Inside the loop of your code, the num variable doesn't hold the list's values as you expect. The loop just iterates len(xs) times and assigns the current iteration number (the index) to num.
To access the list elements in a loop, construct the loop like this:
for element in list_name:
    print(element)  # prints every element
To achieve your goal, you should do something like this:
another_list = []
for element in list_name:
    if element < 0:  # true only for negative elements
        another_list.append(element)  # collects every negative element in another_list

Fortunately (or unfortunately, depending on how you look at it) you aren't examining the numbers in the list (xs[num]); you are examining the indexes (num). This in turn is because, as a Python beginner, you probably haven't yet learned that there are typically easier ways to iterate over lists in Python.
This is a good (or bad, depending on how you look at it) thing, because had your code taken that branch you would have seen an exception when you attempted to add a number to a list with +, though I agree the way you attempted it seems natural in English. Lists have an append method to put new elements on the end, and + is reserved for concatenating two lists.
Fortunately ignorance is curable. I've recast your code a bit to show you how you might have written it:
def extract_negatives(xs):
    out_list = []
    for elmt in xs:
        if elmt < 0:
            out_list.append(elmt)
    return out_list
As @Chuancong Gao suggests with his rather terse but correct answer, a list comprehension such as the one he uses is a much better way to perform simple operations of this type.
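The answers in this thread read the question two ways: some collect the negatives, others keep what is left once the negatives are gone. A sketch that returns both lists covers either reading. The function name split_negatives is an assumption made for this example, not something from the question.

```python
def split_negatives(xs):
    """Return (negatives, everything else) without modifying xs."""
    negatives = [x for x in xs if x < 0]
    rest = [x for x in xs if x >= 0]
    return negatives, rest


xs = [1, 2, 3, 4, 0, -1, -2, -3, -4]
negatives, rest = split_negatives(xs)
print(negatives)  # [-1, -2, -3, -4]
print(rest)       # [1, 2, 3, 4, 0]
```

Because xs is never popped from or otherwise mutated, there is no index bookkeeping to get wrong.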

In this short recursive function `list_sum(aList)`, the finish condition is `if not aList: return 0`. I see no logic in why this condition works

I am learning the recursive functions. I completed an exercise, but in a different way than proposed.
"Write a recursive function which takes a list argument and returns the sum of its integers."
L = [0, 1, 2, 3, 4] # The sum of elements will be 10
My solution is:
def list_sum(aList):
    count = len(aList)
    if count == 0:
        return 0
    count -= 1
    return aList[0] + list_sum(aList[1:])
The proposed solution is:
def proposed_sum(aList):
    if not aList:
        return 0
    return aList[0] + proposed_sum(aList[1:])
My solution is very clear in how it works.
The proposed solution is shorter, but it is not clear to me why the function works. How does if not aList ever become true? I mean, not aList is a True/False check, so how does a list end up being True or False here?
I understand that return 0 causes the recursion to stop.
As a side note, executing without if not aList throws IndexError: list index out of range.
Also, timeit with 1 million runs says my function is slower: it takes 3.32 seconds while the proposed one takes 2.26. So I need to understand the proposed solution.

On the last call of the function, aList will have no elements. Python lists don't end in a null marker the way C strings do; the list simply knows its own length. Each recursive call passes aList[1:], which is a copy of the list without its first element, so the list gets one element shorter every time until nothing is left. An empty list counts as false, so not aList is True and the function returns 0. When you reach that point you know you're done.
If you don't use that condition, the function will try to take aList[0] from an empty list, so it throws that error.

You are counting the items in the list, and the proposed solution checks whether it's empty with if not aList. For a list this is equivalent to len(aList) == 0, so both of you use the same logic.
But your count -= 1 does nothing useful: the recursion already passes the list minus one element, and count is never read again, so you lose some time there.
According to PEP 8, this is the proper way:
• For sequences, (strings, lists, tuples), use the fact that empty sequences are false.
Yes: if not seq:
     if seq:
No:  if len(seq)
     if not len(seq)
Here are my amateur thoughts about why:
The implicit check is usually faster than calling len, because len(aList) == 0 needs a name lookup, a function call and a comparison, while not aList lets the interpreter ask the list for its length directly.
Both end up finding that the list has no items, but one does it more directly.

not aList
returns True if there are no elements in aList. That if statement in the solution covers the edge case: it checks whether the input parameter is an empty list.

To understand this function, let's run it step by step (L is the list passed to each call):
step 0:
L = [0,1,2,3,4]
proposed_sum([0,1,2,3,4])
L != []
return L[0] + proposed_sum([1,2,3,4])
step 1, compute proposed_sum([1,2,3,4]):
proposed_sum([1,2,3,4])
L != []
return L[0] + proposed_sum([2,3,4])
step 2, compute proposed_sum([2,3,4]):
proposed_sum([2,3,4])
L != []
return L[0] + proposed_sum([3,4])
step 3, compute proposed_sum([3,4]):
proposed_sum([3,4])
L != []
return L[0] + proposed_sum([4])
step 4, compute proposed_sum([4]):
proposed_sum([4])
L != []
return L[0] + proposed_sum([])
step 5, compute proposed_sum([]):
proposed_sum([])
L == []
return 0
step 6, substitute the results back, from the innermost call outwards:
proposed_sum([]) = 0
proposed_sum([4]) = 4 + 0 = 4
proposed_sum([3,4]) = 3 + 4 = 7
proposed_sum([2,3,4]) = 2 + 7 = 9
proposed_sum([1,2,3,4]) = 1 + 9 = 10
proposed_sum([0,1,2,3,4]) = 0 + 10 = 10
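The same walk-through can be produced by the function itself. This is a sketch only: traced_sum and its depth parameter are additions for illustration, and apart from the print calls it follows the logic of proposed_sum.

```python
def traced_sum(a_list, depth=0):
    """Recursive sum that prints each call and each returned value."""
    indent = "  " * depth
    print(f"{indent}traced_sum({a_list})")
    if not a_list:  # an empty list is falsy, so this is the base case
        print(f"{indent}-> 0 (empty list, recursion stops)")
        return 0
    result = a_list[0] + traced_sum(a_list[1:], depth + 1)
    print(f"{indent}-> {result}")
    return result


total = traced_sum([0, 1, 2, 3, 4])
print(total)  # 10
```

Each level of indentation is one recursive call; the "->" lines appear in reverse order as the calls return.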

Python treats several values as false:
False (of course)
0
None
empty collections (dictionaries, lists, tuples)
empty strings ('', "", '''''', """""", r'', u"", etc...)
any other object whose __bool__ method (__nonzero__ in Python 2) returns False, or whose __len__ method returns 0
In your case, the list is evaluated as a boolean. If it is empty, it is considered False; otherwise it is considered True. So if not aList: is just a shorter way to write if len(aList) == 0:
In addition, concerning your new question in the comments, consider the last line of your function:
return aList[0] + proposed_sum(aList[1:])
This line calls a new "instance" of the function, but with a subset of the original list (the original list minus the first element). At each recursion, the list passed as the argument loses an element, and after a certain number of recursions the passed list is empty.
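The truthiness rules listed above are easy to check with bool(). In this sketch the Basket class is a made-up example of a user-defined container; it defines only __len__, which Python falls back on when a class has no __bool__.

```python
class Basket:
    """Hypothetical container: truthiness falls back to __len__."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


falsy_values = [False, 0, None, [], (), {}, "", Basket([])]
print([bool(v) for v in falsy_values])  # eight times False

# Non-empty containers are truthy, even if their contents are falsy.
print(bool([0]), bool(Basket([1])))  # True True

a_list = []
print(not a_list, len(a_list) == 0)  # True True
```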

Remove items from a list while iterating without using extra memory in Python

My problem is simple: I have a long list of elements that I want to iterate through and check every element against a condition. Depending on the outcome of the condition I would like to delete the current element of the list, and continue iterating over it as usual.
I have read a few other threads on this matter. Two solutions seem to be proposed: either make a dictionary out of the list (which implies copying all the data that is already filling all my RAM), or walk the list in reverse (which breaks the concept of the algorithm I want to implement).
Is there a better or more elegant way to do it than this?
def walk_list(list_of_g):
    g_index = 0
    while g_index < len(list_of_g):
        g_current = list_of_g[g_index]
        if subtle_condition(g_current):
            list_of_g.pop(g_index)
        else:
            g_index = g_index + 1

li = [ x for x in li if condition(x)]
and also
li = list(filter(condition, li))
Thanks to Dave Kirby

Here is an alternative answer for if you absolutely have to remove the items from the original list, and you do not have enough memory to make a copy - move the items down the list yourself:
def walk_list(list_of_g):
    to_idx = 0
    for g_current in list_of_g:
        if not subtle_condition(g_current):
            list_of_g[to_idx] = g_current
            to_idx += 1
    del list_of_g[to_idx:]
This will move each item (actually a pointer to each item) exactly once, so will be O(N). The del statement at the end of the function will remove any unwanted items at the end of the list, and I think Python is intelligent enough to resize the list without allocating memory for a new copy of the list.
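Here is a runnable sketch of the same compaction idea with the condition passed in as a parameter. The names remove_in_place and should_remove are assumptions for this example; subtle_condition from the question is not defined anywhere in the thread.

```python
def remove_in_place(items, should_remove):
    """Drop every item for which should_remove(item) is true, in place."""
    write = 0
    for item in items:
        if not should_remove(item):
            # write never overtakes the read position, so no unread
            # item is overwritten and the list length is stable here.
            items[write] = item
            write += 1
    del items[write:]  # cut off the leftover tail in one operation


data = list(range(10))
alias = data  # a second reference to the same list object
remove_in_place(data, lambda n: n % 3 == 0)
print(data)           # [1, 2, 4, 5, 7, 8]
print(alias is data)  # True
```

Because the list object is mutated rather than rebound, every other reference to it (alias here) sees the filtered contents too.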

Removing items from a list is expensive, since Python has to copy all the items above g_index down one place. If the number of items you want to remove is proportional to the length of the list N, then your algorithm is going to be O(N**2). If the list is long enough to fill your RAM then you will be waiting a very long time for it to complete.
It is more efficient to create a filtered copy of the list, either using a list comprehension as Marcelo showed, or using the filter function (itertools.ifilter in Python 2):
g_list = list(filter(not_subtle_condition, g_list))
If you do not need to use the new list and only want to iterate over it once, then it is better to loop over the filter object directly (ifilter in Python 2), since that will not create a second list:
for g_current in filter(not_subtle_condition, g_list):
    pass  # do stuff with g_current

The built-in filter function is made just to do this:
list_of_g = list(filter(lambda x: not subtle_condition(x), list_of_g))

How about this?
[x for x in list_of_g if not subtle_condition(x)]
It returns a new list that leaves out the elements matching subtle_condition.

For simplicity, use a list comprehension:
def walk_list(list_of_g):
    return [g for g in list_of_g if not subtle_condition(g)]
Of course, this doesn't alter the original list, so the calling code would have to be different.
If you really want to mutate the list (rarely the best choice), walking backwards is simpler:
def walk_list(list_of_g):
    # start at the last valid index, len - 1; starting at len raises IndexError
    for i in range(len(list_of_g) - 1, -1, -1):
        if subtle_condition(list_of_g[i]):
            del list_of_g[i]

Sounds like a really good use case for the filter function.
def should_be_removed(element):
    return element > 5
a = list(range(10))
# filter keeps the items for which the function is true, so negate the test
a = list(filter(lambda x: not should_be_removed(x), a))
This, however, does not delete from the list while iterating (nor do I recommend doing so). If for memory-space (or other performance) reasons you really need that, you can do the following:
i = 0
while i < len(a):
    if should_be_removed(a[i]):
        del a[i]  # delete by position; a.remove(a[i]) would search for the value from the start
    else:
        i += 1
print(a)

If you perform a reverse iteration, you can remove elements on the fly without affecting the next indices you'll visit:
numbers = list(range(20))
# remove all numbers that are multiples of 3
l = len(numbers)
for i, n in enumerate(reversed(numbers)):
    if n % 3 == 0:
        del numbers[l - i - 1]
print(numbers)
The enumerate(reversed(numbers)) is just a stylistic choice. You may use a range if that's more legible to you:
l = len(numbers)
for i in range(l-1, -1, -1):
    n = numbers[i]
    if n % 3 == 0:
        del numbers[i]
If you need to travel the list in order, you can reverse it in place with .reverse() before and after the reversed iteration. This won't duplicate your list either.
