for-in loop's upper limit changing in each loop - python

How can I update the upper limit of a loop in each iteration? In the following code, List is shortened in each loop. However, the lenList in the for, in loop is not, even though I defined lenList as global. Any ideas how to solve this? (I'm using Python 2.sthg)
Thanks!
def similarity(List):
import difflib
lenList = len(List)
for i in range(1,lenList):
import numpy as np
global lenList
a = List[i]
idx = [difflib.SequenceMatcher(None, a, x).ratio() for x in List]
z = idx > .9
del List[z]
lenList = len(List)
X = ['jim','jimmy','luke','john','jake','matt','steve','tj','pat','chad','don']
similarity(X)

Looping over indices is bad practice in python. You may be able to accomplish what you want like this though (edited for comments):
def similarity(alist):
position = 0
while position < len(alist):
item = alist[position]
position += 1
# code here that modifies alist
A list will evaluate True if it has any entries, or False when it is empty. In this way you can consume a list that may grow during the manipulation of its items.
Additionally, if you absolutely have to have indices, you can get those as well:
for idx, item in enumerate(alist):
# code here, where items are actual list entries, and
# idx is the 0-based index of the item in the list.
In ... 3.x (I believe) you can even pass an optional parameter to enumerate to control the starting value of idx.

The issue here is that range() is only evaluated once at the start of the loop and produces a range generator (or list in 2.x) at that time. You can't then change the range. Not to mention that numbers and immutable, so you are assigning a new value to lenList, but that wouldn't affect any uses of it.
The best solution is to change the way your algorithm works not to rely on this behaviour.

The range is an object which is constructed before the first iteration of your loop, so you are iterating over the values in that object. You would instead need to use a while loop, although as Lattyware and g.d.d.c point out, it would not be very Pythonic.

What you are effectively looping on in the above code is a list which got generated in the first iteration itself.
You could have as well written the above as
li = range(1,lenList)
for i in li:
... your code ...
Changing lenList after li has been created has no effect on li

This problem will become quite a lot easier with one small modification to how your function works: instead of removing similar items from the existing list, create and return a new one with those items omitted.
For the specific case of just removing similarities to the first item, this simplifies down quite a bit, and removes the need to involve Numpy's fancy indexing (which you weren't actually using anyway, because of a missing call to np.array):
import difflib
def similarity(lst):
a = lst[0]
return [a] + \
[x for x in lst[1:] if difflib.SequenceMatcher(None, a, x).ratio() > .9]
From this basis, repeating it for every item in the list can be done recursively - you need to pass the list comprehension at the end back into similarity, and deal with receiving an empty list:
def similarity(lst):
if not lst:
return []
a = lst[0]
return [a] + similarity(
[x for x in lst[1:] if difflib.SequenceMatcher(None, a, x).ratio() > .9])
Also note that importing inside a function, and naming a variable list (shadowing the built-in list) are both practices worth avoiding, since they can make your code harder to follow.

Related

How to return a list that is made up of extracted elements from another list in python?

I am building a function to extract all negatives from a list called xs and I need it to add those extracted numbers into another list called new_home. I have come up with a code that I believe should work, however; it is only showing an empty list.
Example input/output:
xs=[1,2,3,4,0,-1,-2,-3,-4] ---> new_home=[1,2,3,4,0]
Here is my code that returns an empty list:
def extract_negatives(xs):
new_home=[]
for num in range(len(xs)):
if num <0:
new_home= new_home+ xs.pop(num)
return
return new_home
Why not use
[v for v in xs if v >= 0]
def extract_negatives(xs):
new_home=[]
for num in range(len(xs)):
if xs[num] < 0:
new_home.append(xs[num])
return new_home
for your code
But the Chuancong Gao solution is better:
def extract_negative(xs):
return [v for v in xs if v >= 0]
helper function filter could also help. Your function actually is
new_home = filter(lambda x: x>=0, xs)
Inside the loop of your code, the num variable doesn't really store the value of the list as you expect. The loop just iterates for len(xs) times and passes the current iteration number to num variable.
To access the list elements using loop, you should construct loop in a different fashion like this:
for element in list_name:
print element #prints all element.
To achieve your goal, you should do something like this:
another_list=[]
for element in list_name:
if(element<0): #only works for elements less than zero
another_list.append(element) #appends all negative element to another_list
Fortunately (or unfortunately, depending on how you look at it) you aren't examining the numbers in the list (xs[num]), you are examining the indexes (num). This in turn is because as a Python beginner you probably nobody haven't yet learned that there are typically easier ways to iterate over lists in Python.
This is a good (or bad, depending on how you look at it) thing, because had your code taken that branch you would have seen an exception occurring when you attempted to add a number to a list - though I agree the way you attempt it seems natural in English. Lists have an append method to put new elements o the end, and + is reserved for adding two lists together.
Fortunately ignorance is curable. I've recast your code a bit to show you how you might have written it:
def extract_negatives(xs):
out_list = []
for elmt in xs:
if elmt < 0:
out_list.append(elmt)
return out_list
As #ChuangongGoa suggests with his rather terse but correct answer, a list comprehension such as he uses is a much better way to perform simple operations of this type.

Can Python slicing be used to skip one specific element by index?

Suppose I am writing a recursive function where I want to pass a list to a function with a single element missing, as part of a loop. Here is one possible solution:
def Foo(input):
if len(input) == 0: return
for j in input:
t = input[:]
t.remove(j)
Foo(t)
Is there a way to abuse the slice operator to pass the list minus the element j without explicitly copying the list and removing the item from it?
What about this?
for i in range(len(list_)):
Foo(list_[:i] + list_[i+1:])
You are stilling copying items, though you ignore the element at index i while copying.
BTW, you can always try to avoid overriding built-in names like list by appending underscores.
If your lists are small, I recommend using the approach in the answer from #satoru.
If your lists are very large, and you want to avoid the "churn" of creating and deleting list instances, how about using a generator?
import itertools as it
def skip_i(seq, i):
return it.chain(it.islice(seq, 0, i), it.islice(seq, i+1, None))
This pushes the work of skipping the i'th element down into the C guts of itertools, so this should be faster than writing the equivalent in pure Python.
To do it in pure Python I would suggest writing a generator like this:
def gen_skip_i(seq, i):
for j, x in enumerate(seq):
if i != j:
yield x
EDIT: Here's an improved version of my answer, thanks to #Blckknght in comments below.
import itertools as it
def skip_i(iterable, i):
itr = iter(iterable)
return it.chain(it.islice(itr, 0, i), it.islice(itr, 1, None))
This is a big improvement over my original answer. My original answer only worked properly on indexable things like lists, but this will work correctly for any iterable, including iterators! It makes an explicit iterator from the iterable, then (in a "chain") pulls the first i values, and skips only a single value before pulling all remaining values.
Thank you very much #Blckknght!
Here is a code equivalent to satoru's, but is faster, as it makes one copy of the list per iteration instead of two:
before = []
after = list_[:]
for x in range(0, len(list_)):
v = after.pop(0)
Foo(before + after)
before.append(v)
(11ms instead of 18ms on my computer, for a list generated with list(range(1000)))

Modifying specific array elements in Python 3.1

I have two arrays: array and least_common (filter array)
The following code iterates through array, checks for elements that match least_common, and if finds it, modifies it and appends it to a new array.
for i in range (len(array)):
for j in range(len(least_common)):
if array[i] is least_common[j][0]:
new_array.append ((array[i]) + (array[i] * (mod[1]/100)))
However, if the element in array does not match any of the elements in least_common I wan't to append it to new_array, then iterate to the next element in array to begin the checking process again.
This code is a bit wonky to me -- I think you want to start with something more like:
lookup = set([x[0] for x in least_common])
new_array = []
for elem in array:
if elem in lookup:
new_array.append(elem + (elem * (mod[1]/100)))
else:
new_array.append(elem)
In Python, what you are trying to do is done using lists. There is another separate data type called arrays, but that is for totally different purpose. Please don't confuse and use the proper terminology, list.
Lists can be iterated through. You need to not index the elements out of the list and then access them using the index. That is C or C++ way of doing things and not python.
You use a list or a dictionary called mod in your original code. It is a bad idea to override builtin names. I tried to understand what you are trying, came up with the following code. Take it further, but before that, I think some beginner tutorials might help you as well.
new_array = []
somevalue = 0.001
for elem in array:
for anotherelem in least_common:
if elem == anotherelem[0]:
new_array.append(elem + (elem * somevalue))
Keep track of whether you found a match using a boolean, which you set to False before each inner loop and set to True within your if. After each iteration, if it's still False it means you found no matches and should then do your appending.
You should also follow what #Andrew says and iterate over lists using for a in array:. If you need the index, use for i, a in enumerate(array):. And be aware that is is not the same as ==.
new_array = []
for array_item in array:
found = False
for least_common_item in least_common:
if array_item is least_common_item:
found = True
if not found:
new_array.append (array_item * (1 + mod[1]/100))
You can also greatly shorten this code using in if you meant to use == instead of is:
for array_item in array:
if array_item not in least_common:
new_array.append (array_item * (1 + mod[1]/100))
Why not this:
least_common_set = frozenset(x[0] for x in least_common)
for e in array:
if e is not in least_common_set:
new_array.append(e + (e * (mod[1]/100)))
If I understand correctly your problem, here is a possible solution:
for e in array:
for lc in least_common:
if e is lc[0]:
new_array.append(e + e * (md[1] / 100))
break
else:
new_array.append(e)
The else clause in the for loop is executed when the loop terminates through exhaustion of the list, but not when the loop is terminated by a break statement.
Note that there is no need to use range or len, in Python you can just iterate on the elements of a sequence without involving indexes - you may use enumerate for that, but in this case you don't need to. Also, please don't use built-in names such as mod for your variables: here, I have renamed your dictionary md.

Remove items from a list while iterating without using extra memory in Python

My problem is simple: I have a long list of elements that I want to iterate through and check every element against a condition. Depending on the outcome of the condition I would like to delete the current element of the list, and continue iterating over it as usual.
I have read a few other threads on this matter. Two solutions seam to be proposed. Either make a dictionary out of the list (which implies making a copy of all the data that is already filling all the RAM in my case). Either walk the list in reverse (which breaks the concept of the alogrithm I want to implement).
Is there any better or more elegant way than this to do it ?
def walk_list(list_of_g):
g_index = 0
while g_index < len(list_of_g):
g_current = list_of_g[g_index]
if subtle_condition(g_current):
list_of_g.pop(g_index)
else:
g_index = g_index + 1
li = [ x for x in li if condition(x)]
and also
li = filter(condition,li)
Thanks to Dave Kirby
Here is an alternative answer for if you absolutely have to remove the items from the original list, and you do not have enough memory to make a copy - move the items down the list yourself:
def walk_list(list_of_g):
to_idx = 0
for g_current in list_of_g:
if not subtle_condition(g_current):
list_of_g[to_idx] = g_current
to_idx += 1
del list_of_g[to_idx:]
This will move each item (actually a pointer to each item) exactly once, so will be O(N). The del statement at the end of the function will remove any unwanted items at the end of the list, and I think Python is intelligent enough to resize the list without allocating memory for a new copy of the list.
removing items from a list is expensive, since python has to copy all the items above g_index down one place. If the number of items you want to remove is proportional to the length of the list N, then your algorithm is going to be O(N**2). If the list is long enough to fill your RAM then you will be waiting a very long time for it to complete.
It is more efficient to create a filtered copy of the list, either using a list comprehension as Marcelo showed, or use the filter or itertools.ifilter functions:
g_list = filter(not_subtle_condition, g_list)
If you do not need to use the new list and only want to iterate over it once, then it is better to use ifilter since that will not create a second list:
for g_current in itertools.ifilter(not_subtle_condtion, g_list):
# do stuff with g_current
The built-in filter function is made just to do this:
list_of_g = filter(lambda x: not subtle_condition(x), list_of_g)
How about this?
[x for x in list_of_g if not subtle_condition(x)]
its return the new list with exception from subtle_condition
For simplicity, use a list comprehension:
def walk_list(list_of_g):
return [g for g in list_of_g if not subtle_condition(g)]
Of course, this doesn't alter the original list, so the calling code would have to be different.
If you really want to mutate the list (rarely the best choice), walking backwards is simpler:
def walk_list(list_of_g):
for i in xrange(len(list_of_g), -1, -1):
if subtle_condition(list_of_g[i]):
del list_of_g[i]
Sounds like a really good use case for the filter function.
def should_be_removed(element):
return element > 5
a = range(10)
a = filter(should_be_removed, a)
This, however, will not delete the list while iterating (nor I recommend it). If for memory-space (or other performance reasons) you really need it, you can do the following:
i = 0
while i < len(a):
if should_be_removed(a[i]):
a.remove(a[i])
else:
i+=1
print a
If you perform a reverse iteration, you can remove elements on the fly without affecting the next indices you'll visit:
numbers = range(20)
# remove all numbers that are multiples of 3
l = len(numbers)
for i, n in enumerate(reversed(numbers)):
if n % 3 == 0:
del numbers[l - i - 1]
print numbers
The enumerate(reversed(numbers)) is just a stylistic choice. You may use a range if that's more legible to you:
l = len(numbers)
for i in range(l-1, -1, -1):
n = numbers[i]
if n % 3 == 0:
del numbers[i]
If you need to travel the list in order, you can reverse it in place with .reverse() before and after the reversed iteration. This won't duplicate your list either.

python : list index out of range error while iteratively popping elements

I have written a simple python program
l=[1,2,3,0,0,1]
for i in range(0,len(l)):
if l[i]==0:
l.pop(i)
This gives me error 'list index out of range' on line if l[i]==0:
After debugging I could figure out that i is getting incremented and list is getting reduced.
However, I have loop termination condition i < len(l). Then why I am getting such error?
You are reducing the length of your list l as you iterate over it, so as you approach the end of your indices in the range statement, some of those indices are no longer valid.
It looks like what you want to do is:
l = [x for x in l if x != 0]
which will return a copy of l without any of the elements that were zero (that operation is called a list comprehension, by the way). You could even shorten that last part to just if x, since non-zero numbers evaluate to True.
There is no such thing as a loop termination condition of i < len(l), in the way you've written the code, because len(l) is precalculated before the loop, not re-evaluated on each iteration. You could write it in such a way, however:
i = 0
while i < len(l):
if l[i] == 0:
l.pop(i)
else:
i += 1
The expression len(l) is evaluated only one time, at the moment the range() builtin is evaluated. The range object constructed at that time does not change; it can't possibly know anything about the object l.
P.S. l is a lousy name for a value! It looks like the numeral 1, or the capital letter I.
You're changing the size of the list while iterating over it, which is probably not what you want and is the cause of your error.
Edit: As others have answered and commented, list comprehensions are better as a first choice and especially so in response to this question. I offered this as an alternative for that reason, and while not the best answer, it still solves the problem.
So on that note, you could also use filter, which allows you to call a function to evaluate the items in the list you don't want.
Example:
>>> l = [1,2,3,0,0,1]
>>> filter(lambda x: x > 0, l)
[1, 2, 3]
Live and learn. Simple is better, except when you need things to be complex.
What Mark Rushakoff said is true, but if you iterate in the opposite direction, it is possible to remove elements from the list in the for-loop as well. E.g.,
x = [1,2,3,0,0,1]
for i in range(len(x)-1, -1, -1):
if x[i] == 0:
x.pop(i)
It's like a tall building that falls from top to bottom: even if it is in the middle of collapse, you still can "enter" into it and visit yet-to-be-collapsed floors.
I think the best way to solve this problem is:
l = [1, 2, 3, 0, 0, 1]
while 0 in l:
l.remove(0)
Instead of iterating over list I remove 0 until there aren't any 0 in list
List comprehension will lead you to a solution.
But the right way to copy a object in python is using python module copy - Shallow and deep copy operations.
l=[1,2,3,0,0,1]
for i in range(0,len(l)):
if l[i]==0:
l.pop(i)
If instead of this,
import copy
l=[1,2,3,0,0,1]
duplicate_l = copy.copy(l)
for i in range(0,len(l)):
if l[i]==0:
m.remove(i)
l = m
Then, your own code would have worked.
But for optimization, list comprehension is a good solution.
The problem was that you attempted to modify the list you were referencing within the loop that used the list len(). When you remove the item from the list, then the new len() is calculated on the next loop.
For example, after the first run, when you removed (i) using l.pop(i), that happened successfully but on the next loop the length of the list has changed so all index numbers have been shifted. To a certain point the loop attempts to run over a shorted list throwing the error.
Doing this outside the loop works, however it would be better to build and new list by first declaring and empty list before the loop, and later within the loop append everything you want to keep to the new list.
For those of you who may have come to the same problem.
I am using python 3.3.5. The above solution of using while loop did not work for me. Even if i put print (i) after len(l) it gave me an error. I ran the same code in command line (shell)[ window that pops up when we run a function] it runs without error. What i did was calculated len(l) outside the function in main program and passed the length as a parameter. It worked. Python is weird sometimes.
I think most solutions talk here about List Comprehension, but if you'd like to perform in place deletion and keep the space complexity to O(1); The solution is:
i = 0
for j in range(len(arr)):
if (arr[j] != 0):
arr[i] = arr[j]
i +=1
arr = arr[:i]
x=[]
x = [int(i) for i in input().split()]
i = 0
while i < len(x):
print(x[i])
if(x[i]%5)==0:
del x[i]
else:
i += 1
print(*x)
Code:
while True:
n += 1
try:
DATA[n]['message']['text']
except:
key = DATA[n-1]['message']['text']
break
Console :
Traceback (most recent call last):
File "botnet.py", line 82, in <module>
key =DATA[n-1]['message']['text']
IndexError: list index out of range
I recently had a similar problem and I found that I need to decrease the list index by one.
So instead of:
if l[i]==0:
You can try:
if l[i-1]==0:
Because the list indices start at 0 and your range will go just one above that.

Categories