(Unintentionally) skipping items when iterating over a list - python

I have a list and I want to remove from it the items that don't appear in another list. I've tried the following:
for w in common:
    for i in range(1,n):
        if not w in words[i]:
            common.remove(w)
However, this fails to remove some of the items. Adding print statements
for w in common:
    for i in range(1,n):
        print w
        if not w in words[i]:
            print w
            common.remove(w)
results in some w never being printed. Any ideas as to what's happening? I assume the answer's simple and I just don't have adequate Python knowledge, but I'm completely out of ideas.

I think you can simplify your statement with something like this:
filtered = filter(lambda x: x in words, common)
That's checking each element in common for its presence in words and keeping only the ones that are present. You may need to try x not in words depending on what your desired result is, but I think that should come close.
I wanted to add one other approach, that might also come close, though I would need to see examples of your initial lists to test it fully.
filtered = [x for x in common if x in words]
-- EDITED -- I had the syntax in the list comprehension backwards, but caught it after I saw the comment. Thanks!
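Note that the question indexes words as words[i], so it looks like a list of lists rather than a flat list. A minimal sketch under that assumption (the sample data is made up for illustration, and the code uses Python 3 syntax) keeps a word only if it appears in every one of words[1] through words[n-1]:

```python
# Hypothetical sample data: words is assumed to be a list of word lists.
words = [
    ["a", "b", "c", "d"],
    ["a", "b", "d"],
    ["b", "d", "e"],
]
n = len(words)
common = list(words[0])

# Build a new list instead of removing from the one being iterated over.
common = [w for w in common if all(w in words[i] for i in range(1, n))]
print(common)  # ['b', 'd']
```

Because nothing is removed during iteration, no item can be skipped.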

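You shouldn't delete items from the list you're iterating over: the loop keeps track of its position by index, so removing an item shifts the later ones and one of them gets skipped. Try iterating over a copy of the list instead.
for w in common[:]:
    for i in range(1,n):
        if not w in words[i]:
            common.remove(w)
            break  # stop checking w once it has been removed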

From the Python docs:
It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.

You are modifying the list while trying to iterate through it.
You could modify the first line of the code to iterate through a copy of the list (using common[:]).

If you delete (say) item 5, then the old item 6 becomes item 5. The loop then moves on to index 6, so the element that used to be item 6 is never visited.
Is it possible to iterate backwards over that list? Then the index changes only affect the part you have already processed.
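To make the skipping visible, here is a small sketch with made-up data (Python 3 syntax). The first loop removes while iterating forwards and misses an item; the second walks the indexes backwards, as suggested above, and removes all of them:

```python
# Forwards: after a removal, the next element slides into the freed slot
# and the loop index moves past it without examining it.
forward = [1, 2, 2, 3, 2]
for x in forward:
    if x == 2:
        forward.remove(x)
print(forward)  # [1, 3, 2] -- one 2 survived

# Backwards by index: deletions only shift elements already processed.
backward = [1, 2, 2, 3, 2]
for i in range(len(backward) - 1, -1, -1):
    if backward[i] == 2:
        del backward[i]
print(backward)  # [1, 3]
```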

Related

Reduce a list in a specific way

I have a list of strings which looks like this:
['(num1, num2):1', '(num3, num4):1', '(num5, num6):1', '(num7, num8):1']
What I try to achieve is to reduce this list and combine every two elements and I want to do this until there is only one big string element left.
So the intermediate list would look like this:
['((num1, num2):1,(num3, num4):1)', '((num5, num6):1,(num7, num8):1)']
The complicated thing is (as you can see in the intermediate list) that each pair of strings needs to be wrapped in parentheses. So for the above mentioned starting point the final result should look like this:
(((num_1,num_2):1,(num_3,num_4):1),((num_5,num_6):1,(num_7,num_8):1))
Of course this should also work in a generic way for 8, 16 or more string elements in the starting list. Or to be more precise, it should work for lists of length a_n = 2^(n+1).
Just to be very specific how the result should look with 8 elements:
'((((num_1,num_2):1,(num_3,num_4):1),((num_5,num_6):1,(num_7,num_8):1)),(((num_9,num_10):1,(num_11,num_12):1),((num_13,num_14):1,(num_15,num_16):1)))'
I already solved the problem using nested for loops but I thought there should be a more functional or short-cut solution.
I also found this solution on stackoverflow:
import itertools as it
l = [map( ",".join ,list(it.combinations(my_list, l))) for l in range(1,len(my_list)+1)]
Although the join isn't bad, I still need the parentheses. I tried to use:
"{},{}".format
instead of .join, but this seems to be too easy to work :).
I also thought of using reduce, but apparently this is not the right function. Maybe one can implement a custom reduce function?
I hope some advanced Pythonistas can help me.
Sounds like a job for the zip clustering idiom: zip(*[iter(x)]*n) where you want to break iterable x into size n chunks. This will discard "leftover" elements that don't make up a full chunk. For x=[1, 2, 3], n=2 this would yield (1, 2)
def reducer(l):
    while len(l) > 1:
        l = ['({},{})'.format(x, y) for x, y in zip(*[iter(l)]*2)]
    return l
reducer(['(num1, num2):1', '(num3, num4):1', '(num5, num6):1', '(num7, num8):1'])
# ['(((num1, num2):1,(num3, num4):1),((num5, num6):1,(num7, num8):1))']
This is an explanation of what is happening in zip(*[iter(l)]*2):
[iter(l)]*2 creates a list of length 2 that holds the same iterator twice, or to be more precise, two references to the same iter-object.
zip(*...) does the extracting. It pulls:
Loop
the first element from the first reference of the iter-object
the second element from the second reference of the iter-object
Loop
the third element from the first reference of the iter-object
the fourth element from the second reference of the iter object
Loop
the fifth element from the first reference of the iter-object
the sixth element from the second reference of the iter-object
and so on...
Therefore we have the extracted elements available in the for-loop and can use them as x and y for further processing.
This is really handy.
I also want to point to this thread since it helped me to understand the concept.
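A short sketch of the shared-iterator behaviour described above, using made-up sample data (Python 3 syntax, where zip is lazy and is wrapped in list() to show the result):

```python
letters = ["a", "b", "c", "d", "e"]

it = iter(letters)
shared = [it] * 2            # two references to the same iterator
print(shared[0] is shared[1])  # True

# Each tuple takes one item per reference, so consecutive items pair up.
pairs = list(zip(*shared))
print(pairs)  # [('a', 'b'), ('c', 'd')] -- the leftover 'e' is dropped

# With two independent iterators, each item is paired with itself instead.
independent = list(zip(iter(letters), iter(letters)))
print(independent[0])  # ('a', 'a')
```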

Getting the indexes of an element in a subset

I have a list and a subset of it, and want to find the index of each element of the subset in the full list. I have currently tried this code:
def convert_toindex(listof_elements, listof_indices):
    for i in range(len(listof_elements)):
        listof_elements[:] = [listof_indices.index(x) for x in listof_elements]
    return listof_elements
list1 = ['lol', 'please', 'help']
list2 = ['help', 'lol', 'please', 'extra']
What I want to happen when I do convert_toindex(list1, list2) is the output to be [2, 0, 1]
However, when I do this I get a ValueError: '0' is not in list.
0, however, appears nowhere in either list so I am not sure why this is happening.
Secondly, if I have a list of lists, and I want to apply this process to all the nested lists inside the big list, would I do something like this?
for smalllist in biglist:
    smalllist[:] = [dict_of_indices[x] for x in smalllist]
Where dict_of_indices is the dictionary of indices created following the top answer.
The problem is that, instead of doing this once, you're doing it over and over, N times:
for i in range(len(listof_elements)):
    listof_elements[:] = [listof_indices.index(x) for x in listof_elements]
The first time through, you replace every value in listof_elements with its index in listof_indices. So far, so good. In fact, you should be done there.
But then you do it a second time. You look up each of those indices, as if they were values, in listof_indices. And some of them aren't there. So you get an error.
You can solve this just by removing the outer loop. You're already done after the first time.
You may be confused because this problem seems to inherently require two loops—but you already do have two loops. The first is the obvious one in the list comprehension, and the second one is the one hidden inside listof_indices.index.
While we're at it: while this problem does require two loops, it doesn't require them to be nested.
Instead of looping over listof_indices to find each x, you can loop over it in advance to build a dictionary:
dict_of_indices = {value: index for index, value in enumerate(listof_indices)}
And then just do a direct lookup in that dictionary:
listof_elements[:] = [dict_of_indices[x] for x in listof_elements]
Besides being a whole lot faster (O(N+M) time rather than O(N*M)), I think this might also be easier to understand and to debug. The first line may be a bit tricky, but you can easily print out the dict and verify that it's correct. And then the second line is about as trivial as you can get.
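Put together, the dictionary approach might look like the sketch below (Python 3 syntax). The function name convert_to_index and its parameter names are made up for illustration, and it returns a new list rather than modifying its argument. For the sample lists exactly as written in the question, it gives the position of each list1 item within list2, which is [1, 2, 0]:

```python
def convert_to_index(elements, reference):
    """Return the position in `reference` of each item in `elements`."""
    index_of = {value: index for index, value in enumerate(reference)}
    return [index_of[x] for x in elements]

list1 = ['lol', 'please', 'help']
list2 = ['help', 'lol', 'please', 'extra']
print(convert_to_index(list1, list2))  # [1, 2, 0]

# The same lookup applied to every nested list of a bigger list.
biglist = [['help', 'extra'], ['please']]
print([convert_to_index(small, list2) for small in biglist])  # [[0, 3], [2]]
```

One difference from list.index: if reference contains duplicates, the dictionary remembers the last position of a value, whereas index returns the first. An item missing from reference raises KeyError here instead of ValueError.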

Delete rows in matrix containing certain elements (python)

The problem I have might be trivial for a more advanced Python programmer, but as a Python beginner I can't figure it out.
I just want to delete a row from a 2D list if it matches a certain condition; in my case, if the row contains a certain element. I wanted to do it in a more functional, Pythonic way, rather than looping over all list items. Therefore, my attempt was
alist = [[1,2],[3,4]]
map(lambda ele : (if 2 in ele: tmp3.remove(ele)), alist)
which should just delete the first row, because it contains a "2". But I just get an error "invalid syntax" and I don't know why!
(I also came across some solution which uses dataframes from the pandas package, but as I'm learning python, I want to avoid pandas at this stage ;) )
Thanks in advance!
You can't use an if statement in a lambda, because the body of a lambda must be a single expression. You could use a clearer list comprehension instead:
alist = [row for row in alist if 2 not in row]
This also has the advantage of iterating through the list once, as opposed to using map and list.remove, although you get a new list.
If you are trying to remove elements from a list, you need filter instead of map; map is meant for transforming elements and doesn't change the length of the list:
alist = [[1,2],[3,4]]
filter(lambda ele : 2 not in ele, alist)
# [[3, 4]]
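A sketch of both approaches in Python 3 syntax, with one extra made-up row added to the sample data. The names kept, kept_via_filter and alias are only for illustration. In Python 3, filter returns a lazy iterator, so it is wrapped in list() to get the output shown above:

```python
alist = [[1, 2], [3, 4], [2, 5]]

# New list without the rows that contain 2.
kept = [row for row in alist if 2 not in row]

# filter() is lazy in Python 3; list() materialises the result.
kept_via_filter = list(filter(lambda row: 2 not in row, alist))

# Slice assignment replaces the contents of the same list object,
# which matters if other names refer to alist.
alias = alist
alist[:] = kept
print(kept, kept_via_filter, alias)  # [[3, 4]] [[3, 4]] [[3, 4]]
```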

Altering a list using append during a list comprehension

Caveat: this is a straight up question for code-golfing, so I know what I'm asking is bad practise in production
I'm trying to alter an array during a list comprehension, but for some reason it is hanging and I don't know why or how to fix this.
I'm dealing with a list of lists of indeterminate depth and need to condense them to a flat list - for those curious it's this question. But at this stage, let's just say I need a flat list of all the elements in the list, and 0 if it is a list.
The normal method is to iterate through the list and if it's a list add it to the end, like so:
for o in x:
    if type(o)==type([]):x+=o
    else:i+=o
print i
I'm trying to shorten this using list comprehension, like so.
print sum([
    [o,x.append(o) or 0][type(o)==type([])]
    for o in x
])
Now, I know list.append returns None, so to ensure that I get a numeric value, lazy evaluation says I can do x.append(o) or 0, and since None is "falsy" it will evaluate the second part and the value is 0.
But it doesn't. If I put x.append() into the list comprehension over x, it doesn't break or error, or return an iteration error, it just freezes. Why does append freeze during the list comprehension, but the for loop above works fine?
edit: To keep this question from being deleted, I'm not looking for golfing tips (they are very educational though), I was looking for an answer as to why the code wasn't working as I had written it.
The or operator may be lazy, but list definitions aren't. For each o in x, when the [o,x.append(o) or 0][type(o)==type([])] monstrosity is evaluated, Python has to build the whole list [o,x.append(o) or 0] before it can index into it, which means evaluating x.append(o) or 0, which means that o will be appended to x regardless of whether it's a list. Thus, you end up with every element of x appended to x, and then they get appended again and again and again until you run out of memory (MemoryError).
What about:
y = [element for element in x if type(element) != list or x.extend(element)]
(note that extend will flatten, while append will only add the nested list back to the end, unflattened).
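A small sketch of the two points above, in Python 3 syntax. The helper record is hypothetical and only stands in for x.append (a call with a side effect that returns None), and the sample list is made up:

```python
calls = []

def record(tag):
    """Stand-in for x.append: has a side effect and returns None."""
    calls.append(tag)
    return None

# Both entries of the list are evaluated before the index picks one.
picked = ["kept", record("list") or 0][False]
print(picked, calls)  # kept ['list']

# A conditional expression only evaluates the branch it selects.
picked = "kept" if True else (record("conditional") or 0)
print(calls)  # ['list']

# The extend-based filter on sample data: nested lists are unpacked onto
# the end of x and dropped from the result, so the loop terminates.
x = [1, [2, [3, 4]], 5]
flat = [e for e in x if type(e) != list or x.extend(e)]
print(flat, sum(flat))  # [1, 5, 2, 3, 4] 15
```

The flattened result is not in the original nesting order, which is fine for summing but may matter elsewhere.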

Python: removing specific lines from an object

I have a bit of a weird question here.
I am using iperf to test performance between a device and a server. I get the results of this test over SSH, which I then want to parse into values using a parser that has already been made. However, there are several lines at the top of the results (which I read into a list of lines) that I don't want to go into the parser. I know exactly how many lines I need to remove from the top each time though. Is there any way to drop specific entries out of a list? Something like this in pseudo-Python
print list
["line1","line2","line3","line4"]
list = list.drop([0 - 1])
print list
["line3","line4"]
If anyone knows anything I could use I would really appreciate you helping me out. The only thing I can think of is writing a loop to iterate through and make a new list only putting in what I need. Anyway, thanks!
Michael
Slices:
l = ["line1","line2","line3","line4"]
print l[2:] # print from index 2 (the third element) onwards
["line3","line4"]
Slice syntax is [from(included):to(excluded):step]. Each part is optional. So you can write [:] to get a copy of the whole list (or of any sequence, for that matter; strings and tuples are examples from the built-ins). You can also use negative indexes, so [:-2] means everything except the last two elements. You can also step backwards: [::-1] means get all elements, but in reversed order.
Also, don't use list as a variable name. It overrides the built-in list class.
This is what the slice operator is for:
>>> before = [1,2,3,4]
>>> after = before[2:]
>>> print after
[3, 4]
In this instance, before[2:] says 'give me the elements of the list before, starting at index 2 and all the way until the end.'
(also -- don't use built-in names like list or dict as variable names -- doing that can lead to confusing bugs)
You can use slices for that:
>>> l = ["line1","line2","line3","line4"] # don't use "list" as variable name, it's a built-in.
>>> print l[2:] # to discard items up to some point, specify a starting index and no stop point.
['line3', 'line4']
>>> print l[:1] + l[3:] # to drop items "in the middle", join two slices.
['line1', 'line4']
Why not use a basic list slice? Something like:
list = list[3:] # everything from index 3 to the end
You want del for that
del list[:2]
You can use the "del" statement to remove specific entries:
del(list[0]) # remove entry 0
del(list[0:2]) # remove entries 0 and 1
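The slice and del answers differ in one respect that is easy to miss: a slice builds a new list, while del changes the existing one in place. A minimal sketch in Python 3 syntax, where the names lines, alias and trimmed are only for illustration:

```python
lines = ["line1", "line2", "line3", "line4"]
alias = lines              # second name for the same list object

trimmed = lines[2:]        # new list; lines itself is unchanged
print(trimmed)             # ['line3', 'line4']
print(lines)               # ['line1', 'line2', 'line3', 'line4']

del lines[:2]              # removes in place; alias sees the change too
print(lines)               # ['line3', 'line4']
print(alias)               # ['line3', 'line4']
```

If other code holds a reference to the list, del (or slice assignment) updates what it sees, whereas rebinding the name to a slice does not.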
