Removing element messes up the index [duplicate] - python

This question already has answers here:
Loop "Forgets" to Remove Some Items [duplicate]
(10 answers)
Closed 8 years ago.
I have a simple question about lists
Suppose that I want to delete all 'a's from a list:
list = ['a', 'a', 'b', 'b', 'c', 'c']
for element in list:
    if element == 'a':
        list.remove('a')
print(list)
==> result:
['a', 'b', 'b', 'c', 'c']
I know this is happening because, after I remove the first 'a', the loop's index gets
incremented while all the remaining elements get pushed left by 1, so the second 'a' is skipped.
In other languages, I guess one way to solve this is to iterate backwards from the end of the list.
However, iterating through reversed(list) returns the same error.
Is there a Pythonic way to solve this problem?
Thanks

One of the more Pythonic ways (in Python 3, filter() returns an iterator, so wrap it in list()):
>>> list(filter(lambda x: x != 'a', ['a', 'a', 'b', 'b', 'c', 'c']))
['b', 'b', 'c', 'c']

You should never modify a list while iterating over it.
A better approach would be to use a list comprehension to exclude an item:
list1 = ['a', 'a', 'b', 'b', 'c', 'c']
list2 = [x for x in list1 if x != 'a']
Note: Don't use list as a variable name in Python - it masks the built-in list type.

You are correct: when you remove an item from a list while iterating over it, the list index gets out of sync. What both of the other answers are hinting at is that you need to create a new list and copy over only the items you want.
For example:
existing_list = ['a', 'a', 'b', 'c', 'd', 'e']
new_list = []
for element in existing_list:
    if element != 'a':
        new_list.append(element)
existing_list = new_list
print(existing_list)
outputs: ['b', 'c', 'd', 'e']
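The answers above all rebind a name to a new list, so any other variable that still refers to the original list keeps the old contents. A minimal sketch of an in-place variant using slice assignment follows; remove_all is a hypothetical helper name chosen for illustration, not something taken from the answers:

```python
def remove_all(items, target):
    """Remove every occurrence of target from items, in place."""
    # Build the filtered list first, then overwrite the contents of the
    # existing list object so that other references see the change.
    items[:] = [x for x in items if x != target]


letters = ['a', 'a', 'b', 'b', 'c', 'c']
alias = letters          # second reference to the same list object
remove_all(letters, 'a')
print(letters)           # ['b', 'b', 'c', 'c']
print(alias is letters)  # True: the same object was modified
```

Because the comprehension finishes before the slice assignment happens, the list is never modified while it is being iterated.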

Related

How to efficiently get common items from two lists that may have duplicates?

my_list = ['a', 'b', 'a', 'd', 'e', 'f']
my_list_2 = ['a', 'b', 'c']
The common items are:
c = ['a', 'b', 'a']
The code:
for e in my_list:
    if e in my_list_2:
        c.append(e)
...
If my_list is long, this would be very inefficient. If I convert both lists into sets and then use set's intersection() method to get the common items, I will lose the duplicates in my_list.
How to deal with this efficiently?
A dict is already a hash map, so membership lookups are practically as efficient as with a set, and you may not need to do any extra work collecting the values. If it weren't, you could pack the values into a set and check against that instead.
However, a large improvement may be to write a generator for the values, rather than creating a new intermediate list, and iterate over it where you actually want the values:
def foo(src_dict, check_list):
    for value in check_list:
        # "in" on a dict tests its keys with a hash lookup
        if value in src_dict:
            yield value
With the edit, you may find you're better off packing all the inputs into a set
def foo(src_list, check_list):
    hashmap = set(src_list)
    for value in check_list:
        if value in hashmap:
            yield value
If you know a lot about the inputs, you can do better, but that's an unusual case. For example, if the lists are sorted you could bisect; and if the list being checked is huge, or there are very few values to check against, the order of the checks and whether you build a set at all may affect efficiency.
I am not sure about time efficiency, but, personally speaking, list comprehension would always be more of interest to me:
[x for x in my_list if x in my_list_2]
Output
['a', 'b', 'a']
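The comprehension above tests membership against my_list_2 as a list, which means a linear scan for every element of my_list. A minimal sketch that keeps the same result and order but looks values up in a set built once; the name lookup is chosen here for illustration, and the sketch assumes the items are hashable:

```python
my_list = ['a', 'b', 'a', 'd', 'e', 'f']
my_list_2 = ['a', 'b', 'c']

# Build the set once, outside the comprehension, so each membership
# test is a hash lookup instead of a scan of my_list_2.
lookup = set(my_list_2)
common = [x for x in my_list if x in lookup]
print(common)  # ['a', 'b', 'a']
```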
First, utilize the set.intersection() method to get the intersecting values in the list. Then, use a nested list comprehension to include the duplicates based on the original list's count on each value:
my_list = ['a', 'b', 'a', 'd', 'e', 'f']
my_list_2 = ['a', 'b', 'c']
c = [x for x in set(my_list).intersection(set(my_list_2)) for _ in range(my_list.count(x))]
print(c)
The above may be slower than just
my_list = ['a', 'b', 'a', 'd', 'e', 'f']
my_list_2 = ['a', 'b', 'c']
c = []
for e in my_list:
    if e in my_list_2:
        c.append(e)
print(c)
But when the lists are significantly larger, the code block using the set.intersection() method can be considerably faster. Note that it groups equal items together in set iteration order rather than preserving the order of my_list.
Sorry for not reading the post carefully; the answer can no longer be deleted. However, here is an attempt at a solution.
c = lambda my_list, my_list_2: (my_list, my_list_2, list(set(my_list).intersection(set(my_list_2))))
print("(list_1,list_2,duplicate_items) ->", c(my_list, my_list_2))
Output:
(list_1,list_2,duplicate_items) -> (['a', 'b', 'a', 'd', 'e', 'f'], ['a', 'b', 'c'], ['b', 'a'])
or can be
[i for i in my_list if i in my_list_2]
output:
['a', 'b', 'a']

Problems removing element while iterating over list [duplicate]

This question already has answers here:
Modify a list while iterating [duplicate]
(4 answers)
Closed 2 years ago.
As a beginner, I am writing a simple script to better acquaint myself with python. I ran the code below and I am not getting the expected output. I think the for-loop ends before the last iteration and I don't know why.
letters = ['a', 'b', 'c', 'c', 'c']
print(letters)
for item in letters:
    if item != 'c':
        print('not c')
    else:
        letters.remove(item)
        continue
print(letters)
output returned:
['a', 'b', 'c', 'c', 'c']
not c
not c
['a', 'b', 'c']
Expected Output:
['a', 'b', 'c', 'c', 'c']
not c
not c
['a', 'b']
Basically, I am not expecting to have 'c' within my list anymore.
If you have a better way to write the code that would be appreciated as well.
WARNING: This is an inefficient solution that I will provide to answer your question. I'll post a more concise and faster solution in answer #2.
Answer #1
When you remove items like this, the length of the list changes during the loop, so it is better to loop backwards. Try for item in letters[::-1]. The slice creates a reversed copy, so the loop iterates over the copy while remove() changes the original list:
letters = ['a', 'b', 'c', 'c', 'c']
print(letters)
for item in letters[::-1]:
    if item != 'c':
        print('not c')
    else:
        letters.remove(item)
        continue
print(letters)
output:
['a', 'b', 'c', 'c', 'c']
not c
not c
['a', 'b']
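Note that letters.remove(item) always deletes the first matching element, and letters[::-1] allocates a full copy of the list. A sketch of looping backwards by index instead, which deletes exactly the element being visited and needs no copy, assuming in-place deletion is what is wanted:

```python
letters = ['a', 'b', 'c', 'c', 'c']

# Walk the indices from the last one down to 0. Deleting position i only
# shifts elements at higher indices, which have already been visited.
for i in range(len(letters) - 1, -1, -1):
    if letters[i] == 'c':
        del letters[i]

print(letters)  # ['a', 'b']
```

Each del still shifts the elements after it, so for large lists the list comprehension in answer #2 remains the simpler choice.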
Answer #2 - Use list comprehension instead of looping (more detail: Is there a simple way to delete a list element by value?):
letters = ['a', 'b', 'c', 'c', 'c']
letters = [x for x in letters if x != 'c']
output:
['a', 'b']
letters.remove(item) removes only a single instance of the element, but it has the unintended effect of shrinking the list while you are iterating over it. Modifying the object you are iterating over is something you generally want to avoid. Because the list becomes shorter, the iterator believes it has traversed all of the elements, even though the last 'c' is still in the list. This can be seen in the output of:
letters = ['a', 'b', 'c', 'c', 'c']
print(letters)
for idx, item in enumerate(letters):
    print("Index: {} Len: {}".format(idx, len(letters)))
    if item != 'c':
        print('not c')
    else:
        letters.remove(item)
        continue
print(letters)
"""Index: 0 Len: 5
not c
Index: 1 Len: 5
not c
Index: 2 Len: 5
Index: 3 Len: 4"""
You never iterate over the last element because, after the second removal, the next index (4) would exceed the indexable elements of the list (0-2 now)!
If you want to filter a list you can use the built-in filter function. In Python 3 it returns an iterator, so wrap the call in list():
letters = list(filter(lambda x: x != 'c', letters))

Python sort list with list.count does not work if list has items of equal occurrence [duplicate]

This question already has answers here:
Python list sort in descending order
(6 answers)
Sort Python list using multiple keys
(6 answers)
Closed 4 years ago.
I have been trying to sort a list of elements (strings) according to their occurrence in Python 3. I have been using the built-in sorted() function with str.count as the key, as shown below.
p = "acaabbcabac"
print(sorted(p, key=p.count))
# Output : ['c', 'b', 'b', 'c', 'b', 'c', 'a', 'a', 'a', 'a', 'a']
#But expected output is ['a','a','a','a','a','b','b','b','c','c','c']
p = "acaabbcb"
print(sorted(p, key=p.count))
# Output : ['c', 'c', 'a', 'a', 'a', 'b', 'b', 'b']
#Output is as expected
p = "ababab"
print(sorted(p, key=p.count))
# Output : ['a', 'b', 'a', 'b', 'a', 'b']
#But expected output is ['a','a','a','b','b','b']
What I have observed is that the above sort works according to the occurrence of each element, but only if the count of each element is different. If two or more elements occur the same number of times, they are listed in the order in which they appear in the string/list.
Am I doing something wrong, or is there a better approach to this? I tried searching for answers to this issue but could not find any, so I am posting it here.
Thanks in advance.
Use a lambda function as your sorting key that returns a tuple: the first item is the count of the element, and the second is the element itself, which breaks ties between equal counts (and ends up being alphabetical):
p = "ababab"
sorted(p, key=lambda x: (p.count(x), x))
# ['a', 'a', 'a', 'b', 'b', 'b']
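Calling p.count for every element rescans the whole string each time. A possible variant, sketched under the assumption that the most frequent characters should come first (as the first expected output in the question suggests), counts everything once with collections.Counter; the names counts and result are illustrative:

```python
from collections import Counter

p = "acaabbcabac"

# Count every character once up front instead of calling p.count()
# for each element during the sort.
counts = Counter(p)

# Negate the count so the most frequent characters come first;
# ties are broken alphabetically by the character itself.
result = sorted(p, key=lambda ch: (-counts[ch], ch))
print(result)
# ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c']
```

Dropping the minus sign gives ascending order by count instead.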

List of lists weird python issue [duplicate]

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
Closed 5 years ago.
list = []
x = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
for i in range(2):
    list.append(x)
list[0][0] = "x"
print(list)
And after printing I get this:
[['x', 'B', 'C', 'D', 'E', 'F', 'G'], ['x', 'B', 'C', 'D', 'E', 'F', 'G']]
The first item in every list was replaced by 'x' whereas I only wanted the first item in the first list to be replaced by 'x'(hence the line list[0][0]="x")
The line list.append(x) adds a reference to x into list. Both sublists end up pointing to the same object (as does x). In fact, doing x[0] = 'x' would have exactly the same effect as list[0][0] = 'x'. To make the sublists independent, append a copy instead: list.append(x[:]) works in any Python version, and list.append(x.copy()) works in Python 3.3 and later.
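The difference between appending a reference and appending a copy can be checked directly. A small sketch, using a shortened x and the illustrative names shared and rows:

```python
x = ['A', 'B', 'C']

# Appending x twice stores two references to the same list object.
shared = [x, x]
print(shared[0] is shared[1])  # True

# Copying on each append gives every row its own list object.
rows = [x[:] for _ in range(2)]
rows[0][0] = 'x'
print(rows)  # [['x', 'B', 'C'], ['A', 'B', 'C']]
print(x)     # ['A', 'B', 'C'] (the original is untouched)
```

x[:] is a shallow copy, which is enough here because the elements are strings; lists that themselves contain lists would need copy.deepcopy.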

Check list item against multiple lists and remove if present in any of them. Python [duplicate]

This question already has answers here:
remove elements in one list present in another list [duplicate]
(2 answers)
Closed 8 years ago.
I have a main list such as:
mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
and I want to search each item in this mainlst against multiple other search lists and if it's present in any of them to remove it from the main list, so for example:
searchlst1 = ['a', 'b', 'c']
searchlst2 = ['a', 'd', 'f']
searchlst3 = ['e', 'f', 'g']
The issue I'm having is that I can't work out how to make the loop go through each statement. If I use an if/elif statement, it exits the loop as soon as it has found a match:
for item in mainlst:
    if item in searchlst1:
        mainlst.remove(item)
    elif item in searchlst2:
        mainlst.remove(item)
    elif item in searchlst3:
        mainlst.remove(item)
But obviously this exits the loop as soon as one condition is true. How do I make the loop go through all the conditions?
set objects are great for stuff like this -- the in operator takes O(1) time on average compared to O(N) time for a list -- and it's easy to create a set from a bunch of existing lists using set.union:
search_set = set().union(searchlst1, searchlst2, searchlst3)
mainlst = [x for x in mainlst if x not in search_set]
Example:
>>> search_set = set().union(searchlst1, searchlst2, searchlst3)
>>> search_set
set(['a', 'c', 'b', 'e', 'd', 'g', 'f'])
>>> mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> mainlst = [x for x in mainlst if x not in search_set]
>>> mainlst
['h']
How about using a list comprehension and a set (built once, so it is not recreated for every element):
search = set(searchlst1 + searchlst2 + searchlst3)
[i for i in mainlst if i not in search]
returns ['h']
set() takes an iterable (in this case the concatenation of the three lists) and returns a set containing the unique values. Tests for membership in a set take roughly constant time on average, whereas testing for membership in a list scales linearly with the length of the list.
The list comprehension goes through each element of mainlst and constructs a new list whose members are not in the set:
>>> mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> search = set(searchlst1 + searchlst2 + searchlst3)
>>> search
set(['a', 'c', 'b', 'e', 'd', 'g', 'f'])
>>> [i for i in mainlst if i not in search]
['h']
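Replacing the elif statements with separate if statements will not fix your problem: an item found in two search lists would be removed twice, and the second remove() raises ValueError. The elif chain is fine, because one match is enough to remove the item. The real problem is that removing from mainlst while looping over it makes the loop skip elements. Loop over a copy instead:
for item in mainlst[:]:
    if item in searchlst1:
        mainlst.remove(item)
    elif item in searchlst2:
        mainlst.remove(item)
    elif item in searchlst3:
        mainlst.remove(item)
The remaining problem is that you're doing up to three searches through the search lists for every item. This will become more time-consuming as the main list or the search lists grow. And in your example there are duplicates across your search lists.
Combining the search lists will reduce the number of comparisons.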
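One way to combine the search lists for any number of them is sketched below. remove_present is a hypothetical helper name introduced for illustration, and the sketch assumes the items are hashable:

```python
def remove_present(main, *search_lists):
    """Return the items of main that appear in none of search_lists."""
    # set().union accepts any number of iterables, so the lookup set
    # is built once regardless of how many search lists are passed.
    seen = set().union(*search_lists)
    return [item for item in main if item not in seen]


mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
searchlst1 = ['a', 'b', 'c']
searchlst2 = ['a', 'd', 'f']
searchlst3 = ['e', 'f', 'g']

print(remove_present(mainlst, searchlst1, searchlst2, searchlst3))  # ['h']
```

The function returns a new list and leaves mainlst unchanged, which avoids modifying a list while iterating over it.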
