Creating a function that removes duplicates in list - python

I'm trying to manually make a function that removes duplicates from a list. I know there is a Python function that does something similar (set()), but I want to create my own. This is what I have:
def remove(lst):
    for i in range(len(lst)):
        aux = lst[0:i] + lst[i+1:len(lst)]
        if lst[i] in aux:
            del(lst[i])
    return lst
I was trying something like creating a sub-list with all the items except the one the for is currently on, and then check if the item is still in the list. If it is, remove it.
The problem is that it gives me an index out of range error. Does the for i in range(len(lst)): line not update every time it starts over? Since I'm removing items from the list, the list will be shorter, so for a list that has 10 items and 2 duplicates, it will go up to index 9 instead of stopping on the 7th.
Is there any way to fix this, or should I just try doing this in another way?

I know this does not fix your current script, but would something like this work?
def remove(lst):
    unique = []
    for i in lst:
        if i not in unique:
            unique.append(i)
    return unique
Just simply looping through, creating another list and checking for membership?

The problem is that you are modifying the list while you are iterating over it. By the time the loop reaches the later indexes, the list is shorter because you've removed elements, so those indexes no longer exist. You should (generally) avoid removing elements while you are looping over a list.

You got it the first time: len(lst) is evaluated only when you enter the loop. If you want it re-evaluated, try the while version:
i = 0
while i < len(lst):
    ...
    i += 1
Next, you get to worry about another problem: you should increment i only when you don't delete an item. When you do delete, the list shortens and the next element moves into index i, so you are already positioned on it.
i = 0
while i < len(lst):
    aux = lst[0:i] + lst[i+1:len(lst)]
    if lst[i] in aux:
        del(lst[i])
    else:
        i += 1
I think that should solve your problem ... using the logic you intended.
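For reference, here is that loop wrapped into a complete function, as a minimal sketch. The name remove_in_place is made up for illustration. Note that this approach mutates the list passed in and keeps the last occurrence of each value rather than the first.

```python
def remove_in_place(lst):
    """Remove duplicates from lst in place, keeping the last occurrence of each value."""
    i = 0
    while i < len(lst):  # len(lst) is re-evaluated on every pass
        # Everything except the element at index i
        aux = lst[:i] + lst[i + 1:]
        if lst[i] in aux:
            # Deleting shifts the next element into index i, so don't advance
            del lst[i]
        else:
            i += 1
    return lst


print(remove_in_place([1, 2, 1, 3, 3]))  # [2, 1, 3]
```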

def remove(lst):
    new_list = []
    for i in lst:
        if i not in new_list:
            new_list.append(i)
    return new_list
You should append the values to a secondary list. As Bobbyrogers said, it's not a good idea to iterate over a list that is changing.

You can also try this:
lst = [1,2,3,3,4,4,5,6]
lst2 = []
for i in lst:
    if i not in lst2:
        lst2.append(i)
print(lst2)
[1, 2, 3, 4, 5, 6]

Related

How can I remove all the numbers in a list except for 1 number (Python)?

I want my code to remove every single number in a list except for a specific number, which is 3. Instead it removes certain numbers, but the majority of them still remain in the list.
myList = [0,1,2,3,4,5]
i = 0
for i in myList:
    print(i)
    if i != 3:
        myList.remove(i)
    else:
        continue
    i += 1
print(myList)
You've got a few issues. First, you're modifying the list in place while you're iterating over it. You can fix that by changing:
for i in myList:
to:
for i in myList[:]:
That will allow your code to give the desired result. It makes a "copy" of the list over which to iterate, so you can modify the original list without messing up your iteration loop.
The other thing to note is that you assign a value to i in the for loop, but then you manually change it after your if-else block. That change gets discarded when you go back to the top of the loop.
Also, your else: continue prevents incrementing i, but it doesn't matter, because that incremented value was just getting tossed anyway.
So... commenting out some of the unnecessary stuff gives:
myList = [0,1,2,3,4,5]
# i = 0
for i in myList[:]:
    print(i)
    if i != 3:
        myList.remove(i)
    # else:
    #     continue
    # i += 1
print(myList)
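To see why the original loop (without the copy) leaves numbers behind, here is a small sketch that records which values the loop actually visits. The visited list is only there for illustration and is not part of the original code.

```python
my_list = [0, 1, 2, 3, 4, 5]
visited = []  # illustration only: the values the loop actually sees

for value in my_list:
    visited.append(value)
    if value != 3:
        # Removing shifts later items left, so the next item gets skipped
        my_list.remove(value)

print(visited)  # [0, 2, 4]
print(my_list)  # [1, 3, 5]
```

Every removal moves the following element into the slot the loop has just finished with, so 1, 3 and 5 are never examined.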
You have a couple of problems. First, the for loop iterates the list values, not its index. You can use enumerate to get both the index and the value. Second, if you delete values in a list, its remaining elements are shifted down by 1 but since the iterator also increments by 1, you miss a value. A trick is to iterate the list in reverse so that any deleted values have already been iterated.
>>> myList = [0,1,2,3,4,5]
>>> mlen = len(myList)
>>> for i, v in enumerate(reversed(myList), 1):
...     if v != 3:
...         del myList[mlen-i]
...
>>> myList
[3]
But this operation is slow, because each del has to shift the elements that follow it. If you don't need to modify the original list, use a list comprehension:
>>> myList = [0,1,2,3,4,5]
>>> myList = [v for v in myList if v==3]
>>> myList
[3]
The issue is you are removing elements from the list you are iterating over. So your loop won't iterate over the entire list.
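If you do need the original list object itself to change (for example because other names refer to it), one option is slice assignment. This is a minimal sketch; the alias name is only there to show that the same object was modified.

```python
my_list = [0, 1, 2, 3, 4, 5]
alias = my_list  # a second name for the same list object

# Build the filtered list first, then replace the contents in place
my_list[:] = [v for v in my_list if v == 3]

print(my_list)  # [3]
print(alias)    # [3], because the same object was modified
```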

Removing last list element by popping

I have the following piece of Python code where I am trying to empty out list A by removing one element at a time starting from the end. I cannot seem to reduce the list to an empty one and I would like to know why not. Any insight would be extremely appreciated.
A = [3,4,5,6,2]
for i in A:
    A.pop()
var = [3,4,5,6,2]
for x in range(len(var)):
    a = var.pop(-1)
    print(a)
or reverse a list
var = var[::-1]
You are facing this issue because the loop iterates from the first element while you remove elements from the end of the list. At some point the for loop runs out of elements, so it stops before the list is empty.
One solution is to iterate over the indexes in reverse and remove the elements.
Sample code:
A = [3,4,5,6,2]
for i in range(len(A) - 1, -1, -1):
    A.pop()
print(A)
You can try this:
A = [3,4,5,6,2]
for a in range(len(A)):
    A.pop(-1)
Output:
>>> A
[]
You can also use a list comprehension instead of a traditional for loop:
A = [3,4,5,6,2]
[A.pop(-1) for a in range(len(A))]
Output:
>>> A
[]
Here you are iterating through the elements of a list while the length of the list is reduced by 1 on each iteration.
This is not the right way to do it.
Try this:
A = [3,4,5,6,2]
for _ in range(len(A)):
    A.pop()
This will work, because range(len(A)) is computed once, before any element is removed.
Note: Avoid looping directly over a list that you add to or remove from inside the loop body. Loop over a copy of the list, or use some other loop condition.
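Two more ways to empty a list, as a rough sketch: loop on the list's truthiness, or call list.clear() (available in Python 3.3 and later). The first loop reproduces the original problem for comparison. The names B and C are just for illustration.

```python
A = [3, 4, 5, 6, 2]
for _ in A:      # the iterator's position advances while the list shrinks
    A.pop()
print(A)  # [3, 4]

B = [3, 4, 5, 6, 2]
while B:         # an empty list is falsy, so this stops at []
    B.pop()
print(B)  # []

C = [3, 4, 5, 6, 2]
C.clear()        # removes every element in place
print(C)  # []
```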

Iterating over 2 lists at once and comparing the elements

This is just a small part of my homework assignment. What I'm trying to do is iterate through list1, iterate backwards through list2, and determine if list2 is a reversed version of list1. Both lists are of equal length.
Example: list1 = [1,2,3,4] and list2 = [4,3,2,1]. list2 is a reversed version of list1. You could also have list1 = [1,2,1] and list2 = [1,2,1]; then they would be the same list and also reversed lists.
I'm not asking for exact code, I'm just not sure how I would code this. Would I run 2 loops? Any tips are appreciated. Just looking for a basic structure/algorithm.
Edit: we are not allowed to use any auxiliary lists etc.
You can just iterate backwards over the second list and keep a counter of items from the start of the first list. If a pair of items doesn't match, return False immediately; otherwise keep going. If the loop finishes, every pair matched.
Here's what it can look like:
def is_reversed(l1, l2):
    first = 0
    for i in range(len(l2)-1, -1, -1):
        if l2[i] != l1[first]:
            return False
        first += 1
    return True
Which Outputs:
>>> is_reversed([1,2,3,4], [4,3,2,1])
True
>>> is_reversed([1,2,3,4], [4,3,2,2])
False
That said, it would be easier to just use the built-in functions suggested in the comments.
The idea here is that whenever you have an element list1[i], you want to compare it to the element list2[-i-1]. As required, the following code creates no auxiliary list in the process.
list1 = [1, 2, 3]
list2 = [3, 2, 1]
are_reversed = True
for i in range(len(list1)):
    if list1[i] != list2[-i - 1]:
        are_reversed = False
        break
I want to point out that the built-in range does not create a new list in Python3, but a range object instead. Although, if you really want to stay away from those as well you can modify the code to use a while-loop.
You can also make this more compact by taking advantage of the built-in function all. The following line passes a generator expression to all, so this solution does not create an auxiliary list either.
are_reversed = all(list1[i] == list2[-i - 1] for i in range(len(list2))) # True
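Another variant, offered as a sketch rather than the intended homework answer: pair the items with zip and reversed. Whether these built-ins are allowed under the "no auxiliary lists" rule is an assumption. The function name is_reversed_zip is made up for illustration.

```python
def is_reversed_zip(list1, list2):
    """Return True if list2 holds the elements of list1 in reverse order."""
    if len(list1) != len(list2):
        return False
    # zip pairs list1[0] with list2[-1], list1[1] with list2[-2], and so on;
    # reversed() yields items lazily, so no extra list is built
    return all(a == b for a, b in zip(list1, reversed(list2)))


print(is_reversed_zip([1, 2, 3, 4], [4, 3, 2, 1]))  # True
print(is_reversed_zip([1, 2, 3, 4], [4, 3, 2, 2]))  # False
```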
If you want to get the N'th value of each list you can do a for loop with
if len(list1) <= len(list2):
    for x in range(0, len(list1)):
        if list1[x] == list2[x]:
            pass  # Do something
else:
    for x in range(0, len(list2)):
        if list1[x] == list2[x]:
            pass  # Do something
If you want to compare each value of one list with every value of another list, you can nest a for loop. Note that i and j are the values themselves here, not indexes:
for i in list1:
    for j in list2:
        if i == j:
            pass  # Do something
EDIT: Changed code to Python

list index out of range

I can't figure out why this is throwing up a "List index out of range error". This code is meant to remove duplicates from a list.
def remove_duplicates(l):
    new_list = l
    for i in range(0,len(l)):
        store = l[i]
        for x in range(i+1,len(l)):
            if l[x] == store:
                new_list.pop(x)
    return new_list
print(remove_duplicates([1,1,2,2]))
Thanks for all the answers. So I tried this (after considerable brain-wracking) and I can't figure out what's wrong this time.
def remove_duplicates(l):
    new_list = l[:]
    for i in range(0,len(l)):
        count = 0
        store = l[i]
        for x in range(0,len(new_list)):
            if l[x] == store:
                count += 1
        if count >= 2:
            new_list.remove(l[i])
    return new_list
print(remove_duplicates([1,1,2,2]))
Which prints [2, 2] to the console. I used the remove function, so it can't be an indexing error. I don't see how it can remove the 1 on the second iteration. I'm looping over the modified list in the second for loop, so there's no way the count can be >= 2 in the if condition.
new_list = l makes new_list refer to the same object as l. Any changes to new_list will then affect l, which causes errors in your code. You get the out-of-range error because you remove items from new_list, which actually removes them from l, while you're iterating over l.
Use new_list = l[:] to copy l to a new list.
You also have errors with the logic of your code: new_list.pop(x) will change the indexes of all the following elements, which means the next time you remove an element, it will be removed from the wrong index.
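A quick way to see the aliasing described above, as a small sketch (the names alias and copied are only for illustration):

```python
l = [1, 1, 2, 2]

alias = l        # no copy: both names refer to the same list object
copied = l[:]    # shallow copy: a new list with the same elements

alias.pop(0)     # this also shortens l

print(alias is l)   # True
print(copied is l)  # False
print(l)            # [1, 2, 2]
print(copied)       # [1, 1, 2, 2]
```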
You are not copying the list properly. Your new_list refers to the same object as l, so any change you make to new_list shows up in l as well.
To avoid this you can make a shallow copy.
Example:
new_list=l[:]
List comprehensions are often somewhat faster than the equivalent for loop and look fancy to boot. If your intention is to create a new list with duplicates removed, this will do it (it keeps the last occurrence of each element):
def remove_duplicates(mylist):
    return [elem for i, elem in enumerate(mylist) if elem not in mylist[i+1:]]
print(remove_duplicates([1,1,2,2]))
If you have a very large list with many duplicates, a for loop with a set() to track what you've seen may be faster.
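A minimal sketch of that set-based loop, assuming the elements are hashable (numbers, strings, tuples). The function name remove_duplicates_seen is made up here. Unlike the comprehension above, it keeps the first occurrence of each element.

```python
def remove_duplicates_seen(items):
    """Return a new list without duplicates, keeping the first occurrence of each item."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # set membership is O(1) on average
            seen.add(item)
            result.append(item)
    return result


print(remove_duplicates_seen([1, 1, 2, 2, 1, 3]))  # [1, 2, 3]
```

The slice-based comprehension rescans the rest of the list for every element, so it does quadratic work; this version makes a single pass.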

Searching for a substring in the elements of a list and deleting the element

I have a list and I am trying to delete the elements that have 'pie' in them. This is what I've done:
list = ['applepie','orangepie', 'turkeycake']
for i in range(len(list)):
    if "pie" in list[i]:
        del list[i]
I keep getting list index out of range, but when I change the del to a print statement it prints out the elements fine.
Instead of removing an item from the list you're iterating over, try creating a new list with Python's nice list comprehension syntax:
foods = ['applepie','orangepie', 'turkeycake']
pieless_foods = [f for f in foods if 'pie' not in f]
Deleting an element during iteration changes the size of the list, which causes the IndexError.
You can rewrite your code using a list comprehension:
L = [e for e in L if "pie" not in e]
Something like:
stuff = ['applepie','orangepie', 'turkeycake']
stuff = [item for item in stuff if not item.endswith('pie')]
Modifying an object that you're iterating over should be considered a no-go.
The reason you get an error is that you change the length of the list when you delete something!
Example:
first loop: i = 0, "applepie" is deleted, so the list becomes 1 shorter (length is now 2) and "orangepie" moves to index 0
second loop: i = 1, list[1] is now "turkeycake", which doesn't contain "pie", so nothing is deleted ("orangepie" was skipped)
last/third loop: i = 2, now you should see the problem: the list only has indexes 0 and 1, so list[2] raises the IndexError
So rather use something like:
new_list = []
for item in list:
    if "pie" not in item:
        new_list.append(item)
Another, longer way would be to note down the indexes where you encounter "pie" and delete those elements after the first for loop.
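That two-pass idea could look roughly like this. It is a sketch, and the names foods and to_delete are only for illustration:

```python
foods = ['applepie', 'orangepie', 'turkeycake']

# First pass: only record the positions, don't touch the list
to_delete = [i for i, food in enumerate(foods) if 'pie' in food]

# Second pass: delete from the highest index down, so earlier
# deletions don't shift the positions still waiting to be removed
for i in reversed(to_delete):
    del foods[i]

print(foods)  # ['turkeycake']
```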
