How to remove duplicates from a list using this function? - python

I'm new to Python. Could someone help me understand why the following function doesn't work? It is supposed to return a new list with the duplicates removed, but instead it prints [4, 6].
def remove_duplicates(l):
    solution = []
    for item in l:
        if l.count(item) < 2:
            solution.append(item)
        else:
            l.remove(item)
    return solution
print(remove_duplicates([4,5,5,5,4,6]))
I thought it iterates over the list one item at a time. So the first 5 would have a count of 3 and be removed, the second 5 would have a count of 2 and be removed, and the third 5 would have a count of 1 and be appended to the solution list. I can't wrap my head around why the 5s would be completely removed but the 4s would not.

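You must not remove items from a list while you are iterating over it. Iteration is done by incrementing an index internally, so when an item is removed, the following items shift one position to the left and the loop skips the next one.
If you want to keep the last occurrence of each item, it is best to count them first: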
from collections import Counter
def remove_duplicates(l):
    solution = []
    counts = Counter(l)
    for item in l:
        if counts[item] == 1:
            solution.append(item)
        else:
            counts[item] -= 1
    return solution
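To see why the function in the question returns [4, 6], the loop can be replayed with an explicit index. This is only a sketch of how list iteration behaves; the name trace_remove_duplicates is made up for illustration and is not part of the question or the answer.

```python
def trace_remove_duplicates(items):
    """Replay the question's loop with an explicit index and record each step."""
    items = list(items)  # work on a copy so the caller's list is untouched
    solution = []
    steps = []
    index = 0
    # A for loop over a list behaves like this: it advances an index and
    # stops when the index reaches the list's *current* length.
    while index < len(items):
        item = items[index]
        if items.count(item) < 2:
            solution.append(item)
            steps.append((index, item, "append"))
        else:
            items.remove(item)  # removes the first occurrence, shifting the rest left
            steps.append((index, item, "remove"))
        index += 1
    return solution, steps


solution, steps = trace_remove_duplicates([4, 5, 5, 5, 4, 6])
for step in steps:
    print(step)
print(solution)
```

Only four items are ever visited: after each removal the remaining items shift left, so the item that slides into the current position is never examined.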

Use Python's set data type to remove the duplicates. Note that a set does not preserve the original order of the items.
a = [4,5,5,5,4,6]
solution = list(set(a))
Output:
[4,5,6]
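If the order of first appearance matters, one possible alternative is dict.fromkeys, since dictionaries keep insertion order in Python 3.7 and later. The function name dedupe_keep_order is an assumption for this sketch, and it requires the items to be hashable.

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates removed, keeping first occurrences in order."""
    # dict keys are unique and (in Python 3.7+) remember insertion order
    return list(dict.fromkeys(items))


print(dedupe_keep_order([4, 5, 5, 5, 4, 6]))
print(dedupe_keep_order([3, 1, 3, 2, 1]))
```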

Related

How to handle operating on items in a list without perfectly even len()?

I'm trying to operate on every 5 items in a list, but can't figure out how to handle the remaining items if they don't divide evenly into 5. Right now I'm using modulo, but I can't shake the feeling it's not quite the right answer. Here's an example...
list = ["ValA","ValB","ValC","ValD","ValE","ValF","ValG","ValH","ValI","ValJ","ValK","ValL","ValM","ValN",]
newlist = []
i = 0
for o in list:
    i += 1
    newlist.append(o)
    if i % 5 == 0:
        for obj in newlist:
            function_for(obj)
        newlist.clear()
This code will execute function_for() twice, but not a third time to handle the remaining 4 values. If I add an 'else' statement it runs on every execution.
What's the correct way to handle a situation like this?
This way is pretty easy, if you don't mind modifying the list:
mylist = ["ValA","ValB","ValC","ValD","ValE","ValF","ValG","ValH","ValI","ValJ","ValK","ValL","ValM","ValN",]
while mylist:
    function_for( mylist[:5] )
    mylist = mylist[5:]
You can also check if the index is equal to the length of the list. (Additionally, it is more idiomatic to use enumerate instead of a counter variable here.)
lst = ["ValA","ValB","ValC","ValD","ValE","ValF","ValG","ValH","ValI","ValJ","ValK","ValL","ValM","ValN",]
newlist = []
for i, o in enumerate(lst, 1):
    newlist.append(o)
    if i % 5 == 0 or i == len(lst):
        print(newlist)
        newlist.clear()
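Another option that leaves the list untouched is to slice it in steps of the chunk size; the last slice is simply shorter when the length does not divide evenly. This is a minimal sketch, and the helper name chunks is an assumption, not something from the answers above.

```python
def chunks(items, size):
    """Yield successive slices of at most `size` items; the last one may be shorter."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


values = ["ValA", "ValB", "ValC", "ValD", "ValE", "ValF", "ValG",
          "ValH", "ValI", "ValJ", "ValK", "ValL", "ValM", "ValN"]
for chunk in chunks(values, 5):
    print(chunk)  # replace with the per-chunk work
```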

For loop: Store items in list, then count and print numbers of items in that list

I'm trying to count items from the Reddit API by adding them all to a list and then printing the number of items in that list.
I have tried a few things, but none of them have worked so far. Why doesn't this count correctly and print the number of items in "redditqueue"? Any help or suggestions are welcome!
x = []
for item in redditqueue: #redditqueue is a placeholder
    x.append(item)
    Count = x.count()
    print(Count)
What I want is for the code to print 2 if there's 2 items in redditqueue, but it's simply printing the following:
0
0
len() is what you want. Try this:
x = []
for item in redditqueue:
    x.append(item)
print(len(x))
If you want to know how many items are in redditqueue, then simply get the list length:
print(len(redditqueue))
If redditqueue is some sort of iterator or generator, then make a list of its entire sequence and take the length of that.
print(len(list(redditqueue)))
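If the sequence is too long for that and you only need the number of items, then don't accumulate the elements in yet another structure; just count them. Start enumerate at 1 so that the final value is the count rather than the last index, and initialise count so that an empty sequence prints 0:
count = 0
for count, item in enumerate(redditqueue, 1):
    pass
print(count)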
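Counting without storing the items can also be written as a sum over a generator expression. The sketch below assumes only that the input is iterable; count_items is a hypothetical helper name, and a plain iterator stands in for redditqueue.

```python
def count_items(iterable):
    """Count the items of any iterable without building a list of them."""
    return sum(1 for _ in iterable)


# iter([...]) stands in for a generator-like queue that has no len()
print(count_items(iter(["post one", "post two"])))
```

Keep in mind that counting consumes an iterator, so the items cannot be read from it again afterwards.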

Negative number finder in index and occurring

Write a function first_neg that takes a (possibly empty) list of
numbers as input parameter, finds the first occurrence of a
negative number, and returns the index (i.e. the position in the
list) of that number. If the list contains no negative numbers or it
is empty, the program should return None. Use while loop (and
not for loop) and your while loop should stop looping once the
first negative number is found.
This is the question my teacher asked me. Any ideas? This is what I did:
def first_neg(list):
    count = 0
    for number in list:
        if number < 0:
            count += 1
    return count
It doesn't seem to work properly. I just joined and this is my first post; I hope I can get some help.
x = [1,2,3,-5]
def first_neg(list):
    count = 0
    for number in list:
        count += 1 #moved it outside of the if
        if number < 0:
            return count
print(first_neg(x)) #prints 4
You want to increment count not when you've found the answer but every time the for loop iterates. Note that this method returns 4, which is the position of the item counting from 1, not its index. List indices start at 0, so the index is 3. Take our list x = [1,2,3,-5]: -5 is in the fourth slot of the list, but to access it we have to use x[3] because list indexing starts at 0.
If you want to return the index at which the first negative number is found, try this:
x = [1,2,3,-5]
def first_neg(list):
    for count, number in enumerate(list):
        if number < 0:
            return count
print(first_neg(x)) # prints 3
This is because enumerate creates a "pairing" of each item in the list and its current index. enumerate simply counts up from 0 every time it gets an item out of the list.
Also, as a side note (I didn't change it in my answer since I wanted you to understand what's going on): don't give your variables the names of built-ins like list, tuple, int, str... It works, as you can see, but it shadows the built-in and can cause issues.
Return the index immediately once you encounter the negative element. Increment the index otherwise:
def first_neg(lst):
    count = 0
    while count < len(lst):
        if lst[count] < 0:
            return count
        count = count + 1
    return None
Note: outside of an exercise that requires a while loop, it is better to use enumerate() than an extra count variable. The code in the question is not written in a Pythonic way.
You may try this as well:
def first_neg(lst):
    res = [i for i,x in enumerate(lst) if x<0]
    return None if res == [] else res[0]
The code above can be improved using generators, as suggested by @Chris_Rands.
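One way the generator idea might look is to pass a generator expression to next() with a default of None, so the search stops at the first negative number instead of building the whole list of matches. This is a sketch of that idea, not necessarily the exact code suggested in the comments.

```python
def first_neg(lst):
    """Return the index of the first negative number, or None if there is none."""
    # next() pulls only the first match from the generator; None is the default
    return next((i for i, x in enumerate(lst) if x < 0), None)


print(first_neg([1, 2, 3, -5]))
print(first_neg([]))
```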

print unique numbers from a sorted list python

I'm trying to print all elements in a sorted list that only occur once.
My code below works but I'm sure there is a better way:
def print_unique(alist):
    i = 0
    for i in range(len(alist)):
        if i < (len(alist)-1):
            if alist[i] == alist[i+1]:
                i+=1
                if alist[i] == alist[i-1]:
                    i+=1
            elif alist[i] == alist[i-1]:
                i+=1
            else:
                print(alist[i])
        else:
            if alist[-1]!= alist[-2]:
                print(alist[-1])
randomlist= [1,2,3,3,3,4,4,5,6,7,7,7,7,8,8,8,9,11,12,14,42]
print_unique(randomlist)
This produces
1
2
5
6
9
11
12
14
42
i.e. all values that appear only once in the list.
You could use the itertools.groupby() function to group your inputs and filter on groups that are one element long:
from itertools import groupby
def print_unique(alist):
    for elem, group in groupby(alist):
        if sum(1 for _ in group) == 1: # count without building a new list
            print(elem)
Or, if you want to do it 'manually', track the last item seen and whether you have seen it only once:
def print_unique(alist, _sentinel=object()):
    last, once = _sentinel, False
    for elem in alist:
        if elem == last:
            once = False
        else:
            if once:
                print(last)
            last, once = elem, True
    if last is not _sentinel and once:
        print(last)
You may want to replace the print statements with yield and leave printing to the caller:
def filter_unique(alist):
    for elem, group in groupby(alist):
        if sum(1 for _ in group) == 1: # count without building a new list
            yield elem
for unique in filter_unique(randomlist):
    print(unique)
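groupby only groups consecutive equal items, so the approach above relies on the list being sorted. If the input might not be sorted, a possible alternative is to count every value first with collections.Counter. The function name occurring_once is an assumption for this sketch.

```python
from collections import Counter


def occurring_once(items):
    """Return the items that appear exactly once, in their original order."""
    counts = Counter(items)  # value -> number of occurrences in the whole list
    return [item for item in items if counts[item] == 1]


randomlist = [1, 2, 3, 3, 3, 4, 4, 5, 6, 7, 7, 7, 7, 8, 8, 8, 9, 11, 12, 14, 42]
for value in occurring_once(randomlist):
    print(value)
```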
This question seems to have duplicates.
If you do not need to preserve the order of your list, you can do
print(list(set(sample_list)))
You can also try,
unique_list = []
for i in sample_list:
    if i not in unique_list:
        unique_list.append(i)
EDIT:
To print each element of the list on its own line, you can try this:
print('\n'.join([str(i) for i in unique_list]))
And, as @martijn-pieters mentioned in the comments, the first code was found to be very fast compared to the second one when I did a small benchmark. On a list of 10^5 elements (generated using random.random()), the second code took 63.66 seconds to complete, whereas the first one took a mere 0.2200 seconds.
You can do it like this:
print (set(YOUR_LIST))
or if you need a list use this:
print (list(set(YOUR_LIST)))
Sets are unordered collections of unique items. If you construct a set from the list, it will contain only the unique items:
def print_unique(alist):
    print(set(alist))
The input list does not need to be sorted.

'For' index changing because of list.pop() call

I have this Python code which compares, one by one, the items in a list of integers (named 'seen' in the code posted) with all the items in the .f field of another list (named 'maxx' in the code posted).
At every iteration I'm counting (through the c variable) how many times the j-th item appears in the 'maxx' list, and I want to pop() it from the list if it appears fewer than three times.
The code works perfectly, but popping an item 'pulls' every subsequent item in the 'seen' list back by one position, so every time the if condition is satisfied the loop misses the very next item of the list.
Here is the code:
for indj,j in enumerate(seen): # every item in the 'seen' list..
    c=0
    for k in maxx: # ..checks for a matching item in the 'maxx' list
        if j==k.f:
            c=c+1;
    if c<3: # if the item appears less than 3 times we pop it
        seen.pop(indj)
I have tried to add:
indj=indj-1
j=seen[indj]
at the end of the if construct, but it didn't work.
You have to make a new list or work with a copy. When you change a list while looping over it you skip some items. I'd do this:
def filter_low(lst, maxx, threshold=3):
    for item in lst:
        # count how many entries of maxx have a matching .f field
        c = sum(1 for k in maxx if item==k.f)
        if c >= threshold:
            yield item
new_seen = list(filter_low(seen, maxx, 3))
Which is the same as:
new_seen = [item for item in seen
            if sum(1 for k in maxx if item==k.f) >= 3]
You can change the original list by doing
seen[:] = [item for item in seen
           if sum(1 for k in maxx if item==k.f) >= 3]
Modifying the list you're iterating over is never a good idea. You could iterate over a copy and modify the actual list with
popped = 0
for indj, j in enumerate(seen[:]):
    s = sum(j == k.f for k in maxx)
    if s < 3:
        seen.pop(indj - popped)
        popped += 1
If the seen list is very large, this might be inefficient.
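For large inputs, one option is to count the .f values once up front and then filter with a single pass, instead of rescanning maxx for every item in seen. The sketch below makes several assumptions: Record is a hypothetical stand-in for whatever objects maxx really holds, filter_by_count is a made-up name, and the .f values must be hashable.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for the objects stored in 'maxx'; only the .f field matters
Record = namedtuple("Record", "f")


def filter_by_count(seen, maxx, threshold=3):
    """Keep the items of `seen` that occur at least `threshold` times as .f in `maxx`."""
    counts = Counter(k.f for k in maxx)  # one pass over maxx
    # a Counter returns 0 for missing keys, so unseen items are dropped
    return [item for item in seen if counts[item] >= threshold]


maxx = [Record(1), Record(1), Record(1), Record(2),
        Record(3), Record(3), Record(3), Record(3)]
seen = [1, 2, 3, 4]
seen[:] = filter_by_count(seen, maxx)  # update the original list in place
print(seen)
```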
