Python: Duplicates in list

Python: Duplicates in list - python

Im trying to create a new list of unique values and remove said values from the original list so that what's left is duplicates. It appears my for loop is skipping over values.
array = [1,3,4,2,2,3,4]
def duplicates(array):
mylist = []
for item in array:
if item not in mylist:
mylist.append(item)
array.remove(item)
return mylist
results:
duplicates(array)
[1, 4, 2]

I think that using collections.Counter is more appropriate for this task:
array = [1, 3, 4, 2, 2, 3, 4]
from collections import Counter
def duplicates(array):
return [n for n, c in Counter(array).items() if c > 1]
print(duplicates(array))
Output:
[3, 4, 2]

The issue is with the array.remove(item), it is deleting the element at the index position visited. So, index number reduces by one and making the loop to skip reading the next value.
[1, 3, 4, 2, 2, 3, 4] -> before 1st iteration index 0 -> value =1
[3, 4, 2, 2, 3, 4] -> After 1st iteration 1 is removed, so index 0 -> value =3(loop not reading it as it already read index 0, so loop is reading index 1 -> value 4)
Correct code to display values without duplicates:
array = [1,3,4,2,2,3,4]
def duplicates(array):
mylist = []
for item in array:
if item not in mylist:
mylist.append(item)
#array.remove(item)
return mylist
res=duplicates(array)
print (res)

You are removing values from the list you are iterating through, so your loop is skipping values, try this
array = [1,3,4,2,2,3,4]
def duplicates(array):
mylist = []
for i, item in enumerate(array):
if item not in mylist:
mylist.append(item)
array[i] = None
array[:] = list(filter(
lambda x: x is not None,
array
))
return mylist
Though you should clarify what you want to do with array variable as it is currently unclear.

array = [1,3,4,2,2,3,4]
def duplicates(array):
mylist = []
for item in array:
if item not in mylist:
mylist.append(item)
array.remove(item)
else:
array.remove(item)
return mylist
just remove the item that you don't append

You do not need to use a loop, it is much clearer to use a list comprehension
dups = list(set([l for l in array if array.count(l) > 1]))
However, the answer provided by kuco 23 does this appropriately with a loop.

A bit unclear what result you expect. If you want to get all unique values while maintaining order of occurrence, the canonical way to achieve this would be to use a collections.OrderedDict:
from collections import OrderedDict
def duplicates(array):
return list(OrderedDict.fromkeys(array))
>>> duplicates(array)
[1, 3, 4, 2]
If you want get a list of only duplicates, i.e. values that occur more than once, you could use a collections.Counter:
from collections import Counter
def duplicates(array):
return [k for k, v in Counter(array).items() if v > 1]
>>> duplicates(array)
[3, 4, 2]

Related

How to iterate the first index value twice before going to the next index position?

I'm trying to make a for loop that iterates each index twice before going to the next one, for example if I have the following list:
l = [1,2,3]
I would like to iterate it if it was in this way:
l = [1,1,2,2,3,3]
could someone help me with this problem please?

The most obvious thing would be a generator function that yields each item in the iterable twice:
def twice(arr):
for val in arr:
yield val
yield val
for x in twice([1, 2, 3]):
print(x)
If you need a list, then
l = list(twice([1, 2, 3]))

You could make a list comprehension that repeats the elements and flattens the result:
l = [1,2,3]
repeat = 2
[n for i in l for n in [i]*repeat]
# [1, 1, 2, 2, 3, 3]

You can solve this by using NumPy.
import numpy as np
l = np.repeat([1,2,3],2)
Repeat repeats elements of an array the specified number of times. This also returns a NumPy array. This can be converted back to a list if you wish with list(l). However, NumPy arrays act similar to lists so most of the time you don't need to change anything.

Unsurprisingly, more-itertools has that:
>>> from more_itertools import repeat_each
>>> for x in repeat_each([1, 2, 3]):
... print(x)
...
1
1
2
2
3
3
(It also has an optional second parameter for telling it how often to repeat each. For example repeat_each([1, 2, 3], 5). Default is 2.)

l = [1, 2, 3]
lst = []
for x in l:
for i in range(2):
lst.append(x)
print(lst)
# [1, 1, 2, 2, 3, 3]

How to get a list with only unique numbers? (Python)

I want to write a program that checks for duplicate values and removes them.
So for example I want from this:
list = [2,6,8,2,9,8,8,5,2,2]
To get only this:
uniques = [6,9,5]
I can't seem to find a good way to compare every item with each other to see if they are equal and them remove them. Any help will be very appreciated!

Use collections.Counter:
>>> from collections import Counter
>>> lst = [2,6,8,2,9,8,8,5,2,2]
>>> c = Counter(lst)
>>> [x for x in c if c[x] == 1]
[6, 9, 5]
Other than using count or in for each element, this should be O(n) instead of O(n²).

If you want to preserve the order of the elements too, follow this best method :
uniques = list(dict.fromkeys(list))
print(uniques)

Given that you want a list of values that appear only once in your list, this should work. I used a hashmap to store the count of values in the the list and then added the keys which have a count of 1 into unqiue.
l = [2,6,8,2,9,8,8,5,2,2]
unique = []
count = {}
for i in l:
if count.get(i) is None:
count[i] = 1
else:
count[i]+=1
for i in count.keys():
if count[i] == 1:
unique.append(i)
print(unique)
Output
[6, 9, 5]

def unique(x: list):
a = set()
for i in range(len(x)):
if x.index(x[i]) == (len(x) - 1) - x[::-1].index(x[i]):
a.add(x[i])
return list(a)
print(unique([1, 2, 1, 11, 4, 4, 6, 3, 2]))
# [3, 11, 6]

list = [2,6,8,2,9,8,8,5,2,2]
uniques = []
for i in range(len(list)):
if list.count(list[i])==1:
uniques.append(list[i])

list = [2,6,8,2,9,8,8,5,2,2]
uniques = []
for i in list:
if i not in uniques:
uniques.append(i)

What am I doing wrong? It shows me that the list index is out of range

I have a given list and I want to add all elements of that list with an odd index to the new list. Here is my code :
def odd_indices(lst):
new_lst = [ lst[i] for i in lst if (i + 2) % 2 != 0 ]
return new_lst
What am I doing wrong?

You might want to take the index instead of the value itself. The list lst contains values and not indices. enumerate helps generating indices. Do like below.
def odd_indices(lst):
new_lst = [ v for i, v in enumerate(lst) if i% 2 != 0 ]
return new_lst
print(odd_indices([1, 2, 4, 5]))

You could slice the list as follows:
lst[1::2]
The general syntax is:
list[start:end:step]

in the list comprehension[lst[i] for i in lst if (i + 2) % 2 != 0] you iterate over items in the list and try to obtain elements from the original list with the respective index i if i+2 is odd.
as mentioned in other replies you can use enumerate to access index and element.
or you can simply use slicing
>>> foo = [1, 2, 3, 4, 5]
>>> foo[1::2]
[2, 4]

Use enumerate to work on index
>>> lst = [0,1,2,3,4,5,6,7,8,9,10]
>>> [value for index, value in enumerate(lst) if index%2 != 0]
[1, 3, 5, 7, 9] # output odd index values
What is the mistake you have done?
Iterating through list.
In each iteration you will get the element of that list.
you are using that element as the index of the list.
EX:
lst = [1,2,3,4,5,6,7,8,9,10]
In above lst, the last element is 10.
lst[10] ---> which does not exists. # Throws out of index range error

Group repeated elements of a list

I am trying to create a function that receives a list and return another list with the repeated elements.
For example for the input A = [2,2,1,1,3,2] (the list is not sorted) and the function would return result = [[1,1], [2,2,2]]. The result doesn't need to be sorted.
I already did it in Wolfram Mathematica but now I have to translate it to python3, Mathematica has some functions like Select, Map and Split that makes it very simple without using long loops with a lot of instructions.

result = [[x] * A.count(x) for x in set(A) if A.count(x) > 1]

Simple approach:
def grpBySameConsecutiveItem(l):
rv= []
last = None
for elem in l:
if last == None:
last = [elem]
continue
if elem == last[0]:
last.append(elem)
continue
if len(last) > 1:
rv.append(last)
last = [elem]
return rv
print grpBySameConsecutiveItem([1,2,1,1,1,2,2,3,4,4,4,4,5,4])
Output:
[[1, 1, 1], [2, 2], [4, 4, 4, 4]]
You can sort your output afterwards if you want to have it sorted or sort your inputlist , then you wouldnt get consecutive identical numbers any longer though.
See this https://stackoverflow.com/a/4174955/7505395 for how to sort lists of lists depending on an index (just use 0) as all your inner lists are identical.
You could also use itertools - it hast things like TakeWhile - that looks much smarter if used
This will ignore consecutive ones, and just collect them all:
def grpByValue(lis):
d = {}
for key in lis:
if key in d:
d[key] += 1
else:
d[key] = 1
print(d)
rv = []
for k in d:
if (d[k]<2):
continue
rv.append([])
for n in range(0,d[k]):
rv[-1].append(k)
return rv
data = [1,2,1,1,1,2,2,3,4,4,4,4,5,4]
print grpByValue(data)
Output:
[[1, 1, 1, 1], [2, 2, 2], [4, 4, 4, 4, 4]]

You could do this with a list comprehension:
A = [1,1,1,2,2,3,3,3]
B = []
[B.append([n]*A.count(n)) for n in A if B.count([n]*A.count(n)) == 0]
outputs [[1,1,1],[2,2],[3,3,3]]
Or more pythonically:
A = [1,2,2,3,4,1,1,2,2,2,3,3,4,4,4]
B = []
for n in A:
if B.count([n]*A.count(n)) == 0:
B.append([n]*A.count(n))
outputs [[1,1,1],[2,2,2,2,2],[3,3,3],[4,4,4,4]]
Works with sorted or unsorted list, if you need to sort the list before hand you can do for n in sorted(A)

This is a job for Counter(). Iterating over each element, x, and checking A.count(x) has a O(N^2) complexity. Counter() will count how many times each element exists in your iterable in one pass and then you can generate your result by iterating over that dictionary.
>>> from collections import Counter
>>> A = [2,2,1,1,3,2]
>>> counts = Counter(A)
>>> result = [[key] * value for key, value in counts.items() if value > 1]
>>> result
[[2, 2, 2], [[1, 1]]

Nested List and count()

I want to get the number of times x appears in the nested list.
if the list is:
list = [1, 2, 1, 1, 4]
list.count(1)
>>3
This is OK. But if the list is:
list = [[1, 2, 3],[1, 1, 1]]
How can I get the number of times 1 appears? In this case, 4.

>>> L = [[1, 2, 3], [1, 1, 1]]
>>> sum(x.count(1) for x in L)
4

itertools and collections modules got just the stuff you need (flatten the nested lists with itertools.chain and count with collections.Counter
import itertools, collections
data = [[1,2,3],[1,1,1]]
counter = collections.Counter(itertools.chain(*data))
print counter[1]
Use a recursive flatten function instead of itertools.chain to flatten nested lists of arbitrarily level depth
import operator, collections
def flatten(lst):
return reduce(operator.iadd, (flatten(i) if isinstance(i, collections.Sequence) else [i] for i in lst))
reduce with operator.iadd has been used instead of sum so that the flattened is built only once and updated in-place

Here is yet another approach to flatten a nested sequence. Once the sequence is flattened it is an easy check to find count of items.
def flatten(seq, container=None):
if container is None:
container = []
for s in seq:
try:
iter(s) # check if it's iterable
except TypeError:
container.append(s)
else:
flatten(s, container)
return container
c = flatten([(1,2),(3,4),(5,[6,7,['a','b']]),['c','d',('e',['f','g','h'])]])
print(c)
print(c.count('g'))
d = flatten([[[1,(1,),((1,(1,))), [1,[1,[1,[1]]]], 1, [1, [1, (1,)]]]]])
print(d)
print(d.count(1))
The above code prints:
[1, 2, 3, 4, 5, 6, 7, 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
1
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
12

Try this:
reduce(lambda x,y: x+y,list,[]).count(1)
Basically, you start with an empty list [] and add each element of the list list to it. In this case the elements are lists themselves and you get a flattened list.
PS: Just got downvoted for a similar answer in another question!
PPS: Just got downvoted for this solution as well!

If there is only one level of nesting flattening can be done with this list comprenension:
>>> L = [[1,2,3],[1,1,1]]
>>> [ item for sublist in L for item in sublist ].count(1)
4
>>>

For the heck of it: count to any arbitrary nesting depth, handling tuples, lists and arguments:
hits = lambda num, *n: ((1 if e == num else 0)
for a in n
for e in (hits(num, *a) if isinstance(a, (tuple, list)) else (a,)))
lst = [[[1,(1,),((1,(1,))), [1,[1,[1,[1]]]], 1, [1, [1, (1,)]]]]]
print sum(hits(1, lst, 1, 1, 1))
15

def nested_count(lst, x):
return lst.count(x) + sum(
nested_count(l,x) for l in lst if isinstance(l,list))
This function returns the number of occurrences, plus the recursive nested count in all contained sub-lists.
>>> data = [[1,2,3],[1,1,[1,1]]]
>>> print nested_count(data, 1)
5

The following function will flatten lists of lists of any depth(a) by adding non-lists to the resultant output list, and recursively processing lists:
def flatten(listOrItem, result = None):
if result is None: result = [] # Ensure initial result empty.
if type(listOrItem) != type([]): # Handle non-list by appending.
result.append(listOrItem)
else:
for item in listOrItem: # Recursively handle each item in a list.
flatten(item, result)
return result # Return flattened container.
mylist = flatten([[1,2],[3,'a'],[5,[6,7,[8,9]]],[10,'a',[11,[12,13,14]]]])
print(f'Flat list is {mylist}, count of "a" is {mylist.count("a")}')
print(flatten(7))
Once you have a flattened list, it's a simple matter to use count on it.
The output of that code is:
Flat list is [1, 2, 3, 'a', 5, 6, 7, 8, 9, 10, 'a', 11, 12, 13, 14], count of "a" is 2
[7]
Note the behaviour if you don't pass an actual list, it assumes you want a list regardless, one containing just the single item.
If you don't want to construct a flattened list, you can just use a similar method to get the count of any item in the list of lists, with something like:
def deepCount(listOrItem, searchFor):
if type(listOrItem) != type([]): # Non-list, one only if equal.
return 1 if listOrItem == searchFor else 0
subCount = 0 # List, recursively collect each count.
for item in listOrItem:
subCount += deepCount(item, searchFor)
return subCount
deepList = [[1,2],[3,'a'],[5,[6,7,[8,9]]],[10,'a',[11,[12,13,14]]]]
print(f'Count of "a" is {deepCount(deepList, "a")}')
print(f'Count of 13 is {deepCount(deepList, 13)}')
print(f'Count of 99 is {deepCount(deepList, 99)}')
As expected, the output of this is:
Count of "a" is 2
Count of 13 is 1
Count of 99 is 0
(a) Up to the limits imposed by Python itself of course, limits you can increase by just adding this to the top of your code:
import sys
sys.setrecursionlimit(1001) # I believe default is 1000.
I mention that just in case you have some spectacularly deeply nested structures but you shouldn't really need it. If you're nesting that deeply then you're probably doing something wrong :-)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python: Duplicates in list - python

I think that using collections.Counter is more appropriate for this task: array = [1, 3, 4, 2, 2, 3, 4] from collections import Counter def duplicates(array): return [n for n, c in Counter(array).items() if c > 1] print(duplicates(array)) Output: [3, 4, 2]

array = [1,3,4,2,2,3,4] def duplicates(array): mylist = [] for item in array: if item not in mylist: mylist.append(item) array.remove(item) else: array.remove(item) return mylist just remove the item that you don't append

You do not need to use a loop, it is much clearer to use a list comprehension dups = list(set([l for l in array if array.count(l) > 1])) However, the answer provided by kuco 23 does this appropriately with a loop.

Related

How to iterate the first index value twice before going to the next index position?

How to get a list with only unique numbers? (Python)

What am I doing wrong? It shows me that the list index is out of range

Group repeated elements of a list

Nested List and count()

Categories

Resources