I would like to search an existing list for repeated numbers. If any number is repeated, set a variable's value to True and break out of the for loop.
list = [3, 5, 3]  # numbers in list
So if the function finds the same number twice, it should break out of the loop. In this case 3 is repeated.
How can I do that?
First, don't name your list list. That is a Python built-in, and using it as a variable name can give undesired side effects. Let's call it L instead.
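As a small illustration of the kind of side effect meant here, the sketch below shadows list inside a function and then tries to use the built-in. The function name shadowing_demo is made up for this example.

```python
def shadowing_demo():
    # Rebinding the name `list` hides the built-in type in this scope.
    list = [3, 5, 3]
    try:
        # This now tries to call the list *object*, not the built-in type.
        list("abc")
    except TypeError as exc:
        return str(exc)
    return "no error"

print(shadowing_demo())
```

The call fails with a TypeError because the name no longer refers to the type.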
You can solve your problem by comparing the list to a set version of itself.
Edit: You want true when there is a repeat, not the other way around. Code edited.
def testlist(L):
    return sorted(set(L)) != sorted(L)
You could look into sets. Loop through your list and either add each number to a supporting set or, if it is already there, break out of the loop.
>>> l = [3, 5, 3]
>>> s = set()
>>> s
set([])
>>> for x in l:
...     if x not in s:
...         s.add(x)
...     else:
...         break
You could also take a step further and make a function out of this code, returning the first duplicated number you find (or None if the list doesn't contain duplicates):
def get_first_duplicate(l):
    s = set()
    for x in l:
        if x not in s:
            s.add(x)
        else:
            return x
get_first_duplicate([3, 5, 3])
# returns 3
Otherwise, if you want to get a boolean answer to the question "does this list contain duplicates?", you can return it instead of the duplicate element:
def has_duplicates(l):
    s = set()
    for x in l:
        if x not in s:
            s.add(x)
        else:
            return True
    return False
has_duplicates([3, 5, 3])
# returns True
senderle pointed out:
there's an idiom that people sometimes use to compress this logic into a couple of lines. I don't necessarily recommend it, but it's worth knowing:
s = set(); has_dupe = any(x in s or s.add(x) for x in l)
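The idiom relies on set.add() always returning None, which is falsy. For an unseen x, x in s is False, so s.add(x) runs and the whole expression is still falsy. For a repeated x, x in s is True and any() stops there. A spelled-out sketch of the same logic, with an illustrative function name:

```python
def has_dupe(items):
    seen = set()
    # `x in seen` is checked first; only when it is False does `seen.add(x)` run.
    # set.add() returns None (falsy), so a first-time item never makes any() stop.
    return any(x in seen or seen.add(x) for x in items)

print(has_dupe([3, 5, 3]))
print(has_dupe([3, 5, 7]))
```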
You can use collections.Counter() and any():
>>> from collections import Counter
>>> lis=[3,5,3]
>>> c=Counter(lis)
>>> any(x>1 for x in c.values()) # True means yes some value is repeated
True
>>> lis=range(10)
>>> c=Counter(lis)
>>> any(x>1 for x in c.values()) # False means all values only appeared once
False
or use sets and match lengths:
In [5]: lis=[3,3,5]
In [6]: not (len(lis)==len(set(lis)))
Out[6]: True
In [7]: lis=range(10)
In [8]: not (len(lis)==len(set(lis)))
Out[8]: False
You should never give the name list to a variable - list is a type in Python, and you can give yourself all kinds of problems masking built-in names like that. Give it a descriptive name, like numbers.
That said ... you could use a set to keep track of which numbers you've already seen:
def first_double(seq):
    """Return the first item in seq that appears twice."""
    found = set()
    for item in seq:
        if item in found:
            return item
            # return will terminate the function, so no need for 'break'.
        else:
            found.add(item)
numbers = [3, 5, 3]
number = first_double(numbers)
Without an additional set (at the cost of rescanning the list for every element):
any(l.count(x) > 1 for x in l)
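The count() approach avoids building a set, but it takes quadratic time on long lists. If reordering the list is acceptable, one alternative (not from the answer above) is to sort in place and compare neighbours. This sketch assumes the items can be ordered against each other, and has_duplicates_sorted is just an illustrative name.

```python
def has_duplicates_sorted(items):
    # Sorts the caller's list in place instead of building an explicit set.
    items.sort()
    # After sorting, equal values sit next to each other.
    return any(items[i] == items[i + 1] for i in range(len(items) - 1))

numbers = [3, 5, 3]
print(has_duplicates_sorted(numbers))
```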
How would I check whether the first digits of all elements in a list are the same?
for i in range(0,len(lst)-1):
    if lst[i] == lst[i+1]:
        return True
I know that this checks whether a number is equal to the next number in the list, but I just want to compare the first digits.
You can use math.log10 and floor division to calculate the first digit. Then use all with a generator expression and zip to test adjacent elements sequentially:
from math import log10
def get_first(x):
    return x // 10**int(log10(x))
L = [12341, 1765, 1342534, 176845, 1]
res = all(get_first(i) == get_first(j) for i, j in zip(L, L[1:])) # True
For an explanation of how this construct works, see this related answer. You can apply the same logic via a regular for loop:
def check_first(L):
    for i, j in zip(L, L[1:]):
        if get_first(i) != get_first(j):
            return False
    return True
res = check_first(L) # True
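Because log10 works in floating point, it may be worth cross-checking the result for very large integers, and it does not accept 0 or negative input. A hedged alternative that stays in integer arithmetic is sketched below. The name first_digit is made up here, and taking the absolute value is an assumption about how negative numbers should be treated.

```python
def first_digit(n):
    # Work on the absolute value so that -123 is treated like 123 (an assumption).
    n = abs(n)
    # Strip the last digit until a single digit remains.
    while n >= 10:
        n //= 10
    return n

L = [12341, 1765, 1342534, 176845, 1]
res = all(first_digit(i) == first_digit(j) for i, j in zip(L, L[1:]))
print(res)
```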
Use all() with a generator expression over the first character(s) of your numbers:
>>> l = [1, 10, 123]
>>> all(str(x)[0] == str(l[0])[0] for x in l)
True
The list comprehension
>>> [str(x)[0] for x in l]
creates a list
['1', '1', '1']
which sounds as if this should be enough. But all processes boolean values, and the boolean value of a string is always True, except when the string is empty. That means that it would also consider ['1','2','3'] to be True. You need to add a comparison against a constant value -- I picked the first item from the original list:
>>> [str(x)[0] == str(l[0])[0] for x in l]
[True, True, True]
whereas a list such as [1,20,333] would show
['1', '2', '3']
and
[True, False, False]
You can adjust it for a larger number of digits as well:
>>> all(str(x)[:2] == str(l[0])[:2] for x in l)
False
>>> l = [12,123,1234]
>>> all(str(x)[:2] == str(l[0])[:2] for x in l)
True
You could do something like this:
lst = [12, 13, 14]
def all_equals(l):
    return len(set(e[0] for e in map(str, l))) == 1
print(all_equals(lst))
Output
True
Explanation
map(str, l) converts all elements of the list to strings, and the generator expression (e[0] for e in map(str, l)) takes the first character, i.e. the first digit, of each one. Feeding the generator into set removes all duplicates, so if the resulting set has length 1, every element starts with the same digit.
For a boolean predicate on a list like this, you want a solution that returns False as soon as a conflict is found -- solutions that convert the entire list just to find the first and second item didn't match aren't good algorithms. Here's one approach:
def all_same_first(a):
    return not a or all(map(lambda b, c=str(a[0])[0]: str(b)[0] == c, a[1:]))
Although at first glance this might appear to violate what I said above, the map function is lazy (in Python 3) and only hands the all function what it needs as it needs it. As soon as some element's initial digit doesn't match the first element's, the boolean result is returned and the rest of the list isn't processed.
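To make the short-circuiting visible, here is a small sketch that records which items are actually inspected. The function name and the examined list are only for illustration, and it assumes Python 3, where map() is lazy.

```python
def all_same_first_traced(values):
    examined = []  # records which items were actually inspected

    def first_char(v):
        examined.append(v)
        return str(v)[0]

    if not values:
        return True, examined
    target = str(values[0])[0]
    # In Python 3, map() yields results on demand, so all() can stop early.
    result = all(map(lambda v: first_char(v) == target, values[1:]))
    return result, examined

print(all_same_first_traced([12, 25, 13, 14]))
```

With this input only 25 is examined: the mismatch is found immediately, so 13 and 14 are never converted.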
Going back to your original code:
this checks for if the number before is equal to the next number in the list
for i in range(0,len(lst)-1):
    if lst[i] == lst[i+1]:
        return True
This doesn't work the way you claim: it returns True as soon as any one adjacent pair is equal. To check that every adjacent pair is equal, it would need to do:
for i in range(0, len(lst) - 1):
    if lst[i] != lst[i + 1]:
        return False
return True
Do you see the difference?
Hello, I just want to know what is wrong with my code.
This is a problem in the book 'Think Python' which asks you to write a function that returns True if the list has any duplicate elements, or False otherwise.
def has_duplicates(t):
    for i in t:
        if i in t.pop(t.index(i)):
            return True
    return False
What's wrong with it?
You remove elements from t while iterating over it. This prevents the iteration from working as you'd expect; generally the effect is to skip elements. Do not do this.
t.pop(t.index(i)) returns i (or a value equal to it), so whatever you're hoping to achieve with if i in, I don't think you will achieve it. For a list of numbers, i in <number> raises a TypeError.
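The skipping effect is easy to reproduce. The sketch below (function name made up for illustration) removes each element as it is visited and records which elements the loop actually reached.

```python
def visited_while_removing(items):
    items = list(items)  # copy so the caller's list is untouched
    visited = []
    for x in items:
        visited.append(x)
        items.remove(x)  # shrinking the list shifts later elements left
    return visited

print(visited_while_removing([1, 2, 3, 4]))
```

For [1, 2, 3, 4] the loop only ever sees 1 and 3, because each removal shifts the next element into the position the iterator has already passed.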
You can test it by comparing the length of the list to the length of the set created from that list, because set removes duplicates.
def has_duplicates(t):
    return len(t) > len(set(t))
def has_duplicates(t):
    no_dup = list(set(t))
    if len(t) > len(no_dup):
        return True
    return False
First of all, never tamper with a list while iterating over it.
Second, your code checks whether i is in the item you're popping. Since popping an item returns that item, you are effectively testing i against itself, which tells you nothing about duplicates.
Try your function with t = ['1', '2', '3'] and use a debugger to verify what I'm saying.
You could do the following since you can't use sets:
def has_duplicates(t):
    # work on a copy so the caller's list is not modified
    remaining = list(t)
    for i in t:
        # first remove one occurrence of the item
        remaining.remove(i)
        # then check whether the item still exists
        if i in remaining:
            return True
    return False
Well, with a generator expression (thanks to Steve Jessop):
def has_duplicates(t):
    return any(t.count(i)>1 for i in t)
lst = [1, 2, 3, 4, 5, 6, 1, 5 ,6]
print(has_duplicates(lst))
Result:
True
Why does this function result in:
"Oops, try again. remove_duplicates([]) resulted in an error: list index out of range"?
My function is as follows:
def remove_duplicates(listIn):
    listSorted = sorted(listIn)
    prevInt = listSorted[0]
    listOut = []
    listOut.append(prevInt)
    for x in listSorted:
        if (x != prevInt):
            prevInt = x
            listOut.append(x)
    print listOut
    return listOut
remove_duplicates([1,2,3,3,3])
which outputs:
[1, 2, 3]
None
Thank you.
To answer your question: listSorted[0] fails on an empty list, so you just need to check whether the list is empty and return early:
if not listIn:
    return []
However, your whole approach could be simplified with a set, like so:
def remove_duplicates(listIn):
    return list(set(listIn))
Output:
>>> print(remove_duplicates([1,2,3,3,3]))
[1, 2, 3]
Of course, this assumes you want to keep your data in a list. If you don't care, remove the outer list() conversion to make it faster. Either way, this should be much faster than what you have written, although a set does not guarantee that the original order is kept.
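If the order of first appearance matters, a set on its own will not preserve it. One possible sketch, assuming Python 3.7 or later (where dicts keep insertion order) and hashable items, is shown below. The function name is made up for this example.

```python
def remove_duplicates_ordered(list_in):
    # dict keys are unique and, from Python 3.7 on, keep insertion order.
    return list(dict.fromkeys(list_in))

print(remove_duplicates_ordered([3, 1, 3, 2, 1]))
print(remove_duplicates_ordered([]))
```

This also handles the empty list without a special case.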
I want to check that there are no identical entries in a list of lists. If there are no identical matches, return True, otherwise False.
For example:
[[1],[1,2],[1,2,3]] # False
[[1,2,3],[10,20,30]] # True
I am thinking of combining all of the entries into one list,
for example: change [[1,2,3],[4,5,6]] into [1,2,3,4,5,6] and then check.
Thanks for editing the question and helping me!
>>> def flat_unique(list_of_lists):
...     flat = [element for sublist in list_of_lists for element in sublist]
...     return len(flat) == len(set(flat))
...
>>> flat_unique([[1],[1,2],[1,2,3]])
False
>>> flat_unique([[1,2,3],[10,20,30]])
True
We can use itertools.chain.from_iterable and the set built-in function.
import itertools
def check_iden(data):
    # flatten once, then compare the length with the de-duplicated length
    flat = list(itertools.chain.from_iterable(data))
    return len(flat) == len(set(flat))
data1 = [[1],[1,2],[1,2,3]]
data2 = [[1,2,3],[10,20,30]]
print(check_iden(data1))
print(check_iden(data2))
Returns
False
True
You could use sets, which have intersection methods, to find which elements are common.
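A rough sketch of the intersection idea, under the assumption that only overlaps between different sublists matter (a value repeated inside a single sublist would not be detected). The function name is illustrative.

```python
def sublists_disjoint(list_of_lists):
    seen = set()  # everything found in earlier sublists
    for sub in list_of_lists:
        current = set(sub)
        # A non-empty intersection means some element already appeared.
        if seen & current:
            return False
        seen |= current
    return True

print(sublists_disjoint([[1], [1, 2], [1, 2, 3]]))
print(sublists_disjoint([[1, 2, 3], [10, 20, 30]]))
```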
Place all elements of each sublist into a separate list. If that separate list has any duplicates (call set() to find out), then return False. Otherwise return True.
def identical(x):
    newX = []
    for i in x:
        for j in i:
            newX.append(j)
    if len(newX) == len(set(newX)): # if newX has any duplicates, the len(set(newX)) will be less than len(newX)
        return True
    return False
I think you can flatten the list and count the elements in it, then compare that count with the size of set():
import itertools
a = [[1],[1,2],[1,2,3]]
b = [[1,2,3],[10,20,30]]
def check(l):
    merged = list(itertools.chain.from_iterable(l))
    if len(set(merged)) < len(merged):
        return False
    else:
        return True
print(check(a)) # False
print(check(b)) # True
Depending on your data, you might not want to look at all the elements. Here is a solution that returns False as soon as it hits the first duplicate.
def all_unique(my_lists):
    my_set = set()
    for sub_list in my_lists:
        for ele in sub_list:
            if ele in my_set:
                return False
            my_set.add(ele)
    return True
Result:
In [12]: all_unique([[1,2,3],[10,20,30]])
Out[12]: True
In [13]: all_unique([[1],[1,2],[1,2,3]])
Out[13]: False
With this method, the boolean variable same becomes True if any number occurs more than once in your list, because .count() returns how many times a given number was found in the list.
li = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
same = False
for nbr in li:
    if li.count(nbr) > 1:
        same = True
        break  # no need to keep looking once a repeat is found
I've seen a lot of variations of this question from things as simple as remove duplicates to finding and listing duplicates. Even trying to take bits and pieces of these examples does not get me my result.
My question is how am I able to check if my list has a duplicate entry? Even better, does my list have a non-zero duplicate?
I've had a few ideas -
#empty list
myList = [None] * 9
#all the elements in this list are None
#fill part of the list with some values
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3
#coming from C, I attempt to use a nested for loop
j = 0
k = 0
for j in range(len(myList)):
    for k in range(len(myList)):
        if myList[j] == myList[k]:
            print "found a duplicate!"
            return
If this worked, it would find the duplicate (None) in the list. Is there a way to ignore the None or 0 case? I do not care if two elements are 0.
Another solution I thought of was turn the list into a set and compare the lengths of the set and list to determine if there is a duplicate but when running set(myList) it not only removes duplicates, it orders it as well. I could have separate copies, but it seems redundant.
Try changing the actual comparison line to this, which also skips comparing an element with itself:
if j != k and myList[j] == myList[k] and myList[j] not in [None, 0]:
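Put together as a function, the nested-loop idea might look like the sketch below. It starts the inner loop at j + 1 so that an element is never compared with itself, and it skips None and 0 entries. The function name is made up for this example.

```python
def has_nonzero_duplicate(values):
    n = len(values)
    for j in range(n):
        # Ignore placeholder entries entirely.
        if values[j] in (None, 0):
            continue
        # Only look at later positions, so j and k are never the same index.
        for k in range(j + 1, n):
            if values[j] == values[k]:
                return True
    return False

my_list = [1, None, None, 2, 2, 4, None, 3, None]
print(has_nonzero_duplicate(my_list))
```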
I'm not certain if you are trying to ascertain whether a duplicate exists, or to identify the items that are duplicated (if any). Here is a Counter-based solution for the latter:
# Python 2.7
from collections import Counter
#
# Rest of your code
#
counter = Counter(myList)
dupes = [key for (key, value) in counter.iteritems() if value > 1 and key]
print dupes
The Counter object will automatically count occurrences of each item in your iterable. The list comprehension that builds dupes filters out all items appearing only once, and also items whose boolean evaluation is False (this filters out both 0 and None).
If your purpose is only to identify that duplication has taken place (without enumerating which items were duplicated), you could use the same method and test dupes:
if dupes: print "Something in the list is duplicated"
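For readers on Python 3, where iteritems() no longer exists, roughly the same approach could be written as follows. This is a sketch rather than the answer's original code, and nonzero_dupes is an illustrative name.

```python
from collections import Counter

def nonzero_dupes(values):
    counter = Counter(values)
    # Keep keys seen more than once, skipping falsy keys such as 0 and None.
    return [key for key, count in counter.items() if count > 1 and key]

print(nonzero_dupes([1, None, None, 2, 2, 4, None, 3, None]))
```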
If you simply want to check whether the list contains duplicates, this function returns 'Duplicate' as soon as it finds an element that occurs more than once.
my_list = [1, 2, 2, 3, 4]
def check_list(arg):
    for i in arg:
        if arg.count(i) > 1:
            return 'Duplicate'
print(check_list(my_list) == 'Duplicate') # prints True
To remove duplicates and keep order while leaving 0 and None alone (if you have other falsy values that you want deduplicated, you will need to test is not None and != 0 explicitly):
print([ele for ind, ele in enumerate(lst) if ele not in lst[:ind] or not ele])
If you just want the first dup:
for ind, ele in enumerate(lst[:-1]):
    if ele in lst[ind+1:] and ele:
        print(ele)
        break
Or store seen in a set:
seen = set()
for ele in lst:
    if ele in seen:
        print(ele)
        break
    if ele:
        seen.add(ele)
You can use collections.defaultdict and specify a condition, such as non-zero / Truthy, and specify a threshold. If the count for a particular value exceeds the threshold, the function will return that value. If no such value exists, the function returns False.
from collections import defaultdict
def check_duplicates(it, condition, thresh):
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False
L = [1, None, None, 2, 2, 4, None, 3, None]
res = check_duplicates(L, condition=bool, thresh=1) # 2
Note in the above example the function bool will not consider 0 or None for threshold breaches. You could also use, for example, lambda x: x != 1 to exclude values equal to 1.
In my opinion, this is the simplest solution I could come up with, and it should work with any list. The only downside is that it does not count the number of duplicates; it just returns True or False.
from itertools import combinations
def has_duplicates(mylist):
    return any(k == j for k, j in combinations(mylist, 2))
Here's a bit of code that shows how to remove None and 0 from the set.
l1 = [0, 1, 1, 2, 4, 7, None, None]
l2 = set(l1)
l2.remove(None)
l2.remove(0)
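Note that set.remove() raises KeyError when the value is not present. If the list might not contain None or 0, a set difference is one way to sidestep that. The sketch below uses a made-up function name.

```python
def without_none_and_zero(values):
    # Set difference drops the unwanted members in one step and,
    # unlike set.remove(), does not raise KeyError if one is missing.
    return set(values) - {None, 0}

print(without_none_and_zero([0, 1, 1, 2, 4, 7, None, None]))
```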