Why does this function result in the following?
"Oops, try again. remove_duplicates([]) resulted in an error: list index out of range"
My function is as follows:
def remove_duplicates(listIn):
    listSorted = sorted(listIn)
    prevInt = listSorted[0]
    listOut = []
    listOut.append(prevInt)
    for x in listSorted:
        if (x != prevInt):
            prevInt = x
            listOut.append(x)
    print listOut
    return listOut

remove_duplicates([1,2,3,3,3])
which outputs:
[1, 2, 3]
None
Thank you.
To answer your question: listSorted[0] fails when the list is empty, so check for an empty list first and return early:
    if not listIn:
        return []
However, your whole approach could be simplified with a set, like so:
def remove_duplicates(listIn):
    return list(set(listIn))
Output:
>>> print remove_duplicates([1,2,3,3,3])
[1, 2, 3]
Of course, this assumes you want to keep your data in a list. If you don't care, remove the outer list() conversion to make it faster. Either way, this should be a faster approach than what you have written.
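One caveat with the set approach is that a set does not keep the original order of the items. A minimal sketch, assuming Python 3.7+ (where dict preserves insertion order), that keeps the first occurrence of each item in order and also handles the empty list; the name remove_duplicates_ordered is hypothetical:

```python
def remove_duplicates_ordered(list_in):
    # dict keys are unique and (in Python 3.7+) keep insertion order,
    # so this drops repeats while preserving first-seen order.
    return list(dict.fromkeys(list_in))


print(remove_duplicates_ordered([3, 1, 3, 2, 1]))  # [3, 1, 2]
print(remove_duplicates_ordered([]))               # []
```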
def permute(nums):
    result = []
    get_permute([], nums, result)
    return result

def get_permute(current, num, result):
    if not num:
        result.append(current+[])
    for i, v in enumerate(num):
        current.append(num[i])
        get_permute(current, num[:i] + num[i + 1:], result)
        current.pop()

if __name__ == "__main__":
    r = permute([1,2,3])
    for perm in r:
        print(perm)
What does current + [] do in result.append(current+[])? If I remove the +[], it prints blank lists.
It's making a copy of the list. When you remove it, you run into the "List of lists changes reflected across sublists unexpectedly" problem, because the outer list contains many references to the same list, instead of references to many different lists.
You should be able to replace it with current.copy() (using Python >= 3.3) or list(current) to avoid similar confusion among future readers. (There are a lot of ways to copy a list. Pick the one you like and stick with it.)
What does the + [] do?
It generates a new list with the same contents as the old list.
>>> x = [1]
>>> id(x) == id(x + [])
False
>>> x == x + []
True
Why do I need this?
Without adding copies to your result, you would have the same list many times in your result, and every time you update that list, it affects your result.
>>> x = [1, 2]
>>> result = []
>>> result.append(x)
>>> x.append(3)
>>> result.append(x)
>>> result
[[1, 2, 3], [1, 2, 3]]
Some possible ways to make the code more readable would be
result.append(current[:])
or
result.append(list(current))
Why does it return blank lists if you remove the + []?
Because if you do not append copies to the result, the result contains just one list, repeated multiple times. You call .append(num[i]) on this list exactly as often as .pop(), so that list ends up empty.
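The effect is easy to reproduce in isolation. The sketch below is not the code from the question; it is a stripped-down, hypothetical collect helper that mimics the append/pop pattern, once storing a copy and once storing the list itself:

```python
def collect(copy_items):
    current, result = [], []
    for value in [1, 2, 3]:
        current.append(value)
        # Store either a snapshot or a reference to the same list object.
        result.append(list(current) if copy_items else current)
    # Undo every append, as the recursion does with current.pop().
    while current:
        current.pop()
    return result


print(collect(copy_items=True))   # [[1], [1, 2], [1, 2, 3]]
print(collect(copy_items=False))  # [[], [], []]
```

With copy_items=False, every entry in result is the same object as current, so emptying current empties all of them at once.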
Given some function that can return None or another value and a list of values:
def fn(x):
    if x == 4:
        return None
    else:
        return x

lst = [1, 2, 3, 4, 5, 6, 7]
I want a list of the outputs of fn() that are not None.
I have some working code that looks like this:
output = []
for i in lst:
    result = fn(i)
    if result:
        output.append(result)
I can express this as a list comprehension like so:
output = [fn(i) for i in lst if fn(i)]
but it runs fn(i) twice for anything that doesn't return None, which is not desirable if fn is an expensive function.
Is there any way to have the nice Pythonic comprehension without running the function twice?
A possible solution:
output = [fn(i) for i in lst]
output = [o for o in output if o]
Your problem is that None is produced by the function, not the thing you are iterating over. Try
output = [fi for fi in map(fn, lst) if fi is not None]
Just combine your solutions:
[x for x in [fn(i) for i in lst] if x is not None]
The downside is that any Nones in the original list not produced by fn will be removed as well.
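To check that fn really runs only once per element, a small sketch with a call counter may help. The calls list is there only for illustration and is not part of the question; a generator expression replaces the inner list so that no intermediate list is built:

```python
calls = []


def fn(x):
    calls.append(x)  # record each call, for illustration only
    return None if x == 4 else x


lst = [1, 2, 3, 4, 5, 6, 7]

# The inner generator calls fn once per item; the outer comprehension filters.
output = [y for y in (fn(i) for i in lst) if y is not None]

print(output)      # [1, 2, 3, 5, 6, 7]
print(len(calls))  # 7
```

On Python 3.8+, an assignment expression should give the same result: [y for i in lst if (y := fn(i)) is not None].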
I have a doctest written:
def extract_second(triples):
    """Given a list of triples, return a list with the second element of each triple
    If an item is not a triple, return None for that element
    >>> extract_second([('a',3,'x'),('b',4,'y')])
    [3, 4]
    >>> extract_second([('c',5,'z'),('d',6)])
    [5, None]
    >>> extract_second([('a',3,'x'),('b',4,'y')]) == [3, 4]
    True
    """
    for x in triples:
        return x[1]
However, the code is not returning the element at index 1 of the second tuple in the input. Any ideas?
You only get a single value back because the return statement exits the function, and with it the for loop, as soon as it is reached. It is equivalent to something like this:
for x in triples:
    break
return x[1]
A solution to your problem would be:
def extract_second(triples):
    results = []
    for i in triples:
        if len(i) == 3:
            results.append(i[1])
        else:
            results.append(None)
    return results
This solution appends each answer to a list called results, and returns results after it has finished iterating through triples.
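The same logic also fits in a list comprehension with a conditional expression. A sketch, which should satisfy the doctests from the question:

```python
def extract_second(triples):
    """Return the second element of each triple, or None for non-triples."""
    return [t[1] if len(t) == 3 else None for t in triples]


print(extract_second([('a', 3, 'x'), ('b', 4, 'y')]))  # [3, 4]
print(extract_second([('c', 5, 'z'), ('d', 6)]))       # [5, None]
```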
I need to create a function that returns the second smallest unique number, which means if
list1 = [5,4,3,2,2,1], I need to return 3, because 2 is not unique.
I've tried:
def second(list1):
    result = sorted(list1)[1]
    return result
and
def second(list1):
    result = list(set((list1)))
    return result
but they all return 2.
EDIT1:
Thanks guys! I got it working using this final code:
def second(list1):
    b = [i for i in list1 if list1.count(i) == 1]
    b.sort()
    result = sorted(b)[1]
    return result
EDIT 2:
Okay guys... really confused. My Prof just told me that if list1 = [1,1,2,3,4], it should return 2 because 2 is still the second smallest number, and if list1 = [1,2,2,3,4], it should return 3.
The code in EDIT1 won't work if list1 = [1,1,2,3,4].
I think I need to do something like:
If the duplicated number is at position list1[0], remove all duplicates and return the second number.
Otherwise, if the duplicated number is not at list1[0], just use the code in EDIT1.
Without using anything fancy, why not just get a list of uniques, sort it, and get the second list item?
a = [5,4,3,2,2,1] #second smallest is 3
b = [i for i in a if a.count(i) == 1]
b.sort()
>>> b[1]
3
a = [5,4,4,3,3,2,2,1] #second smallest is 5
b = [i for i in a if a.count(i) == 1]
b.sort()
>>> b[1]
5
Obviously you should test that your list has at least two unique numbers in it. In other words, make sure b has a length of at least 2.
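That length check could look like the sketch below. It counts with collections.Counter instead of list.count and returns None when there are fewer than two unique values. Returning None is an assumption here (raising an error would be equally reasonable), and the name second_smallest_unique is hypothetical:

```python
from collections import Counter


def second_smallest_unique(numbers):
    counts = Counter(numbers)
    # Keep only values that occur exactly once, in ascending order.
    uniques = sorted(n for n, c in counts.items() if c == 1)
    if len(uniques) < 2:
        return None  # assumption: signal "no answer" with None
    return uniques[1]


print(second_smallest_unique([5, 4, 3, 2, 2, 1]))        # 3
print(second_smallest_unique([5, 4, 4, 3, 3, 2, 2, 1]))  # 5
print(second_smallest_unique([2, 2, 1]))                 # None
```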
Remove non-unique elements: use sort/itertools.groupby or collections.Counter.
(Retracted: use min, which is O(n), to determine the minimum instead of sort, which is O(n log n). In any case, if you are using groupby the data is already sorted.) I missed the fact that the OP wanted the second minimum, so sorting is still the better option here.
Sample Code
Using Counter
>>> from collections import Counter
>>> list1 = [5,4,3,2,2,1]
>>> sorted(k for k, v in Counter(list1).items() if v == 1)[1]
3
Using Itertools
>>> from itertools import groupby
>>> sorted(k for k, g in groupby(sorted(list1)) if len(list(g)) == 1)[1]
3
Here's a fancier approach that doesn't use count (which means it should have significantly better performance on large datasets).
from collections import defaultdict

def getUnique(data):
    dd = defaultdict(lambda: 0)
    for value in data:
        dd[value] += 1
    result = [key for key in dd.keys() if dd[key] == 1]
    result.sort()
    return result

a = [5,4,3,2,2,1]
b = getUnique(a)
print(b)
# [1, 3, 4, 5]
print(b[1])
# 3
Okay guys! I got working code thanks to all your help, which got me thinking on the right track. This code works:
def second(list1):
    if len(list1) != len(set(list1)):
        result = sorted(list1)[2]
        return result
    elif len(list1) == len(set(list1)):
        result = sorted(list1)[1]
        return result
Okay, using set() on the list is not going to help here. It does not purge the duplicated elements; it keeps one copy of each. What I mean is:
l1=[5,4,3,2,2,1]
print set(l1)
Prints
set([1, 2, 3, 4, 5])
Here the values become unique, but the duplicated element 2 is still present.
In your example you want to remove all duplicated elements.
Try something like this.
l1=[5,4,3,2,2,1]
newlist=[]
for i in l1:
    if l1.count(i)==1:
        newlist.append(i)
print newlist
In this example, this prints
[5, 4, 3, 1]
Then you can use heapq to get the second smallest number in your list, like this:
import heapq
print heapq.nsmallest(2, newlist)[-1]
The above snippet prints 3 for you.
This should do the trick. Cheers!
I would like to search for repeated numbers in an existing list. If one of the numbers is repeated, set a variable's value to true and break out of the for loop.
list = [3, 5, 3]  # numbers in list
So if the function finds two identical numbers, it should break out of the loop; in this case 3 is repeated.
How can I do that?
First, don't name your list list. That is a Python built-in, and using it as a variable name can give undesired side effects. Let's call it L instead.
You can solve your problem by comparing the list to a set version of itself.
Edit: You want true when there is a repeat, not the other way around. Code edited.
def testlist(L):
    return sorted(set(L)) != sorted(L)
You could look into sets. You loop through your list, and either add the number to a support set, or break out the loop.
>>> l = [3, 5, 3]
>>> s = set()
>>> s
set([])
>>> for x in l:
...     if x not in s:
...         s.add(x)
...     else:
...         break
You could also take a step further and make a function out of this code, returning the first duplicated number you find (or None if the list doesn't contain duplicates):
def get_first_duplicate(l):
    s = set()
    for x in l:
        if x not in s:
            s.add(x)
        else:
            return x

get_first_duplicate([3, 5, 3])
# returns 3
Otherwise, if you want to get a boolean answer to the question "does this list contain duplicates?", you can return it instead of the duplicate element:
def has_duplicates(l):
    s = set()
    for x in l:
        if x not in s:
            s.add(x)
        else:
            return True
    return False

has_duplicates([3, 5, 3])
# returns True
senderle pointed out:
there's an idiom that people sometimes use to compress this logic into a couple of lines. I don't necessarily recommend it, but it's worth knowing:
s = set(); has_dupe = any(x in s or s.add(x) for x in l)
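The idiom works because set.add returns None, so x in s or s.add(x) is truthy only when x has already been seen, and any() stops at the first such item. A sketch wrapping it in a function (the name has_dupe is borrowed from the variable above; the function itself is hypothetical):

```python
def has_dupe(items):
    seen = set()
    # seen.add(x) returns None (falsy), so the expression is True
    # only when x is already in seen; any() stops at the first hit.
    return any(x in seen or seen.add(x) for x in items)


print(has_dupe([3, 5, 3]))  # True
print(has_dupe([3, 5, 7]))  # False
```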
You can use collections.Counter() and any():
>>> from collections import Counter
>>> lis=[3,5,3]
>>> c=Counter(lis)
>>> any(x>1 for x in c.values()) # True means yes some value is repeated
True
>>> lis=range(10)
>>> c=Counter(lis)
>>> any(x>1 for x in c.values()) # False means all values only appeared once
False
or use sets and match lengths:
In [5]: lis=[3,3,5]
In [6]: not (len(lis)==len(set(lis)))
Out[6]: True
In [7]: lis=range(10)
In [8]: not (len(lis)==len(set(lis)))
Out[8]: False
You should never give the name list to a variable - list is a type in Python, and you can give yourself all kinds of problems masking built-in names like that. Give it a descriptive name, like numbers.
That said ... you could use a set to keep track of which numbers you've already seen:
def first_double(seq):
    """Return the first item in seq that appears twice."""
    found = set()
    for item in seq:
        if item in found:
            return item
            # return will terminate the function, so no need for 'break'.
        else:
            found.add(item)

numbers = [3, 5, 3]
number = first_double(numbers)
Without additional memory (at the cost of quadratic time, since count() rescans the list for every element):
any(l.count(x) > 1 for x in l)