I have a function that, given a list and a specific string in that list, removes any duplicates of that string from the list. (find_start and find_end are separate functions that return the first and last position of a given string.)
def remove_duplicates(sorted_list, item):
    i = 0
    real_list = []
    for x in range(len(sorted_list)-1):
        if(sorted_list[i] == item):
            a = find_start(sorted_list, item)
            b = find_end(sorted_list, item)
            real_list = real_list + [item]
            i = i+(b-a)
        else:
            real_list = real_list + [sorted_list[i]]
        i+=1
    return real_list
So for example, remove_duplicates(['a','a','b','b','c','c'], 'a') would return ['a','b','b','c','c']
I'm trying to define another function that uses this function in it for each iteration, like so
def remove_all_duplicates(sorted_list):
    i = 0
    list_tru = []
    for x in range(len(sorted_list)):
        list_tru = remove_duplicates(sorted_list, sorted_list[i])
        i+=1
    return list_tru
But if I call remove_all_duplicates(['a','a','b','b','c','c']), it outputs ['a','a','b','b','c']. What am I doing wrong?
def remove_all_duplicates(L):
    # NOTE: this modifies L IN-PLACE. Tread carefully
    i = 1
    while i < len(L):
        if L[i] == L[i-1]:
            del L[i]
            continue
        i += 1
Usage:
In [88]: L = ['a','a','b','b','c','c']
In [89]: remove_all_duplicates(L)
In [90]: L
Out[90]: ['a', 'b', 'c']
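If mutating the caller's list is a concern, a non-mutating variant can be sketched with itertools.groupby, which groups consecutive equal items. The name dedupe_sorted is made up for this example and is not part of the answer above; like the in-place version, it only assumes that equal items sit next to each other (as they do in a sorted list).

```python
from itertools import groupby

def dedupe_sorted(items):
    """Return a new list with each run of equal adjacent items collapsed to one."""
    # groupby yields one (key, group) pair per run of consecutive equal items
    return [key for key, _group in groupby(items)]

original = ['a', 'a', 'b', 'b', 'c', 'c']
result = dedupe_sorted(original)
print(result)    # ['a', 'b', 'c']
print(original)  # ['a', 'a', 'b', 'b', 'c', 'c'] (left untouched)
```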
With every iteration, you just keep going back to the original sorted_list. I would recommend copying it and then operating on that copy:
def remove_all_duplicates(sorted_list):
    list_tru = sorted_list[:]  # copy it
    for x in set(sorted_list):  # just use a set
        list_tru = remove_duplicates(list_tru, x)  # remove this character from your list
    return list_tru
I've also turned the sorted list into a set so that you don't try to remove duplicates of the same letter multiple times, and removed the unnecessary i counter.
Of course, if all you really want to do is remove the duplicates from a sorted list of strings and you're not attached to the algorithm you're developing, that's particularly simple:
new_list = sorted(set(old_list))
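For illustration, here is that one-liner applied to the list from the question. set() drops the repeats (in no particular order) and sorted() builds a new, ordered list from what is left, so this works even if old_list was not sorted to begin with. It does assume the items are hashable and can be compared with each other.

```python
old_list = ['a', 'a', 'b', 'b', 'c', 'c']

# set() removes duplicates; sorted() returns a new sorted list
new_list = sorted(set(old_list))

print(new_list)  # ['a', 'b', 'c']
print(old_list)  # the original list is not modified
```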
def remove_duplicates(sorted_list):
    for item in sorted_list:
        hits = sorted_list.count(item)
        while hits > 1:
            sorted_list.remove(item)
            hits = sorted_list.count(item)
    return sorted_list
print(remove_duplicates(["a","a", "b", "b"]))
This is the simplest method I could come up with on the spot. It uses .count to tell whether there are duplicates, and the call above returns ["a", "b"].
You can use this too:
def remove_duplicates(A):
    B = []  # empty list that collects the unique items
    for item in A:
        if item not in B:
            B.append(item)  # keep only the first occurrence of each item
    return B

A = ['a','a','b','c','c']  # example of input list with duplicates
value = remove_duplicates(A)  # pass the list to the function
print(value)  # prints ['a', 'b', 'c']
Hope that this helps. Have a nice day.
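One thing to keep in mind: checking whether an item is already in a list scans that list every time, so this approach does quadratic work on long inputs. A sketch of the same idea that remembers seen items in a set, assuming the items are hashable (the name remove_duplicates_fast is made up for this example):

```python
def remove_duplicates_fast(items):
    """Return a new list without duplicates, keeping first occurrences in order."""
    seen = set()   # items already emitted; membership tests are O(1) on average
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(remove_duplicates_fast(['a', 'a', 'b', 'c', 'c']))  # ['a', 'b', 'c']
```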
How do I create a function that adds a value to an initially empty list, but if the value already exists in that list, it can't be added and has to go to another list instead? I tried this, but it doesn't take into account whether a value other than x is already in the list:
myUniqueList = []
leftover = []
x = 2
def add(x):
    if myUniqueList[0:] == []:
        myUniqueList.append(x)
        print("True")
        print(myUniqueList)
    elif myUniqueList[0:] == [x]:
        leftover.append(x)
        print("False")
        print(leftover)
add(x)
This may work. By using the if x not in syntax, we can easily check to see if the number is in the existing list.
If it is not in the list, we then append it. If it is in the list, we append it to the leftover list.
mylist = []
leftover = []
x = 2
def add(x):
    if x not in mylist:
        mylist.append(x)
        print('True')
        print(mylist)
    else:
        leftover.append(x)
        print('False')
        print(leftover)
add(x)
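If you would rather not rely on module-level lists, one possible variation passes both lists in and returns whether the value was added. The parameter names and the loop at the bottom are my own choices for this sketch, not something from the question:

```python
def add(value, unique_values, leftover):
    """Append value to unique_values if it is new, otherwise to leftover.

    Returns True if the value was new, False if it was a repeat.
    """
    if value not in unique_values:
        unique_values.append(value)
        return True
    leftover.append(value)
    return False

unique_values, leftover = [], []
for v in [2, 3, 2, 2]:
    print(v, add(v, unique_values, leftover))

print(unique_values)  # [2, 3]
print(leftover)       # [2, 2]
```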
Set vs List:
There are some performance ramifications if you are using massive sequences of numbers. In that case, storing the numbers in a Python set() will greatly speed up the lookups that check whether a number already exists. A set is hash-based, so a membership test takes roughly constant time on average, while a list has to be scanned element by element.
myset = set()
leftover = []
x = 2
def add(x):
    if x not in myset:
        myset.add(x)  # sets use .add() instead of .append()
        print('True')
        print(myset)
        print(list(myset))  # IF you need to print out a list, you can
                            # convert the set to a list by wrapping
                            # myset in the list() constructor,
                            # BUT if you do this every cycle you will
                            # probably start to lose the performance benefits.
    else:
        leftover.append(x)
        print('False')
        print(leftover)
add(x)
myUniqueList = []
leftover = []
x = 2
def add(x):
    if x in myUniqueList:
        leftover.append(x)
        print("False")
        print(leftover)
    else:
        myUniqueList.append(x)
        print("True")
        print(myUniqueList)
add(x)
My code:
def shorter(lst):
    if len(lst) == 0:
        return []
    if lst[0] in lst[1:]:
        lst.remove(lst[0])
    shorter(lst[1:])
    return lst
print shorter(["c","g",1,"t",1])
Why does it print ["c","g",1,"t",1] instead of ["c","g","t",1]?
For a recursive method, you can check one specific index of the list on each call, much as you already do. If we remove the current element, we want to stay at the same index; otherwise we want to increase the index by one. The base case is when we are looking at or beyond the last element of the list, since the last element never needs to be checked.
def shorter(lst, ind=0):
    if ind >= len(lst)-1:  # Base Case
        return lst
    if lst[ind] in lst[ind+1:]:
        lst.pop(ind)
        return shorter(lst, ind)
    return shorter(lst, ind+1)
# Stuff to test the function
import random
x = [random.randint(1,10) for i in range(20)]
print(x)
x = shorter(x)
print(x)
Another way to solve this in a single line is to convert the list into a set and then back into a list. Sets have only unique values, so we can use that property to remove any repeating elements.
import random
x = [random.randint(1,10) for i in range(20)]
print(x)
x = list(set(x)) #Converts to set and back to list
print(x)
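One caveat: a set does not keep the original order of the elements, so list(set(x)) may come back in a different order. If order matters, a common alternative is dict.fromkeys, which keeps the first occurrence of each value (this sketch assumes Python 3.7 or later, where dicts preserve insertion order):

```python
x = ["c", "g", 1, "t", 1]

# dict keys are unique and, on Python 3.7+, remember insertion order
unique_in_order = list(dict.fromkeys(x))

print(unique_in_order)  # ['c', 'g', 1, 't']
```

Note that this keeps the first 1, whereas the recursive version above keeps the last occurrence of a repeated value.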
A possible recursive solution could be:
def shorter(lst):
    if lst:
        if lst[0] in lst[1:]:
            prefix = []  # Skip repeated item.
        else:
            prefix = [lst[0]]  # Keep unique item.
        return prefix + shorter(lst[1:])
    else:
        return lst
The previous code can also be compacted to:
def shorter(lst):
    if lst:
        return lst[0:(lst[0] not in lst[1:])] + shorter(lst[1:])
    else:
        return lst
and the function body can also be reduced to a one-liner:
def shorter(lst):
    return (lst[0:(lst[0] not in lst[1:])] + shorter(lst[1:])) if lst else lst
or even:
def shorter(lst):
    return lst and (lst[0:(lst[0] not in lst[1:])] + shorter(lst[1:]))
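The compact versions rely on the fact that True and False behave as the integers 1 and 0, so the result of the "not in" test can be used directly as a slice bound. A small sketch of that trick, followed by the last variant run on the list from the question:

```python
lst = ["c", "g", 1, "t", 1]

# A boolean used as a slice bound acts like 1 (True) or 0 (False)
print(lst[0:True])   # ['c']  keeps the first item
print(lst[0:False])  # []     drops the first item

def shorter(lst):
    # "lst and ..." returns the empty list itself when lst is empty
    return lst and (lst[0:(lst[0] not in lst[1:])] + shorter(lst[1:]))

print(shorter(lst))  # ['c', 'g', 't', 1]
print(lst)           # the input list is left unchanged
```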
So I'm trying to figure out this problem and I can't figure out why it isn't working.
The premise is that you're given an input list and you have to find the second-lowest value. The list can have any number of integers and can repeat values; you can't change the list.
My code:
def second_min(x):
    input_list = list(x)
    print input_list
    list_copy = list(input_list)
    list_set = set(list_copy)
    if len(list_set) > 1:
        list_copy2 = list(list_set)
        list_copy2 = list_copy2.sort()
        return list_copy2[1]
    else:
        return None

print second_min([4,3,1,5,1])
print second_min([1,1,1])
The expected outputs for those two inputs are:
3
None
It's giving me errors on lines 9 and 13.
TypeError: 'NoneType' object has no attribute '__getitem__'
Thanks!
list_copy2 = list_copy2.sort()
.sort() sorts the list in place and returns None. So you're sorting the list, then throwing it away. You want just:
list_copy2.sort()
Or:
list_copy2 = sorted(list_set)
sorted always returns a list, so you can use it to sort the set and convert it to a list in one step!
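A quick way to see the difference for yourself (the variable names are only for this illustration):

```python
values = [4, 3, 1, 5, 1]

result = values.sort()   # sorts values in place...
print(result)            # None  (...and returns nothing useful)
print(values)            # [1, 1, 3, 4, 5]

# sorted() accepts any iterable, including a set, and returns a new list
distinct = sorted({4, 3, 1, 5, 1})
print(distinct)          # [1, 3, 4, 5]
```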
You need to use sorted instead of sort. sorted returns a new list, that is a sorted version of the original. sort will sort the list in-place, and returns None upon doing so.
def second_min(x):
    if len(x) > 1:
        return sorted(x)[1]
    else:
        return None
>>> second_min([4,3,1,5,1])
1
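Note that sorted(x)[1] counts repeated values separately, which is why the call above returns 1. If the second-lowest distinct value is wanted, as the 3 in the question suggests, one possible variation removes the repeats first. The name second_min_distinct is only for illustration:

```python
def second_min_distinct(x):
    """Return the second-lowest distinct value, or None if there is none."""
    distinct = sorted(set(x))  # drop repeats, then sort ascending
    if len(distinct) > 1:
        return distinct[1]
    return None

print(second_min_distinct([4, 3, 1, 5, 1]))  # 3
print(second_min_distinct([1, 1, 1]))        # None
```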
Help, I can't use sorted! It's not allowed!
def second_min(li):
    if len(li) < 2:
        return None
    it = iter(li)
    a, b = next(it), next(it)
    next_lowest, lowest = max(a, b), min(a, b)
    for x in it:
        if x < next_lowest:
            if x < lowest:
                lowest, next_lowest = x, lowest
            else:
                next_lowest = x
    return next_lowest
I was trying to write a function that would add up all the numbers within each sublist in a list of lists.
def MassAddition(_list):
    output = []
    total = 0
    for i in _list:
        if isinstance(i, list):
            output.append(MassAddition(i))
        else:
            total = total + i
    output.append(total)
    return output
Problem is that it returns an extra item in the list at the end. I think it's because I set total = 0 and then append it to the output list outside of the for loop. Can someone help me clean this up?
P.S. This function should be able to handle any level of nested lists.
example:
input = [[0,1,2], [2,1,5],[2,2,2],2,2,1]
desiredoutput = [[3],[8],[6],5]
Thank you,
You can check the additional numbers for type too. If they can only be int:
def mass_addition(lst):
    output = []
    total = 0
    extra_flag = False
    for i in lst:
        if isinstance(i, list):
            output.append(mass_addition(i))
        elif isinstance(i, int):
            extra_flag = True
            total += i
    if extra_flag:
        output.append(total)
    return output
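Be aware that the isinstance(i, int) check silently skips floats. If the lists may contain other numeric types (my assumption, the question does not say), one option is to test against numbers.Number instead. A sketch of that variation, with the flag renamed to saw_number for readability:

```python
from numbers import Number

def mass_addition(lst):
    output = []
    total = 0
    saw_number = False  # becomes True once a bare number appears at this level
    for i in lst:
        if isinstance(i, list):
            output.append(mass_addition(i))
        elif isinstance(i, Number):
            saw_number = True
            total += i
    if saw_number:
        output.append(total)
    return output

print(mass_addition([[0, 1, 2], [2, 1, 5], [2, 2, 2], 2, 2, 1]))
# [[3], [8], [6], 5]
print(mass_addition([[0.5, 1.5], [[1, 2], 3]]))
# [[2.0], [[3], 3]]
```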
Problem: I have a list of unique items and a comparison_tool to compare a certain property of item-pairs.
I want to store all items that return 1 for the comparison_tool for any other item without unnecessary comparisons.
Is there an efficient way to cycle through the list?
I have tried figuring it out with itertools.combinations(list_of_items, 2) and also failed with the example below:
def comparison_tool(a, b):
    '''Checks if prop matches (this is simplified for this example
    and i can't break it out of this function)
    '''
    if a.prop == b.prop:
        return 1  # this is bad
    else:
        return 0
list_of_items = [...]
faulty_items = set()
for a in list_of_items:
    for b in list_of_items:
        if comparison_tool(a,b):
            faulty_items.add(a,b)
            list_of_items.remove(b)
            # Here is where i go wrong. I would like to remove 'b' from list_of_items
            # so that 'a' doesn't cycle through 'b' in its upcoming loops
Or am I just going about this the wrong way?
You should use list comprehension and do:
newlist = [i for i in list_of_items if i.prop == 1]
It is simpler to create a new list containing the good values:
list_of_items = [...]
faulty_items = set()
new_list = []
for i in list_of_items:
    for j in new_list:
        if comparison_tool(i, j):
            faulty_items.add(i)
            break
    else:
        new_list.append(i)
list_of_items = new_list
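To see the for/else pattern run end to end, here is a self-contained sketch. The Item class and the sample data are hypothetical stand-ins, since the real items are not shown in the question; comparison_tool is the simplified version from the question.

```python
class Item:
    """Hypothetical stand-in for the real items; only .prop matters here."""
    def __init__(self, name, prop):
        self.name = name
        self.prop = prop

def comparison_tool(a, b):
    return 1 if a.prop == b.prop else 0

list_of_items = [Item("w", 1), Item("x", 2), Item("y", 1), Item("z", 3)]
faulty_items = set()
new_list = []
for i in list_of_items:
    for j in new_list:
        if comparison_tool(i, j):
            faulty_items.add(i)
            break
    else:
        # the inner loop ended without break: i clashes with nothing kept so far
        new_list.append(i)

print([item.name for item in new_list])            # ['w', 'x', 'z']
print(sorted(item.name for item in faulty_items))  # ['y']
```

Each item is only compared against the items kept so far. Note that the first item of a clashing pair ("w" here) stays in new_list and is not added to faulty_items; only the later one is.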