remove non repeating characters from a list - python

I am trying to remove non repeating characters from a list in python. e.g list = [1,1,2,3,3,3,5,6] should return [1,1,3,3].
My initial attempt was:
def tester(data):
for x in data:
if data.count(x) == 1:
data.remove(x)
return data
This will work for some inputs, but for [1,2,3,4,5], for example, it returns [2,4]. Could someone please explain why this occurs?

l=[1,1,2,3,3,3,5,6]
[x for x in l if l.count(x) > 1]
[1, 1, 3, 3, 3]
Adds elements that appear at least twice in your list.
In your own code you need to change the line for x in data to for x in data[:]:
Using data[:] you are iterating over a copy of original list.

There is a linear time solution for that:
def tester(data):
cnt = {}
for e in data:
cnt[e] = cnt.get(e, 0) + 1
return [x for x in data if cnt[x] > 1]

This is occurring because you are removing from a list as you're iterating through it. Instead, consider appending to a new list.
You could also use collections.Counter, if you're using 2.7 or greater:
[a for a, b in collections.Counter(your_list).items() if b > 1]

Another linear solution.
>>> data = [1, 1, 2, 3, 3, 3, 5, 6]
>>> D = dict.fromkeys(data, 0)
>>> for item in data:
... D[item] += 1
...
>>> [item for item in data if D[item] > 1]
[1, 1, 3, 3, 3]

You shouldn't remove items from a mutable list while iterating over that same list. The interpreter doesn't have any way to keep track of where it is in the list while you're doing this.
See this question for another example of the same problem, with many suggested alternative approaches.

you can use the list comprehention,just like this:
def tester(data):
return [x for x in data if data.count(x) != 1]
it is not recommended to remove item when iterating

Related

How to iterate the first index value twice before going to the next index position?

I'm trying to make a for loop that iterates each index twice before going to the next one, for example if I have the following list:
l = [1,2,3]
I would like to iterate it if it was in this way:
l = [1,1,2,2,3,3]
could someone help me with this problem please?
The most obvious thing would be a generator function that yields each item in the iterable twice:
def twice(arr):
for val in arr:
yield val
yield val
for x in twice([1, 2, 3]):
print(x)
If you need a list, then
l = list(twice([1, 2, 3]))
You could make a list comprehension that repeats the elements and flattens the result:
l = [1,2,3]
repeat = 2
[n for i in l for n in [i]*repeat]
# [1, 1, 2, 2, 3, 3]
You can solve this by using NumPy.
import numpy as np
l = np.repeat([1,2,3],2)
Repeat repeats elements of an array the specified number of times. This also returns a NumPy array. This can be converted back to a list if you wish with list(l). However, NumPy arrays act similar to lists so most of the time you don't need to change anything.
Unsurprisingly, more-itertools has that:
>>> from more_itertools import repeat_each
>>> for x in repeat_each([1, 2, 3]):
... print(x)
...
1
1
2
2
3
3
(It also has an optional second parameter for telling it how often to repeat each. For example repeat_each([1, 2, 3], 5). Default is 2.)
l = [1, 2, 3]
lst = []
for x in l:
for i in range(2):
lst.append(x)
print(lst)
# [1, 1, 2, 2, 3, 3]

printing items in a list represented by bit list

I have this problem on writing a python function which takes a bit list as input and prints the items represented by this bit list.
so the question is on Knapsack and it is a relatively simple and straightforward one as I'm new to the python language too.
so technically the items can be named in a list [1,2,3,4] which corresponds to Type 1, Type 2, Type 3 and etc but we won't be needing the "type". the problem is, i represented the solution in a bit list [0,1,1,1] where 0 means not taken and 1 means taken. in another words, item of type 1 is not taken but the rest are taken, as represented in the bit list i wrote.
now we are required to write a python function which takes the bit list as input and prints the item corresponding to it in which in this case i need the function to print out [2,3,4] leaving out the 1 since it is 0 by bit list. any help on this? it is a 2 mark question but i still couldn't figure it out.
def printItems(l):
for x in range(len(l)):
if x == 0:
return False
elif x == 1:
return l
i tried something like that but it is wrong. much appreciated for any help.
You can do this with the zip function that takes two tiers Lee and returns them in pairs:
for bit_item, item in zip(bit_list, item_list):
if bit_item:
print item
Or if you need a list rather than printing them, you can use a list comprehension:
[item for bit_item, item in zip(bit_list, item_list) if bit_item]
You can use itertools.compress for a quick solution:
>>> import itertools
>>> list(itertools.compress(itertools.count(1), [0, 1, 1, 1]))
[2, 3, 4]
The reason your solution doesn't work is because you are using return in your function, where you need to use print, and make sure you are iterating over your list correctly. In this case, enumerate simplifies things, but there are many similar approaches that would work:
>>> def print_items(l):
... for i,b in enumerate(l,1):
... if b:
... print(i)
...
>>> print_items([0,1,1,1])
2
3
4
>>>
You may do it using list comprehension with enumerate() as:
>>> my_list = [0, 1, 1, 1]
>>> taken_list = [i for i, item in enumerate(my_list, 1) if item]
>>> taken_list # by default start with 0 ^
[2, 3, 4]
Alternatively, in case you do not need any in-built function and want to create your own function, you may modify your code as:
def printItems(l):
new_list = []
for x in range(len(l)):
if l[x] == 1:
new_list.append(x+1) # "x+1" because index starts with `0` and you need position
return new_list
Sample run:
>>> printItems([0, 1, 1, 1])
[2, 3, 4]

Is it safe practice to edit a list while looping through it?

I was trying to execute a for loop like:
a = [1,2,3,4,5,6,7]
for i in range(0, len(a), 1):
if a[i] == 4:
a.remove(a[i])
I end up having an index error since the length of the list becomes shorter but the iterator i does not become aware.
So, my question is, how can something like that be coded? Can the range of i be updated in each iterations of the loop based on current array condition?
For the .pop() that you mention for example you can use a list comprehension to create a second list or even modify the original one in place. Like so:
alist = [1, 2, 3, 4, 1, 2, 3, 5, 5, 4, 2]
alist = [x for x in alist if x != 4]
print(alist)
#[1, 2, 3, 1, 2, 3, 5, 5, 2]
As user2393256 more generally puts it, you can generalize and define a function my_filter() which will return a boolean based on some check you implement in it. And then you can do:
def my_filter(a_value):
return True if a_value != 4 else False
alist = [x for x in alist if my_filter(x)]
I would go with the function solution if the check was too complicated to type in the list comprehension, so mainly for readability. The example above is therefore not the best since the check is very simple but i just wanted to show you how it would be done.
If you want to delete elements from your list while iterating over it you should use list comprehension.
a = [1,2,3,4,5,6,7]
a = [x for x in a if not check(x)]
You would need to write a "check" function that returns wether or not you want to keep the element in the list.
I don't know where you are going with that but this would do what I beleive you want :
i=0
a = [1,2,3,4,5,6,7]
while boolean_should_i_stop_the_loop :
if i>=len(a) :
boolean_should_i_stop_the_loop = False
#here goes what you want to do in the for loop
print i;
a.append(4)
i += 1

How to find duplicates in a list without creating a separate list?

How to find the duplicates in a list without creating any other list?
Example
A = [1,2,1,3,4,5,4]
At the end
A = [1,4]
So you want a function which takes a list, A, and mutates that list to contain only those elements which were originally duplicated? I assume the restriction on creating new lists applies to any new collection. It is best to be as clear as possible of the requirements when asking a question about algorithms.
It seems an odd requirement that no other collections be made in this algorithm, but it is possible.
A simple but inefficient solution would be to approach it like this:
for each element, x
set a boolean flag value, for example, hasDuplicates to false
for each element right of x, y
if y is a duplicate of x, remove it and set hasDuplicates to true
if hasDuplicates is false, remove x
If the restriction of not creating another collection can be relaxed, or if the result of the algorithm can be a new list rather than the old one modified, you will find much more (time) efficient ways of doing this.
I would go with checking, for each element, if it appears before it but not after. If it doesn't fit, then either it is not a duplicate or it is an other occurence of the duplicate that you don't want to keep. Either cases we don't keep it.
def simplify(a_list):
for i in range(len(a_list) - 1, -1, -1):
value = a_list[i]
if not value in a_list[:i] or value in a_list[i+1:]:
del a_list[i]
Not sure if using slices fit your requirements though.
Usage:
>>> A = [1,2,1,3,4,5,4]
>>> simplify(A)
>>> A
[1, 4]
>>> A = [1,1,1,1,1,2,2,2,2]
>>> simplify(A)
>>> A
[1, 2]
>>> A = [1,1,1,1,1]
>>> simplify(A)
>>> A
[1]
>>> A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> simplify(A)
>>> A
[]
You can use set to get only the unique values and then remove them, one-by-one, from the original list - so that only the duplicates will remain:
a = [1,2,1,3,4,5,4]
s = list(set(a))
for x in s:
a.remove(x)
print a # [1, 4]
Another elegant option which I 'stole' from Ritesh Kumar is:
collect only the items that appear more than once, use set to remove the dups, and wrap it with list to return a list as a result:
a = [1,2,1,3,4,5,4]
print list(set([x for x in a if a.count(x) > 1])) # [1, 4]
This should do what you need, barring clarification:
def find_duplicated_items(data):
seen = set()
duplicated = set()
for x in data:
if x in seen:
duplicated.add(x)
else:
seen.add(x)
return duplicated
It takes an iterable and returns a set; you can turn it into a list with list(results).
UPDATE:
Here's another way of doing it, as a generator. Just because :).
from collections import Counter
def find_duplicated(iterable, atleast=2):
counter = Counter()
yielded = set()
for item in iterable:
counter[item] += 1
if (counter[item] >= atleast) and (item not in yielded):
yield item
yielded.add(item)
This code appears to delete 2nd duplicates and non duplicates in place, yielding the old list containing only unique duplicates. I haven't thoroughly tested it. Note that time required will scale as O(N**2) where N is the length of the input list.
Unlike other solutions, there are no new lists constructed here, not even a list for a for loop or list comprehension.
File: "dup.py"
def dups(mylist):
idx = 0
while(idx<len(mylist)):
delidx = idx+1
ndeleted = 0
while delidx < len(mylist):
if mylist[delidx] == mylist[idx]:
del mylist[delidx]
ndeleted += 1
else:
delidx += 1
if ndeleted==0:
del mylist[idx]
else:
idx += 1
return mylist
Usage (iPython)
In [1]: from dup import dups
In [2]: dups([1,1,1,1,1])
Out[2]: [1]
In [3]: dups([1,1,2,1,1])
Out[3]: [1]
In [4]: dups([1,1,2,2,1])
Out[4]: [1, 2]
In [5]: dups([1,1,2,1,2])
Out[5]: [1, 2]
In [6]: dups([1,2,3,1,2])
Out[6]: [1, 2]
In [7]: dups([1,2,1,3,4,5,4])
Out[7]: [1, 4]

Nested List and count()

I want to get the number of times x appears in the nested list.
if the list is:
list = [1, 2, 1, 1, 4]
list.count(1)
>>3
This is OK. But if the list is:
list = [[1, 2, 3],[1, 1, 1]]
How can I get the number of times 1 appears? In this case, 4.
>>> L = [[1, 2, 3], [1, 1, 1]]
>>> sum(x.count(1) for x in L)
4
itertools and collections modules got just the stuff you need (flatten the nested lists with itertools.chain and count with collections.Counter
import itertools, collections
data = [[1,2,3],[1,1,1]]
counter = collections.Counter(itertools.chain(*data))
print counter[1]
Use a recursive flatten function instead of itertools.chain to flatten nested lists of arbitrarily level depth
import operator, collections
def flatten(lst):
return reduce(operator.iadd, (flatten(i) if isinstance(i, collections.Sequence) else [i] for i in lst))
reduce with operator.iadd has been used instead of sum so that the flattened is built only once and updated in-place
Here is yet another approach to flatten a nested sequence. Once the sequence is flattened it is an easy check to find count of items.
def flatten(seq, container=None):
if container is None:
container = []
for s in seq:
try:
iter(s) # check if it's iterable
except TypeError:
container.append(s)
else:
flatten(s, container)
return container
c = flatten([(1,2),(3,4),(5,[6,7,['a','b']]),['c','d',('e',['f','g','h'])]])
print(c)
print(c.count('g'))
d = flatten([[[1,(1,),((1,(1,))), [1,[1,[1,[1]]]], 1, [1, [1, (1,)]]]]])
print(d)
print(d.count(1))
The above code prints:
[1, 2, 3, 4, 5, 6, 7, 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
1
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
12
Try this:
reduce(lambda x,y: x+y,list,[]).count(1)
Basically, you start with an empty list [] and add each element of the list list to it. In this case the elements are lists themselves and you get a flattened list.
PS: Just got downvoted for a similar answer in another question!
PPS: Just got downvoted for this solution as well!
If there is only one level of nesting flattening can be done with this list comprenension:
>>> L = [[1,2,3],[1,1,1]]
>>> [ item for sublist in L for item in sublist ].count(1)
4
>>>
For the heck of it: count to any arbitrary nesting depth, handling tuples, lists and arguments:
hits = lambda num, *n: ((1 if e == num else 0)
for a in n
for e in (hits(num, *a) if isinstance(a, (tuple, list)) else (a,)))
lst = [[[1,(1,),((1,(1,))), [1,[1,[1,[1]]]], 1, [1, [1, (1,)]]]]]
print sum(hits(1, lst, 1, 1, 1))
15
def nested_count(lst, x):
return lst.count(x) + sum(
nested_count(l,x) for l in lst if isinstance(l,list))
This function returns the number of occurrences, plus the recursive nested count in all contained sub-lists.
>>> data = [[1,2,3],[1,1,[1,1]]]
>>> print nested_count(data, 1)
5
The following function will flatten lists of lists of any depth(a) by adding non-lists to the resultant output list, and recursively processing lists:
def flatten(listOrItem, result = None):
if result is None: result = [] # Ensure initial result empty.
if type(listOrItem) != type([]): # Handle non-list by appending.
result.append(listOrItem)
else:
for item in listOrItem: # Recursively handle each item in a list.
flatten(item, result)
return result # Return flattened container.
mylist = flatten([[1,2],[3,'a'],[5,[6,7,[8,9]]],[10,'a',[11,[12,13,14]]]])
print(f'Flat list is {mylist}, count of "a" is {mylist.count("a")}')
print(flatten(7))
Once you have a flattened list, it's a simple matter to use count on it.
The output of that code is:
Flat list is [1, 2, 3, 'a', 5, 6, 7, 8, 9, 10, 'a', 11, 12, 13, 14], count of "a" is 2
[7]
Note the behaviour if you don't pass an actual list, it assumes you want a list regardless, one containing just the single item.
If you don't want to construct a flattened list, you can just use a similar method to get the count of any item in the list of lists, with something like:
def deepCount(listOrItem, searchFor):
if type(listOrItem) != type([]): # Non-list, one only if equal.
return 1 if listOrItem == searchFor else 0
subCount = 0 # List, recursively collect each count.
for item in listOrItem:
subCount += deepCount(item, searchFor)
return subCount
deepList = [[1,2],[3,'a'],[5,[6,7,[8,9]]],[10,'a',[11,[12,13,14]]]]
print(f'Count of "a" is {deepCount(deepList, "a")}')
print(f'Count of 13 is {deepCount(deepList, 13)}')
print(f'Count of 99 is {deepCount(deepList, 99)}')
As expected, the output of this is:
Count of "a" is 2
Count of 13 is 1
Count of 99 is 0
(a) Up to the limits imposed by Python itself of course, limits you can increase by just adding this to the top of your code:
import sys
sys.setrecursionlimit(1001) # I believe default is 1000.
I mention that just in case you have some spectacularly deeply nested structures but you shouldn't really need it. If you're nesting that deeply then you're probably doing something wrong :-)

Categories