Finding elements that are in strictly N of X lists - python

I have N lists, and would like to know which elements are present in strictly X of those lists. I understand that if I have two lists, it's rather straightforward:
lst_a = [1,2,3]
lst_b = [1,2,5]
overlap = list(set(lst_a) & set(lst_b))
What if I have, say, 5 lists, and want to know which elements are in strictly 4 of those?

Merge using counters:
from collections import Counter
lst_a = [1,2,3]
lst_b = [1,2,5]
lsts = [lst_a, lst_b]
counter = Counter()
for lst in lsts:
    unique = set(lst)
    counter += Counter(unique)
n = 2
print(f"elements in exactly {n} lsts:")
for k, v in counter.items():
    if v == n:
        print(k)

Similar to @wim's code, but more concise:
[i for i, c in sum(map(Counter, lsts), Counter()).items() if c == 2]
If the items in the input lists are not necessarily unique, you can map the lists to sets first:
[i for i, c in sum(map(Counter, map(set, lsts)), Counter()).items() if c == 2]
This returns:
[1, 2]
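For the case asked about in the question (five lists, elements present in exactly four of them), the same idea can be wrapped in a small helper. This is only a sketch: the function name elements_in_exactly and the sample lists are made up for illustration.

```python
from collections import Counter
from itertools import chain


def elements_in_exactly(lists, n):
    """Return the elements that occur in exactly n of the given lists."""
    # Deduplicate each list first so an element counts once per list.
    counts = Counter(chain.from_iterable(set(lst) for lst in lists))
    return [item for item, count in counts.items() if count == n]


# Hypothetical sample data: 1 is in all five lists, 2 is in four of them.
lists = [
    [1, 2, 3],
    [1, 2, 5],
    [1, 2, 2, 7],
    [1, 3, 9],
    [1, 2, 8],
]
print(sorted(elements_in_exactly(lists, 4)))  # [2]
```

The result is sorted before printing only to make the output order predictable.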

Related

Common elements in two lists preserving duplicates [duplicate]

The goal is to find common elements in two lists while preserving duplicates.
For example,
Input:
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
Expected output:
[3,5,5]
I tried set.intersection, but set operations eliminate duplicates.
Here is my suggestion:
from collections import Counter
ac=Counter(a)
bc=Counter(b)
res=[]
for i in set(a).intersection(set(b)):
    res.extend([i] * min(bc[i], ac[i]))
>>> print(res)
[3, 5, 5]
You can build a Counter for each list, take the keys that occur in both, and repeat each key by the minimum of its two counts:
from collections import Counter
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
ca = Counter(a)
cb = Counter(b)
# Build one group per common key, then flatten the groups into a single list.
result = [item
          for group in ([key] * min(ca[key], cb[key])
                        for key in ca
                        if key in cb)
          for item in group]
print(result)
Output:
[3,5,5]
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
def findout(a, b):
    a = a.copy()
    output = []
    for i in b:
        if i in a:
            a.remove(i)
            output.append(i)
    return output
result = findout(a, b)
print(result) # [3, 5, 5]
may work.
Using Counter from collections module.
from collections import Counter
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
ans = []
a_count = Counter(a)
b_count = Counter(b)
for i in a_count:
    if i in b_count:
        ans.extend([i]*min(a_count[i], b_count[i]))
print(ans)
Output
[3, 5, 5]
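The loops above can also be written with the & operator that Counter provides, which keeps each key with the minimum of its two counts. A minimal sketch using the lists from the question (the name common is chosen here for illustration):

```python
from collections import Counter

a = [1, 3, 3, 4, 5, 5]
b = [3, 5, 5, 5, 6]

# Counter & Counter keeps each key with the minimum of its two counts.
common = Counter(a) & Counter(b)

# elements() repeats each key as many times as its count.
result = list(common.elements())
print(sorted(result))  # [3, 5, 5]
```

The result is sorted here only so the printed order does not depend on how the Counter iterates.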
The answer depends on whether the lists are always sorted, as in your example. If so, you can use a two-cursor approach:
index_a = 0
index_b = 0
common_elements = []
while index_a < len(a) and index_b < len(b):
    if a[index_a] < b[index_b]:
        # then a should check the next number, b should stay
        index_a += 1
    elif a[index_a] > b[index_b]:
        # then the reverse
        index_b += 1
    else:
        # they are equal
        common_elements.append(a[index_a])
        index_a += 1
        index_b += 1
However, if they are not sorted like that, you are probably better off taking the set intersection and then, for each element el in it, adding min(a.count(el), b.count(el)) copies of el to the result.
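The unsorted fallback described in the last sentence could look roughly like this. It is a sketch only: the function name common_with_duplicates and the shuffled sample lists are assumptions, not part of the original answer.

```python
def common_with_duplicates(a, b):
    """Common elements of a and b, each repeated min(count in a, count in b) times."""
    result = []
    for el in set(a) & set(b):
        # list.count scans the whole list, so this is quadratic in the worst case.
        result.extend([el] * min(a.count(el), b.count(el)))
    return result


a = [5, 1, 3, 5, 4, 3]   # deliberately unsorted sample data
b = [5, 6, 5, 3, 5]
print(sorted(common_with_duplicates(a, b)))  # [3, 5, 5]
```

Because the loop runs over a set, the order of the result is not tied to either input list.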
Preserving the duplicates was tricky, but I finally got a solution:
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
c=[]
def dublicate_finder(a,b):
    # Note: matched items are removed from the input lists.
    global c
    if len(a)>len(b):
        # walk the shorter list b and consume matches from a
        for i in range(len(b)):
            if b[i] in a:
                c.append(b[i])
                remove_index=a.index(b[i],0,len(a))
                del a[remove_index]
    else:
        # otherwise walk a and consume matches from b
        for i in range(len(a)):
            if a[i] in b:
                c.append(a[i])
                remove_index=b.index(a[i],0,len(b))
                del b[remove_index]
    return c
Try this. You can use any() to check whether the element equals one in the other list, then remove that element:
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
l3=[]
for i in b:
    if any(i==j for j in a):
        l3.append(i)
        a.remove(i)
print(l3)
Although set.intersection removes duplicates, it can be very useful nonetheless:
a_set = set(a)
b_set = set(b)
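intr = a_set.intersection(b_set)
result = [element for element in a if element in intr]
Note that this keeps every occurrence in a of a common element, so for the example lists it gives [3, 3, 5, 5] rather than [3, 5, 5].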
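If each common element should appear only as many times as it occurs in both lists, while keeping the order of a, one possible sketch is to count down the copies still available in b. The function name intersect_keep_order is hypothetical.

```python
from collections import Counter


def intersect_keep_order(a, b):
    """Common elements in the order of a, limited by how often they occur in b."""
    remaining = Counter(b)  # how many copies of each value b can still match
    result = []
    for element in a:
        if remaining[element] > 0:
            result.append(element)
            remaining[element] -= 1
    return result


a = [1, 3, 3, 4, 5, 5]
b = [3, 5, 5, 5, 6]
print(intersect_keep_order(a, b))  # [3, 5, 5]
```

Unlike the versions that call remove(), this does not modify either input list.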

How to find the indices of list elements that occur exactly twice in Python

I'm trying to write a code which returns ALL indices of the elements from a list, which are repeated EXACTLY TWICE. I'm having trouble with my own algorithm. My code only returns the FIRST occurrence it finds. I want this fixed. Here's my own code (it's somehow weird, I know):
from collections import Counter
length = int(input())
user_input = [int(x) for x in input().split()]
occurrances = Counter(user_input)
check_list = []
check_list.append(list(occurrances.keys())[list(occurrances.values()).index(2)])
print(check_list)
I appreciate any help from anyone. Thanks in advance.
Try this:
from collections import Counter
userInput = input().split()
counter = Counter(userInput)
print([x[0] for x in counter.items() if x[1] == 2])
To find the indices of the items occurring twice.
>>> L = [1,2,3,1,4,6,6]
>>> from collections import Counter
>>> c = Counter(L)
>>> for key in filter(lambda x: c[x] == 2, c):
...     one = L.index(key)
...     two = L.index(key, one+1)
...     print(key, 'found at indexes', ' '.join(map(str, [one, two])))
1 found at indexes 0 3
6 found at indexes 5 6
To get the indexes you can use Counter and enumerate inside a list comprehension:
from collections import Counter
L = [1,2,3,4,3,4,2,3,5]
L2 = [i for c in [Counter(L)] for i,v in enumerate(L) if c[v]==2]
print(L2)
[1, 3, 5, 6]
If you're not allowed to use libraries, you can do it without Counter (although it will run slower):
L2 = [i for i,v in enumerate(L) if L.count(v)==2]
This should work if you are looking for the indexes
from collections import Counter
user_input = [int(x) for x in input().split()]
occurrences = Counter(user_input)
keys = [key for key in occurrences.keys() if occurrences[key]==2 ]
check_list = [x for x in range(len(user_input)) if user_input[x] in keys]
print(check_list)
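If it helps to see which indices belong to which value, the positions can be grouped in a single pass. This is a sketch under the assumption that a dict of value to index list is an acceptable result shape; the function name is made up.

```python
from collections import defaultdict


def indices_of_values_seen_twice(values):
    """Map each value that occurs exactly twice to the list of its indices."""
    positions = defaultdict(list)
    for index, value in enumerate(values):
        positions[value].append(index)
    return {value: idx for value, idx in positions.items() if len(idx) == 2}


L = [1, 2, 3, 1, 4, 6, 6]
print(indices_of_values_seen_twice(L))  # {1: [0, 3], 6: [5, 6]}
```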

Creating a list with a specified length by combining other lists

I am trying to generate a list that combines elements of two other lists, one holding numbers and one holding strings.
I've tried keeping two separate lists and using join and append to combine the elements at the appropriate stage.
To match the length of list d to list a, I've used a while loop as a counter.
a=7*[1]
b=[1,2,3,4,5]
c=['a','b','c']
d=[]
The outcome I'm trying to achieve is that list d has the same length as list a and is a combination of list b and list c:
d=['1a','1b','1c','2a','2b','2c','3a']
I can think of a naive solution for now:
def create(bk, ck, len_required):
    dk = []
    for bitem in bk:
        for citem in ck:
            dk.append(str(bitem) + citem)
            if len(dk) == len_required:
                return dk
len_required = len(a)
b = [1, 2, 3, 4, 5]
c = ['a', 'b', 'c']
d = create(b, c, len_required)
result = [str(b[int(i / len(c)) % len(b)]) + str(c[i % len(c)]) for i in range(len(a))]
This iterates i from 0 to len(a) and concatenates b[int(i / len(c)) % len(b)] and c[i % len(c)] in the output.
You could do it with a list comprehension:
d = [str(v)+L for v in b*len(a) for L in c][:len(a)]
or, if you're allowed to use itertools:
from itertools import cycle
cycleA = cycle(str(v)+L for v in b for L in c)
d = [ next(cycleA) for _ in a ]
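A variant of the itertools idea, sketched here with product and islice so the target length is passed as a number rather than taken from a third list. The function name combine and its parameters are assumptions for illustration.

```python
from itertools import cycle, islice, product


def combine(b, c, length):
    """Pair every item of b with every item of c, repeating as needed, and keep `length` results."""
    pairs = (f"{v}{s}" for v, s in product(b, c))
    # cycle() restarts the pairs once they run out; islice() stops after `length` items.
    return list(islice(cycle(pairs), length))


b = [1, 2, 3, 4, 5]
c = ['a', 'b', 'c']
print(combine(b, c, 7))  # ['1a', '1b', '1c', '2a', '2b', '2c', '3a']
```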

Finding list containing unique elements in list of lists in python?

Say I have a list of lists like so:
list = [[1,2,3,4],[4,5,3,2],[7,8,9,2],[5,6,8,9]]
I want to get the indices of the inner lists that contain unique elements. For the example above, the list at index 2 is the only one that contains 7, and the list at index 3 is the only one that contains 6.
How would one go about implementing this in python?
Here's a solution using Counter. Each inner list is checked for a value that only has a single count, and then the corresponding index is printed (a la enumerate).
from collections import Counter
from itertools import chain
l = [[1,2,3,4],[4,5,3,2],[7,8,9,2],[5,6,8,9]]
c = Counter(chain.from_iterable(l))
idx = [i for i, x in enumerate(l) if any(c[y] == 1 for y in x)]
print(idx)
[0, 2, 3]
A possible optimisation might include precomputing unique elements in a set to replace the any call with a set.intersection.
c = Counter(chain.from_iterable(l))
u = {k for k in c if c[k] == 1}
idx = [i for i, x in enumerate(l) if u.intersection(x)]
A naive solution:
>>> from collections import Counter
>>> from itertools import chain
>>> my_list = [[1,2,3,4],[4,5,3,2],[7,8,9,2],[5,6,8,9]]
# find out the counts.
>>> counter = Counter(chain(*my_list))
# find the unique numbers
>>> uniques = [element for element,count in counter.items() if count==1]
# find the index of those unique numbers
>>> result = [indx for indx,elements in enumerate(my_list) for e in uniques if e in elements]
>>> result
[0, 2, 3]
using itertools.chain with set.difference(set)
from itertools import chain
l = [[1,2,3,4],[4,5,3,2],[7,8,9,2],[5,6,8,9]]
[i for i in range(len(l)) if set(l[i]).difference(set(chain(*[j for j in l if j!=l[i]])))]
#[0, 2, 3]
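To see which element makes each list qualify, the unique elements can be mapped to the index of the list that holds them. This is a sketch that assumes "unique" means the element occurs exactly once across all inner lists combined; the name unique_locations is made up.

```python
from collections import Counter
from itertools import chain

lists = [[1, 2, 3, 4], [4, 5, 3, 2], [7, 8, 9, 2], [5, 6, 8, 9]]

counts = Counter(chain.from_iterable(lists))

# Map each element that occurs only once overall to the index of its list.
unique_locations = {
    element: index
    for index, sub in enumerate(lists)
    for element in sub
    if counts[element] == 1
}
print(unique_locations)  # {1: 0, 7: 2, 6: 3}
```

The set of values in this dict gives the same indices as the answers above.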

Get list based on occurrences in unknown number of sublists

I'm looking for a way to turn a list of lists (a below) into a single list (b below) under two conditions:
The order of the new list (b) is based on the number of times each value occurs across the lists in a.
A value can only appear once.
Basically turn a into b:
a = [[1,2,3,4], [2,3,4], [4,5,6]]
# value 4 occurs 3 times in list a and gets first position
# value 2 occurs 2 times in list a and get second position and so on...
b = [4,2,3,1,5,6]
I figure one could do this with set and some list magic, but I can't get my head around it when a can contain any number of lists. The list a is built from user input (I guess it can contain between 1 and 20 lists, with up to 200-300 items in each).
I'm trying something along the lines of [set(l) for l in a], but I don't know how to perform set(l) & set(l)... to get all matched items.
Is it possible without a for loop that iterates (sublist count * items per sublist) times?
I think this is probably the closest you're going to get:
from collections import defaultdict
d = defaultdict(int)
for sub in a:
    for val in sub:
        d[val] += 1
print(sorted(d.keys(), key=lambda k: d[k], reverse=True))
# Output: [4, 2, 3, 1, 5, 6]
There is an off chance that the order of elements that appear an identical number of times may be indeterminate: on Python versions before 3.7, the output of d.keys() is not ordered.
import itertools
all_items = set(itertools.chain(*a))
b = sorted(all_items, key = lambda y: -sum(x.count(y) for x in a))
Try this -
a = [[1,2,3,4], [2,3,4], [4,5,6]]
s = set()
for l in a:
    s.update(l)
print(s)
#{1, 2, 3, 4, 5, 6}
b = list(s)
This adds each list to the set, which gives you the set of all distinct elements across the lists, if that is what you are after.
Edit. To preserve the order of elements in the original list, you can't use sets.
a = [[1,2,3,4], [2,3,4], [4,5,6]]
b = []
for l in a:
    for i in l:
        if i not in b:
            b.append(i)
print(b)
#[1, 2, 3, 4, 5, 6] - the same order as the set in this case, since that's the order in which they appear in the lists
import itertools
from collections import defaultdict
def list_by_count(lists):
    data_stream = itertools.chain.from_iterable(lists)
    counts = defaultdict(int)
    for item in data_stream:
        counts[item] += 1
    return [item for (item, count) in
            sorted(counts.items(), key=lambda x: (-x[1], x[0]))]
Having x[0] in the sort key ensures that items with the same count come out in a deterministic order (ascending by value).
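The same count-then-sort idea can be sketched more compactly with Counter in place of defaultdict. The function name order_by_frequency is chosen here for illustration.

```python
from collections import Counter
from itertools import chain


def order_by_frequency(lists):
    """Distinct values sorted by descending total count, ties broken by value."""
    counts = Counter(chain.from_iterable(lists))
    return sorted(counts, key=lambda value: (-counts[value], value))


a = [[1, 2, 3, 4], [2, 3, 4], [4, 5, 6]]
print(order_by_frequency(a))  # [4, 2, 3, 1, 5, 6]
```

This still visits every item once to count it; only the sort works on the distinct values.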
