How to find common elements in list of lists? - python

I'm trying to figure out how to compare an n number of lists to find the common elements.
For example:
p=[ [1,2,3],
    [1,9,9],
    ..
    ..
    [1,2,4] ]
>>> print common(p)
[1]
Now if I know the number of elements I can do comparisons like:
for a in b:
    for c in d:
        for x in y:
            ...
but that won't work if I don't know how many elements p has. I've looked at this solution that compares two lists
https://stackoverflow.com/a/1388864/1320800
but after spending 4 hrs trying to figure a way to make that recursive, a solution still eludes me so any help would be highly appreciated!

You are looking for the set intersection of all the sublists, and the data type you should use for set operations is a set:
result = set(p[0])
for s in p[1:]:
    result.intersection_update(s)
print result

A simple solution (one-line) is:
set.intersection(*[set(sublist) for sublist in p])

The set.intersection() method supports intersecting multiple inputs at a time. Use argument unpacking to pull the sublists out of the outer list and pass them into set.intersection() as separate arguments:
>>> p=[ [1,2,3],
...     [1,9,9],
...     [1,2,4]]
>>> set(p[0]).intersection(*p)
set([1])
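For Python 3, the same idea can be wrapped in a small helper. This is only a sketch: the name common comes from the question, while returning an empty list for an empty input and sorting the result are my own assumptions, not part of the answers above.

```python
def common(lists):
    """Return the elements present in every sublist, as a sorted list."""
    if not lists:
        # Assumption: with no sublists there is nothing in common.
        return []
    first, *rest = lists
    # intersection() accepts any number of iterables, so the remaining
    # sublists can be passed directly without converting them to sets.
    return sorted(set(first).intersection(*rest))


p = [[1, 2, 3], [1, 9, 9], [1, 2, 4]]
print(common(p))  # [1]
```

Sorting only works when the elements can be compared with each other; drop the sorted() call and return the set if they cannot.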

Why not just:
set.intersection(*map(set, p))
Result:
set([1])
Or like this:
ip = iter(p)
s = set(next(ip))
s.intersection(*ip)
Result:
set([1])
edit:
copied from console:
>>> p = [[1,2,3], [1,9,9], [1,2,4]]
>>> set.intersection(*map(set, p))
set([1])
>>> ip = iter(p)
>>> s = set(next(ip))
>>> s.intersection(*ip)
set([1])

p=[ [1,2,3],
    [1,9,9],
    [1,2,4]]
ans = [ele[0] for ele in zip(*p) if len(set(ele)) == 1]
Result:
>>> ans
[1]
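Note that the zip-based version compares the sublists position by position, so it only reports a value when every sublist holds it at the same index. A small sketch of the difference, using made-up sample data q:

```python
q = [[1, 2, 3],
     [2, 1, 3]]

# Column-wise comparison: only index 2 holds the same value in every sublist.
positional = [col[0] for col in zip(*q) if len(set(col)) == 1]

# Set intersection ignores positions entirely.
anywhere = sorted(set.intersection(*map(set, q)))

print(positional)  # [3]
print(anywhere)    # [1, 2, 3]
```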

from functools import reduce  # built in on Python 2, must be imported on Python 3
reduce(lambda x, y: x & y, (set(i) for i in p))

You are looking for the set intersection of all the sublists, and the data type you should use for set operations is a set:
result = set(p[0])
for s in p[1:]:
    result.intersection_update(s)
print result
However, keep in mind that a set is unordered. If you turn 'result' into a list with list(result), the elements can come out in any order; I noticed this once I had more than 10 lists in the list.
If you depend on the order, sort the result, for example with sorted(result).
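As far as I know, the loop itself has no limit on the number of sublists; what is not defined is only the iteration order of the resulting set. A minimal sketch with twelve made-up sublists, using sorted() to get a deterministic order:

```python
# Twelve sublists (made-up data): each contains 3, 1, 2 plus one unique value.
p = [[3, 1, 2, 100 + i] for i in range(12)]

result = set(p[0])
for s in p[1:]:
    result.intersection_update(s)

# A set has no defined order, so sort when the order matters.
ordered = sorted(result)
print(ordered)  # [1, 2, 3]
```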

Related

Remove Variable(s) in List A if Variable(s) is/are in List B, Python

Like the title states I want to remove variables in one list if they happen to be in another list. I have tried various techniques but I can't seem to get a proper code. Can anyone help with this?
You may use list comprehension if you want to maintain the order:
>>> l = [1,2,3,4]
>>> l2 = [1,5,6,3]
>>> [x for x in l if x not in l2]
[2, 4]
In case the order of elements in the original list doesn't matter, you may use a set:
>>> list(set(l) - set(l2))
[2, 4]
def returnNewList(a,b):
    h = {}
    for e in b:
        h[e] = True
    return [e for e in a if e not in h]
A hash table is used to keep the run-time complexity linear.
If list b is sorted, you can perform a binary search instead of using a hash table; the complexity in that case will be O(n log n).
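The binary-search variant could look roughly like this. It is a sketch built on the standard bisect module; the function name remove_using_sorted is made up for illustration, and it assumes b_sorted really is sorted in ascending order.

```python
from bisect import bisect_left


def remove_using_sorted(a, b_sorted):
    """Return the items of a that do not occur in the sorted list b_sorted."""
    kept = []
    for e in a:
        # bisect_left finds the leftmost position where e could be inserted.
        pos = bisect_left(b_sorted, e)
        found = pos < len(b_sorted) and b_sorted[pos] == e
        if not found:
            kept.append(e)
    return kept


print(remove_using_sorted([1, 2, 3, 4], sorted([1, 5, 6, 3])))  # [2, 4]
```

Each lookup costs a logarithmic number of comparisons, and the order of a is preserved.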
There are several ways
# just make a new list
[i for i in a if i not in b]
# use sets
list(set(a).difference(set(b)))
I figured it out, however is there a shorter way to write this code?
a = [0,1,2,3,4,5,6,7,8]
b = [0,5,8]
for i in a:
    if i in b:
        a.remove(i)
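One caveat with this loop: it removes items from the list it is iterating over, which makes the iterator skip the element that follows each removal. It happens to work for the data above, but not in general. A small sketch of the problem and a shorter, safer alternative (both function names are made up for illustration):

```python
def remove_while_iterating(a, b):
    a = list(a)              # copy, so the caller's list is left alone
    for i in a:              # iterating over the same list we mutate
        if i in b:
            a.remove(i)      # shifts later items left; the next one is skipped
    return a


def remove_safely(a, b):
    exclude = set(b)         # set membership tests are fast
    return [i for i in a if i not in exclude]


print(remove_while_iterating([0, 0, 1], [0]))  # [0, 1]  (one 0 survives)
print(remove_safely([0, 0, 1], [0]))           # [1]
```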

Python, finding unique words in multiple lists

I have the following code:
a= ['hello','how','are','hello','you']
b= ['hello','how','you','today']
len_b=len(b)
for word in a:
    count=0
    while count < len_b:
        if word == b[count]:
            a.remove(word)
            break
        else:
            count=count+1
print a
The goal is that it basically outputs (contents of list a)-(contents of list b),
so the wanted result in this case would be a = ['are','hello'],
but when I run my code I get a = ['how','are','you'].
Can anybody either point out what is wrong with my implementation, or suggest a better way to solve this?
You can use a set to get all non-duplicate elements,
so you could do set(a) - set(b) for the difference of the sets. Note that this drops duplicates, so the second 'hello' is lost.
The reason you get the wrong result is that you are mutating the list a while iterating over it.
If you want to solve it correctly, you can try the method below. It uses a dictionary to keep track of how many times each word should appear in the result, and a list comprehension to build the result:
>>> a = ['hello','how','are','hello','you']
>>> b = ['hello','how','you','today']
>>>
>>> cnt_a = {}
>>> for w in a:
...     cnt_a[w] = cnt_a.get(w, 0) + 1
...
>>> for w in b:
...     if w in cnt_a:
...         cnt_a[w] -= 1
...         if cnt_a[w] == 0:
...             del cnt_a[w]
...
>>> [y for k, v in cnt_a.items() for y in [k] * v]
['hello', 'are']
It works well in cases where there are duplicates, even in the resulting list. However, it may not preserve the order, but it can easily be modified to do so if you want.
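One possible order-preserving variant, sketched with collections.Counter. The function name subtract_words is made up for illustration, and it assumes that each word in b should cancel the earliest remaining occurrence in a.

```python
from collections import Counter


def subtract_words(a, b):
    """Remove one occurrence from a for each occurrence in b, keeping a's order."""
    remaining = Counter(b)          # copies of each word still to be cancelled
    result = []
    for word in a:
        if remaining[word] > 0:
            remaining[word] -= 1    # this copy is cancelled out by b
        else:
            result.append(word)
    return result


a = ['hello', 'how', 'are', 'hello', 'you']
b = ['hello', 'how', 'you', 'today']
print(subtract_words(a, b))  # ['are', 'hello']
```

Building a new list also avoids mutating a while iterating over it.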
set(a+b) is alright, too. You can use sets to get unique elements.

for loop in the range "first element of lists"

I have a list which has the following structure:
a = [[1,'a'], [2,'b'], [3,'c']]
I would like to create a range of the first element in every sub-list without making a second for-loop. I was thinking about something like this:
for i in a[][0]:
    print i
However, the code above does not work (SyntaxError: invalid syntax). Any idea if it's possible to do this in Python?
EDIT:
The output I would like to get with the above loop is:
1
2
3
and not
1
a
for sublist in a:
    print sublist[0]
To build a list of first items, use list comprehension:
first_items = [sublist[0] for sublist in a]
for i,_ in a:
    print i
should do the trick
This is probably overkill for such a simple example, but for variety:
import operator
for i in map(operator.itemgetter(0), a):
    print i
In Python 2 map builds the whole list of first elements. You could use itertools.imap to avoid that. Of course for 3 elements it doesn't matter anyway.
I mention this because it's more flexible than for i, _ in a (there don't need to be exactly two elements in each sublist) and it gives you the i you want instead of doing for i in a and using i[0] (perhaps multiple times in a less simple example). But of course you could just as easily get the i you want with:
for l in a:
    i = l[0]
    print i
... not everything needs to be done in the loop header, it's just nice that it can be :-)
>>> a = [[1,'a'], [2,'b'], [3,'c']]
>>> for i in a:
...     print i[0]
...
1
2
3
I think this method is kind of close to what you were trying.
>>> a = [[1,'a'], [2,'b'], [3,'c']]
>>> for [x,y] in a:
...     print(x)
...
1
2
3
However, if your lists are of unequal size, then @warwaruk's answer is better.
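In Python 3, extended unpacking is another way to cope with sublists of unequal size while still naming the first item in the loop header. A minimal sketch with made-up data; it assumes every sublist has at least one item (an empty sublist would raise a ValueError).

```python
a = [[1, 'a'], [2, 'b', 'extra'], [3]]   # made-up sublists of different lengths

# first takes the leading item; *_ collects the remaining items (possibly none).
first_items = [first for first, *_ in a]
print(first_items)  # [1, 2, 3]
```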

How to take duplicated values from the several lists?

I have several lists in python, and I would like to take only values which are in each list, is there any function to do it directly?
for example I have:
{'a','b','c','d','e'},{'a','g','c','d','h','e'}, {'i','b','m','d','e','a'}
and I want to make one list which contains
{'a','d','e'}
but I don't know how many lists I actually have, because it depends on the value 'i'.
Thanks for any help!
If the elements are unique and hashable (and order doesn't matter in the result), you can use set intersection, e.g.:
common_elements = list(set(list1).intersection(list2).intersection(list3))
This is functionally equivalent to:
common_elements = list( set(list1) & set(list2) & set(list3) )
The & operator only works with sets, whereas the intersection method works with any iterable.
If you have a list of lists and you want the intersection of all of them, you can do this easily:
common_elements = list( set.intersection( * map(set, your_list_of_lists) ) )
special thanks to DSM for pointing this one out
Or you could just use a loop:
common_elements = set(your_list_of_lists[0])
for elem in your_list_of_lists[1:]:
    common_elements = common_elements.intersection(elem) #or common_elements &= set(elem) ...
else:
    common_elements = list(common_elements)
Note that if you really want to get the order that they were in the original list, you can do that using a simple sort:
common_elements.sort( key = lambda x: your_list_of_lists[0].index(x) )
By construction, there is no risk of a ValueError being raised here.
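An alternative that keeps the order of the first list without sorting is to filter that list against the intersection. This is only a sketch: the function name common_in_first_order is made up, and it assumes there is at least one sublist.

```python
def common_in_first_order(list_of_lists):
    first, *rest = list_of_lists          # assumes at least one sublist
    shared = set(first).intersection(*rest)
    # Filtering the first list keeps its order and avoids repeated index() lookups.
    return [x for x in first if x in shared]


lists = [['a', 'b', 'c', 'd', 'e'],
         ['a', 'g', 'c', 'd', 'h', 'e'],
         ['i', 'b', 'm', 'd', 'e', 'a']]
print(common_in_first_order(lists))  # ['a', 'd', 'e']
```

If the first list contains duplicates, this version keeps every copy of a shared element.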
Just to put a one-liner on the table:
l=['a','b','c','d','e'],['a','g','c','d','h'], ['i','b','m','d','e']
reduce(lambda a, b: a & b, map(set, l))
or
from operator import and_
l=['a','b','c','d','e'],['a','g','c','d','h'], ['i','b','m','d','e']
reduce(and_, map(set, l))
You need to make a set from the first list, then use the set's .intersection() method.
a, b, c = ['a','b','c','d','e'], ['a','g','c','d','h'], ['i','b','m','d','e']
exists_in_all = set(a).intersection(b).intersection(c)
Updated.
Simplified according to mgilson's comment.
import operator
a = [['a','b','c','d','e'],['a','g','c','d','h','e'], ['i','b','m','d','e','a']]
print list(reduce(operator.and_, map(set, a)))
This will give you the common elements from the lists (the order may vary, since sets are unordered):
['a', 'e', 'd']
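The reduce-based snippets above are written for Python 2. On Python 3, reduce has to be imported from functools and print is a function; a minimal sketch of the same approach, with sorted() added only to make the printed order predictable:

```python
from functools import reduce
from operator import and_

a = [['a', 'b', 'c', 'd', 'e'],
     ['a', 'g', 'c', 'd', 'h', 'e'],
     ['i', 'b', 'm', 'd', 'e', 'a']]

# and_ is the function form of the & operator, applied pairwise to the sets.
common_elements = reduce(and_, map(set, a))
print(sorted(common_elements))  # ['a', 'd', 'e']
```

reduce raises a TypeError if a is empty, so guard against that case if the number of lists can be zero.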

Is the order of results coming from a list comprehension guaranteed?

When using a list comprehension, is the order of the new list guaranteed in any way? As a contrived example, is the following behavior guaranteed by the definition of a list comprehension:
>>> a = [x for x in [1,2,3]]
>>> a
[1, 2, 3]
Equally, is the following equality guaranteed:
>>> lroot = [1, 2, 3]
>>> la = [x for x in lroot]
>>> lb = []
>>> for x in lroot:
...     lb.append(x)
...
>>> lb == la
True
Specifically, it's the ordering I'm interested in here.
Yes, the list comprehension preserves the order of the original iterable (if there is one).
If the original iterable is ordered (list, tuple, file, etc.), that's the order you'll get in the result. If your iterable is unordered (a set, or a dict before Python 3.7, etc.), there are no guarantees about the order of the items.
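A short illustration of both cases, using made-up values. With a list as the source, the result follows the list's order; with a set, sorting first is one way to get a predictable order.

```python
source = [3, 1, 2]
doubled = [x * 2 for x in source]
print(doubled)  # [6, 2, 4]

# Iterating a set gives no order guarantee, so sort first if order matters.
unordered = {3, 1, 2}
doubled_sorted = [x * 2 for x in sorted(unordered)]
print(doubled_sorted)  # [2, 4, 6]
```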
Yes, a list is a sequence. Sequence order is significant.
It has been a while, but since I came up with a similar question myself recently, and needed a bit more explanation to understand what this comes down to exactly, I'll add my two cents, may it help someone else in the future! :)
More specifically this is about the order of values resulting from a list comprehension operation.
Imagine you have the following list:
list_of_c = [a, b, c, d, e]
I want to round the variables in that list using the following list comprehension:
list_of_d = [round(value, 4) for value in list_of_c]
My question was whether this would mean that the order resulting from the list comprehension would be the following:
list_of_d = [round_a, round_b, round_c, round_d, round_e]
And the answer I received very kindly from @juanpa.arrivillaga was that indeed, YES, that was the case!
