How do you find common sublists between two lists? [duplicate] - python

This question already has answers here:
Finding intersection/difference between python lists
(7 answers)
Closed 9 years ago.
How do you find or keep only the sublists of a list if the sublist is also present within another list?
lsta = [['a','b','c'],['c','d','e'],['e','f','g']]
lstb = [['a','b','c'],['d','d','e'],['e','f','g']]
I'd like to do something like set(lsta) & set(lstb)
Desired_List = [['a','b','c'],['e','f','g']]
The reason I'd like to do something like set is its speed, as I'm doing this on a very large list where efficiency is quite important.
Also, slightly unrelated, what if I wanted to subtract lstb from lsta to get
Desired_List2 = [['c','d','e']]

It is better to convert the sublists to tuples; then you can use the set operations directly (building the sets up front also works in Python 3, where map() returns a one-shot iterator):
>>> tupa = set(map(tuple, lsta))
>>> tupb = set(map(tuple, lstb))
>>> tupa.intersection(tupb)
{('a', 'b', 'c'), ('e', 'f', 'g')}
>>> tupa.difference(tupb)
{('c', 'd', 'e')}

If your sub-lists need to remain lists, use a list comprehension
Intersection:
>>> [i for i in lsta if i in lstb]
[['a', 'b', 'c'], ['e', 'f', 'g']]
Subtraction:
>>> [i for i in lsta if i not in lstb]
[['c', 'd', 'e']]
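For large inputs, the `in lstb` test scans lstb once for every element of lsta. A minimal sketch, assuming the sub-lists contain only hashable items, that keeps the sub-lists as lists but tests membership against a set of tuples (the names `seen_b`, `common`, and `only_in_a` are illustrative, not from the answers):

```python
lsta = [['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g']]
lstb = [['a', 'b', 'c'], ['d', 'd', 'e'], ['e', 'f', 'g']]

# Build the lookup set once so each membership test is O(1) on average.
seen_b = {tuple(sub) for sub in lstb}

# The results keep the original list objects and the order of lsta.
common = [sub for sub in lsta if tuple(sub) in seen_b]
only_in_a = [sub for sub in lsta if tuple(sub) not in seen_b]

print(common)     # [['a', 'b', 'c'], ['e', 'f', 'g']]
print(only_in_a)  # [['c', 'd', 'e']]
```

Unlike a plain set intersection, this preserves order and any duplicates that appear in lsta.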

I wrote a C module for this a while ago:
>>> import boolmerge
>>> lsta = [['a','b','c'],['c','d','e'],['e','f','g']]
>>> lstb = [['a','b','c'],['d','d','e'],['e','f','g']]
>>> list(boolmerge.andmerge(lsta, lstb))
[['a', 'b', 'c'], ['e', 'f', 'g']]
This runs in O(n) time and requires the lists to be sorted.
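The idea of a linear-time intersection over sorted inputs can be sketched in pure Python. This is not the boolmerge implementation, only an illustration of the general two-pointer merge technique; the function name `sorted_intersection` is hypothetical:

```python
def sorted_intersection(a, b):
    """Yield items present in both a and b; both must be sorted ascending."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            yield a[i]      # match: emit and advance both sides
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1          # a is behind, advance it
        else:
            j += 1          # b is behind, advance it


lsta = [['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g']]
lstb = [['a', 'b', 'c'], ['d', 'd', 'e'], ['e', 'f', 'g']]

result = list(sorted_intersection(lsta, lstb))
print(result)  # [['a', 'b', 'c'], ['e', 'f', 'g']]
```

Each step advances at least one index, so the loop does at most len(a) + len(b) comparisons. Lists compare lexicographically in Python, which is why the sub-lists can be ordered with `<`.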


How to remove brackets from an almost nested list? [duplicate]

This question already has answers here:
Flatten an irregular (arbitrarily nested) list of lists
(51 answers)
Closed 1 year ago.
I have the following list:
my_list=[[['A','B'],'C'],[['S','A'],'Q']]
How can I remove the brackets from the first two elements?
output:
my_list=[['A','B','C'],['S','A','Q']]
Slice it in?
for a in my_list:
    a[:1] = a[0]
This produces the desired output:
[x[0]+[x[1]] for x in my_list]
my_list=[
[['A','B'],'C'],
[['S','A'],'Q'],
]
result = [[item1, item2, item3] for (item1, item2), item3 in my_list]
>>> result
[['A', 'B', 'C'], ['S', 'A', 'Q']]
You can use the flattening solution in Cristian's answer on Flatten an irregular list of lists.
>>> [list(flatten(x)) for x in my_list]
[['A', 'B', 'C'], ['S', 'A', 'Q']]
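The `flatten` helper is not defined in this thread. A minimal sketch of one possible recursive generator follows; it is an assumption about what such a helper looks like, not necessarily the code from the referenced answer:

```python
from collections.abc import Iterable


def flatten(items):
    """Yield the leaves of an arbitrarily nested iterable; strings count as leaves."""
    for item in items:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            # Recurse into nested lists/tuples, but never into strings.
            yield from flatten(item)
        else:
            yield item


my_list = [[['A', 'B'], 'C'], [['S', 'A'], 'Q']]
result = [list(flatten(x)) for x in my_list]
print(result)  # [['A', 'B', 'C'], ['S', 'A', 'Q']]
```

The string check matters: a one-character string is itself iterable, so recursing into strings would never terminate.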
Other people have already given the one-line solutions, but let's take a step back and think about how to approach this problem, so you can solve it for yourself.
The tools you need:
for [element] in list: Iterate over each element in a list
list[i]: Get the ith element of a list
list.append(element): Add an element to a list
Let's start with the simple case. We have [['A','B'],'C'] and want ['A','B','C'].
We want to
Get the sublist ['A', 'B']
Get the single item that isn't in the sublist, 'C'
Add the single item into the sublist
Make the sublist the main list
Sometimes it's easiest to sketch these things out in the python shell:
>>> l = [['A','B'],'C']
>>> l[0]
['A', 'B']
>>> l[1]
'C'
>>> l[0].append(l[1])
>>> l = l[0]
>>> l
['A', 'B', 'C']
Now, we can build up a function to do this
def fix_single_element(element):
    """
    Given a list in the format [['A','B'],'C']
    returns a list in the format ['A','B','C']
    """
    # We use copy since we don't want to mess up the old list
    internal_list = element[0].copy()
    last_value = element[1]
    internal_list.append(last_value)
    return internal_list
Now we can use that:
>>> for sublist in my_list:
...     print(sublist)
...
[['A', 'B'], 'C']
[['S', 'A'], 'Q']
Note that the sublists are exactly the problem we just solved.
>>> new_list = []
>>> for sublist in my_list:
...     new_list.append(fix_single_element(sublist))
...
>>> new_list
[['A', 'B', 'C'], ['S', 'A', 'Q']]
There are LOTS of ways to do any particular task, and this is probably not the "best" way, but it's a way that will work. Focus on writing code you understand and can change, not the shortest code, especially when you start.

How to add all items at index to new list [duplicate]

This question already has answers here:
How do I iterate through two lists in parallel?
(8 answers)
Closed 3 years ago.
I am currently trying to re-sort a list I got from parsing a website.
I have tried everything but I don't think I found the best solution to my problem.
Let's say we have the following list:
my_list = [['a', 'b', 'c'], ['a', 'b', 'c']]
What I am trying to convert it to:
new_list = [['a', 'a'], ['b', 'b'], ['c', 'c']]
I came up with the following loop:
result = [[], [], []]
for sublist in my_list:
    for i in range(0, len(sublist)):
        result[i].append(sublist[i])
print(result)
# output: [['a', 'a'], ['b', 'b'], ['c', 'c']]
I assume my method is not the best, and I am looking for the most Pythonic way to do it.
The Python builtin function zip() is your friend here.
From the official documentation, it
returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables.
What that means is that given two lists, zip(list1, list2) will pair list1[0] with list2[0], list1[1] with list2[1], and so on.
In your case, the lists you want to zip together are inside another list, my_list, so you can unpack it with *my_list. Since zip() returns an iterator, you want to create a list out of the return value of zip(). Final solution:
new_list = list(zip(*my_list))
I have written code that does exactly what you want. Use map after zip to convert the tuples to lists.
Code:
my_list = [['a', 'b', 'c'], ['a', 'b', 'c']]
my_new_list = list(map(list, zip(my_list[0], my_list[1])))
print(my_new_list)
Output:
$ python3 test.py
[['a', 'a'], ['b', 'b'], ['c', 'c']]
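The two answers can be combined: unpacking with `*` handles any number of sub-lists, and a conversion step turns each tuple from zip() into a list. A short sketch (the names `as_tuples` and `as_lists` are illustrative):

```python
my_list = [['a', 'b', 'c'], ['a', 'b', 'c']]

# zip(*my_list) is the same as zip(my_list[0], my_list[1], ...).
as_tuples = list(zip(*my_list))
print(as_tuples)  # [('a', 'a'), ('b', 'b'), ('c', 'c')]

# Convert each column tuple to a list to match the desired output exactly.
as_lists = [list(column) for column in zip(*my_list)]
print(as_lists)   # [['a', 'a'], ['b', 'b'], ['c', 'c']]
```

Note that zip() stops at the shortest sub-list, so ragged input is silently truncated.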

How to print a list of tupled tuples in CSV-acceptable format? [duplicate]

This question already has answers here:
Flatten an irregular (arbitrarily nested) list of lists
(51 answers)
Closed 5 years ago.
I have a list of tuples I would like to print in CSV format without quotes or brackets.
[(('a','b','c'), 'd'), ... ,(('e','f','g'), 'h')]
Desired output:
a,b,c,d,e,f,g,h
I can get rid of some of the punctuation using chain, .join() or the *-operator, but my knowledge is not sophisticated enough to get rid of all of it for my particular use case.
Thank you.
So, in your case there is a pattern which makes this relatively easy:
>>> x = [(('a','b','c'), 'd') ,(('e','f','g'), 'h')]
>>> [c for a,b in x for c in (*a, b)]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Or, an itertools.chain solution:
>>> from itertools import chain
>>> list(chain.from_iterable((*a, b) for a,b in x))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
And, in case you are on an old version of Python (before 3.5) and can't use (*a, b), you will need something like:
[c for a,b in x for c in a+(b,)]
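The flattened list still has to be turned into the CSV line the question asks for. A minimal sketch, assuming every value is a string that contains no commas or quotes (the names `flat` and `line` are illustrative):

```python
x = [(('a', 'b', 'c'), 'd'), (('e', 'f', 'g'), 'h')]

# Flatten each ((a, b, c), d) pair into a single run of values.
flat = [c for a, b in x for c in (*a, b)]

# Join with commas to get one CSV-style line without quotes or brackets.
line = ','.join(flat)
print(line)  # a,b,c,d,e,f,g,h
```

If the values may contain commas or quotes, the standard library's csv.writer handles the escaping and is the safer choice.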

How to combine every element of a list to the other list? [duplicate]

This question already has answers here:
Element-wise addition of 2 lists?
(17 answers)
Closed 5 years ago.
Suppose there are two lists:
['a', 'b', 'c'], ['d', 'e', 'f']
what I want is:
'ad','ae','af','bd','be','bf','cd','ce','cf'
How can I get this without recursion or list comprehension? I mean only use loops, using python?
The itertools module implements a lot of loop-like things:
import itertools
combined = []
for pair in itertools.product(['a', 'b', 'c'], ['d', 'e', 'f']):
    combined.append(''.join(pair))
While iterating through the elements of the first list, iterate over all of the elements of the second list and append each combined result to the new list.
first_list = ['a', 'b', 'c']
second_list = ['d', 'e', 'f']
combined_list = []
for i in first_list:
    for j in second_list:
        combined_list.append(i + j)
print(combined_list)
This concept is called a Cartesian product, and the stdlib itertools.product will build one for you - the only problem is it will give you tuples like ('a', 'd') instead of strings, but you can just pass them through join for the result you want:
from itertools import product
print(*map(''.join, product(['a','b','c'], ['d','e','f'])))

Getting specific indexed distinct values in nested lists

I have a nested list of around 1 million records like:
l = [['a', 'b', 'c', ...], ['d', 'b', 'e', ...], ['f', 'z', 'g', ...],...]
I want to get the distinct values at the second index of the inner lists, so that my resulting list looks like:
resultant = ['b', 'z', ...]
I have tried nested loops but it's not fast. Any help will be appreciated!
Since you want the unique items, you can use collections.OrderedDict.fromkeys() to keep both the order and the uniqueness (the keys are stored in a hash table), and use zip() to get the second items. In Python 2.x:
from collections import OrderedDict
list(OrderedDict.fromkeys(zip(*my_lists)[1]))
In Python 3.x, since zip() returns an iterator, you can do this:
colls = zip(*my_lists)
next(colls)  # skip the first column
list(OrderedDict.fromkeys(next(colls)))
Or use a generator expression within OrderedDict.fromkeys():
list(OrderedDict.fromkeys(i[1] for i in my_lists))
Demo:
>>> lst = [['a', 'b', 'c'], ['d', 'b', 'e'], ['f', 'z', 'g']]
>>>
>>> list(OrderedDict().fromkeys(sub[1] for sub in lst))
['b', 'z']
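On Python 3.7 and later, the built-in dict also preserves insertion order, so the same approach works without the import. A short sketch (the name `distinct` is illustrative):

```python
lst = [['a', 'b', 'c'], ['d', 'b', 'e'], ['f', 'z', 'g']]

# dict keys are unique and keep first-seen order (guaranteed since Python 3.7).
distinct = list(dict.fromkeys(sub[1] for sub in lst))
print(distinct)  # ['b', 'z']
```

This makes a single pass over the outer list, which suits the roughly one million records mentioned in the question.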
You can unzip the list of lists and then pass the second tuple to set, as below (this is Python 2; in Python 3, zip() returns an iterator, so wrap it in list() before indexing).
This code takes 4.05311584473e-06 milliseconds on my laptop:
list(set(zip(*lst)[1]))
Input:
lst = [['a', 'b', 'c'], ['d', 'b', 'e'], ['f', 'z', 'g']]
Output:
['b', 'z']
Would that work for you?
result = set([inner_list[1] for inner_list in l])
I can think of two options.
Set comprehension:
res = {x[1] for x in l}
I think numpy arrays work faster than list/set comprehensions, so converting this list to an array and then using array functions can be faster. Here:
import numpy as np
res = np.unique(np.array(l)[:, 1])
Let me explain: np.array(l) converts the list to a 2D array, then [:, 1] takes the second column (counting from 0), which consists of the second item of each sublist in the original l, and finally np.unique keeps only the unique values.
