Python match elements in two lists of lists

Consider the following example:
list1 = [[A,B,C,D],[A,B,C,D],[A,B,C,D]]
list2 = [[B,C],[B,C],[B,C]]
I want to create a new list of items in list 1 that match B,C from both lists. For example:
list1 = [[1,A,B,2],[1,D,E,3],[2,F,G,4]]
list2 = [[A,B],[B,C],[F,G]]
So after matching the above I want the result to be:
newlst = [[1,A,B,2],[2,F,G,4]]
I attempted to use a for loop with a check like row[1] in list2, but that didn't work. I also tried:
match = set(list1) & set(list2)
That did not work either.

The set approach fails because the inner lists are unhashable, so set(list1) raises a TypeError before & (which is intersection on sets) is ever applied. What you want to do is loop over list1 and check whether the 2nd and 3rd elements match any pair in list2.
This can be done in one line via list comprehension.
newlst = [i for i in list1 if i[1:3] in list2]
This is equivalent to:
newlst = []
# loop over list1
for i in list1:
    # i[1:3] returns a list of the 2nd and 3rd elements
    # if the 2nd and 3rd elements are in the second list
    if i[1:3] in list2:
        newlst.append(i)
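If list2 is large, the `in list2` test scans the whole list for every row. A minimal sketch of a set-based variant, assuming the items are hashable values such as strings (the name `wanted` is made up for this example):

```python
# Sample data from the question, with the letters written as strings.
list1 = [[1, 'A', 'B', 2], [1, 'D', 'E', 3], [2, 'F', 'G', 4]]
list2 = [['A', 'B'], ['B', 'C'], ['F', 'G']]

# Lists are unhashable, so convert each pair to a tuple before building the set.
wanted = {tuple(pair) for pair in list2}

# Keep every row whose 2nd and 3rd elements form one of the wanted pairs.
newlst = [row for row in list1 if tuple(row[1:3]) in wanted]
print(newlst)
```

This is also what the original set(list1) & set(list2) attempt was probably reaching for: sets work once the inner lists are turned into tuples.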

This gives what you want for the example data. Note that zip compares each row of list1 only with the row at the same position in list2:
list1 = [[1, 'A', 'B' ,2], [1, 'D', 'E',3], [2, 'F', 'G', 4]]
list2 = [['A', 'B'], ['B', 'C'], ['F', 'G']]
list3 = [e for (e, k) in zip(list1, list2) if (e[1] == k[0] and e[2] == k[1])]
list3
[[1, 'A', 'B', 2], [2, 'F', 'G', 4]]

Related

python: sort array when sorting other array

I have two arrays:
a = np.array([1,3,4,2,6])
b = np.array(['c', 'd', 'e', 'f', 'g'])
These two arrays are linked (in the sense that there is a 1-1 correspondence between their elements), so when I sort a in decreasing order I would like to sort b in the same order.
For instance, when I do:
a = np.sort(a)[::-1]
I get:
a = [6, 4, 3, 2, 1]
and I would like to be able to get also:
b = ['g', 'e', 'd', 'f', 'c']

I would do something like this:
import numpy as np
a = np.array([1,3,4,2,6])
b = np.array(['c', 'd', 'e', 'f', 'g'])
idx_order = np.argsort(a)[::-1]
a = a[idx_order]
b = b[idx_order]
output:
a = [6 4 3 2 1]
b = ['g' 'e' 'd' 'f' 'c']

I don't know how, or even if, you can do this with numpy arrays. However, there is a way using standard lists, albeit slightly convoluted. Consider this:
a = [1, 3, 4, 2, 6]
b = ['c', 'd', 'e', 'f', 'g']
assert len(a) == len(b)
c = []
for i in range(len(a)):
    c.append((a[i], b[i]))
# reverse=True gives the decreasing order asked for in the question
r = sorted(c, reverse=True)
for i in range(len(r)):
    a[i], b[i] = r[i]
print(a)
print(b)
In your problem statement, the two lists are related only by position. What happens here is that we make that relationship explicit by grouping the corresponding items from each list into a temporary list of tuples. sorted() compares the tuples by their first element (falling back to the second element on ties), and reverse=True makes the sort descending. We then just rebuild our original lists.
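The same pairing idea can be written more compactly with zip. This is a sketch using plain lists; the names `pairs`, `a_sorted` and `b_sorted` are introduced only for this example:

```python
a = [1, 3, 4, 2, 6]
b = ['c', 'd', 'e', 'f', 'g']

# Pair the lists, then sort the pairs by the value from `a` only, largest first.
pairs = sorted(zip(a, b), key=lambda pair: pair[0], reverse=True)

# Unzip the sorted pairs back into two lists.
a_sorted = [x for x, _ in pairs]
b_sorted = [y for _, y in pairs]
print(a_sorted)
print(b_sorted)
```

Sorting with an explicit key means the elements of b are never compared with each other, which matters if they are not orderable.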

selecting elements from list of list based on length and intersection

l1 = [['a', 'b', 'c'],
['a', 'd', 'c'],
['a', 'e'],
['a', 'd', 'c'],
['a', 'f', 'c'],
['a', 'e'],
['p', 'q', 'r']]
l2 = [1, 1, 1, 2, 0, 0, 0]
I have two lists as represented above. l1 is a list of lists and l2 is another list with some kind of score.
Problem: For all the lists in l1 with a score of 0 (from l2), find those lists which are either entirely different or have the least length.
For example: if I have the lists [1, 2, 3], [2, 3], [5, 7], all with score 0, I will choose [5, 7] because its elements are not present in any other list, and [2, 3] because it intersects with [1, 2, 3] but is shorter.
How I do this now:
import itertools

l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))
un_usable = []
usable = []
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)
final = lx + [(x, 0) for x in usable]
and final gives me:
[(['a', 'b', 'c'], 1),
(['a', 'd', 'c'], 1),
(['a', 'e'], 1),
(['a', 'd', 'c'], 2),
(['a', 'e'], 0),
(['p', 'q', 'r'], 0)]
which is the required result.
EDIT: to handle equal lengths:
l1 = [['a', 'b', 'c'],
['a', 'd', 'c'],
['a', 'e'],
['a', 'd', 'c'],
['a', 'f', 'c'],
['a', 'e'],
['p', 'q', 'r'],
['a', 'k']]
l2 = [1, 1, 1, 2, 0, 0, 0, 0]
l = [x for x, y in zip(l1, l2) if y == 0]
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
c = list(itertools.combinations(l, 2))
un_usable = []
usable = []
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection > 0:
        if len(i) < len(j):
            usable.append(i)
            un_usable.append(j)
        elif len(i) == len(j):
            usable.append(i)
            usable.append(j)
        else:
            usable.append(j)
            un_usable.append(i)
usable = [list(x) for x in set(tuple(x) for x in usable)]
un_usable = [list(x) for x in set(tuple(x) for x in un_usable)]
for i, j in c:
    intersection = len(set(i).intersection(set(j)))
    if intersection == 0:
        if i not in un_usable and i not in usable:
            usable.append(i)
        if j not in un_usable and j not in usable:
            usable.append(j)
final = lx + [(x, 0) for x in usable]
Is there a better, faster & pythonic way of achieving the same?

Assuming I understood everything correctly, here is an O(N) two-pass algorithm.
Steps:
Select lists with zero score.
For each element of each zero-score list, find the length of the shortest zero-score list in which the element occurs. Let's call this the length score of the element.
For each list, find the minimum of length scores of all elements of the list. If the result is less than the length of the list, the list is discarded.
def select_lsts(lsts, scores):
    # pick out zero score lists
    z_lsts = [lst for lst, score in zip(lsts, scores) if score == 0]
    # keep track of the shortest length of any list in which an element occurs
    len_shortest = dict()
    for lst in z_lsts:
        ln = len(lst)
        for c in lst:
            len_shortest[c] = min(ln, len_shortest.get(c, float('inf')))
    # check if the list is of minimum length for each of its chars
    for lst in z_lsts:
        len_lst = len(lst)
        if any(len_shortest[c] < len_lst for c in lst):
            continue
        yield lst
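As a check against the question's first data set, here is a self-contained sketch that repeats the generator in compact form (so the block runs on its own) and rebuilds the `final` list the same way the question does:

```python
def select_lsts(lsts, scores):
    # Zero-score lists are the only candidates.
    z_lsts = [lst for lst, score in zip(lsts, scores) if score == 0]
    # For each element, the length of the shortest candidate containing it.
    len_shortest = {}
    for lst in z_lsts:
        for c in lst:
            len_shortest[c] = min(len(lst), len_shortest.get(c, float('inf')))
    # Keep a list only if no shorter candidate shares one of its elements.
    for lst in z_lsts:
        if all(len_shortest[c] >= len(lst) for c in lst):
            yield lst

l1 = [['a', 'b', 'c'], ['a', 'd', 'c'], ['a', 'e'], ['a', 'd', 'c'],
      ['a', 'f', 'c'], ['a', 'e'], ['p', 'q', 'r']]
l2 = [1, 1, 1, 2, 0, 0, 0]

# Non-zero scores are kept untouched, as in the question.
lx = [(x, y) for x, y in zip(l1, l2) if y > 0]
final = lx + [(x, 0) for x in select_lsts(l1, l2)]
for entry in final:
    print(entry)
```

Here ['a', 'f', 'c'] is dropped because 'a' also occurs in the shorter ['a', 'e'], while ['p', 'q', 'r'] survives because it shares nothing with the other zero-score lists.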

Efficient way to find similar items in a list in Python

I have a list of list as follows:
list_1 = [[[1,a],[2,b]], [[3,c],[4,d]], [[1,a],[5,d]], [[8,r],[10,u]]]
I am trying to find whether an element in this list is similar to another element. Right now, I'm looping over it twice, i.e. for each element, I check against the rest. My output is:
[[[1,a],[2,b]], [[1,a],[5,d]]]
Is there a way to do this more efficiently?
Thanks.

You can use itertools.combinations and any functions like this
from itertools import combinations
for item in combinations(list_1, 2):
    # item is a pair of elements; report it if they share any sub-list
    if any(i in item[1] for i in item[0]):
        print(item)
Output
([[1, 'a'], [2, 'b']], [[1, 'a'], [5, 'd']])

I'm assuming that, by similar, you mean that the element has at least one matching pair within it. In this case, rather than do a nested loop, you could map each element into a dict of lists twice (once for each [number,str] pair within it). When you finish, each key in the dict will map to the list of elements which contain that key (i.e., are similar).
Example code:
list_1 = [[[1,'a'],[2,'b']], [[3,'c'],[4,'d']], [[1,'a'],[5,'d']], [[8,'r'],[10,'u']]]
d = {}
for elt in list_1:
    # key for the first [number, str] pair
    s0 = '%d%s' % (elt[0][0], elt[0][1])
    if s0 in d:
        d[s0].append(elt)
    else:
        d[s0] = [elt]
    # key for the second [number, str] pair
    s1 = '%d%s' % (elt[1][0], elt[1][1])
    if s1 in d:
        d[s1].append(elt)
    else:
        d[s1] = [elt]
for key in d.keys():
    print(key, ':', d[key])
Example output (the key order may differ depending on the Python version):
1a : [[[1, 'a'], [2, 'b']], [[1, 'a'], [5, 'd']]]
8r : [[[8, 'r'], [10, 'u']]]
2b : [[[1, 'a'], [2, 'b']]]
3c : [[[3, 'c'], [4, 'd']]]
5d : [[[1, 'a'], [5, 'd']]]
4d : [[[3, 'c'], [4, 'd']]]
10u : [[[8, 'r'], [10, 'u']]]
Any of the dict entries with length > 1 have similar elements. This will reduce the runtime complexity of your code to O(n), assuming you have a way to obtain a string representation of a, b, c, etc.
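The string keys can be avoided by using the pairs themselves as keys once they are converted to tuples. A minimal sketch with collections.defaultdict, assuming the inner values are hashable and that an element does not contain the same pair twice (the names `groups` and `similar` are made up here):

```python
from collections import defaultdict

list_1 = [[[1, 'a'], [2, 'b']], [[3, 'c'], [4, 'd']],
          [[1, 'a'], [5, 'd']], [[8, 'r'], [10, 'u']]]

# Map each (number, str) pair to the elements that contain it.
groups = defaultdict(list)
for elt in list_1:
    for pair in elt:
        groups[tuple(pair)].append(elt)

# Any pair shared by more than one element marks those elements as similar.
similar = [elts for elts in groups.values() if len(elts) > 1]
print(similar)
```

This also works when an element holds more than two pairs, since the inner loop does not assume a fixed length.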

Trouble find value in list of lists

I have two lists. The first is a_list and is like this:
a_list = [1,2,3]
The second is b_list, and it's a list with lists in it. It's like this:
b_list = [['a',1,'b'],['c',2,'g'],['e',3,'5']]
What I'm trying to do is use a_list to find the correct b_list and print the value[2] in the b_list.
My code looks like:
for a in a_list:
    for b in b_list:
        if b[1] == a:
            print(b[2])
The actual a_list has 136 values in it, and the real b_list has 315 lists in it.
I had initially written code to index the b item and remove it from b_list if b[1] == a.
I've taken that code out in order to solve the real problem.

There is no need to loop over a_list; a simple in test would suffice:
for b in b_list:
    if b[1] in a_list:
        print(b[2])
This would perform better if you made a_list a set, because membership tests on a set do not scan every item:
a_set = set(a_list)
for b in b_list:
    if b[1] in a_set:
        print(b[2])
Either way, this code prints:
b
g
5
for your example data.
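If the results should come out in the order of a_list rather than the order of b_list, one option is to index b_list by its middle value first. This is only a sketch, and it assumes the middle values in b_list are unique (the names `lookup` and `found` are introduced for illustration):

```python
a_list = [1, 2, 3]
b_list = [['a', 1, 'b'], ['c', 2, 'g'], ['e', 3, '5']]

# Map the middle value of each sub-list to its last value.
# Assumption: middle values are unique; a repeated key would keep the last row.
lookup = {b[1]: b[2] for b in b_list}

# Walk a_list in order and skip values that have no matching sub-list.
found = [lookup[a] for a in a_list if a in lookup]
print(found)
```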

If I understood correctly what you want to do:
a_list = [1,2,3,5]
b_list = [['a',1,'b'],['c',2,'g'],['e',3,'5'],
          ['d',4,'h'],['Z',5,'X'],['m',6,'i']]
print('a_list ==', a_list)
print('\nb_list before :')
print(b_list)
print('\nEnumerating b_list in reversed order :')
L = len(b_list)
print(' i el L-i b_list[L-i] \n'
      ' -------------------------------------')
for i,el in enumerate(b_list[::-1],1):
    print(' %d %r %d %r' % (i,el,L-i,b_list[L-i]))
L = len(b_list)
# iterate over a reversed copy so deletions do not shift the items still to visit
for i,el in enumerate(b_list[::-1],1):
    if el[1] in a_list:
        del b_list[L-i]
print('\nb_list after :')
print(b_list)
result
a_list == [1, 2, 3, 5]
b_list before :
[['a', 1, 'b'], ['c', 2, 'g'], ['e', 3, '5'],
['d', 4, 'h'], ['Z', 5, 'X'], ['m', 6, 'i']]
Enumerating b_list in reversed order :
i el L-i b_list[L-i]
-------------------------------------
1 ['m', 6, 'i'] 5 ['m', 6, 'i']
2 ['Z', 5, 'X'] 4 ['Z', 5, 'X']
3 ['d', 4, 'h'] 3 ['d', 4, 'h']
4 ['e', 3, '5'] 2 ['e', 3, '5']
5 ['c', 2, 'g'] 1 ['c', 2, 'g']
6 ['a', 1, 'b'] 0 ['a', 1, 'b']
b_list after :
[['d', 4, 'h'], ['m', 6, 'i']]
The reason it is necessary to iterate over b_list in reversed order is the one given by abarnert, and it is explained in the following passage from the documentation:
Note: There is a subtlety when the sequence is being modified by the
loop (this can only occur for mutable sequences, i.e. lists). An
internal counter is used to keep track of which item is used next, and
this is incremented on each iteration. When this counter has reached
the length of the sequence the loop terminates. This means that if the
suite deletes the current (or a previous) item from the sequence, the
next item will be skipped (since it gets the index of the current item
which has already been treated). Likewise, if the suite inserts an
item in the sequence before the current item, the current item will be
treated again the next time through the loop. This can lead to nasty
bugs that can be avoided by making a temporary copy using a slice of
the whole sequence, e.g.,
for x in a[:]:
    if x < 0: a.remove(x)
http://docs.python.org/2/reference/compound_stmts.html#the-for-statement
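A different way to sidestep the problem described in that note is to build the filtered list in one step and assign it back through a slice, so nothing is deleted while iterating. A sketch using the same sample data (`a_set` is a helper name assumed for this example):

```python
a_list = [1, 2, 3, 5]
b_list = [['a', 1, 'b'], ['c', 2, 'g'], ['e', 3, '5'],
          ['d', 4, 'h'], ['Z', 5, 'X'], ['m', 6, 'i']]

a_set = set(a_list)

# Slice assignment replaces the contents of the existing list object,
# so other references to b_list see the filtered result too.
b_list[:] = [el for el in b_list if el[1] not in a_set]
print(b_list)
```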

How do I delete the Nth list item from a list of lists (column delete)?

How do I delete a "column" from a list of lists?
Given:
L = [
["a","b","C","d"],
[ 1, 2, 3, 4 ],
["w","x","y","z"]
]
I would like to delete "column" 2 to get:
L = [
["a","b","d"],
[ 1, 2, 4 ],
["w","x","z"]
]
Is there a slice or del method that will do that? Something like:
del L[:][2]

You could loop.
for x in L:
    del x[2]
If you're dealing with a lot of data, you can use a library that supports sophisticated slicing like that. However, a simple list of lists doesn't slice by column.

Just iterate through the list and delete the index you want to remove from each sublist.
For example (with L as the list of lists and index as the column to delete):
for sublist in L:
    del sublist[index]

You can do it with a list comprehension:
>>> removed = [ l.pop(2) for l in L ]
>>> print(L)
[['a', 'b', 'd'], [1, 2, 4], ['w', 'x', 'z']]
>>> print(removed)
['C', 3, 'y']
It loops over the list and pops the element at position 2 of every sublist.
You end up with a list of the removed elements and the main list without those elements.

A slightly twisted version:
index = 2 # Delete column 2
[(x[0:index] + x[index+1:]) for x in L]
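Unlike del or pop(), the slicing version builds new rows and leaves L untouched. A small sketch that wraps it in a function to make this visible (the name `without_column` is hypothetical):

```python
def without_column(rows, index):
    # Build new rows from the parts before and after the given index.
    return [row[:index] + row[index + 1:] for row in rows]

L = [
    ["a", "b", "C", "d"],
    [1, 2, 3, 4],
    ["w", "x", "y", "z"],
]

trimmed = without_column(L, 2)
print(trimmed)
print(L)  # the original rows still have four items each
```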

[(x[0], x[1], x[3]) for x in L]
It works fine.

This is a very easy way to remove whatever column you want.
L = [
["a","b","C","d"],
[ 1, 2, 3, 4 ],
["w","x","y","z"]
]
temp = [[x[0],x[1],x[3]] for x in L] # list only the columns you want to keep
print(temp)
Output: [['a', 'b', 'd'], [1, 2, 4], ['w', 'x', 'z']]

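L = [['a', 'b', 'C', 'd'], [1, 2, 3, 4], ['w', 'x', 'y', 'z']]
# caution: remove() deletes the first item equal to i[2], which is only
# the item at index 2 if that value does not appear earlier in the row
_ = [i.remove(i[2]) for i in L]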

If you don't mind creating a new list, you can try the following:
filter_col = lambda lVals, iCol: [[x for i,x in enumerate(row) if i!=iCol] for row in lVals]
filter_col(L, 2)

An alternative to pop():
[x.__delitem__(n) for x in L]
Here n is the index of the elements to be deleted.
