Suppose you have a list of tuples in Python like this:
list1 = [(1,2), (3,1), (4,1), (1,2), (3,1)]
I want to find out whether any run of two consecutive elements is repeated. Here (1,2) followed by (3,1) occurs twice, and I want to set a variable, say i, to 2 based on that.
How would you do this in Python?
Here is how I solved it: I looped through the original list, built a combined tuple from each pair of adjacent elements, then checked which of those pairs occur more than once.
lst = [(1,2), (3,1), (4,1), (1,2), (3,1)]
# iterate through list and append each pair of values
tmp = []
for i in range(0, len(lst)-1):
    tmp.append(lst[i] + lst[i+1])
# see if any of the pairs appear twice, if so add to new list and get the length of that list to be your count
output = [x for x in tmp if tmp.count(x) > 1]
i = len(output)
print(i)
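The count above calls tmp.count for every pair, which rescans the list each time. A minimal sketch of the same idea using collections.Counter, which tallies the adjacent pairs in one pass; the function name count_repeated_consecutive_pairs is a hypothetical helper, not part of the original answer:

```python
from collections import Counter

def count_repeated_consecutive_pairs(items):
    # zip(items, items[1:]) yields each element together with its successor
    pair_counts = Counter(zip(items, items[1:]))
    # add up the occurrences of every adjacent pair seen more than once
    return sum(n for n in pair_counts.values() if n > 1)

lst = [(1, 2), (3, 1), (4, 1), (1, 2), (3, 1)]
i = count_repeated_consecutive_pairs(lst)
print(i)  # 2
```

Like the original, this counts every occurrence of a repeated pair, so a pair seen three times would contribute 3.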
I have a list of lists of strings that looks roughly like this. Each inner list always contains 5 elements:
text_list = [['aaa','bbb','ccc','ddd','eee'],
             ['fff','ggg','hhh','iii','jjj'],
             ['xxx','mmm','ccc','bbb','aaa'],
             ['fff','xxx','aaa','bbb','ddd'],
             ['aaa','bbb','ccc','ddd','eee'],
             ['fff','xxx','aaa','ddd','eee'],
             ['iii','xxx','ggg','jjj','aaa']]
The objective is simple: take the first 3 elements of each list and group that list with every other list that contains all 3 of those elements.
So for the example above, the output might look like this (each number is the index of a list in text_list):
[[0,2,4],[3,5]]
Notice that a group containing the same indices as another group, just in a different order, is removed.
I've written the following code to extract the groups, but it returns duplicates and I am unsure how to proceed. I also suspect this is not the most efficient way to do the extraction, as the real list can contain millions of groups:
grouped_list = []
for i in range(0,len(text_list)):
    int_temp = []
    for m in range(0,len(text_list)):
        if i == m:
            continue
        bool_check = all( x in text_list[m] for x in text_list[i][0:3])
        if bool_check:
            if len(int_temp) == 0:
                int_temp.append(i)
                int_temp.append(m)
                continue
            int_temp.append(m)
    grouped_list.append(int_temp)
## remove index with no groups
grouped_list = [x for x in grouped_list if x != []]
Is there a better way to go about this? How do I remove the duplicate group afterwards? Thank you.
Edit:
To be clearer, I would like to retrieve the lists that are similar to each other, comparing only on the first 3 elements of each list. For example, using the first 3 elements of list A, check whether lists B, C, D, ... contain all 3 of those elements. Repeat for every list, then remove any resulting group that contains the same elements as another group.
You can build a set of frozensets to keep track of the groups of indices. For each sub-list, a group holds the indices of all sub-lists that contain its first 3 items as a subset:
groups = set()
sets = list(map(set, text_list))
for i, lst in enumerate(text_list):
    groups.add(frozenset((i, *(j for j, s in enumerate(sets) if set(lst[:3]) <= s))))
print([sorted(group) for group in groups if len(group) > 1])
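The snippet above leans on two set features, sketched here in isolation with illustrative values: `<=` tests whether one set is a subset of another, and a frozenset is hashable, so two groups with the same members collapse into a single entry when added to a set.

```python
# `<=` on sets is the subset test used to compare the first 3 items
first_three = {'aaa', 'bbb', 'ccc'}
other = {'xxx', 'mmm', 'ccc', 'bbb', 'aaa'}
is_subset = first_three <= other  # all three items occur in `other`

# frozensets are hashable, so a set can hold them and drops repeats
groups = set()
groups.add(frozenset((0, 2, 4)))
groups.add(frozenset((4, 0, 2)))  # same members, different order
print(is_subset, len(groups))  # True 1
```

This is why the duplicate groups from the question's nested loops disappear without an explicit clean-up step.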
If the input list is long, it would be faster to create a set of frozensets of the first 3 items of all sub-lists, then use that set to filter all combinations of 3 items from each sub-list. Despite the overhead of generating the combinations, the time complexity is then essentially linear in the length of the input list rather than quadratic:
from itertools import combinations
sets = {frozenset(lst[:3]) for lst in text_list}
groups = {}
for i, lst in enumerate(text_list):
    for c in map(frozenset, combinations(lst, 3)):
        if c in sets:
            groups.setdefault(c, []).append(i)
print([sorted(group) for group in groups.values() if len(group) > 1])
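For reuse, the combinations-based approach can be wrapped in a function. This is a sketch only: group_by_first_three and its k parameter are hypothetical names, and it assumes no item is repeated within a sub-list. With 5 items per sub-list, each one contributes only C(5, 3) = 10 lookups, which is where the roughly linear behaviour comes from.

```python
from itertools import combinations

def group_by_first_three(rows, k=3):
    """Group row indices by the k-item prefixes that the rows contain."""
    # every prefix (first k items, order ignored) that occurs in the input
    prefixes = {frozenset(row[:k]) for row in rows}
    groups = {}
    for i, row in enumerate(rows):
        # each k-item combination of this row that is also some row's prefix
        for combo in map(frozenset, combinations(row, k)):
            if combo in prefixes:
                groups.setdefault(combo, []).append(i)
    # keep only prefixes matched by more than one row
    return sorted(g for g in groups.values() if len(g) > 1)

text_list = [['aaa', 'bbb', 'ccc', 'ddd', 'eee'],
             ['fff', 'ggg', 'hhh', 'iii', 'jjj'],
             ['xxx', 'mmm', 'ccc', 'bbb', 'aaa'],
             ['fff', 'xxx', 'aaa', 'bbb', 'ddd'],
             ['aaa', 'bbb', 'ccc', 'ddd', 'eee'],
             ['fff', 'xxx', 'aaa', 'ddd', 'eee'],
             ['iii', 'xxx', 'ggg', 'jjj', 'aaa']]
result = group_by_first_three(text_list)
print(result)  # [[0, 2, 4], [3, 5]]
```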
Suppose I have two lists:
>>list_a=[(a,b),(b,c),(e,d),(w,z)]
>>list_b=[(f,g),(e,d),(w,z)]
>>compare_lists(list_a,list_b)
would output [(2,1),(3,2)]
Here (e,d) is a matching element in both lists: 2 is its index in list_a and 1 is its index in list_b. By the same logic, (w,z) gives (3,2). How would I go about achieving this? I am essentially trying to get a list of ordered pairs of indices into list_a and list_b wherever there is a match.
You could create a dictionary keyed by the ordered pairs in the first list, then iterate over the second list, checking against the dictionary:
list_a=[('a','b'),('b','c'),('e','d'),('w','z')]
list_b=[('f','g'),('e','d'),('w','z')]
d = {p:i for i,p in enumerate(list_a)}
matches = [(d[p],i) for i,p in enumerate(list_b) if p in d]
print(matches) #[(2, 1), (3, 2)]
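One thing to be aware of: a dict comprehension keeps only the last index when the same pair occurs more than once in list_a. If every match should be reported, a sketch along these lines stores a list of indices per pair instead. The name compare_lists is taken from the question; the body is an assumption about the desired behaviour with repeats.

```python
from collections import defaultdict

def compare_lists(list_a, list_b):
    # map each pair to every index at which it occurs in list_a
    positions = defaultdict(list)
    for i, pair in enumerate(list_a):
        positions[pair].append(i)
    # .get avoids creating empty entries for pairs that only occur in list_b
    return [(i, j)
            for j, pair in enumerate(list_b)
            for i in positions.get(pair, [])]

list_a = [('a', 'b'), ('b', 'c'), ('e', 'd'), ('w', 'z')]
list_b = [('f', 'g'), ('e', 'd'), ('w', 'z')]
print(compare_lists(list_a, list_b))  # [(2, 1), (3, 2)]
```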
In my case a duplicate is not just an item that reappears within one list, but one that also reappears at the same positions in the other lists. For example:
list1 = [1,2,3,3,3,4,5,5]
list2 = ['a','b','b','c','b','d','e','e']
list3 = ['T1','T2','T3','T4','T3','T4','T5','T5']
So the positions of the real duplicates across all 3 lists are [2,4] and [6,7]: in list1, 3 is repeated; in list2, 'b' is repeated at the same positions as in list1; and in list3, 'T3' is as well. In the second case, 5, 'e' and 'T5' are duplicated at the same positions in their lists. I am having a hard time producing the results "automatically" in one step.
1) I find the duplicates in the first list
# Find duplicated part numbers (exact matches)
def list_duplicates(seq):
    seen = set()
    seen_add = seen.add
    # adds all elements it doesn't know yet to seen and all other to seen_twice
    seen_twice = set( x for x in seq if x in seen or seen_add(x) )
    # turn the set into a list (as requested)
    return list(seen_twice)
# List of duplicated part numbers
D_list1 = list_duplicates(list1)
D_list2 = list_duplicates(list2)
2) Then I find the positions of a given duplicate and look at those positions in the second list
# find the row position of duplicated part numbers
def list_position_duplicates(data,n,D_list1):
    position = []
    # the first parameter is named data so that it matches the name used here
    gen = (i for i,x in enumerate(data) if x == D_list1[n])
    for i in gen: position.append(i)
    return position
# Actual calculation: find the row positions of duplicated part numbers, beginning and end
lpd_part = list_position_duplicates(list1,1,D_list1)
start = lpd_part[0]
end = lpd_part[-1]
lpd_parent = list_position_duplicates(list2[start:end+1],0,D_list2)
So in step 2 I have to supply n (the position of the found duplicate in the list of duplicates) by hand. I would like this step to be automatic, so that I get the positions of the elements that are duplicated at the same positions across the lists, for all duplicates at once rather than one by one "manually". I think it just needs a for loop or an if, but I'm new to Python; I tried many combinations and none of them worked.
You can use the items from all 3 lists at the same index as a key and store the corresponding indices as the value (in a list). If more than one index is stored for a key, it is a duplicate:
def solve(*lists):
    d = {}
    # zip(*lists) yields one tuple per position (itertools.izip in Python 2)
    for i, k in enumerate(zip(*lists)):
        d.setdefault(k, []).append(i)
    for k, v in d.items():
        if len(v) > 1:
            print(k, v)
solve(list1, list2, list3)
#(3, 'b', 'T3') [2, 4]
#(5, 'e', 'T5') [6, 7]
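If the positions are needed for further processing rather than just printed, the same idea can return a dictionary. A Python 3 sketch; find_aligned_duplicates is a hypothetical name, and the lists are the ones from the question:

```python
def find_aligned_duplicates(*lists):
    positions = {}
    # key: the tuple of items found at one index across all lists
    for i, key in enumerate(zip(*lists)):
        positions.setdefault(key, []).append(i)
    # keep only the keys that occur at more than one index
    return {key: idx for key, idx in positions.items() if len(idx) > 1}

list1 = [1, 2, 3, 3, 3, 4, 5, 5]
list2 = ['a', 'b', 'b', 'c', 'b', 'd', 'e', 'e']
list3 = ['T1', 'T2', 'T3', 'T4', 'T3', 'T4', 'T5', 'T5']
duplicates = find_aligned_duplicates(list1, list2, list3)
print(duplicates)  # {(3, 'b', 'T3'): [2, 4], (5, 'e', 'T5'): [6, 7]}
```

Note that zip stops at the shortest list, so this assumes all lists have the same length.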
I have N lists of 3 elements. I want to find all combinations between them that don't use the same index twice. Each combination must always have 3 items.
Example:
list1 = [l11, l12, l13]
list2 = [l21, l22, l23]
list3 = [l31, l32, l33]
All possible combinations:
combination1 = l11, l22, l33
combination2 = l11, l23, l32
combination3 = l12, l21, l33
combination4 = l12, l23, l31
combination5 = l13, l21, l32
combination6 = l13, l22, l31
BUT I don't want:
BADcombination = l11, l21, l32
How can I do that in python?
Since you want only 3 items from 3 or more lists, the first step is to find the k-permutations of the list of lists with k = 3, i.e. permutations(lists, 3). From there you don't have to permute the indexes as well, because you want unique indexes. (Note: this allows a variable number of lists and a variable length of the lists, but all input lists must have the same length, which is also the length of each output list.)
Essentially, instead of permuting the indexes, the indexes stay fixed at (0, 1, 2), since you specify no repetition of indexes, and the lists are permuted instead.
from itertools import permutations
# number of lists may vary (>= length of lists)
list1 = ["l11", "l12", "l13"]
list2 = ["l21", "l22", "l23"]
list3 = ["l31", "l32", "l33"]
list4 = ["l41", "l42", "l43"]
lists = [list1, list2, list3, list4]
# lengths of lists must be the same and will be the size of outputs
size = len(lists[0])
for subset in permutations(lists, size):
    print([sublist[item_i] for item_i, sublist in enumerate(subset)])
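When there are exactly as many lists as positions (three lists of three items, as in the question), another way to look at it is to keep the lists in order and permute the indexes instead. This is a variant sketch rather than the code above; index_permutation_combos is a hypothetical helper name, and it assumes the number of lists does not exceed their common length.

```python
from itertools import permutations

def index_permutation_combos(lists):
    size = len(lists[0])
    # each permutation assigns a distinct index to each list, in list order
    return [
        [sublist[i] for sublist, i in zip(lists, perm)]
        for perm in permutations(range(size), len(lists))
    ]

list1 = ["l11", "l12", "l13"]
list2 = ["l21", "l22", "l23"]
list3 = ["l31", "l32", "l33"]
combos = index_permutation_combos([list1, list2, list3])
for combo in combos:
    print(combo)
```

For these three lists it yields the six combinations listed in the question, each ordered by list, and never produces l11, l21, l32 because that would reuse index 0.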
So I have two lists:
shape = [1,2,4]
board = [0,0,1,0,0,0,1,1,1]
I want to map the shape indices onto the board indices using a for loop, so that I can use a shape of any size. I want the output for this shape to be [0,0,1].
I have tried doing:
list1 = []
for x in shape:
    list1.append(board[x])
But the output it gives me is [0,1,0].
I want the output to be [0,0,1], the first three items of the board.
You are currently using
list1 = []
for x in shape:
    list1.append(board[x])
This code makes list1 equal to [board[1], board[2], board[4]], since you start with an empty list and then append board[x] for each x in shape (and list indices start at 0 in Python). You probably want something like
list1 = []
for i in range(len(shape)):
    list1.append(board[i])
Now i ranges from 0 to 2, as desired.
So what you mean is that the numbers in shape make no difference? You just want the first len(shape) items in board regardless of the numbers in shape?
In that case:
board[0:len(shape)]
List indices are 0-based, so the items of board at indices [1,2,4] are in fact [0,1,0].
If you want to use 1-based indices, you need to adjust them:
list1 = [board[i-1] for i in shape]
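For this particular shape and board, the three readings discussed in the answers give different results, and only the slice reproduces the [0,0,1] the question asks for. A short side-by-side sketch (the variable names are illustrative):

```python
shape = [1, 2, 4]
board = [0, 0, 1, 0, 0, 0, 1, 1, 1]

# values in shape used directly as 0-based indices (the original attempt)
as_zero_based = [board[i] for i in shape]
# the first len(shape) items of board, ignoring the values in shape
first_n = board[:len(shape)]
# values in shape treated as 1-based indices
as_one_based = [board[i - 1] for i in shape]

print(as_zero_based)  # [0, 1, 0]
print(first_n)        # [0, 0, 1]
print(as_one_based)   # [0, 0, 0]
```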