I have a list that has 10,000 lists of strings of different lengths. For this question, I will make it simple and give an example of only a list that has 10 lists as follows.
list = [['a','w','r', 't'], ['e','r', 't', 't', 'r', 'd', 's'], ['a','w','r', 't'], ['n', 'g', 'd', 'e', 's'], ['a', 'b', 'c'], ['t', 'f', 'h', 'd', 'p'], ['a', 'b', 'c'], ['a','w','r', 't'], ['s','c','d'], ['e','r', 't', 't', 'r', 'd', 's']]
What I want is to compare each list with all the other lists, group identical lists into one new list (called a cluster), and also group the indices of those lists.
Expected output:
cluster_1_lists = [['a','w','r', 't'], ['a','w','r', 't'], ['a','w','r', 't']]
cluster_1_indices = [0,2,7]
cluster_2_lists = [['e','r', 't', 't', 'r', 'd', 's'],['e','r', 't', 't', 'r', 'd', 's']]
cluster_2_indices = [1,9]
cluster_3_lists = [['n', 'g', 'd', 'e', 's']]
cluster_3_indices = [3]
cluster_4_lists = [['a', 'b', 'c'], ['a', 'b', 'c']]
cluster_4_indices = [4,6]
cluster_5_lists = [['t', 'f', 'h', 'd', 'p']]
cluster_5_indices = [5]
cluster_6_lists = [['s','c','d']]
cluster_6_indices = [8]
Can you help me to implement this in python?
Ok so here, I'll basically be using a dictionary to make a cluster. Here's what I've done:
list = [['a','w','r', 't'], ['e','r', 't', 't', 'r', 'd', 's'], ['a','w','r', 't'], ['n', 'g', 'd', 'e', 's'], ['a', 'b', 'c'], ['t', 'f', 'h', 'd', 'p'], ['a', 'b', 'c'], ['a','w','r', 't'], ['s','c','d'], ['e','r', 't', 't', 'r', 'd', 's']]
cluster = {}
for i in list:
    cluster[''.join(i)] = []
    cluster[''.join(i)+'_indices'] = []
# iterate over every index; range(len(list)-1) would skip the last list
for j in range(len(list)):
    for k in cluster:
        if ''.join(list[j]) == k:
            cluster[k].append(list[j])
            cluster[k+'_indices'].append(j)
print(cluster)
The first for loop creates one key per distinct list by joining its strings, because a list cannot be used as a dictionary key. Each key starts with an empty list as its value. The second for loop iterates over the indices of the list and, inside it, over the keys of cluster (the dict). Whenever the joined list equals the key, the list and its index are appended to the matching values. The output will look like this:
Output: {'awrt': [['a', 'w', 'r', 't'], ['a', 'w', 'r', 't'], ['a', 'w', 'r', 't']], 'awrt_indices': [0, 2, 7], 'erttrds': [['e', 'r', 't', 't', 'r', 'd', 's'], ['e', 'r', 't', 't', 'r', 'd', 's']], 'erttrds_indices': [1, 9], 'ngdes': [['n', 'g', 'd', 'e', 's']], 'ngdes_indices': [3], 'abc': [['a', 'b', 'c'], ['a', 'b', 'c']], 'abc_indices': [4, 6], 'tfhdp': [['t', 'f', 'h', 'd', 'p']], 'tfhdp_indices': [5], 'scd': [['s', 'c', 'd']], 'scd_indices': [8]}
Note: Creating separate variables as you want will just make the code messy. Python's answer to this is the dictionary, which is why I've used one.
Here is the working answer:
cluster = {}
for i in list:
    cluster[''.join(i)] = []
xx = []
xx_idx = []
for k in cluster:
    yy = []
    yy_idx = []
    # collect every list (and its index) whose joined form equals this key
    for j in range(len(list)):
        if k == ''.join(list[j]):
            yy.append(list[j])
            yy_idx.append(j)
    xx.append(yy)
    xx_idx.append(yy_idx)
print("output", xx)
print("indices: ", xx_idx)
Output:
output [[['a', 'w', 'r', 't'], ['a', 'w', 'r', 't'], ['a', 'w', 'r', 't']], [['e', 'r', 't', 't', 'r', 'd', 's'], ['e', 'r', 't', 't', 'r', 'd', 's']], [['n', 'g', 'd', 'e', 's']], [['a', 'b', 'c'], ['a', 'b', 'c']], [['t', 'f', 'h', 'd', 'p']], [['s', 'c', 'd']]]
indices: [[0, 2, 7], [1, 9], [3], [4, 6], [5], [8]]
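A more compact variant, offered as a sketch rather than a definitive solution: group in a single pass with collections.defaultdict, using tuple(sublist) as the key. The names data, groups, cluster_indices and cluster_lists are my own assumptions (the question calls the input list, which shadows the built-in). Tuple keys also sidestep a pitfall of ''.join: ['ab', 'c'] and ['a', 'bc'] would both join to 'abc'.

```python
from collections import defaultdict

data = [['a', 'w', 'r', 't'], ['e', 'r', 't', 't', 'r', 'd', 's'],
        ['a', 'w', 'r', 't'], ['n', 'g', 'd', 'e', 's'], ['a', 'b', 'c'],
        ['t', 'f', 'h', 'd', 'p'], ['a', 'b', 'c'], ['a', 'w', 'r', 't'],
        ['s', 'c', 'd'], ['e', 'r', 't', 't', 'r', 'd', 's']]

# Map each distinct sublist (as a hashable tuple) to the indices where it occurs.
groups = defaultdict(list)
for index, sublist in enumerate(data):
    groups[tuple(sublist)].append(index)

# Dicts keep insertion order, so clusters appear in order of first occurrence.
cluster_indices = list(groups.values())
cluster_lists = [[data[i] for i in indices] for indices in cluster_indices]

print(cluster_indices)  # [[0, 2, 7], [1, 9], [3], [4, 6], [5], [8]]
```

Each list is visited once, which should matter with 10,000 lists, since the nested loops above compare every list against every key.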
I have 2 lists (x and y) and I want to output in x, y (just the print out). May I know how to do it? Do I need a loop to loop through each item in the x and y list?
input :
x = [['A', 'B'], ['C', 'D'], ['F', 'G']]
y = [['L', 'M'], ['J', 'K'], ['O', 'P', 'Q']]
output :
x, y format
['A', 'B'] ['L', 'M']
['C', 'D'] ['J', 'K']
['F', 'G'] ['O', 'P', 'Q']
The closest I got is as below :
for row in x:
    n = []
    for loop in y :
        for x in loop :
            n.append(x)
        print(' '.join(row).strip().split()) , n
Output :
['A', 'B'] ['L', 'M']
['A', 'B'] ['L', 'M', 'J', 'K']
['A', 'B'] ['L', 'M', 'J', 'K', 'O', 'P', 'Q']
['C', 'D'] ['L', 'M']
['C', 'D'] ['L', 'M', 'J', 'K']
['C', 'D'] ['L', 'M', 'J', 'K', 'O', 'P', 'Q']
['F', 'G'] ['L', 'M']
['F', 'G'] ['L', 'M', 'J', 'K']
['F', 'G'] ['L', 'M', 'J', 'K', 'O', 'P', 'Q']
You can use zip to make tuples of elements of your lists:
list(zip(x, y))
Produces:
[(['A', 'B'], ['L', 'M']),
(['C', 'D'], ['J', 'K']),
(['F', 'G'], ['O', 'P', 'Q'])]
The resulting list is, in this example, of length 3. The first element is:
>>> list(zip(x, y))[0]
(['A', 'B'], ['L', 'M'])
In order to print the tuples with a space in between:
for a, b in zip(x, y):
    print(f'{a} {b}')
Output:
['A', 'B'] ['L', 'M']
['C', 'D'] ['J', 'K']
['F', 'G'] ['O', 'P', 'Q']
For example there are three lists:
unsorted_key = ['q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p']
sorted_key = ['e', 'i', 'o', 'p', 'q', 'r', 't', 'u', 'w', 'y']
ciphertext = [
['u', 't', 'x', 'e'],
['p', 'r', 'k', 'p'],
['v', 'n', 'x', 'a'],
['n', 'h', 'e', 'x'],
['x', 'h', 'm', 's'],
['l', 'x', 'c', 'x'],
['x', 'c', 'y', 'a'],
['t', 'u', 'o', 'x'],
['e', 'r', 'm', 'e'],
['y', 'y', 'e', 'x']
]
Is it possible to take the order of the sorted_key and sort it into the unsorted_key, and take the order of the ciphertext and sort it in an identical way?
When moving 'q' from sorted_key[4] to sorted_key[0], it should move ciphertext[4] to ciphertext[0].
All three lists will always be of equal length.
The sorted_key and unsorted_key will never have repeating elements.
The sorted_key will always be a sorted version of unsorted_key.
I've been thinking about it and the only way I can think of would be to use a helper function to dynamically generate and return a lambda function from the order of unsorted_key, and then use something like:
sorted_key, ciphertext = (list(i) for i in zip(*sorted(zip(sorted_key, ciphertext), key=generate(unsorted_key))))
But I really don't know how zip() or lambda functions work or how to make a custom sorting order into one, or if one can even be returned to be used in sorted(). I really can't seem to wrap my head around this problem, so any help would be greatly appreciated!
An efficient approach that solves this problem in linear time is to create a dict that maps each key to its index in sorted_key, and then a mapping dict that maps indices of unsorted_key to indices of sorted_key via those keys. You can then iterate an index over the length of ciphertext to generate a list in the mapped order:
order = dict(map(reversed, enumerate(sorted_key)))
mapping = {i: order[k] for i, k in enumerate(unsorted_key)}
print([ciphertext[mapping[i]] for i in range(len(ciphertext))])
This outputs:
[['x', 'h', 'm', 's'], ['e', 'r', 'm', 'e'], ['u', 't', 'x', 'e'], ['l', 'x', 'c', 'x'], ['x', 'c', 'y', 'a'], ['y', 'y', 'e', 'x'], ['t', 'u', 'o', 'x'], ['p', 'r', 'k', 'p'], ['v', 'n', 'x', 'a'], ['n', 'h', 'e', 'x']]
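If dict(map(reversed, enumerate(...))) is hard to read, the same idea can be sketched with a plain dict comprehension. This is only a restatement of the approach above under the question's assumptions (unique keys, equal lengths); the name reordered is mine.

```python
unsorted_key = ['q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p']
sorted_key = ['e', 'i', 'o', 'p', 'q', 'r', 't', 'u', 'w', 'y']
ciphertext = [
    ['u', 't', 'x', 'e'], ['p', 'r', 'k', 'p'], ['v', 'n', 'x', 'a'],
    ['n', 'h', 'e', 'x'], ['x', 'h', 'm', 's'], ['l', 'x', 'c', 'x'],
    ['x', 'c', 'y', 'a'], ['t', 'u', 'o', 'x'], ['e', 'r', 'm', 'e'],
    ['y', 'y', 'e', 'x'],
]

# Position of each key letter inside sorted_key, e.g. order['q'] == 4.
order = {key: index for index, key in enumerate(sorted_key)}

# Walk unsorted_key in its own order and pick the row that travels with each letter.
reordered = [ciphertext[order[key]] for key in unsorted_key]

print(reordered[0])  # ['x', 'h', 'm', 's']
```

Both the dict construction and the final pass touch each element once, which is where the linear running time comes from.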
The builtin sorted with a custom key can do it for you:
sorted(ciphertext, key=lambda x: unsorted_key.index(sorted_key[ciphertext.index(x)]))
Output:
[['x', 'h', 'm', 's'],
['e', 'r', 'm', 'e'],
['u', 't', 'x', 'e'],
['l', 'x', 'c', 'x'],
['x', 'c', 'y', 'a'],
['y', 'y', 'e', 'x'],
['t', 'u', 'o', 'x'],
['p', 'r', 'k', 'p'],
['v', 'n', 'x', 'a'],
['n', 'h', 'e', 'x']]
The lambda basically boils down to:
Find the current index
Find the value of current index in sorted_key
Find the index of sorted_key value in unsorted_key
Sort it
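The steps above can be sketched as a named function, which may be easier to follow than the one-line lambda. The function name position_in_unsorted is hypothetical. Note the assumptions: ciphertext.index(row) only works as intended if no two rows of ciphertext are identical, and every .index call is a linear scan.

```python
unsorted_key = ['q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p']
sorted_key = ['e', 'i', 'o', 'p', 'q', 'r', 't', 'u', 'w', 'y']
ciphertext = [
    ['u', 't', 'x', 'e'], ['p', 'r', 'k', 'p'], ['v', 'n', 'x', 'a'],
    ['n', 'h', 'e', 'x'], ['x', 'h', 'm', 's'], ['l', 'x', 'c', 'x'],
    ['x', 'c', 'y', 'a'], ['t', 'u', 'o', 'x'], ['e', 'r', 'm', 'e'],
    ['y', 'y', 'e', 'x'],
]

def position_in_unsorted(row):
    current_index = ciphertext.index(row)    # 1. find the row's current index
    key_letter = sorted_key[current_index]   # 2. letter at that index in sorted_key
    return unsorted_key.index(key_letter)    # 3. where that letter sits in unsorted_key

# 4. sort the rows by that target position
result = sorted(ciphertext, key=position_in_unsorted)

print(result[0])  # ['x', 'h', 'm', 's']
```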
The one thing that I'm not clear about is why do you need to "sort" sorted_key if the end result is identical to unsorted_key? Just sorted_key = unsorted_key[:] is simple enough if that's the case. But if you really need to sort sorted_key as well, you can do this (it would actually make the lambda simpler):
ciphertext, sorted_key = map(list, zip(*sorted(zip(ciphertext, sorted_key), key=lambda x: unsorted_key.index(x[1]))))
ciphertext
[['x', 'h', 'm', 's'],
['e', 'r', 'm', 'e'],
['u', 't', 'x', 'e'],
['l', 'x', 'c', 'x'],
['x', 'c', 'y', 'a'],
['y', 'y', 'e', 'x'],
['t', 'u', 'o', 'x'],
['p', 'r', 'k', 'p'],
['v', 'n', 'x', 'a'],
['n', 'h', 'e', 'x']]
sorted_key
['q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p']
I'm not sure I get the point, but...
First determine the moves (it may need to be the opposite direction, it's not clear to me):
moves = [ [i, sorted_key.index(c)] for i, c in enumerate(unsorted_key) ]
#=> [[0, 4], [1, 8], [2, 0], [3, 5], [4, 6], [5, 9], [6, 7], [7, 1], [8, 2], [9, 3]]
Maybe swap elements in [i, sorted_key.index(c)].
Apply the moves to a receiver (res):
res = [ None for _ in range(len(ciphertext))]
for a, b in moves:
    res[a] = ciphertext[b]
So the output should be:
for line in res:
    print(line)
# ['x', 'h', 'm', 's']
# ['e', 'r', 'm', 'e']
# ['u', 't', 'x', 'e']
# ['l', 'x', 'c', 'x']
# ['x', 'c', 'y', 'a']
# ['y', 'y', 'e', 'x']
# ['t', 'u', 'o', 'x']
# ['p', 'r', 'k', 'p']
# ['v', 'n', 'x', 'a']
# ['n', 'h', 'e', 'x']
For testing execution time:
import timeit, functools
def custom_sort(ciphertext, sorted_key, unsorted_key):
    return [ ciphertext[b] for _, b in [ [i, sorted_key.index(c)] for i, c in enumerate(unsorted_key) ] ]
# use a separate name for the timer so the function itself is not overwritten
timer = timeit.Timer(functools.partial(custom_sort, ciphertext, sorted_key, unsorted_key))
print(timer.timeit(20000))
I'm not sure I'm understanding your question properly, but if you're attempting to sort the unsorted key, and ensure that the ciphertexts are sorted accordingly, this should do what you want:
pairs = zip(unsorted_key, ciphertext)
sorted_key = []
sorted_ciphertexts = []
for t in sorted(pairs):
    sorted_key.append(t[0])
    sorted_ciphertexts.append(t[1])
I'm sure there's probably a more elegant way to do it, but this will ensure that the key and ciphertexts are placed at the same index.
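One possibly more compact form, as a sketch only: sort the pairs by key, then "unzip" them with zip(*...). It assumes the input is non-empty (unpacking an empty zip would fail) and, like the loop above, that the goal is to sort unsorted_key and carry the ciphertext rows along.

```python
unsorted_key = ['q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p']
ciphertext = [
    ['u', 't', 'x', 'e'], ['p', 'r', 'k', 'p'], ['v', 'n', 'x', 'a'],
    ['n', 'h', 'e', 'x'], ['x', 'h', 'm', 's'], ['l', 'x', 'c', 'x'],
    ['x', 'c', 'y', 'a'], ['t', 'u', 'o', 'x'], ['e', 'r', 'm', 'e'],
    ['y', 'y', 'e', 'x'],
]

# Sort (key, row) pairs by the key letter only, so rows are never compared.
pairs = sorted(zip(unsorted_key, ciphertext), key=lambda pair: pair[0])

# zip(*pairs) splits the pairs back into a column of keys and a column of rows.
sorted_key, sorted_ciphertexts = (list(column) for column in zip(*pairs))

print(sorted_key)             # ['e', 'i', 'o', 'p', 'q', 'r', 't', 'u', 'w', 'y']
print(sorted_ciphertexts[0])  # ['v', 'n', 'x', 'a']
```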
I have a list of lists, and I would like to duplicate the effect of itertools.product() without using any element more than once.
>>> list = [['A', 'B'], ['C', 'D'], ['A', 'B']]
>>> [''.join(e) for e in itertools.product(*list)]
['ACA', 'ACB', 'ADA', 'ADB', 'BCA', 'BCB', 'BDA', 'BDB']
>>> # Desired output: ['ACB', 'ADB', 'BCA', 'BDA']
The list I need to use this on is too large to compute itertools.product and remove the unneeded elements. (25 billion permutations from itertools.product, while my desired output only has ~500,000). Preferably, an answer will be iterable.
Edit: I know "product" is the wrong term for what I need, but I'm struggling to find the word I'm looking for.
Edit2: This is the list that I desire to perform this operation on:
[['A', 'H', 'L'], ['X', 'B', 'I'], ['Q', 'C', 'V'], ['D', 'N'], ['E', 'F'], ['E', 'F'], ['G'], ['A', 'H', 'L'], ['X', 'B', 'I'], ['W', 'U', 'J', 'K', 'M'], ['W', 'U', 'J', 'K', 'M'], ['A', 'H', 'L'], ['W', 'U', 'J', 'K', 'M'], ['D', 'N'], ['P', 'O', 'T'], ['P', 'O', 'T'], ['Q', 'C', 'V'], ['R'], ['S'], ['P', 'O', 'T'], ['W', 'U', 'J', 'K', 'M'], ['Q', 'C', 'V'], ['W', 'U', 'J', 'K', 'M'], ['X', 'B', 'I']]
A simple stack-based implementation:
def product1(l): return product1_(l,0,[])
def product1_(l,i,buf):
    if i==len(l): yield buf
    else:
        for x in l[i]:
            if x not in buf:
                buf.append(x)
                yield from product1_(l,i+1,buf)
                buf.pop()
This is a bit slower than Patrick Haugh's answer (I get 18 s for your test case), but it gives the results in a predictable order.
Note that you have to process each value as it is generated, since every yield hands back the same list buf; you could write yield tuple(buf) or yield "".join(buf) to generate separate "cooked" values (at a cost of less than one additional second).
If the values are letters, you could use a "bitmask" list to implement the collision test, which reduces the time to about 13 s (but using a set is just as fast). Other possible optimizations include processing lists with fewer eligible elements first, to reduce backtracking; this can get this case down to 11 s.
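As a rough sketch of the set-based collision test mentioned above, combined with yielding "".join(buf) so each result is a separate string: the names product_unique, walk and used are hypothetical, and the items are assumed to be hashable strings. I have not timed this version, so the figures quoted above do not apply to it.

```python
def product_unique(lists):
    """Yield strings taking one item from each sublist, never reusing an item."""
    buf = []      # items chosen so far, in order
    used = set()  # the same items, kept in a set for fast membership tests

    def walk(i):
        if i == len(lists):
            yield "".join(buf)  # a fresh string, so it is safe to store
            return
        for x in lists[i]:
            if x not in used:
                buf.append(x)
                used.add(x)
                yield from walk(i + 1)
                # backtrack: undo the choice before trying the next item
                used.remove(x)
                buf.pop()

    return walk(0)

print(list(product_unique([['A', 'B'], ['C', 'D'], ['A', 'B']])))
# ['ACB', 'ADB', 'BCA', 'BDA']
```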
test1 = [['A', 'B'], ['C', 'D'], ['A', 'B']]
test2 = [['A', 'H', 'L'], ['X', 'B', 'I'], ['Q', 'C', 'V'], ['D', 'N'], ['E', 'F'], ['E', 'F'], ['G'], ['A', 'H', 'L'],
['X', 'B', 'I'], ['W', 'U', 'J', 'K', 'M'], ['W', 'U', 'J', 'K', 'M'], ['A', 'H', 'L'],
['W', 'U', 'J', 'K', 'M'], ['D', 'N'], ['P', 'O', 'T'], ['P', 'O', 'T'], ['Q', 'C', 'V'], ['R'], ['S'],
['P', 'O', 'T'], ['W', 'U', 'J', 'K', 'M'], ['Q', 'C', 'V'], ['W', 'U', 'J', 'K', 'M'], ['X', 'B', 'I']]
def prod(curr, *others):
    if not others:
        for x in curr:
            yield {x} # (x,) for tuples
    else:
        for o in prod(*others):
            for c in curr:
                if c not in o:
                    yield {c, *o} # (c, *o) for tuples
print([''.join(x) for x in prod(*test1)])
# e.g. ['ABC', 'ABD', 'ABC', 'ABD'] (sets are unordered, so the letter order within each string may vary)
print(sum(1 for x in prod(*test2)))
# 622080
The longer input takes about five seconds to run on my machine. I use sets to pass values around because they are much more efficient than tuples or lists when it comes to membership checks. If necessary, you can use tuples; it will just be slower.
Some questions to think about: does order matter? What do you want to happen when we can't use an item from the current list (because they've all already been used)?
Your specific case has an interesting property. If we arrange it in a counter, we see that every sublist occurs exactly as many times as it has entries:
Counter({('A', 'H', 'L'): 3,
('D', 'N'): 2,
('E', 'F'): 2,
('G',): 1,
('P', 'O', 'T'): 3,
('Q', 'C', 'V'): 3,
('R',): 1,
('S',): 1,
('W', 'U', 'J', 'K', 'M'): 5,
('X', 'B', 'I'): 3})
In other words, ignoring order, the sequences you want are the cartesian products of permutations of your lists. Suppose your list is called l. Then we can assemble a list of all the permutations of the sublists and take their product:
c = set(tuple(i) for i in l)
permutations = [list(itertools.permutations(i)) for i in c]
permutation_products = itertools.product(*permutations)
An element of permutation_products looks something like:
(('W', 'U', 'J', 'K', 'M'),
('E', 'F'),
('X', 'B', 'I'),
('Q', 'C', 'V'),
('P', 'O', 'T'),
('D', 'N'),
('G',),
('S',),
('R',),
('A', 'L', 'H'))
We have to get it back into the right order. Suppose our permutation is called perm. For each sublist of your list, we have to locate the correct element of perm and then take the next letter in the permutation. We can do this by making a dictionary:
perm_dict = {frozenset(p): list(p) for p in perm}
Then, to construct a single permutation, we have:
s = "".join([perm_dict[frozenset(i)].pop() for i in l])
We can combine all this into a generator:
import itertools

def permute_list(l):
    c = set(tuple(i) for i in l)
    permutations = [list(itertools.permutations(i)) for i in c]
    permutation_products = itertools.product(*permutations)
    for perm in permutation_products:
        # one fresh pool of letters per distinct sublist, consumed by pop()
        perm_dict = {frozenset(p): list(p) for p in perm}
        yield "".join([perm_dict[frozenset(i)].pop() for i in l])
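As a sanity check on the counting argument in this answer, here is a small sketch that builds the counter and multiplies the factorials of the sublist lengths. It assumes the list from the question's second edit is bound to l; the names counts and total are mine. If each distinct sublist of length n occurs n times, the number of results is the product of n! over the distinct sublists.

```python
import math
from collections import Counter

l = [['A', 'H', 'L'], ['X', 'B', 'I'], ['Q', 'C', 'V'], ['D', 'N'], ['E', 'F'],
     ['E', 'F'], ['G'], ['A', 'H', 'L'], ['X', 'B', 'I'],
     ['W', 'U', 'J', 'K', 'M'], ['W', 'U', 'J', 'K', 'M'], ['A', 'H', 'L'],
     ['W', 'U', 'J', 'K', 'M'], ['D', 'N'], ['P', 'O', 'T'], ['P', 'O', 'T'],
     ['Q', 'C', 'V'], ['R'], ['S'], ['P', 'O', 'T'], ['W', 'U', 'J', 'K', 'M'],
     ['Q', 'C', 'V'], ['W', 'U', 'J', 'K', 'M'], ['X', 'B', 'I']]

# Lists are not hashable, so count them as tuples.
counts = Counter(tuple(sub) for sub in l)

# The property the answer relies on: each sublist occurs len(sublist) times.
property_holds = all(len(sub) == n for sub, n in counts.items())

# One permutation per distinct sublist: 3!^4 * 2!^2 * 1!^3 * 5!
total = math.prod(math.factorial(len(sub)) for sub in counts)

print(property_holds, total)  # True 622080
```

The total agrees with the 622080 reported for the recursive generator in the other answer.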
Here I have a word list as:
[['r', 'o', 't', 'o', 'r'], ['e', 'v', 'e', 'i', 'a'], ['f', 'i', 'n', 'e', 'd'], ['e', 'n', 'e', 't', 'a'], ['r', 'a', 't', 'e', 'r']]
And I have to display all the palindromes in this list which are in rows as well as columns.
I have coded to find all the palindromes in the rows. But cannot implement a method to find the palindromes in the columns.
Here is my code so far:
result_1=""
if len(palindrome)==len_line_str:
    for row in range(len(palindrome)):
        for horizontal_line in range(len(palindrome[row])):
            if ''.join(palindrome[row])==''.join(reversed(palindrome[row])):
                result_1=''.join(palindrome[row])+" is a palindrome starting at ["+str(row)+"]["+str(row)+"] and is a row in the table"
    print(result_1)
Which will display the output:
rotor is a palindrome starting at [0][0] and is a row in the table
Where "rotor" is a palindrome.
I need a method to get the palindromes in the columns which are:
"refer", "tenet", "radar"
Any help is much appreciated. Thanks in advance!
You can use zip to transpose your lists:
>>> t = [['r', 'o', 't', 'o', 'r'], ['e', 'v', 'e', 'i', 'a'], ['f', 'i', 'n', 'e', 'd'], ['e', 'n', 'e', 't', 'a'], ['r', 'a', 't', 'e', 'r']]
>>> list(zip(*t))
[('r', 'e', 'f', 'e', 'r'), ('o', 'v', 'i', 'n', 'a'), ('t', 'e', 'n', 'e', 't'), ('o', 'i', 'e', 't', 'e'), ('r', 'a', 'd', 'a', 'r')]
Your columns are now rows, and you can apply the same method as before. If you just need the words, you can use list comprehensions:
>>> rows = [['r', 'o', 't', 'o', 'r'], ['e', 'v', 'e', 'i', 'a'], ['f', 'i', 'n', 'e', 'd'], ['e', 'n', 'e', 't', 'a'], ['r', 'a', 't', 'e', 'r']]
>>> [''.join(row) for row in rows if row[::-1] == row ]
['rotor']
>>> [''.join(column) for column in zip(*rows) if column[::-1] == column ]
['refer', 'tenet', 'radar']
This will do the job:
palindrome=[['r', 'o', 't', 'o', 'r'], ['e', 'v', 'e', 'i', 'a'], ['f', 'i', 'n', 'e', 'd'], ['e', 'n', 'e', 't', 'a'], ['r', 'a', 't', 'e', 'r']]
n=len(palindrome)
for col in range(len(palindrome[0])):
    col_word=[palindrome[i][col] for i in range(n)]
    if ''.join(col_word)==''.join(reversed(col_word)):
        result=''.join(col_word)+" is a palindrome starting at ["+str(col)+"] and is a col in the table"
        print(result)
This prints
refer is a palindrome starting at [0] and is a col in the table
tenet is a palindrome starting at [2] and is a col in the table
radar is a palindrome starting at [4] and is a col in the table
Basically, in order to access the words in the column, you can do
col_word=[palindrome[i][col] for i in range(n)]
This fixes the column and iterates over the rows. The rest of the code is structured similarly to yours.
I saw you did not want to use zip (which I would still recommend using):
Alternative answer:
list_ = [['r', 'o', 't', 'o', 'r'], ['e', 'v', 'e', 'i', 'a'], ['f', 'i', 'n', 'e', 'd'], ['e', 'n', 'e', 't', 'a'], ['r', 'a', 't', 'e', 'r']]
You can get the palindromes (rows) by checking each list with the reversed list [::-1]:
[i==i[::-1] for i in list_]
# prints [True, False, False, False, False]
And get the palindromes (columns) in two steps: 1. create the column list (called list_2 below) with a list comprehension, and 2. apply the same principle as above:
list_2 = [[i[ind] for i in list_] for ind in range(len(list_[0]))]
[i==i[::-1] for i in list_2]
# prints [True, False, True, False, True]
Update
If you want the answers directly you can do:
[i for i in list_ if i==i[::-1]]
# prints [['r', 'o', 't', 'o', 'r']]
# and list_2: [['r', 'e', 'f', 'e', 'r'],['t', 'e', 'n', 'e', 't'],['r', 'a', 'd', 'a', 'r']]
There are many ways to do it. I will use your code as the starting point, since you put effort into it.
Another alternative, following your code, is to build the columns in a separate list and check which of them are palindromes:
palindrome = [['r', 'o', 't', 'o', 'r'],
              ['e', 'v', 'e', 'i', 'a'],
              ['f', 'i', 'n', 'e', 'd'],
              ['e', 'n', 'e', 't', 'a'],
              ['r', 'a', 't', 'e', 'r']]
len_line_str = 5
result_1 = ""
def is_pal(letters):
    # reversed() returns an iterator, so compare against a reversed copy instead
    return letters == letters[::-1]
columns = []
if len(palindrome) == len_line_str:
    for row in range(len(palindrome)):
        if is_pal(palindrome[row]):
            result_1 += ''.join(palindrome[row]) + " is a palindrome starting at [" + str(row) + "][0] and is a row in the table. " + "\n"
        # build column number `row` by reading that position from every row
        vertical = []
        for horizontal_line in range(len(palindrome[row])):
            vertical += [palindrome[horizontal_line][row]]
        columns += [(vertical, row)]
for word in columns:
    if is_pal(word[0]):
        result_1 += ''.join(word[0]) + " is a palindrome starting at [0][" + str(word[1]) + "] and is a column in the table" + "\n"
print(result_1)
This should work. The outer loop iterates over the column indices, and the inner loop collects that column's letter from each row of s.
Assuming s is the name of the list: [['r', 'o', 't', 'o', 'r'], ['e', 'v', 'e', 'i', 'a'], ['f', 'i', 'n', 'e', 'd'], ['e', 'n', 'e', 't', 'a'], ['r', 'a', 't', 'e', 'r']]
for i in range(len(s)):
    word = ""
    for j in s:
        word = word + j[i]
    print(word)
    if word == word[::-1]:
        print(word, "is a palindrome - column", i)
    else:
        print(word, "is not a palindrome - column", i)
Python lists have no built-in column-wise traversal. One workaround is to transpose your input matrix. Below is a simple way to implement the transpose using list comprehensions.
def transpose(matrix):
    if not matrix:
        return []
    return [[row[i] for row in matrix] for i in range(len(matrix[0]))]
Your same logic should work once you transform your input using transpose.
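For instance, a minimal sketch of how the transposed matrix could feed the palindrome check, using the table from the question. The names grid and column_palindromes are assumptions, and all rows are assumed to have the same length.

```python
def transpose(matrix):
    if not matrix:
        return []
    # Column i of the input becomes row i of the output.
    return [[row[i] for row in matrix] for i in range(len(matrix[0]))]

grid = [['r', 'o', 't', 'o', 'r'],
        ['e', 'v', 'e', 'i', 'a'],
        ['f', 'i', 'n', 'e', 'd'],
        ['e', 'n', 'e', 't', 'a'],
        ['r', 'a', 't', 'e', 'r']]

# After transposing, each column is an ordinary list, so the row check applies as-is.
column_palindromes = [''.join(col) for col in transpose(grid) if col == col[::-1]]

print(column_palindromes)  # ['refer', 'tenet', 'radar']
```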
Hope this helps!!