I want to filter a list of lists for duplicates. I consider two lists to be a duplicate of each other when they contain the same elements but not necessarily in the same order. So for example
[['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
should become
[['A', 'B', 'C'], ['D', 'B', 'A']]
since ['C', 'B', 'A'] is a duplicate of ['A', 'B', 'C'].
It does not matter which one of the duplicates gets removed, as long as the final list of lists no longer contains any duplicates. All lists need to keep the order of their elements, so using set() may not be an option.
I found these related questions:
Determine if 2 lists have the same elements, regardless of order?
How to efficiently compare two unordered lists (not sets)?
But they only talk about how to compare two lists, not how to efficiently remove duplicates. I'm using Python.
Using a dictionary comprehension:
>>> data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> result = {tuple(sorted(i)): i for i in data}.values()
>>> result
dict_values([['C', 'B', 'A'], ['D', 'B', 'A']])
>>> list( result )
[['C', 'B', 'A'], ['D', 'B', 'A']]
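Note that the dictionary comprehension above keeps the last list seen for each sorted key, which is why ['C', 'B', 'A'] survives rather than ['A', 'B', 'C']. If the first occurrence should win instead, one possible sketch uses dict.setdefault. The name first_seen is only illustrative, and the result order relies on dicts preserving insertion order (guaranteed from Python 3.7):

```python
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]

first_seen = {}
for item in data:
    # setdefault stores item only if this sorted key is not present yet
    first_seen.setdefault(tuple(sorted(item)), item)

result = list(first_seen.values())
print(result)  # [['A', 'B', 'C'], ['D', 'B', 'A']]
```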
You can use frozenset
>>> x = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> [list(s) for s in set([frozenset(item) for item in x])]
[['A', 'B', 'D'], ['A', 'B', 'C']]
Or, with map:
>>> [list(s) for s in set(map(frozenset, x))]
[['A', 'B', 'D'], ['A', 'B', 'C']]
If you want to keep the order of elements:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
seen = set()
result = []
for obj in data:
    if frozenset(obj) not in seen:
        result.append(obj)
        seen.add(frozenset(obj))
Output:
[['A', 'B', 'C'], ['D', 'B', 'A']]
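One caveat with frozenset keys is that they ignore how often an element occurs, so ['A', 'A', 'B'] and ['A', 'B', 'B'] would count as duplicates. If repeated elements matter, a hedged variation of the loop above could key on element counts instead. The function name dedupe_unordered is made up for this sketch, and it assumes the elements are hashable:

```python
from collections import Counter

def dedupe_unordered(lists):
    """Keep the first list of every group with the same elements and the same counts."""
    seen = set()
    result = []
    for obj in lists:
        # (element, count) pairs distinguish ['A', 'A', 'B'] from ['A', 'B', 'B'],
        # which a plain frozenset(obj) would not
        key = frozenset(Counter(obj).items())
        if key not in seen:
            seen.add(key)
            result.append(obj)
    return result

print(dedupe_unordered([['A', 'A', 'B'], ['B', 'A', 'A'], ['A', 'B', 'B']]))
# [['A', 'A', 'B'], ['A', 'B', 'B']]
```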
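Do you want to keep the order of elements? If not, you can sort and group. groupby only merges adjacent items, so sort the data by the same key first:
from itertools import groupby
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
print([k for k, _ in groupby(sorted(data, key=sorted), key=sorted)])
Output:
[['A', 'B', 'C'], ['A', 'B', 'D']]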
In Python, it is usually easier to build a new list than to remove items from the existing one while looping over it.
The simplest way is as follows:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
seen = []  # sorted copies of the lists kept so far
temp = []
for i in data:
    key = sorted(i)
    if key not in seen:
        seen.append(key)
        temp.append(i)
print(temp)
cheers, athrv
I want to write a script to take a list of categories and return the unique ways to split the categories into 2 groups. For now I have it in tuple form (list_a, list_b) where the union of list_a and list_b represents the full list of categories.
Below is an example with categories ['A','B','C','D'], where I can get all the groups. However, some are duplicates: (['A'], ['B', 'C', 'D']) represents the same split as (['B', 'C', 'D'], ['A']). How do I retain only unique splits? Also what is a better title for this post?
import itertools
def getCompliment(smallList, fullList):
    compliment = list()
    for item in fullList:
        if item not in smallList:
            compliment.append(item)
    return compliment
optionList = ['A','B','C','D']
combos = list()
for r in range(1,len(optionList)):
    tuples = list(itertools.combinations(optionList, r))
    for t in tuples:
        combos.append((list(t),getCompliment(list(t), optionList)))
print(combos)
[(['A'], ['B', 'C', 'D']),
(['B'], ['A', 'C', 'D']),
(['C'], ['A', 'B', 'D']),
(['D'], ['A', 'B', 'C']),
(['A', 'B'], ['C', 'D']),
(['A', 'C'], ['B', 'D']),
(['A', 'D'], ['B', 'C']),
(['B', 'C'], ['A', 'D']),
(['B', 'D'], ['A', 'C']),
(['C', 'D'], ['A', 'B']),
(['A', 'B', 'C'], ['D']),
(['A', 'B', 'D'], ['C']),
(['A', 'C', 'D'], ['B']),
(['B', 'C', 'D'], ['A'])]
I need the following:
[(['A'], ['B', 'C', 'D']),
(['B'], ['A', 'C', 'D']),
(['C'], ['A', 'B', 'D']),
(['D'], ['A', 'B', 'C']),
(['A', 'B'], ['C', 'D']),
(['A', 'C'], ['B', 'D']),
(['A', 'D'], ['B', 'C'])]
You are very close. What you need is a set of results.
Since set elements must be hashable and list objects are not hashable, you can use tuple instead. This can be achieved by some trivial changes to your code.
import itertools
def getCompliment(smallList, fullList):
    compliment = list()
    for item in fullList:
        if item not in smallList:
            compliment.append(item)
    return tuple(compliment)
optionList = ('A','B','C','D')
combos = set()
for r in range(1,len(optionList)):
    tuples = list(itertools.combinations(optionList, r))
    for t in tuples:
        combos.add(frozenset((tuple(t), getCompliment(tuple(t), optionList))))
print(combos)
{frozenset({('A',), ('B', 'C', 'D')}),
frozenset({('A', 'C', 'D'), ('B',)}),
frozenset({('A', 'B', 'D'), ('C',)}),
frozenset({('A', 'B'), ('C', 'D')}),
frozenset({('A', 'C'), ('B', 'D')}),
frozenset({('A', 'D'), ('B', 'C')}),
frozenset({('A', 'B', 'C'), ('D',)})}
If you need to convert the result back to a list of lists, this is possible via a list comprehension:
res = [list(map(list, i)) for i in combos]
[[['A'], ['B', 'C', 'D']],
[['B'], ['A', 'C', 'D']],
[['A', 'B', 'D'], ['C']],
[['A', 'B'], ['C', 'D']],
[['B', 'D'], ['A', 'C']],
[['B', 'C'], ['A', 'D']],
[['A', 'B', 'C'], ['D']]]
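An alternative that avoids generating the mirrored splits in the first place is to always put the first category in the first group. The sketch below is not from the answer above; unique_splits is a hypothetical name, and it assumes the categories are distinct:

```python
from itertools import combinations

def unique_splits(options):
    """Return every way to split options into two non-empty groups, each split once."""
    first, rest = options[0], options[1:]
    splits = []
    # r = how many other items join `first`; stopping before len(rest)
    # keeps the second group non-empty
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            group_a = [first, *extra]
            group_b = [item for item in rest if item not in extra]
            splits.append((group_a, group_b))
    return splits

for split in unique_splits(['A', 'B', 'C', 'D']):
    print(split)
```

For four categories this yields seven splits. Because 'A' is pinned to the first group, a split such as (['B'], ['A', 'C', 'D']) shows up as (['A', 'C', 'D'], ['B']), which is the same split written the other way round.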
I'm trying to extract all the unique combinations of strings from a list of lists in Python. For example, in the code below, ['a', 'b','c'] and ['b', 'a', 'c'] are not unique, while ['a', 'b','c'] and ['a', 'e','f'] or ['a', 'b','c'] and ['d', 'e','f'] are unique.
I've tried converting my list of lists to a list of tuples and using sets to compare elements, but all elements are still being returned.
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
# converting list of list to list of tuples, so they can be converted into a set
combos = [tuple(item) for item in combos]
combos = set(combos)
grouping_list = set()
for combination in combos:
    if combination not in grouping_list:
        grouping_list.add(combination)
##
print grouping_list
>>> set([('a', 'b', 'c'), ('c', 'a', 'b'), ('d', 'e', 'f'), ('c', 'b', 'a'), ('c', 'f', 'b')])
How about sorting, (and using a Counter)?
from collections import Counter
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
combos = Counter(tuple(sorted(item)) for item in combos)
print(combos)
returns:
Counter({('a', 'b', 'c'): 3, ('d', 'e', 'f'): 1, ('b', 'c', 'f'): 1})
EDIT: I'm not sure if I'm correctly understanding your question. You can use a Counter to count occurrences, or use a set if you're only interested in the resulting sets of items and not in how often each one occurs.
Something like:
combos = set(tuple(sorted(item)) for item in combos)
Simply returns
set([('a', 'b', 'c'), ('d', 'e', 'f'), ('b', 'c', 'f')])
>>> set(tuple(set(combo)) for combo in combos)
{('a', 'c', 'b'), ('c', 'b', 'f'), ('e', 'd', 'f')}
This is simple, but if a combo contains repeated elements it will return the wrong answer. In that case sorting is the way to go, as suggested in the other answers.
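To make the repeated-elements caveat concrete, here is a small illustrative comparison with made-up data. It uses frozenset as the key in place of tuple(set(combo)), which is an assumption of this sketch rather than part of the answer:

```python
combos = [['a', 'a', 'b'], ['a', 'b', 'b'], ['b', 'a', 'a']]

# A set drops repeats, so all three combos collapse to the same key
by_set = {frozenset(combo) for combo in combos}

# Sorting keeps repeats, so only the first and third combos match
by_sorted = {tuple(sorted(combo)) for combo in combos}

print(len(by_set))        # 1
print(sorted(by_sorted))  # [('a', 'a', 'b'), ('a', 'b', 'b')]
```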
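How about this (it only works when every element is a single character, because each joined string is split back into characters):
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
print([list(y) for y in set([''.join(sorted(c)) for c in combos])])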
Currently working on a 2D transposition cipher in Python. So I have a list that contains an encoded message, like below:
['BF', 'AF', 'AF', 'DA', 'CD', 'DD', 'BC', 'EF', 'DA', 'AA', 'EF', 'BF']
The next step is taking that list, splitting it up and putting it into a new matrix with regards to a keyword that the user enters. Which I have below:
Enter the keyword for final encryption: hide
H I D E
['B', 'F', 'A', 'F']
['A', 'F', 'D', 'A']
['C', 'D', 'D', 'D']
['B', 'C', 'E', 'F']
['D', 'A', 'A', 'A']
['E', 'F', 'B', 'F']
What I would like to do next and haven't done is take each of the columns above and print them in alphabetical order, therefore getting another cipher text, like below:
D E H I
['A', 'F', 'B', 'F']
['D', 'A', 'A', 'F']
['D', 'D', 'C', 'D']
['E', 'F', 'B', 'C']
['A', 'A', 'D', 'A']
['B', 'F', 'E', 'F']
Here's my code:
def encodeFinalCipher():
    matrix2 = []
    # Convert keyword to upper case
    keywordKey = list(keyword.upper())
    # Convert firstEncryption to a string
    firstEncryptionString = ''.join(str(x) for x in firstEncryption)
    # Print the first table that will show the firstEncryption and the keyword above it
    keywordList = list(firstEncryptionString)
    for x in range(0,len(keywordList),len(keyword)):
        matrix2.append(list(keywordList[x:x+len(keyword)]))
    # Print the matrix to the screen
    print (' %s' % ' '.join(map(str, keywordKey)))
    for letters in matrix2:
        print (letters)
    return finalEncryption
I have traversed the 2D matrix and got all the column entries like below:
b = [[matrix2[i][j] for i in range(len(matrix2))] for j in range(len(matrix2[0]))]
for index, item in enumerate (b):
    print("\n",index, item)
OUTPUT:------
0 ['B', 'A', 'C', 'B', 'D', 'E']
1 ['F', 'F', 'D', 'C', 'A', 'F']
2 ['A', 'D', 'D', 'E', 'A', 'B']
3 ['F', 'A', 'D', 'F', 'A', 'F']
How would I append each letter of the keywordKey (e.g. 'H' 'I' 'D' 'E') to the list where the numbers 0,1,2,3 are?
Or probably a more efficient solution. How would I put the letters into the keywordKey columns when creating the matrix? Would a dictionary help here? Then I could sort the dictionary and print the final cipher.
Many thanks
You can do something like this:
>>> from operator import itemgetter
>>> from pprint import pprint
>>> lst = [['B', 'F', 'A', 'F'],
...        ['A', 'F', 'D', 'A'],
...        ['C', 'D', 'D', 'D'],
...        ['B', 'C', 'E', 'F'],
...        ['D', 'A', 'A', 'A'],
...        ['E', 'F', 'B', 'F']]
>>> key = 'HIDE'
Sort range(len(key)) (xrange(len(key)) on Python 2) using the corresponding values from key, which gives you a list of indices:
>>> indices = sorted(range(len(key)), key=key.__getitem__)
>>> indices
[2, 3, 0, 1]
Now all we need to do is loop over the list and apply these indices to each item using operator.itemgetter and get the corresponding items:
>>> pprint([list(itemgetter(*indices)(x)) for x in lst])
[['A', 'F', 'B', 'F'],
['D', 'A', 'A', 'F'],
['D', 'D', 'C', 'D'],
['E', 'F', 'B', 'C'],
['A', 'A', 'D', 'A'],
['B', 'F', 'E', 'F']]
#or simply
>>> pprint([[x[i] for i in indices] for x in lst])
[['A', 'F', 'B', 'F'],
['D', 'A', 'A', 'F'],
['D', 'D', 'C', 'D'],
['E', 'F', 'B', 'C'],
['A', 'A', 'D', 'A'],
['B', 'F', 'E', 'F']]
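The two steps above can be wrapped into a small helper that also returns the sorted keyword for the header row. This is only a sketch: reorder_columns is a hypothetical name, and it assumes every row has at least len(key) items:

```python
def reorder_columns(rows, key):
    """Return the sorted key letters and the rows with their columns rearranged to match."""
    # Positions of the key letters in alphabetical order, e.g. 'HIDE' -> [2, 3, 0, 1]
    indices = sorted(range(len(key)), key=key.__getitem__)
    header = [key[i] for i in indices]
    reordered = [[row[i] for i in indices] for row in rows]
    return header, reordered

lst = [['B', 'F', 'A', 'F'],
       ['A', 'F', 'D', 'A']]
header, rows = reorder_columns(lst, 'HIDE')
print(header)  # ['D', 'E', 'H', 'I']
print(rows)    # [['A', 'F', 'B', 'F'], ['D', 'A', 'A', 'F']]
```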
Currently working on a transposition problem. What I have so far is that a user enters a message and that message is encrypted into a list, like below:
['BC', 'DE', 'DE', 'DA', 'FD', 'DD', 'BE', 'FE', 'DA', 'EA', 'FE', 'BC']
What I have for the next stage of the cipher is putting this into a table with a key inputted from the user. So if the user enters 'CODE' it outputs this:
2: Enter the keyword for final encryption: code
C O D E
['B', 'C', 'D', 'E']
['D', 'E', 'D', 'A']
['F', 'D', 'D', 'D']
['B', 'E', 'F', 'E']
['D', 'A', 'E', 'A']
['F', 'E', 'B', 'C']
The next stage is to take each value of each column and print the values corresponding to its alphabetical column. So my expected output would be:
C D E O
['B', 'D', 'E', 'C']
['D', 'D', 'A', 'E']
['F', 'D', 'D', 'D']
['B', 'F', 'E', 'E']
['D', 'E', 'A', 'A']
['F', 'B', 'C', 'E']
The problem I'm having is trying to know how to put each of the values in their corresponding column and printing them.
Here's what I have so far:
def encodeFinalCipher():
    matrix2 = []
    # Convert keyword to upper case
    key = list(keyword.upper())
    # Convert firstEncryption to a string
    firstEncryptionString = ''.join(str(x) for x in firstEncryption)
    # Print the first table that will show the firstEncryption and the keyword above it
    keywordList = list(firstEncryptionString)
    for x in range(0,len(keywordList),len(keyword)):
        matrix2.append(list(keywordList[x:x+len(keyword)]))
    # Print the un-ordered matrix to the screen
    print (' %s' % ' '.join(map(str, key)))
    for letters in matrix2:
        print (letters)
    unOrderedMatrix = [[matrix2[i][j] for i in range(len(matrix2))] for j in range(len(matrix2[0]))]
    for index, item in enumerate (unOrderedMatrix):
        print("\n",index, item)
    index = sorted(key)
    print(index)
I get the output of the sorted key:
['A', 'K', 'M', 'R']
What I would like to know is how can this sorted key be applied to the values they represent? I know I can get the first column by doing this:
print(unOrderedMatrix[0])
Which gets me the list of the first column.
Any help would be much appreciated. Complete beginner on Python
msg = ['BC', 'DE', 'DE', 'DA', 'FD', 'DD', 'BE', 'FE', 'DA', 'EA', 'FE', 'BC', '12']
key = 'CODE'
# 'flatten' the message
msg = ''.join(msg)
key_length = len(key)
# create a dictionary with the letters of the key as the keys
# use a slice to create the values
columns = {k:msg[i::key_length] for i, k in enumerate(key)}
print(columns)
# sort the columns on the key letters
columns = sorted(columns.items())
print(columns)
# separate the key from the columnar data
header, data = zip(*columns)
print(header)
# transpose and print (zip stops at the shortest column, so the trailing '12' is dropped)
for thing in zip(*data):
    print(thing)
>>>
{'C': 'BDFBDF1', 'E': 'EADEAC', 'D': 'DDDFEB', 'O': 'CEDEAE2'}
[('C', 'BDFBDF1'), ('D', 'DDDFEB'), ('E', 'EADEAC'), ('O', 'CEDEAE2')]
('C', 'D', 'E', 'O')
('B', 'D', 'E', 'C')
('D', 'D', 'A', 'E')
('F', 'D', 'D', 'D')
('B', 'F', 'E', 'E')
('D', 'E', 'A', 'A')
('F', 'B', 'C', 'E')
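A dictionary keyed by the key letters, as in columns = {k: msg[i::key_length] ...}, silently overwrites a column when the keyword repeats a letter. One possible workaround, sketched here with the hypothetical name sorted_columns, keeps each column in a tuple together with its letter and original position:

```python
def sorted_columns(msg_pairs, key):
    """Split the flattened message into len(key) columns and sort them by key letter."""
    text = ''.join(msg_pairs)
    # (letter, position, column): the position breaks ties between repeated letters
    columns = [(letter, i, text[i::len(key)]) for i, letter in enumerate(key)]
    columns.sort()
    header = [letter for letter, _, _ in columns]
    data = [column for _, _, column in columns]
    return header, data

msg = ['BC', 'DE', 'DE', 'DA', 'FD', 'DD', 'BE', 'FE', 'DA', 'EA', 'FE', 'BC']
header, data = sorted_columns(msg, 'CODE')
print(header)  # ['C', 'D', 'E', 'O']
# Transpose the sorted columns back into rows
for row in zip(*data):
    print(row)
```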
>>>
code = raw_input("Enter the keyword for final encryption:")  # raw_input is Python 2; use input() on Python 3
user_input = ['BC', 'DE', 'DE', 'DA', 'FD', 'DD', 'BE', 'FE', 'DA', 'EA', 'FE', 'BC']
user_input = ''.join(user_input)
matrix = [user_input[i:i+len(code)] for i in range(0, len(user_input), len(code))]
matrix.insert(0, code)
result = sorted([[matrix[j][ind] for j in range(len(matrix))] for ind in range(len(code)) ], key= lambda i:i[0])
for row in [[each[ind] for each in result] for ind in range(len(result[0]))]:
    print(row)
This prints:
Enter the keyword for final encryption:CODE
['C', 'D', 'E', 'O']
['B', 'D', 'E', 'C']
['D', 'D', 'A', 'E']
['F', 'D', 'D', 'D']
['B', 'F', 'E', 'E']
['D', 'E', 'A', 'A']
['F', 'B', 'C', 'E']
Here is something to get you started (you may want to separate the loop one-liner into smaller bits):
# define data
data = [['B', 'C', 'D', 'E'], ['D', 'E', 'D', 'A'], ['F', 'D', 'D', 'D'], ['B', 'E', 'F', 'E'], ['D', 'A', 'E', 'A'], ['F', 'E', 'B', 'C']]
# choose code word
code = 'code'
# add original locations to code word [(0, c), (1, o), (2, d), (3, e)]
# and sort them alphabetically by letter (without the key they would stay in index order)
code_with_locations = sorted(enumerate(code), key=lambda pair: pair[1])
print(code_with_locations) # [(0, 'c'), (2, 'd'), (3, 'e'), (1, 'o')]
# re-organize data according to new indexing
for index in range(len(data)):
    # check if code is shorter than list in current index,
    # or the other way around, don't exceed either list
    max_substitutions = min(map(len, [code_with_locations, data[index]]))
    # create a new list according to new indices
    new_list = []
    for i in range(max_substitutions):
        current_index = code_with_locations[i][0]
        new_list.append(data[index][current_index])
    # replace old list with new list
    data[index] = new_list
print(data)
Output for 'code' would be:
[['B', 'D', 'E', 'C'],
['D', 'D', 'A', 'E'],
['F', 'D', 'D', 'D'],
['B', 'F', 'E', 'E'],
['D', 'E', 'A', 'A'],
['F', 'B', 'C', 'E']]
With help from itertools
from pprint import pprint
from itertools import chain, izip_longest
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
z = ['BC', 'DE', 'DE', 'DA', 'FD', 'DD', 'BE', 'FE', 'DA', 'EA', 'FE', 'BC']
input = 'CODE'
pprint([[b for (a, b) in sorted(zip(input, x))]
        for x in grouper(chain.from_iterable(z), len(input))])
[['B', 'D', 'E', 'C'],
['D', 'D', 'A', 'E'],
['F', 'D', 'D', 'D'],
['B', 'F', 'E', 'E'],
['D', 'E', 'A', 'A'],
['F', 'B', 'C', 'E']]
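The itertools version above is written for Python 2, where the function is called izip_longest. On Python 3 it is itertools.zip_longest, so a rough equivalent could look like the sketch below. The variable is renamed to keyword here only to avoid shadowing the built-in input:

```python
from itertools import chain, zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one with fillvalue."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

z = ['BC', 'DE', 'DE', 'DA', 'FD', 'DD', 'BE', 'FE', 'DA', 'EA', 'FE', 'BC']
keyword = 'CODE'

# Pair each chunk with the keyword letters, sort by letter, keep the values
rows = [[b for _, b in sorted(zip(keyword, chunk))]
        for chunk in grouper(chain.from_iterable(z), len(keyword))]
print(rows[0])  # ['B', 'D', 'E', 'C']
```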