Is there a simple way to generate all possible unique samples from a given sample frame? For example, I have a list with 5 elements, members = ['P', 'V', 'S', 'T', 'A'], and would like to draw all possible 2-element combinations, disregarding order, i.e. 'PV' is equivalent to 'VP'. So from the list ['P', 'V', 'S', 'T', 'A'] I should get 10 two-element samples.
I wrote something that does the trick, but I wonder whether there is an existing method or function that takes the sample frame and the sample size and returns all possible combinations.
members = list('PVSTA')
ms = []
for i in members:
    for j in members:
        if i != j and i+j not in ms and j+i not in ms:
            ms.append(i+j)
        else:
            continue
print(ms)
['PV', 'PS', 'PT', 'PA', 'VS', 'VT', 'VA', 'ST', 'SA', 'TA']
You can use itertools.combinations(iterable, r), which returns r-length subsequences of elements from the input iterable. In your case the iterable is ['P', 'V', 'S', 'T', 'A'] and r=2, so it yields 5C2 = 10 combinations.
Use:
from itertools import combinations
ms = ["".join(c) for c in combinations(list("PVSTA"), r=2)]
print(ms)
Output:
['PV', 'PS', 'PT', 'PA', 'VS', 'VT', 'VA', 'ST', 'SA', 'TA']
What you want are called combinations; you can generate them with the itertools module from Python's standard library.
from itertools import combinations
members = list('PVSTA')
comb_2 = combinations(members, 2)
result = ["".join(c) for c in comb_2]
print(result)
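To get the "provide the sample frame and the sample size" interface the question asks for, the call can be wrapped in a small helper. This is only a minimal sketch: the name all_samples is hypothetical, and math.comb (available from Python 3.8) is used purely to sanity-check the number of samples.

```python
from itertools import combinations
from math import comb


def all_samples(frame, size):
    """Return every unordered sample of `size` elements from `frame` (hypothetical helper)."""
    return ["".join(c) for c in combinations(frame, size)]


members = list("PVSTA")
samples = all_samples(members, 2)
print(samples)
# The number of samples should equal "5 choose 2"
print(len(samples) == comb(len(members), 2))
```

The helper joins each combination into a string because the elements here are single characters; for other element types you would probably return the tuples unchanged.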
Others have already posted the itertools.combinations route (the best approach), but here is the manual way to do it for anyone interested:
members = list('PVSTA')
ms = []
for i in range(len(members)-1):
    for j in range(i+1, len(members)):
        ms.append(members[i] + members[j])
print(ms) # ['PV', 'PS', 'PT', 'PA', 'VS', 'VT', 'VA', 'ST', 'SA', 'TA']
This is similar to the question posted at this URL:
https://stackoverflow.com/questions/54477996/finding-unique-elements-in-nested-list/
but I have a follow-up.
I have a list that I imported from Pandas, and I need a single list as output containing all the unique elements, like this:
[Ac, Ad, An, Bi, Co, Cr, Dr, Fa, Mu, My, Sc]
Once I have all the unique elements, I want to count how often each of them occurs in the whole list.
Can someone advise how I can accomplish that?
mylist = df.Abv.str.split().tolist()
mylist
[['Ac,Cr,Dr'],
 ['Ac,Ad,Sc'],
 ['Ac,Bi,Dr'],
 ['Ad,Dr,Sc'],
 ['An,Dr,Fa'],
 ['Bi,Co,Dr'],
 ['Dr,Mu'],
 ['Ac,Co,My'],
 ['Co,Dr'],
 ['Ac,Ad,Sc'],
 ['An,Ac,Ad'],
]
I have tried different things but can't seem to make it work.
I tried converting it into a string and applying the split function to that string, but to no avail.
You can do it this way in Python 3:
mylist = [['Ac,Cr,Dr'],
['Ac,Ad,Sc'],
['Ac,Bi,Dr'],
['Ad,Dr,Sc'],
['An,Dr,Fa'],
['Bi,Co,Dr'],
['Dr,Mu'],
['Ac,Co,My'],
['Co,Dr'],
['Ac,Ad,Sc'],
['An,Ac,Ad'],
]
uniquedict = {}
for sublist in mylist:
    for item in sublist[0].split(','):
        if item in uniquedict.keys():
            uniquedict[item] += 1
        else:
            uniquedict[item] = 1
print(uniquedict)
print(list(uniquedict.keys()))
{'Ac': 6, 'Cr': 1, 'Dr': 7, 'Ad': 4, 'Sc': 3, 'Bi': 2, 'An': 2, 'Fa': 1, 'Co': 3, 'Mu': 1, 'My': 1}
['Ac', 'Cr', 'Dr', 'Ad', 'Sc', 'Bi', 'An', 'Fa', 'Co', 'Mu', 'My']
You can create a dictionary whose keys are the list values and whose values are their counts.
Your code may look like this:
mylists = [['Ac,Cr,Dr'],
['Ac,Ad,Sc'],
['Ac,Bi,Dr'],
['Ad,Dr,Sc'],
['An,Dr,Fa'],
['Bi,Co,Dr'],
['Dr,Mu'],
['Ac,Co,My'],
['Co,Dr'],
['Ac,Ad,Sc'],
['An,Ac,Ad'],
]
unique = {}
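for mylist in mylists:
    for item in mylist:
        # each item is one comma-separated string, so split it before counting
        for elem in item.split(','):
            unique[elem] = unique[elem]+1 if elem in unique else 1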
unique.keys() gives the unique elements, and you can read the count of any value from the dictionary, e.g. unique['Ad'].
You can use collections.Counter to make a dictionary of the counts of the elements. This also gives you easy access to a list of all unique elements. It looks like you have a list of lists where each sublist contains a single string, so you will need to split these strings before you add them to the counter.
from collections import Counter
count = Counter()
mylist = [['Ac,Cr,Dr'],
['Ac,Ad,Sc'],
['Ac,Bi,Dr'],
['Ad,Dr,Sc'],
['An,Dr,Fa'],
['Bi,Co,Dr'],
['Dr,Mu'],
['Ac,Co,My'],
['Co,Dr'],
['Ac,Ad,Sc'],
['An,Ac,Ad'],
]
for arr in mylist:
    count.update(arr[0].split(','))
print(count) # dictionary of symbols: counts
print(list(count.keys())) # list of all unique elements
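If you want the unique elements in the alphabetical order shown in the question, or the elements ranked by frequency, something along these lines may help. It is a sketch that assumes the same list-of-single-strings shape as above (shortened here for brevity) and relies only on sorted and Counter.most_common.

```python
from collections import Counter

# Shortened sample in the same shape as the question's data (assumption)
mylist = [['Ac,Cr,Dr'], ['Ac,Ad,Sc'], ['Dr,Mu']]

count = Counter()
for arr in mylist:
    count.update(arr[0].split(','))

unique_sorted = sorted(count)      # alphabetical list of the unique elements
top_two = count.most_common(2)     # the two most frequent elements with their counts

print(unique_sorted)
print(top_two)
```

Iterating over a Counter yields its keys, which is why sorted(count) returns just the element names.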
You can take advantage of the very powerful tools offered by collections, itertools and functools and get a one-line solution.
If your lists contain only one element:
from collections import Counter
from itertools import chain
from functools import partial
if __name__ == '__main__':
    mylist = [
        ['Ac,Cr,Dr'],
        ['Ac,Ad,Sc'],
        ['Ac,Bi,Dr'],
        ['Ad,Dr,Sc'],
        ['An,Dr,Fa'],
        ['Bi,Co,Dr'],
        ['Dr,Mu'],
        ['Ac,Co,My'],
        ['Co,Dr'],
        ['Ac,Ad,Sc'],
        ['An,Ac,Ad'],
    ]
    # if lists contain only one element
    occurrence_count = Counter(chain(*map(lambda x: x[0].split(','), mylist)))
    items = list(occurrence_count.keys()) # items, with no repetitions
    all_items = list(occurrence_count.elements()) # all items
    ac_occurrences = occurrence_count['Ac'] # occurrences of 'Ac'
    print(f"Unique items: {items}")
    print(f"All list elements: {all_items}")
    print(f"Occurrences of 'Ac': {ac_occurrences}")
And this is what you get:
Unique items: ['Ac', 'Cr', 'Dr', 'Ad', 'Sc', 'Bi', 'An', 'Fa', 'Co', 'Mu', 'My']
All list elements: ['Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Cr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Ad', 'Ad', 'Ad', 'Ad', 'Sc', 'Sc', 'Sc', 'Bi', 'Bi', 'An', 'An', 'Fa', 'Co', 'Co', 'Co', 'Mu', 'My']
Occurrences of 'Ac': 6
Otherwise, if your lists have more than one element:
from collections import Counter
from itertools import chain
from functools import partial
if __name__ == '__main__':
    mylist_complex = [
        ['Ac,Cr,Dr', 'Ac,Ad,Sc'],
        ['Ac,Ad,Sc', 'Ac,Bi,Dr'],
        ['Ac,Bi,Dr', 'Ad,Dr,Sc'],
        ['Ad,Dr,Sc', 'An,Dr,Fa'],
        ['An,Dr,Fa', 'Bi,Co,Dr'],
        ['Bi,Co,Dr', 'Dr,Mu'],
        ['Dr,Mu', 'Ac,Co,My'],
        ['Ac,Co,My', 'Co,Dr'],
        ['Co,Dr', 'Ac,Ad,Sc'],
        ['Ac,Ad,Sc', 'An,Ac,Ad'],
        ['An,Ac,Ad', 'Ac,Cr,Dr'],
    ]
    # if lists contain more than one element
    occurrence_count_complex = Counter(chain(*map(lambda x: chain(*map(partial(str.split, sep=','), x)), mylist_complex)))
    items = list(occurrence_count_complex.keys()) # items, with no repetitions
    all_items = list(occurrence_count_complex.elements()) # all items
    ac_occurrences = occurrence_count_complex['Ac'] # occurrences of 'Ac'
    print(f"Unique items: {items}")
    print(f"All list elements: {all_items}")
    print(f"Occurrences of 'Ac': {ac_occurrences}")
And this is what you get in this case:
Unique items: ['Ac', 'Cr', 'Dr', 'Ad', 'Sc', 'Bi', 'An', 'Fa', 'Co', 'Mu', 'My']
All list elements: ['Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Ac', 'Cr', 'Cr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Dr', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Ad', 'Sc', 'Sc', 'Sc', 'Sc', 'Sc', 'Sc', 'Bi', 'Bi', 'Bi', 'Bi', 'An', 'An', 'An', 'An', 'Fa', 'Fa', 'Co', 'Co', 'Co', 'Co', 'Co', 'Co', 'Mu', 'Mu', 'My', 'My']
Occurrences of 'Ac': 12
Try below:
from itertools import chain
mylist = [['Ac,Cr,Dr'],
['Ac,Ad,Sc'],
['Ac,Bi,Dr'],
['Ad,Dr,Sc'],
['An,Dr,Fa'],
['Bi,Co,Dr'],
['Dr,Mu'],
['Ac,Co,My'],
['Co,Dr'],
['Ac,Ad,Sc'],
['An,Ac,Ad']
]
flat_list = list(chain.from_iterable(mylist))
unique_list = set(','.join(flat_list).split(','))
The normal kind of permutation is:
'ABC'
↓
'ACB'
'BAC'
'BCA'
'CAB'
'CBA'
But, what if I want to do this:
'ABC'
↓
'AA'
'AB'
'AC'
'BA'
'BB'
'BC'
'CA'
'CB'
'CC'
What is this called, and how efficient would this be with arrays with hundreds of elements?
Your terminology is a bit confusing: what you have are not permutations of your characters, but rather the pairing of every possible character with every possible character: a Cartesian product.
You can use itertools.product to generate these combinations, but note that this returns an iterator rather than a container. So if you need all the combinations in a list, you need to construct a list explicitly:
from itertools import product
mystr = 'ABC'
prodlen = 2
products = list(product(mystr,repeat=prodlen))
Or, if you're only looping over these values:
for char1,char2 in product(mystr,repeat=prodlen):
    # do something with your characters
    ...
Or, if you want to generate the 2-length strings, you can do this in a list comprehension:
allpairs = [''.join(pairs) for pairs in products]
# ['AA', 'AB', 'AC', 'BA', 'BB', 'BC', 'CA', 'CB', 'CC']
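On the efficiency part of the question: the number of results is len(items) ** prodlen, so a few hundred elements with repeat=2 gives tens of thousands of pairs, but the count grows exponentially as the repeat length increases. Because product yields its results one at a time, you can iterate without storing them all. The sketch below uses range(300) as a stand-in for an array with hundreds of elements, which is an assumption made for illustration.

```python
from itertools import islice, product

items = range(300)      # stand-in for an array with hundreds of elements (assumption)
prodlen = 2

# The number of results is len(items) ** prodlen; nothing is generated yet
total = len(items) ** prodlen
print(total)

# product() yields tuples one at a time, so the first few can be inspected
# without building the full list
first_five = list(islice(product(items, repeat=prodlen), 5))
print(first_five)
```

Wrapping the iterator in list(), as in the earlier snippet, is what actually costs memory proportional to the total number of results.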
Nothing against itertools, but if you want a little insight into how to generate permutations of strings manually, you can apply modulo arithmetic to an incrementing sequence number. This should work with a string of any length and any value of n where n <= len(s).
The number of permutations generated is len(s) ** n.
For example, just call printPermutations("abc", 2)
def printPermutations(s, n):
    if (not s) or (n < 1):
        return
    maxpermutations = len(s) ** n
    for p in range(maxpermutations):
        perm = getSpecificPermutation(s, n, p)
        print(perm)

def getSpecificPermutation(s, n, p):
    # s is the source string
    # n is the number of characters to extract
    # p is the permutation sequence number
    result = ''
    for j in range(n):
        result = s[p % len(s)] + result
        p = p // len(s)
    return result
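As a quick check, the sequence produced this way can be compared with itertools.product. The snippet below is a sketch that repeats getSpecificPermutation from above so it runs on its own; the variable names manual and via_itertools are introduced here for illustration only.

```python
from itertools import product


def getSpecificPermutation(s, n, p):
    # Same logic as above: read p as an n-digit number in base len(s)
    result = ''
    for _ in range(n):
        result = s[p % len(s)] + result
        p = p // len(s)
    return result


s, n = "abc", 2
manual = [getSpecificPermutation(s, n, p) for p in range(len(s) ** n)]
via_itertools = [''.join(t) for t in product(s, repeat=n)]
print(manual == via_itertools)
```

Both approaches enumerate the strings in the same order, because the sequence number is read with its most significant digit first.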
You'll want to use the itertools solution. But I know what it's called...
Most people call it counting. You're being sneaky about it, but I think it's just counting in base len(set), where set is your input set (I'm assuming it is truly a set, with no repeated elements). Imagine, in your example, A -> 0, B -> 1, C -> 2. You're also asking for results with a fixed maximum number of digits. Let me show you:
from functools import reduce  # reduce is no longer a builtin in Python 3

def numberToBase(n, b):
    if n == 0:
        return [0]
    digits = []
    while n:
        digits.append(int(n % b))
        n //= b  # integer division; a plain / would produce floats in Python 3
    return digits[::-1]

def count_me(set, max_digits=2):
    # Just count! From 0 to len(set) ** max_digits to be precise
    numbers = [i for i in range(len(set) ** max_digits)]
    # Convert to base len(set)
    lists_of_digits_in_base_b = [numberToBase(i, len(set)) for i in numbers]
    # Add 0s to the front (making each list of digits max_digits in length)
    prepended_with_zeros = []
    for li in lists_of_digits_in_base_b:
        prepended_with_zeros.append([0]*(max_digits - len(li)) + li)
    # Map each digit to an item in our set
    m = {index: item for index, item in enumerate(set)}
    temp = map(lambda x: [m[digit] for digit in x], prepended_with_zeros)
    # Convert the mapped items to strings
    temp2 = map(lambda x: [str(i) for i in x], temp)
    # Concatenate each item
    concat_strings = map(lambda a: reduce(lambda x, y: x + y, a, ""), temp2)
    # map is lazy in Python 3, so build a list before returning
    return list(concat_strings)
Here's some outputs:
print(count_me("ABC", 2))
outputs:
['AA', 'AB', 'AC', 'BA', 'BB', 'BC', 'CA', 'CB', 'CC']
and
print(count_me("ABCD", 2))
outputs:
['AA', 'AB', 'AC', 'AD', 'BA', 'BB', 'BC', 'BD', 'CA', 'CB', 'CC', 'CD', 'DA', 'DB', 'DC', 'DD']
and
print(count_me("ABCD", 3))
outputs (a big one):
['AAA', 'AAB', 'AAC', 'AAD', 'ABA', 'ABB', 'ABC', 'ABD', 'ACA', 'ACB', 'ACC', 'ACD', 'ADA', 'ADB', 'ADC', 'ADD', 'BAA', 'BAB', 'BAC', 'BAD', 'BBA', 'BBB', 'BBC', 'BBD', 'BCA', 'BCB', 'BCC', 'BCD', 'BDA', 'BDB', 'BDC', 'BDD', 'CAA', 'CAB', 'CAC', 'CAD', 'CBA', 'CBB', 'CBC', 'CBD', 'CCA', 'CCB', 'CCC', 'CCD', 'CDA', 'CDB', 'CDC', 'CDD', 'DAA', 'DAB', 'DAC', 'DAD', 'DBA', 'DBB', 'DBC', 'DBD', 'DCA', 'DCB', 'DCC', 'DCD', 'DDA', 'DDB', 'DDC', 'DDD']
P.S. numberToBase courtesy of this post
As Andras Deak says, use itertools.product:
import itertools
for i, j in itertools.product('ABC', repeat=2):
    print(i + j)
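Since the question asks what this operation is called, it may help to see the related itertools functions side by side. The following is a small illustrative sketch; the variable letters is just an example input.

```python
from itertools import combinations, permutations, product

letters = 'ABC'

# Ordered, no letter reused: AB, AC, BA, BC, CA, CB
print([''.join(t) for t in permutations(letters, 2)])

# Unordered, no letter reused: AB, AC, BC
print([''.join(t) for t in combinations(letters, 2)])

# Ordered, letters may repeat: the Cartesian product asked about here
print([''.join(t) for t in product(letters, repeat=2)])
```

Only product includes pairs such as 'AA', which is what distinguishes the output in the question from ordinary permutations.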