Counting number of permutations of each element - python

I need some help. I checked and found a few questions about 'counting permutations', but none of the answers fit my case.
I would like to count the number of unique permutations for each item in a list. Say you have two lists ('first' and 'second', see below); for each element of 'first', I would like the total number of unique two-element permutations that start with it. E.g. for 'a' in 'first' we have
ab
ac
ad
ab
ac
ad
ab
ab
by removing duplicates, we have
ab ac ad
So the number of permutations of 'a' will be '3'
The final result I would like to get should be like
(a, 3)
(b, 3)
(c, 3)
(d, 3)
I started with:
import itertools
from collections import Counter
first = ['a','b','c','d']
second = [['a','b','c','d'], ['a','b'], ['a','c','d'], ['a','b','d']]
c = Counter()
for let in second:
    letPermut = list(set(itertools.permutations(let, 2)))
    for i in first:
        for permut in letPermut:
            if permut[0] == i:
                c[i] += 1
    for item in c.items():
        print(item)
But in the output I get different counts for each element in first list, and the Counter's results are higher than the expected output. I don't know what I am doing wrong.
Any help?

Well, the question is still not very clear, but here's my $0.02:
def do_the_stuff(first, second):
    second = list(set(second))
    return {
        el1: sum(1 for el2 in second if el1 in el2)
        for el1 in first
    }
With some test data:
>>> first = ['a','b','c','d', 'j']
>>> second = ['abcd', 'ab', 'ab', 'acd', 'abd']
>>> print(do_the_stuff(first, second))
{'a': 4, 'b': 3, 'c': 2, 'd': 3, 'j': 0}

If I understood your problem correctly, these changes make your code ignore duplicate permutations:
import itertools
from collections import Counter
first = ['a','b','c','d']
second = [['a','b','c','d'], ['a','b'], ['a','c','d'], ['a','b','d']]
uniques = []
c = Counter()
for let in second:
    letPermut = list(set(itertools.permutations(let, 2)))
    for i in first:
        for permut in letPermut:
            if permut[0] == i and permut not in uniques:
                c[i] += 1
                uniques.append(permut)
for item in c.items():
    print(item)
The changes:
Declared an empty list called uniques.
Before counting +1, we check against uniques whether the permutation is a duplicate.
After increasing the counter, we add the permutation to uniques for future checks.
Took the printing loop out of the for let in second loop. Thus, each counter is only printed once at the end.
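For comparison, here is a minimal sketch of the same idea that collects the unique pairs in a set, so no lookup in a list is needed. The names unique_pairs, counts and result are illustrative and not from the original post, and it assumes that only two-element permutations matter, as in the question.

```python
from collections import Counter
from itertools import permutations

first = ['a', 'b', 'c', 'd']
second = [['a', 'b', 'c', 'd'], ['a', 'b'], ['a', 'c', 'd'], ['a', 'b', 'd']]

# Collect every ordered pair once; the set discards duplicates across sublists
unique_pairs = {pair for group in second for pair in permutations(group, 2)}

# Count the unique pairs by their first element
counts = Counter(pair[0] for pair in unique_pairs)

# Report the counts in the order of 'first' (0 for elements with no pairs)
result = [(item, counts[item]) for item in first]
print(result)  # [('a', 3), ('b', 3), ('c', 3), ('d', 3)]
```

Because membership tests on a set are constant time on average, this should scale better than checking a growing list when there are many permutations.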

Related

Find common elements of two strings including characters that occur many times

I would like to get the common elements of two given strings in a way that handles duplicates: if a letter occurs 3 times in the first string and 2 times in the second one, then it has to occur 2 times in the common string. The two strings may have different lengths. For example:
s1 = 'aebcdee'
s2 = 'aaeedfskm'
common = 'aeed'
I cannot use the intersection of two sets. What would be the easiest way to get the result 'common'? Thanks.
Well, there are multiple ways to get the desired result. For me the simplest algorithm would be:
Define an empty dict, like d = {}
Iterate through each character of the first string:
    if the character is not present in the dictionary, add it with a count of 1.
    else increment the count of the character in the dictionary.
Create a variable common = ""
Iterate through the characters of the second string; if the count of that character in the dictionary above is greater than 0, decrement its value and add the character to common
Do whatever you want with common
The complete code for this problem:
s1 = 'aebcdee'
s2 = 'aaeedfskm'
d = {}
for c in s1:
    if c in d:
        d[c] += 1
    else:
        d[c] = 1
common = ""
for c in s2:
    if c in d and d[c] > 0:
        common += c
        d[c] -= 1
print(common)
You can use two arrays (length 26).
One array is for the first string and the other is for the second string.
Initialize both arrays to 0.
In the first array, index 0 holds the number of "a" in the first string, index 1 the number of "b", and so on up to index 25, which holds the number of "z".
Similarly, create an array for the second string and store the count of each letter at its corresponding index.
s1 = 'aebcdee'
s2 = 'aaeedfs'
For the s1 and s2 values above, the non-zero counts are:
Array 1: a=1, b=1, c=1, d=1, e=3
Array 2: a=2, d=1, e=2, f=1, s=1
Now run through the first string
s1 = 'aebcdee'
and for each letter find
K = minimum of ( [ count(letter) in Array 1 ], [ count(letter) in Array 2 ] )
and print that letter K times.
Then set that letter's count to 0 in both arrays (if you don't, the algorithm will print the same letter again when it comes up later in the string).
Complexity: O( length(S1) + length(S2) ), since both strings are read once to fill the arrays and S1 is read once more to build the result.
Note: you can also run the final pass over the shorter string; that pass is then O( minimum [ length(S1), length(S2) ] ).
Please let me know if you want the implementation of this.
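A minimal sketch of the two-array approach described above. The function name common_chars is made up for illustration, and the code assumes both strings contain only the lowercase letters 'a' to 'z' (anything else would index outside the 26 slots).

```python
def common_chars(s1, s2):
    # Assumption: both strings contain only lowercase ASCII letters 'a'-'z'
    counts1 = [0] * 26
    counts2 = [0] * 26
    for ch in s1:
        counts1[ord(ch) - ord('a')] += 1
    for ch in s2:
        counts2[ord(ch) - ord('a')] += 1

    pieces = []
    for ch in s1:
        idx = ord(ch) - ord('a')
        k = min(counts1[idx], counts2[idx])
        pieces.append(ch * k)
        # Zero both counts so the same letter is not emitted again later
        counts1[idx] = 0
        counts2[idx] = 0
    return ''.join(pieces)


print(common_chars('aebcdee', 'aaeedfs'))  # aeed
```

The letters come out in the order of their first appearance in s1, with each letter's repeats grouped together.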
You can use collections.Counter to count each character in both strings; for every character that occurs in both, take the minimum of its two counts and join the results into a new string.
from collections import Counter, defaultdict
from itertools import zip_longest
s1 = 'aebcdee'
s2 = 'aaeedfskm'
# Dictionary whose values are lists, so counts can be appended per character
res = defaultdict(list)
# Count each character
cnt1 = Counter(s1) # -> {'e': 3, 'a': 1, 'b': 1, 'c': 1, 'd': 1}
cnt2 = Counter(s2) # -> {'a': 2, 'e': 2, 'd': 1, 'f': 1, 's': 1, 'k': 1, 'm': 1}
# To collect the counts in one pass, zip the keys of the two counters.
# The counters may have different numbers of keys, so use itertools.zip_longest.
for a, b in zip_longest(cnt1, cnt2):
    # list(zip_longest(cnt1, cnt2)) -> [('a', 'a'), ('e', 'e'), ('b', 'd'),
    #                                   ('c', 'f'), ('d', 's'), (None, 'k'),
    #                                   (None, 'm')]
    # zip_longest pads the shorter side with None, so check 'a' and 'b' before appending
    if a: res[a].append(cnt1[a])
    if b: res[b].append(cnt2[b])
# res -> {'a': [1, 2], 'e': [3, 2], 'b': [1], 'd': [1, 1], 'c': [1], 'f': [1], 's': [1], 'k': [1], 'm': [1]}
# A character whose list holds more than one count occurs in both strings;
# repeat it as many times as the smaller of its counts.
out = ''.join(k * min(v) for k, v in res.items() if len(v) > 1)
print(out)
# aeed
The same approach works for more than two strings, for example three. A character must then appear in all three counters, so the filter becomes len(v) == 3:
s1 = 'aebcdee'
s2 = 'aaeedfskm'
s3 = 'aaeeezzxx'
res = defaultdict(list)
cnt1 = Counter(s1)
cnt2 = Counter(s2)
cnt3 = Counter(s3)
for a, b, c in zip_longest(cnt1, cnt2, cnt3):
    if a: res[a].append(cnt1[a])
    if b: res[b].append(cnt2[b])
    if c: res[c].append(cnt3[c])
# 'd' occurs in s1 and s2 but not in s3, so it must not be part of the result
out = ''.join(k * min(v) for k, v in res.items() if len(v) == 3)
print(out)
# aee
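As a shorter alternative, Counter objects support the & operator, which keeps each key with the minimum of its two counts. The sketch below uses the strings from the question; the variable names common_counts and common are illustrative only.

```python
from collections import Counter

s1 = 'aebcdee'
s2 = 'aaeedfskm'

# '&' is multiset intersection: each character keeps min(count in s1, count in s2)
common_counts = Counter(s1) & Counter(s2)

# elements() repeats each character as many times as its count
common = ''.join(common_counts.elements())
print(common)  # contains 'a' once, 'e' twice and 'd' once
```

The operator can be chained, so Counter(s1) & Counter(s2) & Counter(s3) would give the characters common to three strings.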
s1="ckglter"
s2="ancjkle"
final_list=[]
if(len(s1)<len(s2)):
    for i in s1:
        if(i in s2):
            final_list.append(i)
else:
    for i in s2:
        if(i in s1):
            final_list.append(i)
print(final_list)
You can also do it like this: iterate through the shorter string with a for loop and append each character that also appears in the other string to an empty list. Note that this only checks membership, so it does not cap repeated letters at the smaller count.

Searching for an exact values in a Dictionary

I have a list of dictionaries and I want to find the particular values I need (S009 and S007 in this case).
I wrote the following code, but I get nothing from it.
Here is my code:
def find():
    L = [{"V": "S001"},
         {"V": "S002"},
         {"V": "S001"},
         {"V": "S001"},
         {"V": "S001"},
         {"V1": "S002"},
         {"V111": "S005"},
         {"V2": "S005"},
         {"V": "S009"},
         {"V3": "S007"}]
    L1 = []
    for y in range (len(L)) :
        for j in L[y].values():
            L1.append(j)
    L2=[]
    for z in L1:
        if z not in L2:
            L2.append(z)
    count =0
    l3=[]
    s = set(L1)
    for z in L2:
        for y in L1:
            if z in L2:
                count =count +1
        if count == 2:
            l3.append(z)
    for s in l3:
        print(s)
def main():
    find()
main()
My code explained: First, I put all the values in a list called L1. Then I put the values without duplicates into L2. Then I want to check whether each element of L2 exists in L1. After this loop, if the count is only one, this is the value I'm looking for, and I append it to an empty list called l3.
You can do it in two steps. First extract all the values from L:
values = []
for i in L:
    for v in i.values():
        values.append(v)
Or as a list comprehension:
values = [v for i in L for v in i.values()]
Then filter out the items with count more than 1:
result = [i for i in values if values.count(i) == 1]
print (result)
Result:
['S009', 'S007']
What you've defined above as L is a list of individual dictionaries. I'm not sure this is what was intended. You said your expected output should be 'S009' and 'S007', so I'm going to assume that, perhaps, you intended L to just be a list of the values of each individual dictionary. If that's the case,
L = ["S001", "S002", "S001", "S001", "S001", "S002", "S005", "S005", "S009", "S007"]
One of the easiest ways to count the number of items of a list is to use a Counter from the collections module.
Then just create the Counter with the L as the only argument
from collections import Counter
c = Counter(L)
print(c)
Counter({'S001': 4, 'S002': 2, 'S005': 2, 'S009': 1, 'S007': 1})
Now you can see how many instances of each element of L exist. From there you can just use a little list comprehension to filter out anything that doesn't have one instance.
result = [key for key, value in c.items() if value == 1]
print(result)
['S009', 'S007']
All the code:
from collections import Counter
L = ["S001", "S002", "S001", "S001", "S001", "S002", "S005", "S005", "S009", "S007"]
c = Counter(L)
result = [key for key, value in c.items() if value == 1]
print(result)
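If L has to stay a list of dictionaries, as in the question, the same Counter idea can be applied to the dictionary values directly. This is a minimal sketch; the names counts and result are illustrative.

```python
from collections import Counter

L = [{"V": "S001"}, {"V": "S002"}, {"V": "S001"}, {"V": "S001"},
     {"V": "S001"}, {"V1": "S002"}, {"V111": "S005"}, {"V2": "S005"},
     {"V": "S009"}, {"V3": "S007"}]

# Count every value across all dictionaries in a single pass
counts = Counter(value for d in L for value in d.values())

# Keep only the values that occur exactly once
result = [value for value, n in counts.items() if n == 1]
print(result)  # ['S009', 'S007']
```

Unlike calling values.count(i) for every item, this counts each value only once, which matters for long lists.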

Python group by adjacent items in a list with same attributes

Please see the simplified example:
A=[(721,'a'),(765,'a'),(421,'a'),(422,'a'),(106,'b'),(784,'a'),(201,'a'),(206,'b'),(207,'b')]
I want to group adjacent tuples with attribute 'a' pairwise, two at a time, and leave tuples with 'b' alone.
So the desired list would look like:
A=[[(721,'a'),(765,'a')],
[(421,'a'),(422,'a')],
[(106,'b')],
[(784,'a'),(201,'a')],
[(206,'b')],[(207,'b')]]
What I can do is build two separate lists containing the tuples with 'a' and with 'b',
then pair the tuples in the 'a' list and add them back. But that doesn't seem very efficient. Is there a faster and simpler solution?
Assuming a items are always in pairs, a simple approach would be as follows.
Look at the first item - if it's an a, use it and the next item as a pair. Otherwise, just use the single item. Then 'jump' forward by 1 or 2, as appropriate:
A=[(721,'a'),(765,'a'),(421,'a'),(422,'a'),(106,'b'),(784,'a'),(201,'a'),(206,'b'),(207,'b')]
result = []
count = 0
while count <= len(A)-1:
    if A[count][1] == 'a':
        result.append([A[count], A[count+1]])
        count += 2
    else:
        result.append([A[count]])
        count += 1
print(result)
You can use itertools.groupby:
import itertools
A=[(721,'a'),(765,'a'),(421,'a'),(422,'a'),(106,'b'),(784,'a'),(201,'a'),(206,'b'),(207,'b')]
def split(s):
    # Cut a list into chunks of two items
    return [s[i:i+2] for i in range(0, len(s), 2)]
new_data = []
for key, group in itertools.groupby(A, key=lambda x: x[-1]):
    items = list(group)
    if key == 'a':
        new_data.extend(split(items))
    else:
        new_data.extend([item] for item in items)
print(new_data)
Output:
[[(721, 'a'), (765, 'a')], [(421, 'a'), (422, 'a')], [(106, 'b')], [(784, 'a'), (201, 'a')], [(206, 'b')], [(207, 'b')]]
There is no need to use two lists. Edit: this version does not assume that the 'a' items always come in adjacent pairs.
A = [(721,'a'),(765,'a'),(421,'a'),(422,'a'),(106,'b'),(784,'a'),(201,'a'),(206,'b'),(207,'b')]
new_list = []
i = 0
while i < len(A):
    if i == len(A)-1:
        new_list.append([A[i]])
        i+=1
    elif (A[i][1]==A[i+1][1]=='a') :
        new_list.append([A[i], A[i+1]])
        i += 2
    else:
        new_list.append([A[i]])
        i += 1
print(new_list)
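The grouping can also be wrapped in a reusable function. The sketch below is one possible way to do it with itertools.groupby; the function name group_pairs and its label parameter are hypothetical, and an unpaired 'a' at the end of a run is simply left in a group of its own.

```python
from itertools import groupby


def group_pairs(items, label='a'):
    """Group runs of `label` items two at a time; every other item stays alone."""
    grouped = []
    for key, run in groupby(items, key=lambda item: item[1]):
        run = list(run)
        if key == label:
            # Slice the run into chunks of at most two items
            grouped.extend(run[i:i + 2] for i in range(0, len(run), 2))
        else:
            grouped.extend([item] for item in run)
    return grouped


A = [(721, 'a'), (765, 'a'), (421, 'a'), (422, 'a'), (106, 'b'),
     (784, 'a'), (201, 'a'), (206, 'b'), (207, 'b')]
print(group_pairs(A))

# A run with an odd number of 'a' items: the third 'a' has no partner
odd = [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'b')]
print(group_pairs(odd))  # [[(1, 'a'), (2, 'a')], [(3, 'a')], [(4, 'b')]]
```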

How to count elements on each position in lists

I have a lot of lists like:
SI821lzc1n4
MCap1kr01lv
All of them have the same length. I need to count how many times each symbol appears on each position. Example:
abcd
a5c1
b51d
Here it'll be a5cd
One way is to use zip to associate characters in the same position. We can then send all of the characters from each position to a Counter, then use Counter.most_common to get the most common character
from collections import Counter
l = ['abcd', 'a5c1', 'b51d']
print(''.join([Counter(z).most_common(1)[0][0] for z in zip(*l)]))
# a5cd
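If the full per-position counts are needed rather than only the most common symbol, the same zip and Counter combination can keep one Counter per column. A minimal sketch, assuming all strings have the same length as stated in the question; the names lines, position_counts and most_common are illustrative.

```python
from collections import Counter

lines = ['abcd', 'a5c1', 'b51d']

# zip(*lines) yields one tuple of characters per position (column)
position_counts = [Counter(column) for column in zip(*lines)]

for position, counts in enumerate(position_counts):
    print(position, dict(counts))

# The most frequent symbol at each position, joined back into a string
most_common = ''.join(c.most_common(1)[0][0] for c in position_counts)
print(most_common)  # a5cd
```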
from statistics import mode
[mode([x[i] for x in y]) for i in range(len(y[0]))]
where y is your list.
Python 3.4 and up
You could use a combination of zip and Counter
a = "abcd"
b = "a5c1"
c = "b51d"
from collections import Counter
zippedList = list(zip(a,b,c))
print("zipped: {}".format(zippedList))
final = ""
for x in zippedList:
    countLetters = Counter(x)
    print(countLetters)
    final += countLetters.most_common(1)[0][0]
print("output: {}".format(final))
output:
zipped: [('a', 'a', 'b'), ('b', '5', '5'), ('c', 'c', '1'), ('d', '1', 'd')]
Counter({'a': 2, 'b': 1})
Counter({'5': 2, 'b': 1})
Counter({'c': 2, '1': 1})
Counter({'d': 2, '1': 1})
output: a5cd
This all depends on where your list is. Is it coming from another file, or is it an actual array? At the end of the day, the simplest way to do this is to use a dictionary and a for loop.
new_dict = {}
for line in lines:  # lines is assumed to hold your strings, e.g. ['abcd', 'a5c1', 'b51d']
    for i in range(len(line)):
        if i in new_dict:
            new_dict[i].append(line[i])
        else:
            new_dict[i] = [line[i]]
Then after that I'm assuming that you'd like to output the most common character at each position. For that I'd recommend importing statistics and using the mode function...
from statistics import mode
new_line = ""
for key in new_dict:
    x = mode(new_dict[key])
    new_line = new_line + x
However, your question is quite vague, please elaborate more next time.
P.s. I'm a newbie so all you experienced programmers plz don't hate :)
I would use a combination of defaultdict, enumerate, and Counter:
>>> from collections import Counter, defaultdict
>>> data = '''abcd
a5c1
b51d
'''
>>> poscount = defaultdict(Counter)
>>> for line in data.split():
        for i, character in enumerate(line):
            poscount[i][character] += 1
>>> ''.join([poscount[i].most_common(1)[0][0] for i in sorted(poscount)])
'a5cd'
Here's how it works:
The defaultdict() creates new entries when it sees a new key.
The enumerate() function returns both the character and its position in the line.
The Counter counts the occurrences of individual characters.
Combining the three makes a defaultdict whose keys are the column positions and whose values are character counters. That gives you one character counter per column.
The most_common() method returns the highest frequency (character, count) pair for that counter.
The [0][0] extracts the character from the list of (character, count) tuples.
The str.join() method combines the results back together.

I want to write a function that takes a list and returns a count of total number of duplicate elements in the list

I have tried this. For some unknown reason, when it prints h, it prints None, so I thought that if I counted the number of None values printed and divided by 2, that would give the number of duplicates, but I can't use the count function here.
a= [1,4,"hii",2,4,"hello","hii"]
def duplicate(L):
    li=[]
    lii=[]
    h=""
    for i in L:
        y= L.count(i)
        if y>1:
            h=y
        print h
    print h.count(None)
duplicate(a)
Use the Counter container:
from collections import Counter
c = Counter(['a', 'b', 'a'])
c is now a dictionary-like object with the data: Counter({'a': 2, 'b': 1})
If you want to get a list with all duplicated elements (with no repetition), you can do as follows:
duplicates = [k for k in c if c[k] > 1]
If you only want to count the duplicates, you can then just set
duplicates_len = len(duplicates)
You can use a set to get the count of unique elements, and then compare the sizes - something like that:
def duplicates(l):
    uniques = set(l)
    return len(l) - len(uniques)
I found an answer, which is:
a= [1,4,"hii",2,4,"hello",7,"hii"]
def duplicate(L):
    li=[]
    for i in L:
        y= L.count(i)
        if y>1:
            li.append(i)
    print(len(li) // 2)  # assumes every duplicated value appears exactly twice
duplicate(a)
The answer by egualo is much better, but here is another way using a dictionary.
def find_duplicates(arr):
    duplicates = {}
    duplicate_elements = []
    for element in arr:
        if element not in duplicates:
            duplicates[element] = False
        else:
            if duplicates[element] == False:
                duplicate_elements.append(element)
                duplicates[element] = True
    return duplicate_elements
It's pretty simple and doesn't go through the list twice, which is kind of nice.
>>> test = [1,2,3,1,1,2,2,4]
>>> find_duplicates(test)
[1, 2]
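The answers above measure two different things: the number of distinct values that are duplicated, and the number of surplus occurrences. The sketch below computes both with a Counter so the difference is visible; the function name duplicate_stats is made up for illustration.

```python
from collections import Counter


def duplicate_stats(items):
    """Return (distinct duplicated values, surplus occurrences)."""
    counts = Counter(items)
    # Values that appear more than once, each counted a single time
    distinct = sum(1 for n in counts.values() if n > 1)
    # Every occurrence beyond the first one of each value
    surplus = sum(n - 1 for n in counts.values())
    return distinct, surplus


print(duplicate_stats([1, 4, "hii", 2, 4, "hello", "hii"]))  # (2, 2)
print(duplicate_stats([1, 2, 3, 1, 1, 2, 2, 4]))             # (2, 4)
```

The second number is the same value that len(l) - len(set(l)) produces.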
