I have two lists.
One is a big list in which some elements are duplicated:
super_set_list = [1,1,2,3,3,4,4,4,5,6,7,8,9]
The other is a subset of the big list, also with duplicates
sub_set_list = [1,2,3,3,4,4,6,7,9]
I want the difference, like this
diff = [1,4,5,8]
I'm not sure how to go about this.
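You can use a Counter. Note that the list below contains one more 1 than the list in the question, which is why 1 appears twice in the result.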
super_set_list = [1,1,1,2,3,3,4,4,4,5,6,7,8,9]
sub_set_list = [1,2,3,3,4,4,6,7,9]
from collections import Counter
super_counter = Counter(super_set_list)
# super_counter is now Counter({1: 3, 4: 3, 3: 2, 2: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1})
For every element in sub_set_list, reduce its count in super_counter:
for item in sub_set_list:
    super_counter[item] -= 1
Now super_counter is Counter({1: 2, 4: 1, 5: 1, 8: 1, 2: 0, 3: 0, 6: 0, 7: 0, 9: 0}).
Finally, pick the elements that still have a positive count, adding each one as many times as its remaining count:
diff = []
for k, v in super_counter.items():
    for _ in range(v):
        diff.append(k)
print(diff)
# [1, 1, 4, 5, 8]
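The same idea can probably be written more compactly with Counter subtraction, which drops zero and negative counts, followed by elements(), which repeats each key by its count. This is a minimal sketch using the lists from the question; the helper name multiset_difference is made up for illustration.

```python
from collections import Counter

super_set_list = [1, 1, 2, 3, 3, 4, 4, 4, 5, 6, 7, 8, 9]
sub_set_list = [1, 2, 3, 3, 4, 4, 6, 7, 9]


def multiset_difference(big, small):
    """Return the items of `big` left over after removing each item of `small` once."""
    # Counter - Counter keeps only the keys whose resulting count is positive.
    remaining = Counter(big) - Counter(small)
    # elements() yields each key as many times as its remaining count.
    return list(remaining.elements())


diff = multiset_difference(super_set_list, sub_set_list)
print(diff)  # [1, 4, 5, 8]
```

Neither input list is modified, and items of the subset that are missing from the big list are simply ignored.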
You can loop through the subset list and remove each item from the superset list one by one, as follows:
super_set_list = [1,1,2,3,3,4,4,4,5,6,7,8,9]
sub_set_list = [1,2,3,3,4,4,6,7,9]
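for item in sub_set_list:
    if item in super_set_list:
        # remove() deletes only the first matching occurrence
        super_set_list.remove(item)
print(super_set_list)  # note: super_set_list itself has been modified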
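If the original list must stay untouched, the removal loop can work on a copy instead. A small sketch, assuming the lists from the question; remove_each is a hypothetical helper name. Each remove() call scans the list, so for large inputs a Counter-based approach is likely to be faster.

```python
def remove_each(big, small):
    """Return a copy of `big` with one occurrence removed for every item of `small`."""
    remaining = list(big)  # work on a copy so the caller's list is unchanged
    for item in small:
        try:
            remaining.remove(item)  # removes the first matching occurrence only
        except ValueError:
            pass  # item was not present (or already used up); skip it
    return remaining


super_set_list = [1, 1, 2, 3, 3, 4, 4, 4, 5, 6, 7, 8, 9]
sub_set_list = [1, 2, 3, 3, 4, 4, 6, 7, 9]
print(remove_each(super_set_list, sub_set_list))  # [1, 4, 5, 8]
```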
from itertools import *
import collections
for i in combinations_with_replacement(['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f'],15):
    b = (''.join(i))
    freq = collections.Counter(b)
    for k in freq:
        if freq[k] < 5:
            print(k)
This code is supposed to print only the strings in which every character occurs fewer than 5 times.
What I am trying to do: as each string is joined on the fly, check whether any character is repeated too many times (the limit is x) anywhere in that string, and print only the strings that pass the check.
The problem is that no matter what I try, it either prints everything and ignores the if, or prints nothing.
How do I do this correctly? Or does Python have a simple solution for this?
For example, with a limit of fewer than 5 repetitions the result must be:
False - fffaaffbbdd (f is repeated 5 times)
False - fffffaaaaac (a and f are each repeated 5 times)
True - aaabbbccc11 (no character is repeated more than 4 times)
To state the question more clearly: filter out every string in which a character is repeated too many times before passing the strings to the next function.
In this example the next step is a simple print: strings that satisfy the rule are printed, and strings that break it are not.
If I understand you correctly, you want to print only the strings in which each character occurs at most 4 times:
from collections import Counter
from itertools import combinations_with_replacement
for i in combinations_with_replacement(['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f'],15):
    c = Counter(i)
    # skip this combination if its most frequent character occurs more than 4 times
    if c.most_common(1)[0][1] > 4:
        continue
    print(''.join(i))
Prints:
...
00002446899cccd
00002446899ccce
00002446899cccf
00002446899ccdd
...
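The check itself can be pulled out into a small predicate and tried against the three example strings from the question. This is only a sketch; the name within_limit and its max_count parameter are made up for illustration.

```python
from collections import Counter


def within_limit(s, max_count=4):
    """Return True if no character of `s` occurs more than `max_count` times."""
    return all(n <= max_count for n in Counter(s).values())


print(within_limit("fffaaffbbdd"))  # False: f occurs 5 times
print(within_limit("fffffaaaaac"))  # False: a and f each occur 5 times
print(within_limit("aaabbbccc11"))  # True: no character occurs more than 3 times
```

A predicate like this could also be passed to filter() so that only the accepted strings reach the next function.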
A more constructive approach (meaning: I do not iterate over all possible combinations; I construct the valid combinations directly).
You need to have sympy installed for this to work.
In the example I only use the elements "abcdef" and restrict the repetitions to be strictly smaller than MAX = 4. I fix the length of the output strings at M = 6.
I start by getting all the partitions of M whose parts are at most k = MAX - 1 (the repetition limit) and that consist of no more than m = N parts. I immediately convert those to a list:
{3: 2} [3, 3, 0, 0, 0, 0]
{3: 1, 2: 1, 1: 1} [3, 2, 1, 0, 0, 0]
{3: 1, 1: 3} [3, 1, 1, 1, 0, 0]
{2: 3} [2, 2, 2, 0, 0, 0]
{2: 2, 1: 2} [2, 2, 1, 1, 0, 0]
{2: 1, 1: 4} [2, 1, 1, 1, 1, 0]
{1: 6} [1, 1, 1, 1, 1, 1]
For each of those lists I iterate over its multiset permutations. Each permutation represents which elements I select and how often each one is repeated, e.g.:
[2, 1, 2, 0, 0, 1] -> "aabccf" # 2*"a", 1*"b", ..., 0*"e", 1*"f"
The result you want is then the multiset permutations of those strings.
from sympy.utilities.iterables import multiset_permutations, partitions
MAX = 4  # (all counts < MAX)
elements = "abcdef"
N = len(elements)
M = 6  # output length
def dict_to_list(dct, N):
    # expand a partition such as {3: 1, 2: 1, 1: 1} into [3, 2, 1, 0, 0, 0]
    ret = [0] * N
    j = 0
    for k, v in dct.items():
        ret[j:j + v] = [k] * v
        j += v
    return ret
for dct in partitions(M, k=MAX - 1, m=N):
    lst = dict_to_list(dct, N)
    for part in multiset_permutations(lst):
        el = ''.join(n * v for n, v in zip(part, elements))
        for msp in multiset_permutations(el):
            print(''.join(msp))
For your case you would then need to change:
MAX = 5 # (all counts < MAX)
elements = "0123456789abcdef"
M = 15 # output length
But the complexity of that is huge (although much better than that of the original approach)!
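If only the sorted combinations are needed (as produced by combinations_with_replacement in the question) rather than every permutation, the constructive idea can also be sketched without sympy. This is not the method above, just a plain recursive generator under the assumption that the elements are distinct; limited_combinations is a hypothetical name.

```python
from collections import Counter
from itertools import combinations_with_replacement


def limited_combinations(elements, length, max_count):
    """Yield strings of `length` built from `elements` (kept in order) in which
    no element is used more than `max_count` times."""

    def build(index, remaining, prefix):
        if remaining == 0:
            yield prefix  # the string is complete
            return
        if index == len(elements):
            return  # ran out of elements before filling the string
        # Use elements[index] between max_count and 0 times, then move on.
        for count in range(min(max_count, remaining), -1, -1):
            yield from build(index + 1, remaining - count,
                             prefix + elements[index] * count)

    yield from build(0, length, "")


# Sanity check against brute-force filtering on a small case.
constructed = sorted(limited_combinations("abcd", 5, 2))
brute_force = sorted(
    "".join(c)
    for c in combinations_with_replacement("abcd", 5)
    if max(Counter(c).values()) <= 2
)
print(constructed == brute_force)  # True
```

Invalid combinations are never generated, so no filtering step is needed afterwards.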
I have a list with repeated values and I want to count them using a dictionary comprehension
Here is my initial attempt
number_list = [1,1,2,2,3,3,4,4,5,5]
number_count_dict = {i:1 for i in number_list}
{k: (number_count_dict[k]+1 if k in number_count_dict else 1) for k in number_list}
Is there a way of achieving this without initialising the dictionary?
Take this example for your question:
numbers = [5,3,3,4,2]
Say you would like to turn it into a dictionary where each key is an index and each value is the list element at that index. Then you could try something like this:
{index:numbers[index] for index in range(0,len(numbers))}
Here's the result:
{0: 5, 1: 3, 2: 3, 3: 4, 4: 2}
This keeps only the values that are repeated (count > 1). The filter has to test the count j, not the value i:
>>> from collections import Counter
>>> x=[1,1,2,2,3,3,4,4,5,5]
>>> {i:j for i,j in Counter(x).items() if j>1}
{1: 2, 2: 2, 3: 2, 4: 2, 5: 2}
Your question leaves out a couple of important points. What are the minimum and maximum numbers you want in the dictionary? Do you want numbers with a count of 0 to be included? Can you use the Counter class?
#1: Count items from 0 to len(number_list) - 1, including items with a count of 0, using Counter. (A Counter returns 0 for a missing key, so no membership test is needed.)
>>> from collections import Counter
>>> number_list = [1,1,2,2,3,3,4,4,5,5]
>>> count = Counter(number_list)
>>> number_count_dict = {i:count[i] for i in range(len(number_list))}
{0: 0, 1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 0, 7: 0, 8: 0, 9: 0}
#2: Count items from the lowest number to the highest number in the list, including items with a count of 0, using Counter.
>>> from collections import Counter
>>> number_list = [2,2,3,3,4,4,5,5,7,7]
>>> count = Counter(number_list)
>>> number_count_dict = {i:count[i] for i in range(min(number_list),max(number_list)+1)}
{2: 2, 3: 2, 4: 2, 5: 2, 6: 0, 7: 2}
#3: Count items in list using Counter.
>>> from collections import Counter
>>> number_list = [1,1,2,2,3,3,4,4,5,5]
>>> count = Counter(number_list)
>>> number_count_dict = {i:count[i] for i in set(number_list)}
{1: 2, 2: 2, 3: 2, 4: 2, 5: 2}
#4: Count items from 0 to len(number_list) - 1, including items with a count of 0, NOT using Counter.
>>> number_list = [1,1,2,2,3,3,4,4,5,5]
>>> number_count_dict = {i:number_list.count(i) for i in range(len(number_list))}
{0: 0, 1: 2, 2: 2, 3: 2, 4: 2, 5: 2, 6: 0, 7: 0, 8: 0, 9: 0}
#5: Count items from lowest number to highest number in list including items with count of 0, NOT using Counter.
>>> number_list = [2,2,3,3,4,4,5,5,7,7]
>>> number_count_dict = {i:number_list.count(i) for i in range(min(number_list),max(number_list)+1)}
{2: 2, 3: 2, 4: 2, 5: 2, 6: 0, 7: 2}
#6: Count items in list NOT using Counter.
>>> number_list = [1,1,2,2,3,3,4,4,5,5]
>>> number_count_dict = {i:number_list.count(i) for i in set(number_list)}
{1: 2, 2: 2, 3: 2, 4: 2, 5: 2}
#7: Bonus, using a defaultdict and a for loop.
from collections import defaultdict
number_list = [1,1,2,2,3,3,4,4,5,5]
number_count_dict = defaultdict(int)  # missing keys start at 0
for i in number_list:
    number_count_dict[i] += 1
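Since Counter is a dict subclass, a comprehension may not be needed at all when Counter is allowed. A short sketch using the list from the question; the variable names are chosen only for illustration.

```python
from collections import Counter

number_list = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

# Counter builds the whole value -> count mapping in a single pass.
counts = Counter(number_list)

# Convert to a plain dict if the Counter behaviour is not wanted.
number_count_dict = dict(counts)
print(number_count_dict)  # {1: 2, 2: 2, 3: 2, 4: 2, 5: 2}

# A comprehension is still handy for filtering, e.g. keeping repeated values only.
repeated_only = {value: n for value, n in counts.items() if n > 1}
```

Compared with calling number_list.count(i) inside a comprehension, this scans the list once instead of once per distinct value.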
I have a list of dictionaries:
L = [{0:1,1:7,2:3,4:8},{0:3,2:6},{1:2,4:6}....{0:2,3:2}].
As you can see, the dictionaries have different lengths. What I need is to add the missing key:value pairs to every dictionary so that they all have the same length:
L1 = [{0:1,1:7,2:3,3:0,4:8},{0:3,1:0,2:6,3:0,4:0},{0:0,1:2,2:0,3:0,4:6}....{0:2,1:0,2:0,3:2,4:0}],
That is, add zeros for the missing values. The maximum length isn't given in advance, so it can only be found by iterating through the list.
I tried to do something with defaultdicts, like L1 = defaultdict(L), but it seems I don't properly understand how they work.
You'll have to make two passes: one to find all the keys (here, via the largest key), and another to add the missing keys:
max_key = max(max(d) for d in L)
empty = dict.fromkeys(range(max_key + 1), 0)
L1 = [dict(empty, **d) for d in L]
This uses an 'empty' dictionary as a base to quickly produce all keys; a new copy of this dictionary plus an original dictionary produces the output you want.
Note that this assumes your keys are always sequential. If they are not, you can produce the union of all existing keys instead:
empty = dict.fromkeys(set().union(*L), 0)
L1 = [dict(empty, **d) for d in L]
Demo:
>>> L = [{0: 1, 1: 7, 2: 3, 4: 8}, {0: 3, 2: 6}, {1: 2, 4: 6}, {0: 2, 3: 2}]
>>> max_key = max(max(d) for d in L)
>>> empty = dict.fromkeys(range(max_key + 1), 0)
>>> [dict(empty, **d) for d in L]
[{0: 1, 1: 7, 2: 3, 3: 0, 4: 8}, {0: 3, 1: 0, 2: 6, 3: 0, 4: 0}, {0: 0, 1: 2, 2: 0, 3: 0, 4: 6}, {0: 2, 1: 0, 2: 0, 3: 2, 4: 0}]
or the set approach:
>>> empty = dict.fromkeys(set().union(*L), 0)
>>> [dict(empty, **d) for d in L]
[{0: 1, 1: 7, 2: 3, 3: 0, 4: 8}, {0: 3, 1: 0, 2: 6, 3: 0, 4: 0}, {0: 0, 1: 2, 2: 0, 3: 0, 4: 6}, {0: 2, 1: 0, 2: 0, 3: 2, 4: 0}]
The above approach of merging two dictionaries into a new one with dict(d1, **d2) always works in Python 2. In Python 3, additional constraints have been set on what kind of keys you can use this trick with; only string keys are allowed for the second dictionary. For this example, where you have numeric keys, you can use dictionary unpacking instead:
{**empty, **d} # Python 3 dictionary unpacking
That'll work in Python 3.5 and newer.
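Put together for Python 3, the two passes might look like the sketch below. It uses the union-of-keys variant with dictionary unpacking; the function name pad_with_zeros and its fill parameter are assumptions made for this example.

```python
def pad_with_zeros(dicts, fill=0):
    """Return new dictionaries that all share the same keys, using `fill` for gaps."""
    # Pass 1: collect every key that appears in any dictionary.
    all_keys = sorted(set().union(*dicts))
    empty = dict.fromkeys(all_keys, fill)
    # Pass 2: overlay each original dictionary on a copy of the template.
    return [{**empty, **d} for d in dicts]


L = [{0: 1, 1: 7, 2: 3, 4: 8}, {0: 3, 2: 6}, {1: 2, 4: 6}, {0: 2, 3: 2}]
L1 = pad_with_zeros(L)
print(L1[1])  # {0: 3, 1: 0, 2: 6, 3: 0, 4: 0}
```

The original list L is left unchanged because every output dictionary is newly created.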
A word of caution: this modifies L in place.
>>> allkeys = frozenset().union(*L)
>>> for i in L:
...     for j in allkeys:
...         if j not in i:
...             i[j]=0
...
>>> L
[{0: 1, 1: 7, 2: 3, 3: 0, 4: 8}, {0: 3, 1: 0, 2: 6, 3: 0, 4: 0}, {0: 0, 1: 2, 2: 0, 3: 0, 4: 6}, {0: 2, 1: 0, 2: 0, 3: 2, 4: 0}]
Maybe not the most elegant solution, but it should work:
L = [{0:1,1:7,2:3,4:8},{0:3,2:6},{1:2,4:6},{0:2,3:2}]
alldicts = {}
for d in L:
    alldicts.update(d)
allkeys = alldicts.keys()
for d in L:
    for key in allkeys:
        if key not in d:
            d[key] = 0
print(L)
This is only one solution, but I think it's simple and straightforward. Note that it modifies the dictionaries in place, so if you want them to be copied, let me know and I'll revise accordingly.
keys_seen = []
for D in L:  # loop through the list
    for key in D.keys():  # loop through each dictionary's keys
        if key not in keys_seen:  # if we haven't seen this key before, then...
            keys_seen.append(key)  # add it to the list of keys seen
for D1 in L:  # loop through the list again
    for key in keys_seen:  # loop through the list of keys that we've seen
        if key not in D1:  # if the dictionary is missing that key, then...
            D1[key] = 0  # add it and set it to 0
This is quick and slim:
missing_keys = set(dict2.keys()) - set(dict1.keys())  # keys present in dict2 but missing from dict1
for k in missing_keys:
    dict1[k] = dict2[k]  # copy the missing entry over from dict2
Unless None is a valid value in your dictionaries, here is a solution. Note that it fills each missing key with the value found in another dictionary rather than with 0:
L = [{0: 1, 1: 7, 2: 3, 4: 8}, {0: 3, 2: 6}, {1: 2, 4: 6}, {0: 2, 3: 2}]
for i0, d0 in enumerate(L[:-1]):
    for d1 in L[i0:]:
        # copy into d0 every key it is missing from d1, and vice versa
        _ = [d0.__setitem__(k,d1[k]) for k in d1 if d0.get(k,None) is None]
        _ = [d1.__setitem__(k,d0[k]) for k in d0 if d1.get(k,None) is None]
print(L)
# The resulting values are as follows (the key order in the printed output may differ):
# [{0: 1, 1: 7, 2: 3, 3: 2, 4: 8}, {0: 3, 1: 7, 2: 6, 3: 2, 4: 8}, {0: 1, 1: 2, 2: 3, 3: 2, 4: 6}, {0: 2, 1: 7, 2: 3, 3: 2, 4: 8}]
I have a default dictionary with name df:
defaultdict(<type 'int'>, {u'DE': 1, u'WV': 1, u'HI': 1, u'WY': 1, u'NH': 2, u'NJ': 1, u'NM': 1, u'TX': 1, u'LA': 1, u'NC': 1, u'NE': 1, u'TN': 1, u'RI': 1, u'VA': 1, u'CO': 1, u'AK': 1, u'AR': 1, u'IL': 1, u'GA': 1, u'IA': 1, u'MA': 1, u'ID': 1, u'ME': 1, u'OK': 2, u'MN': 1, u'MI': 1, u'KS': 1, u'MT': 1, u'MS': 1, u'SC': 2, u'KY': 1, u'OR': 1, u'SD': 1})
How do I get the keys of this dictionary whose values are more than 1?
If I do [df[val] for val in df if df[val]>1]
I get the output [2, 2, 2].
If I print [df.keys() for val in df if df[val]>1], I still do not get the keys. I need the keys that have values more than 1, like this: ['SC', 'OK', 'NH']
How do I do that?
Reading from a dictionary created using defaultdict() is the same as a normal dict.
To get the keys which have values > 1, you would do:
my_dict = defaultdict(...)
print [key for key, value in my_dict.iteritems() if value > 1]
If you're using Python 3, then it's my_dict.items(), and print is a function, so the list comprehension goes inside print(...).
We can use a list comprehension.
>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> d['HI'] = 1
>>> d['NH'] = 2
>>> d['WY'] = 1
>>> d['OK'] = 2
>>> [i[0] for i in d.items() if i[1]>1]
['NH', 'OK']
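For reference, a Python 3 version of the same filter, written with tuple unpacking so the key and the count have names. The sample states below are a small made-up subset, not the full data from the question.

```python
from collections import defaultdict

# Hypothetical sample data: each occurrence of a state adds 1 to its count.
state_counts = defaultdict(int)
for state in ["DE", "NH", "OK", "NH", "SC", "OK", "SC"]:
    state_counts[state] += 1

# Keep only the keys whose count is greater than 1.
repeated = [state for state, n in state_counts.items() if n > 1]
print(repeated)  # ['NH', 'OK', 'SC']
```

Iterating over items() only reads existing entries, so the defaultdict does not gain any new keys during the filter.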