How to print dictionary keys in even frequency order - Python

I would like to print the keys of a dictionary, each repeated as many times as its value, interleaved evenly, like this:
a = {'A': 3, 'B': 5}
=> ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
a = {'A': 4, 'B': 1}
=> ['A', 'B', 'A', 'A', 'A']
I know I can use a while loop that appends each key and decrements its count until every value is 0, but is there a better way to do it?
def func(d: dict):
    res = []
    while any(i > 0 for i in d.values()):
        for k, c in d.items():
            if c > 0:
                res.append(k)
                d[k] -= 1
    return res

(I'm assuming you're using a version of Python that guarantees the iteration order of dictionaries)
Here's an itertools-y approach. It creates a generator for each letter that yields the letter the given number of times, and it combines all of them together with zip_longest so they get yielded evenly.
from itertools import repeat, zip_longest
def iterate_evenly(d):
    generators = [repeat(k, v) for k,v in d.items()]
    exhausted = object()
    for round in zip_longest(*generators, fillvalue=exhausted):
        for x in round:
            if x is not exhausted:
                yield x
print(list(iterate_evenly({"A": 3, "B": 5})))
print(list(iterate_evenly({"A": 4, "B": 1})))
Result:
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
['A', 'B', 'A', 'A', 'A']
You can do the same thing in fewer lines, although it becomes harder to read.
from itertools import repeat, zip_longest
def iterate_evenly(d):
    exhausted = object()
    return [x for round in zip_longest(*(repeat(k, v) for k,v in d.items()), fillvalue=exhausted) for x in round if x is not exhausted]
print(iterate_evenly({"A": 3, "B": 5}))
print(iterate_evenly({"A": 4, "B": 1}))

For a one-liner.
First, create a list with two elements: a list of As and a list of Bs:
>>> d = {'A': 3, 'B': 5}
>>> [[k]*v for k, v in d.items()]
[['A', 'A', 'A'], ['B', 'B', 'B', 'B', 'B']]
[k]*v means: a list containing v copies of k. Second, interleave the As and Bs. We need zip_longest because zip would stop at the end of the shortest list:
>>> import itertools
>>> list(itertools.zip_longest(*[[k]*v for k, v in d.items()]))
[('A', 'B'), ('A', 'B'), ('A', 'B'), (None, 'B'), (None, 'B')]
Now, just flatten the list and remove None values:
>>> [v for vs in itertools.zip_longest(*[[k]*v for k, v in d.items()]) for v in vs if v is not None]
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
Other example:
>>> d = {'A': 4, 'B': 1}
>>> [v for vs in itertools.zip_longest(*[[k]*v for k, v in d.items()]) for v in vs if v is not None]
['A', 'B', 'A', 'A', 'A']

You can just use sum with a generator expression:
res = sum(([key]*value for key, value in d.items()), [])
This exploits the fact that sum can "add" anything that supports the + operator, such as lists, together with sequence repetition ([key]*value builds a list of value copies of key, just as "A"*4 == "AAAA").
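To make the behaviour of the sum call concrete, here is a small sketch using the dictionary from the question. Note that it concatenates the repeated keys group by group; it does not interleave them.

```python
d = {"A": 3, "B": 5}

# sum() starts from the empty list and applies + to each repeated-key list in turn,
# so the result is all As followed by all Bs.
res = sum(([key] * value for key, value in d.items()), [])
print(res)  # ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B']
```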
If you want the order to be randomized, use the random module:
from random import shuffle
shuffle(res)
If, as Thierry Lathuille notes, you want to cycle through the values in the original order, you can use some itertools magic:
from itertools import chain, zip_longest
res = [*filter(
    bool, # drops the None fillers (and any other falsy key, such as 0 or '')
    chain(*zip_longest(
        *([key]*val for key, val in d.items()))
    )
)]
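Because filter(bool, ...) discards every falsy value, keys such as 0 or '' would be lost along with the None fillers. A possible variant, sketched here with a unique filler object (the names cycle_keys and missing are only illustrative), avoids that:

```python
from itertools import chain, zip_longest


def cycle_keys(d):
    """Interleave the keys of d round-robin, each repeated d[key] times."""
    missing = object()  # unique filler that cannot collide with any real key
    rounds = zip_longest(
        *([key] * val for key, val in d.items()),
        fillvalue=missing,
    )
    # Flatten the rounds and skip only the filler, by identity.
    return [x for x in chain.from_iterable(rounds) if x is not missing]


print(cycle_keys({"A": 3, "B": 5}))  # ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
print(cycle_keys({0: 2, "": 1}))     # [0, '', 0]
```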

As an alternative to the replication & zip_longest approach, let's try to simplify the OP's original code:
def function(dictionary):
    result = []
    while dictionary:
        result.extend(dictionary)
        dictionary = {k: v - 1 for k, v in dictionary.items() if v > 1}
    return result
print(function({'A': 3, 'B': 5}))
print(function({'A': 4, 'B': 1}))
OUTPUT
% python3 test.py
['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
['A', 'B', 'A', 'A', 'A']
%
Although it might look otherwise, it's not destructive on the dictionary argument, unlike the OP's original code.
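A quick check of the non-destructive claim, as a sketch: the function only rebinds its local name to a new dictionary on each pass, so the caller's dictionary should be unchanged afterwards. The variable name counts is just for illustration.

```python
def function(dictionary):
    result = []
    while dictionary:
        result.extend(dictionary)
        # Builds a NEW dict each time; the caller's object is never mutated.
        dictionary = {k: v - 1 for k, v in dictionary.items() if v > 1}
    return result


counts = {"A": 3, "B": 5}
out = function(counts)
print(out)     # ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
print(counts)  # {'A': 3, 'B': 5}
```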

It could also be done using a sort of the (position,character) tuples formed by expanding each dictionary entry:
a = {'A': 3, 'B': 5}
result = [c for _,c in sorted( (p,c) for c,n in a.items() for p,c in enumerate(c*n))]
print(result) # ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'B']
If the dictionary's order is usable, you can forgo the sort and use this:
result = [c for i in range(max(a.values())) for c,n in a.items() if i<n]

Related

Python: How to update dictionary with step-index from list

I am a week-old Python learner. I would like to know: Let's say:
list = ["a", "A", "b", "B", "c", "C"]
I need to put them into a dictionary to get a result like this:
dict = {"a": "A", "b": "B", "c": "C"}
I tried to use list indices with dict.update({list[n::2]: list[n+1::2]}) inside for n in range(0, (len(list)/2)).
I think I did something wrong. Please correct me.
Thank you in advance.
Try the following:
>>> lst = ['a', 'A', 'b', 'B', 'c', 'C']
>>> dct = dict(zip(lst[::2],lst[1::2]))
>>> dct
{'a': 'A', 'b': 'B', 'c': 'C'}
Explanation:
>>> lst[::2]
['a', 'b', 'c']
>>> lst[1::2]
['A', 'B', 'C']
>>> zip(lst[::2], lst[1::2])
# this actually gives a zip iterator which contains:
# [('a', 'A'), ('b', 'B'), ('c', 'C')]
>>> dict(zip(lst[::2], lst[1::2]))
# here each tuple is interpreted as key value pair, so finally you get:
{'a': 'A', 'b': 'B', 'c': 'C'}
NOTE: Don't give your variables the same names as Python built-ins such as list and dict.
A correct version of your program would be:
lst = ['a', 'A', 'b', 'B', 'c', 'C']
dct = {}
for n in range(0, len(lst), 2):  # step by 2 so n is always the index of a key
    dct.update({lst[n]: lst[n+1]})
print(dct)
Yours did not work because you used slices in each iteration instead of accessing individual elements. lst[0::2] gives ['a', 'b', 'c'] and lst[1::2] gives ['A', 'B', 'C']. So in the first iteration, when n == 0, you are trying to update the dictionary with the pair ['a', 'b', 'c'] : ['A', 'B', 'C'], and you get a TypeError because lists are unhashable and cannot be used as dictionary keys.
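The failure described above can be reproduced in isolation. This is only a sketch of the first iteration (n == 0); the exact wording of the error message may differ between Python versions, so it is not shown.

```python
lst = ["a", "A", "b", "B", "c", "C"]
dct = {}

error = None
try:
    # The key here is the list ['a', 'b', 'c'], which is unhashable.
    dct.update({lst[0::2]: lst[1::2]})
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
print(dct)                   # {} (nothing was added)
```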
You can use dictionary comprehension like this:
>>> l = list("aAbBcCdD")
>>> l
['a', 'A', 'b', 'B', 'c', 'C', 'd', 'D']
>>> { l[i] : l[i+1] for i in range(0,len(l),2)}
{'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
The code below should fit your question. Hope this helps.
a = ["a", "A", "b", "B", "c", "C", "d", "D"]
b = {}
for each in range(len(a)):
    if each % 2 == 0:  # even indices hold the keys
        b[a[each]] = a[each + 1]
print(b)

Grouping Varying Size of Lists in a Tuple

I want to group all the lists in a tuple based on the last element of each list, and also count the number of times that last element occurs. The challenge is that the lists in the tuple can be of different sizes.
Eg input
[['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd','ee','a'], ['ff', 'gg', 'hh', 'a']]
And I am trying to get the output to be
('a', 2, [('dd','ee'),('ff', 'gg', 'hh')]), ( 'b', 2, [('aa'), ('cc')]), ( 'c', 1, [('bb')])
Finally I want to then go ahead and convert it to a panda-dataframe format. If anyone can help/guide, it would be much appreciated.
Readable version
import itertools
import operator
# mylist is the input list of lists from the question
mylist.sort(key=operator.itemgetter(-1)) # sort by last element
result = []
for k, g in itertools.groupby(mylist, key=operator.itemgetter(-1)):
    # remove last element from each sublist:
    g = [tuple(sublist[:-1]) for sublist in g]
    result.append((k, len(g), g))
Without importing a library
data = [['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd','ee','a'], ['ff', 'gg', 'hh', 'a']]
instances = {}
for sublist in data:
    leading_elements, last_element = sublist[:-1], sublist[-1]
    instances.setdefault(last_element, [])
    instances[last_element].append(tuple(leading_elements))
result = tuple()
for key, val in instances.items():
    # wrap in a 1-tuple so each group stays a single (key, count, items) entry
    result += ((key, len(val), val),)
Use itertools.groupby
>>> from itertools import groupby
>>> l = [['aa', 'b'], ['bb', 'c'], ['cc', 'b'], ['dd','ee','a'], ['ff', 'gg', 'hh', 'a']]
>>>
>>> f = lambda sl: sl[-1]
>>> res = [(k, [tuple(sl[:-1]) for sl in v]) for k,v in groupby(sorted(l, key=f), f)]
>>> res = [(k, len(v), v) for k,v in res]
>>> print(res)
[('a', 2, [('dd', 'ee'), ('ff', 'gg', 'hh')]), ('b', 2, [('aa',), ('cc',)]), ('c', 1, [('bb',)])]

Removing unnecessary list brackets inside a dictionary

I have this dictionary:
n ={'b': [['a'], ['c']], 'a': [['c', 'b'], ['c']], 'c': [['b']]}
and require the following output:
n ={'b': ['a', 'c'], 'a': ['c', 'b'], 'c': ['b']}
I tried to use itertools and join but couldn't get it to work, can anyone help out?
Just use chain.from_iterable from itertools to combine these:
from itertools import chain
from_it = chain.from_iterable
{k: list(from_it(i)) for k, i in n.items()}
If you require unique values in the lists (which according to the title you don't), you can additionally wrap the result of from_it in a set.
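As a sketch of that last remark: wrapping the flattened values in a set removes duplicates but loses their order, whereas dict.fromkeys (assuming Python 3.7+, where dictionaries keep insertion order) keeps the first occurrence of each item in place. The variable names are illustrative.

```python
from itertools import chain

n = {'b': [['a'], ['c']], 'a': [['c', 'b'], ['c']], 'c': [['b']]}

# set() removes duplicates; element order inside each set is not guaranteed.
unique_unordered = {k: set(chain.from_iterable(v)) for k, v in n.items()}

# dict.fromkeys() removes duplicates while keeping first-seen order.
unique_ordered = {k: list(dict.fromkeys(chain.from_iterable(v))) for k, v in n.items()}

print(unique_ordered)  # {'b': ['a', 'c'], 'a': ['c', 'b'], 'c': ['b']}
```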
I would iterate the dict and ignore the irrelevant list.
For uniqueness you can cast each inner_list to a set
n ={'b': [['a', 'b'], ['c']], 'a': [['c', 'b'], ['c']], 'c': [['b']]}
new_n = {}
for k, v in n.items():
    new_n[k] = [inner_item for item in v for inner_item in item]
print(new_n)
You can try this:
from itertools import chain
n ={'b': [['a'], ['c']], 'a': [['c', 'b'], ['c']], 'c': [['b']]}
new_n = {a:list(set(chain(*[i[0] if len(i) == 1 else i for i in b]))) for a, b in n.items()}
Output:
{'a': ['c', 'b'], 'c': ['b'], 'b': ['a', 'c']}
A one-liner solution (not recommended) is:
{key: list(set([item for subarr in value for item in subarr])) for key, value in n.items()}
It is much harder to read, though. If you really do not want to import anything, you can write a helper function.
def flat_and_unique_list(list_of_lists):
    return list(set([item for sub_list in list_of_lists for item in sub_list]))
{key: flat_and_unique_list(value) for key, value in n.items()}
A solution with sum:
>>> {k: sum(v, []) for k, v in n.items()}
{'a': ['c', 'b', 'c'], 'b': ['a', 'c'], 'c': ['b']}
sum(iterable, start=0, /)
Return the sum of a 'start' value (default: 0) plus an iterable of numbers
Therefore, using an empty list as start value works.
Remove duplicates, without preserving order, using set:
>>> {k: list(set(sum(v, []))) for k, v in n.items()}
{'a': ['c', 'b'], 'b': ['a', 'c'], 'c': ['b']}

Python group two lists

I have two lists:
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
Is there anyway i could group items in A for the same sessionid, so that i could print out the following:
1: ["T", "D","Q"]
2: ["D","D"]
The itertools groupby function is designed to do this sort of thing. Some of the other answers here create a dictionary, which is very sensible, but if you don't actually want a dict then you can do this:
from itertools import groupby
from operator import itemgetter
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
for k, g in groupby(zip(sessionid, A), itemgetter(0)):
    print('{}: {}'.format(k, list(list(zip(*g))[1])))
output
1: ['T', 'D', 'Q']
2: ['D', 'D']
operator.itemgetter(0) returns a callable that fetches the item at index 0 of whatever object you pass it; groupby uses this as the key function to determine what items can be grouped together.
Note that this and similar solutions assume that the sessionid indices are sorted. If they aren't then you need to sort the list of tuples returned by zip(sessionid, A) with the same key function before passing them to groupby.
edited to work correctly on Python 2 and Python 3
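For the unsorted case mentioned in the note, a minimal sketch might look like this. The sessionid values below are made up for illustration; sorted is stable, so items keep their original relative order within each session.

```python
from itertools import groupby
from operator import itemgetter

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [2, 1, 2, 1, 1]  # hypothetical unsorted ids, not from the question

# Sort the (id, item) pairs by id first, using the same key function as groupby.
pairs = sorted(zip(sessionid, A), key=itemgetter(0))
grouped = {k: [item for _, item in g] for k, g in groupby(pairs, key=itemgetter(0))}

print(grouped)  # {1: ['D', 'D', 'D'], 2: ['T', 'Q']}
```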
Not using itertools, you can use a dictionary:
index = 0
groups = {}
for i in sessionid:
    if i not in groups:
        groups[i] = []
    # append on every iteration, including the first time an id is seen
    groups[i].append(A[index])
    index += 1
print(groups) # {1: ['T', 'D', 'Q'], 2: ['D', 'D']}
And based on the remarks below:
from collections import defaultdict
groups = defaultdict(list)
for i, item in enumerate(sessionid):
    groups[item].append(A[i])
You could use a dictionary and zip:
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
result = {i:[] for i in sessionid}
for i,j in zip(sessionid,A):
    result[i].append(j)
Or you can use defaultdict:
from collections import defaultdict
result = defaultdict(list)
for k, v in zip(sessionid, A):
    result[k].append(v)
Output:
>>> result
{1: ['T', 'D', 'Q'], 2: ['D', 'D']}
One liner
{k: list(i for (i, _) in v) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}
Without nested loop
{k: list(map(operator.itemgetter(0), v)) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}
You can do:
import pandas as pd
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
pd.DataFrame({'A':A, 'id':sessionid}).groupby('id')['A'].apply(list).to_dict()
#Out[10]: {1: ['T', 'D', 'Q'], 2: ['D', 'D']}
You could also convert them into numpy arrays, and use the indices of the session ids you need with np.where
import numpy as np
A = np.asarray(['T', 'D', 'Q', 'D', 'D'])
sessionid = np.asarray([1, 1, 1, 2, 2])
Ind_1 = np.where(sessionid == 1)
Ind_2 = np.where(sessionid == 2)
print(A[Ind_1])
This should print ['T' 'D' 'Q'].
You could of course turn this into a function which takes N, the desired session, and returns the matching A values.
Hope this helps!

Python, work with list, find max sequence length

for example test_list:
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
What tool or algorithm do I need to get the maximum run length of each value? For this example:
'a' = 3
'b' = 2
'c' = 1
Using a dict to track max lengths, and itertools.groupby to group the sequences by consecutive value:
from itertools import groupby
max_count = {}
for val, grp in groupby(test_list):
    count = sum(1 for _ in grp)
    if count > max_count.get(val, 0):
        max_count[val] = count
Demo:
>>> from itertools import groupby
>>> test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
>>> max_count = {}
>>> for val, grp in groupby(test_list):
...     count = sum(1 for _ in grp)
...     if count > max_count.get(val, 0):
...         max_count[val] = count
...
>>> max_count
{'a': 3, 'c': 1, 'b': 2}
Here is a direct way to do it:
Counts, Count, Last_item = {}, 0, None
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
for item in test_list:
    if Last_item == item:
        Count+=1
    else:
        Count=1
        Last_item=item
    if Count>Counts.get(item, 0):
        Counts[item]=Count
print(Counts)
# {'a': 3, 'b': 2, 'c': 1}
You should read about what a dictionary is (dict in Python) and how you could store how many occurrences there are for a sequence.
Then figure out how to code the logic -
Figure out how to loop over your list. As you go, for every item -
    If it isn't the same as the previous item
        Store how many times you saw the previous item in a row into the dictionary
    Else
        Increment how many times you've seen the item in the current sequence
Print your results
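The outline above can be sketched roughly as follows. Two details that the outline leaves implicit are assumed here: only the longest run per item is kept, and the final run is recorded after the loop ends. The name longest_runs is illustrative.

```python
def longest_runs(items):
    """Map each distinct item to the length of its longest consecutive run."""
    longest = {}
    previous = object()  # sentinel that differs from every real item
    run = 0
    for item in items:
        if item != previous:
            # A new run starts: record the run that just finished, if any.
            if run:
                longest[previous] = max(longest.get(previous, 0), run)
            previous, run = item, 1
        else:
            run += 1
    # Record the last run, which the loop never closes.
    if run:
        longest[previous] = max(longest.get(previous, 0), run)
    return longest


test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
print(longest_runs(test_list))  # {'a': 3, 'b': 2, 'c': 1}
```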
You can use the re module to find all runs of each character in a string built by joining the items of your list (this assumes every item is a single-character string). Then just pick the longest run for each character.
import re
test_list = ['a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a', 'a']
# First obtain the characters.
unique = set(test_list)
max_count = {}
for elem in unique:
    # Find all runs of the same character (escaped in case it is a regex metacharacter).
    result = re.findall('{0}+'.format(re.escape(elem)), "".join(test_list))
    # Find the longest.
    longest = max(result, key=len)
    # Save result.
    max_count.update({elem: len(longest)})
print(max_count)
This will print: {'c': 1, 'b': 2, 'a': 3} (key order may vary)
For Python, Martijn Pieters' groupby is the best answer.
That said, here is a 'basic' way to do it that could be translated to any language:
test_list = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b', 'a', 'a']
hm={}.fromkeys(set(test_list), 0)
idx=0
ll=len(test_list)
while idx<ll:
    item=test_list[idx]
    start=idx
    # advance idx to the end of the current run
    while idx<ll and test_list[idx]==item:
        idx+=1
    end=idx
    hm[item]=max(hm[item],end-start)
print(hm)
# {'a': 3, 'c': 1, 'b': 2}
