Summing instead of overriding values in dictionary comprehension - python

When keys in a dictionary comprehension are the same, I want their values to be added up. For example,
>>> dct = {-1: 1, 0: 2, 1: 3}
>>> {k**2: v for k, v in dct.items()}
{1: 3, 0: 2}
However, what I want in this case is {1: 4, 0: 2}, because the squares of both 1 and -1 are 1, and 1 + 3 = 4.
Clearly, I can do it with a for loop, but is there a shorthand?

There isn't a shorthand version, because the comprehension would need to keep track of the values accumulated so far, and a dictionary comprehension has no access to the dictionary it is building. Like you said, the answer is a for loop:
old = {-1: 1, 0: 2, 1: 3}
new = {}
for k, v in old.items():
    new[k**2] = new.get(k**2, 0) + v
I saw this dict.get trick somewhere in the Python docs. It does the same thing as:
if k**2 in new:
    new[k**2] += v
else:
    new[k**2] = v
The first variation uses the get method, which returns a default of 0 when the key doesn't exist yet; that default is then added to the value being assigned. Since it is 0 and the values are numbers being added, it has no effect. By contrast, if you needed the product, you'd use 1 as the default, since starting from 0 would leave every product stuck at 0.
Note that both versions evaluate k**2 twice per iteration. Computing it only once would require storing it in a variable on another line, which in my opinion isn't worth it here; either way, the get method is the cleaner of the two.
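To illustrate the point about defaults above, here is a minimal sketch of the product variant. The sample data and the names `products` and `key` are illustrative and not from the question; `key` also stores the squared value so it is computed only once per iteration.

```python
# Illustrative data (not from the question): -1 and 1 both square to 1.
old = {-1: 2, 0: 5, 1: 3}

products = {}
for k, v in old.items():
    key = k ** 2  # computed once per iteration
    # The default is 1, the neutral element for multiplication.
    products[key] = products.get(key, 1) * v

print(products)  # {1: 6, 0: 5}
```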

One of the fastest ways to calculate the sums is to use defaultdict - a self-initializing dictionary:
from collections import defaultdict
new = defaultdict(int)
for k, v in old.items():
    new[k**2] += v
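As a rough sketch, the defaultdict loop can be wrapped in a small reusable helper. The function name `sum_by_key` and its `key_func` parameter are hypothetical names chosen for this example, not part of any library.

```python
from collections import defaultdict


def sum_by_key(mapping, key_func):
    """Sum the values of `mapping`, grouped by key_func(original_key)."""
    totals = defaultdict(int)  # missing keys start at int() == 0
    for k, v in mapping.items():
        totals[key_func(k)] += v
    return dict(totals)  # convert back to a plain dict for the caller


dct = {-1: 1, 0: 2, 1: 3}
print(sum_by_key(dct, lambda k: k ** 2))  # {1: 4, 0: 2}
```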

Related

python list of lists to dict when key appear many times

I know how to write something simple and slow with a loop, but I need it to run very fast at large scale.
input:
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
desired output:
d = {1: ["txt1", "txt2"], 2: "txt3"}
Is there something built into Python that makes dict() extend the value of a repeated key instead of replacing it?
dict(list(zip(lst[0], lst[1])))
One option is to use dict.setdefault:
out = {}
for k, v in zip(*lst):
    out.setdefault(k, []).append(v)
Output:
{1: ['txt1', 'txt2'], 2: ['txt3']}
If you want the element itself for singleton lists, one way is adding a condition that checks for it while you build an output dictionary:
out = {}
for k, v in zip(*lst):
    if k in out:
        if isinstance(out[k], list):
            out[k].append(v)
        else:
            out[k] = [out[k], v]
    else:
        out[k] = v
or if lst[0] is sorted (like it is in your sample), you could use itertools.groupby:
from itertools import groupby
out = {}
pos = 0
for k, v in groupby(lst[0]):
    length = len([*v])
    if length > 1:
        out[k] = lst[1][pos:pos+length]
    else:
        out[k] = lst[1][pos]
    pos += length
Output:
{1: ['txt1', 'txt2'], 2: 'txt3'}
But as @timgeb notes, this is probably not what you want: afterwards you'll have to check the data type each time you access the dictionary (whether the value is a list or not), an unnecessary problem you can avoid by keeping all values as lists.
If you're dealing with large datasets it may be useful to add a pandas solution.
>>> import pandas as pd
>>> lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
>>> s = pd.Series(lst[1], index=lst[0])
>>> s
1 txt1
1 txt2
2 txt3
>>> s.groupby(level=0).apply(list).to_dict()
{1: ['txt1', 'txt2'], 2: ['txt3']}
Note that this also produces lists for single elements (e.g. ['txt3']) which I highly recommend. Having both lists and strings as possible values will result in bugs because both of those types are iterable. You'd need to remember to check the type each time you process a dict-value.
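A small sketch of the kind of bug described above: because strings are iterable too, code that loops over the values silently splits a bare string into characters. The helper `flatten` and the two sample dictionaries are illustrative names, not from the question.

```python
# Mixed value types: key 2 holds a bare string instead of a list.
mixed = {1: ["txt1", "txt2"], 2: "txt3"}
# Uniform value types: every value is a list.
uniform = {1: ["txt1", "txt2"], 2: ["txt3"]}


def flatten(d):
    """Collect every item from every value of d into one list."""
    return [item for value in d.values() for item in value]


print(flatten(mixed))    # ['txt1', 'txt2', 't', 'x', 't', '3']
print(flatten(uniform))  # ['txt1', 'txt2', 'txt3']
```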
You can use a defaultdict to group the strings by their corresponding key, then make a second pass through the resulting dictionary to extract the strings from singleton lists. Regardless of what you do, you'll need to access every element in both lists at least once, so some iteration is necessary (and even if you don't iterate explicitly, whatever you use will almost certainly iterate under the hood):
from collections import defaultdict
lst = [[1, 1, 2], ["txt1", "txt2", "txt3"]]
result = defaultdict(list)
for key, value in zip(lst[0], lst[1]):
    result[key].append(value)
for key in result:
    if len(result[key]) == 1:
        result[key] = result[key][0]
print(dict(result))  # Prints {1: ['txt1', 'txt2'], 2: 'txt3'}

How to switch between keys and values in python dictionary in place (without changing its location in memory)

I was asked to write a function, reverse_dict_in_place(d),
which swaps the keys and values of the input dictionary
without changing the dictionary's location in memory (in place).
However, testing with the id() function shows that all my solutions do change the dictionary's memory location.
def reverse_dict_in_place(d):
    d = {y: x for x, y in d.items()}
    return d
An alternative to the current answers that allows values to be the same as keys. It works in mostly the same way; however, once again, no two values may be the same.
def reverse_dict_in_place(d):
    copy = d.copy().items()
    d.clear()
    for k, v in copy:
        d[v] = k
    return d
>>> x = {0: 1, 1: 2}
>>> y = reverse_dict_in_place(x)
>>> id(x) == id(y)
True
>>>
Some assumptions for this to work (thanks to all the users who pointed these out):
There are no duplicate values
There are no non-hashable values
There are no values that are also keys
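If you're comfortable with those assumptions then I think this should work:
def reverse_dict_in_place(d):
    # Iterate over a snapshot of the items: mutating a dict while
    # iterating over it directly is unsafe in Python 3 and can raise RuntimeError.
    for k, v in list(d.items()):
        del d[k]
        d[v] = k
    return d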
Extending Gad's suggestion, you could use a dict comprehension:
reversed_d = {v: k for k, v in d.items()}
Where d is a dict (note that this builds a new dictionary rather than modifying d in place), and the same assumptions apply:
There are no duplicate values
There are no non-hashable values
There are no values that are also keys
This would not work, without modification, for nested dicts.
Note: @NightShade posted an answer similar to mine below before I posted it.
You can try this:
def reverse_dict_in_place(d):
    d_copy = d.copy()
    d.clear()
    for k in d_copy:
        d[d_copy[k]] = k
This would work even if one of the dictionary's values happens to also be a key (as tested out below)
Testing it out:
my_dict = {1:1, 2:'two', 3:'three'}
reverse_dict_in_place(my_dict)
print(my_dict)
Output:
{1: 1, 'two': 2, 'three': 3}
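For context, the version in the question changes the id because `d = {...}` inside the function only rebinds the local name `d` to a brand-new dictionary; the caller's object is never touched. The sketch below mutates the original object instead and additionally checks the "no duplicate values" assumption before clearing anything. The function name `reverse_dict_in_place_checked` is hypothetical, and the check itself assumes the values are hashable (otherwise `set()` raises TypeError before the dictionary is modified).

```python
def reverse_dict_in_place_checked(d):
    """Swap keys and values of d, mutating d itself."""
    items = list(d.items())  # snapshot taken before mutation
    values = [v for _, v in items]
    # Duplicate values would collapse into a single key, losing data.
    if len(set(values)) != len(values):
        raise ValueError("duplicate values cannot become unique keys")
    d.clear()                            # same object, now empty
    d.update((v, k) for k, v in items)   # refill with swapped pairs


my_dict = {1: 1, 2: "two", 3: "three"}
before = id(my_dict)
reverse_dict_in_place_checked(my_dict)
print(my_dict)                # {1: 1, 'two': 2, 'three': 3}
print(id(my_dict) == before)  # True
```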

Extract a list of keys by Sorting the dictionary in python

I have my program's output as a Python dictionary, and I want a list of keys from dictn:
s = "cool_ice_wifi"
r = ["water_is_cool", "cold_ice_drink", "cool_wifi_speed"]
good_list = s.split("_")
dictn = {}
for i in range(len(r)):
    split_review = r[i].split("_")
    counter = 0
    for good_word in good_list:
        if good_word in split_review:
            counter = counter + 1
    d1 = {i: counter}
    dictn.update(d1)
print(dictn)
The conditions under which the keys should be collected:
Keys with the same value keep their original relative order in the output list.
Keys with the highest values come first in the output list, followed by those with lower values.
dictn = {0: 1, 1: 1, 2: 2}
Expected output = [2, 0, 1]
You can use a list comp:
[key for key in sorted(dictn, key=dictn.get, reverse=True)]
In Python 3 you can use the built-in sorted function to sort a dictionary's keys in any way you choose.
Check out the documentation, but in the simplest case you can pass the dictionary's .get method as the key function, while for more complex operations you'd define a key function yourself.
Dictionaries are insertion-ordered as of Python 3.7, so another way to do things is to sort at the moment of dictionary creation, or you could use an OrderedDict.
Here's an example of the first option in action, which I think is the easiest:
>>> a = {}
>>> a[0] = 1
>>> a[1] = 1
>>> a[2] = 2
>>> print(a)
{0: 1, 1: 1, 2: 2}
>>>
>>> [(k) for k in sorted(a, key=a.get, reverse=True)]
[2, 0, 1]
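The tie-breaking requirement (keys with equal values keep their original order) works because Python's sort is stable, and `reverse=True` preserves that stability. A minimal sketch, with the variable names `by_value_desc` and `by_negated` chosen only for this example:

```python
dictn = {0: 1, 1: 1, 2: 2}

# Stable sort: keys 0 and 1 share the value 1 and keep their original order,
# even with reverse=True.
by_value_desc = sorted(dictn, key=dictn.get, reverse=True)
print(by_value_desc)  # [2, 0, 1]

# Equivalent for numeric values: sort ascending on the negated value.
by_negated = sorted(dictn, key=lambda k: -dictn[k])
print(by_negated)  # [2, 0, 1]
```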

Returning unique elements from values in a dictionary

I have a dictionary like this :
d = {'v03':["elem_A","elem_B","elem_C"],'v02':["elem_A","elem_D","elem_C"],'v01':["elem_A","elem_E"]}
How would you return a new dictionary containing, for every other key, the elements that are not in the list stored under the highest key?
In this case :
d2 = {'v02':['elem_D'],'v01':["elem_E"]}
Thank you,
I prefer to do differences with the builtin data type designed for it: sets.
It is also preferable to write loops rather than elaborate comprehensions. One-liners are clever, but understandable code that you can return to and understand is even better.
d = {'v03':["elem_A","elem_B","elem_C"],'v02':["elem_A","elem_D","elem_C"],'v01':["elem_A","elem_E"]}
last = None
d2 = {}
for key in sorted(d.keys()):
    if last:
        if set(d[last]) - set(d[key]):
            d2[last] = sorted(set(d[last]) - set(d[key]))
    last = key
print(d2)
{'v01': ['elem_E'], 'v02': ['elem_D']}
from collections import defaultdict
myNewDict = defaultdict(list)
all_keys = sorted(d.keys())  # sorted() works on both Python 2 and 3
max_value = all_keys[-1]
for key in d:
    if key != max_value:
        for value in d[key]:
            if value not in d[max_value]:
                myNewDict[key].append(value)
You can get fancier with set operations by taking the set difference between the values in d[max_value] and each of the other keys but first I think you should get comfortable working with dictionaries and lists.
defaultdict(<type 'list'>, {'v01': ['elem_E'], 'v02': ['elem_D']})
One reason not to use sets is that the solution does not generalize well, because sets can only hold hashable objects. If your values are lists of lists, the members (sublists) are not hashable, so you can't use set operations on them.
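A short sketch of that limitation, using made-up lists of lists named `a` and `b`: building a set fails with TypeError, while a membership test with `in` still works because it only needs equality comparison.

```python
a = [[1, 2], [3, 4], [5, 6]]
b = [[1, 2], [5, 6]]

try:
    set(a)  # lists are unhashable, so this raises TypeError
except TypeError as exc:
    print("set() failed:", exc)

# List membership compares with ==, so no hashing is required.
difference = [item for item in a if item not in b]
print(difference)  # [[3, 4]]
```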
Depending on your python version, you may be able to get this done with only one line, using dict comprehension:
>>> d2 = {k:[v for v in values if not v in d.get(max(d.keys()))] for k, values in d.items()}
>>> d2
{'v01': ['elem_E'], 'v02': ['elem_D'], 'v03': []}
This builds a copy of dict d in which every list has been stripped of all items stored under the max key. The resulting dict looks more or less like what you are going for.
If you don't want the empty list at key v03, filter the result with another dict comprehension:
>>> {k:v for k,v in d2.items() if len(v) > 0}
{'v01': ['elem_E'], 'v02': ['elem_D']}
EDIT:
In case your original dict has a very large key set (or the operation is required frequently), you might also want to replace the expression d.get(max(d.keys())) with a previously assigned list variable for performance, since the expression is otherwise re-evaluated for every element. This roughly halves the running time: the following runs 100,000 times in 1.5 secs on my machine, whereas the unsubstituted expression takes more than 3 seconds.
>>> bl = d.get(max(d.keys()))
>>> d2 = {k:v for k,v in {k:[v for v in values if not v in bl] for k, values in d.items()}.items() if len(v) > 0}
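The same idea can be sketched as a plain function that looks up the highest key once and skips it in the loop. The function name `not_in_highest` is hypothetical, and converting the reference list to a set is an extra assumption (it requires hashable elements) made to speed up membership tests.

```python
def not_in_highest(d):
    """Return, per key, the elements missing from the highest key's list."""
    top_key = max(d)               # computed once, outside the loop
    top_values = set(d[top_key])   # assumes elements are hashable
    result = {}
    for key, values in d.items():
        if key == top_key:
            continue
        extra = [v for v in values if v not in top_values]
        if extra:                  # drop keys with nothing left
            result[key] = extra
    return result


d = {'v03': ["elem_A", "elem_B", "elem_C"],
     'v02': ["elem_A", "elem_D", "elem_C"],
     'v01': ["elem_A", "elem_E"]}
print(not_in_highest(d))  # {'v02': ['elem_D'], 'v01': ['elem_E']}
```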

Efficient way to find the largest key in a dictionary with non-zero value

I'm new to Python and trying to write code in a more Pythonic and efficient fashion.
Given a dictionary with numeric keys and values, what is the best way to find the largest key with a non-zero value?
Thanks
Something like this should be reasonably fast:
>>> x = {0: 5, 1: 7, 2: 0}
>>> max(k for k, v in x.items() if v != 0)
1
(removing the != 0 will be slightly faster still, but obscures the meaning somewhat.)
To get the largest key, you can use the max function and inspect the keys like this:
max(x.keys())
To filter out the keys whose value is 0, you can use a generator expression:
(k for k, v in x.items() if v != 0)
You can combine these to get what you are looking for (since the generator expression is the only argument passed to max, the parentheses around it can be dropped):
max(k for k, v in x.items() if v != 0)
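One edge case worth sketching: if every value is zero, the generator is empty and max raises ValueError. On Python 3.4 and later, max accepts a `default` keyword for this case; once a second argument is passed, the generator expression needs its own parentheses. The names `largest` and `largest_or_none` are illustrative.

```python
x = {0: 5, 1: 7, 2: 0}
largest = max((k for k, v in x.items() if v != 0), default=None)
print(largest)  # 1

# No non-zero values: the default is returned instead of raising ValueError.
all_zero = {0: 0, 1: 0}
largest_or_none = max((k for k, v in all_zero.items() if v != 0), default=None)
print(largest_or_none)  # None
```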
Python's max function takes a key= parameter for a "measure" function.
data = {1: 25, 0: 75}
def keymeasure(key):
    # zero-valued entries measure as 0; all others measure as the key itself
    return data[key] and key
print(max(data, key=keymeasure))
Using an inline lambda to the same effect, with the same binding of local variables:
print(max(data, key=(lambda k: data[k] and k)))
A last alternative binds the local variable into the anonymous key function as a default argument:
print(max(data, key=(lambda k, mapping=data: mapping[k] and k)))
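A caveat with the `data[k] and k` measure, sketched with made-up data: a zero-valued entry measures as 0, which outranks every negative key, so the approach is only reliable when the keys are non-negative. Filtering with a generator expression does not have this problem.

```python
# Illustrative data with negative keys; only -5 has a non-zero value.
data = {-3: 0, -5: 2}

# Measures: -3 -> 0 (its value is zero), -5 -> -5. The zero wins.
by_measure = max(data, key=lambda k: data[k] and k)
print(by_measure)  # -3, even though data[-3] == 0

# Filtering first gives the intended answer.
by_filter = max(k for k, v in data.items() if v != 0)
print(by_filter)  # -5
```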
If I were you and speed were a big concern, I'd probably create a new container class, "DictMax", that keeps track of its largest keys with non-zero values using an internal stack of keys, where the top of the stack is always the key of the largest such element in the dictionary. That way you'd get the largest element in constant time every time.
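One possible sketch of such a container follows. It is not the stack design described above: it swaps in a heap with lazy deletion (via the standard heapq module), so lookups are cheap on average rather than strictly constant time, and repeated writes to the same key leave stale heap entries until they are popped. The method name `largest_nonzero_key` is hypothetical, and keys are assumed to be numeric.

```python
import heapq


class DictMax:
    """Mapping wrapper that tracks the largest key with a non-zero value."""

    def __init__(self):
        self._data = {}
        self._heap = []  # negated keys (max-heap); may hold stale entries

    def __setitem__(self, key, value):
        self._data[key] = value
        if value != 0:
            heapq.heappush(self._heap, -key)

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]  # the heap entry is discarded lazily later

    def largest_nonzero_key(self):
        # Drop stale entries until the top refers to a live, non-zero key.
        while self._heap:
            key = -self._heap[0]
            if self._data.get(key, 0) != 0:
                return key
            heapq.heappop(self._heap)
        return None  # no key currently has a non-zero value


dm = DictMax()
dm[1] = 7
dm[2] = 0
dm[5] = 3
print(dm.largest_nonzero_key())  # 5
dm[5] = 0
print(dm.largest_nonzero_key())  # 1
```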
lst = [1, 1, 2, 3, 3]  # renamed from "list" to avoid shadowing the built-in
s = {item: lst.count(item) for item in lst}  # number of occurrences of each item
largest_value = max(k for k, v in s.items() if v != 0)  # largest key with a non-zero count
print(largest_value)
