I have data which consists of a series of categories, each with two amounts. For example, {'cat':'red', 'a':1, 'b':2}, {'cat':'red', 'a':3, 'b':3}, {'cat':'blue', 'a':1, 'b':3}
I want to keep a running total of the two amounts, by category. Result would be {'cat':'red', 'a':4, 'b':5}, {'cat':'blue', 'a':1, 'b':3}
Is there a more pythonic method than:
totals = {}
for item in data:
    if item['cat'] in totals:
        totals[item['cat']]['a'] += item['a']
        totals[item['cat']]['b'] += item['b']
    else:
        totals[item['cat']] = {'a':item['a'], 'b':item['b']}
Your data structure should really be moved to a dictionary, keyed on the cat value. Use collections.defaultdict() and collections.Counter() to keep track of the values and make summing easier:
from collections import defaultdict, Counter
totals = defaultdict(Counter)
for item in data:
    cat = item.pop('cat')
    totals[cat] += Counter(item)
Demo:
>>> from collections import defaultdict, Counter
>>> data = {'cat':'red', 'a':1, 'b':2}, {'cat':'red', 'a':3, 'b':3}, {'cat':'blue', 'a':1, 'b':3}
>>> totals = defaultdict(Counter)
>>> for item in data:
...     cat = item.pop('cat')
...     totals[cat] += Counter(item)
...
>>> totals
defaultdict(<class 'collections.Counter'>, {'blue': Counter({'b': 3, 'a': 1}), 'red': Counter({'b': 5, 'a': 4})})
>>> totals['blue']
Counter({'b': 3, 'a': 1})
>>> totals['red']
Counter({'b': 5, 'a': 4})
If you still require a sequence of dictionaries in the same format, you can then turn the above dictionary of counters back into 'plain' dictionaries again:
output = []
for cat, counts in totals.items():
    item = {'cat': cat}
    item.update(counts)
    output.append(item)
resulting in:
>>> output
[{'a': 1, 'b': 3, 'cat': 'blue'}, {'a': 4, 'b': 5, 'cat': 'red'}]
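Note that item.pop('cat') in the loop above removes the 'cat' key from the input dictionaries. If the original data has to stay untouched, a variation along the following lines might work. This is only a sketch: the amounts name is introduced here for illustration, and Counter.update() is used instead of +=, which (unlike +=) also keeps totals that are zero or negative.

```python
from collections import Counter, defaultdict

data = [
    {'cat': 'red', 'a': 1, 'b': 2},
    {'cat': 'red', 'a': 3, 'b': 3},
    {'cat': 'blue', 'a': 1, 'b': 3},
]

totals = defaultdict(Counter)
for item in data:
    # Copy everything except 'cat' so the input dictionaries are not modified.
    amounts = {k: v for k, v in item.items() if k != 'cat'}
    totals[item['cat']].update(amounts)

# Rebuild the list-of-dicts format, putting the category back in.
output = [dict(counts, cat=cat) for cat, counts in totals.items()]
```

After this runs, data still contains its 'cat' keys and output holds one dictionary per category.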
Have a look at dict.setdefault and collections.Counter.
Possible solution using setdefault:
totals = {}
for item in data:
    d = totals.setdefault(item['cat'], {'a':0, 'b':0})
    d['a'] += item['a']
    d['b'] += item['b']
with result totals = {'blue': {'a': 1, 'b': 3}, 'red': {'a': 4, 'b': 5}}. Note that this does not have the 'cat' entries like in your expected answer. Instead, the colors are used directly as the keys of the resulting dictionary.
See Martijn's answer for an example using Counter.
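If the 'cat' entries are needed after all, the setdefault result can be turned back into a list of dictionaries. A minimal sketch, assuming Python 3.5+ for the {**...} unpacking; the as_list name is made up for this example:

```python
data = [
    {'cat': 'red', 'a': 1, 'b': 2},
    {'cat': 'red', 'a': 3, 'b': 3},
    {'cat': 'blue', 'a': 1, 'b': 3},
]

totals = {}
for item in data:
    # setdefault returns the existing entry, or stores and returns the default.
    d = totals.setdefault(item['cat'], {'a': 0, 'b': 0})
    d['a'] += item['a']
    d['b'] += item['b']

# Put the category back into each dictionary.
as_list = [{'cat': cat, **amounts} for cat, amounts in totals.items()]
```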
I would collect your data into a temporary composite data structure built from the collections.Counter and collections.defaultdict classes. The keys of this data structure would be the cat's color, and associated with each would be a Counter to hold the totals for that color of cat. Making it a defaultdict means not having to worry about whether it's the first time a color has been encountered.
This will perform the summing of values needed as it is created and is fairly easy to turn into the output sequence you want afterwards:
from collections import Counter, defaultdict
data = ({'cat':'red', 'a':1, 'b':2},
        {'cat':'red', 'a':3, 'b':3},
        {'cat':'blue', 'a':1, 'b':3})
cat_totals = defaultdict(Counter)  # hybrid data structure
for entry in data:
    cat_totals[entry['cat']].update({k:v for k,v in entry.items()
                                     if k != 'cat'})
# list() is needed because items() returns a view, not a list, in Python 3
results = tuple(dict([('cat', color)] + list(cat_totals[color].items()))
                for color in cat_totals)
print(results)  # ({'cat': 'red', 'a': 4, 'b': 5}, {'cat': 'blue', 'a': 1, 'b': 3})
Related
I've been programming in Python for quite a while now. I've always wondered, is there a way to remove an item from a dictionary and return the newly created dictionary? Basically removing an item from a dict in a functional way.
As far as I know, there are only the del dict[item] and dict.pop(item) methods, however both modify data and don't return the new dict.
There is no built-in way for dicts, you have to do it yourself. Something to the effect of:
>>> data = dict(a=1,b=2,c=3)
>>> data
{'a': 1, 'b': 2, 'c': 3}
>>> {k:v for k,v in data.items() if k != item}
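The comprehension above can be wrapped in a small helper. This is only a sketch; without_key is a name invented here, not a built-in:

```python
def without_key(d, key):
    """Return a new dict with `key` left out; `d` itself is not modified."""
    return {k: v for k, v in d.items() if k != key}


data = dict(a=1, b=2, c=3)
smaller = without_key(data, 'a')
print(smaller)  # {'b': 2, 'c': 3}
print(data)     # {'a': 1, 'b': 2, 'c': 3}
```

Asking for a key that is not present simply returns an equal copy rather than raising KeyError.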
Note, Python 3.9 did add a | operator for dicts to create a new, merged dict:
>>> data
{'a': 1, 'b': 2, 'c': 3}
>>> more_data = {"b":4, "c":5, "d":6}
Then
>>> data | more_data
{'a': 1, 'b': 4, 'c': 5, 'd': 6}
So, similar to + for list concatenation. Previously, you could have done something like:
>>> {**data, **more_data}
{'a': 1, 'b': 4, 'c': 5, 'd': 6}
Note, set objects support operators to create new sets, providing operators for various basic set operations:
>>> s1 = {'a','b','c'}
>>> s2 = {'b','c','d'}
>>> s1 & s2 # set intersection
{'b', 'c'}
>>> s1 | s2 # set union
{'c', 'a', 'b', 'd'}
>>> s1 - s2 # set difference
{'a'}
>>> s1 ^ s2 # symmetric difference
{'a', 'd'}
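In Python 3, dict key views support the same set operators, which gives one possible way to drop several keys at once without touching the original. A rough sketch; the unwanted and kept names are just for illustration:

```python
data = {'a': 1, 'b': 2, 'c': 3}
unwanted = {'a', 'c'}

# data.keys() - unwanted is a plain set of the keys to keep.
kept = {k: data[k] for k in data.keys() - unwanted}
```

Because the key difference is a set, the order of the surviving keys is not guaranteed to match the original dictionary.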
This comes down to API design choices.
One solution is to use dict.copy() to create a copy, then perform your operations on the copy.
For example:
initial_dict = {"a": 1, "b": 2}
dict_copy = initial_dict.copy()
# Then you can do your item operations
del dict_copy["a"]
# or
dict_copy.pop("b")
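The copy-then-delete approach could also be packaged as a function that returns the new dictionary. This is a sketch with a hypothetical name (dropped); note that dict.copy() is a shallow copy, so nested values are shared with the original:

```python
def dropped(d, key):
    """Return a shallow copy of `d` without `key`."""
    new = d.copy()
    new.pop(key, None)  # the None default avoids KeyError for a missing key
    return new


initial_dict = {"a": 1, "b": [1, 2]}
result = dropped(initial_dict, "a")
print(result)        # {'b': [1, 2]}
print(initial_dict)  # {'a': 1, 'b': [1, 2]}
```

Since the copy is shallow, result["b"] and initial_dict["b"] are the same list object; use copy.deepcopy() if that matters.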
I want to write a code which takes the following inputs:
list (list of maps)
request_keys (list of strings)
operation (add, subtract, multiply, concat)
The code would look through the list for maps that have the same value for all keys except the keys given in request_keys. Upon finding two maps for which the values of the search keys match, the code would do the operation (add, multiply, subtract, concat) on the two maps and combine them into one map. This combined map would basically replace the other two maps.
I have written the following piece of code to do this. The code only does the add operation. It can be extended to handle the other operations.
In [83]: list
Out[83]:
[{'a': 2, 'b': 3, 'c': 10},
{'a': 2, 'b': 3, 'c': 3},
{'a': 2, 'b': 4, 'c': 4},
{'a': 2, 'b': 3, 'c': 2},
{'a': 2, 'b': 3, 'c': 3}]
In [84]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:def func(list,request_keys):
:    new_list = []
:    found_indexes = []
:    for i in range(0,len(list)):
:        new_item = list[i]
:        if i in found_indexes:
:            continue
:        for j in range(0,len(list)):
:            if i != j and {k: v for k,v in list[i].iteritems() if k not in request_keys} == {k: v for k,v in list[j].iteritems() if k not in request_keys}:
:                found_indexes.append(j)
:                for request_key in request_keys:
:                    new_item[request_key] += list[j][request_key]
:        new_list.append(new_item)
:    return new_list
:--
In [85]: func(list,['c'])
Out[85]: [{'a': 2, 'b': 3, 'c': 18}, {'a': 2, 'b': 4, 'c': 4}]
In [86]:
What I want to know is: is there a faster, more memory-efficient, cleaner, and more Pythonic way of doing the same?
Thank you
You manually generate all the combinations and then compare each of those combinations. This is pretty wasteful. Instead, I suggest grouping the dictionaries in another dictionary by their matching keys, then adding the "same" dictionaries. Also, you forgot the operator parameter.
import collections, operator, functools
def func(lst, request_keys, op=operator.add):
    matching_dicts = collections.defaultdict(list)
    for d in lst:
        key = tuple(sorted(((k, d[k]) for k in d if k not in request_keys)))
        matching_dicts[key].append(d)
    for group in matching_dicts.values():
        merged = dict(group[0])
        merged.update({key: functools.reduce(op, (g[key] for g in group))
                       for key in request_keys})
        yield merged
What this does: First, it creates a dictionary that maps the key-value pairs that must be equal for two dictionaries to match to all the dictionaries that have those key-value pairs. Then it iterates over those groups, using the first dict of each group as a prototype and updating it with the sum (or product, or whatever, depending on the operator) of all the dicts in that group for the request_keys.
Note that this returns a generator. If you want a list, just call it like list(func(...)), or accumulate the merged dicts in a list and return that list.
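To illustrate the op parameter, here is a self-contained sketch that repeats the function above and calls it with a few different operators. The rows and words sample data and the result names are made up for this example:

```python
import collections
import functools
import operator


def func(lst, request_keys, op=operator.add):
    # Group dicts by the key-value pairs that must match.
    matching_dicts = collections.defaultdict(list)
    for d in lst:
        key = tuple(sorted((k, d[k]) for k in d if k not in request_keys))
        matching_dicts[key].append(d)
    # Merge each group by reducing the requested keys with `op`.
    for group in matching_dicts.values():
        merged = dict(group[0])
        merged.update({key: functools.reduce(op, (g[key] for g in group))
                       for key in request_keys})
        yield merged


rows = [{'a': 2, 'b': 3, 'c': 10},
        {'a': 2, 'b': 3, 'c': 3},
        {'a': 2, 'b': 4, 'c': 4},
        {'a': 2, 'b': 3, 'c': 2},
        {'a': 2, 'b': 3, 'c': 3}]

added = list(func(rows, ['c']))                         # 10 + 3 + 2 + 3
multiplied = list(func(rows, ['c'], op=operator.mul))   # 10 * 3 * 2 * 3
subtracted = list(func(rows, ['c'], op=operator.sub))   # 10 - 3 - 2 - 3

# operator.concat needs sequences, so this uses string values.
words = [{'a': 1, 'c': 'x'}, {'a': 1, 'c': 'y'}]
joined = list(func(words, ['c'], op=operator.concat))
```

For non-commutative operators such as subtraction, the result depends on the order in which matching dicts appear in the input.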
from itertools import groupby
def mergeDic(inputData, request_keys):
    keys = inputData[0].keys()
    comparedKeys = [item for item in keys if item not in request_keys]
    # Always build a tuple, even when only one key is compared
    # (itemgetter would return a bare value in that case).
    grouper = lambda d: tuple(d[k] for k in comparedKeys)
    result = []
    for key, grp in groupby(sorted(inputData, key = grouper), grouper):
        # groupby yields a one-shot iterator; keep the items so that
        # every request key can be summed, not just the first one.
        grp = list(grp)
        temp_dict = dict(zip(comparedKeys, key))
        for request_key in request_keys:
            temp_dict[request_key] = sum(item[request_key] for item in grp)
        result.append(temp_dict)
    return result
inputData = [{'a': 2, 'b': 3, 'c': 10},
{'a': 2, 'b': 3, 'c': 3},
{'a': 2, 'b': 4, 'c': 4},
{'a': 2, 'b': 3, 'c': 2},
{'a': 2, 'b': 3, 'c': 3}]
from pprint import pprint
pprint(mergeDic(inputData,['c']))
I just picked up Python not too long ago.
An example is below.
I have a list of dictionaries:
myword = [{'a': 2},{'b':3},{'c':4},{'a':1}]
I need to change it to the output below:
[{'a':3} , {'b':3} , {'c':4}]
Is there a way I can add the values together? I tried using Counter, but it prints each dict out separately.
What I did using Counter:
for i in range(0,4,1):
    text = myword[i]
    print(Counter(text))
The output
Counter({'a': 2})
Counter({'b': 3})
Counter({'c': 4})
Counter({'a': 1})
I have read the link below, but it only compares two dicts.
Is there a better way to compare dictionary values
Thanks!
Merge dictionaries into one dictionary (Counter), and split them.
>>> from collections import Counter
>>> myword = [{'a': 2}, {'b':3}, {'c':4}, {'a':1}]
>>> c = Counter()
>>> for d in myword:
...     c.update(d)
...
>>> [{key: value} for key, value in c.items()]
[{'a': 3}, {'c': 4}, {'b': 3}]
>>> [{key: value} for key, value in sorted(c.items())]
[{'a': 3}, {'b': 3}, {'c': 4}]
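The same merge can probably also be written without an explicit loop by summing Counter objects. A small sketch, with total and result as illustrative names; note that Counter addition drops keys whose total is zero or negative, which update() does not:

```python
from collections import Counter

myword = [{'a': 2}, {'b': 3}, {'c': 4}, {'a': 1}]

# Start from an empty Counter and add one Counter per dictionary.
total = sum(map(Counter, myword), Counter())

result = [{key: value} for key, value in sorted(total.items())]
print(result)  # [{'a': 3}, {'b': 3}, {'c': 4}]
```

For long lists the update() loop is likely the better choice, since each + builds a new Counter.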
I would like to add together the values from a dictionary in Python, if their keys begin with the same letter.
For example, if I have this dictionary: {'apples': 3, 'oranges': 5, 'grapes': 4, 'apricots': 2, 'grapefruit': 9}
The result would be: {'A': 5,'G': 13, 'O': 5}
I only got this far and I'm stuck:
for k in dic.keys():
    if k.startswith('A'):
Any help will be appreciated
Take the first character of each key, call .upper() on that and sum your values by that uppercased letter. The following loop
out = {}
for key, value in original.items():
    out[key[0].upper()] = out.get(key[0].upper(), 0) + value
should do it.
You can also use a collections.defaultdict() object to simplify that a little:
from collections import defaultdict
out = defaultdict(int)
for key, value in original.items():
    out[key[0].upper()] += value
or you could use itertools.groupby():
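from itertools import groupby
# groupby only merges adjacent items, so sort by the same key first
first_letter = lambda i: i[0][0].upper()
out = {letter: sum(v for k, v in group) for letter, group in groupby(sorted(original.items(), key=first_letter), key=first_letter)}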
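The sorted() call in the groupby() version matters, because groupby() only groups consecutive items with equal keys. The following sketch shows the difference; first_letter, unsorted_letters and sorted_letters are names chosen for this example, and the unsorted result assumes Python 3.7+ where dictionaries keep insertion order:

```python
from itertools import groupby

original = {'apples': 3, 'oranges': 5, 'grapes': 4, 'apricots': 2, 'grapefruit': 9}


def first_letter(pair):
    # pair is a (key, value) tuple from original.items()
    return pair[0][0].upper()


# Without sorting, 'A' and 'G' each show up as two separate groups.
unsorted_letters = [letter for letter, _ in groupby(original.items(), key=first_letter)]
print(unsorted_letters)  # ['A', 'O', 'G', 'A', 'G']

# After sorting by the same key, every letter forms exactly one group.
pairs = sorted(original.items(), key=first_letter)
sorted_letters = [letter for letter, _ in groupby(pairs, key=first_letter)]
print(sorted_letters)  # ['A', 'G', 'O']

out = {letter: sum(v for _, v in group)
       for letter, group in groupby(pairs, key=first_letter)}
print(out)  # {'A': 5, 'G': 13, 'O': 5}
```

Building the dictionary from the unsorted groups would silently keep only the last group for each letter.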
You can use a defaultdict here:
from collections import defaultdict
new_d = defaultdict(int)
for k, v in d.items():
    new_d[k[0].upper()] += v
print(new_d)
Prints:
defaultdict(<class 'int'>, {'A': 5, 'O': 5, 'G': 13})
Lots of ways to do this. Here's a variant using Counter that nobody else has suggested and unlike Ashwini's solution it doesn't create potentially long intermediate strings:
>>> from collections import Counter
>>> dic = {'apples': 3, 'oranges': 5, 'grapes': 4, 'apricots': 2, 'grapefruit': 9}
>>> sum((Counter({k[0].upper():dic[k]}) for k in dic), Counter())
Counter({'G': 13, 'A': 5, 'O': 5})
I need a little bit of homework help. I have to write a function that combines several dictionaries into a new dictionary. If a key appears more than once, the value corresponding to that key in the new dictionary should be a list of unique values. As an example, this is what I have so far:
f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
def merge(*d):
    newdicts={}
    for dict in d:
        for k in dict.items():
            if k[0] in newdicts:
                newdicts[k[0]].append(k[1])
            else:
                newdicts[k[0]]=[k[1]]
    return newdicts
combined = merge(f, g, h, r)
print(combined)
The output looks like:
{'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'bat', 'boy'], 'e': ['elephant'], 'd': ['dog', 'deer']}
Under the 'b' key, 'bat' appears twice. How do I remove the duplicates?
I've looked at filter and lambda, but I couldn't figure out how to use them (maybe because it's a list in a dictionary?)
Any help would be appreciated. And thank you in advance for all your help!
Just test for the element inside the list before adding it:
for k in dict.items():
    if k[0] in newdicts:
        if k[1] not in newdicts[k[0]]:   # Do this test before adding.
            newdicts[k[0]].append(k[1])
    else:
        newdicts[k[0]]=[k[1]]
And since you want just unique elements in the value list, you can use a set as the value instead. Also, you can use a defaultdict here, so that you don't have to test for key existence before adding.
Also, don't use built-in names as your variable names. Instead of dict, use some other variable name.
So, you can modify your merge method as:
from collections import defaultdict
def merge(*d):
    newdicts = defaultdict(set)  # Define a defaultdict
    for each_dict in d:
        # dict.items() yields (k, v) tuples.
        # So, you can directly unpack each tuple into two loop variables.
        for k, v in each_dict.items():
            newdicts[k].add(v)
    # And if you want the exact representation that you have shown,
    # you can build a normal dict out of your newly built dict.
    unique = {key: list(value) for key, value in newdicts.items()}
    return unique
>>> import collections
>>> import itertools
>>> uniques = collections.defaultdict(set)
>>> for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
...     uniques[k].add(v)
...
>>> uniques
defaultdict(<type 'set'>, {'a': set(['apple', 'adam']), 'c': set(['car', 'cat']), 'b': set(['boy', 'bat']), 'e': set(['elephant']), 'd': set(['deer', 'dog'])})
Note the results are in a set, not a list -- far more computationally efficient this way. If you would like the final form to be lists then you can do the following:
>>> {x: list(y) for x, y in uniques.items()}
{'a': ['apple', 'adam'], 'c': ['car', 'cat'], 'b': ['boy', 'bat'], 'e': ['elephant'], 'd': ['deer', 'dog']}
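One side effect of using sets is that the order in which values were first seen is lost. If that order matters, a dictionary can stand in as an ordered set. The following is only a sketch, assuming Python 3.7+ (where dicts preserve insertion order) and hashable values; merge_ordered is a hypothetical name:

```python
def merge_ordered(*dicts):
    merged = {}
    for each_dict in dicts:
        for key, value in each_dict.items():
            # The inner dict acts as an ordered set: repeated values
            # overwrite the same key and keep their first position.
            merged.setdefault(key, {})[value] = None
    return {key: list(values) for key, values in merged.items()}


f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}

combined = merge_ordered(f, g, h, r)
print(combined['b'])  # ['bat', 'boy']
```

Membership checks stay cheap, as with a set, while the lists come out in first-seen order.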
In your for loop add this:
for dict in d:
    for k in dict.items():
        if k[0] in newdicts:
            # This line below
            if k[1] not in newdicts[k[0]]:
                newdicts[k[0]].append(k[1])
        else:
            newdicts[k[0]]=[k[1]]
This makes sure duplicates aren't added.
Use set when you want unique elements:
def merge_dicts(*d):
    result={}
    # "mapping" avoids shadowing the built-in dict
    for mapping in d:
        for key, value in mapping.items():
            result.setdefault(key, set()).add(value)
    return result
Try to avoid using indices; unpack tuples instead.