Creating multiple dictionaries based on other dictionary values in python - python

I have a list that contains many dictionaries. Each dictionary represents a change that has occurred within my application. The "change" dictionary has the following entries:
userid: The user ID for a user
ctype: A reference to a change type in my application
score: A score
The ctype can be one of about 12 different strings, including "deletion", "new", "edit" and others. Here is an example of one of the "change" dictionaries:
{'userid':2, 'score':10, 'ctype':'edit'}
My question is, how can I create a dictionary that will aggregate all of the change types for each user within this large list of dictionaries? I would like to add the score from each change dictionary to create a total score and add each ctype instance together to get a count of each instance. The goal is to have a list of dictionaries with each dictionary looking like this:
{'userid':2, 'score':325, 'deletion':2, 'new':4, 'edit':9}
I have been trying to work this out but I am pretty new to python and I wasn't sure how to count the actual change types. The other part that gets me is how to refer to a dictionary based on 'userid'. If someone can present an answer I am sure that all of this will become very apparent to me. I appreciate any and all help.

The key to aggregating the data here is to have a dictionary where each key is a userid and each value holds the data relevant to that userid.
final_data = {}
for entry in data:
    userid = entry["userid"]
    if userid not in final_data:
        final_data[userid] = {"userid": userid, "score": 0}
    final_data[userid]["score"] += entry["score"]
    if entry["ctype"] not in final_data[userid]:
        final_data[userid][entry["ctype"]] = 1
    else:
        final_data[userid][entry["ctype"]] += 1
If you want the result as a list of dictionaries, just use list(final_data.values()) (in Python 2, final_data.values() already returns a list).

Could you have something like this?
(A mock-up, not real Python.)
{userid : {score : 1, ctype : ''}}
You can nest dicts as values in Python dictionaries.

It could look like this:
change_types = ['deletion', 'new', 'edit', ...]
user_changes = {}
for change in change_list:
    userid = change['userid']
    if userid not in user_changes:
        aggregate = {}
        aggregate['score'] = 0
        for c in change_types:
            aggregate[c] = 0
        aggregate['userid'] = userid
        user_changes[userid] = aggregate
    else:
        aggregate = user_changes[userid]
    change_type = change['ctype']
    aggregate[change_type] = aggregate[change_type] + 1
    aggregate['score'] = aggregate['score'] + change['score']
Actually, making a class for the aggregates would be a good idea.

To index the dictionaries with respect to userid, you can use a dictionary of dictionaries:
from collections import defaultdict
dict1 = {'userid': 1, 'score': 10, 'ctype': 'edit'}
dict2 = {'userid': 2, 'score': 13, 'ctype': 'other'}
dict3 = {'userid': 1, 'score': 1, 'ctype': 'edit'}
list_of_dicts = [dict1, dict2, dict3]
user_dict = defaultdict(lambda: defaultdict(int))
for d in list_of_dicts:
    userid = d['userid']
    user_dict[userid]['score'] += d['score']
    user_dict[userid][d['ctype']] += 1
# user_dict is now
# defaultdict(<function <lambda> at 0x02A7DF30>,
# {1: defaultdict(<type 'int'>, {'edit': 2, 'score': 11}),
# 2: defaultdict(<type 'int'>, {'score': 13, 'other': 1})})
In the example, I used a defaultdict to avoid checking at every iteration if the key d['ctype'] exists.
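To get from this nested defaultdict to the list of flat dictionaries the question asks for, one possible sketch follows. The function name aggregate_changes and the sample data are made up for illustration, and the ** unpacking inside a dict literal assumes Python 3.5+.

```python
from collections import defaultdict


def aggregate_changes(changes):
    """Sum scores and count change types per user (illustrative helper)."""
    totals = defaultdict(lambda: defaultdict(int))
    for change in changes:
        user = totals[change["userid"]]
        user["score"] += change["score"]
        user[change["ctype"]] += 1
    # Flatten to one plain dict per user, adding the userid back in
    return [{"userid": uid, **counts} for uid, counts in totals.items()]


changes = [
    {"userid": 1, "score": 10, "ctype": "edit"},
    {"userid": 2, "score": 13, "ctype": "other"},
    {"userid": 1, "score": 1, "ctype": "edit"},
]
result = aggregate_changes(changes)
print(result)
```

Change types a user never triggered are simply absent from that user's dictionary; if every dictionary should carry all of the change types, they would need to be initialised to 0 first, as in the earlier answer.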

Related

Is there a way to combine common keys and add values in Python?

Hi there, I've seen various posts on how to combine two dictionaries in Python. However, what I would like to do is combine like keys in the same dictionary and add the values. In truth, what I will do initially is rename the keys into various categories, and then I'd like to combine similar keys and add their values. Is there a simple method to make this possible? At the moment I've opted for iterating and changing keys, but this seems a slow way of doing things. I'm wondering whether or not to iterate over the transaction descriptions first and create a new list, then zip. Any ideas appreciated, thanks.
Here is my modest start:
import pandas as pd
df = pd.read_csv("spreadsheet.csv")
item = df['Transaction Description']
cost = df['Debit Amount']
purchases = {}
for key, value in zip(item, cost):
    purchases[key] = value
# iterate over a copy of the keys, because the dict changes inside the loop
for key in list(purchases.keys()):
    if key == "Soverign housing":
        purchases["Rent"] = purchases["Soverign housing"]
        del purchases["Soverign housing"]
# ...etc
The simplest approach is to use the in operator to check whether the combined dictionary already contains each key from the input dictionaries.
In this example I used *args, which collects the positional arguments into a tuple, to make it more flexible.
dic_a = {'A': 20, 'B': 30, 'C': 40}
dic_b = {'A': 5, 'C': 2, 'F': 100}
dic_c = {'H': 32, 'K': 75, 'G': 15}
def combine_dict(*args):
    comb = dict()
    for dic in args:
        for key in dic:
            if key in comb:
                comb[key] += dic[key]
            else:
                comb[key] = dic[key]
    return comb
if __name__ == '__main__':
    print(combine_dict(dic_a, dic_b))
    print(combine_dict(dic_a, dic_b, dic_c))
I don't understand what you mean by condensed values, etc.
@user1717828 is right to ask for input/output code so we can understand the problem clearly.
But as far as I understand, you want to combine two dict objects so that the keys are kept and the values of shared keys are combined, e.g.:
dict_a = {'A':1, 'B': 2, 'C':3}
dict_b = {'A':10, 'C': 20, 'D':30}
for key, val in dict_b.items():
    try:
        dict_a_val = dict_a[key]  # key already exists in dict_a
    except KeyError:  # key does not exist yet
        dict_a[key] = val
    else:  # key exists: store the old and new values in a list
        dict_a[key] = [dict_a_val, val]
# output:
# {'A': [1, 10], 'B': 2, 'C': [3, 20], 'D': 30}
Hope this is what you were looking for.
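For the original goal of renaming descriptions into categories and summing the amounts, a minimal sketch could look like the following. The mapping CATEGORY_BY_DESCRIPTION, the function total_by_category and the sample transactions are all hypothetical; in the question the pairs would come from zip(item, cost). Summing while iterating also avoids the problem that purchases[key] = value silently overwrites repeated descriptions.

```python
from collections import defaultdict

# Hypothetical mapping from raw transaction descriptions to categories
CATEGORY_BY_DESCRIPTION = {
    "Soverign housing": "Rent",
    "Tesco": "Groceries",
    "Aldi": "Groceries",
}


def total_by_category(transactions, categories):
    """Sum amounts per category; unmapped descriptions keep their own name."""
    totals = defaultdict(float)
    for description, amount in transactions:
        totals[categories.get(description, description)] += amount
    return dict(totals)


# Made-up sample data standing in for zip(item, cost)
transactions = [
    ("Soverign housing", 500.0),
    ("Tesco", 20.5),
    ("Aldi", 9.5),
    ("Tesco", 10.0),
    ("Cinema", 12.0),
]
totals = total_by_category(transactions, CATEGORY_BY_DESCRIPTION)
print(totals)
```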

Adding a key, value to a dict, but I didn't tell it to?

I'm working on some game code in python 3.7. I'm using a lot of dictionaries to store various data, and at one point I needed a new dictionary to store certain values from one dictionary as keys in other dictionaries. When I ran the code, I ended up with a value in my new dictionary that I can't figure out why it got in there.
Example:
data = {'value1': 4, 'value2': 'hello'}
target = {'value1': 3}
def fillTarget(info, times):
    for key, value in info.items():
        if key == 'value2':
            target[key] = value
            target[value] = times
        else:
            for _ in range(times):
                target[key] = target[key] + info[key]
fillTarget(data, 4)
When I print(target), I end up with:
{'value1': 19, 'value2': 'hello', 'hello': 4}
Why am I getting 'value2' as a key, value in (target)? Isn't line 6-8 telling it to add the value as a key, but then the else: should not transfer the key to (target)?
You are adding two items to the dictionary:
target[key] = value  # {'value2': 'hello'}
# and
target[value] = times  # {'hello': 4}
# where key = 'value2', value = 'hello' and times = 4

How can I add the value if there are duplicate record in a list of dictionaries?

I've tried other solutions but still have no luck, My problem is that I have a list of dictionaries in which I have to check if there are any duplicate value in the key (name of the person):
Sample list:
[{"id": 1,"name": "jack","count": 7},{"id": 12,"name": "jack","count": 5}]
If there are duplicate names, It should add the value in the key count, and the result should be:
[{"id": 1,"name": "jack","count": 12}]
Edited: ID's don't matter, I need at least one id to appear.
A detailed solution could be:
new = {}
for d in data:
    name = d["name"]
    if name in new:
        new[name]["count"] += d["count"]
    else:
        new[name] = dict(d)
result = list(new.values())
NB: this could be simplified with a list comprehension and the get method, but I think this version is more readable.
As the id field is not so important, I will create a dict with the name as key and a list of the matching items as value:
from collections import defaultdict
a = [{"id": 1,"name": "jack","count": 7},{"id": 12,"name": "jack","count": 5}]
d = defaultdict(list)
Iterate over the list and map keys to values:
for i in a:
    d[i['name']].append(i)  # key is 'name'
At this point d will look like this:
{'jack': [{'count': 7, 'id': 1, 'name': 'jack'}, {'count': 5, 'id': 12, 'name': 'jack'}]}
Now, for each name, iterate over its list, sum the counts into the first item, and replace the list with that item (a list with a single item is simply unwrapped):
for k, v in d.items():
    temp = v[0]
    for t in v[1:]:
        temp['count'] = temp['count'] + t['count']  # update the count
    d[k] = temp
print(list(d.values()))  # [{'id': 1, 'name': 'jack', 'count': 12}]
In order to handle the case where count is missing, as in
[{"id": 1,"name": "jack"},{"id": 12,"name": "jack","count": 5}]
replace the count update logic above with
temp['count'] = temp.get('count', 0) + t.get('count', 0)

Correct way of creating list of dictionaries and sorting by value in dictionaries

I'm trying to loop through a query set and create a dictionary for every item, and then add every dictionary to a list. I need to check if the dictionary exists and then update the value if it does. My problem is that I don't know how to sort the list by the amount value.
Maybe this is not the best way of creating dicts?
Here is my code:
# Create list
my_list_of_dicts = []
# Create dict object
my_dict = {}
# Find every user in query_set
# query_set is a list of objects from a Django query: <QuerySet [<User: name123>, <User: name123>, <User: name456>, <User: name789>,]>
for item in query_set:
    if item.name in my_dict:
        # Update object
        my_dict[item.name]['amount'] += item.amount
    else:
        # Create object
        my_dict[item.name] = {'amount': item.amount, 'not-important': item.foo}
# Add dict to list
my_list_of_dicts.append(my_dict)
print(my_list_of_dicts)
>>> [{"name123": {"amount": 8, 'not-important': 'foo123'}, "name456": {"amount": 3, 'not-important': 'foo456'}, "name789": {"amount": 20, 'not-important': 'foo789'}}]
Suppose dict_ is the dictionary inside the list (here my_list_of_dicts[0]); you can do this:
output = sorted(dict_.items(), key=lambda x: x[1]['amount'])
Output (other keys omitted):
[('name456', {'amount': 3}), ('name123', {'amount': 8}), ('name789', {'amount': 20})]
Why do you want to add your dictionary to a list when each entry holds just one value? That might just complicate your code. Try this and see if it works for you:
# Create dictionary here
my_dict = {}
# Query from db
for item in query_set:
    if item.name in my_dict:  # If the item already exists in the dictionary
        my_dict[item.name] += item.amount  # just update the amount
    else:  # If the item does not exist in the dictionary, create it
        my_dict[item.name] = item.amount
>>> print(my_dict)
{'name123': 8, 'name456': 3, 'name789': 20}
>>> my_dict["name123"]
8
And you can always loop over your dictionary like this:
>>> for key in my_dict:
...     print(key, ":", my_dict[key])
...
name123 : 8
name456 : 3
name789 : 20
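Neither snippet above ends with a list of per-user dictionaries sorted by amount, which is what the question asked for. A rough sketch of that last step is below; the User namedtuple is only a stand-in for the Django model instances, and the sample amounts are invented so that the totals match the ones shown above.

```python
from collections import namedtuple

# Stand-in for the Django objects in the question (hypothetical)
User = namedtuple("User", ["name", "amount"])

query_set = [
    User("name123", 5),
    User("name123", 3),
    User("name456", 3),
    User("name789", 20),
]

# Aggregate the amount per name
totals = {}
for item in query_set:
    totals[item.name] = totals.get(item.name, 0) + item.amount

# One dictionary per user, sorted by amount, largest first
ranked = sorted(
    ({"name": name, "amount": amount} for name, amount in totals.items()),
    key=lambda row: row["amount"],
    reverse=True,
)
print(ranked)
```

Drop reverse=True for ascending order. With a real QuerySet, the aggregation and ordering could probably also be pushed into the database query, but that depends on the model and is not shown here.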

How can I add new keys to a dictionary?

How do I add a key to an existing dictionary? It doesn't have an .add() method.
You create a new key/value pair on a dictionary by assigning a value to that key
d = {'key': 'value'}
print(d) # {'key': 'value'}
d['mynewkey'] = 'mynewvalue'
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
If the key doesn't exist, it's added and points to that value. If it exists, the current value it points to is overwritten.
I feel like consolidating info about Python dictionaries:
Creating an empty dictionary
data = {}
# OR
data = dict()
Creating a dictionary with initial values
data = {'a': 1, 'b': 2, 'c': 3}
# OR
data = dict(a=1, b=2, c=3)
# OR
data = {k: v for k, v in (('a', 1), ('b',2), ('c',3))}
Inserting/Updating a single value
data['a'] = 1 # Updates if 'a' exists, else adds 'a'
# OR
data.update({'a': 1})
# OR
data.update(dict(a=1))
# OR
data.update(a=1)
Inserting/Updating multiple values
data.update({'c':3,'d':4}) # Updates 'c' and adds 'd'
Python 3.9+:
The update operator |= now works for dictionaries:
data |= {'c':3,'d':4}
Creating a merged dictionary without modifying originals
data3 = {}
data3.update(data) # Modifies data3, not data
data3.update(data2) # Modifies data3, not data2
Python 3.5+:
This uses a new feature called dictionary unpacking.
data = {**data1, **data2, **data3}
Python 3.9+:
The merge operator | now works for dictionaries:
data = data1 | {'c':3,'d':4}
Deleting items in dictionary
del data[key] # Removes specific element in a dictionary
data.pop(key) # Removes the key & returns the value
data.clear() # Clears entire dictionary
Check if a key is already in dictionary
key in data
Iterate through pairs in a dictionary
for key in data: # Iterates just through the keys, ignoring the values
for key, value in data.items(): # Iterates through the pairs
for key in data.keys(): # Iterates just through key, ignoring the values
for value in data.values(): # Iterates just through value, ignoring the keys
Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))
To add multiple keys simultaneously, use dict.update():
>>> x = {1:2}
>>> print(x)
{1: 2}
>>> d = {3:4, 5:6, 7:8}
>>> x.update(d)
>>> print(x)
{1: 2, 3: 4, 5: 6, 7: 8}
For adding a single key, the accepted answer has less computational overhead.
"Is it possible to add a key to a Python dictionary after it has been created? It doesn't seem to have an .add() method."
Yes it is possible, and it does have a method that implements this, but you don't want to use it directly.
To demonstrate how and how not to use it, let's create an empty dict with the dict literal, {}:
my_dict = {}
Best Practice 1: Subscript notation
To update this dict with a single new key and value, you can use the subscript notation (see Mappings here) that provides for item assignment:
my_dict['new key'] = 'new value'
my_dict is now:
{'new key': 'new value'}
Best Practice 2: The update method - 2 ways
We can also update the dict with multiple values efficiently as well using the update method. We may be unnecessarily creating an extra dict here, so we hope our dict has already been created and came from or was used for another purpose:
my_dict.update({'key 2': 'value 2', 'key 3': 'value 3'})
my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value'}
Another efficient way of doing this with the update method is with keyword arguments, but since they have to be legitimate python words, you can't have spaces or special symbols or start the name with a number, but many consider this a more readable way to create keys for a dict, and here we certainly avoid creating an extra unnecessary dict:
my_dict.update(foo='bar', foo2='baz')
and my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value',
'foo': 'bar', 'foo2': 'baz'}
So now we have covered three Pythonic ways of updating a dict.
Magic method, __setitem__, and why it should be avoided
There's another way of updating a dict that you shouldn't use, which calls the __setitem__ method directly. Here's an example of how one might use __setitem__ to add a key-value pair to a dict, and a demonstration of its poorer performance (the session below was recorded in Python 2; in Python 3, use range instead of xrange):
>>> d = {}
>>> d.__setitem__('foo', 'bar')
>>> d
{'foo': 'bar'}
>>> def f():
...     d = {}
...     for i in xrange(100):
...         d['foo'] = i
...
>>> def g():
...     d = {}
...     for i in xrange(100):
...         d.__setitem__('foo', i)
...
>>> import timeit
>>> number = 100
>>> min(timeit.repeat(f, number=number))
0.0020880699157714844
>>> min(timeit.repeat(g, number=number))
0.005071878433227539
So we see that using the subscript notation is actually much faster than using __setitem__. Doing the Pythonic thing, that is, using the language in the way it was intended to be used, usually is both more readable and computationally efficient.
dictionary[key] = value
If you want to add a dictionary within a dictionary you can do it this way.
Example: Add a new entry to your dictionary & sub dictionary
dictionary = {}
dictionary["new key"] = "some new entry" # add new dictionary entry
dictionary["dictionary_within_a_dictionary"] = {} # this is required by python
dictionary["dictionary_within_a_dictionary"]["sub_dict"] = {"other" : "dictionary"}
print (dictionary)
Output:
{'new key': 'some new entry', 'dictionary_within_a_dictionary': {'sub_dict': {'other': 'dictionary'}}}
NOTE: Python requires that you first create the sub-dictionary with
dictionary["dictionary_within_a_dictionary"] = {}
before adding entries to it.
The conventional syntax is d[key] = value, but if your keyboard is missing the square bracket keys you could also do:
d.__setitem__(key, value)
In fact, defining __getitem__ and __setitem__ methods is how you can make your own class support the square bracket syntax. See Dive Into Python, Classes That Act Like Dictionaries.
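As a small illustration of that last point, here is a sketch of a class that supports square brackets without subclassing dict. The class CaseInsensitiveStore is invented for this example and is not taken from Dive Into Python; it assumes all keys are strings.

```python
class CaseInsensitiveStore:
    """Hypothetical key/value store whose string keys ignore case."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, value):
        # Called for: store[key] = value
        self._items[key.lower()] = value

    def __getitem__(self, key):
        # Called for: store[key]
        return self._items[key.lower()]

    def __contains__(self, key):
        # Called for: key in store
        return key.lower() in self._items


store = CaseInsensitiveStore()
store["Apples"] = 6
store["APPLES"] = 7  # same key as "Apples", so this overwrites the 6
print(store["apples"])
```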
You can create one:
class myDict(dict):
    def __init__(self):
        super().__init__()  # initialise the underlying dict
    def add(self, key, value):
        self[key] = value
## example
myd = myDict()
myd.add('apples',6)
myd.add('bananas',3)
print(myd)
Gives:
{'apples': 6, 'bananas': 3}
This popular question addresses functional methods of merging dictionaries a and b.
Here are some of the more straightforward methods (tested in Python 3)...
c = dict( a, **b ) ## see also https://stackoverflow.com/q/2255878
c = dict( list(a.items()) + list(b.items()) )
c = dict( i for d in [a,b] for i in d.items() )
Note: The first method above only works if the keys in b are strings.
To add or modify a single element, the b dictionary would contain only that one element...
c = dict( a, **{'d':'dog'} ) ## returns a dictionary based on 'a'
This is equivalent to...
def functional_dict_add( dictionary, key, value ):
    temp = dictionary.copy()
    temp[key] = value
    return temp
c = functional_dict_add( a, 'd', 'dog' )
Let's pretend you want to live in the immutable world and do not want to modify the original but want to create a new dict that is the result of adding a new key to the original.
In Python 3.5+ you can do:
params = {'a': 1, 'b': 2}
new_params = {**params, **{'c': 3}}
The Python 2 equivalent is:
params = {'a': 1, 'b': 2}
new_params = dict(params, **{'c': 3})
After either of these:
params is still equal to {'a': 1, 'b': 2}
and
new_params is equal to {'a': 1, 'b': 2, 'c': 3}
There will be times when you don't want to modify the original (you only want the result of adding to the original). I find this a refreshing alternative to the following:
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params['c'] = 3
or
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params.update({'c': 3})
Reference: What does `**` mean in the expression `dict(d1, **d2)`?
There is also the strangely named, oddly behaved, and yet still handy dict.setdefault().
This
value = my_dict.setdefault(key, default)
basically just does this:
try:
    value = my_dict[key]
except KeyError: # key not found
    value = my_dict[key] = default
E.g.,
>>> mydict = {'a':1, 'b':2, 'c':3}
>>> mydict.setdefault('d', 4)
4 # returns new value at mydict['d']
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # a new key/value pair was indeed added
# but see what happens when trying it on an existing key...
>>> mydict.setdefault('a', 111)
1 # old value was returned
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # existing key was ignored
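A common use of this behaviour is grouping values under a key without first checking whether the key exists. The sample pairs below are made up for illustration:

```python
pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]

grouped = {}
for category, name in pairs:
    # setdefault stores an empty list the first time a category is seen
    # and returns the stored list either way, so append always works
    grouped.setdefault(category, []).append(name)

print(grouped)
```

Note that the default argument (here a new empty list) is evaluated on every call, even when the key already exists; collections.defaultdict avoids that by only calling its factory for missing keys.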
This question has already been answered ad nauseam, but since my comment gained a lot of traction, here it is as an answer:
Adding new keys without updating the existing dict
If you are here trying to figure out how to add a key and return a new dictionary (without modifying the existing one), you can do this using the techniques below
Python >= 3.5
new_dict = {**mydict, 'new_key': new_val}
Python < 3.5
new_dict = dict(mydict, new_key=new_val)
Note that with this approach, your key will need to follow the rules of valid identifier names in Python.
If you're not joining two dictionaries, but adding new key-value pairs to a dictionary, then using the subscript notation seems like the best way.
import timeit
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary.update({"aaa": 123123, "asd": 233})')
>> 0.49582505226135254
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary["aaa"] = 123123; dictionary["asd"] = 233;')
>> 0.20782899856567383
However, if you'd like to add, for example, thousands of new key-value pairs, you should consider using the update() method.
Here's another way that I didn't see here:
>>> foo = dict(a=1,b=2)
>>> foo
{'a': 1, 'b': 2}
>>> goo = dict(c=3,**foo)
>>> goo
{'c': 3, 'a': 1, 'b': 2}
You can use the dictionary constructor and implicit expansion to reconstruct a dictionary. Moreover, interestingly, this method can be used to control the positional order during dictionary construction (post Python 3.6). In fact, insertion order is guaranteed for Python 3.7 and above!
>>> foo = dict(a=1,b=2,c=3,d=4)
>>> new_dict = {k: v for k, v in list(foo.items())[:2]}
>>> new_dict
{'a': 1, 'b': 2}
>>> new_dict.update(newvalue=99)
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99}
>>> new_dict.update({k: v for k, v in list(foo.items())[2:]})
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99, 'c': 3, 'd': 4}
>>>
The above is using dictionary comprehension.
First to check whether the key already exists:
a={1:2,3:4}
a.get(1)
2
a.get(5)
None
Then you can add the new key and value.
Add a dictionary (key, value) class:
class myDict(dict):
    def __init__(self):
        super().__init__()  # initialise the underlying dict
    def add(self, key, value):
        # self[key] = value  # would add the key and value, overwriting any existing same key
        if self.get(key) is not None:
            print('key', key, 'already used')  # report if key already used
        self.setdefault(key, value)  # if the key exists, do nothing
## example
myd = myDict()
name = "fred"
myd.add('apples',6)
print('\n', myd)
myd.add('bananas',3)
print('\n', myd)
myd.add('jack', 7)
print('\n', myd)
myd.add(name, myd)
print('\n', myd)
myd.add('apples', 23)
print('\n', myd)
myd.add(name, 2)
print(myd)
I think it would also be useful to point out Python's collections module that consists of many useful dictionary subclasses and wrappers that simplify the addition and modification of data types in a dictionary, specifically defaultdict:
dict subclass that calls a factory function to supply missing values
This is particularly useful if you are working with dictionaries that always consist of the same data types or structures, for example a dictionary of lists.
>>> from collections import defaultdict
>>> example = defaultdict(int)
>>> example['key'] += 1
>>> example
defaultdict(<class 'int'>, {'key': 1})
If the key does not yet exist, defaultdict calls the factory it was given (in our case int, which returns 0) and stores the result as the initial value in the dictionary (often used inside loops). This operation therefore does two things: it adds a new key to a dictionary (as per question), and assigns the value if the key doesn't yet exist. With the standard dictionary, this would have raised an error as the += operation is trying to access a value that doesn't yet exist:
>>> example = dict()
>>> example['key'] += 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'key'
Without the use of defaultdict, the amount of code to add a new element would be much greater and perhaps looks something like:
# This type of code would often be inside a loop
if 'key' not in example:
    example['key'] = 0 # add key and initial value to dict; could also be a list
example['key'] += 1 # this is implementing a counter
defaultdict can also be used with complex data types such as list and set:
>>> example = defaultdict(list)
>>> example['key'].append(1)
>>> example
defaultdict(<class 'list'>, {'key': [1]})
Adding an element automatically initialises the list.
Adding keys to a dictionary without using add
d = {'key': 'value'}
# Inserting/Updating single value
# subscript notation method
d['mynewkey'] = 'mynewvalue' # Updates if 'mynewkey' exists, else adds 'mynewkey'
# OR
d.update({'mynewkey': 'mynewvalue'})
# OR
d.update(dict(mynewkey='mynewvalue'))
# OR
d.update(mynewkey='mynewvalue')
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
# To add/update multiple keys simultaneously, use d.update():
x = {3:4, 5:6, 7:8}
d.update(x)
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue', 3: 4, 5: 6, 7: 8}
# update operator |= works for dictionaries in Python 3.9+:
d |= {'c':3,'d':4}
# Assigning new key value pair using dictionary unpacking.
data1 = {4:6, 9:10, 17:20}
data2 = {20:30, 32:48, 90:100}
data3 = { 38:"value", 99:"notvalid"}
d = {**data1, **data2, **data3}
# The merge operator | works for dictionaries in Python 3.9+:
data = data1 | {'c':3,'d':4}
# Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))
dico["new key"] = "value"
