I have a list of dicts.
dictList = [
{'name': 'some name'},
{'name': 'some other name'},
{'age': 'some age'},
{'last_name': 'some last name'}
]
In that list of dicts each dict has one key and one value for that key, as shown above.
I need to create a dict that has the keys from all the dicts and each value for every key is a set with item values from the list of dicts. In the example, it'd be something like this:
expected_dict = {
'name': ['some name', 'some other name'],
'age': ['some age'],
'last_name': ['some last name']
}
How can I do this in Python?
collections.defaultdict is one way:
from collections import defaultdict
d = defaultdict(list)
dictList = [
{'name': 'some name'},
{'name': 'some other name'},
{'age': 'some age'},
{'last_name': 'some last name'}
]
for i in dictList:
    for k, v in i.items():
        d[k].append(v)
# defaultdict(list,
# {'age': ['some age'],
# 'last_name': ['some last name'],
# 'name': ['some name', 'some other name']})
You can use the dict.setdefault() method.
dictList = [
{'name': 'some name'},
{'name': 'some other name'},
{'age': 'some age'},
{'last_name': 'some last name'}
]
expected_dict = {}
for dictionary in dictList:
    for key, val in dictionary.items():
        expected_dict.setdefault(key, []).append(val)
print(expected_dict)
Output:
{
'name': ['some name', 'some other name'],
'age': ['some age'],
'last_name': ['some last name']
}
Note: Using collections.defaultdict (as shown in this answer) is simpler and faster than using dict.setdefault().
From the documentation:
Working of collections.defaultdict:
When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list. The list.append() operation then attaches the value to the new list. When keys are encountered again, the look-up proceeds normally (returning the list for that key) and the list.append() operation adds another value to the list. This technique is simpler and faster than an equivalent technique using dict.setdefault().
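The question asks for a set of values per key, while the examples above build lists. A minimal sketch of the set variant, assuming the values are hashable; the names `dict_list`, `grouped` and `result` are illustrative, and the duplicate entry is added here only to show the effect:

```python
from collections import defaultdict

dict_list = [
    {'name': 'some name'},
    {'name': 'some other name'},
    {'name': 'some name'},  # duplicate value, added for illustration
    {'age': 'some age'},
]

# default_factory is set, so a missing key starts out as an empty set
grouped = defaultdict(set)
for single in dict_list:
    for key, value in single.items():
        grouped[key].add(value)

# convert to a plain dict so later lookups of missing keys raise KeyError
result = dict(grouped)
print(result)
```

Converting back to a plain dict at the end is optional; it only avoids keys being created silently by later lookups.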
bigD = {}
for element in dictList:
    for key in element:
        if key in bigD:
            bigD[key].append(element[key])
        else:
            bigD[key] = [element[key]]  # start a list so later appends work
You can use itertools.groupby:
import itertools
dictList = [
{'name': 'some name'},
{'name': 'some other name'},
{'age': 'some age'},
{'last_name': 'some last name'}
]
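# groupby only merges adjacent items; sort dictList by key first if equal keys are not consecutive
new_list = {a: [c for [[_, c]] in b] for a, b in itertools.groupby(map(lambda x: list(x.items()), dictList), key=lambda x: x[0][0])}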
Output:
{'age': ['some age'], 'last_name': ['some last name'], 'name': ['some name', 'some other name']}
Very simply.
d = {}
numbers = [1, 2, 3]  # avoid naming variables dict or list, which shadows the builtins
d['numbs'] = numbers
print(d)
Output :
{'numbs': [1, 2, 3]}
I have this list of dicts:
[{'name': 'aly', 'age': '104'},
{'name': 'Not A name', 'age': '99'}]
I want the name value to be the key and the age value to be the value of new dict.
Expected output:
['aly' : '104', 'Not A name': '99']
If you want output to be single dict, you can use dict comprehension:
output = {p["name"]: p["age"] for p in persons}
>>> {'aly': '104', 'Not A name': '99'}
If you want output to be list of dicts, you can use list comprehension:
output = [{p["name"]: p["age"]} for p in persons]
>>> [{'aly': '104'}, {'Not A name': '99'}]
You can initialize the new dict, iterate through the list and add to the new dict:
lst = [{'name': 'aly', 'age': '104'}, {'name': 'Not A name', 'age': '99'}]
newdict = {}
for item in lst:
    newdict[item['name']] = item['age']
This will help you:
d = [
{'name': 'aly', 'age': '104'},
{'name': 'Not A name', 'age': '99'}
]
dict([i.values() for i in d])
# Result
{'aly': '104', 'Not A name': '99'}
# In case if you want a list of dictionary, use this
[dict([i.values() for i in d])]
# Result
[{'aly': '104', 'Not A name': '99'}]
Just a side note:
Your expected answer looks like a list (because of [ ]) but values inside the list are dictionary (key:value) which is invalid.
Here is a simple way to build the new list of dicts:
res = list(map(lambda data: {data['name']: data['age']}, d))
print(res)
How would I insert a key-value pair at a specified location in a python dictionary that was loaded from a YAML document?
For example if a dictionary is:
dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
I wish to insert the element 'Phone':'1234'
before 'Age', and after 'Name' for example. The actual dictionary I shall be working on is quite large (parsed YAML file), so deleting and reinserting might be a bit cumbersome (I don't really know).
If I am given a way of inserting into a specified position in an OrderedDict, that would be okay, too.
On python < 3.7 (or cpython < 3.6), you cannot control the ordering of pairs in a standard dictionary.
If you plan on performing arbitrary insertions often, my suggestion would be to use a list to store keys, and a dict to store values.
mykeys = ['Name', 'Age', 'Class']
mydict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'} # order doesn't matter
k, v = 'Phone', '123-456-7890'
mykeys.insert(mykeys.index('Name')+1, k)
mydict[k] = v
for k in mykeys:
    print(f'{k} => {mydict[k]}')
# Name => Zara
# Phone => 123-456-7890
# Age => 7
# Class => First
If you plan on initialising a dictionary with ordering whose contents are not likely to change, you can use the collections.OrderedDict structure which maintains insertion order.
from collections import OrderedDict
data = [('Name', 'Zara'), ('Phone', '1234'), ('Age', 7), ('Class', 'First')]
odict = OrderedDict(data)
odict
# OrderedDict([('Name', 'Zara'),
# ('Phone', '1234'),
# ('Age', 7),
# ('Class', 'First')])
Note that OrderedDict does not support insertion at arbitrary positions (it only remembers the order in which keys are inserted into the dictionary).
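Although OrderedDict has no positional insert, its move_to_end() method (Python 3.2+) can be used to emulate one in place. The helper below is a rough sketch, not part of the standard library: `insert_after` and its parameter names are made up for illustration.

```python
from collections import OrderedDict

def insert_after(odict, anchor_key, new_key, new_value):
    """Place new_key directly after anchor_key, modifying odict in place."""
    if anchor_key not in odict:
        raise KeyError(anchor_key)
    keys = list(odict)
    # keys that currently follow the anchor and must end up after the new key
    tail = keys[keys.index(anchor_key) + 1:]
    odict[new_key] = new_value
    odict.move_to_end(new_key)  # matters only if new_key already existed
    for key in tail:
        if key != new_key:
            odict.move_to_end(key)
    return odict

od = OrderedDict([('Name', 'Zara'), ('Age', 7), ('Class', 'First')])
insert_after(od, 'Name', 'Phone', '1234')
print(list(od.items()))
```

Each call touches every key after the anchor, so this is linear in the size of the dictionary, much like rebuilding it.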
You will have to initialize your dict as OrderedDict. Create a new empty OrderedDict, go through all keys of the original dictionary and insert before/after when the key name matches.
from pprint import pprint
from collections import OrderedDict
def insert_key_value(a_dict, key, pos_key, value):
    new_dict = OrderedDict()
    for k, v in a_dict.items():
        if k == pos_key:
            new_dict[key] = value  # insert new key
        new_dict[k] = v
    return new_dict
mydict = OrderedDict([('Name', 'Zara'), ('Age', 7), ('Class', 'First')])
my_new_dict = insert_key_value(mydict, "Phone", "Age", "1234")
pprint(my_new_dict)
Had the same issue and solved this as described below without any additional imports being required and only a few lines of code.
Tested with Python 3.6.9.
Get position of key 'Age' because the new key value pair should get inserted before
Get dictionary as list of key value pairs
Insert new key value pair at specific position
Create dictionary from list of key value pairs
mydict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
print(mydict)
# {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
pos = list(mydict.keys()).index('Age')
items = list(mydict.items())
items.insert(pos, ('Phone', '123-456-7890'))
mydict = dict(items)
print(mydict)
# {'Name': 'Zara', 'Phone': '123-456-7890', 'Age': 7, 'Class': 'First'}
Edit 2021-12-20:
Just saw that there is an insert method available in ruamel.yaml; see the example from the project page:
import sys
from ruamel.yaml import YAML
yaml_str = """\
first_name: Art
occupation: Architect # This is an occupation comment
about: Art Vandelay is a fictional character that George invents...
"""
yaml = YAML()
data = yaml.load(yaml_str)
data.insert(1, 'last name', 'Vandelay', comment="new key")
yaml.dump(data, sys.stdout)
This is a follow-up on nurp's answer. Has worked for me, but offered with no warranty.
# Insert dictionary item into a dictionary at specified position:
def insert_item(dic, item={}, pos=None):
    """
    Insert a key, value pair into an ordered dictionary.
    Insert before the specified position.
    """
    from collections import OrderedDict
    d = OrderedDict()
    # abort early if not a dictionary:
    if not item or not isinstance(item, dict):
        print('Aborting. Argument item must be a dictionary.')
        return dic
    # insert anywhere if argument pos not given:
    if not pos:
        dic.update(item)
        return dic
    for item_k, item_v in item.items():
        for k, v in dic.items():
            # insert key at stated position:
            if k == pos:
                d[item_k] = item_v
            d[k] = v
    return d
d = {'A':'letter A', 'C': 'letter C'}
insert_item(['A', 'C'], item={'B'})
## Aborting. Argument item must be a dictionary.
insert_item(d, item={'B': 'letter B'})
## {'A': 'letter A', 'C': 'letter C', 'B': 'letter B'}
insert_item(d, pos='C', item={'B': 'letter B'})
# OrderedDict([('A', 'letter A'), ('B', 'letter B'), ('C', 'letter C')])
Would this be "pythonic"?
def add_item(d, new_pair, old_key):  # insert new_pair (key, value) after old_key
    n = list(d.keys()).index(old_key)
    return {key: d.get(key, new_pair[1]) for key in list(d.keys())[:n+1] + [new_pair[0]] + list(d.keys())[n+1:]}
INPUT: new_pair=('Phone',1234) , old_key='Age'
OUTPUT: {'Name': 'Zara', 'Age': 7, 'Phone': 1234, 'Class': 'First'}
Simple reproducible example (using zip() for unpacking and packing)
### Task - Insert 'Bangladesh':'Dhaka' after 'India' in the capitals dictionary
## Given dictionary
capitals = {'France':'Paris', 'United Kingdom':'London', 'India':'New Delhi',
'United States':'Washington DC','Germany':'Berlin'}
## Step 1 - Separate into 2 lists containing : 1) keys, 2) values
country, cap = (list(tup) for tup in zip(*capitals.items()))
# or
country, cap = list(map(list, zip(*capitals.items())))
print(country)
#> ['France', 'United Kingdom', 'India', 'United States', 'Germany']
print(cap)
#> ['Paris', 'London', 'New Delhi', 'Washington DC', 'Berlin']
## Step 2 - Find index of item before the insertion point (from either of the 2 lists)
req_pos = country.index('India')
print(req_pos)
#> 2
## Step 3 - Insert new entry at specified position in both lists
country.insert(req_pos+1, 'Bangladesh')
cap.insert(req_pos+1, 'Dhaka')
print(country)
#> ['France', 'United Kingdom', 'India', 'Bangladesh', 'United States', 'Germany']
print(cap)
#> ['Paris', 'London', 'New Delhi', 'Dhaka', 'Washington DC', 'Berlin']
## Step 4 - Zip up the 2 lists into a dictionary
capitals = dict(zip(country, cap))
print(capitals)
#> {'France': 'Paris', 'United Kingdom': 'London', 'India': 'New Delhi', 'Bangladesh': 'Dhaka', 'United States': 'Washington DC', 'Germany': 'Berlin'}
Once you have used load() (without the option Loader=RoundTripLoader) and your data is in a dict(), it is too late, as the order that was available in the YAML file is normally gone (the order depends on the actual keys used and on the Python used: implementation, version and possibly compile options).
What you need to do is use round_trip_load():
import sys
from ruamel import yaml
yaml_str = "{'Name': 'Zara', 'Age': 7, 'Class': 'First'}"
data = yaml.round_trip_load(yaml_str)
pos = list(data.keys()).index('Age') # determine position of 'Age'
# insert before position of 'Age'
data.insert(pos, 'Phone', '1234', comment='This is the phone number')
data.fa.set_block_style() # I like block style
yaml.round_trip_dump(data, sys.stdout)
This will invariably give:
Name: Zara
Phone: '1234' # This is the phone number
Age: 7
Class: First
Under the hood, round_trip_load() transparently gives you back a subclass of ordereddict to make this possible (the actual implementation depends on your Python version).
Since your elements come in pairs, I think this could work.
dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}
new_element = { 'Phone':'1234'}
dict = {**dict,**new_element}
print(dict)
This is the output I got:
{'Name': 'Zara', 'Age': 7, 'Class': 'First', 'Phone': '1234'}
I had this:
[{'name': 'Peter'}, {'name': 'Anna'}]
And I wanted to make this out of it:
[{'name': 'Peter Williams'}, {'name': 'Anna Williams'}]
So I did:
>>> li = [{'name': 'Peter'}, {'name': 'Anna'}]
>>> new_li = []
>>> dic = {}
>>> for i in li:
...     dic["name"] = i["name"] + " Williams"
...     new_li.append(dic)
But:
>>> new_li
[{'name': 'Anna Williams'}, {'name': 'Anna Williams'}]
Why?
Could you also show how to best get [{'name': 'Peter Williams'}, {'name': 'Anna Williams'}]?
Edit
The reason why I didn't understand this behavior is because I assumed that:
>>> dict = {'name':'Peter'}
>>> lis = [dict]
>>> dict['name'] = 'Olaf'
Where
>>> print lis
gives
[{'name': 'Peter'}]
While it actually is
[{'name': 'Olaf'}]
Because you're using the same dictionary object in each iteration.
On the second iteration, you are altering the value that you assigned to the name key in the previous iteration, and your list ends up containing two references to the same object.
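The aliasing can be made visible with the `is` operator. The snippet below is a small illustration; the names `shared`, `aliased` and `copied` are chosen here and do not appear in the question:

```python
li = [{'name': 'Peter'}, {'name': 'Anna'}]

shared = {}
aliased = []
for person in li:
    shared['name'] = person['name'] + ' Williams'
    aliased.append(shared)        # appends a reference to the same dict

print(aliased[0] is aliased[1])   # True: one object, listed twice

copied = []
for person in li:
    shared['name'] = person['name'] + ' Williams'
    copied.append(dict(shared))   # shallow copy: a new dict each time

print(copied[0] is copied[1])     # False: two distinct objects
print(copied)
```

Appending to a list never copies the object, so every later change to `shared` shows through each entry of `aliased`.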
I'd strongly recommend checking out the "Python Tutor" tool (pythontutor.com), which lets you visualise the execution of Python code and see which objects are created along the way.
The correct way to do what you wanted would be:
li = [{'name': 'Peter'}, {'name': 'Anna'}]
new_li = []
for p in li:
    new_p = {'name': p['name'] + ' Williams'}
    new_li.append(new_p)
This way a new dictionary object is created with each iteration.
A more concise solution:
li = [{'name': 'Peter'}, {'name': 'Anna'}]
new_li = [{'name': p['name'] + ' Williams'} for p in li]
You need to create a new dictionary on each iteration of the loop, otherwise on each iteration you modify and append the same object to new_li:
>>> li = [{'name': 'Peter'}, {'name': 'Anna'}]
>>> new_li = []
>>> for i in li:
...     new_li.append({'name': i['name'] + ' Williams'})
...
>>> new_li
[{'name': 'Peter Williams'}, {'name': 'Anna Williams'}]
I'm assuming you want to alter it in-place. You don't need to make a new dict.
for d in inlist:
    d["name"] += " Williams"
Is all you have to do.
You have run into a really nasty problem. The problem is with your dic. You are iterating over the list of names, and first you hit the one with 'Peter'. At this point, you set dic to {'name': 'Peter Williams'} and append it. But then, you hit the one with 'Anna' and change that very same dic to {'name': 'Anna Williams'}. So you end up with the same dictionary in your list twice. To fix this, you will need to do this instead:
new_li = []
for i in li:
    new_li.append({'name': i['name'] + ' Williams'})  # a fresh dict on every pass
UPDATE: Just to be clear, I'd like to check the key-value of the 'name' and 'last' and add only if these are not already in the list.
I have:
lst = [{'name':'John', 'last':'Smith'.... .... (other key-values)... },
{'name':'Will', 'last':'Smith'... ... (other key-values)... }]
I want to append a new dict to this list only if it is not the exact same as an existing dictionary.
In other words:
dict1 = {'name':'John', 'last':'Smith'} # ==> wouldn't be appended
but...
dict2 = {'name':'John', 'last':'Brown'} # ==> WOULD be appended
Could someone explain the simplest way to do this, as well as in English, what is happening in the solution. THANKS!
Reference: Python: Check if any list element is a key in a dictionary
Since you asked for a way to only check the two keys, even if the dicts have other keys in them:
name_pairs = set((i['name'], i['last']) for i in lst)
if (d['name'], d['last']) not in name_pairs:
    lst.append(d)
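In plain terms: collect the (name, last) pair of every existing entry into a set, then append a new dict only if its own pair is not in that set. When several candidates are checked in a row, the set can be built once and kept up to date. A possible sketch, assuming every dict has both keys; `append_unique` and its parameters are hypothetical names:

```python
def append_unique(records, candidates, keys=('name', 'last')):
    """Append each candidate whose values for keys are not yet present."""
    # one tuple per existing record, e.g. ('John', 'Smith')
    seen = {tuple(rec[k] for k in keys) for rec in records}
    for cand in candidates:
        ident = tuple(cand[k] for k in keys)
        if ident not in seen:
            records.append(cand)
            seen.add(ident)  # so a later duplicate candidate is skipped too
    return records

lst = [{'name': 'John', 'last': 'Smith'}, {'name': 'Will', 'last': 'Smith'}]
dict1 = {'name': 'John', 'last': 'Smith'}
dict2 = {'name': 'John', 'last': 'Brown'}
append_unique(lst, [dict1, dict2])
print(lst)
```

Other keys in the dicts are ignored by the comparison, which matches the update in the question.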
You can do it with a list comprehension: append everything to your list, then run this:
lst.append(dict1)
lst.append(dict2)
[dict(y) for y in set(tuple(x.items()) for x in lst)]
The output is:
[
{'last': 'Smith', 'name': 'John'},
{'last': 'Brown', 'name': 'John'},
{'last': 'Smith', 'name': 'Will'}
]
With this method you can add extra fields and it will still work.
You could also write a small method to do it and return the list
def update_if_not_exist(lst, val):
    if len([d for d in lst if (d['name'], d['last']) == (val['name'], val['last'])]) == 0:
        lst.append(val)
    return lst
lst = update_if_not_exist(lst, dict1)
lst = update_if_not_exist(lst, dict2)
It works by filtering the original list down to the entries whose name and last values match, then checking whether the result is empty.
>>> class Person(dict):
...     def __eq__(self, other):
...         return (self['first'] == other['first'] and
...                 self['second'] == other['second'])
...     def __hash__(self):
...         return hash((self['first'], self['second']))
>>> l = [{'first': 'John', 'second': 'Smith', 'age': 23},
... {'first': 'John', 'second': 'Smith', 'age': 30},
... {'first': 'Ann', 'second': 'Rice', 'age': 31}]
>>> l = set(map(Person, l))
>>> print l
set([{'first': 'Ann', 'second': 'Rice', 'age': 31},
{'first': 'John', 'second': 'Smith', 'age': 23}])
Instances of the Person class can be used like a plain dict.
I have a dict which contains some lists and some dicts, as illustrated below.
What is the most pythonic way to iterate over the dict and print out the name and address pairs for each top level dict key?
Thanks
{
'Resent-Bcc': [],
'Delivered-To': [],
'From': {'Name': 'Steve Watson', 'Address': 'steve.watson#example.org'},
'Cc': [],
'Resent-Cc': [],
'Bcc': [ {'Name': 'Daryl Hurstbridge', 'Address': 'daryl.hurstbridge#example.org'},
{'Name': 'Sally Hervorth', 'Address': 'sally.hervorth#example.org'},
{'Name': 'Mike Merry', 'Address': 'mike.merry#example.org'},
{'Name': 'Jenny Callisto', 'Address': 'jenny.callisto#example.org'}
],
'To': {'Name': 'Darius Jedburgh', 'Address': 'darius.jedburgh#example.org'}
}
Use the iteritems() method on the dict (Python 2; in Python 3, items() plays the same role). It's clear and easy to understand: that seems Pythonic to me. iteritems() also creates fewer temporary objects than items(), as Preet Kukreti mentioned in the comments.
First, fix your data. Right now, some of the values in the top-level dict are lists, and some are dicts:
# list
'Delivered-To': [],
# dict
'From': {'Name': 'Steve Watson', 'Address': 'steve.watson#example.org'},
This means you have to check the type of the value and act accordingly (and you might forget to check!). Make your data consistent:
# list
'Delivered-To': [],
# also list
'From': [{'Name': 'Steve Watson', 'Address': 'steve.watson#example.org'}],
This will prevent weird type-related bugs in the future. Since Python is dynamically typed, it's very easy to introduce type bugs and not notice until your code is in production and crashes. Try to make your code as type-safe as possible!
Then you can use something like this:
for k, v in d.iteritems():
    for row in v:
        if "Name" in row and "Address" in row:
            print row["Name"], ":", row["Address"]
One way is to change the lone dicts into a list containing the dict. Then all the entries can be treated the same
>>> D = {
... 'Resent-Bcc': [],
... 'Delivered-To': [],
... 'From': {'Name': 'Steve Watson', 'Address': 'steve.watson#example.org'},
... 'Cc': [],
... 'Resent-Cc': [],
... 'Bcc': [ {'Name': 'Daryl Hurstbridge', 'Address': 'daryl.hurstbridge#example.org'},
... {'Name': 'Sally Hervorth', 'Address': 'sally.hervorth#example.org'},
... {'Name': 'Mike Merry', 'Address': 'mike.merry#example.org'},
... {'Name': 'Jenny Callisto', 'Address': 'jenny.callisto#example.org'}
... ],
... 'To': {'Name': 'Darius Jedburgh', 'Address': 'darius.jedburgh#example.org'}
... }
>>> L = [v if type(v) is list else [v] for v in D.values()]
>>> [(d["Name"], d["Address"]) for item in L for d in item ]
[('Steve Watson', 'steve.watson#example.org'), ('Daryl Hurstbridge', 'daryl.hurstbridge#example.org'), ('Sally Hervorth', 'sally.hervorth#example.org'), ('Mike Merry', 'mike.merry#example.org'), ('Jenny Callisto', 'jenny.callisto#example.org'), ('Darius Jedburgh', 'darius.jedburgh#example.org')]
Or the one liner version
[(d["Name"], d["Address"]) for item in (v if type(v) is list else [v] for v in D.values()) for d in item]
It's probably best to keep your data simple, by making the naked dict's be a list of one element holding the original dict. Otherwise, you're kind of asking for harder to test code.
I tend to lean away from isinstance(foo, dict) and instead use things like:
if hasattr(d, 'iteritems'): print list(d.iteritems())
...It strikes me as more duck-typed this way; it opens the door to using one of the many dict-replacements - things that act like a dict, but nominally aren't a dict.
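In Python 3, one way to keep that flexibility is to test against the abstract base class collections.abc.Mapping, which dict-like replacements typically subclass or register with. A minimal sketch under that assumption; `iter_contacts` and the shortened `header` sample are illustrative:

```python
from collections.abc import Mapping

def iter_contacts(header):
    """Yield (name, address) pairs from values that are either a single
    mapping or a list of mappings."""
    for value in header.values():
        # wrap a lone mapping so both shapes are handled by the same loop
        entries = [value] if isinstance(value, Mapping) else value
        for entry in entries:
            if 'Name' in entry and 'Address' in entry:
                yield entry['Name'], entry['Address']

header = {
    'Cc': [],
    'From': {'Name': 'Steve Watson', 'Address': 'steve.watson#example.org'},
    'Bcc': [
        {'Name': 'Mike Merry', 'Address': 'mike.merry#example.org'},
        {'Name': 'Jenny Callisto', 'Address': 'jenny.callisto#example.org'},
    ],
}
pairs = list(iter_contacts(header))
for name, address in pairs:
    print(name, ':', address)
```

The generator leaves the original data untouched, so it can be used even when the structure itself cannot be normalised.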
for key in header:
    if header[key] and type(header[key])==type([]):
        for item in header[key]:
            print (item)
    elif type(header[key])==type({}):
        print(header[key])
# this option is not the easiest to read, so I classify it as less "pythonic"
l = [header[key] for key in header if header[key] and type(header[key])==type({})] + [header[key][i] for key in header if header[key] and type(header[key])==type([]) for i in range(len(header[key]))]
for item in l:
    print(item)
If you're looking for the contents of a specific header, you could modify the if statements accordingly. Both of these examples print the dictionaries, but could easily be adapted to print specific values.
for i in dict:
    if 'Name' in dict[i]:
        print (dict[i]['Name'],dict[i]['Address'])
This will not work for Bcc, whose entries are in a list (right now it only prints the From and To names and addresses). Do you need it to print the Bcc addresses too?