Convert dictionary lists to multi-dimensional list of dictionaries - python

I've been trying to convert the following:
data = {'title':['doc1','doc2','doc3'], 'name':['test','check'], 'id':['ddi5i'] }
to:
[{'title':'doc1', 'name': 'test', 'id': 'ddi5i'},
{'title':'doc2', 'name': 'test', 'id': 'ddi5i'},
{'title':'doc3', 'name': 'test', 'id': 'ddi5i'},
{'title':'doc1', 'name': 'check', 'id': 'ddi5i'},
{'title':'doc2', 'name': 'check', 'id': 'ddi5i'},
{'title':'doc3', 'name': 'check', 'id': 'ddi5i'}]
I've tried various options (list comprehensions, pandas and custom code) but nothing seems to work. For example, the following:
pandas.DataFrame(data).to_dict('list')
raises an error because pandas requires all of the lists to have the same length. Besides, the output would only be one-dimensional, which is not what I'm looking for.

itertools.product may be what you're looking for here, and it can be applied to the values of your data to get appropriate value groupings for the new dicts. Something like
list(dict(zip(data, ele)) for ele in product(*data.values()))
Demo
>>> from itertools import product
>>> list(dict(zip(data, ele)) for ele in product(*data.values()))
[{'id': 'ddi5i', 'name': 'test', 'title': 'doc1'},
{'id': 'ddi5i', 'name': 'test', 'title': 'doc2'},
{'id': 'ddi5i', 'name': 'test', 'title': 'doc3'},
{'id': 'ddi5i', 'name': 'check', 'title': 'doc1'},
{'id': 'ddi5i', 'name': 'check', 'title': 'doc2'},
{'id': 'ddi5i', 'name': 'check', 'title': 'doc3'}]
How this works becomes clear once you look at the intermediate product:
>>> list(product(*data.values()))
[('test', 'doc1', 'ddi5i'),
('test', 'doc2', 'ddi5i'),
('test', 'doc3', 'ddi5i'),
('check', 'doc1', 'ddi5i'),
('check', 'doc2', 'ddi5i'),
('check', 'doc3', 'ddi5i')]
and now it is just a matter of zipping back into a dict with the original keys.
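Note that on Python 3.7+ dicts keep insertion order, so product(*data.values()) makes the last key (id) vary fastest and the first key (title) vary slowest, which gives a different row order than the listings above. A minimal sketch, assuming you want to control which key varies slowest; expand and its order parameter are hypothetical names, not part of itertools:

```python
from itertools import product

data = {'title': ['doc1', 'doc2', 'doc3'], 'name': ['test', 'check'], 'id': ['ddi5i']}

def expand(mapping, order=None):
    """Return one dict per combination of the values in `mapping`.

    `order` (hypothetical parameter) lists the keys from slowest-varying
    to fastest-varying; it defaults to the mapping's own key order.
    """
    keys = list(order) if order is not None else list(mapping)
    combos = product(*(mapping[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]

# 'name' varies slowest, then 'title', matching the order asked for in the question
rows = expand(data, order=['name', 'title', 'id'])
for row in rows:
    print(row)
```

Passing the keys explicitly keeps the result independent of how the input dict happened to be built.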

Related

Remove item from nested dictionaries if specified key contains None values

I have a list of dictionaries, and I am trying to remove any dictionary in which the value of a certain key is None.
item_dict = [
{'code': 'aaa0000',
'id': 415294,
'index_range': '10-33',
'location': 'A010',
'type': 'True'},
{'code': 'bbb1458',
'id': 415575,
'index_range': '30-62',
'location': None,
'type': 'True'},
{'code': 'ccc3013',
'id': 415575,
'index_range': '14-59',
'location': 'C041',
'type': 'True'}
]
for item in item_dict:
    filtered = dict((k,v) for k,v in item.items() if v is not None)
# Output Results
# Item - aaa0000 is missing
# {'index_range': '14-59', 'code': 'ccc3013', 'type': 'True', 'id': 415575, 'location': 'C041'}
In my example, the output is missing one of the dictionaries, and if I create a new list and append filtered to it, item bbb1458 is included in the list as well.
How can I rectify this?
[item for item in item_dict if None not in item.values()]
Each item in this list is a dictionary, and a dictionary is kept in the new list only if None does not appear among its values.
You can create a new list using a list comprehension, filtering on the condition that all values are not None:
item_dict = [
{'code': 'aaa0000',
'id': 415294,
'index_range': '10-33',
'location': 'A010',
'type': 'True'},
{'code': 'bbb1458',
'id': 415575,
'index_range': '30-62',
'location': None,
'type': 'True'},
{'code': 'ccc3013',
'id': 415575,
'index_range': '14-59',
'location': 'C041',
'type': 'True'}
]
filtered = [d for d in item_dict if all(value is not None for value in d.values())]
print(filtered)
#[{'index_range': '10-33', 'id': 415294, 'location': 'A010', 'type': 'True', 'code': 'aaa0000'}, {'index_range': '14-59', 'id': 415575, 'location': 'C041', 'type': 'True', 'code': 'ccc3013'}]
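Both answers above drop a dictionary when any of its values is None. If only one specific key should be checked, as the question title suggests, something like the following may be closer. This is a minimal sketch; drop_if_none and the note field are hypothetical names used only for illustration:

```python
def drop_if_none(records, key):
    """Keep only the records whose value for `key` is not None.

    Records that lack `key` entirely are dropped too, because
    dict.get() returns None for a missing key.
    """
    return [record for record in records if record.get(key) is not None]

records = [
    {'code': 'aaa0000', 'location': 'A010', 'note': None},
    {'code': 'bbb1458', 'location': None, 'note': 'moved'},
    {'code': 'ccc3013', 'location': 'C041', 'note': None},
]

kept = drop_if_none(records, 'location')
print([record['code'] for record in kept])  # ['aaa0000', 'ccc3013']
```

Here aaa0000 and ccc3013 survive even though their note is None, because only location is inspected.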

compare two different length lists of dictionaries in python

I want to compare the two lists of dictionaries below. The Name key is common to the dictionaries in both lists.
If a Name matches in both lists, I want to do some further processing with the data.
PerfData = [
{'Name': 'abc', 'Type': 'Ex1', 'Access': 'N1', 'perfStatus':'Latest Perf', 'Comments': '07/12/2017 S/W Version'},
{'Name': 'xyz', 'Type': 'Ex1', 'Access': 'N2', 'perfStatus':'Latest Perf', 'Comments': '11/12/2017 S/W Version upgrade failed'},
{'Name': 'efg', 'Type': 'Cust1', 'Access': 'A1', 'perfStatus':'Old Perf', 'Comments': '11/10/2017 S/W Version upgrade failed, test data is active'}
]
beatData = [
{'Name': 'efg', 'Status': 'Latest', 'rcvd-timestamp': '1516756202.632'},
{'Name': 'abc', 'Status': 'Latest', 'rcvd-timestamp': '1516756202.896'}
]
l = [{'name': 'abc'}, {'name': 'xyz'}]
k = [{'name': 'a'}, {'name': 'abc'}]
[i['name'] for i in l for f in k if i['name'] == f['name']]
I hope the logic above works for you.
The answer above doesn't assign the result to a variable. If you want to print it, the following would work:
result = [i['name'] for i in l for f in k if i['name'] == f['name']]
print(result)
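The nested comprehension compares every pair of entries. When the matched records themselves are needed for further processing, a lookup dict keyed by Name avoids the inner loop. A rough sketch, assuming Name is unique within beatData (with duplicates, the last entry wins); the data is a shortened version of the lists in the question:

```python
perf_data = [
    {'Name': 'abc', 'Type': 'Ex1', 'perfStatus': 'Latest Perf'},
    {'Name': 'xyz', 'Type': 'Ex1', 'perfStatus': 'Latest Perf'},
    {'Name': 'efg', 'Type': 'Cust1', 'perfStatus': 'Old Perf'},
]
beat_data = [
    {'Name': 'efg', 'Status': 'Latest', 'rcvd-timestamp': '1516756202.632'},
    {'Name': 'abc', 'Status': 'Latest', 'rcvd-timestamp': '1516756202.896'},
]

# Index the shorter list by Name so each lookup is a single dict access
beat_by_name = {entry['Name']: entry for entry in beat_data}

# Merge each perf record with its matching beat record, skipping unmatched names
matched = [
    {**perf, **beat_by_name[perf['Name']]}
    for perf in perf_data
    if perf['Name'] in beat_by_name
]

print([m['Name'] for m in matched])  # ['abc', 'efg']
```

The lists may have different lengths; names present in only one of them (xyz here) are simply left out.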

Filter/group dictionary by nested value

Here's a simplified example of some data I have:
{"id": "1234565", "fields": {"name": "john", "email":"john#example.com", "country": "uk"}}
The whole nested dictionary is part of a bigger list of address data. The goal is to create pairs of people from the list with randomized partners, where partners from the same country should be preferred. So my first real issue is to find a good way to group them by that country value.
I'm sure there's a smarter way to do this than iterating through the dict and writing all records out to some new list/dict?
I think this is close to what you need:
result = {key:[i for i in value] for key, value in itertools.groupby(people, lambda item: item["fields"]["country"])}
What this does is use itertools.groupby to group all people in the people list by their specified country. The resulting dictionary has countries as keys, and the unpacked groupings (matching people) as values. Input is expected as a list of dictionaries like the one in your example:
people = [{"id": "1234565", "fields": {"name": "john", "email":"john#example.com", "country": "uk"}},
{"id": "654321", "fields": {"name": "sam", "email":"sam#example.com", "country": "uk"}}]
Sample output:
>>> print(result)
{'uk': [{'fields': {'name': 'john', 'email': 'john#example.com', 'country': 'uk'}, 'id': '1234565'}, {'fields': {'name': 'sam', 'email': 'sam#example.com', 'country': 'uk'}, 'id': '654321'}]}
For a cleaner result, the looping construct can be tweaked so that only the ID of each person is included in the result dict:
result = {key:[i["id"] for i in value] for key, value in itertools.groupby(people, lambda item: item["fields"]["country"])}
>>> print(result)
{'uk': ['1234565', '654321']}
EDIT: Sorry, I forgot about the sorting. groupby only groups consecutive items that share a key, so sort the list of people by country and pass the sorted list to groupby instead. It should now work properly:
sort = sorted(people, key=lambda item: item["fields"]["country"])
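To see why the sort matters, here is a small sketch with made-up people in which the two uk entries are not adjacent; country_of is a hypothetical helper name:

```python
from itertools import groupby

people = [
    {"id": "1", "fields": {"name": "john", "country": "uk"}},
    {"id": "2", "fields": {"name": "ana", "country": "br"}},
    {"id": "3", "fields": {"name": "sam", "country": "uk"}},
]

def country_of(person):
    return person["fields"]["country"]

# Without sorting, 'uk' shows up as two separate groups
unsorted_groups = [(key, [p["id"] for p in group])
                   for key, group in groupby(people, key=country_of)]
print(unsorted_groups)  # [('uk', ['1']), ('br', ['2']), ('uk', ['3'])]

# Sorting by the same key first brings equal countries together
by_country = {key: [p["id"] for p in group]
              for key, group in groupby(sorted(people, key=country_of), key=country_of)}
print(by_country)  # {'br': ['2'], 'uk': ['1', '3']}
```

In a dict comprehension the unsorted version would silently overwrite the first uk group with the second one.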
Here is another one that uses defaultdict:
import collections
def make_groups(nested_dicts, nested_key):
    default = collections.defaultdict(list)
    for nested_dict in nested_dicts:
        for value in nested_dict.values():
            try:
                default[value[nested_key]].append(nested_dict)
            except TypeError:
                pass
    return default
To test the results:
import random
# random.sample() no longer accepts a set (Python 3.11+), so pick from a list instead
COUNTRY = ['af', 'br', 'fr', 'mx', 'uk']
people = [{'id': i, 'fields': {
    'name': 'name'+str(i),
    'email': str(i)+'#email',
    'country': random.choice(COUNTRY)}}
    for i in range(10)]
country_groups = make_groups(people, 'country')
for country, persons in country_groups.items():
    print(country, persons)
Random output:
fr [{'id': 0, 'fields': {'name': 'name0', 'email': '0#email', 'country': 'fr'}}, {'id': 1, 'fields': {'name': 'name1', 'email': '1#email', 'country': 'fr'}}, {'id': 4, 'fields': {'name': 'name4', 'email': '4#email', 'country': 'fr'}}]
br [{'id': 2, 'fields': {'name': 'name2', 'email': '2#email', 'country': 'br'}}, {'id': 8, 'fields': {'name': 'name8', 'email': '8#email', 'country': 'br'}}]
uk [{'id': 3, 'fields': {'name': 'name3', 'email': '3#email', 'country': 'uk'}}, {'id': 7, 'fields': {'name': 'name7', 'email': '7#email', 'country': 'uk'}}]
af [{'id': 5, 'fields': {'name': 'name5', 'email': '5#email', 'country': 'af'}}, {'id': 9, 'fields': {'name': 'name9', 'email': '9#email', 'country': 'af'}}]
mx [{'id': 6, 'fields': {'name': 'name6', 'email': '6#email', 'country': 'mx'}}]
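If the nested key always lives under "fields", as in the question's data, the try/except scan over all values is not strictly needed. A simpler variant might look like this; group_by_country is a hypothetical name and the sample people are made up:

```python
from collections import defaultdict

def group_by_country(people):
    """Group person dicts by people[i]['fields']['country']."""
    groups = defaultdict(list)
    for person in people:
        groups[person["fields"]["country"]].append(person)
    return groups

people = [
    {"id": 1, "fields": {"name": "john", "country": "uk"}},
    {"id": 2, "fields": {"name": "marie", "country": "fr"}},
    {"id": 3, "fields": {"name": "sam", "country": "uk"}},
]

groups = group_by_country(people)
print({country: [p["id"] for p in persons] for country, persons in groups.items()})
# {'uk': [1, 3], 'fr': [2]}
```

Unlike groupby, this does not require the input to be sorted first.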

Python find element from list of dict in other list of dict

I have two lists of dicts.
students = [{'lastname': 'JAKUB', 'id': '92051048757', 'name': 'BAJOREK'},
{'lastname': 'MARIANNA', 'id': '92051861424', 'name': 'SLOTARZ'}, {'lastname':
'SZYMON', 'id': '92052033215', 'name': 'WNUK'}, {'lastname': 'WOJCIECH', 'id':
'92052877491', 'name': 'LESKO'}]
And
house = [{'id_pok': '2', 'id': '92051048757'}, {'id_pok': '24', 'id': '92051861424'}]
How can I find the students whose id does not appear in the house list of dicts?
Output
output = [{'lastname':
'SZYMON', 'id': '92052033215', 'name': 'WNUK'}]
I tried this:
for student in students:
    for home in house:
        if student['id'] != home['id']:
            print(student)
But this only prints the students repeatedly.
Your code doesn't work because a student is printed once for every house entry whose id differs from the student's id, so even students who do have a house get printed. You'd need some more logic, or the any function:
for student in students:
    if not any(student['id'] == home['id'] for home in house):
        print(student)
It outputs:
{'lastname': 'SZYMON', 'id': '92052033215', 'name': 'WNUK'}
{'lastname': 'WOJCIECH', 'id': '92052877491', 'name': 'LESKO'}
A more efficient solution would be to keep a set of house_ids, and find students whose id isn't included in this set:
students = [{'lastname': 'JAKUB', 'id': '92051048757', 'name': 'BAJOREK'},
{'lastname': 'MARIANNA', 'id': '92051861424', 'name': 'SLOTARZ'}, {'lastname':
'SZYMON', 'id': '92052033215', 'name': 'WNUK'}, {'lastname': 'WOJCIECH', 'id':
'92052877491', 'name': 'LESKO'}]
house = [{'id_pok': '2', 'id': '92051048757'}, {'id_pok': '24', 'id': '92051861424'}]
house_ids = set(house_dict['id'] for house_dict in house)
result = [student for student in students if student['id'] not in house_ids]
print(result)
It outputs:
[{'lastname': 'SZYMON', 'id': '92052033215', 'name': 'WNUK'}, {'lastname': 'WOJCIECH', 'id': '92052877491', 'name': 'LESKO'}]
Note that 2 students match your description.
The reason a set is used is that it allows much faster lookup than a list.
student_ids = set(d.get('id') for d in students)
house_ids = set(d.get('id') for d in house)
ids_not_in_house = student_ids - house_ids  # set difference: student ids with no house entry
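For "ids not in house", set difference (-) and symmetric difference (^) happen to give the same result on the sample data, but they diverge as soon as house contains an id with no matching student. A small sketch with made-up ids:

```python
student_ids = {'1', '2', '3'}
house_ids = {'1', '2', '9'}  # '9' is a hypothetical house entry without a student

# Difference: ids that are students but have no house
only_students = student_ids - house_ids
print(sorted(only_students))  # ['3']

# Symmetric difference: ids that appear in exactly one of the two sets
in_exactly_one = student_ids ^ house_ids
print(sorted(in_exactly_one))  # ['3', '9']
```

So for this question the plain difference is the safer choice.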
students = [{'lastname': 'JAKUB', 'id': '92051048757', 'name': 'BAJOREK'},
{'lastname': 'MARIANNA', 'id': '92051861424', 'name': 'SLOTARZ'}, {'lastname':
'SZYMON', 'id': '92052033215', 'name': 'WNUK'}, {'lastname': 'WOJCIECH', 'id':
'92052877491', 'name': 'LESKO'}]
house = [{'id_pok': '2', 'id': '92051048757'}, {'id_pok': '24', 'id': '92051861424'}]
s = {item['id'] for item in students}
h = {item['id'] for item in house}
not_in_house_ids = s.difference(h)
not_in_house_items = [x for x in students if x['id'] in not_in_house_ids]
print(not_in_house_items)
# [{'name': 'WNUK', 'lastname': 'SZYMON', 'id': '92052033215'}, {'name': 'LESKO', 'lastname': 'WOJCIECH', 'id': '92052877491'}]

Updating a value in a dictionary inside a dictionary

If I have a list of contact dictionaries like this:
{'name': 'Rob', 'phoneNumbers': [{'phone': '123-3214', 'type': 'home'}, {'phone': '456-3216', 'type': 'work'}]}
how could I pythonically remove the dashes from the phone numbers in a list of such contact dictionaries?
You could just nest loops:
for contact_dict in list_of_dicts:
    for phone_dict in contact_dict['phoneNumbers']:
        phone_dict['phone'] = phone_dict['phone'].replace('-', '')
This alters the values in-place.
Or you could create a whole new copy of the structure, with the alterations made:
[dict(contact, phoneNumbers=[
    dict(phone_dict, phone=phone_dict['phone'].replace('-', ''))
    for phone_dict in contact['phoneNumbers']])
 for contact in list_of_dicts]
This creates a semi-shallow copy; only the phoneNumbers key is explicitly copied, but any other mutable values are just referenced by the new dictionaries.
Demo:
>>> list_of_dicts = [{'name': 'Rob', 'phoneNumbers': [{'phone': '123-3214', 'type': 'home'}, {'phone': '456-3216', 'type': 'work'}]}]
>>> [dict(contact, phoneNumbers=[
...     dict(phone_dict, phone=phone_dict['phone'].replace('-', ''))
...     for phone_dict in contact['phoneNumbers']])
...  for contact in list_of_dicts]
[{'phoneNumbers': [{'phone': '1233214', 'type': 'home'}, {'phone': '4563216', 'type': 'work'}], 'name': 'Rob'}]
>>> for contact_dict in list_of_dicts:
...     for phone_dict in contact_dict['phoneNumbers']:
...         phone_dict['phone'] = phone_dict['phone'].replace('-', '')
...
>>> list_of_dicts
[{'phoneNumbers': [{'phone': '1233214', 'type': 'home'}, {'phone': '4563216', 'type': 'work'}], 'name': 'Rob'}]
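To illustrate what "semi-shallow" means for the copying version: the phone dicts are rebuilt, but any other mutable value is still shared with the original. A small sketch; the tags key is a hypothetical extra field added only to show the sharing:

```python
list_of_dicts = [{'name': 'Rob', 'tags': ['friend'],
                  'phoneNumbers': [{'phone': '123-3214', 'type': 'home'}]}]

cleaned = [dict(contact, phoneNumbers=[
               dict(phone_dict, phone=phone_dict['phone'].replace('-', ''))
               for phone_dict in contact['phoneNumbers']])
           for contact in list_of_dicts]

# The phone numbers were copied, so the original keeps its dashes
print(list_of_dicts[0]['phoneNumbers'][0]['phone'])  # 123-3214
print(cleaned[0]['phoneNumbers'][0]['phone'])        # 1233214

# The 'tags' list was only referenced, so a change shows up in both
cleaned[0]['tags'].append('colleague')
print(list_of_dicts[0]['tags'])  # ['friend', 'colleague']
```

If full independence is needed, copy.deepcopy on the original list before editing is one option.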
Just use str.replace to remove the -:
d ={'name': "Rob", 'phoneNumbers': [{'phone': '123-3214', 'type': 'home'}, {'phone': '456-3216', 'type': 'work'}]}
for dct in d["phoneNumbers"]:
    dct['phone'] = dct['phone'].replace("-", "")  # no count argument, so every dash is removed
Which gives you:
{'phoneNumbers': [{'phone': '1233214', 'type': 'home'}, {'phone': '4563216', 'type': 'work'}], 'name': 'Rob'}
