I don't work with dictionaries much and have been thinking about this for too long. I'm looking for a way to merge a dictionary and a list of dictionaries as described below. The real dictionary might be somewhat bigger.
Thanks!
dict1:
{'0': '5251', '1': '5259'}
list:
[{'id': 5259, 'name': 'chair'}, {'id': 5251, 'name': 'table'}]
Result:
{'0': {'id': 5251, 'name': 'table'}, '1': {'id': 5259, 'name': 'chair'}}
This should work:
dict1 = {'0': '5251', '1': '5259'}
ls = [{'id': 5259, 'name': 'chair'}, {'id': 5251, 'name': 'table'}]
idToEntry = dict([x['id'], x] for x in ls)
dict2 = dict([k, idToEntry[int(v)]] for k, v in dict1.items())
print(dict2)
The output:
{'0': {'id': 5251, 'name': 'table'}, '1': {'id': 5259, 'name': 'chair'}}
This solution will be somewhat slow for larger lists, but it is very simple:
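result = {}
for key, val in dict1.items():
    for dict2 in list1:  # list1 is the list of dicts from the question
        # ids are ints in the list but strings in dict1, so convert before comparing
        if dict2['id'] == int(val):
            result[key] = dict2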
Construct a dictionary with reversed keys and values from your original:
d = {'0': '5251', '1': '5259'}
rev_dict = {v: k for k, v in d.items()}
Now you can use the id from each item of your list (converted to a string) as a key into the reversed dict:
l = [{'id': 5259, 'name': 'chair'}, {'id': 5251, 'name': 'table'}]
merged_data = {rev_dict[str(x['id'])]: x for x in l}
# {'1': {'id': 5259, 'name': 'chair'}, '0': {'id': 5251, 'name': 'table'}}
d = {'0': '5251', '1': '5259'}
l = [{'id': 5259, 'name': 'chair'}, {'id': 5251, 'name': 'table'}]
# Let's prepare an intermediate dict for faster lookups on large data
d1 = {str(x['id']): x for x in l}
merged_dict = {x: d1[y] for x, y in d.items()}
print(merged_dict)
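The answers above all assume that every id in the dictionary has a matching entry in the list. A minimal sketch of the same lookup-table idea that simply skips ids without a match follows; the function name merge_by_id and the extra key '2' are made up for illustration and are not part of the question.

```python
def merge_by_id(index_to_id, entries):
    """Map each key of index_to_id to the entry whose 'id' matches its value."""
    # Build the lookup once: id (as a string) -> entry.
    by_id = {str(entry['id']): entry for entry in entries}
    # Keep only the keys whose id actually exists in the lookup.
    return {key: by_id[id_] for key, id_ in index_to_id.items() if id_ in by_id}


d = {'0': '5251', '1': '5259', '2': '9999'}  # '9999' is a hypothetical id with no entry
l = [{'id': 5259, 'name': 'chair'}, {'id': 5251, 'name': 'table'}]
merged = merge_by_id(d, l)
print(merged)
```

Whether a missing id should be skipped, mapped to None, or raise an error depends on the data, so treat the skip as one possible choice.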
Related
I have a dictionary with missing values (the key is there, but the associated value is empty). For example I want the dictionary below:
dct = {'ID': '', 'gender': 'male', 'age': '20', 'weight': '', 'height': '5.7'}
to be changed to this form:
dct = {'ID': {'link': '','value': ''}, 'gender': 'male', 'age': '20', 'weight': {'link': '','value': ''}, 'height': '5.7'}
I want the values of the ID and weight keys to be replaced with a nested dictionary if they are empty.
How can I write that in the most time-efficient way?
I have tried the solutions from the links below, but they didn't work:
def update(orignal, addition):
    for k, v in addition.items():
        if k not in orignal:
            orignal[k] = v
        else:
            if isinstance(v, dict):
                update(orignal[k], v)
            elif isinstance(v, list):
                for i in range(len(v)):
                    update(orignal[k][i], v[i])
            else:
                if not orignal[k]:
                    orignal[k] = v
Error: TypeError: 'str' object does not support item assignment
Fill missing keys by comparing example json in python
Adding missing keys in dictionary in Python
This seems similar to the issue answered here: https://stackoverflow.com/a/3233356/6396981
import collections.abc
def update(d, u):
    for k, v in u.items():
        if isinstance(v, collections.abc.Mapping):
            d[k] = update(d.get(k, {}) or {}, v)
        else:
            d[k] = v
    return d
For example in your case:
>>> dict1 = {'ID':'', 'gender':'male', 'age':'20', 'weight':'', 'height':'5.7'}
>>> dict2 = {'ID': {'link':'','value':''}, 'weight': {'link':'','value':''}}
>>>
>>> update(dict1, dict2)
{'ID': {'link': '', 'value': ''}, 'gender': 'male', 'age': '20', 'weight': {'link': '', 'value': ''}, 'height': '5.7'}
>>>
You can iterate through the dictionary and check whether each value is an empty string (''); if it is, replace it with a copy of the default value. Here's a small snippet that does it:
dct = {'ID':'', 'gender':'male', 'age':'20', 'weight':'', 'height':'5.7'}
def update(d, default):
    for k, v in d.items():
        if v == '':
            d[k] = default.copy()
update(dct, {'link':'','value':''})
print(dct)
Output :
{'ID': {'link': '', 'value': ''}, 'gender': 'male', 'age': '20', 'weight': {'link': '', 'value': ''}, 'height': '5.7'}
Note that the function receives a reference to the same dict object, so any updates made inside it are reflected in the original dictionary, as seen in the example above.
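If the original dictionary should stay untouched, the same replacement can be written so that it builds a new dict instead. This is only a sketch; the name with_defaults is made up here and does not come from the question.

```python
def with_defaults(d, default):
    """Return a new dict where every empty-string value is replaced by a copy of default."""
    return {k: (dict(default) if v == '' else v) for k, v in d.items()}


original = {'ID': '', 'gender': 'male', 'weight': ''}
updated = with_defaults(original, {'link': '', 'value': ''})
print(original)  # unchanged
print(updated)
```

Each replaced key gets its own copy of the default, so changing updated['ID'] later does not affect updated['weight'].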
If your dict is nested and you want the replacement to be done for nested items as well then you can use this function -
def nested_update(d, default):
    for k, v in d.items():
        if v == '':
            d[k] = default.copy()
        if isinstance(v, list):
            for item in v:
                nested_update(item, default)
        if isinstance(v, dict):
            nested_update(v, default)
Here's a small example with a list of dictionaries and a nested dictionary:
dct = {'ID':'', 'gender':'male', 'age':'20', 'weight':'', 'height':'5.7', "list_data":[{'empty': ''}, {'non-empty': 'value'}], "nested_dict": {"key1": "val1", "missing_nested": ""}}
nested_update(dct, {'key1': 'val1-added', 'key2': 'val2-added'})
print(dct)
Output :
{'ID': {'key1': 'val1-added', 'key2': 'val2-added'}, 'gender': 'male', 'age': '20', 'weight': {'key1': 'val1-added', 'key2': 'val2-added'}, 'height': '5.7', 'list_data': [{'empty': {'key1': 'val1-added', 'key2': 'val2-added'}}, {'non-empty': 'value'}], 'nested_dict': {'key1': 'val1', 'missing_nested': {'key1': 'val1-added', 'key2': 'val2-added'}}}
For "this default dictionary to only specified keys like ID and Weight and not for other keys", you can update the condition of when we replace the value -
def nested_update(d, default):
    for k, v in d.items():
        if k in ('ID', 'weight') and v == '':
            d[k] = default.copy()
        if isinstance(v, list):
            for item in v:
                nested_update(item, default)
        if isinstance(v, dict):
            nested_update(v, default)
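The key names can also be passed in rather than hard-coded. The sketch below is a variation on the function above, not the answerer's code: the name nested_update_keys and the keys parameter are assumptions, and it additionally skips list items that are not dictionaries, which would otherwise fail on .items().

```python
def nested_update_keys(d, default, keys):
    """Replace empty-string values of the given keys, recursing into dicts and lists."""
    for k, v in d.items():
        if k in keys and v == '':
            d[k] = default.copy()
        elif isinstance(v, dict):
            nested_update_keys(v, default, keys)
        elif isinstance(v, list):
            for item in v:
                # Only dict items can be updated; plain values are skipped.
                if isinstance(item, dict):
                    nested_update_keys(item, default, keys)


dct = {'ID': '', 'age': '', 'tags': ['a', {'weight': ''}], 'nested': {'ID': '', 'other': ''}}
nested_update_keys(dct, {'link': '', 'value': ''}, keys={'ID', 'weight'})
print(dct)
```

Lists nested directly inside other lists are not descended into here; that case would need one more branch.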
I have to sort a dict like:
jobs = {'elem_05': {'id': 'fifth'},
'elem_03': {'id': 'third'},
'elem_01': {'id': 'first'},
'elem_00': {'id': 'zeroth'},
'elem_04': {'id': 'fourth'},
'elem_02': {'id': 'second'}}
based on the "id" elements, whose order can be found in a list:
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
The trivial way to solve the problem is to use:
tmp = {}
for x in sorting_list:
    for k, v in jobs.items():
        if v["id"] == x:
            tmp.update({k: v})
but I was trying to figure out a more efficient and pythonic way.
I've been trying sorted and lambda functions as key, but I'm not familiar with that yet, so I was unsuccessful so far.
I would use a dictionary as key for sorted:
order = {k:i for i,k in enumerate(sorting_list)}
# {'zeroth': 0, 'first': 1, 'second': 2, 'third': 3, 'fourth': 4, 'fifth': 5}
out = dict(sorted(jobs.items(), key=lambda x: order.get(x[1].get('id'))))
output:
{'elem_00': {'id': 'zeroth'},
'elem_01': {'id': 'first'},
'elem_02': {'id': 'second'},
'elem_03': {'id': 'third'},
'elem_04': {'id': 'fourth'},
'elem_05': {'id': 'fifth'}}
There is a way to sort the dict using lambda as a sorting key:
jobs = {'elem_05': {'id': 'fifth'},
'elem_03': {'id': 'third'},
'elem_01': {'id': 'first'},
'elem_00': {'id': 'zeroth'},
'elem_04': {'id': 'fourth'},
'elem_02': {'id': 'second'}}
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
sorted_jobs = dict(sorted(jobs.items(), key=lambda x: sorting_list.index(x[1]['id'])))
print(sorted_jobs)
This outputs
{'elem_00': {'id': 'zeroth'}, 'elem_01': {'id': 'first'}, 'elem_02': {'id': 'second'}, 'elem_03': {'id': 'third'}, 'elem_04': {'id': 'fourth'}, 'elem_05': {'id': 'fifth'}}
I have a feeling the sorted expression could be cleaner but I didn't get it to work any other way.
You can use OrderedDict:
from collections import OrderedDict
sorted_jobs = OrderedDict(sorted(jobs.items(), key=lambda item: sorting_list.index(item[1]['id'])))
This creates an OrderedDict object which is pretty similar to dict, and can be converted to dict using dict(sorted_jobs).
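Since Python 3.7, plain dicts also keep insertion order, so an OrderedDict is mostly useful when order should matter in comparisons. A small illustration with made-up sample data:

```python
from collections import OrderedDict

a = {'x': 1, 'y': 2}
b = {'y': 2, 'x': 1}

print(list(a))                           # insertion order is kept: ['x', 'y']
print(a == b)                            # plain dicts ignore order when comparing: True
print(OrderedDict(a) == OrderedDict(b))  # OrderedDicts compare order as well: False
```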
Similar to what has already been posted, but with error handling in case an id doesn't appear in sorting_list (such entries are sorted to the end):
sorting_list = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
jobs = {'elem_05': {'id': 'fifth'},
'elem_03': {'id': 'third'},
'elem_01': {'id': 'first'},
'elem_00': {'id': 'zeroth'},
'elem_04': {'id': 'fourth'},
'elem_02': {'id': 'second'}}
def custom_order(item):
    try:
        return sorting_list.index(item[1]["id"])
    except ValueError:
        return len(sorting_list)
jobs_sorted = {k: v for k, v in sorted(jobs.items(), key=custom_order)}
print(jobs_sorted)
The sorted function costs O(n log n) in average time complexity. For a linear time complexity you can instead create a reverse mapping that maps each ID to the corresponding dict entry:
mapping = {d['id']: (k, d) for k, d in jobs.items()}
so that you can then construct a new dict by mapping sorting_list with the ID mapping above:
dict(map(mapping.get, sorting_list))
which, with your sample input, returns:
{'elem_00': {'id': 'zeroth'}, 'elem_01': {'id': 'first'}, 'elem_02': {'id': 'second'}, 'elem_03': {'id': 'third'}, 'elem_04': {'id': 'fourth'}, 'elem_05': {'id': 'fifth'}}
Demo: https://replit.com/#blhsing/WorseChartreuseFonts
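The mapping approach raises an error if sorting_list contains an id that no job has, because mapping.get then returns None. A sketch of one way to tolerate that, wrapped in a function whose name (reorder_by_id) is made up here, with 'third' as a deliberately unmatched id:

```python
def reorder_by_id(jobs, sorting_list):
    """Return a new dict with the entries of jobs ordered as in sorting_list."""
    mapping = {d['id']: (k, d) for k, d in jobs.items()}
    # Skip ids from sorting_list that do not occur in jobs.
    return dict(mapping[id_] for id_ in sorting_list if id_ in mapping)


jobs = {'elem_02': {'id': 'second'}, 'elem_00': {'id': 'zeroth'}, 'elem_01': {'id': 'first'}}
sorting_list = ['zeroth', 'first', 'second', 'third']  # 'third' has no job
ordered = reorder_by_id(jobs, sorting_list)
print(list(ordered))
```

Note the trade-off: jobs whose id is absent from sorting_list are dropped from the result, whereas the sorted-based answers keep them.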
I have a list of elements and a list of dictionaries, and I want to insert each element of the first list into the corresponding dictionary of the second list.
list_elem = [1,2,3]
dict_ele = [{"Name":"Madhu","Age":25},{"Name":"Raju","Age:24},{""Name":"Mani","Age":12}],
OUTPUT As:
[{"ID":1,"Name":"Madhu","Age":25},{"ID":2,"Name":"Raju","Age:24},{"ID":3,"Name":"Mani","Age":12}]
I have tried this way :
dit = [{"id":item[0]} for item in zip(sam)]
# [{"id":1,"id":2,"id":3}]
dic1 = list(zip(dit,data))
print(dic1)
# [({"id":1},{{"Name":"Madhu","Age":25}},{"id":2},{"Name":"Raju","Age:24},{"id":3},{""Name":"Mani","Age":12})]
What is the most efficient way to do this in Python?
Making an assumption here that the OP's original question has a typo in the definition of dict_ele and also that list_elem isn't really necessary.
dict_ele = [{"Name":"Madhu","Age":25},{"Name":"Raju","Age":24},{"Name":"Mani","Age":12}]
dit = [{'ID': id_, **d} for id_, d in enumerate(dict_ele, 1)]
print(dit)
Output:
[{'ID': 1, 'Name': 'Madhu', 'Age': 25}, {'ID': 2, 'Name': 'Raju', 'Age': 24}, {'ID': 3, 'Name': 'Mani', 'Age': 12}]
dict_ele = [{"Name":"Madhu","Age":25},{"Name":"Raju","Age":24},{"Name":"Mani","Age":12}]
list_elem = [1,2,3]
[{'ID': id, **_dict} for id, _dict in zip(list_elem, dict_ele)]
[{'ID': 1, 'Name': 'Madhu', 'Age': 25}, {'ID': 2, 'Name': 'Raju', 'Age': 24}, {'ID': 3, 'Name': 'Mani', 'Age': 12}]
Try this: r = [{'ID': e[0], **e[1]} for e in zip(list_elem, dict_ele)]
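One caveat with zip is that it silently stops at the shorter input, so a mismatch between the two lists would go unnoticed. A minimal sketch with an explicit length check; the function name add_ids is an assumption for illustration.

```python
def add_ids(ids, rows):
    """Return new dicts with an 'ID' key placed first; both inputs must have equal length."""
    if len(ids) != len(rows):
        raise ValueError(f"expected {len(rows)} ids, got {len(ids)}")
    return [{'ID': id_, **row} for id_, row in zip(ids, rows)]


list_elem = [1, 2, 3]
dict_ele = [{"Name": "Madhu", "Age": 25}, {"Name": "Raju", "Age": 24}, {"Name": "Mani", "Age": 12}]
result = add_ids(list_elem, dict_ele)
print(result)
```

The original dictionaries are left unchanged, because {'ID': id_, **row} builds a new dict for each row.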
I have a list of dictionaries:
mydict = [
{'name': 'test1', 'value': '1_1'},
{'name': 'test2', 'value': '2_1'},
{'name': 'test1', 'value': '1_2'},
{'name': 'test1', 'value': '1_3'},
{'name': 'test3', 'value': '3_1'},
{'name': 'test4', 'value': '4_1'},
{'name': 'test4', 'value': '4_2'},
]
I would like to use it to create a dictionary where each value is either a list or a single value, depending on how many times the name occurs in the list above.
Expected output:
outputdict = {
'test1': ['1_1', '1_2', '1_3'],
'test2': '2_1',
'test3': '3_1',
'test4': ['4_1', '4_2'],
}
I tried to do it the way below, but it always returns a list, even when there is just one value element.
outputdict = {}
for item in mydict:
    outputdict.setdefault(item.get('name'), []).append(item.get('value'))
The current output is:
outputdict = {
'test1': ['1_1', '1_2', '1_3'],
'test2': ['2_1'],
'test3': ['3_1'],
'test4': ['4_1', '4_2'],
}
Do what you have already done, and then convert single-element lists afterwards:
outputdict = {
    name: (value if len(value) > 1 else value[0])
    for name, value in outputdict.items()
}
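Put together, the grouping step and the conversion could look like the sketch below. It uses collections.defaultdict in place of the setdefault call from the question, and a shortened sample list.

```python
from collections import defaultdict

mydict = [
    {'name': 'test1', 'value': '1_1'},
    {'name': 'test2', 'value': '2_1'},
    {'name': 'test1', 'value': '1_2'},
]

# Step 1: collect every value under its name, always as a list.
grouped = defaultdict(list)
for item in mydict:
    grouped[item['name']].append(item['value'])

# Step 2: unwrap the lists that hold exactly one value.
outputdict = {
    name: (values if len(values) > 1 else values[0])
    for name, values in grouped.items()
}
print(outputdict)
```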
You can use a couple of standard-library functions, mainly itertools.groupby:
from itertools import groupby
from operator import itemgetter
mydict = [
{'name': 'test1', 'value': '1_1'},
{'name': 'test2', 'value': '2_1'},
{'name': 'test1', 'value': '1_2'},
{'name': 'test1', 'value': '1_3'},
{'name': 'test3', 'value': '3_1'},
{'name': 'test4', 'value': '4_1'},
{'name': 'test4', 'value': '4_2'},
]
def keyFunc(x):
    return x['name']

outputdict = {}
# groupby only groups consecutive items that share the value returned by keyFunc
# (here, the name), so the list is sorted by that key first
for name, groups in groupby(sorted(mydict, key=keyFunc), keyFunc):
    # groups is an iterator over all the items that have the matching name
    values = list(map(itemgetter('value'), groups))
    if len(values) == 1:
        outputdict[name] = values[0]
    else:
        outputdict[name] = values
print(outputdict)
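One thing to keep in mind with itertools.groupby is that it only merges consecutive items with equal keys, so the input generally has to be sorted by the grouping key beforehand. A small demonstration on a list of names shaped like the question's data:

```python
from itertools import groupby

names = ['test1', 'test2', 'test1', 'test1']

# Without sorting, 'test1' shows up as two separate groups.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(names)]
# After sorting, all equal names end up in a single group.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(names))]

print(unsorted_groups)  # [('test1', 1), ('test2', 1), ('test1', 2)]
print(sorted_groups)    # [('test1', 3), ('test2', 1)]
```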
For each item in dictA, I want to search for it in dictB, if dictB has it then I want to pull some other values from dictB and add it to dictA.
An example that is working is here, however it is rather slow as I have 50,000+ items to search through and it will perform this similar function on multiple dicts.
Is there a fast method of performing this search?
dictA = [
{'id': 12345},
{'id': 67890},
{'id': 11111},
{'id': 22222}
]
dictB = [
{'id': 63351, 'name': 'Bob'},
{'id': 12345, 'name': 'Carl'},
{'id': 59933, 'name': 'Amy'},
{'id': 11111, 'name': 'Chris'}
]
for i in dictA:
    name = None
    for j in dictB:
        if i['id'] == j['id']:
            name = j['name']
    i['name'] = name
The dictA output after this would be:
dictA = [
{'id': 12345, 'name': 'Carl'},
{'id': 67890, 'name': None},
{'id': 11111, 'name': 'Chris'},
{'id': 22222, 'name': None}
]
What you have is a list of dicts. Assuming each id is unique, you can build a single dict from dictB that maps id to name, and look the ids up in that instead of scanning the list every time.
dictA = [
{'id': 12345},
{'id': 67890},
{'id': 11111},
{'id': 22222}
]
dictB = [
{'id': 63351, 'name': 'Bob'},
{'id': 12345, 'name': 'Carl'},
{'id': 59933, 'name': 'Amy'},
{'id': 11111, 'name': 'Chris'}
]
actual_dictB = dict()
for d in dictB:
    actual_dictB[d['id']] = d['name']

for i in dictA:
    # Each lookup is now O(1), so this loop is O(n) with n = len(dictA).
    # .get() is used rather than .pop() so that repeated ids in dictA still match.
    i['name'] = actual_dictB.get(i['id'], None)
print(dictA)
Follow up for additional question:
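actual_dictB = dict()
for d in dictB:
    id_ = d['id']
    d.pop('id')  # note: this removes 'id' from the dicts in dictB
    actual_dictB[id_] = d

# Default values (None) for every remaining field, used when an id has no match
tmp = dict([(k, None) for k in dictB[0].keys() if k != 'id'])

for i in dictA:
    if i['id'] not in actual_dictB:
        i.update(tmp)
    else:
        i.update(actual_dictB[i['id']])
print(dictA)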
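If dictB should not be modified along the way, the lookup table can be built from copies instead. The following is a sketch under that assumption; the function name enrich, its key parameter, and the 'age' field in the sample data are made up for illustration.

```python
def enrich(records, lookup_rows, key='id'):
    """Add the fields of the matching lookup row to each record, or None values if none matches."""
    # id -> remaining fields, built from copies so lookup_rows stays intact.
    lookup = {row[key]: {k: v for k, v in row.items() if k != key} for row in lookup_rows}
    # All field names seen in the lookup, each mapped to None, for records without a match.
    missing = dict.fromkeys(k for extra in lookup.values() for k in extra)
    for record in records:
        record.update(lookup.get(record[key], missing))
    return records


dictA = [{'id': 12345}, {'id': 67890}]
dictB = [{'id': 12345, 'name': 'Carl', 'age': 30}, {'id': 11111, 'name': 'Chris', 'age': 41}]
enrich(dictA, dictB)
print(dictA)
```

Building the lookup costs one pass over dictB, and each record is then matched with a single dictionary lookup, so the total work grows linearly with the sizes of both lists.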