I have multiple lists of strings which I want to use as the keys of my JSON.
For example, I have the following lists:
['dev', 'aws', 'test']
['dev', 'azure', 'test']
['prod', 'aws', 'test']
Based on that, I want to create the backbone of my JSON with these values representing the keys.
This is the output I would like:
{
"dev": {
"aws": {
"test": ""
},
"azure": {
"test": ""
}
},
"prod": {
"aws": {
"test": ""
}
}
}
My problem is that I need to create it dynamically, as the keys are not static and could change.
I can't figure out a way to create this dynamically and can't seem to find help on the web, so if you have any idea on how to handle this case, I would really appreciate it.
Thanks a lot.
This should work:
def add_elem(json, keys, val):
    parent = json
    for k in keys[:-1]:
        if k not in parent:
            parent[k] = {}
        parent = parent[k]
    parent[keys[-1]] = val
d = {}
paths = [
['dev', 'aws', 'test'],
['dev', 'azure', 'test'],
['prod', 'aws', 'test']
]
for keys in paths:
    add_elem(d, keys, "")
print(d)
Output:
{'dev': {'aws': {'test': ''}, 'azure': {'test': ''}}, 'prod': {'aws': {'test': ''}}}
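Since the question asks for JSON, the nested dict can be handed to the standard json module once it is built. The following is only a sketch: it repeats add_elem so that it runs on its own, and build_tree is a hypothetical helper name that does not appear in the answer.

```python
import json


def add_elem(tree, keys, val):
    """Create nested dicts for keys[:-1], then set the last key to val."""
    parent = tree
    for k in keys[:-1]:
        if k not in parent:
            parent[k] = {}
        parent = parent[k]
    parent[keys[-1]] = val


def build_tree(paths, leaf=""):
    """Hypothetical helper: fold every key path into one nested dict."""
    tree = {}
    for keys in paths:
        add_elem(tree, keys, leaf)
    return tree


paths = [
    ["dev", "aws", "test"],
    ["dev", "azure", "test"],
    ["prod", "aws", "test"],
]
tree = build_tree(paths)
text = json.dumps(tree, indent=2)  # JSON string with the layout from the question
print(text)
```

json.dumps only produces the string; the dict itself stays available for further changes.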
A flexible recursive solution that works for key lists of varying length and for keys or lists added later.
The function takes a dictionary and a list. It takes the first item from the list and, if that key does not exist in the dictionary, creates it. If more items are left in the list, the key is assigned a new dict, and the function passes that dict and the remaining items back to itself.
If no items are left in the list, the last item is created as a key with a string value of ''.
def build_dict(my_dict, my_list):
    key, *data = my_list
    if key not in my_dict:
        my_dict[key] = {} if data else ''
    if data:
        build_dict(my_dict[key], data)
my_lists = [
['dev', 'aws', 'test'],
['dev', 'azure', 'test'],
['dev', 'azure', 'preprod', 'more', 'long', 'length'],
['prod', 'aws', 'test'],
['prod', 'short']
]
my_dict = {}
for list_data in my_lists:
    build_dict(my_dict, list_data)
print(my_dict)
OUTPUT
{'dev': {'aws': {'test': ''}, 'azure': {'test': '', 'preprod': {'more': {'long': {'length': ''}}}}}, 'prod': {'aws': {'test': ''}, 'short': ''}}
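One case the sample data does not cover is a key that is first a leaf and later a branch, for example ['prod', 'short'] followed by ['prod', 'short', 'deeper']. With build_dict as written, the second path would try to assign into the '' string and raise a TypeError. A possible guarded variant is sketched below; build_dict_safe and the 'deeper' key are made up for illustration, and promoting the leaf to a dict is just one way to resolve the conflict.

```python
def build_dict_safe(my_dict, my_list, leaf=""):
    """Like build_dict, but turns an existing leaf into a dict when a longer path needs it."""
    key, *rest = my_list  # raises ValueError on an empty list, same as build_dict
    if not rest:
        # Last key: only set the leaf if nothing is stored there yet,
        # so an existing subtree is not overwritten.
        my_dict.setdefault(key, leaf)
        return
    if not isinstance(my_dict.get(key), dict):
        # Missing key, or a leaf left by a shorter path: start a subtree.
        my_dict[key] = {}
    build_dict_safe(my_dict[key], rest, leaf)


result = {}
for path in (["prod", "short"], ["prod", "short", "deeper"], ["prod"]):
    build_dict_safe(result, path)
print(result)  # {'prod': {'short': {'deeper': ''}}}
```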
This is one approach using dict.setdefault
Ex:
data = [
['dev', 'aws', 'test'],
['dev', 'azure', 'test'],
['prod', 'aws', 'test']
]
result = {}
for k,v, m in data:
    result.setdefault(k, {}).setdefault(v, {}).update({m: ''})
print(result)
Output:
{'dev': {'aws': {'test': ''}, 'azure': {'test': ''}},
'prod': {'aws': {'test': ''}}}
If we treat the input as a list of lists:
lol = [
['dev', 'aws', 'test'],
['dev', 'azure', 'test'],
['dev', 'azure', 'test2']
]
Then you can use the following:
d = {}
for lst in lol:
    use = d  # make sure that for each list we refer to the base dictionary
    *head, tail = lst  # unpack so we can...
    for key in head:  # ... loop over each except the trailing key
        use = use.setdefault(key, {})  # default to a dict and point `use` to the result
    use[tail] = ''  # assign blank string to last element
This then gives you d as:
{
"dev": {
"aws": {
"test": ""
},
"azure": {
"test": "",
"test2": ""
}
}
}
Simply with collections.defaultdict:
from collections import defaultdict
import pprint
lists = [
['dev', 'aws', 'test'],
['dev', 'azure', 'test'],
['prod', 'aws', 'test']
]
paths = defaultdict(dict)
for k1, k2, k3 in lists:
    if k2 not in paths[k1]: paths[k1][k2] = {}
    paths[k1][k2].update({k3: ""})
pprint.pprint(dict(paths), width=4)
The output:
{'dev': {'aws': {'test': ''},
'azure': {'test': ''}},
'prod': {'aws': {'test': ''}}}
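The defaultdict(dict) above only auto-creates the first level, which is why the second level is still checked by hand. If the paths can have any depth, one common idiom is a defaultdict whose default factory is itself. This is a sketch with a made-up fourth key ('extra') to show a longer path; tree is a hypothetical name.

```python
from collections import defaultdict
import json


def tree():
    """A defaultdict whose missing keys are themselves trees."""
    return defaultdict(tree)


lists = [
    ["dev", "aws", "test"],
    ["dev", "azure", "test"],
    ["prod", "aws", "test", "extra"],  # assumed longer path, not in the original data
]

root = tree()
for *head, tail in lists:
    node = root
    for key in head:
        node = node[key]  # missing keys are created on access
    node[tail] = ""

# defaultdict is a dict subclass, so json.dumps accepts it directly;
# the round trip turns everything back into plain dicts.
plain = json.loads(json.dumps(root))
print(plain)
```

Like the other loop-based answers, this assumes no path treats an existing '' leaf as a branch.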
result = {}  # named result rather than dict to avoid shadowing the built-in
for i in lists:
    try:
        result[i[0]][i[1]][i[2]] = ""
    except KeyError:
        try:
            result[i[0]][i[1]] = {i[2]: ""}
        except KeyError:
            result[i[0]] = {i[1]: {i[2]: ""}}
Output:
{'prod': {'aws': {'test': ''}}, 'dev': {'azure': {'test': ''}, 'aws': {'test': ''}}}
I have a dictionary in which some values are lists. I need to convert each list into a list of dictionaries and put it in place of the original list.
Basically, I have this dictionary
Dic = {
'name': 'P1',
'srcintf': 'IntA',
'dstintf': 'IntB',
'srcaddr': 'IP1',
'dstaddr': ['IP2', 'IP3', 'IP4'],
'service': ['P_9100', 'SNMP'],
'schedule' : 'always',
}
I need to replace the values that are lists.
Expected output:
Dic = {
'name': 'P1',
'srcintf': 'IntA',
'dstintf': 'IntB',
'srcaddr': 'IP1',
'dstaddr': [
{'name': 'IP2'},
{'name': 'IP3'},
{'name': 'IP4'}
],
'service': [
{'name': 'P_9100'},
{'name': 'SNMP'}
],
'schedule' : 'always',
}
So far I have come up with this code:
for k,v in Dic.items():
    if not isinstance(v, list):
        NewDic = [k,v]
        print(NewDic)
    else:
        values = v
        keys = ["name"]*len(values)
        for item in range(len(values)):
            key = keys[item]
            value = values[item]
            SmallDic = {key : value}
            liste.append(SmallDic)
            NewDic = [k,liste]
which prints this:
['name', 'P1']
['srcintf', 'IntA']
['dstintf', 'IntB']
['srcaddr', 'IP1']
['schedule', 'always']
['schedule', 'always']
I think the problem is with the for loop, but so far I haven't been able to figure it out.
You need to re-create the dictionary. Here is your existing code with some modifications so that it generates a new dictionary, and with the else clause fixed:
NewDic = {}
for k, v in Dic.items():
    if not isinstance(v, list):
        NewDic[k] = v
    else:
        NewDic[k] = [
            {"name": e} for e in v  # loop through the list values & generate a dict for each
        ]
print(NewDic)
Result:
{'name': 'P1', 'srcintf': 'IntA', 'dstintf': 'IntB', 'srcaddr': 'IP1', 'dstaddr': [{'name': 'IP2'}, {'name': 'IP3'}, {'name': 'IP4'}], 'service': [{'name': 'P_9100'}, {'name': 'SNMP'}], 'schedule': 'always'}
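The same rebuild can also be written as a single dict comprehension. This is only an alternative spelling of the loop above, not something the answer itself proposes.

```python
Dic = {
    'name': 'P1',
    'srcintf': 'IntA',
    'dstintf': 'IntB',
    'srcaddr': 'IP1',
    'dstaddr': ['IP2', 'IP3', 'IP4'],
    'service': ['P_9100', 'SNMP'],
    'schedule': 'always',
}

# Lists become lists of {"name": ...} dicts; every other value is kept as-is.
NewDic = {
    k: [{"name": e} for e in v] if isinstance(v, list) else v
    for k, v in Dic.items()
}
print(NewDic)
```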
a =[{
"id":"1",
"Name":'BK',
"Age":'56'
},
{
"id":"1",
"Sex":'Male'
},
{
"id":"2",
"Name":"AK",
"Age":"32"
}]
I have a list of dictionaries in which one person's information is split across multiple dictionaries. For example, the information for id 1 is contained in the first two dictionaries above. How can I get the output below?
{1: {'Name':'BK','Age':'56','Sex':'Male'}, 2: { 'Name': 'AK','Age':'32'}}
You can use a defaultdict to collect the results.
from collections import defaultdict
a =[{ "id":"1", "Name":'BK', "Age":'56' }, { "id":"1", "Sex":'Male' }, { "id":"2", "Name":"AK", "Age":"32" }]
results = defaultdict(dict)
for a_dict in a:
    results[a_dict.pop('id')].update(a_dict)
This gives you:
>>> results
defaultdict(<class 'dict'>, {'1': {'Name': 'BK', 'Age': '56', 'Sex': 'Male'}, '2': {'Name': 'AK', 'Age': '32'}})
The defaultdict type behaves like a normal dict, except that when you look up a key that does not exist, the default factory (here dict) is called and its result is stored under that key and returned. This means that as the dicts in a are iterated over, the values (except for id) are merged into either an existing dict or a newly created one. Note that pop('id') removes the id key from the original dicts in a.
See also: How does collections.defaultdict work?
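To make the "created on first lookup" behaviour concrete, here is a small sketch. The key "3" is made up for the demonstration; everything else uses only collections.defaultdict.

```python
from collections import defaultdict

results = defaultdict(dict)

before = "1" in results             # False: nothing has been stored yet
results["1"].update({"Name": "BK"})  # the lookup creates {} under "1", then update fills it
after = "1" in results              # True
results["1"].update({"Sex": "Male"})  # second lookup reuses the existing inner dict

plain = dict(results)               # convert to a regular dict for printing or returning
missing = results.get("3")          # .get() does not call the default factory

print(plain)    # {'1': {'Name': 'BK', 'Sex': 'Male'}}
print(missing)  # None
```

Converting with dict(results) at the end is optional, but it avoids new keys appearing by accident on later lookups.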
Using defaultdict
from collections import defaultdict
a = [{
"id": "1",
"Name": 'BK',
"Age": '56'
},
{
"id": "1",
"Sex": 'Male'
},
{
"id": "2",
"Name": "AK",
"Age": "32"
}
]
final_ = defaultdict(dict)
for row in a:
    final_[row.pop('id')].update(row)
print(final_)
defaultdict(<class 'dict'>, {'1': {'Name': 'BK', 'Age': '56', 'Sex': 'Male'}, '2': {'Name': 'AK', 'Age': '32'}})
You can combine 2 dictionaries by using the .update() function
dict_a = { "id":"1", "Name":'BK', "Age":'56' }
dict_b = { "id":"1", "Sex":'Male' }
dict_a.update(dict_b) # {'Age': '56', 'Name': 'BK', 'Sex': 'Male', 'id': '1'}
Since the output that you want is in dictionary form:
combined_dict = {}
for item in a:
    id = item.pop("id")  # pop() removes the id key from item and returns the value
    if id in combined_dict:
        combined_dict[id].update(item)
    else:
        combined_dict[id] = item
print(combined_dict) # {'1': {'Name': 'BK', 'Age': '56', 'Sex': 'Male'}, '2': {'Name': 'AK', 'Age': '32'}}
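Two details may matter depending on the use case: pop() changes the dictionaries in a, and the keys stay strings while the question shows integer keys. The sketch below addresses both. merge_by_id is a hypothetical name, and converting with int() assumes every id is a numeric string.

```python
def merge_by_id(rows, id_key="id"):
    """Merge rows sharing the same id without modifying the input rows."""
    combined = {}
    for row in rows:
        key = int(row[id_key])  # assumption: ids are numeric strings such as "1"
        fields = {k: v for k, v in row.items() if k != id_key}
        combined.setdefault(key, {}).update(fields)
    return combined


a = [
    {"id": "1", "Name": "BK", "Age": "56"},
    {"id": "1", "Sex": "Male"},
    {"id": "2", "Name": "AK", "Age": "32"},
]
merged = merge_by_id(a)
print(merged)  # {1: {'Name': 'BK', 'Age': '56', 'Sex': 'Male'}, 2: {'Name': 'AK', 'Age': '32'}}
```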
from collections import defaultdict
result = defaultdict(dict)
a =[{ "id":"1", "Name":'BK', "Age":'56' }, { "id":"1", "Sex":'Male' }, { "id":"2", "Name":"AK", "Age":"32" }]
for b in a:
    result[b['id']].update(b)
print(result)
d = {}
for p in a:
    id = p["id"]
    if id not in d.keys():
        d[id] = p
    else:
        d[id] = {**d[id], **p}
d is the result dictionary you want.
In the for loop, if you encounter an id for the first time, you just store the incomplete value.
If the id is already among the existing keys, you update it.
The combination happens in {**d[id], **p}, where ** unpacks a dict.
It unpacks the existing incomplete dict associated with the id and the current dict, then combines them into a new dict.
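A short sketch of what the unpacking does when both dicts share a key. The "Age": "57" value is made up purely to show which side wins.

```python
existing = {"id": "1", "Name": "BK", "Age": "56"}
incoming = {"id": "1", "Sex": "Male", "Age": "57"}  # "57" is invented to show precedence

# Keys are copied left to right, so values from `incoming` overwrite `existing`.
merged = {**existing, **incoming}

print(merged)    # {'id': '1', 'Name': 'BK', 'Age': '57', 'Sex': 'Male'}
print(existing)  # unchanged: {'id': '1', 'Name': 'BK', 'Age': '56'}
```

On Python 3.9 and later, existing | incoming gives the same result.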
I have a rather deep dict that I need to simplify. And I've encountered some problems by doing that.
Here is a small sample of the dictionary that needs to be simplified:
data_dict = {
"DATA": {
"Page1": [{
"Section": [{
"Name": [{
"text": "John"
}],
"ID_Number": [{
"text": "123456"
}]
}]
}],
"Page2": [{
"Section": [{
"Name": [{
"text": "Rob"
}],
"ID_Number": [{
"text": "654321"
}]
}]
}]
}
}
What I've done already:
my_dict = {}
for value in data_dict.values():
    for key, val in value.items():
        if "Tab" in key:
            my_dict[key] = val
        if type(val) == list:
            for i in val:
                for key1, val1 in i.items():
                    my_dict[key] = val1
result_dict = {}
page_list = []
for keys, values in my_dict.items():
    for val in values:
        if type(val) != str:
            for key1, val1 in val.items():
                for x in val1:
                    result_dict[key1] = x.get('text')
                    page_list.append(result_dict)
    my_dict[keys] = page_list
print("my_dict = ", my_dict)
Current result:
my_dict = {'Page1': [{'Name': 'Rob', 'ID_Number': '654321'}, {'Name': 'Rob', 'ID_Number': '654321'}, {'Name': 'Rob', 'ID_Number': '654321'}, {'Name': 'Rob', 'ID_Number': '654321'}],
 'Page2': [{'Name': 'Rob', 'ID_Number': '654321'}, {'Name': 'Rob', 'ID_Number': '654321'}, {'Name': 'Rob', 'ID_Number': '654321'}, {'Name': 'Rob', 'ID_Number': '654321'}]}
The problem is that result_dict is being appended to page_list more than once which is unnecessary. Also, my approach is very messy. Is there a cleaner way to get the same result?
Desired result:
my_dict = {"Page1": [{"Name": "John", "ID_Number": "123456"}], "Page2": [{"Name": "Rob", "ID_Number": "654321"}]}
Solution 1 (less loops, but added if statements):
If you want to avoid too many nested for loops, you can take advantage of knowing the repeated keys beforehand and use that information to get to the inner keys or values directly.
Reference to dict for solution 1 & 2:
data_dict = {"DATA": {"Page1": [{"Section": [{"Name": [{"text": "John"}],"ID_Number": [{"text": "123456"}]}]}],"Page2": [{"Section": [{"Name": [{"text": "Rob"}],"ID_Number": [{"text": "654321"}]}]}]}}
Code:
# Depth #1
old_dict = data_dict["DATA"]
new_dict = {}
for d1_key in old_dict:
    d2 = old_dict[d1_key][0]["Section"][0]
    for d2_key in d2:
        if d2_key == "Name":
            new_dict[d1_key] = [{d2_key: d2[d2_key][0]["text"]}]
        if d2_key == "ID_Number":
            merge = new_dict[d1_key][0]
            # Merge above if statement (dict merging)
            new_dict[d1_key] = [{**merge, **{d2_key:d2[d2_key][0]["text"]}}]
print(new_dict)
Output:
{'Page1': [{'Name': 'John', 'ID_Number': '123456'}], 'Page2': [{'Name': 'Rob', 'ID_Number': '654321'}]}
Solution 2 (more for loops, more readable):
(Recommended)
Here is a second solution that gives the same desired output. It does not take advantage of information about the keys or values and only looks at the structure of the data. I prefer this one as it is easy to read, modify or extend! Note that it carries the previous field over in k3_temp and v4_temp, so it relies on each section having exactly two fields.
Code:
# Depth #1
old_dict = data_dict["DATA"]
new_dict = {}
unlist = 0
k3_temp = None # instead of merge
v4_temp = None
for k1, v1 in old_dict.items():
    for v2 in v1[unlist].values():  # using values because we don't use the Section key
        for k3, v3 in v2[unlist].items():
            for k4, v4 in v3[unlist].items():
                new_dict[k1] = [{k3_temp:v4_temp, k3:v4}]
                k3_temp = k3
                v4_temp = v4
print(new_dict)
Output:
{'Page1': [{'Name': 'John', 'ID_Number': '123456'}], 'Page2': [{'Name': 'Rob', 'ID_Number': '654321'}]}
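If a section can hold a different number of fields, one option is to build each record in its own dict instead of carrying the previous field in temporary variables. This is a sketch rather than the answer's method: simplify is a hypothetical name, and it assumes every field is a non-empty list whose first entry has a "text" key, as in the sample.

```python
data_dict = {"DATA": {
    "Page1": [{"Section": [{"Name": [{"text": "John"}], "ID_Number": [{"text": "123456"}]}]}],
    "Page2": [{"Section": [{"Name": [{"text": "Rob"}], "ID_Number": [{"text": "654321"}]}]}],
}}


def simplify(data, root="DATA"):
    """Collapse page -> section -> field -> [{'text': ...}] into page -> [record]."""
    result = {}
    for page, page_items in data[root].items():
        records = []
        for page_item in page_items:
            for sections in page_item.values():  # e.g. the "Section" list
                for section in sections:
                    # Assumption: each field holds a list whose first item has "text".
                    records.append(
                        {field: entries[0]["text"] for field, entries in section.items()}
                    )
        result[page] = records
    return result


new_dict = simplify(data_dict)
print(new_dict)
```

Because each record is a fresh dict, nothing leaks from one page or section into the next.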
Just to see another solution with a ridiculous amount of for loops:
new_dic = {}
inner_list = []
for i in data_dict:
    for j in data_dict[i]:
        for k in data_dict[i][j]:
            for m in k:
                for n in k[m]:
                    for x in n:
                        for y in n[x]:
                            for keys, values in y.items():
                                inner_list.append(values)
        new_dic[j] = [{'Name': inner_list[0], 'ID_Number': inner_list[1]}]
        inner_list = []
print(new_dic)
output
{'Page1': [{'Name': 'John', 'ID_Number': '123456'}], 'Page2': [{'Name': 'Rob', 'ID_Number': '654321'}]}
I have JSON data as below.
input_list = [["Richard",[],{"children":"yes","divorced":"no","occupation":"analyst"}],
["Mary",["testing"],{"children":"no","divorced":"yes","occupation":"QA analyst","location":"Seattle"}]]
I have another list where I have the prospective keys present
list_keys = ['name', 'current_project', 'details']
I am trying to create a dict from both to make the data usable for metrics.
I have shortened both lists for the question, but the real data is much larger. input_list is a nested list with 500k+ elements, and each element has 70+ elements of its own (except the details one).
list_keys also has 70+ elements in it.
I was trying to create a dict using zip, but that is not helping given the size of the data. Also, with zip I am not able to exclude the "details" element.
I am expecting output something like this.
[
{
"name": "Richard",
"current_project": "",
"children": "yes",
"divorced": "no",
"occupation": "analyst"
},
{
"name": "Mary",
"current_project" :"testing",
"children": "no",
"divorced": "yes",
"occupation": "QA analyst",
"location": "Seattle"
}
]
I have tried this so far
>>> for line in input_list:
...     zipbObj = zip(list_keys, line)
...     dictOfWords = dict(zipbObj)
...
>>> print dictOfWords
{'current_project': ['testing'], 'name': 'Mary', 'details': {'location': 'Seattle', 'children': 'no', 'divorced': 'yes', 'occupation': 'QA analyst'}}
but with this I am unable to get rid of the nested dict key "details", so I am looking for help with that.
It seems like what you wanted was a list of dictionaries. Here is something I coded up in the terminal and copied in here. Hope it helps.
>>> list_of_dicts = []
>>> for item in input_list:
...     dict = {}
...     for i in range(0, len(item)-2, 3):
...         dict[list_keys[0]] = item[i]
...         dict[list_keys[1]] = item[i+1]
...         dict.update(item[i+2])
...     list_of_dicts.append(dict)
...
>>> list_of_dicts
[{'name': 'Richard', 'current_project': [], 'children': 'yes', 'divorced': 'no', 'occupation': 'analyst'},
 {'name': 'Mary', 'current_project': ['testing'], 'children': 'no', 'divorced': 'yes', 'occupation': 'QA analyst', 'location': 'Seattle'}]
I will mention it is not the ideal method of doing this since it relies on perfectly ordered items in the input_list.
people = input_list = [["Richard",[],{"children":"yes","divorced":"no","occupation":"analyst"}],
["Mary",["testing"],{"children":"no","divorced":"yes","occupation":"QA analyst","location":"Seattle"}]]
list_keys = ['name', 'current_project', 'details']
listout = []
for person in people:
    dict_p = {}
    for key in list_keys:
        if not key == 'details':
            dict_p[key] = person[list_keys.index(key)]
        else:
            subdict = person[list_keys.index(key)]
            for subkey in subdict.keys():
                dict_p[subkey] = subdict[subkey]
    listout.append(dict_p)
listout
The issue with using zip is that you have that additional dictionary in the people list. This will get the following output, and should work through a larger list of individuals:
[{'name': 'Richard',
'current_project': [],
'children': 'yes',
'divorced': 'no',
'occupation': 'analyst'},
{'name': 'Mary',
'current_project': ['testing'],
'children': 'no',
'divorced': 'yes',
'occupation': 'QA analyst',
'location': 'Seattle'}]
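zip can still be used if the nested dictionary is lifted out afterwards. The sketch below zips keys and values, pops the 'details' entry and merges it into the row. flatten_record and flatten_all are hypothetical names, and the generator is only a suggestion for very large inputs since it yields one row at a time.

```python
def flatten_record(record, keys, nested_key="details"):
    """Pair keys with values, then merge the nested dict into the top level."""
    row = dict(zip(keys, record))
    nested = row.pop(nested_key, {})  # drop "details" itself, keep its contents
    row.update(nested)
    return row


def flatten_all(records, keys):
    """Yield one flattened row at a time instead of building everything up front."""
    for record in records:
        yield flatten_record(record, keys)


input_list = [
    ["Richard", [], {"children": "yes", "divorced": "no", "occupation": "analyst"}],
    ["Mary", ["testing"], {"children": "no", "divorced": "yes",
                           "occupation": "QA analyst", "location": "Seattle"}],
]
list_keys = ['name', 'current_project', 'details']

rows = list(flatten_all(input_list, list_keys))
print(rows)
```

As in the answer above, current_project stays a list here; turning it into a string would need an extra step.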
This script goes through every item of input_list and creates a new list of dicts that contain no nested lists or dictionaries:
input_list = [
["Richard",[],{"children":"yes","divorced":"no","occupation":"analyst"}],
["Mary",["testing"],{"children":"no","divorced":"yes","occupation":"QA analyst","location":"Seattle"}]
]
list_keys = ['name', 'current_project', 'details']
out = []
for item in input_list:
    d = {}
    out.append(d)
    for value, keyname in zip(item, list_keys):
        if isinstance(value, dict):
            d.update(**value)
        elif isinstance(value, list):
            if value:
                d[keyname] = value[0]
            else:
                d[keyname] = ''
        else:
            d[keyname] = value
from pprint import pprint
pprint(out)
Prints:
[{'children': 'yes',
'current_project': '',
'divorced': 'no',
'name': 'Richard',
'occupation': 'analyst'},
{'children': 'no',
'current_project': 'testing',
'divorced': 'yes',
'location': 'Seattle',
'name': 'Mary',
'occupation': 'QA analyst'}]
I have a list of dictionaries, themselves with nested lists of dictionaries. All of the nest levels have a similar structure, thankfully. I desire to sort these nested lists of dictionaries. I grasp the technique to sort a list of dictionaries by value. I'm struggling with the recursion that will sort the inner lists.
def reorder(l, sort_by):
    # I have been trying to add a recursion here
    # so that the function calls itself for each
    # nested group of "children". So far, fail
    return sorted(l, key=lambda k: k[sort_by])
l = [
{ 'name': 'steve',
'children': [
{ 'name': 'sam',
'children': [
{'name': 'sally'},
{'name': 'sabrina'}
]
},
{'name': 'sydney'},
{'name': 'sal'}
]
},
{ 'name': 'fred',
'children': [
{'name': 'fritz'},
{'name': 'frank'}
]
}
]
print(reorder(l, 'name'))
def reorder(l, sort_by):
    l = sorted(l, key=lambda x: x[sort_by])
    for item in l:
        if "children" in item:
            item["children"] = reorder(item["children"], sort_by)
    return l
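The function above returns a new top-level list but replaces each "children" list inside the original dictionaries. If the input has to stay untouched, a copying variant could look like the sketch below. reorder_copy and the child_key parameter are assumptions added for illustration.

```python
def reorder_copy(items, sort_by, child_key="children"):
    """Return a sorted copy; nested child lists are sorted recursively."""
    result = []
    for item in sorted(items, key=lambda d: d[sort_by]):
        new_item = dict(item)  # shallow copy so the original dict is not modified
        if child_key in new_item:
            new_item[child_key] = reorder_copy(new_item[child_key], sort_by, child_key)
        result.append(new_item)
    return result


family = [
    {'name': 'steve', 'children': [
        {'name': 'sam', 'children': [{'name': 'sally'}, {'name': 'sabrina'}]},
        {'name': 'sydney'},
        {'name': 'sal'},
    ]},
    {'name': 'fred', 'children': [{'name': 'fritz'}, {'name': 'frank'}]},
]

ordered = reorder_copy(family, 'name')
print([p['name'] for p in ordered])                 # ['fred', 'steve']
print([c['name'] for c in ordered[1]['children']])  # ['sal', 'sam', 'sydney']
```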
Since you state "I grasp the technique to sort a list of dictionaries by value" I will post some code for recursively gathering data from another SO post I made, and leave it to you to implement your sorting technique. The code:
myjson = {
'transportation': 'car',
'address': {
'driveway': 'yes',
'home_address': {
'state': 'TX',
'city': 'Houston'}
},
'work_address': {
'state': 'TX',
'city': 'Sugarland',
'location': 'office-tower',
'salary': 30000}
}
def get_keys(some_dictionary, parent=None):
    for key, value in some_dictionary.items():
        if '{}.{}'.format(parent, key) not in my_list:
            my_list.append('{}.{}'.format(parent, key))
        if isinstance(value, dict):
            get_keys(value, parent='{}.{}'.format(parent, key))
        else:
            pass
my_list = []
get_keys(myjson, parent='myjson')
print(my_list)
This is intended to retrieve all keys recursively from the JSON data. It outputs:
['myjson.address',
'myjson.address.home_address',
'myjson.address.home_address.state',
'myjson.address.home_address.city',
'myjson.address.driveway',
'myjson.transportation',
'myjson.work_address',
'myjson.work_address.state',
'myjson.work_address.salary',
'myjson.work_address.location',
'myjson.work_address.city']
The main thing to note is that if isinstance(value, dict): results in get_keys() being called again, hence the recursive capabilities of it (but only for nested dictionaries in this case).
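Since the question's data nests dictionaries inside lists, the same recursion can be extended to step into lists as well. The sketch below is a variation, not the code from the linked post: iter_key_paths is a hypothetical name, it yields paths instead of appending to a global list, and the sample data (including the "tags" entry and the [index] notation) is made up.

```python
def iter_key_paths(value, parent):
    """Yield dotted key paths, recursing into both dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = "{}.{}".format(parent, key)
            yield path
            yield from iter_key_paths(child, path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # List items have no key of their own, so the index is added to the parent path.
            yield from iter_key_paths(child, "{}[{}]".format(parent, index))


sample = {
    "address": {"home_address": {"state": "TX"}},
    "tags": [{"label": "a"}],  # invented list-of-dicts entry
}
paths = list(iter_key_paths(sample, "myjson"))
print(paths)
```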