I have a very large json file with several nested keys. From whaat I've read so far, if you do:
x = json.loads(data)
Python will interpret it as a dictionary (correct me if I'm wrong). The fourth level of nesting in the json file contains several elements named by an ID number and all of them contain an element called children, something like this:
{"level1":
{"level2":
{"level3":
{"ID1":
{"children": [1,2,3,4,5]}
}
{"ID2":
{"children": []}
}
{"ID3":
{"children": [6,7,8,9,10]}
}
}
}
}
What I need to do is to replace all items in all the "children" elements with nothing, meaning "children": [] if the ID number is in a list called new_ids and then convert it back to json. I've been reading on the subject for a few hours now but I haven't found anything similar to this to try to help myself.
I'm running Python 3.3.3. Any ideas are greatly appreciated!!
Thanks!!
EDIT
List:
new_ids=["ID1","ID3"]
Expected result:
{"level1":
{"level2":
{"level3":
{"ID1":
{"children": []}
}
{"ID2":
{"children": []}
}
{"ID3":
{"children": []}
}
}
}
}
First of all, your JSON is invalid. I assume you want this:
{"level1":
{"level2":
{"level3":
{
"ID1":{"children": [1,2,3,4,5]},
"ID2":{"children": []},
"ID3":{"children": [6,7,8,9,10]}
}
}
}
}
Now, load your data as a dictionary:
>>> with open('file', 'r') as f:
... x = json.load(f)
...
>>> x
{u'level1': {u'level2': {u'level3': {u'ID2': {u'children': []}, u'ID3': {u'children': [6, 7, 8, 9, 10]}, u'ID1': {u'children': [1, 2, 3, 4, 5]}}}}}
Now you can loop over the keys in x['level1']['level2']['level3'] and check whether they are in your new_ids.
>>> new_ids=["ID1","ID3"]
>>> for key in x['level1']['level2']['level3']:
... if key in new_ids:
... x['level1']['level2']['level3'][key]['children'] = []
...
>>> x
{u'level1': {u'level2': {u'level3': {u'ID2': {u'children': []}, u'ID3': {u'children': []}, u'ID1': {u'children': []}}}}}
You can now write x back to a file like this:
with open('myfile', 'w') as f:
f.write(json.dumps(x))
If your new_ids list is large, consider making it a set.
If you have simple dictionary like this
data_dict = {
"level1": {
"level2":{
"level3":{
"ID1":{"children": [1,2,3,4,5]},
"ID2":{"children": [] },
"ID3":{"children": [6,7,8,9,10]},
}
}
}
}
than you need only this:
data_dict = {
"level1": {
"level2":{
"level3":{
"ID1":{"children": [1,2,3,4,5]},
"ID2":{"children": [] },
"ID3":{"children": [6,7,8,9,10]},
}
}
}
}
new_ids=["ID1","ID3"]
for idx in new_ids:
if idx in data_dict['level1']["level2"]["level3"]:
data_dict['level1']["level2"]["level3"][idx]['children'] = []
print data_dict
'''
{
'level1': {
'level2': {
'level3': {
'ID2': {'children': []},
'ID3': {'children': []},
'ID1': {'children': []}
}
}
}
}
'''
but if you have more complicated dictionary
data_dict = {
"level1a": {
"level2a":{
"level3a":{
"ID2":{"children": [] },
"ID3":{"children": [6,7,8,9,10]},
}
}
},
"level1b": {
"level2b":{
"level3b":{
"ID1":{"children": [1,2,3,4,5]},
}
}
}
}
new_ids =["ID1","ID3"]
for level1 in data_dict.values():
for level2 in level1.values():
for level3 in level2.values():
for idx in new_ids:
if idx in level3:
level3[idx]['children'] = []
print data_dict
'''
{
'level1a': {
'level2a': {
'level3a': {
'ID2': {'children': []},
'ID3': {'children': []}
}
}
},
'level1b': {
'level2b': {
'level3b': {
'ID1': {'children': []}
}
}
}
}
'''
Related
I believe there must be a way to point specific key from nested dict, not in the traditional ways.
imagine dictionary like this.
dict1 = { 'level1': "value",
'unexpectable': { 'maybe': { 'onemotime': {'name': 'John'} } } }
dict2 = { 'level1': "value", 'name': 'Steve'}
dict3 = { 'find': { 'what': { 'you': { 'want': { 'in': { 'this': { 'maze': { 'I': { 'made': { 'for': { 'you': { 'which': { 'is in': { 'fact that': { 'was just': { 'bully your': { 'searching': { 'for': { 'the name': { 'even tho': { 'in fact': { 'actually': { 'there': { 'is': { 'in reality': { 'only': { 'one': { 'key': { 'named': { 'name': 'Michael' } } } } } } } } } } } } } } } } } } } } } } } } } } } } } }
in this case, if we want to point 'name' key to get 'John' and 'Steve' and the 'Michael', you should code differently against dict1 and dict2 and dict3
and the traditional way to point the key buried in nested dictionary that I know is this.
print(dict1['unexpectable']['maybe']['onemotime']['name'])
and if you don't want your code to break because of empty value of dict, you may want to use get() function.
and I'm curious that if I want to get 'name' of dict1 safely with get(), should I code like this?
print(dict1.get('unexpectable', '').get('maybe', '').get('onemotime', '').get('name', ''))
in fact, i've got error when run those get().get().get().get() thing.
And please consider if you have to print() 'name' from that horrible dict3 even it has actually only one key.
and, imagine the case you extract 'name' from unknown dict4 which you cannot imagine what nesting structure the dict4 would have.
I believe that python must have a way to deal with this.
I searched on the internet about this problem, but the solutions seems really really difficult.
I just wanted simple solution.
the solution without pointing every keys on the every level.
like just pointing that last level key, the most important key.
like, print(dict1.lastlevel('name')) --> 'John'
like, no matter what structure of nesting levels they have, no matter how many duplicates they have, even if they omitted nested key in the middle of nested dict so that dict17 has one less level of dict16, you could get what you want, the last level value of the last level key.
So Conclusion.
I want to know if there is a simple solution like
print(dict.lastlevel('name'))
without creating custom function.
I want to know if there is solution like above from the default python methods, syntax, function, logic or concept.
The solution like above can be applied to dict1, dict2, dict3, to whatever dict would come.
There is no built-in method to accomplish what you are asking for. However, you can use a recursive function to dig through a nested dictionary. The function checks if the desired key is in the dictionary and returns the value if it is. Otherwise it iterates over the dict's values for other dictionaries and scans their keys as well, repeating until it reaches the bottom.
dict1 = { 'level1': "value",
'unexpectable': { 'maybe': { 'onemotime': {'name': 'John'} } } }
dict2 = { 'level1': "value", 'name': 'Steve'}
dict3 = { 'find': { 'what': { 'you': { 'want': { 'in': { 'this': { 'maze': { 'I': {
'made': { 'for': { 'you': { 'which': { 'is in': { 'fact that': {
'was just': { 'bully your': { 'searching': { 'for': { 'the name': {
'even tho': { 'in fact': { 'actually': { 'there': { 'is': { 'in reality': {
'only': { 'one': { 'key': { 'named': { 'name': 'Michael'
} } } } } } } } } } } } } } } } } } } } } } } } } } } } } }
def get_nested_dict_key(d, key):
if key in d:
return d[key]
else:
for item in d.values():
if not isinstance(item, dict):
continue
return get_nested_dict_key(item, key)
print(get_nested_dict_key(dict1, 'name'))
print(get_nested_dict_key(dict2, 'name'))
print(get_nested_dict_key(dict3, 'name'))
# prints:
# John
# Steve
# Michael
You can make simple recursive generator function which yields value of every particular key:
def get_nested_key(source, key):
if isinstance(source, dict):
key_value = source.get(key)
if key_value:
yield key_value
for value in source.values():
yield from get_nested_key(value, key)
elif isinstance(source, (list, tuple)):
for value in source:
yield from get_nested_key(value, key)
Usage:
dictionaries = [
{'level1': 'value', 'unexpectable': {'maybe': {'onemotime': {'name': 'John'}}}},
{'level1': 'value', 'name': 'Steve'},
{'find': {'what': {'you': {'want': {'in': {'this': {'maze': {'I': {'made': {'for': {'you': {'which': {'is in': {'fact that': {'was just': {'bully your': {'searching': {'for': {'the name': {'even tho': {'in fact': {'actually': {'there': {'is': {'in reality': {'only': {'one': {'key': {'named': {'name': 'Michael'}}}}}}}}}}}}}}}}}}}}}}}}}}}}}},
{'level1': 'value', 'unexpectable': {'name': 'Alex', 'maybe': {'onemotime': {'name': 'John'}}}},
{}
]
for d in dictionaries:
print(*get_nested_key(d, 'name'), sep=', ')
Output:
John
Steve
Michael
Alex, John
I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?
parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This works to combine the name attribute with the intValue as key:value pairs, but the result is a dictionary instead of the original input type which was a list.
Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}
I'm trying to deepmerge lists to get a specific json.
what I want to achieve is this format (the order of the elements is irelevant):
{
"report": {
"context": [{
"report_id": [
"Report ID 30"
],
"status": [
"Status 7"
],
"fallzahl": [
"Fallzahl 52"
],
"izahl": [
"IZahl 20"
]
}
],
"körpergewicht": [{
"any_event_en": [{
"gewicht": [{
"|magnitude": 185.44,
"|unit": "kg"
}
],
"kommentar": [
"Kommentar 94"
],
"bekleidung": [{
"|code": "at0011"
}
]
}
]
}
]
}
}
I try to deepmerge dicts and lists to achieve this specific format. My baseline are some dicts:
{'körpergewicht': [{'any_event_en': [{'gewicht': [{'|magnitude': '100', '|unit': 'kg'}]}]}]}
{'körpergewicht': [{'any_event_en': [{'bekleidung': [{'|code': 'at0013'}]}]}]}
{'körpergewicht': [{'any_event_en': [{'kommentar': ['none']}]}]}
{'context': [{'status': ['fatty']}]}
{'context': [{'fallzahl': ['123']}]}
{'context': [{'report_id': ['123']}]}
{'context': [{'izahl': ['123']}]}
what I tried to do is following I have a dict called tmp_dict in that I hold a baseline dict as I loop through. The so called collect_dict is the dict in that I try to merge my baseline dicts. element holds my current baseline dict.
if (index == (len(element)-1)): #when the baseline dict is traversed completely
if tmp_dict:
first_key_of_tmp_dict=list(tmp_dict.keys())[0]
if not (first_key_of_tmp_dict in collect_dict):
collect_dict.update(tmp_dict)
else:
merge(tmp_dict,collect_dict)
else:
collect_dict.update(tmp_dict)
and I also wrote a merge method:
def merge(tmp_dict,collect_dict):
first_common_key_of_dicts=list(tmp_dict.keys())[0]
second_depth_key_of_tmp_dict=list(tmp_dict[first_common_key_of_dicts][0].keys())[0]
second_depth_tmp_dict=tmp_dict[first_common_key_of_dicts][0]
second_depth_key_of_coll_dict=collect_dict[first_common_key_of_dicts][0]
if not second_depth_key_of_tmp_dict in second_depth_key_of_coll_dict:
collect_dict.update(second_depth_tmp_dict)
else:
merge(second_depth_tmp_dict,second_depth_key_of_coll_dict)
what I'm coming up with goes in the right direction but is far from beeing my desired output:
{"report": {
"k\u00f6rpergewicht": [{
"any_event_en": [{
"kommentar": ["none"]
}
],
"bekleidung": [{
"|code": "at0013"
}
],
"gewicht": [{
"|magnitude": "100",
"|unit": "kg"
}
]
}
],
"context": [{
"fallzahl": ["234"]
}
],
"report_id": ["234"],
"status": ["s"],
"izahl": ["234"]
}
}
With another set of inputs:
{'atemfrequenz': {'context': [{'status': [{'|code': 'at0012'}]}]}},
{'atemfrequenz': {'context': [{'kategorie': ['Kategorie']}]}},
{'atemfrequenz': {'atemfrequenz': [{'messwert': [{'|magnitude': '123', '|unit': 'min'}]}]}}
I would like to achieve the following output:
"atemfrequenz": {
"context": [
{
"status": [
{
"|code": "at0012"
}
],
"kategorie": [
"Kategorie"
]
}
],
"atemfrequenz": [
{
"messwert": [
{
"|magnitude": 123,
"|unit": "/min"
}
]
}
]
}
This code should get the correct answer. I removed the special character (ö) to prevent errors.
dd = [
{'korpergewicht': [{'any_event_en': [{'gewicht': [{'|magnitude': '100', '|unit': 'kg'}]}]}] },
{'korpergewicht': [{'any_event_en': [{'bekleidung': [{'|code': 'at0013'}]}]}]},
{'korpergewicht': [{'any_event_en': [{'kommentar': ['none']}]}]},
{'context': [{'status': ['fatty']}]},
{'context': [{'fallzahl': ['123']}]},
{'context': [{'report_id': ['123']}]},
{'context': [{'izahl': ['123']}]}
]
def merge(d):
if (type(d) != type([])): return d
if (type(list(d[0].values())[0])) == type(""): return d
keys = list(set(list(k.keys())[0] for k in d))
lst = [{k:[]} for k in keys]
for e in lst:
for k in d:
if (list(e.keys())[0] == list(k.keys())[0]):
e[list(e.keys())[0]] += k[list(k.keys())[0]]
for e in lst:
if (type(e[list(e.keys())[0]][0]) == type({})):
e[list(e.keys())[0]] = merge(e[list(e.keys())[0]])
for i in lst[1:]: lst[0].update(i)
lst2 = [] # return list of single dictionary
lst2.append(lst[0])
return lst2
dx = merge(dd)
dx = {'report': dx[0]} # no list at lowest level
print(dx)
Output (formatted)
{'report': {
'korpergewicht': [{
'any_event_en': [{
'kommentar': ['none'],
'bekleidung': [{'|code': 'at0013'}],
'gewicht': [{'|magnitude': '100', '|unit': 'kg'}]}]}],
'context': [{
'report_id': ['123'],
'izahl': ['123'],
'fallzahl': ['123'],
'status': ['fatty']}]}}
Concerning the second data set provided, the data needs to structured to match the previous data set.
This data set works correctly:
dd = [
{'atemfrequenz': [{'context': [{'status': [{'|code': 'at0012'}]}]}]},
{'atemfrequenz': [{'context': [{'kategorie': ['Kategorie']}]}]},
{'atemfrequenz': [{'atemfrequenz': [{'messwert': [{'|magnitude': '123', '|unit': 'min'}]}]}]}
]
Output (formatted)
{'report': {
'atemfrequenz': [{
'atemfrequenz': [{
'messwert': [{'|magnitude': '123', '|unit': 'min'}]}],
'context': [{
'kategorie': ['Kategorie'],
'status': [{'|code': 'at0012'}]}]}]}}
I can't insert my new document value (dict) without overwriting my existing data. I've looked through all different resources and can't find an answer.
I've also though of putting the values from first_level_dict into a list "first_level_dict" : [dict1, dict2] but I won't know how to append the dict eighter.
Sample Data:
# Create the document
target_dict = {
"_id": 55,
"Root_dict": {
"first_level_dict": {
"second_level_dict1": {"Content1": "Value1"}
}
},
"Root_key": "Root_value"
}
collection.insert_one(target_dict)
The result I'm looking for:
result_dict = {
"_id": 55,
"Root_dict": {
"first_level_dict": {
"second_level_dict1": {"Content1": "Value1"},
"second_level_dict2": {"Content2": "Value2"}
}
},
"Root_key": "Root_value"
}
Update: New Values example 2:
# New Values Sample
new_values = {
"_id": 55,
"Root_dict": {
"first_level_dict": {
"secon_level_dict2": {"Content2": "Value2"},
"secon_level_dict3": {"Content3": "Value3"}
}
}
collection.insert_one(target_dict)
Update: The result I'm looking for example 2:
result_dict = {
"_id": 55,
"Root_dict": {
"first_level_dict": {
"second_level_dict1": {"Content1": "Value1"},
"second_level_dict2": {"Content2": "Value2"},
"second_level_dict3": {"Content3": "Value3"},
}
},
"Root_key": "Root_value"
}
What I've tried:
# Update document "$setOnInsert"
q = {"_id": 55}
target_dict = {"$set": {"Root_dict": {"first_level_dict": {"second_level_dict2": {"Content2": "Value2"}}}}}
collection.update_one(q, target_dict)
What I've tried example 2:
# Update document
q = {"_id": 55}
target_dict = {"$set": {"Root_dict.first_level_dict": {
"second_level_dict2": {"Content2": "Value2"},
"second_level_dict3": {"Content3": "Value3"}}}}
collection.update_one(q, target_dict)
Try using the dot notation:
target_dict = {$set: {"Root_dict.first_level_dict.second_level_dict2": {"Content2": "Value2"}}}
Additionally, to update/add multiple fields (for "example 2"):
target_dict = {$set: {
"Root_dict.first_level_dict.second_level_dict2": {"Content2": "Value2"},
"Root_dict.first_level_dict.second_level_dict3": {"Content3": "Value3"}
}
}
I am writing a function that takes 2 strings as inputs and would move a section of the dictionary to another.
def move(item_to_move, destination):
# do something....
My initial dictionary looks like this.
directories = {
'beers': {
'ipa': {
'stone': {}
}
},
'wines': {
'red': {
'cabernet': {}
}
},
'other' : {}
}
I would like to move either a subsection or section of the dictionary to another section. The sections are represented by each key of the path delimited by a '/'. For example, the inputs for my function would be:
item_to_move='beers/ipa'
destination='other'
move(directories, item_to_move,destination)
The output would be:
{
'wines': {
'red': {
'cabernet': {}
},
},
'other' :{
'beers': {
'ipa': {
'stone': {}
} }
},
}
NOTE: I am assuming all input paths for items_to_move are valid.
Find the origin's parent dictionary and the target's dictionary, then update the the target's dictionary with the origin's key and value (removing it from the origin's parent):
def move(tree,originPath,targetPath):
originKey = None
for originName in originPath.split("/"):
originParent = originParent[originKey] if originKey else tree
originKey = originName
targetDict = tree
for targetName in targetPath.split("/"):
targetDict = targetDict[targetName]
targetDict.update({originKey:originParent.pop(originKey)})
output:
directories = {
'beers': {
'ipa': {
'stone': {}
}
},
'wines': {
'red': {
'cabernet': {}
}
},
'other' : {}
}
move(directories,'beers/ipa','other')
print(directories)
{ 'beers': {},
'wines': { 'red': {'cabernet': {}} },
'other': { 'ipa': {'stone': {}} }
}