I want to update the values of the dictionary dict with the values from the dictionary inp, using recursion or a loop.
The structure must not change: the recursion or loop should update the values within the same nested format.
Please suggest a solution that works for any level of nesting, not just for this particular case.
dict={
"name": "john",
"quality":
{
"type1":"honest",
"type2":"clever"
},
"marks":
[
{
"english":34
},
{
"math":90
}
]
}
inp = {
"name" : "jack",
"type1" : "dumb",
"type2" : "liar",
"english" : 28,
"math" : 89
}
Another solution, changing the dict in-place:
dct = {
"name": "john",
"quality": {"type1": "honest", "type2": "clever"},
"marks": [{"english": 34}, {"math": 90}],
}
inp = {
"name": "jack",
"type1": "dumb",
"type2": "liar",
"english": 28,
"math": 89,
}
def change(d, inp):
    if isinstance(d, list):
        for i in d:
            change(i, inp)
    elif isinstance(d, dict):
        for k, v in d.items():
            if not isinstance(v, (list, dict)):
                d[k] = inp.get(k, v)
            else:
                change(v, inp)

change(dct, inp)
print(dct)
Prints:
{
"name": "jack",
"quality": {"type1": "dumb", "type2": "liar"},
"marks": [{"english": 28}, {"math": 89}],
}
First, rename the first dictionary, say to myDict: dict is the name of Python's built-in dictionary type, and assigning to it shadows the built-in.
The below function will do what you are looking for, in a recursive manner.
def recursive_swipe(input_var, updates):
    if isinstance(input_var, list):
        # rebuild the list, updating each entry recursively
        output_var = []
        for entry in input_var:
            output_var.append(recursive_swipe(entry, updates))
    elif isinstance(input_var, dict):
        output_var = {}
        for label in input_var:
            if isinstance(input_var[label], (list, dict)):
                output_var[label] = recursive_swipe(input_var[label], updates)
            else:
                # keep the original value when there is no update for this key
                output_var[label] = updates.get(label, input_var[label])
    else:
        output_var = input_var
    return output_var

myDict = recursive_swipe(myDict, inp)
You may look for more optimal solutions if there are some limits to the formatting of the two dictionaries that were not stated in your question.
I want to get the value of a specific key in a nested json file, without knowing the exact location. So basically looking through all the keys (and nested keys) until it finds the match, and return a dictionary {match: "value"}
Nested json_data:
{
"$id": "1",
"DataChangedEntry": {
"$id": "2",
"PathProperty": "/",
"Metadata": null,
"PreviousValue": null,
"CurrentValue": {
"CosewicWsRefId": {
"Value": "QkNlrjq2HL9bhTQqU8-qH"
},
"Date": {
"Value": "2022-05-20T00:00:00Z"
},
"YearSentToMinister": {
"Value": "0001-01-01T00:00:00"
},
"DateSentToMinister": {
"Value": "0001-01-01T00:00:00"
},
"Order": null,
"Type": {
"Value": "REGULAR"
},
"ReportType": {
"Value": "NEW"
},
"Stage": {
"Value": "ASSESSED"
},
"State": {
"Value": "PUBLISHED"
},
"StatusAndCriteria": {
"Status": {
"Value": "EXTINCT"
},
"StatusComment": {
"EnglishText": null,
"FrenchText": null
},
"StatusChange": {
"Value": "NOT_INITIALIZED"
},
"StatusCriteria": {
"EnglishText": null,
"FrenchText": null
},
"ApplicabilityOfCriteria": {
"ApplicabilityCriteriaList": []
}
},
"Designation": null,
"Note": null,
"DomainEvents": [],
"Version": {
"Value": 1651756761385.1248
},
"Id": {
"Value": "3z3XlCkaXY9xinAbK5PrU"
},
"CreatedAt": {
"Value": 1651756761384
},
"ModifiedAt": {
"Value": 1651756785274
},
"CreatedBy": {
"Value": "G#a"
},
"ModifiedBy": {
"Value": "G#a"
}
}
},
"EventAction": "Create",
"EventDataChange": {
"$ref": "2"
},
"CorrelationId": "3z3XlCkaXY9xinAbK5PrU",
"EventId": "WGxlewsUAHayLHZ2LHvFk",
"EventTimeUtc": "2022-05-06T13:15:31.7463355Z",
"EventDataVersion": "1.0.0",
"EventType": "AssessmentCreatedInfrastructure"
}
Desired return is the value from json_data["DataChangedEntry"]["CurrentValue"]["Date"]["Value"]:
"2022-05-20T00:00:00Z"
So far I've tried a recursive function, but it keeps returning None:
match_dict = {}
def recursive_json(data,attr,m_dict):
    for k,v in data.items():
        if k == attr:
            for k2,v2 in v.items():
                m_dict = {attr, v2}
                print('IF: ',m_dict)
                return m_dict
        elif isinstance(v,dict):
            return recursive_json(v,attr,m_dict)
print('RETURN: ',recursive_json(json_data, "Date", match_dict))
Output:
RETURN: None
I tried removing the second return statement, and it now prints the value I want in the function, but still returns None:
match_dict = {}
def recursive_json(data,attr,m_dict):
    for k,v in data.items():
        if k == attr:
            for k2,v2 in v.items():
                m_dict = {attr, v2}
                print('IF: ',m_dict)
                return m_dict
        elif isinstance(v,dict):
            recursive_json(v,attr,m_dict)
print('RETURN: ',recursive_json(json_data, "Date", match_dict))
Output:
IF: {'Date', '2022-05-20T00:00:00Z'}
RETURN: None
I don't get why it keeps returning None. Is there a better way to return the value I want?
The underlying question is: how can we make multiple recursive calls in a loop, return the recursive result if any of them returns something useful, and fail otherwise?
If we blindly return inside the loop, then only one recursive call can be made. Whatever it returns, gets returned at this level. If it didn't find the useful result, we don't get a useful result.
If we blindly don't return inside the loop, then the values that were returned don't matter. Nothing in the current call makes use of them, so we will finish looping, make all the recursive calls, reach the end of the function... and thus implicitly return None.
The way around this, of course, is to check whether the recursive call returned something useful. If it did, we can return that; otherwise, we keep going. If we reach the end, then we signal that we couldn't find anything useful - that way, if we are being recursively called, the caller can do the right thing.
Assuming that None cannot be a "useful" value, we can naturally use that as the signal. We don't even have to return it explicitly at the end.
After fixing some other typos (we should not overwrite the global built-in dict name, and anyway we don't need to name the dict that we pass in at the start, and the parameter should be m_dict so that it's properly defined when we make the recursive call), we get:
def recursive_json(data, attr, m_dict):
    for k,v in data.items():
        if k == attr:
            for k2,v2 in v.items():
                m_dict = {attr, v2}
                print('IF: ', m_dict)
                return m_dict
        elif isinstance(v,dict):
            result = recursive_json(v, attr, m_dict)
            if result:
                return result
# call it:
recursive_json(json_data, "Date", {})
We can see that the debug trace is printed, and the value is also returned.
Let's improve this a bit:
First off, the inner for k2,v2 in v.items(): loop doesn't make any sense. Again, we can only return once per call, so this would skip any values in the dict after the first. We would be better served just returning v directly. Also, the m_dict parameter doesn't actually help implement the logic; we don't modify it between calls. It doesn't make sense to use a set for our return value, since it's fundamentally unordered; we care about the order here. Finally, we don't need the debug trace any more. That gives us:
def recursive_json(data, attr):
    for k, v in data.items():
        if k == attr:
            return attr, v
        elif isinstance(v,dict):
            result = recursive_json(v, attr)
            if result:
                return result
To get fancier, we can separate the base case from the recursive case, and use more elegant tools for each. To check if any of the keys matches, we can simply check with the in operator. To recurse and return the first fruitful result, the built-in next is useful. We get:
def recursive_json(data, attr):
    if not isinstance(data, dict):
        # reached a leaf, can't search in here.
        return None
    if attr in data:
        # the key is attr itself; there is no loop variable k in this version
        return attr, data[attr]
    candidates = (recursive_json(v, attr) for v in data.values())
    try:
        # the first non-None candidate, if any.
        return next(c for c in candidates if c is not None)
    except StopIteration:
        return None # all candidates were None.
It seems like you're trying to write something like this:
from json import loads
from typing import Any
test_json = """
{
"a": {
"b": {
"value": 1
}
},
"b": {
"value": 2
},
"c": {
"b": {
"value": 3
},
"c": {
"value": 4
}
},
"d": {}
}
"""
json_data = loads(test_json)
def find_value(data: dict, attr: str, depth_first: bool=True) -> (bool, Any):
    # assumes data is a dict, with 'value' attributes for the attr to be found
    # returns [whether value was found]: bool, [actual value]: Any
    for k, v in data.items():
        if k == attr and 'value' in v:
            return True, v['value']
        elif depth_first and isinstance(v, dict):
            if (t := find_value(v, attr, depth_first))[0]:
                return t
    if not depth_first:
        for _, v in data.items():
            if isinstance(v, dict) and (t := find_value(v, attr, depth_first))[0]:
                return t
    return False, None
# returns True, 1 - first 'b' with a 'value', depth-first
print(find_value(json_data, 'b'))
# returns True, 2 - first 'b' with a 'value', breadth-first
print(find_value(json_data, 'b', False))
# returns True, 4 - first 'c' with a 'value' - the 'c' at the root level has no 'value'
print(find_value(json_data, 'c'))
# returns False, None - no 'd' with a value
print(find_value(json_data, 'd'))
# returns False, None - no 'e' in data
print(find_value(json_data, 'e'))
Your own function can return None because you don't actually return the value a recursive call would return. And the default return value for a function is None.
However, your code also doesn't account for the case where there is nothing to be found.
(Note: this solution only works in Python 3.8 or later, due to its use of the walrus operator :=. Of course, it's not that hard to write it without; that's left as an exercise for the reader.)
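For readers on Python versions before 3.8, one possible walrus-free rewrite is sketched below. The name find_value_no_walrus and the sample dict are illustrative assumptions; the sketch also adds an isinstance check before testing 'value' in v, which the original does not have, so non-dict values cannot raise a TypeError.

```python
from typing import Any, Tuple


def find_value_no_walrus(data: dict, attr: str,
                         depth_first: bool = True) -> Tuple[bool, Any]:
    # Same search order as find_value above, using plain assignments.
    for k, v in data.items():
        if k == attr and isinstance(v, dict) and 'value' in v:
            return True, v['value']
        if depth_first and isinstance(v, dict):
            found, value = find_value_no_walrus(v, attr, depth_first)
            if found:
                return True, value
    if not depth_first:
        # Only descend once every key on this level has been checked.
        for v in data.values():
            if isinstance(v, dict):
                found, value = find_value_no_walrus(v, attr, depth_first)
                if found:
                    return True, value
    return False, None


sample = {
    "a": {"b": {"value": 1}},
    "b": {"value": 2},
    "c": {"b": {"value": 3}, "c": {"value": 4}},
    "d": {},
}
print(find_value_no_walrus(sample, 'b'))         # (True, 1)
print(find_value_no_walrus(sample, 'b', False))  # (True, 2)
```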
I am trying to simulate the problem statement using the below program:
import json
class System:
    def __init__(self):
        self.model = "abc"
        self.fwVersion = "123"
        self.prevfwVersion = "456"
        self.safemodeVersion = "5756"
    def __setitem__(self, key, val):
        self.__dict__[key] = val
    def __getitem__(self, key):
        return self.__dict__[key]
    def toJSON(self):
        return self.__dict__

class Mainwall:
    def __init__(self):
        self.system = System()
    def __setitem__(self, key, val):
        self.__dict__[key] = val
    def __getitem__(self, key):
        return self.__dict__[key]
    def toJSON(self):
        return self.__dict__

class ComplexEncoder(json.JSONEncoder):
    def default(self, obj):
        if hasattr(obj, 'toJSON'):
            return obj.toJSON()
        else:
            return json.JSONEncoder.default(self, obj)

fw = Mainwall()

def my_print():
    print(json.dumps(fw.toJSON(), cls=ComplexEncoder, indent=4))

if __name__ == '__main__':
    my_print()
Since a Python dictionary does not preserve insertion order, the output of the above program can have a different key order on each run.
Say, first time it prints:
{
"system": {
"safemodeVersion": "5756",
"prevfwVersion": "456",
"fwVersion": "123",
"model": "abc"
}
}
Second time it prints:
{
"system": {
"fwVersion": "123",
"prevfwVersion": "456",
"safemodeVersion": "5756",
"model": "abc"
}
}
But, in the output I would like to preserve the order in which the class members are initialized. i.e., Exactly as below:
{
"system": {
"model": "abc",
"fwVersion": "123",
"prevfwVersion": "456",
"safemodeVersion": "5756"
}
}
How to achieve the expected output for the same example using OrderedDict() or some other method?
Here is a work-around. Look at the System() class: I created an OrderedDict() instead of four self attributes. Then, in your toJSON(self) method, instead of returning self.__dict__, I return the OrderedDict() I set up earlier.
from collections import OrderedDict

class System:
    def __init__(self, model='abc', fwVersion='123', prevfwVersion='456', safemodeVersion='5756'):
        self.my_ordered_dict = OrderedDict()
        self.my_ordered_dict['model'] = model
        self.my_ordered_dict['fwVersion'] = fwVersion
        self.my_ordered_dict['prevfwVersion'] = prevfwVersion
        self.my_ordered_dict['safemodeVersion'] = safemodeVersion
        # self.model = "abc"
        # self.fwVersion = "123"
        # self.prevfwVersion = "456"
        # self.safemodeVersion = "5756"
    def __setitem__(self, key, val):
        self.__dict__[key] = val
    def __getitem__(self, key):
        return self.__dict__[key]
    def toJSON(self):
        return self.my_ordered_dict
This System() class instead of the one above, with the same code, outputs...
{
"system": {
"model": "abc",
"fwVersion": "123",
"prevfwVersion": "456",
"safemodeVersion": "5756"
}
}
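Upgrading to Python 3.6 solved the problem: from that version on, CPython's dict keeps insertion order (an implementation detail in 3.6, guaranteed by the language since 3.7).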
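To illustrate the point about newer Python versions, here is a small sketch assuming Python 3.7 or later, where insertion order of dict is guaranteed. It reuses a trimmed-down System class from the question and relies on the built-in vars() to read the instance attributes; the keys variable is only for illustration.

```python
import json


class System:
    def __init__(self):
        # Attributes land in the instance __dict__ in assignment order.
        self.model = "abc"
        self.fwVersion = "123"
        self.prevfwVersion = "456"
        self.safemodeVersion = "5756"


keys = list(vars(System()))
print(keys)
# json.dumps walks the dict in its iteration order, so the order is kept.
print(json.dumps({"system": vars(System())}, indent=4))
```

On older interpreters the OrderedDict approach remains the portable option.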
I have a nested dict in Python containing YAML structures like
- id: left_time
type: u2
doc: Time left
and I want to obtain pairs like {id: doc}. For this example I want it to be: {"left_time": "Time left"}. The problem is I need to walk through them recursively.
My attempt is
def get_dict_recursively(search_dict, field):
    fields_found = []
    name = ""
    for key, value in search_dict.items():
        if key == "id":
            name = value
        if key == field:
            fields_found.append({name: value})
        elif isinstance(value, dict):
            results = get_dict_recursively(value, field)
            for result in results:
                fields_found.append({name: result})
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    more_results = get_dict_recursively(item, field)
                    for another_result in more_results:
                        fields_found.append({name: another_result})
    return fields_found
calling it like
get_dict_recursively(dict, "doc")
where
dict = {
meta:
id: foo
title: Foo
types:
data:
seq:
- id: left_time
type: u2
doc: Time left
gps:
seq:
- id: gps_st
type: b2
- id: sats
type: b6
doc: Number of satellites
}
There's a mistake somewhere, but I can't find it.
Let's first state your example data as a dict:
data = {
"meta": {
"id": "foo",
"title": "Foo"
},
"types": {
"data": {
"seq": [
{
"id": "left_time",
"type": "u2",
"doc": "Time left"
}
]
},
"gps": {
"seq": [
{
"id": "gps_st",
"type": "b2"
},
{
"id": "sats",
"type": "b6",
"doc": "Number of satellites"
}
]
}
}
}
Next, we can simplify your recursive function to look like this:
def extract_docs(data):
    result = []
    if isinstance(data, list):
        for d in data:
            result += extract_docs(d)
    elif isinstance(data, dict):
        if "id" in data and "doc" in data:
            result.append((data["id"], data["doc"]))
        else:
            for d in data.values():
                result += extract_docs(d)
    return result
With this you get
>>> dict(extract_docs(data))
{'sats': 'Number of satellites', 'left_time': 'Time left'}
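The same traversal can also be written as a generator, which avoids building intermediate lists at every level. This is only a sketch of an alternative; iter_docs and the inline sample data are hypothetical names, and the matching rule (a dict that has both "id" and "doc") is taken from extract_docs above.

```python
def iter_docs(node):
    """Yield (id, doc) pairs from arbitrarily nested dicts and lists."""
    if isinstance(node, list):
        for item in node:
            yield from iter_docs(item)
    elif isinstance(node, dict):
        if "id" in node and "doc" in node:
            yield node["id"], node["doc"]
        else:
            # No id/doc pair on this level: keep looking further down.
            for value in node.values():
                yield from iter_docs(value)


sample = {
    "meta": {"id": "foo", "title": "Foo"},
    "types": {
        "data": {"seq": [{"id": "left_time", "type": "u2", "doc": "Time left"}]},
        "gps": {"seq": [
            {"id": "gps_st", "type": "b2"},
            {"id": "sats", "type": "b6", "doc": "Number of satellites"},
        ]},
    },
}
print(dict(iter_docs(sample)))
```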
Get the value from a nested dictionary with the help of key path, here is the dict:
json = {
"app": {
"Garden": {
"Flowers": {
"Red flower": "Rose",
"White Flower": "Jasmine",
"Yellow Flower": "Marigold"
}
},
"Fruits": {
"Yellow fruit": "Mango",
"Green fruit": "Guava",
"White Flower": "groovy"
},
"Trees": {
"label": {
"Yellow fruit": "Pumpkin",
"White Flower": "Bogan"
}
}
}
}
The input parameter to the method is a key path separated by dots. For example, given the key path "app.Garden.Flowers.White Flower", it needs to print 'Jasmine'. My code so far:
import json
with open('data.json') as data_file:
j = json.load(data_file)
def find(element, JSON):
    paths = element.split(".")
    # print JSON[paths[0]][paths[1]][paths[2]][paths[3]]
    for i in range(0,len(paths)):
        data = JSON[paths[i]]
        # data = data[paths[i+1]]
        print(data)
find('app.Garden.Flowers.White Flower',j)
This is an instance of a fold. You can either write it concisely like this:
from functools import reduce
import operator

def find(element, json):
    return reduce(operator.getitem, element.split('.'), json)
Or more Pythonically (because reduce() is frowned upon due to poor readability) like this:
def find(element, json):
    keys = element.split('.')
    rv = json
    for key in keys:
        rv = rv[key]
    return rv
j = {"app": {
"Garden": {
"Flowers": {
"Red flower": "Rose",
"White Flower": "Jasmine",
"Yellow Flower": "Marigold"
}
},
"Fruits": {
"Yellow fruit": "Mango",
"Green fruit": "Guava",
"White Flower": "groovy"
},
"Trees": {
"label": {
"Yellow fruit": "Pumpkin",
"White Flower": "Bogan"
}
}
}}
print(find('app.Garden.Flowers.White Flower', j))
I was in a similar situation and found this dpath module. Nice and easy.
I suggest using python-benedict, a Python dict subclass with full keypath support and many utility methods.
You just need to cast your existing dict:
from benedict import benedict

d = benedict(json)
# now your keys support dotted keypaths
print(d['app.Garden.Flowers.White Flower'])
Here is the library and its documentation:
https://github.com/fabiocaccamo/python-benedict
Note: I am the author of this project
Your code depends heavily on no dots ever occurring in the key names, which you might be able to control, but not necessarily.
I would go for a generic solution that takes a list of key names; the list can then be generated, e.g., by splitting a dotted string of key names:
class ExtendedDict(dict):
    """changes a normal dict into one where you can hand a list
    as first argument to .get() and it will do a recursive lookup
    result = x.get(['a', 'b', 'c'], default_val)
    """
    def multi_level_get(self, key, default=None):
        if not isinstance(key, list):
            # call dict.get directly: self.get is rebound to this method below,
            # so self.get(key, default) would recurse forever
            return dict.get(self, key, default)
        # assume that the key is a list of recursively accessible dicts
        def get_one_level(key_list, level, d):
            if level >= len(key_list):
                if level > len(key_list):
                    raise IndexError
                return d[key_list[level-1]]
            return get_one_level(key_list, level+1, d[key_list[level-1]])

        try:
            return get_one_level(key, 1, self)
        except KeyError:
            return default

    get = multi_level_get # if you delete this, you can still use the multi_level-get
Once you have this class it is easy to just transform your dict and get "Jasmine":
json = {
"app": {
"Garden": {
"Flowers": {
"Red flower": "Rose",
"White Flower": "Jasmine",
"Yellow Flower": "Marigold"
}
},
"Fruits": {
"Yellow fruit": "Mango",
"Green fruit": "Guava",
"White Flower": "groovy"
},
"Trees": {
"label": {
"Yellow fruit": "Pumpkin",
"White Flower": "Bogan"
}
}
}
}
j = ExtendedDict(json)
print(j.get('app.Garden.Flowers.White Flower'.split('.')))
will get you:
Jasmine
As with a normal get() on a dict, you get None if the key (list) you specified doesn't exist anywhere in the tree, and you can specify a second parameter as the return value instead of None.
Very close. You need to (as you had in your comment) recursively go through the main JSON object. You can accomplish that by storing the result of the outermost key/value, then using that to get the next key/value, etc. till you're out of paths.
def find(element, JSON):
    paths = element.split(".")
    data = JSON
    for i in range(0,len(paths)):
        data = data[paths[i]]
    print(data)
You still need to watch out for KeyErrors though.
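One way to deal with those KeyErrors is to catch them and fall back to a default. The sketch below is an assumption about how that could look, not part of the original answer; find_or_default and its default parameter are hypothetical names.

```python
def find_or_default(element, data, default=None):
    """Follow a dotted key path; return default if any step is missing."""
    current = data
    for key in element.split("."):
        try:
            current = current[key]
        except (KeyError, TypeError):
            # KeyError: the key is absent on this level.
            # TypeError: the current value cannot be indexed by a string key.
            return default
    return current


j = {"app": {"Garden": {"Flowers": {"White Flower": "Jasmine"}}}}
print(find_or_default("app.Garden.Flowers.White Flower", j))
print(find_or_default("app.Garden.Trees", j, "n/a"))
```

Returning the value instead of printing it keeps the lookup reusable; the caller decides what to do with the result.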
one-liner:
from functools import reduce
a = {"foo" : { "bar" : "blah" }}
path = "foo.bar"
reduce(lambda acc,i: acc[i], path.split('.'), a)
Option 1: the pyats library from Cisco (it's a C extension)
It's fast (measure it with timeit if required)
JavaScript-style usage (bracket lookup, dotted lookup, combined lookup)
A dotted lookup for a missing key raises AttributeError; a bracket or default Python dict lookup raises KeyError.
pip install pyats pyats-datastructures pyats-utils
from pyats.datastructures import NestedAttrDict
item = {"specifications": {"os": {"value": "Android"}}}
path = "specifications.os.value"
x = NestedAttrDict(item)
print(x[path])# prints Android
print(x['specifications'].os.value)# prints Android
print(x['specifications']['os']['value'])#prints Android
print(x['specifications'].os.value1)# raises AttributeError
Option 2: pyats.utils chainget
Also fast (measure it with timeit if required)
from pyats.utils import utils
item = {"specifications": {"os": {"value": "Android"}}}
path = "specifications.os.value"
path1 = "specifications.os.value1"
print(utils.chainget(item,path))# prints Android (string version)
print(utils.chainget(item,path.split('.')))# prints Android (list version)
print(utils.chainget(item,path1))# raises KeyError
Option 3: Python without an external library
Better speed in comparison to lambda.
No separate error handling is required when only the last key is missing (a missing intermediate key still raises TypeError, because dict.get is then called on None).
Readable and concise; can be a utils function/helper in the project.
from functools import reduce
item = {"specifications": {"os": {"value": "Android"}}}
path1 = "specifications.os.value"
path2 = "specifications.os.value1"
def test1():
    print(reduce(dict.get, path1.split('.'), item))
def test2():
    print(reduce(dict.get, path2.split('.'), item))
test1() # prints Android
test2() # prints None
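Note that reduce(dict.get, ...) only returns None quietly when the last key is the missing one. If an intermediate key is missing, dict.get ends up being called on None and raises TypeError. A hedged sketch of a guarded variant follows; safe_step and chain_get are hypothetical helper names, not part of pyats or the standard library.

```python
from functools import reduce

item = {"specifications": {"os": {"value": "Android"}}}


def safe_step(acc, key):
    # Once a level is missing (or is not a dict), keep propagating None.
    return acc.get(key) if isinstance(acc, dict) else None


def chain_get(mapping, dotted_path):
    """Dotted lookup that returns None for any missing level."""
    return reduce(safe_step, dotted_path.split("."), mapping)


print(chain_get(item, "specifications.os.value"))
print(chain_get(item, "specifications.family.value"))

try:
    reduce(dict.get, "specifications.family.value".split("."), item)
except TypeError as exc:
    print("plain dict.get failed:", type(exc).__name__)
```

The guard costs one isinstance check per level, so the speed advantage over a lambda may shrink; timeit can confirm for a given workload.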
I wrote a function that also works with lists inside the dict.
d = {'test': [
{'value1': 'val'},
{'value1': 'val2'}]}
def find_element(keys: list, dictionary: dict):
    rv = dictionary
    if not keys:
        # no keys left: the current object is the result
        return rv
    if isinstance(dictionary, dict):
        rv = find_element(keys[1:], rv[keys[0]])
    elif isinstance(dictionary, list):
        if keys[0].isnumeric():
            rv = find_element(keys[1:], dictionary[int(keys[0])])
    else:
        return rv
    return rv
val = find_element('test.1.value1'.split('.'), d)
data = {
"data": {
"author_id": "1",
"text": "hi msg",
"attachments": {
"media_keys": [
"3_16"
]
},
"id": "2",
"edit_history_tweet_ids": [
"2"
]
},
"includes": {
"media": [
{
"media_key": "3_16",
"height": 500,
"type": "photo",
"width": 500,
"url": "https://pbs.twimg.com/media/xxxxxx.png"
}
],
"users": [
{
"id": "1",
"name": "name1",
"username": "username1"
}
]
}
}
def get_value_from_dict(dic_obj, keys: list, default):
    """
    get value from dict with key path.
    :param dic_obj: dict
    :param keys: list of keys (numeric strings index into lists/tuples)
    :param default: default value
    :return: the value at the key path, or default if the path does not exist
    """
    if not dic_obj or not keys:
        return default
    pre_obj = dic_obj
    for key in keys:
        t = type(pre_obj)
        if t is dict and key in pre_obj:
            # a missing key falls through to the default instead of yielding None
            pre_obj = pre_obj[key]
        elif (t is list or t is tuple) and str(key).isdigit() and len(pre_obj) > int(key):
            pre_obj = pre_obj[int(key)]
        else:
            return default
    return pre_obj
print('media_key:', get_value_from_dict(data, 'data.attachments.media_keys'.split('.'), None))
print('username:', get_value_from_dict(data, 'includes.users.0.username'.split('.'), None))
media_key: ['3_16']
username: username1