Given a dictionary like the one below, I'd like to find all elements (keys and data) together with their full root key path. If the data is a string, return the data and the full key path. If the data is a dict, return the first key inside that dict as the data, along with the full key path.
a = {
    "level1_key": {
        "level2_key": {
            "level3_key": {
                "status": "down"
            }
        }
    }
}
For example, the first key has no root and its data is a dict, so return the current key as data and no parent keys.
Key = None
Data = level1_key
The data for level2_key is a dict, so return the current key and its parents.
Key = level1_key
Data = level2_key
Key = level1_key.level2_key
Data = level3_key
Key = level1_key.level2_key.level3_key
Data = status
The last key's data is a string, so return the string as data, along with all its keys.
Key = level1_key.level2_key.level3_key.status
Data = down
Because there are 5 elements in the dict (4 keys and 1 string), I would end up with 5 tuples of key paths and data.
The reason behind this is that each element represents configuration: if, say, "status" needs changing to "down", what actually needs changing is level1_key.level2_key.level3_key.status.
Here is a non-working example I wrote. It gets all the strings, their keys, and partial root paths, but it doesn't quite get all the root keys:
current_key = ""
key_list = []
def thing(data):
    global current_key
    global key_list
    if isinstance(data, dict):
        for key, value in data.items():
            current_key = key
            key_list.append(key)
            thing(value)
    elif isinstance(data, str):
        print(f"Key: {current_key}")
        print(f"Key List: {key_list}")
        print(f"Data: {data}")
        print("##########")
        key_list = []
Data is:
{
"100": {
"status": "down"
},
"200": {
"status1": "up",
"status2": "down",
"status3": {
"nested": "more_data"
}
}
}
Key: status
Key List: [100, 'status']
Data: down
##########
Key: status1
Key List: [200, 'status1']
Data: up
##########
Key: status2
Key List: ['status2']
Data: down
##########
Key: nested
Key List: ['status3', 'nested']
Data: more_data
##########
For example, the last entry, "more_data", is missing its root key of 200.
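One way to keep the full path is to pass it down as an argument instead of storing it in globals, so each branch of the recursion carries its own copy. The following is only a sketch under that assumption; the function name `walk` and the tuple-based `path` parameter are illustrative and not part of the original code.

```python
def walk(data, path=()):
    """Yield (parent_path, data) pairs for every key and every string leaf.

    `walk` and `path` are illustrative names, not from the question.
    """
    if isinstance(data, dict):
        for key, value in data.items():
            # Report the key itself as data, together with the path of its parents.
            yield ".".join(path) or None, key
            # Recurse with the key appended. Tuples are immutable, so sibling
            # branches never see each other's path.
            yield from walk(value, path + (str(key),))
    elif isinstance(data, str):
        # A string leaf is reported with the full path of keys leading to it.
        yield ".".join(path) or None, data


a = {"level1_key": {"level2_key": {"level3_key": {"status": "down"}}}}
for key_path, item in walk(a):
    print(f"Key = {key_path}")
    print(f"Data = {item}")
```

For the dictionary `a` this yields five pairs, starting with `(None, 'level1_key')` and ending with `('level1_key.level2_key.level3_key.status', 'down')`.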
Disclaimer: I've been at this for about a week, and it's entirely possible that I've come up with the solution, but I missed it in my troubleshooting. Also, the INI files can be over 200 lines long and 10 deep with combinations of dictionaries and lists.
Situation: I maintain a couple dozen applications, and each application has a JSON formatted INI file that tracks certain system settings. On my computer, I aggregated all those INI files into a single file and then collapsed the structure. That collapsed structure is the unique keys from all those INI files, followed by all the possible values that each key has, and then what I may want each value to be replaced with (see examples below).
Goal: When I need to make configuration changes in those applications, I want to instead make the value changes in my JSON file and then use a Python script to replace the matching key-value pairs in all those other system files.
Simplifications:
I understand opening and writing the files; my problem is the parsing.
The recursion will always end with a key-value pair where the type(value) is str.
Sample INI file from one of those applications
{
"Version": "3.24.2",
"Package": [
{
"ID": "42",
"Display": "4",
"Driver": "E10A"
}, {
"ID": "50",
"Display": "1",
"Driver": "E12A"
}
]
}
My change file
Example use: If I want to replace all instances of {"Display":"1"} with {"Display":"10"}, then all I have to do is put a 10 between the double quotes below ... {"Display": {"1": ""}} to {"Display": {"1": "10"}}
{
"Version": {
"3.24.2": "",
"42.1": "",
"2022-10-1": ""
},
"ID": {
"42": "",
"50": ""
},
"Display": {
"1": "",
"4": ""
},
"Driver": {
"01152003.1": "",
"E10A": "",
"E12A": ""
}
}
Attempt 1
I read that Python assigns values like a C pointer, but that was not my experience with this attempt. There are no errors, and the data variable never changes.
def RecursiveSearch(val, key=None):
    if isinstance(val, dict):
        for k, v in val.items():
            RecursiveSearch(v, k)
    elif isinstance(val, list):
        for v in val:
            RecursiveSearch(v, key)
    elif isinstance(val, str):
        # Is the key being tracked in my change file
        if key in ChangeFile:
            # Is that key's value being tracked in my change file
            if val in ChangeFile[key].keys():
                # Find the matching key-value and apply the replacement value
                for k, v in ChangeFile[key].items():
                    # Only replace the value if it has something to replace it with
                    if k == val and v != "":
                        key[val] = v
data = open('config.ini', 'w', encoding='UTF-8', errors='ignore')
data = convertJSON(data)
ChangeFile = open('change.json', 'r', encoding='UTF-8', errors='ignore')
ChangeFile = convertJSON(data)
data = RecursiveSearch(val=data, key=None)
print(data)
Attempt 2
Same code but with return values. In this attempt the data is completely replaced with the last key-value pair the recursion looked at.
def RecursiveSearch(val, key=None):
    if isinstance(val, dict):
        for k, v in val.items():
            tmp = RecursiveSearch(v, k)
            if tmp != {k: v}:
                return tmp
        return val
    elif isinstance(val, list):
        for v in val:
            tmp = RecursiveSearch(v, key)
            if v != tmp:
                return tmp
        return val
    elif isinstance(val, str):
        # Is the key being tracked in my change file
        if key in ChangeFile:
            # Is that key's value being tracked in my change file
            if val in ChangeFile[key].keys():
                # Find the matching key-value and apply the replacement value
                for k, v in ChangeFile[key].items():
                    # Only replace the value if it has something to replace it with
                    if k == val and v != "":
                        return v
                    else: return val
                else: return val
            else: return val
        else: return val
    # Return edited data after the recursion uncoils
    return {key: val}
data = open('config.ini', 'w', encoding='UTF-8', errors='ignore')
data = convertJSON(data)
ChangeFile = open('change.json', 'r', encoding='UTF-8', errors='ignore')
ChangeFile = convertJSON(data)
data = RecursiveSearch(val=data, key=None)
print(data)
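A minimal sketch of one way to make the replacement stick, assuming the config and the change file have already been parsed into Python objects. The key point is that the new value has to be written through the containing dict (`node[key] = new`); rebinding a local variable inside the recursion leaves the container untouched. The names `apply_changes`, `config`, and `changes` are illustrative, and strings that sit directly inside a list are left alone in this simplified version.

```python
import json


def apply_changes(node, changes):
    """Replace tracked string values in place inside nested dicts and lists.

    `apply_changes` and `changes` are illustrative names. `changes` has the
    shape of the change file: {key: {old_value: new_value_or_empty_string}}.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, str):
                new = changes.get(key, {}).get(value, "")
                if new != "":
                    # Assign through the container; only the value of an
                    # existing key changes, so iterating stays safe.
                    node[key] = new
            else:
                apply_changes(value, changes)
    elif isinstance(node, list):
        for item in node:
            apply_changes(item, changes)


config = json.loads("""
{"Version": "3.24.2",
 "Package": [{"ID": "42", "Display": "4", "Driver": "E10A"},
             {"ID": "50", "Display": "1", "Driver": "E12A"}]}
""")
changes = {"Display": {"1": "10", "4": ""}}
apply_changes(config, changes)
print(config["Package"][1]["Display"])
```

Because the function mutates `config` in place, there is nothing to return and no need to reassign `data` from the call.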
I have a huge nested JSON file and I want to get the values of "text", but only at a certain level, as there are many "text" keys deeper in the file. The level I mean is the "text": "Hi" that follows "event": "user".
The file looks like this:
`
{
"_id":{
"$oid":"123"
},
"events":[
{
"event":"action",
"metadata":{
"model_id":"12"
},
"action_text":null,
"hide_rule_turn":false
},
{
"event":"user",
"text":"Hi",
"parse_data":{
"intent":{
"name":"greet",
"confidence":{
"$numberDouble":"0.9601748585700989"
}
},
"entities":[
],
"text":"Hi",
"metadata":{
},
"text_tokens":[
[
{
"$numberInt":"0"
},
{
"$numberInt":"2"
}
]
],
"selector":{
"ideas":{
"response":{
"responses":[
{
"text":"yeah"
},
{
"text":"No"
},
{
"text":"Goo"
}
]
},
`
First I used this function to get the text data, but of course it gave me all of them:
def json_extract(obj, key):
    """Recursively fetch values from nested JSON."""
    arr = []
    def extract(obj, arr, key):
        """Recursively search for values of key in JSON tree."""
        if isinstance(obj, dict):
            for k, v in obj.items():
                if isinstance(v, (dict, list)):
                    extract(v, arr, key)
                elif k == key:
                    arr.append(v)
        elif isinstance(obj, list):
            for item in obj:
                extract(item, arr, key)
        return arr
    values = extract(obj, arr, key)
    return values
I also tried to access only the second level with this code, but it gave me a KeyNotFound error:
for i in data["events"][0]:
    print(i["text"])
Maybe because that key is not in every nested list? ... I really don't know what else I could do
Since events is a list, you can write a list comprehension (if there are multiple items you need), or you can use the next function to get an element that you need from the iterator:
event = next(e for e in data.get('events', list()) if e.get('event')=='user')
print(event.get('text', ''))
Using the get method means no exception is thrown if the key doesn't exist in the dictionary.
Edit:
If you need this for all events:
all_events = [e for e in data.get('events', list()) if e.get('event')=='user']
for event in all_events:
    print(event.get('text', ''))
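One caveat with `next`: when no element matches, calling it on a generator raises StopIteration. Passing a default as the second argument avoids that. The sketch below uses a cut-down stand-in for the real file; the sample `data` and the variable names `first_user` and `missing` are assumptions for illustration.

```python
# A reduced stand-in for the real document (assumed shape).
data = {
    "events": [
        {"event": "action", "action_text": None},
        {"event": "user", "text": "Hi", "parse_data": {"text": "Hi"}},
    ]
}

# The second argument to next() is returned when no event matches,
# instead of StopIteration being raised.
first_user = next(
    (e for e in data.get("events", []) if e.get("event") == "user"),
    None,
)
text = first_user.get("text", "") if first_user is not None else ""
print(text)

# No event has the type "bot", so the default comes back.
missing = next((e for e in data["events"] if e.get("event") == "bot"), None)
print(missing)
```

Only the top-level "text" of the matching event is read, so the deeper "text" keys under "parse_data" are never touched.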
Convert your JSON to a Python dictionary (e.g., json.load or json.loads depending on how you're accessing the JSON). Then just pass a reference to the dictionary to this:
def json_extract(jdata):
    assert isinstance(jdata, dict)
    arr = []
    def _extract(d, arr):
        if 'event' in d and (t := d.get('text')):
            arr.append(t)
        for k, v in d.items():
            if k not in {'event', 'text'}:
                if isinstance(v, list):
                    for e in v:
                        if isinstance(e, dict):
                            _extract(e, arr)
                elif isinstance(v, dict):
                    _extract(v, arr)
        return arr
    return _extract(jdata, arr)
This will return a list of all values associated with the key 'text', provided that key is found in a dictionary that also has an 'event' key.
I have a JSON file that looks like this:
data = {
"x": {
"y": {
"key": {
},
"w": {
}
}}}
I have converted it into a dict in Python to then parse through it and look for keys, using the following code:
entry = input("Search JSON for the following: ")  # search for "key"
if entry in data:
    print(entry)
else:
    print("Not found.")
However, even when I input "key" as entry, it still prints "Not found." Do I need to control the depth of data? What if I don't know the location of "key" but still want to search for it?
Your method is not working because key is not a key in data. data has one key: x. So you need to look at the dictionary and see if the key is in it. If not, you can pass the next level dictionaries back to the function recursively. This will find the first matching key:
data = {
"x": {
"y": {
"key": "some value",
"w": {}
}}}
key = "key"
def findValue(key, d):
    if key in d:
        return d[key]
    for v in d.values():
        if isinstance(v, dict):
            found = findValue(key, v)
            if found is not None:
                return found
findValue(key, data)
# 'some value'
It will return None if your key is not found
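Two limits of returning None for "not found" are worth noting: a key whose stored value really is None (JSON null) looks the same as a missing key, and dicts nested inside lists are never visited. The following is a sketch of one way around both, using a sentinel object and an explicit stack. The names `find_value` and `_MISSING` are illustrative, and the order in which branches are searched may differ from the recursive version.

```python
_MISSING = object()  # sentinel: distinguishes "not found" from a stored None


def find_value(key, node, default=_MISSING):
    """Return a value stored under `key` somewhere in nested dicts/lists.

    `find_value` and `_MISSING` are illustrative names. Raises KeyError when
    the key is absent and no default was given.
    """
    stack = [node]
    while stack:
        current = stack.pop()
        if isinstance(current, dict):
            if key in current:
                return current[key]
            stack.extend(current.values())
        elif isinstance(current, list):
            # Lists are searched too, so dicts inside arrays are reachable.
            stack.extend(current)
    if default is _MISSING:
        raise KeyError(key)
    return default


data = {"x": {"y": [{"key": None}], "w": {}}}
print(find_value("key", data))
print(find_value("nope", data, default="n/a"))
```

With an explicit default the caller decides what "missing" means, and without one a KeyError makes the failure visible.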
Here's an approach which allows you to collect all the values from a nested dict, if the keys are repeated at different levels of nesting. It's very similar to the above answer, just wrapped in a function with a nonlocal list to hold the results:
def foo(mydict, mykey):
    result = []
    num_recursive_calls = 0
    def explore(mydict, mykey):
        #nonlocal result #allow successive recursive calls to write to list
        #actually this is unnecessary in this case! Here
        #is where we would need it, for a call counter:
        nonlocal num_recursive_calls
        num_recursive_calls += 1
        for key in mydict.keys(): #get all keys from that level of nesting
            if mykey == key:
                print(f"Found {key}")
                result.append({key:mydict[key]})
            elif isinstance(mydict.get(key), dict):
                print(f"Found nested dict under {key}, exploring")
                explore(mydict[key], mykey)
    explore(mydict, mykey)
    print(f"explore called {num_recursive_calls} times") #see above
    return result
For example, with
data = {'x': {'y': {'key': {}, 'w': {}}}, 'key': 'duplicate'}
This will return:
[{'key': {}}, {'key': 'duplicate'}]
I'm receiving JSON files in the wrong format and am trying to create a temporary fix until the documents arrive in the proper format. Instead of the value being assigned directly to the derivationValue key, it arrives as a nested key-value pair, so there is an extraneous key. I want to set the inner value as the value of the outer key.
{
"derivationName": "other_lob_subcd",
"derivationValue": {
"OOP3": "TENANT"
}
}
Given the above json, I want the result to be
{
"derivationName": "other_lob_subcd",
"derivationValue": "TENANT"
}
I could also live with
{
"derivationName": "other_lob_subcd",
"derivationValue": "OOP3-TENANT"
}
or something like that. It just can't be another json element.
Based on @Diana Ayala's answer, I have written this to try to solve the problem with variable keys.
for k,v in data['mcsResults']['derivationsOutput']:
    if isinstance(k['derivationValue'], dict):
        for sk, sv in k['derivationValue']:
            k['derivationValue'] = sv
You can use the generic code below for your requirement.
import json
filePath = 'file.json'
def getValue(value):
    if type(value) is dict:
        ans = list(value)[0]
        for k in value:
            ans += '-'+getValue(value[k])
        return ans
    return value
def correct(data):
    for key in data:
        data[key] = getValue(data[key])
    return data
if __name__ == "__main__":
    with open(filePath) as fp:
        data = json.load(fp)
    data = correct(data)
    print(data)
output:
D:\>python file.py
{'derivationName': 'other_lob_subcd', 'derivationValue': 'OOP3-TENANT'}
For the example given:
import json
with open('inpt.txt') as json_file:
    data = json.load(json_file)
data['derivationValue'] = data['derivationValue']['OOP3']
Gives the output:
{'derivationName': 'other_lob_subcd', 'derivationValue': 'TENANT'}
In general, you can look at the solutions here.
You can do something like this:
val = {
"derivationName": "other_lob_subcd",
"derivationValue": {
"OOP3": "TENANT"
}
}
val["derivationValue"] = val["derivationValue"]["OOP3"]
print(val)
This will be the output:
val = {
"derivationName": "other_lob_subcd",
"derivationValue": "TENANT"
}
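When the inner key is not known in advance (it will not always be "OOP3"), the single entry can be pulled out without naming it. This is a sketch assuming the extraneous dict always holds exactly one entry; `flatten_single` and its `joiner` parameter are hypothetical names introduced for illustration.

```python
def flatten_single(value, joiner=None):
    """Collapse a one-entry dict into its inner value.

    `flatten_single` and `joiner` are illustrative names. With joiner=None the
    inner value is returned; with a string such as "-" the inner key and
    value are joined into one string.
    """
    if isinstance(value, dict) and len(value) == 1:
        inner_key, inner_value = next(iter(value.items()))
        if joiner is None:
            return inner_value
        return f"{inner_key}{joiner}{inner_value}"
    return value  # anything else is left unchanged


record = {
    "derivationName": "other_lob_subcd",
    "derivationValue": {"OOP3": "TENANT"},
}
record["derivationValue"] = flatten_single(record["derivationValue"])
print(record)
```

Calling `flatten_single({"OOP3": "TENANT"}, "-")` gives the joined form "OOP3-TENANT" instead.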
I have a dict, let's say mydict.
I also know about this JSON, let's say myjson:
{
"actor":{
"name":"",
"type":"",
"mbox":""
},
"result":{
"completion":"",
"score":{ "scaled":"" },
"success":"",
"timestamp":""
},
"verb":{
"display":{
"en-US":""
},
"id":""
},
"context":{
"location":"",
"learner_id": "",
"session_id": ""
},
"object":{
"definition":{
"name":{
"en-US":""
}
},
"id":"",
"activity_type":""
}
}
I want to know if ALL of myjson's keys (with the same hierarchy) are in mydict. I don't care if mydict has more data in it. How do I do this in Python?
Make a dictionary of myjson
import json
with open('myjson.json') as j:
    new_dict = json.loads(j.read())
Then go through each key of that dictionary, and confirm that the value of that key is the same in both dictionaries
def compare_dicts(new_dict, mydict):
    for key in new_dict:
        if key in mydict and mydict[key] == new_dict[key]:
            continue
        else:
            return False
    return True
EDIT:
A little more complex, but something like this should suit your needs:
def compare(n, m):
    for key in n:
        if key in m:
            if m[key] == n[key]:
                continue
            elif isinstance(n[key], dict) and isinstance(m[key], dict):
                if compare(n[key], m[key]):
                    continue
                else:
                    return False
            else:
                return False
        else:
            # key is missing from m
            return False
    return True
If you care about the values as well as the keys (at the top level only), you can do this:
>>> all(v in mydict.items() for v in myjson.items())
True
This will be true if every top-level key-value pair of myjson is also in mydict, even if mydict has other keys.
Edit: If you only care about the keys, use this:
>>> all(v in mydict.keys() for v in myjson.keys())
True
This returns True if every top-level key of myjson is in mydict, even if they point to different values. Nested keys are not checked.
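To check the whole hierarchy while ignoring values, the key test can be applied recursively wherever the template has nested keys. This is a sketch under the assumption that both objects are already Python dicts; `has_all_keys` is an illustrative name, and the sample `myjson` and `mydict` below are shortened stand-ins for the real data.

```python
def has_all_keys(template, candidate):
    """Return True if every key in `template` exists at the same place in `candidate`.

    Values are ignored; only the key hierarchy is compared. `has_all_keys`
    is an illustrative name.
    """
    if not isinstance(candidate, dict):
        return False
    for key, sub in template.items():
        if key not in candidate:
            return False
        # Descend only where the template itself has nested keys.
        if isinstance(sub, dict) and sub:
            if not has_all_keys(sub, candidate[key]):
                return False
    return True


myjson = {"actor": {"name": "", "mbox": ""}, "result": {"score": {"scaled": ""}}}
mydict = {
    "actor": {"name": "Ann", "mbox": "mailto:ann@example.com", "extra": 1},
    "result": {"score": {"scaled": 0.9}, "success": True},
    "id": "abc",
}
print(has_all_keys(myjson, mydict))
```

Extra keys in `mydict` are allowed at every level, which matches the requirement that it may contain more data.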