Drift management of JSON configurations by comparing with dictionary data - python

I am trying to write Python code for drift management that compares an application's JSON configuration with a predefined dictionary of key-value pairs.
Ex: Application configuration in JSON:
{
    "location": "us-east-1",
    "properties": [
        {
            "type": "t2.large",
            "os": "Linux"
        }
    ],
    "sgs": {
        "sgid": "x-1234"
    }
}
Ex: Dictionary with desired values to compare:
{
    "os": "Windows",
    "location": "us-east-1"
}
Expected output:
Difference is:
{
    "os": "Windows"
}
I have been trying to convert the entire JSON (including sub-dicts) into a single flat dict without sub-dicts, and then iterate over it, checking each value of the desired dict. I am able to print all the keys and values line by line but couldn't collect them into a dict.
Is there a better way to do this? Or any references that could help me out here?
import json

def openJsonFile(file):
    with open(file) as json_data:
        workData = json.load(json_data)
        return workData

def recursive_iter(obj):
    if isinstance(obj, dict):
        for item in obj.items():
            yield from recursive_iter(item)
    elif any(isinstance(obj, t) for t in (list, tuple)):
        for item in obj:
            yield from recursive_iter(item)
    else:
        yield obj

data = openJsonFile('file.json')
for item in recursive_iter(data):
    print(item)
Expected output:
{
    "location": "us-east-1",
    "type": "t2.large",
    "os": "Linux",
    "sgid": "x-1234"
}
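One way to get that flat dict, as a minimal sketch building on the generator above: yield (key, value) pairs for leaf values instead of yielding keys and values separately, then feed the pairs to dict(). It carries the same caveat as any flattening approach, namely that duplicate keys silently overwrite each other.
def recursive_iter(obj):
    # Yield (key, value) pairs for every leaf value in nested dicts/lists.
    if isinstance(obj, dict):
        for key, value in obj.items():
            if isinstance(value, (dict, list, tuple)):
                yield from recursive_iter(value)
            else:
                yield key, value
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from recursive_iter(item)

flat = dict(recursive_iter(data))  # {'location': ..., 'type': ..., 'os': ..., 'sgid': ...}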

I think this will do what you say you want. I used the dictionary flattening code in this answer with a small modification — I changed it to not concatenate the keys of parent dictionaries with those of the nested ones since that seems to be what you want. This assumes that the keys used in all nested dictionaries are unique from one another, which in my opinion is a weakness of your approach.
You asked for references that could help you: Searching this website for related questions is often a productive way to find solutions to your own problems. This is especially true when what you want to know is something that has probably been asked before (such as how to flatten nested dictionaries).
Also note that I have written the code to closely follow the PEP 8 - Style Guide for Python Code guidelines — which I strongly suggest you read (and start following as well).
import json

desired = {
    "os": "Windows",
    "location": "us-east-1"
}

def read_json_file(file):
    with open(file) as json_data:
        return json.load(json_data)

def flatten(d):
    out = {}
    for key, val in d.items():
        if isinstance(val, dict):
            val = [val]
        if isinstance(val, list):
            for subdict in val:
                deeper = flatten(subdict).items()
                out.update({key2: val2 for key2, val2 in deeper})
        else:
            out[key] = val
    return out

def check_drift(desired, config):
    drift = {}
    for key, value in desired.items():
        if config[key] != value:
            drift[key] = value
    return drift

if __name__ == '__main__':
    from pprint import pprint

    config = flatten(read_json_file('config.json'))
    print('Current configuration (flattened):')
    pprint(config, width=40, sort_dicts=False)
    drift = check_drift(desired, config)
    print()
    print('Drift:')
    pprint(drift, width=40, sort_dicts=False)
This is the output it produces:
Current configuration (flattened):
{'location': 'us-east-1',
 'type': 't2.large',
 'os': 'Linux',
 'sgid': 'x-1234'}
Drift:
{'os': 'Windows'}
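One caveat worth noting: check_drift raises a KeyError if a desired key is missing from the flattened configuration altogether. If a missing key should also count as drift, a small variation (a sketch, not part of the answer above) handles it with dict.get:
def check_drift(desired, config):
    # Report a key as drifted when it is absent or its value differs.
    drift = {}
    for key, value in desired.items():
        if config.get(key) != value:
            drift[key] = value
    return drift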

Related

How can I get RecursiveSearch() to return the modified data dictionary?

Disclaimer: I've been at this for about a week, and it's entirely possible that I've come up with the solution but missed it in my troubleshooting. Also, the INI files can be over 200 lines long and 10 levels deep, with combinations of dictionaries and lists.
Situation: I maintain a couple dozen applications, and each application has a JSON formatted INI file that tracks certain system settings. On my computer, I aggregated all those INI files into a single file and then collapsed the structure. That collapsed structure is the unique keys from all those INI files, followed by all the possible values that each key has, and then what I may want each value to be replaced with (see examples below).
Goal: When I need to make configuration changes in those applications, I want to instead make the value changes in my JSON file and then use a Python script to replace the matching key-value pairs in all those other system files.
Simplifications:
I understand opening and writing the files; my problem is the parsing.
The recursion will always end with a key-value pair where the type(value) is str.
Sample INI file from one of those applications
{
    "Version": "3.24.2",
    "Package": [
        {
            "ID": "42",
            "Display": "4",
            "Driver": "E10A"
        }, {
            "ID": "50",
            "Display": "1",
            "Driver": "E12A"
        }
    ]
}
My change file
Example use: If I want to replace all instances of {"Display": "1"} with {"Display": "10"}, all I have to do is put a 10 between the double quotes below, i.e. change {"Display": {"1": ""}} to {"Display": {"1": "10"}} (a lookup sketch follows the JSON below).
{
    "Version": {
        "3.24.2": "",
        "42.1": "",
        "2022-10-1": ""
    },
    "ID": {
        "42": "",
        "50": ""
    },
    "Display": {
        "1": "",
        "4": ""
    },
    "Driver": {
        "01152003.1": "",
        "E10A": "",
        "E12A": ""
    }
}
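In other words, the change file maps each tracked key to a dict of current value to replacement value, so a single lookup is just two nested .get() calls. A minimal sketch, where change_file is a hypothetical name for the loaded change.json:
current = "1"
replacement = change_file.get("Display", {}).get(current, "")  # "" means "no replacement requested"
new_value = replacement if replacement != "" else current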
Attempt 1
I read that Python assigns values like a C *pointer, but that was not my experience with this attempt. There are no errors, and the data variable never changed.
def RecursiveSearch(val, key=None):
    if isinstance(val, dict):
        for k, v in val.items():
            RecursiveSearch(v, k)
    elif isinstance(val, list):
        for v in val:
            RecursiveSearch(v, key)
    elif isinstance(val, str):
        # Is the key being tracked in my change file
        if key in ChangeFile:
            # Is that key's value being tracked in my change file
            if val in ChangeFile[key].keys():
                # Find the matching key-value and apply the replacement value
                for k, v in ChangeFile[key].items():
                    # Only replace the value if it has something to replace it with
                    if k == val and v != "":
                        key[val] = v

data = open('config.ini', 'w', encoding='UTF-8', errors='ignore')
data = convertJSON(data)
ChangeFile = open('change.json', 'r', encoding='UTF-8', errors='ignore')
ChangeFile = convertJSON(data)

data = RecursiveSearch(val=data, key=None)
print(data)
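As an aside on the pointer analogy: names in Python are references, but assigning to a local name or loop variable only rebinds that name and never changes the caller's object, whereas mutating the container you were handed does propagate. A minimal illustration with throwaway names:
def rebind(d):
    d = {"replaced": True}    # rebinds the local name only; the caller sees no change

def mutate(d):
    d["replaced"] = True      # mutates the shared dict; the caller sees the change

config = {"replaced": False}
rebind(config)
print(config)   # {'replaced': False}
mutate(config)
print(config)   # {'replaced': True}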
Attempt 2
Same code but with return values. In this attempt the data is completely replaced with the last key-value pair the recursion looked at.
def RecursiveSearch(val, key=None):
    if isinstance(val, dict):
        for k, v in val.items():
            tmp = RecursiveSearch(v, k)
            if tmp != {k: v}:
                return tmp
        return val
    elif isinstance(val, list):
        for v in val:
            tmp = RecursiveSearch(v, key)
            if v != tmp:
                return tmp
        return val
    elif isinstance(val, str):
        # Is the key being tracked in my change file
        if key in ChangeFile:
            # Is that key's value being tracked in my change file
            if val in ChangeFile[key].keys():
                # Find the matching key-value and apply the replacement value
                for k, v in ChangeFile[key].items():
                    # Only replace the value if it has something to replace it with
                    if k == val and v != "":
                        return v
                else:
                    return val
            else:
                return val
        else:
            return val
    else:
        return val
    # Return edited data after the recursion uncoils
    return {key: val}

data = open('config.ini', 'w', encoding='UTF-8', errors='ignore')
data = convertJSON(data)
ChangeFile = open('change.json', 'r', encoding='UTF-8', errors='ignore')
ChangeFile = convertJSON(data)

data = RecursiveSearch(val=data, key=None)
print(data)
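For comparison, here is a minimal sketch of an in-place version under the stated assumptions (every leaf value is a str, the change file maps each key to an {old value: new value} dict, and "" means no replacement). Since convertJSON isn't shown above, the sketch simply reads both files with json.load:
import json

def apply_changes(obj, changes):
    # Walk dicts and lists, replacing string leaf values according to changes[key][old].
    if isinstance(obj, dict):
        for key, value in obj.items():
            if isinstance(value, (dict, list)):
                apply_changes(value, changes)
            elif isinstance(value, str):
                new = changes.get(key, {}).get(value, "")
                if new != "":
                    obj[key] = new    # mutate in place so the caller's data changes
    elif isinstance(obj, list):
        for item in obj:
            apply_changes(item, changes)

with open('config.ini', encoding='UTF-8') as f:
    data = json.load(f)
with open('change.json', encoding='UTF-8') as f:
    change_file = json.load(f)

apply_changes(data, change_file)
print(data)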

Parsing through nested JSON keys

I have a JSON file that looks like this:
data = {
    "x": {
        "y": {
            "key": {
            },
            "w": {
            }
        }
    }
}
And have converted it into a dict in Python to then parse through it to look for keys, using the following code:
entry = input("Search JSON for the following: ")  # search for "key"
if entry in data:
    print(entry)
else:
    print("Not found.")
However, even when I input "key" as entry, it still returns "Not found." Do I need to control the depth of the search? What if I don't know the location of "key" but still want to search for it?
Your method is not working because key is not a key in data. data has one key: x. So you need to look at the dictionary and see if the key is in it. If not, you can pass the next level dictionaries back to the function recursively. This will find the first matching key:
data = {
    "x": {
        "y": {
            "key": "some value",
            "w": {}
        }
    }
}

key = "key"

def findValue(key, d):
    if key in d:
        return d[key]
    for v in d.values():
        if isinstance(v, dict):
            found = findValue(key, v)
            if found is not None:
                return found

findValue(key, data)
# 'some value'
It will return None if your key is not found.
Here's an approach which allows you to collect all the values from a nested dict, if the keys are repeated at different levels of nesting. It's very similar to the above answer, just wrapped in a function with a nonlocal list to hold the results:
def foo(mydict, mykey):
    result = []
    num_recursive_calls = 0
    def explore(mydict, mykey):
        #nonlocal result  #allow successive recursive calls to write to list
        #actually this is unnecessary in this case! Here
        #is where we would need it, for a call counter:
        nonlocal num_recursive_calls
        num_recursive_calls += 1
        for key in mydict.keys():  #get all keys from that level of nesting
            if mykey == key:
                print(f"Found {key}")
                result.append({key: mydict[key]})
            elif isinstance(mydict.get(key), dict):
                print(f"Found nested dict under {key}, exploring")
                explore(mydict[key], mykey)
    explore(mydict, mykey)
    print(f"explore called {num_recursive_calls} times")  #see above
    return result
For example, with
data = {'x': {'y': {'key': {}, 'w': {}}}, 'key': 'duplicate'}
This will return:
[{'key': {}}, {'key': 'duplicate'}]

Python remove nested JSON key or combine key with value

I'm receiving JSON files in an improper format and am trying to create a temporary fix until the documents come in the proper format. Instead of the value being set directly on the derivationValue key, it is nested as another key-value pair, so there is an extraneous key. I want to set the inner value as the value of the outer key.
{
    "derivationName": "other_lob_subcd",
    "derivationValue": {
        "OOP3": "TENANT"
    }
}
Given the above json, I want the result to be
{
    "derivationName": "other_lob_subcd",
    "derivationValue": "TENANT"
}
I could also live with
{
    "derivationName": "other_lob_subcd",
    "derivationValue": "OOP3-TENANT"
}
or something like that. It just can't be another json element.
Based on #Diana Ayala's answer, I have written this to try solving the problem with variable keys.
for k, v in data['mcsResults']['derivationsOutput']:
    if isinstance(k['derivationValue'], dict):
        for sk, sv in k['derivationValue']:
            k['derivationValue'] = sv
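That loop unpacks list items as if they were key-value pairs. Assuming derivationsOutput is a list of dicts shaped like the sample at the top of the question (that path comes from the question's own code and its full structure isn't shown), a sketch of the same idea:
for entry in data['mcsResults']['derivationsOutput']:
    value = entry.get('derivationValue')
    if isinstance(value, dict):
        # collapse {"OOP3": "TENANT"} down to "TENANT" (the first inner value)
        entry['derivationValue'] = next(iter(value.values()))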
You can use the generic code below for your requirement.
import json

filePath = 'file.json'

def getValue(value):
    if type(value) is dict:
        ans = list(value)[0]
        for k in value:
            ans += '-' + getValue(value[k])
        return ans
    return value

def correct(data):
    for key in data:
        data[key] = getValue(data[key])
    return data

if __name__ == "__main__":
    with open(filePath) as fp:
        data = json.load(fp)
    data = correct(data)
    print(data)
output:
D:\>python file.py
{'derivationName': 'other_lob_subcd', 'derivationValue': 'OOP3-TENANT'}
For the example given:
import json

with open('inpt.txt') as json_file:
    data = json.load(json_file)

data['derivationValue'] = data['derivationValue']['OOP3']
Gives the output:
{'derivationName': 'other_lob_subcd', 'derivationValue': 'TENANT'}
In general, you can look at the solutions here.
You can do something like this:
val = {
    "derivationName": "other_lob_subcd",
    "derivationValue": {
        "OOP3": "TENANT"
    }
}

val["derivationValue"] = val["derivationValue"]["OOP3"]
print(val)
This will be the output:
val = {
    "derivationName": "other_lob_subcd",
    "derivationValue": "TENANT"
}

Format some JSON objects with certain fields on one line?

I want to re-format a JSON file so that certain objects (dictionaries) with specific keys are on one line.
For example, any object with the key name should appear on one line:
{
    "this": "that",
    "parameters": [
        { "name": "param1", "type": "string" },
        { "name": "param2" },
        { "name": "param3", "default": "#someValue" }
    ]
}
The JSON file is generated and contains programming-language data. Putting certain fields on one line makes it much easier to visually inspect and review.
I tried overriding Python's json.JSONEncoder to turn each matching dict into a string before writing, only to realize the quotes within the string are escaped again in the resulting JSON file, defeating my purpose.
I also looked at jq but couldn't figure out a way to do it. I found similar questions and solutions based on line length, but my requirements are simpler, and I don't want other shorter lines to be changed. Only certain objects or fields.
This code recursively replaces all the appropriate dicts in the data with unique strings (UUIDs) and records those replacements, then in the indented JSON string the unique strings are replaced with the desired original single line JSON.
replace returns a pair of:
A modified version of the input argument data
A list of pairs of JSON strings where for each pair the first value should be replaced with the second value in the final pretty printed JSON.
import json
import uuid

def replace(o):
    if isinstance(o, dict):
        if "name" in o:
            replacement = uuid.uuid4().hex
            return replacement, [(f'"{replacement}"', json.dumps(o))]
        replacements = []
        result = {}
        for key, value in o.items():
            new_value, value_replacements = replace(value)
            result[key] = new_value
            replacements.extend(value_replacements)
        return result, replacements
    elif isinstance(o, list):
        replacements = []
        result = []
        for value in o:
            new_value, value_replacements = replace(value)
            result.append(new_value)
            replacements.extend(value_replacements)
        return result, replacements
    else:
        return o, []

def pretty(data):
    data, replacements = replace(data)
    result = json.dumps(data, indent=4)
    for old, new in replacements:
        result = result.replace(old, new)
    return result

print(pretty({
    "this": "that",
    "parameters": [
        {"name": "param1", "type": "string"},
        {"name": "param2"},
        {"name": "param3", "default": "#someValue"}
    ]
}))
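For the sample input above, the printed result should look roughly like this (each dict containing a name key ends up on a single line, everything else indented by json.dumps):
{
    "this": "that",
    "parameters": [
        {"name": "param1", "type": "string"},
        {"name": "param2"},
        {"name": "param3", "default": "#someValue"}
    ]
}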

Recursively finding paths in a nested Dict

I have a nested dictionary:
d = {
    "#timestamp": "2019-01-08T19:33:50.066Z",
    "metricset": {
        "rtt": 2592,
        "name": "filesystem",
        "module": "system"
    },
    "system": {
        "filesystem": {
            "free_files": 9660022,
            "type": "rootfs",
            "device_name": "rootfs",
            "available": 13555355648,
            "files": 9766912,
            "mount_point": "/",
            "total": 19992150016,
            "used": {
                "pct": 0.322,
                "bytes": 6436794368
            },
            "free": 13555355648
        }
    },
    "host": {
        "name": "AA"
    },
    "beat": {
        "name": "AA",
        "hostname": "AA",
        "version": "6.3.2"
    }
}
What I would like to do is write this dictionary to a CSV file. I'd like the headers of the csv to be something like this:
system.filesystem.type
where the path is made up of each level separated by a period. I am able to go through the dictionary and get most of the headers I need; however, my problem is with duplicate values.
PROBLEM: I recursively go through the dict, grab all the values, and put them in a list. Then I search for those values in the dictionary again, this time saving the path to construct the header. However, with duplicate values (e.g. the value "rootfs"), I am getting only the first key-value ("type": "rootfs") returned.
Here is my traversal to grab all the values from the dict, which does exactly what I want:
def traverse(valuelist, dictionary):
    for k, v in dictionary.items():
        if isinstance(v, dict):
            traverse(valuelist, v)
        else:
            valuelist.append(v)
    return valuelist
Now here is the code that grabs the path for each value from the code above:
def getpath(nested_dict, value, prepath=()):
    for k, v in nested_dict.items():
        path = prepath + (k,)
        if v == value:  # found value
            return path
        elif hasattr(v, 'items'):  # v is a dict
            p = getpath(v, value, path)  # recursive call
            if p is not None:
                return p
This part is not my own code. I found it here on SO and would like to modify it to grab every unique path for duplicate values (e.g. for the value "rootfs": 1st path "system.filesystem.type", 2nd path "system.filesystem.device_name").
Thank you very much, and any help is appreciated!
An easy way to do this is to turn getpath into a generator:
def getpath(nested_dict, value, prepath=()):
    for k, v in nested_dict.items():
        path = prepath + (k,)
        if v == value:  # found value
            yield path  # yield the path
        elif hasattr(v, 'items'):
            yield from getpath(v, value, path)  # yield all paths from recursive call
This way it yields every single valid path recursively. You can use it like so:
for path in getpath(nested_dict, value):
    # do stuff with path
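Tying this back to the CSV headers in the question: each yielded tuple can be joined with periods, and duplicate values each get their own path. A short usage sketch with the sample dict d from the question:
paths = [".".join(p) for p in getpath(d, "rootfs")]
print(paths)  # ['system.filesystem.type', 'system.filesystem.device_name']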
