python read dict with array value - python

I have a test.yaml:
exclude:
  - name: apple
    version: [3]
  - name: pear
    version: [2,4,5]
I have a function that should check these values in the dict and compare them:
def do_something(fruit_name: str, data: dict):
    result = []
    versions = [2,3,5,6,7,8]
    for version in versions:
        url = f"some.api.url/subjects/{fruit_name}/versions/{version}"
        response = sr_rest_api.session.post(url, json=data).json()
        config = read_yaml("test.yaml") # not sure
        for schema in config['exclude']: # not sure
            # I'm stuck here
            # if version and name exist in the yaml, skip
            # else, append to the list such as:
            else:
                result.append(response["is_fruit"]) # Boolean
    return result
I'm not sure how to unwrap the array from a dictionary.
Result from reading the yaml:
{'exclude': [{'name': 'apple', 'version': [3]},
{'name': 'pear', 'version': [2,4,5]}]}

Try this:
def do_something(fruit_name: str, data: dict):
    result = []
    # do this once outside the loop
    config = read_yaml("test.yaml")
    # make the config exclusions easier to work with
    exclusions = {schema["name"]: schema for schema in config["exclude"]}
    versions = [2,3,5,6,7,8]
    for version in versions:
        if fruit_name in exclusions and version in exclusions[fruit_name]["version"]:
            # this is an excluded fruit version, skip it
            continue
        else:
            # fetch the data and update result
            url = f"some.api.url/subjects/{fruit_name}/versions/{version}"
            response = sr_rest_api.session.post(url, json=data).json()
            result.append(response["is_fruit"])
    return result
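The exclusion lookup can be tried on its own, without the API. A minimal sketch, assuming the parsed YAML structure shown in the question; versions_to_fetch is a hypothetical helper name, and the HTTP call is left out so that only the filtering logic remains:

```python
def versions_to_fetch(fruit_name, versions, config):
    """Return the versions of fruit_name that the config does not exclude."""
    # Map each excluded name to a set of versions for fast membership tests.
    exclusions = {item["name"]: set(item["version"]) for item in config["exclude"]}
    excluded = exclusions.get(fruit_name, set())
    return [v for v in versions if v not in excluded]


# The structure that read_yaml("test.yaml") returns in the question.
config = {"exclude": [{"name": "apple", "version": [3]},
                      {"name": "pear", "version": [2, 4, 5]}]}

print(versions_to_fetch("pear", [2, 3, 5, 6, 7, 8], config))  # [3, 6, 7, 8]
print(versions_to_fetch("kiwi", [2, 3, 5, 6, 7, 8], config))  # [2, 3, 5, 6, 7, 8]
```

A fruit that is not listed under exclude simply gets an empty exclusion set, so all of its versions are kept.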

Related

How to format JSON item OrderedDic dumps with substrings-PYTHON 3

I am trying to convert a JSON file that looks like
{
    "item_1":"value_11",
    "item_2":"value_12",
    "item_3":"value_13",
    "item_4":["sub_value_14", "sub_value_15"],
    "item_5":{
        "sub_item_1":"sub_item_value_11",
        "sub_item_2":["sub_item_value_12", "sub_item_value_13"]
    }
}
to something that looks like this:
{
    "node_item_1":"value_11",
    "node_item_2":"value_12",
    "node_item_3":"value_13",
    "node_item_4_0":"sub_value_14",
    "node_item_4_1":"sub_value_15",
    "node_item_5_sub_item_1":"sub_item_value_11",
    "node_item_5_sub_item_2_0":"sub_item_value_12",
    "node_item_5_sub_item_2_1":"sub_item_value_13"
}
I am aware that you can't maintain the order of the JSON file when it is converted to CSV. I am considering a workaround: loading the JSON data into OrderedDict objects, which keep entries in the order in which the input document lists them. However, I am new to working with JSON files, as well as to OrderedDict.
To split items into subgroups I used:
def reduce_item(key, value):
    global reduced_item
    #Reduction Condition 1
    if type(value) is list:
        i=0
        for sub_item in value:
            reduce_item(key+'_'+to_string(i), sub_item)
            i=i+1
    #Reduction Condition 2
    elif type(value) is dict:
        sub_keys = value.keys()
        for sub_key in sub_keys:
            reduce_item(key+'_'+to_string(sub_key), value[sub_key])
    #Base Condition
    else:
        reduced_item[to_string(key)] = to_string(value)
But how do I use OrderedDict along with the above code to produce this output:
{
    "node_item_1":"value_11",
    "node_item_2":"value_12",
    "node_item_3":"value_13",
    "node_item_4_0":"sub_value_14",
    "node_item_4_1":"sub_value_15",
    "node_item_5_sub_item_1":"sub_item_value_11",
    "node_item_5_sub_item_2_0":"sub_item_value_12",
    "node_item_5_sub_item_2_1":"sub_item_value_13"
}
I have the below code as well, but it does not split each item into subgroups based on the conditions of the substring code above:
import json
from collections import OrderedDict
with open("/home/file/official.json", 'r') as fp:
    metrics_types = json.load(fp, object_pairs_hook=OrderedDict)
print(metrics_types)
That shows:
Any suggestions?
You can use a function that iterates through the given dict or list items and merges the keys from the dict output of the recursive calls:
def flatten(d):
    if not isinstance(d, (dict, list)):
        return d
    out = {}
    for k, v in d.items() if isinstance(d, dict) else enumerate(d):
        f = flatten(v)
        if isinstance(f, dict):
            out.update({'%s_%s' % (k, i): s for i, s in f.items()})
        else:
            out[k] = f
    return out
so that given:
d = {
    "item_1":"value_11",
    "item_2":"value_12",
    "item_3":"value_13",
    "item_4":["sub_value_14", "sub_value_15"],
    "item_5":{
        "sub_item_1":"sub_item_value_11",
        "sub_item_2":["sub_item_value_12", "sub_item_value_13"]
    }
}
flatten(d) returns:
{'item_1': 'value_11',
'item_2': 'value_12',
'item_3': 'value_13',
'item_4_0': 'sub_value_14',
'item_4_1': 'sub_value_15',
'item_5_sub_item_1': 'sub_item_value_11',
'item_5_sub_item_2_0': 'sub_item_value_12',
'item_5_sub_item_2_1': 'sub_item_value_13'}
The above assumes that you're using Python 3.7 or later, where dicts are guaranteed to preserve insertion order. If you're using an earlier version, you can use OrderedDict in place of a regular dict.
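The output requested in the question also carries a node_ prefix on every key. One possible variant, sketched under the assumption that the prefix is passed in by the caller (this is not the function from the answer, just a reworking of the same recursion on a shortened input):

```python
import json


def flatten(value, prefix):
    """Flatten nested dicts/lists into one dict with underscore-joined keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Leaf value: the accumulated prefix is its key.
        return {prefix: value}
    out = {}
    for key, sub_value in items:
        out.update(flatten(sub_value, "%s_%s" % (prefix, key)))
    return out


text = ('{"item_1": "value_11", '
        '"item_4": ["sub_value_14", "sub_value_15"], '
        '"item_5": {"sub_item_2": ["sub_item_value_12", "sub_item_value_13"]}}')
flat = flatten(json.loads(text), "node")
print(json.dumps(flat, indent=2))
```

On Python 3.7 or later json.loads returns regular dicts that already keep the document order, so no object_pairs_hook is needed for this sketch.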

Replacing strings in YAML using Python

I have the following YAML:
instance:
  name: test
  flavor: x-large
  image: centos7
tasks:
  centos-7-prepare:
    priority: 1
    details::
      ha: 0
      args:
        template: &startup
          name: startup-centos-7
          version: 1.2
        timeout: 1800
  centos-7-crawl:
    priority: 5
    details::
      ha: 1
      args:
        template: *startup
        timeout: 0
The first task defines the template name and version, which are then used by the other tasks. The structure of the template definition should not change; other things, especially the task names, will.
What would be the best way to change the template name and version in Python?
I have the following regex for matching (using re.DOTALL):
template:.*name: (.*?)version: (.*?)\s
However, I have not figured out the re.sub usage so far. Or is there a more convenient way of doing this?
For this kind of round-tripping (load-modify-dump) of YAML you should be using ruamel.yaml (disclaimer: I am the author of that package).
If your input is in input.yaml, you can then relatively easily find the name and version under key template and update them:
import sys
import ruamel.yaml

def find_template(d):
    if isinstance(d, list):
        for elem in d:
            x = find_template(elem)
            if x is not None:
                return x
    elif isinstance(d, dict):
        for k in d:
            v = d[k]
            if k == 'template':
                if 'name' in v and 'version' in v:
                    return v
            x = find_template(v)
            if x is not None:
                return x
    return None

yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True
with open('input.yaml') as ifp:
    data = yaml.load(ifp)
template = find_template(data)
template['name'] = 'startup-centos-8'
template['version'] = '1.3'
yaml.dump(data, sys.stdout)
which gives:
instance:
  name: test
  flavor: x-large
  image: centos7
tasks:
  centos-7-prepare:
    priority: 1
    'details:':
      ha: 0
      args:
        template: &startup
          name: startup-centos-8
          version: '1.3'
        timeout: 1800
  centos-7-crawl:
    priority: 5
    'details:':
      ha: 1
      args:
        template: *startup
        timeout: 0
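Please note that the name of the anchor and alias (startup) is preserved. In round-trip mode ruamel.yaml also keeps comments and, with preserve_quotes set, the original quoting, although this input contains neither. The key details: appears quoted in the output because the input spells it with a double colon (details::).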
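For comparison, the regex route that the question asks about might look like the sketch below. It is a plain-text substitution, assuming the anchor is named startup and that name and version directly follow the template: line in that order. It does not understand YAML, so the parser-based approach above is the more robust one.

```python
import re

text = """\
      args:
        template: &startup
          name: startup-centos-7
          version: 1.2
        timeout: 1800
"""

# Capture the text around the two values so it can be written back unchanged.
pattern = re.compile(r"(template: &startup\s+name: )\S+(\s+version: )\S+")

# \g<1> and \g<2> re-insert the captured text; the explicit \g<...> form
# avoids ambiguity when a digit follows the group reference.
updated = pattern.sub(r"\g<1>startup-centos-8\g<2>1.3", text)
print(updated)
```

Because \s+ also matches newlines, re.DOTALL is not needed here.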
I would parse the YAML file into a dictionary, then edit the field and write the dictionary back out to YAML.
See the question "How can I parse a YAML file in Python" for a discussion of parsing YAML in Python, but I think you would end up with something like this:
from ruamel.yaml import YAML
from io import StringIO

yaml=YAML(typ='safe')
yaml.default_flow_style = False
#Parse from string (doc holds the YAML text, here read from input.yaml)
with open('input.yaml') as fp:
    doc = fp.read()
myConfig = yaml.load(doc)
#Example replacement code
for task in myConfig["tasks"]:
    # "details::" in the input parses to the key "details:"
    template = myConfig["tasks"][task]["details:"]["args"]["template"]
    # &startup is an anchor, not a value, so compare against the actual name
    if template["name"] == "startup-centos-7":
        template["name"] = "new value"
#Convert back to string
buf = StringIO()
yaml.dump(myConfig, buf)
updatedYml = buf.getvalue()

Best way to add dictionary entry and append to JSON file in Python

I have a need to add entries to a dictionary with the following keys:
name
element
type
I want each entry to append to a JSON file, where I will access them for another piece of the project.
What I have below technically works, but there are a couple of things (at least) wrong with it.
First, it doesn't prevent duplicates being entered. For example I can have 'xyz', '4444' and 'test2' appear as JSON entries multiple times. Is there a way to correct this?
Is there a cleaner way to write the actual data entry piece so when I am entering these values into the dictionary it's not directly there in the parentheses?
Finally, is there a better place to put the JSON piece? Should it be inside the function?
Just trying to clean this up a bit. Thanks
import json
element_dict = {}

def add_entry(name, element, type):
    element_dict["name"] = name
    element_dict["element"] = element
    element_dict["type"] = type
    return element_dict

#add entry
entry = add_entry('xyz', '4444', 'test2')
#export to JSON
with open('elements.json', 'a', encoding="utf-8") as file:
    x = json.dumps(element_dict, indent=4)
    file.write(x + '\n')
There are several questions here. The main points worth mentioning:
You can use a list to hold your arguments and use *args to unpack them when you supply them to add_entry.
To check for and avoid duplicates, you can use a set to track items already added.
For writing to JSON, now that you have a list, you can simply iterate over it and write everything in one function at the end.
Putting these aspects together:
import json
res = []
seen = set()

def add_entry(res, name, element, type):
    # check if in seen set
    if (name, element, type) in seen:
        return res
    # add to seen set
    seen.add(tuple([name, element, type]))
    # append to results list
    res.append({'name': name, 'element': element, 'type': type})
    return res

args = ['xyz', '4444', 'test2']
res = add_entry(res, *args) # add entry - SUCCESS
res = add_entry(res, *args) # try to add again - FAIL
args2 = ['wxy', '3241', 'test3']
res = add_entry(res, *args2) # add another - SUCCESS
Result:
print(res)
[{'name': 'xyz', 'element': '4444', 'type': 'test2'},
{'name': 'wxy', 'element': '3241', 'type': 'test3'}]
Writing to JSON via a function:
def write_to_json(lst, fn):
    with open(fn, 'a', encoding='utf-8') as file:
        for item in lst:
            x = json.dumps(item, indent=4)
            file.write(x + '\n')

#export to JSON
write_to_json(res, 'elements.json')
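One caveat: appending separate json.dumps outputs to the same file produces a sequence of objects, which json.load cannot read back as a single document. A possible alternative, sketched here with the hypothetical helper add_entry_to_file, keeps one JSON list in the file and rewrites it on each addition, so the duplicate check also works across runs:

```python
import json
import os
import tempfile


def add_entry_to_file(path, name, element, type_):
    """Add an entry to the JSON list stored at path unless it is already there."""
    entries = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            entries = json.load(f)
    entry = {"name": name, "element": element, "type": type_}
    if entry in entries:
        return False  # duplicate, file left untouched
    entries.append(entry)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=4)
    return True


# Demonstration in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "elements.json")
first = add_entry_to_file(path, "xyz", "4444", "test2")
second = add_entry_to_file(path, "xyz", "4444", "test2")
third = add_entry_to_file(path, "wxy", "3241", "test3")
print(first, second, third)  # True False True
```

Rewriting the whole file is fine for small data sets; for large ones a line-per-record format or a database would be a more natural fit.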
You can try it this way:
import json
import hashlib

def add_entry(name, element, type):
    # key each entry by a hash of its fields, so a duplicate maps to the same key
    key = hashlib.md5((name + element + type).encode('utf-8')).hexdigest()
    return {key: {"name": name, "element": element, "type": type}}

#add entry
entry = add_entry('xyz', '4444', 'test2')
#Update to JSON (elements.json must already exist and hold a JSON object, e.g. {})
with open('elements.json', 'r') as f:
    json_data = json.load(f)
print(json_data.values()) # View Previous entries
json_data.update(entry)
with open('elements.json', 'w') as f:
    f.write(json.dumps(json_data))

Unicode strings returned by API not equal to my dict

So I'm trying to compare a dict that I have created to a dict response returned by a boto3 call.
The response is a representation of a JSON document and I want to check they are the same.
Boto3 always returned the strings as unicode. Here's the response:
{u'Version': u'2012-10-17', u'Statement': [{u'Action': u'sts:AssumeRole', u'Principal': {u'Service': u'ec2.amazonaws.com'}, u'Effect': u'Allow', u'Sid': u''}]}
I initially created my dict like this:
default_documment = {}
default_documment['Version'] = '2012-10-17'
default_documment['Statement'] = [{}]
default_documment['Statement'][0]['Sid'] = ''
default_documment['Statement'][0]['Effect'] = 'Allow'
default_documment['Statement'][0]['Principal'] = {}
default_documment['Statement'][0]['Principal']['Service'] = 'ec2.amazonaws.com'
default_documment['Statement'][0]['Action'] = 'sts:AssumeRole'
However, when I compare these two dicts with ==, they are not equal.
So then I tried adding u to all the strings when I create the dict:
# Default document for a new role
default_documment = {}
default_documment[u'Version'] = u'2012-10-17'
default_documment[u'Statement'] = [{}]
default_documment[u'Statement'][0][u'Sid'] = u''
default_documment[u'Statement'][0][u'Effect'] = u'Allow'
default_documment[u'Statement'][0][u'Principal'] = {}
default_documment[u'Statement'][0][u'Principal'][u'Service'] = u'ec2.amazonaws.com'
default_documment[u'Statement'][0][u'Action'] = u'sts:AssumeRole'
This doesn't work either. The dicts are not equal, and if I print my dict it doesn't show u'somestring', it just shows 'somestring'.
How can I compare my dict to what boto3 has returned?
Your second attempt works correctly in Python 2.7 and 3.3. Below is just a cut-and-paste of your Boto3 response and your code (with the spelling of "document" corrected):
D = {u'Version': u'2012-10-17', u'Statement': [{u'Action': u'sts:AssumeRole', u'Principal': {u'Service': u'ec2.amazonaws.com'}, u'Effect': u'Allow', u'Sid': u''}]}
default_document = {}
default_document[u'Version'] = u'2012-10-17'
default_document[u'Statement'] = [{}]
default_document[u'Statement'][0][u'Sid'] = u''
default_document[u'Statement'][0][u'Effect'] = u'Allow'
default_document[u'Statement'][0][u'Principal'] = {}
default_document[u'Statement'][0][u'Principal'][u'Service'] = u'ec2.amazonaws.com'
default_document[u'Statement'][0][u'Action'] = u'sts:AssumeRole'
print(D == default_document)
Output:
True
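If a comparison like this still returns False, it helps to locate where the two structures differ. A minimal sketch, with diff_paths as a hypothetical helper name and an expected dict that is deliberately altered for the demonstration; under Python 3 every str is already Unicode, so the u prefix makes no difference there:

```python
def diff_paths(a, b, path="root"):
    """Return the paths at which two nested dict/list structures differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs = []
        for key in sorted(set(a) | set(b)):
            if key not in a or key not in b:
                # Key present on one side only.
                diffs.append("%s[%r]" % (path, key))
            else:
                diffs.extend(diff_paths(a[key], b[key], "%s[%r]" % (path, key)))
        return diffs
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return [path]
        diffs = []
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(diff_paths(x, y, "%s[%d]" % (path, i)))
        return diffs
    return [] if a == b else [path]


response = {u'Version': u'2012-10-17',
            u'Statement': [{u'Action': u'sts:AssumeRole', u'Effect': u'Allow'}]}
expected = {'Version': '2012-10-17',
            'Statement': [{'Action': 'sts:AssumeRole', 'Effect': 'Deny'}]}

print(diff_paths(response, expected))  # ["root['Statement'][0]['Effect']"]
```

An empty list means the two structures compare equal, so any remaining mismatch would have to come from somewhere else, for example from comparing against a JSON string instead of a parsed dict.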

using ConfigParser and dictionary in Python

I am trying some basic python scripts using ConfigParser and converting to a dictionary. I am reading a file named "file.cfg" which contains three sections - root, first, second. Currently the code reads the file and converts everything within the file to a dictionary.
My requirement is to convert only the sections named "first", "second" and so on, with their key-value pairs, to a dictionary. What would be the best way of excluding the section "root" and its key-value pairs?
import urllib
import urllib2
import base64
import json
import sys
from ConfigParser import SafeConfigParser
parser = SafeConfigParser()
parser.read('file.cfg')
print parser.get('root', 'auth')
config_dict = {}
for sect in parser.sections():
    config_dict[sect] = {}
    for name, value in parser.items(sect):
        config_dict[sect][name] = value
print config_dict
Contents of file.cfg -
~]# cat file.cfg
[root]
username = admin
password = admin
auth = http://192.168.1.1/login
[first]
username = pete
password = sEcReT
url = http://192.168.1.1/list
[second]
username = ron
password = SeCrET
url = http://192.168.1.1/status
Output of the script -
~]# python test4.py
http://192.168.1.1/login
{'second': {'username': 'ron', 'url': 'http://192.168.1.1/status', 'password': 'SeCrET'}, 'root': {'username': 'admin', 'password': 'admin', 'auth': 'http://192.168.1.1/login'}, 'first': {'username': 'pete', 'url': 'http://192.168.1.1/list', 'password': 'sEcReT'}}
You can remove the root section from the parser as follows:
parser.remove_section('root')
Also, you don't have to iterate over each pair in each section. You can just convert them to a dict:
config_dict = {}
for sect in parser.sections():
    config_dict[sect] = dict(parser.items(sect))
Here is a one-liner:
config_dict = {sect: dict(parser.items(sect)) for sect in parser.sections()}
Bypass the root section by comparison.
for sect in parser.sections():
    if sect == 'root':
        continue
    config_dict[sect] = {}
    for name, value in parser.items(sect):
        config_dict[sect][name] = value
Edit after acceptance:
ozgur's one liner is a much more concise solution. Upvote from me. If you don't feel like removing sections from the parser directly, the entry can be deleted afterwards.
config_dict = {sect: dict(parser.items(sect)) for sect in parser.sections()} # ozgur's one-liner
del config_dict['root']
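The filter can also go straight into the comprehension, which leaves the parser untouched. A minimal sketch for Python 3, where the module is named configparser; the file contents are inlined with read_string and shortened so the example is self-contained:

```python
import configparser

cfg_text = """
[root]
username = admin
auth = http://192.168.1.1/login

[first]
username = pete
url = http://192.168.1.1/list

[second]
username = ron
url = http://192.168.1.1/status
"""

parser = configparser.ConfigParser()
parser.read_string(cfg_text)

# Skip the root section while building the dict; the parser is left unchanged.
config_dict = {sect: dict(parser.items(sect))
               for sect in parser.sections() if sect != 'root'}
print(config_dict)
```

parser.get('root', 'auth') keeps working afterwards, because nothing was removed from the parser itself.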
Maybe a bit off topic, but ConfigParser is a real pain when it comes to storing ints, floats and booleans. I prefer using dicts, which I dump into configparser.
I also use functions to convert between ConfigParser objects and dicts. They take care of the type conversions, so ConfigParser is happy since it requires strings, and my program is happy since 'False' is not False.
import configparser

def configparser_to_dict(config: configparser.ConfigParser) -> dict:
    # bool('False') is True, so map the literal strings explicitly
    literals = {'True': True, 'False': False, 'None': None}
    config_dict = {}
    for section in config.sections():
        config_dict[section] = {}
        for key, value in config.items(section):
            # Now try to convert back to original types if possible
            if value in literals:
                value = literals[value]
            # Try to convert to float or int
            try:
                if isinstance(value, str):
                    if '.' in value:
                        value = float(value)
                    else:
                        value = int(value)
            except ValueError:
                pass
            config_dict[section][key] = value
    # Now drop root section if present
    config_dict.pop('root', None)
    return config_dict
def dict_to_configparser(config_dict: dict) -> configparser.ConfigParser:
    config = configparser.ConfigParser()
    for section in config_dict.keys():
        config.add_section(section)
        # Now let's convert all objects to strings so configparser is happy
        for key, value in config_dict[section].items():
            config[section][key] = str(value)
    return config
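As an alternative to converting everything up front, the standard library parser also offers typed getters that convert a single option on access. A short sketch, assuming Python 3; the section and option names are made up for illustration:

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[first]
retries = 3
timeout = 2.5
verbose = False
""")

# Each getter parses the stored string into the requested type on access.
retries = parser.getint('first', 'retries')
timeout = parser.getfloat('first', 'timeout')
verbose = parser.getboolean('first', 'verbose')
print(retries, timeout, verbose)  # 3 2.5 False
```

The trade-off is that the caller has to know the expected type of each option, whereas the conversion functions above guess it from the string.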
