How to find if a dictionary contains data from another dictionary in Python

In Python, how do I find out whether one dictionary contains the data from another dictionary?
My data is assigned to a variable like this:
childDict = {
    "assignee": {
        "first": "myFirstName",
        "last": "myLastName"
    },
    "status": "alive"
}
I have another dictionary named masterDict with a similar hierarchy but with some more data in it.
masterDict = {
    "description": "sample description",
    "assignee": {
        "first": "myFirstName",
        "last": "myLastName"
    },
    "status": "dead",
    "identity": 1234
}
Now I need to read through childDict and find out whether masterDict contains these values or not.
The data is nested and can be arbitrarily deep.
In the example above, since the status didn't match, the comparison should return False; otherwise it should have returned True. How do I compare them? I am new to Python. Thanks for your help.

Note that there were some errors in your dictionary (missing commas).
childDict1 = {
    "assignee": {
        "first": "myFirstName",
        "last": "myLastName"
    },
    "status": "alive"
}
childDict2 = {
    "assignee": {
        "first": "myFirstName",
        "last": "myLastName"
    },
    "status": "dead"
}
masterDict = {
    "description": "sample description",
    "assignee": {
        "first": "myFirstName",
        "last": "myLastName"
    },
    "status": "dead",
    "identity": 1234
}
def contains_subdict(master, child):
    if isinstance(master, dict) and isinstance(child, dict):
        for key in child.keys():
            if key in master:
                if not contains_subdict(master[key], child[key]):
                    return False
            else:
                # a key missing from master means no match
                return False
        return True
    else:
        return child == master

print(contains_subdict(masterDict, childDict1))
print(contains_subdict(masterDict, childDict2))
Running the code produces the output:
False
True
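As an aside: if both dictionaries were flat (no nesting), no recursion would be needed, because Python's dict.items() views support a set-like subset test. A minimal sketch (the flat dictionaries below are made up for illustration):

```python
masterFlat = {"description": "sample description", "status": "dead", "identity": 1234}
childFlat = {"status": "dead", "identity": 1234}

# items() views compare set-like: True only if every key/value pair
# of childFlat also appears in masterFlat.
print(childFlat.items() <= masterFlat.items())  # True
print({"status": "alive"}.items() <= masterFlat.items())  # False
```

This does not handle nested dictionaries, which is why the recursive contains_subdict is needed for the original data.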

Related

Iterate a nested dictionary and filter specific fields

I have an example object which is a mix of lists and dicts:
{
    "field_1": "aaa",
    "field_2": [
        {
            "name": "bbb",
            .....
            "field_4": "ccc",
            "field_need_to_filter": False,
        },
        {
            "name": "ddd",
            .....
            "details": [
                {
                    "name": "eee",
                    ....
                    "details": [
                        {
                            "name": "fff",
                            .....
                            "field_10": {
                                "field_11": "rrr",
                                ...
                                "details": [
                                    {
                                        "name": "xxx",
                                        ...
                                        "field_need_to_filter": True,
                                    },
                                    {
                                        "name": "yyy",
                                        ...
                                        "field_need_to_filter": True,
                                    },
                                    {
                                        "field_13": "zzz",
                                        ...
                                        "field_need_to_filter": False,
                                    }
                                ]
                            }
                        },
                    ]
                }
            ]
        }
    ]
}
I'd like to iterate this dictionary and collect the dotted name paths for every entry where field_need_to_filter is True, so for this example the expected output would be:
["ddd.eee.fff.xxx", "ddd.eee.fff.yyy"]. I've been looking at this for too long and my brain has stopped working; any help would be appreciated. Thanks.
Ok, it took me some time to think through the different cases and fix bugs, but this works (at least on your example dict). Note that it assumes dicts containing "field_need_to_filter": True are end points (the function doesn't delve deeper into those). I'll be glad to add explanations to the code if you want some.
mydict = {
    "field_1": "aaa",
    "field_2": [
        {
            "name": "bbb",
            "field_4": "ccc",
            "field_need_to_filter": False,
        },
        {
            "name": "ddd",
            "details": [
                {
                    "name": "eee",
                    "details": [
                        {
                            "name": "fff",
                            "field_10": {
                                "field_11": "rrr",
                                "details": [
                                    {
                                        "name": "xxx",
                                        "field_need_to_filter": True,
                                    },
                                    {
                                        "name": "yyy",
                                        "field_need_to_filter": True,
                                    },
                                    {
                                        "field_13": "zzz",
                                        "field_need_to_filter": False,
                                    }
                                ]
                            }
                        },
                    ]
                }
            ]
        }
    ]
}
def filter_paths(thing, path=''):
    if type(thing) == dict:
        # if this dict has a name, log it
        if thing.get("name"):
            path += ('.' if path else '') + thing["name"]
        # if this dict has "...filter": True, we've reached an end point and return the path
        if thing.get("field_need_to_filter") and thing["field_need_to_filter"]:
            return [path]
        # else we delve deeper
        result = []
        for key in thing:
            result += [deep_path for deep_path in filter_paths(thing[key], path)]
        return result
    # if the current object is a list, we simply delve deeper
    elif type(thing) == list:
        result = []
        for element in thing:
            result += [deep_path for deep_path in filter_paths(element, path)]
        return result
    # We've reached a dead end, so we return an empty list
    else:
        return []

filter_paths(mydict)
# Out[204]: ['ddd.eee.fff.xxx', 'ddd.eee.fff.yyy']
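The same traversal can also be written as a generator with yield, which avoids building intermediate lists; this is an equivalent sketch under the same assumption (dicts with "field_need_to_filter": True are end points), shown on a reduced made-up sample:

```python
def iter_filter_paths(thing, path=""):
    # Generator version of filter_paths: same traversal, lazily yielded.
    if isinstance(thing, dict):
        if thing.get("name"):
            path += ("." if path else "") + thing["name"]
        if thing.get("field_need_to_filter"):
            # End point: emit the accumulated dotted path.
            yield path
            return
        for value in thing.values():
            yield from iter_filter_paths(value, path)
    elif isinstance(thing, list):
        for element in thing:
            yield from iter_filter_paths(element, path)

sample = {
    "field_2": [
        {"name": "bbb", "field_need_to_filter": False},
        {"name": "ddd", "details": [
            {"name": "eee", "field_need_to_filter": True},
        ]},
    ]
}
print(list(iter_filter_paths(sample)))  # ['ddd.eee']
```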
If that is your actual code, then you call iterate_fun instead of iterate_func after the check for the field name ddd.

Formatting issue when a Python dictionary is dumped to JSON

I have two dictionaries, test1 and test2. I have to compare them recursively: if test1 contains a key such as description$$, I have to replace the value of the corresponding key in test2 with the value from test1, and then dump the result to a JSON file. I got this working, but the output JSON file is not in the expected format.
sample.py
import json
test1 = {
    "info": {
        "title": "some random data",
        "description$$": "CHANGED::::",
        "version": "x.x.1"
    },
    "schemes": ["https"],
    "basePath": "/sch/f1"
}
test2 = {
    "info": {
        "title": "some random data",
        "description": "before change",
        "version": "x.x.4"
    },
    "schemes": ["https"],
    "basePath": "/sch/f2"
}
def walk(test1, test2):
    for key, item in test1.items():
        if type(item) is dict:
            walk(item, test2[key])
        else:
            if str(key) == "description$$" or str(key) == "summary$$":
                modified_key = str(key)[:-2]
                test2[modified_key] = test1[key]

walk(test1, test2)
json.dump(test2, open('outputFile.json', "w"), indent=2)
My output (outputFile.json) is:
{
  "info": {
    "title": "some random data",
    "description": "CHANGED::::",
    "version": "x.x.4"
  },
  "schemes": [
    "https"
  ],
  "basePath": "/sch/f2"
}
but the expected output should be:
{
  "info": {
    "title": "some random data",
    "description": "CHANGED::::",
    "version": "x.x.4"
  },
  "schemes": ["https"],
  "basePath": "/sch/f2"
}
"schemes" should be printed on a single line, but in my output it takes three lines. How can I fix this?
Thank you
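json.dump has no option for mixed formatting: indent applies to every container, so a short list cannot stay on one line. One pragmatic workaround is to pretty-print first and then collapse simple lists with a regular expression. This is a post-processing sketch, and it assumes list elements are plain scalars that contain no brackets, braces, or commas:

```python
import json
import re

test2 = {
    "info": {
        "title": "some random data",
        "description": "CHANGED::::",
        "version": "x.x.4"
    },
    "schemes": ["https"],
    "basePath": "/sch/f2"
}

pretty = json.dumps(test2, indent=2)
# Collapse any indented list whose body has no nested brackets/braces,
# e.g. [\n    "https"\n  ] becomes ["https"].
compact = re.sub(
    r"\[\s+([^\[\]{}]*?)\s+\]",
    lambda m: "[" + ", ".join(p.strip() for p in m.group(1).split(",")) + "]",
    pretty,
)
print(compact)
```

The result can then be written to outputFile.json instead of calling json.dump directly.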

Get the parent key and the nested value in nested JSON

I have nested JSON for a JSON schema like this:
{
    "config": {
        "x-permission": true
    },
    "deposit_schema": {
        "additionalProperties": false,
        "$schema": "http://json-schema.org/draft-04/schema#",
        "type": "object",
        "properties": {
            "control_number": {
                "type": "string",
                "x-cap-permission": {
                    "users": [
                        "test#test.com"
                    ]
                }
            },
            "initial": {
                "properties": {
                    "status": {
                        "x-permission": {
                            "users": [
                                "test3#test.com"
                            ]
                        },
                        "title": "Status",
                        "type": "object",
                        "properties": {
                            "main_status": {
                                "type": "string",
                                "title": "Stage"
                            }
                        }
                    },
                    "gitlab_repo": {
                        "description": "Add your repository",
                        "items": {
                            "properties": {
                                "directory": {
                                    "title": "Subdirectory",
                                    "type": "string",
                                    "x-permission": {
                                        "users": [
                                            "test1#test.com",
                                            "test2#test.com"
                                        ]
                                    }
                                },
                                "gitlab": {
                                    "title": "Gitlab",
                                    "type": "string"
                                }
                            },
                            "type": "object"
                        },
                        "title": "Gitlab Repository",
                        "type": "array"
                    },
                    "title": "Initial Input",
                    "type": "object"
                }
            },
            "title": "Test Analysis"
        }
    }
}
The JSON is nested, and I want to get a dict of the x-permission fields keyed by their parent path, like this:
{
    "control_number": {"users": ["test#test.com"]},
    "initial.properties.status": {"users": ["test3#test.com"]},
    "initial.properties.gitlab_repo.items.properties.directory": {
        "users": [
            "test1#test.com",
            "test2#test.com"
        ]
    }
}
I am trying to implement recursive logic for every key in the JSON like this:
def extract(obj, parent_key):
    """Recursively search for values of key in JSON tree."""
    for k, v in obj.items():
        key = parent_key + '.' + k
        if isinstance(v, dict):
            if v.get('x-permission'):
                return key, v.get('x-permission')
            elif v.get('properties'):
                return extract(v.get('properties'), key)
    return None, None

def collect_permission_info(object_):
    # _schema = _schema.deposit_schema.get('properties')
    _schema = object_  # above json
    x_cap_fields = {}
    for k in _schema:
        parent_key, permission_info = extract(_schema.get(k), k)
        if parent_key and permission_info:
            x_cap_fields.update({parent_key: permission_info})
    return x_cap_fields
I am getting an empty dict now; what am I missing here?
You could use this generator of key/value tuples:
def collect_permission_info(schema):
    for key, child in schema.items():
        if isinstance(child, dict):
            if "x-permission" in child:
                yield key, child["x-permission"]
            if "properties" in child:
                for rest, value in collect_permission_info(child["properties"]):
                    yield key + "." + rest, value
Then call it like this:
result = dict(collect_permission_info(schema))
A few issues I can spot:
- You use parent_key directly in the recursive function. When multiple properties exist in an object ("_experiment" has 2 properties), the path will be incorrect (e.g. _experiment.type.x-permission is constructed on the second loop iteration). Use a new variable so that each subsequent loop iteration uses the initial parent_key value.
- The elif branch is never reached when the first branch matches, as the first branch has priority.
- The function returns as soon as one branch matches, so anything you might find on deeper levels or under later keys is ignored.
- Judging by your example JSON schema and the desired result, a recursive call on the "initial": {...} object should return multiple results. You would have to modify the extract(...) function to allow for multiple results instead of a single one.
- You only check whether an object contains an x-permission or a properties attribute. This misses the desired result in the "initial" schema branch, where x-permission is nested deeper inside the status branch. The easiest solution is to make a recursive call every time isinstance(v, dict) is True.
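Putting those points together, the fix is to recurse into every dict value and accumulate all matches rather than returning the first one. A minimal sketch (the extract_all name and the reduced sample data are made up for illustration, not from the question):

```python
def extract_all(obj, parent_key=""):
    # Walk every nested dict, building a dotted path, and collect
    # each x-permission value found along the way.
    results = {}
    for k, v in obj.items():
        if not isinstance(v, dict):
            continue
        key = parent_key + "." + k if parent_key else k
        if "x-permission" in v:
            results[key] = v["x-permission"]
        results.update(extract_all(v, key))
    return results

sample = {
    "control_number": {"x-permission": {"users": ["test#test.com"]}},
    "initial": {
        "properties": {
            "status": {"x-permission": {"users": ["test3#test.com"]}}
        }
    },
}
print(extract_all(sample))
```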
After reading through the comments and the answers, I got this solution working for my use case.
def parse_schema_permission_info(schema):
    x_fields = {}

    def extract_permission_field(field, parent_field):
        for field, value in field.items():
            if field == 'x-permission':
                x_fields.update({parent_field: value})
            if isinstance(value, dict):
                key = parent_field + '.' + field
                if value.get('x-permission'):
                    x_fields.update(
                        {key: value.get('x-permission')}
                    )
                extract_permission_field(value, key)

    for field in schema:
        extract_permission_field(schema.get(field), field)
    return x_fields

Create nested maps in DynamoDB

I am trying to create a new map and assign it a value at the same time.
This is the data format I want to store in my db:
{
    "user_id": 1,
    "project_id": 1,
    "MRR": {
        "NICHE": {
            "define your niche": {
                "vertical": "test",
                "ideal prospect": "He is the best"
            }
        },
        "Environment": {
            "Trend 1": {
                "description": "something"
            },
            "Trend 2": {
                "description": "something else"
            }
        }
    }
}
My code so far for inserting data is:
def update_dynamo(user_id, project_id, group, sub_type, data):
    dynmoTable.update_item(
        Key={
            "user_id": user_id
        },
        ConditionExpression=Attr("project_id").eq(project_id),
        UpdateExpression="SET MRR.#group = :group_value",
        ExpressionAttributeNames={
            "#group": group
        },
        ExpressionAttributeValues={
            ":group_value": {}
        }
    )
    dynmoTable.update_item(
        Key={
            "user_id": user_id
        },
        ConditionExpression=Attr("project_id").eq(project_id),
        UpdateExpression="SET MRR.#group.#subgroup = :sub_value",
        ExpressionAttributeNames={
            "#group": group,
            "#subgroup": sub_type
        },
        ExpressionAttributeValues={
            ":sub_value": data
        }
    )

data = {
    "description": "world",
}

if __name__ == "__main__":
    update_dynamo(1, 1, "New Category", "Hello", data)
My question is: can these two update_item calls somehow be merged into one?
Sure. You can assign an entire nested "document" to the top-level attribute; you don't need to assign only scalars.
Something like this should work:
dynmoTable.update_item(
    Key={
        "user_id": user_id
    },
    ConditionExpression=Attr("project_id").eq(project_id),
    UpdateExpression="SET MRR.#group = :group_value",
    ExpressionAttributeNames={
        "#group": group
    },
    ExpressionAttributeValues={
        ":group_value": {sub_type: data}
    }
)
Note how you set the "group" attribute to the Python dictionary {sub_type: data}. boto3 will convert this dictionary into the appropriate DynamoDB map attribute, as you expect. You can set sophisticated structures of dictionaries and lists nested in each other this way, in a single update.

Extracting data from JSON depending on other parameters

What are the options for extracting a value from JSON depending on other parameters (using Python)? For example, the JSON:
"list": [
{
"name": "value",
"id": "123456789"
},
{
"name": "needed-value",
"id": "987654321"
}
]
When using json_name["list"][0]["id"], it obviously returns 123456789. Is there a way to select the entry whose "name" is "needed-value" so I could get 987654321 in return?
For example:
import json as j

s = '''
{
    "list": [
        {
            "name": "value",
            "id": "123456789"
        },
        {
            "name": "needed-value",
            "id": "987654321"
        }
    ]
}
'''

js = j.loads(s)
print([x["id"] for x in js["list"] if x["name"] == "needed-value"])
The best way to handle this is to refactor the JSON into a single dictionary. Since "name" and "id" come in pairs, you can build the dictionary with the value from "name" as the key and the value from "id" as the value.
import json

j = '''{
    "list": [
        {
            "name": "value",
            "id": "123456789"
        },
        {
            "name": "needed-value",
            "id": "987654321"
        }
    ]
}'''

jlist = json.loads(j)['list']
d = {jd['name']: jd['id'] for jd in jlist}
print(d)  # {'value': '123456789', 'needed-value': '987654321'}
Now you can iterate the items like you normally would from a dictionary.
for k, v in d.items():
    print(k, v)
# value 123456789
# needed-value 987654321
And since the names are now hashed as dictionary keys, you can check membership more efficiently than repeatedly scanning the list.
assert 'needed-value' in d
jsn = {
    "list": [
        {
            "name": "value",
            "id": "123456789"
        },
        {
            "name": "needed-value",
            "id": "987654321"
        }
    ]
}

def get_id(items, name):
    # A different parameter name avoids shadowing the built-in list.
    for el in items:
        if el['name'] == name:
            yield el['id']

print(list(get_id(jsn['list'], 'needed-value')))
Once parsed, Python represents JSON as nested lists and dictionaries. With this in mind, you can index the list directly if you already know the position of the entry (and its child dictionary) you need returned.
In your case, I would use json_name["list"][1]["id"].
If, however, you don't know the position of your needed value within the list, then you can run an old-fashioned for loop this way:
for user in json_name["list"]:
    if user["name"] == "needed-value":
        print(user["id"])
This is assuming you only have one unique needed value in your list.
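If you only ever need the first match, next() over a generator expression expresses the same loop in one line; a small sketch using the sample data from the question:

```python
data = {
    "list": [
        {"name": "value", "id": "123456789"},
        {"name": "needed-value", "id": "987654321"}
    ]
}

# next() returns the first matching id, or the default (None) if nothing matches.
needed_id = next(
    (item["id"] for item in data["list"] if item["name"] == "needed-value"),
    None,
)
print(needed_id)  # 987654321
```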
