How to search at a particular level in a JSON tree using Python

I'm trying to learn a method of searching within a JSON tree on a particular level. I have the following JSON template example:
{
    "Pair": {
        "Instrument_A": {
            "Segment A": {
                "default": {
                    "value X": 1,
                    "value Z": 2,
                    "value Y": 3
                }
            },
            "Segment B": {
                "default": {
                    "value X": 1,
                    "value Z": 2,
                    "value Y": 3
                }
            }
        },
        "Instrument_B": {
            "Segment A": {
                "not-default": {
                    "value X": 1,
                    "value Z": 2,
                    "value Y": 3
                }
            }
        }
    }
}
My goal is to count all objects whose name is not "default". For example, on the fourth level, under Instrument_B, Segment A, there is an object named "not-default".

You can parse the JSON using the json package and iterate through the nested dictionaries. The names "default" and "not-default" are keys on the fourth level, so one more loop over the keys of each segment is needed.
Python 2:
import json
json_str = "..."
json_object = json.loads(json_str)
for pair_k, pair_v in json_object.iteritems():
    for inst_k, inst_v in pair_v.iteritems():
        for seg_k, seg_v in inst_v.iteritems():
            for name in seg_v:
                if name != "default":
                    pass  # do whatever you want - print, append to list, etc.
Python 3:
import json
json_str = "..."
json_object = json.loads(json_str)
for pair_k, pair_v in json_object.items():
    for inst_k, inst_v in pair_v.items():
        for seg_k, seg_v in inst_v.items():
            for name in seg_v:
                if name != "default":
                    pass  # do whatever you want - print, append to list, etc.

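Alternatively, a recursive function can count every key in the tree except "default":
from collections import Counter
count = Counter()
def count_keys(data):
    if isinstance(data, dict):
        for key, value in data.items():
            if key != 'default':
                count[key] += 1
            count_keys(value)
count_keys(data)  # data is the dictionary returned by json.loads
print(count)
This prints:
Counter({'value X': 3, 'value Z': 3, 'value Y': 3, 'Segment A': 2, 'Pair': 1, 'Instrument_A': 1, 'Segment B': 1, 'Instrument_B': 1, 'not-default': 1})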
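To look only at one level of the tree, as the question asks, a minimal sketch might walk down to a chosen depth and inspect just the keys found there. The helper name keys_at_depth and the convention that top-level keys (here "Pair") are depth 1 are assumptions made for illustration; they do not come from the answers above.

```python
import json

def keys_at_depth(node, depth):
    """Yield every key found exactly `depth` levels down (top-level keys are depth 1)."""
    if not isinstance(node, dict):
        return  # scalars and lists have no keys to report
    for key, value in node.items():
        if depth == 1:
            yield key
        else:
            yield from keys_at_depth(value, depth - 1)

# The template from the question, written as valid JSON.
json_str = """
{"Pair": {
  "Instrument_A": {
    "Segment A": {"default": {"value X": 1, "value Z": 2, "value Y": 3}},
    "Segment B": {"default": {"value X": 1, "value Z": 2, "value Y": 3}}
  },
  "Instrument_B": {
    "Segment A": {"not-default": {"value X": 1, "value Z": 2, "value Y": 3}}
  }
}}
"""
data = json.loads(json_str)

level_4 = list(keys_at_depth(data, 4))
non_default = [k for k in level_4 if k != "default"]
print(level_4)           # ['default', 'default', 'not-default']
print(len(non_default))  # 1
```

Because the depth is a parameter, the same helper can be pointed at any other level without adding or removing nested loops.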

Related

python how to search a string, count values and group by in json

I have a python program calling an API that receives the result as below:
{
    "result": [
        {"company": "BMW", "model": "5"},
        {"company": "BMW", "model": "5"},
        {"company": "BMW", "model": "5"},
        {"company": "BMW", "model": "3"},
        {"company": "BMW", "model": "7"},
        {"company": "AUDI", "model": "A3"},
        {"company": "AUDI", "model": "A7"}
    ]
}
Now my task is to identify the number of occurrences of elements from the list in JSON output and group them. The expected output should look like this:
{
    "BMW": {
        "5series": 3,
        "3series": 1,
        "7series": 1
    },
    "AUDI": {
        "A3": 1,
        "A7": 1
    },
    "MERCEDES": {
        "EClass": 0,
        "SClass": 0
    }
}
I need to look up each "company" from a list of names. The list may include names that are sometimes missing from the JSON response; in that case the expected output should still include them with a count of 0. The "model" names (3, 5, 7, A3, etc.) are fixed, so we know those are the only ones that may or may not appear in the JSON API response.
For example, the list in the code below has 3 company names: companyname = ["BMW","AUDI","MERCEDES"]. However, the JSON API response may sometimes lack one or more of them. In this case "MERCEDES" is missing, but the final output should still include "MERCEDES" with its values set to 0.
Here is what I have tried so far:
def modelcount():
    companyname = ["BMW", "AUDI", "MERCEDES"]
    url = apiurl
    # Send request
    apiresponse = requests.get(url, auth=(user, password), headers=headers, proxies=proxies)
    # Decode the JSON response into a dictionary and use the data
    data = apiresponse.json()
    print(len(data['result']))
    # Python identifiers cannot start with a digit, so 3series becomes series_3, etc.
    series_3 = 0
    series_5 = 0
    series_7 = 0
    A3 = 0
    A7 = 0
    EClass = 0
    SClass = 0
    modelcountjson = {}
    for name in companyname:
        for item in data['result']:
            models = {}
            if item['company'] == name:
                # The API returns the model as a string, so compare against '3', not 3
                if item['model'] == '3':
                    series_3 = series_3 + 1
                elif item['model'] == '5':
                    series_5 = series_5 + 1
                elif item['model'] == '7':
                    series_7 = series_7 + 1
            models['3series'] = series_3
            models['5series'] = series_5
            models['7series'] = series_7
        # I still haven't written AUDI, MERCEDES above. This is where I feel I am writing inefficiently.
        modelcountjson[name] = models
    return jsonify(modelcountjson)
As the number of models grows, I am worried that the code will become redundant, with many for loops, and may cause performance overhead. I am looking for help achieving the end result in the most efficient way.
Thank you so much for your help.
A useful package for working directly with JSON-style dictionaries and lists is toolz (see documentation for more details). This way you can concisely group the data and count occurrences of each model while handling potentially missing data separately:
from toolz import itertoolz
result = {
    "result": [
        {"company": "BMW", "model": "5"},
        {"company": "BMW", "model": "5"},
        {"company": "BMW", "model": "5"},
        {"company": "BMW", "model": "3"},
        {"company": "BMW", "model": "7"},
        {"company": "AUDI", "model": "A3"},
        {"company": "AUDI", "model": "A7"},
    ]
}
final_output = {}
grouped_result = itertoolz.groupby('company', result['result'])
if 'MERCEDES' not in grouped_result:
    final_output['MERCEDES'] = {
        'EClass': 0,
        'SClass': 0
    }
for key, value in grouped_result.items():
    models = itertoolz.pluck('model', value)
    final_output[key] = itertoolz.frequencies(models)
The output results in:
{'AUDI': {'A3': 1, 'A7': 1}, 'BMW': {'3': 1, '5': 3, '7': 1}, 'MERCEDES': {'EClass': 0, 'SClass': 0}}
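If adding a dependency is not an option, roughly the same grouping can be sketched with only the standard library. The expected_models table and the count_models helper below are hypothetical names introduced for this example; the sketch also assumes the full list of models per company is known in advance, as the question states, so that absent models and absent companies both come out as 0.

```python
from collections import Counter

# Hypothetical table of every model expected per company (an assumption for this sketch).
expected_models = {
    "BMW": ["3", "5", "7"],
    "AUDI": ["A3", "A7"],
    "MERCEDES": ["EClass", "SClass"],
}

def count_models(records, expected):
    """Count (company, model) pairs; expected models that never appear get 0."""
    seen = Counter((r["company"], r["model"]) for r in records)
    # A Counter returns 0 for missing keys, which provides the zero fill.
    return {
        company: {model: seen[(company, model)] for model in models}
        for company, models in expected.items()
    }

records = [
    {"company": "BMW", "model": "5"},
    {"company": "BMW", "model": "5"},
    {"company": "BMW", "model": "5"},
    {"company": "BMW", "model": "3"},
    {"company": "BMW", "model": "7"},
    {"company": "AUDI", "model": "A3"},
    {"company": "AUDI", "model": "A7"},
]

counts = count_models(records, expected_models)
print(counts)
```

The response is scanned once to build the Counter, and the output is then assembled from the expected table, so records for companies or models outside that table are silently ignored.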
You could separate the code from the configuration:
conf = {
    'BMW': {'format': '{}series', 'keys': ['3', '5', '7']},
    'AUDI': {'format': '{}', 'keys': ['A3', 'A7']},
    'MERCEDES': {'format': '{}Class', 'keys': ['E', 'S']},
}
def modelcount():
    # retrieve `data`
    # ...
    # Start every known model at 0 so missing companies still appear
    result = {
        k: {
            v['format'].format(key): 0 for key in v['keys']
        } for k, v in conf.items()
    }
    for car in data['result']:
        com = car['company']
        mod = car['model']
        key = conf[com]['format'].format(mod)
        result[com][key] += 1
    # Add a per-company total
    for com in result:
        result[com]['Total'] = sum(result[com].values())
    return result
>>> modelcount()
{'BMW': {'3series': 1, '5series': 3, '7series': 1, 'Total': 5},
 'AUDI': {'A3': 1, 'A7': 1, 'Total': 2},
 'MERCEDES': {'EClass': 0, 'SClass': 0, 'Total': 0}}
This way, for more companies and models, you will only have to touch the conf, not the code. The time complexity of this is O(m+n) with m the total number of distinct models and n the number of cars in the API response.

I want to convert sample JSON data into nested JSON using a specific key-value in Python

I have the sample data below in JSON format.
project_cost_details is my database result set after querying.
{
    "1": {
        "amount": 0,
        "breakdown": [
            {
                "amount": 169857,
                "id": 4,
                "name": "SampleData",
                "parent_id": "1"
            }
        ],
        "id": 1,
        "name": "ABC PR"
    }
}
Here is full json : https://jsoneditoronline.org/?id=2ce7ab19af6f420397b07b939674f49c
Expected output :https://jsoneditoronline.org/?id=56a47e6f8e424fe8ac58c5e0732168d7
I created this sample JSON using loops in my code, but I am stuck on how to convert it to the expected JSON format. The entries come out as a flat sequence, and I need to convert them to a tree-like, nested JSON format.
My attempt in Python:
project_cost = {}
for cost in project_cost_details:
    if cost.get('Parent_Cost_Type_ID'):
        project_id = str(cost.get('Project_ID'))
        parent_cost_type_id = str(cost.get('Parent_Cost_Type_ID'))
        if project_id not in project_cost:
            project_cost[project_id] = {}
        if "breakdown" not in project_cost[project_id]:
            project_cost[project_id]["breakdown"] = []
        if 'amount' not in project_cost[project_id]:
            project_cost[project_id]['amount'] = 0
        project_cost[project_id]['name'] = cost.get('Title')
        project_cost[project_id]['id'] = cost.get('Project_ID')
        if parent_cost_type_id == cost.get('Cost_Type_ID'):
            project_cost[project_id]['amount'] += int(cost.get('Amount'))
        #if parent_cost_type_id is None:
        project_cost[project_id]["breakdown"].append(
            {
                'amount': int(cost.get('Amount')),
                'name': cost.get('Name'),
                'parent_id': parent_cost_type_id,
                'id': cost.get('Cost_Type_ID')
            }
        )
This code produces the sample JSON. It would be good if the same code could produce the desired format directly.
I also tried the solution mentioned here: https://adiyatmubarak.wordpress.com/2015/10/05/group-list-of-dictionary-data-by-particular-key-in-python/
I found an approach to convert the sample JSON to the expected JSON:
data = [
    {"name": "ABC", "parent": "DEF"},
    {"name": "DEF", "parent": "null"},
    {"name": "new_name", "parent": "ABC"},
    {"name": "new_name2", "parent": "ABC"},
    {"name": "Foo", "parent": "DEF"},
    {"name": "Bar", "parent": "null"},
    {"name": "Chandani", "parent": "new_name", "relation": "rel", "depth": 3},
    {"name": "Chandani333", "parent": "new_name", "relation": "rel", "depth": 3}
]
result = {x.get("name"): x for x in data}
# print(result)
tree = []
for a in data:
    # print(a)
    if a.get("parent") in result:
        parent = result[a.get("parent")]
    else:
        parent = ""
    if parent:
        if "children" not in parent:
            parent["children"] = []
        parent["children"].append(a)
    else:
        tree.append(a)
Reference: http://jsfiddle.net/9FqKS/ (this is a JavaScript solution that I converted to Python)
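The parent-link approach above can be wrapped in a small reusable function. This is only a sketch: the name build_tree and its key and parent_key parameters are hypothetical, and it assumes every record has a unique name. Unlike the loop above, it works on shallow copies, so the input records are not given "children" lists as a side effect.

```python
def build_tree(records, key="name", parent_key="parent"):
    """Return the root nodes; a node gets a "children" list only if it has children."""
    # Shallow-copy each record so the caller's dictionaries are left untouched.
    nodes = {r[key]: dict(r) for r in records}
    roots = []
    for node in nodes.values():
        # A parent that is missing, None, or an unknown name (such as "null") makes a root.
        parent = nodes.get(node.get(parent_key))
        if parent is None:
            roots.append(node)
        else:
            parent.setdefault("children", []).append(node)
    return roots

records = [
    {"name": "ABC", "parent": "DEF"},
    {"name": "DEF", "parent": "null"},
    {"name": "new_name", "parent": "ABC"},
    {"name": "Bar", "parent": "null"},
]

roots = build_tree(records)
print([r["name"] for r in roots])  # ['DEF', 'Bar']
```

Children can appear before their parents in the input, because all nodes are indexed by name before any links are made.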
It seems that you want to get a list of values from a dictionary.
result = [value for key, value in project_cost_details.items()]

Get all parent keys in a nested dictionary for all items

I want to get all parent keys for all items in a nested python dictionary with unlimited levels. Take an analogy, if you think of a nested dictionary as a directory containing sub-directories, the behaviour I want is similar to what glob.glob(dir, recursive=True) does.
For example, suppose we have the following dictionary:
sample_dict = {
    "key_1": {
        "sub_key_1": 1,
        "sub_key_2": 2,
    },
    "key_2": {
        "sub_key_1": 3,
        "sub_key_2": {
            "sub_sub_key_1": 4,
        },
    },
}
I want to get the full "path" of every value in the dictionary:
["key_1", "sub_key_1", 1]
["key_1", "sub_key_2", 2]
["key_2", "sub_key_1", 3]
["key_2", "sub_key_2", "sub_sub_key_1", 4]
Just wondering if there is a clean way to do that?
Using generators can often simplify the code for this type of task and make it much more readable, while avoiding the need to pass explicit state arguments to the function. You get a generator instead of a list, but this is a good thing because you can evaluate it lazily if you want to. For example:
def getpaths(d):
    if not isinstance(d, dict):
        yield [d]
    else:
        yield from ([k] + w for k, v in d.items() for w in getpaths(v))
result = list(getpaths(sample_dict))
Result will be:
[['key_1', 'sub_key_1', 1],
['key_1', 'sub_key_2', 2],
['key_2', 'sub_key_1', 3],
['key_2', 'sub_key_2', 'sub_sub_key_1', 4]]
You can solve it recursively:
sample_dict = {
    "key_1": {
        "sub_key_1": 1,
        "sub_key_2": 2,
    },
    "key_2": {
        "sub_key_1": 3,
        "sub_key_2": {
            "sub_sub_key_1": 4,
        },
    }
}
def full_paths(sample_dict, paths=None, parent_keys=None):
    # Default to None: a mutable default list would be shared between calls
    if paths is None:
        paths = []
    if parent_keys is None:
        parent_keys = []
    for key in sample_dict.keys():
        if type(sample_dict[key]) is dict:
            full_paths(sample_dict[key], paths=paths, parent_keys=(parent_keys + [key]))
        else:
            paths.append(parent_keys + [key] + [sample_dict[key]])
    return paths
print(full_paths(sample_dict))
You can use this solution:
sample_dict = {
    "key_1": {
        "sub_key_1": 1,
        "sub_key_2": 2,
    },
    "key_2": {
        "sub_key_1": 3,
        "sub_key_2": {
            "sub_sub_key_1": 4,
        },
    },
}
def key_find(sample_dict, li=[]):
    for key, val in sample_dict.items():
        if isinstance(val, dict):
            key_find(val, li=li + [key])
        else:
            print(li + [key] + [val])
key_find(sample_dict)
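The recursive solutions above are limited by Python's recursion depth when a dictionary is nested very deeply. As a hedged alternative, the same paths can be produced with an explicit stack. The function name iter_paths is an assumption for this sketch, not part of any answer above.

```python
def iter_paths(d):
    """Yield [key, ..., value] for every leaf value, without recursion."""
    stack = [([], d)]  # each entry is (path so far, node still to visit)
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            # Push in reverse so that keys are popped in insertion order.
            for key, value in reversed(list(node.items())):
                stack.append((path + [key], value))
        else:
            yield path + [node]

sample_dict = {
    "key_1": {"sub_key_1": 1, "sub_key_2": 2},
    "key_2": {"sub_key_1": 3, "sub_key_2": {"sub_sub_key_1": 4}},
}

paths = list(iter_paths(sample_dict))
for p in paths:
    print(p)
```

Like the recursive versions, this sketch yields nothing for an empty nested dictionary, since such a dictionary contains no leaf value.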

Python: Efficient way of sorting dictionary that has n dictionaries within it

I'll just go straight to example:
Here we have a dictionary with a test name, and another dict which contains the level categorization.
EDIT
Input:
test_values = [
    {
        "name": "test1",
        "level_map": {
            "system": 1,
            "system_apps": 2,
            "app_test": 3
        }
    },
    {
        "name": "test2",
        "level_map": {
            "system": 1,
            "system_apps": 2,
            "app_test": 3
        }
    },
    {
        "name": "test3",
        "level_map": {
            "system": 1,
            "memory": 2,
            "memory_test": 3
        }
    }
]
Output:
What I want is this:
dict_obj:
{
    "system": {
        "system_apps": {
            "app_test": {
                test1 object,
                test2 object
            }
        },
        "memory": {
            "memory_test": {
                test3 object
            }
        }
    }
}
I just can't wrap my head around the logic and I'm struggling to even come up with an approach. If someone could guide me, that would be great.
Let's start with level_map. You can sort keys on values to get the ordered levels:
>>> level_map = { "system": 1, "system_apps": 2, "app_test": 3}
>>> L = sorted(level_map.keys(), key=lambda k: level_map[k])
>>> L
['system', 'system_apps', 'app_test']
Use these elements to build a tree:
>>> root = {}
>>> temp = root
>>> for k in L[:-1]:
...     temp = temp.setdefault(k, {}) # create new inner dict if necessary
...
>>> temp.setdefault(L[-1], []).append("test") # and add a name
>>> root
{'system': {'system_apps': {'app_test': ['test']}}}
I split the list before the last element, because the last element will be associated to a list, not a dict (leaves of the tree are lists in your example).
Now it's easy to repeat this with the list of dicts:
ds = [{"name": "test1",
       "level_map": {"system": 1, "system_apps": 2, "app_test": 3}
      }, {"name": "test2",
       "level_map": {"system": 1, "system_apps": 2, "app_test": 3}
      }, {"name": "test3",
       "level_map": {"system": 1, "memory": 2, "memory_test": 3}
      }]
root = {}
for d in ds:
    name = d["name"]
    level_map = d["level_map"]
    L = sorted(level_map.keys(), key=lambda k: level_map[k])
    temp = root
    for k in L[:-1]:
        temp = temp.setdefault(k, {})
    temp.setdefault(L[-1], []).append(name)
# root is: {'system': {'system_apps': {'app_test': ['test1', 'test2']}, 'memory': {'memory_test': ['test3']}}}
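The question asks for the test objects themselves at the leaves rather than only their names. A small variation of the loop above, wrapped in a function, could look like this. The function name group_by_levels is hypothetical, and the sketch assumes every level_map is non-empty.

```python
def group_by_levels(tests):
    """Nest tests by their ordered level names; each leaf is a list of test dicts."""
    root = {}
    for test in tests:
        level_map = test["level_map"]
        # Order the level names by their numeric rank (1, 2, 3, ...).
        levels = sorted(level_map, key=level_map.get)
        node = root
        for level in levels[:-1]:
            node = node.setdefault(level, {})
        # The last level holds a list, because several tests can share it.
        node.setdefault(levels[-1], []).append(test)
    return root

tests = [
    {"name": "test1", "level_map": {"system": 1, "system_apps": 2, "app_test": 3}},
    {"name": "test2", "level_map": {"system": 1, "system_apps": 2, "app_test": 3}},
    {"name": "test3", "level_map": {"system": 1, "memory": 2, "memory_test": 3}},
]

tree = group_by_levels(tests)
print([t["name"] for t in tree["system"]["system_apps"]["app_test"]])
```

Each test is visited once and its levels are sorted once, so the work grows with the number of tests and not with the size of the tree built so far.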

I cannot parse through a nested JSON file because a unicode type is returned instead of an array or list

I'm trying to get the value of each json elements. I am expecting the type to be an array or list but instead, I get type unicode.
Here's my sample json file:
{
    "accounts": [
        {
            "account": {
                "basicDetails": {
                    "accountId": {
                        "acctName": "Test A",
                        "acctNumber": "Test B"
                    },
                    "accountBranchId": {
                        "branchName": "Test C",
                        "brancNumber": "Test D"
                    },
                    "cusName": "Test E"
                },
                "otherDetails": {
                    "dateCreated": "1999-10-01",
                    "dateClosed": "2000-10-01"
                }
            }
        }
    ],
    "userExtension": {
        "testId": null,
        "version": null
    },
    "status": {
        "overallStatus": "S",
        "messages": null
    },
    "_links": null
}
Here is the code I am currently trying
def extract_key(self, obj):
    def extract(obj):
        if type(obj) == type(OrderedDict()) or isinstance(obj, list):
            for k, v in obj.items():
                if type(v) == type(OrderedDict()) or type(v) == type(list):
                    extract(v)
                elif type(v) != type(OrderedDict()) or type(v) != type(list):
                    print(type(k))
                    print(k)
    results = extract(obj)
    return results
def print_keys(self):
    with open("C:\\Account.json", "r+") as jsonFile:
        data = json.load(jsonFile, object_pairs_hook=OrderedDict)
    names = self.extract_key(data)
    return names
I'm expecting to get the elements after "accounts": [ but it wont go thru because it treats "accounts" as a unicode instead of a list or array.
Your first check,
if type(obj) == type(OrderedDict()) or isinstance(obj, list):
succeeds for the top-level object, so the loop does reach "accounts"; that part is fine.
After that, you just need to collect the values:
for k, v in obj.items():
    if obj.items():
        results.append(v)
results is then loaded with:
[OrderedDict([('account', OrderedDict([('basicDetails', OrderedDict([('accountId', OrderedDict([('acctName', 'Test A'), ('acctNumber', 'Test B')])), ('accountBranchId', OrderedDict([('branchName', 'Test C'), ('brancNumber', 'Test D')])), ('cusName', 'Test E')])), ('otherDetails', OrderedDict([('dateCreated', '1999-10-01'), ('dateClosed', '2000-10-01')]))]))])], OrderedDict([('testId', None), ('version', None)]), OrderedDict([('overallStatus', 'S'), ('messages', None)]), None]
Of course, results must be declared before the loop, for example:
results = []
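Note that print(type(k)) in the question prints the type of the key, which is always a string (unicode in Python 2); the list is the value v. To reach every element, including those inside lists such as "accounts", a recursive walk has to handle dictionaries and lists separately, because lists have no .items() method. The following is a minimal sketch written for Python 3; the name iter_leaves is an assumption, and the JSON is a shortened version of the sample file.

```python
import json
from collections import OrderedDict

def iter_leaves(obj, path=()):
    """Yield (path, value) for every scalar, descending into dicts and lists."""
    if isinstance(obj, dict):  # OrderedDict is a dict subclass, so it matches too
        for key, value in obj.items():
            yield from iter_leaves(value, path + (key,))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            yield from iter_leaves(value, path + (index,))
    else:
        yield path, obj

# Shortened version of the sample file from the question.
raw = """
{
  "accounts": [{"account": {"basicDetails": {"cusName": "Test E"}}}],
  "status": {"overallStatus": "S", "messages": null},
  "_links": null
}
"""
data = json.loads(raw, object_pairs_hook=OrderedDict)

leaves = list(iter_leaves(data))
for path, value in leaves:
    print(path, value)
```

Using isinstance instead of comparing type(...) values also avoids the type(v) == type(list) comparison in the question, which compares against the type of the list class itself and so is never true for an actual list.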
