Error while adding key/value to Python Dict in nested loop - python

I have a JSON structure as follows:
{
"_id" : ObjectId("asdasda156121s"),
"Hp" : {
"bermud" : [
{
"abc" : {
"gfh" : 1,
"fgh" : 0.0,
"xyz" : [
{
"kjl" : "0",
"bnv" : 0,
}
],
"xvc" : "bv",
"hgth" : "INnn",
"sdf" : 0,
}
}
},
{
"abc" : {
"gfh" : 1,
"fgh" : 0.0,
"xyz" : [
{
"kjl" : "0",
"bnv" : 0,
}
],
"xvc" : "bv",
"hgth" : "INnn",
"sdf" : 0,
}
}
},
..
I am trying to parse this JSON and add a new value under the key 'cat' to each object inside 'xyz'. Below is my Python code.
data = []
for x in a:
    for y in x['Hp'].values():
        for z in y:
            for k in z['abc']['xyz']:
                for m in data:
                    det = m['response']
                    # Some processing with det whose output is stored in s
                    k['cat'] = s
    print(x)
However, when x is printed, only the last value of s appears throughout the whole dictionary, whereas s takes a different value on each iteration. It's obvious that the 'cat' key is being overwritten every time the loop goes round, but I can't find a way to fix it. What mistake am I making?
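One way to keep every value, assuming the goal is to store all results of the inner loop on each 'xyz' entry, is to collect them in a list instead of assigning a single value. This is only a sketch: `process` is a hypothetical stand-in for the elided processing step, and the sample `doc` and `data` are made up for illustration.

```python
def process(det):
    # Hypothetical placeholder for "some processing with det"
    return det.upper()

# Made-up sample input shaped like the question's data
data = [{"response": "a"}, {"response": "b"}]
doc = {"Hp": {"bermud": [{"abc": {"xyz": [{"kjl": "0", "bnv": 0}]}}]}}

for y in doc["Hp"].values():
    for z in y:
        for k in z["abc"]["xyz"]:
            # Build the full list once per entry, so nothing is overwritten
            k["cat"] = [process(m["response"]) for m in data]

print(doc["Hp"]["bermud"][0]["abc"]["xyz"][0]["cat"])
```

Assigning `k['cat'] = s` inside the innermost loop replaces the previous value on every pass, which is why only the last one survives; a list (or a dict keyed by something unique) keeps them all.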

Related

MongoEngine not deleting all documents

I have some unit tests which submit info to a server, which saves it into a document via MongoEngine. At the end of the test, I want to delete all of the documents made by the test:
@router.delete("/all", summary="Delete all jobs in an organization")
async def delete_all_jobs(job_data: AuthorizedResource = Depends(CanActOnResource("delete", "jobs"))):
    MongoJob.objects(organization=job_data.organization).delete()
However when I run this endpoint, some of the documents are only partially deleted:
This is what the JSON looks like before being deleted:
{
"_id" : "242d07ac-eafb-4875-a8f4-8ec89c7bc21f",
"_cls" : "MongoJob",
"_created_by" : "tom.mclean",
"_date_created" : ISODate("2022-02-24T08:23:50.943Z"),
"_date_modified" : ISODate("2022-02-24T08:25:02.062Z"),
"_modified_by" : "tom.mclean",
"client_info" : {
"protocol" : "tcp",
"interface" : "0.0.0.0",
"port" : 0
},
"grib_data" : {
"grib_dir_clim" : "X:\\Weather_Files\\Climatology",
"grib_dir_wind" : "X:\\Weather_Files\\NOAA_Forcasts",
"grib_dir_wave" : "X:\\Weather_Files\\NOAA_Forcasts"
},
"organization" : "8b50d3f2-03fe-4aca-9cf6-9922854f2989",
"output_dir" : "C:\\Users\\Tom.Mclean\\src\\routingserver\\WeatherRouting\\WeatherRouting\\..\\output",
"polars" : [
"5d19d7d0-eba2-49e5-8719-760d352d50dc"
],
"result" : {
"costs" : {
"total_cost" : null,
"fuel_cost" : null,
"hire_cost" : null
}
},
"route_form" : {
"waypoints" : [
{
"type" : "Waypoint",
"lon" : -7.25,
"lat" : 49.42,
"normal_deviation" : 0.2
},
{
"type" : "Waypoint",
"lon" : -50.0,
"lat" : 40.0,
"normal_deviation" : 0.0
}
],
"start_time" : ISODate("2022-02-24T08:23:50.842Z"),
"arrival_window" : {
"early" : null,
"late" : null
},
"max_tws" : 40.0,
"max_lat" : 65.0,
"min_lat" : -40.0,
"max_speed" : 16.0,
"min_speed" : 8.0,
"great_circle" : false,
"objective_funcs" : [
{
"hire_cost" : 16000.0,
"fuel_cost" : 550.0
}
],
"decision_time" : 24.0,
"course_change_angle" : 15.0,
"speed_step" : 0.5
},
"ship" : "f2775ef8-c58d-4aa3-a6a0-b82539535e88",
"status" : "FAILED",
"wave_data" : false
}
And then after running that end point of the API, some of the documents are deleted however some are left with just three fields:
{
"_id" : "5f04ffc3-45a3-4652-a79d-68b37e737268",
"_date_modified" : ISODate("2022-02-24T15:13:28.013Z"),
"status" : "FAILED"
}
If I run the unit tests in debug mode and pause on the line which calls the delete endpoint and then run it later on, it safely deletes all the documents:
@classmethod
def tearDownClass(cls) -> None:
    # TODO Once jobs can be deleted, clear test jobs from the routing server
    loop = asyncio.new_event_loop()
    loop.run_until_complete(cls.oauth.get_new_access_token())
    organization_path = cls.api._organization_path
    pathname = f"{organization_path}/jobs/all"
    loop.run_until_complete(cls.api.delete(pathname, token=cls.oauth.access_token))  # <- PAUSE HERE
How can I safely ensure that all of the documents are deleted? I could add a pause to the unit test before calling the delete endpoint, but this does not feel right and I should just try and fix the issue first.

get data from a json

I want to get the data from a json. I have the idea of a loop to access all levels.
I have only been able to pull data from a single block.
print(output['body']['data'][0]['list'][0]['outUcastPkts'])
How do I get the other data?
import json, urllib.request
data = urllib.request.urlopen("http://172.0.0.0/statistic").read()
output = json.loads(data)
for elt in output['body']['data']:
    print(output['body']['data'][0]['inUcastPktsAll'])
    for elt in output['list']:
        print(output['body']['data'][0]['list'][0]['outUcastPkts'])
{
"body": {
"data": [
{
"inUcastPktsAll": 3100617019,
"inMcastPktsAll": 7567,
"inBcastPktsAll": 8872,
"outPktsAll": 8585575441,
"outUcastPktsAll": 8220240108,
"outMcastPktsAll": 286184143,
"outBcastPktsAll": 79151190,
"list": [
{
"outUcastPkts": 117427359,
"outMcastPkts": 1990586,
"outBcastPkts": 246120
},
{
"outUcastPkts": 0,
"outMcastPkts": 0,
"outBcastPkts": 0
}
]
},
{
"inUcastPktsAll": 8269483865,
"inMcastPktsAll": 2405765,
"inBcastPktsAll": 124466,
"outPktsAll": 3101194852,
"outUcastPktsAll": 3101012296,
"outMcastPktsAll": 173409,
"outBcastPktsAll": 9147,
"list": [
{
"outUcastPkts": 3101012296,
"outMcastPkts": 90488,
"outBcastPkts": 9147
},
{
"outUcastPkts": 0,
"outMcastPkts": 0,
"outBcastPkts": 0
}
]
}
],
"msgs": [ "successful" ]
},
"header": {
"opCode": "1",
"token": "",
"state": "",
"version": 1
}
}
output = json.loads(data) #Type of output is a dictionary.
#Try to use ".get()" method.
print(output.get('body')) #Get values of key 'body'
print(output.get('body').get('data')) #Get a list of key 'data'
If a key doesn't exist, the '.get()' method will return None.
https://docs.python.org/3/library/stdtypes.html#dict.get
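Note that chaining `.get()` calls still fails if an outer key is missing, because the first call returns None and None has no `.get()`. A small sketch of a safer pattern, using the second argument of `dict.get` as a default; the sample `output` is a trimmed copy of the question's data and the 'footer' key is a made-up example of a missing key:

```python
# Trimmed sample shaped like the question's JSON
output = {
    "body": {"data": [{"inUcastPktsAll": 3100617019}]},
    "header": {"opCode": "1"},
}

# Supplying {} / [] as defaults keeps the chain safe when a key is absent
data = output.get("body", {}).get("data", [])
missing = output.get("footer")                     # no default: returns None
rows = output.get("footer", {}).get("data", [])    # missing key: empty list

print(data[0]["inUcastPktsAll"], missing, rows)
```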
In python you can easily iterate over the objects of a list like so:
>>> l = [1, 2, 3, 7]
>>> for elem in l:
...     print(elem)
...
1
2
3
7
This works regardless of what kind of object you have in the list (integers, tuples, dictionaries). With that in mind, your solution was not far off; you only need to make the following changes:
for entry in output['body']['data']:
    print(entry['inUcastPktsAll'])
    for list_element in entry['list']:
        print(list_element['outUcastPkts'])
This will give you the following for the json object you have provided:
3100617019
117427359
0
8269483865
3101012296
0

python how to search a string, count values and group by in json

I have a python program calling an API that receives the result as below:
{
"result": [
{
"company" : "BMW",
"model" : "5"
},
{
"company" : "BMW",
"model" : "5"
},
{
"company" : "BMW",
"model" : "5"
},
{
"company" : "BMW",
"model" : "3"
},
{
"company" : "BMW",
"model" : "7"
},
{
"company" : "AUDI",
"model" : "A3"
},
{
"company" : "AUDI",
"model" : "A7"
},
]
}
Now my task is to identify the number of occurrences of elements from the list in JSON output and group them. The expected output should look like this:
{
"BMW" :
{
"5series" : 3,
"3series" : 1,
"7series" : 1,
},
"AUDI" :
{
"A3" : 1,
"A7" : 1,
},
"MERCEDES":
{
"EClass" : 0,
"SClass" : 0
}
}
I need to look up each "company" from a list of names. The list may include names that are sometimes absent from the JSON response, in which case the expected output should still include them with a count of 0. The "model" names (3, 5, 7, A3, etc.) are fixed, so we know those are the only ones that may or may not appear in the JSON API response.
For example, the list in the code below has 3 company names: companyname = ["BMW","AUDI","MERCEDES"]. However, the JSON API response may sometimes be missing one or more of them. In this case "MERCEDES" is missing, but the final output should still include "MERCEDES" with values of 0.
Here is what I have tried so far:
def modelcount():
    companyname = ["BMW", "AUDI", "MERCEDES"]
    url = apiurl
    # Send request
    apiresponse = requests.get(url, auth=(user, password), headers=headers, proxies=proxies)
    # Decode the JSON response into a dictionary and use the data
    data = apiresponse.json()
    print(len(data['result']))
    # Python identifiers cannot start with a digit, hence series3 rather than 3series
    series3 = 0
    series5 = 0
    series7 = 0
    A3 = 0
    A7 = 0
    EClass = 0
    SClass = 0
    modelcountjson = {}
    for name in companyname:
        for item in data['result']:
            models = {}
            if item['company'] == name:
                # The API returns model names as strings, so compare against '3', not 3
                if item['model'] == '3':
                    series3 = series3 + 1
                elif item['model'] == '5':
                    series5 = series5 + 1
                elif item['model'] == '7':
                    series7 = series7 + 1
        models['3series'] = series3
        models['5series'] = series5
        models['7series'] = series7
        # I still haven't written AUDI, MERCEDES above. This is where I feel I am writing inefficiently.
        modelcountjson[name] = models
    return jsonify(modelcountjson)
As the number of models grows, I am worried that the code will become redundant, with many for loops that may add performance overhead. I am looking for help achieving the end result in the most efficient way.
Thank you so much for your help.
A useful package for working directly with JSON-style dictionaries and lists is toolz (see documentation for more details). This way you can concisely group the data and count occurrences of each model while handling potentially missing data separately:
from toolz import itertoolz
result = {
"result": [
{
"company" : "BMW",
"model" : "5"
},
{
"company" : "BMW",
"model" : "5"
},
{
"company" : "BMW",
"model" : "5"
},
{
"company" : "BMW",
"model" : "3"
},
{
"company" : "BMW",
"model" : "7"
},
{
"company" : "AUDI",
"model" : "A3"
},
{
"company" : "AUDI",
"model" : "A7"
},
]
}
final_output = {}
grouped_result = itertoolz.groupby('company', result['result'])
if 'MERCEDES' not in grouped_result:
    final_output['MERCEDES'] = {
        'EClass': 0,
        'SClass': 0
    }
for key, value in grouped_result.items():
    models = itertoolz.pluck('model', value)
    final_output[key] = itertoolz.frequencies(models)
The output results in:
{'AUDI': {'A3': 1, 'A7': 1}, 'BMW': {'3': 1, '5': 3, '7': 1}, 'MERCEDES': {'EClass': 0, 'SClass': 0}}
You could go for a bit of a separation of code and config:
conf = {
    'BMW': {'format': '{}series', 'keys': ['3', '5', '7']},
    'AUDI': {'format': '{}', 'keys': ['A3', 'A7']},
    'MERCEDES': {'format': '{}Class', 'keys': ['E', 'S']},
}

def modelcount():
    # retrieve `data`
    # ...
    result = {
        k: {
            v['format'].format(key): 0 for key in v['keys']
        } for k, v in conf.items()
    }
    for car in data['result']:
        com = car['company']
        mod = car['model']
        key = conf[com]['format'].format(mod)
        result[com][key] += 1
    for com in result:
        result[com]['Total'] = sum(result[com].values())
    return result
>>> modelcount()
{'BMW': {'3series': 1, '5series': 3, '7series': 1, 'Total': 5},
 'AUDI': {'A3': 1, 'A7': 1, 'Total': 2},
 'MERCEDES': {'EClass': 0, 'SClass': 0, 'Total': 0}}
This way, for more companies and models, you will only have to touch the conf, not the code. The time complexity of this is O(m+n) with m the total number of distinct models and n the number of cars in the API response.
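For comparison, here is a standard-library sketch of the same single-pass idea using `collections.Counter`. It is not the code from either answer: the `expected` table and the `count_models` helper are illustrative names, and the sample `cars` list is copied from the question. Because a Counter returns 0 for keys it has never seen, companies and models missing from the response come out as zeros without special handling.

```python
from collections import Counter

# Illustrative config: raw model name from the API -> label wanted in the output
expected = {
    "BMW": {"3": "3series", "5": "5series", "7": "7series"},
    "AUDI": {"A3": "A3", "A7": "A7"},
    "MERCEDES": {"E": "EClass", "S": "SClass"},
}

def count_models(cars, expected):
    # One pass over the response: count each (company, model) pair
    seen = Counter((car["company"], car["model"]) for car in cars)
    # One pass over the config: missing pairs read as 0 from the Counter
    return {
        company: {label: seen[(company, model)] for model, label in models.items()}
        for company, models in expected.items()
    }

cars = [
    {"company": "BMW", "model": "5"},
    {"company": "BMW", "model": "5"},
    {"company": "BMW", "model": "5"},
    {"company": "BMW", "model": "3"},
    {"company": "BMW", "model": "7"},
    {"company": "AUDI", "model": "A3"},
    {"company": "AUDI", "model": "A7"},
]
counts = count_models(cars, expected)
print(counts)
```

Unlike a direct `result[com][key] += 1`, this version silently ignores companies or models that are not listed in `expected` rather than raising a KeyError; whether that is desirable depends on how strictly the API response should be validated.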

I want to convert sample JSON data into nested JSON using a specific key-value in Python

I have below sample data in JSON format :
project_cost_details is my database result set after querying.
{
"1": {
"amount": 0,
"breakdown": [
{
"amount": 169857,
"id": 4,
"name": "SampleData",
"parent_id": "1"
}
],
"id": 1,
"name": "ABC PR"
}
}
Here is full json : https://jsoneditoronline.org/?id=2ce7ab19af6f420397b07b939674f49c
Expected output :https://jsoneditoronline.org/?id=56a47e6f8e424fe8ac58c5e0732168d7
I created this sample JSON using loops in my code, but I am stuck on how to convert it to the expected JSON format. I am getting a flat, sequential structure and need to convert it to a tree-like, nested JSON format.
Trying in Python :
project_cost = {}
for cost in project_cost_details:
    if cost.get('Parent_Cost_Type_ID'):
        project_id = str(cost.get('Project_ID'))
        parent_cost_type_id = str(cost.get('Parent_Cost_Type_ID'))
        if project_id not in project_cost:
            project_cost[project_id] = {}
        if "breakdown" not in project_cost[project_id]:
            project_cost[project_id]["breakdown"] = []
        if 'amount' not in project_cost[project_id]:
            project_cost[project_id]['amount'] = 0
        project_cost[project_id]['name'] = cost.get('Title')
        project_cost[project_id]['id'] = cost.get('Project_ID')
        if parent_cost_type_id == cost.get('Cost_Type_ID'):
            project_cost[project_id]['amount'] += int(cost.get('Amount'))
        #if parent_cost_type_id is None:
        project_cost[project_id]["breakdown"].append(
            {
                'amount': int(cost.get('Amount')),
                'name': cost.get('Name'),
                'parent_id': parent_cost_type_id,
                'id' : cost.get('Cost_Type_ID')
            }
        )
This code produces the sample JSON. Ideally I would like the same code to produce the desired format directly.
I also tried the solution mentioned here: https://adiyatmubarak.wordpress.com/2015/10/05/group-list-of-dictionary-data-by-particular-key-in-python/
I found an approach to convert the sample JSON to the expected JSON:
data = [
{ "name" : "ABC", "parent":"DEF", },
{ "name" : "DEF", "parent":"null" },
{ "name" : "new_name", "parent":"ABC" },
{ "name" : "new_name2", "parent":"ABC" },
{ "name" : "Foo", "parent":"DEF"},
{ "name" : "Bar", "parent":"null"},
{ "name" : "Chandani", "parent":"new_name", "relation": "rel", "depth": 3 },
{ "name" : "Chandani333", "parent":"new_name", "relation": "rel", "depth": 3 }
]
result = {x.get("name"): x for x in data}
#print(result)
tree = []
for a in data:
    #print(a)
    if a.get("parent") in result:
        parent = result[a.get("parent")]
    else:
        parent = ""
    if parent:
        if "children" not in parent:
            parent["children"] = []
        parent["children"].append(a)
    else:
        tree.append(a)
Reference help : http://jsfiddle.net/9FqKS/ this is a JavaScript solution i converted to Python
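The same parent-lookup technique can be written a little more compactly with `dict.get` and `dict.setdefault`. This is a sketch only: `build_tree` is an illustrative name, and the `rows` sample is a shortened version of the data above. Like the original, it attaches children by mutating the input dictionaries.

```python
def build_tree(rows):
    # Index every row by name so a parent can be found in constant time
    by_name = {row["name"]: row for row in rows}
    roots = []
    for row in rows:
        parent = by_name.get(row.get("parent"))
        if parent is None:
            # Parent is unknown (for example the string "null"): treat as a root
            roots.append(row)
        else:
            parent.setdefault("children", []).append(row)
    return roots

rows = [
    {"name": "ABC", "parent": "DEF"},
    {"name": "DEF", "parent": "null"},
    {"name": "new_name", "parent": "ABC"},
    {"name": "Bar", "parent": "null"},
]
tree = build_tree(rows)
print([root["name"] for root in tree])
```

Here "DEF" and "Bar" end up as roots, "ABC" is nested under "DEF", and "new_name" is nested under "ABC". The order of the rows does not matter, because the index is built before any child is attached.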
It seems that you want to get a list of values from a dictionary.
result = [value for key, value in project_cost_details.items()]

logical error in python dictionary traversal

One of my queries in MongoDB through PyMongo returns:
{ "_id" : { "origin" : "ABE", "destination" : "DTW", "carrier" : "EV" }, "Ddelay" : -5.333333333333333,
"Adelay" : -12.666666666666666 }
{ "_id" : { "origin" : "ABE", "destination" : "ORD", "carrier" : "EV" }, "Ddelay" : -4, "Adelay" : 14 }
{ "_id" : { "origin" : "ABE", "destination" : "ATL", "carrier" : "EV" }, "Ddelay" : 6, "Adelay" : 14 }
I am traversing the result as shown below in my Python module, but I get only two of the three results. I suspect I should not be using len(results) as I currently do. Can you please help me traverse the result correctly? I need to display all three results in the resulting JSON document in the web UI.
Thank you.
code:
pipe = [{ '$match': { 'origin': {"$in" : [origin_ID]}}},
        {"$group" :{'_id': { 'origin':"$origin", 'destination': "$dest",'carrier':"$carrier"},
                    "Ddelay" : {'$avg' :"$dep_delay"},"Adelay" : {'$avg' :"$arr_delay"}}}, {"$limit" : 4}]
results = connect.aggregate(pipeline=pipe)
#pdb.set_trace()
DATETIME_FORMAT = '%Y-%m-%d'
for x in range(len(results)):
    origin = (results['result'][x])['_id']['origin']
    destination = (results['result'][x])['_id']['destination']
    carrier = (results['result'][x])['_id']['carrier']
    Adelay = (results['result'][x])['Adelay']
    Ddelay = (results['result'][x])['Ddelay']
    obj = {'Origin':origin,
           'Destination':destination,
           'Carrier': carrier,
           'Avg Arrival Delay': Adelay,
           'Avg Dep Delay': Ddelay}
    json_result.append(obj)
return json.dumps(json_result,indent= 2, sort_keys=False,separators=(',',':'))
Pymongo returns result in format:
{u'ok': 1.0, u'result': [...]}
So you should iterate over result:
for x in results['result']:
    ...
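In your code you take the length of the dict, not the length of the result list. With the response format shown above, len(results) is 2 (the keys 'ok' and 'result'), which is why only two rows come out.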
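A self-contained sketch of the corrected loop, assuming the legacy dict-shaped response shown above. The `results` literal is a hand-written sample built from the three rows in the question; newer PyMongo versions return a cursor from aggregate() instead, in which case you would iterate over the cursor directly.

```python
import json

# Hand-written sample in the legacy response shape, using the question's rows
results = {
    "ok": 1.0,
    "result": [
        {"_id": {"origin": "ABE", "destination": "DTW", "carrier": "EV"},
         "Ddelay": -5.333333333333333, "Adelay": -12.666666666666666},
        {"_id": {"origin": "ABE", "destination": "ORD", "carrier": "EV"},
         "Ddelay": -4, "Adelay": 14},
        {"_id": {"origin": "ABE", "destination": "ATL", "carrier": "EV"},
         "Ddelay": 6, "Adelay": 14},
    ],
}

json_result = []
# Iterate over the list itself; no index arithmetic or len() needed
for row in results["result"]:
    key = row["_id"]
    json_result.append({
        "Origin": key["origin"],
        "Destination": key["destination"],
        "Carrier": key["carrier"],
        "Avg Arrival Delay": row["Adelay"],
        "Avg Dep Delay": row["Ddelay"],
    })

print(json.dumps(json_result, indent=2))
```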
