I have a JSON like below:
[
[
{
"subject": "Subject_1",
"qapali_correct_count": "12"
},
{
"subject": "Subject_2",
"qapali_correct_count": "9"
}
],
[
{
"subject": "Subject_1",
"qapali_correct_count": "14"
},
{
"subject": "Subject_2",
"qapali_correct_count": "15"
}
],
[
{
"subject": "Subject_1",
"qapali_correct_count": "11"
},
{
"subject": "Subject_2",
"qapali_correct_count": "12"
}
]
]
I have to output every subject's average, for example: subject_1 = 12.33, subject_2 = 12
I tried the code below. It works, but I wonder whether there is a faster or more efficient way to achieve this.
import json

results = Result.objects.filter(exam=obj_exam, grade=obj_grade)
student_count = results.count()
final_data = {}
for result in results:
    st_naswer_js = json.loads(result.student_answer_data_finish)
    for rslt in st_naswer_js:
        previus_data = final_data.get(rslt['subject'], 0)
        previus_data = previus_data + int(rslt['qapali_correct_count'])
        final_data.update({rslt['subject']: previus_data})
for dudu, data in final_data.items():
    tmp_data = data / student_count
    final_data[dudu] = tmp_data
print(final_data)
Please note that it is a Django project.
The code in your question has several non-relevant bits. I'll stick to this part:
I have to output every subject's average, for example: subject_1 = 12.33, subject_2 = 12
I'll assume the list of results above is in a list called results. If it's json-loaded per student, handling that is probably already in your existing code. Primary focus below is on subject_score.
Store the scores of each subject in a dictionary whose values are lists of scores. I'm using a defaultdict here, with list as the default factory, so when a key which doesn't exist is accessed, its value gets initialised to an empty list (rather than raising a KeyError, which would happen with a standard dictionary).
import collections

subject_score = collections.defaultdict(list)
for result in results:
    for stud_score in result:
        # add each score to the list of scores for that subject
        # use int or float below as needed
        subject_score[stud_score['subject']].append(int(stud_score['qapali_correct_count']))
# `subject_score` is now:
# defaultdict(list, {'Subject_1': [12, 14, 11], 'Subject_2': [9, 15, 12]})
averages = {sub: sum(scores) / len(scores) for sub, scores in subject_score.items()}
averages is:
{'Subject_1': 12.333333333333334, 'Subject_2': 12.0}
Or you can print or save to file, db, etc. as needed.
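If the per-subject lists of scores are not needed afterwards, the same averages can be computed from running totals and counts. The sketch below assumes each student's data arrives as a JSON string, as the question's student_answer_data_finish field suggests; raw_rows and subject_averages are hypothetical names used only for illustration.

```python
import json
from collections import defaultdict

# Hypothetical stand-in for the JSON strings stored per student.
raw_rows = [
    '[{"subject": "Subject_1", "qapali_correct_count": "12"},'
    ' {"subject": "Subject_2", "qapali_correct_count": "9"}]',
    '[{"subject": "Subject_1", "qapali_correct_count": "14"},'
    ' {"subject": "Subject_2", "qapali_correct_count": "15"}]',
    '[{"subject": "Subject_1", "qapali_correct_count": "11"},'
    ' {"subject": "Subject_2", "qapali_correct_count": "12"}]',
]


def subject_averages(rows):
    """Average each subject's score using running totals and counts."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for raw in rows:
        for entry in json.loads(raw):
            subject = entry["subject"]
            totals[subject] += int(entry["qapali_correct_count"])
            counts[subject] += 1
    return {subject: totals[subject] / counts[subject] for subject in totals}


averages = subject_averages(raw_rows)
print({subject: round(avg, 2) for subject, avg in averages.items()})
```

Dividing by a per-subject count rather than the total number of students only changes the result when some student has no entry for a subject.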
I want to implement a use case with a hash map in Python. I will have a JSON object of type dict.
x={
"name": "Jim",
"age": 30,
"married": True,
"phonenumber": "123456"
"pets": None,
"cars": [
{"model": "toyota", "mpg": 23.5},
{"model": "kia", "mpg": 21.1}
]
}
if (x["married"] == True):
    h.put(x[phonenumber]) // I want to make phonenumber the key and the entire dict x the value, using something like Map<String, dict> h = new HashMap<>();
When I do h.get(phonenumber), I want the entire dict x as the output.
How can I implement this use case in Python?
A dict is Python's version of a hash map. To do what you want, try the following:
m = {}
if x["married"]:
    m[x["phonenumber"]] = x
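With several records, the same idea can be written as a dict comprehension, and a lookup by phone number then returns the whole record. This is a minimal sketch; the second record and the names records and by_phone are made up for illustration.

```python
# Two sample records; the second one is hypothetical.
records = [
    {"name": "Jim", "age": 30, "married": True, "phonenumber": "123456"},
    {"name": "Ann", "age": 27, "married": False, "phonenumber": "654321"},
]

# Key each married person's full record by phone number.
by_phone = {rec["phonenumber"]: rec for rec in records if rec["married"]}

# Indexing returns the entire dict, like h.get(phonenumber) in Java.
print(by_phone["123456"])

# dict.get returns None instead of raising KeyError for unknown keys.
print(by_phone.get("654321"))
```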
As has already been said, the dict() is Python's implementation of hash maps / hash tables. I've assumed that you are going to have many records, all of which will be stored in a dict(), and that you are adding records according to your example. I have not error-checked for the existence of a phone number in the record; since you're using the phone number as the key, you should already have ensured that there is an actual phone number in the information you are adding.
#just one record
x={
"name": "Jim",
"age": 30,
"married": True,
"phonenumber": "123456", #you left out a comma here
"pets": None,
"cars": [
{"model": "toyota", "mpg": 23.5},
{"model": "kia", "mpg": 21.1}
]
}
my_dict = dict()  # this would hold ALL of your records

def add_to_mydict(my_dict: dict, new_record: dict):
    if new_record["married"]:
        phone = new_record.pop("phonenumber")
        y = {phone: new_record}
        my_dict.update(y)

add_to_mydict(my_dict, x)
print(my_dict)
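my_dict would be the master dictionary in which you will store your data. Note that pop removes the phonenumber key from the stored record, since the phone number now serves as the dictionary key. If the data has to persist or grows large, SQLite makes a much better database than in-memory dicts.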
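As a rough illustration of the SQLite suggestion, the sketch below stores each record as a JSON string keyed by phone number, using only the standard library. The table name people, its columns, and the helper names add_record and get_record are assumptions made for this example, not part of the original answer.

```python
import json
import sqlite3

# An in-memory database for demonstration; pass a file path to persist.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (phonenumber TEXT PRIMARY KEY, record TEXT NOT NULL)"
)


def add_record(conn, record):
    """Store the record as JSON, keyed by phone number, if married."""
    if record["married"]:
        conn.execute(
            "INSERT OR REPLACE INTO people (phonenumber, record) VALUES (?, ?)",
            (record["phonenumber"], json.dumps(record)),
        )


def get_record(conn, phone):
    """Return the stored dict for a phone number, or None if absent."""
    row = conn.execute(
        "SELECT record FROM people WHERE phonenumber = ?", (phone,)
    ).fetchone()
    return json.loads(row[0]) if row else None


x = {"name": "Jim", "age": 30, "married": True, "phonenumber": "123456", "pets": None}
add_record(conn, x)
print(get_record(conn, "123456"))
```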
I am trying to update values based off an API call on a server. I have a list of IDs that I have pulled from a previous call saved in a list. I am iterating through the 4 values in the list and doing a new API call to grab some alerts in JSON. If the part of the JSON I'm looking for is blank I want the loop to continue but if there is a value then I want it to find and replace text so I can use it on the next step to do a PUT API call.
I can't figure out why the loop continues to give me ALL of the values.
My code:
site_ids = []
for ids in parsed['resources']:
    site_ids.append(ids['id'])
This gives me a list of [6, 5, 7, 1], which I then use in my next API call to get the alerts:
for sid in site_ids:
    smtp_url = "my url" + str(sid) + "API endpoint"
    smtp_payload = {}
    smtp_headers = {
        'Accept': 'application/json;charset=UTF-8',
        'Authorization': 'my stuff'
    }
    smtp_response = requests.request("GET", smtp_url, headers=smtp_headers, data=smtp_payload, verify=False)
    smtp_text = smtp_response.text
    smtp_json = json.loads(smtp_text)
    print(json.dumps(smtp_json["resources"], indent=4, sort_keys=True))
This gives me the results for each JSON
[
{
"name": "Test1",
"notification": "SMTP",
"recipients": [
"abc#abc.com"
],
"relayServer": "1.2.3.4",
"senderEmailAddress": "test#abc.com"
},
{
"name": "Test2",
"notification": "SMTP",
"recipients": [
"abc#abc.com"
],
"relayServer": "1.2.3.4",
"senderEmailAddress": "test#abc.com"
}
]
[
{
"name": "Test3",
"notification": "SMTP",
"recipients": [
"abc#abc.com"
],
"relayServer": "1.2.3.4",
"senderEmailAddress": "test#abc.com"
},
{
"name": "Test4",
"notification": "SMTP",
"recipients": [
"abc#abc.com"
],
"relayServer": "1.2.3.4",
"senderEmailAddress": "test#abc.com"
}
]
[]
[]
At the end you can see the last two sites that it iterated through are blank showing only the []
Everything up to this point is working as I expected. This is where I'm running into issues though. I'm trying to take that response in a further if statement that essentially ignores the results where the "resources" block is empty [] but adds the sid that was used from the call where there actually is data. My problem is that I'm still getting all 4 sid no matter how I do it.
When I use this:
site_ids_with_alerts = []
if smtp_json['resources'] != None:
    site_ids_with_alerts.append(sid)
print(site_ids_with_alerts)
I still get a full list of [6, 5, 7, 1]
I was EXPECTING to get [6, 5]
I have also tried these below as well but every time I get the same results:
site_ids_with_alerts = []
site_ids_with_alerts = [sid if smtp_json['resources'] != "[]" else None]
if smtp_json['resources'] == None:
    None
else:
    site_ids_with_alerts.append(sid)
if smtp_json['resources'] == '[]':
    None
else:
    site_ids_with_alerts.append(sid)
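The issue was how I was testing for the empty value in the JSON. An empty "resources" array is parsed into an empty Python list, which is neither None nor the string "[]", so every comparison I tried passed. I found the answer here
So I changed the code to look like this:
if not len(smtp_json['resources']) == 0:
    site_ids_with_alerts.append(sid)
Which gives me the list I wanted of [6, 5]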
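The length check can also be written as a plain truthiness test, because an empty list is falsy in Python. The sketch below uses a hypothetical responses dict in place of the real API calls, with the result list created once before the loop.

```python
# Hypothetical parsed responses keyed by site id, mirroring the output above.
responses = {
    6: {"resources": [{"name": "Test1"}, {"name": "Test2"}]},
    5: {"resources": [{"name": "Test3"}, {"name": "Test4"}]},
    7: {"resources": []},
    1: {"resources": []},
}

site_ids_with_alerts = []  # created once, before the loop
for sid, smtp_json in responses.items():
    # An empty list is falsy, so sites without alerts are skipped.
    if smtp_json.get("resources"):
        site_ids_with_alerts.append(sid)

print(site_ids_with_alerts)

# The comparisons tried earlier, evaluated against an empty list:
print([] != None)  # True
print([] == "[]")  # False
```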
I have a JSON that has around 50k items where each has an id and name as follows (I cut the data):
[
{
"id": 2,
"name": "Cannonball"
},
{
"id": 6,
"name": "Cannon base"
},
{
"id": 8,
"name": "Cannon stand"
},
{
"id": 10,
"name": "Cannon barrels"
},
{
"id": 12,
"name": "Cannon furnace"
},
{
"id": 28,
"name": "Insect repellent"
},
{
"id": 30,
"name": "Bucket of wax"
}]
Now, I have an array of item names and I want to find the corresponding id and to add it into an id array.
For example, I have itemName = ['Cannonball', 'Cannon furnace', 'Bucket of wax']
I would like to search inside the JSON and to return id_array = [2, 12, 30]
I wrote the following code which does the work however it seems like a huge waste of energy:
import json

file_name = "database.json"
with open(file_name, 'r') as f:
    document = json.loads(f.read())
items = ['Cannonball', 'Cannon furnace', 'Bucket of wax']
id_array = []
for item_name in items:
    for entry in document:
        if item_name == entry['name']:
            id_array.append(entry['id'])
Is there any faster method that can do it?
The example above shows only 3 results but I'm talking about a few thousand and it feels like a waste to iterate over 1k+ results.
Thank you
Build a lookup dictionary mapping names to ids and then look up the names on that dictionary:
lookup = { d["name"] : d["id"] for d in document}
items = ['Cannonball', 'Cannon furnace','Bucket of wax']
result = [lookup[item] for item in items]
print(result)
Output
[2, 12, 30]
The time complexity of this approach is O(n + m), where n is the number of elements in the document (len(document)) and m is the number of items (len(items)). In contrast, your approach is O(nm).
An alternative approach that uses less space is to filter out those names that are not in items:
items = ['Cannonball', 'Cannon furnace', 'Bucket of wax']
item_set = set(items)
lookup = {d["name"]: d["id"] for d in document if d["name"] in item_set}
result = [lookup[item] for item in items]
This approach has the same time complexity as the previous one.
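Both versions raise a KeyError if a requested name is missing from the document. A minimal sketch of one way to tolerate that, assuming missing names should simply be collected rather than treated as errors (the name 'Unknown item' is invented for the example):

```python
# A shortened copy of the sample document from the question.
document = [
    {"id": 2, "name": "Cannonball"},
    {"id": 12, "name": "Cannon furnace"},
    {"id": 30, "name": "Bucket of wax"},
]
items = ["Cannonball", "Unknown item", "Bucket of wax"]

# One pass over the document builds the name -> id index.
lookup = {d["name"]: d["id"] for d in document}

# Membership tests on a dict are O(1) on average.
found = [lookup[name] for name in items if name in lookup]
missing = [name for name in items if name not in lookup]

print(found)
print(missing)
```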
You could generate a dict which maps name to id first:
file_name = "database.json"
with open(file_name, 'r') as f:
    document = json.loads(f.read())
name_to_id = {item["name"]: item["id"] for item in document}
Now you can just iterate over items:
items = ['Cannonball', 'Cannon furnace','Bucket of wax']
id_array = [ name_to_id[name] for name in items]
I have the following JSON that I extracted using requests with Python and json.loads. The whole JSON basically repeats itself with changes in the ID and names. It has a lot of information, but I'm just posting a small sample as an example:
"status":"OK",
"statuscode":200,
"message":"success",
"apps":[
{
"id":"675832210",
"title":"AGED",
"desc":"No annoying ads&easy to play",
"urlImg":"https://test.com/pImg.aspx?b=675832&z=1041813&c=495181&tid=API_MP&u=https%3a%2f%2fcdna.test.com%2fbanner%2fwMMUapCtmeXTIxw_square.png&q=",
"urlImgWide":"https://cdna.test.com/banner/sI9MfGhqXKxVHGw_rectangular.jpeg",
"urlApp":"https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=",
"androidPackage":"com.agedstudio.freecell",
"revenueType":"cpi",
"revenueRate":"0.10",
"categories":"Card",
"idx":"2",
"country":[
"CH"
],
"cityInclude":[
"ALL"
],
"cityExclude":[
],
"targetedOSver":"ALL",
"targetedDevices":"ALL",
"bannerId":"675832210",
"campaignId":"495181210",
"campaignType":"network",
"supportedVersion":"",
"storeRating":"4.3",
"storeDownloads":"10000+",
"appSize":"34603008",
"urlVideo":"",
"urlVideoHigh":"",
"urlVideo30Sec":"https://cdn.test.com/banner/video/video-675832-30.mp4?rnd=1620699136",
"urlVideo30SecHigh":"https://cdn.test.com/banner/video/video-675832-30_o.mp4?rnd=1620699131",
"offerId":"5825774"
},
I don't need all that data, just a few fields like 'title', 'country', 'revenueRate' and 'urlApp', but I don't know if there is a way to extract only those.
My solution so far was to turn the JSON into a dataframe and then drop the columns; however, I wanted to find an easier solution.
My ideal final result would be a dataframe with the selected keys and arrays.
Does anybody know an easy solution for this problem?
Thanks
I assume you have that data as a dictionary, let's call it json_data. You can just iterate over the apps and write them into a list. Alternatively, you could obviously also define a class and initialize objects of that class.
EDIT:
I just found this answer: https://stackoverflow.com/a/20638258/6180150, which explains how to convert a list of dicts, like the one in my sample code, into a dataframe. See the adapted code below for a solution.
json_data = {
"status": "OK",
"statuscode": 200,
"message": "success",
"apps": [
{
"id": "675832210",
"title": "AGED",
"desc": "No annoying ads&easy to play",
"urlImg": "https://test.com/pImg.aspx?b=675832&z=1041813&c=495181&tid=API_MP&u=https%3a%2f%2fcdna.test.com%2fbanner%2fwMMUapCtmeXTIxw_square.png&q=",
"urlImgWide": "https://cdna.test.com/banner/sI9MfGhqXKxVHGw_rectangular.jpeg",
"urlApp": "https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=",
"androidPackage": "com.agedstudio.freecell",
"revenueType": "cpi",
"revenueRate": "0.10",
"categories": "Card",
"idx": "2",
"country": [
"CH"
],
"cityInclude": [
"ALL"
],
"cityExclude": [
],
"targetedOSver": "ALL",
"targetedDevices": "ALL",
"bannerId": "675832210",
"campaignId": "495181210",
"campaignType": "network",
"supportedVersion": "",
"storeRating": "4.3",
"storeDownloads": "10000+",
"appSize": "34603008",
"urlVideo": "",
"urlVideoHigh": "",
"urlVideo30Sec": "https://cdn.test.com/banner/video/video-675832-30.mp4?rnd=1620699136",
"urlVideo30SecHigh": "https://cdn.test.com/banner/video/video-675832-30_o.mp4?rnd=1620699131",
"offerId": "5825774"
},
]
}
filtered_data = []
for app in json_data["apps"]:
    app_data = {
        "id": app["id"],
        "title": app["title"],
        "country": app["country"],
        "revenueRate": app["revenueRate"],
        "urlApp": app["urlApp"],
    }
    filtered_data.append(app_data)
print(filtered_data)
# Output
d = [
{
'id': '675832210',
'title': 'AGED',
'country': ['CH'],
'revenueRate': '0.10',
'urlApp': 'https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q='
}
]
import pandas as pd
d = pd.DataFrame(filtered_data)
print(d)
# Output
id title country revenueRate urlApp
0 675832210 AGED [CH] 0.10 https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=
If your end goal is a dataframe, just load the dataframe and take the columns you want.
Assuming the JSON has been loaded into data:
df = pd.json_normalize(data['apps'])
yields
id title desc urlImg ... urlVideoHigh urlVideo30Sec urlVideo30SecHigh offerId
0 675832210 AGED No annoying ads&easy to play https://test.com/pImg.aspx?b=675832&z=1041813&... ... https://cdn.test.com/banner/video/video-675832... https://cdn.test.com/banner/video/video-675832... 5825774
[1 rows x 28 columns]
then if you want certain columns:
df_final = df[['title', 'desc', 'urlImg']]
title desc urlImg
0 AGED No annoying ads&easy to play https://test.com/pImg.aspx?b=675832&z=1041813&...
Use a dictionary comprehension to extract a dictionary of the key/value pairs you want:
import json
json_string="""{
"id":"675832210",
"title":"AGED",
"desc":"No annoying ads&easy to play",
"urlApp":"https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=",
"revenueRate":"0.10",
"categories":"Card",
"idx":"2",
"country":[
"CH"
],
"cityInclude":[
"ALL"
],
"cityExclude":[
]
}"""
json_dict = json.loads(json_string)
filter_fields=['title','country','revenueRate','urlApp']
dict_result = {key: json_dict[key] for key in json_dict if key in filter_fields}
json_elements = []
for key in dict_result:
    json_elements.append((key, json_dict[key]))
print(json_elements)
output:
[('title', 'AGED'), ('urlApp', 'https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q='), ('revenueRate', '0.10'), ('country', ['CH'])]
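The comprehension above handles a single app. Applied to the whole "apps" list, the same filter produces a list of small dicts, which is the shape that can be handed to a dataframe constructor. The sketch below is pure Python; the second app and its values are invented for illustration, and app.get is used on the assumption that some apps may lack a field.

```python
filter_fields = ["title", "country", "revenueRate", "urlApp"]

# Trimmed sample data; the second app is hypothetical.
json_data = {
    "status": "OK",
    "apps": [
        {"id": "675832210", "title": "AGED", "desc": "No annoying ads&easy to play",
         "urlApp": "https://admin.test.com/appLink.aspx?b=675832",
         "revenueRate": "0.10", "country": ["CH"]},
        {"id": "1", "title": "Example", "desc": "placeholder",
         "revenueRate": "0.25", "country": ["DE", "AT"]},
    ],
}

# One filtered dict per app; missing keys become None instead of raising.
rows = [{key: app.get(key) for key in filter_fields} for app in json_data["apps"]]

for row in rows:
    print(row)
```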
So I have a flattened tree in JSON like this, as array of objects:
[{
aid: "id3",
data: ["id1", "id2"]
},
{
aid: "id1",
data: ["id3", "id2"]
},
{
aid: "id2",
nested_data: {aid: "id4", atype: "nested", data: ["id1", "id3"]},
data: []
}]
I want to gather that tree and resolve ids into data with recursion loops into something like this (say we start from "id3"):
{
"aid":"id3",
"payload":"1",
"data":[
{
"id1":{
"aid":"id1",
"data":[
{
"id3":null
},
{
"id2":null
}
]
}
},
{
"id2":{
"aid":"id2",
"nested_data":{
"aid":"id4",
"atype":"nested",
"data":[
{
"id1":null
},
{
"id3":null
}
]
},
"data":[
]
}
}
]
}
This way we would get a breadth-first search and resolve each id into "value": "object with that field" on first entrance and "value": null afterwards.
How can I do such a thing in Python 3?
Apart from all the problems that your structure has in terms of syntax (identifiers must be within quotes, etc.), the code below will provide you with the requested answer.
But you should think carefully about what you are doing, and take the following into account:
Following the relations expressed in the flat structure that you provide leads to endless recursion, since you have items that include other items that in turn include the first ones (like id3 including id1, which in turn includes id3). So you have to define stop criteria, or be sure that this does not occur in your flat structure.
Your initial flat structure is better expressed as a dictionary, instead of a list of pairs {id, data}. That is why the first thing the code below does is to transform it.
Your final, desired structure contains a lot of redundancies in terms of information contained. Consider simplifying it.
Finally, you mentioned nothing about the "nested_data" nodes and how they should be treated. I simply assumed that, where they exist, further expansion is required.
Please, consider trying to provide a bit of context in your questions, some real data examples (I believe the data provided is not real data, therefore the inconsistencies and redundancies), and try yourself and provide your efforts; that's the only way to learn.
from pprint import pprint


def reformat_flat_info(flat):
    reformatted = {}
    for o in flat:
        key = o["aid"]
        del o["aid"]
        reformatted[key] = o
    return reformatted


def expand_data(aid, flat, lvl=0):
    obj = flat[aid]
    if obj is None:
        return {aid: obj}
    obj.update({"aid": aid})
    if lvl > 1:
        return {aid: None}
    for nid, id in enumerate(obj["data"]):
        obj["data"][nid] = expand_data(id, flat, lvl=lvl+1)
    if "nested_data" in obj:
        for nid, id in enumerate(obj["nested_data"]["data"]):
            obj["nested_data"]["data"][nid] = expand_data(id, flat, lvl=lvl+1)
    return {aid: obj}
# Provide the flat information structure
flat_info = [
{
"aid": "id3",
"data": ["id1", "id2"]
}, {
"aid": "id1",
"data": ["id3", "id2"]
}, {
"aid": "id2",
"nested_data": {"aid": "id4", "atype": "nested", "data": ["id1", "id3"]},
"data": []
}
]
pprint(flat_info)
print('-'*80)
# Reformat the flat information structure
new_flat_info = reformat_flat_info(flat=flat_info)
pprint(new_flat_info)
print('-'*80)
# Generate the result
starting_id = "id3"
result = expand_data(aid=starting_id, flat=new_flat_info)
pprint(result)
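The code above stops at a fixed depth and rewrites the input dictionaries in place. As an alternative stop criterion, the sketch below resolves an id to None whenever it already appears on the current path from the root, and it works on copies so the flat input is left untouched. This is a depth-first variant under those assumptions, not the breadth-first order the question describes; expand and flat_by_id are hypothetical names.

```python
import copy
from pprint import pprint

flat_info = [
    {"aid": "id3", "data": ["id1", "id2"]},
    {"aid": "id1", "data": ["id3", "id2"]},
    {"aid": "id2",
     "nested_data": {"aid": "id4", "atype": "nested", "data": ["id1", "id3"]},
     "data": []},
]

# Index the flat list by id without modifying the original dicts.
flat_by_id = {node["aid"]: node for node in flat_info}


def expand(aid, flat_by_id, path=frozenset()):
    """Expand aid; ids already on the current path (or unknown) become None."""
    if aid in path or aid not in flat_by_id:
        return {aid: None}
    path = path | {aid}
    node = copy.deepcopy(flat_by_id[aid])
    node["data"] = [expand(child, flat_by_id, path) for child in node.get("data", [])]
    nested = node.get("nested_data")
    if nested:
        nested["data"] = [expand(child, flat_by_id, path) for child in nested.get("data", [])]
    return {aid: node}


result = expand("id3", flat_by_id)
pprint(result)
```

Because the path only grows along a branch and the set of ids is finite, the recursion always terminates, even when the flat structure contains cycles.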