Extracting values from nested dictionary from text file to JSON

The text file contains a dictionary of dictionaries. For example, "2018" is the top-level key; "8" is a month, which is a key inside the value for "2018" and in turn maps to another dictionary. I want to fetch the "total_queries_count", "total_dislike", and "unique_users" values.
{"2018":
{"8":{ "total_queries_count": 4,
"queries_without_teachers": 3,
"non_teacher_queries": 1,
"total_dislike": 0,
"unique_users": [", "landmark", "232843"],
"user_dislike": 0
},
"9":{ "total_queries_count": 1021,
"queries_without_teachers": 0,
"non_teacher_queries": 1021,
"total_dislike": 0,
"unique_users": [", "1465146", "14657", "dfgf", "1123", "456", "1461546", "Ra", "siva", "234", "ramesh", "3456", "23", "43567", "sfdf", "sdsd", "ra", "sddff", "1234", "rames", "RAM", "444", "123", "333", "RAM", "789", "itassistant", "rame", "12345"],
"user_dislike": 0},
"10": {"total_queries_count": 352,
"queries_without_teachers": 1,
"non_teacher_queries": 351,
"total_dislike": 0,
"unique_users": [", "1465146", "777", "43567", "1234", "456", "123456", "12345", "232843"],
"user_dislike": 0
},
"11": {"total_queries_count": 180,
"queries_without_teachers": 0,
"non_teacher_queries": 180,
"total_dislike": 12,
"unique_users": [", "75757575", "9000115", "9000157", "9000494", "9000164", "123453"],
"user_dislike": 12},
"12": {"total_queries_count": 266,
"queries_without_teachers": 0,
"non_teacher_queries": 266,
"total_dislike": 16,
"unique_users": [", "131422", "121550", "9000508", "9000560", "9000115", "9000371", "9000372", "93979", "146625", "114586", "165937", "9000494", "9000463", "38404", "129458", "62948", "125143", "9000179", "9000145", "9000001", "9000164", "81849", "102663", "9000123", "105407", "33517", "21344", "9000213", "202074", "9000103", "18187", "9000342", "9000125", "9000100", "9000187", "18341", "9000181", "168802", "9000529", "12345", "110127", "9000134", "100190", "9000352", "9000156", "9000055", "tcs_hariharas", "9000078", "204101", "9000050", "9000139"],
"user_dislike": 16}
}
}

Check https://docs.python.org/3/tutorial/datastructures.html#dictionaries
You can access needed keys like this:
# assuming your initial nested dict is called 'data'
data["2018"]["8"]["total_queries_count"]
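Since the question starts from a text file, here is a minimal sketch of getting from the file to the nested dict, assuming the file holds a single valid JSON object. The file name stats.txt, the helper load_stats, and the trimmed sample are hypothetical, used only so the example runs on its own:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical, trimmed sample with the same shape as the question's file.
sample_text = (
    '{"2018": {"8": {"total_queries_count": 4, "total_dislike": 0, '
    '"unique_users": ["landmark", "232843"]}}}'
)


def load_stats(path):
    """Parse a text file that holds one JSON object into nested dicts."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "stats.txt"
    path.write_text(sample_text, encoding="utf-8")
    data = load_stats(path)

# Pick out only the three values the question asks for.
month = data["2018"]["8"]
wanted = {
    key: month[key]
    for key in ("total_queries_count", "total_dislike", "unique_users")
}
print(wanted)
```

With a real file, only the load_stats(path) call is needed; the temporary file exists purely for the demonstration.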
If you want to aggregate data for all years and months in one place, you can do this:
overall_queries = 0
overall_dislikes = 0
users = set()  # this is a set not a list in order to preserve uniqueness of users
for year in data:  # year is a key in data dict
    for month in data[year]:  # month is a key in data[year] dict
        users.update(data[year][month]["unique_users"])
        overall_queries += data[year][month]["total_queries_count"]
        overall_dislikes += data[year][month]["total_dislike"]
If you want to keep your result separated by years you can do this:
result = {}
for year in data:
    overall_queries = 0
    overall_dislikes = 0
    users = set()
    for month in data[year]:
        overall_queries += data[year][month]["total_queries_count"]
        overall_dislikes += data[year][month]["total_dislike"]
        users.update(data[year][month]["unique_users"])
    result[year] = {
        "overall_queries": overall_queries,
        "overall_dislikes": overall_dislikes,
        "users": users,
    }
Result:
{'2018': {'overall_dislikes': 28,
'overall_queries': 1823,
'users': {'100190',
'102663',
'105407',
'110127',
...}}}
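The title asks for JSON output, and a set cannot be passed to json.dumps directly (it raises TypeError). The following is a rough sketch of one way around that, converting each users set to a sorted list first. The small result dict and the helper name to_jsonable are made up for illustration:

```python
import json

# Hypothetical result shaped like the one above; "users" is a set.
result = {
    "2018": {
        "overall_queries": 1823,
        "overall_dislikes": 28,
        "users": {"landmark", "232843", "12345"},
    }
}


def to_jsonable(per_year):
    """Return a copy in which every "users" set becomes a sorted list."""
    return {
        year: {**totals, "users": sorted(totals["users"])}
        for year, totals in per_year.items()
    }


json_text = json.dumps(to_jsonable(result), indent=2)
print(json_text)
```

Sorting is optional; it only makes the output stable, since sets have no defined order.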

Related

Dynamic values within a JSON parameter using Python

To be clear, I am practicing my Python skills using CoinMarketCap's API.
The below code works great:
import json
# 1 JSON string in list, works
info_1_response = ['{"status": {"timestamp": "2023-01-25T22:59:58.760Z", "error_code": 0, "error_message": null, "elapsed": 16, "credit_count": 1, "notice": null}, "data": {"BTC": {"id": 1, "name": "Bitcoin", "symbol": "BTC"}}}']
for response in info_1_response:
    info_1_dict = json.loads(response)
    #print(info_1_dict) #works
    data = info_1_dict['data']['BTC']
    print(f"only id = {data['id']}")
OUTPUT: only id = 1
However, if I have 2 responses in a list, how would I go about getting the ID for each symbol (BTC/ETH)? Code:
info_2_response = ['{"status": {"timestamp": "2023-01-25T22:59:58.760Z", "error_code": 0, "error_message": null, "elapsed": 16, "credit_count": 1, "notice": null}, "data": {"BTC": {"id": 1, "name": "Bitcoin", "symbol": "BTC"}}}', '{"status": {"timestamp": "2023-01-25T22:59:59.087Z", "error_code": 0, "error_message": null, "elapsed": 16, "credit_count": 1, "notice": null}, "data": {"ETH": {"id": 1027, "name": "Ethereum", "symbol": "ETH"}}}']
for response in info_2_response:
    info_2_dict = json.loads(response)
    #print(info_2_dict) #works
    print(info_2_dict['data']) #works
OUTPUT:
{'BTC': {'id': 1, 'name': 'Bitcoin', 'symbol': 'BTC'}}
{'ETH': {'id': 1027, 'name': 'Ethereum', 'symbol': 'ETH'}}
But what if I only wanted the ID? It seems as if I would need a dynamic parameter, like so:
data = info_2_dict['data']['DYNAMIC PARAMETER-(BTC/ETH)']
print(f"only id = {data['id']}")
Desired Output: 1, 1027
Is this possible?

Just iterate another level:
>>> for response in info_2_response:
...     response = json.loads(response)
...     for coin, data in response['data'].items():
...         print(coin, data['id'])
...
BTC 1
ETH 1027
We can remove some boilerplate from the body of the loop by using map(json.loads, ...), and since you know there is only one item in the dict, you can get fancy and use iterable unpacking:
>>> for response in map(json.loads, info_2_response):
...     [(coin, data), *_] = response['data'].items()
...     print(coin, data['id'])
...
BTC 1
ETH 1027
And if you expect there to only be one, you might want an error thrown, so you can do:
>>> for response in map(json.loads, info_2_response):
...     [(coin, data)] = response['data'].items()
...     print(coin, data['id'])
...
BTC 1
ETH 1027
Note that the version with *_, [(coin, data), *_] = response['data'].items(), will not fail if there is more than one item in the dict: the remaining items are assigned to a list called _, which we ignore. (_ is just a conventional name for a "throwaway" variable.)
However, the other version would fail:
>>> response
{'status': {'timestamp': '2023-01-25T22:59:59.087Z', 'error_code': 0, 'error_message': None, 'elapsed': 16, 'credit_count': 1, 'notice': None}, 'data': {'ETH': {'id': 1027, 'name': 'Ethereum', 'symbol': 'ETH'}}}
>>> response['data']['FOO'] = "FOO STUFF"
>>> [(coin, data)] = response['data'].items()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 1)
>>>

As long as the data dictionary only has a single key, you can do something like this:
import json
info_2_response = [
"""{"status": {"timestamp": "2023-01-25T22:59:58.760Z", "error_code": 0, "error_message": null, "elapsed": 16, "credit_count": 1, "notice": null},
"data": {"BTC": {"id": 1, "name": "Bitcoin", "symbol": "BTC"}}}""",
"""{"status": {"timestamp": "2023-01-25T22:59:59.087Z", "error_code": 0, "error_message": null, "elapsed": 16, "credit_count": 1, "notice": null},
"data": {"ETH": {"id": 1027, "name": "Ethereum", "symbol": "ETH"}}}""",
]
for response in info_2_response:
    info_2_dict = json.loads(response)
    print(list(info_2_dict["data"].values())[0]["id"])
This will print:
1
1027
The code works by using the .values() method of the dictionary to get a view of its values. Since there's only a single value, we convert the view to a list, take the first item, and then look up its "id" key.
We can expand the compound statement to make the operations a little clearer:
for response in info_2_response:
    info_2_dict = json.loads(response)
    all_values = info_2_dict["data"].values()
    first_value = list(all_values)[0]
    coin_id = first_value["id"]  # named coin_id to avoid shadowing the built-in id()
    print(coin_id)
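As a variation on the same idea, next(iter(...)) fetches the first key/value pair without building an intermediate list, and the ids can be collected by symbol. This is only a sketch under the same single-key assumption; the responses are trimmed copies of the ones in the question, and ids_by_symbol is a name chosen here:

```python
import json

# Trimmed copies of the two responses from the question.
responses = [
    '{"data": {"BTC": {"id": 1, "name": "Bitcoin", "symbol": "BTC"}}}',
    '{"data": {"ETH": {"id": 1027, "name": "Ethereum", "symbol": "ETH"}}}',
]

ids_by_symbol = {}
for response in map(json.loads, responses):
    # next(iter(...)) yields the first (here: only) key/value pair.
    symbol, info = next(iter(response["data"].items()))
    ids_by_symbol[symbol] = info["id"]

print(ids_by_symbol)
```

If "data" were ever empty, next() would raise StopIteration, so a default (for example next(iter(...), None)) may be worth adding in real code.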

Iterate through nested JSON in Python

js = {
"status": "ok",
"meta": {
"count": 1
},
"data": {
"542250529": [
{
"all": {
"spotted": 438,
"battles_on_stunning_vehicles": 0,
"avg_damage_blocked": 39.4,
"capture_points": 40,
"explosion_hits": 0,
"piercings": 3519,
"xp": 376586,
"survived_battles": 136,
"dropped_capture_points": 382,
"damage_dealt": 783555,
"hits_percents": 74,
"draws": 2,
"battles": 290,
"damage_received": 330011,
"frags": 584,
"stun_number": 0,
"direct_hits_received": 1164,
"stun_assisted_damage": 0,
"hits": 4320,
"battle_avg_xp": 1299,
"wins": 202,
"losses": 86,
"piercings_received": 1004,
"no_damage_direct_hits_received": 103,
"shots": 5857,
"explosion_hits_received": 135,
"tanking_factor": 0.04
}
}
]
}
}
Let us call this JSON "js"; the variable will be used inside a for loop.
To explain what I'm doing here: I'm trying to collect data from a game.
The game has hundreds of different tanks, and each tank has a tank_id. I post the tank_id to the game server, and it responds with the performance data shown above as "js".
for tank_id: json = requests.post(tank_id) etc...
I then store all these values in my database, as shown in the screenshot.
My Python code for it:
def api_get():
    for property in js['data']['542250529']['all']:
        spotted = property['spotted']
        battles_on_stunning_vehicles = property['battles_on_stunning_vehicles']
        # etc
        # ...
        insert_to_db(spotted, battles_on_stunning_vehicles, etc....)
The exception is:
for property in js['data']['542250529']['all']:
TypeError: list indices must be integers or slices, not str
And when I run:
print(js['data']['542250529'])
I get the rest of js printed as a string, and I can't iterate over it; it can't be used as a valid JSON string. Also, what's inside js['data']['542250529'] is a list containing only the item 'all'. Any help would be appreciated.

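You just missed [0] to get the first item in the list. Note also that 'all' maps to a dictionary, so read its values by key rather than looping over it (looping over a dictionary yields its keys, which are strings):
def api_get():
    stats = js['data']['542250529'][0]['all']
    spotted = stats['spotted']
    # ...
Look carefully at the data structure in the source JSON.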

There is a list containing the dictionary with a key of all. So you need to use js['data']['542250529'][0]['all'] not js['data']['542250529']['all']. Then you can use .items() to get the key-value pairs.
See below.
js = {
"status": "ok",
"meta": {
"count": 1
},
"data": {
"542250529": [
{
"all": {
"spotted": 438,
"battles_on_stunning_vehicles": 0,
"avg_damage_blocked": 39.4,
"capture_points": 40,
"explosion_hits": 0,
"piercings": 3519,
"xp": 376586,
"survived_battles": 136,
"dropped_capture_points": 382,
"damage_dealt": 783555,
"hits_percents": 74,
"draws": 2,
"battles": 290,
"damage_received": 330011,
"frags": 584,
"stun_number": 0,
"direct_hits_received": 1164,
"stun_assisted_damage": 0,
"hits": 4320,
"battle_avg_xp": 1299,
"wins": 202,
"losses": 86,
"piercings_received": 1004,
"no_damage_direct_hits_received": 103,
"shots": 5857,
"explosion_hits_received": 135,
"tanking_factor": 0.04
}
}
]
}
}
for key, val in js['data']['542250529'][0]['all'].items():
    print("key:", key, " val:", val)
#Or this way
for key in js['data']['542250529'][0]['all']:
    print("key:", key, " val:", js['data']['542250529'][0]['all'][key])
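Because the question loops over many ids, hard-coding '542250529' may not scale. Below is a rough sketch that walks every id under "data" instead, assuming each response keeps the same list-of-dicts shape with an "all" entry. The second id, the trimmed stats, and the helper collect_rows are invented for illustration:

```python
# Trimmed payload in the question's shape; the second id is hypothetical.
js = {
    "status": "ok",
    "meta": {"count": 2},
    "data": {
        "542250529": [{"all": {"spotted": 438, "battles": 290, "wins": 202}}],
        "1": [{"all": {"spotted": 10, "battles": 5, "wins": 3}}],
    },
}


def collect_rows(payload):
    """Flatten payload['data'] into (id, spotted, battles, wins) tuples."""
    rows = []
    for entity_id, entries in payload["data"].items():
        for entry in entries:  # each id maps to a list of dicts
            stats = entry["all"]  # the dict holding the actual numbers
            rows.append(
                (entity_id, stats["spotted"], stats["battles"], stats["wins"])
            )
    return rows


rows = collect_rows(js)
for row in rows:
    print(row)
```

Each tuple could then be handed to whatever insert function the database layer provides.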

How convert list to json based on same key value in django

My django list (my question)
{'include_product_id[x_734652]': ['1'],
'product_name[x_734652]': ['Test_1'],
'product_qty[x_734652]': ['1'],
'product_price[x_734652]': ['10'],
'total_amount[x_734652]': ['10'],
'include_product_id[x_332559]': ['2'],
'product_name[x_332559]': ['Test_2'],
'product_qty[x_332559]': ['10'],
'product_price[x_332559]': ['10'],
'total_amount[x_332559]': ['100']}
I need a result like this:
[{'include_product_id': 1,
'product_name': Test_1,
'product_qty': 1,
'product_price': 1,
'total_amount': 10,
},
{'include_product_id': 2,
'product_name': Test_2,
'product_qty': 10,
'product_price': 10,
'total_amount': 100,
}]

Iterate through the items and read the x_<digits> part to know for which dict it should go:
x = {}
for key, value in data.items():
    x_id_index = key.rindex("[")
    x_id = key[x_id_index+1:-1]
    x_dict = x.setdefault(x_id, {})
    x_dict[key[:x_id_index]] = value[0]
x_values = list(x.values())
print(x_values)
Output (pretty printed)
[
{
"include_product_id": "1",
"product_name": "Test_1",
"product_qty": "1",
"product_price": "10",
"total_amount": "10"
},
{
"include_product_id": "2",
"product_name": "Test_2",
"product_qty": "10",
"product_price": "10",
"total_amount": "100"
}
]
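The output above keeps every value as a string, whereas the desired result in the question shows numbers. A minimal sketch of the same grouping with numeric strings converted to int follows; the helper to_int_if_numeric is a name made up here, and it assumes plain non-negative integers:

```python
# The dict from the question, copied as-is.
data = {
    'include_product_id[x_734652]': ['1'],
    'product_name[x_734652]': ['Test_1'],
    'product_qty[x_734652]': ['1'],
    'product_price[x_734652]': ['10'],
    'total_amount[x_734652]': ['10'],
    'include_product_id[x_332559]': ['2'],
    'product_name[x_332559]': ['Test_2'],
    'product_qty[x_332559]': ['10'],
    'product_price[x_332559]': ['10'],
    'total_amount[x_332559]': ['100'],
}


def to_int_if_numeric(text):
    """Convert strings made only of decimal digits to int; keep the rest."""
    return int(text) if text.isdecimal() else text


grouped = {}
for key, values in data.items():
    # "product_qty[x_734652]" -> field "product_qty", suffix "x_734652]"
    field, _, suffix = key.partition("[")
    group_id = suffix.rstrip("]")
    grouped.setdefault(group_id, {})[field] = to_int_if_numeric(values[0])

result = list(grouped.values())
print(result)
```

Prices with decimals or negative numbers would need a different conversion, for example decimal.Decimal.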

Creating Lists from Multiple Jsons with Missing Keys

I am trying to build lists from JSON data by pulling responses one by one and appending values to the lists. However, some keys are not present in every JSON response. For example, the JSON below has no ['statistics']['aerialLost'], so the lookup raises a KeyError. My expected solution: when a key is missing, append a 'None' value to the list and continue.
Code
s_aerialLost = []
s_aerialWon = []
s_duelLost = []
s_duelWon = []
players = ['Martin Linnes', 'Christian Luyindama', 'Marcão', 'Ömer Bayram', 'Oghenekaro Etebo', 'Muhammed Kerem Aktürkoğlu', 'Gedson Fernandes', 'Emre Kılınç', 'Ryan Babel', 'Mostafa Mohamed', 'Florent Hadergjonaj', 'Tomáš Břečka', 'Duško Tošić', 'Oussama Haddadi', 'Kristijan Bistrović', 'Aytaç Kara', 'Haris Hajradinović', 'Armin Hodžić', 'Gilbert Koomson', 'Isaac Kiese Thelin']
players_id = [109569, 867191, 840951, 68335, 839110, 903324, 862055, 202032, 1876, 873551, 354860, 152971, 14557, 867180, 796658, 128196, 254979, 138127, 341107, 178743]
for player, player_id in zip(players, players_id):
    url = base_url + str(player_id)  # base_url is defined elsewhere (not shown)
    data = requests.request("GET", url).json()
    ## only 4 statistics shown, to keep the example simple
    aerialLost = str(data['statistics']['aerialLost'])
    aerialWon = str(data['statistics']['aerialWon'])
    duelLost = str(data['statistics']['duelLost'])
    duelWon = str(data['statistics']['duelWon'])
    s_aerialLost.append(aerialLost)
    s_aerialWon.append(aerialWon)
    s_duelLost.append(duelLost)
    s_duelWon.append(duelWon)
Json File
{
"player": {
"name": "Martin Linnes",
"slug": "martin-linnes",
"shortName": "M. Linnes",
"position": "D",
"userCount": 339,
"id": 109569,
"marketValueCurrency": "€",
"dateOfBirthTimestamp": 685324800
},
"team": {
"name": "Galatasaray",
"slug": "galatasaray",
"shortName": "Galatasaray",
"gender": "M",
"userCount": 100254,
"nameCode": "GAL",
"national": false,
"type": 0,
"id": 3061,
"teamColors": {
"primary": "#ff9900",
"secondary": "#ff0000",
"text": "#ff0000"
}
},
"statistics": {
"totalPass": 32,
"accuratePass": 22,
"totalLongBalls": 7,
"accurateLongBalls": 3,
"totalCross": 2,
"aerialWon": 1,
"duelLost": 2,
"duelWon": 7,
"totalContest": 3,
"wonContest": 2,
"totalClearance": 4,
"totalTackle": 3,
"wasFouled": 1,
"fouls": 1,
"minutesPlayed": 82,
"touches": 63,
"rating": 7.3,
"possessionLostCtrl": 18,
"keyPass": 1
},
"position": "D"
}
Error
KeyError: 'aerialLost'

Use .get(). It takes an optional second argument, the value to return if the key is not found; if omitted, that default is None.
So you can use
aerialLost = str(data.get('statistics', {}).get('aerialLost'))
The first call defaults to an empty dictionary so that there's something to make the second .get() call on. The second call then returns its default, None, which str() turns into the string 'None'.
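To show the pattern across several statistics at once, here is a small sketch that uses hard-coded dicts in place of the HTTP responses. Only the first dict mirrors the question's JSON (trimmed); the other two, and the names stat_names and columns, are hypothetical. It appends a real None rather than the string 'None':

```python
# Stand-ins for the parsed responses.
responses = [
    {"statistics": {"aerialWon": 1, "duelLost": 2, "duelWon": 7}},  # no aerialLost
    {"statistics": {"aerialLost": 3, "aerialWon": 0, "duelLost": 1, "duelWon": 4}},
    {},  # hypothetical response with no "statistics" at all
]

stat_names = ["aerialLost", "aerialWon", "duelLost", "duelWon"]
columns = {name: [] for name in stat_names}

for data in responses:
    stats = data.get("statistics", {})  # empty dict if the block is missing
    for name in stat_names:
        columns[name].append(stats.get(name))  # None when the key is absent

print(columns["aerialLost"])
print(columns["duelWon"])
```

Wrapping the value in str(), as in the answer, would store the text 'None' instead; which one is preferable depends on how the lists are used later.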

Trying to find sums of unique values within a nested dictionary. (See example!)

Let's say I have this variable list_1 which is a list of dictionaries.
Each dictionary has a nested dictionary called "group" in which it has some information including "name".
What I'm trying to do is to sum the scores of each unique group name.
So I am looking for an output similar to:
Total Scores in (Ceramics) = (18)
Total Scores in (Math) = (20)
Total Scores in (History) = (5)
I have put the above info in parentheses because I would like this code to work regardless of the number of items in the list or the number of unique groups represented.
The list_1 variable:
list_1 = [
{"title" : "Painting",
"score" : 8,
"group" : {"name" : "Ceramics",
"id" : 391}
},
{"title" : "Exam 1",
"score" : 10,
"group" : {"name" : "Math",
"id" : 554}
},
{"title" : "Clay Model",
"score" : 10,
"group" : {"name" : "Ceramics",
"id" : 391}
},
{"title" : "Homework 3",
"score" : 10,
"group" : {"name" : "Math",
"id" : 554}
},
{"title" : "Report 1",
"score" : 5,
"group" : {"name" : "History",
"id" : 209}
},
]
My first idea was to create a new list variable and append each unique group name. Here's the code for that. But will this help in ultimately finding the sum of the scores for each one of these?
group_names_list = []
for item in list_1:
    group_name = item["group"]["name"]
    if group_name not in group_names_list:
        group_names_list.append(group_name)
This gives me the value of group_names_list as:
['Ceramics','Math','History']
Any help or suggestions are appreciated! Thanks.

You can use a dict to keep track of scores per name:
score_dict = dict()
for d in list_1:
    name = d['group']['name']
    if name in score_dict:
        score_dict[name] += d['score']
    else:
        score_dict[name] = d['score']
print(score_dict)
RESULTS:
{'Ceramics': 18, 'Math': 20, 'History': 5}
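The if/else bookkeeping can also be left to collections.defaultdict, which creates a missing entry with a value of 0 on first access. This is a swapped-in variant of the same idea rather than a different algorithm; it reuses the question's list_1 and prints in roughly the format the question asks for:

```python
from collections import defaultdict

list_1 = [
    {"title": "Painting", "score": 8, "group": {"name": "Ceramics", "id": 391}},
    {"title": "Exam 1", "score": 10, "group": {"name": "Math", "id": 554}},
    {"title": "Clay Model", "score": 10, "group": {"name": "Ceramics", "id": 391}},
    {"title": "Homework 3", "score": 10, "group": {"name": "Math", "id": 554}},
    {"title": "Report 1", "score": 5, "group": {"name": "History", "id": 209}},
]

totals = defaultdict(int)  # missing group names start at 0
for item in list_1:
    totals[item["group"]["name"]] += item["score"]

for name, total in totals.items():
    print(f"Total Scores in {name} = {total}")
```

It works for any number of items or groups, since nothing about the group names is hard-coded.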

data = {}
for item in list_1: # for each item in our list
    # set our category to its existing value (or 0) + the new score
    data[item['group']['name']] = item['score'] + data.get(item['group']['name'],0)
print(data) # output = {'History': 5, 'Math': 20, 'Ceramics': 18}
Then you can print it easily enough using format strings:
for group_name,scores_summed in data.items():
    print("Totals for {group_name} = {scores_summed}".format(group_name=group_name,scores_summed=scores_summed))

Both the answers by @JacobIRR and @JoranBeasley are great; as an alternative, you could do the following:
data = [
{"title": "Painting", "score": 8, "group": {"name": "Ceramics", "id": 391}},
{"title": "Exam 1", "score": 10, "group": {"name": "Math", "id": 554}},
{"title": "Clay Model", "score": 10, "group": {"name": "Ceramics", "id": 391}},
{"title": "Homework 3", "score": 10, "group": {"name": "Math", "id": 554}},
{"title": "Report 1", "score": 5, "group": {"name": "History", "id": 209}}
]
result = {}
scores = iter((e['group']['name'], e['score']) for e in data)
for name, score in scores:
    result[name] = result.get(name, 0) + score
print(result)
Output
{'Ceramics': 18, 'History': 5, 'Math': 20}
