How to parse key values from nested dictionaries in this example? - python

Please see the JSON below taken from an API.
my_json =
{
"cities":[
{
"portland":[
{"more_info":[{"rank": "3", "games_played": "5"}
],
"team_name": "blazers"
},
{
"cleveland":[
{"more_info":[{"rank": "2", "games_played": "7"}
],
"team_name": "cavaliers"
}
]
}
I would like to create a new dictionary from this my_json with "team_name" as the key and "rank" as the value.
Like this: {'Blazers': 3, 'Cavaliers': 2, 'Bulls': 7}
I'm not sure how to accomplish this. I can return a list of cities and I can return a list of ranks, but they end up as two separate, unrelated lists, and I don't know how to relate the two.
Any help would be appreciated (I'm also open to organizing this info in a list rather than dict if that is easier).
If I run this:
results_dict = {}
cities = my_json.get('cities', [])
for x in cities:
    for k, v in x.items():
        print(k, v)
it returns:
team_name blazers
portland [{"rank": "3", "games_played": "5"}
team_name cavaliers
cavaliers [{"rank": "2", "games_played": "7"}

If you want to take your cities list and your ranks list and combine them, you could use zip() and a dictionary comprehension:
output = {city: rank for city, rank in zip(cities, ranks)}
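A minimal sketch of that idea, assuming the two lists are in matching order. The `teams` and `ranks` lists here are made-up stand-ins for whatever you have already extracted:

```python
# Hypothetical lists, assumed to be in matching order
teams = ["blazers", "cavaliers"]
ranks = ["3", "2"]

# zip() pairs items by position; dict() turns the pairs into a mapping
output = dict(zip(teams, ranks))
print(output)  # {'blazers': '3', 'cavaliers': '2'}
```

This only works if both lists were built in the same order; if one of them skips an entry, the pairs silently shift.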

Valid JSON looks like:
{
"cities":[
{
"portland":[
{"more_info":
[{"rank": "3", "games_played": "5"}],
"team_name":
"blazers"
}
]
},
{
"cleveland":[
{"more_info":
[{"rank": "2", "games_played": "7"}],
"team_name":
"cavaliers"
}
]
}
]
}
This code returns everything you want, although I'll try to write something more readable than this:
results_dict = {}
cities = my_json.get('cities', [])
for x in cities:
    for k, v in x.items():
        for element in v:
            team = element.get('team_name', '')
            meta_data = element.get('more_info', [])
            for item in meta_data:
                rank = item.get('rank')
                results_dict.update({team: rank})
>>> results_dict
{'blazers': '3', 'cavaliers': '2'}

What API is that? The JSON structure (if pivanchy got it right) seems to be unnecessarily nested in lists. (Can a city have more than one team? Probably yes. Can a team have more than one rank, though?)
But just for sport, here is a gigantic dictionary comprehension to extract the data you want:
{
    team['team_name']: team['more_info'][0]['rank']
    for ((team,),) in (city.values() for city in my_json['cities'])
}

The json seemed to be missing some closing brackets. After adding them I got this:
my_json = {
"cities": [
{"portland":[
{"more_info":[{"rank": "3", "games_played": "5"}],"team_name": "blazers"}]},
{"cleveland":[{"more_info":[{"rank": "2", "games_played": "7"}],"team_name": "cavaliers"}]}
]
}
Given that structure, which is extremely nested, the following code will extract the data you want, but it's very messy:
results = {}
for el in my_json["cities"]:
    # each element has a single key (the city); list() is needed on Python 3
    name = list(el.keys())[0]
    rank = list(el.values())[0][0]["more_info"][0]["rank"]
    results[name] = rank
print(results)
Which will give you:
{'portland': '3', 'cleveland': '2'}
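The question asked for team names as keys and numeric ranks as values, so here is a rough sketch of that variant. It assumes the repaired structure shown above, that every rank string is a valid integer, and that capitalizing the team name is what was meant by 'Blazers'; the name `ranks_by_team` is just an illustration:

```python
my_json = {
    "cities": [
        {"portland": [{"more_info": [{"rank": "3", "games_played": "5"}],
                       "team_name": "blazers"}]},
        {"cleveland": [{"more_info": [{"rank": "2", "games_played": "7"}],
                        "team_name": "cavaliers"}]},
    ]
}

ranks_by_team = {}
for city in my_json.get("cities", []):
    # each city dict maps the city name to a list of teams
    for teams in city.values():
        for team in teams:
            for info in team.get("more_info", []):
                # capitalize the name and convert the rank string to an int
                ranks_by_team[team["team_name"].capitalize()] = int(info["rank"])

print(ranks_by_team)  # {'Blazers': 3, 'Cavaliers': 2}
```

Looping over the inner lists instead of indexing with `[0]` also copes with a city that has more than one team.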

Related

Why is this dictionary comprehension for list of dictionaries not returning values?

I am iterating over a list of dictionaries formatted as follows (each dictionary relates to one item of clothing; I have only listed the first):
new_products = [{'{"uniq_id": "1234", "sku": "abcdefgh", "name": "Levis skinny jeans", '
'"list_price": "75.00", "sale_price": "55.00", "category": "womens"}'}]
def find_product(dictionary, uniqid):
    if 'uniq_id' in dictionary:
        if ['uniq_id'] == uniqid:
            return(keys, values in dictionary)
print(find_product(new_products, '1234'))
This is returning
None
The reason for the if statement in there is that not every product has a value for uniq_id so I was getting a key error on an earlier version of my code.
Your dictionary definition is quite unclear.
Assuming that you have given a list of dictionaries of size 1, it should be something like this:
new_products = [{"uniq_id": "1234", "sku": "abcdefgh", "name": "Levis skinny jeans", "list_price": "75.00", "sale_price": "55.00", "category": "womens"}]
def find_product(list_of_dicts, uniqid):
    for dictionary in list_of_dicts:
        if 'uniq_id' in dictionary:
            if dictionary['uniq_id'] == uniqid:
                return dictionary
print(find_product(new_products, '1234'))
You are using something like this:
new_products = [{'{ "some" : "stuff" }'}]
This is a list (the outer []) containing a set (the {})
{'{ "some" : "stuff" }'}
Note {1} is a set containing the number 1. Though it uses the curly braces it isn't a dictionary.
Your set contains a string:
'{ "some" : "stuff" }'
If I ask if 'some' is in this, I get True back, but if I ask for this string's keys there are no keys.
Make your new_products a list containing a dictionary (not a set), and don't put the payload in a string:
new_products = [{"uniq_id": "1234",
"sku": "abcdefgh",
"name": "Levis skinny jeans",
"list_price": "75.00",
"sale_price": "55.00",
"category": "womens"}]
Then loop over the dictionaries in the list in your function:
def find_product(dictionary_list, uniqid):
    for d in dictionary_list:
        if 'uniq_id' in d:
            if d['uniq_id'] == uniqid:
                return d.keys(), d.values()
    return "not found" # or something better
>>> find_product(new_products, '1234')
(dict_keys(['uniq_id', 'sku', 'name', 'list_price', 'sale_price', 'category']), dict_values(['1234', 'abcdefgh', 'Levis skinny jeans', '75.00', '55.00', 'womens']))
>>> find_product(new_products, '12345')
'not found'
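If the data really does arrive as strings wrapped in sets, as in the question, one possible way to recover real dictionaries is to parse each string with json.loads. This is only a sketch: the product string is shortened, and it assumes every string in the sets is valid JSON.

```python
import json

# Same shape as in the question: a list holding a set holding one JSON string
# (the string is shortened here for readability)
new_products = [{'{"uniq_id": "1234", "sku": "abcdefgh", "name": "Levis skinny jeans"}'}]

# Flatten the sets and parse every string into a real dictionary
products = [json.loads(text) for group in new_products for text in group]

def find_product(products, uniqid):
    for product in products:
        # .get() returns None when "uniq_id" is missing, so no KeyError
        if product.get("uniq_id") == uniqid:
            return product
    return None

print(find_product(products, "1234"))
```

Using `.get()` replaces the nested `if 'uniq_id' in d` check, which matters here because not every product has a uniq_id.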

Select specific keys inside a json using python

I have the following JSON that I extracted using requests with Python and json.loads. The whole JSON basically repeats itself with changes in the ID and names. It has a lot of information, but I'm just posting a small sample as an example:
"status":"OK",
"statuscode":200,
"message":"success",
"apps":[
{
"id":"675832210",
"title":"AGED",
"desc":"No annoying ads&easy to play",
"urlImg":"https://test.com/pImg.aspx?b=675832&z=1041813&c=495181&tid=API_MP&u=https%3a%2f%2fcdna.test.com%2fbanner%2fwMMUapCtmeXTIxw_square.png&q=",
"urlImgWide":"https://cdna.test.com/banner/sI9MfGhqXKxVHGw_rectangular.jpeg",
"urlApp":"https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=",
"androidPackage":"com.agedstudio.freecell",
"revenueType":"cpi",
"revenueRate":"0.10",
"categories":"Card",
"idx":"2",
"country":[
"CH"
],
"cityInclude":[
"ALL"
],
"cityExclude":[
],
"targetedOSver":"ALL",
"targetedDevices":"ALL",
"bannerId":"675832210",
"campaignId":"495181210",
"campaignType":"network",
"supportedVersion":"",
"storeRating":"4.3",
"storeDownloads":"10000+",
"appSize":"34603008",
"urlVideo":"",
"urlVideoHigh":"",
"urlVideo30Sec":"https://cdn.test.com/banner/video/video-675832-30.mp4?rnd=1620699136",
"urlVideo30SecHigh":"https://cdn.test.com/banner/video/video-675832-30_o.mp4?rnd=1620699131",
"offerId":"5825774"
},
I don't need all that data, just a few fields like 'title', 'country', 'revenueRate' and 'urlApp', but I don't know if there is a way to extract only those.
My solution so far was to make the JSON a dataframe and then drop the columns; however, I wanted to find an easier solution.
My ideal final result would be a dataframe with the selected keys and arrays.
Does anybody know an easy solution for this problem?
Thanks
I assume you have that data as a dictionary, let's call it json_data. You can just iterate over the apps and write them into a list. Alternatively, you could obviously also define a class and initialize objects of that class.
EDIT:
I just found this answer: https://stackoverflow.com/a/20638258/6180150, which explains how to convert a list of dicts, like the one in my sample code, into a dataframe. See the adapted code below for a solution.
json_data = {
"status": "OK",
"statuscode": 200,
"message": "success",
"apps": [
{
"id": "675832210",
"title": "AGED",
"desc": "No annoying ads&easy to play",
"urlImg": "https://test.com/pImg.aspx?b=675832&z=1041813&c=495181&tid=API_MP&u=https%3a%2f%2fcdna.test.com%2fbanner%2fwMMUapCtmeXTIxw_square.png&q=",
"urlImgWide": "https://cdna.test.com/banner/sI9MfGhqXKxVHGw_rectangular.jpeg",
"urlApp": "https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=",
"androidPackage": "com.agedstudio.freecell",
"revenueType": "cpi",
"revenueRate": "0.10",
"categories": "Card",
"idx": "2",
"country": [
"CH"
],
"cityInclude": [
"ALL"
],
"cityExclude": [
],
"targetedOSver": "ALL",
"targetedDevices": "ALL",
"bannerId": "675832210",
"campaignId": "495181210",
"campaignType": "network",
"supportedVersion": "",
"storeRating": "4.3",
"storeDownloads": "10000+",
"appSize": "34603008",
"urlVideo": "",
"urlVideoHigh": "",
"urlVideo30Sec": "https://cdn.test.com/banner/video/video-675832-30.mp4?rnd=1620699136",
"urlVideo30SecHigh": "https://cdn.test.com/banner/video/video-675832-30_o.mp4?rnd=1620699131",
"offerId": "5825774"
},
]
}
filtered_data = []
for app in json_data["apps"]:
    app_data = {
        "id": app["id"],
        "title": app["title"],
        "country": app["country"],
        "revenueRate": app["revenueRate"],
        "urlApp": app["urlApp"],
    }
    filtered_data.append(app_data)
print(filtered_data)
# Output
d = [
{
'id': '675832210',
'title': 'AGED',
'country': ['CH'],
'revenueRate': '0.10',
'urlApp': 'https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q='
}
]
import pandas as pd
d = pd.DataFrame(filtered_data)
print(d)
# Output
id title country revenueRate urlApp
0 675832210 AGED [CH] 0.10 https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=
If your end goal is a dataframe, just load the data into a dataframe and take the columns you want.
Assuming the parsed JSON is stored in data:
df = pd.json_normalize(data['apps'])
yields
id title desc urlImg ... urlVideoHigh urlVideo30Sec urlVideo30SecHigh offerId
0 675832210 AGED No annoying ads&easy to play https://test.com/pImg.aspx?b=675832&z=1041813&... ... https://cdn.test.com/banner/video/video-675832... https://cdn.test.com/banner/video/video-675832... 5825774
[1 rows x 28 columns]
then if you want certain columns:
df_final = df[['title', 'desc', 'urlImg']]
title desc urlImg
0 AGED No annoying ads&easy to play https://test.com/pImg.aspx?b=675832&z=1041813&...
Use a dictionary comprehension to extract a dictionary of the key/value pairs you want:
import json
json_string="""{
"id":"675832210",
"title":"AGED",
"desc":"No annoying ads&easy to play",
"urlApp":"https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q=",
"revenueRate":"0.10",
"categories":"Card",
"idx":"2",
"country":[
"CH"
],
"cityInclude":[
"ALL"
],
"cityExclude":[
]
}"""
json_dict = json.loads(json_string)
filter_fields=['title','country','revenueRate','urlApp']
dict_result = {key: json_dict[key] for key in json_dict if key in filter_fields}
json_elements = []
for key in dict_result:
    json_elements.append((key, json_dict[key]))
print(json_elements)
output:
[('title', 'AGED'), ('urlApp', 'https://admin.test.com/appLink.aspx?b=675832&e=1041813&tid=API_MP&sid=2c5cee038cd9449da35bc7b0f53cf60f&q='), ('revenueRate', '0.10'), ('country', ['CH'])]
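The same comprehension can be applied to every entry under "apps" at once. The sketch below uses a shortened stand-in for the API response; the second app and the truncated URL are made up for illustration, and `wanted` and `rows` are hypothetical names:

```python
# Shortened stand-in for the parsed API response (two apps, few fields)
data = {
    "status": "OK",
    "apps": [
        {"id": "675832210", "title": "AGED", "revenueRate": "0.10",
         "country": ["CH"], "urlApp": "https://admin.test.com/appLink.aspx?b=675832"},
        {"id": "1", "title": "OTHER", "country": ["DE"]},  # hypothetical app with missing keys
    ],
}

wanted = ["title", "country", "revenueRate", "urlApp"]

# One small dict per app; .get() yields None for keys an app does not have
rows = [{key: app.get(key) for key in wanted} for app in data["apps"]]
print(rows)
```

Passing `rows` to `pd.DataFrame(rows)` would then give a dataframe with just those four columns, without building the full 28-column frame first.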

Getting JSON data out of dictionary within a dictionary using python

I am using Python to get data from an API, formatted as in the example below. I have a problem getting cust_id and name out of the API response.
Below is one of the things I tried, based on an answer by SimonR. I am sure I am doing something really dumb right now, but I get the error
TypeError: the JSON object must be str, bytes or bytearray, not dict. Thank you everyone in advance for your answers.
import json
a = {
"count": 5,
"Customers": {
"32759": {
"cust_id": "1234",
"name": "Mickey Mouse"
},
"11053": {
"cust_id": "1235",
"name": "Mini Mouse"
},
"21483": {
"cust_id": "1236",
"name": "Goofy"
},
"12441": {
"cust_id": "1237",
"name": "Pluto"
},
"16640": {
"cust_id": "1238",
"name": "Donald Duck"
}
}
}
d = json.loads(a)
customers = {v["cust_id"]: v["name"] for v in d["Customers"].values()}
Is this what you're trying to do ?
import json
d = json.loads(a)
customers = {v["cust_id"]: v["name"] for v in d["Customers"].values()}
outputs :
{'1234': 'Mickey Mouse',
'1235': 'Mini Mouse',
'1236': 'Goofy',
'1237': 'Pluto',
'1238': 'Donald Duck'}
Well if I understood correctly you can do this:
# d is the API response in your post
# This will give you the list of customers
customers = d['Customers']
Then you can iterate over the customers dictionary and save them to any data structure you want:
# This will print out the name and cust_id
for k, v in customers.items():
    print(v['cust_id'], v['name'])
Hope it helps!
import json
# convert json to python dict
response = json.loads(json_string)
# loop through all customers
for key, customer in response['Customers'].items():
    # get customer id
    cust_id = customer['cust_id']
    # get customer name
    name = customer['name']
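The TypeError in the question comes from passing a dict to json.loads, which only accepts JSON text (str, bytes or bytearray). If the data is already a dict, no parsing is needed. A small sketch of handling both cases, where `as_dict` is a hypothetical helper and the customer data is shortened:

```python
import json

def as_dict(payload):
    """Return a dict whether payload is JSON text or an already-parsed dict."""
    if isinstance(payload, (str, bytes, bytearray)):
        return json.loads(payload)
    return payload

a = {"count": 2, "Customers": {"32759": {"cust_id": "1234", "name": "Mickey Mouse"},
                               "11053": {"cust_id": "1235", "name": "Mini Mouse"}}}

d = as_dict(a)                 # already a dict, returned unchanged
same = as_dict(json.dumps(a))  # JSON text, parsed by json.loads
customers = {v["cust_id"]: v["name"] for v in d["Customers"].values()}
print(customers)  # {'1234': 'Mickey Mouse', '1235': 'Mini Mouse'}
```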

Creating a new dict from a list of dicts

I have a list of dictionaries in the following format
data = [
{
"Members": [
"user11",
"user12",
"user13"
],
"Group": "Group1"
},
{
"Members": [
"user11",
"user21",
"user22",
"user23"
],
"Group": "Group2"
},
{
"Members": [
"user11",
"user22",
"user31",
"user32",
"user33",
],
"Group": "Group3"
}]
I'd like to return a dictionary where every user is a key and the value is a list of all the groups which they belong to. So for the above example, this dict would be:
newdict = {
"user11": ["Group1", "Group2", "Group3"],
"user12": ["Group1"],
"user13": ["Group1"],
"user21": ["Group2"],
"user22": ["Group2", "Group3"],
"user23": ["Group2"],
"user31": ["Group3"],
"user32": ["Group3"],
"user33": ["Group3"],
}
My initial attempt was using a defaultdict in a nested loop, but this is slow (and also isn't returning what I expected). Here was that attempt:
from collections import defaultdict
user_groups = defaultdict(list)
for user in users:
    for item in data:
        if user in item["Members"]:
            user_groups[user].append(item["Group"])
Does anyone have any suggestions for improvement for speed, and also just a generally better way to do this?
Code
new_dict = {}
for d in data: # each item is dictionary
    members = d["Members"]
    for m in members:
        # appending corresponding group for each member
        new_dict.setdefault(m, []).append(d["Group"])
print(new_dict)
Out
{'user11': ['Group1', 'Group2', 'Group3'],
'user12': ['Group1'],
'user13': ['Group1'],
'user21': ['Group2'],
'user22': ['Group2', 'Group3'],
'user23': ['Group2'],
'user31': ['Group3'],
'user32': ['Group3'],
'user33': ['Group3']}
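The defaultdict from the question works just as well once the loops are turned around, so that the data is walked once instead of being scanned for every user. A short sketch with a trimmed-down version of the data:

```python
from collections import defaultdict

# Trimmed-down version of the data from the question
data = [
    {"Members": ["user11", "user12"], "Group": "Group1"},
    {"Members": ["user11", "user21"], "Group": "Group2"},
]

user_groups = defaultdict(list)
for item in data:
    # visit each member exactly once; no separate list of users is needed
    for user in item["Members"]:
        user_groups[user].append(item["Group"])

print(dict(user_groups))
# {'user11': ['Group1', 'Group2'], 'user12': ['Group1'], 'user21': ['Group2']}
```

This avoids the `user in item["Members"]` test, which is a linear search repeated for every user and every group.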

python transform data from csv to array of dictionaries and group by field value

I have csv like this:
id,company_name,country,country_id
1,batstop,usa, xx
2,biorice,italy, yy
1,batstop,italy, yy
3,legstart,canada, zz
I want an array of dictionaries to import into Firebase. I need to group the country information for the same company into a nested list of dictionaries. This is the desired output:
[ {'id':'1', 'agency_name':'batstop', 'countries': [{'country':'usa','country_id':'xx'}, {'country':'italy','country_id':'yy'}]},
{'id':'2', 'agency_name':'biorice', 'countries': [{'country':'italy','country_id':'yy'}]},
{'id':'3', 'agency_name':'legstart', 'countries': [{'country':'canada','country_id':'zz'}]} ]
I recently had a similar task; the groupby function from itertools and the itemgetter function from operator (both in the Python standard library) helped me a lot. Here's the code for your CSV. Note that defining the primary keys of your dataset is important.
import csv
import json
from operator import itemgetter
from itertools import groupby
primary_keys = ['id', 'company_name']
# Start extraction
with open('input.csv', 'r') as file:
    # Read data from csv
    reader = csv.DictReader(file)
    # Sort data according to primary keys (groupby needs sorted input)
    reader = sorted(reader, key=itemgetter(*primary_keys))
# Create a list of tuples
# Each tuple containing a dict of the group primary keys and its values, and a list of the group ordered dicts
groups = [(dict(zip(primary_keys, _[0])), list(_[1])) for _ in groupby(reader, key=itemgetter(*primary_keys))]
# Create formatted dict to be converted into firebase objects
group_dicts = []
for group in groups:
    group_dict = {
        "id": group[0]['id'],
        "agency_name": group[0]['company_name'],
        "countries": [
            dict(country=_['country'], country_id=_['country_id']) for _ in group[1]
        ],
    }
    group_dicts.append(group_dict)
print("\n".join([json.dumps(_, indent=2) for _ in group_dicts]))
Here's the output:
{
"id": "1",
"agency_name": "batstop",
"countries": [
{
"country": "usa",
"country_id": " xx"
},
{
"country": "italy",
"country_id": " yy"
}
]
}
{
"id": "2",
"agency_name": "biorice",
"countries": [
{
"country": "italy",
"country_id": " yy"
}
]
}
{
"id": "3",
"agency_name": "legstart",
"countries": [
{
"country": "canada",
"country_id": " zz"
}
]
}
No external library is needed.
Hope it suits you well!
You can try this, you may have to change a few parts to get it working with your csv, but hope it's enough to get you started:
csv = [
"1,batstop,usa, xx",
"2,biorice,italy, yy",
"1,batstop,italy, yy",
"3,legstart,canada, zz"
]
output = {} # dictionary useful to avoid searching in list for existing ids
# Parse each row
for line in csv:
    cols = line.split(',')
    id = int(cols[0])
    agency_name = cols[1]
    country = cols[2]
    country_id = cols[3].strip()  # drop the space after the comma
    if id in output:
        # append the dict itself, not a list wrapping it
        output[id]['countries'].append({'country': country,
                                        'country_id': country_id})
    else:
        output[id] = {'id': id,
                      'agency_name': agency_name,
                      'countries': [{'country': country,
                                     'country_id': country_id}]
                      }
# Put into list
json_output = []
for key in output.keys():
    json_output.append(output[key])
# Check output
for row in json_output:
    print(row)
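The country_id values in the CSV have a space after the comma, which is why the groupby output above shows " xx" rather than "xx". One way to avoid that, sketched here with an in-memory stand-in for input.csv, is the skipinitialspace option of the csv module. The names `raw`, `companies` and `result` are only for illustration:

```python
import csv
import io

# In-memory stand-in for input.csv, same rows as in the question
raw = """id,company_name,country,country_id
1,batstop,usa, xx
2,biorice,italy, yy
1,batstop,italy, yy
3,legstart,canada, zz
"""

companies = {}  # keyed by id so each company is created only once
# skipinitialspace=True drops the blank after each comma (" xx" -> "xx")
for row in csv.DictReader(io.StringIO(raw), skipinitialspace=True):
    company = companies.setdefault(
        row["id"],
        {"id": row["id"], "agency_name": row["company_name"], "countries": []},
    )
    company["countries"].append(
        {"country": row["country"], "country_id": row["country_id"]}
    )

result = list(companies.values())
print(result)
```

Because the grouping uses a dictionary keyed by id, the rows do not need to be sorted first, and the companies come out in the order they first appear in the file.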
