I've got some json from last.fm's api which I've serialised into a dictionary using simplejson. A quick example of the basic structure is below.
{
"artist": "similar": {
"artist": {
"name": "Blah",
"image": [{
"#text": "URLHERE",
"size": "small"
}, {
"#text": "URLHERE",
"size": "medium"
}, {
"#text": "URLHERE",
"size": "large"
}]
}
}
}
Any ideas how I can access the image urls of various different sizes?
Thanks,
Jack
Python does not have any problem with # in strings used as dict keys.
>>> import json
>>> j = '{"#foo": 6}'
>>> print json.loads(j)
{u'#foo': 6}
>>> print json.loads(j)[u'#foo']
6
>>> print json.loads(j)['#foo']
6
There are, however, problems with the JSON you post. For one, it isn't valid (perhaps you're missing a couple commas?). For two, you have a JSON object with the same key "image" three times, which cannot coexist and do anything useful.
In Javascript, these two syntaxes are equivalent:
o.foo
o['foo']
In Python they are not. The first gives you the foo attribute, the second gives you the foo key. (It's debatable whether this was a good idea or not.) In Python, you wouldn't be able to access #text as:
o.#text
because the hash will start a comment, and you'll have a syntax error.
But you want
o['#text']
in any case.
You can get what you want from the image list with a list comprehension. Something like
desired = [x for x in images if minSize < x['size'] < maxSize]
Here, images would be the list of dicts from the inner level of you data structure.
Related
I'm working on some code that processes a json database with very detailed information into a simpler format. It copies some of the fields and reserializes others into a new json file.
I'm currently using a dictionary comprehension like this MVCE:
converted_data = {
raw_item['name']: {
'state': raw_item['field_a'],
'variations': [variant for variant in raw_item['field_b']['variations']]
} for raw_item in json.loads(my_file.read())
}
An example file (not the actual data being used) is this:
[
{
"name": "Object A",
"field_a": "foo",
"field_b": {
"bar": "baz",
"variants": [
"foo",
"bar",
"baz"
]
}
},
{
"name": "Object B",
"field_a": "foo",
"field_b": {
"bar": "baz",
}
}
]
The challenge is that not all items contain variations. I see two potential solutions:
Use an if statement to conditionally apply the variations field into the dictionary.
Include an empty variations field for all items and fill it if the raw item contains variations.
I'll probably settle on the 2nd solution. However, is there a way to conditionally include a particular field within a dictionary comprehension?
Edit: In other words, is approach 1 possible inside a dictionary comprehension?
An example of the desired output (using a dictionary comprehension) would be as follows:
{
"Object A": {
"state": "foo",
"variants": ["foo", "bar", "baz"]
},
"Object B": {
"state": "foo"
}
}
I've found some other questions that change the entries conditionally or filter the entries, but these don't unconditionally create an item where a particular field (in the item) is conditionally absent.
I'm not sure you realise you can use the if inside an assignment, which seems like a very clean way to solve it to me:
converted_data = {
raw_item['name']: {
'state': raw_item['field_a'],
'variants': [] if 'variants' not in raw_item['field_b'] else
[str(variant) for variant in raw_item['field_b']['variants']]
} for raw_item in example
}
(Note: using str() instead of undefined function that was given in initial example)
After clarification of the question, here's an alternate solution that adds a different dictionary (missing the empty 'variations' key if there is none:
converted_data = {
raw_item['name']: {
'state': raw_item['field_a'],
'variants': [str(variant) for variant in raw_item['field_b']['variants']]
} if 'variants' in raw_item['field_b'] else {
'state': raw_item['field_a'],
} for raw_item in example
}
If the question actually is: can a key/value pair in a dictionary literal be optional (which would solve your problem) then the answer is simply "no". But the above achieves the same for this simple example.
If the real life situation is more complicated, simply construct the dictionary as in the first solution given here and then use del(dictionary['key']) to remove any added keys that have a None or [] value after construction.
For example, after the first example, converted_data could be cleaned up with:
for item in converted_data.values:
if not item['variants']:
del(item['variants'])
You could pass the process out to a function?
def check_age(age):
return age >= 18
my_dic = {
"age": 25,
"allowed_to_drink": check_age(25)
}
You end up with the value as the result of the function call
{'age': 25, 'allowed_to_drink': True}
How you would implement this I don't know, but some food for thought.
I need to figure out how to structure my data in python such that when I call dumps and write to file I get the following structure of data as an example:
{
"Prop_a": [
{
"car": "brown",
"color": "yellow",
"engine": [
{
"mod_a": "x1",
"name": [
{
"diesel": "yes",
}
...
As you can see, I have nested elements that need expanded. The end-goal is to import the data into a database, I need JSON or CSV formatted data to do it.
EDIT: To all: I can easily print a single level of the dict of lists to JSON. What I need assistance with is how to format the nested structure.
EDIT #2:
Since code is being requested...
my_dict = {}
for x in group:
my_dict[x] = []
for y in sub_group:
mydict[x].append(data_symbol_reference)
Produces an output like:
{
"Prop_a" : [
"car",
"color",
],
...
I need assistance on the nesting the dict of lists within the list structure.
Get the official docs for python 3.6: JSON encoder and decoder
I have JSON output as follows:
{
"service": [{
"name": ["Production"],
"id": ["256212"]
}, {
"name": ["Non-Production"],
"id": ["256213"]
}]
}
I wish to find all ID's where the pair contains "Non-Production" as a name.
I was thinking along the lines of running a loop to check, something like this:
data = json.load(urllib2.urlopen(URL))
for key, value in data.iteritems():
if "Non-Production" in key[value]: print key[value]
However, I can't seem to get the name and ID from the "service" tree, it returns:
if "Non-Production" in key[value]: print key[value]
TypeError: string indices must be integers
Assumptions:
The JSON is in a fixed format, this can't be changed
I do not have root access, and unable to install any additional packages
Essentially the goal is to obtain a list of ID's of non production "services" in the most optimal way.
Here you go:
data = {
"service": [
{"name": ["Production"],
"id": ["256212"]
},
{"name": ["Non-Production"],
"id": ["256213"]}
]
}
for item in data["service"]:
if "Non-Production" in item["name"]:
print(item["id"])
Whatever I see JSON I think about functionnal programming ! Anyone else ?!
I think it is a better idea if you use function like concat or flat, filter and reduce, etc.
Egg one liner:
[s.get('id', [0])[0] for s in filter(lambda srv : "Non-Production" not in srv.get('name', []), data.get('service', {}))]
EDIT:
I updated the code, even if data = {}, the result will be [] an empty id list.
I have a json response from an API in this way:-
{
"meta": {
"code": 200
},
"data": {
"username": "luxury_mpan",
"bio": "Recruitment Agents👑👑👑👑\nThe most powerful manufacturers,\nwe have the best quality.\n📱Wechat:13255996580💜💜\n📱Whatsapp:+8618820784535",
"website": "",
"profile_picture": "https://scontent.cdninstagram.com/t51.2885-19/10895140_395629273936966_528329141_a.jpg",
"full_name": "Mpan",
"counts": {
"media": 17774,
"followed_by": 7982,
"follows": 7264
},
"id": "1552277710"
}
}
I want to fetch the data in "media", "followed_by" and "follows" and store it in three different lists as shown in the below code:--
for r in range(1,5):
var=r,st.cell(row=r,column=3).value
xy=var[1]
ij=str(xy)
myopener=Myopener()
url=myopener.open('https://api.instagram.com/v1/users/'+ij+'/?access_token=641567093.1fb234f.a0ffbe574e844e1c818145097050cf33')
beta=json.load(url)
for item in beta['data']:
list1.append(item['media'])
list2.append(item['followed_by'])
list3.append(item['follows'])
When I run it, it shows the error TypeError: string indices must be integers
How would my loop change in order to fetch the above mentioned values?
Also, Asking out of curiosity:- Is there any way to fetch the Watzapp no from the "BIO" key in data dictionary?
I have referred questions similar to this and still did not get my answer. Please help!
beta['data'] is a dictionary object. When you iterate over it with for item in beta['data'], the values taken by item will be the keys of the dictionary: "username", "bio", etc.
So then when you ask for, e.g., item['media'] it's like asking for "username"['media'], which of course doesn't make any sense.
It isn't quite clear what it is that you want: is it just the stuff inside counts? If so, then instead of for item in beta['data']: you could just say item = beta['data']['counts'], and then item['media'] etc. will be the values you want.
As to your secondary question: I suggest looking into regular expressions.
I am learning to use Python and APIs (specifically, this World Cup API, http://www.kimonolabs.com/worldcup/explorer)
The JSON data looks like this:
[
{
"firstName": "Nicolas Alexis Julio",
"lastName": "N'Koulou N'Doubena",
"nickname": "N. N'Koulou",
"assists": 0,
"clubId": "5AF524A1-830C-4D75-8C54-2D0BA1F9BE33",
"teamId": "DF25ABB8-37EB-4C2A-8B6C-BDA53BF5A74D",
"id": "D9AD1E6D-4253-4B88-BB78-0F43E02AF016",
"type": "Player"
},
{
"firstName": "Alexandre Dimitri",
"lastName": "Song-Billong",
"nickname": "A. Song",
"clubId": "35BCEEAF-37D3-4685-83C4-DDCA504E0653",
"teamId": "DF25ABB8-37EB-4C2A-8B6C-BDA53BF5A74D",
"id": "A84540B7-37B6-416F-8C4D-8EAD55D113D9",
"type": "Player"
},
]
I am simply trying to print all of the firstNames in this API. Here's what I have:
import urllib2
import json
url = "http://worldcup.kimonolabs.com/api/players?apikey=xxx"
json_obj = urllib2.urlopen(url).read
readable_json = json.dumps(json_obj)
playerstuff = readable_json['firstName']
for i in playerstuff:
print i['firstName']
But when I run it, I get the error "...line 8, in ...TypeError: list indices must be integers, not str"
I have looked around for solutions, but seem to find questions to more "in depth" API questions and I don't really understand it all yet, so any help or explanation as to what I need to do would be amazing. Thank you!
I solved changing
readable_json['firstName']
by
readable_json[0]['firstName']
First of all, you should be using json.loads, not json.dumps. loads converts JSON source text to a Python value, while dumps goes the other way.
After you fix that, based on the JSON snippet at the top of your question, readable_json will be a list, and so readable_json['firstName'] is meaningless. The correct way to get the 'firstName' field of every element of a list is to eliminate the playerstuff = readable_json['firstName'] line and change for i in playerstuff: to for i in readable_json:.
You can simplify your code down to
url = "http://worldcup.kimonolabs.com/api/players?apikey=xxx"
json_obj = urllib2.urlopen(url).read
player_json_list = json.loads(json_obj)
for player in readable_json_list:
print player['firstName']
You were trying to access a list element using dictionary syntax. the equivalent of
foo = [1, 2, 3, 4]
foo["1"]
It can be confusing when you have lists of dictionaries and keeping the nesting in order.