How can I sort through JSON?

How can I sort through JSON? - python

I'm trying to sort through some JSON I get from a webpage and set specific attributes of part of the JSON to variables.
Here is some of the code I am using, but I am quite confused and do not work with JSON often.
data = json.load(url)
print(list(data['products_and_categories']['Bags']))
Here is some of the JSON:
{"products_and_categories":{"Bags": [
{"name":"Duffle Bag","id":172614...},
{"name":"Backpack","id":172607...}]}
]
I want to sort through the JSON based on the name, and then pull the id associated with it.

You can do this:
backpackid = null
for x in yourjson['products_and_categories']['Bags']:
if x['name'] == "Backpack":
print("found it!")
backpackid = x['id']
print("Backpack ID is: %", backpackid)

Related

Python3 - Parse list of strings inside nested json

Python Noob here. I saw many similar questions but none of it my exact use case. I have a simple nested json, and I'm trying to access the element name present inside metadata. Below is my sample json.
{
"items": [{
"metadata": {
"name": "myname1"
}
},
{
"metadata": {
"name": "myname1"
}
}
]
}
Below is the code That I have tried so far, but not successfull.
import json
f = open('./myfile.json')
x = f.read()
data = json.loads(x)
for i in data['items']:
for j in i['metadata']:
print (j['name'])
It errors out stating below
File "pythonjson.py", line 8, in
print (j['name']) TypeError: string indices must be integers
When I printed print (type(j)) I received the following o/p <class 'str'>. So I can see that it is a list of strings and not an dictinoary. So now How can I parse through a list of strings? Any official documentation or guide would be much helpful to know the concept of this.

Your json is bad, and the python exception is clear and unambiguous. You have the basic string "name" and you are trying to ... do a lookup on that?
Let's cut out all the json and look at the real issue. You do not know how to iterate over a dict. You're actually iterating over the keys themselves. If you want to see their values too, you're going to need dict.items()
https://docs.python.org/3/tutorial/datastructures.html#looping-techniques
metadata = {"name": "myname1"}
for key, value in metadata.items():
if key == "name":
print ('the name is', value)
But why bother if you already know the key you want to look up?
This is literally why we have dict.
print ('the name is', metadata["name"])

You likely need:
import json
f = open('./myfile.json')
x = f.read()
data = json.loads(x)
for item in data['items']:
print(item["metadata"]["name"]
Your original JSON is not valid (colons missing).

to access contents of name use "i["metadata"].keys()" this will return all keys in "metadata".
Working code to access all values of the dictionary in "metadata".
for i in data['items']:
for j in i["metadata"].keys():
print (i["metadata"][j])
**update:**Working code to access contents of "name" only.
for i in data['items']:
print (i["metadata"]["name"])

Reading nested JSON dynamically

I am currently trying out something which I am unsure if it is possible.
I am trying to map API values from a JSON string (which has nested values) to a database field but I wish for it to be dynamic.
In the YAML example below, the key would be the database field name and the database field value would be where to obtain the information from the JSON string ("-" delimited for nested values). I am able to read the YAML config but what I don't understand is how to translate it to python code. If it were to be dynamic I have no idea how many [] I would have to put.
YAML: (PYYAML package)
employer: "properties-employer_name"
...
employee_name: "employee"
Python Code: (Python 3.8)
json_data = { properties: {employer_name: "XYZ"}, employee: "Sam" }
employer = json_data["properties"]["employer_name"] # How Do I add [] based on how nested the value is dynamically?
employee = json_data["employee"]
Many thanks!

You could try something like this:
def get_value(data, keys):
# Go over each key and adjust data value to current level
for key in keys:
data = data[key]
return data # Once last key is reached return value
You would get your keys by splitting on '-' if that is how you have it in your yaml so in my example I just saved the value to a string and did it this way:
employer = "properties-employer_name"
keys = employer.split('-') # Gives us ['properties', 'employer_name']
Now we can call our get_value function defined above:
get_value(json_data, keys)
Which returns 'XYZ'

Yet another Python looping over JSON array

I spent several hours on this, tried everything I found online, pulled some of the hair left on my head...
I have this JSON sent to a Flask webservice I'm writing :
{'jsonArray': '[
{
"nom":"0012345679",
"Start":"2018-08-01",
"Finish":"2018-08-17",
"Statut":"Validee"
},
{
"nom":"0012345679",
"Start":"2018-09-01",
"Finish":"2018-09-10",
"Statut":"Demande envoyée au manager"
},
{
"nom":"0012345681",
"Start":"2018-04-01",
"Finish":"2018-04-08",
"Statut":"Validee"
},
{
"nom":"0012345681",
"Start":"2018-07-01",
"Finish":"2018-07-15",
"Statut":"Validee"
}
]'}
I want to simply loop through the records :
app = Flask(__name__)
#app.route('/graph', methods=['POST'])
def webhook():
if request.method == 'POST':
req_data = request.get_json()
print(req_data) #-> shows JSON that seems to be right
##print(type(req_data['jsonArray']))
#j1 = json.dumps(req_data['jsonArray'])
#j2 = json.loads(req_data['jsonArray'])
#data = json.loads(j1)
#for rec in data:
# print(rec) #-> This seems to consider rec as one of the characters of the whole JSON string, and prints every character one by one
#for key in data:
# value = data[key]
# print("The key and value are ({}) = ({})".format(key, value)) #-> TypeError: string indices must be integers
for record in req_data['jsonArray']:
for attribute, value in rec.items(): #-> Gives error 'str' object has no attribute 'items'
print(attribute, value)
I believe I am lost between JSON object, python dict object, strings, but I don't know what I am missing. I really tried to put the JSON received through json.dumps and json.loads methods, but still nothing. What am I missing ??
I simply want to loop through each record to create another python object that I will feed to a charting library like this :
df = [dict(Task="0012345678", Start='2017-01-01', Finish='2017-02-02', Statut='Complete'),
dict(Task="0012345678", Start='2017-02-15', Finish='2017-03-15', Statut='Incomplete'),
dict(Task="0012345679", Start='2017-01-17', Finish='2017-02-17', Statut='Not Started'),
dict(Task="0012345679", Start='2017-01-17', Finish='2017-02-17', Statut='Complete'),
dict(Task="0012345680", Start='2017-03-10', Finish='2017-03-20', Statut='Not Started'),
dict(Task="0012345680", Start='2017-04-01', Finish='2017-04-20', Statut='Not Started'),
dict(Task="0012345680", Start='2017-05-18', Finish='2017-06-18', Statut='Not Started'),
dict(Task="0012345681", Start='2017-01-14', Finish='2017-03-14', Statut='Complete')]

The whole thing is wrapped in single quotes, meaning it's a string and you need to parse it.
for record in json.loads(req_data['jsonArray']):
Looking at your commented code, you did this:
j1 = json.dumps(req_data['jsonArray'])
data = json.loads(j1)
Using json.dumps on a string is the wrong idea, and moreover json.loads(json.dumps(x)) is just the same as x, so that just got you back where you started, i.e. data was the same thing as req_data['jsonArray'] (a string).
This was the right idea:
j2 = json.loads(req_data['jsonArray'])
but you never used j2.
As you've seen, iterating over a string gives you each character of the string.

Parsing Python JSON with multiple same strings with different values

I am stuck on an issue where I am trying to parse for the id string in JSON that exists more than 1 time. I am using the requests library to pull json from an API. I am trying to retrieve all of the values of "id" but have only been able to successfully pull the one that I define. Example json:
{
"apps": [{
"id": "app1",
"id": "app2",
"id": "new-app"
}]
}
So what I have done so far is turn the json response into dictionary so that I am actually parse the first iteration of "id". I have tried to create for loops but have been getting KeyError when trying to find string id or TypeError: list indices must be integers or slices, not str. The only thing that I have been able to do successfully is define which id locations to output.
(data['apps'][N]['id']) -> where N = 0, 1 or 2
This would work if there was only going to be 1 string of id at a time but will always be multiple and the location will change from time to time.
So how do return the values of all strings for "id" from this single json output? Full code below:
import requests
url = "http://x.x.x.x:8080/v2/apps/"
response = requests.get(url)
#Error if not 200 and exit
ifresponse.status_code!=200:
print("Status:", response.status_code, "CheckURL.Exiting")
exit()
#Turn response into a dict and parse for ids
data = response.json()
for n in data:
print(data['apps'][0]['id'])
OUTPUT:
app1
UPDATE:
Was able to get resolution thanks to Robᵩ. Here is what I ended up using:
def list_hook(pairs):
result = {}
for name, value in pairs:
if name == 'id':
result.setdefault(name, []).append(value)
print(value)
data = response.json(object_pairs_hook = list_hook)
Also The API that I posted as example is not a real API. It was just supposed to be a visual representation of what I was trying to achieve. I am actually using Mesosphere's Marathon API . Trying to build a python listener for port mapping containers.

Your best choice is to contact the author of the API and let him know that his data format is silly.
Your next-best choice is to modify the behavior of the the JSON parser by passing in a hook function. Something like this should work:
def list_hook(pairs):
result = {}
for name, value in pairs:
if name == 'id':
result.setdefault(name, []).append(value)
else:
result[name] = value
return result
data = response.json(object_pairs_hook = list_hook)
for i in range(3):
print(i, data['apps'][0]['id'][i])

Trying to iterate over a JSON object

I am trying to iterate over a JSON object, using simplejson.
def main(arg1):
response = urllib2.urlopen("http://search.twitter.com/search.json?q=" + arg1) #+ "&rpp=100&page=15")
twitsearch = simplejson.load(response)
twitsearch = twitsearch['results']
twitsearch = twitsearch['text']
print twitsearch
I am passing a list of values to search for in Twitter, like "I'm", "Think", etc.
The problem is that there are multiple text fields, one each for every Tweet. I want to iterate over the entire JSON object, pulling out the "text" field.
How would I do this? I'm reading the documentation and can't see exactly where it talks about this.
EDIT: It appears to be stored as a list of JSON objects.
Trying to do this:
for x in twitsearch:
x['text']
How would I store x['text'] in a list? Append?

Note that
twitsearch['results']
is a Python list. You can iterate over that list, storing the text component of each of those objects in your own list. A list comprehension would be a good thing to use here.
text_list = [x['text'] for x in twitsearch['results']]

Easy. Figured it out.
tweets = []
for x in twitsearch:
tweets.append(x['text'])

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

How can I sort through JSON? - python

You can do this: backpackid = null for x in yourjson['products_and_categories']['Bags']: if x['name'] == "Backpack": print("found it!") backpackid = x['id'] print("Backpack ID is: %", backpackid)

Related

Python3 - Parse list of strings inside nested json

Reading nested JSON dynamically

Yet another Python looping over JSON array

Parsing Python JSON with multiple same strings with different values

Trying to iterate over a JSON object

Categories

Resources