How do you extract multiple values from JSON? - python

I have a JSON file like this:
print(resp)
dataList = []
{
"totalCount": 9812,
"pageSize": 50,
"nextPageKey": "12345",
"problems": [
{
"problemId": "1234",
"displayId": "test1",
"title": "host memory saturation",
"impactLevel": "INFRASTRUCTURE",
"severityLevel": "RESOURCE_CONTENTION",
"status": "OPEN"
}
]
}
I need to extract "title", "impactLevel" and "severityLevel" and create a data frame out this:
I have tried this:
dataList.append([resp['problems'][0]['title']['impactLevel']['severityLevel']])
hyperEntitydf = pd.DataFrame(dataList, columns=['Alert','Type','Severity'])
hyperEntitydf=hyperEntitydf.drop_duplicates()
print(hyperEntitydf.head())
On this line:
dataList.append([resp['problems'][0]['title']['impactLevel']['severityLevel']])
I am getting this error:
TypeError: string indices must be integers
Is it possible to extract multiple fields with one call?
dataList.append([resp['problems'][0]['title']['impactLevel']['severityLevel']])

No, it's not possible. For this to work, the json would need to be nested such that impactLevel were within title, and severityLevel were within impactLevel.
I'd suggest
x = resp['problems'][0]
dataList.append([x['title'], x['impactLevel', x['severityLevel'])
Unless, you want all the fields of the json, in which case you can do:
dataList.append(list(resp['problems'][0].values()))

Use operators.itemgetter:
from operator import itemgetter
f = itemgetter('title', 'impactLevel', 'severityLevel')
dataList.append(list(f(resp['problems'][0])))
You only need to use list if you specifically need a list; f will return a tuple.
If you want the three values from every element of resp['problems'],
dataList = list(map(f, resp['problems']))
Here, you only need list if DataFrame requires a list, rather than an arbitrary iterable.

Related

Is there an efficient way to write data to a JSON file using a dictionary in Python?

I'm writing a program in Python to use an API that needs to get input from a JSON payload in a really specific way which is shown below. The poid element will contain a different number with each run of the program, the inventories element contains a list of dictionaries that I am trying to send to the API.
[
{
"poid":"22130",
"inventories":
[
{
"item": "SAMPLE-ITEM-1",
"mfgr": "SAMPLE-MANUFACTURER-1",
"quantity": "1",
"condition": "REF"
},
{
"item": "SAMPLE-ITEM-2",
"mfgr": "SAMPLE-MANUFACTURER-2",
"quantity": "3",
"condition": "REF"
}
]
}
]
The data I need to put into the file is stored in a dictionary and a list as shown below. For simplicity of this post, I'm showing what the dictionary and list would look like after another method creates them. I'm not sure if this is the most efficient way of storing this data when I'm having to write it to JSON.
pn_and_mfgr_dict = {'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1', 'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'}
quantities = ["1","3"]
poid = 22130 #this will be different each run
If it makes sense from what I've written above, I need to generate a JSON file that looks like the first codeblock given the information from the second codeblock. The item at index 0 in the quantities list corresponds to the first key/value pair in the dictionary and so on. The "condition" value in the first codeblock will always have "REF" as its value for my use, but I need to also include that in the final payload that gets sent to the API. Since the part number and manufacturer dictionary will be a different length with each run, I also need this method to work regardless of how many values are in the dictionary. This dictionary and the quantities list will always be the same length though. I think the best way I can solve this is making a for loop that iterates through the dictionary and puts respective data where it needs to be, then reading the file when the for loop is done and sending it as the payload but please correct me if there's a better way to do this like storing everything in variables. I also have no experience with JSON so I have attempted to use JSON libraries to accomplish this with no idea what I'm doing wrong. I can edit this with my attempts tonight but I wanted to post this as soon as possible.
Here is one possible solution:
import json
pn_and_mfgr_dict = {
'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1',
'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'
}
quantities = ['1', '3']
poid = 22130
payload = {
'poid': poid,
'inventories': [{
'item': item,
'mfgr': mfgr,
'quantity': quantity,
'condition': 'REF'
} for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)]
}
print(json.dumps(payload, indent=2))
The code above will result in:
{
"poid": 22130,
"inventories": [
{
"item": "SAMPLE-ITEM-1",
"mfgr": "SAMPLE-MANUFACTURER-1",
"quantity": "1",
"condition": "REF"
},
{
"item": "SAMPLE-ITEM-2",
"mfgr": "SAMPLE-MANUFACTURER-2",
"quantity": "3",
"condition": "REF"
}
]
}
Naturally, you can adjust that for multiple poids with something like this:
poids = [22130, 22131, 22132]
for poid in poids:
# implement here the logic to get items and quantities for
# each poid
payload = {
'poid': poid,
'inventories': [{
'item': item,
'mfgr': mfgr,
'quantity': quantity,
'condition': 'REF'
} for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)]
}
print(json.dumps(payload, indent=2))
You will need to change it to have the correspondents items and quantities for each poid, and I leave that as starting point for you to implement.
Your second block is your input, so you could immediately start by write down a function taking those input and returning a JSON string.
import json
from typing import Dict, List
def jsonify_data(pn_and_mfgr_dict: Dict, quantities: List, poid: int):
constructed_data = [] # TODO
return json.dumps(constructed_data)
Then you could start working on using the inputs to construct the output data you desired. And you already know how to do it.
I think the best way I can solve this is making a for loop that iterates through the dictionary and puts respective data where it needs to be
Yes, that's the way to do it.
Here's my version of solution:
import json
from typing import Dict, List
def jsonify_data(pn_and_mfgr_dict: Dict, quantities: List, poid: int):
inventories = [
{
'item': item,
'mfgr': mfgr,
'quantity': quantity,
'condition': 'REF',
} for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)
]
constructed_data = [
{
'poid': f'{poid}',
'inventories': inventories,
}
]
return json.dumps(constructed_data)
import json
data = {'inventories': [{'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1'}, {'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'}]}
quantities = ["1", "3"]
poid = 22130
# Add poid to data
data['poid'] = poid
# Add quantities to data
for item in data['inventories']:
item['quantity'] = quantities.pop(0)
# Serializing json
json_object = json.dumps(data, indent=4)
print(json_object)

get value by key from json data with python [duplicate]

I have some JSON data like:
{
"status": "200",
"msg": "",
"data": {
"time": "1515580011",
"video_info": [
{
"announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
"announcement_shop": "",
etc.
How do I grab the content "FOLLOW ME PLEASE"? I tried using
replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']
But now announcement is a string representing more JSON data. I can't continue indexing announcement['content'] results in TypeError: string indices must be integers.
How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?
In a single line -
>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'
To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.
First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
{
'data': {
'time': '1515580011',
'video_info': [{
'announcement': ( # ***
"""{
"announcement_id": "6",
"name": "INS\\u8d26\\u53f7",
"icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
"icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
"videoid": "15154610218328614178",
"content": "FOLLOW ME PLEASE",
"x_coordinate": "0.22",
"y_coordinate": "0.23"
}"""),
'announcement_shop': ''
}]
},
'msg': '',
'status': '200'
}
*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.
First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.
So, in summary, "descend" the ladder that is "data" using the following "rungs" -
data, a dictionary
video_info, a list of dicts
announcement, a dict in the first dict of the list of dicts
content residing as part of json data.
First,
i = data['data']
Next,
j = i['video_info']
Next,
k = j[0] # since this is a list
If you only want the first element, this suffices. Otherwise, you'd need to iterate:
for k in j:
...
Next,
l = k['announcement']
Now, l is JSON data. Load it -
import json
m = json.loads(l)
Lastly,
content = m['content']
print(content)
'FOLLOW ME PLEASE'
This should hopefully serve as a guide should you have future queries of this nature.
You have nested JSON data; the string associated with the 'annoucement' key is itself another, separate, embedded JSON document.
You'll have to decode that string first:
import json
replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])
then handle the resulting dictionary from there.
The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.

Python3 - Parse list of strings inside nested json

Python Noob here. I saw many similar questions but none of it my exact use case. I have a simple nested json, and I'm trying to access the element name present inside metadata. Below is my sample json.
{
"items": [{
"metadata": {
"name": "myname1"
}
},
{
"metadata": {
"name": "myname1"
}
}
]
}
Below is the code That I have tried so far, but not successfull.
import json
f = open('./myfile.json')
x = f.read()
data = json.loads(x)
for i in data['items']:
for j in i['metadata']:
print (j['name'])
It errors out stating below
File "pythonjson.py", line 8, in
print (j['name']) TypeError: string indices must be integers
When I printed print (type(j)) I received the following o/p <class 'str'>. So I can see that it is a list of strings and not an dictinoary. So now How can I parse through a list of strings? Any official documentation or guide would be much helpful to know the concept of this.
Your json is bad, and the python exception is clear and unambiguous. You have the basic string "name" and you are trying to ... do a lookup on that?
Let's cut out all the json and look at the real issue. You do not know how to iterate over a dict. You're actually iterating over the keys themselves. If you want to see their values too, you're going to need dict.items()
https://docs.python.org/3/tutorial/datastructures.html#looping-techniques
metadata = {"name": "myname1"}
for key, value in metadata.items():
if key == "name":
print ('the name is', value)
But why bother if you already know the key you want to look up?
This is literally why we have dict.
print ('the name is', metadata["name"])
You likely need:
import json
f = open('./myfile.json')
x = f.read()
data = json.loads(x)
for item in data['items']:
print(item["metadata"]["name"]
Your original JSON is not valid (colons missing).
to access contents of name use "i["metadata"].keys()" this will return all keys in "metadata".
Working code to access all values of the dictionary in "metadata".
for i in data['items']:
for j in i["metadata"].keys():
print (i["metadata"][j])
**update:**Working code to access contents of "name" only.
for i in data['items']:
print (i["metadata"]["name"])

How can I access the nested data in this complex JSON, which includes another JSON document as one of the strings?

I have some JSON data like:
{
"status": "200",
"msg": "",
"data": {
"time": "1515580011",
"video_info": [
{
"announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
"announcement_shop": "",
etc.
How do I grab the content "FOLLOW ME PLEASE"? I tried using
replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']
But now announcement is a string representing more JSON data. I can't continue indexing announcement['content'] results in TypeError: string indices must be integers.
How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?
In a single line -
>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'
To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.
First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
{
'data': {
'time': '1515580011',
'video_info': [{
'announcement': ( # ***
"""{
"announcement_id": "6",
"name": "INS\\u8d26\\u53f7",
"icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
"icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
"videoid": "15154610218328614178",
"content": "FOLLOW ME PLEASE",
"x_coordinate": "0.22",
"y_coordinate": "0.23"
}"""),
'announcement_shop': ''
}]
},
'msg': '',
'status': '200'
}
*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.
First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.
So, in summary, "descend" the ladder that is "data" using the following "rungs" -
data, a dictionary
video_info, a list of dicts
announcement, a dict in the first dict of the list of dicts
content residing as part of json data.
First,
i = data['data']
Next,
j = i['video_info']
Next,
k = j[0] # since this is a list
If you only want the first element, this suffices. Otherwise, you'd need to iterate:
for k in j:
...
Next,
l = k['announcement']
Now, l is JSON data. Load it -
import json
m = json.loads(l)
Lastly,
content = m['content']
print(content)
'FOLLOW ME PLEASE'
This should hopefully serve as a guide should you have future queries of this nature.
You have nested JSON data; the string associated with the 'annoucement' key is itself another, separate, embedded JSON document.
You'll have to decode that string first:
import json
replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])
then handle the resulting dictionary from there.
The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.

For each loop with JSON object python

Alright, so I'm struggling a little bit with trying to parse my JSON object.
My aim is to grab the certain JSON key and return it's value.
JSON File
{
"files": {
"resources": [
{
"name": "filename",
"hash": "0x001"
},
{
"name": "filename2",
"hash": "0x002"
}
]
}
}
I've developed a function which allows me to parse the JSON code above
Function
def parsePatcher():
url = '{0}/{1}'.format(downloadServer, patcherName)
patch = urllib2.urlopen(url)
data = json.loads(patch.read())
patch.close()
return data
Okay so now I would like to do a foreach statement which prints out each name and hash inside the "resources": [] object.
Foreach statement
for name, hash in patcher["files"]["resources"]:
print name
print hash
But it only prints out "name" and "hash" not "filename" and "0x001"
Am I doing something incorrect here?
By using name, hash as the for loop target, you are unpacking the dictionary:
>>> d = {"name": "filename", "hash": "0x001"}
>>> name, hash = d
>>> name
'name'
>>> hash
'hash'
This happens because iteration over a dictionary only produces the keys:
>>> list(d)
['name', 'hash']
and unpacking uses iteration to produce the values to be assigned to the target names.
That that worked at all is subject to random events even, on Python 3.3 and newer with hash randomisation enabled by default, the order of those two keys could equally be reversed.
Just use one name to assign the dictionary to, and use subscription on that dictionary:
for resource in patcher["files"]["resources"]:
print resource['name']
print resource['hash']
So what you intend to do is :
for dic in x["files"]["resources"]:
print dic['name'],dic['hash']
You need to iterate on those dictionaries in that array resources.
The problem seems to be you have a list of dictionaries, first get each element of the list, and then ask the element (which is the dictionary) for the values for keys name and hash
EDIT: this is tested and works
mydict = {"files": { "resources": [{ "name": "filename", "hash": "0x001"},{ "name": "filename2", "hash": "0x002"}]} }
for element in mydict["files"]["resources"]:
for d in element:
print d, element[d]
If in case you have multiple files and multiple resources inside it. This generalized solution works.
for keys in patcher:
for indices in patcher[keys].keys():
print(patcher[keys][indices])
Checked output from myside
for keys in patcher:
... for indices in patcher[keys].keys():
... print(patcher[keys][indices])
...
[{'hash': '0x001', 'name': 'filename'}, {'hash': '0x002', 'name': 'filename2'}]

Categories